@iandewancker
Created February 8, 2018 06:27
Feature importances in binary classifier
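import numpy as np
import scipy.stats

# X: (n_samples, n_features) array; y: binary labels (1 or 0), defined elsewhere.
X_g = X[np.where(y == 1)]
X_b = X[np.where(y == 0)]
M = X.shape[1]

# Shared per-feature range so both classes are binned on the same edges.
ranges = []
for i in range(M):
    ranges.append((np.min(X[:, i]), np.max(X[:, i])))

importances = []
for i in range(M):
    g_dist = np.histogram(X_g[:, i], bins=50, density=True, range=ranges[i])[0]
    b_dist = np.histogram(X_b[:, i], bins=50, density=True, range=ranges[i])[0]
    # Normalize the bin densities so each histogram sums to 1.
    g_dist /= np.sum(g_dist)
    b_dist /= np.sum(b_dist)
    # Additive smoothing so empty bins do not produce an infinite divergence.
    g_dist += 0.01
    b_dist += 0.01
    # KL divergence D(g || b); scipy.stats.entropy renormalizes its inputs.
    importances.append((i, scipy.stats.entropy(g_dist, b_dist)))

# Features whose class distributions differ most come first.
importances.sort(key=lambda x: x[1], reverse=True)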
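
The snippet above assumes `X` and `y` already exist. As a rough, self-contained sketch of the same idea, the version below wraps the steps in a function and runs it on synthetic data. The function name `kl_feature_importances`, its `bins` and `smoothing` parameters, and the demo arrays are hypothetical names introduced here for illustration; they are not part of the original gist.

```python
import numpy as np
from scipy.stats import entropy


def kl_feature_importances(X, y, bins=50, smoothing=0.01):
    """Rank features by the KL divergence between the class-1 and
    class-0 histograms of each feature (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_g, X_b = X[y == 1], X[y == 0]
    scores = []
    for i in range(X.shape[1]):
        # Bin both classes on the same edges, spanning the full feature range.
        lo, hi = X[:, i].min(), X[:, i].max()
        g, _ = np.histogram(X_g[:, i], bins=bins, range=(lo, hi))
        b, _ = np.histogram(X_b[:, i], bins=bins, range=(lo, hi))
        # Convert counts to probabilities, then smooth away empty bins.
        g = g / g.sum() + smoothing
        b = b / b.sum() + smoothing
        # entropy(p, q) renormalizes p and q and returns D(p || q).
        scores.append((i, float(entropy(g, b))))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Synthetic demo data (an assumption for illustration only):
# feature 0 shifts with the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
n = 2000
y_demo = rng.integers(0, 2, size=n)
X_demo = np.column_stack([
    rng.normal(loc=3.0 * y_demo, scale=1.0),
    rng.normal(size=n),
])
ranking = kl_feature_importances(X_demo, y_demo)
print(ranking)
```

On data like this, the label-dependent feature should rank first. Note that the score is asymmetric (it measures D(class 1 || class 0)) and depends on the bin count and the smoothing constant, so it is best read as a relative ranking rather than an absolute measure.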