@WillKoehrsen
Created April 20, 2018 17:38
import matplotlib.pyplot as plt
import seaborn as sns

# X_train is our training data; make a copy for plotting
X_plot = X_train.copy()

# Label each row by how its grade compares to the median (12)
X_plot['relation_median'] = (X_plot['Grade'] >= 12).map({True: 'above',
                                                         False: 'below'})

# Split the data into the two groups once, before plotting
subset_above = X_plot[X_plot['relation_median'] == 'above']
subset_below = X_plot[X_plot['relation_median'] == 'below']

# Plot every feature except the grade itself and the helper label;
# the 3 x 2 grid assumes at most six features remain
feature_cols = [c for c in X_plot.columns
                if c not in ('Grade', 'relation_median')]

plt.figure(figsize=(12, 12))
for i, col in enumerate(feature_cols):
    plt.subplot(3, 2, i + 1)
    sns.kdeplot(subset_above[col], label='Above Median')
    sns.kdeplot(subset_below[col], label='Below Median')
    plt.legend()
    plt.title('Distribution of %s' % col)

plt.tight_layout()