@richiefrost
Last active June 12, 2020 23:25
Get the most salient attributes in a decision tree
from sklearn.tree import DecisionTreeClassifier
import pandas as pd
# Get the most valuable customers, from step 2
df = pd.read_csv('high_value_customers.csv')
# Churned is our target: why did these customers churn (or not)?
X, y = df.drop('Churned', axis=1), df['Churned']
# Fix the seed so tie-breaking between equally good splits is reproducible
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
# Get features and their importances
features = X.columns
importances = model.feature_importances_
# Sort features by their importance
features_and_importances = zip(features, importances)
features_and_importances = sorted(features_and_importances, key=lambda x: x[1], reverse=True)
# Display the most important features
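# Each entry is a (feature, importance) tuple, so format it as text before joining
print('\n'.join(f'{feature}: {importance:.4f}' for feature, importance in features_and_importances))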
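The sort-and-display step can also be factored into a small reusable helper. The following is a minimal sketch, not part of the original gist: `rank_features`, `format_ranking`, the `top_n` parameter, and the sample column names and importance values are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Optional, Tuple


def rank_features(names: Iterable[str],
                  importances: Iterable[float],
                  top_n: Optional[int] = None) -> List[Tuple[str, float]]:
    """Pair each feature name with its importance, sorted from most to least important."""
    pairs = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    # Optionally keep only the top_n highest-ranked features
    return pairs if top_n is None else pairs[:top_n]


def format_ranking(ranked: List[Tuple[str, float]]) -> str:
    """Render one 'name: importance' line per feature."""
    return '\n'.join(f'{name}: {importance:.3f}' for name, importance in ranked)


# Hypothetical names and values standing in for X.columns and model.feature_importances_
names = ['Tenure', 'MonthlySpend', 'SupportTickets']
importances = [0.2, 0.5, 0.3]
print(format_ranking(rank_features(names, importances, top_n=2)))
```

With a fitted model, the same helper could presumably be called as `rank_features(X.columns, model.feature_importances_)`.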