@Kiwibp
Created June 13, 2018 13:47
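import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Tree-based estimators can be used to compute feature importances, which in turn can be used to discard irrelevant features.
# `train` (feature DataFrame) and `targets` (labels) are assumed to be defined earlier.
clf = RandomForestClassifier(n_estimators=50, max_features='sqrt')
clf = clf.fit(train, targets)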
# Let's have a look at the importance of each feature.
features = pd.DataFrame()
features['feature'] = train.columns
features['importance'] = clf.feature_importances_
# Sorting values by feature importance.
features.sort_values(['importance'],ascending=True, inplace=True)
features.set_index('feature', inplace=True)
features.plot(kind='barh', figsize=(20, 20));
# Title_Mr, Age, Fare, and Sex stand out as the most important features.
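# The first comment mentions discarding irrelevant features, but the snippet stops at the plot.
# A minimal sketch of that step, assuming scikit-learn's SelectFromModel is an acceptable way
# to do it (the gist does not say which method it uses). The names demo_train, demo_targets,
# and demo_clf are hypothetical stand-ins for `train`, `targets`, and `clf`, and the data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in data (assumption): only f0 and f1 influence the label.
rng = np.random.default_rng(0)
demo_train = pd.DataFrame(
    rng.normal(size=(200, 6)),
    columns=[f"f{i}" for i in range(6)],
)
demo_targets = (demo_train["f0"] + demo_train["f1"] > 0).astype(int)

# Same estimator settings as above, with a fixed seed for repeatability.
demo_clf = RandomForestClassifier(n_estimators=50, max_features="sqrt", random_state=0)
demo_clf.fit(demo_train, demo_targets)

# prefit=True reuses the fitted forest. With no explicit threshold, features
# whose importance is at or above the mean importance are kept.
selector = SelectFromModel(demo_clf, prefit=True)
kept_columns = demo_train.columns[selector.get_support()]

# Keep only the selected columns, preserving their names.
demo_train_reduced = demo_train[kept_columns]
print(demo_train.shape, "->", demo_train_reduced.shape)
```

# The mean-importance cutoff is only the default; passing threshold= (for example "median")
# to SelectFromModel changes how aggressively features are dropped.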