@ryanbehdad
Last active October 16, 2020 03:36
SHAP Full Explanation
# SHAP's force plot does not label every important feature.
# We usually need the top (20) features that affect the decision for a particular instance.
# Besides each feature's name, we also need its actual value and its Shapley value.
# Assumes `xgb_model` (a trained tree model) and `data` (a DataFrame with ApplicationId
# and final_score columns) already exist.
# The snippet below:
# 1. creates a dataframe containing every feature, its Shapley value and its actual value
# 2. exports the dataframe to a CSV file
# 3. displays the force plot
import pandas as pd
import shap
shap.initjs()
# Application (row) to explain
app_id = '908259'
explainer = shap.TreeExplainer(xgb_model)
# Keep only the model's input features for this application
X = data[data.ApplicationId == app_id].drop(['ApplicationId', 'final_score'], axis=1)
shap_values = explainer.shap_values(X)
print('SHAP expected value (the baseline score before any feature contributions are added):', explainer.expected_value)
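# SHAP values: one column per feature, then transposed to one row per feature
df_shap = pd.DataFrame(shap_values, columns=X.columns.values)
df_shap1 = df_shap.transpose()
df_shap1.columns = ['shap_value']
# Actual feature values for the same application (expects exactly one matching row)
df_shap2 = data[data.ApplicationId == app_id].transpose()
df_shap2.columns = ['value']
# Join on feature name, then sort from most positive to most negative contribution
df_shap_final = df_shap1.merge(df_shap2, left_index=True, right_index=True, how='left')
df_shap_final = df_shap_final.sort_values('shap_value', ascending=False)
df_shap_final.to_csv(f'c:/temp/shap_{app_id}.csv')
df_shap_final.head(10)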
_ = shap.force_plot(explainer.expected_value, shap_values[0, :], X,
                    matplotlib=True,
                    show=True)
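
The snippet above sorts by signed Shapley value, so features with a strong negative contribution end up at the bottom of the table. If the goal is the top features by overall influence, ranking by absolute value may be closer to what is wanted. The following is a minimal, dependency-free sketch of that idea; `top_contributions` and the sample numbers are hypothetical and are not part of the original gist or of the SHAP API.

```python
from typing import List, Sequence, Tuple


def top_contributions(
    names: Sequence[str],
    shap_row: Sequence[float],
    values: Sequence[object],
    k: int = 20,
) -> List[Tuple[str, float, object]]:
    """Return the k (name, shap_value, value) triples with the largest |shap_value|."""
    if not (len(names) == len(shap_row) == len(values)):
        raise ValueError("names, shap_row and values must have the same length")
    rows = list(zip(names, shap_row, values))
    # Sort by magnitude so strong negative contributions are not pushed to the bottom.
    rows.sort(key=lambda r: abs(r[1]), reverse=True)
    return rows[:k]


# Made-up numbers for illustration only (not real model output).
names = ["income", "age", "num_defaults", "tenure"]
shap_row = [0.30, -0.05, -0.80, 0.10]
values = [52000, 41, 3, 7]

for name, contribution, value in top_contributions(names, shap_row, values, k=3):
    print(f"{name:>14}  shap={contribution:+.2f}  value={value}")
```

With the dataframe built above, a comparable ordering could presumably be obtained by sorting on `df_shap_final['shap_value'].abs()` instead of the signed column.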