Last active
October 16, 2020 03:36
SHAP Full Explanation
# SHAP's force plot does not label all of the important features.
# We usually need the top (20) features that affect a decision for a particular instance.
# In addition to the feature names, the feature values and their Shapley values are also required.
# The snippet below
# 1. creates a dataframe containing all the features, their Shapley values and their actual values,
# 2. exports the dataframe to a CSV file, and
# 3. displays the force plot.
import pandas as pd
import shap

# Assumes `xgb_model` (presumably a trained XGBoost model) and `data` (a DataFrame holding
# the features plus the ApplicationId and final_score columns) are already defined.
shap.initjs()

app_id = '908259'
explainer = shap.TreeExplainer(xgb_model)

# The single row to explain, with the non-feature columns dropped
row = data[data.ApplicationId == app_id]
X = row.drop(['ApplicationId', 'final_score'], axis=1)
shap_values = explainer.shap_values(X)
print("SHAP's expected value (the baseline score when no feature information is provided):",
      explainer.expected_value)

# One row per feature: its Shapley value next to its actual value
df_shap = pd.DataFrame(shap_values, columns=X.columns.values)
df_shap1 = df_shap.transpose()
df_shap1.columns = ['shap_value']
df_shap2 = row.transpose()
df_shap2.columns = ['value']
df_shap_final = df_shap1.merge(df_shap2, left_index=True, right_index=True, how='left')
df_shap_final = df_shap_final.sort_values('shap_value', ascending=False)
df_shap_final.to_csv(f'c:/temp/shap_{app_id}.csv')
df_shap_final.head(10)

# Force plot for the same row
_ = shap.force_plot(explainer.expected_value, shap_values[0, :], X,
                    matplotlib=True,
                    show=True)
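
The snippet sorts by signed SHAP value, so `head(10)` shows only the largest positive contributions, and strong negative ones end up at the bottom of the table. If the goal is the top features by overall influence, one option is to rank by absolute value instead. The following is a minimal sketch of that idea, not part of the original gist: the helper name `top_shap_features` and the demo numbers are made up for illustration, and it uses only pandas and NumPy so it runs without a model.

```python
import numpy as np
import pandas as pd


def top_shap_features(shap_row, feature_row, n=20):
    """Return the n features with the largest absolute SHAP value.

    shap_row    -- 1-D array of SHAP values for one instance
    feature_row -- pandas Series of that instance's feature values,
                   indexed by feature name, in the same order as shap_row
    (Hypothetical helper, not part of the shap library.)
    """
    table = pd.DataFrame(
        {"shap_value": np.asarray(shap_row, dtype=float),
         "value": feature_row.to_numpy()},
        index=feature_row.index,
    )
    # Rank by magnitude so strong negative contributions are not pushed to the bottom.
    order = table["shap_value"].abs().sort_values(ascending=False).index
    return table.loc[order].head(n)


# Made-up numbers for illustration only (not from any real model).
demo_values = pd.Series({"age": 41, "income": 52000, "num_loans": 3})
demo_shap = np.array([0.10, -0.75, 0.30])
top2 = top_shap_features(demo_shap, demo_values, n=2)
print(top2)
```

With the gist's variables, the equivalent call would presumably be `top_shap_features(shap_values[0, :], X.iloc[0], n=20)`, assuming `shap_values` is a 2-D array with one row per instance, as the original indexing suggests.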