#get a single influence-sensitivity plot (ISP)
RF_explainer.plot_isp("Age")
#get all ISPs
RF_explainer.plot_isps()
#retrieve existing segment groups that have been created in the project
#Note: segment groups can also be created programmatically, see Python SDK reference
tru.get_segment_groups()
#create a new explainer object that is set to the data and segment of interest
segment_explainer = tru.get_explainer('train')
segment_explainer.set_comparison_data_splits(tru.get_data_splits())
segment_explainer.set_segment('Gender', 'Male')
#observe performance for male segment
#Drift metrics -- select metric of interest with optional parameter. Defaults to project setting.
RF_explainer.compute_model_score_instability()
#feature contributions to score drift -- related to shifts in influence density, not feature value
RF_explainer.compute_feature_contributors_to_instability().transpose().sort_values(by='test', ascending=False)
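To make "score drift" concrete, here is a minimal standalone sketch of one common way to compare two score distributions: the 1-D Wasserstein distance. It is only an illustration. The snippet above does not say which metric the project uses, so the choice of Wasserstein distance, the function name `wasserstein_1d`, and the sample scores are all assumptions, not TruEra's implementation.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal-size samples this reduces to the mean absolute difference
    between the sorted values. Hypothetical helper, not part of any SDK.
    """
    if len(a) != len(b):
        raise ValueError("this simplified version needs equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Made-up model scores on two data splits (illustrative values only).
train_scores = [0.10, 0.40, 0.35, 0.80]
test_scores = [0.20, 0.55, 0.45, 0.90]

drift = wasserstein_1d(train_scores, test_scores)
print(f"score drift between splits: {drift:.4f}")
```

A distance of zero means the two splits produce identical score distributions; larger values mean the scores have shifted further.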
#get avg(abs(feature influences)), per feature, and sort highest to lowest
RF_explainer.get_global_feature_importances().transpose().sort_values(by=0, ascending=False)
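The aggregation named in the comment above, avg(abs(feature influences)) per feature, can be sketched in plain Python. The influence values and the helper name `global_importances` are made up for illustration; this shows the arithmetic only and is not how the SDK computes it internally.

```python
from statistics import mean

# Hypothetical row-level influences: one dict per row, keyed by feature name.
influences = [
    {"Age": 0.10, "Fare": -0.30, "Sex": 0.50},
    {"Age": -0.20, "Fare": 0.10, "Sex": -0.40},
    {"Age": 0.05, "Fare": 0.20, "Sex": 0.60},
]


def global_importances(rows):
    """Return (feature, mean absolute influence) pairs, highest first."""
    features = rows[0].keys()
    scores = {f: mean(abs(row[f]) for row in rows) for f in features}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = global_importances(influences)
for feature, score in ranking:
    print(f"{feature}: {score:.3f}")
```

Taking the absolute value first matters: a feature that pushes some predictions up and others down would otherwise average out to roughly zero and look unimportant.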
#feature influences, row level
#returns results based on current truera workspace context (tru), as dataframe
RF_explainer.get_feature_influences()
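To show what a row-level influence is, here is a small sketch for the simplest case: a linear model with features treated as independent, where the Shapley value of feature i for a row is w_i * (x_i - mean(x_i)). The weights, bias, and rows are made up, and the random forest above is not linear, so this only illustrates the additivity property: each row's influences sum to its prediction minus the average prediction.

```python
from statistics import mean

# Hypothetical linear model and data (illustrative values only).
weights = {"Age": -0.02, "Fare": 0.01}
bias = 0.5
rows = [
    {"Age": 20.0, "Fare": 10.0},
    {"Age": 40.0, "Fare": 30.0},
    {"Age": 60.0, "Fare": 50.0},
]


def predict(row):
    """Linear model output for one row."""
    return bias + sum(weights[f] * row[f] for f in weights)


# Background statistics used as the reference point for influences.
feature_means = {f: mean(r[f] for r in rows) for f in weights}
baseline = mean(predict(r) for r in rows)


def influences(row):
    """Shapley values for a linear model with independent features."""
    return {f: weights[f] * (row[f] - feature_means[f]) for f in weights}


row_influences = [influences(r) for r in rows]
for r, infl in zip(rows, row_influences):
    # Additivity: influences sum to prediction minus baseline.
    print(infl, round(sum(infl.values()), 6), round(predict(r) - baseline, 6))
```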
#Hotspots: error analysis / performance debugging
#note ability to select performance metric of interest
RF_explainer.find_hotspots(metric_of_interest='RECALL',
                           max_num_responses=3)
#alternatively, also include what-if performance analyses, if hotspot were eliminated
RF_explainer.find_hotspots(metric_of_interest='CLASSIFICATION_ACCURACY',
                           show_what_if_performance=True,
                           max_num_responses=3)
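As a rough illustration of the idea behind hotspot analysis, the sketch below computes recall per segment and picks out the worst one. It is a simplified stand-in under stated assumptions: the rows, the `Pclass` segmentation, and the helper `recall_by_segment` are hypothetical, and the search that `find_hotspots` actually performs is not described in the snippet above.

```python
from collections import defaultdict


def recall_by_segment(rows, segment_key):
    """Recall (TP / (TP + FN)) for each value of segment_key.

    Segments with no positive labels are skipped, since recall is
    undefined for them.
    """
    tp = defaultdict(int)
    fn = defaultdict(int)
    for row in rows:
        if row["label"] == 1:
            if row["pred"] == 1:
                tp[row[segment_key]] += 1
            else:
                fn[row[segment_key]] += 1
    segments = set(tp) | set(fn)
    return {s: tp[s] / (tp[s] + fn[s]) for s in segments}


# Made-up labels and predictions, segmented by passenger class.
rows = [
    {"Pclass": 1, "label": 1, "pred": 1},
    {"Pclass": 1, "label": 1, "pred": 1},
    {"Pclass": 1, "label": 0, "pred": 0},
    {"Pclass": 3, "label": 1, "pred": 0},
    {"Pclass": 3, "label": 1, "pred": 1},
    {"Pclass": 3, "label": 1, "pred": 0},
]

recalls = recall_by_segment(rows, "Pclass")
# Lowest-recall segment first; keep only the single worst one.
worst = sorted(recalls.items(), key=lambda kv: kv[1])[:1]
print(worst)
```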
#for quick performance comparison, let's set a base and comparison split
RF_explainer.set_base_data_split("train")
#optional: check which splits are available
tru.get_data_splits()
#TruEra will automatically ignore any splits already set as the base data, e.g., training data
RF_explainer.set_comparison_data_splits(tru.get_data_splits())
#optional: list available performance metrics
#instantiate truera workspace, if not already active in your current python kernel/environment
tru = explainer.get_truera_workspace()
tru.set_environment("remote") #retrieve project information from TruEra Web App
#set truera workspace context
tru.set_project("Titanic Survival")
tru.set_data_collection("Titanic Passenger Data")
tru.set_model("Random Forest")
tru.set_data_split("Train")
#Note: TruEra's use of an ID column / unique identifier enables proper type handling, higher data volume limits, and delayed addition of data such as labels or extra data for segmentation
#Note 2: by specifying a "split name", we automatically add this data, and corresponding Shapley values, to the project created in the TruSHAP explainer step
GBM_shap_values = GBM_explainer(X_train,
                                y=y_df,
                                id_col_name='index',
                                data_split_name="train"
                                )
#os is needed to read the connection details from environment variables
import os
#import trushap as shap
from truera.client.experimental.trushap import trushap as shap
#TruEra Web App - connection details
CONNECTION_STRING = os.getenv('url')
TOKEN = os.getenv('token')
#use connection string and token as arguments in shap.Explainer method
#define project resource names, as desired
GBM_explainer = shap.Explainer(GBM,