Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save ricardobarroslourenco/f7baeef29b3d65e1d742970882b7ca1f to your computer and use it in GitHub Desktop.
Save ricardobarroslourenco/f7baeef29b3d65e1d742970882b7ca1f to your computer and use it in GitHub Desktop.
Interpret_ML error
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "view-in-github"
},
"source": [
"<a href=\"https://colab.research.google.com/github/ricardobarroslourenco/SOS_Flux_Modelling/blob/main/notebooks/US-Ha1_scalecast.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wNlD-Bo9pxwX"
},
"source": [
"# Notebook to be used for TS modeling \n",
"\n",
"Scalecast: https://scalecast.readthedocs.io/en/latest/index.html\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## General environment parametrization"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "rpT9SF_RqDjh"
},
"outputs": [],
"source": [
"# Google collaboratory?\n",
"collaboratory = False\n",
"# Turn off GPU?\n",
"cpu_flag = False"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "wgQFTDu6qE7f"
},
"outputs": [],
"source": [
"# If needing to setup environment\n",
"# !pip install -q --upgrade scalecast darts prophet greykite shap kats pmdarima matplotlib numpy pandas"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "yWb2FR8YuAXh"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/lib/python3.11/site-packages/shap/utils/_clustering.py:34: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_clustering.py:53: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_clustering.py:62: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_clustering.py:68: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_clustering.py:76: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/links.py:4: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @numba.jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/links.py:9: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @numba.jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/links.py:14: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @numba.jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/links.py:19: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @numba.jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_masked_model.py:362: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit # we can't use this when using a custom link function...\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_masked_model.py:384: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_masked_model.py:427: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/utils/_masked_model.py:438: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/maskers/_tabular.py:185: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/maskers/_tabular.py:196: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/shap/maskers/_image.py:174: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"/opt/conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
" from .autonotebook import tqdm as notebook_tqdm\n",
"/opt/conda/lib/python3.11/site-packages/shap/explainers/_partition.py:675: NumbaDeprecationWarning: \u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
" @jit\n",
"\u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n",
"\u001b[1mThe 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"<WatchedFileHandler /host_folders/home/ricardo/PycharmProjects/SOS_Flux_Modelling/notebooks/log.txt (NOTSET)>"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import seaborn as sns\n",
"import matplotlib\n",
"import matplotlib.ticker as ticker\n",
"import matplotlib.pyplot as plt\n",
"\n",
"import shap\n",
"import sklearn\n",
"\n",
"import interpret.glassbox\n",
"\n",
"import xgboost\n",
"\n",
"\n",
"# from scalecast.Forecaster import Forecaster\n",
"# from scalecast.MVForecaster import MVForecaster\n",
"# from scalecast.multiseries import export_model_summaries\n",
"# from scalecast import GridGenerator\n",
"\n",
"from interpret.develop import debug_mode\n",
"debug_mode(log_filename='log.txt', log_level='INFO', native_debug=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Data"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "UeFdOqf9y9F7",
"outputId": "44d8f309-6fd3-4cab-e619-6dfb561a2900"
},
"outputs": [],
"source": [
"# define prefix \n",
"\n",
"if collaboratory:\n",
" from google.colab import drive\n",
" drive.mount('/content/drive')\n",
"\n",
" prefix = '/content/drive/MyDrive/SOS_Flux_Modelling_files/data/US_Ha1/'\n",
"else:\n",
" # If needing to deal with zip/z01 files: https://askubuntu.com/a/751924/463917\n",
" prefix = '../data/ameriflux_raw/'"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "a-1_azjIuB-n"
},
"outputs": [],
"source": [
"# read in data\n",
"data = pd.read_csv(prefix+'AMF_US-Ha1_FLUXNET_FULLSET_DD_1991-2020_3-5.csv',\n",
" delimiter=',',\n",
" parse_dates=['TIMESTAMP'],\n",
" na_values='-9999'\n",
" )\n",
"data = data.sort_values(['TIMESTAMP'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extraction/Transformation/Load (ETL)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# for variable in data.columns:\n",
"# print(variable)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"data.fillna(int(-9999), inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"loadings_labels = ['TA_F_MDS',\n",
" 'TA_F_MDS_QC',\n",
" 'TA_F_MDS_NIGHT',\n",
" 'TA_F_MDS_NIGHT_SD',\n",
" 'TA_F_MDS_NIGHT_QC',\n",
" 'TA_F_MDS_DAY',\n",
" 'TA_F_MDS_DAY_SD',\n",
" 'TA_F_MDS_DAY_QC',\n",
" 'TA_ERA',\n",
" 'TA_ERA_NIGHT',\n",
" 'TA_ERA_NIGHT_SD',\n",
" 'TA_ERA_DAY',\n",
" 'TA_ERA_DAY_SD',\n",
" 'TA_F',\n",
" 'TA_F_QC',\n",
" 'TA_F_NIGHT',\n",
" 'TA_F_NIGHT_SD',\n",
" 'TA_F_NIGHT_QC',\n",
" 'TA_F_DAY',\n",
" 'TA_F_DAY_SD',\n",
" 'TA_F_DAY_QC',\n",
" 'SW_IN_POT',\n",
" 'SW_IN_F_MDS',\n",
" 'SW_IN_F_MDS_QC',\n",
" 'SW_IN_ERA',\n",
" 'SW_IN_F',\n",
" 'SW_IN_F_QC',\n",
" 'LW_IN_F_MDS',\n",
" 'LW_IN_F_MDS_QC',\n",
" 'LW_IN_ERA',\n",
" 'LW_IN_F',\n",
" 'LW_IN_F_QC',\n",
" 'LW_IN_JSB',\n",
" 'LW_IN_JSB_QC',\n",
" 'LW_IN_JSB_ERA',\n",
" 'LW_IN_JSB_F',\n",
" 'LW_IN_JSB_F_QC',\n",
" 'VPD_F_MDS',\n",
" 'VPD_F_MDS_QC',\n",
" 'VPD_ERA',\n",
" 'VPD_F',\n",
" 'VPD_F_QC',\n",
" 'PA_ERA',\n",
" 'PA_F',\n",
" 'PA_F_QC',\n",
" 'P_ERA',\n",
" 'P_F',\n",
" 'P_F_QC',\n",
" 'WS_ERA',\n",
" 'WS_F',\n",
" 'WS_F_QC',\n",
" 'USTAR',\n",
" 'USTAR_QC',\n",
" 'NETRAD',\n",
" 'NETRAD_QC',\n",
" 'PPFD_IN',\n",
" 'PPFD_IN_QC',\n",
" 'PPFD_DIF',\n",
" 'PPFD_DIF_QC',\n",
" 'CO2_F_MDS',\n",
" 'CO2_F_MDS_QC',\n",
" 'TS_F_MDS_1',\n",
" 'TS_F_MDS_2',\n",
" 'TS_F_MDS_3',\n",
" 'TS_F_MDS_4',\n",
" 'TS_F_MDS_1_QC',\n",
" 'TS_F_MDS_2_QC',\n",
" 'TS_F_MDS_3_QC',\n",
" 'TS_F_MDS_4_QC',\n",
" 'G_F_MDS',\n",
" 'G_F_MDS_QC',\n",
" 'LE_F_MDS',\n",
" 'LE_F_MDS_QC',\n",
" 'LE_CORR',\n",
" 'LE_CORR_25',\n",
" 'LE_CORR_75',\n",
" 'LE_RANDUNC',\n",
" 'LE_CORR_JOINTUNC',\n",
" 'H_F_MDS',\n",
" 'H_F_MDS_QC',\n",
" 'H_CORR',\n",
" 'H_CORR_25',\n",
" 'H_CORR_75',\n",
" 'H_RANDUNC',\n",
" 'H_CORR_JOINTUNC',\n",
" 'EBC_CF_N',\n",
" 'EBC_CF_METHOD',\n",
" 'NIGHT_D',\n",
" 'DAY_D',\n",
" 'NIGHT_RANDUNC_N',\n",
" 'DAY_RANDUNC_N',\n",
" ]\n",
"\n",
"X = data[loadings_labels] # Define feature set\n",
"y = data['GPP_DT_VUT_REF']# Define target "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"X100 = shap.utils.sample(X, 100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Modelling Test"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-1 {color: black;background-color: white;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LinearRegression()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">LinearRegression</label><div class=\"sk-toggleable__content\"><pre>LinearRegression()</pre></div></div></div></div></div>"
],
"text/plain": [
"LinearRegression()"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Base model\n",
"# a simple linear model\n",
"\n",
"model = sklearn.linear_model.LinearRegression()\n",
"model.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"model_ExpBoostReg = interpret.glassbox.ExplainableBoostingRegressor(interactions=0)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"ename": "TerminatedWorkerError",
"evalue": "A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.\n\nThe exit codes of the workers are {SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11)}",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTerminatedWorkerError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[12], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mmodel_ExpBoostReg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/site-packages/interpret/glassbox/_ebm/_ebm.py:848\u001b[0m, in \u001b[0;36mEBMModel.fit\u001b[0;34m(self, X, y, sample_weight, init_score)\u001b[0m\n\u001b[1;32m 821\u001b[0m early_stopping_rounds_local \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m0\u001b[39m\n\u001b[1;32m 823\u001b[0m parallel_args\u001b[38;5;241m.\u001b[39mappend(\n\u001b[1;32m 824\u001b[0m (\n\u001b[1;32m 825\u001b[0m dataset,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 845\u001b[0m )\n\u001b[1;32m 846\u001b[0m )\n\u001b[0;32m--> 848\u001b[0m results \u001b[38;5;241m=\u001b[39m \u001b[43mprovider\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparallel\u001b[49m\u001b[43m(\u001b[49m\u001b[43mboost\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mparallel_args\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 850\u001b[0m \u001b[38;5;66;03m# let python reclaim the dataset memory via reference counting\u001b[39;00m\n\u001b[1;32m 851\u001b[0m \u001b[38;5;28;01mdel\u001b[39;00m parallel_args \u001b[38;5;66;03m# parallel_args holds references to dataset, so must be deleted\u001b[39;00m\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/site-packages/interpret/provider/_compute.py:19\u001b[0m, in \u001b[0;36mJobLibProvider.parallel\u001b[0;34m(self, compute_fn, compute_args_iter)\u001b[0m\n\u001b[1;32m 18\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mparallel\u001b[39m(\u001b[38;5;28mself\u001b[39m, compute_fn, compute_args_iter):\n\u001b[0;32m---> 19\u001b[0m results \u001b[38;5;241m=\u001b[39m \u001b[43mParallel\u001b[49m\u001b[43m(\u001b[49m\u001b[43mn_jobs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_jobs\u001b[49m\u001b[43m)\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 20\u001b[0m \u001b[43m \u001b[49m\u001b[43mdelayed\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcompute_fn\u001b[49m\u001b[43m)\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m)\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mfor\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43margs\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mcompute_args_iter\u001b[49m\n\u001b[1;32m 21\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 22\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m results\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/site-packages/joblib/parallel.py:1098\u001b[0m, in \u001b[0;36mParallel.__call__\u001b[0;34m(self, iterable)\u001b[0m\n\u001b[1;32m 1095\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_iterating \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mFalse\u001b[39;00m\n\u001b[1;32m 1097\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_backend\u001b[38;5;241m.\u001b[39mretrieval_context():\n\u001b[0;32m-> 1098\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mretrieve\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1099\u001b[0m \u001b[38;5;66;03m# Make sure that we get a last message telling us we are done\u001b[39;00m\n\u001b[1;32m 1100\u001b[0m elapsed_time \u001b[38;5;241m=\u001b[39m time\u001b[38;5;241m.\u001b[39mtime() \u001b[38;5;241m-\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_start_time\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/site-packages/joblib/parallel.py:975\u001b[0m, in \u001b[0;36mParallel.retrieve\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 973\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m 974\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mgetattr\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_backend, \u001b[38;5;124m'\u001b[39m\u001b[38;5;124msupports_timeout\u001b[39m\u001b[38;5;124m'\u001b[39m, \u001b[38;5;28;01mFalse\u001b[39;00m):\n\u001b[0;32m--> 975\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_output\u001b[38;5;241m.\u001b[39mextend(\u001b[43mjob\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[43mtimeout\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtimeout\u001b[49m\u001b[43m)\u001b[49m)\n\u001b[1;32m 976\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 977\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_output\u001b[38;5;241m.\u001b[39mextend(job\u001b[38;5;241m.\u001b[39mget())\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/site-packages/joblib/_parallel_backends.py:567\u001b[0m, in \u001b[0;36mLokyBackend.wrap_future_result\u001b[0;34m(future, timeout)\u001b[0m\n\u001b[1;32m 564\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Wrapper for Future.result to implement the same behaviour as\u001b[39;00m\n\u001b[1;32m 565\u001b[0m \u001b[38;5;124;03mAsyncResults.get from multiprocessing.\"\"\"\u001b[39;00m\n\u001b[1;32m 566\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 567\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfuture\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mresult\u001b[49m\u001b[43m(\u001b[49m\u001b[43mtimeout\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mtimeout\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 568\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m CfTimeoutError \u001b[38;5;28;01mas\u001b[39;00m e:\n\u001b[1;32m 569\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTimeoutError\u001b[39;00m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01me\u001b[39;00m\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/concurrent/futures/_base.py:456\u001b[0m, in \u001b[0;36mFuture.result\u001b[0;34m(self, timeout)\u001b[0m\n\u001b[1;32m 454\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m CancelledError()\n\u001b[1;32m 455\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_state \u001b[38;5;241m==\u001b[39m FINISHED:\n\u001b[0;32m--> 456\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m__get_result\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 457\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 458\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTimeoutError\u001b[39;00m()\n",
"File \u001b[0;32m/opt/conda/lib/python3.11/concurrent/futures/_base.py:401\u001b[0m, in \u001b[0;36mFuture.__get_result\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 399\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_exception:\n\u001b[1;32m 400\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 401\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_exception\n\u001b[1;32m 402\u001b[0m \u001b[38;5;28;01mfinally\u001b[39;00m:\n\u001b[1;32m 403\u001b[0m \u001b[38;5;66;03m# Break a reference cycle with the exception in self._exception\u001b[39;00m\n\u001b[1;32m 404\u001b[0m \u001b[38;5;28mself\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m\n",
"\u001b[0;31mTerminatedWorkerError\u001b[0m: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.\n\nThe exit codes of the workers are {SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11)}"
]
}
],
"source": [
"model_ExpBoostReg.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"explainer_ebm = shap.Explainer(model_ExpBoostReg.predict, X100)\n",
"shap_values_ebm = explainer_ebm(X)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shap.plots.waterfall(shap_values_ebm[sample_ind], max_display=100)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shap.plots.beeswarm(shap_values_ebm, max_display=100)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shap.plots.beeswarm(shap_values_ebm.abs, max_display=100)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shap.plots.heatmap(shap_values_ebm[:10000], )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dealing with correlated features"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"clustering = shap.utils.hclust(X, y)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shap.plots.bar(shap_values_ebm, clustering=clustering, max_display=100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Analysis of the partial dependency plots"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# make a standard partial dependence plot\n",
"for variable in loadings_labels:\n",
" print('Variable: '+variable)\n",
" sample_ind = 20\n",
" shap.partial_dependence_plot(\n",
" variable, model.predict, X100, model_expected_value=True,\n",
" feature_expected_value=True, ice=False,\n",
" shap_values=shap_values_ebm[sample_ind:sample_ind+1,:]\n",
" )\n",
" \n",
" shap.plots.scatter(shap_values_ebm[:,variable])\n",
" print('End '+variable)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"include_colab_link": true,
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment