Theophano Mitsa theomitsa
1. Create a calibration set (held-out data that was not used to fit the model).
2. Compute the absolute residuals on the calibration set:
   r_i = |y_i - yf_i|
   where y_i is the actual observed value in the calibration set
   and yf_i is the model prediction for the same point.
3. Find the (1-a) quantile of the residuals, q(1-a), where a is the significance level, e.g. 0.05.
4. Form the conformal interval for a new forecast yf_t at time t:
   [yf_t - q(1-a), yf_t + q(1-a)]
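The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `conformal_interval` and its argument names are hypothetical, and the quantile is taken directly at level 1-a as in step 3 (no finite-sample correction is applied).

```python
import numpy as np

def conformal_interval(y_calib, y_calib_pred, y_new_pred, alpha=0.05):
    """Symmetric conformal interval built from absolute calibration residuals."""
    y_calib = np.asarray(y_calib, dtype=float)
    y_calib_pred = np.asarray(y_calib_pred, dtype=float)

    # Step 2: absolute residuals on the calibration set
    residuals = np.abs(y_calib - y_calib_pred)

    # Step 3: (1 - alpha) quantile of the residuals (default linear interpolation)
    q = np.quantile(residuals, 1 - alpha)

    # Step 4: interval around the new forecast(s)
    y_new_pred = np.asarray(y_new_pred, dtype=float)
    return y_new_pred - q, y_new_pred + q
```

With a = 0.05, roughly 95% of the calibration residuals are no larger than q, which is what motivates using it as the half-width of the interval.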
# Collapse the detailed education levels into broader groups.
# Assigning the result back avoids calling replace(..., inplace=True) on a
# selected column, which recent pandas versions warn about.
incomplete_levels = ['Preschool', '1st-4th', '5th-6th', '7th-8th',
                     '9th', '10th', '11th', '12th']
education_groups = {level: 'IncompleteED' for level in incomplete_levels}
education_groups.update({'Some-college': 'CommunityCollege',
                         'Assoc-acdm': 'CommunityCollege'})
income1['education'] = income1['education'].replace(education_groups)
result = exp.segmented_diagnose(model='XGB1', show='accuracy_table',
                                segment_id=0, segment_feature='education',
                                return_data=True)
result.data.head(10)
   Segment ID    Feature   Segment  Size       ACC
0           0  education  2.000000   106  0.462264
1           1  education  5.000000   485  0.550515
2           2  education  6.000000   167  0.610778
3           3  education  0.000000  1538  0.711313
4           4  education  1.000000  2712  0.803835
5           5  education  3.000000  2938  0.854323
6           6  education  4.000000  1099  0.946315
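A table like the one above (segments sorted from worst to best accuracy) can be cross-checked by hand with a pandas groupby. The sketch below is not PiML's implementation; the helper name `segment_accuracy` and the column names `y_true` and `y_pred` are assumptions made for illustration.

```python
import pandas as pd

def segment_accuracy(df, feature, y_true="y_true", y_pred="y_pred"):
    """Per-segment size and accuracy for one categorical feature, worst first."""
    # True where the prediction matches the label
    correct = df[y_true] == df[y_pred]
    return (
        correct.groupby(df[feature])
        .agg(Size="size", ACC="mean")   # segment size and share of correct predictions
        .sort_values("ACC")             # lowest-accuracy segment first
        .reset_index()
        .rename(columns={feature: "Segment"})
    )
```

Calling `segment_accuracy(df, "education")` on a frame holding the encoded feature, the labels, and the model predictions would return one row per education code.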
          ACC     AUC       F1  LogLoss   Brier
Train  0.4863  0.7679   0.4706   1.1093  0.3762
Test   0.4623  0.7821   0.4124   1.1401  0.3906
Gap   -0.0240  0.0143  -0.0582   0.0308  0.0144
theomitsa / gist:8863162c725a479a19692729376f7aa1
Last active May 14, 2024 08:34
segmented diagnostic analysis
result = exp.segmented_diagnose(model='XGB1', show='segment_table',
                                segment_method='uniform', segment_feature='education',
                                segment_bins=5, return_data=True)
exp.model_diagnose(model="EBM", show="resilience_perf", resilience_method="worst-cluster",
                   figsize=(5, 4))
exp.model_diagnose(model="XGB1", show='robustness_perf', perturb_features=None,
                   perturb_method="quantile", metric="ACC", perturb_size=0.1, figsize=(6, 4))
exp.model_diagnose(model="EBM", show="reliability_distance",
                   threshold=1.1, distance_metric="PSI", figsize=(5, 4))
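The `distance_metric="PSI"` option refers to the Population Stability Index, which compares how two samples of a feature are distributed across bins. For intuition, here is a minimal standalone sketch of the usual formula, the sum over bins of (a - e) * ln(a / e). It is not PiML's internal code: the function name `psi`, the quantile-based binning, and the `eps` floor for empty bins are all illustrative assumptions.

```python
import numpy as np

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a comparison sample."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Interior bin edges from quantiles of the reference sample (assumed choice)
    inner = np.quantile(expected, np.linspace(0, 1, bins + 1))[1:-1]

    # Share of each sample falling into each bin
    e_idx = np.searchsorted(inner, expected, side="right")
    a_idx = np.searchsorted(inner, actual, side="right")
    e_frac = np.bincount(e_idx, minlength=bins) / expected.size
    a_frac = np.bincount(a_idx, minlength=bins) / actual.size

    # Floor the shares so empty bins do not produce log(0)
    e_frac = np.clip(e_frac, eps, None)
    a_frac = np.clip(a_frac, eps, None)

    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))
```

Identical samples give a PSI of 0, and the value grows as the two distributions drift apart.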
income1['marital-status'] = income1['marital-status'].astype('category')
# Generate the mapping
marital_mapping = dict(enumerate(income1['marital-status'].cat.categories))
print(marital_mapping)
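To see what the mapping above contains, the same two lines can be run on a toy frame. The labels below are made up for illustration and are not taken from the `income1` data; `cat.codes` gives the integer code stored for each row, and the dictionary maps each code back to its label.

```python
import pandas as pd

# Hypothetical toy data standing in for income1
toy = pd.DataFrame({"marital-status": ["Married", "Divorced", "Never-married", "Married"]})
toy["marital-status"] = toy["marital-status"].astype("category")

# Code -> label mapping (categories are sorted alphabetically for string data)
code_to_label = dict(enumerate(toy["marital-status"].cat.categories))

# Integer code assigned to each row
toy["marital-code"] = toy["marital-status"].cat.codes

print(code_to_label)
print(toy)
```

Keeping this dictionary around makes it possible to translate encoded segment values in later diagnostics back to the original category names.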