@lgallen
Last active September 23, 2020 16:31
Shabnam project homework help
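import pandas as pd

# Grab the fitted ColumnTransformer from the trained pipeline
pre = fit_model.named_steps['preprocessor']
# Numerical feature names: the column list of the first transformer
num_feats = list(pre.transformers_[0][2])
# Categorical feature names after one-hot encoding
# (scikit-learn >= 1.0: use get_feature_names_out; get_feature_names was removed in 1.2)
cat_feats = pre.transformers_[1][1]['onehot']\
.get_feature_names(categorical_features)
all_feats = num_feats + list(cat_feats)
# Dataframe for visual examination of coefficients
# (`model` is the fitted estimator from the homework; if coef_ is 2-D,
# as with some classifiers, select the row for one class first)
df_coefs = pd.DataFrame()
df_coefs['feature'] = all_feats
df_coefs['coefficient'] = model.coef_
# Keep only coefficients whose magnitude exceeds 0.01,
# sorted from most negative to most positive
df_coefs[abs(df_coefs['coefficient']) > .01].sort_values('coefficient')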

lgallen commented Sep 22, 2020

Run this after you have trained the model in the homework submission you shared. It lists each feature next to its coefficient, which will help you understand what the model is doing. I think what you have now is OK, but you probably have some multicollinearity, especially among the geographic features.
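
One way to check for multicollinearity is to compute a variance inflation factor (VIF) for each numeric feature. The sketch below is only an illustration, not part of the homework code: the helper name `variance_inflation_factors` and the columns `lat`, `lat_km`, and `sqft` are made up, and it uses the identity that the VIFs are the diagonal of the inverse of the feature correlation matrix. A common rule of thumb treats values well above 5 to 10 as a warning sign, though that threshold is a convention rather than a hard rule.

```python
import numpy as np
import pandas as pd


def variance_inflation_factors(X: pd.DataFrame) -> pd.Series:
    """Return the VIF of each numeric column, largest first.

    Uses the diagonal of the inverse of the correlation matrix.
    (Hypothetical helper, not from scikit-learn or pandas.)
    """
    corr = X.corr().to_numpy()
    vif = np.diag(np.linalg.inv(corr))
    return pd.Series(vif, index=X.columns, name="vif").sort_values(ascending=False)


# Made-up data: 'lat_km' is almost a rescaled copy of 'lat',
# while 'sqft' is unrelated to both.
rng = np.random.default_rng(0)
lat = rng.normal(size=200)
demo = pd.DataFrame({
    "lat": lat,
    "lat_km": lat * 111 + rng.normal(scale=10, size=200),
    "sqft": rng.normal(size=200),
})

vifs = variance_inflation_factors(demo)
print(vifs)
```

On the real data you would pass the numeric columns of your training set instead of `demo`. Features with a very large VIF are candidates for dropping or combining.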


lgallen commented Sep 23, 2020

Updated to include sort_values, and fixed a typo in the original that excluded negative coefficients.
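
To see why the filter has to work on the magnitude, here is a small sketch with made-up coefficients. The exact form of the original typo is not shown here, so the `signed` filter below is an assumption about what a filter without `abs` would look like.

```python
import pandas as pd

# Hypothetical coefficients: two sizeable (one negative), two near zero
demo = pd.DataFrame({
    "feature": ["a", "b", "c", "d"],
    "coefficient": [0.50, -0.30, 0.001, -0.002],
})

# Filtering on the signed value silently drops the strong negative coefficient
signed = demo[demo["coefficient"] > 0.01]

# Filtering on the absolute value keeps both, sorted most negative first
by_magnitude = demo[demo["coefficient"].abs() > 0.01].sort_values("coefficient")

print(list(signed["feature"]))        # ['a']
print(list(by_magnitude["feature"]))  # ['b', 'a']
```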