@mzaradzki
Last active July 4, 2017 11:08
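import pandas as pd

# Assumes `df` (the dataset) and `cramers_corrected_stat` are already defined.
# Select columns that have "few" unique values (fewer than 250)
cramer_cols = [col for col in df.columns.values if len(df[col].unique()) < 250]

for col in cramer_cols:
    try:
        # Contingency table of the column against the target variable
        cm = pd.crosstab(df[col], df['status_group']).values
        cv1 = cramers_corrected_stat(cm)
        # Report columns whose statistic is at least 0.20, as a truncated percentage
        if cv1 >= 0.20:
            print(col, int(cv1 * 100))
    except Exception:
        # Skip columns for which the statistic cannot be computed
        pass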
# Output :
# funder 25
# installer 25
# region 20
# scheme_name 20
# extraction_type 24
# quantity 30
# waterpoint_type 25
# amount_tsh_zero 22
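
The snippet above calls `cramers_corrected_stat` without defining it. A minimal sketch of what such a helper might look like follows, assuming it computes the bias-corrected Cramér's V (Bergsma, 2013) from a contingency table of counts. This is an assumption, not the author's code: the original helper may differ, for instance by obtaining the chi-squared statistic from `scipy.stats.chi2_contingency`. The version here uses plain Python so it runs without extra dependencies, and it accepts either nested lists or the NumPy array returned by `pd.crosstab(...).values`.

```python
from math import sqrt


def cramers_corrected_stat(table):
    """Bias-corrected Cramér's V for a 2D contingency table of counts.

    Hypothetical stand-in for the helper used in the gist; returns a
    value between 0 (no association) and 1 (perfect association).
    """
    rows = [[float(x) for x in row] for row in table]
    r, k = len(rows), len(rows[0])
    n = sum(sum(row) for row in rows)
    if n <= 1:
        return 0.0  # not enough observations for the correction

    row_tot = [sum(row) for row in rows]
    col_tot = [sum(row[j] for row in rows) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction)
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                chi2 += (rows[i][j] - expected) ** 2 / expected

    # Bias correction of phi^2 and of the table dimensions
    phi2 = chi2 / n
    phi2corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    rcorr = r - (r - 1) ** 2 / (n - 1)
    kcorr = k - (k - 1) ** 2 / (n - 1)

    denom = min(rcorr - 1, kcorr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (a single row or column)
    return sqrt(phi2corr / denom)


# Usage: a perfectly associated table versus an independent one
perfect = cramers_corrected_stat([[50, 0], [0, 50]])
independent = cramers_corrected_stat([[25, 25], [25, 25]])
```

The correction subtracts the expected value of phi-squared under independence, which matters for columns with many categories (such as `funder` or `installer`), where the uncorrected statistic tends to be inflated.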