@tomonari-masada
Created April 3, 2017 07:50
Reproduce Table 2.2 of Applied Logistic Regression (3rd Edition) with Statsmodels
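import pandas as pd
import statsmodels.api as sm

# glow500.xls at https://www.umass.edu/statdata/statdata/data/glow/index.html
xls_file = pd.ExcelFile('glow500.xls')
df = xls_file.parse(header=0)

# Dummy-code RATERISK (levels 1, 2, 3). dtype=int keeps the design matrix numeric
# on pandas versions where get_dummies returns booleans by default.
rate_dummies = pd.get_dummies(df['RATERISK'], dtype=int)
rate_dummies.columns = ['RATERISK1', 'RATERISK2', 'RATERISK3']

y = df['FRACTURE']
# Drop identifiers, FRACSCORE, the response, and the raw RATERISK column.
X = df.drop(['SUB_ID', 'SITE_ID', 'PHY_ID', 'RATERISK', 'FRACSCORE', 'FRACTURE'], axis=1)
X = pd.concat([X, rate_dummies], axis=1)
# RATERISK1 serves as the reference category.
X = X.drop('RATERISK1', axis=1)
# Drop the covariates that are left out of this model.
X = X.drop(['HEIGHT', 'BMI', 'MOMFRAC', 'ARMASSIST', 'SMOKE'], axis=1)
X = sm.add_constant(X)

# Fit the logistic regression and print the coefficient table.
result = sm.Logit(y, X).fit(disp=0)
print(result.summary())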
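
The dummy-coding steps above (build three indicator columns, rename them, then drop RATERISK1 as the reference level) can probably be collapsed into a single `pd.get_dummies` call. The sketch below shows this on a small made-up column rather than the GLOW data; the `toy` frame and its values are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Toy stand-in for the RATERISK column (hypothetical values, not GLOW data).
toy = pd.DataFrame({'RATERISK': [1, 2, 3, 2, 1]})

# drop_first=True omits the first level (1), making it the reference category.
# prefix and prefix_sep='' produce the names RATERISK2 and RATERISK3.
# dtype=int yields 0/1 integers rather than booleans.
dummies = pd.get_dummies(toy['RATERISK'], prefix='RATERISK', prefix_sep='',
                         drop_first=True, dtype=int)

print(dummies.columns.tolist())
print(dummies)
```

If this form is used in the script, the separate column rename and the `X.drop('RATERISK1', axis=1)` step should no longer be needed.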