An illustration of the use of the Difference-In-Differences regression model to estimate the effect of hurricanes on house prices
import pandas as pd
from patsy import dmatrices
import statsmodels.api as sm
#Load the data set into a Pandas Dataframe
df = pd.read_csv('us_fred_coastal_us_states_avg_hpi_before_after_2005.csv', header=0)
#Print it
print(df)
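#Form the regression expression in Patsy syntax. Patsy adds the intercept column to the design
# matrix automatically, so it is not listed here. The coefficient of the interaction term
# Time_Period*Disaster_Affected is the difference-in-differences estimate.
reg_exp = 'HPI_CHG ~ Time_Period + Disaster_Affected + Time_Period*Disaster_Affected'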
#Carve out the training matrices
y_train, X_train = dmatrices(reg_exp, df, return_type='dataframe')
#Build the DID model
did_model = sm.OLS(endog=y_train, exog=X_train)
#Train the model
did_model_results = did_model.fit()
#Print out the training results. Outside a notebook, summary() must be wrapped in print()
# for the table to be displayed.
print(did_model_results.summary())
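
To see what the interaction coefficient measures, the following is a minimal sketch on a made-up toy data set (the eight rows below are hypothetical and are not taken from the CSV file used above). It assumes that Time_Period and Disaster_Affected are 0/1 indicator columns, which the original script does not confirm. Under that assumption, the OLS coefficient on the interaction term should match the difference-in-differences computed by hand from the four group means.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data: two groups (affected or not), each observed before and after an event.
toy = pd.DataFrame({
    'HPI_CHG':           [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 7.0, 8.0],
    'Time_Period':       [0, 0, 1, 1, 0, 0, 1, 1],
    'Disaster_Affected': [0, 0, 0, 0, 1, 1, 1, 1],
})

# Fit the same model form as above; '*' expands to both main effects plus the interaction.
fit = smf.ols('HPI_CHG ~ Time_Period * Disaster_Affected', data=toy).fit()
did_coef = fit.params['Time_Period:Disaster_Affected']

# Compute the same quantity by hand from the four group means.
means = toy.groupby(['Disaster_Affected', 'Time_Period'])['HPI_CHG'].mean()
change_affected = means.loc[(1, 1)] - means.loc[(1, 0)]
change_unaffected = means.loc[(0, 1)] - means.loc[(0, 0)]
did_manual = change_affected - change_unaffected

print('Interaction coefficient:', round(did_coef, 6))
print('Difference of differences:', round(did_manual, 6))
```

With two binary regressors and their interaction, the model has one parameter per group mean, so the two numbers agree up to floating-point error.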