sachinsdate

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from patsy import dmatrices
from matplotlib import pyplot as plt
#Import the 7-variable subset of the automobiles dataset into a DataFrame
df = pd.read_csv('automobiles_dataset_subset_uciml.csv', header=0)
G1 failures schoolsup famsup studytime goout sex
5 0 1 0 2 4 1
5 0 0 1 2 3 1
7 3 1 0 2 2 1
15 0 0 1 3 2 1
6 0 0 1 2 2 1
15 0 0 1 2 2 0
12 0 0 0 2 4 0
6 0 1 1 2 4 1
16 0 0 1 2 2 0
@sachinsdate
sachinsdate / unemployment_rate_us_fred.csv
Created June 21, 2022 05:45
US unemployment rate before and after the Great Recession of 2008-09. Source: https://fred.stlouisfed.org/series/UNRATE
DATE Time_Period UNRATE Epoch
01-01-02 1 5.7 0
01-02-02 2 5.7 0
01-03-02 3 5.7 0
01-04-02 4 5.9 0
01-05-02 5 5.8 0
01-06-02 6 5.8 0
01-07-02 7 5.8 0
01-08-02 8 5.7 0
01-09-02 9 5.7 0
@sachinsdate
sachinsdate / difference_in_differences_regression.py
Created June 17, 2022 09:04
An illustration of the use of the Difference-In-Differences regression model to estimate the effect of hurricanes on house prices
import pandas as pd
from patsy import dmatrices
import statsmodels.api as sm
#Load the dataset into a pandas DataFrame
df = pd.read_csv('us_fred_coastal_us_states_avg_hpi_before_after_2005.csv', header=0)
#Print it
print(df)
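The preview above stops before any model is fitted, so here is a minimal sketch of the idea behind a two-period Difference-In-Differences estimate. It is not the gist's own code. The observations are made-up values for illustration only; the tuple fields mirror the `Time_Period`, `Disaster_Affected` and `HPI_CHG` columns of the accompanying dataset, and the helper names `group_mean` and `did_estimate` are hypothetical. It uses only the standard library so that it runs without the CSV file.

```python
from statistics import mean

# Made-up observations for illustration: (time_period, disaster_affected, hpi_chg)
rows = [
    (0, 0, 0.010), (0, 0, 0.014),   # before, not affected
    (0, 1, 0.012), (0, 1, 0.016),   # before, affected
    (1, 0, 0.020), (1, 0, 0.024),   # after, not affected
    (1, 1, 0.030), (1, 1, 0.034),   # after, affected
]

def group_mean(rows, time_period, disaster_affected):
    """Mean outcome for one (period, treatment) cell."""
    return mean(y for t, d, y in rows
                if t == time_period and d == disaster_affected)

def did_estimate(rows):
    """(Change in the affected group) minus (change in the unaffected group)."""
    treated_change = group_mean(rows, 1, 1) - group_mean(rows, 0, 1)
    control_change = group_mean(rows, 1, 0) - group_mean(rows, 0, 0)
    return treated_change - control_change

effect = did_estimate(rows)
print(round(effect, 4))
```

In a regression of the outcome on the period dummy, the treatment dummy and their product, with no other regressors, the coefficient of the product term equals this difference of mean changes. The full model in the gist may include further variables, in which case the two numbers need not match.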
@sachinsdate
sachinsdate / dummy_variables_regression.py
Created June 13, 2022 10:47
Source code illustrating three different uses of dummy variables in a regression model.
import pandas as pd
import statsmodels.formula.api as smf
from patsy import dmatrices
import scipy.stats as st
from matplotlib import pyplot as plt
#Import the 7-variable subset of the automobiles dataset into a DataFrame
df = pd.read_csv('automobiles_dataset_subset_uciml.csv', header=0)
#############################################################################################
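The preview is cut off before any dummy variable is built, so the following is a minimal sketch of one common use, not the gist's own code: turning a categorical column into 0/1 indicator columns while leaving one level out as the reference category. The `body_style` values are the nine sample rows of the automobiles dataset shown in this listing; the helper `dummy_encode` and its parameters are hypothetical names chosen for illustration.

```python
# body_style values from the nine sample rows of the automobiles dataset
body_styles = ["convertible", "convertible", "hatchback", "sedan", "sedan",
               "sedan", "sedan", "wagon", "sedan"]

def dummy_encode(values, prefix, drop_first=True):
    """Return one 0/1 column per level of a categorical variable.

    With drop_first=True the alphabetically first level is omitted and acts
    as the reference category, which avoids perfect collinearity with the
    regression intercept.
    """
    levels = sorted(set(values))
    if drop_first:
        levels = levels[1:]
    return {f"{prefix}_{level}": [1 if v == level else 0 for v in values]
            for level in levels}

dummies = dummy_encode(body_styles, prefix="body_style")
for name, column in dummies.items():
    print(name, column)
```

With pandas, `pd.get_dummies(df['body_style'], prefix='body_style', drop_first=True)` produces the same kind of columns; the plain-Python version is used here only so that the sketch runs without the CSV file.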
@sachinsdate
sachinsdate / us_fred_coastal_us_states_avg_hpi_before_after_2005.csv
Last active June 13, 2022 10:56
Dataset of state-wise house price inflation before and after the 2005 Atlantic hurricane season. Data source: https://fred.stlouisfed.org/
STATE HPI_CHG Time_Period Disaster_Affected NUM_DISASTERS NUM_IND_ASSIST
GASTHPI_CHG 0.014008563 0 0 1 0
NCSTHPI_CHG 0.014220629 0 0 3 0
TXSTHPI_CHG 0.010191721 0 1 5 22
MASTHPI_CHG 0.027536563 0 0 4 9
ALSTHPI_CHG 0.017585072 0 1 4 14
MSSTHPI_CHG 0.013252413 0 1 3 49
SCSTHPI_CHG 0.017988328 0 0 1 0
NHSTHPI_CHG 0.028513272 0 0 5 6
LASTHPI_CHG 0.015574159 0 1 5 55
make aspiration body_style curb_weight num_of_cylinders engine_size price
alfa-romero std convertible 2548 4 130 13495
alfa-romero std convertible 2548 4 130 16500
alfa-romero std hatchback 2823 6 152 16500
audi std sedan 2337 4 109 13950
audi std sedan 2824 5 136 17450
audi std sedan 2507 5 136 15250
audi std sedan 2844 5 136 17710
audi std wagon 2954 5 136 18920
audi turbo sedan 3086 5 131 23875
@sachinsdate
sachinsdate / wb_data_panel_3ind_7units_1992_2014.csv
Created May 19, 2022 10:25
World Development Indicators data from World Bank under CC BY 4.0 license
COUNTRY YEAR GCF_GWTH_PCNT GDP_PCAP_GWTH_PCNT CO2_PCAP_GWTH_PCNT
Belgium 1992 1.829137475 1.11956586 -0.023584911
Belgium 1993 -2.956525218 -1.34799971 -0.023584911
Belgium 1994 3.764435394 2.909318769 0.040290861
Belgium 1995 4.113740593 2.170550274 -0.00495823
Belgium 1996 0.415438625 1.123669018 0.040558879
Belgium 1997 7.67936209 3.542789064 -0.025884622
Belgium 1998 1.535928255 1.744323895 0.021564632
Belgium 1999 3.811360631 3.305706514 -0.034875079
Belgium 2000 7.189729001 3.465452571 0.01286399
import pandas as pd
import statsmodels.formula.api as smf
from patsy import dmatrices
import scipy.stats as st
##########################################################################################
#Select between two non-nested fixed effects models using the Encompassing Principle
##########################################################################################
@sachinsdate
sachinsdate / conditional_variance.py
Last active April 7, 2022 12:23
Conditional variance and conditional covariance
import pandas as pd
from patsy import dmatrices
import numpy as np
import scipy.stats
import statsmodels.formula.api as sm
import matplotlib.pyplot as plt
#Read the automobiles dataset into a Pandas DataFrame
df = pd.read_csv('automobile_uciml_6vars.csv', header=0)
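The preview ends right after the data is loaded, so here is a minimal sketch of what a conditional variance is, not the gist's own code: the variance of one variable computed separately within each value of another. Because the columns of `automobile_uciml_6vars.csv` are not shown, the sketch assumes the `num_of_cylinders` and `price` values from the nine sample rows of the 7-variable automobiles subset in this listing. The helper `conditional_variance` is a hypothetical name, and the population variance is used so that a group with a single observation is still defined.

```python
from collections import defaultdict
from statistics import pvariance

# (num_of_cylinders, price) pairs from the nine sample rows in this listing
samples = [(4, 13495), (4, 16500), (6, 16500), (4, 13950), (5, 17450),
           (5, 15250), (5, 17710), (5, 18920), (5, 23875)]

def conditional_variance(pairs):
    """Population variance of y within each distinct value of x: Var(y | x)."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: pvariance(ys) for x, ys in groups.items()}

cond_var = conditional_variance(samples)
for x in sorted(cond_var):
    print(x, round(cond_var[x], 2))
```

The single 6-cylinder row gives a variance of zero, which is a reminder that conditional estimates from small groups carry little information. Conditional covariance follows the same pattern, with the within-group covariance of two variables in place of the variance.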