@akelleh
Created February 23, 2019 12:57
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Lalonde Pandas API Example"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll run through a quick example using the high-level Python API for the DoSampler. The DoSampler is different from most classic causal effect estimators. Instead of estimating statistics under interventions, it aims to provide the generality of Pearlian causal inference. In that context, the joint distribution of the variables under an intervention is the quantity of interest. It's hard to represent a joint distribution nonparametrically, so instead we provide a sample from that distribution, which we call a \"do\" sample.\n",
"\n",
"Here, when you specify an outcome, that is the variable you're sampling under an intervention. We still have to do the usual process of making sure the quantity (the conditional interventional distribution of the outcome) is identifiable. We leverage the familiar components of the rest of the package to do that \"under the hood\". You'll notice some similarity in the kwargs for the DoSampler.\n",
"\n",
"## Getting the Data\n",
"\n",
"First, download the data from the LaLonde example."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from rpy2.robjects import r as R\n",
"\n",
"%load_ext rpy2.ipython\n",
"%R install.packages(\"Matching\")\n",
"%R library(Matching)\n",
"%R data(lalonde)\n",
"%R -o lalonde\n",
"lalonde.to_csv(\"lalonde.csv\",index=False)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# The data was already downloaded in the previous cell. We reload it from the\n",
"# CSV here so you don't have to keep re-downloading it.\n",
"\n",
"import pandas as pd\n",
"\n",
"lalonde=pd.read_csv(\"lalonde.csv\")\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The `causal` Namespace"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We've created a \"namespace\" for `pandas.DataFrame`s containing causal inference methods. You can access it here with `lalonde.causal`, where `lalonde` is our `pandas.DataFrame`, and `causal` contains all our new methods! These methods are magically loaded into your existing (and future) dataframes when you `import dowhy.api`."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"import dowhy.api\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have the `causal` namespace, let's give it a try!\n",
"\n",
"## The `do` Operation\n",
"\n",
"The key feature here is the `do` method, which produces a new dataframe in which the treatment variable is replaced with the values you specify and the outcome is replaced with a sample from its interventional distribution. If you don't specify a value for the treatment, the treatment is left untouched:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:dowhy.do_why:Causal Graph not provided. DoWhy will construct a graph based on data inputs.\n",
"INFO:dowhy.causal_identifier:Common causes of treatment and outcome:{'hisp', 'black', 'educ', 'nodegr', 'U', 'age', 'married'}\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"['hisp', 'black', 'educ', 'nodegr', 'age', 'married']\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"Model to find the causal effect of treatment treat on outcome re78\n",
"{'observed': 'yes'}\n",
"{'observed': 'yes'}\n",
"{'observed': 'yes'}\n",
"{'observed': 'yes'}\n",
"{'label': 'Unobserved Confounders', 'observed': 'no'}\n",
"There are unobserved common causes. Causal effect cannot be identified.\n",
"WARN: Do you want to continue by ignoring these unobserved confounders? [y/n] \n",
"Please respond with 'y' or 'n'\n",
"WARN: Do you want to continue by ignoring these unobserved confounders? [y/n] \n",
"Please respond with 'y' or 'n'\n",
"WARN: Do you want to continue by ignoring these unobserved confounders? [y/n] yes\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:dowhy.causal_identifier:Instrumental variables for treatment and outcome:[]\n",
"INFO:dowhy.do_sampler:Using WeightingSampler for do sampling.\n",
"INFO:dowhy.do_sampler:Caution: do samplers assume iid data.\n",
"/home/akelleh/.virtualenvs/data/lib/python3.6/site-packages/pandas/core/frame.py:3140: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
" self[k1] = value[k2]\n",
"/home/akelleh/.virtualenvs/data/lib/python3.6/site-packages/sklearn/linear_model/logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n",
" FutureWarning)\n",
"/home/akelleh/.virtualenvs/data/lib/python3.6/site-packages/sklearn/utils/validation.py:761: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().\n",
" y = column_or_1d(y, warn=True)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"WeightingSampler\n",
"treatments ['treat']\n",
"backdoor ['hisp', 'black', 'educ', 'nodegr', 'age', 'married']\n",
" educ age 0\n",
"0 11 37 0\n",
"1 9 22 0\n",
"2 12 30 1\n",
"3 11 27 0\n",
"4 8 33 0\n"
]
}
],
"source": [
"do_df = lalonde.causal.do(x='treat',\n",
" outcome='re78',\n",
" common_causes=['nodegr', 'black', 'hisp', 'age', 'educ', 'married'],\n",
" variable_types={'age': 'c', 'educ':'c', 'black': 'd', 'hisp': 'd', \n",
" 'married': 'd', 'nodegr': 'd','re78': 'c', 'treat': 'b'})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice you get the usual output and prompts about identifiability. This is all `dowhy` under the hood!\n",
"\n",
"We now have an interventional sample in `do_df`. It looks very similar to the original dataframe. Compare them:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>educ</th>\n",
" <th>black</th>\n",
" <th>hisp</th>\n",
" <th>married</th>\n",
" <th>nodegr</th>\n",
" <th>re74</th>\n",
" <th>re75</th>\n",
" <th>re78</th>\n",
" <th>u74</th>\n",
" <th>u75</th>\n",
" <th>treat</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>37</td>\n",
" <td>11</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>9930.05</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>22</td>\n",
" <td>9</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3595.89</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>30</td>\n",
" <td>12</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>24909.50</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>27</td>\n",
" <td>11</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>7506.15</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>33</td>\n",
" <td>8</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>289.79</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age educ black hisp married nodegr re74 re75 re78 u74 u75 \\\n",
"0 37 11 1 0 1 1 0.0 0.0 9930.05 1 1 \n",
"1 22 9 0 1 0 1 0.0 0.0 3595.89 1 1 \n",
"2 30 12 1 0 0 0 0.0 0.0 24909.50 1 1 \n",
"3 27 11 1 0 0 1 0.0 0.0 7506.15 1 1 \n",
"4 33 8 1 0 0 1 0.0 0.0 289.79 1 1 \n",
"\n",
" treat \n",
"0 1 \n",
"1 1 \n",
"2 1 \n",
"3 1 \n",
"4 1 "
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lalonde.head()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>educ</th>\n",
" <th>black</th>\n",
" <th>hisp</th>\n",
" <th>married</th>\n",
" <th>nodegr</th>\n",
" <th>re74</th>\n",
" <th>re75</th>\n",
" <th>re78</th>\n",
" <th>u74</th>\n",
" <th>u75</th>\n",
" <th>treat</th>\n",
" <th>propensity_score</th>\n",
" <th>weight</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>23</td>\n",
" <td>10</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0.370679</td>\n",
" <td>1.589014</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>22</td>\n",
" <td>8</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>1390.51</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0.390692</td>\n",
" <td>1.641205</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>24</td>\n",
" <td>11</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>2669.73</td>\n",
" <td>1468.38</td>\n",
" <td>10361.70</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.361870</td>\n",
" <td>1.567078</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>25</td>\n",
" <td>11</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.363949</td>\n",
" <td>2.747637</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>21</td>\n",
" <td>11</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>1553.29</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0.355663</td>\n",
" <td>1.551983</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age educ black hisp married nodegr re74 re75 re78 u74 \\\n",
"0 23 10 1 0 0 1 0.00 0.00 0.00 1 \n",
"1 22 8 1 0 0 1 0.00 0.00 1390.51 1 \n",
"2 24 11 0 0 0 1 2669.73 1468.38 10361.70 0 \n",
"3 25 11 1 0 1 1 0.00 0.00 0.00 1 \n",
"4 21 11 1 0 0 1 0.00 0.00 1553.29 1 \n",
"\n",
" u75 treat propensity_score weight \n",
"0 1 0 0.370679 1.589014 \n",
"1 1 0 0.390692 1.641205 \n",
"2 0 0 0.361870 1.567078 \n",
"3 1 1 0.363949 2.747637 \n",
"4 1 0 0.355663 1.551983 "
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"do_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Treatment Effect Estimation\n",
"\n",
"Before any adjustment, we could get a naive estimate of the treatment effect from the original data by doing"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"$$1794.3430848752569$$"
],
"text/plain": [
"1794.3430848752569"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(lalonde[lalonde['treat'] == 1].mean() - lalonde[lalonde['treat'] == 0].mean())['re78']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can do the same with our new sample from the interventional distribution to get a causal effect estimate"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"$$1381.348792570595$$"
],
"text/plain": [
"1381.348792570595"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(do_df[do_df['treat'] == 1].mean() - do_df[do_df['treat'] == 0].mean())['re78']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can get rough error bars on this effect estimate using the normal approximation for a 95% confidence interval, like\n"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"$$1082.2922266411258$$"
],
"text/plain": [
"1082.2922266411258"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"1.96*np.sqrt((do_df[do_df['treat'] == 1].var()/len(do_df[do_df['treat'] == 1])) + \n",
" (do_df[do_df['treat'] == 0].var()/len(do_df[do_df['treat'] == 0])))['re78']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"but note that these error bars do NOT account for propensity score estimation error. For that, a bootstrapping procedure might be more appropriate."
]
},
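{
"cell_type": "markdown",
"metadata": {},
"source": [
"The bootstrap mentioned above could be sketched roughly as follows. This is an illustration under stated assumptions, not the DoSampler's own procedure: it assumes a plain logistic-regression propensity model and an inverse-propensity-weighted difference in means, and the helper `ipw_effect`, the `covariates` list, and the choice of 500 replicates are hypothetical names and values introduced here. Refitting the propensity model inside every replicate is what lets the resulting percentile interval reflect propensity score estimation error."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"# Same adjustment set as in the `do` call above.\n",
"covariates = ['nodegr', 'black', 'hisp', 'age', 'educ', 'married']\n",
"\n",
"def ipw_effect(df):\n",
"    # Hypothetical helper: refit the propensity model on this resample so that\n",
"    # its estimation error is carried into the bootstrap distribution.\n",
"    model = LogisticRegression(solver='lbfgs', max_iter=1000)\n",
"    model.fit(df[covariates], df['treat'])\n",
"    p = model.predict_proba(df[covariates])[:, 1]  # estimated P(treat=1 | covariates)\n",
"    treated = (df['treat'] == 1).values\n",
"    y = df['re78'].values\n",
"    # Inverse-propensity weights: 1/p for treated rows, 1/(1-p) for controls.\n",
"    w = np.where(treated, 1.0 / p, 1.0 / (1.0 - p))\n",
"    return (np.average(y[treated], weights=w[treated]) -\n",
"            np.average(y[~treated], weights=w[~treated]))\n",
"\n",
"rng = np.random.RandomState(0)\n",
"n_boot = 500  # assumed number of bootstrap replicates\n",
"estimates = [ipw_effect(lalonde.sample(frac=1.0, replace=True, random_state=rng))\n",
"             for _ in range(n_boot)]\n",
"\n",
"# Percentile bootstrap interval for the effect of `treat` on `re78`.\n",
"np.percentile(estimates, [2.5, 97.5])"
]
},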
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is just one statistic we can compute from the interventional distribution of `'re78'`. We can get all of the interventional moments as well, including functions of `'re78'`. We can leverage the full power of pandas, like"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 445.000000\n",
"mean 5109.177730\n",
"std 5862.739983\n",
"min 0.000000\n",
"25% 0.000000\n",
"50% 3783.660000\n",
"75% 8061.490000\n",
"max 34099.300000\n",
"Name: re78, dtype: float64"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"do_df['re78'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 445.000000\n",
"mean 5300.765138\n",
"std 6631.493362\n",
"min 0.000000\n",
"25% 0.000000\n",
"50% 3701.810000\n",
"75% 8124.720000\n",
"max 60307.900000\n",
"Name: re78, dtype: float64"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lalonde['re78'].describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and even plot aggregations, like"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f426a38fe48>"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"\n",
"do_df.groupby('treat').mean().plot(y='re78', kind='bar')"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f426a032208>"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAE4NJREFUeJzt3W+sleWZ7/Hvpcgwx9EiuksUrJuO1A7V1Dp7EOM46Uj822Ygp9Vo2iPl2PBCPZ0m08zgZBqmrZ3ovGjHJq0JKSiajtQ4o1LbVCnUno4nKts/xypI2bVaN6OwBUrVDir2mhfrBpe4d/fasFgLuL+fZGc9z3Xf61nXk8D67efPWjsyE0lSfQ7rdgOSpO4wACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKlDABJqpQBIEmVGtftBn6f4447Lnt7e7vdhiQdVB599NGXM7NntHkHdAD09vbS39/f7TYk6aASEc+3Ms9TQJJUKQNAkiplAEhSpQ7oawDDefPNNxkcHGTHjh3dbmW/mjBhAlOnTuWII47odiuSDlEHXQAMDg5y1FFH0dvbS0R0u539IjPZsmULg4ODTJs2rdvtSDpEHXSngHbs2MGxxx57yL75A0QExx577CF/lCOpuw66AAAO6Tf/XWrYR0nddVAGgCRp3x101wD21Lvw+23d3nPXf6xt2zrnnHN45ZVXANi8eTMzZ87k7rvvZvv27Xz605/mV7/6FTt37uQLX/gC8+fPb9vrSu/wj+/pdgeHjn/c3u0O2uqgD4Buy0wyk8MOe/fB1E9/+tPdy5/4xCeYM2cOAN/85jeZMWMG3/ve9xgaGuKUU07hU5/6FOPHj+9Y35LkKaC98Nxzz3HKKadwxRVXcOqpp3Lbbbdx1llnccYZZ3DJJZfw6quvvmP+b37zG1avXs3cuXOBxvn9V155hczk1VdfZdKkSYwbZxZL6iwDYC9t2LCBq666ip/85CcsWbKEH/3oRzz22GP09fXxta997R1z7777bmbPns3RRx8NwDXXXMO6des44YQTOO2007jxxhuHPYKQpP3JXzv30kknncSsWbO49957Wbt2LWeffTYAb7zxBmedddY75t5+++189rOf3b1+3333cfrpp7N69Wp+8YtfcN5553HOOefsDghJ6gQDYC8deeSRQOMawHnnncftt98+7LyXX36ZRx55hLvuumt37eabb2bhwoVEBCeffDLTpk3jmWeeYebMmR3pXZLAU0D7bNasWTz44IMMDAwA8Nprr/Hzn/989/idd97Jxz/+cSZMmLC79r73vY9Vq1YBsGnTJtavX8/73//+zjYuqXoH/RFAO2/b3Bs9PT3ccsstXH755bz++usAXHfddXzgAx8AYPny5SxcuPAdz/niF7/IZz7zGU477TQykxtuuIHjjjuu471LqltLARARE4FvA6cCCfxvYD3wXaAXeA64NDO3ReMjrDcCFwO/BT6TmY+V7cwD/qFs9rrMXNa2Pemg3t5ennrqqd3r5557LmvWrBl27gMPPPCu2gknnMD999+/v9qTpJa0egroRuCHmflB4MPAOmAhsCozpwOryjrARcD08rMAuAkgIiYBi4AzgZnAoog4pk37IUkao1EDICLeA/wFsAQgM9/IzF8Dc4Bdv8EvA+aW5TnArdnwEDAxIo4HLgBWZubWzNwGrAQubOveSJJa1soRwDRgCLg5Ih6PiG9HxJHA5Mx8scx5CZhclqcALzQ9f7DURqqPWWbuzdMOKjXso6TuaiUAxgFnADdl5keA13j7dA8A2Xi3ass7VkQsiIj+iOgfGhp61/iECRPYsmXLIf0GuevvATTfOSRJ7dbKReBBYDAzHy7rd9IIgE0RcXxmvlhO8Wwu4xuBE5ueP7XUNgIf3aP+wJ4vlpmLgcUAfX1973qXnzp1KoODgwwXDoeSXX8RTJL2l1EDIDNfiogXIuKUzFwPzAbWlp
95wPXl8Z7ylBXANRGxnMYF3+0lJO4D/qnpwu/5wLVjbfiII47wr2RJUhu0+jmA/wN8JyLGA88C82mcProjIq4EngcuLXN/QOMW0AEat4HOB8jMrRHxFWDX/ZJfzsytbdkLSdKYtRQAmfkE0DfM0Oxh5iZw9QjbWQosHUuDkqT9w6+CkKRKGQCSVCkDQJIqZQBIUqUMAEmqlAEgSZUyACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKlDABJqpQBIEmVMgAkqVIGgCRVygCQpEoZAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVaqlAIiI5yLiZxHxRET0l9qkiFgZERvK4zGlHhHxjYgYiIgnI+KMpu3MK/M3RMS8/bNLkqRWjOUI4C8z8/TM7CvrC4FVmTkdWFXWAS4CppefBcBN0AgMYBFwJjATWLQrNCRJnbcvp4DmAMvK8jJgblP91mx4CJgYEccDFwArM3NrZm4DVgIX7sPrS5L2QasBkMD9EfFoRCwotcmZ+WJZfgmYXJanAC80PXew1Eaqv0NELIiI/ojoHxoaarE9SdJYjWtx3p9n5saIeC+wMiKeaR7MzIyIbEdDmbkYWAzQ19fXlm1Kkt6tpSOAzNxYHjcDd9E4h7+pnNqhPG4u0zcCJzY9fWqpjVSXJHXBqAEQEUdGxFG7loHzgaeAFcCuO3nmAfeU5RXAFeVuoFnA9nKq6D7g/Ig4plz8Pb/UJEld0MopoMnAXRGxa/6/ZuYPI2INcEdEXAk8D1xa5v8AuBgYAH4LzAfIzK0R8RVgTZn35czc2rY9kSSNyagBkJnPAh8epr4FmD1MPYGrR9jWUmDp2NuUJLWbnwSWpEoZAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVcoAkKRKtfptoPo9ehd+v9stHFKeu/5j3W5BqoJHAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVcoAkKRKGQCSVCkDQJIqZQBIUqUMAEmqlAEgSZUyACSpUi0HQEQcHhGPR8S9ZX1aRDwcEQMR8d2IGF/qf1DWB8p4b9M2ri319RFxQbt3RpLUurEcAfw1sK5p/Qbg65l5MrANuLLUrwS2lfrXyzwiYgZwGfAh4ELgWxFx+L61L0naWy0FQERMBT4GfLusB3AucGeZsgyYW5bnlHXK+Owyfw6wPDNfz8xfAgPAzHbshCRp7Fo9AvgX4G+B35X1Y4FfZ+bOsj4ITCnLU4AXAMr49jJ/d32Y5+wWEQsioj8i+oeGhsawK5KksRg1ACLi48DmzHy0A/2QmYszsy8z+3p6ejrxkpJUpVb+JvDZwF9FxMXABOBo4EZgYkSMK7/lTwU2lvkbgROBwYgYB7wH2NJU36X5OZKkDhv1CCAzr83MqZnZS+Mi7urM/BTwY+CTZdo84J6yvKKsU8ZXZ2aW+mXlLqFpwHTgkbbtiSRpTFo5AhjJ3wHLI+I64HFgSakvAW6LiAFgK43QIDOfjog7gLXATuDqzHxrH15fkrQPxhQAmfkA8EBZfpZh7uLJzB3AJSM8/6vAV8fapCSp/fwksCRVygCQpEoZAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVcoAkKRKGQCSVCkDQJIqZQBIUqUMAEmqlAEgSZUyACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKlDABJqpQBIEmVGjUAImJCRDwSEf8/Ip6OiC+V+rSIeDgiBiLiuxExvtT/oKwPlPHepm1dW+rrI+KC/bVTkqTRtXIE8DpwbmZ+GDgduDAiZgE3AF/PzJOBbcCVZf6VwLZS/3qZR0TMAC4DPgRcCHwrIg5v585Iklo3agBkw6tl9Yjyk8C5wJ2lvgyYW5bnlHXK+OyIiFJfnpmvZ+YvgQFgZlv2QpI0Zi1dA4iIwyPiCWAzsBL4BfDrzNxZpgwCU8ryFOAFgDK+HTi2uT7McyRJHdZSAGTmW5l5OjCVxm
/tH9xfDUXEgojoj4j+oaGh/fUyklS9Md0FlJm/Bn4MnAVMjIhxZWgqsLEsbwROBCjj7wG2NNeHeU7zayzOzL7M7Ovp6RlLe5KkMWjlLqCeiJhYlv8QOA9YRyMIPlmmzQPuKcsryjplfHVmZqlfVu4SmgZMBx5p145IksZm3OhTOB5YVu7YOQy4IzPvjYi1wPKIuA54HFhS5i8BbouIAWArjTt/yMynI+IOYC2wE7g6M99q7+5Iklo1agBk5pPAR4apP8swd/Fk5g7gkhG29VXgq2NvU5LUbn4SWJIqZQBIUqUMAEmqlAEgSZUyACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKlDABJqpQBIEmVMgAkqVIGgCRVygCQpEoZAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVcoAkKRKGQCSVKlRAyAiToyIH0fE2oh4OiL+utQnRcTKiNhQHo8p9YiIb0TEQEQ8GRFnNG1rXpm/ISLm7b/dkiSNppUjgJ3A32TmDGAWcHVEzAAWAqsyczqwqqwDXARMLz8LgJugERjAIuBMYCawaFdoSJI6b9QAyMwXM/OxsvwKsA6YAswBlpVpy4C5ZXkOcGs2PARMjIjjgQuAlZm5NTO3ASuBC9u6N5Kklo3pGkBE9AIfAR4GJmfmi2XoJWByWZ4CvND0tMFSG6m+52ssiIj+iOgfGhoaS3uSpDFoOQAi4o+AfwM+n5m/aR7LzASyHQ1l5uLM7MvMvp6ennZsUpI0jJYCICKOoPHm/53M/PdS3lRO7VAeN5f6RuDEpqdPLbWR6pKkLmjlLqAAlgDrMvNrTUMrgF138swD7mmqX1HuBpoFbC+niu4Dzo+IY8rF3/NLTZLUBeNamHM28L+An0XEE6X298D1wB0RcSXwPHBpGfsBcDEwAPwWmA+QmVsj4ivAmjLvy5m5tS17IUkas1EDIDP/A4gRhmcPMz+Bq0fY1lJg6VgalCTtH34SWJIqZQBIUqUMAEmqlAEgSZUyACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKlDABJqpQBIEmVMgAkqVIGgCRVygCQpEoZAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVcoAkKRKGQCSVKlRAyAilkbE5oh4qqk2KSJWRsSG8nhMqUdEfCMiBiLiyYg4o+k588r8DRExb//sjiSpVa0cAdwCXLhHbSGwKjOnA6vKOsBFwPTyswC4CRqBASwCzgRmAot2hYYkqTtGDYDM/L/A1j3Kc4BlZXkZMLepfms2PARMjIjjgQuAlZm5NTO3ASt5d6hIkjpob68BTM7MF8vyS8DksjwFeKFp3mCpjVR/l4hYEBH9EdE/NDS0l+1JkkazzxeBMzOBbEMvu7a3ODP7MrOvp6enXZuVJO1hbwNgUzm1Q3ncXOobgROb5k0ttZHqkqQu2dsAWAHsupNnHnBPU/2KcjfQLGB7OVV0H3B+RBxTLv6eX2qSpC4ZN9qEiLgd+ChwXEQM0rib53rgjoi4EngeuLRM/wFwMTAA/BaYD5CZWyPiK8CaMu/LmbnnhWVJUgeNGgCZefkIQ7OHmZvA1SNsZymwdEzdSZL2Gz8JLEmVMgAkqVIGgCRVygCQpEoZAJJUKQNAkiplAEhSpQwASaqUASBJlTIAJKlSBoAkVcoAkKRKGQCSVCkDQJIqZQBIUqUMAEmqlAEgSZUyACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKlOh4AEXFhRKyPiIGIWNjp15ckNXQ0ACLicOCbwEXADODyiJjRyR4kSQ2dPgKYCQxk5rOZ+QawHJjT4R4kScC4Dr/eFOCFpvVB4MzmCRGxAFhQVl+NiPUd6q0GxwEvd7uJ0cQN3e5AXXBQ/NvkS9HtDlp1UiuTOh0Ao8rMxcDibvdxKIqI/szs63Yf0p78t9kdnT4FtBE4sWl9aqlJkjqs0wGwBpgeEdMiYjxwGbCiwz1IkujwKaDM3B
kR1wD3AYcDSzPz6U72UDlPrelA5b/NLojM7HYPkqQu8JPAklQpA0CSKmUASFKlDrjPAah9IuKDND5pPaWUNgIrMnNd97qSdKDwCOAQFRF/R+OrNgJ4pPwEcLtfwicJvAvokBURPwc+lJlv7lEfDzydmdO705n0+0XE/My8udt91MAjgEPX74AThqkfX8akA9WXut1ALbwGcOj6PLAqIjbw9hfwvQ84Gbima11JQEQ8OdIQMLmTvdTMU0CHsIg4jMZXcDdfBF6TmW91rysJImITcAGwbc8h4P9l5nBHr2ozjwAOYZn5O+ChbvchDeNe4I8y84k9ByLigc63UyePACSpUl4ElqRKGQCSVCkDQCoiYmJEXNXG7X0+Iv5Hu7YntZsBIL1tIvCuAIiIvb1Z4vOAAaADlncBSW+7HvjjiHgCeBPYQeM2xQ8CH4iITwOfA8YDDwNXZeZbEXET8GfAHwJ3ZuaiiPgcjQ/i/TgiXs7Mv+zC/ki/l3cBSUVE9AL3ZuapEfFR4PvAqZn5y4j4E+Cfgf+ZmW9GxLeAhzLz1oiYlJlbI+JwYBXwucx8MiKeA/oy8+Wu7JA0Co8ApJE9kpm/LMuzgT8F1kQENH7b31zGLo2IBTT+Px0PzABG+qSrdMAwAKSRvda0HMCyzLy2eUJETAO+APxZZm6LiFuACZ1rUdp7XgSW3vYKcNQIY6uAT0bEewEiYlJEnAQcTSMotkfEZOCiFrcndZ1HAFKRmVsi4sGIeAr4L2BT09jaiPgH4P7yHUtvAldn5kMR8TjwDI0v3XuwaZOLgR9GxH96EVgHIi8CS1KlPAUkSZUyACSpUgaAJFXKAJCkShkAklQpA0CSKmUASFKl/hsD8Fa0KLSYqwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"lalonde.groupby('treat').mean().plot(y='re78', kind='bar')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Specifying Interventions\n",
"\n",
"To sample the outcome under an intervention, pass the treatment value you want to set as the `x` argument of `do`. Here we set `treat` to `1`."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:dowhy.do_why:Causal Graph not provided. DoWhy will construct a graph based on data inputs.\n",
"INFO:dowhy.causal_identifier:Common causes of treatment and outcome:{'hisp', 'black', 'educ', 'nodegr', 'U', 'age', 'married'}\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"['hisp', 'black', 'educ', 'nodegr', 'age', 'married']\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"yes\n",
"{'observed': 'yes'}\n",
"Model to find the causal effect of treatment treat on outcome re78\n",
"{'observed': 'yes'}\n",
"{'observed': 'yes'}\n",
"{'observed': 'yes'}\n",
"{'observed': 'yes'}\n",
"{'label': 'Unobserved Confounders', 'observed': 'no'}\n",
"There are unobserved common causes. Causal effect cannot be identified.\n",
"WARN: Do you want to continue by ignoring these unobserved confounders? [y/n] yes\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:dowhy.causal_identifier:Instrumental variables for treatment and outcome:[]\n",
"INFO:dowhy.do_sampler:Using WeightingSampler for do sampling.\n",
"INFO:dowhy.do_sampler:Caution: do samplers assume iid data.\n",
"/home/akelleh/.virtualenvs/data/lib/python3.6/site-packages/pandas/core/frame.py:3140: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
" self[k1] = value[k2]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"WeightingSampler\n",
"treatments ['treat']\n",
"backdoor ['hisp', 'black', 'educ', 'nodegr', 'age', 'married']\n",
" educ age 0\n",
"0 11 37 0\n",
"1 9 22 0\n",
"2 12 30 1\n",
"3 11 27 0\n",
"4 8 33 0\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/akelleh/.virtualenvs/data/lib/python3.6/site-packages/sklearn/linear_model/logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n",
" FutureWarning)\n",
"/home/akelleh/.virtualenvs/data/lib/python3.6/site-packages/sklearn/utils/validation.py:761: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().\n",
" y = column_or_1d(y, warn=True)\n"
]
}
],
"source": [
"do_df = lalonde.causal.do(x={'treat': 1},\n",
" outcome='re78',\n",
" common_causes=['nodegr', 'black', 'hisp', 'age', 'educ', 'married'],\n",
" variable_types={'age': 'c', 'educ':'c', 'black': 'd', 'hisp': 'd', \n",
" 'married': 'd', 'nodegr': 'd','re78': 'c', 'treat': 'b'})"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>educ</th>\n",
" <th>black</th>\n",
" <th>hisp</th>\n",
" <th>married</th>\n",
" <th>nodegr</th>\n",
" <th>re74</th>\n",
" <th>re75</th>\n",
" <th>re78</th>\n",
" <th>u74</th>\n",
" <th>u75</th>\n",
" <th>treat</th>\n",
" <th>propensity_score</th>\n",
" <th>weight</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>25</td>\n",
" <td>12</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.000</td>\n",
" <td>11965.80</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.548374</td>\n",
" <td>1.823572</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>17</td>\n",
" <td>9</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.000</td>\n",
" <td>0.00</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.369045</td>\n",
" <td>2.709700</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>19</td>\n",
" <td>10</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>5324.110</td>\n",
" <td>13829.60</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.362327</td>\n",
" <td>2.759937</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>38</td>\n",
" <td>11</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.000</td>\n",
" <td>0.00</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.391419</td>\n",
" <td>2.554808</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>20</td>\n",
" <td>12</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>377.569</td>\n",
" <td>1652.64</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.537215</td>\n",
" <td>1.861451</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age educ black hisp married nodegr re74 re75 re78 u74 \\\n",
"0 25 12 1 0 0 0 0.0 0.000 11965.80 1 \n",
"1 17 9 1 0 0 1 0.0 0.000 0.00 1 \n",
"2 19 10 0 0 0 1 0.0 5324.110 13829.60 1 \n",
"3 38 11 1 0 0 1 0.0 0.000 0.00 1 \n",
"4 20 12 1 0 0 0 0.0 377.569 1652.64 1 \n",
"\n",
" u75 treat propensity_score weight \n",
"0 1 1 0.548374 1.823572 \n",
"1 1 1 0.369045 2.709700 \n",
"2 0 1 0.362327 2.759937 \n",
"3 1 1 0.391419 2.554808 \n",
"4 0 1 0.537215 1.861451 "
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"do_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
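"This new dataframe is a sample from the distribution of `'re78'` when `'treat'` is set to `1`. Every row has `treat` equal to `1`, and the sampler has added `propensity_score` and `weight` columns."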
]
},
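{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `propensity_score` and `weight` columns in the output above hint at how a weighting sampler can work: in the rows shown, `weight` equals `1 / propensity_score` (for example, `1 / 0.548374` is about `1.8236`). The next cell is a minimal sketch of that idea on synthetic data, not DoWhy's implementation. The column names `z` and `y`, the data-generating process, and the resampling step are all assumptions made for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"# Synthetic data (illustrative only): z confounds both treat and y.\n",
"rng = np.random.default_rng(0)\n",
"n = 2000\n",
"z = rng.normal(size=n)\n",
"treat = rng.binomial(1, 1.0 / (1.0 + np.exp(-z)))\n",
"y = 2.0 * treat + z + rng.normal(size=n)\n",
"df = pd.DataFrame({'z': z, 'treat': treat, 'y': y})\n",
"\n",
"# Fit a propensity model P(treat = 1 | z).\n",
"model = LogisticRegression(solver='lbfgs').fit(df[['z']], df['treat'])\n",
"p_treated = model.predict_proba(df[['z']])[:, 1]\n",
"\n",
"# Weight each row by the inverse probability of the treatment it actually received.\n",
"df['propensity_score'] = np.where(df['treat'] == 1, p_treated, 1.0 - p_treated)\n",
"df['weight'] = 1.0 / df['propensity_score']\n",
"\n",
"# Approximate a sample under do(treat=1): keep the treated rows and\n",
"# resample them with probability proportional to their weights.\n",
"treated = df[df['treat'] == 1]\n",
"do_sample = treated.sample(n=len(df), replace=True, weights=treated['weight'], random_state=0)\n",
"\n",
"# The naive treated mean is confounded by z; the reweighted mean should sit\n",
"# closer to 2.0, the effect built into the synthetic data.\n",
"print('naive mean of y among treated:', treated['y'].mean())\n",
"print('mean of y in the do sample:   ', do_sample['y'].mean())"
]
},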
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For more detail on how the `do` method works, check its docstring:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on method do in module dowhy.api.causal_data_frame:\n",
"\n",
"do(x, method='weighting', num_cores=1, variable_types={}, outcome=None, params=None, dot_graph=None, common_causes=None, instruments=[], estimand_type='ate', proceed_when_unidentifiable=False, stateful=False) method of dowhy.api.causal_data_frame.CausalAccessor instance\n",
"\n"
]
}
],
"source": [
"help(lalonde.causal.do)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}