josef-pkt/example_gmm_euler.ipynb

## example_gmm_euler.ipynb
{
 "metadata": {
  "name": "example_gmm_euler_1"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Estimating Euler Equation with Generalized Method of Moments with statsmodels"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One of the earliest application of GMM is the estimation of Euler equations in rational expectation models \n",
      "(Hansen and Singleton 1982). We can read about it in many textbooks, but here I just want to present it as \n",
      "another test case for GMM in statsmodels.\n",
      "\n",
      "The short form is that optimization by economic agents with rational expectation implies the following orthogonality condition in expectation:\n",
      "\n",
      "$\\hspace{10pt} E [z_t \\hspace{3pt} (1 - \\beta  (1 + r_{t+1})  (c_{t+1} / c_t)^{-\\gamma}) ]$\n",
      "    \n",
      "\n",
      "where beta is the discount factor of the agent, and gamma is the coefficient reflecting constant relative risk aversion. R is a rate of return on assets or interest rate, c is consumption\n",
      "(I have not looked at this part of economics in a long time.)\n",
      "\n",
      "The main point is that we cannot treat current and future consumption and future interest rates as exogenous when we try to estimate the parameters, since economic agents have more information about those variables than the econometrician. Those variables will be correlated with any residuals in the estimation and give us inconsistent estimates of the parameters if we don't take the endogeneity into account.\n"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Setup and Data"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "\n",
      "\"\"\"\n",
      "\n",
      "Created on Tue Oct 08 09:27:31 2013\n",
      "\n",
      "Author: Josef Perktold\n",
      "\"\"\"\n",
      "\n",
      "import numpy as np\n",
      "import pandas as pd\n",
      "\n",
      "from statsmodels.sandbox.regression import gmm\n",
      "\n",
      "dta = pd.read_csv('consumption.csv', parse_dates=[0])\n",
      "\n",
      "print dta.iloc[:5]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "                  qtr       r       c\n",
        "0 1947-01-01 00:00:00  0.0038  1017.2\n",
        "1 1947-04-01 00:00:00  0.0038  1034.0\n",
        "2 1947-07-01 00:00:00  0.0066  1037.5\n",
        "3 1947-10-01 00:00:00  0.0085  1037.7\n",
        "4 1948-01-01 00:00:00  0.0097  1042.6\n"
       ]
      }
     ],
     "prompt_number": 30
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Next, I create the lagged and leading variables for the estimation. \n",
      "As instruments we use lagged interest rate and current and lagged consumption growth."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "dta['c_growth'] = dta['c'] / dta['c'].shift(1)\n",
      "dta['c_growth_lag1'] = dta['c_growth'].shift(1)\n",
      "dta['r_lag1'] = dta['r'].shift(1)\n",
      "dta['r_lag2'] = dta['r'].shift(2)\n",
      "dta['r_forw1'] = dta['r'].shift(-1)\n",
      "dta['c_lag1'] = dta['c'].shift(1)\n",
      "dta['c_forw1'] = dta['c'].shift(-1)\n",
      "dta['const'] = 1\n",
      "\n",
      "dta_clean = dta.dropna()\n",
      "\n",
      "\n",
      "endog_df = dta_clean[['r_forw1', 'c_forw1', 'c']]\n",
      "exog_df = endog_df\n",
      "instrument_df = dta_clean[['r_lag1', 'r_lag2', 'c_growth', 'c_growth_lag1',\n",
      "                           'const']]\n",
      "\n",
      "endog, exog, instrument  = map(np.asarray, [endog_df, exog_df, instrument_df])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 31
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "The moment equations and GMM - version 1"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Currently statsmodels has two ways of specifying moment conditions. \n",
      "The first uses general non-linear functions for the (unconditional) moment condition:\n",
      "    \n",
      "$\\hspace{10pt} E[m(params)] = 0$\n",
      "\n",
      "The second version uses an instrumental variables approach with additive error structure\n",
      "    \n",
      "\\begin{equation*} \\hspace{10pt} E[z \\hspace{3pt} (y - f(x, params)] = 0 \\end{equation*}\n",
      "\n",
      "where z are the instruments and $u = (y - f(x, params)$ defines an additive residual, and `x` \n",
      "are explanatory variables that can be endogenous or exogenous.\n",
      "                                  \n",
      "In the following I use the class `NonlinearIVGMM`, which implements the second form of the moment conditions.\n",
      "However, our Euler equation doesn't fit's this only if we use endog or y as a constant.\n",
      "I'm not showing an example how we can use the first generic form, class `GMM`, nor how\n",
      "we could define the moment conditions directly by subclassing."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**The first version**"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "In the first version, we use `endog = 0` and define `u` to be the the Euler equation without the instrument part."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def moment_consumption1(params, exog):\n",
      "    beta, gamma = params\n",
      "    r_forw1, c_forw1, c = exog.T  # unwrap iterable (ndarray)\n",
      "    \n",
      "    # moment condition without instrument    \n",
      "    err = 1 - beta * (1 + r_forw1) * np.power(c_forw1 / c, -gamma)\n",
      "    return -err"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 32
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "endog1 = np.zeros(exog.shape[0])    \n",
      "mod1 = gmm.NonlinearIVGMM(endog1, exog, instrument, moment_consumption1, k_moms=4)\n",
      "w0inv = np.dot(instrument.T, instrument) / len(endog1)\n",
      "res1 = mod1.fit([1,-1], maxiter=2, inv_weights=w0inv) "
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "Optimization terminated successfully.\n",
        "         Current function value: 0.000539\n",
        "         Iterations: 6\n",
        "         Function evaluations: 23\n",
        "         Gradient evaluations: 23\n",
        "Optimization terminated successfully.\n",
        "         Current function value: 0.152609\n",
        "         Iterations: 6\n",
        "         Function evaluations: 11\n",
        "         Gradient evaluations: 11\n"
       ]
      }
     ],
     "prompt_number": 33
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "print(res1.summary(yname='Euler Eq', xname=['discount', 'CRRA']))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "                            NonlinearIVGMM Results                            \n",
        "==============================================================================\n",
        "Dep. Variable:               Euler Eq   Hansen J:                        36.47\n",
        "Model:                 NonlinearIVGMM   Prob (Hansen J):              1.20e-08\n",
        "Method:                           GMM                                         \n",
        "Date:                Wed, 25 Dec 2013                                         \n",
        "Time:                        20:08:33                                         \n",
        "No. Observations:                 239                                         \n",
        "==============================================================================\n",
        "                 coef    std err          z      P>|z|      [95.0% Conf. Int.]\n",
        "------------------------------------------------------------------------------\n",
        "discount       0.8977      0.017     52.860      0.000         0.864     0.931\n",
        "CRRA          -6.7953      2.050     -3.315      0.001       -10.813    -2.778\n",
        "==============================================================================\n"
       ]
      }
     ],
     "prompt_number": 34
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "However, the default GMM standard errors of the estimate are only robust to heteroscedasticity, i.e. variance differs by observation.\n",
      "Since we using time series for our estimation, there is a strong chance that the errors, \n",
      "or better moment conditions, are correlated over time. In this case, we can use a HAC robust standard error, \n",
      "that is robust to heteroscedasticity as well as autocorrelation.\n",
      "\n",
      "In the GMM version in statsmodels, we define these options through `weights_method='hac', wargs={'maxlag':4}`,\n",
      "which uses the Newey, West standard errors based on 4 lags."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# We don't need Nelder-Mead in this case, we can use bfgs default directly\n",
      "#res1_ = mod1.fit([1,-1], maxiter=0, inv_weights=w0inv, opt_method='nm')\n",
      "\n",
      "res1_hac4_2s = mod1.fit([1, -1], maxiter=2, inv_weights=w0inv, weights_method='hac', wargs={'maxlag':4})\n",
      "print('\\n\\n')\n",
      "print(res1_hac4_2s.summary(yname='Euler Eq', xname=['discount', 'CRRA']))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "Optimization terminated successfully.\n",
        "         Current function value: 0.000539\n",
        "         Iterations: 6\n",
        "         Function evaluations: 23\n",
        "         Gradient evaluations: 23\n",
        "Optimization terminated successfully.\n",
        "         Current function value: 0.055395\n",
        "         Iterations: 5\n",
        "         Function evaluations: 10\n",
        "         Gradient evaluations: 10\n",
        "\n",
        "\n",
        "\n",
        "                            NonlinearIVGMM Results                            \n",
        "==============================================================================\n",
        "Dep. Variable:               Euler Eq   Hansen J:                        13.24\n",
        "Model:                 NonlinearIVGMM   Prob (Hansen J):               0.00133\n",
        "Method:                           GMM                                         \n",
        "Date:                Wed, 25 Dec 2013                                         \n",
        "Time:                        20:08:33                                         \n",
        "No. Observations:                 239                                         \n",
        "==============================================================================\n",
        "                 coef    std err          z      P>|z|      [95.0% Conf. Int.]\n",
        "------------------------------------------------------------------------------\n",
        "discount       0.9213      0.014     67.281      0.000         0.894     0.948\n",
        "CRRA          -4.1136      1.507     -2.729      0.006        -7.068    -1.160\n",
        "==============================================================================\n"
       ]
      }
     ],
     "prompt_number": 35
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "The moment equations and GMM - version 2"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "In version 2 of the moment equations, we use that the non-instrument part of the Euler Equation has the form `1 - f(x, params)`.\n",
      "Our `moment_consumption2` defines only the `f(x, params)' part and we use a vector of ones for endog."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def moment_consumption2(params, exog):\n",
      "    beta, gamma = params\n",
      "    #endog, exog = args\n",
      "    r_forw1, c_forw1, c = exog.T  # unwrap iterable (ndarray)\n",
      "    \n",
      "    # 2nd part of moment condition without instrument    \n",
      "    predicted = beta * (1. + r_forw1) * np.power(c_forw1 / c, -gamma)\n",
      "    return predicted"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 36
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "endog2 = np.ones(exog.shape[0])    \n",
      "mod2 = gmm.NonlinearIVGMM(endog2, exog, instrument, moment_consumption2, k_moms=4)\n",
      "w0inv = np.dot(instrument.T, instrument) / len(endog2)  \n",
      "res2_hac4_2s = mod2.fit([1,-1], maxiter=2, inv_weights=w0inv, weights_method='hac', wargs={'maxlag':4})"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "Optimization terminated successfully.\n",
        "         Current function value: 0.000539\n",
        "         Iterations: 6\n",
        "         Function evaluations: 23\n",
        "         Gradient evaluations: 23\n",
        "Optimization terminated successfully.\n",
        "         Current function value: 0.055395\n",
        "         Iterations: 5\n",
        "         Function evaluations: 10\n",
        "         Gradient evaluations: 10\n"
       ]
      }
     ],
     "prompt_number": 37
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "print(res2_hac4_2s.summary(yname='Euler Eq', xname=['discount', 'CRRA']))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "                            NonlinearIVGMM Results                            \n",
        "==============================================================================\n",
        "Dep. Variable:               Euler Eq   Hansen J:                        13.24\n",
        "Model:                 NonlinearIVGMM   Prob (Hansen J):               0.00133\n",
        "Method:                           GMM                                         \n",
        "Date:                Wed, 25 Dec 2013                                         \n",
        "Time:                        20:08:33                                         \n",
        "No. Observations:                 239                                         \n",
        "==============================================================================\n",
        "                 coef    std err          z      P>|z|      [95.0% Conf. Int.]\n",
        "------------------------------------------------------------------------------\n",
        "discount       0.9213      0.014     67.281      0.000         0.894     0.948\n",
        "CRRA          -4.1136      1.507     -2.729      0.006        -7.068    -1.160\n",
        "==============================================================================\n"
       ]
      }
     ],
     "prompt_number": 38
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Compare"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We should get the same estimates in both versions for speciying the moment conditions."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "res1_hac4_2s.params"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 39,
       "text": [
        "array([ 0.92129628, -4.11361243])"
       ]
      }
     ],
     "prompt_number": 39
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "res2_hac4_2s.params"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 40,
       "text": [
        "array([ 0.92129628, -4.11361243])"
       ]
      }
     ],
     "prompt_number": 40
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "And the two are the same (up to differences numerical precision of the algorithms) :"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "res1_hac4_2s.params - res2_hac4_2s.params, np.max(np.abs(res1_hac4_2s.params - res2_hac4_2s.params))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 41,
       "text": [
        "(array([  3.27948779e-12,   3.71550790e-10]), 3.7155079013473369e-10)"
       ]
      }
     ],
     "prompt_number": 41
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Stata manual has params `[0.9205, -4.2224]` and standard errors equal to `[0.0135, 1.4739]`. Stata doesn't center the moments by default. We can get something closer to the results of Stata by using `centered = False` as a weights argument:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "res_ = mod2.fit([1,-1], maxiter=2, inv_weights=w0inv, weights_method='hac', \n",
      "                wargs={'maxlag':4, 'centered':False}, optim_args={'disp':0})\n",
      "print res_.params\n",
      "print res_.bse"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "[ 0.92045263 -4.22337303]\n",
        "[ 0.01345577  1.47341411]\n"
       ]
      }
     ],
     "prompt_number": 42
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Extra: Iterated GMM"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As comparison we can also iterate the GMM estimation until convergence.\n",
      "\n",
      "The results look strange, especially with HAC robust weighting matrix. \n",
      "In both cases the estimate for the discount factor is large and close to one, however \n",
      "the preference parameter for relative risk aversion is 0.08 and -6.03 respectively.\n",
      "\n",
      "This needs verification or an explanation. \n",
      "I have a test case for iterated GMM with heteroscedasticity robust standard errors \n",
      "where statsmodels and Stata agree at several decimals, so there might be something\n",
      "problematic with this dataset."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**with HAC standard errors**"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": true,
     "input": [
      "res2_hac_i = mod2.fit([1,-1], maxiter=100, inv_weights=w0inv, \n",
      "                      weights_method='hac', wargs={'maxlag':4}, \n",
      "                      optim_args={'disp':0})"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 43
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "#print(res2_hac4_i.summary(yname='Euler Eq', xname=['discount', 'CRRA']))\n",
      "res2_hac_i.params"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 44,
       "text": [
        "array([ 0.95862044,  0.07853072])"
       ]
      }
     ],
     "prompt_number": 44
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**with HC standard errors**"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "res2_i = mod2.fit([1,-1], maxiter=100, inv_weights=w0inv, optim_args={'disp':0})"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 45
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "#print(res2_i.summary(yname='Euler Eq', xname=['discount', 'CRRA']))\n",
      "print res2_i.params"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "[ 0.90335842 -6.02767703]\n"
       ]
      }
     ],
     "prompt_number": 46
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Extra: Problems with Optimization"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "When I ran several different cases for this example, I sometimes got convergence to different results. \n",
      "Although the objective function is quadratic in the moments, the momend conditions themselves can have \n",
      "more complicated non-linearities, that could be the cause for local minima.\n",
      "\n",
      "In the above examples I used the function for the moment conditions as an argument to the class.\n",
      "As alternative it is also possible to directly subclass a GMM class. This has the advantage\n",
      "that we can override specific methods. For example, currently it is not possible to give a\n",
      "analytical derivative for the moment conditions, but it can be added by subclassing and \n",
      "replacing the generic forward difference method `gradient_momcond`.\n"
     ]
    }
   ],
   "metadata": {}
  }
 ]
}
	{
	"metadata": {
	"name": "example_gmm_euler_1"
	},
	"nbformat": 3,
	"nbformat_minor": 0,
	"worksheets": [
	{
	"cells": [
	{
	"cell_type": "heading",
	"level": 1,
	"metadata": {},
	"source": [
	"Estimating Euler Equation with Generalized Method of Moments with statsmodels"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"One of the earliest application of GMM is the estimation of Euler equations in rational expectation models \n",
	"(Hansen and Singleton 1982). We can read about it in many textbooks, but here I just want to present it as \n",
	"another test case for GMM in statsmodels.\n",
	"\n",
	"The short form is that optimization by economic agents with rational expectation implies the following orthogonality condition in expectation:\n",
	"\n",
	"$\\hspace{10pt} E [z_t \\hspace{3pt} (1 - \\beta (1 + r_{t+1}) (c_{t+1} / c_t)^{-\\gamma}) ]$\n",
	" \n",
	"\n",
	"where beta is the discount factor of the agent, and gamma is the coefficient reflecting constant relative risk aversion. R is a rate of return on assets or interest rate, c is consumption\n",
	"(I have not looked at this part of economics in a long time.)\n",
	"\n",
	"The main point is that we cannot treat current and future consumption and future interest rates as exogenous when we try to estimate the parameters, since economic agents have more information about those variables than the econometrician. Those variables will be correlated with any residuals in the estimation and give us inconsistent estimates of the parameters if we don't take the endogeneity into account.\n"
	]
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"Setup and Data"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"\n",
	"\"\"\"\n",
	"\n",
	"Created on Tue Oct 08 09:27:31 2013\n",
	"\n",
	"Author: Josef Perktold\n",
	"\"\"\"\n",
	"\n",
	"import numpy as np\n",
	"import pandas as pd\n",
	"\n",
	"from statsmodels.sandbox.regression import gmm\n",
	"\n",
	"dta = pd.read_csv('consumption.csv', parse_dates=[0])\n",
	"\n",
	"print dta.iloc[:5]"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	" qtr r c\n",
	"0 1947-01-01 00:00:00 0.0038 1017.2\n",
	"1 1947-04-01 00:00:00 0.0038 1034.0\n",
	"2 1947-07-01 00:00:00 0.0066 1037.5\n",
	"3 1947-10-01 00:00:00 0.0085 1037.7\n",
	"4 1948-01-01 00:00:00 0.0097 1042.6\n"
	]
	}
	],
	"prompt_number": 30
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Next, I create the lagged and leading variables for the estimation. \n",
	"As instruments we use lagged interest rate and current and lagged consumption growth."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"dta['c_growth'] = dta['c'] / dta['c'].shift(1)\n",
	"dta['c_growth_lag1'] = dta['c_growth'].shift(1)\n",
	"dta['r_lag1'] = dta['r'].shift(1)\n",
	"dta['r_lag2'] = dta['r'].shift(2)\n",
	"dta['r_forw1'] = dta['r'].shift(-1)\n",
	"dta['c_lag1'] = dta['c'].shift(1)\n",
	"dta['c_forw1'] = dta['c'].shift(-1)\n",
	"dta['const'] = 1\n",
	"\n",
	"dta_clean = dta.dropna()\n",
	"\n",
	"\n",
	"endog_df = dta_clean[['r_forw1', 'c_forw1', 'c']]\n",
	"exog_df = endog_df\n",
	"instrument_df = dta_clean[['r_lag1', 'r_lag2', 'c_growth', 'c_growth_lag1',\n",
	" 'const']]\n",
	"\n",
	"endog, exog, instrument = map(np.asarray, [endog_df, exog_df, instrument_df])"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 31
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"The moment equations and GMM - version 1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Currently statsmodels has two ways of specifying moment conditions. \n",
	"The first uses general non-linear functions for the (unconditional) moment condition:\n",
	" \n",
	"$\\hspace{10pt} E[m(params)] = 0$\n",
	"\n",
	"The second version uses an instrumental variables approach with additive error structure\n",
	" \n",
	"\\begin{equation} \\hspace{10pt} E[z \\hspace{3pt} (y - f(x, params)] = 0 \\end{equation}\n",
	"\n",
	"where z are the instruments and $u = (y - f(x, params)$ defines an additive residual, and `x` \n",
	"are explanatory variables that can be endogenous or exogenous.\n",
	" \n",
	"In the following I use the class `NonlinearIVGMM`, which implements the second form of the moment conditions.\n",
	"However, our Euler equation doesn't fit's this only if we use endog or y as a constant.\n",
	"I'm not showing an example how we can use the first generic form, class `GMM`, nor how\n",
	"we could define the moment conditions directly by subclassing."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The first version"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"In the first version, we use `endog = 0` and define `u` to be the the Euler equation without the instrument part."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"def moment_consumption1(params, exog):\n",
	" beta, gamma = params\n",
	" r_forw1, c_forw1, c = exog.T # unwrap iterable (ndarray)\n",
	" \n",
	" # moment condition without instrument \n",
	" err = 1 - beta * (1 + r_forw1) * np.power(c_forw1 / c, -gamma)\n",
	" return -err"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 32
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"endog1 = np.zeros(exog.shape[0]) \n",
	"mod1 = gmm.NonlinearIVGMM(endog1, exog, instrument, moment_consumption1, k_moms=4)\n",
	"w0inv = np.dot(instrument.T, instrument) / len(endog1)\n",
	"res1 = mod1.fit([1,-1], maxiter=2, inv_weights=w0inv) "
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"Optimization terminated successfully.\n",
	" Current function value: 0.000539\n",
	" Iterations: 6\n",
	" Function evaluations: 23\n",
	" Gradient evaluations: 23\n",
	"Optimization terminated successfully.\n",
	" Current function value: 0.152609\n",
	" Iterations: 6\n",
	" Function evaluations: 11\n",
	" Gradient evaluations: 11\n"
	]
	}
	],
	"prompt_number": 33
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"print(res1.summary(yname='Euler Eq', xname=['discount', 'CRRA']))"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	" NonlinearIVGMM Results \n",
	"==============================================================================\n",
	"Dep. Variable: Euler Eq Hansen J: 36.47\n",
	"Model: NonlinearIVGMM Prob (Hansen J): 1.20e-08\n",
	"Method: GMM \n",
	"Date: Wed, 25 Dec 2013 \n",
	"Time: 20:08:33 \n",
	"No. Observations: 239 \n",
	"==============================================================================\n",
	" coef std err z P>\|z\| [95.0% Conf. Int.]\n",
	"------------------------------------------------------------------------------\n",
	"discount 0.8977 0.017 52.860 0.000 0.864 0.931\n",
	"CRRA -6.7953 2.050 -3.315 0.001 -10.813 -2.778\n",
	"==============================================================================\n"
	]
	}
	],
	"prompt_number": 34
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"However, the default GMM standard errors of the estimate are only robust to heteroscedasticity, i.e. variance differs by observation.\n",
	"Since we using time series for our estimation, there is a strong chance that the errors, \n",
	"or better moment conditions, are correlated over time. In this case, we can use a HAC robust standard error, \n",
	"that is robust to heteroscedasticity as well as autocorrelation.\n",
	"\n",
	"In the GMM version in statsmodels, we define these options through `weights_method='hac', wargs={'maxlag':4}`,\n",
	"which uses the Newey, West standard errors based on 4 lags."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"# We don't need Nelder-Mead in this case, we can use bfgs default directly\n",
	"#res1_ = mod1.fit([1,-1], maxiter=0, inv_weights=w0inv, opt_method='nm')\n",
	"\n",
	"res1_hac4_2s = mod1.fit([1, -1], maxiter=2, inv_weights=w0inv, weights_method='hac', wargs={'maxlag':4})\n",
	"print('\\n\\n')\n",
	"print(res1_hac4_2s.summary(yname='Euler Eq', xname=['discount', 'CRRA']))"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"Optimization terminated successfully.\n",
	" Current function value: 0.000539\n",
	" Iterations: 6\n",
	" Function evaluations: 23\n",
	" Gradient evaluations: 23\n",
	"Optimization terminated successfully.\n",
	" Current function value: 0.055395\n",
	" Iterations: 5\n",
	" Function evaluations: 10\n",
	" Gradient evaluations: 10\n",
	"\n",
	"\n",
	"\n",
	" NonlinearIVGMM Results \n",
	"==============================================================================\n",
	"Dep. Variable: Euler Eq Hansen J: 13.24\n",
	"Model: NonlinearIVGMM Prob (Hansen J): 0.00133\n",
	"Method: GMM \n",
	"Date: Wed, 25 Dec 2013 \n",
	"Time: 20:08:33 \n",
	"No. Observations: 239 \n",
	"==============================================================================\n",
	" coef std err z P>\|z\| [95.0% Conf. Int.]\n",
	"------------------------------------------------------------------------------\n",
	"discount 0.9213 0.014 67.281 0.000 0.894 0.948\n",
	"CRRA -4.1136 1.507 -2.729 0.006 -7.068 -1.160\n",
	"==============================================================================\n"
	]
	}
	],
	"prompt_number": 35
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"The moment equations and GMM - version 2"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"In version 2 of the moment equations, we use that the non-instrument part of the Euler Equation has the form `1 - f(x, params)`.\n",
	"Our `moment_consumption2` defines only the `f(x, params)' part and we use a vector of ones for endog."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"def moment_consumption2(params, exog):\n",
	" beta, gamma = params\n",
	" #endog, exog = args\n",
	" r_forw1, c_forw1, c = exog.T # unwrap iterable (ndarray)\n",
	" \n",
	" # 2nd part of moment condition without instrument \n",
	" predicted = beta * (1. + r_forw1) * np.power(c_forw1 / c, -gamma)\n",
	" return predicted"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 36
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"endog2 = np.ones(exog.shape[0]) \n",
	"mod2 = gmm.NonlinearIVGMM(endog2, exog, instrument, moment_consumption2, k_moms=4)\n",
	"w0inv = np.dot(instrument.T, instrument) / len(endog2) \n",
	"res2_hac4_2s = mod2.fit([1,-1], maxiter=2, inv_weights=w0inv, weights_method='hac', wargs={'maxlag':4})"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"Optimization terminated successfully.\n",
	" Current function value: 0.000539\n",
	" Iterations: 6\n",
	" Function evaluations: 23\n",
	" Gradient evaluations: 23\n",
	"Optimization terminated successfully.\n",
	" Current function value: 0.055395\n",
	" Iterations: 5\n",
	" Function evaluations: 10\n",
	" Gradient evaluations: 10\n"
	]
	}
	],
	"prompt_number": 37
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"print(res2_hac4_2s.summary(yname='Euler Eq', xname=['discount', 'CRRA']))"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	" NonlinearIVGMM Results \n",
	"==============================================================================\n",
	"Dep. Variable: Euler Eq Hansen J: 13.24\n",
	"Model: NonlinearIVGMM Prob (Hansen J): 0.00133\n",
	"Method: GMM \n",
	"Date: Wed, 25 Dec 2013 \n",
	"Time: 20:08:33 \n",
	"No. Observations: 239 \n",
	"==============================================================================\n",
	" coef std err z P>\|z\| [95.0% Conf. Int.]\n",
	"------------------------------------------------------------------------------\n",
	"discount 0.9213 0.014 67.281 0.000 0.894 0.948\n",
	"CRRA -4.1136 1.507 -2.729 0.006 -7.068 -1.160\n",
	"==============================================================================\n"
	]
	}
	],
	"prompt_number": 38
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"Compare"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We should get the same estimates in both versions for speciying the moment conditions."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"res1_hac4_2s.params"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "pyout",
	"prompt_number": 39,
	"text": [
	"array([ 0.92129628, -4.11361243])"
	]
	}
	],
	"prompt_number": 39
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"res2_hac4_2s.params"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "pyout",
	"prompt_number": 40,
	"text": [
	"array([ 0.92129628, -4.11361243])"
	]
	}
	],
	"prompt_number": 40
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"And the two are the same (up to differences numerical precision of the algorithms) :"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"res1_hac4_2s.params - res2_hac4_2s.params, np.max(np.abs(res1_hac4_2s.params - res2_hac4_2s.params))"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "pyout",
	"prompt_number": 41,
	"text": [
	"(array([ 3.27948779e-12, 3.71550790e-10]), 3.7155079013473369e-10)"
	]
	}
	],
	"prompt_number": 41
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Stata manual has params `[0.9205, -4.2224]` and standard errors equal to `[0.0135, 1.4739]`. Stata doesn't center the moments by default. We can get something closer to the results of Stata by using `centered = False` as a weights argument:"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"res_ = mod2.fit([1,-1], maxiter=2, inv_weights=w0inv, weights_method='hac', \n",
	" wargs={'maxlag':4, 'centered':False}, optim_args={'disp':0})\n",
	"print res_.params\n",
	"print res_.bse"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"[ 0.92045263 -4.22337303]\n",
	"[ 0.01345577 1.47341411]\n"
	]
	}
	],
	"prompt_number": 42
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"Extra: Iterated GMM"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"As comparison we can also iterate the GMM estimation until convergence.\n",
	"\n",
	"The results look strange, especially with HAC robust weighting matrix. \n",
	"In both cases the estimate for the discount factor is large and close to one, however \n",
	"the preference parameter for relative risk aversion is 0.08 and -6.03 respectively.\n",
	"\n",
	"This needs verification or an explanation. \n",
	"I have a test case for iterated GMM with heteroscedasticity robust standard errors \n",
	"where statsmodels and Stata agree at several decimals, so there might be something\n",
	"problematic with this dataset."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"with HAC standard errors"
	]
	},
	{
	"cell_type": "code",
	"collapsed": true,
	"input": [
	"res2_hac_i = mod2.fit([1,-1], maxiter=100, inv_weights=w0inv, \n",
	" weights_method='hac', wargs={'maxlag':4}, \n",
	" optim_args={'disp':0})"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 43
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"#print(res2_hac4_i.summary(yname='Euler Eq', xname=['discount', 'CRRA']))\n",
	"res2_hac_i.params"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "pyout",
	"prompt_number": 44,
	"text": [
	"array([ 0.95862044, 0.07853072])"
	]
	}
	],
	"prompt_number": 44
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"with HC standard errors"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"res2_i = mod2.fit([1,-1], maxiter=100, inv_weights=w0inv, optim_args={'disp':0})"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 45
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"#print(res2_i.summary(yname='Euler Eq', xname=['discount', 'CRRA']))\n",
	"print res2_i.params"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"[ 0.90335842 -6.02767703]\n"
	]
	}
	],
	"prompt_number": 46
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"Extra: Problems with Optimization"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"When I ran several different cases for this example, I sometimes got convergence to different results. \n",
	"Although the objective function is quadratic in the moments, the momend conditions themselves can have \n",
	"more complicated non-linearities, that could be the cause for local minima.\n",
	"\n",
	"In the above examples I used the function for the moment conditions as an argument to the class.\n",
	"As alternative it is also possible to directly subclass a GMM class. This has the advantage\n",
	"that we can override specific methods. For example, currently it is not possible to give a\n",
	"analytical derivative for the moment conditions, but it can be added by subclassing and \n",
	"replacing the generic forward difference method `gradient_momcond`.\n"
	]
	}
	],
	"metadata": {}
	}
	]
	}