Skip to content

Instantly share code, notes, and snippets.

@fonnesbeck
Created April 17, 2015 03:09
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save fonnesbeck/f837ecffc1f190a56c52 to your computer and use it in GitHub Desktop.
Save fonnesbeck/f837ecffc1f190a56c52 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"metadata": {
"name": "",
"signature": "sha256:158f8b01b663430159d4227a372122645c1c8ba515963b466bc49ade1164e141"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Manatee Proportions of Mortality Model"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Import modules\n",
"%matplotlib inline\n",
"from pymc import Lambda, Multinomial, Dirichlet, deterministic, MCMC, Matplot, Uniform, potential, AdaptiveMetropolis\n",
"from pymc import multinomial_like, extend_dirichlet\n",
"import numpy as np\n",
"import pandas as pd"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 36
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cause of death data were extracted Dec 3, 2012. Data were processed as shown below. In particular, adults and subadults were combined into a single age class for the purposes of this analysis."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Import data\n",
"mdata = pd.read_csv(\"data/FracAnal_1993_2010_Chris_20121203.csv\")\n",
" \n",
"# FL regions\n",
"regions = ['ATL', 'USJ', 'NW', 'SW']\n",
"# Causes of death\n",
"causes = ['watercraft', 'wcs', 'entanglement', 'cold', 'redtide', 'other']\n",
"# Age classes\n",
"# classes = ('Calves', 'Subadults', 'Adults')\n",
"classes = ['Calves', 'Adults']\n",
"\n",
"# Combine adult and subadults\n",
"for r in regions:\n",
" for c in causes:\n",
" mdata[c][(mdata['age']=='Adults') & (mdata['area']==r)] += (mdata[c][(mdata['age']=='Subadults') & (mdata['area']==r)]).tolist()\n",
" \n",
"data = mdata[mdata['age']!='Subadults']"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is a sample of the dataset:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"data.head()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
" year watercraft wcs entanglement cold redtide other undetermined area age\n",
"0 1993 5 0 0 0 0 4 4 ATL Calves\n",
"1 1994 3 0 0 0 0 4 2 ATL Calves\n",
"2 1995 1 0 1 0 0 4 10 ATL Calves\n",
"3 1996 7 0 0 5 0 6 7 ATL Calves\n",
"4 1997 7 0 0 3 0 8 15 ATL Calves"
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to avoid the complexities of dealing with the recent cold year, analyses were conducted from 2001 to 2009, inclusive. Recovery rates were taken as fixed, extracted from Runge et al.'s incidental take paper. Additionally, we modeled red tide years in the Southwest region differently than all the other regions and years; we included a red tide effect parameter that accounts for the increased red tide activity in this region during 2003, 2005 and 2006."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Constants\n",
"JUVENILE, ADULT = 0, 1\n",
"STARTYEAR, ENDYEAR = 2001, 2009\n",
"YEARS = ENDYEAR - STARTYEAR + 1\n",
"\n",
"# Recovery rates from incidental take paper\n",
"recovery = dict(zip(regions, (0.8625, 0.8242, 0.4069, 0.5837)))\n",
"\n",
"# Strong red tide years\n",
"tide_years = (2003, 2005, 2006)\n",
"\n",
"# Uninformative priors for unknown proportions\n",
"prior0 = {'Calves':[np.ones(len(causes))]*2, 'Adults':[np.ones(len(causes))]*2}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"def generate_model(age_class='Adults', prior=prior0):\n",
"\n",
" # Containers for variables\n",
" U = []\n",
" p = []\n",
" pi = []\n",
" X = []\n",
" X_like = []\n",
" Z = []\n",
" undetermined = []\n",
" theta = []\n",
"\n",
" # Red tide effect factor\n",
" a = Uniform('a', lower=0, upper=2)\n",
"\n",
" # Loop over regions\n",
" for area in regions:\n",
" \n",
" # Upper St. John never gets red tide\n",
" k = len(causes) - 1*(area == 'USJ')\n",
" \n",
" if area == 'USJ':\n",
" \n",
" undetermined_priors = r_[prior[age_class][0][:4], prior[age_class][0][-1]]\n",
" \n",
" elif area == 'SW':\n",
" \n",
" undetermined_priors = prior[age_class][1]\n",
" \n",
" else:\n",
"\n",
" undetermined_priors = prior[age_class][0]\n",
" \n",
" \n",
" # Undetermined deaths by latent cause modeled as multinomial\n",
" # with Dirichlet prior on probabilities\n",
" p_i = Dirichlet('p_%s' % area, theta=undetermined_priors, value=np.ones(k-1)/k)\n",
" p.append(p_i)\n",
" \n",
" # Proportions of mortality for region i\n",
" pi_i = Dirichlet('pi_%s' % area, theta=np.ones(k))\n",
" pi.append(pi_i)\n",
"\n",
" # Loop over years\n",
" for year in range(STARTYEAR, ENDYEAR+1):\n",
"\n",
" # Subset data by year and age\n",
" cdata = data[(data[\"area\"]==area) & (data['year']==year) & (data['age']==age_class)]\n",
"\n",
" # Undetermined mortality\n",
" undetermined_i = cdata['undetermined']\n",
" undetermined.append(undetermined_i)\n",
" \n",
" # Upper St. John never gets red tide\n",
" if area == 'USJ':\n",
" # Remove red tide as a cause\n",
" Z_i = cdata[causes[0:4] + [causes[-1]]].values[0]\n",
" else:\n",
" Z_i = cdata[causes].values[0]\n",
" \n",
" \n",
" # Undetermined deaths by latent cause modeled as multinomial\n",
" # with Dirichlet prior on probabilities\n",
" U_i = Multinomial('U_%s%i' % (area,year), n=undetermined_i, p=p_i)\n",
" U.append(U_i)\n",
"\n",
" # Total deaths by cause, corrected for recovery rate\n",
" X_i = Lambda('X_%s%i' % (area,year), lambda u=U_i, z=Z_i: ((z + u)/recovery[area]).astype(int))\n",
" X.append(X_i)\n",
"\n",
" # A red tide year ...\n",
" if (year in tide_years) and (area == 'SW'):\n",
"\n",
" @deterministic\n",
" def theta_i(p=pi_i, a=a):\n",
" \"\"\"Adjust proportions in red tide years\"\"\"\n",
" theta = p.copy()\n",
" theta[4] += a\n",
" return theta/(1+a) \n",
" theta_i.__name__ += '_%s%i' % (area,year)\n",
" theta.append(theta_i)\n",
"\n",
" # Normal year ...\n",
" else:\n",
" \n",
" # No adjustment to proportions\n",
" theta_i = pi_i\n",
" \n",
" @potential\n",
" def X_like_i(x=X_i, p=theta_i):\n",
" \"The likelihood of total deaths by cause\"\n",
" return multinomial_like(x, n=np.sum(x), p=extend_dirichlet(p))\n",
" X_like.append(X_like_i)\n",
" \n",
" return locals()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each model was run for 200K iterations, with half discarded as burn-in, and the remaining samples thinned by a factor iof 10."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"iterations = 200000\n",
"burnin = 100000\n",
"thin = 10"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## Adult model"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"adult_model = MCMC(generate_model('Adults'))\n",
"adult_model.sample(iterations, burn=burnin, thin=thin)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
" \r",
"[****************100%******************] 200000 of 200000 complete"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Proportions of mortality"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[p.summary() for p in adult_model.pi]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"pi_ATL:\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.58 0.042 0.004 [ 0.497 0.651]\n",
"\t0.01 0.004 0.0 [ 0.003 0.018]\n",
"\t0.011 0.004 0.0 [ 0.004 0.02 ]\n",
"\t0.111 0.039 0.004 [ 0.051 0.183]\n",
"\t0.081 0.028 0.003 [ 0.032 0.136]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.497 0.553 0.579 0.612 0.651\n",
"\t0.004 0.007 0.009 0.012 0.02\n",
"\t0.004 0.008 0.01 0.013 0.022\n",
"\t0.053 0.077 0.104 0.145 0.187\n",
"\t0.029 0.059 0.078 0.104 0.134\n",
"\t\n",
"\n",
"pi_USJ:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.523 0.143 0.013 [ 0.281 0.787]\n",
"\t0.069 0.086 0.008 [ 0. 0.269]\n",
"\t0.043 0.032 0.001 [ 0.001 0.103]\n",
"\t0.274 0.137 0.013 [ 0.059 0.536]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.279 0.408 0.521 0.633 0.786\n",
"\t0.006 0.024 0.044 0.072 0.357\n",
"\t0.005 0.02 0.036 0.058 0.123\n",
"\t0.07 0.158 0.257 0.375 0.55\n",
"\t\n",
"\n",
"pi_NW:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.556 0.078 0.006 [ 0.4 0.696]\n",
"\t0.011 0.012 0.0 [ 0. 0.035]\n",
"\t0.033 0.021 0.001 [ 0.002 0.072]\n",
"\t0.202 0.076 0.007 [ 0.088 0.375]\n",
"\t0.021 0.015 0.001 [ 0. 0.052]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.4 0.501 0.559 0.614 0.696\n",
"\t0.0 0.003 0.008 0.016 0.044\n",
"\t0.007 0.019 0.029 0.043 0.085\n",
"\t0.094 0.147 0.186 0.244 0.385\n",
"\t0.002 0.01 0.018 0.029 0.061\n",
"\t\n",
"\n",
"pi_SW:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.462 0.017 0.001 [ 0.431 0.496]\n",
"\t0.031 0.009 0.001 [ 0.017 0.051]\n",
"\t0.008 0.003 0.0 [ 0.002 0.014]\n",
"\t0.077 0.012 0.001 [ 0.055 0.1 ]\n",
"\t0.326 0.023 0.002 [ 0.28 0.368]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.432 0.451 0.462 0.473 0.497\n",
"\t0.018 0.025 0.03 0.036 0.054\n",
"\t0.003 0.006 0.008 0.01 0.016\n",
"\t0.055 0.069 0.077 0.085 0.102\n",
"\t0.28 0.31 0.326 0.342 0.368\n",
"\t\n"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 8,
"text": [
"[None, None, None, None]"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## Calf model"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"calf_model = MCMC(generate_model('Calves'))\n",
"calf_model.sample(iterations, burn=burnin, thin=thin)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
" \r",
"[****************100%******************] 200000 of 200000 complete"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Proportions of mortality"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[p.summary() for p in calf_model.pi]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"pi_ATL:\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.177 0.029 0.003 [ 0.12 0.223]\n",
"\t0.004 0.003 0.0 [ 0. 0.01]\n",
"\t0.015 0.009 0.001 [ 0.003 0.034]\n",
"\t0.503 0.042 0.004 [ 0.428 0.578]\n",
"\t0.021 0.014 0.001 [ 0.005 0.053]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.122 0.154 0.179 0.201 0.226\n",
"\t0.0 0.002 0.003 0.005 0.012\n",
"\t0.003 0.007 0.013 0.02 0.035\n",
"\t0.424 0.471 0.504 0.536 0.575\n",
"\t0.006 0.012 0.017 0.026 0.061\n",
"\t\n",
"\n",
"pi_USJ:\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.533 0.17 0.012 [ 0.215 0.829]\n",
"\t0.065 0.063 0.003 [ 0. 0.19]\n",
"\t0.093 0.099 0.008 [ 0. 0.313]\n",
"\t0.232 0.144 0.011 [ 0.025 0.535]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.218 0.401 0.537 0.668 0.834\n",
"\t0.002 0.019 0.045 0.091 0.239\n",
"\t0.002 0.023 0.056 0.126 0.37\n",
"\t0.043 0.124 0.196 0.31 0.578\n",
"\t\n",
"\n",
"pi_NW:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.443 0.096 0.01 [ 0.219 0.567]\n",
"\t0.019 0.017 0.001 [ 0. 0.052]\n",
"\t0.019 0.019 0.001 [ 0. 0.059]\n",
"\t0.25 0.054 0.005 [ 0.151 0.361]\n",
"\t0.019 0.02 0.001 [ 0. 0.059]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.207 0.397 0.469 0.516 0.561\n",
"\t0.001 0.006 0.014 0.028 0.06\n",
"\t0.0 0.005 0.013 0.027 0.071\n",
"\t0.155 0.212 0.239 0.274 0.37\n",
"\t0.0 0.005 0.013 0.026 0.073\n",
"\t\n",
"\n",
"pi_SW:\n",
" \n",
"\tMean SD MC Error 95% HPD interval\n",
"\t------------------------------------------------------------------\n",
"\t0.214 0.012 0.001 [ 0.187 0.231]\n",
"\t0.003 0.002 0.0 [ 0. 0.008]\n",
"\t0.005 0.003 0.0 [ 0. 0.01]\n",
"\t0.457 0.033 0.003 [ 0.385 0.515]\n",
"\t0.184 0.033 0.003 [ 0.125 0.26 ]\n",
"\t\n",
"\t\n",
"\tPosterior quantiles:\n",
"\t\n",
"\t2.5 25 50 75 97.5\n",
"\t |---------------|===============|===============|---------------|\n",
"\t0.187 0.207 0.216 0.224 0.231\n",
"\t0.0 0.002 0.003 0.004 0.009\n",
"\t0.001 0.003 0.004 0.006 0.012\n",
"\t0.387 0.436 0.459 0.479 0.518\n",
"\t0.125 0.161 0.181 0.203 0.26\n",
"\t\n"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 10,
"text": [
"[None, None, None, None]"
]
}
],
"prompt_number": 10
}
],
"metadata": {}
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment