fonnesbeck/Phase II Vaccine Power Analysis.ipynb

## Phase II Vaccine Power Analysis.ipynb
{
  "cells": [
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "# Phase II Power Analysis\n\nUsing the sample information from the phase I trial, we can calculate power for a phase II trial.\n\n![Table 2](http://d.pr/i/O9Y6+)"
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Proper Bayesian power analysis\n\nWe wish to identify sample sizes for high dose and standard dose (in a 2:1 ratio) that will be able to detect a positive difference in the probability $p$ of a $\\ge 4$ fold titer rise to H3N2 influenza. That is, we wish to be able to detect:\n\n$$\\delta = p^{(HD)} - p^{(SD)} > 0$$\n\nwith 80% power. This analysis will be conducted using the *proper Bayesian* approach (Spiegelhalter et al. 2004). The advantage of this approach for power analysis over a frequentist power analysis is that it does not assume that the underlying probabilities are known. Failure to do so will tend to underestimate variance and result in underpowered experiments.\n\nA sample leading to a successful trial (one in which a difference is detected) under a particular sample size $n$ is:\n\n$$E_n = \\left\\{\\mathbf{x}; Pr( \\delta > 0 | \\mathbf{x} ) \\ge 0.95\\right\\}$$\n\nso the probability of a successful trial is:\n\n$$\\psi(n) = P_{X^{(n)}}(X^{(n)} \\in E_n)$$\n\nwhich is the **expected Bayesian power**.\n\nIn our case, the power is:\n\n$$\\psi(n) = \\Phi\\left(\\frac{\\pi_{hd} - \\pi_{sd}}{\\sqrt{\\pi_{hd}(1-\\pi_{hd})/n_{hd} - \\pi_{sd}(1-\\pi_{sd})/n_{sd}}}\\right)$$\n\nwhere the $\\pi_i$ are the posterior probabilities, parameterized by a beta distribution with a binomial likelihood for the data."
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "We will use the sample data from the phase I trial (GiaQuinta et al. 2014) to construct prior distributions for $p^{(HD)}$ and $p^{(SD)}$."
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": true
      },
      "cell_type": "code",
      "source": "%matplotlib inline\nimport numpy as np\nimport seaborn as sns\nimport pymc3 as pm\nimport theano.tensor as tt\nplt = sns.plt",
      "execution_count": 1,
      "outputs": []
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": true
      },
      "cell_type": "code",
      "source": "# Standard normal from complimentary error function\nΦ = lambda x: 0.5*tt.erfc(-x/tt.sqrt(2))\n\ndef power_analysis(n):\n    \n    # Allocate samples 2:1\n    n_hd = int(2*n/3)\n    n_sd = n - n_hd\n    \n    with pm.Model() as model:\n\n        # probabilities of >= 4-fold-rise taken from Beta priors constructed from phase I data\n        π_hd = pm.Beta('π_hd', 12, 10)\n        π_sd = pm.Beta('π_sd', 2, 13)\n        \n        # Draw binomial samples from probabilities\n        x_hd = pm.Binomial('x_hd', n_hd, π_hd)\n        x_sd = pm.Binomial('x_sd', n_sd, π_sd)\n        \n        # Empirical probabilities\n        π_hd_post = x_hd/n_hd\n        π_sd_post = x_sd/n_sd\n        \n        # Convert difference in probabilities to standard normal\n        d = pm.Deterministic('d', (π_hd_post - π_sd_post)\n                             / tt.sqrt(π_hd_post*(1-π_hd_post)/n_hd + π_sd_post*(1-π_sd_post)/n_sd))\n        \n        # P-value\n        p = 1 - Φ(d)\n        \n        # Successful detection\n        power = pm.Deterministic('power', p<0.05)\n        \n    return model",
      "execution_count": 2,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "Use model to estimate power for a range of sample sizes.\n\nn=60:"
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": false
      },
      "cell_type": "code",
      "source": "iterations = 30000\nkeep = 20000\n\nmod60 = power_analysis(60)\nwith mod60:\n    tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n    \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
      "execution_count": 3,
      "outputs": [
        {
          "output_type": "stream",
          "text": "100%|██████████| 30000/30000 [00:07<00:00, 3802.60it/s]\n",
          "name": "stderr"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": "         mean  mc_error\npower  0.8896  0.006621",
            "text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mean</th>\n      <th>mc_error</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>power</th>\n      <td>0.8896</td>\n      <td>0.006621</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
          },
          "metadata": {},
          "execution_count": 3
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "n=48:"
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": false,
        "scrolled": true
      },
      "cell_type": "code",
      "source": "iterations = 30000\nkeep = 20000\n\nmod48 = power_analysis(48)\nwith mod48:\n    tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n    \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "stream",
          "text": "100%|██████████| 30000/30000 [00:08<00:00, 3689.98it/s]\n",
          "name": "stderr"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": "         mean  mc_error\npower  0.8685    0.0075",
            "text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mean</th>\n      <th>mc_error</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>power</th>\n      <td>0.8685</td>\n      <td>0.0075</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
          },
          "metadata": {},
          "execution_count": 4
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "n=42:"
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": false
      },
      "cell_type": "code",
      "source": "mod42 = power_analysis(42)\nwith mod42:\n    tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n    \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "stream",
          "text": "100%|██████████| 30000/30000 [00:07<00:00, 3967.43it/s]\n",
          "name": "stderr"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": "        mean  mc_error\npower  0.836  0.008768",
            "text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mean</th>\n      <th>mc_error</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>power</th>\n      <td>0.836</td>\n      <td>0.008768</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
          },
          "metadata": {},
          "execution_count": 5
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "n=36:"
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": false
      },
      "cell_type": "code",
      "source": "mod36 = power_analysis(36)\nwith mod36:\n    tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n    \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "stream",
          "text": "100%|██████████| 30000/30000 [00:07<00:00, 3774.80it/s]\n",
          "name": "stderr"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": "         mean  mc_error\npower  0.8212  0.008525",
            "text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mean</th>\n      <th>mc_error</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>power</th>\n      <td>0.8212</td>\n      <td>0.008525</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
          },
          "metadata": {},
          "execution_count": 6
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "n=27:"
    },
    {
      "metadata": {
        "trusted": true,
        "collapsed": false
      },
      "cell_type": "code",
      "source": "mod27 = power_analysis(27)\nwith mod27:\n    tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n    \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
      "execution_count": 7,
      "outputs": [
        {
          "output_type": "stream",
          "text": "100%|██████████| 30000/30000 [00:08<00:00, 3413.36it/s]\n",
          "name": "stderr"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": "          mean  mc_error\npower  0.74735  0.009091",
            "text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mean</th>\n      <th>mc_error</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>power</th>\n      <td>0.74735</td>\n      <td>0.009091</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
          },
          "metadata": {},
          "execution_count": 7
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "The power analysis suggests that a sample size in the low- to mid-30s would be sufficient to detect a difference between high and standard dose with 80% power in a similar population. A sample size over 60 would be required for 90% power."
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## References\n\n1. GiaQuinta, S., Michaels, M. G., McCullers, J. A., Wang, L., Fonnesbeck, C., O'Shea, A., et al. (2014). Randomized, double-blind comparison of standard-dose vs. high-dose trivalent inactivated influenza vaccine in pediatric solid organ transplant patients. Pediatric Transplantation, 19(2), 219–228. http://doi.org/10.1111/petr.12419\n1. Spiegelhalter, D. J., Abrams, K. R., and Myles, J. P. (2004). Bayesian approaches to Clinical Trials and Health Care Evaluation. Wiley, New York."
    }
  ],
  "metadata": {
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3",
      "language": "python"
    },
    "language_info": {
      "codemirror_mode": {
        "version": 3,
        "name": "ipython"
      },
      "file_extension": ".py",
      "nbconvert_exporter": "python",
      "name": "python",
      "pygments_lexer": "ipython3",
      "mimetype": "text/x-python",
      "version": "3.5.2"
    },
    "gist": {
      "id": "c6c1dc8c112d785331684e2ccd2575c6",
      "data": {
        "description": "Vaccine Power Analysis-Copy1.ipynb",
        "public": true
      }
    },
    "_draft": {
      "nbviewer_url": "https://gist.github.com/c6c1dc8c112d785331684e2ccd2575c6"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 1
}
	{
	"cells": [
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "# Phase II Power Analysis\n\nUsing the sample information from the phase I trial, we can calculate power for a phase II trial.\n\n![Table 2](http://d.pr/i/O9Y6+)"
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "## Proper Bayesian power analysis\n\nWe wish to identify sample sizes for high dose and standard dose (in a 2:1 ratio) that will be able to detect a positive difference in the probability $p$ of a $\\ge 4$ fold titer rise to H3N2 influenza. That is, we wish to be able to detect:\n\n$$\\delta = p^{(HD)} - p^{(SD)} > 0$$\n\nwith 80% power. This analysis will be conducted using the proper Bayesian approach (Spiegelhalter et al. 2004). The advantage of this approach for power analysis over a frequentist power analysis is that it does not assume that the underlying probabilities are known. Failure to do so will tend to underestimate variance and result in underpowered experiments.\n\nA sample leading to a successful trial (one in which a difference is detected) under a particular sample size $n$ is:\n\n$$E_n = \\left\\{\\mathbf{x}; Pr( \\delta > 0 \| \\mathbf{x} ) \\ge 0.95\\right\\}$$\n\nso the probability of a successful trial is:\n\n$$\\psi(n) = P_{X^{(n)}}(X^{(n)} \\in E_n)$$\n\nwhich is the expected Bayesian power.\n\nIn our case, the power is:\n\n$$\\psi(n) = \\Phi\\left(\\frac{\\pi_{hd} - \\pi_{sd}}{\\sqrt{\\pi_{hd}(1-\\pi_{hd})/n_{hd} - \\pi_{sd}(1-\\pi_{sd})/n_{sd}}}\\right)$$\n\nwhere the $\\pi_i$ are the posterior probabilities, parameterized by a beta distribution with a binomial likelihood for the data."
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "We will use the sample data from the phase I trial (GiaQuinta et al. 2014) to construct prior distributions for $p^{(HD)}$ and $p^{(SD)}$."
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": true
	},
	"cell_type": "code",
	"source": "%matplotlib inline\nimport numpy as np\nimport seaborn as sns\nimport pymc3 as pm\nimport theano.tensor as tt\nplt = sns.plt",
	"execution_count": 1,
	"outputs": []
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": true
	},
	"cell_type": "code",
	"source": "# Standard normal from complimentary error function\nΦ = lambda x: 0.5tt.erfc(-x/tt.sqrt(2))\n\ndef power_analysis(n):\n \n # Allocate samples 2:1\n n_hd = int(2n/3)\n n_sd = n - n_hd\n \n with pm.Model() as model:\n\n # probabilities of >= 4-fold-rise taken from Beta priors constructed from phase I data\n π_hd = pm.Beta('π_hd', 12, 10)\n π_sd = pm.Beta('π_sd', 2, 13)\n \n # Draw binomial samples from probabilities\n x_hd = pm.Binomial('x_hd', n_hd, π_hd)\n x_sd = pm.Binomial('x_sd', n_sd, π_sd)\n \n # Empirical probabilities\n π_hd_post = x_hd/n_hd\n π_sd_post = x_sd/n_sd\n \n # Convert difference in probabilities to standard normal\n d = pm.Deterministic('d', (π_hd_post - π_sd_post)\n / tt.sqrt(π_hd_post(1-π_hd_post)/n_hd + π_sd_post(1-π_sd_post)/n_sd))\n \n # P-value\n p = 1 - Φ(d)\n \n # Successful detection\n power = pm.Deterministic('power', p<0.05)\n \n return model",
	"execution_count": 2,
	"outputs": []
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "Use model to estimate power for a range of sample sizes.\n\nn=60:"
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": false
	},
	"cell_type": "code",
	"source": "iterations = 30000\nkeep = 20000\n\nmod60 = power_analysis(60)\nwith mod60:\n tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
	"execution_count": 3,
	"outputs": [
	{
	"output_type": "stream",
	"text": "100%\|██████████\| 30000/30000 [00:07<00:00, 3802.60it/s]\n",
	"name": "stderr"
	},
	{
	"output_type": "execute_result",
	"data": {
	"text/plain": " mean mc_error\npower 0.8896 0.006621",
	"text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>mean</th>\n <th>mc_error</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>power</th>\n <td>0.8896</td>\n <td>0.006621</td>\n </tr>\n </tbody>\n</table>\n</div>"
	},
	"metadata": {},
	"execution_count": 3
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "n=48:"
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": false,
	"scrolled": true
	},
	"cell_type": "code",
	"source": "iterations = 30000\nkeep = 20000\n\nmod48 = power_analysis(48)\nwith mod48:\n tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
	"execution_count": 4,
	"outputs": [
	{
	"output_type": "stream",
	"text": "100%\|██████████\| 30000/30000 [00:08<00:00, 3689.98it/s]\n",
	"name": "stderr"
	},
	{
	"output_type": "execute_result",
	"data": {
	"text/plain": " mean mc_error\npower 0.8685 0.0075",
	"text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>mean</th>\n <th>mc_error</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>power</th>\n <td>0.8685</td>\n <td>0.0075</td>\n </tr>\n </tbody>\n</table>\n</div>"
	},
	"metadata": {},
	"execution_count": 4
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "n=42:"
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": false
	},
	"cell_type": "code",
	"source": "mod42 = power_analysis(42)\nwith mod42:\n tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
	"execution_count": 5,
	"outputs": [
	{
	"output_type": "stream",
	"text": "100%\|██████████\| 30000/30000 [00:07<00:00, 3967.43it/s]\n",
	"name": "stderr"
	},
	{
	"output_type": "execute_result",
	"data": {
	"text/plain": " mean mc_error\npower 0.836 0.008768",
	"text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>mean</th>\n <th>mc_error</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>power</th>\n <td>0.836</td>\n <td>0.008768</td>\n </tr>\n </tbody>\n</table>\n</div>"
	},
	"metadata": {},
	"execution_count": 5
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "n=36:"
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": false
	},
	"cell_type": "code",
	"source": "mod36 = power_analysis(36)\nwith mod36:\n tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
	"execution_count": 6,
	"outputs": [
	{
	"output_type": "stream",
	"text": "100%\|██████████\| 30000/30000 [00:07<00:00, 3774.80it/s]\n",
	"name": "stderr"
	},
	{
	"output_type": "execute_result",
	"data": {
	"text/plain": " mean mc_error\npower 0.8212 0.008525",
	"text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>mean</th>\n <th>mc_error</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>power</th>\n <td>0.8212</td>\n <td>0.008525</td>\n </tr>\n </tbody>\n</table>\n</div>"
	},
	"metadata": {},
	"execution_count": 6
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "n=27:"
	},
	{
	"metadata": {
	"trusted": true,
	"collapsed": false
	},
	"cell_type": "code",
	"source": "mod27 = power_analysis(27)\nwith mod27:\n tr = pm.sample(iterations, step=pm.Metropolis(), init=None)\n \npm.df_summary(tr[-keep:], varnames=['power'])[['mean', 'mc_error']]",
	"execution_count": 7,
	"outputs": [
	{
	"output_type": "stream",
	"text": "100%\|██████████\| 30000/30000 [00:08<00:00, 3413.36it/s]\n",
	"name": "stderr"
	},
	{
	"output_type": "execute_result",
	"data": {
	"text/plain": " mean mc_error\npower 0.74735 0.009091",
	"text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>mean</th>\n <th>mc_error</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>power</th>\n <td>0.74735</td>\n <td>0.009091</td>\n </tr>\n </tbody>\n</table>\n</div>"
	},
	"metadata": {},
	"execution_count": 7
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "The power analysis suggests that a sample size in the low- to mid-30s would be sufficient to detect a difference between high and standard dose with 80% power in a similar population. A sample size over 60 would be required for 90% power."
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "## References\n\n1. GiaQuinta, S., Michaels, M. G., McCullers, J. A., Wang, L., Fonnesbeck, C., O'Shea, A., et al. (2014). Randomized, double-blind comparison of standard-dose vs. high-dose trivalent inactivated influenza vaccine in pediatric solid organ transplant patients. Pediatric Transplantation, 19(2), 219–228. http://doi.org/10.1111/petr.12419\n1. Spiegelhalter, D. J., Abrams, K. R., and Myles, J. P. (2004). Bayesian approaches to Clinical Trials and Health Care Evaluation. Wiley, New York."
	}
	],
	"metadata": {
	"kernelspec": {
	"name": "python3",
	"display_name": "Python 3",
	"language": "python"
	},
	"language_info": {
	"codemirror_mode": {
	"version": 3,
	"name": "ipython"
	},
	"file_extension": ".py",
	"nbconvert_exporter": "python",
	"name": "python",
	"pygments_lexer": "ipython3",
	"mimetype": "text/x-python",
	"version": "3.5.2"
	},
	"gist": {
	"id": "c6c1dc8c112d785331684e2ccd2575c6",
	"data": {
	"description": "Vaccine Power Analysis-Copy1.ipynb",
	"public": true
	}
	},
	"_draft": {
	"nbviewer_url": "https://gist.github.com/c6c1dc8c112d785331684e2ccd2575c6"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 1
	}