nmatzke/week_05_exercise_v3.R

## week_05_exercise_v3.R

#######################################################
# BIOSCI220: Week 5 Lab Exercise: Modelling epidemics
#######################################################


#######################################################
# Setup
#######################################################
#
# We only need to add one thing to last-week's setup:
# the R package "deSolve".
#

# You can check if you have this already, by typing:
library(deSolve)

# ...e.g., the R-snippets website has deSolve already
# installed (along with thousands of other R packages).

# If you need to install it, type:
# install.packages("deSolve")

# R might do a few things here:
#
# 1. It might ask you to choose a CRAN repository.
#    These are computer servers around the world
#    that hold copies of all of the R packages.
#    Closer servers will be a bit faster, so you
#    might choose an Australian server. However,
#    I find that almost everything is equally
#    fast, so often I just pick "0", for 0-cloud,
#    the master server in the U.K.
#
# 2. R might ask you if you want to install "source"
#    or a "binary". Choose the "binary" option if
#    at all possible -- binaries are precompiled,
#    which means you don't have to install a
#    bunch of compiler programs to compile the
#    package from source.
#
#    Compiling from source, if you have to do it,
#    often works fairly easily on Mac and Linux
#    computers, but is often quite difficult on
#    Windows, unless you are already a computer
#    nerd that has installed compilers for other
#    reasons.
#
#    If you can't get install.packages() to work,
#    you can always use the R Snippets website,
#    keeping in mind that you may have to do the
#    tedious operation of copy-pasting the whole
#    script, down to your line of interest (I am
#    trying to write the script to avoid this, but
#    it is not always possible).
#
# 3. Note that once you get install.packages("deSolve")
#    to work once, you don't have to run it ever again
#    (unless if you install a new version of R or
#     something).  So *don't* go re-running
#    install.packages("deSolve") every time you
#    re-start the script.
#
#####################################################


#######################################################
# Week 5: SIR models in R
#######################################################
#
# To get a better sense of what biologically realistic
# models are, how they work, and how they are used,
# in Week 5, we will work with SIR models.
#
# SIR models describe the changes between
# 3 states:
#
# S = Susceptible
# I = Infectious
# R = Recovered
#
# And N = total population, i.e. S+I+R
#
# ...and that they rely on 2 parameters,
#
#  R0    = the basic reproductive number (the number of new people
#          each infected person infects)
#  D_inf = duration of infectiousness (in timesteps, e.g. days)
#
# These parameters can be transformed into other parameters, that are alternate descriptions
# of the dynamics of the epidemic.
#
# beta = R0 / D_inf = the rate at which an Infectious individual infects Susceptible individuals
# nu   = 1 / D_inf  = the rate at which an Infectious individual moves to Recovered
#
# Remember the equations from the slides that describe
# the rates of change between the S, I, and R states?
#
# The first equation describes "dS", the rate of change in S.
#
# This is:
#
# dS <- -beta * S * I/N
#
# ...that is, as the proportion of Infected (I/N) goes up,
# more Susceptibles (S) will be infected per day. When
# Susceptibles move to Infected, they are removed from
# Susceptible, so we use -beta instead of beta.
#
# The second equation describes "dI", the rate of change in
# Infectious (I):
#
# dI <- beta * I/N * S - nu * I
#
# The first part of this equation, beta * I/N * S, is the same
# as the dS equation, because the individuals that leave S are
# added to I.
#
# The second part of this equation, -nu * I, describes the
# number of individuals that leave I by becoming Recovered.
#
# The third equation:
#
# dR <- nu * I
#
# ...is the same as the rate of individuals leaving I.
#
#######################################################
#
# This SIR model is a common, simple, example of a system of
# Ordinary Differential Equations (ODEs). There are
# whole courses, and careers, built on differential
# equations, because they can describe a diverse array
# of physical and biological processes.
#
# A big advantage of ODEs is that all you need to make them
# work is:
#
# (a) an initial state (the starting state)
#
# (b) a system of equations describing the rates of change
#     in the states.
#
# ...that is, you don't need equations that will describe
# what the process looks like at every time point. Instead,
# "numerical solvers" in the computer can work out the
# approximate solution at any time-point, by starting at
# the initial conditions, and taking small time-steps. A
# standard library for numerical solving of ODEs is the
# "deSolve" library in R.
#
# BIOSCI220 is not intended to be a detailed introduction
# to ODEs and their analysis. It is just good for you to
# be a little bit aware of what ODEs are. When you hear of
# scientists, statisticians, etc. that are "modellers",
# many of them are working with ODEs.
#
#######################################################

#######################################################
#
# How do we implement this model in R, so that we can
# use it to do science?
#
# We write our own R functions. Every time you give
# R a command, you are using a function.
#
#######################################################
#
# The great thing about R is that you can see the code inside
# of a function, just by typing the function name and
# hitting return.
#
# Let's write out the system of ODEs for the SIR model.
#
# We will put them in an R function, called "SIR_ode":
#
# Note 1: I have written comments inside the function,
# lines starting with "#", to describe what is happening.
# Comments are EXTREMELY USEFUL when you are writing
# your own R scripts & functions. NOTE: COMMENTED LINES
# ARE NOT FUNCTIONAL, IT DOESN'T MATTER IF YOU PASTE
# THEM INTO R, OR NOT.
#
# Note 2: To make functions work in R, you have to
# paste them ALL AT ONCE.  That is, you paste everything
# from the name, "SIR_ode", to the end, "}".
#
# Once you've pasted the function into R, it will be
# remembered, within the same R session (this does not
# apply on the R snippets website).
#
SIR_ode <- function(time, state, parameters)
	{
	# The above inputs are:
	# time = a time point
	# state = an R list of the 3 state values, i.e. the
	#         population sizes of S, I, and R
	# parameters = an R list of the parameters R0 and D_inf

	# Convert the input parameters "R0" and "D_inf"
	# into the rates "beta" and "nu"
	beta <- parameters[["R0"]] / parameters[["D_inf"]]
	nu <- 1 / parameters[["D_inf"]]

	# Extract the current values of the states S, I, and R
	S <- state[["S"]]
	I <- state[["I"]]
	R <- state[["R"]]

	# Calculate N, the total of S+I+R,
	# i.e. the population size
	N <- S + I + R

	# Write out the system of
	# Ordinary Differential Equations (ODEs)
	dS <- -beta * I/N * S
	dI <- beta * I/N * S - nu * I
	dR <- nu * I

	# Return the current rates of change of S, I, and R:
	return(list(c(dS, dI, dR)))
	} # End of function SIR_ode

# Once you have pasted the function definition in R, you have
# it loaded into memory for the duration of the R session.
#
# To see this, paste the function definition above, then
# type:

SIR_ode

# As you can see, typing the name of a previously loaded
# function, WITHOUT the parenthesis, shows the code inside
# the function. This is a GREAT trick to learn how R programs
# work. Many R programmers (like me) have learned how key
# computational methods in their field work, by studying
# the R code produced by other Computational Biologists.
##############################################################


##############################################################
# OK, now that we have defined the SIR model as a system
# of ODEs, what can we do with it?
#
# Let's run the SIR_ode code for one particular timestep:
#

# Input the parameters
parameters = list()
parameters$R0    = 3.0
parameters$D_inf = 2.0

# Input the starting state of the system
state = list()
state$S = 999
state$I = 1
state$R = 0

# Input the time
time = 1


# Run the SIR_ode function
SIR_ode(time, state, parameters)

# The output of SIR_ode is the rates of
# change in the 3 states, in order:
# S, I, and R


# QUESTION #1: What value did R report for
# dS, the rate of change of S? (This is the
# first value of the 3 values produced by
# SIR_ode.)

##############################################################


##############################################################
# Knowing the rate of change doesn't, by itself, tell us
# what we want to know, which is the numbers of Susceptible,
# Infectious, and Recovered individuals at different timepoints.
#
# To get the trajectory of an epidemic, we have to calculate
# the approximate solution in small time increments. This
# basically works because the rate can be assumed to be
# approximately constant over small time increments.
#
# The R function that does the numerical approximation of
# the ODE over a series of time-steps is deSolve's ode()
# function.
#
# The function ode() solves systems of ODEs, if you input:
#
# * init.state = the starting values of the 3 states
# * times = the time points at which you would like to see
#           approximated values of the states
# * func = the function holding the actual system of
#          Ordinary Differential Equations (ODEs). In our
#          case, this is SIR_ode(), which we defined above.
# * parms = the parameter values for R0 and D_inf. NOTE:
#           in modelling, parameters are often abbreviated
#           with the symbol "theta".
# * method = There are dozens of methods for approximating
#            solutions of ODEs. We will use 'ode45', a
#            basic "Runge-Kutta" method. We don't need to
#            know how this works.

# Let's try it:

# Input the starting state of the system
state = list()
state$S = 999
state$I = 1
state$R = 0
init.state = unlist(state)

times = 1:30

# Input the parameters
parameters = list()
parameters$R0    = 3.0
parameters$D_inf = 2.0

library(deSolve)
trajectory = ode(y = init.state, times=times, func=SIR_ode, parms=parameters, method='ode45')
trajectory

# Have a look at the output "trajectory", which is a table of predicted
# values of S, I and R, for time-points 1-30:

trajectory

# ...and please answer the questions below. These questions are meant to help
# you understand the outputs contained in "trajectory".


# QUESTION #2: What is the total population size at the beginning of this simulation?
# This answer should be an integer (i.e., a whole number, with no decimal values.)


# QUESTION #3: What is the total population size at the end of this simulation?
# This answer should be an integer (i.e., a whole number, with no decimal values.)


# QUESTION #4: Does the total population size change at any point in this simulation?


#######################################################
# Plotting the trajectory
#######################################################
#
# With careful study, you can get a sense of the
# trajectory of an epidemic by studying the
# "trajectory" output table.  However, it is much
# easier to see what is going on by plotting the
# trajectory. Here is some simple code to do that.
#

# Convert the trajectory matrix to an R data.frame:
trajectory = as.data.frame(trajectory)

# Set up the y-axis min and max
miny = 0
maxy = max(rowSums(trajectory[,-1]))

# Plot the points as invisible white dots (just to set up the plot)
plot(x=trajectory$time, y=trajectory$S, pch=".", col="white", xlim=c(0, max(trajectory$time)), ylim=c(miny, maxy), xlab="Week", ylab="Number of individuals")
title("Plot #1: SIR model simulation\n(blue=Susceptible, red=Infectious, green=Recovered)")

# Plot states S, I, and R as blue, red, and green lines
lines(x=trajectory$time, y=trajectory$S, lwd=3, col="blue")
lines(x=trajectory$time, y=trajectory$I, lwd=3, col="firebrick2")
lines(x=trajectory$time, y=trajectory$R, lwd=3, col="green3")

# Spend several minutes studying this plot, to make sure you
# understand it, and to understand how it corresponds exactly
# to the numbers stored in the "trajectory" table.
#
# (NOTE: You will probably have to watch the lecture introduction
#  in order to successfully answer some of these questions.)
#
# Please answer the following questions while examining Plot #1:
#
# Question #5: Over which weeks is the epidemic growing approximately
#             exponentially?
#
# Question #6: About what percentage of the population is infected at once,
#             at the peak of the Infectious curve?
#
# Question #7: About what week does the peak of the Infectious curve occur?
#
# Question #8: About what percentage of the population ends up having been infected
#             after 30 weeks?
#
# Question #9: About what percentage of the population never gets infected
#             in this simulation?
#
# Question #10: This simulation represents the dynamics of an epidemic
#             in a "natural" situation, with zero intervention, no medicine
#             and no vaccines. Why then does the number of Infected eventually
#             decrease, and then drop to ~0?
#
# Question #11: The PM of the United Kingdom, Boris Johnson, initially suggested
#             relying on naturally-acquired herd immunity to stop the spread
#             of COVID-19. Assuming that 1% of infected individuals die, what
#             does Plot #1 suggest is the flaw in this strategy?
#
# Question #12: Plot #1 suggests that an epidemic will eventually "burn itself
#             out" naturally, and the percentage currently infected is
#             naturally limited to about 30%. Applied to COVID-19, does Plot #1
#             describe an optimistic scenario, or a worst-case disaster?


#######################################################
# Doing a more realistic simulation
#######################################################
#
# In order to make the "trajectory" table readable, I
# made the previous simulation small -- 1000 individuals,
# and 30 time-steps. If we interpret the time-steps as
# weeks, this approximates COVID-like dynamics, because
# COVID's duration of infectivity (the D_inf parameter)
# is something like 2 weeks. I also had the R0 parameter
# set to 3.0, meaning that each Infectious individual
# infects another 3 individuals, at time 0 when the
# population is 100% susceptible.
#
# However, we can run a very similar simulation for a
# realistic population size.  So, let's run one for
# a New-Zealand-like population with 5 million
# individuals.
#
# To make this easier, I have written two functions:
#
# simulate_SIR() -- this runs the simulation, with inputs:
# * theta -- an R list, containing the parameters R0 and D_inf
# * init.state -- the starting counts for S, I, and R individuals
# * times -- the time-points at which to measure the epidemic
#
# plot_trajectory_lines() -- plots the trajectory, with inputs
# * trajectory -- the table output by simulate_SIR()
# * theta -- the parameters from above, if you want them to display
#            on the plot
#
#
# I will paste in the functions here. Hopefully, you can
# see that they are just more elaborate versions of the
# by-hand simulation and plotting that we did above.
#
# PLEASE REMEMBER TO PASTE IN EACH FUNCTION DEFINITION ALL
# AT ONCE, FROM THE FUNCTION TITLE, ALL THE WAY TO THE
# CLOSING "}" BRACKET

simulate_SIR <- function(theta, init.state, times)
	{
	trajectory <- data.frame(ode(y = init.state,
                times = times,
                func = SIR_ode,
                parms = theta,
                method = "ode45"))
	return(trajectory)
	} # End of simulate_SIR()

plot_trajectory_lines <- function(trajectory,  theta=NULL, thetas_states_df=NULL, boxcolors=c("blue","firebrick2","green3"), title_prefix="", ylog=FALSE, vline_text=TRUE)
	{
	# Write the parameters in the plot title
	if (is.null(thetas_states_df) == TRUE)
		{
		if (is.null(theta) == FALSE)
			{
			titletxt = paste0("Deterministic projection of epidemic dynamics\nwhen R0=", theta$R0, ", D_inf=", theta$D_inf)
			} else {
			titletxt = paste0("Deterministic projection of epidemic dynamics\n(parameters R0 and D_inf not given)")
			}
		} else {
		starting_I = thetas_states_df$I[1]
		starting_R0 = thetas_states_df$R0[1]
		starting_D_inf = thetas_states_df$D_inf[1]
		titletxt = paste0("Deterministic projection of epidemic dynamics\n(", nrow(thetas_states_df), " regimes; starts with I=", starting_I, ", R0=", starting_R0, ", D_inf=", starting_D_inf, ")")
		}

	# Add a prefix to the title
	if (title_prefix != "")
		{
		titletxt = paste(title_prefix, titletxt, sep=" ")
		}

	# Log-scale the y-axis:
	logaxis = "" # default (no log-scaling of plot)
	miny = 0
	if (ylog == TRUE)
		{
		logaxis = "y"
		miny = 1.0
		}

	# Set up the y-axis min and max
	maxy = max(rowSums(trajectory[,-1]))

	# Plot the points as invisible white dots (just to set up the plot)
	plot(x=trajectory$time, y=trajectory$S, pch=".", col="white", xlim=c(0, max(trajectory$time)), ylim=c(miny, maxy), xlab="Day", ylab="Number of individuals", log=logaxis)
	title(titletxt)

	# Plot states S, I, and R as lines
	lines(x=trajectory$time, y=trajectory$S,
	lwd=3, col=boxcolors[1])
	lines(x=trajectory$time, y=trajectory$I, lwd=3, col=boxcolors[2])
	lines(x=trajectory$time, y=trajectory$R, lwd=3, col=boxcolors[3])

	# Plot the regime values
	if (is.null(thetas_states_df) == FALSE)
		{
		for (rn in 2:nrow(thetas_states_df))
			{
			abline(v=thetas_states_df$time[rn], lty="dotted", col="grey")
			tmprow = thetas_states_df[rn,]

			Stxt = NULL
			Itxt = NULL
			Rtxt = NULL
			if (tmprow$S != 0)
				Stxt = paste0("S=+", tmprow$S, " ")
			if (tmprow$I != 0)
				Itxt = paste0("I=+", tmprow$I, " ")
			if (tmprow$R != 0)
				Rtxt = paste0("R=+", tmprow$R, " ")
			txt1 = paste0("R0=", thetas_states_df$R0[rn], " D_inf=", thetas_states_df$D_inf[rn])
			txt2 = paste0(Stxt, Itxt, Rtxt, sep=" ")
			txt2 = gsub(pattern="  ", replacement=" ", x=txt2)
			txt2 = gsub(pattern="  ", replacement=" ", x=txt2)
			txt = paste0(txt1, " ", txt2)

			# Add text next to vertical lines, if desired
			if (vline_text == TRUE)
				{
				text(x=thetas_states_df$time[rn], y=miny, labels=txt, adj=c(0,1), pos=4, offset=-0.15, col="grey", srt=90, cex=0.7)
				}
			} # END for (rn in 2:nrow(thetas_states_df))
		} # END if (is.null(thetas_states_df) == FALSE)
	} # END of plot_trajectory_lines()


# OK, let's run the model, and make a projection, assuming a
# realistic population size, an R0 of 3.0, and a
# D_inf of 14 days, and an epidemic proceeding naturally,
# "burning out of control."
#
# (NZ current population size on April 5
# is 4,812,882, according to worldometers.info. We
# will approximate this as 5 million to ease
# readability. Plus, there are apparently 100,000+
# visitors still in NZ.)
#
init.state = c(S=5000000, I=1, R=0)
theta = list(R0=3.0, D_inf=14.0)
times = 1:400
trajectory = simulate_SIR(theta, init.state, times)
plot_trajectory_lines(trajectory, theta=theta, title_prefix="Plot #2:")

#
# We can use this model to make projections of certain key numbers
# that governments would want to know. While the plot is useful
# for getting a visual of the course of an uncontrolled epidemic,
# to get actual numbers, it is easiest to use the "trajectory"
# table.
#
# I will illustrate, with some simple R commands. You will then use
# similar commands to answer the questions below.
#
# USEFUL FUNCTIONS FOR TABLES
#
# head()/tail(): Very large tables are hard to see in R, because R
# only prints 50 or lines of the table to the screen.
#
# The "head" and "tail" functions show the top and bottom of the
# table:

head(trajectory)
tail(trajectory)

# To see the names of each column in the table, use the names() function:
names(trajectory)

# To get the minimum or maximum value in a particular column,
# use the min() and max() functions.
#
# You can refer to a particular column with the $name convention
min(trajectory$time)
max(trajectory$time)

min(trajectory$S)
max(trajectory$S)

# Other common R functions that work the same way include
# mean(), median(), sum(), sd(), etc.
#


# Please answer the questions below.

# QUESTION #13: What is the total number of people that
# get infected in NZ, according to this projection? (report the exact
# number, which can be read from the trajectory table)

# QUESTION #14: About what percentage of the population is this? (multiple choice)

# QUESTION #15: If COVID-19 had a mortality rate of 1.5% (I am not
# saying this is true -- this rate is much-debated, for various
# reasons, and NZ will likely do much better than this), what
# number of people would die?

# QUESTION #16: Is this projected death number roughly the same as
# the number projected by University of Auckland physicist
# Shaun Hendy (https://unidirectory.auckland.ac.nz/profile/shaun-hendy ) for an
# uncontrolled epidemic, according to this article?
#
# Kate Newton (27 March 2020). The man modelling NZ's Covid-19 spread
# from his kitchen table. Radio New Zealand.
# https://www.rnz.co.nz/news/in-depth/412744/the-man-modelling-nz-s-covid-19-spread-from-his-kitchen-table
#
# (NOTE: Don't overthink "roughly the same". In the world of projecting
#  exponential processes, any projection within 50% is a pretty good match.)
#

# We can do other operations on the table. For example, according to this April 5
# article,
#
# Tom Dillane and Chris Knox (April 5, 2020), "Coronavirus Covid 19: New Zealand's
# best and worst prepared DHBs by elderly population and ICU beds." NZ Herald.
# https://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=12321390
#
# "As of February 25, the Ministry of Health reported just 173 total ICU beds
#  nationally. This was scraped up to a total of 233 including their high dependency
# care beds and cardiac care unit beds, with respirators."
#
# The article later says the number of ICU beds can be gotten up to about 500, and
# that about 4% of symptomatic COVID-19 cases need ICU beds. We can use our model to
# answer, "How many days into the epidemic do we run out of beds?"
#
# Let's assume that all of infections are symptomatic (this number
# is much-debated). This suggests that 4% of total infections will need ICU beds.
#
# So, we need to ask, on what day do ICU bed run out, overwhelming the health
# care system?
#
# First, so that we have the code in one place, let's run the model again.
init.state = c(S=5000000, I=1, R=0)
theta = list(R0=3.0, D_inf=14.0)
times = 1:400
trajectory = simulate_SIR(theta, init.state, times)
plot_trajectory_lines(trajectory, theta=theta, title_prefix="Plot #2:")

# Next, let's use a greater-than (">") statement to return TRUE/FALSE (TF)
# on whether the model's (current number of infections) * 0.04 is greater
# than 500:

TF = (trajectory$I * 0.04) > 500
TF

# Look at the screen printout of TF to see on which day hospitals are overwhelmed, under
# this scenario.
#
# Question #17: On what day are hospitals overwhelmed, if we assume 4% of infections
# require ICUs?


# Question #18: Let's say we decide to be very optimistic, and say that half of infections are
# asymptomatic, so that only 2% of infections require ICUs. Under this assumption,
# on what day are hospitals overwhelmed?


# Question #19: What should you tell a politician if they said "Why is this
# difference in the number of days so small?!? We cut the hospitalization
# rate in half!"
#
# - Math is complex, just accept it.
# - Your gut instincts are fine, just go with your gut, I'm sure it will be fine.
# - If we really try hard, we could quadruple the number of ICU beds. That
#   will definitely solve this problem.
# - Exponential growth *really* sucks. During the exponential phase of an epidemic,
#   the number of cases doubles every few days, so halving the number of hospitalizations,
#   or doubling the number of ICU beds, only buys you a few days. We need to prioritize
#   slowing the rate of new infections over everything else.


################################################################
# How to scare a politician
################################################################
#
# The graph that really focused politicians' minds is the one
# that shows when hospitals get overwhelmed. We can make this
# graph here.
#
# If 2% of covid infections need ICUs, and if we have 500 ICU
# beds, then hospitals get overwhelmed at 500 / 0.02 = 25000
# infections.

hospitals_overwhelmed = 500 / 0.02
hospitals_overwhelmed

# We can plot this number, 25000 infections, as a horizonal line
# in R. Let's do our  plot again, then add the line:

init.state = c(S=5000000, I=1, R=0)
theta = list(R0=3.0, D_inf=14.0)
times = 1:400
trajectory = simulate_SIR(theta, init.state, times)
plot_trajectory_lines(trajectory, theta=theta, title_prefix="Plot #3a:")

# Adding the horizontal line with abline():
# h = draw a horizontal line (v for vertical line, a & b for slope/intercept)
# lty = line type, e.g. dashed, dotted, solid
# lwd = line width, default=1, lwd=2 is twice as think
# col = line color, default is black
abline(h=hospitals_overwhelmed, lty="dashed", lwd=2, col="red")

# Let's add a legend:
legend_texts = c("Susceptible (S)", "Infectious (I)", "Recovered (R)", "hospitals overwhelmed")
line_types = c("solid", "solid", "solid", "dashed")
line_widths = c(1, 1, 1, 2)
line_colors = c("blue", "firebrick2", "green3", "red")
legend(x="right", legend=legend_texts, lty=line_types, lwd=line_widths, col=line_colors)


# NOTE: that horizontal dashed line looks close to zero, but that's because
# it's being compared to 5 million.  We can see it more clearly if we make
# a zoomed-in plot:

# Set a *narrower* min and max on the y-axis
miny = 0
maxy = 1500000

# Plot the points as invisible white dots (just to set up the plot)
plot(x=trajectory$time, y=trajectory$S, pch=".", col="white", xlim=c(0, max(trajectory$time)), ylim=c(miny, maxy), xlab="Week", ylab="Number of individuals")
title("Plot #3b: SIR model simulation\n(blue=Susceptible, red=Infectious, green=Recovered)")

# Plot states S, I, and R as blue, red, and green lines
lines(x=trajectory$time, y=trajectory$S, lwd=3, col="blue")
lines(x=trajectory$time, y=trajectory$I, lwd=3, col="firebrick2")
lines(x=trajectory$time, y=trajectory$R, lwd=3, col="green3")

# Add the horizontal line
abline(h=hospitals_overwhelmed, lty="dashed", lwd=2, col="red")

# Let's add a legend:
legend_texts = c("Susceptible (S)", "Infectious (I)", "Recovered (R)", "hospitals overwhelmed")
line_types = c("solid", "solid", "solid", "dashed")
line_widths = c(1, 1, 1, 2)
line_colors = c("blue", "firebrick2", "green3", "red")
legend(x="right", legend=legend_texts, lty=line_types, lwd=line_widths, col=line_colors)


# NOTE: Another method to "zoom in" on the bottom of the plot
# is to plot the y-axis on the log scale:

plot_trajectory_lines(trajectory, theta=theta, title_prefix="Plot #3c (y-axis logged):", ylog=TRUE)
abline(h=hospitals_overwhelmed, lty="dashed", lwd=2, col="red")


# Question #20: In the above example, I started NZ's epidemic with
# a single infection (I=1). What happens if we make the (much more realistic)
# assumption that we had 100 infectious people come in from abroad
# at the beginning of the infection?  On what day are hospitals overwhelmed
# now, assuming that 4% of infections need the ICU?


# Question #21: If politicians weren't scared before, should they be scared
# after you pointed this out?
#
# - Yeah, nah.
# - OMG yes.
#


#######################################################
# Modelling interventions (post your answer to
# the Canvas discussion)
#######################################################
#
# The model projections we have been looking at above
# constitute uncontrolled epidemics, where we let
# the epidemic spread naturally until herd immunity
# stops the spread. This would be fine if COVID-19
# had a low death rate, but it is pretty catastrophic
# if the death rate approaches even 1%.
#
# HOWEVER: Models are not "predictions", they are
# "if/then" statements. I.e., *if* the parameters are
# such-and-such, and *if* the model decently approximates
# reality, and *if* nothing else changes, *then* we
# can predict the future will look something like this.
#
# The *whole point* of epidemic modeling is to project
# the results of different scenarios, including the
# scenarios where humans react to terrifying model
# projections, and decide to take action.
#
# I have modified the above simulation function so that
# the parameters and states can be changed at different
# time-points.
#

simulate_SIR_changes <- function(thetas_states_df, times)
	{
	# Loop through the rows
	trajectory = NULL
	for (rn in 1:nrow(thetas_states_df))
		{
		if (rn == 1)
			{
			init.state = c(S=thetas_states_df$S[rn], I=thetas_states_df$I[rn], R=thetas_states_df$R[rn])
			theta = list(R0=thetas_states_df$R0[rn], D_inf=thetas_states_df$D_inf[rn])
			} else {
			last_row = trajectory[nrow(trajectory),]
			init.state = c(S=last_row$S+thetas_states_df$S[rn], I=last_row$I+thetas_states_df$I[rn], R=last_row$R+thetas_states_df$R[rn])
			theta = list(R0=thetas_states_df$R0[rn], D_inf=thetas_states_df$D_inf[rn])
			}

		# Read in the parameters for the initial, versus later, time-bins
		if (rn >= nrow(thetas_states_df))
			{
			start_time = thetas_states_df$time[rn]
			end_time = max(times)
			} else {
			start_time = thetas_states_df$time[rn]
			end_time = thetas_states_df$time[rn+1]
			}

		TF1 = times >= start_time
		TF2 = times <= end_time
		TF = (TF1 + TF2) == 2
		tmp_times = times[TF]

		tmp_trajectory = simulate_SIR(theta=theta, init.state=init.state, times=tmp_times)
		trajectory = rbind(trajectory, tmp_trajectory)
		} # END for (rn in 1:nrow(thetas_states_df))
	return(trajectory)
	} # End simulate_SIR_changes()


# Basically, instead of inputing
# * theta, a list of parameters
# * init.state, the number of individuals in each state
#
# ...we instead input "thetas_states_df", an R data.frame
# containing the states and parameters at the start,
# followed by rows describing new parameters at
# any time points you like. You can also change the
# state variables suddenly by adding or subtracting
# S, I, and/or R.
#

# Here are some example scenarios. Take a look at these one-by-one. Your
# goal is just to develop an understanding of how different R0,
# different vaccination regimes, etc., effect the dynamics of the outbreak.

# Example where government lockdown on Day 60 drops R0, from 3.0 to 0.3
times = 1:400
time = c(1, 60)
R0 = c(3.0, 0.3)
D_inf = c(14.0, 14.0)
S = c(5000000, 0)
I = c(100, 0)
R = c(0, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #5:")


# Example where government lockdown on Day 60 drops R0 from 3.0 to 0.3,
# keeps lockdown on for 4 weeks, then "goes back to normal"
times = 1:400
time = c(1, 60, 88)
R0 = c(3.0, 0.3, 3.0)
D_inf = c(14.0, 14.0, 14.0)
S = c(5000000, 0, 0)
I = c(100, 0, 0)
R = c(0, 0, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #7:")


# Example where government lockdown on Day 60 drops R0, but only from 3.0 to 1.5
times = 1:400
time = c(1, 60)
R0 = c(3.0, 1.5)
D_inf = c(14.0, 14.0)
S = c(5000000, 0)
I = c(100, 0)
R = c(0, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #8:")


# Example where government lockdown on Day 60 drops R0 only from 3.0 to 1.5,
# but on Day 200, 500,000 vaccine doses suddenly become available and are
# administered over the next 10 weeks (obviously extremely difficult to do this fast)
times = 1:400
time = c(1, 60, 200, 207, 214, 221, 228, 235, 242, 249, 256, 263)
R0 = c(3.0, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5)
D_inf = c(14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0, 14.0)
S = c(5000000, 0, -50000, -50000, -50000, -50000, -50000, -50000, -50000, -50000, -50000, -50000)
I = c(100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
R = c(0, 0, 50000, 50000, 50000, 50000, 50000, 50000, 50000, 50000, 50000, 50000)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #9:")


# Example where 50% of the population is vaccinated before the 100 infectious
# individuals enter
times = 1:400
time = c(1, 390)
R0 = c(3.0, 3.0)
D_inf = c(14.0, 14.0)
S = c(2500000, 0)
I = c(100, 0)
R = c(2500000, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #10:")


# Example where 80% of the population is vaccinated before the 100 infectious
# individuals enter
times = 1:400
time = c(1, 390)
R0 = c(3.0, 3.0)
D_inf = c(14.0, 14.0)
S = c(1000000, 0)
I = c(100, 0)
R = c(4000000, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #11:")


# Note that, when the numbers of infected are very small, it can be
# useful to look at the plot using the log of the y-axis:

# Example where government lockdown on Day 30 drops R0 from 3.0 to 0.1,
# but we go back to R0 of 3.0 after 28 days
times = 1:400
time = c(1, 30, 58)
R0 = c(3.0, 0.1, 3.0)
D_inf = c(14.0, 14.0, 14.0)
S = c(5000000, 0, 0)
I = c(100, 0, 0)
R = c(0, 0, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)

# A standard plot
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #12a:", ylog=FALSE)

# Same plot, but with the y-axis displayed on the log-scale
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Plot #12b:", ylog=TRUE)


#######################################################
# DISCUSSION ASSIGNMENT
#######################################################
#
# OK, after running through the Week 5 lab exercise, we have now
# seen many possible modeled scenarios, and the code for each.
#
# If you look at the code for Plots 5-12, and examine the code for
# "time", "RO", "D_inf", "S", "I", and "R",
# you can see that they all have to have the same number of elements,
# for a particular plot:
#
# If you have 3 regimes, like in Plot #12, you have to have to put
# 3 items in each of those variables, before the "theta_states_df" is
# constructed.
#
# If you have 12 regimes, like in Plot #9, you need 12 items in
# each variable.
#
# If a particular variable doesn't change, you just leave it the
# same between regimes (or, for the states S, I, and R, you
# put "0" to represent no change).
#
# Once you understand this, you should be able to design your
# own SIR model projection.  Here is one I have just made, which
# you can modify:


times = 1:730           # 2 years
time = c(1, 30, 58)     # strong Level 4 lockdown on Day 30, for 28 days
R0 = c(3.0, 0.1, 3.0)   # R0 returns to 3.0 after the lockdown (Level 0 conditions)
D_inf = c(14.0, 14.0, 14.0)
S = c(5000000, 0, 0)
I = c(100, 0, 0)
R = c(0, 0, 0)
thetas_states_table = cbind(time, R0, D_inf, S, I, R)
thetas_states_df = as.data.frame(thetas_states_table, stringsAsFactors=FALSE)
thetas_states_df
trajectory = simulate_SIR_changes(thetas_states_df, times)

# A standard plot
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Nick's model:", ylog=FALSE)

# Same plot, but with the y-axis displayed on the log-scale
plot_trajectory_lines(trajectory, theta=NULL, thetas_states_df=thetas_states_df, title_prefix="Nick's model", ylog=TRUE)


#
# Here is your mission (make sure you've worked through the Week 5 Lab
# Exercise first!)
#
# 1. Pretend you are have been hired as a consultant by a large NZ
# company called KiwiBiz. KiwiBiz has hired a number of consultants, and
# wants each of you to provide your own forecast of what you consider a
# "plausible" scenario for COVID-19 progression in NZ over the next 2
# years (730 days). This forecast should include governmental
# interventions that you think are likely, as well as, possibly, the
# introduction of vaccines.
#
# The company wants to see a number of forecasts from different
# consultants, so that they can plan for different scenarios.
#
# PLEASE NOTE: IN REAL-LIFE CONSULTING JOBS, YOU ARE EXPECTED TO MAKE
# MANY JUDGMENT CALLS YOURSELF, WITHOUT CONSULTING THE EMPLOYER OVER
# EVERY LITTLE AMBIGUITY. So, do your best to resolve ambiguities
# yourself, by making a reasonable decision about them. Try to only ask
# instructors questions if they are technical difficulties (e.g., R
# problems etc.)
#
#
#
#
# 2. The company asks that you keep in mind:
#
# (a) Assume that "natural" R0=3.0, i.e. during "Level 0" (normal life),
# R0=3.0 (and D_inf=14 days). Assume that the epidemic starts with 5
# million Susceptibles, and 100 Infected.
#
# (b) Lockdown levels 1, 2, 3, and 4 will be progressively more
# effective at reducing R0. However, we don't know what R0 will be at
# these levels, so you should put in plausible guesses for whichever
# levels you use in your scenario.
#
# (c) The Level 4 lockdown is very good at reducing R0 (although we
# don't know how much), but it is also extremely damaging to the
# economy, so the government will try to get us out of Level 4 as fast
# as possible, while also minimizing infections and deaths.
#
#
#
#
# 3. For your scenario, post to the Discussion forum:
#
# (a) A screenshot of your graph (or 2 graphs, e.g. you might include
# the regular, and the logged-y-axis version, if necessary). Use the
# "images" button at the top of the Discussion text box to upload the
# image.
#
# (b) Your input parameters, specifically these lines:
#
# time = c(1, 30, 58) R0 = c(3.0, 0.1, 3.0) D_inf = c(14.0, 14.0, 14.0)
# S = c(5000000, 0, 0) I = c(100, 0, 0) R = c(0, 0, 0)
#
# (c) The total number of days spent in Level-4 lockdown
#
# (d) The approximate total number infected
#
# (e) Write a short paragraph summarizing your scenario and why your
# assumptions are plausible.
#
# (f) Write a short paragraph describing what you think is the biggest
# limitation in your modeling, and what you would need in order to
# ameliorate this limitation.
#
# (g) Post your summary to Canvas. After other summaries have posted,
# write a comment on at least one other summary, stating what you found
# interesting, or asking a question, etc. Maintain professionalism, as
# in last week.
#
#
# NOTE: Here is my plot for a scenario with a strong Level 4 lockdown on
# Day 30, for 28 days, but then R0 returns to 3.0 after the lockdown
# (Level 0 conditions).
#
# I am also including the same plot, with the y-axis logged, to show the
# early dynamics during a small number of cases.
#
# My quick summary is: The problem with this scenario is that we have 28
# days of harsh lockdown, but then the epidemic recovers and eventually
# infects 90% of the population. We might do better by putting in the
# lockdown earlier, by extending it, by moving to Level 3 instead of
# Level 0, by introducing a vaccine, etc.
##############################################################


#######################################################
# References
#
# My SIR code relies heavily on code from the fitR package
# (Funk et al. 2019).
#
# Adam, David (2020). "UK has enough intensive care units for
# coronavirus, expert predicts." New Scientist, March 25, 2020.
# https://www-newscientist-com.ezproxy.auckland.ac.nz/article/2238578-uk-has-enough-intensive-care-units-for-coronavirus-expert-predicts/
#
# Anderson, Charles (2020). "New Zealand coronavirus deaths during lockdown could
# be just 20, modelling suggests." The Guardian. 27 Mar 2020.
# https://www.theguardian.com/world/2020/mar/27/new-zealand-coronavirus-deaths-during-lockdown-could-be-just-20-modelling-suggests
#
# Dillane, Tom; Knox, Chris (2020), "Coronavirus Covid 19: New Zealand's
# best and worst prepared DHBs by elderly population and ICU beds."
# NZ Herald. April 5, 2020
# https://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=12321390
#
# Fefferman, Nina (2020). "The role of applied math in real-time pandemic
# response: How basic disease models work." Date: 3:30 EDT Tuesday,
# March 31, 2020.  http://www.nimbios.org/webinars#fefferman
#
# Funk, Sebastian (2019). "Introduction to model fitting in R." http://sbfnk.github.io/mfiidd/introduction.html
#
# Funk, Sebastian; Camacho, Anton; Johnson, Helen; Minter, Amanda; O’Reilly, Kathleen; Davies, Nicholas (2019). "Model fitting and inference for infectious disease dynamics."  Centre for the Mathematical Modelling of Infectious Diseases (CMMID), London School of Hygiene & Tropical Medicine.
#
# James, Alex; Hendy, Shaun C.; Plank, Michael J.; Steyn, Nicholas (2020). "Suppression and Mitigation
# Strategies for Control of COVID-19 in New Zealand." 25 March 2020.
# https://cpb-ap-se2.wpmucdn.com/blogs.auckland.ac.nz/dist/d/75/files/2017/01/Supression-and-Mitigation-New-Zealand-TPM-006.pdf
#
# Newton, Kate (2020). "The man modelling NZ's Covid-19 spread
# from his kitchen table." Radio New Zealand. 27 March 2020.
# https://www.rnz.co.nz/news/in-depth/412744/the-man-modelling-nz-s-covid-19-spread-from-his-kitchen-table
#########################################################