# BIOSCI220: Lab 4 exercise and questions
# Module 2: "Model-based Inference"
# https://canvas.auckland.ac.nz/courses/48087/pages/module-2-model-based-inference?module_item_id=747842
# You can find the Lab 4 Worksheet questions at:
#
# Canvas -> BIOSCI 220 -> Quizzes -> Lab 4 Worksheet
# The below Lab 4 code is posted at:
#
# https://gist.github.com/nmatzke/e7a49f963912174136c3d319e2981231
#######################################################
# WEEK 4 LAB
#######################################################
# When I planned Week 4 and recorded the lectures,
# COVID-19 was a faraway news story. Now, of course,
# it has taken over our lives.
#
# My goals for BIOSCI220 have been to (1) communicate
# the importance of model-based inference in modern
# science, but also (2) encourage you to think critically
# about models, and how decisions are made based on them.
#
# I usually like to emphasize #2, because there is a
# tendency, inside and outside of science, to naively
# believe whatever some model says, without much
# understanding of what goes into the model, or what
# the consequences might be if the model is wrong.
#
# However, in the light of COVID-19, we are now in the
# midst of a situation where pretty literal life-and-
# death decisions are based on models -- frankly,
# rightly so, because we are in a situation where
# ignoring the models and trying to carry on with
# "business as usual" would have devastating
# consequences.
#
# Even in this situation, it is better to understand
# what is going on, and to understand why decisions
# are being made the way they are, than to just
# shrug and decide to believe or disbelieve the
# authorities on faith.
#
# To help you in this endeavour, I have scrapped some
# of my original material to focus "Module 2: Model-based
# Inference" on epidemiological models. The basic plan is:
#
# Week 4: Basics of model-based inference:
# Lectures: What are models?
# Lab: Likelihood and maximizing the likelihood
#
# Week 5: The SIR model (Susceptible, Infectious, Recovered)
# Lectures: Intro to SIR models
# Lab: Forecasting epidemics with SIR models in R (simulation)
#
# Week 6: Inference and model comparison
# Lectures: Methods of inference
# Lab: Inference of SIR model parameters from data
#
#######################################################
#######################################################
#
# LAB 4 SETUP INSTRUCTIONS
#
# For lab 4, you will need access to these:
#
# 1. Web browser, and access to CANVAS (for quizzes,
# lectures, online discussions etc.).
#
# 2. Ability to copy/paste text. This is possible even
# on a smartphone, but probably is much easier on
# a desktop/laptop/tablet.
#
# 3. R. You can either install R on your desktop/laptop
# machine, or you can use R through the web, for
# example with this website: https://rdrr.io/snippets/
#
# Note that, for the R-through-the-web pages, you
# will have to paste ALL necessary code in one go -
# it does not remember whatever you ran previously.
# This may mean that, later in the lab exercise, you
# have to paste a large chunk of the lab code into
# the run window at once.
#
# 4. The TAs and I will take questions on Piazza all
# week. I also plan to do live Zoom help sessions,
# so if you want to access these, please install
# Zoom.
#
#######################################################
#######################################################
# LAB 4, PART 1: READING AND THINKING --
# ADVANTAGES/DISADVANTAGES OF MODELS
#######################################################
#
# In lieu of an in-person discussion in class, I ask
# all students to contribute to a Canvas discussion
# on "models and COVID".
#
# You will recall that we have 3 assigned readings for
# the week:
#
# 1. Asimov - The Relativity of Wrong
# 2. O'Neill - Weapons of Math Destruction
# 3. Stevens - "Why outbreaks like coronavirus spread
# exponentially, and how to "flatten the
# curve" (free online). March 14, 2020
# https://www.washingtonpost.com/graphics/2020/world/corona-simulator/
#
# ASSIGNMENT: Each student should read the above
# articles, think about what they mean about
# models, their importance, and where and how
# critical thinking should be applied to them.
#
# THEN, with those thoughts in mind, *find another
# media article* that discusses the modeling of
# COVID-19 in New Zealand or elsewhere. There
# are *many* such articles up right now. They cover
# topics like describing the model projections,
# describing debates between epidemiologists (some
# of which have been very public!), or describing
# the reaction/nonreaction of politicians or the
# public to model projections.
#
# Instructions:
#
# 1. Make a post on the CANVAS discussion page:
#
# 2. Start the post with the reference for the article:
# Author, date, title, publication, link
#
# 3. Write a short paragraph summarizing the
# article.
#
# 4. Write a second short paragraph giving your
# reaction. Don't overthink this - we are
# in a stressful time, and I want this to be
# open to anything from a highly academic
# evaluation ("This article gets the science
# wrong because...") to questions ("I wish
# this article answered question X that I have
# because..."), to personal reactions ("It makes
# me angry/sad/happy that person/country X/Y
# did/did not do Z because"...).
#
# Include the reference & 2 paragraphs in a SINGLE post.
#
# 5. After a number of posts have gone up,
# please post a short reply to at least one
# other post. Ideally, a discussion of
# interesting topics will naturally emerge, but
# I would like to see every post get at least
# one reaction. OF COURSE, REMEMBER TO BE POLITE
# AND COLLEGIAL AND SCHOLARLY.
#
# 6. It's OK if different posts discuss the same articles,
# although, for your own interest, you might try to
# find a new article.
#
# 7. This discussion assignment will be worth 10%
# of your lab 4 grade. The only ways to lose points
# are (a) not doing it, or (b) doing something insulting/
# flippant/mean.
#
####################################################
#######################################################
# LAB 4, PART 2: INTRODUCTION TO ML INFERENCE
########################################################
#
# In Week 4, we will begin our exploration of
# model-based inference with a very simple dataset
# and model.
#
# In Weeks 5 & 6, we will apply these basic skills to
# SIR models from epidemiology, which are not nearly
# as complex or hard to understand as you might have
# thought. Basic SIR models can be coded and run in
# a few lines of R code.
#
# BACKGROUND
#
# Many biological problems, in many different fields,
# from ecology to epidemiology, are concerned with
# trying to estimate the prevalence or frequency of
# something.
#
# For example:
#
# - Out of 1000 ponds in a region, how many contain a
# particular endangered frog species?
#
# - Out of a million people, how many are resistant
# to measles (due to e.g. vaccination or previous
# infection)?
#
# These examples (leaving out numerous real-world
# complexities) boil down to this: there are 2
# possible outcomes, and we are interested in how
# often each outcome occurs.
#
# Another process with 2 outcomes is coin-flipping.
#
# Coin-flipping, and any other 2-outcome process,
# can be described with a one-parameter model, using
# a binomial distribution ("bi-nomial" = "two names").
#
# (We will use coin-flipping as an example, but you can
# mentally substitute any other binomial process if
# you like.)
#
# The single parameter describing a binomial process
# is "p", the probability of one of the outcomes. The
# probability of the other outcome is 1-p.
#
# Because there are so many different probabilities and
# "p" terms used in probability
# and statistics, I will use "P_heads" - the probability
# that a coin will produce "Heads" on a single flip - to
# describe this parameter.
#
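# (A small illustrative sketch, not part of the lab questions: if P_heads
# were 0.6, a single flip lands Heads with probability 0.6 and Tails with
# probability 1 - 0.6 = 0.4, and the probabilities of independent flips
# multiply. "P_heads_example" is just a throwaway name for this sketch.)
P_heads_example = 0.6
P_heads_example                          # P(Heads) on one flip
1 - P_heads_example                      # P(Tails) on one flip
P_heads_example * (1 - P_heads_example)  # P(Heads, then Tails) on two flips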
########################################################
# Let's consider some coin-flip data.
#
# Here are 100 coin flips:
coin_flips = c('H','T','H','T','H','H','T','H','H','H','T','H','H','T','T','T','T','H','H','H','H','H','H','H','H','H','H','H','H','H','H','H','H','T','T','T','H','T','T','T','H','T','T','T','H','H','H','T','T','H','H','H','T','H','H','H','T','T','H','H','H','H','H','H','H','T','T','H','H','H','H','T','T','H','H','H','T','T','H','H','H','H','H','H','T','T','T','H','H','H','H','H','H','T','H','T','H','H','T','T')
# Look at the data
coin_flips
# What is your guess at "P_heads", the probability of heads?
#
#
# In the case of binomial data, we actually have a simple
# formula to calculate the best estimate of P_heads:
# Find the heads
heads_TF = (coin_flips == "H")
heads_TF
# Find the tails
tails_TF = (coin_flips == "T")
tails_TF
numHeads = sum(heads_TF)
numHeads
numTails = sum(tails_TF)
numTails
numTotal = length(coin_flips)
numTotal
# Here's the formula:
P_heads_ML_estimate = numHeads / numTotal
P_heads_ML_estimate
# PLEASE ANSWER LAB 4 QUESTION #1, NOW:
#
# 1. What number (exactly) was reported for P_heads_ML_estimate?
#
# (Obviously the answer is 0.65, but I want you to realize that
# the computer will auto-grade this. So, 0.65 will be marked correct,
# but .65, 0.650, etc., will be marked incorrect. Extra spaces may
# also cause the answer to be marked incorrect, e.g. "0.65" is correct,
# but " 0.65" or "0.65 " are incorrect. For future answers, when numbers
# are asked for, be sure to report the exact number R gave, without
# spaces etc.)
# Well, duh, that seems pretty obvious. At least it would have been, if we
# weren't thinking of coins, where we have a strong prior belief that the
# coin is probably fair.
#
# It turns out that this formula can be justified through a technique
# known as Maximum Likelihood.
#
# What does it mean to say we have a "maximum likelihood" estimate of P_heads?
#
# "Likelihood", in statistics, means "the probability of the data under the model"
#
# So, a "Maximum Likelihood estimate" of P_heads is the value of the parameter
# P_heads that maximizes the probability of seeing the data you saw, namely,
# 65 Heads and 35 Tails.
#
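# (A quick preview, as a sketch: because multiplication does not care about
# order, the probability of any one particular sequence containing 65 Heads
# and 35 Tails is P_heads^65 * (1-P_heads)^35. For example:)
0.5^65 * (1-0.5)^35    # likelihood of the data if P_heads = 0.5
0.65^65 * (1-0.65)^35  # likelihood of the data if P_heads = 0.65 (noticeably larger)
# The for-loop below calculates the same quantity, one flip at a time.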
# Comparing likelihoods for different possible parameter values of P_heads
#
# Let's calculate the probability of these same coin flip data under the
# hypothesis/model that P_heads=0.5
#
# We'll be very inefficient, and use a for-loop, and
# if/else statements
# Loop through all 100 flips
# Make a list of the probability of
# each datum
P_heads_guess = 0.5
# Empty list of probabilities
probs_list = rep(NA, times=length(coin_flips))
probs_list
for (i in 1:length(coin_flips))
    {
    # Print an update
    cat("\nAnalysing coin flip #", i, "/", length(coin_flips), sep="")
    # Get the current coin flip
    coin_flip = coin_flips[i]
    # If the coin flip is heads, give that datum
    # probability P_heads_guess.
    # If tails, give it (1-P_heads_guess)
    if (coin_flip == "H")
        {
        probs_list[i] = P_heads_guess
        } # End if heads
    if (coin_flip == "T")
        {
        probs_list[i] = (1-P_heads_guess)
        } # End if tails
    } # End for-loop
# Look at the resulting probabilities
probs_list
# We get the probability of all the data by multiplying
# all the probabilities, with the prod() function.
likelihood_of_data_given_P_heads_guess1 = prod(probs_list)
likelihood_of_data_given_P_heads_guess1
# PLEASE ANSWER LAB 4, QUESTION #2, NOW:
# 2. What number (exactly) was reported for likelihood_of_data_given_P_heads_guess1?
# That's a pretty small number! You'll see that it's
# just 0.5^100:
0.5^100
# A probability of 0.5 is not small, but multiply 100
# values of 0.5 together, and you get a very small value.
# That's the probability of that specific sequence of
# heads/tails, given the hypothesis that the true
# probability is P_heads_guess.
# Let's try another parameter value:
# Loop through all 100 flips
# Make a list of the probability of
# each datum
P_heads_guess = 0.7
# Empty list of probabilities
probs_list = rep(NA, times=length(coin_flips))
probs_list
for (i in 1:length(coin_flips))
    {
    # Print an update
    cat("\nAnalysing coin flip #", i, "/", length(coin_flips), sep="")
    # Get the current coin flip
    coin_flip = coin_flips[i]
    # If the coin flip is heads, give that datum
    # probability P_heads_guess.
    # If tails, give it (1-P_heads_guess)
    if (coin_flip == "H")
        {
        probs_list[i] = P_heads_guess
        } # End if heads
    if (coin_flip == "T")
        {
        probs_list[i] = (1-P_heads_guess)
        } # End if tails
    } # End for-loop
# Look at the resulting probabilities
probs_list
# We get the probability of all the data by multiplying
# all the probabilities
likelihood_of_data_given_P_heads_guess2 = prod(probs_list)
likelihood_of_data_given_P_heads_guess2
# PLEASE ANSWER LAB 4, QUESTION #3, NOW:
# 3. What number (exactly) was reported for likelihood_of_data_given_P_heads_guess2?
# We got a different likelihood. It's also very small.
# But that's not important. What's important is,
# how many times higher is it?
likelihood_of_data_given_P_heads_guess2 / likelihood_of_data_given_P_heads_guess1
# PLEASE ANSWER LAB 4, QUESTION #4, NOW:
# 4. What number (exactly) was reported for
# likelihood_of_data_given_P_heads_guess2 / likelihood_of_data_given_P_heads_guess1 ?
# Whoa! That's a lot higher! This means the coin flip data is 54 times more
# probable under the hypothesis that P_heads=0.7 than under the
# hypothesis that P_heads=0.5.
# Maximum likelihood: Hopefully you can see that the data is in some sense
# "best explained" by the parameter value (the valueof P_heads) that
# maximizes the probability of the data. This is what we call the Maximum
# Likelihood solution.
# To try different values of P_heads, we could keep copying and pasting code,
# but that seems annoying. Let's define an R "function" instead.
#
# Functions, in R, collect a number of commands, and run them when you
# type the name of the function. Functions are simple programs. All of
# the R commands you have ever run, e.g. mean(), prod(), etc. are all
# just functions that R has pre-programmed.
#
# NOTES ON FUNCTIONS IN R:
# * To run, functions have to be followed by (). Usually, but not always,
# inputs go into the (). One example of a function that needs no inputs
# is getwd(), which gets your current working directory (wd) and prints it
# to screen:
getwd()
# * To see the code that is running inside the function, you can often
# just type the name of the function, without the (). This works for
# most R functions except for "primitive" ones (these are core R
# functions, like getwd, mean, length, etc., that are programmed
# a different way to be super-fast).
#
# * Typing the names of functions and figuring out exactly what is
# happening in the code is a great way to learn R and learn a
# specific scientific field, since you can see *exactly* what
# some scientist is doing in their analysis.
#
# * For functions that are part of packages, you can get the help
# page by typing a ? before the function name. E.g., ?mean will open
# the help page for the "mean" function. R help is often not
# written for beginners, but it is better than nothing.
#
# * When you try to run a function and get an error, this usually
# means you left out a required input, or you put in the wrong
# kind of input (e.g., if the input is supposed to be a number,
# and you put in a word like "hello", you will get an error
# message.)
#
# When you type
mean("hello")
# ...what happens?
# PLEASE ANSWER LAB 4, QUESTION #5, NOW:
# 5. When you run 'mean("hello")', paste in the last part of the error message you get back.
# Your paste should begin with "argument" and end with "NA".
# Let's define a function
#
# This function calculates the probability of a
# sequence of coin flip data, given a value of P_heads_guess
calc_prob_coin_flip_data <- function(P_heads_guess, coin_flips)
    {
    # Empty list of probabilities
    probs_list = rep(NA, times=length(coin_flips))
    probs_list
    for (i in 1:length(coin_flips))
        {
        # Print an update
        #cat("\nAnalysing coin flip #", i, "/", length(coin_flips), sep="")
        # Get the current coin flip
        coin_flip = coin_flips[i]
        # If the coin flip is heads, give that datum
        # probability P_heads_guess.
        # If tails, give it (1-P_heads_guess)
        if (coin_flip == "H")
            {
            probs_list[i] = P_heads_guess
            } # End if heads
        if (coin_flip == "T")
            {
            probs_list[i] = (1-P_heads_guess)
            } # End if tails
        } # End for-loop
    # Look at the resulting probabilities
    probs_list
    # We get the probability of all the data by multiplying
    # all the probabilities
    likelihood_of_data_given_P_heads_guess = prod(probs_list)
    # Return result
    return(likelihood_of_data_given_P_heads_guess)
    }
# Now, we can just use this function:
calc_prob_coin_flip_data(P_heads_guess=0.5, coin_flips=coin_flips)
calc_prob_coin_flip_data(P_heads_guess=0.6, coin_flips=coin_flips)
calc_prob_coin_flip_data(P_heads_guess=0.7, coin_flips=coin_flips)
# Look at that! We did all of that work in a split-second.
# PLEASE ANSWER LAB 4, QUESTION #6, NOW:
# 6. What number (exactly) was reported for
# calc_prob_coin_flip_data(P_heads_guess=0.6, coin_flips=coin_flips) ?
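# (An aside, not required for the lab: the same likelihood can be computed
# without a for-loop, because ifelse() and prod() work on whole vectors at
# once. This is just a sketch of more idiomatic R.)
P_heads_guess = 0.6
prod(ifelse(coin_flips == "H", P_heads_guess, 1 - P_heads_guess))
# This should match calc_prob_coin_flip_data(P_heads_guess=0.6, coin_flips=coin_flips)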
# In fact, we can make another for-loop, and search for the ML
# value of P_heads by trying all of the values and plotting them.
# Sequence of 50 possible values of P_heads between 0 and 1
P_heads_values_to_try = seq(from=0, to=1, length.out=50)
likelihoods = rep(NA, times=length(P_heads_values_to_try))
for (i in 1:length(P_heads_values_to_try))
    {
    # Get the current guess at P_heads_guess
    P_heads_guess = P_heads_values_to_try[i]
    # Calculate likelihood of the coin flip data under
    # this value of P_heads
    likelihood = calc_prob_coin_flip_data(P_heads_guess=P_heads_guess, coin_flips=coin_flips)
    # Store the likelihood value
    likelihoods[i] = likelihood
    } # End for-loop
# Here are the resulting likelihoods:
likelihoods
# Let's try plotting the likelihoods to see if there's a peak
plot(x=P_heads_values_to_try, y=likelihoods)
lines(x=P_heads_values_to_try, y=likelihoods)
# Whoa! That's quite a peak! You can see that the likelihoods
# vary over several orders of magnitude.
#
# Partially because of this extreme variation, we often use the
# logarithm of the likelihood, also known as
#
# * log-likelihood
# * lnL
#
# In R, the log() function returns the natural logarithm, while
# log10() returns the base-10 logarithm. I mention this because
# on some calculators, the LN button returns the natural log,
# and the LOG button returns the base-10 logarithm.
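# (A quick check you can run, as a sketch:)
log(exp(1))          # natural log of e = 1
log(1000)            # natural log of 1000, about 6.907755
log10(1000)          # base-10 log of 1000 = 3
log(1000, base=10)   # the same thing, specifying the base explicitly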
#
#
# There are several reasons scientists, and scientific papers,
# often use the log-likelihood instead of the raw likelihood.
# These include:
#
# * Computers have a minimum numerical precision. If a likelihood
# value gets too small, it gets rounded to 0. You can see this
# happen if you calculate the raw likelihood of over 1000 coin-flips
# using the parameter P_heads = 0.5
#
# Machine precision can vary between computers, so I will give you
# the results on my machine (but, try it on yours):
0.5^1073
# ...equals 9.881313e-324, which is a very small number
0.5^1074
# ...equals 4.940656e-324, which is half the size of the previous number,
# which makes sense, because we just multiplied it by 0.5
#
# However, on my machine,
0.5^1075
# ...equals 0. Instead of one more coin-flip halving the likelihood
# again, the value collapses all the way to 0, which is infinitely smaller
# than 4.940656e-324. So the computer has made a "mistake" here.
#
# Log-likelihoods do not have this problem. Compare:
1073 * log(0.5)
1074 * log(0.5)
1075 * log(0.5)
1076 * log(0.5)
# We can convert log-likelihoods to raw-likelihood with exp(), which reverses
# the log() operation.
exp(1073 * log(0.5))
exp(1074 * log(0.5))
exp(1075 * log(0.5))
exp(1076 * log(0.5))
# You can see that, working in log-likelihood space, we have no
# numerical precision issues, but with raw likelihoods, we do.
# PLEASE ANSWER LAB 4, QUESTION #7, NOW:
# 7. What number (exactly) was reported for 1075 * log(0.5) ?
# Other reasons scientists use log-likelihoods instead of likelihoods:
#
# * The log-likelihood and raw-likelihood will have their maximums
# at the same parameter values.
#
# * Taking the logarithm of an equation converts multiplication to
# addition, which can make calculus much easier (see the short sketch
# after this list).
#
# * Log-likelihoods are easier to read and interpret than raw
# likelihoods, once you get used to log-likelihoods. For any
# large dataset, the raw likelihood will be a very tiny
# number, hard to read/remember/interpret.
#
# * It turns out that a number of fundamental statistical measures
# and tests use the log-likelihood as an input. These include
# the Likelihood Ratio Test (LRT) and Akaike Information Criterion (AIC).
# We will talk about these in future weeks.
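# (A small sketch illustrating the "multiplication becomes addition" point
# in the list above: the log of a product equals the sum of the logs.
# "some_probs" is just a throwaway example vector.)
some_probs = c(0.5, 0.7, 0.3)
log(prod(some_probs))   # log of the product
sum(log(some_probs))    # sum of the logs; the same value (up to rounding)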
# PLOTTING THE LOG-LIKELIHOOD
# We will repeat the code to calculate the likelihood
# for a sequence of 50 possible values of P_heads between 0 and 1:
P_heads_values_to_try = seq(from=0, to=1, length.out=50)
likelihoods = rep(NA, times=length(P_heads_values_to_try))
for (i in 1:length(P_heads_values_to_try))
    {
    # Get the current guess at P_heads_guess
    P_heads_guess = P_heads_values_to_try[i]
    # Calculate likelihood of the coin flip data under
    # this value of P_heads
    likelihood = calc_prob_coin_flip_data(P_heads_guess=P_heads_guess, coin_flips=coin_flips)
    # Store the likelihood value
    likelihoods[i] = likelihood
    } # End for-loop
# Here are the resulting likelihoods:
likelihoods
# Let's take the log (the natural log, i.e. the base is exp(1)).
log_likelihoods = log(likelihoods, base=exp(1))
plot(x=P_heads_values_to_try, y=log_likelihoods)
lines(x=P_heads_values_to_try, y=log_likelihoods)
# Let's plot the raw- and log-likelihood together:
par(mfrow=c(2,1))
plot(x=P_heads_values_to_try, y=likelihoods, main="Likelihood (L) of the data")
lines(x=P_heads_values_to_try, y=likelihoods)
plot(x=P_heads_values_to_try, y=log_likelihoods, main="Log-likelihood (LnL) of the data")
lines(x=P_heads_values_to_try, y=log_likelihoods)
# PLEASE ANSWER LAB 4, QUESTION #8, NOW:
# 8. Compare the raw-likelihood curve and the log-likelihood curve. Does it look
# like the top of each curve appears in about the same location on the
# x-axis? (i.e., at the same parameter value for P_heads?)
#
# Maximum likelihood optimization
#
# You can see that the maximum likelihood of the data occurs when
# P_heads is somewhere around 0.6 or 0.7. What is it
# exactly?
#
# We could just keep trying more values until we find whatever
# precision we desire. But, R has a function for
# maximum likelihood optimization!
#
# It's called optim(). Optim() takes a function as an input.
# Fortunately, we've already written a function!
#
# Let's modify our function a bit to return the log-likelihood,
# and print the result:
# Function that calculates the probability of coin flip data
# given a value of P_heads_guess
calc_prob_coin_flip_data2 <- function(P_heads_guess, coin_flips)
    {
    # Empty list of probabilities
    probs_list = rep(NA, times=length(coin_flips))
    probs_list
    for (i in 1:length(coin_flips))
        {
        # Print an update
        #cat("\nAnalysing coin flip #", i, "/", length(coin_flips), sep="")
        # Get the current coin flip
        coin_flip = coin_flips[i]
        # If the coin flip is heads, give that datum
        # probability P_heads_guess.
        # If tails, give it (1-P_heads_guess)
        if (coin_flip == "H")
            {
            probs_list[i] = P_heads_guess
            } # End if heads
        if (coin_flip == "T")
            {
            probs_list[i] = (1-P_heads_guess)
            } # End if tails
        } # End for-loop
    # Look at the resulting probabilities
    probs_list
    # We get the probability of all the data by multiplying
    # all the probabilities
    likelihood_of_data_given_P_heads_guess = prod(probs_list)
    # Get the log-likelihood
    LnL = log(likelihood_of_data_given_P_heads_guess)
    LnL
    # Error correction: if -Inf, reset to a low value
    if (is.finite(LnL) == FALSE)
        {
        LnL = -1000
        }
    # Print some output
    print_txt = paste("\nWhen P_heads=", P_heads_guess, ", LnL=", LnL, sep="")
    cat(print_txt)
    # Return result
    return(LnL)
    }
# Try the function out:
LnL = calc_prob_coin_flip_data2(P_heads_guess=0.1, coin_flips=coin_flips)
LnL = calc_prob_coin_flip_data2(P_heads_guess=0.2, coin_flips=coin_flips)
LnL = calc_prob_coin_flip_data2(P_heads_guess=0.3, coin_flips=coin_flips)
# Looks like it works! Let's use optim() to search for the
# best P_heads value:
# Set a starting value of P_heads
starting_value = 0.1
# Set the limits of the search
limit_bottom = 0
limit_top = 1
optim_result = optim(par=starting_value, fn=calc_prob_coin_flip_data2, coin_flips=coin_flips, method="L-BFGS-B", lower=limit_bottom, upper=limit_top, control=list(fnscale=-1))
# You can see the search print out as it proceeds.
# Let's see what ML search decided on:
optim_result
# Let's compare the ML estimate of P_heads from the optim() search
# with the simple analytic formula (the proportion of heads):
optim_result$par
# Here's the formula:
P_heads_ML_estimate = numHeads / numTotal
P_heads_ML_estimate
# PLEASE ANSWER LAB 4, QUESTION #9, NOW:
# 9. Is the ML value of P_heads_guess found by numerical optimization
# (found in optim_result$par ) very close to the ML value found
# by analytical solution (P_heads_ML_estimate)?
# PLEASE ANSWER LAB 4, QUESTION #10, NOW:
# 10. What was the maximized log-likelihood found by optim (hint: look at
# optim_result$value)? Numerical optimization results might vary
# slightly, so round this value to 2 decimal places in your answer.
# Where does that formula -- the "analytic solution" for P_heads_ML_estimate
# come from? We will discuss this further in future lectures, but here is a summary.
#
# It turns out that, in simple cases, we can use calculus to derive
# "analytical solutions" for maximizing the likelihood. Basically, this involves:
#
# - taking the derivative of the likelihood function
# (the derivative is the slope of a function)
# - setting the formula for the derivative to equal 0
# (because, when the derivative=0, you are at a maximum or minimum)
# - solving for the parameter value that causes the derivative to
# equal 0.
#
# This same basic calculus-based procedure provides a justification for
# the formulas that are used in e.g. linear regression.
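# (Here is a sketch of that derivative argument for the coin-flip case,
# using R's built-in symbolic differentiation, D(). This is illustration
# only, not part of the lab questions.)
lnL_expr = expression(numHeads * log(p) + numTails * log(1 - p))
D(lnL_expr, "p")
# The printed derivative is equivalent to numHeads/p - numTails/(1 - p).
# Setting that to zero and solving gives p = numHeads / (numHeads + numTails),
# which is exactly the numHeads / numTotal formula we used above.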
#
# The nice things about calculus-derived formulas for maximizing likelihood
# -- known as "analytic solutions" -- include:
#
# - it is very fast
# - the solution can be found without a computer (this was useful back in
# the 20th century, when statistics was being developed without computers)
# - there is no chance of optimization error (optimization errors can occur
# if likelihood surfaces are too flat, too bumpy, etc.)
#
# This raises the question:
# Why would anyone ever go through all the optimization rigamarole, when
# they could just calculate P_heads directly?
#
# Well, only in simple cases do we have a formula for the maximum likelihood
# estimate. The optim() strategy works whether or not
# there is a simple formula.
#
# In real life science, with complex models of biological processes (such as
# epidemiological models), ML optimization gets used A LOT, but most scientists
# don't learn it until graduate school, if then.
#
# WHAT WE LEARNED
#
# You can now see that there is nothing magical, or even particularly
# difficult, about these concepts:
#
# - likelihood
# - log-likelihood
# - maximum likelihood (ML)
# - inferring a parameter through ML
#
#
# IMPLICATIONS
#
# Despite their simplicity, it turns out that likelihood is one of
# the core concepts in model-based inference, and statistics
# generally. Bayesian analyses have likelihood as a core component
# (we will talk about this more later), and the methods from introductory
# statistics (linear models etc.) can also be justified as maximum
# likelihood estimates.
#######################################################
#######################################################
# PART 3: BUILDING BLOCKS OF MODELS IN R
#######################################################
#
# R has a number of functions that speed up the process
# of building models, and simulating from them.
#
# For example, for binomial processes, these are:
#
# dbinom(x, size, prob, log) -- the probability density
# (the likelihood) of x "heads"
#
# size = the number of tries (the number of coin flips)
#
# prob = the parameter value (e.g. P_heads)
#
# log = should the likelihood be log-transformed when output?
#
#
# The other functions are:
#
# rbinom(n, size, prob) -- simulate n binomial outcomes,
# using "size" and "prob" as above
#
# pbinom(q, ...) - used for calculating tail probabilities
# (the probability of an outcome less than or equal to the value q)
#
# qbinom(p, ...) - reverse of pbinom()
#
#
# Any time you hear about "probability densities", "distributions"
# "random number generators", "random draws", or "simulations"
# in R, functions like these are being used.
#
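# (A short sketch of the binomial functions in action, using the coin-flip
# data from earlier; not part of the lab questions.)
# dbinom() gives the likelihood of observing numHeads heads in numTotal flips:
dbinom(x=numHeads, size=numTotal, prob=0.65)
# With log=TRUE, it returns the log-likelihood directly:
dbinom(x=numHeads, size=numTotal, prob=0.65, log=TRUE)
# (Note: dbinom() counts every possible ordering of 65 heads and 35 tails,
# so its value is larger than the probability of the one specific sequence
# we calculated with calc_prob_coin_flip_data().)
# rbinom() simulates, e.g., 5 replicate experiments of 100 flips each,
# reporting the number of heads in each experiment:
rbinom(n=5, size=100, prob=0.65)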
# For uniform distributions, the equivalent functions are:
#
# dunif, runif, punif, qunif
#
# For normal (=Gaussian/bell-curve) distributions, the equivalents are:
#
#
# dnorm, rnorm, pnorm, qnorm
# Random seeds, and random number generators
# Random numbers in computers are not truly random. To make random numbers
# repeatable, a "seed" number is always used.
# Compare:
rnorm(n=1, mean=6, sd=1)
rnorm(n=1, mean=6, sd=1)
rnorm(n=1, mean=6, sd=1)
set.seed(54321)
rnorm(n=1, mean=6, sd=1)
rnorm(n=1, mean=6, sd=1)
set.seed(54321)
rnorm(n=1, mean=6, sd=1)
rnorm(n=1, mean=6, sd=1)
set.seed(54321)
rnorm(n=1, mean=6, sd=1)
rnorm(n=1, mean=6, sd=1)
# By re-setting the seed, we re-set the random number generator, so the
# same sequence of "random" numbers is produced again.
# PLEASE ANSWER LAB 4, QUESTION #11, NOW:
# 11. If you set the random number seed to 54322, what is the first
# random number generated by rnorm(n=1, mean=6, sd=1)?
# Generate 1000 random numbers from a normal distribution, and
# then plot a histogram of them, with this code:
random_numbers = rnorm(n=1000, mean=6, sd=1)
hist(random_numbers)
# PLEASE ANSWER LAB 4, QUESTION #12, NOW:
# 12. What shape do you see?
# PLEASE ANSWER LAB 4, QUESTION #13, NOW:
# 13. I am 6 feet tall. What is the raw likelihood of my height,
# assuming that I am a random draw from a population with a
# mean height of 7 feet and a standard deviation of 0.5 feet?
dnorm(x=6.0, mean=7.0, sd=0.5)
# PLEASE ANSWER LAB 4, QUESTION #14, NOW:
# 14. What is the log-likelihood of my height?
log(dnorm(x=6.0, mean=7.0, sd=0.5))
# PLEASE ANSWER LAB 4, QUESTION #15, NOW:
# 15. If we change the model of my population from mean height=7
# feet to mean height=6 feet, will the likelihood of my 6-foot
# height go up or down?
# PLEASE ANSWER LAB 4, QUESTION #16, NOW:
# 16. What is the raw likelihood of my height, assuming I am a
# random draw from a population with a mean height of 6 feet
# and a standard deviation of 0.5 feet?
dnorm(x=6.0, mean=6.0, sd=0.5)
# NOTE: The "d" in "dnorm" refers to probability density. For continuous data,
# the probability of any exact value is 0, because there are an infinite number
# of possible values for a continuous variable. So we talk about the probability
# density instead of the exact probability (or probability mass). Probability
# densities work the same as probability masses for our purposes.
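# (A small sketch of the density idea, not part of the lab questions: the
# probability of falling in a narrow interval is approximately the density
# times the width of the interval.)
dnorm(x=6.0, mean=6.0, sd=0.5) * 0.01                            # density * width
pnorm(6.005, mean=6.0, sd=0.5) - pnorm(5.995, mean=6.0, sd=0.5)  # probability of the interval [5.995, 6.005]
# For a narrow interval, the two numbers are nearly identical.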
# PLEASE ANSWER LAB 4, QUESTION #17, NOW:
# 17. What is the corresponding log-likelihood?
# PLEASE ANSWER LAB 4, QUESTION #18, NOW:
# 18. What shape does dnorm() make, for a variety of
# height values? Try:
#
height_values = seq(4, 8, 0.01)
probability_densities = dnorm(x=height_values, mean=6.0, sd=0.5)
plot(x=height_values, y=probability_densities)
#######################################################
# LAB 4 EXERCISE: PLEASE FILL OUT THE EXERCISE ON CANVAS.
# IT IS DUE BY 11:59 PM, SATURDAY, APRIL 4.
#######################################################
#######################################################
# LAB 5: WHEN YOU FINISH LAB 4, WHICH WAS JUST
# INTRODUCTORY MATERIAL, YOU ARE WELCOME TO MOVE
# ON TO LAB 5, WHEN I POST IT. IT WILL INVOLVE
# SIR MODELS OF EPIDEMICS, AND SO IT WILL BE MUCH MORE
# INTERACTIVE.
#######################################################