### A Pluto.jl notebook ###
# v0.12.4
using Markdown
using InteractiveUtils
# ╔═╡ 05b01f6e-106a-11eb-2a88-5f523fafe433
begin
	# create a temporary package environment and install the packages used in this notebook
	using Pkg
	Pkg.activate(mktempdir())
	Pkg.add([
		Pkg.PackageSpec(name="PlutoUI", version="0.6.7-0.6"),
		Pkg.PackageSpec(name="Plots", version="1.6-1"),
		Pkg.PackageSpec(name="JSON"),
	])

	using Plots
	gr()
	using PlutoUI
	import JSON
end
# ╔═╡ 048890ee-106a-11eb-1a81-5744150543e8
md"_homework 6, version 0_"
# ╔═╡ 056ed7f2-106a-11eb-3543-31a5cb560e80
# WARNING FOR OLD PLUTO VERSIONS, DONT DELETE ME
html"""
<script>
const warning = html`
<h2 style="color: #800">Oopsie! You need to update Pluto to the latest version</h2>
<p>Close Pluto, go to the REPL, and type:
<pre><code>julia> import Pkg
julia> Pkg.update("Pluto")
</code></pre>
`
const super_old = window.version_info == null || window.version_info.pluto == null
if(super_old) {
	return warning
}
const version_str = window.version_info.pluto.substring(1)
const numbers = version_str.split(".").map(Number)
console.log(numbers)
if(numbers[0] > 0 || numbers[1] > 12 || numbers[2] > 1) {
} else {
	return warning
}
</script>
"""
# ╔═╡ 0579e962-106a-11eb-26b5-2160f461f4cc
md"""
# **Homework 6**: _Epidemic modeling III_
`18.S191`, fall 2020
This notebook contains _built-in, live answer checks_! In some exercises you will see a coloured box, which runs a test case on your code, and provides feedback based on the result. Simply edit the code, run it, and the check runs again.
_For MIT students:_ there will also be some additional (secret) test cases that will be run as part of the grading process, and we will look at your notebook and write comments.
Feel free to ask questions!
"""
# ╔═╡ 0587db1c-106a-11eb-0560-c3d53c516805
# edit the code below to set your name and kerberos ID (i.e. email without @mit.edu)
student = (name = "Jazzy Doe", kerberos_id = "jazz")
# you might need to wait until all other cells in this notebook have completed running.
# scroll around the page to see what's up
# ╔═╡ 0565af4c-106a-11eb-0d38-2fb84493d86f
md"""
Submission by: **_$(student.name)_** ($(student.kerberos_id)@mit.edu)
"""
# ╔═╡ 05976f0c-106a-11eb-03a4-0febbc18fae8
md"_Let's create a package environment:_"
# ╔═╡ 0d191540-106e-11eb-1f20-bf72a75fb650
md"""
We began this module with **data** on the COVID-19 epidemic, but then looked at mathematical **models**.
How can we make the connection between data and models?
Models have *parameters*, such as the rate of recovery from infection.
Where do the parameter values come from? Ideally we would like to extract them from data.
The goal of this homework is to do this by *fitting* a model to data.
For simplicity we will use data that was generated from the spatial model in Homework 5, rather than real-world data,
and we will fit the simplest SIR model. But the same ideas apply more generally.
There are many ways to fit a function to data, but all must involve some form of **optimization**,
usually **minimization** of a particular function, a **loss function**; this is the basis of the vast field of **machine learning**.
The loss function is a function of the model parameters; it measures *how far* the model *output* is from the data,
for the given values of the parameters.
We emphasise that this material is pedagogical; there is no suggestion that these specific techniques should be used for actual calculations; rather, it is the underlying ideas that are important.
"""
# ╔═╡ 46caa70a-107c-11eb-1cd8-d5f432203eac
[sqrt(1)]
# ╔═╡ 518fb3aa-106e-11eb-0fcd-31091a8f12db
md"""
## **Exercise 1:** _Simulating the SIR differential equations_
Recall from lectures that the ordinary differential equations (ODEs) for the SIR model are as follows:
$$\begin{align*}
\dot{s} &= - \beta s \, i \\
\dot{i} &= + \beta s \, i - \gamma i \\
\dot{r} &= +\gamma i
\end{align*}$$
where ``\dot{s} := \frac{ds}{dt}`` is the derivative of $s$ with respect to time.
Recall that $s$ denotes the *proportion* (fraction) of the population that is susceptible.
We will use the simplest possible method to simulate these, namely the **Euler method**. The Euler method is *not* a good method to solve ODEs accurately. But for our purposes it is good enough.
For an ODE with a single variable, $\dot{x} = f(x)$, the Euler method takes steps of length $h$ in time. If we have an approximation $x_n$ at time $t_n$ it gives us an approximation for the value $x_{n+1}$ of $x$ at time $t_{n+1} = t_n + h$ as
$$x_{n+1} = x_n + h \, f(x_n)$$
In the case of several functions $s$, $i$ and $r$, we must use a rule like this for *each* of the differential equations within a *single* time step to get new values for *each* of $s$, $i$ and $r$ at the end of the time step in terms of the values at the start of the time step. In Julia we can write `s` for the old value and `s_new` for the new value.
1. Implement a function `euler_SIR(β, γ, s0, i0, r0, h, T)` that integrates these equations with the given parameter values and initial values, with a step size $h$ up to a final time $T$.
It should return vectors of the trajectory of $s$, $i$ and $r$, as well as the collection of times at which they are calculated.
2. Run the SIR model with $\beta = 0.1$, $\gamma = 0.05$, and $s_0 = 0.99$, $i_0 = 0.01$ and $r_0 = 0$ for a time $T = 300$ with time step ("step size") $h=0.1$.
3. Plot $s$, $i$ and $r$ as a function of time $t$.
4. Do you see an epidemic outbreak (i.e. a rapid growth in the number of infected individuals, followed by a decline)? What happens after a long time? Does everybody get infected?
5. Make an interactive visualization in which you vary $\beta$ and $\gamma$. What relation should $\beta$ and $\gamma$ have for an epidemic outbreak to occur?
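
To give an idea of the shape of `euler_SIR`, here is a rough sketch of one possible structure; it is only an illustration of the Euler update described above, not the required solution, and the details (and whether you return vectors or ranges) are up to you.

```julia
# Sketch of euler_SIR: one Euler step per time step, updating s, i and r
# together from their values at the start of the step.
function euler_SIR(β, γ, s0, i0, r0, h, T)
	ts = 0:h:T                         # equally spaced times 0, h, 2h, ..., T
	s, i, r = [s0], [i0], [r0]
	for _ in 1:length(ts)-1
		s_old, i_old, r_old = s[end], i[end], r[end]
		push!(s, s_old + h * (-β * s_old * i_old))
		push!(i, i_old + h * ( β * s_old * i_old - γ * i_old))
		push!(r, r_old + h * ( γ * i_old))
	end
	return s, i, r, ts
end
```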
"""
# ╔═╡ 0ba5a304-107d-11eb-3252-d9f15172fbf9
x = [50, 200]
# ╔═╡ ad5b94c6-1071-11eb-25f6-bf800b21b265
"""
<script src="https://cdn.jsdelivr.net/npm/d3@6.2.0/dist/d3.min.js"></script>
<script id="hello">
const x = $(JSON.json(x))
const svg = this == null ? DOM.svg(600,200) : this
const s = this == null ? d3.select(svg) : this.s
s.selectAll("circle")
.data(x)
.join("circle")
.transition()
.duration(300)
.attr("cx", d => d)
.attr("cy", 100)
.attr("r", 10)
.attr("fill", "gray")
const output = svg
output.s = s
return output
</script>
""" |> HTML
# ╔═╡ 82539bbe-106e-11eb-0e9e-170dfa6a7dad
md"""
## **Exercise 2:** _Numerical derivatives_
For fitting we need optimization, and for optimization we will use *derivatives* (rates of change).
In this exercise we will see one method for calculating derivatives numerically.
1. Define a function `deriv` that takes a function `f`, representing a function $f: \mathbb{R} \to \mathbb{R}$, a number $a$ and an optional $h$ with default value 0.001, and calculates the **finite-difference approximation**
$$f'(a) \simeq \frac{f(a + h) - f(a)}{h}$$
of the derivative $f'(a)$.
2. Write a function `tangent_line` that calculates the tangent line to $f$ at a point $a$. It should also accept a value of $x$ at which it will calculate the height of the tangent line. Use the function `deriv` to calculate $f'(a)$.
3. Make an interactive visualization of the function $f(x) = x^3 - 2x$, showing the tangent line at a point $a$ and allowing you to vary $a$.
Make sure that the line is indeed visually tangent to the graph!
4. Write functions `∂x(f, a, b)` and `∂y(f, a, b)` that calculate the **partial derivatives** $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at $(a, b)$ of a function $f : \mathbb{R}^2 \to \mathbb{R}$ (i.e. a function that takes two real numbers and returns one real).
Recall that $\frac{\partial f}{\partial x}$ is the derivative of the single-variable function $g(x) := f(x, b)$ obtained by fixing the value of $y$ to $b$.
You should use **anonymous functions** for this. These have the form `x -> x^2`, meaning "the function that sends $x$ to $x^2$".
5. Write a function `gradient(f, a, b)` that calculates the **gradient** of a function $f$ at the point $(a, b)$, given by the vector $\nabla f(a, b) := (\frac{\partial f}{\partial x}(a, b), \frac{\partial f}{\partial y}(a, b))$.
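
As a sketch of the shapes these functions might take (the names match the exercise; the default `h` and the use of anonymous functions are just one option):

```julia
# Sketch of the finite-difference helpers described above.
deriv(f, a, h=0.001) = (f(a + h) - f(a)) / h          # forward finite difference

tangent_line(f, a, x) = f(a) + deriv(f, a) * (x - a)  # height of the tangent line at x

∂x(f, a, b) = deriv(x -> f(x, b), a)                  # fix y = b, differentiate in x
∂y(f, a, b) = deriv(y -> f(a, y), b)                  # fix x = a, differentiate in y

gradient(f, a, b) = [∂x(f, a, b), ∂y(f, a, b)]
```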
"""
# ╔═╡ 82579b90-106e-11eb-0018-4553c29e57a2
md"""
## **Exercise 3:** _Minimisation using gradient descent_
In this exercise we will use **gradient descent** to find local **minima** of (smooth enough) functions.
To do so we will think of a function as a hill. To find a minimum we should "roll down the hill".
1. Write a function `gradient_descent_1d(f, x0)` to minimize a 1D function, i.e. a function $f: \mathbb{R} \to \mathbb{R}$.
To do so we notice that the derivative tells us the direction in which the function *increases*. So we take steps in the *opposite* direction, of a small size $\eta$ (eta).
Use this to write the function starting from the point `x0` and using your function `deriv` to approximate the derivative.
Take a reasonably large number of steps, say 100, with $\eta = 0.01$.
What would be a better way to decide when to end the function?
2. Write an interactive visualization showing the progress of gradient descent on the function
$$f(x) = x^{4} +3x^{3} - 3x + 5.$$
Make sure that it does indeed get close to a local minimum.
How can you find different local minima?
3. Write a function `gradient_descent_2d(f, x0, y0)` that does the same for functions $f(x, y)$ of two variables.
Multivariable calculus tells us that the gradient $\nabla f(a, b)$ at a point $(a, b)$ is the direction in which the function *increases* the fastest. So again we should take a small step in the *opposite* direction. Note that the gradient is a *vector* which tells us which direction to move in the plane $(a, b)$.
4. Apply this to the **Himmelblau function** $f(x, y) := (x^2 + y - 11)^2 + (x + y^2 - 7)^2$. Visualize the trajectory using both contours (`contour` function) and a 2D surface (`surface` function). [Use e.g. `surface(-2:0.01:2, -2:0.01:2, f)`]
Can you find different minima?
You can try to install the `PlotlyJS` package and activate it with `using Plots; plotlyjs()`. This should give an *interactive* 3D plot. (But don't spend time on this if it doesn't immediately work.)
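
As a rough sketch, gradient descent with a fixed step size and a fixed number of steps might look like the following (this uses `deriv` and `gradient` from Exercise 2; deciding when to stop is left to you):

```julia
# Sketch of fixed-step gradient descent -- one of many reasonable versions.
function gradient_descent_1d(f, x0; η=0.01, N=100)
	x = x0
	for _ in 1:N
		x -= η * deriv(f, x)           # step against the derivative
	end
	return x
end

function gradient_descent_2d(f, x0, y0; η=0.01, N=100)
	x, y = x0, y0
	for _ in 1:N
		gx, gy = gradient(f, x, y)     # step against the gradient
		x -= η * gx
		y -= η * gy
	end
	return x, y
end
```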
"""
# ╔═╡ 8261eb92-106e-11eb-2ccc-1348f232f5c3
md"""
## **Exercise 4:** _Learning parameter values_
In this exercise we will apply gradient descent to fit a simple function $y = f_{a, b}(x)$ to some data given as pairs $(x_i, y_i)$. Here $a$ and $b$ are **parameters** that appear in the form of the function $f$. We want to find the parameters that provide the **best fit**, i.e. the version $f_{a, b}$ of the function that is closest to the data when we vary $a$ and $b$.
To do so we need to define what "best" means. We will define a measure of the distance between the function and the data, given by a **loss function**, which itself depends on the values of $a$ and $b$. Then we will *minimize* the loss function over $a$ and $b$ to find those values that minimize this distance, and hence are "best" in this precise sense.
The iterative procedure by which we gradually adjust the parameter values to improve the loss function is often called **machine learning** or just **learning**, since the computer is "discovering" information in a gradual way, which is supposed to remind us of how humans learn. [Hint: This is not how humans learn.]
1. Load the data from the file `some_data.csv` into vectors `xs` and `ys`.
2. Plot the data. What does it remind you of?
3. Let's try to fit a gaussian (normal) distribution. Its PDF with mean $\mu$ and standard deviation $\sigma$ is
$$f_{\mu, \sigma}(x) := \frac{1}{\sigma \sqrt{2 \pi}}\exp \left[- \frac{(x - \mu)^2}{2 \sigma^2} \right]$$
Define this function as `f(x, μ, σ)`.
4. Define a **loss function** to measure the "distance" between the actual data and the function. It will depend on the values of $\mu$ and $\sigma$ that you choose:
$$\mathcal{L}(\mu, \sigma) := \sum_i [f_{\mu, \sigma}(x_i) - y_i]^2$$
5. Use your `gradient_descent` function to find a local minimum of $\mathcal{L}$, starting with initial values $\mu = 0$ and $\sigma = 1$.
What are the final values for $\mu$ and $\sigma$ that you find?
6. Modify the gradient descent function to store the whole history of the values of $\mu$ and $\sigma$ that it went through.
Make an interactive visualization showing how the resulting curve $f_{\mu, \sigma}$ evolves over time, compared to the original data.
("Time" here corresponds to the iterations in the gradient descent function.)
"""
# ╔═╡ 826bb0dc-106e-11eb-29eb-03e7ddf9e4b5
md"""
## **Exercise 5:** _Putting it all together — fitting an SIR model to data_
In this exercise we will fit the (non-spatial) SIR ODE model from Exercise 1 to some data generated from the spatial model in Problem Set 4.
If we are able to find a good fit, that would suggest that the spatial aspect "does not matter" too much for the dynamics of these models.
If the fit is not so good, perhaps there is an important effect of space. (As usual in statistics, and indeed in modelling in general, we should be very cautious of making claims of this nature.)
This fitting procedure will be different from that in Exercise 4, however: we no longer have an explicit form for the function that we are fitting -- rather, it is the output
of an ODE! So what should we do?
We will try to find the parameters $\beta$ and $\gamma$ for which *the output of the ODEs when we simulate it with those parameters* best matches the data!
1. Load the data from the file XXX. It is given as columns of $(t, s, i, r)$ values. Extract this into vectors `ts`, `Ss`, `Is` and `Rs`.
2. Make a loss function $\mathcal{L}(\beta, \gamma)$. This will compare *the solution of the SIR equations* with parameters $\beta$ and $\gamma$ with your data.
To do this, once we have generated the solution of the SIR equations with parameters $\beta$ and $\gamma$, we must evaluate that solution at the times $t$ in the vector `ts`, so that we can compare with the data at that time.
Normally we would do this by some kind of **interpolation** (fitting a function locally through nearby data points). As a cheap alternative, we will just use the results corresponding to the closest value $t'$ with $t' < t$ that does occur in the solution of the ODEs. Write a function to calculate this value.
(You should be able to do it *without* searching through the whole vector. Hint: Use the fact that the times are equally spaced.)
The loss function should take the form
$$\mathcal{L} = \mathcal{L}_S + 2 \mathcal{L}_I + \mathcal{L}_R$$
where e.g. $\mathcal{L}_S$ is the loss function for the $S$ data, defined as in the previous exercise. The factor 2 weights $I$ more heavily, since the behaviour of $I$ is what we particularly care about, so we want to fit that better. You should experiment with different weights to see what effect it has.
3. Minimize the loss function and make an interactive visualization of how the solution of the SIR model compares to the data, as the parameters change during the gradient descent procedure.
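
As a sketch of how such a loss function might be structured (`loss_sir` is just an illustrative name, and the initial conditions and the names `ts`, `Ss`, `Is`, `Rs` are placeholders for whatever your data-loading step produces):

```julia
# Sketch only: compare the Euler solution with the data, using the fact that the
# solution times are equally spaced to look up the closest earlier time directly.
function loss_sir(β, γ; h=0.1)
	s, i, r, _ = euler_SIR(β, γ, 0.99, 0.01, 0.0, h, last(ts))  # placeholder initial values
	index_of(t) = floor(Int, t / h) + 1    # index of the closest solution time t' ≤ t, no search
	L_S = sum((s[index_of(t)] - S)^2 for (t, S) in zip(ts, Ss))
	L_I = sum((i[index_of(t)] - I)^2 for (t, I) in zip(ts, Is))
	L_R = sum((r[index_of(t)] - R)^2 for (t, R) in zip(ts, Rs))
	return L_S + 2 * L_I + L_R
end
```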
If you made it this far, congratulations -- you have just taken your first step into the exciting world of scientific machine learning!
"""
# ╔═╡ b94b7610-106d-11eb-2852-25337ce6ec3a
if student.name == "Jazzy Doe" || student.kerberos_id == "jazz"
md"""
!!! danger "Before you submit"
Remember to fill in your **name** and **Kerberos ID** at the top of this notebook.
"""
end
# ╔═╡ b94f9df8-106d-11eb-3be8-c5a1bb79d0d4
md"## Function library
Just some helper functions used in the notebook."
# ╔═╡ b9586d66-106d-11eb-0204-a91c8f8355f7
hint(text) = Markdown.MD(Markdown.Admonition("hint", "Hint", [text]))
# ╔═╡ b9616f92-106d-11eb-1bd1-ede92a617fdb
almost(text) = Markdown.MD(Markdown.Admonition("warning", "Almost there!", [text]))
# ╔═╡ b969dbaa-106d-11eb-3e5a-81766a333c49
still_missing(text=md"Replace `missing` with your answer.") = Markdown.MD(Markdown.Admonition("warning", "Here we go!", [text]))
# ╔═╡ b9728c20-106d-11eb-2286-4f670c229f3e
keep_working(text=md"The answer is not quite right.") = Markdown.MD(Markdown.Admonition("danger", "Keep working on it!", [text]))
# ╔═╡ b97afa48-106d-11eb-3c2c-cdee1d1cc6d7
yays = [md"Fantastic!", md"Splendid!", md"Great!", md"Yay ❤", md"Great! 🎉", md"Well done!", md"Keep it up!", md"Good job!", md"Awesome!", md"You got the right answer!", md"Let's move on to the next section."]
# ╔═╡ b98238ce-106d-11eb-1e39-f9eda5df76af
correct(text=rand(yays)) = Markdown.MD(Markdown.Admonition("correct", "Got it!", [text]))
# ╔═╡ b989e544-106d-11eb-3c53-3906c5c922fb
not_defined(variable_name) = Markdown.MD(Markdown.Admonition("danger", "Oopsie!", [md"Make sure that you define a variable called **$(Markdown.Code(string(variable_name)))**"]))
# ╔═╡ 05bfc716-106a-11eb-36cb-e7c488050d54
TODO = html"<h1 style='display: inline; color: purple;'>TODO</h1>"
# ╔═╡ acbb9a56-106d-11eb-115b-7d067adb302c
# ╔═╡ Cell order:
# ╟─048890ee-106a-11eb-1a81-5744150543e8
# ╟─0565af4c-106a-11eb-0d38-2fb84493d86f
# ╟─056ed7f2-106a-11eb-3543-31a5cb560e80
# ╟─0579e962-106a-11eb-26b5-2160f461f4cc
# ╠═0587db1c-106a-11eb-0560-c3d53c516805
# ╟─05976f0c-106a-11eb-03a4-0febbc18fae8
# ╠═05b01f6e-106a-11eb-2a88-5f523fafe433
# ╠═0d191540-106e-11eb-1f20-bf72a75fb650
# ╠═46caa70a-107c-11eb-1cd8-d5f432203eac
# ╟─518fb3aa-106e-11eb-0fcd-31091a8f12db
# ╠═0ba5a304-107d-11eb-3252-d9f15172fbf9
# ╠═ad5b94c6-1071-11eb-25f6-bf800b21b265
# ╠═82539bbe-106e-11eb-0e9e-170dfa6a7dad
# ╟─82579b90-106e-11eb-0018-4553c29e57a2
# ╟─8261eb92-106e-11eb-2ccc-1348f232f5c3
# ╟─826bb0dc-106e-11eb-29eb-03e7ddf9e4b5
# ╟─b94b7610-106d-11eb-2852-25337ce6ec3a
# ╟─b94f9df8-106d-11eb-3be8-c5a1bb79d0d4
# ╟─b9586d66-106d-11eb-0204-a91c8f8355f7
# ╟─b9616f92-106d-11eb-1bd1-ede92a617fdb
# ╟─b969dbaa-106d-11eb-3e5a-81766a333c49
# ╟─b9728c20-106d-11eb-2286-4f670c229f3e
# ╟─b97afa48-106d-11eb-3c2c-cdee1d1cc6d7
# ╟─b98238ce-106d-11eb-1e39-f9eda5df76af
# ╟─b989e544-106d-11eb-3c53-3906c5c922fb
# ╠═05bfc716-106a-11eb-36cb-e7c488050d54
# ╠═acbb9a56-106d-11eb-115b-7d067adb302c