@mbeltagy
Last active October 26, 2016 07:59
Julia translation of Norvig's notebook on "Probability, Paradox, and the Reasonable Person Principle"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"<div style=\"text-align: right\">Peter Norvig, 3 Oct 2015, revised 27 Oct 2015, translated to Julia by Mohammed El-Beltagy 27 Nov 2015, updated for Julia 0.5 compatibility 26 Oct 2016</div> \n",
"\n",
"# Probability, Paradox, and the Reasonable Person Principle\n",
"\n",
"*This notebook is a Julia translation of [Norvig's original notebook](http://nbviewer.ipython.org/url/norvig.com/ipython/Probability.ipynb). Python generators were replaced by Julia's coroutines. Where a function has to handle\n",
"different input types, I used Julia's multiple dispatch instead of checking for\n",
"types inside the function. Norvig's descriptions and original text are changed only\n",
"where the results of some simulations come out slightly differently.*\n",
"\n",
"In this notebook, we cover the basics of probability theory, and show how to implement the theory in Julia. (You should have a little background in [probability](http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/pdf.html) and [Julia](http://julialang.org/learning/).) Then we show how to solve some particularly perplexing paradoxical probability problems.\n",
"\n",
"Over 200 years ago, Pierre-Simon Laplace [wrote](https://en.wikipedia.org/wiki/Classical_definition_of_probability):\n",
"\n",
">The probability of an event is the ratio of the number of cases favorable to it, to the number of all cases possible, when [the cases are] equally possible. ... Probability is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible.\n",
"\n",
"Laplace really nailed it, way back then. If you want to untangle a probability problem (paradoxical or not), all you have to do is be methodical about defining exactly what the cases are, and then careful in counting the number of favorable and total cases. We'll start being methodical by defining terms:\n",
"\n",
"\n",
"- **[Experiment](https://en.wikipedia.org/wiki/Experiment_(probability_theory%29):**\n",
" An occurrence with an uncertain outcome that we can observe.\n",
" <br>*For example, rolling a die.*\n",
"- **[Outcome](https://en.wikipedia.org/wiki/Outcome_(probability%29):**\n",
" The result of an experiment; one particular state of the world. Synonym for \"case.\"\n",
" <br>*For example:* `6`.\n",
"- **[Sample Space](https://en.wikipedia.org/wiki/Sample_space):**\n",
" The set of all possible outcomes for the experiment. (For now, assume each outcome is equally likely.)\n",
" <br>*For example,* `{1, 2, 3, 4, 5, 6}`.\n",
"- **[Event](https://en.wikipedia.org/wiki/Event_(probability_theory%29):**\n",
" A subset of possible outcomes that together have some property we are interested in.\n",
" <br>*For example, the event \"even die roll\" is the set of outcomes* `{2, 4, 6}`. \n",
"- **[Probability](https://en.wikipedia.org/wiki/Probability_theory):**\n",
" The number of possible outcomes in the event divided by the number in the sample space.\n",
" <br>*For example, the probability of an even outcome from a six-sided die is* `|{2, 4, 6}| / |{1, 2, 3, 4, 5, 6}| = 3/6 = 1/2.`\n",
"\n",
"# Definition of `P` for Probability in Julia"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"The probability of an event, given a sample space of equiprobable outcomes.\"\n",
"P(event, space)=length(intersect(event,space))//length(space);"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Read this as *\"Probability is thus simply a fraction whose numerator is the number of favorable cases (outcomes in the intersection of the sample space and the event) and whose denominator is the number of all the cases possible (the sample space).\"* Note I use `//` rather than regular division because I want exact answers like 1/3, not 0.3333333333333333. \n",
"\n",
"# Warm-up Problem: Die Roll"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Let's consider the experiment of rolling a single six-sided fair die. We'll call the sample space `D`:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"D = [1, 2, 3, 4, 5, 6];"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The probability of the event of \"*rolling an even number*\" can be calculated as follows:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"even = [2, 4, 6]\n",
"\n",
"P(even, D)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"But that's inelegant: I had to explicitly enumerate all the even numbers from one to six. If I ever wanted to deal with a different kind of die, say a twelve- or twenty-sided die, I would have to go back and change the definition of `even`. I would prefer to define even numbers once and for all with a *predicate* (a function that returns `true` or `false`), if only `P` would accept that."
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# Revised Version of `P`, accepting a predicate for the event\n",
"\n",
"It would be great if we could specify an event as either a set of outcomes, or a predicate over outcomes. Let's make it so:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: Method definition P(Any, Any) in module Main at In[1]:2 overwritten at In[4]:4.\n",
"WARNING: replacing docs for 'P :: Tuple{Any,Any}' in module 'Main'.\n"
]
}
],
"source": [
"\"\"\"The probability of an event, given a sample space of equiprobable outcomes.\n",
"   event can be either a set of outcomes, or a predicate (true for outcomes in the event).\"\"\"\n",
"function P(event, space)\n",
"    event_ = such_that(event, space)\n",
"    length(intersect(event_,space))//length(space)\n",
"end\n",
"\n",
"# Multiple dispatch picks the method: predicates are filtered, collections pass through.\n",
"\"The subset of elements in the collection for which the predicate is true.\"\n",
"function such_that(predicate::Function, collection)\n",
"    filter(predicate,collection)\n",
"end;\n",
"\"Fallback: the event is already a collection of outcomes, so return it unchanged.\"\n",
"function such_that(event, collection)\n",
"    event\n",
"end;"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"3-element Array{Int64,1}:\n",
" 2\n",
" 4\n",
" 6"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"even_p(n) = (n % 2 == 0)\n",
"\n",
"such_that(even_p, D)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(even_p, D)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"6-element Array{Int64,1}:\n",
" 2\n",
" 4\n",
" 6\n",
" 8\n",
" 10\n",
" 12"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"D12 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]\n",
"\n",
"such_that(even_p, D12)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(even_p, D12)"
]
},
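{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The same predicate carries over to any die without being redefined. As a small sketch, here is a twenty-sided die built from a range instead of a hand-typed list. The names `prob`, `is_even`, and `D20` are introduced here for illustration and are not part of Norvig's notebook; `prob` is a predicate-only stand-in for `P`, assumed so that the cell runs on its own."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"# Illustrative names, not part of the original notebook.\n",
"# count(predicate, space) is the number of favorable outcomes.\n",
"prob(predicate, space) = count(predicate, space) // length(space)\n",
"is_even(n) = n % 2 == 0\n",
"\n",
"D20 = collect(1:20)   # outcomes 1 through 20\n",
"\n",
"# 10 of the 20 outcomes are even, so the exact answer is 1//2.\n",
"@assert prob(is_even, D20) == 1//2\n",
"prob(is_even, D20)"
]
},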
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# The Two Child Paradoxes\n",
"\n",
"In 1959, Martin Gardner [posed](https://en.wikipedia.org/wiki/Boy_or_Girl_paradox) these two problems:\n",
"\n",
"- **Problem 1.** Mr. Jones has two children. The older child is a boy. What is the\n",
"probability that both children are boys?\n",
"\n",
"- **Problem 2.** Mr. Smith has two children. At least one of them is a boy. What is\n",
"the probability that both children are boys? \n",
"\n",
"And in 2010, Gary Foshee added this one:\n",
"\n",
"- **Problem 3.** I have two children. At least one of them is a boy born on Tuesday. What is\n",
"the probability that both children are boys? \n",
"\n",
"Problems 2 and 3 are considered *paradoxes* because they have surprising answers that people\n",
"argue about. \n",
"\n",
"(*Note:* Assume equiprobable outcomes; don't worry that actually 51% of births are male, etc.)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"## Problem 1: Older child is a boy. What is the probability both are boys?\n",
"\n",
"We use `\"BG\"` to denote the outcome in which the older child is a boy and the younger a girl. The sample space, `S`, is:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"S = [\"BG\", \"BB\", \"GB\", \"GG\"];"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Let's define predicates for the conditions of having two boys, and of the older child being a boy:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"two_boys(outcome)= length(matchall(r\"B\", outcome)) == 2\n",
"\n",
"older_is_a_boy(outcome)= startswith(outcome,\"B\");"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now we can answer Problem 1:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(older_is_a_boy, S))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"## Problem 2: At least one is a boy. What is the probability both are boys? \n",
"\n",
"Implementing this problem and finding the answer is easy:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"at_least_one_boy(outcome)= 'B' in outcome;"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//3"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(at_least_one_boy, S))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Understanding the problem is tougher. Some people think the answer should be 1/2. Can we justify the answer 1/3? We can see there are three equiprobable outcomes in which there is at least one boy:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"3-element Array{String,1}:\n",
" \"BG\"\n",
" \"BB\"\n",
" \"GB\""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"such_that(at_least_one_boy, S)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Of those three outcomes, only one has two boys, so the answer of 1/3 is indeed justified. \n",
"\n",
"But some people *still* think the answer should be 1/2.\n",
"Their reasoning is *\"If one child is a boy, then there are two equiprobable outcomes for the other child, so the probability that the other child is a boy, and thus that there are two boys, is 1/2.\"* \n",
"\n",
"When two methods of reasoning give two different answers, we have a [paradox](https://en.wikipedia.org/wiki/Paradox). Here are three responses to a paradox:\n",
"\n",
"1. The very fundamentals of mathematics must be incomplete, and this problem reveals it!\n",
"2. I'm right, and anyone who disagrees with me is an idiot!\n",
"3. I have the right answer for one interpretation of the problem, and you have the right answer\n",
"for a different interpretation of the problem.\n",
"\n",
"If you're [Bertrand Russell](https://en.wikipedia.org/wiki/Russell%27s_paradox) or [Georg Cantor](https://en.wikipedia.org/wiki/Cantor%27s_paradox), you might very well uncover a fundamental flaw in mathematics; for the rest of us, I recommend Response 3. When I believe the answer is 1/3, and I hear someone say the answer is 1/2, my response is *\"How interesting! They must have a different interpretation of the problem; I should try to discover what their interpretation is, and why their answer is correct.\"* First I explicitly describe my understanding of the experiment:\n",
"\n",
"- **Experiment 2a.** Mr. Smith is chosen at random from families with two children. He is asked if at least one of his children is a boy. He replies \"yes.\"\n",
"\n",
"Next I envision another possible interpretation of the experiment:\n",
"\n",
"- **Experiment 2b.** Mr. Smith is chosen at random from families with two children. He is observed at a time when he is accompanied by one of his children, chosen at random. The child is observed to be a boy. \n",
"\n",
"Experiment 2b needs a different sample space, which we will call `S2b`. It consists of 8 outcomes, not just 4; for each of the 4 outcomes in `S`, we have a choice of observing either the older child or the younger child. We will use the notation `'GB/g?'` to mean that the older child is a girl, the younger a boy, the older child was observed to be a girl, and the younger was not observed. The sample space is therefore:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"S2b = [\"BB/b?\", \"BB/?b\", \n",
" \"BG/b?\", \"BG/?g\", \n",
" \"GB/g?\", \"GB/?b\", \n",
" \"GG/g?\", \"GG/?g\"];"
]
},
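{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The eight outcomes can also be generated from the four family outcomes instead of being typed by hand, which makes the structure of the sample space explicit: every family contributes one outcome per observable child. This is only a sketch; `families` and `S2b_generated` are names introduced here and are not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"# Illustrative names, not part of the original notebook.\n",
"families = [\"BG\", \"BB\", \"GB\", \"GG\"]\n",
"\n",
"S2b_generated = String[]\n",
"for family in families\n",
"    older   = lowercase(family[1:1])   # the older child as observed, e.g. \"b\"\n",
"    younger = lowercase(family[2:2])   # the younger child as observed, e.g. \"g\"\n",
"    push!(S2b_generated, string(family, \"/\", older, \"?\"))     # older child observed\n",
"    push!(S2b_generated, string(family, \"/\", \"?\", younger))   # younger child observed\n",
"end\n",
"\n",
"# Two observations for each of the four families.\n",
"@assert length(S2b_generated) == 8\n",
"@assert sort(S2b_generated) == sort([\"BB/b?\", \"BB/?b\", \"BG/b?\", \"BG/?g\",\n",
"                                     \"GB/g?\", \"GB/?b\", \"GG/g?\", \"GG/?g\"])"
]
},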
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now we can figure out the subset of this sample space in which we observe Mr. Smith with a boy:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"4-element Array{String,1}:\n",
" \"BB/b?\"\n",
" \"BB/?b\"\n",
" \"BG/b?\"\n",
" \"GB/?b\""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"observed_boy(outcome)= 'b' in outcome\n",
"\n",
"such_that(observed_boy, S2b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"And finally we can determine the probability that he has two boys, given that we observed him with a boy:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(observed_boy, S2b))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The paradox is resolved. Two reasonable people can have different interpretations of the problem, and can each reason flawlessly to reach different conclusions, 1/3 or 1/2. Which interpretation is \"better?\" We could debate that, or we could just agree to use unambiguous problem descriptions (that is, use the language of Experiment 2a or Experiment 2b, not the ambiguous language of Problem 2). \n",
"\n",
"\n",
"\n",
"## The Reasonable Person Principle\n",
"\n",
"It is an unfortunate fact of human nature that we often assume the other person is an idiot. As [George Carlin puts it](https://www.youtube.com/watch?v=XWPCE2tTLZQ) *\"Have you ever noticed when you're driving that anybody driving slower than you is an idiot, and anyone going faster than you is a maniac?\"*\n",
"\n",
"The assumption that other people are more likely to be **reasonable** rather than **idiots** is known as the **[reasonable person principle](http://www.cs.cmu.edu/~weigand/staff/)**. It is a guiding principle at Carnegie Mellon University's School of Computer Science, and is a principle I try to live by as well.\n",
"\n",
"Now let's return to an even more paradoxical problem.\n",
"\n",
"## Problem 3. One is a boy born on Tuesday. What's the probability both are boys?\n",
"\n",
"When Gary Foshee posed this problem, most people could not imagine how the boy's birth-day-of-week could be relevant, and felt the answer should be the same as Problem 2. But in order to tell for sure, we should clearly state what the experiment is, define the sample space, and calculate. First:\n",
"\n",
"- **Experiment 3a.** A parent is chosen at random from families with two children. She is asked if at least one of her children is a boy born on Tuesday. She replies \"yes.\"\n",
"\n",
"Next we'll define a sample space. We'll use the notation \"`G1B3`\" to mean the older child is a girl born on the first day of the week (Sunday) and the younger a boy born on the third day of the week (Tuesday). We'll call the resulting sample space `S3`."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"sexesdays = [\"$sex$day\" \n",
" for sex in \"GB\", \n",
" day in \"1234567\"]\n",
"\n",
"S3 = [\"$older$younger\" \n",
" for older in sexesdays, \n",
" younger in sexesdays];"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"String[\"B1B1\",\"B1B2\",\"B1B3\",\"B1B4\",\"B1B5\",\"B1B6\",\"B1B7\",\"B1G1\",\"B1G2\",\"B1G3\",\"B1G4\",\"B1G5\",\"B1G6\",\"B1G7\",\"B2B1\",\"B2B2\",\"B2B3\",\"B2B4\",\"B2B5\",\"B2B6\",\"B2B7\",\"B2G1\",\"B2G2\",\"B2G3\",\"B2G4\",\"B2G5\",\"B2G6\",\"B2G7\",\"B3B1\",\"B3B2\",\"B3B3\",\"B3B4\",\"B3B5\",\"B3B6\",\"B3B7\",\"B3G1\",\"B3G2\",\"B3G3\",\"B3G4\",\"B3G5\",\"B3G6\",\"B3G7\",\"B4B1\",\"B4B2\",\"B4B3\",\"B4B4\",\"B4B5\",\"B4B6\",\"B4B7\",\"B4G1\",\"B4G2\",\"B4G3\",\"B4G4\",\"B4G5\",\"B4G6\",\"B4G7\",\"B5B1\",\"B5B2\",\"B5B3\",\"B5B4\",\"B5B5\",\"B5B6\",\"B5B7\",\"B5G1\",\"B5G2\",\"B5G3\",\"B5G4\",\"B5G5\",\"B5G6\",\"B5G7\",\"B6B1\",\"B6B2\",\"B6B3\",\"B6B4\",\"B6B5\",\"B6B6\",\"B6B7\",\"B6G1\",\"B6G2\",\"B6G3\",\"B6G4\",\"B6G5\",\"B6G6\",\"B6G7\",\"B7B1\",\"B7B2\",\"B7B3\",\"B7B4\",\"B7B5\",\"B7B6\",\"B7B7\",\"B7G1\",\"B7G2\",\"B7G3\",\"B7G4\",\"B7G5\",\"B7G6\",\"B7G7\",\"G1B1\",\"G1B2\",\"G1B3\",\"G1B4\",\"G1B5\",\"G1B6\",\"G1B7\",\"G1G1\",\"G1G2\",\"G1G3\",\"G1G4\",\"G1G5\",\"G1G6\",\"G1G7\",\"G2B1\",\"G2B2\",\"G2B3\",\"G2B4\",\"G2B5\",\"G2B6\",\"G2B7\",\"G2G1\",\"G2G2\",\"G2G3\",\"G2G4\",\"G2G5\",\"G2G6\",\"G2G7\",\"G3B1\",\"G3B2\",\"G3B3\",\"G3B4\",\"G3B5\",\"G3B6\",\"G3B7\",\"G3G1\",\"G3G2\",\"G3G3\",\"G3G4\",\"G3G5\",\"G3G6\",\"G3G7\",\"G4B1\",\"G4B2\",\"G4B3\",\"G4B4\",\"G4B5\",\"G4B6\",\"G4B7\",\"G4G1\",\"G4G2\",\"G4G3\",\"G4G4\",\"G4G5\",\"G4G6\",\"G4G7\",\"G5B1\",\"G5B2\",\"G5B3\",\"G5B4\",\"G5B5\",\"G5B6\",\"G5B7\",\"G5G1\",\"G5G2\",\"G5G3\",\"G5G4\",\"G5G5\",\"G5G6\",\"G5G7\",\"G6B1\",\"G6B2\",\"G6B3\",\"G6B4\",\"G6B5\",\"G6B6\",\"G6B7\",\"G6G1\",\"G6G2\",\"G6G3\",\"G6G4\",\"G6G5\",\"G6G6\",\"G6G7\",\"G7B1\",\"G7B2\",\"G7B3\",\"G7B4\",\"G7B5\",\"G7B6\",\"G7B7\",\"G7G1\",\"G7G2\",\"G7G3\",\"G7G4\",\"G7G5\",\"G7G6\",\"G7G7\"]"
]
}
],
"source": [
"@assert length(S3) == (2*7)^2 == 196\n",
"\n",
"print(sort(vec(S3)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We determine below that the probability of having at least one boy is 3/4, both in `S3` and in `S`:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"3//4"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(at_least_one_boy, S3)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"3//4"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(at_least_one_boy, S)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The probability of two boys is 1/4 in either sample space:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//4"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, S3)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//4"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, S)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"And the probability of two boys given at least one boy is 1/3 in either sample space:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//3"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(at_least_one_boy, S3))"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//3"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(at_least_one_boy, S))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We will define a predicate for the event of at least one boy born on Tuesday: "
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"at_least_one_boy_tues(outcome)= contains(outcome,\"B3\");"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We are now ready to answer Problem 3:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"13//27"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(at_least_one_boy_tues, S3))"
]
},
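{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Where do the 27 and the 13 come from? A hand count by inclusion-exclusion gives the same fraction. The variable names below are introduced only for this check and are not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"# Illustrative names, not part of the original notebook.\n",
"# Outcomes with at least one boy born on Tuesday (B3):\n",
"# older is B3 (14 choices for the younger) plus younger is B3 (14 choices\n",
"# for the older), minus the single outcome B3B3 that was counted twice.\n",
"with_b3 = 14 + 14 - 1            # 27\n",
"\n",
"# Of those, outcomes with two boys: the other child is a boy born on any\n",
"# of the 7 days, again subtracting the double-counted B3B3.\n",
"two_boys_with_b3 = 7 + 7 - 1     # 13\n",
"\n",
"@assert two_boys_with_b3 // with_b3 == 13//27\n",
"two_boys_with_b3 // with_b3"
]
},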
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"13/27 is quite different from 1/3 (but rather close to 1/2). So \"at least one boy born on Tuesday\" is quite different from \"at least one boy.\" Are you surprised? Do you accept the answer, or do you think we did something wrong? Are there other interpretations of the experiment that lead to other answers?\n",
"\n",
"Here is one alternative interpretation:"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"- **Experiment 3b.** A parent is chosen at random from families with two children. She is observed at a time when she is accompanied by one of her children, chosen at random. The child is observed to be a boy who reports that his birth day is Tuesday.\n",
"\n",
"We can represent outcomes in this sample space with the notation `G1B3/??b3`, meaning the older child is a girl born on Sunday, the younger a boy born on Tuesday, the older was not observed, and the younger was."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
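},
"outputs": [],
"source": [
"observed_boy_tues(outcome)= contains(outcome,\"b3\") \n",
"\n",
"# Each outcome in S3 yields two outcomes in S3b: one per observed child.\n",
"S3b=[]\n",
"for children in S3\n",
"    for i=1:2\n",
"        # i == 1: the older child was observed; i == 2: the younger child was observed.\n",
"        observed = (i==1) ? lowercase(children)[1:2]*\"??\" : \"??\"*lowercase(children)[3:4]\n",
"        push!(S3b,\"$children/$observed\")\n",
"    end\n",
"end"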
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now we can answer this version of Problem 3:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(two_boys, such_that(observed_boy_tues, S3b))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"So with the wording of Experiment 3b, the answer to Problem 3 is 1/2, the same as for Experiment 2b.\n",
"\n",
"Still confused? Let's build a visualization tool to make things more concrete.\n",
"\n",
"# Visualization\n",
"\n",
"We'll display the results as a two-dimensional table of outcomes, where each cell in the table is a color-coded outcome. A cell will be white if it does not satisfy the predicate we are working with; green if it satisfies the predicate and the outcome contains two boys; and yellow if it satisfies the predicate but does not have two boys. Every cell in a row has the same older child, and every cell in a column has the same younger child. Here's the code to display a table:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"\"\"Display sample space in a table, color-coded: green if event and condition are both true;\n",
"   yellow if only condition is true; white otherwise.\"\"\"\n",
"function table(space, n=1, event=two_boys, condition=older_is_a_boy)\n",
"    # n is the number of characters that make up the older child.\n",
"    olders = sort(unique([outcome[1:n] for outcome in space]))\n",
"    html = string(\"<table>\",\n",
"                  join([row(older, space, event, condition) for older in olders]),\n",
"                  \"</table>\",\n",
"                  P(event, such_that(condition, space)))\n",
"    display(\"text/html\",html)\n",
"end\n",
"\n",
"\"Build the HTML for a row where an older child is paired with each of the possible younger children.\"\n",
"function row(older, space, event, condition)\n",
"    thisrow = sort(filter((x)->startswith(x,older),space))\n",
"    string(\"<tr>\", join([cell(outcome, event, condition) for outcome in thisrow]),\"</tr>\")\n",
"end\n",
"\n",
"\"Build the HTML for one outcome, in the appropriate color.\"\n",
"function cell(outcome, event, condition)\n",
"    # Spaces around ? and : keep the nested ternary readable and valid in newer Julia versions.\n",
"    color = (event(outcome) && condition(outcome)) ? \"lightgreen\" : condition(outcome) ? \"yellow\" : \"ghostwhite\"\n",
"    return \"<td style=\\\"background-color: $color\\\">$outcome</td>\" \n",
"end;"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We can use this visualization tool to see that in Problem 1, there is one outcome with two boys (green) out of a total of two outcomes where the older is a boy (green and yellow) so the probability of two boys given that the older is a boy is 1/2."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<table><tr><td style=\"background-color: lightgreen\">BB</td><td style=\"background-color: yellow\">BG</td></tr><tr><td style=\"background-color: ghostwhite\">GB</td><td style=\"background-color: ghostwhite\">GG</td></tr></table>1//2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Problem 1\n",
"table(S, 1, two_boys, older_is_a_boy)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"For Problem 2, we see the probability of two boys (green) given at least one boy (green and yellow) is 1/3. "
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<table><tr><td style=\"background-color: lightgreen\">BB</td><td style=\"background-color: yellow\">BG</td></tr><tr><td style=\"background-color: yellow\">GB</td><td style=\"background-color: ghostwhite\">GG</td></tr></table>1//3"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Problem 2\n",
"table(S, 1, two_boys, at_least_one_boy)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The answer is still 1/3 when we consider the day of the week of each birth. (We've just made each cell \"bigger\" by enumerating all the days-of-week.)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<table><tr><td style=\"background-color: lightgreen\">B1B1</td><td style=\"background-color: lightgreen\">B1B2</td><td style=\"background-color: lightgreen\">B1B3</td><td style=\"background-color: lightgreen\">B1B4</td><td style=\"background-color: lightgreen\">B1B5</td><td style=\"background-color: lightgreen\">B1B6</td><td style=\"background-color: lightgreen\">B1B7</td><td style=\"background-color: yellow\">B1G1</td><td style=\"background-color: yellow\">B1G2</td><td style=\"background-color: yellow\">B1G3</td><td style=\"background-color: yellow\">B1G4</td><td style=\"background-color: yellow\">B1G5</td><td style=\"background-color: yellow\">B1G6</td><td style=\"background-color: yellow\">B1G7</td></tr><tr><td style=\"background-color: lightgreen\">B2B1</td><td style=\"background-color: lightgreen\">B2B2</td><td style=\"background-color: lightgreen\">B2B3</td><td style=\"background-color: lightgreen\">B2B4</td><td style=\"background-color: lightgreen\">B2B5</td><td style=\"background-color: lightgreen\">B2B6</td><td style=\"background-color: lightgreen\">B2B7</td><td style=\"background-color: yellow\">B2G1</td><td style=\"background-color: yellow\">B2G2</td><td style=\"background-color: yellow\">B2G3</td><td style=\"background-color: yellow\">B2G4</td><td style=\"background-color: yellow\">B2G5</td><td style=\"background-color: yellow\">B2G6</td><td style=\"background-color: yellow\">B2G7</td></tr><tr><td style=\"background-color: lightgreen\">B3B1</td><td style=\"background-color: lightgreen\">B3B2</td><td style=\"background-color: lightgreen\">B3B3</td><td style=\"background-color: lightgreen\">B3B4</td><td style=\"background-color: lightgreen\">B3B5</td><td style=\"background-color: lightgreen\">B3B6</td><td style=\"background-color: lightgreen\">B3B7</td><td style=\"background-color: yellow\">B3G1</td><td style=\"background-color: yellow\">B3G2</td><td style=\"background-color: yellow\">B3G3</td><td style=\"background-color: yellow\">B3G4</td><td 
style=\"background-color: yellow\">B3G5</td><td style=\"background-color: yellow\">B3G6</td><td style=\"background-color: yellow\">B3G7</td></tr><tr><td style=\"background-color: lightgreen\">B4B1</td><td style=\"background-color: lightgreen\">B4B2</td><td style=\"background-color: lightgreen\">B4B3</td><td style=\"background-color: lightgreen\">B4B4</td><td style=\"background-color: lightgreen\">B4B5</td><td style=\"background-color: lightgreen\">B4B6</td><td style=\"background-color: lightgreen\">B4B7</td><td style=\"background-color: yellow\">B4G1</td><td style=\"background-color: yellow\">B4G2</td><td style=\"background-color: yellow\">B4G3</td><td style=\"background-color: yellow\">B4G4</td><td style=\"background-color: yellow\">B4G5</td><td style=\"background-color: yellow\">B4G6</td><td style=\"background-color: yellow\">B4G7</td></tr><tr><td style=\"background-color: lightgreen\">B5B1</td><td style=\"background-color: lightgreen\">B5B2</td><td style=\"background-color: lightgreen\">B5B3</td><td style=\"background-color: lightgreen\">B5B4</td><td style=\"background-color: lightgreen\">B5B5</td><td style=\"background-color: lightgreen\">B5B6</td><td style=\"background-color: lightgreen\">B5B7</td><td style=\"background-color: yellow\">B5G1</td><td style=\"background-color: yellow\">B5G2</td><td style=\"background-color: yellow\">B5G3</td><td style=\"background-color: yellow\">B5G4</td><td style=\"background-color: yellow\">B5G5</td><td style=\"background-color: yellow\">B5G6</td><td style=\"background-color: yellow\">B5G7</td></tr><tr><td style=\"background-color: lightgreen\">B6B1</td><td style=\"background-color: lightgreen\">B6B2</td><td style=\"background-color: lightgreen\">B6B3</td><td style=\"background-color: lightgreen\">B6B4</td><td style=\"background-color: lightgreen\">B6B5</td><td style=\"background-color: lightgreen\">B6B6</td><td style=\"background-color: lightgreen\">B6B7</td><td style=\"background-color: yellow\">B6G1</td><td 
style=\"background-color: yellow\">B6G2</td><td style=\"background-color: yellow\">B6G3</td><td style=\"background-color: yellow\">B6G4</td><td style=\"background-color: yellow\">B6G5</td><td style=\"background-color: yellow\">B6G6</td><td style=\"background-color: yellow\">B6G7</td></tr><tr><td style=\"background-color: lightgreen\">B7B1</td><td style=\"background-color: lightgreen\">B7B2</td><td style=\"background-color: lightgreen\">B7B3</td><td style=\"background-color: lightgreen\">B7B4</td><td style=\"background-color: lightgreen\">B7B5</td><td style=\"background-color: lightgreen\">B7B6</td><td style=\"background-color: lightgreen\">B7B7</td><td style=\"background-color: yellow\">B7G1</td><td style=\"background-color: yellow\">B7G2</td><td style=\"background-color: yellow\">B7G3</td><td style=\"background-color: yellow\">B7G4</td><td style=\"background-color: yellow\">B7G5</td><td style=\"background-color: yellow\">B7G6</td><td style=\"background-color: yellow\">B7G7</td></tr><tr><td style=\"background-color: yellow\">G1B1</td><td style=\"background-color: yellow\">G1B2</td><td style=\"background-color: yellow\">G1B3</td><td style=\"background-color: yellow\">G1B4</td><td style=\"background-color: yellow\">G1B5</td><td style=\"background-color: yellow\">G1B6</td><td style=\"background-color: yellow\">G1B7</td><td style=\"background-color: ghostwhite\">G1G1</td><td style=\"background-color: ghostwhite\">G1G2</td><td style=\"background-color: ghostwhite\">G1G3</td><td style=\"background-color: ghostwhite\">G1G4</td><td style=\"background-color: ghostwhite\">G1G5</td><td style=\"background-color: ghostwhite\">G1G6</td><td style=\"background-color: ghostwhite\">G1G7</td></tr><tr><td style=\"background-color: yellow\">G2B1</td><td style=\"background-color: yellow\">G2B2</td><td style=\"background-color: yellow\">G2B3</td><td style=\"background-color: yellow\">G2B4</td><td style=\"background-color: yellow\">G2B5</td><td style=\"background-color: 
yellow\">G2B6</td><td style=\"background-color: yellow\">G2B7</td><td style=\"background-color: ghostwhite\">G2G1</td><td style=\"background-color: ghostwhite\">G2G2</td><td style=\"background-color: ghostwhite\">G2G3</td><td style=\"background-color: ghostwhite\">G2G4</td><td style=\"background-color: ghostwhite\">G2G5</td><td style=\"background-color: ghostwhite\">G2G6</td><td style=\"background-color: ghostwhite\">G2G7</td></tr><tr><td style=\"background-color: yellow\">G3B1</td><td style=\"background-color: yellow\">G3B2</td><td style=\"background-color: yellow\">G3B3</td><td style=\"background-color: yellow\">G3B4</td><td style=\"background-color: yellow\">G3B5</td><td style=\"background-color: yellow\">G3B6</td><td style=\"background-color: yellow\">G3B7</td><td style=\"background-color: ghostwhite\">G3G1</td><td style=\"background-color: ghostwhite\">G3G2</td><td style=\"background-color: ghostwhite\">G3G3</td><td style=\"background-color: ghostwhite\">G3G4</td><td style=\"background-color: ghostwhite\">G3G5</td><td style=\"background-color: ghostwhite\">G3G6</td><td style=\"background-color: ghostwhite\">G3G7</td></tr><tr><td style=\"background-color: yellow\">G4B1</td><td style=\"background-color: yellow\">G4B2</td><td style=\"background-color: yellow\">G4B3</td><td style=\"background-color: yellow\">G4B4</td><td style=\"background-color: yellow\">G4B5</td><td style=\"background-color: yellow\">G4B6</td><td style=\"background-color: yellow\">G4B7</td><td style=\"background-color: ghostwhite\">G4G1</td><td style=\"background-color: ghostwhite\">G4G2</td><td style=\"background-color: ghostwhite\">G4G3</td><td style=\"background-color: ghostwhite\">G4G4</td><td style=\"background-color: ghostwhite\">G4G5</td><td style=\"background-color: ghostwhite\">G4G6</td><td style=\"background-color: ghostwhite\">G4G7</td></tr><tr><td style=\"background-color: yellow\">G5B1</td><td style=\"background-color: yellow\">G5B2</td><td style=\"background-color: 
yellow\">G5B3</td><td style=\"background-color: yellow\">G5B4</td><td style=\"background-color: yellow\">G5B5</td><td style=\"background-color: yellow\">G5B6</td><td style=\"background-color: yellow\">G5B7</td><td style=\"background-color: ghostwhite\">G5G1</td><td style=\"background-color: ghostwhite\">G5G2</td><td style=\"background-color: ghostwhite\">G5G3</td><td style=\"background-color: ghostwhite\">G5G4</td><td style=\"background-color: ghostwhite\">G5G5</td><td style=\"background-color: ghostwhite\">G5G6</td><td style=\"background-color: ghostwhite\">G5G7</td></tr><tr><td style=\"background-color: yellow\">G6B1</td><td style=\"background-color: yellow\">G6B2</td><td style=\"background-color: yellow\">G6B3</td><td style=\"background-color: yellow\">G6B4</td><td style=\"background-color: yellow\">G6B5</td><td style=\"background-color: yellow\">G6B6</td><td style=\"background-color: yellow\">G6B7</td><td style=\"background-color: ghostwhite\">G6G1</td><td style=\"background-color: ghostwhite\">G6G2</td><td style=\"background-color: ghostwhite\">G6G3</td><td style=\"background-color: ghostwhite\">G6G4</td><td style=\"background-color: ghostwhite\">G6G5</td><td style=\"background-color: ghostwhite\">G6G6</td><td style=\"background-color: ghostwhite\">G6G7</td></tr><tr><td style=\"background-color: yellow\">G7B1</td><td style=\"background-color: yellow\">G7B2</td><td style=\"background-color: yellow\">G7B3</td><td style=\"background-color: yellow\">G7B4</td><td style=\"background-color: yellow\">G7B5</td><td style=\"background-color: yellow\">G7B6</td><td style=\"background-color: yellow\">G7B7</td><td style=\"background-color: ghostwhite\">G7G1</td><td style=\"background-color: ghostwhite\">G7G2</td><td style=\"background-color: ghostwhite\">G7G3</td><td style=\"background-color: ghostwhite\">G7G4</td><td style=\"background-color: ghostwhite\">G7G5</td><td style=\"background-color: ghostwhite\">G7G6</td><td style=\"background-color: 
ghostwhite\">G7G7</td></tr></table>1//3"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Problem 2\n",
"table(S3, 2, two_boys, at_least_one_boy)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"\n",
"\n",
"Now for the paradox of Problem 3:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/html": [
"<table><tr><td style=\"background-color: ghostwhite\">B1B1</td><td style=\"background-color: ghostwhite\">B1B2</td><td style=\"background-color: lightgreen\">B1B3</td><td style=\"background-color: ghostwhite\">B1B4</td><td style=\"background-color: ghostwhite\">B1B5</td><td style=\"background-color: ghostwhite\">B1B6</td><td style=\"background-color: ghostwhite\">B1B7</td><td style=\"background-color: ghostwhite\">B1G1</td><td style=\"background-color: ghostwhite\">B1G2</td><td style=\"background-color: ghostwhite\">B1G3</td><td style=\"background-color: ghostwhite\">B1G4</td><td style=\"background-color: ghostwhite\">B1G5</td><td style=\"background-color: ghostwhite\">B1G6</td><td style=\"background-color: ghostwhite\">B1G7</td></tr><tr><td style=\"background-color: ghostwhite\">B2B1</td><td style=\"background-color: ghostwhite\">B2B2</td><td style=\"background-color: lightgreen\">B2B3</td><td style=\"background-color: ghostwhite\">B2B4</td><td style=\"background-color: ghostwhite\">B2B5</td><td style=\"background-color: ghostwhite\">B2B6</td><td style=\"background-color: ghostwhite\">B2B7</td><td style=\"background-color: ghostwhite\">B2G1</td><td style=\"background-color: ghostwhite\">B2G2</td><td style=\"background-color: ghostwhite\">B2G3</td><td style=\"background-color: ghostwhite\">B2G4</td><td style=\"background-color: ghostwhite\">B2G5</td><td style=\"background-color: ghostwhite\">B2G6</td><td style=\"background-color: ghostwhite\">B2G7</td></tr><tr><td style=\"background-color: lightgreen\">B3B1</td><td style=\"background-color: lightgreen\">B3B2</td><td style=\"background-color: lightgreen\">B3B3</td><td style=\"background-color: lightgreen\">B3B4</td><td style=\"background-color: lightgreen\">B3B5</td><td style=\"background-color: lightgreen\">B3B6</td><td style=\"background-color: lightgreen\">B3B7</td><td style=\"background-color: yellow\">B3G1</td><td style=\"background-color: yellow\">B3G2</td><td style=\"background-color: yellow\">B3G3</td><td 
style=\"background-color: yellow\">B3G4</td><td style=\"background-color: yellow\">B3G5</td><td style=\"background-color: yellow\">B3G6</td><td style=\"background-color: yellow\">B3G7</td></tr><tr><td style=\"background-color: ghostwhite\">B4B1</td><td style=\"background-color: ghostwhite\">B4B2</td><td style=\"background-color: lightgreen\">B4B3</td><td style=\"background-color: ghostwhite\">B4B4</td><td style=\"background-color: ghostwhite\">B4B5</td><td style=\"background-color: ghostwhite\">B4B6</td><td style=\"background-color: ghostwhite\">B4B7</td><td style=\"background-color: ghostwhite\">B4G1</td><td style=\"background-color: ghostwhite\">B4G2</td><td style=\"background-color: ghostwhite\">B4G3</td><td style=\"background-color: ghostwhite\">B4G4</td><td style=\"background-color: ghostwhite\">B4G5</td><td style=\"background-color: ghostwhite\">B4G6</td><td style=\"background-color: ghostwhite\">B4G7</td></tr><tr><td style=\"background-color: ghostwhite\">B5B1</td><td style=\"background-color: ghostwhite\">B5B2</td><td style=\"background-color: lightgreen\">B5B3</td><td style=\"background-color: ghostwhite\">B5B4</td><td style=\"background-color: ghostwhite\">B5B5</td><td style=\"background-color: ghostwhite\">B5B6</td><td style=\"background-color: ghostwhite\">B5B7</td><td style=\"background-color: ghostwhite\">B5G1</td><td style=\"background-color: ghostwhite\">B5G2</td><td style=\"background-color: ghostwhite\">B5G3</td><td style=\"background-color: ghostwhite\">B5G4</td><td style=\"background-color: ghostwhite\">B5G5</td><td style=\"background-color: ghostwhite\">B5G6</td><td style=\"background-color: ghostwhite\">B5G7</td></tr><tr><td style=\"background-color: ghostwhite\">B6B1</td><td style=\"background-color: ghostwhite\">B6B2</td><td style=\"background-color: lightgreen\">B6B3</td><td style=\"background-color: ghostwhite\">B6B4</td><td style=\"background-color: ghostwhite\">B6B5</td><td style=\"background-color: ghostwhite\">B6B6</td><td 
style=\"background-color: ghostwhite\">B6B7</td><td style=\"background-color: ghostwhite\">B6G1</td><td style=\"background-color: ghostwhite\">B6G2</td><td style=\"background-color: ghostwhite\">B6G3</td><td style=\"background-color: ghostwhite\">B6G4</td><td style=\"background-color: ghostwhite\">B6G5</td><td style=\"background-color: ghostwhite\">B6G6</td><td style=\"background-color: ghostwhite\">B6G7</td></tr><tr><td style=\"background-color: ghostwhite\">B7B1</td><td style=\"background-color: ghostwhite\">B7B2</td><td style=\"background-color: lightgreen\">B7B3</td><td style=\"background-color: ghostwhite\">B7B4</td><td style=\"background-color: ghostwhite\">B7B5</td><td style=\"background-color: ghostwhite\">B7B6</td><td style=\"background-color: ghostwhite\">B7B7</td><td style=\"background-color: ghostwhite\">B7G1</td><td style=\"background-color: ghostwhite\">B7G2</td><td style=\"background-color: ghostwhite\">B7G3</td><td style=\"background-color: ghostwhite\">B7G4</td><td style=\"background-color: ghostwhite\">B7G5</td><td style=\"background-color: ghostwhite\">B7G6</td><td style=\"background-color: ghostwhite\">B7G7</td></tr><tr><td style=\"background-color: ghostwhite\">G1B1</td><td style=\"background-color: ghostwhite\">G1B2</td><td style=\"background-color: yellow\">G1B3</td><td style=\"background-color: ghostwhite\">G1B4</td><td style=\"background-color: ghostwhite\">G1B5</td><td style=\"background-color: ghostwhite\">G1B6</td><td style=\"background-color: ghostwhite\">G1B7</td><td style=\"background-color: ghostwhite\">G1G1</td><td style=\"background-color: ghostwhite\">G1G2</td><td style=\"background-color: ghostwhite\">G1G3</td><td style=\"background-color: ghostwhite\">G1G4</td><td style=\"background-color: ghostwhite\">G1G5</td><td style=\"background-color: ghostwhite\">G1G6</td><td style=\"background-color: ghostwhite\">G1G7</td></tr><tr><td style=\"background-color: ghostwhite\">G2B1</td><td style=\"background-color: ghostwhite\">G2B2</td><td 
style=\"background-color: yellow\">G2B3</td><td style=\"background-color: ghostwhite\">G2B4</td><td style=\"background-color: ghostwhite\">G2B5</td><td style=\"background-color: ghostwhite\">G2B6</td><td style=\"background-color: ghostwhite\">G2B7</td><td style=\"background-color: ghostwhite\">G2G1</td><td style=\"background-color: ghostwhite\">G2G2</td><td style=\"background-color: ghostwhite\">G2G3</td><td style=\"background-color: ghostwhite\">G2G4</td><td style=\"background-color: ghostwhite\">G2G5</td><td style=\"background-color: ghostwhite\">G2G6</td><td style=\"background-color: ghostwhite\">G2G7</td></tr><tr><td style=\"background-color: ghostwhite\">G3B1</td><td style=\"background-color: ghostwhite\">G3B2</td><td style=\"background-color: yellow\">G3B3</td><td style=\"background-color: ghostwhite\">G3B4</td><td style=\"background-color: ghostwhite\">G3B5</td><td style=\"background-color: ghostwhite\">G3B6</td><td style=\"background-color: ghostwhite\">G3B7</td><td style=\"background-color: ghostwhite\">G3G1</td><td style=\"background-color: ghostwhite\">G3G2</td><td style=\"background-color: ghostwhite\">G3G3</td><td style=\"background-color: ghostwhite\">G3G4</td><td style=\"background-color: ghostwhite\">G3G5</td><td style=\"background-color: ghostwhite\">G3G6</td><td style=\"background-color: ghostwhite\">G3G7</td></tr><tr><td style=\"background-color: ghostwhite\">G4B1</td><td style=\"background-color: ghostwhite\">G4B2</td><td style=\"background-color: yellow\">G4B3</td><td style=\"background-color: ghostwhite\">G4B4</td><td style=\"background-color: ghostwhite\">G4B5</td><td style=\"background-color: ghostwhite\">G4B6</td><td style=\"background-color: ghostwhite\">G4B7</td><td style=\"background-color: ghostwhite\">G4G1</td><td style=\"background-color: ghostwhite\">G4G2</td><td style=\"background-color: ghostwhite\">G4G3</td><td style=\"background-color: ghostwhite\">G4G4</td><td style=\"background-color: ghostwhite\">G4G5</td><td 
style=\"background-color: ghostwhite\">G4G6</td><td style=\"background-color: ghostwhite\">G4G7</td></tr><tr><td style=\"background-color: ghostwhite\">G5B1</td><td style=\"background-color: ghostwhite\">G5B2</td><td style=\"background-color: yellow\">G5B3</td><td style=\"background-color: ghostwhite\">G5B4</td><td style=\"background-color: ghostwhite\">G5B5</td><td style=\"background-color: ghostwhite\">G5B6</td><td style=\"background-color: ghostwhite\">G5B7</td><td style=\"background-color: ghostwhite\">G5G1</td><td style=\"background-color: ghostwhite\">G5G2</td><td style=\"background-color: ghostwhite\">G5G3</td><td style=\"background-color: ghostwhite\">G5G4</td><td style=\"background-color: ghostwhite\">G5G5</td><td style=\"background-color: ghostwhite\">G5G6</td><td style=\"background-color: ghostwhite\">G5G7</td></tr><tr><td style=\"background-color: ghostwhite\">G6B1</td><td style=\"background-color: ghostwhite\">G6B2</td><td style=\"background-color: yellow\">G6B3</td><td style=\"background-color: ghostwhite\">G6B4</td><td style=\"background-color: ghostwhite\">G6B5</td><td style=\"background-color: ghostwhite\">G6B6</td><td style=\"background-color: ghostwhite\">G6B7</td><td style=\"background-color: ghostwhite\">G6G1</td><td style=\"background-color: ghostwhite\">G6G2</td><td style=\"background-color: ghostwhite\">G6G3</td><td style=\"background-color: ghostwhite\">G6G4</td><td style=\"background-color: ghostwhite\">G6G5</td><td style=\"background-color: ghostwhite\">G6G6</td><td style=\"background-color: ghostwhite\">G6G7</td></tr><tr><td style=\"background-color: ghostwhite\">G7B1</td><td style=\"background-color: ghostwhite\">G7B2</td><td style=\"background-color: yellow\">G7B3</td><td style=\"background-color: ghostwhite\">G7B4</td><td style=\"background-color: ghostwhite\">G7B5</td><td style=\"background-color: ghostwhite\">G7B6</td><td style=\"background-color: ghostwhite\">G7B7</td><td style=\"background-color: ghostwhite\">G7G1</td><td 
style=\"background-color: ghostwhite\">G7G2</td><td style=\"background-color: ghostwhite\">G7G3</td><td style=\"background-color: ghostwhite\">G7G4</td><td style=\"background-color: ghostwhite\">G7G5</td><td style=\"background-color: ghostwhite\">G7G6</td><td style=\"background-color: ghostwhite\">G7G7</td></tr></table>13//27"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Problem 3\n",
"table(S3, 2, two_boys, at_least_one_boy_tues)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We see there are 27 relevant outcomes, of which 13 are green. So 13/27 really does seem to be the right answer. This picture also gives us a way to think about why the answer is not 1/3. Think of the yellow-plus-green area as a horizontal stripe and a vertical stripe, with an overlap. Each stripe is half yellow and half green, so if there were no overlap at all, the probability of green would be 1/2. When each stripe takes up half the sample space and the overlap is maximal, the probability is 1/3. And in the Problem 3 table, where the overlap is small, the probability is close to 1/2 (but slightly smaller).\n",
"\n",
"One way to look at it is that if I tell you very specific information (such as a boy born on Tuesday), it is unlikely that this applies to both children, so we have smaller overlap and a probability closer to 1/2, but if I give you broad information (a boy), this is more likely to apply to either child, resulting in a larger overlap, and a probability closer to 1/3.\n",
"\n",
"You can read some more discussions of the problem by (in alphabetical order) \n",
"[Alex Bellos](https://www.newscientist.com/article/dn18950-magic-numbers-a-meeting-of-mathemagical-tricksters?full=true),\n",
"[Alexander Bogomolny](http://www.cut-the-knot.org/Probability/BearBornOnTuesday.shtml),\n",
"[Andrew Gelman](http://andrewgelman.com/2010/05/27/hype_about_cond/),\n",
"[David Bigelow](https://web.viu.ca/bigelow2/Problem%201127%20Solution.pdf),\n",
"[Julie Rehmeyer](https://www.sciencenews.org/article/when-intuition-and-math-probably-look-wrong),\n",
"[Keith Devlin](https://www.maa.org/external_archive/devlin/devlin_05_10.html),\n",
"[Peter Lynch](http://mathsci.ucd.ie/~plynch/Publications/BIMS-TwoChildParadox.pdf),\n",
"[Tanya Khovanova](http://arxiv.org/pdf/1102.0173v1.pdf),\n",
"and\n",
"[Wendy Taylor &amp; Kaye Stacey](http://www.aamt.edu.au/Journals/Sample-articles/amt70_2_taylor.pdf)."
]
},
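{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The stripe picture can be turned into a formula. As a minimal sketch (the helper name `p_two_boys` is hypothetical and not part of the original notebook), assume each child is independently a boy or a girl with one of `k` equally likely values of some property (`k = 7` for days of the week), and that we are told at least one child is a boy with one particular value. Each stripe then holds `2k` outcomes, of which `k` are two-boy outcomes, and the two stripes overlap in exactly one outcome. That gives `2k + 2k - 1 = 4k - 1` relevant outcomes, of which `k + k - 1 = 2k - 1` are green:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"# Hypothetical helper, not from the original notebook.\n",
"# Probability of two boys, given at least one boy with a specific 1-in-k property.\n",
"p_two_boys(k) = (2k - 1) // (4k - 1)\n",
"\n",
"# k = 1 is Problem 2 (1//3); k = 7 is Problem 3 (13//27); larger k approaches 1//2.\n",
"[p_two_boys(k) for k in (1, 7, 365)]"
]
},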
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# The Sleeping Beauty Paradox\n",
"\n",
"The [Sleeping Beauty Paradox](https://en.wikipedia.org/wiki/Sleeping_Beauty_problem) is another tricky one:\n",
"\n",
">Sleeping Beauty volunteers to undergo the following experiment and is told all of the following details: On Sunday she will be put to sleep. Once or twice, during the experiment, Beauty will be awakened, interviewed, and put back to sleep with an amnesia-inducing drug that makes her forget that awakening. A fair coin will be tossed to determine which experimental procedure to undertake: if the coin comes up heads, Beauty will be awakened and interviewed on Monday only. If the coin comes up tails, she will be awakened and interviewed on Monday and Tuesday. In either case, she will be awakened on Wednesday without interview and the experiment ends.\n",
"Any time Sleeping Beauty is awakened and interviewed, she is asked, \"What is your belief now for the proposition that the coin landed heads?\"\n",
"\n",
"What should Sleeping Beauty say when she is interviewed? First, she should define the sample space. She could use the notation `\"heads/Monday/interviewed\"` to mean the outcome where the coin flip was heads, it is Monday, and she is interviewed. So it seems there are 4 equiprobable outcomes:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"B = [\"heads/Monday/interviewed\", \"heads/Tuesday/sleep\",\n",
" \"tails/Monday/interviewed\", \"tails/Tuesday/interviewed\"];"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"At this point, you're probably expecting me to define predicates, like this:\n",
"\n",
" heads(outcome)= contains(outcome,\"heads\")\n",
" interviewed(outcome) = contains(outcome,\"interviewed\")\n",
" \n",
"We've seen a lot of predicates like this. I think it is time to heed the \"[don't repeat yourself](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)\" principle, so I will define a predicate-defining function:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"Return a predicate that is true of all outcomes that have 'property' as a substring.\"\n",
"T(property)=(outcome)-> contains(outcome,property);"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now we can get the answer:"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//3"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"heads = T(\"heads\")\n",
"interviewed = T(\"interviewed\")\n",
"\n",
"P(heads, such_that(interviewed, B))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"(Note I could have done that in one line instead of three: `P(T(\"heads\"), such_that(T(\"interviewed\"), B))`, but that's kind of ugly.)\n",
"\n",
"This problem is considered a paradox because there are people who argue that the answer should be 1/2, not 1/3. I admit I'm having difficulty coming up with a sample space that supports the \"halfer\" position. I do know of a question that has the answer 1/2:"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(heads, B) "
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"But that seems like the wrong question; we want the probability of heads given that Sleeping Beauty was interviewed, not the unconditional probability. \n",
"\n",
"The \"halfers\" argue that before Sleeping Beauty goes to sleep, her unconditional probability for heads should be 1/2. When she is interviewed, she doesn't know anything more than before she went to sleep, so nothing has changed, so the probability of heads should still be 1/2. I find two flaws with this argument. First, if you want to convince me, show me a sample space; don't just make philosophical arguments. (Although a philosophical argument can be employed to help you define the right sample space.) Second, while I agree that before she goes to sleep, Beauty's *unconditional* probability for heads should be 1/2, I would say that both before she goes to sleep and when she is awakened, her *conditional* probability of heads *given that she is being interviewed* should be 1/3, as shown by the sample space."
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# The Monty Hall Paradox\n",
"\n",
"[This](https://en.wikipedia.org/wiki/Monty_Hall_problem) is one of the most famous probability paradoxes. It can be stated as follows:\n",
"\n",
"> Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, \"Do you want to pick door No. 2?\" Is it to your advantage to switch your choice?\n",
"\n",
"Much has been written about this problem, but to solve it all we have to do is be careful about how we understand the problem, and about defining our sample space. I will define outcomes of the form `\"Car1<Pick1/Open2\"`, which means\n",
"* \"Car1\": First the car is randomly placed behind door 1.\n",
"* \"<\": The host randomly commits to the strategy of opening the lowest-numbered allowable door. A door is allowable if it does not contain the car and was not picked by the contestant. Alternatively, the host could have chosen to open the highest-numbered allowable door (\">\").\n",
"(If you don't like the idea of the host commiting to a strategy, try this: the host mentally flips a coin to decide which door to open; include `\"/Head/\"` or `\"/Tail/\"` instead of `\"<\"` or `\">\"` in the description of the outcome.)\n",
"* `Pick1`: The contestant picks door 1. Our sample space will only consider cases where the contestant picks door 1, but by symmetry, the same arguments could be used if the contestant picked door 2 or 3.\n",
"* `Open2`: After hearing the contestant's choice, and following the strategy, the host opens a door; in this case door 2.\n",
"\n",
"We can see that the sample space has 6 equiprobable outcomes:"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"M = [\"Car1<Pick1/Open2\", \"Car1>Pick1/Open3\",\n",
" \"Car2<Pick1/Open3\", \"Car2>Pick1/Open3\",\n",
" \"Car3<Pick1/Open2\", \"Car3>Pick1/Open2\"];"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now, assuming the contestant picks door 1 and the host opens door 3, what is the probability that the car is behind door 1? Or door 2?"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//3"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(T(\"Car1\"), such_that(T(\"Open3\"), M))"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"2//3"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(T(\"Car2\"), such_that(T(\"Open3\"), M))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We see that the strategy of **switching** from door 1 to door 2 will win the car 2/3 of the time, whereas the strategy of **sticking** with the original pick wins the car only 1/3 of the time. So if you like cars more than goats, you should switch. But don't feel bad if you got this one wrong; it turns out that Monty Hall himself, who opened numerous doors while hosting *Let's Make a Deal* for 13 years, didn't know the answer either, as revealed in this letter from Monty to statistician Lawrence Denenberg, when Denenberg asked for permission to use the problem in his textbook:\n",
"<img src=\"http://norvig.com/monty-hall-letter.jpg\">\n",
"\n",
"If you were Denenberg, how would you answer Monty, in non-mathematical terms. I would try something like this:\n",
"\n",
"> When the contestant makes her initial pick, she has 1/3 chance of picking the car, and there is a 2/3 chance the car is behind one of the other doors. That's still true after you open a door, but now the 2/3 chance for *either* other door becomes concentrated as 2/3 behind *one* other door, so the contestant should switch.\n",
"\n",
"But that argument was not persuasive to everyone. [Marilyn vos Savant](http://marilynvossavant.com/game-show-problem/) reports that many of her readers (including, she is pleased to point out, many Ph.D.s) still insist the answer is that it doesn't matter if the contestant switches; the odds are 1/2 either way. Let's try to discover what problem and what sample space those people are dealing with. Perhaps they are reasoning like this:\n",
"\n",
"They define outcomes of the form `\"Car1/Pick1/Open2/Goat\"`, which means:\n",
"* `Car1`: First the car is randomly placed behind door 1.\n",
"* `Pick1`: The contestant picks door 1. \n",
"* `Open2`: The host opens one of the two other doors at random (so the host might open the door with the car).\n",
"* `Goat`: We observe there is a goat behind door 2.\n",
"\n",
"Under this interpretation, the sample space is:"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"M2 = [\"Car1/Pick1/Open2/Goat\", \"Car1/Pick1/Open3/Goat\",\n",
" \"Car2/Pick1/Open2/Car\", \"Car2/Pick1/Open3/Goat\",\n",
" \"Car3/Pick1/Open2/Goat\", \"Car3/Pick1/Open3/Car\"];"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"And we can calculate the probability of the car being behind each door, given that the contestant picks door 1 and the host opens door 3 to reveal a goat:"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(T(\"Car1\"), such_that(T(\"Open3/Goat\"), M2))"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P(T(\"Car2\"), such_that(T(\"Open3/Goat\"), M2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"So we see that under this interpretation it doesn't matter if you switch or not. \n",
"\n",
"Is this a valid interpretation? I agree that the wording of the problem can be seen as being ambiguous. However, this interpretation has a serious problem: in all the history of *Let's Make a Deal*, it was never the case that the host opened up a door with the car (or other grand prize). This strongly suggests (but does not quite prove) that `M` and not `M2` is the correct sample space."
]
},
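{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"As a cross-check on the hand-typed sample space `M`, the six outcomes can be generated from the rules of the game. The following is a minimal sketch, not part of Norvig's original notebook: the names `monty_outcomes`, `given_open3`, `p_car1`, and `p_car2` are hypothetical, and outcomes are represented as `(car, strategy, opened)` tuples rather than strings so that the block does not depend on `T`, `P`, or `such_that`.\n",
"\n",
"```julia\n",
"# Enumerate (car, strategy, opened) for a contestant who always picks door `pick`.\n",
"# Assumption: the host never opens the picked door or the car door (interpretation M).\n",
"function monty_outcomes(pick=1)\n",
"    doors = [1, 2, 3]\n",
"    outcomes = []\n",
"    for car in doors, strategy in ('<', '>')\n",
"        allowed = filter(d -> d != car && d != pick, doors)\n",
"        # '<' opens the lowest-numbered allowable door, '>' the highest-numbered one\n",
"        opened = strategy == '<' ? minimum(allowed) : maximum(allowed)\n",
"        push!(outcomes, (car, strategy, opened))\n",
"    end\n",
"    return outcomes\n",
"end\n",
"\n",
"outcomes = monty_outcomes()\n",
"\n",
"# Condition on the host opening door 3, then count where the car is.\n",
"given_open3 = filter(o -> o[3] == 3, outcomes)\n",
"p_car1 = count(o -> o[1] == 1, given_open3) // length(given_open3)\n",
"p_car2 = count(o -> o[1] == 2, given_open3) // length(given_open3)\n",
"```\n",
"\n",
"Generating the outcomes this way makes the asymmetry visible: when the car is behind door 2, both host strategies are forced to open door 3, so two of the three outcomes consistent with `Open3` favor switching."
]
},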
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# Non-Equiprobable Outcomes: Probability Distributions\n",
"\n",
"So far, we have made the assumption that every outcome in a sample space is equally likely. In real life, the probability of a child being a girl (or boy) is not exactly 1/2 and the sex of a second child is not completely independent from the first. An [article](http://people.kzoo.edu/barth/math105/moreboys.pdf) gives the following counts for two-child families in Denmark:\n",
"\n",
" GG: 121801 GB: 126840\n",
" BG: 127123 BB: 135138\n",
" \n",
"We call this mapping from outcomes to their frequencies a *distribution*. Here are two more definitions:\n",
"\n",
"* [Distribution](http://mathworld.wolfram.com/StatisticalDistribution.html): An assignment of frequencies to every outcome in a sample space. \n",
"\n",
"* [Probability Distribution](https://en.wikipedia.org/wiki/Probability_distribution): A distribution that has been *normalized* so that the sum of the frequencies is 1 (and each frequency is between 0 and 1).\n",
"\n",
"We can implement distributions as `Dict`s with this code:"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"ProbDist=Dict\n",
"\n",
"\"Probability Distribution\"\n",
"probdist(;entries...)= return normalize(Dict(entries))\n",
"\n",
"\"Given a distribution dict, return a version where the values are normalized to sum to 1.\"\n",
"function normalize(dist)\n",
" total = sum(values(dist))\n",
" return Dict(string(e) => dist[e] / total \n",
" for e in keys(dist))\n",
"end;"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"Dict{String,Float64} with 4 entries:\n",
" \"BB\" => 0.264509\n",
" \"BG\" => 0.248821\n",
" \"GG\" => 0.238404\n",
" \"GB\" => 0.248267"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"DK = probdist(GG=121801, GB=126840,\n",
" BG=127123, BB=135138)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now we need to modify the functions `P` and `such_that` to accept either a sample space or a probability distribution:"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"\"\"The probability of an event, given a sample space of equiprobable outcomes. \n",
" event: a collection of outcomes, or a predicate that is true of outcomes in the event. \n",
" space: a set of outcomes or a probability distribution of {outcome: frequency} pairs.\"\"\"\n",
"function P(event::Function, space::ProbDist)\n",
" event_ = such_that(event, space)\n",
" return sum([space[v] for v in collect(filter(e-> event(e), keys(space)))])\n",
"end\n",
"function P(event, space::ProbDist)\n",
" return sum([space[v] for v in collect(filter(e-> e in event, keys(space)))])\n",
"end\n",
"\n",
"\"\"\"The elements in the space for which the predicate is true.\n",
" If space is a set, return a subset {element,...};\n",
" if space is a dict, return a sub-dict of {element: frequency,...} pairs;\n",
" in both cases only with elements where predicate(element) is true.\"\"\"\n",
"function such_that(predicate::Function, space::ProbDist)\n",
" return normalize(Dict(e=>space[e] for e in filter(predicate,keys(space))))\n",
"end;"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"First, let's verify that it still works on the old problems where the sample space is a set:"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//2"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Problem 1 in S\n",
"P(two_boys, such_that(older_is_a_boy, S))"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"1//3"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Problem 2 in S\n",
"P(two_boys, such_that(at_least_one_boy, S))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now let's see if the definitions work with the probability distribution `DK`. We expect a little over 1/2 for Problem 1, and a little over 1/3 for problem 2:"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.5152805792702689"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Problem 1 in DK\n",
"P(two_boys, such_that(older_is_a_boy, DK))"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.34730828242538575"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Problem 2 in DK\n",
"P(two_boys, such_that(at_least_one_boy, DK))"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"It all looks good. Now let's try a new problem that would not have been feasible with a set-based sample space.\n",
"\n",
"## Problem 4. One is a boy born on Feb. 29. What is the probability both are boys?\n",
"\n",
"* **Problem 4.** I have two children. At least one of them is a boy born on leap day, February 29. What is the probability that both children are boys? Assume that 51.5% of births are boys and that birth days are distributed evenly across the 4&times;365 + 1 days in a 4-year cycle.\n",
"\n",
"We will use the notation `GLBN` to mean an older girl born on leap day and a younger boy born on a non-leap day. We'll define a helper function, `joint`, that creates the joint probability distribution of two probability distributions:"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"\"\"The joint distribution of two independent probability distributions. \n",
" Result is all entries of the form {a+b: P(a)*P(b)}\"\"\"\n",
"joint(A, B)=Dict(a * b => A[a] * B[b] for a in keys(A),b in keys(B));"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"Dict{String,Float64} with 16 entries:\n",
" \"BLGN\" => 0.000170845\n",
" \"GNGN\" => 0.234903\n",
" \"GNBN\" => 0.249433\n",
" \"BLBL\" => 1.24255e-7\n",
" \"GNGL\" => 0.000160893\n",
" \"BNGN\" => 0.249433\n",
" \"BNBL\" => 0.000181412\n",
" \"BLGL\" => 1.17017e-7\n",
" \"BLBN\" => 0.000181412\n",
" \"GLBN\" => 0.000170845\n",
" \"GLGN\" => 0.000160893\n",
" \"GLBL\" => 1.17017e-7\n",
" \"BNBN\" => 0.264862\n",
" \"BNGL\" => 0.000170845\n",
" \"GLGL\" => 1.102e-7\n",
" \"GNBL\" => 0.000170845"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sexes = probdist(B=51.5, G=48.5) # Probability distribution over sexes\n",
"days = probdist(L=1, N=4*365) # Probability distribution over Leap days and Non-leap days\n",
"child = joint(sexes, days) # Probability distribution for one child family\n",
"S4 = joint(child, child); # Probability distribution for two-child family"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Let's check out these last two probability distributions:"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"Dict{String,Float64} with 4 entries:\n",
" \"BN\" => 0.514648\n",
" \"BL\" => 0.000352498\n",
" \"GL\" => 0.000331964\n",
" \"GN\" => 0.484668"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"child"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"Dict{String,Float64} with 16 entries:\n",
" \"BLGN\" => 0.000170845\n",
" \"GNGN\" => 0.234903\n",
" \"GNBN\" => 0.249433\n",
" \"BLBL\" => 1.24255e-7\n",
" \"GNGL\" => 0.000160893\n",
" \"BNGN\" => 0.249433\n",
" \"BNBL\" => 0.000181412\n",
" \"BLGL\" => 1.17017e-7\n",
" \"BLBN\" => 0.000181412\n",
" \"GLBN\" => 0.000170845\n",
" \"GLGN\" => 0.000160893\n",
" \"GLBL\" => 1.17017e-7\n",
" \"BNBN\" => 0.264862\n",
" \"BNGL\" => 0.000170845\n",
" \"GLGL\" => 1.102e-7\n",
" \"GNBL\" => 0.000170845"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"S4"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"And we can solve the problem. Since \"boy born on a leap day\" applies to so few children, we expect the probability of two boys to be just ever so slightly below the baseline rate for boys, 51.5%."
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.5149145040963757"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Problem 4\n",
"\n",
"boy_on_leap_day = T(\"BL\")\n",
"\n",
"P(two_boys, such_that(boy_on_leap_day, S4))"
]
},
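{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The value computed from `S4` can also be checked in closed form. This is a sketch under the problem's stated assumptions (the two children are independent, a child is a boy with probability $b = 0.515$, and is born on leap day with probability $\\ell = 1/1461$); the symbols $b$ and $\\ell$ are introduced here for illustration and do not appear in the code.\n",
"\n",
"$$P(\\text{at least one BL}) = 1 - (1 - b\\ell)^2 = b\\ell\\,(2 - b\\ell)$$\n",
"\n",
"$$P(\\text{two boys and at least one BL}) = b^2\\left(1 - (1 - \\ell)^2\\right) = b^2\\ell\\,(2 - \\ell)$$\n",
"\n",
"$$P(\\text{two boys} \\mid \\text{at least one BL}) = \\frac{b\\,(2 - \\ell)}{2 - b\\ell} \\approx 0.5149$$\n",
"\n",
"Because $\\ell$ is tiny, the fraction multiplying $b$ is just below 1, which is why the answer sits only slightly under 51.5%."
]
},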
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# Simulation\n",
"\n",
"Sometimes it is inconvenient to explicitly define a sample space. Perhaps the sample space is infinite, or perhaps it is just very large and complicated, and we feel more confident in writing a program to *simulate* the situation, rather than one to *enumerate* the complete sample space. *Sampling* from the simulation\n",
"can give an accurate estimate of the probability.\n",
"\n",
"For example, here's a simulation of the Monty Hall problem. Given a boolean input saying whether the contestent wants to switch doors or not, the function `monty(switch)` returns True iff the contestant picks the car."
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"\"\"Simulate this sequence of events:\n",
"- The host randomly chooses a door for the 'car'\n",
"- The contestant randomly makes a 'pick' of one of the doors\n",
"- The host randomly selects a valid door to be 'opened.' \n",
"- If 'switch' is True, contestant changes 'pick' to the other door\n",
"Return true if the pick is the door with the car.\"\"\"\n",
"function monty(switch=true)\n",
" doors = [1, 2, 3]\n",
" car = rand(doors)\n",
" pick = rand(doors)\n",
" opened = rand(filter(d->d != car && d != pick,doors))\n",
" pick = switch? filter(d->d!=pick && d!=opened,doors)[1]: pick \n",
" pick==car\n",
"end;"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We can confirm that the contestant wins about 2/3 of the time with the `switch` strategy, and only wins about 1/3 of the time when not switching:"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.66662"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"count(_->monty(true),1:100000)/100000"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.33176"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"count(_->monty(false),1:100000)/100000"
]
},
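{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The same approach can be applied to the alternative interpretation `M2`, in which the host opens one of the two unpicked doors at random. The following is a hedged sketch rather than part of the original notebook: `monty_random_host`, `switch_rate`, and `stick_rate` are hypothetical names, and the choice to discard games in which the host reveals the car is an assumption that mirrors conditioning on `Open3/Goat`.\n",
"\n",
"```julia\n",
"# Hypothetical variant of `monty` for the M2 interpretation.\n",
"# Assumption: the host opens a random unpicked door; games where that door\n",
"# hides the car are thrown away and replayed, so we condition on seeing a goat.\n",
"function monty_random_host(switch=true)\n",
"    doors = [1, 2, 3]\n",
"    while true\n",
"        car = rand(doors)\n",
"        pick = rand(doors)\n",
"        opened = rand(filter(d -> d != pick, doors))\n",
"        if opened != car   # keep only games where a goat is revealed\n",
"            final = switch ? filter(d -> d != pick && d != opened, doors)[1] : pick\n",
"            return final == car\n",
"        end\n",
"    end\n",
"end\n",
"\n",
"n = 100000\n",
"switch_rate = count(_ -> monty_random_host(true), 1:n) / n\n",
"stick_rate = count(_ -> monty_random_host(false), 1:n) / n\n",
"```\n",
"\n",
"Under this reading both rates should come out close to 1/2, in line with the enumeration of `M2`: the host's ignorance is what removes the advantage of switching."
]
},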
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# Simulating Monopoly\n",
"\n",
"Here's an example where simulation seems to be much easier than enumeration: [problem 84](https://projecteuler.net/problem=84) from the excellent [Project Euler](https://projecteuler.net) asks the reader to simulate the game of Monopoly for a single player, and report on the probability of the player ending a roll on each of the squares on the board. The simulation takes into account die rolls, chance and community chest cards, and going to jail (from the \"go to jail\" space, from a card, or from rolling doubles three times in a row). The simulation does not take into account anything about buying or selling properties or exchanging money or winning or losing the game."
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"# The board: a list of the names of the 40 squares\n",
"board = \"\"\"GO A1 CC1 A2 T1 R1 B1 CH1 B2 B3\n",
" JAIL C1 U1 C2 C3 R2 D1 CC2 D2 D3 \n",
" FP E1 CH2 E2 E3 R3 F1 F2 U2 F3 \n",
" G2J G1 G2 CC3 G3 R4 CH3 H1 T2 H2\"\"\" |> split\n",
"\n",
"# Lists of 16 community chest and 16 chance cards. See do_card.\n",
"CC = append!([\"GO\", \"JAIL\"], repeat([\"?\"],outer=[14]))\n",
"\n",
"CH = append!(\"GO JAIL C1 E3 H2 R1 R R U -3\" |> split, repeat([\"?\"],outer=[6]))\n",
"\n",
"\"\"\"Simulate given number of steps of monopoly game, \n",
"yielding the name of the current square after each step.\"\"\"\n",
"function monopoly(steps)\n",
" global here\n",
" here = 1\n",
" CC_deck = shuffle(CC)\n",
" CH_deck = shuffle(CH)\n",
" doubles = 0\n",
" function monopolyTask()\n",
" for _=1:steps\n",
" d1, d2 = rand(1:6), rand(1:6)\n",
" goto(here + d1 + d2)\n",
" doubles = (d1 == d2) ? (doubles + 1): 0\n",
" if doubles == 3 || board[here] == \"G2J\" \n",
" goto(\"JAIL\")\n",
" elseif startswith(board[here],\"CC\")\n",
" do_card(CC_deck)\n",
" elseif startswith(board[here],\"CH\")\n",
" do_card(CH_deck)\n",
" end\n",
" produce(board[here])\n",
" end\n",
" end\n",
" Task(monopolyTask)\n",
"end\n",
"\n",
"\"Go to destination square (a square number). Update 'here'.\"\n",
"function goto(square::Int)\n",
" global here\n",
" here = (square-1) % length(board)+1\n",
"end\n",
"\n",
"\"Go to destination square (a square name). Update 'here'.\"\n",
"function goto(square::AbstractString)\n",
" global here\n",
" here = findfirst(board,square)\n",
"end\n",
"\n",
"\"Take the top card from deck and do what it says.\"\n",
"function do_card(deck)\n",
" global here\n",
" card = pop!(deck) # The top card\n",
" unshift!(deck,card) # Move top card to bottom of deck\n",
" if card == \"R\"|| card == \"U\" \n",
" while !startswith(board[here],card)\n",
" goto(here + 1) # Advance to next railroad or utility\n",
" end\n",
" elseif card == \"-3\"\n",
" goto(here - 3) # Go back 3 spaces\n",
" elseif card != \"?\"\n",
" goto(card) # Go to destination named on card\n",
" end\n",
"end; "
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Let's run the simulation for a million dice rolls, and see a histogram and a list of the counts for each square:"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"results = collect(monopoly(10^6));"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"PyPlot.Figure(PyObject <matplotlib.figure.Figure object at 0x7fea9c85d890>)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: using PyPlot.table in module Main conflicts with an existing identifier.\n"
]
}
],
"source": [
"using PyPlot\n",
"figure(\"hist\",figsize=(5,3))\n",
"ax=axes()\n",
"axis([1 ,40, 0 ,70000])\n",
"ax[:hist]([findfirst(board,name) for name in results], bins=40);"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"40-element Array{Tuple{SubString{String},Int64},1}:\n",
" (\"JAIL\",62333)\n",
" (\"E3\",31957) \n",
" (\"D3\",31013) \n",
" (\"GO\",30922) \n",
" (\"R3\",30544) \n",
" (\"R1\",29830) \n",
" (\"D2\",29461) \n",
" (\"R2\",29388) \n",
" (\"FP\",28824) \n",
" (\"E1\",28342) \n",
" (\"U2\",28033) \n",
" (\"D1\",27555) \n",
" (\"E2\",27289) \n",
" ⋮ \n",
" (\"B2\",23197) \n",
" (\"B3\",22818) \n",
" (\"B1\",22538) \n",
" (\"H1\",21872) \n",
" (\"T2\",21802) \n",
" (\"A2\",21605) \n",
" (\"A1\",21185) \n",
" (\"CC1\",19113) \n",
" (\"CH2\",10766) \n",
" (\"CH3\",8643) \n",
" (\"CH1\",8540) \n",
" (\"G2J\",0) "
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# We have to implement most_common here at it is not found Julia libraries \n",
"using StatsBase\n",
"function most_common(r)\n",
" commonList=collect(zip(board,counts([findfirst(board,name)::Int for name in r])))\n",
" sort!(commonList,by=x->x[2],rev=true)\n",
"end\n",
"most_common(results)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"We can see that `JAIL` is by far the most popular square (at a little over 6%), and that the three least popular squares (around 1%) are the three chance squares, `CH1`, `CH2`, and `CH3` (because 10 of the 16 chance cards send the player away from the square), and of course the \"Go to Jail\" square, square number 30 on the plot, which has a count of 0 because you can't end a turn there. The other squares are pretty evenly distributed at 2% to 3% each."
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# Classy Monopoly\n",
"\n",
"Some people might think that the \"`global here`\" is bad style. One way to eliminate global variables is to pack them up into objects. We can do that by making a `Monopoly` class. My personal preference would be that the \"`global here`\" declarations add less visual clutter, but I present this refactored version for those who prefer it."
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"40-element Array{Tuple{SubString{String},Int64},1}:\n",
" (\"JAIL\",62114)\n",
" (\"E3\",31839) \n",
" (\"D3\",30928) \n",
" (\"R3\",30741) \n",
" (\"GO\",30736) \n",
" (\"D2\",29479) \n",
" (\"R2\",29396) \n",
" (\"R1\",29377) \n",
" (\"FP\",28731) \n",
" (\"E1\",28443) \n",
" (\"U2\",28065) \n",
" (\"D1\",27947) \n",
" (\"F1\",27210) \n",
" ⋮ \n",
" (\"B2\",23395) \n",
" (\"B3\",22924) \n",
" (\"B1\",22522) \n",
" (\"T2\",21958) \n",
" (\"A2\",21734) \n",
" (\"H1\",21693) \n",
" (\"A1\",21443) \n",
" (\"CC1\",18672) \n",
" (\"CH2\",10524) \n",
" (\"CH3\",8975) \n",
" (\"CH1\",8365) \n",
" (\"G2J\",0) "
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type Monopoly\n",
" board \n",
" CC \n",
" CH \n",
" here\n",
" function Monopoly()\n",
" this=new()\n",
" this.board= \"\"\"GO A1 CC1 A2 T1 R1 B1 CH1 B2 B3\n",
" JAIL C1 U1 C2 C3 R2 D1 CC2 D2 D3 \n",
" FP E1 CH2 E2 E3 R3 F1 F2 U2 F3 \n",
" G2J G1 G2 CC3 G3 R4 CH3 H1 T2 H2\"\"\" |> split\n",
" this.CC = append!([\"GO\", \"JAIL\"], repeat([\"?\"],outer=[14]))\n",
" this.CH = append!(\"GO JAIL C1 E3 H2 R1 R R U -3\" |> split, repeat([\"?\"],outer=[6]))\n",
" shuffle!(this.CC)\n",
" shuffle!(this.CH)\n",
" this.here=1\n",
" return this\n",
" end\n",
"end\n",
"\n",
"\"\"\"Simulate given number of steps of monopoly game, incrementing counter \n",
"for current square after each step. Return a list of (square, count) pairs in order.\"\"\"\n",
"function simulate(m::Monopoly,steps)\n",
" counter=zeros(Int,40)\n",
" doubles = 0\n",
" for _=1:steps\n",
" d1, d2 = rand(1:6), rand(1:6)\n",
" goto(m, m.here + d1 + d2)\n",
" doubles = (d1 == d2) ? (doubles + 1): 0\n",
" if doubles == 3 || m.board[m.here] == \"G2J\" \n",
" goto(m, \"JAIL\")\n",
" elseif startswith(m.board[m.here],\"CC\") || startswith(m.board[m.here],\"CH\")\n",
" do_card(m)\n",
" end\n",
" counter[m.here]+=1\n",
" end\n",
" commonList=collect(zip(m.board,counter))\n",
" sort!(commonList,by=x->x[2],rev=true)\n",
"end\n",
"\n",
"\"Go to destination square (a square number). Update 'here'.\"\n",
"function goto(m::Monopoly, square::Int)\n",
" m.here = (square-1) % length(m.board)+1\n",
"end\n",
"\n",
"\"Go to destination square (a square name). Update 'here'.\"\n",
"function goto(m::Monopoly,square::AbstractString)\n",
" m.here = findfirst(m.board,square)\n",
"end\n",
"\n",
"\"Take the top card from deck and do what it says.\"\n",
"function do_card(m::Monopoly)\n",
" deck= startswith(m.board[m.here],\"CC\") ? m.CC: m.CH #Which deck based on location\n",
" card = pop!(deck) # The top card\n",
" unshift!(deck,card) # Move top card to bottom of deck\n",
" if card == \"R\"|| card == \"U\" \n",
" while !startswith(m.board[m.here],card)\n",
" goto(m,m.here + 1) # Advance to next railroad or utility\n",
" end\n",
" elseif card == \"-3\"\n",
" before=m.here\n",
" goto(m,m.here - 3) # Go back 3 spaces\n",
" if m.here==0\n",
" println(before, \" \",card)\n",
" end\n",
" elseif card != \"?\"\n",
" goto(m,card) # Go to destination named on card\n",
" end\n",
" \n",
"end\n",
" \n",
"simulate(Monopoly(),10^6)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"# The St. Petersburg Paradox\n",
"\n",
"One more famous paradox: The [St. Petersburg paradox](https://en.wikipedia.org/wiki/St._Petersburg_paradox) from 1713, named for the home town of the [Bernoulli brothers](http://www.storyofmathematics.com/18th_bernoulli.html):\n",
"\n",
"> A casino offers a game of chance for a single player in which a fair coin is tossed at each stage. The pot starts at 2 dollars and is doubled every time a head appears. The first time a tail appears, the game ends and the player wins whatever is in the pot. Thus the player wins 2 dollars if a tail appears on the first toss, 4 dollars if a head appears on the first toss and a tail on the second, etc. What is the expected value of this game to the player?\n",
"\n",
"To calculate the expected value, we see there is a 1/2 chance of a tail on the first toss (yielding a pot of \\$2) and if not that, a 1/2 &times; 1/2 = 1/4 chance of a tail on the second toss (yielding a pot of \\$4), and so on. So in total, the expected value is:\n",
"\n",
"$$\\frac{1}{2}\\cdot 2 + \\frac{1}{4}\\cdot 4 + \\frac{1}{8}\\cdot 8 + \\frac{1}{16} \\cdot 16 + \\cdots = 1 + 1 + 1 + 1 + \\cdots = \\infty$$\n",
"\n",
"The expected value is infinite! But anyone playing the game would not expect to win an infinite amount; thus the paradox.\n",
"\n",
"## Response 1: Limited Resources\n",
"\n",
"The first major response to the paradox is that the casino's resources are limited. Once you break their bank, they can't pay out any more, and thus the expected return is finite. Let's model that by creating a probability distribution for the problem with a limited bank. We keep doubling the pot and halving the probability of winning the amount in the pot (half because you get the pot on a tail but not a head), until we reach the limit:"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"30-element Array{Tuple{Any,Any},1}:\n",
" (2,0.5) \n",
" (4,0.25) \n",
" (8,0.125) \n",
" (16,0.0625) \n",
" (32,0.03125) \n",
" (64,0.015625) \n",
" (128,0.0078125) \n",
" (256,0.00390625) \n",
" (512,0.00195313) \n",
" (1024,0.000976563) \n",
" (2048,0.000488281) \n",
" (4096,0.000244141) \n",
" (8192,0.00012207) \n",
" ⋮ \n",
" (524288,1.90735e-6) \n",
" (1048576,9.53674e-7) \n",
" (2097152,4.76837e-7) \n",
" (4194304,2.38419e-7) \n",
" (8388608,1.19209e-7) \n",
" (16777216,5.96046e-8) \n",
" (33554432,2.98023e-8) \n",
" (67108864,1.49012e-8) \n",
" (134217728,7.45058e-9) \n",
" (268435456,3.72529e-9) \n",
" (536870912,1.86265e-9) \n",
" (1000000000,1.86265e-9)"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"Return the probability distribution for the St. Petersburg Paradox with a limited bank.\"\n",
"function st_pete(limit)\n",
" P = Dict() # The probability distribution\n",
" pot = 2 # Amount of money in the pot\n",
" pr = 1/2 # Probability that you end up with the amount in pot\n",
" while pot < limit \n",
" P[pot] = pr\n",
" pot, pr = pot * 2, pr / 2\n",
" end\n",
" P[limit] = pr * 2 # pr * 2 because you get limit for heads or tails\n",
" assert(sum(values(P)) == 1.0)\n",
" sort(collect(zip(keys(P),values(P))))\n",
"end\n",
"\n",
"StP = st_pete(10^9)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now we define the function `EV` to compute the [expected value](https://en.wikipedia.org/wiki/Expected_value) of the (limited) St. Petersburg probability distribution:"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"The expected value of a probability distribution.\"\n",
"EV(P) =sum([v[1] * v[2] for v in P]);"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"30.862645149230957"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"EV(StP)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"This says that for a casino with a bankroll of a billion dollars, you should be willing to pay \\$30.86 to play the game. Would you pay that much? I wouldn't, and neither would Daniel Bernoulli. \n",
"\n",
"## Response 2: Value of Money\n",
"\n",
"Bernoulli came up with a second response to the paradox based on the idea that if you have a lot of money, then additional money becomes less valuable to you. How much less valuable? Bernoulli proposed, and [experiments confirm](https://books.google.com/books?id=1oEa-BiARWUC&pg=PA205&lpg=PA205&dq=mr+beard+oil+wildcatter+value+of+money+utility&source=bl&ots=cBDIX-rkTz&sig=GHB8-inorWrU39vA8JYV_sCtqB8&hl=en&sa=X&ved=0CCAQ6AEwAGoVChMI5fu-p8qlyAIViKWICh0XAAz5#v=onepage&q=mr%20beard%20oil%20wildcatter%20value%20of%20money%20utility&f=false), that *the value of money is roughly logarithmic.* The idea is that if I had very little money, and I won \\$1000, I would be very happy. But if I already had a million dollars and I won \\$1000, it would make less difference to me; the \\$1000 would be less valuable. \n",
"\n",
"I'll write the function `util` to describe what a dollar amount is worth to a hypothetical gambler. `util` says that a dollar is worth a dollar, until the amount is \"enough\" money. After that point, each additional dollar is worth half as much (only brings half as much happiness). Value keeps accumulating at this rate until we reach the next threshold of \"enough,\" when the utility of additional dollars is halfed again. The exact details of `util` are not critical; what matters is that overall money becomes less valuable after we have won a lot of it."
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"The value of money: only half as valuable after you already have enough.\"\n",
"function util(dollars, enough=1000)\n",
" if dollars < enough\n",
" return dollars\n",
" else\n",
" return enough + util((dollars-enough)/2., enough*2)\n",
" end\n",
"end;"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"A table and a plot will give a feel for the `util` function. Notice the characterisitc concave-down shape of the plot."
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 100 $ = 100 util\n",
" 1000 $ = 1000 util\n",
" 10000 $ = 4250 util\n",
" 100000 $ = 15938 util\n",
" 1000000 $ = 51594 util\n",
" 10000000 $ = 162461 util\n",
" 100000000 $ = 535646 util\n",
" 1000000000 $ = 1658229 util\n",
" 10000000000 $ = 5171073 util\n"
]
}
],
"source": [
"for d=2:10\n",
" println(lpad(10^d,15),\" \\$ = \",lpad(round(Int,util(10^d)),10), \" util\")\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"PyPlot.Figure(PyObject <matplotlib.figure.Figure object at 0x7fea9a6bbf10>)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Y axis is util(x); x axis is in thousands of dollars.\n"
]
}
],
"source": [
"figure(\"Value of Money\",figsize=(5,3))\n",
"plot([util(x) for x=1000:1000:10000000])\n",
"println(\"Y axis is util(x); x axis is in thousands of dollars.\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Now I will define the function `EU`, which computes the [expected utility](http://wiki.lesswrong.com/wiki/Expected_utility) of the game:"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"The expected utility of a probability distribution, given a utility function.\"\n",
"EU(P, U)= sum([e[2] * U(e[1]) for e in P]);"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"13.101970414893003"
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"EU(StP, util)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"That says we should pay up to \\$13.10 to play the game, which sounds more reasonable than \\$30.86.\n",
"\n",
"# Understanding the St. Petersburg Problem Better through Simulation\n",
"\n",
"Before I plunk down my \\$13, I'd like to understand the game better. I'll write a simulation of the game:"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"flip()=rand([\"head\", \"tail\"])\n",
"\n",
"\"Simulate one round of the St. Petersburg game, and return the payoff.\"\n",
"function simulate_st_pete(limit=10^9)\n",
" pot = 2\n",
" while flip() == \"head\"\n",
" pot = pot * 2\n",
" if pot > limit\n",
" return limit\n",
" end\n",
" end\n",
" return pot\n",
"end;"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"I will run the simulation 100,000 times (with a random seed specified for reproducability) and make the results into a probability distribution:"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"17-element Array{Tuple{Int64,Float64},1}:\n",
" (2,0.50021) \n",
" (4,0.24995) \n",
" (8,0.12632) \n",
" (16,0.06167) \n",
" (32,0.03038) \n",
" (64,0.01543) \n",
" (128,0.00845) \n",
" (256,0.00376) \n",
" (512,0.00191) \n",
" (1024,0.00085) \n",
" (2048,0.00053) \n",
" (4096,0.00027) \n",
" (8192,0.00015) \n",
" (16384,4.0e-5) \n",
" (32768,4.0e-5) \n",
" (65536,1.0e-5) \n",
" (131072,3.0e-5)"
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"srand(1234)\n",
"# With Julia to can take a few million samples at get the results instantly \n",
"results = sort([(z[1],z[2]) for z in proportionmap([simulate_st_pete() for _=1:100_000])], by=x->x[1])"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"13.139574999999997"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"EU(results, util) "
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The results are about what you would expect: about half the pots are 2, a quarter are 4, and higher pots are more and more unlikely. The expected utility was just a little bit more than the theoretical expected utility (13.14 vs. 13.10)."
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"19.82342"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"EV(results)"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"The expected value (19.82) is almost two thirds of the theoretical expected value (30.86). Why should there be such a big difference? I think the answer is *variance*. If I averaged an infinite number of rounds I would get 30.86, but if I can only average a finite number, most of the time I will get a result less than 30.86, and a very small number of times I will get an average very much larger than 30.86, because the round happened to include a very big (but very rare) pot.\n",
"\n",
"To see better how things unfold, I will define a function to plot the running average of repeated rounds:"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [],
"source": [
"\"For each element in the iterable, yield the mean of all elements seen so far.\" \n",
"function running_averages(iterable)\n",
" total, n = 0, 0\n",
" function run_avg_task()\n",
" for x in iterable\n",
" total, n = total + x, n + 1\n",
" produce(total / n)\n",
" end\n",
" end\n",
" Task(run_avg_task)\n",
"end\n",
"\n",
"\"Plot the running average of calling the function n times.\"\n",
"function plot_running_averages(fn, n)\n",
" plot(collect(running_averages([fn() for _=1:n])))\n",
"end;"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"Let's do ten repetitions of plotting 100,000 rounds each repetition:"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {
"button": false,
"collapsed": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"PyPlot.Figure(PyObject <matplotlib.figure.Figure object at 0x7fea9a6127d0>)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"srand(555)\n",
"figure(\"Running Average\",figsize=(5,3))\n",
"for i=1:10\n",
" plot_running_averages(simulate_st_pete, 100000)\n",
"end\n",
"axis([1 ,100_000, 2, 140]);"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"deletable": true,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"source": [
"What can we see from this? Eight of the 10 repetitions have a final average payoff between 10 and 35. So a price around \\$13 still seems reasonable. One outlier has an average payoff just over 100 and another just over 60, so if you are feeling lucky you might be willing to pay some amount between \\$13 and \\$30.\n",
"\n",
"# Conclusion\n",
"\n",
"We've seen how to manage probability paradoxes. Just be explicit about what the problem says, and then methodical about defining the sample space, and finally be careful in counting the number of outcomes in the numerator and denominator. Easy as 1-2-3. But the bigger lesson is: treat those around you as reasonable people, and when they have different opinons, try to discover what problem they are solving."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 0.4.6",
"language": "julia",
"name": "julia-0.4"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}