Skip to content

Instantly share code, notes, and snippets.

@Per48edjes
Last active July 1, 2024 04:34
Show Gist options
  • Save Per48edjes/57c74b832bc1b58b33ebde49be11aeb9 to your computer and use it in GitHub Desktop.
Save Per48edjes/57c74b832bc1b58b33ebde49be11aeb9 to your computer and use it in GitHub Desktop.
Fiddler on the Proof: Fiddler (06/01/2024)
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"id": "6479454d-d434-4808-a73d-01255eed4844",
"metadata": {},
"source": [
"# Fiddler on the Proof\n",
"\n",
"Ravi Dayabhai & Conrad Warren 🪓 2024-06-01"
]
},
{
"cell_type": "markdown",
"id": "52b23e3c-c3d9-4a10-87b0-8f3910e19419",
"metadata": {},
"source": [
"## Problem\n",
"\n",
"Here’s a puzzle I once heard from a colleague, who was in turn asked it at a job interview:\n",
"\n",
"Starting with a line segment of length 1, randomly split it somewhere along its length into two parts. Compute the product of these two lengths. Then take each of the two resulting segments and repeat the process. That is, for each one, randomly split it somewhere along its length into two parts and compute the product. Then do this for all four resulting segments, then the eight after that, and the 16 after that, and so on.\n",
"\n",
"After doing this (forever), you add up all the products you computed throughout. On average, what value would you expect this sum to approach?"
]
},
{
"cell_type": "markdown",
"id": "eaa13279-6073-4c6d-8fa7-e57145e55b49",
"metadata": {},
"source": [
"## Solution\n",
"\n",
"The sum of all products computed throughout approaches $\\frac{1}{2}$ in the limit."
]
},
{
"cell_type": "markdown",
"id": "75277d57-d547-41e9-9813-88d90e03dc12",
"metadata": {},
"source": [
"### Rationale"
]
},
{
"cell_type": "markdown",
"id": "862207bb-0b57-44ab-9825-91c0fb7177ff",
"metadata": {},
"source": [
"Let's start by performing the process above once; let random variable $X \\sim \\text{Unif}(0,1)$ be the point where the interval $(0, 1)$ is split (read: a \"split point\").\n",
"\n",
"This gives us $X(1-X)$ for the product of the lengths of the resulting segments from splitting at $X$. In expectation, this is $E(X(1-X))$, which can be shown to evaluate to $\\frac{1}{6}$."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8ca4dfbf-b0bb-4f6f-a098-89f1e9f994b5",
"metadata": {},
"outputs": [],
"source": [
"from sympy.stats import Uniform, E\n",
"import sympy as sp"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f8066588-d1fc-444d-b6af-cd4d8be32c05",
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle \\frac{1}{6}$"
],
"text/plain": [
"1/6"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X = Uniform(\"x\", 0, 1)\n",
"E(X * (1 - X))"
]
},
{
"cell_type": "markdown",
"id": "cb2f7fb1-d3e0-449e-a460-96dfafa3f996",
"metadata": {},
"source": [
"So, this is definitely a lower bound to our answer. Let's see what happens when we split the two resulting segments, but first, we'll add some notation to better manage our random variables. "
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "fdee054c-ae2c-422c-84cc-24c79b8b08cc",
"metadata": {},
"source": [
"Let $X_{i, j}$ be the random variable modeling the $j$th split point at split level $i$. A \"split level\" $i$ comprises the resulting $2^{i}$ number of segments (from splitting the prior split level's segments), and $j$ is an index of the split points resulting in the $i$th split level's segments. Both $i$ and $j$ are $1$-indexed.\n",
"\n",
"So, by way of example, $X$ above translates to $X_{1,1}$ since we will have $2^{i} = 2^{1} = 2$ resulting segments, and there is only $j = 1$ split point. If we take the left and right resulting segments and split them, we would add split points $X_{2,1}$ (yielding 2 segments) and $X_{2,2}$ (also yielding 2 segments, for a total of $2^{2} = 4$ segments at this split level). Below is the expression for the overall sum for $2$ split levels, $S_{2}$:\n",
"\n",
"$$\n",
"S_2 = X_{1,1}(1-X_{1,1}) + X_{2, 1}(X_{1,1} - X_{2, 1}) + (X_{2,2} - X_{1, 1})(1 - X_{2,2})\n",
"$$\n",
"\n",
"where $X_{1,1} \\sim \\text{Unif}(0, 1)$, $X_{2,1} \\sim \\text{Unif}(0, X_{1,1})$, and $X_{2,2} \\sim \\text{Unif}(X_{1,1}, 1)$."
]
},
{
"cell_type": "markdown",
"id": "ec92cd29-2d2a-4a49-b39d-da21125205f5",
"metadata": {},
"source": [
"The expected value of the overall sum is:\n",
"\n",
"$$\n",
"E(S_2) = E(X_{1,1}(1-X_{1,1}) + X_{2, 1}(X_{1,1} - X_{2, 1}) + (X_{2,2} - X_{1, 1})(1 - X_{2,2})) \n",
"$$"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d0436296-0540-40d7-bcde-eb2f46e5873f",
"metadata": {},
"source": [
"...which can be simplified by leveraging symmetry and linearity of expectation:\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(S_2) &= E(X_{1,1}(1-X_{1,1})) + \\underbrace{E(X_{2, 1}(X_{1,1} - X_{2, 1})) + E(X_{2,2} - X_{1, 1})(1 - X_{2,2}))}_{\\text{same by symmetry}}\\\\ \n",
"&= E(X_{1,1}(1-X_{1,1})) + 2 \\cdot E(X_{2, 1}(X_{1,1} - X_{2, 1}))\\\\\n",
"&= \\frac{1}{6} + 2 \\cdot E(X_{2, 1}(X_{1,1} - X_{2, 1}))\n",
"\\end{align*}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "2ea04a67-aa4a-4cc2-a705-4a463d0afbbf",
"metadata": {},
"source": [
"Using the law of iterated expectation gives us a path forward to calculate $E(X_{2, 1}(X_{1,1} - X_{2, 1}))$ from the expectations conditional on $X_{1, 1}$:\n",
"\n",
"$$\n",
"E(X_{2, 1}(X_{1,1} - X_{2, 1})) = E(E(X_{2, 1}(X_{1,1} - X_{2, 1}) \\mid X_{1, 1}))\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "91ade742-2058-44be-9cbd-7b1ad7bd82ba",
"metadata": {},
"source": [
"Just to confirm that our assumption about symmetry is true, we can calculate the *left* and *right* split products (i.e., those arising from split points $X_{2,1}$ and $X_{2, 2}$, resp.) in expectation, directly:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c1113371-a6d3-462f-a409-d1c00a21dd41",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The conditional expectation of the LEFT segment product: X_11**2/6\n"
]
},
{
"data": {
"text/latex": [
"$\\displaystyle \\frac{1}{18}$"
],
"text/plain": [
"1/18"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_11, X_21, X_22 = sp.symbols('X_11 X_21 X_22')\n",
"E_left_conditional_X_11 = sp.simplify(sp.integrate(X_21 * (X_11 - X_21) / X_11, (X_21, 0, X_11)))\n",
"\n",
"print(f\"The conditional expectation of the LEFT segment product: {E_left_conditional_X_11}\")\n",
"\n",
"E_left = sp.integrate(E_left_conditional_X_11, (X_11, 0, 1))\n",
"E_left"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "86886c93-2d31-4e94-8947-7199912d2bca",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The conditional expectation of the RIGHT segment product: X_11**2/6 - X_11/3 + 1/6\n"
]
},
{
"data": {
"text/latex": [
"$\\displaystyle \\frac{1}{18}$"
],
"text/plain": [
"1/18"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"E_right_conditional_X_11 = sp.simplify(sp.integrate((X_22 - X_11) * (1 - X_22) / (1 - X_11), (X_22, X_11, 1)))\n",
"\n",
"print(f\"The conditional expectation of the RIGHT segment product: {E_right_conditional_X_11}\")\n",
"\n",
"E_right = sp.integrate(E_right_conditional_X_11, (X_11, 0, 1))\n",
"E_right"
]
},
{
"cell_type": "markdown",
"id": "dc1cd30f-d4cd-4988-8ce9-5aa70120d3a4",
"metadata": {},
"source": [
"Great! We have verified that $E(X_{2, 1}(X_{1,1} - X_{2, 1})) = E((X_{2,2} - X_{1, 1})(1 - X_{2,2}))$."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c1734692-fcf4-42d7-a8ad-4a3ac61415d3",
"metadata": {},
"source": [
"So the first two terms/partial sums of our overall sum (in expectation) are:\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(X_{1,1}(1-X_{1,1}) + X_{2, 1}(X_{1,1} - X_{2, 1}) + (X_{2,2} - X_{1, 1})(1 - X_{2,2})) &=\\\\\n",
"E(X_{1,1}(1-X_{1,1}) + 2 \\cdot (X_{2,2} - X_{1, 1})(1 - X_{2,2})) &= E(X_{1,1}(1-X_{1,1})) + 2 \\cdot E(X_{2, 1}(X_{1,1} - X_{2, 1}))\\\\\n",
"&= \\frac{1}{6} + 2 \\cdot E(X_{2, 1}(X_{1,1} - X_{2, 1}))\\\\\n",
"&= \\frac{1}{6} + 2 \\cdot \\frac{1}{18}\\\\\n",
"&= \\frac{1}{6} + \\frac{1}{6} \\cdot 2 \\cdot \\left( \\frac{1}{3} \\right)\n",
"\\end{align*}\n",
"$$"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "5f0eaccc-f0e9-44b5-9303-122908b24613",
"metadata": {},
"source": [
"It seems like we have a recurrence relation that results in a geometric sum. Said differently, we would really, *really* like to end up with the following closed-form expression for our sum in expectation, $E(S_n)$:\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(S_n) &=\\frac{1}{6} + \\frac{1}{6} \\cdot \\left( \\frac{2}{3} \\right) + \\frac{1}{6} \\cdot \\left( \\frac{2}{3} \\right)^{2} + \\ldots + \\frac{1}{6} \\cdot \\left( \\frac{2}{3} \\right)^{n-1}\\\\ \n",
"E(S_n) &\\xrightarrow{\\: n \\to \\infty \\:} \\sum_{i=1}^{\\infty} \\frac{1}{6} \\left( \\frac{2}{3} \\right)^{i-1} = \\frac{1}{6} \\cdot \\frac{1}{(1 - \\frac{2}{3})} = \\frac{1}{2}\n",
"\\end{align*}\n",
"$$"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "026a2a01-4895-4f75-a59a-dade162caaba",
"metadata": {},
"source": [
"**Claim**. The contribution to the overall sum (in expectation) of the $i$th split level is given by $\\frac{1}{6} \\cdot \\left( \\frac{2}{3} \\right)^{i-1}$ for $i \\geq 1$.\n",
"\n",
"**Proof** (by induction).\n",
"\n",
"- The base case ($i=1$) is true (see above).\n",
"- Split level $i$ consists of $2^{i}$ intervals, so $2^{i-1}$ *pairs* of segments contribute to the partial sum for split level $i$.\n",
"- For any split point $X \\sim \\text{Unif}(0, L)$ on a segment of length $L$, the expected product of the resultant two segments' lengths is $E(X(L-X) | L) = \\frac{L^{2}}{6}$, so $E(X(L-X)) = E(E(X(L-X) | L)) = \\frac{1}{6}E(L^{2})$.\n",
" - Let $L_{i}$ be the length of an interval at level $i-1$, where $L_1 = 1$; then $\\frac{1}{6} E(L_{i}^{2})$ is the expected product of a pair of segments formed by taking a split in the interval $(0, L_{i})$.\n",
" - By symmetry, we know each of these products contribute the same amount (in expectation) to the partial sum for split level $i$.\n",
" - From above, we know that there are $2^{i-1}$ of these pairs, so the partial sum at split level $i$ is $\\frac{1}{6} \\cdot 2^{i-1} \\cdot E(L_{i}^{2})$; thus, it suffices to show that $E(L_{i}^{2}) = \\left( \\frac{1}{3} \\right)^{i-1}$ (for $i \\geq 1$).\n",
" - The base case is trivially true as $i = 1 \\implies E(L_{1}^{2}) = \\left( \\frac{1}{3} \\right)^{0} = 1$.\n",
" - Next, we will show $E(L_{i-1}^{2}) = \\left( \\frac{1}{3} \\right)^{i-2} \\implies E(L_{i}^{2}) = \\left( \\frac{1}{3} \\right)^{i-1}$.\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(L_{i}^{2}) = E(E(L_{i}^{2} \\mid L_{i-1})) &= \\int_{0}^{L_{i-1}} L_{i}^{2} \\cdot \\frac{1}{L_{i-1}} dL_{i}\\\\\n",
"&= \\frac{1}{L_{i-1}} \\cdot \\frac{1}{3} \\cdot L_{i-1}^{3}\\\\\n",
"&= \\frac{1}{3} \\cdot L_{i-1}^{2}\\\\\n",
"E(L_{i}^{2}) = E(E(L_{i}^{2})) &= \\frac{1}{3} \\cdot E(L_{i-1}^{2}) = \\left( \\frac{1}{3} \\right)^{i-1}\\\\\n",
"\\end{align*}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "55b45dbc-2223-47b4-9678-a7ce765e6c91",
"metadata": {},
"source": [
"### Numeric Approximation"
]
},
{
"cell_type": "markdown",
"id": "32867f4f-9c7b-4475-9bfe-385b2ae7c1bd",
"metadata": {},
"source": [
"Below, we run a small simulation of the process to approximate $E(S_n)$:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a989037e-2874-4a89-a899-4356bd5d896b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.49887369860910025"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import random\n",
"\n",
"# n terms (in S_n), N trials\n",
"n, N = 16, 1_000\n",
"\n",
"# Define the recurrence relation\n",
"def S(L: int, i: int):\n",
" if i == n:\n",
" return 0\n",
" x = random.uniform(0, L)\n",
" return x * (L-x) + S(x, i+1) + S(L-x, i+1)\n",
"\n",
"# Approximating E(S_n)\n",
"sum(S(1, 1) for _ in range(N)) / N"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment