@christianp
Created May 2, 2023 08:48
Jupyter notebook answering the question: which way of splitting people up in three groups by something to do with their birth date produces the most evenly-sized groups?
{
"metadata": {
"language_info": {
"codemirror_mode": {
"name": "python",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8"
},
"kernelspec": {
"name": "python",
"display_name": "Python (Pyodide)",
"language": "python"
}
},
"nbformat_minor": 4,
"nbformat": 4,
"cells": [
{
"cell_type": "markdown",
"source": "Data from https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/articles/howpopularisyourbirthday/2015-12-18",
"metadata": {}
},
{
"cell_type": "code",
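"source": "import csv\nfrom datetime import datetime\n\n# Load the table of average daily births; it needs 'date' and 'average' columns.\nwith open('data.csv', newline='') as f:\n    rows = list(csv.DictReader(f))\n\n# Parse each date with the leap year 2000 prepended, presumably so that 29 February is accepted.\nrows = [(datetime.strptime('2000 '+r['date'],'%Y %d-%b'), float(r['average'])) for r in rows]",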
"metadata": {
"trusted": true
},
"execution_count": 38,
"outputs": []
},
{
"cell_type": "code",
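"source": "# Days in each month of a non-leap year.\n# Note: the dates are parsed in the leap year 2000, so with this table 29 Feb and 1 Mar both get day-of-year 60.\nmonths = [31,28,31,30,31,30,31,31,30,31,30,31]",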
"metadata": {
"trusted": true
},
"execution_count": 39,
"outputs": []
},
{
"cell_type": "code",
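"source": "from collections import defaultdict\n\n# Each method maps a birth date to a group label.\nmethods = [\n    (\"Month number mod 3\", lambda d: d.month % 3),\n    (\"Day number mod 3\", lambda d: d.day % 3),\n    # Note: (d.month - 1)//3 yields four quarter-year groups (0 to 3); (d.month - 1)//4 would give three.\n    (\"Contiguous thirds of the year\", lambda d: (d.month -1)//3),\n    (\"Day of the year mod 3\", lambda d: (sum(months[:d.month-1]) + d.day) % 3),\n]\n\nresults = []\n\nfor desc,method in methods:\n    # Total of the average daily births falling in each group\n    f = defaultdict(int)\n\n    for d,n in rows:\n        m = method(d)\n        f[m] += n\n\n    t = sum(f.values())\n\n    # Share of births in each group; the gap between the largest and smallest share measures unevenness\n    p = [n/t for n in f.values()]\n    results.append((desc, max(p) - min(p)))\n\n# Most even split (smallest gap) first\nresults.sort(key=lambda x:x[1])\n\nfor desc, p in results:\n    print(f\"{p}\\t{desc}\")",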
"metadata": {
"trusted": true
},
"execution_count": 40,
"outputs": [
{
"name": "stdout",
"text": "8.054183179989627e-05\tDay of the year mod 3\n0.006354592308398355\tMonth number mod 3\n0.014516064139928453\tContiguous thirds of the year\n0.02048407072431757\tDay number mod 3\n",
"output_type": "stream"
}
]
}
]
}
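
The "Contiguous thirds of the year" and "Day of the year mod 3" methods in the notebook could be written so that they always produce exactly three groups and treat 29 February as its own day. The following is a minimal sketch under those assumptions, not the author's code: the names `contiguous_third`, `day_of_year_mod_3` and `spread` are hypothetical. It has not been run against the ONS data here, so it says nothing about how the ranking above would change.

```python
from datetime import date


def contiguous_third(d):
    """Return 0 for Jan-Apr, 1 for May-Aug, 2 for Sep-Dec."""
    return (d.month - 1) // 4


def day_of_year_mod_3(d):
    """Use the calendar's own day-of-year, so leap days are counted."""
    return d.timetuple().tm_yday % 3


def spread(rows, method):
    """Gap between the largest and smallest group share.

    rows is an iterable of (date, weight) pairs, as in the notebook;
    method maps a date to a group label.
    """
    totals = {}
    for d, n in rows:
        group = method(d)
        totals[group] = totals.get(group, 0) + n
    total = sum(totals.values())
    shares = [v / total for v in totals.values()]
    return max(shares) - min(shares)
```

Both grouping functions also accept the `datetime` objects built in the notebook, so either could be dropped into the `methods` list in place of the corresponding lambda.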