Skip to content

Instantly share code, notes, and snippets.

@NaimKabir
Last active November 15, 2020 22:29
Show Gist options
  • Save NaimKabir/3ac68d9105db14e9dce83d08be2d27c5 to your computer and use it in GitHub Desktop.
Save NaimKabir/3ac68d9105db14e9dce83d08be2d27c5 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Getting + Formatting Data\n",
"\n",
"I downloaded Oakland County, MI election data from [**this link.**](https://results.enr.clarityelections.com//MI/Oakland/105840/269402/reports/detailxml.zip)\n",
"\n",
"You can navigate to it from the Oakland county [**gov site**](https://www.oakgov.com/clerkrod/elections/Pages/default.aspx), which will redirect you to that results site if you go to \"Unofficial Results\" for November's elections.\n",
"\n",
"I have to use XML rather than their XLS offering because I don't want to pay to use Excel and get spreadsheets. Would it kill people to just use generic CSVs? Godddamn. Anyway, such is life."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%load_ext blackcellmagic\n",
"\n",
"import pandas as pd\n",
"import xml.etree.ElementTree as et\n",
"tree = et.parse('detail.xml') \n",
"root = tree.getroot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now I'll just grab the contests I care about: straight ticket votes and votes for presidential candidates."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"straight_ticket_xml = None\n",
"presidential_vote_xml = None\n",
"for child in root:\n",
" if child.attrib.get(\"text\", \"\") == \"Straight Party Ticket\":\n",
" straight_ticket_xml = child\n",
" if (\n",
" child.attrib.get(\"text\", \"\")\n",
" == \"Electors of President and Vice-President of the United States\"\n",
" ):\n",
" presidential_vote_xml = child\n",
"assert straight_ticket_xml is not None\n",
"assert presidential_vote_xml is not None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see what kind of info we're working with here."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"VoteType Undervotes\n",
"VoteType Overvotes\n",
"Choice Democratic Party\n",
"Choice Republican Party\n",
"Choice Libertarian Party\n",
"Choice U.S. Taxpayers Party\n",
"Choice Working Class Party\n",
"Choice Green Party\n",
"Choice Natural Law Party\n"
]
}
],
"source": [
"for child in straight_ticket_xml:\n",
" print(child.tag, child.attrib.get(\"name\", child.attrib.get(\"text\", \"\")))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have undervotes and overvotes, which we don't super care about so I'll just ignore.\n",
"\n",
"We also have data for each choice voter can make, which will be useful for grabbing vote totals at the precinct level and structuring our dataset. Ok, let's start to get some structure so we can work with dataframes instead of XML (god I hate XML)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def get_vote_dataframe(vote_xml):\n",
" \"\"\"Let's deal with tables instead of XML.\"\"\"\n",
"\n",
" choice_data = {} # store precinct votes for each choice voters can make\n",
" for child in vote_xml:\n",
" if child.tag == \"Choice\":\n",
" precinct_data = {} # store votes for each precint\n",
"\n",
" # we don't care about the Absentee vs. In-person distinction, so we'll\n",
" # just pool votes from both types\n",
" for vote_type_node in child:\n",
" for precinct_node in vote_type_node:\n",
" precinct_name = precinct_node.attrib[\"name\"]\n",
" precinct_data[precinct_name] = precinct_data.get(\n",
" precinct_name, 0\n",
" ) + int(precinct_node.attrib[\"votes\"])\n",
"\n",
" choice_data[child.attrib[\"text\"]] = precinct_data\n",
" return pd.DataFrame(choice_data)\n",
"\n",
"\n",
"straight_ticket_data = get_vote_dataframe(straight_ticket_xml)\n",
"presidential_vote_data = get_vote_dataframe(presidential_vote_xml)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Democratic Party</th>\n",
" <th>Republican Party</th>\n",
" <th>Libertarian Party</th>\n",
" <th>U.S. Taxpayers Party</th>\n",
" <th>Working Class Party</th>\n",
" <th>Green Party</th>\n",
" <th>Natural Law Party</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Addison Township, Precinct 1</th>\n",
" <td>226</td>\n",
" <td>673</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Addison Township, Precinct 2</th>\n",
" <td>236</td>\n",
" <td>716</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Addison Township, Precinct 3</th>\n",
" <td>155</td>\n",
" <td>435</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Bloomfield Township, Precinct 1</th>\n",
" <td>387</td>\n",
" <td>392</td>\n",
" <td>5</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Bloomfield Township, Precinct 2</th>\n",
" <td>610</td>\n",
" <td>436</td>\n",
" <td>5</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Walled Lake, Precinct 3</th>\n",
" <td>287</td>\n",
" <td>359</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 1</th>\n",
" <td>452</td>\n",
" <td>722</td>\n",
" <td>10</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 2</th>\n",
" <td>380</td>\n",
" <td>537</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 3</th>\n",
" <td>986</td>\n",
" <td>372</td>\n",
" <td>7</td>\n",
" <td>8</td>\n",
" <td>7</td>\n",
" <td>5</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 4</th>\n",
" <td>627</td>\n",
" <td>660</td>\n",
" <td>7</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>506 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" Democratic Party Republican Party \\\n",
"Addison Township, Precinct 1 226 673 \n",
"Addison Township, Precinct 2 236 716 \n",
"Addison Township, Precinct 3 155 435 \n",
"Bloomfield Township, Precinct 1 387 392 \n",
"Bloomfield Township, Precinct 2 610 436 \n",
"... ... ... \n",
"Walled Lake, Precinct 3 287 359 \n",
"Wixom, Precinct 1 452 722 \n",
"Wixom, Precinct 2 380 537 \n",
"Wixom, Precinct 3 986 372 \n",
"Wixom, Precinct 4 627 660 \n",
"\n",
" Libertarian Party U.S. Taxpayers Party \\\n",
"Addison Township, Precinct 1 4 0 \n",
"Addison Township, Precinct 2 4 1 \n",
"Addison Township, Precinct 3 2 0 \n",
"Bloomfield Township, Precinct 1 5 1 \n",
"Bloomfield Township, Precinct 2 5 0 \n",
"... ... ... \n",
"Walled Lake, Precinct 3 4 0 \n",
"Wixom, Precinct 1 10 1 \n",
"Wixom, Precinct 2 4 0 \n",
"Wixom, Precinct 3 7 8 \n",
"Wixom, Precinct 4 7 2 \n",
"\n",
" Working Class Party Green Party \\\n",
"Addison Township, Precinct 1 0 2 \n",
"Addison Township, Precinct 2 1 2 \n",
"Addison Township, Precinct 3 1 1 \n",
"Bloomfield Township, Precinct 1 0 1 \n",
"Bloomfield Township, Precinct 2 0 2 \n",
"... ... ... \n",
"Walled Lake, Precinct 3 2 1 \n",
"Wixom, Precinct 1 0 1 \n",
"Wixom, Precinct 2 2 0 \n",
"Wixom, Precinct 3 7 5 \n",
"Wixom, Precinct 4 5 2 \n",
"\n",
" Natural Law Party \n",
"Addison Township, Precinct 1 0 \n",
"Addison Township, Precinct 2 0 \n",
"Addison Township, Precinct 3 0 \n",
"Bloomfield Township, Precinct 1 0 \n",
"Bloomfield Township, Precinct 2 0 \n",
"... ... \n",
"Walled Lake, Precinct 3 1 \n",
"Wixom, Precinct 1 0 \n",
"Wixom, Precinct 2 0 \n",
"Wixom, Precinct 3 1 \n",
"Wixom, Precinct 4 1 \n",
"\n",
"[506 rows x 7 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"straight_ticket_data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make presidential header names a bit more wieldy, I'm going to abbreviate:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"rename_mapper = {\n",
" \"Joseph R. Biden/Kamala D. Harris\": \"biden\",\n",
" \"Donald J. Trump/Michael R. Pence\": \"trump\",\n",
" \"Jo Jorgensen/Jeremy Cohen\": \"jorgensen\",\n",
" \"Don Blankenship/William Mohr\": \"blankenship\",\n",
" \"Howie Hawkins/Angela Walker\": \"hawkins\",\n",
" \"Rocky De La Fuente/Darcy Richardson\": \"rocky\",\n",
"}\n",
"presidential_vote_data = presidential_vote_data.rename(columns=rename_mapper)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>biden</th>\n",
" <th>trump</th>\n",
" <th>jorgensen</th>\n",
" <th>blankenship</th>\n",
" <th>hawkins</th>\n",
" <th>rocky</th>\n",
" <th>Rejected write-ins</th>\n",
" <th>Unassigned write-ins</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Addison Township, Precinct 1</th>\n",
" <td>439</td>\n",
" <td>1123</td>\n",
" <td>19</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Addison Township, Precinct 2</th>\n",
" <td>484</td>\n",
" <td>1096</td>\n",
" <td>17</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Addison Township, Precinct 3</th>\n",
" <td>336</td>\n",
" <td>713</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Bloomfield Township, Precinct 1</th>\n",
" <td>685</td>\n",
" <td>617</td>\n",
" <td>6</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Bloomfield Township, Precinct 2</th>\n",
" <td>1099</td>\n",
" <td>710</td>\n",
" <td>14</td>\n",
" <td>0</td>\n",
" <td>7</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Walled Lake, Precinct 3</th>\n",
" <td>520</td>\n",
" <td>574</td>\n",
" <td>18</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 1</th>\n",
" <td>1011</td>\n",
" <td>1206</td>\n",
" <td>15</td>\n",
" <td>4</td>\n",
" <td>6</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 2</th>\n",
" <td>815</td>\n",
" <td>996</td>\n",
" <td>33</td>\n",
" <td>1</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 3</th>\n",
" <td>1379</td>\n",
" <td>540</td>\n",
" <td>27</td>\n",
" <td>4</td>\n",
" <td>9</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Wixom, Precinct 4</th>\n",
" <td>1103</td>\n",
" <td>1006</td>\n",
" <td>28</td>\n",
" <td>3</td>\n",
" <td>8</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>506 rows × 8 columns</p>\n",
"</div>"
],
"text/plain": [
" biden trump jorgensen blankenship \\\n",
"Addison Township, Precinct 1 439 1123 19 1 \n",
"Addison Township, Precinct 2 484 1096 17 0 \n",
"Addison Township, Precinct 3 336 713 8 0 \n",
"Bloomfield Township, Precinct 1 685 617 6 0 \n",
"Bloomfield Township, Precinct 2 1099 710 14 0 \n",
"... ... ... ... ... \n",
"Walled Lake, Precinct 3 520 574 18 0 \n",
"Wixom, Precinct 1 1011 1206 15 4 \n",
"Wixom, Precinct 2 815 996 33 1 \n",
"Wixom, Precinct 3 1379 540 27 4 \n",
"Wixom, Precinct 4 1103 1006 28 3 \n",
"\n",
" hawkins rocky Rejected write-ins \\\n",
"Addison Township, Precinct 1 2 1 0 \n",
"Addison Township, Precinct 2 4 1 0 \n",
"Addison Township, Precinct 3 0 0 0 \n",
"Bloomfield Township, Precinct 1 2 1 0 \n",
"Bloomfield Township, Precinct 2 7 2 0 \n",
"... ... ... ... \n",
"Walled Lake, Precinct 3 2 0 0 \n",
"Wixom, Precinct 1 6 0 0 \n",
"Wixom, Precinct 2 4 0 0 \n",
"Wixom, Precinct 3 9 0 0 \n",
"Wixom, Precinct 4 8 2 0 \n",
"\n",
" Unassigned write-ins \n",
"Addison Township, Precinct 1 1 \n",
"Addison Township, Precinct 2 4 \n",
"Addison Township, Precinct 3 4 \n",
"Bloomfield Township, Precinct 1 3 \n",
"Bloomfield Township, Precinct 2 3 \n",
"... ... \n",
"Walled Lake, Precinct 3 1 \n",
"Wixom, Precinct 1 7 \n",
"Wixom, Precinct 2 2 \n",
"Wixom, Precinct 3 7 \n",
"Wixom, Precinct 4 3 \n",
"\n",
"[506 rows x 8 columns]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"presidential_vote_data"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment