@kburnham
Created July 17, 2015 16:46
{
"metadata": {
"name": "",
"signature": "sha256:479d65b161afda29d9bddd2391ff8427ccdc214abf259212993ef9b90a02a3d4"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Data Wrangling with MongoDB: Austin, Texas"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Kevin Burnham <font size=2><kburnham@gmail.com></font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"-----\n",
"<a id=\"top\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Table of Contents\n",
"<ol>\n",
"<li><a href=\"#introduction\">Introduction</a></li>\n",
"<li><a href=\"#auditing_and_cleaning\">Auditing & Cleaning</a></li>\n",
"<li><a href=\"#exporting\">Exporting to MongoDB</a></li>\n",
"<li><a href=\"#querying\">Querying the Database</a></li>\n",
"<li><a href=\"#conclusion\">Conclusion</a></li>\n",
"<li><a href=\"#bibliography\">Bibliography</a></li>\n",
"<li><a href=\"#code\">Code</a></li>\n",
"</ol>\n",
"\n",
"-----"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this project we begin with an XML file provided by OpenStreetMap (OSM). We audit and clean the data of interest to us and then export it to a MongoDB database using an intermediary JSON file. Finally, we query the data to see what we can learn about Austin. \n",
"\n",
"-----"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"auditing_and_cleaning\"></a>"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Auditing & Cleaning"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We begin by getting a count of all the different tag types that we find in the data. We then identify the information that is most valuable to us, audit those data to determine what types of wrangling are necessary, and write helper functions to do that wrangling. \n",
"\n",
"As a first step we count the tags with this <a href='#count_tags'>`count_tags`</a> function and get the following result:\n",
"\n",
"\n",
"    {'bounds': 1,\n",
"     'member': 10712,\n",
"     'nd': 880991,\n",
"     'node': 761182,\n",
"     'osm': 1,\n",
"     'relation': 938,\n",
"     'tag': 527014,\n",
"     'way': 78371}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we are interested mainly in places to go in Austin, we will take a closer look at the tags in our nodes. With the <a href=\"#count_keys\">`count_keys`</a> function we count the number of occurrences of each key in the tags of nodes only. The most common keys are:\n",
"\n",
" - ele 1599\n",
" - gnis:feature_id 1443\n",
" - name 4774\n",
" - highway 8013\n",
" - power 2059\n",
" - amenity 3324\n",
" - created_by 5446"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we choose several key types to focus on and make a list of the values contained in those keys. The function below takes in a tag key type (e.g. 'amenity') and an XML file and returns a count of the values associated with that key:\n",
"\n",
"\n",
"    def count_values(key, filename):\n",
"        counter = collections.Counter()\n",
"        for item, elem in ET.iterparse(filename):\n",
"            if elem.tag == 'node':\n",
"                for tag in elem.findall('tag'):\n",
"                    if tag.attrib['k'] == key:\n",
"                        counter[(tag.attrib['v'])] += 1\n",
"        return counter\n"
]
},
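{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick illustration, the counting logic above can be exercised on a small hand-written OSM fragment. This is only a sketch: `SAMPLE_OSM` is made-up data rather than part of the Austin extract, and a file-like object is passed in place of a filename (`ET.iterparse` accepts either)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import collections\n",
"import io\n",
"import xml.etree.ElementTree as ET\n",
"\n",
"# Made-up sample data for illustration only (not taken from the Austin file).\n",
"SAMPLE_OSM = b\"\"\"<osm>\n",
"<node id='1'><tag k='amenity' v='cafe'/></node>\n",
"<node id='2'><tag k='amenity' v='cafe'/><tag k='name' v='Example'/></node>\n",
"<node id='3'><tag k='amenity' v='bank'/></node>\n",
"<way id='4'><tag k='amenity' v='school'/></way>\n",
"</osm>\"\"\"\n",
"\n",
"def count_values(key, source):\n",
"    # Count the values stored under `key` in the tags of nodes only.\n",
"    counter = collections.Counter()\n",
"    for event, elem in ET.iterparse(source):\n",
"        if elem.tag == 'node':\n",
"            for tag in elem.findall('tag'):\n",
"                if tag.attrib['k'] == key:\n",
"                    counter[tag.attrib['v']] += 1\n",
"    return counter\n",
"\n",
"counts = count_values('amenity', io.BytesIO(SAMPLE_OSM))\n",
"# counts == Counter({'cafe': 2, 'bank': 1}); the tag on the <way> is ignored"
],
"language": "python",
"metadata": {},
"outputs": []
},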
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the `count_values` function and focusing on tags of special interest to us, we identify problems and opportunities in the OSM data, describe our proposed solutions and link to the helper functions we will use when we export data to the JSON file. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"return_religion\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Religions/Denominations**\n",
"\n",
" - There is a lack of uniformity in the classification of denominations. 'mormon' and 'latter_day_saints' are listed separately but should be combined into a single value, as should 'catholic' and 'roman_catholic'.\n",
" \n",
" - Some of the religion/denomination names contain '_' instead of a space; we'll replace those with ' '. Also, 'Jehovahs' should have an apostrophe.\n",
" \n",
" - The names of religions and denominations should be capitalized.\n",
" \n",
" See the helper function for cleaning religion and denomination values <a href=\"#clean_religion\">here</a>.\n",
"\n",
" "
]
},
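{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three fixes above could be combined along the following lines. This is a sketch, not the project's helper: `DENOMINATION_MAPPING` and `clean_denomination` are hypothetical names, the raw spelling `jehovahs_witness` is assumed, and the choice of which variant to keep when merging (here 'mormon' and 'catholic') is arbitrary. `string.capwords` is used rather than `str.title` because `title` would also capitalize the letter after the apostrophe."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import string\n",
"\n",
"# Hypothetical mapping from variant raw values to one canonical raw value.\n",
"DENOMINATION_MAPPING = {\n",
"    'latter_day_saints': 'mormon',\n",
"    'roman_catholic': 'catholic',\n",
"    'jehovahs_witness': \"jehovah's_witness\",  # assumed raw spelling\n",
"}\n",
"\n",
"def clean_denomination(raw):\n",
"    \"\"\"Merge variants, replace underscores with spaces, capitalize each word.\"\"\"\n",
"    name = raw.strip().lower()\n",
"    name = DENOMINATION_MAPPING.get(name, name)\n",
"    return string.capwords(name.replace('_', ' '))\n",
"\n",
"clean_denomination('roman_catholic')    # 'Catholic'\n",
"clean_denomination('jehovahs_witness')  # \"Jehovah's Witness\""
],
"language": "python",
"metadata": {},
"outputs": []
},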
{
"cell_type": "markdown",
"metadata": {},
"source": [
" <a id =\"elevation\"></a>\n",
"**Elevation**\n",
"\n",
" - Many of the nodes in the OSM database have a `gnis:feature_id` tag. GNIS is the Geographic Names Information System of the United States Geological Survey (USGS). Where we have a GNIS feature ID in the OSM database, we can check the elevation against the data from the USGS. Below we show a histogram of the differences between the OSM and USGS datasets. It shows that while most of the differences are rather small, a few are much larger. The code for importing and comparing the data is <a href=\"#elevation_code\">here</a>.\n",
" \n",
" - When we find a discrepancy between the USGS data and the OSM data, we will use the USGS data and create a field to indicate that the elevation supplied comes from the USGS. However, just in case we later regret this decision, we will retain the data from the `ele` tag in a field called `old_elevation`. See the <a href=\"#export_code\">final export function</a> for details."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#plot a histogram of the differences between the 'ele' tag and the USGS elevation data \n",
"differences = [np.abs(x) for x in elevation_differences.values()]\n",
"df = pd.DataFrame(differences, columns=['Differences between OSM and gnis data'])\n",
"p = ggplot(aes(x='Differences between OSM and gnis data'), data=df)\n",
"p + geom_histogram(binwidth=1, color=\"black\",fill=\"palevioletred\") + theme_bw()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAApEAAAHzCAYAAAB4/0YHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl0VGWC/vGnkmC2qjIJCZgQIAQwLKMgsZEldqBlUXtU\ntKVVRsUepxmX0aZhxrHnuOBxBrVHW2YcR+V0j2s7DUGWbhdkjq3RKKAGiaISkU1WRchSqUoCqXp/\nf/Cjmpck6BtCFur7OcdzUnXr3vvWU6/WY92qez3GGCMAAADAQVxnDwAAAADdDyUSAAAAziiRAAAA\ncEaJBAAAgDNKJAAAAJxRIgEAAOAs4XgLly9frk2bNik1NVW33HKLJCkUCmnJkiWqrq5WWlqapk+f\nruTkZEnSO++8o48++kgej0cXXXSRBg0adPKfAQAAADrccT+JPOecc3Tttdda95WVlSk/P1+33367\n8vPzVVZWJkn65ptvtGHDBt1666269tpr9corrygSiZy8kQMAAKDTHLdE9u/fX0lJSdZ9lZWVGjly\npCRpxIgR2rhxY/T+s846S/Hx8UpPT1dGRoZ27dp1koYNAACAznTcw9ktCQaD8nq9kiSv16tgMChJ\nCgQCys3NjT7O7/crEAhIkmpra1VXV2dtx+v1yu/3t3ngAAAA6DzOJfJoHo/nez2uvLxcpaWl1n3F\nxcWaOHHiieweAAAAncS5RKampioQCMjn8ykQCCg1NVWS5PP5VFNTE31cbW1t9JPGwsJCFRQUWNs5\n8mlmV1BfXx/9cRDI41jkYSMPG3k0RyY28rCRh6075+F8ip+CggJVVFRIktavX68hQ4ZE79+wYYOa\nmppUVVWlAwcOqE+fPpIOH9rOycmx/ulKh7KNMZ09hC6FPGzkYSMPG3k0RyY28rCRh60753HcTyKX\nLFmibdu2KRQK6Te/+Y0mTpyooqIilZSUaN26ddFT/EhSr169NHz4cD3++OOKi4vTj3/84+99uBsA\nAADdi8d05wrcTkKhkFJSUjp7GF0GedjIw0YeNvJojkxs5GEjD1t3zoMr1gAAAMAZJRIAAADOKJEA\nAABwRokEAACAM0okAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokE\nAACAM0okAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0ok\nAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEi\nAQAA4IwSCQAAAGcJnT0AVx6Pp83rGmPacSQAAACxq9uVSEnaNPsx53UGL7jtJIwEAAAgNnE4GwAA\nAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RIAAAAOOu080Q2NDQoEol06D5DoVCL\n94fD4VaXxSLysJGHjTxs5NEcmdjIw0Yetq6SR0pKivM6nVYik5KSOnyfrQUUCoXaFN6pijxs5GEj\nDxt5NEcmNvKwkYetO+fB4WwAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEA\nAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkA\nAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgA\nAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QC\nAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJwltHXFd955Rx9//LE8Ho969eqladOm6eDBg1qy\nZImqq6uVlpam6dOnKzk5uT3HCwAAgC6gTSWyqqpK5e
Xl+od/+AclJCSopKREGzZs0DfffKP8/HwV\nFRWprKxMZWVlmjx5cnuPGQAAAJ2sTYezExMTFR8fr0OHDikcDuvQoUPy+XyqrKzUyJEjJUkjRozQ\nxo0b23WwAAAA6Bra9ElkSkqKxo4dq0cffVQJCQkaNGiQBg4cqGAwKK/XK0nyer0KBoPtOlgAAAB0\nDW0qkQcOHNCaNWs0e/ZsJSYmqqSkRBUVFdZjPB5P9O/a2lrV1dVZy71er/x+f1t23+6OHivI41jk\nYSMPG3k0RyY28rCRh60759GmErl792717dtXKSkpkqShQ4dq586d8nq9CgQC8vl8CgQCSk1NlSSV\nl5ertLTU2kZxcbEmTpx4gsNvH/z4x0YeNvKwkYeNPJojExt52MjD1p3zaFOJzMzMVGlpqQ4dOqSE\nhARt2bJFffr0UY8ePVRRUaGioiKtX79eQ4YMkSQVFhaqoKDA2saRw95dQX19fbd+EdsbedjIw0Ye\nNvJojkxs5GEjD1t3zqNNJfKMM87QiBEjtHDhQnk8HmVnZ6uwsFCNjY0qKSnRunXroqf4kSS/399l\nDl23xBjT2UPoUsjDRh428rCRR3NkYiMPG3nYunMebT5PZFFRkYqKiqz7UlJSNHPmzBMeFAAAALo2\nrlgDAAAAZ5RIAAAAOKNEAgAAwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAA\nZ5RIAAAAOKNEAgAAwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RIAAAA\nOKNEAgAAwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RIAAAAOKNEAgAA\nwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RIAAAAOKNEAgAAwBklEgAA\nAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgLKGzdtzQ0KBIJNKh+wyFQi3eHw6HW10Wi8jDRh42\n8rCRR3NkYiMPG3nYukoeKSkpzut0WolMSkrq8H22FlAoFGpTeKcq8rCRh408bOTRHJnYyMNGHrbu\nnAeHswEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEA\nAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkA\nAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgA\nAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QC\nAADAGSUSAAAAziiRAAAAcEaJBAAAgLOEtq5YX1+vP/7xj9q3b58kadq0acrIyNCSJUtUXV2ttLQ0\nTZ8+XcnJye02WAAAAHQNbS6RK1eu1ODBg3XVVVcpHA7r0KFDevvtt5Wfn6+ioiKVlZWprKxMkydP\nbs/xAgAAoAto0+HshoYGbd++XaNGjZIkxcfHKykpSZWVlRo5cqQkacSIEdq4cWP7jRQAAABdRps+\niayqqlJqaqqWL1+uvXv3KicnRxdeeKGCwaC8Xq8kyev1KhgMSpJqa2tVV1dnbcPr9crv95/g8NuH\nx+Pp7CF0KeRhIw8bedjIozkysZGHjTxs3TmPNpXISCSiPXv26OKLL1afPn302muvqayszHrM0aGU\nl5ertLTUWl5cXKyJEye2Zfftju9t2sjDRh428rCRR3NkYiMPG3nYunMebSqRfr9ffr9fffr0kSQN\nGzZMZWVl8nq9CgQC8vl8CgQCSk1NlSQVFhaqoKDA2saRTyy7gvr6+m79IrY38rCRh408bOTRHJnY\nyMNGHrbunEebSq
TP55Pf79e3336rzMxMbdmyRVlZWcrKylJFRYWKioq0fv16DRkyRNJfSmdXZYzp\n7CF0KeRhIw8bedjIozkysZGHjTxs3TmPNv86++KLL9bSpUsVDoeVnp6uadOmKRKJqKSkROvWrYue\n4gcAAACnnjaXyDPOOEOzZs1qdv/MmTNPaEAAAADo+rhiDQAAAJxRIgEAAOCMEgkAAABnlEgAAAA4\no0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADA\nGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAA\nziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAA\ncEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAA\ngLOEztpxQ0ODIpFIh+4zFAq1eH84HG51WSwiDxt52MjDRh7NkYmNPGzkYesqeaSkpDiv02klMikp\nqcP32VpAoVCoTeGdqsjDRh428rCRR3NkYiMPG3nYunMeHM4GAACAM0okAAAAnFEiAQAA4IwSCQAA\nAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEiAQAA4IwSCQAAAGeUSAAA\nADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIA\nAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIA\nAADOKJEAAABwRokEAACAM0okAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIAAMBZwomsHIlEtHDh\nQvn9fs2YMUOhUEhLlixRdXW10tLSNH36dCUnJ7fXWAEAANBFnNAnkWvWrFFWVlb0dllZmfLz83X7\n7bcrPz9fZWVlJzxAAAAAdD1tLpE1NTXatGmTRo0aFb2vsrJSI0eOlCSNGDFCGzduPPERAgAAoMtp\n8+Hs119/XVOmTFFjY2P0vmAwKK/XK0nyer0KBoOSpNraWtXV1Vnre71e+f3+tu6+XXk8ns4eQpdC\nHjbysJGHjTyaIxMbedjIw9ad82hTiaysrFRqaqqys7O1devWFh9zdCjl5eUqLS21lhcXF2vixIlt\n2X2743ubNvKwkYeNPGzk0RyZ2MjDRh627pxHm0rkjh07VFlZqU2bNqmpqUmNjY1aunSpUlNTFQgE\n5PP5FAgElJqaKkkqLCxUQUGBtY0jn1h2BfX19d36RWxv5GEjDxt52MijOTKxkYeNPGzdOY82lchJ\nkyZp0qRJkqRt27bpvffe0xVXXKFVq1apoqJCRUVFWr9+vYYMGSJJ8vv9XebQdUuMMZ09hC6FPGzk\nYSMPG3k0RyY28rCRh60753FCp/g5VlFRkUpKSrRu3broKX4AAABw6jnhEpmXl6e8vDxJUkpKimbO\nnHmimwQAAEAXxxVrAAAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0ok\nAAAAnFEiAQAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEi\nAQAA4IwSCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEiAQAA4IwS\nCQAAAGeUSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnFEiAQAA4IwSCQAAAGeU\nSAAAADijRAIAAMAZJRIAAADOKJEAAABwRokEAACAM0okAAAAnCV01o4bGhoUiUQ6
dJ8ej8d5nWAw\neBJG0rWFw2GFQqHOHkaXQR428rCRR3NkYiMPG3nYukoeKSkpzut0WolMSkrq8H1umv2Y0+MHL7it\nTaF2d6FQKCafd2vIw0YeNvJojkxs5GEjD1t3zoPD2QAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJ\nAAAAZ5RIAAAAOKNEAgAAwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RI\nAAAAOKNEAgAAwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RIAAAAOKNE\nAgAAwBklEgAAAM4okQAAAHBGiQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZwmdPYCuzuPxtGk9Y0w7\njwQAAKDroER+h02zH3NeZ/CC207CSAAAALoODmcDAADAGSUSAAAAziiRAAAAcEaJBAAAgDNKJAAA\nAJxRIgEAAOCMEgkAAABnbTpPZE1NjZYtW6ZgMChJKiws1JgxYxQKhbRkyRJVV1crLS1N06dPV3Jy\ncrsOGAAAAJ2vTSUyLi5OU6dOVXZ2thobG7Vw4UINHDhQH330kfLz81VUVKSysjKVlZVp8uTJ7T1m\nAAAAdLI2Hc72+XzKzs6WJCUmJiozM1O1tbWqrKzUyJEjJUkjRozQxo0b22+kAAAA6DJO+DuRVVVV\n2rt3r3JzcxUMBuX1eiVJXq83ergbAAAAp5YTunZ2Y2OjFi9erAsvvFCJiYnWMo/HE/27trZWdXV1\n1nKv1yu/338iu8dJcvRrB/I4FnnYyKM5MrGRh408bN05jzaXyHA4rMWLF+vss8/W0KFDJUmpqakK\nBALy+XwKBAJKTU2VJJWXl6u0tNRav7i4WBMnTjyBoeNk4cdQNvKwkYeNPJojExt52MjD1p3zaFOJ\nNMZoxYoVysrK0tixY6P3FxQUqKKiQkVFRVq/fr2GDBki6fCvtwsKCqxtHDnsja6nvr6+W0/q9kYe\nNvKwkUdzZGIjDxt52LpzHm0qkV999ZU+/vhj9e7dW08++aQk6YILLlBRUZFKSkq0bt266Cl+JMnv\n93PouhsxxnT2ELoU8rCRh408miMTG3nYyMPWnfNoU4ns37+/5s2b1+KymTNnnsh4AAAA0A1wxRoA\nAAA4o0QCAADAGSUSAAAAziiRAAAAcHZCJxtH69p68tDu/CstAAAQOyiRJ8mm2Y85rzN4wW0nYSQA\nAADtj8PZAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiR\nAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJ\nBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNK\nJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgLOEztpxQ0OD\nIpFIZ+2+y/J4PM7rBIPBdh1DOBxWKBRq1212Z+RhIw8beTRHJjbysJGHravkkZKS4rxOp5XIpKSk\nztp1l7Zp9mNOjx+84LY2vfDHEwqF2n2b3Rl52MjDRh7NkYmNPGzkYevOeXA4GwAAAM4okQAAAHBG\niQQAAIAzSiQAAACcUSIBAADgjBIJAAAAZ5RIAAAAOOu080Si/bTlBOWSZIxp55EAAIBYQYk8Bbie\noFw6fJJyAACAtuJwNgAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJ\nBAAAgDNKJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4o0QCAADAGSUSAAAAziiRAAAAcEaJBAAAgDNK\nJAAAAJxRIgEAAOCMEgkAAABnlEgAAAA4S+js
AQDtyePxdPg+jTEdvk8AADobJRKnnE2zH3NeZ/CC\n29q8HgAAsYjD2QAAAHBGiQQAAIAzSiQAAACc8Z1IOOmMH64AAICuhxIJZ64/QDmRH620ZV8AAODk\n43A2AAAAnFEiAQAA4IzD2TGM7ze2j7bkeKqeoLyj59SpmiO6nxOZ+8zj7oXX+i/avURu2rRJK1eu\nlDFGo0aNUlFRUXvvAu2Ek2u3D763aeP7r4hV/Dc1dvBaH9auh7MjkYheffVVXXvttbr11lv1ySef\naN++fe25CwAAAHQB7Void+3apYyMDKWnpys+Pl5/9Vd/pY0bN7bnLgAAANAFtOvh7NraWp1++unR\n236/X7t27VJtba3q6uqsx3q9Xvn9/vbcPYAOwHdpbeTRHJngeJgftu6ch8e047c8P/vsM3355Ze6\n9NJLJUkVFRXatWuXkpOTVVpaaj22f//++slPfkKR/P9qa2tVXl6uwsJCMhF5HIs8bORhI4/myMRG\nHjbysLU1j3Y9nO3z+VRTU2MNyu/3q7CwULNmzYr+c/nll2v79u3NPp2MZXV1dSotLSWT/488bORh\nIw8beTRHJjbysJGHra15tOvh7JycHB04cEBVVVXy+XzasGGDrrzySvn9fpo+AADAKaRdS2R8fLwu\nvvhivfDCC4pEIho1apSysrLacxcAAADoAtr9PJGDBw/W4MGD23uzAAAA6ELi582bN6+jd2qM0Wmn\nnaa8vDwlJiZ29O67JDKxkYeNPGzkYSOP5sjERh428rC1NY92/XU2AAAAYkOnXDubSyPaHn30USUm\nJiouLk5xcXGaNWtWZw+pQy1fvlybNm1SamqqbrnlFklSKBTSkiVLVF1drbS0NE2fPl3JycmdPNKO\n01Imb775ptatW6fU1FRJ0gUXXBAzXx2pqanRsmXLFAwGJUmFhYUaM2ZMzM6T1vKI1Tly6NAhPfPM\nM2pqalI4HNaQIUM0adKkmJ0freURq/PjiEgkooULF8rv92vGjBkxOz+OdmwmrnOkw0vkkUsjXn/9\n9fL7/Vq4cKEKCgpi+gc4Ho9HN9xwg1JSUjp7KJ3inHPO0Xnnnadly5ZF7ysrK1N+fr6KiopUVlam\nsrIyTZ48uRNH2bFaysTj8Wjs2LEaN25cJ46sc8TFxWnq1KnKzs5WY2OjFi5cqIEDB+qjjz6KyXnS\nWh6xOkd69OihmTNn6rTTTlM4HNb//M//aPv27aqsrIzJ+dFaHrE6P45Ys2aNsrKy1NjYKIn3Gal5\nJq5zpF3PE/l9cGlEHKt///5KSkqy7qusrNTIkSMlSSNGjIi5OdJSJrHM5/MpOztbkpSYmKjMzEzV\n1tbG7DxpLY9Ydtppp0mSwuGwjDFKTk6O2fkhtZxHLKupqdGmTZs0atSo6H2xPD+kljNx/YZjh38S\n2dqlEWPdc889J4/Ho3PPPVeFhYWdPZxOFwwG5fV6JR2+ROaRw3axbu3ataqoqFBOTo6mTJkSk28M\nVVVV2rt3r3Jzc5knsvPYsWNHzM6RSCSip556SlVVVTr33HPVq1evmJ4fLeXx2Wefxez8eP311zVl\nypToJ24S7zMtZeLxeJzmSIeXyO58jciT5cYbb5TP51MwGNRzzz2nzMxM9e/fv7OH1WUwZw4799xz\nVVxcLEn685//rFWrVumyyy7r5FF1rMbGRi1evFgXXnhhs18QxuI8OTaPWJ4jcXFxuvnmm9XQ0KDn\nn39eW7dutZbH2vxoKY9YnR+VlZVKTU1VdnZ2s3lxRKzNj9YycZ0jHX44u7VLI8Yyn88nSUpNTdXQ\noUP5ZFaHswgEApKkQCAQ/ZJvLPN6vfJ4PPJ4PBo1alTMzZNwOKzFixfr7LPP1tChQyXF9jxpKY9Y\nnyOSlJSUpDPPPFO7d++O6flxxNF5xOr82LFjhyorK7VgwQK99NJL2rp1q5YuXRrT86O1TFznSIeX\nyKMvjdjU
1KQNGzaooKCgo4fRZRw8eDD6UfLBgwe1efNm9erVq5NH1fkKCgpUUVEhSVq/fr2GDBnS\nySPqfEf+YydJGzdujKl5YozRihUrlJWVpbFjx0bvj9V50loesTpHgsGg6uvrJR3+ZfLmzZuVnZ0d\ns/OjtTxidX5MmjRJc+bM0ezZs3XllVdqwIABuuKKK2J2fkitZ+I6RzrlPJFHTvFz5NKI559/fkcP\nocuoqqrSH/7wB0mHv8Ny9tlnx1weS5Ys0bZt2xQKheT1ejVx4kQVFBSopKRENTU1MXnqhWMzmTBh\ngrZt26a9e/fK4/EoLS1Nl1xySfT7PKe67du36+mnn1bv3r2jh50uuOAC9enTJybnSWt5fPLJJzE5\nR77++mstW7ZMxhgZYzRixAiNHz9eoVAoJudHa3ksXbo0JufH0bZt26b33nsveoqfWJwfx9q6datW\nr16tGTNmOM8RTjYOAAAAZx1+OBsAAADdHyUSAAAAziiRAAAAcEaJBAAAgDNKJAAAAJxRIgEAAOCM\nEgl0ETfffLP+9V//NXr7iSeeUO/eveX3+1VVVaV3331XgwcPls/n0x//+MdOHGn7mzdvnq677rrO\nHga+w1tvvaW+ffuelG1ffPHFev755094O3l5eXrjjTfaYUQAvgslEugAeXl5SklJkd/vV3p6usaP\nH6+nnnpKR5+m9YknntBdd90l6fBVJubOnas33nhDtbW1Sk9P1z333KPbb79dgUBAl156aWc9lZPi\nRK5b210K6Hvvvacf/ehH8vv9SktL06WXXqrPP//cesz8+fOVn58vn8+nvn376uqrr44umzBhguLi\n4vTxxx9b61x++eWKi4vT22+/3SHP42R59dVX2+V1PHLJtu8jLi5OW7ZsOeF9ArGKEgl0AI/Ho5df\nflm1tbX66quvdOedd+qhhx7SjTfe2OLj9+7dq4aGhug1kSXpq6++0rBhw9q0/3A43Kb1Osqpfs2D\n1atXa+rUqbr88su1Z88ebd26NXoVka1bt0qSnn32Wb3wwgt64403FAgE9OGHH2rSpEnRbXg8HhUU\nFOi5556L3rd//36tXr06Zi5fdzKc6nMPOJkokUAH8/l8uuSSS7Ro0SI9++yz+uyzzyRJN9xwg+6+\n+25t2rQpeg3XtLQ0XXDBBRo0aJC2bNmiSy65RH6/X4cOHVJNTY1uvPFG5eTkKDc3V3fffbcikYgk\n6ZlnntH48eM1Z84cZWZm6r777tPBgwf1j//4j+rfv7/OOOMM3XzzzWpoaJB0+DBlbm6ufvOb36h3\n797Kycl5jxp0AAAOGklEQVTRM888Ex1zfX295s6dq7y8PKWlpen888+PrrtmzRqNGzdO6enpGjly\npEpLS6PrPfPMMxo4cKD8fr/y8/P14osvtpiJx+NRQ0ODrr76avn9fhUWFlqfuO3evVs/+clP1KtX\nL+Xn5+uxxx6TJK1cuVIPPPCAFi1aJJ/Pp3POOUdvvfWWzjrrrOi6kydP1ujRo6O3zz///OjXAVrb\nrnS4XDz44IMaNGiQMjMzddVVV6mqqkrS4UunxcXF6bnnnlP//v2VlZWl+fPnt/qa33HHHZo5c6Zu\nu+02paamKj09Xffff7/GjBmjefPmSZI++OADTZ06VQMGDJAk9e7dW3/3d39nbWfGjBlatGhRtPj8\n7//+r6644gr16NGj1X2/8sorOuecc3T66aerX79+uu+++6LLvut51NfX64YbblBGRoaGDx+uDz74\noNX9SNKqVatUUFCgtLQ03XrrrSouLtbvfvc7SYfnQlFRkf7pn/5JGRkZys/P18qVK6PrTpgwIfrY\nL7/8UsXFxUpLS1NWVpb1ieyxnn/+efXv31+ZmZnNXoP3339fY8eOVXp6unJycnTbbbfp0KFDkqQf\n/vCHkqQRI0bI5/OppKRE1dXV+uu//mv16tVLGRkZuuSSS7Rr167jPmcgph
kAJ11eXp554403mt3f\nr18/8+STTxpjjLnhhhvM3XffbYwxZtu2bcbj8ZhwONzqNqZNm2ZuuukmEwqFzDfffGNGjx5tnnrq\nKWOMMU8//bRJSEgw//Vf/2XC4bCpr683s2fPNpdddpmpqqoygUDAXHLJJeZXv/qVMcaYN9980yQk\nJJh7773XNDU1mVdffdWkpKSY6upqY4wxt9xyi5k4caLZvXu3CYfDZvXq1aaxsdHs3LnT9OzZ07z2\n2mvGGGP+7//+z/Ts2dN8++23pq6uzvj9fvPFF18YY4zZu3ev+fTTT1vM59577zU9evQwL730kmlq\najIPP/ywGTBggGlqajLhcNiMGjXK3H///ebQoUNmy5YtJj8/37z++uvGGGPmzZtnrrvuuui2QqGQ\nSUpKMvv37zcHDx40vXr1Mrm5uaaurs6EQiGTnJxsDhw48J3bXbBggRk7dqzZtWuXOXjwoPn7v/97\nc8011xhjjNm6davxeDxm1qxZpqGhwVRUVJjExETz+eefN3tuwWDQxMfHm7feeqvZsqefftpkZ2cb\nY4x54YUXTEZGhvn3f/9388EHH5impibrsRMmTDC//e1vzZQpU6J5jx492qxevdrk5uaa0tLSFrN9\n6623zIYNG4wxxnz88cemd+/eZvny5cd9Hhs3bjTGGPPP//zP5oc//KGpqqoyO3bsMMOHDzd9+/Zt\ncT/79u0zfr/fLFu2zITDYfMf//EfpkePHuZ3v/td9Ln26NHD/Pa3vzWRSMQ88cQTJicnx3p+Rx57\n9dVXm/nz5xtjjGlsbDTvvvtui/v89NNPjdfrNe+8845pbGw0c+bMMQkJCdF/T8rLy83atWtNOBw2\n27ZtM0OHDjULFiyIru/xeMzmzZujt/fv32+WLl1q6uvrTSAQMNOnTzfTpk1rcd8ADl+cHcBJ1lqJ\nHDNmTPTN8oYbbjB33XWXMeYvb+6tlci9e/eaxMREU19fH13+4osvmokTJxpjDr9h9+vXL7osEomY\n1NRU6w3zvffeMwMGDDDGHC6RycnJ1v569eoVfQNOTk42H3/8cbPxP/jgg1aBM8aYqVOnmmeffdYE\ng0GTlpZmXnrpJRMKhY6bz7333mvGjh1rjTc7O9u88847Zs2aNdZzMcaY+fPnm5/97GfRda+99lpr\n+fnnn2+WLl1qVq9ebaZMmWKuuuoqs3LlSvPnP//ZnH322cYY853bHTJkiPWa7d692/To0cOEw+Ho\n67Nr167o8tGjR5s//OEPzZ7bjh07jMfjMZWVlc2Wvfbaa6ZHjx7R27///e/NpEmTTGpqqunZs6d5\n6KGHosuOlMgXXnjBXHPNNebzzz83Z555pjHGHLdEHusXv/iF+eUvf2mMMa0+j0WLFhljjFWqjTFm\n4cKFJjc3t8XtPvvss2bcuHHWfX379rVK5KBBg6LLgsGg8Xg85uuvv44+vyOPvf76682sWbPMzp07\nj/tc7rvvvmixP7LN0047rcV/14wx5tFHHzWXX3559PaxJfJYH330kUlPTz/uGIBYltDZn4QCsWzn\nzp3KyMhwXm/79u06dOiQsrOzo/dFIhH169cvevvoX9Hu27dPoVBIhYWF0fuMMdHD35LUs2dPxcX9\n5RsuKSkpqqur07fffquGhgYNHDiwxXGUlJToT3/6U/S+pqYm/ehHP1JKSooWLVqkhx9+WDfeeKPG\njx+vRx55RAUFBS0+p9zc3OjfHo9Hubm52r17tzwej3bv3q309PTo8nA4HD0c2ZLi4uLoIfri4mKl\np6ertLRUiYmJmjBhQnTsx9vu9u3boz9aOSIhIUFff/119PYZZ5xh5RUMBpuNJT09XXFxcdqzZ4/O\nPPNMa9mePXuUmZkZvT1jxgzNmDFD4XBYy5Yt09/8zd/onHPO0eTJk6O5XHHFFZo7d6569uyp66+/\nvtUMjli7dq3uvPNOffrppzp48KAaGx
v105/+1HrMsc+jrq5O0uHD/UfPo6Pn17F2795tvYaSmt0+\ndj+SVFdX1+w7nb/+9a919913a/To0UpPT9fcuXP1s5/9rNk+9+zZY+0jJSVFPXv2jN7+4osvNGfO\nHJWXlysUCqmpqUnnnntuq88hFArpl7/8pV5//fXoVxfq6upkjDmhH38Bpyq+Ewl0kg8++EC7d+9W\nUVGR87p9+/ZVYmKi9u/fr6qqKlVVVammpkaffPJJ9DFHv+llZmYqOTlZn332WfTx1dXVqq2t/c59\nZWZmKikpSV9++WWzZf369dN1110X3WZVVZUCgYDuuOMOSdKUKVO0atUq7d27V0OGDNHPf/7zVvez\nY8eO6N+RSEQ7d+5Unz591LdvXw0YMMDaR21trV5++WVJskreEcXFxXrzzTf19ttva8KECdFSWVpa\nquLi4miGx9tuv379tHLlSmt5KBSyivv3kZqaqrFjx2rx4sXNli1evNj68cwR8fHxuvLKK3X22Wdr\nw4YN1rLk5GRddNFFevLJJ7/Xr5lnzJihadOmaefOnaqurtZNN91k/c/D8WRnZ+urr76K3j7672Pl\n5ORo586d0dvGGOu2i969e2vhwoXatWuXnnrqKd1yyy0t/oo6OzvbmjehUEj79++P3r755ps1bNgw\nffnll6qpqdG//du/Hfe5P/LII/riiy/0/vvvq6amRqWlpTKHj9i16XkApzpKJNBBjrwRHSkq11xz\nja677joNHz7cWv59ZGdna8qUKZozZ44CgYAikYg2b97c6mle4uLi9POf/1yzZ8/Wvn37JEm7du3S\nqlWrvnNfcXFx+tu//VvNmTNHe/bsUTgc1urVq3Xw4EFde+21+tOf/qRVq1YpHA6roaFBb731lnbt\n2qVvvvlGK1asUDAYVI8ePZSamqr4+PhW91NeXq5ly5apqalJCxYsUFJSksaMGaMf/OAH8vl8+vWv\nf636+nqFw2Ft2LBBH374oaTDhWPbtm1WfuPGjVNlZaU++OADjR49WsOGDdP27du1du3a6CeN5513\n3nG3e9NNN+lf/uVfosVp375933l+ztZewwcffFDPPvusHnvsMQUCAVVVVemuu+7S2rVrde+990o6\n/OvsV199Nfp6vvbaa/r000913nnnNdve/PnzVVpaetxPBo+oq6tTenq6TjvtNL3//vt68cUXv/en\naj/96U/1wAMPqLq6Wjt37rR+eHSsH//4x/rkk0+0YsUKNTU16fHHH9fevXu/136OVVJSEi2gaWlp\n8ng8Lf7PwpVXXqmXX35Z7777rg4ePKh77rnHKol1dXXy+XxKSUnRxo0b9cQTT1jr9+7dW5s3b7Ye\nn5ycrNNPP10HDhywfoQEoDlKJNBBjvyyul+/fnrggQc0d+5cPf3009Hlx57f7rve6J977jkdPHhQ\nw4YNU0ZGhqZPnx59027pXHkPPfSQBg0apDFjxuj000/X5MmT9cUXX3yv/T388MM666yz9IMf/EA9\ne/bUr371K0UiEeXm5mrFihWaP3++evXqpX79+umRRx6JHip/9NFH1adPH/Xs2VPvvPNOszfxo/c9\nbdo0LVq0SBkZGfr973+vpUuXKj4+XvHx8Xr55Ze1fv165efnKysrS7NmzYp+ijp9+nRJhw/HHzlU\nmZKSosLCQg0fPlwJCYe/tTNu3Djl5eVFDx/HxcUdd7u/+MUvdOmll2rKlCny+/0aO3as3n///ePm\n1VqG48eP1+uvv66lS5cqJydHeXl5qqioUFlZWfRrAn6/X/Pnz1f//v2Vnp6uO++8U08++aTGjRvX\nbHvZ2dkt3t+S//7v/9Y999wjv9+v+++/X1ddddX3GrMk3Xvvverfv78GDBigCy+8UNdff32rj+/Z\ns6dKSkp0xx13KDMzU59//rnOPfdcJSYmRvdz7LqtbevDDz/UmDFj5PP5dNlll+k///M/lZeX1+xx\nw4
YN0+OPP64ZM2YoJydHGRkZ1uH3hx9+WC+++KL8fr9mzZqlq6++2trnvHnzNHPmTKWnp2vJkiWa\nPXu26uvrlZmZqXHjxumiiy7iMDZwHB7D5/QAgHYWiUTUt29fvfjii9GvEAA4tfBJJACgXaxatUrV\n1dVqbGyMnrNxzJgxnTwqACcLJRIA0C5Wr16tQYMGKSsrS6+88oqWL18ePZwN4NTD4WwAAAA445NI\nAAAAOKNEAgAAwBklEgAAAM4okQAAAHBGiQQAAICz/wcClSPCxyzEUAAAAABJRU5ErkJggg==\n",
"text": [
"<matplotlib.figure.Figure at 0x25c576f90>"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 30,
"text": [
"<ggplot: (634412557)>"
]
}
],
"prompt_number": 30
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"cuisine\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Cuisine**\n",
"\n",
" - There are a number of cases where the same cuisine type is listed under different headings (e.g. \"BBQ\", \"Barbecue\" and \"Bar-B-Q\"). We will want to make a mapping dictionary so that we have uniform values in our database.\n",
" \n",
" - There are some cuisine entries in which multiple cuisines are indicated by a single field. We should break those apart into an array of cuisine types.\n",
" \n",
" - Some of the cuisine names are also national/ethnic names and should be capitalized, while others are generic food types and should not be. We will use a list we found on ranker.com to check which cuisine types should be capitalized.\n",
" \n",
"See the code for `make_national_cuisine_list` and `get_cuisine` <a href=\"#cuisine_code\">here</a>.\n",
" "
]
},
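{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three cuisine steps above can be sketched roughly as follows. This is not the project's `get_cuisine` function: `CUISINE_MAPPING`, `NATIONAL_CUISINES` and `clean_cuisine` are hypothetical names with made-up entries, and the sketch assumes that multiple cuisines in one field are separated by semicolons."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Hypothetical mapping of variant spellings to one canonical value.\n",
"CUISINE_MAPPING = {'bbq': 'barbecue', 'bar-b-q': 'barbecue'}\n",
"\n",
"# Hypothetical stand-in for the list of national/ethnic cuisine names.\n",
"NATIONAL_CUISINES = set(['mexican', 'thai', 'italian'])\n",
"\n",
"def clean_cuisine(raw):\n",
"    \"\"\"Split a raw cuisine value into a list of normalized cuisine names.\"\"\"\n",
"    cleaned = []\n",
"    for part in raw.split(';'):  # assumed separator for multiple cuisines\n",
"        name = part.strip().lower().replace('_', ' ')\n",
"        if not name:\n",
"            continue\n",
"        name = CUISINE_MAPPING.get(name, name)\n",
"        if name in NATIONAL_CUISINES:\n",
"            name = name.title()  # capitalize national/ethnic names only\n",
"        cleaned.append(name)\n",
"    return cleaned\n",
"\n",
"clean_cuisine('BBQ;mexican')    # ['barbecue', 'Mexican']\n",
"clean_cuisine('Bar-B-Q; Thai')  # ['barbecue', 'Thai']"
],
"language": "python",
"metadata": {},
"outputs": []
},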
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"address\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Address**\n",
"\n",
" - There is a lack of standardization in how street names are represented. We want to import the street names in their full forms (e.g. \"Lane\" instead of \"Ln.\"). We need to audit the data and create a mapping dictionary for making the needed conversions.\n",
" \n",
" - We will also standardize compass point directions (e.g. \"N.\" becomes \"North\").\n",
" \n",
" - Postcodes are not uniform. Some have only a five-digit code while others include an additional four digits. Some of the postcodes include a 'TX' prefix and at least one ('14150') clearly does not belong. We need a helper function that will filter out inappropriate postcodes, truncate each postcode to five digits and, where possible, also return the 'plus4' code under a separate entry.\n",
" \n",
" The code for auditing and cleaning the address data is <a href=\"#address_code\">here</a>."
]
},
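{
"cell_type": "markdown",
"metadata": {},
"source": [
"The postcode rules described above might look something like the sketch below. It is illustrative only and not the project's helper: `clean_postcode` and `POSTCODE_RE` are hypothetical names, and the check that a valid code starts with '78' is an assumption about Austin-area postcodes rather than something taken from the data."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import re\n",
"\n",
"# Optional 'TX' prefix, five digits, then an optional hyphen and four more digits.\n",
"POSTCODE_RE = re.compile(r'^(?:TX)?\\s*(\\d{5})(?:-?(\\d{4}))?$', re.IGNORECASE)\n",
"\n",
"def clean_postcode(raw, valid_prefix='78'):\n",
"    \"\"\"Return (zip5, plus4); both are None if the code is malformed or out of area.\"\"\"\n",
"    match = POSTCODE_RE.match(raw.strip())\n",
"    if match is None:\n",
"        return None, None\n",
"    zip5, plus4 = match.groups()\n",
"    if not zip5.startswith(valid_prefix):  # assumed prefix for the Austin area\n",
"        return None, None\n",
"    return zip5, plus4\n",
"\n",
"clean_postcode('TX 78704')    # ('78704', None)\n",
"clean_postcode('78758-7008')  # ('78758', '7008')\n",
"clean_postcode('14150')       # (None, None)"
],
"language": "python",
"metadata": {},
"outputs": []
},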
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"exporting\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"-----"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Exporting to MongoDB"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we would like to export the XML file to a MongoDB database. Then we can query that database to learn some interesting facts about Austin and to find points of interest. \n",
"\n",
"First we write a function that takes a node from the OSM data file and returns a dictionary. The key-value pairs of the dictionary are taken from the attributes and tags of the node. We create a subdictionary called `created` for information about the creation of the node and an array called `pos` holding the geographical coordinates of the node. Later we can use this array to create a 2D geospatial index in MongoDB. As noted above, we also create an array of cuisine types whenever we have a `cuisine` key. \n",
" \n",
"See `node_to_dictionary` <a href=\"#export_code\">here</a>."
]
},
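{
"cell_type": "markdown",
"metadata": {},
"source": [
"The shape of the resulting document can be illustrated with a simplified sketch. It is not the `node_to_dictionary` function linked above: `node_to_document_sketch` and `CREATED_KEYS` are hypothetical names, the set of attributes grouped under `created` and the latitude-first order of `pos` are assumptions, and the cleaning steps from the previous section are left out."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import xml.etree.ElementTree as ET\n",
"\n",
"# Assumed set of node attributes that describe how the node was created.\n",
"CREATED_KEYS = ('version', 'changeset', 'timestamp', 'user', 'uid')\n",
"\n",
"def node_to_document_sketch(elem):\n",
"    \"\"\"Turn an OSM <node> element into a dictionary; return None for other elements.\"\"\"\n",
"    if elem.tag != 'node':\n",
"        return None\n",
"    doc = {'id': elem.attrib['id'], 'created': {}}\n",
"    # Coordinates as [latitude, longitude]; this order is an assumption.\n",
"    doc['pos'] = [float(elem.attrib['lat']), float(elem.attrib['lon'])]\n",
"    for key in CREATED_KEYS:\n",
"        if key in elem.attrib:\n",
"            doc['created'][key] = elem.attrib[key]\n",
"    # Copy each tag's key/value pair to the top level of the document.\n",
"    for tag in elem.findall('tag'):\n",
"        doc[tag.attrib['k']] = tag.attrib['v']\n",
"    return doc\n",
"\n",
"# Made-up node for illustration only.\n",
"node = ET.fromstring(\"<node id='1' lat='30.27' lon='-97.74' user='example'><tag k='amenity' v='cafe'/></node>\")\n",
"doc = node_to_document_sketch(node)\n",
"# doc['pos'] == [30.27, -97.74] and doc['created'] == {'user': 'example'}"
],
"language": "python",
"metadata": {},
"outputs": []
},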
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our next step is to create a JSON file using the `node_to_dictionary` function described above as follows:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def make_json(filename):\n",
"    file_out = filename + \".json\"\n",
"    with codecs.open(file_out, \"w\") as fo: #https://www.udacity.com/course/viewer#!/c-ud032-nd/l-768058569/e-865240067/m-863660253\n",
"        for item, elem in ET.iterparse(filename):\n",
"            if elem.tag == \"node\":\n",
"                new_dict = node_to_dictionary(elem)\n",
"                if new_dict:\n",
"                    fo.write(json.dumps(new_dict) + '\\n')"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"make_json(filename)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And then we use that file to create our MongoDB database."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"client = MongoClient()\n",
"db = client.austintexas"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"with open('austin_texas.osm.json') as input:\n",
"    for line in input:\n",
"        data = json.loads(line)\n",
"        db.osm.insert(data)\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"querying\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"-----"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Querying the Database"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can query the database and answer some burning questions about Austin, TX."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question #1 - What is the relative size of each of the Christian denominations (measured by number of places of worship)?**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"result = db.osm.aggregate([{\"$match\" : {\"religion\" : \"Christian\", \"denomination\" : {\"$exists\" : 1}}},\n",
" {\"$group\" : {\"_id\" : \"$denomination\", \"count\" : {\"$sum\" : 1}}},\n",
" {\"$sort\" : {\"count\" : -1}}])\n",
"\n",
"#we want to see a pie chart of the results\n",
"denominations = {x['_id'] : x['count'] for x in result['result']}\n",
"\n",
"#limit the number of pie slices by compressing the 4 smallest denominations into an 'other' category\n",
"main_denominations = {x : y for x,y in denominations.items() if y >7}\n",
"main_denominations['other'] = sum(denominations.values()) - sum(main_denominations.values())\n",
"\n",
"#http://matplotlib.org/examples/pie_and_polar_charts/pie_demo_features.html\n",
"types = [key for key in main_denominations.keys()]\n",
"counts = [float(value) for value in main_denominations.values()]\n",
"colors = [\"sage\", \"c\", \"azure\", \"slateblue\", \"mediumpurple\", \"skyblue\", \"cornflowerblue\"]\n",
"\n",
"plt.pie(counts, labels=types, colors=colors, autopct='%1.f%%')\n",
"plt.title('Christian Denominations in Austin, TX')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAD8CAYAAABw1c+bAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXeYFMXWh98zvWk2AZJzziBJcs6SczQgIKbPiNxrzgIG\nBBOiVzDdiyggKggIiiKoIBIliATJOcPG2ek+3x8z6IJkdnd2dup9nnl2pqe76tc9vb+qPnW6WlQV\ng8FgMIQOrkALMBgMBkPWYozfYDAYQgxj/AaDwRBiGOM3GAyGEMMYv8FgMIQYxvgNBoMhxDDGfw5E\n5GkR+e9lbrNORJpdQV03iMi8y90upyEic0Tkpkwq+xEReTczyr5AnadEpFRW1nm5iEgJv04JtBZD\n1hKyxi8iA0Vkuf/E3+s3nsb+ry/75gZVraaqiy5SZykRcUTElW67yara/nLruxT8dSX49/GwiHwr\nIn0zo66rRVU7quplNbbnQkRaiMius8oerarDrrbsy0FV41R1+5VuLyKx/t9uTkZpEpHtItLq9GdV\n3enXeVU384jI2/5z7JSIpIqIJ93n2SJSU0ROiEjZdNvUEZFjIlLiauo2XBkhafwiMhwYBzwPFACK\nA+OBLqdXuYyywq5EwhVsc6Vcq6pxQAXgA+BNEXkyC+s3XBm9gJ1ACxEpmEFlKplw7qnqHf4GJA4Y\nBXxy+rOqdlLV1cCbwLsAIhIOvAc8oao7M1qP4RJQ1ZB6AbmAU0CvC6zzFPAp8CFwElgH1En3/Xbg\n38BvQDJg+Ze18n9fD1gOnAD2A2P8y3cCjr/+k0AD4BZgcbqyX/Ovd8JfRpN03z0NTD2frnPshwOU\nOWtZL7/ma9Idj0nAXmA38Bzg8n93C/Aj8DJwFPgTuD5dWUWAmcARYDNw61lapwH/9Wv9DSgPPAIc\nAHYAbdOtvxAYeon1DgY2+MvdCtzmXx7j3zc73TEu7Nfy33TbdwXWA8eA74FKZ/22DwJrgOPAJ0Ck\n/7t8wFf+7Y4AiwC52LHH1+CO9297Elh69u9yju2/A4YD3wAPXuh39Zf/3IU0+n8HG0jyH5sRQCl/\nWa50v8Gz/mN/EpgH5L3M/68zjnW65RHA78Bt+P6/Fl9OueaVsa9Q7PE3BKKAzy+wjuAzhyn4jHEm\nvh5LevoDHYDcqmpzZnjoNWCcquYCyuAzQICm/r+5VDVeVZeeo+5lQA0gD/AxME1EItJ93+Uiui7G\nTCAMqOv//AHgAcoCtYB2wK3p1q8HbATyAi/hayRO8wm+Rqow0BsYJSIt033fGfjIvy+r8JkY+BqM\n54B30q2rnHkML1TvAaCTqsbjawTGiUgtVU0Ergf2qq+3Ga+q+9KXKyIV8B3Xe/GZ5BxgVrorNwX6\nAO2B0sC1+Boi8DUIu/zbFQAeUb+rXQL98JliHmALMPJ8K4pISaAZvkZ+KnDzRcpOf+zOqVFVb8L3\nW3X2H5sx5ylrAL79LYDPrEdcbMcuBVX1AEPx/ZbD/e8NASIUjT8vcFhVnYust1hVv/b/Y/8Pnxmf\nRoHXVXWPqqaeY1sPUF5E8qlqkqr+4l9+0cts9cX8j6mqo6pjgUig4iXquiiqmgYcBq7xhxA6AA+o\narKqHgJexdeonWaHqk7y1/cRUFhECohIcaAR8JCqelR1DTCRM01qkap+428Yp+M79i/4P38KlBKR\n+PNIPWe9/n2Yo6rb/O8XAfP5u1E91zFOv6wf8JWqLvDrGAO4/ftymtdVdb+qHgNmATX9yz34GrlS\nqmqr6k/n0X42CsxQ1eX+OienK/Nc3AQsU9XdwAygiohcaP30XKnG0zrfV9UtqpqCr9G51HovhfVA\nGvCbqm7KwHINl0koGv8RIF/6AdbzcCDd+yQg6qxtdnF+huKLqf8uIstEpNOlihORESKyQUSOi8gx\nfD37fJeh62LlhwP58YVQSgLhwD7/QNsx4G3/96fZf/
qNqib538bi67Uf9feyT7MTKJru88F075Px\nNbia7vPpss7F+epFRDqIyFIROeLX3BFfo3IpFPHrPF224vst0+ven+59cjqNL+Prrc8Xka0i8tAl\n1gln/m7pyzwXN+O/SlTVI/hCMIMuUv7pxu1qNML59z0jeAX4ASguIv0ysFzDZRKKxr8ESAV6XGCd\nS7l8P+86/h7TQFXND7wITBcR98XKFZGmwL+APqqaW1Xz4Iv1Z+SAXDfAiy+ktAvfscirqnn8r1yq\nWv0SytmL76ohvTGUwDdOkGmISCTwGb6QQQH/MZrD38foYr/dHnwN3unyBN/g/p7zrP9XeaqaoKoj\nVLUsvlDg8PRZMhmBiDQCygGPi8g+EdmHLzw5MF0DnwREp9us8Gmd59F4OvyWFVPxnrMOEWmDL0x5\nG3An8JqI5MkCPYZzEHLGr6ongCeB8SLSTUSiRSTc34t80b/aVRmtiNwoIqd7zSfw/TM4wCH/37Ln\n2TQOnykfFpEIf/bN+UIhlyzHr+kaEbkB35jAC/5w0j58YZKxIhInIi4RKXsp9yOo6i7gZ2C0iESK\nyLXAEHzhp8wkwv86DDgi0gHfuMRpDgB5LxBCmgZ0EpFW/qufB4EUfPtyLv46F0Sks4iU8zcWJ/EN\nltqXoPlyzqdB+H6TyvjCeDWAavjCUR3966wGbhARS0SuxzcecCGNp8OaBzj/uXdRrf500IuNN/xj\nexGJAf4D3K+qR1V1Lr7xnnEXKcuQSYSc8QP4Y+fDgcfxhSN2Anfx94Dv2QONnOPzhWgPrBORU/hO\n7v6qmuoPWYwEfhKRoyJS/6y6vva/NuHLLkkmXVjiCnWt8evYjM+Y71fVp9N9fzM+I92AL/wzDSh0\nifUNwJcZshdfLPpJVf3uMrSeT/t5t1XVU/gGZqf69Q4AvvxrJdWN+Aa///Qf48Lpy1PVP4AbgTfw\nNcSdgC6q6r0ELeXwGdYpfA3FeFX94QLbXXR/0iMiUfgGlt9Q1YPpXtvxZeWcNt378PWejwEDOTNR\n4UIaR+O7kjjmT2k+l45z6vYnGFyDLyPpQpxrX0cBG1R1Srpl9wMdRKT1RcozZAJy6UkJBoMhVBHf\nzY13qeoNgdZiuHqM8RsMBkOIEZKhHoPBYAhljPEbDAZDiGGM32AwGEIMY/wGg8EQYhjjNxgMhhDD\nGL/BYDCEGMb4DQaDIcQwxm8wGAwhhjF+g8FgCDGM8RsMBkOIYYzfYDAYQgxj/AaDwRBiGOM3GAyG\nEMMYv8FgMIQYxvgNBoMhxDDGbzAYDCGGMX6DwWAIMYzxGwwGQ4hhjN9gMBhCDGP8BoPBEGIY4zcY\nDIYQwxi/wWAwhBjG+A0GgyHEMMZvMBgMIUZYoAUYQhcRiQKuAWLOesX+43N4eC7CwuIBUE1D1YPj\neLDtNBwnDfD6X3a692nACeA4cMz/9zhwVFVTs2xHDYZshqhqoDUYchgiEgYUBIr4X0VxuYoSHV0O\nl6skjlOY1NR82HYkbncqkZEOkZEOUVGK261ER7v+esXEhBEdHY7bLURG+ipwHLDtv1+OA16v4vU6\n2LaDbSter+LxOJw65eXECYdTpyAx0UVSUhgpKZG4XGlERJzAso4gchDH2Uty8nZsewuwFfgT2KOq\nToAOo8GQaRjjN1wRIiJAAaAKUAW3uxbh4bXxeEqTmhpPdHQKefKkkT8/FCoUQaFCUeTNK+TLx1+v\n+HgQyXrxqpCYCMePw7FjcOKE7+/hw8quXUns3OnlwIFwkpIicLsPYlnbSUtbT3Ly7/zdKPypqklZ\nL95guHqM8RsuiN/gi3Da4GNi6mBZNUlJKYvLZVGsWArlykVQtqybUqWgeHHImxfCckAUMSUF9u3z\nvfbuhV27UtmxI5W9e4WjR91ERBwjLGwlCQmLUF0OrFTVw4GWbTBcjJAxfhGxgd8AwRcHvltVl1xh\nWfcD76hqsv/zbG
CAqp68lPWzMyLiBq5DpDFxce1JSbmOsDCLEiU8lC8fSZkyUZQqBSVLQu7cgemx\nZwdsG/bsgU2b4I8/0li7Nolt29y4XAlERKwiIWERjnO6MdgfaLkGQ3pCyfhPqWqc/3074FFVbXGF\nZW0DrlPVI5mxflYiIkWBRkRGtiAiojXJyWUoWjSZWrWiqFEjgqpVIX/+QMsMDhzHd3Xgawy8rFuX\nyNatUUAK4eG/curUl8ACYKOGyj+eIVsSqsbfB18PvaeIxAJfAHmAcOBxVZ0pIqWAr4HlQG1gPXAz\nMAx4GfgDOKSqrUVku3+dVGAqUBSwgOfwDXKOSb9+luzweRCR4kBHYmM7Y9sNgRgqV/ZQu3Yc1aoJ\nFStCVFQgJeYsVOHAAVi3Dn75JYlff1VSU9OwrAUkJs4CFqjq7kDLNIQWoWT8XmAtEAUUBlqp6koR\nsYBoVT0lIvmAJapa3m/8fwKNVXWJiEwCNqjqK/4efB1VPeovextQB2gJtFfV2/zL4/zlnrF+VuLf\nvwaEh3cjIqI3jlOYevVs6tePoVo1KFYsdMM1gUDVN16wciUsXZrAqlVhiBxF9WuSk+cA3wfiPDGE\nFqFk/Ol7/A2AiapaTUTCgXFAU8ABKgClgWjgB1Ut6d+mJXCvqva4gPHnBeYDnwJfqeqP6b/Pqn9o\nEckDtCcmpi9pae3Il8+heXM3jRqFUbkyWFZWyDBcCo4Df/4JK1YoS5acYsOGKCIjN5OY+B6q01V1\nZ6AlGnIeOSD14vJR1aUikk9E8gOdgHxAbVW1/SZ9OtaRvlWUsz6fq9zNIlLLX+bzIrJAVZ/LhF34\nByJSBperDzEx/QkPr0K1aqm0aBFH/fpQsGBWSDBcCS4XlCsH5coJ/frFk5YGq1ZV5dtvn2fx4pES\nF7fN3whMU9UdgZZryBmEpPGLSCV801UcAeKBg37TbwmUTLdqCRFpoKpLgYHAYv/yU/7tzujBi0hh\n4JiqThaRE8CQC62fAfuRC+hLbOz/ER1dkRYtoGnTKGrVgsjIiIysy5BFhIdDvXpQr54brxdWrqzM\nggXPsnjxcxIXt42kpPdxnGmquj3QUg3BSyiFek7H+MHXe39EVeeKSF5gFr5pApYD9YEO+BqGuf5l\ndfAN7t6kqikicjdwN747O1unC/Vch2/g18E3XcAd/nGEM9a/yv0IA9oRG3snHk8batf20rlzLPXq\n+UzDkDPxemHVKliwIJlFiwTL2kFS0iQc50NVPRhoeYbgImSM/3LxD+7OUtXqAZZy+iaqGkRF3Yrq\nTRQtCt26xdGihRAfH2h5hqzG64XVq+Hrr5NZvFgID59PYuIrwGKTJmq4FIzxnwe/8c9U1WsDqCEX\nLtdQ3O57CA/PT+fOkbRvH0axYoGSZMhunDoF8+crU6cmkpBwjOTkV1D9UFWPB1qaIftijD8bIiIV\niIr6F45zA/XrKz17RnPttb6BQIPhXKjCmjXw2WdJ/PKLi/DwGSQljVPfVBIGwxkY488m+MM5bYiJ\neQLVunTvbtG9e7i5a9Zw2Rw9CnPm2Hz2WSppabtJTHwRmGymojacxhh/gPEP1vYhOvpZ4uMLceON\nsbRpw19TEBsMV4ptw/Ll8PHHCWzalIrH8xSOM0lVUwItzRBYjPEHCBGJxuUaSkTE45Qo4eaWW3w5\n9yacY8gMfv8dJk5MZMMGDx7PMzjOf4Jh0kBD5mCMP4sRkTBcrqGEh4/m2msjGDQohqpVAy3LECr8\n8QdMmpTIb7+l4fU+j21PMM8VCD2M8WcR/hh+d9zu1yldOjf33BNLpUqBlmUIVbZs8TUAq1bZ2PYo\nvN7xqpoQaFmGrMEYfxYgIk2Jjn6LPHlKc++9MdStayZGM2QP/vwT3n8/iV9/tfF6n8a231DVtEDL\nMmQuxvgzERGpRkzM64SH1+fOO6Np08bE8A3Zk23bYNy4RLZsOURy8lBV/S7QkgyZ
hzH+TEBEihEd\n/TLQjUGDIune3UWEmTrHkM1RhR9/hHHjkkhNXUhS0p1mdtCciTH+DERELCzrHsLCRtKjRzg33BBO\nbGygZRkMl0dqKnz8cRqffupF9WU8ntEmBTRnYYw/gxCRKkRHf0KxYmV47LEYSpQItCSD4erYvx9e\ney2R1asTSEm5Hd8UJsYwcgDG+K8SEYkgIuIJXK4Huf32SLp2dZk4viFHsXw5jBmTyKlTK0lKukVV\n/wy0JMPVYYz/KhCRBrjdU6hSpQAPPRRtplcw5Fi8Xpg61eajj1Lxeodj2/8xvf/gxRj/FSAisURF\nvYxlDWL4cDctW5r0TENosG0bPP10IocPLycpaaCq7g20JMPlY4z/MhGRhkRFfUGjRnHce6+bXLkC\nLclgyFq8Xnj33TQ+/xzS0sap6kOBlmS4PELy0YtXgogIYWH343aP5NFH3TRpEmhJBkNgWLMGFiwI\ni8kd6+BJuSfSHVHIk5J2p5n6IXgwPf5LQETiiI6eTL58rRg9OoYiRQItyWDIek6cgFdftVm61GrV\npSZtBzQmJSmVGW/NT/5j5baDnpS0rqr6W6BlGi6OMf6LICLVcLvn0Lx5fh54IMrciGUIOVRh3jzl\njTekYOF4e8ij3az4PGfen7Jy4Xr98p0FKbZtD/em2W8HSKnhEjHGfwHEsm4iIuJt7rvPzfXXm9Fb\nQ+ixZw+MGuW4du6Q3kObSa3mVc676uG9x3jvmemJiaeSP/KkpN2jqnYWKjVcBsb4z4GIROF2TyA2\nti8vvBBNmTKBlmQwZC1eL0yZ4jB5sqtC9WLODSM6uyIiLj4kmJyYwocjP0/av+PwktRkT3cz42f2\nxBj/WYhIftzu76hZsyyPPeYmJibQkgyGrGX9ehg5UiNTE/WW4R1cpaoUu6zNba/NZ2/NT1m/dPMu\nT0paK1XdnUlKDVeIMf50iEgJoqJ+pEePggwbFmFy8w0hRUICTJhgs2CB1ahNFToNboHrCu9CV1V+\n+PxX73fTlpxIS/W2VdVVGazWcBUY4/cjIpWJilrEkCF56NPHCrQegyHLUIXFi2HMGPLkjrJvfbSr\ndU3B3BlS9Nolm3Ta63OT01K9A1R1ZoYUarhqjPEDIlKfyMj5DB8eR7t2pptvCB0OHoSXXrLl9w2u\nzgMbSqOOtTK8il2b9/H+c58leVK9T9lp9itmqofAE/LGLyJtiYr6nCefjKFhw0DLMRiyBtuGzz9X\nJk2SkmXz27c80tWKio7KtOqOHTzJxKenJp46ljQpLTXtfmP+gSWkjV9crr643R/wwgtuqlcPtByD\nIWvYsgWef94JP36UgXe3dlWqUzZLqk06lcyER6Yknjh86g1PatojWVKp4ZyErPFLePjtuN1jGTcu\nmrJZc+IbDAElJQUmTbKZNcuq1bCs9ryrvYSFZe0U4gnHkxj/0OSkhBOJo9JSvSOztHLDX4Sk8YvL\n1YfY2A+YMCGaokUDLSfzsW244w7Inx9GjYIPPoDZsyG3fwBv2DCoVw/WroVXX4XwcHjiCSha1Jfp\n8cwz8PLLAd0Fw1WybBm88ILGRYoz+KHOVuGSgZtC/MSRU4x/aHJS0snkR71p9msBExLChNwkbSLS\nArf7Q8aOdYeE6QN89hmULAnJyb7PItCnD/Tte+Z606bBiy/Cvn0wcybceSf8979w441Zr9mQMRw7\nBuPG2bL8V1fr7rWldZ+GAc9Yy5U3jjtHDYge/+/Jo6wwK8H22pMCrSnUCKlHRYlIDSIjZzFypJty\n5QItJ2s4dAh++QU6dfKl7cHff88mLMwXDkhJ8b3fs8e3fY0aWafXkDGowuzZyo03UmjfFh4dP1ha\n98k+yQt5CuTijlEDoiPd4W+ISwYEWk+oETI9fhEpRWTkdzz0UAy1Mj5lLdsyfrwvzJOY+PcyEZgx\nA+bNg4oV4a67IDYWBg6E0aMhMhIeeQTefhuG
Dg2cdsOVsWsXjBrlWLt30ef2llKjaaWA9/LPRb4i\nebj9+f7utx+dMlFEUlT180BrChVCoscvIvlwuxcxbFguWrYMnTz9JUt8cfzy5c9c3rUrTJkCEydC\n3rzw1lu+5eXK+RqKsWNh717fd6q+GP+oUb6wgSH7kpYG77/vMGwYlXPZ8vTEW101mlYKtKoLUrBE\nPm59pm90RFT4ZBFpEWg9oUKOH9wVkRjc7iV07VqRO+4IrTmV330XvvkGLAs8HkhKgqZN4dFH/15n\n/37f5/fe+3uZKvz73/Dkk/D6677B3337fA/dNlcA2ZO1a2HkSI3ypjiDR3S0SlQMrmdGbFmzg49e\n+OJ4Wqq3qnmcY+aTo0M9IiJER8+gQYPy3H57aJk++Ax72DDf+9WrYepUn8kfOeLrzYPvVv3Spc/c\nbt48aNAA4uIgNdUXGhLxxf4N2YuEBHjzTZsfFlpN2laTDoOaWVc6v04gKVejJM261435ceaKWSLS\nQFXTAq0pJ5OjjZ+wsHvIn78xDz8cZSZc4+8Hwr/zDmzd6ntfuDAMH/73OikpMH/+3+mbffrAww/7\nUjwffzxr9RrOjyosXAhjx5I3bzS3vnYzufPHB1rVVdGqT8Pwbet3V9q1ef8Y4L5A68nJ5NhQj4jU\nICpqCRMnhk7apiE02L8fXnrRcW3aJF1vbCT12+ecrKvEU8mMu/eDpMQTSTde7WCviCSoauzF1/T7\nBRRR1bn+z08Dp1T1lavRkF0JvmvCS8Af15/F8OFRxvQNOQbbhk8/dbjlFkrbx3jiP0NzlOkDxMS5\nGfRoj+jwyLCPRKT8xbe4IJfTq60FdLzCbf+BiGRrb83W4q4Yt/sdGjbMR9u2Jr5jyBls2gSDB2vE\nJ5MZ/O/O3PZ0b1eUO2cOWxUvX4gONzeLjogKnysi0RlZtogsFJE6/vf5RGSbiIQDzwL9RGSViJy+\ns7GKiHwvIltF5J50ZdwoIr/41337tMmLSIKIjBGR1UBDEXlCRJaJyFoReecsDS/4y/hDRJpk5D5e\nCjnO+MXl6kdsbA9GjHAHWovBcNUkJ8Prr9vcey91KuTiqYm3uirUKhVoVZlOg+truirUKlUk0h2e\n0Xf1Kmf15v0DyU8An6hqLVWdCghQCWgH1AOeEhFLRCoDfYFGqloLcIAb/EVFA0tVtaaq/gS8qar1\nVLU64BaRzuk0WKpaH7gfeCqD9/Gi5CjjF5HSRERM5Pnno3Eb3zcEOUuXwsCBxC9dxP0vDaD3/7WX\nYMzYuRJEhD73dHBHRUd1EZHuWVGl/3UaBb5S1TRVPQIcBAoBrYE6wHIRWQW0Ak6nxdnAZ+nKaCUi\nS0XkN/966Z9UP8P/dyVQKoP35aLkmKweEXERHf05gwa5qVAh0HIMhivn6FEYM8aW1atd7XrWkRa9\n6mfLO28zm4iocPre3yHmg+dnTBKRBap6KgOK9fJ3h/diDyDwpHtv87dffqiqj55j/ZTTzxkQkShg\nPFBHVfeIyFNn1Zd6jnKzjJzTfRC5gYIFy9K7d0j+kxhyAI4Ds2YpN9xA0aM7eXTCLdKiV/1Aqwoo\nZaoWp2r98u6IqPCXMqjI7cB1/ve90y0/CcRdZFsFFgC9RSQ/gIhcIyIlzrHuaZM/IiKxQJ8rVpwJ\n5Igev4jEEBn5Kg8+GEuIXAobchg7dsDIkY61fy/97m4r1RtVMB0YP52HtnT/vnzrIBGZpKrLL2PT\naBHZle7zK8AYYKqI3AbM5u94//fAw/7wzWj/sn9k9qjq7yLyODDfP6ibBtwF7Ey/vqoeF5F3gXXA\nfuCXC+jM8pz6HJHHL5GRo2jQ4D6eeSZDMwAMhkzH44GPPnKYPt1VtU5J7X9vBwmLyBH9sQxlxXfr\ndNak79enJnuuNY9tvHqC3vhFpASRkRv56CM3BQoEWo7BcOmsXg2jRmk0ac4tIzpaxcsXDrSibIvj\nKK/d/0Hi
wd1Hb1PVjwOtJ9gJfuOPifmSXr06MmSI6SYZgoOTJ+GN121+/Mlq3qE619/ULNCKgoLt\nv+/mvWdnHElLTSuhqkmB1hPMBHVAXEQaExbWhgEDjOkbsj+q8O23ysCB5N+6lodfv9mY/mVQqnIx\nytco4Q4Ltx4OtJZgJ2h7/P70zXXcf39l2rYNtByD4cLs2wejRzuuP7dK90FNpG6b6oFWFJQc3X+c\ncfd/mOj1eAupakKg9QQrwdzjv548eYrTpk2gdRgM58e24eOPHQYPppyVwFPvDjWmfxVcUyg35aqX\nQERuCbSWYCZ4QyQxMU8xaFCsmW7ZkG3ZuBFGPq8Riaf05ke7UrZ6iWDuaGUbmvesG/Pn+l2PiMhb\nquoEWk8wEpTGLyI1iY+vRsuWgZZiMPyTpCR45x2befOses0r0u22vkH5cJTsSslKRcmVNy7+0J6j\nHYGvAq0nGAnOszEm5jH69YskLCjbLUNO5qefYMAAcq34meFjb6DHHW1DZn6drEJEaNm7fmxUTKR5\nMtAVEnSDuyKSj4iI3UybFkl8cD9xyJCDOHzYN7/O2t9cHfrUk6bdrrv4NlfJ9De/5o8V24jJFc39\nrw4CYM6HP7Bx+Z9YYRZ5C+Wi993XExUTyfbf9/DlfxZghbnoP7wT+QrnITkxhSmvfMWQJ3tfpKbs\nhzfNZtSQCUnJiakNVHVtoPUEG8HXFXG5bqFxY9uYviFb4Djw+efKTTdR/OQeefztIVli+gB1WlVj\n8BM9z1hWvkZJ7n/tFu4bdzP5iuRh4QzfTAE/zlrB4Cd60nlIS5bNWwPA99OW0rJXgyzRmtGEhVs0\n6VonItId8VCgtQQjQWX8IiJERt5Pjx5magZD4Nm2DW67zQn74D296b523DWynys69mITPmYcpasU\nw31WfeVrlsLl8iU8FC9fmBNHfBmPluXCk5KGJzUNK8ziyP7jnDiSQOmqxbJMb0ZTr12NMNtr9zo9\nYZrh0gkq4weaEB+fi2rVAq3DEMqkpsI77zjceSfXFgmXpyYOc1Wpf7VPCcx4ln+3joq1fVPFt+hZ\nj6mvz2XR57/SoENN5n/8I+1uyPIHP2UosbmiqVKvnAP0D7SWYCO4RkcjInrStq3bpHAaAsbKlTBq\nlMZYjg4e2YeiZQpmy5Px++lLscIsajarDEDh0gW464WBAGxbv5v4PLGoo3w8ZhZWuEWnQS2IzR18\nF9LVGlaI3rxmR1/gjUBrCSaCy/jDw3vQoIGZrtaQ9Zw4Aa+9ZrNkidWySw1pN6BJtj0PV3y3jj9W\nbGPoM/8hIeKSAAAgAElEQVScAl5V+f6zpQwY3pmZE7+j4y3NOXbgJD/PWUm7gcF3BVCuRgk8KZ56\nIuJW1eRA6wkWgsb4RaQ4UVGFqFQp0FIMoYQqzJ+vvP66FCgUz9Dxg4jPExtoVeflj5XbWPTlcm57\nri/h55jeeeXCDVSsUwZ3bBRpqWkIAgKeVG8A1F497pgoCpbIl7L3z4MtgLmB1hMsBI3xA9dz3XVe\nLCsy0EIMIcKePb75dXZsk15Dm1G7ZdVs1cufMvYrtq3fTdKpZF4Y9g5t+jdi4Yxl2Gk2k56ZDkCJ\nCkXofrtvWhNPahorv1/P0Kd86ZtNutbhg5EzsMIs+j/QKWD7cbVUb1gh9vDeYz0wxn/JBE0ev8TH\nz+P//q8d7dsHWoohp+P1wpQpDpMnuypUL+bcMKKzK8I8HCXbsm/7Id5+dMoBT0paYfOQlksjKM5m\nEQknPLwZdesGWoohp7NhAzz/vEamJuqgJ7pTukqxYMt8CzkKlcyHFWbFQVp5YFOg9QQDQWH8QCMK\nFfJwzTVZlyRtCC0SE2HCBJsF31qNWlWRTkNamPl1ggQRofJ1ZWTlwg0dMMZ/SQSH8btcrWjcOCbQ\nMgw5lMWL4eWXyZMrilvH3sg1hXIHWpHhMql0XVn378v/7AW8FmgtwUBwGH
9sbB3KlMlWA2uGHMCh\nQ/DSS7ZsWO/qPKCBNOpU25xjQUrBEnlxbKdCoHUEC8Fh/I5TiWLBe2u5IZth2775dSZNkpJl88st\n/xkiUdEmihjMXFMwN2mpaXlFJFxV0wKtJ7uT7Y1fRISwsOIULx5oKYacwNat8PzzTvixwwwYfr1U\nvq6sCeTnAMLCLdxx7pTEE0mlgM2B1pPdyfbGDxQhIsImNvveNGMIAlJS4L33bGbOtGo1LCs9Xxom\nYWHG83MSeQvlthNPJJXDGP9FCQbjr0CRIh7AHWghhiDl11/hhRc0NgKGvNCPwiXzZ8v5dQxXR8ES\neSN3/rG3POZGrosSHMZfpkx4oEUYgpDjx2HcOFt+XeZq3a22tO7b0Aze5mAKFMsbFREZXiXQOoKB\n7G/8ERHVKF06+KYNNAQOVZg7Vxk/XgoVycXQ8YMlGGeeNFweeQvnJizCujbQOoKB7G/8kZFlyG+e\ns2C4RHbtgtGjHWvXTnrf1lxqNqtievkhQr7CebC9TplA6wgGsr/xi0SYh6obLkpaGkye7PDJJ65K\nNYozcOKtrnPNTmnIucTEu7G9dlygdQQDwfCfEW6M33BB1q6FkSM1ypuig5/uSYmKRUy6TghihVk4\njhqzuASC4SBFYJmrdcM5SEiA8eNtFn5vNWlbTToMambm1wlhrDALddSYxSUQDMZvevyGM9m5E4YP\nh2PHCIsMd9VpXonwyDC+/eTnQCszBBBVUFVLRCxVtQOtJzsTDI5qjN/gu+N22jRif/vNTjpwwBKx\ncBC8yanyy7w1AMTlLakxeYo4AVZqCCyW/2WM/wJkf0dVDTehnhDEcWDxYuSrrzR22zZNS0pyNWvV\nyi7br5/17hsTiIrIrZaFnEo+QlRcPOFRsd7Eo4fCUpOPW4XKNrSLVWltFS7XiNyFKiAm/BMSOI7N\ne/cUVFXHE2gt2Z3sb/wQZow/REhKgi++IGLhQjts3z4r2u3Wrr16OV1Hj7YaNWtGQkKCVbVYaa1X\n4y5NS0ty7du/2Bl0/VjXe/Puddxxca4hb33KqYP7WDVnmvXHz+/Yv375jEsdRwqUvs4uVqWNq3D5\nRpK3WHVcVjCc9obLxfF6EJfL9PQvgez/HyCSSHJyoFUYMovdu2HqVGJWrvSmHT4cVqZcOafXzTe7\nOnXvTuWqVUVELADHcWhZp75dOG8dalS8wUpOPcbK9e+5iheownODf3R9OPcBfXNAK9re9bDT88lX\nXS6XywLYvWE1K7/61Nq6bIq9et4rLtuTInmLX2sXr9LGVbhCY8lXshZh4WZmzpyAY6chYoz/Usj+\nxq+6nYMHawVahiEDWbYMvvhC47Zs0dSTJ10NmzSxez/7bFj7Tp0oULDgOeMytw68QY8dTnb16/ic\niAjRUdfgjsxtb971i1W9bGuGdnpD/tj5M+9PfIA1cz9z+r/writfiTIUq1KTYlVqgi/uy6EdW1g5\n6xNr089z7fU/vCOe5FOu3IUrOsWrtJbCFZpKgdLXERFlUsGDkdSk47jCIhIDrSMYyPYPW5fw8NHc\neONDDBpkJtYKVjwe+Oorwr791oncs0fCLEs6du1qd+/Tx2rWqhVu94Xn3/tw4kRG3Hkffa6fTO74\nEn8t/+r7e7Ro3mLOwDYj/4oFeh0v782+x9m452dX+3se14b9hsqFUjxPHtrPiplT2Pjjt3p4+zYn\nJeG4FZevpFOsUkuKVGzmKli2AVGx12TAQch6UpNOsHjyfRzbtxERodmNr7N9zWx2b/iOvMWq0fzm\ntwDYvGwqqYnHqNby9gArvjr2/rGIBZOGrElJOFoz0FqyO9m/x+/17mDfvhTM7JzBxaFDMHUq7mXL\nbOfgQatIsWJOr379pHOPHlKjdm1Oh3AuxoZ16xhx5720aTTyDNMHqFimi/yyauwZrh7mCuO2LhNc\nG7Yv4sMJI3T1nGk64IV3XdcULXnO8u
PzF6Ll0AdoOfQBAaykk8dZPWeaa8PCebpk+iw76cQRKzpX\nIS1SqblTtGJzq1C5hsTkLnyFByVrWTr9EYpXbUubYR/g2F6STuznyK619Hx0EYsn38/Rvb8Tn68U\nm5dO4fq7pwda7lVz6shO1HHMM3cvgexv/LCLvXvNtMzBwNq1MH26xm3c6KSeOGHVuu46u+/DD1vt\nO3emWPHil51ak5KSQocmLZ1q5ftSuljzf2xftngrFv7yjBw+sZN8uc5sFKqUasbIW350TZxzt/Nq\nn2Z0vP8prd9nsIhc+MIxOj43jfoPo1H/YQJYnpQU1n7zpaxbMMu14qtn7MRjh6yI6FxauHwTp1jl\nFlahco2Iy1eKi5Wb1XiST7J/y9K/evUuK4yI6Fw4thdVxetJxmWFsXbBeKq2uA3/kEhQc/LQNtuT\ncnJ9oHUEA8Fh/AcPZq//KoMPrxfmzcOaN89x79ol6vVK+06dnO733We1bNuWuLi4q3KT9o2a2tER\nxaV+jf87Z6PhcoUR4y5gb9zxk9Xk2hL/+D4sLII7uv7Hte7P7/jozYd01Zzp2n/UO648RS79aW4R\nUVHU6dKPOl36CWB5vV42Lpona+d/Yf22YKz356kPh7mscAqV86WQFirXkDyFKgY8hfTU4R1Exebl\nh//ezdE968lXvAYN+4yieNU2fPFCS4pUbE5EVBwHt6+kVocRAdWaURw/sDkJ1T8DrSMYyP4xfpE8\nRETsZ968iEBrMeCb4376dKJ+/tlm/34rb/782qNPH+3Ss6fruvr1sTIo9faR+x/QD975iAGdpklU\nZO7zrrdw2Shsz3777p4fXLBijzeFd7+6y/nzwApXpwef07o9brpo7/9ScByHbct/YvXc6Wxbucw+\neeiASx074Cmkh3asYuaY6+k6Yi75S9ZmyfRHiYiKo07nR/5aZ/Hk+6nSbCiHdq5mz8aFXFO0CrWu\nfzBLdWYk059vdOL4vj86qepPgdaS3QmGHv9xHMc3L4t5/GJg2LTJd9fs+vW258gRq8q119p97rnH\n1bFrV0qXLStAhl6RzZ01i3fffFt6tJ3IhUwfoHr5vsz4ZpBlO14s1/lP54iwKP6v+3uuNVvmM/nV\nR3XV7Gnaf9Q7rlwFi1yVVpfLRdl6TSlbryn4M4f+SiH99RN7zfyxLm9qcpankMbkLkJMniLkL1kb\ngNI1u7Lmm9f++v7wrt8AyFWwLL9++SzX3z2NRf+9hxMH/yRXgeCc2Tjx2N4IYFugdQQD2d74VVUl\nPv53fv+9BnXrBlpOaOA4sHAhrtmznZjt2/GmpLhatm1r93zlFavN9deTO0+eTAsI79m9m1t6D9Cm\n1/2b/NdUvmiDkjdPOcKsSGfH/jWuMkXqXLT8GuXaUblEE9c7s++0x/ZsRJd/j9I6XQdkSO//NP9M\nId3KyllTrE0/z3HWL3pHPUmZn0IanasgsbmLcuLAFnIVLMeeP34gT+GKf32/4qsXaDpwHLY3Def0\ntDYuF3ZaSobqyCo8KafwepLCgf2B1hIMZPtQD4CEh4+ib99/MWxYtm+ogpaEBPj8cyIWLbLD9u2z\nYuPitGvPnk7X3r2thk2aEB6e+U+/dByHykVK2nmia9Gq4dOX3Lh8Nm+QU6NsU+nSaPhluffKTXOY\n8sOTWqRyde03coIrPn+hyxd9BZw8tJ8Vsz5h4+Jv/k4hzVvSKVY5Y1NIj+xex+LJ9+HYacTlK0Xz\nm94kwh3PjjVzOLJnPbU7/guAX2Y8xe7fvyNv0aq0uOXtq643EOz9YxHfThy8PjXxWLVAawkGgsP4\nRdpQtux0Jk7MFWgtOYqdO+HTT4lZs8ZOO3TIKluxotNnwADp2K2bVKxcOcszVXq27+Cs/mUzfa7/\nn8uyLn1IZ9WGj9i2c7bz2M1zLz9zyJPAO7Nud3YeWe/q9vBLWqtTnwzt/V8KvhTS6WxY+LUe2LLJ\nST
px2JdCWrGZU7RSC6tQuQbE5L66kFROZ8VXo+3fvnnjVW9aymWNVIuIDfyGL/rxOzBIVS97qgAR\neVRVR13udldQT0mgkapOuch6pYBZqlr9nN8HifHHEBZ2jFmzwokyt9dfMY4Dv/yCfPmlxm7dqp6E\nBFfj5s3tXv37W+06diR/gQIBk/b6yy/z/GPP0q/jp8TFXF7P2+NJ4P0ZbRh528/ERF14TOB8/Lpx\nJlMXPaPFr63t9Hn2TSsuX8ErKicj8KSksO7bL1n77Szd+/t65+8U0sZOscots20KaSD58uW2Jw5t\nXzlQVedcznYickpV4/zv/wesUNVxl1t/+nIyExFpATyoql0usl4pgt34ASQ+fj1PP12F2rUDLSW4\nSE2FWbMI//ZbO2LfPldEWJh06t7d7t6nj9WkRQuiskFDumzJEjo2a0XHZuMoVqjeFZXx3y872n2a\nP2bVqtDhinUkp55iwqxhzt6jf7i6P/6K1mjfI8t7/+fC6/Xyx+L5/Db/C3b+tsqbcOSgP4W0gT+F\ntFG2SCENFHZaKh+OKOVxvJ6Cqnr8crY9y/jvAKoD/wLeBKoC4cDTqjpTRG4BuuK7p6gs8LmqPiQi\nLwAjgLXAOlW9SURuBO4BIoBfgLtU1RGR64GR+MZ/DqtqGxG5BngPKA0kAbep6loRaQ686pfqAM2B\nb4FK+AaxPwC+AP4LxPjXu1tVl+Qc44+IGEP//vczZEjw32mS2ezfD1OnEr18uW0fPGgVK1nS6dW/\nP5179HBVr1EjW/UUT548SeXCJZxq5W+kTtUhV+xccxc9qPliczk3Xz/mqs+PpetnMP2n57VUrfpO\n72det2KvyX+1RWYojuOwbcXPrJ4znW2rfrFPHkyfQtran0J6bcjMQrpn40IWTBp6RfH908YvImHA\ndGAuUBJYr6qTRSQ3PuOuBfQFngBqAh7gD6Cxqu45qwGpDLwI9FBVW0TeApYAXwMrgKaqukNEcqvq\ncRF5Azioqs+JSEtgrKrWEpGZwGi/kUcDqUATYMTpHr+IuAFHVVNFpDzwsarWvZjxB8+ZkZa2gF9+\nGcaQIfGBlpItWb3ad9fs5s1O6okT1nUNGth9HnvMat+5M0WKFs22XcFW1zWw8+auRu0qg6/KsKuU\n7SELf3nKUtWrbtgaVO3JtWXbyFszhzKmW316PfUq1dt0vaoyMxKXy0XZuk0oW7cJ+DOH9mz8jZWz\nPrG2LPvUXjN/3BkppIXKN5L8pWrn2FlId66d7/GmJEy7ws3dIrLK/34Rvp73EqCLiJweL4gESgAK\nLFDVUwAisgFfI7HnrDJbA3WA5f5zMQpftlF9YJGq7gBId3XSGOjpX/a9iOQVkTjgJ2CciEwGZvgb\nmLNP7gjgTRGpge/hMxUuZaeDx/jhJ7ZtizL5/H48Hpg3j7B585yo3btFVKV9585Oj3/9y2rRpg0x\nMTHZ/srozltu0f17jrn6d3rnqkMqxQs3Is32cODYnxS6puxVa4uOimdE32nWT799ymdP3aerZk9z\nej35qhWTJ+9Vl50ZFK10LUUrXQtnpJB+Ym1aMtdZv+g/6kk6+XcKafkmUqBM3RwzC+n2NV+lOo53\n9hVunqyqZ8z+6z8Xe6rq5rOW18fX6z6Nzfk99ENVffSs7TtfQMfZ/wCqqi+KyFdAJ+AnEWl/ju0e\nAPb5w0sWcEn5uEFj/Kp6UmJjv+P779vTpUv2iVVkJUePwrRpRC1danPggJW/QAHt2a8fXXr2lNp1\n6+IKoglXPv3f/5g2ear0bv9fIsJjLr7BRXC5XMRGF7J/377YVeiashl2fjS+th81K7SX8V8OYUy3\n+vR+5nWqtuyYUcVnGvlLlqX93Y/R/u7HXAAnD+9n5cxPXb8v/kY3L/vYTjn1dwpp4YpNXYXKNiAq\nNns2ahfi+IHNpCQcdYCVGVjsPOBefDF6RKSWqq7iwjcqpolImKp6
gQXAlyIyTlUP+WP4scBS4C0R\nKaWq20XkGlU9CiwGbgCe9w/eHlLVBBEpq6rrgfUiUheoCOwG0rfY8f5lADfjb/gvRtDE+AFEpCOl\nS3/Ce+/ljK7KpbBxI0ybRtyGDXbq0aNW9Vq17N4DBrg6dO0qpUqXDrS6K2Lr5s00rFpTW9R7knIl\n22aYSf+48hUST26y7+87JVMawEVr/sfMpa9ouUYtnJ6Pj7Wic+XJjGqyhKSTx1kz9zM2LJyr+zef\nTiEtqEUqNg+qFNIl0x/1bPzxwze8nuQrmnBIRE6qavxZy6LwDao2AlzAn6raVUQGAXVU9V7/erOA\nl1V1kX+Atyu+rKCbRKQv8Ih/+zR8g7vL/IO7o/zLD6hqexHJgy/EVAZIxDe4u05EXgda4hvYXQfc\ngi/cNA/IC7wPzAY+8y//2l9PvD/GP1NVrz3nfgeZ8YcRFXWIt97KTZCa3kVxHPjuO1yzZ2vMjh1q\nezyuVu3a2b3697datW9PrlzBfSuDx+OhYqHiTrECrbVpnREZatAnTu3i0zl9eenOlYSHRWZk0X9x\nMukwb3051D6atMfq89xbVG7WLlPqyWrOSCHduN5JPHbYioiK08IVmmTbFFJvWgqTH6qYnJaaUF1V\ntwZaTzARVMYPIJGRL9Gly33cfXfOmbQtIQGmTyfyxx9t1/79Vnx8vHbv08fp0rOn1aBxY8LCgiYi\nd1HaNmzs7Nx8kh5tJ7lcF5hb50r5YEYbZ0in11wVizfM8LLT8/2qD/hq2ataqVk7p/ujL1vuuOBu\nkM/G6/Wy6cdvWDPvc3b9tsp7KhumkG5d/hk/ffKvpalJxzP3x86BBJ/xi5QnOnoNX37pJpgNcds2\n38Rna9bYniNHrAqVKzu9+/eXTt27S/mKFS++fRDyzKOPMmHsBPp3mkZ0VOY81eqLb4dpxWI1tGez\nRzPdkU4mHuLNLwfbJ1IPWn2fn0DFxq0zu8qA4TgO21f+zKrZ09m2apl96tB+l2Pbkr9UHbt41TYB\nSSH98qW2pw7tWDlEVYP/KTJZTNAZP4DExa3ioYdq0qRJoKVcOo4DP//su2t22zZNS0x0NWnZ0u49\nYIDVrmNHrskbfANrl8N38+fTt2M3urZ+m0L5zplanCGs2zyNDZs+dp4avCDLuqLfLn+XucvfpGqr\nTnbXh1+0omJDYwjKn0LK1mU/2cf37/WnkFb330vQJFNTSI8f2Mzno1uetNOS86uqJ1MqycEEp/GL\n3ELt2m/wyivZO68zJQW++ILw77+3w/ftc0VFRkqXnj3tbr16WY2bNycyMnPi0NmNQwcPUr1kOa1X\n7S6qVeibqUFirzeF9z5ryTNDfiA+Jl9mVnUGx07t462ZQ+1TaUet/qPfoVz95hlex6HtW5jyyLC/\nPh/ds4O2dzzEyUMH2PTzdxSuWJW+z44HYNXsaSSdOErjgVn3HN3DO/9kxaxP2PTz987R3bs4nUJa\nrHIrilRo6srIFNIl0x/zbPzxgze9nuTgfYBAAAlW448hMnIv48fHU/bqc7YzlH374NNPiV6xwvYe\nOmSVKlPG6dWvH5169HBVrV49Ww2OZQWO41C9RFk7OqwibRqNtLJi/yfP7GJ3azzcqlu5W6bXdTbz\nlk1g/sq3qd6um93lX6OsyJjM6Zs4jsML11fnrg/n8dlzDzD0rWnMeO4BGg24jbzFSvHh/TcyZPxU\nXBn0YJwr4eThA6yc9Qm/L/pGD+/40zmdQlq0cguKVGx2xSmkXk8ykx+uZAZ1r4KgDJKraqKEhT3N\nhAnPMWbM1SeBXy0rVsCMGb67Zk+etOo1amT3feopq12nThQqXDjb3jWbFdzYo6eTeFJdnTo8mWXz\n3hTIe621Zut8u27lblnueu3r3UndSt0YP3MwY7rV1f6j3xX/HbYZypZffiBvsdK4c+XG8aahqnhS\nkrHCwln037doNGBYQE0fID5f
QVoMvo8Wg+8TwEo+dYLVc6e7Nnz/tS6dPts+M4X09IPsL55C+vvi\n9x1EfjSmf+UEZY8fQEQiiYrazUsv5aN65sWMz4nHA3PmEPbNN07Unj3iAunQtavdvU8fq3nr1kRH\nR2etnmzKf958k8eGP0y/DlOIjy2aZfXuPbCCrxcP58U7V+CSwLW7c5a+wberJ1KzQy+784jnrAh3\nxvVRpj99L0Wr1KRh3yEs+vBNVn/9GeXqNafJjXfw+fMPMui1yRlWV2bhSUlh/YJZrP12pu45PQtp\nuhTSgmUbEp+/9BlXyWkpCXz8WLXktJRTDVT1twDKD2qC1vgBxOUaQoUKrzFhQiyZ3Zs8fBimTcO9\ndKntHDxoFS5SxOnZrx+de/Rw1axTB1eIzox4PtasXEmb+o1p3+RlShRplOX1vze9ud7Xe7IUK1Al\ny+tOz+ETu3hr5hA7RZJdA154V0rXvvrMQ2+ahxfaX8v9n/1IbJ4zxzFmPPcADfoOYc+G1Wxe+gOF\ny1eh5a3Dr7rOrMDr9bLpp2/4bd4X7Fyz0nvq6MEwlyucQmUb2MWqtrYKlW3I9jWzvWsXjJ/jST6Z\n9XG8HERQhnr+QvUjdu58ml9/jaXelU3ne0HWr/fdNbtxo516/LhVo3Ztu++//22179yZEiVLGqc/\nDwkJCXRq3sapUekmShRpFJDjFBtd2NmwY7GrWIEqAR1UyZerOE/e9I0166exvH93P+p06W93uP9p\nK8J95VeFm35aQJHK1/7D9Pdu9HWA85Uoy9evP8eQ8VOZ/vS9HN75J/lKZP/n6IaFhVGleQeqNO8A\nEOY4DttXLWHVnOnWH0vetX/98jlXWkpCGGT+A09yOkFt/KrqFZHhvPnm+3zwQSxX2+v2euHbb3HN\nnasxu3bheDzSpmNHu+fdd1st27UjPj4+aObCCSRtGzS2c8eWp2712wJ2vEoWbWat2TLPblf39mzx\nm3VpPJz6VXrw1le3smHh1zrwpYlSssaVdVbWfD2DGtf3/Mfybya8SM8nxmJ701DHAUBcLrypwfkc\nXZfLRZk6jSlTpzGANeP54amr50z/xJOc9EugtQU7OaHX+hlHjuzmhx+ubOuTJ+H994kaMsR2d+tG\nkcmT9Y7WrXXqzJmy88QJPpw61erWuzfx8WY26Evhgdvv0B1/7nW1b/KSJQGMr1ct34c9hzZaqWlJ\nAdNwNgXylObpmxZYdUt0kkl39mbWy4/ZaSmX95Q/T3IiW5YtolqrMyd63LBwLsWq1iIuX0Hccbko\nXKEar/VtjtfjoVD5wIa7MoL9W35n9expaWkpJn0zIwjqGP9pRKQt11zzBZMnR1/Soxm3boWpU4ld\nu9b2HDliVa5e3e7dv7+rY7duUrZ8+cwXfAX835AhzJ89m/wFCvDz2rUAfDFtGi88/TSbNm7ku2XL\nqFmnDgBLf/qJB++6i4iICCZNmUKZcuU4fvw4Q/r1Y8a8eZmm8fNp0xg28GZ6tv2AfHkCfxw/+ry9\nc2O70a6qpVsEWso/2HdkC29/Ncy2I9U18MVJUqJ6nUBLyraoKv+5tWvizrXLH7PT0l4LtJ6cQE7o\n8aOq35CSMp9330095wqOAwsXIiNGaFzv3k7UPfdwfUSEPfbll60/9u3jh+XLrXtGjMi2pg9ww+DB\nTP/66zOWValenf99/jmNmjUj/eD2+LFjmT53LqNffZX33n4bgDHPP8+Djz2Wafp2bN/O7Tfeos3r\nPqbZwfQBcucqK+u2fW8HWse5KJy3HM8M+t6qVbitTLytB7PHPml7Pec+fUOdjYvms/ePtUccr/et\nQGvJKQR1jP8MkpKGMXv2Flq3jqRKFUhKgi++IGLhQjts3z4rOjpau/Xq5XR98UWrYdOmREREZIvY\n76XSqGlTdmzffsayCpUqnXPd8PBwkhITSUpMJCIigm1bt7J3924aN2uWKdq8Xi+t6zayy5foQM
XS\nHbPNca1YqpOsWPeWq1+rZwIt5bz0av4YDav0YcJXw1j/3Wwd+NIkKValZqBlZRsSjh1m+tP3JHmS\nEoeoalqg9eQUcozxq+phEbmLhx9+LyY+3ko7fDisTLlyTu9Bg1ydunenUpUq4n9CTY7ngUce4Y6b\nb8YdHc3bH33EEyNG8MTIkZlWX/c27WzseGlS51/Z6gqyfKn2LPp1pBw7tY88cYUDLee8FMlfgWdu\n/t6avvA5/c/QrjQaMMxpc+dDrrDwnDMB7ZXgOA6fPHxbUlpqyn9UdUGg9eQkstU/agYwJcrr3VW5\nWDHXuh07WLpunWvEY49J5apVQ2qqhOo1avDNkiXMXLCAbVu3UqhIERzHYXC/ftx2000cOngww+p6\n6dln+XXJcqtTi9dclhWeYeVmBC5XGDHu/PbGHT8GWspFcblc9G31lAzv/Qmrv5iq43o11tPpmaHK\nj/97y969ftXWtJTkfwdaS04jRxm/qmpKcnLD9WvXntjgHwANZVSVV0aO5F+PP86LzzzDc2PGMGjY\nMGA3jiUAACAASURBVN55/fUMKf/HhQt56dnRdGg2jhh3/gwpM6MpXKCOtXrLvGwZ5z8XxfJX4dmb\nf7Aq52nA24M7M/+t0badFnoRjl3rVvLt2y8lpSYldDEhnownRxk/+EI+yUlJ/Qf365d89MiRQMvJ\nOs6RnTXlo49o16kTufPkISkpCRFBREhK+v/2zjs+imqL478zu2mbRg29Sw+EIqABDL0EnvQiiAZ9\nFrAhdlBU9NlQ8QmodAL4FJCi9N5bpITQpIYeCOnb25z3x0xixGBIsmSz2fv9fObD7J2Ze88sm9/c\nuffcc4ru4piWloYh0f24bcQYuWpYy/wvcBNN6w/B+WtxGln2GO2HJEkY3vUjGjfwRxxe9iN9M7Qj\nJ5096W6zig2LPgsLXx1lslvMMcx82d32lEZKnfADADNvstlsC0YPG2ZyOj3nD/6fePqxx9AzMhLn\nzpxB0xo1sGjePKxZtQpNa9TAoQMHMLRPHwzu3TvnfJPJhJ9iY/HMCy8AAF4YPx5DoqMxcfx4PD1m\nTJFskWUZnVu3c1Yp31qOaDiyRP+Gwso3hkbjw1dued4bYM1KzfDRk7ukekEt8P2TvbBl5hTZ6XC4\n26z7CjNj6XtjTTaT4X/MvMLd9pRWSoUff14QkU9QcPCekaNHR3z+3/96R+D7YuKp4Y/xlrW7MSx6\nCfloA9xtTr78smGU3Lp+V4p++GWPnei5lBSPWevHOgPKl6URX8yVKtXL26PL0zmwbL68/r8fXrCZ\njM2Z2TOXHHsAJbq3VhSY2W7Q63svmjs3NXb2bNnd9pQWYufMwerlv1HfTtM9QvQBoE6NztKxC5s8\nuodTu0oLfByzR1PLrzHNeLwHts+dWup6/0lnT2Ld1PfNNpOxrxD9+0up7fFnQ0QNAwICDi1dty6o\nY6dO7jbHozl14gSiWrZFt8j/oE5112eYul9YLJmIXdUTnz53EAF+hcsA9eOmt3Hy0g4EBZTHhFFr\nAQBGSwbmr3sFaVk3UD6kGkZHfwudfwgu3jiMJds+gFbjg5jeU1GxTC2YLFmYv+4VvDBwfpHv5+KN\nw5i9/kU5MKwiRnwxVwqrUzIWzBUFq9GAb4Z2NGbcvP48y/Jid9tT2im1Pf5smPmM2WweMKJfP/PF\n8+fdbY7HYrFY0LtDZzm8/lDZk0QfAPz9Q6HzL+s8e3V/oeto13QQxvSf+5eyzb/PRKOa7TEpZjMa\n1HgYmw/NBABsOzIPY/vPwaCoidiT8BMAYGPcd+jZtmhzK9nUrdoaH43eLVXT1KXpI7tiZ+w0Wfbg\nuSyHzYp5Lw41mTLTlwjRLx5KvfADADNvsZjNb/Tr1s2UkZHhbnM8kp6RHZ063xpoF/GCR/5mypdt\nKB2/uK3Q6vhAtTbQ+YX+pez4xW1o21iJktmuyUAkXNgCAN
BIPrDazbDaTdBofHA74zIyDDfxQHXX\nhQ7XSlqMjv6GxvSdhT3zZmD6yG5yyhXPS0glO5348Y3RppvnTm+3mYzPutseb8Ej/4gLg81mm5GW\nkvLjiEcfNVmtIiZKQXhn3Kt89vRFqfcjX0qS5JmLnxvWfZROJu6QXDm0qTel5CR0D9ZVgN6UAgDo\n3uY5LNr4BrYcmo1HIkZizb6p6Bv5qsvazU396g/ho9F7pDCuRt8O74zdi7+XZdkzprSYGSsmj7Mk\nHt4fbzMZBjGz5762eBheI/wAYDQaxybEx+8c1revyWIRc0f3wvrVqzF7+g/Ut9O35O9Xxt3mFJo6\n1TrBZjdRSuaV+1I/EYGgOA1Vr9gYrw1fhpcGLURKxhWEBlUCM2Pe2lewcMPr0Jtcu75EK2nx7z7T\n6bnoH7Bz1jf4blQPOfVqokvbuB9s+Hay7cTWNRetJkMvZha9sWLEq4SfmR0Gvb7foQMHdgzt00eI\nfz5cv3YNMYMf444PvskVyzV2tzlFQpIkBOoqOU9f3u2yOoN1FZBlvA0AyDQmI0hX/i/HmRkb475H\nr7Zjsf7gNAx45G1Ehg/FzvhYl9mQm4Y1I/FxzG6pnLUi/jusE/b9PJtLau9/yw+f2w8snZ9kNRmi\nmFnvbnu8Da8SfkB18zQY+h+Ji9s2qFcvIf53QZZldHnwYWedal3kxvX6Fdr/fduBDzF/RXf8vHZo\nTtn5K5vx09oh+P6nNkhOO51TnnQ7HkvWDceyDaOQqb8KALDa9Fi97YUi3MmfVKvUVhN/znXhG5rV\n7YKDp5Q1RgdPrUTzet3+cjzu9Eo0rdMJOv9Q2OxqwhUi2Oz37zen1fri2Ue/l/7d81ts/W4K/xDT\nW06/cX/ecgrLlplf2Hcv+j7JZjY+xMwpRamLiEKJaEyuz52IaHXRrSzdeJ3wAzniP+DYkSObB/bo\nYTKbC5YFyRsY3LuPbLf4UVTbCUUa1G9U91H07TTtL2XlQx9A745fokrFlsj9RDn2x4/o22kaOrR+\nDSfO/QIAOHRiDlqHP10UE3Jo1mAYEpOOahxOW4Gvnb9uHL5eOgzJ6Yl4b05HHDj5C7q3eQ5/XNmH\nyQu649zV/ej+4HM559vsZhw8tRKPRDwOAOjS6in88OszWLnrU3RoPsIl9/NPNKkThf/E7JGCjcGY\nOrgjDiybzyXBdXvrrCn23Qu/S7KZje2Y+aYLqiwLYKwL6gEAeEsE31ITlrmgqPl6ByccPbp0QI8e\nPVds3KjT6QqfALs08e2UKdizfY80LHoJNJqihQauGtYSWYYbfykrG1onz3Ml0sLuMMPuMEMj+SBT\nfxVGUzKqhrUqkg1/tlsbvj4B8qWkeKmgHjajo7/Js/ylQXkP2/j6BODlwYtyPter9iDeeXxNgdos\nKlqtL8b0myMlXNiKH6e9zfHrfuHhn8yUylSpXqx2ZLN11pf2XbHTb9rMpocKK/pENB7AaPXjHAAP\nAahHREcBbAawFkAQES0DEA7gMDM/rl7bGsBXAIIApACIYeabRLQDwFEAHQD8D8DUwt6jp+CVPf5s\nmNlhMBiGHo+PXz+ge3eTXi+GGuP278fkCZPQq+MUBAdWLta2WzUdja37J+HoqViENxiKgwnfo12E\nyzpzAICQoFo4mbijZA583yea1+uKj2J2S/4Zvpg6uD3iVi4q1t6/0+HAqk9et+6KnXbDZja1Y+ak\nwtSjCncMgLZQBP8ZAJ8DuMDMLZn5TQAEoCWAVwA0AVCXiNoTkQ+AaQAGMfODAOYDyE5SwQB8mLkN\nM5d60Qe8XPgBRfyNBsOwEwkJv0S1amW8ctl7gwFmZmZiQLfecuumT8vVK7vO5/xeqVC2AQb1XIB+\nXX9AluEaAgMqKBOke97Gln3vwWRJK3Ib9Wp2k7L97b0JX60/XhwwXxrV5Qts+PpDzHqmn5x560b+\nFxYRc1YGZj/b3xi/fn
mczWxqUVjRV+kAYAUzm5nZCGAFgLzSysUx8w316RYPoDaAhgCaAtiivh1M\nBFAt1zVLimCXx+H1wg8AzOw0Ggwx169efb9DRIT5wN697jbJLXRt87CzfJlwbtVktFt/F8yMwyfm\nonX4v/H7iVmIbDkOTR4YgONnfi5y3Y3r9keq/ppkMBf9IeKJtKjfEx8/uYt8UsBfD4rE4d9+um+9\n/+TEc/hmaJQp6czxBVajoQszF3X1JAO409EgL+Nzu4Y68eeQ9kn1zaAlMzdn5l65zjMW0TaPQgi/\nCjOzxWL5Kiszc9CAHj0M/1uwwP0zYcXImJgYvnk9XerR/hNNcWYry+tLPpO4BrWqdYC/bwgcDoua\nPY3gcBTdG8bXV4dA//LOM1f2FbkuT8XXV4eXBy7UjIj6GGumvIu5zw9yZt12xTzrn5zZuxUzRnU3\nG1KTX7aajC8ysysiyu0G0J+IAogoEMAAAHsB5BeAiQGcAVCRiB4ClOi9RNTEBTZ5JEL474CZ15tN\npnavv/jirXdfe81WWuL5/xNLFi/Gsh+XUt/O08nXJ8ildW/aOwErNo9Ghv4yYldF4/SFX3Hx6nbE\nrorGrdQTWLvjFazZ/lLO+XaHGWcS1yC8vuL+2aLRSKzZ8TL2HvkaTesPdolNFcs30yRc2FL6/2Pz\noXXDvvjoyZ2EJAt9PeBhHF27rMi9f2bGnsXfO398Y3SmzWTs4XTY5+Z/1T3XfRTAAgBxAA4AmM3M\nRwDsJaLjRPQ5FJH/202oWbwGA/iciOKhTOY+7CrbPI1SH52zsBBRheDg4PVtIiObxC5bpgsOLlxU\nx5LOhXPn8HDTFtyp7SQ8UKu7x8arLwhXbhzA1v3v4PPnD3lVLuZ/Iu70r1i2+0OuEfGgPGTydE1w\n+bAC1+GwWbHio1ctJ7evu24zGbuK7FklF9HjvwvMnKLX69sf3Lt3eVSrVsbzZ8+62ySXY7PZ0LVd\nB7lh3X6yt4g+AFSv3Bay7OSk1HPuNqXE0LZxP0x+cic5rmTiq/4P4diGlQXqERrSbuOH0X2Mp7av\n32kzGVsI0S/ZCOH/B5jZZjQYnrx6+fKbj7RqZVo8b16JWATjKvpEdZb9pDC0bznOKxatZCNJEoJ0\nleXTl3eVnv9MFxDgF4zxQ5ZoBkVOwMqPX0PsKyOdhvT8F9ZeP30M/x0aZUpOPDvdajJEM7OhGMwV\nFAEh/PnAzGyz2b4zGY3t3nrllcujBg0yl4bQzh9OmIDjR09JvaOmSpLkfev4alRtrzl2frNX+fPf\nKw83HYzJT+wg84Vb+KpfO5zYmvfCM6fDga2zpthnPv0vgzE9ZbTNbHqbmcV36gGIMf4CQEQBgUFB\n03U63fB5S5boPDWj17ZNmzA0uh8e7foDKldo5m5z3ILemISf1gzE52MOw1fr725zSix7En7GqgOf\nc722HeWBk77RBJYpBwC4fek8fnzzKWNG0rV4q1E/nJmvudlUQQEQwl8IiKhPgE63aNRTT+k+/OIL\nv4AAz8g9CwC3k5PRrNYD3DZ8LMIbDPWacf28iF3ZXY7p9ZXUqFYHd5tSojGa0zHjt6edKYYrmkEf\nfIuMm9ecG6d9bHU6HG/JDvuMUjX+6SW4ZaiHiCoT0c9EdJ6IDhHRWiLKM3GoK6LvEdEOImql7q8l\nopCi2M/Ma80mU/2fFi7c1LZxY+PhuLiiVFdsyLKMTq3aOWtUipSb1h/i1aIPAGVC6uJE4navd+vM\nj8CAsnhz2ApNh4aPYcnE57FpxifJdou5hdNumy5E3zMpduEnxX9uJYBtzPyAGjfjHQCV7nKJK6Lv\n5fw4mbkPM2cVsT4wc2pWZuaj165c+XffTp2yXnz6aUtaqmsTbLiaxwcMlI1ZLHVuN6lYF2mVVOrX\n6i0dv7hVzHPlg1N2YPOhWc4d8QtMTqt9gs1krM3MwiXKg3HHj74zABszz8ouYOYEAEeJ
aAsRHSai\nBCJ6VD38GdToe0T0BRQRDyKiZUR0mohykjMTUVciOqJeP5eI/hZakoguEVE5df8JIjpGRPFEtLAw\nNyPL8s9ms7n2yqVLF0XUqWOeP3Mml8RFX7OmT8fm9Vukvp2nkVaMaQMAGtSORpYxhTINt9xtSonl\nUlI8Plv8L+OmuO9/tzsszZyy/VNmLnhca0GJwh3CHw7gcB7lFgADmLk1gC5QwqcCwFvIP/peJBH5\nQ4m4N5SZm0OJzzHmzkag9v6JqCmUQE2dmbmFWl+hYOZ0g17/rF6vj5z05pvH2kdElKjhn2NHjmDi\nq2+gZ/vPEBJULf8LvASt1heBARUcf1zxzthM/0S6Pglz175kmrb8ifSbaRdesNgMkcx80d12CVyD\nO4T/bmOCEoBPiegYlLjaVYkoDH8PygT8PfpeHSjR9xKZ+bx6TizyjtwHtc4uAJYycxqgiHeh7iYX\nzByvz8pqdebUqbF9O3fOGBMTY0m5fbuo1RYJg8GAPlHd5IhGo+SaVSPdaktJpFKFFtpj5zeVvFc0\nN2G1m7Bm31THR7E9zCcTt39rc5hrMsuxYiy/dOEO4T8JoHUe5SMBVADQiplbAkgGcLcxibyi7935\nw8xvEDuvSH9FhplZluWFZpOp9m/Ll89vUa+eedb06bLdbnd1U/dE94faO8sE1ec2zZ4VY9l5EF5/\nEM5e3a+Rvdz9XGYZcadX8aS5j5h2xi9ca3dYGtnslnfEYqzSSbGLATNvA+BHRM9klxFRcwA1ASQz\ns5OIOgOopR7W496j79Umonpq2SgAO/7h/G0AhuQa7y9XiNu5u0HMmQa9fqxBr2/78bvv7m9Svbpp\n4Zw5XJwPgFefe54vX7wh9ezwhYZI6H5eVK4YASINX0s+5W5T3MaF64fw6aI+xmXbPzxlsmR2N1v1\n/Zm5ZCXqFbgUd6nBAADdVHfOE1Ay4awD8CARJUAR7dOA4j2De4u+Z4WSkm2ZWocDwA93M4CZT6nt\n7lSj9X3pyhvM1c6JzIyMDreTk3u8+/rrceE1axoXz5vHDocrotTenZXLlmHRvFjq22k6+fmWzgBz\nriI4sJp8+pL3hW84d+0gvl4y1PDdqqdu30w7/5zFZmjOzN4br9qLEAu4ihki6hgSGvqVTqdrMunT\nTwOHjhwJrda1IRMuX7qENg2bcsfW76BhnWjht5kPvyfMRNKtPfJbI38t9a9FzIw/ruzB6r1fG5LT\nLxqsdvNEgBepYYsFXoIQfjdBRFEhoaFfBQYFNXr/008Dh4wYAY2m6LHSHA4HGlWp6axUJhJRbSd4\nVfC1wmKypGHRqmh89nwc/H1dm4+gpMDMOJm4Hav3fmVIzbqeZrUbJwBY4qIEKQIPQwi/myGiziGh\noV/pAgMbvPLGGwEjRo+WQkNDC11fn6jOzjPHk2hQj1hJo/FxoaWlm4UrezpHdP9Y06xuV3eb4lKc\nsgMJFzZjzd6vDVmmlGSLzfA2gOUimJp3I4S/BKCuZo4MCQ19w26z9Rw8YgTGjBvn3yQ8vED1fDF5\nMr76z5cY3mcZAgMq3h9jSylrtr/E1cvXkB/r9nGpeEvKMqZg34klzh1HY62y7LhgtunfA7BaCL4A\nEMJf4iCiKn5+fmM0Gs1LjcPDNS+/+WZwn3798p0H2LNjB/p3642+naejaljLYrK29HDu8kYcjP+G\n//PMXo+dE2FmXLhxCNuPzDeeurRLo9Fol1ltxqlqykKBIAch/CUUIvIBMDC0TJl3NBpN/THjxvnF\nPPuspmLY31PipaWloWn1Otyy8dPcotHjpX6C8n4gyw7M/SUKE0atQ4XQGu42p0BYbAYc+uM33nJ4\njtFgSsu0OcxfMssLmNnzE0cI7gtC+D0AImoRFBz8mt1uH/xIly7OJ595JrB7797w8/ODLMtoWa+h\nU+OogZ4dp4jga0Xgf6sHOHu2+bemQ/PH3G1Kvsgs
4+KNwzhwcrnlyNm10Eo+O802/RQowQ/FH7Xg\nHxHC70EQURkAg8qULfuC3WZrPGDYMNy4dsPv0L7jGBa9hHy0npMXoCSyI+4TyPZbzhcGzC+x4/xJ\nqefw++lV9v0nf7E5nLYUm908U2bnIpEIRVAQhPB7KERUU6PRPAVo3tJKfmhQp4/UoHYv37Dy4RC9\n/sKRmn4OKzbHYMrYo9CUoHSUyemJOHxmjfPgqRUmvSnVAfBCm8MyH0CC6N0LCoMQ/lIAETWVJJ/h\nGo1vjEbyLdugdm/fOtWjfCpXiIBw6SwYC5Z34ef6zaK6VVu5zQaZZVy5dRynEnc4D51Zbcow3JKJ\n6Geb3bwYwD7hmSMoKkL4SxGqW2i4JPkM89H6D3I4rHWrhLW01q3RObhmlUgRkvkeWL7xSTmiXkf6\nV+T4Yn1tMprTcfryHiRc2Gw6dWmXRES3nbJjhd1hWQlgDzOLCKIClyGEvxRDRBUAdPf1CRooy/Ye\nfr6hUp3qUb61qnX0rRrWCmJO4O8cPbUQiVfWyhOfWH9fvaNklnEt+SROJu5wHj23wZicnujn56Pb\nZ7JmLgWwnpkv38/2Bd6NEH4vgZTwnM2JNNG+PoGD7Q5z07ByjS01q0YGVarQXAor1wQimBtgsxkw\nf0U3fPLsfuj8C7+C+m/1Oiy4cjMBF28c5jNX9+kvJR3zkyRNilN2rLQ7LL8C2K0GGiw0ROQEkAAl\nTPlpAE8ys7mIdRqY+Z7jWBDROAAzC9ouEX0IYBczby2ojYKCI4TfS1ETznfRaPyitBq/LnaHqVFg\nQEVLlYottVXDWuoqVWiGsiF1IUkl1sHlvrHo12jnkE4TNS3r9y50HZnGZCTeOILz13+3n7myz3w7\n41KAn0/gBbvTusXusOyEMlZ/w3VWA0SkZ+ZgdX8xgMPMPDXXcW1BY/PkrvMez08E8KAaVfder5HE\nvEXxIoRfACBnwVg4gId8fYK7MuSHZae9Qvky9c3VKj8YWKl8uLZsaF2EBFWDRirdE8brd73GFYJC\n5Sd6fZnvU88pO3A74xKSUs7hesof8uVbCcZryacli01Pvj66w2Zr1gZmeQ+AQ8xsup923yH8zwFo\nDmApgI8BpEHJUtcEwOcAogD4AZjBzLOIqAqAJVByX2gBPM/Me4lID2A2gB4AbgIYDiAUSva61mpb\n9QH8DGABlPDmZwDcZuauRNQDwAdqWxcAjGZmIxFdUq/pDuALAL2hhJRYTkSTAPQFEADlAfmc2s4O\nAAeg5O0uA+BpZt7j4q/RKyg5PmsCt6KG5T2qbt8DSnKa5LSTbVPS/4j08QmKkmVHQ4fTUj4woIK5\nbGhdrli2UWDZ0DqasiF1UCakFnx9At16D66icb0BtPPgBxpmznGNlWUnMgy3kJR6DkmpZ3Hl1nHj\n1eRTznT9dZ1W65eqlXxOWmyG/U7ZkQBluOWs3WF1Sy+WiLQAoqHkuACUHNVNmfkyET0LIIOZ2xKR\nH4A9RLQJwEAAG5j5E3VYUKdeGwjgd2YeT0TvAXifmV8iokwiimDmY1DyYMxj5hlENB5AJ2ZOU+eY\nJgLoysxmInoLwHgAH0HJqZGS6+HRC3/m2ZjGzJPV8oVE1JeZ16jHNczcjoh6A3gfyoNDUECE8Avu\nipqPeIO6AQCIyN9gulXfYLrV6NrNg018fYJag7mp3WGu4eOjs5UJqeUoF1LXNyS4ekBgQBiCdGEI\n0lVCoC6sRE8mO2U7jKbbMJhuwWrLgt1hQeyG8dZ0fZIlNfOqpDel6rQaX4OP1u+c3WGNsznMhwGc\nAHDSYbUb3W2/SgARZcfl2QVgHoD2UHJUZ08W9wDQjIgGq59DADwA4HcA89Q3v1WqoAOADOVNAAAW\nA1ih7s8BMFoV+qEA2uRhz0NQ3jD2qQ9QXwC5E70syeMaAOhCRG9AefiUg/I9r1GPZbd/BEDtu1wv\nyAch/IICwcwW
AMfVLQcikqy2rBq3Uo43vpVyvJ4kaWv5aHX1CVTbKTuqOJyW8hJpZD+/EGuAXzk5\nUFeRggLCfAMCyvv7anXQagPgk7Pl9dkfGo0vmOVcmxPMDGYn5DvKnbIDVluWuulz9i3WDJvZmm63\nWDOdFmsmW21ZZLXptXaH0U+j8cvUSD43AVy2O210+MyaIwD2AkgEcMlqNxVporQYMKv5qnNQBffO\nB9OLzLz5zouJqCOUIZYFRPQ1My+68xT82StfAaXHvQ3KXEL6XWzazMwj7nLsbw9MIvIHMANAa2a+\nTkTv46+5t7MnwLNzbQsKgfjiBC5BnZy7rG5/g4jIyc5QkzmlssmcUik142wlAJUBlNNIvqGSxqes\nRJoQAoUACGIgECwHMmSdLDv9ZXb6Mzu1AMkEYhDJBJKVf8HI2ScZgEwk2YmkTAKlMzhFlh0pDqfl\nJrOcBiD9ji0VwE273eQNSUk2AhhLRNuZ2UFEDQBcA1ABwHVmnqOKb0sAi6CkZx0CpXc+AsBuQOkA\nENFGKMOCT+WqXw/lLSINwEEAM4ioHjNfIKJAAFWZ+dw/2Jct8qlEFKS2vdQldy7IQQi/oFhQQwtk\nqNsfbjantJKXp8adOarnQBkiOaIu+EuGkgO7E4A3iMgORbyfUM83AmhLRO8CuAVgWK66/qdeuylX\n2SwAG4joujq5GwPgJ3U+AVDG/O8q/MycQUSzoQzv3ITy8CjI/QruAeHVIxAICgURvQ4gmJnfd7ct\ngoIhevwCgaDAENFKAHUAdHG3LYKCI5J2lHCISCaiRbk+a4noNhGtzue6CNXlLfvzB0T0WhHsyLme\niD4korsmpyWifkTUuLBtCUo+zDyAmVuonl8CD0MIf8nHCKCpOuEGKH7L15D/+GZLKL7c2RR1TC/n\nemZ+P5+l9QOguPEJBIISiBB+z2AdgD7q/mMAfoLiWgciCiSieUR0kIiOENGjqi/2ZADDiOgoEQ1V\nr21CRNuJ6AIRvZRdORGNJ6Lj6vZKrvKJRHSGiHZDWfXJavkCIhqk7n9GRCeJ6BgRTSGihwH8C8AU\nte269/WbEQgEBUaM8XsGSwBMIqI1AJoBmAugo3psIoCtzPyUmqHrIIAtAN6D4gv9MqAM1QBoBMV7\nIwTAGSL6DkALADEA2kLpCBwkop0ANFA8OCIA+EBZMHNIbZMBMBGVB9CfmRupbYQwcxYR/QZl+X32\nYhuBQFCCEMLvATDzcSKqDaW3v/aOwz0A/Ev1sACUmCg1obwR5I4pzwDWqKEZUokoGYoffQcAK7Kj\nKRLRCigPFUkttwCwqGJ+JxnqsblQVlauyXVMpAETCEooYqjHc/gNSgCsnGGeXAxk5pbqVpuZ/0De\nY/q2XPvZKx/5jvruZR9QXIGdUN4UfoGy4nNDruPCT1ggKKEI4fcc5gH4gJlP3lG+EcDL2R+IKHvJ\nvh5KpMV/gqGsxOxPRAHqysr+UOK87FLL/YkoGIqw/wX1/DLMvB5K8K2IXG2HFOTmBAJB8SGEv+TD\nAMDM15l5eq6y7B71RwB8iCiBiE4A+FAt3w5lMjf35O7feuHMfBRKON04KCFvZzPzMbV8CYBjUCaX\n4/KwKxjAaiI6BuUB8qp67Gcoq0APi8ldgaDkIVbuCgQCgZchevwCgUDgZQjhFwgEAi9DCL9AIBB4\nGUL4BQKBwMsQwi8QCARehhB+gUAg8DKE8AsEAoGXIYRfIBAIvAwh/AKBQOBlCOEXCAQCL0MITGm2\nuAAAADhJREFUv0AgEHgZQvgFAoHAyxDCLxAIBF6GEH6BQCDwMoTwCwQCgZchhF8gEAi8DCH8AoFA\n4GX8H2C6MMSeDDR+AAAAAElFTkSuQmCC\n",
"text": [
"<matplotlib.figure.Figure at 0x25c37c0d0>"
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question #2 - What are the highest points in the database?**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"result = db.osm.aggregate([{\"$sort\" : {\"elevation\" : -1}},\n",
" {\"$limit\" : 6},\n",
" {\"$project\" : {\"_id\" : 0,\"name\" : 1, 'elevation' : 1}}])\n",
"\n",
"for x in result['result']:\n",
" print x['name'].title(), x['elevation']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Soil Conservation Service Site 15 Dam 700\n",
"Jack Mountain 451\n",
"Roundtree Mountain 440\n",
"Mount Sharp Cemetery 437\n",
"Martin Cemetery 432\n",
"Shingle Hills 432\n"
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question #3 - What are the most common cuisine types in the database?**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"result = db.osm.aggregate([{\"$unwind\" : \"$cuisine\"},\n",
" {\"$group\" : {\"_id\" : \"$cuisine\", \"count\" : {\"$sum\" : 1} }},\n",
" {\"$sort\" : {\"count\" : -1}},\n",
" {\"$limit\" : 10}\n",
" ])\n",
"for x in result['result']:\n",
" print x['_id'], x['count']\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Mexican 71\n",
"burger 49\n",
"pizza 38\n",
"sandwich 35\n",
"Chinese 21\n",
"American 20\n",
"coffee shop 18\n",
"Indian 15\n",
"Italian 13\n",
"Thai 13\n"
]
}
],
"prompt_number": 19
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question 4 - What are the closest Mexican restaurant to my current location?**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#first we need to create a geospatial index on the 'pos' field of our documents.\n",
"db.osm.ensure_index([('pos', pymongo.GEO2D)])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
"u'pos_2d'"
]
}
],
"prompt_number": 20
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_loc = [-97.753333, 30.349167]\n",
"for doc in db.osm.find({\"cuisine\" : \"Mexican\", \"pos\" : {\"$near\" : my_loc}},\n",
"                       {\"name\" : 1, \"_id\" : 0, 'address.housenumber' : 1}).limit(5):\n",
"    print doc['name'].title()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Arandas\n",
"Chipotle Mexican Grill\n",
"Torchy'S Tacos\n",
"Panaderia\n",
"El Nuevo Mexico\n"
]
}
],
"prompt_number": 21
},
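{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `$near` operator on a `2d` index ranks documents by distance in flat coordinate space. As a rough cross-check on results like those above, the great-circle distance between two `[lon, lat]` pairs can be computed directly. This is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; `haversine_km` is a hypothetical helper name and is not used elsewhere in this notebook.\n",
"\n",
"```python\n",
"from math import asin, cos, radians, sin, sqrt\n",
"\n",
"EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius\n",
"\n",
"\n",
"def haversine_km(pos_a, pos_b):\n",
"    # positions are [lon, lat] in degrees, the same order as the 'pos' field\n",
"    lon1, lat1 = [radians(v) for v in pos_a]\n",
"    lon2, lat2 = [radians(v) for v in pos_b]\n",
"    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2\n",
"    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))\n",
"\n",
"\n",
"my_loc = [-97.753333, 30.349167]\n",
"# a point one degree of latitude further north, roughly 111 km away\n",
"print(round(haversine_km(my_loc, [-97.753333, 31.349167]), 1))\n",
"```\n",
"\n",
"Over an area the size of Austin the flat and spherical rankings should rarely disagree, but the sketch gives distances in kilometres, which the query itself does not return."
]
},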
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question #5 - What are the closest natural sites?**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for doc in db.osm.find({\"natural\" : {\"$exists\" : 1}, \"name\" : {\"$exists\" : 1}, \"pos\" : {\"$near\" : my_loc}},\n",
"                       {\"name\" : 1, \"_id\" : 0}).limit(5):\n",
"    print doc['name'].title()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Cat Mountain\n",
"Mount Lucas\n",
"Mount Barker\n",
"Mount Bonnell\n",
"Mount Larson\n"
]
}
],
"prompt_number": 23
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question #6 - How do I find the closest amenities to a given address?**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#this function finds the closest amenities to the supplied address; if the address is not in the db, it returns None\n",
"def find_closest(housenumber, street, amenity):\n",
"    current_loc = db.osm.find_one({'address.housenumber' : housenumber, 'address.street' : street}, {\"pos\" : 1})\n",
"    if not current_loc:\n",
"        return None\n",
"    return db.osm.find({'amenity' : amenity, 'pos' : {\"$near\" : current_loc['pos']}},\n",
"                       {\"_id\" : 0, \"name\" : 1, \"address.housenumber\" : 1, \"address.street\" : 1}).limit(5)\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#find the closest bars\n",
"locations = find_closest('2911', 'San Jacinto Boulevard', 'bar')\n",
"\n",
"#find_closest returns None when the address is not in the database\n",
"if locations is not None:\n",
"    for loc in locations:\n",
"        print loc['name'].title()\n",
"        if 'address' in loc:\n",
"            print loc['address']['housenumber'] + \" \" + loc['address']['street']\n",
"        print \"-------\"\n",
"else:\n",
"    print \"Address not found!\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Cactus Cafe\n",
"2247 Guadalupe Street\n",
"-------\n",
"Nast'Y\n",
"-------\n",
"Dive\n",
"1703 Guadalupe Street\n",
"-------\n",
"Haymaker\n",
"2310 Manor Road\n",
"-------\n",
"Mohawk\n",
"912 Red River Street\n",
"-------\n"
]
}
],
"prompt_number": 26
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"-----\n",
"<a id=\"conclusion\"></a>"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, an overview of the files used:\n",
" - Size of original OSM file: 174.6 MB\n",
" - Size of intermediary JSON file: 140 MB\n",
" - Size of MongoDB database: 67.1 MB\n",
" - Number of unique contributors: 761\n",
" - Number of nodes: 761182\n",
" \n",
"The analysis above represents only some of the steps needed to clean and improve the data we received from OSM. It is clear from the start that the data are only a partial representation of the actual points in Austin. For example, there is a Mexican restaurant across the street from me that is not in the database. Beyond being incomplete, the data cannot even be assumed to be representative. It may be, for example, that Baptist churches are, for some reason, more likely to have been added to the database than churches of other denominations. Furthermore, we examined and cleaned only a few of the tags that exist in the database; each of the other tags would no doubt require similar effort to clean and improve. \n",
"\n",
"One of the best ways to improve the data would be to incorporate relevant data from the <a href=\"http://www.data.austintexas.gov\">City of Austin Data Portal</a>. This site has numerous datasets that could be used to correct, refine and expand the data we have collected here, including the locations of dangerous dogs, golf courses, police and fire stations and many more. An especially useful addition would be restaurant inspection scores, which would be even more helpful than proximity data in deciding where to eat. \n",
"\n",
"-----"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"bibliography\"></a>"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The primary source of technical information used to complete this project was Udacity's <a href=\"https://www.udacity.com/course/viewer#!/c-ud032-nd/l-491558559/m-816599080\">Data Wrangling with MongoDB</a> course.\n",
"\n",
"Other resources consulted include:\n",
"\n",
"- the <a href=\"http://docs.mongodb.org/manual/tutorial/getting-started/\">MongoDB manual</a>.\n",
"- Python's <a href=\"https://docs.python.org/2/library/json.html\">JSON documentation</a>.\n",
"- Python's <a href=\"https://docs.python.org/2/library/xml.etree.elementtree.html\">ElementTree documentation</a>.\n",
"\n",
"Where specific code segments were borrowed from other sources, this is noted in comments within the code in the appendix.\n",
"\n",
"The following files were used to build the database:\n",
"\n",
"- `austin_texas.osm` from OpenStreetMap's Austin, TX <a href=\"https://mapzen.com/metro-extracts/\">metro extract</a>.\n",
"- `TX_Features_20141202.txt` provided by the <a href=\"http://geonames.usgs.gov/domestic/\">US Geological Survey</a>. \n",
"- `cuisine_list.txt` from ranker.com's <a href=\"http://www.ranker.com/crowdranked-list/favorite-types-of-cuisine\">list of most popular national cuisines</a>.\n",
"\n",
"-----"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href=\"#top\">top</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id='code'></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>Code</h2>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Import necessary modules**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#import the modules I will need\n",
"import xml.etree.ElementTree as ET\n",
"import pprint\n",
"import csv\n",
"import pandas as pd\n",
"from ggplot import *\n",
"from scipy import stats\n",
"import numpy as np\n",
"import collections\n",
"import re\n",
"from pymongo import MongoClient\n",
"import datetime\n",
"import time\n",
"import pymongo\n",
"import matplotlib.pyplot as plt\n",
"import json\n",
"import codecs\n",
"\n",
"#we have a graph to show later\n",
"%matplotlib inline\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Designate input file**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"filename = 'austin_texas.osm'"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id='count_tags'></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Count the tags in the database**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#count the tags in our database\n",
"def count_tags(filename):\n",
"    tags = {}\n",
"    for item, elem in ET.iterparse(filename):\n",
"        if elem.tag not in tags:\n",
"            tags[elem.tag] = 1\n",
"        else:\n",
"            tags[elem.tag] += 1\n",
"    return tags"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
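{
"cell_type": "markdown",
"metadata": {},
"source": [
"The extract is about 175 MB, so it can help to release each element once it has been counted. The following is a minimal alternative sketch, not the code used to produce the results in this notebook; `count_tags_streaming` is a hypothetical name, and it assumes only the tag names matter, so each element can be cleared as soon as its end event arrives.\n",
"\n",
"```python\n",
"import collections\n",
"import xml.etree.ElementTree as ET\n",
"\n",
"\n",
"def count_tags_streaming(source):\n",
"    # source may be a filename or an open binary file object\n",
"    tags = collections.Counter()\n",
"    for event, elem in ET.iterparse(source):\n",
"        tags[elem.tag] += 1\n",
"        # drop the element's attributes and children to reduce memory use\n",
"        elem.clear()\n",
"    return tags\n",
"```\n",
"\n",
"Because the result is a `Counter`, calling `.most_common()` on it lists the tags by frequency."
]
},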
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href = \"#auditing_and_cleaning\">Auditing & Cleaning</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"count_keys\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Count the keys in the nodes**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#count the tag keys that appear on the nodes\n",
"def count_keys(filename):\n",
"    tag_keys = collections.Counter()\n",
"    for item, elem in ET.iterparse(filename):\n",
"        if elem.tag == 'node':\n",
"            for x in elem:\n",
"                tag_keys[x.attrib['k']] += 1\n",
"    return tag_keys\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href = \"#auditing_and_cleaning\">Auditing & Cleaning</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"clean_religion\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Clean the religion and denomination names**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#clean the religion and denomination names\n",
"def clean_religion(entry):\n",
"    new_entry = entry.lower()\n",
"    if new_entry == \"latter_day_saints\":\n",
"        return \"Mormon\"\n",
"    if new_entry == \"roman_catholic\":\n",
"        return \"Catholic\"\n",
"    if 'jehovahs' in new_entry:\n",
"        return \"Jehovah's Witness\"\n",
"    return entry.replace('_',' ').title()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href = \"#return_religion\">religion/denomination</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"elevation_code\"></a>"
]
},
{
"cell_type": "heading",
"level": 6,
"metadata": {},
"source": [
"Import GNIS data from the USGS and compare with the OSM data"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#We want a list of gnis:feature_ids that are in our OSM data; we will then extract those features from the gnis data\n",
"gnis_features = []\n",
"for item, elem in ET.iterparse(filename):\n",
"    if elem.tag == 'node':\n",
"        for x in elem:\n",
"            if x.attrib['k'] == \"gnis:feature_id\":\n",
"                #some nodes carry several ids separated by ';'\n",
"                if \";\" in str(x.attrib['v']):\n",
"                    for q in x.attrib['v'].split(';'):\n",
"                        gnis_features.append(q)\n",
"                else:\n",
"                    gnis_features.append(x.attrib['v'])\n",
"\n",
"#now we import the data from the USGS, which includes data on the elevation. We will only import the entries that exist in the OSM\n",
"#database. See below for an example entry from these data.\n",
"gnis_data = []\n",
"with open('TX_Features_20141202.txt', 'rb') as csvfile:\n",
"    reader = csv.DictReader(csvfile, delimiter='|')\n",
"    for line in reader:\n",
"        #the data we have include all of TX, but we just want the points in OSM\n",
"        #(the first column name carries the file's UTF-8 byte order mark)\n",
"        if line['\\xef\\xbb\\xbfFEATURE_ID'] in gnis_features:\n",
"            gnis_data.append(line)\n",
"\n",
"#finally, we compare the elevations from the two different datasets and create a dictionary of features and their elevation\n",
"#differences\n",
"elevation_differences = {}\n",
"for item, elem in ET.iterparse('austin_texas.osm'):\n",
"    if elem.tag == 'node':\n",
"        OSM_ele = None\n",
"        for tag in elem: #first get the OSM elevation\n",
"            if tag.attrib['k'] == 'ele':\n",
"                OSM_ele = tag.attrib['v']\n",
"        if not OSM_ele:\n",
"            continue\n",
"        for tag in elem: #now get the gnis elevation\n",
"            if tag.attrib['k'] == 'gnis:feature_id':\n",
"                for feature in gnis_data:\n",
"                    if feature['\\xef\\xbb\\xbfFEATURE_ID'] == tag.attrib['v']:\n",
"                        if feature['ELEV_IN_M'] != OSM_ele:\n",
"                            for name in elem:\n",
"                                if name.attrib['k'] == 'name':\n",
"                                    elevation_differences[name.attrib['v']] = int(OSM_ele) - int(feature['ELEV_IN_M'])\n",
"\n",
"#this function creates a dictionary from the USGS data with entries like this {feature_id : elevation}\n",
"def get_gnis_elevations(gnis_textfile):\n",
"    gnis_elevations = {}\n",
"    with open(gnis_textfile, 'rb') as csvfile:\n",
"        reader = csv.DictReader(csvfile, delimiter='|')\n",
"        for line in reader:\n",
"            if line['ELEV_IN_M'] != '':\n",
"                gnis_elevations[line['\\xef\\xbb\\xbfFEATURE_ID']] = int(line['ELEV_IN_M'])\n",
"            else:\n",
"                gnis_elevations[line['\\xef\\xbb\\xbfFEATURE_ID']] = None\n",
"    return gnis_elevations\n",
"\n",
"gnis_elevations = get_gnis_elevations('TX_Features_20141202.txt')"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href = \"#return_religion\">elevation</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"cuisine_code\"></a>"
]
},
{
"cell_type": "heading",
"level": 6,
"metadata": {},
"source": [
"Audit and clean the cuisine types"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#first we will generate the list of cuisine types that should be capitalized from the ranker.com list we found.\n",
"def make_national_cuisine_list(filename):\n",
"    national_cuisine_list = []\n",
"    with open(filename, 'rb') as cuisines:\n",
"        for line in cuisines:\n",
"            national_cuisine_list.append(line.split()[1].strip().lower())\n",
"    #we need to add a couple to this list\n",
"    national_cuisine_list.extend(['salvadoran', 'columbian', 'jamaican', 'asian', 'middle eastern'])\n",
"\n",
"    return national_cuisine_list\n",
"\n",
"national_cuisine_list = make_national_cuisine_list('cuisine_list.txt')\n",
"\n",
"#and now the function for cleaning cuisine types\n",
"def get_cuisine(cuisine_type, cuisine_list):\n",
"    cuisine_dictionary = {'barbecue' : 'bbq', 'Texas Style BBQ' : 'bbq', 'Bar-B-Q' : 'bbq', 'Hamburgers and fires' : 'burger',\n",
"                          'Sandwich shop' : 'sandwich', 'mediteranian' : 'Mediterranean', 'el_salvadorian' : 'salvadoran'}\n",
"    if cuisine_type in cuisine_dictionary:\n",
"        cuisine_type = cuisine_dictionary[cuisine_type]\n",
"    cleaned = cuisine_type.replace('_', ' ').strip()\n",
"    cleaned = cleaned.lower().replace('fusion', '')\n",
"    #split any that require splitting\n",
"    if ',' in cleaned:\n",
"        cleaned = [x.strip() for x in cleaned.split(',')]\n",
"        #after the split we need to check the new entries against our dictionary\n",
"        cleaned = [cuisine_dictionary.get(x, x) for x in cleaned]\n",
"    elif '/' in cleaned:\n",
"        cleaned = cleaned.split('/')\n",
"        #after the split we need to check the new entries against our dictionary\n",
"        cleaned = [cuisine_dictionary.get(x, x) for x in cleaned]\n",
"    elif cleaned == 'mexican korean': #this one had to be hard coded since the only separator was '_'\n",
"        cleaned = ['mexican', 'korean']\n",
"    else:\n",
"        cleaned = [cleaned]\n",
"    #capitalize national cuisines, using the list passed in rather than the global\n",
"    return [x.strip().capitalize() if x.strip() in cuisine_list else x.strip() for x in cleaned]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"address_code\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href=\"#cuisine\">cuisine</a>"
]
},
{
"cell_type": "heading",
"level": 6,
"metadata": {},
"source": [
"Audit and clean the address data"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#the regex for extracting the last word in the input string. \n",
"street_type_re = re.compile(r'\\b\\S+\\.?$', re.IGNORECASE)\n",
"\n",
"#these are the acceptable names for street types; if the last word is not one of these we want to know\n",
"expected = [\"Street\", \"Avenue\", \"Boulevard\", \"Drive\", \"Court\", \"Place\", \"Square\", \"Lane\", \"Road\",\n",
"            \"Trail\", \"Parkway\", \"Commons\", \"Expressway\", \"Cove\", \"Crossing\", \"Way\", \"Pass\", \"Path\", \"Loop\", \"Highway\", \"Quarry\"]\n",
"\n",
"#we build the mapping below to make sure that all of the abbreviations are matched to the full equivalent\n",
"mapping = {\"St\" : \"Street\", \"St.\": \"Street\", \"Dr.\" : \"Drive\", \"Ave\" : \"Avenue\", \"Rd\" : \"Road\", \"street\" : \"Street\", \"Blvd\" : \"Boulevard\",\n",
"           \"Blvd.\" : \"Boulevard\", \"Dr\" : \"Drive\", \"Pkwy\" : \"Parkway\", \"Ln\" : \"Lane\", \"Avene\" : \"Avenue\", \"Cv\" : \"Cove\", \"CR\" : \"Circle\",\n",
"           \"Cir\" : \"Circle\", \"Ct\" : \"Court\", \"RD\" : \"Road\", \"Expwy\" : \"Expressway\", \"lane\" : \"Lane\", \"N.\" : \"North\", \"N\" : \"North\", \"W.\" : \"West\",\n",
"           \"W\" : \"West\" , \"E\" : \"East\", \"E.\" : \"East\", \"S.\" : \"South\", \"S\" : \"South\", \"U.S.\" : \"US\", \"Hwy\" : \"Highway\",\n",
"           \"dr.\" : \"Drive\", \"ave\" : \"Avenue\", \"ln\" : \"Lane\", \"st\" : \"Street\", \"st.\" : \"Street\", \"blvd.\" : \"Boulevard\",\n",
"           \"rd\" : \"Road\", \"blvd\" : \"Boulevard\", \"dr\" : \"Drive\", \"sb\" : \"Southbound\", \"ih\" : \"Interstate\", \"ih35\" : \"Interstate 35\",\n",
"           \"i\" : \"Interstate\", \"fm\" : \"Farm to Market\", \"rr\" : \"Rural Route\"}\n",
"#we want the keys of mapping to be lower case, but we don't want to retype the whole thing, so:\n",
"\n",
"mapping = {k.lower(): v for k,v in mapping.items()}\n",
"\n",
"#this function prints the street names whose last word is neither an expected street type nor a known abbreviation\n",
"#(we iterate on the default 'end' events so that each node's child tags have been parsed when we inspect it)\n",
"def audit_street_types(filename):\n",
"    for item, elem in ET.iterparse(filename):\n",
"        if elem.tag == 'node':\n",
"            for tag in elem.findall('tag'):\n",
"                if tag.attrib['k'] == 'addr:street':\n",
"                    m = street_type_re.search(tag.attrib['v'])\n",
"                    if m:\n",
"                        street_type = m.group()\n",
"                        #the keys of mapping are lower case, so compare in lower case\n",
"                        if street_type not in expected and street_type.lower() not in mapping:\n",
"                            print tag.attrib['v']\n",
"\n",
"#this function, when passed a street name and a mapping, cleans it and returns the new name for insertion into the database\n",
"\n",
"def clean_streetname(name, mapping):\n",
"    new_name = \"\"\n",
"    name = name.replace(\"-\", \" \")\n",
"    for i in name.split():\n",
"        if i.lower() in mapping:\n",
"            new_name += mapping[i.lower()]\n",
"        else:\n",
"            new_name += i.capitalize()\n",
"        new_name += \" \"\n",
"    return new_name.strip()\n",
"\n",
"#this function takes in a postcode from the OSM file and returns the cleaned postcode and the separate 4plus code where possible\n",
"#http://stackoverflow.com/questions/2499966/python-a-smarter-way-of-string-to-integer-conversion/2500023#2500023\n",
"def clean_zipcode(zipcode):\n",
"    new_zip_4 = None\n",
"    new_zip = ''.join([x for x in zipcode if x.isdigit()])\n",
"    if len(new_zip) > 5:\n",
"        new_zip_4 = new_zip[5:]\n",
"        new_zip = new_zip[0:5]\n",
"    #reject values with no digits and anything outside the 73000-79000 range\n",
"    if not new_zip or not 73000 < int(new_zip) < 79000:\n",
"        return None\n",
"\n",
"    return new_zip, new_zip_4\n",
"    "
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 8
},
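{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the postcode handling concrete, here is a small standalone sketch of splitting a raw value into a five-digit ZIP and an optional four-digit extension. It is an illustration rather than the function used above: the name `split_zip` and the default bounds `low=73000` and `high=79999` are assumptions chosen for this example.\n",
"\n",
"```python\n",
"def split_zip(raw, low=73000, high=79999):\n",
"    # keep only the digits, so 'TX 78704-1234' becomes '787041234'\n",
"    digits = ''.join(ch for ch in raw if ch.isdigit())\n",
"    if len(digits) < 5:\n",
"        return None\n",
"    zip5, plus4 = digits[:5], digits[5:9]\n",
"    # assumed bounds: reject anything outside the illustrative range\n",
"    if not low <= int(zip5) <= high:\n",
"        return None\n",
"    # report the extension only when all four digits are present\n",
"    return zip5, (plus4 if len(plus4) == 4 else None)\n",
"\n",
"\n",
"print(split_zip('TX 78704-1234'))\n",
"```\n",
"\n",
"Returning `None` for unusable input lets the caller skip the field without a separate validity check."
]
},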
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href=\"#address\">address</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"export_code\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Create dictionaries from nodes**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#fields that should be in the created subdictionary\n",
"CREATED = [\"version\", \"changeset\", \"timestamp\", \"user\", \"uid\", \"created_by\"]\n",
"#the fields we are interested in\n",
"KEYS_TO_IMPORT = [\"amenity\", \"natural\", \"emergency\", \"public_transport\", \"leisure\", \"shelter\", \"waterway\", \"aeroway\",\n",
"                  \"name\", \"historic\", \"created_by\"]\n",
"\n",
"def node_to_dictionary(node):\n",
"    #initialize necessary dictionaries and lists\n",
"    new_document = {}\n",
"    created_dict = {}\n",
"    address = {}\n",
"    cuisine = []\n",
"    #get the attributes of the node - import them into the appropriate dictionary\n",
"    for value in node.attrib:\n",
"        if value in CREATED:\n",
"            created_dict[value] = node.attrib[value]\n",
"        else:\n",
"            new_document[value] = node.attrib[value]\n",
"    #convert the timestamp from OSM to a Python datetime (currently disabled)\n",
"    \"\"\"if 'timestamp' in created_dict:\n",
"        newtime = datetime.datetime.strptime(created_dict['timestamp'], \"%Y-%m-%dT%H:%M:%SZ\")\n",
"        created_dict['timestamp'] = newtime\"\"\"\n",
"    #combine the lat and lon values into a single list entry 'pos' and convert the values to floats\n",
"    if 'lat' in new_document and 'lon' in new_document:\n",
"        #[lon, lat] so we can use it as a geospatial index later - http://docs.mongodb.org/manual/core/2dsphere/\n",
"        new_document['pos'] = [float(new_document['lon']), float(new_document['lat'])]\n",
"        del (new_document['lat'], new_document['lon'])\n",
"    #now import the tags calling helper functions as needed to process the incoming data\n",
"    for tag in node.findall('tag'):\n",
"        new_key = tag.attrib['k'].lower().strip()\n",
"        new_value = tag.attrib['v'].lower().strip()\n",
"        if new_key in CREATED:\n",
"            created_dict[new_key] = new_value\n",
"        if new_key == 'religion' or new_key == 'denomination':\n",
"            new_document[new_key] = clean_religion(new_value)\n",
"        if new_key == \"gnis:feature_id\":\n",
"            #get the ele from this node (None if the node has no ele tag)\n",
"            OSM_elevation = None\n",
"            for x in node.findall('tag'):\n",
"                if x.attrib['k'] == \"ele\":\n",
"                    OSM_elevation = int(float(x.attrib['v']))\n",
"            #get the elevation from the USGS GNIS\n",
"            if new_value not in gnis_elevations:\n",
"                continue\n",
"            GNIS_elevation = gnis_elevations[new_value]\n",
"            new_document['elevation'] = GNIS_elevation\n",
"            #keep the OSM value only when it disagrees with the GNIS value\n",
"            if OSM_elevation is not None and OSM_elevation != GNIS_elevation:\n",
"                new_document['old_elevation'] = OSM_elevation\n",
"            new_document['ele_gnis_verified'] = True\n",
"            continue\n",
"        if new_key == 'ele':\n",
"            #do not overwrite an elevation that has already been verified against the GNIS\n",
"            if not new_document.get('ele_gnis_verified'):\n",
"                new_document['elevation'] = int(float(new_value))\n",
"                new_document['ele_gnis_verified'] = False\n",
"        if new_key == \"cuisine\":\n",
"            cuisine = get_cuisine(new_value, national_cuisine_list)\n",
"        #clean our postcodes and street names\n",
"        if new_key[0:5] == \"addr:\":\n",
"            if new_key[5:] == \"postcode\":\n",
"                post_code = clean_zipcode(new_value)\n",
"                if post_code:\n",
"                    address['postcode'] = post_code[0]\n",
"                    if post_code[1] is not None:\n",
"                        address['zipplus4'] = post_code[1]\n",
"            elif new_key[5:] == \"street\":\n",
"                address['street'] = clean_streetname(new_value, mapping)\n",
"            else:\n",
"                address[new_key[5:]] = new_value\n",
"        #if it's not on our list, ignore it\n",
"        if new_key not in KEYS_TO_IMPORT:\n",
"            continue\n",
"        #import whatever's left\n",
"        new_document[new_key] = new_value\n",
"    if cuisine != []:\n",
"        new_document['cuisine'] = cuisine\n",
"    if address != {}:\n",
"        new_document['address'] = address\n",
"    new_document['created'] = created_dict\n",
"    return new_document\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Return to <a href=\"#exporting\">file export</a>"
]
}
],
"metadata": {}
}
]
}