@laurenmarietta
Last active February 11, 2020 17:24
addresses_to_coords_OpenCage
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tutorial: Using OpenCage to convert addresses to geo-coordinates\n",
"\n",
"\n",
"This Jupyter notebook walks through how to use the free [OpenCage Geocoder API](https://opencagedata.com/) to convert street addresses into geographic coordinates (latitude and longitude).\n",
"\n",
"If you have any questions, please contact Lauren Chambers at lchambers@aclum.org.\n",
"\n",
"### Table of Contents\n",
"1. Create an OpenCage account\n",
"1. Load addresses\n",
"1. Geocode addresses to coordinates\n",
"1. Quality control\n",
"1. Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import necessary packages\n",
"\n",
"If you haven't already, you will need to install the `opencage` package (available with `pip`), as well as `numpy` and `pandas`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Standard library\n",
"import json\n",
"\n",
"# Third-party\n",
"import numpy as np\n",
"from opencage.geocoder import OpenCageGeocode\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Create an OpenCage Account\n",
"\n",
"In order to use the OpenCage Geocoder API, you have to make an account. The good news is that it's free unless you make more than 2,500 requests in a day! (And it doesn't require your credit card info, either. Looking at you, Google & Amazon.) Plus, OpenCage has great documentation and tutorials available for a ton of different languages.\n",
"\n",
"You can make an account here: https://opencagedata.com/users/sign_up\n",
"\n",
"Once you have an account, you will be provided with an API key. There are many ways to handle such keys; this notebook looks for the key in a JSON file, `api_keys.json`, in the same directory as this notebook. The notebook expects the OpenCage API key under the entry `'opencage_api_key'`."
]
},
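{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the layout this notebook assumes, `api_keys.json` holds a single JSON object with your key stored under `\"opencage_api_key\"`. The value below is a made-up placeholder, not a real key; the cell only prints the structure and does not write or overwrite any file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Illustration only: the placeholder string stands in for your real key\n",
"example_keys = {\"opencage_api_key\": \"YOUR-OPENCAGE-API-KEY\"}\n",
"\n",
"# Print the JSON text that api_keys.json would contain\n",
"print(json.dumps(example_keys, indent=2))"
]
},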
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load addresses\n",
"\n",
"For the purposes of this example, I've created `random_addresses.csv`, a list of addresses generated from random geo-coordinates within Springfield, MA and random names. *They do not represent real people or real residences.* Furthermore, to add \"noise\" to the dataset, not every address is legitimate."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Name</th>\n",
" <th>Street Address</th>\n",
" <th>City</th>\n",
" <th>State</th>\n",
" <th>Zip Code</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>Fiona Lor</td>\n",
" <td>P.O. Box 123456</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1129</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Ghassaan el-Sharifi</td>\n",
" <td>205 Bowles Park</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1014</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Jeremiah Lial</td>\n",
" <td>75 Clydesdale Lane</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1129</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Arhab al-Semaan</td>\n",
" <td>25 Bond Street</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>Kaley Lin</td>\n",
" <td>71 Chauncey Drive</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1151</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>95</td>\n",
" <td>Mirenda Louis</td>\n",
" <td>241 Corcoran Boulevard</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1118</td>\n",
" </tr>\n",
" <tr>\n",
" <td>96</td>\n",
" <td>Damien Baca</td>\n",
" <td>20 Barrington Dr</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1129</td>\n",
" </tr>\n",
" <tr>\n",
" <td>97</td>\n",
" <td>Olivia Mckenna</td>\n",
" <td>26 Sterling Street</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1199</td>\n",
" </tr>\n",
" <tr>\n",
" <td>98</td>\n",
" <td>Arielle Fisher</td>\n",
" <td>540 Tiffany Street</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1108</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99</td>\n",
" <td>Breanna Blanco-Araujo</td>\n",
" <td>43 Bay Street</td>\n",
" <td>Springfield</td>\n",
" <td>MA</td>\n",
" <td>1111</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>100 rows × 5 columns</p>\n",
"</div>"
],
"text/plain": [
" Name Street Address City State Zip Code\n",
"0 Fiona Lor P.O. Box 123456 Springfield MA 1129\n",
"1 Ghassaan el-Sharifi 205 Bowles Park Springfield MA 1014\n",
"2 Jeremiah Lial 75 Clydesdale Lane Springfield MA 1129\n",
"3 Arhab al-Semaan 25 Bond Street Springfield MA 1107\n",
"4 Kaley Lin 71 Chauncey Drive Springfield MA 1151\n",
".. ... ... ... ... ...\n",
"95 Mirenda Louis 241 Corcoran Boulevard Springfield MA 1118\n",
"96 Damien Baca 20 Barrington Dr Springfield MA 1129\n",
"97 Olivia Mckenna 26 Sterling Street Springfield MA 1199\n",
"98 Arielle Fisher 540 Tiffany Street Springfield MA 1108\n",
"99 Breanna Blanco-Araujo 43 Bay Street Springfield MA 1111\n",
"\n",
"[100 rows x 5 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"addresses = pd.read_csv(\"random_addresses.csv\")\n",
"addresses"
]
},
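{
"cell_type": "markdown",
"metadata": {},
"source": [
"One thing the table above suggests: the ZIP codes show only four digits (e.g. `1129`), most likely because pandas parsed the column as integers and dropped the leading zero that Springfield ZIP codes have (e.g. `01129`). The cell below is a minimal sketch, not part of the original workflow, of how the five-digit strings could be restored. It assumes the column has no missing values, and it stores the result in a separate variable, `zip_codes_padded` (a name introduced here for illustration), so the rest of the notebook runs unchanged."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Convert each ZIP code to a string and left-pad with zeros to five characters\n",
"zip_codes_padded = addresses[\"Zip Code\"].astype(str).str.zfill(5)\n",
"\n",
"# Look at the first few padded values\n",
"zip_codes_padded.head()"
]
},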
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Get geo-coordinates from OpenCage API\n",
"\n",
"First things first, we need to connect to OpenCage using our API key."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load API key\n",
"with open('api_keys.json') as fo:\n",
"    api_key = json.load(fo)['opencage_api_key']"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"# Initialize connection to API\n",
"geocoder = OpenCageGeocode(api_key)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we can use the Geocoder to convert our addresses to geo-coordinates.\n",
"\n",
"<p style=\"color:red\"><b>Caution:</b></p>\n",
"\n",
"The following code block queries OpenCage once for each of the 100 addresses in our list (apart from the PO boxes, which it skips). With a limit of 2,500 requests per day, running it about 25 times will use up your daily quota! If you want to see what the code does for a subset of these addresses, you can modify the `for` loop on line 10 of the cell to iterate over just a few addresses, e.g.:\n",
"```\n",
"for i, row in addresses[:5].iterrows():\n",
"```"
]
},
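{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before running the full loop, it can help to look at what a single query returns. The sketch below assumes the `geocoder` object created above and spends one request from your daily quota. `geocode` returns a list of result dictionaries (empty if nothing matched), and the keys read here (`formatted`, `confidence`, `geometry`) are the same ones the loop in the next cell uses. The variable names `test_result` and `top` are introduced only for this illustration; the address is one from the table above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Query a single address (this uses one API request)\n",
"test_result = geocoder.geocode(\"25 Bond Street, Springfield, MA\", no_annotations=1)\n",
"\n",
"# Inspect the top match, if there is one\n",
"if test_result:\n",
"    top = test_result[0]\n",
"    print(top[\"formatted\"])\n",
"    print(top[\"confidence\"])\n",
"    print(top[\"geometry\"])  # dictionary holding 'lat' and 'lng'\n",
"else:\n",
"    print(\"No results returned\")"
]
},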
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1. Querying Fiona Lor: P.O. Box 123456, Springfield, MA, 1129\n",
"Cannot query; PO Box address\n",
"\n",
"2. Querying Ghassaan el-Sharifi: 205 Bowles Park, Springfield, MA, 1014\n",
"\n",
"3. Querying Jeremiah Lial: 75 Clydesdale Lane, Springfield, MA, 1129\n",
"\n",
"4. Querying Arhab al-Semaan: 25 Bond Street, Springfield, MA, 1107\n",
"\n",
"5. Querying Kaley Lin: 71 Chauncey Drive, Springfield, MA, 1151\n",
"\n",
"6. Querying Aparna Tram: 71 Agnes Street, Springfield, MA, 1108\n",
"\n",
"7. Querying Eric Tarar: 357 Cottage Street, Springfield, MA, 1119\n",
"\n",
"8. Querying Brandon Morales: 1500 Main Street Suite 1234, Springfield, MA, 1101\n",
"\n",
"9. Querying Ivy Zhang: 1727 South Branch Parkway, Springfield, MA, 1128\n",
"\n",
"10. Querying Mukhtaar el-Mussa: Po Box 1111, Springfield, MA, 1109\n",
"Cannot query; PO Box address\n",
"\n",
"11. Querying Faviola Sanchez: 1785 Allen Street, Springfield, MA, 1128\n",
"\n",
"12. Querying James Williams: 1146 Allen Street, Springfield, MA, 1118\n",
"\n",
"13. Querying Saurya Patel: 1068 Bradley Road, Springfield, MA, 1118\n",
"\n",
"14. Querying Brittany Shoels: 435 Porter Lake Drive, Springfield, MA, 1106\n",
"\n",
"15. Querying Emmanuel Chu: 101 Corcoran Blvd, Springfield, MA, 1119\n",
"\n",
"16. Querying Min New: 106 Saint James Cir, Springfield, MA, 1104\n",
"\n",
"17. Querying Tevin Swanson: 25 Sparrow Drive, Springfield, MA, 1119\n",
"\n",
"18. Querying Reece Huff: 1340 Boston Road, Springfield, MA, 1151\n",
"\n",
"19. Querying Hunter Burgos: 769 Worthington Street, Springfield, MA, 1101\n",
"\n",
"20. Querying Ashlee Hayes: 57 Lyons Street, Springfield, MA, 1151\n",
"\n",
"21. Querying Alyssa Wilson: 196 Corcoran Boulevard, Springfield, MA, 1118\n",
"\n",
"22. Querying Omar Venegas: 1623 Bay Street, Springfield, MA, 1119\n",
"\n",
"23. Querying Labeeb al-Saladin: 54 Oak Hollow Road, Springfield, MA, 1128\n",
"\n",
"24. Querying Lizette Madrid: 2 South Greeting Road, Springfield, MA, 1108\n",
"\n",
"25. Querying Thien-Kim Tran: 100 Reeds Lndg, Springfield, MA, 1111\n",
"\n",
"26. Querying Kara Kuljis: 28 Lowell Street, Springfield, MA, 1107\n",
"\n",
"27. Querying Naqiyya el-Saeed: 2067 Allen Street, Springfield, MA, 1128\n",
"\n",
"28. Querying Geoffrey Amelkin: PO Box 1221, Springfield, MA, 1111\n",
"Cannot query; PO Box address\n",
"\n",
"29. Querying Jaarallah el-Zaman: 74 Springfield Street, Springfield, MA, 1199\n",
"\n",
"30. Querying Muneefa al-Aziz: 117 Louis Road, Springfield, MA, 1118\n",
"\n",
"31. Querying Shawna Bailey: 170 Waldorf Street, Springfield, MA, 1111\n",
"\n",
"32. Querying Linh Cayabyab: 415 State Street, Springfield, MA, 1105\n",
"\n",
"33. Querying Shadhaa el-Saade: 23 Reed Street, Springfield, MA, 1109\n",
"\n",
"34. Querying Kasia Woon: nan, Springfield, MA, 1119\n",
"\n",
"35. Querying Elizabeth Hernandez Bonilla: 119 Stafford Street, Springfield, MA, 1014\n",
"\n",
"36. Querying James Turner: 125 Tinkham Road, Springfield, MA, 1129\n",
"\n",
"37. Querying Jacobo Grajales: 65 Olmsted Drive, Springfield, MA, 1108\n",
"\n",
"38. Querying Robert Williams: 37 Leland Drive, Springfield, MA, 1111\n",
"\n",
"39. Querying Dakota Stogdill: 1400 State Street, Springfield, MA, 1109\n",
"\n",
"40. Querying Esainea Digby: 1695 Main St Suite 444, Springfield, MA, 1109\n",
"\n",
"41. Querying Jonathan Klinkerman: 600 South Branch Parkway, Springfield, MA, 1106\n",
"\n",
"42. Querying Sengthong Goulet: 139 Page Boulevard, Springfield, MA, 1104\n",
"\n",
"43. Querying Anthony Montabon: 10 Gralia Drive, Springfield, MA, 1128\n",
"\n",
"44. Querying Mark Garcia: 239 Commonwealth Avenue, Springfield, MA, 1108\n",
"\n",
"45. Querying Karin Pacheco: 122 Wayne Street, Springfield, MA, 1118\n",
"\n",
"46. Querying Veronica Lujan: 42 Old Lane Road, Springfield, MA, 1129\n",
"\n",
"47. Querying Myranda Wisham: 24 Craig Street, Springfield, MA, 1108\n",
"\n",
"48. Querying Autumn Whittier: 1000 Anniversary Street, Springfield, MA, 1014\n",
"\n",
"49. Querying Shaafi el-Faraj: 1059 South Branch Parkway, Springfield, MA, 1118\n",
"\n",
"50. Querying Sameeha el-Fares: 759 Chestnut Street, Springfield, MA, 1199\n",
"\n",
"51. Querying Carina Gorostieta: 28 Angelica Drive, Springfield, MA, 1129\n",
"\n",
"52. Querying Veronica Colmenero Chavez: 22 South Greeting Road, Springfield, MA, 1108\n",
"\n",
"53. Querying Abdul Majeed el-Hariri: 153 Hamilton Street, Springfield, MA, 1151\n",
"\n",
"54. Querying Aubry Sapp: 135 Jerilis Drive, Springfield, MA, 1151\n",
"\n",
"55. Querying Ischara Mcgrew: 136 Thompkins Avenue, Springfield, MA, 1118\n",
"\n",
"56. Querying Joseph Kierstead: 1084 Parker Street, Springfield, MA, 1129\n",
"\n",
"57. Querying Isis O'Donnell: 2100 Wilbraham Road, Springfield, MA, 1119\n",
"\n",
"58. Querying Oscar Valdez: 200 Washington Boulevard, Springfield, MA, 1108\n",
"\n",
"59. Querying Ryan Coleman: Apt. 123, Springfield, MA, 1128\n",
"\n",
"60. Querying Rachelle Yi: 21 Emery Street, Springfield, MA, 1104\n",
"\n",
"61. Querying Noori el-Selim: 20 Lemnos Lane, Springfield, MA, 1119\n",
"\n",
"62. Querying Jeremy Hofferber: 53 Sherbrooke Street, Springfield, MA, 1104\n",
"\n",
"63. Querying Cameron Huerta: PO Box 80808, Springfield, MA, 1106\n",
"Cannot query; PO Box address\n",
"\n",
"64. Querying Lee Chong: P.O. Box 65432, Springfield, MA, 1111\n",
"Cannot query; PO Box address\n",
"\n",
"65. Querying Jose Hernandez-Gonzalez: 1727 South Branch Parkway, Springfield, MA, 1128\n",
"\n",
"66. Querying Tae Mian: 2 Pasco Road, Springfield, MA, 1151\n",
"\n",
"67. Querying Ryan White: 41 Brookdale Drive, Springfield, MA, 1119\n",
"\n",
"68. Querying Ghazaala al-Majid: 90 Parkerview Street, Springfield, MA, 1129\n",
"\n",
"69. Querying Kabeera al-Samaan: , Springfield, MA, 1107\n",
"\n",
"70. Querying Mujaahida al-Assad: 24 Puritan Road, Springfield, MA, 1128\n",
"\n",
"71. Querying Christian Estrada: 82 Stuart Street, Springfield, MA, 1119\n",
"\n",
"72. Querying Jonathan Posey-Hughes: 1059 South Branch Parkway, Springfield, MA, 1118\n",
"\n",
"73. Querying Alexis Banks: 37 Balis Street, Springfield, MA, 1111\n",
"\n",
"74. Querying Ronald Harris: 152 Harkness Avenue, Springfield, MA, 1118\n",
"\n",
"75. Querying Jose Love: 15 Wrenwood Street, Springfield, MA, 1129\n",
"\n",
"76. Querying Francine Two Crow: 100 Progress Avenue, Springfield, MA, 1119\n",
"\n",
"77. Querying Thomas Gomez: 36 Birch Glen Drive, Springfield, MA, 1119\n",
"\n",
"78. Querying Michael Shepherd: 275 Bicentennial Hwy, Springfield, MA, 1118\n",
"\n",
"79. Querying Courtney Garrison: 1306 Liberty Street, Springfield, MA, 1104\n",
"\n",
"80. Querying Paul Wans: 1060 Bay Street, Springfield, MA, 1111\n",
"\n",
"81. Querying Crystal Juarez: 1000 Five Mile Pond Road, Springfield, MA, 1151\n",
"\n",
"82. Querying Francisco Infante: 1200 Wilbraham Road, Springfield, MA, 1119\n",
"\n",
"83. Querying Brooklyn Robinson: 45 Davenport Street, Springfield, MA, 1119\n",
"\n",
"84. Querying Cooper Paxton: 130 Main Street, Springfield, MA, 1151\n",
"\n",
"85. Querying Trezen Banks-Hill: 40 Sonia Street, Springfield, MA, 1129\n",
"\n",
"86. Querying Addison Bigbey: 123 Glenoak Drive, Springfield, MA, 1129\n",
"\n",
"87. Querying Asmar el-Salahuddin: 81 Agnes Street, Springfield, MA, 1108\n",
"\n",
"88. Querying Timothy Alexander: 23 Egan Drive, Springfield, MA, 1111\n",
"\n",
"89. Querying Brandon Joyce: 2460 Main Street, Springfield, MA, 1107\n",
"\n",
"90. Querying Colin Fitzgerald: 86 West Allen Ridge Road, Springfield, MA, 1118\n",
"\n",
"91. Querying Chen Phan: 1234 Tamarack Drive, Springfield, MA, 1118\n",
"\n",
"92. Querying Alyssa Washington: 50 Benton Street, Springfield, MA, 1109\n",
"\n",
"93. Querying Tyler Delaney: 12 Pine Road, Springfield, MA, 1108\n",
"\n",
"94. Querying Sara Magda: 137 Oregon Street, Springfield, MA, 1118\n",
"\n",
"95. Querying Phillip O Connor: 720 Hall of Fame Avenue, Springfield, MA, 1105\n",
"\n",
"96. Querying Mirenda Louis: 241 Corcoran Boulevard, Springfield, MA, 1118\n",
"\n",
"97. Querying Damien Baca: 20 Barrington Dr, Springfield, MA, 1129\n",
"\n",
"98. Querying Olivia Mckenna: 26 Sterling Street, Springfield, MA, 1199\n",
"\n",
"99. Querying Arielle Fisher: 540 Tiffany Street, Springfield, MA, 1108\n",
"\n",
"100. Querying Breanna Blanco-Araujo: 43 Bay Street, Springfield, MA, 1111\n",
"\n"
]
}
],
"source": [
"# Create an empty dataframe, which we will fill with query results\n",
"all_coords = pd.DataFrame()\n",
"\n",
"# Define a filename where we will save the query results.\n",
"# (To ensure results aren't lost if a problem arises during\n",
"# queries, the file will be updated after each query.)\n",
"csv_filename = 'address_coords.csv'\n",
"\n",
"# For each address...\n",
"for i, row in addresses.iterrows():\n",
"\n",
"    # Combine the various columns in our table to create one address string\n",
"    address = \"{}, {}, {}, {}\".format(row[\"Street Address\"], row[\"City\"],\n",
"                                      row[\"State\"], row[\"Zip Code\"])\n",
"\n",
"    print('{}. Querying {}: {}'.format(i+1, row[\"Name\"], address))\n",
"\n",
"    # Skip if the address is a PO box\n",
"    if \" box\" in address.lower():\n",
"        print(\"Cannot query; PO Box address\")\n",
"        result = []\n",
"\n",
"    # Otherwise, send the query to OpenCage!\n",
"    else:\n",
"        result = geocoder.geocode(address,\n",
"                                  # Opts out of extra info OpenCage provides\n",
"                                  no_annotations=1,\n",
"                                  # Only return results with high confidence\n",
"                                  min_confidence=9)\n",
"\n",
"    # Create and fill a dictionary with the query results\n",
"    coords_dict = {}\n",
"    if result:\n",
"        # Include components of location search (country, street, location type, etc.)\n",
"        coords_dict.update(result[0]['components'])\n",
"        # Include formatted area\n",
"        coords_dict['formatted'] = result[0]['formatted']\n",
"        # Include confidence marker\n",
"        coords_dict['confidence'] = result[0]['confidence']\n",
"        # Finally, include latitude and longitude!\n",
"        coords_dict.update(result[0]['geometry'])\n",
"\n",
"    # Include the original name (even when no result was found)\n",
"    coords_dict['name'] = row[\"Name\"]\n",
"\n",
"    # Add the dictionary to the dataframe of all results\n",
"    # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)\n",
"    all_coords = pd.concat([all_coords, pd.DataFrame([coords_dict])],\n",
"                           ignore_index=True)\n",
"\n",
"    # Update the CSV file\n",
"    all_coords.to_csv(csv_filename)\n",
"    print()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At this point, I like to save the dataframe to another variable, so I don't accidentally overwrite it."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"# Create deep copy\n",
"opencage_coords = all_coords.copy(deep = True)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>ISO_3166-1_alpha-2</th>\n",
" <th>ISO_3166-1_alpha-3</th>\n",
" <th>_category</th>\n",
" <th>_type</th>\n",
" <th>city</th>\n",
" <th>confidence</th>\n",
" <th>continent</th>\n",
" <th>country</th>\n",
" <th>country_code</th>\n",
" <th>...</th>\n",
" <th>state</th>\n",
" <th>state_code</th>\n",
" <th>road_type</th>\n",
" <th>school</th>\n",
" <th>bus_stop</th>\n",
" <th>neighbourhood</th>\n",
" <th>town</th>\n",
" <th>hospital</th>\n",
" <th>local_administrative_area</th>\n",
" <th>bakery</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>Fiona Lor</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Ghassaan el-Sharifi</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Jeremiah Lial</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Arhab al-Semaan</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>Kaley Lin</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>95</td>\n",
" <td>Mirenda Louis</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>96</td>\n",
" <td>Damien Baca</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>97</td>\n",
" <td>Olivia Mckenna</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>98</td>\n",
" <td>Arielle Fisher</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99</td>\n",
" <td>Breanna Blanco-Araujo</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>100 rows × 27 columns</p>\n",
"</div>"
],
"text/plain": [
" name ISO_3166-1_alpha-2 ISO_3166-1_alpha-3 _category \\\n",
"0 Fiona Lor NaN NaN NaN \n",
"1 Ghassaan el-Sharifi US USA building \n",
"2 Jeremiah Lial US USA building \n",
"3 Arhab al-Semaan US USA building \n",
"4 Kaley Lin US USA building \n",
".. ... ... ... ... \n",
"95 Mirenda Louis US USA building \n",
"96 Damien Baca US USA building \n",
"97 Olivia Mckenna US USA building \n",
"98 Arielle Fisher US USA building \n",
"99 Breanna Blanco-Araujo US USA building \n",
"\n",
" _type city confidence continent \\\n",
"0 NaN NaN NaN NaN \n",
"1 building Springfield 10.0 North America \n",
"2 building Springfield 10.0 North America \n",
"3 building Springfield 10.0 North America \n",
"4 building Springfield 10.0 North America \n",
".. ... ... ... ... \n",
"95 building Springfield 10.0 North America \n",
"96 building Springfield 10.0 North America \n",
"97 building Springfield 10.0 North America \n",
"98 building Springfield 10.0 North America \n",
"99 building Springfield 10.0 North America \n",
"\n",
" country country_code ... state state_code \\\n",
"0 NaN NaN ... NaN NaN \n",
"1 United States of America us ... Massachusetts MA \n",
"2 United States of America us ... Massachusetts MA \n",
"3 United States of America us ... Massachusetts MA \n",
"4 United States of America us ... Massachusetts MA \n",
".. ... ... ... ... ... \n",
"95 United States of America us ... Massachusetts MA \n",
"96 United States of America us ... Massachusetts MA \n",
"97 United States of America us ... Massachusetts MA \n",
"98 United States of America us ... Massachusetts MA \n",
"99 United States of America us ... Massachusetts MA \n",
"\n",
" road_type school bus_stop neighbourhood town hospital \\\n",
"0 NaN NaN NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN NaN NaN \n",
".. ... ... ... ... ... ... \n",
"95 NaN NaN NaN NaN NaN NaN \n",
"96 NaN NaN NaN NaN NaN NaN \n",
"97 NaN NaN NaN NaN NaN NaN \n",
"98 NaN NaN NaN NaN NaN NaN \n",
"99 NaN NaN NaN NaN NaN NaN \n",
"\n",
" local_administrative_area bakery \n",
"0 NaN NaN \n",
"1 NaN NaN \n",
"2 NaN NaN \n",
"3 NaN NaN \n",
"4 NaN NaN \n",
".. ... ... \n",
"95 NaN NaN \n",
"96 NaN NaN \n",
"97 NaN NaN \n",
"98 NaN NaN \n",
"99 NaN NaN \n",
"\n",
"[100 rows x 27 columns]"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Examine the resulting dataframe\n",
"all_coords"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Quality Control\n",
"\n",
"So how well did OpenCage do? "
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"5"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# How many coordinates couldn't it find?\n",
"len(all_coords[all_coords.lat.isna()])"
]
},
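{
"cell_type": "markdown",
"metadata": {},
"source": [
"To put that count in context, here is a quick sketch (not part of the original analysis) of the share of addresses that came back with coordinates, along with how the confidence scores are distributed. It assumes the `lat` and `confidence` columns created by the geocoding loop above; `found_fraction` is a name introduced here for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Fraction of rows that have a latitude, i.e. were successfully geocoded\n",
"found_fraction = all_coords[\"lat\"].notna().mean()\n",
"print(\"{:.0%} of addresses were geocoded\".format(found_fraction))\n",
"\n",
"# Number of results at each confidence level (NaN means no result)\n",
"all_coords[\"confidence\"].value_counts(dropna=False)"
]
},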
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>ISO_3166-1_alpha-2</th>\n",
" <th>ISO_3166-1_alpha-3</th>\n",
" <th>_category</th>\n",
" <th>_type</th>\n",
" <th>city</th>\n",
" <th>confidence</th>\n",
" <th>continent</th>\n",
" <th>country</th>\n",
" <th>country_code</th>\n",
" <th>...</th>\n",
" <th>state</th>\n",
" <th>state_code</th>\n",
" <th>road_type</th>\n",
" <th>school</th>\n",
" <th>bus_stop</th>\n",
" <th>neighbourhood</th>\n",
" <th>town</th>\n",
" <th>hospital</th>\n",
" <th>local_administrative_area</th>\n",
" <th>bakery</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>24</td>\n",
" <td>Thien-Kim Tran</td>\n",
" <td>ID</td>\n",
" <td>IDN</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Asia</td>\n",
" <td>Indonesia</td>\n",
" <td>id</td>\n",
" <td>...</td>\n",
" <td>Aceh</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>54</td>\n",
" <td>Ischara Mcgrew</td>\n",
" <td>ID</td>\n",
" <td>IDN</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Asia</td>\n",
" <td>Indonesia</td>\n",
" <td>id</td>\n",
" <td>...</td>\n",
" <td>Aceh</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>58</td>\n",
" <td>Ryan Coleman</td>\n",
" <td>CH</td>\n",
" <td>CHE</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Europe</td>\n",
" <td>Switzerland</td>\n",
" <td>ch</td>\n",
" <td>...</td>\n",
" <td>Lucerne</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Ettiswil</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>68</td>\n",
" <td>Kabeera al-Samaan</td>\n",
" <td>ID</td>\n",
" <td>IDN</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Asia</td>\n",
" <td>Indonesia</td>\n",
" <td>id</td>\n",
" <td>...</td>\n",
" <td>Aceh</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>92</td>\n",
" <td>Tyler Delaney</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>building</td>\n",
" <td>building</td>\n",
" <td>Springfield Township</td>\n",
" <td>10.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Pennsylvania</td>\n",
" <td>PA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 27 columns</p>\n",
"</div>"
],
"text/plain": [
" name ISO_3166-1_alpha-2 ISO_3166-1_alpha-3 _category \\\n",
"24 Thien-Kim Tran ID IDN place \n",
"54 Ischara Mcgrew ID IDN place \n",
"58 Ryan Coleman CH CHE place \n",
"68 Kabeera al-Samaan ID IDN place \n",
"92 Tyler Delaney US USA building \n",
"\n",
" _type city confidence continent \\\n",
"24 county NaN 9.0 Asia \n",
"54 county NaN 9.0 Asia \n",
"58 county NaN 9.0 Europe \n",
"68 county NaN 9.0 Asia \n",
"92 building Springfield Township 10.0 North America \n",
"\n",
" country country_code ... state state_code \\\n",
"24 Indonesia id ... Aceh NaN \n",
"54 Indonesia id ... Aceh NaN \n",
"58 Switzerland ch ... Lucerne NaN \n",
"68 Indonesia id ... Aceh NaN \n",
"92 United States of America us ... Pennsylvania PA \n",
"\n",
" road_type school bus_stop neighbourhood town hospital \\\n",
"24 NaN NaN NaN NaN NaN NaN \n",
"54 NaN NaN NaN NaN NaN NaN \n",
"58 NaN NaN NaN NaN NaN NaN \n",
"68 NaN NaN NaN NaN NaN NaN \n",
"92 NaN NaN NaN NaN NaN NaN \n",
"\n",
" local_administrative_area bakery \n",
"24 NaN NaN \n",
"54 NaN NaN \n",
"58 Ettiswil NaN \n",
"68 NaN NaN \n",
"92 NaN NaN \n",
"\n",
"[5 rows x 27 columns]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Any where the state isn't MA?\n",
"all_coords[(all_coords.state != \"Massachusetts\") & pd.notna(all_coords.state)]"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>ISO_3166-1_alpha-2</th>\n",
" <th>ISO_3166-1_alpha-3</th>\n",
" <th>_category</th>\n",
" <th>_type</th>\n",
" <th>city</th>\n",
" <th>confidence</th>\n",
" <th>continent</th>\n",
" <th>country</th>\n",
" <th>country_code</th>\n",
" <th>...</th>\n",
" <th>state</th>\n",
" <th>state_code</th>\n",
" <th>road_type</th>\n",
" <th>school</th>\n",
" <th>bus_stop</th>\n",
" <th>neighbourhood</th>\n",
" <th>town</th>\n",
" <th>hospital</th>\n",
" <th>local_administrative_area</th>\n",
" <th>bakery</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>23</td>\n",
" <td>Lizette Madrid</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>road</td>\n",
" <td>road</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>residential</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>24</td>\n",
" <td>Thien-Kim Tran</td>\n",
" <td>ID</td>\n",
" <td>IDN</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Asia</td>\n",
" <td>Indonesia</td>\n",
" <td>id</td>\n",
" <td>...</td>\n",
" <td>Aceh</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>31</td>\n",
" <td>Linh Cayabyab</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>education</td>\n",
" <td>school</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>High School of Commerce</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>33</td>\n",
" <td>Kasia Woon</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>transportation</td>\n",
" <td>bus_stop</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>bus_stop</td>\n",
" <td>NaN</td>\n",
" <td>ELM / SHEAFFER</td>\n",
" <td>Merrick</td>\n",
" <td>West Springfield</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>38</td>\n",
" <td>Dakota Stogdill</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>health</td>\n",
" <td>hospital</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Vibra Hospital of Western Massachusetts</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>47</td>\n",
" <td>Autumn Whittier</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>road</td>\n",
" <td>road</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>residential</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>49</td>\n",
" <td>Sameeha el-Fares</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>health</td>\n",
" <td>hospital</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Baystate Medical Center</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>51</td>\n",
" <td>Veronica Colmenero Chavez</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>road</td>\n",
" <td>road</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>residential</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>54</td>\n",
" <td>Ischara Mcgrew</td>\n",
" <td>ID</td>\n",
" <td>IDN</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Asia</td>\n",
" <td>Indonesia</td>\n",
" <td>id</td>\n",
" <td>...</td>\n",
" <td>Aceh</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>58</td>\n",
" <td>Ryan Coleman</td>\n",
" <td>CH</td>\n",
" <td>CHE</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Europe</td>\n",
" <td>Switzerland</td>\n",
" <td>ch</td>\n",
" <td>...</td>\n",
" <td>Lucerne</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Ettiswil</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>68</td>\n",
" <td>Kabeera al-Samaan</td>\n",
" <td>ID</td>\n",
" <td>IDN</td>\n",
" <td>place</td>\n",
" <td>county</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>Asia</td>\n",
" <td>Indonesia</td>\n",
" <td>id</td>\n",
" <td>...</td>\n",
" <td>Aceh</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>80</td>\n",
" <td>Crystal Juarez</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>road</td>\n",
" <td>road</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>track</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>88</td>\n",
" <td>Brandon Joyce</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>commerce</td>\n",
" <td>bakery</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Old San Juan Bakery</td>\n",
" </tr>\n",
" <tr>\n",
" <td>90</td>\n",
" <td>Chen Phan</td>\n",
" <td>US</td>\n",
" <td>USA</td>\n",
" <td>road</td>\n",
" <td>road</td>\n",
" <td>Springfield</td>\n",
" <td>9.0</td>\n",
" <td>North America</td>\n",
" <td>United States of America</td>\n",
" <td>us</td>\n",
" <td>...</td>\n",
" <td>Massachusetts</td>\n",
" <td>MA</td>\n",
" <td>residential</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>14 rows × 27 columns</p>\n",
"</div>"
],
"text/plain": [
" name ISO_3166-1_alpha-2 ISO_3166-1_alpha-3 \\\n",
"23 Lizette Madrid US USA \n",
"24 Thien-Kim Tran ID IDN \n",
"31 Linh Cayabyab US USA \n",
"33 Kasia Woon US USA \n",
"38 Dakota Stogdill US USA \n",
"47 Autumn Whittier US USA \n",
"49 Sameeha el-Fares US USA \n",
"51 Veronica Colmenero Chavez US USA \n",
"54 Ischara Mcgrew ID IDN \n",
"58 Ryan Coleman CH CHE \n",
"68 Kabeera al-Samaan ID IDN \n",
"80 Crystal Juarez US USA \n",
"88 Brandon Joyce US USA \n",
"90 Chen Phan US USA \n",
"\n",
" _category _type city confidence continent \\\n",
"23 road road Springfield 9.0 North America \n",
"24 place county NaN 9.0 Asia \n",
"31 education school Springfield 9.0 North America \n",
"33 transportation bus_stop NaN 9.0 North America \n",
"38 health hospital Springfield 9.0 North America \n",
"47 road road Springfield 9.0 North America \n",
"49 health hospital Springfield 9.0 North America \n",
"51 road road Springfield 9.0 North America \n",
"54 place county NaN 9.0 Asia \n",
"58 place county NaN 9.0 Europe \n",
"68 place county NaN 9.0 Asia \n",
"80 road road Springfield 9.0 North America \n",
"88 commerce bakery Springfield 9.0 North America \n",
"90 road road Springfield 9.0 North America \n",
"\n",
" country country_code ... state state_code \\\n",
"23 United States of America us ... Massachusetts MA \n",
"24 Indonesia id ... Aceh NaN \n",
"31 United States of America us ... Massachusetts MA \n",
"33 United States of America us ... Massachusetts MA \n",
"38 United States of America us ... Massachusetts MA \n",
"47 United States of America us ... Massachusetts MA \n",
"49 United States of America us ... Massachusetts MA \n",
"51 United States of America us ... Massachusetts MA \n",
"54 Indonesia id ... Aceh NaN \n",
"58 Switzerland ch ... Lucerne NaN \n",
"68 Indonesia id ... Aceh NaN \n",
"80 United States of America us ... Massachusetts MA \n",
"88 United States of America us ... Massachusetts MA \n",
"90 United States of America us ... Massachusetts MA \n",
"\n",
" road_type school bus_stop neighbourhood \\\n",
"23 residential NaN NaN NaN \n",
"24 NaN NaN NaN NaN \n",
"31 NaN High School of Commerce NaN NaN \n",
"33 bus_stop NaN ELM / SHEAFFER Merrick \n",
"38 NaN NaN NaN NaN \n",
"47 residential NaN NaN NaN \n",
"49 NaN NaN NaN NaN \n",
"51 residential NaN NaN NaN \n",
"54 NaN NaN NaN NaN \n",
"58 NaN NaN NaN NaN \n",
"68 NaN NaN NaN NaN \n",
"80 track NaN NaN NaN \n",
"88 NaN NaN NaN NaN \n",
"90 residential NaN NaN NaN \n",
"\n",
" town hospital \\\n",
"23 NaN NaN \n",
"24 NaN NaN \n",
"31 NaN NaN \n",
"33 West Springfield NaN \n",
"38 NaN Vibra Hospital of Western Massachusetts \n",
"47 NaN NaN \n",
"49 NaN Baystate Medical Center \n",
"51 NaN NaN \n",
"54 NaN NaN \n",
"58 NaN NaN \n",
"68 NaN NaN \n",
"80 NaN NaN \n",
"88 NaN NaN \n",
"90 NaN NaN \n",
"\n",
" local_administrative_area bakery \n",
"23 NaN NaN \n",
"24 NaN NaN \n",
"31 NaN NaN \n",
"33 NaN NaN \n",
"38 NaN NaN \n",
"47 NaN NaN \n",
"49 NaN NaN \n",
"51 NaN NaN \n",
"54 NaN NaN \n",
"58 Ettiswil NaN \n",
"68 NaN NaN \n",
"80 NaN NaN \n",
"88 NaN Old San Juan Bakery \n",
"90 NaN NaN \n",
"\n",
"[14 rows x 27 columns]"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Any where the location category isn't \"building\"?\n",
"all_coords[(all_coords._category != \"building\") & pd.notna(all_coords._category)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Obnoxiously, I don't know of any other way to fix these sorts of errors than to go through the results manually and replace suspect results with latitudes/longitudes from individual [Google Maps searches](https://support.google.com/maps/answer/18539?co=GENIE.Platform%3DDesktop&hl=en). (If anyone knows of a better way, please share.)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"\n",
"This method clearly isn't perfect - somewhere between 10 and 24 of the 100 addresses resulted in inaccurate queries. However, it's better than determining the geo-coordinates of 100 addresses manually!\n",
"\n",
"Once you have a list of latitudes and longitudes of addresses within Springfield, you can map them to city wards using Lauren's R app, located here: https://laurenmarietta.shinyapps.io/springfield_wards/\n",
"\n",
"Again, if you have any questions, please contact Lauren Chambers at lchambers@aclum.org."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}
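The two quality-control filters in the notebook (state is not Massachusetts, category is not "building") can be folded into one helper that also records why each row was flagged. This is only a minimal sketch: the function name `flag_suspect_results`, the `wrong_state` and `wrong_category` columns, and the tiny example frame are made up for illustration, and the only assumption about `all_coords` is that it has `state` and `_category` columns like the ones shown in the output above.

```python
import numpy as np
import pandas as pd


def flag_suspect_results(coords, expected_state="Massachusetts",
                         expected_category="building"):
    """Return the rows whose geocoded state or category looks wrong.

    Hypothetical helper, not part of the original notebook.
    """
    # Missing values are ignored, mirroring the pd.notna(...) filters above.
    wrong_state = coords["state"].notna() & (coords["state"] != expected_state)
    wrong_category = (
        coords["_category"].notna() & (coords["_category"] != expected_category)
    )

    # Keep every row that fails at least one check, and record which one.
    flagged = coords[wrong_state | wrong_category].copy()
    flagged["wrong_state"] = wrong_state.loc[flagged.index]
    flagged["wrong_category"] = wrong_category.loc[flagged.index]
    return flagged


# Made-up stand-in for all_coords, just to show the behaviour.
example = pd.DataFrame(
    {
        "name": ["Person A", "Person B", "Person C", "Person D"],
        "state": ["Massachusetts", "Aceh", "Massachusetts", np.nan],
        "_category": ["building", "place", "road", np.nan],
    }
)
suspects = flag_suspect_results(example)
print(suspects)
```

Calling `flag_suspect_results(all_coords)` in the notebook would give a single list of rows to check by hand, instead of two overlapping tables.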
Name Street Address City State Zip Code
Fiona Lor P.O. Box 123456 Springfield MA 01129
Ghassaan el-Sharifi 205 Bowles Park Springfield MA 01014
Jeremiah Lial 75 Clydesdale Lane Springfield MA 01129
Arhab al-Semaan 25 Bond Street Springfield MA 01107
Kaley Lin 71 Chauncey Drive Springfield MA 01151
Aparna Tram 71 Agnes Street Springfield MA 01108
Eric Tarar 357 Cottage Street Springfield MA 01119
Brandon Morales 1500 Main Street Suite 1234 Springfield MA 01101
Ivy Zhang 1727 South Branch Parkway Springfield MA 01128
Mukhtaar el-Mussa Po Box 1111 Springfield MA 01109
Faviola Sanchez 1785 Allen Street Springfield MA 01128
James Williams 1146 Allen Street Springfield MA 01118
Saurya Patel 1068 Bradley Road Springfield MA 01118
Brittany Shoels 435 Porter Lake Drive Springfield MA 01106
Emmanuel Chu 101 Corcoran Blvd Springfield MA 01119
Min New 106 Saint James Cir Springfield MA 01104
Tevin Swanson 25 Sparrow Drive Springfield MA 01119
Reece Huff 1340 Boston Road Springfield MA 01151
Hunter Burgos 769 Worthington Street Springfield MA 01101
Ashlee Hayes 57 Lyons Street Springfield MA 01151
Alyssa Wilson 196 Corcoran Boulevard Springfield MA 01118
Omar Venegas 1623 Bay Street Springfield MA 01119
Labeeb al-Saladin 54 Oak Hollow Road Springfield MA 01128
Lizette Madrid 2 South Greeting Road Springfield MA 01108
Thien-Kim Tran 100 Reeds Lndg Springfield MA 01111
Kara Kuljis 28 Lowell Street Springfield MA 01107
Naqiyya el-Saeed 2067 Allen Street Springfield MA 01128
Geoffrey Amelkin PO Box 1221 Springfield MA 01111
Jaarallah el-Zaman 74 Springfield Street Springfield MA 01199
Muneefa al-Aziz 117 Louis Road Springfield MA 01118
Shawna Bailey 170 Waldorf Street Springfield MA 01111
Linh Cayabyab 415 State Street Springfield MA 01105
Shadhaa el-Saade 23 Reed Street Springfield MA 01109
Kasia Woon Springfield MA 01119
Elizabeth Hernandez Bonilla 119 Stafford Street Springfield MA 01014
James Turner 125 Tinkham Road Springfield MA 01129
Jacobo Grajales 65 Olmsted Drive Springfield MA 01108
Robert Williams 37 Leland Drive Springfield MA 01111
Dakota Stogdill 1400 State Street Springfield MA 01109
Esainea Digby 1695 Main St Suite 444 Springfield MA 01109
Jonathan Klinkerman 600 South Branch Parkway Springfield MA 01106
Sengthong Goulet 139 Page Boulevard Springfield MA 01104
Anthony Montabon 10 Gralia Drive Springfield MA 01128
Mark Garcia 239 Commonwealth Avenue Springfield MA 01108
Karin Pacheco 122 Wayne Street Springfield MA 01118
Veronica Lujan 42 Old Lane Road Springfield MA 01129
Myranda Wisham 24 Craig Street Springfield MA 01108
Autumn Whittier 1000 Anniversary Street Springfield MA 01014
Shaafi el-Faraj 1059 South Branch Parkway Springfield MA 01118
Sameeha el-Fares 759 Chestnut Street Springfield MA 01199
Carina Gorostieta 28 Angelica Drive Springfield MA 01129
Veronica Colmenero Chavez 22 South Greeting Road Springfield MA 01108
Abdul Majeed el-Hariri 153 Hamilton Street Springfield MA 01151
Aubry Sapp 135 Jerilis Drive Springfield MA 01151
Ischara Mcgrew 136 Thompkins Avenue Springfield MA 01118
Joseph Kierstead 1084 Parker Street Springfield MA 01129
Isis O'Donnell 2100 Wilbraham Road Springfield MA 01119
Oscar Valdez 200 Washington Boulevard Springfield MA 01108
Ryan Coleman Apt. 123 Springfield MA 01128
Rachelle Yi 21 Emery Street Springfield MA 01104
Noori el-Selim 20 Lemnos Lane Springfield MA 01119
Jeremy Hofferber 53 Sherbrooke Street Springfield MA 01104
Cameron Huerta PO Box 80808 Springfield MA 01106
Lee Chong P.O. Box 65432 Springfield MA 01111
Jose Hernandez-Gonzalez 1727 South Branch Parkway Springfield MA 01128
Tae Mian 2 Pasco Road Springfield MA 01151
Ryan White 41 Brookdale Drive Springfield MA 01119
Ghazaala al-Majid 90 Parkerview Street Springfield MA 01129
Kabeera al-Samaan Springfield MA 01107
Mujaahida al-Assad 24 Puritan Road Springfield MA 01128
Christian Estrada 82 Stuart Street Springfield MA 01119
Jonathan Posey-Hughes 1059 South Branch Parkway Springfield MA 01118
Alexis Banks 37 Balis Street Springfield MA 01111
Ronald Harris 152 Harkness Avenue Springfield MA 01118
Jose Love 15 Wrenwood Street Springfield MA 01129
Francine Two Crow 100 Progress Avenue Springfield MA 01119
Thomas Gomez 36 Birch Glen Drive Springfield MA 01119
Michael Shepherd 275 Bicentennial Hwy Springfield MA 01118
Courtney Garrison 1306 Liberty Street Springfield MA 01104
Paul Wans 1060 Bay Street Springfield MA 01111
Crystal Juarez 1000 Five Mile Pond Road Springfield MA 01151
Francisco Infante 1200 Wilbraham Road Springfield MA 01119
Brooklyn Robinson 45 Davenport Street Springfield MA 01119
Cooper Paxton 130 Main Street Springfield MA 01151
Trezen Banks-Hill 40 Sonia Street Springfield MA 01129
Addison Bigbey 123 Glenoak Drive Springfield MA 01129
Asmar el-Salahuddin 81 Agnes Street Springfield MA 01108
Timothy Alexander 23 Egan Drive Springfield MA 01111
Brandon Joyce 2460 Main Street Springfield MA 01107
Colin Fitzgerald 86 West Allen Ridge Road Springfield MA 01118
Chen Phan 1234 Tamarack Drive Springfield MA 01118
Alyssa Washington 50 Benton Street Springfield MA 01109
Tyler Delaney 12 Pine Road Springfield MA 01108
Sara Magda 137 Oregon Street Springfield MA 01118
Phillip O Connor 720 Hall of Fame Avenue Springfield MA 01105
Mirenda Louis 241 Corcoran Boulevard Springfield MA 01118
Damien Baca 20 Barrington Dr Springfield MA 01129
Olivia Mckenna 26 Sterling Street Springfield MA 01199
Arielle Fisher 540 Tiffany Street Springfield MA 01108
Breanna Blanco-Araujo 43 Bay Street Springfield MA 01111
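
Several rows in the address file above have a street field that a geocoder is unlikely to resolve to a building: PO boxes, a missing street, or only an apartment number. Some of them (for example Kasia Woon, Ryan Coleman and Kabeera al-Samaan) appear to correspond to rows flagged in the notebook's quality-control output. The sketch below screens the street field before any API request is made. The function name `screen_street` and the two regular expressions are illustrative assumptions, not part of the original notebook, and the rules are deliberately simple.

```python
import re

# Matches "PO Box", "P.O. Box", "Po Box", etc. at the start of the street field.
PO_BOX = re.compile(r"^\s*p\.?\s*o\.?\s*box\b", re.IGNORECASE)
# A street address usually starts with a house number, e.g. "25 Bond Street".
STARTS_WITH_NUMBER = re.compile(r"^\s*\d+\s+\S")


def screen_street(street):
    """Return None if the street looks geocodable, else a short reason string.

    Hypothetical helper for illustration only.
    """
    if not street or not street.strip():
        return "missing street"
    if PO_BOX.match(street):
        return "PO box"
    if not STARTS_WITH_NUMBER.match(street):
        return "no house number"
    return None


# Street values taken from the address file above.
samples = ["25 Bond Street", "P.O. Box 123456", "Po Box 1111", "", "Apt. 123"]
for street in samples:
    print(repr(street), "->", screen_street(street))
```

Rows that return a reason could be set aside for manual lookup up front, which also saves a few requests against the daily quota.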