@ericrobskyhuntley
Forked from rgdonohue/README.md
Last active March 22, 2020 00:26
Batch Geocoding Script with GeoPy

Multi-Service Geocoder

This Python script uses the GeoPy geocoding library to batch geocode a number of addresses, trying various services in turn until one returns a latitude/longitude pair. It is a Python 3 port and refactor of a script by @rgdonohue.

https://gist.github.com/ericmhuntley/0c293113aa75a254237c143e0cf962fa

The script expects an input CSV that includes columns named street, city, state, country.
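Since the script looks these columns up by name, it can help to check the header before starting a long run. The following is a minimal sketch using only the standard library; the helper name `missing_columns` and the sample address are illustrative assumptions, not part of the original script.

```python
import csv
import io

# Column names the geocoding script reads from each row.
REQUIRED_COLUMNS = ("street", "city", "state", "country")


def missing_columns(csv_file):
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = set(header)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# In-memory stand-in for a real file opened with open(path, newline='').
sample = io.StringIO(
    "street,city,state,country\n"
    "77 Massachusetts Ave,Cambridge,MA,USA\n"
)
print(missing_columns(sample))  # prints []
```

The comparison is exact and case-sensitive, which mirrors how the script indexes each row by column name.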

Usage Example

python geocode.py data.csv 100

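Where data.csv is an appropriately formatted CSV encoded in UTF-8 and 100 is the timeout applied to each request. GeoPy interprets its timeout parameter as the number of seconds to wait for a response before giving up, not as a pause between requests in milliseconds.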
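GeoPy's timeout setting limits how long a single request may wait for a response, in seconds; it does not space requests apart. If a pause between requests is what you need (many services ask for one), a small wrapper can enforce it. This is a sketch under assumptions: `throttled`, `min_delay`, and `fake_geocode` are hypothetical names, and the stand-in function replaces a real geocoder's `geocode` method so the example runs offline. GeoPy also ships a `RateLimiter` helper in `geopy.extra.rate_limiter` that serves a similar purpose.

```python
import time


def throttled(func, min_delay):
    """Wrap func so consecutive calls start at least min_delay seconds apart."""
    last_call = [None]  # mutable cell holding the start time of the previous call

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            remaining = min_delay - (time.monotonic() - last_call[0])
            if remaining > 0:
                time.sleep(remaining)
        last_call[0] = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


def fake_geocode(query):
    """Stand-in for a real geocoder's .geocode method (assumption)."""
    return (0.0, 0.0)


geocode = throttled(fake_geocode, min_delay=0.1)
start = time.monotonic()
for query in ["first address", "second address", "third address"]:
    geocode(query)
elapsed = time.monotonic() - start  # two enforced gaps of 0.1 s each
```

In the script, the same wrapper could be applied to each geocoder's `geocode` method before the loop over services.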

# geocode.py
# Import the geocoding services you'd like to try.
import sys

import pandas as pd
from geopy.exc import GeopyError
from geopy.geocoders import ArcGIS, Bing, Nominatim, OpenCage, GoogleV3, OpenMapQuest

import keys  # local keys.py holding API keys and the Nominatim user agent

in_file = str(sys.argv[1])
out_file = 'gc_' + in_file
# GeoPy's timeout is the number of seconds to wait for a response.
timeout = int(sys.argv[2])

print('creating geocoding objects.')
arcgis = ArcGIS(timeout=timeout)
bing = Bing(api_key=keys.bing_api, timeout=timeout)
nominatim = Nominatim(user_agent=keys.n_user, timeout=timeout)
opencage = OpenCage(api_key=keys.oc_api, timeout=timeout)
googlev3 = GoogleV3(api_key=keys.g3_api, domain='maps.googleapis.com', timeout=timeout)
openmapquest = OpenMapQuest(api_key=keys.omq_api, timeout=timeout)

# Choose and order your preference for geocoders here.
geocoders = [openmapquest, nominatim, opencage, googlev3, arcgis]


def gc(address):
    """Try each geocoder in order and return the first lat/lng pair found."""
    street = str(address['street'])
    city = str(address['city'])
    state = str(address['state'])
    country = str(address['country'])
    add_concat = street + ", " + city + ", " + state + " " + country
    for gcoder in geocoders:
        try:
            location = gcoder.geocode(add_concat)
        except GeopyError as err:
            # A timeout or service error should not abort the whole batch.
            print(f'{type(gcoder).__name__} raised {err!r}; trying next service.')
            continue
        if location is not None:
            print(f'geocoded record {address.name}: {street}')
            return pd.Series({
                'lat': location.latitude,
                'lng': location.longitude,
                'time': pd.to_datetime('now')
            })
    # Every service either failed or returned no match.
    print(f'failed to geolocate record {address.name}: {street}')
    return pd.Series({
        'lat': 'null',
        'lng': 'null',
        'time': pd.to_datetime('now')
    })


print('opening input.')
reader = pd.read_csv(in_file, header=0)
print('geocoding addresses.')
reader = reader.merge(reader.apply(gc, axis=1), left_index=True, right_index=True)
print(f'writing to {out_file}.')
reader.to_csv(out_file, encoding='utf-8', index=False)
print('done.')
# keys.py
# Uncomment and fill in the credentials for the services you use.
# User Agent identification (e.g., email address) for Nominatim
# (Querying without user agent is against ToS)
# n_user = ''
# Bing API key
# bing_api = ''
# OpenCage API key
# oc_api = ''
# GoogleV3 API key
# g3_api = ''
# OpenMapQuest API key
# omq_api = ''
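If you would rather not store credentials in a file, the keys module imported by the script could read them from environment variables instead. This is only a sketch: the environment variable names below are hypothetical and can be anything you choose, while the attribute names on the left match the ones the script reads.

```python
import os

# Attribute name used by the script -> environment variable (names are assumptions).
ENV_NAMES = {
    "n_user": "NOMINATIM_USER_AGENT",
    "bing_api": "BING_API_KEY",
    "oc_api": "OPENCAGE_API_KEY",
    "g3_api": "GOOGLE_API_KEY",
    "omq_api": "OPENMAPQUEST_API_KEY",
}


def load_keys(env=None):
    """Return a dict of credentials, using '' for any variable that is unset."""
    env = os.environ if env is None else env
    return {attr: env.get(var, "") for attr, var in ENV_NAMES.items()}


# Expose the values as module-level names so `keys.bing_api` etc. still work.
globals().update(load_keys())
```

Saved as keys.py, this keeps `import keys` in the script unchanged while leaving the secrets out of version control.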
@jazon33y

Looks great, and thank you for posting this! Are these all of the free geocoding APIs? Just curious as to the choice of geocoding services.

Cheers,

@ericrobskyhuntley
Author

ericrobskyhuntley commented Oct 2, 2018

@jazon33y - thanks for the question! This is a fairly thin wrapper around a selection of geocoding services supported by the GeoPy module. For a complete list of supported geocoders (any of which could be easily implemented here, given access to an API key), check out the GeoPy documentation.

@Kraussvan7

Hello Eric,

Thanks for posting this valuable code (at least for a newbie like me).
I would like to ask if you could add a feature to it.
Imagine that you have thousands of addresses to geolocate: every single failure (a communication problem, for example) makes you restart from the beginning.
Would it be possible to include a counter so that, when it reaches its value (which could be a parameter too), the script opens the output file, appends the new set of processed records, resets the counter, and goes on to the next record?

Thank you in advance!

Best regards
KV
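A minimal sketch of the checkpointing idea described in this comment, not part of the original script: rows are appended to the output file every `batch_size` records, and a rerun skips rows that were already written. The function name `geocode_in_batches`, its parameters, and the `geocode` callable (any function that takes a row dict and returns a lat/lng pair) are assumptions for illustration; it uses the standard library `csv` module rather than pandas.

```python
import csv
import os


def geocode_in_batches(in_path, out_path, geocode, batch_size=100):
    """Geocode rows from in_path, appending to out_path every batch_size rows.

    Rows already present in out_path are skipped, so a rerun resumes
    where the previous run stopped.
    """
    lines = 0
    if os.path.exists(out_path):
        with open(out_path, newline="", encoding="utf-8") as f:
            lines = sum(1 for _ in csv.reader(f))
    done = max(lines - 1, 0)  # data rows already written (minus the header)

    with open(in_path, newline="", encoding="utf-8") as src, \
            open(out_path, "a", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fields = list(reader.fieldnames or []) + ["lat", "lng"]
        writer = csv.DictWriter(dst, fieldnames=fields)
        if lines == 0:
            writer.writeheader()
        batch = []
        try:
            for i, row in enumerate(reader):
                if i < done:
                    continue  # written by an earlier run
                row["lat"], row["lng"] = geocode(row)
                batch.append(row)
                if len(batch) >= batch_size:
                    writer.writerows(batch)
                    dst.flush()  # push the checkpoint to disk
                    batch = []
        finally:
            # Save whatever was geocoded before finishing or failing.
            writer.writerows(batch)
```

Calling it again with the same paths after a failure picks up at the first row that is missing from the output file.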