@abelsonlive
Last active August 29, 2015 14:15
1,50 Diamond St. Brooklyn NY
2,442 George Road, New York, NY
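"""MR-based geocoder
"""
from mrjob.job import MRJob


def geocode(address):
    # Placeholder: the gist calls geocode() without defining or importing it.
    # Supply a function that returns a (lat, lng) pair for an address.
    raise NotImplementedError("plug in a geocoding backend")


class MRGeocode(MRJob):

    def mapper(self, _, line):
        # Split on the first comma only: addresses may contain commas.
        row_id, address = line.split(',', 1)
        lat, lng = geocode(address)
        line += ",{},{}".format(lat, lng)
        yield row_id, line


if __name__ == '__main__':
    MRGeocode.run()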
# locally
python mrgeo.py input.csv > geocoded.csv
# on EMR
python mrgeo.py input.csv -r emr > geocoded.csv
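
The per-line transformation done by the mapper can be tried without mrjob or a geocoding service. The following is only a minimal sketch under stated assumptions: `fake_geocode`, `FAKE_COORDS`, and `geocode_line` are hypothetical names that do not appear in the gist, and the coordinates are made-up placeholders rather than real geocoding results. It splits on the first comma only, so an address that itself contains commas stays intact.

```python
# Hypothetical stand-in for geocode(); the coordinates are placeholders,
# not real geocoding results.
FAKE_COORDS = {
    "50 Diamond St. Brooklyn NY": (1.0, 2.0),
    "442 George Road, New York, NY": (3.0, 4.0),
}


def fake_geocode(address):
    """Look up an address in the hard-coded table above."""
    return FAKE_COORDS[address.strip()]


def geocode_line(line, geocode=fake_geocode):
    """Return (row_id, line with ',lat,lng' appended), like the mapper yields."""
    # Split on the first comma only, so commas inside the address survive.
    row_id, address = line.split(",", 1)
    lat, lng = geocode(address)
    return row_id, "{},{},{}".format(line, lat, lng)


if __name__ == "__main__":
    for raw in ["1,50 Diamond St. Brooklyn NY", "2,442 George Road, New York, NY"]:
        print(geocode_line(raw))
```

Passing the geocoder as a parameter keeps the line handling testable on its own; a real backend could be swapped in for `fake_geocode` once the parsing is known to work.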