@alexwoolford
Last active August 29, 2015 14:06
# The csv dataset was downloaded from http://data.denvergov.org/dataset/city-and-county-of-denver-crime
# Note: this script targets Python 2 (it relies on dict.iteritems()).
import json

import pandas as pd
from elasticsearch import Elasticsearch

data = pd.read_csv('/Users/awoolford/Downloads/crime.csv')
es = Elasticsearch()

# ignore the 400 caused by IndexAlreadyExistsException when creating an index
es.indices.create(index='crime-index', ignore=400)

for id, record in enumerate(json.loads(data.to_json(orient='records'))):
    doc = dict()
    for key, value in record.iteritems():
        try:
            doc[key.encode('utf-8')] = value.encode('utf-8')
        except AttributeError:
            # non-string values (numbers, None) have no encode(); store them unchanged
            doc[key.encode('utf-8')] = value
    # combine the coordinates into a [lon, lat] pair
    doc['LOCATION'] = [doc['GEO_LON'], doc['GEO_LAT']]
    # write to the same index that was created above
    es.index(index='crime-index', doc_type='crime', id=id, body=doc)
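
The loop above sends one request per row. As a rough sketch only, the same record-to-document step could be pulled into a generator that yields action dicts in the shape accepted by `elasticsearch.helpers.bulk`. The helper name `crime_actions`, the sample rows, the `OFFENSE_ID` field, and the choice to skip `LOCATION` when a coordinate is missing are all assumptions made for illustration; none of them come from the original script.

```python
def crime_actions(records, index_name="crime-index"):
    """Yield one bulk-style action dict per record (hypothetical helper)."""
    for doc_id, record in enumerate(records):
        doc = dict(record)
        lon, lat = doc.get("GEO_LON"), doc.get("GEO_LAT")
        # Assumption: leave LOCATION out when either coordinate is missing.
        if lon is not None and lat is not None:
            doc["LOCATION"] = [lon, lat]  # longitude first, as in the script above
        yield {"_index": index_name, "_id": doc_id, "_source": doc}


# Made-up sample rows, only to show the shape of the output.
sample = [
    {"OFFENSE_ID": 1, "GEO_LON": -104.99, "GEO_LAT": 39.74},
    {"OFFENSE_ID": 2, "GEO_LON": None, "GEO_LAT": None},
]
actions = list(crime_actions(sample))
```

With a live cluster, the generator could then be handed to `elasticsearch.helpers.bulk(es, crime_actions(records))` in place of the per-row `es.index` calls. That has not been tested against the Denver dataset here.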