@tsnobip
Created December 14, 2018 12:49
Python script to bulk insert an NDJSON file into an Elasticsearch cluster
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
es_host = "MY_ES_URL:PORT"
user = "MY_USERNAME"
secret = "MY_PASSWORD"
file_path = "/PATH/TO/FILE.ndjson"
index = "INDEX_NAME"
doc_type = "_doc"
es = Elasticsearch(hosts=[es_host],
                   http_auth=(user, secret),
                   port=443,
                   scheme="https",
                   timeout=90)
def read_file_generator(ndjson_path):
    # Yield one JSON document per line; the with block closes the file when done.
    with open(ndjson_path, 'r') as file_object:
        for line in file_object:
            doc = line.strip()
            if doc:  # skip blank lines
                yield doc
bulk(es, read_file_generator(file_path), index=index, doc_type=doc_type)
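
A possible variant of the generator, sketched here under a few assumptions: it parses each line with `json.loads` so malformed lines fail early with a line number, and it yields action dicts carrying `_index` and `_source`, which `elasticsearch.helpers.bulk` also accepts. The names `ndjson_actions` and `index_name` are hypothetical and not part of the original script.

```python
import json


def ndjson_actions(ndjson_path, index_name):
    """Yield one bulk action dict per non-empty line of an NDJSON file.

    Hypothetical helper: the function and parameter names are illustrative.
    """
    with open(ndjson_path, "r", encoding="utf-8") as file_object:
        for line_number, line in enumerate(file_object, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between documents
            try:
                source = json.loads(line)
            except json.JSONDecodeError as exc:
                # Report where the bad document is instead of failing mid-bulk.
                raise ValueError(
                    f"{ndjson_path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
            yield {"_index": index_name, "_source": source}
```

With the client from the script above, this could be used as `bulk(es, ndjson_actions(file_path, index))`. Because each action names its own `_index`, the `index` keyword argument should not be needed in that call.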