@Rotzke
Forked from rjurney/load_directory_json.gz.sh
Created February 21, 2022 21:00
How to bulk load gzip'd newline-delimited JSON into Elasticsearch
# Bulk load the Foo data we prepared via PySpark in etl/transform_foo.spark.py
# Assumes USER_STRING holds curl auth flags (e.g. -u user:pass) and that
# HOSTNAME and PORT point at the Elasticsearch cluster.
mkdir -p data/foo/elastic_report

for path in data/foo/elastic/part*
do
  file=$(basename "${path}")
  echo "Submitting ${path} to Elastic index foo ..."
  # Each part file is gzip'd NDJSON; the Content-Encoding header tells
  # Elasticsearch to decompress the request body server-side.
  curl ${USER_STRING} \
    -X POST \
    -H "Content-Encoding: gzip" \
    -H "Content-Type: application/x-ndjson" \
    "http://${HOSTNAME}:${PORT}/foo/_bulk" \
    --data-binary "@${path}" \
    > "data/foo/elastic_report/${file}.json"
done
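Each saved report is the `_bulk` API response for one part file, and its top-level `errors` field is `true` if any individual action failed. A minimal follow-up check, assuming the report directory used above (adjust the path if yours differs):

```shell
# Scan each saved _bulk response for failed items. The _bulk API returns
# "errors": true at the top level when any action in the batch failed.
REPORT_DIR="data/foo/elastic_report"   # assumed path from the loop above
for report in "${REPORT_DIR}"/*.json
do
  [ -e "${report}" ] || continue       # glob matched nothing; skip
  if grep -q '"errors" *: *true' "${report}"; then
    echo "Bulk errors in ${report}"
  fi
done
```

This only flags which batches had failures; to see the per-document reasons, inspect the `items` array inside the flagged report files.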