@smith
Last active July 1, 2024 13:30
The least complicated way to import a CSV into Elasticsearch
#!/bin/bash
# The simplest way to import a CSV into Elasticsearch without using a UI or depending on an Elasticsearch client library.
#
# Set the $INDEX (the index name to import to) and $URL (the _bulk endpoint URL) environment variables to override the defaults below.
#
# Requires `bash`, `awk`, `curl`, and [`mlr`](https://miller.readthedocs.io/en/6.12.0/) (for converting CSV to JSON).
#
# Usage: bash ./import.sh < filename.csv
INDEX="${INDEX:-entities-github_project_item-default}"
URL="${URL:-http://elastic:changeme@localhost:9200/$INDEX/_bulk}"
# Convert to ndjson
mlr --icsv --ojsonl cat |
# Add the index operation above each line
# (braces need no backslash in an awk string; some awks would print "\{" literally)
awk -v i="$INDEX" \
'{ printf "{ \"index\": { \"_index\": \"%s\" } }\n%s\n", i, $0 }' |
# Post to _bulk API
curl -H "Content-Type: application/json; charset=UTF-8" -XPOST "$URL" --data-binary @-
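
To see what the awk stage sends to `_bulk` without running Miller or Elasticsearch, here is a minimal sketch. The `to_bulk` function name, the `my-index` index name, and the two sample documents are made up for illustration and are not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): wraps each NDJSON
# line on stdin in a bulk "index" action for the index named in $1.
to_bulk() {
  awk -v i="$1" '{ printf "{ \"index\": { \"_index\": \"%s\" } }\n%s\n", i, $0 }'
}

# Two sample documents, standing in for the NDJSON that mlr would emit.
printf '%s\n' '{"id": 1, "name": "a"}' '{"id": 2, "name": "b"}' | to_bulk my-index
```

Each document line comes out preceded by its own `{ "index": { "_index": "my-index" } }` action line, which is the pairing the `_bulk` endpoint expects.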
smith commented Jun 28, 2024

One-line version:

mlr --icsv --ojsonl cat | awk -v i="$INDEX" '{ printf "{ \"index\": { \"_index\": \"%s\" } }\n%s\n", i, $0 }' | curl -H "Content-Type: application/json; charset=UTF-8" -XPOST "$URL" --data-binary @-
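
One thing to watch for: the `_bulk` API can answer with HTTP 200 even when individual documents are rejected, and signals this with `"errors": true` in the response body. A rough sketch of a check follows. The `bulk_ok` name and the trimmed sample responses are assumptions for illustration, and a JSON-aware tool would be more robust than `grep`.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): reads a _bulk response
# on stdin and succeeds only if it does not contain "errors": true.
bulk_ok() {
  ! grep -Eq '"errors"[[:space:]]*:[[:space:]]*true'
}

# Sample response bodies, trimmed to the fields that matter here.
if printf '%s\n' '{"took":3,"errors":false,"items":[]}' | bulk_ok; then
  echo "import ok"
fi
if ! printf '%s\n' '{"took":3,"errors":true,"items":[]}' | bulk_ok; then
  echo "some documents were rejected" >&2
fi
```

To use it with the import, the output of `curl` could be piped into `bulk_ok` at the end of the pipeline, at the cost of no longer printing the full response.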
