@bamthomas
Last active September 12, 2022 11:37
scroll Elasticsearch with bash
#!/bin/bash
# from https://gist.github.com/cb372/4567f624894706c70e65
es_url=$1
index=$2
# Open the scroll: the first request runs the query and keeps the scroll context alive for 1 minute
response=$(curl -s -H 'content-type: application/json' "$es_url/$index/_search?scroll=1m" -d @query.json)
scroll_id=$(echo "$response" | jq -r ._scroll_id)
hits_count=$(echo "$response" | jq -r '.hits.hits | length')
hits_so_far=$hits_count
echo "Got initial response with $hits_count hits and scroll ID $scroll_id"
# TODO process first page of results here
echo "$response" | jq -r '.hits.hits[]._id' > es.csv
# Keep fetching pages until one comes back empty (an empty count, e.g. after a failed request, also stops the loop)
while [ "${hits_count:-0}" -gt 0 ]; do
  response=$(curl -s -H 'content-type: application/json' "$es_url/_search/scroll" -d "{ \"scroll\": \"1m\", \"scroll_id\": \"$scroll_id\" }")
  scroll_id=$(echo "$response" | jq -r ._scroll_id)
  hits_count=$(echo "$response" | jq -r '.hits.hits | length')
  hits_so_far=$((hits_so_far + hits_count))
  echo "Got response with $hits_count hits (hits so far: $hits_so_far), new scroll ID $scroll_id"
  # TODO process page of results
  echo "$response" | jq -r '.hits.hits[]._id' >> es.csv
done
echo Done!
@bamthomas (author) commented with an example query.json:

{
"_source": {
"includes": [ "_id" ]
},
"query": { "match_all": {} },
"size": 1000
}
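
A possible follow-up, not part of the original script: the loop builds the scroll request body with hand-escaped quotes and never releases the scroll context, which otherwise lingers on the server until the `1m` keep-alive expires. The sketch below shows one way to handle both. The helper names `scroll_body` and `clear_scroll` are hypothetical, and the code assumes the scroll ID contains no double quotes or backslashes.

```shell
#!/bin/bash
# Sketch only: scroll_body and clear_scroll are hypothetical helper names,
# not part of the original gist.

# Build the JSON body for a scroll request; printf avoids hand-escaped quotes.
# Assumes the scroll ID contains no double quotes or backslashes.
scroll_body() {
  local keep_alive=$1 scroll_id=$2
  printf '{ "scroll": "%s", "scroll_id": "%s" }' "$keep_alive" "$scroll_id"
}

# Release the server-side scroll context once the last page has been read,
# using the clear-scroll endpoint (DELETE on _search/scroll).
clear_scroll() {
  local es_url=$1 scroll_id=$2
  curl -s -X DELETE -H 'content-type: application/json' \
    "$es_url/_search/scroll" \
    -d "$(printf '{ "scroll_id": "%s" }' "$scroll_id")"
}
```

With these defined, the loop's `-d` argument could become `-d "$(scroll_body 1m "$scroll_id")"`, and `clear_scroll "$es_url" "$scroll_id"` could run just before the final `echo Done!`.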