@bpaquet
Last active August 29, 2015 14:16
Dump a large Elasticsearch index to a CSV file
require 'json'

# Dump selected fields of every document in an index as semicolon-separated
# lines on stdout, paging through the index with the scroll API.
scroll_id = nil
counter = 0
loop do
  $stderr.puts "Running #{counter}"
  if scroll_id
    # Later requests: fetch the next page using the scroll id.
    cmd = "curl -X GET -s 'http://localhost:9200/_search/scroll?scroll=1m' -d '#{scroll_id}'"
  else
    # First request: open a scroll context that is kept alive for 1 minute.
    cmd = "curl -X GET -s 'http://localhost:9200/logstash-2015.02.27/_search?scroll=1m&size=1000'"
  end
  res = %x{#{cmd}}.strip
  body = JSON.parse(res)
  # Always use the most recent scroll id; it can change between pages.
  scroll_id = body['_scroll_id']
  hits = body['hits']['hits']
  break if hits.empty?
  counter += hits.length
  hits.each do |x|
    src = x['_source']
    # host;path;type;size of the JSON-encoded source;size of the message
    puts "#{src['host']};#{src['path']};#{src['type']};#{JSON.dump(src).size};#{src['message'].to_s.size};"
  end
end
$stderr.puts "End #{counter}"
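
The script joins fields with `;` by hand, so a host or path that itself contains `;`, a quote, or a newline would produce a malformed row. A minimal sketch of one way to avoid that, assuming Ruby's standard `csv` library is acceptable here: `hit_to_csv_row` and the sample hit are hypothetical names and data introduced for illustration, not part of the original gist.

```ruby
require 'csv'
require 'json'

# Hypothetical helper (not in the original script): turn one Elasticsearch
# hit into a single CSV line, using ';' as the separator like the script does.
def hit_to_csv_row(hit)
  src = hit['_source'] || {}
  fields = [
    src['host'],
    src['path'],
    src['type'],
    JSON.dump(src).size,      # size of the JSON-encoded source
    src['message'].to_s.size  # 0 when the document has no message
  ]
  # CSV.generate_line quotes any field containing the separator,
  # a double quote, or a line break, and appends the row terminator.
  CSV.generate_line(fields, col_sep: ';')
end

# Made-up sample hit whose host contains the separator character.
sample = {
  '_source' => {
    'host' => 'web;01',
    'path' => '/var/log/app.log',
    'type' => 'app',
    'message' => 'hello'
  }
}
print hit_to_csv_row(sample)
```

In the loop above, the `puts "..."` line could then become `print hit_to_csv_row(x)`. Unlike the original output, the row would no longer end with a trailing `;`.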