@burdandrei
Created November 17, 2015 15:02
Export all your data from CloudSearch and be free!
#!/usr/bin/env ruby
#
# CloudSearch export script
#
# Required ENV Variables
# * AWS_ACCESS_KEY_ID
# * AWS_SECRET_ACCESS_KEY
# * CS_SEARCH_ENDPOINT
# * OUT_FILENAME
#
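require 'aws-sdk'

# Client for the domain's search endpoint; credentials are read from the environment.
search_instance = Aws::CloudSearchDomain::Client.new(endpoint: ENV['CS_SEARCH_ENDPOINT'])
cursor = 'initial' # 'initial' requests the first page of a cursor-based scan
bulk_size = 10000 # max CloudSearch bulk size

# The block form of File.open closes the file even if a request raises.
File.open(ENV['OUT_FILENAME'], 'w') do |f|
  loop do
    s = search_instance.search(
      query: 'matchall',
      query_parser: 'structured',
      cursor: cursor,
      size: bulk_size
    )
    # An empty page means every document has already been written.
    break if s.hits.hit.empty?

    s.hits.hit.each do |doc|
      f.puts(doc.fields)
    end
    cursor = s.hits.cursor
  end
end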
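
The script writes each hit with `f.puts(doc.fields)`, which prints the fields in Ruby's `Hash#inspect` form and drops the document id. If the export needs to be re-imported elsewhere, one JSON object per line may be easier to parse. The following is a minimal sketch of that idea, not part of the original gist: `Hit` is a hypothetical stand-in for a CloudSearch hit (assumed to expose `id` and `fields`), and `write_hits_as_json_lines` is an illustrative helper name.

```ruby
require 'json'
require 'stringio'

# Hypothetical stand-in for a CloudSearch hit: an id plus a hash of field arrays.
Hit = Struct.new(:id, :fields)

# Writes one JSON object per line (JSON Lines) to any IO-like object
# and returns the number of documents written.
def write_hits_as_json_lines(io, hits)
  hits.each do |hit|
    io.puts(JSON.generate('id' => hit.id, 'fields' => hit.fields))
  end
  hits.size
end

# Usage with an in-memory buffer instead of a real file or a real search call.
out = StringIO.new
count = write_hits_as_json_lines(out, [
  Hit.new('doc-1', { 'title' => ['First'] }),
  Hit.new('doc-2', { 'title' => ['Second'], 'tags' => %w[a b] })
])
```

In the export loop, the same helper could take the file handle and `s.hits.hit` in place of the inner `each` block, assuming the SDK's hit objects respond to `id` and `fields`.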