@wycleffsean
Created August 11, 2017 07:41
Ruby Peach
def self.gzip_to_s3
  s3_bucket = Aws::S3::Resource.new(region: 'us-east-1')
                               .bucket('basf-content-server')
  total = SalesDataDocument.count
  docs = SalesDataDocument.all

  # Upload from a pool of worker threads; the block runs concurrently.
  Peach.enum_for(docs).each do |doc, queue|
    # Approximate under concurrency: other workers pop from the same queue.
    progress = total - queue.length
    Rails.logger.info <<~LOG if (progress % 20).zero?
      storing document ##{progress} of #{total}
    LOG

    s3_obj = s3_bucket.object("sales-data/#{Rails.env}/#{doc.name}.gz")
    # Skip documents whose S3 copy is already at least as new as the record.
    next if s3_obj.exists? && s3_obj.last_modified >= doc.updated_at

    file = get(doc.name)
    gzipped = ActiveSupport::Gzip.compress(file)
    s3_obj.put body: gzipped
  end
end

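# Peach ("parallel each"): yields each item of a list from a small thread pool.
class Peach
  private_class_method :new

  def self.enum_for(*args)
    new(*args)
  end

  def initialize(list, pool_size = 5)
    # Queue#<< returns the queue itself, so inject pushes every item onto it.
    @queue = list.inject(Queue.new, :<<)
    @pool_size = pool_size
  end

  # Yields (item, queue) from @pool_size worker threads; blocks until all finish.
  def each
    Array.new(@pool_size) do
      Thread.new do
        loop do
          # Non-blocking pop: a worker that loses the race for the last item
          # gets ThreadError instead of blocking forever on an empty queue.
          item = begin
            @queue.pop(true)
          rescue ThreadError
            break
          end
          yield(item, @queue)
        end
      end
    end.each(&:join)
  end
end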
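
A minimal usage sketch with plain Ruby objects instead of ActiveRecord and S3, assuming only the standard library. `parallel_each` is a hypothetical helper name, not part of the gist; it mirrors the queue-plus-thread-pool pattern of `Peach#each` so the snippet runs on its own. The `".gz"` suffixing is a placeholder for real per-item work. Because the block runs on several threads at once, the shared results array is guarded with a Mutex.

```ruby
# Hypothetical stand-in for Peach.enum_for(list).each, kept standalone
# so this sketch runs without the rest of the gist.
def parallel_each(list, pool_size = 5)
  queue = list.inject(Queue.new, :<<)
  Array.new(pool_size) do
    Thread.new do
      loop do
        item = begin
          queue.pop(true) # non-blocking; raises ThreadError once the queue is empty
        rescue ThreadError
          break
        end
        yield(item, queue)
      end
    end
  end.each(&:join)
end

results = []
lock = Mutex.new

parallel_each(%w[a b c d e f], 3) do |name, _queue|
  object_key = "#{name}.gz" # placeholder for real per-item work
  lock.synchronize { results << object_key } # the block runs on several threads
end

puts results.sort.inspect
```

Items finish in no guaranteed order, so anything that depends on ordering (such as the progress counter derived from `queue.length`) should be treated as approximate.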