@vanpelt
Created November 8, 2012 02:33
Don't trust querying mongo directly...
require 'rubygems'
require 'sequel'
require 'bson'
require 'set'
# Truncate a timestamp to midnight UTC on the first day of its month.
class Time
  def beginning_of_month
    Time.utc(year, month, 1)
  end
end
DB = Sequel.connect('postgres://foo:bar@localhost/builder')
BUFFER_LENGTH = 1000
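# Block form so every month gets its own Set; Hash.new(Set.new) would share
# one Set across all months.
months = Hash.new { |hash, month| hash[month] = Set.new }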
counter = 0
buffer = {}
io = File.new('conversions.bson', 'rb') # BSON is binary, so read in binary mode
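while !io.eof?
  ob = BSON.read_bson_document(io)
  next unless ob["finished_at"]
  month = ob["finished_at"].beginning_of_month
  key = [ob["worker_id"], month]
  if buffer.key?(key)
    # The row is still waiting in the buffer: aggregate in memory.
    buffer[key][:earnings] += ob["amount"]
    buffer[key][:channel] = ob["external_type"]
  elsif months[month].add?(ob["worker_id"]).nil?
    # Set#add? returns nil for an already-seen worker, so the row was flushed
    # earlier. Calling #update is what actually runs the statement.
    DB["UPDATE conversions SET earnings = earnings + ? WHERE worker_id = ? AND month = ?",
       ob["amount"], ob["worker_id"], month].update
  else
    # First record for this worker and month: start a buffered row.
    buffer[key] = { :earnings => ob["amount"], :channel => ob["external_type"] }
    if buffer.length >= BUFFER_LENGTH
      # Braces bind the block to map; do...end would bind it to the insert call.
      DB[:conversions].multi_insert(buffer.map { |k, v| v.merge(:worker_id => k[0], :month => k[1]) })
      buffer = {}
    end
  end
  counter += 1
  puts counter if counter % 10000 == 0
  if counter % 100 == 0
    $stdout.print "."
    $stdout.flush
  end
end
# Insert whatever is still buffered once the file is exhausted.
unless buffer.empty?
  DB[:conversions].multi_insert(buffer.map { |k, v| v.merge(:worker_id => k[0], :month => k[1]) })
end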
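
The script decides between inserting and updating by remembering which workers it has already seen in each month. Below is a minimal sketch of just that bookkeeping, with no database involved. It assumes the block form of `Hash.new` so that each month owns a separate `Set`; `SeenTracker` and `first_time?` are hypothetical names used only for illustration and are not part of the script.

```ruby
require 'set'

# Hypothetical helper: records a worker for a month and reports whether
# this is the first time that (month, worker) pair has been seen.
class SeenTracker
  def initialize
    # Block form: each missing month is assigned its own fresh Set.
    @months = Hash.new { |hash, month| hash[month] = Set.new }
  end

  def first_time?(month, worker_id)
    # Set#add? returns nil when the element was already present.
    !@months[month].add?(worker_id).nil?
  end
end

tracker = SeenTracker.new
jan = Time.utc(2012, 1, 1)
feb = Time.utc(2012, 2, 1)
tracker.first_time?(jan, 42) # first sighting in January
tracker.first_time?(jan, 42) # repeat within January
tracker.first_time?(feb, 42) # same worker, different month
```

Keeping one set per month matters because the same worker can legitimately appear in several months, and each of those needs its own row.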