Skip to content

Instantly share code, notes, and snippets.

@cluesque
Created December 1, 2022 19:13
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save cluesque/842e07d16aa8e1e2efa333f987f7802c to your computer and use it in GitHub Desktop.
Save cluesque/842e07d16aa8e1e2efa333f987f7802c to your computer and use it in GitHub Desktop.
ProgressRegressor class
# Useful for tracking the progress of a large batch job
# Pass in 'counter', a lambda that returns an integer
# This will periodically call the counter,
# track the rate counted things are consumed (or produced)
# and make a prediction about when it completes
# options:
# - divisor: how many times per minute to sample, default 6 (every ten seconds)
# - rolling_window: how many rate samples to average when making predictions (default 10)
# - target: what value are we aiming for, default 0 (consuming a queue)
# example:
# We have a lot of sidekiq jobs in progress, predict when they'll finish:
# > ProgressRegressor.new(lambda{ Sidekiq::Queue.new('mailers').size }).run
# You know you need to create 100k Widget records
# > ProgressRegressor.new(lambda{ Widget.count }, target: 100_000).run
class ProgressRegressor
attr_accessor :counter, :divisor, :rate_samples, :rolling_window, :target
def initialize(counter, divisor: 6, rolling_window: 10, target: 0)
@counter = counter
@divisor = divisor
@rolling_window = rolling_window
@target = target
@rate_samples = []
end
def run
100.times do
begin
before = counter.call
after = nil # ruby scoping is weird
duration =
Benchmark.realtime do
sleep 60.0 / divisor
after = counter.call
end
rate = (before - after).abs * (60.0 / duration)
rate_samples << rate
duration = (after * 1.0 / rate)
rolling_rate = mean(rate_samples.last(rolling_window))
remaining = (after - target).abs
rolling_duration = (remaining * 1.0 / rolling_rate)
log "Remaining: #{remaining}, Rate: #{rate.round(1)}, rolling rate: #{rolling_rate.round(1)} minutes left: #{rolling_duration.to_i}, done estimate: #{done_time(rolling_duration)}"
rescue => err
log "Rescued #{err}"
end
end
end
def done_time(duration)
format = duration > (24 * 60) ? '%B %d %H:%M:%S' : '%H:%M:%S'
(Time.zone.now + duration.minutes).strftime(format)
end
def mean(array)
array.sum * 1.0 / array.size
end
def log(msg)
puts "#{Time.now.to_s(:db)} #{msg}"
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment