Created
December 1, 2022 19:13
ProgressRegressor class
# Useful for tracking the progress of a large batch job.
# Pass in 'counter', a lambda that returns an integer.
# This will periodically call the counter,
# track the rate at which counted things are consumed (or produced),
# and predict when the job will complete.
# Relies on ActiveSupport (Time.zone, Numeric#minutes), e.g. in a Rails console.
# options:
# - divisor: how many times per minute to sample, default 6 (every ten seconds)
# - rolling_window: how many rate samples to average when making predictions, default 10
# - target: the value we are aiming for, default 0 (consuming a queue)
# example:
# We have a lot of sidekiq jobs in progress, predict when they'll finish:
# > ProgressRegressor.new(lambda{ Sidekiq::Queue.new('mailers').size }).run
# You know you need to create 100k Widget records:
# > ProgressRegressor.new(lambda{ Widget.count }, target: 100_000).run
require 'benchmark'

class ProgressRegressor
  attr_accessor :counter, :divisor, :rate_samples, :rolling_window, :target

  def initialize(counter, divisor: 6, rolling_window: 10, target: 0)
    @counter = counter
    @divisor = divisor
    @rolling_window = rolling_window
    @target = target
    @rate_samples = []
  end

  # Takes 100 samples, logging a fresh estimate after each one.
  def run
    100.times do
      begin
        before = counter.call
        after = nil # declared here so the value assigned inside the block is visible afterwards
        duration =
          Benchmark.realtime do
            sleep 60.0 / divisor
            after = counter.call
          end
        # Items per minute, scaled by the measured wall-clock time of this sample
        rate = (before - after).abs * (60.0 / duration)
        rate_samples << rate
        rolling_rate = mean(rate_samples.last(rolling_window))
        remaining = (after - target).abs
        # Minutes left; with a zero rate this is Infinity or NaN, so to_i raises and the rescue logs it
        rolling_duration = (remaining * 1.0 / rolling_rate)
        log "Remaining: #{remaining}, rate: #{rate.round(1)}, rolling rate: #{rolling_rate.round(1)}, minutes left: #{rolling_duration.to_i}, done estimate: #{done_time(rolling_duration)}"
      rescue => err
        log "Rescued #{err}"
      end
    end
  end

  # duration is in minutes; include the date when the estimate is more than a day away
  def done_time(duration)
    format = duration > (24 * 60) ? '%B %d %H:%M:%S' : '%H:%M:%S'
    (Time.zone.now + duration.minutes).strftime(format)
  end

  def mean(array)
    array.sum * 1.0 / array.size
  end

  def log(msg)
    # Same output as the :db format, without depending on Time#to_s(:db)
    puts "#{Time.now.strftime('%Y-%m-%d %H:%M:%S')} #{msg}"
  end
end
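
The estimate itself is plain arithmetic, so it can be tried without sleeping or loading Rails. The following is a minimal sketch, not part of the gist: `estimate_minutes_left` and its `interval_seconds:` parameter are hypothetical names, and it assumes the samples were taken at a fixed interval instead of timing each one with `Benchmark.realtime`. It also returns `nil` where the class above would hit its rescue (no samples yet, or no observed progress).

```ruby
# Hypothetical helper, not part of the gist: mirrors the rate and
# rolling-average arithmetic of ProgressRegressor#run on recorded samples.
def estimate_minutes_left(samples, interval_seconds:, rolling_window: 10, target: 0)
  # Per-minute rate between each pair of consecutive counter readings.
  rates = samples.each_cons(2).map do |before, after|
    (before - after).abs * (60.0 / interval_seconds)
  end

  recent = rates.last(rolling_window)
  return nil if recent.empty? # fewer than two samples, nothing to average

  rolling_rate = recent.sum / recent.size
  return nil if rolling_rate.zero? # no progress observed, no finite estimate

  remaining = (samples.last - target).abs
  remaining / rolling_rate # minutes
end

# A queue shrinking by 100 every ten seconds, with 700 items left.
queue_sizes = [1000, 900, 800, 700]
puts estimate_minutes_left(queue_sizes, interval_seconds: 10).round(2)
```

Averaging only the last `rolling_window` rates lets the estimate follow recent throughput, so an early burst or stall stops influencing the prediction once it falls out of the window.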