@mperham
Created January 16, 2014 21:10
Using #join with a Sidekiq::Batch in Rails to parallelize data migration
class MoveInventoryUpdatesToS3 < ActiveRecord::Migration
  def up
    batch = Sidekiq::Batch.new
    batch.jobs do
      InventoryUpdate.where("content is not null").pluck(:id).each do |iuid|
        InventoryUpdate.delay.send_to_s3(iuid)
      end
    end
    # wait for all jobs to finish
    batch.status.join
  end
end
@jelder

jelder commented Jan 16, 2014

Beautiful, but what about a scenario where the migration adds a column (say, for counter_cache: true) and we want to populate it using Sidekiq? I don't see anywhere to insert an InventoryUpdate.reset_column_information call in your example.

@mperham
Author

mperham commented Jan 16, 2014

That might be a case for a two-step migration, or for populating the column with a SQL UPDATE statement.
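
As a rough sketch of the SQL UPDATE option, a counter-cache column could be backfilled with one correlated subquery instead of per-row jobs. The helper name `counter_cache_backfill_sql` and every table and column name used with it are hypothetical, chosen only for illustration; nothing here comes from the gist itself.

```ruby
# Hypothetical helper (not part of Rails or Sidekiq): builds one UPDATE
# statement that sets a counter-cache column from a correlated COUNT(*).
# Identifiers are interpolated directly, so pass only trusted, hard-coded names.
def counter_cache_backfill_sql(parent_table:, child_table:, foreign_key:, counter_column:)
  <<~SQL.strip
    UPDATE #{parent_table}
    SET #{counter_column} = (
      SELECT COUNT(*)
      FROM #{child_table}
      WHERE #{child_table}.#{foreign_key} = #{parent_table}.id
    )
  SQL
end

# Illustrative call; "products" and "product_id" are assumed names.
puts counter_cache_backfill_sql(
  parent_table: "products",
  child_table: "inventory_updates",
  foreign_key: "product_id",
  counter_column: "inventory_updates_count"
)
```

Inside a migration's `up`, the resulting string could be passed to `execute` right after the `add_column` call. Because the statement is raw SQL, the model's cached column list is never consulted, so this route should not need `reset_column_information`.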
