@pcockwell
Last active August 29, 2015 14:27
Using Parallel with an ActiveRecord Generator
require 'parallel'

###############################################################
# Parallel.map blocks until every thread or process has
# finished, so code after these helpers only runs once all
# batches are done.
module ParallelHelper
  # Splits enumerable into at most num_threads batches and calls the block
  # once per batch, each batch in its own thread. Returns the block's
  # results, one per batch.
  def process_in_batches_with_threads(enumerable, num_threads, &block)
    Parallel.map(batches_for(enumerable, num_threads), :in_threads => num_threads) do |batch|
      block.call(batch)
    end
  end

  # Same as above, but each batch runs in its own process.
  def process_in_batches_with_processes(enumerable, num_processes, &block)
    Parallel.map(batches_for(enumerable, num_processes), :in_processes => num_processes) do |batch|
      block.call(batch)
    end
  end

  private

  # Builds one batch per worker, evening out batch sizes as much as possible.
  def batches_for(enumerable, num_workers)
    # Clamp to 1: on an empty collection the size would be 0, and
    # each_slice(0) raises ArgumentError.
    batch_size = [(enumerable.size.to_f / num_workers).ceil, 1].max
    if enumerable.is_a?(ActiveRecord::Relation)
      get_active_record_batch(enumerable, num_workers, batch_size)
    else
      enumerable.each_slice(batch_size)
    end
  end

  # Lazily yields num_chunks relations, each a limit/offset window over the
  # original relation, so no records are loaded until a worker uses them.
  # NOTE: the windows only partition the rows reliably if the relation has a
  # deterministic order (for example, ordered by primary key).
  def get_active_record_batch(relation, num_chunks, chunk_size)
    Enumerator.new do |generator|
      num_chunks.times do |index|
        generator.yield relation.limit(chunk_size).offset(chunk_size * index)
      end
    end
  end
end
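
The batching arithmetic above can be checked without the Parallel gem or a database. The following is a minimal sketch using only the standard library. All four method names (`batch_size_for`, `slice_batches`, `offset_windows`, `map_batches_in_threads`) are hypothetical stand-ins invented for illustration and are not part of the gist or of Parallel. `map_batches_in_threads` only approximates `Parallel.map` with `:in_threads` by starting one plain `Thread` per batch.

```ruby
# Ceiling division, as in the helper, but never smaller than 1.
def batch_size_for(total, num_workers)
  [(total.to_f / num_workers).ceil, 1].max
end

# Stand-in for the in-memory path: slices an array with each_slice.
def slice_batches(items, num_workers)
  items.each_slice(batch_size_for(items.size, num_workers)).to_a
end

# Stand-in for the ActiveRecord path: the [offset, limit] pair that each
# limit/offset window would use, one pair per worker.
def offset_windows(total, num_workers)
  size = batch_size_for(total, num_workers)
  (0...num_workers).map { |index| [size * index, size] }
end

# Rough approximation of Parallel.map with :in_threads: one thread per
# batch, results collected in batch order via Thread#value.
def map_batches_in_threads(batches, &block)
  batches.map { |batch| Thread.new { block.call(batch) } }.map(&:value)
end

batches = slice_batches((1..10).to_a, 4)
sums = map_batches_in_threads(batches) { |batch| batch.sum }

p batches                            # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
p sums                               # [6, 15, 24, 10]
p slice_batches((1..9).to_a, 4).size # 3
p offset_windows(9, 4)               # [[0, 3], [3, 3], [6, 3], [9, 3]]
```

Two things follow from the ceiling division. With 9 items and 4 workers, the in-memory path produces only 3 batches, so one worker stays idle. On the relation path, the helper still builds 4 windows, and the last one starts at offset 9, past the end of 9 rows, so the block receives an empty relation and should tolerate that.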