@nz
Last active December 22, 2015 07:49
Simple asynchronous batch indexing with Sunspot. Currently an untested work in progress; I expect to refactor it and contribute it to Sunspot proper.
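# rails generate migration add_indexed_at_to_searchable_models
# On Rails 5+, the superclass needs a version suffix, e.g. ActiveRecord::Migration[5.0].
class AddIndexedAtToSearchableModels < ActiveRecord::Migration
  TABLES = [ :articles, :authors, :comments ]

  def up
    TABLES.each do |name|
      change_table(name) do |t|
        # A datetime (not an integer) so it can be compared with updated_at
        t.datetime :indexed_at
      end
    end
  end

  def down
    TABLES.each do |name|
      change_table(name) do |t|
        t.remove :indexed_at
      end
    end
  end
end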
# lib/tasks/indexer.rake
namespace :indexer do
  task :run => :environment do
    verbose  = true
    interval = 10 # seconds between polling passes

    # Load the requested models, or all of Sunspot's searchable models
    # TODO: error handling for invalid models or models not found
    models = if ENV['MODELS']
      ENV['MODELS'].split(',').map { |m| m.strip.constantize }
    else
      Sunspot.searchable
    end

    if verbose
      puts "Requested the following models: #{models.map(&:name).join(', ')}"
    end

    # Filter the models to those with an indexed_at column
    models = models.select { |m| m.column_names.include?('indexed_at') }

    # Warn and exit if we don't have any models to work with
    if models.blank?
      puts "Your models must provide an indexed_at timestamp field"
      exit(1)
    end

    # Infinite loop to look for new and updated objects and reindex them
    loop do
      last_run = Time.now

      models.each do |model|
        # Find batches of documents that have never been indexed, or have been
        # updated since they were last indexed.
        model.where('indexed_at IS NULL OR updated_at > indexed_at').find_in_batches do |batch|
          # Capture the time before indexing, so records updated while the
          # batch is being indexed are picked up again on the next pass.
          indexed_at = Time.now
          batch = batch.select { |record| record.indexable? }
          next if batch.empty?

          Sunspot.index(batch)

          # batch is an Array, so update through a relation on the model.
          # update_all skips callbacks and leaves updated_at untouched.
          model.where(:id => batch.map(&:id)).update_all(:indexed_at => indexed_at)
        end
      end

      # Poll on an interval: sleep only for whatever remains of it
      elapsed = Time.now - last_run
      sleep(interval - elapsed) if elapsed < interval
    end
  end
end
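
The selection rule in the task (`indexed_at IS NULL OR updated_at > indexed_at`) can be tried out without Rails or Sunspot. The following is a minimal plain-Ruby sketch, not part of the gist: `Record`, `stale?`, and `reindex_stale` are hypothetical names, and collecting ids stands in for the call to `Sunspot.index`.

```ruby
# Plain-Ruby sketch of the "stale record" rule used by the indexer task.
# Record, stale?, and reindex_stale are hypothetical names for illustration.
Record = Struct.new(:id, :updated_at, :indexed_at)

# A record needs indexing if it was never indexed, or has changed since.
def stale?(record)
  record.indexed_at.nil? || record.updated_at > record.indexed_at
end

# Walk the stale records in batches and stamp each one with the time of
# the pass. Returns the ids that would have been sent to the indexer.
def reindex_stale(records, batch_size: 2, now: Time.now)
  indexed_ids = []
  records.select { |r| stale?(r) }.each_slice(batch_size) do |batch|
    indexed_ids.concat(batch.map(&:id)) # stand-in for Sunspot.index(batch)
    batch.each { |r| r.indexed_at = now }
  end
  indexed_ids
end

t0 = Time.at(1_000)
records = [
  Record.new(1, t0, nil),      # never indexed
  Record.new(2, t0 + 60, t0),  # updated after it was last indexed
  Record.new(3, t0, t0 + 60)   # already up to date
]

first_pass  = reindex_stale(records, now: t0 + 120)
second_pass = reindex_stale(records, now: t0 + 180)

puts "first pass:  #{first_pass.inspect}"
puts "second pass: #{second_pass.inspect}"
```

Under these assumptions the first pass picks up records 1 and 2, and the second pass finds nothing to do until some record's `updated_at` moves past its `indexed_at` again.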

nz commented Sep 4, 2013

Let me stress: this is an untested rough draft, for inspirational and discussion purposes only.
