@nz
Created September 21, 2010 00:32
#
# Experimenting with a brute force check to see if your Solr index is in sync
# with your database. This example uses Sunspot for the #solr_index method,
# but not for the searching. We query Solr and the local database directly
# to page through this stuff as quickly as possible.
#
# Got a better approach? Ping @websolr on Twitter :)
#
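require 'net/http'
require 'uri'
require 'set'

# Page through Solr and collect the numeric id of every indexed document.
# Assumes the index only holds Post documents.
start = 0
rows = 100
indexed_ids = Set.new
loop do
  search_url = "http://index.websolr.com/solr/a1b2c3d4e5f/select?wt=ruby&q=*:*&rows=#{rows}&start=#{start}"
  # wt=ruby returns a Ruby hash literal; only eval responses from a Solr server you trust.
  results = eval(Net::HTTP.get(URI.parse(search_url)))
  docs = results["response"]["docs"]
  break if docs.empty?
  docs.each { |d| indexed_ids << d["id"][/[0-9]+/].to_i }
  start += rows
end

# Walk the database in batches and re-index any Post that Solr doesn't know about.
Post.find_each(:batch_size => rows) do |post|
  next if indexed_ids.include?(post.id)
  puts "Didn't find Post #{post.id}, re-indexing..."
  post.solr_index
end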
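
The comparison at the heart of this check can be tried without Solr or a database. The sketch below is only an illustration: `indexed_ids` and `index_drift` are hypothetical helper names, and the sample document ids ("Post 1" and so on) are an assumed shape that matches the numeric-id regex used above. It reports both posts missing from the index and index entries with no matching database row.

```ruby
require 'set'

# Hypothetical helper: extract the numeric ids from a parsed Solr response hash.
def indexed_ids(results)
  results["response"]["docs"].map { |d| d["id"][/[0-9]+/].to_i }.to_set
end

# Hypothetical helper: compare database ids against indexed ids.
# :missing  => ids in the database but not in Solr (candidates for re-indexing)
# :orphaned => ids in Solr but not in the database (candidates for removal)
def index_drift(db_ids, solr_ids)
  db = db_ids.to_set
  { :missing => (db - solr_ids).sort, :orphaned => (solr_ids - db).sort }
end

# Assumed sample data standing in for a real Solr response and Post ids.
sample = { "response" => { "docs" => [{ "id" => "Post 1" }, { "id" => "Post 2" }, { "id" => "Post 9" }] } }
drift = index_drift([1, 2, 3], indexed_ids(sample))
```

In a real run, `db_ids` would come from the Post table and each `:missing` id would be re-indexed with `solr_index`, as in the script above.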