@billdueber
Created May 7, 2010 13:46
# Example of pushing stuff to solr with solrj in jruby
require 'rubygems'
# Load any .jar files you want with "require '../path/to/jarfile.jar'"
# For both marc4j4r and jruby_streaming_update_solr_server, if you load the
# appropriate jar first that's the version that will be used. If not,
# we fall back on the one shipped with the gem
# require '../jars/myjavacode.jar'
require 'jruby_streaming_update_solr_server'
require 'threach'
# require 'marc4j4r' # if that's your thing
########## config #########
# URL to solr
solrURL = 'http://localhost:8983/solr'
# Size of the suss queue
sussQueueSize = 20
# Number of threads to use to push stuff to solr
sussThreads = 1
# Number of threads to use to consume stuff and push it into the suss
consumerThreads = 2
##### end config #########
suss = StreamingUpdateSolrServer.new(solrURL,sussQueueSize,sussThreads)
# If you want to use threach (threaded each) you just need some sort of a feed
# object that implements Enumerable. Otherwise, you can do whatever you wanna do.
# we'll pretend we're getting tab-delimited data from a file.
File.open("myfile.txt") do |f|
  f.threach(consumerThreads) do |line|
    # chomp first so the trailing newline doesn't end up in the last field
    data = line.chomp.split("\t")
    doc = SolrInputDocument.new
    doc['name'] = data[0]
    doc['age'] = data[1]
    # "siblings", we'll say, is comma-delimited within the column
    # We can add them all at once by using #add and passing an
    # array of values
    doc.add('sibling', data[2].split(","))
    # Can also call add multiple times; the effect is to append
    # new values
    data[3].split(',').each do |kid|
      doc.add('kid', kid)
    end
    # You can merge in a hash
    doc.merge! somethingThatReturnsAHash(data)
    # Send it to the suss
    suss << doc
  end
end
suss.commit
suss.optimize
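# Aside: a rough idea of what a "threaded each" does, sketched with plain
# Ruby threads and a bounded queue. This is NOT threach's actual
# implementation; threaded_each is a hypothetical helper written only to
# illustrate the pattern of several consumer threads draining one Enumerable.

```ruby
require 'thread'

# Hypothetical helper (not part of the threach gem): call the block for each
# item of an Enumerable, spreading the work over thread_count worker threads.
def threaded_each(enum, thread_count, &block)
  done  = Object.new                        # unique sentinel: "no more items"
  queue = SizedQueue.new(thread_count * 2)  # bounded, so the producer can't race far ahead

  workers = Array.new(thread_count) do
    Thread.new do
      # Each worker pulls items until it sees the sentinel
      until (item = queue.pop).equal?(done)
        block.call(item)
      end
    end
  end

  enum.each { |item| queue << item }     # feed the workers
  thread_count.times { queue << done }   # one sentinel per worker
  workers.each(&:join)                   # wait for all work to finish
end

# Usage: results arrive in no particular order, so collect them in a
# thread-safe Queue and sort afterwards.
results = Queue.new
threaded_each(%w[a b c d], 2) { |s| results << s.upcase }
collected = []
collected << results.pop until results.empty?
```

# Because the block runs on several threads at once, anything it touches
# (here the results queue) has to be safe for concurrent use.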
# The File.open loop above used the doc object directly. You can also add a hash to the suss.
# The whole loop could be:
mything.each do |r|
  suss << createAHashFromMyData(r)
end
suss.commit
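# createAHashFromMyData is left undefined above. As a minimal sketch, here is
# one way such a helper might look for the same tab-delimited layout used
# earlier (name, age, siblings, kids). The name row_to_hash and the field
# layout are assumptions for illustration, and it is assumed that array
# values are treated as multi-valued fields, as with #add; check the gem's
# documentation before relying on that.

```ruby
# Hypothetical helper: turn one tab-delimited line into a field => value hash.
def row_to_hash(line)
  # -1 keeps trailing empty columns instead of silently dropping them
  name, age, siblings, kids = line.chomp.split("\t", -1)
  {
    'name'    => name,
    'age'     => age,
    # to_s guards against a missing column (nil) before splitting
    'sibling' => siblings.to_s.split(','),
    'kid'     => kids.to_s.split(',')
  }
end

row = row_to_hash("Bill\t40\tAnn,Bob\tCam\n")
```

# With a helper like this, the loop body above would be: suss << row_to_hash(r)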