@billdueber
Last active December 20, 2015 04:29
Using marc4j from jruby
# I just nabbed the source of marc4j and built it with "ant jar"
require 'marc4j-2.5.1-beta.jar'
# Conveniently add Enumerable to the reader interface so I can get #each, #each_with_index, etc.
# This would be automatic if MarcReader were specified as an iterable, as per a recent github issue
# on the marc4j repo (https://github.com/marc4j/marc4j/issues/11)
module org.marc4j::MarcReader
  include Enumerable

  def each
    if block_given?
      while self.hasNext
        yield self.next
      end
    else
      self.to_enum(:each)
    end
  end
end
# Pull in the 'batch.dat' file from the ruby-marc test suite
istream = java.io.FileInputStream.new('batch.dat')
# Make a reader out of it. I'm specifying UTF-8 here, but you could omit the
# encoding argument and let marc4j make its "best guess" instead
reader = org.marc4j.MarcStreamReader.new(istream, 'UTF-8')
reader.each do |r|
  # whatever
end
# Do the same thing with a permissive reader
# (the two booleans are "permissive" and "convert to UTF-8")
istream = java.io.FileInputStream.new('batch.dat')
reader = org.marc4j.MarcPermissiveStreamReader.new(istream, true, true)
iter = reader.each # no block given, so this returns an Enumerator
puts iter.next
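
The Enumerable trick above can be tried without marc4j or a MARC file. The following is a minimal sketch in plain Ruby, assuming a hypothetical stand-in class (`FakeReader`) that exposes only `hasNext` and `next` the way `MarcReader` does. `FakeReader` and `HasNextEnumerable` are names made up for illustration and are not part of marc4j.

```ruby
# Hypothetical stand-in for a Java-style reader: it exposes only
# hasNext/next, like org.marc4j.MarcReader. Not part of marc4j.
class FakeReader
  def initialize(items)
    @items = items.dup
  end

  def hasNext
    !@items.empty?
  end

  def next
    @items.shift
  end
end

# Same idea as the MarcReader patch, written as a reusable module:
# define #each in terms of hasNext/next and let Enumerable do the rest.
module HasNextEnumerable
  include Enumerable

  def each
    # With no block, hand back an Enumerator (external iteration).
    return to_enum(:each) unless block_given?

    yield self.next while hasNext
  end
end

class FakeReader
  include HasNextEnumerable
end

# Internal iteration: Enumerable methods now work on the reader.
mapped  = FakeReader.new(%w[a b c]).map(&:upcase)
indexed = FakeReader.new(%w[a b c]).each_with_index.to_a

# External iteration: pull items one at a time.
iter  = FakeReader.new(%w[a b c]).each
first = iter.next
```

Note that a reader like this is single-pass: each call to `next` consumes an item, so a second `each` over the same reader yields only what is left. That is why the script above opens a fresh `FileInputStream` before building the second reader.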