Created
June 9, 2014 14:02
/home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/marc8/to_unicode.rb:163:in `rescue in transcode': MARC8, input byte offset 20, code set: 0x45, code point: 0xa0 (Encoding::InvalidByteSequenceError)
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/marc8/to_unicode.rb:144:in `transcode'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:397:in `set_encoding'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:359:in `block (2 levels) in decode'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:358:in `each'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:358:in `block in decode'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:307:in `upto'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:307:in `decode'
from /home/jbfink/.rbenv/versions/2.1.2/gemsets/ruby-marc/gems/marc-0.8.1/lib/marc/reader.rb:247:in `each'
from slice.rb:3:in `<main>'
Is it actually compiled/binary MARC, or that weird text representation? If the latter, is it UTF-8?
It's the Harvard dataset here: http://openmetadata.lib.harvard.edu/bibdata . file / magic sez it's MARC-21 Bibliographic.
Binary totally works and is probably close enough for government work. I don't want (at this stage) to mess with Catmandu or other things to convert it -- maybe later. Thanks man, you're the best.
http://rubydoc.info/github/ruby-marc/ruby-marc/MARC/Reader
Relevant part:
If you have MARC-8 data, you really want to convert it to UTF-8 outside of Ruby, but if you can't:
MARC::Reader.new("marc8.marc", :external_encoding => "binary")
But you probably will have problems subsequently in your own code using the MARC::Record.