@mark-cooper
Created January 10, 2012 23:40
Transforming subject headings using ruby-marc
# TRANSFORM SUBJECT HEADINGS EXAMPLE
require 'marc'
# Function: text_to_datafield
# Returns a plain MARC::DataField object built from a string.
# An underscore stands for a blank indicator; '|' introduces each subfield
# after the first, which is assumed to be subfield 'a'.
# Format examples:
#   655 _0 Account books.
#   655 _7 3-D films.|2lcgft
def text_to_datafield(text_field)
  tag  = text_field[0..2]
  ind1 = text_field[4] == '_' ? ' ' : text_field[4]
  ind2 = text_field[5] == '_' ? ' ' : text_field[5]
  subfields = []
  text = text_field.dup
  text[6] = '|a' # Coax subfield 'a': replace the space after the indicators
  s = text.split('|')
  s.shift # Remove the tag & inds
  s.each do |sub|
    # First character is the subfield code, the rest is its value
    subfields << MARC::Subfield.new(sub[0], sub[1..-1].strip)
  end
  MARC::DataField.new(tag, ind1, ind2, *subfields)
end
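# Read file containing subject heading pairs (old=new), such as:
#   655 _0 Action and adventure films.=655 _7 Adventure films.|2gsafd
#   655 _0 Adventure stories.=655 _7 Adventure fiction.|2gsafd
transforms = {}
# http://www.libcode.net/transforms.html for examples
File.foreach('config/transforms.txt') do |transform|
  line = transform.strip
  next if line.empty? # Skip blank lines rather than crash on a nil heading
  # Split on the first '=' only, so an '=' in the new heading survives
  from, to = line.split('=', 2)
  # Use string as safe hash key. Note: the key is built from the subfield
  # values only, so tag and indicators do not take part in the match.
  transforms[text_to_datafield(from).value] = text_to_datafield(to)
end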
w = MARC::Writer.new('records/transformed.mrc')
MARC::ForgivingReader.new('records/transforms.dat').each do |r|
  del = []
  add = []
  # Collect 65x fields whose value has a replacement heading
  r.find_all { |f| f.tag =~ /65./ }.each do |s|
    if transforms.key? s.value
      del << s
      add << transforms[s.value]
      puts 'WAS: ' + s.to_s
      puts 'NOW: ' + transforms[s.value].to_s
      puts '-' * 20
    end
  end
  # Swap the fields after the scan so the record is not modified mid-iteration
  del.each { |d| r.fields.delete d }
  add.each { |a| r.append a }
  # Restore tag order; fields sharing a tag may not keep their relative order
  r.fields.sort_by! { |f| f.tag }
  w.write r
end
w.close
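
The heading format that text_to_datafield accepts can be tried out without the marc gem or any record files. The sketch below is only an illustration of that parsing step: `ParsedField` and `parse_text_field` are hypothetical names that are not part of the gist or of ruby-marc, and the Struct is a stand-in for MARC::DataField.

```ruby
# Hypothetical stand-in for MARC::DataField so the sketch runs without the marc gem.
ParsedField = Struct.new(:tag, :ind1, :ind2, :subfields)

# Parse a heading such as "655 _7 3-D films.|2lcgft" into its parts.
# Assumes the same fixed layout as the gist: 3-character tag, a space,
# two indicator characters ('_' meaning blank), a space, then the subfields.
def parse_text_field(text_field)
  tag  = text_field[0..2]
  ind1 = text_field[4] == '_' ? ' ' : text_field[4]
  ind2 = text_field[5] == '_' ? ' ' : text_field[5]
  # The first chunk carries no '|' delimiter, so it is treated as subfield 'a'.
  body = 'a' + text_field[7..-1].to_s
  # Each chunk is a one-character subfield code followed by its value.
  subfields = body.split('|').map { |chunk| [chunk[0], chunk[1..-1].strip] }
  ParsedField.new(tag, ind1, ind2, subfields)
end

field = parse_text_field('655 _7 3-D films.|2lcgft')
puts field.tag              # 655
puts field.ind2             # 7
puts field.subfields.inspect # [["a", "3-D films."], ["2", "lcgft"]]
```

Joining the subfield values of this result gives "3-D films.lcgft", which is the kind of string the script uses as its hash key, so a line in config/transforms.txt has to reproduce the old heading's subfields exactly for the lookup to hit.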