@aelkiss
Created April 22, 2019 19:49
Convert HathiTrust IDs to GRIN volume IDs
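#!/usr/bin/env ruby
# Reads tab-separated lines from STDIN: HathiTrust namespace, ID, then any
# additional fields. Prints "<GRIN library>:<UPPERCASED ID>", a space, and
# the remaining fields joined by tabs.

# Default GRIN library code for each HathiTrust namespace.
grin_ids = {
  'uc1'  => 'UCAL',
  'hvd'  => 'Harvard',
  'uva'  => 'UVA',
  'mdp'  => 'UOM',
  'nnc1' => 'Columbia',
  'uiug' => 'UIUC',
  'pur1' => 'PURD',
  'chi'  => 'CHI',
  'wu'   => 'Wisc',
  'nyp'  => 'NYPL',
  'pst'  => 'PSU',
  'ucm'  => 'UCM',
  'coo'  => 'Cornell',
  'ien'  => 'NWU',
  'njp'  => 'PRNC',
  'inu'  => 'IND',
  'umn'  => 'MINN',
  'keio' => 'Keio',
  'osu'  => 'OSU',
  'msu'  => 'MSU',
  'iau'  => 'IOWA'
}

STDIN.each do |line|
  fields = line.strip.split("\t")
  namespace = fields.shift
  id = fields.shift
  # Skip lines with no ID and namespaces that have no GRIN mapping.
  next unless id && grin_ids[namespace]

  # These ID patterns override the namespace default. They are checked for
  # every mapped namespace, not only 'uc1'.
  grin_id = case id
            when /^31822\d{9}$/ then 'UCSD'
            when /^l\d{10}$/, /^31158\d{9}$/ then 'UCLA'
            when /^32106\d{9}$/ then 'UCSC'
            when /^31378\d{9}$/ then 'UCSF'
            when /^31175\d{9}$/ then 'UCD'
            when /^31210\d{9}$/ then 'UCD'
            when /^[bcd]\d{9}$/ then 'UCBK'
            when /^(a{1,3}|ax|[bcd]|cc)\d{10}$/ then 'SRLF'
            else grin_ids[namespace]
            end

  puts "#{grin_id}:#{id.upcase} " + fields.join("\t")
end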
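
To try the mapping on individual IDs without piping a file through STDIN, the same lookup can be wrapped in a method. This is only a sketch: the method name `grin_volume_id`, the constant `GRIN_LIBRARIES` (a three-entry subset of the table above), and the sample barcodes are all made up for illustration and do not come from the original script.

```ruby
# Hypothetical subset of the namespace table, for illustration only.
GRIN_LIBRARIES = {
  'uc1' => 'UCAL',
  'mdp' => 'UOM',
  'hvd' => 'Harvard'
}.freeze

# Hypothetical helper: returns "<GRIN library>:<UPPERCASED ID>", or nil when
# the ID is missing or the namespace has no entry in GRIN_LIBRARIES.
def grin_volume_id(namespace, id)
  default = GRIN_LIBRARIES[namespace]
  return nil unless default && id

  # Same patterns as the script; anything unmatched uses the namespace default.
  library = case id
            when /^31822\d{9}$/ then 'UCSD'
            when /^l\d{10}$/, /^31158\d{9}$/ then 'UCLA'
            when /^32106\d{9}$/ then 'UCSC'
            when /^31378\d{9}$/ then 'UCSF'
            when /^31175\d{9}$/, /^31210\d{9}$/ then 'UCD'
            when /^[bcd]\d{9}$/ then 'UCBK'
            when /^(a{1,3}|ax|[bcd]|cc)\d{10}$/ then 'SRLF'
            else default
            end

  "#{library}:#{id.upcase}"
end

# Invented sample IDs, chosen only to exercise different branches.
puts grin_volume_id('mdp', '39015012345678') # prints UOM:39015012345678
puts grin_volume_id('uc1', '31822012345678') # prints UCSD:31822012345678
puts grin_volume_id('uc1', 'b123456789')     # prints UCBK:B123456789
p grin_volume_id('xyz', '12345')             # prints nil (unmapped namespace)
```

Returning `nil` for unmapped input mirrors the script's `next unless` guard, which silently skips such lines.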