@nichtich
Created November 6, 2012 09:58
Convert B3Kat NTriples to several Beacon files
#!/usr/bin/awk -f
#
# Given B3Kat in NTriples format this script extracts:
#
# 1. mappings from ISBN to B3Kat title URI
#    into file isbn2b3kat.beacon
# 2. mappings from B3Kat title URI to ISIL of holding library
#    into file b3kat2isil.beacon
#
# Given the actual dumps from <http://lod.b3kat.de/download> one can invoke:
#
#   zcat lod.b3kat.de.part??.rdf.gz | \
#     rapper -I http://lod.b3kat.de/ -i rdfxml -o ntriples - | \
#     ./b3kat-isbn-and-isil.awk
#
# See <http://gbv.github.com/beaconspec/beacon.html> for a specification of
# the resulting Beacon text format.
#
BEGIN {
    I2B = "isbn2b3kat.beacon"
    B2I = "b3kat2isil.beacon"

    # awk has no backtick command substitution, so read the UTC timestamp
    # from the date command via getline
    cmd = "date -u +%FT%TZ"
    cmd | getline timestamp
    close(cmd)

    print "#FORMAT: BEACON" > I2B
    print "#PREFIX: urn:isbn:" >> I2B
    print "#RELATION: http://purl.org/ontology/bibo/isbn" >> I2B
    print "#TARGET: http://lod.b3kat.de/title/" >> I2B
    print "#DESCRIPTION: Mapping of ISBN to title URI in B3Kat" >> I2B
    print "#TIMESTAMP:", timestamp >> I2B
    print "" >> I2B

    print "#FORMAT: BEACON" > B2I
    print "#PREFIX: http://lod.b3kat.de/title/" >> B2I
    print "#RELATION: http://purl.org/ontology/daia/heldBy" >> B2I
    print "#TARGET: http://lobid.org/organisation/" >> B2I
    print "#DESCRIPTION: Mapping of B3Kat title URI to ISIL" >> B2I
    print "#TIMESTAMP:", timestamp >> B2I
    print "" >> B2I
}
# Skip triples whose subject does not look like a B3Kat title URI
$1 !~ /^<http:\/\/lod\.b3kat\.de\/title\// { next }
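# ISBN to B3Kat title ID
$2 == "<http://purl.org/ontology/bibo/isbn>" {
    # drop the 27-character prefix "<http://lod.b3kat.de/title/" and the closing ">"
    title = substr($1, 28, length($1) - 28)
    # drop the double quotes around the literal
    isbn = substr($3, 2, length($3) - 2)
    OFS = "||"
    print isbn, title >> I2B
}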
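# B3Kat title ID to ISIL of the holding library, taken from the item URI
$2 == "<http://purl.org/vocab/frbr/core#exemplar>" &&
$3 ~ /^<http:\/\/lod\.b3kat\.de\/bib\/[A-Z0-9:-]+\/item\// {
    # FIXME: ISIL allows '/' (but not German ISIL), which neither the
    # pattern above nor the split below handles
    title = substr($1, 28, length($1) - 28)
    # "<http:", "", "lod.b3kat.de", "bib", ISIL, "item", ...
    split($3, a, "/")
    isil = a[5]
    OFS = "||"
    print title, isil >> B2I
}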
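
To check the extraction rules without downloading the dumps, the following is a minimal sketch that runs the same substr/split logic on three made-up triples. The function name extract_links, the identifiers BV000000001 and DE-12, the ISBN, and the "isbn2b3kat:" / "b3kat2isil:" output labels are all assumptions for illustration; the sketch prints to standard output instead of writing the two Beacon files.

```sh
# Hypothetical smoke test: same field extraction as the script above,
# but printing labelled lines to stdout instead of writing Beacon files.
extract_links() {
  awk '
    # ignore triples whose subject is not a B3Kat title URI
    $1 !~ /^<http:\/\/lod\.b3kat\.de\/title\// { next }

    $2 == "<http://purl.org/ontology/bibo/isbn>" {
      title = substr($1, 28, length($1) - 28)   # strip prefix and closing ">"
      isbn  = substr($3, 2, length($3) - 2)     # strip surrounding quotes
      print "isbn2b3kat: " isbn "||" title
    }

    $2 == "<http://purl.org/vocab/frbr/core#exemplar>" &&
    $3 ~ /^<http:\/\/lod\.b3kat\.de\/bib\/[A-Z0-9:-]+\/item\// {
      title = substr($1, 28, length($1) - 28)
      split($3, a, "/")                         # a[5] is the ISIL path segment
      print "b3kat2isil: " title "||" a[5]
    }
  '
}

# Made-up sample input; the third triple has a foreign subject and is skipped.
extract_links <<'EOF'
<http://lod.b3kat.de/title/BV000000001> <http://purl.org/ontology/bibo/isbn> "9783161484100" .
<http://lod.b3kat.de/title/BV000000001> <http://purl.org/vocab/frbr/core#exemplar> <http://lod.b3kat.de/bib/DE-12/item/BV000000001> .
<http://example.org/other> <http://purl.org/ontology/bibo/isbn> "0000000000" .
EOF
```

For this input the sketch should print "isbn2b3kat: 9783161484100||BV000000001" followed by "b3kat2isil: BV000000001||DE-12". Literals that contain spaces would be split across awk fields, so real dump data may need additional handling.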