@nichtich
Created December 6, 2011 16:38
Simple awk scripts to convert between BEACON format and N-Triples
# Simple awk script to convert BEACON format to N-Triples
# First published at https://gist.github.com/gists/1438869
# hereby put into public domain
BEGIN {
    FS = "|"
    link = "http://www.w3.org/2000/01/rdf-schema#seeAlso"  # default predicate
    header = 1                                             # still reading meta lines
}
{
    if (header && $1 ~ /^(\xef\xbb\xbf)?[ \t]*#/) { # may contain UTF-8 BOM!
        sub(/^(\xef\xbb\xbf)?[ \t]*#/,"",$1)        # strip the BOM together with the '#'
        key = $1
        value = $1
        gsub(/^[^:]+:[ \t]*|[ \t\n\r]+$/,"",value)  # drop "KEY:" and trailing whitespace
        if (key ~ /^LINK:/)
            link = value
        else if (key ~ /^PREFIX:/)
            prefix = value
        else if (key ~ /^TARGET:/) {
            target = value
            if (targetprefix) {
                print "Cannot set both TARGET and TARGETPREFIX!" > "/dev/stderr"
                exit 1
            }
            if (index(target, "{ID}") == 0)         # literal search, no regex braces
                target = target "{ID}"
        } else if (key ~ /^TARGETPREFIX:/) {
            targetprefix = value
            if (target) {
                print "Cannot set both TARGET and TARGETPREFIX!" > "/dev/stderr"
                exit 1
            }
        }
    } else if ($1 !~ /^[ \t\n\r]*$/) { # ignore empty source fields
        header = 0
        source = $1
        gsub(/^[ \t]+|[ \t\n\r]+$/,"",source) # trim
        if (NF > 1 && (targetprefix || $NF ~ /^[ \t]*[a-zA-Z][a-zA-Z0-9+.-]*:.+/)) {
            # use the last field if there is more than one field and
            # it looks like a URI, or if targetprefix is set
            t = $NF
        } else if (target) {
            # substitute literally: sub() would expand "&" in source to the matched text
            i = index(target, "{ID}")
            t = substr(target, 1, i - 1) source substr(target, i + 4)
        } else {
            t = ""
        }
        if (t) {
            gsub(/^[ \t]+|[ \t\n\r]+$/,"",t) # trim target URI
            print "<" prefix source "> <" link "> <" targetprefix t "> ."
        }
    }
}
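
To illustrate what the BEACON to N-Triples conversion is meant to produce, here is a minimal, self-contained sketch. It is not the script above: it only handles the `#PREFIX` and `#TARGET` meta lines (no BOM, no `#LINK`, no `#TARGETPREFIX`), and the function names `beacon_sample` and `beacon_to_nt` as well as all example URIs are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical sample input: one line that relies on the TARGET template,
# one line that carries an explicit target URI after the "|".
beacon_sample() {
cat <<'EOF'
#PREFIX: http://example.org/people/
#TARGET: http://example.com/about/{ID}
alice
bob|http://example.net/bob
EOF
}

# Reduced sketch of the conversion (assumption: only PREFIX and TARGET headers).
beacon_to_nt() {
awk -F'|' '
BEGIN { link = "http://www.w3.org/2000/01/rdf-schema#seeAlso" }
/^#PREFIX:/ { sub(/^#PREFIX:[ \t]*/, ""); prefix = $0; next }
/^#TARGET:/ {
    sub(/^#TARGET:[ \t]*/, ""); target = $0
    if (index(target, "{ID}") == 0) target = target "{ID}"
    next
}
/^#/ || /^[ \t]*$/ { next }          # skip other meta lines and empty lines
{
    t = ""
    if (NF > 1) {
        t = $NF                      # explicit target URI
    } else if (target) {
        i = index(target, "{ID}")    # literal replacement of {ID}
        t = substr(target, 1, i - 1) $1 substr(target, i + 4)
    }
    if (t) print "<" prefix $1 "> <" link "> <" t "> ."
}'
}

beacon_sample | beacon_to_nt
```

Under these assumptions the first data line expands through the template to `<http://example.com/about/alice>`, while the second keeps its explicit target `<http://example.net/bob>`.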
# Simple awk script to convert N-Triples to BEACON
# USAGE: awk -v prefix=YOURPREFIX -v target=YOURTARGET -f SCRIPT.awk YOURFILE.nt
#    or: awk -v prefix=YOURPREFIX -v targetprefix=YOURTARGETPREFIX -f SCRIPT.awk YOURFILE.nt
# (SCRIPT.awk stands for wherever this script is saved)
BEGIN {
    if (!prefix) {
        print "Missing argument '-v prefix=http://...'" > "/dev/stderr"
        exit 1
    }
    print "#PREFIX: " prefix
    if (target) {
        print "#TARGET: " target
    } else if (targetprefix) {
        print "#TARGETPREFIX: " targetprefix
    }
}
# only triples whose subject starts with prefix and whose object is a URI
# starting with target (an empty target matches every URI)
substr($1,2,length(prefix)) == prefix && $3 ~ /^<.*>$/ && substr($3,2,length(target)) == target {
    gsub(/^<|>$/,"",$1)
    gsub(/^<|>$/,"",$2)
    gsub(/^<|>$/,"",$3)
    if (!link) { # use first triple to get predicate
        link = $2
        print "#LINK: " link
        print ""
    }
    if ($2 == link) { # ignore all triples with different predicate
        s = substr($1,length(prefix)+1)
        if ( target && substr($3,length(target)+1) == s ) {
            print s # one common identifier
        } else if (targetprefix) {
            if ( substr($3,1,length(targetprefix)) == targetprefix ) {
                t = substr($3,length(targetprefix)+1)
                print s "|" t
            }
        } else {
            print s "|" $3 # explicit URL
        }
    }
}
{ } # ignore the rest
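
The reverse direction can be sketched the same way. The following is a reduced illustration rather than the script above: it takes only a subject prefix (no `target` or `targetprefix`), and the names `nt_sample` and `nt_to_beacon` and the example URIs are hypothetical.

```shell
#!/bin/sh
# Hypothetical N-Triples input; the third triple has a subject outside the prefix.
nt_sample() {
cat <<'EOF'
<http://example.org/people/alice> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://example.com/about/alice> .
<http://example.org/people/bob> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://example.net/bob> .
<http://other.example/x> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://example.net/x> .
EOF
}

# Reduced sketch: $1 of the shell function is the subject prefix.
nt_to_beacon() {
awk -v prefix="$1" '
BEGIN { print "#PREFIX: " prefix }
substr($1, 2, length(prefix)) == prefix {
    gsub(/^<|>$/, "", $1)            # strip the angle brackets
    gsub(/^<|>$/, "", $2)
    gsub(/^<|>$/, "", $3)
    if (!link) {                     # first matching triple fixes the predicate
        link = $2
        print "#LINK: " link
        print ""
    }
    if ($2 == link)
        print substr($1, length(prefix) + 1) "|" $3
}'
}

nt_sample | nt_to_beacon "http://example.org/people/"
```

With this input the sketch emits the two meta lines, an empty line, and then `alice|http://example.com/about/alice` and `bob|http://example.net/bob`; the triple with the foreign subject is skipped.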
nichtich commented Dec 6, 2011

This implementation no longer reflects the current state of the BEACON specification.
