@jpstroop
Created November 4, 2013 22:13
Qualified DC to N-Triples
require 'nokogiri'
require 'rdf'
require 'chronic'
require 'date' # DateTime is used below when converting date values
xmldoc = Nokogiri::XML('<metadata xmlns="http://example.org/myapp/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xsi:schemaLocation="http://example.org/myapp/ http://example.org/myapp/schema.xsd">
<dc:title>
UKOLN
</dc:title>
<dcterms:alternative>
UK Office for Library and Information Networking
</dcterms:alternative>
<dc:subject>
national centre, network information support, library
community, awareness, research, information services,public
library networking, bibliographic management, distributed
library systems, metadata, resource discovery,
conferences,lectures, workshops
</dc:subject>
<dc:subject xsi:type="dcterms:DDC">
062
</dc:subject>
<dc:subject xsi:type="dcterms:UDC">
061(410)
</dc:subject>
<dc:description>
UKOLN is a national focus of expertise in digital information
management. It provides policy, research and awareness services
to the UK library, information and cultural heritage communities.
UKOLN is based at the University of Bath.
</dc:description>
<dc:publisher>
UKOLN, University of Bath
</dc:publisher>
<dcterms:isPartOf xsi:type="dcterms:URI">
http://www.bath.ac.uk/
</dcterms:isPartOf>
<dc:identifier xsi:type="dcterms:URI">
http://www.ukoln.ac.uk/
</dc:identifier>
<dcterms:modified xsi:type="dcterms:W3CDTF">
2001-07-18
</dcterms:modified>
<dc:format xsi:type="dcterms:IMT">
text/html
</dc:format>
<dcterms:extent>
14 Kbytes
</dcterms:extent>
</metadata>
')
# Element local name => RDF predicate.
map = {
  'alternative' => RDF::DC.alternative,
  'description' => RDF::DC.description,
  'extent' => RDF::DC.extent,
  'format' => RDF::DC.format,
  'identifier' => RDF::DC.identifier,
  'isPartOf' => RDF::DC.isPartOf,
  'modified' => RDF::DC.modified,
  'publisher' => RDF::DC.publisher,
  'subject' => RDF::DC.subject,
  'title' => RDF::DC.title
}
# Local names whose values are parsed as dates.
date_localnames = [
  'modified',
  'created',
  'date',
  'dateAccepted',
  'dateCopyrighted',
  'dateSubmitted'
]
g = RDF::Graph.new
id = RDF::URI.new('http://localhost/myobject')
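# Walk every child of the root element and emit one triple per mapped element.
xmldoc.xpath('/*/*').each do |e|
  ln = e.xpath('local-name()')
  # Skip elements with no entry in the map; they would otherwise get a nil predicate.
  next unless map.key?(ln)
  v = e.text.strip
  if date_localnames.include?(ln)
    # Chronic.parse returns nil for input it cannot interpret; keep the raw string then.
    t = Chronic.parse(v)
    v = DateTime.parse(t.to_s) if t
  else
    # Collapse runs of whitespace and newlines into single spaces.
    v = v.split.join(' ')
  end
  g << [id, map[ln], v]
end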
puts g.dump(:ntriples)
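
The value clean-up in the loop above can be tried on its own, without Nokogiri or RDF.rb. The following is a minimal sketch, not part of the original script: `normalize_value` and `DATE_LOCALNAMES` are hypothetical names, and it swaps Chronic for the standard library's `Date.iso8601`, on the assumption that date values are written like the sample's `2001-07-18`. Anything that does not parse is returned as a plain string.

```ruby
require 'date'

# Hypothetical constant mirroring date_localnames in the script above.
DATE_LOCALNAMES = %w[
  modified created date dateAccepted dateCopyrighted dateSubmitted
].freeze

# Hypothetical helper: collapse whitespace, then try to parse date fields.
def normalize_value(local_name, raw)
  # split with no arguments drops leading/trailing whitespace and splits on runs of it.
  text = raw.split.join(' ')
  return text unless DATE_LOCALNAMES.include?(local_name)

  begin
    Date.iso8601(text)
  rescue ArgumentError
    # Not an ISO 8601 date; fall back to the cleaned-up string.
    text
  end
end

puts normalize_value('modified', "\n2001-07-18\n")                   # prints 2001-07-18
puts normalize_value('publisher', "\nUKOLN,   University of Bath\n") # prints UKOLN, University of Bath
```

Returning a `Date` rather than a `DateTime` keeps the day-level precision of the source value; whether that suits the target graph is a judgment call.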