Systems Developer at Penn State
hjc14@psu.edu / hector@hectorcorrea.com
Slides at: http://tinyurl.com/rdf4rdbms
Notes on my current understanding of how RDF compares to the components of a traditional Ruby on Rails application with a relational database backend (MySQL, PostgreSQL, Oracle).
Tables/columns/relationships
Book table:
name   type
-----  -------
id     integer
title  string
isbn   string
Page table:
name     type
-------  -------
id       integer
book_id  integer (foreign key to books)
number   integer
text     string
And we use SQL to query and update:
INSERT INTO books(id, title, isbn)
VALUES (1, 'Lord of the Rings', '123-456-789');
INSERT INTO pages(id, book_id, number, text)
VALUES (1, 1, 1, 'hi frodo');
INSERT INTO pages(id, book_id, number, text)
VALUES (2, 1, 2, 'dude, is that mordor?');
SELECT * FROM books WHERE title = 'Lord of the Rings';
SELECT * FROM pages WHERE book_id = 1;
Assuming you have the tables described above, you can define a couple of ActiveRecord classes in Rails to represent them:
class Book < ActiveRecord::Base
has_many :pages
end
class Page < ActiveRecord::Base
belongs_to :book
end
Notice the has_many and belongs_to declarations that indicate the one-to-many relationship. Fields are not declared on the class; ActiveRecord picks them up automatically from the tables at runtime.
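To make the idea of picking up fields at runtime more concrete, here is a toy sketch in plain Ruby. It is not how ActiveRecord is implemented: the TinyRecord class and the hard-coded TINY_COLUMNS list are hypothetical stand-ins for a library that would read column names from the database schema.

```ruby
# Toy illustration only; ActiveRecord's real implementation is far more involved.
# TINY_COLUMNS stands in for the column names a real ORM would read from the
# database schema at runtime (hypothetical and hard-coded here).
TINY_COLUMNS = { "books" => %w[id title isbn] }.freeze

class TinyRecord
  # Define a getter and a setter for every column of the given table.
  def self.table(name)
    TINY_COLUMNS.fetch(name).each do |column|
      define_method(column) { @attributes[column] }
      define_method("#{column}=") { |value| @attributes[column] = value }
    end
  end

  def initialize
    @attributes = {}
  end
end

class TinyBook < TinyRecord
  table "books" # no fields listed here; they come from TINY_COLUMNS
end

b = TinyBook.new
b.title = "Lord of the Rings"
puts b.title
```

The class body names only the table, yet instances respond to title and isbn, which is roughly the experience ActiveRecord gives you.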
You can access the database via ActiveRecord objects:
# Create a new Book object
b = Book.new
b.title = "Lord of the Rings"
b.isbn = "123-456-789"
# Create a couple of page objects and
# add them to the book.pages collection.
p1 = Page.new(number: 1, text: "hi frodo")
b.pages << p1
p2 = Page.new(number: 2, text: "dude, is that mordor?")
b.pages << p2
# Save the book object
# (will save both book and pages)
b.save
# Fetch the saved record
b = Book.find(1)
puts b.title # => "Lord of the Rings"
puts b.pages[0].text # => "hi frodo"
RDF is a W3C standard for data interchange on the Web (See http://www.w3.org/RDF)
There are no tables or columns in RDF. There are triples and graphs.
A triple is a three-part statement that includes a subject, a predicate, and an object:
book1 title "Lord of the Rings"
There are many ways to represent RDF, including N-Triples, Turtle, and RDF/XML. The examples below use N-Triples. Here is how the previous triple would look in N-Triples (each statement ends with a period):
<book1> <title> "Lord of the Rings" .
An RDF graph is a collection of triples:
<book1> <title> "Lord of the Rings" .
<book1> <isbn> "123-456-789" .
<page1> <number> "1" .
<page1> <text> "hi frodo" .
<page2> <number> "2" .
<page2> <text> "dude, is that mordor?" .
<book1> <page> <page1> .
<book1> <page> <page2> .
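As a rough sketch of what "a collection of triples" means in practice, the graph above can be modeled in plain Ruby as an array of three-field structs. The Triple struct, the GRAPH constant, and the objects_for helper are names made up for this illustration; real applications would use an RDF library instead.

```ruby
# A triple is just three values; a graph is just a collection of triples.
Triple = Struct.new(:subject, :predicate, :object)

GRAPH = [
  Triple.new("book1", "title", "Lord of the Rings"),
  Triple.new("book1", "isbn", "123-456-789"),
  Triple.new("page1", "number", "1"),
  Triple.new("page1", "text", "hi frodo"),
  Triple.new("page2", "number", "2"),
  Triple.new("page2", "text", "dude, is that mordor?"),
  Triple.new("book1", "page", "page1"),
  Triple.new("book1", "page", "page2")
].freeze

# Return the objects of every triple matching a subject and a predicate.
def objects_for(graph, subject, predicate)
  graph.select { |t| t.subject == subject && t.predicate == predicate }
       .map(&:object)
end

# Rough equivalent of: SELECT text FROM pages WHERE book_id = 1
objects_for(GRAPH, "book1", "page").each do |page|
  puts objects_for(GRAPH, page, "text").first
end
```

Following the page triples from book1 plays the role that the book_id foreign key plays in the relational version.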
Subjects and predicates in a triple are URIs. Objects can be URIs (to reference another object) or literals.
<http://libraries.psu.edu/catalog/book1> <http://abc.org/1.1/title> "Lord of the Rings" .
<http://libraries.psu.edu/catalog/book1/page1> <http://xyz.org/ns#/number> "1" .
<http://libraries.psu.edu/catalog/book1/page1> <http://xyz.org/ns#/text> "hi frodo" .
<http://libraries.psu.edu/catalog/book1/page2> <http://xyz.org/ns#/number> "2" .
<http://libraries.psu.edu/catalog/book1/page2> <http://xyz.org/ns#/text> "dude, is that mordor?" .
<http://libraries.psu.edu/catalog/book1> <http://xyz.org/ns#/pages> <http://libraries.psu.edu/catalog/book1/page1> .
<http://libraries.psu.edu/catalog/book1> <http://xyz.org/ns#/pages> <http://libraries.psu.edu/catalog/book1/page2> .
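The difference between a URI object and a literal object shows up in how each is written out. The following is a minimal sketch, assuming that any object starting with http:// or https:// is a URI and everything else is a plain string literal (a simplification; real RDF libraries track this explicitly). The helper names uri? and to_ntriple are invented for this example.

```ruby
# Simplifying assumption for this sketch: anything that looks like an
# http(s) address is a URI; everything else is a plain string literal.
def uri?(value)
  value.start_with?("http://", "https://")
end

# Format one triple as an N-Triples line: URIs go in angle brackets,
# literals in double quotes, and the statement ends with " ."
def to_ntriple(subject, predicate, object)
  formatted = if uri?(object)
                "<#{object}>"
              else
                # Escape backslashes and double quotes inside the literal.
                escaped = object.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
                "\"#{escaped}\""
              end
  "<#{subject}> <#{predicate}> #{formatted} ."
end

BOOK_URI = "http://libraries.psu.edu/catalog/book1"
puts to_ntriple(BOOK_URI, "http://abc.org/1.1/title", "Lord of the Rings")
puts to_ntriple(BOOK_URI, "http://xyz.org/ns#/pages", "#{BOOK_URI}/page1")
```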
It is common to use standard predicates defined by other organizations so that institutions can share information knowing that a specific predicate means the same thing across datasets. For example, the following two predicates represent different things even though both are called "title":
http://purl.org/dc/elements/1.1/title
http://scholarsphere.psu.edu/ns#/title
A triple is roughly the equivalent of a cell (row/column) in a relational database (See http://workingontologist.org, page 31)
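One way to picture the cell analogy is to break a single row into triples: the row's key gives the subject, each column name becomes a predicate, and each cell value becomes an object. This is only a sketch; the row_to_triples helper and the "book" subject prefix are arbitrary choices made for this example.

```ruby
# Turn one relational row into triples. The row's id identifies the subject,
# each remaining column name becomes a predicate, and each cell value
# becomes an object. The prefix is an arbitrary naming choice.
def row_to_triples(prefix, row)
  subject = "#{prefix}#{row.fetch(:id)}"
  row.reject { |column, _value| column == :id }
     .map { |column, value| [subject, column.to_s, value] }
end

row = { id: 1, title: "Lord of the Rings", isbn: "123-456-789" }
row_to_triples("book", row).each { |triple| puts triple.inspect }
```

A row with two non-key columns yields two triples, one per cell.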
Fedora 4 is a document repository suited for large objects (e.g., text, images, audio, and video files) that natively supports RDF to store metadata about these objects.
Fedora stands for Flexible Extensible Digital Object Repository Architecture. See http://www.fedora-commons.org/about
Fedora provides an HTTP API to create and update objects.
For example, this request will create a new object in Fedora:
HTTP PUT http://localhost:8983/fedora/rest/book1
...and something like this will add a couple of "fields" (RDF statements) to this new object:
HTTP PUT http://localhost:8983/fedora/rest/book1
Content-Type: text/turtle
content-body
<> <http://whatever/title> "Lord of the Rings" .
<> <http://whatever/isbn> "978-0618640157" .
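For readers who want to try this from Ruby, here is a minimal sketch that builds such requests with the standard Net::HTTP library. The requests are only built, not sent, so no Fedora server is needed to run it. The choice of PUT for both requests and the text/turtle content type are assumptions based on Fedora 4's LDP-style REST API (PUT to create a resource at a chosen path, or to replace its triples); adjust them to whatever your Fedora version expects. The helper names are made up for this example.

```ruby
require "net/http"
require "uri"

# Assumed local Fedora 4 endpoint, taken from the example above.
FEDORA_BOOK_URI = URI("http://localhost:8983/fedora/rest/book1")

# Build (but do not send) a request that would create the object.
def build_create_request(uri)
  Net::HTTP::Put.new(uri)
end

# Build (but do not send) a request carrying RDF statements as Turtle.
# "<>" is a relative URI that refers to the object being addressed.
def build_update_request(uri, statements)
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = "text/turtle"
  request.body = statements
                 .map { |predicate, value| "<> <#{predicate}> \"#{value}\" ." }
                 .join("\n")
  request
end

request = build_update_request(
  FEDORA_BOOK_URI,
  { "http://whatever/title" => "Lord of the Rings",
    "http://whatever/isbn" => "978-0618640157" }
)
puts "#{request.method} #{request.path}"
puts request.body

# To actually send it (requires a running Fedora server):
#   Net::HTTP.start(FEDORA_BOOK_URI.host, FEDORA_BOOK_URI.port) do |http|
#     http.request(request)
#   end
```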
ActiveFedora is a Ruby gem that does for Fedora what ActiveRecord does for relational databases. This means that we can define a class as follows:
class BookObject < ActiveFedora::Base
property :title, predicate: ::RDF::DC.title
property :isbn, predicate: ::RDF::URI.new('http://libraries.psu.edu/metadata/isbn')
end
...and then create and fetch data using code as follows:
# Create an object...
b = BookObject.new( title: ["Lord of the Rings"], isbn: ["123-456-789"] )
b.save
puts b.id # => "123"
# ...and fetch it
b = BookObject.find("123")
puts b.title # => "Lord of the Rings"
puts b.isbn # => "123-456-789"
ActiveFedora automatically adds a hasModel property to the Fedora object to record which Ruby class the object should be instantiated as when it is fetched. That is how ActiveFedora knew to return a BookObject, with b.title and b.isbn populated, in the previous example.
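The general idea of storing a class name alongside the data can be sketched in a few lines of plain Ruby. This is a toy illustration, not ActiveFedora's actual code: the ToyBookObject class, the instantiate helper, and the hash standing in for a stored Fedora object are all hypothetical.

```ruby
# Toy illustration of the idea, not ActiveFedora's actual code.
# A stored record carries the name of its model; on fetch we look up the
# Ruby class with that name and build an instance from the stored fields.
class ToyBookObject
  attr_reader :title, :isbn

  def initialize(fields)
    @title = fields["title"]
    @isbn  = fields["isbn"]
  end
end

# Look up the class named in "hasModel" and instantiate it.
def instantiate(stored)
  model_class = Object.const_get(stored.fetch("hasModel"))
  model_class.new(stored)
end

# Hypothetical stand-in for what comes back from the repository.
stored = {
  "hasModel" => "ToyBookObject",
  "title" => ["Lord of the Rings"],
  "isbn" => ["123-456-789"]
}

book = instantiate(stored)
puts book.class
puts book.title.first
```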
Notice that we do specify the fields (predicates) in our ActiveFedora models. This is because there is no table with a specific structure in Fedora where Rails could pick them up as ActiveRecord does for relational databases.
You can also define relationships like the one between Books and Pages.
Behind the scenes ActiveFedora uses ActiveTriples to handle triples and LDP to handle the HTTP communication to Fedora.
Here are several good basic examples on ActiveFedora by Esme: https://github.com/escowles/testdrive