@wragge
Created February 25, 2016 04:37
# Linked Open Data!

How many of you were at the VALA conference?

Lots and lots of discussion about LOD

  • Vocabularies
  • RDF
  • Triples
  • SPARQL
  • Tim Berners-Lee

But that's not really what I'm going to talk about today.

Rather than focusing on the how, I want to talk about the why.

Linked Open Data takes a lot of work. Why should we bother?

I'm conscious that you're all here today to talk about discovery services, so I suppose we have to say that one of the big WHYs is smarter search, multiple channels for discovery.

And that's true.

But is it enough?

What I get really passionate about is the possibility of using Linked Open Data to share what we know.

And when I say we, I mean WE -- YOU and ME. Not Google, not Ex Libris, not Facebook -- individuals, people.

Linked Open Data is a means of using the web to publish, share, and reuse structured data. That might be your family tree. It might be your My Little Pony Collection. It might be your research into the history of Australia's immigration policy.

LOD is not just library metadata, it's not just fuel for a new generation of search engines. It's a way to share our passions and connect them with the passions of others.

That's WHY I get excited about LOD.


Who's Googled themselves? (C'mon, everyone has Googled themselves)

I certainly have. And as a result I have discovered that I am locked in a battle for world domination with two other Tim Sherratts. One is a British sound engineer, the other is an American religious historian.

Which one am I?

Ok, I think you can probably guess that the fact that I'm standing here talking about LOD and not the latest speaker systems means that I'm Tim Sherratt, historian and hacker.

But what if you just saw my name on a web page? You could probably still make a pretty good guess because as human beings we're really good at picking up on context. We'd probably see a whole range of clues in the content of the page.

But what if you're not a human being? What if you're a computer?


Now here comes the death-defying live demo -- actually, if you want to play around with LOD, this is a neat little playground.

RDFA PLAYGROUND

My name is Tim.

Again, you have a pretty good idea who that 'Tim' refers to. But to a computer it's just a series of characters. Unless it's been trained it's not even likely to know that 'Tim' is a name that refers to a person.

So let's help it.

typeof="Person"

But what's a person? Need to relate the term 'Person' to something that defines it -- yes a vocabulary...

vocab="http://schema.org/"

So now the computer can look up a url to find out more about the term.

But how does the string of characters 'Tim Sherratt' relate to the person?

property="name"

Let's make this even nicer by giving our little node its own identifier.

about="#tim"

But the computer still doesn't know which Tim Sherratt this person is. What can we use to help?

Yes -- the awesome power of identifiers.

It just so happens I have my very own Trove party id. Nothing to do with parties -- everything to do with people.

property="sameAs" href="http://nla.gov.au/nla.party-479364"

Add Kate.

And then we can do something like...

property="knows" href="#kate"
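Pulling those steps together, the markup might end up looking something like the string in the sketch below. The wrapping elements, the link text, and the `#kate` node are assumptions made for illustration, not a record of what was typed in the demo. The small reader class is also an assumption: it is a toy that only understands the handful of attributes used here, not a conforming RDFa processor, but it shows roughly what a computer can now pull out of the page.

```python
from html.parser import HTMLParser

# Markup assembled from the demo steps. Element choices, link text and the
# "#kate" node are illustrative assumptions.
HTML = """
<div vocab="http://schema.org/">
  <p about="#tim" typeof="Person">
    My name is <span property="name">Tim Sherratt</span>.
    <a property="sameAs" href="http://nla.gov.au/nla.party-479364">Trove</a>
    <a property="knows" href="#kate">Kate</a>
  </p>
  <p about="#kate" typeof="Person"><span property="name">Kate</span></p>
</div>
"""


class ToyRDFaReader(HTMLParser):
    """Collects (subject, predicate, object) triples from the snippet above.

    Deliberately tiny and NOT a real RDFa processor: the subject is simply
    the most recent `about` value, and only `typeof`, `property` + `href`,
    and `property` + text content are handled.
    """

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.subject = None
        self.pending = None  # predicate waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "vocab" in a:
            self.vocab = a["vocab"]
        if "about" in a:
            self.subject = a["about"]
        if "typeof" in a:
            self.triples.append(
                (self.subject, "rdf:type", self.vocab + a["typeof"])
            )
        if "property" in a:
            predicate = self.vocab + a["property"]
            if "href" in a:
                # The link target is the object of the triple.
                self.triples.append((self.subject, predicate, a["href"]))
            else:
                # The element's text will be the object of the triple.
                self.pending = predicate

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


reader = ToyRDFaReader()
reader.feed(HTML)
for triple in reader.triples:
    print(triple)
```

Each printed tuple is one statement: a thing, a relationship from the shared vocabulary, and either a value or a link to another thing.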

That's Linked Open Data:

  • Share and reuse vocabularies so that we're speaking the same language
  • Share and reuse identifiers so that we know we're talking about the same things
  • Put all this in nicely labelled buckets so that computers can find and use it

BUT WAIT A MINUTE! I was going to talk about sharing knowledge. It may be important for me to know who I am, but it's not a very interesting example is it?


Hmmm something that I could have included in my little demo was the information that both Kate and I are historians -- slightly more interesting.

You know what historians do, right? What do historians do?

Well I suppose, but I'll let you into a secret -- what historians do is create Linked Open Data. THEY JUST DON'T KNOW IT.

Don't believe me? Think about it. A historian's research generally consists of identifying entities -- people, places, organisations, events, resources -- and defining relationships between them.

Tim knows Kate.

Sometimes this LOD might be in a database, sometimes it might be on index cards or scraps of paper.

But what happens when historians come to the strangely named process of 'writing up'?

All that rich contextual data is squeezed out, glimpsed perhaps as sad little strings in footnotes.

Of course turning all that data into an interesting narrative is a large part of the art of history, but can't we have both? Can't we have strong narratives and rich data?

With Linked Open Data, I think we can.


In 1908, James Minahan arrived back in Australia. James had been born in Australia -- his father was Chinese, his mother was Irish. At the age of 5 his father took him to China, where he stayed for the next 26 years.

When he came back, at the age of 31, James spoke no English. He looked and sounded Chinese.

This might not have been a particularly remarkable event, except for the fact that in 1901 Australia passed the Immigration Restriction Act and implemented the White Australia Policy.

Under the Act James Minahan was subjected to the Dictation Test and declared a 'prohibited immigrant'. But how could you be an immigrant if you were born here? This was the question that took James Minahan's case all the way to the High Court.

Kate is not just a historian, she's a historian of Chinese Australia. In particular she's interested in families and relationships -- in complex stories such as that of James Minahan.

Kate has written up much of the story, but of course it involves many people, many organisations, many events, with detail drawn from a range of different sources in cultural heritage collections.

Narrative and Linked Open Data -- is it possible?

Here comes live demo number two. This is something that I'm still building -- I've had several goes over the last few years, but I keep coming back to it, because I think it's important.

I had hoped to have all the data entered and described ready for today, but we didn't quite make it -- so there's quite a few blanks, but here goes...

Linked Open Data meets history...


BUT it just looks like text!

Scroll and the entities appear!

Relationship between text and entities.

Footnotes!

Explore the links -- look at the relationships.

Different views -- wall, map, timeline

All data -- filterable.

OK, BUT WHERE'S THE LINKED OPEN DATA?

This is just an HTML page with some JavaScript -- no databases, no platforms.

Where is the data coming from? It's all just sitting in the HTML as LOD.

LET'S HAVE A LOOK!

Data in 2 places -- JSON-LD and RDFa in text.

SHOW THE TRIPLES!
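For the JSON-LD half, here is a rough sketch of how data can sit inside the page itself and be read straight back out with no database involved. The notes don't show the demo's actual data, so everything here is an assumption apart from the name, which comes from the story above: the `#james-minahan` identifier, the choice of schema.org terms, and the page layout are all hypothetical.

```python
import json
import re

# Hypothetical entity: the identifier and the schema.org terms are
# assumptions; only the name comes from the story.
person = {
    "@context": "http://schema.org/",
    "@id": "#james-minahan",
    "@type": "Person",
    "name": "James Minahan",
}

# Embed the data in the page itself -- no database, no platform.
page = (
    "<html><body>\n"
    '<script type="application/ld+json">\n'
    + json.dumps(person, indent=2)
    + "\n</script>\n"
    "<p>In 1908, James Minahan arrived back in Australia.</p>\n"
    "</body></html>"
)

# Any script or crawler can pull the data back out of the HTML.
match = re.search(
    r'<script type="application/ld\+json">(.*?)</script>', page, re.DOTALL
)
data = json.loads(match.group(1))
print(data["@type"], data["name"])
```

The same idea works in the browser: a bit of JavaScript can read the `application/ld+json` script block and build the wall, map, and timeline views from it.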


Why do I get excited about this?

PDFs are dead ends; they go nowhere. What I want is for every history book or article to be a starting point -- a portal, a discovery interface that will enable and encourage readers to explore not just the narrative, not just its associated collection of resources, but all the resources in linked collections. Every publication should be a gateway, not a closed world.

Linked Open Data can help make that possible.


That's just one possibility. There are many more. As I said at the outset, LOD is about people sharing knowledge. Sometimes that can be trivial, sometimes it can be immensely powerful.

The Open Memory Project.

Documenting the experiences of Holocaust victims -- that to me is a pretty powerful WHY.
