Created
April 11, 2016 04:06
Ontology Engineering: SPARQL
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Ontology Engineering\n", | |
"## Question Answering with SPARQL\n", | |
"\n", | |
"## With your host, Jim McCusker" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# SPARQL is the SPARQL Protocol and Query Language\n", | |
"\n", | |
"That means it is:\n", | |
"\n", | |
"1. A [Query Language](https://www.w3.org/TR/sparql11-query/)\n", | |
"2. A [Web Protocol](https://www.w3.org/TR/sparql11-protocol/)\n", | |
"3. A [W3C Standard](https://www.w3.org/TR/sparql11-overview/)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# SPARQL Tools" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Protege SPARQL Plugin\n", | |
"\n", | |
"![Protege SPARQL Plugin Tab](http://i.imgur.com/amWS9ms.png)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# [Yet Another SPARQL GUI](http://yasgui.org/)\n", | |
"\n", | |
"![YASGUI](http://i.imgur.com/PVdhntT.png)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# RDF Databases Supporting SPARQL:\n", | |
"\n", | |
"* OpenLink Virtuoso\n", | |
"* AllegroGraph\n", | |
"* Fuseki\n", | |
"* BlazeGraph\n", | |
"* Sesame\n", | |
"* 4store\n", | |
"* Stardog\n", | |
"* ... (there are many)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# SPARQL is supported in:\n", | |
"\n", | |
"* Python (rdflib) remote, local\n", | |
"* Java (Jena, Sesame) remote, local\n", | |
"* R (sparql package) remote\n", | |
"\n", | |
"I haven't tried these, but:\n", | |
"\n", | |
"* .Net (dotNetRDF) remote\n", | |
"* PHP (sparqllib.php) remote\n", | |
"\n", | |
"Also, anything that can do HTTP GETs and read JSON, CSV, or XML (like Javascript) can easily talk to a SPARQL endpoint." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# SPARQL Queries are Basic Graph Patterns\n", | |
"\n", | |
"```sparql\n", | |
"select ?s ?p ?o where {\n", | |
" ?s ?p ?o.\n", | |
"} limit 100\n", | |
"```\n", | |
"[Try it.](http://yasgui.org/short/VyDhtWEJ-)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Which query has more matches?\n", | |
"\n", | |
"```sparql\n", | |
"# A\n", | |
"select ?a ?b ?c ?d ?e ?f where {\n", | |
" ?a ?b ?c.\n", | |
" ?d ?e ?f.\n", | |
"}\n", | |
"```\n", | |
"\n", | |
"```sparql\n", | |
"# B\n", | |
"select ?a ?b ?c ?d ?e where {\n", | |
" ?a ?b ?c.\n", | |
" ?c ?d ?e.\n", | |
"}\n", | |
"```\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Blank nodes are [ ].\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?class where {\n", | |
" [] a ?class.\n", | |
"} limit 100\n", | |
"```\n", | |
"[Try it.](http://yasgui.org/short/E1Bx9ZVyZ)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# You can embed patterns in blank nodes.\n", | |
"\n", | |
"For instance, if you don't actually care about the URI:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?uLabel where {\n", | |
" [] <http://dbpedia.org/property/universe> [\n", | |
" rdfs:label ?uLabel\n", | |
" ].\n", | |
"} limit 100\n", | |
"```\n", | |
"[Try it.](http://yasgui.org/short/VJUfcZ41b)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# What's the difference between:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?uLabel where {\n", | |
" [] <http://dbpedia.org/property/universe> [\n", | |
" rdfs:label ?uLabel\n", | |
" ].\n", | |
"} limit 100\n", | |
"```\n", | |
"\n", | |
"# and\n", | |
"\n", | |
"```sparql\n", | |
"select ?uLabel where {\n", | |
" [] <http://dbpedia.org/property/universe> [\n", | |
" rdfs:label ?uLabel\n", | |
" ].\n", | |
"} limit 100\n", | |
"```\n", | |
"\n", | |
"# ?" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Complex Patterns Increase Specificity.\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?a ?b ?universe ?uLabel where {\n", | |
" ?a <http://dbpedia.org/property/universe> ?universe.\n", | |
" ?b <http://dbpedia.org/property/universe> ?universe.\n", | |
" ?universe rdfs:label ?uLabel.\n", | |
"} LIMIT 1000\n", | |
"```\n", | |
"\n", | |
"## vs\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?entityA ?entityB ?universe ?uLabel where {\n", | |
" ?a <http://dbpedia.org/property/universe> ?universe.\n", | |
" ?b <http://dbpedia.org/property/universe> ?universe.\n", | |
"} limit 1000\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"## Union\n", | |
"\n", | |
"UNION groups two distinct BGP patterns together:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?a ?b ?universe where {\n", | |
" {\n", | |
" ?a <http://dbpedia.org/property/universe> ?u.\n", | |
" ?b <http://dbpedia.org/property/universe> ?u.\n", | |
" ?u rdfs:label ?universe.\n", | |
" } UNION {\n", | |
" ?a <http://dbpedia.org/property/universe> ?universe.\n", | |
" ?b <http://dbpedia.org/property/universe> ?universe.\n", | |
" }\n", | |
"} LIMIT 1000\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/4kOU9-EyW)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"## Optional\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?a ?b ?universe ?uLabel where {\n", | |
" ?a <http://dbpedia.org/property/universe> ?universe.\n", | |
" ?b <http://dbpedia.org/property/universe> ?universe.\n", | |
" optional {\n", | |
" ?universe rdfs:label ?uLabel.\n", | |
" }\n", | |
"}\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/EJUbib4kW)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Prefixes\n", | |
"\n", | |
"Prefixes work a little different than in Turtle, make things readable too:\n", | |
"\n", | |
"```sparql\n", | |
"PREFIX dbp: <http://dbpedia.org/property/>\n", | |
"\n", | |
"select distinct ?u where {\n", | |
" ?entityA dbp:universe ?u.\n", | |
"}\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/Vyv7oWN1Z)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Named Graphs\n", | |
"\n", | |
"Most SPARQL endpoints support and have multiple named graphs:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?g where {\n", | |
" graph ?g {\n", | |
" [] ?p [].\n", | |
" }\n", | |
"} limit 100\n", | |
"```\n", | |
"\n", | |
"Named graphs are resources like anything else. Use URIs to name them, or use a variable to find them. You can also look for information about named graphs:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?g ?glabel where {\n", | |
" graph ?g {\n", | |
" [] ?p [].\n", | |
" }\n", | |
" ?g rdfs:label ?glabel.\n", | |
"} limit 100\n", | |
"```" | |
] | |
}, | |
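{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Querying One Named Graph\n",
"\n",
"To name a graph by URI instead of finding it with a variable, a minimal sketch follows. The graph URI `http://dbpedia.org` is an assumption; substitute any URI returned by the `graph ?g` query on your endpoint.\n",
"\n",
"```sparql\n",
"# Assumption: the endpoint holds a named graph called <http://dbpedia.org>.\n",
"select distinct ?class where {\n",
"  graph <http://dbpedia.org> {\n",
"    [] a ?class.\n",
"  }\n",
"} limit 100\n",
"```\n",
"\n",
"Only triples stored in that one graph can match the inner pattern."
]
},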
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Property Paths\n", | |
"\n", | |
"Property paths allow for fairly complex traversal of graphs:\n", | |
"\n", | |
"* Basic enumeration: `<prop>+` $\\geq 1$, `<prop>*` $s \\geq 0$ , `<prop>?` $s \\in \\{0,1\\}$\n", | |
"* Traversal: `<prop1>/<prop2>`\n", | |
"* Alternates: `<prop1>|<prop2>`\n", | |
"* Grouping: `(<prop>)`, `(<prop1>/<prop2>)+|<prop3>`\n", | |
"* Inverse: `^<prop>`\n", | |
"* Negations: `!<prop>`, `!^<prop>`, `!(<prop1>|<prop2>|...|<propN>)`" | |
] | |
}, | |
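{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Property Paths in a Query\n",
"\n",
"As a rough sketch of the operators listed for property paths, the two queries below combine a sequence with an inverse, and a repetition with a sequence. They assume the `rdfs:` and `owl:` prefixes are predeclared by the endpoint, as in the other slides.\n",
"\n",
"```sparql\n",
"# Pairs that share a universe (including ?a paired with itself):\n",
"# one step forward along the property, then one step backward.\n",
"select distinct ?a ?b where {\n",
"  ?a <http://dbpedia.org/property/universe>/^<http://dbpedia.org/property/universe> ?b.\n",
"} limit 100\n",
"```\n",
"\n",
"```sparql\n",
"# Labels of every class reachable through zero or more subClassOf steps.\n",
"select distinct ?c ?superLabel where {\n",
"  ?c a owl:Class;\n",
"     rdfs:subClassOf*/rdfs:label ?superLabel.\n",
"} limit 100\n",
"```"
]
},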
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Filtering and Expressions\n", | |
"\n", | |
"Filters are evaluated against graph pattern matches. [There are lots of ways to evaluate filters](https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#expressions).\n", | |
"\n", | |
"```sparql\n", | |
"# Find all root classes\n", | |
"\n", | |
"select distinct ?root where {\n", | |
" ?c a owl:Class;\n", | |
" rdfs:subClassOf* ?root.\n", | |
" optional {\n", | |
" ?root rdfs:subClassOf ?superRoot.\n", | |
" }\n", | |
" FILTER(!bound(?superRoot) && isIRI(?root))\n", | |
"}\n", | |
"```\n", | |
"\n", | |
"[Try it](http://yasgui.org/short/EkxujZ4yb)" | |
] | |
}, | |
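{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Filtering on Literals\n",
"\n",
"As another hedged illustration of filter expressions, this sketch keeps only English labels that contain a chosen word. The search word `world` is an arbitrary assumption, picked only for illustration.\n",
"\n",
"```sparql\n",
"select distinct ?universe ?uLabel where {\n",
"  [] <http://dbpedia.org/property/universe> ?universe.\n",
"  ?universe rdfs:label ?uLabel.\n",
"  # langMatches compares language tags; the \"i\" flag makes regex ignore case.\n",
"  FILTER(langMatches(lang(?uLabel), \"en\") && regex(str(?uLabel), \"world\", \"i\"))\n",
"} limit 100\n",
"```"
]
},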
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# BINDing Variables\n", | |
"\n", | |
"BIND lets you set a variable, and can be done in UNION sub-patterns or anywhere.\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?a ?b ?universe ?source where {\n", | |
" {\n", | |
" ?a <http://dbpedia.org/property/universe> ?u.\n", | |
" ?b <http://dbpedia.org/property/universe> ?u.\n", | |
" ?u rdfs:label ?universe.\n", | |
" BIND(\"Hello.\" as ?source)\n", | |
" }\n", | |
"} LIMIT 1000\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/VJVQhZVk-)\n", | |
"\n", | |
"These values can be any computed expression." | |
] | |
}, | |
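{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# BIND with Computed Expressions\n",
"\n",
"A small sketch of binding computed values, using the standard SPARQL 1.1 functions `str`, `ucase`, and `strlen`. The variable names `?upper` and `?length` are made up for this example.\n",
"\n",
"```sparql\n",
"select distinct ?universe ?upper ?length where {\n",
"  [] <http://dbpedia.org/property/universe> ?u.\n",
"  ?u rdfs:label ?universe.\n",
"  # Each BIND evaluates its expression once per match of the pattern above it.\n",
"  BIND(ucase(str(?universe)) as ?upper)\n",
"  BIND(strlen(str(?universe)) as ?length)\n",
"} limit 100\n",
"```"
]
},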
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# VALUES\n", | |
"\n", | |
"Multiple bindings can be assigned using the VALUES keyword:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?e ?universe where {\n", | |
" VALUES ?universe { \n", | |
" <http://dbpedia.org/resource/Whoniverse>\n", | |
" <http://dbpedia.org/resource/Discworld_(world)>\n", | |
" }\n", | |
" ?e <http://dbpedia.org/property/universe> ?universe.\n", | |
"}\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/NJHOn-V1b)\n", | |
"\n", | |
"Multiple values can be bound at once, or provide undefined slots:\n", | |
"\n", | |
"```sparql\n", | |
"VALUES (?x ?y) {\n", | |
" (:uri1 1)\n", | |
" (:uri2 UNDEF)\n", | |
"}\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Sort, Limit, Offset\n", | |
"\n", | |
"Can be used together to page through results or just customize the result order\n", | |
"\n", | |
"```sparql\n", | |
"select ?s ?p ?o where {\n", | |
" ?s ?p ?o.\n", | |
"} order by desc(?s) ?p offset 100 limit 100\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Group By and Aggregation\n", | |
"\n", | |
"Analytic queries can be really cool.\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?universe (count(?e) as ?articles) where {\n", | |
" ?e <http://dbpedia.org/property/universe> ?universe.\n", | |
"} group by ?universe order by desc(?articles)\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/NJZWpZVkW)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Subqueries\n", | |
"\n", | |
"Sometimes you need to evaluate some initial bindings separately:\n", | |
"\n", | |
"```sparql\n", | |
"select distinct ?topic ?universe where {\n", | |
" ?topic <http://dbpedia.org/property/universe> ?universe.\n", | |
" {\n", | |
" select distinct ?universe (count(?e) as ?articles) where {\n", | |
" ?e <http://dbpedia.org/property/universe> ?universe.\n", | |
" } group by ?universe order by desc(?articles) limit 5\n", | |
" }\n", | |
"}\n", | |
"```\n", | |
"[Try it](http://yasgui.org/short/VyZLpZNJZ)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true, | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Query Forms\n", | |
"\n", | |
"In addition to SELECT, there are CONSTRUCT, ASK, and DESCRIBE:\n", | |
"\n", | |
"```sparql\n", | |
"construct {\n", | |
" ?s ?p ?o; a ?type.\n", | |
" ?p a rdfs:Property.\n", | |
" ?type a owl:Class.\n", | |
"} where {\n", | |
" ?s ?p ?o; a ?type.\n", | |
"}\n", | |
"\n", | |
"describe ?x where { ?x a owl:Class. }\n", | |
"\n", | |
"ask { ?x a owl:Classs. }\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# SPARQL Update\n", | |
"\n", | |
"```sparql\n", | |
"INSERT DATA { <http://example/book1> dc:title \"Great Expectations\".}\n", | |
"\n", | |
"DELETE DATA { <http://example/book2> dc:title \"David Copperfield\".}\n", | |
"```\n", | |
"\n", | |
"Can be with or without graph identifiers, can use `where` to do more general data matches.\n", | |
"\n", | |
"## Graph Managment:\n", | |
"```sparql\n", | |
"CREATE GRAPH <http://dbpedia.org/graph/dickens>\n", | |
"\n", | |
"DROP GRAPH <http://dbpedia.org/graph/dickens>\n", | |
"\n", | |
"# Makes the destination graph have all and only the triples from the default.\n", | |
"COPY DEFAULT TO <http://dbpedia.org/graph>\n", | |
"\n", | |
"# Adds all triples from the default graph to the destination\n", | |
"ADD DEFAULT TO <http://dbpedia.org/graph>\n", | |
"\n", | |
"MOVE <http://dbpedia.org/graph/dickens> TO <http://dbpedia.org/graph/WorksOfCharlesDickens>\n", | |
"```" | |
] | |
}, | |
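{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Updates with WHERE\n",
"\n",
"A minimal sketch of a pattern-based update inside a named graph. The graph URI reuses the illustrative Dickens graph from the graph management examples, the misspelled title is invented, and the `dc:` prefix is assumed to be declared.\n",
"\n",
"```sparql\n",
"# Replace a misspelled title wherever it occurs in one named graph.\n",
"DELETE { GRAPH <http://dbpedia.org/graph/dickens> { ?book dc:title \"David Coperfield\". } }\n",
"INSERT { GRAPH <http://dbpedia.org/graph/dickens> { ?book dc:title \"David Copperfield\". } }\n",
"WHERE {\n",
"  GRAPH <http://dbpedia.org/graph/dickens> {\n",
"    ?book dc:title \"David Coperfield\".\n",
"  }\n",
"}\n",
"```\n",
"\n",
"The WHERE clause is matched first; the DELETE and INSERT templates are then filled in once per match."
]
},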
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# SPARQL Federation\n", | |
"\n", | |
"Can be used for Semantic Extraction, Transformation, and Loading (SETL):\n", | |
"\n", | |
"```sparql\n", | |
"INSERT graph <http://dbpedia.org/universes> {\n", | |
" ?topic dc:partOf ?universe.\n", | |
" ?topic a schema:FictionalEntity.\n", | |
" ?universe a schema:FictionalEntity, schema:FictionalUniverse.\n", | |
"} WHERE {\n", | |
" SERVICE <http://dbpedia.org/sparql> { \n", | |
" ?topic <http://dbpedia.org/property/universe> ?universe.\n", | |
" } \n", | |
"}\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Service Descriptions\n", | |
"\n", | |
"A basic example:\n", | |
"\n", | |
"```turtle\n", | |
"@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n", | |
"@prefix dbp: <http://dbpedia.org/> .\n", | |
"@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .\n", | |
"@prefix f: <http://www.w3.org/ns/formats/> .\n", | |
"\n", | |
"dbp:sparql a sd:Service ;\n", | |
" sd:endpoint dbp:sparql ;\n", | |
" sd:feature sd:DereferencesURIs, sd:UnionDefaultGraph .\n", | |
" sd:resultFormat f:RDFa, f:SPARQL_Results_JSON , f:SPARQL_Results_XML, \n", | |
" f:Turtle, f:N-Triples, f:N3, f:RDF_XML, f:SPARQL_Results_CSV ;\n", | |
" sd:supportedLanguage sd:SPARQL10Query ;\n", | |
" sd:url dbp:sparql .\n", | |
"```\n", | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Exercises (work in groups):\n", | |
"\n", | |
"1. What are the most common properties used on instances of the class `http://schema.org/Book`?\n", | |
"\n", | |
"2. How much of the [RDFS entailment regime](https://www.w3.org/TR/rdf-schema/) can be performed in a _single_ CONSTRUCT query?\n", | |
"\n", | |
"Example:\n", | |
"\n", | |
"```sparql\n", | |
"construct {\n", | |
" ?s ?p_ ?o.\n", | |
"} where {\n", | |
" ?s ?p ?o.\n", | |
" ?p rdfs:subPropertyOf* ?p_.\n", | |
"}\n", | |
"```" | |
] | |
} | |
], | |
"metadata": { | |
"celltoolbar": "Slideshow", | |
"kernelspec": { | |
"display_name": "Python 2", | |
"language": "python", | |
"name": "python2" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 2 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython2", | |
"version": "2.7.10" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |