@knbknb
Last active April 3, 2024 07:54
SPARQL/RDF/SemWeb: notes
# Example.com demo
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX : <http://example.com/demo/>
INSERT DATA
{
:General_Country_Info_Graph dcterms:title "General country information graph" ;
dcterms:description "Named graph containing basic physical and human geography characteristics of countries" ;
dcterms:source <https://www.geonames.org/> .
}
# Describe the named graph
PREFIX : <http://example.com/demo/>
DESCRIBE :General_Country_Info_Graph
# Exposing the full vocabulary for the graph
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX : <http://example.com/demo/>
SELECT DISTINCT ?t ?p ?pType
WHERE {
?s ?p ?o ;
a ?t .
BIND (
COALESCE(
IF(isIRI(?o), "Resource Predicate", 1/0),
"Literal Predicate"
) AS ?pType
)
}
GROUP BY ?t ?p ?pType
ORDER BY ASC(?t) ASC(?p)

Set of explorative SPARQL queries

From: Learning SPARQL 1.1 - Bob DuCharme

and RDF&SPARQL Essentials - Tish Chungoora

# retrieve a list of all named graphs from a graph database
    SELECT ?g 
    WHERE {
      GRAPH ?g { }
    }
# What are the types of things that exist in the graph?
SELECT DISTINCT ?t
WHERE {
    ?s a ?t
}
GROUP BY ?t
ORDER BY ASC(?t)
# What are the predicates defined in the graph?
SELECT DISTINCT ?p
WHERE {
    ?s ?p ?o
}
ORDER BY ASC(?p)
# What is the full vocabulary for the graph?
SELECT DISTINCT ?t ?p
WHERE {
    ?s ?p ?o ;
        a ?t
}
GROUP BY ?t ?p
ORDER BY ASC(?t) ASC(?p)
# What is the full vocabulary for the graph (distinguishing the resource predicates from literal predicates)?
SELECT DISTINCT ?t ?p ?pType
WHERE {
    ?s ?p ?o ;
        a ?t .

    BIND (
    COALESCE(
        IF(isIRI(?o), "Resource Predicate", 1/0),
        "Literal Predicate"
        ) AS ?pType
    )
}
GROUP BY ?t ?p ?pType
ORDER BY ASC(?t) ASC(?p)
# Wikidata items of Wikipedia articles
# The language version and project are given by schema:isPartOf: de.wikipedia.org for German Wikipedia, es.wikivoyage.org for Spanish Wikivoyage, etc.
# (the schema: and wd: prefixes are predefined on the Wikidata Query Service)
# returns 
# Wikidata  wd:Q2013
# Berlin    wd:Q64

SELECT ?lemma ?item WHERE {
  VALUES ?lemma {
    "Wikipedia"@de
    "Wikidata"@de
    "Berlin"@de
    "Technische Universität Berlin"@de
  }
  ?sitelink schema:about ?item;
    schema:isPartOf <https://de.wikipedia.org/>;
    schema:name ?lemma.
}

RDF Nodes cannot exist without at least one edge, to another node or to themselves.

IRI: Internationalized Resource Identifier

  • a generalized URI
  • used for proper things
  • URLs are not officially recognised as a type of RDF node

Literal: string, number, date, etc.

  • used for values
  • typed with XML Schema datatypes

Blank nodes: avoid if possible

  • make querying RDF data harder
  • don't carry persistent labels
  • used in skos:Collection, skos:OrderedCollection to build up linked lists

A directed graph

:Bugs_Bunny :name "Bugs Bunny"

  • : (the colon) is the empty prefix; it stands for the "base URI" or the default namespace
  • :name is a prefixed name that expands to an IRI in that namespace; snake case is common
  • "Bugs Bunny" is a literal

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX : <http://looneytunes-graph.com/>

:Bugs_Bunny :name "Bugs Bunny" ;
    :species "Hare" ;
    :gender "Male" .

these are equivalent:

:Bugs_Bunny rdf:type :Looney_Tunes_Character .
:Bugs_Bunny a :Looney_Tunes_Character .

  • rdf:type is so common that it has the shorthand "a"

Visual graph

[Image: Looney Tunes graph]

SPARQL best practices

  • plan complex queries (pencil+paper, pseudo-code, comments)
  • write lean queries using prefixes and semicolons, commas
  • (short) varnames that match what you are looking for (?m for "movies")
  • use DISTINCT often to avoid duplicates
  • use LIMIT 5 to avoid long waits
  • combine LIMIT with ORDER BY DESC(?var) to get top n results
  • use OFFSET to page through results
  • OPTIONAL for Left-join equivalent
  • you can follow different approaches to design queries
  • DESCRIBE :item to get the triples about an item (typically ingoing and outgoing links 1 hop away; the exact result is up to the store)
  • ASK { ... } to check if an item exists or if a fact is true (a logician's proposition)
    • often used to check if a triple exists, especially in federated queries
  • CONSTRUCT to build a new graph from a query, or new properties/relationships (similar to a view in relational databases)
    • can get the data for a new graph, or to infer new relationships. Useful with federated queries

BIND to execute expressions and assign the result to a variable

  • useful for calculations, string manipulations, and conditional tests

VALUES to provide a list of possible values for a variable

COALESCE to provide default values, or COALESCE(IF()) as a cleaner conditional test
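
As a minimal sketch of the ASK form mentioned in the list above, the query below checks a single fact. It assumes the Looney Tunes demo graph used elsewhere in these notes, with :Bugs_Bunny typed as :Looney_Tunes_Character.

```sparql
PREFIX : <http://looneytunes-graph.com/>

# ASK returns only true or false, never variable bindings.
ASK {
    :Bugs_Bunny a :Looney_Tunes_Character .
}
```

Because only a boolean comes back, this is a cheap way to test for a triple before running a larger query.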

CONSTRUCT queries

We can, of course, further customise the output and include other triples: for example, we can calculate some birth dates and attach them with a custom predicate, :birth_date. Running the query then returns a slightly bigger sub-graph.

What I personally like about CONSTRUCT queries is the idea of customising the output of the graph patterns they follow. CONSTRUCT is a very flexible query form in SPARQL and is useful for building inferred graphs on the fly.
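
A hedged sketch of such a CONSTRUCT query follows. It is not the query from the course. It assumes the :created_by and :born_on predicates used in the other Looney Tunes queries of these notes, and it copies the stored date into the custom :birth_date predicate named above.

```sparql
PREFIX : <http://looneytunes-graph.com/>

# Template: the triples to emit for every solution of the WHERE clause.
CONSTRUCT {
    ?c  :created_by ?p1 .
    ?p1 :birth_date ?bd .    # custom predicate, not stored in the source graph
}
WHERE {
    ?c a :Looney_Tunes_Character ;
       :created_by ?p1 .
    ?p1 :born_on ?bd .
}
```

The result is a new graph, not a table, so it can be saved or loaded into another named graph.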

Use LOAD 👍

LOAD <file:///$HOME/data/_my-turtle.ttl> INTO GRAPH <http://example.com/demo/Info_Graph> ;
LOAD ...

In Blazegraph, choose "SPARQL Update" in the selection list. Use N-Quads as the format, because we want to create named graphs.

Most basic query:

select  *
where {  ?s ?p ?o  }

usable for small custom knowledge graphs

PREFIX : <http://looneytunes-graph.com/>
select  ?s ?p ?o 
where {  ?s ?p ?o  .
      ?s a  :Looney_Tunes_Character
      }

Add query modifiers after the WHERE clause. The most common are LIMIT, OFFSET, and ORDER BY.
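
A small sketch of the three modifiers combined for paging, assuming the same Looney Tunes graph and its :name predicate. The page size of 5 is arbitrary.

```sparql
PREFIX : <http://looneytunes-graph.com/>

SELECT ?c ?n
WHERE {
    ?c a :Looney_Tunes_Character ;
       :name ?n .
}
ORDER BY ASC(?n)   # sort first, so that paging is stable
LIMIT 5            # page size
OFFSET 5           # skip the first page, i.e. return rows 6 to 10
```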

Triple patterns look very similar to Turtle syntax:

PREFIX : <http://looneytunes-graph.com/>
SELECT  ?n 
WHERE {
      :Bugs_Bunny :name  ?n
      }

LIMIT 5
PREFIX : <http://looneytunes-graph.com/>
SELECT  ?p ?n
WHERE {
       
      :Bugs_Bunny :created_by  ?p .
       ?p :name ?n .
      }
# what are the debut release dates for looney tunes characters?
PREFIX : <http://looneytunes-graph.com/>
SELECT  ?n ?m ?d
WHERE {
    ?c a :Looney_Tunes_Character ;
       :name ?n ;
       :made_debut_appearance_in ?m .
    ?m :release_date ?d .
}
order by desc(?d)
#
# list the lifespan of each Looney Tunes character creator. 
# In the tabulated results make sure to list each creator along with their corresponding lifespan value in years.
# 
PREFIX : <http://looneytunes-graph.com/>
SELECT DISTINCT  ?p1 ?bd ?dd ?lifespan
WHERE {
       
      ?c a :Looney_Tunes_Character ;
         :created_by ?p1 .
      ?p1 :born_on ?bd;
          :died_on ?dd .

   
  BIND(year(?bd) as ?bYear ) .
  BIND(year(?dd) as ?dYear ) .
  BIND((?dYear - ?bYear) as ?lifespan ) 
}
#
# list the lifespan of each Looney Tunes character creator, as in the previous query,
# and additionally label each creator as born before or after 1900 (?period).
# 
PREFIX : <http://looneytunes-graph.com/>
SELECT DISTINCT  ?p1 ?bd ?dd ?lifespan ?period
WHERE {
       
      ?c a :Looney_Tunes_Character ;
         :created_by ?p1 .
      ?p1 :born_on ?bd;
          :died_on ?dd .
  
     
  BIND(year(?bd) as ?bYear ) 
  BIND(year(?dd) as ?dYear ) 
  BIND((?dYear - ?bYear) as ?lifespan ) 
  # 1/0 raises an error, so COALESCE skips that IF and tries the next argument
  BIND(COALESCE( IF (?bYear < 1900, "Born pre-1900", 1/0),
                IF (?bYear >= 1900, "Born post-1900", 1/0),
                
                "NA")
                as ?period )
}
# property paths
## Inverse path
PREFIX : <http://looneytunes-graph.com/>

SELECT ?c ?p
WHERE {
    ?p ^:created_by ?c
}
## What are the possible paths I can take between two nodes (e.g. between Bugs Bunny & Daffy Duck)?
## will probably work with a small graph only

PREFIX : <http://looneytunes-graph.com/>

SELECT DISTINCT ?subject ?predicate ?object
WHERE {
    VALUES ?start { :Bugs_Bunny }
    VALUES ?end { :Daffy_Duck }
    
    ?start (:|!:)* ?subject .
    ?subject ?predicate ?object .
    ?object (:|!:)* ?end
}
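
Besides the inverse path shown earlier, a sequence path (/) is another common shortcut. The sketch below assumes the same :created_by and :name predicates. It returns the creator's name for Bugs Bunny without introducing a variable for the creator node.

```sparql
PREFIX : <http://looneytunes-graph.com/>

# Sequence path: follow :created_by, then :name.
SELECT ?n
WHERE {
    :Bugs_Bunny :created_by/:name ?n .
}
```

This is equivalent to the two-pattern form `:Bugs_Bunny :created_by ?p . ?p :name ?n .`, with ?p left out of the results.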
# BIND() / BOUND(): use this
# if some OPTIONAL values are unbound (NULL) and there is nothing to bind
#....
  OPTIONAL { ?clang :language_variant ?langVar. }
  # !BOUND(?langVar) is true if ?langVar is unbound;
  # in that case ?langN alone is returned as ?langWithVariant,
  # otherwise CONCAT appends the variant in square brackets
  BIND(IF(!(BOUND(?langVar)), 
          ?langN, 
          CONCAT(?langN, "[", ?langVar, "]"))    AS ?langWithVariant)
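
The same OPTIONAL plus BOUND pattern as a complete query, sketched against the Looney Tunes graph instead of the language data above. It assumes that some creators may lack a :died_on triple. The ?status variable and its label strings are made up for illustration.

```sparql
PREFIX : <http://looneytunes-graph.com/>

SELECT DISTINCT ?p1 ?status
WHERE {
    ?c a :Looney_Tunes_Character ;
       :created_by ?p1 .
    # ?dd stays unbound when no death date is recorded
    OPTIONAL { ?p1 :died_on ?dd . }
    BIND(IF(BOUND(?dd),
            CONCAT("died ", STR(year(?dd))),
            "no death date recorded") AS ?status)
}
```

IF only evaluates the branch it selects, so the unbound ?dd never reaches year() and every creator gets a ?status value.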