@ColinMaudry
Last active November 24, 2023 15:46
cURL examples to query Wikidata

SPARQL Queries (with cURL command) on Wikidata

This gist turned out to be the spark for a proper article, and won't be maintained here anymore.

The SPARQL endpoint is http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql and it has a Web form to fire queries.

The repository doesn't have named graphs, or at least the SPARQL endpoint rejects graph queries. The classes of entities (rdf:type) are not described in the repository. However, http://www.wikidata.org/prop/direct/P31 ("instance of") tells you what the entity is.

To find the HTML page of an entity (such as https://www.wikidata.org/entity/Q866405), simply replace /entity/ with /wiki/.
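
The replacement described above can be sketched as a small shell function. The name entity_to_wiki is a hypothetical helper introduced here for illustration; it only rewrites the string and does not check that the page exists.

```shell
# entity_to_wiki: hypothetical helper, not part of any Wikidata tooling.
# Rewrites the first "/entity/" path segment of an entity URI to "/wiki/".
entity_to_wiki() {
  printf '%s\n' "$1" | sed 's|/entity/|/wiki/|'
}

entity_to_wiki "https://www.wikidata.org/entity/Q866405"
# prints https://www.wikidata.org/wiki/Q866405
```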

By default, the SPARQL endpoint returns the results in SPARQL Query Results XML format, but:

  • adding -H "Accept: application/json" to the cURL command returns them in SPARQL Query Results JSON format
  • adding -H "Accept: text/csv" to the cURL command returns them in CSV format (the most readable).

Don't forget to URL encode the query ;-)
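
If no online encoder is at hand, the encoding step can be approximated in a POSIX shell. This is a minimal sketch: urlencode is a hypothetical helper name, and it assumes the query contains only ASCII characters (multi-byte characters would need byte-wise handling that is not shown here).

```shell
# urlencode: hypothetical helper that percent-encodes its first argument.
# Assumption: ASCII input only. Letters, digits and . _ ~ - are kept as-is;
# every other character becomes %XX (uppercase hex).
# The body runs in a subshell, so its variables do not leak to the caller.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s\n' "$out"
)

urlencode 'describe <http://www.wikidata.org/entity/Q866405>'
```

The output can then be appended after ?query= in the endpoint URL.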

The list of entity types

#Selects the first 20 types of entities encountered:

select distinct ?type where {
?thing a ?type
}
limit 20

curl "http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query=select%20distinct%20%3Ftype%20where%20%7B%0A%3Fthing%20a%20%3Ftype%0A%7D%0Alimit%2020" -H "Accept: text/csv"

Description of an entity (East Antarctica)

describe queries return all the triples in which the selected entity is the subject (position 1/3). Certain SPARQL endpoints (such as this one) also return the triples in which the entity is the object (position 3/3). Because the query result is a graph and not a table as with SPARQL Results, the default format is RDF/XML. JSON and Turtle (text/turtle) can also be requested.

describe <http://www.wikidata.org/entity/Q866405>

curl "http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query=describe%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ866405%3E" -H "Accept: text/turtle"

@ColinMaudry
Author

If you need a specific SPARQL request, you can request (uh uh) it here and I'll see what I can do.

@lucaswerkmeister

curl can also do the URL encoding for you:

curl -G https://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql --data-urlencode query='
select distinct ?type where {
?thing a ?type
}
limit 20
'

(Note the -G option to use a GET request)

@ColinMaudry
Author

Nice thanks, I didn't know about it

However, if I paste that in my console, it treats the line breaks as Enter keypresses and runs each line as a separate command. But that may work in a shell script.

My objective here was to give one-liners to paste in console.

@lucaswerkmeister

Weird, the shell shouldn’t start executing the command until the quote is closed… which shell are you using?

@ColinMaudry
Author

Apparently it's a Windows limitation 😬

Via SSH on my Ubuntu server, it worked.

I tried with both ConEmu and the standard command prompt from my Windows laptop: fails. In order to leave it cross-platform, I'll leave the URL encoding steps. Thanks for the input!
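
One way to keep the curl call on a single line without encoding the query by hand is to store the query in a file and let curl read it with the name@filename form of --data-urlencode. The sketch below assumes a POSIX shell; types.rq and wdqs_get are hypothetical names, and the endpoint is the https://query.wikidata.org/sparql address used later in this thread.

```shell
# Save the query to a file (types.rq is a hypothetical file name).
cat > types.rq <<'EOF'
select distinct ?type where {
?thing a ?type
}
limit 20
EOF

# wdqs_get: hypothetical wrapper. -G turns the encoded data into the query
# string of a GET request; "query@FILE" makes curl read FILE, URL-encode its
# content (newlines included) and send it as the "query" parameter.
wdqs_get() {
  curl -sS -G "https://query.wikidata.org/sparql" \
    -H "Accept: text/csv" \
    --data-urlencode "query@$1"
}

# Usage (needs network access):
# wdqs_get types.rq
```

The curl call itself uses only double quotes, so written on one line it should also paste into the Windows command prompt once the query file exists, although that has not been verified here.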

@vorachet

vorachet commented Jun 8, 2017

I tried to use https://query.wikidata.org/sparql with HTTP POST. It does not work. This manual (https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual) says:

"POST requests also accepts query in the body of the request, instead of URL, allowing to run larger queries without hitting URL length limit."

Does anyone use https://query.wikidata.org/sparql with POST?

@dr0i

dr0i commented Jun 23, 2017

@vorachet https://en.wikibooks.org/wiki/SPARQL/Wikidata_Query_Service says that PUT is forbidden. Normally, you don't need POST to run large queries - use GET. I do it the way @lucaswerkmeister described in the comment of 29 May 2015. Put this into a file.sh:

curl --header "Accept: application/sparql-results+json"  -G 'https://query.wikidata.org/sparql' --data-urlencode query='
SELECT ?s WHERE {
your complex query here
}'

and run it e.g. in bash with "bash file.sh".

@ppKrauss

ppKrauss commented Aug 2, 2018

One more example: generating a CSV file (countries.csv) with only item labels (the Wikidata URL prefix is cut) and some data.

curl -o countries.csv -G 'https://query.wikidata.org/sparql' \
     --header "Accept: text/csv"  \
     --data-urlencode query='
 SELECT DISTINCT ?iso2 ?qid ?osm_relid ?itemLabel
 WHERE {
  ?item wdt:P297 _:b0.
  BIND(strafter(STR(?item),"http://www.wikidata.org/entity/") as ?qid).
  OPTIONAL { ?item wdt:P1448 ?name .}
  OPTIONAL { ?item wdt:P297 ?iso2 .}
  OPTIONAL { ?item wdt:P402 ?osm_relid .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" . }
 }
 ORDER BY ?iso2
'

@ImperialRoyalKing999

Thank you all. I will try iso2

@fishfree

fishfree commented Jun 7, 2023

One more example: generating a CSV file (countries.csv) with only item labels (the Wikidata URL prefix is cut) and some data.

curl -o countries.csv -G 'https://query.wikidata.org/sparql' \
     --header "Accept: text/csv"  \
     --data-urlencode query='
 SELECT DISTINCT ?iso2 ?qid ?osm_relid ?itemLabel
 WHERE {
  ?item wdt:P297 _:b0.
  BIND(strafter(STR(?item),"http://www.wikidata.org/entity/") as ?qid).
  OPTIONAL { ?item wdt:P1448 ?name .}
  OPTIONAL { ?item wdt:P297 ?iso2 .}
  OPTIONAL { ?item wdt:P402 ?osm_relid .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" . }
 }
 ORDER BY ?iso2
'

It does not work now, even after I changed the query path to https://query.wikidata.org

@dr0i

dr0i commented Jun 9, 2023

@fishfree try again, maybe this was just a hiccup at WD. I tried your snippet and it works here.

@Podbrushkin

Podbrushkin commented Nov 24, 2023

Maybe this will be helpful too. Both commands print to the console and write to the $resp variable. From PowerShell:

curl -X POST -H 'Content-Type: application/sparql-query' -H 'Accept: text/csv' --data ($sparql -join ' ') https://query.wikidata.org/sparql | ConvertFrom-Csv -OutVariable resp
Invoke-RestMethod -Uri https://query.wikidata.org/sparql -Method Post -Body $sparql -Headers @{
    "Content-Type" = "application/sparql-query"
    "Accept" = "application/sparql-results+json"
} -OutVariable resp
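
For readers on a POSIX shell, the same POST request can be sketched with plain curl. wdqs_post is a hypothetical wrapper name, and the query is assumed to be stored as raw SPARQL text in a file; the headers mirror the ones in the PowerShell commands above.

```shell
# wdqs_post: hypothetical wrapper that POSTs the raw SPARQL text in file "$1".
# --data-binary sends the file unmodified (newlines kept) and implies POST,
# so no explicit -X POST is needed.
wdqs_post() {
  curl -sS "https://query.wikidata.org/sparql" \
    -H "Content-Type: application/sparql-query" \
    -H "Accept: text/csv" \
    --data-binary "@$1"
}

# Usage (needs network access; query.rq is a hypothetical file name):
# wdqs_post query.rq > result.csv
```

Because the query travels in the request body, no URL encoding is involved.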
