@lokal-profil
Last active March 11, 2021 08:37
OpenRefine: Matching a column of external identifiers to Wikidata entities
# Allows for reconciliation against Wikidata using _only_ an external identifier.
# This differs from normal reconciliation, which would use such a column together
# with other matching techniques such as label matching.
# Add column by fetching URLs...
return "https://query.wikidata.org/sparql?format=json&query=SELECT%20DISTINCT%20%3Fq%20%7B%20VALUES%20%3Fvalue%20%7B%20%22{val}%22%20%22{val}%22%20%22{val}%22%20%7D%20.%20%3Fq%20wdt%3AP{prop}%20%3Fvalue%20%7D".format(prop=1260, val=value)
# more legible version of the above
import urllib
query = "SELECT DISTINCT ?q {{ VALUES ?value {{ '{val}' '{val}' '{val}' }} . ?q wdt:P{prop} ?value }}".format(prop=1260, val=value)
return "https://query.wikidata.org/sparql?format=json&query={}".format(urllib.quote(query))
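# A minimal sketch of the same URL construction as a standalone function, for
# trying the query outside OpenRefine. It assumes Python 3, where Python 2's
# urllib.quote lives at urllib.parse.quote. The names build_query_url and
# SPARQL_ENDPOINT are hypothetical, and the value is listed once in VALUES
# on the assumption that repeating the same literal adds nothing under DISTINCT.

```python
from urllib.parse import quote  # Python 3 location of Python 2's urllib.quote

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query_url(value, prop=1260):
    """Return a query URL for items whose property P<prop> equals `value`."""
    # Doubled braces are literal braces in str.format.
    query = (
        "SELECT DISTINCT ?q {{ VALUES ?value {{ '{val}' }} . "
        "?q wdt:P{prop} ?value }}"
    ).format(prop=prop, val=value)
    # Percent-encode the whole query so it is safe as a URL parameter.
    return "{}?format=json&query={}".format(SPARQL_ENDPOINT, quote(query))


print(build_query_url("abc"))
```

# The sketch does not escape quote characters inside the identifier, so it is
# only suitable for identifiers that contain none.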
# Create new column based on column
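import json
bindings = json.loads(value)['results']['bindings']
# Return the Q-id of the first match, or None when the identifier matched nothing
return bindings[0]['q']['value'].split('/')[-1] if bindings else None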
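# A minimal sketch of the parsing step as a standalone function, so it can be
# checked outside OpenRefine. The function name extract_qid is hypothetical and
# both sample responses are hand-written in the shape the expression above
# expects; they are not real query results.

```python
import json


def extract_qid(response_text):
    """Return the Q-id of the first binding, or None if nothing matched."""
    bindings = json.loads(response_text)["results"]["bindings"]
    if not bindings:
        return None
    # The entity comes back as a URI; the Q-id is its last path segment.
    return bindings[0]["q"]["value"].split("/")[-1]


# Hand-written samples (assumed shape, illustrative values only).
sample = (
    '{"results": {"bindings": [{"q": {"type": "uri", '
    '"value": "http://www.wikidata.org/entity/Q42"}}]}}'
)
empty = '{"results": {"bindings": []}}'

print(extract_qid(sample))
print(extract_qid(empty))
```

# Only the first binding is used, so an identifier shared by several items
# resolves to whichever item the service happens to list first.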