@ettorerizza
Last active January 16, 2019 08:27
Example Python/Jython script to extract Wikipedia sitelinks from values reconciled with Wikidata in OpenRefine
import json
import urllib2

# Ordered list of languages to try until there is a match
langs = ["fr", "en", "de", "nl"]
# Wikidata ID (QID) of the item the cell was reconciled to
value = cell.recon.match.id

for lang in langs:
    wiki = lang + "wiki"
    url = ("https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&props=sitelinks/urls&ids=" +
           value +
           "&sitefilter=" +
           wiki)
    response = urllib2.urlopen(url)
    if response:
        data = json.loads(response.read())
        for i in data['entities'].values():
            try:
                return i['sitelinks'][wiki]['url']
            except (KeyError, TypeError):
                # No sitelink for this wiki: fall through to the next language
                pass
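The script above makes one request per language. As a rough sketch of an alternative, and not part of the original gist, the same lookup can be done with a single request by passing several wikis to `sitefilter` separated by `|`, then choosing the first match in language order. This version assumes plain Python 3 rather than OpenRefine's Jython, so it uses `urllib.parse` instead of `urllib2`. The helper names `build_url` and `pick_sitelink` are hypothetical, and the sample response is hand-written to mimic the shape the script above reads, not a captured API reply.

```python
import json
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def build_url(qid, langs):
    """Build one wbgetentities URL covering every candidate wiki (hypothetical helper)."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "props": "sitelinks/urls",
        "ids": qid,
        # Several wikis in one request, separated by "|"
        "sitefilter": "|".join(lang + "wiki" for lang in langs),
    }
    return API + "?" + urlencode(params)


def pick_sitelink(data, langs):
    """Return the first sitelink URL found, in language order, or None (hypothetical helper)."""
    for entity in data.get("entities", {}).values():
        sitelinks = entity.get("sitelinks")
        # Skip entities with a missing or non-mapping sitelinks field
        if not isinstance(sitelinks, dict):
            continue
        for lang in langs:
            link = sitelinks.get(lang + "wiki")
            if link and "url" in link:
                return link["url"]
    return None


# Offline demo: hand-written data in the assumed response shape
sample = json.loads(
    '{"entities": {"Q42": {"sitelinks": {"enwiki": {"site": "enwiki",'
    ' "title": "Douglas Adams",'
    ' "url": "https://en.wikipedia.org/wiki/Douglas_Adams"}}}}}'
)
print(build_url("Q42", ["fr", "en", "de", "nl"]))
print(pick_sitelink(sample, ["fr", "en", "de", "nl"]))
```

To use it against the live API, fetch `build_url(...)` with `urllib.request.urlopen`, parse the body with `json.loads`, and pass the result to `pick_sitelink`. Inside OpenRefine itself the original `urllib2` version remains the one to use.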
@ettorerizza
Example

[Screenshot: OpenRefine at 127.0.0.1:3333, 2019-01-16 09:14:49]
