@egonw
Last active January 1, 2017 10:11
Add DSSTox (EPA CompTox Dashboard) IDs to Wikidata using InChIKey equivalence
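// Select every Wikidata item that has an InChIKey (P235) but no DSSTox ID (P3117) yet.
sparql = """
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
# substr(..., 32) drops the 31-character http://www.wikidata.org/entity/ prefix, leaving the Q-identifier.
# ?dsstox only occurs inside MINUS and is never bound, so it is not projected.
SELECT (substr(str(?compound),32) as ?wd) ?key WHERE {
  ?compound wdt:P235 ?key .
  MINUS { ?compound wdt:P3117 ?dsstox . }
}
"""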
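// The rest of the script needs the query results, so stop early when offline
// instead of failing later on an undefined variable.
if (!bioclipse.isOnline()) {
  throw new IllegalStateException("No network connection: cannot query Wikidata")
}
results = rdf.sparqlRemote(
  "https://query.wikidata.org/sparql", sparql
)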
// Build a lookup table: InChIKey (column 1) -> Wikidata Q-identifier (column 0)
map = new HashMap()
for (i = 1; i <= results.rowCount; i++) {
  rowVals = results.getRow(i)
  map.put(rowVals[1], rowVals[0])
}
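// Walk the DSSTox dump and write one tab-separated statement line
// (QuickStatements-style) for every InChIKey that Wikidata already knows.
new File(bioclipse.fullPath("/CompToxDash/dsstox_20160701.tsv")).eachLine { line ->
  fields = line.split("\t")
  if (fields.length < 3) return // skip blank or truncated lines
  dsstox = fields[0]   // first column: DSSTox identifier
  inchikey = fields[2] // third column: InChIKey
  if (map.containsKey(inchikey)) {
    // item, property P3117, quoted value, then the source as S248 -> Q28061352
    ui.append("/CompToxDash/mappings.txt", map.get(inchikey) + "\tP3117\t\"${dsstox}\"\tS248\tQ28061352\n")
  }
}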
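
The matching step can be tried without Bioclipse or network access. The sketch below is plain Groovy and only illustrates the InChIKey join and the statement line written above. The helper name `toStatements`, the sample Q-identifiers, the DSSTox placeholders, and the InChIKeys are all made up for the example. The middle column of the TSV is also an assumption, since the script never reads it.

```groovy
// Hypothetical helper: joins TSV lines against an InChIKey -> Q-identifier map
// and returns the same tab-separated statement lines the script appends to mappings.txt.
List<String> toStatements(Map<String, String> qidByInchikey, List<String> lines) {
    def statements = []
    lines.each { line ->
        def fields = line.split("\t")
        if (fields.length < 3) return          // skip blank or truncated lines
        def qid = qidByInchikey[fields[2]]     // third column: InChIKey
        if (qid != null) {
            // item, P3117, quoted DSSTox ID, then the source as S248 -> Q28061352
            statements << "${qid}\tP3117\t\"${fields[0]}\"\tS248\tQ28061352".toString()
        }
    }
    return statements
}

// Made-up stand-in for the SPARQL result (not real Wikidata items).
def qidByInchikey = [
    "AAAAAAAAAAAAAA-AAAAAAAAAA-N": "Q0000001",
    "BBBBBBBBBBBBBB-BBBBBBBBBB-N": "Q0000002",
]

// Made-up stand-in for the DSSTox TSV file.
def lines = [
    ["DTXSID_EXAMPLE_1", "placeholder", "AAAAAAAAAAAAAA-AAAAAAAAAA-N"].join("\t"),
    ["DTXSID_EXAMPLE_2", "placeholder", "CCCCCCCCCCCCCC-CCCCCCCCCC-N"].join("\t"),
    "truncated line",
]

toStatements(qidByInchikey, lines).each { println it }
```

Only the first sample line yields a statement: the second InChIKey is not in the map, and the third line has too few columns.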