@nakao
Last active Dec 19, 2015
Fix invalid ChEMBL-RDF files. Put the script in the ChEMBL-RDF directory, then run it. The fixed files are written as "*.ttl.new".
# Files whose iotax:/ncbitax: prefixed names are rewritten as full IRIs.
files = ["chembl_16_biocmpt.ttl", "chembl_16_target.ttl", "chembl_16_targetcmpt.ttl"]

# Full IRI for an identifiers.org taxonomy ID.
def iotax(arg)
  "<http://identifiers.org/taxonomy/#{arg}>"
end

# Full IRI for an NCBI Taxonomy Browser entry.
def ncbitax(arg)
  "<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=#{arg}>"
end

files.each do |file|
  # The block form of File.open closes both handles even if an error is raised.
  File.open(file, "r") do |fin|
    File.open(file + ".new", "w") do |out|
      fin.each_line do |line|
        # gsub replaces every prefixed name on the line, not only the first.
        line = line.gsub(/iotax:(\d+)/) { iotax($1) }
        line = line.gsub(/ncbitax:(\d+)/) { ncbitax($1) }
        out.print(line)
      end
    end
  end
end
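
To check the substitution without touching the real files, the rewriting step can be tried on a single line. The following is a minimal sketch, not part of the original script: the helper name `expand_taxonomy_prefixes`, the constant `TAXONOMY_IRI_BASES`, and the sample line (including the `ex:organism` predicate) are made up for illustration, and only the two IRI patterns come from the script above.

```ruby
# Hypothetical helper (not in the original script): expands every
# iotax:/ncbitax: prefixed name in one line to a full IRI.
TAXONOMY_IRI_BASES = {
  "iotax"   => "http://identifiers.org/taxonomy/",
  "ncbitax" => "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id="
}.freeze

def expand_taxonomy_prefixes(line)
  # $1 is the prefix and $2 the numeric taxonomy ID of the current match.
  line.gsub(/\b(iotax|ncbitax):(\d+)/) do
    "<#{TAXONOMY_IRI_BASES[$1]}#{$2}>"
  end
end

# Made-up sample line, for illustration only.
sample = "  ex:organism iotax:9606 ;\n"
puts expand_taxonomy_prefixes(sample)
```

Running the helper over a few lines copied from one of the `.ttl` files and comparing the result with the matching `.ttl.new` output is a quick way to confirm that both prefixes are expanded as intended.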