@cmarat
Last active January 11, 2018 19:20
Convert Geonames RDF dump to n-triples.
'''
Created on 13 Nov 2014
@author: <https://github.com/cmarat>
Convert Geonames RDF dump [1] to n-triples.
Uncompress the dump and pass the file name as a command
line parameter, or pipe it into stdin.
[1] http://download.geonames.org/all-geonames-rdf.zip
'''
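import fileinput

import rdflib

# Only lines starting with '<?xml' hold an RDF/XML document; each is
# parsed on its own and all other lines in the dump are skipped.
xml_lines = (l for l in fileinput.input() if l.startswith('<?xml'))
with open('geonames.nt', 'w', encoding='utf-8') as fout:
    for line in xml_lines:
        g = rdflib.Graph()
        g.parse(data=line, format='xml')
        nt = g.serialize(format='nt')
        # Older rdflib releases return bytes here, newer ones return str.
        if isinstance(nt, bytes):
            nt = nt.decode('utf-8')
        fout.write(nt)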
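
The line filter in the script can be tried on its own, without rdflib or the full dump. This is a minimal sketch, assuming the uncompressed dump alternates a feature URL line with a one-line RDF/XML document; the `xml_records` helper and the two-record sample (including its URLs) are made up for illustration and are not part of the original script.

```python
import io


def xml_records(lines):
    """Yield only the lines that start an RDF/XML document."""
    for line in lines:
        if line.startswith('<?xml'):
            yield line


# Hypothetical excerpt: each record is a URL line followed by an XML line.
sample = io.StringIO(
    'https://sws.geonames.org/1/\n'
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF/>\n'
    'https://sws.geonames.org/2/\n'
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF/>\n'
)

records = list(xml_records(sample))
print(len(records))
```

Because the filter is a generator, the script never holds more than one record in memory, which matters for a dump of this size.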