@thibaudcolas
Created January 25, 2013 21:49
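#!/usr/bin/env bash
#
# Download the about.rdf document of every resource referenced in
# geonames_insee_reg_dept.rdf and merge them into one RDF/XML file.

# Extract the resource URIs; this assumes each URI occupies columns 25-56 of its rdf:about line.
grep 'rdf:about' geonames_insee_reg_dept.rdf | cut -c 25-56 | sort > reg_dept.txt
# Drop empty lines and append about.rdf to turn each URI into a download link.
grep . reg_dept.txt | sed 's/$/about.rdf/' > reg_dept_links.txt
# Fetch every link; with -O, wget concatenates all downloads into one file.
wget -i reg_dept_links.txt -O geonames_tmp.rdf
# Rebuild a single document: one XML declaration, one opening root tag,
# all the element bodies, then one closing root tag.
grep '?xml' geonames_tmp.rdf | uniq > geonames_reg_dept.rdf
grep '<rdf:RDF' geonames_tmp.rdf | uniq >> geonames_reg_dept.rdf
grep -v '?xml' geonames_tmp.rdf | grep -v 'rdf:RDF' >> geonames_reg_dept.rdf
grep '</rdf:RDF' geonames_tmp.rdf | uniq >> geonames_reg_dept.rdf
# Report the size of the merged file as a quick sanity check.
wc -l geonames_reg_dept.rdf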