@springmeyer
Last active March 9, 2021 20:05
Dedupe GeoJSON -> shapefile

First, convert the GeoJSON to SQLite

$ time ogr2ogr -f SQLite out.db 1447454472-debug.geojson 

real    2m31.904s
user    1m37.525s
sys     0m48.042s

Check the number of features

$ sqlite3 out.db "Select count(*) from ogrgeojson"
3199531

Run dedupe

$ time ogr2ogr -f SQLite dedupe.db out.db -sql "Select * from ogrgeojson group by from_node,to_node" -dialect SQLITE

real    0m16.296s
user    0m13.737s
sys     0m1.240s
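
To see what the `group by from_node,to_node` step does, here is a minimal sketch on a toy table. The `edges` table and its rows are hypothetical stand-ins for the `ogrgeojson` layer; only the `sqlite3` CLI is assumed. Grouping collapses rows that share the same (from_node, to_node) pair into one. Note that a reversed pair such as (20, 10) counts as different from (10, 20), and that with `Select *` SQLite picks the other column values from an arbitrary row in each group.

```shell
# Hypothetical toy table standing in for the ogrgeojson layer.
db=$(mktemp)
sqlite3 "$db" <<'SQL'
CREATE TABLE edges (id INTEGER, from_node INTEGER, to_node INTEGER);
INSERT INTO edges VALUES (1, 10, 20), (2, 10, 20), (3, 10, 30), (4, 20, 10);
SQL

# Total rows before grouping.
total=$(sqlite3 "$db" "SELECT count(*) FROM edges")

# Rows that would survive a GROUP BY on the node pair.
unique_pairs=$(sqlite3 "$db" "SELECT count(*) FROM (SELECT 1 FROM edges GROUP BY from_node, to_node)")

echo "total=$total unique=$unique_pairs"
rm -f "$db"
```

Running the same kind of count query against out.db before deduping would preview how many features the dedupe will keep.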

Yay, fewer features

$ sqlite3 dedupe.db "Select count(*) from 'select'"
1754049
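
As a quick sanity check, the two counts above can be compared with shell arithmetic. The variable names here are just for illustration; the numbers are the ones reported by the two `sqlite3` queries.

```shell
before=3199531   # count from out.db
after=1754049    # count from dedupe.db

removed=$((before - after))
percent=$((removed * 100 / before))   # integer division, rounds down

echo "removed $removed features (~$percent%)"
```

So the dedupe dropped 1445482 features, roughly 45% of the original set.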

Convert back to a format we can upload to mapbox.com

$ time ogr2ogr new.shp dedupe.db
Warning 6: Field timestamp create as date field, though DateTime requested.

real    0m17.298s
user    0m13.988s
sys     0m1.113s

$ zip -r -9 upload.zip new.*
  adding: new.dbf (deflated 89%)
  adding: new.prj (deflated 15%)
  adding: new.shp (deflated 81%)
  adding: new.shx (deflated 72%)

Then upload upload.zip to https://www.mapbox.com/studio/data
