@daviddias
Created December 19, 2016 02:42
peermaps p2p data format for ad-hoc openstreetmap extracts

peermaps data is ready

Peermaps is a project to bring OpenStreetMap data to the p2p web.

I've just finished a big part of this project: subdividing planet-latest.osm.pbf to support ad-hoc extracts. With ad-hoc extracts over p2p networks, you can download only the parts of planet osm that you need, without downloading the whole 34G file and processing it with tens of gigabytes of RAM. Using the peermaps data archive, you can build tile sets for the entire planet on very modest hardware. Every .o5m.gz file is at most 1M. More details below.

You can download the 38G peermaps dataset (or some subset thereof) with ipfs and dat:

on ipfs

Download (and help mirror!) the whole archive:

$ ipfs get QmXJ8KkgKyjRxTrEDvmZWZMNGq1dk3t97AVhF1Xeov3kB4

Browse using ipfs ls and ipfs cat:

$ ipfs ls QmXJ8KkgKyjRxTrEDvmZWZMNGq1dk3t97AVhF1Xeov3kB4
$ ipfs cat QmSd1tpbpboqXuJJxYGH8HPGqkDVPggZbf86gdzxsxLJzb | osmconvert -

on dat

Download (and help mirror!) the whole archive:

$ dat 04ed0b08ff595a992a594ad1ab624072646467ec7eda2dc40e4aa512e49cb196 osmtiles
$ ls osmtiles

To get at particular files in the archive, you can use the dat-js library.

processing info

Processing time: 68 hours, plus 2.5 hours to run ipfs add -r . on the output. I ran these calculations on my laptop, which is a fairly modest machine in terms of RAM, CPU, and disk.

I used these scripts to generate the data. The branching factor of the output is 16. The total archive size is 38G and there are 215836 .o5m.gz files along with 14389 meta.json files.
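As a quick sanity check, the counts above are consistent with a tree in which every directory holds one meta.json and exactly 16 branches, each branch being either a data file or a subdirectory. The sketch below assumes that layout; the variable names are illustrative only.

```python
# Consistency check on the archive counts quoted in the text.
# Assumption: every directory has one meta.json and exactly 16 branches,
# and each branch is either a .o5m.gz file or a subdirectory.
BRANCHING_FACTOR = 16
data_files = 215836   # .o5m.gz files (from the text)
directories = 14389   # one meta.json per directory (from the text)

# Every directory except the root is itself a branch of its parent.
subdirectories = directories - 1
total_branches = BRANCHING_FACTOR * directories

consistent = total_branches == data_files + subdirectories
print(total_branches, data_files + subdirectories, consistent)
```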

data format

The data consists of a nested set of self-similar directories. Each directory has a meta.json file which maps the branch numbers 0 through 15, inclusive, to a [west,south,east,north] bounding extent in longitude and latitude decimal degrees. Here's what meta.json looks like:

{"0":[-22.5,38.68218745348944,-16.875,41.01449966573052],
"1":[-22.5,41.01449966573052,-16.875,43.432536557789774],
"2":[-22.5,43.432536557789774,-16.875,45.95137432591568],
"3":[-22.5,45.95137432591568,-16.875,48.590377890729144],
"4":[-16.875,38.68218745348944,-11.25,41.01449966573052],
"5":[-16.875,41.01449966573052,-11.25,43.432536557789774],
"6":[-16.875,43.432536557789774,-11.25,45.95137432591568],
"7":[-16.875,45.95137432591568,-11.25,48.590377890729144],
"8":[-11.25,38.68218745348944,-5.625,41.01449966573052],
"9":[-11.25,41.01449966573052,-5.625,43.432536557789774],
"10":[-11.25,43.432536557789774,-5.625,45.95137432591568],
"11":[-11.25,45.95137432591568,-5.625,48.590377890729144],
"12":[-5.625,38.68218745348944,0,41.01449966573052],
"13":[-5.625,41.01449966573052,0,43.432536557789774],
"14":[-5.625,43.432536557789774,0,45.95137432591568],
"15":[-5.625,45.95137432591568,0,48.590377890729144]}
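To illustrate how these extents can be used, here is a minimal Python sketch that finds which branch contains a given point. The function name branches_containing is hypothetical (it is not part of peermaps), and only three entries from the meta.json above are reproduced to keep the example short.

```python
import json

def branches_containing(meta, lon, lat):
    """Return the branch numbers whose [west, south, east, north] extent contains the point."""
    hits = []
    for branch, (west, south, east, north) in meta.items():
        if west <= lon <= east and south <= lat <= north:
            hits.append(branch)
    # Branch keys are strings in meta.json, so sort them numerically.
    return sorted(hits, key=int)

# Three entries copied from the meta.json shown above.
meta = json.loads('''{"8":[-11.25,38.68218745348944,-5.625,41.01449966573052],
"9":[-11.25,41.01449966573052,-5.625,43.432536557789774],
"12":[-5.625,38.68218745348944,0,41.01449966573052]}''')

# A point at longitude -8.61, latitude 41.15 falls inside branch 9.
print(branches_containing(meta, -8.61, 41.15))
```

A point lying exactly on a shared edge matches both neighbouring branches, because the comparison is inclusive on all four sides.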

For each branch number $n, there is either a file $n.o5m.gz or another directory $n/ with its own sub-branches and meta.json. .o5m.gz files can be read by osmconvert and turned into osm xml, csv, osm pbf, or other formats.

One nice thing about this format is that you can walk the directory structure recursively, reading meta.json for a set of extents to build a list of files you can pass to osmconvert for an ad-hoc extract. You can stream this data from peers without having to download the entire archive.
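The recursive walk described above can be sketched in Python as follows. This is not the peermaps implementation: it assumes the archive (or the relevant part of it) is already on local disk, and the names intersects and find_extract_files are made up for this example.

```python
import json
import os

def intersects(a, b):
    """True if two [west, south, east, north] extents overlap (edges inclusive)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def find_extract_files(root, bbox):
    """Recursively collect the .o5m.gz files under root whose extent overlaps bbox."""
    with open(os.path.join(root, "meta.json")) as f:
        meta = json.load(f)
    files = []
    for branch in sorted(meta, key=int):
        if not intersects(meta[branch], bbox):
            continue  # prune: nothing below this branch can overlap bbox
        leaf = os.path.join(root, branch + ".o5m.gz")
        subdir = os.path.join(root, branch)
        if os.path.isfile(leaf):
            files.append(leaf)
        elif os.path.isdir(subdir):
            files.extend(find_extract_files(subdir, bbox))
    return files

# Example call, assuming a local mirror in ./osmtiles (not run here):
# files = find_extract_files("osmtiles", [-9.5, 38.6, -9.0, 38.9])
```

Because branches that miss the bounding box are skipped before descending, only the meta.json files along the matching paths are ever read. Over a p2p network the same walk would fetch each meta.json on demand instead of opening it from disk.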

next steps

There's a lot left to do to realize the dream of fully decentralized p2p cartography:

  • finish up the peermaps command to mirror, generate data from planet-latest.osm.pbf, and perform ad-hoc extracts
  • create tilesets and upload them to p2p networks
  • offline p2p web maps and embedded viewer with custom overlays and data viz
  • integrate edits from osm-p2p-db into the archive pipeline
  • offline p2p landsat / aerial photography layers
  • offline p2p routing/trip planning algorithms
  • bluetooth data replication