daviddias/peermaps-data.md

## peermaps-data.md

      
## peermaps data is ready

Peermaps is a project to bring OpenStreetMap data to the p2p web.
I've just finished a big part of this project: subdividing planet-latest.osm.pbf to support ad-hoc
extracts. With ad-hoc extracts over p2p networks, you can download only the parts of planet osm that you need
without having to download the whole 34G thing and process it using tens of gigabytes of RAM. Using the peermaps
data archive, you can build tile sets for the entire planet on very modest hardware. Every .o5m.gz file is at most 1M.
More details below.
You can download the 38G peermaps dataset (or some subset thereof) with ipfs or dat:

### on ipfs

Download (and help mirror!) the whole archive:
$ ipfs get QmXJ8KkgKyjRxTrEDvmZWZMNGq1dk3t97AVhF1Xeov3kB4
Browse using ipfs ls and ipfs cat:
$ ipfs ls QmXJ8KkgKyjRxTrEDvmZWZMNGq1dk3t97AVhF1Xeov3kB4
$ ipfs cat QmSd1tpbpboqXuJJxYGH8HPGqkDVPggZbf86gdzxsxLJzb | osmconvert -
### on dat

Download (and help mirror!) the whole archive:
$ dat 04ed0b08ff595a992a594ad1ab624072646467ec7eda2dc40e4aa512e49cb196 osmtiles
$ ls osmtiles
To get at particular files in the archive, you can use the dat-js library.
### processing info

Processing time: 68 hours, plus 2.5 hours to run `ipfs add -r .` on the result.
I ran these calculations on my laptop, which is a fairly modest machine in terms of RAM, CPU, and disk.
I used these scripts to generate the data. The branch factor of the output is 16. The total archive size is 38G and
there are 215836 .o5m.gz files along with 14389 meta.json files.
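
The two counts above can be cross-checked against the branch factor. This is a small sketch, assuming every directory holds exactly 16 entries and each entry is either a leaf file or a subdirectory (the function name is made up for illustration):

```python
BRANCH_FACTOR = 16
META_JSON_COUNT = 14389   # one meta.json per directory
O5M_FILE_COUNT = 215836   # leaf .o5m.gz files

def expected_leaves(directories: int, branch_factor: int) -> int:
    """Leaf count of a tree where every directory has branch_factor children."""
    # Every directory contributes branch_factor children in total.
    total_children = directories * branch_factor
    # All directories except the root are themselves somebody's child.
    child_directories = directories - 1
    return total_children - child_directories

print(expected_leaves(META_JSON_COUNT, BRANCH_FACTOR) == O5M_FILE_COUNT)  # True
```

Under that assumption, 14389 directories imply 16 × 14389 − 14388 = 215836 leaf files, which matches the reported file count.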
### data format

The data consists of a nested set of self-similar directories. Each directory has a meta.json file which
maps the branch numbers 0 through 15, inclusive, to a [west,south,east,north] bounding extent in longitude
and latitude decimal degrees. Here's what meta.json looks like:
{"0":[-22.5,38.68218745348944,-16.875,41.01449966573052],
"1":[-22.5,41.01449966573052,-16.875,43.432536557789774],
"2":[-22.5,43.432536557789774,-16.875,45.95137432591568],
"3":[-22.5,45.95137432591568,-16.875,48.590377890729144],
"4":[-16.875,38.68218745348944,-11.25,41.01449966573052],
"5":[-16.875,41.01449966573052,-11.25,43.432536557789774],
"6":[-16.875,43.432536557789774,-11.25,45.95137432591568],
"7":[-16.875,45.95137432591568,-11.25,48.590377890729144],
"8":[-11.25,38.68218745348944,-5.625,41.01449966573052],
"9":[-11.25,41.01449966573052,-5.625,43.432536557789774],
"10":[-11.25,43.432536557789774,-5.625,45.95137432591568],
"11":[-11.25,45.95137432591568,-5.625,48.590377890729144],
"12":[-5.625,38.68218745348944,0,41.01449966573052],
"13":[-5.625,41.01449966573052,0,43.432536557789774],
"14":[-5.625,43.432536557789774,0,45.95137432591568],
"15":[-5.625,45.95137432591568,0,48.590377890729144]}
For each branch number $n, there is either a file $n.o5m.gz or a
directory $n/ with its own sub-branches and meta.json. .o5m.gz files can be read by osmconvert and turned
into OSM XML, CSV, OSM PBF, or other formats.
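
To illustrate how a meta.json can be used, here is a minimal sketch that picks the branches whose extent overlaps a query box. It uses a four-entry excerpt of the sample above; the helper names `overlaps` and `matching_branches` and the query box are made up for this example, and extents that cross the antimeridian are not handled.

```python
import json

# Excerpt of the sample meta.json above (branches 8, 9, 12 and 13 only).
META_EXCERPT = json.loads("""
{"8":[-11.25,38.68218745348944,-5.625,41.01449966573052],
 "9":[-11.25,41.01449966573052,-5.625,43.432536557789774],
 "12":[-5.625,38.68218745348944,0,41.01449966573052],
 "13":[-5.625,41.01449966573052,0,43.432536557789774]}
""")

def overlaps(a, b):
    """True if two [west, south, east, north] extents share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def matching_branches(meta, query):
    """Branch numbers whose extent overlaps the query, in numeric order."""
    return sorted(int(n) for n, extent in meta.items() if overlaps(extent, query))

# Hypothetical query box around Lisbon: [west, south, east, north].
lisbon_area = [-9.5, 38.7, -9.0, 38.8]
print(matching_branches(META_EXCERPT, lisbon_area))  # [8]
```

Only branch 8 covers the query box, so only 8.o5m.gz (or the 8/ subdirectory) would need to be fetched.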
One nice thing about this format is that you can walk the directory structure recursively, reading each meta.json
and following only the branches whose extents overlap your area of interest, to build a list of files you can pass
to osmconvert for an ad-hoc extract. You can stream this data from peers without having to download the entire archive.
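
The recursive walk could look roughly like the sketch below. It assumes a local copy of the archive on disk and the layout described above; `collect_files` and `overlaps` are hypothetical names, not part of any peermaps tool. To stream from peers instead, the `open` and `os.path` calls would be replaced by network fetches.

```python
import json
import os

def overlaps(a, b):
    """True if two [west, south, east, north] extents share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def collect_files(root, query):
    """Return paths of all .o5m.gz files under root whose extent overlaps query."""
    with open(os.path.join(root, "meta.json")) as f:
        meta = json.load(f)
    found = []
    # Visit branches in numeric order so the result is deterministic.
    for branch, extent in sorted(meta.items(), key=lambda kv: int(kv[0])):
        if not overlaps(extent, query):
            continue  # skip this whole subtree
        leaf = os.path.join(root, branch + ".o5m.gz")
        subdir = os.path.join(root, branch)
        if os.path.isfile(leaf):
            found.append(leaf)
        elif os.path.isdir(subdir):
            # The branch is subdivided further; descend into it.
            found.extend(collect_files(subdir, query))
    return found
```

For example, `collect_files("osmtiles", [-9.5, 38.7, -9.0, 38.8])` would list the files covering that box in an archive downloaded to osmtiles. Because non-overlapping branches are skipped at every level, only a small part of the tree is ever read.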
### next steps

There's a lot left to do to realize the dream of fully decentralized p2p cartography:

* finish up the peermaps command to mirror, generate data from planet-latest.osm.pbf, and perform ad-hoc extracts
* create tilesets and upload them to p2p networks
* offline p2p web maps and embedded viewer with custom overlays and data viz
* integrate edits from osm-p2p-db into the archive pipeline
* offline p2p landsat / aerial photography layers
* offline p2p routing/trip planning algorithms
* bluetooth data replication