The goal of this dataset/project is to generate geospatial statistics from OpenStreetMap data and store them as tiles in JSON and PBF formats. These pre-computed tiles would enable data scientists to perform complex geospatial analysis at a country-wide scale with minimal CPU processing.
Thousands of different geospatial statistics could be generated from a simple OSM dataset within a given bounding box (BBox). Here is a list of the most useful geospatial statistics that would be defined:
highway=* (path, residential, secondary, etc...)
- length (total length calculated in kilometers)
- nodes (number of single vertices)
- features (each individual feature)
landuse=* (residential, industrial, commercial, etc...)
- area (sqkm)
- nodes
- features
natural=* (wood, water, grass, sand, etc...)
- area (sqkm)
- nodes
- features
amenity=* (school, bar, college, etc...)
- area (sqkm)
- nodes
- features
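As a rough illustration of how the highway statistics above could be aggregated, here is a minimal sketch. It assumes each way arrives as a `(highway_value, [(lon, lat), ...])` pair, measures length with the haversine formula on a spherical Earth, and counts a node once per way it appears in. The function names and the input shape are hypothetical, and clipping ways to the tile boundary is left out.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float]  # (lon, lat) in degrees; assumed input convention

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance in kilometers between two (lon, lat) points."""
    lon1, lat1 = math.radians(a[0]), math.radians(a[1])
    lon2, lat2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def highway_stats(ways: Sequence[Tuple[str, List[Coord]]]) -> Dict[str, dict]:
    """Aggregate length (km), nodes and features per highway=* value."""
    stats: Dict[str, dict] = {}
    for kind, coords in ways:
        entry = stats.setdefault(kind, {"length": 0.0, "nodes": 0, "features": 0})
        # Sum the length of every segment between consecutive vertices.
        entry["length"] += sum(haversine_km(p, q) for p, q in zip(coords, coords[1:]))
        # Nodes shared between ways are counted once per way in this sketch.
        entry["nodes"] += len(coords)
        entry["features"] += 1
    return stats
```

Area statistics for landuse, natural and amenity polygons would follow the same aggregation pattern, with a polygon-area routine in place of the segment sum.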
Simplified JSON Example
{
  "tile": [655, 1582, 12],
  "quadtree": "023010203331",
  "highway": {
    "length": 13.23,
    "path": {
      "length": 11,
      "nodes": 10,
      "features": 2
    },
    "residential": {
      "length": 2.23,
      "nodes": 15,
      "features": 3
    }
  },
  "landuse": {
    "area": 0.4,
    "residential": {
      "area": 0.3,
      "nodes": 300,
      "features": 1
    },
    "commercial": {
      "area": 0.1
    }
  }
}
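The "quadtree" string in the example looks like a quadkey in the style used by Bing Maps: one base-4 digit per zoom level, built by interleaving the bits of the tile's x and y. The sketch below assumes the "tile" array is ordered [x, y, zoom]; under that assumption it reproduces the example value. The function names are illustrative only.

```python
from typing import Tuple


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Interleave the bits of x and y, most significant first, into base-4 digits."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1  # x contributes the low bit of the digit
        if y & mask:
            digit += 2  # y contributes the high bit of the digit
        digits.append(str(digit))
    return "".join(digits)


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Inverse of tile_to_quadkey: recover (x, y, zoom) from the digit string."""
    x = y = 0
    zoom = len(quadkey)
    for i, ch in enumerate(quadkey):
        mask = 1 << (zoom - i - 1)
        digit = int(ch)
        if digit & 1:
            x |= mask
        if digit & 2:
            y |= mask
    return x, y, zoom


print(tile_to_quadkey(655, 1582, 12))  # prints 023010203331
```

A useful property of this encoding is that a parent tile's quadkey is a prefix of every one of its children's quadkeys, which makes it convenient as a lookup key across zoom levels.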
Geospatial processing can be very CPU-intensive and can slow an analysis to a crawl if done incorrectly. With this approach, one could focus on the aggregated geospatial properties without having to process all the geometry every time the analysis runs; last-minute tweaking of an algorithm would take seconds instead of hours of re-processing. Using these aggregated statistics, you could easily define patterns/signatures per tile and let software recognize the same pattern across an entire country. A similar use case would be training an AI system. Examples of such patterns:
- Pokemon Go (lakes, paths, and parks all in one small residential area).
- High volume of non-orthogonalized buildings (typically from new users not using the Q or S shortcut to square buildings).
- High percentage of overlapping buildings (poor import or double import).
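A signature like the first one above could be expressed as a simple predicate over a tile's aggregated statistics. The sketch below assumes the nested layout of the simplified JSON example, extended with a "natural" section; the function names and every threshold are hypothetical placeholders, not values proposed by this project.

```python
from typing import Any, Dict


def get_stat(tile: Dict[str, Any], *keys: str, default: float = 0.0) -> float:
    """Walk nested keys, returning `default` when any level is missing."""
    node: Any = tile
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def matches_lake_path_residential(tile: Dict[str, Any]) -> bool:
    """Hypothetical signature: paths and water inside a residential area."""
    return (
        get_stat(tile, "highway", "path", "length") >= 1.0          # km, assumed threshold
        and get_stat(tile, "natural", "water", "area") >= 0.05      # sqkm, assumed threshold
        and get_stat(tile, "landuse", "residential", "area") >= 0.2  # sqkm, assumed threshold
    )
```

Because the predicate reads only pre-aggregated numbers, changing a threshold and re-scanning every tile involves no geometry processing at all.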
Multiple OSM statistics tiles shall be created for each zoom level starting from zoom 12; each additional zoom level divides the area covered by a tile's BBox by four. Being able to use tiles at different zoom levels would be beneficial: tiles can be eliminated quickly when their parent tile at a lower zoom level does not meet the minimum requirements.
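The parent/child elimination described above can be sketched as a recursive descent through the tile pyramid. This assumes the usual slippy-map (x, y, zoom) scheme, where each tile splits into four children at the next zoom level; `meets_minimum` is a hypothetical caller-supplied predicate that would look up a tile's statistics and decide whether it is worth descending into.

```python
from typing import Callable, List, Tuple

Tile = Tuple[int, int, int]  # (x, y, zoom)


def parent_tile(x: int, y: int, zoom: int) -> Tile:
    """The tile one zoom level up that contains (x, y, zoom)."""
    return (x >> 1, y >> 1, zoom - 1)


def child_tiles(x: int, y: int, zoom: int) -> List[Tile]:
    """The four tiles one zoom level down that together cover (x, y, zoom)."""
    return [(2 * x + dx, 2 * y + dy, zoom + 1) for dy in (0, 1) for dx in (0, 1)]


def surviving_tiles(root: Tile, max_zoom: int,
                    meets_minimum: Callable[[Tile], bool]) -> List[Tile]:
    """Return tiles at max_zoom whose ancestors (and themselves) all pass the check."""
    if not meets_minimum(root):
        return []  # the whole subtree is skipped without visiting any child
    if root[2] >= max_zoom:
        return [root]
    result: List[Tile] = []
    for child in child_tiles(*root):
        result.extend(surviving_tiles(child, max_zoom, meets_minimum))
    return result
```

Rejecting a single tile at zoom 12 in this way avoids inspecting its 4 children at zoom 13, 16 at zoom 14, and so on.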