OpenStreetMap data is heavily normalized, which makes it very hard to process. Its data model, patterned after a relational database, seems to have missed the second half of the proverb "Normalize until it hurts; denormalize until it works."
Each node has an ID, and every way and relation references nodes by that ID. This means every data consumer must keep an enormous cache mapping 8 billion node IDs to lat,lng pairs while processing input data, even though in most cases the node ID is discarded right after parsing.
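To make the cost concrete, here is a minimal sketch of the two-pass pattern consumers are forced into. The data and names are illustrative; a real planet file holds billions of nodes, so the dict below stands in for a cache that in practice needs tens of gigabytes:

```python
# Pass 1: cache every node's coordinates, keyed by node ID.
# (In real pipelines this cache must hold ~8 billion entries.)
nodes = [
    (1001, 52.5200, 13.4050),
    (1002, 52.5201, 13.4060),
    (1003, 52.5203, 13.4075),
]
coord_cache = {node_id: (lat, lng) for node_id, lat, lng in nodes}

# Pass 2: ways carry only node IDs, so every reference must be
# resolved through the cache to reconstruct the geometry.
way = {"id": 2001, "node_refs": [1001, 1002, 1003]}
geometry = [coord_cache[ref] for ref in way["node_refs"]]

# Once the geometry is built, the node IDs serve no further purpose.
print(geometry)
```

A denormalized format that inlined coordinates into ways would skip the first pass, and the cache, entirely.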
I would like to propose a new, easy-to-process data structure for both bulk downloads and streaming update use cases.
- YES -- Data consumers who transform OSM data into something else, e.g. tiles, shapes, analytical reports.