Before diving too deeply into the friction points of working with archives of earth observation data in xarray, let's look at a near-ideal case from the earth systems world. In the notebook here we demonstrate how zarr's consolidated metadata option, which gathers the dimension and chunk reference information in one place, lets a massive dataset's dimensions and variables load extremely quickly. With this consolidated metadata available to reference chunks on disk, we can use xarray's dask integration to lazily load chunks in parallel with normal xarray operations and perform our calculations using dask's blocked algorithm implementations. Gravy.
But the earth observation story is more complicated. Not everything lives in standardized file containers and, more importantly, our grid coordinate systems are "all over the map" :] Here are some of the current challenges:
- Consolida