@mdsumner
Created April 5, 2024 03:08
https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140/12?u=michael_sumner
Possible response:
I just don't think xarray has any business here: it should read the arrays and represent them as they are. Resolving grids to regular form from shear/affine transforms, GCPs/RPCs, or geolocation arrays is the job of the warper API. By the time you're in xarray it's too late; you should go back to the sources and stream them through the warper first, where GDAL has already identified (or been told) which georeferencing scheme is active, and target either virtual or materialized datasets.

That's why I baulked at these degenerate rectilinear coordinate arrays in netCDF and then in xarray; it's a suboptimal situation when I can see that a four-value extent is enough. (Even in GDAL, 1D coordinate arrays are tagged as geolocation arrays, ready for the warper, just as 2D coordinate arrays, GCPs, RPCs, or the geotransform are.) I think you're suggesting that xarray should model something like GDAL's regridding/warper engine within itself; I think the sources should instead be better arranged outside of it, possibly with kerchunk-like virtualization.

A major problem in the xarray family is that GDAL is mostly seen via rasterio, which is simply a downstream binding to the library in one language.
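A minimal sketch of the "four-value extent is enough" point: for a regular grid, the 1D coordinate arrays that netCDF materializes (and xarray then carries around) are fully recoverable from the extent plus the grid shape. The function name and the example values here are illustrative, not from any real dataset.

```python
def coords_from_extent(xmin, xmax, ymin, ymax, ncol, nrow):
    """Cell-centre coordinates of a regular grid, from extent and shape.

    This reconstructs the degenerate rectilinear 1D coordinate arrays
    that a netCDF file would store explicitly; for a regular grid they
    carry no information beyond the four extent values and two counts.
    """
    dx = (xmax - xmin) / ncol
    dy = (ymax - ymin) / nrow
    xs = [xmin + dx * (i + 0.5) for i in range(ncol)]
    # y runs top-down in image/raster order, so start from ymax
    ys = [ymax - dy * (j + 0.5) for j in range(nrow)]
    return xs, ys

# Hypothetical global grid: 4 columns x 2 rows of 90-degree cells
xs, ys = coords_from_extent(0.0, 360.0, -90.0, 90.0, 4, 2)
print(xs)  # [45.0, 135.0, 225.0, 315.0]
print(ys)  # [45.0, -45.0]
```

The same six numbers are what GDAL's geotransform encodes (plus the shear terms, zero for a north-up grid), which is why the warper can regenerate or consume these coordinates without them ever being stored as arrays.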