@mdsumner
Created April 5, 2024 03:08
https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140/12?u=michael_sumner
Possible response:
I just don't think xarray has any business here: it should read the arrays and represent them as they are. Resolving grids to regular form from shear/affine transforms, GCPs/RPCs, or geolocation arrays is the job of the warper API. By the time you're in xarray it's too late; you should go back to the sources and stream them through the warper first, where GDAL has already identified (or been told) which georeferencing scheme is active, and target either virtual or materialized datasets.

That's why I baulked at these degenerate rectilinear coordinate arrays in netCDF and then in xarray; it's a suboptimal situation when I can see that a four-value extent is enough. (Even in GDAL, 1D coordinate arrays are tagged as geolocation arrays, ready for the warper, just as 2D coordinate arrays, GCPs, RPCs, or the geotransform are.) I think you're suggesting that xarray should model something like GDAL's regridding/warper engine within itself; I think the sources should instead be better arranged outside of it, possibly with kerchunk-like virtualization.

A major problem in the xarray family is that GDAL is mostly seen via rasterio, which is simply a downstream binding to the library in one language.
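A minimal sketch of the "four-value extent is enough" point: for a regular grid, the 1D coordinate arrays that netCDF materializes (and xarray then carries around) are fully recoverable from the extent plus the grid shape. The function name and the example values here are illustrative, not from any real dataset.

```python
def coords_from_extent(xmin, xmax, ymin, ymax, ncol, nrow):
    """Cell-centre coordinates of a regular grid, from extent and shape.

    This reconstructs the degenerate rectilinear 1D coordinate arrays
    that a netCDF file would store explicitly; for a regular grid they
    carry no information beyond the four extent values and two counts.
    """
    dx = (xmax - xmin) / ncol
    dy = (ymax - ymin) / nrow
    xs = [xmin + dx * (i + 0.5) for i in range(ncol)]
    # y runs top-down in image/raster order, so start from ymax
    ys = [ymax - dy * (j + 0.5) for j in range(nrow)]
    return xs, ys

# Hypothetical global grid: 4 columns x 2 rows of 90-degree cells
xs, ys = coords_from_extent(0.0, 360.0, -90.0, 90.0, 4, 2)
print(xs)  # [45.0, 135.0, 225.0, 315.0]
print(ys)  # [45.0, -45.0]
```

The same six numbers are what GDAL's geotransform encodes (plus the shear terms, zero for a north-up grid), which is why the warper can regenerate or consume these coordinates without them ever being stored as arrays.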