@mdsumner
Created March 1, 2024 21:08

it's almost right, but stored in float32 with fuzz - degenerate rectilinear is a terrible way to store a regular grid ... is it regular? did they make a mistake? the coords and the metadata don't agree, so it's impossible to be sure, and that's normal in netcdf. a regular grid is six numbers (maybe eight) and a crs. sadly xarray has elevated this practice to world standard - I appreciate odc has helpers, but that's python - R has helpers too of course, but I like to keep as much as possible in GDAL so it's language-agnostic
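To illustrate the float32 "fuzz" point, here is a minimal sketch using only the standard library. It assumes cell-centre longitudes starting at -179.995 with a 0.01 step (an illustrative guess, not the file's actual values) and shows that after a float32 round trip the steps between neighbouring coordinates are no longer all identical.

```python
from array import array

def float32_steps(start, step, n):
    """Steps between consecutive coordinates after a float32 round trip."""
    exact = [start + step * i for i in range(n)]
    # typecode 'f' is a C float, i.e. float32 on common platforms
    stored = array("f", exact)
    return [stored[i + 1] - stored[i] for i in range(n - 1)]

# Assumed values for illustration only: 36000 cell centres at 0.01 spacing
steps = float32_steps(-179.995, 0.01, 36000)
spread = max(steps) - min(steps)
print(min(steps), max(steps), spread)
```

A non-zero spread is why a tool that tests the coordinate array for constant spacing can decide the grid is not regular, even though the intent was almost certainly a plain 0.01 degree grid.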

@mdsumner

mdsumner commented Mar 1, 2024

issue 127 is to me a foundational problem with xarray (which it inherits from netcdf), which is why I don't use it for this task (interested to see what Alex lands on though) - everything works perfectly well in osgeo.gdal, and xarray can do whatever it needs downstream like any other tool in this scene. I'll hit up the providers with feedback, when I have all the implications and details lined up, but it seems no one else bothered to do that in 22 years 🤔

@mdsumner

mdsumner commented Mar 1, 2024

I think your description complicates it a bit: the coords are on the right edge of an implicit "area cell" and there's numeric fuzz in them (I've literally seen it treated as a curvilinear grid because of that, and it's not even rectilinear) - it's just a grid on -/+180, -/+89.995 at 0.01 res with 36000x17999 cells, which accounts for a half missing row at top and bottom - the right fix is to set a clean transform and ignore the coords, or push it through the warper api and have it resolve with a slight change. xarray being stuck with the degenerate rectilinear c
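The "clean transform" above can be sketched as the usual six-number GDAL-style geotransform derived from the stated extent and grid size. The function name is hypothetical, and a north-up grid with no rotation is assumed.

```python
def geotransform_from_extent(xmin, xmax, ymin, ymax, ncols, nrows):
    """Six-number affine transform (GDAL ordering) for a north-up regular grid."""
    xres = (xmax - xmin) / ncols
    yres = (ymax - ymin) / nrows
    # (top-left x, pixel width, row rotation, top-left y, column rotation, -pixel height)
    return (xmin, xres, 0.0, ymax, 0.0, -yres)

# Extent and shape as described in the comment above
gt = geotransform_from_extent(-180.0, 180.0, -89.995, 89.995, 36000, 17999)
print(gt)
```

With these six numbers and the grid shape every cell edge and centre follows exactly, so the stored coordinate arrays are not needed at all.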

@mdsumner

mdsumner commented Mar 1, 2024

it's not possible to determine what they intended, and the netcdf metadata attributes disagree with the actual coords in there - I'll hit them up with questions when I have the story laid out fully. it's easy to fix - it's an upfront a_ullr or a_gt metadata assignment for gdal, but there's a whole family of stuff to fix as well, like changing the offset and scale to give celsius not F (so lazy load can act for longer), and actually specifying the crs - I don't know how netcdf assigns fixes for metadata, but with gdal it's just -a_ args with VRT or the vrt:// protocol - also I wonder: if you try to stream it, does the netcdf side use #mode=bytes, or does it use gdal vsicurl and re-generate the lon lat coords from gdal's slightly off transform? when I do it with osgeo.gdal I stream it, and there's no crufty coord arrays used at all
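As a rough sketch of the vrt:// approach, the helper below builds a connection string that assigns the bounds and a crs up front. The helper name and `example.nc` are placeholders, EPSG:4326 is an assumption (the thread does not say which crs applies), and my understanding is that the `a_ullr` and `a_srs` options in vrt:// need a fairly recent GDAL (3.7 or later, as far as I recall). The scale/offset fix is left out because the real values are not given here.

```python
def vrt_connection(source, ullr, srs):
    """Build a vrt:// connection string assigning bounds and crs (hypothetical helper)."""
    ulx, uly, lrx, lry = ullr
    return f"vrt://{source}?a_ullr={ulx},{uly},{lrx},{lry}&a_srs={srs}"

# 'example.nc' is a placeholder path; EPSG:4326 is an assumed crs
dsn = vrt_connection("example.nc", (-180, 89.995, 180, -89.995), "EPSG:4326")
print(dsn)

# With GDAL's python bindings installed, this could then be opened with:
#   from osgeo import gdal
#   ds = gdal.Open(dsn)
```

The point of doing it this way is that the fix lives in the data source description, so any GDAL-based tool in any language picks up the same corrected georeferencing.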
