Skip to content

Instantly share code, notes, and snippets.

@mdsumner
Last active March 14, 2024 04:24
Show Gist options
  • Save mdsumner/66aac8e54b54faa24e14b63f396dca32 to your computer and use it in GitHub Desktop.
Save mdsumner/66aac8e54b54faa24e14b63f396dca32 to your computer and use it in GitHub Desktop.

GHRSST has a bit of a broken grid, the coordinates are degenerate rectilinear (i.e. completely described by the grid in xmin,xmax,ymin,ymax and the dimensions 36000x17999 but they store the longitude and latitude 1D coordinate arrays materalized as 32-bit floating point with resultant noise). It's not clear how the grid should be aligned exactly, and the metadata attributes list -180,180, -90,90 as the valid range but this can't be correct at 0.01 degree resolution for that size grid).

(ncdump -h listed below)

## replace this local file with the one at the new earthdata store commented out (but you need your earthdata creds, or Authorization Bearer token)
ncsrc <- "/rdsi/PUBLIC/raad/data/podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/152/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
## https://cmr.earthdata.nasa.gov/virtual-directory/collections/C1996881146-POCLOUD/temporal/2002/05/31
## https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc

## read the lon,lat 1D arrays from the file (every file stores these 36000 lons and 17999 lats with ostensibly 0.01 spacing
library(RNetCDF)
con <- open.nc(ncsrc)

lat <- var.get.nc(con, "lat")
lon <- var.get.nc(con, "lon")
close.nc(con)

## weirdly right-edge aligned
print(range(lon))
#> [1] -179.99  180.00

## these are ok, but we miss half a cell from -90/90 (hence the 17999 thing, but why did they do that?)
print(range(lat))
#> [1] -89.99  89.99

## there's noise in the values
plot(diff(lon), pch = ".")
format(range(diff(lon)), digits = 16)
#> [1] "0.0099945068359375" "0.0100097656250000"

## and it's way more noise than in Float64
points(diff(fixlon <- seq(-179.995, 179.995, by = 0.01)), pch = ".",  col = "red")

format(range(diff(fixlon)), digits = 16)
#> [1] "0.009999999999990905" "0.010000000000047748"
range(fixlon); length(fixlon)
#> [1] -179.995  179.995
#> [1] 36000

gdalex <- vapour::vapour_raster_info(sprintf("NetCDF:%s:analysed_sst", ncsrc))$extent

fixex <- c(-180, -180, -89.995, 89.995)

## ewk
format(gdalex, digits = 16)
#> [1] "-179.9950055" " 180.0050000" " -89.9949979" "  89.9949979"

Now we plot the points - see description above the figure.

par(mfrow = c(2, 2))

## UL --------------------------------------------------------------------------
plot(NA, xlim = c(-180, -179.95), ylim = c(89.95, 90), asp = 1, xlab = "lon", ylab = "lat")
title("UL: upper left")

abline(v = -180, h = 90)
abline(h = fixex[3:4], lty =  2)
points(expand.grid(head(lon), tail(lat)))
vaster::plot_extent(gdalex, border = "red", add = TRUE)


## UR --------------------------------------------------------------------------
plot(NA, xlim = c(179.95, 180), ylim = c(89.95, 90), asp = 1, xlab = "lon", ylab = "lat")
title("UR: upper right")

abline(v = 180, h = 90)
abline(h = fixex[3:4], lty =  2)
points(expand.grid(tail(lon), tail(lat)))
vaster::plot_extent(gdalex, border = "red", add = TRUE)
## LL --------------------------------------------------------------------------
plot(NA, xlim = c(-180, -179.95), ylim = c(-90, -89.95), asp = 1, xlab = "lon", ylab = "lat")
title("LL: lower left")

abline(v = -180, h = -90)
abline(h = fixex[3:4], lty =  2)
points(expand.grid(head(lon), head(lat)))
vaster::plot_extent(gdalex, border = "red", add = TRUE)
## LR --------------------------------------------------------------------------
plot(NA, xlim = c(179.95, 180), ylim = c(-90, -89.95), asp = 1, xlab = "lon", ylab = "lat")
abline(v = 180, h = -90)
title("LR: lower right")

abline(h = fixex[3:4], lty =  2)
points(expand.grid(tail(lon), head(lat)))
vaster::plot_extent(gdalex, border = "red", add = TRUE, xlab = "lon", ylab = "lat")

These points are the implicit pairs by expanding lon,lat from the file and compare what GDAL derives as the extent (red), and the vertical black lines is -180,180,-90,90. The hatched line is what we think should be assigned (an implicit missing row, split half at the top and bottom) - close enough to GDAL but with the longitude fixed and the extent made tidy (resolution 0.01, aligned to 0).

Created on 2024-03-02 with reprex v2.0.2

@mdsumner
Copy link
Author

mdsumner commented Mar 2, 2024

to fix all that up for a GDAL reader I do

'vrt://{ncsrc}?a_ullr=-180,89.995,180,-89.995&a_srs=EPSG:4326&a_offset=25&a_scale=0.001&sd_name=analysed_sst'

the a_scale/offset mean we get Celsius from the Int16 rather than Fahrenheit, the a_ullr fixes the regular extent and avoids all the numeric fuzz and redundancy of the lon,lat arrays, sd_name means we don't need the 'NETCDF:dsn:variable' prefix:suffix stuff for the GDAL subdataset (but only since GDAL 3.9.0), and a_srs means we can churn through the warper for lovely new grids of any extent and dimension and crs we like.

@mdsumner
Copy link
Author

mdsumner commented Mar 2, 2024

to bring home the point about the single precision (I think this is right, appreciate if anyone points out a mistake or misunderstanding)

Generate the coord values first in double precision, then in single (in R we have to return doubles, but the calc is done in float for the second one).

library(cpp11)
cpp_function('
doubles coord64(double start, int len, double step) {
   writable::doubles out = writable::doubles(len); 
   for (int i = 0; i < len; i++) {
     out[i] = start + i * step; 
   }
   return out ;
}
             ')

range(lon64 <- coord64(-179.99, 36000, 0.01))
#> [1] -179.99  180.00
format(range(diff(lon64)), digits = 16)
#> [1] "0.009999999999990905" "0.010000000000047748"
range(lat64 <- coord64(-89.99, 17999, 0.01))
#> [1] -89.99  89.99
format(range(diff(lat64)), digits = 16)
#> [1] "0.009999999999990905" "0.010000000000019327"

cpp_function('

doubles coord32(float start, int len, float step) {
  writable::doubles out = writable::doubles(len); 
  for (int i = 0; i < len; i++) {
    out[i] = start + i * step; 
  }
  return out ;
}
')

range(lon32 <- coord32(-179.99, 36000, 0.01))
#> [1] -179.99  180.00
format(range(diff(lon32)), digits = 16)
#> [1] "0.009979248046875" "0.010009765625000"
range(lat32 <- coord32(-89.99, 17999, 0.01))
#> [1] -89.99  89.99
format(range(diff(lat32)), digits = 16)
#> [1] "0.0099945068359375" "0.0100097656250000"

Created on 2024-03-02 with reprex v2.0.2

@mdsumner
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment