@mdsumner
Created May 29, 2024 00:53

Getting mixed success opening Zarr stores through the NetCDF library's nczarr mode (RNetCDF, tidync)

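library(RNetCDF)

## Zarr store located inside the installed rnz package
f <- system.file("extdata/bcsd.zarr/", package = "rnz", mustWork = TRUE)

## open it via the NetCDF library's nczarr mode
nc <- open.nc(sprintf("file://%s#mode=nczarr,file", f))
print.nc(nc)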


## massive GeoTIFF
src <- "/vsicurl/https://gebco2023.s3.valeria.science/gebco_2023_land_cog.tif"

## define a target extent query for gdal
projwin <- c("-projwin", 140,  -35, 152, -45)
zarrlocal <- tempfile(fileext = ".zarr")
gdalraster::translate(src, zarrlocal, cl_arg = c(projwin, "-of", "ZARR"))

nczarr <- sprintf("file://%s#mode=nczarr,file", zarrlocal)
library(RNetCDF)
nc <- open.nc(nczarr)
print.nc(nc)
## print.nc() errors part-way through the listing (output below),
## and this store will cause tidync to abort
# netcdf netcdf4 {
#   dimensions:
#     .zdim_2880 = 2880 ;
#     .zdim_2400 = 2400 ;
#     variables:
#       NC_DOUBLE X(.zdim_2880) ;
#     NC_DOUBLE Y(.zdim_2400) ;
#     Error in var.inq.nc(x, id) : 
#       NetCDF: Not a valid data type or _FillValue type mismatch
#     

## now an actual NetCDF
dsn <- "/vsicurl/https://dapds00.nci.org.au/thredds/fileServer/gb6/BRAN/BRAN2020/daily/ocean_temp_2023_12.nc"
zarrlocal <- tempfile(fileext = ".zarr")
zarr <- sprintf("file://%s#mode=nczarr,file", zarrlocal)

## write a small view of the temp array to a local Zarr store
sf::gdal_utils(
  "mdimtranslate", dsn, zarrlocal,
  options = c("-of", "ZARR", "-array", "name=temp,view=[1:2,2:3,:,:]")
)


nc <- open.nc(zarr)
print.nc(nc)
# netcdf netcdf4 {
#   dimensions:
#     .zdim_1 = 1 ;
#     .zdim_1500 = 1500 ;
#     .zdim_3600 = 3600 ;
#     variables:
#       NC_DOUBLE subset_Time_1_1_1(.zdim_1) ;
#     NC_CHAR subset_Time_1_1_1:bounds = "Time_bnds" ;
#     NC_CHAR subset_Time_1_1_1:calendar = "GREGORIAN" ;
#     NC_CHAR subset_Time_1_1_1:calendar_type = "GREGORIAN" ;
#     NC_CHAR subset_Time_1_1_1:cartesian_axis = "T" ;
#     NC_CHAR subset_Time_1_1_1:cell_methods = "Time: mean" ;
#     NC_CHAR subset_Time_1_1_1:long_name = "Time" ;
#     NC_CHAR subset_Time_1_1_1:units = "days since 1979-01-01 00:00:00" ;
#     NC_DOUBLE subset_st_ocean_2_1_1(.zdim_1) ;
#     NC_CHAR subset_st_ocean_2_1_1:cartesian_axis = "Z" ;
#     NC_CHAR subset_st_ocean_2_1_1:edges = "st_edges_ocean" ;
#     NC_CHAR subset_st_ocean_2_1_1:long_name = "tcell zstar depth" ;
#     NC_CHAR subset_st_ocean_2_1_1:positive = "down" ;
#     NC_CHAR subset_st_ocean_2_1_1:units = "meters" ;
#     NC_SHORT temp(.zdim_3600, .zdim_1500, .zdim_1, .zdim_1) ;
#     NC_CHAR temp:cell_methods = "time: mean Time: mean" ;
#     NC_CHAR temp:coordinates = "geolon_t geolat_t" ;
#     NC_CHAR temp:long_name = "Potential temperature" ;
#     NC_BYTE temp:packing = 4 ;
#     NC_CHAR temp:standard_name = "sea_water_potential_temperature" ;
#     NC_CHAR temp:time_avg_info = "average_T1,average_T2,average_DT" ;
#     NC_SHORT temp:valid_range = -32767, 32767 ;
#     NC_CHAR temp:units = "degrees C" ;
#     NC_DOUBLE temp:add_offset = 245 ;
#     NC_DOUBLE temp:scale_factor = 0.00778221990913153 ;
#     Error in var.inq.nc(x, id) : NetCDF: Numeric conversion not representable
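
The listing above shows `temp` stored as NC_SHORT with `scale_factor`, `add_offset` and `valid_range` attributes, i.e. packed values. Under the usual CF packing convention the unpacked value is `packed * scale_factor + add_offset`. A minimal sketch of that arithmetic follows, using the attribute values copied from the listing. `unpack` is a hypothetical helper (not part of RNetCDF), and this is not offered as an explanation of the error itself.

```r
# Attribute values copied from the print.nc() listing for `temp`
scale_factor <- 0.00778221990913153
add_offset   <- 245
valid_range  <- c(-32767L, 32767L)

# Hypothetical helper: CF-style unpacking of packed integer values,
# unpacked = packed * scale_factor + add_offset
unpack <- function(packed, scale = scale_factor, offset = add_offset) {
  packed * scale + offset
}

# Range of unpacked values implied by valid_range and the packing attributes
unpacked_range <- unpack(valid_range)
print(unpacked_range)
```

With these attributes the packed range maps to roughly -10 to 500 in the unpacked units.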

mdsumner commented May 29, 2024

Don't use mode=nczarr ☝️, use mode=zarr instead.

Prep a local Zarr with GDAL:

export url=/vsicurl/https://thredds.nci.org.au/thredds/fileServer/cj50/access-om2/cf-compliant/access-om2-01/v20200608/jra55v13_ryf8485_freshRCP45/ocean/surface-temp/surface-temp_access-om2-01_193704_193712.nc

gdalmdimtranslate $url surface-temp_access-om2-01_193704_193712.zarr -of ZARR

Now open it lazily with tidync, via the Zarr support in the NetCDF library:

#R version 4.4.0 (2024-04-24) -- "Puppy Cup"

library(tidync)
tidync("file://surface-temp_access-om2-01_193704_193712.zarr#mode=zarr")
not a file:
' file://surface-temp_access-om2-01_193704_193712.zarr#mode=zarr '

... attempting remote connection

Connection succeeded.

Data Source (1): surface-temp_access-om2-01_193704_193712.zarr#mode=zarr ...

Grids (7) <dimension family> : <associated variables>

[1]   D2,D1,D0 : surface_temp    **ACTIVE GRID** ( 87480000  values per variable)
[2]   D2,D1    : geolat_t, geolon_t
[3]   D3,D0    : time_bounds
[4]   D0       : average_DT, average_T1, average_T2, time
[5]   D1       : yt_ocean
[6]   D2       : xt_ocean
[7]   D3       : nv

Dimensions 4 (3 active):

  dim   name     length     min    max start count    dmin   dmax unlim coord_dim
  <chr> <chr>     <dbl>   <dbl>  <dbl> <int> <int>   <dbl>  <dbl> <lgl> <lgl>
1 D0    time          9  7.07e5 7.07e5     1     9  7.07e5 7.07e5 FALSE TRUE
2 D1    yt_ocean   2700 -8.11e1 9.00e1     1  2700 -8.11e1 9.00e1 FALSE TRUE
3 D2    xt_ocean   3600 -2.80e2 7.99e1     1  3600 -2.80e2 7.99e1 FALSE TRUE

Inactive dimensions:

  dim   name  length   min   max unlim coord_dim
  <chr> <chr>  <dbl> <dbl> <dbl> <lgl> <lgl>
1 D3    nv         2     1     2 FALSE TRUE
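
The only difference between the failing and working cases above is the fragment appended to the `file://` source string (`#mode=nczarr,file` versus `#mode=zarr`). A small sketch for building those strings consistently follows. `zarr_dsn` is a hypothetical helper introduced here for illustration, not a function from tidync or RNetCDF, and the `/tmp/example.zarr` path is a placeholder.

```r
# Hypothetical helper: build the source string passed to the NetCDF library
# for a local Zarr store. `mode` is the fragment after "#mode=".
zarr_dsn <- function(path, mode = "zarr") {
  stopifnot(is.character(path), length(path) == 1L, nzchar(path))
  sprintf("file://%s#mode=%s", path, mode)
}

# The form that worked with tidync above
dsn_zarr <- zarr_dsn("surface-temp_access-om2-01_193704_193712.zarr")

# The form used in the gist with RNetCDF::open.nc() (placeholder path)
dsn_nczarr <- zarr_dsn("/tmp/example.zarr", mode = "nczarr,file")

print(dsn_zarr)
print(dsn_nczarr)
```

The resulting string can then be passed to `tidync()` or `open.nc()` as in the calls shown earlier.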
