Skip to content

Instantly share code, notes, and snippets.

View mdsumner's full-sized avatar

Michael Sumner mdsumner

  • Integrated Digital East Antarctica, Australian Antarctic Division
  • Hobart, Australia
View GitHub Profile
from osgeo import gdal
gdal.UseExceptions()



## we do something with vrt
dsn = "vrt:///vsicurl/https://github.com/cran/rgl/raw/master/inst/textures/worldsmall.png?a_ullr=-180,90,180,-90&expand=rgb"

fun stuff, compare two polygons from one dataset by plotting them together (local equal area projection ensures scale is right)

library(terra)
x <- terra::vect("/vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM0.zip")

crs_local <- function(x) {
    cc <- terra::geom(terra::centroids(x))[, c("x", "y"), drop = FALSE]
 sprintf("+proj=laea +lon_0=%i +lat_0=%i", round(cc[,1]), round(cc[,2]))
library(arrow)
#> 
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
library(terra)
#> terra 1.7.71
#> 
https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140/12?u=michael_sumner
possible response
I just don't think xarray has any business here, it should read the arrays and represent them as they are. Resolving grids to regular from shear affine or gcps/rcps/geolocation arrays is the job of the warper api. As soon as you're in xarray at that point its too late, you should go back to the sources and stream them through the warper first, where gdal has already identified or been told about which georeferencing scheme is active, and target either to virtual or materialized data sets. That's why I baulked at these degenerate rectilinear coordinate arrays in netcdf and then in xarray, it's a suboptimal situation when I can see that a four-value extent is enough (even in gdal, the 1D coord arrays are tagged as geolocation arrays, ready for the warper, same with 2D cooord arrays, gcps, and rpcs, or the geotransform). I think you're sugge
dsn <- "/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/L/AC/2023/1/S2A_T01LAC_20230103T221938_L2A/TCI.tif"
library(vapour)
system.time(d <- gdal_raster_data(dsn, bands = 1:3, target_dim = c(1373, 1373)))

0.5 - 2 seconds in California

10-15 seconds in Western Australia or Tasmania

phwoar, this is a bit exciting, how about convert all those GADM files to one (Geo)Parquet and open it lazily:

url <- "https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg"
f <- readLines(url)
files <- gsub("\"", "", na.omit(stringr::str_extract(f, "gadm.*\\.gpkg\"")))
gpkgs <- file.path(url, files)     


dir.create("raw")
c("/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/L/AH/2023/1/S2B_T01LAH_20230111T222752_L2A/TCI.tif",
"/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/L/AJ/2023/12/S2B_T01LAJ_20231217T222755_L2A/TCI.tif",
"/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/K/AB/2023/12/S2A_T01KAB_20231229T221935_L2A/TCI.tif",
"/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/K/AA/2023/12/S2A_T01KAA_20231229T221935_L2A/TCI.tif",
"/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/K/AV/2023/12/S2A_T01KAV_20231229T221935_L2A/TCI.tif",
"/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/K/CB/2023/12/S2B_T01KCB_20231231T220917_L2A/TCI.tif",
"/vsicurl/https://e84-earth-search-sentinel-data.s3.us-west-2.amazonaws.com/sentinel-2-c1-l2a/1/K/BB/2023/12/S2B_T01KBB_202
    library(terra)
#> terra 1.7.71
set.seed(2)
query <- buffer(project(vect(geosphere::randomCoordinates(1), crs = "EPSG:4326"),   "EPSG:3857"), 
                c(1e6, 4e6), capstyle = "square")

maps::map()
plot(project(query, "EPSG:4326"), add = TRUE, border= "firebrick", lwd = 4)
import fsspec
import kerchunk.hdf

fs = fsspec.filesystem('https')

urls = ["https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202401/oisst-avhrr-v02r01.20240119.nc",
        "https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202401/oisst-avhrr-v02r01.20240120.nc",
        "https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202401/oisst-avhrr-v02r01.20240121.nc",
        "https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202401/oisst-avhrr-v02r01.20240122.nc",