Open multiple NetCDF datasets with xarray, automatically add a `path` dimension whose coordinate is each file's path, and then combine them along that dimension.
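### This works #####
import fsspec
import xarray as xr

FILEPATTERN = '...'

def func(ds):
    # Take the first data variable and read the file path from its encoding.
    var = next(var for var in ds)
    fp = ds[var].encoding['source']
    # Add the path as a scalar coordinate, then promote it to a length-1 dimension.
    coordds = ds.assign_coords(path=fp)
    dimds = coordds.expand_dims('path')
    return dimds

fs = fsspec.filesystem('gs')
fps = fs.glob(FILEPATTERN)
ds = xr.open_mfdataset([fs.open(x) for x in fps], concat_dim='path', preprocess=func)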
### Not this ###
FILEPATTERN = '...'

def func(ds):
    var = next(var for var in ds)
    fp = ds[var].encoding['source']
    # Here, I swap the order of the two calls.
    # As written, `coordds` is used on the next line before it is assigned,
    # so calling this function raises UnboundLocalError.
    dimds = coordds.expand_dims('path')
    coordds = ds.assign_coords(path=fp)
    return dimds

fs = fsspec.filesystem('gs')
fps = fs.glob(FILEPATTERN)
ds = xr.open_mfdataset([fs.open(x) for x in fps], concat_dim='path', preprocess=func)
### Opening ONE of these files and running the steps of `func` by hand does work, though, in both cases.
### Is `open_mfdataset` relying on some deprecated function when `preprocess` is passed?
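
The two orderings can be compared locally, without cloud storage, with a minimal sketch like the one below. It is an illustration under stated assumptions, not a diagnosis of `open_mfdataset` itself. `make_dataset` is a hypothetical helper that builds a tiny in-memory dataset and sets `encoding['source']` by hand, the variable name `tas` and the paths are made up, and `xr.concat` stands in for the combine step. In the swapped-but-chained version the coordinate is passed as a one-element list because the `path` dimension already exists at that point. The last function reproduces the swapped lines exactly as written above. One possible reason the manual test worked is that, at the top level of a session, `coordds` may still exist from an earlier run, whereas inside a function it is an unbound local.

```python
import numpy as np
import xarray as xr


def make_dataset(path):
    """Hypothetical helper: a tiny in-memory dataset standing in for one opened file."""
    ds = xr.Dataset({"tas": ("time", np.arange(3.0))}, coords={"time": [0, 1, 2]})
    # Assumption: set by hand the 'source' entry that the gist reads from the encoding.
    ds["tas"].encoding["source"] = path
    return ds


def coords_then_dim(ds):
    """Order from the working version: assign_coords, then expand_dims."""
    var = next(iter(ds.data_vars))
    fp = ds[var].encoding["source"]
    return ds.assign_coords(path=fp).expand_dims("path")


def dim_then_coords(ds):
    """Swapped order, with each call chained on the previous result."""
    var = next(iter(ds.data_vars))
    fp = ds[var].encoding["source"]
    return ds.expand_dims("path").assign_coords(path=[fp])


def swapped_as_written(ds):
    """The swapped lines exactly as in the failing version."""
    var = next(iter(ds.data_vars))
    fp = ds[var].encoding["source"]
    dimds = coordds.expand_dims("path")  # coordds is not bound yet at this point
    coordds = ds.assign_coords(path=fp)
    return dimds


paths = ["bucket/a.nc", "bucket/b.nc"]  # hypothetical paths

combined = xr.concat([coords_then_dim(make_dataset(p)) for p in paths], dim="path")
combined_swapped = xr.concat([dim_then_coords(make_dataset(p)) for p in paths], dim="path")
print(combined["path"].values.tolist())

try:
    swapped_as_written(make_dataset(paths[0]))
    swapped_error = None
except UnboundLocalError as err:
    swapped_error = type(err).__name__
    print(f"swapped_as_written failed: {swapped_error}: {err}")
```

If both chained orderings behave the same on the real files, the failure is more likely to come from the variable order inside `func` than from `preprocess` handling.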