@aidanheerdegen
Created May 1, 2018 04:27
Copy a variable from one netcdf file to another robustly, ignoring HDF read errors
#!/usr/bin/env python
"""Copy a variable between netCDF files chunk by chunk, tolerating HDF read errors.

Usage: python cleanvar.py VARNAME ORIGINAL_FILE NEW_FILE
"""
import sys

import netCDF4 as nc
import numpy as np

name_var = sys.argv[1]
orig_path = sys.argv[2]
new_path = sys.argv[3]

orig = nc.Dataset(orig_path, mode='r')
var = orig.variables[name_var]

# Chunk sizes along each dimension (assumes the variable is stored chunked)
csizes = np.array(var.chunking())
print(csizes)

# The destination file must already exist and define the same dimensions
new = nc.Dataset(new_path, mode='r+')
new_var = new.createVariable(name_var, var.datatype, var.dimensions)
new_var.setncatts({k: var.getncattr(k) for k in var.ncattrs()})

dimlim = np.asarray(var.shape)
# Number of chunks along each dimension, rounded up (integer division)
steps = (dimlim - 1) // csizes + 1

for index in np.ndindex(*steps):
    # Convert the chunk index into the first element index of that chunk
    starts = np.asarray(index) * csizes
    origslices = []
    newslices = []
    # Make up slices covering one chunk
    for start, step, end in zip(starts, csizes, dimlim):
        # min checks we don't go beyond the limits of the variable
        stop = min(start + step, end)
        origslices.append(slice(int(start), int(stop), None))
        newslices.append(slice(int(start), int(stop), None))
    while True:
        try:
            # Copy the data
            new_var[tuple(newslices)] = var[tuple(origslices)]
        except Exception:
            print('Error :', origslices)
            # Assumes the first dimension is time
            tslice = origslices[0]
            # default to using previous time slice
            increment = -1
            if tslice.start == 0:
                increment = +1
            origslices[0] = slice(tslice.start + increment, tslice.stop + increment, None)
            print('New slice :', origslices)
        else:
            break
    new.sync()  # flush data to disk
@aidanheerdegen

A couple of MOM5 output files each contained a corrupted chunk that could not be read. The files have very high temporal resolution, very little data is missing, and the data is only intended for an animation, so strict data integrity isn't an issue: filling the unreadable chunk from a neighbouring time slice is good enough.
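The substitution rule the script applies when a read fails can be sketched in isolation. This is a minimal sketch, not the script itself: `shift_time_slice`, `read_with_fallback`, `flaky_read` and the `max_tries` limit are hypothetical names introduced here for illustration, and a plain NumPy array stands in for the netCDF variable. It assumes the first dimension is time and that a failed read raises `RuntimeError`.

```python
import numpy as np


def shift_time_slice(tslice):
    """Move a time slice one step back, or one step forward if it starts at 0."""
    increment = 1 if tslice.start == 0 else -1
    return slice(tslice.start + increment, tslice.stop + increment)


def read_with_fallback(read, slices, max_tries=10):
    """Call read(slices); on failure retry with a neighbouring time slice.

    max_tries is an addition for this sketch so the retry loop always ends.
    """
    slices = list(slices)
    for _ in range(max_tries):
        try:
            return read(tuple(slices))
        except RuntimeError:
            # Replace only the time (first) dimension and try again
            slices[0] = shift_time_slice(slices[0])
    raise RuntimeError("no readable neighbouring time slice found")


# Stand-in for the variable: 6 time steps by 4 points
data = np.arange(24).reshape(6, 4)


def flaky_read(slices):
    # Pretend the block starting at time step 2 sits in a corrupted chunk
    if slices[0].start == 2:
        raise RuntimeError("simulated HDF error")
    return data[slices]


block = read_with_fallback(flaky_read, (slice(2, 4), slice(0, 4)))
```

Here the block for time steps 2 to 3 is filled with the values from time steps 1 to 2, which is why this approach only suits data where a slightly repeated frame does not matter.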

Move the original file to a backup name, e.g.:

mv ocean_temp_3hourly.nc.0008 ocean_temp_3hourly.nc.0008.old

Use the NCO tool ncks to create a copy of the data file without the corrupted variable (-x -v temp excludes the variable temp, -O overwrites any existing output file):

ncks -O -x -v temp ocean_temp_3hourly.nc.0008.old ocean_temp_3hourly.nc.0008

Then run the script (saved here as cleanvar.py) to copy the variable from the backup into the new file:

python cleanvar.py temp ocean_temp_3hourly.nc.0008.old ocean_temp_3hourly.nc.0008
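The script copies the variable one storage chunk at a time, so a read error only affects a single chunk. A rough sketch of that chunk-aligned walk, using a NumPy array in place of the netCDF variable, is shown here. The helper `chunk_slices` and the example shape and chunk sizes are assumptions made for illustration; they are not part of the script.

```python
import itertools

import numpy as np


def chunk_slices(shape, chunks):
    """Yield a tuple of slices for every chunk-aligned block of an array."""
    # Starting indices of the chunks along each dimension
    ranges = [range(0, n, c) for n, c in zip(shape, chunks)]
    for starts in itertools.product(*ranges):
        # min() stops the final chunk from running past the end of a dimension
        yield tuple(
            slice(s, min(s + c, n)) for s, c, n in zip(starts, chunks, shape)
        )


source = np.arange(5 * 7).reshape(5, 7)
target = np.zeros_like(source)

blocks = list(chunk_slices(source.shape, (2, 3)))
for block in blocks:
    # Each block is copied independently, as in the script's inner copy
    target[block] = source[block]
```

Because the chunk sizes need not divide the dimension lengths evenly, the blocks along the trailing edge are smaller than a full chunk.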