@JohannesBuchner
Created June 2, 2019 04:40
Convert a CSV, gzipped CSV, or NPY integer file to efficiently compressed HDF5
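import sys
import numpy
import h5py

filename = sys.argv[1]
# Strip the input extension(s) and append '.h5' to get the output name
outfilename = filename.replace('.npy', '').replace('.gz', '').replace('.csv', '') + '.h5'

if filename.endswith('.npy'):
    print('loading NPY...')
    data = numpy.load(filename)
else:
    # loadtxt reads .gz files transparently and splits on whitespace by default;
    # pass delimiter=',' if the file has comma-separated columns
    print('loading CSV...')
    data = numpy.loadtxt(filename, dtype=int)

print('storing as HDF5...')
with h5py.File(outfilename, 'w') as f:
    # gzip combined with the byte-shuffle filter usually compresses integer data well
    f.create_dataset('data', data=data, compression='gzip', shuffle=True)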
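
The shuffle=True option enables HDF5's byte-shuffle filter, which regroups the bytes of each element by significance before gzip sees them. The following is a rough, standard-library-only sketch of that idea, not h5py's actual implementation: the helper names byte_shuffle and byte_unshuffle, the random test values, and the 8-byte little-endian layout are all assumptions made for illustration.

```python
import random
import struct
import zlib


def byte_shuffle(raw: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on (hypothetical helper)."""
    return b"".join(raw[i::itemsize] for i in range(itemsize))


def byte_unshuffle(shuffled: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element layout."""
    count = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for i in range(itemsize):
        out[i::itemsize] = shuffled[i * count:(i + 1) * count]
    return bytes(out)


# Small non-negative integers stored as 8-byte little-endian values:
# most of the high-order bytes are zero, as in many integer catalogues.
rng = random.Random(0)
values = [rng.randrange(1000) for _ in range(10000)]
raw = struct.pack("<%dq" % len(values), *values)

# Compress the same bytes with and without the shuffle step.
plain_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(byte_shuffle(raw, 8)))
print("gzip only:", plain_size, "bytes; shuffle + gzip:", shuffled_size, "bytes")
```

For data like this, where only the low-order bytes vary, the shuffled stream places the long runs of zero bytes together, so the compressed size is typically smaller. The gain depends on the data, so it is worth checking on a real file.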