Last active
January 18, 2017 15:57
@shoyer, @alimanfoo
Since I have been working on fastparquet as a standard storage format for tabular data, I am also thinking about a standard format for array data in dask. netCDF and HDF are good legacy archival formats, but they don't play nicely with parallel access across a cluster or with reading from an archive store like S3. zarr is certainly non-standard, but it would make a very nice internal store for intermediates. This gist is a simple motivator showing that we could use zarr not only for dask but for xarray too, without much effort.
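To make the parallel-access point concrete, here is a toy sketch of the general idea behind a chunked key-value layout: each chunk is its own object in the store, so workers can fetch chunks independently. This is not zarr's actual on-disk format or API. The plain `dict` stands in for an object store such as S3, and the names `write_chunked`, `read_chunk`, `parallel_sum` and the `.meta` key are all hypothetical, chosen only for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def write_chunked(store, name, values, chunk_size):
    """Write `values` into `store` as independent chunk objects plus metadata.

    `store` is any dict-like mapping of str -> bytes (a stand-in for an
    object store); the key layout here is made up for this sketch.
    """
    n_chunks = (len(values) + chunk_size - 1) // chunk_size  # ceiling division
    meta = {"length": len(values), "chunk_size": chunk_size, "n_chunks": n_chunks}
    store[f"{name}/.meta"] = json.dumps(meta).encode()
    for i in range(n_chunks):
        chunk = values[i * chunk_size:(i + 1) * chunk_size]
        store[f"{name}/{i}"] = json.dumps(chunk).encode()


def read_chunk(store, name, i):
    """Fetch and decode a single chunk without touching any other chunk."""
    return json.loads(store[f"{name}/{i}"].decode())


def parallel_sum(store, name, max_workers=4):
    """Reduce over the array by reading every chunk in a separate task."""
    meta = json.loads(store[f"{name}/.meta"].decode())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(
            lambda i: sum(read_chunk(store, name, i)), range(meta["n_chunks"])
        )
        return sum(partials)


store = {}  # in-memory stand-in for a remote key-value store
write_chunked(store, "x", list(range(10)), chunk_size=4)
total = parallel_sum(store, "x")
```

Because no task needs a shared file handle or a global lock, the same pattern carries over from threads to workers on separate machines, which is roughly what makes this kind of layout attractive for dask intermediates.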