@ajelenak
ajelenak / h5stat-chunks.py
Last active January 11, 2024 01:42
Additional HDF5 dataset chunk statistics
from collections import defaultdict
from typing import Union
from dataclasses import dataclass
import argparse
from functools import partial, reduce
import operator
import math
import h5py
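The preview above ends before any statistics are computed. As a rough illustration of the kind of chunk statistic that the imported `math`, `operator`, and `reduce` lend themselves to, the sketch below counts the chunks in a dataset's chunk grid from its shape and chunk shape. The helper name `chunk_count` and the sample shapes are hypothetical and are not taken from the gist.

```python
import math
import operator
from functools import reduce


def chunk_count(shape, chunks):
    """Number of chunks in the chunk grid (hypothetical helper, not from the gist)."""
    if len(shape) != len(chunks):
        raise ValueError("shape and chunks must have the same rank")
    # Each dimension needs ceil(extent / chunk extent) chunks; edge chunks count as whole chunks.
    per_dim = [math.ceil(s / c) for s, c in zip(shape, chunks)]
    # The grid size is the product over all dimensions.
    return reduce(operator.mul, per_dim, 1)


print(chunk_count((1000, 500), (100, 200)))  # 10 chunks along axis 0, 3 along axis 1
```

This counts positions in the grid, not chunks actually allocated in the file. With h5py built against HDF5 1.10.5 or later, `dset.id.get_num_chunks()` should report the allocated count.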
@ajelenak
ajelenak / h5ublock.py
Created May 19, 2023 00:00
Create a copy of HDF5 file with specified user block. User block is filled with zero bytes.
import argparse
from warnings import warn
import h5py
WARN_UBLOCK_SIZE = 10 * 1024 * 1024
COPY_BLOCK_SIZE = 10 * 1024 * 1024
parser = argparse.ArgumentParser(
    description='Add empty user block to an HDF5 file.',
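The preview stops inside the argument parser setup, so the copying logic is not visible here. One constraint that any user block tool has to respect is that HDF5 only accepts a user block size of 0 or a power of two that is at least 512 bytes. A minimal sketch of that check, with a hypothetical helper name `is_valid_userblock_size` that does not come from the gist:

```python
def is_valid_userblock_size(size):
    """Check the HDF5 rule: 0, or a power of two >= 512 (hypothetical helper)."""
    if size == 0:
        return True  # 0 means the file has no user block
    # A positive integer is a power of two when it has exactly one bit set.
    return size >= 512 and (size & (size - 1)) == 0


for candidate in (0, 256, 512, 1000, 1024):
    print(candidate, is_valid_userblock_size(candidate))
```

A size that passes this check is the sort of value that h5py's `h5py.File(..., userblock_size=...)` expects when creating a new file.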
@ajelenak
ajelenak / HDF5 Ecosystem.plantuml
Created September 24, 2021 13:51
HDF5 Ecosystem (PlantUML)
@startmindmap HDF5 Ecosystem
<style>
node {
    RoundCorner 40
    MaximumWidth 300
    FontName Helvetica
    FontSize 18
}
rootNode {
@ajelenak
ajelenak / h5comprat.py
Created June 24, 2021 23:54
How to compute and display compression ratios of HDF5 datasets in an HDF5 file using Python
import sys
import h5py
def comp_ratio(name, obj):
    if isinstance(obj, h5py.Dataset) and obj.chunks is not None:
        if obj.id.get_create_plist().get_nfilters():
            stor_size = obj.id.get_storage_size()
            if stor_size != 0:
                ratio = float(obj.nbytes) / float(stor_size)
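The preview is cut off right after the ratio is computed. The arithmetic itself is the logical (uncompressed) size divided by the size actually stored in the file. A standalone sketch of that step, using plain integers in place of `obj.nbytes` and `obj.id.get_storage_size()`; the helper name `compression_ratio` and the byte counts are hypothetical:

```python
def compression_ratio(logical_bytes, stored_bytes):
    """Logical size over stored size; None when nothing is stored yet (hypothetical helper)."""
    if stored_bytes == 0:
        # Mirrors the `stor_size != 0` guard in the preview: avoid dividing by zero.
        return None
    return logical_bytes / stored_bytes


ratio = compression_ratio(8_000_000, 2_000_000)
print(f"compression ratio: {ratio:.2f}")
```

A ratio above 1 means the filters reduced the stored size. A ratio below 1 can occur when filter or chunk overhead outweighs the savings.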
@ajelenak
ajelenak / HDF5 Universe.plantuml
Last active April 22, 2020 14:11
UML diagram of the HDF5 software components
@startuml HDF5 Universe
title HDF5 Universe
together {
    folder "Abstract\nData Model" as ADM
    folder "Programming\nModel" as PM
    folder Library as L
}
@ajelenak
ajelenak / h5-to-zarr.py
Last active March 1, 2023 16:04
Python code to extract HDF5 chunk locations and add them to Zarr metadata.
# Requirements:
# HDF5 library version 1.10.5 or later
# h5py version 3.0 or later
# pip install git+https://github.com/HDFGroup/zarr-python.git@hdf5
import logging
from urllib.parse import urlparse, urlunparse
import numpy as np
import h5py
import zarr
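The rest of the script is not shown in this preview. One step that any HDF5-to-Zarr chunk mapping needs is turning a chunk's position in the dataset into a Zarr chunk key. The sketch below assumes Zarr v2's default dot-separated keys and assumes the chunk position is given in element coordinates, which is how h5py's `get_chunk_info` reports `chunk_offset`. The helper name `zarr_chunk_key` is hypothetical and is not the gist's code.

```python
def zarr_chunk_key(chunk_offset, chunks):
    """Map an element-coordinate chunk offset to a Zarr v2 chunk key (hypothetical helper)."""
    if len(chunk_offset) != len(chunks):
        raise ValueError("chunk_offset and chunks must have the same rank")
    if not chunks:
        return "0"  # Zarr v2 stores a zero-dimensional array under the key "0"
    # Dividing each offset by the chunk extent gives the chunk's index along that axis.
    return ".".join(str(o // c) for o, c in zip(chunk_offset, chunks))


print(zarr_chunk_key((200, 0, 30), (100, 50, 10)))
```

The resulting key can then be paired with the chunk's byte offset and size in the HDF5 file, which is the kind of location information the gist description refers to.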
@ajelenak
ajelenak / .zmetadata.json
Created February 6, 2020 22:00
Zarr consolidated metadata (.zmetadata) with HDF5 chunk file locations.
{
  "metadata": {
    ".zattrs": {
      "Conventions": "UGRID-0.9.0",
      "_FillValue": -99999.0,
      "_NCProperties": "version=1|h5netcdfversion=0.6.1|hdf5libversion=1.10.2",
      "a00": 0.35,
      "agrid": "grid",
      "b00": 0.3,
      "c00": 0.35,
@ajelenak
ajelenak / store_info.py
Last active December 1, 2021 04:03
Python script for reporting HDF5 dataset storage information for HDF5 files either in a file system or S3.
#!/usr/bin/env python3
"""
Print storage information for every HDF5 dataset in a file.
Run "store_info.py --help" for information.
"""
from os import SEEK_SET
import argparse
import json
from functools import partial
@ajelenak
ajelenak / cloud-access-to-hdf5.ipynb
Last active May 8, 2024 15:47
Access HDF5 Files in S3
@ajelenak
ajelenak / DAS-HDF5 and xarray.ipynb
Created April 18, 2019 16:29
Exploring a DAS-HDF5 with xarray