Last active: April 22, 2024 14:18
How-to use JPEG2000 compression with HDF5 from Python using blosc2&grok - Binder: https://mybinder.org/v2/gist/t20100/80960ec46abd3a863e85876c013834bb/HEAD?labpath=hdf5-jpeg2000-codec-with-blosc2-grok.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5cd1042a-8279-47c6-90ff-0d479f2af798",
   "metadata": {},
   "source": [
    "# How-to use JPEG2000 compression with HDF5 from Python using blosc2&grok\n",
    "\n",
    "**Goal**: Write and read [JPEG2000](https://jpeg.org/jpeg2000/)-compressed data in an HDF5 file from Python with [blosc2](https://www.blosc.org/c-blosc2/c-blosc2.html) and [grok](https://github.com/GrokImageCompression/grok).\n",
    "\n",
    "[HDF5](https://www.hdfgroup.org/) (Hierarchical Data Format) is a file format designed to store and organize large amounts of data.\n",
    "[hdf5plugin](http://www.silx.org/doc/hdf5plugin/latest/) provides some [HDF5 compression filters](https://portal.hdfgroup.org/documentation/hdf5-docs/registered_filter_plugins.html) - including the blosc2 filter - and makes them usable from [h5py](https://docs.h5py.org/en/stable/), a Pythonic interface to the HDF5 binary data format.\n",
    "\n",
    "[blosc2](https://www.blosc.org/c-blosc2/c-blosc2.html) is a \"meta\"-compressor optimized for binary data that supports different compressors and filters, as well as external plugins.\n",
    "[blosc2-grok](https://pypi.org/project/blosc2-grok/) is one of these plugins: it enables the [JPEG2000](https://jpeg.org/jpeg2000/) codec thanks to the [grok library](https://github.com/GrokImageCompression/grok).\n",
    "\n",
    "Notebook license: [CC-0](https://creativecommons.org/public-domain/cc0/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9dcefbd8-29d8-4091-bc7c-979e4d2779ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install required packages\n",
    "!pip install blosc2 blosc2_grok h5py hdf5plugin b2h5py jupyterlab_h5web"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "f9f12f43-8188-4aa0-b89f-ee500406532c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "import blosc2\n",
    "import blosc2_grok\n",
    "import h5py\n",
    "import hdf5plugin\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2a858cd-8d96-40e2-868c-037ca46cdd0d",
   "metadata": {},
   "source": [
    "## Write a stack of images as an HDF5 dataset compressed with JPEG2000\n",
    "\n",
    "To write a dataset compressed with JPEG2000 using blosc2&grok, one has to compress the data with blosc2 and write it using HDF5's direct chunk write.\n",
    "\n",
    "Indeed, as of today, it is not possible to create a dataset compressed with blosc2&grok using h5py's [Group.create_dataset](https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset).\n",
    "\n",
    "We define a function ``b2_grok_compress_stack`` which compresses a numpy array with blosc2&grok, and a function ``create_blosc2_grok_stack_dataset`` which uses it to write the compressed data to an HDF5 dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "8efde572-be71-46e8-ba15-dd1d835a3bd4",
   "metadata": {},
   "outputs": [],
   "source": [
    "def b2_grok_compress_stack(data: np.ndarray, rate: float) -> blosc2.NDArray:\n",
    "    \"\"\"Compress a 3D array with blosc2&grok as a stack of JPEG2000 images.\n",
    "\n",
    "    :param data: 3D array of data\n",
    "    :param rate: The requested compression ratio\n",
    "    \"\"\"\n",
    "    blosc2_grok.set_params_defaults(\n",
    "        cod_format=blosc2_grok.GrkFileFmt.GRK_FMT_JP2,\n",
    "        quality_mode=\"rates\",\n",
    "        quality_layers=np.array([rate], dtype=np.float64),\n",
    "    )\n",
    "    return blosc2.asarray(\n",
    "        data,\n",
    "        chunks=data.shape,\n",
    "        blocks=(1,) + data.shape[1:],  # Compress slice by slice\n",
    "        cparams={\n",
    "            'codec': blosc2.Codec.GROK,\n",
    "            'filters': [],\n",
    "            'splitmode': blosc2.SplitMode.NEVER_SPLIT,\n",
    "        },\n",
    "    )\n",
    "\n",
    "\n",
    "def create_blosc2_grok_stack_dataset(\n",
    "    group: h5py.Group,\n",
    "    h5path: str,\n",
    "    data: np.ndarray,\n",
    "    rate: float,\n",
    ") -> h5py.Dataset:\n",
    "    \"\"\"Store data compressed with blosc2&grok in a new dataset: group[h5path]\n",
    "\n",
    "    :param group: The root group in which to create the dataset\n",
    "    :param h5path: The path of the new dataset in the group\n",
    "    :param data: The stack data to compress\n",
    "    :param rate: The requested compression ratio\n",
    "    \"\"\"\n",
    "    dataset = group.create_dataset(  # Create the HDF5 dataset\n",
    "        h5path,\n",
    "        shape=data.shape,\n",
    "        dtype=data.dtype,\n",
    "        chunks=data.shape,\n",
    "        allow_unknown_filter=True,\n",
    "        compression=hdf5plugin.Blosc2(),\n",
    "    )\n",
    "    blosc2_array = b2_grok_compress_stack(data, rate)  # Compress the data with blosc2 & grok\n",
    "    # Write the compressed data to HDF5 using direct chunk write\n",
    "    dataset.id.write_direct_chunk((0, 0, 0), blosc2_array.schunk.to_cframe())\n",
    "    return dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9069fcf-8215-45a9-b576-fee64318f2d6",
   "metadata": {},
   "source": [
    "### Example with dummy data\n",
    "\n",
    "Compress a stack of 10 images of 1024x1024 pixels with a compression rate of 10."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9c7e38dd-4c04-4858-8fb8-2635abd863bc",
   "metadata": {},
   "outputs": [],
   "source": [
    "shape = 10, 1024, 1024\n",
    "data = np.arange(np.prod(shape), dtype=np.uint16).reshape(*shape)\n",
    "\n",
    "with h5py.File(\"blosc2-grok.h5\", \"w\") as h5f:\n",
    "    create_blosc2_grok_stack_dataset(h5f, \"data\", data, rate=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "28aeb4fd-b9fa-49a2-84d2-760133ad0919",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "blosc2-grok.h5 file size: 43707 bytes\n"
     ]
    }
   ],
   "source": [
    "print(f\"blosc2-grok.h5 file size: {os.path.getsize('blosc2-grok.h5')} bytes\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e90ffd1-c40a-4569-88e7-ff3a7df2b777",
   "metadata": {},
   "source": [
    "## Read an HDF5 dataset compressed with JPEG2000\n",
    "\n",
    "Provided that the hdf5plugin and blosc2-grok Python packages are installed, it is possible to read back the written data with h5py."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "093adb17-590f-46df-8bce-6d40aaa4c2dc",
   "metadata": {},
   "outputs": [],
   "source": [
    "with h5py.File(\"blosc2-grok.h5\", \"r\") as h5f:\n",
    "    read_data = h5f[\"data\"][()]"
   ]
  },
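  {
   "cell_type": "markdown",
   "id": "a7c4e1f0-1111-4aaa-8bbb-readbackcheck",
   "metadata": {},
   "source": [
    "With a requested compression ratio of 10, the JPEG2000 encoding is lossy, so ``read_data`` only approximates the original ``data`` array. A minimal sketch of a reconstruction-error check (the exact error values depend on the codec settings):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8d5f2a1-2222-4ccc-8ddd-readbackcode",
   "metadata": {},
   "outputs": [],
   "source": [
    "# JPEG2000 at rate 10 is lossy: compare the decompressed data with the original.\n",
    "# Cast to a signed type before subtracting to avoid uint16 wrap-around.\n",
    "abs_error = np.abs(read_data.astype(np.int64) - data.astype(np.int64))\n",
    "print(f\"max abs error: {abs_error.max()}\")\n",
    "print(f\"mean abs error: {abs_error.mean():.3f}\")"
   ]
  },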
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e403b693-3c42-4913-a9d7-c6d7697a00f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from jupyterlab_h5web import H5Web\n",
    "\n",
    "H5Web(\"blosc2-grok.h5\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "475abe53-6831-4117-897d-118f13c3c7df",
   "metadata": {},
   "source": [
    "### Slice access time\n",
    "\n",
    "Accessing data this way requires decompressing complete HDF5 chunks, even when accessing only a slice.\n",
    "\n",
    "For instance, in this case, accessing a single frame requires decompressing the entire HDF5 chunk:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "bffd6590-f80e-46de-87ad-f125836cfa0d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "211 ms ± 7.22 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "# Read one frame\n",
    "with h5py.File(\"blosc2-grok.h5\", \"r\") as h5f:\n",
    "    x = h5f[\"data\"][0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "d1d2d005-503c-42fa-9c83-a8de51013925",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "230 ms ± 18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "# Read all frames\n",
    "with h5py.File(\"blosc2-grok.h5\", \"r\") as h5f:\n",
    "    x = h5f[\"data\"][()]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "198bbeff-b00e-4cb6-a861-d410f2aa55eb",
   "metadata": {},
   "source": [
    "### Optimised slice reading with b2h5py\n",
    "\n",
    "[b2h5py](https://pypi.org/project/b2h5py) provides h5py with optimized reading of n-dimensional slices of Blosc2-compressed datasets.\n",
    "This optimized slicing leverages direct chunk access and 2-level partitioning into chunks and then smaller blocks (so that less data is actually decompressed).\n",
    "\n",
    "Example: Read the first slice with ``b2h5py``:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "03d3d3dd-27b8-43f9-b7f3-1351fc7b69a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import b2h5py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "5c6513ff-0477-405c-adf7-a7acd8a06eaf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "28.3 ms ± 2.64 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "# With b2h5py\n",
    "with h5py.File(\"blosc2-grok.h5\", \"r\") as h5f:\n",
    "    b2h5py_data = b2h5py.B2Dataset(h5f[\"data\"])[0]"
   ]
  },
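  {
   "cell_type": "markdown",
   "id": "c9e6a3b2-3333-4eee-8fff-b2h5pycheck",
   "metadata": {},
   "source": [
    "The optimized slicing is only a faster access path: it decodes the same compressed stream as plain h5py slicing, so both reads should return exactly the same values. A quick sanity check, assuming the ``blosc2-grok.h5`` file written above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0f7b4c3-4444-4aaa-9bbb-b2h5pycode",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check that the b2h5py optimized read matches the plain h5py read\n",
    "with h5py.File(\"blosc2-grok.h5\", \"r\") as h5f:\n",
    "    plain_frame = h5f[\"data\"][0]\n",
    "    optimized_frame = b2h5py.B2Dataset(h5f[\"data\"])[0]\n",
    "assert np.array_equal(plain_frame, optimized_frame)"
   ]
  },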
  {
   "cell_type": "markdown",
   "id": "b2f63014-91d5-433a-97b7-d2c532dfa149",
   "metadata": {},
   "source": [
    "## Example with tomography radios\n",
    "\n",
    "First, download the raw data: http://www.silx.org/pub/leaps-innov/tomography/lung_raw_2000-2100.h5\n",
    "\n",
    "Read the raw data and compress it with a 10x compression ratio:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "cd430d70-41d3-41af-a10d-e3c87d782e2b",
   "metadata": {},
   "outputs": [],
   "source": [
    "with h5py.File(\"lung_raw_2000-2100.h5\", \"r\") as h5f:\n",
    "    images = h5f[\"data\"][()]\n",
    "\n",
    "with h5py.File(\"lung_raw-blosc2-grok.h5\", \"w\") as h5f:\n",
    "    create_blosc2_grok_stack_dataset(h5f, \"data\", images, rate=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "1af9ad9c-d180-49cf-af92-19978a903f05",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "File sizes:\n",
      "- lung_raw-blosc2-grok.h5: 41936825 bytes\n",
      "- lung_raw_2000-2100.h5: 419438624 bytes\n"
     ]
    }
   ],
   "source": [
    "print(\"File sizes:\")\n",
    "print(f\"- lung_raw-blosc2-grok.h5: {os.path.getsize('lung_raw-blosc2-grok.h5')} bytes\")\n",
    "print(f\"- lung_raw_2000-2100.h5: {os.path.getsize('lung_raw_2000-2100.h5')} bytes\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "80097d3e-15cc-462c-b7e3-19c11912bdf5",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
b2h5py
blosc2>=2.5.1
blosc2_grok>=0.2.2
h5py
hdf5plugin>=4.4.0
jupyterlab_h5web