@betolink
Created January 30, 2024 20:33
cloud optimized HDF5
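
Why bother consolidating the metadata? The notebook below compares an original ATL03 granule with a repacked copy whose metadata sits in 8MB pages. Here is a rough back-of-the-envelope sketch, not a measurement: if a reader fetches the file in fixed-size blocks (roughly what a block cache does), the number of remote requests needed to read the metadata depends on how many distinct blocks it lands in. The helper name, the object count, the 4 KiB spacing and the 8 MiB block size are illustrative assumptions; only the file size comes from the notebook.

```python
def blocks_fetched(offsets, block_size):
    """Return how many distinct fixed-size blocks cover the given byte offsets."""
    return len({offset // block_size for offset in offsets})


BLOCK = 8 * 1024 * 1024      # assumed read block size (8 MiB)
FILE_SIZE = 6_997_123_664    # size of the original granule, as reported by fs.info()
N_OBJECTS = 1000             # hypothetical number of small metadata objects

# Hypothetical layout 1: metadata objects spread evenly through the whole file.
scattered = [i * (FILE_SIZE // N_OBJECTS) for i in range(N_OBJECTS)]

# Hypothetical layout 2: the same objects packed at the front, 4 KiB apart.
consolidated = [i * 4096 for i in range(N_OBJECTS)]

scattered_requests = blocks_fetched(scattered, BLOCK)
consolidated_requests = blocks_fetched(consolidated, BLOCK)

print(f"scattered metadata:    {scattered_requests} block requests")
print(f"consolidated metadata: {consolidated_requests} block requests")
```

With these made-up layouts the scattered case touches hundreds of blocks while the consolidated case fits in a single block, which is the gap repacking is meant to close.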
{
"cells": [
{
"cell_type": "markdown",
"id": "11f9a9cb-c049-461e-8578-7090a644508e",
"metadata": {},
"source": [
"# Cloud Optimized HDF: or How I Learned to Stop Worrying and Love the Format\n",
"\n",
"\n",
"<img src=\"https://i.imgflip.com/8e4hnc.jpg\" width=\"400px\"/>\n"
]
},
{
"cell_type": "markdown",
"id": "6332a484-8fd6-4448-827f-aa48e6322f8f",
"metadata": {},
"source": [
"<img src=\"https://i.imgflip.com/8e4iqw.jpg\" width=\"400px\">"
]
},
{
"cell_type": "markdown",
"id": "2d37475f-42b0-4105-b34c-529f627d9066",
"metadata": {},
"source": [
"## The big ol' list of \"ifs\"\n",
"\n",
"* We use the most recent versions of h5py, xarray and fsspec\n",
"* We create the HDF5 files with [cloud optimized flags](https://www.youtube.com/watch?v=rcS5vt-mKok)\n",
"    * if the files are already out there we can repack them, consolidating the metadata and perhaps increasing the chunk sizes\n",
"* We know how to \"tweak the knobs\" (or at least have a fair understanding of what the I/O libraries are doing)."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "736bb5fb-c5cd-42bf-be4e-6b81ae6eb865",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"xarray v2024.1.1\n",
"h5py v3.10.0\n",
"s3fs v2023.12.2\n"
]
}
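],
"source": [
"import xarray as xr\n",
"import h5py\n",
"import s3fs\n",
"\n",
"# anonymous (unsigned) access, the bucket is public\n",
"fs = s3fs.S3FileSystem(anon=True)\n",
"\n",
"# print the library versions used for this run\n",
"for library in (xr, h5py, s3fs):\n",
"    print(f'{library.__name__} v{library.__version__}')"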
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "78d6697b-9f84-4edf-b426-fde27560bc68",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'ETag': '\"237bbd5828745b9e1a1e0ba88486e43c-835\"',\n",
" 'LastModified': datetime.datetime(2024, 1, 29, 4, 48, 24, tzinfo=tzutc()),\n",
" 'size': 6997123664,\n",
" 'name': 'its-live-data/cloud-experiments/h5cloud/atl03/big/original/ATL03_20190219140808_08110212_006_02.h5',\n",
" 'type': 'file',\n",
" 'StorageClass': 'INTELLIGENT_TIERING',\n",
" 'VersionId': None,\n",
" 'ContentType': 'application/x-hdf5'}"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# a \"big\" ATL03 file from the ICESat-2 mission\n",
"original_granule = \"s3://its-live-data/cloud-experiments/h5cloud/atl03/big/original/ATL03_20190219140808_08110212_006_02.h5\"\n",
"# the same \"big\" ATL03 file from the ICESat-2 mission, metadata consolidated in 8MB-size pages.\n",
"cloud_optimized = \"s3://its-live-data/cloud-experiments/h5cloud/atl03/big/repacked/ATL03_20190219140808_08110212_006_02_repacked.h5\"\n",
"\n",
"fs.info(original_granule)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "e94bb01e-a325-4ab3-8f6a-ac5799d14f02",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'ETag': '\"08af0688f787f10eee1ccfb13f7eb66d-836\"',\n",
" 'LastModified': datetime.datetime(2024, 1, 29, 4, 52, 44, tzinfo=tzutc()),\n",
" 'size': 7008000000,\n",
" 'name': 'its-live-data/cloud-experiments/h5cloud/atl03/big/repacked/ATL03_20190219140808_08110212_006_02_repacked.h5',\n",
" 'type': 'file',\n",
" 'StorageClass': 'INTELLIGENT_TIERING',\n",
" 'VersionId': None,\n",
" 'ContentType': 'application/x-hdf5'}"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fs.info(cloud_optimized)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ec2bce8f-bcf4-4982-8556-d3a71209af74",
"metadata": {},
"outputs": [],
"source": [
"# Don't even try this from outside the data's region (us-west-2): it will take forever, where forever >= 30 minutes\n",
"ds = xr.open_dataset(fs.open(original_granule),\n",
"                     group=\"/gt1l/heights\",\n",
"                     engine=\"h5netcdf\")\n",
"ds"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9b5701b-6a8b-41ac-a56a-34a4f42125e1",
"metadata": {},
"outputs": [],
"source": [
"# Again... don't even try this from outside the data's region (us-west-2): it will take forever, where forever >= 30 minutes\n",
"ds = xr.open_dataset(fs.open(cloud_optimized),\n",
"                     group=\"/gt1l/heights\",\n",
"                     engine=\"h5netcdf\")\n",
"ds"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "0def8b43-7616-4e01-a502-3f44811ae47e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 4.16 s, sys: 3.04 s, total: 7.2 s\n",
"Wall time: 20.6 s\n"
]
},
{
"data": {
"text/html": [
"<div><svg style=\"position: absolute; width: 0; height: 0; overflow: hidden\">\n",
"<defs>\n",
"<symbol id=\"icon-database\" viewBox=\"0 0 32 32\">\n",
"<path d=\"M16 0c-8.837 0-16 2.239-16 5v4c0 2.761 7.163 5 16 5s16-2.239 16-5v-4c0-2.761-7.163-5-16-5z\"></path>\n",
"<path d=\"M16 17c-8.837 0-16-2.239-16-5v6c0 2.761 7.163 5 16 5s16-2.239 16-5v-6c0 2.761-7.163 5-16 5z\"></path>\n",
"<path d=\"M16 26c-8.837 0-16-2.239-16-5v6c0 2.761 7.163 5 16 5s16-2.239 16-5v-6c0 2.761-7.163 5-16 5z\"></path>\n",
"</symbol>\n",
"<symbol id=\"icon-file-text2\" viewBox=\"0 0 32 32\">\n",
"<path d=\"M28.681 7.159c-0.694-0.947-1.662-2.053-2.724-3.116s-2.169-2.030-3.116-2.724c-1.612-1.182-2.393-1.319-2.841-1.319h-15.5c-1.378 0-2.5 1.121-2.5 2.5v27c0 1.378 1.122 2.5 2.5 2.5h23c1.378 0 2.5-1.122 2.5-2.5v-19.5c0-0.448-0.137-1.23-1.319-2.841zM24.543 5.457c0.959 0.959 1.712 1.825 2.268 2.543h-4.811v-4.811c0.718 0.556 1.584 1.309 2.543 2.268zM28 29.5c0 0.271-0.229 0.5-0.5 0.5h-23c-0.271 0-0.5-0.229-0.5-0.5v-27c0-0.271 0.229-0.5 0.5-0.5 0 0 15.499-0 15.5 0v7c0 0.552 0.448 1 1 1h7v19.5z\"></path>\n",
"<path d=\"M23 26h-14c-0.552 0-1-0.448-1-1s0.448-1 1-1h14c0.552 0 1 0.448 1 1s-0.448 1-1 1z\"></path>\n",
"<path d=\"M23 22h-14c-0.552 0-1-0.448-1-1s0.448-1 1-1h14c0.552 0 1 0.448 1 1s-0.448 1-1 1z\"></path>\n",
"<path d=\"M23 18h-14c-0.552 0-1-0.448-1-1s0.448-1 1-1h14c0.552 0 1 0.448 1 1s-0.448 1-1 1z\"></path>\n",
"</symbol>\n",
"</defs>\n",
"</svg>\n",
"<style>/* CSS stylesheet for displaying xarray objects in jupyterlab.\n",
" *\n",
" */\n",
"\n",
":root {\n",
" --xr-font-color0: var(--jp-content-font-color0, rgba(0, 0, 0, 1));\n",
" --xr-font-color2: var(--jp-content-font-color2, rgba(0, 0, 0, 0.54));\n",
" --xr-font-color3: var(--jp-content-font-color3, rgba(0, 0, 0, 0.38));\n",
" --xr-border-color: var(--jp-border-color2, #e0e0e0);\n",
" --xr-disabled-color: var(--jp-layout-color3, #bdbdbd);\n",
" --xr-background-color: var(--jp-layout-color0, white);\n",
" --xr-background-color-row-even: var(--jp-layout-color1, white);\n",
" --xr-background-color-row-odd: var(--jp-layout-color2, #eeeeee);\n",
"}\n",
"\n",
"html[theme=dark],\n",
"body[data-theme=dark],\n",
"body.vscode-dark {\n",
" --xr-font-color0: rgba(255, 255, 255, 1);\n",
" --xr-font-color2: rgba(255, 255, 255, 0.54);\n",
" --xr-font-color3: rgba(255, 255, 255, 0.38);\n",
" --xr-border-color: #1F1F1F;\n",
" --xr-disabled-color: #515151;\n",
" --xr-background-color: #111111;\n",
" --xr-background-color-row-even: #111111;\n",
" --xr-background-color-row-odd: #313131;\n",
"}\n",
"\n",
".xr-wrap {\n",
" display: block !important;\n",
" min-width: 300px;\n",
" max-width: 700px;\n",
"}\n",
"\n",
".xr-text-repr-fallback {\n",
" /* fallback to plain text repr when CSS is not injected (untrusted notebook) */\n",
" display: none;\n",
"}\n",
"\n",
".xr-header {\n",
" padding-top: 6px;\n",
" padding-bottom: 6px;\n",
" margin-bottom: 4px;\n",
" border-bottom: solid 1px var(--xr-border-color);\n",
"}\n",
"\n",
".xr-header > div,\n",
".xr-header > ul {\n",
" display: inline;\n",
" margin-top: 0;\n",
" margin-bottom: 0;\n",
"}\n",
"\n",
".xr-obj-type,\n",
".xr-array-name {\n",
" margin-left: 2px;\n",
" margin-right: 10px;\n",
"}\n",
"\n",
".xr-obj-type {\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-sections {\n",
" padding-left: 0 !important;\n",
" display: grid;\n",
" grid-template-columns: 150px auto auto 1fr 20px 20px;\n",
"}\n",
"\n",
".xr-section-item {\n",
" display: contents;\n",
"}\n",
"\n",
".xr-section-item input {\n",
" display: none;\n",
"}\n",
"\n",
".xr-section-item input + label {\n",
" color: var(--xr-disabled-color);\n",
"}\n",
"\n",
".xr-section-item input:enabled + label {\n",
" cursor: pointer;\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-section-item input:enabled + label:hover {\n",
" color: var(--xr-font-color0);\n",
"}\n",
"\n",
".xr-section-summary {\n",
" grid-column: 1;\n",
" color: var(--xr-font-color2);\n",
" font-weight: 500;\n",
"}\n",
"\n",
".xr-section-summary > span {\n",
" display: inline-block;\n",
" padding-left: 0.5em;\n",
"}\n",
"\n",
".xr-section-summary-in:disabled + label {\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-section-summary-in + label:before {\n",
" display: inline-block;\n",
" content: '►';\n",
" font-size: 11px;\n",
" width: 15px;\n",
" text-align: center;\n",
"}\n",
"\n",
".xr-section-summary-in:disabled + label:before {\n",
" color: var(--xr-disabled-color);\n",
"}\n",
"\n",
".xr-section-summary-in:checked + label:before {\n",
" content: '▼';\n",
"}\n",
"\n",
".xr-section-summary-in:checked + label > span {\n",
" display: none;\n",
"}\n",
"\n",
".xr-section-summary,\n",
".xr-section-inline-details {\n",
" padding-top: 4px;\n",
" padding-bottom: 4px;\n",
"}\n",
"\n",
".xr-section-inline-details {\n",
" grid-column: 2 / -1;\n",
"}\n",
"\n",
".xr-section-details {\n",
" display: none;\n",
" grid-column: 1 / -1;\n",
" margin-bottom: 5px;\n",
"}\n",
"\n",
".xr-section-summary-in:checked ~ .xr-section-details {\n",
" display: contents;\n",
"}\n",
"\n",
".xr-array-wrap {\n",
" grid-column: 1 / -1;\n",
" display: grid;\n",
" grid-template-columns: 20px auto;\n",
"}\n",
"\n",
".xr-array-wrap > label {\n",
" grid-column: 1;\n",
" vertical-align: top;\n",
"}\n",
"\n",
".xr-preview {\n",
" color: var(--xr-font-color3);\n",
"}\n",
"\n",
".xr-array-preview,\n",
".xr-array-data {\n",
" padding: 0 5px !important;\n",
" grid-column: 2;\n",
"}\n",
"\n",
".xr-array-data,\n",
".xr-array-in:checked ~ .xr-array-preview {\n",
" display: none;\n",
"}\n",
"\n",
".xr-array-in:checked ~ .xr-array-data,\n",
".xr-array-preview {\n",
" display: inline-block;\n",
"}\n",
"\n",
".xr-dim-list {\n",
" display: inline-block !important;\n",
" list-style: none;\n",
" padding: 0 !important;\n",
" margin: 0;\n",
"}\n",
"\n",
".xr-dim-list li {\n",
" display: inline-block;\n",
" padding: 0;\n",
" margin: 0;\n",
"}\n",
"\n",
".xr-dim-list:before {\n",
" content: '(';\n",
"}\n",
"\n",
".xr-dim-list:after {\n",
" content: ')';\n",
"}\n",
"\n",
".xr-dim-list li:not(:last-child):after {\n",
" content: ',';\n",
" padding-right: 5px;\n",
"}\n",
"\n",
".xr-has-index {\n",
" font-weight: bold;\n",
"}\n",
"\n",
".xr-var-list,\n",
".xr-var-item {\n",
" display: contents;\n",
"}\n",
"\n",
".xr-var-item > div,\n",
".xr-var-item label,\n",
".xr-var-item > .xr-var-name span {\n",
" background-color: var(--xr-background-color-row-even);\n",
" margin-bottom: 0;\n",
"}\n",
"\n",
".xr-var-item > .xr-var-name:hover span {\n",
" padding-right: 5px;\n",
"}\n",
"\n",
".xr-var-list > li:nth-child(odd) > div,\n",
".xr-var-list > li:nth-child(odd) > label,\n",
".xr-var-list > li:nth-child(odd) > .xr-var-name span {\n",
" background-color: var(--xr-background-color-row-odd);\n",
"}\n",
"\n",
".xr-var-name {\n",
" grid-column: 1;\n",
"}\n",
"\n",
".xr-var-dims {\n",
" grid-column: 2;\n",
"}\n",
"\n",
".xr-var-dtype {\n",
" grid-column: 3;\n",
" text-align: right;\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-var-preview {\n",
" grid-column: 4;\n",
"}\n",
"\n",
".xr-index-preview {\n",
" grid-column: 2 / 5;\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-var-name,\n",
".xr-var-dims,\n",
".xr-var-dtype,\n",
".xr-preview,\n",
".xr-attrs dt {\n",
" white-space: nowrap;\n",
" overflow: hidden;\n",
" text-overflow: ellipsis;\n",
" padding-right: 10px;\n",
"}\n",
"\n",
".xr-var-name:hover,\n",
".xr-var-dims:hover,\n",
".xr-var-dtype:hover,\n",
".xr-attrs dt:hover {\n",
" overflow: visible;\n",
" width: auto;\n",
" z-index: 1;\n",
"}\n",
"\n",
".xr-var-attrs,\n",
".xr-var-data,\n",
".xr-index-data {\n",
" display: none;\n",
" background-color: var(--xr-background-color) !important;\n",
" padding-bottom: 5px !important;\n",
"}\n",
"\n",
".xr-var-attrs-in:checked ~ .xr-var-attrs,\n",
".xr-var-data-in:checked ~ .xr-var-data,\n",
".xr-index-data-in:checked ~ .xr-index-data {\n",
" display: block;\n",
"}\n",
"\n",
".xr-var-data > table {\n",
" float: right;\n",
"}\n",
"\n",
".xr-var-name span,\n",
".xr-var-data,\n",
".xr-index-name div,\n",
".xr-index-data,\n",
".xr-attrs {\n",
" padding-left: 25px !important;\n",
"}\n",
"\n",
".xr-attrs,\n",
".xr-var-attrs,\n",
".xr-var-data,\n",
".xr-index-data {\n",
" grid-column: 1 / -1;\n",
"}\n",
"\n",
"dl.xr-attrs {\n",
" padding: 0;\n",
" margin: 0;\n",
" display: grid;\n",
" grid-template-columns: 125px auto;\n",
"}\n",
"\n",
".xr-attrs dt,\n",
".xr-attrs dd {\n",
" padding: 0;\n",
" margin: 0;\n",
" float: left;\n",
" padding-right: 10px;\n",
" width: auto;\n",
"}\n",
"\n",
".xr-attrs dt {\n",
" font-weight: normal;\n",
" grid-column: 1;\n",
"}\n",
"\n",
".xr-attrs dt:hover span {\n",
" display: inline-block;\n",
" background: var(--xr-background-color);\n",
" padding-right: 10px;\n",
"}\n",
"\n",
".xr-attrs dd {\n",
" grid-column: 2;\n",
" white-space: pre-wrap;\n",
" word-break: break-all;\n",
"}\n",
"\n",
".xr-icon-database,\n",
".xr-icon-file-text2,\n",
".xr-no-icon {\n",
" display: inline-block;\n",
" vertical-align: middle;\n",
" width: 1em;\n",
" height: 1.5em !important;\n",
" stroke-width: 0;\n",
" stroke: currentColor;\n",
" fill: currentColor;\n",
"}\n",
"</style><pre class='xr-text-repr-fallback'>&lt;xarray.Dataset&gt;\n",
"Dimensions: (delta_time: 73765028, ds_surf_type: 5)\n",
"Coordinates:\n",
" * delta_time (delta_time) datetime64[ns] 2019-02-19T14:08:08.557345384...\n",
" lat_ph (delta_time) float64 ...\n",
" lon_ph (delta_time) float64 ...\n",
"Dimensions without coordinates: ds_surf_type\n",
"Data variables:\n",
" dist_ph_across (delta_time) float32 ...\n",
" dist_ph_along (delta_time) float32 ...\n",
" h_ph (delta_time) float32 ...\n",
" pce_mframe_cnt (delta_time) uint32 ...\n",
" ph_id_channel (delta_time) uint8 ...\n",
" ph_id_count (delta_time) uint8 ...\n",
" ph_id_pulse (delta_time) uint8 ...\n",
" quality_ph (delta_time) int8 ...\n",
" signal_conf_ph (delta_time, ds_surf_type) int8 ...\n",
" weight_ph (delta_time) uint8 ...\n",
"Attributes:\n",
" Description: Contains arrays of the parameters for each received photon.\n",
" data_rate: Data are stored at the photon detection rate.</pre><div class='xr-wrap' style='display:none'><div class='xr-header'><div class='xr-obj-type'>xarray.Dataset</div></div><ul class='xr-sections'><li class='xr-section-item'><input id='section-31cf86d7-ee77-41b3-9cb5-6a197e2a1b59' class='xr-section-summary-in' type='checkbox' disabled ><label for='section-31cf86d7-ee77-41b3-9cb5-6a197e2a1b59' class='xr-section-summary' title='Expand/collapse section'>Dimensions:</label><div class='xr-section-inline-details'><ul class='xr-dim-list'><li><span class='xr-has-index'>delta_time</span>: 73765028</li><li><span>ds_surf_type</span>: 5</li></ul></div><div class='xr-section-details'></div></li><li class='xr-section-item'><input id='section-2e66e781-01ee-4c97-a518-81ae6bb1cc5b' class='xr-section-summary-in' type='checkbox' checked><label for='section-2e66e781-01ee-4c97-a518-81ae6bb1cc5b' class='xr-section-summary' >Coordinates: <span>(3)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><ul class='xr-var-list'><li class='xr-var-item'><div class='xr-var-name'><span class='xr-has-index'>delta_time</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>datetime64[ns]</div><div class='xr-var-preview xr-preview'>2019-02-19T14:08:08.557345384 .....</div><input id='attrs-e147b8fb-976e-4f37-a6e2-1faec259050e' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-e147b8fb-976e-4f37-a6e2-1faec259050e' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-91df7df9-46bc-4c62-ad22-5b7bd60156a7' class='xr-var-data-in' type='checkbox'><label for='data-91df7df9-46bc-4c62-ad22-5b7bd60156a7' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>referenceInformation</dd><dt><span>description 
:</span></dt><dd>The transmit time of a given photon, measured in seconds from the ATLAS Standard Data Product Epoch. Note that multiple received photons associated with a single transmit pulse will have the same delta_time. The ATLAS Standard Data Products (SDP) epoch offset is defined within /ancillary_data/atlas_sdp_gps_epoch as the number of GPS seconds between the GPS epoch (1980-01-06T00:00:00.000000Z UTC) and the ATLAS SDP epoch. By adding the offset contained within atlas_sdp_gps_epoch to delta time parameters, the time in gps_seconds relative to the GPS epoch can be computed.</dd><dt><span>long_name :</span></dt><dd>Elapsed GPS seconds</dd><dt><span>source :</span></dt><dd>Operations</dd><dt><span>standard_name :</span></dt><dd>time</dd></dl></div><div class='xr-var-data'><pre>array([&#x27;2019-02-19T14:08:08.557345384&#x27;, &#x27;2019-02-19T14:08:08.557345384&#x27;,\n",
" &#x27;2019-02-19T14:08:08.557345384&#x27;, ..., &#x27;2019-02-19T14:15:49.416156048&#x27;,\n",
" &#x27;2019-02-19T14:15:49.416156048&#x27;, &#x27;2019-02-19T14:15:49.416156048&#x27;],\n",
" dtype=&#x27;datetime64[ns]&#x27;)</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>lat_ph</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>float64</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-ed70b6f7-f566-4f26-b6dd-f71ecb9e2416' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-ed70b6f7-f566-4f26-b6dd-f71ecb9e2416' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-1529015c-5b42-42cd-b94a-74b6c81c5739' class='xr-var-data-in' type='checkbox'><label for='data-1529015c-5b42-42cd-b94a-74b6c81c5739' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>modelResult</dd><dt><span>description :</span></dt><dd>Latitude of each received photon. Computed from the ECF Cartesian coordinates of the bounce point.</dd><dt><span>long_name :</span></dt><dd>Latitude</dd><dt><span>source :</span></dt><dd>ATL03g ATBD, Section 3.4</dd><dt><span>standard_name :</span></dt><dd>latitude</dd><dt><span>units :</span></dt><dd>degrees_north</dd><dt><span>valid_max :</span></dt><dd>90.0</dd><dt><span>valid_min :</span></dt><dd>-90.0</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=float64]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>lon_ph</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>float64</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-e5e9476a-cf76-41d6-bf3b-d5b5b22337c6' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-e5e9476a-cf76-41d6-bf3b-d5b5b22337c6' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-e0601dd2-919b-41cb-8e40-f93e1eb261da' class='xr-var-data-in' 
type='checkbox'><label for='data-e0601dd2-919b-41cb-8e40-f93e1eb261da' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>modelResult</dd><dt><span>description :</span></dt><dd>Longitude of each received photon. Computed from the ECF Cartesian coordinates of the bounce point.</dd><dt><span>long_name :</span></dt><dd>Longitude</dd><dt><span>source :</span></dt><dd>ATL03g ATBD, Section 3.4</dd><dt><span>standard_name :</span></dt><dd>longitude</dd><dt><span>units :</span></dt><dd>degrees_east</dd><dt><span>valid_max :</span></dt><dd>180.0</dd><dt><span>valid_min :</span></dt><dd>-180.0</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=float64]</pre></div></li></ul></div></li><li class='xr-section-item'><input id='section-1e517cf8-22f1-455c-a045-049af5872f5b' class='xr-section-summary-in' type='checkbox' checked><label for='section-1e517cf8-22f1-455c-a045-049af5872f5b' class='xr-section-summary' >Data variables: <span>(10)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><ul class='xr-var-list'><li class='xr-var-item'><div class='xr-var-name'><span>dist_ph_across</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>float32</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-886141b3-4b54-4550-bb83-29ec0488c0c9' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-886141b3-4b54-4550-bb83-29ec0488c0c9' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-e91f01c9-ed48-46d6-b84e-6b17f42c12d1' class='xr-var-data-in' type='checkbox'><label for='data-e91f01c9-ed48-46d6-b84e-6b17f42c12d1' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div 
class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>modelResult</dd><dt><span>description :</span></dt><dd>Across-track distance projected to the ellipsoid of the received photon from the reference ground track. This is based on the Along-Track Segment algorithm described in Section 3.1.</dd><dt><span>long_name :</span></dt><dd>Distance off RGT.</dd><dt><span>source :</span></dt><dd>ATL03 ATBD, Section 3.1</dd><dt><span>units :</span></dt><dd>meters</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=float32]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>dist_ph_along</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>float32</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-8c4724c5-3260-4e64-865b-580b9433cb20' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-8c4724c5-3260-4e64-865b-580b9433cb20' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-33721089-23e5-4aaa-aa07-0b6e45030d8d' class='xr-var-data-in' type='checkbox'><label for='data-33721089-23e5-4aaa-aa07-0b6e45030d8d' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>modelResult</dd><dt><span>description :</span></dt><dd>Along-track distance in a segment projected to the ellipsoid of the received photon, based on the Along-Track Segment algorithm. 
Total along track distance can be found by adding this value to the sum of segment lengths measured from the start of the most recent reference groundtrack.</dd><dt><span>long_name :</span></dt><dd>Distance from equator crossing.</dd><dt><span>source :</span></dt><dd>ATL03 ATBD, Section 3.1</dd><dt><span>units :</span></dt><dd>meters</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=float32]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>h_ph</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>float32</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-586a17db-7c35-47c2-9e86-21206202fdb7' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-586a17db-7c35-47c2-9e86-21206202fdb7' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-15c16215-12de-47ab-a6b2-c7cf188080aa' class='xr-var-data-in' type='checkbox'><label for='data-15c16215-12de-47ab-a6b2-c7cf188080aa' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>physicalMeasurement</dd><dt><span>description :</span></dt><dd>Height of each received photon, relative to the WGS-84 ellipsoid including the geophysical corrections noted in Section 6. 
Please note that neither the geoid, ocean tide nor the dynamic atmosphere (DAC) corrections are applied to the ellipsoidal heights.</dd><dt><span>long_name :</span></dt><dd>Photon WGS84 Height</dd><dt><span>source :</span></dt><dd>ATL03g ATBD, Section 3.4</dd><dt><span>standard_name :</span></dt><dd>height</dd><dt><span>units :</span></dt><dd>meters</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=float32]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>pce_mframe_cnt</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>uint32</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-67956705-c41c-4e6c-b6e5-d9013904aac9' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-67956705-c41c-4e6c-b6e5-d9013904aac9' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-6ec104c0-a876-4e77-9674-e09c9d1fd7db' class='xr-var-data-in' type='checkbox'><label for='data-6ec104c0-a876-4e77-9674-e09c9d1fd7db' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>referenceInformation</dd><dt><span>description :</span></dt><dd>The major frame counter is read from the digital flow controller in a given PCE card. The counter identifies individual major frames across diag and science packets. 
Used as part of the photon ID.</dd><dt><span>long_name :</span></dt><dd>PCE Major frame counter</dd><dt><span>source :</span></dt><dd>Retained from prior a_alt_science_ph packet</dd><dt><span>units :</span></dt><dd>counts</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=uint32]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>ph_id_channel</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>uint8</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-8d0497f8-971c-421a-ba7c-18558b1086f3' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-8d0497f8-971c-421a-ba7c-18558b1086f3' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-d8320e49-6e4c-4df3-be70-79c3cbc57415' class='xr-var-data-in' type='checkbox'><label for='data-d8320e49-6e4c-4df3-be70-79c3cbc57415' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>referenceInformation</dd><dt><span>description :</span></dt><dd>Channel number assigned for each received photon event. This is part of the photon ID. Values range from 1 to 120 to span all channels and rise/fall edges. Values 1 to 60 are for falling edge; PCE1 (1 to 20), PCE 2 (21 to 40) and PCE3 (41 to 60). 
Values 61 to 120 are for rising edge; PCE1 (61 to 80), PCE 2 (81 to 100) and PC3 (101 to 120).</dd><dt><span>long_name :</span></dt><dd>Receive channel id</dd><dt><span>source :</span></dt><dd>Derived as part of Photon ID</dd><dt><span>units :</span></dt><dd>1</dd><dt><span>valid_max :</span></dt><dd>120</dd><dt><span>valid_min :</span></dt><dd>1</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=uint8]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>ph_id_count</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>uint8</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-071a48a2-b5ac-4579-96ef-090cbbce9f8f' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-071a48a2-b5ac-4579-96ef-090cbbce9f8f' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-095b01f6-7586-4681-8f10-56a59f3f7e5c' class='xr-var-data-in' type='checkbox'><label for='data-095b01f6-7586-4681-8f10-56a59f3f7e5c' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>referenceInformation</dd><dt><span>description :</span></dt><dd>The photon event counter is part of photon ID and counts from 1 for each channel until reset by laser pulse counter.</dd><dt><span>long_name :</span></dt><dd>photon event counter</dd><dt><span>source :</span></dt><dd>Derived as part of Photon ID</dd><dt><span>units :</span></dt><dd>counts</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=uint8]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>ph_id_pulse</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>uint8</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-9fd5d954-7985-456a-b18d-6727aeb0e006' 
class='xr-var-attrs-in' type='checkbox' ><label for='attrs-9fd5d954-7985-456a-b18d-6727aeb0e006' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-1f4cead2-18b5-45a8-9765-ce54f67956f4' class='xr-var-data-in' type='checkbox'><label for='data-1f4cead2-18b5-45a8-9765-ce54f67956f4' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>referenceInformation</dd><dt><span>description :</span></dt><dd>The laser pulse counter is part of photon ID and counts from 1 to 200 and is reset for each new major frame.</dd><dt><span>long_name :</span></dt><dd>laser pulse counter</dd><dt><span>source :</span></dt><dd>Derived as part of Photon ID</dd><dt><span>units :</span></dt><dd>counts</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=uint8]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>quality_ph</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>int8</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-1af6dd94-dc6a-464a-bb79-58006788862e' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-1af6dd94-dc6a-464a-bb79-58006788862e' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-d542ec89-e4f5-44f6-9541-cbf7203ec5e4' class='xr-var-data-in' type='checkbox'><label for='data-d542ec89-e4f5-44f6-9541-cbf7203ec5e4' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>qualityInformation</dd><dt><span>description :</span></dt><dd>Indicates the quality of the associated photon. 
0=nominal, 1=possible_afterpulse, 2=possible_impulse_response_effect, 3=possible_tep. Use this flag in conjunction with signal_conf_ph to identify those photons that are likely noise or likely signal.</dd><dt><span>flag_meanings :</span></dt><dd>nominal possible_afterpulse possible_impulse_response_effect possible_tep</dd><dt><span>flag_values :</span></dt><dd>[0 1 2 3]</dd><dt><span>long_name :</span></dt><dd>Photon Quality</dd><dt><span>source :</span></dt><dd>ATL03 ATBD</dd><dt><span>units :</span></dt><dd>1</dd><dt><span>valid_max :</span></dt><dd>3</dd><dt><span>valid_min :</span></dt><dd>0</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=int8]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>signal_conf_ph</span></div><div class='xr-var-dims'>(delta_time, ds_surf_type)</div><div class='xr-var-dtype'>int8</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-4e8a2984-01b3-45c3-b996-9e082978d1ee' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-4e8a2984-01b3-45c3-b996-9e082978d1ee' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-af56990c-398c-49b2-87f2-e72435c0f1b1' class='xr-var-data-in' type='checkbox'><label for='data-af56990c-398c-49b2-87f2-e72435c0f1b1' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>qualityInformation</dd><dt><span>description :</span></dt><dd>Confidence level associated with each photon event selected as signal. 0=noise. 1=added to allow for buffer but algorithm classifies as background; 2=low; 3=med; 4=high). This parameter is a 5xN array where N is the number of photons in the granule, and the 5 rows indicate signal finding for each surface type (in order: land, ocean, sea ice, land ice and inland water). 
Events not associated with a specific surface type have a confidence level of -1. Events evaluated as TEP returns have a confidence level of -2.</dd><dt><span>flag_meanings :</span></dt><dd>possible_tep not_considered noise buffer low medium high</dd><dt><span>flag_values :</span></dt><dd>[-2 -1 0 1 2 3 4]</dd><dt><span>long_name :</span></dt><dd>Photon Signal Confidence</dd><dt><span>source :</span></dt><dd>ATL03 ATBD, Section 5, Conf</dd><dt><span>units :</span></dt><dd>1</dd><dt><span>valid_max :</span></dt><dd>4</dd><dt><span>valid_min :</span></dt><dd>-2</dd></dl></div><div class='xr-var-data'><pre>[368825140 values with dtype=int8]</pre></div></li><li class='xr-var-item'><div class='xr-var-name'><span>weight_ph</span></div><div class='xr-var-dims'>(delta_time)</div><div class='xr-var-dtype'>uint8</div><div class='xr-var-preview xr-preview'>...</div><input id='attrs-67192baa-45f6-48a5-8c52-826db3407d9a' class='xr-var-attrs-in' type='checkbox' ><label for='attrs-67192baa-45f6-48a5-8c52-826db3407d9a' title='Show/Hide attributes'><svg class='icon xr-icon-file-text2'><use xlink:href='#icon-file-text2'></use></svg></label><input id='data-55a37534-e4d1-4494-9734-32cdfc8a56a6' class='xr-var-data-in' type='checkbox'><label for='data-55a37534-e4d1-4494-9734-32cdfc8a56a6' title='Show/Hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-var-attrs'><dl class='xr-attrs'><dt><span>contentType :</span></dt><dd>modelResult</dd><dt><span>description :</span></dt><dd>Computed weight of each photon. The weight is calculated by a windowed KNN algorithm using the distances between each photon and its K nearest neighbors. 
Values range from 0 to 255 where 255 is the most heavily weighted photon and would be considered likely signal.</dd><dt><span>long_name :</span></dt><dd>Photon weight</dd><dt><span>source :</span></dt><dd>ATBD Section 5</dd><dt><span>units :</span></dt><dd>1</dd><dt><span>valid_max :</span></dt><dd>255</dd><dt><span>valid_min :</span></dt><dd>0</dd></dl></div><div class='xr-var-data'><pre>[73765028 values with dtype=uint8]</pre></div></li></ul></div></li><li class='xr-section-item'><input id='section-96d85cf2-2902-441f-8efb-3c94c2d06e71' class='xr-section-summary-in' type='checkbox' ><label for='section-96d85cf2-2902-441f-8efb-3c94c2d06e71' class='xr-section-summary' >Indexes: <span>(1)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><ul class='xr-var-list'><li class='xr-var-item'><div class='xr-index-name'><div>delta_time</div></div><div class='xr-index-preview'>PandasIndex</div><div></div><input id='index-6efda7b7-c11d-4242-9b30-8dc3296dc6d5' class='xr-index-data-in' type='checkbox'/><label for='index-6efda7b7-c11d-4242-9b30-8dc3296dc6d5' title='Show/Hide index repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-index-data'><pre>PandasIndex(DatetimeIndex([&#x27;2019-02-19 14:08:08.557345384&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557345384&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557345384&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557345384&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557345384&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557445384&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557545388&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557545388&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557545388&#x27;,\n",
" &#x27;2019-02-19 14:08:08.557545388&#x27;,\n",
" ...\n",
" &#x27;2019-02-19 14:15:49.416056052&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416056052&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416056052&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416056052&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416156048&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416156048&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416156048&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416156048&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416156048&#x27;,\n",
" &#x27;2019-02-19 14:15:49.416156048&#x27;],\n",
" dtype=&#x27;datetime64[ns]&#x27;, name=&#x27;delta_time&#x27;, length=73765028, freq=None))</pre></div></li></ul></div></li><li class='xr-section-item'><input id='section-afb124f7-8865-4d67-9de3-15b770742fa0' class='xr-section-summary-in' type='checkbox' checked><label for='section-afb124f7-8865-4d67-9de3-15b770742fa0' class='xr-section-summary' >Attributes: <span>(2)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><dl class='xr-attrs'><dt><span>Description :</span></dt><dd>Contains arrays of the parameters for each received photon.</dd><dt><span>data_rate :</span></dt><dd>Data are stored at the photon detection rate.</dd></dl></div></li></ul></div></div>"
],
"text/plain": [
"<xarray.Dataset>\n",
"Dimensions: (delta_time: 73765028, ds_surf_type: 5)\n",
"Coordinates:\n",
" * delta_time (delta_time) datetime64[ns] 2019-02-19T14:08:08.557345384...\n",
" lat_ph (delta_time) float64 ...\n",
" lon_ph (delta_time) float64 ...\n",
"Dimensions without coordinates: ds_surf_type\n",
"Data variables:\n",
" dist_ph_across (delta_time) float32 ...\n",
" dist_ph_along (delta_time) float32 ...\n",
" h_ph (delta_time) float32 ...\n",
" pce_mframe_cnt (delta_time) uint32 ...\n",
" ph_id_channel (delta_time) uint8 ...\n",
" ph_id_count (delta_time) uint8 ...\n",
" ph_id_pulse (delta_time) uint8 ...\n",
" quality_ph (delta_time) int8 ...\n",
" signal_conf_ph (delta_time, ds_surf_type) int8 ...\n",
" weight_ph (delta_time) uint8 ...\n",
"Attributes:\n",
" Description: Contains arrays of the parameters for each received photon.\n",
" data_rate: Data are stored at the photon detection rate."
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"# this one is different! you can try this at home (cloud optimized HDF5!)\n",
"\n",
"io_params = {\n",
" \"fsspec_params\": {\n",
" # \"skip_instance_cache\": True\n",
" \"cache_type\": \"blockcache\", # or \"first\" with enough space\n",
" \"block_size\": 8*1024*1024 # could be bigger\n",
" },\n",
" \"h5py_params\" : {\n",
" \"driver_kwds\": { # passing these through requires recent versions of xarray and h5netcdf\n",
" \"page_buf_size\": 32*1024*1024, # HDF5 page buffer size in bytes; only works on files repacked with paged aggregation\n",
" \"rdcc_nbytes\": 8*1024*1024 # raw data chunk cache size in bytes, used when reading chunks\n",
" }\n",
"\n",
" }\n",
"}\n",
"ds = xr.open_dataset(fs.open(cloud_optimized, **io_params[\"fsspec_params\"]),\n",
" group=\"/gt1l/heights\",\n",
" engine=\"h5netcdf\",\n",
" **io_params[\"h5py_params\"])\n",
"ds"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "da959721-2f9d-4151-b361-6f9f38fa5b8c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 11 s, sys: 2.02 s, total: 13 s\n",
"Wall time: 1min 25s\n"
]
},
{
"data": {
"text/html": [
"<div><svg style=\"position: absolute; width: 0; height: 0; overflow: hidden\">\n",
"<defs>\n",
"<symbol id=\"icon-database\" viewBox=\"0 0 32 32\">\n",
"<path d=\"M16 0c-8.837 0-16 2.239-16 5v4c0 2.761 7.163 5 16 5s16-2.239 16-5v-4c0-2.761-7.163-5-16-5z\"></path>\n",
"<path d=\"M16 17c-8.837 0-16-2.239-16-5v6c0 2.761 7.163 5 16 5s16-2.239 16-5v-6c0 2.761-7.163 5-16 5z\"></path>\n",
"<path d=\"M16 26c-8.837 0-16-2.239-16-5v6c0 2.761 7.163 5 16 5s16-2.239 16-5v-6c0 2.761-7.163 5-16 5z\"></path>\n",
"</symbol>\n",
"<symbol id=\"icon-file-text2\" viewBox=\"0 0 32 32\">\n",
"<path d=\"M28.681 7.159c-0.694-0.947-1.662-2.053-2.724-3.116s-2.169-2.030-3.116-2.724c-1.612-1.182-2.393-1.319-2.841-1.319h-15.5c-1.378 0-2.5 1.121-2.5 2.5v27c0 1.378 1.122 2.5 2.5 2.5h23c1.378 0 2.5-1.122 2.5-2.5v-19.5c0-0.448-0.137-1.23-1.319-2.841zM24.543 5.457c0.959 0.959 1.712 1.825 2.268 2.543h-4.811v-4.811c0.718 0.556 1.584 1.309 2.543 2.268zM28 29.5c0 0.271-0.229 0.5-0.5 0.5h-23c-0.271 0-0.5-0.229-0.5-0.5v-27c0-0.271 0.229-0.5 0.5-0.5 0 0 15.499-0 15.5 0v7c0 0.552 0.448 1 1 1h7v19.5z\"></path>\n",
"<path d=\"M23 26h-14c-0.552 0-1-0.448-1-1s0.448-1 1-1h14c0.552 0 1 0.448 1 1s-0.448 1-1 1z\"></path>\n",
"<path d=\"M23 22h-14c-0.552 0-1-0.448-1-1s0.448-1 1-1h14c0.552 0 1 0.448 1 1s-0.448 1-1 1z\"></path>\n",
"<path d=\"M23 18h-14c-0.552 0-1-0.448-1-1s0.448-1 1-1h14c0.552 0 1 0.448 1 1s-0.448 1-1 1z\"></path>\n",
"</symbol>\n",
"</defs>\n",
"</svg>\n",
"<style>/* CSS stylesheet for displaying xarray objects in jupyterlab.\n",
" *\n",
" */\n",
"\n",
":root {\n",
" --xr-font-color0: var(--jp-content-font-color0, rgba(0, 0, 0, 1));\n",
" --xr-font-color2: var(--jp-content-font-color2, rgba(0, 0, 0, 0.54));\n",
" --xr-font-color3: var(--jp-content-font-color3, rgba(0, 0, 0, 0.38));\n",
" --xr-border-color: var(--jp-border-color2, #e0e0e0);\n",
" --xr-disabled-color: var(--jp-layout-color3, #bdbdbd);\n",
" --xr-background-color: var(--jp-layout-color0, white);\n",
" --xr-background-color-row-even: var(--jp-layout-color1, white);\n",
" --xr-background-color-row-odd: var(--jp-layout-color2, #eeeeee);\n",
"}\n",
"\n",
"html[theme=dark],\n",
"body[data-theme=dark],\n",
"body.vscode-dark {\n",
" --xr-font-color0: rgba(255, 255, 255, 1);\n",
" --xr-font-color2: rgba(255, 255, 255, 0.54);\n",
" --xr-font-color3: rgba(255, 255, 255, 0.38);\n",
" --xr-border-color: #1F1F1F;\n",
" --xr-disabled-color: #515151;\n",
" --xr-background-color: #111111;\n",
" --xr-background-color-row-even: #111111;\n",
" --xr-background-color-row-odd: #313131;\n",
"}\n",
"\n",
".xr-wrap {\n",
" display: block !important;\n",
" min-width: 300px;\n",
" max-width: 700px;\n",
"}\n",
"\n",
".xr-text-repr-fallback {\n",
" /* fallback to plain text repr when CSS is not injected (untrusted notebook) */\n",
" display: none;\n",
"}\n",
"\n",
".xr-header {\n",
" padding-top: 6px;\n",
" padding-bottom: 6px;\n",
" margin-bottom: 4px;\n",
" border-bottom: solid 1px var(--xr-border-color);\n",
"}\n",
"\n",
".xr-header > div,\n",
".xr-header > ul {\n",
" display: inline;\n",
" margin-top: 0;\n",
" margin-bottom: 0;\n",
"}\n",
"\n",
".xr-obj-type,\n",
".xr-array-name {\n",
" margin-left: 2px;\n",
" margin-right: 10px;\n",
"}\n",
"\n",
".xr-obj-type {\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-sections {\n",
" padding-left: 0 !important;\n",
" display: grid;\n",
" grid-template-columns: 150px auto auto 1fr 20px 20px;\n",
"}\n",
"\n",
".xr-section-item {\n",
" display: contents;\n",
"}\n",
"\n",
".xr-section-item input {\n",
" display: none;\n",
"}\n",
"\n",
".xr-section-item input + label {\n",
" color: var(--xr-disabled-color);\n",
"}\n",
"\n",
".xr-section-item input:enabled + label {\n",
" cursor: pointer;\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-section-item input:enabled + label:hover {\n",
" color: var(--xr-font-color0);\n",
"}\n",
"\n",
".xr-section-summary {\n",
" grid-column: 1;\n",
" color: var(--xr-font-color2);\n",
" font-weight: 500;\n",
"}\n",
"\n",
".xr-section-summary > span {\n",
" display: inline-block;\n",
" padding-left: 0.5em;\n",
"}\n",
"\n",
".xr-section-summary-in:disabled + label {\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-section-summary-in + label:before {\n",
" display: inline-block;\n",
" content: '►';\n",
" font-size: 11px;\n",
" width: 15px;\n",
" text-align: center;\n",
"}\n",
"\n",
".xr-section-summary-in:disabled + label:before {\n",
" color: var(--xr-disabled-color);\n",
"}\n",
"\n",
".xr-section-summary-in:checked + label:before {\n",
" content: '▼';\n",
"}\n",
"\n",
".xr-section-summary-in:checked + label > span {\n",
" display: none;\n",
"}\n",
"\n",
".xr-section-summary,\n",
".xr-section-inline-details {\n",
" padding-top: 4px;\n",
" padding-bottom: 4px;\n",
"}\n",
"\n",
".xr-section-inline-details {\n",
" grid-column: 2 / -1;\n",
"}\n",
"\n",
".xr-section-details {\n",
" display: none;\n",
" grid-column: 1 / -1;\n",
" margin-bottom: 5px;\n",
"}\n",
"\n",
".xr-section-summary-in:checked ~ .xr-section-details {\n",
" display: contents;\n",
"}\n",
"\n",
".xr-array-wrap {\n",
" grid-column: 1 / -1;\n",
" display: grid;\n",
" grid-template-columns: 20px auto;\n",
"}\n",
"\n",
".xr-array-wrap > label {\n",
" grid-column: 1;\n",
" vertical-align: top;\n",
"}\n",
"\n",
".xr-preview {\n",
" color: var(--xr-font-color3);\n",
"}\n",
"\n",
".xr-array-preview,\n",
".xr-array-data {\n",
" padding: 0 5px !important;\n",
" grid-column: 2;\n",
"}\n",
"\n",
".xr-array-data,\n",
".xr-array-in:checked ~ .xr-array-preview {\n",
" display: none;\n",
"}\n",
"\n",
".xr-array-in:checked ~ .xr-array-data,\n",
".xr-array-preview {\n",
" display: inline-block;\n",
"}\n",
"\n",
".xr-dim-list {\n",
" display: inline-block !important;\n",
" list-style: none;\n",
" padding: 0 !important;\n",
" margin: 0;\n",
"}\n",
"\n",
".xr-dim-list li {\n",
" display: inline-block;\n",
" padding: 0;\n",
" margin: 0;\n",
"}\n",
"\n",
".xr-dim-list:before {\n",
" content: '(';\n",
"}\n",
"\n",
".xr-dim-list:after {\n",
" content: ')';\n",
"}\n",
"\n",
".xr-dim-list li:not(:last-child):after {\n",
" content: ',';\n",
" padding-right: 5px;\n",
"}\n",
"\n",
".xr-has-index {\n",
" font-weight: bold;\n",
"}\n",
"\n",
".xr-var-list,\n",
".xr-var-item {\n",
" display: contents;\n",
"}\n",
"\n",
".xr-var-item > div,\n",
".xr-var-item label,\n",
".xr-var-item > .xr-var-name span {\n",
" background-color: var(--xr-background-color-row-even);\n",
" margin-bottom: 0;\n",
"}\n",
"\n",
".xr-var-item > .xr-var-name:hover span {\n",
" padding-right: 5px;\n",
"}\n",
"\n",
".xr-var-list > li:nth-child(odd) > div,\n",
".xr-var-list > li:nth-child(odd) > label,\n",
".xr-var-list > li:nth-child(odd) > .xr-var-name span {\n",
" background-color: var(--xr-background-color-row-odd);\n",
"}\n",
"\n",
".xr-var-name {\n",
" grid-column: 1;\n",
"}\n",
"\n",
".xr-var-dims {\n",
" grid-column: 2;\n",
"}\n",
"\n",
".xr-var-dtype {\n",
" grid-column: 3;\n",
" text-align: right;\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-var-preview {\n",
" grid-column: 4;\n",
"}\n",
"\n",
".xr-index-preview {\n",
" grid-column: 2 / 5;\n",
" color: var(--xr-font-color2);\n",
"}\n",
"\n",
".xr-var-name,\n",
".xr-var-dims,\n",
".xr-var-dtype,\n",
".xr-preview,\n",
".xr-attrs dt {\n",
" white-space: nowrap;\n",
" overflow: hidden;\n",
" text-overflow: ellipsis;\n",
" padding-right: 10px;\n",
"}\n",
"\n",
".xr-var-name:hover,\n",
".xr-var-dims:hover,\n",
".xr-var-dtype:hover,\n",
".xr-attrs dt:hover {\n",
" overflow: visible;\n",
" width: auto;\n",
" z-index: 1;\n",
"}\n",
"\n",
".xr-var-attrs,\n",
".xr-var-data,\n",
".xr-index-data {\n",
" display: none;\n",
" background-color: var(--xr-background-color) !important;\n",
" padding-bottom: 5px !important;\n",
"}\n",
"\n",
".xr-var-attrs-in:checked ~ .xr-var-attrs,\n",
".xr-var-data-in:checked ~ .xr-var-data,\n",
".xr-index-data-in:checked ~ .xr-index-data {\n",
" display: block;\n",
"}\n",
"\n",
".xr-var-data > table {\n",
" float: right;\n",
"}\n",
"\n",
".xr-var-name span,\n",
".xr-var-data,\n",
".xr-index-name div,\n",
".xr-index-data,\n",
".xr-attrs {\n",
" padding-left: 25px !important;\n",
"}\n",
"\n",
".xr-attrs,\n",
".xr-var-attrs,\n",
".xr-var-data,\n",
".xr-index-data {\n",
" grid-column: 1 / -1;\n",
"}\n",
"\n",
"dl.xr-attrs {\n",
" padding: 0;\n",
" margin: 0;\n",
" display: grid;\n",
" grid-template-columns: 125px auto;\n",
"}\n",
"\n",
".xr-attrs dt,\n",
".xr-attrs dd {\n",
" padding: 0;\n",
" margin: 0;\n",
" float: left;\n",
" padding-right: 10px;\n",
" width: auto;\n",
"}\n",
"\n",
".xr-attrs dt {\n",
" font-weight: normal;\n",
" grid-column: 1;\n",
"}\n",
"\n",
".xr-attrs dt:hover span {\n",
" display: inline-block;\n",
" background: var(--xr-background-color);\n",
" padding-right: 10px;\n",
"}\n",
"\n",
".xr-attrs dd {\n",
" grid-column: 2;\n",
" white-space: pre-wrap;\n",
" word-break: break-all;\n",
"}\n",
"\n",
".xr-icon-database,\n",
".xr-icon-file-text2,\n",
".xr-no-icon {\n",
" display: inline-block;\n",
" vertical-align: middle;\n",
" width: 1em;\n",
" height: 1.5em !important;\n",
" stroke-width: 0;\n",
" stroke: currentColor;\n",
" fill: currentColor;\n",
"}\n",
"</style><pre class='xr-text-repr-fallback'>&lt;xarray.DataArray &#x27;h_ph&#x27; ()&gt;\n",
"array(1031.6101, dtype=float32)</pre><div class='xr-wrap' style='display:none'><div class='xr-header'><div class='xr-obj-type'>xarray.DataArray</div><div class='xr-array-name'>'h_ph'</div></div><ul class='xr-sections'><li class='xr-section-item'><div class='xr-array-wrap'><input id='section-19374e0f-4dff-44d7-9766-60414d947edf' class='xr-array-in' type='checkbox' checked><label for='section-19374e0f-4dff-44d7-9766-60414d947edf' title='Show/hide data repr'><svg class='icon xr-icon-database'><use xlink:href='#icon-database'></use></svg></label><div class='xr-array-preview xr-preview'><span>1.032e+03</span></div><div class='xr-array-data'><pre>array(1031.6101, dtype=float32)</pre></div></div></li><li class='xr-section-item'><input id='section-9a50227e-b04f-4b5f-b6d6-8949c315d983' class='xr-section-summary-in' type='checkbox' disabled ><label for='section-9a50227e-b04f-4b5f-b6d6-8949c315d983' class='xr-section-summary' title='Expand/collapse section'>Coordinates: <span>(0)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><ul class='xr-var-list'></ul></div></li><li class='xr-section-item'><input id='section-02889685-215f-4a0b-9a87-46d363b1f9b7' class='xr-section-summary-in' type='checkbox' disabled ><label for='section-02889685-215f-4a0b-9a87-46d363b1f9b7' class='xr-section-summary' title='Expand/collapse section'>Indexes: <span>(0)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><ul class='xr-var-list'></ul></div></li><li class='xr-section-item'><input id='section-365a9107-43dd-4b86-b315-dd6d06741953' class='xr-section-summary-in' type='checkbox' disabled ><label for='section-365a9107-43dd-4b86-b315-dd6d06741953' class='xr-section-summary' title='Expand/collapse section'>Attributes: <span>(0)</span></label><div class='xr-section-inline-details'></div><div class='xr-section-details'><dl class='xr-attrs'></dl></div></li></ul></div></div>"
],
"text/plain": [
"<xarray.DataArray 'h_ph' ()>\n",
"array(1031.6101, dtype=float32)"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"# takes roughly 1.5 to 2 minutes of wall time\n",
"ds.h_ph.mean()"
]
},
{
"cell_type": "markdown",
"id": "35caf411-afe7-44f7-9264-5e7b892456d0",
"metadata": {},
"source": [
"<center><img src=\"https://i.imgflip.com/8e4kuf.jpg\" width=\"400px\"></center>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
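
A rough back-of-the-envelope sketch of why the `block_size` setting above matters: it estimates how many 8 MiB fsspec cache blocks are needed to cover `h_ph` when computing `ds.h_ph.mean()`. The photon count and dtype come from the dataset repr in the notebook; the helper names (`uncompressed_nbytes`, `blocks_needed`) are made up for illustration, and the estimate assumes the variable is stored uncompressed and contiguously, which may not hold for the actual file (compressed data would need fewer bytes over the wire).

```python
import math

MIB = 1024 * 1024


def uncompressed_nbytes(n_values: int, itemsize: int) -> int:
    """Size in bytes of an array once decoded in memory."""
    return n_values * itemsize


def blocks_needed(nbytes: int, block_size: int) -> int:
    """Number of fixed-size cache blocks required to cover nbytes."""
    return math.ceil(nbytes / block_size)


n_photons = 73_765_028   # length of delta_time in the dataset repr
block_size = 8 * MIB     # "block_size" from fsspec_params in the notebook

# h_ph is float32, so 4 bytes per value (assumption: no compression)
h_ph_nbytes = uncompressed_nbytes(n_photons, itemsize=4)
h_ph_blocks = blocks_needed(h_ph_nbytes, block_size)

print(f"h_ph: {h_ph_nbytes / MIB:.1f} MiB uncompressed, "
      f"about {h_ph_blocks} blocks of {block_size // MIB} MiB")
```

A larger `block_size` lowers the number of range requests at the cost of fetching more bytes per request, which is the trade-off hinted at by the "could be bigger" comment in the notebook.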
@forrestfwilliams

I care! Great job guys!

@betolink
Author

betolink commented Jan 30, 2024

Forrest!! yeah haha we do care! next up the cost savings factor of cloud optimized HDF5

@alex-s-gardner

someday there will be a parade named in your honor !! Well done

@evetion

evetion commented Jan 31, 2024

Very interesting work that will save many, many bytes of bandwidth and I/O in the long run.

@betolink
Author

Yeah, I'm excited about this, thanks to @ajelenak for his guidance on HDF5 internals. Besides the performance improvement, a good side effect for data providers (with data on AWS) is the co$$$t reduction.
