@piotr-florek-mohc
Created February 2, 2024 15:56
description of the cmorisation process of DWD dataset
{
"cells": [
{
"cell_type": "markdown",
"id": "220e5b04-c4b4-48ec-9e70-e4e493d88680",
"metadata": {},
"source": [
"DWD data cmorisation\n",
"====================\n",
"\n",
"This notebook demonstrates how to cmorise relatively well-structured input data produced by DWD.\n",
"\n",
"As a prerequisite, we need an environment containing `cmor`, `cftime`, `iris`, and `jupyterlab` (for interactive processing). It can be created and activated with\n",
"\n",
"`conda create -n seasonal_cmorisation -c conda-forge jupyterlab iris cftime cmor`\n",
"\n",
"`conda activate seasonal_cmorisation`\n",
"\n",
"Now we can open the notebook and run the following imports, including some utility functions that handle generation of the input JSON:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "827ac9bd-68c5-4521-9f61-df6cd05ac217",
"metadata": {},
"outputs": [],
"source": [
"import cftime\n",
"import cmor\n",
"import iris\n",
"import numpy as np\n",
"import re\n",
"import os\n",
"import cf_units\n",
"\n",
"from cmor_utils import create_cmor_json_config, generate_dataset_info, parse_filename, DATASET_ROOT, BASE_TIME_UNIT, MIP_ERA, MIP_TABLES_DIR"
]
},
{
"cell_type": "markdown",
"id": "7b52bbee-9148-47bd-95b8-5f072d4133a9",
"metadata": {},
"source": [
"Let's load the input dataset first and see how it's structured."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1920c04b-51ff-4a1b-b60f-bdf4947afe52",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0: surface_temperature / (K) (realization: 10; time: 10; latitude: 192; longitude: 384)\n"
]
}
],
"source": [
"filename = \"tas_Ayr_DWD_s2018_r1-10i1p1f1_gn_climatology-1971-2000.nc\"\n",
"\n",
"cubes = iris.load(os.path.join(DATASET_ROOT, filename))\n",
"print(cubes)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ba66181e-3d44-4fff-b283-62d1da266d01",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"<style>\n",
" a.iris {\n",
" text-decoration: none !important;\n",
" }\n",
" table.iris {\n",
" white-space: pre;\n",
" border: 1px solid;\n",
" border-color: #9c9c9c;\n",
" font-family: monaco, monospace;\n",
" }\n",
" th.iris {\n",
" background: #303f3f;\n",
" color: #e0e0e0;\n",
" border-left: 1px solid;\n",
" border-color: #9c9c9c;\n",
" font-size: 1.05em;\n",
" min-width: 50px;\n",
" max-width: 125px;\n",
" }\n",
" tr.iris :first-child {\n",
" border-right: 1px solid #9c9c9c !important;\n",
" }\n",
" td.iris-title {\n",
" background: #d5dcdf;\n",
" border-top: 1px solid #9c9c9c;\n",
" font-weight: bold;\n",
" }\n",
" .iris-word-cell {\n",
" text-align: left !important;\n",
" white-space: pre;\n",
" }\n",
" .iris-subheading-cell {\n",
" padding-left: 2em !important;\n",
" }\n",
" .iris-inclusion-cell {\n",
" padding-right: 1em !important;\n",
" }\n",
" .iris-panel-body {\n",
" padding-top: 0px;\n",
" }\n",
" .iris-panel-title {\n",
" padding-left: 3em;\n",
" }\n",
" .iris-panel-title {\n",
" margin-top: 7px;\n",
" }\n",
"</style>\n",
"<table class=\"iris\" id=\"139786213593232\">\n",
" <tr class=\"iris\">\n",
"<th class=\"iris iris-word-cell\">Surface Temperature (K)</th>\n",
"<th class=\"iris iris-word-cell\">realization</th>\n",
"<th class=\"iris iris-word-cell\">time</th>\n",
"<th class=\"iris iris-word-cell\">latitude</th>\n",
"<th class=\"iris iris-word-cell\">longitude</th>\n",
"</tr>\n",
" <tr class=\"iris\">\n",
"<td class=\"iris-word-cell iris-subheading-cell\">Shape</td>\n",
"<td class=\"iris iris-inclusion-cell\">10</td>\n",
"<td class=\"iris iris-inclusion-cell\">10</td>\n",
"<td class=\"iris iris-inclusion-cell\">192</td>\n",
"<td class=\"iris iris-inclusion-cell\">384</td>\n",
"</tr>\n",
" <tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Dimension coordinates</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\trealization</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\ttime</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tlatitude</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tlongitude</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Attributes</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tConventions</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;CF-1.5&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\thindcast_year</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;2018&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tinstitute_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;DWD&#x27;</td>\n",
"</tr>\n",
"</table>\n",
" "
],
"text/plain": [
"<iris 'Cube' of surface_temperature / (K) (realization: 10; time: 10; latitude: 192; longitude: 384)>"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset = cubes[0]\n",
"dataset"
]
},
{
"cell_type": "markdown",
"id": "673d3f95-ffbc-46db-93dc-d89c74d0918d",
"metadata": {},
"source": [
"The first (and most difficult) task is making sure the time axis is well-structured and in a format that CMOR understands. The following commands check the calendar and units used, and the spacing between time points."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f5f04a42-7375-418f-a91e-6315d1a21c1f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[13470. 22254. 31014. 39774. 48534. 57318. 66078. 74838. 83598. 92382.]\n",
"[8784. 8760. 8760. 8760. 8784. 8760. 8760. 8760. 8784.]\n"
]
},
{
"data": {
"text/plain": [
"Unit('hours since 2018-1-1 00:00:00', calendar='proleptic_gregorian')"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(dataset.coord('time').points)\n",
"print(np.diff(dataset.coord('time').points))\n",
"dataset.coord('time').units"
]
},
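{
"cell_type": "markdown",
"id": "sketch-time-spacing-check",
"metadata": {},
"source": [
"The spacing printed above alternates between 8760 and 8784 hours, that is, 365-day and 366-day years, so every step should be exactly one calendar year. A minimal standalone sketch of that check, assuming the hour values are copied by hand from the output above and using only the standard library (Python's `datetime` also follows the proleptic Gregorian calendar); the names `hours`, `origin`, `dates` and `steps` are illustrative:\n",
"\n",
"```python\n",
"from datetime import datetime, timedelta\n",
"\n",
"# Values copied from the output of the previous cell (hours since 2018-01-01).\n",
"hours = [13470, 22254, 31014, 39774, 48534, 57318, 66078, 74838, 83598, 92382]\n",
"origin = datetime(2018, 1, 1)\n",
"\n",
"dates = [origin + timedelta(hours=h) for h in hours]\n",
"steps = [later - earlier for earlier, later in zip(hours, hours[1:])]\n",
"\n",
"# Each step should land on the same month, day and hour one year later.\n",
"for start, end, step in zip(dates, dates[1:], steps):\n",
"    assert (end.month, end.day, end.hour) == (start.month, start.day, start.hour)\n",
"    print(start.date(), '->', end.date(), step // 24, 'days')\n",
"```\n",
"\n",
"The 366-day steps are the ones that cross 29 February of a leap year."
]
},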
{
"cell_type": "markdown",
"id": "f0e36a32-7e88-4a48-ae79-0fea363a4b22",
"metadata": {},
"source": [
"We can now check what dates these time points correspond to."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "06a537fd-2767-4c02-b193-dc23b4dba757",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['2019-07-16 06:00:00', '2020-07-16 06:00:00', '2021-07-16 06:00:00', '2022-07-16 06:00:00', '2023-07-16 06:00:00', '2024-07-16 06:00:00', '2025-07-16 06:00:00', '2026-07-16 06:00:00', '2027-07-16 06:00:00', '2028-07-16 06:00:00']\n"
]
}
],
"source": [
"calendar = dataset.coord('time').units.calendar\n",
"\n",
"cftime_points = cftime.num2date(dataset.coord('time').points,\n",
" str(dataset.coord('time').units),\n",
" calendar=calendar)\n",
"print([str(cftp) for cftp in cftime_points])"
]
},
{
"cell_type": "markdown",
"id": "a3586733-c1fb-4898-bc14-e466cf944887",
"metadata": {},
"source": [
"Let's convert `hours since 2018-1-1` to `days since 2000-1-1`."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a65cb0cf-e1ff-45f8-820b-329ee4f2a32d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Unit('days since 2000-1-1', calendar='proleptic_gregorian')"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset.coord('time').convert_units(cf_units.Unit(BASE_TIME_UNIT, calendar=calendar))\n",
"dataset.coord('time').units"
]
},
{
"cell_type": "markdown",
"id": "13205041-2b97-4c76-90a0-9f66d1b767dc",
"metadata": {},
"source": [
"And confirm that the time points still correspond to the same dates."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "45d4c567-cecc-4444-98bb-59b832c1e247",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['2019-07-16 06:00:00', '2020-07-16 06:00:00', '2021-07-16 06:00:00', '2022-07-16 06:00:00', '2023-07-16 06:00:00', '2024-07-16 06:00:00', '2025-07-16 06:00:00', '2026-07-16 06:00:00', '2027-07-16 06:00:00', '2028-07-16 06:00:00']\n"
]
}
],
"source": [
"cftime_points = cftime.num2date(dataset.coord('time').points,\n",
" str(dataset.coord('time').units),\n",
" calendar=calendar)\n",
"print([str(cftp) for cftp in cftime_points])"
]
},
{
"cell_type": "markdown",
"id": "01664dc7-e133-4e85-9fd6-7b9c95819d05",
"metadata": {},
"source": [
"Using the `guess_bounds()` method from the `iris` library would place the bounds midway between neighbouring time points, which is not what we want: each pair of bounds should span a calendar year. The bounds therefore need to be set manually:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "7c73bd27-4fdd-4245-94c9-86059d8ee2ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(6940, 7305), (7305, 7671), (7671, 8036), (8036, 8401), (8401, 8766), (8766, 9132), (9132, 9497), (9497, 9862), (9862, 10227), (10227, 10593)]\n",
"['2019-01-01 00:00:00', '2020-01-01 00:00:00', '2021-01-01 00:00:00', '2022-01-01 00:00:00', '2023-01-01 00:00:00', '2024-01-01 00:00:00', '2025-01-01 00:00:00', '2026-01-01 00:00:00', '2027-01-01 00:00:00', '2028-01-01 00:00:00']\n"
]
}
],
"source": [
"time_bounds = []\n",
"for time_point in cftime_points:\n",
" start_date = cftime.date2num(cftime.datetime(time_point.year, 1, 1, calendar=calendar), units=BASE_TIME_UNIT, calendar=calendar)\n",
" end_date = cftime.date2num(cftime.datetime(time_point.year+1, 1, 1, calendar=calendar), units=BASE_TIME_UNIT, calendar=calendar)\n",
" time_bounds.append((start_date, end_date,))\n",
"print(time_bounds)\n",
"\n",
"print([str(cftp) for cftp in cftime.num2date([dates[0] for dates in time_bounds], str(dataset.coord('time').units), calendar=calendar)])"
]
},
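{
"cell_type": "markdown",
"id": "sketch-midpoint-vs-calendar-bounds",
"metadata": {},
"source": [
"To see why guessed bounds are unsuitable here, the following sketch compares midpoint-style bounds with the calendar-year bounds computed above. It is a rough NumPy imitation of midpoint-based guessing, not the `iris` implementation itself; the three time points are the first three mid-July dates expressed in days since 2000-1-1, and all variable names are illustrative:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Time points in days since 2000-01-01 (16 July 06:00 of 2019, 2020 and 2021).\n",
"points = np.array([7136.25, 7502.25, 7867.25])\n",
"\n",
"# Midpoint-style guess: each bound lies halfway between neighbouring points,\n",
"# and the outermost bounds are extrapolated from the nearest spacing.\n",
"diffs = np.diff(points)\n",
"diffs = np.concatenate(([diffs[0]], diffs, [diffs[-1]]))\n",
"guessed = np.column_stack((points - 0.5 * diffs[:-1], points + 0.5 * diffs[1:]))\n",
"\n",
"# Calendar-year bounds, copied from the output of the cell above.\n",
"calendar_year = np.array([(6940, 7305), (7305, 7671), (7671, 8036)])\n",
"\n",
"# Offset of the guessed bounds from 1 January, in days.\n",
"print(guessed - calendar_year)\n",
"```\n",
"\n",
"Because the points sit in mid-July rather than exactly mid-year, the guessed bounds fall about two weeks after 1 January, so they would not describe calendar-year means."
]
},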
{
"cell_type": "markdown",
"id": "e8bf638a-9e8c-4b67-b2c5-21c29d13b40b",
"metadata": {},
"source": [
"Now we add the time bounds and generate the longitude and latitude bounds using `guess_bounds()`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "0d61452e-59fc-4829-a014-ebadabd16f3d",
"metadata": {},
"outputs": [],
"source": [
"dataset.coord('time').bounds = time_bounds\n",
"dataset.coord('longitude').guess_bounds()\n",
"dataset.coord('latitude').guess_bounds()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "a8df511a-f26e-4553-b9ba-6c892171a173",
"metadata": {},
"outputs": [],
"source": [
"time_coord = dataset.coord('time')\n",
"lon_coord = dataset.coord('longitude')\n",
"lat_coord = dataset.coord('latitude')\n",
"realization_coord = dataset.coord('realization')"
]
},
{
"cell_type": "markdown",
"id": "7b963581-b4f7-4df3-97e8-5a40ada4adfb",
"metadata": {},
"source": [
"Now our dataset is ready to be CMORised. We need to generate a `dataset_info` dictionary that contains all the metadata to be added to the dataset."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "68bb607d-942a-4ca0-8768-b5d56b53d6ba",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'_AXIS_ENTRY_FILE': 'decadal_coordinate.json',\n",
" '_FORMULA_VAR_FILE': 'decadal_formula_terms.json',\n",
" '_controlled_vocabulary_file': 'decadal_CV.json',\n",
" 'activity_id': 'DCPP',\n",
" 'calendar': 'proleptic_gregorian',\n",
" 'cv_version': 'v1.0',\n",
" 'ensemble_label': 'r1-10i1p1f1',\n",
" 'experiment': 'dcppB forecast',\n",
" 'experiment_id': 'dcppB-forecast',\n",
" 'forcing_index': '1',\n",
" 'grid': 'grid',\n",
" 'grid_label': 'gn',\n",
" 'initialization_index': '1',\n",
" 'institution': 'Deutscher Wetterdienst, Offenbach am Main 63067, Germany',\n",
" 'institution_id': 'DWD',\n",
" 'license': 'This is license text.',\n",
" 'mip_era': 'decadal',\n",
" 'nominal_resolution': '100 km',\n",
" 'outpath': '.',\n",
" 'physics_index': '1',\n",
" 'source': 'ICON-ESM-LR (2017): \\naerosol: none, prescribed MACv2-SP\\natmos: ICON-A (icosahedral/triangles; 160 km; 47 levels; top level 80 km)\\natmosChem: none\\nland: JSBACH4.20\\nlandIce: none/prescribed\\nocean: ICON-O (icosahedral/triangles; 40 km; 40 levels; top grid cell 0-12 m)\\nocnBgchem: HAMOCC\\nseaIce: unnamed (thermodynamic (Semtner zero-layer) dynamic (Hibler 79) sea ice model)',\n",
" 'source_id': 'ICON-ESM-LR',\n",
" 'source_type': 'AOGCM',\n",
" 'variant_label': 'r1i1p1f1',\n",
" 'output_file_template': '<variable_id><table><source_id><experiment_id><ensemble_label><grid_label>',\n",
" 'realization_index': 1}"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mip_table, institution_id, ensemble_label, grid_label, climatology, initialization_index, physics_index, forcing_index = parse_filename(filename)\n",
"activity_id = \"DCPP\"\n",
"experiment_id = \"dcppB-forecast\"\n",
"source_id = \"ICON-ESM-LR\"\n",
"dataset_info = generate_dataset_info(\n",
" mip_era=MIP_ERA,\n",
" activity_id=activity_id,\n",
" calendar=calendar,\n",
" ensemble_label=ensemble_label,\n",
" experiment_id=experiment_id,\n",
" institution_id=institution_id,\n",
" source_id=source_id,\n",
" source_type='AOGCM',\n",
" grid='grid',\n",
" grid_label=grid_label,\n",
" nominal_resolution='100 km',\n",
" forcing_index=forcing_index,\n",
" initialization_index=initialization_index,\n",
" physics_index=physics_index,\n",
" realization_index=1\n",
")\n",
"dataset_info"
]
},
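{
"cell_type": "markdown",
"id": "sketch-parse-filename",
"metadata": {},
"source": [
"The `parse_filename` helper comes from the local `cmor_utils` module, which is not shown in this notebook. As a purely hypothetical illustration of how metadata could be extracted from a file name of this shape, here is a minimal sketch; the function name `parse_filename_sketch`, the dictionary keys and the assumed field order are all assumptions and may differ from the real helper:\n",
"\n",
"```python\n",
"import re\n",
"\n",
"\n",
"def parse_filename_sketch(filename):\n",
"    # Hypothetical stand-in, NOT the real cmor_utils.parse_filename.\n",
"    stem = filename.rsplit('.', 1)[0]\n",
"    # Assumed field order: variable_table_institution_start_ensemble_grid_climatology\n",
"    (variable, mip_table, institution_id, start_label,\n",
"     ensemble_label, grid_label, climatology) = stem.split('_')\n",
"    # Ensemble label such as 'r1-10i1p1f1': realization range plus i/p/f indices.\n",
"    match = re.fullmatch('r([0-9]+(?:-[0-9]+)?)i([0-9]+)p([0-9]+)f([0-9]+)', ensemble_label)\n",
"    if match is None:\n",
"        raise ValueError('unexpected ensemble label: ' + ensemble_label)\n",
"    realizations, initialization_index, physics_index, forcing_index = match.groups()\n",
"    return {\n",
"        'variable': variable,\n",
"        'mip_table': mip_table,\n",
"        'institution_id': institution_id,\n",
"        'start_label': start_label,\n",
"        'ensemble_label': ensemble_label,\n",
"        'grid_label': grid_label,\n",
"        'climatology': climatology,\n",
"        'realizations': realizations,\n",
"        'initialization_index': initialization_index,\n",
"        'physics_index': physics_index,\n",
"        'forcing_index': forcing_index,\n",
"    }\n",
"\n",
"\n",
"info = parse_filename_sketch('tas_Ayr_DWD_s2018_r1-10i1p1f1_gn_climatology-1971-2000.nc')\n",
"print(info['mip_table'], info['institution_id'], info['ensemble_label'], info['grid_label'])\n",
"```\n",
"\n",
"Returning a dictionary rather than a long tuple makes it harder to mix up the order of the indices when they are passed on to `generate_dataset_info`."
]
},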
{
"cell_type": "markdown",
"id": "dc07cb6f-7aad-4ed1-beb3-58e8166803dc",
"metadata": {},
"source": [
"Initial CMOR setup: loading configuration file and MIP tables"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "5d44e661-b11d-452f-b1c5-21a8407574b3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"config_filepath = 'CMOR_input_dwd.json'\n",
"create_cmor_json_config(dataset_info, config_filepath)\n",
"cmor.setup(inpath=MIP_TABLES_DIR, netcdf_file_action=cmor.CMOR_REPLACE, create_subdirectories=False)\n",
"cmor.dataset_json(config_filepath)\n",
"table = os.path.join(MIP_TABLES_DIR, '{}_{}.json'.format(MIP_ERA, mip_table))\n",
"cmor.load_table(table)"
]
},
{
"cell_type": "markdown",
"id": "6aec7218-edd4-4822-b52b-50bd9a4e54d3",
"metadata": {},
"source": [
"Now we can create the time axis with the correct units."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "9ccbdf4c-7150-4ce5-80ae-3763dda25f7f",
"metadata": {},
"outputs": [],
"source": [
"time = cmor.axis(table_entry='time', units=str(time_coord.units), coord_vals=time_coord.points, cell_bounds=time_coord.bounds)"
]
},
{
"cell_type": "markdown",
"id": "898e6f69-227b-4a15-b96c-b1c07da4161d",
"metadata": {},
"source": [
"The reference time is the number of days between the initialisation time (2018-1-1) and the base date of the time units (2000-1-1). It can be calculated using the `cftime` library."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "81a2b7fc-7000-4d8f-8cb0-e62e8e34cdcf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6575"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"date1 = cftime.date2num(cftime.datetime(2018, 1, 1, calendar=calendar), units=BASE_TIME_UNIT, calendar=calendar)\n",
"date2 = cftime.date2num(cftime.datetime(2000, 1, 1, calendar=calendar), units=BASE_TIME_UNIT, calendar=calendar)\n",
"date_offset = date1 - date2\n",
"date_offset"
]
},
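{
"cell_type": "markdown",
"id": "sketch-reftime-cross-check",
"metadata": {},
"source": [
"As an independent cross-check of the value above, the same offset can be computed with the standard library alone. This sketch assumes the proleptic Gregorian calendar used by this dataset, which is also what Python's `datetime.date` implements; it would not be valid for calendars such as `360_day` or `noleap`. The name `offset_days` is illustrative:\n",
"\n",
"```python\n",
"from datetime import date\n",
"\n",
"# Initialisation date minus the base date of 'days since 2000-1-1'.\n",
"offset_days = (date(2018, 1, 1) - date(2000, 1, 1)).days\n",
"print(offset_days)\n",
"```\n",
"\n",
"The result, 6575, is 18 years of 365 days plus the five leap days of 2000, 2004, 2008, 2012 and 2016."
]
},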
{
"cell_type": "markdown",
"id": "c1180af2-6d47-4b9d-8d15-b8a16224414d",
"metadata": {},
"source": [
"Creating more axes"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "46651ee1-0a6c-48de-abf6-aa4e6a160385",
"metadata": {},
"outputs": [],
"source": [
"reftime = cmor.axis(table_entry=\"reftime1\", units=str(time_coord.units), coord_vals=np.array([date_offset]))\n",
"height2m = cmor.axis(table_entry=\"height2m\", units=\"m\", coord_vals=np.array((2.0,)))\n",
"latitude = cmor.axis(table_entry=\"latitude\", units=str(lat_coord.units), coord_vals=lat_coord.points, cell_bounds=lat_coord.bounds)\n",
"longitude = cmor.axis(table_entry=\"longitude\", units=str(lon_coord.units), coord_vals=lon_coord.points, cell_bounds=lon_coord.bounds)\n",
"realization = cmor.axis(table_entry=\"realization\", units=\"1\", coord_vals=realization_coord.points)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "b1997c92-bdd2-48e4-b503-8ac858647b52",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(10, 10, 192, 384)"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset.data.shape"
]
},
{
"cell_type": "markdown",
"id": "e19c49a7-f49d-4f8c-b757-f8862a2bce53",
"metadata": {},
"source": [
"We also need to add two length-1 dimensions to our data so that it can be matched with the height and reference time axes."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "61ab08f2-c987-44ed-9f17-46f7ed5e58fc",
"metadata": {},
"outputs": [],
"source": [
"data = dataset.data"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "ab198da5-97a0-4fe6-9a5f-7e208b42285f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(10, 10, 192, 384, 1, 1)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"expanded = np.ma.expand_dims(data, axis=[4,5])\n",
"expanded.shape"
]
},
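{
"cell_type": "markdown",
"id": "sketch-expand-dims-masked",
"metadata": {},
"source": [
"The effect of `np.ma.expand_dims` is easier to see on a small array. The sketch below uses a made-up masked array (`small`, not the DWD data) with the same axis order of realization, time, latitude and longitude, and appends two trailing length-1 axes while keeping the mask intact:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Small stand-in for the (realization, time, latitude, longitude) data;\n",
"# the single value below 1.0 is masked.\n",
"small = np.ma.masked_less(np.arange(24, dtype='float32').reshape(2, 3, 2, 2), 1.0)\n",
"\n",
"# Append two trailing length-1 axes, one each for the scalar axes.\n",
"expanded_small = np.ma.expand_dims(small, axis=(4, 5))\n",
"\n",
"print(expanded_small.shape)\n",
"print(int(expanded_small.mask.sum()))  # number of masked values\n",
"```\n",
"\n",
"No values are copied or reordered; only the shape gains the two extra axes, so the array lines up with the six axis ids passed to CMOR."
]
},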
{
"cell_type": "markdown",
"id": "499b0576-e915-4f22-b5fe-c27ce9a8bb46",
"metadata": {},
"source": [
"Now everything is ready, and we can create our tas anomaly variable."
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "ae51681a-5845-4d06-b59d-deb80e895ce6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
".//tasAnom_Ayr_ICON-ESM-LR_dcppB-forecast_r1-10i1p1f1_gn_2019-2028.nc\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"\u001b[2;34;47mC Traceback:\n",
"In function: cmor_close_variable\n",
"! \u001b[0m\n",
"\n",
"\u001b[1;34;47m!!!!!!!!!!!!!!!!!!!!!!!!!\n",
"!\n",
"! Warning: while closing variable 0 (tasAnom, table Ayr)\n",
"! we noticed you wrote 0 time steps for the variable,\n",
"! but its time axis 0 (time) has 10 time steps\n",
"!\n",
"!!!!!!!!!!!!!!!!!!!!!!!!!\u001b[0m\n",
"\n"
]
}
],
"source": [
"axis_ids = [realization, time, latitude, longitude, reftime, height2m]\n",
"tasanom_var_id = cmor.variable(table_entry=\"tasAnom\", axis_ids=axis_ids, units=\"K\")\n",
"cmor.set_deflate(tasanom_var_id, True, True, 1) # enabling netCDF4 compression\n",
"cmor.write(tasanom_var_id, expanded)\n",
"filename = cmor.close(tasanom_var_id, file_name=True)\n",
"print(filename)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "b706d01c-9b37-4a07-a7d9-a8a7971a3d6d",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/h04/pflorek/.conda/envs/seasonal_cmorisation/lib/python3.11/site-packages/iris/fileformats/cf.py:859: UserWarning: Missing CF-netCDF measure variable 'areacella', referenced by netCDF variable 'tasAnom'\n",
" warnings.warn(\n"
]
}
],
"source": [
"cmorised_file = iris.load(filename)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "a3786521-0d59-40eb-98d7-6dbe77525aad",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"<style>\n",
" a.iris {\n",
" text-decoration: none !important;\n",
" }\n",
" table.iris {\n",
" white-space: pre;\n",
" border: 1px solid;\n",
" border-color: #9c9c9c;\n",
" font-family: monaco, monospace;\n",
" }\n",
" th.iris {\n",
" background: #303f3f;\n",
" color: #e0e0e0;\n",
" border-left: 1px solid;\n",
" border-color: #9c9c9c;\n",
" font-size: 1.05em;\n",
" min-width: 50px;\n",
" max-width: 125px;\n",
" }\n",
" tr.iris :first-child {\n",
" border-right: 1px solid #9c9c9c !important;\n",
" }\n",
" td.iris-title {\n",
" background: #d5dcdf;\n",
" border-top: 1px solid #9c9c9c;\n",
" font-weight: bold;\n",
" }\n",
" .iris-word-cell {\n",
" text-align: left !important;\n",
" white-space: pre;\n",
" }\n",
" .iris-subheading-cell {\n",
" padding-left: 2em !important;\n",
" }\n",
" .iris-inclusion-cell {\n",
" padding-right: 1em !important;\n",
" }\n",
" .iris-panel-body {\n",
" padding-top: 0px;\n",
" }\n",
" .iris-panel-title {\n",
" padding-left: 3em;\n",
" }\n",
" .iris-panel-title {\n",
" margin-top: 7px;\n",
" }\n",
"</style>\n",
"<table class=\"iris\" id=\"139786189781712\">\n",
" <tr class=\"iris\">\n",
"<th class=\"iris iris-word-cell\">Air Temperature Anomaly (K)</th>\n",
"<th class=\"iris iris-word-cell\">time</th>\n",
"<th class=\"iris iris-word-cell\">realization</th>\n",
"<th class=\"iris iris-word-cell\">latitude</th>\n",
"<th class=\"iris iris-word-cell\">longitude</th>\n",
"</tr>\n",
" <tr class=\"iris\">\n",
"<td class=\"iris-word-cell iris-subheading-cell\">Shape</td>\n",
"<td class=\"iris iris-inclusion-cell\">10</td>\n",
"<td class=\"iris iris-inclusion-cell\">10</td>\n",
"<td class=\"iris iris-inclusion-cell\">192</td>\n",
"<td class=\"iris iris-inclusion-cell\">384</td>\n",
"</tr>\n",
" <tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Dimension coordinates</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\ttime</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\trealization</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tlatitude</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tlongitude</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Auxiliary coordinates</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tforecast_period</td>\n",
" <td class=\"iris-inclusion-cell\">x</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
" <td class=\"iris-inclusion-cell\">-</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Scalar coordinates</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tforecast_reference_time</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">2018-01-01 00:00:00</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\theight</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">2.0 m</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Cell methods</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\t0</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">area: time: mean</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-title iris-word-cell\">Attributes</td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
" <td class=\"iris-title\"></td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tConventions</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;CF-1.7 CMIP-6.2&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tactivity_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;DCPP&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tcmor_version</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;3.7.3&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tcomment</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;near-surface (usually, 2 meter) air temperature anomaly relative to cl ...&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tcreation_date</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;2023-11-16T11:47:27Z&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tcv_version</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;v1.0&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tdata_specs_version</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;decadal v1.0&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tensemble_label</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;r1-10i1p1f1&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\texperiment</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;dcppB forecast&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\texperiment_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;dcppB-forecast&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\texternal_variables</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;areacella&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tforcing_index</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">1</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tfrequency</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;yr&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tfurther_info_url</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;https://furtherinfo.es-doc.org/decadal.DWD.ICON-ESM-LR.dcppB-forecast. ...&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tgrid</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;grid&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tgrid_label</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;gn&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\thistory</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&quot;2023-11-16T11:47:27Z altered by CMOR: Treated scalar dimension: &#x27;reftime&#x27;. ...&quot;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tinitialization_index</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">1</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tinstitution</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;Deutscher Wetterdienst, Offenbach am Main 63067, Germany&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tinstitution_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;DWD&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tlicense</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;This is license text.&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tmip_era</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;decadal&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tnominal_resolution</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;100 km&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tphysics_index</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">1</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tproduct</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;model-output&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\trealization_index</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">1</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\trealm</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;atmos&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tsource</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;ICON-ESM-LR (2017): \\naerosol: none, prescribed MACv2-SP\\natmos: ICON-A ...&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tsource_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;ICON-ESM-LR&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tsource_type</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;AOGCM&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\ttable_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;Ayr&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\ttable_info</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;Creation Date:(28 May 2023) MD5:a36073cf17db69f6dce2bebfb36c6bd1&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\ttitle</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;ICON-ESM-LR output prepared for decadal&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\ttracking_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;0aa3f781-628e-4a10-a564-2ac8dcd9d170&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tvariable_id</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;tasAnom&#x27;</td>\n",
"</tr>\n",
"<tr class=\"iris\">\n",
" <td class=\"iris-word-cell iris-subheading-cell\">\tvariant_label</td>\n",
" <td class=\"iris-word-cell\" colspan=\"4\">&#x27;r1i1p1f1&#x27;</td>\n",
"</tr>\n",
"</table>\n",
" "
],
"text/plain": [
"<iris 'Cube' of air_temperature_anomaly / (K) (time: 10; realization: 10; latitude: 192; longitude: 384)>"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cmorised_file[0]"
]
},
{
"cell_type": "markdown",
"id": "38f145f8-c162-49be-ba50-a29cb5d2990c",
"metadata": {},
"source": [
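"We can now confirm that a data point in the cmorised file matches the input dataset. The two cubes order their axes differently, so the first two indices are swapped."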
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "7b9df349-3ae2-4417-a4e9-fb7a75ba58ed",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.7036743"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cmorised_file[0].data[0,1,2,3] # axis order is time, realization, lat, lon"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "cf76cce7-8b58-46c2-a646-9f9eb8be1cb1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.7036743"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset.data[1,0,2,3] # axis order is realization, time, lat, lon"
]
},
{
"cell_type": "markdown",
"id": "5bb0942b-5efd-453b-a3b4-7be09ac0313c",
"metadata": {},
"source": [
"Finally, we can check the lead time (`forecast_period`) and `time` coordinate values."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "4c269a0f-02e2-497c-a2b6-ef5db910bd3a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"masked_array(data=[ 547.5, 913. , 1278.5, 1643.5, 2008.5, 2374. , 2739.5,\n",
" 3104.5, 3469.5, 3835. ],\n",
" mask=False,\n",
" fill_value=1e+20)"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cmorised_file[0].coord('forecast_period').points"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "82fb01fb-b531-407d-9226-949244ae3034",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 7122.5, 7488. , 7853.5, 8218.5, 8583.5, 8949. , 9314.5,\n",
" 9679.5, 10044.5, 10410. ])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cmorised_file[0].coord('time').points"
]
},
{
"cell_type": "markdown",
"id": "67635a7f-eb9b-4b49-8b03-675233f63f27",
"metadata": {},
"source": [
"Known caveats & issues\n",
"======================\n",
"\n",
"In its current state, CMOR expects a single realisation index, even if the dataset contains a realisation dimension. This means the variant label is also incorrect (CMOR rebuilds it from the ripf indices even when given a full label), which is why the cmorised file above reports `r1i1p1f1` despite holding 10 realisations. As a workaround, the `ensemble_label` attribute is used instead.\n",
"\n",
"The realisation dimension cannot be the first axis of the dataset; otherwise CMOR is unable to create a filename with the date range suffix. Unfortunately, the axis creation order can be controlled only indirectly, by reordering the coordinates in the variable's `dimensions` definition in the MIP tables (which appear to be read in reverse order, as far as I can tell), e.g.\n",
"\n",
"`\"dimensions\": \"longitude latitude realization time reftime1 height2m leadtime\"`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1c48439e-151f-4cc2-a13e-f2a081613706",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}