@ngupta23
Created January 10, 2023 20:08
Introduction.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/ngupta23/8adf65543584632f8ec57a27d9ec3480/introduction.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QOOYOBk1-Vxw"
},
"source": [
"# Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IfQjIOqB-Vxz"
},
"source": [
"<a href=\"https://colab.research.google.com/github/Nixtla/hierarchicalforecast/blob/main/nbs/examples/Introduction.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v6cs_aL--Vx1"
},
"source": [
"## 1. Hierarchical Series"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "URs0E-rE-Vx1"
},
"source": [
"In many applications, a set of time series is hierarchically organized. Examples include geographic levels, products, or categories that define different types of aggregations.\n",
"\n",
"In such scenarios, forecasters are often required to provide predictions for all disaggregate and aggregate series. A natural desire is for those predictions to be **\"coherent\"**, that is, for the forecasts of the bottom series to add up precisely to the forecasts of the aggregated series."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xxsBsmS1-Vx2"
},
"source": [
"![Figure 1. A two level time series hierarchical structure, with four bottom level variables.](https://github.com/Nixtla/hierarchicalforecast/blob/main/nbs/examples/imgs/hierarchical_motivation1.png?raw=1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0zQvGTl_-Vx3"
},
"source": [
"Figure 1 shows a simple hierarchical structure with four bottom-level series, two middle-level series, and a top level representing the total aggregation. Its hierarchical aggregations, or coherency constraints, are:\n",
"\n",
"\\begin{align}\n",
" y_{\\mathrm{Total},\\tau} = y_{\\beta_{1},\\tau}+y_{\\beta_{2},\\tau}+y_{\\beta_{3},\\tau}+y_{\\beta_{4},\\tau} \n",
" \\qquad \\qquad \\qquad \\qquad \\qquad \\\\\n",
" \\mathbf{y}_{[a],\\tau}=\\left[y_{\\mathrm{Total},\\tau},\\; y_{\\beta_{1},\\tau}+y_{\\beta_{2},\\tau},\\;y_{\\beta_{3},\\tau}+y_{\\beta_{4},\\tau}\\right]^{\\intercal} \n",
" \\qquad\n",
" \\mathbf{y}_{[b],\\tau}=\\left[ y_{\\beta_{1},\\tau},\\; y_{\\beta_{2},\\tau},\\; y_{\\beta_{3},\\tau},\\; y_{\\beta_{4},\\tau} \\right]^{\\intercal}\n",
"\\end{align}\n",
"\n",
"Luckily, these constraints can be compactly expressed with the following matrices:\n",
"\n",
"\\begin{align}\n",
"\\mathbf{S}_{[a,b][b]}\n",
"=\n",
"\\begin{bmatrix}\n",
"\\mathbf{A}_{\\mathrm{[a][b]}} \\\\ \n",
" \\\\\n",
" \\\\\n",
"\\mathbf{I}_{\\mathrm{[b][b]}} \\\\\n",
" \\\\\n",
"\\end{bmatrix}\n",
"=\n",
"\\begin{bmatrix}\n",
"1 & 1 & 1 & 1 \\\\\n",
"1 & 1 & 0 & 0 \\\\\n",
"0 & 0 & 1 & 1 \\\\\n",
"1 & 0 & 0 & 0 \\\\\n",
"0 & 1 & 0 & 0 \\\\\n",
"0 & 0 & 1 & 0 \\\\\n",
"0 & 0 & 0 & 1 \\\\\n",
"\\end{bmatrix}\n",
"\\end{align}\n",
"\n",
"where $\\mathbf{A}_{\\mathrm{[a][b]}}$ aggregates the bottom series to the upper levels, and $\\mathbf{I}_{\\mathrm{[b][b]}}$ is an identity matrix. The representation of the hierarchical series is then:\n",
"\n",
"\\begin{align}\n",
"\\mathbf{y}_{[a,b],\\tau} = \\mathbf{S}_{[a,b][b]} \\mathbf{y}_{[b],\\tau}\n",
"\\end{align}\n"
]
},
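{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small worked illustration of this identity, suppose (hypothetical values, chosen only for this example) that at some time $\\tau$ the bottom series of Figure 1 take the values $\\mathbf{y}_{[b],\\tau} = [1,\\; 2,\\; 3,\\; 4]^{\\intercal}$. Multiplying by the summing matrix stacks the aggregates on top of the bottom values:\n",
"\n",
"\\begin{align}\n",
"\\mathbf{y}_{[a,b],\\tau} = \\mathbf{S}_{[a,b][b]} \\begin{bmatrix} 1 \\\\ 2 \\\\ 3 \\\\ 4 \\end{bmatrix} = \\left[ 10,\\; 3,\\; 7,\\; 1,\\; 2,\\; 3,\\; 4 \\right]^{\\intercal}\n",
"\\end{align}\n",
"\n",
"The first entry is the total ($1+2+3+4$), the next two are the middle-level sums ($1+2$ and $3+4$), and the last four reproduce the bottom series unchanged."
]
},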
{
"cell_type": "markdown",
"metadata": {
"id": "Caz2rliM-Vx5"
},
"source": [
"As an example, the levels of a hierarchical time series structure can represent different geographical aggregations. In Figure 2, the top level is the total aggregation of series within a country, the middle level corresponds to its states, and the bottom level to its regions."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ByslUUny-Vx6"
},
"source": [
"![Figure 2. A hierarchy can be composed of geographic levels. In this example the top level corresponds to country aggregation, middle level to states, and bottom level to regions.](https://github.com/Nixtla/hierarchicalforecast/blob/main/nbs/examples/imgs/hierarchical_motivation2.png?raw=1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1FpIiVIr-Vx7"
},
"source": [
"## 2. Hierarchical Forecast"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yMawYGKP-Vx8"
},
"source": [
"To achieve **\"coherency\"**, most statistical solutions to the hierarchical forecasting challenge implement a two-stage reconciliation process. \n",
"1. First, we obtain a set of base forecasts $\\mathbf{\\hat{y}}_{[a,b],\\tau}$.\n",
"2. Then, we reconcile them into coherent forecasts $\\mathbf{\\tilde{y}}_{[a,b],\\tau}$.\n",
"\n",
"Most hierarchical reconciliation methods can be expressed by the following transformation, in which $\\mathbf{P}_{[b][a,b]}$ maps the base forecasts to bottom-level forecasts and $\\mathbf{S}_{[a,b][b]}$ aggregates them back into the full hierarchy:\n",
"\n",
"\\begin{align}\n",
"\\tilde{\\mathbf{y}}_{[a,b],\\tau} = \\mathbf{S}_{[a,b][b]} \\mathbf{P}_{[b][a,b]} \\hat{\\mathbf{y}}_{[a,b],\\tau}\n",
"\\end{align}"
]
},
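{
"cell_type": "markdown",
"metadata": {},
"source": [
"As one concrete instance of this transformation (a sketch for the Figure 1 hierarchy, not a description of the library's internals), the classical bottom-up approach keeps only the four bottom-level base forecasts and discards the three aggregate ones. The block notation $\\mathbf{0}_{[b][a]}$ for a matrix of zeros is introduced here only for this illustration:\n",
"\n",
"\\begin{align}\n",
"\\mathbf{P}_{[b][a,b]}\n",
"=\n",
"\\begin{bmatrix}\n",
"\\mathbf{0}_{[b][a]} & \\mathbf{I}_{[b][b]}\n",
"\\end{bmatrix}\n",
"=\n",
"\\begin{bmatrix}\n",
"0 & 0 & 0 & 1 & 0 & 0 & 0 \\\\\n",
"0 & 0 & 0 & 0 & 1 & 0 & 0 \\\\\n",
"0 & 0 & 0 & 0 & 0 & 1 & 0 \\\\\n",
"0 & 0 & 0 & 0 & 0 & 0 & 1 \\\\\n",
"\\end{bmatrix}\n",
"\\qquad \\Longrightarrow \\qquad\n",
"\\tilde{\\mathbf{y}}_{[a,b],\\tau} = \\mathbf{S}_{[a,b][b]} \\hat{\\mathbf{y}}_{[b],\\tau}\n",
"\\end{align}\n",
"\n",
"With this choice the reconciled aggregates are simply the sums of the bottom-level base forecasts, so coherency holds by construction. Other methods differ only in how they choose $\\mathbf{P}_{[b][a,b]}$."
]
},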
{
"cell_type": "markdown",
"metadata": {
"id": "U3dse7T4-Vx-"
},
"source": [
"The HierarchicalForecast library offers a Python collection of reconciliation methods, datasets, evaluation and visualization tools for the task. Among its available reconciliation methods we have `BottomUp`, `TopDown`, `MiddleOut`, `MinTrace`, `ERM`. Among its probabilistic coherent methods we have `Normality`, `Bootstrap`, `PERMBU`."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2kwH152Z-Vx-"
},
"source": [
"## 3. Minimal Example"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "hl1twiTj-Vx_"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install hierarchicalforecast\n",
"!pip install -U numba statsforecast datasetsforecast"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KMyGd-46-VyA"
},
"source": [
"### Wrangling Data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ihrBv8UZ-VyB",
"outputId": "f3115e17-aedc-4a81-9e7d-96aaf998fc68"
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.8/dist-packages/statsforecast/core.py:21: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
" from tqdm.autonotebook import tqdm\n"
]
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Compute base forecasts (not yet coherent)\n",
"from statsforecast.models import (\n",
"    Naive,\n",
"    AutoARIMA,\n",
"    HoltWinters,\n",
"    CrostonClassic as Croston,\n",
"    HistoricAverage,\n",
"    DynamicOptimizedTheta as DOT,\n",
"    SeasonalNaive,\n",
"    ETS,\n",
"    IMAPA,\n",
"    RandomWalkWithDrift,\n",
"    SeasonalExponentialSmoothing,\n",
"    SeasonalWindowAverage,\n",
"    SimpleExponentialSmoothing,\n",
"    TSB,\n",
"    WindowAverage,\n",
"    DynamicOptimizedTheta,\n",
"    AutoETS,\n",
"    AutoCES\n",
")\n",
"from statsforecast.core import StatsForecast\n",
"\n",
"# Obtain hierarchical reconciliation methods and evaluation\n",
"from hierarchicalforecast.utils import aggregate\n",
"from hierarchicalforecast.methods import BottomUp, TopDown\n",
"from hierarchicalforecast.core import HierarchicalReconciliation"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WmSgiCNV-VyB"
},
"source": [
"We are going to create a synthetic data set to illustrate a hierarchical time series structure like the one in Figure 1.\n",
"\n",
"We will create a two-level structure with four bottom series where aggregations of the series are self-evident."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 300
},
"id": "_URxbFgW-VyB",
"outputId": "cd50cf7a-da08-49f4-8c77-0c889045afd0"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" ds top_level middle_level bottom_level y\n",
"62 2005-03-01 Australia State1 r1 235.960275\n",
"63 2005-04-01 Australia State1 r1 239.705676\n",
"126 2005-03-01 Australia State1 r2 598.950013\n",
"127 2005-04-01 Australia State1 r2 608.457156\n",
"190 2005-03-01 Australia State2 r3 4611.561833\n",
"191 2005-04-01 Australia State2 r3 4684.761228\n",
"254 2005-03-01 Australia State2 r4 3771.548450\n",
"255 2005-04-01 Australia State2 r4 3831.414299"
],
"text/html": [
"\n",
" <div id=\"df-0ae04b52-b1a6-48a5-9820-26acf6d778d4\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>ds</th>\n",
" <th>top_level</th>\n",
" <th>middle_level</th>\n",
" <th>bottom_level</th>\n",
" <th>y</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>62</th>\n",
" <td>2005-03-01</td>\n",
" <td>Australia</td>\n",
" <td>State1</td>\n",
" <td>r1</td>\n",
" <td>235.960275</td>\n",
" </tr>\n",
" <tr>\n",
" <th>63</th>\n",
" <td>2005-04-01</td>\n",
" <td>Australia</td>\n",
" <td>State1</td>\n",
" <td>r1</td>\n",
" <td>239.705676</td>\n",
" </tr>\n",
" <tr>\n",
" <th>126</th>\n",
" <td>2005-03-01</td>\n",
" <td>Australia</td>\n",
" <td>State1</td>\n",
" <td>r2</td>\n",
" <td>598.950013</td>\n",
" </tr>\n",
" <tr>\n",
" <th>127</th>\n",
" <td>2005-04-01</td>\n",
" <td>Australia</td>\n",
" <td>State1</td>\n",
" <td>r2</td>\n",
" <td>608.457156</td>\n",
" </tr>\n",
" <tr>\n",
" <th>190</th>\n",
" <td>2005-03-01</td>\n",
" <td>Australia</td>\n",
" <td>State2</td>\n",
" <td>r3</td>\n",
" <td>4611.561833</td>\n",
" </tr>\n",
" <tr>\n",
" <th>191</th>\n",
" <td>2005-04-01</td>\n",
" <td>Australia</td>\n",
" <td>State2</td>\n",
" <td>r3</td>\n",
" <td>4684.761228</td>\n",
" </tr>\n",
" <tr>\n",
" <th>254</th>\n",
" <td>2005-03-01</td>\n",
" <td>Australia</td>\n",
" <td>State2</td>\n",
" <td>r4</td>\n",
" <td>3771.548450</td>\n",
" </tr>\n",
" <tr>\n",
" <th>255</th>\n",
" <td>2005-04-01</td>\n",
" <td>Australia</td>\n",
" <td>State2</td>\n",
" <td>r4</td>\n",
" <td>3831.414299</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-0ae04b52-b1a6-48a5-9820-26acf6d778d4')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-0ae04b52-b1a6-48a5-9820-26acf6d778d4 button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-0ae04b52-b1a6-48a5-9820-26acf6d778d4');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 3
}
],
"source": [
"# Create synthetic bottom-level data for the hierarchy of Figure 1\n",
"N = 64\n",
"ds = pd.date_range(start='2000-01-01', periods=N, freq='MS')\n",
"y_base = np.arange(1,N+1)\n",
"np.random.seed(42)\n",
"r1 = y_base * (10**1) * np.random.rand()\n",
"r2 = y_base * (10**1) * np.random.rand()\n",
"r3 = y_base * (10**2) * np.random.rand()\n",
"r4 = y_base * (10**2) * np.random.rand()\n",
"\n",
"ys = np.concatenate([r1, r2, r3, r4])\n",
"ds = np.tile(ds, 4)\n",
"unique_ids = ['r1'] * N + ['r2'] * N + ['r3'] * N + ['r4'] * N\n",
"top_level = 'Australia'\n",
"middle_level = ['State1'] * N * 2 + ['State2'] * N * 2\n",
"bottom_level = unique_ids\n",
"\n",
"bottom_df = dict(ds=ds,\n",
" top_level=top_level, \n",
" middle_level=middle_level, \n",
" bottom_level=bottom_level,\n",
" y=ys)\n",
"bottom_df = pd.DataFrame(bottom_df)\n",
"bottom_df.groupby('bottom_level').tail(2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GaajtZKx-VyD"
},
"source": [
"The previously introduced hierarchical series $\\mathbf{y}_{[a,b],\\tau}$ is captured within the `Y_hier_df` dataframe.\n",
"\n",
"The aggregation constraints matrix $\\mathbf{S}_{[a,b][b]}$ is captured within the `S_df` dataframe.\n",
"\n",
"Finally, `tags` maps each hierarchical level to the `unique_id`s in `Y_hier_df` that compose it; for example,\n",
"`tags['top_level']` contains the `unique_id` of `Australia`'s aggregated series."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vvteW2PF-VyD",
"outputId": "fb4a120a-02ef-41d9-ff67-67ae6309ef41"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"S_df.shape (7, 4)\n",
"Y_hier_df.shape (448, 3)\n",
"tags['top_level'] ['Australia']\n"
]
}
],
"source": [
"# Create hierarchical structure and constraints\n",
"hierarchy_levels = [['top_level'],\n",
" ['top_level', 'middle_level'],\n",
" ['top_level', 'middle_level', 'bottom_level']]\n",
"Y_hier_df, S_df, tags = aggregate(df=bottom_df, spec=hierarchy_levels)\n",
"Y_hier_df = Y_hier_df.reset_index()\n",
"print('S_df.shape', S_df.shape)\n",
"print('Y_hier_df.shape', Y_hier_df.shape)\n",
"print(\"tags['top_level']\", tags['top_level'])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 927
},
"id": "xAgjFxP1-VyE",
"outputId": "e5b8f4a4-72bf-4121-9629-78eca5033b34"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" unique_id ds y\n",
"60 Australia 2005-01-01 8925.384998\n",
"61 Australia 2005-02-01 9071.702785\n",
"62 Australia 2005-03-01 9218.020572\n",
"63 Australia 2005-04-01 9364.338359\n",
"124 Australia/State1 2005-01-01 808.405199\n",
"125 Australia/State1 2005-02-01 821.657744\n",
"126 Australia/State1 2005-03-01 834.910288\n",
"127 Australia/State1 2005-04-01 848.162832\n",
"188 Australia/State2 2005-01-01 8116.979799\n",
"189 Australia/State2 2005-02-01 8250.045041\n",
"190 Australia/State2 2005-03-01 8383.110284\n",
"191 Australia/State2 2005-04-01 8516.175526\n",
"252 Australia/State1/r1 2005-01-01 228.469472\n",
"253 Australia/State1/r1 2005-02-01 232.214874\n",
"254 Australia/State1/r1 2005-03-01 235.960275\n",
"255 Australia/State1/r1 2005-04-01 239.705676\n",
"316 Australia/State1/r2 2005-01-01 579.935727\n",
"317 Australia/State1/r2 2005-02-01 589.442870\n",
"318 Australia/State1/r2 2005-03-01 598.950013\n",
"319 Australia/State1/r2 2005-04-01 608.457156\n",
"380 Australia/State2/r3 2005-01-01 4465.163045\n",
"381 Australia/State2/r3 2005-02-01 4538.362439\n",
"382 Australia/State2/r3 2005-03-01 4611.561833\n",
"383 Australia/State2/r3 2005-04-01 4684.761228\n",
"444 Australia/State2/r4 2005-01-01 3651.816754\n",
"445 Australia/State2/r4 2005-02-01 3711.682602\n",
"446 Australia/State2/r4 2005-03-01 3771.548450\n",
"447 Australia/State2/r4 2005-04-01 3831.414299"
],
"text/html": [
"\n",
" <div id=\"df-e70de742-112b-4203-b938-db63442c2adc\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>unique_id</th>\n",
" <th>ds</th>\n",
" <th>y</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>60</th>\n",
" <td>Australia</td>\n",
" <td>2005-01-01</td>\n",
" <td>8925.384998</td>\n",
" </tr>\n",
" <tr>\n",
" <th>61</th>\n",
" <td>Australia</td>\n",
" <td>2005-02-01</td>\n",
" <td>9071.702785</td>\n",
" </tr>\n",
" <tr>\n",
" <th>62</th>\n",
" <td>Australia</td>\n",
" <td>2005-03-01</td>\n",
" <td>9218.020572</td>\n",
" </tr>\n",
" <tr>\n",
" <th>63</th>\n",
" <td>Australia</td>\n",
" <td>2005-04-01</td>\n",
" <td>9364.338359</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124</th>\n",
" <td>Australia/State1</td>\n",
" <td>2005-01-01</td>\n",
" <td>808.405199</td>\n",
" </tr>\n",
" <tr>\n",
" <th>125</th>\n",
" <td>Australia/State1</td>\n",
" <td>2005-02-01</td>\n",
" <td>821.657744</td>\n",
" </tr>\n",
" <tr>\n",
" <th>126</th>\n",
" <td>Australia/State1</td>\n",
" <td>2005-03-01</td>\n",
" <td>834.910288</td>\n",
" </tr>\n",
" <tr>\n",
" <th>127</th>\n",
" <td>Australia/State1</td>\n",
" <td>2005-04-01</td>\n",
" <td>848.162832</td>\n",
" </tr>\n",
" <tr>\n",
" <th>188</th>\n",
" <td>Australia/State2</td>\n",
" <td>2005-01-01</td>\n",
" <td>8116.979799</td>\n",
" </tr>\n",
" <tr>\n",
" <th>189</th>\n",
" <td>Australia/State2</td>\n",
" <td>2005-02-01</td>\n",
" <td>8250.045041</td>\n",
" </tr>\n",
" <tr>\n",
" <th>190</th>\n",
" <td>Australia/State2</td>\n",
" <td>2005-03-01</td>\n",
" <td>8383.110284</td>\n",
" </tr>\n",
" <tr>\n",
" <th>191</th>\n",
" <td>Australia/State2</td>\n",
" <td>2005-04-01</td>\n",
" <td>8516.175526</td>\n",
" </tr>\n",
" <tr>\n",
" <th>252</th>\n",
" <td>Australia/State1/r1</td>\n",
" <td>2005-01-01</td>\n",
" <td>228.469472</td>\n",
" </tr>\n",
" <tr>\n",
" <th>253</th>\n",
" <td>Australia/State1/r1</td>\n",
" <td>2005-02-01</td>\n",
" <td>232.214874</td>\n",
" </tr>\n",
" <tr>\n",
" <th>254</th>\n",
" <td>Australia/State1/r1</td>\n",
" <td>2005-03-01</td>\n",
" <td>235.960275</td>\n",
" </tr>\n",
" <tr>\n",
" <th>255</th>\n",
" <td>Australia/State1/r1</td>\n",
" <td>2005-04-01</td>\n",
" <td>239.705676</td>\n",
" </tr>\n",
" <tr>\n",
" <th>316</th>\n",
" <td>Australia/State1/r2</td>\n",
" <td>2005-01-01</td>\n",
" <td>579.935727</td>\n",
" </tr>\n",
" <tr>\n",
" <th>317</th>\n",
" <td>Australia/State1/r2</td>\n",
" <td>2005-02-01</td>\n",
" <td>589.442870</td>\n",
" </tr>\n",
" <tr>\n",
" <th>318</th>\n",
" <td>Australia/State1/r2</td>\n",
" <td>2005-03-01</td>\n",
" <td>598.950013</td>\n",
" </tr>\n",
" <tr>\n",
" <th>319</th>\n",
" <td>Australia/State1/r2</td>\n",
" <td>2005-04-01</td>\n",
" <td>608.457156</td>\n",
" </tr>\n",
" <tr>\n",
" <th>380</th>\n",
" <td>Australia/State2/r3</td>\n",
" <td>2005-01-01</td>\n",
" <td>4465.163045</td>\n",
" </tr>\n",
" <tr>\n",
" <th>381</th>\n",
" <td>Australia/State2/r3</td>\n",
" <td>2005-02-01</td>\n",
" <td>4538.362439</td>\n",
" </tr>\n",
" <tr>\n",
" <th>382</th>\n",
" <td>Australia/State2/r3</td>\n",
" <td>2005-03-01</td>\n",
" <td>4611.561833</td>\n",
" </tr>\n",
" <tr>\n",
" <th>383</th>\n",
" <td>Australia/State2/r3</td>\n",
" <td>2005-04-01</td>\n",
" <td>4684.761228</td>\n",
" </tr>\n",
" <tr>\n",
" <th>444</th>\n",
" <td>Australia/State2/r4</td>\n",
" <td>2005-01-01</td>\n",
" <td>3651.816754</td>\n",
" </tr>\n",
" <tr>\n",
" <th>445</th>\n",
" <td>Australia/State2/r4</td>\n",
" <td>2005-02-01</td>\n",
" <td>3711.682602</td>\n",
" </tr>\n",
" <tr>\n",
" <th>446</th>\n",
" <td>Australia/State2/r4</td>\n",
" <td>2005-03-01</td>\n",
" <td>3771.548450</td>\n",
" </tr>\n",
" <tr>\n",
" <th>447</th>\n",
" <td>Australia/State2/r4</td>\n",
" <td>2005-04-01</td>\n",
" <td>3831.414299</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e70de742-112b-4203-b938-db63442c2adc')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-e70de742-112b-4203-b938-db63442c2adc button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-e70de742-112b-4203-b938-db63442c2adc');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 5
}
],
"source": [
"Y_hier_df.groupby('unique_id').tail(4)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 269
},
"id": "yL6cVvdG-VyF",
"outputId": "0cea2b5f-d957-4c7a-b3db-4427dbe2bdf6"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Australia/State1/r1 Australia/State1/r2 \\\n",
"Australia 1.0 1.0 \n",
"Australia/State1 1.0 1.0 \n",
"Australia/State2 0.0 0.0 \n",
"Australia/State1/r1 1.0 0.0 \n",
"Australia/State1/r2 0.0 1.0 \n",
"Australia/State2/r3 0.0 0.0 \n",
"Australia/State2/r4 0.0 0.0 \n",
"\n",
" Australia/State2/r3 Australia/State2/r4 \n",
"Australia 1.0 1.0 \n",
"Australia/State1 0.0 0.0 \n",
"Australia/State2 1.0 1.0 \n",
"Australia/State1/r1 0.0 0.0 \n",
"Australia/State1/r2 0.0 0.0 \n",
"Australia/State2/r3 1.0 0.0 \n",
"Australia/State2/r4 0.0 1.0 "
],
"text/html": [
"\n",
" <div id=\"df-e1acccdb-5854-424b-94e4-a8101dda2791\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Australia/State1/r1</th>\n",
" <th>Australia/State1/r2</th>\n",
" <th>Australia/State2/r3</th>\n",
" <th>Australia/State2/r4</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Australia</th>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1</th>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2</th>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1/r1</th>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1/r2</th>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2/r3</th>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2/r4</th>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e1acccdb-5854-424b-94e4-a8101dda2791')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-e1acccdb-5854-424b-94e4-a8101dda2791 button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-e1acccdb-5854-424b-94e4-a8101dda2791');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 6
}
],
"source": [
"S_df"
]
},
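{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check of the coherency constraints, take the 2005-04-01 rows of `Y_hier_df` shown above. The subscripts $\\mathrm{State1}$ and $\\mathrm{State2}$ are shorthand introduced here for the two middle-level series, and the displayed values are rounded to six decimals, so the sums agree only up to the last digit:\n",
"\n",
"\\begin{align}\n",
"y_{\\mathrm{State1},\\tau} &\\approx 239.705676 + 608.457156 = 848.162832 \\\\\n",
"y_{\\mathrm{State2},\\tau} &\\approx 4684.761228 + 3831.414299 = 8516.175527 \\\\\n",
"y_{\\mathrm{Total},\\tau} &\\approx 848.162832 + 8516.175526 = 9364.338358\n",
"\\end{align}\n",
"\n",
"These agree with the `Australia/State1` (848.162832), `Australia/State2` (8516.175526), and `Australia` (9364.338359) rows, which is what the first three rows of `S_df` encode."
]
},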
{
"cell_type": "markdown",
"metadata": {
"id": "BxOnE_NH-VyG"
},
"source": [
"### Base Predictions"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "tjX7ws3c-VyG"
},
"outputs": [],
"source": [
"FH = 2\n",
"# Split train/test sets\n",
"Y_test_df = Y_hier_df.groupby('unique_id').tail(FH)\n",
"Y_train_df = Y_hier_df.drop(Y_test_df.index)"
]
},
{
"cell_type": "code",
"source": [
"Y_train_df.groupby(\"unique_id\").tail(2), Y_test_df.groupby(\"unique_id\").tail(2)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "RHGBx2i4J8Bm",
"outputId": "b8ab8bf3-ba6c-4417-db2a-7ccc7543bde7"
},
"execution_count": 8,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"( unique_id ds y\n",
" 60 Australia 2005-01-01 8925.384998\n",
" 61 Australia 2005-02-01 9071.702785\n",
" 124 Australia/State1 2005-01-01 808.405199\n",
" 125 Australia/State1 2005-02-01 821.657744\n",
" 188 Australia/State2 2005-01-01 8116.979799\n",
" 189 Australia/State2 2005-02-01 8250.045041\n",
" 252 Australia/State1/r1 2005-01-01 228.469472\n",
" 253 Australia/State1/r1 2005-02-01 232.214874\n",
" 316 Australia/State1/r2 2005-01-01 579.935727\n",
" 317 Australia/State1/r2 2005-02-01 589.442870\n",
" 380 Australia/State2/r3 2005-01-01 4465.163045\n",
" 381 Australia/State2/r3 2005-02-01 4538.362439\n",
" 444 Australia/State2/r4 2005-01-01 3651.816754\n",
" 445 Australia/State2/r4 2005-02-01 3711.682602,\n",
" unique_id ds y\n",
" 62 Australia 2005-03-01 9218.020572\n",
" 63 Australia 2005-04-01 9364.338359\n",
" 126 Australia/State1 2005-03-01 834.910288\n",
" 127 Australia/State1 2005-04-01 848.162832\n",
" 190 Australia/State2 2005-03-01 8383.110284\n",
" 191 Australia/State2 2005-04-01 8516.175526\n",
" 254 Australia/State1/r1 2005-03-01 235.960275\n",
" 255 Australia/State1/r1 2005-04-01 239.705676\n",
" 318 Australia/State1/r2 2005-03-01 598.950013\n",
" 319 Australia/State1/r2 2005-04-01 608.457156\n",
" 382 Australia/State2/r3 2005-03-01 4611.561833\n",
" 383 Australia/State2/r3 2005-04-01 4684.761228\n",
" 446 Australia/State2/r4 2005-03-01 3771.548450\n",
" 447 Australia/State2/r4 2005-04-01 3831.414299)"
]
},
"metadata": {},
"execution_count": 8
}
]
},
{
"cell_type": "code",
"source": [
"# Compute base Naive predictions\n",
"# Careful identifying correct data freq, this data quarterly 'Q'\n",
"# NG: Changed freq from Q to M since inut data is monthly\n",
"# NG: Changed model from Naive to ARIMA since Naive output is already reconciled without doing anything\n",
"SP = 5 \n",
"fcst = StatsForecast(df=Y_train_df,\n",
" models=[AutoARIMA(season_length=SP), ETS(season_length=SP)],\n",
" freq='M', n_jobs=-1)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "X1bf-KDsLk_x",
"outputId": "9c193c12-67c9-4801-b536-38cedde9241d"
},
"execution_count": 9,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.8/dist-packages/statsforecast/models.py:526: FutureWarning: `ETS` will be deprecated in future versions of `StatsForecast`. Please use `AutoETS` instead.\n",
" ETS._warn()\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"Y_hat_df = fcst.forecast(h=FH, fitted=True)\n",
"Y_fitted_df = fcst.forecast_fitted_values()"
],
"metadata": {
"id": "_VHdRCbJFP-X"
},
"execution_count": 10,
"outputs": []
},
{
"cell_type": "code",
"source": [
"Y_hat_df.groupby(\"unique_id\").tail(2), Y_fitted_df.groupby(\"unique_id\").tail(2)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "RjI6RwcMEWBm",
"outputId": "393d3b07-b80d-4a4c-8a2b-eea19ca2c224"
},
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"( ds AutoARIMA ETS\n",
" unique_id \n",
" Australia 2005-02-28 9198.676758 9218.020508\n",
" Australia 2005-03-31 9298.998047 9364.338867\n",
" Australia/State1 2005-02-28 833.157959 834.910278\n",
" Australia/State1 2005-03-31 842.244324 848.162842\n",
" Australia/State1/r1 2005-02-28 235.465103 235.960281\n",
" Australia/State1/r1 2005-03-31 238.033112 239.705673\n",
" Australia/State1/r2 2005-02-28 597.693054 598.950012\n",
" Australia/State1/r2 2005-03-31 604.211487 608.457153\n",
" Australia/State2 2005-02-28 8365.516602 8383.110352\n",
" Australia/State2 2005-03-31 8456.750977 8516.175781\n",
" Australia/State2/r3 2005-02-28 4601.883789 4611.562012\n",
" Australia/State2/r3 2005-03-31 4652.071777 4684.761230\n",
" Australia/State2/r4 2005-02-28 3763.633789 3771.548340\n",
" Australia/State2/r4 2005-03-31 3804.680176 3831.414307,\n",
" ds y AutoARIMA ETS\n",
" unique_id \n",
" Australia 2005-01-01 8925.384766 8906.094727 8925.385742\n",
" Australia 2005-02-01 9071.703125 9052.153320 9071.703125\n",
" Australia/State1 2005-01-01 808.405212 806.657959 808.405212\n",
" Australia/State1 2005-02-01 821.657715 819.887268 821.657776\n",
" Australia/State1/r1 2005-01-01 228.469467 227.975677 228.469467\n",
" Australia/State1/r1 2005-02-01 232.214874 231.714462 232.214874\n",
" Australia/State1/r2 2005-01-01 579.935730 578.682190 579.935730\n",
" Australia/State1/r2 2005-02-01 589.442871 588.172729 589.442871\n",
" Australia/State2 2005-01-01 8116.979980 8099.435547 8116.979980\n",
" Australia/State2 2005-02-01 8250.044922 8232.268555 8250.044922\n",
" Australia/State2/r3 2005-01-01 4465.163086 4455.512695 4465.163086\n",
" Australia/State2/r3 2005-02-01 4538.362305 4528.583008 4538.362305\n",
" Australia/State2/r4 2005-01-01 3651.816650 3643.923828 3651.816650\n",
" Australia/State2/r4 2005-02-01 3711.682617 3703.684326 3711.682617)"
]
},
"metadata": {},
"execution_count": 11
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0f6xcA6t-VyH"
},
"source": [
"### Reconciliation"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 520
},
"id": "1saCZinF-VyH",
"outputId": "7ab9e2d5-952e-40e9-cec2-48cadb4c2d84"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" ds AutoARIMA ETS AutoARIMA/BottomUp \\\n",
"unique_id \n",
"Australia 2005-02-28 9198.676758 9218.020508 9198.675781 \n",
"Australia 2005-03-31 9298.998047 9364.338867 9298.996094 \n",
"Australia/State1 2005-02-28 833.157959 834.910278 833.158142 \n",
"Australia/State1 2005-03-31 842.244324 848.162842 842.244629 \n",
"Australia/State1/r1 2005-02-28 235.465103 235.960281 235.465103 \n",
"Australia/State1/r1 2005-03-31 238.033112 239.705673 238.033112 \n",
"Australia/State1/r2 2005-02-28 597.693054 598.950012 597.693054 \n",
"Australia/State1/r2 2005-03-31 604.211487 608.457153 604.211487 \n",
"Australia/State2 2005-02-28 8365.516602 8383.110352 8365.517578 \n",
"Australia/State2 2005-03-31 8456.750977 8516.175781 8456.751953 \n",
"Australia/State2/r3 2005-02-28 4601.883789 4611.562012 4601.883789 \n",
"Australia/State2/r3 2005-03-31 4652.071777 4684.761230 4652.071777 \n",
"Australia/State2/r4 2005-02-28 3763.633789 3771.548340 3763.633789 \n",
"Australia/State2/r4 2005-03-31 3804.680176 3831.414307 3804.680176 \n",
"\n",
" ETS/BottomUp \n",
"unique_id \n",
"Australia 9218.020508 \n",
"Australia 9364.337891 \n",
"Australia/State1 834.910278 \n",
"Australia/State1 848.162842 \n",
"Australia/State1/r1 235.960281 \n",
"Australia/State1/r1 239.705673 \n",
"Australia/State1/r2 598.950012 \n",
"Australia/State1/r2 608.457153 \n",
"Australia/State2 8383.110352 \n",
"Australia/State2 8516.175781 \n",
"Australia/State2/r3 4611.562012 \n",
"Australia/State2/r3 4684.761230 \n",
"Australia/State2/r4 3771.548340 \n",
"Australia/State2/r4 3831.414307 "
],
"text/html": [
"\n",
" <div id=\"df-f41d5489-f2e1-45bb-984f-14914b414633\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>ds</th>\n",
" <th>AutoARIMA</th>\n",
" <th>ETS</th>\n",
" <th>AutoARIMA/BottomUp</th>\n",
" <th>ETS/BottomUp</th>\n",
" </tr>\n",
" <tr>\n",
" <th>unique_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Australia</th>\n",
" <td>2005-02-28</td>\n",
" <td>9198.676758</td>\n",
" <td>9218.020508</td>\n",
" <td>9198.675781</td>\n",
" <td>9218.020508</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia</th>\n",
" <td>2005-03-31</td>\n",
" <td>9298.998047</td>\n",
" <td>9364.338867</td>\n",
" <td>9298.996094</td>\n",
" <td>9364.337891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1</th>\n",
" <td>2005-02-28</td>\n",
" <td>833.157959</td>\n",
" <td>834.910278</td>\n",
" <td>833.158142</td>\n",
" <td>834.910278</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1</th>\n",
" <td>2005-03-31</td>\n",
" <td>842.244324</td>\n",
" <td>848.162842</td>\n",
" <td>842.244629</td>\n",
" <td>848.162842</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1/r1</th>\n",
" <td>2005-02-28</td>\n",
" <td>235.465103</td>\n",
" <td>235.960281</td>\n",
" <td>235.465103</td>\n",
" <td>235.960281</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1/r1</th>\n",
" <td>2005-03-31</td>\n",
" <td>238.033112</td>\n",
" <td>239.705673</td>\n",
" <td>238.033112</td>\n",
" <td>239.705673</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1/r2</th>\n",
" <td>2005-02-28</td>\n",
" <td>597.693054</td>\n",
" <td>598.950012</td>\n",
" <td>597.693054</td>\n",
" <td>598.950012</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State1/r2</th>\n",
" <td>2005-03-31</td>\n",
" <td>604.211487</td>\n",
" <td>608.457153</td>\n",
" <td>604.211487</td>\n",
" <td>608.457153</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2</th>\n",
" <td>2005-02-28</td>\n",
" <td>8365.516602</td>\n",
" <td>8383.110352</td>\n",
" <td>8365.517578</td>\n",
" <td>8383.110352</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2</th>\n",
" <td>2005-03-31</td>\n",
" <td>8456.750977</td>\n",
" <td>8516.175781</td>\n",
" <td>8456.751953</td>\n",
" <td>8516.175781</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2/r3</th>\n",
" <td>2005-02-28</td>\n",
" <td>4601.883789</td>\n",
" <td>4611.562012</td>\n",
" <td>4601.883789</td>\n",
" <td>4611.562012</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2/r3</th>\n",
" <td>2005-03-31</td>\n",
" <td>4652.071777</td>\n",
" <td>4684.761230</td>\n",
" <td>4652.071777</td>\n",
" <td>4684.761230</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2/r4</th>\n",
" <td>2005-02-28</td>\n",
" <td>3763.633789</td>\n",
" <td>3771.548340</td>\n",
" <td>3763.633789</td>\n",
" <td>3771.548340</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Australia/State2/r4</th>\n",
" <td>2005-03-31</td>\n",
" <td>3804.680176</td>\n",
" <td>3831.414307</td>\n",
" <td>3804.680176</td>\n",
" <td>3831.414307</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f41d5489-f2e1-45bb-984f-14914b414633')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-f41d5489-f2e1-45bb-984f-14914b414633 button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-f41d5489-f2e1-45bb-984f-14914b414633');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 12
}
],
"source": [
"# You can select a reconciler from our collection\n",
"reconcilers = [BottomUp()] # MinTrace(method='mint_shrink')\n",
"hrec = HierarchicalReconciliation(reconcilers=reconcilers)\n",
"\n",
"Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, \n",
" Y_df=Y_fitted_df,\n",
" S=S_df, tags=tags)\n",
"Y_rec_df.groupby('unique_id').head(2)"
]
},
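{
"cell_type": "markdown",
"metadata": {},
"source": [
"A worked sketch of the bottom-up step shown above, assuming the rows of the summing matrix are ordered Australia, State1, State2, r1, r2, r3, r4. The symbols $\\mathbf{S}$, $\\hat{\\mathbf{y}}_{b}$ and $\\tilde{\\mathbf{y}}$ are notation chosen for this illustration. Bottom-up keeps the four regional base forecasts and rebuilds every aggregate by summation:\n",
"\n",
"$$\n",
"\\tilde{\\mathbf{y}} = \\mathbf{S}\\,\\hat{\\mathbf{y}}_{b}, \\qquad\n",
"\\mathbf{S} = \\begin{bmatrix} 1 & 1 & 1 & 1 \\\\ 1 & 1 & 0 & 0 \\\\ 0 & 0 & 1 & 1 \\\\ 1 & 0 & 0 & 0 \\\\ 0 & 1 & 0 & 0 \\\\ 0 & 0 & 1 & 0 \\\\ 0 & 0 & 0 & 1 \\end{bmatrix}, \\qquad\n",
"\\hat{\\mathbf{y}}_{b} = \\begin{bmatrix} \\hat{y}_{r1} \\\\ \\hat{y}_{r2} \\\\ \\hat{y}_{r3} \\\\ \\hat{y}_{r4} \\end{bmatrix}\n",
"$$\n",
"\n",
"Plugging in the `AutoARIMA` base forecasts from the first forecast row (`ds = 2005-02-28`):\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\tilde{y}_{\\text{State1}} &= 235.465103 + 597.693054 \\approx 833.158 \\\\\n",
"\\tilde{y}_{\\text{State2}} &= 4601.883789 + 3763.633789 \\approx 8365.518 \\\\\n",
"\\tilde{y}_{\\text{Australia}} &\\approx 833.158 + 8365.518 = 9198.676\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"These agree with the `AutoARIMA/BottomUp` column up to floating-point rounding. The bottom-level rows are unchanged, and the aggregates move only slightly from their base values (for example, Australia goes from 9198.676758 to 9198.675781) because the base forecasts were already nearly coherent."
]
},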
{
"cell_type": "markdown",
"metadata": {
"id": "JpZJ18HK-VyI"
},
"source": [
"## References"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RMpDFTQ_-VyL"
},
"source": [
"- [Hyndman, R.J., & Athanasopoulos, G. (2021). \"Forecasting: principles and practice, 3rd edition: \n",
"Chapter 11: Forecasting hierarchical and grouped series.\". OTexts: Melbourne, Australia. OTexts.com/fpp3 \n",
"Accessed on July 2022.](https://otexts.com/fpp3/hierarchical.html)<br>\n",
"- [Orcutt, G.H., Watts, H.W., & Edwards, J.B.(1968). Data aggregation and information loss. The American \n",
"Economic Review, 58 , 773{787).](http://www.jstor.org/stable/1815532)<br>\n",
"- [Disaggregation methods to expedite product line forecasting. Journal of Forecasting, 9 , 233–254. \n",
"doi:10.1002/for.3980090304.](https://onlinelibrary.wiley.com/doi/abs/10.1002/for.3980090304)<br>\n",
"- [Wickramasuriya, S. L., Athanasopoulos, G., & Hyndman, R. J. (2019). \\\"Optimal forecast reconciliation for\n",
"hierarchical and grouped time series through trace minimization\\\". Journal of the American Statistical Association, \n",
"114 , 804–819. doi:10.1080/01621459.2018.1448825.](https://robjhyndman.com/publications/mint/)<br>\n",
"- [Ben Taieb, S., & Koo, B. (2019). Regularized regression for hierarchical forecasting without \n",
"unbiasedness conditions. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge \n",
"Discovery & Data Mining KDD '19 (p. 1337{1347). New York, NY, USA: Association for Computing Machinery.](https://doi.org/10.1145/3292500.3330976)<br>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.10.6 ('hierarchicalforecast')",
"language": "python",
"name": "python3"
},
"colab": {
"provenance": [],
"include_colab_link": true
}
},
"nbformat": 4,
"nbformat_minor": 0
}