@max-kuk
Created January 17, 2020 09:19
GradientBoostingV3.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
"metadata": {
"collapsed": false
},
"source": []
}
},
"colab": {
"name": "GradientBoostingV3.ipynb",
"provenance": [],
"include_colab_link": true
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/max-kuk/28100fa8587b76a3a50e4f4a9fb09733/gradientboostingv3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hZrNAobMDEy7",
"colab_type": "text"
},
"source": [
"# Gradient Boosting"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ObjqBm-PFnFc",
"colab_type": "text"
},
"source": [
"## Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sBnu84B9E2UA",
"colab_type": "text"
},
"source": [
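"Gradient boosting is an ensemble method that combines many weak learners into one strong learner. The model is built sequentially: each new tree is fit on a modified version of the original data set, typically the residual errors of the trees built so far. A gradient boosting regressor (GBR) is built on top of decision trees and usually predicts better than any single tree."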
]
},
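{
"cell_type": "markdown",
"metadata": {},
"source": [
"The boosting idea described above can be sketched in a few lines. This is a minimal illustration for squared-error regression, assuming shallow scikit-learn decision trees as the weak learners. The toy data and the names `X_toy`, `y_toy`, `learning_rate` and `n_rounds` are assumptions made for this example only; they are not taken from the challenge data or from the models trained in this notebook."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"from sklearn.tree import DecisionTreeRegressor\n",
"\n",
"# hypothetical toy data: a noisy sine curve, not the challenge data set\n",
"rng = np.random.RandomState(0)\n",
"X_toy = rng.uniform(-3, 3, size=(200, 1))\n",
"y_toy = np.sin(X_toy[:, 0]) + rng.normal(scale=0.1, size=200)\n",
"\n",
"learning_rate = 0.1  # illustrative value\n",
"n_rounds = 50        # illustrative value\n",
"\n",
"# start from a constant prediction: the mean of the target\n",
"prediction = np.full_like(y_toy, y_toy.mean())\n",
"mse_start = np.mean((y_toy - prediction) ** 2)\n",
"\n",
"for _ in range(n_rounds):\n",
"    # the 'modified data set': residuals of the current ensemble,\n",
"    # i.e. the negative gradient of 0.5 * squared error\n",
"    residuals = y_toy - prediction\n",
"    tree = DecisionTreeRegressor(max_depth=2, random_state=0)\n",
"    tree.fit(X_toy, residuals)\n",
"    # add a damped version of the new tree to the ensemble\n",
"    prediction += learning_rate * tree.predict(X_toy)\n",
"\n",
"mse_end = np.mean((y_toy - prediction) ** 2)\n",
"print('training MSE before boosting:', mse_start)\n",
"print('training MSE after boosting: ', mse_end)"
],
"execution_count": 0,
"outputs": []
},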
{
"cell_type": "code",
"metadata": {
"id": "80sVr8H_wWhJ",
"colab_type": "code",
"outputId": "b29d8d73-acf8-4237-cb85-e928fdcc95be",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 54
}
},
"source": [
"!pip install scikit-learn==0.22 --quiet\n",
"!pip install catboost --quiet"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"pycharm": {
"is_executing": false
},
"id": "TchBVi7ZwUZa",
"colab_type": "code",
"colab": {}
},
"source": [
"import xgboost as xgb\n",
"import catboost as cb\n",
"import lightgbm as lgb\n",
"import numpy as np\n",
"import pandas as pd\n",
"from joblib import dump, load\n",
"from google.colab import drive\n",
"from sklearn.metrics import mean_squared_error\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.experimental import enable_hist_gradient_boosting\n",
"from sklearn.ensemble import HistGradientBoostingRegressor\n",
"from sklearn.ensemble import VotingRegressor\n",
"from sklearn.ensemble import ExtraTreesRegressor"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "RDqGFcmdwmjz",
"colab_type": "code",
"outputId": "d1d03af2-206c-4911-84ee-afa7721e9a19",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 129
}
},
"source": [
"# mount Google Drive\n",
"drive.mount('/content/drive')"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "A0CosB0bwUZr",
"colab_type": "code",
"outputId": "0cca1b68-356e-4219-bd16-a04a950fac2c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 313
}
},
"source": [
"# load the CSV file as a DataFrame\n",
"# change the path to your own\n",
"df = pd.read_csv('/content/drive/My Drive/Optimax Machine Learning Challenge/Optimax/Max/training_data.csv', parse_dates = ['delivery_start'])\n",
"\n",
"df.head()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>delivery_start</th>\n",
" <th>saldo_final_target</th>\n",
" <th>wgtavg_current_target</th>\n",
" <th>recent_wgt_avg_current_target</th>\n",
" <th>DA_price_target</th>\n",
" <th>solar_current_target</th>\n",
" <th>wind_current_target</th>\n",
" <th>solar_current_1_target</th>\n",
" <th>wind_current_1_target</th>\n",
" <th>solar_current_24_target</th>\n",
" <th>wind_current_24_target</th>\n",
" <th>solar_current_48_target</th>\n",
" <th>wind_current_48_target</th>\n",
" <th>solar_current_96_target</th>\n",
" <th>wind_current_96_target</th>\n",
" <th>REST_NET_current_target</th>\n",
" <th>OPTIMIZATION_NET_current_target</th>\n",
" <th>pos_AFRR_avg_log_target</th>\n",
" <th>pos_MFRR_avg_log_target</th>\n",
" <th>neg_AFRR_avg_log_target</th>\n",
" <th>neg_MFRR_avg_log_target</th>\n",
" <th>weekend_target</th>\n",
" <th>holiday_de_target</th>\n",
" <th>max_GER_daily_target</th>\n",
" <th>min_GER_daily_target</th>\n",
" <th>max_GER_daily_quad_target</th>\n",
" <th>min_GER_daily_quad_target</th>\n",
" <th>max_GER_hist_daily_target</th>\n",
" <th>min_GER_hist_daily_target</th>\n",
" <th>max_GER_hist_daily_quad_target</th>\n",
" <th>min_GER_hist_daily_quad_target</th>\n",
" <th>max_GER_daily_yesterday_target</th>\n",
" <th>min_GER_daily_yesterday_target</th>\n",
" <th>max_GER_daily_quad_yesterday_target</th>\n",
" <th>min_GER_daily_quad_yesterday_target</th>\n",
" <th>max_GER_hist_daily_yesterday_target</th>\n",
" <th>min_GER_hist_daily_yesterday_target</th>\n",
" <th>max_GER_hist_daily_quad_yesterday_target</th>\n",
" <th>min_GER_hist_daily_quad_yesterday_target</th>\n",
" <th>max_GER_daily_before_yesterday_target</th>\n",
" <th>min_GER_daily_before_yesterday_target</th>\n",
" <th>max_GER_daily_quad_before_yesterday_target</th>\n",
" <th>min_GER_daily_quad_before_yesterday_target</th>\n",
" <th>max_GER_hist_daily_before_yesterday_target</th>\n",
" <th>min_GER_hist_daily_before_yesterday_target</th>\n",
" <th>max_GER_hist_daily_quad_before_yesterday_target</th>\n",
" <th>min_GER_hist_daily_quad_before_yesterday_target</th>\n",
" <th>light_hour_target</th>\n",
" <th>saldo_latest</th>\n",
" <th>wgtavg_latest</th>\n",
" <th>recent_wgt_avg_current_latest</th>\n",
" <th>DA_price_latest</th>\n",
" <th>solar_current_latest</th>\n",
" <th>wind_current_latest</th>\n",
" <th>solar_current_1_latest</th>\n",
" <th>wind_current_1_latest</th>\n",
" <th>solar_current_24_latest</th>\n",
" <th>wind_current_24_latest</th>\n",
" <th>solar_current_48_latest</th>\n",
" <th>wind_current_48_latest</th>\n",
" <th>solar_current_96_latest</th>\n",
" <th>wind_current_96_latest</th>\n",
" <th>REST_NET_current_latest</th>\n",
" <th>OPTIMIZATION_NET_current_latest</th>\n",
" <th>pos_AFRR_avg_log_latest</th>\n",
" <th>pos_MFRR_avg_log_latest</th>\n",
" <th>neg_AFRR_avg_log_latest</th>\n",
" <th>neg_MFRR_avg_log_latest</th>\n",
" <th>weekend_latest</th>\n",
" <th>holiday_de_latest</th>\n",
" <th>light_hour_latest</th>\n",
" <th>saldo_4_latest</th>\n",
" <th>saldo_8_latest</th>\n",
" <th>saldo_12_latest</th>\n",
" <th>saldo_16_latest</th>\n",
" <th>saldo_20_latest</th>\n",
" <th>saldo_24_latest</th>\n",
" <th>saldo_28_latest</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2018-08-31 05:30:00</td>\n",
" <td>-725100.0</td>\n",
" <td>4981</td>\n",
" <td>5879</td>\n",
" <td>5507</td>\n",
" <td>0</td>\n",
" <td>9067370</td>\n",
" <td>0.0</td>\n",
" <td>9113180.0</td>\n",
" <td>0</td>\n",
" <td>9639630</td>\n",
" <td>0</td>\n",
" <td>10457410</td>\n",
" <td>0</td>\n",
" <td>10382260</td>\n",
" <td>-44900.0</td>\n",
" <td>-12000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>770.06</td>\n",
" <td>225.0</td>\n",
" <td>20.8</td>\n",
" <td>11.4</td>\n",
" <td>432.64</td>\n",
" <td>129.96</td>\n",
" <td>5.50</td>\n",
" <td>113981</td>\n",
" <td>4539</td>\n",
" <td>3293</td>\n",
" <td>5234</td>\n",
" <td>0</td>\n",
" <td>9558560</td>\n",
" <td>0</td>\n",
" <td>9631190</td>\n",
" <td>0.0</td>\n",
" <td>9932050</td>\n",
" <td>0.0</td>\n",
" <td>10661320.0</td>\n",
" <td>0</td>\n",
" <td>10623150</td>\n",
" <td>-64500.0</td>\n",
" <td>-4100.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>4.50</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>95488</td>\n",
" <td>-86087</td>\n",
" <td>-1010</td>\n",
" <td>-30138</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2018-08-31 05:45:00</td>\n",
" <td>-757308.0</td>\n",
" <td>5999</td>\n",
" <td>6119</td>\n",
" <td>7117</td>\n",
" <td>0</td>\n",
" <td>8854150</td>\n",
" <td>0.0</td>\n",
" <td>8969140.0</td>\n",
" <td>0</td>\n",
" <td>9511380</td>\n",
" <td>0</td>\n",
" <td>10377630</td>\n",
" <td>0</td>\n",
" <td>10295260</td>\n",
" <td>-45300.0</td>\n",
" <td>20000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>770.06</td>\n",
" <td>225.0</td>\n",
" <td>20.8</td>\n",
" <td>11.4</td>\n",
" <td>432.64</td>\n",
" <td>129.96</td>\n",
" <td>5.75</td>\n",
" <td>-546964</td>\n",
" <td>5209</td>\n",
" <td>5459</td>\n",
" <td>5712</td>\n",
" <td>0</td>\n",
" <td>9384930</td>\n",
" <td>0</td>\n",
" <td>9481980</td>\n",
" <td>0.0</td>\n",
" <td>9910670</td>\n",
" <td>0.0</td>\n",
" <td>10629780.0</td>\n",
" <td>0</td>\n",
" <td>10580490</td>\n",
" <td>-64500.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>4.75</td>\n",
" <td>-165132</td>\n",
" <td>-558223</td>\n",
" <td>-501631</td>\n",
" <td>21421</td>\n",
" <td>211732</td>\n",
" <td>-273266</td>\n",
" <td>-190866</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2018-08-31 06:00:00</td>\n",
" <td>-264864.0</td>\n",
" <td>5108</td>\n",
" <td>3833</td>\n",
" <td>4998</td>\n",
" <td>0</td>\n",
" <td>8802050</td>\n",
" <td>0.0</td>\n",
" <td>8821560.0</td>\n",
" <td>0</td>\n",
" <td>9444080</td>\n",
" <td>0</td>\n",
" <td>10374940</td>\n",
" <td>0</td>\n",
" <td>10286420</td>\n",
" <td>500.0</td>\n",
" <td>-8000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>770.06</td>\n",
" <td>225.0</td>\n",
" <td>20.8</td>\n",
" <td>11.4</td>\n",
" <td>432.64</td>\n",
" <td>129.96</td>\n",
" <td>6.00</td>\n",
" <td>-313044</td>\n",
" <td>3788</td>\n",
" <td>4795</td>\n",
" <td>4243</td>\n",
" <td>0</td>\n",
" <td>9259790</td>\n",
" <td>0</td>\n",
" <td>9309050</td>\n",
" <td>0.0</td>\n",
" <td>9838350</td>\n",
" <td>0.0</td>\n",
" <td>10599740.0</td>\n",
" <td>0</td>\n",
" <td>10547400</td>\n",
" <td>-35300.0</td>\n",
" <td>-19900.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.00</td>\n",
" <td>69905</td>\n",
" <td>-429277</td>\n",
" <td>-406975</td>\n",
" <td>-390933</td>\n",
" <td>-188560</td>\n",
" <td>98356</td>\n",
" <td>15057</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2018-08-31 06:15:00</td>\n",
" <td>-383680.0</td>\n",
" <td>4841</td>\n",
" <td>4436</td>\n",
" <td>5790</td>\n",
" <td>13420</td>\n",
" <td>8560360</td>\n",
" <td>13420.0</td>\n",
" <td>8659060.0</td>\n",
" <td>13350</td>\n",
" <td>9280500</td>\n",
" <td>10540</td>\n",
" <td>10258540</td>\n",
" <td>16800</td>\n",
" <td>10173690</td>\n",
" <td>10000.0</td>\n",
" <td>0.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>770.06</td>\n",
" <td>225.0</td>\n",
" <td>20.8</td>\n",
" <td>11.4</td>\n",
" <td>432.64</td>\n",
" <td>129.96</td>\n",
" <td>6.25</td>\n",
" <td>-466235</td>\n",
" <td>3993</td>\n",
" <td>5887</td>\n",
" <td>4792</td>\n",
" <td>0</td>\n",
" <td>9037960</td>\n",
" <td>0</td>\n",
" <td>9162720</td>\n",
" <td>0.0</td>\n",
" <td>9703380</td>\n",
" <td>0.0</td>\n",
" <td>10529810.0</td>\n",
" <td>0</td>\n",
" <td>10465710</td>\n",
" <td>-45300.0</td>\n",
" <td>-29700.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.25</td>\n",
" <td>141107</td>\n",
" <td>-313306</td>\n",
" <td>-606466</td>\n",
" <td>-452395</td>\n",
" <td>-227666</td>\n",
" <td>-141029</td>\n",
" <td>-140003</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2018-08-31 06:30:00</td>\n",
" <td>-786364.0</td>\n",
" <td>5883</td>\n",
" <td>3483</td>\n",
" <td>7091</td>\n",
" <td>77930</td>\n",
" <td>8346200</td>\n",
" <td>78430.0</td>\n",
" <td>8421290.0</td>\n",
" <td>79040</td>\n",
" <td>9144950</td>\n",
" <td>55470</td>\n",
" <td>10142130</td>\n",
" <td>104750</td>\n",
" <td>9995690</td>\n",
" <td>10400.0</td>\n",
" <td>-17000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>770.06</td>\n",
" <td>225.0</td>\n",
" <td>20.8</td>\n",
" <td>11.4</td>\n",
" <td>432.64</td>\n",
" <td>129.96</td>\n",
" <td>6.50</td>\n",
" <td>-726191</td>\n",
" <td>5448</td>\n",
" <td>5486</td>\n",
" <td>5507</td>\n",
" <td>0</td>\n",
" <td>8930020</td>\n",
" <td>0</td>\n",
" <td>8947730</td>\n",
" <td>0.0</td>\n",
" <td>9558500</td>\n",
" <td>0.0</td>\n",
" <td>10457410.0</td>\n",
" <td>0</td>\n",
" <td>10382240</td>\n",
" <td>-44700.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.50</td>\n",
" <td>113981</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>95488</td>\n",
" <td>-86087</td>\n",
" <td>-1010</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" delivery_start saldo_final_target ... saldo_24_latest saldo_28_latest\n",
"0 2018-08-31 05:30:00 -725100.0 ... -1010 -30138\n",
"1 2018-08-31 05:45:00 -757308.0 ... -273266 -190866\n",
"2 2018-08-31 06:00:00 -264864.0 ... 98356 15057\n",
"3 2018-08-31 06:15:00 -383680.0 ... -141029 -140003\n",
"4 2018-08-31 06:30:00 -786364.0 ... -86087 -1010\n",
"\n",
"[5 rows x 78 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 220
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UdLSWJ2cpAWx",
"colab_type": "text"
},
"source": [
"## Feature preprocessing"
]
},
{
"cell_type": "code",
"metadata": {
"id": "vDNS9qYGwUZ0",
"colab_type": "code",
"outputId": "a24fcb4e-816f-42e1-c7ba-75998a726b32",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 726
}
},
"source": [
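"# expand the delivery timestamp into numeric calendar features\n",
"df['year'] = df['delivery_start'].dt.year\n",
"df['day_of_year'] = df['delivery_start'].dt.dayofyear\n",
"df['weekday'] = df['delivery_start'].dt.weekday\n",
"# .dt.week was removed in pandas 2.0; on newer pandas use .dt.isocalendar().week instead\n",
"df['week_of_year'] = df['delivery_start'].dt.week\n",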
"df['day_of_month'] = df['delivery_start'].dt.day\n",
"df['quarter'] = df['delivery_start'].dt.quarter\n",
"df['hour'] = df['delivery_start'].dt.hour\n",
"df['minute'] = df['delivery_start'].dt.minute\n",
"df.drop('delivery_start', axis=1, inplace=True)\n",
"df.head(20)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>saldo_final_target</th>\n",
" <th>wgtavg_current_target</th>\n",
" <th>recent_wgt_avg_current_target</th>\n",
" <th>DA_price_target</th>\n",
" <th>solar_current_target</th>\n",
" <th>wind_current_target</th>\n",
" <th>solar_current_1_target</th>\n",
" <th>wind_current_1_target</th>\n",
" <th>solar_current_24_target</th>\n",
" <th>wind_current_24_target</th>\n",
" <th>solar_current_48_target</th>\n",
" <th>wind_current_48_target</th>\n",
" <th>solar_current_96_target</th>\n",
" <th>wind_current_96_target</th>\n",
" <th>REST_NET_current_target</th>\n",
" <th>OPTIMIZATION_NET_current_target</th>\n",
" <th>pos_AFRR_avg_log_target</th>\n",
" <th>pos_MFRR_avg_log_target</th>\n",
" <th>neg_AFRR_avg_log_target</th>\n",
" <th>neg_MFRR_avg_log_target</th>\n",
" <th>weekend_target</th>\n",
" <th>holiday_de_target</th>\n",
" <th>max_GER_daily_target</th>\n",
" <th>min_GER_daily_target</th>\n",
" <th>max_GER_daily_quad_target</th>\n",
" <th>min_GER_daily_quad_target</th>\n",
" <th>max_GER_hist_daily_target</th>\n",
" <th>min_GER_hist_daily_target</th>\n",
" <th>max_GER_hist_daily_quad_target</th>\n",
" <th>min_GER_hist_daily_quad_target</th>\n",
" <th>max_GER_daily_yesterday_target</th>\n",
" <th>min_GER_daily_yesterday_target</th>\n",
" <th>max_GER_daily_quad_yesterday_target</th>\n",
" <th>min_GER_daily_quad_yesterday_target</th>\n",
" <th>max_GER_hist_daily_yesterday_target</th>\n",
" <th>min_GER_hist_daily_yesterday_target</th>\n",
" <th>max_GER_hist_daily_quad_yesterday_target</th>\n",
" <th>min_GER_hist_daily_quad_yesterday_target</th>\n",
" <th>max_GER_daily_before_yesterday_target</th>\n",
" <th>min_GER_daily_before_yesterday_target</th>\n",
" <th>...</th>\n",
" <th>min_GER_hist_daily_quad_before_yesterday_target</th>\n",
" <th>light_hour_target</th>\n",
" <th>saldo_latest</th>\n",
" <th>wgtavg_latest</th>\n",
" <th>recent_wgt_avg_current_latest</th>\n",
" <th>DA_price_latest</th>\n",
" <th>solar_current_latest</th>\n",
" <th>wind_current_latest</th>\n",
" <th>solar_current_1_latest</th>\n",
" <th>wind_current_1_latest</th>\n",
" <th>solar_current_24_latest</th>\n",
" <th>wind_current_24_latest</th>\n",
" <th>solar_current_48_latest</th>\n",
" <th>wind_current_48_latest</th>\n",
" <th>solar_current_96_latest</th>\n",
" <th>wind_current_96_latest</th>\n",
" <th>REST_NET_current_latest</th>\n",
" <th>OPTIMIZATION_NET_current_latest</th>\n",
" <th>pos_AFRR_avg_log_latest</th>\n",
" <th>pos_MFRR_avg_log_latest</th>\n",
" <th>neg_AFRR_avg_log_latest</th>\n",
" <th>neg_MFRR_avg_log_latest</th>\n",
" <th>weekend_latest</th>\n",
" <th>holiday_de_latest</th>\n",
" <th>light_hour_latest</th>\n",
" <th>saldo_4_latest</th>\n",
" <th>saldo_8_latest</th>\n",
" <th>saldo_12_latest</th>\n",
" <th>saldo_16_latest</th>\n",
" <th>saldo_20_latest</th>\n",
" <th>saldo_24_latest</th>\n",
" <th>saldo_28_latest</th>\n",
" <th>year</th>\n",
" <th>day_of_year</th>\n",
" <th>weekday</th>\n",
" <th>week_of_year</th>\n",
" <th>day_of_month</th>\n",
" <th>quarter</th>\n",
" <th>hour</th>\n",
" <th>minute</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>-725100.0</td>\n",
" <td>4981</td>\n",
" <td>5879</td>\n",
" <td>5507</td>\n",
" <td>0</td>\n",
" <td>9067370</td>\n",
" <td>0.0</td>\n",
" <td>9113180.0</td>\n",
" <td>0</td>\n",
" <td>9639630</td>\n",
" <td>0</td>\n",
" <td>10457410</td>\n",
" <td>0</td>\n",
" <td>10382260</td>\n",
" <td>-44900.0</td>\n",
" <td>-12000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>5.50</td>\n",
" <td>113981</td>\n",
" <td>4539</td>\n",
" <td>3293</td>\n",
" <td>5234</td>\n",
" <td>0</td>\n",
" <td>9558560</td>\n",
" <td>0</td>\n",
" <td>9631190</td>\n",
" <td>0.0</td>\n",
" <td>9932050</td>\n",
" <td>0.0</td>\n",
" <td>10661320.0</td>\n",
" <td>0</td>\n",
" <td>10623150</td>\n",
" <td>-64500.0</td>\n",
" <td>-4100.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>4.50</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>95488</td>\n",
" <td>-86087</td>\n",
" <td>-1010</td>\n",
" <td>-30138</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>-757308.0</td>\n",
" <td>5999</td>\n",
" <td>6119</td>\n",
" <td>7117</td>\n",
" <td>0</td>\n",
" <td>8854150</td>\n",
" <td>0.0</td>\n",
" <td>8969140.0</td>\n",
" <td>0</td>\n",
" <td>9511380</td>\n",
" <td>0</td>\n",
" <td>10377630</td>\n",
" <td>0</td>\n",
" <td>10295260</td>\n",
" <td>-45300.0</td>\n",
" <td>20000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>5.75</td>\n",
" <td>-546964</td>\n",
" <td>5209</td>\n",
" <td>5459</td>\n",
" <td>5712</td>\n",
" <td>0</td>\n",
" <td>9384930</td>\n",
" <td>0</td>\n",
" <td>9481980</td>\n",
" <td>0.0</td>\n",
" <td>9910670</td>\n",
" <td>0.0</td>\n",
" <td>10629780.0</td>\n",
" <td>0</td>\n",
" <td>10580490</td>\n",
" <td>-64500.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>4.75</td>\n",
" <td>-165132</td>\n",
" <td>-558223</td>\n",
" <td>-501631</td>\n",
" <td>21421</td>\n",
" <td>211732</td>\n",
" <td>-273266</td>\n",
" <td>-190866</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>-264864.0</td>\n",
" <td>5108</td>\n",
" <td>3833</td>\n",
" <td>4998</td>\n",
" <td>0</td>\n",
" <td>8802050</td>\n",
" <td>0.0</td>\n",
" <td>8821560.0</td>\n",
" <td>0</td>\n",
" <td>9444080</td>\n",
" <td>0</td>\n",
" <td>10374940</td>\n",
" <td>0</td>\n",
" <td>10286420</td>\n",
" <td>500.0</td>\n",
" <td>-8000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>6.00</td>\n",
" <td>-313044</td>\n",
" <td>3788</td>\n",
" <td>4795</td>\n",
" <td>4243</td>\n",
" <td>0</td>\n",
" <td>9259790</td>\n",
" <td>0</td>\n",
" <td>9309050</td>\n",
" <td>0.0</td>\n",
" <td>9838350</td>\n",
" <td>0.0</td>\n",
" <td>10599740.0</td>\n",
" <td>0</td>\n",
" <td>10547400</td>\n",
" <td>-35300.0</td>\n",
" <td>-19900.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.00</td>\n",
" <td>69905</td>\n",
" <td>-429277</td>\n",
" <td>-406975</td>\n",
" <td>-390933</td>\n",
" <td>-188560</td>\n",
" <td>98356</td>\n",
" <td>15057</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>-383680.0</td>\n",
" <td>4841</td>\n",
" <td>4436</td>\n",
" <td>5790</td>\n",
" <td>13420</td>\n",
" <td>8560360</td>\n",
" <td>13420.0</td>\n",
" <td>8659060.0</td>\n",
" <td>13350</td>\n",
" <td>9280500</td>\n",
" <td>10540</td>\n",
" <td>10258540</td>\n",
" <td>16800</td>\n",
" <td>10173690</td>\n",
" <td>10000.0</td>\n",
" <td>0.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>6.25</td>\n",
" <td>-466235</td>\n",
" <td>3993</td>\n",
" <td>5887</td>\n",
" <td>4792</td>\n",
" <td>0</td>\n",
" <td>9037960</td>\n",
" <td>0</td>\n",
" <td>9162720</td>\n",
" <td>0.0</td>\n",
" <td>9703380</td>\n",
" <td>0.0</td>\n",
" <td>10529810.0</td>\n",
" <td>0</td>\n",
" <td>10465710</td>\n",
" <td>-45300.0</td>\n",
" <td>-29700.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.25</td>\n",
" <td>141107</td>\n",
" <td>-313306</td>\n",
" <td>-606466</td>\n",
" <td>-452395</td>\n",
" <td>-227666</td>\n",
" <td>-141029</td>\n",
" <td>-140003</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" <td>15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>-786364.0</td>\n",
" <td>5883</td>\n",
" <td>3483</td>\n",
" <td>7091</td>\n",
" <td>77930</td>\n",
" <td>8346200</td>\n",
" <td>78430.0</td>\n",
" <td>8421290.0</td>\n",
" <td>79040</td>\n",
" <td>9144950</td>\n",
" <td>55470</td>\n",
" <td>10142130</td>\n",
" <td>104750</td>\n",
" <td>9995690</td>\n",
" <td>10400.0</td>\n",
" <td>-17000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>6.50</td>\n",
" <td>-726191</td>\n",
" <td>5448</td>\n",
" <td>5486</td>\n",
" <td>5507</td>\n",
" <td>0</td>\n",
" <td>8930020</td>\n",
" <td>0</td>\n",
" <td>8947730</td>\n",
" <td>0.0</td>\n",
" <td>9558500</td>\n",
" <td>0.0</td>\n",
" <td>10457410.0</td>\n",
" <td>0</td>\n",
" <td>10382240</td>\n",
" <td>-44700.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.50</td>\n",
" <td>113981</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>95488</td>\n",
" <td>-86087</td>\n",
" <td>-1010</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" <td>30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>-1105900.0</td>\n",
" <td>6693</td>\n",
" <td>9786</td>\n",
" <td>7689</td>\n",
" <td>257360</td>\n",
" <td>8204930</td>\n",
" <td>257410.0</td>\n",
" <td>8214680.0</td>\n",
" <td>262960</td>\n",
" <td>9022530</td>\n",
" <td>222800</td>\n",
" <td>10016150</td>\n",
" <td>335510</td>\n",
" <td>9818180</td>\n",
" <td>-12800.0</td>\n",
" <td>-9900.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>6.75</td>\n",
" <td>-752912</td>\n",
" <td>5981</td>\n",
" <td>6119</td>\n",
" <td>7117</td>\n",
" <td>0</td>\n",
" <td>8729180</td>\n",
" <td>0</td>\n",
" <td>8835670</td>\n",
" <td>0.0</td>\n",
" <td>9419420</td>\n",
" <td>0.0</td>\n",
" <td>10377630.0</td>\n",
" <td>0</td>\n",
" <td>10295240</td>\n",
" <td>-45300.0</td>\n",
" <td>13200.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>5.75</td>\n",
" <td>-546964</td>\n",
" <td>-165132</td>\n",
" <td>-558223</td>\n",
" <td>-501631</td>\n",
" <td>21421</td>\n",
" <td>211732</td>\n",
" <td>-273266</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" <td>45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>-106368.0</td>\n",
" <td>4960</td>\n",
" <td>3923</td>\n",
" <td>6391</td>\n",
" <td>579470</td>\n",
" <td>8087420</td>\n",
" <td>579420.0</td>\n",
" <td>8065710.0</td>\n",
" <td>593020</td>\n",
" <td>8855670</td>\n",
" <td>544200</td>\n",
" <td>9878060</td>\n",
" <td>744590</td>\n",
" <td>9690490</td>\n",
" <td>-28900.0</td>\n",
" <td>15800.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>7.00</td>\n",
" <td>-268385</td>\n",
" <td>5126</td>\n",
" <td>6331</td>\n",
" <td>4998</td>\n",
" <td>0</td>\n",
" <td>8598780</td>\n",
" <td>0</td>\n",
" <td>8698830</td>\n",
" <td>0.0</td>\n",
" <td>9385370</td>\n",
" <td>0.0</td>\n",
" <td>10374940.0</td>\n",
" <td>0</td>\n",
" <td>10218410</td>\n",
" <td>1600.0</td>\n",
" <td>-18000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>6.00</td>\n",
" <td>-313044</td>\n",
" <td>69905</td>\n",
" <td>-429277</td>\n",
" <td>-406975</td>\n",
" <td>-390933</td>\n",
" <td>-188560</td>\n",
" <td>98356</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>7</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>53172.0</td>\n",
" <td>6055</td>\n",
" <td>4716</td>\n",
" <td>7189</td>\n",
" <td>1045000</td>\n",
" <td>7838220</td>\n",
" <td>1047990.0</td>\n",
" <td>7817860.0</td>\n",
" <td>1076680</td>\n",
" <td>8446120</td>\n",
" <td>1014600</td>\n",
" <td>9571940</td>\n",
" <td>1337790</td>\n",
" <td>9374530</td>\n",
" <td>-59100.0</td>\n",
" <td>6300.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>7.25</td>\n",
" <td>-382374</td>\n",
" <td>4381</td>\n",
" <td>2567</td>\n",
" <td>5790</td>\n",
" <td>13160</td>\n",
" <td>8462140</td>\n",
" <td>13210</td>\n",
" <td>8472960</td>\n",
" <td>13350.0</td>\n",
" <td>9270240</td>\n",
" <td>10540.0</td>\n",
" <td>10258540.0</td>\n",
" <td>16050</td>\n",
" <td>10056350</td>\n",
" <td>17900.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>6.25</td>\n",
" <td>-466235</td>\n",
" <td>141107</td>\n",
" <td>-313306</td>\n",
" <td>-606466</td>\n",
" <td>-452395</td>\n",
" <td>-227666</td>\n",
" <td>-141029</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>7</td>\n",
" <td>15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>-197600.0</td>\n",
" <td>5993</td>\n",
" <td>6672</td>\n",
" <td>7243</td>\n",
" <td>1638640</td>\n",
" <td>7614700</td>\n",
" <td>1641630.0</td>\n",
" <td>7558760.0</td>\n",
" <td>1685780</td>\n",
" <td>8172960</td>\n",
" <td>1636520</td>\n",
" <td>9279140</td>\n",
" <td>2096120</td>\n",
" <td>9052520</td>\n",
" <td>-60400.0</td>\n",
" <td>-16000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>7.50</td>\n",
" <td>-789973</td>\n",
" <td>5491</td>\n",
" <td>5993</td>\n",
" <td>7091</td>\n",
" <td>77940</td>\n",
" <td>8363200</td>\n",
" <td>77890</td>\n",
" <td>8336050</td>\n",
" <td>79040.0</td>\n",
" <td>9116070</td>\n",
" <td>55470.0</td>\n",
" <td>10142130.0</td>\n",
" <td>104750</td>\n",
" <td>9949540</td>\n",
" <td>-14500.0</td>\n",
" <td>-27000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>6.50</td>\n",
" <td>-726191</td>\n",
" <td>113981</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>95488</td>\n",
" <td>-86087</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>7</td>\n",
" <td>30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>-527936.0</td>\n",
" <td>6180</td>\n",
" <td>5285</td>\n",
" <td>7248</td>\n",
" <td>2359090</td>\n",
" <td>7396810</td>\n",
" <td>2291650.0</td>\n",
" <td>7324160.0</td>\n",
" <td>2357200</td>\n",
" <td>7790170</td>\n",
" <td>2319100</td>\n",
" <td>8997750</td>\n",
" <td>2941020</td>\n",
" <td>8705790</td>\n",
" <td>-60900.0</td>\n",
" <td>20000.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>7.75</td>\n",
" <td>-1107723</td>\n",
" <td>6502</td>\n",
" <td>4906</td>\n",
" <td>7689</td>\n",
" <td>254430</td>\n",
" <td>8252520</td>\n",
" <td>257410</td>\n",
" <td>8229310</td>\n",
" <td>264340.0</td>\n",
" <td>8866120</td>\n",
" <td>222800.0</td>\n",
" <td>10016150.0</td>\n",
" <td>335510</td>\n",
" <td>9825690</td>\n",
" <td>-22800.0</td>\n",
" <td>-11100.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>6.75</td>\n",
" <td>-752912</td>\n",
" <td>-546964</td>\n",
" <td>-165132</td>\n",
" <td>-558223</td>\n",
" <td>-501631</td>\n",
" <td>21421</td>\n",
" <td>211732</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>7</td>\n",
" <td>45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>-363888.0</td>\n",
" <td>6431</td>\n",
" <td>5617</td>\n",
" <td>7613</td>\n",
" <td>3242040</td>\n",
" <td>7345860</td>\n",
" <td>3056450.0</td>\n",
" <td>7090730.0</td>\n",
" <td>3039320</td>\n",
" <td>7442840</td>\n",
" <td>3066830</td>\n",
" <td>8699810</td>\n",
" <td>3111170</td>\n",
" <td>8378270</td>\n",
" <td>-50100.0</td>\n",
" <td>-21000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>8.00</td>\n",
" <td>-108944</td>\n",
" <td>5040</td>\n",
" <td>3923</td>\n",
" <td>6391</td>\n",
" <td>573510</td>\n",
" <td>8170380</td>\n",
" <td>576490</td>\n",
" <td>8108110</td>\n",
" <td>596500.0</td>\n",
" <td>8764200</td>\n",
" <td>544200.0</td>\n",
" <td>9878060.0</td>\n",
" <td>744590</td>\n",
" <td>9690490</td>\n",
" <td>-60900.0</td>\n",
" <td>15800.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7.00</td>\n",
" <td>-268385</td>\n",
" <td>-313044</td>\n",
" <td>69905</td>\n",
" <td>-429277</td>\n",
" <td>-406975</td>\n",
" <td>-390933</td>\n",
" <td>-188560</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>-372152.0</td>\n",
" <td>6367</td>\n",
" <td>5737</td>\n",
" <td>7279</td>\n",
" <td>4048630</td>\n",
" <td>7239920</td>\n",
" <td>4021210.0</td>\n",
" <td>7106360.0</td>\n",
" <td>3699980</td>\n",
" <td>6892870</td>\n",
" <td>3824860</td>\n",
" <td>8382040</td>\n",
" <td>3867000</td>\n",
" <td>8428160</td>\n",
" <td>-50300.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>8.25</td>\n",
" <td>54310</td>\n",
" <td>5974</td>\n",
" <td>6634</td>\n",
" <td>7189</td>\n",
" <td>1109460</td>\n",
" <td>7992440</td>\n",
" <td>1042020</td>\n",
" <td>7897190</td>\n",
" <td>1076680.0</td>\n",
" <td>8412320</td>\n",
" <td>1014600.0</td>\n",
" <td>9571940.0</td>\n",
" <td>1337790</td>\n",
" <td>9363900</td>\n",
" <td>-60300.0</td>\n",
" <td>2900.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7.25</td>\n",
" <td>-382374</td>\n",
" <td>-466235</td>\n",
" <td>141107</td>\n",
" <td>-313306</td>\n",
" <td>-606466</td>\n",
" <td>-452395</td>\n",
" <td>-227666</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>8</td>\n",
" <td>15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>-111476.0</td>\n",
" <td>6798</td>\n",
" <td>6359</td>\n",
" <td>7254</td>\n",
" <td>4834990</td>\n",
" <td>7175650</td>\n",
" <td>4838470.0</td>\n",
" <td>6988630.0</td>\n",
" <td>4457870</td>\n",
" <td>6589270</td>\n",
" <td>4604910</td>\n",
" <td>8060500</td>\n",
" <td>4635510</td>\n",
" <td>8167600</td>\n",
" <td>-50100.0</td>\n",
" <td>-900.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>8.50</td>\n",
" <td>-198056</td>\n",
" <td>5878</td>\n",
" <td>6190</td>\n",
" <td>7243</td>\n",
" <td>1755480</td>\n",
" <td>7980290</td>\n",
" <td>1706080</td>\n",
" <td>7697840</td>\n",
" <td>1671080.0</td>\n",
" <td>8093950</td>\n",
" <td>1636520.0</td>\n",
" <td>9279140.0</td>\n",
" <td>1658780</td>\n",
" <td>9041440</td>\n",
" <td>-60400.0</td>\n",
" <td>-39700.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7.50</td>\n",
" <td>-789973</td>\n",
" <td>-726191</td>\n",
" <td>113981</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>95488</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>8</td>\n",
" <td>30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>61832.0</td>\n",
" <td>6425</td>\n",
" <td>5707</td>\n",
" <td>7148</td>\n",
" <td>5583660</td>\n",
" <td>6895560</td>\n",
" <td>5646290.0</td>\n",
" <td>6946440.0</td>\n",
" <td>5172740</td>\n",
" <td>6393680</td>\n",
" <td>5358250</td>\n",
" <td>7787870</td>\n",
" <td>5388420</td>\n",
" <td>7978170</td>\n",
" <td>-50600.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>8.75</td>\n",
" <td>-529466</td>\n",
" <td>6190</td>\n",
" <td>6000</td>\n",
" <td>7248</td>\n",
" <td>2503040</td>\n",
" <td>7819080</td>\n",
" <td>2470230</td>\n",
" <td>7665010</td>\n",
" <td>2296540.0</td>\n",
" <td>7462630</td>\n",
" <td>2319100.0</td>\n",
" <td>8997750.0</td>\n",
" <td>2355270</td>\n",
" <td>9022670</td>\n",
" <td>-60900.0</td>\n",
" <td>1600.0</td>\n",
" <td>9.953467</td>\n",
" <td>10.353193</td>\n",
" <td>8.947156</td>\n",
" <td>9.387649</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7.75</td>\n",
" <td>-1107723</td>\n",
" <td>-752912</td>\n",
" <td>-546964</td>\n",
" <td>-165132</td>\n",
" <td>-558223</td>\n",
" <td>-501631</td>\n",
" <td>21421</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>8</td>\n",
" <td>45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>358868.0</td>\n",
" <td>6666</td>\n",
" <td>5911</td>\n",
" <td>7654</td>\n",
" <td>6378670</td>\n",
" <td>6556060</td>\n",
" <td>6401520.0</td>\n",
" <td>6671540.0</td>\n",
" <td>5881510</td>\n",
" <td>6172460</td>\n",
" <td>6099880</td>\n",
" <td>7510510</td>\n",
" <td>6087630</td>\n",
" <td>7207930</td>\n",
" <td>-40200.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>9.00</td>\n",
" <td>-365188</td>\n",
" <td>6366</td>\n",
" <td>5617</td>\n",
" <td>7613</td>\n",
" <td>3240330</td>\n",
" <td>7701440</td>\n",
" <td>3271020</td>\n",
" <td>7493330</td>\n",
" <td>3015530.0</td>\n",
" <td>7054910</td>\n",
" <td>3066830.0</td>\n",
" <td>8699810.0</td>\n",
" <td>3112170</td>\n",
" <td>8688580</td>\n",
" <td>-50100.0</td>\n",
" <td>-21000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>8.00</td>\n",
" <td>-108944</td>\n",
" <td>-268385</td>\n",
" <td>-313044</td>\n",
" <td>69905</td>\n",
" <td>-429277</td>\n",
" <td>-406975</td>\n",
" <td>-390933</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>9</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>276240.0</td>\n",
" <td>7307</td>\n",
" <td>7542</td>\n",
" <td>7240</td>\n",
" <td>7007930</td>\n",
" <td>6630080</td>\n",
" <td>7178940.0</td>\n",
" <td>6473140.0</td>\n",
" <td>6854580</td>\n",
" <td>6148820</td>\n",
" <td>7408920</td>\n",
" <td>7528390</td>\n",
" <td>7325860</td>\n",
" <td>7561910</td>\n",
" <td>-49500.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>9.25</td>\n",
" <td>-367136</td>\n",
" <td>6159</td>\n",
" <td>5278</td>\n",
" <td>7279</td>\n",
" <td>3963450</td>\n",
" <td>7379600</td>\n",
" <td>4026080</td>\n",
" <td>7437220</td>\n",
" <td>3731500.0</td>\n",
" <td>6832670</td>\n",
" <td>3824860.0</td>\n",
" <td>8382040.0</td>\n",
" <td>3867220</td>\n",
" <td>8427650</td>\n",
" <td>-50300.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>8.25</td>\n",
" <td>54310</td>\n",
" <td>-382374</td>\n",
" <td>-466235</td>\n",
" <td>141107</td>\n",
" <td>-313306</td>\n",
" <td>-606466</td>\n",
" <td>-452395</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>9</td>\n",
" <td>15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>-33212.0</td>\n",
" <td>7307</td>\n",
" <td>7270</td>\n",
" <td>6957</td>\n",
" <td>7829140</td>\n",
" <td>6554470</td>\n",
" <td>7798450.0</td>\n",
" <td>6537280.0</td>\n",
" <td>7855410</td>\n",
" <td>6142920</td>\n",
" <td>8811970</td>\n",
" <td>7548920</td>\n",
" <td>8727470</td>\n",
" <td>7597560</td>\n",
" <td>-49500.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>9.50</td>\n",
" <td>-110583</td>\n",
" <td>6045</td>\n",
" <td>5340</td>\n",
" <td>7254</td>\n",
" <td>4765010</td>\n",
" <td>6979810</td>\n",
" <td>4772360</td>\n",
" <td>7120880</td>\n",
" <td>4457870.0</td>\n",
" <td>6587190</td>\n",
" <td>4604910.0</td>\n",
" <td>8060500.0</td>\n",
" <td>4576880</td>\n",
" <td>7721200</td>\n",
" <td>-50100.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>8.50</td>\n",
" <td>-198056</td>\n",
" <td>-789973</td>\n",
" <td>-726191</td>\n",
" <td>113981</td>\n",
" <td>-288073</td>\n",
" <td>-441388</td>\n",
" <td>-389065</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>9</td>\n",
" <td>30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>42360.0</td>\n",
" <td>7337</td>\n",
" <td>8154</td>\n",
" <td>6450</td>\n",
" <td>8471330</td>\n",
" <td>6446520</td>\n",
" <td>8509040.0</td>\n",
" <td>6451360.0</td>\n",
" <td>8679240</td>\n",
" <td>6113270</td>\n",
" <td>9945170</td>\n",
" <td>7540660</td>\n",
" <td>10117470</td>\n",
" <td>7591440</td>\n",
" <td>-59500.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>9.75</td>\n",
" <td>64945</td>\n",
" <td>6120</td>\n",
" <td>4756</td>\n",
" <td>7148</td>\n",
" <td>5399100</td>\n",
" <td>6942100</td>\n",
" <td>5570110</td>\n",
" <td>6764420</td>\n",
" <td>5172740.0</td>\n",
" <td>6403450</td>\n",
" <td>5358250.0</td>\n",
" <td>7787870.0</td>\n",
" <td>5281430</td>\n",
" <td>7797700</td>\n",
" <td>-50600.0</td>\n",
" <td>-19600.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>8.75</td>\n",
" <td>-529466</td>\n",
" <td>-1107723</td>\n",
" <td>-752912</td>\n",
" <td>-546964</td>\n",
" <td>-165132</td>\n",
" <td>-558223</td>\n",
" <td>-501631</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>9</td>\n",
" <td>45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>-359984.0</td>\n",
" <td>6897</td>\n",
" <td>7106</td>\n",
" <td>7358</td>\n",
" <td>9291780</td>\n",
" <td>6243550</td>\n",
" <td>9155680.0</td>\n",
" <td>6346980.0</td>\n",
" <td>9478950</td>\n",
" <td>6082970</td>\n",
" <td>11081240</td>\n",
" <td>7535240</td>\n",
" <td>11137410</td>\n",
" <td>7584410</td>\n",
" <td>-59100.0</td>\n",
" <td>-19600.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>10.00</td>\n",
" <td>357994</td>\n",
" <td>6656</td>\n",
" <td>5911</td>\n",
" <td>7654</td>\n",
" <td>6209520</td>\n",
" <td>6739540</td>\n",
" <td>6207670</td>\n",
" <td>6724310</td>\n",
" <td>5881510.0</td>\n",
" <td>6231050</td>\n",
" <td>6099880.0</td>\n",
" <td>7510510.0</td>\n",
" <td>6021370</td>\n",
" <td>7528920</td>\n",
" <td>-40200.0</td>\n",
" <td>-20000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>9.00</td>\n",
" <td>-365188</td>\n",
" <td>-108944</td>\n",
" <td>-268385</td>\n",
" <td>-313044</td>\n",
" <td>69905</td>\n",
" <td>-429277</td>\n",
" <td>-406975</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>10</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>-253556.0</td>\n",
" <td>7361</td>\n",
" <td>7431</td>\n",
" <td>7171</td>\n",
" <td>9822690</td>\n",
" <td>5875650</td>\n",
" <td>9961710.0</td>\n",
" <td>6141030.0</td>\n",
" <td>10122360</td>\n",
" <td>5956020</td>\n",
" <td>11884640</td>\n",
" <td>7412140</td>\n",
" <td>11942710</td>\n",
" <td>7418520</td>\n",
" <td>-56300.0</td>\n",
" <td>-9200.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>19.25</td>\n",
" <td>12.5</td>\n",
" <td>370.56</td>\n",
" <td>156.25</td>\n",
" <td>20.7</td>\n",
" <td>11.2</td>\n",
" <td>428.49</td>\n",
" <td>125.44</td>\n",
" <td>20.25</td>\n",
" <td>16.25</td>\n",
" <td>410.06</td>\n",
" <td>264.06</td>\n",
" <td>20.7</td>\n",
" <td>11.1</td>\n",
" <td>428.49</td>\n",
" <td>123.21</td>\n",
" <td>27.75</td>\n",
" <td>15.0</td>\n",
" <td>...</td>\n",
" <td>129.96</td>\n",
" <td>10.25</td>\n",
" <td>278260</td>\n",
" <td>7274</td>\n",
" <td>7827</td>\n",
" <td>7240</td>\n",
" <td>6979350</td>\n",
" <td>6637480</td>\n",
" <td>7021870</td>\n",
" <td>6646210</td>\n",
" <td>6854580.0</td>\n",
" <td>6211000</td>\n",
" <td>7408920.0</td>\n",
" <td>7528390.0</td>\n",
" <td>7535130</td>\n",
" <td>7560100</td>\n",
" <td>-49500.0</td>\n",
" <td>-30000.0</td>\n",
" <td>9.913784</td>\n",
" <td>10.308286</td>\n",
" <td>8.929568</td>\n",
" <td>9.377971</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>9.25</td>\n",
" <td>-367136</td>\n",
" <td>54310</td>\n",
" <td>-382374</td>\n",
" <td>-466235</td>\n",
" <td>141107</td>\n",
" <td>-313306</td>\n",
" <td>-606466</td>\n",
" <td>2018</td>\n",
" <td>243</td>\n",
" <td>4</td>\n",
" <td>35</td>\n",
" <td>31</td>\n",
" <td>3</td>\n",
" <td>10</td>\n",
" <td>15</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>20 rows × 85 columns</p>\n",
"</div>"
],
"text/plain": [
" saldo_final_target wgtavg_current_target ... hour minute\n",
"0 -725100.0 4981 ... 5 30\n",
"1 -757308.0 5999 ... 5 45\n",
"2 -264864.0 5108 ... 6 0\n",
"3 -383680.0 4841 ... 6 15\n",
"4 -786364.0 5883 ... 6 30\n",
"5 -1105900.0 6693 ... 6 45\n",
"6 -106368.0 4960 ... 7 0\n",
"7 53172.0 6055 ... 7 15\n",
"8 -197600.0 5993 ... 7 30\n",
"9 -527936.0 6180 ... 7 45\n",
"10 -363888.0 6431 ... 8 0\n",
"11 -372152.0 6367 ... 8 15\n",
"12 -111476.0 6798 ... 8 30\n",
"13 61832.0 6425 ... 8 45\n",
"14 358868.0 6666 ... 9 0\n",
"15 276240.0 7307 ... 9 15\n",
"16 -33212.0 7307 ... 9 30\n",
"17 42360.0 7337 ... 9 45\n",
"18 -359984.0 6897 ... 10 0\n",
"19 -253556.0 7361 ... 10 15\n",
"\n",
"[20 rows x 85 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 221
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ygLUy0i-wUZ9",
"colab_type": "code",
"colab": {}
},
"source": [
"# drop the \"saldo_final_target\"-column, as we don't need it in train dataset\n",
"X = df.drop('saldo_final_target', axis=1)\n",
"y = df.saldo_final_target\n",
"\n",
"# create the train & test datasets\n",
"X_train, X_valid, y_train, y_valid = train_test_split(\n",
" X, y, test_size=0.22, shuffle=False)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "hs5HdvwSDMjS",
"colab_type": "text"
},
"source": [
"##Algorithms"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "24U8zhnWM1G3",
"colab_type": "text"
},
"source": [
"Here we use different gradient boosting libraries such as LightGBM, Catboost, XGBoost and scikit-learn. The Catboost Regressor outperforms other boosting algorithms, but we use some other too, to combine average predictions and reach the lowest RMSE."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yBW_eiLIoeyt",
"colab_type": "text"
},
"source": [
"###CatBoostRegressor"
]
},
{
"cell_type": "code",
"metadata": {
"id": "xY7OtlxXwUar",
"colab_type": "code",
"colab": {}
},
"source": [
"model = cb.CatBoostRegressor(task_type='CPU', depth=8, iterations=2000, eval_metric='RMSE', loss_function='RMSE', od_type='Iter', metric_period=100, od_wait=1000, boost_from_average=False)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "K1NvNBBqwUay",
"colab_type": "code",
"outputId": "3f7e8e1f-7fcc-4d04-cb36-1c26650e94b2",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 503
}
},
"source": [
"model.fit(X_train, y_train, eval_set=(X_valid, y_valid), plot=False, use_best_model=True) #cat_features=[20,67],"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Warning: Overfitting detector is active, thus evaluation metric is calculated on every iteration. 'metric_period' is ignored for evaluation metric.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"0:\tlearn: 624399.4780103\ttest: 421644.0396205\tbest: 421644.0396205 (0)\ttotal: 77.7ms\tremaining: 2m 35s\n",
"100:\tlearn: 469742.4572142\ttest: 389166.9865160\tbest: 389166.9865160 (100)\ttotal: 6.31s\tremaining: 1m 58s\n",
"200:\tlearn: 442056.2989789\ttest: 384472.6834493\tbest: 384472.6834493 (200)\ttotal: 12.5s\tremaining: 1m 52s\n",
"300:\tlearn: 422858.1051921\ttest: 381773.8767117\tbest: 381773.8767117 (300)\ttotal: 18.6s\tremaining: 1m 45s\n",
"400:\tlearn: 405058.3157546\ttest: 379944.2485578\tbest: 379932.3816398 (397)\ttotal: 24.8s\tremaining: 1m 38s\n",
"500:\tlearn: 389195.8677996\ttest: 378871.9150952\tbest: 378809.3496884 (498)\ttotal: 30.8s\tremaining: 1m 32s\n",
"600:\tlearn: 375443.5160531\ttest: 378040.4860470\tbest: 378040.4860470 (600)\ttotal: 36.9s\tremaining: 1m 25s\n",
"700:\tlearn: 362391.2126904\ttest: 377290.4218304\tbest: 377269.9724733 (699)\ttotal: 43.1s\tremaining: 1m 19s\n",
"800:\tlearn: 350424.6758707\ttest: 377127.9604995\tbest: 377086.6126549 (727)\ttotal: 49.3s\tremaining: 1m 13s\n",
"900:\tlearn: 339292.5143504\ttest: 377174.2761568\tbest: 376990.7325833 (864)\ttotal: 55.4s\tremaining: 1m 7s\n",
"1000:\tlearn: 329165.8335779\ttest: 376917.5459193\tbest: 376917.5459193 (1000)\ttotal: 1m 1s\tremaining: 1m 1s\n",
"1100:\tlearn: 319725.2730668\ttest: 376952.0757035\tbest: 376917.5459193 (1000)\ttotal: 1m 7s\tremaining: 55.2s\n",
"1200:\tlearn: 310945.8016741\ttest: 377311.0054499\tbest: 376917.5459193 (1000)\ttotal: 1m 13s\tremaining: 49s\n",
"1300:\tlearn: 302363.4060430\ttest: 377410.7986433\tbest: 376917.5459193 (1000)\ttotal: 1m 19s\tremaining: 42.9s\n",
"1400:\tlearn: 294188.4248081\ttest: 377603.9883818\tbest: 376917.5459193 (1000)\ttotal: 1m 25s\tremaining: 36.7s\n",
"1500:\tlearn: 286699.2225445\ttest: 377732.4859433\tbest: 376917.5459193 (1000)\ttotal: 1m 32s\tremaining: 30.6s\n",
"1600:\tlearn: 279613.5815930\ttest: 377810.1252070\tbest: 376917.5459193 (1000)\ttotal: 1m 38s\tremaining: 24.5s\n",
"1700:\tlearn: 272605.0637017\ttest: 378107.3361411\tbest: 376917.5459193 (1000)\ttotal: 1m 44s\tremaining: 18.4s\n",
"1800:\tlearn: 266075.0054694\ttest: 378165.0873572\tbest: 376917.5459193 (1000)\ttotal: 1m 50s\tremaining: 12.3s\n",
"1900:\tlearn: 259727.1860009\ttest: 378445.6462339\tbest: 376917.5459193 (1000)\ttotal: 1m 57s\tremaining: 6.1s\n",
"1999:\tlearn: 253819.7156191\ttest: 378480.3625230\tbest: 376917.5459193 (1000)\ttotal: 2m 3s\tremaining: 0us\n",
"\n",
"bestTest = 376917.5459\n",
"bestIteration = 1000\n",
"\n",
"Shrink model to first 1001 iterations.\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<catboost.core.CatBoostRegressor at 0x7ff07a2755c0>"
]
},
"metadata": {
"tags": []
},
"execution_count": 224
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "HbASL-XdwUbC",
"colab_type": "code",
"outputId": "7b413e24-87fe-4f94-c55d-51929931724a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"y_pred = model.predict(X_valid)\n",
"print(np.sqrt(mean_squared_error(y_valid, np.array(y_pred).astype(int)))) #376917 "
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"379063.92118302867\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GbbjYGXponvr",
"colab_type": "text"
},
"source": [
"###HistGradientBoostingRegressor"
]
},
{
"cell_type": "code",
"metadata": {
"id": "b_dHMpmRwUbw",
"colab_type": "code",
"outputId": "3d3a9cb1-f202-4997-9757-330775eadb7f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"est = HistGradientBoostingRegressor(verbose=1, n_iter_no_change=100, max_iter=72, scoring='neg_mean_squared_error')\n",
"est.fit(X_train, y_train)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Binning 0.017 GB of training data: 0.307 s\n",
"Binning 0.002 GB of validation data: 0.010 s\n",
"Fitting gradient boosted rounds:\n",
"[1/72] 1 tree, 31 leaves, max depth = 7, train score: -359010319184.47498, val score: -358859405674.27411, in 0.044s\n",
"[2/72] 1 tree, 31 leaves, max depth = 7, train score: -336888202084.80933, val score: -338079268859.11029, in 0.037s\n",
"[3/72] 1 tree, 31 leaves, max depth = 7, train score: -318299064139.34082, val score: -319365419621.55231, in 0.043s\n",
"[4/72] 1 tree, 31 leaves, max depth = 6, train score: -302626166191.98438, val score: -304569266352.34503, in 0.044s\n",
"[5/72] 1 tree, 31 leaves, max depth = 8, train score: -289695738398.72968, val score: -292172114103.28186, in 0.041s\n",
"[6/72] 1 tree, 31 leaves, max depth = 7, train score: -279096971100.60626, val score: -281864628979.45276, in 0.039s\n",
"[7/72] 1 tree, 31 leaves, max depth = 7, train score: -269779706151.36594, val score: -272992412348.61447, in 0.069s\n",
"[8/72] 1 tree, 31 leaves, max depth = 7, train score: -261867812330.56229, val score: -265592773559.78958, in 0.042s\n",
"[9/72] 1 tree, 31 leaves, max depth = 7, train score: -255103045786.40720, val score: -259537424709.62466, in 0.039s\n",
"[10/72] 1 tree, 31 leaves, max depth = 7, train score: -249292602304.92130, val score: -254133717818.84616, in 0.038s\n",
"[11/72] 1 tree, 31 leaves, max depth = 7, train score: -244356565850.39554, val score: -249293972418.79303, in 0.040s\n",
"[12/72] 1 tree, 31 leaves, max depth = 9, train score: -239671770785.49210, val score: -245241867097.16849, in 0.071s\n",
"[13/72] 1 tree, 31 leaves, max depth = 7, train score: -235702953629.17719, val score: -242011969762.92554, in 0.079s\n",
"[14/72] 1 tree, 31 leaves, max depth = 8, train score: -231875144252.49060, val score: -238711697198.43024, in 0.048s\n",
"[15/72] 1 tree, 31 leaves, max depth = 9, train score: -228726931783.51819, val score: -236067476093.20868, in 0.065s\n",
"[16/72] 1 tree, 31 leaves, max depth = 8, train score: -225416281024.18045, val score: -233078562077.45166, in 0.054s\n",
"[17/72] 1 tree, 31 leaves, max depth = 9, train score: -222492514400.94867, val score: -230718153251.23868, in 0.077s\n",
"[18/72] 1 tree, 31 leaves, max depth = 9, train score: -219742549170.89294, val score: -228292949062.94345, in 0.036s\n",
"[19/72] 1 tree, 31 leaves, max depth = 9, train score: -217376966919.84186, val score: -226250403235.71765, in 0.037s\n",
"[20/72] 1 tree, 31 leaves, max depth = 8, train score: -214904084498.67700, val score: -224343798810.69724, in 0.039s\n",
"[21/72] 1 tree, 31 leaves, max depth = 9, train score: -212795661550.86050, val score: -222551522945.42719, in 0.038s\n",
"[22/72] 1 tree, 31 leaves, max depth = 8, train score: -210972034749.29950, val score: -221096573404.62851, in 0.043s\n",
"[23/72] 1 tree, 31 leaves, max depth = 8, train score: -208987637486.21475, val score: -219742510550.05951, in 0.037s\n",
"[24/72] 1 tree, 31 leaves, max depth = 10, train score: -207295562480.21976, val score: -218312223316.93539, in 0.037s\n",
"[25/72] 1 tree, 31 leaves, max depth = 12, train score: -205491347130.47165, val score: -217265337458.73581, in 0.039s\n",
"[26/72] 1 tree, 31 leaves, max depth = 8, train score: -203974365550.50534, val score: -216004475538.36374, in 0.039s\n",
"[27/72] 1 tree, 31 leaves, max depth = 8, train score: -202457877006.87674, val score: -214815930341.41656, in 0.042s\n",
"[28/72] 1 tree, 31 leaves, max depth = 8, train score: -200797897592.49780, val score: -213682774445.39835, in 0.041s\n",
"[29/72] 1 tree, 31 leaves, max depth = 11, train score: -199075025282.75558, val score: -212247259177.45938, in 0.039s\n",
"[30/72] 1 tree, 31 leaves, max depth = 9, train score: -197658638657.81299, val score: -211428671326.46707, in 0.040s\n",
"[31/72] 1 tree, 31 leaves, max depth = 10, train score: -196179551251.71014, val score: -210481152305.57025, in 0.040s\n",
"[32/72] 1 tree, 31 leaves, max depth = 10, train score: -194834310896.11874, val score: -209455068545.05392, in 0.043s\n",
"[33/72] 1 tree, 31 leaves, max depth = 9, train score: -193401112615.31485, val score: -208643049219.73361, in 0.039s\n",
"[34/72] 1 tree, 31 leaves, max depth = 13, train score: -191940063219.12802, val score: -207571586945.58020, in 0.040s\n",
"[35/72] 1 tree, 31 leaves, max depth = 8, train score: -190650677006.69971, val score: -206585830462.84302, in 0.042s\n",
"[36/72] 1 tree, 31 leaves, max depth = 9, train score: -189202820602.40079, val score: -206012686177.65265, in 0.057s\n",
"[37/72] 1 tree, 31 leaves, max depth = 14, train score: -187906450123.75125, val score: -205003074973.55600, in 0.084s\n",
"[38/72] 1 tree, 31 leaves, max depth = 8, train score: -186566021002.28354, val score: -204283423818.39868, in 0.087s\n",
"[39/72] 1 tree, 31 leaves, max depth = 9, train score: -185425007858.88388, val score: -203400225914.71661, in 0.043s\n",
"[40/72] 1 tree, 31 leaves, max depth = 9, train score: -184183375414.84415, val score: -202688282007.45590, in 0.042s\n",
"[41/72] 1 tree, 31 leaves, max depth = 14, train score: -183256654895.19791, val score: -201765546846.92960, in 0.051s\n",
"[42/72] 1 tree, 31 leaves, max depth = 10, train score: -182089221368.00809, val score: -201167504171.34204, in 0.048s\n",
"[43/72] 1 tree, 31 leaves, max depth = 13, train score: -181205003926.71603, val score: -200615470209.61438, in 0.045s\n",
"[44/72] 1 tree, 31 leaves, max depth = 16, train score: -180282030871.18661, val score: -200311354069.96301, in 0.045s\n",
"[45/72] 1 tree, 31 leaves, max depth = 11, train score: -179342158962.32544, val score: -199686362190.24014, in 0.045s\n",
"[46/72] 1 tree, 31 leaves, max depth = 14, train score: -178505265214.37695, val score: -199415140629.99194, in 0.046s\n",
"[47/72] 1 tree, 31 leaves, max depth = 15, train score: -177575605577.81946, val score: -198842654293.90155, in 0.047s\n",
"[48/72] 1 tree, 31 leaves, max depth = 10, train score: -176655846172.70551, val score: -198134736054.20752, in 0.049s\n",
"[49/72] 1 tree, 31 leaves, max depth = 11, train score: -175852202490.02209, val score: -197494300394.57031, in 0.046s\n",
"[50/72] 1 tree, 31 leaves, max depth = 11, train score: -174982969151.97556, val score: -197045533775.88809, in 0.047s\n",
"[51/72] 1 tree, 31 leaves, max depth = 11, train score: -174162843888.20679, val score: -196756575779.22696, in 0.044s\n",
"[52/72] 1 tree, 31 leaves, max depth = 11, train score: -173529141473.27731, val score: -196313323833.15082, in 0.048s\n",
"[53/72] 1 tree, 31 leaves, max depth = 13, train score: -172688657060.77356, val score: -195676175074.52988, in 0.046s\n",
"[54/72] 1 tree, 31 leaves, max depth = 14, train score: -172003922901.74954, val score: -195054432874.39297, in 0.049s\n",
"[55/72] 1 tree, 31 leaves, max depth = 11, train score: -170885202095.43250, val score: -194495278758.52258, in 0.049s\n",
"[56/72] 1 tree, 31 leaves, max depth = 11, train score: -170196237342.84259, val score: -193967856976.57901, in 0.046s\n",
"[57/72] 1 tree, 31 leaves, max depth = 10, train score: -169390040306.30313, val score: -193408653413.06058, in 0.046s\n",
"[58/72] 1 tree, 31 leaves, max depth = 10, train score: -168693852698.90979, val score: -192767838973.78247, in 0.060s\n",
"[59/72] 1 tree, 31 leaves, max depth = 9, train score: -168069609877.89691, val score: -192469345996.40329, in 0.058s\n",
"[60/72] 1 tree, 31 leaves, max depth = 8, train score: -167307077433.50085, val score: -191991285973.30786, in 0.051s\n",
"[61/72] 1 tree, 31 leaves, max depth = 12, train score: -166581452736.98901, val score: -191426614095.69150, in 0.055s\n",
"[62/72] 1 tree, 31 leaves, max depth = 10, train score: -165768588833.54834, val score: -191145440491.91599, in 0.054s\n",
"[63/72] 1 tree, 31 leaves, max depth = 13, train score: -165080579438.67590, val score: -190960942312.80600, in 0.050s\n",
"[64/72] 1 tree, 31 leaves, max depth = 9, train score: -164522952006.81750, val score: -190860359709.65204, in 0.046s\n",
"[65/72] 1 tree, 31 leaves, max depth = 10, train score: -163807708931.44101, val score: -190522182457.54459, in 0.053s\n",
"[66/72] 1 tree, 31 leaves, max depth = 10, train score: -163204753476.91559, val score: -190010511251.04550, in 0.058s\n",
"[67/72] 1 tree, 31 leaves, max depth = 13, train score: -162421552444.96170, val score: -189396762907.65021, in 0.055s\n",
"[68/72] 1 tree, 31 leaves, max depth = 14, train score: -161867368687.15915, val score: -188853087951.92032, in 0.052s\n",
"[69/72] 1 tree, 31 leaves, max depth = 12, train score: -161214628856.22748, val score: -188604608673.07971, in 0.052s\n",
"[70/72] 1 tree, 31 leaves, max depth = 10, train score: -160560827181.44745, val score: -188296759315.06836, in 0.055s\n",
"[71/72] 1 tree, 31 leaves, max depth = 8, train score: -159948925512.64307, val score: -188003365344.56970, in 0.052s\n",
"[72/72] 1 tree, 31 leaves, max depth = 8, train score: -159215250215.74039, val score: -187326673265.71603, in 0.049s\n",
"Fit 72 trees in 3.922 s, (2232 total leaves)\n",
"Time spent computing histograms: 1.213s\n",
"Time spent finding best splits: 0.427s\n",
"Time spent applying splits: 0.193s\n",
"Time spent predicting: 0.009s\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"HistGradientBoostingRegressor(l2_regularization=0.0, learning_rate=0.1,\n",
" loss='least_squares', max_bins=255,\n",
" max_depth=None, max_iter=72, max_leaf_nodes=31,\n",
" min_samples_leaf=20, n_iter_no_change=100,\n",
" random_state=None,\n",
" scoring='neg_mean_squared_error', tol=1e-07,\n",
" validation_fraction=0.1, verbose=1,\n",
" warm_start=False)"
]
},
"metadata": {
"tags": []
},
"execution_count": 176
}
]
},
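{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"The `train score` and `val score` columns in the log above are negative mean squared errors, because the estimator was created with `scoring='neg_mean_squared_error'`. As a rough conversion (a worked example added for orientation, not part of the original run), a score $s$ corresponds to\n",
"\n",
"$$\\mathrm{RMSE} = \\sqrt{-s}$$\n",
"\n",
"For the last round, $s_{\\text{val}} \\approx -1.873 \\times 10^{11}$, which gives $\\mathrm{RMSE} \\approx 4.33 \\times 10^{5}$. This value is measured on the internal validation split taken from the training data, not on `X_valid`, so it is not directly comparable with the RMSE values printed elsewhere in this notebook."
]
},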
{
"cell_type": "code",
"metadata": {
"id": "2WekPdg4wUb6",
"colab_type": "code",
"outputId": "919a0501-bc3a-4845-80a3-e99687b603d4",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"# R^2 (coefficient of determination) on the training data\n",
"est.score(X_train, y_train)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.580155350466026"
]
},
"metadata": {
"tags": []
},
"execution_count": 177
}
]
},
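{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"For a scikit-learn regressor, `score` returns the coefficient of determination $R^2$. The formula is added here for reference:\n",
"\n",
"$$R^2 = 1 - \\frac{\\sum_{i}\\left(y_i - \\hat{y}_i\\right)^2}{\\sum_{i}\\left(y_i - \\bar{y}\\right)^2}$$\n",
"\n",
"where $\\bar{y}$ is the mean of the targets being scored. A value of about 0.58 on the training data means the model explains roughly 58% of the variance of `y_train`."
]
},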
{
"cell_type": "code",
"metadata": {
"id": "pj4MgeWYx1qp",
"colab_type": "code",
"outputId": "bc71182e-f932-4dab-fe51-be2b46aa242c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"y_pred = est.predict(X_valid)\n",
"print(np.sqrt(mean_squared_error(y_valid, np.array(y_pred).astype(int)))) #380442"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"383815.97723866074\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iXg8Ixhyosjq",
"colab_type": "text"
},
"source": [
"### ExtraTreesRegressor"
]
},
{
"cell_type": "code",
"metadata": {
"id": "eVWSd1RvZlEP",
"colab_type": "code",
"outputId": "7912cdbd-5e7f-4100-d01a-f201d684f160",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
}
},
"source": [
"est = ExtraTreesRegressor(verbose=1, n_estimators=500)\n",
"est.fit(X_train, y_train)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 500 out of 500 | elapsed: 4.6min finished\n"
],
"name": "stderr"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"ExtraTreesRegressor(bootstrap=False, ccp_alpha=0.0, criterion='mse',\n",
" max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
" max_samples=None, min_impurity_decrease=0.0,\n",
" min_impurity_split=None, min_samples_leaf=1,\n",
" min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
" n_estimators=500, n_jobs=None, oob_score=False,\n",
" random_state=None, verbose=1, warm_start=False)"
]
},
"metadata": {
"tags": []
},
"execution_count": 179
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "8Svd6QE7Z5fq",
"colab_type": "code",
"outputId": "59f07c68-7076-4e40-ac28-d77dd9c94152",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 69
}
},
"source": [
"y_pred = est.predict(X_valid)\n",
"print(np.sqrt(mean_squared_error(y_valid, np.array(y_pred).astype(int))))"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"384745.616197803\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"[Parallel(n_jobs=1)]: Done 500 out of 500 | elapsed: 1.1s finished\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eJ_hJ6GnovVK",
"colab_type": "text"
},
"source": [
"### LGBMRegressor"
]
},
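{
"cell_type": "code",
"metadata": {
"id": "FyfT8ZznfXo0",
"colab_type": "code",
"colab": {}
},
"source": [
"# Note: verbose_eval is not a named LGBMRegressor parameter; it is only accepted as an extra keyword argument.\n",
"lgb_clf = lgb.LGBMRegressor(objective='regression', max_depth=12, num_leaves=31, verbose_eval=10, importance_type='gain')"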
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "xrwXPSv6hISI",
"colab_type": "code",
"outputId": "5b54636d-b229-45d6-e9de-c68d0776bb1f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"# With the default n_estimators=100, early_stopping_rounds=100 can never stop training early; the best iteration is still recorded.\n",
"lgb_clf.fit(X_train, y_train, eval_set=(X_valid, y_valid), early_stopping_rounds=100, eval_metric='rmse')"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"[1]\tvalid_0's rmse: 428482\tvalid_0's l2: 1.83597e+11\n",
"Training until validation scores don't improve for 100 rounds.\n",
"[2]\tvalid_0's rmse: 420973\tvalid_0's l2: 1.77218e+11\n",
"[3]\tvalid_0's rmse: 414606\tvalid_0's l2: 1.71899e+11\n",
"[4]\tvalid_0's rmse: 409481\tvalid_0's l2: 1.67675e+11\n",
"[5]\tvalid_0's rmse: 405126\tvalid_0's l2: 1.64127e+11\n",
"[6]\tvalid_0's rmse: 402104\tvalid_0's l2: 1.61688e+11\n",
"[7]\tvalid_0's rmse: 399686\tvalid_0's l2: 1.59749e+11\n",
"[8]\tvalid_0's rmse: 397276\tvalid_0's l2: 1.57828e+11\n",
"[9]\tvalid_0's rmse: 395269\tvalid_0's l2: 1.56238e+11\n",
"[10]\tvalid_0's rmse: 393555\tvalid_0's l2: 1.54885e+11\n",
"[11]\tvalid_0's rmse: 392355\tvalid_0's l2: 1.53942e+11\n",
"[12]\tvalid_0's rmse: 390952\tvalid_0's l2: 1.52843e+11\n",
"[13]\tvalid_0's rmse: 389941\tvalid_0's l2: 1.52054e+11\n",
"[14]\tvalid_0's rmse: 389221\tvalid_0's l2: 1.51493e+11\n",
"[15]\tvalid_0's rmse: 388789\tvalid_0's l2: 1.51157e+11\n",
"[16]\tvalid_0's rmse: 388176\tvalid_0's l2: 1.50681e+11\n",
"[17]\tvalid_0's rmse: 387991\tvalid_0's l2: 1.50537e+11\n",
"[18]\tvalid_0's rmse: 387690\tvalid_0's l2: 1.50303e+11\n",
"[19]\tvalid_0's rmse: 387530\tvalid_0's l2: 1.50179e+11\n",
"[20]\tvalid_0's rmse: 387252\tvalid_0's l2: 1.49964e+11\n",
"[21]\tvalid_0's rmse: 387051\tvalid_0's l2: 1.49808e+11\n",
"[22]\tvalid_0's rmse: 387054\tvalid_0's l2: 1.49811e+11\n",
"[23]\tvalid_0's rmse: 386642\tvalid_0's l2: 1.49492e+11\n",
"[24]\tvalid_0's rmse: 386978\tvalid_0's l2: 1.49752e+11\n",
"[25]\tvalid_0's rmse: 386726\tvalid_0's l2: 1.49557e+11\n",
"[26]\tvalid_0's rmse: 386716\tvalid_0's l2: 1.49549e+11\n",
"[27]\tvalid_0's rmse: 386673\tvalid_0's l2: 1.49516e+11\n",
"[28]\tvalid_0's rmse: 386345\tvalid_0's l2: 1.49262e+11\n",
"[29]\tvalid_0's rmse: 386145\tvalid_0's l2: 1.49108e+11\n",
"[30]\tvalid_0's rmse: 385861\tvalid_0's l2: 1.48889e+11\n",
"[31]\tvalid_0's rmse: 385710\tvalid_0's l2: 1.48772e+11\n",
"[32]\tvalid_0's rmse: 385645\tvalid_0's l2: 1.48722e+11\n",
"[33]\tvalid_0's rmse: 385441\tvalid_0's l2: 1.48564e+11\n",
"[34]\tvalid_0's rmse: 385334\tvalid_0's l2: 1.48482e+11\n",
"[35]\tvalid_0's rmse: 385296\tvalid_0's l2: 1.48453e+11\n",
"[36]\tvalid_0's rmse: 385121\tvalid_0's l2: 1.48318e+11\n",
"[37]\tvalid_0's rmse: 384870\tvalid_0's l2: 1.48125e+11\n",
"[38]\tvalid_0's rmse: 384638\tvalid_0's l2: 1.47946e+11\n",
"[39]\tvalid_0's rmse: 384293\tvalid_0's l2: 1.47681e+11\n",
"[40]\tvalid_0's rmse: 383908\tvalid_0's l2: 1.47386e+11\n",
"[41]\tvalid_0's rmse: 383728\tvalid_0's l2: 1.47247e+11\n",
"[42]\tvalid_0's rmse: 383588\tvalid_0's l2: 1.4714e+11\n",
"[43]\tvalid_0's rmse: 383725\tvalid_0's l2: 1.47245e+11\n",
"[44]\tvalid_0's rmse: 383707\tvalid_0's l2: 1.47231e+11\n",
"[45]\tvalid_0's rmse: 383463\tvalid_0's l2: 1.47044e+11\n",
"[46]\tvalid_0's rmse: 383496\tvalid_0's l2: 1.47069e+11\n",
"[47]\tvalid_0's rmse: 383178\tvalid_0's l2: 1.46826e+11\n",
"[48]\tvalid_0's rmse: 383091\tvalid_0's l2: 1.46758e+11\n",
"[49]\tvalid_0's rmse: 382987\tvalid_0's l2: 1.46679e+11\n",
"[50]\tvalid_0's rmse: 382717\tvalid_0's l2: 1.46473e+11\n",
"[51]\tvalid_0's rmse: 382659\tvalid_0's l2: 1.46428e+11\n",
"[52]\tvalid_0's rmse: 382559\tvalid_0's l2: 1.46352e+11\n",
"[53]\tvalid_0's rmse: 382704\tvalid_0's l2: 1.46462e+11\n",
"[54]\tvalid_0's rmse: 382482\tvalid_0's l2: 1.46293e+11\n",
"[55]\tvalid_0's rmse: 382636\tvalid_0's l2: 1.4641e+11\n",
"[56]\tvalid_0's rmse: 382563\tvalid_0's l2: 1.46354e+11\n",
"[57]\tvalid_0's rmse: 382602\tvalid_0's l2: 1.46384e+11\n",
"[58]\tvalid_0's rmse: 382355\tvalid_0's l2: 1.46195e+11\n",
"[59]\tvalid_0's rmse: 382382\tvalid_0's l2: 1.46216e+11\n",
"[60]\tvalid_0's rmse: 382551\tvalid_0's l2: 1.46345e+11\n",
"[61]\tvalid_0's rmse: 382352\tvalid_0's l2: 1.46193e+11\n",
"[62]\tvalid_0's rmse: 382232\tvalid_0's l2: 1.46101e+11\n",
"[63]\tvalid_0's rmse: 382285\tvalid_0's l2: 1.46142e+11\n",
"[64]\tvalid_0's rmse: 382033\tvalid_0's l2: 1.45949e+11\n",
"[65]\tvalid_0's rmse: 381970\tvalid_0's l2: 1.45901e+11\n",
"[66]\tvalid_0's rmse: 381851\tvalid_0's l2: 1.4581e+11\n",
"[67]\tvalid_0's rmse: 381837\tvalid_0's l2: 1.45799e+11\n",
"[68]\tvalid_0's rmse: 381657\tvalid_0's l2: 1.45662e+11\n",
"[69]\tvalid_0's rmse: 381711\tvalid_0's l2: 1.45703e+11\n",
"[70]\tvalid_0's rmse: 381631\tvalid_0's l2: 1.45642e+11\n",
"[71]\tvalid_0's rmse: 381535\tvalid_0's l2: 1.45569e+11\n",
"[72]\tvalid_0's rmse: 381429\tvalid_0's l2: 1.45488e+11\n",
"[73]\tvalid_0's rmse: 381385\tvalid_0's l2: 1.45454e+11\n",
"[74]\tvalid_0's rmse: 381410\tvalid_0's l2: 1.45474e+11\n",
"[75]\tvalid_0's rmse: 381233\tvalid_0's l2: 1.45339e+11\n",
"[76]\tvalid_0's rmse: 381216\tvalid_0's l2: 1.45326e+11\n",
"[77]\tvalid_0's rmse: 381014\tvalid_0's l2: 1.45172e+11\n",
"[78]\tvalid_0's rmse: 380917\tvalid_0's l2: 1.45098e+11\n",
"[79]\tvalid_0's rmse: 380898\tvalid_0's l2: 1.45084e+11\n",
"[80]\tvalid_0's rmse: 381132\tvalid_0's l2: 1.45261e+11\n",
"[81]\tvalid_0's rmse: 381206\tvalid_0's l2: 1.45318e+11\n",
"[82]\tvalid_0's rmse: 381198\tvalid_0's l2: 1.45312e+11\n",
"[83]\tvalid_0's rmse: 381145\tvalid_0's l2: 1.45271e+11\n",
"[84]\tvalid_0's rmse: 381230\tvalid_0's l2: 1.45337e+11\n",
"[85]\tvalid_0's rmse: 381257\tvalid_0's l2: 1.45357e+11\n",
"[86]\tvalid_0's rmse: 381404\tvalid_0's l2: 1.45469e+11\n",
"[87]\tvalid_0's rmse: 381530\tvalid_0's l2: 1.45565e+11\n",
"[88]\tvalid_0's rmse: 381516\tvalid_0's l2: 1.45555e+11\n",
"[89]\tvalid_0's rmse: 381640\tvalid_0's l2: 1.45649e+11\n",
"[90]\tvalid_0's rmse: 381564\tvalid_0's l2: 1.45591e+11\n",
"[91]\tvalid_0's rmse: 381563\tvalid_0's l2: 1.4559e+11\n",
"[92]\tvalid_0's rmse: 381607\tvalid_0's l2: 1.45624e+11\n",
"[93]\tvalid_0's rmse: 381611\tvalid_0's l2: 1.45627e+11\n",
"[94]\tvalid_0's rmse: 381540\tvalid_0's l2: 1.45572e+11\n",
"[95]\tvalid_0's rmse: 381380\tvalid_0's l2: 1.4545e+11\n",
"[96]\tvalid_0's rmse: 381517\tvalid_0's l2: 1.45556e+11\n",
"[97]\tvalid_0's rmse: 381469\tvalid_0's l2: 1.45518e+11\n",
"[98]\tvalid_0's rmse: 381410\tvalid_0's l2: 1.45474e+11\n",
"[99]\tvalid_0's rmse: 381505\tvalid_0's l2: 1.45546e+11\n",
"[100]\tvalid_0's rmse: 381462\tvalid_0's l2: 1.45513e+11\n",
"Did not meet early stopping. Best iteration is:\n",
"[79]\tvalid_0's rmse: 380898\tvalid_0's l2: 1.45084e+11\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"LGBMRegressor(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,\n",
" importance_type='gain', learning_rate=0.1, max_depth=12,\n",
" min_child_samples=20, min_child_weight=0.001, min_split_gain=0.0,\n",
" n_estimators=100, n_jobs=-1, num_leaves=31,\n",
" objective='regression', random_state=None, reg_alpha=0.0,\n",
" reg_lambda=0.0, silent=True, subsample=1.0,\n",
" subsample_for_bin=200000, subsample_freq=0, verbose_eval=10)"
]
},
"metadata": {
"tags": []
},
"execution_count": 182
}
]
},
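{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"The LightGBM log above reports two metrics per round, `l2` (mean squared error) and `rmse`. They are related by\n",
"\n",
"$$\\mathrm{rmse} = \\sqrt{\\mathrm{l2}}$$\n",
"\n",
"As a quick check on the best iteration (79), added here as a worked example: $\\sqrt{1.45084 \\times 10^{11}} \\approx 3.809 \\times 10^{5}$, which agrees with the reported `rmse` of 380898 up to the rounding of the logged values."
]
},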
{
"cell_type": "markdown",
"metadata": {
"id": "ozp7epKzCUcB",
"colab_type": "text"
},
"source": [
"### XGBRegressor"
]
},
{
"cell_type": "code",
"metadata": {
"id": "RRBIAL4srvZ7",
"colab_type": "code",
"outputId": "1c46c907-1473-4de5-d117-82a1a62fa217",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"model = xgb.XGBRegressor(n_estimators=200, objective='reg:squarederror', max_depth=4, importance_type='gain', min_child_weight=0.001, base_score=0.4)\n",
"\n",
"# Best validation RMSE in the log below: 381025 (iteration 163)\n",
"model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], early_stopping_rounds=50, eval_metric='rmse')"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Series.base is deprecated and will be removed in a future version\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"[0]\tvalidation_0-rmse:416374\n",
"Will train until validation_0-rmse hasn't improved in 50 rounds.\n",
"[1]\tvalidation_0-rmse:410719\n",
"[2]\tvalidation_0-rmse:406104\n",
"[3]\tvalidation_0-rmse:402553\n",
"[4]\tvalidation_0-rmse:400454\n",
"[5]\tvalidation_0-rmse:398491\n",
"[6]\tvalidation_0-rmse:396855\n",
"[7]\tvalidation_0-rmse:395936\n",
"[8]\tvalidation_0-rmse:395278\n",
"[9]\tvalidation_0-rmse:394435\n",
"[10]\tvalidation_0-rmse:393879\n",
"[11]\tvalidation_0-rmse:393176\n",
"[12]\tvalidation_0-rmse:392635\n",
"[13]\tvalidation_0-rmse:392506\n",
"[14]\tvalidation_0-rmse:392100\n",
"[15]\tvalidation_0-rmse:391430\n",
"[16]\tvalidation_0-rmse:391490\n",
"[17]\tvalidation_0-rmse:391009\n",
"[18]\tvalidation_0-rmse:391067\n",
"[19]\tvalidation_0-rmse:390610\n",
"[20]\tvalidation_0-rmse:390327\n",
"[21]\tvalidation_0-rmse:390069\n",
"[22]\tvalidation_0-rmse:389919\n",
"[23]\tvalidation_0-rmse:389760\n",
"[24]\tvalidation_0-rmse:389236\n",
"[25]\tvalidation_0-rmse:388798\n",
"[26]\tvalidation_0-rmse:388397\n",
"[27]\tvalidation_0-rmse:388203\n",
"[28]\tvalidation_0-rmse:388164\n",
"[29]\tvalidation_0-rmse:387960\n",
"[30]\tvalidation_0-rmse:387858\n",
"[31]\tvalidation_0-rmse:387378\n",
"[32]\tvalidation_0-rmse:387342\n",
"[33]\tvalidation_0-rmse:387154\n",
"[34]\tvalidation_0-rmse:387132\n",
"[35]\tvalidation_0-rmse:387014\n",
"[36]\tvalidation_0-rmse:386888\n",
"[37]\tvalidation_0-rmse:386480\n",
"[38]\tvalidation_0-rmse:386380\n",
"[39]\tvalidation_0-rmse:386223\n",
"[40]\tvalidation_0-rmse:386186\n",
"[41]\tvalidation_0-rmse:385906\n",
"[42]\tvalidation_0-rmse:385720\n",
"[43]\tvalidation_0-rmse:385562\n",
"[44]\tvalidation_0-rmse:385563\n",
"[45]\tvalidation_0-rmse:385579\n",
"[46]\tvalidation_0-rmse:385252\n",
"[47]\tvalidation_0-rmse:385018\n",
"[48]\tvalidation_0-rmse:385052\n",
"[49]\tvalidation_0-rmse:384995\n",
"[50]\tvalidation_0-rmse:385040\n",
"[51]\tvalidation_0-rmse:384839\n",
"[52]\tvalidation_0-rmse:384812\n",
"[53]\tvalidation_0-rmse:384703\n",
"[54]\tvalidation_0-rmse:384608\n",
"[55]\tvalidation_0-rmse:384595\n",
"[56]\tvalidation_0-rmse:384663\n",
"[57]\tvalidation_0-rmse:384653\n",
"[58]\tvalidation_0-rmse:384612\n",
"[59]\tvalidation_0-rmse:384578\n",
"[60]\tvalidation_0-rmse:384490\n",
"[61]\tvalidation_0-rmse:384486\n",
"[62]\tvalidation_0-rmse:384572\n",
"[63]\tvalidation_0-rmse:384577\n",
"[64]\tvalidation_0-rmse:384476\n",
"[65]\tvalidation_0-rmse:384407\n",
"[66]\tvalidation_0-rmse:384369\n",
"[67]\tvalidation_0-rmse:384358\n",
"[68]\tvalidation_0-rmse:384310\n",
"[69]\tvalidation_0-rmse:384270\n",
"[70]\tvalidation_0-rmse:384141\n",
"[71]\tvalidation_0-rmse:384081\n",
"[72]\tvalidation_0-rmse:384005\n",
"[73]\tvalidation_0-rmse:384195\n",
"[74]\tvalidation_0-rmse:384141\n",
"[75]\tvalidation_0-rmse:384118\n",
"[76]\tvalidation_0-rmse:384088\n",
"[77]\tvalidation_0-rmse:383974\n",
"[78]\tvalidation_0-rmse:383950\n",
"[79]\tvalidation_0-rmse:384027\n",
"[80]\tvalidation_0-rmse:384009\n",
"[81]\tvalidation_0-rmse:383912\n",
"[82]\tvalidation_0-rmse:383840\n",
"[83]\tvalidation_0-rmse:383814\n",
"[84]\tvalidation_0-rmse:383825\n",
"[85]\tvalidation_0-rmse:383776\n",
"[86]\tvalidation_0-rmse:383614\n",
"[87]\tvalidation_0-rmse:383704\n",
"[88]\tvalidation_0-rmse:383641\n",
"[89]\tvalidation_0-rmse:383666\n",
"[90]\tvalidation_0-rmse:383647\n",
"[91]\tvalidation_0-rmse:383693\n",
"[92]\tvalidation_0-rmse:383701\n",
"[93]\tvalidation_0-rmse:383624\n",
"[94]\tvalidation_0-rmse:383552\n",
"[95]\tvalidation_0-rmse:383535\n",
"[96]\tvalidation_0-rmse:383566\n",
"[97]\tvalidation_0-rmse:383416\n",
"[98]\tvalidation_0-rmse:383319\n",
"[99]\tvalidation_0-rmse:383220\n",
"[100]\tvalidation_0-rmse:383301\n",
"[101]\tvalidation_0-rmse:383347\n",
"[102]\tvalidation_0-rmse:383208\n",
"[103]\tvalidation_0-rmse:383199\n",
"[104]\tvalidation_0-rmse:383129\n",
"[105]\tvalidation_0-rmse:382979\n",
"[106]\tvalidation_0-rmse:382907\n",
"[107]\tvalidation_0-rmse:382957\n",
"[108]\tvalidation_0-rmse:383070\n",
"[109]\tvalidation_0-rmse:383064\n",
"[110]\tvalidation_0-rmse:383056\n",
"[111]\tvalidation_0-rmse:383038\n",
"[112]\tvalidation_0-rmse:383101\n",
"[113]\tvalidation_0-rmse:383035\n",
"[114]\tvalidation_0-rmse:382947\n",
"[115]\tvalidation_0-rmse:382938\n",
"[116]\tvalidation_0-rmse:382960\n",
"[117]\tvalidation_0-rmse:382891\n",
"[118]\tvalidation_0-rmse:382878\n",
"[119]\tvalidation_0-rmse:382872\n",
"[120]\tvalidation_0-rmse:382877\n",
"[121]\tvalidation_0-rmse:382880\n",
"[122]\tvalidation_0-rmse:382841\n",
"[123]\tvalidation_0-rmse:383035\n",
"[124]\tvalidation_0-rmse:382944\n",
"[125]\tvalidation_0-rmse:382940\n",
"[126]\tvalidation_0-rmse:382877\n",
"[127]\tvalidation_0-rmse:382884\n",
"[128]\tvalidation_0-rmse:382904\n",
"[129]\tvalidation_0-rmse:382978\n",
"[130]\tvalidation_0-rmse:382888\n",
"[131]\tvalidation_0-rmse:382932\n",
"[132]\tvalidation_0-rmse:382854\n",
"[133]\tvalidation_0-rmse:382530\n",
"[134]\tvalidation_0-rmse:382540\n",
"[135]\tvalidation_0-rmse:382511\n",
"[136]\tvalidation_0-rmse:382451\n",
"[137]\tvalidation_0-rmse:382392\n",
"[138]\tvalidation_0-rmse:382333\n",
"[139]\tvalidation_0-rmse:382230\n",
"[140]\tvalidation_0-rmse:382187\n",
"[141]\tvalidation_0-rmse:382144\n",
"[142]\tvalidation_0-rmse:381914\n",
"[143]\tvalidation_0-rmse:381917\n",
"[144]\tvalidation_0-rmse:381917\n",
"[145]\tvalidation_0-rmse:381916\n",
"[146]\tvalidation_0-rmse:381899\n",
"[147]\tvalidation_0-rmse:381986\n",
"[148]\tvalidation_0-rmse:381938\n",
"[149]\tvalidation_0-rmse:381826\n",
"[150]\tvalidation_0-rmse:381768\n",
"[151]\tvalidation_0-rmse:381783\n",
"[152]\tvalidation_0-rmse:381772\n",
"[153]\tvalidation_0-rmse:381765\n",
"[154]\tvalidation_0-rmse:381704\n",
"[155]\tvalidation_0-rmse:381710\n",
"[156]\tvalidation_0-rmse:381719\n",
"[157]\tvalidation_0-rmse:381725\n",
"[158]\tvalidation_0-rmse:381730\n",
"[159]\tvalidation_0-rmse:381730\n",
"[160]\tvalidation_0-rmse:381446\n",
"[161]\tvalidation_0-rmse:381367\n",
"[162]\tvalidation_0-rmse:381096\n",
"[163]\tvalidation_0-rmse:381025\n",
"[164]\tvalidation_0-rmse:381122\n",
"[165]\tvalidation_0-rmse:381140\n",
"[166]\tvalidation_0-rmse:381188\n",
"[167]\tvalidation_0-rmse:381181\n",
"[168]\tvalidation_0-rmse:381195\n",
"[169]\tvalidation_0-rmse:381135\n",
"[170]\tvalidation_0-rmse:381155\n",
"[171]\tvalidation_0-rmse:381189\n",
"[172]\tvalidation_0-rmse:381188\n",
"[173]\tvalidation_0-rmse:381699\n",
"[174]\tvalidation_0-rmse:381699\n",
"[175]\tvalidation_0-rmse:381591\n",
"[176]\tvalidation_0-rmse:381610\n",
"[177]\tvalidation_0-rmse:381766\n",
"[178]\tvalidation_0-rmse:381766\n",
"[179]\tvalidation_0-rmse:381576\n",
"[180]\tvalidation_0-rmse:381580\n",
"[181]\tvalidation_0-rmse:381567\n",
"[182]\tvalidation_0-rmse:381564\n",
"[183]\tvalidation_0-rmse:381575\n",
"[184]\tvalidation_0-rmse:381570\n",
"[185]\tvalidation_0-rmse:381572\n",
"[186]\tvalidation_0-rmse:381577\n",
"[187]\tvalidation_0-rmse:381546\n",
"[188]\tvalidation_0-rmse:381840\n",
"[189]\tvalidation_0-rmse:381830\n",
"[190]\tvalidation_0-rmse:381834\n",
"[191]\tvalidation_0-rmse:381676\n",
"[192]\tvalidation_0-rmse:381739\n",
"[193]\tvalidation_0-rmse:381800\n",
"[194]\tvalidation_0-rmse:381803\n",
"[195]\tvalidation_0-rmse:381788\n",
"[196]\tvalidation_0-rmse:381787\n",
"[197]\tvalidation_0-rmse:381756\n",
"[198]\tvalidation_0-rmse:381913\n",
"[199]\tvalidation_0-rmse:381853\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"XGBRegressor(base_score=0.4, booster='gbtree', colsample_bylevel=1,\n",
" colsample_bynode=1, colsample_bytree=1, gamma=0,\n",
" importance_type='gain', learning_rate=0.1, max_delta_step=0,\n",
" max_depth=4, min_child_weight=0.001, missing=None,\n",
" n_estimators=200, n_jobs=1, nthread=None,\n",
" objective='reg:squarederror', random_state=0, reg_alpha=0,\n",
" reg_lambda=1, scale_pos_weight=1, seed=None, silent=None,\n",
" subsample=1, verbosity=1)"
]
},
"metadata": {
"tags": []
},
"execution_count": 183
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9x4ikR3So3FE",
"colab_type": "text"
},
"source": [
"## Ensemble"
]
},
{
"cell_type": "code",
"metadata": {
"id": "F0oJ_d9v0Nr_",
"colab_type": "code",
"outputId": "4eba4e63-4362-4b4d-9390-a8ccce616aaf",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 225
}
},
"source": [
"# Combine the two best models in a VotingRegressor\n",
"r1 = cb.CatBoostRegressor(task_type='CPU', depth=8, iterations=1000, eval_metric='RMSE', loss_function='RMSE', od_type='Iter', metric_period=100, od_wait=500, boost_from_average=False)\n",
"# n_estimators=79 is the best iteration from the LightGBM run above\n",
"r2 = lgb.LGBMRegressor(objective='regression', max_depth=12, num_leaves=31, verbose_eval=10, n_estimators=79)\n",
"# Candidates left out because they did not improve the ensemble (see the results table):\n",
"#r3 = xgb.XGBRegressor(n_estimators=163, objective='reg:squarederror',max_depth=4)\n",
"#r4 = HistGradientBoostingRegressor(verbose=1, n_iter_no_change=100, max_iter=72, scoring='neg_mean_squared_error')\n",
"er = VotingRegressor([('cb', r1), ('lb', r2)])\n",
"er.fit(X_train, y_train)\n",
"y_pred = er.predict(X_valid)\n",
"print(np.sqrt(mean_squared_error(y_valid, np.array(y_pred).astype(int))))"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"0:\tlearn: 624399.4780103\ttotal: 62.1ms\tremaining: 1m 2s\n",
"100:\tlearn: 469742.4572142\ttotal: 6.32s\tremaining: 56.2s\n",
"200:\tlearn: 442056.2989789\ttotal: 12.7s\tremaining: 50.4s\n",
"300:\tlearn: 422858.1051921\ttotal: 19s\tremaining: 44.2s\n",
"400:\tlearn: 405058.3157546\ttotal: 25.3s\tremaining: 37.8s\n",
"500:\tlearn: 389195.8677996\ttotal: 31.5s\tremaining: 31.4s\n",
"600:\tlearn: 375443.5160531\ttotal: 37.6s\tremaining: 24.9s\n",
"700:\tlearn: 362391.2126904\ttotal: 43.6s\tremaining: 18.6s\n",
"800:\tlearn: 350424.6758707\ttotal: 49.7s\tremaining: 12.3s\n",
"900:\tlearn: 339292.5143504\ttotal: 55.8s\tremaining: 6.13s\n",
"999:\tlearn: 329244.8114390\ttotal: 1m 1s\tremaining: 0us\n",
"376565.03793049976\n"
],
"name": "stdout"
}
]
},
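{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"`VotingRegressor` averages the predictions of its fitted estimators, with equal weights unless `weights` is given. With $M$ base models (here $M = 2$):\n",
"\n",
"$$\\hat{y}_{\\text{ens}}(x) = \\frac{1}{M}\\sum_{m=1}^{M}\\hat{y}_m(x)$$\n",
"\n",
"A sketch of why this can help, under the simplifying assumption of two models with errors $e_1$ and $e_2$: the ensemble error is $(e_1 + e_2)/2$, so\n",
"\n",
"$$\\mathbb{E}\\left[\\left(\\tfrac{e_1 + e_2}{2}\\right)^2\\right] = \\tfrac{1}{4}\\left(\\mathbb{E}[e_1^2] + \\mathbb{E}[e_2^2]\\right) + \\tfrac{1}{2}\\,\\mathbb{E}[e_1 e_2]$$\n",
"\n",
"The cross term is small when the two models make weakly correlated errors, which is the situation in which averaging tends to beat either model on its own."
]
},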
{
"cell_type": "markdown",
"metadata": {
"id": "ck3oV0Dlpx8F",
"colab_type": "text"
},
"source": [
"# Results"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MgnpiU7lp02J",
"colab_type": "text"
},
"source": [
"| Algorithm | RMSE |\n",
"| ------------------------------------ | ------------ |\n",
"| CatBoostRegressor | 376.917 kW |\n",
"| HistGradientBoostingRegressor | 382.193 kW |\n",
"| ExtraTreesRegressor | 385.468 kW |\n",
"| LGBMRegressor | 380.898 kW |\n",
"| Ensemble (CatBoostRegressor + HistGradientBoostingRegressor + LGBMRegressor) | 377.716 kW |\n",
"| Ensemble (CatBoostRegressor + XGBRegressor + LGBMRegressor) | 377.111 kW |\n",
"| Ensemble (LGBMRegressor + XGBRegressor) | 379.470 kW |\n",
"| Ensemble (CatBoostRegressor + XGBRegressor) | 377.073 kW |\n",
"| Ensemble (CatBoostRegressor + HistGradientBoostingRegressor + XGBRegressor + LGBMRegressor) | 377.450 kW |\n",
"| **Ensemble (CatBoostRegressor + LGBMRegressor)** | **376.565 kW** |\n",
"\n",
"\n",
"So our final score is a validation RMSE of 376.565 kW, from the CatBoostRegressor + LGBMRegressor ensemble!"
]
},
{
"cell_type": "code",
"metadata": {
"id": "0j_M0UKmJkCd",
"colab_type": "code",
"colab": {}
},
"source": [
"# Save the fitted ensemble to disk, then load it back to check the round trip\n",
"dump(er, 'model.joblib')\n",
"clf = load('model.joblib')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "glJ2cworJvdc",
"colab_type": "code",
"outputId": "cdb98a8e-7972-490e-a4d1-30d8720310b2",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"# The reloaded model should reproduce the ensemble RMSE computed before saving\n",
"y_pred = clf.predict(X_valid)\n",
"print(np.sqrt(mean_squared_error(y_valid, np.array(y_pred).astype(int))))"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"376565.03793049976\n"
],
"name": "stdout"
}
]
}
]
}