@kunigaku
Created November 3, 2020 06:01
NumeraiでRandom Seed Average.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "NumeraiでRandom Seed Average.ipynb",
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyNa5bWM+qwmWdY8OhZQhums",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/kunigaku/2e8efdd9822c934e3164d8dbc94a50bb/numerai-random-seed-average.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yobt1aH638Zr"
},
"source": [
"# Random Seed Averaging on Numerai\n",
"\n",
"Based on the [official example model](https://github.com/numerai/example-scripts/blob/master/example_model.py).\n",
"\n",
"Run this notebook on a GPU instance."
]
},
{
"cell_type": "code",
"metadata": {
"id": "SlCROAwJ33yS"
},
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"from xgboost import XGBRegressor"
],
"execution_count": 1,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "DPyaUebj4oOD",
"outputId": "8d6beef3-9cdb-490e-e173-95bb01fd6670",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 246
}
},
"source": [
"training_data = pd.read_csv(\"https://numerai-public-datasets.s3-us-west-2.amazonaws.com/latest_numerai_training_data.csv.xz\")\n",
"training_data.head()"
],
"execution_count": 2,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>era</th>\n",
" <th>data_type</th>\n",
" <th>feature_intelligence1</th>\n",
" <th>feature_intelligence2</th>\n",
" <th>feature_intelligence3</th>\n",
" <th>feature_intelligence4</th>\n",
" <th>feature_intelligence5</th>\n",
" <th>feature_intelligence6</th>\n",
" <th>feature_intelligence7</th>\n",
" <th>feature_intelligence8</th>\n",
" <th>feature_intelligence9</th>\n",
" <th>feature_intelligence10</th>\n",
" <th>feature_intelligence11</th>\n",
" <th>feature_intelligence12</th>\n",
" <th>feature_charisma1</th>\n",
" <th>feature_charisma2</th>\n",
" <th>feature_charisma3</th>\n",
" <th>feature_charisma4</th>\n",
" <th>feature_charisma5</th>\n",
" <th>feature_charisma6</th>\n",
" <th>feature_charisma7</th>\n",
" <th>feature_charisma8</th>\n",
" <th>feature_charisma9</th>\n",
" <th>feature_charisma10</th>\n",
" <th>feature_charisma11</th>\n",
" <th>feature_charisma12</th>\n",
" <th>feature_charisma13</th>\n",
" <th>feature_charisma14</th>\n",
" <th>feature_charisma15</th>\n",
" <th>feature_charisma16</th>\n",
" <th>feature_charisma17</th>\n",
" <th>feature_charisma18</th>\n",
" <th>feature_charisma19</th>\n",
" <th>feature_charisma20</th>\n",
" <th>feature_charisma21</th>\n",
" <th>feature_charisma22</th>\n",
" <th>feature_charisma23</th>\n",
" <th>feature_charisma24</th>\n",
" <th>feature_charisma25</th>\n",
" <th>...</th>\n",
" <th>feature_wisdom8</th>\n",
" <th>feature_wisdom9</th>\n",
" <th>feature_wisdom10</th>\n",
" <th>feature_wisdom11</th>\n",
" <th>feature_wisdom12</th>\n",
" <th>feature_wisdom13</th>\n",
" <th>feature_wisdom14</th>\n",
" <th>feature_wisdom15</th>\n",
" <th>feature_wisdom16</th>\n",
" <th>feature_wisdom17</th>\n",
" <th>feature_wisdom18</th>\n",
" <th>feature_wisdom19</th>\n",
" <th>feature_wisdom20</th>\n",
" <th>feature_wisdom21</th>\n",
" <th>feature_wisdom22</th>\n",
" <th>feature_wisdom23</th>\n",
" <th>feature_wisdom24</th>\n",
" <th>feature_wisdom25</th>\n",
" <th>feature_wisdom26</th>\n",
" <th>feature_wisdom27</th>\n",
" <th>feature_wisdom28</th>\n",
" <th>feature_wisdom29</th>\n",
" <th>feature_wisdom30</th>\n",
" <th>feature_wisdom31</th>\n",
" <th>feature_wisdom32</th>\n",
" <th>feature_wisdom33</th>\n",
" <th>feature_wisdom34</th>\n",
" <th>feature_wisdom35</th>\n",
" <th>feature_wisdom36</th>\n",
" <th>feature_wisdom37</th>\n",
" <th>feature_wisdom38</th>\n",
" <th>feature_wisdom39</th>\n",
" <th>feature_wisdom40</th>\n",
" <th>feature_wisdom41</th>\n",
" <th>feature_wisdom42</th>\n",
" <th>feature_wisdom43</th>\n",
" <th>feature_wisdom44</th>\n",
" <th>feature_wisdom45</th>\n",
" <th>feature_wisdom46</th>\n",
" <th>target_kazutsugi</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>n000315175b67977</td>\n",
" <td>era1</td>\n",
" <td>train</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>1.0</td>\n",
" <td>0.5</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>...</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.5</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>n0014af834a96cdd</td>\n",
" <td>era1</td>\n",
" <td>train</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.5</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>...</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>n001c93979ac41d4</td>\n",
" <td>era1</td>\n",
" <td>train</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>1.0</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.5</td>\n",
" <td>1.0</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>...</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.5</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>1.0</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>1.0</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>n0034e4143f22a13</td>\n",
" <td>era1</td>\n",
" <td>train</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.0</td>\n",
" <td>0.5</td>\n",
" <td>0.5</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>0.0</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>...</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>1.0</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.0</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.5</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>n00679d1a636062f</td>\n",
" <td>era1</td>\n",
" <td>train</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>0.5</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>...</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 314 columns</p>\n",
"</div>"
],
"text/plain": [
" id era ... feature_wisdom46 target_kazutsugi\n",
"0 n000315175b67977 era1 ... 0.75 0.75\n",
"1 n0014af834a96cdd era1 ... 1.00 0.25\n",
"2 n001c93979ac41d4 era1 ... 0.75 0.00\n",
"3 n0034e4143f22a13 era1 ... 1.00 0.00\n",
"4 n00679d1a636062f era1 ... 0.75 0.75\n",
"\n",
"[5 rows x 314 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 2
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "8DBzqhy44sXg",
"outputId": "8dbbdedb-aa36-40a9-8c9f-4bc3e0c33d70",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 246
}
},
"source": [
"tournament_data = pd.read_csv(\"https://numerai-public-datasets.s3-us-west-2.amazonaws.com/latest_numerai_tournament_data.csv.xz\")\n",
"tournament_data.head()"
],
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>era</th>\n",
" <th>data_type</th>\n",
" <th>feature_intelligence1</th>\n",
" <th>feature_intelligence2</th>\n",
" <th>feature_intelligence3</th>\n",
" <th>feature_intelligence4</th>\n",
" <th>feature_intelligence5</th>\n",
" <th>feature_intelligence6</th>\n",
" <th>feature_intelligence7</th>\n",
" <th>feature_intelligence8</th>\n",
" <th>feature_intelligence9</th>\n",
" <th>feature_intelligence10</th>\n",
" <th>feature_intelligence11</th>\n",
" <th>feature_intelligence12</th>\n",
" <th>feature_charisma1</th>\n",
" <th>feature_charisma2</th>\n",
" <th>feature_charisma3</th>\n",
" <th>feature_charisma4</th>\n",
" <th>feature_charisma5</th>\n",
" <th>feature_charisma6</th>\n",
" <th>feature_charisma7</th>\n",
" <th>feature_charisma8</th>\n",
" <th>feature_charisma9</th>\n",
" <th>feature_charisma10</th>\n",
" <th>feature_charisma11</th>\n",
" <th>feature_charisma12</th>\n",
" <th>feature_charisma13</th>\n",
" <th>feature_charisma14</th>\n",
" <th>feature_charisma15</th>\n",
" <th>feature_charisma16</th>\n",
" <th>feature_charisma17</th>\n",
" <th>feature_charisma18</th>\n",
" <th>feature_charisma19</th>\n",
" <th>feature_charisma20</th>\n",
" <th>feature_charisma21</th>\n",
" <th>feature_charisma22</th>\n",
" <th>feature_charisma23</th>\n",
" <th>feature_charisma24</th>\n",
" <th>feature_charisma25</th>\n",
" <th>...</th>\n",
" <th>feature_wisdom8</th>\n",
" <th>feature_wisdom9</th>\n",
" <th>feature_wisdom10</th>\n",
" <th>feature_wisdom11</th>\n",
" <th>feature_wisdom12</th>\n",
" <th>feature_wisdom13</th>\n",
" <th>feature_wisdom14</th>\n",
" <th>feature_wisdom15</th>\n",
" <th>feature_wisdom16</th>\n",
" <th>feature_wisdom17</th>\n",
" <th>feature_wisdom18</th>\n",
" <th>feature_wisdom19</th>\n",
" <th>feature_wisdom20</th>\n",
" <th>feature_wisdom21</th>\n",
" <th>feature_wisdom22</th>\n",
" <th>feature_wisdom23</th>\n",
" <th>feature_wisdom24</th>\n",
" <th>feature_wisdom25</th>\n",
" <th>feature_wisdom26</th>\n",
" <th>feature_wisdom27</th>\n",
" <th>feature_wisdom28</th>\n",
" <th>feature_wisdom29</th>\n",
" <th>feature_wisdom30</th>\n",
" <th>feature_wisdom31</th>\n",
" <th>feature_wisdom32</th>\n",
" <th>feature_wisdom33</th>\n",
" <th>feature_wisdom34</th>\n",
" <th>feature_wisdom35</th>\n",
" <th>feature_wisdom36</th>\n",
" <th>feature_wisdom37</th>\n",
" <th>feature_wisdom38</th>\n",
" <th>feature_wisdom39</th>\n",
" <th>feature_wisdom40</th>\n",
" <th>feature_wisdom41</th>\n",
" <th>feature_wisdom42</th>\n",
" <th>feature_wisdom43</th>\n",
" <th>feature_wisdom44</th>\n",
" <th>feature_wisdom45</th>\n",
" <th>feature_wisdom46</th>\n",
" <th>target_kazutsugi</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>n0003aa52cab36c2</td>\n",
" <td>era121</td>\n",
" <td>validation</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.0</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>...</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>1.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>n000920ed083903f</td>\n",
" <td>era121</td>\n",
" <td>validation</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.5</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.0</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>...</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>1.0</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.5</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>n0038e640522c4a6</td>\n",
" <td>era121</td>\n",
" <td>validation</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>1.0</td>\n",
" <td>1.00</td>\n",
" <td>1.0</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>1.00</td>\n",
" <td>1.0</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.5</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>...</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>n004ac94a87dc54b</td>\n",
" <td>era121</td>\n",
" <td>validation</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>1.0</td>\n",
" <td>0.75</td>\n",
" <td>0.0</td>\n",
" <td>0.50</td>\n",
" <td>0.00</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>1.0</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>1.00</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>...</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>n0052fe97ea0c05f</td>\n",
" <td>era121</td>\n",
" <td>validation</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>1.0</td>\n",
" <td>0.50</td>\n",
" <td>0.5</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>1.0</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.0</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>1.00</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>...</td>\n",
" <td>0.00</td>\n",
" <td>0.5</td>\n",
" <td>0.50</td>\n",
" <td>0.0</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.50</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.25</td>\n",
" <td>0.25</td>\n",
" <td>0.75</td>\n",
" <td>1.00</td>\n",
" <td>1.0</td>\n",
" <td>0.75</td>\n",
" <td>0.75</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" <td>0.75</td>\n",
" <td>0.00</td>\n",
" <td>0.00</td>\n",
" <td>0.75</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" <td>0.25</td>\n",
" <td>1.00</td>\n",
" <td>1.00</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 314 columns</p>\n",
"</div>"
],
"text/plain": [
" id era ... feature_wisdom46 target_kazutsugi\n",
"0 n0003aa52cab36c2 era121 ... 0.00 0.00\n",
"1 n000920ed083903f era121 ... 0.50 0.25\n",
"2 n0038e640522c4a6 era121 ... 0.00 1.00\n",
"3 n004ac94a87dc54b era121 ... 0.25 0.75\n",
"4 n0052fe97ea0c05f era121 ... 1.00 1.00\n",
"\n",
"[5 rows x 314 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 3
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Y8Pp8iKk4wHh",
"outputId": "47aa4f7a-50c9-47ea-c676-ecb7b78409a1",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# We are not submitting, so keep only the validation data (dropna removes the rows that have missing values)\n",
"tournament_data = tournament_data.dropna()\n",
"len(tournament_data)"
],
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"106895"
]
},
"metadata": {
"tags": []
},
"execution_count": 4
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "lIOVPetY4PK2"
},
"source": [
"TARGET_NAME = [x for x in training_data.columns if 'target' in x][0]\n",
"PREDICTION_NAME = \"prediction\"\n",
"feature_cols = training_data.columns[training_data.columns.str.startswith('feature')]"
],
"execution_count": 5,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "w8Rs7dFOvoh0"
},
"source": [
"def dataframe_cast_to_float32(df, features):\n",
"    \"\"\"Return a copy of df with the given feature columns cast to float32.\"\"\"\n",
"    return df.astype({f: 'float32' for f in features})\n",
"\n",
"training_data = dataframe_cast_to_float32(training_data, feature_cols)\n",
"tournament_data = dataframe_cast_to_float32(tournament_data, feature_cols)"
],
"execution_count": 6,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "ItD8hOPZ-v6k",
"outputId": "07d3446d-6780-4cd5-d113-1fd1959c16d4",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"%%time\n",
"# Train 30 models that differ only in their random seed and keep each model's predictions\n",
"predictions = []\n",
"for itr in range(30):\n",
"    model = XGBRegressor(max_depth=5, learning_rate=0.01, n_estimators=2000, n_jobs=-1, colsample_bytree=0.1, seed=itr, tree_method='gpu_hist')\n",
"    model.fit(training_data[feature_cols], training_data[TARGET_NAME])\n",
"    predictions.append(model.predict(tournament_data[feature_cols]))"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"[02:34:22] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:36:28] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:38:36] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:40:45] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:42:53] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:45:01] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:47:09] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:49:18] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:51:26] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:53:34] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:55:42] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:57:50] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[02:59:58] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:02:06] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:04:15] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:06:23] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:08:31] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:10:40] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:12:48] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:14:56] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:17:04] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:19:12] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:21:21] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:23:29] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:25:37] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:27:46] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:29:54] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:32:02] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:34:10] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[03:36:18] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"CPU times: user 39min 39s, sys: 24min 24s, total: 1h 4min 4s\n",
"Wall time: 1h 4min 4s\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Wm1jFANM4Xhe"
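},
"source": [
"# Submissions are scored by Spearman-style rank correlation:\n",
"# predictions are converted to percentile ranks, then correlated with the targets\n",
"def correlation(predictions, targets):\n",
" ranked_preds = predictions.rank(pct=True, method=\"first\")\n",
" return np.corrcoef(ranked_preds, targets)[0, 1]\n",
"\n",
"# Convenience function: score a DataFrame holding both the prediction and target columns\n",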
"def score(df):\n",
" return correlation(df[PREDICTION_NAME], df[TARGET_NAME])"
],
"execution_count": 8,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "pDyMycA4ytC6",
"outputId": "4fd9f5ac-95de-4117-94d0-edeaec39cbef",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Score each seed's predictions separately against the tournament targets\n",
"scores = []\n",
"for prediction in predictions:\n",
" scores.append(correlation(pd.Series(prediction), tournament_data[TARGET_NAME]))\n",
"scores"
],
"execution_count": 9,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[0.028905115759917572,\n",
" 0.028839441329809517,\n",
" 0.02921323028079193,\n",
" 0.028522350017145304,\n",
" 0.02936801687338555,\n",
" 0.029079186323748136,\n",
" 0.028979724294433268,\n",
" 0.029251024058632485,\n",
" 0.02840886764966899,\n",
" 0.02880805730907734,\n",
" 0.02843635982530843,\n",
" 0.026877035233155034,\n",
" 0.02839237423196859,\n",
" 0.028494873071677432,\n",
" 0.02799380622096903,\n",
" 0.028915742343947268,\n",
" 0.02921264209585863,\n",
" 0.028761814861732047,\n",
" 0.029395473225945552,\n",
" 0.028925262488193715,\n",
" 0.02911980690722348,\n",
" 0.028839393923219304,\n",
" 0.02903148499758035,\n",
" 0.028984570921122724,\n",
" 0.029681396528901567,\n",
" 0.029568379432439205,\n",
" 0.028575497738446528,\n",
" 0.029951257363657842,\n",
" 0.028710364124751427,\n",
" 0.028693068012659983]"
]
},
"metadata": {
"tags": []
},
"execution_count": 9
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "e2H8yh970A_R",
"outputId": "5c76b649-9a1f-455b-9190-bf7781e94f68",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Mean of the individual per-seed scores, as a baseline for the seed-averaged prediction\n",
"sum(scores) / len(scores)"
],
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.028864520581512277"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "s768OlLi0EzL",
"outputId": "c4ddac8b-404b-4513-80b0-84db0f34638b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 217
}
},
"source": [
"# One column per seed (prediction_0, prediction_1, ...), aligned with the tournament ids\n",
"prediction_df = tournament_data[\"id\"].to_frame()\n",
"for itr, prediction in enumerate(predictions):\n",
" prediction_df[f\"prediction_{itr}\"] = prediction\n",
"prediction_df.head()"
],
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>prediction_0</th>\n",
" <th>prediction_1</th>\n",
" <th>prediction_2</th>\n",
" <th>prediction_3</th>\n",
" <th>prediction_4</th>\n",
" <th>prediction_5</th>\n",
" <th>prediction_6</th>\n",
" <th>prediction_7</th>\n",
" <th>prediction_8</th>\n",
" <th>prediction_9</th>\n",
" <th>prediction_10</th>\n",
" <th>prediction_11</th>\n",
" <th>prediction_12</th>\n",
" <th>prediction_13</th>\n",
" <th>prediction_14</th>\n",
" <th>prediction_15</th>\n",
" <th>prediction_16</th>\n",
" <th>prediction_17</th>\n",
" <th>prediction_18</th>\n",
" <th>prediction_19</th>\n",
" <th>prediction_20</th>\n",
" <th>prediction_21</th>\n",
" <th>prediction_22</th>\n",
" <th>prediction_23</th>\n",
" <th>prediction_24</th>\n",
" <th>prediction_25</th>\n",
" <th>prediction_26</th>\n",
" <th>prediction_27</th>\n",
" <th>prediction_28</th>\n",
" <th>prediction_29</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>n0003aa52cab36c2</td>\n",
" <td>0.484879</td>\n",
" <td>0.482867</td>\n",
" <td>0.478019</td>\n",
" <td>0.476245</td>\n",
" <td>0.475820</td>\n",
" <td>0.485848</td>\n",
" <td>0.481255</td>\n",
" <td>0.485239</td>\n",
" <td>0.481609</td>\n",
" <td>0.482669</td>\n",
" <td>0.476397</td>\n",
" <td>0.482509</td>\n",
" <td>0.484143</td>\n",
" <td>0.483471</td>\n",
" <td>0.483858</td>\n",
" <td>0.483059</td>\n",
" <td>0.481547</td>\n",
" <td>0.478849</td>\n",
" <td>0.483934</td>\n",
" <td>0.485651</td>\n",
" <td>0.485224</td>\n",
" <td>0.484253</td>\n",
" <td>0.480352</td>\n",
" <td>0.482058</td>\n",
" <td>0.483976</td>\n",
" <td>0.482240</td>\n",
" <td>0.483506</td>\n",
" <td>0.481370</td>\n",
" <td>0.483327</td>\n",
" <td>0.479972</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>n000920ed083903f</td>\n",
" <td>0.482023</td>\n",
" <td>0.482295</td>\n",
" <td>0.477458</td>\n",
" <td>0.476867</td>\n",
" <td>0.478193</td>\n",
" <td>0.477557</td>\n",
" <td>0.467728</td>\n",
" <td>0.480419</td>\n",
" <td>0.481357</td>\n",
" <td>0.476561</td>\n",
" <td>0.483260</td>\n",
" <td>0.468827</td>\n",
" <td>0.478936</td>\n",
" <td>0.476545</td>\n",
" <td>0.470140</td>\n",
" <td>0.477997</td>\n",
" <td>0.478097</td>\n",
" <td>0.473285</td>\n",
" <td>0.479412</td>\n",
" <td>0.473597</td>\n",
" <td>0.478102</td>\n",
" <td>0.475989</td>\n",
" <td>0.469913</td>\n",
" <td>0.470789</td>\n",
" <td>0.486752</td>\n",
" <td>0.478580</td>\n",
" <td>0.482650</td>\n",
" <td>0.482449</td>\n",
" <td>0.475580</td>\n",
" <td>0.477760</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>n0038e640522c4a6</td>\n",
" <td>0.538685</td>\n",
" <td>0.532875</td>\n",
" <td>0.534055</td>\n",
" <td>0.533150</td>\n",
" <td>0.534803</td>\n",
" <td>0.538281</td>\n",
" <td>0.532359</td>\n",
" <td>0.536322</td>\n",
" <td>0.534888</td>\n",
" <td>0.536595</td>\n",
" <td>0.533760</td>\n",
" <td>0.531219</td>\n",
" <td>0.534250</td>\n",
" <td>0.530842</td>\n",
" <td>0.534509</td>\n",
" <td>0.530837</td>\n",
" <td>0.531356</td>\n",
" <td>0.542174</td>\n",
" <td>0.537338</td>\n",
" <td>0.536501</td>\n",
" <td>0.531979</td>\n",
" <td>0.535674</td>\n",
" <td>0.535155</td>\n",
" <td>0.536897</td>\n",
" <td>0.534812</td>\n",
" <td>0.538941</td>\n",
" <td>0.539470</td>\n",
" <td>0.530719</td>\n",
" <td>0.532399</td>\n",
" <td>0.541055</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>n004ac94a87dc54b</td>\n",
" <td>0.503593</td>\n",
" <td>0.505625</td>\n",
" <td>0.504488</td>\n",
" <td>0.508390</td>\n",
" <td>0.502429</td>\n",
" <td>0.500836</td>\n",
" <td>0.497685</td>\n",
" <td>0.499730</td>\n",
" <td>0.498881</td>\n",
" <td>0.500211</td>\n",
" <td>0.505863</td>\n",
" <td>0.505476</td>\n",
" <td>0.504645</td>\n",
" <td>0.507249</td>\n",
" <td>0.507383</td>\n",
" <td>0.508969</td>\n",
" <td>0.500206</td>\n",
" <td>0.500303</td>\n",
" <td>0.500447</td>\n",
" <td>0.503361</td>\n",
" <td>0.505668</td>\n",
" <td>0.494800</td>\n",
" <td>0.496916</td>\n",
" <td>0.502270</td>\n",
" <td>0.503275</td>\n",
" <td>0.510193</td>\n",
" <td>0.496999</td>\n",
" <td>0.501016</td>\n",
" <td>0.504974</td>\n",
" <td>0.500095</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>n0052fe97ea0c05f</td>\n",
" <td>0.500158</td>\n",
" <td>0.504503</td>\n",
" <td>0.497771</td>\n",
" <td>0.496250</td>\n",
" <td>0.499183</td>\n",
" <td>0.502728</td>\n",
" <td>0.504217</td>\n",
" <td>0.499516</td>\n",
" <td>0.499576</td>\n",
" <td>0.501709</td>\n",
" <td>0.497215</td>\n",
" <td>0.498417</td>\n",
" <td>0.504225</td>\n",
" <td>0.501185</td>\n",
" <td>0.500335</td>\n",
" <td>0.503084</td>\n",
" <td>0.501732</td>\n",
" <td>0.503859</td>\n",
" <td>0.504526</td>\n",
" <td>0.501309</td>\n",
" <td>0.499638</td>\n",
" <td>0.503786</td>\n",
" <td>0.502310</td>\n",
" <td>0.504324</td>\n",
" <td>0.496339</td>\n",
" <td>0.504285</td>\n",
" <td>0.499146</td>\n",
" <td>0.505928</td>\n",
" <td>0.502193</td>\n",
" <td>0.499559</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id prediction_0 ... prediction_28 prediction_29\n",
"0 n0003aa52cab36c2 0.484879 ... 0.483327 0.479972\n",
"1 n000920ed083903f 0.482023 ... 0.475580 0.477760\n",
"2 n0038e640522c4a6 0.538685 ... 0.532399 0.541055\n",
"3 n004ac94a87dc54b 0.503593 ... 0.504974 0.500095\n",
"4 n0052fe97ea0c05f 0.500158 ... 0.502193 0.499559\n",
"\n",
"[5 rows x 31 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "mEIv4tL31Fjp",
"outputId": "eae8848d-d571-424d-c38f-b7a50f3ef7ce",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Select the per-seed columns; the trailing underscore excludes a plain 'prediction' column\n",
"pre_col_list = prediction_df.columns[prediction_df.columns.str.startswith('prediction_')]\n",
"pre_col_list"
],
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Index(['prediction_0', 'prediction_1', 'prediction_2', 'prediction_3',\n",
" 'prediction_4', 'prediction_5', 'prediction_6', 'prediction_7',\n",
" 'prediction_8', 'prediction_9', 'prediction_10', 'prediction_11',\n",
" 'prediction_12', 'prediction_13', 'prediction_14', 'prediction_15',\n",
" 'prediction_16', 'prediction_17', 'prediction_18', 'prediction_19',\n",
" 'prediction_20', 'prediction_21', 'prediction_22', 'prediction_23',\n",
" 'prediction_24', 'prediction_25', 'prediction_26', 'prediction_27',\n",
" 'prediction_28', 'prediction_29'],\n",
" dtype='object')"
]
},
"metadata": {
"tags": []
},
"execution_count": 12
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "lHFJLR1T09F5",
"outputId": "783ba909-f71d-4f6e-f934-dfa49b082ff3",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 217
}
},
"source": [
"# Random seed average: the final prediction is the row-wise mean across all seeds\n",
"prediction_df[PREDICTION_NAME] = prediction_df[pre_col_list].mean(axis=1)\n",
"prediction_df.head()"
],
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>prediction_0</th>\n",
" <th>prediction_1</th>\n",
" <th>prediction_2</th>\n",
" <th>prediction_3</th>\n",
" <th>prediction_4</th>\n",
" <th>prediction_5</th>\n",
" <th>prediction_6</th>\n",
" <th>prediction_7</th>\n",
" <th>prediction_8</th>\n",
" <th>prediction_9</th>\n",
" <th>prediction_10</th>\n",
" <th>prediction_11</th>\n",
" <th>prediction_12</th>\n",
" <th>prediction_13</th>\n",
" <th>prediction_14</th>\n",
" <th>prediction_15</th>\n",
" <th>prediction_16</th>\n",
" <th>prediction_17</th>\n",
" <th>prediction_18</th>\n",
" <th>prediction_19</th>\n",
" <th>prediction_20</th>\n",
" <th>prediction_21</th>\n",
" <th>prediction_22</th>\n",
" <th>prediction_23</th>\n",
" <th>prediction_24</th>\n",
" <th>prediction_25</th>\n",
" <th>prediction_26</th>\n",
" <th>prediction_27</th>\n",
" <th>prediction_28</th>\n",
" <th>prediction_29</th>\n",
" <th>prediction</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>n0003aa52cab36c2</td>\n",
" <td>0.484879</td>\n",
" <td>0.482867</td>\n",
" <td>0.478019</td>\n",
" <td>0.476245</td>\n",
" <td>0.475820</td>\n",
" <td>0.485848</td>\n",
" <td>0.481255</td>\n",
" <td>0.485239</td>\n",
" <td>0.481609</td>\n",
" <td>0.482669</td>\n",
" <td>0.476397</td>\n",
" <td>0.482509</td>\n",
" <td>0.484143</td>\n",
" <td>0.483471</td>\n",
" <td>0.483858</td>\n",
" <td>0.483059</td>\n",
" <td>0.481547</td>\n",
" <td>0.478849</td>\n",
" <td>0.483934</td>\n",
" <td>0.485651</td>\n",
" <td>0.485224</td>\n",
" <td>0.484253</td>\n",
" <td>0.480352</td>\n",
" <td>0.482058</td>\n",
" <td>0.483976</td>\n",
" <td>0.482240</td>\n",
" <td>0.483506</td>\n",
" <td>0.481370</td>\n",
" <td>0.483327</td>\n",
" <td>0.479972</td>\n",
" <td>0.482138</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>n000920ed083903f</td>\n",
" <td>0.482023</td>\n",
" <td>0.482295</td>\n",
" <td>0.477458</td>\n",
" <td>0.476867</td>\n",
" <td>0.478193</td>\n",
" <td>0.477557</td>\n",
" <td>0.467728</td>\n",
" <td>0.480419</td>\n",
" <td>0.481357</td>\n",
" <td>0.476561</td>\n",
" <td>0.483260</td>\n",
" <td>0.468827</td>\n",
" <td>0.478936</td>\n",
" <td>0.476545</td>\n",
" <td>0.470140</td>\n",
" <td>0.477997</td>\n",
" <td>0.478097</td>\n",
" <td>0.473285</td>\n",
" <td>0.479412</td>\n",
" <td>0.473597</td>\n",
" <td>0.478102</td>\n",
" <td>0.475989</td>\n",
" <td>0.469913</td>\n",
" <td>0.470789</td>\n",
" <td>0.486752</td>\n",
" <td>0.478580</td>\n",
" <td>0.482650</td>\n",
" <td>0.482449</td>\n",
" <td>0.475580</td>\n",
" <td>0.477760</td>\n",
" <td>0.477304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>n0038e640522c4a6</td>\n",
" <td>0.538685</td>\n",
" <td>0.532875</td>\n",
" <td>0.534055</td>\n",
" <td>0.533150</td>\n",
" <td>0.534803</td>\n",
" <td>0.538281</td>\n",
" <td>0.532359</td>\n",
" <td>0.536322</td>\n",
" <td>0.534888</td>\n",
" <td>0.536595</td>\n",
" <td>0.533760</td>\n",
" <td>0.531219</td>\n",
" <td>0.534250</td>\n",
" <td>0.530842</td>\n",
" <td>0.534509</td>\n",
" <td>0.530837</td>\n",
" <td>0.531356</td>\n",
" <td>0.542174</td>\n",
" <td>0.537338</td>\n",
" <td>0.536501</td>\n",
" <td>0.531979</td>\n",
" <td>0.535674</td>\n",
" <td>0.535155</td>\n",
" <td>0.536897</td>\n",
" <td>0.534812</td>\n",
" <td>0.538941</td>\n",
" <td>0.539470</td>\n",
" <td>0.530719</td>\n",
" <td>0.532399</td>\n",
" <td>0.541055</td>\n",
" <td>0.535063</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>n004ac94a87dc54b</td>\n",
" <td>0.503593</td>\n",
" <td>0.505625</td>\n",
" <td>0.504488</td>\n",
" <td>0.508390</td>\n",
" <td>0.502429</td>\n",
" <td>0.500836</td>\n",
" <td>0.497685</td>\n",
" <td>0.499730</td>\n",
" <td>0.498881</td>\n",
" <td>0.500211</td>\n",
" <td>0.505863</td>\n",
" <td>0.505476</td>\n",
" <td>0.504645</td>\n",
" <td>0.507249</td>\n",
" <td>0.507383</td>\n",
" <td>0.508969</td>\n",
" <td>0.500206</td>\n",
" <td>0.500303</td>\n",
" <td>0.500447</td>\n",
" <td>0.503361</td>\n",
" <td>0.505668</td>\n",
" <td>0.494800</td>\n",
" <td>0.496916</td>\n",
" <td>0.502270</td>\n",
" <td>0.503275</td>\n",
" <td>0.510193</td>\n",
" <td>0.496999</td>\n",
" <td>0.501016</td>\n",
" <td>0.504974</td>\n",
" <td>0.500095</td>\n",
" <td>0.502733</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>n0052fe97ea0c05f</td>\n",
" <td>0.500158</td>\n",
" <td>0.504503</td>\n",
" <td>0.497771</td>\n",
" <td>0.496250</td>\n",
" <td>0.499183</td>\n",
" <td>0.502728</td>\n",
" <td>0.504217</td>\n",
" <td>0.499516</td>\n",
" <td>0.499576</td>\n",
" <td>0.501709</td>\n",
" <td>0.497215</td>\n",
" <td>0.498417</td>\n",
" <td>0.504225</td>\n",
" <td>0.501185</td>\n",
" <td>0.500335</td>\n",
" <td>0.503084</td>\n",
" <td>0.501732</td>\n",
" <td>0.503859</td>\n",
" <td>0.504526</td>\n",
" <td>0.501309</td>\n",
" <td>0.499638</td>\n",
" <td>0.503786</td>\n",
" <td>0.502310</td>\n",
" <td>0.504324</td>\n",
" <td>0.496339</td>\n",
" <td>0.504285</td>\n",
" <td>0.499146</td>\n",
" <td>0.505928</td>\n",
" <td>0.502193</td>\n",
" <td>0.499559</td>\n",
" <td>0.501300</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id prediction_0 ... prediction_29 prediction\n",
"0 n0003aa52cab36c2 0.484879 ... 0.479972 0.482138\n",
"1 n000920ed083903f 0.482023 ... 0.477760 0.477304\n",
"2 n0038e640522c4a6 0.538685 ... 0.541055 0.535063\n",
"3 n004ac94a87dc54b 0.503593 ... 0.500095 0.502733\n",
"4 n0052fe97ea0c05f 0.500158 ... 0.499559 0.501300\n",
"\n",
"[5 rows x 32 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 13
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "HyB9p7Jt1Rfr",
"outputId": "c499e030-f761-4ae0-d1a6-866877fe6ef2",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Score the seed-averaged prediction (compare with the mean of the individual scores above)\n",
"correlation(prediction_df[PREDICTION_NAME], tournament_data[TARGET_NAME])"
],
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.02916771995422608"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
}
]
}