@wboykinm
Created April 1, 2019 00:56
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"_uuid": "e38835708d1b268a5aa1a291572b8de51a07e9b6"
},
"source": [
"# Data-driven power rankings\n",
"\n",
"Ranking the teams by their strength is a very important piece of information your model should use.\n",
"\n",
"Some of the rankings are using expert opinion (i.e. majority vote of fan opinion). However, this may be not the best approach, as human opinions tend to be biased towards their favorites. Moreover, it may be easy to determine a top few strong teams, but it is a much harder to rank teams in the mid-lower range ranks.\n",
"\n",
"In NCAA tournament teams are assigned with tournament starting seeds, which also has expert bias - it is not uncommon, that a team with higher win ratio % would get lower seed compared to a team with lower win ratio %, based on the regular season schedule (which is not the same for every team!). Therefore, the judgment of team strength of assigned seed sometimes get controversial opinion among fans and media!\n",
"\n",
"The idea of this kernel is to try to calculate the team strength rankings only based on the data we have in our disposal. This way we will eliminate all possible human biases and get a robust estimate of team's strength."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
"_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import statsmodels.api as sm\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"\n",
"seeds = pd.read_csv('../input/datafiles/NCAATourneySeeds.csv')\n",
"tourney_results = pd.read_csv('../input/datafiles/NCAATourneyCompactResults.csv')\n",
"regular_results = pd.read_csv('../input/datafiles/RegularSeasonCompactResults.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "177d18adcc1743fd9b493981ffda26574aa4354f"
},
"source": [
"We are going to use the function which allows us to duplicate the data by swapping the order of teams in given tables:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0",
"_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a"
},
"outputs": [],
"source": [
"def prepare_data(df):\n",
" dfswap = df[['Season', 'DayNum', 'LTeamID', 'LScore', 'WTeamID', 'WScore', 'WLoc', 'NumOT']]\n",
"\n",
" dfswap.loc[df['WLoc'] == 'H', 'WLoc'] = 'A'\n",
" dfswap.loc[df['WLoc'] == 'A', 'WLoc'] = 'H'\n",
" df.columns.values[6] = 'location'\n",
" dfswap.columns.values[6] = 'location' \n",
" df.columns = [x.replace('W','T1_').replace('L','T2_') for x in list(df.columns)]\n",
" dfswap.columns = [x.replace('L','T1_').replace('W','T2_') for x in list(dfswap.columns)]\n",
" output = pd.concat([df, dfswap]).sort_index().reset_index(drop=True)\n",
" \n",
" return output"
]
},
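{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the mirroring concrete, here is a minimal sketch on a made-up two-game table. The team IDs and scores are invented, and `mirror_games` is a hypothetical helper that re-implements the same idea with explicit renames rather than calling `prepare_data`:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# two invented games in the compact-results layout\n",
"games = pd.DataFrame({\n",
"    'Season': [2018, 2018],\n",
"    'DayNum': [20, 25],\n",
"    'WTeamID': [1101, 1102],\n",
"    'WScore': [80, 70],\n",
"    'LTeamID': [1103, 1104],\n",
"    'LScore': [60, 65],\n",
"    'WLoc': ['H', 'N'],\n",
"    'NumOT': [0, 0],\n",
"})\n",
"\n",
"def mirror_games(df):\n",
"    # hypothetical helper: one row from the winner's side, one from the loser's\n",
"    first = df.rename(columns={'WTeamID': 'T1_TeamID', 'WScore': 'T1_Score',\n",
"                               'LTeamID': 'T2_TeamID', 'LScore': 'T2_Score',\n",
"                               'WLoc': 'location'})\n",
"    second = df.rename(columns={'LTeamID': 'T1_TeamID', 'LScore': 'T1_Score',\n",
"                                'WTeamID': 'T2_TeamID', 'WScore': 'T2_Score',\n",
"                                'WLoc': 'location'})\n",
"    # home and away flip when the game is seen from the other team\n",
"    second['location'] = second['location'].replace({'H': 'A', 'A': 'H'})\n",
"    return pd.concat([first, second[first.columns]], ignore_index=True)\n",
"\n",
"mirrored = mirror_games(games)\n",
"print(mirrored)\n",
"```\n",
"\n",
"Each game now appears twice, once from each team's point of view, so a model trained on the result cannot learn anything from which team happens to be listed first."
]
},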
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"_uuid": "0a3c9bc650994b4ca298503b29e56f14702bf9c0"
},
"outputs": [],
"source": [
"tourney_results = prepare_data(tourney_results)\n",
"regular_results = prepare_data(regular_results)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"_uuid": "d5a4ccfadb1974a82acb1956c5ae561bc545f05b"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Season</th>\n",
" <th>DayNum</th>\n",
" <th>T1_TeamID</th>\n",
" <th>T1_Score</th>\n",
" <th>T2_TeamID</th>\n",
" <th>T2_Score</th>\n",
" <th>location</th>\n",
" <th>NumOT</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1985</td>\n",
" <td>20</td>\n",
" <td>1228</td>\n",
" <td>81</td>\n",
" <td>1328</td>\n",
" <td>64</td>\n",
" <td>N</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1985</td>\n",
" <td>20</td>\n",
" <td>1328</td>\n",
" <td>64</td>\n",
" <td>1228</td>\n",
" <td>81</td>\n",
" <td>N</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1354</td>\n",
" <td>70</td>\n",
" <td>1106</td>\n",
" <td>77</td>\n",
" <td>A</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1106</td>\n",
" <td>77</td>\n",
" <td>1354</td>\n",
" <td>70</td>\n",
" <td>H</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1112</td>\n",
" <td>63</td>\n",
" <td>1223</td>\n",
" <td>56</td>\n",
" <td>H</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1223</td>\n",
" <td>56</td>\n",
" <td>1112</td>\n",
" <td>63</td>\n",
" <td>A</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1165</td>\n",
" <td>70</td>\n",
" <td>1432</td>\n",
" <td>54</td>\n",
" <td>H</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1432</td>\n",
" <td>54</td>\n",
" <td>1165</td>\n",
" <td>70</td>\n",
" <td>A</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1192</td>\n",
" <td>86</td>\n",
" <td>1447</td>\n",
" <td>74</td>\n",
" <td>H</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>1985</td>\n",
" <td>25</td>\n",
" <td>1447</td>\n",
" <td>74</td>\n",
" <td>1192</td>\n",
" <td>86</td>\n",
" <td>A</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Season DayNum T1_TeamID ... T2_Score location NumOT\n",
"0 1985 20 1228 ... 64 N 0\n",
"1 1985 20 1328 ... 81 N 0\n",
"2 1985 25 1354 ... 77 A 0\n",
"3 1985 25 1106 ... 70 H 0\n",
"4 1985 25 1112 ... 56 H 0\n",
"5 1985 25 1223 ... 63 A 0\n",
"6 1985 25 1165 ... 54 H 0\n",
"7 1985 25 1432 ... 70 A 0\n",
"8 1985 25 1192 ... 74 H 0\n",
"9 1985 25 1447 ... 86 A 0\n",
"\n",
"[10 rows x 8 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regular_results.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "00008b58fe6066335c46afcbcd3b3e708a656cb9"
},
"source": [
"As you can see, at this point we have clean tables with duplicated rows, but only team positions and their scores swapped.\n",
"\n",
"What we are going to do next is pretty simple - we are going to make dummy features based on `T1_TeamID` and `T2_TeamID` and feed them as factors the `glm` model."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"_uuid": "7e40d0a8fc5294d49e78664a53c35ac127e4fe18"
},
"outputs": [],
"source": [
"# convert to str, so the model would treat TeamID them as factors\n",
"regular_results['T1_TeamID'] = regular_results['T1_TeamID'].astype(str)\n",
"regular_results['T2_TeamID'] = regular_results['T2_TeamID'].astype(str)\n",
"\n",
"# make it a binary task\n",
"regular_results['win'] = np.where(regular_results['T1_Score']>regular_results['T2_Score'], 1, 0)\n",
"\n",
"def team_quality(season):\n",
" \"\"\"\n",
" Calculate team quality for each season seperately. \n",
" Team strength changes from season to season (students playing change!)\n",
" So pooling everything would be bad approach!\n",
" \"\"\"\n",
" formula = 'win~-1+T1_TeamID+T2_TeamID'\n",
" glm = sm.GLM.from_formula(formula=formula, \n",
" data=regular_results.loc[regular_results.Season==season,:], \n",
" family=sm.families.Binomial()).fit()\n",
" \n",
" # extracting parameters from glm\n",
" quality = pd.DataFrame(glm.params).reset_index()\n",
" quality.columns = ['TeamID','beta']\n",
" quality['Season'] = season\n",
" # taking exp due to binomial model being used\n",
" quality['quality'] = np.exp(quality['beta'])\n",
" # only interested in glm parameters with T1_, as T2_ should be mirroring T1_ ones\n",
" quality = quality.loc[quality.TeamID.str.contains('T1_')].reset_index(drop=True)\n",
" quality['TeamID'] = quality['TeamID'].apply(lambda x: x[10:14]).astype(int)\n",
" return quality"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "5a02966d55465cf57de13ab1fae3d58938f28761"
},
"source": [
"So what is done here - we are running a logistic regression with *N* teams in T1_x position and *N* teams in T2_x position as binary features (each feature being either \"0\" or \"1\").\n",
"\n",
"\"0\" is representing that the team did not play in that match at the position T1 (or T2 respectively), and \"1\" would represent that this team actually played.\n",
"\n",
"So among *N* \\* 2 features, only 2 of them are \"1\" (as only 2 teams play during basketball matches). In this setup `glm` model is trying to fit parameters on each of *N* \\* 2 features to predict the `win` as accurately as possible.\n",
"\n",
"As you can see, we did not use any of the information from the matches themselves (not even a score difference). The idea is, that using this approach we want to extract the natural (or mean) strength of a team, not depending on the opponent the team is going to play. And all the actual results of the basketball matches is just a deviation from the expected mean performance of the team.\n",
"\n",
"Let's try calculate the results for each season:"
]
},
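{
"cell_type": "markdown",
"metadata": {},
"source": [
"The 0/1 encoding described above can be sketched on a toy table. The three teams and games below are invented, and `pd.get_dummies` is used here only for illustration; it is an assumption of this sketch, not what the model calls internally:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# three invented games between teams 'A', 'B' and 'C'\n",
"toy = pd.DataFrame({\n",
"    'T1_TeamID': ['A', 'B', 'C'],\n",
"    'T2_TeamID': ['B', 'C', 'A'],\n",
"})\n",
"\n",
"# one 0/1 column per team and per position\n",
"dummies = pd.get_dummies(toy, columns=['T1_TeamID', 'T2_TeamID']).astype(int)\n",
"print(dummies)\n",
"\n",
"# every game switches on exactly one T1 column and one T2 column\n",
"ones_per_game = dummies.sum(axis=1)\n",
"print(ones_per_game.tolist())\n",
"```\n",
"\n",
"The formula interface used in `team_quality` builds its own design matrix, so this only illustrates the encoding idea, not the exact columns the model sees."
]
},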
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"_uuid": "837a53b417200ee689f5a6ddf6dfd0102f85b1cf"
},
"outputs": [],
"source": [
"team_quality = pd.concat([team_quality(2010),\n",
" team_quality(2011),\n",
" team_quality(2012),\n",
" team_quality(2013),\n",
" team_quality(2014),\n",
" team_quality(2015),\n",
" team_quality(2016),\n",
" team_quality(2017),\n",
" team_quality(2018)]).reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "5454621531cb5eef1d28edc7511df47cdfdd0479"
},
"source": [
"Let's take a look at the distribution of the team strength as in fitted `glm` parameters:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"_uuid": "0ba04dbcc4288e098fba98fc0344516325b885e3"
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f55c6746048>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAD8CAYAAAB3u9PLAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xt003We+P/nJ0mTXtOWS9MCpYDUodgi33FGxVt/21qqVixIcfacHc/qDuvuCgPoikfdnzhWZ1znyx4uDuPKYdSZ3+z39/0Nq8hIRhlEnOKoX3FGrGBRbpUibWqvadrm/vn9kSZQeknaJk3Svh7neGw+l3feb3J55X1XVFVVEUIIMelpop0BIYQQsUECghBCCEACghBCiD4SEIQQQgASEIQQQvSRgCCEEAKQgCCEEKKPBAQhhBCABAQhhBB9dNHOwEgcPXoUg8Ew5HmHwzHs+Xgj5YltE6k8E6ksIOUZ7P7FixcHvS6uAoLBYKCgoGDI83V1dcOejzdSntg2kcozkcoCUp7B7g+FNBkJIYQAJCAIIYToE1JAqKmpoby8nLKyMnbu3Dng/JEjR1ixYgULFy7k7bffDhz/6KOPqKysDPxXVFTEO++8A8Bjjz1GSUlJ4FyoVRohhBCREbQPwePxUF1dzSuvvILJZKKqqoqSkhLmz58fuCYnJ4fnnnuOl19+ud+9119/PXv37gWgo6ODpUuXcuONNwbOP/roo9x2223hKosQQogxCBoQamtrycvLIzc3F4CKigoOHjzYLyDMmjULAI1m6ArH/v37ufnmm0lKShprnoUQQkRA0IBgsVjIzs4OPDaZTNTW1o74icxmM/fff3+/Y1u2bGHHjh0sWbKERx55BL1eP2waDodj2KYlu90+oZqepDyxbSKVZyKVBaQ8ozUuw06bm5v56quvuOmmmwLHHn74YaZPn47L5eLJJ59k586drF27dth0ZNhpfJPyxK6JVBaQ8gx2fyiCdiqbTCaampoCjy0WCyaTaUSZeeuttygrKyMhISFwLCsrC0VR0Ov13H333Xz++ecjSlMIIUR4BQ0IRUVF1NfX09DQgNPpxGw2U1JSMqInMZvNVFRU9DvW3NwMgKqqvPPOO+Tn548oTSFiyefnO/nes+9gsdqjnRUhRi1ok5FOp2PTpk2sXr0aj8fDypUryc/PZ9u2bRQWFlJaWkptbS1r167FarVy6NAhXnjhBcxmMwDnz5+nsbGRa6+9tl+6jzzyCO3t7aiqyoIFC3j66acjU0IhxsGXli5abA4+PdfBbYXZwW8QIgaF1IdQXFxMcXFxv2Pr168P/L1o0SJqamoGvXfWrFkcPnx4wPHf/OY3I8mnEDHNZncB8JWlSwKCiFsyU1mIMLA53IAvIAgRryQgCBEGXRIQxAQgAUGIMLDZfQHhzLfdON3eKOdGiNGRgCBEGPibjNxelfrW7ijnRojRkYAgRBjY7G70Ot/HSZqNRLySgCBEGHQ53CzMMaJR4KsmCQgiPklAECIMbHY301L1zJmWwlcWW7SzI8SoSEAQIgy6nW5SDTquzEqTJiMRtyQgCBEGNrub1EQdV2anUd/ajd3liXaWhBgxCQhChEGXw02qIYHvmNLwqnD6W2k2EvFHAoIQY+Rwe3C6vaQl6rjSlArISCMRnyQgCDFG3Q5f81CKXsucaSkkaBWe2nuciu2HqX7ziyjnTojQSUAQYoz8s5RTExNI0Gp47u5F3F6Yg93l4ZUPzuL1qlHOoRChkYAgxBh1OXwrnaYafIsHV10zi+erFvGD7+eiqr4RSELEAwkIQoyRv4aQlth/Nfm0RN8OgV12CQgiPkhAEGKM/OsY+WsIfv4AIQFBxAsJCEKMUSAgDFlDcI17noQYDQkIQoyRvwaQJjUEEeckIAgxRkPVEIx9j61SQxBxQgKCEGNks7vRKJCUoO133N9k5A8YQsQ6CQhCjJHN4VvYTlGUfs
f9nczSZCTiRUgBoaamhvLycsrKyti5c+eA80eOHGHFihUsXLiQt99+u9+5goICKisrqays5J//+Z8DxxsaGli1ahVlZWVs2LABp9M5xqIIER1ddveAEUYAyXotWo0incoibgQNCB6Ph+rqanbt2oXZbGbfvn2cOnWq3zU5OTk899xz3HnnnQPuT0xMZO/evezdu5f//M//DBzfvHkz9913HwcOHMBoNPLf//3fYSiOEJHn9apsfecr2np9v/y7He4B/QcAiqKQatBJDUHEjaABoba2lry8PHJzc9Hr9VRUVHDw4MF+18yaNYsFCxag0YTWAqWqKh999BHl5eUArFixYkCaQsSqc209bH3nJH8669s72d9kNJi0RAkIIn4M/i6+hMViITs7O/DYZDJRW1sb8hM4HA7uvvtudDodDzzwALfeeivt7e0YjUZ0Ot/TZ2dnY7FYQkqrrq5uyPN2u33Y8/FGyhObTrU6ALjQ4SvPtx1dpOo1g5ZNj4cL37bHfLknymvjJ+UZnaABYawOHTqEyWSioaGBv//7v+fKK68kNTV1VGkZDAYKCgqGPF9XVzfs+Xgj5YlNXWfbgG9osasUFBTg/oOF7KnGQcs2raYDFWK+3BPltfGT8gy8PxRB23hMJhNNTU2BxxaLBZPJFHJG/Nfm5uZy7bXX8sUXX5CZmYnVasXt9lWlm5qaRpSmENHU3TeMtMnm+/9wTUZGaTIScSRoQCgqKqK+vp6GhgacTidms5mSkpKQEu/s7AyMHmpra+Ovf/0r8+fPR1EUrrvuOvbv3w/Anj17Qk5TiGjzzyuwdLlQVTWwfeZg0hITZJSRiBtBm4x0Oh2bNm1i9erVeDweVq5cSX5+Ptu2baOwsJDS0lJqa2tZu3YtVquVQ4cO8cILL2A2mzl9+jRPPfUUiqKgqir/+I//yPz58wHYuHEjDz30EFu3bqWgoIBVq1ZFvLBChENP33LWvW6V1m4n3U6PdCqLCSGkPoTi4mKKi4v7HVu/fn3g70WLFlFTUzPgvu9+97u8+eabg6aZm5srQ01FXLL17ZAG8GWTb6vMy5e+9ktL1GFzuFFVdcDENSFijcxUFmKEei5ZiqKu0QoMXPraLy0xAY9XpcfpGfS8ELFEAoIQI2Rz+tYuAqhr9NUQUoZpMgJZvkLEBwkIQoxQj8NDelICaQbNxRrCMJ3KIHsiiPgQ8XkIQkw03Q43KQYdSRqFU802YOBeCH5pgSWwpYYgYp/UEIQYIf+8A1NqAk6PFximhhBY8VRqCCL2SQ1BiBHqcXpI1msxpV78PTVcpzJIH4KID1JDEGKEbH1NRtmpF4NAmiFh0Gv9TUaySY6IBxIQhBihHqebFL2vycgvxaAd9NqLo4ykyUjEPmkyEmKEuh0eUgw6TH1rNCYlaNFpB/9tlaLXoSjSZCTigwQEIUbI12SkxZTqm4wwVIcygEYjm+SI+CEBQYgR6nH6+hAMOpXpaYYhh5z6GRMTsEqTkYgDEhCEGAGH24PLo/aNKnKRm5mE26sOe48scCfihQQEIUagp29hu2S9FnDx2O0FuPrmIgzFFxCkhiBinwQEIUbAP3zUv3bRtXOnBL0nLTEBi9Ue0XwJEQ4y7FSIEeju2wshRR/6bylpMhLxQgKCECPQ3ddkNNS8g8FIk5GIFxIQhBgB/37KQy1VMRjfNpq+TXKEiGXShyDECPi3z0zW66A3tHvSEnW4vSp2l5czLTZUFQpnpkcwl0KMjtQQhBgB//aZI6oh9F3bYnNw/ytHqN73RUTyJsRYSUAQYgQCNYQR9SH41jzaWXOG5i6HdDCLmCUBQYgRsI2qD8F37f/6+BwAvU4JCCI2hRQQampqKC8vp6ysjJ07dw44f+TIEVasWMHChQt5++23A8fr6ur4wQ9+QEVFBcuWLeMPf/hD4Nxjjz1GSUkJlZWVVFZWUldXF4biCBFZ3Q7ffsoGXei/pfw1BI9XZd70FHqcnkhlT4gxCfozx+PxUF1dzS
uvvILJZKKqqoqSkhLmz58fuCYnJ4fnnnuOl19+ud+9iYmJPP/888yZMweLxcLKlSu56aabMBqNADz66KPcdtttYS6SEJHjX+lUUZSQ7/HXEK6dM4WrZhr570/ORyp7QoxJ0J85tbW15OXlkZubi16vp6KigoMHD/a7ZtasWSxYsACNpn9yc+fOZc6cOQCYTCamTJlCW1tb+HIvxDjr7ts+cyTypiazZN5UnqgoIFmvpdspQ1BFbAr6zrZYLGRnZwcem0wmamtrR/xEtbW1uFwuZs+eHTi2ZcsWduzYwZIlS3jkkUfQ6/XDpuFwOIZtWrLb7ROq6UnKE3uaWtrR4aGurm5E5dl0czrYGunubMerQu3xL9APsYdCNEyE1+ZSUp7RGZd5CM3NzWzcuJHnn38+UIt4+OGHmT59Oi6XiyeffJKdO3eydu3aYdMxGAwUFBQMeb6urm7Y8/FGyhN7NB92MSVNR0FBwajKk9d2Fv7azuy5+WSmDP8DaDxNhNfmUlKegfeHIuhPFJPJRFNTU+CxxWLBZDKFnBGbzcY//dM/8dBDD7F48eLA8aysLBRFQa/Xc/fdd/P555+HnKYQ0dLtcPsmpY2Sb5VU6HFJx7KIPUEDQlFREfX19TQ0NOB0OjGbzZSUlISUuNPpZM2aNVRWVg7oPG5ubgZAVVXeeecd8vPzR5F9IcZXt9MTWOl0NJL6gokMPRWxKOg7W6fTsWnTJlavXo3H42HlypXk5+ezbds2CgsLKS0tpba2lrVr12K1Wjl06BAvvPACZrOZt956i08++YSOjg727NkDwL//+79TUFDAI488Qnt7O6qqsmDBAp5++umIF1aIseru2z5ztJITtH3pSA1BxJ6QfuoUFxdTXFzc79j69esDfy9atIiampoB9/nnGAzmN7/5zUjyKURM8G+fOVr+Gc4yF0HEotgZ5iBEHLCNYtjppfz9D70uaTISsUcCghAh8vStWOrvGB6NQKey1BBEDJKAIESI/LuljaWGkJQgAUHELgkIQoTIvzlOWIadOqTJSMQeCQhChGg022dezt8hLfMQRCySgCBEiEazfeblDDoNigK90mQkYpAEBCFC1O0ce5ORoigkJ2ilD0HEJNlTWYgg/nbnh1w7dypFffsgj6WGAL7ZyhIQRCySGoIQw3B5vPyfs2388tApPmvoAEa2feZgkvVaWbpCxCSpIQgxjKZOO6oKblVlZ80ZYOw1BN+eCFJDELFHaghCDONCRy8ApQuycHq8AGOamOa/XzqVRSySgCDEML7pCwiP3b6A+VmpwNg6lf3390iTkYhB0mQkxDD8NYTcKcls/9v/wYdnWtFqQt9PeTBJei0tNkc4sidEWElAEGIY33TYmZaqJzFBy8IZRhbOMI45zWS9ll6ZmCZikDQZCTGMCx29zMhICmuayXqd7IcgYpIEBCGG8U1HLzPSwx0QZNipiE0SEIQYgqqqXOjoZWZm+ANCj8uDqqphTVeIsZKAIMQQOntd9Dg9YW8yStJrUVVwuH3DWLsdbtx9Q1qFiCYJCEIMwT/kdGZGYljTTb5sT4TyrTXsPHwmrM8hxGjIKCMhhnChww4QkU5l8NUMDDoN59t7aWjrCetzCDEaUkMQYgjftPu+pMMeEPrWQup1eWju8s1HsMmoIxEDQgoINTU1lJeXU1ZWxs6dOwecP3LkCCtWrGDhwoW8/fbb/c7t2bOHpUuXsnTpUvbs2RM4fuzYMZYtW0ZZWRnPPvusdLCJmHOh045Bp2Fqij6s6V66r3Kz1VcLsdldYX0OIUYjaEDweDxUV1eza9cuzGYz+/bt49SpU/2uycnJ4bnnnuPOO+/sd7yjo4Nf/OIX/O53v2P37t384he/oLOzE4Cf/OQnPPPMM/zxj3+kvr6empqaMBZLiLH7pqOXmRlJKMrYZiZfLimhb9c0pztQQ5B5CSIWBA0ItbW15OXlkZubi16vp6KigoMHD/a7ZtasWSxYsACNpn9y77//PjfeeC
MZGRmkp6dz4403cvjwYZqbm7HZbCxevBhFUVi+fPmANIWItkhMSoOLNYRe56VNRjIvQURf0IBgsVjIzs4OPDaZTFgslpASH+rey49nZ2eHnKYQ4+Wb9l5mhHmEEVwMCN1OD81dfU1GEhBEDIirUUYOh4O6urohz9vt9mHPxxspT/Q4PSrNXQ70ru4h8zza8jTbfF/+Z75u4FSTLyB09kT33yaeXptQSHlGJ2hAMJlMNDU1BR5bLBZMJlNIiZtMJj7++ON+91577bUD0mxqagopTYPBQEFBwZDn6+rqhj0fb6Q80XOutQc4y9X5uRQU5A56zWjLk93thNfOkT41C0dzM2Cj101U/23i6bUJhZRn4P2hCNpkVFRURH19PQ0NDTidTsxmMyUlJSElftNNN/H+++/T2dlJZ2cn77//PjfddBNZWVmkpqZy9OhRVFXljTfeoLS0NKQ0hRgPFyelhb8PIenSUUZ9TUZOtxenW2Yri+gKWkPQ6XRs2rSJ1atX4/F4WLlyJfn5+Wzbto3CwkJKS0upra1l7dq1WK1WDh06xAsvvIDZbCYjI4MHH3yQqqoqANasWUNGRgYATz31FI8//jh2u51bbrmFW265JbIlFWIEvmyyAjBnWkrY0zboNGiUi53KGgW8qm+iml4X3iGuQoxESH0IxcXFFBcX9zu2fv36wN+LFi0acthoVVVVICBcqqioiH379o0kr0KMm6MNHZiMBnLSw9+prCgKyXod7T1OOnpczJ6SzLm2HmwON5lhnvMgxEjITGUhBnG0oYPFuRlhn4Pgl6zXcq5vuYq5fbUQGWkkok0CghCXae92Ut/aw+LczIg9R7Jey9etvoAwb7ovIHRLQBBRJgFBiMscPd8BwOLcjIg9R5JeF+i4ntdXQ+iSgCCiTAKCEJc5eq4DjQKLZqVH7DmS9Vo8Xt/6XfOmpwJSQxDRJwFBiMscbejgSlMaKYbIzdv0z1bWKDB7SjIgAUFEnwQEIS6hqiqfne+IaHMRXAwIU1MNGJMSAOiyS0AQ0SUBQYhL1Lf20NHjGoeA4Kt9ZKUZSPGvbSQrnoook4AgxCWONrQDsHh2ZAOCf7ZyVpoBnVZDUoIWm0P2RBDRJQFBiEt8eq6DFL2W/Ky0iD6Pf1/lrDTfxLcUg052TRNRJwFBiEuc/tZGvikNrSYyE9L8/H0IWUYDAGmJOpmYJqJOAoIQl2i1OZmeZoj48yRd0ocAkGLQyigjEXUSEIS4RIvNGfY9lAeTYvDVEKb7m4z0OmwyykhEmQQEIfp4vSrtPU6mpkY+ICQlSJORiD0SEIToY7W78HhVpqREvsnomrxMbs6fxndMvs5rX6eyBAQRXXG1haYQkdRicwKMS5PRvOmp/D8/ui7wONWgkz4EEXVSQxCiT1t3X0AYhyajy6UadLK4nYg6CQhC9Gm1OQCYEoVNalINOpxuLy6PbKMpokcCghB9WvtqCNNSI9+HcDn/QnrSbCSiSQKCEH38TUaZydGpIYAscCeiSwKCEH1abQ6MiTr0uvH/WKQm9tUQnBIQRPRIQBCiT2u3k6lRaC6Ci01GMjlNRFNIw05ramr46U9/itfrZdWqVTzwwAP9zjudTh599FGOHz9ORkYGW7ZsYdasWfz+97/nV7/6VeC6L7/8kj179lBQUMC9995Lc3MziYm+mZovv/wyU6dODWPRhBiZtm5nVDqU4WKTkcxFENEUNCB4PB6qq6t55ZVXMJlMVFVVUVJSwvz58wPX7N69G6PRyIEDBzCbzWzevJmtW7dy1113cddddwG+YLBmzRoKCgoC923evJmioqIIFEuIkWu1OcmbmhyV55aAIGJB0Caj2tpa8vLyyM3NRa/XU1FRwcGDB/td8+6777JixQoAysvL+fDDD1FVtd81ZrOZioqKMGZdiPDyNRlFqYaQKKOMRPQFDQgWi4Xs7OzAY5PJhMViGXBNTk4OADqdjrS0NNrb2/td84c//GFAQHjiiSeorKxkx44dAw
KIEOMpsI7ROCxbMZhUvYwyEtE3LktXfPbZZyQlJXHllVcGjm3evBmTyYTNZmPdunXs3buX5cuXD5uOw+Ggrq5uyPN2u33Y8/FGyjN+rHYPHq+Ky9Yech7DWR6P1/eDqP6bJo5+3suvPmnjnqIMpqWMz+oysfzajIaUZ3SCvttMJhNNTU2BxxaLBZPJNOCaxsZGsrOzcbvddHV1kZmZGTg/WHORP43U1FTuvPNOamtrgwYEg8HQrw/icnV1dcOejzdSnvFzqtkGfM3CK3IpKJgZ0j3hLk9iwtckpWXQSCZvflnPzUVzuLkgN2zpDyeWX5vRkPIMvD8UQZuMioqKqK+vp6GhAafTidlspqSkpN81JSUl7NmzB4D9+/dz/fXXoyi+Hae8Xi9vvfVWv4Dgdrtpa2sDwOVy8d5775Gfnx9ayYSIAP+yFdFqMgJfx7LN4eHdE80AWHtlj2UxvoLWEHQ6HZs2bWL16tV4PB5WrlxJfn4+27Zto7CwkNLSUqqqqti4cSNlZWWkp6ezZcuWwP1HjhwhJyeH3NyLv3ScTierV6/G5XLh9XpZsmQJ99xzT2RKKEQI/LOUozXsFPoWuLO7+D9nfT+WOiUgiHEWUgNlcXExxcXF/Y6tX78+8LfBYGD79u2D3nvdddfxu9/9rt+x5ORkXn/99ZHmVYiIaQmsYxS9gJBi0HGkvo1vu3y1lY4eCQhifMlMZSGAtr69EDKjXEOwWB0oCmQmJ0gNQYw72SBHCKCt20F6UgIJ2uj9RvJPTrt6VgaqqkpAEONOaghC4GsyGo+d0objn5xWuiALY1ICHRIQxDiTgCAEviajaHYow8UF7v5mQRYZyXoZZSTGnTQZCQG0djuYOy0lqnlYnJvBKYuNq2YYSU/SSZORGHcSEITAN+z0mrwpUc3DPd/L5Z7v+YZnpyf5OpVVVQ3M6REi0qTJSEx6Xq9KW7czqkNOL5eelIDHq8rqp2JcSUAQk15bjxOvGt1JaZfLSPLlRZqNxHiSgCAmvU/PdQDwney0KOfkImNSAiCT08T4koAgJr33T35LUoKWa/Iyg188TtL7AoKMNBLjSQKCmPQOn2zhunlTMOi00c5KQEayLyBIk5EYTxIQxKR2vr2HMy3d3Jw/PdpZ6cdfQ5DJaWI8SUAQk9r7J1sAuDl/WpRz0p8/IEgNQYwnCQhiUjt8qgWT0UB+Vmq0s9JPsl5LglaRgCDGlQQEMWl5vCp/PtXCTfOnx9zkL0VRSE9KkFFGYlxJQBCT1vELnXT0uGKuucjPmJQgo4zEuJKAICatD063AnDj/NgMCP7lK4QYLxIQxKR1utlGVpqB6WnR20d5OBlJCXT0OqOdDTGJSEAQk9bXbT3MnpIc7WwMSWoIYrxJQBCTVkNbD7OnxnhAkE5lMY4kIIhJye7y0GS1kzclunsgDCc9WY/V7sbjVaOdFTFJhBQQampqKC8vp6ysjJ07dw4473Q62bBhA2VlZaxatYrz588DcP78eRYtWkRlZSWVlZVs2rQpcM+xY8dYtmwZZWVlPPvss6iqvOnF+Dnf3oOqwuypSdHOypD8k9O67FJLEOMjaEDweDxUV1eza9cuzGYz+/bt49SpU/2u2b17N0ajkQMHDnDfffexefPmwLnZs2ezd+9e9u7dS3V1deD4T37yE5555hn++Mc/Ul9fT01NTRiLJcTwzrX1ADA7lmsIMltZjLOgAaG2tpa8vDxyc3PR6/VUVFRw8ODBfte8++67rFixAoDy8nI+/PDDYX/xNzc3Y7PZWLx4MYqisHz58gFpChFJX7f6AkJeDPchZMgS2GKcBQ0IFouF7OzswGOTyYTFYhlwTU5ODgA6nY60tDTa29sBX7PR8uXL+eEPf8gnn3wyaJrZ2dkD0hQikr5u7SFZr2VqDG2Kc7l0WfFUjLOI7qmclZXFoUOHyMzM5NixY6xZswaz2Tzq9BwOB3V1dUOet9vtw56PN1KeyPniXDOmFC0nTpwYdRqRLk9ru2
8Owhen6pnuaYnY80BsvTbhIOUZnaABwWQy0dTUFHhssVgwmUwDrmlsbCQ7Oxu3201XVxeZmZkoioJe7/sFVlhYyOzZszl79uyANJuamgakORiDwUBBQcGQ5+vq6oY9H2+kPJHT9lYzV87IHFN+Il2eKVY7/P48qVOyKCjIi9jzQGy9NuEg5Rl4fyiCNhkVFRVRX19PQ0MDTqcTs9lMSUlJv2tKSkrYs2cPAPv37+f6669HURTa2trweDwANDQ0UF9fT25uLllZWaSmpnL06FFUVeWNN96gtLR0pGUUYlS8XtU3ByGGJ6WBdCqL8Re0hqDT6di0aROrV6/G4/GwcuVK8vPz2bZtG4WFhZSWllJVVcXGjRspKysjPT2dLVu2AHDkyBG2b9+OTqdDo9Hw9NNPk5GRAcBTTz3F448/jt1u55ZbbuGWW26JbEmF6NPc5cDh9jJ7auyOMAJITNBi0GkkIIhxE1IfQnFxMcXFxf2OrV+/PvC3wWBg+/btA+4rLy+nvLx80DSLiorYt2/fSPIqRFh83doNQF6M1xBAZiuL8SUzlcWkc3EOQuwHhIxkWeBOjB8JCGLSOdfWg1ajMDMzdmcp+01LNQTmTAgRaRIQxKTzdWsPMzISSdDG/tu/ZEEWJ5q6OP2tLdpZEZNA7H8ihAizc3Ewwshv2dUzUBT4/dEL0c6KmAQkIIhJRVVVzrZ0kxfjI4z8TMZErps7hTc/uyALQIqIk4AgJpXGTjudvS4KcozRzkrIKhfP5ExLN8cvWKOdFTHBSUAQk8oXfV+qC+MoINxemE2CVmHv0W+inRUxwUlAEJPKF41WFAUWZKdFOyshy0jWc0v+dPbVNuKVzXJEBElAEJPKFxeszJ2aQoohous6hl15YTaNnXbOtHRHOytiApOAICaVLxqtFMyIn+YiP/+s6qZOe5RzIiYyCQhi0rDaXZxr64mr/gO/nHTfJLrGzt4o50RMZBIQxKRxorELgIVxWEPIMhoAqSGIyJKAICaNLy50AnBVHNYQEhN8u7s1WiUgiMiRgCAmjS8arUxN0TM9zRDtrIxKdnqi1BBERElAEJPGF41WFs4woihKtLMyKjnpiTRKQBARJAFBTAouj5evmmxx2aHsZzIm0iTeavibAAAS/klEQVSdyiKCJCCISeH0tzacHm9cdij75aQn0t7jwu7yRDsrYoKSgCAmhdPNvgld+VnxM0P5ctl9Q0+lH0FEigQEMSk09Y3OmZGRGOWcjF5Oui/v0o8gIkUCgpgULFY7Bp2G9KSEaGdl1LL7AoJFhp6KCJGAICaFxk472emJcTvCCCDbKDUEEVkhrfBVU1PDT3/6U7xeL6tWreKBBx7od97pdPLoo49y/PhxMjIy2LJlC7NmzeLPf/4z//Ef/4HL5SIhIYGNGzeyZMkSAO69916am5tJTPS9yV9++WWmTp0a5uIJ4WPptAe+UONVikGHMVEnI41ExAQNCB6Ph+rqal555RVMJhNVVVWUlJQwf/78wDW7d+/GaDRy4MABzGYzmzdvZuvWrWRmZvLiiy9iMpn46quv+NGPfsThw4cD923evJmioqLIlEyISzRae/nu7MxoZ2PMsmUugoigoE1GtbW15OXlkZubi16vp6KigoMHD/a75t1332XFihUAlJeX8+GHH6KqKgsXLsRkMgGQn5+Pw+HA6XRGoBhCDE1VVSxWR9zXEMA30qhJ+hBEhAQNCBaLhezs7MBjk8mExWIZcE1OTg4AOp2OtLQ02tvb+12zf/9+Fi5ciF6vDxx74oknqKysZMeOHbJfrIiY9h4XTrc30Ckbz3KMUkMQkTMuu4ScPHmSzZs38/LLLweObd68GZPJhM1mY926dezdu5fly5cPm47D4aCurm7I83a7fdjz8UbKEx6n2xwAuLtaqKsL35dpNMqjc3XR0uWg9tgXJGjD10Eu77XYNl7lCRoQTCYTTU1NgccWiyXQDHTpNY2NjWRnZ+N2u+nq6iIz09de29TUxN
q1a3n++eeZPXt2v3sAUlNTufPOO6mtrQ0aEAwGAwUFBUOer6urG/Z8vJHyhEfjCQvwDdcsnE9BGPsRolGeoq5zqJ91MGXmHGZlJuP1qmg0Yw8M8l6LbWMtT6jBJGiTUVFREfX19TQ0NOB0OjGbzZSUlPS7pqSkhD179gC+pqHrr78eRVGwWq088MAD/Ou//ivXXHNN4Hq3201bWxsALpeL9957j/z8/JALJ8RI+JtYciZAk1H2JZPTHtn9GXdsPxzkDiFCF7SGoNPp2LRpE6tXr8bj8bBy5Ury8/PZtm0bhYWFlJaWUlVVxcaNGykrKyM9PZ0tW7YA8Nvf/pZz586xY8cOduzYAfiGlyYlJbF69WpcLhder5clS5Zwzz33RLakYtKydNrRKDA9NT6Xvb6Uf+e0TXuPU9doBXw7wRkT43fCnYgdIfUhFBcXU1xc3O/Y+vXrA38bDAa2b98+4L4HH3yQBx98cNA0X3/99ZHkU4hRa7LamZZqQKeN/3mY/pFSdY1Wrpph5PgFK/Ut3SyalRHlnImJIP4/IUIE0dhpnxDNRQDGJB0zM5KoXDyD/1l1NQBnW7qjnCsxUYzLKCMhxtsbn37D//Wd6WQk67FY7cyZmhLtLIWFoii8+0gxeq0Gu8sLQH1LT5RzJSYKqSGICaehrYcN/99RXvzTaeDiOkYThUGnRVEUkvRactITqW+VGoIIDwkIYsI5/a0NgLePNdHjdNNld0+ogHCpOVNTpMlIhI0EBDHh+L8gv27t4dCJbwEmxLIVg5kzLUVqCCJsJCCICedsSzeJCRo0Crz6wVlg4gaEudOS6ehx0dEja4SJsZOAICacsy3dfMeUxrVzp3Ck3rem1kRuMgIZaSTCQwKCmHDOfNvN3Gkp3F6YEzg2UQPC3Gm+gCDNRiIcJCCICcXu8nChs5c501K4rdC3Sm9aoo5k/cQcYZ07JRlFgbMy9FSEwcT8lIhJ6+vWHlTV98vZZEzk+3My6XZ4op2tiElM0DIjPYl6aTISYSABQUwoZ1t8Q07nTUsFYNvf/g/srokbEMAX/KTJSISDNBmJCeVM3y/lOdOSAZiRkcS86anRzFLEzZmWzNmWblRVxeNVcXm80c6SiFMSEMSEcvbbbqanGUibRKt/zpmaQpfdzZ5Pv2HJcwf51999Fu0siTglTUZiQjnb0h0YeTNZ+Mv78O8+Q1Hgz6daUFUVRQnfjmpicpAagphQzrZ0M2+SBYSiWenMSE/kH2+eyxO3F9Da7aTJKvsui5GTGoKYMDp7XbR2OyddDSErLZEPHi8F4C9f+ybiHfvGGthMR4hQSQ1BTBj+oZeTLSBcqiAnDY0Cn3/TGe2siDgkAUFMGP5VTudNn7wBIVmv44rpqRyXgCBGQZqMRNzr6HGys+YMr35QT0ZyArlTkqOdpagqmpnOn0+3RDsbIg5JDUHENZfHy92//IAX/3Sa0gITr/3LDRh02mhnK6qumpmOxeqguUs6lsXISA1BxDVzbSNnWrr55d99lzuKcoLfMAkUzUwH4Pg3VrIWTMxF/URkhFRDqKmpoby8nLKyMnbu3DngvNPpZMOGDZSVlbFq1SrOnz8fOPfSSy9RVlZGeXk5hw8fDjlNIYJRVZX//NNp8rNSue2q7GhnJ2YsnGEEfB3LdpeHx16r5Y/Hm6KcKxEPggYEj8dDdXU1u3btwmw2s2/fPk6dOtXvmt27d2M0Gjlw4AD33XcfmzdvBuDUqVOYzWbMZjO7du3i6aefxuPxhJSmEMG89+W3nGjq4p+Kr0CjkUlYfqkGHfOmpVB7vpNHdn/G/z7SwNr/91P+8nUb4AukDW09eLxqlHMqYk3QJqPa2lry8vLIzc0FoKKigoMHDzJ//vzANe+++y5r164FoLy8nOrqalRV5eDBg1RUVKDX68nNzSUvL4/a2lqAoGmGk6qqWO1uUg06tBqFXqeH9h4nCVoNmckJaBSFth4nrTYnxiQd01INfGXpYvcn5/nL1+38zX
emU3VNLjkZibR3O7G7vExJ1ZOi19Lr8tBsdQAwPc2AChw60cyhE82kJydw/bypFM5MR6soKIrvw5qs16Kq0GV30+10Y0xKIEWvxeNV6eh14XR7yUzWA77lnNt7nCgoZKYkoNdq6HZ6aO92YkjQMCVZj0ZR6Ox10dHrIi1RR2ayHlVVae120mV3MzVFT0ZyAg63l2arA6fHQ5YxkTSDji6Hm6ZOOxpFISc9kaQELS02B42ddlITdcxIT0KrUbjQ0YvFamd6moGZmUm4PCr1Ld20dTuZPSWZWZlJtHU7+cpiw+H2kJ+VxowM3wbwxy9YMei0JPa6mOP0cLShg8+/6SAnPYnvzclEqyj86atvOX7BSuHMdG7Jn8Y3Hb3sPXqBMy3dlBVkUVpg4tCXzew6fBabw819N8zh0IlmctITuevqGRF538SzwpnpvFl7AVWFNX9zBebaRh74zV+orizk1Q/OcqS+nQXZaTx623cwJibwy8PN1P/Bwu2F2fzge7Np7rLz9rEm2ntc3FqQxQ3zp3HS0sWHp1vR6zTccMU05mel8kVjJ8e+sTI9zcB3Z2eSlqjjRFMXX7d2M3tKMgtyjHhVlVPNNlptTuZOSyZvago2u5uzrd04XF7mTEvGlJZIa7eT8+09JGg15GYmk5aoo8XmoMlqJy0xgZz0RHQahRabk9ZuB1NTDExPM+Dxqnxrc9DjcDMt1RB4rze09eBVVaalGkgx6Ohxumm1OdFpFaak6NFrNXQ53HT2uEjSa8lM1qPg+1xa7S6MSQkYE3V4Vd/ABYfbS0ZyAkkJWlwelY5eJ6iQnpyAQafF7vLQ2esiQavBmOj7rulxerDaXSQn6EhL9H3ddtnd2JxujIk6Ug06vu1ycOxCJ42ddhK0GnQahcZOOw1tPSQmaLl27hQy3eOzQGPQgGCxWMjOvlgdN5lMgS/1S6/JyfG13+p0OtLS0mhvb8disXD11Vf3u9disQAETTOc/v2tE7xUcwYAvVaD87LFv7QaZdBfS3qthoIZRl44dIrt7w6swSRoFVye/vcpCqgqZCYn0OP08Mqf6wfcp9UoqKrKpU+p0yi4L8uDTgNu75mgz3l5/jUKqPjyMXz6A48N9m/hL9NQj0dyjNcaGMrlr41ep2FmRhJP7j3Ok3uPA3D1rHRmZSbxP/d/CcD/XVGAXidjIy5XONPI7z+7wD3fm8UjS7/Dyu/OYsUvP2DN//or09MM/LhkPm9+doF/ePUTAJITFBblZvLie6fZceg04Hs9kvRaXvvr+eGeasQGe19oFLj8IxjKscGuufi+rg8cG+xzc/n7X1FAo/R//+s0Ch5V7ZffwdIK5ZhGAeWy9If67gGYmqKnx+nh1Q/qSUpQ+MtVCyK+r0dcdSo7HA7q6uqGvWaw88vnwvK588bwzFPGcK+IiBuMfX8Ef09EUjSfezg3T4e3/t73nj9x4gQA//ue3Euu8HJH7iD9LjdnjEPuxGh8ffrkqO91OBwhXRc0IJhMJpqaLnZIWSwWTCbTgGsaGxvJzs7G7XbT1dVFZmbmsPcGS3MwixcvDl4iIYQQoxK0rl1UVER9fT0NDQ04nU7MZjMlJSX9rikpKWHPnj0A7N+/n+uvvx5FUSgpKcFsNuN0OmloaKC+vp5FixaFlKYQQojxFbSGoNPp2LRpE6tXr8bj8bBy5Ury8/PZtm0bhYWFlJaWUlVVxcaNGykrKyM9PZ0tW7YAkJ+fz+23384dd9yBVqtl06ZNaLW+SUODpSmEECJ6FFUd0O0nhBBiEpLhGUIIIQAJCEIIIfpMiICwdetWli1bRmVlJf/wD/8QmOugqirPPvssZWVlLFu2jOPHj0c5p6F5/vnnue2221i2bBlr1qzBarUGzg21FEiseuutt6ioqGDBggV8/vnn/c7FW1n84n3Zlccff5wlS5Zw5513Bo51dHRw//33s3TpUu6//346O+Nn+ezGxkbuvfde7rjjDioqKvj1r38NxGeZHA4HVVVV3H
XXXVRUVLB9+3YAGhoaWLVqFWVlZWzYsAGn0xmZDKgTQFdXV+DvX//61+qTTz6pqqqqvvfee+qPfvQj1ev1qp9++qlaVVUVrSyOyOHDh1WXy6Wqqqr+/Oc/V3/+85+rqqqqJ0+eVJctW6Y6HA713Llzamlpqep2u6OZ1aBOnTqlnj59Wv3hD3+o1tbWBo7HY1lUVVXdbrdaWlqqnjt3TnU4HOqyZcvUkydPRjtbI/Lxxx+rx44dUysqKgLHnn/+efWll15SVVVVX3rppcB7Lh5YLBb12LFjqqr6vguWLl2qnjx5Mi7L5PV6VZvNpqqqqjqdTrWqqkr99NNP1XXr1qn79u1TVVVVn3zySfW//uu/IvL8E6KGkJqaGvi7t7c3sLn4wYMHWb58OYqisHjxYqxWK83NzdHKZshuuukmdDrfALDFixcH5mwMtxRIrLriiiuYN2/gpMB4LAv0X8pFr9cHll2JJ9///vdJT0/vd8z/WQFYvnw577zzTjSyNipZWVlcddVVgO+7YN68eVgslrgsk6IopKT4Nnhyu9243W4UReGjjz6ivLwcgBUrVkTsPTchAgLAli1bKC4u5s0332T9+vXAwGU3srOzA81J8eK1117jlltuAQZfRiTeyuMXr2WJ13wH09raSlZWFgDTp0+ntbU1yjkanfPnz1NXV8fVV18dt2XyeDxUVlZyww03cMMNN5Cbm4vRaAz8SIzk91jcLF1x33330dIycBeoDRs2cOutt/LQQw/x0EMP8dJLL/Hb3/6WdevWRSGXoQtWHoAXX3wRrVbLXXfdNd7ZG5FQyiLih6IogVp2POnu7mbdunU88cQT/VoNIL7KpNVq2bt3L1arlTVr1nDmzJngN4VJ3ASEV199NaTrli1bxgMPPMC6desGLJ3R1NQU0hIZ4yFYeV5//XXee+89Xn311cAbOZRlRKIh1NfmUrFalmDiNd/BTJ06lebmZrKysmhubmbKlPhav8vlcrFu3TqWLVvG0qVLgfgvk9Fo5LrrruPo0aNYrVbcbjc6nS6i32MTosmovr4+8PfBgwcDbdYlJSW88cYbqKrK0aNHSUtLC1QhY1lNTQ27du3ixRdfJCkpKXB8qKVA4lG8lmWiLrvi/6wAvPHGG5SWlkY5R6FTVZV/+7d/Y968edx///2B4/FYpra2tsCoQrvdzgcffMAVV1zBddddx/79+wHYs2dPxN5zE2Km8o9//GPOnj2LoijMnDmTp59+GpPJhKqqVFdXc/jwYZKSkvjZz35GUVFRtLMbVFlZGU6nk4wM38qTV199NdXV1YCvGem1115Dq9XyxBNPUFxcHM2sBnXgwAGeeeYZ2traMBqNFBQU8Ktf/QqIv7L4/elPf+JnP/tZYNmVf/mXf4l2lkbk4Ycf5uOPP6a9vZ2pU6fy4x//mFtvvZUNGzbQ2NjIjBkz2Lp1a+D9F+s++eQT/u7v/o4rr7wSjcb3G/fhhx9m0aJFcVemEydO8Nhjj+HxeFBVldtuu421a9fS0NDAQw89RGdnJwUFBWzevBm9Xh/2558QAUEIIcTYTYgmIyGEEGMnAUEIIQQgAUEIIUQfCQhCCCEACQhCCCH6SEAQQggBSEAQQgjRRwKCEEIIAP5/u5aXnIlHw20AAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.set_style('whitegrid')\n",
"sns.kdeplot(np.array(team_quality['beta']), bw=0.1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "48d1827761ee782f4a317e15208f4f0d6d3cfa05"
},
"source": [
"The distribution seems to be symetric, with a little shift towards positive estimates. In general, very nice looking distribution.\n",
"\n",
"If you have noticed, during quality estimation I used the `exp(beta)` to determine team's quality. The idea is, that the difference of i.e. `beta=1` and `beta=2` is not equivalent of the difference between `beta=2` and `beta=3`, as beta parameters are following not linear, but logarithmic interpretation (all this is the result of us using binomial `glm`). However, it is possible to get a linearly comparable team strengths - it is sufficient to apply `exp(beta)` transformation!\n",
"\n",
"Let's take at the `exp(beta)` distribution:"
]
},
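{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick numeric check of what the `exp` transformation does, here is a small sketch with three made-up beta values (they are not taken from the fitted model):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"betas = np.array([1.0, 2.0, 3.0])\n",
"odds = np.exp(betas)\n",
"\n",
"# equal steps in beta become equal ratios after exponentiating\n",
"step_ratios = odds[1:] / odds[:-1]\n",
"print(step_ratios)\n",
"\n",
"# while the raw differences on the exp scale keep growing\n",
"print(np.diff(odds))\n",
"```\n",
"\n",
"A one-unit step in beta always multiplies `exp(beta)` by the same factor (about 2.718), whatever the starting point."
]
},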
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"_uuid": "dc5e5f39425cb1e4e443749557ca47cf847e66ae"
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f55c44b9470>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.set_style('whitegrid')\n",
"sns.kdeplot(np.array(np.clip(team_quality['quality'],0,1000)), bw=0.1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "e7109b11ba480f259616c64dcf89aa67f69435f8"
},
"source": [
"First thing we can notice, that there is a collection of teams with very high team strengths. This is where you would expect the tournament leaders to be!"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "2a4333883da3abe8b37fdc9cdbf771ba74495333"
},
"source": [
"# March Madness teams - team strength overview"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "7c7b92cc11e9a1d26f939d13cb6e57f0ae4aa983"
},
"source": [
"Up until this point we considered all the teams which played in the regular season. However, we are only interested in March Madness selected teams. I am going to analyze how the team strength we calculated translates into predicting the winners of NCAA March Madness tournament matches."
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "44537ca601a8c86a38745effb9022f50bffd8b00"
},
"source": [
"Let's merge the team quality data to our tournament results table:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"_uuid": "978666a8d4831c6d7b14790681eae5b58c7b1062"
},
"outputs": [],
"source": [
"team_quality_T1 = team_quality[['TeamID','Season','quality']]\n",
"team_quality_T1.columns = ['T1_TeamID','Season','T1_quality']\n",
"team_quality_T2 = team_quality[['TeamID','Season','quality']]\n",
"team_quality_T2.columns = ['T2_TeamID','Season','T2_quality']\n",
"\n",
"tourney_results['T1_TeamID'] = tourney_results['T1_TeamID'].astype(int)\n",
"tourney_results['T2_TeamID'] = tourney_results['T2_TeamID'].astype(int)\n",
"tourney_results = tourney_results.merge(team_quality_T1, on = ['T1_TeamID','Season'], how = 'left')\n",
"tourney_results = tourney_results.merge(team_quality_T2, on = ['T2_TeamID','Season'], how = 'left')"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"_uuid": "45a0e7f596ed0cd8f52f23344e4bfa741e9d08b2"
},
"outputs": [],
"source": [
"# we only have tourney results since year 2010\n",
"tourney_results = tourney_results.loc[tourney_results['Season'] >= 2010].reset_index(drop=True)\n",
"\n",
"# drop the preliminary (play-in) games played before DayNum 136\n",
"tourney_results = tourney_results.loc[tourney_results['DayNum'] >= 136].reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "48dc666321fa630a92f210c2880a3ff3ea20ba94"
},
"source": [
"We also want to compare how the team strength we calculated correlates with the seed assigned to each team:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"_uuid": "5ca8d86ad11d25d138e0bcee87e45e18dea6b770"
},
"outputs": [],
"source": [
"seeds['seed'] = seeds['Seed'].apply(lambda x: int(x[1:3]))\n",
"seeds['division'] = seeds['Seed'].apply(lambda x: x[0])\n",
"\n",
"seeds_T1 = seeds[['Season','TeamID','seed','division']].copy()\n",
"seeds_T2 = seeds[['Season','TeamID','seed','division']].copy()\n",
"seeds_T1.columns = ['Season','T1_TeamID','T1_seed','T1_division']\n",
"seeds_T2.columns = ['Season','T2_TeamID','T2_seed','T2_division']\n",
"\n",
"tourney_results = tourney_results.merge(seeds_T1, on = ['Season', 'T1_TeamID'], how = 'left')\n",
"tourney_results = tourney_results.merge(seeds_T2, on = ['Season', 'T2_TeamID'], how = 'left')"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "a0766205a2677b48e868eced93acd4ab2e517e96"
},
"source": [
"Let's convert team quality into a rank within each season and division, so that it is on the same 1-16 scale as the seeds:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"_uuid": "cdf6ff0388e664fdcdd9cc31a784e940703841c5"
},
"outputs": [],
"source": [
"tourney_results['T1_powerrank'] = tourney_results.groupby(['Season','T1_division'])['T1_quality'].rank(method='dense', ascending=False).astype(int)\n",
"tourney_results['T2_powerrank'] = tourney_results.groupby(['Season','T2_division'])['T2_quality'].rank(method='dense', ascending=False).astype(int)"
]
},
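{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ranking is computed on game-level rows, so a team that played several games appears several times. A small sketch on toy data (all values are made up for illustration) of why `method='dense'` matters in that situation:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Toy game-level rows (hypothetical values): team 1 appears in two games.\n",
"df = pd.DataFrame({'Season': [2015, 2015, 2015, 2015],\n",
"                   'T1_division': ['W', 'W', 'W', 'X'],\n",
"                   'T1_TeamID': [1, 1, 2, 3],\n",
"                   'T1_quality': [400.0, 400.0, 250.0, 90.0]})\n",
"\n",
"# Dense ranking gives repeated rows of one team the same rank\n",
"# and leaves no gap before the next team.\n",
"df['T1_powerrank'] = (df.groupby(['Season', 'T1_division'])['T1_quality']\n",
"                        .rank(method='dense', ascending=False)\n",
"                        .astype(int))\n",
"ranks = df['T1_powerrank'].tolist()\n",
"print(ranks)\n",
"```\n",
"\n",
"With the default `method='average'`, the two rows of team 1 would share rank 1.5 and team 2 would get rank 3, which no longer lines up with the seed scale."
]
},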
{
"cell_type": "markdown",
"metadata": {
"_uuid": "bc7ecf48994aa30811fdb688e4e621fb758859c1"
},
"source": [
"At this point we want to see how the power rank we derived from the regular season correlates with the seed data:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"_uuid": "c119e3557b5cee848459023684fd81f7adea838f"
},
"outputs": [],
"source": [
"piv = pd.pivot_table(tourney_results, index = ['T1_seed'], columns=['T1_powerrank'], values = ['T1_TeamID'], aggfunc=len)\n",
"piv = piv.xs('T1_TeamID', axis=1, drop_level=True)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"_uuid": "a677ed46f21d3e8608ab57fddf70341977703e8e"
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f55c449eb70>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x576 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig, ax = plt.subplots(figsize=(12,8))\n",
"sns.heatmap(piv, annot=True,cmap='Blues', fmt='g')"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "aa1205f338c53d2243adeba51f22a04c566c0371"
},
"source": [
"The agreement is strong, but there are some interesting misalignments. Take the #1 seed: our power ranking agrees with the committee in only `108/(108+18+15) = 76.6%` of cases! One plausible explanation is that our `glm` model has no time component, such as a rising or falling quality trend over the season, whereas the `seed` reflects how ready a team is when March Madness starts (for example, are all of its star players healthy?).\n",
"\n",
"There is also considerable disagreement among the mid-seeded teams - these are the hardest to rank even for experts!\n",
"\n",
"So now we can finally ask the question - which is the better predictor: seed information or our power rankings?\n",
"Let's try to answer that without building any models - let's simply calculate the share of games won by teams with a given seed/powerrank:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"_uuid": "3e906b31f7e40784e0343acaae7b37bd8a5bce63"
},
"outputs": [],
"source": [
"tourney_results['win'] = np.where(tourney_results['T1_Score'] > tourney_results['T2_Score'], 1, 0)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"_uuid": "b42aa3ed367ac96f9448b6fcfc8be11bf377eb62"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>seed_win_ratio</th>\n",
" <th>powerrank_win_ratio</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.787234</td>\n",
" <td>0.763359</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.690265</td>\n",
" <td>0.698276</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0.642857</td>\n",
" <td>0.676190</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0.632653</td>\n",
" <td>0.608696</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>0.520000</td>\n",
" <td>0.492958</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>0.409836</td>\n",
" <td>0.527027</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>0.539474</td>\n",
" <td>0.560976</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>0.470588</td>\n",
" <td>0.280000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>0.379310</td>\n",
" <td>0.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>0.357143</td>\n",
" <td>0.419355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>0.506849</td>\n",
" <td>0.526316</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>0.307692</td>\n",
" <td>0.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>0.200000</td>\n",
" <td>0.234043</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>0.142857</td>\n",
" <td>0.100000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>0.121951</td>\n",
" <td>0.142857</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>0.027027</td>\n",
" <td>0.052632</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" seed_win_ratio powerrank_win_ratio\n",
"1 0.787234 0.763359\n",
"2 0.690265 0.698276\n",
"3 0.642857 0.676190\n",
"4 0.632653 0.608696\n",
"5 0.520000 0.492958\n",
"6 0.409836 0.527027\n",
"7 0.539474 0.560976\n",
"8 0.470588 0.280000\n",
"9 0.379310 0.333333\n",
"10 0.357143 0.419355\n",
"11 0.506849 0.526316\n",
"12 0.307692 0.333333\n",
"13 0.200000 0.234043\n",
"14 0.142857 0.100000\n",
"15 0.121951 0.142857\n",
"16 0.027027 0.052632"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mean_win_ratio = pd.DataFrame({'seed_win_ratio': tourney_results.groupby('T1_seed')['win'].mean(),\n",
" 'powerrank_win_ratio': tourney_results.groupby('T1_powerrank')['win'].mean()})\n",
"mean_win_ratio"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"_uuid": "37020e398d5d1dd6f41731e66a9f3c8dd51fb5dc"
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f55c15efc50>"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.set_style('whitegrid')\n",
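"# pass x and y as keywords; recent seaborn versions no longer accept them positionally\n",
"sns.lineplot(x=mean_win_ratio.index, y=mean_win_ratio['seed_win_ratio']) # Blue\n",
"sns.lineplot(x=mean_win_ratio.index, y=mean_win_ratio['powerrank_win_ratio']) # Orange"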
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "434f340f40e92e359c812de42dc7340c9eb747fa"
},
"source": [
"The two curves are closely aligned on average, and, as expected, both fluctuate heavily in the mid-range seeds!\n",
"\n",
"Let's take a look at the AUC of the seed/rank assignment:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"_uuid": "769cd0502095fb91a8b5d73e07948f54546381ad"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"seed AUC: 0.721579898534631\n",
"powerrank AUC: 0.7145998774452625\n",
"team quality AUC: 0.6279483901470968\n"
]
}
],
"source": [
"from sklearn.metrics import roc_auc_score\n",
"\n",
"print(f\"seed AUC: {roc_auc_score(tourney_results['win'],-tourney_results['T1_seed'])}\")\n",
"print(f\"powerrank AUC: {roc_auc_score(tourney_results['win'],-tourney_results['T1_powerrank'])}\")\n",
"print(f\"team quality AUC: {roc_auc_score(tourney_results['win'],tourney_results['T1_quality'])}\")"
]
},
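{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the numbers above easier to read: for binary labels, AUC equals the share of (winning row, losing row) pairs in which the winning row has the higher score, with ties counted as half. A minimal pure-Python sketch on toy data (the helper `pairwise_auc` and all values are made up for illustration) that also shows why the seed and powerrank are negated before scoring:\n",
"\n",
"```python\n",
"def pairwise_auc(labels, scores):\n",
"    # Share of (positive, negative) pairs where the positive has the higher score.\n",
"    pos = [s for y, s in zip(labels, scores) if y == 1]\n",
"    neg = [s for y, s in zip(labels, scores) if y == 0]\n",
"    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)\n",
"    return wins / (len(pos) * len(neg))\n",
"\n",
"# Toy data (hypothetical): 1 means T1 won; seed 1 is the strongest team.\n",
"win = [1, 1, 0, 1, 0, 0]\n",
"seed = [1, 2, 3, 4, 5, 6]\n",
"\n",
"# A lower seed means a stronger team, so flip the sign to make higher = better.\n",
"auc_seed = pairwise_auc(win, [-s for s in seed])\n",
"# Without the sign flip the result is mirrored around 0.5.\n",
"auc_unflipped = pairwise_auc(win, seed)\n",
"print(round(auc_seed, 3), round(auc_unflipped, 3))\n",
"```\n",
"\n",
"An AUC of 0.72 therefore means that in roughly 72% of such pairs the winning row carries the better seed."
]
},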
{
"cell_type": "markdown",
"metadata": {
"_uuid": "5983ebbf6eea3cfe1142e3148f12960209c58779"
},
"source": [
"It seems that our powerrank performs at a level very similar to the expert-based seed assignment (AUC of 0.715 vs 0.722), which is great news in itself - it suggests there is a way to build a seed ranking for March Madness that is free of expert bias!\n",
"\n",
"However, the *raw* team quality estimate does not do as well as the powerranks (AUC of 0.628). One possible reason is that we built the quality models for each season separately. This is food for thought both for you and for me!\n",
"\n",
"So to summarize - estimating team quality with a very simple `glm` model should be helpful in determining the winner of a basketball match! And there you have it - a very strong feature for your models!"
]
},
{
"cell_type": "markdown",
"metadata": {
"_uuid": "30dee9cb23776171537fd428ea2be7eee309f170"
},
"source": [
"## Hope you enjoyed this analysis!\n",
"\n",
"### Liked the content? don't forget to upvote:)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}