@fhiyo
Last active February 16, 2018 09:56
Jupyter notebook from working through the Kaggle Titanic tutorial (with feature engineering)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Titanic tutorial problem\n",
"\n",
"18/01/28\n",
"Working through feature engineering."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Written with reference to:\n",
"\n",
"- [Titanic Top 4% with ensemble modeling | Kaggle](https://www.kaggle.com/yassineghouzam/titanic-top-4-with-ensemble-modeling)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"import numpy as np\n",
"# from sklearn.model_selection import KFold\n",
"from sklearn.metrics import accuracy_score\n",
"\n",
"from collections import Counter\n",
"\n",
"sns.set(style='white', context='notebook', palette='deep')\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/site-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.\n",
" \"This module will be removed in 0.20.\", DeprecationWarning)\n"
]
}
],
"source": [
"# Import the classes and functions used for modeling\n",
"\n",
"import xgboost as xgb\n",
"\n",
"from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier, ExtraTreesClassifier, VotingClassifier\n",
"from sklearn.discriminant_analysis import LinearDiscriminantAnalysis\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"from sklearn.neural_network import MLPClassifier\n",
"from sklearn.svm import SVC\n",
"from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold, learning_curve"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Load training and test data\n",
"train_df = pd.read_csv('../input/train.csv', header=0)\n",
"test_df = pd.read_csv('../input/test.csv', header=0)\n",
"test_df_ids = test_df.PassengerId\n",
"\n",
"# Concatenate the two datasets so preprocessing can be done in one pass; split them again before training and testing\n",
"train_count = len(train_df)\n",
"dataset_df = pd.concat(objs=[train_df, test_df], axis=0).reset_index(drop=True)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Cabin</th>\n",
" <th>Embarked</th>\n",
" <th>Fare</th>\n",
" <th>Name</th>\n",
" <th>Parch</th>\n",
" <th>PassengerId</th>\n",
" <th>Pclass</th>\n",
" <th>Sex</th>\n",
" <th>SibSp</th>\n",
" <th>Survived</th>\n",
" <th>Ticket</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>22.0</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>7.2500</td>\n",
" <td>Braund, Mr. Owen Harris</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>male</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>A/5 21171</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>38.0</td>\n",
" <td>C85</td>\n",
" <td>C</td>\n",
" <td>71.2833</td>\n",
" <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>female</td>\n",
" <td>1</td>\n",
" <td>1.0</td>\n",
" <td>PC 17599</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>26.0</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>7.9250</td>\n",
" <td>Heikkinen, Miss. Laina</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>female</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>STON/O2. 3101282</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>35.0</td>\n",
" <td>C123</td>\n",
" <td>S</td>\n",
" <td>53.1000</td>\n",
" <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>female</td>\n",
" <td>1</td>\n",
" <td>1.0</td>\n",
" <td>113803</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>35.0</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>8.0500</td>\n",
" <td>Allen, Mr. William Henry</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>male</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>373450</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Age Cabin Embarked Fare \\\n",
"0 22.0 NaN S 7.2500 \n",
"1 38.0 C85 C 71.2833 \n",
"2 26.0 NaN S 7.9250 \n",
"3 35.0 C123 S 53.1000 \n",
"4 35.0 NaN S 8.0500 \n",
"\n",
" Name Parch PassengerId \\\n",
"0 Braund, Mr. Owen Harris 0 1 \n",
"1 Cumings, Mrs. John Bradley (Florence Briggs Th... 0 2 \n",
"2 Heikkinen, Miss. Laina 0 3 \n",
"3 Futrelle, Mrs. Jacques Heath (Lily May Peel) 0 4 \n",
"4 Allen, Mr. William Henry 0 5 \n",
"\n",
" Pclass Sex SibSp Survived Ticket \n",
"0 3 male 1 0.0 A/5 21171 \n",
"1 1 female 1 1.0 PC 17599 \n",
"2 3 female 0 1.0 STON/O2. 3101282 \n",
"3 1 female 1 1.0 113803 \n",
"4 3 male 0 0.0 373450 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Outlier detection\n",
"# Even when no values are missing, odd data can be present for some reason,\n",
"# so inspect the data and, if anything looks wrong, drop it or treat it as missing.\n",
"\n",
"def detect_outliers(df, n, features):\n",
"    \"\"\"\n",
"    Takes a dataframe df of features and returns a list of the indices\n",
"    corresponding to the observations containing more than n outliers according\n",
"    to the Tukey method.\n",
"    \"\"\"\n",
"    outlier_indices = []\n",
"\n",
"    # iterate over features (columns)\n",
"    for col in features:\n",
"        # 1st quartile (25%), ignoring NaN\n",
"        Q1 = np.nanpercentile(df[col], 25)\n",
"        # 3rd quartile (75%), ignoring NaN\n",
"        Q3 = np.nanpercentile(df[col], 75)\n",
"        # Interquartile range (IQR)\n",
"        IQR = Q3 - Q1\n",
"        # outlier step\n",
"        outlier_step = 1.5 * IQR\n",
"\n",
"        # Determine the indices of outliers for feature col\n",
"        outlier_list_col = df[(df[col] < Q1 - outlier_step) | (df[col] > Q3 + outlier_step)].index\n",
"\n",
"        # append the found outlier indices for col to the list of outlier indices\n",
"        outlier_indices.extend(outlier_list_col)\n",
"\n",
"    # select observations containing more than n outliers\n",
"    counts_outlier_indices = Counter(outlier_indices)\n",
"    multiple_outliers = [k for k, v in counts_outlier_indices.items() if v > n]\n",
"\n",
"    return multiple_outliers"
]
},
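{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the Tukey rule that `detect_outliers` applies to each column, the cell below computes the fences `Q1 - 1.5 * IQR` and `Q3 + 1.5 * IQR` by hand. The `toy` Series is hypothetical data made up for illustration, not part of the Titanic dataset. Note that NaN never satisfies either comparison, so missing values are not flagged as outliers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy column: five ordinary values, one extreme value, one NaN\n",
"toy = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 100.0, np.nan])\n",
"\n",
"# Quartiles computed while ignoring NaN, as in detect_outliers\n",
"q1 = np.nanpercentile(toy, 25)\n",
"q3 = np.nanpercentile(toy, 75)\n",
"step = 1.5 * (q3 - q1)\n",
"\n",
"# Tukey fences: anything outside [lower, upper] counts as an outlier\n",
"lower, upper = q1 - step, q3 + step\n",
"outliers = toy[(toy < lower) | (toy > upper)]\n",
"\n",
"print('fences:', lower, upper)\n",
"print(outliers)  # only the 100.0 entry falls outside the fences"
]
},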
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# detect outliers from Age, SibSp, Parch and Fare\n",
"Outliers_to_drop = detect_outliers(train_df, 2, ['Age', 'SibSp', 'Parch', 'Fare'])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"percentile: nan\n",
"nanpercentile: 20.125\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/site-packages/numpy/lib/function_base.py:4274: RuntimeWarning: Invalid value encountered in percentile\n",
" interpolation=interpolation)\n"
]
}
],
"source": [
"# tmp\n",
"print('percentile: {}'.format(np.percentile(train_df['Age'], 25)))\n",
"print('nanpercentile: {}'.format(np.nanpercentile(train_df['Age'], 25)))\n",
"# np.percentile() does not skip NaN: with NaN in the input it returns nan (see the output). To compute a percentile while ignoring NaN, use np.nanpercentile()."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"outliers count: 11\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Cabin</th>\n",
" <th>Embarked</th>\n",
" <th>Fare</th>\n",
" <th>Name</th>\n",
" <th>Parch</th>\n",
" <th>PassengerId</th>\n",
" <th>Pclass</th>\n",
" <th>Sex</th>\n",
" <th>SibSp</th>\n",
" <th>Survived</th>\n",
" <th>Ticket</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>745</th>\n",
" <td>70.0</td>\n",
" <td>B22</td>\n",
" <td>S</td>\n",
" <td>71.00</td>\n",
" <td>Crosby, Capt. Edward Gifford</td>\n",
" <td>1</td>\n",
" <td>746</td>\n",
" <td>1</td>\n",
" <td>male</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>WE/P 5735</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>19.0</td>\n",
" <td>C23 C25 C27</td>\n",
" <td>S</td>\n",
" <td>263.00</td>\n",
" <td>Fortune, Mr. Charles Alexander</td>\n",
" <td>2</td>\n",
" <td>28</td>\n",
" <td>1</td>\n",
" <td>male</td>\n",
" <td>3</td>\n",
" <td>0.0</td>\n",
" <td>19950</td>\n",
" </tr>\n",
" <tr>\n",
" <th>88</th>\n",
" <td>23.0</td>\n",
" <td>C23 C25 C27</td>\n",
" <td>S</td>\n",
" <td>263.00</td>\n",
" <td>Fortune, Miss. Mabel Helen</td>\n",
" <td>2</td>\n",
" <td>89</td>\n",
" <td>1</td>\n",
" <td>female</td>\n",
" <td>3</td>\n",
" <td>1.0</td>\n",
" <td>19950</td>\n",
" </tr>\n",
" <tr>\n",
" <th>159</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Master. Thomas Henry</td>\n",
" <td>2</td>\n",
" <td>160</td>\n",
" <td>3</td>\n",
" <td>male</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>180</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Miss. Constance Gladys</td>\n",
" <td>2</td>\n",
" <td>181</td>\n",
" <td>3</td>\n",
" <td>female</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>201</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Mr. Frederick</td>\n",
" <td>2</td>\n",
" <td>202</td>\n",
" <td>3</td>\n",
" <td>male</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>324</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Mr. George John Jr</td>\n",
" <td>2</td>\n",
" <td>325</td>\n",
" <td>3</td>\n",
" <td>male</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>341</th>\n",
" <td>24.0</td>\n",
" <td>C23 C25 C27</td>\n",
" <td>S</td>\n",
" <td>263.00</td>\n",
" <td>Fortune, Miss. Alice Elizabeth</td>\n",
" <td>2</td>\n",
" <td>342</td>\n",
" <td>1</td>\n",
" <td>female</td>\n",
" <td>3</td>\n",
" <td>1.0</td>\n",
" <td>19950</td>\n",
" </tr>\n",
" <tr>\n",
" <th>792</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Miss. Stella Anna</td>\n",
" <td>2</td>\n",
" <td>793</td>\n",
" <td>3</td>\n",
" <td>female</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>846</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Mr. Douglas Bullen</td>\n",
" <td>2</td>\n",
" <td>847</td>\n",
" <td>3</td>\n",
" <td>male</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>863</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" <td>69.55</td>\n",
" <td>Sage, Miss. Dorothy Edith \"Dolly\"</td>\n",
" <td>2</td>\n",
" <td>864</td>\n",
" <td>3</td>\n",
" <td>female</td>\n",
" <td>8</td>\n",
" <td>0.0</td>\n",
" <td>CA. 2343</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Age Cabin Embarked Fare Name \\\n",
"745 70.0 B22 S 71.00 Crosby, Capt. Edward Gifford \n",
"27 19.0 C23 C25 C27 S 263.00 Fortune, Mr. Charles Alexander \n",
"88 23.0 C23 C25 C27 S 263.00 Fortune, Miss. Mabel Helen \n",
"159 NaN NaN S 69.55 Sage, Master. Thomas Henry \n",
"180 NaN NaN S 69.55 Sage, Miss. Constance Gladys \n",
"201 NaN NaN S 69.55 Sage, Mr. Frederick \n",
"324 NaN NaN S 69.55 Sage, Mr. George John Jr \n",
"341 24.0 C23 C25 C27 S 263.00 Fortune, Miss. Alice Elizabeth \n",
"792 NaN NaN S 69.55 Sage, Miss. Stella Anna \n",
"846 NaN NaN S 69.55 Sage, Mr. Douglas Bullen \n",
"863 NaN NaN S 69.55 Sage, Miss. Dorothy Edith \"Dolly\" \n",
"\n",
" Parch PassengerId Pclass Sex SibSp Survived Ticket \n",
"745 1 746 1 male 1 0.0 WE/P 5735 \n",
"27 2 28 1 male 3 0.0 19950 \n",
"88 2 89 1 female 3 1.0 19950 \n",
"159 2 160 3 male 8 0.0 CA. 2343 \n",
"180 2 181 3 female 8 0.0 CA. 2343 \n",
"201 2 202 3 male 8 0.0 CA. 2343 \n",
"324 2 325 3 male 8 0.0 CA. 2343 \n",
"341 2 342 1 female 3 1.0 19950 \n",
"792 2 793 3 female 8 0.0 CA. 2343 \n",
"846 2 847 3 male 8 0.0 CA. 2343 \n",
"863 2 864 3 female 8 0.0 CA. 2343 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print('outliers count: %d' % len(Outliers_to_drop))\n",
"dataset_df.loc[Outliers_to_drop] # Show the outliers rows"
]
},
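{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the result above, for the rows where Age is not missing, the large Fare value seems to be what gets them flagged.  \n",
"Nothing looks obviously wrong, though, so perhaps no action is needed this time? (This also takes the dataset_df.describe() summary into account.)"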
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PassengerId 0\n",
"Survived 0\n",
"Pclass 0\n",
"Name 0\n",
"Sex 0\n",
"Age 177\n",
"SibSp 0\n",
"Parch 0\n",
"Ticket 0\n",
"Fare 0\n",
"Cabin 687\n",
"Embarked 2\n",
"dtype: int64"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count the missing values in each column.\n",
"# (In the combined dataset_df, the rows with Survived missing come from test_df.)\n",
"train_df.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>PassengerId</th>\n",
" <th>Survived</th>\n",
" <th>Pclass</th>\n",
" <th>Age</th>\n",
" <th>SibSp</th>\n",
" <th>Parch</th>\n",
" <th>Fare</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>891.000000</td>\n",
" <td>891.000000</td>\n",
" <td>891.000000</td>\n",
" <td>714.000000</td>\n",
" <td>891.000000</td>\n",
" <td>891.000000</td>\n",
" <td>891.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>446.000000</td>\n",
" <td>0.383838</td>\n",
" <td>2.308642</td>\n",
" <td>29.699118</td>\n",
" <td>0.523008</td>\n",
" <td>0.381594</td>\n",
" <td>32.204208</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>257.353842</td>\n",
" <td>0.486592</td>\n",
" <td>0.836071</td>\n",
" <td>14.526497</td>\n",
" <td>1.102743</td>\n",
" <td>0.806057</td>\n",
" <td>49.693429</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.420000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>223.500000</td>\n",
" <td>0.000000</td>\n",
" <td>2.000000</td>\n",
" <td>20.125000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>7.910400</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>446.000000</td>\n",
" <td>0.000000</td>\n",
" <td>3.000000</td>\n",
" <td>28.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>14.454200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>668.500000</td>\n",
" <td>1.000000</td>\n",
" <td>3.000000</td>\n",
" <td>38.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>31.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>891.000000</td>\n",
" <td>1.000000</td>\n",
" <td>3.000000</td>\n",
" <td>80.000000</td>\n",
" <td>8.000000</td>\n",
" <td>6.000000</td>\n",
" <td>512.329200</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Survived Pclass Age SibSp \\\n",
"count 891.000000 891.000000 891.000000 714.000000 891.000000 \n",
"mean 446.000000 0.383838 2.308642 29.699118 0.523008 \n",
"std 257.353842 0.486592 0.836071 14.526497 1.102743 \n",
"min 1.000000 0.000000 1.000000 0.420000 0.000000 \n",
"25% 223.500000 0.000000 2.000000 20.125000 0.000000 \n",
"50% 446.000000 0.000000 3.000000 28.000000 0.000000 \n",
"75% 668.500000 1.000000 3.000000 38.000000 1.000000 \n",
"max 891.000000 1.000000 3.000000 80.000000 8.000000 \n",
"\n",
" Parch Fare \n",
"count 891.000000 891.000000 \n",
"mean 0.381594 32.204208 \n",
"std 0.806057 49.693429 \n",
"min 0.000000 0.000000 \n",
"25% 0.000000 7.910400 \n",
"50% 0.000000 14.454200 \n",
"75% 0.000000 31.000000 \n",
"max 6.000000 512.329200 "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"### Summarize data\n",
"# Summary and statistics\n",
"train_df.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### feature analysis"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x105696cc0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Correlation matrix between numerical values (SibSp Parch Age and Fare values) and Survived \n",
"g = sns.heatmap(dataset_df[['Survived', 'SibSp', 'Parch', 'Age', 'Fare']].corr(), annot=True, fmt='.2f', cmap='coolwarm')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Only Fare feature seems to have a significant correlation with the survival probability.\n",
"> \n",
"> It doesn't mean that the other features are not useful. Subpopulations in these features can be correlated with the survival. To determine this, we need to explore in detail these features\n",
"\n",
"The reference kernel says the above, but why do features that prove informative on closer inspection look uncorrelated when the correlation coefficient is computed?"
]
},
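{
"cell_type": "markdown",
"metadata": {},
"source": [
"One likely answer to the question above, sketched on hypothetical toy data (the `toy` frame below is made up, not taken from the Titanic data): the Pearson coefficient only measures linear association, so a feature whose effect is not monotonic can score close to zero even though its group means differ clearly. Here the outcome is 1 only for the middle values of `x`, and the coefficient comes out as zero up to floating-point rounding."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: y is 1 only for the middle values of x,\n",
"# so the relationship is strong but not linear\n",
"toy = pd.DataFrame({\n",
"    'x': [0, 1, 2, 3, 4],\n",
"    'y': [0, 1, 1, 1, 0],\n",
"})\n",
"\n",
"# Pearson correlation (the default of Series.corr) sees no linear trend here\n",
"print(toy['x'].corr(toy['y']))\n",
"\n",
"# The per-group means expose the structure that the coefficient hides\n",
"print(toy.groupby('x')['y'].mean())"
]
},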
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### SibSp"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10bcb4e48>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bd4c630>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore SibSp feature vs Survived\n",
"g = sns.factorplot(x='SibSp', y='Survived', data=train_df, kind='bar', size=6, palette = 'muted')\n",
"g.despine(left=True)\n",
"g.set_ylabels('survival probability')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> It seems that passengers having a lot of siblings/spouses have less chance to survive\n",
"> \n",
"> Single passengers (0 SibSp) or with two other persons (SibSp 1 or 2) have more chance to survive\n",
"> \n",
"> This observation is quite interesting, we can consider a new feature describing these categories (See feature engineering)\n",
"\n",
"Having one or two siblings/spouses aboard seems to have gone with a higher survival rate."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Parch"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10bd94b70>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bd94da0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Parch feature vs Survived\n",
"g = sns.factorplot(x='Parch', y='Survived', data=train_df, kind='bar', size=6, palette='muted')\n",
"g.despine(left=True)\n",
"g.set_ylabels('survival probability')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Small families have more chance to survive, more than single (Parch 0), medium (Parch 3,4) and large families (Parch 5,6).\n",
"> \n",
"> Be careful: there is an important standard deviation in the survival of passengers with 3 parents/children\n",
"\n",
"Having just a few family members aboard went with a higher survival rate."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Age"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10be22668>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10be0fd30>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Age vs Survived\n",
"g = sns.FacetGrid(train_df, col='Survived')\n",
"g.map(sns.distplot, 'Age')"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.legend.Legend at 0x10bf73550>"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bf73a20>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Age distribution\n",
"g = sns.kdeplot(train_df.Age[(train_df.Survived == 0) & (train_df.Age.notnull())], color='Red', shade=True)\n",
"g = sns.kdeplot(train_df.Age[(train_df.Survived == 1) & (train_df.Age.notnull())], color='Blue', shade=True, ax=g)\n",
"g.set_xlabel('Age')\n",
"g.set_ylabel('Frequency')\n",
"g.legend(['Not Survived','Survived'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> When we superimpose the two densities, we clearly see a peak (between 0 and 5) corresponding to babies and very young children."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.legend.Legend at 0x10bf66208>"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bef2208>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Fare distribution\n",
"g = sns.distplot(dataset_df.Fare.dropna(), color='m', label='Skewness: %.2f' % (dataset_df.Fare.skew()))\n",
"g.legend(loc='best')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> As we can see, Fare distribution is very skewed. This can lead to overweighting very high values in the model, even if it is scaled.\n",
"> \n",
"> In this case, it is better to transform it with the log function to reduce this skew.\n",
"\n",
"With a heavily skewed distribution, might extreme values have too much influence compared with the other features?  \n",
"For that reason, we apply a log transform or similar to reduce the skew."
]
},
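{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small sketch of the idea, assuming a made-up right-skewed sample in place of Fare (the values in `fare` are hypothetical): comparing the skewness before and after the log mapping used in this notebook shows how the transform pulls in the long right tail. The log skewness should come out much closer to zero than the raw one."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical right-skewed values standing in for Fare\n",
"fare = pd.Series([0.0, 7.25, 7.9, 8.05, 13.0, 26.0, 31.0, 71.3, 263.0, 512.3])\n",
"\n",
"# Same mapping as in this notebook: log for positive values, 0 otherwise\n",
"log_fare = fare.map(lambda v: np.log(v) if v > 0 else 0)\n",
"\n",
"# Compare the sample skewness before and after the transform\n",
"print('raw skewness: %.2f' % fare.skew())\n",
"print('log skewness: %.2f' % log_fare.skew())"
]
},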
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.legend.Legend at 0x10bcb4908>"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bcb4940>"
]
},
"metadata": {},
"output_type": "display_data"
}
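],
"source": [
"# Label the plot with the skewness of the log-transformed values, not of the raw Fare\n",
"log_fare = dataset_df.Fare.map(lambda i: np.log(i) if i > 0 else 0)\n",
"g = sns.distplot(log_fare, color='b', label='Skewness: %.2f' % (log_fare.skew()))\n",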
"g.legend(loc='best')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Categorical values"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Sex"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0,0.5,'Survived Probability')"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bf03550>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Sex vs Survived\n",
"g = sns.barplot(x='Sex', y='Survived', data=train_df)\n",
"g.set_ylabel('Survived Probability')"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Survived</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Sex</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>female</th>\n",
" <td>0.742038</td>\n",
" </tr>\n",
" <tr>\n",
" <th>male</th>\n",
" <td>0.188908</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Survived\n",
"Sex \n",
"female 0.742038\n",
"male 0.188908"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train_df[['Sex', 'Survived']].groupby('Sex').mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> It is clearly obvious that Male have less chance to survive than Female.\n",
"> \n",
"> So Sex, might play an important role in the prediction of the survival.\n",
"> \n",
"> For those who have seen the Titanic movie (1997), I am sure, we all remember this sentence during the evacuation : \"Women and children first\"."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### PClass"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10c0f7240>"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10c09a710>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Pclass vs Survived\n",
"g = sns.factorplot(x='Pclass', y='Survived', data=train_df, kind='bar', size=6, palette='muted')\n",
"g.despine(left=True)\n",
"g.set_ylabels('Survived Probability')"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10bfec278>"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10c120898>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Pclass vs Survived vs Sex\n",
"g = sns.factorplot(x='Pclass', y='Survived', hue='Sex', data=train_df, kind='bar', size=6, palette='muted')\n",
"g.despine(left=True)\n",
"g.set_ylabels('Survived Probability')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> The passenger survival is not the same in the 3 classes. First class passengers have more chance to survive than second class and third class passengers.\n",
"> \n",
"> This trend is conserved when we look at both male and female passengers."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Embarked"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10c120198>"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10be09978>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"g = sns.factorplot(x='Embarked', y='Survived', data=train_df, size=6, kind='bar', palette='muted')\n",
"g.despine(left=True).set_ylabels('Survived Probability')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> It seems that passenger coming from Cherbourg (C) have more chance to survive.\n",
"> \n",
"> My hypothesis is that the proportion of first class passengers is higher for those who came from Cherbourg than Queenstown (Q), Southampton (S).\n",
"> \n",
"> Let's see the Pclass distribution vs Embarked"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10c0f3860>"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10c3a1e80>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Explore Embarked vs Pclass\n",
"g = sns.factorplot('Pclass', col='Embarked', data=train_df, size=6, kind='count', palette='muted')\n",
"g.despine(left=True).set_ylabels('Count')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Indeed, the third class is the most frequent for passenger coming from Southampton (S) and Queenstown (Q), whereas Cherbourg passengers are mostly in first class which have the highest survival rate.\n",
"> \n",
"> At this point, i can't explain why first class has an higher survival rate. My hypothesis is that first class passengers were prioritised during the evacuation due to their influence."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Filling missing values"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Fare"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.Fare.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"# Fill Fare missing values with the median value\n",
"# テストデータを使って中央値を計算しないように注意.\n",
"dataset_df.Fare = dataset_df.Fare.fillna(train_df.Fare.median())"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"# Apply log to Fare to reduce skewness distribution\n",
"# 場所がわかりにくいので位置を変えたほうがいいか?でも欠損値を先に埋めないとこの処理によって欠損値の部分に0が入ってしまう.\n",
"dataset_df.Fare = dataset_df.Fare.map(lambda i: np.log(i) if i > 0 else 0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Embarked"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# help(train_df.groupby('Embarked').size())"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Embarked\n",
"C 168\n",
"Q 77\n",
"S 644\n",
"dtype: int64"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# train_df.groupby('Embarked').size().idxmax() でデータ数が一番多いindexを取得できる.今回の場合は'S'\n",
"train_df.groupby('Embarked').size()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.Embarked.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"# 欠損値の数が少ないので,一番カテゴリ数が多い値で欠損値を埋めておく (今回のケースでは'S')\n",
"dataset_df.Embarked.fillna(train_df.groupby('Embarked').size().idxmax(), inplace=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Age "
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"# 前回同様,敬称からAgeの欠損値を予測してみる\n",
"# Nameから敬称を抽出する.データを見る限り,'名字 敬称 名前'の順になっていたのでそのルールに沿って敬称を取得する\n",
"dataset_df_title = [i.split(',')[1].split('.')[0].strip() for i in dataset_df.Name]\n",
"dataset_df['Title'] = pd.Series(dataset_df_title)\n",
"# dataset_df.columns"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10bca3470>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 敬称の頻度分布を可視化\n",
"g = sns.countplot(x='Title', data=dataset_df)\n",
"g = plt.setp(g.get_xticklabels(), rotation=45) \n",
"# dataset_df.Title"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"# 敬称をカテゴリデータに変換する.本当はonehot-vectorの形にしたほうがよさそう\n",
"dataset_df.Title = dataset_df.Title.replace(['Lady', 'the Countess', 'Countess', 'Capt', 'Col', 'Don', 'Dr', 'Major', 'Rev', 'Sir', 'Jonkheer', 'Dona'], 'Rare')\n",
"dataset_df.Title = dataset_df.Title.map({'Master': 0, 'Miss': 1, 'Mme': 1, 'Mlle': 1, 'Ms' : 2, 'Mrs': 2, 'Mr': 3, 'Rare': 4})\n",
"dataset_df.Title = dataset_df.Title.astype(int)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEFCAYAAADuT+DpAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAFRdJREFUeJzt3XuQXGWZx/HvTCCgkOCiaHABQdx9LF2Lu4kIZMBguKhRLFcEV4mFLGtQULfwFougsrUKYgHCwnKLl4ByUVQEjCUkhnBV2RUKeJC7KJQEDImwEJPM/nHOmM4wGXpCTncm7/dTleL0e053P30407/znsvbPf39/UiSytXb7QIkSd1lEEhS4QwCSSqcQSBJhduo2wWMVERsAuwBPAqs6HI5kjRajAG2Bm7NzOdaZ4y6IKAKgQXdLkKSRqm9getbG0ZjEDwKMGfOHCZMmNDtWiRpVHjsscc4/PDDof4ObTUag2AFwIQJE9hmm226XYskjTbPO6TuyWJJKpxBIEmFMwgkqXAGgSQVziCQpMIZBJJUOINAkgpnEEhS4UbjDWWS1tJ/fOHSbpfQiM+f9L5ulzCq2SOQpMIZBJJUOINAkgpnEEhS4QwCSSqcQSBJhTMIJKlwBoEkFc4gkKTCGQSSVDiDQJIKZxBIUuEaG3QuIo4AjqgfbgrsDPQBpwHLgbmZeWJE9AJnATsBzwFHZua9TdUlSVpdY0GQmbOB2QARcSZwAXA28F7gfuCnEbELsAOwaWa+JSImAV8HpjVVlyRpdY0fGoqI3YE3At8DNsnM+zKzH/gZMAXYC7gGIDNvAnZvuiZJ0iqdOEfweeBEYDywpKV9KbBF3f5US/uKiPB3EiSpQxoNgoh4GRCZeR1VCIxrmT0OWDxEe29mLm+yLknSKk33CPYBfgGQmUuAZRGxY0T0AFOBBcBC4CCA+hzB7Q3XJElq0fQhmKA6MTzgaGAOMIbqqqGbI+JWYP+IuAHoAaY3XJMkqUWjQZCZJw96fBMwaVDbSqqAkCR1gTeUSVLhDAJJKpxBIEmFMwgkqXAGgSQVziCQpMIZBJJUOINAkgpnEEhS4QwCSSqcQSBJhTMIJKlwBoEkFc4gkKTCGQSSVDiDQJIKZxBIUuEMAkkqnEEgSYVr9DeLI+JzwLuAscBZwHxgNtAP3AHMyMyVEXECcDCwHDguM29psi5J0iqN9Qgiog/YE3grMBnYFjgVmJmZewM9wLSI2LWePxE4FDizqZokSc/X5KGhqcDtwA+BnwBXArtR9QoArgamAHsBczOzPzMfBjaKiK0arEuS1KLJQ0OvAF4DvAPYAfgx0JuZ/fX8pcAWwHjgiZbnDbQ/3mBtkqRak0HwBHB3Zi4DMiKepTo8NGAcsBhYUk8PbpckdUCTh4auBw6IiJ6IeDWwGfCL+twBwIHAAmAhMDUieiNiO6pew6IG65IktWisR5CZV0bEPsAtVIEzA3gAODcixgJ3AZdl5oqIWADc2LKcJKlDGr18NDOPH6J58hDLzQJmNVmLJGlo3lAmSYUzCCSpcAaBJBXOIJCkwhkEklQ4g0CSCmcQSFLhDAJJKpxBIEmFMwgkqXAGgSQVziCQpMIZBJJUOINAkgpnEEhS4QwCSSqcQSBJhTMIJKlwBoEkFa7R3yyOiN8AS+qHDwDnAKcBy4G5mXliRPQCZwE7Ac8BR2bmvU3WJUlapbEgiIhNgZ7M7Gtp+x/gvcD9wE8jYhdgB2DTzHxLREwCvg5Ma6ouSdLqmuwR7AS8NCLm1u8zC9gkM+8DiIifAVOArYFrADLzpojYvcGaJEmDNHmO4BngFGAqcDRwYd02YCmwBTAeeKqlfUVENHrISpK0SpNfuPcA92ZmP3BPRDwFbNkyfxywGHhpPT2gNzOXN1iXJKlFkz2Cj1Ad7yciXk31hf90ROwYET1UPYUFwELgoHq5ScDtDdYkSRqkyR7B+cDsiLge6KcKhp
XAHGAM1VVDN0fErcD+EXED0ANMb7AmSdIgjQVBZi4DDhti1qRBy62kOocgSeoCbyiTpMIZBJJUOINAkgpnEEhS4QwCSSqcQSBJhTMIJKlwBoEkFc4gkKTCGQSSVDiDQJIKZxBIUuHaCoKIOGOItm+t+3IkSZ027OijEXEe8Fpg94h4Y8usjal+XUySNMq90DDUXwG2B04DTmxpXw7c1VBNkqQOGjYIMvNB4EFgp4gYT9UL6Klnbw482WRxkqTmtfXDNBHxOeBzwBMtzf1Uh40kSaNYu79QdiSwY2Y+3mQxkqTOa/fy0YfxMJAkbZDa7RH8Drg+Iq4Dnh1ozMwvNVKVJKlj2g2CP9T/YNXJ4hcUEa8Efg3sT3Wl0Wyqcwt3ADMyc2VEnAAcXM8/LjNvaff1JUkvXltBkJknvvBSq4uIjYFzgP+rm04FZmbmvIg4G5gWEQ8Bk4GJwLbA5cAeI30vSdLaa/eqoZVUe/Kt/piZ2w7ztFOAs6muNgLYDZhfT18NvB1IYG5m9gMPR8RGEbGVJ6UlqXPaOlmcmb2ZOSYzxwCbAocCl65p+Yg4Ang8M3/W0txTf+EDLKW6J2E88FTLMgPtkqQOafccwd9k5l+BSyPiC8Ms9hGgPyKmADsD3wZe2TJ/HLAYWFJPD26XJHVIu4eGPtTysAd4I7BsTctn5j4tz50HHA2cHBF9mTkPOBC4DrgX+FpEnAJsA/Rm5qIRfgZJ0ovQbo9g35bpfmAR8P4RvtengXMjYizVOEWXZeaKiFgA3Eh1mGrGCF9TkvQitXvV0PT6KqCon3NHZi5v87l9LQ8nDzF/FjCrndeSJK177f4ewW5UN5V9C7iQ6gqfiU0WJknqjHYPDZ0OvD8zbwaIiEnAGcCbmypMktQZ7Y41tPlACABk5k1Ul5FKkka5doPgyYiYNvAgIt7N6kNSS5JGqXYPDR0FXBkR51NdPtoP7NlYVZKkjmm3R3Ag8AzwGqpLSR8H+hqqSZLUQe0GwVHAWzPz6cz8LdW4QR9vrixJUqe0GwQbs/qdxMt4/iB0kqRRqN1zBFcA10bEJfXjQ4AfNVOSJKmT2h199DNU9xIE1Q/Wn56ZX2yyMElSZ7Q9+mhmXgZc1mAtkqQuaPccgSRpA2UQSFLhDAJJKpxBIEmFMwgkqXAGgSQVziCQpMIZBJJUuLZvKBupiBgDnEt1N3I/cDTwLDC7fnwHMCMzV0bECcDBwHLguMy8pam6JEmra7JH8E6AzHwrMBM4CTgVmJmZe1P9rsG0iNiV6kftJwKHAmc2WJMkaZDGgiAzr6Aavhqq3zFYTDV89fy67WpgCrAXMDcz+zPzYWCjiNiqqbokSatr9BxBZi6PiG9R/dD9HKAnMweGr14KbAGMB55qedpAuySpAxo/WZyZHwb+kep8wUtaZo2j6iUsqacHt0uSOqCxIIiIf4mIz9UPnwFWAr+KiL667UBgAbAQmBoRvRGxHdCbmYuaqkuStLrGrhoCfgBcGBG/pPqFs+OAu4BzI2JsPX1ZZq6IiAXAjVTBNKPBmiRJgzQWBJn5NPDPQ8yaPMSys4BZTdUiSVozbyiTpMIZBJJUOINAkgpnEEhS4QwCSSpck5ePqouOuPDYbpfQiNnTT+t2CdIGxx6BJBXOIJCkwhkEklQ4g0CSCmcQSFLhDAJJKpxBIEmFMwgkqXAGgSQVziCQpMIZBJJUOINAkgpnEEhS4RoZfTQiNgYuALYHNgG+AtwJzAb6gTuAGZm5MiJOAA4GlgPHZeYtTdQkSRpaUz2CDwJPZObewAHAN4FTgZl1Ww8wLSJ2pfox+4nAocCZDdUjSVqDpoLgUuCL9XQP1d7+bsD8uu1qYAqwFzA3M/sz82Fgo4jYqqGaJElDaOTQUGb+BSAixgGXATOBUzKzv15kKbAFMB54ouWpA+2PN1GXynTVh6Z3u4RGHPTtC7tdgjYQjZ0sjohtgeuA72TmRcDKltnjgMXAknp6cLskqUMaCYKIeBUwF/
hMZl5QN98WEX319IHAAmAhMDUieiNiO6A3Mxc1UZMkaWhN/Wbx54G/A74YEQPnCo4FTo+IscBdwGWZuSIiFgA3UoXSjIbqkSStQVPnCI6l+uIfbPIQy84CZjVRhyTphXlDmSQVziCQpMIZBJJUOINAkgpnEEhS4QwCSSqcQSBJhTMIJKlwBoEkFc4gkKTCGQSSVDiDQJIKZxBIUuEMAkkqnEEgSYUzCCSpcAaBJBXOIJCkwhkEklQ4g0CSCtfIj9cPiIiJwFczsy8iXgfMBvqBO4AZmbkyIk4ADgaWA8dl5i1N1iRJWl1jPYKIOB44D9i0bjoVmJmZewM9wLSI2BWYDEwEDgXObKoeSdLQmjw0dB9wSMvj3YD59fTVwBRgL2BuZvZn5sPARhGxVYM1SZIGaSwIMvNy4K8tTT2Z2V9PLwW2AMYDT7UsM9AuSeqQTp4sXtkyPQ5YDCyppwe3S5I6pJNBcFtE9NXTBwILgIXA1IjojYjtgN7MXNTBmiSpeI1eNTTIp4FzI2IscBdwWWauiIgFwI1UoTSjg/VIkmg4CDLzQWBSPX0P1RVCg5eZBcxqsg5J0pp5Q5kkFc4gkKTCGQSSVDiDQJIKZxBIUuE6efloRxx2/Jxul7DOXfS1w7tdgqQNmD0CSSqcQSBJhTMIJKlwBoEkFc4gkKTCGQSSVDiDQJIKZxBIUuE2uBvKJKkdv7xyVrdLaMQ+75g14ufYI5CkwhkEklQ4g0CSCmcQSFLh1ouTxRHRC5wF7AQ8BxyZmfd2typJKsP60iN4N7BpZr4F+Czw9S7XI0nFWC96BMBewDUAmXlTROw+zLJjAB577LEhZz73zOJ1Xly3PfLIIyN+zrOLn2mgku5bm3Xx5HPPNlBJ963NuvjL039uoJLuW5t1sejJvzRQSfetaV20fGeOGTyvp7+/v8GS2hMR5wGXZ+bV9eOHgddm5vIhlt0LWNDhEiVpQ7F3Zl7f2rC+9AiWAONaHvcOFQK1W4G9gUeBFU0XJkkbiDHA1lTfoatZX4JgIfBO4JKImATcvqYFM/M54Po1zZckrdF9QzWuL0HwQ2D/iLgB6AGmd7keSSrGenGOQJLUPevL5aOSpC4xCCSpcAaBJBVufTlZ3DUR0QdcB3wgM7/X0v5b4DeZecQIXuuYzPzmOi/yhd+3j2E+AzA+Mw8Z4WvuSnXH93b1f19VX7E1MO/XwL6ZOe9F1N3WewDbA68Hzga+l5mT1vY926ipj7XYHiLiNOA0qqsyzsnMo1vmnQ68KzO3b6rubluXf0ejSf25LwHuBPqB8cD9wOGZuayLpY2IPYLK3cChAw8i4k3AZmvxOjPXWUUjt8bPMNIQqL0DuLKefhQ4sGXe4VQb+4vVifdYG2uzPbw2M+8HngD2iYiN6ueOAfZoqtD1zLr6Oxptrs3MvszcNzN3A/4KvKvbRY1E8T2C2v8CERFbZOZTwAeBOcB2EXEMcAjVBr0IeA/VHuqFwHKqMD0M+BCwZUScBRxLtff6D/X8mZk5LyLuAO4BlmXmoaxbw32GxzJzQkR8DPgwsBK4NTM/ERGHAJ+h2nj/CByamSuB3YEv1699MfAB4Ip6gMBdqW9KiYgjqO4BeQnVzSqnAdOAfwL+PTN/FBHvAz5FdQPg9Zn52fp123qPoUTEZOCk+jXvA/41M/+6dqvueYZblxcCr6s/72mZ+Z2IeANwV/3c5cA8YH/gauDtwM+ptg8iYh7wJ2BLYAZwAS3bUWb+fh19hm4Ybr09RBUUd2bmJ7tZZJMiYizV38Gf6xETtq0f/zgzZ0bEbODl9b+DgeOpbpAdA5yamZd2o257BKtcDhwSET3Am4EbqNbPy4EpmTmRKjj3oPojvwWYApwAbJGZJwFPZubHgCOBRZm5D9WX4pn1e2wOfLmBEBjuM7SaDhxTD+53V73X+gHg5Mzci2rvfHxEvAr4U2YOXFt8C/D6iNgM2I/qEECrcZl5EPBV4N+ogvMoYHpEbAmcCLytfo
+/j4j91+I9/qb+fOcCh2TmZOAPwBHtr6a2DLUuxwH71J/vAFbd2d7aswG4iFV7xodRfRm2ujgzp1BtP6ttR+v4M3TDmrbBbamCbkMMgf0iYl5E3El1KPaHVDsnN2XmVKr1cHTL8tdm5p7AJGCH+u9iX+ALEfGyDtcOGAStBv5492HVWEYrgWXAxRFxPrANsDFwPrCYaqC8Y6j26Fq9CTio3vu7HNgoIl5Rz8sOf4ZW04EZETEfeA3VzXufotqQ5wN7Un3mg4GrBj33R1Shdhjw3UHzbqv/uxi4q/5y/zOwKdXe81bAVfX6eAOw41q8R6utqPayLqlf8+3151mXhlqXS4HjgP8Gvg9sUrfvSXV3/ICFwC4RMbDn99Cg1x7YBl5oOxqN1rQNLsrMJ7pTUuOuzcw+qj37ZcADwJPAHhExB/gGq7YVWPX//03AbvU2fA3Vd8v2nSl5dQZBrT6+uxnwCVZ9CY0H3p2Z7wc+TrW+eqi+rBZk5tuAS6kOrVDPg6oLfHG9cRxYL/NkPW9lhz9Dq48CR9d70btQfYEdBcyq23qoDn3tD8wd9NyLqA5vbF2/T6vh7kp8APg9sH+9Ps4AblqL92i1CHgEmFa/5knAtcMsP2JrWJdbA7tl5nuoguxrEbEVsCQzV7Q8t58q5P4LuGKIlx/YBta0HY1aw2yDjW3364s66D4InAd8ElicmYdTDav/0rqXBKvWxd3AdfU2vB/VSechh4BomkGwuu8D22bmPfXj5cDTEbGQ6jjvo8CrgV8BX4qIa6m6fGfUy98ZEd8FzqE6zDGfqmv8UH3cvRufodXtwIK67j8BN1MdmrgyIn4BTKDaMxmbmauN0ZuZd1Ptif9kJMVk5uPAqcD8iLiZKhjvfzHvUa/LY4Gf1sOSfAy4YyR1tWnwunwMmFC/58+BU6gC7ZohnjuH6oThcMd817QdjXbDbYMbtMy8Ezid6hzZARHxS6odgt9RfXe0+gnwl4hYQHWFXH9mLu1kvQMcYkKSCmePQJIKZxBIUuEMAkkqnEEgSYUzCCSpcA4xIbUhIs4E3gqMpbpJ7s561jlUl/2dXQ8/MSszH4qIB4G+zHywC+VKI2IQSG3IzBkAEbE9MC8zdx5isX2phtOQRhWDQHoRImJWPfks1Q1DV0XE3i3zxwAnA31UA4vNzsxvdLhMaVieI5DWgcz8T6rRWw8aNKbOR+v5u1INPjatNSik9YE9AqlZU4CdI2K/+vHmVIONDTUooNQVBoHUrDHA8Zn5A4B6FNqnu1uStDoPDUnrznKev3N1LfDRiNg4IjYHrgcmdrwyaRj2CKR150qqk8VTW9oGfqnuNqq/twtfzO88S01w9FFJKpyHhiSpcAaBJBXOIJCkwhkEklQ4g0CSCmcQSFLhDAJJKtz/AwaQAyuo0lj3AAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10c1ba2b0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"g = sns.countplot(dataset_df.Title)\n",
"g = g.set_xticklabels(['Master', 'Miss/Mme/Mlle', 'Ms/Mrs', 'Mr', 'Rare'])"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAARgAAAEYCAYAAACHjumMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAFMpJREFUeJzt3X+UHWV9x/H37iYxKSEiYsVfKD3qtz8UVFCCBQUWi6RqqKetCmpNC4iKWLFVazkSVKhHS7VrpdhiwVarRZGSUItVfhsaSsUf0MAXo0JFwIoICCxZkt3+MbN6s+zenezus3fv5f06Jyd3Zu6P7yR3P/vMzDPP0zc2NoYkldDf6QIk9S4DRlIxBoykYgwYScUs6nQBTUTEIuDJwK2ZubXT9UhqpisChipcvn/xxRd3ug5Jk+ubbKWHSJKKMWAkFWPASCrGgJFUjAEjqRgDRlIxBoykYgwYScUYMJKKMWAkFWPAaN4NDQ0xODjI0NBQp0tRYQaM5tXw8DDr1q0DYP369QwPD3e4IpVkwGhejYyMMD5M6+joKCMjIx2uSCUZMJKKMWAkFWPASCrGgJFUjAEjqRgDRlIxBoykYgwYScUYMJKKMWAkFWPASCrGgJFUTLGZHSOiHzgD2BvYAhydmZtbtr8DOBIYBU7LzPNL1SKpM0q2YI4Almbm/sC7gdPHN0TELsDbgP2B3wI+WrAOSR1SMmAOAC4CyMyNwL4t2+4HbgF2qv+MFqxDUoeUDJgVwD0ty9siovWQ7AfAJuBawKHNpB5U7BwMcC+wc8tyf2ZurR8fDjwB2LNe/nJEbMjM/ypYj2bpS69fM+v3GN62bbvlr775rSwbGJjx+636x7NnW5IKKtmC2QCsAoiIlcB1Ldt+CgwDWzLzQeBuYJeCtUjqgJItmPOBl0TEVUAfsCYiTgQ2Z+a6iDgU2BgRo8DXgK8UrEVSBxQLmMwcBY6bsPrGlu0nAyeX+nxJnWdHO0nFGDCSijFgupgTmGmhM2C6lBOYqRsYMF3KCczUDQwYScUYMJKKMWAkFWPASCrGgJFUjAEjqRgDRlIxBoykYh7xAWN3e6mcR3TA2N1eKusRHTB2t59/A319P3/cN2FZvecRHTCaf0v6+3nOTssB2Hun5Szp9yvYy0oOmSlNanCXXRncZddOl6F54K8PScUYMJKK8RCpA95w9ttm/R7btmzdbvn4z76HgUfN/L/znDV/PduSpIexBSOpGANGUjEGjKRiDBhJxRgwkooxYCQVY8BIKsaAkVSMASOpGANGUjEGjKRiDBhJxRgwkooxYCQVY8BIKsaAkVSMAdOl+vpbRuPvm7AsLRAGTJfqXzzA8mdWA2cvf8au9C8e6HBF0sNNO8ZiROyemXfMRzHaMY95wRN5zAue2OkypCk1GcT1ioj4DnAO8K+Z+VDZkiT1imkPkTLzmcAHgcOAjIi/iYh9i1cmqes1OgeTmVcCxwNrgdXAFyPi6xGxsmBtkrrctAETEYdGxKeA7wIHAq/KzD2ANwBfKFuepG7W5BzMe4F/AN6UmQ+Mr8zM6yLiL6d6UUT0A2cAewNbgKMzc3PL9sOBk6nmQP868JbMHJvRXkhakJocIv17Zp7TGi4RcRpAZn60zeuOAJZm5v7Au4HTW16/M/Bh4GWZuR9wM7DbjpcvaSGbsgUTER8Efhl4RUQ8o2XTYmA/4D3TvPcBwEUAmblxwonhFwLXAadHxK8AZ2Xmj2dQv6QFrF0L5jzgcuD++u/xPxcBv93gvVcA97Qsb4uI8UDbDTgYeBdwOPDHEfHMHStd0kI3ZQsmM68BromI8zPz3hm8973Azi3L/Zk5PqHyT4BrxjvwRcQVwHOAm2bwOZIWqHaHSNdm5vOAuyOi9eRrHzCWmdP1Td8AvBw4t76cfV3LtmuBZ0XEbsDdwErg72eyA5IWrnYtmOfVf8/0fqXzgZdExFVUobQmIk4ENmfmuo
j4M+DL9XPPzczrZ/g5khaodi2Y97Z7YWa+b5rto8BxE1bf2LL9c8DnGtQoqUu16wfj/f+SZqXdIdIp81nITBz5zs/M6vWjWx/cbvmNp3yB/kVLZ/We//yho2b1eqmXTHuSNyJGgZmc5JX0CFfyJK+kBWBoaIgLLriA1atXc8IJJ8zrZzcZcGox1cnag4CHgK8A/+B9Q9LCNzw8zLp16wBYv349xxxzDMuWLZu3z29ys+PHqXrlnkN1ePQHwF7A28qVJWkujIyMMDZWtQVGR0cZGRlZcAGzMjP3Gl+IiAuBb5UrSVKvaHJ+5Yf1DYnjngjcXqgeST2k3VWkS6muHj0O+FZ9v9A2qruk7XUraVrtDpHWTrH+9CnWS9J22l2mvnz8cUQ8F1hOdZJ3ANiTaugGSZpSk8vUn6IaIGpX4AaqYRU2UA2jKUlTanKS90XArwOfB46lGs1uScmiJPWGJgFzWz3Z2g3AXpn5P2w/kJQkTapJP5gf1mO3fBX4UERAdT5Gktpq0oL5I+D79RCa5wGvAd5UtCpJPaHJ1LE/Ay6JiJcDm4E1mXlp8cokdb0mMzv+HvBNqnuQjgW+GREvLV2YpO7X5BzMScA+mXk7QEQ8FVhHPeeRJE2lyTmYh4A7xhcy8xZg69RPl6RKu3uRXl8//D6wvu5wt5XqJK93U0uaVrtDpIPrv++r/6yql+8vWpGkntHuXqQ144/rUe2ifv71LTM0StKUmlxF2gf4DvAp4GzgfyNiv9KFSep+Ta4iDQGvysyrAeppYD8GvKBkYZK6X5OrSMvHwwUgMzcCs5s8SNIjQpOAuSsiVo8vRMQRwE/KlSSpVzQ5RDoW+HREfJJqwKnvAq8tWpWkntAkYAYzc7+I2Anor+9NkqRpNQmY44EzM9P+L5J2SJOA+UFEXAJcDQyPr8zM9xWrSlJPaBIwG1se95UqRFLvaRswEbEbcCFwQ2Y+MD8lSeoVU16mrseBuRn4N+DmiHjxfBUlqTe06wdzEvD8zNwdeB1wyvyUJKlXtAuYscy8ASAzvww8dn5KktQr2gXM6ITlh0oWIqn3tDvJu3NEHMgvrhwtb13OzCtKFyepu7ULmFuB1r4uP2xZHgMOKVWUpN7QbsCpg6faJklNNLmbunf1DbQuTFiWNFuP6IDpH1jMssf9GgDLHver9A8s7nBFUm9pcqtAT1uxx/6s2GP/Tpch9aR205acTXUyd1KZ+YdFKpLUM9q1YC6bzRtHRD9wBrA3sAU4OjM3T/KcfwMuyMwzZ/N5khaedleRPjX+OCJ2BXai6gMzAOzZ4L2PAJZm5v71QOGnA6snPOcDwGN2tGhJ3aHJtCWnUc3umMDXgM3AXzR47wOo56+uBwrfd8L7/i5Vb2HnuJZ6VJOrSK8BngL8C9Vsj4cCP27wuhXAPS3L2yJiEUBEPAs4EnjvDlUrqas0CZjbM/Ne4Hpg78y8FHh8g9fdC+zc+lktM0K+HngScAnwBuDEiHhp46oldYUml6nviYjXAV8H3hoRt9HsvMkG4OXAufU5mOvGN2TmO8cfR8Ra4I7M9FBJ6jFNWjB/BPxyZl5GNQDVJ6jGipnO+cCDEXEV8BHg7RFxYkS8Yoa1SuoyTVowvw98GiAz39H0jTNzFDhuwuobJ3ne2qbvKam7NAmYJwEbIyKpguaLjs8rqYlpD5Ey808zc0/gVGAl8M2I+KfilUnqeo1udoyIPmAxsISq78qWkkVJ6g3THiJFxMeoeuV+A/gMcEJmPli6MEndr8k5mJuA52Vmk851kvRz7e6mPjYz/w7YFXhTRGy33aljJU2nXQumb4rHktRIu7upP1E/vAf4bGb+aH5KktQr7AcjLWBXXLh2Vq9/YHj76cz+8z8+xC8tm93QsC962drGz7UfjKRi7AcjqZim/WBWA9+kOkSyH4ykRpqcg/kRsI/9YCTtqCaHSEcZLpJmokkLZlNEvBe4GhgeX5mZVxSrSlJPaBIwu1KNxds6V/UYcEiRii
T1jGkDJjMPnu45kjSZJleRLmWSGR4z0xaMpLaaHCKtbXm8mOqS9U+LVCOppzQ5RLp8wqqvRsTVOKeRpGk0OUTao2WxD/gN4LHFKpLUM5ocIl1OdQ6mr/77x8BbSxYlqTc0OURqMtG9JD1M24CJiJcBmzLzexFxBNUkbNcC72+ZBlaSJjXlrQIR8SfAycDSiNiLasDvC6jmm/7L+SlPUjdrdy/S64AXZ+Ym4EhgXWaeBbwDOGw+ipPU3doFzFjLyHUHAxcBZObDOt1J0mTanYPZGhG7AMuB5wL/ARARTwU8/yJpWu1aMB+kGmRqI3BWZt4eEb8PXAx8aD6Kk9Td2s0q8IWIuArYLTO/Xa++Dzg6My+bj+Ikdbe2l6kz8zbgtpblLxWvSFLPaDTotyTNhAEjqRgDRlIxBoykYgwYScUYMJKKMWAkFWPASCrGgJFUjAEjqRgDRlIxBoykYgwYScU0mbZkRiKiHzgD2BvYQjXMw+aW7W8HXl0vfikzTylVi6TOKNmCOQJYmpn7A+8GTh/fEBG/AhwFvBBYCfxWPbC4pB5SMmAO4Bfj+G4E9m3Z9gPgpZm5rR7jdzHwYMFaJHVAsUMkYAVwT8vytohYlJlbM/Mh4M6I6AM+DHwjM28qWIukDijZgrmXag6ln39W62RtEbGUaq6lnYE3F6xDmhNDQ0MMDg4yNDTU6VK6RsmA2QCsAoiIlcB14xvqlssFwLcy842Zua1gHdKsDQ8Ps27dOgDWr1/P8PBwhyvqDiUPkc4HXlIPHN4HrImIE4HNwADwYuBREXF4/fw/y8z/LFiPNGMjIyOMjVVTgo2OjjIyMsKyZcs6XNXCVyxgMnMUOG7C6htbHi8t9dmSFgY72kkqxoCRVIwBI/WwRQO/+BHv69t+eT4YMFIPW7JkgOfv/QQA9t3rCSxZMjCvn1/yKpKkBWDVIU9n1SFP78hn24KRVIwtGD0inPbnn5/V6x96aPtb5T566joWL55dT4v3nPp7s3p9N7AFI6kYA0ZSMQaMpGIMGEnFGDCSijFgJBVjwEgqxoCRGujrb+1i3zdhWVMxYKQGFg0s5smP/w0Anvz4X2fRwOIOV9Qd7MkrNRR7HkjseWCny+gqtmAkFWPASCrGgJFUjAEjqRgDRlIxBoykYgwYScUYMJKKMWAkFWPASCrGgJFUjAEjqRgDRlIxBoykYgwYScUYMJKKMWAkFWPASCrGgJFUjAEjqRgDRlIxBoykYgwYScUYMJKKMWAkFWPASCrGgJFUTLG5qSOiHzgD2BvYAhydmZtbth8DvBHYCnwgMy8sVYukzijZgjkCWJqZ+wPvBk4f3xARuwMnAL8JHAb8RUQ8qmAtkjqgWAsGOAC4CCAzN0bEvi3bXgBsyMwtwJaI2AzsBVwzxXsNANxxxx3brdzywN1zXfOs3XrrrdM+58G7H5iHSnZMk7rv2vLgPFSyY5rUDXDf/T8tXMmOa1L7nXfdNw+V7JjJ6h4cHHwacGtmbm1dXzJgVgD3tCxvi4hFdQETt/0MeHSb93oCwFFHHTXnRc61wa8MdbqEGRk8c7DTJczI+we7s26A8/799OmftBCdun6ytd8H9gRubl1ZMmDuBXZuWe5vSbeJ23YG2jVHrgEOBG4Hts1lkZLmzMOaNiUDZgPwcuDciFgJXNey7b+AUyNiKfAo4NeA66d6o/pQ6msFa5VUQN/Y2FiRN265irQX0AesAVYBmzNzXX0V6ViqE82nZeZ5RQqR1DHFAkaS7GgnqRgDRlIxBoykYkpeRZp3EXEQcCnwmsz8XMv6bwPXZuYbduC9js/Mv5nzIpt//kG02RdgRWa+skPlTWsu/y86pRv2oa7xXGATMEbVx+x7wFGZOdLB0oDebMHcCLx6fCEing3sNIP3OWnOKpq5KfdlIYdLi7n6v+ikbtiHSzLzoMw8ODP3AR4CXtHpoqDHWjC1bwEREY/OzHuA1wKfAfaIiOOBV1J9Qe4Efgd4GnA21U2X/cCRwO
uBXSPiDOBtwJnAM+rtJ2XmZRFxPXATMJKZr6aMdvtyR2buHhFvBv4AGAWuycwTIuKVwLuovmi3Aa/OzNFCNc60/luofng3AVcukHon02gfMvPtnSxyXEQsoer5/tOIOAt4Sr28LjNPiohzgMfWf34beCdVJ9YB4K8y8/NzWU8vtmAAzgNeGRF9VPc9XUW1r48FDs3M/ajC9fnAS6g6/h0KnAw8OjNPBe7KzDcDRwN3ZuaLgNXAx+vPWA68v2C4tNuXVmuA4+ubSm+IiEXAa4APZ+YBwIVUzeZOmar+pwBH1j+YC6neyTTZh046JCIui4hNVIfP5wPfBTZm5mFUNR/X8vxLMvOFwEpgz/rf/WDgzyNil7ksrFcD5p+pmrUvovrtCNVv+BHgsxHxSeDJwGLgk1S3KVwEHE/Vkmn1bGBVRFxG9UVbFBG71duy4D6Mm2xfWq0B3hIRlwNPperUeCLVl+5y4IVU+94pU9V/Z2b+pH68kOqdTJN96KRLMvMgqpbICNV9QXcBz4+IzwAfoeoxP278e/tsYJ/6u30R1c/D0+aysJ4MmMz8HtVh0AnAp+vVK4AjMvNVwFup9r2PqlVyZWYOAp+naqpTb4OqCfzZ+j/w8Po5d9Xbiv8gTLEvrY4BjsvMFwPPpfoBPRZYW6/rozoU7Ig29bf+2y2YeifTcB86rg671wJnAW8H7s7Mo6iGSvmlugUGv6j7RuDS+rt9CNXJ4u/OZU09GTC1fwGekpk31ctbgfsjYgPwFaobJ58I/Dfwvoi4hKoZ+bH6+Zsi4tPAJ4BfrX+7XgXc0oHzAxP3pdV1wJV1/f8HXE11yHdhRFwM7E512NFJ7eqHhVfvZKbbhwUhMzcBQ8CzgJdGxBXA3wLfofq+t1oP3BcRVwJfB8Yy82dzWY+3CkgqppdbMJI6zICRVIwBI6kYA0ZSMQaMpGJ68VYBdVBEfJxqOpolwNOpbgWA6nL/WGaeGRFnU/V7uSUibgYOysybO1CuCjNgNKcy8y0AEfE04LLMfM4kTzsYOGU+61JnGDCaFxGxtn74IFWHry9FxIEt2weADwMHUd14d05mfmSey9Qc8xyM5lVmfpDqjulVE+7jOabe/jyqm/NWtwaQupMtGC0UhwLPiYhD6uXlVDfjTXaDp7qEAaOFYgB4Z2Z+EaC+Y/3+zpak2fIQSZ2wlYf/crsEOCYiFkfEcqqJ9vab98o0p2zBqBMupDrJe1jLuvFRA79B9b08OzMv60BtmkPeTS2pGA+RJBVjwEgqxoCRVIwBI6kYA0ZSMQaMpGIMGEnF/D/NLSBfTzGtnQAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cbfa630>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"g = sns.factorplot(x='Title', y='Survived', data=dataset_df, kind='bar')\n",
"g = g.set_xticklabels(['Master', 'Miss', 'Mrs', 'Mr', 'Rare'])\n",
"g = g.set_ylabels('Survival Probability')"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"# Nameの属性を削除\n",
"dataset_df.drop(labels=['Name'], axis=1, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"# fillnaで各行のTitleなどの属性によって補完値を変えるやり方がわからなかったので,forで回す\n",
"# 注意: trainingの欠損値を埋めるのに,testデータの値を使ってはいけない!\n",
"# オンラインの想定でないことを考えると,testデータの穴を埋めるときはtrainingとtestの両方を見てよいはず.\n",
"# FIXME: trainingの欠損値を埋めるときにtestデータを使わないように改良する\n",
"for i, dataset in dataset_df.iterrows():\n",
" if math.isnan(dataset.Age):\n",
" if math.isnan(dataset.Title):\n",
" dataset_df.at[i, 'Age'] = dataset_df[dataset_df.Title==dataset.Title].Age.dropna().mean()\n",
" else:\n",
" dataset_df.at[i, 'Age'] = dataset_df.Age.dropna().mean()"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Age 0\n",
"Cabin 1014\n",
"Embarked 0\n",
"Fare 0\n",
"Parch 0\n",
"PassengerId 0\n",
"Pclass 0\n",
"Sex 0\n",
"SibSp 0\n",
"Survived 418\n",
"Ticket 0\n",
"Title 0\n",
"dtype: int64"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.isnull().sum()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature engineering"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"# Convert 'Sex' to be a dummy variable (female = 0, Male = 1)\n",
"dataset_df['Sex'] = dataset_df['Sex'].map({'female': 0, 'male': 1}).astype(int)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cbe2e48>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"g = sns.heatmap(dataset_df[['Age', 'Sex', 'SibSp', 'Parch', 'Pclass']].corr(), cmap='BrBG', annot=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Family size"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10c97d438>"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cd8e390>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# SibSpとParchのSurvivedとの関係から,乗船している家族の人数が多いと生存確率が下がるのでは,という当たりをつけてみる.\n",
"train_df['FamilySize'] = train_df['SibSp'] + train_df['Parch'] + 1\n",
"g = sns.factorplot(x='FamilySize', y='Survived', data=train_df, kind='bar', size=6, palette='muted')\n",
"g.set_ylabels('Survived Probability')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"リンク元では家族の人数によって乗客を4つのカテゴリに分類していたが,自分はとりあえず分類はせずこのまま行こう.\n",
"余裕があれば分類したバージョンでも精度を計算してどちらのほうが良さそうか比べてみる."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"dataset_df['FamilySize'] = dataset_df['SibSp'] + dataset_df['Parch'] + 1\n",
"dataset_df.drop(['SibSp', 'Parch'], axis=1, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"# convert to indicator values Title and Embarked \n",
"dataset_df = pd.get_dummies(dataset_df, columns=[\"Title\"])\n",
"dataset_df = pd.get_dummies(dataset_df, columns=[\"Embarked\"], prefix=\"Em\")"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Cabin</th>\n",
" <th>Fare</th>\n",
" <th>PassengerId</th>\n",
" <th>Pclass</th>\n",
" <th>Sex</th>\n",
" <th>Survived</th>\n",
" <th>Ticket</th>\n",
" <th>FamilySize</th>\n",
" <th>Title_0</th>\n",
" <th>Title_1</th>\n",
" <th>Title_2</th>\n",
" <th>Title_3</th>\n",
" <th>Title_4</th>\n",
" <th>Em_C</th>\n",
" <th>Em_Q</th>\n",
" <th>Em_S</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>22.0</td>\n",
" <td>NaN</td>\n",
" <td>1.981001</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>A/5 21171</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>38.0</td>\n",
" <td>C85</td>\n",
" <td>4.266662</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>PC 17599</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>26.0</td>\n",
" <td>NaN</td>\n",
" <td>2.070022</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>STON/O2. 3101282</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>35.0</td>\n",
" <td>C123</td>\n",
" <td>3.972177</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>113803</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>35.0</td>\n",
" <td>NaN</td>\n",
" <td>2.085672</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>373450</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Age Cabin Fare PassengerId Pclass Sex Survived Ticket \\\n",
"0 22.0 NaN 1.981001 1 3 1 0.0 A/5 21171 \n",
"1 38.0 C85 4.266662 2 1 0 1.0 PC 17599 \n",
"2 26.0 NaN 2.070022 3 3 0 1.0 STON/O2. 3101282 \n",
"3 35.0 C123 3.972177 4 1 0 1.0 113803 \n",
"4 35.0 NaN 2.085672 5 3 1 0.0 373450 \n",
"\n",
" FamilySize Title_0 Title_1 Title_2 Title_3 Title_4 Em_C Em_Q Em_S \n",
"0 2 0 0 0 1 0 0 0 1 \n",
"1 2 0 0 1 0 0 1 0 0 \n",
"2 1 0 1 0 0 0 0 0 1 \n",
"3 2 0 0 1 0 0 0 0 1 \n",
"4 1 0 0 0 1 0 0 0 1 "
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Cabin"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"891"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train_df['Cabin'].isnull().count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cabinは欠損値がやたら多い.客室がない人たち?長旅でそんなのあるのだろうか? \n",
"Cabinの値を見ると,アルファベット一文字に番号が続く形式の文字列になっていることが分かる. \n",
"おそらくアルファベットが船の区画を表していると思われるので,アルファベットだけは情報として持っておき,数字は捨てることにする. \n",
"欠損値には'X'の文字を入れておく."
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"# XXX: 本当はdataset_dfじゃなくてtrain_dfで見ないとダメでは?\n",
"dataset_df['Cabin'] = [cabin[0] if not pd.isnull(cabin) else 'X' for cabin in dataset_df['Cabin']]"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Cabin\n",
"A 22\n",
"B 65\n",
"C 94\n",
"D 46\n",
"E 41\n",
"F 21\n",
"G 5\n",
"T 1\n",
"X 1014\n",
"Name: Cabin, dtype: int64\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEFCAYAAAD5bXAgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAEWNJREFUeJzt3XuUXWV5x/HvDEm0loTaGgwKCEj72NqKJtYoEhI1mgSEWKo1a4FWLNqlqRVkFS9FE6naq9Bi8VIQQhVXi7FWRCLRqhgwGC+4MBUfb8VUJUoIIaEK5DL9Y++Bk8k7yRnk7H3I+X7Wylr7vPs9s5/MTM4v7/vuy9DIyAiSJI013HYBkqT+ZEBIkooMCElSkQEhSSqa1HYBD4WIeATw+8BtwM6Wy5Gkh4sDgEOAr2TmvWN37hcBQRUOa9ouQpIepuYA149t3F8C4jaAK664ghkzZrRdiyQ9LGzcuJFTTz0V6s/QsfaXgNgJMGPGDA499NC2a5Gkh5vi1LyL1JKkIgNCklRkQEiSigwISVKRASFJKjIgJElFBoQkqciAkCQV9fRCuYiYDfxtZs6LiKOBFcAIsB5Ympm7ImIZcCKwAzgzM9eN17eXtUpS0356wc2NH/OxZz2l6749G0FExDnAJcAj66bzgXMzcw4wBCyOiJnAXGA2sAS4aLy+vapTklTWyymm7wOndLyeBVxXb68C5gPHAaszcyQzNwCTImL6OH0lSQ3qWUBk5seA7R1NQ5k5+gDsbcBBwDTgro4+o+2lvpKkBjW5SN25hjAV2AJsrbfHtpf6SpIa1GRA3BQR8+rtRVTPb7gBWBARwxFxODCcmZvG6StJalCTt/s+G7g4IqYAtwArM3NnRKwB1lKF1dLx+jZYpySJHgdEZt4KPLPe/g7VGUtj+ywHlo9pK/aVJDXHC+UkSUUGhCSpyICQJBUZEJKkIgNCklRkQEiSigwISVKRASFJKjIgJElFBoQkqciAkCQVGRCSpCIDQpJUZEBIkooMCElSkQEhSSoyICRJRQaEJKnIgJAkFRkQkqQiA0KSVGRASJKKDAhJUpEBIUkqMiAkSUUGhCSpyICQJBUZEJKkIgNCklRkQEiSigwISVKRASFJKjIgJElFk5o8WERMBi4HjgB2Aq8CdgArgBFgPbA0M3dFxDLgxHr/mZm5rslaJWnQNT2COAGYlJnHAucB7wTOB87NzDnAELA4ImYCc4HZwBLgoobrlKSB13RAfAeYFBHDwDRgOzALuK7evwqYDxwHrM7MkczcUL9nesO1StJAa3SKCbibanrp28BjgBcCx2fmSL1/G3AQVXjc0fG+0fbbG6tUkgZc0yOIs4BrM/O3gGOo1iOmdOyfCmwBttbbY9slSQ1pOiDuBO6qtzcDk4GbImJe3bYIWAPcACyIiOGIOBwYzsxNDdcqSQOt6SmmC4BLI2IN1cjhLcBXgYsjYgpwC7AyM3fWfdZShdjShuuUpIHXaEBk5t3AHxV2zS30XQ4s73FJkqRxeKGcJKnIgJAkFRkQkqQiA0KSVGRASJKKDAhJUpEBIUkqMiAkSUUGhCSpyICQJBUZEJKkIgNCklRkQEiSigwISVKRASFJKjIgJElFBoQkqciAkCQVGRCSpCIDQpJUZEBIkooMCElSkQEhSSoyICRJRQaEJKnIgJAkFRkQkqQiA0KSVGRASJKKDAhJUpEBIUkqMiAkSUUGhCSpaFLTB4yINwMnA1OA9wLXASuAEWA9sDQzd0XEMuBEYAdwZmaua7pWSRpkjY4gImIecCzwbGAucBhwPnBuZs4BhoDFETGz3j8bWAJc1GSdkqTmp5gWAN8EPg58ErgamEU1igBYBcwHjgNWZ+ZIZm4AJkXE9IZrlaSB1vQU02OAJwAvBI4ErgKGM3Ok3r8NOAiYBtzR8b7R9tubK1WSBlvTAXEH8O3MvA/IiLiHappp1FRgC7C13h7bLklqSF
dTTBHxnkLb5Q/ieNcDCyNiKCIeB/wq8F/12gTAImANcAOwICKGI+JwqlHGpgdxPEnSg7TXEUREXAIcBTw9Ip7csWsy1ZTPhGTm1RFxPLCOKpyWAv8DXBwRU4BbgJWZuTMi1gBrO/pJkhq0rymmdwBHAP8EvL2jfQfVh/mEZeY5hea5hX7LgeUP5hiSpF/eXgMiM28FbgWOiYhpVKOGoXr3gcDmXhYnSWpPV4vU9cVtb2b3M4tGqKafJEn7oW7PYjoDeGJmepqpJA2Ibi+U24DTSZI0ULodQXwXuD4iPg/cM9qYmef1pCpJUuu6DYgf13/ggUVqSdJ+rKuAyMy377uXJGl/0u1ZTLuozlrq9JPMPKzUX5L08NftCOL+xeyImAy8CHhWr4qSJLVvwrf7zsztmflR4Lk9qEeS1Ce6nWJ6ecfLIeDJwH09qUiS1Be6PYvpOR3bI8Am4KUPfTmSpH7R7RrE6fXaQ9TvWZ+ZO3pamSSpVd0+D2IW1cVylwOXARsiYnYvC5MktavbKaYLgZdm5pcBIuKZwHuAZ/SqMElSu7o9i+nA0XAAyMwbgUf2piRJUj/oNiA2R8Ti0RcR8SJ2v/W3JGk/0+0U06uBqyPig1SnuY4Ax/asKklS67odQSwCfg48geqU19uBeT2qSZLUB7oNiFcDz87M/8vMm4FZwOt6V5YkqW3dBsRkdr9y+j72vHmfJGk/0u0axH8Cn4uIK+vXpwCf6E1JkqR+0NUIIjPfSHUtRABHARdm5lt7WZgkqV3djiDIzJXAyh7WIknqIxO+3bckaTAYEJKkIgNCklRkQEiSigwISVKRASFJKjIgJElFBoQkqajrC+UeShFxMPA14PnADmAF1b2d1gNLM3NXRCwDTqz3n5mZ69qoVZIGVeMjiIiYDHwA+EXddD5wbmbOoXrWxOKImAnMBWYDS4CLmq5TkgZdG1NM/wC8H/hJ/XoWcF29vQqYDxwHrM7MkczcAEyKiOmNVypJA6zRgIiIVwC3Z+a1Hc1DmTl66/BtwEHANOCujj6j7ZKkhjS9BvFKYCQi5gNPBf4VOLhj/1RgC7C13h7bLklqSKMjiMw8PjPnZuY84BvAy4FVETGv7rIIWAPcACyIiOGIOBwYzsxNTdYqSYOulbOYxjgbuDgipgC3ACszc2dErAHWUoXY0jYLlKRB1FpA1KOIUXML+5cDyxsqR5I0hhfKSZKKDAhJUpEBIUkqMiAkSUUGhCSpyICQJBUZEJKkIgNCklRkQEiSigwISVKRASFJKjIgJElFBoQkqciAkCQVGRCSpCIDQpJUZEBIkooMCElSkQEhSSoyICRJRQaEJKnIgJAkFRkQkqQiA0KSVGRASJKKDAhJUpEBIUkqMiAkSUUGhCSpyICQJBUZEJKkIgNCklRkQEiSiiY1ebCImAxcChwBPAJ4B/AtYAUwAqwHlmbmrohYBpwI7ADOzMx1TdYqSYOu6RHEacAdmTkHWAj8M3A+cG7dNgQsjoiZwFxgNrAEuKjhOiVp4DUdEB8F3lpvD1GNDmYB19Vtq4D5wHHA6swcycwNwKSImN5wrZI00BoNiMy8OzO3RcRUYCVwLjCUmSN1l23AQcA04K6Ot462S5Ia0vgidUQcBnwe+FBmfgTY1bF7KrAF2Fpvj22XJDWk0YCIiMcCq4E3ZualdfNNETGv3l4ErAFuABZExHBEHA4MZ+amJmuVpEHX6FlMwFuARwNvjYjRtYjXAxdGxBTgFmBlZu6MiDXAWqoQW9pwnZI08BoNiMx8PVUgjDW30Hc5sLzHJUmSxuGFcpKkIgNCklRkQEiSigwISVKRASFJKjIgJElFBoQkqciAkCQVGRCSpCIDQpJUZEBIkooMCElSkQEhSSoyICRJRQaEJKnIgJAkFRkQkqSiph85qj7zoRULWjnuy15xbSvHldQ9RxCSpCIDQpJUZEBIkooMCElSkQEhSSryLKYG3fy+k1s57lNec1Urx5X08OYIQpJUZEBIko
qcYlJfOv3jCxs/5mV/8OnGjyn1MwNC6tKJ//HeVo77qVNe28pxJaeYJElFBoQkqciAkCQV7bdrELe/78OtHHf6a05r5biS9FDbbwNCGgQnr7y6leNe9eIXtnJcNatvAyIihoH3AscA9wJnZOb32q1KkgZHP69BvAh4ZGY+C3gT8O6W65GkgdK3IwjgOODTAJl5Y0Q8fS99DwDYuHHj/Q2b79rS0+LGc++PfjTuvp/dtb3BSh7wo73UtGVL/9UEcM/m5uvaV03b79zaUCW721td2+/c3GAlD9hbTW+/9rYGK3nAsgWHtHLcX8amrT9r/JjbO352HZ+ZB5T6Do2MjDRQ0sRFxCXAxzJzVf16A3BUZu4o9D0OWNNwiZK0v5iTmdePbeznEcRWYGrH6+FSONS+AswBbgN29rowSdpPHAAcQvUZuod+DogbgJOAKyPimcA3x+uYmfcCe6SfJGmfvj/ejn4OiI8Dz4+ILwFDwOkt1yNJA6Vv1yAkSe3q59NcJUktMiAkSUUGhCSpqJ8XqVsTEecAZwFHZuY9LdcyD7gS+BbVYv0jgNdk5k0t1vRk4O+ARwEHAtcAyzOztQWtwvdpMvCPmXlln9Q06vbMfEk7FVUi4gjgZuDrHc2fy8zz2qkIIuIoqt+pQ4GfA78AzsnM/26xpncDs4AZVL/rP6DFn19EPJ/qjhLPyMx7IuLxVBcTL8zMH/fimAZE2WnAvwFLgBXtlgJU/3iXAETEC4C/Alq5W1pE/BrV9+aUzPxuRBwAfBT4U+D9bdTUofP7dCBwXUR8JzO/0Q819ZlvZea8tosAiIhHAVcBr8rMtXXbM4CLgHlt1ZWZZ9e1vAJ4Uma+qa1a6no+ExGfBi6IiD+n+nf4hl6FAzjFtIf6f33fp/qwW9puNUWPBpq/Pv8Bi6k+9L4LkJk7gZcDl7ZY0x4y827gA8CL265F+3QS1e/U2tGGzFwHPKe9kvrWX1KNaq4CPpuZn+nlwRxB7OkM4JLMzIi4NyJmZ+aXW67puRHxBarppWOobmTYlsdRDbXvV38Y96OfAjNbrmH0ZzfqU5n5920V0+F3xtR1ai//J7oPRwL336k5Ij4BHAQcEhHPy8y93yRrgGTm9oj4F+B9VKP2njIgOkTEo4ETgIMj4nVUv6R/BrQdEJ1TJwGsjYjHZ+YvWqjlh4z50I2II4HDMvOLLdSzN08A2v5wcYpp3/4XuP9mnJm5GCAibsTPqN3U60d/AZwDfDginlOP4nvCKabdnQZ8MDNfkJkLgdnACyJiest1dfppy8e/GlgYEU8EiIjJwPnA77Za1RgRMQ14FdX6iPrbJ4D59S11AIiIo6kWrL2StxYRU4B/B87KzAuADcCyXh7TdN7dGcDLRl9k5s8j4mNUHzTvaq2qB6YpdlLdwPANLY0eyMytEfHHwMX1Q52mAp+kGvK2rfP7NAlYlpnZbkl7TDEBLGrr59ePMvPuiDgJ+JuIOITqZ7eT6oPwh+1W11feDVyfmdfUr18LfC0iPpeZX+jFAb3VhiSpyCkmSVKRASFJKjIgJElFBoQkqciAkCQVeZqrNEH1NRZ/DcwFdgB3Amdn5tfH6X8E8IXMPKKw7xrgjMz8Sc8Klh4kRxDSBNTXflwDbAaemplPBc4DVkXEb0z062XmCYaD+pXXQUgTEBHPAy4Gjs7MXR3tJwBfBd5JdVX5Y4EETqm3bwS+CATVzSD/JDPvjIhbqe5YOg9YCPw6cBSwOjNf28TfSRqPIwhpYp4GfKUzHADqq1ufBNyXmc8CjgZ+hereXgAHAxdm5jFUN6Z7W+FrHwv8IfAU4KSI+L3e/BWk7rgGIU3MLqoHEu0hM78YEXdExFKqsPhNqgcq1bvz+nr7w8DlhS/xpczcBhARP6AaTUitcQQhTcxXgZkRsVtIRMS7ImIxcAXVE9Euo5pSGu23o6P7ELC98LU7n144wjhBJDXFgJAmZg3VA5uW1U/TIyIWAKdTrSFcmZmXARuB44ED6v
f9dkQ8rd5+JfDZRquWHgSnmKQJyMyRiDgZuABYHxHbgU1Uaw07gI9ExEuAe6kWpo+s3/o94G31bay/SfVkMKmveRaTJKnIKSZJUpEBIUkqMiAkSUUGhCSpyICQJBUZEJKkIgNCklT0/4fMghcpTSB0AAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cf485f8>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"print(dataset_df.groupby('Cabin').Cabin.count())\n",
"g = sns.countplot(dataset_df.Cabin, order=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'T', 'X'])"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x10d12ce80>"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAGoCAYAAAATsnHAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAGsZJREFUeJzt3XmYZXV95/F3dVU3DdKILS4sEcHlG6JhDzRLBKwADSPIQ3TCgGgwjRI3BhgIkgwgUTQxBKhJcCMwPlHHYQYJXYZBQTZZmk0xGMJXOypKFATZ6aKrq6vmj3MLbhfVt24Xfer+qu/79Tz99D3n1D33013Lp36/e5aesbExJEkqzZxOB5AkaTIWlCSpSBaUJKlIFpQkqUh9nQ7QjojoA7YBHszMkU7nkSTVb1YUFFU5/fQ73/lOp3NIkta/nslWOsUnSSqSBSVJKpIFJUkqkgUlSSqSBSVJKpIFJUkqkgUlSSqSBSVJKpIFJUkqkgUlSSqSBSVJKpIFJUkqkgUlSSqSBSVJKpIFJUkqUq0FFRF7RsQNk6w/LCLujIjbIuL4OjNIkman2goqIk4DLgbmT1g/FzgfOAjYD/hARLymrhySpNmpzhHUvwNHTrJ+B2B5Zj6emcPAzcDbaswhqUsNDAzQ39/PwMBAp6NoGmorqMy8HFg1yabNgCeblp8GXl5XDkndaWhoiKVLlwIwODjI0NBQhxNpXXXiIImngAVNywuAJzqQQ9IGbHh4mLGxMQBGR0cZHh7ucCKtq74OvOa/AW+KiIXAM1TTe3/TgRySpILNWEFFxNHAppn5xYg4GfgW1Qjuksz8j5nKIUmaHWotqMz8GbCo8fhrTesHgcE6X1uSNLt5oq4kqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlLrGwMAA/f39DAwMdDqKpDZYUOoKQ0NDLF26FIDBwUGGhoY6nEjSVCwodYXh4WHGxsYAGB0dZXh4uMOJ1s6RnlSxoKSCONKTXmBBSQWZTSM9qW4WlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFlThPCdGUreyoArmOTGSupkFVTDPiVHJHN2rbhaUpHXm6F4zwYKStM4c3WsmWFCSpCJZUJKkIllQkqQiWVCSpCJZUJKkIllQkqQiWVCSpCJZUJKkIllQkqQiWVCSpCJZUJKkIllQkqQi9dW144iYA1wE7ASsBJZk5vKm7acARwOjwLmZeUVdWSRJs0+dI6gjgPmZuRdwOnDe+IaI2Bw4EdgLOAi4oMYckqRZqM6C2he4GiAzlwG7N217FngAeFnjz2iNOSRJs1CdBbUZ8GTT8uqIaJ5S/AVwH/A9wFtySpLWUGdBPQUsaH6tzBxpPD4E2BLYDngdcERE7FFjFknSLFNnQd0CHAoQEYuAe5u2PQ4MASsz8zngCWDzGrNIkmaZ2o7iA64ADoyIW4Ee4LiIOBlYnplLI+IPgGURMQrcDFxTYxZJ0ixTW0Fl5ihwwoTV9zdtPws4q67XlyTNbp6oK0kqkgUlSSqSBSVJKpIFJUkqkgUlSSqSBSVJKpIFJUkqkgUlSSqSBSVJKpIFJUkqUp3X4pPWq3869/ppP3do1Yo1lq+64GY2nrvJtPZ1xBkHTDuHpPY5gpIkFcmCkiQVySk+aT371tdPmfZzh54bWWP5+ivOZOP50/s2Pfio86adQyqBIyhJUpEcQdXsz26Y/m/TIyvW/G36nFvOpG+Tdf+U/dX+/iYtafZxBCVJKpIFJUkqUt
cW1MDAAP39/QwMDHQ6iiRpEl1ZUENDQyxduhSAwcFBhoaGOpxIkjRRVxbU8PAwY2NjAIyOjjI8PNzhRJKkibqyoCRJ5bOgJElFsqAkSUWyoCRJRbKgJElFsqD0knlOmaQ6WFB6STynTFJdLCi9JJ5TJqkuFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIfVN9QERcBVwK/FNmrqo/kiRJ7Y2gPgMsBn4cEX8fEb9XcyZJkqYeQWXmTcBNEbEx8C7g8oh4CrgY+Fxmrqw5oySpC7X1HlRE7A/8HXAucDXwMeA1wNLakkmSulo770E9APyE6n2oj2TmUGP9DcBdtaaTJHWtKQsK+E+Z+cPmFRGxKDOXAbvWE0uS1O3WWlARsQ/QC1wcEX8C9DQ2zQU+B7y5/niSpG7VagR1ILAfsCVwTtP6EeALdYaSJGmtBZWZZwNExLGZ+Y8zlkiSJFpP8Z3dKKm3R8QBE7dn5vvrDCZJ6m6tpvjubvx9wwzkkCRpDa0K6gcR8Trg+pkKI0nSuFYFdSMwxgtH7zUbA7avJZEkSbQ+SGK7mQwiSVKzKQ+SiIhLJtvuQRL16+lrGrz2TFiWpA1cOwdJ3DgTQfRivfN6edXuW/DIXY/yqt22oHdeb6cjaQPyqW/fM+3nrhp6do3l82/4IXM3ftk67+fPD9p52hm04Ws1xTfY+PvLEfFqYE9gFXBHZj42Q/m63raHbMO2h2zT6RiSNOOmvJp5RLwbuAd4H/AB4J6IWFx3MElSd2vnYrF/AeyWmb8CiIhtqW6zcXWrJ0XEHOAiYCdgJbAkM5c3bT8EOIvqKMG7gQ9n5th0/hHSVHp7XvhS76FnjWVJZWrnflCrgIfGFzLzAarr8U3lCGB+Zu4FnA6cN74hIhYAnwXekZl7Aj8Dtmg/trRu5vXN43e32R2At26zG/P65nU4kaSptDqK772Nhz8FBiPiy1TF9F+AH7Sx731pjLIyc1lE7N60bW/gXuC8iNgeuDgzH5lGfqlt+8di9nd2Wpo1Ws1zjF9/75nGn0Mby88y+cm7E20GPNm0vDoi+jJzhGq0dACwc2Pf342I2zLzR+sSXpK04Wp1FN9xa9sWERu3se+ngAVNy3Ma5QTwG+DOzHyosb+bqMrKgpIkAe3d8v0PgTOBTalGTr3AxsCrp3jqLcBhwGURsYhqSm/c94C3RsQWwBPAIuBL65xe682yE0+c1vNWjKz5duTdZ5zBJn3TOwBh0YUXTut5G5Le3hcmJ3p61lyWuk07B0n8NfBfgX8DjgEuBS5r43lXAM9FxK3A+cBJEXFyRByemb8GPg58C7gd+MbE28pL3Wje3F52+Z3qeKGdd9iCeXM9OVvdq51fdR/PzOsbt4B/eePyR3dP9aTMHAVOmLD6/qbtXwe+vk5ppS5w4D7bcOA+npwttTOCGoqIN1ONoPaPiHnAy+uNJUnqdu0U1F8AnwS+CfQDD1NN30mSVJspp/gy80ZeuGDs70XEKzLz8XpjSZK6XTtH8W0DDAD7A8PAtRFxkifWSpLq1M4U3yXANcC2wJuprpt3aZ2hJElq5yi+V2Xm55qWz4+I99UVSJIkaG8EdUdEHDW+EBHvAO6qL5IkSa0vFjsKjFFdPeL4iPgHYDXVFSUeB5bMSEJJUldqdS2+dkZXkiTVop2j+DahurFgf+PjrwP+e2Y+W3M2SVIXa2eU9HfAy4D3U932fR7w+TpDSZLUzlF8u2XmTk3LH4mI++oKJEkStDeCmhMRm48vNB63c8t3SZKmrZ0R1N9SHWo+2Fg+HPh0fZEkSWqvoAaBO4H9qEZcR2bmva2fIknSS9NOQX03M3cAvKGgJGnGtFNQP4iIY4E7gKHxlZn589pSSZK6XjsFtWfjT7MxYPv1H0eSpEo794PabiaCSJLUrNW1+L
aiOkn3TcDNwMcz84mZCiZJ6m6tzoO6FLgfOBWYD5w/I4kkSaL1FN/WmXkwQER8B7hnZiJJktR6BDU8/iAzVzUvS5JUt3W5pcZYbSkkSZqg1RTfWyLiJ03LWzeWe4CxzPQwc0lSbVoV1JtnLIUkSRO0uqPuAzMZRJKkZt7WXZJUJAtKklSkVleSeFurJ2bmTes/jiRJlVYHSXyi8fcrgTcCtwCrgb2Be4F96o0mSepmrQ6SOAAgIq6iuknh8sbytsAXZiaepG539JnXT+t5o6tWrLH8wc/czJy5m0xrX18754BpPU8vTTvvQW07Xk4NPwe2rSmPJElAe/eDujsivgxcRlVoRwPfrTWVJKnrtVNQS4CPAidQXe7oWuCiOkNJktTODQuHI+JyqltvfAv4rcwcqT2ZJKmrTfkeVET8ETAIXAgsBG6LiPfUHUyS1N3aOUjiz6gOLX86M38N7AJ8vNZUkqSu105Brc7Mp8cXMvNXwGh9kSRJau8giX+NiI8AcyNiZ+BDeHddSVLN2hlBfRjYGhgCLgGeoiopid6eHnoaj3say5K0PrQzgjoeuCAzfd9JL7JRby97LFzI7Y89xh4LF7JRb2+nI0naQLRTUFsDyyIiga8A38jMFVM8R13k8K224vCttup0DEkbmHbOgzoVODUifh/4I+DMiLg9M4+tPV0L070+F3iNLkmaDdq6H1RE9ABzgXlUR/CtrDOUJElTjqAi4n8ARwDfB74KfCwzn6s7mCSpu7XzHtSPgF0z85G6w0iSNK7VHXU/kJlfpLq80Z9GxBrbM/OcmrNJkrpYqxFUz1oeS5JUu1Z31B2/a+6TwP/KzIdnJpIkSZ4HJUkq1JSHmWfmqZm5HfApYBFwT0T8Y+3JJEldzfOgJElFavc8qHdSXcH8K3gelCRpBrTzHtTDwG6eByVJmkntTPEdYzlJkmZaOyOo+yLiTOB2qntCAZCZN9WWSpLU9dopqIXAAY0/48aAt9eSSJIk2rvdhveTkCTNuHaO4rueasS0hsx0BCVJqk07U3xnNz2eS3XI+eO1pJEkqaGdKb4bJ6y6NiJuB86sJ5IkSe1N8b2uabEHeAvwytoSSZJEe1N8N1K9B9XT+PsR4KN1hpIkqZ0pvu2ms+OImANcBOxEde2+JZm5fJKP+Wfgysz8/HReR5K0YWp5JYmIeEdEbN94fEREDEbEJyKinZHXEcD8zNwLOB04b5KP+STwinUNLUna8K21oCLivwFnAfMjYkfgq8CVwALgb9rY977A1QCZuQzYfcL+30V1ZfSrp5VckrRBazWCOhbYLzPvA44GlmbmxcApwMFt7Hszqrvxjls9PvKKiLc29umRgJKkSbUqqLGmO+cewAujoRedtLsWT1GNtp5/rcwcaTx+L9Wdeq8D/hg4OSIWtxtakrTha/Ve0khEbA5sCuwCfBsgIrYFRlo8b9wtwGHAZRGxCLh3fENmnjb+OCLOBh7KTKf6JEnPa1VQn6G6SWEfcHFm/ioi/jNwLvCJNvZ9BXBgRNxKdYj6cRFxMrA8M5e+xNySpA3cWgsqM/9vo1y2yMx/aax+hupw8Rum2nFmjgInTFh9/yQfd3bbaSVJXaPl4eKZ+Uvgl03LV9WeSFLxenr7eP7c/Z6exrK0frVzR11JWkPfvI3Yete9Adh6l73pm7dRhxNpQ+SvPZKmJQ48kjjwyE7H0AbMEZQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqS0DAwP09/czMDAwI69nQUmSpjQ0NMTSpUsBGBwcZGhoqPbXtKAkSVMaHh5mbGwMgNHRUYaHh2t/TQtKklQkC0qSVCQLSpJUJAtKklQkC0qSVCQLSpJUJAtKklQkC0qSVCQLSpJUJAtKklSkvk4HkCTNjGUnnjjt564YGVlj+e4zzmCTvnWvkEUXXtj2xzqCki
QVyYKSJBXJgpIkFcmCkiQVyYKSJBXJgpIkFcmCkiQVyYKSJBXJgpIkFcmCkiQVyYKSJE2pt6eHnsbjnsZy3SwoSdKUNurtZY+FCwHYY+FCNurtrf01vVisJKkth2+1FYdvtdWMvZ4jKElSkSwoSVKRLChJUpEsKElSkSwoSVKRLChJUpEsKElSkSwoSVKRLChJUpEsKElSkSwoSVKRLChJUpEsKElSkSwoSVKRLChJUpEsKElSkSwoSVKRLChJUpEsKElSkSwoSVKR+uracUTMAS4CdgJWAksyc3nT9pOAoxqLV2XmJ+rKIkmafeocQR0BzM/MvYDTgfPGN0TE9sAxwN7AIuCgiNixxiySpFmmzoLaF7gaIDOXAbs3bfsFsDgzV2fmGDAXeK7GLGua0zxw7JmwLEkqQZ0/mTcDnmxaXh0RfZk5kpmrgEcjogf4LPD9zPxRjVnWMKd3Hhu/dneGHrqLjV+7G3N6583US0uS2lRnQT0FLGhanpOZI+MLETEfuAR4GvhQjTkmtdkbFrPZGxbP9MtKktpU5xTfLcChABGxCLh3fENj5HQl8IPM/GBmrq4xhyRpFqpzBHUFcGBE3Ar0AMdFxMnAcqAX2A/YKCIOaXz8xzPzthrzSJJmkdoKKjNHgRMmrL6/6fH8ul5bkjT7eaKuJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIFpQkqUgWlCSpSBaUJKlIfXXtOCLmABcBOwErgSWZubxp+/HAB4ER4JOZ+c26skiSZp86R1BHAPMzcy/gdOC88Q0R8VrgY8A+wMHApyNioxqzSJJmmdpGUMC+wNUAmbksInZv2rYHcEtmrgRWRsRyYEfgzrXsqxfgoYceen7FymcfrSPzOnvwwQdbbn/20RUzlGTtpsoI8MiK8nM+9szs+Jz/5vHO/1/C1Dmf+c2vZyjJ2rXztTnd7/XRVc8xMjLStJ/HmDN3ep+bdnLOBqV+n/f3978eeDAzR5rX11lQmwFPNi2vjoi+RoCJ254GXt5iX1sCHHPMMes95EvVPwsmJq/huk5HaE9/f6cTtOWvLu90gjZdUP7n/Rsz+FoPPfTJaT93NnyfzxqTf5//FNgO+FnzyjoL6ilgQdPynKZ2nLhtAfBEi33dCfw+8Ctg9foMKUkqwouGVnUW1C3AYcBlEbEIuLdp2x3ApyJiPrARsAPww7XtqDEVeHONWSVJhekZGxurZcdNR/HtCPQAxwGHAsszc2njKL4PUB2ocW5mzpaJE0nSDKitoCRJeik8UVeSVCQLSpJUJAtKklSkOo/iK15EnAacBGyXmc91Ok+ziNgfuAy4j+ogk42AP83M73cy10QR8Rbgr4FNgE2Bq4CzM7OYNzcn+b+cC1yQmZd1MtdEE3KOeyQz392ZRJOLiNcD/wJ8r2n1dZl5TmcSTS4itqf62twGWAEMAadl5r92NFiTiDgP2A14LdX30E8o83N+INXVgPbIzOciYmuqCzEszsz/qOt1u7qggPcAXweOAv5nZ6NM6rrMPAogIg4C/hJ4R2cjvSAiNqf6/zsyM38cEb3A/6G6xuLnOxruxZr/LzcFboyIH2XmPR3ONdHzOQt3X2bu3+kQaxMRmwBLgeMz87bGuj2Avw
f272C0NWTmKQAR8cfAb2fm6Z1NNLnMvCYirgbOj4iPUX3fn1xnOUEXT/E1flv9d6ofpB/ubJq2vALo/LVp1vROqh+oPwbIzNXAe4FLOppqCpn5DPAF4F2dzqLaHEb1tXnb+IrMvAM4oHORZr0/pxrtLQWuzcxr6n7Bbh5BLQEuzsyMiJURsWdm3t7pUBO8PSJuoJre24nqArwl2YpqSuJ5jR/+s8HDwK6dDjGJ8c/5uH/OzM92KkwLvzMh5zF1/za9jrYDmu+ecCXV5dS2jIj+zNwwLq43gzJzVUR8Efgc1SxJ7bqyoCLiFVQnDb86Ij5K9YX7EaC0gmqelgrgtojYOjOHOpxr3ANM+CEfEdsBv5WZN3UmUtu2ZZJLqxTAKb714xfA8xeozsx3AkTEMrr0595L1Xjv8VTgNOArEXFAY9akNt06xfce4B8y86DMXAzsCRwUEa/qcK5WHu50gEl8E1gcEW8AiIi5wN8Cb+1oqilExGbA8VTvl2nDdCXwB43LrAEQEW+kOmCimAN4ZouImAf8b+CkzDwf+DlwVt2v262/SSwBjh1fyMwVEXE51Q+tczuW6sXGp3tWU11Q9+SCRk9k5lMR8T7gS41LWy0ABqmmAErT/H/ZB5yVmdnZSJOaOMUHcEhJn/fZIDOfiYjDgM9ExJZUn/PVVD9gH+hsulnpPODmzLyqsfwh4O6IuC4zb6jrRb3UkSSpSN06xSdJKpwFJUkqkgUlSSqSBSVJKpIFJUkqUrceZi7NmMZ5V58G9gNGgMeBUzLze2v5+NcDN2Tm6yfZdhWwJDN/WVtgqRCOoKQaNc4Puwp4DNg5M3cGzgH+X0S8cl33l5mHWk7qFp4HJdUoIvqBLwFvzMzRpvWHAncBn6K68sZrgASObDxeBtwEBNVFjf8kMx+PiJ9RXY17f2AxsBDYHvh2Zn5oJv5N0kxxBCXVaxfgzuZyAmickf/bwHBm7gW8EdiY6hqRAK8GBjJzJ6qLnp45yb73Bv4Q2BE4LCJ+t55/gtQZvgcl1WuU6iaJL5KZN0XEbyLiw1Rl9Saqmz42NufNjcdfAb48yS5uzcynASLiJ1SjKWmD4QhKqtddwK4RsUZJRcS5EfFO4KtUd3u9lGpKb/zjRpo+vAdYNcm+m+8CPcZailCarSwoqV7fpbrR5FmNOw4TEQcDx1G9h3RZZl4KPAS8DehtPG+HiNil8fj9wLUzmloqgFN8Uo0ycywiDgfOB34YEauAR6neaxoBvhYR7wZWUh0YsV3jqcuBMxu3iLiX6m6mUlfxKD5JUpGc4pMkFcmCkiQVyYKSJBXJgpIkFcmCkiQVyYKSJBXJgpIkFen/AxWftiytIqf4AAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10d15ea58>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"g = sns.factorplot(x='Cabin', y='Survived', data=dataset_df, kind='bar', size=6, palette='muted', order=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'T', 'X'])\n",
"g.set_ylabels('Survived Probability')"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"# 単純にget_dummies()すると属性の数が大分増えてしまう..生存率が高そうなやつ,低そうなやつ,欠損値の3種類くらいに分けたほうがよいのかも.\n",
"# もしくは欠損とそれ以外の2種類.←こっちにしてみよう.欠損値があるデータは明らか生存率低いし.\n",
"# dataset_df = pd.get_dummies(dataset_df, columns=[\"Cabin\"], prefix=\"Cabin\")\n",
"dataset_df['Cabin'] = [0 if cabin == 'X' else 1 for cabin in dataset_df['Cabin']]"
]
},
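{
"cell_type": "markdown",
"metadata": {},
"source": [
"The missing-versus-present encoding above can also be written as a vectorized comparison. A minimal sketch on a toy frame, assuming (as in this notebook) that missing cabins have already been replaced by the placeholder `'X'`; `toy_df` is a hypothetical name used only for illustration.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Toy stand-in for dataset_df; 'X' marks a missing cabin (assumption).\n",
"toy_df = pd.DataFrame({'Cabin': ['X', 'C', 'X', 'E']})\n",
"# 0 = cabin missing, 1 = cabin recorded\n",
"toy_df['Cabin'] = (toy_df['Cabin'] != 'X').astype(int)\n",
"print(toy_df['Cabin'].tolist())\n",
"```\n",
"\n",
"This should give the same values as the list comprehension in the cell above; the comparison form just avoids the Python-level loop."
]
},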
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Cabin</th>\n",
" <th>Fare</th>\n",
" <th>PassengerId</th>\n",
" <th>Pclass</th>\n",
" <th>Sex</th>\n",
" <th>Survived</th>\n",
" <th>Ticket</th>\n",
" <th>FamilySize</th>\n",
" <th>Title_0</th>\n",
" <th>Title_1</th>\n",
" <th>Title_2</th>\n",
" <th>Title_3</th>\n",
" <th>Title_4</th>\n",
" <th>Em_C</th>\n",
" <th>Em_Q</th>\n",
" <th>Em_S</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>22.0</td>\n",
" <td>0</td>\n",
" <td>1.981001</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>A/5 21171</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>38.0</td>\n",
" <td>1</td>\n",
" <td>4.266662</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>PC 17599</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>26.0</td>\n",
" <td>0</td>\n",
" <td>2.070022</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>STON/O2. 3101282</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>35.0</td>\n",
" <td>1</td>\n",
" <td>3.972177</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>113803</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>35.0</td>\n",
" <td>0</td>\n",
" <td>2.085672</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>373450</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Age Cabin Fare PassengerId Pclass Sex Survived \\\n",
"0 22.0 0 1.981001 1 3 1 0.0 \n",
"1 38.0 1 4.266662 2 1 0 1.0 \n",
"2 26.0 0 2.070022 3 3 0 1.0 \n",
"3 35.0 1 3.972177 4 1 0 1.0 \n",
"4 35.0 0 2.085672 5 3 1 0.0 \n",
"\n",
" Ticket FamilySize Title_0 Title_1 Title_2 Title_3 Title_4 \\\n",
"0 A/5 21171 2 0 0 0 1 0 \n",
"1 PC 17599 2 0 0 1 0 0 \n",
"2 STON/O2. 3101282 1 0 1 0 0 0 \n",
"3 113803 2 0 0 1 0 0 \n",
"4 373450 1 0 0 0 1 0 \n",
"\n",
" Em_C Em_Q Em_S \n",
"0 0 0 1 \n",
"1 1 0 0 \n",
"2 0 0 1 \n",
"3 0 0 1 \n",
"4 0 0 1 "
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Ticket"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 A/5 21171\n",
"1 PC 17599\n",
"2 STON/O2. 3101282\n",
"3 113803\n",
"4 373450\n",
"5 330877\n",
"6 17463\n",
"7 349909\n",
"8 347742\n",
"9 237736\n",
"10 PP 9549\n",
"11 113783\n",
"12 A/5. 2151\n",
"13 347082\n",
"14 350406\n",
"15 248706\n",
"16 382652\n",
"17 244373\n",
"18 345763\n",
"19 2649\n",
"20 239865\n",
"21 248698\n",
"22 330923\n",
"23 113788\n",
"24 349909\n",
"25 347077\n",
"26 2631\n",
"27 19950\n",
"28 330959\n",
"29 349216\n",
" ... \n",
"1279 364858\n",
"1280 349909\n",
"1281 12749\n",
"1282 PC 17592\n",
"1283 C.A. 2673\n",
"1284 C.A. 30769\n",
"1285 315153\n",
"1286 13695\n",
"1287 371109\n",
"1288 13567\n",
"1289 347065\n",
"1290 21332\n",
"1291 36928\n",
"1292 28664\n",
"1293 112378\n",
"1294 113059\n",
"1295 17765\n",
"1296 SC/PARIS 2166\n",
"1297 28666\n",
"1298 113503\n",
"1299 334915\n",
"1300 SOTON/O.Q. 3101315\n",
"1301 365237\n",
"1302 19928\n",
"1303 347086\n",
"1304 A.5. 3236\n",
"1305 PC 17758\n",
"1306 SOTON/O.Q. 3101262\n",
"1307 359309\n",
"1308 2668\n",
"Name: Ticket, Length: 1309, dtype: object"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.Ticket"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"数字の前にSOTONとかC.A.とかついてるが,これらは何を指しているのだろう?\n",
"SC/PARISの文字から,購入先の都市?\n",
"Categorical Dataにしてわざわざ属性を増やすほどでもない気がするので,単純に削除することにする."
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [],
"source": [
"dataset_df.drop(['Ticket'], axis=1, inplace=True)"
]
},
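{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, if one wanted to keep the ticket prefix instead of dropping the column, one possible approach is sketched below. It is not used in this notebook; `ticket_prefix` and the placeholder `'X'` for prefix-less tickets are hypothetical choices made for illustration.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"def ticket_prefix(ticket):\n",
"    # Remove '.' and '/' so that 'A/5' and 'A.5.' share one spelling, then split on whitespace.\n",
"    parts = ticket.replace('.', '').replace('/', '').split()\n",
"    # A single all-digit token means the ticket has no prefix.\n",
"    if len(parts) == 1 and parts[0].isdigit():\n",
"        return 'X'\n",
"    return parts[0]\n",
"\n",
"tickets = pd.Series(['A/5 21171', 'PC 17599', '113803', 'SOTON/O.Q. 3101315'])\n",
"print(tickets.map(ticket_prefix).tolist())\n",
"```\n",
"\n",
"The resulting prefixes could then be one-hot encoded, at the cost of many sparse columns, which is the trade-off that motivated dropping the column here."
]
},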
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Pclass"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"# Create categorical data for Pclass\n",
"dataset_df.Pclass = dataset_df.Pclass.astype('category')\n",
"dataset_df = pd.get_dummies(dataset_df, columns=['Pclass'], prefix='pc')"
]
},
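{
"cell_type": "markdown",
"metadata": {},
"source": [
"Keeping every dummy column means `pc_1 + pc_2 + pc_3` is always 1, so the columns are linearly dependent. That is harmless for tree-based models, but it is one plausible reason a linear model might warn about collinear variables. A minimal sketch of an alternative (not what this notebook does), using the `drop_first` option of `pd.get_dummies` on a hypothetical toy frame:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Toy stand-in for the Pclass column.\n",
"toy_df = pd.DataFrame({'Pclass': [1, 2, 3, 3]})\n",
"# drop_first=True omits the first level, so only pc_2 and pc_3 remain.\n",
"encoded = pd.get_dummies(toy_df, columns=['Pclass'], prefix='pc', drop_first=True)\n",
"print(encoded.columns.tolist())\n",
"```\n",
"\n",
"First class is then represented implicitly by both remaining columns being 0."
]
},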
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Cabin</th>\n",
" <th>Fare</th>\n",
" <th>PassengerId</th>\n",
" <th>Sex</th>\n",
" <th>Survived</th>\n",
" <th>FamilySize</th>\n",
" <th>Title_0</th>\n",
" <th>Title_1</th>\n",
" <th>Title_2</th>\n",
" <th>Title_3</th>\n",
" <th>Title_4</th>\n",
" <th>Em_C</th>\n",
" <th>Em_Q</th>\n",
" <th>Em_S</th>\n",
" <th>pc_1</th>\n",
" <th>pc_2</th>\n",
" <th>pc_3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>22.0</td>\n",
" <td>0</td>\n",
" <td>1.981001</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>38.0</td>\n",
" <td>1</td>\n",
" <td>4.266662</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>26.0</td>\n",
" <td>0</td>\n",
" <td>2.070022</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>35.0</td>\n",
" <td>1</td>\n",
" <td>3.972177</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>1.0</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>35.0</td>\n",
" <td>0</td>\n",
" <td>2.085672</td>\n",
" <td>5</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Age Cabin Fare PassengerId Sex Survived FamilySize Title_0 \\\n",
"0 22.0 0 1.981001 1 1 0.0 2 0 \n",
"1 38.0 1 4.266662 2 0 1.0 2 0 \n",
"2 26.0 0 2.070022 3 0 1.0 1 0 \n",
"3 35.0 1 3.972177 4 0 1.0 2 0 \n",
"4 35.0 0 2.085672 5 1 0.0 1 0 \n",
"\n",
" Title_1 Title_2 Title_3 Title_4 Em_C Em_Q Em_S pc_1 pc_2 pc_3 \n",
"0 0 0 1 0 0 0 1 0 0 1 \n",
"1 0 1 0 0 1 0 0 1 0 0 \n",
"2 1 0 0 0 0 0 1 0 0 1 \n",
"3 0 1 0 0 0 0 1 1 0 0 \n",
"4 0 0 1 0 0 0 1 0 0 1 "
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [],
"source": [
"# Prepare data\n",
"X_train = dataset_df[:train_count].drop(['PassengerId', 'Survived'], axis=1)\n",
"y_train = dataset_df[:train_count].Survived\n",
"X_test = dataset_df[train_count:].drop(['PassengerId', 'Survived'], axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"様々なアルゴリズムで学習をさせてみる.\n",
"\n",
"[Titanic Top 4% with ensemble modeling | Kaggle](https://www.kaggle.com/yassineghouzam/titanic-top-4-with-ensemble-modeling)を参考に,以下のアルゴリズムで学習を行う.\n",
"\n",
"- SVC\n",
"- Decision Tree\n",
"- AdaBoost\n",
"- Random Forest\n",
"- Extra Trees\n",
"- Gradient Boosting\n",
"- Multiple layer perceprton (neural network)\n",
"- KNN\n",
"- Logistic regression\n",
"- Linear Discriminant Analysis\n",
"- xgboost"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/site-packages/sklearn/discriminant_analysis.py:388: UserWarning: Variables are collinear.\n",
" warnings.warn(\"Variables are collinear.\")\n",
"/usr/local/lib/python3.6/site-packages/sklearn/discriminant_analysis.py:388: UserWarning: Variables are collinear.\n",
" warnings.warn(\"Variables are collinear.\")\n",
"/usr/local/lib/python3.6/site-packages/sklearn/discriminant_analysis.py:388: UserWarning: Variables are collinear.\n",
" warnings.warn(\"Variables are collinear.\")\n",
"/usr/local/lib/python3.6/site-packages/sklearn/discriminant_analysis.py:388: UserWarning: Variables are collinear.\n",
" warnings.warn(\"Variables are collinear.\")\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10d188240>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Modeling step Test differents algorithms\n",
"N_SPLITS = 4\n",
"kfold = StratifiedKFold(n_splits=N_SPLITS)\n",
"\n",
"random_state = 42 # 乱数のseed\n",
"\n",
"classifiers = []\n",
"classifiers.append(SVC(random_state=random_state))\n",
"classifiers.append(DecisionTreeClassifier(random_state=random_state))\n",
"classifiers.append(AdaBoostClassifier(DecisionTreeClassifier(\n",
" random_state=random_state), random_state=random_state, learning_rate=0.1))\n",
"classifiers.append(RandomForestClassifier(random_state=random_state))\n",
"classifiers.append(ExtraTreesClassifier(random_state=random_state))\n",
"classifiers.append(GradientBoostingClassifier(random_state=random_state))\n",
"classifiers.append(MLPClassifier(random_state=random_state))\n",
"classifiers.append(KNeighborsClassifier())\n",
"classifiers.append(LogisticRegression(random_state=random_state))\n",
"classifiers.append(LinearDiscriminantAnalysis())\n",
"classifiers.append(xgb.XGBClassifier(seed=random_state))\n",
"\n",
"cv_results = []\n",
"for classifier in classifiers:\n",
" cv_results.append(cross_val_score(classifier, X_train,\n",
" y=y_train, scoring='accuracy', cv=kfold, n_jobs=4))\n",
"\n",
"cv_means = []\n",
"cv_std = []\n",
"for cv_result in cv_results:\n",
" cv_means.append(cv_result.mean())\n",
" cv_std.append(cv_result.std())\n",
"\n",
"cv_res = pd.DataFrame({'CrossValMeans': cv_means, 'CrossValerrors': cv_std, 'Algorithm': ['SVC', 'DecisionTree', 'AdaBoost',\n",
" 'RandomForest', 'ExtraTrees', 'GradientBoosting', 'MultipleLayerPerceptron', 'KNeighboors', 'LogisticRegression', 'LinearDiscriminantAnalysis', 'XGBClassifier']})\n",
"\n",
"g = sns.barplot('CrossValMeans', 'Algorithm', data=cv_res,\n",
" palette='Set3', orient='h', **{'xerr': cv_std})\n",
"g.set_xlabel('Mean Accuracy')\n",
"g = g.set_title('Cross validation scores')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"hyper parameter tuningを行う.この辺りは中のアルゴリズムを知ってないと少し難しそう \n",
"> grid search optimization for AdaBoost, ExtraTrees , RandomForest, GradientBoosting and SVC classifiers "
]
},
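{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the tuning cells below easier to read, here is a minimal sketch of how `GridSearchCV` enumerates candidates. It uses synthetic data and a deliberately tiny grid; `X_toy`, `y_toy`, and the grid values are assumptions for illustration, not the settings used in this notebook.\n",
"\n",
"```python\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.model_selection import GridSearchCV, StratifiedKFold\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"# Synthetic binary classification data standing in for X_train / y_train.\n",
"X_toy, y_toy = make_classification(n_samples=200, n_features=8, random_state=0)\n",
"\n",
"cv = StratifiedKFold(n_splits=4)\n",
"grid = {'max_depth': [2, 4], 'min_samples_leaf': [1, 5]}\n",
"\n",
"# 2 x 2 = 4 candidates, each evaluated on 4 folds = 16 fits, plus one final refit.\n",
"search = GridSearchCV(DecisionTreeClassifier(random_state=0),\n",
"                      param_grid=grid, cv=cv, scoring='accuracy')\n",
"search.fit(X_toy, y_toy)\n",
"\n",
"print(search.best_params_)\n",
"print(search.best_score_)  # mean cross-validated accuracy of the best candidate\n",
"```\n",
"\n",
"The number of candidates is the product of the list lengths in the grid, which is why a grid with lists of length 2, 2, 2, 2, and 7 reports 112 candidates and, with 4 folds, 448 fits."
]
},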
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 4 folds for each of 112 candidates, totalling 448 fits\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[Parallel(n_jobs=4)]: Done 348 tasks | elapsed: 1.8s\n",
"[Parallel(n_jobs=4)]: Done 448 out of 448 | elapsed: 2.5s finished\n"
]
},
{
"data": {
"text/plain": [
"0.80471380471380471"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Adaboost\n",
"DTC = DecisionTreeClassifier()\n",
"\n",
"adaDTC = AdaBoostClassifier(DTC, random_state=7)\n",
"\n",
"ada_param_grid = {'base_estimator__criterion' : ['gini', 'entropy'],\n",
" 'base_estimator__splitter' : ['best', 'random'],\n",
" 'algorithm' : ['SAMME','SAMME.R'],\n",
" 'n_estimators' :[1, 2],\n",
" 'learning_rate': [0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 1.5]}\n",
"\n",
"gsadaDTC = GridSearchCV(adaDTC, param_grid=ada_param_grid, cv=kfold, scoring='accuracy', n_jobs=4, verbose=1)\n",
"\n",
"gsadaDTC.fit(X_train, y_train)\n",
"\n",
"ada_best = gsadaDTC.best_estimator_\n",
"gsadaDTC.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 4 folds for each of 54 candidates, totalling 216 fits\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 11.7s\n",
"[Parallel(n_jobs=4)]: Done 192 tasks | elapsed: 52.3s\n",
"[Parallel(n_jobs=4)]: Done 216 out of 216 | elapsed: 58.0s finished\n"
]
},
{
"data": {
"text/plain": [
"0.83052749719416386"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# ExtraTrees \n",
"ExtC = ExtraTreesClassifier()\n",
"\n",
"# Search grid for optimal parameters\n",
"ex_param_grid = {'max_depth': [None],\n",
" 'max_features': [1, 3, 10],\n",
" 'min_samples_split': [2, 3, 10],\n",
" 'min_samples_leaf': [1, 3, 10],\n",
" 'bootstrap': [False],\n",
" 'n_estimators' :[100, 300],\n",
" 'criterion': ['gini']}\n",
"\n",
"\n",
"gsExtC = GridSearchCV(ExtC, param_grid=ex_param_grid, cv=kfold, scoring='accuracy', n_jobs=4, verbose=1)\n",
"\n",
"gsExtC.fit(X_train,y_train)\n",
"\n",
"ExtC_best = gsExtC.best_estimator_\n",
"\n",
"# Best score\n",
"gsExtC.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 4 folds for each of 54 candidates, totalling 216 fits\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 8.7s\n",
"[Parallel(n_jobs=4)]: Done 192 tasks | elapsed: 34.2s\n",
"[Parallel(n_jobs=4)]: Done 216 out of 216 | elapsed: 40.5s finished\n"
]
},
{
"data": {
"text/plain": [
"0.83726150392817056"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# RFC Parameters tunning \n",
"RFC = RandomForestClassifier()\n",
"\n",
"# Search grid for optimal parameters\n",
"rf_param_grid = {'max_depth': [None],\n",
" 'max_features': [1, 3, 10],\n",
" 'min_samples_split': [2, 3, 10],\n",
" 'min_samples_leaf': [1, 3, 10],\n",
" 'bootstrap': [False],\n",
" 'n_estimators' :[100, 300],\n",
" 'criterion': ['gini']}\n",
"\n",
"gsRFC = GridSearchCV(RFC, param_grid=rf_param_grid, cv=kfold, scoring='accuracy', n_jobs=4, verbose=1)\n",
"\n",
"gsRFC.fit(X_train,y_train)\n",
"\n",
"RFC_best = gsRFC.best_estimator_\n",
"\n",
"# Best score\n",
"gsRFC.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 4 folds for each of 72 candidates, totalling 288 fits\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[Parallel(n_jobs=4)]: Done 76 tasks | elapsed: 3.3s\n",
"[Parallel(n_jobs=4)]: Done 288 out of 288 | elapsed: 10.2s finished\n"
]
},
{
"data": {
"text/plain": [
"0.82603815937149272"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Gradient boosting tunning\n",
"\n",
"GBC = GradientBoostingClassifier()\n",
"gb_param_grid = {'loss' : ['deviance'],\n",
" 'n_estimators' : [100, 200, 300],\n",
" 'learning_rate': [0.1, 0.05, 0.01],\n",
" 'max_depth': [4, 8],\n",
" 'min_samples_leaf': [100, 150],\n",
" 'max_features': [0.3, 0.1]}\n",
"\n",
"gsGBC = GridSearchCV(GBC, param_grid=gb_param_grid, cv=kfold, scoring='accuracy', n_jobs=4, verbose=1)\n",
"\n",
"gsGBC.fit(X_train, y_train)\n",
"\n",
"GBC_best = gsGBC.best_estimator_\n",
"\n",
"# Best score\n",
"gsGBC.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 4 folds for each of 28 candidates, totalling 112 fits\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[Parallel(n_jobs=4)]: Done 76 tasks | elapsed: 3.7s\n",
"[Parallel(n_jobs=4)]: Done 112 out of 112 | elapsed: 7.7s finished\n"
]
},
{
"data": {
"text/plain": [
"0.8271604938271605"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# SVC classifier\n",
"SVMC = SVC(probability=True)\n",
"svc_param_grid = {'kernel': ['rbf'], \n",
" 'gamma': [0.001, 0.01, 0.1, 1],\n",
" 'C': [1, 10, 50, 100,200, 300, 1000]}\n",
"\n",
"gsSVMC = GridSearchCV(SVMC, param_grid=svc_param_grid, cv=kfold, scoring='accuracy', n_jobs=4, verbose=1)\n",
"\n",
"gsSVMC.fit(X_train, y_train)\n",
"\n",
"SVMC_best = gsSVMC.best_estimator_\n",
"\n",
"# Best score\n",
"gsSVMC.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 4 folds for each of 6 candidates, totalling 24 fits\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[Parallel(n_jobs=4)]: Done 24 out of 24 | elapsed: 1.4s finished\n"
]
},
{
"data": {
"text/plain": [
"0.87314094273545273"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# xgboost ref: https://www.kaggle.com/phunter/xgboost-with-gridsearchcv\n",
"# xgboostは欠損値があっても上手く動作するので,前処理の方法を他の手法と変えてみてもいいのかもしれない\n",
"XgbC = xgb.XGBClassifier()\n",
"\n",
"xgb_param_grid = {'learning_rate': [0.1, 0.05, 0.01], #so called `eta` value\n",
" 'max_depth': [4, 8],\n",
" 'min_child_weight': [11],\n",
" 'silent': [1],\n",
" 'subsample': [0.8],\n",
" 'colsample_bytree': [0.7],\n",
" 'n_estimators': [100], #number of trees, change it to 1000 for better results\n",
" 'seed': [random_state]}\n",
"\n",
"gsXgbC = GridSearchCV(XgbC, xgb_param_grid, n_jobs=4, cv=kfold, scoring='roc_auc', verbose=1, refit=True)\n",
"gsXgbC.fit(X_train, y_train)\n",
"\n",
"Xgb_best = gsXgbC.best_estimator_\n",
"\n",
"# Best score\n",
"gsXgbC.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cc15eb8>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cc15be0>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cc47a90>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10c96c668>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAETCAYAAADZHBoWAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzsnXd4VFX6xz/T0gsQEoqIgMIRRECKAlYU3UWx7K6uYltZQQUFlWJBXbGCdRVQQBHL6oq4wg/7orA2RJEIAgJHqnQICenJtHt/f5yZyWQykwKZ1PN5njyZe889954zk5zvnPd9z3sspmmi0Wg0Gg2Atb4boNFoNJqGgxYFjUaj0QTQoqDRaDSaAFoUNBqNRhNAi4JGo9FoAmhR0Gg0Gk0Ae303QFO/CCFuBm4BUoAYYDvwoJTyx2O875XAHVLK84QQjwJbpZRvHeW9BgA3SylvE0J0ArYB633FNqAYmCClXHEsbY7w7H8Av0gplxxrP0LuexNwpZRy+LHeqxrPmgcskFJ+Ge1naRo/WhSaMUKIJ4FzgL9KKX/3nTsf+FgI0U9Kuas2niOl/Mcx3uIUoEPQcYmUso//QAjxV+ANoOsxPicc5wMboVb6US9IKUfVdxs0jQctCs0UIUQb4C7gRCnlfv95KeVyIcQEINF33U7gR6AXMAVw+37HABnAm1LKh3zXPgpcB2QDW4Ke9QawQUr5rBCiO/AikIb6lj9DSjlfCHEe8ARqptITiAVuB7YCjwKpQojXgUfCdCcNCPRBCHELMB7wAgdRM5bfhBCpwEtAH8AEPgOmSCk9QohHgD8BLl/7bwL+DPQHnhFCeIHLg/pRCkwHLgTaAy9KKV8QQtiAZ4DLgDzfe9dDSnleJZ9Fqu89ORVwAMuAyb52/R241fd+twKmSyln+2YaN/s+pzzgTV/7DZQ4uoAbpZQbhBBfAbOA1b57fwqc4bvfA1LK94QQCcAcYCCQS5kQ3hTSVjvwNDAc8ADfA2NRfxOtpZR3+K6b6j/2PT8HOBmYCzwEtJdSunzv1+/ARcDeSt6HCp9P8N+tpvbQPoXmyyBgU7h/LCnlv6SUm4JObZBSdgf+D5gI/E1K2R81gNwvhGgthLgc+AtqwB0MpIbe1zeg/Ae4T0rZDzgXmCSEGOi75AzgOSnlacBrwFQp5W7gH8C3UsqRvuvihRBrfT+/owaSab5nnA/cAwyRUvYG/g38nxDCAsxADSinogb73r7nH48SyAG+fi0FzpBSvoQaSCdLKReHdCcWOCylPBO4EpguhIgDRgH9UMI2CDgxwvsfzD+BTN97chrQGpgghEgCRgMX+96Tq1EDsp9TgPOklEN8x+cC46SUPYEVwOQwz+oC/FdKeTpwb9D9HkJ9STwZGOprRzjG+vrX29fHZF+7quKIlLKHlPJF4FeUaIISg51Syo2VvA9hP59qPFNzFOiZQvPFgvq2DIAQIhn41neYBCyUUk7xHX8LIKU0hRCXAsOFENcC3X33SUQNJIuklAW++81HfVsPphtqkJwvhPCfi0cNAJuA36WUa33nf0Z9Ww9HqPloMPCZEKIP8EfgPSlllq/NbwghXgQ6AcOAM6WUJuAUQsxBDTZPA78APwshPgM+k1Iuq+S987MkqK2xvvfhYuAtKWWpr21zw7wPoQwHTvf5d0C9J0gpC4UQw4FLhBBdUYKbFFRvnZQyP+g4U0q5J6hNfw7zLDdqpuC/ppXv9cUov4wB5Ash3kTNDkMZCvxLSlniO77a18+pVfTx26DXr6I+2/8AI4F5vvNh3wfgWY7u89EcBXqm0Hz5EThZCJEGIKUskFL28Q22b6Mcz34KAYQQicAaoC9qQJmMGmT8AmMJquMJ80wbkOt/ju9ZA4HXfeUlQdeG3i8iUsrvAQmcTvi/aQvKHBFaZgUcvoHwXNRAlQ380yckVVHie75fXC2ofge321uN+9iAq4LekzOAO4QQHY
C1wAnAd8CDIfUKw7XHR6T3z+Xrb+g11W23h/JfJtoIIdqFeV5MJW39D3CGz5R4LrDQdz7s+3AMn4/mKNCi0EyRUu5DmV3eF0J09J/3vT6T8INCV5RYPCil/Aj1jxqL+mf+HLhKCNFCCGEFbgj3WKBUCHG971nHAxtQ5ojK8KAG9bAIIbqhZiFrgP8CVwsh0n1lI1EDyVZf2e1CCIsQIhYVdfWFEKK3rx2bpJTTUGaM3tV5dhg+Aa4XQsT6zGU3ETSIRuC/wN1B7foQuANl4soCHpdS/hf1TRqfHb62+QQYKYSw+vwL10Zo95fAtb7+WYHZwAhfO/v5+pCIMguFxTeLWoAKDvhASlnsKwr7PlTx+WhqGW0+asZIKR8QQlwHvOOzXzuAUuA9lEM2lHXAx8BmIUQuaqDdCJwkpfxUCHEqygZ/BDXdTw95nsvne3hRCHGP73kPSSlX+BzNkVgJPCGEWAzcjc+nEFRuBW6RUv4G/CaE+Cew3DdoZQHDpZSGEGI8MBMVzhqDErInfO1aCKwWQhSivnH7TT4fAc8KIUK/+UbiDUCgBKoQ2IEKma2M8SiBXo96T75EmbQcwN8BKYQoAlb5+nNSNdtSE6ahnNHrUY7rQxHaPRdlistEzQy+QvlqElHmuS0oh/FKKp/pvYoSvjFB58K+D1JKdyWfj6aWsejU2RpN7SGEuAjIkFK+7Tt+ESiVUt5bvy2rHCHENUC+T9ytwAfAUinl7HpumqaO0eYjjaZ2+RX4mxDiFyHEr6jZ0pP13KbqsAF4wDcD2wDso8wBrGlG6JmCRqPRaALomYJGo9FoAjRqR7MvQmEAajVrdUL/NBqNprljA9oBP0kpnaGFjVoUUILwbZVXaTQajSaUs1HrX8rR2EVhP8A777xD27Zt66UBGzZsoGfPnvXy7NqmqfSlqfQDdF8aIo29HwcOHOC6666DoHxhwTR2UfACtG3blg4dOlR1bVQ4ePBgvT27tmkqfWkq/QDdl4ZIU+kHEUzu2tGs0Wg0mgBaFDQajUYTQIuCRqPRaAJoUdBoNBpNAC0KGo1GowmgRUGj0Wg0AbQoaDQajSZAY1+nUDssWABPPgkbN0KPHjBlClxzzVHfbvr06fz6669kZWVRWlrK8ccfT8uWLZkxY0aVdTdt2sSyZcu44447wpZ/88037N+/n6uvrs62uBqNpl7xJxwNTjwaes4wyo5Dz/l/h6tns0FiYq03WYvCggUwYkTZ8fr1ZcdHKQz33XcfAIsWLWL79u1MmjSp2nW7d+9O9+7dI5afc845R9UmjaZBY5rg9a2lCjcghp6rzmAbpTJ7Vhbs2xe5Xk0zT1ss5X9HOheunhaFo2DyZHj//cjl+/aFP3/jjeAb3Ctw1VXwzDM1bsqPP/7Is88+i8Ph4K9//StxcXG88847eDweLBYLs2bNYsuWLSxYsIB//vOfXHTRRfTt25cdO3aQlpbGzJkzWbJkCdu3b+eaa65h4sSJtG3blt27d3PqqafyyCOPkJOTw6RJk3C5XHTu3JkffviBL774ItAGp9PJnXfeSWFhISUlJdx9992cddZZvP/++7z22mvExcVx/vnnM378eD788EPefPNNYmJi6NSpE48++igfffQRH3zwAYZhMH78eHJzc3njjTewWq3069evRgKoacaYJjidUFqqfrtcZYNp6IAYbrCsRwItqM7AHQ2WLIGZM+G332rFshFK0xeFqnC7a3b+GHE6nbzvE6k5c+bwyiuvEB8fzz/+8Q++++472rRpE7h29+7dvPnmm7Rr145rrrmG9evXl7vXzp07ee2114iPj2fo0KFkZWXx6quvcsEFF3DdddexYsUKVqxYUa7Orl27yM3NZd68eWRnZ7Nz506ys7N59dVXeeSRRxg4cCDPPfcce/fuZebMmSxevJikpCSefPJJ3nvvPRISEkhJSWH27Nnk5uZy7bXX8sEHHxAfH8/kyZNZsWIFZ555ZlTeO00jxjSxOJ1w5IgSAJ
dLDaZWn1vTFo1tp5sgS5bA2LFlx7Vg2Qil6YvCM89U/q2+Vy/1xoY7/8svtd6czp07B16npaVx7733kpiYyPbt2+nTp0+5a1u2bEm7du0AaNeuHU5n+Sy3HTt2JCkpCYD09HScTifbtm3jT3/6EwD9+/ev8PyuXbty9dVXM2HCBDweDzfccAO7d++ma9euxMTEYLFYmDRpEuvWreOkk04K3H/AgAF899139O7dO9CHXbt2kZOTwy233AJAUVERu3bt0qKgUd/6S0rULMDpBLcbe26uOgdaBIJxuyEvr+wnPz/y8fLl4e8xbZoWhVpjypTyPgU/998flcdZfd+MCgoKmDFjBl999RUAI0eOJHQXPEsV09Jw5d26dWPNmjV0796dtWvXViiXUlJUVMQrr7zCoUOHuOaaa/jPf/7D9u3bcftmR+PHj+fee+9l27ZtFBcXk5CQwKpVqwJi4O9Dhw4daNeuHfPnz8fhcLBo0aJK/SGaJkwYEagwE7A20WBH04SioooDeW5uxQE+3KDvF8pjYePGY7+HDy0KfnWdNq0s+uj++2vVRheOpKQk+vbty9VXX43dbiclJYVDhw4dc/bF0aNHc8899/DZZ5+RkZGB3V7+I+7UqRMvvfQSn332WcAv0KpVK0aPHs1jjz1GUlISQ4YM4bjjjmPcuHHceOONWK1WOnbsyKRJk/jkk08C92rVqhU33XQTN9xwA16vl+OOO45hw4YdU/s1jQS/CJSWlpmDbLYy+3pjmwnU4Nt65717lcM7uMxbgz2+LBZITVU/J52kfqekQIsW6ne4Y//P1VeDlBXv2aNHrb0VUdujWQhhBV4GegNOYJSUcmtQ+UTgWsAAnpRSLhZCWIA9wBbfZSullBG/sgshOgE7li1bVm+pbDMzM+nXr1+9PDscX3/9NS1btqRXr158//33zJkzh7feeqtadRtaX46WptIPaEB9MYyKM4FgEagG69ato1evXhULgh2n3brBuHFw+eU1a19df1uPi6s4iAcP3pUN8klJRz9rCvUp+Hn33Wp/kd2zZw8XXHABQGcp5c7Q8mjOFK4A4qSUg4QQA4HngMsBhBAtgDuBk4BEYC2wGDgR+FlKeWkU29Wk6dChA1OmTMFms2EYBg888EB9N0nTGKlKBOy1NHSEDnKbNqnjHTugT5/q29pr+m3dai0bsLt2DT+oRzhe//vvnBrGX1cn+MVy5kzYsiUqlo1ozhSeB1ZJKRf4jvdKKY/zvXYAy4HLUKLwrZSysxDiauBeIA8oAe6WMtxcKfCMTsCOF198kfT09Kj0Q6NpFhgGltJSrC4XFo8Hy1HMBGqE10v8tm10mjoVx5EjR3ULIzYWb1KS+klMLHvtP05ODn8+KQkjPr7R+zhMiwVP69Y1rpeVlcWdd94J9TBTSEEN7n68Qgi7lNLjO94NbERtIj3Nd24/ME1K+b4Q4izgbdQ+zJXSs2dPbT6qBZpKX5pKPyCKfTEMZW7x+wOiLQIeD1sWLaLr4cOwciWsWgWFhZGvt1rVGqPU1PBmmORkrLGxWAFHdFockYhmsNrGNNXnZJrqc7HZyv/ExBzV4rU9e/ZUWh5NUcgHkoOOrUGCMAxoB/jjM/8rhFgBrAY8AFLK74QQ7YUQFilldKYzGk1zweuF4mIlAE4neDzRMQf5cbtVSPcPPygR+OknuhYVlZV36aJMIf/7X/gFpELA+PG126aGRFUDvs2mPhOHo84jt6IpCiuAS4GFPp9C8GKAIyjzkFNKaQohcoEWwMNANvC0EKI3sFsLgkZzFPhFwL9aONoi4HIpEfj+eyUEP/1U3nnbtSvZXbuSNnw4DBwI/kWakRyn48bVbvvqGq838oBvtZYN+HZ7gzNjRVMUFgMXCiG+R60MHymEmABslVJ+KIQYCvwghDCA74AvgJ+At4UQl6BmDDdFsX0aTdPB6y1vDvJ4yg/8tS0CTiesWaNmAStXQmamCk/1IwQMGqQEYOBASE9n77p1pIWaXU
Idp127Hl30UV3i9ar3F8J/w/ebdhrggF8doiYKUkoDuC3k9Oag8odRM4NgjgCXRKtNkViwYQFPfvskG7M20iO9B1POnsI1PY/Nm79lyxaeeeYZSkpKKC4u5txzz2XcuHFVLkirK84880xWrFjBE088wciRI2nfvn2gbNu2bUydOpV//etfEeu//fbbXH/99Tpra33h8ZQ3B3m90RWB0lL4+ecyEfj5Z/VcP927lxeBtLTq3/vyyxuOCBhGWdK9CAO+Oy0NOnRolAN+dWj2i9cWbFjAiA/KVjSvP7Q+cHy0wpCfn8+ECROYOXMmnTp1wuv1cuedd7JgwQJGhFs9XY8cbcjq7Nmzuf7663XW1rrC7S4LEXW5youAxVL7IlBSAqtXK1PQDz8oEXC5yp7Xo4ca/AcPhtNPh1atavf50SDSgG+1lr12OMrs+JFwOJqsIEAzEIXJSyfz/sbIWVL3FYTPknrj4hu578vwWVKv6nEVz1wUOZ/SsmXLOOOMM+jUqRMANpuNp556CofDUSFTanp6Oi+88AKxsbG0aNGCJ598Eo/Hw1133YVpmjidTh555BG6dOkSNrupH7fbzcUXX8ySJUtISEjgtddew2azMXjwYKZPn47X6+XIkSNMnTqVvn37BurdcMMNTJ06leTkZB5//HGSkpLKhfd+/vnnFTK5vvfee+Tl5TF16lR69eoVSA8+f/58PvnkE+x2O/3792fy5MnMnDmTPXv2kJ2dzb59+7j//vs5++yzA/evLGvru+++i2EYNc7aOmvWLFJSUhp31lafCNhycsC/gtY/UEVDBIqLlQj4fQJr15YlhbRa4ZRT1Exg0CAlAi1a1O7zj5XgAT94kA/+8Q/4VmuDyLbaUGnyolAVbiN8NtRI56vDoUOHOP7448udSwwKHfNnSjVNkwsuuIB3332XNm3a8OabbzJ79mzOOOMMWrRowdNPP83WrVspLi4Om900GIfDwUUXXcTSpUu54oor+Pjjj5k/fz4rV67k3nvvRQjBRx99xKJFi8qJgp85c+YwePBgJk+ezKeffsq7774LqEysoZlcx4wZw9tvv83UqVNZtGgRoHIqffbZZyxYsAC73c64ceP43//+B0BMTAzz5s1jxYoVzJ8/v5woVJa19cMPPyQ2NrbGWVsffPDBQF8aTdZWt7u8Ocg0wWbD6vWWOStrk6Ii5Qz2m4N++aXMTm61qoSQflPQ6aercND6IjhSp4kO+KZpYmKWe234RM7AwDAMDIxy5TaLjcQYvZ9CjXnmomcq/Vbfa3Yv1h+qmCW1V5te/HLb0WVJbd++PRtDElTt3r2bAwcOAGWZUo8cOUJSUlIgXfaAAQN4/vnnmTx5Mjt37mTs2LHY7XbGjBkTNrvp6tWrefHFFwG4+eabueqqq5g6dSpdunShc+fOtGzZkoyMDF5++WXi4uIoKioKZD0NZefOnQwfPhyAvn37BkShqkyufrZv307v3r1xOFTUeP/+/dmyRWUr8SfJa9u2LS6/CcJHZVlb4+LiAGqctfXpp58mOTm5YWdtdbnKm4N8IgBExzRRUKDWBvhDRNetK1sBbLMpEfDPBAYMgOTkyu9X23g84Qd8f6ROTEytr6PwL9w1McsNyv4BOHhQDi4vcBWQW5obsb7/dXXK/ecwKduowffagiXggwx+7ceCRYtCNJhy9pRyPgU/95919FlShwwZwty5cxkxYgQdO3bE7XYzffp0Bg8ezEknnRTIMtqyZUsKCws5dOgQGRkZrFq1ik6dOvHjjz+SkZHB/PnzWbNmDc8//zwPPvhgheymy5cvr+AMNk2TefPmBXwXTzzxBM8++ywnnngiM2bMYO/evWHbfOKJJwYGcf++DZVlcg1dCd+lSxdef/11PB4PNpuNn376iSuuuILNmzdX6lyvLGury+UiJiamxllb77vvPs4444yGlbU1VASgbPCPhgjk5SkRWLlSCcH69WXmFbtdpZDwi0D//iofT13iD5GNjYWYGI
y4WFxWM8JA7MH0uDE91Rtoq1seaSAGlYE43EAMUGqUUuwuPuqu++9NA53QNHtR8DuTp303LRB9dP9Z9x9T9FFSUhLTp0/nwQcfxDRNioqKGDJkCNdeey2rVq0KXGexWHj88ccDUUmpqalMmzYNi8XChAkTePfdd/F4PNx+++1hs5uG48orr2TGjBkMHDgQgMsuu4w777yTlJQU2rZty5EIKQXGjBnDLbfcwg033BBYHR4pkysoEZk0aRKDBw8GQAjBsGHDGDFiBIZh0K9fP4YOHcrmzZvDPs9PZVlbr7/+eiwWS42ztj722GPEx8fXb9ZW01T5ePy5gyC6M4EjR8pEYOVK+PXXsp3MHA418A8cWCYCCQm134bK8HjUgBwTg+lw4Iy1UYoHt9eN2yjEU5KL1WItNyjXJg19IG5IRC33UV2gs6TWLk2lLw2iH4cPKzE4xsEtYkqFnBz48ccyEdi0qUwEYmKgb98yEejXD+Ljj6kdNcY3EzBjYnDbLJTG2Vi1NpNTTj0Ft9eN1WrFammcETx1luaiCixYaJfcrsb16jNLqkbTPMnLU3H9tTkjyM4uMwX98IMSAT+xsWXhoQMHwmmn1Y8I2O1KAGwmruQ43BYDl7cECxZsXhse04OJid2mh51jYcnmJcxcNZPfsn+rtXVVwehPR6OpTYqKlFP3WAUhKwtWrqT9J5+ofQZ++62sLC4OzjqrTAj69FHCUJd4PHitFkptBi6bBVeiHbdF+QFsVhvgARPsVj3E1BYew8PCXxcy+YvJgXO1sa4qFP2JaTS1hdOpzDpHEz568KCaAfjXCWxV+1G1BvWt/5xzyhzDvXsrE1FdYZoYHjeleHDZwGW34E524LWa2Cz2gP3filXb7ENwepwUuYsodBVS4CqgyFXxdZGriAJXQbnXRa4iCt2FFDoL1W9XIaWe0ojPmfbdNC0KGk2DwuNRfoTqCsK+fWWmoO+/V5vK+ElMhPPOg0GD2Jqezkl//rNyFtcRpmHgdBXjtBi47RZcNvAkxWB1xAX8ABaa5uBhmialnlIKXWogLnIXUeAsoNBdNmBv+X0LSwuXlhvYw/0UuYtweV1VPzQCSTFJJDmSSI1NpUNyBxJjEvl+9/dlYaxBbMzSezRrNA0H04RDh8qcyuG2l+zfv8wp/MMPELz4MCkJzj+/bCZw6qmBFcvF69ZFXRDcHhclriLcFjUL8Dgs0DIBm73sudEcKIJt5N3SujHu9HFcfnL1cyGZpkmxu7jct+1yA3TQN+6I38qDrvea1djBbUvFU1aLVQ3kMUmkJ6bTydGJ5NhkkhxJJMYkkhyTTGJMYuCayn4SHAlhHfFD3xrKpsObKpzvkV57ezRrUdBojpWDB8teR9peMpiUFBg6tEwETjml9tNWRMBreCl2F+P2OHFbTVw2Czgc2NJTA36QWl47XSlLNi9h7Kdl78+mw5sY++lYvtr5FZ1bdq5obgkxqfgH93DfnquDw+oIDNjtk9uXH7yDBvOkmLLXWfuy6NmtZ7lBPjkmmTh7XNQTXo47fVy598vPsayrCkWLgkZzLGRnq5XB/sFg5szw1yUlwcSJSgR69Kj9tBVhMEyDUk8pTo8Tt8eJy2Zi2G3Y4hOxxLcCq7VOBcBPTkkOmfszWb1vNfPXzA97zcKNCyPWj7PFBQbjtNS0cgN28Gv/wJ4Uq8wwSTEhr2OSiLXX3EG/zr2OXsfXT0iqfwY1c9VMtuRsqZV1VaFoUdBojhZ/6Gnwt8PgKKFgSkvhllui1hTTNHF6nZR6nLgNNy53KV6bFVtMLJakeIhLVWsDotaC8Bimwdacrazet5qlvy5le+Z2th3ZVmU9q8XKv/70r8DAnhybTKJDCYHDVtcbcDYsLj/5ci4/+fKjXqdQFVoUNJqjobi4YuhpYaEyA3nD2KS7dq21R5umidtwU+px4jLcuA0XbrcTqyMGa0wsJCZgiWuFvR7SOxe5ilhzYA2r960mc18mmfszyXOWbdWeHJPMuS
ecS//2/enfvj8Pf/Uwv2VXFFKRJjiv03l12HKNHy0KGk1NcTqV2SjYBGQYcOed5TeeCeYYtpd0e92UektxeT24TTcurwuL18AWo/IGEZuCPS6uznP8m6bJnvw9rN63Wv3sX83GrI0YphG4plOLTlx44oX0b9+flMIUhg8c7lvHoLjrjLvC2sjHnd7It+NsxGhR0GhqgsejFpaF+gSefx4+/1wtJrvmGpg9+6i2l/QaXkq9pTi9btymmwPOLFoWt8JmWgLZQu2xSWoBWx2LgNPjZMOhDazevzowEzhYVOZkj7XFqhlAOzUL6Ne+H60TWgfK161bV04QoKKNvGurrjWOPmrqGKZRTmitFpUiJFoLA7UoaDTVxR96GjoYf/QR/POf0LEjzJ2rdiH7y1+qvJ3fEezyunGZbtyGG8M0sJoWLKapQlFjYrG1TFMrlutYBLKKsgIO4dX7VrPu4Dqc3rKZUJvENlzS9ZKAKahnRk9ibDVfVOe3kTd1TNPEMI2yDMMWsFlsWCyWwEAf7sdutWO32gPH0UaLgkZTXXwZYsuxYQPcfbdacPb66xG3pfQ7gp1el88P4MZjeLFZrGoRsGFgcTiwOeLULCAuDiwWjAMH6iSPkdfwIrNlQAAy92WyM29noNxmsdEjvUdAAPq3789xycc1mD3H65rQAb6qgd1qsWKz2LBb7disNpURtoG+d1oUNJrqkJOjTEfB/8iHD8Pf/672SZg/H04+uVyVAlcBLq8Hl+nCY3ixoKb+/p3E7P7dwoJEoK7Id+bz8/6flQDsz+Tn/T9T6CoMlLeIbcH5nc8PmIP6tO0TlQ1dGgKRzDP+HwvlB/wEWwIt41o2igH+aIiaKAghrMDLQG/ACYySUm4NKp8IXAsYwJNSysVCiHjgbSADKAD+JqXMilYbNZpqkZ+voo2CzTcuF4werfZPnjwZ/vCH8lWcBeS7C5QIGAY2TLA7fI7h2DoVAdM02ZG7o9wsQGbLcgu+urbqSr92/QKzgBNbndjoUlsHtrGsZICvDfNMoiORhJg63o+iDonmTOEKIE5KOUgIMRB4DrgcQAjRArgTOAlIBNYCi4ExwHop5VQhxDXAg77rNJr6obhYiUKwIJgmPPig2tRm+HAVdRT8hpQKAAAgAElEQVSE2+smz5mPzW5XAlDHIlDiLmHdwXWBiKDV+1aTU5ITKI+3xzPo+EGBWUDfdn1pGd+yTtpWXWpqnrFYLNgsNmxWW53a35si0RSFs4DPAaSUPwgh+geVFQG/owQhETCC6jzte/0Z8FAU26fRVI7LpcxGoQ7eN9+Ed96Bnj2VgzlosDdNk8OlOdgcDsjIqJNm7i/YXy4iaMOhDbgNd6C8Q0oHzul4TmAW0D29e4NJaW0YBlggxhaDw+podPb3pkg0/zJSgLygY68Qwi6l9PiOdwMbUalWpoWpUwCkVudBGzZs4GBw/pk6JjMzs96eXds0lb4ccz88Huw5ORUGo8S1a+ny0EN4WrRg6+TJuLduLVee5ymgxFOK0aoVHDhwbG3wsW7durJmGR52FO5gY+5GNuVtYlPeJg6VljnA7RY7JyafSI8WPeie2p3uqd1pHVcWFmoeNNl4sPYyataUtWvXYrFYsFvtOCwO4mxxjXKFcmP+P8nKqtwiH01RyAeSg46tQYIwDGgHdPYd/1cIsSKkTjKQW50H9ezZU2/HWQs0lb4ccz9ME/bvh+OPL39+50546imw2XC88QbdBwwoV+z0OjlUdAhb64xa2/RmReYKSlqWBPwBaw+spcRTEihPi0/jDyf+ITALODXjVOIddbzrWiV4vB5sVhsxthg2rt/I4IGDG6UIBNPY/0/27NlTaXk0RWEFcCmw0OdTWB9UdgQoAZxSSlMIkQu08NW5GFiFEo5vo9g+jSY84UJPCwtVpFFuLjz7LIQIgmmaHC7OxpbS4qgFwTANtuVsK7dCeGtO2UzEgoWTW59Mv/b9Av6ATi06NSjTisfwYLfaibHFEGOLId
GRGFiwluTQeYsaA9EUhcXAhUKI71F7cowUQkwAtkopPxRCDAV+EEIYwHfAF77fbwohvgNcqOgkjabuCBd6ahhqVbKUcPPNMGJExWrOXCxxcZCcXKEsEsXuYtbsXxPwB/y872dynWWT46SYJE5rdRpDug2hf/v+nNbuNFJiU46pe7WJaZp4TS92q51YWywxthgSHAkVVi1rGhdREwUppQHcFnJ6c1D5w8DDIeXFwFXRapNGUynhQk8BnnkGli5V+yL/4x8VqpW4Syj2FGNr3R4Iv2nMZeIy9hbsLZsF7FN5goI3dOmU2okLulwQMAWJNMGvG36lV6/6SdMcimmaeA0vDpuDGFsMcfY44h3xOsqnidEwQhA0mvqmpESlwg7NabRkCcyYAZ06qXxGIZvhGKZBdmkOtow2YLFE3DTm/mX3l8sWGmuLpW+7vgEB6NeuH+mJ6dHsYY3xi0CMXZmCYm2xWgSaAVoUNBqXq2LWU4D162HCBLVBzvz5YVNYZBdnY23ZKiAWM1eF32SnyF3ExV0vDvgCemb0PKoNXqKJf12Aw+Yg1hYbEIGG5LPQRB8tCprmjdersp6GmoyysmDkSJUKe/ZsEKJC1UJnPs54B9ag3ETh9gbw8+qlr9Zas2sDvwj4ncLxjnhibbFaBJo5WhQ0zRd/1tPQQdDphFGjVFjqfffBRRdVqOrxesg1SrCmtil3PikmqZyZyE/XVrW3yc7R4jWU/yLWHhvwCWgR0ISiRUHTfMnKUsIQjGnClCmwerXaA+GOO8JWPVyajbVN+RXLC39dGFYQoH42jfEaXiwWS5k/wB5PjL3mqa01zQstCprmSU4OuN0VZwnz58OCBdCrFzz3XNh8RfkleXhapmINKtt8eDP3L7uflNgUJg6ayIINC+p80xiv4cVqsQbMQQmOBL0uQFNjtChomh8FBeFDT7/5BqZOhfR0eO21sPsYuN1O8uIt2GLjAueKXEXc+vGtlHpKmTVsFsO6DmNU31FR7kTZamG/OSjeHq9FQHPMaFHQNC9KStSq5NBIox07YMwYFUU0bx60b1+xrmly2FqKLTk16JTJfcvuY2vOVkb1HcWwrsOi0uzghWJ+c5BeKKaJBloUNM0Hl0ttjBMqCAUFKtIoN1fttdy/f9jqR1z5GGkpBBuU3t3wLos2LeK0tqfxwNkP1FpT/SKAiV4trKlTtChomgeGoRzLoYLg9cLtt8OWLSri6Oqrw1Z3uksobJGALciPsDFrIw8tf4gWsS2YM3zOUe1P7CfSauGs+CzSEtKO+r4aTU3RoqBp+pimSmMdLvTy6adh2TI45xx4KPz2HabXy+FECzZ7mb2+wFnALR/dQqm3lDmXzqFDSs2y9BqmgWEYgdXCcbY44hxxerWwpt7RoqBp+vhDT0NF4f/+D2bNipjCAgDTJCfGAzGxQadM7vnyHnbk7mBM/zFc2OXCKpugVwtrGgtaFDRNmyNHlC8hNNLol19g4kSV1fSNN6BFi4p1TZMSq0FxnL2c2eitdW/xofyQ/u37c++Z90Z8tNfw6tXCmkaHFgVN06WwEIqKKgrCwYNqbwSnE155BbqGX21sWCA7gXLO3fUH1zP1q6m0jGvJ7EtmRwwBNU2T41KO0+YgTaNDi4KmaVJaqqKJQgWhtFQ5lA8cgAcegAsuCF/fMMhOtmG1ln2zz3fmc+vHt+Lyuph/2XzaJ4cJW0XNENomtdWCoGmU6L9aTdPD7Vahp6GCYJpw//3w88/w5z+rdQnhMAyKkuNw4g2qajLxvxP5Pe93xp0+jiGdh4St6jW8tE5orReRaRotWhQ0TQvDwH7kSEVBAHj1VVi4EHr3VlFH4ez7hoEnIZ4jlGINusf8NfP5dOunDOowiEmDJ0V4tEFqXGqD2iNZo6kp2nykaTqYJhw8GN6Z+9VX8NhjkJERMYUFAA4Hh2PcWM0yQVizfw2PffMYafFpzLp4FnZrxX8b0zSJd8Q3qO0yNZqjQc8UNE2Hw4fVIrVQtm
2DsWPB4VCC0K5d+PqmSX5qHB7DEziVW5rLbZ/chsfwMOviWbRNahu2qs1qo1V8xU14NJrGhp4paJoGR46oaKJQs1F+vkphkZcHL7wAffuGr28YuFu3It+VE3AQm6bJ3f+9mz35e5gwcALnnHBO2KqmaZKRlKHDTTVNAj1T0DR+IoWe+lNYbNsGt94KV10Vvr5hQMuWHHbnlYsYmps5l6XblnJWx7O4a+BdYat6DS8ZiRk60kjTZNB/yZrGTaTQU4Dp02H5chgyRIWfhsM0ISGBI1YXhllmelq9bzXTvptGRmIGs4bNCpuIzjAN0hLSdKSRpkkRNfOREMIKvAz0BpzAKCnlVl9ZH+CFoMsHAlcAq4DfgA2+84ullC9Gq42aRo7HEz70FOCDD+Dll6FLF3jppYqJ8PzYbDhTEiksOhQY+HNKchjzyRgM0+Cli18iPTG9QjXTNEmOSSbBkVCbPdJo6p1o+hSuAOKklIOEEAOB54DLAaSUa4HzAIQQVwF7pZSfCyGGAu9KKet+70JN48Iw1MrkMIIQLyXcey+kpMDrr0NqapgbAKaJmZ7O4aIDAUEwTIM7P7+TfQX7uOfMexh8/OAw1Uzi7HGkxkW4r0bTiImmKJwFfA4gpfxBCFEhSb0QIhF4BPB78PoB/YQQXwOHgPFSyv1VPWjDhg0cPHiw1hpeUzIzM+vt2bVNo+iLaWLLzsYaur8yYM/Opuujj2K63ex84AEKioth3bqKtzAMPC1bkrdL4vQ6A07ihTsXsnzHcvql9ePcuHNZF6auBQtpsWnstOys9a6Fo1F8JtWkqfSlMfcjKyur0vJoikIKELyLuVcIYZdSeoLO3Qy8L6U87DveDGRKKb8UQlwHzASurOpBPXv2pEOHmqUuri0yMzPp169fvTy7tmk0fcnKUmGlodE+paVw5ZVq/+WHHqLzzTeHr28YkJpKSZyd7JLsgJP4xz0/8tayt2ib1JbXr3o97D4GpmnSLrldnTmWG81nUg2aSl8aez/27NlTaXk0/7LzgeTgZ4UIAsB1wLyg4+XA/3yvFwOnRa95mkZJXp4KPQ0VBNOEe+6BNWs4csEFKtooHKYJ8fEYSYnklJSFnx4uPszYT8YCMPuS2WEFwTAN0hPTdaSRpkkTzb/uFcDFAD6fwvrgQiFEKhArpdwddHoe8Bff6wuAxjtH09Q+hYVq3UE4x/Lcucq5fNpp7Bk/PnwKC1B1W7UipzgnYDLyGl7GfzaeA0UHuO+s+zj9uNMrVDNMg1bxrY5pdzWNpjEQTfPRYuBCIcT3gAUYKYSYAGyVUn4IdAN2htS5D5gvhBgLFAGjotg+TWPC6VShp+GiiJYvhyeegLZtYd48zEOHwt/DMKBtW4rcxZR6ynIbzVg1g69//5oLOl/Abf1vq1jNNHSkkabZEDVRkFIaQOh/2Oag8p9QEUrBdXYA4dNPapovHo/yI4SbIWzdWj6FRdu2EE4UDAPS0vBaLRwpOhIQhBW7VvD8yudpn9yeF/74QgXTkGmaxNpidaSRptmg01xoGjaVhJ6Sl6dSWBQUwIwZ0KdP5HskJ0N8PIcLDwYE4VDRIW7/9HasFitzLpkTNneR1WKldULr2uyRRtOg0R4zTcPFNNW3/nD+Aa9XzRC2b1e///KXitf4iYmB1FTynfm4Dbeqbni5/dPbySrO4oGzH6Bf+4rRJKZp0iapjc5ppGlWaFHQNFxyctTgH44nnlDpsM8/H+67L/I9TBPS0/F4PeQ78wPmoedXPs/3u7/njyf+kdF9R1eopiONNM0V/RevaZjk5UFJSfhZwsKFKtropJMqT2FhGGr/BIuFw8WHAwP81zu/5sUfX+T4lON57g/PVZgJ6EgjTXNGi4Km4VFUpPwE4fwImZkqhUVqKsyfr1JZhMPrhVatwOEgrzQPr6lmHAcKDzDus3HYrXbmDJ9Di7gW5arpSCNNc0eLgqZh4XQqs1E4Qdi/H0aNUtFIs2fDiSeGv4dpQlISJCTg8r
oocBZgsVjwGB7GfjKW7JJsHj73Yfq07RNSTUcaaTRaFDQNB3/W03DmoJISuPlm5Xh+6CE499yItzGsVmjZEtM0OVx0OBBt9MyKZ/hx748M7zacm/rcVKGejjTSaLQoaBoKlUUamSZMngy//AJ//SuMrugYDr7W20qFlh4pOYKJSpq3bPsyZv00i06pnXjmwmcq+BF0pJFGo9CioGkYVJbldvZsWLwY+vVTG+dEGri9XmjdGqxWStwlFHuKsVgs7C3Yy/jPxxNri2XupXNJiS3vh9CRRhpNGXrxmqb+yc5WA3q4wf7LL+HJJ1VW1HnzIDY2/D18W2oSG4tpmoFkd26vmzEfjyG3NJdpF0yjZ0bP8tV0pJFGUw791UhTv+TlqZTX4QRhyxa1x3JsrIo0ysgIfw9f5lOSktQtXXkBM9D076aTuT+TK8QV3NDrhnLVdKSRRlMRPVPQ1B/FxZFDT3Nz4aabVGbUl16CXr0i38eX+RSgyFWE03ACsHTbUuZkzqFLyy48deFT5fwFOtJIowmPnilo6geXS5mNwgmCxwNjxsDOnXDHHXDFFRWv8eNbsYzFgmEaHCk5gtViZXfebu76/C7ibHHMHT6XpJikctV0pJFGE55qzxSEEJ2AU1BbbHb0ZTTVaGqOx6MijSKtRH7sMfjmGxg6VC1Ui4TXqwTBrv6Ms4qysFqtuA03Yz4ZQ54zj2cvfJYe6T3KVTNMg3ZJ7XSkkUYThmrNFIQQVwMfATOANGClEOL6aDZM00Txh56GmyEAvPeecih37QqzZkW+zjDUaua4OAAKnAWBZHevbXmNNQfWcGWPK7mm5zXlqnkNLxmJGdisEQRJo2nmVNd8dC8wGMiXUh5CbZN5f9RapWm6RNoAB+Cnn1RyuxYt4PXXVbrrSPgynwJ4vB7ynHlYLVY++e0TluxeQre0bky7YFq52YDX8OpII42mCqorCl4pZYH/QEq5HzCi0yRNkyUnR5mOwrF3r1qU5vWqdQmdO1d+r/T0wEt/sruduTuZuHQisdZY5g6fWy6qyDANkmKSSIxJrI2eaDRNlur6FH4VQtwBOIQQfYCxwNroNUvT5MjPV9FG4cxB/hQWWVnw6KNwzjmR7+P1qt3VfDMAf7I7p9fJbR/fRoGrgIk9JtItrVugimmaxNhiaBnfsrZ7pdE0Oao7U7gdOA4oAeYD+Shh0GiqprhYiUI4QTBNmDgR1q+HESPg73+PfB+vF9LS1NabUC7Z3aNfP8r6Q+sZ0XMEQ9sPLVfNarGSnpAe7o4ajSaE6s4UZkkpR6L9CJqa4nJFznoKypm8ZAkMGKA2zokUERSU+VQdliW7WyKX8OYvb9K9dXceG/IYWzZtCVQzTIO2iW11pJFGU02qO1PoKYRIqvoyjSYIf+hpJEFYuhSeegrat4dXX42cwgJU+GrLMvOPP9ndtiPbmLx0MomOROYMn0O8Iz5wjdfwkp6Qjt2m12hqNNWluv8tBrBLCCFRJiQApJTnR6oghLACLwO9AScwSkq51VfWB3gh6PKBwBXAauDfQDywDxgppSyudm80DYeqQk+lVAvT/Cks0isx75hmuRQXpe5Sij3FOD1Obv3oVorcRbx08Uuc1OqkwDVew0vLuJbE2isRGo1GU4HqisI9R3HvK4A4KeUgIcRA4DngcgAp5VrgPAAhxFXAXinl50KIGcC/pZRvCCHuA24F/nkUz9bUN5WFnubkwMiRaoe12bPh1FMjX+v1Qps2AXExTZPskmysFisPf/Uwmw5v4vpe13PFyWWrnv2RRkmxenKr0dSUapmPpJRfAwnApcCfgBa+c5VxFmr1M1LKH4D+oRcIIRKBR4A7Q+sAnwFDQ+toGgGVhZ663XDbbfD77zB+PFx2WeT7+DOfxpStK8guzsZisfDBxg94Z/07nJJ+Co+c90i5ag6rQ0caaTRHSbVmCkKIe4C/AO8AFuABIcQpUsonK6mWAuQFHXuFEHYpZfBocTPwvpTycJg6BUC1sp
Vt2LCBg5Xl448ymZmZ9fbs2uZY+2ItLMRaVIQlgtmo/ezZtF6xgrxBg/j9D3+AdevC38g08cbGYqSW/QmUeErId+ezp3gPk3+cTLwtngldJ/Dbxt8C11iwkBabpj+TBkpT6Utj7kdWVlal5dU1H10PnCGlLAEQQrwKZAKViUI+ELwk1RoiCADXAVeGqVPi+51bncb17NmTDh06VOfSWiczM5N+/frVy7Nrm2PuS0lJ5CR3AP/+N3z4IQhB6htv0CupEvOO1arMRr6oIcM02Fewj1JPKRP+PQGn4WTu8Ln8sdsfA1X8kUa/rP1FfyYNkKbSl8bejz179lRaXt3oI6tfEHyUAhHsAwFWABcD+HwK64MLhRCpQKyUcne4OsAw4Ntqtk9T31SW9RRg1SqYMqUshUVlgmAYage1oDDSrKIsrBYrDyx/AJktGdlnJMO7DQ+U60gjjaZ2qO5/0DIhxAfAG77jm4DlVdRZDFwohPgeZXIaKYSYAGyVUn4IdAN2htR5HHhTCDEaOAxcW832aeoTr1etRo4kCHv3wqhRarCfOxdOOKHyewVlPgUodBbiNty8/+v7LPx1Ib3b9Oahcx4KlBuGoSONNJpaorqicBdwG3AjanaxDHilsgpSSsNXJ5jNQeU/oSKUguscBP6IpvHgDz2NtDisuFhFGmVnq8VpZ50V+V6GoZLc+TKfgkp2d6T0CFtytjBl+RRSY1OZM3xOQABM0yQxJlFHGmk0tUR1zUeJKBPSVcB4oC2gU01q1AzBNMOXmSbcfTf8+itcdx387W+V3ysmRqXDDuJw8WFKPaXc+vGtlHpKef4Pz9MxtWOg3G6160gjjaYWqa4o/Bto53td4Kv3r6i0SNN4yMlRIaaRePFF+PhjOOMMePzxyLMJPyEL2PJK8/AYHu778j625mxldN/R/PGkoImkCRmJEfZt1mg0R0V1zUcnSCkvA5BS5gMPCiF0ltTmTEFB5KynAJ9/Ds88A8cdB6+8Um6tQQW8XmjXrpxouLwu8p35LNiwgEWbF9G3XV+mnD0lUG6aJm0S2+icRhpNLVPdmYIphAgsOxVCnAxU8hVR06QpKYHc3MiCsGmTWpgWH69SWLSuZC9kw1CZT4Mcy/5kd5sOb+Kh/z1Ei7gWzLlkTmBzHMMwaJ3QWkcaaTRRoLr/VZOAL4QQ/gDXdNTaBU1zw+WCw4cj76+ck6PSXxcVqUijnj0j38s0ITExkPnUz5GSI+Q787n141txep28cukrHJdyHKAEoUVcCx1ppNFEiSpnCkKI4cB2oCPwHmqB2XvAyug2TdPgMAzlWI4kCG433HIL7NqlHMzDh4e/zo/dXi7zKYDT46TQVcg9X97DztydjO0/lqFdVLYT0zRJcCToSCONJopUKgpCiEnAw0AccDIwFeV0tgPPRrtxmjrEMJRt3+NRs4HiYigsVL6D/HxlLjpwoHJn8dSpsHIlDBsGEyZU/jzTrOBYNk2Tw8WHeXv923z020cMaD+Ae84sy8Vot9ppldDqGDqp0Wiqoirz0Q3AICllsRBiOvChlHKeEMICbIx+8zQVMM3yP16v+jEMdez/7f8JPY50DrBnZcG+fcpXYLFUHS0UzL/+BW+8Ad27q6ijSP4GqJD51E92cTbrD63nka8foVV8K16+5GUcNoev35CRpCONNJpoU5UomEH7GQxB7Y+AlNIUQkS1YU2C0IHX41EDcnUH8HDHoRzNAA5h61hstnIO32rzww/w4IPKFDR/vvITRCJM5lOAYncxBwsPcuvHt+L2upk5bCbtk9v7qhi0TdK7p2k0dUFVI4BHCNECSAJOA5YCCCFOoOrcR42LoxzAbdnZcPBg3QzglX37ri/27IHRo9XrV16Bjh0jX2uayqkckvfIMA2yi7OZ/OVkduXtYvwZ4zmv03mqTEcaaTR1SlX/adOBtb7r5kkp9wsh/orKjvpIpTUbA/v3l5legvEPwtUYwK1+W3xo3YY4gNc2RUVw000q4mjaNB
g8uPLrrdYKjmVQq5ZfX/s6n239jEEdBjFx0ESgLNIozhFXoY5Go4kOlYqClPI/voR2raWU/sT3haitNb+KduOijmmqgao5DOC1jWHAXXepNQk33AA33lj19SEL1EAlu/txz488/s3jtE5ozUsXv4TdateRRhpNPVHlnFxKuQ+1X7L/+NOotkjTOHjhBfj0Uxg0CB59tPJr/ZlPQ0JZPV4PO3J3MOaTMXgMD7MunkWbpDaAjjTSaOoL/RVZU3M+/RSeew6OP14tUKsshYVpVsh86udw8WEmLp3I3oK9TBg0gbM7nu2ro3MaaTT1hRYFTc349VeVwiIhQUUapaVVfn2YzKegkt3N+mkWX2z/grM7ns2dZ6htug3DICMxQ0caaTT1hA7p0FTNkiUwcyb89pvyv7jd8Oqr0KNH1XXD5D1ye90s27GM6d9Np01iG2YOm4nNatO7p2k0DQD936epnCVLYOzYsmN/pFVlKbNBOZbbtq3gWDZNE5ktuf3T2zExeenil0hPTNeRRhpNA0GbjzSVM3Nmzc6DEoRWrcIuhMspyWHcZ+M4UHiAyYMnM+j4QYFIo+TY5FpqtEajOVq0KGgik50NmzeHL9uyJfx5w1CL00Iyn4JKdvf0iqf5audXDOk0hDtOvwMAm9WmI400mgaCFgVNRbZsgXvugQEDIm+12bVr+PMOB7RoUeG0aZp89NtHPLvyWdomtWXGsBlYLdbAZjkajaZhoH0KGoVpkrRmjdotbflyde6EE+D00+H99yteP25c2HuEZj71s/nwZu749A4sWJhzyRxaxbfSOY00mgaIFoXmjtOpnMmvvEKXTZvUudNPV/siXHSRWnA2ZIjyIWzZomYI48bB5ZeXv49hQEZG2NXhhc5Cbvn4Fg4WHeTBsx9kwHEDdKSRRtNAidp/pBDCisqq2htwolJjbA0qH4baq8ECZAK3+4r2AH6D9Uop5f3RamOzJienLN31oUNgs5F7zjm0uOceOO208tdefnlFEQjGMJTJKMwiNsM0mPr1VL7b9R1Duwzl1v636kgjjaYBE82vaVcAcVLKQUKIgcBzwOUAQohk4BngPCnlYSHEPUBrIBX4WUp5aRTb1bzZtg3mzYOFC6G0FJKT4dZb4e9/Z1dODi169arZ/SJkPvWzaNMinl/5PMclH8cLf3gBCxbiHfE60kijaaBEUxTOAj4HkFL+IIToH1Q2GFgPPCeE6ILKwJolhDgfOE4I8T+gBLhbSimretCGDRs4ePBgjRtoz8qiNqzZ69atq/qi+sQ0SVy/nvRFi0j58UcAXBkZHL7xRnIuuggjMVHNHKh5XwyLBW/r1rBjR4WyXYW7GL1iNDaLjUknT2L3lt3sYQ+t41qzg4rX1yaZmZlRvX9dovvS8GjM/cjKyqq0PJqikALkBR17hRB2KaUHNSsYAvRBZV39VgixEtgPTJNSvi+EOAt4GxhQ1YN69uxJhw4dat7CffuqvqYK1q1bR6+afruuK9xu+Ogjtc/B+vXq3Gmnwa23EjNsGO3tdtoHXV7jvhgGtG8f1o9Q6i5l3FvjyHXnMvW8qfy1718xTZN2ye2wWqIb9JaZmUm/fv2i+oy6Qvel4dHY+7Fnz55Ky6MpCvlAsI3A6hMEgGzgJynlAQAhxDcogfgY3+Y9UsrvhBDthRAWKWWEuEhNWHJz4Z13VG6iAwfUoH3xxcp5PKBKja0ehqEijSKkHZ+yfAor96xk2EnDGHXaKAzDoE1Sm6gLgkajOTaiKQorgEuBhT6fwvqgsp+BnkKI1kAuMBB4FeV4zgaeFkL0BnZrQagBO3fCa6/BggVQXKy2xbz5ZvVzwgm19xzTVEnuYmPDFi/evJgXfniBjqkdee6i5zBMtXtaYL9ljUbTYImmKCwGLvRt0mMBRgohJgBbpZQfCiHuB/7ru3ahlHKDEGI68LYQ4hLUjOGmKLavaWCa8NNPykT0+efquF07mDgRRoxQaatr+3kRMp8C7Dyyk1EfjsJhcz
DnkjkkxySTEpdCvCO+dtuh0WiiQtREQUppALeFnN4cVL4AWBBS5whwSbTa1KTweOCTT5QYrOJwWokAABO0SURBVF2rzvXqpSKJLrlErSyOBlZr2MynAB7Dw9UfXE1OSQ5PnP8Evdr0Is4eR0pseAHRaDQND71yqLGRnw///rfyF+zdq7KQ/vGPyl9w+ulV7il9TETIfOpn0tJJrNq7iuHdhvO33n/DZrWRllDFfgsajaZBoUWhsbBrl/IXvPsuFBVBfDzcdBOMGgWdO0f/+f4tNcNkPgX4v03/x4s/vkinFp149sJnAb17mkbTGNGi0NDJzFRbXn72Wdk39fHj4brroGXLummDaarnRjBJ/Z77OyM/HEmsLZa5w+eS4EggIzFDRxppNI0QLQoNEa9XicArryhRADjlFOUvuPTSyvdErk1MU/kQ2raNGHrq8rq4cuGV5Jbm8tTQp+iR3oO0+DQdaaTRNFK0KDQkCgtVOOlrrylzEcDQocpfMHhwdP0FoRiGCjlt3brS505eOpnV+1fzp5P/xLU9ryUlVkcaaTSNGS0KDYG9e5Xj+J13oKAA4uLg+uth9Gg46aS6b4/Xq0JOqwhnXbx5MTNWzeDElicy/YLpxDvidaSRRtPI0aJQn/zyizIRffRRmSP3ttvgxhvVdpb1gdernp2YWOllO47sYOT/jSTOHsfc4XNJjk2mVbzePU2jaexoUahrvF744gslBr7kdHTvrmYFV1wRcZVwXWCaptoToYo2OD1Orlx4JXnOPJ6/6HlObn0ybZLa6M1yNJomgBaFuqK4GN57T6Wt3rlTnRsyRPkLzj67bv0FEfC0alUtUZq4dCI/H/iZq3pcxVWnXEV6YrqONNJomghaFKLN/v1qI5u331aJ6mJj4dpr1foCIeq7dSrCyG5XM4T9+6u8/D8b/8NLP71Et7RuPH7+47SKb0WMrY6ioTQaTdTRohAtNmxQ6ws+/FClpEhLgwkT4G9/i5gmos4xDLVBThX+iwUbFvDkt0+yMWsjJiYOq8pr1CaxDQmOhDpqrEajqQuapSgED3I9WnZjSp9xXHNiJdtNVhfDgGXLlBisXKnOde2qTER/+pNahdxQ8HrVFprJle+AtmDDAkZ8MKLcOcM02JqzlSGdh0SzhRqNph5odqIQOsitz9nEiOVjAY5eGEpK4P334dVXYft2de7ss5UYnHde2IVfpmliYmKYBoZp4DW9GKbvGP85I1Bedp0R9hqv4VW/UffyGkF18GIYBh7Td43XjZGSjLvQgul7psf0sHXPVjbaNwbaY5omD3/1cNguz/ppFqP7jT6690uj0TRYmp0oPPntk2HP3/zNRGZseK3iIB1pgMbE63HhzcvD8mWpuvbP4I2LwRsXi2H9BWPXWIy3DAyjbBA3TFMNuDTQbSLWVu+yjVkbo9sOjUZTLzQ7UYg0mBV7SliVtRarxYIVKxaLBZvFFnRsxeo/5zWwFZVgKynFYaAib5KSsaQk/3979x4kVXnmcfzbPVyG+4ACuoIhJvhY7CiXwQUNRoyyqDHqutGaBE3WSFmshDWVbCVgiJqNkUpcNCQbKy6KoHGXNRGMWqtgIipg8DKShRF9KFxjiiAw4kYxRJTp3j/eM00zNMy15/Tp+X2qpqrPrft5p2fOc855z3leUj165rZLEbZJp9JUpNLR9MH3SUXv3bROOpWOpgvNC9Mp0lSkD85LRe+dP50usE6aFBXpHqT79SeVTufeOxdHKs3Ot3Yy8oSRuel0Ks0dG+5g5/s7D/t9jRk6pthflYjEoNslhTFDx7B59+bD5v/1YGPdRSuPvGE2S49n19H7nqX0XB/6Cxo/Poo3LpjGsdfNKq3+guaymVAvqYWSFRs3bmT8+PGHzDumzzHMfHTmYevOmzKv08MUkfh1u6Rww1k3HNZxCjB//PVUVRYo6/DBB7BiRegv2Lo1zDvzTLj2WirOPZd99fVUDT6uyFF3QCYTOpNbMQLbwF4DqaqsOmTeNR
OuoV+vfixYtyB0zA8dw7wp86itri1WxCISo26XFJp2ZgvWLWDL7i2MGTyaeYXuPnr7bVi2LPzs2RPu5b/sstB5fOqpMUTeDplMuN20b8duG62trlUSEOkmul1SgLyd3I4dhy/cujWcFTz0EOzfH46wv/rVMKDN8cd3eaztlsmEB9K6qsy2iJSFbpkUWL4cbr0VtmyBk08OO/0hQ0I9ojVrwjqjRoWnjq+4osXicCUnlQpjIBxhlDQRkSPpfnuN5cvhC3l9Cq++CrNnH5yeNClcIpo2DSoquj6+jshEHcpDh5ZELSURSZ6iJQUzSwN3AmOB/cBMd9+Wt/wC4CYgBdQBs4FK4OfAMGAv8GV3b+jUwG4t/JwCAweG8Y/HjevUj+symUw4o+mqITpFpCwVs7TlpUClu58BzAUWNi0wswHAbcBF7j4J+D1wLPCPwGZ3Pwu4D5jf6VFtOcJDV/v2JTshVFUpIYhIhxUzKUwBngBw9w3AxLxlZwKbgYVmthbYFZ0R5LYBHgfO6/SoxhzhoavRozv9o7pEJhOeP+jfP+5IRKQMFLNPYSDwbt50o5n1cPcDhLOCc4BxwPvAWjP7bbNt9gIt31wP1NfXs2vXrlYFNbi2lpM2H/7w2psXX8y7mza16j2a29TO7Toqm82GMRBa2fbWqKur67T3ilO5tAPUllKU5HY0NBz9inwxk8J7QH4JznSUEAD2AC+6+04AM3uWkCDytxkA/Kk1H1RdXc2IESNaF1VNDZx0EixYEC4ljR4Nc+bwsUvaVwxv06ZNnHbaae3att2y2dAJPmxYwWJ77VVXV0dNTU2nvV9cyqUdoLaUoqS3Y/v27UddXsyksB74HPCgmU0mXC5q8jJQbWbHEnb8k4HF0TYXAi8AFwBrixJZbW34KfScQqnLZqGyMozPICLSyYqZFFYC08zsOcIdRleb2deBbe7+iJnNA1ZF6z7o7vVm9r/AMjNbB3wIfLGI8SVPG0pWiIi0R9GSgrtngFnNZr+Wt3w5sLzZNvuAy4sVU6J1UskKEZGj6X4PryVRNquSFSLSJZQUSlk2e7BkRdKerhaRRFJSKFUqWSEiMVBSKEUqWSEiMVFSKDVNJSv0hLKIxEBJoZQ0layorIw7EhHpppQUSonGQBCRmGkPFLemkhXDh6tDWURip6QQp0wmPIw2ZEjckYiIAEoK8clkwsA+AwfGHYmISI6SQhwymVDQrk+fuCMRETmEkkJXU8kKESlhSgpdJZsNYx8cd1ynjoEgItKZlBS6QiYDvXuHZxB0h5GIlDAlhWJrbAxjIFRVxR2JiEiLlBSKqbEx1C9SyQoRSQglhWLJZEKHcu/ecUciItJqSgrFopIVIpJA2mt1pmw2JIJhw9ShLCKJpKTQWVSyQkTKgJJCZ2hshEGDVLJCRBJPSaGjGhvD8wcqWSEiZaBoScHM0sCdwFhgPzDT3bflLV8ETAH2RrMuASqArUB9NG+luy8qVowdls3ykWoYiUgZKeaZwqVApbufYWaTgYWEHX+TGmC6u7/dNMPMzgP+093nFDGujssvWbFzZ9zRiIh0mmImhSnAEwDuvsHMJjYtiM4iRgP/bmbDgXvcfQkhUdSY2TPAbuCf3P2tlj6ovr6eXbt2tTnAHg0NtPkeoUyGTK9eNFZVwY4dANTV1bX5s0tVubSlXNoBakspSnI7Ghoajrq8mElhIPBu3nSjmfVw9wNAP+AnwO2ES0ZrzOwl4DWgzt1/bWYzonU+39IHVVdXM2LEiLZHGO3UWy2TCSUrBg3Kzaqrq6Ompqbtn12CyqUt5dIOUFtKUdLbsX379qMuL2a5zveAAfmfFSUEgH3AInff5+57gacIfQ9PAWuidVYC44sYX9s0NobbTfMSgohIuSlmUlgPXAgQ9Slszlt2MrDezCrMrCfhUtPLwN3A30frnAuUxjlaU8mKvn3jjkREpKiKefloJTDNzJ4DUsDVZvZ1YJu7P2Jm9wMbgI+A+9z9FTObCywxs+
uAPwMzixhf66lkhYh0E0Xb07l7BpjVbPZrectvA25rts0bwDnFiqlNVLJCRLohHf4WopIVItJNKSk0pzEQRKQbU1LIl8nA0KFQWRl3JCIisVBSaJLNwvDh0LNn3JGIiMRGSSGbhYqK0KGcLuYduiIipa97J4VMJlwqOuYY3WEkIkJ3TwqDB0O/fnFHISJSMrr39RIlBBGRQ3TvpCAiIodQUhARkRwlBRERyVFSEBGRHCUFERHJUVIQEZEcJQUREclJ+sNrFQA7d+6MLYCGhoYWxzxNinJpS7m0A9SWUpT0duTtLysKLU96UjgeYMaMGXHHISKSNMcDrzefmfSk8CJwFvAW0BhzLCIiSVBBSAgvFlqYymazXRuOiIiULHU0i4hIjpKCiIjkKCmIiEiOkoKIiOQoKYiISE7Sb0ntMmY2CfiBu081s08CS4EsUA/MdveMmd0EfBY4AHzN3V+ILeACzKwnsAQYBfQGbgG2kLC2mFkFsBgwQtyzgA9IWDvymdkwoA6YRoh1KQlsi5m9DLwXTb4B3AUsIsS82t2/a2Zp4E5gLLAfmOnu2+KI90jMbB5wMdCLEOszJPQ7aSudKbSCmX0TuBuojGbdDsx397OAFHCJmU0AzgYmAbXAT+OItQVXAnuiuM8H/o1ktuVzAO7+KWA+8H2S2Q4gl6zvAv4SzUpkW8ysEki5+9To52rgZ8AXgSnAJDMbD1wKVLr7GcBcYGFsQRdgZlOBM4FPEX7nI0nod9IeSgqt8zpwWd50DeHIAeBx4DzCH/1qd8+6+x+AHmY2tGvDbNEvgO9Er1OEo5vEtcXdHwaujSY/BvyJBLYjz78Sdp47oumktmUs0NfMVpvZU2b2aaC3u7/u7llgFQfb8gSAu28AJsYWcWHTgc3ASuBR4DGS+520mZJCK7j7Q8BHebNS0R85wF5gEDAQeDdvnab5JcPd33f3vWY2APgl4Sg7qW05YGbLgJ8AD5DQdpjZPwAN7r4qb3Yi2wLsIyS46YRLevdG85ocqS2NZlZKl7KPJSSqywnteABIJ/Q7aTMlhfbJ5L0eQDhSfS963Xx+STGzkcAa4H53/w8S3BZ3/zJwMqF/oU/eoiS14yvANDN7GhgH3AcMy1uepLZsBX4eHTlvJewwh+QtP1Jb0u5+oOvCbNEeYJW7f+juTuivyt/ZJ+k7aTMlhfbZGF13BLgAWAusB6abWdrMTiT8ob8dV4CFmNlwYDXwLXdfEs1OXFvM7KqoIxDCkWgGeClp7QBw90+7+9nuPhX4HfAl4PEktoWQ4BYCmNlfAX2BP5vZJ8wsRTiDaGrLhdF6kwmXakrJOuB8M0tF7egH/Cah30mbldIpW5J8A1hsZr2AV4Ffunujma0FfktItrPjDPAIbgAGA98xs6a+heuBHyesLSuAe83sWaAn8DVC7En8TgpJ6t/XPcBSM1tHuEvnK4SE/QChCNtqd3/ezF4knB09R+jbujqugAtx98ei/pAXOPi7foNkfidtpoJ4IiKSo8tHIiKSo6QgIiI5SgoiIpKjpCAiIjlKCiIikqNbUiURzOynhFo0vYBPEgr5ASxy93tb+R7/Arzk7o8cZZ3fufu4jsYbNzMbBTzt7qNiDkUSRrekSqJoZ9c6+j1Je+lMQRLPzG4GJgMnEiq/vkKonNqX8LDeN939F2a2FHg6+llJKIE8HtgFXO7u75hZ1t1T0XueAIwmFN27292/H1U0/RmhGNofCQ9pfc/dn24W01zgCsJDW6uAbxGquy4ETgVGRHFMBqoINZz6E0pcLHT3H0cxnEgoNDeMUKvqM4SqnP9DqMx5NvBdQm2ukYQHrmY2i2U4oQrrSMLDZPPc/ddmdi7ww6gN/wd8oRyeyJWOUZ+ClItKdx/j7ncCcwg1+icA1wA3Flh/LHC7u1cT6tXMKLDOacDfEnbCc82silAgrR9wCuFJ3NObb2Rm5xOqap5OSDonADOiy1bPAd8mFIv7Z3ffTtiJ3+LupwPnEBJak1
Ojz7+SMBbGD4BqYEIUH8DfEJ6mPYVQ3r35k7WLgCXuXkMYI+CuqCjifGCWu08kVAOdUOB3IN2MzhSkXDyf9/pK4CIzu5xwJN6/wPq73X1j9LqeQwu3NVnj7h8Cu83sHUJRtGnA4qhi5ptm9psC251H2JHXRdN9gD9Er68n9Iesd/fl0bxvEGrtzCPs6PPjfTKqCPsm8Ja7bwEwsz8SzoIAno0Kt2Fm9xPKiq9oFs8pUZ8KhNIgnwAeAVaa2cPAr9z9yQJtkW5GZwpSLv6S93ot4ei5jnDUnSqw/gd5r7NtWKeRlv9vKoAfufu4qNN6EgeP/odH73GKmfWO5j0I/B0hWdzQ7L0+zHt9pEqi+fPTBdarAD6TF89kYLO73wFMBbYBPzSzb7fQLukGlBSkrJjZEEI57Rvd/b8Jl38qOvEjngRq8ypoTiUkjHxPAVeZWf9onICHgc9Hw4guJZwtPAN8L1p/WhTvrwh9BE1DjrbWFDM7IRrm8kuEQWCax3Nd9L5jgE2EwXCeBwa4+4+AO9DlI0FJQcqMu79DGDr1FTPbSOig7Wtm/TrpIxYTBlPZDCwD3uTQsxTc/VHgIcIlrXpCSexlhMtEu9x9BeGMoDYqHX0zsC4a33g68Hvg422IaQdhHIYthM7vu5stnwNMNrNNwH8BV7n73iiGpWZWR7jkdFMbPlPKlG5JFWkDM/ssYWS0x8xsELARmBglozjimQrcHI3HINJh6mgWaZstwP1mdks0fWNcCUGkGHSmICIiOepTEBGRHCUFERHJUVIQEZEcJQUREclRUhARkZz/B21MOmXXnVPkAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10cd18898>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot learning curve\n",
    "def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,\n",
    "                        n_jobs=-1, train_sizes=np.linspace(.1, 1.0, 5)):\n",
    "    \"\"\"Generate a simple plot of the test and training learning curve.\"\"\"\n",
    "    plt.figure()\n",
    "    plt.title(title)\n",
    "    if ylim is not None:\n",
    "        plt.ylim(*ylim)\n",
    "    plt.xlabel('Training examples')\n",
    "    plt.ylabel('Score')\n",
    "    train_sizes, train_scores, test_scores = learning_curve(\n",
    "        estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)\n",
    "    train_scores_mean = np.mean(train_scores, axis=1)\n",
    "    train_scores_std = np.std(train_scores, axis=1)\n",
    "    test_scores_mean = np.mean(test_scores, axis=1)\n",
    "    test_scores_std = np.std(test_scores, axis=1)\n",
    "    plt.grid()\n",
    "\n",
    "    plt.fill_between(train_sizes, train_scores_mean - train_scores_std,\n",
    "                     train_scores_mean + train_scores_std, alpha=0.1,\n",
    "                     color='r')\n",
    "    plt.fill_between(train_sizes, test_scores_mean - test_scores_std,\n",
    "                     test_scores_mean + test_scores_std, alpha=0.1, color='g')\n",
    "    plt.plot(train_sizes, train_scores_mean, 'o-', color='r',\n",
    "             label='Training score')\n",
    "    plt.plot(train_sizes, test_scores_mean, 'o-', color='g',\n",
    "             label='Cross-validation score')\n",
    "\n",
    "    plt.legend(loc='best')\n",
    "    return plt\n",
"\n",
    "g = plot_learning_curve(gsRFC.best_estimator_, 'RF learning curves', X_train, y_train, cv=kfold)\n",
"g = plot_learning_curve(gsExtC.best_estimator_, 'ExtraTrees learning curves', X_train, y_train, cv=kfold)\n",
"g = plot_learning_curve(gsSVMC.best_estimator_, 'SVC learning curves', X_train, y_train, cv=kfold)\n",
"g = plot_learning_curve(gsadaDTC.best_estimator_, 'AdaBoost learning curves', X_train, y_train, cv=kfold)\n",
"g = plot_learning_curve(gsGBC.best_estimator_, 'GradientBoosting learning curves', X_train, y_train, cv=kfold)"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA5QAAANpCAYAAAB0OnaPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzs3Xm4XFWVsPE3hEEhShBQ1KvSSLsiDokMAo3KRZnCIKAJtIIQBjvKhxKwFb5GIUpHQdSOMgkiokyJhEkIYEBIQFubMQqIiyYtfkYEHBgakEiS+/1xzoXiUneq3Kq6VfX+nqeeqnPOPrtWJVAr6+x9do3p6elBkiRJkqThWq3ZAUiSJEmSWpMFpSRJkiSpJhaUkiRJkqSaWFBKkiRJkmpiQSlJkiRJqokFpSRJkiSpJqs3OwBpJEXEGsDvgF9l5q79tJkCHJGZ3YP09SCwDPgbxcWXscA3M/PsEQy59712B7bOzOMj4oPAjpn56RHq+zhgOvCTzDy4hvPXBS7PzPePRDxV+n8dMC8z/6ke/Q/y3guAj2bmnxv93pLULBHRA9wDrOhzaO/MfHCA84b9nRkR3wLeV25uBvyWIq8CbJuZf6t6Yh2YDwd8b/OhamZBqXazD/ArYIuIeGtm3reK/e2fmbcDRMQbgPsj4trM/P2qBtrHVsCrADLzR8CPRrDvQymSxE9rPH894N0jGM+LZOZDQMOTZ2mnJr2vJDXbDjUUD8P+zqy8OFpeqH0+rzaB+bB/5kPVzIJS7eZwYA7wADCD4kokEfElYH/gL8B/9zaOiLcApwPjgNcBi4H9MvPZKn2vBzwNPFWe+17gFGBt4O/A5zPzuvLYF4CPAMuB+ylGRB+OiA8BnwdWUlwZ/izFKOgngLER8UQZ35TM3CMiFgI/B7YD3gjcAhyUmSsjYhpwLMWV3huBIzPzRf9PR8RcoAv4bkQcD1wHfBN4B7AG8BPgs5m5PCIOKf+81qQobk/KzDOB7wEvj4jFwBblZ9qw9x8i5ZXuDYG3l30/DaxDkXR3Lj/vmsAzwL9m5s/7xLgxcE9mjouImcCby8frgP8CFgAHAf8AfC4zLy7bvQ3YCHhN+fd2WGY+GRFvA04D1gd6gK9n5g8iortPfHeWIdwUEbsBE4F/K2N9NfD9zPxCed4s4H/Kz7gW8H8y86aIGAecWv79LAeuAI4r/2xPBranGNm+C/h0Zj6JJI1iEXEQcALwTorv0NuBrwA7lE16vzNvofiOfifFd+dzVPkOHcL7LQOupPgO3p/iO/qbFN/hY4FvZea5Zds9qZJTImIC8F3gZcAY4JzMPKPP+5gPzYeqE++hVNuIiM2AbYAfAt8HPhYR60fEXsCHgUkUV/7WrTjt4xRflNsCm1J8Se9ecfzCiFgcEb+h+BL8dmY+FhHrA/Moirh3UnzBXxAR/xARBwOTga3KY/cA55X9nQIcnplbAl8AujPzv4BvA3Mz87gqH+3NQDdF0ns/sH35WU+mmBr7LuBJii/qF8nM/YCHKK4IzwX+A7gjM7cA3gVsABxdJoKPA7uV/e0HfLXs5mDgb5k5KTP7To/q6+3ARzJzIkUB/OWKPv8FuCwi1hmkj/dQ/Pm9leKK6WaZ+T7gCOCLFe22AaYAEyiS1/ERsTrF6O6p5Z/9ZODLEbFt3/gqpjvtACwFPkNRrG9Z9v1/I2KDss3WFIn4XRT/aJlZ7v8SxT9g3krx39d2FEnz2DKmLco/i4eAkwb53JLUSDeV+a33cTlAZn6f4kLmV4FvAbdk5g8qvzMrZunck5lvpSgeBvoOHciawFWZGRTF0Dzg2DJPbQ/8a0RsExH/SP855bNlH1sAuwHvi4gX/RvXfGg+VP04Qql28klgfmb+FfhrRPyW4grja4HLMvN/ASLiXKB3Cs4xwE4R8TngLRRXAcdV9Fk55fV1wI0RcS/wBPBAWQySmfdGxM8oCr/JwPcy8+myj2
8Cx0XEmhSjp5dHxHzgel5IUgO5KjNXAv8bEQ9QXC2dBCzIzKVlm1N54Ut9IHsA746IQ8vtl5fxPxURewC7l0l7Up8/h6H6fWb+rny9E8Wf/U8iovf4SorC/ZcD9HFDZj4BEBEPUVxFBlhCOS24dElmPlK2+y4wGzgXeFlmXlZ+roci4lJgV+CmPvE9LzN7yivfe0TERykS4hiKK7cAv8vMxeXrO4Fp5esdgaPLf1isoEieRMRXgfEU/21B8Q+mRwf4zJLUaANNef0Exff03yhG4vpzCwz6HTqUabW3lM9vobiIem5F3ng5RcE3hv5zyuXADyLi3cANFCNgKwd5T/Oh+VAjxIJSbaG8yncg8Gx5jwbAK4H/Q3GFbkxF8+UVry+m+P/gh8B8iquIlW2fV34Z/4hicYH5VZqsRjG1o+/I/2rle4zJzOPKL/udKb6Ej42IgZI1vLB4ARRTVsaUn6EyzsGulPYaC0ztvbc0IsYDPRHRRXFF+mzgpxRXiPcYoJ8x5flr9tn/VJ/3+kl5VZiy/Rsork4OZFmf7ef6aVf597gaxZ9BtVkXvX8vfeN7Xvnfz10U/yi5hSIR780Lf8bV/g56Y+ip6OcNFFOZxlKMXl9b7h9HceVWklrBayi+s9aiuND6P/20670FZLDv0MH0fjePBR7PzEm9ByLiNRQXcT9OPzklM39ZFn87AR8AToiIf8rMJQO8p/mwCvOhauGUV7WL/Smugr4uMzfOzI2BTSiuKv4UmBoR48spMB+rOG8X4Evl9JceiqkcL5k6Cs9/ye4E3Ar8otgV7y6PvY2i0FwI/Bg4uGIqy6eBm4EVZbG7TmZ+m+J+z7dSfLkv54Uv+aH4MbBjRLy+3D5sGOcdFRFjImItimL7CGBL4E/Av2fmjymTZ0SMLWMbGxG9SeNPZXuADw3wXjcCO5f3tlDel/ErRi6R7BUR65Z/px8HrgIS+HsU96r2jip/mGI0uJoVFH/u/0hxAeLzmXkVxZXVtejnv4UKNwAHRcRq5Z/nvPLcHwNHRMSaZXzfobgHSZJGtShWS78YOJ5iWuXF5T544Tuzr1q/Q/tKigvDB5SxvIHitpEtGCCnRMRFFOsfzKHIrU8CbxjkvcyHL2Y+VM0sKNUuPgl8o/Kehsx8nOL+jxkUV9hup7ip/YmK8/6NYgrq7RT3MS6imILSq/ceyrsorthdnZnfK6cJTQVOjYi7gYuAgzPzfop7Cm4Abo2I+4DNKabOLi9juSgi7gQuAQ7JzGUUiwF8MCJOHcqHLd/nKODHZexvpbgSOJhPU0xbuZsimd1NMe12AcV9E1l+1jdSJMpNgT9STGu5L4p7Rz8NnF5+hneVx6vFeC/FfSJzIuKXwInAByumAq+qR4BrgPso/k6/nJnPUVxJPTIifkXx9/ClzLypnz4uo7jgsBK4GvhN+bk+CPyaF/+3UM0XKRZk+iXFfx/XlNOLTgQeLPf9muIK7mdq+5iSVBd976FcXBY6XwYezsxzsviZrL9QLMQC5XdmRLy9T1+/orbv0BfJzL8DewGHld/hC4AvZObPBskpJwL7l/v/i2J0bdEgb2c+fDHzoWo2pqenZ/BWkkaViPgHiim+J2ax4uuHgGMyc+smh9YQUaxqt0FmHtHsWCRJahbzoUYD76GUWtNSivta7o6I5RRXJA9pbkiSJEnqNI5QSpIkSZJq4j2UkiRJkqSaOOV1EOVKVVtR3Gg91J9mkCS1nrEUvxV3W7lYlgZgfpSkjtJvjrSgHNxWvPCDu5Kk9vdeitUONTDzoyR1npfkSAvKwf0R4MILL2SjjTZqdiySpDp5+OGH2X///aGfpf/1EuZHSeoQA+VIC8rBrQDYaKON6OrqanYskqT6c/rm0JgfJanzvCRHWlAO0V8vuIK11h3f7DAkjbANP3lAs0OQWpr5UZJWTav/W8RVXiVJkiRJNbGglCRJkiTVxIJSkiRJklQTC0pJkiRJUk0sKCVJkiRJNWn5VV4johu4AH
igYvcemflUcyKSJEmSpM7Q8gVlaV5mzmh2EJIkNUJEzAYmAROApcBTQE9m7hAR+wLzganA+MycPYx+ZwK7AI8BH83Mx0c6dklSe2mXgvJFImIacCAwDrgmM2dGxE+AvwK/BeYBX6eY8ntpZn6jWbFKkjRcvRdRI+I8YHZmLq44fDiwYLh9RsSbgG0yc9uI+BhwGPC1EQhXktTG2qWgnBIRk8rXP6IoFHcExgJ3AzPL11/PzF9ExC3Ah4E/AVdHxLzM/H+ND1uSpJEREYuBIylGLs8Cri33jwHOADYDlgGHZObSvudn5u8iYvdy8/XAXxoRtySptbXLojzzMrO7fHwD+DtwMXAasFZFu/vL57cCPwRuArqAjRsYqyRJdZGZi4DFwPSK3XsCT2fm9sAJ5aO/81dExFnADOCWesYqSWoP7VJQPi8ixgOfyMz9gC8B61QcXlk+/xr4YGZ2A9/hhUJTkqR2MwGYHBELgZOA9QdqnJnTge0p8qMkSQNqu4ISeBJYEhG3AT8AHo2IcX3aHEcx1fVWiilAjzQ4RkmS6qUHGFOxvQSYU15EPQS4qtpJEfGuiPhWufkMsKKeQUqS2kPL30OZmQuBhRXbKymm9/TVXdHmFuB9dQ5NkqRmuBW4CJhbbl8O7BYRi4C1KRbtqWYxsHq5zsBKivsxJUkaUMsXlJIkdarMnFbxelL5fEyVpocOoa8e+i82JUmqyoJSkqQOExGXABv22T0rM69vRjySpNZlQSlJUofJzKnNjkGS1B7acVEeSZIkSVIDOEI5RK86YG827OpqdhiSJI0q5kdJ6myOUEqSJEmSamJBKUmSJEmqiQWlJEmSJKkmFpSSJEmSpJq4KM8QPXrBSYx95TrNDkNSH689/ORmhyB1NPOjRoLf5VLrcoRSkiRJklQTC0pJkiRJUk0sKCVJkiRJNbGglCRJkiTVxIJSkiRJklSTtlvlNSKuAB7MzBnNjkWSpNEgIsYCpwBvB14BXJeZX2xuVJKkdtBWI5QRsSGwEtg2ItZodjySJI0SuwIrM3PnzNwW2DIiNm92UJKk1tduI5T7AtcBbwb2iIjbgDnAc8CzwNzy8T1gI+Bx4MDMfLI54UqS1BAPATtExC7AIuBDwOoRMYeKfAh8EHg/MB24CTgoM5c0J2RJUitoqxFKYD/gMooichpwFPDlzNwBeKZs83Hgp5nZTVFcfqrxYUqS1DiZeRfwb8CRwB+AbwP/Qp98mJkXABtS5NELLSYlSYNpmxHKiNgEmABcUO7aGlgT+Gq5fWv5PAHYOiKmAGsAdzQyTkmSGi0i3gHckZm7RcTLgLOA/wDuqpIPTwMuBQ5qSrCSpJbSTiOUHwWOzcxdM3NX4GvAdkDvPSK9z0uA2eUV2aOAGxodqCRJDbYL8DmAzHwWeAD4LH3yYUSsDnwBOBH4SnNClSS1knYqKPcFrqjYvgh4BDgmIn4CvB5YTnFVdq+IuBn4BnBPowOVJKnBTgXWj4i7IuJnwJuAs3lpPjwWuCozTwbeHBHvbVrEkqSW0DZTXjPznX22fxsRRwO/zswlEXEJsDQznwKmNCVISZKaIDOXAYdWOdQ3H/57xTm71TUoSVJbaJuCsh9/AC4uf3/rV8DNTY5HkiRJktpGWxeUmXkn8O5mxyFJkiRJ7aid7qGUJEmSJDWQBaUkSZIkqSZtPeV1JL36gGN5bVdXs8OQJGlUMT9KUmdzhFKSJEmSVBMLSkmSJElSTSwoJUmSJEk1saCUJEmSJNXERXmG6FcXHcYf112j2WFI6mOr6Vc1OwSpo5kfm8/vQUnN5AilJEmSJKkmFpSSJEmSpJpYUEqSJEmSamJBKUmSJEmqiQWlJEmSJKkmFpSSJEmSpJr4syGSJAmAiNgDOB5YAXwyMxc3OSRJ0ig3qgvKiJgNTAImAEuBp4CezNwhIvYF5gNTgfGZOXsY/c4EdgEeAz6amY+PdOySJLWgE4Bu4FXAmcAeTY1GkjTqjeqCMjNnAETEec
DsPldKDwcWDLfPiHgTsE1mbhsRHwMOA742AuFKkjQqRMQ0YB9gXYrbWw4ATgc2BB4G9svMZVVO3SYzV0TEO4AnGhSuJKmFtdw9lBGxOCK2pxi5PKti/5iIODMiFkXEgojoqnZ+Zv4O2L3cfD0mTElSe3osM7uB7wAHA9dl5jbApcAm1U4oi8nDgGuBKxsVqCSpdbVcQQmQmYuAxcD0it17Ak9n5vYUU3ZOGOD8FRFxFjADuKWesUqS1CS9+e02immstwNk5vmZeV9/J2XmOUAX8PmIeEW9g5QktbaWLCj7MQGYHBELgZOA9QdqnJnTge0prtxKktRuJpbPW5aPSQARcXREfKBv44hYvZzhswawDFhOsTiPJEn9auWCsgcYU7G9BJhTTu85BLiq2kkR8a6I+Fa5+QwmS0lSe9o8Im6kuH9yM1646Lo1cHPfxpm5HJgD/LQ8/rXMfKZx4UqSWtGoXpRnELcCFwFzy+3Lgd0iYhGwNsWiPdUsBlaPiFuAlcCR9Q5UkqQmmNdnBfS9BzshM88Fzq1fSJKkdtMSBWVmTqt4Pal8PqZK00OH0FcP/RebkiS1vYg4Dtipz+75mXlKM+KRJLWuligoaxURl1AskV5pVmZe34x4JElqhMw8b5Djs4BZjYlGktTO2rqgzMypzY5BkiRJktpVKy/KI0mSJElqorYeoRxJ7/zoOXR1dTU7DEmSRhXzoyR1NkcoJUmSJEk1saCUJEmSJNXEglKSJEmSVBPvoRyi6y+ZxqvWXaPZYXSMvQ65ttkhSJKGwPzYP3OZpE7gCKUkSZIkqSYWlJIkSZKkmlhQSpIkSZJqYkEpSZIkSaqJBaUkSZIkqSYWlJIkSZKkmtTtZ0Miohu4AHigYvcemfnUMPqYBLwHuAfYOzNnVGmzAfBtYDywDnBSZl4ZEccC8zLzgb7nSJLU7kYiD0uSNJh6/w7lvGpF4FBl5mJgcZkU+3MMcF5mXh0R44DbIuKqzDyp1veVJKlNrFIeliRpMPUuKF8kIqYBBwLjgGsyc2ZE3ADcB2wFXAZsAUwEpgEvA/YGrijP/zSwPDPPiIjdy7ZLgQMi4mHgDmBiZq6MiPOA2cBXgJcD6wKrZebEiDge2BlYARyRmXc34ONLktRU/eThnwB/BX4LzAO+TnFLzKWZ+Y1mxSpJag31LiinlNNWAX5EkaB2BMYCdwMzyxjmAscB/wN0AbtQFJLX9elvLnAhcAawH/AlYAmwEjgdeCNwCvB8AszMyRGxBnAl8PmIeCfwjsx8T0S8ATgbmDyin1qSpNFhKHl4LPD1zPxFRNwCfBj4E3B1RMzLzP/X+LAlSa2i3ovyzMvM7vLxDeDvwMXAacBaFe3uzcwngaWZ+SzwBMXo5Itk5iPAsojYGHhteX9kd2aemplbA5sDB0XEhD6nngmcn5l3AhOAzSNiIXA+sN4Ifl5JkkaToebh+8vntwI/BG6iuMC7cQNjlSS1oIat8hoR44FPZGbvyOI6FYd7htHVHIoRyPnl9mfK6a8Aj5SP5yre97PAnzPz4nLXEmBhZnYD+wCXDPOjSJLUcgbJwyvL518DHyxz5Hd4odCUJKmqRt5D+SSwJCJuK18/Wi6iM1yXU0xvPbzcPgI4KyJOoEiIF2XmkoiAYpRzFnBLOSIJ8AHg4Yi4meIeki/U+HkkSWolQ8nDx1FMdX0ZcDvFRVpJkvpVt4IyMxcCCyu2VwJ7VmnaXdFmUpVz+z6PBW7MzIfLtg9S3HPZ9/2nlS/XrPKexw0WvyRJrazGPHwL8L46hyZJaiMNm/I6EiLiHcDNVCy6I0mSJElqjob+bMiqKn/eY2Kz45AkSZIktdgIpSRJkiRp9LCglCRJkiTVpKWmvDbTTlPPo6urq9lhSJI0qpgfJamzOUIpSZIkSaqJBaUkSZIkqSYWlJIkSZKkmngP5RBddPmBrDt+jWaHMSzTP/bjZocgSWpzrZgfa2FOlaTqHKGUJEmSJNXEglKSJEmSVBMLSk
mSJElSTSwoJUmSJEk1saCUJEmSJNXEglKSJEmSVJOO/NmQiNgDOB5YAXwyMxc3OSRJUoeIiG7gAuCBit17ZOZTw+hjEvAe4B5g78ycUaXNBsC3gfHAOsBJmXllRBwLzMvMB/qeI0nScHVkQQmcAHQDrwLOBPZoajSSpE4zr1oROFTlhdDFZXHan2OA8zLz6ogYB9wWEVdl5km1vq8kSX21dEEZEdOAfYB1KabvHgCcDmwIPAzsl5nLqpy6TWauiIh3AE80KFxJkqoq89mBwDjgmsycGRE3APcBWwGXAVsAE4FpwMuAvYEryvM/DSzPzDMiYvey7VLggIh4GLgDmJiZKyPiPGA28BXg5ZQ5NDMnRsTxwM4UM3iOyMy7G/DxJUktrB3uoXwsM7uB7wAHA9dl5jbApcAm1U4oi8nDgGuBKxsVqCRJpSkRsbB8HA1sAOwIbAf8c9lmdWAuRYH3OeAgilHHvav0Nxf4UPl6P+Ai4FvAzygutD4EHFF5QmZOBnYC/ggcHBHvBN6Rme+huED71ZH5qJKkdtYOBeUt5fNtFNNYbwfIzPMz877+TsrMc4Au4PMR8Yp6BylJUoV5mdldPr4B/B24GDgNWKui3b2Z+SSwNDOfpZhV87K+nWXmI8CyiNgYeG15f2R3Zp6amVsDmwMHRcSEPqeeCZyfmXcCE4DNI2IhcD6w3gh+XklSm2qHgnJi+bxl+ZgEEBFHR8QH+jaOiNUjYkFErAEsA5ZTTO2RJKnhImI88InM3A/4EsUCOr16htHVHOAbwPxy+zPl9FeAR8rHcxXv+1ngz5l5cblrCbCwnPWzD3DJMD+KJKkDtUNBuXlE3EgxPWczYHJ5dXVr4Oa+jTNzOUXS/Wl5/GuZ+UzjwpUk6UWeBJZExG3AD4BHy0V0hutyimmzc8rtI4BPR8StwH8CV2fmkvLYy4BZwFa9U2+BxcDDEXEz8BPgN7V+IElS52jpRXlK8zJzdsV2tXtLXiQzzwXOrV9IkiRVl5kLgYUV2yuBPas07a5oM6nKuX2fxwI3ZubDZdsHgV2qvP+08uWaVd7zuMHilySpUjsUlP2KiOMoFhyoND8zT2lGPJIk1UO5avkFwKeaHYskqbO0dEGZmecNcnwWxZQeSZLaVvnzHhMHbShJ0ghrh3soJUmSJElNYEEpSZIkSapJS095baSP7vMDurq6mh2GJEmjivlRkjqbI5SSJEmSpJpYUEqSJEmSamJBKUmSJEmqifdQDtFJ1x7EOuut0bT3P3nKdU17b0mS+tPs/DhSzLOSVBtHKCVJkiRJNbGglCRJkiTVxIJSkiRJklQTC0pJkiRJUk0sKCVJkiRJNbGglCRJkiTVxJ8NkSSpDUTEe4GZwFigB/hkZv6mSruFwN6Z+XjFvtnA5zPzqcZEK0lqF6O6oCwT3CRgArAUeAroycwdImJfYD4wFRifmbOH2fcmwE2Z+aYRDluSpIaKiFcDpwCTM/OxiJgInA9sNZTzM3NGPeOTJLWvUV1Q9ia4iDgPmJ2ZiysOHw4sqKXfiFgdOAlYtqoxSpI0CuwJXJaZjwFk5i8jYueImAYcCIwDrsnMmWX70yNiY+D6zJzZO2oJzAPuAbYBbsvMTzX0U0iSWk7L3UMZEYsjYnuKkcuzKvaPiYgzI2JRRCyIiK4BujkeOB14ps7hSpLUCK8Bfl+5oywuNwB2BLYD/rni8NmZuR2wbURsWrF/dYqicltgx4hYu65RS5JaXssVlACZuQhYDEyv2L0n8HRmbg+cUD5eoixGx5Z9SJLUDv4AbFy5IyL2AZ4DLgZOA9aqOPzz8vlOYJM+fd2bmT3AI8Ca9QhWktQ+WrKg7McEYHI5beckYP1+2n0E2L5st2lEnNOY8CRJqptrgL0iYjxARLyb4p7K6Zm5H/AlYJ2K9hMjYgywBXB/n756GhCvJKlNjOp7KAfRA4yp2F4CzMnMEyPizcD7qp2UmZ/ofR0RizPzsPqGKUlSfWXmnyLieODqiF
gBrAR2B74WEbcBTwKPRsS48pTpwDeBKzLzwYhoStySpNbXygXlrcBFwNxy+3Jgt4hYBKxNsWiPJEkdITMX8NLF6vas0rS7yrndfY9V7JMkqV8tUVBm5rSK15PK52OqND10mP1OWrXIJEmSJKlztURBWauIuATYsM/uWZl5fTPikSRJkqR20tYFZWZObXYMkiRJktSu2mmVV0mSJElSA7X1COVIOnby9+nq6mp2GJIkjSrmR0nqbI5QSpIkSZJqYkEpSZIkSaqJBaUkSZIkqSYWlJIkSZKkmrgozxAdvGAWa7xq7RHp65q9vz4i/UiS1GwjmR8bxTwsSSPHEUpJkiRJUk0sKCVJkiRJNbGglCRJkiTVxIJSkiRJklQTC0pJkiRJUk06cpXXiNgbOJaioD4jM89rbkSSJNVfRFwBPJiZM5odiySpPXTqCOVM4P3AdsC/RsTY5oYjSVJ9RcSGwEpg24hYo9nxSJLaQ0uPUEbENGAfYF2K4vgA4HRgQ+BhYL/MXFbl1J0z85mIWB0YQ5FgJUlqZ/sC1wFvBvaIiNuAOcBzwLPA3PLxPWAj4HHgwMx8sjnhSpJaQTuMUD6Wmd3Ad4CDgesycxvgUmCTaidk5qPly28C52ZmTyMClSSpifYDLqMoIqcBRwFfzswdgGfKNh8Hflrm1bnApxofpiSplbT0CGXplvL5NuAQinsjyczz+zshIlajGMn8W2Z+ve4RSpLURBGxCTABuKDctTWwJvDVcvvW8nkCsHVETAHWAO5oZJySpNbTDiOUE8vnLcvHJICIODoiPtDPOScDT2Tm0Q2IT5KkZvsocGxm7pqZuwJfo1hHYPPyeO/zEmB2OUJ5FHBDowOVJLWWdhih3DwibgT+DmwGnBoRHwEeAU7t2zgiXgMcCfw8IhaWu/fKzCcaFK8kSY22L9BdsX0RxbTXYyLiX4G1gOXAWcB5EfFxiovOBzY2TElSq2mHgnJeZs6u2N57oMaZ+QjFNB9JkjpCZr6zz/ZvI+Jo4NeZuSQiLgGWZuZTwJSmBClJakntUFD2KyKOA3bqs3t+Zp7SjHgkSRpF/gBcXP501q+Am5scjySpBbV0QZmZ5w1yfBYwqzHRSJLUOjLzTuDdzY5DktTa2mG1p5AVAAAgAElEQVRRHkmSJElSE1hQSpIkSZJq0tJTXhvpezsfR1dXV7PDkCRpVDE/SlJnc4RSkiRJklQTC0pJkiRJUk0sKCVJkiRJNampoIyINUY6EEmS2oE5UpLUSYa0KE9EvAfoBr4K/AKYEBEHZ+bcOsY2qhzy4++zxnqvXKU+5n/oUyMUjSRptOj0HDkS+XFVmV8lqXmGOkJ5CkWS3Bt4GNgM+Ey9gpIkqYWYIyVJHWuoBeXYzLwB2Am4IjMfBMbWLSpJklqHOVKS1LGGXFBGxLuB3YEFEfF2wHtEJEkyR0qSOthQC8pZwEXAd8srr1cBn69XUJIktRBzpCSpYw1pUZ7MvAy4rGLXppm5oj4hSZLUOsyRkqRONtRVXjcCvgv8I/Be4AcRMS0z/1jP4KrE8V5gJsW9KT3AJzPzN1XaLQT2zszHK/bNBj6fmU81JlpJUicYLTlSkqRmGFJBCZwBXAEcAfwVWAycQ3G/SENExKspVtKbnJmPRcRE4Hxgq6Gcn5kz6hmfJKljNTxHlhdJJwETgKXAU0BPZu4QEfsC84GpwPjMnD3MvjcBbsrMN41w2JKkNjTUgnLjzPxORByemc8Bx0TE3fUMrIo9gcsy8zGAzPxlROwcEdOAA4FxwDWZObNsf3pEbAxcn5kze0ctgXnAPcA2wG2Z6Y9XSZJWRcNzZO9F0og4D5idmYsrDh8OLKil34hYHTgJWLaqMUqSOsNQF+VZGRHPt42IVwzj3JHyGuD3lTvK4nIDYEdgO+CfKw6fnZnbAdtGxKYV+1enKCq3BXaMiLXrGrUkqd2NhhxJRCyOiO0pRi7Pqtg/JiLOjIhFEbEgIroG6OZ44HTgmTqHK0lqE0
NNeJcBFwLrRsR04Ebgh3WLqro/ABtX7oiIfYDngIuB04C1Kg7/vHy+E9ikT1/3ZmYP8AiwZj2ClSR1jNGQIwHIzEUUU26nV+zeE3g6M7cHTigfL1EWo2PLPiRJGpIhFZSZ+WXgGuA2ih9uPhv4Uh3jquYaYK+IGA9Q/ubXKcD0zNyvjGedivYTI2IMsAVwf5++ehoQrySpA4ySHDmQCcDk8taPk4D1+2n3EWD7st2mEXFOY8KTJLWyoa7y+oPMPJBiEZymyMw/RcTxwNURsQJYSbHgwdci4jbgSeDRiBhXnjId+CZwRWY+GBFNiVuS1N5GQ47sowcYU7G9BJiTmSdGxJuB91U7KTM/0fs6IhZn5mH1DVOS1A6GuijPxIgYU04TbZrMXMBLFxrYs0rT7irndvc9VrFPkqRajYocWeFW4CJgbrl9ObBbRCwC1qZYtEeSpBEx1ILyj8C9EfELiqXJAcjMT9clKkmSWkfTcmRmTqt4Pal8PqZK00OH2e+kVYtMktQphlpQ/pwXFrmRJEkvaLkcGRGXABv22T0rM69vRjySpNY1pIIyM79Y70AkSWpFrZgjM3Nqs2OQJLWHoS7KczdVVkbNzHeOeESSJLUQc6QkqZMNdcrrERWv1wT2Bh4a+XBGr3N3OYiuroF+C1qS1KE6OkeaHyWpsw11yuuLfuQ4Im4A/hOYVY+gJElqFeZISVInW63G89YHXjeSgUiS1CbMkZKkjlHLPZRjgDcCZ9crKEmSWoU5UpLUyWq5h7IH+FNm3leHeCRJajXmSElSxxpqQXlgZr7oR5Ej4tLM/HAdYhqVDr32StZYb3xN5149Zf8RjkaSNIp0dI5clfzYH/OmJLWOAQvKiDgTeD3w3oio/AHkNYAJ9QxMkqTRzBwpSdLgI5TfBd4OTAQurdi/HPh5vYKSJKkFmCMlSR1vwIIyM28Hbo+IGzJzaYNikiRp1DNHSpI09Hso3xARpwPjKFawGwv8Q2a+sW6RSZLUGsyRkqSONdTfoTyH4keaXwlcCDzJi6f3SJLUqcyRkqSONdSCsiczTwYWAr8BpgLvq1dQkiS1EHOkJKljDbWg/N/yeQnw9sx8lmJKjyRJnc4cKUnqWEO9h/K/ImIu8AVgfkS8BVhRv7BeKiK6gQuAByp2n5SZ1w2zny2Ar1As674SODwzc6TilCR1nKbmSPOjJKmZhlpQHgVsnZn3R8QMYEfgo/ULq1/zMnPGKvZxKrB7Zj4WEVsCpwAfXPXQJEkdajTkSPOjJKkphlRQZmZPRKyMiOnA94C/joarlhExDdiDYiGEZ4C7gN2AmzLz2H5OewQ4IiIuzMzbI2JKQ4KVJLWl0ZgjzY+SpEYZ0j2UEXEwRZL8HDAeuDIiPl7PwPoxJSIW9j6AjYDHM3Nniuk59wL/RJFE+3MQRYL9cUTchwsnSJJWwSjJkeZHSVJTDHVRnk8B2wJPZuajwBbAqk6tqcW8zOzufQAPUyRJKJZpfyAzVwDLq50cEWtRLJjw2cz8R+BA4OwGxC1Jal+jIUeaHyVJTTHUgnJFZj7Zu5GZv6efpNQEPcNouxL4fkS8odz+b+CJkQ9JktRBRmuOND9KkupuqIvy/DUiJlEmp4jYH/hr3aLq35Qyjl7nD+fkzHwuIj4JXBoRyyk+z1EjGaAkqeOMhhxpfpQkNcVQC8ojgXnAmyPiIeBZYK+6RVVFZi4EugY4Pq3i9aQB2t0A3DCSsUmSOlpTc6T5UZLUTENd5fU3ETEReAvFjzVnZj5X18hWUUQcB+zUZ/f8zDylGfFIktpTq+VI86MkaSQNWFBGxNmZ+S/l5nqZeV8DYhoRmTkLmNXsOCRJ7alVc6T5UZI0kgZblGfLitcL6hmIJEktxhwpSep4g015HdPP647z3cl70dXV7y0qkqTOY47E/ChJnW6oPxsCw1t+XJKkTmKOlCR1pMFGKFeLiPUorryOrXgNQGY246dDJEkaDcyRkqSON1hB+Q7gz7yQIP9Sca
yHYjU7SZI6kTlSktTxBiwoM3M4U2Lb2sevXcia663f7/Erp0xuYDSSpGYzRxYGy4/DZT6VpNZiMpQkSZIk1cSCUpIkSZJUEwtKSZIkSVJNLCglSZIkSTWxoJQkSZIk1cSCUpIkSZJUEwtKSZL0vIjYOCJubHYckqTWMODvUI4WEdENXAA8ULH7pMy8bpj9rAV8FXgXsBZwTWZ+caTilCSplUXEB4AvU+RISZIG1RIFZWleZs5YxT5mAvdl5pERMQaYGxF7ZeaVqx6eJEmjQ0RMA/YB1qWYjXQAcDqwIfAwsF9mLqty6nJgF2BhQwKVJLW8ViooX6RMlnsArwSeAe4CdgNuysxj+zltb2AzgMzsiYhDgGfrH60kSQ33WGbuFREfAw4GrsvM08vtTYD7+p6QmYsAIqKxkUqSWlYrFZRTImJSxfZ1wOOZOSUiLgPuBf4d+CXQX0G5IjN7ejcy86m6RStJUnPdUj7fBhxCmRsz8/ymRSRJajuttCjPvMzs7n1QTNm5tzz2JPBAZq6gmK7Tn9Ui4vnPHBGbRsTmdYtYkqTmmVg+b1k+JgFExNHlvZKSJK2yViooq+kZvMmL/Ag4FCAixlIs0POmkQ5KkqRRYPNytdYDKG73mBwRC4GtgZubGZgkqX208pTXWqbszATOiIgDgZcDV2Tm5SMRnCRJo8y8zJxdsb33UE/MzEmDt5IkqUUKysxcCHQNcHxaxet+k2BmPktxH4kkSR0rIo4Dduqze35mntKMeCRJraslCsrhMlFKkjpZZp43yPFZwKzGRCNJamdtWVCaKCVJkiSp/lp9UR5JkiRJUpO05QhlPXxncjddXf3exilJUkcyP0pSZ3OEUpIkSZJUEwtKSZIkSVJNLCglSZIkSTXxHsohOvy6e1lzvUef35734c2bGI0kSaND3/y4qsyvktRaHKGUJEmSJNXEglKSJEmSVBMLSkmSJElSTSwoJUmSJEk1saCUJEmSJNXEglKSJEmSVJO2+dmQiBgLnAK8HXgFcF1mfrG5UUmSJElS+2qbghLYFViZmTsDRMRVEbF5Zt7Z5LgkSRpRETEbmARMAJYCTwE9mblDROwLzAemAuMzc/Yw+j0ReD8wBvhUZt4x4sFLktpKOxWUDwE7RMQuwCLgQ8DqETEH2Ah4HDgQ+CBFspwO3AQclJlLmhOyJEnDl5kzACLiPGB2Zi6uOHw4sGC4fUbEW4BJmbldRGwKnEZxsVaSpH61zT2UmXkX8G/AkcAfgG8D/wL8NDO7gbkUV1svADYE5gAXWkxKktpBRCyOiO0pRi7Pqtg/JiLOjIhFEbEgIrr66eK3FBdeobjg/Fx9I5YktYO2KSgj4h3AHZm5G/B6imT4H8DBEbEQOAJ4Tdn8NGAX4PwmhCpJUl1k5iJgMcUsnF57Ak9n5vbACeWj2rnPZeZjETEOOAeYVe94JUmtr20KSooC8XMAmfks8ADwWYqpQN3AUcANEbE68AXgROArzQlVkqSGmQBMLi+ungSs31/DiFgPuIYid/6iMeFJklpZOxWUpwLrR8RdEfEz4E3A2cBeEXEz8A3gHuBY4KrMPBl4c0S8t2kRS5I08nooFtXptQSYU15cPQS4qtpJ5WrpVwGnZOa8egcpSWoPbbMoT2YuAw6tcmhKn+1/rzhnt7oGJUlS490KXESxdgDA5cBuEbEIWJti0Z5q9gbeBnwmIj4D/DYzD653sJKk1tY2BaUkSZ0mM6dVvJ5UPh9TpWm1C659+7oUuHTEgpMkdQQLSkmSOkxEXEKx4nmlWZl5fTPikSS1LgtKSZI6TGZObXYMkqT20E6L8kiSJEmSGsgRyiE6Y9e30dXV329BS5LUmcyPktTZHKGUJEmSJNXEglKSJEmSVBMLSkmSJElSTbyHcoi+/eNHecWrxj6/fcw+r21iNJIkjQ598+NwmU8lqbU5QilJkiRJqokFpSRJkiSpJhaUkiRJkqSaWFBKkiRJkmpiQSlJkiRJqokFpSRJki
SpJhaUkiRJkqSajOrfoYyI2cAkYAKwFHgK6MnMHSJiX2A+MBUYn5mzh9HvscA+wGPA/pn5lxEPXpKkOjE/SpJGi1E9QpmZMzKzG7gOOCwzuzNzh/Lw4cAaw+0zIl4PfCAztwbOAo4aqXglSWoE86MkabQY1SOU1UTEYuBIiiuzZwHXlvvHAGcAmwHLgEMyc2mVLrYEbilf3wDMqHfMkiTVm/lRktQMo3qEsj+ZuQhYDEyv2L0n8HRmbg+cUD6qeSXwv+Xrp4Fx9YpTkqRGMj9Kkhqt5UYoBzABmBwRWwJjgP7u+3gS2Lh8Pa7cliSpXZkfJUl105IjlKUeisTYawkwp7yn5BDgqn7OuwN4bzkF6APAL+oZpCRJDWZ+lCQ1TCsXlLcCF1VsXw5sHBGLgDnAPdVOKu8b+Qnwn8CngK/XOU5JkhrJ/ChJapiWmPKamdMqXk8qn4+p0vTQIfZ3MnDyiAQnSVKTmB8lSc3WEgVlrSLiEmDDPrtnZeb1zYhHkqTRwPwoSRopbV1QZubUZscgSdJoY36UJI2UVr6HUpIkSZLURG09QjmSPrHLq+nqem2zw5AkaVQxP0pSZ3OEUpIkSZJUEwtKSZIkSVJNLCglSZIkSTWxoJQkSZIk1cRFeYZo4dWPsf56L2Pyfhs0OxRJkkaN3vw4HOZSSWofjlBKkiRJkmpiQSlJkiRJqokFpSRJkiSpJhaUkiRJkqSaWFBKkiRJkmpiQSlJkiRJqsmo/tmQiJgNTAImAEuBp4CezNwhIvYF5gNTgfGZOXuYfe8FdGfmUSMctiRJDRMR3cAFwAMVu0/KzOuG2c9awFeBdwFrAddk5hdHKk5JUnsa1QVlZs4AiIjzgNmZubji8OHAglr6jYijgenAtasaoyRJo8C83py5CmYC92XmkRExBpgbEXtl5pWrHp4kqV2N6oKymohYDBxJMXJ5FmVRWCa/M4DNgGXAIZm5tJ9u7qcoSPese8CSJDVYREwD9gBeCTwD3AXsBtyUmcf2c9reFDmUzOyJiEOAZ+sfrSSplbXkPZSZuQhYTDHK2GtP4OnM3B44oXz0d/7VwIq6BilJUuNMiYiFvQ9gI+DxzNwZWAncC/wTRZHZnxWZ2dO7kZlPZebyegYtSWp9LTdCOYAJwOSI2BIYA/ylyfFIktQoL5ryWo5QLis3nwQeyMwVETFQgbhaRKyWmSvLPjYFXpmZd9YraElS62vJEcpSD0Xh2GsJMCczu4FDgKuaEZQkSaNEz+BNXuRHwKEAETGWYoGeN410UJKk9tLKI5S3AhcBc8vty4HdImIRsDbFPZKSJHWCKRExqWL7/Br6mAmcEREHAi8HrsjMy0ciOElS+2qJgjIzp1W8nlQ+H1Ol6aHD6HMhsHAVQ5MkqanKfNY1wPFpFa8nDdDuWYoZPpIkDVlLFJS1iohLgA377J6Vmdc3Ix5JkpotIo4Dduqze35mntKMeCRJra2tC8rMnNrsGCRJGk0ycxYwq9lxSJLaQysvyiNJkiRJaqK2HqEcSd17rEdX1wbNDkOSpFHF/ChJnc0RSkmSJElSTSwoJUmSJEk1saCUJEmSJNXEglKSJEmSVBMX5Rmi/77wzzyx7hq87ROvaXYokiSNGr35cSjMoZLUfhyhlCRJkiTVxIJSkiRJklQTC0pJkiRJUk0sKCVJkiRJNbGglCRJkiTVxIJSkiRJklSTUf2zIRExG5gETACWAk8BPZm5Q0TsC8wHpgLjM3P2MPo9EXg/MAb4VGbeMeLBS5JUJ/XKj2XfewHdmXnUCIctSWpDo7qgzMwZABFxHjA7MxdXHD4cWDDcPiPiLcCkzNwuIjYFTgN2HYFwJUlqiHrkx7K/o4HpwLWrGqMkqTOM6oKymohYDBxJcWX2LMqkFxFjgDOAzYBlwCGZubRKF78FDixfrw48V++YJUmqtxHIjwD3UxSke9Y9YElSW2jJeygzcxGwmOIqaq89gaczc3vghPJR7dznMvOxiBgHnAPMqn
e8kiQ1wqrkx/L8q4EVdQ1SktRWWm6EcgATgMkRsSXFvZF/6a9hRKwHXEkxTegXDYpPkqRmGHJ+lKT/z96dh8lRVosf/8awKSjIJmoUXE9EgcgiICIBWQTDorIoIITlinIBARe4ooAoCoIYBEEUEdmRsMkqsgS5CLJG2Tz8yFWuiAG8sogIEpLfH28NNsNMplOZ7p7u+X6ep5/uqq7lVM08ffrU+77V0rzqyhbKyhxKYuwzAzgnMycCuwKXDLRSRIyt3jsqM6e2OkhJktqsVn6UJKmObi4obwHOapi+EFghIq4HzgHuHmS9rYB3A5+PiGkR8ZPWhilJUlvVzY+SJM2zrujympmTG15PqJ4PGGDR3ZrY1vnA+cMWnCRJHTKc+bFhO9OAafMZmiRplOiKgrKuiDgPWKbf7MMz85ediEeSpJHA/ChJGi49XVBm5jadjkGSpJHG/ChJGi7dPIZSkiRJktRBPd1COZzescPSjBv3uk6HIUnSiGJ+lKTRzRZKSZIkSVItFpSSJEmSpFosKCVJkiRJtVhQSpIkSZJq8aY8kiSptsdOnsECr356yOWW+/z4NkQjSWo3WyglSZIkSbVYUEqSJEmSarGglCRJkiTVYkEpSZIkSarFglKSJEmSVIsFpSRJkiSpllH7syERsQJwSmZu0OlYJEkaCSJiK+BAygXnEzLz1M5GJEka6UZlC2VEfAg4F1iy07FIkjSCHApsAKwDfCEixnY2HEnSSNfVLZQRMRn4KLA4pTjeEfg+sAwwE9guM58bYNVZwCbAtLYEKklSG81Hftw4M5+JiAWAMcDs9kQsSepWvdBC+XhmTgR+BOwCXJmZawHnA28daIXMvD4zn2hfiJIktV2d/Pho9fJYyrCQOe0IVJLUvbq6hbJyQ/V8K7ArZewHmXl6xyKSJKnz5jk/RsQrKC2Z/8zM77Q8QklS1+uFFspVqufVq8cEgIjYvxorKUnSaFQnPx4JPJmZ+7chPklSD+iFFspVI+Ja4F/AisBxEfFJ4BHguI5GJklS58xTfoyI1wGfA26KiGnV7C0z88k2xStJ6kK9UFBOzcwpDdNbNbtiZk5oQTySJI0E85QfM/MRYKHWhiRJ6jW9UFAOKiIOAjbqN/uyzDyqE/FIkjQSmB8lScOlqwvKoX5wOTMPBw5vTzSSJI0M5kdJUrv0wk15JEmSJEkdYEEpSZIkSaqlq7u8SpKkzlpm97ex3LhxnQ5DktQhtlBKkiRJkmqxoJQkSZIk1WJBKUmSJEmqxTGUkiSptv/76U0s9JqlXzJv2b3X71A0kqR2s4VSkiRJklSLBaUkSZIkqRYLSkmSJElSLRaUkiRJkqRaLCglSZIkSbVYUEqSJEmSarGglCRJkiTV0hW/QxkRE4EzgAcaZh+RmVfO43YWBb4LrATMBq4DDsnMF4YpVEmS2sb8KEnqtK4oKCtTM3Pf+dzG94AbM/PTABFxMPBfwDfmNzhJkjrE/ChJ6phuKihfIiImA5OA1wDPAHcCmwHXZeaBAyy/ELBqZu7WMPubwK2YMCVJPcL8KElqp24aQ7l1REzrewDLAU9k5saU7jn3AO+nJNGBLA080jgjM2cBY1sXsiRJLWd+lCR1TDe1UL6kS091Bfa5avIp4IHMfCEiZg2y/l+BZRtnRMQCdNc5kCSpP/OjJKljuqmFciBzml0wM/8FTI+InQEi4iLgFODcFsUmSVKnmB8lSW3RTVcft46ICQ3Tp9fYxl7AdyNiD2ARylXZJyJiwcx8fjiClCSpzcyPkqSO6YqCMjOnAePm8v7khtcT5rLcM8AejfMiYnWTpSSpG5kfJUmd1hUF5byKiIOAjfrNviwzj+q/bGbe1p6oJEnqLPOjJGm49WRBmZmHA4d3Og5JkkYS86Mkabh1+015JEmSJEkd0pMtlJIkqT2W2nltlh036DBOSVKPs6Ac2liAmTNndjoOSVILNXzOj+1kHF3E/ChJo8TccqQF5dBeD7DDDjt0Og5JUn
u8HpjR6SC6gPlRkkafl+VIC8qh3QqsC/wFeKHDsUiSWmcsJVHe2ulAuoT5UZJGj0Fz5Jg5c+a0PxxJkiRJUtfzLq+SJEmSpFosKCVJkiRJtTiGUhomETEHuJsylmgO8CrgKeCzmXnbEOtOA47PzKlzWeYtwNGZ+fGIeAMwNTPfPwxxbwFsmJn7zO+25nG/Lx5PO/crSWov8+M879f8qK5iQSkNr/Uz8699ExHxBeA4YO1h2PbyQABk5sPAfCfLals/B34+HNuaRy8ejySp55kfm2d+VFexoJRaJCIWAN4M/K1h3kHAxyndzf8I7Fklv8b1vgxsBSwCLAp8gZLQTgbeGBG/APagXO19DfAg8NG+q7wRcQ5wfWae2OT+JgNbZ+ak6krw7cAGwLLAscDrgPWqWLbNzLuq5e4FVgeWBk7PzEOq7W0FHEK5G9hTwP6ZeUtEHEr54vB64B5gjb7jycxNBjruzLywWm+Far3lgceA7TLz4Yh4J3BSFets4BuZeW5EvBE4vjr/CwLnZOY35/4XkyS1g/nR/Kje4hhKaXhdFxG/jYiHgfurebsARMROwErA+zJzAnA5JQm+KCKWBzYE1svMlYGDgMMy8wVgd2BGZm7St3xmzgZOASZX678W2Ag4q5n9DWKFzHwv8DHgSGBaZq4OXAns3bDc8sA6wKrAdhExKSLGAz8APl7FfzBwcUS8pmGdVTPzk43HM9hxN+xrXWCbzBwPPE75wgBwDnBeZr4b2Az4ZrWv04FTMnM14H3AhhGxbRPHLklqDfOj+VE9yoJSGl7rZ+YqwEcoY0R+nZmPVu9NAtYCbouI6ZTk85IuLZn5ILAzsENEHAF8BlhsiH2eAmwbEQsBnwQuycwnm9nfIC6onvt+tPbKhuklG5Y7KTOfz8wngPOATShXbq/JzP+pjuda4FFgtWqdmzNzVv8dNnHc0zLzqer1ncCSEbEksArVl4DM/FNmvo0yRmc94OvVcd9MuRI7oYljlyS1hvnR/KgeZUEptUBm3gnsB5wcEStUs8cCR2bmhOqK6OqUK5gviohVgV9TuupcRbkCOmaIfT0I3EFJkLsAP2p2f4N4rt/2nx9kucbE9wpKohroM+UVlG41AE8PtKEmjvufDa/nVO/Napju205QuvKPAd7fcOxrAXbpkaQOMz++hPlRPcGCUmqRzDwbuAmYUs36BbB7Q/eWwyhdTxp9ELgtM48BrqeMmRhbvTeLfyee/n4EHAC8KjNvnIf9zY8dI+IVVTeibYFLgGuBjSPirQARsQHwJuA3A6zfeDxzO+4BVVdkb6dcuSUi3gTcCLySctV1/2r+EtX8LWsfqSRp2JgfzY/qLRaUUmvtBWwaEZtQup5cCtwcEfcAK1ON7WhwNrB0RNxLSQZPU7qvvJoyUP+FiLiFl1+V/TllYP6PG+Y1s7/58UrgFkpyOiEzr8nMe4E9gQsi4m7gCGDzqotRf43HM7fjnpvtKd2ZfktJ2Ltn5sxq/loRcRclWZ+dmWfO7wFLkoaN+dH8qB4xZs6cOUMvJUkNoonfBZMkabQxP2o0soVSkiRJklSLLZSSJEmSpFpsoZQkSZIk1WJBKUmSJEmqxYJSkiRJklSLBaUkSZIkqRYLSkmSJElSLRaUkiRJkqRaLCglSZIkSbVYUEqSJEmSarGglCRJkiTVYkEpSZIkSarFglKSJEmSVIsFpSRJkiSpFgtKSZIkSVItFpSSJEmSpFosKCVJkiRJtVhQSpIkSZJqsaCUJEmSJNViQSlJkiRJqsWCUpIkSZJUiwWlJEmSJKkWC0pJkiRJUi0WlJIkSZKkWiwoJUmSJEm1WFBKkiRJkmqxoJQkSZIk1WJBKUmSJEmqxYJSkiRJklSLBaUkSZIkqRYLSkmSJElSLRaUkiRJkqRaLCglSZIkSbVYUEqSJEmSarGglCRJkiTVYkEpSZIkSarFglKSJEmSVIsFpSRJkiSpFgtKSZIkSVItFpSSJEmSpFosKCVJkiRJtV
hQSpIkSZJqsaCUJEmSJNViQSlJkiRJqsWCUpIkSZJUiwWlJEmSJKkWC0pJkiRJUi0LdDoAaR8wBNEAACAASURBVF5ExBzgbuAFYA7wKuAp4LOZedsw7WNrYK/MnDhM25sGLA882Tg/MycMx/bnst/FgQszc4MB3nsTcAXlPH4mM2+qsf3dgYUy84T5Dnbg7R8GPJCZp7Vi+3PZ70eANTPz4HbuV5KaFRG7AZ8GXgMsBPwP8JXM/M18bvfF/De/n8ERsQawW2Z+JiJWAGYAd1VvjwWeAfbPzBvnJ+ZB9n0w8NvMvHg4c0lEjAUuAN4FfC8zj6+xjRfPy/zGM8j2twA2zMx9WrH9uez3LcDRmfnxdu5XI4MFpbrR+pn5176JiPgCcBywdudCGtIXM3Nqm/f5WuB9g7y3PjAzMzecj+1/gFLct0QHC7o1gCU7tG9JmquI+CbwQWDbzHywmrcBcGlErJaZ/zsc+xmGz+B3A+Mapv/ZeCE1IrYFTgXeMZ/7GcgGwL0w7LnkjcAmwKKZ+ULNbfQ/L8MqM38O/LxV25+L5YHowH41AlhQqqtFxALAm4G/VdOvA04CXgcsBzxISbqPRsQfKcnrQ9U652bml6r1DgN2AP4P+H8N218c+D4wgdIiegXw5cycFRHPAt8FJlGuEn8R2AZYCXgY2Dwz/zFE/OOAE4EVgDHATzPzqOpq7g3AfdV76wFvAY4EFgVmA4dm5qURsRxwGrB0tdnLMvOrwE+AV0bEdGC1vuQXEesD3wAWj4jrMnP9iNgc+ArlSvczwBcy86bBziewDrAFsFFE/BNYBlg6M/eq9nFo33TVQvs3YHx1rKcBx1bnaUHgGkrBPavfuTkVuDszj272XEfELGAKpWBetPpbXVBt76vAJ4FZwP2Uq/Az+8V3LvAZYGxEPAl8s4r5nZQi8+/A9pmZ1Xo3VefizdXfa+fMnB0Rk6pz/ArgH5RW4N9GxPsH+hv2/7+QpIFUn8n7Am/LzL/0zc/MayNif8pnC1W++w2wMvBl4PnqeSFgWUqu+Wq17GD571T+/Rn8Lsrn9lKU1sXvZeYpETEROJzSQvoeYGHgP4EHgMMoeeYnwNcGOJylgBePISI+DexD6TnzCOUz+v4h8vDXgI8C/6rinwx8DFgdOCoiXgC25KW55AhgI+ANwLGZOaVqeTyKkteerM7dio09lSLi1cCVlLx1e0R8vDqfA52XV1By1lrAqyn5fXfgf/udl58Cx2fme6p9TOybrvLo2sDrgd9l5o4RcRDwcUpu+SOwZ2Y+3HhSI2IysHVmTqry1O2UAnvZKtbXUb5TLEr5fnRXtdy91XlbGjg9Mw+ptrcVcEh1fE9RWpVv6RffPZSLsW+MiF9k5iYR8WVgK2CRal9fyMwLq/VWqNZbHngM2C4zH46Id1K+cyxLyZHfyMxzI+KNwPGUXLsgcE5mfhONGI6hVDe6LiJ+GxEPUwoDgF2q508AN2Xm2sBbKcXRpxrWXSwz1wXeD+wdEW+JiC0pH9ATqvmLNyz/PUqSWonyQbsK8IXqvYWBv2TmSsAJwMmURL9itY0tG7ZzVERMb3hsVs0/E7iu2sY6wI4R8YnqvXHA1zPzncCzlALxU5m5KiXpnRgRbwb+A/ifav66wDuqBLwL1RXhxiupmXkdcDBwQ1VMvoNSOG2Wme+ldKO6ICIWHex8ZuaFlCug383M7w/4V3qpxzNzxcw8jpJkb8/M1YD3UpLX/kOs3+y5Hgv8rdr2tsApEbFMROwCbAqskZkrU1pWTx0gvq8BP6BcbDioWueJzFyr+jvcCuzVsN7bgImU/48NgPWqL3xnAJOrfR0FHBERr2Xwv6EkNWNt4L7GYrJPZp6emfc1zLo7M98FXAR8nnLBa3VKkfNfEbH0EPkPePHC7VTgwOqzdT3gCxGxVrXImsB3qvzxY8qFsj/x7zzTl59f2ZADH6QUN9+q9rEB8CVKD6RVgLOAiyJiDIPk4Wroxr6Uz/XVgasowx
W+D9xGuVB5Yb/DWRj4a2auA2xN+WxehFLsrUYpitemfLb3P79/Bzbj3y2tD87lvKxJKVjXzswVKYXjgYOcl7lZHli1KiZ3qs7B+6r9X07JhUNZofrbfIxyQXNadb6uBPbut691gFWB7SJiUkSMp+TEj1f57GDg4oh4Tb/4Pkk5hzOqYnJ5YENgvWq9gyiFdJ91gW0yczzwOLBHNf8c4LzMfDflXH+z2tfpwCnVeX4fsGHVwq0RwhZKdaP1M/OvEfFeypXKX2fmowCZeWxErFtdqX0HJTk0jim5uFruzxHxKKXVaUPggipZEBGnUK6SQiko1snMOcBzEfEDSgI7onr//Op5BnBXZv652sYfeGm3yZd1ea0KtnWAjauYnqyuCG8K3ExpSesb29h3FfCiiBd7lMyhXH2+Eri8KkyupiStJ6sCphkbVdu+pmHbs4G3N3E+m3VDw+tJwPuqMUAAr2xyG82e6+MBMvN3EXEXpWvYpsBPGlqMjwUOioiFBojvRZk5NSL+JyL2Bt5OKR4bx5tekpmzgb9HxANVHOtQvshNr7ZxAaVA34zB/4bD0kVNUs8bQ/ncAF5sNev7/FoM+FlmfrmavgEgM+dUvVAmRcT2lPF/YyitRnPLf33eSSmwTmn47Hol5YLgfcCDfZ93wB2UVsKB9O/y+n7gioiYAHyYciHvsSrmUyPiWEpL1mB5+NvAb4E7IuIK4IrMvGYu567PxQ2xLlydh82A0zLz2Sq2kwY4D/0Nel4y88SI+AqwR0T0XXj8exOx9XdzQ++dSZRi6rZqf2Mp95EYygXV84zq+cqG6YkNy52Umc8DT0TEeZSuvW8GrsnM/4EXW8IfpRTf/eN7UWY+GBE7AztExNspFzEWa1hkWmY+Vb2+E1gyIpakXCw4udrGn4C3Vd+V1quW+Xq1zmKUiyA/a+L41QYWlOpamXlnROwHnBwRN2fmHyPiSMoH7inAdZSuEWMaVvtnw+s5/Ds5Ny7T+OHYvxX/FdU2+zzX8Pr5eTyEV/Tbb//tP9fwQT2WclV6zb4FI+INwGOZ+XyUwfAbUlrJbqm6qLykG8xcjKUkjO0atv0m4OEmzmef/udwoX7vP91vf9v0XUmPiCVo+II0F82e6/5/vxcY+O+4QEPMTzOAiPgspcX2eMoV879Ruh73Gej/6Xle+oVvDOWq8qB/w7kciyQ1+g0wPiKWysz/qwrBCfDvoQYNyz5dzV+U8qX9QkqReQqlK+JQ+a/PWEpPjcZi8HWUrqFrMfDn4JAy89cRkZQcM1CPuTGUnDNgHs4yvGA9SqvlhsB3owzj+NwQu/5ntf85VWE2hnLcjXE3Mz5y0PMS5eZuxwLfoRSwvwd2HGAb85o7j8zME6t9LUy5V8JQGnMnVdE4kGZyZ997fd9TBsudq1KO+7uUluPrKcNH+gz0PzOrYbpvOwHMrN5/f2Y+U81fmtJzSyOEXV7V1TLzbEqL0ZRq1ibAlMw8HXiU0vo2dojNXAlsExFLRBn30NhF9hfAf0bEmOrD+9PAL4cp9r9TWiL/E14cr7nTINu/mdKV9YPVshMoY13eEBFHAF/NzIuAz1HGMryT8uE8tipo5uZaYOOqawtVS9rvKOMe5nY+Z/HvpPIYsFp1nhalanUdxC+A/RrO6c95aTfS+bVTdRyrUsZFXl/tc5cqNihXnn+Vmc8NsH7jcW0CnJqZPwYS2Jyh/59+A7wrIt5dTW9J6QI76N9wno9Q0qiUZbzcscB5jd3lq9frMHAh9A7K2POvZOYllNaehSmfZXPLfy/uFng2Inas9vUmyrCB1QZYtlHjZ+nLRBkv905KsfsLSjfLZar3dqF0c32AQfJwRKxSxXFfZn6LUrys0sy+B3AZZcjJwlUX38kMfaFzbudlI0oPlhMpQyW2YvDc+eaIWLbK1VvNZX+/AHZv6G56GKUr6HDZMSJeUfVu2ha4hH9/P3grvNg1+U0M3FOp8bg+CNyWmcdQcnDj8Q+oarG8Hdi52t
ebgBsprb43Uw2NqS5C38hLhxWpwywo1Qv2AjaNiE0oH7BHR8TtlG4e/03pqjiozLyccsX2NsqHZOPPe+xDGRx+V/VIyg0IhssOwIeqrpm3ULp1njpAjI9RxrkcFRG/pSSRT2W5w98UYEJE3F0dwx+Asyk3O7gDuC8ilhosgMy8h5Kgz6m2/XVgi6p76NzO5xXAPhHxX5SxoI9RCqTLeWm30P72oXQxuotSuN5F6bo0XNaJiDsof9PtMvNxyrieqymtt/dRxojsMMj61wBbRMRxwNGULkvTq/l3MPT/0yPVtn9arbc/8Ikh/oaS1JRqfPePgTMj4s7qs/8CSkvQfw2wyu+AS4HfV5+NW1BuwPL2IfJf3/7+RfnyvntE/K7az1dz6J/7uInSmto3jrFxDOV0yvjDT2fm/Zn5S0pBeG1E3EMpKiZVQwoGzMOZ+VtKl8fbIuI2YFdgv2pfl1By185DxNjn1Or47wR+TbnJzzNzW2GI8/IDypj631XnYQbwlqpof/G8ZOa9lJvQ3EYpml42NrbByZS/483VOVqZwbsX1/FKyveQm4ETMvOaKr49KcM27qYM99k8M1/2f0K5mP1CRNxC+Q6ydETcSykSn6Z0WX31EDFsD2xb5chLgN0zc2Y1f63qu9JvgLMz88z5PWANnzFz5jTT00ySRr4ov1O6TDb8rIwkSXMTERsDy2bmGdX0scCzmXlAZyNrjyh3eT2+/70epGY5hlKSJEmj2T3AFyPii5Tvxr8FPtvZkKTuYQulJEmSJKkWx1BKkiRJkmqxy+sQqjuKrUEZKN3MbaQlSd1pLOW3Qm8d5A7AamB+lKRRZdAcaUE5tDUY5EfPJUk9aV3KHY01d+ZHSRp9XpYjLSiH9heAM888k+WWW67TsUiSWmTmzJnssMMOMPdb9+vfzI+SNErMLUdaUA7tBYCFrr6ZhRdfotOxSJIGsMxndxzOzdl9sznmR0lq0jDnqU56WY70pjySJEmSpFosKCVJkiRJtVhQSpIkSZJqsaCUJEmSJNViQSlJkiRJqsWCUpIkSZJUiz8bIklSD4qIicAZwAMNsydl5tOdiUiS1ItGdEEZEVOACcB44CHgaWBOZq4fEdsClwHbAEtk5pR52O6hwCbA48D2mfnEcMcuSdIIMDUz9+10EJKk3jWiC8q+JBgRpwJTMnN6w9t7AlfN6zYjYnlgrcxcOyI+BewOHD0M4UqSNKJFxGRgJ2Ax4PLMPDQirgH+BvwBmAp8hzIk5vzMPKZTsUqSusOILigHEhHTgc9RWi5PAq6o5o8BTgBWBJ4Dds3Mh/qvn5kPRsRHqsk3Av/XjrglSeqArSNiQvX655RCcUNgLHAXcGj1+juZeXNE3AB8HHgMuDQipmbm/7Y/bElSt+jKm/Jk5vXAdGCPhtmbA//IzPWAQ6rHYOu/EBEnAfsCN7QyVkmSOmhqZk6sHscA/wLOBo4HFm5Y7v7q+V3Az4DrgHHACm2MVZLUhbqyoBzEeGDTiJgGHAEsNbeFM3MPYD3gR60PTZKkzoqIJYDPZOZ2wGHAog1vz66e7wW2yMyJlPx4P5IkzUU3F5RzgDEN0zOAc6okuCtwyUArRcR7I+J71eQzwAutDFKSpBHiKWBGRNwKnAY8GhGL9VvmIEpX11soQ0geaXOMkqQu03VjKBvcApwFnFtNXwhsFhHXA6+i3LRnINOBBapxIrMp4zElSeopmTkNmNYwPZsyPKS/iQ3L3AB8sMWhSZJ6SFcUlJk5ueH1hOr5gAEW3a2Jbc1h8GJTkiRJktSkrigo64qI84Bl+s0+PDN/2Yl4JEmSJKmX9HRBmZnbdDoGSZIkSepV3XxTHkmSJElSB/V0C+VwWnLHrVhm3LhOhyFJ0ohifpSk0c0WSkmSJElSLRaUkiRJkqRaLCglSZIkSbVYUEqSJEmSavGmPE169IwjGPuaRTsdhjrk9Xse2ekQJGlEMj9K3c/vOZoftlBKkiRJkmqxoJQkSZIk1WJBKU
mSJEmqxYJSkiRJklSLBaUkSZIkqZaeuctrRIwFjgLeA7wauDIzv9bZqCRJGjki4iLgj5m5b6djkST1hl5qofwwMDszN87MtYHVI2LVTgclSdJIEBHLALOBtSNiwU7HI0nqDT3TQgk8DKwfEZsA1wMfAxaIiHOA5YAngJ2ALYANgD2A64CdM3NGZ0KWJKlttgWuBN4GTIqIW4FzgOeBZ4Fzq8dPaMibmflUZ8KVJHWDnmmhzMw7gS8DnwP+DPwA+DTw35k5kZIk987MM4BlKEn0TItJSdIosR1wASX/TQb2A76ZmesDz1TL/Af98mb7w5QkdZOeaaGMiJWA2zNzs4hYBDgJ+C5wZ0RsDSwI3F4tfjxwPrBzR4KVJKmNIuKtwHjgjGrWmsBCwLer6Vuq5/HAmgPkTUmSBtQzLZTAJsCXADLzWeAB4IvAlOpK637A1RGxAPBV4OvAtzoTqiRJbbU9cGBmfjgzPwwcDawD9N1roO95Bv3yZrsDlSR1l14qKI8DloqIOyPiRmB54IfAlhHxK+AY4G7gQOCSzDwSeFtErNuxiCVJao9tgYsaps8CHgEOiIhrgDcCsyi9e/rnTUmSBtUzXV4z8zlgtwHe2rrf9Dca1tmspUFJkjQCZObK/ab/EBH7A/dm5oyIOA94KDOf5uV5U5KkQfVMQSlJkubJn4Gzq99x/h3wqw7HI0nqQhaUkiSNQpl5B/C+TschSepuvTSGUpIkSZLURhaUkiRJkqRa7PLapGV3PJDXjxvX6TAkSRpRzI+SNLrZQilJkiRJqsWCUpIkSZJUiwWlJEmSJKkWC0pJkiRJUi3elKdJvztrd/6y+IKdDkPzYI09Lul0CJLU88yP/2bekTQa2UIpSZIkSarFglKSJEmSVIsFpSRJkiSpFgtKSZIkSVItFpSSJEmSpFq6/i6vETEROAN4oGH2pMx8ujMRSZLUnSJiEnAw8ALw2cyc3uGQJEkjXNcXlJWpmblvp4OQJKnLHQJMBJYETgQmdTQaSdKI1ysF5UtExGRgJ2Ax4PLMPDQirgH+BvwBmAp8h9Ll9/zMPKZTsUqSNNyqPPhRYHFKrtsR+D6wDDAT2C4znxtg1bUy84WIWAl4sk3hSpK6WK+Modw6IqZVj/2BpYENgXWAT1TLjAW+k5lfohSTHwc+AHwoIt7ciaAlSWqhxzNzIvAjYBfgysxcCzgfeOtAK1TF5O7AFcDF7QpUktS9eqWF8iVdXiNiH+Bs4Alg4Ybl7q+e3wX8rHr9WmAF4H9bH6YkSW1zQ/V8K7ArcCBAZp4+t5Uy8+SIOBu4KSKuyMy/tzZMSVI365UWyhdFxBLAZzJzO+AwYNGGt2dXz/cCWzRcub0fSZJ6yyrV8+rVYwJAROwfER/qv3BELBARV0XEgsBzwCzKzXkkSRpUr7RQNnoKmBERt1avH42IxfotcxBwaUQsAtwGPNLmGCVJarVVI+Ja4F/AisBxEfFJSs47rv/CmTkrIs4B/ptSSB6dmc+0M2BJUvfp+oIyM6cB0xqmZwObD7DoxIZlbgA+2OLQJEnqpKmZOaVhequhVsjMU4BTWheSJKnXdH1BKUmS5k1EHARs1G/2ZZl5VCfikSR1LwtKSZJ6TGaeOsT7hwOHtycaSVIv67mb8kiSJEmS2sOCUpIkSZJUi11em7Ty9iczbty4TochSdKIYn6UpNHNFkpJkiRJUi0WlJIkSZKkWiwoJUmSJEm1WFBKkiRJkmrxpjxN+uV5k1ly8QU7Hcaos+WuV3Q6BEnSXJgfzVWSRjdbKCVJkiRJtVhQSpIkSZJqsaCUJEmSJNViQSlJkiRJqsWCUpIkSZJUiwWlJEmSJKmWlv1sSERMBM4AHmiYPSkzn56HbUwAPgDcDWyVmfsOsMzSwA+AJYBFgSMy8+KIOBCYmpkP9F9HkqRuFhFTgAnAeOAh4GlgTmauHxHbApcB2wBLZOaUedjuocAmwOPA9pn5xHDHLknqLa3+HcqpAxWBzcrM6c
D0qjgdzAHAqZl5aUQsBtwaEZdk5hF19ytJ0kjWl1sj4lRgSpUv++wJXDWv24yI5YG1MnPtiPgUsDtw9DCEK0nqYa0uKF8iIiYDOwGLAZdn5qERcTVwH7AGcAGwGrAKMBlYBNgKuKhafx9gVmaeEBEfqZZ9CNgxImYCtwOrZObsviQLfAt4JbA48IrMXCUiDgY2Bl4A9srMu9pw+JIktUxETAc+R2m5PAm4opo/BjgBWBF4Dtg1Mx/qv35mPljlVoA3Av/XjrglSd2t1WMot46IadVjf2BpYENgHeAT1TILAOdSCrwvATtTWh23GmB75wIfq15vB5wFfA+4Efg+8DCwV+MKmbkpsBHwF2CXiFgZWCkzPwDsCHx7eA5VkqTOyszrgenAHg2zNwf+kZnrAYdUj8HWfyEiTgL2BW5oZaySpN7Q6oJyamZOrB7HAP8CzgaOBxZuWO6ezHwKeCgznwWepLROvkRmPgI8FxErAK+vxkdOzMzjMnNNYFVg54gY32/VE4HTM/MOyniTVSNiGnA68NphPF5Jkkaa8cCmVd47Alhqbgtn5h7AesCPWh+aJKnbte0urxGxBPCZzNwOOIxyA50+c+ZhU+cAx1BuOADw+YYuOo9Uj+cb9vtF4K+ZeXY1awYwLTMnAh8FzpvHQ5EkaSSbA4xpmJ4BnFPlvV2BSwZaKSLeGxHfqyafoQwLkSRprtr5syFPATMi4lbgNODR6iY68+pCSrfZc6rpvYB9IuIW4NfApZk5o3pvEeBwYI2+rreUrkAzI+JXwDXA7+sekCRJI9AtlCEhfS4EVoiI6ym58+5B1psOLBARN1Du0l77pnqSpNGjZTflycxpwLSG6dmUcRz9TWxYZsIA6/Z/Hgtcm5kzq2X/SLnFef/9T65eLjTAPg8aKn5Jkka6hlzXmEMPGGDR3ZrY1hzKHWIlSWpaW+/yOr8iYiXKVdO9Ox2LJEndKiLOA5bpN/vwzPxlJ+KRJHWvriooq5/3WKXTcUiS1M0yc5tOxyBJ6g3tHEMpSZIkSeohXdVC2UkbbXMq48aN63QYkiSNKOZHSRrdbKGUJEmSJNViQSlJkiRJqsWCUpIkSZJUiwWlJEmSJKkWb8rTpLMu3InFl1iw02F01B6f+kWnQ5AkjTCjPT+aGyWNdrZQSpIkSZJqsaCUJEmSJNViQSlJkiRJqsWCUpIkSZJUiwWlJEmSJKkWC0pJkiRJUi2j8mdDImIScDDwAvDZzJze4ZAkSaNEREwEzgAeaJg9KTOfnodtTAA+ANwNbJWZ+w6wzNLAD4AlgEWBIzLz4og4EJiamQ/0X0eSpHk1KgtK4BBgIrAkcCIwqaPRSJJGm6kDFYHNqi6ETq+K08EcAJyamZdGxGLArRFxSWYeUXe/kiT119UFZURMBj4KLE7pvrsj8H1gGWAmsF1mPjfAqmtl5gsRsRLwZJvClSRpQFU+2wlYDLg8Mw+NiKuB+4A1gAuA1YBVgMnAIsBWwEXV+vsAszLzhIj4SLXsQ8COETETuB1YJTNnR8SpwBTgW8ArqXJoZq4SEQcDG1N68OyVmXe14fAlSV2sF8ZQPp6ZE4EfAbsAV2bmWsD5wFsHWqEqJncHrgAublegkiRVto6IadVjf2BpYENgHeAT1TILAOdSCrwvATtTWh23GmB75wIfq15vB5wFfA+4kXKh9WFgr8YVMnNTYCPgL8AuEbEysFJmfoBygfbbw3OokqRe1gsF5Q3V862Ubqy3AWTm6Zl532ArZebJwDjgKxHx6lYHKUlSg6mZObF6HAP8CzgbOB5YuGG5ezLzKeChzHyW0qtmkf4by8xHgOciYgXg9dX4yImZeVxmrgmsCuwcEeP7rXoicHpm3gGMB1aNiGnA6cBrh/F4JUk9qhcKylWq59WrxwSAiNg/Ij7Uf+GIWCAiroqIBYHngFmUrj2SJLVdRCwBfCYztwMOo9xAp8+cedjUOcAxwGXV9Oer7q8Aj1SP5xv2+0Xgr5l5djVrBj
Ct6vXzUeC8eTwUSdIo1AsF5aoRcS2le86KwKbV1dU1gV/1XzgzZ1GS7n9X7x+dmc+0L1xJkl7iKWBGRNwKnAY8Wt1EZ15dSOk2e041vRewT0TcAvwauDQzZ1TvLQIcDqzR1/UWmA7MjIhfAdcAv697QJKk0aOrb8pTmZqZUxqmBxpb8hKZeQpwSutCkiRpYJk5DZjWMD0b2HyARSc2LDNhgHX7P48Frs3MmdWyfwQ2GWD/k6uXCw2wz4OGil+SpEa9UFAOKiIOotxwoNFlmXlUJ+KRJKkVqruWnwHs3elYJEmjS1cXlJl56hDvH07p0iNJUs+qft5jlSEXlCRpmPXCGEpJkiRJUgdYUEqSJEmSaunqLq/ttP1HT2PcuHGdDkOSpBHF/ChJo5stlJIkSZKkWiwoJUmSJEm1WFBKkiRJkmpxDGWTjrhiZxZ97YKdDqMjjtz6yk6HIEkaobo1P5rbJGl42EIpSZIkSarFglKSJEmSVIsFpSRJkiSpFgtKSZIkSVItFpSSJEmSpFosKCVJkiRJtfTcz4ZExEXAHzNz307HIknSSBARY4GjgPcArwauzMyvdTYqSVIv6KkWyohYBpgNrB0R3fejWJIktcaHgdmZuXFmrg2sHhGrdjooSVL367UWym2BK4G3AZMi4lbgHOB54Fng3OrxE2A54Algp8x8qjPhSpLUFg8D60fEJsD1wMeABSLiHBryIbAFsAGwB3AdsHNmzuhMyJKkbtBTLZTAdsAFlCJyMrAf8M3MXB94plrmP4D/zsyJlOJy7/aHKUlS+2TmncCXgc8BfwZ+AHyafvkwM88AlqHk0TMtJiVJQ+mZFsqIeCswHjijmrUmsBDw7Wr6lup5PLBmRGwNLAjc3s44JUlqt4hYCbg9MzeLiEWAk4DvAncOkA+PB84Hdu5IsJKkrtJLLZTbAwdm5ocz88PA0cA6QN8Ykb7nGcCU6orsfsDV7Q5UkqQ22wT4EkBmPgs8AHyRfvkwIhYAvgp8HfhWspyCkAAAIABJREFUZ0KVJHWTXiootwUuapg+C3gEOCAirgHeCMyiXJXdMiJ+BRwD3N3uQCVJarPjgKUi4s6IuBFYHvghL8+HBwKXZOaRwNsiYt2ORSxJ6go90+U1M1fuN/2HiNgfuDczZ0TEecBDmfk0sHVHgpQkqQMy8zlgtwHe6p8Pv9GwzmYtDUqS1BN6pqAcxJ+Bs6vf3/od8KsOxyNJkiRJPaOnC8rMvAN4X6fjkCRJkqRe1EtjKCVJkiRJbWRBKUmSJEmqpae7vA6nAzf9KePGjet0GJIkjSjmR0ka3WyhlCRJkiTVYkEpSZIkSarFglKSJEmSVItjKJu0y1WHs+CSr+rY/i/f6jsd27ckSYPpdH4ciDlTktrHFkpJkiRJUi0WlJIkSZKkWiwoJUmSJEm1WFBKkiRJkmqxoJQkSZIk1WJBKUmSJEmqpat+NiQi1gUOBcYCc4DPZubvB1huGrBVZj7RMG8K8JXMfLo90UqS1D7mSElSJ3RNQRkRywJHAZtm5uMRsQpwOrBGM+tn5r6tjE+SpE4xR0qSOqVrCkpgc+CCzHwcIDN/GxEbR8RkYCdgMeDyzDy0Wv77EbEC8MvMPLTviiwwFbgbWAu4NTP3butRSJI0/MyRkqSO6KYxlK8D/tQ4o0qcSwMbAusAn2h4+4eZuQ6wdkS8vWH+ApSEuTawYUS8qqVRS5LUeuZISVJHdFNB+WdghcYZEfFR4HngbOB4YOGGt2+qnu8A3tpvW/dk5hzgEWChVgQrSVIbmSMlSR3RTQXl5cCWEbEEQES8jzJeZI/M3A44DFi0YflVImIMsBpwf79tzWlDvJIktYs5UpLUEV1TUGbmY8DBwKURcT1wJPARYEZE3AqcBjwaEYtVq+wB3AhcnZl/7EDIkiS1hTlSktQp3XRTHjLzKuCqfrM3H2DRiQOsO7H/ew3zJEnqauZISVIndE0LpSRJkiRpZLGglCRJkiTVYkEpSZIkSa
rFglKSJEmSVEtX3ZSnk36y8UGMGzeu02FIkjSimB8laXSzhVKSJEmSVIsFpSRJkiSpFgtKSZIkSVIttQrKiFhwuAORJKkXmCMlSaNJUzfliYgPABOBbwM3A+MjYpfMPLeFsY0ou/7ipyz42te0dB+XfWzvlm5fkjT8RnuObEd+7M98KUkjR7MtlEdRkuRWwExgReDzrQpKkqQuYo6UJI1azRaUYzPzamAj4KLM/CMwtmVRSZLUPcyRkqRRq+mCMiLeB3wEuCoi3gM4RkSSJHOkJGkUa7agPBw4C/hxdeX1EuArrQpKkqQuYo6UJI1aTd2UJzMvAC5omPX2zHyhNSFJktQ9zJGSpNGs2bu8Lgf8GHgHsC5wWkRMzsy/tDI4SZJGOnOkJGk0a6qgBE4ALgL2Av4GTAdOpowXaZmImAJMAMYDDwFPA3Myc/2I2Ba4DNgGWCIzp8zDdr8ObACMAfbOzNuHPXhJ0mjRkRzZJyImAmcADzTMPiIzr5zH7awGfIsy/nM2sGdm5nDFKUnqTc0WlCtk5o8iYs/MfB44ICLuamVgAJm5L0BEnApMyczpDW/vCVw1r9uMiHcCEzJznYh4O3A88OFhCFeSNDp1JEf2M7UvZ86H44CPZObjEbE65edQtpj/0CRJvazZgnJ2RLx4A5+IeDXN39BnWEXEdOBzlJbLk4ArqvljKFeJVwSeA3bNzIcG2MQfgJ2q1wsAz7c6ZklSTxsxObIhhsnAJOA1wDPAncBmwHWZeeAgqz0C7BURZ2bmbRGxdVuClSR1tWYT3gXAmcDiEbEHcC3ws5ZFNYTMvJ7SpWiPhtmbA//IzPWAQ6rHQOs+X119XYzSJenwVscrSeppIyFHbh0R0/oewHLAE5m5MaX76j3A+ylF5mB2phSgv4iI+4APtjhmSVIPaKqgzMxvApcDt1J+uPmHwGEtjKuO8cCmVSI9AlhqsAUj4rWU45mSmTe3JzxJUi8aITlyamZO7HsAMylFJMBTwAPVnWdnDbRyRCwMvCczv5iZ76D05PlhG+KWJHW5Zu/yelpm7gSc3uJ45sUcyk11+swAzsnMr0fE2xjkympEjKX8RtiRmXlJ68OUJPWyEZojoeTJZs0GfhoRG2Tmn4D/BzzZmrAkSb2k2S6vq1RjFEeSWyg/JN3nQmCFiLgeOAe4e5D1tgLeDXy+6hr0k9aGKUnqcSMhR/bv8jp2Xlaubib0WeD8iPg15S7q+w1/mJKkXtPsTXn+AtwTETdTfroDgMzcpyVR9ZOZkxteT6ieDxhg0d2a2Nb5wPnDFpwkabTrdI6cBoyby/uTG15PmMtyVwNXD2dskqTe12xBeVP16CoRcR6wTL/Zh2fmLzsRjySpJ3VVjoyIgyhjPRtdlplHdSIeSVJ3a6qgzMyvtTqQVsjMbTodgySpt3VbjszMw/EO55KkYdLsTXnuYoDB/Zm58rBHJElSFzFHSpJGs2a7vO7V8Hohyo1tHh7+cEauUzbZmXHjBh2iIkkavUZ1jjQ/StLo1myX1+sbpyPiauDX2GVGkjTKmSMlSaNZsz8b0t9SwBuGMxBJknqEOVKSNGrUGUM5Bngz8MNWBSVJUrcwR0qSRrM6YyjnAI9l5n0tiEeSpG5jjpQkjVrNFpQ7ZeZujTMi4vzM/HgLYhqRdrviYhZ87RLDvt1Lt95h2LcpSWqrUZ0jzY+SNLrNtaCMiBOBNwLrRsQyDW8tCIxvZWCSJI1k5khJkoZuofwx8B5gFeD8hvmzgJtaFZQkSV3AHClJGvXmWlBm5m3AbRFxdWY+1KaYJEka8cyRkiQ1P4byTRHxfWAxyh3sxgJvycw3tywySZK6gzlSkjRqNfs7lCdTfqT5NcCZwFO8tHuPJEmjlTlSkjRqNVtQzsnMI4FpwO+BbYAPtiooSZK6iDlSkjRqNdvl9e/V8wzgPZl5Y0SMbVFML4qIKcAEyt3yHgKepiTu9SNiW+AySuJeIjOnzOO23wpcl5nLD3PYkqTRpe
05ssX5cUtgYmbuN8xhS5J6ULMF5W8i4lzgq8BlEfFO4IXWhVVk5r4AEXEqMCUzpze8vSdwVZ3tRsQCwBHAc/MboyRp1Gt7jmxhftwf2AO4Yn5jlCSNDs0WlPsBa2bm/RGxL7AhsH3rwhpcREwHPke5MnsSVdKLiDHACcCKlEJx17ncde9g4PvAsS0PWJLU60ZEjhym/Hg/pSDdvOUBS5J6QlNjKDNzDjA7IvYAfgn8LDOzpZHNPZ7rgemUq6h9Ngf+kZnrAYdUj5eJiPWAsdU2JEmaLyMpR85PfqzWv5Q29ECSJPWOpgrKiNgF+AnwJWAJ4OKI+I9WBlbDeGDTiJhG6c661CDLfRJYr1ru7RFxcnvCkyT1oi7Ikc3mR0mS5lmzd3ndG1gbeCozH+X/t3f/wXbV5b3H3yGIYimIgj+uaeVa24fBaz1iWpXWEqRAoVCgAo4Xr8bALZbRgrQd6KUjtBoLpd7GWmmpluEOSsMQRaVIxB8k9Wq9oCWdivbTQn9MGYtU5EcBjSSc+8da0Z2Tc07O2Tn77L32eb9m9uy1vmut737WXkmePOu71trwcuD8gUU1N5M0v/e1wz3A+iSrgDXATdNtlOQtSX62Xe/uJGcPOlBJ0lgbtRzZV36UJKkfcy0otyd5ZMdMkn8Dtg0mpDm7HbiuZ/5G4JCq2gysB746lKgkSUvNqOVI86MkadHM9aE8366qCZqznlTVmcC3BxbVFElW90xPtO8XTrPqWfPsd2LPIpMkaXg5chD5Mckmmt/UlCRpt+ZaUJ4HbAB+rKq+AXwXOHlgUS2QqroBOHhK89oknx5GPJKksdS5HGl+lCQtlDkVlEn+vqpeCvwEsLxpyhMDjWwBJDl92DFIksZbF3Ok+VGStFBmLSir6s+S/Eo7e2CSry9CTJIkjTxzpCRJux+hXNkzfStw+ABjGWl/fvzJrFixYthhSJJGhzkS86MkLXW7e8rrshmmJUla6syRkqQlb64/GwLt0+skSdIuzJGSpCVpd5e87lVVB9KceV3eMw1AkkX76RBJkkaMOVKStOTtrqB8CfAtfpAgH+hZNknzNDtJkpYic6QkacmbtaBMMp9LYsfa/7xlE/sc+Kw97ufjpx2/ANFIkobNHNlYqPwI5khJ6iKToSRJkiSpLxaUkiRJkqS+WFBKkiRJkvpiQSlJkiRJ6osFpSRJkiSpLxaUkiRJkqS+7O53KIeqqtYBE8ChwL3Ao8BkkqOq6gzgZuB04BlJ1s2z75OBVUnevsBhS5LUSVV1CnARzQnnK5NcM9yIJEmjbqQLyiTnA1TVNcC6JFt6Fp8L3NpPv1V1AXAOcMuexihJ0hi5FDgCeAK4s6quTbJ9uCFJkkbZSBeU06mqLcB5NCOXV9EWhVW1DLgSOAzYCqxJcu8M3fwDTUF60sADliRpkVXVauBU4ACa0cY3AO8HDgbuA16XZOs0mx6b5PGq2htYBjy5OBFLkrqqk/dQJtkMbKEZZdzhJOCxJEcCl7Svmbb/S8AzrpKkcfZgklXAB4A3AxuTvBL4CPDC6TZIcn87+V7g6iSTixGoJKm7OjdCOYtDgeOraiXNWdUHhhyPJEnD9Pn2/Q5gDc29kSS5dqYNqmovmpHM7yR5z8AjlCR1XidHKFuTNIXjDvcA69uzsWuAm4YRlCRJI+Kl7fvK9jUBzXMEquroGba5HHg4yQWLEJ8kaQx0uaC8HbiuZ/5G4JCq2gysB746lKgkSRoNh1fV52junzyM5iqeTcArgL+aunJVPYfmGQWvqqpN7euAxQxYktQ9nbjkNcnqnumJ9v3CaVY9ax59bgI27WFokiSNqg1TflLrlNlWTvJNYJ/BhiRJGjedKCj7VVU30DzRrtfaJJ8eRjySJI2CqroYOGZK881JrhhGPJKk7hrrgjLJ6cOOQZKkxZbkmt0sXwusXZxoJEnjrMv3UEqSJEmShmisRygX0geOX8WKFSuGHYYkSSPF/ChJS5sjlJIkSZKkvlhQSp
IkSZL6YkEpSZIkSeqLBaUkSZIkqS8+lGeOzt14F/sceP+c1t3w2sMHHI0kSaNhPvlxJuZNSeouRyglSZIkSX2xoJQkSZIk9cWCUpIkSZLUFwtKSZIkSVJfLCglSZIkSX2xoJQkSZIk9cWCUpIkSZLUl878DmVVrQI+BNzd03xZko3z7OflwO8BTwGeBM5NkoWKU5KkQauqdcAEcChwL/AoMJnkqKo6A7gZOB14RpJ18+j3ncBrgGXA25J8ZcGDlySNlc4UlK0NSc7fwz7eB/xikgeraiVwBfBLex6aJEmLY0curKprgHVJtvQsPhe4db59VtVPABNJfqaqXgT8MfALCxCuJGmMda2g3ElVrQZOBPYHHgfuBE4Abkty0QybfRN4a1V9OMmXq+q0RQlWkqQBqqotwHk0I5dXAbe07cuAK4HDgK3AmiT3TtPFPwNvbKf3Bp4YdMySpO7r2j2Up1XVph0v4LnAQ0mOpbl89S7gCJoicyZvoilAP1VVXwd+bsAxS5K0KJJsBrYA5/Q0nwQ8luRI4JL2Nd22T7RX7+wHfBBYO+h4JUnd17URyp0ueW1HKLe2s48AdyfZXlXbptu4qp4K/Lckvwn8ZlX9FHA98MLBhi1J0tAcChzf3uaxDHhgphWr6kDg4zSX0X5pkeKTJHVY10YopzM5j3WfBP5PVf1IO/+PwMMLH5IkSUMzSVM47nAPsD7JKmANcNN0G1XV8nbZFUk2DDpISdJ46NoI5WlVNdEzf+18Nk7yRFX9KvCRdhRzEnj7QgYoSdKQ3Q5cR3MFDsCNwAlVtRl4Os1De6ZzCvBi4Ner6teBf07y5kEHK0nqts4UlEk2AStmWb66Z3pilvU+A3xmIWOTJGkYpst9SS6cZtWz5tDXR4CPLFhwkqQloTMF5XxV1cXAMVOab05yxTDikSRpVFTVDcDBU5rXJvn0MOKRJHXX2BaUSdbiE+okSdpFktOHHYMkaTyMw0N5JEmSJElDMLYjlAvtyl94MStWzHgLpyRJS5L5UZKWNkcoJUmSJEl9saCUJEmSJPXFglKSJEmS1BfvoZyjP/3U/fzwM5fPuPzCU5+3iNFIkjQadpcfe5krJWn8OEIpSZIkSeqLBaUkSZIkqS8WlJIkSZKkvlhQSpIkSZL6YkEpSZIkSeqLBaUkSZIkqS8WlJIkSZKkvoz071BW1TpgAjgUuBd4FJhMclRVnQHcDJwOPCPJunn0exFwKvAgcGaSBxY8eEmSFkFVrQI+BNzd03xZko3z7OepwO8DLwOeCnwyye8sVJySpPE00gVlkvMBquoaYF2SLT2LzwVunW+fVfV84Ogkr6iqU4G3A7+9AOFKkjQsG3bkzD1wKfD1JOdV1TLg+qo6OcnH9zw8SdK4GumCcjpVtQU4j2bk8irglrZ9GXAlcBiwFViT5N5pulgJfL6d/gywpwlYkqSRUlWrgROB/YHHgTuBE4Dbklw0w2an0ORQkkxW1Rrgu4OPVpLUZZ28hzLJZmALcE5P80nAY0mOBC5pX9PZH/jPdvoxYL9BxSlJ0iI5rao27XgBzwUeSnIs8CRwF3AETZE5k+1JJnfMJHk0ybZBBi1J6r7OjVDO4lDg+KpaCSwDZrov8hHgkHZ6v3ZekqQu2+mS13aEcms7+whwd5LtVTVbgbhXVe2V5Mm2jxcB+yf5m0EFLUnqvk6OULYmaQrHHe4B1idZBawBbpphu68Ar24vkT0a+NIgg5QkaUgmd7/KTj4BnAVQVctpHtDzgoUOSpI0Xro8Qnk7cB1wfTt/I3BCVW0Gnk7z0J5dJLm3qj4LfBH4DnDGIsQqSdIgnVZVEz3z1/bRx6XAlVX1RmBf4GNJblyI4CRJ46sTBWWS1T3TE+37hdOsetYc+7scuHxBgpMkaYiSbAJWzLJ8dc/0xCzrfZfmCh9JkuasEwVlv6rqBuDgKc1rk3x6GPFIkjRsVXUxcMyU5puTXDGMeCRJ3TbWBWWS04cdgyRJoyTJWmDtsO
OQJI2HLj+UR5IkSZI0RGM9QrmQ3nLcs1mx4nnDDkOSpJFifpSkpc0RSkmSJElSXywoJUmSJEl9saCUJEmSJPXFeyjnaNNfPsizDnzaTm3Hv+6gIUUjSdJomC4/9jJXStJ4c4RSkiRJktQXC0pJkiRJUl8sKCVJkiRJfbGglCRJkiT1xYJSkiRJktQXC0pJkiRJUl+WbEFZVYdU1eeGHYckSaPE/ChJmo8lWVBW1dHA9cAzhx2LJEmjwvwoSZqvvYcdwJ6oqtXAqcABNMXxG4D3AwcD9wGvS7J1mk23AccBmxYlUEmSFpH5UZK0WMZhhPLBJKuADwBvBjYmeSXwEeCF022QZHOShxYvREmSFp35UZI0cJ0eoWx9vn2/A1gDXASQ5NqhRSRJ0vCZHyVJAzcOI5Qvbd9Xtq8JgKq6oL0XRJKkpcj8KEkauHEYoTy8fRrd94DDgPdV1euBbwLvG2pkkiQNj/lRkjRw41BQbkiyrmf+lLlumGRiAPFIkjQKzI+SpIEbh4JyRlV1MXDMlOabk1wxjHgkSRoF5kdJ0kLpdEGZ5JrdLF8LrF2caCRJGg3mR0nSYhmHh/JIkiRJkobAglKSJEmS1JdOX/K6mFadeCArVhw07DAkSRop5kdJWtocoZQkSZIk9cWCUpIkSZLUFwtKSZIkSVJfLCglSZIkSX3xoTxz9I8f/hYPH/CU78+/+C3PGWI0kiRJkjR8jlBKkiRJkvpiQSlJkiRJ6osFpSRJkiSpLxaUkiRJkqS+WFBKkiRJkvqyJJ/yWlWnABfRFNRXJrlmuBFJkiRJUvcsyYISuBQ4AngCuLOqrk2yfbghSZI0P1W1CvgQcHdP82VJNs6znx8C/hB4CfAkcBtwiblRkrQ7nS4oq2o1cCpwAM1o4xuA9wMHA/cBr0uydZpNj03yeFXtDSyjSZ6SJHXRhiTn72EffwR8IcmvAFTVO4DfAt61p8FJksZbpwvK1oNJTq6q/wG8GdiY5P3t/AuBr0/dIMn97eR7gauTTC5euJIkDU57svVEYH/gceBO4ATgtiQXTbP+PsDhSc7qaX43cAcWlJKk3RiHgvLz7fsdwBqaeyNJcu1MG1TVXjQjmd9J8p6BRyhJ0uCcVlUTPfMbgYeSnFZVHwXuoikM/5Y2R05xEPDN3oYk26pq+aACliSNj3F4yutL2/eV7WsCoKouqKqjZ9jmcuDhJBcsQnySJA3ShiSrdrxobvm4q132CHB3ey/kthm2/xbw7N6G9paQcTjpLEkasHEoKA+vqs/R3D95GHB8VW0CXgH81dSVq+o5wHnAq6pqU/s6YDEDliRpwOZ8K0eS7wFbqupNAFX1MeBq4PoBxSZJGiPjcPZxQ5J1PfOnzLZykm8C+ww2JEmSFs3US15nvOVjFm8F/rCqzgGeRjNq+VBVPSXJEwsRpCRpPI1DQTmjqroYOGZK881JrhhGPJIkLaQkm4AVsyxf3TM9Mct6jwPn9LZV1UqLSUnS7nS6oExyzW6WrwXWLk40kiSNtvmcaE3y5cWJSpLUZZ0uKCVJ0tx5olWStNDG4aE8kiRJkqQhcIRyjn78zINYseI5ww5DkiRJkkaGI5SSJEmSpL5YUEqSJEmS+mJBKUmSJEnqiwWlJEmSJKkvFpRz9B8fvIf73vP3ww5DkiRJkkaGBaUkSZIkqS8WlJIkSZKkvlhQSpIkSZL6YkEpSZIkSeqLBaUkSZIkqS8WlJIkSZKkvuw97ADmoqpWAR8C7u5pvizJxnn281Tg94GXAU8FPpnkdxYqTkmSFkNVrQMmgEOBe4FHgckkR1XVGcDNwOnAM5Ksm2ffLwRuS/KCBQ5bkjSGOlFQtjYkOX8P+7gU+HqS86pqGXB9VZ2c5ON7Hp4kSYtjRz6sqmuAdUm29Cw+F7i1n36ram/gMmDrnsYoSVoaulRQ7qSqVgMnAvsDjwN3AifQnFW9aIbNTgEOA0gyWVVrgO8OPlpJkgarqrYA59GMXF4F3NK2LwOupM
l/W4E1Se6doZt3AO8H3jvwgCVJY6FL91CeVlWbdryA5wIPJTkWeBK4CziCpsicyfYkkztmkjyaZNsgg5YkabEk2QxsAc7paT4JeCzJkcAl7WsXVXUksLztQ5KkOelSQbkhyaodL+A+miIS4BHg7iTbgdkKxL2q6vv7XFUvqqrDBxaxJEnDdyhwfHsy9jLgWTOs93rgyHa9F1XVBxcnPElSl3WpoJzO5O5X2ckngLMAqmo5zQN6fOiAJGmcTALLeubvAda3J2PXADdNt1GStyT52Xa9u5OcPehAJUnd16V7KE+rqome+Wv76ONS4MqqeiOwL/CxJDcuRHCSJI2I24HrgOvb+RuBE6pqM/B0mof2SJK0IDpRUCbZBKyYZfnqnumJWdb7Ls3ZWUmSOm+6/JfkwmlWPWue/c6YSyVJ6tWJgnK+qupi4JgpzTcnuWIY8UiSNEqq6gbg4CnNa5N8ehjxSJK6aywLyiRrgbXDjkOSpFGU5PRhxyBJGg9dfyiPJEmSJGlIxnKEchAOPvvHeO6KGW/jlCRJkqQlxxFKSZIkSVJfLCglSZIkSX2xoJQkSZIk9cWCUpIkSZLUFwtKSZIkSVJfLCglSZIkSX2xoJQkSZIk9cWCUpIkSZLUFwtKSZIkSVJfLCglSZIkSX2xoJQkSZIk9WXvYQcwF1W1CvgQcHdP82VJNs6znx8C/hB4CfAkcBtwSZLtCxSqJEkDV1XrgAngUOBe4FFgMslRVXUGcDNwOvCMJOvm0e9FwKnAg8CZSR5Y8OAlSWOlEwVla0OS8/ewjz8CvpDkVwCq6h3AbwHv2tPgJElaLDvyYVVdA6xLsqVn8bnArfPts6qeDxyd5BVVdSrwduC3FyBcSdIY61JBuZOqWg2cCOwPPA7cCZwA3JbkomnW3wc4PMlZPc3vBu7AglKS1HFVtQU4j2bk8irglrZ9GXAlcBiwFViT5N5pulgJfL6d/gywpydxJUlLQJfuoTytqjbteAHPBR5KcizN5at3AUfQFJnTOQj4Zm9Dkm3A8sGFLEnS4kmyGdgCnNPTfBLwWJIjgUva13T2B/6znX4M2G9QcUqSxkeXRih3uuS1HaHc2s4+AtydZHtVbZth+28Bz+5tqKq96dZ3IEnSfB0KHF9VK4FlwEz3RT4CHNJO79fOS5I0qy6NUE5ncq4rJvkesKWq3gRQVR8DrgauH1BskiQNwyRN4bjDPcD6JKuANcBNM2z3FeDV7SWyRwNfGmSQkqTx0KWCcuolr/1cqvpW4Iiq+iLwozSXzT6rqp6ygHFKkjRMtwPX9czfCBxSVZuB9cBXp9uova/ys8AXgbcB7xlwnJKkMdCJyz2TbAJWzLJ8dc/0xCzrPc7O95VQVSuTPLHnUUqStLimy39JLpxm1bOmaZuuv8uByxckOEnSktCJgnK+qupi4JgpzTcnuWLqukm+vDhRSZI0GqrqBuDgKc1rk3x6GPFIkrprLAvKJGuBtcOOQ5KkUZTk9GHHIEkaD126h1KSJEmSNEIsKCVJkiRJfbGglCRJkiT1ZSzvoVxgywHuu+++YcchSRqgnn/n+/lZqqXI/ChJS8RsOdKCcveeB3DmmWcOOw5J0uJ4HnDPsIPoAPOjJC09u+RIC8rduwN4NfDvwPYhxyJJGpzlNInyjmEH0hHmR0laOmbMkcsmJycXPxxJkiRJUuf5UB5JkiRJUl8sKCVJkiRJfbGglCRJkiT1xYJSkiRJktQXC0pJkiRJUl+W/M+GVNVTgA/TPAb3a8Bbkky2y44F3kXzOPRfS3LHdG3DifwH+tiHdwLHA48Cf5fkbcOJvDFb/O3yQ4Crk7ymne/UMWiXH8LO+9CZY1BVpwAX0ZyAujLJNaN2DPqIf6S+f9jtPpwIvIPm+/7VJFs6dgymi3/kjoF21fUc2fX8CN3PkV3Pj2COHPYx6Hp+hPHPkY5QwmuBu5ILvyctAAAHzUlEQVS8GvgucHTPsncCxwC/DPzeLG3DNt99+EnguC
SrRuQP6IzxV9XRwPXAM3vW79QxmGEfOnMMgEuB1wA/A/xGVS1n9I7BfOMfte8fZt+HS4CjgDNokiR06xhMF/8oHgPtqus5suv5EbqfI7ueH8EcOWxdz48w5jnSghJeAdzWTn+G5keaqaoDgMeSPJzk34H9Z2gbhVHe+ezD3sCPA39eVZuq6qeGEvHOpo2/tQ04bsdM145Ba6d9aHXpGByb5HFgElgG7MfoHYP5xP8ko/f9w+z78MokjwHPBx4e0b8Hc46/bRvFY6BddT1Hdj0/QvdzZNfzI5gjh63r+RHGPEdaUML+wH+204/R/EMwtR2aswnTte076ADnYD77sC/NkPvrgdXAVYsT4qxmip8km5M8NMO6MPrHYJd9qKpldOsY3N9Ovhe4mtE8BvOJH0bv+4fZ92F7VZ0N3AJ8nO4dg53iH9G/A5pe13Nk1/MjdD9Hdj0/gjly2LqeH2HMc6QFJTzCDw7qfu08NAd9v571ngY8PkPbsM13H/4oyXeS/AvwRFXts1iBzmCm+Kcz0z4N23z2ATp0DKpqr6r6E2BrkvcwmsdgPvHD6H3/sJs/Q0k+CKwAfrtnnR1G+hjALvHvx2geA+2q6zmy6/kRup8ju54fwRw57GPQ9fwIY54jR2EIeNjuAFYBn6e5nvmTAEkeqqr92qHzfYHHkzwwTdv2IcXda877ABwAbKqqlwHPApYl+d5Qov6BaeOfzgzHZWSPwQwOpFvH4HLg4SQXwcgegznHz2h+/zDDPrSX6nwS+EVgK80lYg8AnTgGM8T/w8DGETwG2lXXc2TX8yN0P0d2PT+COXLYx6Dr+RHGPEc6Qgk3AIdV1RdpDuA/VdXvt8suBm4FbgJ+a5a2YZvzPiT5NvBnwF8DHwXOG0K8U80W/3S6dgx20qVjUFXPoYnvVe11/Jvaf6RH7RjMOX6ap6iN2vcPM+xDkm3AeuD/An8F/EF7v0snjsEM8X+D0TwG2lXXc2TX8yN0P0d2PT+COXLYup4fYcxz5LLJycndryVJkiRJ0hSOUEqSJEmS+mJBKUmSJEnqiwWlJEmSJKkvFpSSJEmSpL5YUEqSJEmS+uLvUEoLpKomga/SPHJ7Eng6zQ/X/mqSL+9m203AHyfZMMs6/5XmcdKvrar/AmxIcsQCxP1LwM8n+bU97Wuen/v9/VnMz5UkLS7z47w/1/yoTrGglBbWUUm+tWOmqn4DeB/wqgXo+wVAAbS/UbTHybLt6xPAJxair3n6/v5Iksae+XHuzI/qFAtKaUCqam/gR4Fv97RdDLyW5nLzfwHObZNf73b/CzgFeBrwQ8Bv0CS0DwLPr6pPAefQnO3dH/hX4NQdZ3mraj2wOcmfzPHzVgOnJTmxPRP8FeA1wLOB9wLPAY5sYzkjyd+1630NWAkcBFyb5JK2v1OAS4DlNGegL0hye1VdSvMfh+cBdwE/tWN/khw33X4nubHd7pB2uxcA/wG8Lsk3quongKvaWJ8E3pXk+qp6PvDH7ff/FGB9knfPfsQkSYvB/Gh+1HjxHkppYd1WVX9bVd8A/qFtezNAVb0ReAnw00kmgE/SJMHvq6oXAD8PHJnkJ4GLgd9Nsh04G7gnyXE71k/yJHA1sLrd/kDgGOC6uXzeDA5J8jLgl4HLgU1JVgIbgbf1rPcC4GeAw4HXVdWJVXUo8KfAa9v43wF8vKr279nm8CSv792fmfa757NeDZye5FDgQZr/MACsB25I8mLgBODd7WddC1yd5OXATwM/X1VnzGHfJUmDYX40P2pMWVBKC+uoJC8FfpHmHpEvJrm/XXYi8Ergy1W1hSb57HRJS5J/Bd4EnFlVlwFvAfbbzWdeDZxRVfsArwduSvLwXD5vBh9t3+9p3zf2zD+zZ72rkjyR5CHgBuA4mjO3n03yT+3+fA64H3h5u82Xkmyb+oFz2O9NSR5pp+8EnllVzwReSvufgCT/luTHaO7RORJ4Z7vfX6I5Ezsxh32XJA2G+dH8qDFlQSkNQJI7gb
cDH6yqQ9rm5cDlSSbaM6Irac5gfl9VHQ58keZSnVtpzoAu281n/SvwNzQJ8s3AB+b6eTPYOqX/J2ZYrzfx7UWTqKb7N2UvmstqAB6drqM57Pd3eqYn22XbeuZ39FM0l/IvA47o2fdXAl7SI0lDZn7ciflRY8GCUhqQJH8B/DWwrm36FHB2z+Utv0tz6UmvnwO+nOR/A5tp7plY3i7bxg8Sz1QfAC4Enp7kC/P4vD3xhqraq72M6AzgJuBzwLFV9UKAqnoN8CPA/5tm+979mW2/p9Wekf0KzZlbqupHgC8A+9Kcdb2gbX9G235y33sqSVow5kfzo8aLBaU0WG8Fjq+q42guPflL4EtVdRfwk7T3dvT4C+CgqvoaTTJ4lObylR+muVF/e1Xdzq5nZT9Bc2P+n/e0zeXz9sS+wO00yenKJJ9N8jXgXOCjVfVV4DLgpPYSo6l692e2/Z7Nf6e5nOlvaRL22Unua9tfWVV/R5Os/yLJh/d0hyVJC8b8aH7UmFg2OTm5+7UkqUfN4XfBJElaasyPWoocoZQkSZIk9cURSkmSJElSXxyhlCRJkiT1xYJSkiRJktQXC0pJkiRJUl8sKCVJkiRJfbGglCRJkiT15f8DzDJwbE0iVy4AAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10d345da0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"nrows = ncols = 2\n",
"fig, axes = plt.subplots(nrows = nrows, ncols = ncols, sharex='all', figsize=(15, 15))\n",
"\n",
"names_classifiers = [('AdaBoosting', ada_best), ('ExtraTrees', ExtC_best), ('RandomForest', RFC_best), ('GradientBoosting', GBC_best)]\n",
"\n",
"nclassifier = 0\n",
"for row in range(nrows):\n",
" for col in range(ncols):\n",
" name = names_classifiers[nclassifier][0]\n",
" classifier = names_classifiers[nclassifier][1]\n",
" indices = np.argsort(classifier.feature_importances_)[::-1][:40]\n",
" g = sns.barplot(y=X_train.columns[indices][:40], x = classifier.feature_importances_[indices][:40] , orient='h',ax=axes[row][col])\n",
" g.set_xlabel('Relative importance', fontsize=12)\n",
" g.set_ylabel('Features', fontsize=12)\n",
" g.tick_params(labelsize=9)\n",
" g.set_title(name + ' feature importance')\n",
" nclassifier += 1"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10db282e8>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"test_Survived_RFC = pd.Series(RFC_best.predict(X_test), name='RFC')\n",
"test_Survived_ExtC = pd.Series(ExtC_best.predict(X_test), name='ExtC')\n",
"test_Survived_SVMC = pd.Series(SVMC_best.predict(X_test), name='SVC')\n",
"test_Survived_AdaC = pd.Series(ada_best.predict(X_test), name='Ada')\n",
"test_Survived_GBC = pd.Series(GBC_best.predict(X_test), name='GBC')\n",
"test_Survived_XgbC = pd.Series(Xgb_best.predict(X_test), name='Xgb')\n",
"\n",
"# Concatenate all classifier results\n",
"ensemble_results = pd.concat([test_Survived_RFC, test_Survived_ExtC, test_Survived_AdaC, test_Survived_GBC, test_Survived_SVMC, test_Survived_XgbC], axis=1)\n",
"\n",
"g = sns.heatmap(ensemble_results.corr(),annot=True)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [],
"source": [
"# ensemble modeling\n",
"# votingC = VotingClassifier(estimators=[('rfc', RFC_best), ('extc', ExtC_best),\n",
"# ('svc', SVMC_best), ('adac',ada_best),('gbc',GBC_best), ('xgb', Xgb_best)], voting='soft', n_jobs=4)\n",
"votingC = VotingClassifier(estimators=[('rfc', RFC_best), ('extc', ExtC_best),\n",
" ('svc', SVMC_best), ('adac',ada_best),('gbc',GBC_best)], voting='soft', n_jobs=4)\n",
"# votingC = VotingClassifier(estimators=[('gbc',GBC_best), ('xgb', Xgb_best)], weights=[1, 1], voting='soft', n_jobs=4)\n",
"votingC = votingC.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [],
"source": [
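"# ref: https://www.kaggle.com/valerioorfano/voting-classifier/code#L64\n",
"# Train a second soft-voting ensemble (AdaBoost, ExtraTrees, XGBoost, GradientBoosting)\n",
"# NOTE: eclf is fitted here but is not used for the submissions below\n",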
"clf1 = AdaBoostClassifier(n_estimators=500)\n",
"clf2 = ExtraTreesClassifier(n_estimators=500, n_jobs=-1, criterion='gini',max_depth=5)\n",
"clf3 = xgb.XGBClassifier(n_estimators=500, nthread=-1, max_depth = 5, seed=1729)\n",
"clf4 = GradientBoostingClassifier(n_estimators=500)\n",
"eclf = VotingClassifier(estimators=[('ab', clf1), ('etc', clf2), ('xgb', clf3),('gbc', clf4)], weights=[1,1,1,1], voting='soft')\n",
"eclf = eclf.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [],
"source": [
"# Prediction\n",
"y_pred = pd.Series(votingC.predict(X_test), name='Survived').astype(int)\n",
"\n",
"submission = pd.DataFrame({\n",
" 'PassengerId': test_df['PassengerId'],\n",
" 'Survived': y_pred\n",
" })\n",
"submission.to_csv('../output/submission.csv', index=False)"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [],
"source": [
"# Result score: 0.78 (when xgboost is not used in the VotingClassifier)"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [],
"source": [
"# Use a separate name so the xgboost module imported as xgb is not shadowed\n",
"xgb_model = Xgb_best.fit(X_train, y_train)\n",
"# Prediction\n",
"y_pred = pd.Series(xgb_model.predict(X_test), name='Survived').astype(int)\n",
"\n",
"submission = pd.DataFrame({\n",
" 'PassengerId': test_df['PassengerId'],\n",
" 'Survived': y_pred\n",
" })\n",
"submission.to_csv('../output/xgboost_result.csv', index=False)"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [],
"source": [
"# Result score: 0.77 (xgboost only)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
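"Note: the presence or absence of a correlation does not imply the presence or absence of a causal relationship, so the data has to be examined in detail.\n",
"(Age and survival rate look uncorrelated in a simple correlation, but they actually seem to be closely related? I can't tell without looking at the data more.)  \n",
"The results are worse than I expected. Why?  \n",
"Next, try something like dnnClassifier? Or think a bit more about how to combine the models in the ensemble?"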
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
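The last cells of the notebook combine classifiers with `VotingClassifier(voting='soft')`, and the closing note asks how the ensemble could be combined differently. As a rough illustration of what soft voting computes, here is a minimal pure-Python sketch. The function `soft_vote` and the toy probabilities for `model_a`, `model_b` and `model_c` are hypothetical and are not taken from the notebook; scikit-learn's own implementation may differ in detail.

```python
def soft_vote(probabilities, weights=None):
    """Return the class index with the highest weighted mean probability.

    probabilities: one entry per model; each entry is a list over samples
    of per-class probabilities (the shape predict_proba would return).
    weights: optional per-model weights; equal weights if omitted.
    """
    n_models = len(probabilities)
    if weights is None:
        weights = [1.0] * n_models
    total = float(sum(weights))
    n_samples = len(probabilities[0])
    n_classes = len(probabilities[0][0])

    predictions = []
    for i in range(n_samples):
        # Weighted average of each class probability across models
        averaged = [
            sum(w * probabilities[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        # Pick the class with the largest averaged probability
        predictions.append(max(range(n_classes), key=lambda c: averaged[c]))
    return predictions


# Hypothetical probabilities from three models for two passengers
# (columns: P(not survived), P(survived))
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.45, 0.55]]
model_c = [[0.2, 0.8], [0.7, 0.3]]

# For the second passenger, two of the three models lean towards class 1,
# so a majority (hard) vote would pick 1, while the averaged probabilities favour 0.
labels = soft_vote([model_a, model_b, model_c])
print(labels)
```

Passing `weights=[1, 1, 3]` to the sketch makes the third model dominate, which is the same lever the `weights` argument in the commented-out `VotingClassifier` line exposes.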