@ankurdhuriya
Created November 25, 2019 07:38
Practising Classification Algorithm while predicting whether loan will be paid off or not.
{
"cells": [
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "<a href=\"https://www.bigdatauniversity.com\"><img src=\"https://ibm.box.com/shared/static/cw2c7r3o20w9zn8gkecaeyjhgw3xdgbj.png\" width=\"400\" align=\"center\"></a>\n\n<h1 align=\"center\"><font size=\"5\">Classification with Python</font></h1>"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "In this notebook we practice the classification algorithms covered in this course.\n\nWe load a dataset using the pandas library, apply the following algorithms, and find the best one for this specific dataset using accuracy evaluation methods.\n\nLet's first load the required libraries:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "import itertools\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom matplotlib.ticker import NullFormatter\nimport pandas as pd\nimport matplotlib.ticker as ticker\nfrom sklearn import preprocessing\n%matplotlib inline",
"execution_count": 32,
"outputs": []
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### About dataset"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "This dataset is about past loans. The __Loan_train.csv__ data set includes details of 346 customers whose loans have already been paid off or have defaulted. It includes the following fields:\n\n| Field | Description |\n|----------------|---------------------------------------------------------------------------------------|\n| Loan_status | Whether a loan is paid off or in collection |\n| Principal | Basic principal loan amount at the |\n| Terms | Origination terms, which can be a weekly (7 days), biweekly, or monthly payoff schedule |\n| Effective_date | When the loan was originated and took effect |\n| Due_date | Since it\u2019s a one-time payoff schedule, each loan has a single due date |\n| Age | Age of applicant |\n| Education | Education of applicant |\n| Gender | The gender of applicant |"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Let's download the dataset:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "!wget -O loan_train.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv",
"execution_count": 33,
"outputs": [
{
"output_type": "stream",
"text": "--2019-11-25 07:23:12-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv\nResolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196\nConnecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 23101 (23K) [text/csv]\nSaving to: \u2018loan_train.csv\u2019\n\n100%[======================================>] 23,101 --.-K/s in 0.07s \n\n2019-11-25 07:23:13 (303 KB/s) - \u2018loan_train.csv\u2019 saved [23101/23101]\n\n",
"name": "stdout"
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Load Data From CSV File "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df = pd.read_csv('loan_train.csv')\ndf.head()",
"execution_count": 34,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 34,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 9/8/2016 \n1 2 2 PAIDOFF 1000 30 9/8/2016 \n2 3 3 PAIDOFF 1000 15 9/8/2016 \n3 4 4 PAIDOFF 1000 30 9/9/2016 \n4 6 6 PAIDOFF 1000 30 9/9/2016 \n\n due_date age education Gender \n0 10/7/2016 45 High School or Below male \n1 10/7/2016 33 Bechalor female \n2 9/22/2016 27 college male \n3 10/8/2016 28 college female \n4 10/8/2016 29 college male ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/8/2016</td>\n <td>10/7/2016</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>male</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/8/2016</td>\n <td>10/7/2016</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>female</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>9/8/2016</td>\n <td>9/22/2016</td>\n <td>27</td>\n <td>college</td>\n <td>male</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/9/2016</td>\n <td>10/8/2016</td>\n <td>28</td>\n <td>college</td>\n <td>female</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/9/2016</td>\n <td>10/8/2016</td>\n <td>29</td>\n <td>college</td>\n <td>male</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.shape",
"execution_count": 35,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 35,
"data": {
"text/plain": "(346, 10)"
},
"metadata": {}
}
]
},
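{
"metadata": {},
"cell_type": "markdown",
"source": "The preview above shows two columns, `Unnamed: 0` and `Unnamed: 0.1`, that look like row indices left over from earlier exports rather than loan attributes; the field table does not list them. As a minimal sketch, and assuming they carry no predictive information, they could be dropped before modeling. The `sample` frame below is a hypothetical stand-in built from the first three rows shown above so that the cell runs on its own; `index_like` and `cleaned` are illustrative names."
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical stand-in: a few columns from the first three rows of the preview above\nsample = pd.DataFrame({\n    'Unnamed: 0': [0, 2, 3],\n    'Unnamed: 0.1': [0, 2, 3],\n    'loan_status': ['PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n    'Principal': [1000, 1000, 1000],\n    'terms': [30, 30, 15],\n})\n\n# Assumption: columns whose names start with 'Unnamed' are leftover row indices\nindex_like = [c for c in sample.columns if c.startswith('Unnamed')]\ncleaned = sample.drop(columns=index_like)\ncleaned.columns.tolist()",
"execution_count": null,
"outputs": []
},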
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Convert to date time object "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['due_date'] = pd.to_datetime(df['due_date'])\ndf['effective_date'] = pd.to_datetime(df['effective_date'])\ndf.head()",
"execution_count": 36,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 36,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 2016-09-08 \n1 2 2 PAIDOFF 1000 30 2016-09-08 \n2 3 3 PAIDOFF 1000 15 2016-09-08 \n3 4 4 PAIDOFF 1000 30 2016-09-09 \n4 6 6 PAIDOFF 1000 30 2016-09-09 \n\n due_date age education Gender \n0 2016-10-07 45 High School or Below male \n1 2016-10-07 33 Bechalor female \n2 2016-09-22 27 college male \n3 2016-10-08 28 college female \n4 2016-10-08 29 college male ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>male</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>female</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>2016-09-08</td>\n <td>2016-09-22</td>\n <td>27</td>\n <td>college</td>\n <td>male</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>28</td>\n <td>college</td>\n <td>female</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>29</td>\n <td>college</td>\n <td>male</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
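{
"metadata": {},
"cell_type": "markdown",
"source": "`pd.to_datetime` infers the date format in the cell above. A hedged alternative, assuming every date follows the month/day/year pattern seen in the preview (for example `9/8/2016`), is to pass an explicit `format` so that an unexpected value raises an error instead of being guessed. The dates below are copied from the first three rows shown above; `span_days` is an illustrative name for the gap between origination and due date."
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Values copied from the first three rows of the preview above\neffective = pd.Series(['9/8/2016', '9/8/2016', '9/8/2016'])\ndue = pd.Series(['10/7/2016', '10/7/2016', '9/22/2016'])\n\n# Assumption: all dates are written as month/day/year\neffective = pd.to_datetime(effective, format='%m/%d/%Y')\ndue = pd.to_datetime(due, format='%m/%d/%Y')\n\n# Number of days between origination and due date\nspan_days = (due - effective).dt.days\nspan_days.tolist()",
"execution_count": null,
"outputs": []
},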
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "# Data visualization and pre-processing\n\n"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Let\u2019s see how many examples of each class are in our data set:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['loan_status'].value_counts()",
"execution_count": 38,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 38,
"data": {
"text/plain": "PAIDOFF 260\nCOLLECTION 86\nName: loan_status, dtype: int64"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "260 people have paid off the loan on time while 86 have gone into collection \n"
},
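{
"metadata": {},
"cell_type": "markdown",
"source": "The counts above show an imbalanced target. As a small sketch, `value_counts(normalize=True)` reports the same split as fractions; the series here is rebuilt from the two counts reported above so that the cell is self-contained, and `labels` and `shares` are illustrative names. One consequence worth keeping in mind (an observation, not part of the original lab) is that a model that always predicts `PAIDOFF` would already match about 75% of these labels."
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Rebuild the label column from the counts reported above\nlabels = pd.Series(['PAIDOFF'] * 260 + ['COLLECTION'] * 86, name='loan_status')\n\n# Fraction of each class instead of raw counts\nshares = labels.value_counts(normalize=True)\nshares.round(3)",
"execution_count": null,
"outputs": []
},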
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's plot some columns to understand the data better:"
},
{
"metadata": {},
"cell_type": "code",
"source": "# Note: installing seaborn might take a few minutes\n!conda install -c anaconda seaborn -y",
"execution_count": 39,
"outputs": [
{
"output_type": "stream",
"text": "Solving environment: done\n\n# All requested packages already installed.\n\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "import seaborn as sns\n\nbins = np.linspace(df.Principal.min(), df.Principal.max(), 10)\ng = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\ng.map(plt.hist, 'Principal', bins=bins, ec=\"k\")\n\ng.axes[-1].legend()\nplt.show()",
"execution_count": 40,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x216 with 2 Axes>"
},
"metadata": {
"needs_background": "light"
}
}
]
},
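{
"metadata": {},
"cell_type": "markdown",
"source": "In the plotting cell above, `np.linspace(min, max, 10)` produces 10 bin edges, so each histogram has 9 bins rather than 10. A minimal sketch of that relationship follows; the `values` list is made up for illustration and is not taken from the dataset."
},
{
"metadata": {},
"cell_type": "code",
"source": "import numpy as np\n\n# Illustrative values only (not taken from the dataset)\nvalues = [300, 800, 1000, 1000]\n\n# Same construction as the plotting cell: 10 evenly spaced edges\nedges = np.linspace(min(values), max(values), 10)\ncounts, _ = np.histogram(values, bins=edges)\n\n# 10 edges delimit 9 bins\nlen(edges), len(counts)",
"execution_count": null,
"outputs": []
},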
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "bins = np.linspace(df.age.min(), df.age.max(), 10)\ng = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\ng.map(plt.hist, 'age', bins=bins, ec=\"k\")\n\ng.axes[-1].legend()\nplt.show()",
"execution_count": 41,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x216 with 2 Axes>"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "# Pre-processing: Feature selection/extraction"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Let's look at the day of the week people get the loan"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['dayofweek'] = df['effective_date'].dt.dayofweek\nbins = np.linspace(df.dayofweek.min(), df.dayofweek.max(), 10)\ng = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\ng.map(plt.hist, 'dayofweek', bins=bins, ec=\"k\")\ng.axes[-1].legend()\nplt.show()\n",
"execution_count": 42,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x216 with 2 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAADQCAYAAABStPXYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAGepJREFUeJzt3XmcVPW55/HPV2gvIriC2tIBWkQQldtgR+OCQUh4EdzwuoTEKGTMdTQuYQyDSzImN84YF8YlcSVq8EbEhUTMJTcaVIjgztKCiCFebbEVFJgYYxQFfeaPOt1poKGr6VPU6erv+/WqV1edOud3ntNdTz91fnXq91NEYGZmljU7FDsAMzOzprhAmZlZJrlAmZlZJrlAmZlZJrlAmZlZJrlAmZlZJrlApUTS3pLuk/S6pAWSnpV0ckptD5U0M422tgdJcyRVFzsOK75SygtJ3SU9L2mRpCEF3M+HhWq7rXGBSoEkATOApyJiv4g4FBgDVBQpno7F2K9ZYyWYF8OBVyNiUETMTSMm2zoXqHQMAz6NiNvrF0TEmxHxcwBJHSRdJ+lFSYsl/fdk+dDkbGO6pFclTU2SGkkjk2XzgH+pb1fSzpLuTtpaJOmkZPk4SQ9J+g/gD605GElTJN0maXbyzvfLyT6XSZrSaL3bJM2XtFTSv22hrRHJu+aFSXxdWhObtSklkxeSqoBrgVGSaiTttKXXtqRaSVclz82XNFjSY5L+S9K5yTpdJD2RbLukPt4m9vs/G/1+msyxkhYRvrXyBlwE3LCV588Bfpjc/ydgPlAJDAX+Su4d5Q7As8DRQCfgLaAvIOBBYGay/VXAt5L7uwHLgZ2BcUAdsMcWYpgL1DRx+0oT604B7k/2fRLwAXBIEuMCoCpZb4/kZwdgDjAweTwHqAa6AU8BOyfLLwGuKPbfy7ftcyvBvBgH3Jzc3+JrG6gFzkvu3wAsBroC3YH3kuUdgV0atfUaoOTxh8nPEcDk5Fh3AGYCxxT777o9b+4KKgBJt5BLqE8j4ovkXmgDJZ2arLIruST7FHghIuqS7WqA3sCHwBsR8edk+b3kkpmkrRMlTUgedwJ6JvdnRcT/ayqmiGhpn/l/RERIWgK8GxFLkliWJjHWAKdLOodcspUDA8glY70vJcueTt4A70jun421QyWSF/Wae23/Nvm5BOgSEX8D/iZpnaTdgL8DV0k6Bvgc6AHsDaxq1MaI5LYoedyF3O/nqW2Muc1xgUrHUuCU+gcRcb6kbuTeEULuHdCFEfFY440kDQU+abToM/7xN9nSIIkCTomIP23S1uHkXvRNbyTNJfcublMTIuLxJpbXx/X5JjF+DnSUVAlMAL4YEX9Juv46NRHrrIj4xpbispJWinnReH9be21vNX+AM8idUR0aEesl1dJ0/vw0Iu7YShwlzZ9BpeNJoJOk8xot69zo/mPAeZLKACQdIGnnrbT3KlApqU/yuHESPAZc2KhPflA+AUbEkIioauK2tSTcml3IJf5fJe0NfK2JdZ4DjpK0fxJrZ0kHbOP+rO0p5bxo7Wt7V3LdfeslHQv0amKdx4D/1uizrR6S9mrBPto8F6gURK7DeDTwZUlvSHoBuIdcvzTAncArwEJJLwN3sJWz14hYR67r4nfJh8FvNnr6SqAMWJy0dWXax5OPiHiJXNfDUuBu4Okm1llNrt9+mqTF5JK6/3YM04qolPMihdf2VKBa0nxyZ1OvNrGPPwD3Ac8mXe3Tafpsr2TVfyhnZmaWKT6DMjOzTHKBMjOzTHKBMjOzTHKBMjOzTNquBWrkyJFB7nsMvvlWqrdWc5741g5uedmuBWrNmjXbc3dmbZLzxCzHXXxmZpZJLlBmZpZJLlBmZpZJHizWzErO+vXrqaurY926dcUOpV3r1KkTFRUVlJWVbdP2LlBmVnLq6uro2rUrvXv3Jhk/1raziGDt2rXU1dVRWVm5TW24i8/MSs66de
vYc889XZyKSBJ77rlnq85iXaCs5PUqL0dSq2+9ysuLfSjWAi5Oxdfav4G7+KzkrVi1irp9K1rdTsU7dSlEY2b58hmUmZW8tM6iW3I23aFDB6qqqjj44IM57bTT+Oijjxqee/jhh5HEq6/+Yxqo2tpaDj74YADmzJnDrrvuyqBBg+jXrx/HHHMMM2fO3Kj9yZMn079/f/r3789hhx3GvHnzGp4bOnQo/fr1o6qqiqqqKqZPn75RTPW32tra1vxaCy6vMyhJ/wP4DrkhKpYA3wbKgfuBPYCFwJkR8WmB4jQz22ZpnUXXy+dseqeddqKmpgaAM844g9tvv52LL74YgGnTpnH00Udz//338+Mf/7jJ7YcMGdJQlGpqahg9ejQ77bQTw4cPZ+bMmdxxxx3MmzePbt26sXDhQkaPHs0LL7zAPvvsA8DUqVOprq7eYkxtQbNnUJJ6ABcB1RFxMNABGANcA9wQEX2BvwBnFzJQM7O2asiQIbz22msAfPjhhzz99NPcdddd3H///XltX1VVxRVXXMHNN98MwDXXXMN1111Ht27dABg8eDBjx47llltuKcwBFEm+XXwdgZ0kdQQ6AyuBYeSmIIbcNM6j0w/PzKxt27BhA7///e855JBDAJgxYwYjR47kgAMOYI899mDhwoV5tTN48OCGLsGlS5dy6KGHbvR8dXU1S5cubXh8xhlnNHTlrV27FoCPP/64YdnJJ5+cxuEVVLNdfBHxtqRJwArgY+APwALg/YjYkKxWB/RoantJ5wDnAPTs2TONmM1KjvOk9NQXA8idQZ19dq6Tadq0aYwfPx6AMWPGMG3aNAYPHtxsexFbHwQ8Ija6aq4UuviaLVCSdgdOAiqB94GHgK81sWqTv72ImAxMBqiurs57mHWz9sR5UnqaKgZr167lySef5OWXX0YSn332GZK49tprm21v0aJFHHjggQAMGDCABQsWMGzYsIbnFy5cyIABA9I9iCLLp4vvK8AbEbE6ItYDvwGOBHZLuvwAKoB3ChSjmVlJmD59OmeddRZvvvkmtbW1vPXWW1RWVm50BV5TFi9ezJVXXsn5558PwMSJE7nkkksauu5qamqYMmUK3/3udwt+DNtTPlfxrQC+JKkzuS6+4cB8YDZwKrkr+cYCjxQqSDOz1ui5zz6pfo+tZ3KlXEtNmzaNSy+9dKNlp5xyCvfddx+XXHLJRsvnzp3LoEGD+Oijj9hrr7342c9+xvDhwwE48cQTefvttznyyCORRNeuXbn33nspL7Evk6u5fk0ASf8GfB3YACwid8l5D/5xmfki4FsR8cnW2qmuro758+e3NmazFpGU2hd188iXVg9f4DxpvWXLljV0h1lxbeFvkVee5PU9qIj4EfCjTRa/DhyWz/ZmZmYt5ZEkzMwsk1ygzMwsk1ygzMwsk1ygzMwsk1ygzMwsk1ygzKzk7VvRM9XpNvatyG84qlWrVjFmzBj69OnDgAEDGDVqFMuXL2fp0qUMGzaMAw44gL59+3LllVc2fIVhypQpXHDBBZu11bt3b9asWbPRsilTptC9e/eNptB45ZVXAFi+fDmjRo1i//3358ADD+T000/ngQceaFivS5cuDVNynHXWWcyZM4fjjz++oe0ZM2YwcOBA+vfvzyGHHMKMGTManhs3bhw9evTgk09y3yxas2YNvXv3btHfJB+esNDMSt7Kt9/i8CseTa29538ystl1IoKTTz6ZsWPHNoxaXlNTw7vvvsu4ceO47bbbGDFiBB999BGnnHIKt956a8NIES3x9a9/vWGU83rr1q3juOOO4/rrr+eEE04AYPbs2XTv3r1h+KWhQ4cyadKkhvH65syZ07D9Sy+9xIQJE5g1axaVlZW88cYbfPWrX2W//fZj4MCBQG5uqbvvvpvzzjuvxTHny2dQZmYFMHv2bMrKyjj33HMbllVVVbF8+XKOOuooRowYAUDnzp25+eabufrqq1Pb93333ccRRxzRUJwAjj322IYJEZszad
IkLr/8ciorKwGorKzksssu47rrrmtYZ/z48dxwww1s2LBhS820mguUmVkBvPzyy5tNiQFNT5XRp08fPvzwQz744IMW76dxt11VVRUff/zxFvedr3ym8+jZsydHH300v/rVr7Z5P81xF5+Z2Xa06bQYjW1p+dY01cXXWk3F2NSyyy+/nBNPPJHjjjsu1f3X8xmUmVkBHHTQQSxYsKDJ5ZuOtfj666/TpUsXunbtWtB9t2T7TWNsajqP/fffn6qqKh588MFt3tfWuECZmRXAsGHD+OSTT/jFL37RsOzFF1+kb9++zJs3j8cffxzITWx40UUXMXHixNT2/c1vfpNnnnmG3/3udw3LHn30UZYsWZLX9hMmTOCnP/0ptbW1ANTW1nLVVVfx/e9/f7N1f/CDHzBp0qRU4t6Uu/jMrOSV9/hCXlfetaS95kji4YcfZvz48Vx99dV06tSJ3r17c+ONN/LII49w4YUXcv755/PZZ59x5plnbnRp+ZQpUza6rPu5554DYODAgeywQ+684vTTT2fgwIE88MADG80ndeutt3LkkUcyc+ZMxo8fz/jx4ykrK2PgwIHcdNNNeR1fVVUV11xzDSeccALr16+nrKyMa6+9tmGG4MYOOuggBg8enPfU9S2R13QbafE0AlYMnm6j/fF0G9nRmuk23MVnZmaZlKkC1au8PLVvevcqsZklzczam0x9BrVi1apUumKAVKd3NrO2Z2uXc9v20dqPkDJ1BmVmloZOnTqxdu3aVv+DtG0XEaxdu5ZOnTptcxuZOoMyM0tDRUUFdXV1rF69utihtGudOnWiomLbe8VcoMys5JSVlTWMI2dtl7v4zMwsk1ygzMwsk1ygzMwsk1ygzMwsk1ygzMwsk/IqUJJ2kzRd0quSlkk6QtIekmZJ+nPyc/dCB2tmZu1HvmdQNwGPRkR/4J+BZcClwBMR0Rd4InlsZmaWimYLlKRdgGOAuwAi4tOIeB84CbgnWe0eYHShgjQzs/YnnzOo/YDVwC8lLZJ0p6Sdgb0jYiVA8nOvpjaWdI6k+ZLm+1vdZk1znphtLp8C1REYDNwWEYOAv9OC7ryImBwR1RFR3b17920M06y0OU/MNpdPgaoD6iLi+eTxdHIF611J5QDJz/cKE6KZmbVHzRaoiFgFvCWpX7JoOPAK8FtgbLJsLPBIQSI0M7N2Kd/BYi8EpkraEXgd+Da54vagpLOBFcBphQnRrHXUoSyV+cHUoSyFaMwsX3kVqIioAaqbeGp4uuGYpS8+W8/hVzza6nae/8nIFKIxs3x5JAkzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8skFygzM8ukvAuUpA6SFkmamTyulPS8pD9LekDSjoUL08zM2puWnEF9D1jW6PE1wA0R0Rf4C3B2moGZmVn7lleBklQBHAfcmTwWMAyYnqxyDzC6EAGamVn7lO8Z1I3ARODz5PGewPsRsSF5XAf0aGpDSedImi9p/urVq1sVrFmpcp6Yba7ZAiXpeOC9iFjQeHETq0ZT20fE5Iiojojq7t27b2OYZqXNeWK2uY55rHMUcKKkUUAnYBdyZ1S7SeqYnEVVAO8ULkwzM2tvmj2DiojLIqIiInoDY4AnI+IMYDZwarLaWOCRgkVpZmbtTmu+B3UJcLGk18h9JnVXOiGZmZnl18XXICLmAHOS+68Dh6UfkpmZmUeSMDOzjHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKB2o56lZcjKZVbr/LyYh+OmVlBtWg+KGudFatWUb
dvRSptVbxTl0o7ZmZZ5TMoMzPLJBcoMzPLJBcoMzPLJBcoMzPLJBcoMzPLJBcoMzPLJBcoMzPLJBcoMzPLJBcoMzPLpGYLlKQvSJotaZmkpZK+lyzfQ9IsSX9Ofu5e+HDNzKy9yOcMagPw/Yg4EPgScL6kAcClwBMR0Rd4InlsZmaWimYLVESsjIiFyf2/AcuAHsBJwD3JavcAowsVpJmZtT8t+gxKUm9gEPA8sHdErIRcEQP22sI250iaL2n+6tWrWxetWYlynphtLu8CJakL8GtgfER8kO92ETE5Iqojorp79+7bEqNZyXOemG0urwIlqYxccZoaEb9JFr8rqTx5vhx4rzAhmplZe5TPVXwC7gKWRcT1jZ76LTA2uT8WeCT98MzMrL3KZ8LCo4AzgSWSapJllwNXAw9KOhtYAZxWmBDNzKw9arZARcQ8QFt4eni64ZiZWTH0Ki9nxapVqbTVc599eHPlyla34ynfzcyMFatWUbdvRSptVbxTl0o7HurIMqlXeTmSUrmVorR+P73Ky4t9KGZb5DMoy6QsvpvLkrR+P6X4u7HS4TMoMzPLpJI9g/onSK17J60P/Cx/6lDmd/dm7VzJFqhPwF1EbVh8tp7Dr3g0lbae/8nIVNoxs+3LXXxmZpZJLlBmZpZJLlBmZpZJLlBmZpZJLlBmZpZJLlBmZpZJLlBmZpZJLlBmZpZJLlBmZpZJLlBmZpZJJTvUkZmZ5S/N8S/VoSyVdlygzMwsk+NfuovPrB2rH/Xfkx9aFvkMyqwd86j/lmU+gzIzs0xygbLU7FvRM7XuIjMzd/FZala+/VbmPmQ1s7YrUwUqi5c5mtn216u8nBWrVrW6nZ777MObK1emEJEVQ6YKVBYvc8yq+quv0uAktqxZsWpVKhdv+MKNtq1VBUrSSOAmoANwZ0RcnUpU1ixffWVmpW6bL5KQ1AG4BfgaMAD4hqQBaQVmZtZaWf2eV6/y8lRi6tyhY0lfmNSaM6jDgNci4nUASfcDJwGvpBGYmVlrZbWnIc0uzCweX1oUEdu2oXQqMDIivpM8PhM4PCIu2GS9c4Bzkof9gD9tpdluwJptCqht8PG1bfkc35qIaPEHoC3Mk3xjact8fG1bc8eXV5605gyqqXPCzapdREwGJufVoDQ/IqpbEVOm+fjatkIeX0vypNCxZIGPr21L6/ha80XdOuALjR5XAO+0LhwzM7Oc1hSoF4G+kiol7QiMAX6bTlhmZtbebXMXX0RskHQB8Bi5y8zvjoilrYwn7y6ONsrH17Zl6fiyFEsh+PjatlSOb5svkjAzMyskDxZrZmaZ5AJlZmaZlJkCJWmkpD9Jek3SpcWOJ02SviBptqRlkpZK+l6xY0qbpA6SFkmaWexYCkHSbpKmS3o1+TseUaQ4nCdtXCnnStp5konPoJJhk5YDXyV3+fqLwDcioiRGpZBUDpRHxEJJXYEFwOhSOT4ASRcD1cAuEXF8seNJm6R7gLkRcWdy1WrniHh/O8fgPCkBpZwraedJVs6gGoZNiohPgfphk0pCRKyMiIXJ/b8By4AexY0qPZIqgOOAO4sdSyFI2gU4BrgLICI+3d7FKeE8aeNKOVcKkSdZKVA9gLcaPa6jxF6Y9ST1BgYBzxc3klTdCEwEPi92IAWyH7Aa+GXSNXOnpJ2LEIfzpO0r5VxJPU+yUqDyGjaprZPUBfg1MD4iPih2PGmQdDzwXkQsKHYsBdQRGAzcFhGDgL8Dxfj8x3nShrWDXEk9T7JSoEp+2CRJZeSSbmpE/KbY8aToKOBESbXkupyGSbq3uCGlrg6oi4j6d/PTySViMeJwnrRdpZ4rqedJVgpUSQ+bpNxkK3cByyLi+mLHk6aIuCwiKiKiN7m/25MR8a0ih5WqiFgFvCWpX7JoOMWZVsZ50oaVeq4UIk8yMeV7gYZNypKjgDOBJZJqkmWXR8R/FjEma5kLgalJYXgd+Pb2DsB5Ym1AqnmSicvMzczMNpWVLj4zM7ONuE
CZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUBlgKQfS5qQYnv9JdUkw430SavdRu3PkVSddrtmW+M8aX9coErTaOCRiBgUEf9V7GDMMsp5knEuUEUi6QfJvD6PA/2SZf8q6UVJL0n6taTOkrpKeiMZAgZJu0iqlVQmqUrSc5IWS3pY0u6SRgHjge8kc+tMlHRRsu0Nkp5M7g+vH2ZF0ghJz0paKOmhZCw0JB0q6Y+SFkh6LJkOofEx7CDpHkn/e7v94qxdcZ60by5QRSDpUHJDnQwC/gX4YvLUbyLiixHxz+SmGjg7mXZgDrkh+km2+3VErAf+HbgkIgYCS4AfJd+6vx24ISKOBZ4ChiTbVgNdkiQ+GpgrqRvwQ+ArETEYmA9cnKzzc+DUiDgUuBv4P40OoyMwFVgeET9M8ddjBjhPLCNDHbVDQ4CHI+IjAEn146kdnLzL2g3oQm5IG8jNHTMRmEFu6JB/lbQrsFtE/DFZ5x7goSb2tQA4VLkJ4D4BFpJLwCHARcCXgAHA07mh0NgReJbcu9WDgVnJ8g7Aykbt3gE8GBGNk9EsTc6Tds4FqniaGmNqCrkZRF+SNA4YChART0vqLenLQIeIeDlJvOZ3ErFeudGTvw08AywGjgX6kHv32QeYFRHfaLydpEOApRGxpSmbnwGOlfR/I2JdPrGYbQPnSTvmLr7ieAo4WdJOyTu2E5LlXYGVSbfBGZts8+/ANOCXABHxV+Avkuq7Jc4E/kjTngImJD/nAucCNZEbiPE54ChJ+wMk/fkHAH8Cuks6IlleJumgRm3eBfwn8JAkv9GxQnCetHMuUEWQTGv9AFBDbu6buclT/4vcDKKzgFc32WwqsDu55Ks3FrhO0mKgCvjJFnY5FygHno2Id4F19fuMiNXAOGBa0s5zQP9kSvFTgWskvZTEeuQmx3E9ua6QX0nya8lS5Twxj2beRkg6FTgpIs4sdixmWeU8KS0+5WwDJP0c+BowqtixmGWV86T0+AzKzMwyyf2hZmaWSS5QZmaWSS5QZmaWSS5QZmaWSS5QZmaWSf8feZ3K8s9z83MAAAAASUVORK5CYII=\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "We see that people who get the loan at the end of the week dont pay it off, so lets use Feature binarization to set a threshold values less then day 4 "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['weekend'] = df['dayofweek'].apply(lambda x: 1 if (x>3) else 0)\ndf.head()",
"execution_count": 43,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 43,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 2016-09-08 \n1 2 2 PAIDOFF 1000 30 2016-09-08 \n2 3 3 PAIDOFF 1000 15 2016-09-08 \n3 4 4 PAIDOFF 1000 30 2016-09-09 \n4 6 6 PAIDOFF 1000 30 2016-09-09 \n\n due_date age education Gender dayofweek weekend \n0 2016-10-07 45 High School or Below male 3 0 \n1 2016-10-07 33 Bechalor female 3 0 \n2 2016-09-22 27 college male 3 0 \n3 2016-10-08 28 college female 4 1 \n4 2016-10-08 29 college male 4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n <th>dayofweek</th>\n <th>weekend</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>male</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>female</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>2016-09-08</td>\n <td>2016-09-22</td>\n <td>27</td>\n <td>college</td>\n <td>male</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>28</td>\n <td>college</td>\n <td>female</td>\n <td>4</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>29</td>\n <td>college</td>\n <td>male</td>\n <td>4</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "## Convert Categorical features to numerical values"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Lets look at gender:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df.groupby(['Gender'])['loan_status'].value_counts(normalize=True)",
"execution_count": 44,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 44,
"data": {
"text/plain": "Gender loan_status\nfemale PAIDOFF 0.865385\n COLLECTION 0.134615\nmale PAIDOFF 0.731293\n COLLECTION 0.268707\nName: loan_status, dtype: float64"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "86 % of female pay there loans while only 73 % of males pay there loan\n"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Lets convert male to 0 and female to 1:\n"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['Gender'].replace(to_replace=['male','female'], value=[0,1],inplace=True)\ndf.head()",
"execution_count": 45,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 45,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 2016-09-08 \n1 2 2 PAIDOFF 1000 30 2016-09-08 \n2 3 3 PAIDOFF 1000 15 2016-09-08 \n3 4 4 PAIDOFF 1000 30 2016-09-09 \n4 6 6 PAIDOFF 1000 30 2016-09-09 \n\n due_date age education Gender dayofweek weekend \n0 2016-10-07 45 High School or Below 0 3 0 \n1 2016-10-07 33 Bechalor 1 3 0 \n2 2016-09-22 27 college 0 3 0 \n3 2016-10-08 28 college 1 4 1 \n4 2016-10-08 29 college 0 4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n <th>dayofweek</th>\n <th>weekend</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>0</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>1</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>2016-09-08</td>\n <td>2016-09-22</td>\n <td>27</td>\n <td>college</td>\n <td>0</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>28</td>\n <td>college</td>\n <td>1</td>\n <td>4</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>29</td>\n <td>college</td>\n <td>0</td>\n <td>4</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "## One Hot Encoding \n#### How about education?"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df.groupby(['education'])['loan_status'].value_counts(normalize=True)",
"execution_count": 46,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 46,
"data": {
"text/plain": "education loan_status\nBechalor PAIDOFF 0.750000\n COLLECTION 0.250000\nHigh School or Below PAIDOFF 0.741722\n COLLECTION 0.258278\nMaster or Above COLLECTION 0.500000\n PAIDOFF 0.500000\ncollege PAIDOFF 0.765101\n COLLECTION 0.234899\nName: loan_status, dtype: float64"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "#### Feature befor One Hot Encoding"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df[['Principal','terms','age','Gender','education']].head()",
"execution_count": 47,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 47,
"data": {
"text/plain": " Principal terms age Gender education\n0 1000 30 45 0 High School or Below\n1 1000 30 33 1 Bechalor\n2 1000 15 27 0 college\n3 1000 30 28 1 college\n4 1000 30 29 0 college",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>education</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>High School or Below</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>Bechalor</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>college</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>college</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>college</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "#### Use one hot encoding technique to conver categorical varables to binary variables and append them to the feature Data Frame "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "Feature = df[['Principal','terms','age','Gender','weekend']]\nFeature = pd.concat([Feature,pd.get_dummies(df['education'])], axis=1)\nFeature.drop(['Master or Above'], axis = 1,inplace=True)\nFeature.head()\n",
"execution_count": 48,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 48,
"data": {
"text/plain": " Principal terms age Gender weekend Bechalor High School or Below \\\n0 1000 30 45 0 0 0 1 \n1 1000 30 33 1 0 1 0 \n2 1000 15 27 0 0 0 0 \n3 1000 30 28 1 1 0 0 \n4 1000 30 29 0 1 0 0 \n\n college \n0 0 \n1 0 \n2 1 \n3 1 \n4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>weekend</th>\n <th>Bechalor</th>\n <th>High School or Below</th>\n <th>college</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Feature selection"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Lets defind feature sets, X:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "X = Feature\nX[0:5]",
"execution_count": 49,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 49,
"data": {
"text/plain": " Principal terms age Gender weekend Bechalor High School or Below \\\n0 1000 30 45 0 0 0 1 \n1 1000 30 33 1 0 1 0 \n2 1000 15 27 0 0 0 0 \n3 1000 30 28 1 1 0 0 \n4 1000 30 29 0 1 0 0 \n\n college \n0 0 \n1 0 \n2 1 \n3 1 \n4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>weekend</th>\n <th>Bechalor</th>\n <th>High School or Below</th>\n <th>college</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "What are our lables?"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "y = df['loan_status'].values\ny[0:5]",
"execution_count": 19,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 19,
"data": {
"text/plain": "array(['PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n dtype=object)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "merge = pd.concat([X, df['loan_status']], axis=1, sort=False)\nmerge.head()\nmerge.corr(method='pearson')",
"execution_count": 50,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 50,
"data": {
"text/plain": " Principal terms age Gender weekend \\\nPrincipal 1.000000 0.521876 -0.060893 -0.005134 0.089006 \nterms 0.521876 1.000000 -0.064762 -0.032399 0.084842 \nage -0.060893 -0.064762 1.000000 -0.010519 0.000431 \nGender -0.005134 -0.032399 -0.010519 1.000000 -0.079157 \nweekend 0.089006 0.084842 0.000431 -0.079157 1.000000 \nBechalor 0.022212 -0.057337 0.057065 0.082229 0.016430 \nHigh School or Below 0.011206 0.101787 0.066836 -0.043927 -0.064819 \ncollege -0.021506 -0.052172 -0.131585 -0.006420 0.044184 \n\n Bechalor High School or Below college \nPrincipal 0.022212 0.011206 -0.021506 \nterms -0.057337 0.101787 -0.052172 \nage 0.057065 0.066836 -0.131585 \nGender 0.082229 -0.043927 -0.006420 \nweekend 0.016430 -0.064819 0.044184 \nBechalor 1.000000 -0.335888 -0.331958 \nHigh School or Below -0.335888 1.000000 -0.765299 \ncollege -0.331958 -0.765299 1.000000 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>weekend</th>\n <th>Bechalor</th>\n <th>High School or Below</th>\n <th>college</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>Principal</th>\n <td>1.000000</td>\n <td>0.521876</td>\n <td>-0.060893</td>\n <td>-0.005134</td>\n <td>0.089006</td>\n <td>0.022212</td>\n <td>0.011206</td>\n <td>-0.021506</td>\n </tr>\n <tr>\n <th>terms</th>\n <td>0.521876</td>\n <td>1.000000</td>\n <td>-0.064762</td>\n <td>-0.032399</td>\n <td>0.084842</td>\n <td>-0.057337</td>\n <td>0.101787</td>\n <td>-0.052172</td>\n </tr>\n <tr>\n <th>age</th>\n <td>-0.060893</td>\n <td>-0.064762</td>\n <td>1.000000</td>\n <td>-0.010519</td>\n <td>0.000431</td>\n <td>0.057065</td>\n <td>0.066836</td>\n <td>-0.131585</td>\n </tr>\n <tr>\n <th>Gender</th>\n <td>-0.005134</td>\n <td>-0.032399</td>\n <td>-0.010519</td>\n <td>1.000000</td>\n <td>-0.079157</td>\n <td>0.082229</td>\n <td>-0.043927</td>\n <td>-0.006420</td>\n </tr>\n <tr>\n <th>weekend</th>\n <td>0.089006</td>\n <td>0.084842</td>\n <td>0.000431</td>\n <td>-0.079157</td>\n <td>1.000000</td>\n <td>0.016430</td>\n <td>-0.064819</td>\n <td>0.044184</td>\n </tr>\n <tr>\n <th>Bechalor</th>\n <td>0.022212</td>\n <td>-0.057337</td>\n <td>0.057065</td>\n <td>0.082229</td>\n <td>0.016430</td>\n <td>1.000000</td>\n <td>-0.335888</td>\n <td>-0.331958</td>\n </tr>\n <tr>\n <th>High School or Below</th>\n <td>0.011206</td>\n <td>0.101787</td>\n <td>0.066836</td>\n <td>-0.043927</td>\n <td>-0.064819</td>\n <td>-0.335888</td>\n <td>1.000000</td>\n <td>-0.765299</td>\n </tr>\n <tr>\n <th>college</th>\n <td>-0.021506</td>\n <td>-0.052172</td>\n <td>-0.131585</td>\n 
<td>-0.006420</td>\n <td>0.044184</td>\n <td>-0.331958</td>\n <td>-0.765299</td>\n <td>1.000000</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn.model_selection import train_test_split\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)\nprint(\"X_train size is \", X_train.shape, \"\\n\", \"X_test size is \", X_test.shape, \"\\n\",\n \"y_train size is \", y_train.shape, \"\\n\", \"y_test size is \", y_test.shape)\nprint(X_train[0:5])\ny_train[0:5]",
"execution_count": 51,
"outputs": [
{
"output_type": "stream",
"text": "X_train size is (276, 8) \n X_test size is (70, 8) \n y_train size is (276,) \n y_test size is (70,)\n Principal terms age Gender weekend Bechalor High School or Below \\\n188 1000 15 35 0 0 0 0 \n299 1000 30 26 0 1 0 1 \n239 1000 30 31 0 0 0 0 \n46 1000 15 25 0 1 0 0 \n259 1000 30 28 0 0 0 0 \n\n college \n188 1 \n299 0 \n239 1 \n46 1 \n259 1 \n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 51,
"data": {
"text/plain": "array(['PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n dtype=object)"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "## Normalize Data "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Data Standardization give data zero mean and unit variance (technically should be done after train test split )"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "X= preprocessing.StandardScaler().fit(X).transform(X)\nX[0:5]",
"execution_count": 52,
"outputs": [
{
"output_type": "stream",
"text": "/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n return self.partial_fit(X, y)\n/opt/conda/envs/Python36/lib/python3.6/site-packages/ipykernel/__main__.py:1: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n if __name__ == '__main__':\n",
"name": "stderr"
},
{
"output_type": "execute_result",
"execution_count": 52,
"data": {
"text/plain": "array([[ 0.51578458, 0.92071769, 2.33152555, -0.42056004, -1.20577805,\n -0.38170062, 1.13639374, -0.86968108],\n [ 0.51578458, 0.92071769, 0.34170148, 2.37778177, -1.20577805,\n 2.61985426, -0.87997669, -0.86968108],\n [ 0.51578458, -0.95911111, -0.65321055, -0.42056004, -1.20577805,\n -0.38170062, -0.87997669, 1.14984679],\n [ 0.51578458, 0.92071769, -0.48739188, 2.37778177, 0.82934003,\n -0.38170062, -0.87997669, 1.14984679],\n [ 0.51578458, 0.92071769, -0.3215732 , -0.42056004, 0.82934003,\n -0.38170062, -0.87997669, 1.14984679]])"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "X_train = preprocessing.StandardScaler().fit(X_train).transform(X_train.astype(float))\nX_train[0:5]\nX_test = preprocessing.StandardScaler().fit(X_test).transform(X_test.astype(float))\nX_test[0:5]",
"execution_count": 53,
"outputs": [
{
"output_type": "stream",
"text": "/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n return self.partial_fit(X, y)\n/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n return self.partial_fit(X, y)\n",
"name": "stderr"
},
{
"output_type": "execute_result",
"execution_count": 53,
"data": {
"text/plain": "array([[ 0.33474248, 0.83916906, -0.19614926, -0.47756693, 0.74535599,\n -0.2773501 , 1.26197963, -1.05887304],\n [-1.70282047, -0.9301633 , -0.19614926, -0.47756693, 0.74535599,\n -0.2773501 , -0.79240582, 0.94440028],\n [ 0.33474248, -0.9301633 , -0.04012144, -0.47756693, -1.34164079,\n -0.2773501 , 1.26197963, -1.05887304],\n [ 0.33474248, 0.83916906, -1.13231619, -0.47756693, -1.34164079,\n -0.2773501 , -0.79240582, 0.94440028],\n [ 0.33474248, 0.83916906, 0.42796202, -0.47756693, -1.34164079,\n -0.2773501 , -0.79240582, 0.94440028]])"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "# Classification "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Now, it is your turn, use the training set to build an accurate model. Then use the test set to report the accuracy of the model\nYou should use the following algorithm:\n- K Nearest Neighbor(KNN)\n- Decision Tree\n- Support Vector Machine\n- Logistic Regression\n\n\n\n__ Notice:__ \n- You can go above and change the pre-processing, feature selection, feature-extraction, and so on, to make a better model.\n- You should use either scikit-learn, Scipy or Numpy libraries for developing the classification algorithms.\n- You should include the code of the algorithm in the following cells."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# K Nearest Neighbor(KNN)\nNotice: You should find the best k to build the model with the best accuracy. \n**warning:** You should not use the __loan_test.csv__ for finding the best k, however, you can split your train_loan.csv into train and test to find the best __k__."
},
{
"metadata": {},
"cell_type": "code",
"source": "# finding a suitable k value\nfrom sklearn.neighbors import KNeighborsClassifier\nfrom sklearn.metrics import jaccard_similarity_score\nimport matplotlib.pyplot as plt\n%matplotlib inline\n\nk_range = range(1, 10)\naccuracy_score = []\nfor k in k_range:\n KNN = KNeighborsClassifier(n_neighbors = k).fit(X_train, y_train)\n # perform the test\n knn_yhat = KNN.predict(X_test)\n print(\"Test set Accuracy at k=\", k, \": \", jaccard_similarity_score(y_test, knn_yhat))\n accuracy_score.append(jaccard_similarity_score(y_test, knn_yhat))\n\n# plot the relationship between K and testing accuracy\nplt.plot(k_range, accuracy_score)\nplt.xlabel('Value of K for KNN')\nplt.ylabel('Testing Accuracy')",
"execution_count": 54,
"outputs": [
{
"output_type": "stream",
"text": "Test set Accuracy at k= 1 : 0.6714285714285714\nTest set Accuracy at k= 2 : 0.6428571428571429\nTest set Accuracy at k= 3 : 0.7285714285714285\nTest set Accuracy at k= 4 : 0.6571428571428571\nTest set Accuracy at k= 5 : 0.7142857142857143\nTest set Accuracy at k= 6 : 0.6571428571428571\nTest set Accuracy at k= 7 : 0.7428571428571429\nTest set Accuracy at k= 8 : 0.7428571428571429\nTest set Accuracy at k= 9 : 0.7142857142857143\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 54,
"data": {
"text/plain": "Text(0, 0.5, 'Testing Accuracy')"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEKCAYAAADjDHn2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xl4m+WV8P/vkbzvTmQnzr7Y2eOsDRQcCgRKICl0gQ687Sz9za/btDNtB2jpvk2nG512pmXmnU477XSmhYGUtjRJCRQoJUBZrBBnT5zFsmNnsSNv8W6f9w9JwThe5ETSI8nnc126sKVHek6M5aPn3Pd9blFVjDHGmNG4nA7AGGNM/LNkYYwxZkyWLIwxxozJkoUxxpgxWbIwxhgzJksWxhhjxmTJwhhjzJgsWRhjjBmTJQtjjDFjSnE6gEjxeDw6Z84cp8MwxpiEUllZ2aiqRWMdlzTJYs6cObz66qtOh2GMMQlFRGrCOc7KUMYYY8ZkycIYY8yYLFkYY4wZkyULY4wxY7JkYYwxZkyWLIwxxozJkoUxxpgxJc06C2NMcnjuyFleOX7O6TCGde2iYlbPKnQ6DEdYsjDGxJVPbamivqULEacjeSNV+M/nT/D4x9czozDL6XBizpKFMSZu1Dd3Ut/SxRfftoT3XT3X6XDeoPZcBxu/90c+9csq/vv/uwKXK86yWZRFdcxCRDaKyCERqRaR+4Z5/Lsi8lrwdlhEmoc8niciJ0XkB9GM0xgTH7w+PwBrZsdfqWfmpCw+s2kxz1c38fOXfU6HE3NRSxYi4gYeAG4GlgB3iciSwceo6idUdaWqrgS+Dzw65GW+CjwbrRiNMfHFW9NMRqqLxSV5TocyrP+zbhbryzx8ffsBfE0dTocTU9G8slgHVKvqMVXtAR4Cbhvl+LuAB0PfiMgaYArwRBRjNMbEkUqfn/IZBaS643OipojwzXeV4xbhni27GRhQp0OKmWj+H5kO1A76vi5430VEZDYwF3g6+L0L+A5wbxTjM8bEka7efvbXt8T9bKNpBZl8/m1LePn4Of7rxRNOhxMz0UwWw43+jJSG7wS2qGp/8Pu/Abarau0IxwdOIPIBEXlVRF49e/bsZYRqjHHanpMt9PZrXI5XDHXHmhlct7CIbz5+kGNn250OJyaimSzqgJmDvp8B1I9w7J0MKkEBbwY+KiIngPuBvxCRbwx9kqr+UFXXquraoqIx9+4wxsQxb01gcHvVrAKHIxmbiPCNd5WT5nZx75Yq+idAOSqayeIVoExE5opIGoGE8NjQg0RkIVAIvBi6T1Xfo6qzVHUOcA/wM1W9aDaVMSZ5VNb4mTM5C09OutOhhGVKXgZfvm0plTV+frzzmNPhRF3UkoWq9gEfBXYAB4CHVXWfiHxFRG4ddOhdwEOqmvyp2RgzLFXF62uO+/GKod6+cjo3LpnC/U8cpvpMm9PhRFVUpxyo6nZVXaCq81X1a8H7vqCqjw065kujXTWo6k9V9aPRjNMY46zac500tnezOgHGKwYTEf7xHcvJTnNz98O76esfcDqkqInP+WnGmAkltBgv0a4sAIpy0/nKbcvYXdfCv/8xectRliyMMY6rrPGTneZm4dRcp0O5JG9bMY1Ny0v43u8Pc/BUq9PhRIUlC2OM47w+PytnFeBO4H5LX7ltKXkZqdz98G56k7AcZcnCGOOo8919HGhoZU0ClqAGm5yTztfesZx99a088Ey10+FEnCULY4yjdtc1M6CwKsEGt4ezcdlUbls5jR88Xc3eky1OhxNRliyMMY4KLcZbPTPxkwXAl29dSmF2Gvc8spuevuQpR1myMMY4yutrprQ4h/ysVKdDiYiCrDS+8c7lHDzVxr88dcTpcCLGkoUxxjGBxXj+hB+vGGrD4incvmYG//bsUXbXNo/9hARgycIY45hjjedp7uhl9ez47wc1Xp/fvISinHTufmQ3Xb39Yz8hzl
myMMY4prImfnfGu1z5mal8413LqT7Tznd/f9jpcC6bJQtjjGN2+fzkZaQwz5PjdChRce3CYu5aN5P/+OOxC4kxUVmyMMY4prLGz+rZhbgSeDHeWD5zy2JK8jO555HddPYkbjnKkoUxxhEtnb0cOdOekP2gxiM3I5Vv3V7O8cbzfHvHIafDuWSWLIwxjnitthnV5ByvGOrqUg9/fuVsfvLCcV461uR0OJfEkoUxxhHeGj8ugRUzk28m1HDuu3kRMwuzuHdLFR09fU6HM26WLIwxjvD6/CycmkdOeorTocREdnoK3769nFp/B9/43UGnwxk3SxZm3GrPdSTFvHHjnP4B5TVfM6sTYL/tSLpi3mTed9VcfvZiDS9UNzodzrhYsjDj0tnTz03f+yPffzp52hiY2Dtypo227r6kH9wezr03LWSuJ5t7t1TR1tXrdDhhs2RhxqWqrpmOnn6ePnjW6VBMAvPWBFpgTITB7aEy09zcf0c5DS2d/OP2xClHWbIw4+L1Bd7kBxpaaWzvdjgak6gqa/xMyk5j9uQsp0NxxJrZk3j/+nk8+LKPZw8nxgcvSxZmXCpr/KSnBH5tnk+wmquJH7t8flbPKkQkeRfjjeUTNy6gtDiHT22poqUz/stRlixM2FSVXT4/tywvIT8zlZ1HLFmY8Tt3vodjjeeTsnngeGSkuvnOHSs4297NV7fudzqcMVmyMGGraeqg6XwPb5oziavmT2ZndSOq6nRYJsHs8gWbB07Awe2hVsws4ENvmceWyjqeOnDa6XBGZcnChM0bfJOvnl1ARZmHhpYujp4973BUJtFU1vhJcQnlMyb2lUXI320oY9HUXD796B6aO3qcDmdElixM2Cpr/OSmp1BWnMv60iIAdh5JjME5Ez+8Pj9LpuWRmeZ2OpS4kJ7i5v47VnDufA9femyf0+GMyJKFCZvX18zKWQW4XcKsyVnMmpTFThvkNuPQ1z/A7tqWCbm+YjTLpufz0etL+fVr9Ty+95TT4QzLkoUJS3t3H4dOtb7hTV5R5uFPx87R2588m9Kb6Dp4qo3O3n5WT8D1FWP5yHWlLJ2Wx+d+vYdz5+OvHGXJwoRld20zA8ob3uTrSz20d/fxWpLsMWyiL7QB0ERr8xGOVLeL77x7BS2dvXz+N3udDucilixMWCpr/IjAykEdQq+a78El8JxNoTVh8vr8TMlLZ3pBptOhxKVFU/P4+A0L2FbVwNaqeqfDeQNLFiYsXp+fsuIc8jNTL9yXn5XK8hkFNshtwua1xXhj+uA181gxI5/P/3ovZ9vip0uCJQszpoEBxVvjH7aPz/pSD7vrWmhNoIZoxhln2rqoPdc5IftBjUeK28X9d6zgfE8/n/3VnrhZyxTVZCEiG0XkkIhUi8h9wzz+XRF5LXg7LCLNwftXisiLIrJPRKpE5M+iGacZ3bHGdlq7+lg1zAyWijIP/QPKi0cTc/cvEzuh5oHD/R6ZNyqbksvdNy7gif2n+c1r8VGOilqyEBE38ABwM7AEuEtElgw+RlU/oaorVXUl8H3g0eBDHcBfqOpSYCPwPRGxETGHhAYlh/tEuHpWIVlpbmv9Ycbk9flJc7tYNj3P6VASwv+/fh6rZxXwxcf2cbq1y+lwonplsQ6oVtVjqtoDPATcNsrxdwEPAqjqYVU9Evy6HjgDFEUxVjMKb00zBVmpzPNkX/RYWoqLK+ZOsvUWZkzeGj/LpueRnmKL8cLhdgn337GC7r5+Pv2o8+WoaCaL6UDtoO/rgvddRERmA3OBp4d5bB2QBhwd5rEPiMirIvLq2bM2yBotlWMMSlaUFXG88Tx1/o4YR2YSRU/fAFUnW2y8YpzmFeXwyZsW8fTBM2yprHM0lmgmi+H+soyUGu8EtqjqG/bqFJES4L+B96nqRSu/VPWHqrpWVdcWFdmFRzS0dPRSfaZ91Hnx68s8AFaKMiPaV99CT9+Ardy+BH911RzWzZ3EV367n/rmTsfiiGayqANmDvp+BjDSSM
2dBEtQISKSB2wDPqeqf4pKhGZMu2pDzQNHfpOXFecwJS+d56wUZUZwYTGeXVmMm8sl3H/7CvpV+dQvqxwrR0UzWbwClInIXBFJI5AQHht6kIgsBAqBFwfdlwb8CviZqj4SxRjNGLw1flwCK0bpECoiXF3q4YXqRgYG4mOan4kvu3zNTC/IZEpehtOhJKRZk7P49M2LeO5IIw++XDv2E6IgaslCVfuAjwI7gAPAw6q6T0S+IiK3Djr0LuAhfWO6fDdwDfBXg6bWroxWrGZkXl8zi0vyyE5PGfW49WUe/B297KtvjVFkJpFUjrBOx4TvPVfM5urSyXxt235qz8V+fDCq6yxUdbuqLlDV+ar6teB9X1DVxwYd8yVVvW/I8/5HVVND02qDt9eiGau5WP+AXtj+cixXlwbGLZ6rtokG5o3qmzs51dpl/aAuk8slfPNd5QB8cktVzK/ibQW3GdHh022c7+kP6xNhcW4Gi6bm2iC3ucjr63QmORxJ4ptRmMXnNi/hxWNN/M9LNTE9tyULM6LXO4SGVz6oKPXw6gk/nT39Yx9sJgyvz09GqotFJblOh5IU7nzTTK5ZUMTXtx/kRGPsdqq0ZGFG5PX58eSkM3NSeB1CK8o89PQP8PKJc1GOzCQSb42fFTMKSHXbn5tIEBG++a7lpLiFe7fsjlk5yv7vmRF5a/ysnlUQdofQK+ZOJs3tsi605oKu3n721bfalNkIK8nP5ItvW8orJ/z85IUTMTmnJQszrKb2bk40dYzrTZ6Z5mbN7ELb38JcUFXXQt+AssYW40Xcu1ZPZ8OiYr71+EGOnm2P+vksWZhheX2BDqHjne5YUebh4Km2uOrD7yRV5bYHnud7vz/sdCiO8PoC416rbCZUxIkIX3/ncjJS3dzzSPTLUWMmCxH5kIjkRzUKE3e8Pj8pLmH59PH9rw+1/njeVnMDgU/Wu2ub+cVLPvon4ILFyho/cz3ZTM5JdzqUpFScl8E337WcD71lPi5XdDeUCufKYg7gFZFfiMgNUY3GxI3KGj9Lp+eTkTq+DqFLp+VTkJVqpaigbXsaADjT1s2rE2zgXzWwTseuKqJr47ISblo6NernGTNZBBfMlQE/Bz4kIkeCq7DnRDk245De/gGq6povaRGV2yVcPd/DzuqzjrdUdpqqsq2qgSvmTiIj1XUhcUwUvnMdNLb3WPPAJBHWmEWw4+uJ4G0AKAF+IyJfj1pkxjEHGlrp6h245PYMFWUeTrd2U30m+oNu8ey12mZONnfy7rUzuX5RMdv3nJpQpajQeIW1+UgO4YxZ/I2IvAz8M1AJlKvq+4FVgG13moS841yMN1RFqPXHBC9FbatqIM3t4oYlU9i0fBqN7d28fHzilKIqa/zkpKewYIotxksG4VxZzADuVNUbVPVBVe2GC1cbt47+VJOIKn3NlORnMK0gvMV4Q82clMWcyVkTeve8gQFl254GrlngIT8zlesWFZGZ6mZrVXzspxwL3ppmVs4swB3lgVcTG+Eki18R2NYUABHJFZG1AKq6N1qBGecEFuNdXumgoszDn4410dN30Z5VE8KuWj8NLV1sKi8BICsthesXF/P43lP09Sf/z6S9u4+Dp1qteWASCSdZ/BAY3A/3PPDv0QnHOO10axcnmzsve8VtRWkRHT397ArWrSearVUNpKW4uGHxlAv3va28hKbzPbw0AUpRVbXNDKhtdpRMwkkWrsFbmga/To1eSMZJr49XXN4nwjfPn4xLmJClqIEBZfueBq5dUERuxutvlWsXFpOV5mZrVfLPigo1oVw105JFsggnWRwXkQ+LiFtEXCLyEQKzokwSqqzxk5biYum0y1uHmZ+ZyoqZBRNykPvVGj+nW7svlKBCMlLd3LB4Co/vbUj6UpTX56esOIf8LPtcmSzCSRYfBDYAp4O3twDvj2ZQxjlen5/y6fmkpVx+J5j1pR6q6ppp6eiNQGSJY1tVPekpLjYMKkGFbCovwd/RywtHmxyILDYGBh
Svr9nWVySZcBblnVbV21XVo6pFqvpuVT0di+BMbHX39bP3ZGvE5sVXlBUxoPDisYlzddE/oGzfe4rrFhaTM8xWtG9ZUEROegrbkrgUdazxPC2dvba+IsmEs84iXUQ+KCL/IiI/DN1iEZyJrb0nW+npH2BVhD4RrppVQHaae0KVol45cY6zbd1sXlEy7OMZqW5uXDKFx/edojdJS1EXxr1m20yoZBJOreFnBPpDbQZeAuYDXVGMyTgk0m/yVLeLK+dNnlCD3Fur6slIdXH9ouIRj9m0vISWzt6kbbbo9fnJz0xlnifH6VBMBIWTLBao6qeBdlX9MbARWBbdsIwTvD4/MydlUpybEbHXrCjzUNPUQe25jrEPTnB9/QM8vvcUGxZNISvt4hJUyPoFHnLTU5J2VlRlTaB5YLS7oJrYCidZhEYnm0VkMZALzI5eSMYJqkpljT/im9SEWpZPhFLUy8fP0djec9EsqKHSU9zcuHQKO/adSrpFiy2dvRw5026bHSWhcJLFj0WkEPgisAM4DHwnqlGZmDvZ3MmZtu6IL6KaX5TD1LyMpC25DLZ1TwNZaW6uWzhyCSpkc3kJbV197KxOri1oQ4swbTFe8hk1WYiIG2hUVb+qPqOqs4Kzov41RvGZGKm8zOaBIxERKso8PH+0Mak7rl4oQS2eQmba2HuAVJQWkZeRfKUor68Zl8CKmTa4nWxGTRaq2g98PEaxGAft8jWTleZm0dTIdwhdX+ahuaOXffUtEX/tePHisSbOne9h0/LRS1AhaSkublo6lSf3naa7rz/K0cWOt8bPwql5w04bNoktnDLUDhH5uIiUiEhe6Bb1yExMeX1+VswoIMUd+W3Zr54ALcu3VTWQnebm2oVFYT9nU3kJbd19/PFwcvxc+geU12qbWWNTZpNSuCu47wZeBvYFb9ZtNol09vSzv741avPiPTnpLC7JY2eSJove/gEe33eKG5dMGdc2tFeXeijISmVbkrQtP3y6jfbuPlu5naTGvFZU1ZmxCMQ4p6qumb4BjeqK2/VlHn76/Ak6e/rDquknkheONtHc0cum8mnjel6q28XGpVP57e56unr7x73febyxnfGSWzgruP/PcLdYBGdio9IX/Q6hFaUeevoHeOl48vVE2rq7ntz0lAvThMdjU3kJ53v6efZw4s+K8tY0Mzk7jVmTspwOxURBOGWo9YNuNwJfB26PZlAmtrw1zcwryqYwOy1q51g3dxJpKa6kK0X19A2w4xJKUCFvnjeZwqzUpJgV5fX5WT27EBFbjJeMwmkk+OFBt/cBK4Gw3hUislFEDolItYjcN8zj3xWR14K3wyLSPOixvxSRI8HbX47nH2XCp6qBN3mU68wZqW7eNKcw6Vp/PF/dSGtX35gL8UaS4naxcVkJTx04TWdP4s6KOne+h+ON5228IoldytSXNmDBWAcF12g8ANwMLAHuEpElg49R1U+o6kpVXQl8H3g0+NxJBBYBXgGsA74YXBhoIqymqYNz53ti8iavKC3i4Kk2zrQlT2uxrVUN5GaksL4s/FlQQ20uL6Gjp58/HDoz9sFxKtRXzMYrklc4Yxa/EpFHg7dfAweAbWG89jqgWlWPqWoP8BBw2yjH3wU8GPz6JuBJVT2nqn7gSQI9qUyEVcbwTR6q6SfLau7uvn6e2H+Km5ZOvaz9P66YOwlPThpb9yRuKcrr85PiEspnXN6mWSZ+hbNy5geDvu4DalT1RBjPmw7UDvq+jsCVwkVEZDYwF3h6lOdOD+OcZpy8Pj+56SmUFUe/Q+iSkjwmZafx3JFG3rFqRtTPF23PHW6k7TJKUCGBUtRUfll5ko6evlGbEMaryho/S6flJfyMLjOycD4OHQGeV9WnVPVZ4LSIhDOddrhRrpH6PdwJbAmuGA/7uSLyARF5VURePXs28WeTOKGyxs/KGHUIdbmEq+ZPZueRRlQTv/XHtj0N5GemcvX88c+CGmrT8ml09vbz9MHEK0X19g9QVdcSsX
1QTHwKJ1k8CgxujTkA/DKM59UBg5PKDGCk1Ud38noJKuznquoPVXWtqq4tKrr0mvFE1dbVy+HTbTEdlFxf5uFMWzdHzrTH7JzR0NXbz5P7T7PxMktQIevmTqIoNz0hd9A72NBGZ2+/jVckuXB+y1OCYw4AqGo3kB7G814BykRkroikEUgIjw09SEQWAoXAi4Pu3gG8VUQKgwPbbw3eZyJod20LAxrbQcmK4EBworf++OPhs7R3X34JKsTtEm5ZNpWnD57hfHdfRF4zVrzWaXZCCCdZNInILaFvRGQzcG6sJ6lqH/BRAn/kDwAPq+o+EfmKiNw66NC7gId0UF1CVc8BXyWQcF4BvhK8z0SQ1+dHBFbOil0vn+kFmczzZLPzSGKXDbdWNVCYlcqb50+O2GtuKp9Gd98ATyVYKaqyxs/UvAym5Udu0ywTf8IZSfsw8AsReYDAuEEj8N5wXlxVtwPbh9z3hSHff2mE5/4n8J/hnMdcmsoaPwuKc8nLSI3peSvKPGyprKOnbyAiJZxY6+rt5/cHTnPbymmkRrDx4trZhRTnprOtqp5bV4yvdYiTAovxCmwxXpILZ1HeYVVdC6wCVqvqOlU9HP3QTDQNDCi7gm/yWKso9dDR03+hfJFo/nDoDB09/WxaHtk/6C6XcMvyEp45dJa2rt6xnxAHzrR2UefvtMV4E0A46yy+KiIFqtqsqs3BcYQvxyI4Ez1Hz7bT2uVMh9Ar50/G7ZKEbf2xtaqBydlpXDlvUsRfe3N5CT19Azx1IDFKUTZeMXGEcw29WVUvtOEILpJ7W/RCMrHg5Js8LyOVlTMLeC4BF+d19PTx1IEzbFw2NSp7f6yeVUhJfkbC9IqqrPGT5naxdJptcZPswvltdwdnMwEgIhlA9DrOmZiorPFTkJXKPE+2I+evKPWwp66Zlo7EKLeEPHPwLJ29/RGbBTVUqBT1x8NnaU2AUpTX18zyGfmkp9hivGQXTrJ4CHgy2NjvLwjMbvpFdMMy0eb1NbN6lnMdQteXeRhQeOFoYl1dbNtTjycnnSvmRm4W1FCbykvo6R/gyX2no3aOSOju62dPXQurYzibzjgnnAHufwS+TWCAew3wreB9JkE1d/RQfabd0UVUK2YWkJOeklClqPPdfTx98Ay3LJ+KO4or3lfNLGB6QSbb4rxX1L76Vnr6B2wx3gQRVtFVVbeq6sdV9WNAo4j8c5TjMlG0qzYwBLXKwU+EqW4XV86blFCD3E8dPENX7wCblkenBBUiItyyfCrPHTkb12W6UKdZmwk1MYSVLERkmYh8TUSOAvcDx6Mblokmb40ft0tYMcPZ8kFFqQffuQ58TR2OxhGubVX1FOems3ZO5GdBDbW5fBq9/coT+09F/VyXyuvzM6Mwk+I8W4w3EYyYLERknoh8RkT2Aj8isBgvVVXXq+r3YhahiTivz8+iqblkpzvb3fRC64/q+F/N3d7dxzOHznLL8pKolqBCymfkM6MwfktRqkplTfQ3zTLxY7Qri2oC+0q8U1WvVNXvEmhRbhJY/4Dymq85LurM84uyKcnPSIhS1FMHTtPTNxC1WVBDiQibykvYeaQR//mesZ8QY/UtXZxu7Y6L3yMTG6Mliz8jcDXxlIj8q4i8heFbh5sEcuhUG+d7+uPiE6GIUFHq4YWjTfQPxHfL8q1VDUzNy2BNDH9um5dPo28gPktRlTZeMeGMmCxU9RFVfReBLVFfAj4NTBWR74vI9bEK0ERWpS++tr+sKPPQ0tnLnpMtTocyotauXp4NlqBise9HyLLpecyenBWXC/S8NX4yU90sKsl1OhQTI+FMnW1T1f9S1Y0E9pg4CHwp2oGZ6NhV48eTk86MwkynQwHg6tLAxkHx3IX29/tP09MfuxJUiIiwaXkJLxxt4lyclaK8Pj/lM/Ij2kjRxLdx/Z9W1UZVfUBVr4lWQCa6vD4/a+KoQ6gnJ50lJXlxvb/FtqoGpuVnsGpm7GePbSovoX9AeXxv/JSiOn
v62V/fGjdXpyY27GPBBNLY3s2Jpo64qzOvL/Pg9fnjctOfls5e/njkLJvKY1uCCllSksdcTzbb9oy0yWTsVdU10zegcfd7ZKLLksUEsssXWIwXb58IK8o89PYrLx+Pv/2tnth3it5+ZVO5M/tLhEpRLx5torG925EYhvIGf4+s0+zEYsliAqms8ZPqFpZNz3c6lDd405xJpKW44rIUtW1PA9MLMlkxw7mf2eYVJQwocVOKqqzxM9eTzaRs6yc6kYSzn4VfRM4NuR0XkUdEZE70QzSR4vX5WTotn4zU+OoQmpHqZt2cSeyMs8V5zR097DzSyObyEkfHeBZOyWV+UTbb4mBWlGpw0ywrQU044VxZfB/4PDAfKAU+B/wU+DXwk6hFZiKqt3+AqrrmuH2TV5R5OHy6ndOtXU6HcsET+07TN6BsdqgEFRJYoDeNl443cabN2Z9PTVMHTed7HNlh0TgrnGTx1uAMKL+qnlPVfwVuVtWfA9FvkmMi4kBDK129A3H7Jq+4MIU2fkpRv62qZ9akLJZNd35jn83l8VGK8sbZOh0TO+E2EnznkK9D1+QD0QjKRF5oxW28vsmXlOQxOTuNnXHSsvzc+R5eONrEJodLUCELpuSyYEqO4wv0Kmv85KSnUFZsi/EmmnCSxXuB9wfHKpqA9wN/LiJZwMejGp2JGK+vmZL8DEry42Mx3lAul3BVqYed1Y2oOt/6Y8e+U/QPaNTbkY/HpuXTeOXEOUdLdV5fM6tmFcSkmaKJL+Gs4K5W1ZtVdZKqTg5+fVhVO1T12VgEaS6ft8Yf91Md15d6ONvWzaHTbU6HwraqBuZMzoqrvaU3lU9FFX7nUCfa9u4+Dp1qZVWcjnuZ6ApnNpRHRD4ZbCb4w9AtFsGZyDjV0sXJ5s64HdwOqSiLj3GLpvZuXjjayObyaXFRggopLc5l0dRcx0pRu2ubGdD4LWWa6AqnDPUbYAqwE3hq0M0kiEQZlJxWkMm8omzH11v8bu8pBpSY94IKx6blJbxa46ehpTPm5w6Ne610oO2JcV44ySJbVe9W1V+o6v+GblGPzESMt8ZPeoqLJSXxU1IZyfpSDy8fP0d3X79jMWyramBeUTaLpsbfIG4ogW3fE/tZUV6fnwUDBu34AAAfkklEQVRTcsjPTI35uY3zwkkWvxORt0Y9EhM1lcEOoWkp8b9gv6KsiM7efrw1zY6c/0xbFy8db2Lz8viYBTXUvKIclpTksa0qtr2iBgY0MO4V56VMEz3h/PX4EPC4iLQHZ0T5RST+mviYYXX19rPvZGvCvMmvnDcJt0scW829I1iC2rzC2YV4o9lUXoLX18zJ5tiVoo41ttPa1Rf3kyRM9ISTLDxAKpAPFAW/L4pmUCZy9tW30NM/kDBv8tyMVFbNLHBskPu3VQ2UFeewYEr8laBCQtN5t8dwoDt0pZcoHzpM5I2YLESkLPjl0hFuJgEk4pu8osxD1ckWmjtiu+HP6dYuXjlxLi4Htgeb48lm2fQ8tsZwCm1ljZ+CrFTmebJjdk4TX0a7srgv+N8Hhrn9IMpxmQiprPEza1IWRbnpTocStvVlHlThhaNNMT3v7/Y0oEpcLcQbyebyaeyubab2XEdMzuf1+Vk1s8CRPT1MfBhtD+6/Dn55vaquH3wDNoTz4iKyUUQOiUi1iNw3wjHvFpH9IrJPRH4x6P5vBe87ICL/IvE42hjnVJVKn5/VsxJrquOKGQXkpqfEfArttj0NLJySS1kcl6BCLpSiYnB10dLRy5Ez7Ql1dWoiL5wxi5fCvO8NRMRN4CrkZmAJcJeILBlyTBnwaeBqVV1KsH2IiFwFXA2UA8uANwFvCSNWM0idv5Ozbd1xv75iqBS3iyvnT47pIHdDSyevnPCzOc5LUCEzJ2WxYkZ+TBbo7apNjHU6JrpGG7MoFpEVQKaILBeR8uCtAsgK47XXAdWqekxVe4CHgNuGHPN+4AFV9QOo6png/QpkAGlAOoEB9tPj+YeZ1xfjJW
J7hvVlHmrPdVLTdD4m5wutW7glQZIFBGZF7TnZEvWfkbfGj0tghS3Gm9BGu7LYRGBsYgZvHK/4DIH9LcYyHagd9H1d8L7BFgALROR5EfmTiGwEUNUXgWeAhuBth6oeGHoCEfmAiLwqIq+ePRtfG+fEA2+Nn6w0d1wuLhtLqGV5rEpR26rqWVySx/yinJicLxJuCZaitkW5FOX1NbNoah7Z6SlRPY+Jb6ONWfwkOD7x16p6zaAxi1tU9ZEwXnu4MYah7URTgDLgWuAu4EciUiAipcBiAolqOnC9iFwzTIw/VNW1qrq2qMhm8w7l9TWzYkYBKe74X4w31FxPNtMLMmMyhfZkcydeX3PClKBCZhRmsWpWQVR30OsfCO6MF6f7oJjYCeevSLGI5AGIyP8VkZdFJJwB7jpg5qDvZwBDl53WAb9R1V5VPQ4cIpA83gH8SVXbVbUd+B1wZRjnNEEdPX3sb2hN2DqziFBR6uGFo430D0S3ZXmoi2sizIIaatPyEvbVt3K8MTqlqMOn2zjf05+wv0cmcsJJFh9Q1dZgy48ZwIeBb4XxvFeAMhGZKyJpwJ3AY0OO+TVwHQS62xIoSx0DfMBbRCRFRFIJDG5fVIYyI6uqa6F/QBP6E2FFmYfWrj6q6qLb+uO3VQ0sm57HnARcQ3ChFBWl9h+h5oE2E8qEkyxCH+tuBn6iqpXhPE9V+4CPAjsI/KF/WFX3ichXROTW4GE7gCYR2U9gjOJeVW0CtgBHgT3AbmC3qv52HP+uCS/0Jl81M3Hf5FeXehCJbsvy2nMd7K5tZtPy+G3vMZppBZmsmV0YtVlRXp8fT04asyaFM6fFJLNwRqx2i8h2Ap/6PysiOVw89jAsVd0ObB9y3xcGfa3A3wdvg4/pBz4YzjnM8Hb5/MwryqYwO83pUC7ZpOw0lk7L47nqRv52Q9nYT7gE2xO4BBWyubyEL/92P9Vn2iktjuwAvbfGz6pZhXHZVNHEVjhXFu8DvgSsU9UOAlNa/3rUZxhHqSpeXzNrkqB0UFFaxC6fn/PdfVF5/W17GlgxI59ZkxP3k/PNy0oQifwCvab2bk40ddh4hQHCKyf1A/MIjFUAZIbzPOOcE00dnDvfkzDNA0ezvsxDb7/y0vHIt/6oaTpPVV1L3PeCGsvU/AzeNHsSWyM8buH1JV5fMRM94Wyr+gMCg9DvDd51Hvi/0QzKXB5vTfKsuF0zu5D0FFdU1luE1ifcksAlqJBN5SUcPt3O4QjuX+71+UlxCeUz8iP2miZxhXOFcJWqfhDoAlDVcwRWVps4Venzk5uRQmkCLTAbSUaqm3VzJ0VlkHtbVQMrZxYwozBxS1AhNy+figgRXXNRWeNn6bQ8MlLdEXtNk7jCSRa9IuIiOKgtIpOBgahGZS6Lt8bPyiTqELq+zMORM+2caumK2GsebzzPvvrWhFuIN5Li3AyumDuJbXsaCMwbuTy9/QNU1TUnRSnTRMZovaFCM6UeAH4JFInIl4GdwDdjEJu5BG1dvRw63ZYUJaiQitLA6vyd1ZG7ugitS0iGElTIpvJpVJ9p51AESlEHGlrp6h2w8QpzwWhXFi8DqOrPgM8B9wN+4A5VfSgGsZlLsLu2BdXkGpRcNDUXT04aO49Erv/X1qoG1swuZFpBZsRe02kbl07FFaFSVDKNe5nIGC1ZXKhhqOo+Vf1nVf2equ6NQVzmElXW+BGBlQm2h8VoXC7h6lIPO6ubIlJiqT7TzsFTbQm9tmI4RbnpXDlvMtuqLr8UVelrZmpeRlIlU3N5RluUVyQifz/Sg6r6T1GIx1wmr8/PguJc8jJSnQ4loipKPfzmtXoOnmpjcUneZb3W9j0NiCRXCSpkc/k0PvOrPRxoaGPJtEv/OXlr/HZVYd5gtCsLN5AD5I5wM3FmYEDx+vxJOSi5viw4bhGBWVFbq+p50+xJTM3PuOzXijc3LZ2C2yVs23Ppay5Ot3ZxsrmTVUl0dWou32hXFg2q+pWYReIQVWXbng
auWVCU8J/Gj55tp62rL+G2UQ3H1PwMSotzeK66kfdfM++SX+fw6TYOn27ny7cujWB08WNyTjpXzZ/M1qoG7nnrwktq02HjFWY4YY1ZJLPjjef5uwd38Q9b9zsdymWrTPI3eUWph5ePN9HV23/Jr7GtKlCCunnZ1AhGFl82LS+hpqmDffWtl/T8yho/aSkulk6zxXjmdaMli3D2rEh484py+PC183n41TqeOXhm7CfEMa/PT2FWKnMTsNV2ONaXeejqHbjwyXe8QleR6+ZMojgv+UpQITctnUqKSy65E63X56d8ej5pKdbVx7xutJ3yzsUyECf93YYyFk7J5b5Hq2jp6HU6nEtWWeNndRJ3CL1i3mRSXMJzl7je4vDpdqrPtCfNQryRFGancXWph2176sc9K6q7r5+9J1uTctzLXB776ACkp7j5zrtX0NTew5d/u8/pcC5Jc0cPR8+eT+o3eU56CqtnFV7yIPfWqnpcAhuXJXeygECvqNpznew52TKu5+092UpPvy3GMxezZBG0bHo+H7mulEd3neSJfaecDmfcdk2QDqEVZR721rfgP98zruepKtuqGrhy3mSKctOjFF38uGnJVFLd4y9FhUp8ibzDookOSxaDfOS6UpaU5PGZX+0d9x8jp3l9ftwuYcXM5B6UrCjzoArPHx3f1cWBhjaONZ5P+Hbk4crPSqWi1DPuBXpen5+ZkzIpzk3eMR1zaSxZDJKW4uI7715BS2cPX3gsscpRlTV+FpfkkpUWzuaHiat8ej65GSnjLkVt21OP2yVsXJq8s6CG2lw+jZPNnbxWG94e5qp6YdzLmKEsWQyxuCSPj20o47e76yO+81i09PUPsLu2eUK8yVPcLq6aP5nnjjSG/YlZVdla1cBV8yczOSf5S1AhNyyZQprbFXavqJPNnZxp607aqdfm8liyGMaH3jKf8hn5fO7Xe2ls73Y6nDEdOt3G+Z7+CfMmrygr4mRzJyeaOsI6fl99KzVNHUnXC2os+ZmpXLPAw7Y9DQwMjJ1YQ+t0JsKHDjN+liyGkeJ28Z07VtDe1cfnfrU3Is3rommibX+5vtQDEHYX2q1VDbhdwk0TqAQVsqm8hIaWLnbVjr02ZZevmcxUN4umWjcfczFLFiMom5LL3791AY/vO8VjuyO7t3GkeWv8FOWmM6NwYnQInT05ixmFmWFttRpYiFfP1aUeCrMn3gaPNyyeQlqKK6xZUZU1flbMzCfFbX8WzMXst2IU718/j1WzCvjCb/ZxpjVyu7RFmtfnZ/WsgqRdjDeUiLC+zMOLR5vo6x9908aquhZqz3WyeYKVoEJyM1K5dkER28coRXX09LG/oXXCXJ2a8bNkMQq3S7j/jhV09fbzmV/tictyVGN7NzVNHRNmvCKkorSItu4+dteNvuhs254GUt0TswQVsqm8hNOt3VT6Ri5FVdW10D+gE+73yITPksUY5hfl8MmNi/j9gTP80nvS6XAu4p2gg5JXzZ+MyOgty0ML8SpKPeRnJXZH4cuxYfEU0lNcbB2lnOoNJpJVE+z3yITPkkUY3nfVHNbNmcSXf7uPhpZOp8N5g0qfn1S3sGx6ci/GG6owO43l0/PZWT3yIPdrtc2cbO5kU/m0GEYWf3LSU7huYTHb956if4RSlLemmXmebCZNwHEdEx5LFmFwuYRv31FOX79y3y/jqxy1q6aZpdPyyUh1Ox1KzFWUetjla6a9u2/Yx7dVNZDmdnHjkikxjiz+bF5Rwtm2bl45cXF/UNXApll2VWFGY8kiTLMnZ/PpWxbx7OGz/O8rtU6HA0BP3wC765onbJ25osxD34Dyp6NNFz02MBDa1MpDfubELUGFXL+omIzU4Rfo1TR1cO58z4T9PTLhsWQxDu+9YjZvnjeZf9h2gDp/eAvCoulAQyvdfRO3Q+ia2YVkprrZOUzL8l21fhpauiZML6ixZKWlsGHRFH63t+GiGWSV1jzQhMGSxTi4XMK3bi9HVfnUL6vCWhUbTR
P9TZ6e4mbd3EnDJoutVQ2kpbi4YbGVoEI2lZfQ2N7Dy8ffWIry+vzkpqdQVmyL8czIoposRGSjiBwSkWoRuW+EY94tIvtFZJ+I/GLQ/bNE5AkRORB8fE40Yw3XzElZfHbTEp6vbuLnL/scjcXr8zMtP4OS/ImxGG8468s8VJ9pf8PEg4EBZfueBt6yoIjcBN9XPZKuW1hMVpqbrUN6nlXW+Fk5qwC3a2Ks0zGXJmrJQkTcwAPAzcAS4C4RWTLkmDLg08DVqroU+Pigh38GfFtVFwPrgLjZ8/SudTNZX+bh69sP4AuzP1E07PI1s2qC15krykKtP16/uni1xs/p1u6k3xFvvDLT3GxYPIXH9566UIpq6+rl8Om2CVvKNOGL5pXFOqBaVY+pag/wEHDbkGPeDzygqn4AVT0DEEwqKar6ZPD+dlV1fpAgSET45rvKcYtwz5bdjpSjTrV0cbK5kzUT/E2+cEounpz0N5SitlXVk57iYoOVoC6yaXkJ58738KdjgVLU7toWBpSk3mHRREY0k8V0YPC0obrgfYMtABaIyPMi8icR2Tjo/mYReVREdonIt4NXKnFjWkEmX3jbEl4+fo6fvnAi5ucPLaKa6G9yEaGidDLPVzcyMKD0Dyjb957iuoXF5KQn994el+LahUVkp7nZWhVYoOf1+RGBlTMn5riXCV80k8VwBdChH8FTgDLgWuAu4EciUhC8fz1wD/AmYB7wVxedQOQDIvKqiLx69mx4HUgj6fY1M7h+UTHf2nGQY2fbY3ruyho/6SkulpTkxfS88aiirIjG9h4OnmrjlRPnONvWbbOgRpCR6uaGJVN4fN8pevsHqKzxU1acY9OLzZiimSzqgJmDvp8BDO03UAf8RlV7VfU4cIhA8qgDdgVLWH3Ar4HVQ0+gqj9U1bWquraoqCgq/4jRiAhff+dy0lPc3PPI7hFXx0aD1+enfEY+aSk2oa0i1LK8+ixbq+rJSHVx/aJih6OKX5uWl9Dc0cvO6kZ2+fy2vsKEJZp/aV4BykRkroikAXcCjw055tfAdQAi4iFQfjoWfG6hiIQywPXA/ijGesmm5GXw5VuX4vU18+Odx2Jyzq7efvaebJnwJaiQqfkZlBXn8Ozhszy+9xQbFk0h20pQI7pmQRG56Sn8y1NHaO3qs5XbJixRSxbBK4KPAjuAA8DDqrpPRL4iIrcGD9sBNInIfuAZ4F5VbVLVfgIlqKdEZA+BktZ/RCvWy3Xbymm8dckU7n/iMNVn2qJ+vn31LfT2q81gGaSizMPz1U00tvdYCWoMGalublwyhV3BTbPsysKEI6o1DFXdrqoLVHW+qn4teN8XVPWx4Neqqn+vqktUdbmqPjTouU+qannw/r8KzqiKSyLC196xnOw0N3c/vHvMPRYul21/ebH1wSm0malurltoJaixhBJqQVYq8zzZDkdjEoEVvCOkKDedr759GbvrWvj3P0a3HOWtaWbWpCyKctOjep5EcsXcyaS5XWxYXExmWlxNnItLFWUe8jJSWDOrcMJsmmUujxV2I2hz+TR+t/cU3/v9YTYsLmbR1MjPVFJVKn3+C4O6JiA7PYX//ut1zLFPyWFJT3Hz3399hbUkN2GzK4sI++pty8jPTOXuh3fTG4VyVJ2/k7Nt3ayeZfPih7pi3mSm5GU4HUbCWDGzgJmTspwOwyQISxYRNik7jX94+3L21bfywDPVEX99W4xnjHGCJYso2LhsKm9fOY0fPF3N3pOj7xE9Xt4aP1lpbhZOsQ6hxpjYsWQRJV+6dSmTstO455HddPf1R+x1K31+Vs4sIMVt/+uMMbFjf3GipCArja+/czkHT7Xx/aciU47q6OnjQIN1CDXGxJ4liyjasHgKt6+Zwb89e5Tdtc2X/Xq7a1voH1BbRGWMiTlLFlH2+c1LKM5N5+5HdtPVe3nlqNDg9iqbCWWMiTFLFlGWn5nKN95VTvWZdr77+8OX9VreGj/zi7IpyLK58caY2LJkEQNvWV
DEXetm8h9/PHahVcd4qSpen9/GK4wxjrBkESOf3bSEkvxM7nlkN5094y9HHW88j7+j18YrjDGOsGQRIznpKXz79nKON57n2zsOjfv53mCHUFuMZ4xxgiWLGLqq1MNfvHk2P3nhOC8daxrXcytr/ORmpFBalBOl6IwxZmSWLGLsUxsXMbMwi3u3VHG+uy/s5+3y+Vk1qxCXyzqEGmNiz5JFjGWnp3D/HSuo9XfwzccPhvWc1q5eDp1uY40NbhtjHGLJwgHr5k7ifVfN5Wcv1vB8deOYx++ubUYVVs+29RXGGGdYsnDIvTctZK4nm09uqaKtq3fUYytr/IjAypmWLIwxzrBk4ZDMNDf337GChpZO/nH76OUor6+ZhVNyyc1IjVF0xhjzRpYsHLRmdiHvXz+PB1/28ezhs8MeMzCgFwa3jTHGKZYsHPaJGxdQWpzDp7ZU0dJ5cTmq+mw7bV19thjPGOMoSxYOy0h18507VnC2vZuvbt1/0ePeYHsQ20bVGOMkSxZxYMXMAj78lvlsqazjqQOn3/BYZY2fwqxU5nqyHYrOGGMsWcSNv91QyqKpudz36B6aO3ou3B9qHihii/GMMc6xZBEn0lMCs6P853v40mP7AGju6OHo2fPWD8oY47gUpwMwr1s2PZ+PXl/K935/hI3LSkhPCeRya0tujHGaXVnEmY9cV8rSaXl89ld7ePLAadwuYcXMfKfDMsZMcJYs4kyq28V33r2C1q5efvGSj8UluWSl2QWgMcZZlizi0KKpeXz8hgUA1jzQGBMX7CNrnPrgNfNo7+7j7SunOx2KMcZYsohXKW4Xn9q4yOkwjDEGiHIZSkQ2isghEakWkftGOObdIrJfRPaJyC+GPJYnIidF5AfRjNMYY8zoonZlISJu4AHgRqAOeEVEHlPV/YOOKQM+DVytqn4RKR7yMl8Fno1WjMYYY8ITzSuLdUC1qh5T1R7gIeC2Ice8H3hAVf0Aqnom9ICIrAGmAE9EMUZjjDFhiGaymA7UDvq+LnjfYAuABSLyvIj8SUQ2AoiIC/gOcG8U4zPGGBOmaA5wD9fMSIc5fxlwLTADeE5ElgHvBbarau1oPZFE5APABwBmzZoVgZCNMcYMJ5rJog6YOej7GUD9MMf8SVV7geMicohA8ngzsF5E/gbIAdJEpF1V3zBIrqo/BH4IsHbt2qGJyBhjTIREswz1ClAmInNFJA24E3hsyDG/Bq4DEBEPgbLUMVV9j6rOUtU5wD3Az4YmCmOMMbETtWShqn3AR4EdwAHgYVXdJyJfEZFbg4ftAJpEZD/wDHCvqjZFKyZjjDGXRlSTo3ojImeBmst4CQ/QGKFwIsniGh+La3wsrvFJxrhmq2rRWAclTbK4XCLyqqqudTqOoSyu8bG4xsfiGp+JHJc1EjTGGDMmSxbGGGPGZMnidT90OoARWFzjY3GNj8U1PhM2LhuzMMYYMya7sjDGGDOmCZ8sROQ/ReSMiOx1OpYQEZkpIs+IyIFg6/aPOR0TgIhkiMjLIrI7GNeXnY5pMBFxi8guEdnqdCwhInJCRPaIyGsi8qrT8YSISIGIbBGRg8Hfszc7HROAiCwM/qxCt1YR+XgcxPWJ4O/8XhF5UEQynI4JQEQ+FoxpX7R/ThO+DCUi1wDtBFaJL3M6HgARKQFKVNUrIrlAJfD2we3dHYpLgGxVbReRVGAn8DFV/ZOTcYWIyN8Da4E8Vd3sdDwQSBbAWlWNq7n5IvJfwHOq+qNgh4UsVW12Oq7BgtscnASuUNXLWUN1uXFMJ/C7vkRVO0XkYQK9637qVEzBuJYR6Oa9DugBHgc+rKpHonG+CX9loap/BM45Hcdgqtqgqt7g120EVsA7vr+qBrQHv00N3uLi04aIzAA2AT9yOpZ4JyJ5wDXAjwFUtSfeEkXQBuCok4likBQgU0RSgCwu7nPnhMUEeut1BDtmPAu8I1onm/DJIt6JyBxgFfCSs5EEBEs9rwFngCdVNS7iAr4HfBIYcD
qQIRR4QkQqg12S48E84Czwk2DZ7kciku10UMO4E3jQ6SBU9SRwP+ADGoAWVY2HfXb2AteIyGQRyQJu4Y3NWyPKkkUcE5Ec4JfAx1W11el4AFS1X1VXEugivC54KewoEdkMnFHVSqdjGcbVqroauBn4SLDs6bQUYDXwb6q6CjgPxFWjzmBp7FbgkTiIpZDAxm1zgWlAtoi819moQFUPAN8EniRQgtoN9EXrfJYs4lRwTOCXwM9V9VGn4xkqWLb4A7DR4VAArgZuDY4PPARcLyL/42xIAapaH/zvGeBXBOrLTqsD6gZdFW4hkDziyc2AV1VPOx0IcANwXFXPBrdTeBS4yuGYAFDVH6vqalW9hkA5PSrjFWDJIi4FB5J/DBxQ1X9yOp4QESkSkYLg15kE3kQHnY0KVPXTqjoj2NL+TuBpVXX8k5+IZAcnKBAs87yVQOnAUap6CqgVkYXBuzYAjk6eGMZdxEEJKsgHXCkiWcH35gYC44iOE5Hi4H9nAe8kij+zaG5+lBBE5EECO/V5RKQO+KKq/tjZqLga+HNgT3B8AOAzqrrdwZgASoD/Cs5ScRFoOx8301Tj0BTgV8HdHlOAX6jq486GdMHfAj8PlnuOAe9zOJ4LgvX3G4EPOh0LgKq+JCJbAC+BMs8u4mcl9y9FZDLQC3xEVf3ROtGEnzprjDFmbFaGMsYYMyZLFsYYY8ZkycIYY8yYLFkYY4wZkyULY4wxY7JkYRKKiPxBRG4act/HReRfx3he+2iPRyCuIhF5Kdg+Y/2Qx/4gImuDX88RkSND/w3Bx74d7B767UuM4drBHXdF5B9EZIeIpAdjeHXQY2tF5A+Dnqci8rZBj28VkWsvJQ6TnCxZmETzIIGFd4PFQw+hDcBBVV2lqs8Nd0Cw2eEO4G5V3THMIR8EVqvqveGcMNjUbqTHPktgvc7bVbU7eHexiNw8wlPqgM+Gc14zMVmyMIlmC7BZRNLhQqPFacBOEckRkadExBvcQ+K2oU8e5tP3D0Tkr4JfrxGRZ4NN/3YEW8UPff7s4Dmqgv+dJSIrgW8Bt0hgD4bMYeKeCjwBfE5VHxvmdR8DsoGXROTPhjtP8Lifisg/icgzBPoCXURE7ibQVO5tqto56KFvA58b7jkE+gq1iMiNIzxuJjhLFiahqGoT8DKv96S6E/hfDawu7QLeEWzcdx3wnWB7hjEFe3F9H7hdVdcA/wl8bZhDf0Bg75Ny4OfAv6jqa8AXgnGsHPIHOuRnwA9UddjGeKp6K9AZfP7/DneeQYcvAG5Q1buHeamrgQ8BNw9qJx/yItAtItcNFwPwD4ycTMwEZ8nCJKLBpajBJSgB/lFEqoDfE9gDZEqYr7kQWAY8GWyx8jkCnXWHejPwi+DX/w1UhPn6vwf+PNjKIhyjnecRVe0f4XnVBH4Obx3h8RETQqh8NnTMxRiwZGES06+BDSKyGsgMbRQFvAcoAtYE26ifBoZuf9nHG3/vQ48LsC/4yX6lqi5X1ZH+4A4Wbr+cbxHYk+SR0cYawjzP+VGOO02gBPXd4a4gVPVpAv/mK0d4/tewsQszDEsWJuEEyyt/IFAqGjywnU9gX4ve4B/K2cM8vQZYEpwhlE9gYBrgEFAkwb2oRSRVRJYO8/wXeP2q5j0EttsM1yeAVuDHYZTHLvk8qnqYQAfS/wmOpwz1NQIbRQ333CeAQmBFuOczE4MlC5OoHiTwB+2hQff9HFgbnCL6HoZpn66qtcDDQFXw+F3B+3uA24Fvishu4DWG37Pg74D3BUtdfw58LNyAg+Mqf0mge++3xjj8ks8TPNcrBDrJPiYi84c8tp3ATnkj+RrDl+DMBGZdZ40xxozJriyMMcaMyZKFMcaYMVmyMMYYMyZLFsYYY8ZkycIYY8yYLFkYY4wZkyULY4wxY7JkYYwxZkz/D0/v+mWohuXrAAAAAElFTkSuQmCC\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# K-Nearest Neighbors\nfrom sklearn.neighbors import KNeighborsClassifier\n\n# Fit the classifier with k = 7 on the training set\nKNN = KNeighborsClassifier(n_neighbors=7).fit(X_train, y_train)\nKNN",
"execution_count": 55,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 55,
"data": {
"text/plain": "KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',\n metric_params=None, n_jobs=None, n_neighbors=7, p=2,\n weights='uniform')"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Decision Tree"
},
{
"metadata": {},
"cell_type": "code",
"source": "# Find the best tree depth\nfrom sklearn.tree import DecisionTreeClassifier\nfrom sklearn.metrics import f1_score\n# Note: jaccard_similarity_score was deprecated in scikit-learn 0.21 and removed in 0.23;\n# for single-label targets like this one it is equivalent to plain accuracy.\nfrom sklearn.metrics import jaccard_similarity_score\n\n# Compare weighted F1 and Jaccard scores on the test set for max_depth = 3, 4 and 5\nd_range = range(3, 6)\nf1 = []\nja = []\nfor d in d_range:\n    DT = DecisionTreeClassifier(criterion=\"entropy\", max_depth=d)\n    DT.fit(X_train, y_train)\n    dt_yhat = DT.predict(X_test)\n    f1.append(f1_score(y_test, dt_yhat, average='weighted'))\n    ja.append(jaccard_similarity_score(y_test, dt_yhat))\n\n# One row per depth, one column per metric\nresult = pd.DataFrame(f1, index=['d=3', 'd=4', 'd=5'])\nresult.columns = ['F1-score']\nresult.insert(loc=1, column='Jaccard', value=ja)\nresult.columns.name = \"Depth\"\nresult",
"execution_count": 56,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 56,
"data": {
"text/plain": "Depth F1-score Jaccard\nd=3 0.620577 0.585714\nd=4 0.620577 0.585714\nd=5 0.648789 0.614286",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th>Depth</th>\n <th>F1-score</th>\n <th>Jaccard</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>d=3</th>\n <td>0.620577</td>\n <td>0.585714</td>\n </tr>\n <tr>\n <th>d=4</th>\n <td>0.620577</td>\n <td>0.585714</td>\n </tr>\n <tr>\n <th>d=5</th>\n <td>0.648789</td>\n <td>0.614286</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# Decision Tree\nfrom sklearn.tree import DecisionTreeClassifier\n\n# max_depth=5 gave the highest F1 and Jaccard scores in the comparison above\nDT = DecisionTreeClassifier(criterion=\"entropy\", max_depth=5)\n# Fit on the training set\nDT.fit(X_train, y_train)\nDT",
"execution_count": 57,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 57,
"data": {
"text/plain": "DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=5,\n max_features=None, max_leaf_nodes=None,\n min_impurity_decrease=0.0, min_impurity_split=None,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n splitter='best')"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Support Vector Machine"
},
{
"metadata": {},
"cell_type": "code",
"source": "# Support Vector Machine\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import svm\nfrom sklearn.metrics import f1_score\n%matplotlib inline\n\nfunc_list = ['linear', 'poly', 'rbf', 'sigmoid']\n# Weighted F1-score on the test set for each kernel (despite the variable name)\naccuracy_score = []\n\nfor func in func_list:\n    SVM = svm.SVC(kernel=func)\n    SVM.fit(X_train, y_train)\n    svm_yhat = SVM.predict(X_test)\n    accuracy_score.append(f1_score(y_test, svm_yhat, average='weighted'))\n\n# Plot the comparison among the 4 kernel functions\ny_pos = np.arange(len(func_list))\nplt.bar(y_pos, accuracy_score, align='center', alpha=0.5)\nplt.xticks(y_pos, func_list)\nplt.ylabel('Weighted F1-score')\nplt.xlabel('Kernel Functions')\nplt.title('F1-score Comparison for 4 Kernel Functions')\nplt.show()",
"execution_count": 58,
"outputs": [
{
"output_type": "stream",
"text": "/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.\n 'precision', 'predicted', average, warn_for)\n",
"name": "stderr"
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEWCAYAAACJ0YulAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3XmYHVW57/Hvj4RAEGRKjgqJJEpQg0CQECcQEPAwmeBwNVGPoAyHc42RSQ8qIjeeixccuD4Sh4AIyBCBwxAwGgYZBAHTYC4QQiCEIW1Em3mGhLz3j7W6qGx2997d6eqdJr/P8+ynq1atqnqruna9u1btWlsRgZmZGcA6rQ7AzMzWHE4KZmZWcFIwM7OCk4KZmRWcFMzMrOCkYGZmBScFe8OT9C1JZ/TzOj8s6X5Jz0k6sD/XvaaRFJK2bnUcXWnF8bEmc1LoA5Kul/SkpPVaHUtVlEyTdLek5yW1S7pI0natjq2RiDgpIg7t59VOB06LiA0j4rK+WqikIZLuldTeTZ3dy9PzPJdIulnSm/sqlr6Q3zsv5eTZ+fpghevbvXbftej4WGM5KawmSaOAXYEAJvbzugf34+p+AnwNmAZsBmwDXAbs348x9Fg/76OyrYAFvZmxQcxfB/7Zg2WtB1wCbAJ8LCKe6cNY+srUnDw7X7f0wzqtKxHh12q8gBOAm4EfA1fWTBsK/Ah4GHgauAkYmqftAvwZeApYChycy68HDi0t42DgptJ4AF8B7gcezGU/yct4Brgd2LVUfxDwLeAB4Nk8fSQwA/hRTbxXAEfW2cYxwKvAhG72w8bAOUBH3t7jgXVK23AzcGre3iXAh3L5UtJJ7qDSss4CfgFcnWO+AdiqNL277T0RuBg4N08/NJedm6evn6c9nmOZB7wlT9sCmA08ASwGDqtZ7oV5G58lnfDHd7EvHgBWAi8CzwHrNbHsVWLuYrmjgYXAvkB7N/+L3YF2YAPgKmAu+bjL09cBjstxPp63a7M8bRTpGDsEeAS4sVR2UC57DPh2aXkTgFvy/vw7cBowpOaY3bqLWK+vt72ldQ6uVzcfOzcBPwSeBB4E9i3V3Qz4NbAsT78MeFP+n6zM/5fn8v+lOD7yvBPz//epvM73lKY9BBwL3El6T/8WWD9PGwZcmed7AvgT+T0wkF4tD2Cgv/Ib/H8COwHLySeYPG1GPqi2JJ2cP5RPEG8nnVimAOsCmwPj8jyrvEmonxSuzgd9Z4L5Ql7GYOAY4NHSgfp14C7gXYCAHXLdCfkN03niHga8UI6/tM4jgIcb7IdzgMuBjfIb+j7gkNI2rAC+lPfDf5FOLjPy/vhY3h8b5vpn5fGP5Ok/qdkH3W3vifn/cCDp5DeUVZPCv5OS3wY5lp2AN+dpNwA/IyWOcaQEt2dpuS8B++X5vg/c2s3+eAjYqzTeaNmrxNzFMq8EPkE+6Xez7t3z8m8gJaL1aqYfCdwKjMj795fABXnaKNIxdg7pJDq0VHZ6Ht8BeJl8ssz78AP5/zGKlLiOLK2vqqSwHDgs/z/+g3Q8K0//HemEvSnpPbZbad+016yrfHxsAzwP7J3n+wbpPT6k9H/9CymZbJa39Yg87fukDzPr5teunfEMpFfLAxjIL9Kn/eXAsDx+L3BUHl6H9KlkhzrzfRO4tItlrvImoX5S+GiDuJ7sXC+wCJjURb2FwN55eCowp4t636b7E+CgfJIYWyr7d+D60jbcX5q2Xd6OcgJ9nNcS41nArNK0DUlXKiOb2N4TgRtrppff9F8mXaFtX1NnZF7HRqWy7wNnlZZxTWnaWODFbvbJQ+Sk0OSyb+xqWbnOJ4A/5OHdaZwUXgJeAT7Vxf99z9L42/Jx3HlSD+AdpemdZSNKZX8BJnex/iMpHd80TgovkD5dPwXcUbPO7pLC4tK0DXL9t+btWQls2sW+6S4pfAe4sDRtHeBvwO6l/+sXStNPAX6Rh6eTPhjV3d
aB8vI9hdVzEHBVRDyWx8/PZZA+ea9PukSvNbKL8mYtLY9IOkbSQklPS3qK1JQzrIl1nU361E3++5su6j1OeqN1ZRgwhNRs1Olh0hVSp3+Uhl8EiIjasg1L48U2RsRzpMvxLaDh9q4ybx2/ITWnzJK0TNIpktbNy34iIp7tZhseLQ2/AKzfZJt7M8vuMmZJbyKdfL7axLo6PQZMBs6W9K8107YCLpX0VN5/C0lJ6y0N4qnd/g1zfNtIulLSo5KeAU5i1f9HI9MiYpP8el8P5iviiYgX8uCGpGP+iYh4sgfL6rQFpeM4IlaS9kV3x0HncfsD0lXFVZKWSDquF+tvOSeFXpI0FPgMsFt+MzwKHAXsIGkH0pvyJeCddWZf2kU5pEvXDUrjb61TJ0px7Ar8Z45l04jYhNTWqSbWdS4wKcf7HlK7az3XAiMkje9i+mOkT5pblcreTvqE1VsjOwckbUi6VF/WxPZCaf/UiojlEfG/ImIsqTnvAOCLpKaHzSRt1Ifb0KmZZXcZM+mezijgT/k4uwR4Wz7uRnU1U0RcQmpeuVjSHqVJS0nt75uUXutHRLPx1Po56Sp5TES8mXQPS93P0tDz+W+j90I9S0n7e5M60xpt1zJKx7EkkY7FhsdBRDwbEcdExDuAjwNHS9qzyZjXGE4KvXcg6dPVWFIb8TjSifVPwBfzJ4wzgR9L2kLSIEkfzN8GOQ/YS9JnJA2WtLmkcXm584FPStogf7f7kAZxbERqr+8ABks6ASh/7fAM4HuSxuSvlW4vaXOAiGgn3Wj9DfDfEfFivRVExP2k9vAL8lf6hkhaX9JkScdFxKukm5X/W9JGkrYCjiYlnd7aT9IukoYA3wNui4ilTWxvtyTtIWk7SYNIN3WXA6/mZf8Z+H7etu1J+/681dgGAPpg2XeTTkydx9mhpCuvcXR/VUREXEBqGrxc0odz8S9I/6utACQNlzSpZ1u1io1I+/I5Se8mte+vlojoIJ2Iv5DfO1+m6w83tfP+Hfg98DNJm0paV9JH8uR/AJtL2riL2S8E9pe0Z76CPIbUNPrnRuuVdICkrXMieYZ0fni1mZjXJE4KvXcQ8OuIeCQiHu18kb558fncrHAs6SbvPFLzx8mkG7uPkG5YHpPL55Nu3kH6hs4rpIP3bBqfOOaS3gD3kS57X2LVE8WPSQf6VaQD9Vekm4Wdzia18XfVdNRpWt62GaS23wdI7dxX5OlfJX26W0L6Vsj5pKTYW+cD3yXtn52Az+fyRtvbyFtJ3/R5htRscgOvJa8ppE/ky4BLge9GxNWrsQ1lvV52RKyoOcaeAFbm8YYnnYg4m3Ss/U7SBNKN+9mkZo5nSTed39+bjcqOBT5H+nLA6aQbvH3hMNIXJR4HtqWJE3PJv5ES/r2kb7cdCRAR9wIXAEty89kW5ZkiYhGpKfWnpCvgjwMfj4hXmljnGOAa0reabgF+FhHX9yDmNULnnXpbS+VPUOcCo/LVTctJOot0M/D4VsditrbxlcJaLF8efw04Y01JCGbWWk4KaylJ7yE1A70N+L8tDsfM1hBuPjIzs4KvFMzMrNCqzsJ6bdiwYTFq1KhWh2FmNqDcfvvtj0XE8Eb1BlxSGDVqFG1tba0Ow8xsQJH0cONabj4yM7MSJwUzMys4KZiZWcFJwczMCk4KZmZWcFIwM7OCk4KZmRWcFMzMrOCkYGZmhQH3RLPZQHXq1fe1OoSWO2rvbVodgjXgKwUzMys4KZiZWcFJwczMCpUmBUn7SFokabGk4+pMP1XS/Py6T9JTVcZjZmbdq+xGs6RBwAxgb6AdmCdpdkTc01knIo4q1f8qsGNV8ZiZWWNVXilMABZHxJKIeAWYBUzqpv4U4IIK4zEzswaqTApbAktL4+257HUkbQWMBv5YYTxmZtZAlUlBdcqii7qTgYsj4tW6C5IOl9Qmqa2jo6PPAjQzs1VVmRTagZGl8RHAsi7qTqabpqOImBkR4yNi/PDhDX9i1MzMeqnKJ5
rnAWMkjQb+Rjrxf662kqR3AZsCt1QYC+AnSv00qZk1UllSiIgVkqYCc4FBwJkRsUDSdKAtImbnqlOAWRHRVdOSrSGcVJ1U7Y2v0r6PImIOMKem7ISa8ROrjMHMzJrnDvHMbMDw1Wr1V6vu5sLMzApOCmZmVnBSMDOzgpOCmZkVnBTMzKzgpGBmZgUnBTMzKzgpmJlZwUnBzMwKTgpmZlZwUjAzs4KTgpmZFZwUzMys4KRgZmYFJwUzMys4KZiZWcFJwczMCk4KZmZWcFIwM7NCpUlB0j6SFklaLOm4Lup8RtI9khZIOr/KeMzMrHuDq1qwpEHADGBvoB2YJ2l2RNxTqjMG+Cbw4Yh4UtK/VBWPmZk1VuWVwgRgcUQsiYhXgFnApJo6hwEzIuJJgIj4Z4XxmJlZA1UmhS2BpaXx9lxWtg2wjaSbJd0qaZ96C5J0uKQ2SW0dHR0VhWtmZlUmBdUpi5rxwcAYYHdgCnCGpE1eN1PEzIgYHxHjhw8f3ueBmplZUmVSaAdGlsZHAMvq1Lk8IpZHxIPAIlKSMDOzFqgyKcwDxkgaLWkIMBmYXVPnMmAPAEnDSM1JSyqMyczMulFZUoiIFcBUYC6wELgwIhZImi5pYq42F3hc0j3AdcDXI+LxqmIyM7PuVfaVVICImAPMqSk7oTQcwNH5ZWZmLeYnms3MrOCkYGZmBScFMzMrOCmYmVnBScHMzApOCmZmVnBSMDOzgpOCmZkVnBTMzKzgpGBmZgUnBTMzKzgpmJlZwUnBzMwKTgpmZlZwUjAzs4KTgpmZFZwUzMys4KRgZmYFJwUzMytUmhQk7SNpkaTFko6rM/1gSR2S5ufXoVXGY2Zm3Rtc1YIlDQJmAHsD7cA8SbMj4p6aqr+NiKlVxWFmZs2r8kphArA4IpZExCvALGBSheszM7PVVGVS2BJYWhpvz2W1PiXpTkkXSxpZb0GSDpfUJqmto6OjiljNzIxqk4LqlEXN+BXAqIjYHrgGOLvegiJiZkSMj4jxw4cP7+MwzcysU5VJoR0of/IfASwrV4iIxyPi5Tx6OrBThfGYmVkDVSaFecAYSaMlDQEmA7PLFSS9rTQ6EVhYYTxmZtZAZd8+iogVkqYCc4FBwJkRsUDSdKAtImYD0yRNBFYATwAHVxWPmZk1VllSAIiIOcCcmrITSsPfBL5ZZQxmZtY8P9FsZmYFJwUzMys4KZiZWcFJwczMCk4KZmZWcFIwM7OCk4KZmRWcFMzMrNAwKUiaKmnT/gjGzMxaq5krhbeSfiDnwvxLavV6PzUzszeAhkkhIo4HxgC/IvVNdL+kkyS9s+LYzMysnzV1TyEiAng0v1YAmwIXSzqlwtjMzKyfNewQT9I04CDgMeAM4OsRsVzSOsD9wDeqDdHMzPpLM72kDgM+GREPlwsjYqWkA6oJy8zMWqGZ5qM5pN86AEDSRpLeDxAR/lEcM7M3kGaSws+B50rjz+cyMzN7g2kmKSjfaAZSsxEV/ziPmZm1RjNJYYmkaZLWza+vAUuqDszMzPpfM0nhCOBDwN+AduD9wOFVBmVmZq3RsBkoIv4JTO6HWMzMrMWaeU5hfeAQYFtg/c7yiPhyE/PuA/wEGAScERH/p4t6nwYuAnaOiLbmQjczs77WTPPRb0j9H/0rcAMwAni20UySBgEzgH2BscAUSWPr1NsImAbc1nzYZmZWhWaSwtYR8R3g+Yg4G9gf2K6J+SYAiyNiSUS8AswCJtWp9z3gFOClJmM2M7OKNJMUlue/T0l6L7AxMKqJ+bYElpbG23NZQdKOwMiIuLK7BUk6XFKbpLaOjo4mVm1mZr3RTFKYmX9P4XhgNnAPcHIT89XrYrt43iH3nXQqcEyjBUXEzIgYHxHjhw8f3sSqzcysN7q90ZxP3M9ExJPAjcA7erDsdmBkaXwEsKw0vhHwXuD6/BMNbwVmS5rom81mZq3R7ZVCfnp5ai+XPQ8YI2m0pCGkr7XOLi376YgYFhGjImIUcCvghG
Bm1kLNNB9dLelYSSMlbdb5ajRTRKwgJZS5wELgwohYIGm6pImrGbeZmVWgmT6MOp9H+EqpLGiiKSki5pB6WS2XndBF3d2biMXMzCrUzBPNo/sjEDMza71mnmj+Yr3yiDin78MxM7NWaqb5aOfS8PrAnsAdgJOCmdkbTDPNR18tj0vamNT1hZmZvcE08+2jWi8AY/o6EDMza71m7ilcwWtPIq9D6tzuwiqDMjOz1mjmnsIPS8MrgIcjor2ieMzMrIWaSQqPAH+PiJcAJA2VNCoiHqo0MjMz63fN3FO4CFhZGn81l5mZ2RtMM0lhcP49BADy8JDqQjIzs1ZpJil0lPsqkjQJeKy6kMzMrFWauadwBHCepNPyeDtQ9ylnMzMb2Jp5eO0B4AOSNgQUEQ1/n9nMzAamhs1Hkk6StElEPBcRz0raVNJ/9UdwZmbWv5q5p7BvRDzVOZJ/hW2/6kIyM7NWaSYpDJK0XueIpKHAet3UNzOzAaqZG83nAtdK+nUe/xJwdnUhmZlZqzRzo/kUSXcCewEC/gBsVXVgZmbW/5rtJfVR0lPNnyL9nsLCyiIyM7OW6TIpSNpG0gmSFgKnAUtJX0ndIyJO62q+mmXsI2mRpMWSjqsz/QhJd0maL+kmSWN7vSVmZrbaurtSuJd0VfDxiNglIn5K6veoKZIGATOAfUndbU+pc9I/PyK2i4hxwCnAj3sUvZmZ9anuksKnSM1G10k6XdKepHsKzZoALI6IJbm/pFnApHKFiHimNPomXvvdBjMza4Euk0JEXBoRnwXeDVwPHAW8RdLPJX2siWVvSWpy6tSey1Yh6SuSHiBdKUyrtyBJh0tqk9TW0dHRxKrNzKw3Gt5ojojnI+K8iDgAGAHMB153f6COelcVr7sSiIgZEfFO4D+B47uIYWZEjI+I8cOHD29i1WZm1hs9+o3miHgiIn4ZER9tono7MLI0PgJY1k39WcCBPYnHzMz6Vo+SQg/NA8ZIGi1pCDAZmF2uIGlMaXR/4P4K4zEzswaaeaK5VyJihaSpwFxgEHBmRCyQNB1oi4jZwFRJewHLgSeBg6qKx8zMGqssKQBExBxgTk3ZCaXhr1W5fjMz65kqm4/MzGyAcVIwM7OCk4KZmRWcFMzMrOCkYGZmBScFMzMrOCmYmVnBScHMzApOCmZmVnBSMDOzgpOCmZkVnBTMzKzgpGBmZgUnBTMzKzgpmJlZwUnBzMwKTgpmZlZwUjAzs4KTgpmZFZwUzMysUGlSkLSPpEWSFks6rs70oyXdI+lOSddK2qrKeMzMrHuVJQVJg4AZwL7AWGCKpLE11f4KjI+I7YGLgVOqisfMzBqr8kphArA4IpZExCvALGBSuUJEXBcRL+TRW4ERFcZjZmYNVJkUtgSWlsbbc1lXDgF+X2+CpMMltUlq6+jo6MMQzcysrMqkoDplUbei9AVgPPCDetMjYmZEjI+I8cOHD+/DEM3MrGxwhctuB0aWxkcAy2orSdoL+DawW0S8XGE8ZmbWQJVXCvOAMZJGSxoCTAZmlytI2hH4JTAxIv5ZYSxmZtaEypJCRKwApgJzgYXAhRGxQNJ0SRNztR8AGwIXSZovaXYXizMzs35QZfMRETEHmFNTdkJpeK8q129mZj3jJ5rNzKzgpGBmZgUnBTMzKzgpmJlZwUnBzMwKTgpmZlZwUjAzs4KTgpmZFZwUzMys4KRgZmYFJwUzMys4KZiZWcFJwczMCk4KZmZWcFIwM7OCk4KZmRWcFMzMrOCkYGZmBScFMzMrVJoUJO0jaZGkxZKOqzP9I5LukLRC0qerjMXMzBqrLClIGgTMAPYFxgJTJI2tqfYIcDBwflVxmJlZ8wZXuOwJwOKIWAIgaRYwCbins0JEPJSnrawwDjMza1KVzUdbAktL4+25rMckHS6pTVJbR0dHnwRnZmavV2VSUJ2y6M2CImJmRIyPiPHDhw9fzbDMzKwrVSaFdmBkaXwEsKzC9ZmZ2WqqMinMA8ZIGi1pCD
AZmF3h+szMbDVVlhQiYgUwFZgLLAQujIgFkqZLmgggaWdJ7cD/AH4paUFV8ZiZWWNVfvuIiJgDzKkpO6E0PI/UrGRmZmsAP9FsZmYFJwUzMys4KZiZWcFJwczMCk4KZmZWcFIwM7OCk4KZmRWcFMzMrOCkYGZmBScFMzMrOCmYmVnBScHMzApOCmZmVnBSMDOzgpOCmZkVnBTMzKzgpGBmZgUnBTMzKzgpmJlZwUnBzMwKlSYFSftIWiRpsaTj6kxfT9Jv8/TbJI2qMh4zM+teZUlB0iBgBrAvMBaYImlsTbVDgCcjYmvgVODkquIxM7PGqrxSmAAsjoglEfEKMAuYVFNnEnB2Hr4Y2FOSKozJzMy6MbjCZW8JLC2NtwPv76pORKyQ9DSwOfBYuZKkw4HD8+hzkhZVEnH1hlGzbf3p6FatuO94/60+78PVM5D331bNVKoyKdT7xB+9qENEzARm9kVQrSSpLSLGtzqOgcr7b/V5H66etWH/Vdl81A6MLI2PAJZ1VUfSYGBj4IkKYzIzs25UmRTmAWMkjZY0BJgMzK6pMxs4KA9/GvhjRLzuSsHMzPpHZc1H+R7BVGAuMAg4MyIWSJoOtEXEbOBXwG8kLSZdIUyuKp41xIBvAmsx77/V5324et7w+0/+YG5mZp38RLOZmRWcFMzMrOCk0EuSnst/t5B0cavjWdtIul7SG/qrgX2l81itU/5uSfMl/VXSO/s7rlaSdEadHhb6eh1zJG1Sp/xEScdWue7VUeVzCmuFiFhG+uZUZSQNjogVVa7D3phyDwFdffg7ELg8Ir7bjyGtESLi0H5Yx35Vr6MKvlJYTZJGSbo7Dx8s6RJJf5B0v6RTSvU+JukWSXdIukjShrn8BEnzJN0taWZnNx/5k/BJkm4AvtaSjetHeT/eK+lsSXdKuljSBpL2zJ9k75J0pqT1auY7RNKppfHDJP24/7dgzZH35UJJPwPuAIZK+lE+9q6VNFzSfsCRwKGSrmttxNWS9CZJv5P0//L77LPlK818DN2Xy06XdFouP0vSzyVdJ2mJpN3yMbhQ0lml5U/Jx+fdkk4ulT8kaVge/nbuHPQa4F39uwd6xkmh740DPgtsB3xW0sh8YBwP7BUR7wPaeO2J9dMiYueIeC8wFDigtKxNImK3iPhRP8bfSu8CZkbE9sAzpH10FvDZiNiOdGX7HzXzzAImSlo3j38J+HX/hLtGexdwTkTsmMfvyMfeDcB3I2IO8Avg1IjYo1VB9pN9gGURsUN+n/2hc4KkLYDvAB8A9gbeXTPvpsBHgaOAK0gdd24LbCdpXJ7/5FxnHLCzpAPLC5C0E+nr9jsCnwR27vMt7ENOCn3v2oh4OiJeAu4h9TfyAVJPsTdLmk96YK+zH5I9crfhd5EOrG1Ly/ptP8a9JlgaETfn4XOBPYEHI+K+XHY28JHyDBHxPPBH4ABJ7wbWjYi7+ivgNdjDEXFrHl7Ja8fSucAurQmpZe4C9pJ0sqRdI+Lp0rQJwA0R8URELAcuqpn3ivxA7V3APyLirohYCSwARpFO8NdHREdu4j2PmmMU2BW4NCJeiIhneP1DvGsU31Poey+Xhl8l7WMBV0fElHJFSesDPwPGR8RSSScC65eqPF9xrGua3j40cwbwLeBefJXQqbtjZ616OCki7suf1vcDvi/pqtLkRr0yd76fV7Lqe3sl6b3d7L2+AbPPfaXQP24FPixpa4DcVr4NryWAx/I9hkpvWA8Ab5f0wTw8BbgGGNW534B/IzV/rCIibiP1ofU54IL+CHSAWYfXjq3PATe1MJZ+l5t4XoiIc4EfAu8rTf4LsJukTXP/a5/q4eJvy/MPU/oNmSm8/hi9EfiEpKGSNgI+3qsN6Se+UugHEdEh6WDggtKN0uPzJ5jTSZemD5H6i1qbLQQOkvRL4H7SDfZbgYvyG3YeqR28nguBcRHxZL9EOrA8D2wr6XbgadI9r7XJdsAPJK0ElpPuS/0QICL+Jukk0sl9Ga
nJ9+muFlQrIv4u6ZvAdaSrjjkRcXlNnTsk/RaYDzwM/Gn1N6k67ubC1ghKP8V6Zb4R2Jv5ryTdNL22L+OyNz5JG0bEc/mDx6WkftoubXVcreLmIxvQJG0i6T7gRScE66UT8xdA7gYeBC5rcTwt5SsFMzMr+ErBzMwKTgpmZlZwUjAzs4KTgg0YKvX2KWm/3L/U2yteZ93eWHP5IqVeRudL6tNnTCQdKWmD0njdHjfN+pqfU7ABR9KewE+Bj0XEI03OU0VPs5+PiLY+XmanI0ldUrwAA7fHTRt4fKVgA4qkXYHTgf0j4oFcNlzSfyv1NjtP0odz+YlKPc9eBZyjXvRi28PYih5z8/ixueuSziuLkyX9JffIuWsuHyTph7mXzTslfVXSNGAL4DrlHkxretw8OvfIebekI0vrXph7+Vwg6SpJQ/O0aZLuycuf1fO9bmsTXynYQLIecDmwe0TcWyr/CenBtZtyc9Jc4D152k7ALhHxYn6qfBypt8qXgUWSfgq8yGu92D4v6T9JPbRObxDPeZJezMN7NhH/4IiYoNRt9XeBvYDDgdHAjhGxQtJmEfGEpKOBPSLisfICch8+XwLeT3qC9jal7tWfBMYAUyLiMEkXkrpsOBc4DhgdES+7CcoacVKwgWQ58GfgEFb9jYm9gLFS0bfZm3MfMwCzI+LFUt1rO3vJlNTZi+0mvNaLLcAQ4JYm4lml+ai0zq5ckv/eTuphszP2X3Q2bUXEEw2WsQupx83n8zovIfXCOZvUo+z8Ouu4k5TALmMtfzDLGnNSsIFkJfAZ4BpJ34qIk3L5OsAHa07+5BN8bW+hTfdi2wsrWLVJdv2a6Z3r7lwved09eYK0u149a7dtaB7en9Sd80TgO5K29S/5WVd8T8EGlIh4gfRDRJ+XdEguvgqY2llH0rgeLrarXmx76h/Av0jaPHd8eECjGUixH5H73UHSZrn8WaDelceNwIE5xjcBn6CbDtYkrQOMjIjrgG+Qrop6fL/E1h6+UrABJ7e57wPcKOkxYBowQ9KdpGP6RuCIHiyvbi+2wH1dz1V3OcslTSf1uPkg6fcdGjkD2Aa4U9Jy0k3004CZwO8l/b38y2i5x82zSF0+A5wREX/NHQrWMwg4V9IbQOCYAAAAP0lEQVTGpKuMUyPiqZ5sl61d3PeRmZkV3HxkZmYFJwUzMys4KZiZWcFJwczMCk4KZmZWcFIwM7OCk4KZmRX+P3yjLdhS0AywAAAAAElFTkSuQmCC\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# Support Vector Machine\nfrom sklearn import svm\n# configure an SVM classifier with an RBF kernel\nSVM = svm.SVC(kernel='rbf')\n# fit the classifier on the training split\nSVM.fit(X_train, y_train)\nSVM",
"execution_count": 59,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 59,
"data": {
"text/plain": "SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n decision_function_shape='ovr', degree=3, gamma='auto_deprecated',\n kernel='rbf', max_iter=-1, probability=False, random_state=None,\n shrinking=True, tol=0.001, verbose=False)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Logistic Regression"
},
{
"metadata": {},
"cell_type": "code",
"source": "# for Logistic Regression\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.metrics import log_loss\n\n# import Matplotlib (scientific plotting library)\nimport matplotlib.pyplot as plt\n%matplotlib inline\n\n# candidate regularization strengths (C) and solvers\nc_list = [0.1, 0.01, 0.001]\nsolver_list = ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']\nidx = []\n\n# NOTE: the metric computed below is log loss (lower is better), not accuracy;\n# the 'Accuracy' wording in the printout and the y-axis label is kept as originally run\nloss_scores = []\nfor idx1, c in enumerate(c_list):\n    for idx2, sol in enumerate(solver_list):\n        test_id = idx2 + idx1 * len(solver_list)\n        idx.append(test_id)\n        # fit on the training split\n        LR = LogisticRegression(C=c, solver=sol).fit(X_train, y_train)\n        # class probabilities for the held-out split\n        lr_prob = LR.predict_proba(X_test)\n        loss = log_loss(y_test, lr_prob)\n        print(\"Test \", test_id, \": Accuracy at c =\", c, \"solver=\", sol,\n              \"is : \", loss)\n        loss_scores.append(loss)\n# plot log loss against the test index\nplt.plot(idx, loss_scores)\nplt.xlabel('Parameter value')\nplt.ylabel('Testing Accuracy')",
"execution_count": 60,
"outputs": [
{
"output_type": "stream",
"text": "Test 0 : Accuracy at c = 0.1 solver= newton-cg is : 0.477460130698766\nTest 1 : Accuracy at c = 0.1 solver= lbfgs is : 0.47746026240380063\nTest 2 : Accuracy at c = 0.1 solver= liblinear is : 0.49096560818457907\nTest 3 : Accuracy at c = 0.1 solver= sag is : 0.47745981577453\nTest 4 : Accuracy at c = 0.1 solver= saga is : 0.4774593676938099\nTest 5 : Accuracy at c = 0.01 solver= newton-cg is : 0.4893356417828644\nTest 6 : Accuracy at c = 0.01 solver= lbfgs is : 0.48933560490693945\nTest 7 : Accuracy at c = 0.01 solver= liblinear is : 0.5699980927778155\nTest 8 : Accuracy at c = 0.01 solver= sag is : 0.48933620735787814\nTest 9 : Accuracy at c = 0.01 solver= saga is : 0.489334898515333\nTest 10 : Accuracy at c = 0.001 solver= newton-cg is : 0.5177257828275373\nTest 11 : Accuracy at c = 0.001 solver= lbfgs is : 0.5177257382214536\nTest 12 : Accuracy at c = 0.001 solver= liblinear is : 0.6691108543335518\nTest 13 : Accuracy at c = 0.001 solver= sag is : 0.5177411220522365\nTest 14 : Accuracy at c = 0.001 solver= saga is : 0.517724638981599\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 60,
"data": {
"text/plain": "Text(0, 0.5, 'Testing Accuracy')"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZIAAAEKCAYAAAA4t9PUAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xl83HWd+PHXO1dztE2bSQqlbZI2LfdRIFytFyBYlAUUl8MLXIXVlUV/uixUERXExXV30f2B6yIi+BPlkqNqsYCCR0uhAUpPSpM0aUNLm0ySHpnc8/798f1OGKaTzDSZ75zv5+Mxj8585/v9zntKyTuf6/0RVcUYY4wZr7xUB2CMMSazWSIxxhgzIZZIjDHGTIglEmOMMRNiicQYY8yEWCIxxhgzIZZIjDHGTIglEmOMMRNiicQYY8yEFKQ6gGSorKzU2traVIdhjDEZ5ZVXXulQ1apY53maSERkCfAjIB+4V1XviHLOZcC3AQVeV9VPiMjZwJ1hpx0NXKGqT4rI/cD7gb3ue1er6tqx4qitraWhoWGiX8cYY3KKiLTGc55niURE8oG7gfOANmCNiCxT1U1h5ywAlgKLVbVLRGYAqOrzwEL3nAqgEXgm7PY3qOpjXsVujDEmfl6OkZwONKpqs6oOAA8BF0eccw1wt6p2Aajqnij3+TjwtKoGPIzVGGPMOHmZSGYBO8Jet7nHwh0JHCkiK0VktdsVFukK4NcRx24XkXUicqeITIr24SJyrYg0iEhDe3v7eL+DMcaYGLxMJBLlWGTN+gJgAfAB4ErgXhGZNnIDkZnACcCKsGuW4oyZnAZUADdG+3BVvUdV61W1vqoq5liRMcaYcfIykbQBc8JezwZ2RjnnKVUdVNVtwBacxBJyGfCEqg6GDqjqLnX0Az/H6UIzxhiTIl4mkjXAAhGZKyJFOF1UyyLOeRI4G0BEKnG6uprD3r+SiG4tt5WCiAhwCbDBk+iNMcbExbNZW6o6JCLX4XRL5QP3qepGEbkVaFDVZe5754vIJmAYZzaWH0BEanFaNH+OuPWDIlKF03W2FviCV9/BGGNMbJILW+3W19errSMxxkzUS81+ppYUcszMqakOJSlE5BVVrY91npVIMcaYON3w2DruePqNVIeRdnKiRIoxxkzUwFCQtq4AedHmo+Y4a5EYY0wc2roCBBXaunoZGg6mOpy0YonEGGPi0NrpFNcYCio7u/tSHE16sURijDFxaO3oGXne4u8Z48zcY4nEGGPi0OJ/Z3yk1RLJu1giMcaYOGzvDHDkYVMoLsyj1W81ZMPZrC1jjIlDi7+How6bgqrTOjHvsBaJMcbEMBxUdnQGqPGVUeMrta6tCJZIjDEmhl17exkcVmp8pdT4StneGSAYzP6qIPGyri1jjIkhNCZS4ytlOKj0DwXZvb+PmeUlKY4sPVgiMcaYGELTfWt9ZQTdtYgtHQFLJC7r2jLGmBi2+wMUFeRx+NRianylzrFOGycJsURijDExtPh7qK4oJS9PmFleTGG+2MytMJZIjDEmhlZ/gFq3JVKQn8ec6TZzK5wlEmOMGYOq0uoPUF1RNnKs2ldqixLDeJpIRGSJiGwRkUYRuWmUcy4TkU0islFEfhV2fFhE1rqPZWHH54rISyKyVUQedrfxNcYYT7Tv76d3cJjaytKRY7W+Mlr9AXJhY8B4eJZIRCQfuBu4ADgWuFJEjo04ZwGwFFisqscBXwl7u1dVF7qPi8KOfx+4U1UXAF3A57z6DsYYE6r6W13xTiKp8ZVyoH8If89AqsJKK162SE4HGlW1WVUHgIeAiyPOuQa4W1W7AFR1z1g3FBEBzgEecw89AFyS0KiNMSZMS8c7U39DQjO3rHvL4WUimQXsCHvd5h4LdyRwpIisFJHVIrIk7L1iEWlwj4eShQ/oVtWhMe5pjDEJ0+oPkJ8nzJr+zpqRGjep2IC7w8sFidE2pIzsUCwAFg
AfAGYDfxWR41W1G6hW1Z0iMg/4k4isB/bFcU/nw0WuBa4FqK6uHt83MMbkvNbOALOmlVCY/87v3bOnlyBixRtDvGyRtAFzwl7PBnZGOecpVR1U1W3AFpzEgqrudP9sBl4ATgY6gGkiUjDGPXGvu0dV61W1vqqqKjHfyBiTc1r9PSNdWSGTCvI5oryE7dYiAbxNJGuABe4sqyLgCmBZxDlPAmcDiEglTldXs4hMF5FJYccXA5vUmSLxPPBx9/qrgKc8/A7GmBzX0tHzrvGRkNrKUmuRuDxLJO44xnXACmAz8IiqbhSRW0UkNAtrBeAXkU04CeIGVfUDxwANIvK6e/wOVd3kXnMj8FURacQZM/mZV9/BGJPbugMD7OsbOqhFAlBdUcb2Tksk4HHRRlVdDiyPOHZL2HMFvuo+ws9ZBZwwyj2bcWaEGWOMp1pGqv5GaZH4SunsGWBv7yDlJYXJDi2t2Mp2Y4wZRetI1d+DWySh5LLdurcskRhjzGhC60TmVERLJO5aEqsCbInEGGNG0+LvYWZ5McWF+Qe9Z4sS32GJxBhjRrHdH4g60A5QWlTAjCmTRla+5zJLJMYYM4oWf4CaioMH2kNqfKUjtbhymSUSY4yJ4kD/EB0H+qmpjN4iAWfA3cqkWCIxxpioQrOxxmqR1PpK2b2vn96B4WSFlZYskRhjTBShlsZoYyQA1aEpwDnevWWJxBhjonhnMeLoiSS0vqQlx7u3LJEYY0wU2zt78JUVMaV49FXroW6vXB8nsURijDFRtHSMPvU3pLy0kGmlhTm/lsQSiTHGRNHqj171N1KNu397LrNEYowxEfoGh9m1r4/qGC0SgJqKUhsjSXUAxhiTbtq6AqgSV4uk1lfKzu5eBoaCSYgsPVkiMcaYCK1xzNgKqfGVEVQn+eQqSyTGGBNhrH1IIlnxRkskxhhzkFZ/D1OKC5heGnvDqlCyyeUpwJ4mEhFZIiJbRKRRRG4a5ZzLRGSTiGwUkV+5xxaKyIvusXUicnnY+feLyDYRWes+Fnr5HYwxuafVH6DWV4aIxDy3cnIRZUX5Ob1/u2db7YpIPnA3cB7QBqwRkWVhe68jIguApcBiVe0SkRnuWwHgM6q6VUSOAF4RkRWq2u2+f4OqPuZV7MaY3Nbq7+G4WeVxnSsiVPtye/92L1skpwONqtqsqgPAQ8DFEedcA9ytql0AqrrH/fNNVd3qPt8J7AGqPIzVGGMAGBoO0tbVG3V73dHU+nJ7CrCXiWQWsCPsdZt7LNyRwJEislJEVovIksibiMjpQBHQFHb4drfL604RmZTowI0xuWtndx9DQR2z6m+kGl8ZOzoDDAfVw8jSl5eJJFrnYuTfcgGwAPgAcCVwr4hMG7mByEzg/wGfVdXQJO2lwNHAaUAFcGPUDxe5VkQaRKShvb19It/DGJNDWuKo+hupxlfK4LCya2+vV2GlNS8TSRswJ+z1bGBnlHOeUtVBVd0GbMFJLIjIVOD3wM2qujp0garuUkc/8HOcLrSDqOo9qlqvqvVVVdYrZoyJT2j2VW3lobRIcnsKsJeJZA2wQETmikgRcAWwLOKcJ4GzAUSkEqerq9k9/wngF6r6aPgFbisFcaZTXAJs8PA7GGNyTKs/QHFhHjOmxN9rHloBn6vjJJ7N2lLVIRG5DlgB5AP3qepGEbkVaFDVZe5754vIJmAYZzaWX0Q+BbwP8InI1e4tr1bVtcCDIlKF03W2FviCV9/BGJN7Qvu0xzP1N+TwqcUUFeSN7KqYazxLJACquhxYHnHslrDnCnzVfYSf80vgl6Pc85zER2qMMY5Wfw9zD6FbCyAvT6jO4eKNtrLdGGNcwaCyvTP2PiTR1FSU2hiJMcbkut37++gfCsZVYytSaF8Sp6Mlt1giMcYYV6hFEU/5+Ei1laX0Dg7Tvr8/0WGlPUskxhjjah3HGpKQ6grnmlysuWWJxBhjXC3+AIX5wszy4kO+tjaHqwBbIjHGGNd2f4A500spyD/0H4
2zppeQnyc5OeAe829LRL4gIvGVwTTGmAzW4u+Ja5/2aArz85g1rSQnpwDHk3ZrgVdF5Fci8kGP4zHGmJRQ1ZF9SMarxleak+XkYyYSVb0Jp/7Vg8AXRGSriNwqIrUex2aMMUnT2TPAgf6hkUHz8aj1lbGtoyfnpgDH1RHoVt5tcR9BYCbwlIj8m2eRGWNMEoVmW9VWjj+R1PhK2d83RHdgMFFhZYR4xkj+SUReBn4EvAKcqKrXACcDl495sTHGZIh3pv5OpGvLnbmVY91b8dTamg1coarN4QdVNSgiF3kTljHGJFerP4AIzJ5eMu571I6Uk+9h4ZxpMc7OHvF0bT2Bs9UtACIyRUTqAVTVSrgbY7JCq7+HI8pLmFSQP+57zKnIzX1J4kkk9wDhfys9wP96E44xxqRGa2dgQuMjAMWF+cwsL865KcDxJJK8sG1uQwPvhd6FZIwxydfqD1B9CPu0j6bGl3tVgONJJNtE5Isiki8ieSLyJZzZW8YYkxX29Q3S2TMwMsYxETUVZZZIovhH4Fxgt/t4P3CNl0EZY0wyhXY2nMiMrZCaylI6DvRzoH9owvfKFPEsSNytqh9X1UpVrVLVy1R1dzw3F5ElIrJFRBpF5KZRzrlMRDaJyEYR+VXY8avcxY9bReSqsOOnish6957/LYeyH6YxxkTRMoGqv5FqKnKveGPM6b8iMgm4GjgOGCmJqarXxrguH7gbOA9oA9aIyDJV3RR2zgJgKbBYVbtEZIZ7vAL4FlAPKPCKe20X8D/AtcBqnG18lwBPx/uFjTEmUutIiyQBicS9x3Z/gOOOyI0yhfF0bf0Cp97WhcBLQB3QF8d1pwONqtqsqgPAQ8DFEedcA9ztJghUNTTN+EPAs6ra6b73LLBERGYCU1X1RXe/918Al8QRizHGjKrV38OMKZMoLYpnad3YQokkl/YliSeRHKmqS4EDqvoznBbA8XFcNwvYEfa6zT32rnsDR4rIShFZLSJLYlw7y30+1j2NMeaQtPjHt097NFOKC/GVFeVU11Y8iSRUNKZbRI4BpgA1cVwXbewispJZAU5ByA8AVwL3isi0Ma6N557Oh4tcKyINItLQ3t4eR7jGmFzV6u9JyEB7SK5NAY4nkfxMRKbjjFmsAN4E/jOO69qAOWGvZwM7o5zzlKoOquo2YAtOYhnt2jb3+Vj3BEBV71HVelWtr6qqiiNcY0wu6h0YZve+fmomUPU3Uq2vzFokIe6AeYeqdqnq86pa7c7e+nEc914DLBCRuSJSBFwBLIs450ngbPezKnG6uppxEtb5IjLdTWLnAytUdRewX0TOdGdrfQZ4Kv6va4wx7xbaP6SmMnEtkmpfKbv29dE3OJywe6azMROJqg4DXxnPjVV1CLgOJylsBh5R1Y3uXiahYo8rAL+IbAKeB25QVb+qdgK34SSjNcCt7jGALwL3Ao1AEzZjyxgzAaGpv4lYjBhS6ytDFdq6cqN7K54pCitE5CvAwzh1tgBQ1X2xLlTV5ThTdMOP3RL2XIGvuo/Ia+8D7otyvIH4BvuNMSamkcWICSiPEjIyc6sjwPwZUxJ233QVTyL5R/fPr4UdU6A68eEYY0xytfh7mFZaSHlp4koIhgbuc6V4Y8xEoqpzYp1jjDGZantnIKEztgCmlxYypbggZ/Zvj2dl+yeiHVfVX0U7bowxmaTF38PJc6Yn9J4iQq2vLGcWJcbTtfXesOfFwDk4W+5aIjHGZLSBoSBvdfXy0YWJX9dc7Stlw1t7E37fdBRP19YXw1+703Hv9yogY4xJlre6ewlqYqr+Rqr1lbJiw9sMDgcpzI9nyV7mGs+324+z3sMYYzJaIqv+RqrxlTEUVHZ29yb83ukmnjGSJ3inDEkeThVgWwRojMl4rR2hRJL4FklopbxTxyvx908n8YyR3BX2fAhoVdUWb8Ixxpjkae0MUFaUT+XkooTfu9ZdKb/d3wNkd5mmeBLJVmCPqvYBiEiJiMxR1R0xrjPGmLTW6g
9Q7SvDi/3xZkyZRHFhXk7M3IpnjORxIBj2Ogj8xptwjDEmeVr8PQktjRJORHJm//Z4EkmBuzEVAKraD0zyLiRjjPHecFBp6+yl2qNEAqFy8tm/uj2eROIXkQ+HXojIhUDnGOcbY0za27W3l4HhILUeDoTX+Epp7QwQDEbdNilrxDNG8kXgVyJyN87srQ7gU55GZYwxHkvkPu2jqfGVMTAUZPf+PmaWl3j2OakWz4LEN4F6d+dCVLXb86iMMcZj7yQS71okodZOS0cgqxNJzK4tEblNRKapareqdrubTX0nGcEZY4xXWv09FBXkMXNqsWefEWrtZPs4STxjJBeGt0JUtQv4O+9CMsYY77X6A1RXlJKXl/ipvyEzy4spzBdas7wKcDyJJN/dKhcAESkGEr96xxhjkqjF35PQfdqjKcjPY8707J+5FU8ieQh4VkSuEpHP4GyPG1flXxFZIiJbRKRRRG6K8v7VItIuImvdx+fd42eHHVsrIn0icon73v0isi3svYXxf11jjAFV9WQfkmiqfaW0dGR3iySewfbvicg64IOAAP+uqr+PdZ2I5AN3A+cBbcAaEVmmqpsiTn1YVa+L+MzngYXufSpw9md/JuyUG1T1sVgxGGNMNO0H+gkMDFNb6W2LBJwB94aWLlTVkxX06SCu6r+q+jtV/YqqfhnoEJEfxXHZ6UCjqja7CxofAi4eR4wfB55W1exO6caYpAnN2Kr2uGsLnAH3A/1D+HsGYp+coeJKJCJyvIjcLiJNwH8A2+K4bBYQXo+rzT0W6VIRWScij4lItG19rwB+HXHsdveaO0Uk6ip7EblWRBpEpKG9vT2OcI0xuaLFrfrr5WLEkFyYuTVqIhGReSLydRHZANyLsxCxUFXfq6o/jOPe0dpwkcs7fwvUquqJwHPAAxExzAROwBmXCVkKHA2cBlQAN0b7cFW9R1XrVbW+qiq7K28aYw7N9s4A+XnCrOner+0IjcNkc82tsVokjcCHgI+p6pmqeidOGfl4tQHhLYzZwM7wE1TV79buAvgpcGrEPS4DnlDVwbBrdqmjH/g5TheaMcbErcUfYNa0kqTsXDh7egkiZHUV4LH+Fi/HaYX8UUR+LCLvJ3orYzRrgAUiMtedPnwFsCz8BLfFEXIRsDniHlcS0a0VukacUatLgA2HEJMxxtDq7/G0NEq4SQX5HFFekptdW6r6qKpeChwLvITTpXS4iPxfETkn1o1VdQi4DqdbajPwiKpuFJFbReQi97TrRWSjiLwOXA9cHbpeRGpxWjR/jrj1gyKyHlgPVALfjeeLGpNuegeG2Z7Fv6Wms1Z/IGmJBKC2sjSru7bimf67H2fs4gERqcRpqXwb+FMc1y4HlkccuyXs+VKcBBXt2haiDM6raswkZkwm+OFzb/LL1a003HweJUX5qQ4nZ3QHBtjbO5iUgfaQ6ooy/rBhV9I+L9kOqYNQVTtU9W5VfZ9XARmTK/78Zjs9A8O80tqV6lBySjKKNUaq9ZXSFRhkb+9g7JMzkPcjTcaYg3Qc6OeNt/cDsLKpI8XR5JYWd6wimV1boaSVrV2ZlkiMSYFVTX4AKsqKWNVoiSSZkrkYMSSUtFqydMDdEokxKbCqsYMpxQV88oxq1r+1N2u7PNJRqz/AzPJiiguTNy4VSiTbs7QKcDz7kXSJSGfEY5uIPOrOrDLGHKKVTR2cOc/HexdUEVRY3exPdUg5o9Xfk9TWCEBpUQEzpkwaWVGfbeJpkfxf4JtAHTAfuBm4H3gSZ0GgMeYQ7OgMsKOzl8V1PhbOmUZJYb51byVRiz+Q1BlbITW+7J0CHE8iOd+dqdWlqp2q+mPgAlV9EKdEiTHmEKx0k8bi+ZUUFeRx+twKVjZZiyQZevqH6DjQT00Sqv5GqvGV0dqZuy0SRORjEc9DK9yDXgRlTDZb2eRnxpRJzJ8xGYDF83007jnA7n19KY4s+41M/a1Ifouk1lfK7n39BAYOpdJUZognkXwKuMYdG/ED1w
CfFpFS4CueRmdMllFVXmzqYFGdb2RvikV1lQCssmnAnmtNwdTfkOrQFOAsHHCPmUhUtVFVL1DVClX1uc/fVNWAqkaWLzHGjGHL7v10HBhg0fzKkWPHzpzKtNJCVjZa95bXQnunpyKR1I6Uk8++RBKzRIpbFuUfgNrw81X1Wu/CMiY7rXKTxaI638ixvDzhrHk+XmzyZ/Uueumg1d+Dr6yIKcWFSf/sUHdaNhZvjJlIgKeA1cDfgGFvwzEmu61q6qDGV8rs6e/+jXjR/Eqe3vA2rf4AtZXJ77/PFS0dyS3WGK68tJBppYVZWU4+nkRSpqpf8zwSY7Lc0HCQl5o7ufCkIw56b7HbQlnZ1GGJxEPbOwOcPjd1k01rfGVZWSYlnsH2p0XkfM8jMSbLrXtrL/v7h1g833fQe3Mry5hZXjzS9WUSr39omJ17e1PWIgGoqSjNyjIp8SSSLwB/EJED7sytLhHp9DowY7JNaNHhWfMOTiQiwqK6SlY1dRAMRu5IbRJhR2cvqsnZp300tb5Sdnb3MjCUXSsn4kkklUAhUA5Uua9tE3RjDtHKRj/HzJyKb/KkqO8vnu+jKzDI5rf3JTmy3BAa5K5OZYvEV0ZQoa0ru7q3Rk0kIrLAfXrcKI+YRGSJiGwRkUYRuSnK+1eLSLuIrHUfnw97bzjs+LKw43NF5CUR2SoiD7vb+BqT1voGh3lle9fIWEg0i90pwda95Y3QIHcqWyQ1WToFeKzB9puAzwF3R3lPgTE3txKRfPfa84A2YI2ILFPVTRGnPqyq10W5Ra+qLoxy/PvAnar6kIj8xI3xf8aKxZhUa2jpYmAoOJIsojlsajF1VWWsbOrgmvfNS2J0uWG7v4cpxQVML03+1N+Q0L4k2TZOMmoiUdXPuU/PUdV31bgWkXj+S5wONKpqs3vNQ8DFQGQiiZs4E+zPAT7hHnoAZ9tfSyQmra1s6qAgT2LOGFpUV8lvXm1jYChIUYHt8pBILe4+7alcp1M5uYiyovysa5HE8y/1pTiPRZoF7Ah73UaUPdiBS0VknYg8JiJzwo4Xi0iDiKwWkUvcYz6gW1VDxWpGu6cxaWVVk5+Fc6ZRNmnsGfeL5/sIDAyzrq07SZHljlZ/T1K3141GRKj2lWXdosSxxkhmiMhJQImInCAiJ7qP9wDxjFZFS/uR01F+C9Sq6onAczgtjJBqVa3HaX38UETq4rxnKP5r3UTU0N7eHke4xnhjb+8g69u637WafTRnzvMhgpVLSbCh4SBtXb0jZUpSqTYLy8mP1SL5CHAXMBtnrCP0+DrO/iSxtAHhLYzZwM7wE1TVr6r97sufAqeGvbfT/bMZeAE4GegApolI6Ne6g+4Zdv09qlqvqvVVVTbJzKTOS81+gsq76muNZlppEccfUW77uCfYzu4+hoKakqq/kWp8ZezoCjCcRdO8R00kqvpzVX0v8DlVfZ+qvtd9fFhVH43j3muABe4sqyLgCmBZ+AkiMjPs5UXAZvf4dBGZ5D6vBBYDm1RVgeeBj7vXXIVTwsWYtLWqyU9xYR4nV0+L6/xF8328tr0rK8uNp0pLCqv+RqrxlTI4rOzs7k11KAkTzxjJDBGZCiAiPxGRl0Xk3FgXueMY1wErcBLEI6q6UURuFZGL3NOuF5GNIvI6cD1wtXv8GKDBPf48cEfYbK8bga+KSCPOmMnP4vqmxqTIysYOTqutYFJBfHuEL66rZHBYWdPS5XFkueOdqr/p0CLJvv3b46m1da2q3uWWSZkNfBG4h7BuqNGo6nJgecSxW8KeLwWWRrluFXDCKPdsxpkRZkza27Ovj617DnDpqbPjvua02gqK8vNY1djB+4+0btlEaO3oobgwjxlToi8GTabasCnAY00HzyTxtEhCHXkXAD9X1VfivM6YnLfK3UJ3cV38PzBKivI5uXqajZMkUIs/QE1FGXl5qS/Rf/jUYooK8rJqwD2ehPC6iCwH/g6ngONkRpkpZYx5t5WNHZSXFHLsEVMP6b
rF8yvZuHMf3YEBjyLLLds7e1JaGiVcXp5QXVGaVVOA40kkn8VZ9He6qgaAYpzV5MaYMagqq5r8nDXPR/4h/ia8qM6HKrzYZNOAJyoYVGeflzRJJJB9U4Dj2Wp3GJiHMzYCUBLPdcbkulZ/gLe6e6OWjY/lpDnTKCvKt+6tBNizv5/+oWBaDLSHVFeU0eoP4ExEzXwxE4KI3AWcDXzKPdQD/MTLoIzJBqHxkXjWj0QqzM/j9LkVI/cw45dOU39DaitL6R0cpn1/f+yTM0A8LYtFqvqPQB+AqnYCVnHXmBhWNnVw2NRJzBvnjoeL51fS3N7D23v7EhxZbgmNRaSy6m+k6gonqWXLtrvxJJJBEcnDHWAXER+QXbuyGJNgwaDyYpOfxXWV4y4SuMid6bWy0bq3JqLVH6AwX5hZXpzqUEbUZlkV4LFqbYXWmNwN/AaoEpHvAH/DKeVujBnFG2/vp7NnYFzdWiFHHz6FirIiGyeZoFZ/gNnTSynIT5+h3VnTS8jPk6zZv32sBYkvA6eo6i9E5BXggzhFE/9eVTckJTpjMtQq94f/eAbaQ/LyhLPqfKxq9KOqKS1/nsla/D1pNT4CzhjYrGklWdMiGSuRjPyrVdWNwEbvwzEmO6xs7GBeZRkzy0smdJ/FdZX8ft0umjt6qKuanKDocoeqst0f4LTasfeBSYWaLJoCPFYiqRKRr472pqr+lwfxGJPxBoeDvLytk4+eMvGtckItmlWNHZZIxqGzZ4D9/UMjg9vppNZXxpM73sqK1uZYnYb5wGRgyigPY0wUr+/opmdg+JDKooymuqKUWdNKbH+ScRrZp70y/RJJja+U/X1DdAcGY5+c5sZqkexS1VuTFokxWWJlox8ROCuOjaxiEREW1fl4ZtNuhoN6yCvkc932TmcMojoN9iGJFL5/+/SyzF5RMVaLxP7FGjMOq5o6OO6IqUwrTcwPh8XzK9nbO8jmXfsScr9c0tIRQATmVExsrMoLtVlUTn6sRBJzzxFjzLv1Dgzz2vbuhHRrhYS26LX1JIdue2eAI8pL4t4LJpnmhBYldmRxInFXsBtjDsGalk4GhoMJ6dYKmTG1mAUzJrPSyqUcsnSc+htSXJjPzPLirKgCnD4rdIzJAitnoEGcAAAYg0lEQVSbOijMF06fm9jppovnV7JmWycDQ1ZU4lC0+gNpVawxUo2vdGT3xkzmaSIRkSUiskVEGkXkpijvXy0i7SKy1n183j2+UERedLfhXScil4ddc7+IbAu7ZqGX38GYQ7Gq0c/Jc6ZTWhTP5qPxW1Tno3dwmNe22/a78drXN0hnz0BalY+PVFNRZi2SsYhIPk55lQuAY4ErReTYKKc+rKoL3ce97rEA8BlVPQ5YAvxQRKaFXXND2DVrvfoOxhyK7sAAG3buZdEEVrOP5ox5PvIE6946BKHyI+natQVQU1lKx4EBDvQPpTqUCfGyRXI60Kiqzao6ADwEXBzPhar6pqpudZ/vBPYAtnm1SWurm/2o4sk+3OUlhZwwexqrbMA9bu+Uj0/jri13WnKmt0q8TCSzgB1hr9vcY5EudbuvHhOROZFvisjpOGXrm8IO3+5ec6eITEpo1MaM08pGP6VF+Zw0e1rsk8dhUZ2PtTu66cnw316TpTUTWiRubJleKsXLRBJtHUrkdmC/BWpV9UTgOeCBd91AZCbw/4DPqmpolHEpcDRwGlAB3Bj1w0WuFZEGEWlob28f/7cwJk4rmzo4fW4FRQXe/G+1uK6SoaDy8jabUBmPVn8PVVMmJXy8KpEskcTWBoS3MGYDO8NPUFW/qoa2CPspcGroPRGZCvweuFlVV4dds0sd/cDPcbrQDqKq96hqvarWV1VZr5jx1tt7+2hu70no+pFI9bXTKSrIs/UkcWpJs33ao5lSXIivrMi6tsawBlggInNFpAi4AlgWfoLb4gi5CNjsHi8CngB+oaqPRrtGnCpnlwBW0t6kXKhsvBcD7SHFhfmcWj3dtt+N0/Y0n/
obUuMrzfhy8p4lElUdAq4DVuAkiEdUdaOI3CoiF7mnXe9O8X0duB642j1+GfA+4Ooo03wfFJH1wHqgEviuV9/BmHitbPQzvbSQYw6f6unnLJ7vY9OufXT2DHj6OZmud2CYt/f1UZOGVX8j1frKMn6DK087D1V1ObA84tgtYc+X4ox5RF73S+CXo9zznASHacyEqCqrmjo4q85HnsdFFRfNr4Rn3uTFJj8fOXFm7AtyVKh+VU1l+rdIqn2lPP7aW/QNDlNcmH6lXOJhK9uNmaBtHT3s2ts3sse6l06cVc6USQW2/W4MoTGHTGmRAOzI4BXu6TudwZgMEVok6MX6kUgF+XmcMa8ia9aTDA4H+ePmPQQGEjul+W9bnb+f2gwZIwF4eM0Ojjz84K2eRmvjRtsMK9q55x13GFOLCycQYWyWSIyZoFWNHRxRXpy0GUKL6ip5bvMe3uruZda09CuPfii++7tNPPBiqyf3njWthPJSb3+AJkLdjMmUFOZz79+2eXL/5+a83xKJMeksGFRebPbzwWMOS9p2qaGZYSsbO7is/qA1vBnjDxt28cCLrVy9qJbPLq5N+P19kzNjrfLU4kJe+sa57O97d6tMVSNex75XtHMOLy+eSHhxsURizARs2rWP7sDgyN7qyXDUYVOonFzEqgxOJDs6A9zw2DpOml3O1z98jGeLODPF1OJCz1sNXsrt/3rGTFBocWAyBtpDRISz6ipZ2eQ/6LfWTDAwFOS6X78GwF2fOCXnk0g2sP+CxkzAyiY/82dM5rCp3ncfhFtc56N9fz+New4k9XMT4Qcr3uD1Hd18/9ITR3YJNJnNEokx4zQwFGTNtk4WJ3A3xHiFZohl2ir3P27ezU//uo1Pn1nDh0+wdTDZwhKJMeO0dkc3vYPDziLBJJtTUcqcipKMqru1a28vX3v0dY6ZOZVvfOSYVIdjEsgSiTHjtLKxgzyBM+cmv0UCTjXg1c1+hoPpP04yNBzk+l+/xsBQkLs/cXLGruA20VkiMWacVjV1cPys8pStVVg0v5J9fUNseGtvSj7/UPzwua2saeniex89gXlVk1MdjkkwSyTGjENP/xCvbe9O6mytSGfNc9eTpHm5lL9ubefuFxq5rH42l5wcbW87k+kskRgzDi+3dDIU1KSuH4lUNWUSRx02hVWN6Tvgvmd/H//n4bXMr5rMty86LtXhGI9YIjFmHFY1dlCUn0d9TUVK41g038ealk76BodTGkc0w0HlKw+t5UD/EHd/8pS03qnQTIwlEmPGYWWjn1NqplFSlNpB48V1lfQPBXl1e1dK44jmx883sqrJz3cuOo4jDzu4GKHJHpZIjDlEnT0DbNq1z9NtdeN1xrwK8vMk7bq3Xmr2c+dzb3LJwiMytoyLiZ8lEmMO0epm54d2KtaPRJpSXMiJs8tHtvpNB/4D/Vz/0GvU+Mr47kdPSFoxS5M6niYSEVkiIltEpFFEbory/tUi0h62ne7nw967SkS2uo+rwo6fKiLr3Xv+t9i/UpNkKxs7mDypgJNml6c6FMDp3nq9bS/7+wZTHQrBoPK1R1+nKzDIXZ84mcmTbFwkF3iWSEQkH7gbuAA4FrhSRI6NcurDqrrQfdzrXlsBfAs4Azgd+JaITHfP/x/gWmCB+1ji1XcwJppVTX7OmFtBQX56NOgXzfcxHFRe3taZ6lC492/NvLClnW9+5BiOOyI9Eq3xnpf/J5wONKpqs6oOAA8BF8d57YeAZ1W1U1W7gGeBJSIyE5iqqi+qU/b0F8AlXgRvTDQ7u3vZ1tHDWSmorzWaU6qnM6kgj5UpHid5dXsX//6HLVxw/OF86syalMZiksvLRDIL2BH2us09FulSEVknIo+JSGhUbrRrZ7nPY93TGE+EalslY1vdeBUX5nNabUVKx0n2Bgb551+9xuHlxdxx6Yk2LpJjvEwk0f4lRRYF+i1Qq6onAs8BD8S4Np57OjcQuVZEGkSkob29Pc6QjRnbqiY/vrIijkqz6a
xn1fl44+39dBzoT/pnqyo3PPY6u/f1cdcnTqG8JHM3aDLj42UiaQPC5/3NBnaGn6CqflUN/cv/KXBqjGvb3Oej3jPs3veoar2q1ldVVY37SxgToqqsbOzgrDofeXnp9Rt3KsvKP7CqhWc27eamC45m4ZxpSf98k3peJpI1wAIRmSsiRcAVwLLwE9wxj5CLgM3u8xXA+SIy3R1kPx9Yoaq7gP0icqY7W+szwFMefgdjRjS1H2DP/v606tYKOWFWOVOKC1iV5LLy69v28r3lb3Du0TP43HvmJvWzTfrwbG6eqg6JyHU4SSEfuE9VN4rIrUCDqi4DrheRi4AhoBO42r22U0Ruw0lGALeqamhKyheB+4ES4Gn3YYznQoPZ6bAQMVJ+nnDmPF9SCzju7xvkul+/im9yEf/x9yfZuEgO83SSt6ouB5ZHHLsl7PlSYOko194H3BfleANwfGIjNSa2lY0dzJ5eQrUvPbeHXVzn49lNu9nRGfB8C1tVZenj62nr6uWha89kelmRp59n0lt6TIQ3Js0NB5XVzf60bI2EvDNO4n2r5KE1O/jdul189bwjOa02tYUrTepZIjEmDht37mVf3xCLUlg2Ppb5MyYzY8okz9eTvPH2Pr69bCPvXVDJF99f5+lnmcxgicSYOIR+OKdyI6tYRIRFdT5WNflx1usmXmBgiC89+CpTSwr5r8sWpt3sNZMalkiyxKrGDj7/QAMvbNmT6lCy0qqmDo48bDJVUyalOpQxLZpfSceBft7cfcCT+9/y1EaaO3r40eUL0/7vwiSPVVTLcHsDg9y+fBOPNLRRmC88t3k3lyw8gm9eeCy+ybn3P3rf4DA9/UMJvedQUFnT0skVp1Un9L5eWOSWbnlu824qJyd2APzZTbt57JU2rj93QVpUPjbpwxJJhlJVlq9/m28t20hXYIAvfqCOL7y/jvv+to0fv9DIn99s55a/O5ZLFs7KiWmZgYEh7vlLM//752Z6Pdot8D0Z8MNz9vRSan2l/GDFFn6wYkvC73/G3Aq+fO6ChN/XZDbxqi81ndTX12tDQ0Oqw0iYXXt7+eaTG3lu826OnzWV71964rsqrb65ez83/mYdr23v5n1HVnH7Jcd7Ph00VYJB5fHX3uIHK95g975+PnzC4Zw5L/ED4qVFBXz05FnkZ8CYwIa39nqyY2JBXh4fOWEm5aVWAiVXiMgrqlof8zxLJJkjGFQefHk733/6DYaCQb523lF8dnFt1HLmw0Hll6tb+fc/vEFQ4V8+dBRXL6rNiB+E8Vrd7Oe7v9/Ehrf2cdLscr554bHU21RUYxLGEkmYbEgkjXsOsPTxdaxp6eI98yv53kdPiGth3Fvdvdz8xHqe39LOSXOm8f1LT+Dow6cmIWLvtHT08G9Pb2bFxt3MLC/mxiVHc9FJR9gMImMSzBJJmExOJANDQX7y5ybu+lMjJUX53PyRY/j4qbMPadxDVVn2+k5u/e0m9vYO8oX313HdOfMpLsz3MPLE2xsY5L//tJVfvNhCYX4e//SBOj73nnmUFGXW9zAmU8SbSGywPY29ur2Lm36zjjd3H+DCE2fyrb87blxTLkWEixfO4n0Lqrjt95u46/lGlq/fxb997ATO8GA8IdEGh4M8uLqVH/5xK3t7B7m8fg5fPe9IZkwtTnVoxhisRZKWevqH+MGKLTzwYguHTy3mtouP54PHHpaw+//lzXa+/oRTJ+kTZ1Rz0wVHM7U4/QZQVZU/vbGH25dvprm9h0V1Pm7+yLEce0Rmd80ZkymsaytMJiWS57fs4eYnNrBzby+fPrOGGz50FFM8+CEfGBjiv555k/tWbqNqyiRuvfh4PnTc4Qn/nPHatHMfty/fxMpGP/OqyvjGh4/hnKNn5MRUZmPShSWSMJmQSPwH+rn1d5t4au1O5s+YzB0fOyEpM5Be39HNjb9Zxxtv7+eC4w/nOxcdl9Iuoz37+/jPFW/yyCs7KC8p5CvnLuCTZ9ZQGGVmmjHGW5ZIwq
RzIlFVnnjtLW773SYO9A/xTx+Yzz+dXcekguQNIA8OB7nnL8386I9bmVSQxzc+fAyXnzYnqb/99w0Oc+9fm/nxC00MDge56qxa/vmcBbZmwZgUskQSJl0TyY7OAF9/Yj1/3drBydXT+P6lJ3JkCvcCb24/wNLH1/PStk7OnFfBv33sROZWlnn6mcGg8tt1O/n+02+wc28fS447nJsuOJpajz/XGBObJZIw6ZZIhoPKz1du4z+feZM8gX9dcjSfOrMmLRYLBoPKww07+N7yzQwMBfnyBxdwzXvnedK11NDSyW2/38zrO7o5ftZUbv7IsZ6sSjfGjE9aJBIRWQL8CGer3XtV9Y5Rzvs48Chwmqo2iMgngRvCTjkROEVV14rIC8BMoNd973xVHbPk7XgTydefWM/L2zpjn3iIDvQN8fa+Ps4+qorvfvQEZk0rSfhnTNTufX1866mN/GHj2xw2dVLCB/yDQaW5o4fDpk7iXz90NB89eZYtKDQmzaR8HYmI5AN3A+cBbcAaEVmmqpsizpsCXA+8FDqmqg8CD7rvnwA8paprwy77pLvlrqdmTSvhKC+6mgSWHHc4F544M21nIR02tZiffPpU/rDhbX63bide/L7xsVNm8Q/vmUtpkS1nMiaTefl/8OlAo6o2A4jIQ8DFwKaI824D/h34l1HucyXwa6+CHMuXzp6fio9NK0uOP5wlx6fPtGBjTPrxck7lLGBH2Os299gIETkZmKOqvxvjPpdzcCL5uYisFZFvyii/0ovItSLSICIN7e3t4wjfGGNMPLxMJNF+wI90kIhIHnAn8LVRbyByBhBQ1Q1hhz+pqicA73Ufn452rareo6r1qlpfVVU1nviNMcbEwctE0gbMCXs9G9gZ9noKcDzwgoi0AGcCy0QkfGDnCiJaI6r6lvvnfuBXOF1oxhhjUsTLRLIGWCAic0WkCCcpLAu9qap7VbVSVWtVtRZYDVwUGkR3Wyx/DzwUukZECkSk0n1eCFwIhLdWjDHGJJlng+2qOiQi1wErcKb/3qeqG0XkVqBBVZeNfQfeB7SFButdk4AVbhLJB54DfupB+MYYY+JkCxKNMcZEFe86EquEZ4wxZkIskRhjjJmQnOjaEpF2oHWcl1cCHQkMx2uZFK/F6p1MijeTYoXMineisdaoasz1EzmRSCZCRBri6SNMF5kUr8XqnUyKN5NihcyKN1mxWteWMcaYCbFEYowxZkIskcR2T6oDOESZFK/F6p1MijeTYoXMijcpsdoYiTHGmAmxFokxxpgJsUQyBhFZIiJbRKRRRG5KdTyjEZE5IvK8iGwWkY0i8uVUxxSLiOSLyGsiMtYWAmlBRKaJyGMi8ob7d3xWqmMajYj8H/ffwAYR+bWIFKc6pnAicp+I7BGRDWHHKkTkWRHZ6v45PZUxhhsl3h+4/xbWicgTIjItlTGGRIs17L1/EREN1SpMNEskowjb4fEC4FjgShE5NrVRjWoI+JqqHoNTRflLaRxryJeBzakOIk4/Av6gqkcDJ5GmcYvILJzdRutV9XicenRXpDaqg9wPLIk4dhPwR1VdAPzRfZ0u7ufgeJ8FjlfVE4E3gaXJDmoU93NwrIjIHJydard79cGWSEY3ssOjqg7gVCG+OMUxRaWqu1T1Vff5fpwfdLPGvip1RGQ28BHg3lTHEouITMUpIPozAFUdUNXu1EY1pgKgREQKgFLevXVDyqnqX4DOiMMXAw+4zx8ALklqUGOIFq+qPqOqQ+7L1ThbZKTcKH+34Oz79K+E7QeVaJZIRhdzh8d0JCK1wMnAS6mNZEw/xPmHHUx1IHGYB7Tj7Mr5mojcKyJlqQ4qGnevnv/A+c1zF7BXVZ9JbVRxOUxVd4HzSxEwI8XxHIp/AJ5OdRCjEZGLgLdU9XUvP8cSyejG3OExHYnIZOA3wFdUdV+q44lGRC4E9qjqK6mOJU4FwCnA/6jqyUAP6dX1MsIdW7gYmAscAZSJyKdSG1X2EpFv4HQrP5jqWK
IRkVLgG8AtXn+WJZLRxdrhMa24e7T8BnhQVR9PdTxjWAxc5O6K+RBwjoj8MrUhjakNZ1+cUAvvMZzEko4+CGxT1XZVHQQeBxalOKZ47BaRmQDun3tSHE9MInIVzsZ6n9T0XUNRh/NLxevu/2+zgVdF5PBEf5AlktGNucNjOhERwenD36yq/5XqeMaiqktVdba7K+YVwJ9UNW1/a1bVt4EdInKUe+hcYFMKQxrLduBMESl1/02cS5pODIiwDLjKfX4V8FQKY4lJRJYAN+Ls6BpIdTyjUdX1qjojbBfaNuAU9990QlkiGYU7mBba4XEz8IiqbkxtVKNaDHwa57f7te7jw6kOKov8M/CgiKwDFgLfS3E8UbmtpseAV4H1OP9/p9UqbBH5NfAicJSItInI54A7gPNEZCvO7KI7UhljuFHivQuYAjzr/r/2k5QG6Rol1uR8dvq2yowxxmQCa5EYY4yZEEskxhhjJsQSiTHGmAmxRGKMMWZCLJEYY4yZEEskJquJyLA7RXODiDzqrvZNORH5ego+834R+XiyP9dkP0skJtv1qupCtxruAPCFeC90K0B75ZATicfxGDNulkhMLvkrMB9ARJ4UkVfcvTuuDZ0gIgdE5FYReQk4S0RuEZE1bovmHnfFOCLygojcKSJ/cfcoOU1EHnf31Phu2P0+JSIvu62i/3X3YbkDp0LvWhF5cLTzosUTdt9jROTlsNe17oJJRos5nIi0hPamEJF6EXnBfV7m7muxxi1SmZYVr016sURicoJbVv0CnBXfAP+gqqcC9cD1IuJzj5cBG1T1DFX9G3CXqp7mtmhKcOorhQyo6vuAn+CU9fgScDxwtYj4ROQY4HJgsaouBIZxajPdxDstpU+Odt4o8QCgqpuBIhGZ5x66HHjEfT5WzLF8A6dszWnA2cAP0rXasUkfBakOwBiPlYjIWvf5X3H3FcFJHh91n88BFgB+nB/ivwm7/mwR+VecvT0qgI3Ab933QrXX1gMbQ6XQRaTZved7gFOBNW6joIToBQnPHeO8yHjCPQJchlNS5HL3ESvmWM7HKar5L+7rYqCazKjZZVLEEonJdr3ub/kjROQDOJVyz1LVgNutE9qStk9Vh93zioEf4+w4uENEvh12HkC/+2cw7HnodQHOVgQPqGqsHfTGOm8knigeBh4VkccBVdWtccQcMsQ7PRLh7wtwqapuiRGzMSOsa8vkonKgy00iR+NsTxxN6Adshzh7vRzqjKc/Ah8XkRkwsjd5jfveoDil/2OdNypVbcJpsXwTJ6kcSswtOK0ggEvDjq8A/jlsLOjkWHEYY4nE5KI/AAXu4PRtONulHsTdUvenOF1XT+JsLRA3Vd0E3Aw8437Ws8BM9+17gHUi8mCM82J5GPgU7vjIIcT8HeBHIvJXnGQUchtQ6Ma2wX1tzJis+q8xxpgJsRaJMcaYCbFEYowxZkIskRhjjJkQSyTGGGMmxBKJMcaYCbFEYowxZkIskRhjjJkQSyTGGGMm5P8DwE+fuV0P0lMAAAAASUVORK5CYII=\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
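"source": "# for Logistic Regression\nfrom sklearn.linear_model import LogisticRegression\n# NOTE: in the sweep above this setting gave the highest log loss (about 0.669);\n# since lower log loss is better, it is the weakest of the 15 candidates by that metric\nLR = LogisticRegression(C=0.001, solver='liblinear').fit(X_train, y_train)\nLR",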
"execution_count": 61,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 61,
"data": {
"text/plain": "LogisticRegression(C=0.001, class_weight=None, dual=False, fit_intercept=True,\n intercept_scaling=1, max_iter=100, multi_class='warn',\n n_jobs=None, penalty='l2', random_state=None, solver='liblinear',\n tol=0.0001, verbose=0, warm_start=False)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Model Evaluation using Test set"
},
{
"metadata": {},
"cell_type": "code",
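"source": "# NOTE: jaccard_similarity_score was deprecated in scikit-learn 0.21 and removed in 0.23;\n# for single-label targets such as loan_status it is equivalent to accuracy_score\nfrom sklearn.metrics import jaccard_similarity_score\nfrom sklearn.metrics import f1_score\nfrom sklearn.metrics import log_loss",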
"execution_count": 62,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "First, download and load the test set:"
},
{
"metadata": {},
"cell_type": "code",
"source": "!wget -O loan_test.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_test.csv",
"execution_count": 63,
"outputs": [
{
"output_type": "stream",
"text": "--2019-11-25 07:31:06-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_test.csv\nResolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196\nConnecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 3642 (3.6K) [text/csv]\nSaving to: \u2018loan_test.csv\u2019\n\n100%[======================================>] 3,642 --.-K/s in 0s \n\n2019-11-25 07:31:06 (507 MB/s) - \u2018loan_test.csv\u2019 saved [3642/3642]\n\n",
"name": "stdout"
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Load test set for evaluation"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "test_df = pd.read_csv('loan_test.csv')\ntest_df.head()",
"execution_count": 64,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 64,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 1 1 PAIDOFF 1000 30 9/8/2016 \n1 5 5 PAIDOFF 300 7 9/9/2016 \n2 21 21 PAIDOFF 1000 30 9/10/2016 \n3 24 24 PAIDOFF 1000 30 9/10/2016 \n4 35 35 PAIDOFF 800 15 9/11/2016 \n\n due_date age education Gender \n0 10/7/2016 50 Bechalor female \n1 9/15/2016 35 Master or Above male \n2 10/9/2016 43 High School or Below female \n3 10/9/2016 26 college male \n4 9/25/2016 29 Bechalor male ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1</td>\n <td>1</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/8/2016</td>\n <td>10/7/2016</td>\n <td>50</td>\n <td>Bechalor</td>\n <td>female</td>\n </tr>\n <tr>\n <th>1</th>\n <td>5</td>\n <td>5</td>\n <td>PAIDOFF</td>\n <td>300</td>\n <td>7</td>\n <td>9/9/2016</td>\n <td>9/15/2016</td>\n <td>35</td>\n <td>Master or Above</td>\n <td>male</td>\n </tr>\n <tr>\n <th>2</th>\n <td>21</td>\n <td>21</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/10/2016</td>\n <td>10/9/2016</td>\n <td>43</td>\n <td>High School or Below</td>\n <td>female</td>\n </tr>\n <tr>\n <th>3</th>\n <td>24</td>\n <td>24</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/10/2016</td>\n <td>10/9/2016</td>\n <td>26</td>\n <td>college</td>\n <td>male</td>\n </tr>\n <tr>\n <th>4</th>\n <td>35</td>\n <td>35</td>\n <td>PAIDOFF</td>\n <td>800</td>\n <td>15</td>\n <td>9/11/2016</td>\n <td>9/25/2016</td>\n <td>29</td>\n <td>Bechalor</td>\n <td>male</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# convert date columns to datetime\ntest_df['due_date'] = pd.to_datetime(test_df['due_date'])\ntest_df['effective_date'] = pd.to_datetime(test_df['effective_date'])\ntest_df['dayofweek'] = test_df['effective_date'].dt.dayofweek\n# flag loans issued Friday through Sunday (dayofweek > 3) as weekend\ntest_df['weekend'] = test_df['dayofweek'].apply(lambda x: 1 if (x>3) else 0)\n# convert male to 0 and female to 1\ntest_df['Gender'].replace(to_replace=['male','female'], value=[0,1],inplace=True)\n# one-hot encode education level\ntest_feature = test_df[['Principal','terms','age','Gender','weekend']]\ntest_feature = pd.concat([test_feature,pd.get_dummies(test_df['education'])], axis=1)\ntest_feature.drop(['Master or Above'], axis = 1,inplace=True)\n# test features\nX_loan_test = test_feature\n# standardize the test data\n# NOTE: this fits a new scaler on the test set itself; if the training features were\n# scaled with their own fitted scaler, reusing that scaler here would keep both sets on the same scale\nX_loan_test = preprocessing.StandardScaler().fit(X_loan_test).transform(X_loan_test)\n# target labels\ny_loan_test = test_df['loan_status'].values\nprint(X_loan_test[0:5])\nprint(X_loan_test.shape)\nprint(y_loan_test[0:5])\nprint(y_loan_test.shape)",
"execution_count": 65,
"outputs": [
{
"output_type": "stream",
"text": "[[ 0.49362588 0.92844966 3.05981865 1.97714211 -1.30384048 2.39791576\n -0.79772404 -0.86135677]\n [-3.56269116 -1.70427745 0.53336288 -0.50578054 0.76696499 -0.41702883\n -0.79772404 -0.86135677]\n [ 0.49362588 0.92844966 1.88080596 1.97714211 0.76696499 -0.41702883\n 1.25356634 -0.86135677]\n [ 0.49362588 0.92844966 -0.98251057 -0.50578054 0.76696499 -0.41702883\n -0.79772404 1.16095912]\n [-0.66532184 -0.78854628 -0.47721942 -0.50578054 0.76696499 2.39791576\n -0.79772404 -0.86135677]]\n(54, 8)\n['PAIDOFF' 'PAIDOFF' 'PAIDOFF' 'PAIDOFF' 'PAIDOFF']\n(54,)\n",
"name": "stdout"
},
{
"output_type": "stream",
"text": "/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n return self.partial_fit(X, y)\n/opt/conda/envs/Python36/lib/python3.6/site-packages/ipykernel/__main__.py:15: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n",
"name": "stderr"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# Jaccard setup\nfrom sklearn.metrics import jaccard_similarity_score\n\n# evaluate KNN\nknn_yhat = KNN.predict(X_loan_test)\njc1 = round(jaccard_similarity_score(y_loan_test, knn_yhat), 2)\n# evaluate Decision Trees\ndt_yhat = DT.predict(X_loan_test)\njc2 = round(jaccard_similarity_score(y_loan_test, dt_yhat), 2)\n#evaluate SVM\nsvm_yhat = SVM.predict(X_loan_test)\njc3 = round(jaccard_similarity_score(y_loan_test, svm_yhat), 2)\n# evaluate Logistic Regression\nlr_yhat = LR.predict(X_loan_test)\njc4 = round(jaccard_similarity_score(y_loan_test, lr_yhat), 2)\n\nlist_jc = [jc1, jc2, jc3, jc4]\nlist_jc",
"execution_count": 66,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 66,
"data": {
"text/plain": "[0.67, 0.74, 0.8, 0.78]"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# F1-score setup\nfrom sklearn.metrics import f1_score\n\n# evaluate KNN\nfs1 = round(f1_score(y_loan_test, knn_yhat, average='weighted'), 2)\n# evaluate Desision Trees \nfs2 = round(f1_score(y_loan_test, dt_yhat, average='weighted'), 2)\n# evaluate SVM\nfs3 = round(f1_score(y_loan_test, svm_yhat, average='weighted'), 2)\n# evaluate Logistic Regression\nfs4 = round(f1_score(y_loan_test, lr_yhat, average='weighted'),2 )\n\nlist_fs = [fs1, fs2, fs3, fs4]\nlist_fs",
"execution_count": 67,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 67,
"data": {
"text/plain": "[0.63, 0.76, 0.76, 0.73]"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# LogLoss\nfrom sklearn.metrics import log_loss\nlr_prob = LR.predict_proba(X_loan_test)\nlist_ll = ['NA', 'NA', 'NA', round(log_loss(y_loan_test, lr_prob), 2)]\nlist_ll",
"execution_count": 68,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 68,
"data": {
"text/plain": "['NA', 'NA', 'NA', 0.67]"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Report\nYou should be able to report the accuracy of the built model using different evaluation metrics:"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "| Algorithm | Jaccard | F1-score | LogLoss |\n|--------------------|---------|----------|---------|\n| KNN | ? | ? | NA |\n| Decision Tree | ? | ? | NA |\n| SVM | ? | ? | NA |\n| LogisticRegression | ? | ? | ? |"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# fomulate the report format\ndf = pd.DataFrame(list_jc, index=['KNN','Decision Tree','SVM','Logistic Regression'])\ndf.columns = ['Jaccard']\ndf.insert(loc=1, column='F1-score', value=list_fs)\ndf.insert(loc=2, column='LogLoss', value=list_ll)\ndf.columns.name = 'Algorithm'\ndf",
"execution_count": 69,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 69,
"data": {
"text/plain": "Algorithm Jaccard F1-score LogLoss\nKNN 0.67 0.63 NA\nDecision Tree 0.74 0.76 NA\nSVM 0.80 0.76 NA\nLogistic Regression 0.78 0.73 0.67",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th>Algorithm</th>\n <th>Jaccard</th>\n <th>F1-score</th>\n <th>LogLoss</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>KNN</th>\n <td>0.67</td>\n <td>0.63</td>\n <td>NA</td>\n </tr>\n <tr>\n <th>Decision Tree</th>\n <td>0.74</td>\n <td>0.76</td>\n <td>NA</td>\n </tr>\n <tr>\n <th>SVM</th>\n <td>0.80</td>\n <td>0.76</td>\n <td>NA</td>\n </tr>\n <tr>\n <th>Logistic Regression</th>\n <td>0.78</td>\n <td>0.73</td>\n <td>0.67</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "<h2>Want to learn more?</h2>\n\nIBM SPSS Modeler is a comprehensive analytics platform that has many machine learning algorithms. It has been designed to bring predictive intelligence to decisions made by individuals, by groups, by systems \u2013 by your enterprise as a whole. A free trial is available through this course, available here: <a href=\"http://cocl.us/ML0101EN-SPSSModeler\">SPSS Modeler</a>\n\nAlso, you can use Watson Studio to run these notebooks faster with bigger datasets. Watson Studio is IBM's leading cloud solution for data scientists, built by data scientists. With Jupyter notebooks, RStudio, Apache Spark and popular libraries pre-packaged in the cloud, Watson Studio enables data scientists to collaborate on their projects without having to install anything. Join the fast-growing community of Watson Studio users today with a free account at <a href=\"https://cocl.us/ML0101EN_DSX\">Watson Studio</a>\n\n<h3>Thanks for completing this lesson!</h3>\n\n<h4>Author: <a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a></h4>\n<p><a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a>, PhD is a Data Scientist in IBM with a track record of developing enterprise level applications that substantially increases clients\u2019 ability to turn data into actionable knowledge. He is a researcher in data mining field and expert in developing advanced analytic methods like machine learning and statistical modelling on large datasets.</p>\n\n<hr>\n\n<p>Copyright &copy; 2018 <a href=\"https://cocl.us/DX0108EN_CC\">Cognitive Class</a>. This notebook and its source code are released under the terms of the <a href=\"https://bigdatauniversity.com/mit-license/\">MIT License</a>.</p>"
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.6",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.6.8",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat": 4,
"nbformat_minor": 2
}