Created
August 18, 2019 20:55
Created on Cognitive Class Labs
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"| Name | Description | Date\n", | |
"| :- |-------------: | :-:\n", | |
"|Reza Hashemi| IBM Machine Learning with Python. | On 8th of August 2019\n", | |
"\n", | |
"\n", | |
"<h1 align=\"center\"><font size=\"5\">Classification with Python</font></h1>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"In this notebook we try to practice all the classification algorithms that we learned in this course.\n", | |
"\n", | |
"I will load a dataset using Pandas library, and apply the following algorithms, and find the best one for this specific dataset by accuracy evaluation methods.\n", | |
"\n", | |
"Lets first load required libraries:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"done\n" | |
] | |
} | |
], | |
"source": [ | |
"import itertools\n", | |
"import numpy as np\n", | |
"import matplotlib.pyplot as plt\n", | |
"from matplotlib.ticker import NullFormatter\n", | |
"import pandas as pd\n", | |
"import numpy as np\n", | |
"import matplotlib.ticker as ticker\n", | |
"from sklearn import preprocessing\n", | |
"%matplotlib inline\n", | |
"print('done')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"### About dataset" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"This dataset is about past loans. The __Loan_train.csv__ data set includes details of 346 customers whose loan are already paid off or defaulted. It includes following fields:\n", | |
"\n", | |
"| Field | Description |\n", | |
"|----------------|---------------------------------------------------------------------------------------|\n", | |
"| Loan_status | Whether a loan is paid off on in collection |\n", | |
"| Principal | Basic principal loan amount at the |\n", | |
"| Terms | Origination terms which can be weekly (7 days), biweekly, and monthly payoff schedule |\n", | |
"| Effective_date | When the loan got originated and took effects |\n", | |
"| Due_date | Since it’s one-time payoff schedule, each loan has one single due date |\n", | |
"| Age | Age of applicant |\n", | |
"| Education | Education of applicant |\n", | |
"| Gender | The gender of applicant |" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Lets download the dataset" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"--2019-08-08 18:42:50-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv\n", | |
"Resolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.193\n", | |
"Connecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.193|:443... connected.\n", | |
"HTTP request sent, awaiting response... 200 OK\n", | |
"Length: 23101 (23K) [text/csv]\n", | |
"Saving to: ‘loan_train.csv’\n", | |
"\n", | |
"loan_train.csv 100%[===================>] 22.56K --.-KB/s in 0.03s \n", | |
"\n", | |
"2019-08-08 18:42:50 (771 KB/s) - ‘loan_train.csv’ saved [23101/23101]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"!wget -O loan_train.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"### Load Data From CSV File " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 50, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/8/2016</td>\n", | |
" <td>10/7/2016</td>\n", | |
" <td>45</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2</td>\n", | |
" <td>2</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/8/2016</td>\n", | |
" <td>10/7/2016</td>\n", | |
" <td>33</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3</td>\n", | |
" <td>3</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>9/8/2016</td>\n", | |
" <td>9/22/2016</td>\n", | |
" <td>27</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4</td>\n", | |
" <td>4</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/9/2016</td>\n", | |
" <td>10/8/2016</td>\n", | |
" <td>28</td>\n", | |
" <td>college</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>6</td>\n", | |
" <td>6</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/9/2016</td>\n", | |
" <td>10/8/2016</td>\n", | |
" <td>29</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 0 0 PAIDOFF 1000 30 9/8/2016 \n", | |
"1 2 2 PAIDOFF 1000 30 9/8/2016 \n", | |
"2 3 3 PAIDOFF 1000 15 9/8/2016 \n", | |
"3 4 4 PAIDOFF 1000 30 9/9/2016 \n", | |
"4 6 6 PAIDOFF 1000 30 9/9/2016 \n", | |
"\n", | |
" due_date age education Gender \n", | |
"0 10/7/2016 45 High School or Below male \n", | |
"1 10/7/2016 33 Bechalor female \n", | |
"2 9/22/2016 27 college male \n", | |
"3 10/8/2016 28 college female \n", | |
"4 10/8/2016 29 college male " | |
] | |
}, | |
"execution_count": 50, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df = pd.read_csv('loan_train.csv')\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 51, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(346, 10)" | |
] | |
}, | |
"execution_count": 51, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.shape" | |
] | |
}, | |
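{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The two `Unnamed` columns shown by `df.head()` look like leftover row indices from an earlier export rather than loan features. As a minimal sketch, assuming they carry no information needed later, they could be dropped into a separate frame (`df_features` is a hypothetical name, not part of the original notebook):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Hypothetical helper frame: `df` itself stays unchanged for the later cells.\n", | |
"# errors='ignore' avoids a KeyError if the columns are not present.\n", | |
"df_features = df.drop(columns=['Unnamed: 0', 'Unnamed: 0.1'], errors='ignore')\n", | |
"df_features.shape" | |
] | |
}, | |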
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"### Convert to date time object " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 52, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>45</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2</td>\n", | |
" <td>2</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>33</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3</td>\n", | |
" <td>3</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-09-22</td>\n", | |
" <td>27</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4</td>\n", | |
" <td>4</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-10-08</td>\n", | |
" <td>28</td>\n", | |
" <td>college</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>6</td>\n", | |
" <td>6</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-10-08</td>\n", | |
" <td>29</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 0 0 PAIDOFF 1000 30 2016-09-08 \n", | |
"1 2 2 PAIDOFF 1000 30 2016-09-08 \n", | |
"2 3 3 PAIDOFF 1000 15 2016-09-08 \n", | |
"3 4 4 PAIDOFF 1000 30 2016-09-09 \n", | |
"4 6 6 PAIDOFF 1000 30 2016-09-09 \n", | |
"\n", | |
" due_date age education Gender \n", | |
"0 2016-10-07 45 High School or Below male \n", | |
"1 2016-10-07 33 Bechalor female \n", | |
"2 2016-09-22 27 college male \n", | |
"3 2016-10-08 28 college female \n", | |
"4 2016-10-08 29 college male " | |
] | |
}, | |
"execution_count": 52, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['due_date'] = pd.to_datetime(df['due_date'])\n", | |
"df['effective_date'] = pd.to_datetime(df['effective_date'])\n", | |
"df.head()" | |
] | |
}, | |
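{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"With both columns parsed as datetimes, date arithmetic becomes available. A small sketch, assuming the conversion cell above has already run, that derives the loan length in days and places it next to `terms` for comparison (`loan_days` is a hypothetical name):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Subtracting two datetime64 columns yields a timedelta column;\n", | |
"# .dt.days extracts the whole-day component as integers.\n", | |
"loan_days = (df['due_date'] - df['effective_date']).dt.days\n", | |
"pd.DataFrame({'terms': df['terms'], 'loan_days': loan_days}).head()" | |
] | |
}, | |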
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"# Data visualization and pre-processing\n", | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Let’s see how many of each class is in our data set " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"PAIDOFF 260\n", | |
"COLLECTION 86\n", | |
"Name: loan_status, dtype: int64" | |
] | |
}, | |
"execution_count": 6, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['loan_status'].value_counts()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"260 people have paid off the loan on time while 86 have gone into collection \n" | |
] | |
}, | |
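{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"These counts mean the classes are imbalanced: 260 of 346 loans (about 75%) are `PAIDOFF`. As a rough sketch, the same shares can be read directly with `normalize=True`. A classifier that always predicts `PAIDOFF` would already reach roughly that accuracy, which is a useful baseline to keep in mind when comparing models:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# normalize=True returns class proportions instead of raw counts.\n", | |
"df['loan_status'].value_counts(normalize=True)" | |
] | |
}, | |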
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Lets plot some columns to underestand data better:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Solving environment: done\n", | |
"\n", | |
"\n", | |
"==> WARNING: A newer version of conda exists. <==\n", | |
" current version: 4.5.11\n", | |
" latest version: 4.7.11\n", | |
"\n", | |
"Please update conda by running\n", | |
"\n", | |
" $ conda update -n base -c defaults conda\n", | |
"\n", | |
"\n", | |
"\n", | |
"## Package Plan ##\n", | |
"\n", | |
" environment location: /home/jupyterlab/conda/envs/python\n", | |
"\n", | |
" added / updated specs: \n", | |
" - seaborn\n", | |
"\n", | |
"\n", | |
"The following packages will be downloaded:\n", | |
"\n", | |
" package | build\n", | |
" ---------------------------|-----------------\n", | |
" certifi-2019.6.16 | py36_1 156 KB anaconda\n", | |
" matplotlib-3.1.0 | py36h5429711_0 6.7 MB anaconda\n", | |
" seaborn-0.9.0 | py36_0 379 KB anaconda\n", | |
" sip-4.19.13 | py36he6710b0_0 293 KB anaconda\n", | |
" openssl-1.0.2s | h7b6447c_0 3.1 MB anaconda\n", | |
" pyqt-5.9.2 | py36h22d08a2_1 5.6 MB anaconda\n", | |
" ------------------------------------------------------------\n", | |
" Total: 16.2 MB\n", | |
"\n", | |
"The following packages will be UPDATED:\n", | |
"\n", | |
" certifi: 2019.6.16-py36_1 conda-forge --> 2019.6.16-py36_1 anaconda\n", | |
" matplotlib: 2.2.3-py37hb69df0a_0 --> 3.1.0-py36h5429711_0 anaconda\n", | |
" openssl: 1.0.2r-h14c3975_0 conda-forge --> 1.0.2s-h7b6447c_0 anaconda\n", | |
" sip: 4.19.8-py37hf484d3e_0 --> 4.19.13-py36he6710b0_0 anaconda\n", | |
"\n", | |
"The following packages will be DOWNGRADED:\n", | |
"\n", | |
" pyqt: 5.9.2-py37h05f1152_2 --> 5.9.2-py36h22d08a2_1 anaconda\n", | |
" seaborn: 0.9.0-py_1 conda-forge --> 0.9.0-py36_0 anaconda\n", | |
"\n", | |
"\n", | |
"Downloading and Extracting Packages\n", | |
"certifi-2019.6.16 | 156 KB | ##################################### | 100% \n", | |
"matplotlib-3.1.0 | 6.7 MB | ##################################### | 100% \n", | |
"seaborn-0.9.0 | 379 KB | ##################################### | 100% \n", | |
"sip-4.19.13 | 293 KB | ##################################### | 100% \n", | |
"openssl-1.0.2s | 3.1 MB | ##################################### | 100% \n", | |
"pyqt-5.9.2 | 5.6 MB | ##################################### | 100% \n", | |
"Preparing transaction: done\n", | |
"Verifying transaction: done\n", | |
"Executing transaction: done\n" | |
] | |
} | |
], | |
"source": [ | |
"# notice: installing seaborn might takes a few minutes\n", | |
"!conda install -c anaconda seaborn -y" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAADQCAYAAABStPXYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAbQUlEQVR4nO3de5hU1Znv8e9P7NgoREVaRRC7RRRBmQZ7NF6HwEhQ420cDcYonDiHaLwcJnq8nsc48YnxQmKS4xVHDplEQEMGdEiiEiNHMfHSYIvghXhptRUQSE7UIATwPX/U7k7RVtOXqu7eXfX7PM9+ateqvdd+i67FW3vtXWspIjAzM0ubHbo7ADMzs1ycoMzMLJWcoMzMLJWcoMzMLJWcoMzMLJWcoMzMLJWcoDqJpL0kzZL0pqQlkn4v6fQC1T1G0oJC1NUVJC2SVNPdcVj3KKa2IKlC0rOSXpB0bCce5+POqrsncYLqBJIEzAeejIj9I+IwYCIwqJvi2bE7jmtWhG1hHPBqRIyKiKcKEZO1zAmqc4wF/hoRdzcWRMTbEfG/AST1knSrpOclLZP0jaR8THK2MVfSq5LuTxo4kiYkZYuBf2qsV9IukmYkdb0g6dSkfLKkn0v6L+CxfN6MpJmS7pL0RPIt+B+SY74iaWbWdndJqpW0QtK/tVDX+OQb9NIkvj75xGapVzRtQVI1cAtwoqQ6Sb1b+jxLqpd0Y/JaraTRkh6V9IakC5Jt+kh6PNn3pcZ4cxz3f2b9++RsV0UrIrwUeAEuBW7bzutTgP+VrO8E1AJVwBjgz2S+Xe4A/B44BigH3gWGAgIeBBYk+98IfC1Z3w1YCewCTAYagH4txPAUUJdj+ccc284E5iTHPhX4EDg0iXEJUJ1s1y957AUsAkYmzxcBNUB/4Elgl6T8SuC67v57eem8pQjbwmTg9mS9xc8zUA9cmKzfBiwD+gIVwAdJ+Y7A57Pqeh1Q8vzj5HE8MD15rzsAC4Djuvvv2lWLu366gKQ7yDSuv0bE35P50I2U9M/JJruSaXB/BZ6LiIZkvzqgEvgYeCsi/pCU/4xMwyap6xRJlyfPy4HByfrCiPhjrpgior395/8VESHpJWBNRLyUxLIiibEOOEvSFDINbwAwnEzDbPSFpOzp5Mvw58j8x2MlokjaQqPWPs8PJ48vAX0i4iPgI0kbJe0G/AW4UdJxwKfAQGAvYHVWHeOT5YXkeR8y/z5PdjDmHsUJqnOsAM5ofBIRF0nqT+bbIWS+DV0SEY9m7yRpDLApq2grf/sbtTRoooAzIuK1ZnUdQaYB5N5JeorMN7rmLo+I3+Qob4zr02YxfgrsKKkKuBz4+4j4U9L1V54j1oURcXZLcVnRKca2kH287X2et9tmgHPInFEdFhGbJdWTu818LyLu2U4cRcvXoDrHb4FySRdmle2ctf4ocKGkMgBJB0raZTv1vQpUSRqSPM9uEI8Cl2T1z49qS4ARcWxEVOdYttcgt+fzZP4T+LOkvYATcmzzDHC0pAOSWHeWdGAHj2c9QzG3hXw/z7uS6e7bLOmLwH45tnkU+HrWta2BkvZsxzF6NCeoThCZzuPTgH+Q9Jak54CfkOmjBvh34GVgqaTlwD1s52w2IjaS6cb4ZXJh+O2sl28AyoBlSV03FPr9tEVEvEimG2IFMAN4Osc2a8n04c+WtIxMAx/WhWFaFyvmtlCAz/P9QI2kWjJnU6/mOMZjwCzg90n3+lxyn+0VpcYLcmZmZqniMygzM0slJygzM0slJygzM0slJygzM0ulVCSoCRMmBJnfNnjxUixLwbh9eCmypc1SkaDWrVvX3SGYpZbbh5WqVCQoMzOz5pygzMwslZygzMwslTxYrJkVlc2bN9PQ0MDGjRu7O5SSVl5ezqBBgygrK+twHU5QZlZUGhoa6Nu3L5WVlSTjxloXiwjWr19PQ0MDVVVVHa7HXXxmVlQ2btzIHn
vs4eTUjSSxxx575H0W6wRlJWO/AQOQlPey34AB3f1WrBVOTt2vEH8Dd/FZyXhn9Woa9hmUdz2D3m8oQDRm1hqfQZlZUSvUmXN7zqB79epFdXU1hxxyCGeeeSYbNmxoem3evHlI4tVX/zb9U319PYcccggAixYtYtddd2XUqFEcdNBBHHfccSxYsGCb+qdPn86wYcMYNmwYhx9+OIsXL256bcyYMRx00EFUV1dTXV3N3Llzt4mpcamvr8/nn7VL+AzKzIpaoc6cG7XlDLp3797U1dUBcM4553D33XfzrW99C4DZs2dzzDHHMGfOHK6//vqc+x977LFNSamuro7TTjuN3r17M27cOBYsWMA999zD4sWL6d+/P0uXLuW0007jueeeY++99wbg/vvvp6ampsWYeopWz6AkzZD0QTJDZWPZ9ZLek1SXLCdmvXa1pNclvSbpS50VuJlZT3Dsscfy+uuvA/Dxxx/z9NNPc9999zFnzpw27V9dXc11113H7bffDsDNN9/MrbfeSv/+/QEYPXo0kyZN4o477uicN9CN2tLFNxOYkKP8toioTpZfAUgaDkwERiT73CmpV6GCNTPrSbZs2cKvf/1rDj30UADmz5/PhAkTOPDAA+nXrx9Lly5tUz2jR49u6hJcsWIFhx122Dav19TUsGLFiqbn55xzTlNX3vr16wH45JNPmspOP/30Qry9TtdqF19EPCmpso31nQrMiYhNwFuSXgcOB37f4QjNzHqYxmQAmTOo888/H8h0702dOhWAiRMnMnv2bEaPHt1qfRHbHwQ8Ira5a65YuvjyuQZ1saTzgFrgsoj4EzAQeCZrm4ak7DMkTQGmAAwePDiPMMyKj9tHz5YrGaxfv57f/va3LF++HEls3boVSdxyyy2t1vfCCy9w8MEHAzB8+HCWLFnC2LFjm15funQpw4cPL+ybSIGO3sV3FzAEqAZWAd9PynPd+J4z9UfE9IioiYiaioqKDoZhVpzcPorP3LlzOe+883j77bepr6/n3Xffpaqqaps78HJZtmwZN9xwAxdddBEAV1xxBVdeeWVT111dXR0zZ87km9/8Zqe/h67WoTOoiFjTuC7pXqDxHsgGYN+sTQcB73c4OjOzPA3ee++C/nZtcHKnXHvNnj2bq666apuyM844g1mzZnHllVduU/7UU08xatQoNmzYwJ577smPf/xjxo0bB8App5zCe++9x1FHHYUk+vbty89+9jMGFOEPyNVa3yZAcg1qQUQckjwfEBGrkvV/BY6IiImSRgCzyFx32gd4HBgaEVu3V39NTU3U1tbm8z7MWiWpYD/UbUO7KdhQBm4f7fPKK680dYdZ92rhb9HmttHqGZSk2cAYoL+kBuDbwBhJ1WS67+qBbwBExApJDwIvA1uAi1pLTmZmZrm05S6+s3MU37ed7b8LfDefoMzMzDzUkZmZpZITlJmZpZITlJmZpZITlJmZpZITlJkVtX0GDS7odBv7DGp9ZI/Vq1czceJEhgwZwvDhwznxxBNZuXIlK1asYOzYsRx44IEMHTqUG264oeknCzNnzuTiiy/+TF2VlZWsW7dum7KZM2dSUVGxzfQZL7/8MgArV67kxBNP5IADDuDggw/mrLPO4oEHHmjark+fPk3TcZx33nksWrSIL3/5y011z58/n5EjRzJs2DAOPfRQ5s+f3/Ta5MmTGThwIJs2bQJg3bp1VFZWtvtv0laebsPMitqq997liOseKVh9z34n19jZfxMRnH766UyaNKlpxPK6ujrWrFnD5MmTueuuuxg/fjwbNmzgjDPO4M4772waJaI9vvKVrzSNcN5o48aNnHTSSfzgBz/g5JNPBuCJJ56goqKiaeilMWPGMG3atKax+hYtWtS0/4svvsjll1/OwoULqaqq4q233uL4449n//33Z+TIkUBmXqkZM2Zw4YUXtjvm9vIZlJlZAT3xxBOUlZVxwQUXNJVVV1ezcuVKjj76aMaPHw/AzjvvzO23385NN9
1UsGPPmjWLI488sik5AXzxi19smgyxNdOmTeOaa66hqqoKgKqqKq6++mpuvfXWpm2mTp3KbbfdxpYtWwoWd0ucoMzMCmj58uWfmQ4Dck+TMWTIED7++GM+/PDDdh8nu9uuurqaTz75pMVjt1VbpvIYPHgwxxxzDD/96U87fJy2chefmVkXaD4lRraWyrcnVxdfvnLFmKvsmmuu4ZRTTuGkk04q6PGb8xmUmVkBjRgxgiVLluQsbz6m4ptvvkmfPn3o27dvpx67Pfs3jzHXVB4HHHAA1dXVPPjggx0+Vls4QZmZFdDYsWPZtGkT9957b1PZ888/z9ChQ1m8eDG/+c1vgMykhpdeeilXXHFFwY791a9+ld/97nf88pe/bCp75JFHeOmll9q0/+WXX873vvc96uvrAaivr+fGG2/ksssu+8y21157LdOmTStI3C1xF5+ZFbUBA/dt9c679ta3PZKYN28eU6dO5aabbqK8vJzKykp++MMf8tBDD3HJJZdw0UUXsXXrVs4999xtbi2fOXPmNrd1P/NMZv7XkSNHssMOmfOJs846i5EjR/LAAw9sM5fUnXfeyVFHHcWCBQuYOnUqU6dOpaysjJEjR/KjH/2oTe+turqam2++mZNPPpnNmzdTVlbGLbfc0jQ7cLYRI0YwevToNk9b3xFtmm6js3k6AesKnm6jNHi6jfTId7qNVrv4JM2Q9IGk5Vllt0p6VdIySfMk7ZaUV0r6RFJdstzd1kDMzMyyteUa1Eyg+fnxQuCQiBgJrASuznrtjYioTpYLMDMz64BWE1REPAn8sVnZYxHR+CutZ8hM7W5mlgppuHRR6grxNyjEXXxfB36d9bxK0guS/q+kY1vaSdIUSbWSateuXVuAMMyKh9tHx5WXl7N+/XonqW4UEaxfv57y8vK86snrLj5J15KZ2v3+pGgVMDgi1ks6DJgvaUREfOZn0hExHZgOmYvA+cRhVmzcPjpu0KBBNDQ04MTevcrLyxk0KL/OtQ4nKEmTgC8D4yL5qhIRm4BNyfoSSW8ABwK+BcnMukRZWVnTWHLWs3Woi0/SBOBK4JSI2JBVXiGpV7K+PzAUeLMQgZqZWWlp9QxK0mxgDNBfUgPwbTJ37e0ELEzGaHomuWPvOOA7krYAW4ELIuKPOSs2MzPbjlYTVEScnaP4vha2/QXwi3yDMjMz81h8ZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSk5QZmaWSq0mKEkzJH0gaXlWWT9JCyX9IXncPeu1qyW9Luk1SV/qrMDNzKy4teUMaiYwoVnZVcDjETEUeDx5jqThwERgRLLPnY0z7JqZmbVHqwkqIp4Ems+Keyrwk2T9J8BpWeVzImJTRLwFvA4cXqBYzcyshHT0GtReEbEKIHncMykfCLybtV1DUvYZkqZIqpVUu3bt2g6GYVac3D7MCn+ThHKURa4NI2J6RNRERE1FRUWBwzDr2dw+zDqeoNZIGgCQPH6QlDcA+2ZtNwh4v+PhmZlZqepognoYmJSsTwIeyiqfKGknSVXAUOC5/EI0M7NStGNrG0iaDYwB+ktqAL4N3AQ8KOl84B3gTICIWCHpQeBlYAtwUURs7aTYzcysiLWaoCLi7BZeGtfC9t8FvptPUGZmZh5JwszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUskJyszMUqnV0cxbIukg4IGsov2B64DdgP8ONM5TfU1E/KrDEZqZWUnqcIKKiNeAagBJvYD3gHnAfwNui4hpBYnQzMxKUqG6+MYBb0TE2wWqz8zMSlyhEtREYHbW84slLZM0Q9LuuXaQNEVSraTatWvX5trErGS5fZgVIEFJ+hxwCvDzpOguYAiZ7r9VwPdz7RcR0y
OiJiJqKioq8g3DrKi4fZgV5gzqBGBpRKwBiIg1EbE1Ij4F7gUOL8AxzMysxBQiQZ1NVveepAFZr50OLC/AMczMrMR0+C4+AEk7A8cD38gqvkVSNRBAfbPXzMzM2iSvBBURG4A9mpWdm1dEZmZmeCQJMzNLKScoMzNLJScoMzNLJScoMzNLJScoMzNLJScoMzNLpbxuMzfrSdSrjEHvNxSkHjPrfE5QVjJi62aOuO6RvOt59jsTChCNmbXGXXxmZpZKTlBmZpZKTlBmZpZKTlBmZpZKTlBmZpZKTlBmZpZK+c4HVQ98BGwFtkREjaR+wANAJZn5oM6KiD/lF6aZmZWaQpxBfTEiqiOiJnl+FfB4RAwFHk+eWwnab8AAJOW97DdgQOsHM7Oi0xk/1D0VGJOs/wRYBFzZCcexlHtn9Woa9hmUdz2FGP3BzHqefM+gAnhM0hJJU5KyvSJiFUDyuGeuHSVNkVQrqXbt2rV5hmFWXNw+zPJPUEdHxGjgBOAiSce1dceImB4RNRFRU1FRkWcYZsXF7cMszwQVEe8njx8A84DDgTWSBgAkjx/kG6SZmZWeDicoSbtI6tu4DowHlgMPA5OSzSYBD+UbpJmZlZ58bpLYC5gnqbGeWRHxiKTngQclnQ+8A5yZf5hmZlZqOpygIuJN4O9ylK8HxuUTlJmZmUeSMDOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjOzVHKCMjMrcYWa/brQM2B3xoy6ZmbWgxRq9mso7AzYPoMyM7NUymc+qH0lPSHpFUkrJP2PpPx6Se9JqkuWEwsXrpmZlYp8uvi2AJdFxNJk4sIlkhYmr90WEdPyD8/MzEpVPvNBrQJWJesfSXoFGFiowMzMrLQV5BqUpEpgFPBsUnSxpGWSZkjavYV9pkiqlVS7du3aQoRhVjTcPswKkKAk9QF+AUyNiA+Bu4AhQDWZM6zv59ovIqZHRE1E1FRUVOQbhllRcfswyzNBSSojk5zuj4j/BIiINRGxNSI+Be4FDs8/TDMzKzX53MUn4D7glYj4QVZ59q+0TgeWdzw8MzMrVfncxXc0cC7wkqS6pOwa4GxJ1UAA9cA38orQzMxKUj538S0GlOOlX3U8HDMzswyPJGFmZqnksfis06hXWUHG5VKvsgJEY2Y9jROUdZrYupkjrnsk73qe/c6EAkRjZj2Nu/jMzCyVnKDMzCyVnKDMzCyVnKDMzCyVnKDMzLpYoaZYL+T06mnku/jMzLpYoaZYL+T06mnkMygzM0slJygzM0sld/GZmZW4Qo360lhXoThBmZmVuEKN+gKFHfnFXXxmZpZKnZagJE2Q9Jqk1yVdlW99vi3TzKy0dEoXn6RewB3A8UAD8LykhyPi5Y7W6dsyzcxKS2ddgzoceD0i3gSQNAc4Fehwgkqb/QYM4J3Vq/OuZ/Dee/P2qlUFiKi4SbnmxrQ0cttoXaFuStihV1lRtw1FROErlf4ZmBAR/5I8Pxc4IiIuztpmCjAleXoQ8FrBA2m7/sC6bjx+Phx712tL3OsiosNXi1PUPnrq3wgce3dpLfY2t43OOoPKldK3yYQRMR2Y3knHbxdJtRFR091xdIRj73pdEXda2kdP/RuBY+8uhYy9s26SaAD2zXo+CHi/k45lZmZFqLMS1PPAUElVkj4HTAQe7qRjmZlZEeqULr6I2CLpYuBRoBcwIyJWdMaxCqTbu1Ly4Ni7Xk+NuyN68nt17N2jYLF3yk0SZmZm+fJIEmZmlkpOUGZmlkolk6Ak9ZL0gqQFyfN+khZK+kPyuHvWtlcnQzS9JulL3Rc1SNpN0lxJr0p6RdKRPSj2f5W0QtJySbMllac1dkkzJH0gaXlWWbtjlXSYpJeS136sHvArSreNbondbaMtbSMiSmIBvgXMAhYkz28BrkrWrwJuTtaHAy8COwFVwBtAr26M+yfAvyTrnwN26wmxAwOBt4Deyf
MHgclpjR04DhgNLM8qa3eswHPAkWR+C/hr4ITu+uy04727bXRt3G4bbWwb3d44uugfeBDwODA2qxG+BgxI1gcAryXrVwNXZ+37KHBkN8X9+eSDrGblPSH2gcC7QD8yd4suAManOXagslkjbFesyTavZpWfDdzTHf/+7XjPbhtdH7vbRhvbRql08f0QuAL4NKtsr4hYBZA87pmUN354GjUkZd1hf2At8H+SLph/l7QLPSD2iHgPmAa8A6wC/hwRj9EDYs/S3lgHJuvNy9PMbaOLuW1sU75dRZ+gJH0Z+CAilrR1lxxl3XUv/o5kTq3viohRwF/InE63JDWxJ33Sp5I5zd8H2EXS17a3S46ytP4GoqVYe9J7cNtw2+gMBW0bRZ+ggKOBUyTVA3OAsZJ+BqyRNAAgefwg2T5NwzQ1AA0R8WzyfC6ZRtkTYv9H4K2IWBsRm4H/BI6iZ8TeqL2xNiTrzcvTym2je7httPE9FH2CioirI2JQRFSSGXLptxHxNTJDL01KNpsEPJSsPwxMlLSTpCpgKJmLe10uIlYD70o6KCkaR2bKktTHTqb74guSdk7u1hkHvELPiL1Ru2JNujo+kvSF5D2fl7VP6rhtuG3koWvaRndcJOyuBRjD3y4E70Hm4vAfksd+WdtdS+buk9fo5ruwgGqgFlgGzAd270Gx/xvwKrAc+CmZO3tSGTswm8z1gM1kvu2d35FYgZrk/b4B3E6zi/hpXdw2ujx2t402tA0PdWRmZqlU9F18ZmbWMzlBmZlZKjlBmZlZKjlBmZlZKjlBmZlZKjlBpZikrZLqkhGPfy5p5xa2+5Wk3TpQ/z6S5uYRX72k/h3d36yj3DZKg28zTzFJH0dEn2T9fmBJRPwg63WR+Rt+2lIdnRxfPVATEeu64/hWutw2SoPPoHqOp4ADJFUqM/fNncBSYN/Gb2tZr92rzFwzj0nqDSDpAEm/kfSipKWShiTbL09enyzpIUmPJPO4fLvxwJLmS1qS1DmlW969WcvcNoqUE1QPIGlH4ATgpaToIOA/ImJURLzdbPOhwB0RMQL4f8AZSfn9SfnfkRn3a1WOQx0OnEPmF/pnSqpJyr8eEYeR+SX4pZL2KNBbM8uL20Zxc4JKt96S6sgM5/IOcF9S/nZEPNPCPm9FRF2yvgSolNQXGBgR8wAiYmNEbMix78KIWB8Rn5AZwPKYpPxSSS8Cz5AZCHJo3u/MLD9uGyVgx+4OwLbrk4iozi7IdK3zl+3ssylrfSvQm9xD3efS/IJkSBpDZvTlIyNig6RFQHkb6zPrLG4bJcBnUCUgIj4EGiSdBpCMNJzrrqfjJfVL+uZPA54GdgX+lDTAYcAXuixws07mtpFuTlCl41wy3RHLgN8Be+fYZjGZkZXrgF9ERC3wCLBjst8NZLoyzIqJ20ZK+TZzAzJ3KpG5Lfbi7o7FLE3cNrqPz6DMzCyVfAZlZmap5DMoMzNLJScoMzNLJScoMzNLJScoMzNLJScoMzNLpf8PgN7OMX/lNXoAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x216 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"import seaborn as sns\n", | |
"\n", | |
"bins = np.linspace(df.Principal.min(), df.Principal.max(), 10)\n", | |
"g = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\n", | |
"g.map(plt.hist, 'Principal', bins=bins, ec=\"k\")\n", | |
"\n", | |
"g.axes[-1].legend()\n", | |
"plt.show()" | |
] | |
}, | |
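{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The histogram can be complemented with numbers. A minimal sketch, using `pd.crosstab` to show the share of each loan status within each gender; no result is asserted here, so run it to see the split:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# normalize='index' makes every row (one per gender) sum to 1.\n", | |
"pd.crosstab(df['Gender'], df['loan_status'], normalize='index')" | |
] | |
}, | |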
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAADQCAYAAABStPXYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAZQElEQVR4nO3de5BU5bnv8e9PmGRQiIoMMjpyUVFEJQPOViNqEJTD9orHrXHHKNbxhKNBDRU93nLKSrZVxlupyfESSbSgoqBus0E3qWgQnRMx3gBHBDHq0UFHucdEOQhBfM4fvZgMMDA9w+rp1T2/T9Wq7vX26nc9L9MvT693rX6XIgIzM7Os2a3YAZiZmbXGCcrMzDLJCcrMzDLJCcrMzDLJCcrMzDLJCcrMzDLJCSplkvaVNF3S+5IWSHpJ0tkp1T1K0uw06uoMkuol1RU7Diu+cuoXkqokvSLpdUknFHA/6wpVd6lwgkqRJAGzgD9GxIERcRRwPlBTpHi6F2O/Zi2VYb8YA7wdEcMj4oU0YrLWOUGlazTw94j45ZaCiFgWEf8bQFI3SbdLek3SIkn/IykflRxtPCHpbUmPJJ0aSeOSsnnAf91Sr6Q9JD2U1PW6pLOS8osl/buk/wT+sCuNkTRV0v2Snk+++X472edSSVNbbHe/pPmSlkj66Q7qGpt8a16YxNdzV2KzklI2/UJSLXAbcKqkBkk9dvTZltQo6ebktfmSRkh6RtL/lXRpsk1PSXOT9765Jd5W9vs/W/z7tNrHylJEeElpAa4E7trJ6xOB/5U8/zowHxgEjAL+Ru4b5W7AS8DxQCXwETAYEPA4MDt5/83A95LnewHvAHsAFwNNQO8dxPAC0NDKcnIr204FHk32fRbwGXBkEuMCoDbZrnfy2A2oB4Yl6/VAHdAH+COwR1J+LXBjsf9eXjpnKcN+cTFwT/J8h59toBG4LHl+F7AI6AVUAauS8u7AN1rU9R6gZH1d8jgWmJK0dTdgNnBisf+unbF4CKiAJN1LrkP9PSL+idwHbZikf0k22ZNcJ/s78GpENCXvawAGAuuADyLi3aT8YXKdmaSuMyVdnaxXAv2T53Mi4i+txRQR7R0z/8+ICElvAisj4s0kliVJjA3AeZImkuts1cBQcp1xi2OTsheTL8BfI/efjXVBZdIvtmjrs/1U8vgm0DMiPgc+l7RB0l7A/wNulnQi8BWwP7AvsKJFHWOT5fVkvSe5f58/djDmkuEEla4lwDlbViJikqQ+5L4RQu4b0BUR8UzLN0kaBWxsUbSZf/xtdjRZooBzIuLP29R1DLkPfetvkl4g9y1uW1dHxLOtlG+J66ttYvwK6C5pEHA18E8R8Wky9FfZSqxzIuJfdxSXlbVy7Bct97ezz/ZO+w9wAbkjqqMiYpOkRlrvPz+LiAd2EkdZ8jmodD0HVEq6rEXZ7i2ePwNcJqkCQNIhkvbYSX1vA4MkHZSst+wEzwBXtBiTH55PgBFxQkTUtrLsrBPuzDfIdfy/SdoX+OdWtnkZGCnp4CTW3SUd0sH9Wekp536xq5/tPckN922SdBIwoJVtngH+W4tzW/tL6tuOfZQsJ6gURW7AeDzwbUkfSHoVmEZuXBrg18BbwEJJi4EH2MlRbERsIDd08bvkZPCyFi/fBFQAi5K6bkq7PfmIiDfIDT0sAR4CXmxlm9Xkxu1nSFpErlMP6cQwrYjKuV+k8Nl+BKiTNJ/c0dTbrezjD8B04KVkqP0JWj/aKztbTsaZmZllio+gzMwsk5ygzMwsk5ygzMwsk5ygzMwskzo1QY0bNy7I/X7Bi5eusHSI+4mXLri0qlMT1Jo1azpzd2Ylyf3ELMdDfGZmlklOUGZmlklOUGZmlkmeLNbMyt6mTZtoampiw4YNxQ6lS6usrKSmpoaKioq8tneCMrOy19TURK9evRg4cCDJPLLWySKCtWvX0tTUxKBBg/J6j4f4zKzsbd
iwgX322cfJqYgksc8++7TrKNYJqggGVFcjKZVlQHV1sZtjVhKcnIqvvX8DD/EVwYcrVtC0X00qddV80pRKPWZmWeMjKDPrctIcxch3JKNbt27U1tZyxBFHcO6557J+/frm12bOnIkk3n77H7eDamxs5IgjjgCgvr6ePffck+HDh3PooYdy4oknMnv27K3qnzJlCkOGDGHIkCEcffTRzJs3r/m1UaNGceihh1JbW0ttbS1PPPHEVjFtWRobG3flnzV1PoIysy4nzVEMyG8ko0ePHjQ0NABwwQUX8Mtf/pIf/ehHAMyYMYPjjz+eRx99lJ/85Cetvv+EE05oTkoNDQ2MHz+eHj16MGbMGGbPns0DDzzAvHnz6NOnDwsXLmT8+PG8+uqr9OvXD4BHHnmEurq6HcaURT6CMjPrZCeccALvvfceAOvWrePFF1/kwQcf5NFHH83r/bW1tdx4443cc889ANx6663cfvvt9OnTB4ARI0YwYcIE7r333sI0oJM4QZmZdaIvv/yS3//+9xx55JEAzJo1i3HjxnHIIYfQu3dvFi5cmFc9I0aMaB4SXLJkCUcdddRWr9fV1bFkyZLm9QsuuKB5KG/t2rUAfPHFF81lZ599dhrNS5WH+MzMOsGWZAC5I6hLLrkEyA3vTZ48GYDzzz+fGTNmMGLEiDbri9jhJODNr7e8aq4Uh/jySlCSGoHPgc3AlxFRJ6k38BgwEGgEzouITwsTpplZaWstGaxdu5bnnnuOxYsXI4nNmzcjidtuu63N+l5//XUOO+wwAIYOHcqCBQsYPXp08+sLFy5k6NCh6Taik7VniO+kiKiNiC0p+DpgbkQMBuYm62ZmlqcnnniCiy66iGXLltHY2MhHH33EoEGDtroCrzWLFi3ipptuYtKkSQBcc801XHvttc1Ddw0NDUydOpUf/OAHBW9DIe3KEN9ZwKjk+TSgHrh2F+MxMyu4/v36pfobwv7JlXLtNWPGDK67buvv9ueccw7Tp0/n2mu3/u/0hRdeYPjw4axfv56+ffvyi1/8gjFjxgBw5pln8vHHH3PcccchiV69evHwww9TXeI/5Fdb45gAkj4APiV358MHImKKpL9GxF4ttvk0IvZu5b0TgYkA/fv3P2rZsmWpBV+qJKX6Q918/oZWFHn/bN79pLCWLl3aPBxmxbWDv0WrfSXfIb6RETEC+GdgkqQT8w0mIqZERF1E1FVVVeX7NrMuxf3EbHt5JaiI+CR5XAXMBI4GVkqqBkgeVxUqSDMz63raTFCS9pDUa8tzYCywGHgKmJBsNgF4slBBmplZ15PPRRL7AjOT6+m7A9Mj4mlJrwGPS7oE+BA4t3BhmplZV9NmgoqI94FvtlK+FhhTiKDMzMw81ZGZmWWSE5SZdTn71fRP9XYb+9X0b3OfK1as4Pzzz+eggw5i6NChnHrqqbzzzjssWbKE0aNHc8ghhzB48GBuuumm5p+OTJ06lcsvv3y7ugYOHMiaNWu2Kps6dSpVVVVb3T7jrbfeAuCdd97h1FNP5eCDD+awww7jvPPO47HHHmvermfPns2347jooouor6/n9NNPb6571qxZDBs2jCFDhnDkkUcya9as5tcuvvhi9t9/fzZu3AjAmjVrGDhwYLv/Jq3xXHx5GlBdzYcrVhQ7DDNLwfKPP+KYG59Orb5X/m3cTl+PCM4++2wmTJjQPGN5Q0MDK1eu5OKLL+b+++9n7NixrF+/nnPOOYf77ruveZaI9vjOd77TPMP5Fhs2bOC0007jzjvv5IwzzgDg+eefp6qqqnnqpVGjRnHHHXc0z9VXX1/f/P433niDq6++mjlz5jBo0CA++OADTjnlFA488ECGDRsG5O4r9dBDD3HZZZe1O+adcYLKk++Ca2Yd9fzzz1NRUcGll17aXFZbW8uDDz7IyJEjGTt2LAC7774799xzD6NGjepQgmrN9OnT+da3vtWcnABOOumkvN9/xx13cMMNNzBo0CAABg0axPXXX8/tt9
/Ob37zGwAmT57MXXfdxfe///1UYt7CQ3xmZgW2ePHi7W6HAa3fJuOggw5i3bp1fPbZZ+3eT8thu9raWr744osd7jtf+dzKo3///hx//PHNCSstPoIyMyuSbW+J0dKOynemtSG+XdVajK2V3XDDDZx55pmcdtppqe3bR1BmZgV2+OGHs2DBglbL58+fv1XZ+++/T8+ePenVq1dB992e928bY2u38jj44IOpra3l8ccf7/C+tuUEZWZWYKNHj2bjxo386le/ai577bXXGDx4MPPmzePZZ58Fcjc1vPLKK7nmmmtS2/d3v/td/vSnP/G73/2uuezpp5/mzTffzOv9V199NT/72c9obGwEoLGxkZtvvpmrrrpqu21//OMfc8cdd6QSN3iIz8y6oOr9D2jzyrv21rczkpg5cyaTJ0/mlltuobKykoEDB3L33Xfz5JNPcsUVVzBp0iQ2b97MhRdeuNWl5VOnTt3qsu6XX34ZgGHDhrHbbrljjPPOO49hw4bx2GOPbXUvqfvuu4/jjjuO2bNnM3nyZCZPnkxFRQXDhg3j5z//eV5tq62t5dZbb+WMM85g06ZNVFRUcNtttzXfHbilww8/nBEjRuR92/q25HW7jbTU1dXFtoeKpSLtW2T4dhtdQvtPIlDa/SSrfLuN7CjE7TbMzMw6lROUmZllkhOUmXUJHgovvvb+DZygzKzsVVZWsnbtWiepIooI1q5dS2VlZd7v8VV8Zlb2ampqaGpqYvXq1cUOpUurrKykpib/C8ScoErc1+nYL85b079fP5YtX55KXWZZUlFR0TyXnJUOJ6gStxE8ia2ZlaW8z0FJ6ibpdUmzk/XekuZIejd53LtwYZqZWVfTnoskfggsbbF+HTA3IgYDc5N1MzOzVOSVoCTVAKcBv25RfBYwLXk+DRifbmhmZtaV5XsEdTdwDfBVi7J9I2I5QPLYt7U3Spooab6k+b6Cxqx17idm22szQUk6HVgVER2arz0ipkREXUTUVVVVdaQKs7LnfmK2vXyu4hsJnCnpVKAS+Iakh4GVkqojYrmkamBVIQM1M7Oupc0jqIi4PiJqImIgcD7wXER8D3gKmJBsNgF4smBRmplZl7MrUx3dApwi6V3glGTdzMwsFe36oW5E1AP1yfO1wJj0QzIzM/NksWZmllFOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlklOUGZmlkltJihJlZJelfSGpCWSfpqU95Y0R9K7yePehQ/XzMy6inyOoDYCoyPim0AtME7SscB1wNyIGAzMTdbNzMxS0WaCipx1yWpFsgRwFjAtKZ8GjC9IhGZm1iXldQ5KUjdJDcAqYE5EvALsGxHLAZLHvjt470RJ8yXNX716dVpxm5UV9xOz7eWVoCJic0TUAjXA0ZKOyHcHETElIuoioq6qqqqjcZqVNfcTs+216yq+iPgrUA+MA1ZKqgZIHlelHp2ZmXVZ+VzFVyVpr+R5D+Bk4G3gKWBCstkE4MlCBWlmZl1P9zy2qQamSepGLqE9HhGzJb0EPC7pEuBD4NwCxmlmZl1MmwkqIhYBw1spXwuMKURQZmZmnknCzMwyyQnKzMwyyQnKzMwyyQnKzMwyqawT1IDqaiSlspiZWefK5zLzkvXhihU07VeTSl01nzSlUo+ZmeWnrI+gzMysdDlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJjlBmZlZJrWZoCQdIOl5SUslLZH0w6S8t6Q5kt5NHvcufLhmZtZV5HME9SVwVUQcBhwLTJI0FLgOmBsRg4G5ybqZmVkq2kxQEbE8IhYmzz8HlgL7A2cB05LNpgHjCxWkmZl1Pe06ByVpIDAceAXYNyKWQy6JAX138J6JkuZLmr
969epdi9asTLmfmG0v7wQlqSfwW2ByRHyW7/siYkpE1EVEXVVVVUdiNCt77idm28srQUmqIJecHomI/0iKV0qqTl6vBlYVJkQzM+uK8rmKT8CDwNKIuLPFS08BE5LnE4An0w/POtPXYae3vW/PMqC6utjNMbMSl88t30cCFwJvSmpIym4AbgEel3QJ8CFwbmFCtM6yEWjaryaVumo+aUqlHjPrutpMUBExD9AOXh6TbjjZpW4Vqf2nq+5fS6+ubhWp1GNmljX5HEEZEJs3ccyNT6dS1yv/Ni7VuszMypGnOjIzs0xygjIzs0xygjIzs0xygjIzs0xygjIzs0xygjIzs0xygjIzs0xygjIzs0xygjIzs0wq65kk0pyeyMzMOldZJ6i0pycyM7PO4yE+MzPLJCcoMzPLJCcoMzPLpLI+B9UVpHqfKt9byjJkQHU1H65YkUpdPXbrxhdfbU6lrv79+rFs+fJU6rKdc4Iqcb4QxMrVhytWpHqHZ98tuvS0OcQn6SFJqyQtblHWW9IcSe8mj3sXNkwzM+tq8jkHNRXY9qv1dcDciBgMzE3WzZp9HZCUyjKgurrYzTGzImhziC8i/ihp4DbFZwGjkufTgHrg2hTjshK3ETykYma7pKNX8e0bEcsBkse+O9pQ0kRJ8yXNX716dQd3Z1beyqWfDKiuTu3I2azgF0lExBRgCkBdXV0Uen9mpahc+knaFzZY19bRI6iVkqoBksdV6YVkZmbW8QT1FDAheT4BeDKdcMzMzHLyucx8BvAScKikJkmXALcAp0h6FzglWTczM0tNPlfx/esOXhqTcixmZmbNMjcXn68CMjMzyOBUR74KyMzMIIMJyorHE8+aWZY4QVkzTzxrZlmSuXNQZmZm4ARlZmYZ5QRlZmaZ5ARlZmaZ5ARlmed7SxWWf3toWeWr+CzzfG+pwvJvDy2rnKCsIPybKjPbVU5QVhD+TZWZ7SqfgzIzs0zyEZRlXprDhbt1q0jtZH7/fv1Ytnx5KnWVi1SHdrt/zcPE7TCgupoPV6xIpa6sfLadoCzz0h4u9AUBhZP238rDxPkrx4tdPMRnZmaZlLkjqDSHCMzMrHRlLkH56i8zM4NdTFCSxgE/B7oBv46IW1KJyqxAyuX3WWmeELf2SfNCm926V/DVl5tSqascdThBSeoG3AucAjQBr0l6KiLeSis4s7SVyxF6OZ4QLxVf+aKdTrMrF0kcDbwXEe9HxN+BR4Gz0gnLzMy6OkVEx94o/QswLiL+e7J+IXBMRFy+zXYTgYnJ6qHAnzse7lb6AGtSqisL3J7s6mhb1kREXodZ7id5c3uyLdW+sivnoFobhN0u20XEFGDKLuyn9Z1L8yOiLu16i8Xtya7OaIv7SX7cnmxLuz27MsTXBBzQYr0G+GTXwjEzM8vZlQT1GjBY0iBJXwPOB55KJywzM+vqOjzEFxFfSroceIbcZeYPRcSS1CJrW+rDIUXm9mRXKbellGNvjduTbam2p8MXSZiZmRWS5+IzM7NMcoIyM7NMynyCknSApOclLZW0RNIPk/LekuZIejd53LvYseZDUqWkVyW9kbTnp0l5SbZnC0ndJL0uaXayXrLtkdQo6U1JDZLmJ2WZb4/7Sva5n7RP5hMU8CVwVUQcBhwLTJI0FLgOmBsRg4G5yXop2AiMjohvArXAOEnHUrrt2eKHwNIW66XenpMiorbFbzpKoT3uK9nnftIeEVFSC/Akufn//gxUJ2XVwJ+LHVsH2rI7sBA4ppTbQ+43cHOB0cDspKyU29MI9NmmrOTa476SrcX9pP1LKRxBNZM0EBgOvALsGxHLAZLHvsWLrH2Sw/wGYBUwJyJKuj3A3cA1wFctykq5PQH8QdKCZAoiKLH2uK9kkvtJO2XuflA7Iqkn8FtgckR8ltZ098UQEZuBWkl7ATMlHVHsmDpK0unAqohYIGlUseNJyciI+ERSX2COpLeLHVB7uK9kj/tJx5TEEZSkCnId7pGI+I+keK
Wk6uT1anLfsEpKRPwVqAfGUbrtGQmcKamR3Iz2oyU9TOm2h4j4JHlcBcwkN3N/SbTHfSWz3E86IPMJSrmvfw8CSyPizhYvPQVMSJ5PIDfennmSqpJvg0jqAZwMvE2Jticiro+ImogYSG66q+ci4nuUaHsk7SGp15bnwFhgMSXQHveV7HI/6aBin2jL40Tc8eTGOhcBDclyKrAPuROO7yaPvYsda57tGQa8nrRnMXBjUl6S7dmmbaP4x8nfkmwPcCDwRrIsAX5cKu1xXymNxf0k/8VTHZmZWSZlfojPzMy6JicoMzPLJCcoMzPLJCcoMzPLJCcoMzPLJCcoMzPLJCcoMzPLJCeoMiBpVjJh45ItkzZKukTSO5LqJf1K0j1JeZWk30p6LVlGFjd6s87jvlJa/EPdMiCpd0T8JZkO5jXgvwAvAiOAz4HngDci4nJJ04H7ImKepP7AM5G7f5BZ2XNfKS0lM5u57dSVks5Onh8AXAj8n4j4C4CkfwcOSV4/GRjaYobrb0jqFRGfd2bAZkXivlJCnKBKXDJ1/8nAtyJivaR6cjcN29E3vd2Sbb/onAjNssF9pfT4HFTp2xP4NOlwQ8jd6nt34NuS9pbUHTinxfZ/AC7fsiKptlOjNSse95US4wRV+p4GuktaBNwEvAx8DNxM7m6qzwJvAX9Ltr8SqJO0SNJbwKWdH7JZUbivlBhfJFGmJPWMiHXJt8KZwEMRMbPYcZlljftKdvkIqnz9RFIDufvofADMKnI8ZlnlvpJRPoIyM7NM8hGUmZllkhOUmZllkhOUmZllkhOUmZllkhOUmZll0v8HzHNLcwZAb84AAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x216 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"bins = np.linspace(df.age.min(), df.age.max(), 10)\n", | |
"g = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\n", | |
"g.map(plt.hist, 'age', bins=bins, ec=\"k\")\n", | |
"\n", | |
"g.axes[-1].legend()\n", | |
"plt.show()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"# Pre-processing: Feature selection/extraction" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"### Lets look at the day of the week people get the loan " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAADQCAYAAABStPXYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAZ2UlEQVR4nO3deZxU9Znv8c9X6Eyr4Apqa4dFRRCVabAT44JBiLyIK45LSIxCrhmvxiVcw3VLrsnEew0q1yVxJWpwIuJCIjpkosGFCO6ArYgYdLTFVlBgkhijKOozf9TpngYKuug+1XWo+r5fr3pV1alzfuc53f30U+dXp34/RQRmZmZZs0WpAzAzM8vHBcrMzDLJBcrMzDLJBcrMzDLJBcrMzDLJBcrMzDLJBSplknaWdKek1yXNl/SUpONSanuYpJlptNUZJM2WVF/qOKz0yikvJPWU9Iyk5yUNLeJ+PihW25sLF6gUSRIwA3g8InaPiP2BMUBtieLpWor9mrVWhnkxAnglIgZHxJw0YrL8XKDSNRz4JCJual4QEW9GxC8AJHWRdKWk5yS9KOl/JsuHJWcb0yW9ImlqktRIGpUsmwv8U3O7kraWdFvS1vOSjk2Wj5N0r6R/A/7QkYORNEXSjZIeS975fjXZ52JJU1qtd6OkeZIWSfqXDbQ1MnnXvCCJr1tHYrPNStnkhaQ64ArgCEkNkrbc0N+2pEZJlyWvzZM0RNJDkv5D0hnJOt0kPZJsu7A53jz7/d+tfj55c6wsRYRvKd2Ac4GrN/L66cCPksf/AMwD+gLDgL+Se0e5BfAUcAhQDbwF9AME3APMTLa/DPh28ng7YAmwNTAOaAJ22EAMc4CGPLev5Vl3CnBXsu9jgfeB/ZIY5wN1yXo7JPddgNnAoOT5bKAe6AE8DmydLL8AuKTUvy/fOudWhnkxDrguebzBv22gETgzeXw18CLQHegJvJcs7wps06qt1wAlzz9I7kcCk5Nj3QKYCRxa6t9rZ9zcBVREkq4nl1CfRMSXyP2hDZJ0QrLKtuSS7BPg2YhoSrZrAPoAHwBvRMSryfI7yCUzSVvHSJqQPK8GeiWPZ0XEf+aLKSI2tc/83yIiJC0E3o2IhUksi5IYG4CTJJ1OLtlqgIHkkrHZV5JlTyRvgL9A7p+NVaAyyYtmbf1tP5DcLwS6RcTfgL9JWi1pO+DvwGWSDgU+B3YDdgaWt2pjZHJ7PnnejdzP5/F2xrzZcIFK1yLg+OYnEXGWpB7k3hFC7h3QORHxUOuNJA0DPm616DP++3ezocESBRwfEX9ap60DyP3R599ImkPuXdy6JkTEw3mWN8f1+Toxfg50ldQXmAB8KSL+nHT9VeeJdVZEfHNDcVlZK8e8aL2/jf1tbzR/gJPJnVHtHxFrJDWSP39+FhE3bySOsuTPoNL1KFAt6cxWy7Zq9fgh4ExJVQCS9pK09UbaewXoK2mP5HnrJHgIOKdVn/zgQgKMiKERUZfntrEk3JhtyCX+XyXtDHw9zzpPAwdL2jOJdStJe7Vzf7b5Kee86Ojf9rbkuvvWSDoM6J1nnYeA/9Hqs63dJO20CfvYbLlApShyHcajga9KekPSs8Dt5PqlAW4BXgYWSHoJuJmNnMVGxGpyXRe/Sz4MfrPVy5cCVcCLSVuXpn08hYiIF8h1PSwCbgOeyLPOCnL99tMkvUguqQd0YphWQuWcFyn8bU8F6iXNI3c29UqeffwBuBN4Kulqn07+s72y0/xhnJmZWab4DMrMzDLJBcrMzDLJBcrMzDLJBcrMzDKpUwvUqFGjgtz3F3zzrRJu7eI88a0Cb3l1aoFauXJlZ+7ObLPkPDHLcRefmZllkguUmZllkguUmZllkgeLNbOyt2bNGpqamli9enWpQ6lo1dXV1NbWUlVVVdD6LlBmVvaampro3r07ffr0IRlH1jpZRLBq1Sqampro27dvQdu4i8/Myt7q1avZcc
cdXZxKSBI77rjjJp3FukBZReldU4OkVG69a2pKfTi2CVycSm9Tfwfu4rOKsnT5cpp2rU2lrdp3mlJpx8zy8xmUmVWcNM+kCz2b7tKlC3V1dey7776ceOKJfPjhhy2v3XfffUjilVf+ezqoxsZG9t13XwBmz57Ntttuy+DBg+nfvz+HHnooM2fOXKv9yZMnM2DAAAYMGMCXv/xl5s6d2/LasGHD6N+/P3V1ddTV1TF9+vS1Ymq+NTY2duTHmrqCzqAk/S/gu+SGpFgIfIfcjJh3A32ARuCkiPhzUaI0M0tRmmfSUNjZ9JZbbklDQwMAJ598MjfddBPnnXceANOmTeOQQw7hrrvu4ic/+Une7YcOHdpSlBoaGhg9ejRbbrklI0aMYObMmdx8883MnTuXHj16sGDBAkaPHs2zzz7LLrvsAsDUqVOpr6/fYExZ1OYZlKTdgHOB+ojYF+gCjAEuBB6JiH7AI8lzMzNrw9ChQ3nttdcA+OCDD3jiiSe49dZbueuuuwravq6ujksuuYTrrrsOgMsvv5wrr7ySHj16ADBkyBDGjh3L9ddfX5wD6CSFdvF1BbaU1JXcmdM7wLHkpm0muR+dfnhmZuXl008/5fe//z377bcfADNmzGDUqFHstdde7LDDDixYsKCgdoYMGdLSJbho0SL233//tV6vr69n0aJFLc9PPvnklq68VatWAfDRRx+1LDvuuOPSOLxUtdnFFxFvS5oELAU+Av4QEX+QtHNELEvWWSZpp3zbSzodOB2gV69e6UVuVkacJ+WvuRhA7gzqtNNOA3Lde+PHjwdgzJgxTJs2jSFDhrTZXsQGBwFveb31VXObYxdfmwVK0vbkzpb6An8B7pX07UJ3EBGTgckA9fX1G/+JmlUo50n5y1cMVq1axaOPPspLL72EJD777DMkccUVV7TZ3vPPP8/ee+8NwMCBA5k/fz7Dhw9veX3BggUMHDgw3YPoZIV08X0NeCMiVkTEGuC3wEHAu5JqAJL794oXpplZ+Zk+fTqnnnoqb775Jo2Njbz11lv07dt3rSvw8nnxxRe59NJLOeusswA4//zzueCCC1q67hoaGpgyZQrf+973in4MxVTIVXxLga9I2opcF98IYB7wd2AsMDG5v79YQZqZpanXLruk+j22XsmVcptq2rRpXHjh2teXHX/88dx5551ccMEFay2fM2cOgwcP5sMPP2SnnXbi5z//OSNGjADgmGOO4e233+aggw5CEt27d+eOO+6gZjP/Mrna6scEkPQvwDeAT4HnyV1y3g24B+hFroidGBH/ubF26uvrY968eR2N2azdJKX6Rd028qddQxc4T9K3ePHilu4wK60N/C7y5kpB34OKiB8DP15n8cfkzqbMzMxS55EkzMwsk1ygzMwsk1ygzMwsk1ygzMwsk1ygzMwsk1ygzKzi7FrbK9XpNnatbXt4quXLlzNmzBj22GMPBg4cyBFHHMGSJUtYtGgRw4cPZ6+99qJfv35ceumlLV9fmDJlCmefffZ6bfXp04eVK1eutWzKlCn07NlzrekzXn75ZQCWLFnCEUccwZ577snee+/NSSedxN13392yXrdu3Vqm4zj11FOZPXs2Rx11VEvbM2bMYNCgQQwYMID99tuPGTNmtLw2btw4dtttNz7++GMAVq5cSZ8+fTb5d5KPJyw0s4qz7O23OOCSB1Nr75mfjtro6xHBcccdx9ixY1tGLG9oaODdd99l3Lhx3HjjjYwcOZIPP/yQ448/nhtuuKFllIhN8Y1vfKNlhPNmq1ev5sgjj+Sqq67i6KOPBuCxxx6jZ8+eLUMvDRs2jEmTJrWM1Td79uyW7V944QUmTJjArFmz6Nu3L2+88QaHH344u+++O4MGDQJy80rddtttnHnmmZsc88b4DMrMrMgee+wxqqqqOOOMM1qW1dXVsWTJEg4++GBGjhwJwFZbbcV1113HxIkTU9v3nXfeyYEHHthSnAAOO+ywlskQ2zJp0iQuvvhi+v
btC0Dfvn256KKLuPLKK1vWGT9+PFdffTWffvppanGDC5SZWdG99NJL602HAfmnydhjjz344IMPeP/99zd5P6277erq6vjoo482uO9CFTKVR69evTjkkEP49a9/3e795OMuPjOzEll3SozWNrR8Y/J18XVUvhjzLbv44os55phjOPLII1Pbt8+gzMyKbJ999mH+/Pl5l6877uLrr79Ot27d6N69e1H3vSnbrxtjvqk89txzT+rq6rjnnnvava91uUCZmRXZ8OHD+fjjj/nlL3/Zsuy5556jX79+zJ07l4cffhjITWp47rnncv7556e2729961s8+eST/O53v2tZ9uCDD7Jw4cKCtp8wYQI/+9nPaGxsBKCxsZHLLruMH/zgB+ut+8Mf/pBJkyalEje4i8/MKlDNbl9s88q7TW1vYyRx3333MX78eCZOnEh1dTV9+vThmmuu4f777+ecc87hrLPO4rPPPuOUU05Z69LyKVOmrHVZ99NPPw3AoEGD2GKL3DnGSSedxKBBg7j77rvXmkvqhhtu4KCDDmLmzJmMHz+e8ePHU1VVxaBBg7j22msLOra6ujouv/xyjj76aNasWUNVVRVXXHFFy+zAre2zzz4MGTKk4Gnr21LQdBtp8TQCVmqebqMyebqN7NiU6TbcxWdmZpmUuQLVu6YmtW93997MZ5M0M6tkmfsMauny5al2wZiZwcYv6bbOsakfKWXuDMrMLG3V1dWsWrVqk/9BWnoiglWrVlFdXV3wNpk7gzIzS1ttbS1NTU2sWLGi1KFUtOrqamprC+8hc4Eys7JXVVXVMpacbT7cxWdmZpnkAmVmZpnkAmVmZpnkAmVmZpnkAmVmZplUUIGStJ2k6ZJekbRY0oGSdpA0S9Kryf32xQ7WzMwqR6FnUNcCD0bEAOAfgcXAhcAjEdEPeCR5bmZmloo2C5SkbYBDgVsBIuKTiPgLcCxwe7La7cDoYgVpZmaVp5AzqN2BFcCvJD0v6RZJWwM7R8QygOR+p3wbSzpd0jxJ8/wtbrP8nCdm6yukQHUFhgA3RsRg4O9sQndeREyOiPqIqO/Zs2c7wzQrb84Ts/UVUqCagKaIeCZ5Pp1cwXpXUg1Acv9ecUI0M7NK1GaBiojlwFuS+ieLRgAvAw8AY5NlY4H7ixKhmZlVpEIHiz0HmCrpC8DrwHfIFbd7JJ0GLAVOLE6IZulRl6rU5glTl6pU2jGz/AoqUBHRANTneWlEuuGYFVd8toYDLnkwlbae+emoVNoxs/w8koSZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWWSC5SZmWVSwQVKUhdJz0uamTzfQdIsSa8m99sXL0wzM6s0m3IG9X1gcavnFwKPREQ/4JHkuZmZWSoKKlCSaoEjgVtaLT4WuD15fDswOt3QzMyskhV6BnUNcD7weatlO0fEMoDkfqd8G0o6XdI8SfNWrFjRoWDNypXzxGx9bRYoSUcB70XE/PbsICImR0R9RNT37NmzPU2YlT3nidn6uhawzsHAMZKOAKqBbSTdAbwrqSYilkmqAd4rZqBmZlZZ2jyDioiLIqI2IvoAY4BHI+LbwAPA2GS1scD9RYvSzMwqTke+BzUROFzSq8DhyXMzM7NUFNLF1yIiZgOzk8ergBHph2RmZuaRJMzMLKNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoMzMLJNcoEqgd00NklK59a6pKfXhmJkVxSbNB2XpWLp8OU271qbSVu07Tam0Y2aWNT6DMjOzTHKBMjOzTHKBMjOzTH
KBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTHKBMjOzTGqzQEn6oqTHJC2WtEjS95PlO0iaJenV5H774odrZmaVopAzqE+BH0TE3sBXgLMkDQQuBB6JiH7AI8lzMzOzVLRZoCJiWUQsSB7/DVgM7AYcC9yerHY7MLpYQZqZWeXZpM+gJPUBBgPPADtHxDLIFTFgpw1sc7qkeZLmrVixomPRmpUp54nZ+gouUJK6Ab8BxkfE+4VuFxGTI6I+Iup79uzZnhjNyp7zxGx9BRUoSVXkitPUiPhtsvhdSTXJ6zXAe8UJ0czMKlEhV/EJuBVYHBFXtXrpAWBs8ngscH/64ZmZWaUqZMLCg4FTgIWSGpJlFwMTgXsknQYsBU4sTohmZlaJ2ixQETEX0AZeHpFuOGZmVmq9a2pYunx5Km312mUX3ly2rF3besp3MzNby9Lly2natTaVtmrfaWr3th7qyDKvd00NklK5lYs0fya9a2pKfThmefkMyjIvK+/mssQ/E6sEPoMyM7NMKuszqH+A1Lp1OvJBn3WMulT5Xb5ZBSrrAvUxuBukDMRnazjgkgdTaeuZn45KpR0zKz538ZmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSaV9VBHZma26dIc/1Jdqtq9rQuUmZmtJSvjX7qLz6zCNY/678kPLWt8BmVW4Tzqv2WVz6DMzCyTXKCsKHat7ZVat5GZVSZ38VlRLHv7rUx8yGpmm6/MFaisXN5oZqXVu6aGpcuXp9JWr1124c1ly1JpyzpP5gpUVi5v3Fw0X4GVBiexZcnS5ct98UaF61CBkjQKuBboAtwSERNTicoK5iuwzKxctfsiCUldgOuBrwMDgW9KGphWYGZmacnqd71619SkFtdWXbqW3YVJHTmD+jLwWkS8DiDpLuBY4OU0AjMzS0tWexrS7sbM4jF2hCKifRtKJwCjIuK7yfNTgAMi4ux11jsdOD152h/4UxtN9wBWtiuozYePsTy0dYwrI6KgD0KdJ3n5GMtDIceYN1c6cgaV7zxwvWoXEZOByQU3Ks2LiPoOxJV5PsbykOYxOk/W52MsDx05xo58UbcJ+GKr57XAOx1oz8zMrEVHCtRzQD9JfSV9ARgDPJBOWGZmVuna3cUXEZ9KOht4iNxl5rdFxKIUYiq4m2Mz5mMsD6U8Rv98y4OPcSPafZGEmZlZMXmwWDMzyyQXKDMzy6TMFChJoyT9SdJrki4sdTxpk/RFSY9JWixpkaTvlzqmYpHURdLzkmaWOpZikLSdpOmSXkl+nwd24r7LOk+gcnKl3PMEOp4rmfgMKhk2aQlwOLnL158DvhkRZTMqhaQaoCYiFkjqDswHRpfTMTaTdB5QD2wTEUeVOp60SbodmBMRtyRXsG4VEX/phP2WfZ5A5eRKuecJdDxXsnIG1TJsUkR8AjQPm1Q2ImJZRCxIHv8NWAzsVtqo0iepFjgSuKXUsRSDpG2AQ4FbASLik84oTomyzxOojFwp9zyBdHIlKwVqN+CtVs+bKLM/yNYk9QEGA8+UNpKiuAY4H/i81IEUye7ACuBXSffMLZK27qR9V1SeQFnnSrnnCaSQK1kpUAUNm1QOJHUDfgOMj4j3Sx1PmiQdBbwXEfNLHUsRdQWGADdGxGDg70BnfRZUMXkC5ZsrFZInkEKuZKVAVcSwSZKqyCXc1Ij4banjKYKDgWMkNZLrfhou6Y7ShpS6JqApIprf0U8nl4Sdte+yzxMo+1yphDyBFHIlKwWq7IdNUm6SlVuBxRFxVanjKYaIuCgiaiOiD7nf4aMR8e0Sh5WqiFgOvCWpf7JoBJ03xUzZ5wmUf65UQp5AOrmSiSnfizhsUpYcDJwCLJTUkCy7OCL+vYQxWfucA0xNisTrwHc6Y6cVkifgXCknHcqVTFxmbmZmtq6sdPGZmZmtxQXKzMwyyQXKzMwyyQXKzMwyyQXKzM
wyyQUqIyT9RNKEFNsbIKkhGWJkj7TabdX+bEn1abdr1hbnSuVwgSpfo4H7I2JwRPxHqYMxyzDnSka5QJWQpB8mc/s8DPRPlv2zpOckvSDpN5K2ktRd0hvJ8C9I2kZSo6QqSXWSnpb0oqT7JG0v6QhgPPDdZF6d8yWdm2x7taRHk8cjmodYkTRS0lOSFki6NxkHDUn7S/qjpPmSHkqmQmh9DFtIul3S/+20H5xVHOdKZXKBKhFJ+5Mb5mQw8E/Al5KXfhsRX4qIfyQ3zcBpyZQDs8kNz0+y3W8iYg3wr8AFETEIWAj8OPnG/U3A1RFxGPA4MDTZth7oliTwIcAcST2AHwFfi4ghwDzgvGSdXwAnRMT+wG3A/2t1GF2BqcCSiPhRij8esxbOlcqViaGOKtRQ4L6I+BBAUvOYavsm77C2A7qRG9YGcvPGnA/MIDdcyD9L2hbYLiL+mKxzO3Bvnn3NB/ZXbvK3j4EF5JJvKHAu8BVgIPBEbhg0vgA8Re6d6r7ArGR5F2BZq3ZvBu6JiNaJaJY250qFcoEqrXzjTE0hN3voC5LGAcMAIuIJSX0kfRXoEhEvJUnX9k4i1ig3cvJ3gCeBF4HDgD3IvfPcA5gVEd9svZ2k/YBFEbGhaZqfBA6T9P8jYnUhsZi1k3OlArmLr3QeB46TtGXybu3oZHl3YFnSZXDyOtv8KzAN+BVARPwV+LOk5i6JU4A/kt/jwITkfg5wBtAQucEYnwYOlrQnQNKXvxfwJ6CnpAOT5VWS9mnV5q3AvwP3SvKbHSsW50qFcoEqkWRK67uBBnLz3sxJXvo/5GYPnQW8ss5mU4HtySVes7HAlZJeBOqAn25gl3OAGuCpiHgXWN28z4hYAYwDpiXtPA0MSKYVPwG4XNILSawHrXMcV5HrBvm1JP89WeqcK5XLo5lvRiSdABwbEaeUOhazLHOulAefam4mJP0C+DpwRKljMcsy50r58BmUmZllkvtBzcwsk1ygzMwsk1ygzMwsk1ygzMwsk1ygzMwsk/4LuTrZSwyZOFsAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x216 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"df['dayofweek'] = df['effective_date'].dt.dayofweek\n", | |
"bins = np.linspace(df.dayofweek.min(), df.dayofweek.max(), 10)\n", | |
"g = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\n", | |
"g.map(plt.hist, 'dayofweek', bins=bins, ec=\"k\")\n", | |
"g.axes[-1].legend()\n", | |
"plt.show()\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"We see that people who get the loan at the end of the week dont pay it off, so lets use Feature binarization to set a threshold values less then day 4 " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" <th>dayofweek</th>\n", | |
" <th>weekend</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>45</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>male</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2</td>\n", | |
" <td>2</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>33</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>female</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3</td>\n", | |
" <td>3</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-09-22</td>\n", | |
" <td>27</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4</td>\n", | |
" <td>4</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-10-08</td>\n", | |
" <td>28</td>\n", | |
" <td>college</td>\n", | |
" <td>female</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>6</td>\n", | |
" <td>6</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-10-08</td>\n", | |
" <td>29</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 0 0 PAIDOFF 1000 30 2016-09-08 \n", | |
"1 2 2 PAIDOFF 1000 30 2016-09-08 \n", | |
"2 3 3 PAIDOFF 1000 15 2016-09-08 \n", | |
"3 4 4 PAIDOFF 1000 30 2016-09-09 \n", | |
"4 6 6 PAIDOFF 1000 30 2016-09-09 \n", | |
"\n", | |
" due_date age education Gender dayofweek weekend \n", | |
"0 2016-10-07 45 High School or Below male 3 0 \n", | |
"1 2016-10-07 33 Bechalor female 3 0 \n", | |
"2 2016-09-22 27 college male 3 0 \n", | |
"3 2016-10-08 28 college female 4 1 \n", | |
"4 2016-10-08 29 college male 4 1 " | |
] | |
}, | |
"execution_count": 11, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['weekend'] = df['dayofweek'].apply(lambda x: 1 if (x>3) else 0)\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"## Convert Categorical features to numerical values" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Lets look at gender:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Gender loan_status\n", | |
"female PAIDOFF 0.865385\n", | |
" COLLECTION 0.134615\n", | |
"male PAIDOFF 0.731293\n", | |
" COLLECTION 0.268707\n", | |
"Name: loan_status, dtype: float64" | |
] | |
}, | |
"execution_count": 12, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.groupby(['Gender'])['loan_status'].value_counts(normalize=True)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"86 % of female pay there loans while only 73 % of males pay there loan\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Lets convert male to 0 and female to 1:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" <th>dayofweek</th>\n", | |
" <th>weekend</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>45</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>0</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2</td>\n", | |
" <td>2</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>33</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>1</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3</td>\n", | |
" <td>3</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-09-22</td>\n", | |
" <td>27</td>\n", | |
" <td>college</td>\n", | |
" <td>0</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4</td>\n", | |
" <td>4</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-10-08</td>\n", | |
" <td>28</td>\n", | |
" <td>college</td>\n", | |
" <td>1</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>6</td>\n", | |
" <td>6</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-10-08</td>\n", | |
" <td>29</td>\n", | |
" <td>college</td>\n", | |
" <td>0</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 0 0 PAIDOFF 1000 30 2016-09-08 \n", | |
"1 2 2 PAIDOFF 1000 30 2016-09-08 \n", | |
"2 3 3 PAIDOFF 1000 15 2016-09-08 \n", | |
"3 4 4 PAIDOFF 1000 30 2016-09-09 \n", | |
"4 6 6 PAIDOFF 1000 30 2016-09-09 \n", | |
"\n", | |
" due_date age education Gender dayofweek weekend \n", | |
"0 2016-10-07 45 High School or Below 0 3 0 \n", | |
"1 2016-10-07 33 Bechalor 1 3 0 \n", | |
"2 2016-09-22 27 college 0 3 0 \n", | |
"3 2016-10-08 28 college 1 4 1 \n", | |
"4 2016-10-08 29 college 0 4 1 " | |
] | |
}, | |
"execution_count": 13, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['Gender'].replace(to_replace=['male','female'], value=[0,1],inplace=True)\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"## One Hot Encoding \n", | |
"#### How about education?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"education loan_status\n", | |
"Bechalor PAIDOFF 0.750000\n", | |
" COLLECTION 0.250000\n", | |
"High School or Below PAIDOFF 0.741722\n", | |
" COLLECTION 0.258278\n", | |
"Master or Above COLLECTION 0.500000\n", | |
" PAIDOFF 0.500000\n", | |
"college PAIDOFF 0.765101\n", | |
" COLLECTION 0.234899\n", | |
"Name: loan_status, dtype: float64" | |
] | |
}, | |
"execution_count": 14, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.groupby(['education'])['loan_status'].value_counts(normalize=True)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"#### Feature befor One Hot Encoding" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>age</th>\n", | |
" <th>Gender</th>\n", | |
" <th>education</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>45</td>\n", | |
" <td>0</td>\n", | |
" <td>High School or Below</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>33</td>\n", | |
" <td>1</td>\n", | |
" <td>Bechalor</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>27</td>\n", | |
" <td>0</td>\n", | |
" <td>college</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>28</td>\n", | |
" <td>1</td>\n", | |
" <td>college</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>29</td>\n", | |
" <td>0</td>\n", | |
" <td>college</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Principal terms age Gender education\n", | |
"0 1000 30 45 0 High School or Below\n", | |
"1 1000 30 33 1 Bechalor\n", | |
"2 1000 15 27 0 college\n", | |
"3 1000 30 28 1 college\n", | |
"4 1000 30 29 0 college" | |
] | |
}, | |
"execution_count": 15, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df[['Principal','terms','age','Gender','education']].head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"#### Use one hot encoding technique to conver categorical varables to binary variables and append them to the feature Data Frame " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 16, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>age</th>\n", | |
" <th>Gender</th>\n", | |
" <th>weekend</th>\n", | |
" <th>Bechalor</th>\n", | |
" <th>High School or Below</th>\n", | |
" <th>college</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>45</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>33</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>27</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>28</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>29</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Principal terms age Gender weekend Bechalor High School or Below \\\n", | |
"0 1000 30 45 0 0 0 1 \n", | |
"1 1000 30 33 1 0 1 0 \n", | |
"2 1000 15 27 0 0 0 0 \n", | |
"3 1000 30 28 1 1 0 0 \n", | |
"4 1000 30 29 0 1 0 0 \n", | |
"\n", | |
" college \n", | |
"0 0 \n", | |
"1 0 \n", | |
"2 1 \n", | |
"3 1 \n", | |
"4 1 " | |
] | |
}, | |
"execution_count": 16, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"Feature = df[['Principal','terms','age','Gender','weekend']]\n", | |
"Feature = pd.concat([Feature,pd.get_dummies(df['education'])], axis=1)\n", | |
"Feature.drop(['Master or Above'], axis = 1,inplace=True)\n", | |
"Feature.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"### Feature selection" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Lets defind feature sets, X:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>age</th>\n", | |
" <th>Gender</th>\n", | |
" <th>weekend</th>\n", | |
" <th>Bechalor</th>\n", | |
" <th>High School or Below</th>\n", | |
" <th>college</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>45</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>33</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>1000</td>\n", | |
" <td>15</td>\n", | |
" <td>27</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>28</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>29</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Principal terms age Gender weekend Bechalor High School or Below \\\n", | |
"0 1000 30 45 0 0 0 1 \n", | |
"1 1000 30 33 1 0 1 0 \n", | |
"2 1000 15 27 0 0 0 0 \n", | |
"3 1000 30 28 1 1 0 0 \n", | |
"4 1000 30 29 0 1 0 0 \n", | |
"\n", | |
" college \n", | |
"0 0 \n", | |
"1 0 \n", | |
"2 1 \n", | |
"3 1 \n", | |
"4 1 " | |
] | |
}, | |
"execution_count": 17, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"X = Feature\n", | |
"X[0:5]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"What are our labels?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 18, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array(['PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n", | |
" dtype=object)" | |
] | |
}, | |
"execution_count": 18, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"y = df['loan_status'].values\n", | |
"y[0:5]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"## Normalize Data " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Data standardization gives the data zero mean and unit variance. Strictly, the scaler should be fit on the training split only, after the train/test split, so that test-set statistics do not leak into training." | |
] | |
}, | |
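{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A minimal sketch of the leakage-free order mentioned above: split first, fit the scaler on the training rows only, then apply it to both splits. The names `X_raw`, `scaler_demo` and the `*_demo` arrays are hypothetical and are used so that the notebook's own `X` and `y` stay untouched; the split parameters simply mirror the ones used later in this notebook, and `Feature` and `y` are assumed to be defined as in the cells above." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from sklearn.model_selection import train_test_split\n", | |
"from sklearn.preprocessing import StandardScaler\n", | |
"\n", | |
"# Hypothetical demo names: the notebook's own X and y are not modified here\n", | |
"X_raw = Feature.values.astype(float)\n", | |
"X_tr_demo, X_te_demo, y_tr_demo, y_te_demo = train_test_split(X_raw, y, test_size=0.2, random_state=4)\n", | |
"\n", | |
"# Mean and variance are estimated from the training rows only\n", | |
"scaler_demo = StandardScaler().fit(X_tr_demo)\n", | |
"X_tr_demo = scaler_demo.transform(X_tr_demo)\n", | |
"# The held-out rows reuse the training statistics\n", | |
"X_te_demo = scaler_demo.transform(X_te_demo)\n", | |
"print(X_tr_demo.mean(axis=0).round(3))" | |
] | |
}, | |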
{ | |
"cell_type": "code", | |
"execution_count": 19, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/sklearn/preprocessing/data.py:625: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n", | |
" return self.partial_fit(X, y)\n", | |
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/ipykernel_launcher.py:1: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n", | |
" \"\"\"Entry point for launching an IPython kernel.\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"array([[ 0.51578458, 0.92071769, 2.33152555, -0.42056004, -1.20577805,\n", | |
" -0.38170062, 1.13639374, -0.86968108],\n", | |
" [ 0.51578458, 0.92071769, 0.34170148, 2.37778177, -1.20577805,\n", | |
" 2.61985426, -0.87997669, -0.86968108],\n", | |
" [ 0.51578458, -0.95911111, -0.65321055, -0.42056004, -1.20577805,\n", | |
" -0.38170062, -0.87997669, 1.14984679],\n", | |
" [ 0.51578458, 0.92071769, -0.48739188, 2.37778177, 0.82934003,\n", | |
" -0.38170062, -0.87997669, 1.14984679],\n", | |
" [ 0.51578458, 0.92071769, -0.3215732 , -0.42056004, 0.82934003,\n", | |
" -0.38170062, -0.87997669, 1.14984679]])" | |
] | |
}, | |
"execution_count": 19, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"X= preprocessing.StandardScaler().fit(X).transform(X)\n", | |
"X[0:5]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"# Classification " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"Now it is your turn: use the training set to build an accurate model, then use the test set to report the accuracy of the model.\n", | |
"You should use the following algorithms:\n", | |
"- K Nearest Neighbor(KNN)\n", | |
"- Decision Tree\n", | |
"- Support Vector Machine\n", | |
"- Logistic Regression\n", | |
"\n", | |
"\n", | |
"\n", | |
"__Notice:__ \n", | |
"- You can go above and change the pre-processing, feature selection, feature-extraction, and so on, to make a better model.\n", | |
"- You should use either scikit-learn, SciPy, or NumPy libraries for developing the classification algorithms.\n", | |
"- You should include the code of the algorithm in the following cells." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# K Nearest Neighbor(KNN)\n", | |
"Notice: You should find the best k to build the model with the best accuracy. \n", | |
"**Warning:** Do not use __loan_test.csv__ to find the best k; instead, split your train_loan.csv into train and test sets to find the best __k__." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Train Test Split</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Train set: (276, 8) (276,)\n", | |
"Test set: (70, 8) (70,)\n" | |
] | |
} | |
], | |
"source": [ | |
"#Train-test split\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=4)\n", | |
"print ('Train set:', X_train.shape, y_train.shape)\n", | |
"print ('Test set:', X_test.shape, y_test.shape)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Find best k</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 21, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[0.67142857 0.65714286 0.71428571 0.68571429 0.75714286 0.71428571\n", | |
" 0.78571429 0.75714286 0.75714286]\n" | |
] | |
} | |
], | |
"source": [ | |
"from sklearn.neighbors import KNeighborsClassifier\n", | |
"from sklearn import metrics\n", | |
"\n", | |
"### Calculate the accuracy of KNN for k from 1 to 9 (range(1, Ks) stops before Ks):\n", | |
"Ks = 10\n", | |
"mean_acc = np.zeros((Ks-1))\n", | |
"std_acc = np.zeros((Ks-1))\n", | |
"ConfustionMx = [];\n", | |
"for n in range(1,Ks):\n", | |
" \n", | |
" ###Train Model and Predict \n", | |
" neigh = KNeighborsClassifier(n_neighbors = n).fit(X_train,y_train)\n", | |
" yhat1=neigh.predict(X_test)\n", | |
" mean_acc[n-1] = metrics.accuracy_score(y_test, yhat1)\n", | |
"\n", | |
" \n", | |
" std_acc[n-1]=np.std(yhat1==y_test)/np.sqrt(yhat1.shape[0])\n", | |
"\n", | |
"print(mean_acc)" | |
] | |
}, | |
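{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The `std_acc` values computed above are the standard error of each accuracy estimate. `yhat1==y_test` is a 0/1 vector whose mean is the accuracy `p`, so its standard deviation is `sqrt(p(1-p))`, and dividing by `sqrt(n)` gives `sqrt(p(1-p)/n)`. The cell below is a small consistency check, assuming the loop above has just been run; `n_test` and `se_closed_form` are hypothetical names introduced only for this sketch." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Number of rows in the validation split\n", | |
"n_test = y_test.shape[0]\n", | |
"\n", | |
"# Closed-form standard error of a proportion, evaluated for every k at once\n", | |
"se_closed_form = np.sqrt(mean_acc * (1 - mean_acc) / n_test)\n", | |
"\n", | |
"# Compare with the values stored by the loop\n", | |
"print(np.allclose(std_acc, se_closed_form))" | |
] | |
}, | |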
{ | |
"cell_type": "code", | |
"execution_count": 22, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAEYCAYAAAAJeGK1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nOzdd3iUVfbA8e+ZmUw6HRWFlWIDBaI0V0GlNwXBhvJb7K4KCBZWXNG14SqWtaGuFaWKqAgWkKqggFSpFsCG9BpSp93fHzNhY0jIJJmZd8r5PE8eksn7znsCZM7ce897jxhjUEoppaKNzeoAlFJKqdJoglJKKRWVNEEppZSKSpqglFJKRSVNUEoppaKSw+oAQqlOnTqmYcOGVoehlFKqAlauXLnXGFO35ONxlaAaNmzIihUrrA5DKaVUBYjIr6U9rlN8SimlopImKKWUUlFJE5RSSqmoFFdrUEopFUput5tt27ZRUFBgdShxISUlhfr165OUlBTU8ZqglFKqDNu2bSMzM5OGDRsiIlaHE9OMMezbt49t27bRqFGjoM7RKT6llCpDQUEBtWvX1uQUAiJC7dq1KzQa1QSllFLHoMkpdCr6d6kJSimlVFTSBKWUUlHuo48+QkT4/vvvrQ4lojRBKaUSyqE8NwVur9VhVMjkyZNp3749U6ZMCet1vN7o+nvRBKWUShi5hR4KPF4O5bs5kOvC7fVZHVK5cnJy+Prrr3nzzTePSlBjxoyhefPmtGzZkpEjRwKwefNmunTpQsuWLTnnnHPYsmULCxcu5OKLLz5y3pAhQxg3bhzg3yLukUceoX379rz//vu8/vrrtGnThpYtW3LZZZeRl5cHwK5du+jXrx8tW7akZcuWfPPNNzzwwAM8//zzR573/vvv54UXXgjZz65l5kqphODx+sgt9Bz52uX1sT/XRarTTobTgc127AX84bOGs2bnmpDGlHVCFs/1eO6Yx0yfPp0ePXpw2mmnUatWLVatWsU555zD559/zvTp01m2bBlpaWns378fgIEDBzJy5Ej69etHQUEBPp+P33///ZjXSElJYfHixQDs27ePm2++GYBRo0bx5ptvMnToUO644w4uvPBCPvroI7xeLzk5OZx44on079+fYcOG4fP5mDJlCt9++20I/mb8NEEppeKeMYZD+W5MKd/Ld3kpcHvJSHaQmmSPuqq9yZMnM3z4cAAGDBjA5MmTOeecc5g7dy7XX389aWlpANSqVYvDhw/zxx9/0K9fP8CfeIJx1VVXHfl8/fr1jBo1ioMHD5KTk0P37t0BmD9/Pu+++y4Adrud6tWrU716dWrXrs3q1avZtWsXZ599NrVr1w7Zz64JSikV93IKPXh8paUnP2PgcIGHPJeXzBQHyQ77UceUN9IJh3379jF//nzWr1+PiOD1ehERxowZgzHmqGRqTOk/o8PhwOf733RmyXuR0tPTj3x+3XXXMX36dFq2bMm4ceNYuHDhMWO86aabGDduHDt37uSGG26o4E94bLoGpZSKay6PjzxXcIv/Xp/hYJ6bg3kuPFGwPjVt2jQGDRrEr7/+yi+//MLvv/9Oo0aNWLx4Md26deOtt946ska0f/9+qlWrRv369Zk+fToAhYWF5OXlcfLJJ7Nx40YKCws5dOgQ8+bNK/Oahw8fpl69erjdbiZOnHjk8c6dO/PKK68A/mKK7OxsAPr168esWbNYvnz5kdFWqGiCUkrFLWMM2QXuCp9X6PGvT/mMKXNUEgmTJ08+Ml1X5LLLLmPSpEn06NGDPn360Lp1a7Kysnj66acBGD9+PC+88AItWrTgvPPOY+fOnTRo0IArr7ySFi1aMHDgQM4+++wyr/noo4/Srl07unbtyhlnnHHk8eeff54FCxbQvHlzWrVqxYYNGwBwOp107NiRK6+8Erv96JFnVYiVf/mh1rp1a6MNC5VSRbIL3OQHOXoqzZ7ft3L6GWdgFy
m3iCJR+Xw+zjnnHN5//31OPfXUco/ftGkTTZs2/dNjIrLSGNO65LE6glJKxaVCj7dKyak4rzF4vD58cfSGPhQ2btzIKaecQufOnYNKThWlRRJKqbjj8xmy8z3lH1gBBv8alQ+D3SZRV+1nhWbNmrF169awPb+OoJRScedwgSdsox0DeHwGr8/a9alEoAlKKRVXCtxeCjzh37LHZwwen8F3jPJ1VTWaoJRSccPnq1zVXlXo+lT4aIJSSsWN7AI3VuSJovUpj8+n034hpAlKKRUX8l1eCj3hvbl2V3bBMT92Hirgj4P5bD+Yz85D+eUeHypLliw5sn9eWV599VWaN29OVlYW7du3Z+PGjRW+zsGDB3n55ZfL/P51113HtGnTKvy8ZdEEpZSKeV6f4XCEp/aOJRzrUwsXLuS6664r9XuzZs2iR48exzz/mmuuYd26daxZs4Z//OMf3HXXXRWOobwEFWqaoJRSMa+sjWCtFqn1qXnz5tGlS5djHlOtWrUjn+fm5h4pk//oo4/o0qULxhh27NjBaaedxs6dO9mwYQNt27YlKyuLFi1a8NNPPzFy5Ei2bNlCVlYWI0aMwBjDkCFDaNasGb1792b37t0h/bn0PiilVEzLLfREdV+novUpIwabhP7+qb1795KUlET16tXLPXbs2LE8++yzuFwu5s+fD/j30vvggw8YO3Yss2bN4uGHH+aEE05g9OjRDBs2jIEDB+JyufB6vTzxxBOsX7+eNWv8bUc+/PBDfvjhB9atW8euXbto1qxZSDeM1QSllIpZJXs8RTOf8U/92QTsFdg2qV27dhQWFpKTk8P+/fvJysoC4Mknn6R79+588cUXdOvWLajnGjx4MIMHD2bSpEk89thjvPPOOwC8+OKLnHXWWZx77rlcffXVAPz1r39l9OjRbNu2jf79+5e6U8RXX33F1Vdfjd1u58QTT6RTp05B/1zB0Ck+pVRMOlaPp2jmMwZ3Bab9li1bxpo1a3jjjTfo06cPa9asYc2aNUd2Dv/888+PrD9df/31ZGVl0atXr2M+54ABA47seA7wxx9/YLPZ2LVr15G2HNdccw0zZswgNTWV7t27HxlxlRTOHTU0QSmlYlKuy3vMHk/Rzuvzr09VpSzdGMPatWuPjKrefvtt1qxZw2effXbUsT/99NORzz/99NMjIyKPx8P111/PpEmTaNq0Kc8++ywAW7dupXHjxtxxxx306dOHtWvXkpmZyeHDh488zwUXXMCUKVPwer3s2LGDBQsWVPpnKY1O8SmlYo7LY83U3vHVgutQG6yibZNslVyfWrlyJWeffXZQ57300kvMnTuXpKQkataseWR67/HHH6dDhw506NCBrKws2rRpQ+/evZk+fToTJkwgKSmJE044gQcffJBatWpx/vnnc9ZZZ9GzZ0/GjBnD/Pnzad68OaeddhoXXnhhZf4ayqTtNpRSMcUYw75cF94IjJ6K2m1ESkXbejz22GOccsopDBgwIIxRhVZF2m3oCEopFVMOF3oikpys4DUGn9dgswm2IEZFo0aNikBU1tEEpVSU83h9FHp8pCfrr2soezxFq6KydG8MlH8I4LCHr5RB/8crFeUO5bv97R2MoVpKktXhWCYcPZ6CYYzR3k9lqGgKreiSklbxKRXFcgo9RyrV8l1eDua5EnYz0nD2eCqLw5nM/n37E/bvPJSMMezbt4+UlOALTXQEpVSUcpdyE2qhx8eBPDc1UpMqtJge6yLV46mkarWP5+C+Xezduyfi144VFbnpOCUlhfr16wd9vCYopaJQ0U2opXF7fezLdVEzLSms8//RwooeT0XsDgc1jz/JkmvHAhE4LjO0pffFhfV/t4j0EJEfRGSziIws5fvVRWSmiHwnIhtE5Ppgz1UqnuWUU6nmM4b9eS5cYW4vEQ2s6vGkrBe2BCUidmAs0BNoBlwtIs1KHDYY2GiMaQlcBDwjIs4gz1UqLrk8PvKCqFQzBg7muShwx29VWyR6PKnoFc
4RVFtgszFmqzHGBUwB+pY4xgCZ4i+RyQD2A54gz1Uq7hhTseksg7/KL88VGxumVkS09XhSkRfOBHUS8Huxr7cFHivuJaApsB1YBwwzxviCPBcAEblFRFaIyIo9e3QhU8W2yt6EerjAY9k6TbjE4kawKrTCmaBKK+0o+f+tO7AGOBHIAl4SkWpBnut/0JjXjDGtjTGt69atW5V4lbJUVW9Cjacy9Gjv8aQiI5wJahvQoNjX9fGPlIq7HvjQ+G0GfgbOCPJcpeKGMaG5CbWoDD2UrcYjLZZ6PKnwCmeCWg6cKiKNRMQJDABmlDjmN6AzgIgcD5wObA3yXKXiRnYIb0J1e33sz4vMZqqhFqs9nlR4hO0+KGOMR0SGALMBO/CWMWaDiNwa+P6rwKPAOBFZh39a715jzF6A0s4NV6xKWanA7Q15JZ7XZ9iXW0jNNCdJMXSvVKz3eFKhpe02lLKQz2fYm1sYtvt8BKiWmkRKkj08Fwghl8fHgTyX1WGoCgjVjbpltduInbdWSsWhwwWesN6EGitl6BUtr1eJQROUUhaJ5P5yhws8UX1PUTz3eFKVpwlKKQtYsb9cnsvLoTx31JWhJ0KPJ1U5mqCUsoBV+8sVeLwcjKIydKt6PKnYoAlKqQizen85VxSVoVvR40nFDk1QSkWQ12c4XGj9WpDXZ9if67J0twarejyp2KEJSqkIys6PntYRPmM4kOui0IIkYVWPpx/2beKSKR2ZsmF81K3FqaNpglIqQvJcHlxRtr+cAQ7muSNepGDFGpzX5+WuObeyYscyhn9xC5dN68Hm/T9GNghVIZqglIoAj9dHTkH0FgNkF7jJidD+d1atwY1b+xord3zLC93f4JkuL7Nhz1o6TWjDU0seo9BTGPF4VPk0QSkVAdkFnqjfXy630OPfBy+MQxurejxty/6Nxxc/SMeTu3J506sZ2Px6Fl+7hotP7cczS0fTaUIbvv79q4jHpY5NE5RSYRZLrSMK3P4y9HAlqWwLNoI1xjBy/nB8xseTnV/E3x8V6qYfz8s9xzGl/0w8Pg+XTevOsNm3sC9/b4QjVGXRBKVUGMVi6wiX18f+3NCXoecWWrMG9/GP05j78+eMPP9f/KX6yUd9/6KTu7Bw0EruaDOCD76fTIdxWby3YYIWUUQBTVBKhVGsto7wBMrQPSFKKFYl6v35+7h/wd1kHd+Km7IGl3lcqiOVf7Z/hDkDl9Kk5qkM++JmLaKIApqglAqTnEJPTLeO8BnD/rzQlKFblagf/uo+DhUe4JmuL2O3lb+je9M6Z/LxVfN4qvNLR4oonln6uBZRWEQTlFJh4Pb6yIuxqb3SGAOH8txV6ldlVaL+6tf5vLdxPLe3upMz67YI+jyb2PhbixtZdO1qep9yKU8teZTOE9ryzbZFYYxWlUYTlFIh5m/fHptTe6UpatlRmSk6t0VTe3nuPEbMG0LjGqdw57n3Veo5jks/gVd6vcPkfjNweV30f78bw7/4O/vz94U4WlUWTVBKhVisT+2VJSdQhh6sovbtVnh6yWP8euhnnu76MqmO1Co9V8eGXVk4aCVD29zDtE2TaP9OFlM3TtQiigjQBKVUCLk8PvLiuHWEvwzdFdSLs1U9ntbuWs2rq57n/866gfPqdwjJc6YlpXF/+0eZM3AJjWs04Y7ZN3HFB73YcuCnkDy/Kp0mKKVCJFG6whZ6/GXox2rZYVWPJ4/Pw91zb6dO6nE80GF0yJ+/aZ2zmHHVfMZ0fpG1u1fTaXwbnl36by2iCBNNUEqFSCJ1hfX4DPvKKEP3r8FZUyDy31UvsG73Gh7v9CzVU2qE5Ro2sTGoxU0svnYNPU/pw5glj9BlYjuWbFscluslMk1QSoVAInaFLSpDd5XYVy/boh5PvxzcylPfPErPJpfQ+5RLw36949JP4NVe7zLx0ukUeAro935X7vziVi2iCCFNUEpVkZUjBqsZAwfzXEfK0Avc3iqVpFc+DsOIuUNIsjt5vNN/jmxnFA
mdG3Xny0GrGNL6bqZunED7d7J4f+MkLaIIAU1QKiwS6ZfTqhFDtCgqQ88p9Fi2Bvfexgks+n0Bo9o/Rr2MkyJ+/bSkNEZ1eIw5A5fQqEZjhs6+kSs/6M3WA5sjHks80QSlQs4XWJ84mBe6rXKiVaHHmhFDNMot9FjSjHFP7i4e+vJe2p14Hn9rcWPkAyimWd3mzLxqAU92eoHvdq+i4/jW/GfZE1pEUUmaoFTIHcp34/WZI9VehwvC28LBKj5f4k7tRZMHvhxBnieXp7u+jE2sf0mziY1rW97MokGr6d7kYp785mG6TDyXpVpEUWHW/2uquJJd4P7TjtUGyHN52ZvjiruRxuEEn9qLBl9s/YzpP7zP8LYjObXW6VaH8yfHZ9Tjtd4TmHDpR+R78rj0/a7cNec2DhTstzq0mKEJSoVMnstTZiWbL7CrwIEQ7pBtpQK3l4IQbKKqKi/HdZiR8+7g9NrNGNLmbqvDKVOXRj34ctAqBre+i/c2jKf9uCymbZocl7MKoaYJSoVEocfL4SBamru8PvbF+LSfz5cYN+RGu8e/fpAdOdt5tuvLOO1Oq8M5pvSkdB7oMJovBn7DydUbMmTWDVpEEQRNUKrKPF4fh/Iq9oKd5/KyJ6cwJu8dyi5wW1IMoP5nxfalvL3mv9yQdSut6rWzOpygnVm3BTOvWsATnZ5nza6VdBzfmueWPYnL67I6tKikCUpVic9nOJBXuZ27jfG/2O/PdcVUS/RCT2zEGq9cXhd3zbmdEzNP4r7zH7Y6nAqz2+xc1/IWFl27mm6Ne/PENw/ReUI7lv3xtdWhRR1NUKrSjDEczHdXuVDAHWgxnl3gPub+blbz6tReVHhx+dP8uH8TT3R6gQxnptXhVNoJGSfy+sUTGd/3Q/I9efSd2oW759yuRRTFaIJSlZZd4AnpyCff5WVvbiF5rugs3c7O16k9q/2473ue//ZJLj39Cro27ml1OCHRtXFPvhy0ittaDWfKhndpPy6LD7SIAgCJp7+E1q1bmxUrVlgdRkLILfSQE8ZGdA6bkJmShNMRHe+h8lyeoIpAwmHS+nF8v3cDozqMjvpigHDyGR+XTu3CT/t/4KtrV1M37TirQwq59bu/Y8S8IazeuYLTajWlWnI1q0MqV7cmnRnduWo7x4vISmNM65KPO6r0rCohFbi9YU1O4N8t+0Cei5QkO5nJDmy2yO2tVpLXZ8ixKDl9v3cj9867A7fPzZaDm3m990TSktIsicVq7659g2+3L+H5bq/HZXICOOu4lnxy1ULGr3uTWVtmRv8oSiDFkRK+p4/6v4AK0BFU+Lm9Pg7kuiLazlwEMpIdpDmteT9lVRGHz/jo814nth7YzO2t72T04gdod9L5jO/7AZkx8M46lHbk/EGHd87mnBPa8F7/TyK6Gawqmwgcl1n1BFXWCCo65k9UTPD6DAcrWbFXFcb4d23Ym1NIYYRvjs0tDO06W0WM++41VuxYxsMXjmFIm7t5pdc7rNixlMum9WRf/l5LYrKCMYaR84bj9XkY0/lFTU4JJKwJSkR6iMgPIrJZREaW8v0RIrIm8LFeRLwiUivwvV9EZF3gezosspgxhoN5Lku39ilKkIfy3BFpDOjx+sgN81RmWf44/DujFz/ARSd34fKmVwNw6elXMK7P+/y4byOXTu3Kjpw/LIkt0j756SNmb/2EEX99gIY1GlsdjoqgsCUoEbEDY4GeQDPgahFpVvwYY8xTxpgsY0wWcB/wpTGmeI1lx8D3jxr6qcjKzvfgiZIS8AKPl305hYHds8MXU3aBJ+KjRfjfiMFnfDxZYsTQpVEPJvWfwY6cP+jzXmd+ObjVgggj52DBAf654C5aHHc2t5wz1OpwVISFcwTVFthsjNlqjHEBU4C+xzj+amByGONRlZRT6Im6fecM/rj25brCMu2XY+HU3owfP2DOz59x73kPcnL1hkd9/7z6HZh22efkunLo815nNu3dEPkgI+TRRfezP38vT3cZi8
OmNV2JJpwJ6iTg92Jfbws8dhQRSQN6AB8Ue9gAX4jIShG5payLiMgtIrJCRFbs2bMnBGGr4grcXsumuYJRNO13MM8Vsmk/t9dHnkU/84GC/dy/8G5aHn8ON509uMzjsk5oxUdXzsEmQr+pXVm1c3kEo4yMxb9/ycT1b3Nrq2G0OP5sq8NRFghngiptJbOsV5BLgK9LTO+db4w5B/8U4WARuaC0E40xrxljWhtjWtetW7dqEas/cXl8ZOfHxs4JhR4f+3IKyanitJ+/fXvkC0GKPPzVfRzI38ezXV8pd8Rweu2mfHzVfKqn1OCKab1Y/PuXEYoy/PI9+YyYO5iG1Rtz97n3Wx2Oskg4E9Q2oEGxr+sD28s4dgAlpveMMdsDf+4GPsI/ZagixOszHMyPbDl5VRn8VXdV6T2VU2jdWtui3xYwZcO73N76Ts6s2yKoc06u3pAZV86jfrW/MPCjvsze8kmYo4yM/yz9Nz8f3MJTXV5K2Pu+VHgT1HLgVBFpJCJO/EloRsmDRKQ6cCHwcbHH0kUks+hzoBuwPoyxqmKKKvZi9Ra5yvaecnl85Fm0u3qeO48Rc4fQqEYT7jr3nxU69/iMenx0xRc0rXMWN8wcwIffTwlTlJGxYc9axq54lgFnDqLDXzpaHY6yUNgSlDHGAwwBZgObgKnGmA0icquI3Frs0H7AF8aY3GKPHQ8sFpHvgG+BT40xs8IVq/qzQ/nuqKnYqwqXN/iW88ZYuxHsM0tH88uhrTzVZSypjtQKn18rtTbTLv+cdiedz+DPb2Dcd6+FIcrw8/g83DXnNmqm1uZfF/zb6nCUxcJaFmOM+Qz4rMRjr5b4ehwwrsRjW4GW4YxNle5wgTuu2kkUtZwvcPvITHGQkmQv9bjDhZ6I3FtVmnW71/Dqyue55qzraN/gwko/T4Yzk4n9pvP3T//GyPnDOFx4iKFtR4Qw0vB7Y/VYvtu1ild7vUvNlFpWh6MspjtJqCPyXV7LprjCrWjar7Rti1wen2WNEz0+D3fPuZ1aqXV4sMPjVX6+VEcqb148mf5nXMXorx/ksUWjon8/t4BfD/3Ck988QtdGveh72uVWh6OigN5YoAD/i/ThBOh1VNR7KtVpJ8PpQMQ/pWmV11a9yNrdq3m990RqpNQMyXMm2ZN4qcdbZDir8dKKZ8h2ZfNEp+ewSfS+HzXGcO+8odjExhOdn9PtjBSgCUrh39In1ir2qirf5aXA7SXJZrNs+6ZfD/7MU0sepXvji7n41H4hfW6b2Hiy0/NUCySpHFc2z3d7nSR7UkivEyrTNk1m4a9zGd3xWU7KbFD+CSohaIJKcD6fvytujMwChZQx/kIKa65tGDFvCHabg393+k9YRgwiwqgOj1E9uTqjv36QHFcOr/WeENb2CJWxN28P//ryH7Su147rWpR5T75KQOWO+UVkiIiEZu5BRZ1D+ZHZeFX92dSNE/nqt/mMav8YJ2bWD+u1hrYdwROdnmfO1s8YOP1SclyHw3q9ivrXl//gsCubp7u8jN1WehGLSkzBTEqfACwXkamB3cl1cjhOZBe4LRtBJLI9ebt56Kt7aXviXxnU4qaIXPO6lrfwUo83WbptMZdP68n+/H0RuW555v08mw++n8LQNiM4o06z8k9QCaXcBGWMGQWcCrwJXAf8JCKPi0iTMMemwijP5bGsci3RPbhwBLnuHJ7qMjaihQuXNb2aty95j01719Pv/W7szClrY5fIyHXlcO/8Ozi11ukMa/uPiF3Xpu+xY0ZQvx3GX6e6M/DhAWoC00RkTBhjU2FS6PFy2KIW5oluztbP+eiHqQxrey+n124a8et3a9KbiZdO5/fsX+k7tQu/Hvol4jEUefKbh9mW/RvPdHmZZEdyRK4pArXTnVRPTULzVPQLZg3qDhFZCYwBvgaaG2NuA1oBl4U5PhViHq/P0rLqRJbjOszI+Xdweu1mDG1zj2VxtP/LRUy77DMOFRyk73ud+GHfpojHsGrHt7y+ei
zXtfw7bU86L2LXzUh2YLMJKUl26qQnk+yI3tJ7FdwIqg7Q3xjT3RjzvjHGDWCM8QEXhzU6FVKJXLEXDZ74+iG2H/6DZ7qMxWl3WhrLOfXa8tGVc/AZQ7+pXVmzc2XEru32url77mBOyKjH/ec/ErHrOmxCmvN/hcs2m1AjTUdT0SyYBPUZcKQNhohkikg7AGNM5N96qUoxxp+ctGLPGit3LOPNNa9wfdbfaX3iuVaHA0DTOmcy46p5pDszuPyDnnyzbVFErvvyyv+wae96nuj0PJnJ1SJyTYDMlNLvAUtJslNbR1NRKZh/kVeAnGJf5wYeUzEku8C6DrGJzuV1cdec26mXcSL/jOCIIRgNazRmxlXzqJdxEtd82Ic5Wz8P6/U27/+RZ5c+ziWn9qd7k8hNwKQk2XEeIwHZA6OpailJpTayU9YIJkGJKbaZV2BqT2/wjSG5hZ5K90dSVffS8mf4Yd9Gnuj8AhnOTKvDOUq9jJP46IovOK12U66feSXTf5galuv4jI975g4mxZHK6I7PhOUapRGBzOTgXrJSnXZqZyTjtOtoKhoE86+wNVAokRT4GAZsDXdgKjQK3F5yorhle7z7af8PPPftE/Q97XK6Ne5ldThlqpNWl2mXf07reu247bPrGL/2zZBfY9L6cSz9YzH/uuDfHJd+QsifvyxFhRHBstuEmulOMlMcOpqyWDAJ6lbgPOAP/F1y2wG6H0kMcHtjp2V7PPIZH/fMuZ00RzqPXfS01eGUq1pydSb1m0HHht0YMW8IY1c8G7Ln3pmznUcW/ZPzG1zI1WdeG7LnLU/JwoiKSHM6dDRlsXL/5QIt1wdEIBbLHcrzv5gnJ9lw2m0VetcVbXw+w8E8d0JtABttxq99k2Xbv+E/3f5L3fTjrQ4nKGlJaYzrM5Whs27k0UX3c7gwm3vP+1eV9wq8f8FdFHoKeKrzSxHdqbyswohgFY2m8lwecgo8+vsUYeUmKBFJAW4EzgSO7DJpjLkhjHFZwmcMLq+PAo9/vcZpt+F02Eh22HDE0Luoooo9q3bpVrAj5w8eWzyK9g0uYkCzv1kdToU47U5e7jmODGcmz337JIcKDzK647OV3vXis80f8+nmj7n//EdoXPOUEEdbtpaRteYAACAASURBVPIKIyoizenAabdpsVGEBTP2HQ98D3QHHgEG4m/hHvdcXh8ur4+cQv87qWSHjWRH6P7Th0t2vv4SWckYw33zh+P2uni6y9iY7G1kt9l5ustYMpOr8erK5zlcmM1z3V/DYavYdNmhgoPcN384Z9Ztwa2thocp2qNVpDAiWA67jVrpTnILPeQW6mgqEoL5FzzFGHOFiPQ1xrwjIpOA2eEOLNp4fYa8QMdZEUi226NyKjCn0HNkBKis8enm6cza8gkPdBhNwxqNrQ6n0kSEf3X4NzWSa/LENw+R4z7Mq73GV6hdx+jFD7Anbzfv9JkW0V5UFS2MqIj0ZAfJDh1NRUIwQ4GiVfaDInIWUB1oGLaIYoAxUODxcijfzZ6cQg7kusgt9OCx+D9rgdtLrlbsWepQwUH+Of8umh+Xxd/PucPqcKpMRBje7l5Gd3yWWVs+4W/T+5Pryin/RGDJtsW8u+4Nbj57CFkntApzpP9TlcKIoK8RGE1lJGulXzgFk6BeC/SDGgXMADYCT4Y1qhjjnwb0sC/Xxd6cQg4XuCmM8ChGK/aiw6OL7mdf/h6e6fJyhafDotmNWbfxQvc3+GbbV1z5YW8OFOw/5vEFngJGzB1Mg2on84/zHoxQlH5VLYyoiPRkB7XSnTiiaBYlnhzzN0hEbEC2MeYA8BUQu/MVEfLnqUB3RKYCvT7DgbzEatkejb7ZtogJ69/i9lZ30uL4s60OJ+SubDaQDGcGt342iP7vd+e9/jPLvJ/p+W+fZPOBH5nSfybpSekRizHFgjVih91G7Yxkcgo95OnaVEgd818ysGvEkAjFEndKTgXuD8NUoDGGg3ku3QDWYgWeAu6ZczsnV2
/EPX8dZXU4YdPrlL5M6PsRvxzcSt+pXfg9+9ejjtm0dz0vLn+ay5tew0Und4lYbAJkplg3as1IdlBTR1MhFcxbjTkico+INBCRWkUfYY8sDrnDMBV4KN+NRzeAtdx/lv2brQc381Tnl0hLSrM6nLC64OROvH/Zp+zP30ff9zrz0/4fjnzP6/Ny95zbqZZcg4cvjOxKQEZK+AojgpUUWJtKD3EFYaIKJkHdAAzGP8W3MvCxIpxBJYKiqcCDeW52Hy7gUJ6bArcXXwWSjT/BaRWR1TbsWcvYFc9yVbO/ccHJnawOJyJan3guH14xG5fPTd+pXVi7azUAb3/3Kqt2Luexi56idmqdiMUTicKIYIkIGYG1KbuOpqpETBzNDbVu3dqsWFH53Hkg14UrCspGk+y2wD1XZd8gnO/ykl2gRRFW8/q89J5yIb9n/8aia1dTK7W21SFF1NYDm7nig15kFx7iyc7Pc8/cIbQ76XwmXTo9ovd/1UxzRuX9icYY/9qUKz5v/RCB4zKDv+2g7OeRlcaY1iUfD2YniUGlPW6MebfKUalSub2+wHTg/24QdgZuEgZweXwcjuLkVOgp5Nllj9OhQUfa/+Uiq8MJqzfXvMyaXSt5tdc7CZecABrXPIUZV83jyg96c/vn15OWlM6Yzi9ENDlZURgRLBEhMyWJZIed7ALtx1ZRwfyrtin20QF4COgTxphUMaVNBR7Mj+6KvfsX3MXz347h8g96MnTWTezN22N1SGHx26Ff+ffXD9GlUU/6nnaF1eFY5qTMBky/ci5dGvXkyU4v0KDayRG7tuBfe4p2ToeN2ulOUp12q0OJKcFsFju0+NciUh3/9kcqwoqqAqPZ+LVvMmH9W9zWajjJ9mTGrniWuT9/zoMX/JsBzf4Wk9v+lMYYw73zhmITG090ei5ufq7Kqpt2HBMu/TDi101PdsTMOo+IUC0liRQdTQWtMuPiPODUUAeiYt/y7Uv454I76dSwO6PaP8bI8x9izsClnFrrdO784u/0n9b9TxVfsezD76ew4Nc53Hf+w9Sv9herw0lIdpvEZLWcjqaCV26CEpGZIjIj8PEJ8APwcfhDU7FkZ852bpx5DSdlNuDlnm9jt/l/+c6o04zpV87lmS4vs3HPOjpPaMuYbx6lwFNgccSVtzdvDw8sHEGrem25vuXfrQ4nYVl5z1NVFY2maqQlYUvw0fexBPMvXLzTmgf41RizLUzxqBhU6Cnkpk+uIcd9mPcu+4QaKTX/9H2b2BjY/Hq6Ne7FQ1+N5Nllj/Pxj+/zZOcXad/gQouirryHvryXw65snunyypFErCIrxWE/UjQUy5Iddupk+DeeLXBH9/S9FYKZ4vsNWGaM+dIY8zWwT0QahjUqFVNGLbybFTuW8Xy312ha58wyj6ubfjxje77Ne/0/wevzcvm0Htwx+2b25e+NYLRVs+CXOUz7fjJD2tzDGXWaWR1OQoqVwohgiQjVU3U0VZpgEtT7QPGbg7yBx5Ri/No3Gb/uTe5oM4JLTusf1DkXntyZBYNWMKztP/jw+yl0GJfFlA3jifZ78nJdOYyYN4RTa53O8Lb3Wh1OwoqlwoiKSHbYqZ3uJCUp9keGoRJMgnIYY1xFXwQ+d4YvJBUrVmxfyj8X3EnHht2497x/VejcVEcq953/MHMHLqNJzVMZ/sUt9J/Wnc37fwxTtFU3ZskjbMv+jae7jCXZkRzRa8ffy3Hl2G1CWhwXF9hs/tFU9dQkkuy26P+whff+s2DGyXtEpI8xZgaAiPQFYmdORoXFzpzt3PjJ1ZyU2YBXeo6r9FrMGXWa8fFV85i0fhyPLrqfThPaMLTNCO5oMyLiSeBYVu9cweurxzKoxc20O+n8iF47xWEn1Wn3bwoc0StHn8wUR0KU9Kck2XUkRXAjqFuBf4rIbyLyG3AvoKVLCczldXHTJ9dw2HWYt/tMPaoooqJsYuP/mt/A4mvXcPGp/Xhm6Wg6TWjD179/FaKIq8btdXP3nN
s5Lu14RrV/NKLXLlpvcTps1EhzJvRIKl4KI1Twyk1QxpgtxphzgWbAmcaY84wxm4N5chHpISI/iMhmERlZyvdHiMiawMd6EfEW7ZRe3rnKOqMW+Isinuv232MWRVRU3fTjebnnOCb3m4Hb6+ayad0ZNvsWy4soXln5HBv3ruOJzs9TLbl6RK9dfL0lkZNUvBVGqOAEcx/U4yJSwxiTY4w5LCI1ReSxIM6zA2OBnviT29Ui8qeyJ2PMU8aYLGNMFnAf8KUxZn8w5yprTFj3Fu+ue4Ohbe6hz2mXheUaHRt2ZeGgldzRZgQffD+ZDuOyeG/DBEuKKLYc+Ilnlo7m4lP70aPJJRG9tqOUG1ETNUnFa2GEOrZgpvh6GmMOFn0R6K7bK4jz2gKbjTFbA4UVU4C+xzj+amByJc9VEbBi+1Lumz+cjid3ZeR5D4X1WmlJafyz/SPMGbiUJjVPZdgXN3PZtB4RLaLwGR/3zB1MsiOF0Rc9E7HrFimrdXmiJal4L4xQZQsmQdlF5MhqtYikAsGsXp8E/F7s622Bx44iImlAD+CDSpx7i4isEJEVe/bE56ak0WBXzg5u/ORqTsysz8u9Kl8UUVFN65zJx1fNY0znF1m/5zs6TWjDM0sfp9BTGPZrT17/Dku2LeJfHf7N8Rn1wn694lKSjr1DdyIlqUQpjFBHCyZBTQDmiciNInIDMAcIptVGaf+jypqjuQT42hizv6LnGmNeM8a0Nsa0rlu3bhBhqYoqKorILsxmXJ+p1EyJbENlm9gY1OImFl+7hl6n9OWpJY/SeUJbvtm2KGzX3JWzg4cX3cd59S/gmrOuC9t1SiMCmUHsMZcISSq5WJsZlXiCKZIYAzwGNAXOBB41xgTTy3kb0KDY1/WB7WUcO4D/Te9V9FwVZqMW3sPyHUsDRRFnWRbHcekn8Gqvd5nU72NcXhf93+/G8C/+zv78fSG/1v0L76bQU8DTXcZG/N17RnLwrcvjOUkJZU9zqsQQ1F1WxphZxph7jDF3AzkiMjaI05YDp4pIIxFx4k9CM0oeFGjfcSF/3oA2qHNV+E1c9zbvrn2dIa3vpu/pl1sdDgCdGnZj4aCVDG1zD9M2TaL9O1lM3TgxZEUUn2+ewSc/fcTd595P45qnhOQ5g1WZ1uXxmqS0MEIFlaBEJEtEnhSRX/CPpr4v7xxjjAcYAswGNgFTjTEbRORWEbm12KH9gC+MMbnlnRvkz6RCZOWOZdy3YDgXndyF+85/2Opw/iQtKY372z/KnIFLaFSjMXfMvokrPujFlgM/Vel5swsPcd/84TSr05zbWg0PUbTBq+yIId6SlBZGKAAp612niJyGf+RyNbAPeA+4xxgTuXaZFdS6dWuzYsWKSp9/INeFy+sr/8AEsDt3J90mnkeyI4VZ1yyO+LpTRfiMj/Fr32T01w9Q6ClgWNt7Gdz6rkrtRHHvvGGMX/cGnw74krNPaB2GaMuWkmSnemrVprRcHl9c7DhRIy1J154SiIisNMYc9Qt3rBHU90Bn4BJjTHtjzIv4N4pVca6oKOJQ4SHevuS9qE5O4C+iuLblzSwatJoeTS5hzJJH6DKxHUu2La7Q8yz742veWfsaN589OOLJKdjCiPLEw0hKCyNUkWMlqMuAncACEXldRDqje1YmhAcWjuDb7Ut4rtt/aVa3udXhBO34jHr8t/d4Jl46nQJPAf3e78qdX9waVBFFgaeAe+YOpkG1k/lHBTe+DYWKFEaUJ5aTlBZGqOLKTFDGmI+MMVcBZwALgTuB40XkFRHpFqH4VIRNWj+Od9a+xuDWd0VNUURFdW7UnS8HrWJw67uYunEC7d/J4v2Nk45ZRPH8t2P4af8PjOn8IulJ6RGMtnKFEeWJ1SSVpoURqphgysxzjTETjTEX4y/3XgPo3nhxaNWObxk5fxgX/qUz/zz/EavDqZK0pDQe6DCaOQOX0LB6I4bOvpErP+jN1gNHbyO5ae8GXlr+NJefcTUdG3aNeKzVqr
juVJZYS1J2m5CuhRGqmAo18zDG7DfG/NcY0ylcASlr7M7dyQ0zB3BC+om82vvduGll3qxuc2ZetYAnOj3Pml0r6Ti+Nc8u/feRnSi8Pi93z7mNzOTqPHzRmIjHl+q0k2QPX0+dWEpSGcm6Y4T6s/B2m1IxoXhRhBU7RYSb3Wbnupa3sOja1XRvcnGgiOJclm5bzNvf/ZdVO5fz6IVjqJ1aJ6JxiUBGiKf2ShMLSSrZYdP+R+ooun+94sEv/8G325fwaq93YqoooqJOyDiR13pP4Mpm/8fI+cO49P2uOO1OOp7clf5nDIh4PJnJSSErjChPUZKKxhJ0LYxQZdERVIKbtP4dxn33X25vdSeXnn6l1eFERJdGPfhy0CpuazWck6s35snOL0Z8ainJbiM1wust0TqS0sIIVZYyb9SNRXqjbsWs2rmcS6d24dyT2jOp38c4bDqgjpRa6c6wrj0dSzTdzGsToU6GU9eeElxlbtRVcWx37k5uLCqK6PWuJqcICndhRHmiaSSlrTTUseirUgJyeV3c/MlADhYc4JMBC6mVWtvqkBJGpAojynNkTSrfhVWTKFoYocqjI6gE9K8v/8Gy7d/wn26vcmbdFlaHk1AiWRhRHqfDRo1UJ1YMYLQwQgVDE1SCmbzhXd7+7r/c1mp4whRFRAsrCiPKY1WS0sIIFQxNUAlk1c7l3DtvKBf8pRP3t3/U6nASTmaK9VN7pYl0krKJ7hihgqMJKkHsyd3FjTMHcHx6PS2KsECaxYUR5YlkktLCCBUsfZVKAC6vi5s/9RdFzLxqgRZFRJhNhIwQtNIIt6IkFc7CCaddCyNU8KL3LZ0KmYe+upelf3zNs11f4azjWlodTsKJpRGD02GjZlp4RlL+wojoT9QqemiCinNTNoznrTWvcmurYfQ74yqrw0k4sThiSLKHJ0mlJTtwRPE0p4o++r8ljq3euYJ75w2lQ4OOjGr/mNXhJJxYHjGEOklpYYSqDE1QAV//9jXPL3uazft/tDqUkCgqiqibfjyv9taiCCukOu0xPWIIZZKKpWlOFT1i97cnxOb9PI+HF91P+3da0uGds3l88YOs2rkcn4m9vfncXjc3f/p/HCjYz7hLpka8jYSKncKI8oQiScXiNKeKDrpZbDHrdmxh5k8zmLV5Jt9s+wqv8XJCej26N7mEnqdcwnn1L8Bpd4Yw4vC4f8FdvLnmFV7u+bYlbSQUVE9NiqsXZbfXx4G8ilf3Cf6NcWN5JKnCr6zNYjVBFVN8N/ODBQeY+/MsPt88gwW/ziHPnUumsxpdGvWgR5NL6NSwG5nJ1UIVeshM2TCe4V/cwt/PuYOHL3zS6nASktNuo2Z69L+RqajKJKk0p123NFLl0gQVhLLabeR78ln02wJmbZnJ7C2fsi9/D067k/YNOtKzySV0b9Kb49JPqEroIbFm50r6Tu1MmxP/ypT+M3XdyQLxPmKoSJLSVhoqWJqgghBMPyivz8uKHUv5fMtMPt88g18P/YwgtK7Xjh6nXELPJn1oXPOUSsdQWXvydtN94nnYbHZmX/O1rjtZJBFGDMEmqXib5lThowkqCBVtWGiM4ft9G5kVSFZrd68G4LRaTel5yiX0aHIJLY8/B5uE99202+vmig96sWbnCmYOWEDz47LCej1VukQaMZSXpOJ1mlOFhyaoIFS1o+627N+YteUTZm2ZyZJti/AaL/UyTvQXWTS5hL/W7xCWIotRC+7mjTUvM7bHW1zW9OqQP78KTqKNGMpKUvE+zalCTxNUEELZ8v1AwX7mbp3FrC0zmP/LHPI9eVRLrk6XRj3o2aQPHRt2JcOZWeXrTN04kTtm38TfzxnKwxeOCUHkqjKSAw0AE01pSSoRpjlVaGmCCkIoE1Rx+Z58Fv06n8+3zGT21k/Zn7+XZHsyHf7SiZ5NLqFb417UTT++ws+rRRHRQYDaGckJ29+oeJJKpGlOFTqaoIIQrgRVnNfnZfn2JUeKLH7L/gVBaHPiufRocg
k9T+lDoxpNyn2ePXm76THpfECYfc3X1EmrG9a4VdnSkx1xcVNuVRQlqWopiTXNqUJDE1QQIpGgivMXWWzg880zmbVl5pEii9NrN6Nnk/8VWZR8N+r2urnyw96s3rGcGVfNp8XxZ0cs5mA57TY8PoMvjv5/lUZHDP/j8fp03UlViiaoIEQ6QZX0e/avzN7yCZ9vmcnSbYvxGi8nZpxE9yYX07NJH/5avwNJ9iQeWHgPr68ey0s93uLyKCyKcNiEWulOjIHDhR4K3F6rQwqbRCuMUCocNEEFweoEVdz+/H3M/dlfZLHglznke/KpnlyDVvXaMf+X2dxy9hAeuegpq8M8igjUSvtzBVehx0t2vifuRlOJWhihVKhpggpCNCWo4vLceSz6zV9k8cWWT2l+XBYT+02PyqKIskYUxhgOF3rId8XHaCrRCyOUCqWyElT0vcKpo6QlpdG9ycV0b3IxRW8oonHNI81pL3O6S0SolpJEssMWF6OptGSHJielwkxXNGOMiERlckqy24KqZEt22KmT4SQ1hpvX2W3afE+pSNAEpapMxD+1F2ziLBpN1UxzxuQoRJvvKRUZYU1QItJDRH4Qkc0iMrKMYy4SkTUiskFEviz2+C8isi7wvcovLKmwq56aVKlE43TYqJ3uJC2GRiPJDhvJjtiJV6lYFrY1KBGxA2OBrsA2YLmIzDDGbCx2TA3gZaCHMeY3ETmuxNN0NMbsDVeMqurSkx1VesEWETJTkkh22MkucOP1Re/alIBu4aNUBIVzBNUW2GyM2WqMcQFTgL4ljrkG+NAY8xuAMWZ3GONRIeYMct0pqOeKgdFUuhZGKBVR4UxQJwG/F/t6W+Cx4k4DaorIQhFZKSKDin3PAF8EHr+lrIuIyC0iskJEVuzZsydkwatjs4lQPTW0o4mi0VStdCeOKEsEdptEdfJUKh6Fs8y8tFeYkvM3DqAV0BlIBZaIyFJjzI/A+caY7YFpvzki8r0x5qujntCY14DXwH8fVEh/AlWm6qlJ2MKURJLsNmqlO8l1eckr9Bz1n8YKWhihVOSFcwS1DWhQ7Ov6wPZSjplljMkNrDV9BbQEMMZsD/y5G/gI/5ShigIZyQ6cjvAWgIoIGckOakbBaCrFYdfCCKUsEM5XmeXAqSLSSEScwABgRoljPgY6iIhDRNKAdsAmEUkXkUwAEUkHugHrwxirClKyw0Z6BHfuTrLbqJ2RTEayo9QhebgJkJGi97MrZYWw/eYZYzwiMgSYDdiBt4wxG0Tk1sD3XzXGbBKRWcBawAe8YYxZLyKNgY8CUyoOYJIxZla4YlXBsQXuX7KCv1rQRnaBB3cEt6PSwgilrKN78RUTrXvxRQMBaqY7SYqCdgq5hR5yI7A2ZbcJdTKSw3wVpVRZe/FZ/2qjYkJGiiMqkhP4RzW1IpAsrRotKqX8ouMVR0W1FIedNGd0rcM4ApV+mSnhWZtKcdjDXgiilDo2/Q1Ux2S3CdVSoys5FZfmdFA7IxlnCEdTIv6ycqWUtTRBqTIJUKMCm8BaxW4TaqY7qZaSFJLRVEayI2z3eCmlgqcJSpWpWmrSnzrjRrtUp53aGckkV2FqzmGTqJvOVCpRxc6rj4qo1GM0H4xmdptQIy0wmqrEIEg3g1UqemiCUkdx2ITMCN6MGw6pTjt10is2mkpJ0sIIpaKJ/jaqPxGBGmnOqF93CoYtMJryN1M89rEixHxSVire6G+k+pNqKZVrPhjNUpLsOO02Dhd4KPB4Sz1GCyOUij46glJHpMXoulMwbDahelpSqaMpLYxQKjrpb6UC/JuyJkKBQGmjqUT4uZWKRZqgFCKEvPlgNCsaTSW7bXh8RgsjlIpSmqAU1VPjb90pGPE6nalUvNC3jgnO38ZCX6iVUtFHE1QCc9ptZGhptVIqSmmCSlA2kYRad1JKxR5NUAlI8K876X0/SqlopgkqAWWkOLRyTSkV9fRVKsEkO2x6U6pSKiZogkogdpuuOymlYocmqARRtO
4UD5vAKqUSgyaoBJGZkkRSDDUfVEopfcVKAClJdlKdejOuUiq2aIKKcw6bUC1FiyKUUrFHE1Qc03UnpVQs0wRVjNNhK7fzaiyplpqEQ9edlFIxSud+iklPdpCe7MDl8VHg8VLo9uEzxuqwKiU1jpsPKqUSgyaoUjgdNv9OCyng8vgo9HgpiKFklWS3kambwCqlYpy+ipWjKFllpoDb66PA7aXQ48Pri85kVdR8UNedlFKxThNUBSTZbf7RCeDx+ijw+BNWNCWraimJ2XxQKRV/NEFVksNuIyPQT8nj9VEYSFYeC5NVmq47KaXiiCaoEHDYbTjsNtKTHXh95sg0oNvri1gMTruNzBTdZ08pFT80QYWY3SaBakDw+syRAotwJquidSellIonmqDCyG4T0pwO0pzg85kjpeuuECerGqlObT6olIo7mqAixFYiWRWtWbm9PqqyapWRrM0HlVLxSROUBWw2IdXp38DV5zO4AuXrLk/FklWyw7/upZRS8Sisb71FpIeI/CAim0VkZBnHXCQia0Rkg4h8WZFz44HNJqQk2amR5qRuZjLVU5NIcdgpb8LOJkI1LYpQSsWxsL39FhE7MBboCmwDlovIDGPMxmLH1ABeBnoYY34TkeOCPTceifiTVUqSHWMcFHp8FLr9O1kUH1kJUCMtSdedlFJxLZwjqLbAZmPMVmOMC5gC9C1xzDXAh8aY3wCMMbsrcG5cK0pW1dOS/jeySrIjAhkpDm0+qJSKe+F8lTsJ+L3Y19sCjxV3GlBTRBaKyEoRGVSBcwEQkVtEZIWIrNizZ0+IQo8uR5JVahLHZaaQ5tR1J6VU/AvnK11p808lawAcQCugM5AKLBGRpUGe63/QmNeA1wBat24dPXsOKaWUqpJwJqhtQINiX9cHtpdyzF5jTC6QKyJfAS2DPFcppVQcC+cU33LgVBFpJCJOYAAwo8QxHwMdRMQhImlAO2BTkOcqpZSKY2EbQRljPCIyBJgN2IG3jDEbROTWwPdfNcZsEpFZwFrAB7xhjFkPUNq54YpVKaVU9BETI034gtG6dWuzYsUKq8NQSilVASKy0hjTuuTjWquslFIqKmmCUkopFZU0QSmllIpKmqCUUkpFJU1QSimlolJcVfGJyB7g1yo8RR1gb4jCCSeNM3RiIUbQOEMtFuKMhRghNHGebIypW/LBuEpQVSUiK0ordYw2GmfoxEKMoHGGWizEGQsxQnjj1Ck+pZRSUUkTlFJKqaikCerPXrM6gCBpnKETCzGCxhlqsRBnLMQIYYxT16CUUkpFJR1BKaWUikqaoJRSSkUlTVCAiLwlIrtFZL3VsRyLiDQQkQUisklENojIMKtjKklEUkTkWxH5LhDjw1bHdCwiYheR1SLyidWxlEVEfhGRdSKyRkSicrt+EakhItNE5PvA/8+/Wh1TSSJyeuDvsOgjW0SGWx1XaUTkzsDvz3oRmSwiKVbHVJKIDAvEtyFcf4+6BgWIyAVADvCuMeYsq+Mpi4jUA+oZY1aJSCawErjUGLPR4tCOEBEB0o0xOSKSBCwGhhljllocWqlE5C6gNVDNGHOx1fGURkR+AVobY6L2pk0ReQdYZIx5I9BkNM0Yc9DquMoiInbgD6CdMaYqN/eHnIichP/3ppkxJl9EpgKfGWPGWRvZ/4jIWcAUoC3gAmYBtxljfgrldXQEBRhjvgL2Wx1HeYwxO4wxqwKfH8bfffgka6P6M+OXE/gyKfARle+CRKQ+0Bt4w+pYYpmIVAMuAN4EMMa4ojk5BXQGtkRbcirGAaSKiANIA7ZbHE9JTYGlxpg8Y4wH+BLoF+qLaIKKUSLSEDgbWGZtJEcLTJutAXYDc4wxURdjwHPAP/B3c45mBvhCRFaKyC1WB1OKxsAe4O3AdOkbIpJudVDlGABMtjqI0hhj/gCeBn4DdgCHjDFfWBvVUdYDF4hIbRFJA3oBDUJ9EU1QMUhEMoAPgOHGmGyr4ynJGOM1xmQB9YG2gemAqCIiFw
O7jTErrY4lCOcbY84BegKDA1PS0cQBnAO8Yow5G8gFRlobUtkCU5B9gPetjqU0IlIT6As0d/jhdAAABgJJREFUAk4E0kXk/6yN6s+MMZuAJ4E5+Kf3vgM8ob6OJqgYE1jX+QCYaIz50Op4jiUwzbMQ6GFxKKU5H+gTWN+ZAnQSkQnWhlQ6Y8z2wJ+7gY/wz/tHk23AtmIj5Wn4E1a06gmsMsbssjqQMnQBfjbG7DHGuIEPgfMsjukoxpg3jTHnGGMuwL9EEtL1J9AEFVMCBQhvApuMMc9aHU9pRKSuiNQIfJ6K/5fte2ujOpox5j5jTH1jTEP80z3zjTFR9S4VQETSAwUxBKbNuuGfXokaxpidwO8icnrgoc5A1BTulOJqonR6L+A34FwRSQv8znfGv94cVUTkuMCffwH6E4a/U0eonzAWichk4CKgjohsA/5ljHnT2qhKdT7wN2BdYI0H4J/GmM8sjKmkesA7gSopGzDVGBO1Jdwx4HjgI//rFA5gkjFmlrUhlWooMDEwfbYVuN7ieEoVWC/pCvzd6ljKYoxZJiLTgFX4p81WE53bHn0gIrUBNzDYGHMg1BfQMnOllFJRSaf4lFJKRSVNUEoppaKSJiillFJRSROUUkqpqKQJSimlVFTSBKXijogYEXmm2Nf3iMhDIXrucSJyeSieq5zrXBHYFXxBiccbBn6+ocUee0lErivn+SIV9zQRaRz4/BcRqRP4vJWI/CwiZ4vIxdG+y72KDpqgVDwqBPoXvThGi8C9YcG6EbjdGNOxlO/tBoYF7jkKu8CGpcEcdyZgN8ZsLfF4C/y7S1xljFkNfIp/F4+0kAer4oomKBWPPPhvbLyz5DdKjiREJCfw50Ui8qWITBWRH0XkCREZKP7eVutEpEmxp+kiIosCx10cON8uIk+JyHIRWSsify/2vAtEZBKwrpR4rg48/3oReTLw2INAe+BVEXmqlJ9vDzAPuLaU57s5EMN3IvJBiSRQWtwpIvJ2IIbVItIx8Ph1IvK+iMzEv1FtPRH5Svx9lNaLSIdS4hoIfFzisabAdOBvxphvwb/jPf4tsKKyvYmKHpqgVLwaCwwUkeoVOKclMAxojn/HjtOMMW3xt+MYWuy4hsCF+Ft1vCr+ZnI34t91ug3QBrhZRBoFjm8L3G+MaVb8YiJyIv4NNzsBWUAbEbnUGPMIsAIYaIwZUUasTwB3lzIq+9AY08YY0xL/9jg3lhP3YABjTHP8WwC9I/9rjvdX4FpjTCfgGmB2YBPglsAajnY+/h5lxX0MDDHGLC7x+AqgtCSn1BGaoFRcCuzy/i5wRwVOWx7ouVUIbAGKWhysw//iXmSqMcYXaM62FTgD/x55gwJbUC0DagOnBo7/1hjzcynXawMsDGwK6gEm4u+rFMzP9zPwLf7EUdxZgVHSOvwjmjPLibs9MD7wnN8DvwKnBY6fY4wp6pO2HLg+sJbXPNCPrKR6+Ed3xc0Fbiolke7Gv1O3UmXSBKXi2XP4RxDFexN5CPy/D2zEWXwdp7DY575iX/v4876VJfcHM4AAQ40xWYGPRsV6+OSWEZ8E+4OU4XHgXv78ezwO/4ilOfAwULxVeFlxl+VI3IGmnhfg70I7XkQGlXJ8fonrAQwJ/PlyicdTAscrVSZNUCpuBd79T+XP01y/AK0Cn/fF3/G3oq4QEVtgXaox8AMwG7hN/O1QEJHTpPymfcuAC0WkTmCEcTX+zqRBCYx4NvLntZxMYEcgjoFBxP1V0XEichrwl8DjfyIiJ+Pvn/U6/h31S2unsQk4pcRjvsDPdbqIPFLs8dOIsl3ZVfTRBKXi3TNA8Wq+1/EnhW+BdpQ9ujmWH/Anks+BW40xBfjXqTYCq0RkPfBfyukWYIzZAdwHLMDf8G2VMaZkkUF5RuNvDFnkAfyJbw5HtzkpLe6XAXtgSvA94LrAFGdJFwFrRGQ1cBnwfCnH/H97d2zCMAxEAfQ0RSbJACFFSg+S+Vxnl6wRglycMARiVy
Zc4L1SHELdLyT05zH3Yew3Rb7cu4/ly5iHTX4zBw7Rsv/rEdkA/N6ZO0XWhlx/djj+koACDtNau0UWaj53Zs4R8eq9f3sJCCsBBUBJ7qAAKElAAVCSgAKgJAEFQEkCCoCSFsNKOT0n+2iKAAAAAElFTkSuQmCC\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 1 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"The best accuracy was with 0.7857142857142857 with k= 7\n" | |
] | |
} | |
], | |
"source": [ | |
"### Plot to find K with the highest accuracy\n", | |
"plt.plot(range(1,Ks),mean_acc,'g')\n", | |
"plt.fill_between(range(1,Ks),mean_acc - 1 * std_acc,mean_acc + 1 * std_acc, alpha=0.10)\n", | |
"plt.legend(('Accuracy ', '+/- 1xstd'))\n", | |
"plt.ylabel('Accuracy ')\n", | |
"plt.xlabel('Number of Neighbors (K)')\n", | |
"plt.tight_layout()\n", | |
"plt.show()\n", | |
"### Decision : which k should be used\n", | |
"print( \"The best accuracy was with\", mean_acc.max(), \"with k=\", mean_acc.argmax()+1) " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Train Model and Predict for k = 7 </h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',\n", | |
" metric_params=None, n_jobs=None, n_neighbors=7, p=2,\n", | |
" weights='uniform')" | |
] | |
}, | |
"execution_count": 23, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"#Train Model and Predict for k=7\n", | |
"k=7\n", | |
"KNN = KNeighborsClassifier(n_neighbors = k).fit(X_train,y_train)\n", | |
"KNN" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Predicting</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array(['PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n", | |
" dtype=object)" | |
] | |
}, | |
"execution_count": 24, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"#Predicting\n", | |
"yhat1 = KNN.predict(X_test)\n", | |
"yhat1[0:5]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Accuracy evaluation</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Model 1: KNN Train set Accuracy: 0.8079710144927537\n", | |
"Model 1: KNN Validation set (Test set) Accuracy: 0.7857142857142857\n", | |
"Model 1: KNN Jaccard Score: 0.7857142857142857\n", | |
"Model 1: KNN F1 Score 0.7766540244416351\n" | |
] | |
} | |
], | |
"source": [ | |
"from sklearn.metrics import f1_score\n", | |
"from sklearn.metrics import jaccard_similarity_score\n", | |
"print(\"Model 1: KNN Train set Accuracy: \", metrics.accuracy_score(y_train, KNN.predict(X_train)))\n", | |
"print(\"Model 1: KNN Validation set (Test set) Accuracy: \", metrics.accuracy_score(y_test, yhat1))\n", | |
"print(\"Model 1: KNN Jaccard Score:\",jaccard_similarity_score(y_test, yhat1))\n", | |
"print(\"Model 1: KNN F1 Score\", f1_score(y_test, yhat1, average=\"weighted\"))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"0.8079710144927537\n", | |
"0.7857142857142857\n", | |
"0.7857142857142857\n", | |
"0.7766540244416351\n" | |
] | |
} | |
], | |
"source": [ | |
"KNN1=metrics.accuracy_score(y_train, KNN.predict(X_train))\n", | |
"KNN2=metrics.accuracy_score(y_test, yhat1)\n", | |
"KNN3=jaccard_similarity_score(y_test, yhat1)\n", | |
"KNN4=f1_score(y_test, yhat1, average=\"weighted\")\n", | |
"print(KNN1)\n", | |
"print(KNN2)\n", | |
"print(KNN3)\n", | |
"print(KNN4)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Decision Tree" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Done\n" | |
] | |
} | |
], | |
"source": [ | |
"from sklearn.tree import DecisionTreeClassifier\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"print('Done')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Train set: (276, 8) (276,)\n", | |
"Test set: (70, 8) (70,)\n" | |
] | |
} | |
], | |
"source": [ | |
"#Train-test split\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=4)\n", | |
"print ('Train set:', X_train.shape, y_train.shape)\n", | |
"print ('Test set:', X_test.shape, y_test.shape)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Modeling and predicting</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=4,\n", | |
" max_features=None, max_leaf_nodes=None,\n", | |
" min_impurity_decrease=0.0, min_impurity_split=None,\n", | |
" min_samples_leaf=1, min_samples_split=2,\n", | |
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n", | |
" splitter='best')" | |
] | |
}, | |
"execution_count": 29, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"DT = DecisionTreeClassifier(criterion=\"entropy\", max_depth = 4)\n", | |
"DT # DT as \"drugtree\" it shows the default parameters" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 30, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=4,\n", | |
" max_features=None, max_leaf_nodes=None,\n", | |
" min_impurity_decrease=0.0, min_impurity_split=None,\n", | |
" min_samples_leaf=1, min_samples_split=2,\n", | |
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n", | |
" splitter='best')" | |
] | |
}, | |
"execution_count": 30, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"DT.fit(X_train,y_train)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 31, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array(['COLLECTION', 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'COLLECTION',\n", | |
" 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'COLLECTION', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'COLLECTION', 'COLLECTION', 'COLLECTION', 'PAIDOFF',\n", | |
" 'COLLECTION', 'COLLECTION', 'PAIDOFF', 'COLLECTION', 'PAIDOFF',\n", | |
" 'COLLECTION', 'COLLECTION', 'COLLECTION', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'COLLECTION', 'PAIDOFF',\n", | |
" 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'COLLECTION', 'PAIDOFF',\n", | |
" 'COLLECTION', 'COLLECTION', 'COLLECTION', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'COLLECTION',\n", | |
" 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF'], dtype=object)" | |
] | |
}, | |
"execution_count": 31, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"#Predicting\n", | |
"from sklearn import metrics\n", | |
"import matplotlib.pyplot as plt\n", | |
"Y_test_predict_DT = DT.predict(X_test)\n", | |
"Y_test_predict_DT" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Evaluation</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 32, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Model 2: DT Train set Accuracy: 0.7463768115942029\n", | |
"Model 2: DT Validation set Accuracy: 0.6142857142857143\n", | |
"Model 2: DT Jaccard Score: 0.6142857142857143\n", | |
"Model 2: DT F1 Score 0.6445993031358885\n" | |
] | |
} | |
], | |
"source": [ | |
"from sklearn import metrics\n", | |
"import matplotlib.pyplot as plt\n", | |
"print(\"Model 2: DT Train set Accuracy: \", metrics.accuracy_score(y_train, DT.predict(X_train)))\n", | |
"print(\"Model 2: DT Validation set Accuracy: \", metrics.accuracy_score(y_test, Y_test_predict_DT))\n", | |
"print(\"Model 2: DT Jaccard Score:\",jaccard_similarity_score(y_test, Y_test_predict_DT))\n", | |
"print(\"Model 2: DT F1 Score\", f1_score(y_test, Y_test_predict_DT, average=\"weighted\"))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 33, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"0.7463768115942029\n", | |
"0.6142857142857143\n", | |
"0.6142857142857143\n", | |
"0.6445993031358885\n" | |
] | |
} | |
], | |
"source": [ | |
"\n", | |
"DT1=metrics.accuracy_score(y_train, DT.predict(X_train))\n", | |
"DT2= metrics.accuracy_score(y_test, Y_test_predict_DT)\n", | |
"DT3=jaccard_similarity_score(y_test, Y_test_predict_DT)\n", | |
"DT4=f1_score(y_test, Y_test_predict_DT, average=\"weighted\")\n", | |
"print(DT1)\n", | |
"print(DT2)\n", | |
"print(DT3)\n", | |
"print(DT4)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Support Vector Machine" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
" <h3>Modeling and predicting</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 34, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n", | |
" decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',\n", | |
" max_iter=-1, probability=False, random_state=None, shrinking=True,\n", | |
" tol=0.001, verbose=False)" | |
] | |
}, | |
"execution_count": 34, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"from sklearn import svm\n", | |
"clf = svm.SVC(kernel='rbf', gamma='auto')\n", | |
"clf.fit(X_train, y_train)\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 35, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array(['COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n", | |
" dtype=object)" | |
] | |
}, | |
"execution_count": 35, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"#After being fitted, the model can then be used to predict new values:\n", | |
"Y_test_predict_SVM = clf.predict(X_test)\n", | |
"Y_test_predict_SVM [0:5]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Evaluation</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 36, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Model 3: SVM Train set Accuracy: 0.782608695652174\n", | |
"Model 3: SVM Validation set Accuracy: 0.7428571428571429\n", | |
"Model 3: SVM Jaccard Score: 0.7428571428571429\n", | |
"Model 3: SVM F1 Score 0.7275882012724117\n" | |
] | |
} | |
], | |
"source": [ | |
"from sklearn.metrics import f1_score\n", | |
"print(\"Model 3: SVM Train set Accuracy: \", metrics.accuracy_score(y_train, clf.predict(X_train)))\n", | |
"print(\"Model 3: SVM Validation set Accuracy: \", metrics.accuracy_score(y_test, Y_test_predict_SVM))\n", | |
"print(\"Model 3: SVM Jaccard Score:\",jaccard_similarity_score(y_test,Y_test_predict_SVM))\n", | |
"print(\"Model 3: SVM F1 Score\", f1_score(y_test, Y_test_predict_SVM, average=\"weighted\"))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 37, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"0.782608695652174\n", | |
"0.7428571428571429\n", | |
"0.7428571428571429\n", | |
"0.7275882012724117\n" | |
] | |
} | |
], | |
"source": [ | |
"from sklearn.metrics import f1_score\n", | |
"SVM1=metrics.accuracy_score(y_train, clf.predict(X_train))\n", | |
"SVM2= metrics.accuracy_score(y_test, Y_test_predict_SVM)\n", | |
"SVM3=jaccard_similarity_score(y_test,Y_test_predict_SVM)\n", | |
"SVM4= f1_score(y_test, Y_test_predict_SVM, average=\"weighted\")\n", | |
"print(SVM1)\n", | |
"print(SVM2)\n", | |
"print(SVM3)\n", | |
"print(SVM4)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Logistic Regression" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 38, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"LogisticRegression(C=0.01, class_weight=None, dual=False, fit_intercept=True,\n", | |
" intercept_scaling=1, max_iter=100, multi_class='warn',\n", | |
" n_jobs=None, penalty='l2', random_state=None, solver='liblinear',\n", | |
" tol=0.0001, verbose=0, warm_start=False)" | |
] | |
}, | |
"execution_count": 38, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"from sklearn.linear_model import LogisticRegression\n", | |
"from sklearn.metrics import confusion_matrix\n", | |
"LR = LogisticRegression(C=0.01, solver='liblinear').fit(X_train,y_train)\n", | |
"LR" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Predicting</h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 39, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array(['COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'COLLECTION', 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'COLLECTION',\n", | |
" 'COLLECTION', 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'COLLECTION',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'COLLECTION', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'COLLECTION', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF',\n", | |
" 'PAIDOFF', 'PAIDOFF'], dtype=object)" | |
] | |
}, | |
"execution_count": 39, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"from sklearn.metrics import log_loss\n", | |
"\n", | |
"Y_test_predict_LR = LR.predict(X_test)\n", | |
"Y_test_predict_LR" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<h3>Evaluation </h3>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 40, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Model 5: Log_reg Train set Accuracy: 0.7572463768115942\n", | |
"Model 5: Log_reg Validation set Accuracy: 0.6857142857142857\n", | |
"Model 5: Log_reg Jaccard Score: 0.6857142857142857\n", | |
"Model 5: Log_reg F1 Score 0.6670522459996144\n", | |
"Model 5: Log_reg log_loss Score 0.5772287609479654\n" | |
] | |
} | |
], | |
"source": [ | |
"\n", | |
"lg_loan_status_probas = LR.predict_proba(X_test)\n", | |
"lg_log_loss = log_loss(y_test, lg_loan_status_probas)\n", | |
"print(\"Model 5: Log_reg Train set Accuracy: \", metrics.accuracy_score(y_train, LR.predict(X_train)))\n", | |
"print(\"Model 5: Log_reg Validation set Accuracy: \", metrics.accuracy_score(y_test, Y_test_predict_LR))\n", | |
"print(\"Model 5: Log_reg Jaccard Score:\",jaccard_similarity_score(y_test, Y_test_predict_LR))\n", | |
"print(\"Model 5: Log_reg F1 Score\", f1_score(y_test,Y_test_predict_LR, average=\"weighted\"))\n", | |
"print(\"Model 5: Log_reg log_loss Score\", lg_log_loss)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 41, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"0.7572463768115942\n", | |
"0.6857142857142857\n", | |
"0.6857142857142857\n", | |
"0.6670522459996144\n", | |
"0.5772287609479654\n" | |
] | |
} | |
], | |
"source": [ | |
"LR1=metrics.accuracy_score(y_train, LR.predict(X_train))\n", | |
"LR2=metrics.accuracy_score(y_test, Y_test_predict_LR)\n", | |
"LR3=jaccard_similarity_score(y_test, Y_test_predict_LR)\n", | |
"LR4=f1_score(y_test,Y_test_predict_LR, average=\"weighted\")\n", | |
"LR5=lg_log_loss\n", | |
"print(LR1)\n", | |
"print(LR2)\n", | |
"print(LR3)\n", | |
"print(LR4)\n", | |
"print(LR5)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Model Evaluation using Test set" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 42, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from sklearn.metrics import jaccard_similarity_score\n", | |
"from sklearn.metrics import f1_score\n", | |
"from sklearn.metrics import log_loss" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"First, download and load the test set:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 43, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"--2019-08-08 18:44:30-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_test.csv\n", | |
"Resolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.193\n", | |
"Connecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.193|:443... connected.\n", | |
"HTTP request sent, awaiting response... 200 OK\n", | |
"Length: 3642 (3.6K) [text/csv]\n", | |
"Saving to: ‘loan_test.csv’\n", | |
"\n", | |
"loan_test.csv 100%[===================>] 3.56K --.-KB/s in 0s \n", | |
"\n", | |
"2019-08-08 18:44:30 (90.8 MB/s) - ‘loan_test.csv’ saved [3642/3642]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"!wget -O loan_test.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_test.csv" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"### Load Test set for evaluation " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 44, | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/8/2016</td>\n", | |
" <td>10/7/2016</td>\n", | |
" <td>50</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>5</td>\n", | |
" <td>5</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>300</td>\n", | |
" <td>7</td>\n", | |
" <td>9/9/2016</td>\n", | |
" <td>9/15/2016</td>\n", | |
" <td>35</td>\n", | |
" <td>Master or Above</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>21</td>\n", | |
" <td>21</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/10/2016</td>\n", | |
" <td>10/9/2016</td>\n", | |
" <td>43</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>24</td>\n", | |
" <td>24</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>9/10/2016</td>\n", | |
" <td>10/9/2016</td>\n", | |
" <td>26</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>35</td>\n", | |
" <td>35</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>800</td>\n", | |
" <td>15</td>\n", | |
" <td>9/11/2016</td>\n", | |
" <td>9/25/2016</td>\n", | |
" <td>29</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 1 1 PAIDOFF 1000 30 9/8/2016 \n", | |
"1 5 5 PAIDOFF 300 7 9/9/2016 \n", | |
"2 21 21 PAIDOFF 1000 30 9/10/2016 \n", | |
"3 24 24 PAIDOFF 1000 30 9/10/2016 \n", | |
"4 35 35 PAIDOFF 800 15 9/11/2016 \n", | |
"\n", | |
" due_date age education Gender \n", | |
"0 10/7/2016 50 Bechalor female \n", | |
"1 9/15/2016 35 Master or Above male \n", | |
"2 10/9/2016 43 High School or Below female \n", | |
"3 10/9/2016 26 college male \n", | |
"4 9/25/2016 29 Bechalor male " | |
] | |
}, | |
"execution_count": 44, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"test_df = pd.read_csv('loan_test.csv')\n", | |
"test_df.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 45, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>50</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>5</td>\n", | |
" <td>5</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>300</td>\n", | |
" <td>7</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-09-15</td>\n", | |
" <td>35</td>\n", | |
" <td>Master or Above</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>21</td>\n", | |
" <td>21</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-10</td>\n", | |
" <td>2016-10-09</td>\n", | |
" <td>43</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>female</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>24</td>\n", | |
" <td>24</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-10</td>\n", | |
" <td>2016-10-09</td>\n", | |
" <td>26</td>\n", | |
" <td>college</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>35</td>\n", | |
" <td>35</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>800</td>\n", | |
" <td>15</td>\n", | |
" <td>2016-09-11</td>\n", | |
" <td>2016-09-25</td>\n", | |
" <td>29</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>male</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 1 1 PAIDOFF 1000 30 2016-09-08 \n", | |
"1 5 5 PAIDOFF 300 7 2016-09-09 \n", | |
"2 21 21 PAIDOFF 1000 30 2016-09-10 \n", | |
"3 24 24 PAIDOFF 1000 30 2016-09-10 \n", | |
"4 35 35 PAIDOFF 800 15 2016-09-11 \n", | |
"\n", | |
" due_date age education Gender \n", | |
"0 2016-10-07 50 Bechalor female \n", | |
"1 2016-09-15 35 Master or Above male \n", | |
"2 2016-10-09 43 High School or Below female \n", | |
"3 2016-10-09 26 college male \n", | |
"4 2016-09-25 29 Bechalor male " | |
] | |
}, | |
"execution_count": 45, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"dp = pd.read_csv('loan_test.csv')\n", | |
"dp['due_date'] = pd.to_datetime(dp['due_date'])\n", | |
"dp['effective_date'] = pd.to_datetime(dp['effective_date'])\n", | |
"dp.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 46, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>Unnamed: 0.1</th>\n", | |
" <th>loan_status</th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>effective_date</th>\n", | |
" <th>due_date</th>\n", | |
" <th>age</th>\n", | |
" <th>education</th>\n", | |
" <th>Gender</th>\n", | |
" <th>dayofweek</th>\n", | |
" <th>weekend</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-08</td>\n", | |
" <td>2016-10-07</td>\n", | |
" <td>50</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>1</td>\n", | |
" <td>3</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>5</td>\n", | |
" <td>5</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>300</td>\n", | |
" <td>7</td>\n", | |
" <td>2016-09-09</td>\n", | |
" <td>2016-09-15</td>\n", | |
" <td>35</td>\n", | |
" <td>Master or Above</td>\n", | |
" <td>0</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>21</td>\n", | |
" <td>21</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-10</td>\n", | |
" <td>2016-10-09</td>\n", | |
" <td>43</td>\n", | |
" <td>High School or Below</td>\n", | |
" <td>1</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>24</td>\n", | |
" <td>24</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>2016-09-10</td>\n", | |
" <td>2016-10-09</td>\n", | |
" <td>26</td>\n", | |
" <td>college</td>\n", | |
" <td>0</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>35</td>\n", | |
" <td>35</td>\n", | |
" <td>PAIDOFF</td>\n", | |
" <td>800</td>\n", | |
" <td>15</td>\n", | |
" <td>2016-09-11</td>\n", | |
" <td>2016-09-25</td>\n", | |
" <td>29</td>\n", | |
" <td>Bechalor</td>\n", | |
" <td>0</td>\n", | |
" <td>6</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n", | |
"0 1 1 PAIDOFF 1000 30 2016-09-08 \n", | |
"1 5 5 PAIDOFF 300 7 2016-09-09 \n", | |
"2 21 21 PAIDOFF 1000 30 2016-09-10 \n", | |
"3 24 24 PAIDOFF 1000 30 2016-09-10 \n", | |
"4 35 35 PAIDOFF 800 15 2016-09-11 \n", | |
"\n", | |
" due_date age education Gender dayofweek weekend \n", | |
"0 2016-10-07 50 Bechalor 1 3 0 \n", | |
"1 2016-09-15 35 Master or Above 0 4 1 \n", | |
"2 2016-10-09 43 High School or Below 1 5 1 \n", | |
"3 2016-10-09 26 college 0 5 1 \n", | |
"4 2016-09-25 29 Bechalor 0 6 1 " | |
] | |
}, | |
"execution_count": 46, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"#Preprosessing test set:\n", | |
"test_df['due_date'] = pd.to_datetime(test_df['due_date'])\n", | |
"test_df['effective_date'] = pd.to_datetime(test_df['effective_date'])\n", | |
"test_df['dayofweek'] = test_df['effective_date'].dt.dayofweek\n", | |
"test_df['weekend'] = test_df['dayofweek'].apply(lambda x: 1 if (x>3) else 0)\n", | |
"test_df['Gender'].replace(to_replace=['male','female'], value=[0,1],inplace=True)\n", | |
"test_df.head(5)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 47, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Principal</th>\n", | |
" <th>terms</th>\n", | |
" <th>age</th>\n", | |
" <th>Gender</th>\n", | |
" <th>weekend</th>\n", | |
" <th>Bechalor</th>\n", | |
" <th>High School or Below</th>\n", | |
" <th>college</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>50</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>300</td>\n", | |
" <td>7</td>\n", | |
" <td>35</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>43</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>1000</td>\n", | |
" <td>30</td>\n", | |
" <td>26</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>800</td>\n", | |
" <td>15</td>\n", | |
" <td>29</td>\n", | |
" <td>0</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Principal terms age Gender weekend Bechalor High School or Below \\\n", | |
"0 1000 30 50 1 0 1 0 \n", | |
"1 300 7 35 0 1 0 0 \n", | |
"2 1000 30 43 1 1 0 1 \n", | |
"3 1000 30 26 0 1 0 0 \n", | |
"4 800 15 29 0 1 1 0 \n", | |
"\n", | |
" college \n", | |
"0 0 \n", | |
"1 0 \n", | |
"2 0 \n", | |
"3 1 \n", | |
"4 0 " | |
] | |
}, | |
"execution_count": 47, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"Feature_test = test_df[['Principal','terms','age','Gender','weekend']]\n", | |
"Feature_test = pd.concat([Feature_test,pd.get_dummies(test_df['education'])], axis=1)\n", | |
"Feature_test.drop(['Master or Above'], axis = 1,inplace=True)\n", | |
"Feature_test.head()\n", | |
"x_datatest = Feature_test\n", | |
"x_datatest[0:5]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 48, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[1 1 1 1 1]\n" | |
] | |
}, | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/sklearn/preprocessing/data.py:625: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n", | |
" return self.partial_fit(X, y)\n", | |
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/ipykernel_launcher.py:3: DataConversionWarning: Data with input dtype uint8, int64 were all converted to float64 by StandardScaler.\n", | |
" This is separate from the ipykernel package so we can avoid doing imports until\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"array([[ 0.51578458, 0.92071769, 2.33152555, -0.42056004, -1.20577805,\n", | |
" -0.38170062, 1.13639374, -0.86968108],\n", | |
" [ 0.51578458, 0.92071769, 0.34170148, 2.37778177, -1.20577805,\n", | |
" 2.61985426, -0.87997669, -0.86968108],\n", | |
" [ 0.51578458, -0.95911111, -0.65321055, -0.42056004, -1.20577805,\n", | |
" -0.38170062, -0.87997669, 1.14984679],\n", | |
" [ 0.51578458, 0.92071769, -0.48739188, 2.37778177, 0.82934003,\n", | |
" -0.38170062, -0.87997669, 1.14984679],\n", | |
" [ 0.51578458, 0.92071769, -0.3215732 , -0.42056004, 0.82934003,\n", | |
" -0.38170062, -0.87997669, 1.14984679]])" | |
] | |
}, | |
"execution_count": 48, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"y_datatest = test_df['loan_status'].replace(to_replace=['PAIDOFF','COLLECTION'], value=[1,0]).values\n", | |
"print(y_datatest[0:5])\n", | |
"x_datatest= preprocessing.StandardScaler().fit(x_datatest).transform(x_datatest)\n", | |
"X[0:5]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 49, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Algorithm</th>\n", | |
" <th>Jaccard</th>\n", | |
" <th>F1-score</th>\n", | |
" <th>LogLoss</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>KNN</td>\n", | |
" <td>0.785714</td>\n", | |
" <td>0.776654</td>\n", | |
" <td>NA</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>Decision Tree</td>\n", | |
" <td>0.614286</td>\n", | |
" <td>0.644599</td>\n", | |
" <td>NA</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>SVM</td>\n", | |
" <td>0.742857</td>\n", | |
" <td>0.727588</td>\n", | |
" <td>NA</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>Log_Reg</td>\n", | |
" <td>0.685714</td>\n", | |
" <td>0.667052</td>\n", | |
" <td>0.577229</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Algorithm Jaccard F1-score LogLoss\n", | |
"0 KNN 0.785714 0.776654 NA\n", | |
"1 Decision Tree 0.614286 0.644599 NA\n", | |
"2 SVM 0.742857 0.727588 NA\n", | |
"3 Log_Reg 0.685714 0.667052 0.577229" | |
] | |
}, | |
"execution_count": 49, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"col_names = ['Algorithm','Jaccard', 'F1-score','LogLoss']\n", | |
"Report_df = pd.DataFrame(columns = col_names)\n", | |
"Report_df.loc[len(Report_df)] = ['KNN', KNN3,KNN4,'NA']\n", | |
"Report_df.loc[len(Report_df)] = ['Decision Tree', DT3, DT4,'NA']\n", | |
"Report_df.loc[len(Report_df)] = ['SVM', SVM3, SVM4,'NA']\n", | |
"Report_df.loc[len(Report_df)] = ['Log_Reg', LR3,LR4,LR5]\n", | |
"Report_df\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Report\n", | |
"You should be able to report the accuracy of the built model using different evaluation metrics:" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"| Algorithm | Jaccard | F1-score | LogLoss |\n", | |
"|--------------------|---------|----------|---------|\n", | |
"| KNN | ? | ? | NA |\n", | |
"| Decision Tree | ? | ? | NA |\n", | |
"| SVM | ? | ? | NA |\n", | |
"| LogisticRegression | ? | ? | ? |" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"button": false, | |
"new_sheet": false, | |
"run_control": { | |
"read_only": false | |
} | |
}, | |
"source": [ | |
"<h2>Want to learn more?</h2>\n", | |
"\n", | |
"IBM SPSS Modeler is a comprehensive analytics platform that has many machine learning algorithms. It has been designed to bring predictive intelligence to decisions made by individuals, by groups, by systems – by your enterprise as a whole. A free trial is available through this course, available here: <a href=\"http://cocl.us/ML0101EN-SPSSModeler\">SPSS Modeler</a>\n", | |
"\n", | |
"Also, you can use Watson Studio to run these notebooks faster with bigger datasets. Watson Studio is IBM's leading cloud solution for data scientists, built by data scientists. With Jupyter notebooks, RStudio, Apache Spark and popular libraries pre-packaged in the cloud, Watson Studio enables data scientists to collaborate on their projects without having to install anything. Join the fast-growing community of Watson Studio users today with a free account at <a href=\"https://cocl.us/ML0101EN_DSX\">Watson Studio</a>\n", | |
"\n", | |
"<h3>Thanks for completing this lesson!</h3>\n", | |
"\n", | |
"<h4>Author: <a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a></h4>\n", | |
"<p><a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a>, PhD is a Data Scientist in IBM with a track record of developing enterprise level applications that substantially increases clients’ ability to turn data into actionable knowledge. He is a researcher in data mining field and expert in developing advanced analytic methods like machine learning and statistical modelling on large datasets.</p>\n", | |
"\n", | |
"<hr>\n", | |
"\n", | |
"<p>Copyright © 2018 <a href=\"https://cocl.us/DX0108EN_CC\">Cognitive Class</a>. This notebook and its source code are released under the terms of the <a href=\"https://bigdatauniversity.com/mit-license/\">MIT License</a>.</p>" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python", | |
"language": "python", | |
"name": "conda-env-python-py" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.7" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
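The report table in this notebook lists three metrics: Jaccard, F1-score, and LogLoss. As a rough sketch of what each one measures for a binary target (1 = PAIDOFF, 0 = COLLECTION, as encoded in the test cell), the helpers below compute them from their textbook definitions using only the standard library. The function names and the toy labels are hypothetical and are not the notebook's own code. The notebook's scores may come from library functions that use a different averaging scheme, so these helpers are not guaranteed to reproduce the numbers in the table.

```python
import math


def _counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, fn


def jaccard_binary(y_true, y_pred, positive=1):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    tp, fp, fn = _counts(y_true, y_pred, positive)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def f1_binary(y_true, y_pred, positive=1):
    """F1-score of the positive class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = _counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def log_loss_binary(y_true, prob_positive, eps=1e-15):
    """Mean negative log-likelihood of 0/1 labels given P(label == 1)."""
    total = 0.0
    for t, p in zip(y_true, prob_positive):
        p = min(max(p, eps), 1 - eps)  # clip so log() never sees 0 or 1
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# Toy data (made up for illustration, not taken from the loan dataset).
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
prob = [0.9, 0.8, 0.4, 0.6, 0.2]  # predicted probability of class 1

print(jaccard_binary(y_true, y_pred))
print(f1_binary(y_true, y_pred))
print(log_loss_binary(y_true, prob))
```

Only a model that outputs class probabilities can be scored with log loss, which is why the table shows NA in that column for KNN, Decision Tree, and SVM.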