@lionello
Last active February 20, 2021 17:45
{
"cells": [
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "<a href=\"https://www.bigdatauniversity.com\"><img src=\"https://ibm.box.com/shared/static/cw2c7r3o20w9zn8gkecaeyjhgw3xdgbj.png\" width=\"400\" align=\"center\"></a>\n\n<h1 align=\"center\"><font size=\"5\">Classification with Python</font></h1>"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "In this notebook we practice the classification algorithms covered in this course.\n\nWe load a dataset using the pandas library, apply the algorithms to it, and use accuracy evaluation methods to find the best one for this specific dataset.\n\nLet's first load the required libraries:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "import itertools\nimport numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nimport matplotlib.ticker as ticker\nfrom matplotlib.ticker import NullFormatter\nfrom sklearn import preprocessing\n%matplotlib inline",
"execution_count": 77,
"outputs": []
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### About dataset"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "This dataset is about past loans. The __Loan_train.csv__ data set includes details of 346 customers whose loans are already paid off or defaulted. It includes the following fields:\n\n| Field | Description |\n|----------------|---------------------------------------------------------------------------------------|\n| Loan_status | Whether a loan is paid off or in collection |\n| Principal | Basic principal loan amount at the |\n| Terms | Origination terms, which can be a weekly (7 days), biweekly, or monthly payoff schedule |\n| Effective_date | When the loan was originated and took effect |\n| Due_date | Since it\u2019s a one-time payoff schedule, each loan has a single due date |\n| Age | Age of applicant |\n| Education | Education of applicant |\n| Gender | The gender of applicant |"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Let's download the dataset:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "!wget -O loan_train.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv",
"execution_count": 78,
"outputs": [
{
"output_type": "stream",
"text": "--2021-02-20 17:16:25-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv\nResolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196\nConnecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 23101 (23K) [text/csv]\nSaving to: \u2018loan_train.csv\u2019\n\nloan_train.csv 100%[===================>] 22.56K --.-KB/s in 0.001s \n\n2021-02-20 17:16:25 (15.2 MB/s) - \u2018loan_train.csv\u2019 saved [23101/23101]\n\n",
"name": "stdout"
}
]
},
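{
"metadata": {},
"cell_type": "markdown",
"source": "If `wget` is not available (for example on Windows), the same file can probably be fetched from Python instead. The cell below is a minimal sketch and not part of the original lab: the helper name `download_if_missing` is hypothetical, and it assumes the URL used above is still reachable."
},
{
"metadata": {},
"cell_type": "code",
"source": "import os\nimport urllib.request\n\n# Same URL as in the wget cell above; it may no longer be reachable\nURL = 'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_train.csv'\n\ndef download_if_missing(url, path):\n    # Hypothetical helper: fetch the file only when it is not already on disk\n    if not os.path.exists(path):\n        urllib.request.urlretrieve(url, path)\n    return path\n\ndownload_if_missing(URL, 'loan_train.csv')",
"execution_count": null,
"outputs": []
},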
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Load Data From CSV File "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df = pd.read_csv('loan_train.csv')\ndf.head()",
"execution_count": 79,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 79,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 9/8/2016 \n1 2 2 PAIDOFF 1000 30 9/8/2016 \n2 3 3 PAIDOFF 1000 15 9/8/2016 \n3 4 4 PAIDOFF 1000 30 9/9/2016 \n4 6 6 PAIDOFF 1000 30 9/9/2016 \n\n due_date age education Gender \n0 10/7/2016 45 High School or Below male \n1 10/7/2016 33 Bechalor female \n2 9/22/2016 27 college male \n3 10/8/2016 28 college female \n4 10/8/2016 29 college male ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/8/2016</td>\n <td>10/7/2016</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>male</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/8/2016</td>\n <td>10/7/2016</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>female</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>9/8/2016</td>\n <td>9/22/2016</td>\n <td>27</td>\n <td>college</td>\n <td>male</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/9/2016</td>\n <td>10/8/2016</td>\n <td>28</td>\n <td>college</td>\n <td>female</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/9/2016</td>\n <td>10/8/2016</td>\n <td>29</td>\n <td>college</td>\n <td>male</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.shape",
"execution_count": 80,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 80,
"data": {
"text/plain": "(346, 10)"
},
"metadata": {}
}
]
},
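{
"metadata": {},
"cell_type": "markdown",
"source": "The two `Unnamed` columns in the preview look like leftover row indices rather than loan attributes. That is an assumption based on the output above, not something the dataset description states. A hedged sketch of inspecting the column types and setting those columns aside in a separate frame, here called `df_features` (a name introduced for illustration), so that `df` itself is left unchanged:"
},
{
"metadata": {},
"cell_type": "code",
"source": "# Column dtypes and non-null counts\ndf.info()\n\n# Assumption: 'Unnamed: 0' and 'Unnamed: 0.1' are leftover indices, not features.\n# errors='ignore' keeps the call safe if either column is absent.\ndf_features = df.drop(columns=['Unnamed: 0', 'Unnamed: 0.1'], errors='ignore')\ndf_features.shape",
"execution_count": null,
"outputs": []
},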
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Convert to datetime objects"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['due_date'] = pd.to_datetime(df['due_date'])\ndf['effective_date'] = pd.to_datetime(df['effective_date'])\ndf.head()",
"execution_count": 81,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 81,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 2016-09-08 \n1 2 2 PAIDOFF 1000 30 2016-09-08 \n2 3 3 PAIDOFF 1000 15 2016-09-08 \n3 4 4 PAIDOFF 1000 30 2016-09-09 \n4 6 6 PAIDOFF 1000 30 2016-09-09 \n\n due_date age education Gender \n0 2016-10-07 45 High School or Below male \n1 2016-10-07 33 Bechalor female \n2 2016-09-22 27 college male \n3 2016-10-08 28 college female \n4 2016-10-08 29 college male ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>male</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>female</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>2016-09-08</td>\n <td>2016-09-22</td>\n <td>27</td>\n <td>college</td>\n <td>male</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>28</td>\n <td>college</td>\n <td>female</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>29</td>\n <td>college</td>\n <td>male</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
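{
"metadata": {},
"cell_type": "markdown",
"source": "With both columns parsed as datetimes, date arithmetic becomes available. As a small illustration, the cell below rebuilds the first three rows of the preview by hand (the `sample` frame and the `duration_days` column are names introduced here) and subtracts the two dates. For these rows the duration comes out one day shorter than `terms`."
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# First three rows from the preview above, re-typed by hand for illustration\nsample = pd.DataFrame({\n    'terms': [30, 30, 15],\n    'effective_date': ['9/8/2016', '9/8/2016', '9/8/2016'],\n    'due_date': ['10/7/2016', '10/7/2016', '9/22/2016'],\n})\n\n# An explicit format avoids month/day ambiguity in strings such as 9/8/2016\nfor col in ['effective_date', 'due_date']:\n    sample[col] = pd.to_datetime(sample[col], format='%m/%d/%Y')\n\n# Timedelta between the two dates, expressed in whole days\nsample['duration_days'] = (sample['due_date'] - sample['effective_date']).dt.days\nsample",
"execution_count": null,
"outputs": []
},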
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "# Data visualization and pre-processing\n\n"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Let\u2019s see how many of each class are in our data set:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['loan_status'].value_counts()",
"execution_count": 82,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 82,
"data": {
"text/plain": "PAIDOFF 260\nCOLLECTION 86\nName: loan_status, dtype: int64"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "260 people have paid off the loan on time, while 86 have gone into collection.\n"
},
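{
"metadata": {},
"cell_type": "markdown",
"source": "The classes are imbalanced, which matters when accuracy is used to compare models: a classifier that always predicts `PAIDOFF` would already be right about 75% of the time on this data. A short sketch of that baseline, using only the counts shown above (the variable names are introduced here for illustration):"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Class counts copied from the value_counts() output above\ncounts = pd.Series({'PAIDOFF': 260, 'COLLECTION': 86})\n\n# Share of each class; the largest share is the majority-class baseline accuracy\nshares = counts / counts.sum()\nprint(shares.round(3))\nprint('Majority-class baseline accuracy:', round(shares.max(), 3))",
"execution_count": null,
"outputs": []
},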
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's plot some columns to understand the data better:"
},
{
"metadata": {},
"cell_type": "code",
"source": "# note: installing seaborn might take a few minutes\n!conda install -c anaconda seaborn -y",
"execution_count": 83,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "import seaborn as sns\n\nbins = np.linspace(df.Principal.min(), df.Principal.max(), 10)\ng = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\ng.map(plt.hist, 'Principal', bins=bins, ec=\"k\")\n\ng.axes[-1].legend()\nplt.show()",
"execution_count": 84,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x216 with 2 Axes>"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "bins = np.linspace(df.age.min(), df.age.max(), 10)\ng = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\ng.map(plt.hist, 'age', bins=bins, ec=\"k\")\n\ng.axes[-1].legend()\nplt.show()",
"execution_count": 85,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x216 with 2 Axes>"
},
"metadata": {
"needs_background": "light"
}
}
]
},
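{
"metadata": {},
"cell_type": "markdown",
"source": "The histograms can be complemented with a table. As a hedged sketch, `pd.crosstab` with `normalize='index'` gives the share of each loan status within each gender. No output is shown here because it depends on the data."
},
{
"metadata": {},
"cell_type": "code",
"source": "# Row-normalised cross-tabulation: each row (one gender) sums to 1\npd.crosstab(df['Gender'], df['loan_status'], normalize='index')",
"execution_count": null,
"outputs": []
},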
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "# Pre-processing: Feature selection/extraction"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Let's look at the day of the week people get the loan"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['dayofweek'] = df['effective_date'].dt.dayofweek\nbins = np.linspace(df.dayofweek.min(), df.dayofweek.max(), 10)\ng = sns.FacetGrid(df, col=\"Gender\", hue=\"loan_status\", palette=\"Set1\", col_wrap=2)\ng.map(plt.hist, 'dayofweek', bins=bins, ec=\"k\")\ng.axes[-1].legend()\nplt.show()\n",
"execution_count": 86,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x216 with 2 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAADQCAYAAABStPXYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAZsklEQVR4nO3de3hU9b3v8fdHSI0I1BtqJIVExQsIO2p6rEVbxMtDvaHbe9GCx25OrTeOpW61tj27nsdS4fHS7a3WqrRV1FpvpacqUtiKFStiFBGLbk0xFRSwrVJBQb/nj1lJAwQySdZkFjOf1/PMMzNr1vqt7wr58p3fbya/nyICMzOzrNmq2AGYmZm1xQXKzMwyyQXKzMwyyQXKzMwyyQXKzMwyyQXKzMwyyQUqZZJ2kXS3pDckPS/pGUknptT2CEnT02irO0iaLam+2HFY8ZVSXkjqJ+lZSS9IOrSA51lVqLa3FC5QKZIk4CHgyYjYPSIOBE4HqosUT89inNestRLMi8OBVyNi/4h4Ko2YrG0uUOkaCXwcEbc0b4iIP0fEfwJI6iFpsqTnJL0k6X8l20ckvY37Jb0q6a4kqZE0Ktk2B/jX5nYlbSvp9qStFySNTraPk/QrSb8BHu/KxUi6U9LNkmYl73y/nJxzkaQ7W+13s6R5khZK+o9NtHVU8q55fhJf767EZluUkskLSXXA1cDRkhokbbOp321JjZKuSl6bJ+kASY9J+m9J30j26S1pZnLsguZ42zjvt1v9fNrMsZIUEb6ldAMuBK7dzOvjgSuSx1sD84BaYATwd3LvKLcCngEOASqBt4BBgID7gOnJ8VcBZyaPtwMWA9sC44AmYIdNxPAU0NDG7Yg29r0TuCc592jgfWBoEuPzQF2y3w7JfQ9gNjAseT4bqAd2Ap4Etk22/zvwvWL/e/nWPbcSzItxwA3J403+bgONwLnJ42uBl4A+QD/g3WR7T6Bvq7ZeB5Q8X5XcHwXcmlzrVsB04EvF/nftjpuHgApI0o3kEurjiPg8uV+0YZJOTnb5LLkk+xj4Y0Q0Jcc1ADXAKuDNiHgt2f5LcslM0tbxkiYmzyuBAcnjGRHxXlsxRURHx8x/ExEhaQHwTkQsSGJZmMTYAJwqaTy5ZKsCBpNLxmZfSLY9nbwB/gy5/2ysDJVIXjRr73f7keR+AdA7Ij4APpC0RtJ2wD+AqyR9CfgU6A/sAixr1cZRye2F5Hlvcj+fJzsZ8xbDBSpdC4GTmp9ExHmSdiL3jhBy74AuiIjHWh8kaQTwUatNn/DPf5tNTZYo4KSI+NMGbR1E7pe+7YOkp8i9i9vQxIh4oo3tzXF9ukGMnwI9JdUCE4HPR8Rfk6G/yjZinRERZ2wqLitppZgXrc+3ud/tzeYPMIZcj+rAiFgrqZG28+eHEfGTzcRRkvwZVLp+D1RKOrfVtl6tHj8GnCupAkDSXpK23Ux7rwK1kvZInrdOgseAC1qNye+fT4ARcWhE1LVx21wSbk5fcon/d0m7AF9pY5+5wHBJeyax9pK0VyfPZ1ueUs6Lrv5uf5bccN9aSYcBA9vY5zHgf7b6bKu/pJ07cI4tlgtUiiI3YHwC8GVJb0r6IzCV3Lg0wG3AK8B8SS8DP2EzvdiIWENu6OK3yYfBf2718pVABfBS0taVaV9PPiLiRXJDDwuB24Gn29hnOblx+2mSXiKX1Pt0Y5hWRKWcFyn8bt8F1EuaR6439Wob53gcuBt4Jhlqv5+2e3slp/nDODMzs0xxD8rMzDLJBcrMzDLJBcrMzDLJBcrMzDKpWwvUqFGjgtzfL/jmWzncOsV54lsZ3trUrQVqxYoV3Xk6sy2S88Qsx0N8ZmaWSS5QZmaWSS5QZmaWSZ4s1sxK3tq1a2lqamLNmjXFDqWsVVZWUl1dTUVFRV77u0CZWclramqiT58+1NTUkMwja90sIli5ciVNTU3U1tbmdYyH+Mys5K1Zs4
Ydd9zRxamIJLHjjjt2qBfrAmVlZWBVFZJSuQ2sqir25VgHuDgVX0f/DTzEZ2VlybJlNO1WnUpb1W83pdKOmbXNPSgzKztp9qTz7U336NGDuro69ttvP0455RQ+/PBDANatW8dOO+3EZZddtt7+I0aMYN683KLDNTU1DB06lKFDhzJ48GCuuOIKPvronwv0Lly4kJEjR7LXXnsxaNAgrrzySpqXUrrzzjvp168fdXV11NXV8bWvfQ2AcePGUVtb27L9xz/+cSo/2zTl1YOS9L+Br5ObkmIBcDa5FTHvBWqARuDUiPhrQaI0M0tRmj1pyK83vc0229DQ0ADAmDFjuOWWW7j44ot5/PHH2Xvvvbnvvvu46qqrNjkMNmvWLHbaaSdWrVrF+PHjGT9+PFOnTmX16tUcf/zx3HzzzRx11FF8+OGHnHTSSdx0002cd955AJx22mnccMMNG7U5efJkTj755C5ceWG124OS1B+4EKiPiP2AHsDpwKXAzIgYBMxMnpuZWTsOPfRQXn/9dQCmTZvGRRddxIABA5g7d267x/bu3ZtbbrmFhx56iPfee4+7776b4cOHc9RRRwHQq1cvbrjhBiZNmlTQa+gO+Q7x9QS2kdSTXM/pbWA0uWWbSe5PSD88M7PSsm7dOn73u98xdOhQVq9ezcyZMzn22GM544wzmDZtWl5t9O3bl9raWl577TUWLlzIgQceuN7re+yxB6tWreL9998H4N57720Zyrvjjjta9vv2t7/dsn3BggXpXWRK2i1QEfEXYAqwBFgK/D0iHgd2iYilyT5LgZ3bOl7SeEnzJM1bvnx5epGblRDnSelbvXo1dXV11NfXM2DAAM455xymT5/OYYcdRq9evTjppJN48MEH+eSTT/Jqr/kzpojY5LBg8/bTTjuNhoYGGhoaOPvss1tenzx5csv2oUOHdvEK09fuZ1CStifXW6oF/gb8StKZ+Z4gIm4FbgWor6/f5LTqZuXMeVL6Wn8G1WzatGk8/fTT1NTUALBy5UpmzZrFEUccsdm2PvjgAxobG9lrr70YMmQITz755Hqvv/HGG/Tu3Zs+ffqkeg3dLZ8hviOANyNieUSsBR4Avgi8I6kKILl/t3BhmpmVlvfff585c+awZMkSGhsbaWxs5MYbb2x3mG/VqlV885vf5IQTTmD77bdnzJgxzJkzhyeeeALI9dQuvPBCLrnkku64jILK51t8S4AvSOoFrAYOB+YB/wDGApOS+4cLFaSZWZoG7Lprqn/HNmDXXTt8zAMPPMDIkSPZeuutW7aNHj2aSy65ZL2vkDc77LDDiAg+/fRTTjzxRL773e8CuZ7Zww8/zAUXXMB5553HJ598wllnncX555/f+QvKCDWPY252J+k/gNOAdcAL5L5y3hu4DxhAroidEhHvba6d+vr6aP5ev1kxSEr1D3XbyZ9OTV3gPEnfokWL2HfffYsdhrHJf4s2cyWvv4OKiO8D399g80fkelNmZmap80wSZmaWSS5QZmaWSS5QZmaWSS5QZmaWSS5QZmaWSS5QZlZ2dqsekOpyG7tVD2j3nMuWLeP0009njz32YPDgwRx99NEsXry43aUy2vp7ppqaGlasWLHetg2X1airq+OVV14BYPHixRx99NHsueee7Lvvvpx66qnrzc/Xu3dv9t5775blOGbPns2xxx7b0vZDDz3EsGHD2GeffRg6dCgPPfRQy2vjxo2jf//+LX+7tWLFipaZMbrKCxaaWdlZ+pe3OOh7j6bW3rM/GLXZ1yOCE088kbFjx3LPPfcA0NDQwDvvvMO4ceM2u1RGR7S1rMaaNWs45phjuOaaazjuuOOA3NId/fr1a5l6acSIEUyZMoX6+noAZs+e3XL8iy++yMSJE5kxYwa1tbW8+eabHHnkkey+++4MGzYMyK11dfvtt3Puued2OObNcQ/KzKzAZs2aRUVFBd/4xjdattXV1bF48eKCL5Vx9913c/DBB7cUJ8jNSrHffvvldfyUKV
O4/PLLqa2tBaC2tpbLLruMyZMnt+wzYcIErr32WtatW5da3OACZWZWcC+//PJGS2IAeS2V0RGth+3q6upYvXr1Js+dr7ZirK+vZ+HChS3PBwwYwCGHHMIvfvGLTp+nLR7iMzMrknyWyuiITa2c2xVtxdjWtssvv5zjjz+eY445JrVzuwdlZlZgQ4YM4fnnn29z+4bzLqa9VMamzt2R4zeMcf78+QwePHi9bXvuuSd1dXXcd999nT7XhlygzMwKbOTIkXz00Uf89Kc/bdn23HPPMWjQoIIvlfHVr36VP/zhD/z2t79t2fboo4/mvYLuxIkT+eEPf0hjYyMAjY2NXHXVVXzrW9/aaN/vfOc7TJkyJZW4wUN8ZlaGqvp/rt1v3nW0vc2RxIMPPsiECROYNGkSlZWV1NTUcN1117W7VMadd9653te6586dC8CwYcPYaqtcH+PUU09l2LBh3HvvvcyZM6dl35tuuokvfvGLTJ8+nQkTJjBhwgQqKioYNmwY119/fV7XVldXx49+9COOO+441q5dS0VFBVdffTV1dXUb7TtkyBAOOOAA5s+fn1fb7clruY20eBkBKzYvt1GevNxGdnRkuQ0P8ZmZWSZlrkANrKpK7a+7B1ZVFftyzMyskzL3GdSSZctSHYIxM4PNf6XbukdHP1LKXA/KzCxtlZWVrFy5ssP/QVp6IoKVK1dSWVmZ9zGZ60GZmaWturqapqYmli9fXuxQylplZSXV1fmPkLlAmVnJq6ioaJlLzrYcHuIzM7NMcoEyM7NMcoEyM7NMcoEyM7NMcoEyM7NMyqtASdpO0v2SXpW0SNLBknaQNEPSa8n99oUO1szMyke+PajrgUcjYh/gX4BFwKXAzIgYBMxMnpuZmaWi3QIlqS/wJeBnABHxcUT8DRgNTE12mwqcUKggzcys/OTTg9odWA7cIekFSbdJ2hbYJSKWAiT3O7d1sKTxkuZJmue/4jZrm/PEbGP5FKiewAHAzRGxP/APOjCcFxG3RkR9RNT369evk2GalTbnidnG8ilQTUBTRDybPL+fXMF6R1IVQHL/bmFCNDOzctRugYqIZcBbkvZONh0OvAI8AoxNto0FHi5IhGZmVpbynSz2AuAuSZ8B3gDOJlfc7pN0DrAEOKUwIZqlRz0qUlsnTD0qUmnHzNqWV4GKiAagvo2XDk83HLPCik/WctD3Hk2lrWd/MCqVdsysbZ5JwszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMskFyszMMinvAiWph6QXJE1Pnu8gaYak15L77QsXppmZlZuO9KAuAha1en4pMDMiBgEzk+dmZmapyKtASaoGjgFua7V5NDA1eTwVOCHd0MzMrJzl24O6DrgE+LTVtl0iYilAcr9zWwdKGi9pnqR5y5cv71KwZqXKeWK2sXYLlKRjgXcj4vnOnCAibo2I+oio79evX2eaMCt5zhOzjfXMY5/hwPGSjgYqgb6Sfgm8I6kqIpZKqgLeLWSgZmZWXtrtQUXEZRFRHRE1wOnA7yPiTOARYGyy21jg4YJFaWZmZacrfwc1CThS0mvAkclzMzOzVOQzxNciImYDs5PHK4HD0w/JzMzMM0mYmVlGuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUCZmVkmuUAVwcCqKiSlchtYVVXsyzEzK4gOrQdl6ViybBlNu1Wn0lb1202ptGNmljXuQZmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa5QJmZWSa1W6
AkfU7SLEmLJC2UdFGyfQdJMyS9ltxvX/hwzcysXOTTg1oHfCsi9gW+AJwnaTBwKTAzIgYBM5PnZmZmqWi3QEXE0oiYnzz+AFgE9AdGA1OT3aYCJxQqSDMzKz8d+gxKUg2wP/AssEtELIVcEQN23sQx4yXNkzRv+fLlXYvWrEQ5T8w2lneBktQb+DUwISLez/e4iLg1Iuojor5fv36didGs5DlPzDaWV4GSVEGuON0VEQ8km9+RVJW8XgW8W5gQzcysHOXzLT4BPwMWRcQ1rV56BBibPB4LPJx+eGZmVq7yWbBwOHAWsEBSQ7LtcmAScJ+kc4AlwCmFCdHMzMpRuwUqIuYA2sTLh6cbjpmZFdvAqiqWLFuWSlsDdt2VPy9d2qljveS7mZmtZ8myZTTtVp1KW9VvN3X6WE91ZJk3sKoKSancSkWaP5OBVVXFvhyzNrkHZZmXlXdzWeKfiZUD96DMzCyTSroHtTWkNqzTlQ/6rGvUo8Lv8s3KUEkXqI/AwyAlID5Zy0HfezSVtp79wahU2jGzwvMQn5mZZZILlJmZZZILlJmZZZILlJmZZZILlJmZZZILlJmZZZILlJmZZZILlJmZZZILlJmZZZILlJmZZVJJT3VkZmYdl+b8l+pR0eljXaDMzGw9WZn/0kN8ZmWuedZ/L35oWeMelFmZ86z/llXuQZmZWSa5QFlB7FY9ILVhIzMrTx7is4JY+pe3MvEhq5ltuTJXoLLy9UYzK66BVVUsWbYslbYG7Lorf166NJW2rPtkrkBl5euNW4rmb2ClwUlsWbJk2TJ/eaPMdalASRoFXA/0AG6LiEmpRGV58zewzKxUdfpLEpJ6ADcCXwEGA2dIGpxWYGZmacnq33oNrKpKLa5ePXqW3BeTutKD+h/A6xHxBoCke4DRwCtpBGZmlpasjjSkPYyZxWvsCkVE5w6UTgZGRcTXk+dnAQdFxPkb7DceGJ883Rv4UztN7wSs6FRQWw5fY2lo7xpXREReH4Q6T9rkaywN+Vxjm7nSlR5UW/3AjapdRNwK3Jp3o9K8iKjvQlyZ52ssDWleo/NkY77G0tCVa+zKH+o2AZ9r9bwaeLsL7ZmZmbXoSoF6DhgkqVbSZ4DTgUfSCcvMzMpdp4f4ImKdpPOBx8h9zfz2iFiYQkx5D3NswXyNpaGY1+ifb2nwNW5Gp78kYWZmVkieLNbMzDLJBcrMzDIpMwVK0ihJf5L0uqRLix1P2iR9TtIsSYskLZR0UbFjKhRJPSS9IGl6sWMpBEnbSbpf0qvJv+fB3Xjuks4TKJ9cKfU8ga7nSiY+g0qmTVoMHEnu6+vPAWdERMnMSiGpCqiKiPmS+gDPAyeU0jU2k3QxUA/0jYhjix1P2iRNBZ6KiNuSb7D2ioi/dcN5Sz5PoHxypdTzBLqeK1npQbVMmxQRHwPN0yaVjIhYGhHzk8cfAIuA/sWNKn2SqoFjgNuKHUshSOoLfAn4GUBEfNwdxSlR8nkC5ZErpZ4nkE6uZKVA9QfeavW8iRL7hWxNUg2wP/BscSMpiOuAS4BPix1IgewOLAfuSIZnbpO0bTedu6zyBEo6V0o9TyCFXMlKgcpr2qRSIKk38GtgQkS8X+x40iTpWODdiHi+2LEUUE/gAODmiNgf+AfQXZ8FlU2eQOnmSpnkCaSQK1kpUGUxbZKkCnIJd1dEPFDseApgOHC8pEZyw08jJf2yuCGlrgloiojmd/T3k0vC7jp3yecJlHyulEOeQAq5kpUCVfLTJim3yMrPgEURcU2x4ymEiLgsIqojoobcv+HvI+LMIoeVqohYBrwlae9k0+F03xIzJZ8nUPq5Ug55AunkSiaWfC/gtElZMhw4C1ggqSHZdnlE/L8ixmSdcwFwV1Ik3gDO7o6TlkmegHOllHQpVzLxNXMzM7MNZWWIz8zMbD0uUGZmlkkuUGZmlkkuUGZmlkkuUGZmlkkuUBki6f9Imphie/tIakimGdkjrXZbtd8oaae02zXbHOdJ+X
CBKm0nAA9HxP4R8d/FDsYso5wnGeUCVWSSvpOs7/MEsHey7d8kPSfpRUm/ltRLUh9JbyZTwCCpb/LOrEJSnaS5kl6S9KCk7SUdDUwAvp6srXOTpOOTYx+UdHvy+BxJ/zd5fKakPybvJn+SLO+ApKMkPSNpvqRfJXOktb6GbSQ9Kunfuu0HZ2XFeVKeXKCKSNKB5KY62R/4V+DzyUsPRMTnI+JfyC01cE6y7MBsclP0kxz364hYC/wc+PeIGAYsAL6f/NX9LcC1EXEY8CRwaHJsf2Bw8vgQ4ClJ+wKnAcMjog74BBiTDE1cARwREQcA84CLW11Gb+A3wN0R8dOUfjRmLZwn5csFqrgOBR6MiA+T2Zqb51XbT9JTkhYAY4Ahyfbb+OdUIWeTm8b+s8B2EfFfyfap5NZg2dBTwKGSBpObD+sd5RaGOxj4A7l5sg4Enkumlzmc3HT5XyCXpE8n28cCA1u1+zBwR0T8vCs/CLPNcJ6UqUzMxVfm2ppr6k5yK4i+KGkcMAIgIp6WVCPpy0CPiHg5Sbz2TxLxF0nbA6PIvUvcATgVWBURH0gSMDUiLmt9nKTjgBkRccYmmn4a+Iqku8PzZlnhOE/KkHtQxfUkcGIyNt0HOC7Z3gdYmoyjj9ngmJ8D04A7ACLi78BfJTUPS5wF/Bdte4bcePuT5N4pTkzuAWYCJ0vaGUDSDpIGAnOB4ZL2TLb3krRXqza/B6wEburoxZvlyXlSplygiihZ1vpeoIHc2jfNSfBdciuIzgBe3eCwu4DtySVfs7HAZEkvAXXADzZxyqeAnhHxOjCf3LvDp5JYXiE3hv540s4MoCoilgPjgGnJ9rnAPhu0OwGolHR13hdvlifnSfnybOZbGEknA6Mj4qxix2KWVc6T0uDPoLYgkv4T+ApwdLFjMcsq50npcA/KzMwyyZ9BmZlZJrlAmZlZJrlAmZlZJrlAmZlZJrlAmZlZJv1/wof9+c/Zg1kAAAAASUVORK5CYII=\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "We see that people who get the loan at the end of the week dont pay it off, so lets use Feature binarization to set a threshold values less then day 4 "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['weekend'] = df['dayofweek'].apply(lambda x: 1 if (x>3) else 0)\ndf.head()",
"execution_count": 87,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 87,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 2016-09-08 \n1 2 2 PAIDOFF 1000 30 2016-09-08 \n2 3 3 PAIDOFF 1000 15 2016-09-08 \n3 4 4 PAIDOFF 1000 30 2016-09-09 \n4 6 6 PAIDOFF 1000 30 2016-09-09 \n\n due_date age education Gender dayofweek weekend \n0 2016-10-07 45 High School or Below male 3 0 \n1 2016-10-07 33 Bechalor female 3 0 \n2 2016-09-22 27 college male 3 0 \n3 2016-10-08 28 college female 4 1 \n4 2016-10-08 29 college male 4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n <th>dayofweek</th>\n <th>weekend</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>male</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>female</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>2016-09-08</td>\n <td>2016-09-22</td>\n <td>27</td>\n <td>college</td>\n <td>male</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>28</td>\n <td>college</td>\n <td>female</td>\n <td>4</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>29</td>\n <td>college</td>\n <td>male</td>\n <td>4</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "## Convert Categorical features to numerical values"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Lets look at gender:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df.groupby(['Gender'])['loan_status'].value_counts(normalize=True)",
"execution_count": 88,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 88,
"data": {
"text/plain": "Gender loan_status\nfemale PAIDOFF 0.865385\n COLLECTION 0.134615\nmale PAIDOFF 0.731293\n COLLECTION 0.268707\nName: loan_status, dtype: float64"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "86 % of female pay there loans while only 73 % of males pay there loan\n"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Lets convert male to 0 and female to 1:\n"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df['Gender'].replace(to_replace=['male','female'], value=[0,1],inplace=True)\ndf.head()",
"execution_count": 89,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 89,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 0 0 PAIDOFF 1000 30 2016-09-08 \n1 2 2 PAIDOFF 1000 30 2016-09-08 \n2 3 3 PAIDOFF 1000 15 2016-09-08 \n3 4 4 PAIDOFF 1000 30 2016-09-09 \n4 6 6 PAIDOFF 1000 30 2016-09-09 \n\n due_date age education Gender dayofweek weekend \n0 2016-10-07 45 High School or Below 0 3 0 \n1 2016-10-07 33 Bechalor 1 3 0 \n2 2016-09-22 27 college 0 3 0 \n3 2016-10-08 28 college 1 4 1 \n4 2016-10-08 29 college 0 4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n <th>dayofweek</th>\n <th>weekend</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>45</td>\n <td>High School or Below</td>\n <td>0</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-08</td>\n <td>2016-10-07</td>\n <td>33</td>\n <td>Bechalor</td>\n <td>1</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>3</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>15</td>\n <td>2016-09-08</td>\n <td>2016-09-22</td>\n <td>27</td>\n <td>college</td>\n <td>0</td>\n <td>3</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>4</td>\n <td>4</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>28</td>\n <td>college</td>\n <td>1</td>\n <td>4</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>6</td>\n <td>6</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>2016-09-09</td>\n <td>2016-10-08</td>\n <td>29</td>\n <td>college</td>\n <td>0</td>\n <td>4</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "## One Hot Encoding \n#### How about education?"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df.groupby(['education'])['loan_status'].value_counts(normalize=True)",
"execution_count": 90,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 90,
"data": {
"text/plain": "education loan_status\nBechalor PAIDOFF 0.750000\n COLLECTION 0.250000\nHigh School or Below PAIDOFF 0.741722\n COLLECTION 0.258278\nMaster or Above COLLECTION 0.500000\n PAIDOFF 0.500000\ncollege PAIDOFF 0.765101\n COLLECTION 0.234899\nName: loan_status, dtype: float64"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "#### Feature befor One Hot Encoding"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "df[['Principal','terms','age','Gender','education']].head()",
"execution_count": 91,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 91,
"data": {
"text/plain": " Principal terms age Gender education\n0 1000 30 45 0 High School or Below\n1 1000 30 33 1 Bechalor\n2 1000 15 27 0 college\n3 1000 30 28 1 college\n4 1000 30 29 0 college",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>education</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>High School or Below</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>Bechalor</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>college</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>college</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>college</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "#### Use one hot encoding technique to conver categorical varables to binary variables and append them to the feature Data Frame "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "Feature = df[['Principal','terms','age','Gender','weekend']]\nFeature = pd.concat([Feature,pd.get_dummies(df['education'])], axis=1)\nFeature.drop(['Master or Above'], axis = 1,inplace=True)\nFeature.head()\n",
"execution_count": 92,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 92,
"data": {
"text/plain": " Principal terms age Gender weekend Bechalor High School or Below \\\n0 1000 30 45 0 0 0 1 \n1 1000 30 33 1 0 1 0 \n2 1000 15 27 0 0 0 0 \n3 1000 30 28 1 1 0 0 \n4 1000 30 29 0 1 0 0 \n\n college \n0 0 \n1 0 \n2 1 \n3 1 \n4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>weekend</th>\n <th>Bechalor</th>\n <th>High School or Below</th>\n <th>college</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Feature selection"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Lets defind feature sets, X:"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "X = Feature\nX[0:5]",
"execution_count": 93,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 93,
"data": {
"text/plain": " Principal terms age Gender weekend Bechalor High School or Below \\\n0 1000 30 45 0 0 0 1 \n1 1000 30 33 1 0 1 0 \n2 1000 15 27 0 0 0 0 \n3 1000 30 28 1 1 0 0 \n4 1000 30 29 0 1 0 0 \n\n college \n0 0 \n1 0 \n2 1 \n3 1 \n4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>weekend</th>\n <th>Bechalor</th>\n <th>High School or Below</th>\n <th>college</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "What are our lables?"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "y = df['loan_status'].values\ny[0:5]",
"execution_count": 94,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 94,
"data": {
"text/plain": "array(['PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],\n dtype=object)"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "## Normalize Data "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Data Standardization give data zero mean and unit variance (technically should be done after train test split )"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "# X= preprocessing.StandardScaler().fit(X).transform(X) NO! We scale after we split\nX[0:5]",
"execution_count": 97,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 97,
"data": {
"text/plain": " Principal terms age Gender weekend Bechalor High School or Below \\\n0 1000 30 45 0 0 0 1 \n1 1000 30 33 1 0 1 0 \n2 1000 15 27 0 0 0 0 \n3 1000 30 28 1 1 0 0 \n4 1000 30 29 0 1 0 0 \n\n college \n0 0 \n1 0 \n2 1 \n3 1 \n4 1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Principal</th>\n <th>terms</th>\n <th>age</th>\n <th>Gender</th>\n <th>weekend</th>\n <th>Bechalor</th>\n <th>High School or Below</th>\n <th>college</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1000</td>\n <td>30</td>\n <td>45</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1000</td>\n <td>30</td>\n <td>33</td>\n <td>1</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1000</td>\n <td>15</td>\n <td>27</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1000</td>\n <td>30</td>\n <td>28</td>\n <td>1</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1000</td>\n <td>30</td>\n <td>29</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "# Classification "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "Now, it is your turn, use the training set to build an accurate model. Then use the test set to report the accuracy of the model\nYou should use the following algorithm:\n- K Nearest Neighbor(KNN)\n- Decision Tree\n- Support Vector Machine\n- Logistic Regression\n\n\n\n__ Notice:__ \n- You can go above and change the pre-processing, feature selection, feature-extraction, and so on, to make a better model.\n- You should use either scikit-learn, Scipy or Numpy libraries for developing the classification algorithms.\n- You should include the code of the algorithm in the following cells."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# K Nearest Neighbor(KNN)\nNotice: You should find the best k to build the model with the best accuracy. \n**warning:** You should not use the __loan_test.csv__ for finding the best k, however, you can split your train_loan.csv into train and test to find the best __k__."
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn.model_selection import train_test_split\n\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)\nprint ('Train set:', X_train.shape, y_train.shape)\nprint ('Test set:', X_test.shape, y_test.shape)\n# The test data should not influence our training, so not used for creating the scaler\nscaler = preprocessing.StandardScaler().fit(X_train)\nX_train = scaler.transform(X_train)\nX_test = scaler.transform(X_test)",
"execution_count": 137,
"outputs": [
{
"output_type": "stream",
"text": "Train set: (276, 8) (276,)\nTest set: (70, 8) (70,)\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn.neighbors import KNeighborsClassifier\nfrom sklearn import metrics\n\nKs = 19\nmean_acc = np.zeros((Ks-1))\nstd_acc = np.zeros((Ks-1))\n\nfor n in range(1,Ks):\n neigh = KNeighborsClassifier(n_neighbors = n).fit(X_train,y_train)\n yhat = neigh.predict(X_test)\n mean_acc[n-1] = metrics.accuracy_score(y_test, yhat)\n std_acc[n-1] = np.std(yhat==y_test)/np.sqrt(yhat.shape[0])\n\nk = mean_acc.argmax()+1\nKNN = KNeighborsClassifier(n_neighbors = k).fit(X_train,y_train)\nprint(\"KNN's best accuracy was\", mean_acc.max(), \"with k =\", k) \n",
"execution_count": 138,
"outputs": [
{
"output_type": "stream",
"text": "KNN's best accuracy was 0.7 with k = 13\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "plt.plot(range(1,Ks),mean_acc,'g')\nplt.fill_between(range(1,Ks),mean_acc - 1 * std_acc,mean_acc + 1 * std_acc, alpha=0.10)\nplt.fill_between(range(1,Ks),mean_acc - 3 * std_acc,mean_acc + 3 * std_acc, alpha=0.10,color=\"green\")\nplt.legend(('Accuracy ', '+/- 1xstd','+/- 3xstd'))\nplt.ylabel('Accuracy ')\nplt.xlabel('Number of Neighbors (K)')\nplt.tight_layout()\nplt.show()",
"execution_count": 139,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAEYCAYAAAAJeGK1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdd5iU1dn48e89bWcbS+8gGMWGQgxg19ixotFENDFRX0NMxJJf9I0prxrfV2NJsYtGjcYESSwUuzEaTaJGwGBBAyJSlqUKW2Z3+ty/P2aGrOsuzMI8M8/M3p/r4mJn5inn2dmZ+znn3OccUVWMMcYYt/EUuwDGGGNMZyxAGWOMcSULUMYYY1zJApQxxhhXsgBljDHGlXzFLkA+9e/fX0eNGlXsYhhjjOmGhQsXblLVAR2fL6sANWrUKBYsWFDsYhhjjOkGEVnZ2fPWxGeMMcaVLEAZY4xxJQtQxhhjXKms+qCMMSaf4vE49fX1RCKRYhelLASDQYYPH47f789pewtQxhjThfr6empraxk1ahQiUuzilDRV5dNPP6W+vp7Ro0fntI818RljTBcikQj9+vWz4JQHIkK/fv26VRu1AGWMMdtgwSl/uvu7tABljDHGlSxAGWOMy82ePRsR4d///nexi1JQFqCMMSVFVWmONLM+tJ6UpopdnIJ49NFHOfTQQ5k1a5aj50kmk44ev7ssQBljSkY8GaehpYG1obU0RhppaGko+yAVCoX4xz/+wQMPPPCZAJVMJrniiivYd9992W+//bjjjjsAmD9/PgcffDDjxo1j0qRJtLS08NBDDzF9+vSt+5588sn89a9/BaCmpoarr76aAw44gDfeeIPrrruOiRMnMnbsWKZNm0Z21fVly5ZxzDHHMG7cOPbff38+/vhjzj33XObOnbv1uF//+teZN29e3q7d0syNMa6nqjRH07Umr8dLbUUtAK2xVtY0r2FYr2F4xNn77cufv5xF6xbl9ZjjB4/n1sm3bnObOXPmMHnyZMaMGUPfvn15++232X///bnvvvv45JNP+Ne//oXP52Pz5s3EYjHOOuss/vjHPzJx4kSam5uprKzc5vFbW1sZO3Ys1113HQB77703V199NQDnnnsuTz/9NKeccgpf//rXueqqqzj99NOJRCKkUikuvPBCfv3rXzNlyhSampp4/fXXefjhh/Pzy8FqUMYYl4slY9Q317MutI6qQBWV/v984VYHqokkIqxpXkMy5a7mqXx59NFHmTp1KgBTp07l0UcfBeCll17ioosuwudL1zP69u3LkiVLGDJkCBMnTgSgV69eW1/vitfr5Ywzztj6+JVXXuGAAw5g33335eWXX2bx4sW0tLSwZs0aTj/9dCA94LaqqoojjjiCZcuWsWHDBh599FHOOOOM7Z6vOxytQYnIZOA2wAvcr6o3dni9Dvg9MDJTll+o6m8zr60AWoAkkFDVCU6W1RjjLqpKY6SRjW0b8Xl8W2tNHVUHqmmLtdHQ0sDQ2qF4PV5HyrO9mo4TPv30U15++WXef/99RIRkMomIcPPNN6Oqn0vb7uw5AJ/PRyr1n6bQ9mORgsEgXq936/Pf+973WLBgASNGjODaa68lEolsbebrzLnnnssf/vAHZs2axYMPPrizl/wZjtWgRMQL3AWcAOwNnC0ie3fY7GLgA1UdB3wZ+KWIBNq9fqSqjrfgZEzPEk1EWdW0ig2tG6jyVxH0Bbe5fVWgimgiSkNLQ1nVpB5//HG++c1vsnLlSlasWMHq1asZPXo0f//73znuuOOYMWMGiUQCgM2bN7PnnnvS0NDA/PnzAWhpaSGRSDBq1CgWLVpEKpVi9erVvPXWW52eLxu4+vfvTygU4vHHHwfSNbHhw4czZ84cAKLRKG1tbQCcd9553HprOnjvs88+eb1+J5v4JgHLVHW5qsaAWcCUDtsoUCvpkF8DbAYSDpbJGONiKU2xObyZTx
o/IaUpaitqc+5bygap+ub6ogYpVSWZSpJIJXb638xHZ3LqlFM/U/s544wzmDlzJhdeeCEjR45kv/32Y9y4ccycOZNAIMAf//hHLrnkEsaNG8exxx5LJBLhkEMOYfTo0ey7775cccUV7L///p2WvXfv3nz7299m33335bTTTtvaVAjwyCOPcPvtt7Pffvtx8MEHs27dOgAGDRrEXnvtxfnnn5/336Vsq+q2UwcWOROYrKoXZh6fCxygqtPbbVMLzAP2BGqBs1T1mcxrnwBbSAexe1X1vu2dc8KECWoLFhpTmiKJCGtb1hJPxan2V+/wDA7heBifx8fwXsN3urnvww8/ZK+99sp5+5SmiCfj6SaxPE5AoaoEvAHHmi93RltbG/vuuy9vv/02dXV1292+s9+piCzsrKXMyRpUZ29Px2h4PLAIGAqMB+4UkV6Z1w5R1f1JNxFeLCKHd3oSkWkiskBEFmzcuDFPRTfGFEpKU2xq3cSKLSsQEWoCNTs1vVClv5JEKlHQmpSqEk/GiSViAHg8HjySv39e8RJLxlzXfPnSSy+x5557cskll+QUnLrLySSJemBEu8fDgYYO25wP3KjpatyyTK1pT+AtVW0AUNUNIjKbdJPhax1PkqlZ3QfpGlTer8IY45hwPMzalrUkUglqK2rzNu9dpb+Stngbq5tWM7xuOD6Pc191W2tNKB6PQ/f8Al7SQSqAe2pSxxxzDKtWrXLs+E7WoOYDu4vI6Eziw1TSzXntrQKOBhCRQcAewHIRqc40/yEi1cBxwPsOltUYU0DJVJINrRtY2bQSr8dLTcXO1Zo6U+WvIqlJ6pvqSaTy37WdrTVFE1EAx8dhIbi2JuUUx24rVDUhItOBF0inmT+oqotF5KLM6zOA/wUeEpH3SDcJ/lBVN4nIrsDszB+sD5ipqs87VVZjTOG0xlpZF1qXToII5K/W1JlKfyXheJj6pvq81qRSqRTxVLqvqaC1GZfWpJzi6DgoVX0WeLbDczPa/dxAunbUcb/lwDgny1aK8nnXJCLO3/GZkqaqeZ1GKJuhtzm8mepAtaPNbu3lM0ip6tYMO494nGvS25YeFKRsqqMSEE/GWRdaRzgeRj+XZ7JjRIQhNUO6HPxoeiZVJZaMEYqFaIo0kUgl8vY3B+kmqrpg/jvTtycbpFY3rWZE3YgdClJFqzV1pocEKQtQLqaqtERbWBdat7WdPl+SqSRrWtYwnOF5Pa4pPR2DUjwVx+vxUuGtIOjf9gDZUpINUquaVjGi1wj8Xn9O+7WvNW1siZPPFslBvXbi99suSC18YyEPP/Qwv/nNb7rc/LXXXuPyyy/n3XffZdasWZx55pndPmVjYyMzZ87ke9/7Xqevn3feeZx88sk7dOzOWBuPS2VnbW5oafjc/GP54PV4qfZXU99STygayuuxjfupKtFElE/bPuWTLZ+wonEFWyJb8Hv91FbUUuWvKsu78kp/JarK6qbVxJPx7W6fDd7JVBKvx5vX4NRdr/71VS684MLPPplJnHj2uWc57rjP9ZZ8xsiRI3nooYc455xzdrgMjY2N3H333Tu8f3dZgHKZ7Fo3n2z5hEgiQq9gL8f6itoHqdZYqyPnMO7RU4NSR5X+ShC2GaSyWYbZ7L+i9DXlSuDVV17lsCMP22Y/9ahRo9hvv/0+dy2zZ8/mmGOOQVVZu3YtY8aMYd26dSxevJhJkyYxfvx49ttvPz766COuuuoqPv74Y8aPH8+VV16JqjJ9+nT23ntvTjrpJDZs2JDXS7MmPheJJ+NsaN1AKBYq2JeF1+Olyl9FfXM9I+pGUOWvcvycpnB6SvNddwV9QSKJyNZxUgHvf6YAbYu3sa5lHUlNT8zqZJZhPmzatAm/30/f3n13qE/q9NNP54knnuCuu+7i+eef52c/+xmDBw/m+uuv57LLLuPrX/86sViMZDLJjTfeyPvvv8+iRellR5588kmWLFnCe++9x/r169l777254IIL8nZtFqBcoGNfU6
ETF3weH5X+yq0dyBakSltnQckjHoK+YI8OSh21D1Ij6kbgFS+b2jaxJbKFSl8lQW+QKNFiF5NDDzqUaCxKKBRiy+YtTPxSen6862+4nuOOP46X/vwSxxx7zE4lTtxxxx2MHTuWAw88kLPPPhuAgw46iOuvv576+nq+8pWvsPvuu39uv9dee42zzz4br9fL0KFDOeqoo/Jz0RkWoIoskUqwPrSellgL1f7qojWxWJDavpSmiCQiOfVdFEs8Gac52mxBKUftgxRQkLFZ3fX3N/4OpPugHvndI9z/4P2fef2F51/gsssvA+DbF36bRf9axOChg3n2mWdz/j5Zs2YNHo+H9evXk0ql8Hg8nHPOORxwwAE888wzHH/88dx///3suuuun9vXyd+VBagiytaaPOKhV0Wv7e/gsGyQqm+uZ0SvEXlPzChFqkokEaEl1kJTpImUplw9fkxEenzzXXcFfUFiyRge8RRsbFa+qCrvvfce48anh43+5oFMFp+Sc00qkUhw/vnnM3PmTH73u9/xq1/9iiuuuILly5ez6667cumll7J8+XLeffddxo0bR0tLy9Z9Dz/8cO69916++c1vsmHDBl555ZWdSsLoqLTejTLhllpTZ3weH0FfkNXNq3tskOosKGWDt5uDk9lx7fugtmWn0sId8PbCtxk/fvznazGdNPfNnz+f008/nS1btvDUU09xzTXXsHjxYm644QYOO+wwDjvsMMaPH8/EiRM56aSTmDNnDr///e/x+/0MHjyYq6++mr59+3LIIYcwduxYTjjhBG6++WZefvll9t13X8aMGcMRRxyR1+tzbLmNYiiF5Tba15rc/OWfSCUIx8OMrBvp6nLmi6oSTUYJRUM0RhtJppL4PD4qfBUWlHqwzas2M2bPMcUuRpd+fv3P+cJuX+BrZ32t8w0Ukpp01VId3Vluw2pQBZJIJdjYupGmaJPrak2d2VqTalrNyN4jt7uiaSlqH5SaoulZE7LXbUHJlIIf/eRH296gxGecsABVAC3RFtaH1oPgir6mXGVH2q9qXFU2QapjUEpqcmsyQaWUf03R9EAlHKQsQDloa60p0kR1wP21ps6UQ5DqaiyQ1ZRMj1GiQcoCVJ61v0PfEtmSztALlk6tqTOlGqSiiSit8VYaw402QNWYdkHKk8rfjZmI5Jxk0l0WoPKgqzv0csr68nv9KOk5zEbWjaTCV1HsInUqlozRGmtlS2QL8aSNBTLmMzJz9+WLquJkop0FqJ3Q0+7Qs3dJq5pWuSpIxZNx2uJtbAlvIZqM4hEPFb6KkqnpGVOqRMQClJv09Dv0bJDKzjhRrCDVVVCq9dn6VsY5q5tWE06E83a8Sl8lI+pG7PRx3nzjTR5+6GHuufeeLre57977mHHPDLxeLzXVNdw942722nuvLrfvTGNjI7MencVF372o09fzvdyGBagc2B36ZxUrSCVSCdpibTRGGwnHwwiSDkq26KIpkHAiTE0gf+unhWK5L3XT1VRHAC++8CLHHb/t5Tamnj2Vad+ZBsBTTz3FlVdcydPPPt2t8jY2NnLvjHu7DFD55mgHiYhMFpElIrJMRK7q5PU6EXlKRN4RkcUicn6u+zotkUrQHGlmZeNKlm9ZzobWDXg8Hmoragu6XLVbZQf+rW5aTSwZc+w8yVSSUCxEfVM9y7csZ13ruvR8aRW11FTU5LzonDHl7JWXX+Goo7c9UWuvXv9J1mprbds6+8TcOXOZfNzkrctt7LPXPqxbt44PFn/AIQcewsQvTeRLX/wSH330ET/98U9Z/vFyJn5pIlf991WoKpdfennpLbchIl7gLuBYoB6YLyLzVPWDdptdDHygqqeIyABgiYj8AUjmsG/eJVNJWmOtdoeeowpfBdFElNVNqxnWa1jeOl+VdNJJU6SJUCy0NUson3euxpSL7HIbdXV12932nrvv4bZbbyMei/P8n58HYMppU5j95GzuufseXnzhRa6+5moGDx7MjT+/kemXTufsc8
7eutzG/93wfyxevJj5C+cDMGf2HJYuXVqSy21MApap6nIAEZkFTAHaBxkFaiUdymuAzUACOCCHffNuY+tGGqONBH1BC0o5ygapVY2rIJ+TGitbF9IzpifLebmNHHz3e9/lu9/7LrMencWNN9zIA799AIBf3/Zr9h+3P5MOmMRZU88C4MADD+TGn9/Imvo1TDl9SqfLbfztb3/ja2d9rSSX2xgGrG73uJ504GnvTmAe0ADUAmepakpEctkXABGZBkyD9JLGO0NRKrwVjuX0l6sKX4VrMvqMKTfdWm7jv77NokWLGDJkCPOentflMb921te45OJLtj5uWNOAx+Nhw/oNW5fbmHr2VCZOmshzzz7HySeezIx7ZzB619GfO5aTy2042QfVWak75iMeDywChgLjgTtFpFeO+6afVL1PVSeo6oQBAwbsTHmNMaakdLbcxvyF8zsNTh999NHWn5995ll22303IL3cxrcv/DYPP/Iwe+61J7f++laArcttTL9kOiefcjLvvfcetbW1hFr+k9hx2GGH8difHiOZTLJ27VpeeeWVvF6fkzWoeqB9/uRw0jWl9s4HbtR0Iv0yEfkE2DPHfY0xpqAqfZXdyrzL5Xg7o8vlNjpxz9338PJfXsbv99Ondx8eeDDdvHfTz2/ikEMO4dDDDmXc+HEcfODBnHDiCTw19ylmzpyJ3+9n0KBB/OSnP6Fv374cdPBBfHHcFzn++OP5+U0/55WXXym95TZExAcsBY4G1gDzgXNUdXG7be4B1qvqtSIyCHgbGAc0bm/fzuzschtrW9YSSUSsucoYA5TBchsFoKrd+s50xXIbqpoQkenAC4AXeFBVF4vIRZnXZwD/CzwkIu+Rbtb7oapuyhT4c/s6VVZjjClF211uo8Q5OphHVZ8Fnu3w3Ix2PzcAnY4u62xfY4wxPUd5zGRqjDEOKadVx4utu79LC1DGGNMFb8DLls1bLEjlgary6aefEgzmPkVcz56vxxhjtqGmfw1bNm1h08ZNxS6KK2lm9E+uU78Fg0GGDx+e8/EtQBljTBe8Pi91g7c/hVBPldIUsUSMXfvu6sjxrYnPGGOMK1mAMsYY40oWoIwxxriSBShjjDGuZAHKGGOMK1mAMsYY40oWoIwxxriSBShjjDGuZAHKGGOMK1mAMsYY40oWoIwxxriSBShjjDGuZAHKGGOMKzkaoERksogsEZFlInJVJ69fKSKLMv/eF5GkiPTNvLZCRN7LvLbAyXIaY0pLJJ60NZp6AMeW2xARL3AXcCxQD8wXkXmq+kF2G1W9Bbgls/0pwPdVdXO7wxypqrYQizEGgFRKCUUTRBNJIl4PdZV+RKTYxTIOcbIGNQlYpqrLVTUGzAKmbGP7s4FHHSyPMaaExRIptrTFiCaSAMSTKZoj8SKXyjjJyQA1DFjd7nF95rnPEZEqYDLwRLunFXhRRBaKyLSuTiIi00RkgYgs2LhxYx6KbYxxm9ZogqZwjFSHZr1YIkVz2IJUuXIyQHVW7+6q0fgU4B8dmvcOUdX9gROAi0Xk8M52VNX7VHWCqk4YMGDAzpXYGOMqyZTS2BajLZbocptoIknIalJlyckAVQ+MaPd4ONDQxbZT6dC8p6oNmf83ALNJNxkaY3qISDzJlrYo8WRqu9uG40nC2whipjQ5GaDmA7uLyGgRCZAOQvM6biQidcARwNx2z1WLSG32Z+A44H0Hy2qMcQlVpSUSpyUSpzuJeqFogkg86VzBTME5lsWnqgkRmQ68AHiBB1V1sYhclHl9RmbT04EXVbW13e6DgNmZ7BwfMFNVn3eqrMYYd0hkEh+SqR1LIW/JNPUF/d58FqtbEskUsUSqy/6M7vIIVAYc+6p2NUevWlWfBZ7t8NyMDo8fAh7q8NxyYJyTZTPGuEs4lqA1mtjpL/ZQJI5HhICv8PMQROLp/rB8j9BKpJTaoD/PR3U/m0nCGFNUqZTSFI4RykNwgnQmVnMkllPfVb6oKs3hTLOkA8ePxJ
O0RXteH5sFKGNM0WTHNsUS+Q0mqtAc3vGmwu6IJz87PssprbEE4VjP6mOzAGWMKYquxjblS0rTNbOUg0GqLZqgqS1WkEAIEIrGe1QiiAUoY0xBJVPKlu2MbcrnuRrDsbzP25fKjM9qjeWnWbI7QpF43mucbmUByhhTMNmxTYkC9g8lU0pTOJ63IBVNJNmc4/gsJxSjj61YembuYgmKJpLdGhOyPX6vB6/HJtk0nVNVonm+S48lUo7303QlO29fXWVgh4+hqrRGE4Rd0MSmCk3hGH2qKsr6c2wByuXSgxYTef9gez1C78oAnjL+4zbdF0ukiMSTxJL5vSFyg+y8fb0qu5+unUwpzZF4QWt+25MNUuX8ObYA5WLxZIqWnRi0uC3JlNIUidPblivo8RLJFNFMYHIqYcEt0vP2QU03xhQ5NbYpH7J9bH2qAmX5ObYA5VL5GrS4LYk8NHuY0pRKaXpNpUTKVbWCQgjHk3hEqKrY9tefU60X+ZbtYyvHtbEsQLlMKqW0RAuXpbMzzR6mtGT7laKJJPE8TsVTilpjCUSEykDnUyI52XrhhHz0sbmRBSgXiSXSH4pCN7PsSLOHKR3Z5IR8J9qUulA0jsjn5+0rROuFE8rxZtMClEu0RhMFGRfSlVybPUxpSKaUSDwdlEqlFlAM7eftK3TrhROiiSStUaG6TD7H5XEVJSyZSi8t4IYxDdtr9jDp98vNiQSJZKpH9ivtqOyYouqAj7ZYeSSJtMUSZTMDeulfQQmLxJOEot1b88ZpXTV79GTphIIUkUTSvvjLkGp6LalyEoqmbzZL/XNsAaoIVNXVi6sVc7kCN4nEk8QSKWKJZMn1RxgTiqRvNit8pRukLEAV2M4uyFYI2WaP3pUBfN6eFaTiydTWvhs31WyN6S4FWsJxpLJ0bzYtQBVQOJakNerOAX8dpUepx+ldFSjrqVQg3a8UjSeJWEKBKTOlfrPpaIlFZLKILBGRZSJyVSevXykiizL/3heRpIj0zWXfUpJdzCxUIsEpqxDLFRSLqhKOJWlsi7G5NUprLGHByZSl7M1mKf59OxagRMQL3AWcAOwNnC0ie7ffRlVvUdXxqjoe+BHwqqpuzmXfUlGoxcyckp0SKd/LFRRLNJGkORzn01CUUNQd2ZPGOK1UbzadrEFNApap6nJVjQGzgCnb2P5s4NEd3NeVCr2YmVMSyRRN4Xixi7HDEskUoUg6KDWH4+n+pWIXyuyw1liIX//zRuY3vFHsopSUUrzZdLIPahiwut3jeuCAzjYUkSpgMjB9B/adBkwDGDly5M6VOA/KdYBkPFlao9Szc82F4+X1PvR0H21ewhUvXczKpk/40wd/4IGTZ7L3gH2LXaySkUim2BSK5u14SopeQefqOds9soicLCI7UoLOeta7+qY4BfiHqm7u7r6qep+qTlDVCQMGDNiBYu68bH/Glkx/RluZ9mekp0Ryb01KNX1z0BSO8WlrlFC0PN+Hnmru0sc5d85XaI2FuPnoO+gd7MOlL3ybNS31xS6acUgugWcq8JGI3Cwie3Xj2PXAiHaPhwMN2zjHo+0ed2ffounYn9ETBnGG40naXDaoMTuH4aetUVp60HLYPUUkEeHaV6/imld/yNiB45j1lac4btcTuXPyA0STUS55/r9ojjYVu5jGAdsNUKr6DeCLwMfAb0XkDRGZJiK129l1PrC7iIwWkQDpIDSv40YiUgccAczt7r7FYP0Z6SmRwrHiJn0kU+nVTTe3RmkKx4jEbdxSOVrZ9AnfnHsGc5Y+xn+N/y4zTvwd/avSLSVf6LM7vz5uBqubV/H9P3+XWDJ/TVfGHXJqulPVZuAJ0skKQ4DTgbdF5JJt7JMg3af0AvAh8CdVXSwiF4nIRe02PR14UVVbt7dvt64sj1IpJRxLfxluaYsR7gELu21PKBov+EwYPaUp1aT9eflznDP7NNa3ruOO4+/nkolX4PN8ttt8wpAD+NkRN7Fw7T+5+tUfklKrPZeT7S
ZJiMgpwAXAF4BHgEmquiGT2PAhcEdX+6rqs8CzHZ6b0eHxQ8BDuexbSO3XzrEmo84VakqkaCJJNG5TDvUU8WSMX/3zRh5d/DD7DhzPzUffwZCaoV1uf+Jup7Iu1MDt829haM0wLp10ZQFLa5yUSxbfV4Ffq+pr7Z9U1TYRucCZYhVPLJEiGo8TS1qT0fY4OUo9nkwRjSeJJlI9vrbak6wNNfDff7mE9zYs4pyx5/H9ST/E793+Inznj/sODS31PPjODIbUDOOre59TgNJ2rjUWYmXTJ3k7nt8bYLc+Y8putdxc5BKgrgHWZh+ISCUwSFVXqOpfHCtZETSH4zRHYwS8Pe8PYUepQmM4hteTvwClqtZ01wP9ffVf+ckrPyCRSnDL0Xdy7K4n5LyviHDVIdeyrnUtP3/9GgZVD+bwXY5ysLSdW7j2La56+TI2tm3I63FPG/NVrjn85z0uSOUSoB4DDm73OJl5bqIjJSoiu1PfMar0iAxG44xEKsE9C2/lgUX3MKbvntxyzJ3sUje628fxeXzcfPTt/NfTZ/PfL1/KAyfPZJ8B+zlQ4s9LaYqH3/kNdy74JcNqR3DTUbcT9AXzcux/NrzOzPcfYmjtMKbt32W3f1nKJUD5MrM5AKCqsUxmnTHG7JRNbRu56uXLWLD2n5y+x9f44cHX7NQXe5W/mjuOf4Bz557BpS98m99NeYJhtcPzWOLPa4o08j+vXslrq17muF1P5OrDbqAmsL0k59wdPvIomqNN3L3wVgbXDOXUMWfk7dhul0u7zEYROTX7QESmAJucK5IxpidYsPafTH3yFN7f8A7XHXEz1xz+87zUOvpXDeDOyQ8QS8aY/vwFjo6Ren/ju0ydfSqv1/+Nqw6+hpuOuj2vwQnSzZfXHHYDBww9mOte+zH/XPOPvB7fzXIJUBcBPxaRVSKyGvgh8B1ni2WMKVcpTfHAonuY9sw3qA7U8MhpT+a9VpAdI1XfvJrvv3hR3sdIqSqzFv+O8+Z9DVB+e8ofmbrPNx3rI/J7A/zi2LsZ1XtXfvDn7/HR5iWOnMdtchmo+7GqHkh6VvG9VfVgVV3mfNGMMeWmMbKFy16Yxh3zf8Gxo09g5mlz2L3vHo6ca8KQA7juiJtYuK+mjrkAACAASURBVO4trn71v/M2Rqo1FuKqly/jxtd/xkHDDmXW6U+x78BxeTn2ttQGarlz8gNU+qu5+PkLWB9au/2dSlxOk8WKyEnAPkAwe4egqtc5WC5jTJl5b8M7/PdfprOxbSNXHXwtZ+39Dcez0k7Y7VTWhdZy2/ybGVozfKfHSGUnq13dvJJLJ17JeeOm4dmhqUp3zOCaodw5+QEueGoql7xwIQ+eMivvTYpukstA3RlAFXAkcD9wJvCWw+Uy7fz70w+4d+HthBPhvB1zr/77cMnEKwr64TKlIRRr4cXlz/Lqyr8QzVPTmKIsXPsWA6oG8tCpf2JsgbLrAM4bN42G0M6PkZq39Alu+PvV1ARque+k3zNhSKcLLDhuj357ccsxd3Lp8xdyxUvTuWPy/fg9pbHKQHflUoM6WFX3E5F3VfVnIvJL4EmnC2bS7dyzl/yJG1+/lmp/DSN67ZKX48ZTMX77zr0kUgl+cOCP83JMU9pUlYXr3mLOksd46ZPniSTCjOg1kj7Bfnk7x4m7ncoPDvgxdcHeeTtmLkSEHx58DetCOzZGKpKIcNPrP2P2kj8xcciB/PyoW7fOB1gsBw8/jJ8edj3XvvZD/u9vP+Xaw28syzFSuQSoSOb/NhEZCnwKdH+QgumWcLyN6//+Pzy9bA4HDjuUG478FX0r8/Nloarc/MZ1PPLeAwypGco5Y8/Ly3FN6VkXauCppU8yd+kT1LesosZfw0m7TeG0Pb7K2AHjyuZLz+fxcdPRt3Hh0+d0a4zUyqZPuPKl6Szd/G8uHP89vvuly/F6vAUo8fadtseZrA2t4d63b2do7XC+U4ZjpHIJUE+JSG/gFuBt0jPc/MbRUvVwnz
R+zBUvXczyLcu4aP/L+PYXL87rh0JEuOLAn7I21MAtb/wfg2uGctSo4/J2fONu0USUV1a+yNwlT/Dmmr+jKJOGHsR3v3QZR40+nkpfZbGL6IgqfzW3H39/zmOk/rz8Oa597Sp8Hh93Tn6AQ0d8uXCFzdFF+19KQ0s99yy8lSFlOEZKtrX8b2ahwgNV9fXM4wogqKquXHxlwoQJumDBgh3e/8P1KwnF2gh4K/JYqu55btk8rvvbTwj6gtxw5K85aPihjp0rnAgz7ZlvsPTTD/nNSX9gv0FfdOxcprhUlQ83vc+cpY/z3LJ5tMSaM19oZ3Lq7l9hWK8R2z9ImVi+ZRnfmvdV+lcN4KFT/vS5JsfuTlZbbPFkjOkvXMjChn9y5wkPcuCwQwp27uyKurv23XWnjiMiC1V1wuee39769CLyhqoetFNnL5BSDlCxZJRb3riexz78A+MHfYmbjr6dQdWDHT/v5vCnfGvemYRiIX435fG89XMZd9gc/pRnl81j7tLH+GjzEiq8FRw16nhO2+NMJg49qMcmySxc+xYXPfst9hs4nntOfGjrZ35HJ6sttpZYCxc8dRYNLWv47Sl/ZEy/PQtyXjcEqJ8B7wJP6vY2LrJSDVBrmldz5V8u4YNN7/HN/S7kkolXFDQrZ2XjJ3xr3lfpVVHHw1Meo0+wb8HObfIvkUrw+urXmLv0cV5d9TKJVJyxA8YxZcyZHP+Fk+lV0avYRXSF55bN40evfJ/JXziZG478Na/Xv7Z1stprD7+xW5PVusH60FrOnXcmAI+c+jiDaoY4fk43BKgWoBpIkE6YEEBV1XV/5aUYoF5d+Rd++tcrUJTrjri5aH1Bi9YvZNoz32Cv/mO598RH8jbRZTlJaYoPN71PW7yt2EXpVEqTvFH/d57+aDabwhvpE+zLKbufzqljzmS3vmOKXTxX+u2ie7lt/s18cdAE/rV+AWP67sUvjrmTkXWjil20HbLk0w+54KmpDKsdXpAxUkUPUKWklAJUIpXgzvm/5KF372PPfvvwi2PuZHivkY6fd1te+uR5rnxpOkePPp6bj76jxzb/dNTQsoanPnqSeUufYE3L6mIXZ5u84uWwkUcyZcyZHDryy2U7PiZfVJUb/nE1j304My+T1brB6/V/49LnL2TC0AMdHyNV9AAlIod3WrAOCxh2se9k4DbAC9yvqjd2ss2XgVsBP7BJVY/IPL8CaCG9vEeis8J3VCoBakPreq56+TLeXjefM/c8mysP+h8qfMVLzGjvkfce5JdvXs+5+/5Xjx4jFUlEeGXFi8xZ+jhvrXl9a6bbKbt/hcEu7jDftfdu9KvqX+xilJSUpljVtIJRvXfuS9ZN5ix5nGtf+yFTxpzp6BgppwNULmnm7ecGCQKTgIXANke6iYgXuAs4FqgH5ovIPFX9oN02vYG7gcmqukpEBnY4zJGqWlYzp/9zzT/40Svfpy3exvVH/oqTdptS7CJ9xjfGnk9DS32PHCOlqize9B5zlzzOcx/PIxRrYUhNeg2enpbp1pN4xFNWwQnKZ4zUdgOUqp7S/rGIjABuzuHYk4Blqro8s98sYArwQbttziGdfLEqc678LkPpIilNcf+/7uKehbcxuvcX+M1Jf+ALfXYvdrE+Jz1G6ies60FjpDaHP+WZZXOZu+Qxlm1ZSoW3gqNHT2bKmDN6dKabKW3lMEYqp8liO6gHxuaw3TCgfYN9PdBx8qoxgF9E/grUArep6u8yrynwoogocK+q3tfZSURkGjANYOTI4vbhdGVLZDM/eeX/8Xr93zhxtyn89ND/pcpfXexidcnr8XLDUb9m2jPf4EcvX16WY6QSqQT/WP0qc5c+zmsrXyahCcYOGMdPD/1fjtvVMt1M6RMRrj7seja0ree6137MwOrBBR0jlQ+5TBZ7B+lgAenlOcYD7+Rw7M4aPTt2ePmALwFHA5XAGyLypqouBQ5R1YZMs9+fReTfnfV7ZQLXfZDug8qhXAX1zvq3+e+/XM
KWyGZ+euj/ccaeU0ti+phKXyW3HXcf35p3Jpe9OK1sxkgt37KMuUsf55mP5rApvJG+lf04Z+y3LNPNlCW/N8AvjrmLC546ix/8+XsFHSOVD7nUoNpnHSSAR1U1lyUd64H2jfbDgYZOttmkqq1Aq4i8BowDlqpqA6Sb/URkNukmw+0mZriFqvL793/Lbf+8icE1Q3n41MfYq38uFU/36FvZjzsnP8i35n6Vi5+7oGTHSIViLbyw/BnmLnmcdzf8C5/4OHTkly3TrYSJQBklIDuqNlDLncc/wLnzzmT6C/9VsDFS+ZBLFl81EFHVZOaxF6hQ1W0OBhERH7CUdO1oDTAfOEdVF7fbZi/gTuB4IEB6GY+pwCeAR1VbMuf/M3Cdqj6/rXO6IYsvnorzj9WvMmvxI7y55u8cucux/OyIm0u6yeid9W8z7ZlvsGf/fUpmjFRKUyxc+xZzs7NzJyPs2nt3TtvjTE7a7TTLdCthfq+HXkE/rbEEkXiy2MUpGdkxUlX+KobUDMvLMRXl0JEHcfdJd+/UcXYmi+8vwDFAKPO4EngROHhbO6lqQkSmAy+QTjN/UFUXi8hFmddnqOqHIvI86ZkqUqRT0d8XkV2B2ZmmMB8wc3vBqdg+3vLR1qajT8Ob6FfZnysP+inn7HNeSTTpbcu4Qftz/ZG/4sqXpvPTv/6Am4663TUzOnfUccxSjb+Gk3c/nSl7fJWxA/Yr+feip6v0e6kJpmu8tUE/yZQST+Znpdxyt0e/vbj9+N/w23fuJZnKV2BXR29Yc6lBLVLV8dt7zg0KXYNqibXwwsdPM3fp47y3YRE+8XHYyCM5bY+vcvCIw8uu6ej37/2WX7z5f3xj7AVccdBPil2crTobs3TA0IOZssdXOWrUcSVR4zPbJkBN0E/Q/9kbo1RKaQzHSKasva8Y3DAOqlVE9lfVtzMH+hKQv6VdS0xKUyxoeJO5S5/gL5mmoy/02Z0fHPBjTtp9Cn0ry7fp6Bv7nk9DqJ7fv/8gQ2qH8vWx5xetLJ2NWRpaM5zv7H8pp4z5yjaXUTClxSNCr0o/fu/n0/09HqFX0E9jOGZ9UmUolwB1OfCYiGQTHIYAZzlXJHdqaFnDvKVPMG/pEzSE6qkJ1HLKmDOYsseZ7NN/3x7TdPSDA37MulADv3jjeoZUD+Wo0ccX9Pybw5t45qO5zF36+NYxS8eMPoEpY85gwtADbcxSmcn2N3k8XX++fF4PtUE/LeH459KETWnLaS4+EfEDe5Cuaf9bVeNOF2xH5LuJL5KI8JcVLzBvyeO81fAGAJOGHcxpY87kyB7cdBROhPnOM+ey5NMPCjJGKjtmac6Sx/jbqldIaIJ9B47fOjt3rcMTYpriaN/flItwLEko6sqvprLlhrn4Lgb+oKqNmcd9gLNVdefSNhyQjwDVEm1l6eYlzF3yGM8vf5pQrIVhtSOYMuYMThlzhqsXLiuk9utIPXzqY47M/pwds/T0R7O3Jp6ctPtpTBlzpitn4TD50VV/Uy5aInHL7CsgNwSozpIk/qWqrptaYGcC1PrQen75+h08/uEsPmn8mKA3yDGjJzNlj6/ypSGTrOmoEyubPuG8eV+jJlDL/zvgR3n7Ha1vXcfTH83eOmbpsJFHMmWPMzlkxBFll3hiPmtb/U25agrHiCWKl9knAhW+/GW5qkI04c6g64YkCY+ISHaxwsw4KPcvMdlNt755K7e8cSP7DhzP/xx2PcfvepLja6mUul3qRnPrcffynWfO5ft/viivx+4piSfmP3Lpb8pFr6CfLW3FyezzZa7Bu5PX0FFLhB5ZM8wlQL0A/ElEZpCequgiwNVjknbEJQdcwiHDj2FQ9dCiLPleqsYN2p+np/6VDa3r8nbMSn8Vo+p27TGJJ6b7/U3bIiLUVQbY0hYtaGZf0O+lpsLnyN9tTx3zlUuA+iHpyVi/S7p5+EXgN04WqhiG1g6lqU+cUMydq6W6Wf+qAfSvGlDsYpgStDP9Tdvi9Qi9gg
GawzHHM/sEqK7wUxlwdvB6XWXxaobFst2GXlVNZWZ9OFNVzwAWA3c4XzRjTDnzeoTeVYG8B6esgM9DdYWzfZbpPrOA48EJ/lMz7EkNCzkttyEi44GzSY9/+gR40slCGWPKW776m7anMuAlmUoRdqD/xuf1UFeAa2ivkDVDN+gyQInIGNITt54NfAr8kXTW35EFKpsxpgzls78pFzVBP0nVvGb2Bf1eagt4De0FfB5qgn5aIuU/5mtbNah/A38DTlHVZQAi8v2ClMoYU3ac6m/KRb4y+4p5De0F/V4SSWdqhm6yrQB1Buka1CuZGcdn0fkihMZhAlQGfHjy2PgcSSRJ9LCMIJM7r0cI+rx5zUjzewXfToxv2hn5yOzLxxitfHKiZug2XQYoVZ1NesmLauA04PvAIBG5B5itqi8WqIw9mkeE2qCfgC+/H4oKn8dmgTafIUDA56Uy4HXNl3A+7Uz/TaH6zLqrmGO+CiGXLL5WVf2Dqp5MelXcRcBVjpfM4PN66FMVyHtwgv/MAt2TMoJM53xeDzUVfvrVVLiqhuCEbP9Nd1T6vfSuCrguOMF/aob5bF1xk279JarqZlW9V1WPcqpAJi3o99K70tk7tuws0KbnEcn8jVUF6FOVTpPuKQOjg34vVYHtJzAL6QGyhUzo2BFeT7rpsRzfvZzSzE3hFLoTtsLnpaZCCUUTBTmfKS6f10PQ5yXo9/SYgNSZ6gofyZR2Oced15NuWi+V2qTfW56ZfY7+9kVksogsEZFlItJps6CIfFlEFonIYhF5tTv7lhuPCHUODlzsSmXAV/SsJOOcbG2pTw+sLW1LbdDX6Zx5fq+H3pWBkglOWbnWDEuJY1eTmVT2LuBYoB6YLyLzVPWDdtv0Bu4GJqvqKhEZmOu+5abYnbC1QT+pMs8I6ml8Xg+Vfi8Vvp5dW+pKtv+msS1GKpPaV+gxWvm2vZphqXHyFmESsExVl6tqjHSa+pQO25wDPKmqqwBUdUM39i0bbumEdWIWZlNYIum/p2xtKei32tK2bO2/kdLob8pFbdBXtHT+fHOyPjgMWN3ucT1wQIdtxgB+EfkrUAvcpqq/y3FfAERkGunJbBk5cmReCl4obhn0l1WsWaBLQTYFu8LvcXXGlM8jFpC6ye/10K+6omx+byJCXSb9PFXiH2QnA1Rn73bH35YP+BJwNFAJvCEib+a4b/pJ1fuA+yC9YOEOl7bAvJk0b7fd6fS0ub62x+uRTDOZt+g1XOOccglOWR6PUFfpp7GttD/HTgaoemBEu8fDgYZOttmkqq1Aq4i8BozLcd+SFfB5MmOQ3Pmh6ElzfXUmW1sK+r2OjEEzphB8ZZDZ5+Snbz6wu4iMFpEA6WmT5nXYZi5wmIj4RKSKdDPehznuW5KqAr7MlPnuDE5Z5ZgRtD1ej3xmwKoFJ1Pqgn4v1SX8OXas5KqaEJHppFfk9QIPqupiEbko8/oMVf0wM8/fu0AKuF9V3wfobF+nyloIIlBT4Z7+plyUW0ZQZ0TSY8GC/vKc3seYqgofiRL9HDsaWlX1WeDZDs/N6PD4FuCWXPYtVW7tb8pFbdBHMqxlN7GsDVg1PUmvSj+NbaW3ZHzp1v1KRIXPS23QV7JfguWUEZStLVX6vSV5s2DMzugV9JfcBNEWoBxUHfBRVVH6v+JSzwjyez0EbcCq6eGyE0Q3heNonj7J4vAMgKX/7ekyHhGC/nSfRjkNei21jCARMk14VlsyJsvn9dCvpiJvx0tpilgilrfjdWQBKg/aD+Ks8JVOEkR3Bf1eVN09sazVlowpHxagdkJ21dGgv+cM4qwMpDOCIi5aarpca63G9HQWoLrJBnGm5yxLpoqfERTwebamiBtjyo8FqBzZlDefVayMoJ5YazWmp7IAtQ3Z2lJlwAZxdpTNCGoMxxyfWLan9PEZYz7LAlQnbBBnbnxej6MTy1qt1ZiezQJUO+mU5ADVFYFiF6VkZCeWzedCh9kBtT21j88Yk2YBqp2g30
skYV+K3ZXNoDPGmHyyb2NjjDGuZAHKGGOMK1mAMsYY40oWoIwxxriSBShjjDGuZAHKGGOMKzkaoERksogsEZFlInJVJ69/WUSaRGRR5t/V7V5bISLvZZ5f4GQ5jTHGuI9j46BExAvcBRwL1APzRWSeqn7QYdO/qerJXRzmSFXd5FQZjTHGuJeTNahJwDJVXa6qMWAWMMXB8xljjCkjTgaoYcDqdo/rM891dJCIvCMiz4nIPu2eV+BFEVkoItMcLKcxxhgXcnKqo85m9+w4p+jbwC6qGhKRE4E5wO6Z1w5R1QYRGQj8WUT+raqvfe4k6eA1DWDkyJH5K70xxpiicrIGVQ+MaPd4ONDQfgNVbVbVUObnZwG/iPTPPG7I/L8BmE26yfBzVPU+VZ2gqhMGDBiQ/6swxhhTFE4GqPnA7iIyWkQCwFRgXvsNRGSwZNazEJFJmfJ8KiLVIlKbeb4aOA5438GyGmOMcRnHmvhUNSEi04EXAC/woKouFpGLMq/PAM4EvisiCSAMTFVVFZFBwOxM7PIBM1X1eafKaowxxn0cXW4j02z3bIfnZrT7+U7gzk72Ww6Mc7Jsxhhj3M1mkjDGGONKFqCMMca4kgUoY0zJCcfDNEWaUO04csWUEwtQxpiSEoqFCHgDDKweSCgWKnZxjIMcTZIwxph8UVVaoi30qezDgOoBCEIkEaEt3kaVv6rYxTMOsBqUMcb1UpqiOdrMgOoBDKweiEc8iAiDagYhCLFkrNhFNA6wAGWMcbVEKkEoFmJY7TD6VfUjMz4SAJ/Hx7Bew4gkIqQ0VcRSGidYgDLGuFYsGSMSjzCybiS9gr063SboCzKkZoj1R5UhC1DGGFcKx8MkU0l26b3LdvuY6oJ11FXU0RprLVDpTCFYgDLGuE5brA2veBlZN5IKX0VO+wysHojP4yOaiDpcOlMolsVnjHGVlmgLNYEaBtcMxuvx5ryf1+NlaO1QVjSuwOfxdWvffIskInkdo+X3+vF5et7Xdc+7YmOMK6kqLbEW+gT7MLB64GeSIXJV4atgaO1Q1jSvobaidoeOsTNUlVAsRJW/iqAvmLfjbglvIegP9rgg1bOu1hjjSslUktZ4KwOrB9In2GenAkttRS39qvqxJbKFmkBNHku5bSlNEYqF6FfZj/5V/fMaHKv8VaxuXk1toPBBt5isD8oYU1TxZJy2eBvDaofRt7JvXr6A+1X1o8JbQTgezkMJty+ZShKKhhhcMzg9iDjPQaQ6UE3/yv49LlPRApQxpmiiiSixZIyRdSOprajN23E94mFo7VBSmiKRSuTtuJ2JJWO0xdsY3ms4vYO9HTtPv6p+VAeqaYu3OXYOt7EAZYwpinA8jKqyS+9dqPRX5v34fq+fYb2G0RZrc2xS2fap8DUVzjYnigiDawb3qJkzLEAZYwquNdaKz+NjZO+RBLwBx85T5a9iYI0zk8q2T4XPZ0LEtmRnzogmoj1i5gwLUMaYgslO+Frtr2ZE3YiCZKX1CfahJlCT16axUCxEpb+SEXUj8Hv9eTtuLoK+IINrBveI/ihHA5SITBaRJSKyTESu6uT1L4tIk4gsyvy7Otd9jTGlJaUpWmIt9K3sy5DaIXikMPfH+ZxUVlVpjjZTV1HH0NqhRRtrVReso0+wD6FoeQcpx25fRMQL3AUcC9QD80Vknqp+0GHTv6nqyTu4b4/RGmvNazt6SlNU+isLfvdnSkM2eUHIbzbaoOpB9Knsk9dj5iLbNLaycSU+j2+HgmM2Fb5Y19DRgOoBhONhIolIwZoYC83J+vUkYJmqLgcQkVnAFCCXILMz+5aV9oMX+1X1y9txY8kYq5pW4fV4C3Yna9xNVYkkIiRSCSr9lQyvHp7zNEO5EKSosztkm8bWhtbSq6LziWe7Ek/GiSQiDKsdltdsw53hEQ9Dew1lxZYVJDyJshzE6+QVDQNWt3tcDxzQyXYHicg7QANwhaou7sa+iMg0YBrAyJEj81Bs90imkoRi6bEV+b
5j83l8DKkZskMfVlNekqnk1vFCvSt7U1dRl9fA5CZ1wTrCiXC6HyxQndM+kUSEVCrFLr13cV1NJeANMKzXsLIdxOtkgOrsN9WxjeptYBdVDYnIicAcYPcc900/qXofcB/AhAkTnMklLYLsHduIXiMcS1+tC9YRSURoijQ5niJr3CfbjOf3+BlYM5CaQE1Z3oV3NKBqAJF4hGgiut1AHI6H8YjH8WzDnZEdxLspvKnsbjadbNupB0a0ezycdC1pK1VtVtVQ5udnAb+I9M9l33IWSUSIJ+MFGVsxoHoAAW+ASCLi6HmMO6gq4XiY5kgzXo+X4b2GM7rPaHoHe/eI4ASZSWV7DSWejJNMJbvcLhQLEfAGGFnn3uCU1a+qH7WB2rIbxOtkgJoP7C4io0UkAEwF5rXfQEQGS6ZOKiKTMuX5NJd9y1VbrA0PnoI1J2TbsZOppOMj7k3xZKfiaY21UltRy+g+oxlZN5LqQHXZNQvlIuANMKR2SKfJR6pKcySdqTes17Ci9pvlKp+Zim7i2C2TqiZEZDrwAuAFHlTVxSJyUeb1GcCZwHdFJAGEgama/mvpdF+nyuoG2VmQd2SZgZ2Vbcde1biqKDNAG+e0b8YbVDOI6kB1j6kpbU9nk8punbS2ZucnrS20fGQquo04NQVIMUyYMEEXLFiww/uvbVlLJBEpeAexk7Mgd8fm8GY2tm50TZaS2THZbLx4Mk5VoIp+lf2o8leV1JdtoaQ0RX1zPfFkHJ/HRyQRYWjt0JL+DDRFmgqW/JTSFLFEjF377rpTxxGRhao6oePzditVZIlUgrZYG4NrBzs60WQu+gT7EI6HaYu1URXY9hLbPVUsGSOejBe7GF1KaQpByj4bL1884mFIzRBWNq7cOmmtE/MCFlI5JT9ZgCqiWDJGNBFlRN2InFNenZSdjHJl08qcMpx6ivbjg4K+IAOq8r+cQr4IQnWguiT6TdzC7/Uzom4EIuL6ZIhclcsgXgtQRZIddzKq9yhXBQKvx8uw2mGuWDa72NqPD6oL1lEXrCvpD7vpmps+g/mQTX5a2biSRKp0B/GWZqlLXGusdWtighv/cIq9bHaxZRMLfB4fA6oHUFtR68r3yZhtCXgDDK0dWtKDeO1TV0DZTL1eFb0YVDPI1Vk22QynzeHNJd1hnCtVJZwIk0gmqApUMbx6uCUWmJJX6oN4LUAVSEpThKIh+lf1p19Vv5L44utf1Z9IIkI4Hi75juOuJFKJ9CBlLf9pfkzP1K+qH9FklLZ4G1X+0kp+cu8tfBlJpBKEYiGG1A6hf3Xx0si7S0QYUjsEwNWZazsikojQHG0mkUwwqHoQX+j7BQZWD7TgZMpOKQ/itRqUw6KJKPFknJF1I0vu7gX+M/hvReOKkp/5PKUpwvEwKU1R7a9mcM1gKn2VJXPDYMyOKtVBvBagHBSOhxGEXXrvUtJ35kFfsKRnPs824wlCn2AfegV7lU06sTG5ar/cSLU/P8NanF523gJUnrUfM1Plr2JI7ZCyyAArtcF/7WdTCPgCDK4ebOODTI9XF6wjnozTEmvJ2zGdHMNZ+t+cLpFIJYjE0zOCl2tn+4DqAUQSEVcP/mvfjFdbUcvQ2qEEfUFrxjMmo391f/pX9y92MXJiAWonRRIRYskYAU+AQTWDqAnUlO1dukc8DK0dyorGFa4b/JedlcPn8W1desCWszemtLnnG6aEZO/Sk5qkxl/Tozrb/V6/a2Y+/9wS5b2GU+mvLJkOYGPMtlmA6obsKrde8fbozvYqfxUDawayIbSBXsHCJ030hOZUY4wFqO1qf5de4U1PAVQdqO7xd+l9gn2IJqK0xloLNvN5T2pONcZYgOpSMpUkkoiQ0hR1FXX0ruzt2sSAYhARBlYPZGXC2ZnP2yc9ZBdz7CnNqcb0dBagOoin4sSi6YlC+1f1t4lCt8HJmc/bN6f2rexLbUVtj2xONaYnc/SbV0QmA7eRXrb9flW9sYvtJgJvAmep6uOZ51YALU
ASSHS22mK++Tw+Kn2V9K3saxOF5ig78/na0Nq8HVNVrTnVGONcgBIRL3AXcCxQD8wXkXmqOJuBEQAACzBJREFU+kEn290EvNDJYY5U1U1OlbGjYi63XspqK2qpDlSjqnk7pvUtGWOcrEFNApap6nIAEZkFTAE+6LDdJcATwEQHy5ITC047ziMesF+fMSaPnGw7GQasbve4PvPcViIyDDgdmNHJ/gq8KCILRWRaVycRkWkiskBEFmzcuDEPxTbGGOMGTgaozu6nO7YB3Qr8UFWTnWx7iKruD5wAXCwih3d2ElW9T1UnqOqEAQMG7FyJjTHGuIaTTXz1wIh2j4cDDR22mQDMyjSt9QdOFJGEqs5R1QYAVd0gIrNJNxm+5mB5jTHGuIiTNaj5wO4iMlpEAsBUYF77DVR1tKqOUtVRwOPA91R1johUi0gtgIhUA8cB7ztYVmOMMS7jWA1KVRMiMp10dp4XeFBVF4vIRZnXO+t3yhoEzM7UrHzATFV93qmyGmOMcR/JZ2pwsU2YMEEXLFhQ7GIYY4zpBhFZ2NlYVxsBaYwxxpUsQBljjHGlsmriE5GNwMpil2MH9AcKNmOGQ+wa3MGuwR3sGrpnF1X93DihsgpQpUpEFhRirkEn2TW4g12DO9g15Ic18RljjHElC1DGGGNcyQKUO9xX7ALkgV2DO9g1uINdQx5YH5QxxhhXshqUMcYYV7IAZYwxxpUsQBWAiIwQkVdE5EMRWSwil3WyzZdFpElEFmX+XV2Msm6PiKwQkfcyZfzcvFKSdruILBORd0Vk/2KUsysiske73/EiEWkWkcs7bOO690JEHhSRDSLyfrvn+orIn0Xko8z/fbrYd7KILMm8J1cVrtSfK0dn13CLiPw787cyW0R6d7HvNv/uCqWLa7hWRNa0+3s5sYt93fw+/LFd+VeIyKIu9i3s+6Cq9s/hf8AQYP/Mz7XAUmDvDtt8GXi62GXN4VpWAP238fqJwHOk1wM7EPhnscu8jbJ6gXWkBwm6+r0ADgf2B95v99zNwFWZn68CburiGj8GdgUCwDsd//aKfA3HAb7Mzzd1dg25/N0V+RquBa7I4W/Nte9Dh9d/CVzthvfBalAFoKprVfXtzM8twId0WF24jEwBfqdpbwK9RWRIsQvVhaOBj1XV9bOPqOprwOYOT08BHs78/DBwWie7TgKWqepyVY0BszL7FVxn16CqL6pqIvPwTdLrxrlWF+9DLlz9PmRJegmJrwGPFrRQXbAAVWAiMgr4IvDPTl4+SETeEZHnRGSfghYsdwq8KCILRWRaJ68PA1a3e1yPe4PxVLr+IJbCezFIVddC+iYIGNjJNqX0flxAuvbdme393RXb9Ewz5YNdNLWWyvtwGLBeVT/q4vWCvg8WoApIRGqAJ4DLVbW5w8tvk25qGgfcAcwpdPlydIiq7g+cAFwsIod3eF062cd1Yxkyi2ieCjzWycul8l7kolTej58ACeAPXWyyvb+7YroH+AIwHlhLuomso5J4H4Cz2XbtqaDvgwWoAhERP+ng9AdVfbLj66rarKqhzM/PAn4R6V/gYm6XqjZk/t8AzCbddNFePTCi3ePhQENhStctJwBvq+r6ji+UynsBrM82n2b+39DJNq5/P0TkW8DJwNc109HRUQ5/d0WjqutVNamqKeA3dF62UngffMBXgD92tU2h3wcLUAWQadd9APhQVX/VxTaDM9shIpNIvzf/v70zDdWijOL4729FYkW2Efoly5IoKslue1gR7XuJlbghkUUmlW0UFlHRgtCHFtvIzCIsiCQiA1GzxaV9o8WyIIiysluZSeXpwzlT4/C+3veGeef1nh9c7syZ53nmPO+875x5Zp75nx82nZddI2kbSdsVy/gD7g8qxeYAY2I236FAZ3EbqmY0vVJsh2MRzAHGxvJY4LkGZZYBe0naPUaN50W9WiDpROAa4HQz+61JmVa+dz1G5RnrWTT2rdbHITgO+NjMvm60sUeOQ0/MIultf8
CR+HD+PeCd+DsZmAhMjDKXAh/is3sWA4f3tN8N+rFH+Pdu+Hp92Mv9EHAvPmPpfeCgnva7QT/64QFn+5Kt1scCD6bfAH/gV+MTgJ2AecBn8X/HKDsQeKFU92R85ujnxTGrUR+W489mit/F9Gofmn3vatSHx+O7/h4edAa023EI+4ziN1Aq26PHIaWOkiRJklqSt/iSJEmSWpIBKkmSJKklGaCSJEmSWpIBKkmSJKklGaCSJEmSWpIBKmk7JJmkaaX1KZJu2khtz5B07sZoq4v9jJCr28+v2AdF/yaVbPdIGtdFexMljemizDhJ9zTZ9ms33P9PSBog6flYPrpYjvVbJM2VtLWkpyTt9X/7k9SfDFBJO7IWOLtu6g6StuhG8QnAJWZ2TINt3wGT44XOljCz6WY2sxv732iEAkErXIErLVTrXw8cAZxpZmtx6aCrN56HSbuSASppR/4EHgQur26ojoCKkUFcsS+UNFvSp5JulzRK0tLIbzO41MxxkhZFuVOj/hby3EXLQhT0olK78yU9ib+sWfXn/Gj/A0l3hG0q/vL2dEl3NejfSvzF27HVDZIGS3oxxDoXSdo77DdJmhLLHeHj6+Fz+W3/gVH/M0l3VtqeJuktSfMk7RK2oZIW6998TTuEfYGk2yQtxIPpiOjju5JebtAngHOAFyv7vBJ/gfU0M1sT5kVxDFoNfMlmSgaopF25Fxglaftu1DkAmAzsB4wGhpjZwcDDwKRSuUHAcOAUPIj0xUc8nWbWAXQAF0raPcofjL9Vv095Z5IG4jmOjsWFRDsknWlmNwNv4NpzVzXx9XbgygajsgeBSWY2DJgC3Neg7qO4IsBhwF+VbUOBkfEZjJRU6MNtg2sTHggsBG4M+0zgGjPbHw/AN5ba6m9mw81sGjAVOMFcYPf0qkPxWa2KEVLBEbiCx0kW2ocA5pp2y/HjlfRiMkAlbYm5GvxM4LJuVFtmnptrLS4381LY38eDUsFsM1tnnnLgC2BvXHdsjDzT6BJcZqh4TrLUzFY02F8HsMDMVprnPHoCTxbXSv9WAEuBCwqbXA3/cODp8OMBPBkmpTL9ge3M7LUwPVlpep6ZdZrZ78BHwG5hX8e/IqGzgCMj+Pc3s4Vhf6zif1lU9FVghqQL8eR8VQbgI8Myy3FprOMblP8Ol9lJejE5hE7ambvx1BiPlmx/EhdeIfhafo5TvnpfV1pfx/q/har+l+En0klmNre8QdLRwOom/jVKsdAdbgOeAYpbZn2An8xs6AbqdLXP8mfwF83PAa1ooP3TbzObKOkQfNT5jqShZlYW2F0D9K3U/xYYBcyT9IOZlSeM9I06SS8mR1BJ22JmPwKz8dtvBV8Cw2L5DGCr/9D0CEl94rnUHsAnwFzgYnnaFCQNCUXnDbEEGC5p57hVdz5++6wlzOxjfJRzaqz/DKyQNCJ8kKQDKnVWAb/IleTBVbNboQ9QPLu7AHjFzDqBVZKOCvvoZv5LGmxmS8xsKvA966eWABdJHdSgj5/iKR5mSSoH3iG4IGnSi8kRVNLuTMPVxwseAp6TtBSfaNBsdLMhPsFPxLviz3J+l/QwfoJ9K0ZmK2mcYv0fzOwbSdcB8/GRzQtm1iglxoa4FXi7tD4KuF/SDXjwfQpXly4zAXhI0mpgAdDZwn5WA/tKejPKjwz7WPw5XD/8duf4JvXviqnhwj/39Xwys9WSPpe0p5ktr2xbJmk8MEfSMcCvwBqrZ5qWZBOSauZJspkhadti0oGka/H0D5N72C0knQUMM7Mbuih3OfCzmT2yaTxL6kqOoJJk8+OUGLltCXwFjOtZdxwze1bSTi0U/QnPsZT0cnIElSRJktSSnCSRJEmS1JIMUEmSJEktyQCVJEmS1JIMUEmSJEktyQCVJEmS1JK/ATzTqqNhw6VvAAAAAElFTkSuQmCC\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Decision Tree"
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn.tree import DecisionTreeClassifier\n\nloanTree = DecisionTreeClassifier(criterion=\"entropy\", max_depth = 4)\nloanTree.fit(X_train,y_train)\npredTree = loanTree.predict(X_test)\nprint(\"DecisionTrees's Accuracy: \", metrics.accuracy_score(y_test, predTree))",
"execution_count": 140,
"outputs": [
{
"output_type": "stream",
"text": "DecisionTrees's Accuracy: 0.6\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Support Vector Machine"
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn import svm\n\nSVM = svm.SVC(kernel='rbf')\nSVM.fit(X_train, y_train) \nyhat = SVM.predict(X_test)\nprint(\"SVM's Accuracy: \", metrics.accuracy_score(y_test, yhat))",
"execution_count": 141,
"outputs": [
{
"output_type": "stream",
"text": "SVM's Accuracy: 0.6714285714285714\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Logistic Regression"
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn.linear_model import LogisticRegression\nfrom sklearn.metrics import confusion_matrix\n\nLR = LogisticRegression(C=0.01, solver='liblinear')\nLR.fit(X_train,y_train)\nyhat = LR.predict(X_test)\nprint(\"LR's Accuracy: \", metrics.accuracy_score(y_test, yhat))",
"execution_count": 142,
"outputs": [
{
"output_type": "stream",
"text": "LR's Accuracy: 0.6714285714285714\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Model Evaluation using Test set"
},
{
"metadata": {},
"cell_type": "code",
"source": "from sklearn.metrics import jaccard_score\nfrom sklearn.metrics import f1_score\nfrom sklearn.metrics import log_loss",
"execution_count": 143,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "First, download and load the test set:"
},
{
"metadata": {},
"cell_type": "code",
"source": "!wget -O loan_test.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_test.csv",
"execution_count": 144,
"outputs": [
{
"output_type": "stream",
"text": "--2021-02-20 17:33:57-- https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/loan_test.csv\nResolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196\nConnecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 3642 (3.6K) [text/csv]\nSaving to: \u2018loan_test.csv\u2019\n\nloan_test.csv 100%[===================>] 3.56K --.-KB/s in 0s \n\n2021-02-20 17:33:57 (43.9 MB/s) - \u2018loan_test.csv\u2019 saved [3642/3642]\n\n",
"name": "stdout"
}
]
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "### Load Test set for evaluation "
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "code",
"source": "test_df = pd.read_csv('loan_test.csv')\ntest_df.head()",
"execution_count": 151,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 151,
"data": {
"text/plain": " Unnamed: 0 Unnamed: 0.1 loan_status Principal terms effective_date \\\n0 1 1 PAIDOFF 1000 30 9/8/2016 \n1 5 5 PAIDOFF 300 7 9/9/2016 \n2 21 21 PAIDOFF 1000 30 9/10/2016 \n3 24 24 PAIDOFF 1000 30 9/10/2016 \n4 35 35 PAIDOFF 800 15 9/11/2016 \n\n due_date age education Gender \n0 10/7/2016 50 Bechalor female \n1 9/15/2016 35 Master or Above male \n2 10/9/2016 43 High School or Below female \n3 10/9/2016 26 college male \n4 9/25/2016 29 Bechalor male ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>Unnamed: 0.1</th>\n <th>loan_status</th>\n <th>Principal</th>\n <th>terms</th>\n <th>effective_date</th>\n <th>due_date</th>\n <th>age</th>\n <th>education</th>\n <th>Gender</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1</td>\n <td>1</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/8/2016</td>\n <td>10/7/2016</td>\n <td>50</td>\n <td>Bechalor</td>\n <td>female</td>\n </tr>\n <tr>\n <th>1</th>\n <td>5</td>\n <td>5</td>\n <td>PAIDOFF</td>\n <td>300</td>\n <td>7</td>\n <td>9/9/2016</td>\n <td>9/15/2016</td>\n <td>35</td>\n <td>Master or Above</td>\n <td>male</td>\n </tr>\n <tr>\n <th>2</th>\n <td>21</td>\n <td>21</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/10/2016</td>\n <td>10/9/2016</td>\n <td>43</td>\n <td>High School or Below</td>\n <td>female</td>\n </tr>\n <tr>\n <th>3</th>\n <td>24</td>\n <td>24</td>\n <td>PAIDOFF</td>\n <td>1000</td>\n <td>30</td>\n <td>9/10/2016</td>\n <td>10/9/2016</td>\n <td>26</td>\n <td>college</td>\n <td>male</td>\n </tr>\n <tr>\n <th>4</th>\n <td>35</td>\n <td>35</td>\n <td>PAIDOFF</td>\n <td>800</td>\n <td>15</td>\n <td>9/11/2016</td>\n <td>9/25/2016</td>\n <td>29</td>\n <td>Bechalor</td>\n <td>male</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# Transform the test data based (Note that Gender is transformed in-place)\ntest_df['due_date'] = pd.to_datetime(test_df['due_date'])\ntest_df['effective_date'] = pd.to_datetime(test_df['effective_date'])\ntest_df['dayofweek'] = test_df['effective_date'].dt.dayofweek\ntest_df['weekend'] = test_df['dayofweek'].apply(lambda x: 1 if (x>3) else 0)\ntest_df['Gender'].replace(to_replace=['male','female'], value=[0,1], inplace=True)\n\nX2 = test_df[['Principal','terms','age','Gender','weekend']]\n# One Hot Encoding of education\nX2 = pd.concat([X2, pd.get_dummies(test_df['education'])], axis=1)\nX2.drop(['Master or Above'], axis = 1, inplace=True)\n\n# Scaling the features using the training scaler\nX2 = scaler.transform(X2)\n\ny_test2 = test_df['loan_status'].values",
"execution_count": 152,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "# Create a new DataFrame using the algorithm name as the index\ndf = pd.DataFrame(index=[\"KNN\", \"Decision Tree\", \"SVM\", \"LogisticRegression\"])\n\n# Iterate through the name-algorithm pairs\nfor name, alg in {\"KNN\":KNN, \"Decision Tree\":loanTree, \"SVM\":SVM, \"LogisticRegression\":LR}.items():\n yhat = alg.predict(X2)\n df.at[name,\"Jaccard\"] = jaccard_score(y_test2, yhat, pos_label=\"PAIDOFF\")\n df.at[name,\"F1-score\"] = f1_score(y_test2, yhat, pos_label=\"PAIDOFF\")\n # log_loss only has sensible result for LR (unset cells will show up as \"NaN\")\n if name == \"LogisticRegression\":\n yhat_prob = alg.predict_proba(X2)\n df.at[name,\"LogLoss\"] = log_loss(y_test2, yhat_prob)\ndf",
"execution_count": 153,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 153,
"data": {
"text/plain": " Jaccard F1-score LogLoss\nKNN 0.745098 0.853933 NaN\nDecision Tree 0.729167 0.843373 NaN\nSVM 0.740741 0.851064 NaN\nLogisticRegression 0.740741 0.851064 0.562413",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Jaccard</th>\n <th>F1-score</th>\n <th>LogLoss</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>KNN</th>\n <td>0.745098</td>\n <td>0.853933</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>Decision Tree</th>\n <td>0.729167</td>\n <td>0.843373</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>SVM</th>\n <td>0.740741</td>\n <td>0.851064</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>LogisticRegression</th>\n <td>0.740741</td>\n <td>0.851064</td>\n <td>0.562413</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Report\nYou should be able to report the accuracy of the built model using different evaluation metrics:"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "| Algorithm | Jaccard | F1-score | LogLoss |\n|--------------------|---------|----------|---------|\n| KNN | ? | ? | NA |\n| Decision Tree | ? | ? | NA |\n| SVM | ? | ? | NA |\n| LogisticRegression | ? | ? | ? |"
},
{
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
}
},
"cell_type": "markdown",
"source": "<h2>Want to learn more?</h2>\n\nIBM SPSS Modeler is a comprehensive analytics platform that has many machine learning algorithms. It has been designed to bring predictive intelligence to decisions made by individuals, by groups, by systems \u2013 by your enterprise as a whole. A free trial is available through this course, available here: <a href=\"http://cocl.us/ML0101EN-SPSSModeler\">SPSS Modeler</a>\n\nAlso, you can use Watson Studio to run these notebooks faster with bigger datasets. Watson Studio is IBM's leading cloud solution for data scientists, built by data scientists. With Jupyter notebooks, RStudio, Apache Spark and popular libraries pre-packaged in the cloud, Watson Studio enables data scientists to collaborate on their projects without having to install anything. Join the fast-growing community of Watson Studio users today with a free account at <a href=\"https://cocl.us/ML0101EN_DSX\">Watson Studio</a>\n\n<h3>Thanks for completing this lesson!</h3>\n\n<h4>Author: <a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a></h4>\n<p><a href=\"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a>, PhD is a Data Scientist in IBM with a track record of developing enterprise level applications that substantially increases clients\u2019 ability to turn data into actionable knowledge. He is a researcher in data mining field and expert in developing advanced analytic methods like machine learning and statistical modelling on large datasets.</p>\n\n<hr>\n\n<p>Copyright &copy; 2018 <a href=\"https://cocl.us/DX0108EN_CC\">Cognitive Class</a>. This notebook and its source code are released under the terms of the <a href=\"https://bigdatauniversity.com/mit-license/\">MIT License</a>.</p>"
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.7",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.7.9",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat": 4,
"nbformat_minor": 2
}