Created
October 13, 2021 21:08
-
-
Save brt-h/7a011db2f2b269ec098536e5d1159b39 to your computer and use it in GitHub Desktop.
Applied Data Science Capstone
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<center>\n", | |
" <img src=\"https://gitlab.com/ibm/skills-network/courses/placeholder101/-/raw/master/labs/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\" />\n", | |
"</center>\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# **Space X Falcon 9 First Stage Landing Prediction**\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Assignment: Machine Learning Prediction\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Estimated time needed: **60** minutes\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Space X advertises Falcon 9 rocket launches on its website with a cost of 62 million dollars; other providers cost upward of 165 million dollars each, much of the savings is because Space X can reuse the first stage. Therefore if we can determine if the first stage will land, we can determine the cost of a launch. This information can be used if an alternate company wants to bid against space X for a rocket launch. In this lab, you will create a machine learning pipeline to predict if the first stage will land given the data from the preceding labs.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"![](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/api/Images/landing\\_1.gif)\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Several examples of an unsuccessful landing are shown here:\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"![](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/api/Images/crash.gif)\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Most unsuccessful landings are planed. Space X; performs a controlled landing in the oceans.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Objectives\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Perform exploratory Data Analysis and determine Training Labels\n", | |
"\n", | |
"* create a column for the class\n", | |
"* Standardize the data\n", | |
"* Split into training data and test data\n", | |
"\n", | |
"\\-Find best Hyperparameter for SVM, Classification Trees and Logistic Regression\n", | |
"\n", | |
"* Find the method performs best using test data\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"***\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Import Libraries and Define Auxiliary Functions\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We will import the following libraries for the lab\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 34, | |
"metadata": { | |
"tags": [] | |
}, | |
"outputs": [], | |
"source": [ | |
"# Pandas is a software library written for the Python programming language for data manipulation and analysis.\n", | |
"import pandas as pd\n", | |
"# NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays\n", | |
"import numpy as np\n", | |
"# Matplotlib is a plotting library for python and pyplot gives us a MatLab like plotting framework. We will use this in our plotter function to plot data.\n", | |
"import matplotlib.pyplot as plt\n", | |
"#Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics\n", | |
"import seaborn as sns\n", | |
"# Preprocessing allows us to standarsize our data\n", | |
"from sklearn import preprocessing\n", | |
"# Allows us to split our data into training and testing data\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"# Allows us to test parameters of classification algorithms and find the best one\n", | |
"from sklearn.model_selection import GridSearchCV\n", | |
"# Logistic Regression classification algorithm\n", | |
"from sklearn.linear_model import LogisticRegression\n", | |
"# Support Vector Machine classification algorithm\n", | |
"from sklearn.svm import SVC\n", | |
"# Decision Tree classification algorithm\n", | |
"from sklearn.tree import DecisionTreeClassifier\n", | |
"# K Nearest Neighbors classification algorithm\n", | |
"from sklearn.neighbors import KNeighborsClassifier" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import warnings\n", | |
"warnings.filterwarnings('ignore')\n", | |
"warnings.simplefilter('ignore')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"np.random.seed(0)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"This function is to plot the confusion matrix.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def plot_confusion_matrix(y,y_predict):\n", | |
" \"this function plots the confusion matrix\"\n", | |
" from sklearn.metrics import confusion_matrix\n", | |
"\n", | |
" cm = confusion_matrix(y, y_predict)\n", | |
" ax= plt.subplot()\n", | |
" sns.heatmap(cm, annot=True, ax = ax); #annot=True to annotate cells\n", | |
" ax.set_xlabel('Predicted labels')\n", | |
" ax.set_ylabel('True labels')\n", | |
" ax.set_title('Confusion Matrix'); \n", | |
" ax.xaxis.set_ticklabels(['did not land', 'land']); ax.yaxis.set_ticklabels(['did not land', 'landed'])" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Load the dataframe\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Load the data\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>FlightNumber</th>\n", | |
" <th>Date</th>\n", | |
" <th>BoosterVersion</th>\n", | |
" <th>PayloadMass</th>\n", | |
" <th>Orbit</th>\n", | |
" <th>LaunchSite</th>\n", | |
" <th>Outcome</th>\n", | |
" <th>Flights</th>\n", | |
" <th>GridFins</th>\n", | |
" <th>Reused</th>\n", | |
" <th>Legs</th>\n", | |
" <th>LandingPad</th>\n", | |
" <th>Block</th>\n", | |
" <th>ReusedCount</th>\n", | |
" <th>Serial</th>\n", | |
" <th>Longitude</th>\n", | |
" <th>Latitude</th>\n", | |
" <th>Class</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>2010-06-04</td>\n", | |
" <td>Falcon 9</td>\n", | |
" <td>6104.959412</td>\n", | |
" <td>LEO</td>\n", | |
" <td>CCAFS SLC 40</td>\n", | |
" <td>None None</td>\n", | |
" <td>1</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0</td>\n", | |
" <td>B0003</td>\n", | |
" <td>-80.577366</td>\n", | |
" <td>28.561857</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2</td>\n", | |
" <td>2012-05-22</td>\n", | |
" <td>Falcon 9</td>\n", | |
" <td>525.000000</td>\n", | |
" <td>LEO</td>\n", | |
" <td>CCAFS SLC 40</td>\n", | |
" <td>None None</td>\n", | |
" <td>1</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0</td>\n", | |
" <td>B0005</td>\n", | |
" <td>-80.577366</td>\n", | |
" <td>28.561857</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3</td>\n", | |
" <td>2013-03-01</td>\n", | |
" <td>Falcon 9</td>\n", | |
" <td>677.000000</td>\n", | |
" <td>ISS</td>\n", | |
" <td>CCAFS SLC 40</td>\n", | |
" <td>None None</td>\n", | |
" <td>1</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0</td>\n", | |
" <td>B0007</td>\n", | |
" <td>-80.577366</td>\n", | |
" <td>28.561857</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4</td>\n", | |
" <td>2013-09-29</td>\n", | |
" <td>Falcon 9</td>\n", | |
" <td>500.000000</td>\n", | |
" <td>PO</td>\n", | |
" <td>VAFB SLC 4E</td>\n", | |
" <td>False Ocean</td>\n", | |
" <td>1</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0</td>\n", | |
" <td>B1003</td>\n", | |
" <td>-120.610829</td>\n", | |
" <td>34.632093</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>5</td>\n", | |
" <td>2013-12-03</td>\n", | |
" <td>Falcon 9</td>\n", | |
" <td>3170.000000</td>\n", | |
" <td>GTO</td>\n", | |
" <td>CCAFS SLC 40</td>\n", | |
" <td>None None</td>\n", | |
" <td>1</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>False</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0</td>\n", | |
" <td>B1004</td>\n", | |
" <td>-80.577366</td>\n", | |
" <td>28.561857</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" FlightNumber Date BoosterVersion PayloadMass Orbit LaunchSite \\\n", | |
"0 1 2010-06-04 Falcon 9 6104.959412 LEO CCAFS SLC 40 \n", | |
"1 2 2012-05-22 Falcon 9 525.000000 LEO CCAFS SLC 40 \n", | |
"2 3 2013-03-01 Falcon 9 677.000000 ISS CCAFS SLC 40 \n", | |
"3 4 2013-09-29 Falcon 9 500.000000 PO VAFB SLC 4E \n", | |
"4 5 2013-12-03 Falcon 9 3170.000000 GTO CCAFS SLC 40 \n", | |
"\n", | |
" Outcome Flights GridFins Reused Legs LandingPad Block \\\n", | |
"0 None None 1 False False False NaN 1.0 \n", | |
"1 None None 1 False False False NaN 1.0 \n", | |
"2 None None 1 False False False NaN 1.0 \n", | |
"3 False Ocean 1 False False False NaN 1.0 \n", | |
"4 None None 1 False False False NaN 1.0 \n", | |
"\n", | |
" ReusedCount Serial Longitude Latitude Class \n", | |
"0 0 B0003 -80.577366 28.561857 0 \n", | |
"1 0 B0005 -80.577366 28.561857 0 \n", | |
"2 0 B0007 -80.577366 28.561857 0 \n", | |
"3 0 B1003 -120.610829 34.632093 0 \n", | |
"4 0 B1004 -80.577366 28.561857 0 " | |
] | |
}, | |
"execution_count": 5, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"data = pd.read_csv(\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DS0321EN-SkillsNetwork/datasets/dataset_part_2.csv\")\n", | |
"\n", | |
"# If you were unable to complete the previous lab correctly you can uncomment and load this csv\n", | |
"\n", | |
"# data = pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/api/dataset_part_2.csv')\n", | |
"\n", | |
"data.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>FlightNumber</th>\n", | |
" <th>PayloadMass</th>\n", | |
" <th>Flights</th>\n", | |
" <th>Block</th>\n", | |
" <th>ReusedCount</th>\n", | |
" <th>Orbit_ES-L1</th>\n", | |
" <th>Orbit_GEO</th>\n", | |
" <th>Orbit_GTO</th>\n", | |
" <th>Orbit_HEO</th>\n", | |
" <th>Orbit_ISS</th>\n", | |
" <th>...</th>\n", | |
" <th>Serial_B1058</th>\n", | |
" <th>Serial_B1059</th>\n", | |
" <th>Serial_B1060</th>\n", | |
" <th>Serial_B1062</th>\n", | |
" <th>GridFins_False</th>\n", | |
" <th>GridFins_True</th>\n", | |
" <th>Reused_False</th>\n", | |
" <th>Reused_True</th>\n", | |
" <th>Legs_False</th>\n", | |
" <th>Legs_True</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1.0</td>\n", | |
" <td>6104.959412</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2.0</td>\n", | |
" <td>525.000000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3.0</td>\n", | |
" <td>677.000000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4.0</td>\n", | |
" <td>500.000000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>5.0</td>\n", | |
" <td>3170.000000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>85</th>\n", | |
" <td>86.0</td>\n", | |
" <td>15400.000000</td>\n", | |
" <td>2.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>2.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>86</th>\n", | |
" <td>87.0</td>\n", | |
" <td>15400.000000</td>\n", | |
" <td>3.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>2.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>87</th>\n", | |
" <td>88.0</td>\n", | |
" <td>15400.000000</td>\n", | |
" <td>6.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>88</th>\n", | |
" <td>89.0</td>\n", | |
" <td>15400.000000</td>\n", | |
" <td>3.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>2.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>89</th>\n", | |
" <td>90.0</td>\n", | |
" <td>3681.000000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>...</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>90 rows × 83 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" FlightNumber PayloadMass Flights Block ReusedCount Orbit_ES-L1 \\\n", | |
"0 1.0 6104.959412 1.0 1.0 0.0 0.0 \n", | |
"1 2.0 525.000000 1.0 1.0 0.0 0.0 \n", | |
"2 3.0 677.000000 1.0 1.0 0.0 0.0 \n", | |
"3 4.0 500.000000 1.0 1.0 0.0 0.0 \n", | |
"4 5.0 3170.000000 1.0 1.0 0.0 0.0 \n", | |
".. ... ... ... ... ... ... \n", | |
"85 86.0 15400.000000 2.0 5.0 2.0 0.0 \n", | |
"86 87.0 15400.000000 3.0 5.0 2.0 0.0 \n", | |
"87 88.0 15400.000000 6.0 5.0 5.0 0.0 \n", | |
"88 89.0 15400.000000 3.0 5.0 2.0 0.0 \n", | |
"89 90.0 3681.000000 1.0 5.0 0.0 0.0 \n", | |
"\n", | |
" Orbit_GEO Orbit_GTO Orbit_HEO Orbit_ISS ... Serial_B1058 \\\n", | |
"0 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"1 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"2 0.0 0.0 0.0 1.0 ... 0.0 \n", | |
"3 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"4 0.0 1.0 0.0 0.0 ... 0.0 \n", | |
".. ... ... ... ... ... ... \n", | |
"85 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"86 0.0 0.0 0.0 0.0 ... 1.0 \n", | |
"87 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"88 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"89 0.0 0.0 0.0 0.0 ... 0.0 \n", | |
"\n", | |
" Serial_B1059 Serial_B1060 Serial_B1062 GridFins_False GridFins_True \\\n", | |
"0 0.0 0.0 0.0 1.0 0.0 \n", | |
"1 0.0 0.0 0.0 1.0 0.0 \n", | |
"2 0.0 0.0 0.0 1.0 0.0 \n", | |
"3 0.0 0.0 0.0 1.0 0.0 \n", | |
"4 0.0 0.0 0.0 1.0 0.0 \n", | |
".. ... ... ... ... ... \n", | |
"85 0.0 1.0 0.0 0.0 1.0 \n", | |
"86 0.0 0.0 0.0 0.0 1.0 \n", | |
"87 0.0 0.0 0.0 0.0 1.0 \n", | |
"88 0.0 1.0 0.0 0.0 1.0 \n", | |
"89 0.0 0.0 1.0 0.0 1.0 \n", | |
"\n", | |
" Reused_False Reused_True Legs_False Legs_True \n", | |
"0 1.0 0.0 1.0 0.0 \n", | |
"1 1.0 0.0 1.0 0.0 \n", | |
"2 1.0 0.0 1.0 0.0 \n", | |
"3 1.0 0.0 1.0 0.0 \n", | |
"4 1.0 0.0 1.0 0.0 \n", | |
".. ... ... ... ... \n", | |
"85 0.0 1.0 0.0 1.0 \n", | |
"86 0.0 1.0 0.0 1.0 \n", | |
"87 0.0 1.0 0.0 1.0 \n", | |
"88 0.0 1.0 0.0 1.0 \n", | |
"89 1.0 0.0 0.0 1.0 \n", | |
"\n", | |
"[90 rows x 83 columns]" | |
] | |
}, | |
"execution_count": 6, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"X = pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DS0321EN-SkillsNetwork/datasets/dataset_part_3.csv')\n", | |
"\n", | |
"# If you were unable to complete the previous lab correctly you can uncomment and load this csv\n", | |
"\n", | |
"# X = pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/api/dataset_part_3.csv')\n", | |
"\n", | |
"X.head(100)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 1\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a NumPy array from the column <code>Class</code> in <code>data</code>, by applying the method <code>to_numpy()</code> then\n", | |
"assign it to the variable <code>Y</code>,make sure the output is a Pandas series (only one bracket df\\['name of column']).\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"Y = data['Class'].to_numpy()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array([0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1,\n", | |
" 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n", | |
" 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1,\n", | |
" 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n", | |
" 1, 1])" | |
] | |
}, | |
"execution_count": 8, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"Y" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 2\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Standardize the data in <code>X</code> then reassign it to the variable <code>X</code> using the transform provided below.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# students get this\n", | |
"transform = preprocessing.StandardScaler()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"X = transform.fit_transform(X)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We split the data into training and testing data using the function <code>train_test_split</code>. The training data is divided into validation data, a second set used for training data; then the models are trained and hyperparameters are selected using the function <code>GridSearchCV</code>.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 3\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Use the function train_test_split to split the data X and Y into training and test data. Set the parameter test_size to 0.2 and random_state to 2. The training data and test data should be assigned to the following labels.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<code>X_train, X_test, Y_train, Y_test</code>\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Train set: (72, 83) (72,)\n", | |
"Test set: (18, 83) (18,)\n" | |
] | |
} | |
], | |
"source": [ | |
"X_train, X_test, Y_train, Y_test = train_test_split( X, Y, test_size=0.2, random_state=2)\n", | |
"print ('Train set:', X_train.shape, Y_train.shape)\n", | |
"print ('Test set:', X_test.shape, Y_test.shape)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"we can see we only have 18 test samples.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(18,)" | |
] | |
}, | |
"execution_count": 12, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"Y_test.shape" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 4\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a logistic regression object then create a GridSearchCV object <code>logreg_cv</code> with cv = 10. Fit the object to find the best parameters from the dictionary <code>parameters</code>.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"parameters ={'C':[0.01,0.1,1],\n", | |
" 'penalty':['l2'],\n", | |
" 'solver':['lbfgs']}\n", | |
"lr=LogisticRegression()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": { | |
"tags": [] | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"GridSearchCV(cv=10, error_score='raise-deprecating',\n", | |
" estimator=LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n", | |
" intercept_scaling=1, max_iter=100, multi_class='warn',\n", | |
" n_jobs=None, penalty='l2', random_state=None, solver='warn',\n", | |
" tol=0.0001, verbose=0, warm_start=False),\n", | |
" fit_params=None, iid='warn', n_jobs=None,\n", | |
" param_grid={'C': [0.01, 0.1, 1], 'penalty': ['l2'], 'solver': ['lbfgs']},\n", | |
" pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',\n", | |
" scoring=None, verbose=0)" | |
] | |
}, | |
"execution_count": 14, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"logreg_cv = GridSearchCV(lr,parameters,cv=10)\n", | |
"logreg_cv.fit(X_train, Y_train)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We output the <code>GridSearchCV</code> object for logistic regression. We display the best parameters using the data attribute <code>best_params\\_</code> and the accuracy on the validation data using the data attribute <code>best_score\\_</code>.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"tuned hpyerparameters :(best parameters) {'C': 0.01, 'penalty': 'l2', 'solver': 'lbfgs'}\n", | |
"accuracy : 0.8472222222222222\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"tuned hpyerparameters :(best parameters) \",logreg_cv.best_params_)\n", | |
"print(\"accuracy :\",logreg_cv.best_score_)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 5\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Calculate the accuracy on the test data using the method <code>score</code>:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 16, | |
"metadata": { | |
"tags": [] | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"test set accuracy : 0.8333333333333334\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"test set accuracy :\",logreg_cv.score(X_test, Y_test))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Lets look at the confusion matrix:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWgAAAEWCAYAAABLzQ1kAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfwUlEQVR4nO3dd5xdVbn/8c93JhACJKG3ACYooIAUKVIEg1joBBvVAmjgStPrBUG5RMCGhSv+xBIRgRAiRYoUKVIMIEgKoTcllJAAAYQEEiCZeX5/7DVwGGfmlDn7nH0y3zev/co5u6y1ZubwzJq113q2IgIzMyuetmY3wMzMeuYAbWZWUA7QZmYF5QBtZlZQDtBmZgXlAG1mVlAO0NZvkoZIukrSq5Iu6Uc5B0m6oZ5tawZJf5H0pWa3w1qfA/QAIulASVMlvSZpTgokH6lD0Z8FVgdWjojP1VpIREyMiE/WoT3vImm0pJB0Wbf9m6X9t1ZYznclXVDuvIjYLSLOq7G5Zm9zgB4gJP038HPgB2TBdF3gV8A+dSj+PcBjEbG4DmXlZS6wvaSVS/Z9CXisXhUo4/+nrG78YRoAJA0HTgWOjIjLIuL1iFgUEVdFxHHpnMGSfi5pdtp+LmlwOjZa0ixJ35T0Qup9H5KOnQKcDOyXeuaHde9pShqZeqqD0vsvS3pC0nxJMyUdVLL/9pLrtpc0JQ2dTJG0fcmxWyWdJumOVM4Nklbp49vwFnAFsH+6vh34PDCx2/fqTEnPSJonaZqkHdP+XYFvl3yd95a04/uS7gAWAOulfV9Jx38t6dKS8k+XdJMkVfrzs4HLAXpg2A5YBri8j3O+A2wLbA5sBmwDnFRyfA1gODACOAw4S9KKETGOrFd+UUQsHxG/76shkpYDfgHsFhFDge2BGT2ctxJwTTp3ZeAM4JpuPeADgUOA1YClgf/pq27gfOCL6fWngAeB2d3OmUL2PVgJuBC4RNIyEXFdt69zs5JrvgCMBYYCT3Ur75vApumXz45k37svhXMsWAUcoAeGlYEXywxBHAScGhEvRMRc4BSywNNlUTq+KCKuBV4DNqyxPZ3AJpKGRMSciHiwh3P2AB6PiAkRsTgiJgGPAHuVnPOHiHgsIhYCF5MF1l5FxN+BlSRtSBaoz+/hnAsi4qVU58+AwZT/Os+NiAfTNYu6lbcAOJjsF8wFwNERMatMeWaAA/RA8RKwStcQQy/W4t29v6fSvrfL6BbgFwDLV9uQiHgd2A84Apgj6RpJ76+gPV1tGlHy/rka2jMBOArYmR7+okjDOA+nYZVXyP5q6GvoBOCZvg5GxN3AE4DIfpGYVcQBemC4E3gDGNPHObPJbvZ1WZf//PO/Uq8Dy5a8X6P0YERcHxGfANYk6xX/roL2dLXp2Rrb1GUC8DXg2tS7fVsagvgW2dj0ihGxAvAqWWAF6G1Yos/hCklHkvXEZwPH19xyG3AcoAeAiHiV7EbeWZLGSFpW0lKSdpP043TaJOAkSaumm20nk/1JXosZwE6S1k03KE/sOiBpdUl7p7HoN8mGSjp6KONaYIM0NXCQpP2AjYCra2wTABExE/go2Zh7d0OBxWQzPgZJOhkYVnL8eWBkNTM1JG0AfI9smOMLwPGSNq+t9TbQOEAPEBFxBvDfZDf+5pL9WX4U2cwGyILIVOA+4H5getpXS103Ahelsqbx7qDaRnbjbDbwMlmw/FoPZbwE7JnOfYms57lnRLxYS5u6lX17RPT018H1wF/Ipt49RfZXR+nwRdcinJckTS9XTxpSugA4PSLujYjHyWaCTOiaIWPWF/lmsplZMbkHbWZWUA7QZmZ1JumctKjrgZJ9P5H0iKT7JF0uaYVy5ThAm5nV37nArt323QhsEhGbkt3nOLH7Rd05QJuZ1VlETCa7CV6674aStQR3AWuXK6evhQtNdejIz/rupZlV5JwnL+13bpNFLz5RccxZetX3Hk62vL/L+IgYX0V1h5LNdOpTYQO0mVlDdfY0Hb9nKRhXE5DfJuk7ZPPtJ5Y71wHazAwgOnOvQtmDHPYEdqkkYZYDtJkZQGe+ATqlrP0W8NHuaQZ64wBtZgZEHXvQkiYBo8mSlM0CxpHN2hgM3JjSgd8VEUf0VY4DtJkZQEf9HggUEQf0sLvPXOk9cYA2M4OqbhI2igO0mRk05CZhtRygzcwg95uEtXCANjOjvjcJ68UB2swM3IM2MyusjkXlz2kwB2gzM/BNQjOzwvIQh5lZQbkHbWZWUO5Bm5kVU3T6JqGZWTG5B21mVlAegzYzKygnSzIzKyj3oM3MCspj0GZmBVXHhP314gBtZgbuQZuZFVWEbxKamRWTe9BmZgXlWRxmZgXlHrSZWUF5FoeZWUF5iMPMrKA8xGFmVlAO0GZmBVXAIY62ZjfAzKwQOhZXvpUh6RxJL0h6oGTfSpJulPR4+nfFcuU4QJuZQTbEUelW3rnArt32nQDcFBHrAzel931ygDYzg2yIo9KtXFERk4GXu+3eBzgvvT4PGFOuHI9Bm5lBI24Srh4RcwAiYo6k1cpd4ABtZgZVBWhJY4GxJbvGR8T4ejfJAdrMDCCiilNjPFBtQH5e0pqp97wm8EK5CzwGbWYGsHhx5Vtt/gx8Kb3+EnBluQvcgzYzg7rOg5Y0CRgNrCJpFjAO+BFwsaTDgKeBz5UrxwHazAzqepMwIg7o5dAu1ZTjAG1mBlWNQTeKA7SZGTgXh5lZYTlAm5kVU3T4obFmZsXkHrSZWUEVMN2oA7SZGUCnZ3GYmRWThzjMzArKNwmtGoMGL8UJF53KUoOXoq29nal/uZMr/+/iZjfLmsyfi5y4B23VWPzmIn5y4Cm8ueAN2ge1c+Kl3+P+W+/hiXseb3bTrIn8ucjJQBiDljQf6PUrjYhh9a5zSfbmgjcAaB/UTvug9j6+szaQ+HORg4EwiyMihgJIOhV4DpgACDgIGFrv+pZ0amtj3NWns9p71uDmCdfzxAz3ksyfi1wUsAedZz7oT0XEryJifkTMi4hfA5/p6wJJYyVNlTT10flP5Ni01hGdnXx39+P45naHM2qz9zFig3Wa3SQrAH8u6i86OyveGiXPAN0h6SBJ7ZLaJB0E9HmbNCLGR8RWEbHVhkPXy7FprWfhvAU8eteDbPLRLZrdFCsQfy7qqKOj8q1B8gzQBwKfB55P2+fSPqvQ0JWGMWTYsgAsNXhpNtphU57717NNbpU1mz8XOemMyrcGyW0WR0Q8SfaYcavR8NVW5LCfHUVbWxtqE1Ou+Tv33jyt2c2yJvPnIicDaZqdpFWBrwIjS+uJiEPzqnNJM+uRpzhlj+Oa3QwrGH8uclLAm4R5zoO+ErgN+Ctlxp7NzJpuIEyzK7FsRHwrx/LNzOpngPWgr5a0e0Rcm2MdZmZ1EYuL94d+ngH6WODbkt4EFpEtVgmvJDSzQhpIPeiuFYVmZi1hgI1BI2lFYH1gma59ETE5zzrNzGoykHrQkr5CNsyxNjAD2Ba4E/hYXnWamdUqChig81xJeCywNfBUROwMbAHMzbE+M7PaLe6ofGuQPIc43oiINyQhaXBEPCJpwxzrMzOrXQF70HkG6FmSVgCuAG6U9G9gdo71mZnVbiAF6IjYN738rqRbgOHAdXnVZ2bWHxH1C9CSvgF8hexRCvcDh0TEG9WWk8cTVVbqYff96d/lgZfrXaeZWb/VqQctaQRwDLBRRCyUdDGwP3ButWXl0YOeRvZbQyX7ut4H4ETPZlY89R3iGAQMkbQIWJYah3fzeOTVqHqXaWaWt1hc+UIVSWOBsSW7xkfEeICIeFbST4GngYXADRFxQy1t8lO9zcwAqlhImILx+J6OpQV6+wCjgFeASyQdHBEXVNukPOdBm5m1jOiMircyPg7MjIi5EbEIuAzYvpY2uQdtZgb1HIN+GthW0rJkQxy7AFNrKSi3HrSkCZXsMzMrhM4qtj5ExD+AS4HpZDPY2uhlOKScPHvQG5e+kdQObJljfWZmNatnLo6IGAeM6285ecyDPhH4NtkUk3m8M93uLWr8LWJmlrdYXLyVhHUf4oiIH6Zc0D+JiGERMTRtK0fEifWuz8ysLuo0xFFPeS71PlHS3sBOadetEXF1XvWZmfVHAfP155oP+ofANsDEtOtYSTu4F21mhTSQAjSwB7B5RPZ7SdJ5wD2AA7SZFU7L96DTCpl1IuK+Ci9ZgXeSIw2vpi4zs0aKxc1uwX8qG6Al3Qrsnc6dAcyV9LeI+O8yl/4QuCelGhXZWLR7z2ZWSK3agx4eEfPSMwb/EBHjJJXtQUfEpBTctyYL0N+KiOf611wzs3wUMUBXMs1ukKQ1gc8D1c7CaANeBP4NbCBppzLnm5k1R6jyrUEq6UGfClwP3B4RUyStBzxe7iJJpwP7AQ/yzv3RACbX2FYzs9wUsQddNkBHxCXAJSXvnwA+U0HZY4ANI+LNmltnZtYg0dm4nnGleg3Qkv4fWY+3RxFxTJmynwCWAhygzazwOjtaKEBTY3q8EguAGZJuoiRIVxDYzcwarqWGOCLivNL3kpaLiNerKPvPaTMzK7yWGuLoImk74PdkT+ReV9JmwOER8bW+ruse4M3MiiyKl8yuoml2Pwc+BbwEEBH38k4CJDOzJUJ0quKtUSpa6h0Rz0jvalRHPs0xM2uOVrtJ2OUZSdsDIWlp4Bjg4XybZWbWWC05Bg0cAZwJjACeJVu0cmRvJ0u6ir6n5+1dZRvNzHIXDVwhWKlKFqq8CBxURZk/Tf9+GlgDuCC9PwB4sprGmZk1SktNs+uSlnafCWxL1jO+E/hGWlH4HyLib+m60yKi9GbiVZK8zNvMCqmzgD3oSmZxXAhcDKwJrEW27HtSBdetmoI7AJJGAavW0kgzs7xFqOKtUSoZg1ZETCh5f4Gkoyq47hvArZK6etojgcOrbJ+ZWUO01CwOSSull7dIOgH4I9kQx37ANeUKjojrJK0PvD/tesSJk8ysqFptFsc0soDc1erS3m8Ap/V0kaSPRcTNkj7d7dB7JRERl9XcWjOznBRxDLqvXByjaizzo8DNwF49FQs4QJtZ4bTkNDsASZsAGwHLdO2LiPN7OjcixqV/D6lHA83MGqGIuTgqmWY3DhhNFqCvBXYDbgd6DNCS+nyYbEScUXUrzcxyVs8hDkkrAGcDm5CNHBwaEXdWW04lPejPApsB90TEIZJWTxX3Zmj6d0OyB8Z2pRzdCz/uyswKqrO+NwnPBK6LiM+mFBnL1lJIJQF6YUR0SlosaRjwArBebydHxCkAkm4APhQR89P771Ly6CwzsyKpVw86xcmdgC8DRMRbwFu1lFVJgJ6auuu/I5vZ8RpwdwXXrdutUW+RzYWuyPmzq/5rwAaAhbNva3YTbAlVzU1CSWOBsSW7xkfE+PR6PWAu8IeUP38acGyVDzwBKsvF0ZWY/zeSrgOGRcR9FZQ9Abhb0uVkYzD7Ak7ib2aFVE0POgXj8b0cHgR8CDg6Iv4h6UzgBOB/q21TXwtVPtTXsYiY3lfBEfF9SX8Bdky7DomIe6ptoJlZI9RxEscsYFZE/CO9v5QsQFetrx70z/o4FsDHyhWegnifgdzMrAg6OitJTVReRDwn6RlJG0bEo8AuwEO1lNXXQpWda22gmVmrqXO20aOBiWkGxxNATetCKlqoYma2pAvqN80uImYAW/W3HAdoMzOgsxVXEpqZDQSddexB10vZUXFlDpZ0cnq/rqRt8m+amVnjBKp4a5RKblv+CtiO7JmCAPOBs3JrkZlZE3SgirdGqWSI48MR8SFJ9wBExL/TnUkzsyVGAZ8ZW1GAXiSpnTSPW9KqFPNrMTOrWRGDWiVDHL8ALgdWk/R9slSjP8i1VWZmDVbEMehKcnFMlDSNbDWMgDER8XDuLTMza6ACPpKwooT96wILgKtK90XE03k2zMyskYo4za6SMehreOfhscsAo4BHgY1zbJeZWUN1NLsBPahkiOODpe9TlrvDezndzKwldao1e9DvEhHTJW2dR2PMzJqlgCu9KxqDLn0IbBtZIuq5ubXIzKwJijjNrpIe9NCS14vJxqT/lE9zzMyao+VmcaQFKstHxHENao+ZWVM0cgl3pfp65NWgiFjc16OvzMyWFK3Wg76bbLx5hqQ/A5cAbz+VNiIuy7ltZmYN06pj0CsBL5E9g7BrPnQADtBmtsRotVkcq6UZHA/wTmDuUsSvxcysZq02xNEOLA89jpw7QJvZEqXVhjjmRMSpDWuJmVkTdbRYD7qAzTUzy0er9aB3aVgrzMyarKUCdES83MiGmJk1UxFvrFWdLMnMbEnUarM4zMwGjJYa4jAzG0haMmG/mdlAUO8hjpRsbirwbETsWUsZDtBmZuQyxHEs8DAwrNYC2urXFjOz1hVVbOVIWhvYAzi7P21ygDYzAzqJijdJYyVNLdnGdivu58Dx9LNj7iEOMzOqu0kYEeOB8T0dk7Qn8EJETJM0uj9tcoA2M6OuY9A7AHtL2h1YBhgm6YKIOLjagjzEYWZGNouj0q0vEXFiRKwdESOB/YGbawnO4B60mRmQjUEXjQO0mRn55OKIiFuBW2u93gHazAwv9TYzK6wOD3GYmRWTe9BmZgXlm4RmZgVVvPDsAG1mBniIw8yssHyT0MysoIo4Bu2l3gX3qU+O5sEHJvPIQ7dz/HFHNrs51iQn/eAMdtpjf8YcfMTb+376y7PZ64Cvsu8X/4tjTjyVefNfa2ILW189043WiwN0gbW1tfGLM7/PnnsdzAc325n99hvDBz6wfrObZU0wZvdP8JszvveufdttvQWXT/gNl5//a0auM4KzJ1zUpNYtGapJN9ooDtAFts3WW/Cvfz3JzJlPs2jRIi6++Er23utTzW6WNcFWm3+Q4cOGvmvfDh/ekkGD2gHYdOP38/wLLzajaUuMziq2RnGALrC1RqzBM7Nmv/1+1rNzWGutNZrYIiuqy6+5gY9st3Wzm9HSoor/GiWXm4SSPt3X8Yi4rJfrxgJjAdQ+nLa25XJoXeuQ/jOvYUTxbmRYc/32vEm0t7ez5yd3bnZTWtpAmsWxV/p3NWB74Ob0fmeyzE49BujSpxQMWnpE8b5bDfbsrDmss/Zab79fe8SazJnzfBNbZEVz5bU3MvmOuzn7Fz/s8Re6VW7AzIOOiEMAJF0NbBQRc9L7NYGz8qhzSTRl6gze975RjBy5Ds8++xyf//w+fOGLnslhmdvvmsrvJ17Cub/8MUOWWabZzWl5nQX86zTvedAju4Jz8jywQc51LjE6Ojo49usnce01F9Le1sa5513EQw891uxmWRMcN+5HTLnnPl55ZR67jDmYrx32Bc6ecBFvLVrEV7/+HSC7UTju+KOb3NLWVbzwDMpzTFPSL4H1gUlkX//+wD8jouynyEMc1pOFs29rdhOsgJZaZb1+j+8c+J59K445Fz51eUPGk3LtQUfEUZL2BXZKu8ZHxOV51mlmVotGzs6oVCOWek8H5kfEXyUtK2loRMxvQL1mZhVbXMAAnes8aElfBS4Ffpt2jQCuyLNOM7NaFHEedN4LVY4EdgDmAUTE42RT78zMCqWIKwnzHuJ4MyLe6pqfKWkQxbxZamYDXBEXgeUdoP8m6dvAEEmfAL4GXJVznWZmVRuI6UZPAOYC9wOHA9cCJ+Vcp5lZ1TqIirdGyXuaXSfwu7SZmRVWEXvQeSVLup8+xpojYtM86jUzq9VAGoPeM/3blThiQvr3IGBBTnWamdVsICVLegpA0g4RsUPJoRMk3QGcmke9Zma1qtf8ZknrAOcDa5DF/fERcWYtZeV9k3A5SR/peiNpe2BgJ3k2s0Kq4yOvFgPfjIgPANsCR0raqJY25T3N7jDgHEnD0/tXgENzrtPMrGodUZ9BjpTBc056PV/Sw2SrqB+qtqy8Z3FMAzaTNIwsc96redZnZlarPJZwSxoJbAH8o5brcw3QkgYDnwFGAoO6VhRGhMegzaxQqknYX/p4vmR8eiJU6TnLA38Cvh4R82ppU95DHFcCrwLTgDdzrsvMrGbV9J9LH8/XE0lLkQXnib09g7USeQfotSNi15zrMDPrt3otVFE2VPB74OGIOKM/ZeU9i+Pvkj6Ycx1mZv1Wx1kcOwBfAD4maUbadq+lTXn3oD8CfFnSTLIhDgHhlYRmVjR1nMVxO1ms67e8A/RuOZdvZlYXA+6RVyUrClcD/Fx4MyusIubiyPuRV3tLehyYCfwNeBL4S551mpnVoo5j0HWT903C08iWOj4WEaOAXYA7cq7TzKxqEVHx1ih5B+hFEfES0CapLSJuATbPuU4zs6p10Fnx1ih53yR8Ja2mmQxMlPQCWSIRM7NCqWYlYaPk3YPeB1gIfAO4DvgXsFfOdZqZVS2q+K9R8p7F8XrJ2/PyrMvMrD+K2IPO65FX8+l5aXvXQpVhedRrZlarATMPOiKG5lGumVleBkwP2sys1dRrqXc9OUCbmTGAhjjMzFpNuAdtZlZMjVzCXSkHaDMzipksyQHazAz3oM3MCquj02PQZmaF5FkcZmYF5TFoM7OC8hi0mVlBuQdtZlZQvkloZlZQHuIwMysoD3GYmRWU042amRWU50GbmRWUe9BmZgXVWcB0o3k/1dvMrCVERMVbOZJ2lfSopH9KOqHWNrkHbWZG/WZxSGoHzgI+AcwCpkj6c0Q8VG1Z7kGbmQFRxVbGNsA/I+KJiHgL+COwTy1tKmwPevFbz6rZbSgKSWMjYnyz22HF4s9FfVUTcySNBcaW7Bpf8rMYATxTcmwW8OFa2uQedGsYW/4UG4D8uWiSiBgfEVuVbKW/KHsK9DWNnzhAm5nV1yxgnZL3awOzaynIAdrMrL6mAOtLGiVpaWB/4M+1FFTYMWh7F48zWk/8uSigiFgs6SjgeqAdOCciHqylLBUxQYiZmXmIw8yssBygzcwKygG6HyR9V9L/pNenSvp4D+eMlnR1ner7dh/HnpS0Sp3qea0e5Vht6vX9lzRS0gP1KMuawwG6TiLi5Ij4a87V9BqgzWzJ4wBdJUnfSUlQ/gpsWLL/XEmfTa93lfSIpNuBT/dSzpclXSbpOkmPS/pxybEDJN0v6QFJp6d9PwKGSJohaWKZNl4haZqkB9OKp679r0n6vqR7Jd0lafW0f5SkOyVNkXRaP749VkeSlpd0k6Tp6fOwT9o/UtLDkn6XfsY3SBqSjm2Zfr53Akc29QuwfnOAroKkLcnmNG5BFni37uGcZYDfAXsBOwJr9FHk5sB+wAeB/SStI2kt4HTgY+n41pLGRMQJwMKI2DwiDirT1EMjYktgK+AYSSun/csBd0XEZsBk4Ktp/5nAryNia+C5MmVb47wB7BsRHwJ2Bn4mqWuV2vrAWRGxMfAK8Jm0/w/AMRGxXaMba/XnAF2dHYHLI2JBRMyj58nn7wdmRsTjkc1hvKCP8m6KiFcj4g3gIeA9ZEH/1oiYGxGLgYnATlW28xhJ9wJ3ka1oWj/tfwvoGg+fBoxMr3cAJqXXE6qsy/Ij4AeS7gP+SpbjYfV0bGZEzEivpwEjJQ0HVoiIv6X9/lm2OC9UqV4lE8crnVz+ZsnrDrKfR7+SREkaDXwc2C4iFki6FVgmHV4U70x876qviyfEF89BwKrAlhGxSNKTvPOz7P7ZGUL22fHPcQniHnR1JgP7ShoiaSjZMEZ3jwCjJL03vT+gyjr+AXxU0iopr+wBQFePaJGkpcpcPxz4dwrO7we2raDOO8iGbiALClYMw4EXUnDemewvrF5FxCvAq5I+knb5Z9niHKCrEBHTgYuAGcCfgNt6OOcNsixj16SbhE9VWccc4ETgFuBeYHpEXJkOjwfuK3OT8DpgUPqz+DSyYY5yjgWOlDSFLChYMUwEtpI0lSzYPlLBNYcAZ6WbhAvzbJzlz0u9zcwKyj1oM7OCcoA2MysoB2gzs4JygDYzKygHaDOzgnKAtv8gqSPl/HhA0iWSlu1HWaU5Ss6WtFEf546WtH0NdfSYya+SDH/VZo4rzWBoljcHaOtJV86PTciWhx9RejAtoKlaRHwlIh7q45TRQNUB2mxJ5QBt5dwGvC/1bm+RdCFwv6R2ST9JGfDuk3Q4gDK/lPSQpGuA1boKknSrpK3S611TlrZ7U8a2kWS/CL6Reu87SlpV0p9SHVMk7ZCuXTllcLtH0m+pYHl8bxn+0rGfpbbcJGnVtO+9yjINTpN0W1qV2b3MY9LXeZ+kP9b4/TXrlXNxWK8kDQJ2I1udCLANsElEzExB7tWI2FrSYOAOSTeQZfrbkCxD3+pkSaDO6VbuqmQZ/3ZKZa0UES9L+g3wWkT8NJ13IfB/EXG7pHXJHsL5AWAccHtEnCppD7KVm+UcmuoYAkyR9KeIeIksw9/0iPimpJNT2UeRrdo8IiIel/Rh4FdkGQZLnQCMiog3Ja1QyffUrBoO0NaTIZJmpNe3Ab8nG3q4OyJmpv2fBDbtGl8mWyK+PlnmvUkR0QHMlnRzD+VvC0zuKisiXu6lHR8HNnonwybDUg6UnUh5tiPiGkn/ruBrOkbSvul1V4a/l4BOsuX7kGUevEzS8unrvaSk7sE9lHkfMFHSFcAVFbTBrCoO0NaThRGxeemOFKheL90FHB0R13c7b3fKZ1SrNOtaG1lWvnfllEhtqThHQZkMf91FqveV7t+DHuxB9stib+B/JW2cUsSa1YXHoK1W1wP/1ZVdT9IGkpYjy/i3fxqjXpMs0Xx3d5Jl7BuVrl0p7Z8PDC057way4QbSeZunl5NJmdok7QasWKatfWX4awO6/go4kGzoZB4wU9LnUh2StFlpgZLagHUi4hbgeGAFYPky7TCrinvQVquzyRL+T1fWpZ0LjAEuJxurvR94jHdSpb4tIuamMezLUqB7AfgEcBVwqbJHOx0NHEOWme0+ss/qZLIbiacAkyRNT+U/Xaat1wFHpHIe5d0Z/l4HNpY0DXiV7Ak3kP0C+LWkk4ClgD+SZRfs0g5coCxJvsjGyl8p0w6zqjibnZlZQXmIw8ysoBygzcwKygHazKygHKDNzArKAdrMrKAcoM3MCsoB2sysoP4/GZXC9mFMgIMAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"yhat=logreg_cv.predict(X_test)\n", | |
"plot_confusion_matrix(Y_test,yhat)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Examining the confusion matrix, we see that logistic regression can distinguish between the different classes. We see that the major problem is false positives.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 6\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a support vector machine object then create a <code>GridSearchCV</code> object <code>svm_cv</code> with cv - 10. Fit the object to find the best parameters from the dictionary <code>parameters</code>.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 18, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"parameters = {'kernel':('linear', 'rbf','poly','rbf', 'sigmoid'),\n", | |
" 'C': np.logspace(-3, 3, 5),\n", | |
" 'gamma':np.logspace(-3, 3, 5)}\n", | |
"svm = SVC()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 19, | |
"metadata": { | |
"tags": [] | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"GridSearchCV(cv=10, error_score='raise-deprecating',\n", | |
" estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n", | |
" decision_function_shape='ovr', degree=3, gamma='auto_deprecated',\n", | |
" kernel='rbf', max_iter=-1, probability=False, random_state=None,\n", | |
" shrinking=True, tol=0.001, verbose=False),\n", | |
" fit_params=None, iid='warn', n_jobs=None,\n", | |
" param_grid={'kernel': ('linear', 'rbf', 'poly', 'rbf', 'sigmoid'), 'C': array([1.00000e-03, 3.16228e-02, 1.00000e+00, 3.16228e+01, 1.00000e+03]), 'gamma': array([1.00000e-03, 3.16228e-02, 1.00000e+00, 3.16228e+01, 1.00000e+03])},\n", | |
" pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',\n", | |
" scoring=None, verbose=0)" | |
] | |
}, | |
"execution_count": 19, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"svm_cv = GridSearchCV(svm,parameters,cv=10)\n", | |
"svm_cv.fit(X_train, Y_train)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"tuned hyperparameters :(best parameters) {'C': 1.0, 'gamma': 0.03162277660168379, 'kernel': 'sigmoid'}\n", | |
"accuracy : 0.8472222222222222\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"tuned hyperparameters :(best parameters) \",svm_cv.best_params_)\n", | |
"print(\"accuracy :\",svm_cv.best_score_)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 7\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Calculate the accuracy on the test data using the method <code>score</code>:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 21, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"test set accuracy : 0.8333333333333334\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"test set accuracy :\",svm_cv.score(X_test, Y_test))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can plot the confusion matrix\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 22, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWgAAAEWCAYAAABLzQ1kAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfwUlEQVR4nO3dd5xdVbn/8c93JhACJKG3ACYooIAUKVIEg1joBBvVAmjgStPrBUG5RMCGhSv+xBIRgRAiRYoUKVIMIEgKoTcllJAAAYQEEiCZeX5/7DVwGGfmlDn7nH0y3zev/co5u6y1ZubwzJq113q2IgIzMyuetmY3wMzMeuYAbWZWUA7QZmYF5QBtZlZQDtBmZgXlAG1mVlAO0NZvkoZIukrSq5Iu6Uc5B0m6oZ5tawZJf5H0pWa3w1qfA/QAIulASVMlvSZpTgokH6lD0Z8FVgdWjojP1VpIREyMiE/WoT3vImm0pJB0Wbf9m6X9t1ZYznclXVDuvIjYLSLOq7G5Zm9zgB4gJP038HPgB2TBdF3gV8A+dSj+PcBjEbG4DmXlZS6wvaSVS/Z9CXisXhUo4/+nrG78YRoAJA0HTgWOjIjLIuL1iFgUEVdFxHHpnMGSfi5pdtp+LmlwOjZa0ixJ35T0Qup9H5KOnQKcDOyXeuaHde9pShqZeqqD0vsvS3pC0nxJMyUdVLL/9pLrtpc0JQ2dTJG0fcmxWyWdJumOVM4Nklbp49vwFnAFsH+6vh34PDCx2/fqTEnPSJonaZqkHdP+XYFvl3yd95a04/uS7gAWAOulfV9Jx38t6dKS8k+XdJMkVfrzs4HLAXpg2A5YBri8j3O+A2wLbA5sBmwDnFRyfA1gODACOAw4S9KKETGOrFd+UUQsHxG/76shkpYDfgHsFhFDge2BGT2ctxJwTTp3ZeAM4JpuPeADgUOA1YClgf/pq27gfOCL6fWngAeB2d3OmUL2PVgJuBC4RNIyEXFdt69zs5JrvgCMBYYCT3Ur75vApumXz45k37svhXMsWAUcoAeGlYEXywxBHAScGhEvRMRc4BSywNNlUTq+KCKuBV4DNqyxPZ3AJpKGRMSciHiwh3P2AB6PiAkRsTgiJgGPAHuVnPOHiHgsIhYCF5MF1l5FxN+BlSRtSBaoz+/hnAsi4qVU58+AwZT/Os+NiAfTNYu6lbcAOJjsF8wFwNERMatMeWaAA/RA8RKwStcQQy/W4t29v6fSvrfL6BbgFwDLV9uQiHgd2A84Apgj6RpJ76+gPV1tGlHy/rka2jMBOArYmR7+okjDOA+nYZVXyP5q6GvoBOCZvg5GxN3AE4DIfpGYVcQBemC4E3gDGNPHObPJbvZ1WZf//PO/Uq8Dy5a8X6P0YERcHxGfANYk6xX/roL2dLXp2Rrb1GUC8DXg2tS7fVsagvgW2dj0ihGxAvAqWWAF6G1Yos/hCklHkvXEZwPH19xyG3AcoAeAiHiV7EbeWZLGSFpW0lKSdpP043TaJOAkSaumm20nk/1JXosZwE6S1k03KE/sOiBpdUl7p7HoN8mGSjp6KONaYIM0NXCQpP2AjYCra2wTABExE/go2Zh7d0OBxWQzPgZJOhkYVnL8eWBkNTM1JG0AfI9smOMLwPGSNq+t9TbQOEAPEBFxBvDfZDf+5pL9WX4U2cwGyILIVOA+4H5getpXS103Ahelsqbx7qDaRnbjbDbwMlmw/FoPZbwE7JnOfYms57lnRLxYS5u6lX17RPT018H1wF/Ipt49RfZXR+nwRdcinJckTS9XTxpSugA4PSLujYjHyWaCTOiaIWPWF/lmsplZMbkHbWZWUA7QZmZ1JumctKjrgZJ9P5H0iKT7JF0uaYVy5ThAm5nV37nArt323QhsEhGbkt3nOLH7Rd05QJuZ1VlETCa7CV6674aStQR3AWuXK6evhQtNdejIz/rupZlV5JwnL+13bpNFLz5RccxZetX3Hk62vL/L+IgYX0V1h5LNdOpTYQO0mVlDdfY0Hb9nKRhXE5DfJuk7ZPPtJ5Y71wHazAwgOnOvQtmDHPYEdqkkYZYDtJkZQGe+ATqlrP0W8NHuaQZ64wBtZgZEHXvQkiYBo8mSlM0CxpHN2hgM3JjSgd8VEUf0VY4DtJkZQEf9HggUEQf0sLvPXOk9cYA2M4OqbhI2igO0mRk05CZhtRygzcwg95uEtXCANjOjvjcJ68UB2swM3IM2MyusjkXlz2kwB2gzM/BNQjOzwvIQh5lZQbkHbWZWUO5Bm5kVU3T6JqGZWTG5B21mVlAegzYzKygnSzIzKyj3oM3MCspj0GZmBVXHhP314gBtZgbuQZuZFVWEbxKamRWTe9BmZgXlWRxmZgXlHrSZWUF5FoeZWUF5iMPMrKA8xGFmVlAO0GZmBVXAIY62ZjfAzKwQOhZXvpUh6RxJL0h6oGTfSpJulPR4+nfFcuU4QJuZQTbEUelW3rnArt32nQDcFBHrAzel931ygDYzg2yIo9KtXFERk4GXu+3eBzgvvT4PGFOuHI9Bm5lBI24Srh4RcwAiYo6k1cpd4ABtZgZVBWhJY4GxJbvGR8T4ejfJAdrMDCCiilNjPFBtQH5e0pqp97wm8EK5CzwGbWYGsHhx5Vtt/gx8Kb3+EnBluQvcgzYzg7rOg5Y0CRgNrCJpFjAO+BFwsaTDgKeBz5UrxwHazAzqepMwIg7o5dAu1ZTjAG1mBlWNQTeKA7SZGTgXh5lZYTlAm5kVU3T4obFmZsXkHrSZWUEVMN2oA7SZGUCnZ3GYmRWThzjMzArKNwmtGoMGL8UJF53KUoOXoq29nal/uZMr/+/iZjfLmsyfi5y4B23VWPzmIn5y4Cm8ueAN2ge1c+Kl3+P+W+/hiXseb3bTrIn8ucjJQBiDljQf6PUrjYhh9a5zSfbmgjcAaB/UTvug9j6+szaQ+HORg4EwiyMihgJIOhV4DpgACDgIGFrv+pZ0amtj3NWns9p71uDmCdfzxAz3ksyfi1wUsAedZz7oT0XEryJifkTMi4hfA5/p6wJJYyVNlTT10flP5Ni01hGdnXx39+P45naHM2qz9zFig3Wa3SQrAH8u6i86OyveGiXPAN0h6SBJ7ZLaJB0E9HmbNCLGR8RWEbHVhkPXy7FprWfhvAU8eteDbPLRLZrdFCsQfy7qqKOj8q1B8gzQBwKfB55P2+fSPqvQ0JWGMWTYsgAsNXhpNtphU57717NNbpU1mz8XOemMyrcGyW0WR0Q8SfaYcavR8NVW5LCfHUVbWxtqE1Ou+Tv33jyt2c2yJvPnIicDaZqdpFWBrwIjS+uJiEPzqnNJM+uRpzhlj+Oa3QwrGH8uclLAm4R5zoO+ErgN+Ctlxp7NzJpuIEyzK7FsRHwrx/LNzOpngPWgr5a0e0Rcm2MdZmZ1EYuL94d+ngH6WODbkt4EFpEtVgmvJDSzQhpIPeiuFYVmZi1hgI1BI2lFYH1gma59ETE5zzrNzGoykHrQkr5CNsyxNjAD2Ba4E/hYXnWamdUqChig81xJeCywNfBUROwMbAHMzbE+M7PaLe6ofGuQPIc43oiINyQhaXBEPCJpwxzrMzOrXQF70HkG6FmSVgCuAG6U9G9gdo71mZnVbiAF6IjYN738rqRbgOHAdXnVZ2bWHxH1C9CSvgF8hexRCvcDh0TEG9WWk8cTVVbqYff96d/lgZfrXaeZWb/VqQctaQRwDLBRRCyUdDGwP3ButWXl0YOeRvZbQyX7ut4H4ETPZlY89R3iGAQMkbQIWJYah3fzeOTVqHqXaWaWt1hc+UIVSWOBsSW7xkfEeICIeFbST4GngYXADRFxQy1t8lO9zcwAqlhImILx+J6OpQV6+wCjgFeASyQdHBEXVNukPOdBm5m1jOiMircyPg7MjIi5EbEIuAzYvpY2uQdtZgb1HIN+GthW0rJkQxy7AFNrKSi3HrSkCZXsMzMrhM4qtj5ExD+AS4HpZDPY2uhlOKScPHvQG5e+kdQObJljfWZmNatnLo6IGAeM6285ecyDPhH4NtkUk3m8M93uLWr8LWJmlrdYXLyVhHUf4oiIH6Zc0D+JiGERMTRtK0fEifWuz8ysLuo0xFFPeS71PlHS3sBOadetEXF1XvWZmfVHAfP155oP+ofANsDEtOtYSTu4F21mhTSQAjSwB7B5RPZ7SdJ5wD2AA7SZFU7L96DTCpl1IuK+Ci9ZgXeSIw2vpi4zs0aKxc1uwX8qG6Al3Qrsnc6dAcyV9LeI+O8yl/4QuCelGhXZWLR7z2ZWSK3agx4eEfPSMwb/EBHjJJXtQUfEpBTctyYL0N+KiOf611wzs3wUMUBXMs1ukKQ1gc8D1c7CaANeBP4NbCBppzLnm5k1R6jyrUEq6UGfClwP3B4RUyStBzxe7iJJpwP7AQ/yzv3RACbX2FYzs9wUsQddNkBHxCXAJSXvnwA+U0HZY4ANI+LNmltnZtYg0dm4nnGleg3Qkv4fWY+3RxFxTJmynwCWAhygzazwOjtaKEBTY3q8EguAGZJuoiRIVxDYzcwarqWGOCLivNL3kpaLiNerKPvPaTMzK7yWGuLoImk74PdkT+ReV9JmwOER8bW+ruse4M3MiiyKl8yuoml2Pwc+BbwEEBH38k4CJDOzJUJ0quKtUSpa6h0Rz0jvalRHPs0xM2uOVrtJ2OUZSdsDIWlp4Bjg4XybZWbWWC05Bg0cAZwJjACeJVu0cmRvJ0u6ir6n5+1dZRvNzHIXDVwhWKlKFqq8CBxURZk/Tf9+GlgDuCC9PwB4sprGmZk1SktNs+uSlnafCWxL1jO+E/hGWlH4HyLib+m60yKi9GbiVZK8zNvMCqmzgD3oSmZxXAhcDKwJrEW27HtSBdetmoI7AJJGAavW0kgzs7xFqOKtUSoZg1ZETCh5f4Gkoyq47hvArZK6etojgcOrbJ+ZWUO01CwOSSull7dIOgH4I9kQx37ANeUKjojrJK0PvD/tesSJk8ysqFptFsc0soDc1erS3m8Ap/V0kaSPRcTNkj7d7dB7JRERl9XcWjOznBRxDLqvXByjaizzo8DNwF49FQs4QJtZ4bTkNDsASZsAGwHLdO2LiPN7OjcixqV/D6lHA83MGqGIuTgqmWY3DhhNFqCvBXYDbgd6DNCS+nyYbEScUXUrzcxyVs8hDkkrAGcDm5CNHBwaEXdWW04lPejPApsB90TEIZJWTxX3Zmj6d0OyB8Z2pRzdCz/uyswKqrO+NwnPBK6LiM+mFBnL1lJIJQF6YUR0SlosaRjwArBebydHxCkAkm4APhQR89P771Ly6CwzsyKpVw86xcmdgC8DRMRbwFu1lFVJgJ6auuu/I5vZ8RpwdwXXrdutUW+RzYWuyPmzq/5rwAaAhbNva3YTbAlVzU1CSWOBsSW7xkfE+PR6PWAu8IeUP38acGyVDzwBKsvF0ZWY/zeSrgOGRcR9FZQ9Abhb0uVkYzD7Ak7ib2aFVE0POgXj8b0cHgR8CDg6Iv4h6UzgBOB/q21TXwtVPtTXsYiY3lfBEfF9SX8Bdky7DomIe6ptoJlZI9RxEscsYFZE/CO9v5QsQFetrx70z/o4FsDHyhWegnifgdzMrAg6OitJTVReRDwn6RlJG0bEo8AuwEO1lNXXQpWda22gmVmrqXO20aOBiWkGxxNATetCKlqoYma2pAvqN80uImYAW/W3HAdoMzOgsxVXEpqZDQSddexB10vZUXFlDpZ0cnq/rqRt8m+amVnjBKp4a5RKblv+CtiO7JmCAPOBs3JrkZlZE3SgirdGqWSI48MR8SFJ9wBExL/TnUkzsyVGAZ8ZW1GAXiSpnTSPW9KqFPNrMTOrWRGDWiVDHL8ALgdWk/R9slSjP8i1VWZmDVbEMehKcnFMlDSNbDWMgDER8XDuLTMza6ACPpKwooT96wILgKtK90XE03k2zMyskYo4za6SMehreOfhscsAo4BHgY1zbJeZWUN1NLsBPahkiOODpe9TlrvDezndzKwldao1e9DvEhHTJW2dR2PMzJqlgCu9KxqDLn0IbBtZIuq5ubXIzKwJijjNrpIe9NCS14vJxqT/lE9zzMyao+VmcaQFKstHxHENao+ZWVM0cgl3pfp65NWgiFjc16OvzMyWFK3Wg76bbLx5hqQ/A5cAbz+VNiIuy7ltZmYN06pj0CsBL5E9g7BrPnQADtBmtsRotVkcq6UZHA/wTmDuUsSvxcysZq02xNEOLA89jpw7QJvZEqXVhjjmRMSpDWuJmVkTdbRYD7qAzTUzy0er9aB3aVgrzMyarKUCdES83MiGmJk1UxFvrFWdLMnMbEnUarM4zMwGjJYa4jAzG0haMmG/mdlAUO8hjpRsbirwbETsWUsZDtBmZuQyxHEs8DAwrNYC2urXFjOz1hVVbOVIWhvYAzi7P21ygDYzAzqJijdJYyVNLdnGdivu58Dx9LNj7iEOMzOqu0kYEeOB8T0dk7Qn8EJETJM0uj9tcoA2M6OuY9A7AHtL2h1YBhgm6YKIOLjagjzEYWZGNouj0q0vEXFiRKwdESOB/YGbawnO4B60mRmQjUEXjQO0mRn55OKIiFuBW2u93gHazAwv9TYzK6wOD3GYmRWTe9BmZgXlm4RmZgVVvPDsAG1mBniIw8yssHyT0MysoIo4Bu2l3gX3qU+O5sEHJvPIQ7dz/HFHNrs51iQn/eAMdtpjf8YcfMTb+376y7PZ64Cvsu8X/4tjTjyVefNfa2ILW189043WiwN0gbW1tfGLM7/PnnsdzAc325n99hvDBz6wfrObZU0wZvdP8JszvveufdttvQWXT/gNl5//a0auM4KzJ1zUpNYtGapJN9ooDtAFts3WW/Cvfz3JzJlPs2jRIi6++Er23utTzW6WNcFWm3+Q4cOGvmvfDh/ekkGD2gHYdOP38/wLLzajaUuMziq2RnGALrC1RqzBM7Nmv/1+1rNzWGutNZrYIiuqy6+5gY9st3Wzm9HSoor/GiWXm4SSPt3X8Yi4rJfrxgJjAdQ+nLa25XJoXeuQ/jOvYUTxbmRYc/32vEm0t7ez5yd3bnZTWtpAmsWxV/p3NWB74Ob0fmeyzE49BujSpxQMWnpE8b5bDfbsrDmss/Zab79fe8SazJnzfBNbZEVz5bU3MvmOuzn7Fz/s8Re6VW7AzIOOiEMAJF0NbBQRc9L7NYGz8qhzSTRl6gze975RjBy5Ds8++xyf//w+fOGLnslhmdvvmsrvJ17Cub/8MUOWWabZzWl5nQX86zTvedAju4Jz8jywQc51LjE6Ojo49usnce01F9Le1sa5513EQw891uxmWRMcN+5HTLnnPl55ZR67jDmYrx32Bc6ecBFvLVrEV7/+HSC7UTju+KOb3NLWVbzwDMpzTFPSL4H1gUlkX//+wD8jouynyEMc1pOFs29rdhOsgJZaZb1+j+8c+J59K445Fz51eUPGk3LtQUfEUZL2BXZKu8ZHxOV51mlmVotGzs6oVCOWek8H5kfEXyUtK2loRMxvQL1mZhVbXMAAnes8aElfBS4Ffpt2jQCuyLNOM7NaFHEedN4LVY4EdgDmAUTE42RT78zMCqWIKwnzHuJ4MyLe6pqfKWkQxbxZamYDXBEXgeUdoP8m6dvAEEmfAL4GXJVznWZmVRuI6UZPAOYC9wOHA9cCJ+Vcp5lZ1TqIirdGyXuaXSfwu7SZmRVWEXvQeSVLup8+xpojYtM86jUzq9VAGoPeM/3blThiQvr3IGBBTnWamdVsICVLegpA0g4RsUPJoRMk3QGcmke9Zma1qtf8ZknrAOcDa5DF/fERcWYtZeV9k3A5SR/peiNpe2BgJ3k2s0Kq4yOvFgPfjIgPANsCR0raqJY25T3N7jDgHEnD0/tXgENzrtPMrGodUZ9BjpTBc056PV/Sw2SrqB+qtqy8Z3FMAzaTNIwsc96redZnZlarPJZwSxoJbAH8o5brcw3QkgYDnwFGAoO6VhRGhMegzaxQqknYX/p4vmR8eiJU6TnLA38Cvh4R82ppU95DHFcCrwLTgDdzrsvMrGbV9J9LH8/XE0lLkQXnib09g7USeQfotSNi15zrMDPrt3otVFE2VPB74OGIOKM/ZeU9i+Pvkj6Ycx1mZv1Wx1kcOwBfAD4maUbadq+lTXn3oD8CfFnSTLIhDgHhlYRmVjR1nMVxO1ms67e8A/RuOZdvZlYXA+6RVyUrClcD/Fx4MyusIubiyPuRV3tLehyYCfwNeBL4S551mpnVoo5j0HWT903C08iWOj4WEaOAXYA7cq7TzKxqEVHx1ih5B+hFEfES0CapLSJuATbPuU4zs6p10Fnx1ih53yR8Ja2mmQxMlPQCWSIRM7NCqWYlYaPk3YPeB1gIfAO4DvgXsFfOdZqZVS2q+K9R8p7F8XrJ2/PyrMvMrD+K2IPO65FX8+l5aXvXQpVhedRrZlarATMPOiKG5lGumVleBkwP2sys1dRrqXc9OUCbmTGAhjjMzFpNuAdtZlZMjVzCXSkHaDMzipksyQHazAz3oM3MCquj02PQZmaF5FkcZmYF5TFoM7OC8hi0mVlBuQdtZlZQvkloZlZQHuIwMysoD3GYmRWU042amRWU50GbmRWUe9BmZgXVWcB0o3k/1dvMrCVERMVbOZJ2lfSopH9KOqHWNrkHbWZG/WZxSGoHzgI+AcwCpkj6c0Q8VG1Z7kGbmQFRxVbGNsA/I+KJiHgL+COwTy1tKmwPevFbz6rZbSgKSWMjYnyz22HF4s9FfVUTcySNBcaW7Bpf8rMYATxTcmwW8OFa2uQedGsYW/4UG4D8uWiSiBgfEVuVbKW/KHsK9DWNnzhAm5nV1yxgnZL3awOzaynIAdrMrL6mAOtLGiVpaWB/4M+1FFTYMWh7F48zWk/8uSigiFgs6SjgeqAdOCciHqylLBUxQYiZmXmIw8yssBygzcwKygG6HyR9V9L/pNenSvp4D+eMlnR1ner7dh/HnpS0Sp3qea0e5Vht6vX9lzRS0gP1KMuawwG6TiLi5Ij4a87V9BqgzWzJ4wBdJUnfSUlQ/gpsWLL/XEmfTa93lfSIpNuBT/dSzpclXSbpOkmPS/pxybEDJN0v6QFJp6d9PwKGSJohaWKZNl4haZqkB9OKp679r0n6vqR7Jd0lafW0f5SkOyVNkXRaP749VkeSlpd0k6Tp6fOwT9o/UtLDkn6XfsY3SBqSjm2Zfr53Akc29QuwfnOAroKkLcnmNG5BFni37uGcZYDfAXsBOwJr9FHk5sB+wAeB/SStI2kt4HTgY+n41pLGRMQJwMKI2DwiDirT1EMjYktgK+AYSSun/csBd0XEZsBk4Ktp/5nAryNia+C5MmVb47wB7BsRHwJ2Bn4mqWuV2vrAWRGxMfAK8Jm0/w/AMRGxXaMba/XnAF2dHYHLI2JBRMyj58nn7wdmRsTjkc1hvKCP8m6KiFcj4g3gIeA9ZEH/1oiYGxGLgYnATlW28xhJ9wJ3ka1oWj/tfwvoGg+fBoxMr3cAJqXXE6qsy/Ij4AeS7gP+SpbjYfV0bGZEzEivpwEjJQ0HVoiIv6X9/lm2OC9UqV4lE8crnVz+ZsnrDrKfR7+SREkaDXwc2C4iFki6FVgmHV4U70x876qviyfEF89BwKrAlhGxSNKTvPOz7P7ZGUL22fHPcQniHnR1JgP7ShoiaSjZMEZ3jwCjJL03vT+gyjr+AXxU0iopr+wBQFePaJGkpcpcPxz4dwrO7we2raDOO8iGbiALClYMw4EXUnDemewvrF5FxCvAq5I+knb5Z9niHKCrEBHTgYuAGcCfgNt6OOcNsixj16SbhE9VWccc4ETgFuBeYHpEXJkOjwfuK3OT8DpgUPqz+DSyYY5yjgWOlDSFLChYMUwEtpI0lSzYPlLBNYcAZ6WbhAvzbJzlz0u9zcwKyj1oM7OCcoA2MysoB2gzs4JygDYzKygHaDOzgnKAtv8gqSPl/HhA0iWSlu1HWaU5Ss6WtFEf546WtH0NdfSYya+SDH/VZo4rzWBoljcHaOtJV86PTciWhx9RejAtoKlaRHwlIh7q45TRQNUB2mxJ5QBt5dwGvC/1bm+RdCFwv6R2ST9JGfDuk3Q4gDK/lPSQpGuA1boKknSrpK3S611TlrZ7U8a2kWS/CL6Reu87SlpV0p9SHVMk7ZCuXTllcLtH0m+pYHl8bxn+0rGfpbbcJGnVtO+9yjINTpN0W1qV2b3MY9LXeZ+kP9b4/TXrlXNxWK8kDQJ2I1udCLANsElEzExB7tWI2FrSYOAOSTeQZfrbkCxD3+pkSaDO6VbuqmQZ/3ZKZa0UES9L+g3wWkT8NJ13IfB/EXG7pHXJHsL5AWAccHtEnCppD7KVm+UcmuoYAkyR9KeIeIksw9/0iPimpJNT2UeRrdo8IiIel/Rh4FdkGQZLnQCMiog3Ja1QyffUrBoO0NaTIZJmpNe3Ab8nG3q4OyJmpv2fBDbtGl8mWyK+PlnmvUkR0QHMlnRzD+VvC0zuKisiXu6lHR8HNnonwybDUg6UnUh5tiPiGkn/ruBrOkbSvul1V4a/l4BOsuX7kGUevEzS8unrvaSk7sE9lHkfMFHSFcAVFbTBrCoO0NaThRGxeemOFKheL90FHB0R13c7b3fKZ1SrNOtaG1lWvnfllEhtqThHQZkMf91FqveV7t+DHuxB9stib+B/JW2cUsSa1YXHoK1W1wP/1ZVdT9IGkpYjy/i3fxqjXpMs0Xx3d5Jl7BuVrl0p7Z8PDC057way4QbSeZunl5NJmdok7QasWKatfWX4awO6/go4kGzoZB4wU9LnUh2StFlpgZLagHUi4hbgeGAFYPky7TCrinvQVquzyRL+T1fWpZ0LjAEuJxurvR94jHdSpb4tIuamMezLUqB7AfgEcBVwqbJHOx0NHEOWme0+ss/qZLIbiacAkyRNT+U/Xaat1wFHpHIe5d0Z/l4HNpY0DXiV7Ak3kP0C+LWkk4ClgD+SZRfs0g5coCxJvsjGyl8p0w6zqjibnZlZQXmIw8ysoBygzcwKygHazKygHKDNzArKAdrMrKAcoM3MCsoB2sysoP4/GZXC9mFMgIMAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"yhat=svm_cv.predict(X_test)\n", | |
"plot_confusion_matrix(Y_test,yhat)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 8\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a decision tree classifier object then create a <code>GridSearchCV</code> object <code>tree_cv</code> with cv = 10. Fit the object to find the best parameters from the dictionary <code>parameters</code>.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"parameters = {'criterion': ['gini', 'entropy'],\n", | |
" 'splitter': ['best', 'random'],\n", | |
" 'max_depth': [2*n for n in range(1,10)],\n", | |
" 'max_features': ['auto', 'sqrt'],\n", | |
" 'min_samples_leaf': [1, 2, 4],\n", | |
" 'min_samples_split': [2, 5, 10]}\n", | |
"\n", | |
"tree = DecisionTreeClassifier()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": { | |
"tags": [] | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"GridSearchCV(cv=10, error_score='raise-deprecating',\n", | |
" estimator=DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n", | |
" max_features=None, max_leaf_nodes=None,\n", | |
" min_impurity_decrease=0.0, min_impurity_split=None,\n", | |
" min_samples_leaf=1, min_samples_split=2,\n", | |
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n", | |
" splitter='best'),\n", | |
" fit_params=None, iid='warn', n_jobs=None,\n", | |
" param_grid={'criterion': ['gini', 'entropy'], 'splitter': ['best', 'random'], 'max_depth': [2, 4, 6, 8, 10, 12, 14, 16, 18], 'max_features': ['auto', 'sqrt'], 'min_samples_leaf': [1, 2, 4], 'min_samples_split': [2, 5, 10]},\n", | |
" pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',\n", | |
" scoring=None, verbose=0)" | |
] | |
}, | |
"execution_count": 24, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"tree_cv = GridSearchCV(tree,parameters,cv=10)\n", | |
"tree_cv.fit(X_train, Y_train)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"tuned hpyerparameters :(best parameters) {'criterion': 'entropy', 'max_depth': 12, 'max_features': 'sqrt', 'min_samples_leaf': 4, 'min_samples_split': 2, 'splitter': 'random'}\n", | |
"accuracy : 0.8888888888888888\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"tuned hpyerparameters :(best parameters) \",tree_cv.best_params_)\n", | |
"print(\"accuracy :\",tree_cv.best_score_)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 9\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Calculate the accuracy of tree_cv on the test data using the method <code>score</code>:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"test set accuracy : 0.8333333333333334\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"test set accuracy :\",tree_cv.score(X_test, Y_test))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can plot the confusion matrix\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWgAAAEWCAYAAABLzQ1kAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfwUlEQVR4nO3dd5xdVbn/8c93JhACJKG3ACYooIAUKVIEg1joBBvVAmjgStPrBUG5RMCGhSv+xBIRgRAiRYoUKVIMIEgKoTcllJAAAYQEEiCZeX5/7DVwGGfmlDn7nH0y3zev/co5u6y1ZubwzJq113q2IgIzMyuetmY3wMzMeuYAbWZWUA7QZmYF5QBtZlZQDtBmZgXlAG1mVlAO0NZvkoZIukrSq5Iu6Uc5B0m6oZ5tawZJf5H0pWa3w1qfA/QAIulASVMlvSZpTgokH6lD0Z8FVgdWjojP1VpIREyMiE/WoT3vImm0pJB0Wbf9m6X9t1ZYznclXVDuvIjYLSLOq7G5Zm9zgB4gJP038HPgB2TBdF3gV8A+dSj+PcBjEbG4DmXlZS6wvaSVS/Z9CXisXhUo4/+nrG78YRoAJA0HTgWOjIjLIuL1iFgUEVdFxHHpnMGSfi5pdtp+LmlwOjZa0ixJ35T0Qup9H5KOnQKcDOyXeuaHde9pShqZeqqD0vsvS3pC0nxJMyUdVLL/9pLrtpc0JQ2dTJG0fcmxWyWdJumOVM4Nklbp49vwFnAFsH+6vh34PDCx2/fqTEnPSJonaZqkHdP+XYFvl3yd95a04/uS7gAWAOulfV9Jx38t6dKS8k+XdJMkVfrzs4HLAXpg2A5YBri8j3O+A2wLbA5sBmwDnFRyfA1gODACOAw4S9KKETGOrFd+UUQsHxG/76shkpYDfgHsFhFDge2BGT2ctxJwTTp3ZeAM4JpuPeADgUOA1YClgf/pq27gfOCL6fWngAeB2d3OmUL2PVgJuBC4RNIyEXFdt69zs5JrvgCMBYYCT3Ur75vApumXz45k37svhXMsWAUcoAeGlYEXywxBHAScGhEvRMRc4BSywNNlUTq+KCKuBV4DNqyxPZ3AJpKGRMSciHiwh3P2AB6PiAkRsTgiJgGPAHuVnPOHiHgsIhYCF5MF1l5FxN+BlSRtSBaoz+/hnAsi4qVU58+AwZT/Os+NiAfTNYu6lbcAOJjsF8wFwNERMatMeWaAA/RA8RKwStcQQy/W4t29v6fSvrfL6BbgFwDLV9uQiHgd2A84Apgj6RpJ76+gPV1tGlHy/rka2jMBOArYmR7+okjDOA+nYZVXyP5q6GvoBOCZvg5GxN3AE4DIfpGYVcQBemC4E3gDGNPHObPJbvZ1WZf//PO/Uq8Dy5a8X6P0YERcHxGfANYk6xX/roL2dLXp2Rrb1GUC8DXg2tS7fVsagvgW2dj0ihGxAvAqWWAF6G1Yos/hCklHkvXEZwPH19xyG3AcoAeAiHiV7EbeWZLGSFpW0lKSdpP043TaJOAkSaumm20nk/1JXosZwE6S1k03KE/sOiBpdUl7p7HoN8mGSjp6KONaYIM0NXCQpP2AjYCra2wTABExE/go2Zh7d0OBxWQzPgZJOhkYVnL8eWBkNTM1JG0AfI9smOMLwPGSNq+t9TbQOEAPEBFxBvDfZDf+5pL9WX4U2cwGyILIVOA+4H5getpXS103Ahelsqbx7qDaRnbjbDbwMlmw/FoPZbwE7JnOfYms57lnRLxYS5u6lX17RPT018H1wF/Ipt49RfZXR+nwRdcinJckTS9XTxpSugA4PSLujYjHyWaCTOiaIWPWF/lmsplZMbkHbWZWUA7QZmZ1JumctKjrgZJ9P5H0iKT7JF0uaYVy5ThAm5nV37nArt323QhsEhGbkt3nOLH7Rd05QJuZ1VlETCa7CV6674aStQR3AWuXK6evhQtNdejIz/rupZlV5JwnL+13bpNFLz5RccxZetX3Hk62vL/L+IgYX0V1h5LNdOpTYQO0mVlDdfY0Hb9nKRhXE5DfJuk7ZPPtJ5Y71wHazAwgOnOvQtmDHPYEdqkkYZYDtJkZQGe+ATqlrP0W8NHuaQZ64wBtZgZEHXvQkiYBo8mSlM0CxpHN2hgM3JjSgd8VEUf0VY4DtJkZQEf9HggUEQf0sLvPXOk9cYA2M4OqbhI2igO0mRk05CZhtRygzcwg95uEtXCANjOjvjcJ68UB2swM3IM2MyusjkXlz2kwB2gzM/BNQjOzwvIQh5lZQbkHbWZWUO5Bm5kVU3T6JqGZWTG5B21mVlAegzYzKygnSzIzKyj3oM3MCspj0GZmBVXHhP314gBtZgbuQZuZFVWEbxKamRWTe9BmZgXlWRxmZgXlHrSZWUF5FoeZWUF5iMPMrKA8xGFmVlAO0GZmBVXAIY62ZjfAzKwQOhZXvpUh6RxJL0h6oGTfSpJulPR4+nfFcuU4QJuZQTbEUelW3rnArt32nQDcFBHrAzel931ygDYzg2yIo9KtXFERk4GXu+3eBzgvvT4PGFOuHI9Bm5lBI24Srh4RcwAiYo6k1cpd4ABtZgZVBWhJY4GxJbvGR8T4ejfJAdrMDCCiilNjPFBtQH5e0pqp97wm8EK5CzwGbWYGsHhx5Vtt/gx8Kb3+EnBluQvcgzYzg7rOg5Y0CRgNrCJpFjAO+BFwsaTDgKeBz5UrxwHazAzqepMwIg7o5dAu1ZTjAG1mBlWNQTeKA7SZGTgXh5lZYTlAm5kVU3T4obFmZsXkHrSZWUEVMN2oA7SZGUCnZ3GYmRWThzjMzArKNwmtGoMGL8UJF53KUoOXoq29nal/uZMr/+/iZjfLmsyfi5y4B23VWPzmIn5y4Cm8ueAN2ge1c+Kl3+P+W+/hiXseb3bTrIn8ucjJQBiDljQf6PUrjYhh9a5zSfbmgjcAaB/UTvug9j6+szaQ+HORg4EwiyMihgJIOhV4DpgACDgIGFrv+pZ0amtj3NWns9p71uDmCdfzxAz3ksyfi1wUsAedZz7oT0XEryJifkTMi4hfA5/p6wJJYyVNlTT10flP5Ni01hGdnXx39+P45naHM2qz9zFig3Wa3SQrAH8u6i86OyveGiXPAN0h6SBJ7ZLaJB0E9HmbNCLGR8RWEbHVhkPXy7FprWfhvAU8eteDbPLRLZrdFCsQfy7qqKOj8q1B8gzQBwKfB55P2+fSPqvQ0JWGMWTYsgAsNXhpNtphU57717NNbpU1mz8XOemMyrcGyW0WR0Q8SfaYcavR8NVW5LCfHUVbWxtqE1Ou+Tv33jyt2c2yJvPnIicDaZqdpFWBrwIjS+uJiEPzqnNJM+uRpzhlj+Oa3QwrGH8uclLAm4R5zoO+ErgN+Ctlxp7NzJpuIEyzK7FsRHwrx/LNzOpngPWgr5a0e0Rcm2MdZmZ1EYuL94d+ngH6WODbkt4EFpEtVgmvJDSzQhpIPeiuFYVmZi1hgI1BI2lFYH1gma59ETE5zzrNzGoykHrQkr5CNsyxNjAD2Ba4E/hYXnWamdUqChig81xJeCywNfBUROwMbAHMzbE+M7PaLe6ofGuQPIc43oiINyQhaXBEPCJpwxzrMzOrXQF70HkG6FmSVgCuAG6U9G9gdo71mZnVbiAF6IjYN738rqRbgOHAdXnVZ2bWHxH1C9CSvgF8hexRCvcDh0TEG9WWk8cTVVbqYff96d/lgZfrXaeZWb/VqQctaQRwDLBRRCyUdDGwP3ButWXl0YOeRvZbQyX7ut4H4ETPZlY89R3iGAQMkbQIWJYah3fzeOTVqHqXaWaWt1hc+UIVSWOBsSW7xkfEeICIeFbST4GngYXADRFxQy1t8lO9zcwAqlhImILx+J6OpQV6+wCjgFeASyQdHBEXVNukPOdBm5m1jOiMircyPg7MjIi5EbEIuAzYvpY2uQdtZgb1HIN+GthW0rJkQxy7AFNrKSi3HrSkCZXsMzMrhM4qtj5ExD+AS4HpZDPY2uhlOKScPHvQG5e+kdQObJljfWZmNatnLo6IGAeM6285ecyDPhH4NtkUk3m8M93uLWr8LWJmlrdYXLyVhHUf4oiIH6Zc0D+JiGERMTRtK0fEifWuz8ysLuo0xFFPeS71PlHS3sBOadetEXF1XvWZmfVHAfP155oP+ofANsDEtOtYSTu4F21mhTSQAjSwB7B5RPZ7SdJ5wD2AA7SZFU7L96DTCpl1IuK+Ci9ZgXeSIw2vpi4zs0aKxc1uwX8qG6Al3Qrsnc6dAcyV9LeI+O8yl/4QuCelGhXZWLR7z2ZWSK3agx4eEfPSMwb/EBHjJJXtQUfEpBTctyYL0N+KiOf611wzs3wUMUBXMs1ukKQ1gc8D1c7CaANeBP4NbCBppzLnm5k1R6jyrUEq6UGfClwP3B4RUyStBzxe7iJJpwP7AQ/yzv3RACbX2FYzs9wUsQddNkBHxCXAJSXvnwA+U0HZY4ANI+LNmltnZtYg0dm4nnGleg3Qkv4fWY+3RxFxTJmynwCWAhygzazwOjtaKEBTY3q8EguAGZJuoiRIVxDYzcwarqWGOCLivNL3kpaLiNerKPvPaTMzK7yWGuLoImk74PdkT+ReV9JmwOER8bW+ruse4M3MiiyKl8yuoml2Pwc+BbwEEBH38k4CJDOzJUJ0quKtUSpa6h0Rz0jvalRHPs0xM2uOVrtJ2OUZSdsDIWlp4Bjg4XybZWbWWC05Bg0cAZwJjACeJVu0cmRvJ0u6ir6n5+1dZRvNzHIXDVwhWKlKFqq8CBxURZk/Tf9+GlgDuCC9PwB4sprGmZk1SktNs+uSlnafCWxL1jO+E/hGWlH4HyLib+m60yKi9GbiVZK8zNvMCqmzgD3oSmZxXAhcDKwJrEW27HtSBdetmoI7AJJGAavW0kgzs7xFqOKtUSoZg1ZETCh5f4Gkoyq47hvArZK6etojgcOrbJ+ZWUO01CwOSSull7dIOgH4I9kQx37ANeUKjojrJK0PvD/tesSJk8ysqFptFsc0soDc1erS3m8Ap/V0kaSPRcTNkj7d7dB7JRERl9XcWjOznBRxDLqvXByjaizzo8DNwF49FQs4QJtZ4bTkNDsASZsAGwHLdO2LiPN7OjcixqV/D6lHA83MGqGIuTgqmWY3DhhNFqCvBXYDbgd6DNCS+nyYbEScUXUrzcxyVs8hDkkrAGcDm5CNHBwaEXdWW04lPejPApsB90TEIZJWTxX3Zmj6d0OyB8Z2pRzdCz/uyswKqrO+NwnPBK6LiM+mFBnL1lJIJQF6YUR0SlosaRjwArBebydHxCkAkm4APhQR89P771Ly6CwzsyKpVw86xcmdgC8DRMRbwFu1lFVJgJ6auuu/I5vZ8RpwdwXXrdutUW+RzYWuyPmzq/5rwAaAhbNva3YTbAlVzU1CSWOBsSW7xkfE+PR6PWAu8IeUP38acGyVDzwBKsvF0ZWY/zeSrgOGRcR9FZQ9Abhb0uVkYzD7Ak7ib2aFVE0POgXj8b0cHgR8CDg6Iv4h6UzgBOB/q21TXwtVPtTXsYiY3lfBEfF9SX8Bdky7DomIe6ptoJlZI9RxEscsYFZE/CO9v5QsQFetrx70z/o4FsDHyhWegnifgdzMrAg6OitJTVReRDwn6RlJG0bEo8AuwEO1lNXXQpWda22gmVmrqXO20aOBiWkGxxNATetCKlqoYma2pAvqN80uImYAW/W3HAdoMzOgsxVXEpqZDQSddexB10vZUXFlDpZ0cnq/rqRt8m+amVnjBKp4a5RKblv+CtiO7JmCAPOBs3JrkZlZE3SgirdGqWSI48MR8SFJ9wBExL/TnUkzsyVGAZ8ZW1GAXiSpnTSPW9KqFPNrMTOrWRGDWiVDHL8ALgdWk/R9slSjP8i1VWZmDVbEMehKcnFMlDSNbDWMgDER8XDuLTMza6ACPpKwooT96wILgKtK90XE03k2zMyskYo4za6SMehreOfhscsAo4BHgY1zbJeZWUN1NLsBPahkiOODpe9TlrvDezndzKwldao1e9DvEhHTJW2dR2PMzJqlgCu9KxqDLn0IbBtZIuq5ubXIzKwJijjNrpIe9NCS14vJxqT/lE9zzMyao+VmcaQFKstHxHENao+ZWVM0cgl3pfp65NWgiFjc16OvzMyWFK3Wg76bbLx5hqQ/A5cAbz+VNiIuy7ltZmYN06pj0CsBL5E9g7BrPnQADtBmtsRotVkcq6UZHA/wTmDuUsSvxcysZq02xNEOLA89jpw7QJvZEqXVhjjmRMSpDWuJmVkTdbRYD7qAzTUzy0er9aB3aVgrzMyarKUCdES83MiGmJk1UxFvrFWdLMnMbEnUarM4zMwGjJYa4jAzG0haMmG/mdlAUO8hjpRsbirwbETsWUsZDtBmZuQyxHEs8DAwrNYC2urXFjOz1hVVbOVIWhvYAzi7P21ygDYzAzqJijdJYyVNLdnGdivu58Dx9LNj7iEOMzOqu0kYEeOB8T0dk7Qn8EJETJM0uj9tcoA2M6OuY9A7AHtL2h1YBhgm6YKIOLjagjzEYWZGNouj0q0vEXFiRKwdESOB/YGbawnO4B60mRmQjUEXjQO0mRn55OKIiFuBW2u93gHazAwv9TYzK6wOD3GYmRWTe9BmZgXlm4RmZgVVvPDsAG1mBniIw8yssHyT0MysoIo4Bu2l3gX3qU+O5sEHJvPIQ7dz/HFHNrs51iQn/eAMdtpjf8YcfMTb+376y7PZ64Cvsu8X/4tjTjyVefNfa2ILW189043WiwN0gbW1tfGLM7/PnnsdzAc325n99hvDBz6wfrObZU0wZvdP8JszvveufdttvQWXT/gNl5//a0auM4KzJ1zUpNYtGapJN9ooDtAFts3WW/Cvfz3JzJlPs2jRIi6++Er23utTzW6WNcFWm3+Q4cOGvmvfDh/ekkGD2gHYdOP38/wLLzajaUuMziq2RnGALrC1RqzBM7Nmv/1+1rNzWGutNZrYIiuqy6+5gY9st3Wzm9HSoor/GiWXm4SSPt3X8Yi4rJfrxgJjAdQ+nLa25XJoXeuQ/jOvYUTxbmRYc/32vEm0t7ez5yd3bnZTWtpAmsWxV/p3NWB74Ob0fmeyzE49BujSpxQMWnpE8b5bDfbsrDmss/Zab79fe8SazJnzfBNbZEVz5bU3MvmOuzn7Fz/s8Re6VW7AzIOOiEMAJF0NbBQRc9L7NYGz8qhzSTRl6gze975RjBy5Ds8++xyf//w+fOGLnslhmdvvmsrvJ17Cub/8MUOWWabZzWl5nQX86zTvedAju4Jz8jywQc51LjE6Ojo49usnce01F9Le1sa5513EQw891uxmWRMcN+5HTLnnPl55ZR67jDmYrx32Bc6ecBFvLVrEV7/+HSC7UTju+KOb3NLWVbzwDMpzTFPSL4H1gUlkX//+wD8jouynyEMc1pOFs29rdhOsgJZaZb1+j+8c+J59K445Fz51eUPGk3LtQUfEUZL2BXZKu8ZHxOV51mlmVotGzs6oVCOWek8H5kfEXyUtK2loRMxvQL1mZhVbXMAAnes8aElfBS4Ffpt2jQCuyLNOM7NaFHEedN4LVY4EdgDmAUTE42RT78zMCqWIKwnzHuJ4MyLe6pqfKWkQxbxZamYDXBEXgeUdoP8m6dvAEEmfAL4GXJVznWZmVRuI6UZPAOYC9wOHA9cCJ+Vcp5lZ1TqIirdGyXuaXSfwu7SZmRVWEXvQeSVLup8+xpojYtM86jUzq9VAGoPeM/3blThiQvr3IGBBTnWamdVsICVLegpA0g4RsUPJoRMk3QGcmke9Zma1qtf8ZknrAOcDa5DF/fERcWYtZeV9k3A5SR/peiNpe2BgJ3k2s0Kq4yOvFgPfjIgPANsCR0raqJY25T3N7jDgHEnD0/tXgENzrtPMrGodUZ9BjpTBc056PV/Sw2SrqB+qtqy8Z3FMAzaTNIwsc96redZnZlarPJZwSxoJbAH8o5brcw3QkgYDnwFGAoO6VhRGhMegzaxQqknYX/p4vmR8eiJU6TnLA38Cvh4R82ppU95DHFcCrwLTgDdzrsvMrGbV9J9LH8/XE0lLkQXnib09g7USeQfotSNi15zrMDPrt3otVFE2VPB74OGIOKM/ZeU9i+Pvkj6Ycx1mZv1Wx1kcOwBfAD4maUbadq+lTXn3oD8CfFnSTLIhDgHhlYRmVjR1nMVxO1ms67e8A/RuOZdvZlYXA+6RVyUrClcD/Fx4MyusIubiyPuRV3tLehyYCfwNeBL4S551mpnVoo5j0HWT903C08iWOj4WEaOAXYA7cq7TzKxqEVHx1ih5B+hFEfES0CapLSJuATbPuU4zs6p10Fnx1ih53yR8Ja2mmQxMlPQCWSIRM7NCqWYlYaPk3YPeB1gIfAO4DvgXsFfOdZqZVS2q+K9R8p7F8XrJ2/PyrMvMrD+K2IPO65FX8+l5aXvXQpVhedRrZlarATMPOiKG5lGumVleBkwP2sys1dRrqXc9OUCbmTGAhjjMzFpNuAdtZlZMjVzCXSkHaDMzipksyQHazAz3oM3MCquj02PQZmaF5FkcZmYF5TFoM7OC8hi0mVlBuQdtZlZQvkloZlZQHuIwMysoD3GYmRWU042amRWU50GbmRWUe9BmZgXVWcB0o3k/1dvMrCVERMVbOZJ2lfSopH9KOqHWNrkHbWZG/WZxSGoHzgI+AcwCpkj6c0Q8VG1Z7kGbmQFRxVbGNsA/I+KJiHgL+COwTy1tKmwPevFbz6rZbSgKSWMjYnyz22HF4s9FfVUTcySNBcaW7Bpf8rMYATxTcmwW8OFa2uQedGsYW/4UG4D8uWiSiBgfEVuVbKW/KHsK9DWNnzhAm5nV1yxgnZL3awOzaynIAdrMrL6mAOtLGiVpaWB/4M+1FFTYMWh7F48zWk/8uSigiFgs6SjgeqAdOCciHqylLBUxQYiZmXmIw8yssBygzcwKygG6HyR9V9L/pNenSvp4D+eMlnR1ner7dh/HnpS0Sp3qea0e5Vht6vX9lzRS0gP1KMuawwG6TiLi5Ij4a87V9BqgzWzJ4wBdJUnfSUlQ/gpsWLL/XEmfTa93lfSIpNuBT/dSzpclXSbpOkmPS/pxybEDJN0v6QFJp6d9PwKGSJohaWKZNl4haZqkB9OKp679r0n6vqR7Jd0lafW0f5SkOyVNkXRaP749VkeSlpd0k6Tp6fOwT9o/UtLDkn6XfsY3SBqSjm2Zfr53Akc29QuwfnOAroKkLcnmNG5BFni37uGcZYDfAXsBOwJr9FHk5sB+wAeB/SStI2kt4HTgY+n41pLGRMQJwMKI2DwiDirT1EMjYktgK+AYSSun/csBd0XEZsBk4Ktp/5nAryNia+C5MmVb47wB7BsRHwJ2Bn4mqWuV2vrAWRGxMfAK8Jm0/w/AMRGxXaMba/XnAF2dHYHLI2JBRMyj58nn7wdmRsTjkc1hvKCP8m6KiFcj4g3gIeA9ZEH/1oiYGxGLgYnATlW28xhJ9wJ3ka1oWj/tfwvoGg+fBoxMr3cAJqXXE6qsy/Ij4AeS7gP+SpbjYfV0bGZEzEivpwEjJQ0HVoiIv6X9/lm2OC9UqV4lE8crnVz+ZsnrDrKfR7+SREkaDXwc2C4iFki6FVgmHV4U70x876qviyfEF89BwKrAlhGxSNKTvPOz7P7ZGUL22fHPcQniHnR1JgP7ShoiaSjZMEZ3jwCjJL03vT+gyjr+AXxU0iopr+wBQFePaJGkpcpcPxz4dwrO7we2raDOO8iGbiALClYMw4EXUnDemewvrF5FxCvAq5I+knb5Z9niHKCrEBHTgYuAGcCfgNt6OOcNsixj16SbhE9VWccc4ETgFuBeYHpEXJkOjwfuK3OT8DpgUPqz+DSyYY5yjgWOlDSFLChYMUwEtpI0lSzYPlLBNYcAZ6WbhAvzbJzlz0u9zcwKyj1oM7OCcoA2MysoB2gzs4JygDYzKygHaDOzgnKAtv8gqSPl/HhA0iWSlu1HWaU5Ss6WtFEf546WtH0NdfSYya+SDH/VZo4rzWBoljcHaOtJV86PTciWhx9RejAtoKlaRHwlIh7q45TRQNUB2mxJ5QBt5dwGvC/1bm+RdCFwv6R2ST9JGfDuk3Q4gDK/lPSQpGuA1boKknSrpK3S611TlrZ7U8a2kWS/CL6Reu87SlpV0p9SHVMk7ZCuXTllcLtH0m+pYHl8bxn+0rGfpbbcJGnVtO+9yjINTpN0W1qV2b3MY9LXeZ+kP9b4/TXrlXNxWK8kDQJ2I1udCLANsElEzExB7tWI2FrSYOAOSTeQZfrbkCxD3+pkSaDO6VbuqmQZ/3ZKZa0UES9L+g3wWkT8NJ13IfB/EXG7pHXJHsL5AWAccHtEnCppD7KVm+UcmuoYAkyR9KeIeIksw9/0iPimpJNT2UeRrdo8IiIel/Rh4FdkGQZLnQCMiog3Ja1QyffUrBoO0NaTIZJmpNe3Ab8nG3q4OyJmpv2fBDbtGl8mWyK+PlnmvUkR0QHMlnRzD+VvC0zuKisiXu6lHR8HNnonwybDUg6UnUh5tiPiGkn/ruBrOkbSvul1V4a/l4BOsuX7kGUevEzS8unrvaSk7sE9lHkfMFHSFcAVFbTBrCoO0NaThRGxeemOFKheL90FHB0R13c7b3fKZ1SrNOtaG1lWvnfllEhtqThHQZkMf91FqveV7t+DHuxB9stib+B/JW2cUsSa1YXHoK1W1wP/1ZVdT9IGkpYjy/i3fxqjXpMs0Xx3d5Jl7BuVrl0p7Z8PDC057way4QbSeZunl5NJmdok7QasWKatfWX4awO6/go4kGzoZB4wU9LnUh2StFlpgZLagHUi4hbgeGAFYPky7TCrinvQVquzyRL+T1fWpZ0LjAEuJxurvR94jHdSpb4tIuamMezLUqB7AfgEcBVwqbJHOx0NHEOWme0+ss/qZLIbiacAkyRNT+U/Xaat1wFHpHIe5d0Z/l4HNpY0DXiV7Ak3kP0C+LWkk4ClgD+SZRfs0g5coCxJvsjGyl8p0w6zqjibnZlZQXmIw8ysoBygzcwKygHazKygHKDNzArKAdrMrKAcoM3MCsoB2sysoP4/GZXC9mFMgIMAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"yhat = svm_cv.predict(X_test)\n", | |
"plot_confusion_matrix(Y_test,yhat)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 10\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a k nearest neighbors object then create a <code>GridSearchCV</code> object <code>knn_cv</code> with cv = 10. Fit the object to find the best parameters from the dictionary <code>parameters</code>.\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"parameters = {'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n", | |
" 'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'],\n", | |
" 'p': [1,2]}\n", | |
"\n", | |
"KNN = KNeighborsClassifier()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"GridSearchCV(cv=10, error_score='raise-deprecating',\n", | |
" estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',\n", | |
" metric_params=None, n_jobs=None, n_neighbors=5, p=2,\n", | |
" weights='uniform'),\n", | |
" fit_params=None, iid='warn', n_jobs=None,\n", | |
" param_grid={'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'], 'p': [1, 2]},\n", | |
" pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',\n", | |
" scoring=None, verbose=0)" | |
] | |
}, | |
"execution_count": 29, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"knn_cv = GridSearchCV(KNN,parameters,cv=10)\n", | |
"knn_cv.fit(X_train, Y_train)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 30, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"tuned hpyerparameters :(best parameters) {'algorithm': 'auto', 'n_neighbors': 9, 'p': 1}\n", | |
"accuracy : 0.8472222222222222\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"tuned hpyerparameters :(best parameters) \",knn_cv.best_params_)\n", | |
"print(\"accuracy :\",knn_cv.best_score_)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 11\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Calculate the accuracy of tree_cv on the test data using the method <code>score</code>:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 31, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"test set accuracy : 0.8333333333333334\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\"test set accuracy :\",knn_cv.score(X_test, Y_test))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can plot the confusion matrix\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 32, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWgAAAEWCAYAAABLzQ1kAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfwUlEQVR4nO3dd5xdVbn/8c93JhACJKG3ACYooIAUKVIEg1joBBvVAmjgStPrBUG5RMCGhSv+xBIRgRAiRYoUKVIMIEgKoTcllJAAAYQEEiCZeX5/7DVwGGfmlDn7nH0y3zev/co5u6y1ZubwzJq113q2IgIzMyuetmY3wMzMeuYAbWZWUA7QZmYF5QBtZlZQDtBmZgXlAG1mVlAO0NZvkoZIukrSq5Iu6Uc5B0m6oZ5tawZJf5H0pWa3w1qfA/QAIulASVMlvSZpTgokH6lD0Z8FVgdWjojP1VpIREyMiE/WoT3vImm0pJB0Wbf9m6X9t1ZYznclXVDuvIjYLSLOq7G5Zm9zgB4gJP038HPgB2TBdF3gV8A+dSj+PcBjEbG4DmXlZS6wvaSVS/Z9CXisXhUo4/+nrG78YRoAJA0HTgWOjIjLIuL1iFgUEVdFxHHpnMGSfi5pdtp+LmlwOjZa0ixJ35T0Qup9H5KOnQKcDOyXeuaHde9pShqZeqqD0vsvS3pC0nxJMyUdVLL/9pLrtpc0JQ2dTJG0fcmxWyWdJumOVM4Nklbp49vwFnAFsH+6vh34PDCx2/fqTEnPSJonaZqkHdP+XYFvl3yd95a04/uS7gAWAOulfV9Jx38t6dKS8k+XdJMkVfrzs4HLAXpg2A5YBri8j3O+A2wLbA5sBmwDnFRyfA1gODACOAw4S9KKETGOrFd+UUQsHxG/76shkpYDfgHsFhFDge2BGT2ctxJwTTp3ZeAM4JpuPeADgUOA1YClgf/pq27gfOCL6fWngAeB2d3OmUL2PVgJuBC4RNIyEXFdt69zs5JrvgCMBYYCT3Ur75vApumXz45k37svhXMsWAUcoAeGlYEXywxBHAScGhEvRMRc4BSywNNlUTq+KCKuBV4DNqyxPZ3AJpKGRMSciHiwh3P2AB6PiAkRsTgiJgGPAHuVnPOHiHgsIhYCF5MF1l5FxN+BlSRtSBaoz+/hnAsi4qVU58+AwZT/Os+NiAfTNYu6lbcAOJjsF8wFwNERMatMeWaAA/RA8RKwStcQQy/W4t29v6fSvrfL6BbgFwDLV9uQiHgd2A84Apgj6RpJ76+gPV1tGlHy/rka2jMBOArYmR7+okjDOA+nYZVXyP5q6GvoBOCZvg5GxN3AE4DIfpGYVcQBemC4E3gDGNPHObPJbvZ1WZf//PO/Uq8Dy5a8X6P0YERcHxGfANYk6xX/roL2dLXp2Rrb1GUC8DXg2tS7fVsagvgW2dj0ihGxAvAqWWAF6G1Yos/hCklHkvXEZwPH19xyG3AcoAeAiHiV7EbeWZLGSFpW0lKSdpP043TaJOAkSaumm20nk/1JXosZwE6S1k03KE/sOiBpdUl7p7HoN8mGSjp6KONaYIM0NXCQpP2AjYCra2wTABExE/go2Zh7d0OBxWQzPgZJOhkYVnL8eWBkNTM1JG0AfI9smOMLwPGSNq+t9TbQOEAPEBFxBvDfZDf+5pL9WX4U2cwGyILIVOA+4H5getpXS103Ahelsqbx7qDaRnbjbDbwMlmw/FoPZbwE7JnOfYms57lnRLxYS5u6lX17RPT018H1wF/Ipt49RfZXR+nwRdcinJckTS9XTxpSugA4PSLujYjHyWaCTOiaIWPWF/lmsplZMbkHbWZWUA7QZmZ1JumctKjrgZJ9P5H0iKT7JF0uaYVy5ThAm5nV37nArt323QhsEhGbkt3nOLH7Rd05QJuZ1VlETCa7CV6674aStQR3AWuXK6evhQtNdejIz/rupZlV5JwnL+13bpNFLz5RccxZetX3Hk62vL/L+IgYX0V1h5LNdOpTYQO0mVlDdfY0Hb9nKRhXE5DfJuk7ZPPtJ5Y71wHazAwgOnOvQtmDHPYEdqkkYZYDtJkZQGe+ATqlrP0W8NHuaQZ64wBtZgZEHXvQkiYBo8mSlM0CxpHN2hgM3JjSgd8VEUf0VY4DtJkZQEf9HggUEQf0sLvPXOk9cYA2M4OqbhI2igO0mRk05CZhtRygzcwg95uEtXCANjOjvjcJ68UB2swM3IM2MyusjkXlz2kwB2gzM/BNQjOzwvIQh5lZQbkHbWZWUO5Bm5kVU3T6JqGZWTG5B21mVlAegzYzKygnSzIzKyj3oM3MCspj0GZmBVXHhP314gBtZgbuQZuZFVWEbxKamRWTe9BmZgXlWRxmZgXlHrSZWUF5FoeZWUF5iMPMrKA8xGFmVlAO0GZmBVXAIY62ZjfAzKwQOhZXvpUh6RxJL0h6oGTfSpJulPR4+nfFcuU4QJuZQTbEUelW3rnArt32nQDcFBHrAzel931ygDYzg2yIo9KtXFERk4GXu+3eBzgvvT4PGFOuHI9Bm5lBI24Srh4RcwAiYo6k1cpd4ABtZgZVBWhJY4GxJbvGR8T4ejfJAdrMDCCiilNjPFBtQH5e0pqp97wm8EK5CzwGbWYGsHhx5Vtt/gx8Kb3+EnBluQvcgzYzg7rOg5Y0CRgNrCJpFjAO+BFwsaTDgKeBz5UrxwHazAzqepMwIg7o5dAu1ZTjAG1mBlWNQTeKA7SZGTgXh5lZYTlAm5kVU3T4obFmZsXkHrSZWUEVMN2oA7SZGUCnZ3GYmRWThzjMzArKNwmtGoMGL8UJF53KUoOXoq29nal/uZMr/+/iZjfLmsyfi5y4B23VWPzmIn5y4Cm8ueAN2ge1c+Kl3+P+W+/hiXseb3bTrIn8ucjJQBiDljQf6PUrjYhh9a5zSfbmgjcAaB/UTvug9j6+szaQ+HORg4EwiyMihgJIOhV4DpgACDgIGFrv+pZ0amtj3NWns9p71uDmCdfzxAz3ksyfi1wUsAedZz7oT0XEryJifkTMi4hfA5/p6wJJYyVNlTT10flP5Ni01hGdnXx39+P45naHM2qz9zFig3Wa3SQrAH8u6i86OyveGiXPAN0h6SBJ7ZLaJB0E9HmbNCLGR8RWEbHVhkPXy7FprWfhvAU8eteDbPLRLZrdFCsQfy7qqKOj8q1B8gzQBwKfB55P2+fSPqvQ0JWGMWTYsgAsNXhpNtphU57717NNbpU1mz8XOemMyrcGyW0WR0Q8SfaYcavR8NVW5LCfHUVbWxtqE1Ou+Tv33jyt2c2yJvPnIicDaZqdpFWBrwIjS+uJiEPzqnNJM+uRpzhlj+Oa3QwrGH8uclLAm4R5zoO+ErgN+Ctlxp7NzJpuIEyzK7FsRHwrx/LNzOpngPWgr5a0e0Rcm2MdZmZ1EYuL94d+ngH6WODbkt4EFpEtVgmvJDSzQhpIPeiuFYVmZi1hgI1BI2lFYH1gma59ETE5zzrNzGoykHrQkr5CNsyxNjAD2Ba4E/hYXnWamdUqChig81xJeCywNfBUROwMbAHMzbE+M7PaLe6ofGuQPIc43oiINyQhaXBEPCJpwxzrMzOrXQF70HkG6FmSVgCuAG6U9G9gdo71mZnVbiAF6IjYN738rqRbgOHAdXnVZ2bWHxH1C9CSvgF8hexRCvcDh0TEG9WWk8cTVVbqYff96d/lgZfrXaeZWb/VqQctaQRwDLBRRCyUdDGwP3ButWXl0YOeRvZbQyX7ut4H4ETPZlY89R3iGAQMkbQIWJYah3fzeOTVqHqXaWaWt1hc+UIVSWOBsSW7xkfEeICIeFbST4GngYXADRFxQy1t8lO9zcwAqlhImILx+J6OpQV6+wCjgFeASyQdHBEXVNukPOdBm5m1jOiMircyPg7MjIi5EbEIuAzYvpY2uQdtZgb1HIN+GthW0rJkQxy7AFNrKSi3HrSkCZXsMzMrhM4qtj5ExD+AS4HpZDPY2uhlOKScPHvQG5e+kdQObJljfWZmNatnLo6IGAeM6285ecyDPhH4NtkUk3m8M93uLWr8LWJmlrdYXLyVhHUf4oiIH6Zc0D+JiGERMTRtK0fEifWuz8ysLuo0xFFPeS71PlHS3sBOadetEXF1XvWZmfVHAfP155oP+ofANsDEtOtYSTu4F21mhTSQAjSwB7B5RPZ7SdJ5wD2AA7SZFU7L96DTCpl1IuK+Ci9ZgXeSIw2vpi4zs0aKxc1uwX8qG6Al3Qrsnc6dAcyV9LeI+O8yl/4QuCelGhXZWLR7z2ZWSK3agx4eEfPSMwb/EBHjJJXtQUfEpBTctyYL0N+KiOf611wzs3wUMUBXMs1ukKQ1gc8D1c7CaANeBP4NbCBppzLnm5k1R6jyrUEq6UGfClwP3B4RUyStBzxe7iJJpwP7AQ/yzv3RACbX2FYzs9wUsQddNkBHxCXAJSXvnwA+U0HZY4ANI+LNmltnZtYg0dm4nnGleg3Qkv4fWY+3RxFxTJmynwCWAhygzazwOjtaKEBTY3q8EguAGZJuoiRIVxDYzcwarqWGOCLivNL3kpaLiNerKPvPaTMzK7yWGuLoImk74PdkT+ReV9JmwOER8bW+ruse4M3MiiyKl8yuoml2Pwc+BbwEEBH38k4CJDOzJUJ0quKtUSpa6h0Rz0jvalRHPs0xM2uOVrtJ2OUZSdsDIWlp4Bjg4XybZWbWWC05Bg0cAZwJjACeJVu0cmRvJ0u6ir6n5+1dZRvNzHIXDVwhWKlKFqq8CBxURZk/Tf9+GlgDuCC9PwB4sprGmZk1SktNs+uSlnafCWxL1jO+E/hGWlH4HyLib+m60yKi9GbiVZK8zNvMCqmzgD3oSmZxXAhcDKwJrEW27HtSBdetmoI7AJJGAavW0kgzs7xFqOKtUSoZg1ZETCh5f4Gkoyq47hvArZK6etojgcOrbJ+ZWUO01CwOSSull7dIOgH4I9kQx37ANeUKjojrJK0PvD/tesSJk8ysqFptFsc0soDc1erS3m8Ap/V0kaSPRcTNkj7d7dB7JRERl9XcWjOznBRxDLqvXByjaizzo8DNwF49FQs4QJtZ4bTkNDsASZsAGwHLdO2LiPN7OjcixqV/D6lHA83MGqGIuTgqmWY3DhhNFqCvBXYDbgd6DNCS+nyYbEScUXUrzcxyVs8hDkkrAGcDm5CNHBwaEXdWW04lPejPApsB90TEIZJWTxX3Zmj6d0OyB8Z2pRzdCz/uyswKqrO+NwnPBK6LiM+mFBnL1lJIJQF6YUR0SlosaRjwArBebydHxCkAkm4APhQR89P771Ly6CwzsyKpVw86xcmdgC8DRMRbwFu1lFVJgJ6auuu/I5vZ8RpwdwXXrdutUW+RzYWuyPmzq/5rwAaAhbNva3YTbAlVzU1CSWOBsSW7xkfE+PR6PWAu8IeUP38acGyVDzwBKsvF0ZWY/zeSrgOGRcR9FZQ9Abhb0uVkYzD7Ak7ib2aFVE0POgXj8b0cHgR8CDg6Iv4h6UzgBOB/q21TXwtVPtTXsYiY3lfBEfF9SX8Bdky7DomIe6ptoJlZI9RxEscsYFZE/CO9v5QsQFetrx70z/o4FsDHyhWegnifgdzMrAg6OitJTVReRDwn6RlJG0bEo8AuwEO1lNXXQpWda22gmVmrqXO20aOBiWkGxxNATetCKlqoYma2pAvqN80uImYAW/W3HAdoMzOgsxVXEpqZDQSddexB10vZUXFlDpZ0cnq/rqRt8m+amVnjBKp4a5RKblv+CtiO7JmCAPOBs3JrkZlZE3SgirdGqWSI48MR8SFJ9wBExL/TnUkzsyVGAZ8ZW1GAXiSpnTSPW9KqFPNrMTOrWRGDWiVDHL8ALgdWk/R9slSjP8i1VWZmDVbEMehKcnFMlDSNbDWMgDER8XDuLTMza6ACPpKwooT96wILgKtK90XE03k2zMyskYo4za6SMehreOfhscsAo4BHgY1zbJeZWUN1NLsBPahkiOODpe9TlrvDezndzKwldao1e9DvEhHTJW2dR2PMzJqlgCu9KxqDLn0IbBtZIuq5ubXIzKwJijjNrpIe9NCS14vJxqT/lE9zzMyao+VmcaQFKstHxHENao+ZWVM0cgl3pfp65NWgiFjc16OvzMyWFK3Wg76bbLx5hqQ/A5cAbz+VNiIuy7ltZmYN06pj0CsBL5E9g7BrPnQADtBmtsRotVkcq6UZHA/wTmDuUsSvxcysZq02xNEOLA89jpw7QJvZEqXVhjjmRMSpDWuJmVkTdbRYD7qAzTUzy0er9aB3aVgrzMyarKUCdES83MiGmJk1UxFvrFWdLMnMbEnUarM4zMwGjJYa4jAzG0haMmG/mdlAUO8hjpRsbirwbETsWUsZDtBmZuQyxHEs8DAwrNYC2urXFjOz1hVVbOVIWhvYAzi7P21ygDYzAzqJijdJYyVNLdnGdivu58Dx9LNj7iEOMzOqu0kYEeOB8T0dk7Qn8EJETJM0uj9tcoA2M6OuY9A7AHtL2h1YBhgm6YKIOLjagjzEYWZGNouj0q0vEXFiRKwdESOB/YGbawnO4B60mRmQjUEXjQO0mRn55OKIiFuBW2u93gHazAwv9TYzK6wOD3GYmRWTe9BmZgXlm4RmZgVVvPDsAG1mBniIw8yssHyT0MysoIo4Bu2l3gX3qU+O5sEHJvPIQ7dz/HFHNrs51iQn/eAMdtpjf8YcfMTb+376y7PZ64Cvsu8X/4tjTjyVefNfa2ILW189043WiwN0gbW1tfGLM7/PnnsdzAc325n99hvDBz6wfrObZU0wZvdP8JszvveufdttvQWXT/gNl5//a0auM4KzJ1zUpNYtGapJN9ooDtAFts3WW/Cvfz3JzJlPs2jRIi6++Er23utTzW6WNcFWm3+Q4cOGvmvfDh/ekkGD2gHYdOP38/wLLzajaUuMziq2RnGALrC1RqzBM7Nmv/1+1rNzWGutNZrYIiuqy6+5gY9st3Wzm9HSoor/GiWXm4SSPt3X8Yi4rJfrxgJjAdQ+nLa25XJoXeuQ/jOvYUTxbmRYc/32vEm0t7ez5yd3bnZTWtpAmsWxV/p3NWB74Ob0fmeyzE49BujSpxQMWnpE8b5bDfbsrDmss/Zab79fe8SazJnzfBNbZEVz5bU3MvmOuzn7Fz/s8Re6VW7AzIOOiEMAJF0NbBQRc9L7NYGz8qhzSTRl6gze975RjBy5Ds8++xyf//w+fOGLnslhmdvvmsrvJ17Cub/8MUOWWabZzWl5nQX86zTvedAju4Jz8jywQc51LjE6Ojo49usnce01F9Le1sa5513EQw891uxmWRMcN+5HTLnnPl55ZR67jDmYrx32Bc6ecBFvLVrEV7/+HSC7UTju+KOb3NLWVbzwDMpzTFPSL4H1gUlkX//+wD8jouynyEMc1pOFs29rdhOsgJZaZb1+j+8c+J59K445Fz51eUPGk3LtQUfEUZL2BXZKu8ZHxOV51mlmVotGzs6oVCOWek8H5kfEXyUtK2loRMxvQL1mZhVbXMAAnes8aElfBS4Ffpt2jQCuyLNOM7NaFHEedN4LVY4EdgDmAUTE42RT78zMCqWIKwnzHuJ4MyLe6pqfKWkQxbxZamYDXBEXgeUdoP8m6dvAEEmfAL4GXJVznWZmVRuI6UZPAOYC9wOHA9cCJ+Vcp5lZ1TqIirdGyXuaXSfwu7SZmRVWEXvQeSVLup8+xpojYtM86jUzq9VAGoPeM/3blThiQvr3IGBBTnWamdVsICVLegpA0g4RsUPJoRMk3QGcmke9Zma1qtf8ZknrAOcDa5DF/fERcWYtZeV9k3A5SR/peiNpe2BgJ3k2s0Kq4yOvFgPfjIgPANsCR0raqJY25T3N7jDgHEnD0/tXgENzrtPMrGodUZ9BjpTBc056PV/Sw2SrqB+qtqy8Z3FMAzaTNIwsc96redZnZlarPJZwSxoJbAH8o5brcw3QkgYDnwFGAoO6VhRGhMegzaxQqknYX/p4vmR8eiJU6TnLA38Cvh4R82ppU95DHFcCrwLTgDdzrsvMrGbV9J9LH8/XE0lLkQXnib09g7USeQfotSNi15zrMDPrt3otVFE2VPB74OGIOKM/ZeU9i+Pvkj6Ycx1mZv1Wx1kcOwBfAD4maUbadq+lTXn3oD8CfFnSTLIhDgHhlYRmVjR1nMVxO1ms67e8A/RuOZdvZlYXA+6RVyUrClcD/Fx4MyusIubiyPuRV3tLehyYCfwNeBL4S551mpnVoo5j0HWT903C08iWOj4WEaOAXYA7cq7TzKxqEVHx1ih5B+hFEfES0CapLSJuATbPuU4zs6p10Fnx1ih53yR8Ja2mmQxMlPQCWSIRM7NCqWYlYaPk3YPeB1gIfAO4DvgXsFfOdZqZVS2q+K9R8p7F8XrJ2/PyrMvMrD+K2IPO65FX8+l5aXvXQpVhedRrZlarATMPOiKG5lGumVleBkwP2sys1dRrqXc9OUCbmTGAhjjMzFpNuAdtZlZMjVzCXSkHaDMzipksyQHazAz3oM3MCquj02PQZmaF5FkcZmYF5TFoM7OC8hi0mVlBuQdtZlZQvkloZlZQHuIwMysoD3GYmRWU042amRWU50GbmRWUe9BmZgXVWcB0o3k/1dvMrCVERMVbOZJ2lfSopH9KOqHWNrkHbWZG/WZxSGoHzgI+AcwCpkj6c0Q8VG1Z7kGbmQFRxVbGNsA/I+KJiHgL+COwTy1tKmwPevFbz6rZbSgKSWMjYnyz22HF4s9FfVUTcySNBcaW7Bpf8rMYATxTcmwW8OFa2uQedGsYW/4UG4D8uWiSiBgfEVuVbKW/KHsK9DWNnzhAm5nV1yxgnZL3awOzaynIAdrMrL6mAOtLGiVpaWB/4M+1FFTYMWh7F48zWk/8uSigiFgs6SjgeqAdOCciHqylLBUxQYiZmXmIw8yssBygzcwKygG6HyR9V9L/pNenSvp4D+eMlnR1ner7dh/HnpS0Sp3qea0e5Vht6vX9lzRS0gP1KMuawwG6TiLi5Ij4a87V9BqgzWzJ4wBdJUnfSUlQ/gpsWLL/XEmfTa93lfSIpNuBT/dSzpclXSbpOkmPS/pxybEDJN0v6QFJp6d9PwKGSJohaWKZNl4haZqkB9OKp679r0n6vqR7Jd0lafW0f5SkOyVNkXRaP749VkeSlpd0k6Tp6fOwT9o/UtLDkn6XfsY3SBqSjm2Zfr53Akc29QuwfnOAroKkLcnmNG5BFni37uGcZYDfAXsBOwJr9FHk5sB+wAeB/SStI2kt4HTgY+n41pLGRMQJwMKI2DwiDirT1EMjYktgK+AYSSun/csBd0XEZsBk4Ktp/5nAryNia+C5MmVb47wB7BsRHwJ2Bn4mqWuV2vrAWRGxMfAK8Jm0/w/AMRGxXaMba/XnAF2dHYHLI2JBRMyj58nn7wdmRsTjkc1hvKCP8m6KiFcj4g3gIeA9ZEH/1oiYGxGLgYnATlW28xhJ9wJ3ka1oWj/tfwvoGg+fBoxMr3cAJqXXE6qsy/Ij4AeS7gP+SpbjYfV0bGZEzEivpwEjJQ0HVoiIv6X9/lm2OC9UqV4lE8crnVz+ZsnrDrKfR7+SREkaDXwc2C4iFki6FVgmHV4U70x876qviyfEF89BwKrAlhGxSNKTvPOz7P7ZGUL22fHPcQniHnR1JgP7ShoiaSjZMEZ3jwCjJL03vT+gyjr+AXxU0iopr+wBQFePaJGkpcpcPxz4dwrO7we2raDOO8iGbiALClYMw4EXUnDemewvrF5FxCvAq5I+knb5Z9niHKCrEBHTgYuAGcCfgNt6OOcNsixj16SbhE9VWccc4ETgFuBeYHpEXJkOjwfuK3OT8DpgUPqz+DSyYY5yjgWOlDSFLChYMUwEtpI0lSzYPlLBNYcAZ6WbhAvzbJzlz0u9zcwKyj1oM7OCcoA2MysoB2gzs4JygDYzKygHaDOzgnKAtv8gqSPl/HhA0iWSlu1HWaU5Ss6WtFEf546WtH0NdfSYya+SDH/VZo4rzWBoljcHaOtJV86PTciWhx9RejAtoKlaRHwlIh7q45TRQNUB2mxJ5QBt5dwGvC/1bm+RdCFwv6R2ST9JGfDuk3Q4gDK/lPSQpGuA1boKknSrpK3S611TlrZ7U8a2kWS/CL6Reu87SlpV0p9SHVMk7ZCuXTllcLtH0m+pYHl8bxn+0rGfpbbcJGnVtO+9yjINTpN0W1qV2b3MY9LXeZ+kP9b4/TXrlXNxWK8kDQJ2I1udCLANsElEzExB7tWI2FrSYOAOSTeQZfrbkCxD3+pkSaDO6VbuqmQZ/3ZKZa0UES9L+g3wWkT8NJ13IfB/EXG7pHXJHsL5AWAccHtEnCppD7KVm+UcmuoYAkyR9KeIeIksw9/0iPimpJNT2UeRrdo8IiIel/Rh4FdkGQZLnQCMiog3Ja1QyffUrBoO0NaTIZJmpNe3Ab8nG3q4OyJmpv2fBDbtGl8mWyK+PlnmvUkR0QHMlnRzD+VvC0zuKisiXu6lHR8HNnonwybDUg6UnUh5tiPiGkn/ruBrOkbSvul1V4a/l4BOsuX7kGUevEzS8unrvaSk7sE9lHkfMFHSFcAVFbTBrCoO0NaThRGxeemOFKheL90FHB0R13c7b3fKZ1SrNOtaG1lWvnfllEhtqThHQZkMf91FqveV7t+DHuxB9stib+B/JW2cUsSa1YXHoK1W1wP/1ZVdT9IGkpYjy/i3fxqjXpMs0Xx3d5Jl7BuVrl0p7Z8PDC057way4QbSeZunl5NJmdok7QasWKatfWX4awO6/go4kGzoZB4wU9LnUh2StFlpgZLagHUi4hbgeGAFYPky7TCrinvQVquzyRL+T1fWpZ0LjAEuJxurvR94jHdSpb4tIuamMezLUqB7AfgEcBVwqbJHOx0NHEOWme0+ss/qZLIbiacAkyRNT+U/Xaat1wFHpHIe5d0Z/l4HNpY0DXiV7Ak3kP0C+LWkk4ClgD+SZRfs0g5coCxJvsjGyl8p0w6zqjibnZlZQXmIw8ysoBygzcwKygHazKygHKDNzArKAdrMrKAcoM3MCsoB2sysoP4/GZXC9mFMgIMAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"yhat = knn_cv.predict(X_test)\n", | |
"plot_confusion_matrix(Y_test,yhat)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## TASK 12\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Find the method performs best:\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 35, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# After comparing accuracy of above methods, they all preformed practically\n", | |
"# the same, except for tree which fit train data slightly better but test data worse." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Authors\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDS0321ENSkillsNetwork26802033-2021-01-01\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Change Log\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n", | |
"| ----------------- | ------- | ------------- | ----------------------- |\n", | |
"| 2021-08-31 | 1.1 | Lakshmi Holla | Modified markdown |\n", | |
"| 2020-09-20 | 1.0 | Joseph | Modified Multiple Areas |\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Copyright © 2020 IBM Corporation. All rights reserved.\n" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python", | |
"language": "python", | |
"name": "conda-env-python-py" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.7.10" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment