@tkhan0
Last active October 21, 2019 23:58
Multiple Linear Regression
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1. Import the data to a DataFrame using Pandas"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('/Admission_Predict_Ver1.2.csv',encoding = 'utf-8')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Serial No.</th>\n",
" <th>GRE Score</th>\n",
" <th>TOEFL Score</th>\n",
" <th>University Rating</th>\n",
" <th>SOP</th>\n",
" <th>LOR</th>\n",
" <th>CGPA</th>\n",
" <th>Research</th>\n",
" <th>Admit</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>337</td>\n",
" <td>118</td>\n",
" <td>4</td>\n",
" <td>4.5</td>\n",
" <td>4.5</td>\n",
" <td>9.65</td>\n",
" <td>1</td>\n",
" <td>0.92</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>324</td>\n",
" <td>107</td>\n",
" <td>4</td>\n",
" <td>4.0</td>\n",
" <td>4.5</td>\n",
" <td>8.87</td>\n",
" <td>1</td>\n",
" <td>0.76</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>316</td>\n",
" <td>104</td>\n",
" <td>3</td>\n",
" <td>3.0</td>\n",
" <td>3.5</td>\n",
" <td>8.00</td>\n",
" <td>1</td>\n",
" <td>0.72</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>322</td>\n",
" <td>110</td>\n",
" <td>3</td>\n",
" <td>3.5</td>\n",
" <td>2.5</td>\n",
" <td>8.67</td>\n",
" <td>1</td>\n",
" <td>0.80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>314</td>\n",
" <td>103</td>\n",
" <td>2</td>\n",
" <td>2.0</td>\n",
" <td>3.0</td>\n",
" <td>8.21</td>\n",
" <td>0</td>\n",
" <td>0.65</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Serial No. GRE Score TOEFL Score University Rating SOP LOR CGPA \\\n",
"0 1 337 118 4 4.5 4.5 9.65 \n",
"1 2 324 107 4 4.0 4.5 8.87 \n",
"2 3 316 104 3 3.0 3.5 8.00 \n",
"3 4 322 110 3 3.5 2.5 8.67 \n",
"4 5 314 103 2 2.0 3.0 8.21 \n",
"\n",
" Research Admit \n",
"0 1 0.92 \n",
"1 1 0.76 \n",
"2 1 0.72 \n",
"3 1 0.80 \n",
"4 0 0.65 "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 500 entries, 0 to 499\n",
"Data columns (total 9 columns):\n",
"Serial No. 500 non-null int64\n",
"GRE Score 500 non-null int64\n",
"TOEFL Score 500 non-null int64\n",
"University Rating 500 non-null int64\n",
"SOP 500 non-null float64\n",
"LOR 500 non-null float64\n",
"CGPA 500 non-null float64\n",
"Research 500 non-null int64\n",
"Admit 500 non-null float64\n",
"dtypes: float64(4), int64(5)\n",
"memory usage: 35.2 KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Shuffle the data to remove any ordering effects"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.utils import shuffle"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"df_shuffled = shuffle(df,random_state = 42)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Serial No.</th>\n",
" <th>GRE Score</th>\n",
" <th>TOEFL Score</th>\n",
" <th>University Rating</th>\n",
" <th>SOP</th>\n",
" <th>LOR</th>\n",
" <th>CGPA</th>\n",
" <th>Research</th>\n",
" <th>Admit</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>361</th>\n",
" <td>362</td>\n",
" <td>334</td>\n",
" <td>116</td>\n",
" <td>4</td>\n",
" <td>4.0</td>\n",
" <td>3.5</td>\n",
" <td>9.54</td>\n",
" <td>1</td>\n",
" <td>0.93</td>\n",
" </tr>\n",
" <tr>\n",
" <th>73</th>\n",
" <td>74</td>\n",
" <td>314</td>\n",
" <td>108</td>\n",
" <td>4</td>\n",
" <td>4.5</td>\n",
" <td>4.0</td>\n",
" <td>9.04</td>\n",
" <td>1</td>\n",
" <td>0.84</td>\n",
" </tr>\n",
" <tr>\n",
" <th>374</th>\n",
" <td>375</td>\n",
" <td>315</td>\n",
" <td>105</td>\n",
" <td>2</td>\n",
" <td>2.0</td>\n",
" <td>2.5</td>\n",
" <td>7.65</td>\n",
" <td>0</td>\n",
" <td>0.39</td>\n",
" </tr>\n",
" <tr>\n",
" <th>155</th>\n",
" <td>156</td>\n",
" <td>312</td>\n",
" <td>109</td>\n",
" <td>3</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>8.69</td>\n",
" <td>0</td>\n",
" <td>0.77</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>105</td>\n",
" <td>326</td>\n",
" <td>112</td>\n",
" <td>3</td>\n",
" <td>3.5</td>\n",
" <td>3.0</td>\n",
" <td>9.05</td>\n",
" <td>1</td>\n",
" <td>0.74</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Serial No. GRE Score TOEFL Score University Rating SOP LOR CGPA \\\n",
"361 362 334 116 4 4.0 3.5 9.54 \n",
"73 74 314 108 4 4.5 4.0 9.04 \n",
"374 375 315 105 2 2.0 2.5 7.65 \n",
"155 156 312 109 3 3.0 3.0 8.69 \n",
"104 105 326 112 3 3.5 3.0 9.05 \n",
"\n",
" Research Admit \n",
"361 1 0.93 \n",
"73 1 0.84 \n",
"374 0 0.39 \n",
"155 0 0.77 \n",
"104 1 0.74 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_shuffled.head()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"DV = 'Admit '"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Splitting the DataFrame df_shuffled into the feature variables (X) and the dependent variable (y)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"X = df_shuffled.drop(['Admit ','Serial No.'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GRE Score</th>\n",
" <th>TOEFL Score</th>\n",
" <th>University Rating</th>\n",
" <th>SOP</th>\n",
" <th>LOR</th>\n",
" <th>CGPA</th>\n",
" <th>Research</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>361</th>\n",
" <td>334</td>\n",
" <td>116</td>\n",
" <td>4</td>\n",
" <td>4.0</td>\n",
" <td>3.5</td>\n",
" <td>9.54</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>73</th>\n",
" <td>314</td>\n",
" <td>108</td>\n",
" <td>4</td>\n",
" <td>4.5</td>\n",
" <td>4.0</td>\n",
" <td>9.04</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>374</th>\n",
" <td>315</td>\n",
" <td>105</td>\n",
" <td>2</td>\n",
" <td>2.0</td>\n",
" <td>2.5</td>\n",
" <td>7.65</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>155</th>\n",
" <td>312</td>\n",
" <td>109</td>\n",
" <td>3</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>8.69</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>326</td>\n",
" <td>112</td>\n",
" <td>3</td>\n",
" <td>3.5</td>\n",
" <td>3.0</td>\n",
" <td>9.05</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GRE Score TOEFL Score University Rating SOP LOR CGPA Research\n",
"361 334 116 4 4.0 3.5 9.54 1\n",
"73 314 108 4 4.5 4.0 9.04 1\n",
"374 315 105 2 2.0 2.5 7.65 0\n",
"155 312 109 3 3.0 3.0 8.69 0\n",
"104 326 112 3 3.5 3.0 9.05 1"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.head()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"y = df_shuffled[DV]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"361 0.93\n",
"73 0.84\n",
"374 0.39\n",
"155 0.77\n",
"104 0.74\n",
"Name: Admit , dtype: float64"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. Split X and y into training and testing sets"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 0.33, random_state = 42)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GRE Score</th>\n",
" <th>TOEFL Score</th>\n",
" <th>University Rating</th>\n",
" <th>SOP</th>\n",
" <th>LOR</th>\n",
" <th>CGPA</th>\n",
" <th>Research</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>443</th>\n",
" <td>321</td>\n",
" <td>114</td>\n",
" <td>5</td>\n",
" <td>4.5</td>\n",
" <td>4.5</td>\n",
" <td>9.16</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>497</th>\n",
" <td>330</td>\n",
" <td>120</td>\n",
" <td>5</td>\n",
" <td>4.5</td>\n",
" <td>5.0</td>\n",
" <td>9.56</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124</th>\n",
" <td>301</td>\n",
" <td>106</td>\n",
" <td>4</td>\n",
" <td>2.5</td>\n",
" <td>3.0</td>\n",
" <td>8.47</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50</th>\n",
" <td>313</td>\n",
" <td>98</td>\n",
" <td>3</td>\n",
" <td>2.5</td>\n",
" <td>4.5</td>\n",
" <td>8.30</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>331</th>\n",
" <td>311</td>\n",
" <td>105</td>\n",
" <td>2</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>8.12</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GRE Score TOEFL Score University Rating SOP LOR CGPA Research\n",
"443 321 114 5 4.5 4.5 9.16 1\n",
"497 330 120 5 4.5 5.0 9.56 1\n",
"124 301 106 4 2.5 3.0 8.47 0\n",
"50 313 98 3 2.5 4.5 8.30 1\n",
"331 311 105 2 3.0 2.0 8.12 1"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_train.head()"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(335, 7)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_train.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Instantiating the Linear Regression model and fitting the model"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.linear_model import LinearRegression"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"model = LinearRegression()"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.fit(X_train[['GRE Score']],y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6. Extracting the intercept (c) and the coefficient (m)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"intercept = model.intercept_"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-2.6151678753807004"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"intercept"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"coefficient = model.coef_"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.01054397])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"coefficient"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7. Printing out the equation (y = c + mx) in terms of Admit (y), the intercept (c), the coefficient (m) and GRE Score (x)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Admit = -2.62 + (0.01 x GRE Score)\n"
]
}
],
"source": [
"print('Admit = {0:0.2f} + ({1:0.2f} x GRE Score)'.format(intercept,coefficient[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Predicted Value"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"Admit = -2.62 + (0.01 * X_train['GRE Score'])"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"443 0.59\n",
"497 0.68\n",
"124 0.39\n",
"50 0.51\n",
"331 0.49\n",
"Name: GRE Score, dtype: float64"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Admit.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 8. Generate predictions of the test data using the following"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"predictions = model.predict(X_test[['GRE Score']])"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(165,)"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions.shape"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.56911022, 0.62183005, 0.69563782, 0.55856625, 0.63237402,\n",
" 0.59019815, 0.82216543, 0.74835766, 0.65346195, 0.81162146,\n",
" 0.75890163, 0.59019815, 0.79053353, 0.79053353, 0.81162146,\n",
" 0.70618179, 0.85379733, 0.54802228, 0.68509386, 0.63237402,\n",
" 0.90651717, 0.8327094 , 0.77998956, 0.61128609, 0.8327094 ,\n",
" 0.54802228, 0.56911022, 0.8327094 , 0.71672576, 0.67454989,\n",
" 0.72726972, 0.67454989, 0.69563782, 0.50584641, 0.70618179,\n",
" 0.49530245, 0.70618179, 0.57965418, 0.69563782, 0.59019815,\n",
" 0.59019815, 0.70618179, 0.72726972, 0.88542923, 0.79053353,\n",
" 0.84325336, 0.90651717, 0.71672576, 0.63237402, 0.66400592,\n",
" 0.75890163, 0.66400592, 0.75890163, 0.52693435, 0.48475848,\n",
" 0.75890163, 0.67454989, 0.52693435, 0.72726972, 0.60074212,\n",
" 0.9276051 , 0.52693435, 0.74835766, 0.79053353, 0.60074212,\n",
" 0.70618179, 0.77998956, 0.93814907, 0.90651717, 0.79053353,\n",
" 0.69563782, 0.50584641, 0.8010775 , 0.72726972, 0.94869304,\n",
" 0.67454989, 0.52693435, 0.73781369, 0.76944559, 0.73781369,\n",
" 0.62183005, 0.64291799, 0.8010775 , 0.56911022, 0.84325336,\n",
" 0.74835766, 0.69563782, 0.8010775 , 0.8010775 , 0.96978097,\n",
" 0.59019815, 0.79053353, 0.8643413 , 0.81162146, 0.53747832,\n",
" 0.76944559, 0.54802228, 0.68509386, 0.62183005, 0.50584641,\n",
" 0.68509386, 0.71672576, 0.51639038, 0.82216543, 0.75890163,\n",
" 0.959237 , 0.59019815, 0.66400592, 0.84325336, 0.75890163,\n",
" 0.9276051 , 0.8327094 , 0.87488527, 0.55856625, 0.82216543,\n",
" 0.63237402, 0.59019815, 0.61128609, 0.85379733, 0.66400592,\n",
" 0.8010775 , 0.76944559, 0.79053353, 0.72726972, 0.71672576,\n",
" 0.67454989, 0.71672576, 0.77998956, 0.71672576, 0.54802228,\n",
" 0.74835766, 0.96978097, 0.85379733, 0.96978097, 0.8327094 ,\n",
" 0.73781369, 0.59019815, 0.8327094 , 0.54802228, 0.53747832,\n",
" 0.8010775 , 0.73781369, 0.66400592, 0.75890163, 0.66400592,\n",
" 0.87488527, 0.8327094 , 0.53747832, 0.63237402, 0.49530245,\n",
" 0.85379733, 0.66400592, 0.55856625, 0.88542923, 0.76944559,\n",
" 0.96978097, 0.67454989, 0.54802228, 0.74835766, 0.55856625,\n",
" 0.71672576, 0.65346195, 0.8010775 , 0.67454989, 0.72726972])"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 9. Using the Pearson correlation coefficient to measure the strength and direction of the linear relationship between the true values of the dependent variable (y_test) and the model's predicted values (predictions)\n",
"\n",
"Example: correlation_coeff, p_value = pearsonr(x, y). In the code below we take index 0 of the result because the correlation coefficient is the first element returned by pearsonr(x, y)."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5,1,'Predicted vs Actual Values (r =0.80)')"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"from scipy.stats import pearsonr\n",
"plt.scatter(y_test,predictions)\n",
"plt.xlabel('Y test (True values)')\n",
"plt.ylabel('Predicted Values')\n",
"plt.title('Predicted vs Actual Values (r ={0:0.2f})'.format(pearsonr(y_test,predictions)[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plot above shows the predicted values against the actual values, with a Pearson r of 0.80.\n",
"\n",
"This indicates a strong, positive linear correlation between the predicted and the actual values. For a perfect model, all the points would lie on a straight line and the Pearson r would be 1.0."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 10. Now we will compute the residuals (the difference between the true and predicted values). A linear model that fits the data well generally has normally distributed residuals. We can test this using the Shapiro-Wilk test.\n",
"\n",
"The Shapiro-Wilk test is a standard test of normality. Its null hypothesis is that the data were drawn from a normal distribution.\n",
"\n",
"A p-value (probability value) < 0.05 leads us to reject the null hypothesis, which suggests the residuals are not normally distributed. A p-value > 0.05 means we cannot reject normality, so the residuals are consistent with a normal distribution.\n",
"\n",
"In the code below we take index [1] of shapiro(y_test - predictions) because the p-value is the second element returned by shapiro.\n",
"\n",
"Example: shap_w, shap_p = shapiro(y_test - predictions)\n",
"\n",
"#### Let's take an example to better understand this:\n",
"\n",
"Suppose a food delivery app claims that its delivery times are 30 minutes or less on average, but you believe they are longer. To check this, you carry out a hypothesis test in which the null hypothesis is that the mean delivery time is at most 30 minutes.\n",
"\n",
"The alternative hypothesis is that the mean time is greater than 30 minutes. You randomly sample some delivery times and run the data through the hypothesis test, and the p-value turns out to be 0.001, which is much less than 0.05.\n",
"\n",
"In real terms, if the app's claim were true, there would be only a 0.001 probability of observing delivery times at least as long as those in your sample. Since we typically reject the null hypothesis when this probability is less than 0.05, you conclude that the app's claim is wrong."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\tkhan050\\AppData\\Local\\Continuum\\anaconda3\\lib\\site-packages\\scipy\\stats\\stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.\n",
" return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xd8HPWZx/HPoy6rWbLlJssdbGxjG1vGGLAhtAABTAhHNcSEhJYc6blwl1xIO0I6KXCQI6EkphkwYHoxJWDcC7g32SqWJVtWl9X2uT9mRNZCZS3v7kia5/167Uva3dmZr3ZH+8zvNzO/EVXFGGOMf8V4HcAYY4y3rBAYY4zPWSEwxhifs0JgjDE+Z4XAGGN8zgqBMcb4nG8LgYhsFJEzvc7hJRH5vIgUiEiNiJwUweXMEZGtnTz/kIj8LAzLGSUiKiJx3XjtRBFZFeK0C0Tkn0efMKR5/6+I/DAS8+7JRORtEfmy1zl6GhFZISKTIr2cPlkIRCRfRM5p89gR/7yqOklV3+5iPt3+Yuklfg18TVVTVXVtpBaiqu+p6vhIzT9MforzfgAgIqeLyAciUiki5SLyvojMjHQIVb1FVX8ajnmJyKsi8r2g+znu+tzeY0PCscy+Qhx3i8hB9/ZLEZFOpr9GRPaISK2ILBaRrKDnskTkWfe5PSJyTaivxVknfxL+v/BIfbIQ9BY9oMCMBDaGMmEPyBoxIjIU+Ayw2L2fDiwB/ghkATnAj4EGrzK6uY72M3gXOCPo/lxgSzuPbVfVkmOM19fcBFwKTAWmABcBN7c3obvFfj9wHTAYqAPuDZrkz0Cj+9y1wH2tW/khvPZ54DPuOhoxvi0Ewa0GETlZRFaJSJWI7BeR37qTvev+rHC7T2aLSIyI/MCt4KUi8oiIZATN93r3uYMi8sM2y7lTRBaJyN9FpApY4C57mYhUiMg+EfmTiCQEzU9F5DYR2S4i1SLyUxEZ676mSkSeDJ6+zd/YblYRSRSRGiAWWC8iOzt4vYrIV0VkO7DdfWyCiLzubiVvFZErgqa/UEQ2uTmLROQ77uNnikhh0HQnicgad7ongKSg5z7V7eLmGOf+/jkRWev+7QUicmcnn/ECEdnlLme3iFzbwaTnAmtU9bB7/3gAVX1MVVtUtV5VX1PVDW3m/2sROeTO+4Kgx28Qkc3ucneJyM1Bz50pIoUi8p8icsBdP64Nev6TbrKgaf9DREqAv7mPf0VEdrifwfMiMqyDv+td4DQRaf0/nwP8Hshr89i77b3Yff/eF5E/itMy2iIiZ3cwbaK7Dk8OeixbROpFZJCIZIrIEhEpc9+zJSIyvIN53Skifw+6f0TL3F2HH3T/X4pE5GciEtvBe9BdXwR+o6qFqloE/AZY0MG01wIvqOq7qloD/BC4TETSRCQF+ALwQ1WtUdV/4ny5X9fVawHcdXI1cF6Y/74j+LYQtHEPcI+qpgNjgSfdx+e6P/u73SfLcFaGBThbkGOAVOBP4PQz41Tza4GhQAbO1mSwecAioD/wD6AF+CYwEJgNnA3c1uY15wMzgFOA7wEPuMvIBSYDV3fwd7WbVVUbVDXVnWaqqo7t+K3hUmAWMNFdqV8HFgKD3OXeK//qw3wQuFlV09xcb7WdmVu0FgOP4mxtP4XzjxKqWuB6nPfvc8CtInJpO8tJAf4AXODmORVY18E8TwSC92FsA1pE5GERuUBEMtt5zSz3NQOBXwIPinzSdVCKswWZDtwA/E5Epge9doj7uhycL5wHRKSjrrMhOO/TSOAmETkLuAu4Amcd2wM83sFrVwCJOFu14KzPrwM72jzWbiEI+jt3uXl/BDwjR3ZdAKCqDcAzHLkuXgG8o6qlON81f3P/jhFAPe7/TTc8DDQD44CTcL4k292/IE63S0UntxEdLGMSsD7o/nr3sS6nVdWdOC2A491bi6pu62Benb221Wb+9XlFRF8uBIuDP3CObG
611QSME5GBbtX+sJNprwV+q6q73Ap+B3CVu7VyOU51/6eqNgL/DbQdzGmZqi5W1YC7pblaVT9U1WZVzcdpJp7R5jV3q2qVqm4EPgZec5dfCbyM889wtFlDdZeqlqtqPc6XW76q/s3NuwZ42v27wXkfJ4pIuqoecp9v6xQgHvi9qjap6iJgZahhVPVtVf3Iff82AI/x6ferVQCYLCLJqrrPff/a0x+oDlpGFXA6zmf3F6DM3fIeHPSaPar6F1VtwfliGorTtEdVX1TVnep4B3gNZ8s72A/dgvwO8CLOl2ZHf8OP3GnrcT7Tv6rqGvfL9w5gtoiMavtC9/nlwFz3y7u/qu4C3gt6bCLwTgfLBqeotX5WT+AUv891MO1CjiwE17iPoaoHVfVpVa1T1Wrg53T8uXXI/QwuAL6hqrVukfkdcFV706vqQlXt38ltbweLSgUqg+5XAqlBxb6zaVunT+viua5e26oaZx2NmL5cCC4N/sD59FZ2sBtxKvAWEVkpIhd1Mu0wnK2wVnuAOJwvgWFAQesTqloHHGzz+oLgOyJyvNtMLhGnu+h/cLa+gu0P+r2+nfuptK+zrKEKzjsSmNWmwF6Ls9UKzpb9hcAeEXlHRGZ3kKlIjxztcE8707VLRGaJyFK3i6ESuIVPv1+oai1wpfv8PhF5UUQmdDDbQxz5j4eqblbVBao6HKd1MwynW6VVSdC0de6vqW7GC0TkQ7frpgLnPQnOeMjN12qPO//2lAV1WUGbz9Qt8Af5dMuz1bs4W/1zgNYut38GPVagqp29/+19VsPEORKsxr21Fti3gGT3MxoJTAOeBRCRfiJyvzjdlFVurv7d6NIZibMhsS9oHbwfp4UaTjU4LbpW6UBNm/eio2lbp6/u4rmuXtsqDagIOXk39OVCEDJV3a6qV+OsTHcDi9yuhfY+9GKclbHVCJxm6n5gH/BJv6eIJAMD2i6uzf37cHbgHed2Tf0n0OHRCUeps6yhCs5bgNPUD96iSlXVWwFUdaWqzsN5Hxfzry62YPuAnDZbVsHN81qgX+sd+fTRLAtx+lhzVTUD+F86eL9U9VVVPRdna30LztZ9ezZwZFO87Xy2AA/hFIROiUgiTivp18BgdyPkpTYZM931q9UInM+q3cW3uX/EZ+rOZwBQ1MHr38X5wp+L0xIAeB84ja67haD9z6pYnSPBUt3bJABVDeB85lfjtAaWuFv/AN8GxgOz3PW8tdu1vc/uiHWAf21ogLMONgADg9bB9NYMbYnItUEFq71bR11DGzmyO2YqHR9YccS0IjIGp0tum3uLE5HjOphXZ69tdQJHdlOFnRUCQETmi0i2uyK3Vt4WoAynaT4maPLHgG+KyGgRScXZgn9CVZtx+v4vFpFT3b7wH9P1l3oaUAXUuFust4btD+s8a3csAY4XketEJN69zRSRE0Qkwf2ny1DVJpy/qaWdeSzDKUa3i0iciFwGnBz0/HpgkohME5Ek4M42r08DylX1sIicjPOF8ykiMlhELnG/KBtwtrzaywNOv/l0d3mtO8S/Le7OTBHJxfly66zLsFUCzj9yGdAszk7k9nb0/dh9z+bgdLk9FcK8wSmEN7jvTyLOZ7rc7VZszwc43QrzcQuBqh5y882n60IwCOeziheRf8P5Unqpi3xX4rQUFwY9nobTeq1wu6R+1Mk81uF0XY0Q50CMO1qfUNV9OF1tvxGRdHEOiBgrIu12M6nqP4IKVnu3jrqGHgG+Jc7htcNwCtlDHUz7D5z/+znu+vYT4BlVrXZbfs8APxGRFBE5DWc/4aNdvRY+2bCYgbOORowVAsf5wEZxjqS5B7hKVQ+7Tf6fA++7zdBTgL/ifIjvAruBw8C/A7h90P+Os/NuH07zrpTODzv8Ds6XWTXOFusTYfy7OszaHe7KeR5Of2wxTvfI3ThffOAcCZHvNv1vwfmiaTuPRuAynJ3Yh3C+NJ4Jen4bzj/DGzhHKrU9ces2nH+qapx9MO21OsBZt7/t5i
zH6Y9ut3tQVffjdGvMcx+qxtlJulxEanEKwMfu/Drlvke3u7kO4Xy2z7eZrMR9rhjni+AWt9XRJVV9E+fIkqdx1rGxdNA/7k5fh3PUSaL7N7R6D+dLvqtCsBw4DjiA879wuaq27e4MXt5ynC36YTj7r1r9Hkh25/Mh8Eon83gd5/9gg5t9SZtJrscpuJtw3sdFOK2+cLofeAH4COd9e9F9DAC3NTHHzbsRZ33/B87/expHrmu34fztpTgbZ7e27q8K4bWXAG+rakctxrCQ9ru8TDi4W+EVON0+u73OYzomzhFfDwMnd9APHK7lnAn83d330KOJyALgy6p6utdZ/EpElgM3qurHXU58DPrsSUJeEZGLgTdxuoR+jbNFke9lJtM1Vd0ERPzMYWOOhqrOisZyItY1JCJ/Feckpo+DHssS52Sk7e7P9o7P7u3m4TT5i3Ga1FdFcgvTGGOOVcS6hkRkLs4OukdUdbL72C9xdvT9QkS+D2Sq6n9EJIAxxpiQRHQfgTgnuSwJKgRbgTNVdZ84Y2e8rT1/MDJjjOnTor2PYLB7+BduMejwJBARuQln4CdSUlJmTJjQ0blAxhhj2rN69eoDqprd1XQ9dmexqj6AM6YOeXl5umpVSEPFG2OMcYlISGftR/s8gv1ul1Dr0L+lUV6+McaYNqJdCJ7HGW0R9+dzUV6+McaYNiJ5+OhjOMMJjBdnTPUbgV8A54ozvv257n1jjDEeitg+AncQt/a0e2ELY4wx3rCxhowxxuesEBhjjM9ZITDGGJ+zQmCMMT5nhcAYY3yux55ZbIzpnoXLP33RrWtmdXRFRmOsRWCMMb5nhcAYY3zOCoExxvicFQJjjPE5KwTGGONzVgiMMcbnrBAYY4zPWSEwxhifs0JgjDE+Z4XAGGN8zgqBMcb4nBUCY4zxOSsExhjjc1YIjDHG56wQGGOMz1khMMYYn7NCYIwxPmeFwBhjfM4KgTHG+JwVAmOM8TkrBMYY43NWCIwxxuesEBhjjM9ZITDGGJ+zQmCMMT5nhcAYY3zOCoExxvicFQJjjPE5TwqBiHxTRDaKyMci8piIJHmRwxhjjAeFQERygNuBPFWdDMQCV0U7hzHGGIdXXUNxQLKIxAH9gGKPchhjjO9FvRCoahHwa2AvsA+oVNXXop3DGGOMw4uuoUxgHjAaGAakiMj8dqa7SURWiciqsrKyaMc0xhjf8KJr6Bxgt6qWqWoT8AxwatuJVPUBVc1T1bzs7OyohzTGGL/wohDsBU4RkX4iIsDZwGYPchhjjMGbfQTLgUXAGuAjN8MD0c5hjDHGEefFQlX1R8CPvFi2McaYI9mZxcYY43NWCIwxxuesEBhjjM95so/AGNO1hcv3fuqxa2aN8CCJ6eusRWCMMT5nhcAYY3zOCoExxvicFQJjjPE5KwTGGONzVgiMMcbnrBAYY4zPWSEwxhifs0JgjDE+Z4XAGGN8zgqBMcb4nBUCY4zxOSsExhjjc1YIjDHG56wQGGOMz1khMMYYn7NCYIwxPmeFwBhjfM4KgTHG+JwVAmOM8TkrBMYY43NWCIwxxuesEBhjjM9ZITDGGJ+zQmCMMT5nhcAYY3zOCoExxvhcnNcBjDHdt3D5Xq8jmD7AWgTGGONzVgiMMcbnrBAYY4zPeVIIRKS/iCwSkS0isllEZnuRwxhjjHc7i+8BXlHVy0UkAejnUQ5jjPG9qBcCEUkH5gILAFS1EWiMdg5jjDEOL7qGxgBlwN9EZK2I/J+IpLSdSERuEpFVIrKqrKws+imNMcYnvCgEccB04D5VPQmoBb7fdiJVfUBV81Q1Lzs7O9oZjTHGN7woBIVAoaoud+8vwikMxhhjPBD1QqCqJUCBiIx3Hzob2BTtHMYYYxxeHTX078A/3COGdgE3eJTDGGN8z5NCoKrrgDwvlm2MMeZIIXUNichFImJnIRvTQzW1BDjc1OJ1DNNLhdoiuAq4R0SeBv6mqpsjmMmYXq3tiKDXzBoRkeU0tw
RYvecQG4sr2VFaQ3NASYqPYWBqIqePG8jknAxiRELKGMmcpucLqRCo6nz3RLCrcY7/V+BvwGOqWh3JgMaYTyuqqOcbj69lZf4hMvvFM3NUFhnJ8VTUN7GrrIbHVxYwdFsZl07LITfLTtw3nQt5H4GqVrktgmTgG8Dnge+KyB9U9Y+RCmiMOdK728r42sI1BBSuyBvO1OH9kaAt/4AqGworeG3Tfv7y3i6uyMv1MK3pDULdR3CJiDwLvAXEAyer6gXAVOA7EcxnjAmyrqCCmx9dzbD+ybx4++lMy808oggAxIgwLTeT284cx9CMJB5bsZeH3t/tUWLTG4S6A/hy4HeqOkVVf6WqpQCqWgd8KWLpjDGfOFjTwJceWkl2WiKP3jiLkQM+NTLLEVIT4/jynDGcMDSdO1/YxHPriqKU1PQ2oRaCfar6bvADInI3gKq+GfZUxpgjNDYHeOiDfFSVh26YSXZaYkivi4+N4aqZuZw8OovvPrWBVfnlEU5qeqNQC8G57Tx2QTiDGGM69srGEsprG7lv/gzGZKce1WvjYmO4f/4McjKTuenR1RSU10UopemtOi0EInKriHwETBCRDUG33cCG6EQ0xt92ltXw4a6DnDp2AKeMGdCteWSmJPDXBTNpag7w9cfX0hLQMKc0vVlXLYKFwMXAc+7P1tsMVZ0f4WzG+F5DUwvPrClkQEoC504cckzzGj0whZ99fjJr9lawdGtpmBKavqCrQqCqmg98FagOuiEiWZGNZoxZurWMQ3VNfGH6cBLijv3k/nnTcrhseg5Lt5SSf6A2DAlNXxBKiwBgNbDK/bk66L4xJkIq6hr5YOcBpuX2Z9TAzo8QOho/mTeZzJQEFq0ppKklELb5mt6r00Kgqhe5P0er6hj3Z+ttTHQiGuNPr2/aD8C5EweHdb6piXF8/qQcymsbeWuLdRGZ0E8oO631cpIiMl9EfisiNjCJMRHycVElawsqOHXsQDL7JYR9/mOzU5k+IpP3tpdRUnk47PM3vUuoQ0zcB0wVkanA94AHgUeBMyIVzJi+rKtB3+5+ZQv9EmI5c3zkLtN64eQhbCmp4tm1hdx8xtioDZZnep5Q9z41q6oC84B7VPUeIC1ysYzxr3UFFby3/QBzj8smKT42YsvplxjH504cSsGhepbvthPN/CzUQlAtIncA84EXRSQWZ8whY0yY/XnpDjKS45k1OvIH5k3L7c+4Qam8trGEyvqmiC/P9EyhFoIrgQbgRveawznAryKWyhif2lpSzeub9rPg1FEkRrA10EpEuHRaDgFVXlhfHPHlmZ4ppEKgqiWq+ltVfc+9v1dVH4lsNGP85963d9AvIZYbThsVtWVmpSRw1oTBbNpXxabiyqgt1/QcoR41dJmIbBeRShGpEpFqEamKdDhj/KSgvI4X1hcz/5SR9I/AkUKdOX3cQIakJ/HChn00NNslL/0m1K6hXwKXqGqGqqarapqqpkcymDF+8/AH+cSI8KXTRkd92bExwrxpw6isb+KtzXZugd+EWgj223WKjYmchuYWnlhVwAUnDmVIRpInGUYOSCFvZCbv7zxg5xb4TKiFYJWIPCEiV7vdRJeJyGURTWaMj6zdW0H14eao7htoz/mThpAUH8tz64oI2AilvhFqIUgH6oDz+NcIpBdFKpQxfhJQ5YOdB5ma25/pIzI9zdIvMY4LJg9hT3kdT60u8DSLiZ6QzixW1RsiHcQYv9pRWsOBmgZ+8LkTvI4CwEkjMlm95xB3vbyFcycOISslujuuTfSFetTQ8SLypoh87N6fIiI/iGw0Y/xh+a6DpCbGceGJQ72OAkCMCPOm5VBzuJlfvGy7Bv0g1K6hvwB3AE0AqroBuCpSoYzxi8r6JraUVDNjZGZYrjcQLoPTk7hxzmieXFXI8l0HvY5jIizUNa+fqq5o81hzuMMY4zer9pSjwMxRPe86T18/+zhys5L5j6c3UN9o5xb0ZaGOPnpARMYCCiAilwP7IpbKGB8IqL
Iq/xDjBqWG3A/f3qilkbJ4bTHnTRzCg//czVceWdVh11Uoo5SGMrKpjX7qnVBbBF8F7se5iH0R8A3gloilMsYHtu+vprK+qUe2BlqNzU7l5NFZvL/jAHsP2qUt+6pOWwQi8q2guy8BS3GKRy3wBeC3kYtmTN+2Iv8QKYlxnDC0Z4/ofsGkIWwrqebpNUV87axxxMf2nH0ZJjy6+kTT3FsecCuQCfTHaQ1MjGw0Y/qu6sNNbC2pYsaI/sTF9Owv1sT4WD5/Ug5lNQ12acs+qtMWgar+GEBEXgOmq2q1e/9O4KmIpzOmj1pXUEFAYfpIb08gC9Vxg9PIG+lc2nLSsHSGZ/bzOpIJo1A3RUYAjUH3G4FRYU9jjA+oKqv3HCI3M5lBad6MK9QdF0weSmpiHE+vKaS5JeB1HBNGoRaCR4EVInKniPwIWA48HLlYxvRdHxVVUlrd0GtaA62SE5wuov1VDby6scTrOCaMQr0wzc+BG4BDQAVwg6redSwLFpFYEVkrIkuOZT7G9DaLVhcSFyNMyenvdZSjNn5IOrPHDOD9nQfZtr/a6zgmTEI9jwBVXQOsCeOyvw5sxhnQzhhfaGhu4bl1xUwclk5yQuQvRRkJ508ewu4DtTy1upDbzxrndRwTBp4criAiw4HPAf/nxfKN8cpbm0uprG9ihsejjB6L+NgYrpyZS0NTC0+sLLD9BX2AV8et/R74HtDhGiQiN4nIKhFZVVZWFr1kxkTQ4nVFZKclMnZQqtdRjsng9CQunZbDrgO1/PLVrV7HMcco6oVARC4CSlV1dWfTqeoDqpqnqnnZ2dlRSmdM5FTWN7F0SxkXTxlGjIjXcY7Z9JGZnDImiwfe3cWSDcVexzHHwIsWwWnAJSKSDzwOnCUif/cghzFR9erHJTS2BLj0pGFeRwmbC08cyoyRmXxv0Qa2ltjO494q6oVAVe9Q1eGqOgpnKOu3VHV+tHMYE22L1xUxemAKJ+ZkeB0lbOJiYrj32umkJMZxy99XU1nf5HUk0w0hHzVkjOm+/VWHWbbrILefdRzSQbdQNEcWDafB6Unce+10rn7gQ7795DoeuC6PmJjIdH219x7ZKKXHztNBTlT1bVW1ax+bPu+F9cWowiXT+k63ULCZo7L44UUTeWNzKb953XYe9zbWIjAmCp5fX8yJORmMze7dRwt15vrZI9lSUsWfl+5kbHYql00f7nUkE6KePeyhMX3ArrIaNhRWMq+PtgZaiQg/mTeZ2WMG8P2nP2JlfrnXkUyIrBAYE2HPrStGBC6e2rcLATgnm903fzo5mcnc/Ohq9h6s8zqSCYEVAmMiSFV5fn0xs8cMYHB67xlp9Fj075fAg1/MoyWg3PjwSqoO25FEPZ0VAmMiqKiint0Havt8t1BbY7JTuW/+dHYfqOVrC9fSElCvI5lOWCEwJoLWF1SQEBvD+ZPav/B7X3bq2IH87NLJvLutjBc/2ud1HNMJKwTGREhAlQ1FlZw5PpuMfvFex/HEVSeP4CtzRvPhroMs23XQ6zimA1YIjImQXWW1VB9u5tKTcryO4qnvX3ACE4ak8eKGYruGQQ9lhcCYCFlfWEFiXAxnTRjkdRRPxcYIV+blMigticdW7GV/1WGvI5k2rBAYEwFNLQE2FlcyaVg6SfG98wI04ZQYH8v1s0cSHxvDI8vyOVjT4HUkE8QKgTERsLWkmsNNAaYO732Xo4yU/v0SuO6UkVQfbuaWv6+mobnF60jGZYXAmAhYX1hBSmIcY/rwkBLdkZvVj8tnDGdl/iHueOYjVO2w0p7AxhoyJswON7WwtaSamaOyiI3QKJxHqyeNbDpleH/Kahp4Zk0RNYebOXO8v/eh9ATWIjAmzDYWV9EcUKblWrdQR84aP4gpwzN4bdN+tpZUeR3H96wQGBNm6wsqyEpJYHhmstdReiwR4QvThzMkPYmnVhdSZRe08ZQVAmPCaH/VYXaW1TB1eP8OL0BjHPGxMVw1M5emlgBPri4gYPsLPG
OFwJgwemF9MQrWLRSiQelJXDxlGLvKanlvW5nXcXzLCoExYbR4XRE5/ZPJTkv0OkqvMWNkJpNzMnhjcykllXaymResEBgTJjtKq/m4qMpaA0dJRJg3dRhJCbEsWlNgI5V6wAqBMWGyeG0xMQJThmd4HaXXSUmMY97UYRRXHOYd6yKKOisExoSBqvLc+iJOGzeQtCR/jjR6rCbnZDBleAZLt5TaeERRZoXAmDBYs/cQBeX1XDrN3yONHquLpgwjIS6G59cX21nHUWSFwJgwWLy2mKT4GD47eYjXUXq11MQ4zp80hN0HanlmTZHXcXzDCoExx6ipJcCSDcWcc8JgUhNt1JZjNWNUJiOy+vE/L22moq7R6zi+YIXAmGP07rYyDtU1WbdQmMSIMG/aMCrqm/jNa9u8juMLVgiMOUaL1xWT2S+eucdnex2lzxiakcy1s0awcMVeu6pZFFg71phjUNPQzOubSrh8xnAS4trfrupJI39GQqT+vm+cczyL1xbx0yWbeORLJ9uQHRFkLQJjjsErH5dwuClg3UIRkJWSwNfPOZ73th/g7a12bkEkWSEw5hg8uaqA0QNTmDEy0+sofdJ1p4xkzMAUfvbiJppbAl7H6bOsEBjTTbsP1LJidzn/ljfcui0iJCEuhu+dP56dZbU8s9YOJ40UKwTGdNOi1QXECHxh+nCvo/Rpn500hCnDM7jnje3WKogQKwTGdENzS4BFqws5c/wgBqcneR2nTxMRvvvZ8RRV1LMiv9zrOH2SFQJjuuG97QfYX9XAFXm5XkfxhdPHDWT2mAEs3VJKQ3OL13H6HCsExnTDEysLGJCSwFkT7MLr0SAifPf88dQ2tvDBzoNex+lzol4IRCRXRJaKyGYR2SgiX492BmOOxb7Kel7fvJ/L8zo+d8CE3/QRmZwwJI33tpdR19jsdZw+xYu1uBn4tqqeAJwCfFVEJnqQw5huWbh8LwFV5s8a6XUU3zl34hAamgK8u+2A11H6lKgXAlXdp6pr3N+rgc2AnY1jeoXG5gCPrSjgrPGDyM3q53Uc3xmSkcTU3P4s23WAqsNNXsfpMzxt14rIKOAkYHk7z90kIqtEZFVZmZ1VaHqGlz/ex4GaBq6bba0Br5w9YRAtAWX5BZ7PAAASGklEQVTpllKvo/QZnhUCEUkFnga+oapVbZ9X1QdUNU9V87KzbTAv0zM8smwPowb0Y+5xtk56ZUBqInkjs1iVf8iGqQ4TTwqBiMTjFIF/qOozXmQw5mhtKKxg9Z5DzD9lJDExdiaxl84cnw0CS20MorDw4qghAR4ENqvqb6O9fGO66763d5KeFMeVM+3cAa/175fAzFGZrN5Tzt6DdV7H6fW8aBGcBlwHnCUi69zbhR7kMCZkO0preGVjCdfPHmUXp+8hzjh+EDEi/PGt7V5H6fWifj0CVf0nYO1q06s88O5OEmJjWHDaKK+jGFdGcjyzRmfxzNoibvvMOEYPTPE6Uq9lZ8MY04V9lfU8u7aIK2fmMjA10es4Jsjc47OJjxX++Ka1Co6FFQJjuvC/b+8koPCVOWO8jmLaSEuK5/rZo1i8rogdpTVex+m1rBAY04k9B2tZuGIvV+Tl2glkPdTNc8eQFB/LPdYq6DYrBMZ04levbiUuJoZvnnOc11FMBwakJrLg1FEs2VDM1hK70H13WCEwpgPrCypYsmEfX54zmkF2zYEe7StzxpCSEMc9b27zOkqvZIXAmHaoKne9vJmslARummv7Bnq6zJQEvnTaKF76qISNxZVex+l1rBAY046n1xTx4a5yvnnu8XbeQC9x45wxpCXF8fs3bF/B0bJCYEwbZdUN/HTJJvJGZnLtySO8jmNClJEcz1fmjOH1Tfv5qNBaBUfDCoExbfz4hY3UN7bwiy9MsTGFepkbThtFRnI8v3l9q9dRehUrBMYEeXHDPpZs2Me/nzWOcYNSvY5jjlJaUjy3njmWt7eWscwuaRmyqA8xYUykLVy+t8tprpn16S6fHa
XVfG/Rek4a0Z+bzxjb7Xmb6Gr7mSw4dRQPf5DPXS9vZvFtp1mrLgTWIjAGqGlo5uZHV5OcEMu91063axH3YknxsXz7vPFsKKzkxY/2eR2nV7C13fheU0uAbzy+jt0HavnD1ScxNCPZ60jmGH3+pBwmDEnjV69upbE54HWcHs8KgfG1loDy7SfX88bm/dx5ySROHTvQ60gmDGJjhDsuPIG95XX87f3dXsfp8awQGN9qCSh3PLOB59cX8x/nT+D62aO8jmTC6Izjszl7wiD+8OZ2SqsOex2nR7NCYHyp1t0n8OSqQm4/+zhuPbP9ncOmd/vhRRNpalF+8fIWr6P0aFYIjO9U1DVyxf3LeGvLfn58ySS+de7xXkcyETJqYApfmTuaZ9YWsXpPuddxeiwrBMZXNhRW8Ie3tpN/oJYHvziTL546yutIJsK++plxDM1I4r+e/dh2HHfACoHxhZqGZp5cVcDjKwsYmJrIktvn8JkJg7yOZaKgX0IcP7t0MltKqrn37R1ex+mR7IQy06cFVFmZX86rG0tobA5w1oRBfGb8ILu+rc+cfcJg5k0bxp+X7uD8yUOYMCTd60g9irUITJ+kqmwtqeKPb23nuXXFDMtI5vazjuOcEwYTa2ea+tKPLp5EelI831u0gaYW6yIKZoXA9Cmqys6yGv7y3m4eXraHphbl6pNHcOPpdnEZv8tKSeCnl05mQ2Elv37NBqULZl1Dpk9QVT7YeZB73tjOivxy0pLiuHjKUGaOziIuxrZ3jOPCE4dyzawR3P/OLk4ZM4DPjLf9RGCFwPRyqsrb28q4d+kOVuYfYkh6EhdPGUreqCziY60AmE/774smsmbPIb795HpevP10G1IEEFX1OkOX8vLydNWqVV7HCJv2RrBsbzTMSM67O6Nohpqxu6N/Hs18mlsCrCuo4KOiSraX1jA0I4lbzxzLFXm5PLOmKKScxj/arm+/f2Mb9769k4EpCXxl7hgS42LD9j/Yk4jIalXN62o6axGYXqXqcBOr9xxi2c6D1DQ0c8LQdH535VQ+d+IwGzHUhGxQWhJXz8zlkWV7eHxFAfNPGel1JE9ZITA9XnNLgM0l1azZc4jtpdUEFI4fnMrp47L54UUnIGJHAZmjN35IOhdPHcbz64t5YX0x808Z4dt1yQqB6ZEamwNsL61mU3EVGworqW9qIT0pjjnHZTN9RCbZaYkAvv3HNeFxypgBVNY38c62Mv7z2Y/5+aWTfXkhGysE5lNUleaAEggosbFCrEjEv3Ar65vYsq+KNXsrWJlfzord5dQ0NBMXI5wwNJ0ZIzMZNyiVGPviN2F23sTBADy2Yi9NLQF+cdmJxPnsQAMrBD7T0NzCxuIqVuWXU1bdwKG6Rirrm6hpaKaxOUBjS4DmFiX4EALBGd/9rpc3kxwfS2piHCmJcaQkOr/3S3DupybGkpIYx7aSahLjYkmIjyFGhEBACagSUAgElMPNLRQcqqO0qoHCQ3UUlNdRXPmvYYLHZKdw8dRhxMUIY7NTre/fRJSIcN7EwUwfkcnv3thGcUU9f7pmOlkpCV5HixorBH1cfWMLS7eUsiK/nFX55awvrPxk4K24GKF/vwT6J8czIDWRxLgY4mOdW0JcDDECzQGluUVpDgQYm51KfWMLNY3N1DU0U9vQQnHFYWobm6ltaKamoZnDTaGdsRkXIwxMTWR4ZjInj85i/JB0JgxN48ScDAamOt0+dn1gEy0iwtfPOY6czGT+89mPuPiP/+TP105nWm5/r6NFhRWCPqairpE9B+vIP1jLnoN17K86jOJ88U7OyWDBqaOYPiKT7furyUxJOKqullAOr2sJKA9/kE9Dc4CGphYUEIFYEWJEiIkRkuJiWHDaKOvfNz3O5TOGc/zgVG55dDWX3fs+N54+mm+dO57khFivo0WUFYJerKG5ha0l1awvrGR1fjkr8w9RVFEPQEJcDCOz+jE5ZzA3nj6aabn9j1iZy2sbI5IpNkZIio8lKT4Wku
M7nM6KgOmppgzvzyvfnMtdL23hL+/t5uWPS/jGOcdz6bRhfXbfgRWCXqCmoZmiQ/Wf9Kdv3lfNR0WVbNtfTXPA6c3PTkvk5FFZnDSiP6MGpDA4PemTwdVmjx3gZXxjep30pHjuuuxELpk6jJ+/tInvPLWee5fu4Eunj2betGGkJXW8kdMbWSGIguaWADUNzVQfdm67D9TS0NTC4eYWDjc5XSh7y+uoPtzkTuP8rDrc5O7QbTpifpn94pmck8FN48cwOSeDE3MyGJ6ZjIhYv7oxYTR77ABe+NrpvLpxP398azs/WPwx//PSZi6YPJTPThrMnOOy+0S3kSeFQETOB+4BYoH/U9VfeJHjWKgqVfXNlNU0UFbdQGn1YcqqG5z7VQ2fPF5W3UB5XSNdjeQRH1tKWlI8aUlxpCXFkZ4Uz+iBKeSNyiI3sx/DM5MZnplMTmYy2amJ1rViTJSICOdPHsJnJw1mfWElC5fv4ZWPS3h6TSGJcTFMze3PyaOymJyTztjsVEYOSOl1R7pFvRCISCzwZ+BcoBBYKSLPq+qmSCxP3cMWmwMBAgHnZ0tAP7k1uz/rGluoaWimzj0CprahhdpG50iYiromDtQ0cKCmkYM1DRysaeRgbQNNLZ/+dk+IjSE7LZHstERys/oxfWQm2amJZCTHk5oUR3pSHCt2HyIpPobEuFgS42NIiovli6eOtC93Y3owEWFabn+m5fbn558/kRW7y3lrSymr8su5752dtLjdtLExwoisfowZmMKg9EQGpCSSlZLAgNQE+vdLIDk+9pP//7Y/Y2IgRlrP3YnevjQvWgQnAztUdReAiDwOzAPCXghufGglb24pPeb5JMbFMDA1kYGpCQxOT2LSsHQGpCYyICXhky/9QWmJZKcmkZ4c1+WHV17b9KnHrAgY03vEx8Zw2riBnDZuIAB1jc3sLK1lZ1nNJ7ddZbWsL6ygvLaRQDfH9hSB1785l3GD0sKYvp3lRHv0URG5HDhfVb/s3r8OmKWqX2sz3U3ATe7d8cDRXEliIHAgDHG9Yvm9Zfm9ZfnDZ6SqZnc1kRctgvY2fT9VjVT1AeCBbi1AZFUoQ6/2VJbfW5bfW5Y/+rzYo1EI5AbdHw4Ue5DDGGMM3hSClcBxIjJaRBKAq4DnPchhjDEGD7qGVLVZRL4GvIpz+OhfVXVjmBfTrS6lHsTye8vye8vyR1mvuFSlMcaYyOldZz0YY4wJOysExhjjc32iEIhIloi8LiLb3Z+ZnUybLiJFIvKnaGbsTCj5RWSkiKwWkXUislFEbvEia3tCzD9NRJa52TeIyJVeZG1PqOuPiLwiIhUisiTaGdsjIueLyFYR2SEi32/n+UQRecJ9frmIjIp+yo6FkH+uiKwRkWb3/KMeJYT83xKRTe76/qaIjPQiZyj6RCEAvg+8qarHAW+69zvyU+CdqKQKXSj59wGnquo0YBbwfREZFsWMnQklfx1wvapOAs4Hfi8iPeWqH6GuP78Crotaqk4EDdVyATARuFpEJraZ7EbgkKqOA34H3B3dlB0LMf9eYAGwMLrpuhZi/rVAnqpOARYBv4xuytD1lUIwD3jY/f1h4NL2JhKRGcBg4LUo5QpVl/lVtVFVG9y7ifSszy6U/NtUdbv7ezFQCnR5xmOUhLT+qOqbQHW0QnXhk6FaVLURaB2qJVjw37UIOFt6zlgmXeZX1XxV3QCEdtm76Aol/1JVrXPvfohzzlSP1JO+TI7FYFXdB+D+HNR2AhGJAX4DfDfK2ULRZX4AEckVkQ1AAXC3+4XaE4SUv5WInAwkADujkC0UR5W/h8jBWQ9aFbqPtTuNqjYDlUBPuThFKPl7sqPNfyPwckQTHYNecz0CEXkDGNLOU/8V4ixuA15S1QIvNorCkB9VLQCmuF1Ci0VkkaruD1fGzoQjvzufocCjwBdVNWpbeuHK34OEMlRLSMO5eKQnZwtFyPlFZD6QB5wR0UTHoNcUAlU9p6PnRGS/iAxV1X3uF017Q4
7OBuaIyG1AKpAgIjWq2tn+hLAJQ/7geRWLyEZgDk6TP+LCkV9E0oEXgR+o6ocRitqucL7/PUQoQ7W0TlMoInFABlAenXhd6u1DzYSUX0TOwdnYOCOoa7fH6StdQ88DX3R//yLwXNsJVPVaVR2hqqOA7wCPRKsIhKDL/CIyXESS3d8zgdM4uhFZIymU/AnAszjv+1NRzBaKLvP3QKEM1RL8d10OvKU95wzS3j7UTJf5ReQk4H7gElXt2RsXqtrrbzj9nm8C292fWe7jeThXQGs7/QLgT17nPpr8OBfy2QCsd3/e5HXuo8w/H2gC1gXdpnmd/WjWH+A9oAyox9ki/KzHuS8EtuHsa/kv97Gf4HzxACQBTwE7gBXAGK/f66PMP9N9n2uBg8BGrzMfZf43gP1B6/vzXmfu6GZDTBhjjM/1la4hY4wx3WSFwBhjfM4KgTHG+JwVAmOM8TkrBMYY43NWCEyfJyIt7qitH4vIC90d7E5E/q+dgcUQkQXHMpqtiNR097XGhIMVAuMH9ao6TVUn45xZ+9XuzERVv6yqm8IbzRjvWSEwfrOMoMHBROS7IrLSHTP+x+5jKSLyooisd1sRV7qPvy0iee7vN4jINhF5B+cs79b5PRQ8dn7r1r6IpLpj0q8RkY9EpO1IoYjIUBF5N6j1MidSb4IxwXrNWEPGHCt3DPmzgQfd++cBx+EMKSzA8yIyF2d47GJV/Zw7XUab+QwFfgzMwBnRcynO2POdOQx8XlWrRGQg8KGIPK9HntF5DfCqqv7czdrvmP5gY0JkLQLjB8kisg5nmIIs4HX38fPc21pgDTABpzB8BJwjIneLyBxVrWwzv1nA26paps5Y9E+EkEGA/3GHEX8Dp1UyuM00K4EbRORO4ERV7SnXPjB9nBUC4wf16lzZbSTOdRBa9xEIcJe7/2Caqo5T1QdVdRvO1v5HwF0i8t/tzLOjsVmacf+v3IvAJLiPX4vT0pjhZtmPMxbQv2ao+i4wFygCHhWR67v35xpzdKwQGN9wt+xvB74jIvHAq8CXRCQVQERyRGSQe72HOlX9O/BrYHqbWS0HzhSRAe58/i3ouXycIgLOFavi3d8zgFJVbRKRz+AUpSO417QtVdW/4HRftV2uMRFh+wiMr6jqWhFZD1ylqo+KyAnAMvdiRTU4o6SOA34lIgGcEVNvbTOPfW73zTKca0mvAWLdp/8CPCciK3BGMq11H/8H8IKIrMIZiXJLO/HOBL4rIk1uFmsRmKiw0UeNMcbnrGvIGGN8zgqBMcb4nBUCY4zxOSsExhjjc1YIjDHG56wQGGOMz1khMMYYn/t/yTgknjEjs/0AAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"from scipy.stats import shapiro\n",
"sns.distplot((y_test - predictions),bins = 50)\n",
"plt.xlabel('Residuals')\n",
"plt.ylabel('density')\n",
"plt.title('Histogram of residuals (Shapiro W p-value = {0:0.3f})'.format(shapiro(y_test-predictions)[1]))\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The histogram shows that the residuals are negatively skewed, and the Shapiro-Wilk p-value reported in the title is well below 0.05, so we reject the hypothesis that the residuals are normally distributed. This is further evidence that the model has room for improvement."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"shap_w, shap_p = shapiro(y_test - predictions)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2.519790541555267e-06"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shap_p"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 11. Computing the mean absolute error, mean squared error, root mean squared error, and R-squared, and putting them into a DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import metrics"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"# r2_score is the R-squared metric; explained_variance_score is a different metric\n",
"metrics_df = pd.DataFrame({'Metric':['MAE','MSE','RMSE','R-Squared'],\n",
"                          'Value':[metrics.mean_absolute_error(y_test,predictions),\n",
"                                   metrics.mean_squared_error(y_test,predictions),\n",
"                                   np.sqrt(metrics.mean_squared_error(y_test,predictions)),\n",
"                                   metrics.r2_score(y_test,predictions)]}).round(3)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Metric</th>\n",
" <th>Value</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>MAE</td>\n",
" <td>0.059</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>MSE</td>\n",
" <td>0.006</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>RMSE</td>\n",
" <td>0.080</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>R-Squared</td>\n",
" <td>0.629</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Metric Value\n",
"0 MAE 0.059\n",
"1 MSE 0.006\n",
"2 RMSE 0.080\n",
"3 R-Squared 0.629"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"***Mean absolute error (MAE)*** is the average absolute difference between the predicted values and the actual values. \n",
"\n",
"***Mean squared error (MSE)*** is the average of the squared differences between the predicted and actual values. \n",
"\n",
"***Root mean squared error (RMSE)*** is the square root of the MSE. \n",
"\n",
"***R-squared*** is the proportion of the variance in the dependent variable that the model explains. In this simple linear regression model, **GRE Score** explains 62.9% of the variance in **Admit**; the remaining 37.1% is left unexplained by this single predictor. Additionally, the MAE shows that our predictions were off by about 0.059 **Admit** units on average."
]
},
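{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the definition above, R-squared can be computed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The cell below assumes `y_test`, `predictions`, `np`, and `metrics` from the earlier cells are still in scope; the names `actual`, `resid`, `ss_res`, `ss_tot`, and `r2_manual` are introduced here only for illustration. Note that scikit-learn's `explained_variance_score` coincides with `r2_score` only when the residuals average to zero."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative check; relies on y_test and predictions defined in earlier cells\n",
"actual = np.asarray(y_test, dtype=float)\n",
"resid = actual - np.asarray(predictions, dtype=float)\n",
"ss_res = np.sum(resid ** 2)                     # residual sum of squares\n",
"ss_tot = np.sum((actual - actual.mean()) ** 2)  # total sum of squares\n",
"r2_manual = 1 - ss_res / ss_tot\n",
"# The hand-computed value should agree with scikit-learn's r2_score\n",
"print(r2_manual, metrics.r2_score(y_test, predictions))"
]
},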
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise: Fitting a Multiple Linear Regression Model and Determining the Intercept and Coefficients"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Instantiating the Multiple Linear Regression model and fitting the model"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.linear_model import LinearRegression"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"model = LinearRegression()"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.fit(X_train,y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6. Extracting the model intercept and regression coefficients"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"intercept = model.intercept_"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-1.4242541443027852"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"intercept"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"coefficients = model.coef_"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.00217167, 0.00294465, 0.00431416, 0.00161238, 0.01659515,\n",
" 0.12281766, 0.02050198])"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"coefficients"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7. Printing the equation using the coefficients we got above"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Admit_Predict = -1.4243 + (0.0022 x GRE Score) + (0.0029 x TOEFL Score) + (0.0043 x University Rating) + (0.0016 x SOP) +(0.0166 x LOR) + (0.1228 x CGPA) +(0.0205 x Research)\n"
]
}
],
"source": [
"print('Admit_Predict = {0:0.4f} + ({1:0.4f} x GRE Score) + ({2:0.4f} x TOEFL Score) + ({3:0.4f} x University Rating) + ({4:0.4f} x SOP) +({5:0.4f} x LOR) + ({6:0.4f} x CGPA) +({7:0.4f} x Research)'.format(intercept, \n",
" coefficients[0], \n",
" coefficients[1], \n",
" coefficients[2], \n",
" coefficients[3], \n",
" coefficients[4], \n",
" coefficients[5], \n",
" coefficients[6]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 8. Implementing the above equation and predicting the Admit Scores"
]
},
{
"cell_type": "code",
"execution_count": 117,
"metadata": {},
"outputs": [],
"source": [
"Admit_Predict = -1.4243 + (0.0022 * X_train['GRE Score']) + (0.0029 * X_train['TOEFL Score']) + (0.0043 * X_train['University Rating']) + (0.0016 * X_train['SOP']) +(0.0166 * X_train['LOR ']) + (0.1228 * X_train['CGPA']) +(0.0205 * X_train['Research'])"
]
},
{
"cell_type": "code",
"execution_count": 118,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"443 0.861248\n",
"497 0.955868\n",
"124 0.656416\n",
"50 0.679840\n",
"331 0.628636\n",
"dtype: float64"
]
},
"execution_count": 118,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Admit_Predict.head()"
]
},
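{
"cell_type": "markdown",
"metadata": {},
"source": [
"The coefficients typed into the equation above are rounded to four decimal places, which introduces small rounding differences. As a hedged sketch, the same equation can be evaluated at full precision from `intercept` and `coefficients` and compared with `model.predict`. This assumes the columns of `X_train` are in the same order as when the model was fitted and that `np` is already imported; `manual_pred` is a name introduced here for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative: evaluate the fitted equation at full precision\n",
"manual_pred = intercept + X_train.values @ coefficients\n",
"# LinearRegression.predict computes the same linear combination,\n",
"# so the two results should match up to floating-point error\n",
"print(np.allclose(manual_pred, model.predict(X_train)))"
]
},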
{
"cell_type": "code",
"execution_count": 119,
"metadata": {},
"outputs": [],
"source": [
"predictions = model.predict(X_test)"
]
},
{
"cell_type": "code",
"execution_count": 120,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(165,)"
]
},
"execution_count": 120,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions.shape"
]
},
{
"cell_type": "code",
"execution_count": 121,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.54539918, 0.55363752, 0.78000432, 0.597551 , 0.64566267,\n",
" 0.68833295, 0.82664645, 0.68428919, 0.65744214, 0.81637454,\n",
" 0.80592038, 0.69639124, 0.73953556, 0.72629543, 0.91058275,\n",
" 0.65096547, 0.86424101, 0.55253423, 0.56833614, 0.69443069,\n",
" 0.93981603, 0.82809052, 0.7057749 , 0.68265075, 0.84866986,\n",
" 0.41677175, 0.45426869, 0.78282754, 0.74439537, 0.62941571,\n",
" 0.60821993, 0.71991221, 0.65369032, 0.47574244, 0.56190713,\n",
" 0.41525819, 0.70733081, 0.61018855, 0.5510159 , 0.59136738,\n",
" 0.58473005, 0.64840503, 0.66023105, 0.8882523 , 0.86243497,\n",
" 0.86309401, 0.91663565, 0.72721531, 0.69744991, 0.71971577,\n",
" 0.68922378, 0.56502979, 0.71907273, 0.48716266, 0.46391183,\n",
" 0.58724119, 0.63811268, 0.52989804, 0.69304485, 0.52932209,\n",
" 0.96703953, 0.50234456, 0.760185 , 0.80683706, 0.6320383 ,\n",
" 0.70421576, 0.65733996, 0.95995125, 0.90553398, 0.81891241,\n",
" 0.63092162, 0.51734089, 0.81072184, 0.64351458, 0.93423053,\n",
" 0.68180096, 0.59762578, 0.70679619, 0.81197215, 0.70090992,\n",
" 0.59868914, 0.61671976, 0.75784717, 0.69352287, 0.91258705,\n",
" 0.73889149, 0.62336392, 0.84694287, 0.78719785, 0.95083272,\n",
" 0.58962115, 0.83550833, 0.90836346, 0.78785687, 0.53057778,\n",
" 0.81483149, 0.6040547 , 0.64881483, 0.635637 , 0.50640703,\n",
" 0.63748851, 0.63647359, 0.56320363, 0.88665176, 0.85762558,\n",
" 0.9594266 , 0.59228023, 0.6218097 , 0.85438238, 0.84801528,\n",
" 0.96605969, 0.75338374, 0.89920837, 0.59894699, 0.85988345,\n",
" 0.68324326, 0.50029916, 0.60548738, 0.88691487, 0.66408934,\n",
" 0.83845589, 0.73074958, 0.78803455, 0.71693481, 0.7619414 ,\n",
" 0.6623658 , 0.7788172 , 0.79581755, 0.56030919, 0.61204924,\n",
" 0.73125575, 0.97052884, 0.90841142, 1.00860231, 0.79695337,\n",
" 0.7402222 , 0.60401769, 0.77472074, 0.59145471, 0.52849543,\n",
" 0.86786918, 0.72693087, 0.66230199, 0.7716777 , 0.68760897,\n",
" 0.9152402 , 0.78942901, 0.65033575, 0.64887332, 0.50990954,\n",
" 0.86208432, 0.66961578, 0.57715668, 0.83828783, 0.80576419,\n",
" 0.89029024, 0.70912519, 0.591096 , 0.65530893, 0.62941468,\n",
" 0.63998591, 0.61842808, 0.7481594 , 0.74215304, 0.61427862])"
]
},
"execution_count": 121,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 9. Plotting the predicted versus actual values on a scatterplot using the following code:"
]
},
{
"cell_type": "code",
"execution_count": 122,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 123,
"metadata": {},
"outputs": [],
"source": [
"from scipy.stats import pearsonr"
]
},
{
"cell_type": "code",
"execution_count": 124,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.scatter(y_test,predictions)\n",
"plt.xlabel('Y test(True values)')\n",
"plt.ylabel('Predicted Values')\n",
"plt.title('Predicted vs Actual value(r = {0:0.2f})'.format(pearsonr(y_test,predictions)[0]))\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"|Strength of Association|Positive Coefficient, r|Negative Coefficient, r|\n",
"|-------|-----------|------------|\n",
"|Small|0.1 to 0.3|-0.1 to -0.3|\n",
"|Medium|0.3 to 0.5|-0.3 to -0.5|\n",
"|Large|0.5 to 1.0|-0.5 to -1.0|\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 10. Plotting the residuals\n",
"\n",
"As discussed in the Introduction section, a **residual** is the difference between the actual and the predicted value of the dependent variable."
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {},
"outputs": [],
"source": [
"import seaborn as sns\n",
"from scipy.stats import shapiro"
]
},
{
"cell_type": "code",
"execution_count": 126,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\tkhan050\\AppData\\Local\\Continuum\\anaconda3\\lib\\site-packages\\scipy\\stats\\stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.\n",
" return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XmcHGW1+P/PmZ59XzMzmSSTfSGLgQTCGlB2RIEfoIAIioq7XvWrX67X5fr9XnG9XL1frwqiF1Q2BQQELgJhSYCQjQSyrzOZLJPZ932mz++PqoHOZJaeyfRUL+f9Sr/SU11ddZ7u6jr1PE/VU6KqGGOMiV1xXgdgjDHGW5YIjDEmxlkiMMaYGGeJwBhjYpwlAmOMiXGWCIwxJsZFfCIQke0icoHXcYQjESkUkdUi0iIi/x7idQ35PYjIBSJyeJzW84qIfHqM731dRE4Ncl4VkdljWc8Iyz1PRHaP93LDnYh8QkRe8zqOWCEiSSKyS0QmBTN/WCcCESkXkYsGTDtug1LVhar6ygjLme7+sONDFGq4uh2oBTJV9RuhXFEw34OXRORDQIuqbnb/zhaRP4jIMTdR7hGR/x3qOFR1jarOG49liciNIrJjwLQXhph2x3isM5qIyIXuzrJdRF4WkdJh5p3uztPuvmfgfulr7rbU5G5XSV6+V1W7gD8AQW3TYZ0IIkUYJ5hSYIcGedWgiPhCHI+XPgf8KeDv/wDSgQVAFvBhYL8Hcb1rDNvRq8ACESkIeP/7gNQB084CVo9nrJFORPKBx4HvArnARuCRYd7yELAZyAP+BXg04DO+FLgDuBCYDswEfhAG730QuDUwsQxJVcP2AZQDFw2Y9gngtcHmAc7A+UKbgSrgLnd6BaBAq/s4CycJfgc4CFQDfwSyApZ7i/taHc7GEriefwUeBf7sruvT7rrXAo1AJfArIDFgeQp8AdgLtAD/F5jlvqcZ+Ev//EA+8LS7rHpgDRA3xGd0NrABaHL/P9udfh/QA3S7Zb5okPfeB/wGeBZoAy4CkoCfu59ZFfBbIGWkuAZ8PinushuAHcA3gcMDPovZA+L4N/d5jruOGvf9TwNTAuZ9Bfi0+3w2zs6wCafm88gQn1Ei0DFgOduAq4fZ9hQneex14/gvQNzXZgEvudtGLfAAkD1gm/xnt+wNwH8Dye5rFwz4LMpxjtreAbqAeJzk9Ir7OW8HPjxMnPuBawO2/5eB+wdMaydgWxyknF8BDrhl+RlDb2u/BX4+YNqTwNfd53e48bS4Zb9msN8tzk5LgfjBvlf379uAne7n9w+gdJz3LbcDbwT8neZuI/MHmXeu+91kBExbA3zOff4gcGfAaxcCx7x8b8C0vcD5I30e0VYj+CXwS1XNxPmx/sWdvtL9P1tV01V1Lc6G+Qng/TiZNB1n542InAL8GvgYUIxzxFgyYF1X4SSDbJwdQR/wNZyd5Vk4X8oXBrznMmAZcCbwLeAedx1TgUXAje583wAOAwVAIfBtnB/OcUQkF3gG+E+cI4a7gGdEJE9VP+HG9VO3zC8O8ZndBPwQyABeA36CswEuxdnRlgDfG01cwPdxPv9ZwKXArUOsezBxODvOUmAazo/zV0PM+3+B53GSxxTg/w0x3xzAr6qB/RRvAj8UkU+KyJwh3nclcDrOUfZHcMoCIMCPgMk4O+2pOAcHgT7mzj8L5/P8zhDrAOd7/yDOtiTA391yTQK+DDwgIkM1J63mve17Jc6O4rUB095U1e5h1n8NsBw4DWe7vm2I+R4EPioiAiAiOcAlwMPu6/uB83B+Lz8A/iwixcOsd1AicjXOtvX/4Wxra3COjIeav3GYx1BNYguBt/v/UNU2N/6FQ8x7QFVbAqa9HTDvcctynxeKSJ6H7+23E2f7HVYkJIInAr9YnB30UHqA2SKSr6qtqvrmMPN+DKfGcEBVW3GO4G5wq9LXAX9X1dfcH9D3OHGHt1ZVn1
BVv6p2qOomVX1TVXtVtRy4Gzh/wHt+oqrNqrod54j0eXf9TcD/AP0dmT04CahUVXvUaVcebIf7QWCvqv7JXe9DwC7gQ8OUe6AnVfV1VfXjHH18Bviaqta7G+CdwA2jjOsjwA/dZRzCSVRBUdU6VX1MVdvd9f+QEz/Hfj04CWOyqnaq6lCdkdk4R6mBvoyTKL8E7BCRfSJy+YB5fqyqjapagXOkvdSNcZ+qvqCqXapag5OAB8b4K1U9pKr1bhluZGj/6c7bgXOQkO6uu1tVX8KpFQ31/ld5b6d/Hs5Oc82Aaa8Os25wtst6t5y/GGZda3B+B+e5f1+H8zs4CqCqf1XVo+5v4hGco9EzRlj3YD4L/EhVd6pqL842uHSoNnxVzR7m8eMh1pGOU5MM1IRzQDTaeQe+3v88w8P39mvB2f6HFQmJ4OrAL5YTj7IDfQrn6GuXiGwQkSuHmXcyTtNPv4M41fJC97VD/S+oajtOM0CgQ4F/iMhcEXna7bhpxtl48we8pyrgeccgf6e7z38G7AOeF5EDwxzVDCxDfzkG1l6GE1iOAiAV2BSQeJ9zp482rsDlDoxxSCKSKiJ3i8hB93NcDWQP0X/xLZwj6PXinLU01JFsAwN+4G7yvlNVl+HUpv4C/NWtZfU7FvC8Hff7EZFJIvKwiBxxY/wzJ37XA8s/eZhiB847GTjkJubA9w/1na4GlrhH52fi7Jh3AcXutHMZuX9g0Fjdz7TVfZznJv2HeS9R3ISTTHHnv0VEtgRsO4s48XMJRinwy4Dl1ON8z6PZrkfSCmQOmJbJiQcMwcw78PX+5y0evrdfBk4T47AiIREETVX3quqNOFXqn+B0rKQxePPFUZwNrt80oBdn51yJ09QAgIik4OwsjlvdgL9/g3M0Psdtmvo2zsY7lnK0qOo3VHUmztH910XkwiDK0F+OI6NZXcDzWpyEtDAg+Wapavoo46rEaS4JjClQO07C6VcU8PwbwDxghfs59h/ZnvBZquoxVf2Mqk7GOYr8tQx+yudeQERk0B2JqvYn7jRgxmDzDPAjnM9tiRvjzYPEN7D8R4dZXuB3cBSYKiKBv80hv1NVPeC+53agwq3dgtP3dDtO8hquZjxkrOqcCZbuPta4rz8EXOcena8AHgNw//4dTg0rzz1o28bgv4E29/+htoFDwGcHHNmnqOobgwUfkKwGe3x7iDJvJ6DJxN1PzHKnDzbvTBEJPJh4X8C8xy3LfV6lqnUevrffAo5vPhpUVCUCEblZRArco6n+LNiH0/Hox+kL6PcQ8DURmSEi6Tg7gkfcquijwIdE5GwRScRp7xxpp56B0+nbKiLzgc+fRDmuFJHZbltss1uGvkFmfRaYKyI3iUi8iHwUOAWnKWHU3M/td8B/iHv+sYiUuGcnjCauvwD/LCI5IjIFpxkm0BbgJhHxichlHN+skoGTjBrdo/PvDxWviFzvLh+co34dLB5V7QFeDFyPiHxXRE4XkUQRSQa+irPNBHOOfwbO0Vijm1y+Ocg8XxSRKW4Zvs3wZ6QEWoezo/yWiCSIc23Gh3ivHX4wa4Cvu//3e82dttFtchrON93vairO5zBkrOqcflsD3Av8Q1X7f2f9B1w1ACLySZwawWDLqMFJbDe728BtODvhfr/F2X4WusvKEpHrh4kpfZjHnUO87W/AIhG51v3+vwe849amBi5/D842+30RSRaRa4AluEkQ50STT4nIKW4t7Ds4J0B49l73cyvBOSNqpAOB6EoEOJ2x20WkFafj+Aa37bgdp532dbe6eSbOObZ/wqk2lwGduDsstw3/yzg/vkqcqlY1Thv6UP4XTlW5BWdnGuwPfzBzcHZcrThHdr/WQc7RdzP/lThH0XU4TSVXqmrtSaz7f+M0/7zpNnu8iHOEHnRcOInzIM7n+jzHn7YJzs7mQzg73o8BTwS89gucs45qcTbg54aJ9XRgnft9PwV8VVXLhpj3buDjAX8rTqd0Lc4R8MXABwOOqIfzA5
yO1SaczvrHB5nnQZyyH3Af/xbEclGnT+rDwOVubL8GbhlsBxXgVZxacGAfyRp3WjCnjT4JbMLZ6TwD/H6E+R/COcPswYC4dwD/jrNdVAGLgdeHWcZncBJoHU6n57tH+6r6N5wa/cPuNrgN5/MYN24yuhZnv9CAU7vp7wtDRH4rIr8NeMsNOB3qDcCPgevcZaCqzwE/xelHOug+vh8G770JuF+dawqG1X86nBmGW2NoxGn2GWpHY8KcOBciftk9qg3lespxToUc6kytsCEiirNd7/M6FjN+xLl24G1gpapWjzR/uF4I5TlxrkRdhdMk9HNgK8453yZCqeq5XsdgzERwawHzg50/2pqGxtNVOE0GR3GaRG5Qqz4ZY6KQNQ0ZY0yMsxqBMcbEuIjoI8jPz9fp06d7HYYxxkSUTZs21apqwUjzRUQimD59Ohs3bvQ6DGOMiSgiEtRV/dY0ZIwxMc4SgTHGxDhLBMYYE+MsERhjTIyzRGCMMTHOEoExxsQ4SwTGGBPjLBEYY0yMs0RgjDExLiKuLDbGjM2D6ypOmHbTioF3DjWxzmoExhgT4ywRGGNMjLNEYIwxMc4SgTHGxDhLBMYYE+MsERhjTIyzRGCMMTHOEoExxsQ4SwTGGBPjLBEYY0yMs0RgjDExzhKBMcbEOEsExhgT4ywRGGNMjAvZMNQi8gfgSqBaVRe503KBR4DpQDnwEVVtCFUMxoSzkxkierD3GjNWoawR3AdcNmDaHcAqVZ0DrHL/NsYY46GQJQJVXQ3UD5h8FXC/+/x+4OpQrd8YY0xwJrqPoFBVKwHc/ycNNaOI3C4iG0VkY01NzYQFaIwxsSZsO4tV9R5VXa6qywsKCrwOxxhjotZEJ4IqESkGcP+vnuD1G2OMGWCiE8FTwK3u81uBJyd4/cYYYwYIWSIQkYeAtcA8ETksIp8CfgxcLCJ7gYvdv40xxngoZNcRqOqNQ7x0YajWaYwxZvTCtrPYGGPMxLBEYIwxMc4SgTHGxDhLBMYYE+MsERhjTIyzRGCMMTHOEoExxsS4kF1HYIyZOEcbO6hp6SItKZ68tERy0hK9DslEEEsExkQwvyov76rmpV3VqDtNgKtPLeH06blehmYiiCUCYyJUV08ff3zzIGW1bSydms3KOQW09/Syek8Nf9t8hJ4+P2fPyvc6TBMBLBEYE6Ge3VZJeW0b1542hWWlOe9On5aTysMbDvH0O5UkxfuOe82YwVhnsTERaFdlMxvKGzhvTsEJO/p4Xxw3njGN0rxUntt+jK6ePo+iNJHCEoExEaa+rZvHNx+hKDOZixYMfpM/X5xwxaJi2rp6WbOvdoIjNJHGEoExEeanz+2io7uP65dPId439E94am4qi0qyeG1vLS2dPRMYoYk0lgiMiSBHGzt47K3DnD4jh+KslBHnv+SUQnr9flbtspsBmqFZIjAmgtyz+gCqsHJOcPfxzk9PYnlpLpvKG2jv6g1xdCZSWSIwJkLUtnbx0PoKrjm1hOzU4C8YWzEzlz5VthxuDGF0JpJZIjAmQvz+tTK6+/x87oJZo3pfcVYKk7OSeetgQ4giM5HOEoExEaC9u5c/rz3IFYuKmVWQPur3n1aaw9GmTiqbOkIQnYl0lgiMiQDPbTtGS1cvHz+rdEzvXzolG1+csMlqBWYQdmWxMWHkwXUVJ0y7acU0/rrxMNNyU1kxY2zjB6UmxbOgKIMthxrp7vWTGG/HgOY9tjUYE+YO1bez9kAd1y2bgoiMeTnLSnNo7+7jld12Kqk5niUCY8LcY28dRgSuXTblpJYze1IGyQlxvLizapwiM9HCEoExYcyvyqObDnP2rDxKske+gGw4vjhhbmEGq3ZW0+fXkd9gYoYlAmPCWHltG4cbOrh+2dRxWd6Cokzq2rrZcsiuKTDvsURgTBjbeqSJ5IQ4Ll1YNC7Lm1uYQXycWPOQOY4lAmPClF+VnZXNnD+3gJRE37gsMyXRx4qZuby4wxKBeY8lAmPC1JGGDp
o7e8etNtDvogWF7K1upby2bVyXayKXJQJjwtT2o83ECVw4v3Bcl3vRAmd51jxk+nmSCETkayKyXUS2ichDIpLsRRzGhCtVZfvRJmYWpJOVmjCuy56am8o89+whY8CDRCAiJcBXgOWqugjwATdMdBzGhLPqli7q2ro5pTgzJMtfOTefTQcb6Oi221ga75qG4oEUEYkHUoGjHsVhTFjafrQZAU6ZHJpEcN6cArr7/KwrqwvJ8k1kmfBEoKpHgJ8DFUAl0KSqzw+cT0RuF5GNIrKxpqZmosM0xlO7jjUzJSeFzOTxbRbqd8aMXBLj43htr93P2HjTNJQDXAXMACYDaSJy88D5VPUeVV2uqssLCoK7G5Mx0aCtq5cjDR3MK8oI2TqSE3ycPj2H1+zG9gZvmoYuAspUtUZVe4DHgbM9iMOYsLSvuhUF5kwKXSIAOHd2AbuOtVDd0hnS9Zjw50UiqADOFJFUcYZSvBDY6UEcxoSlPVUtpCb6KMk5ubGFhvLgugoeXFdBq3sP4589tzsk6zGRw4s+gnXAo8BbwFY3hnsmOg5jwpGqsq+6ldmT0ok7iSGng1GclUxqoo991a0hXY8Jf57cmEZVvw9834t1GxPOjjV30tLVG/JmIYA4EWYVpLOvphVVPal7HZjIZlcWGxNG9lY5R+dzCkd/X+KxmDMpnZbOXvZarSCmWSIwJozsqW6hKDM5ZKeNDjSrwEk4b9jZQzHNEoExYaKrt4+Dte0TVhsAyElLJDctkdf324VlscwSgTFhoqKunT5VZk+auEQAMDM/jTcP1Nldy2KYJQJjwkRZbRtxAqW5aRO63lluP8G2I00Tul4TPiwRGBMmyuraKMlOITF+Yn+WM/OdxPP6fusniFWWCIwJAz19fg43dDA9f2JrAwAZyQnMK8xgrfUTxCxLBMaEgUMN7fT5lRkeJAKAs2fnsaG8nq5eG5Y6FlkiMCYMlNe2IUx8/0C/s2fl09njZ3NFoyfrN96yRGBMGCirbaMoK3ncblI/Witm5hIndj1BrLJEYIzHev1+KurbPekf6JeZnMDiKdl2PUGMskRgjMeONnbS06fMyPMuEQCcMyuPtw81vjsqqYkdngw6Z4x5T3ltG8CQNYIH11WEPIYH11XQ2eOn16/87LldzCvK5KYV04J630DBvM+EF6sRGOOxsto2CjKSSE/y9risNC8VX5ywv6bN0zjMxLNEYIyH/KqU17V53iwEkOCLY1puKvtrbCTSWGOJwBgPHWvqpKvX72lHcaBZBelUNnXSbv0EMcUSgTEeKuvvH8hL9TgSx+wCJyHtr7XmoVhiicAYD5XXtZGTmkB2aqLXoQBQkpNKYnycNQ/FGEsExnhEVSmrbWNG/sQOOz0cX5wwIy+NA5YIYoolAmM8Ut3SRXt3HzPyw6NZqN+sgjRqW7upbOrwOhQzQYJKBCLymIh8UEQscRgzTsrr+vsHwqOjuN+sSf23r7SrjGNFsDv23wA3AXtF5MciMj+EMRkTE8pq28hMjic3LTz6B/oVZiaTmuiz+xPEkKASgaq+qKofA04DyoEXROQNEfmkiEzMXbaNiSKqSnltG9Pz0xARr8M5TpwIMwvSeWNfHap2+8pYEHRTj4jkAZ8APg1sBn6JkxheCElkxkSxivp2mjt7Pbv/wEhmFaRxrLnz3dNbTXQLto/gcWANkAp8SFU/rKqPqOqXgfA55cGYCLGurB4Iv/6BfrMKnJ+1jUYaG4KtEdyrqqeo6o9UtRJARJIAVHV5yKIzJkqtL6snNdHHpIwkr0MZVF5aIpOzkllr/QQxIdhE8G+DTFs7noEYE0vWl9UzPS/8+gf6iQhnzcpn7f46/H7rJ4h2wyYCESkSkWVAioicKiKnuY8LcJqJjDGjVNnUQUV9e9j2D/Q7Z3YeDe097Khs9joUE2IjjXt7KU4H8RTgroDpLcC3x7pSEckG7gUWAQrcpqpWwzAxYX1//0DYJ4J8AFbvrWFRSZbH0ZhQGjYRqO
r9wP0icq2qPjaO6/0l8JyqXiciiVjtwsSQdWX1ZCTFU5yV7HUowyrMTGZBcSav7K7hCxfM9jocE0LDJgIRuVlV/wxMF5GvD3xdVe8a5G3DEpFMYCVOTQNV7Qa6R7scYyLV+rJ6lk/PIS5M+wcCXTCvgHtWH6C5s4fMZLtkKFqN1FncX3dNBzIGeYzFTKAG+G8R2Swi94pIeNeRjRknta1d7Ktu5YwZeV6HEpT3z5tEn195fa+dPRTNRmoautv9/wfjvM7TgC+r6joR+SVwB/DdwJlE5HbgdoBp0+weqCY6bHD7B86YkcvuYy0eRzOy06Zlk5Ecz8u7q7l8cbHX4ZgQCfaCsp+KSKaIJIjIKhGpFZGbx7jOw8BhVV3n/v0oTmI4jqreo6rLVXV5QUHBGFdlTHhZV1ZPckIciyOk8zXeF8d5c/J5dU+NDTcRxYK9juASVW0GrsTZkc8FvjmWFarqMeCQiMxzJ10I7BjLsoyJNOvL6llWmkNifOQM5HvB3ElUNXexszL8azBmbILdGvt7ia4AHlLV+pNc75eBB0TkHWApcOdJLs+YsNfU0cPOY82cMT0y+gf6nT/PqZG/sqfa40hMqASbCP4uIruA5cAqESkAOse6UlXd4jb7LFHVq1W1YazLMiZSbCyvR9XpH4gkgaeRmugU7DDUdwBnActVtQdoA64KZWDGRJv1ZfUk+IRTp2V7HcqovX9eAZsONtDc2eN1KCYERtNQuQD4qIjcAlwHXBKakIyJTuvK6nnflGySE3xehzJqF7inkb5mp5FGpWDPGvoT8HPgXOB092GjjhoTpLauXrYdaWLFzMhqFurXfxrpK7utnyAajTTWUL/lwClq548ZMyabKxrp9WvEXEg20MDTSMN11FQzNsEmgm1AEVAZwliMiTgPrqs4YdpNK068AHJdWR1xAstKcyYirJC4YO4knt16jLte2ENxVorX4ZhxFGwiyAd2iMh6oKt/oqp+OCRRGRNl1h2oZ+HkLNKTgv3JhZ/+00j3HGuxRBBlgt0q/zWUQRgTzTq6+9h8qIHbzpnhdSgnpTAzmeKsZHZXtXL+vEleh2PGUbCnj74KlAMJ7vMNwFshjMuYqLGhvJ6ePuVsd3z/SDa3MIOK+jY6uvu8DsWMo2DPGvoMzphAd7uTSoAnQhWUMdHkjf11xMcJp0+P3P6BfvOLMvAr7Km24SaiSbDXEXwROAdoBlDVvYDVDY0Jwtr9tZw6LZvUxMjtH+g3NTeV1EQfu+z2lVEl2ETQ5d5ABgARice5xaQxZhhNHT1sPdLEWbMiv1kIIE6E+UWZ7K5qoc9uah81gk0Er4rIt3FuYn8x8Ffg76ELy5josL6sHr/CObMi8/qBwSwozqCzx095XZvXoZhxEmwiuAPnrmJbgc8CzwLfCVVQxkSLN/bXkpwQx9IIHF9oKLMnpeOLE2seiiJBNVqqql9EngCeUFUbgtCYIK3dX8fp03NJio+88YWGkhTvY1ZBGjuPtXDFYrvKOBoMWyMQx7+KSC2wC9gtIjUi8r2JCc+YyFXb2sWuYy2cFUXNQv0WFGdS39ZNdUvXyDObsDdS09A/4ZwtdLqq5qlqLrACOEdEvhby6IyJYGv2OpXn82ZH361W5xdlAljzUJQYKRHcAtyoqmX9E1T1AHCz+5oxZgiv7q4hPz2RhZMzvQ5l3GWlJDA5O5mdx+x6gmgwUiJIUNUTBiB3+wkSBpnfGAP4/crqvbWsnFNAXFx0tqEvKMrkUH07rV29XodiTtJIiaB7jK8ZE9O2HW2ivq373YHaotGC4kwU2H3Mmoci3UhnDb1PRAb7lgVIDkE8xkSFV3bXIALnRvD4QoMNsR2oOCuZrJQEdla2sKw0Mm+4YxzDJgJVjZ5z3oyZQK/uqWFxSRZ56UlehxIyIsL8ogzeqmigp89Pgm80d7414cS+OWPGWVN7D5srGjh/bvQ2C/VbUJxJT59yoKbV61DMSbBEYMw4e21fLX4lJhLBzPw0Eu
Pj2FlpZw9FMksExoyzl3ZVk5kcz9Kp0TOsxFDifXHMmZTOrmPN2C3NI5clAmPGUZ9feWlXFR+YP4n4GGkzX1CUSXNnL0cbO70OxYxRbGypxkyQivp2Gtp7uOiUQq9DmTBzizIQYKedRhqxLBEYM452VjaT4JOY6B/ol54Uz7TcVBtuIoJZIjBmnKgqOyqbOWtWPhnJsXXh/YLiTI42ddLYbteZRiJLBMaMk+qWLurburk4hpqF+s0vzgBgl409FJE8SwQi4hORzSLytFcxGDOedrpNIxcviL1EUJCeRF5aIrusnyAieVkj+Cqw08P1GzOudlY2U5KdQlFW7I2+IiIsKM5kf02bDUIXgTxJBCIyBfggcK8X6zdmvDW2d3OooYNTonDI6WDNL86gz6+s2WM3MYw0XtUIfgF8C/APNYOI3C4iG0VkY02NbVgmvG0/6jSJLJ6c5XEk3inNTSMlwccLO6u8DsWM0oQnAhG5EqhW1U3Dzaeq96jqclVdXlAQO6fimci09UgTxVnJ5GdE7yBzI/HFCfOKMnh5VzV9frvKOJIEdfP6cXYO8GERuQJnKOtMEfmzqt7sQSzGDGqwIZhvWjFt0Hkb27upqG8f9myhkYZ0jhbzizLYcqiRtyoaOH26DU0dKSa8RqCq/6yqU1R1OnAD8JIlARPJrFnoPXMLM0jwCS/usOahSGLXERhzkqxZ6D3JCT7OnJnHi9ZPEFE8TQSq+oqqXullDMacjKaOHirq21lUYrWBfhfOn8T+mja7R0EEsRqBMSdh6+FGwJqFAl3oXlC3ame1x5GYYFkiMOYkbDnUyJScFGsWCjA1N5X5RRm8YP0EEcMSgTFjtLeqhaNNnbxvSvTfgGa0LllYxMaD9dS2dnkdigmCJQJjxuiJLUeIE1gyxZqFBrp0YSF+hVXWaRwRLBEYMwZ+v/LklqPMKkiPuSGng3FKcSYl2Sn8Y7slgkhgicCYMdhU0cDhho6YuC/xWIgIly4s4rV9tTYIXQSwRGDMGDyx+QgpCb6YHmRuJJcsLKS718+ru22ssHBnicCYUeru9fPM1kouWVhIUrzP63DC1vLSHHLTEnl+xzGvQzEjsERgzCi9uqeGxvYerl5a4nUoYS3eF8eF8yet+cmzAAAU00lEQVTx0s5qunuHHGjYhAFLBMaM0hObj5CXlsi5c/K9DiXsXbaoiJauXl7fV+t1KGYYlgiMGYXmzh5e3FnFlUuKSfDZz2ck587JJyMpnme3VnodihmGF8NQGxOxntt2jK5eP1efas1CwUiK93HxKYU8v6OKO/v8xyXP0Qz1bULLDmmMGYUntxyhNC/VThsdhcsXF9PU0WPNQ2HMEoExQaps6uCN/XVctbQEEfE6nIhx3px80q15KKxZIjAmSI+/dQRVuPY0axYajeQEHxctmMTzO6ro6bOzh8KRJQJjgqCq/HXjIVbMyKU0L83rcCLOFYuLaWzvYe3+Oq9DMYOwRGBMEMrr2imva+cjy6d6HUpEWjm3gIykeP7+9lGvQzGDsERgTBA2HWwgPSmeyxcXeR1KREpO8HHZoiL+Z9sxOnv6vA7HDGCJwJgRdPX0se1IE1cuKSY10c64HqtrTi2htavX7mcchiwRGDOCrUea6O7zc701C52UFTPzKMpM5onNR7wOxQxgicCYEWw62EB+ehKnTbNrB06GL064aulkXtldQ31bt9fhmACWCIwZRm1LFwfr21lemmPXDoyDq5aW0OtXnnnHOo3DiSUCY4axqaKBOIGlVhsYFwuKM5hXmMFjb1nzUDixRGDMEPr8ylsVDcwtzCDTbkc5LkSE65ZNYcuhRo41d3odjnFZIjBmCPuqW2jp7GVZaY7XoUSVa5dNIdEXx4ayeq9DMS47F86YIWw82EBqoo95RRnA4KNlDibY+WLJwM9kfnEGmw81cOnCIhLj7XjUa/YNGDOI5s4edlY2c+rUbOLj7Gcy3s6Ynktnj59tR5q8DsVgicCYQW0or8evzr
nvZvzNyE8jPz2J9eXWPBQOLBEYM0BPn58NZfXMmZROfnqS1+FEJRHhjOk5VNS3U9nU4XU4MW/CE4GITBWRl0Vkp4hsF5GvTnQMxgznhR1VNHf2cqbVBkLqtNIcEn1xvLbXbljjNS9qBL3AN1R1AXAm8EUROcWDOIwZ1B/XlpOTmvBuJ7EJjdTEeJZNz+Htw400ttuVxl6a8ESgqpWq+pb7vAXYCdidPkxY2H2shTcP1LNiRh5xdiVxyJ07Kx+AN+w+BZ7ytI9ARKYDpwLrBnntdhHZKCIba2pqJjo0E6PuWX2AlAQfy+3agQmRk5bI4pIs1pfX09Ftw1N7xbNEICLpwGPAP6lq88DXVfUeVV2uqssLCgomPkATc440dvDkliPccMZUUpPsEpuJct6cArp7/awrs1qBVzxJBCKSgJMEHlDVx72IwZiB7l1zAIBPnzfT40hiy+TsFOYVZrB6b431FXjEi7OGBPg9sFNV75ro9RszmPq2bh5ef4irlpZQkp3idTgx55KFhXT1+Pmvl/d5HUpM8qJGcA7wceADIrLFfVzhQRzGvOu+N8rp6Onjc+dbbcALxVkpnDYth/vfOMih+navw4k5Xpw19JqqiqouUdWl7uPZiY7DmH61rV384bUyLl1YyJxCO2XUKxedUogI/Pvzu70OJebYlcUm5v3nqr109PTxzUvnex1KTMtKSeBT587giS1HeWO/XWQ2kSwRmJh2oKaVB9dVcMPpU5k9Kd3rcGLelz8wh+l5qXzr0Xdo7er1OpyYYefImZgycDjkB9YdJDE+jn+6aK5HEZlAKYk+fn79+7j+7rX86Nmd/PCaxYPON9hQ3zetmBbq8KKW1QhMzNpT1cL2o818duUsCjJscLlwsXx6Lp8+dwYPrKvg5d3VXocTEywRmJjU2dPH3zYfoSAjic/amUJh5xuXzGNBcSZfeXAze6pavA4n6lkiMDHpma2VNHf0cN1pU0hO8HkdjhkgOcHH729dTnKij0/+9wZqWrq8DimqWSIwMWdXZTObDjawcm4BU3NTvQ7HDGFydgq/v3U5dW1dfPK+9dS1WjIIFUsEJqZUNXfyyMZDFGclc+H8SV6HY0awZEo2v/7YaeytauX6367lcINdbBYKlghMzKhr7eKPa8tJ9MXx8TNLiffZ5h8JPjC/kD99agW1rV1c+5s32HTQbm853uyXYGJCS2cPn/njRlo6e7n5zFKyUxO9DsmMwhkzcvnL584iMT6Oj9z9Jqt2VtHnV6/DihqWCEzUq2/r5qbfreOdw018ZPlU6xeIUPOLMnn2K+dx1fsms2pXNb95ZR9ltW1ehxUV7IIyE9UO1bdz230bqKhv53e3LKeyqdPrkMxJyEhO4K6PLiUpwcezWyv53ZoDLCrJ4vKFRV6HFtEsEZio9dKuKr72yNv4/cr9t53BmTPzBr0i1USexSVZzCvMYM2+GlbvqWFXZTMdPX18/oJZpNlNhUbNmoZM1Ons6eNHz+7ktvs2UpKdwtNfOZczZ+Z5HZYZZ4nxcVw4v5CvXTSXhZMz+dXL+7jw31/lyS1HULX+g9GwRGCiyobyeq745RruXn2Am1ZM4/EvnE1pXprXYZkQyk5N5KOnT+Oxz59FfkYiX314Cx+9+012HD3hDrhmCJYITFQ40tjBVx/ezPW/XUt3n58/f2oFd16z2K4ajiHLSnN58ovncuc1i9lb3cKV/28N331iG00dPV6HFvasMc1EtNauXn7zyj7uXVMGwJfeP9vaiWOYL064acU0Pri4mP94cQ9/XFvO8zuO8W9XL+biUwq9Di9s2a8lBMZziNxglzXe8wUby2AmYnn3v1HOpoMNrNpVTVtXL0unZnPJKYVkpyby5JajQS3XhMZ4d8gHu7yB880tzOBz58/i8beO8Jk/bmTJlCyuXDKZ9EEOEsb7txJpLBGYiNLc2cMDb1bwXy/vo7Wrl9K8VG49q5QpOXZtgDnRlJxUvvD+Wby6p4ZXdtWwr7qVDy
2ZzJIpWYiI1+GFDUsEJiJUNXfy36+X88CbB2np6mX2pHTOn1vAzPw0+0GbYcXHOWcXLZqcxWNvHeaRjYd4+3AjVy0tISslwevwwoIlAhO2evv87Klq5dP3b+Tl3dWoKpcvLubz58/incNNXodnIkxhZjKfO38Wb+yr5YWdVfzixT1cvqiY06fneB2a5ywRmLDS3et37xzWxK5jLXT1+slPT+L2lTO54fSp754KaonAjEWcCOfOKWBBcSaPbz7CE1uO8M7hRs6dkx/TpxlbIjCe8qtyrKmTsto2ymrb2FvdQk+fkproY3FJFgsnZ/GdKxeQYCOFmnGUl57Ep86dwcbyBv5nWyUX37WaW84q5UsfmB2TAxJaIjDv8vuVzt4+Onv8dPT00dHdR2eP8+jo6WPH0WZ6/H56ev34FZIS4khJ8JGc4CM5Ic75P96Hqp7Qbq+qtHb1UtXcyb7qNvZWtfDM1krK69ro7PEDkJuWyGnTclg4OYsZ+Wn44pxlWBIwoRAnwhkzcplXlMGBmlZ+/3oZf9l4iE+cM4NPnj3d6/AmlCWCCNXQ1s3OY83sr25lf00bG8rrae7ooaPHT2+fn54+Pz98ZgddvX78ATtmVUUQ3H8A/ODv21GcZpnx8H+e3k5aUjwZSfGICF29ftq7e2nv7jtuvry0RBa5O/0Z+WkxeSRmvJeVksDPrn8ft507g7te2MN/rtrLvWsOsLgki+WluUzOTo76ExIsEUQAVaW6pYuy2jbK69r47av7qah/705NaYk+0pLiyUxOIDs1kQSfEO+LY3FJFknxccT1JwGUbUea3WU6UwAWFGeCQHK8j5REHykJzqP/iD8l0Tnqf2V3DQk+IdEX5+zg360t+J2aQ28fXT1+5hSm09LZS2tXL6rOmDCpiT4mZSQxKTOJmfnpzJ6Ubuf7m7CyoDiT392ynD1VLfz21f08teUo68rqKc5KZllpDkunZnsdYshYIghDrV29bD/SxDuHm/jb5iOU17W9ezSdmRzPObPzufGMaSwqyWTOpAwKM5N4aP2hE5Yz3hfJ7K1qPX7CEKfeRetFNyY2zC3M4K6PLGVhcRZvH25k48F6nn6nkue2HePNA/VctqiIixZMiqoarCWCcdLd66exo5um9h6ONnac8PrOymZ8cUKcCG7TN+3dfTR39lDT0sXhhg7217Sy9XAT+2pa6R88MSc1gflFGczIT2N6Xhq5aYl87MzSCSyZMbEpJdHHmTPzOHNmHkcbO9hc0cCOo028uLMKX5xw1sw8Ll1YyAXzJkX8zY4sEYxSV28f+6vb2FPVwu6qFvYca2FPdQuH6k/c+Qf61cv7Rlx2QUYSS0qy+OCSYt43JZtFJVm8sKNqvEI3xozR5OwUJmencOMZU9l6pInnth3juW3H+O6T24HtzCxIY+WcAs6fV8CZM/JISYyswQ49SQQichnwS8AH3KuqP/YijuH09vkpr2t3dvjHWthb7fxfXtf+7r1S4+OEWQXpLJ2aw7WnTSEvPYmslATWH6g7blkKnDs7nz5V/OqcnQOQmugjPSmeSZlJTM5OITXR8rIx4UxEWDIlmyVTsvnmpfPYX9PG6j01vLqnhofWV3DfG+XExwkLijNZOjWbU6dls3RqNjPC/Ar4Cd/ziIgP+C/gYuAwsEFEnlLVHeO9LlWlz6/09j/6/PT0KR1uk0xzZw8tnb00d/RQ2dTJkYYOjjR2cLihnaONnXT3+d2YYXpeGnML07licTFzCzOYV5TB9Lw0EuNPPLWxtbP3hGmXLy4e7+IZYzwkIsye5Jz4cNu5M+js6WN9WT1rD9SxpaKRx986zJ/ePAhASoKPKTkpTMtNZar7yE9PJDM5gcyUeDKSE8hIjifRF0e8L8454SPO+X8iEogXh6BnAPtU9QCAiDwMXAWMeyL45H0beGV3TdDzF2QkUZKdwsKSLC5dVMTcSc4Of1ZBesRV9YwxEys5wcfKuQWsnFsAQJ9f2VvdwpaKRvZWt3Kovp
2K+nbePFBH24BTqYfzwtdWMqcwI1RhAyATfUs3EbkOuExVP+3+/XFghap+acB8twO3u3/OA3YHvJwP1E5AuOHKym/lj9Xyx3LZYfTlL1XVgpFm8qJGMFg954RspKr3APcMugCRjaq6fLwDixRWfit/rJY/lssOoSu/F9fuHwamBvw9BbAri4wxxiNeJIINwBwRmSEiicANwFMexGGMMQYPmoZUtVdEvgT8A+f00T+o6vZRLmbQJqMYYuWPbbFc/lguO4So/BPeWWyMMSa82Pi+xhgT4ywRGGNMjIuIRCAiuSLygojsdf8/4SajIlIqIptEZIuIbBeRz3kRaygEWf6lIrLWLfs7IvJRL2INhWDK7873nIg0isjTEx3jeBORy0Rkt4jsE5E7Bnk9SUQecV9fJyLTJz7K0Ami/CtF5C0R6XWvTYoqQZT/6yKyw/2trxKRkxqJMiISAXAHsEpV5wCr3L8HqgTOVtWlwArgDhGZPIExhlIw5W8HblHVhcBlwC9EJFoGUA+m/AA/Az4+YVGFSMAwLJcDpwA3isgpA2b7FNCgqrOB/wB+MrFRhk6Q5a8APgE8OLHRhV6Q5d8MLFfVJcCjwE9PZp2RkgiuAu53n98PXD1wBlXtVtUu988kIqdswQim/HtUda/7/ChQDYx4RWGEGLH8AKq6CmiZqKBC6N1hWFS1G+gfhiVQ4GfyKHChhPOoZqMzYvlVtVxV3wHG57Z64SWY8r+sqv13p3oT53qsMYuUnWWhqlYCuP9PGmwmEZkqIu8Ah4CfuDvEaBBU+fuJyBlAIrB/AmKbCKMqfxQowdmG+x12pw06j6r2Ak1A3oREF3rBlD+ajbb8nwL+52RWGDbjHovIi0DRIC/9S7DLUNVDwBK3SegJEXlUVSNiQP/xKL+7nGLgT8CtqhoxR0vjVf4oEcwwLEEN1RKhorlswQi6/CJyM7AcOP9kVhg2iUBVLxrqNRGpEpFiVa10d3TVIyzrqIhsB87DqTaHvfEov4hkAs8A31HVN0MUakiM5/cfBYIZhqV/nsMiEg9kAfUTE17IxfowNEGVX0QuwjlQOj+gWXxMIqVp6CngVvf5rcCTA2cQkSkikuI+zwHO4fgRSyNZMOVPBP4G/FFV/zqBsU2EEcsfZYIZhiXwM7kOeEmj5+rQWB+GZsTyi8ipwN3Ah1X15A+MVDXsHzhtn6uAve7/ue705Th3OAPnRjfvAG+7/9/uddwTXP6bgR5gS8BjqdexT1T53b/XADVAB85R1aVex34SZb4C2IPTz/Mv7rT/4/7wAZKBvwL7gPXATK9jnuDyn+5+x21AHbDd65gnuPwvAlUBv/WnTmZ9NsSEMcbEuEhpGjLGGBMilgiMMSbGWSIwxpgYZ4nAGGNinCUCY4yJcZYITNQTkT53VNptIvL3sQ7GJyL3DjL4FyLyCRH51UnE1zrW9xozHiwRmFjQoapLVXURztW3XxzLQlT106q6Y3xDM8Z7lghMrFlLwABeIvJNEdngjuv+A3damog8IyJvu7WIj7rTXxGR5e7zT4rIHhF5Fecq9v7l3Rc4Pn7/0b6IpLvjxr8lIltFZOBooohIsYisDqi9nBeqD8GYQGEz1pAxoeaO834h8Hv370uAOTjD/grwlIisxBm++6iqftCdL2vAcoqBHwDLcEb9fBlnfPjhdALXqGqziOQDb4rIU3r8FZ03Af9Q1R+6saaeVIGNCZLVCEwsSBGRLThDEeQCL7jTL3Efm4G3gPk4iWErcJGI/EREzlPVpgHLWwG8oqo16owX/0gQMQhwpztM+os4tZLCAfNsAD4pIv8KLFbVaLi3gokAlghMLOhQ5851pTj3aejvIxDgR27/wVJVna2qv1fVPThH+1uBH4nI9wZZ5lBjs/Ti/q7cG8UkutM/hlPTWObGUoUzXtB7C1RdDawEjgB/EpFbxlZcY0bHEoGJGe6R/VeA/yUiCcA/gNtEJB1AREpEZJJ7P4t2Vf0z8HPgtAGLWgdcIC
J57nKuD3itHCeJgHNXqQT3eRZQrao9IvJ+nKR0HPe+s9Wq+juc5quB6zUmJKyPwMQUVd0sIm8DN6jqn0RkAbDWvctjK84orrOBn4mIH2dE188PWEal23yzFude2W8BPvfl3wFPish6nJFS29zpDwB/F5GNOKNF7hokvAuAb4pIjxuL1QjMhLDRR40xJsZZ05AxxsQ4SwTGGBPjLBEYY0yMs0RgjDExzhKBMcbEOEsExhgT4ywRGGNMjPv/AZbpkAF8L0qGAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.distplot((y_test- predictions),bins=50)\n",
"plt.xlabel('Residuals')\n",
"plt.ylabel('Density')\n",
"plt.title('Histogram of residuals (Shapiro W p-value = {0:0.3f})'.format(shapiro(y_test-predictions)[1]))\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The histogram shows that the residuals are negatively skewed, and the Shapiro-Wilk p-value in the title leads us to reject the hypothesis that they are normally distributed. This is further evidence that the model has room for improvement."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 11. Computing mean absolute error, mean squared error, root mean squared error, and R-squared to assess model performance"
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import metrics"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {
"code_folding": []
},
"outputs": [],
"source": [
"metrics_df = pd.DataFrame({'Metric':['MAE','MSE','RMSE','R-Squared'],\n",
" 'Value':[metrics.mean_absolute_error(y_test,predictions),\n",
" metrics.mean_squared_error(y_test,predictions),\n",
" np.sqrt(metrics.mean_squared_error(y_test,predictions)),\n",
" metrics.r2_score(y_test,predictions)]}).round(3)"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Metric</th>\n",
" <th>Value</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>MAE</td>\n",
" <td>0.041</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>MSE</td>\n",
" <td>0.003</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>RMSE</td>\n",
" <td>0.058</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>R-Squared</td>\n",
" <td>0.809</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Metric Value\n",
"0 MAE 0.041\n",
"1 MSE 0.003\n",
"2 RMSE 0.058\n",
"3 R-Squared 0.809"
]
},
"execution_count": 131,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"|Metric|Simple Linear Regression|Multiple Linear Regression|\n",
"|-------|-----------|------------|\n",
"|MAE|0.059|0.041|\n",
"|MSE|0.006|0.003|\n",
"|RMSE|0.080|0.058|\n",
"|R-Squared|0.629|0.809|"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"_draft": {
"nbviewer_url": "https://gist.github.com/f5aa427cb52d033cf1204d9687d087bd"
},
"gist": {
"data": {
"description": "GRE School.ipynb",
"public": true
},
"id": "f5aa427cb52d033cf1204d9687d087bd"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
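
The four metrics in section 11 are simple functions of the residuals from section 10, and it can help to see them written out once. The following is a minimal, dependency-free sketch rather than the notebook's own code: the function name `regression_metrics` and the sample values are made up for illustration, while the notebook itself relies on `sklearn.metrics`. The sketch also computes explained variance next to R-squared, because the two are easy to confuse: they agree only when the residuals average to zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, R-squared and explained variance.

    Hypothetical helper for illustration; both inputs are equal-length
    sequences of numbers.
    """
    n = len(y_true)
    # Residual = actual value minus predicted value.
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    r2 = 1 - ss_res / ss_tot

    # Explained variance centres the residuals first, so a constant bias
    # in the predictions does not reduce it the way it reduces R-squared.
    mean_res = sum(residuals) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    var_true = ss_tot / n
    explained_variance = 1 - var_res / var_true

    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": rmse,
        "R2": r2,
        "ExplainedVariance": explained_variance,
    }


# Made-up values for illustration only (not taken from the admissions data).
y_true = [0.90, 0.75, 0.60, 0.80]
y_pred = [0.85, 0.70, 0.55, 0.75]  # every prediction is 0.05 too low

scores = regression_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because every prediction in this toy example is off by the same amount, the centred residuals have essentially no spread: explained variance comes out at about 1.0 while R-squared is about 0.79. On real test data the gap is usually much smaller, but it is the reason a column labelled R-squared is best filled from an R-squared function.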