@jamm1985
Created November 28, 2021 12:28
Lab_5_linear_regression.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Lab_5_linear_regression.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyPuDhaW6Z0YtnJhZd/efLH5",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/jamm1985/5fb71795ac52f2d514844149a1d2c8b8/lab_5_linear_regression.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uXoHbRTOg1Vf"
},
"source": [
"Lab video: https://youtu.be/txDLkiesqpY\n",
"\n",
"TG: https://t.me/data_science_news\n",
"\n",
"---\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "np_jZLrjrTZ8",
"outputId": "b8aca14e-9dbc-4f81-d9fe-f4e6cccdd062"
},
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pylab as plt\n",
"plt.rcParams['figure.figsize'] = [12, 12]\n",
"\n",
"import statsmodels.api as sm\n",
"import seaborn as sns"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.7/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.\n",
" import pandas.util.testing as tm\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vjA2EdB9wlEd"
},
"source": [
"statsmodels documentation: https://www.statsmodels.org/stable/index.html"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Lu_LS8yBuYxG"
},
"source": [
"## Boston housing prices (dataset)\n",
"\n",
"Dataset: https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html\n",
"\n",
"Playbook: http://www.science.smith.edu/~jcrouser/SDS293/labs/lab2-py.html"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-UZz9rIWtDaU",
"outputId": "6a1cd023-ed28-4a6d-8184-7c5db02597fe"
},
"source": [
"!wget http://www.science.smith.edu/~jcrouser/SDS293/data/Boston.csv"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"--2021-11-10 07:16:17-- http://www.science.smith.edu/~jcrouser/SDS293/data/Boston.csv\n",
"Resolving www.science.smith.edu (www.science.smith.edu)... 131.229.72.9\n",
"Connecting to www.science.smith.edu (www.science.smith.edu)|131.229.72.9|:80... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 37658 (37K) [text/csv]\n",
"Saving to: ‘Boston.csv’\n",
"\n",
"Boston.csv 100%[===================>] 36.78K --.-KB/s in 0.07s \n",
"\n",
"2021-11-10 07:16:17 (518 KB/s) - ‘Boston.csv’ saved [37658/37658]\n",
"\n"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6jBJP5detISb",
"outputId": "05b0433f-6a38-4edd-d080-58cd62223020"
},
"source": [
"!head Boston.csv"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\"\",\"crim\",\"zn\",\"indus\",\"chas\",\"nox\",\"rm\",\"age\",\"dis\",\"rad\",\"tax\",\"ptratio\",\"black\",\"lstat\",\"medv\"\n",
"\"1\",0.00632,18,2.31,0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98,24\n",
"\"2\",0.02731,0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,396.9,9.14,21.6\n",
"\"3\",0.02729,0,7.07,0,0.469,7.185,61.1,4.9671,2,242,17.8,392.83,4.03,34.7\n",
"\"4\",0.03237,0,2.18,0,0.458,6.998,45.8,6.0622,3,222,18.7,394.63,2.94,33.4\n",
"\"5\",0.06905,0,2.18,0,0.458,7.147,54.2,6.0622,3,222,18.7,396.9,5.33,36.2\n",
"\"6\",0.02985,0,2.18,0,0.458,6.43,58.7,6.0622,3,222,18.7,394.12,5.21,28.7\n",
"\"7\",0.08829,12.5,7.87,0,0.524,6.012,66.6,5.5605,5,311,15.2,395.6,12.43,22.9\n",
"\"8\",0.14455,12.5,7.87,0,0.524,6.172,96.1,5.9505,5,311,15.2,396.9,19.15,27.1\n",
"\"9\",0.21124,12.5,7.87,0,0.524,5.631,100,6.0821,5,311,15.2,386.63,29.93,16.5\n"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "hi3Y8MgKuYHS",
"outputId": "c5b93d13-b817-4f3c-ab93-6e8126a4e544"
},
"source": [
"df = pd.read_csv('http://www.science.smith.edu/~jcrouser/SDS293/data/Boston.csv', index_col=0)\n",
"df.head()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>zn</th>\n",
" <th>indus</th>\n",
" <th>chas</th>\n",
" <th>nox</th>\n",
" <th>rm</th>\n",
" <th>age</th>\n",
" <th>dis</th>\n",
" <th>rad</th>\n",
" <th>tax</th>\n",
" <th>ptratio</th>\n",
" <th>black</th>\n",
" <th>lstat</th>\n",
" <th>medv</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.00632</td>\n",
" <td>18.0</td>\n",
" <td>2.31</td>\n",
" <td>0</td>\n",
" <td>0.538</td>\n",
" <td>6.575</td>\n",
" <td>65.2</td>\n",
" <td>4.0900</td>\n",
" <td>1</td>\n",
" <td>296</td>\n",
" <td>15.3</td>\n",
" <td>396.90</td>\n",
" <td>4.98</td>\n",
" <td>24.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.02731</td>\n",
" <td>0.0</td>\n",
" <td>7.07</td>\n",
" <td>0</td>\n",
" <td>0.469</td>\n",
" <td>6.421</td>\n",
" <td>78.9</td>\n",
" <td>4.9671</td>\n",
" <td>2</td>\n",
" <td>242</td>\n",
" <td>17.8</td>\n",
" <td>396.90</td>\n",
" <td>9.14</td>\n",
" <td>21.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0.02729</td>\n",
" <td>0.0</td>\n",
" <td>7.07</td>\n",
" <td>0</td>\n",
" <td>0.469</td>\n",
" <td>7.185</td>\n",
" <td>61.1</td>\n",
" <td>4.9671</td>\n",
" <td>2</td>\n",
" <td>242</td>\n",
" <td>17.8</td>\n",
" <td>392.83</td>\n",
" <td>4.03</td>\n",
" <td>34.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0.03237</td>\n",
" <td>0.0</td>\n",
" <td>2.18</td>\n",
" <td>0</td>\n",
" <td>0.458</td>\n",
" <td>6.998</td>\n",
" <td>45.8</td>\n",
" <td>6.0622</td>\n",
" <td>3</td>\n",
" <td>222</td>\n",
" <td>18.7</td>\n",
" <td>394.63</td>\n",
" <td>2.94</td>\n",
" <td>33.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>0.06905</td>\n",
" <td>0.0</td>\n",
" <td>2.18</td>\n",
" <td>0</td>\n",
" <td>0.458</td>\n",
" <td>7.147</td>\n",
" <td>54.2</td>\n",
" <td>6.0622</td>\n",
" <td>3</td>\n",
" <td>222</td>\n",
" <td>18.7</td>\n",
" <td>396.90</td>\n",
" <td>5.33</td>\n",
" <td>36.2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" crim zn indus chas nox ... tax ptratio black lstat medv\n",
"1 0.00632 18.0 2.31 0 0.538 ... 296 15.3 396.90 4.98 24.0\n",
"2 0.02731 0.0 7.07 0 0.469 ... 242 17.8 396.90 9.14 21.6\n",
"3 0.02729 0.0 7.07 0 0.469 ... 242 17.8 392.83 4.03 34.7\n",
"4 0.03237 0.0 2.18 0 0.458 ... 222 18.7 394.63 2.94 33.4\n",
"5 0.06905 0.0 2.18 0 0.458 ... 222 18.7 396.90 5.33 36.2\n",
"\n",
"[5 rows x 14 columns]"
]
},
"metadata": {},
"execution_count": 4
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "v0AQ1BA7rmT7",
"outputId": "31eee5f4-a413-4d3a-8e4a-928f72f12aaf"
},
"source": [
"df.describe()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>zn</th>\n",
" <th>indus</th>\n",
" <th>chas</th>\n",
" <th>nox</th>\n",
" <th>rm</th>\n",
" <th>age</th>\n",
" <th>dis</th>\n",
" <th>rad</th>\n",
" <th>tax</th>\n",
" <th>ptratio</th>\n",
" <th>black</th>\n",
" <th>lstat</th>\n",
" <th>medv</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" <td>506.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>3.613524</td>\n",
" <td>11.363636</td>\n",
" <td>11.136779</td>\n",
" <td>0.069170</td>\n",
" <td>0.554695</td>\n",
" <td>6.284634</td>\n",
" <td>68.574901</td>\n",
" <td>3.795043</td>\n",
" <td>9.549407</td>\n",
" <td>408.237154</td>\n",
" <td>18.455534</td>\n",
" <td>356.674032</td>\n",
" <td>12.653063</td>\n",
" <td>22.532806</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>8.601545</td>\n",
" <td>23.322453</td>\n",
" <td>6.860353</td>\n",
" <td>0.253994</td>\n",
" <td>0.115878</td>\n",
" <td>0.702617</td>\n",
" <td>28.148861</td>\n",
" <td>2.105710</td>\n",
" <td>8.707259</td>\n",
" <td>168.537116</td>\n",
" <td>2.164946</td>\n",
" <td>91.294864</td>\n",
" <td>7.141062</td>\n",
" <td>9.197104</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>0.006320</td>\n",
" <td>0.000000</td>\n",
" <td>0.460000</td>\n",
" <td>0.000000</td>\n",
" <td>0.385000</td>\n",
" <td>3.561000</td>\n",
" <td>2.900000</td>\n",
" <td>1.129600</td>\n",
" <td>1.000000</td>\n",
" <td>187.000000</td>\n",
" <td>12.600000</td>\n",
" <td>0.320000</td>\n",
" <td>1.730000</td>\n",
" <td>5.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>0.082045</td>\n",
" <td>0.000000</td>\n",
" <td>5.190000</td>\n",
" <td>0.000000</td>\n",
" <td>0.449000</td>\n",
" <td>5.885500</td>\n",
" <td>45.025000</td>\n",
" <td>2.100175</td>\n",
" <td>4.000000</td>\n",
" <td>279.000000</td>\n",
" <td>17.400000</td>\n",
" <td>375.377500</td>\n",
" <td>6.950000</td>\n",
" <td>17.025000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>0.256510</td>\n",
" <td>0.000000</td>\n",
" <td>9.690000</td>\n",
" <td>0.000000</td>\n",
" <td>0.538000</td>\n",
" <td>6.208500</td>\n",
" <td>77.500000</td>\n",
" <td>3.207450</td>\n",
" <td>5.000000</td>\n",
" <td>330.000000</td>\n",
" <td>19.050000</td>\n",
" <td>391.440000</td>\n",
" <td>11.360000</td>\n",
" <td>21.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>3.677082</td>\n",
" <td>12.500000</td>\n",
" <td>18.100000</td>\n",
" <td>0.000000</td>\n",
" <td>0.624000</td>\n",
" <td>6.623500</td>\n",
" <td>94.075000</td>\n",
" <td>5.188425</td>\n",
" <td>24.000000</td>\n",
" <td>666.000000</td>\n",
" <td>20.200000</td>\n",
" <td>396.225000</td>\n",
" <td>16.955000</td>\n",
" <td>25.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>88.976200</td>\n",
" <td>100.000000</td>\n",
" <td>27.740000</td>\n",
" <td>1.000000</td>\n",
" <td>0.871000</td>\n",
" <td>8.780000</td>\n",
" <td>100.000000</td>\n",
" <td>12.126500</td>\n",
" <td>24.000000</td>\n",
" <td>711.000000</td>\n",
" <td>22.000000</td>\n",
" <td>396.900000</td>\n",
" <td>37.970000</td>\n",
" <td>50.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" crim zn indus ... black lstat medv\n",
"count 506.000000 506.000000 506.000000 ... 506.000000 506.000000 506.000000\n",
"mean 3.613524 11.363636 11.136779 ... 356.674032 12.653063 22.532806\n",
"std 8.601545 23.322453 6.860353 ... 91.294864 7.141062 9.197104\n",
"min 0.006320 0.000000 0.460000 ... 0.320000 1.730000 5.000000\n",
"25% 0.082045 0.000000 5.190000 ... 375.377500 6.950000 17.025000\n",
"50% 0.256510 0.000000 9.690000 ... 391.440000 11.360000 21.200000\n",
"75% 3.677082 12.500000 18.100000 ... 396.225000 16.955000 25.000000\n",
"max 88.976200 100.000000 27.740000 ... 396.900000 37.970000 50.000000\n",
"\n",
"[8 rows x 14 columns]"
]
},
"metadata": {},
"execution_count": 5
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "dqha-KNXwpja",
"outputId": "17554e8f-c984-49ce-d2f2-af85d0b5f6bb"
},
"source": [
"plt.rcParams['figure.figsize'] = [12, 12]\n",
"df.hist()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7faf1fe18dd0>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1fe04d50>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f93d650>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f8edb50>],\n",
" [<matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f918b90>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f864590>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f899b10>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f84ff90>],\n",
" [<matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f85e050>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f8115d0>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f77ded0>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f740410>],\n",
" [<matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f6f5910>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f6aae10>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f66e350>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7faf1f624850>]],\n",
" dtype=object)"
]
},
"metadata": {},
"execution_count": 7
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x864 with 16 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "uJBgQkTO2sQM",
"outputId": "818a5f16-59fd-4c49-e886-c986aa302944"
},
"source": [
"df[['crim','medv']]"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>medv</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.00632</td>\n",
" <td>24.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.02731</td>\n",
" <td>21.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0.02729</td>\n",
" <td>34.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0.03237</td>\n",
" <td>33.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>0.06905</td>\n",
" <td>36.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>502</th>\n",
" <td>0.06263</td>\n",
" <td>22.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>503</th>\n",
" <td>0.04527</td>\n",
" <td>20.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>504</th>\n",
" <td>0.06076</td>\n",
" <td>23.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>505</th>\n",
" <td>0.10959</td>\n",
" <td>22.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>506</th>\n",
" <td>0.04741</td>\n",
" <td>11.9</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>506 rows × 2 columns</p>\n",
"</div>"
],
"text/plain": [
" crim medv\n",
"1 0.00632 24.0\n",
"2 0.02731 21.6\n",
"3 0.02729 34.7\n",
"4 0.03237 33.4\n",
"5 0.06905 36.2\n",
".. ... ...\n",
"502 0.06263 22.4\n",
"503 0.04527 20.6\n",
"504 0.06076 23.9\n",
"505 0.10959 22.0\n",
"506 0.04741 11.9\n",
"\n",
"[506 rows x 2 columns]"
]
},
"metadata": {},
"execution_count": 8
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "emIJC6FO3AlX"
},
"source": [
"**Sample mean** (an estimate of $E[X]$):\n",
"\n",
"$$\\bar{X}_n=\\frac{1}{n}\\sum_{i=1}^n X_i$$\n",
"\n",
"**Sample variance** (an unbiased estimate of $Var[X]=E[(X-E[X])^2]$):\n",
"\n",
"$$s^2_{n-1}=\\frac{1}{n-1}\\sum_{i=1}^n (X_i-\\bar{X}_n)^2$$"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "k0HII_LjLUdZ",
"outputId": "97e6952a-fb3a-49ef-81d2-166854e13a5e"
},
"source": [
"N=df['crim'].size\n",
"mean_crim = df['crim'].sum()/N\n",
"mean_crim"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"3.613523557312254"
]
},
"metadata": {},
"execution_count": 9
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "62HBmzRVLfVL",
"outputId": "b27a52c2-566f-475d-f453-e782f3701eb9"
},
"source": [
"df['crim'].mean()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"3.6135235573122535"
]
},
"metadata": {},
"execution_count": 10
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "HkTP1ZutL6_e",
"outputId": "22a46cdc-471a-4bc0-9f22-272571e7c3db"
},
"source": [
"var_crim = 1/(N-1)*((df['crim'] - mean_crim)**2).sum()\n",
"var_crim"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"73.98657819906931"
]
},
"metadata": {},
"execution_count": 14
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "1vR-jq__MSX-",
"outputId": "50164c41-0d7e-4ac5-e635-9c27178356d8"
},
"source": [
"df['crim'].var()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"73.98657819906929"
]
},
"metadata": {},
"execution_count": 12
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "k4nF94FCMpEy"
},
"source": [
"**Sample correlation:**\n",
"\n",
"$$\\rho = \\frac{Cov(X,Y)}{\\sqrt{Var(X)Var(Y)}} = \\frac{E[(X-E[X])(Y-E[Y])]}{\\sigma_X \\sigma_Y}, \\qquad \\hat{\\rho} = \\frac{\\frac{1}{n-1}\\sum_{i=1}^n (X_i-\\bar{X}_n)(Y_i-\\bar{Y}_n)}{s_X s_Y}$$"
]
},
{
"cell_type": "code",
"metadata": {
"id": "zfZ0N2uvP5dB"
},
"source": [
"mean_medv=df['medv'].mean()\n",
"var_medv=df['medv'].var()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4TrHYn9wMopP",
"outputId": "1f3db43d-b4e6-4548-e74b-c312a3e9b7d2"
},
"source": [
"1/(N-1)*((df['crim'] - mean_crim)*(df['medv'] - mean_medv)).sum()*1/(np.sqrt(var_crim*var_medv))"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"-0.3883046085868115"
]
},
"metadata": {},
"execution_count": 16
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 111
},
"id": "5POludaGQxtB",
"outputId": "06bfc463-d785-4482-ab57-4afd29fab450"
},
"source": [
"df[['crim','medv']].corr()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>medv</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>crim</th>\n",
" <td>1.000000</td>\n",
" <td>-0.388305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>medv</th>\n",
" <td>-0.388305</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" crim medv\n",
"crim 1.000000 -0.388305\n",
"medv -0.388305 1.000000"
]
},
"metadata": {},
"execution_count": 17
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 483
},
"id": "sXKyQuSpw4e9",
"outputId": "27d25c77-c560-4309-d153-a6b063271b13"
},
"source": [
"df.corr()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>zn</th>\n",
" <th>indus</th>\n",
" <th>chas</th>\n",
" <th>nox</th>\n",
" <th>rm</th>\n",
" <th>age</th>\n",
" <th>dis</th>\n",
" <th>rad</th>\n",
" <th>tax</th>\n",
" <th>ptratio</th>\n",
" <th>black</th>\n",
" <th>lstat</th>\n",
" <th>medv</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>crim</th>\n",
" <td>1.000000</td>\n",
" <td>-0.200469</td>\n",
" <td>0.406583</td>\n",
" <td>-0.055892</td>\n",
" <td>0.420972</td>\n",
" <td>-0.219247</td>\n",
" <td>0.352734</td>\n",
" <td>-0.379670</td>\n",
" <td>0.625505</td>\n",
" <td>0.582764</td>\n",
" <td>0.289946</td>\n",
" <td>-0.385064</td>\n",
" <td>0.455621</td>\n",
" <td>-0.388305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>zn</th>\n",
" <td>-0.200469</td>\n",
" <td>1.000000</td>\n",
" <td>-0.533828</td>\n",
" <td>-0.042697</td>\n",
" <td>-0.516604</td>\n",
" <td>0.311991</td>\n",
" <td>-0.569537</td>\n",
" <td>0.664408</td>\n",
" <td>-0.311948</td>\n",
" <td>-0.314563</td>\n",
" <td>-0.391679</td>\n",
" <td>0.175520</td>\n",
" <td>-0.412995</td>\n",
" <td>0.360445</td>\n",
" </tr>\n",
" <tr>\n",
" <th>indus</th>\n",
" <td>0.406583</td>\n",
" <td>-0.533828</td>\n",
" <td>1.000000</td>\n",
" <td>0.062938</td>\n",
" <td>0.763651</td>\n",
" <td>-0.391676</td>\n",
" <td>0.644779</td>\n",
" <td>-0.708027</td>\n",
" <td>0.595129</td>\n",
" <td>0.720760</td>\n",
" <td>0.383248</td>\n",
" <td>-0.356977</td>\n",
" <td>0.603800</td>\n",
" <td>-0.483725</td>\n",
" </tr>\n",
" <tr>\n",
" <th>chas</th>\n",
" <td>-0.055892</td>\n",
" <td>-0.042697</td>\n",
" <td>0.062938</td>\n",
" <td>1.000000</td>\n",
" <td>0.091203</td>\n",
" <td>0.091251</td>\n",
" <td>0.086518</td>\n",
" <td>-0.099176</td>\n",
" <td>-0.007368</td>\n",
" <td>-0.035587</td>\n",
" <td>-0.121515</td>\n",
" <td>0.048788</td>\n",
" <td>-0.053929</td>\n",
" <td>0.175260</td>\n",
" </tr>\n",
" <tr>\n",
" <th>nox</th>\n",
" <td>0.420972</td>\n",
" <td>-0.516604</td>\n",
" <td>0.763651</td>\n",
" <td>0.091203</td>\n",
" <td>1.000000</td>\n",
" <td>-0.302188</td>\n",
" <td>0.731470</td>\n",
" <td>-0.769230</td>\n",
" <td>0.611441</td>\n",
" <td>0.668023</td>\n",
" <td>0.188933</td>\n",
" <td>-0.380051</td>\n",
" <td>0.590879</td>\n",
" <td>-0.427321</td>\n",
" </tr>\n",
" <tr>\n",
" <th>rm</th>\n",
" <td>-0.219247</td>\n",
" <td>0.311991</td>\n",
" <td>-0.391676</td>\n",
" <td>0.091251</td>\n",
" <td>-0.302188</td>\n",
" <td>1.000000</td>\n",
" <td>-0.240265</td>\n",
" <td>0.205246</td>\n",
" <td>-0.209847</td>\n",
" <td>-0.292048</td>\n",
" <td>-0.355501</td>\n",
" <td>0.128069</td>\n",
" <td>-0.613808</td>\n",
" <td>0.695360</td>\n",
" </tr>\n",
" <tr>\n",
" <th>age</th>\n",
" <td>0.352734</td>\n",
" <td>-0.569537</td>\n",
" <td>0.644779</td>\n",
" <td>0.086518</td>\n",
" <td>0.731470</td>\n",
" <td>-0.240265</td>\n",
" <td>1.000000</td>\n",
" <td>-0.747881</td>\n",
" <td>0.456022</td>\n",
" <td>0.506456</td>\n",
" <td>0.261515</td>\n",
" <td>-0.273534</td>\n",
" <td>0.602339</td>\n",
" <td>-0.376955</td>\n",
" </tr>\n",
" <tr>\n",
" <th>dis</th>\n",
" <td>-0.379670</td>\n",
" <td>0.664408</td>\n",
" <td>-0.708027</td>\n",
" <td>-0.099176</td>\n",
" <td>-0.769230</td>\n",
" <td>0.205246</td>\n",
" <td>-0.747881</td>\n",
" <td>1.000000</td>\n",
" <td>-0.494588</td>\n",
" <td>-0.534432</td>\n",
" <td>-0.232471</td>\n",
" <td>0.291512</td>\n",
" <td>-0.496996</td>\n",
" <td>0.249929</td>\n",
" </tr>\n",
" <tr>\n",
" <th>rad</th>\n",
" <td>0.625505</td>\n",
" <td>-0.311948</td>\n",
" <td>0.595129</td>\n",
" <td>-0.007368</td>\n",
" <td>0.611441</td>\n",
" <td>-0.209847</td>\n",
" <td>0.456022</td>\n",
" <td>-0.494588</td>\n",
" <td>1.000000</td>\n",
" <td>0.910228</td>\n",
" <td>0.464741</td>\n",
" <td>-0.444413</td>\n",
" <td>0.488676</td>\n",
" <td>-0.381626</td>\n",
" </tr>\n",
" <tr>\n",
" <th>tax</th>\n",
" <td>0.582764</td>\n",
" <td>-0.314563</td>\n",
" <td>0.720760</td>\n",
" <td>-0.035587</td>\n",
" <td>0.668023</td>\n",
" <td>-0.292048</td>\n",
" <td>0.506456</td>\n",
" <td>-0.534432</td>\n",
" <td>0.910228</td>\n",
" <td>1.000000</td>\n",
" <td>0.460853</td>\n",
" <td>-0.441808</td>\n",
" <td>0.543993</td>\n",
" <td>-0.468536</td>\n",
" </tr>\n",
" <tr>\n",
" <th>ptratio</th>\n",
" <td>0.289946</td>\n",
" <td>-0.391679</td>\n",
" <td>0.383248</td>\n",
" <td>-0.121515</td>\n",
" <td>0.188933</td>\n",
" <td>-0.355501</td>\n",
" <td>0.261515</td>\n",
" <td>-0.232471</td>\n",
" <td>0.464741</td>\n",
" <td>0.460853</td>\n",
" <td>1.000000</td>\n",
" <td>-0.177383</td>\n",
" <td>0.374044</td>\n",
" <td>-0.507787</td>\n",
" </tr>\n",
" <tr>\n",
" <th>black</th>\n",
" <td>-0.385064</td>\n",
" <td>0.175520</td>\n",
" <td>-0.356977</td>\n",
" <td>0.048788</td>\n",
" <td>-0.380051</td>\n",
" <td>0.128069</td>\n",
" <td>-0.273534</td>\n",
" <td>0.291512</td>\n",
" <td>-0.444413</td>\n",
" <td>-0.441808</td>\n",
" <td>-0.177383</td>\n",
" <td>1.000000</td>\n",
" <td>-0.366087</td>\n",
" <td>0.333461</td>\n",
" </tr>\n",
" <tr>\n",
" <th>lstat</th>\n",
" <td>0.455621</td>\n",
" <td>-0.412995</td>\n",
" <td>0.603800</td>\n",
" <td>-0.053929</td>\n",
" <td>0.590879</td>\n",
" <td>-0.613808</td>\n",
" <td>0.602339</td>\n",
" <td>-0.496996</td>\n",
" <td>0.488676</td>\n",
" <td>0.543993</td>\n",
" <td>0.374044</td>\n",
" <td>-0.366087</td>\n",
" <td>1.000000</td>\n",
" <td>-0.737663</td>\n",
" </tr>\n",
" <tr>\n",
" <th>medv</th>\n",
" <td>-0.388305</td>\n",
" <td>0.360445</td>\n",
" <td>-0.483725</td>\n",
" <td>0.175260</td>\n",
" <td>-0.427321</td>\n",
" <td>0.695360</td>\n",
" <td>-0.376955</td>\n",
" <td>0.249929</td>\n",
" <td>-0.381626</td>\n",
" <td>-0.468536</td>\n",
" <td>-0.507787</td>\n",
" <td>0.333461</td>\n",
" <td>-0.737663</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" crim zn indus ... black lstat medv\n",
"crim 1.000000 -0.200469 0.406583 ... -0.385064 0.455621 -0.388305\n",
"zn -0.200469 1.000000 -0.533828 ... 0.175520 -0.412995 0.360445\n",
"indus 0.406583 -0.533828 1.000000 ... -0.356977 0.603800 -0.483725\n",
"chas -0.055892 -0.042697 0.062938 ... 0.048788 -0.053929 0.175260\n",
"nox 0.420972 -0.516604 0.763651 ... -0.380051 0.590879 -0.427321\n",
"rm -0.219247 0.311991 -0.391676 ... 0.128069 -0.613808 0.695360\n",
"age 0.352734 -0.569537 0.644779 ... -0.273534 0.602339 -0.376955\n",
"dis -0.379670 0.664408 -0.708027 ... 0.291512 -0.496996 0.249929\n",
"rad 0.625505 -0.311948 0.595129 ... -0.444413 0.488676 -0.381626\n",
"tax 0.582764 -0.314563 0.720760 ... -0.441808 0.543993 -0.468536\n",
"ptratio 0.289946 -0.391679 0.383248 ... -0.177383 0.374044 -0.507787\n",
"black -0.385064 0.175520 -0.356977 ... 1.000000 -0.366087 0.333461\n",
"lstat 0.455621 -0.412995 0.603800 ... -0.366087 1.000000 -0.737663\n",
"medv -0.388305 0.360445 -0.483725 ... 0.333461 -0.737663 1.000000\n",
"\n",
"[14 rows x 14 columns]"
]
},
"metadata": {},
"execution_count": 18
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 721
},
"id": "djONUoR01h4i",
"outputId": "be437d99-6978-47fa-baff-fb1f7f08723c"
},
"source": [
"# Compute the correlation matrix once and reuse it for the tick labels\n",
"corr = df.corr()\n",
"sns.heatmap(corr,\n",
"            xticklabels=corr.columns,\n",
"            yticklabels=corr.columns)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7faf1f29b590>"
]
},
"metadata": {},
"execution_count": 19
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x864 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1pCkcbbdfdVQ"
},
"source": [
"## Simple Linear Regression\n",
"\n",
"Regression function:\n",
"\n",
"$$r(x)=E[Y|X=x]=\\int y f_{Y|X}(y|x)dy$$\n",
"\n",
"Simple linear regression assumes that $X_i$ is one-dimensional and that the regression function is linear:\n",
"\n",
"$$r(x)=\\beta_0+\\beta_1 x$$\n",
"\n",
"The simple linear regression model then takes the form:\n",
"\n",
"$$Y_i=\\beta_0 + \\beta_1 X_i + \\epsilon_i$$\n",
"\n",
"where $Y_i$ is the \"dependent variable\" _(response variable, outcome variable, regressand, explained variable)_, $X_i$ is the \"independent variable\" _(predictor variable, regressor, covariate, explanatory variable, manipulated variable, exposure variable, exogenous variable)_, and $\\epsilon_i$ is the \"error term\" _(disturbance term, noise)_, an unknown random variable with expectation $E[\\epsilon_i|X_i] = 0$ and constant variance $Var(\\epsilon_i|X_i)=\\sigma^2$. In addition, the errors are pairwise uncorrelated: $Cov(\\epsilon_i,\\epsilon_j)=0$ for all $i \\neq j$.\n",
"\n",
"Let $\\hat{\\beta_0}$ and $\\hat{\\beta_1}$ be estimates of the unknown regression parameters $\\beta_0$ and $\\beta_1$. The fitted line is then:\n",
"\n",
"$$\\hat{r}(x)=\\hat{\\beta_0} + \\hat{\\beta_1}x$$\n",
"\n",
"The fitted (predicted) values are $\\hat{Y_i}=\\hat{r}(X_i)$, and the residuals are:\n",
"\n",
"$$\\hat{\\epsilon_i}=Y_i-\\hat{Y_i}=Y_i-(\\hat{\\beta_0} + \\hat{\\beta_1}X_i)$$\n",
"\n",
"The residual sum of squares (**RSS**) is defined as:\n",
"\n",
"$$RSS= \\sum_{i=1}^n \\hat{\\epsilon_i}^2$$\n",
"\n",
"The least squares estimates are the values of $\\hat{\\beta_0}$ and $\\hat{\\beta_1}$ that minimize $RSS= \\sum_{i=1}^n \\hat{\\epsilon_i}^2$. They have an exact closed-form solution:\n",
"\n",
"$$\\hat{\\beta_1}=\\frac{\\sum_{i=1}^n (X_i-\\bar{X_n})(Y_i-\\bar{Y_n})}{\\sum_{i=1}^n(X_i-\\bar{X_n})^2}=\\frac{Cov(X,Y)}{Var(X)}=\\rho_{X,Y} \\frac{s_Y}{s_X}$$\n",
"\n",
"$$\\hat{\\beta_0}=\\bar{Y_n}-\\hat{\\beta_1} \\bar{X_n}$$\n",
"\n",
"Here $Cov(X,Y)$, $Var(X)$, $\\rho_{X,Y}$, $s_X$ and $s_Y$ denote the sample covariance, variance, correlation and standard deviations.\n",
"\n",
"An unbiased estimate of the error variance $\\sigma^2$ is:\n",
"\n",
"$$\\hat{\\sigma}^2=\\frac{1}{n-2} \\sum_{i=1}^n \\hat{\\epsilon_i}^2$$"
]
},
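{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the closed-form least squares estimates, assuming synthetic data rather than the Boston data set. The names `x_demo`, `y_demo`, `n_demo` and the true coefficients 2 and 3 are illustrative assumptions, not part of the lab. The cell computes the slope, intercept, RSS and the unbiased variance estimate, then cross-checks the coefficients against `np.polyfit`."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical synthetic data (NOT the Boston data): y = 2 + 3x + Gaussian noise\n",
"rng = np.random.default_rng(0)\n",
"x_demo = rng.uniform(0, 10, size=200)\n",
"y_demo = 2.0 + 3.0 * x_demo + rng.normal(0, 1, size=200)\n",
"n_demo = x_demo.size\n",
"\n",
"# Closed-form least squares estimates of slope and intercept\n",
"dx = x_demo - x_demo.mean()\n",
"dy = y_demo - y_demo.mean()\n",
"beta1_hat = (dx * dy).sum() / (dx ** 2).sum()\n",
"beta0_hat = y_demo.mean() - beta1_hat * x_demo.mean()\n",
"\n",
"# Residuals, RSS and the unbiased estimate of the error variance\n",
"residuals = y_demo - (beta0_hat + beta1_hat * x_demo)\n",
"rss = (residuals ** 2).sum()\n",
"sigma2_hat = rss / (n_demo - 2)\n",
"\n",
"# Cross-check against NumPy's degree-1 least squares fit (slope comes first)\n",
"slope, intercept = np.polyfit(x_demo, y_demo, 1)\n",
"assert np.allclose([beta1_hat, beta0_hat], [slope, intercept])\n",
"print(beta0_hat, beta1_hat, sigma2_hat)"
],
"execution_count": null,
"outputs": []
},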
{
"cell_type": "markdown",
"metadata": {
"id": "ZBdkz9JJw0Oz"
},
"source": [
"Simple linear regression, using the example $$medv_i=\\beta_0+\\beta_1 crim_i + \\epsilon_i$$"
]
},
{
"cell_type": "code",
"metadata": {
"id": "qXjhE6zk1Kfn"
},
"source": [
"plt.rcParams['figure.figsize'] = [12, 8]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 514
},
"id": "B8mQ7JVbxL7J",
"outputId": "a624d24f-1f2e-422f-b3d9-5a32ac7ec0cb"
},
"source": [
"# Scatter plot of median home value against per-capita crime rate\n",
"df[['crim','medv']].plot.scatter(x='crim',\n",
"                                 y='medv',\n",
"                                 c='DarkBlue')"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7faf1ee69350>"
]
},
"metadata": {},
"execution_count": 27
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x576 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 444
},
"id": "OL7XDbwE18Nt",
"outputId": "19f60228-a116-426c-a60b-e80258f17c7a"
},
"source": [
"lm = sm.OLS.from_formula('medv ~ crim', df)\n",
"result = lm.fit()\n",
"result.summary()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OLS Regression Results</caption>\n",
"<tr>\n",
" <th>Dep. Variable:</th> <td>medv</td> <th> R-squared: </th> <td> 0.151</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Model:</th> <td>OLS</td> <th> Adj. R-squared: </th> <td> 0.149</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Method:</th> <td>Least Squares</td> <th> F-statistic: </th> <td> 89.49</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Date:</th> <td>Wed, 10 Nov 2021</td> <th> Prob (F-statistic):</th> <td>1.17e-19</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Time:</th> <td>07:55:01</td> <th> Log-Likelihood: </th> <td> -1798.9</td>\n",
"</tr>\n",
"<tr>\n",
" <th>No. Observations:</th> <td> 506</td> <th> AIC: </th> <td> 3602.</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Df Residuals:</th> <td> 504</td> <th> BIC: </th> <td> 3610.</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Df Model:</th> <td> 1</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Covariance Type:</th> <td>nonrobust</td> <th> </th> <td> </td> \n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P>|t|</th> <th>[0.025</th> <th>0.975]</th> \n",
"</tr>\n",
"<tr>\n",
" <th>Intercept</th> <td> 24.0331</td> <td> 0.409</td> <td> 58.740</td> <td> 0.000</td> <td> 23.229</td> <td> 24.837</td>\n",
"</tr>\n",
"<tr>\n",
" <th>crim</th> <td> -0.4152</td> <td> 0.044</td> <td> -9.460</td> <td> 0.000</td> <td> -0.501</td> <td> -0.329</td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <th>Omnibus:</th> <td>139.832</td> <th> Durbin-Watson: </th> <td> 0.713</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Prob(Omnibus):</th> <td> 0.000</td> <th> Jarque-Bera (JB): </th> <td> 295.404</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Skew:</th> <td> 1.490</td> <th> Prob(JB): </th> <td>7.14e-65</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Kurtosis:</th> <td> 5.264</td> <th> Cond. No. </th> <td> 10.1</td>\n",
"</tr>\n",
"</table><br/><br/>Warnings:<br/>[1] Standard Errors assume that the covariance matrix of the errors is correctly specified."
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",
"\"\"\"\n",
" OLS Regression Results \n",
"==============================================================================\n",
"Dep. Variable: medv R-squared: 0.151\n",
"Model: OLS Adj. R-squared: 0.149\n",
"Method: Least Squares F-statistic: 89.49\n",
"Date: Wed, 10 Nov 2021 Prob (F-statistic): 1.17e-19\n",
"Time: 07:55:01 Log-Likelihood: -1798.9\n",
"No. Observations: 506 AIC: 3602.\n",
"Df Residuals: 504 BIC: 3610.\n",
"Df Model: 1 \n",
"Covariance Type: nonrobust \n",
"==============================================================================\n",
" coef std err t P>|t| [0.025 0.975]\n",
"------------------------------------------------------------------------------\n",
"Intercept 24.0331 0.409 58.740 0.000 23.229 24.837\n",
"crim -0.4152 0.044 -9.460 0.000 -0.501 -0.329\n",
"==============================================================================\n",
"Omnibus: 139.832 Durbin-Watson: 0.713\n",
"Prob(Omnibus): 0.000 Jarque-Bera (JB): 295.404\n",
"Skew: 1.490 Prob(JB): 7.14e-65\n",
"Kurtosis: 5.264 Cond. No. 10.1\n",
"==============================================================================\n",
"\n",
"Warnings:\n",
"[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n",
"\"\"\""
]
},
"metadata": {},
"execution_count": 28
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OwQQWlh_7iGO"
},
"source": [
"$$R^2 = 1 - \\frac{\\Sigma_{i=1}^n (Y_i - \\hat{Y}_i)^2}{\\Sigma_{i=1}^n (Y_i - \\bar{Y}_i)^2}=1 - \\frac{RSS}{TSS}$$\n",
"\n"
]
},
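{
"cell_type": "markdown",
"metadata": {},
"source": [
"The definition of $R^2$ can be reproduced by hand. The next cell is a rough sketch on synthetic data (the `*_demo` names and the simulated model are assumptions, not part of the original lab): it computes RSS and TSS directly and compares $1 - RSS/TSS$ with the squared sample correlation, which should coincide for a simple regression with an intercept."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustrative sketch on synthetic data; the *_demo names are hypothetical\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(1)\n",
"x_demo = rng.normal(size=300)\n",
"y_demo = 1.0 - 2.0 * x_demo + rng.normal(scale=1.5, size=300)\n",
"\n",
"# Fit a straight line by least squares and compute the fitted values\n",
"slope_demo, intercept_demo = np.polyfit(x_demo, y_demo, 1)\n",
"y_hat_demo = intercept_demo + slope_demo * x_demo\n",
"\n",
"rss_demo = np.sum((y_demo - y_hat_demo) ** 2)     # residual sum of squares\n",
"tss_demo = np.sum((y_demo - y_demo.mean()) ** 2)  # total sum of squares\n",
"r2_demo = 1 - rss_demo / tss_demo\n",
"\n",
"# Squared sample correlation between x and y, for comparison\n",
"r_demo = np.corrcoef(x_demo, y_demo)[0, 1]\n",
"print(r2_demo, r_demo ** 2)"
],
"execution_count": null,
"outputs": []
},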
{
"cell_type": "markdown",
"metadata": {
"id": "6X0L3z4JArbh"
},
"source": [
"$$\\hat{\\beta_1}=\\frac{\\Sigma_{i=1}^n (X_i-\\bar{X_n})(Y_i-\\bar{Y_i})}{\\Sigma_{i=1}^n(X_i-\\bar{X_n})^2}=\\frac{Cov(X,Y)}{Var(X)}=\\rho_{X,Y} \\frac{\\sigma_Y}{\\sigma_X}$$"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "D6eeDYBYyCAK",
"outputId": "7173dfd0-8321-49e3-8b7a-06b1763a2e9b"
},
"source": [
"df[['crim','medv']].corr().iloc[0,1]*(df['medv'].std()/df['crim'].std())"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"-0.4151902779150908"
]
},
"metadata": {},
"execution_count": 29
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 514
},
"id": "2ZpHD3rRAVK3",
"outputId": "9861c129-6244-425e-88d5-3f67e7564e12"
},
"source": [
"sns.regplot(x='crim',y='medv',data=df)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7faf1ec1c2d0>"
]
},
"metadata": {},
"execution_count": 30
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x576 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 514
},
"id": "SjlxFxXQBO3Y",
"outputId": "68b98776-cd31-47e3-99e9-69d0e00ec1f3"
},
"source": [
"df[['lstat','medv']].plot.scatter(x='lstat',\n",
"... y='medv',\n",
"... c='DarkBlue')"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7faf1ec0dd90>"
]
},
"metadata": {},
"execution_count": 31
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x576 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 444
},
"id": "7lVRNzPYBdsl",
"outputId": "9bb2907f-df6d-43ff-de2c-32bbe00252b4"
},
"source": [
"lm = sm.OLS.from_formula('medv ~ lstat', df)\n",
"result = lm.fit()\n",
"result.summary()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OLS Regression Results</caption>\n",
"<tr>\n",
" <th>Dep. Variable:</th> <td>medv</td> <th> R-squared: </th> <td> 0.544</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Model:</th> <td>OLS</td> <th> Adj. R-squared: </th> <td> 0.543</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Method:</th> <td>Least Squares</td> <th> F-statistic: </th> <td> 601.6</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Date:</th> <td>Wed, 10 Nov 2021</td> <th> Prob (F-statistic):</th> <td>5.08e-88</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Time:</th> <td>08:01:42</td> <th> Log-Likelihood: </th> <td> -1641.5</td>\n",
"</tr>\n",
"<tr>\n",
" <th>No. Observations:</th> <td> 506</td> <th> AIC: </th> <td> 3287.</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Df Residuals:</th> <td> 504</td> <th> BIC: </th> <td> 3295.</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Df Model:</th> <td> 1</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Covariance Type:</th> <td>nonrobust</td> <th> </th> <td> </td> \n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P>|t|</th> <th>[0.025</th> <th>0.975]</th> \n",
"</tr>\n",
"<tr>\n",
" <th>Intercept</th> <td> 34.5538</td> <td> 0.563</td> <td> 61.415</td> <td> 0.000</td> <td> 33.448</td> <td> 35.659</td>\n",
"</tr>\n",
"<tr>\n",
" <th>lstat</th> <td> -0.9500</td> <td> 0.039</td> <td> -24.528</td> <td> 0.000</td> <td> -1.026</td> <td> -0.874</td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <th>Omnibus:</th> <td>137.043</td> <th> Durbin-Watson: </th> <td> 0.892</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Prob(Omnibus):</th> <td> 0.000</td> <th> Jarque-Bera (JB): </th> <td> 291.373</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Skew:</th> <td> 1.453</td> <th> Prob(JB): </th> <td>5.36e-64</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Kurtosis:</th> <td> 5.319</td> <th> Cond. No. </th> <td> 29.7</td>\n",
"</tr>\n",
"</table><br/><br/>Warnings:<br/>[1] Standard Errors assume that the covariance matrix of the errors is correctly specified."
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",
"\"\"\"\n",
" OLS Regression Results \n",
"==============================================================================\n",
"Dep. Variable: medv R-squared: 0.544\n",
"Model: OLS Adj. R-squared: 0.543\n",
"Method: Least Squares F-statistic: 601.6\n",
"Date: Wed, 10 Nov 2021 Prob (F-statistic): 5.08e-88\n",
"Time: 08:01:42 Log-Likelihood: -1641.5\n",
"No. Observations: 506 AIC: 3287.\n",
"Df Residuals: 504 BIC: 3295.\n",
"Df Model: 1 \n",
"Covariance Type: nonrobust \n",
"==============================================================================\n",
" coef std err t P>|t| [0.025 0.975]\n",
"------------------------------------------------------------------------------\n",
"Intercept 34.5538 0.563 61.415 0.000 33.448 35.659\n",
"lstat -0.9500 0.039 -24.528 0.000 -1.026 -0.874\n",
"==============================================================================\n",
"Omnibus: 137.043 Durbin-Watson: 0.892\n",
"Prob(Omnibus): 0.000 Jarque-Bera (JB): 291.373\n",
"Skew: 1.453 Prob(JB): 5.36e-64\n",
"Kurtosis: 5.319 Cond. No. 29.7\n",
"==============================================================================\n",
"\n",
"Warnings:\n",
"[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n",
"\"\"\""
]
},
"metadata": {},
"execution_count": 32
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 514
},
"id": "F9LaexLjBpyP",
"outputId": "2f16e276-e5fb-4a73-a4f3-9cddd8c718dc"
},
"source": [
"sns.regplot(x='lstat',y='medv',data=df)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7faf1ed1fb50>"
]
},
"metadata": {},
"execution_count": 33
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x576 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "73v0MnoPCwwo"
},
"source": [
"## Линейная регрессия в общем виде (Multiple linear regression)\n",
"\n",
"Предположем, что одно наблюдение это не одномерное значение, а вектор $X_i \\in R^k$. \n",
"\n",
"Выразим все наши наши данные наблюдений в матричной форме:\n",
"\n",
"$$\\bf{X}=\\left[ \\begin{matrix} 1 & X_{11} & X_{12} & ... & X_{1k}\\\\ 1& X_{21} & X_{22} & ... & X_{2k} \\\\ ... & ... & ... & ... & ... \\\\ 1 & X_{n1} & X_{n2} & ... & X_{nk} \\end{matrix} \\right]$$\n",
"\n",
"Запишем зависимую переменную, коэффициенты регрессии и остаточной ошибки в виде векторов: \n",
"\n",
"$$\\bf{Y}=\\left[ \\begin{matrix} Y_1 \\\\ Y_2 \\\\ ... \\\\ Y_n \\end{matrix} \\right], \\bf{\\beta}=\\left[ \\begin{matrix} \\beta_0 \\\\ \\beta_1 \\\\ ... \\\\ \\beta_k \\end{matrix} \\right], \\bf{\\epsilon}=\\left[ \\begin{matrix} \\epsilon_1 \\\\ \\epsilon_2 \\\\ ... \\\\ \\epsilon_n \\end{matrix} \\right]$$\n",
"\n",
"Тогда модель линейной регрессии для многомерного случая будет иметь вид: \n",
"\n",
"$$Y = \\bf{X \\beta + \\epsilon}$$\n",
"\n",
"где $E[\\bf{\\epsilon}]=\\bf{0}$, $Var(\\epsilon_i|X_i)=\\sigma^2$, $Cov(\\epsilon_i,\\epsilon_j)=0$ для всех $i$,$j$ таких что $i\\neq j$\n",
"\n",
"Если матрица $\\bf{X^TX}$ размерностью $kxk$ имеет обратную матрицу, то оценка коэффициентов регрессии методом наименьших квадратов имеет вид:\n",
"\n",
"$$\\bf{\\hat{\\beta}=(X^TX)^{-1}X^TY}$$\n",
"\n",
"\n",
"$$\\bf{\\hat{\\beta}} \\approx N(\\bf{\\beta,\\sigma^2(X^TX)^{-1}})$$\n",
"\n",
"$$\\hat{\\sigma}^2=\\frac{1}{n-k} \\Sigma_{i=1}^n \\hat{\\epsilon_i}^2$$\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "rD-qDCETKnpv",
"outputId": "19fe9d0d-665d-41e8-b536-fe9a58e865e3"
},
"source": [
"X = df.drop(columns='medv')\n",
"X"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>zn</th>\n",
" <th>indus</th>\n",
" <th>chas</th>\n",
" <th>nox</th>\n",
" <th>rm</th>\n",
" <th>age</th>\n",
" <th>dis</th>\n",
" <th>rad</th>\n",
" <th>tax</th>\n",
" <th>ptratio</th>\n",
" <th>black</th>\n",
" <th>lstat</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.00632</td>\n",
" <td>18.0</td>\n",
" <td>2.31</td>\n",
" <td>0</td>\n",
" <td>0.538</td>\n",
" <td>6.575</td>\n",
" <td>65.2</td>\n",
" <td>4.0900</td>\n",
" <td>1</td>\n",
" <td>296</td>\n",
" <td>15.3</td>\n",
" <td>396.90</td>\n",
" <td>4.98</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.02731</td>\n",
" <td>0.0</td>\n",
" <td>7.07</td>\n",
" <td>0</td>\n",
" <td>0.469</td>\n",
" <td>6.421</td>\n",
" <td>78.9</td>\n",
" <td>4.9671</td>\n",
" <td>2</td>\n",
" <td>242</td>\n",
" <td>17.8</td>\n",
" <td>396.90</td>\n",
" <td>9.14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0.02729</td>\n",
" <td>0.0</td>\n",
" <td>7.07</td>\n",
" <td>0</td>\n",
" <td>0.469</td>\n",
" <td>7.185</td>\n",
" <td>61.1</td>\n",
" <td>4.9671</td>\n",
" <td>2</td>\n",
" <td>242</td>\n",
" <td>17.8</td>\n",
" <td>392.83</td>\n",
" <td>4.03</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0.03237</td>\n",
" <td>0.0</td>\n",
" <td>2.18</td>\n",
" <td>0</td>\n",
" <td>0.458</td>\n",
" <td>6.998</td>\n",
" <td>45.8</td>\n",
" <td>6.0622</td>\n",
" <td>3</td>\n",
" <td>222</td>\n",
" <td>18.7</td>\n",
" <td>394.63</td>\n",
" <td>2.94</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>0.06905</td>\n",
" <td>0.0</td>\n",
" <td>2.18</td>\n",
" <td>0</td>\n",
" <td>0.458</td>\n",
" <td>7.147</td>\n",
" <td>54.2</td>\n",
" <td>6.0622</td>\n",
" <td>3</td>\n",
" <td>222</td>\n",
" <td>18.7</td>\n",
" <td>396.90</td>\n",
" <td>5.33</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>502</th>\n",
" <td>0.06263</td>\n",
" <td>0.0</td>\n",
" <td>11.93</td>\n",
" <td>0</td>\n",
" <td>0.573</td>\n",
" <td>6.593</td>\n",
" <td>69.1</td>\n",
" <td>2.4786</td>\n",
" <td>1</td>\n",
" <td>273</td>\n",
" <td>21.0</td>\n",
" <td>391.99</td>\n",
" <td>9.67</td>\n",
" </tr>\n",
" <tr>\n",
" <th>503</th>\n",
" <td>0.04527</td>\n",
" <td>0.0</td>\n",
" <td>11.93</td>\n",
" <td>0</td>\n",
" <td>0.573</td>\n",
" <td>6.120</td>\n",
" <td>76.7</td>\n",
" <td>2.2875</td>\n",
" <td>1</td>\n",
" <td>273</td>\n",
" <td>21.0</td>\n",
" <td>396.90</td>\n",
" <td>9.08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>504</th>\n",
" <td>0.06076</td>\n",
" <td>0.0</td>\n",
" <td>11.93</td>\n",
" <td>0</td>\n",
" <td>0.573</td>\n",
" <td>6.976</td>\n",
" <td>91.0</td>\n",
" <td>2.1675</td>\n",
" <td>1</td>\n",
" <td>273</td>\n",
" <td>21.0</td>\n",
" <td>396.90</td>\n",
" <td>5.64</td>\n",
" </tr>\n",
" <tr>\n",
" <th>505</th>\n",
" <td>0.10959</td>\n",
" <td>0.0</td>\n",
" <td>11.93</td>\n",
" <td>0</td>\n",
" <td>0.573</td>\n",
" <td>6.794</td>\n",
" <td>89.3</td>\n",
" <td>2.3889</td>\n",
" <td>1</td>\n",
" <td>273</td>\n",
" <td>21.0</td>\n",
" <td>393.45</td>\n",
" <td>6.48</td>\n",
" </tr>\n",
" <tr>\n",
" <th>506</th>\n",
" <td>0.04741</td>\n",
" <td>0.0</td>\n",
" <td>11.93</td>\n",
" <td>0</td>\n",
" <td>0.573</td>\n",
" <td>6.030</td>\n",
" <td>80.8</td>\n",
" <td>2.5050</td>\n",
" <td>1</td>\n",
" <td>273</td>\n",
" <td>21.0</td>\n",
" <td>396.90</td>\n",
" <td>7.88</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>506 rows × 13 columns</p>\n",
"</div>"
],
"text/plain": [
" crim zn indus chas nox ... rad tax ptratio black lstat\n",
"1 0.00632 18.0 2.31 0 0.538 ... 1 296 15.3 396.90 4.98\n",
"2 0.02731 0.0 7.07 0 0.469 ... 2 242 17.8 396.90 9.14\n",
"3 0.02729 0.0 7.07 0 0.469 ... 2 242 17.8 392.83 4.03\n",
"4 0.03237 0.0 2.18 0 0.458 ... 3 222 18.7 394.63 2.94\n",
"5 0.06905 0.0 2.18 0 0.458 ... 3 222 18.7 396.90 5.33\n",
".. ... ... ... ... ... ... ... ... ... ... ...\n",
"502 0.06263 0.0 11.93 0 0.573 ... 1 273 21.0 391.99 9.67\n",
"503 0.04527 0.0 11.93 0 0.573 ... 1 273 21.0 396.90 9.08\n",
"504 0.06076 0.0 11.93 0 0.573 ... 1 273 21.0 396.90 5.64\n",
"505 0.10959 0.0 11.93 0 0.573 ... 1 273 21.0 393.45 6.48\n",
"506 0.04741 0.0 11.93 0 0.573 ... 1 273 21.0 396.90 7.88\n",
"\n",
"[506 rows x 13 columns]"
]
},
"metadata": {},
"execution_count": 34
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "17aMKZNOLCuo",
"outputId": "ab2082db-2358-4977-a632-65b357b6e53e"
},
"source": [
"X = X.to_numpy()\n",
"X = np.c_[np.ones(506), X]\n",
"X"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[1.0000e+00, 6.3200e-03, 1.8000e+01, ..., 1.5300e+01, 3.9690e+02,\n",
" 4.9800e+00],\n",
" [1.0000e+00, 2.7310e-02, 0.0000e+00, ..., 1.7800e+01, 3.9690e+02,\n",
" 9.1400e+00],\n",
" [1.0000e+00, 2.7290e-02, 0.0000e+00, ..., 1.7800e+01, 3.9283e+02,\n",
" 4.0300e+00],\n",
" ...,\n",
" [1.0000e+00, 6.0760e-02, 0.0000e+00, ..., 2.1000e+01, 3.9690e+02,\n",
" 5.6400e+00],\n",
" [1.0000e+00, 1.0959e-01, 0.0000e+00, ..., 2.1000e+01, 3.9345e+02,\n",
" 6.4800e+00],\n",
" [1.0000e+00, 4.7410e-02, 0.0000e+00, ..., 2.1000e+01, 3.9690e+02,\n",
" 7.8800e+00]])"
]
},
"metadata": {},
"execution_count": 35
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "yK7PtvMmLMxs"
},
"source": [
"Y = df['medv'].to_numpy()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vC_AG2odLS6f",
"outputId": "9db0872f-2049-44fe-d933-d7ad5409eeda"
},
"source": [
"Betta = np.linalg.inv(X.T@X)@X.T@Y\n",
"Betta"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([ 3.64594884e+01, -1.08011358e-01, 4.64204584e-02, 2.05586264e-02,\n",
" 2.68673382e+00, -1.77666112e+01, 3.80986521e+00, 6.92224640e-04,\n",
" -1.47556685e+00, 3.06049479e-01, -1.23345939e-02, -9.52747232e-01,\n",
" 9.31168327e-03, -5.24758378e-01])"
]
},
"metadata": {},
"execution_count": 37
}
]
},
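{
"cell_type": "markdown",
"metadata": {},
"source": [
"The explicit inverse mirrors the textbook formula, but a least squares solver is usually the safer numerical choice when the design matrix is poorly conditioned. The next cell is a minimal sketch on synthetic data (the `*_demo` names, sizes, and true coefficients are assumptions for illustration): it compares the inverse-based estimate with `np.linalg.lstsq` and derives standard errors from $\\hat{\\sigma}^2(\\mathbf{X}^T\\mathbf{X})^{-1}$, dividing the residual sum of squares by $n$ minus the number of columns of the design matrix."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustrative sketch on synthetic data; the *_demo names and coefficients are assumptions\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(2)\n",
"n_demo, k_demo = 200, 3\n",
"# Design matrix with a leading column of ones: shape (n, k + 1)\n",
"X_demo = np.c_[np.ones(n_demo), rng.normal(size=(n_demo, k_demo))]\n",
"beta_true_demo = np.array([1.0, 2.0, -1.0, 0.5])\n",
"Y_demo = X_demo @ beta_true_demo + rng.normal(scale=0.7, size=n_demo)\n",
"\n",
"# Estimate via the explicit inverse, as in the formula\n",
"XtX_inv_demo = np.linalg.inv(X_demo.T @ X_demo)\n",
"beta_inv_demo = XtX_inv_demo @ X_demo.T @ Y_demo\n",
"\n",
"# Estimate via a least squares solver, which avoids forming the inverse\n",
"beta_lstsq_demo, *_ = np.linalg.lstsq(X_demo, Y_demo, rcond=None)\n",
"\n",
"# Variance estimate with n - (k + 1) degrees of freedom, then standard errors\n",
"resid_demo = Y_demo - X_demo @ beta_lstsq_demo\n",
"sigma2_demo = resid_demo @ resid_demo / (n_demo - k_demo - 1)\n",
"std_err_demo = np.sqrt(sigma2_demo * np.diag(XtX_inv_demo))\n",
"\n",
"print(np.allclose(beta_inv_demo, beta_lstsq_demo))\n",
"print(std_err_demo)"
],
"execution_count": null,
"outputs": []
},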
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 730
},
"id": "biPTraBxB0WU",
"outputId": "780b9a3d-07be-46a4-a727-2ef4bb84a760"
},
"source": [
"lm = sm.OLS.from_formula('medv ~ crim + zn + indus + chas + nox + rm + age + dis + rad + tax + ptratio + black + lstat', df)\n",
"result = lm.fit()\n",
"result.summary()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OLS Regression Results</caption>\n",
"<tr>\n",
" <th>Dep. Variable:</th> <td>medv</td> <th> R-squared: </th> <td> 0.741</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Model:</th> <td>OLS</td> <th> Adj. R-squared: </th> <td> 0.734</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Method:</th> <td>Least Squares</td> <th> F-statistic: </th> <td> 108.1</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Date:</th> <td>Wed, 10 Nov 2021</td> <th> Prob (F-statistic):</th> <td>6.72e-135</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Time:</th> <td>08:15:01</td> <th> Log-Likelihood: </th> <td> -1498.8</td> \n",
"</tr>\n",
"<tr>\n",
" <th>No. Observations:</th> <td> 506</td> <th> AIC: </th> <td> 3026.</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Df Residuals:</th> <td> 492</td> <th> BIC: </th> <td> 3085.</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Df Model:</th> <td> 13</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Covariance Type:</th> <td>nonrobust</td> <th> </th> <td> </td> \n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P>|t|</th> <th>[0.025</th> <th>0.975]</th> \n",
"</tr>\n",
"<tr>\n",
" <th>Intercept</th> <td> 36.4595</td> <td> 5.103</td> <td> 7.144</td> <td> 0.000</td> <td> 26.432</td> <td> 46.487</td>\n",
"</tr>\n",
"<tr>\n",
" <th>crim</th> <td> -0.1080</td> <td> 0.033</td> <td> -3.287</td> <td> 0.001</td> <td> -0.173</td> <td> -0.043</td>\n",
"</tr>\n",
"<tr>\n",
" <th>zn</th> <td> 0.0464</td> <td> 0.014</td> <td> 3.382</td> <td> 0.001</td> <td> 0.019</td> <td> 0.073</td>\n",
"</tr>\n",
"<tr>\n",
" <th>indus</th> <td> 0.0206</td> <td> 0.061</td> <td> 0.334</td> <td> 0.738</td> <td> -0.100</td> <td> 0.141</td>\n",
"</tr>\n",
"<tr>\n",
" <th>chas</th> <td> 2.6867</td> <td> 0.862</td> <td> 3.118</td> <td> 0.002</td> <td> 0.994</td> <td> 4.380</td>\n",
"</tr>\n",
"<tr>\n",
" <th>nox</th> <td> -17.7666</td> <td> 3.820</td> <td> -4.651</td> <td> 0.000</td> <td> -25.272</td> <td> -10.262</td>\n",
"</tr>\n",
"<tr>\n",
" <th>rm</th> <td> 3.8099</td> <td> 0.418</td> <td> 9.116</td> <td> 0.000</td> <td> 2.989</td> <td> 4.631</td>\n",
"</tr>\n",
"<tr>\n",
" <th>age</th> <td> 0.0007</td> <td> 0.013</td> <td> 0.052</td> <td> 0.958</td> <td> -0.025</td> <td> 0.027</td>\n",
"</tr>\n",
"<tr>\n",
" <th>dis</th> <td> -1.4756</td> <td> 0.199</td> <td> -7.398</td> <td> 0.000</td> <td> -1.867</td> <td> -1.084</td>\n",
"</tr>\n",
"<tr>\n",
" <th>rad</th> <td> 0.3060</td> <td> 0.066</td> <td> 4.613</td> <td> 0.000</td> <td> 0.176</td> <td> 0.436</td>\n",
"</tr>\n",
"<tr>\n",
" <th>tax</th> <td> -0.0123</td> <td> 0.004</td> <td> -3.280</td> <td> 0.001</td> <td> -0.020</td> <td> -0.005</td>\n",
"</tr>\n",
"<tr>\n",
" <th>ptratio</th> <td> -0.9527</td> <td> 0.131</td> <td> -7.283</td> <td> 0.000</td> <td> -1.210</td> <td> -0.696</td>\n",
"</tr>\n",
"<tr>\n",
" <th>black</th> <td> 0.0093</td> <td> 0.003</td> <td> 3.467</td> <td> 0.001</td> <td> 0.004</td> <td> 0.015</td>\n",
"</tr>\n",
"<tr>\n",
" <th>lstat</th> <td> -0.5248</td> <td> 0.051</td> <td> -10.347</td> <td> 0.000</td> <td> -0.624</td> <td> -0.425</td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <th>Omnibus:</th> <td>178.041</td> <th> Durbin-Watson: </th> <td> 1.078</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Prob(Omnibus):</th> <td> 0.000</td> <th> Jarque-Bera (JB): </th> <td> 783.126</td> \n",
"</tr>\n",
"<tr>\n",
" <th>Skew:</th> <td> 1.521</td> <th> Prob(JB): </th> <td>8.84e-171</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Kurtosis:</th> <td> 8.281</td> <th> Cond. No. </th> <td>1.51e+04</td> \n",
"</tr>\n",
"</table><br/><br/>Warnings:<br/>[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.<br/>[2] The condition number is large, 1.51e+04. This might indicate that there are<br/>strong multicollinearity or other numerical problems."
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",
"\"\"\"\n",
" OLS Regression Results \n",
"==============================================================================\n",
"Dep. Variable: medv R-squared: 0.741\n",
"Model: OLS Adj. R-squared: 0.734\n",
"Method: Least Squares F-statistic: 108.1\n",
"Date: Wed, 10 Nov 2021 Prob (F-statistic): 6.72e-135\n",
"Time: 08:15:01 Log-Likelihood: -1498.8\n",
"No. Observations: 506 AIC: 3026.\n",
"Df Residuals: 492 BIC: 3085.\n",
"Df Model: 13 \n",
"Covariance Type: nonrobust \n",
"==============================================================================\n",
" coef std err t P>|t| [0.025 0.975]\n",
"------------------------------------------------------------------------------\n",
"Intercept 36.4595 5.103 7.144 0.000 26.432 46.487\n",
"crim -0.1080 0.033 -3.287 0.001 -0.173 -0.043\n",
"zn 0.0464 0.014 3.382 0.001 0.019 0.073\n",
"indus 0.0206 0.061 0.334 0.738 -0.100 0.141\n",
"chas 2.6867 0.862 3.118 0.002 0.994 4.380\n",
"nox -17.7666 3.820 -4.651 0.000 -25.272 -10.262\n",
"rm 3.8099 0.418 9.116 0.000 2.989 4.631\n",
"age 0.0007 0.013 0.052 0.958 -0.025 0.027\n",
"dis -1.4756 0.199 -7.398 0.000 -1.867 -1.084\n",
"rad 0.3060 0.066 4.613 0.000 0.176 0.436\n",
"tax -0.0123 0.004 -3.280 0.001 -0.020 -0.005\n",
"ptratio -0.9527 0.131 -7.283 0.000 -1.210 -0.696\n",
"black 0.0093 0.003 3.467 0.001 0.004 0.015\n",
"lstat -0.5248 0.051 -10.347 0.000 -0.624 -0.425\n",
"==============================================================================\n",
"Omnibus: 178.041 Durbin-Watson: 1.078\n",
"Prob(Omnibus): 0.000 Jarque-Bera (JB): 783.126\n",
"Skew: 1.521 Prob(JB): 8.84e-171\n",
"Kurtosis: 8.281 Cond. No. 1.51e+04\n",
"==============================================================================\n",
"\n",
"Warnings:\n",
"[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n",
"[2] The condition number is large, 1.51e+04. This might indicate that there are\n",
"strong multicollinearity or other numerical problems.\n",
"\"\"\""
]
},
"metadata": {},
"execution_count": 38
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
},
"id": "oeeP-FvjKJC3",
"outputId": "83bffc56-bdf3-43e0-d54b-381ddba7c6f9"
},
"source": [
"df_new_data = pd.DataFrame(np.array([[0, 0, 0 , 1, 0.4, 10, 0, 2, 20, 400, 15, 0, 0]]), \n",
" columns=['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat'])\n",
"df_new_data"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>crim</th>\n",
" <th>zn</th>\n",
" <th>indus</th>\n",
" <th>chas</th>\n",
" <th>nox</th>\n",
" <th>rm</th>\n",
" <th>age</th>\n",
" <th>dis</th>\n",
" <th>rad</th>\n",
" <th>tax</th>\n",
" <th>ptratio</th>\n",
" <th>black</th>\n",
" <th>lstat</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>0.4</td>\n",
" <td>10.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>20.0</td>\n",
" <td>400.0</td>\n",
" <td>15.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" crim zn indus chas nox rm ... dis rad tax ptratio black lstat\n",
"0 0.0 0.0 0.0 1.0 0.4 10.0 ... 2.0 20.0 400.0 15.0 0.0 0.0\n",
"\n",
"[1 rows x 13 columns]"
]
},
"metadata": {},
"execution_count": 39
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "P8Z98W3UX_Oa",
"outputId": "c13d4795-766c-48e3-8991-d509430caae6"
},
"source": [
"result.predict(df_new_data)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 54.08304\n",
"dtype: float64"
]
},
"metadata": {},
"execution_count": 40
}
]
},
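{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a linear model, a point prediction is the intercept plus the coefficient-weighted sum of the new observation's features. The next cell sketches that arithmetic with hypothetical numbers (`beta_demo` and `x_new_demo` are made up for illustration and are not taken from the fitted model above)."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustrative sketch; the coefficients and the new observation are hypothetical\n",
"import numpy as np\n",
"\n",
"beta_demo = np.array([1.0, 2.0, -1.0])  # intercept, then coefficients for x1 and x2\n",
"x_new_demo = np.array([0.5, 3.0])       # new observation (x1, x2)\n",
"\n",
"# Prepend 1 for the intercept term, then take the dot product with the coefficients\n",
"y_pred_demo = np.r_[1.0, x_new_demo] @ beta_demo\n",
"print(y_pred_demo)"
],
"execution_count": null,
"outputs": []
},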
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 500
},
"id": "E6xriKXXWqJp",
"outputId": "1fe7787e-a78b-42c3-87f5-6feea87b6e2b"
},
"source": [
"df['medv'].hist(bins=40)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7faf116d2510>"
]
},
"metadata": {},
"execution_count": 44
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAr8AAAHSCAYAAADlm6P3AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAUs0lEQVR4nO3dX4il933f8c+3GhsLbSvZljuIldpRsXERVmPjwXVwLmaVuqhZE+vCBAc3yKCyF03ApQrpJBBKSgPrC8fJRW9EbLQXbdbGjSujTUiEomlaaJTsxk7XtmqsmDX1okiESE7WGIdNfr3YR85GndmZ3Tmzz9n9vl4g5jznz5zvrH5zzluPnj1PjTECAAAd/J25BwAAgOtF/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbaxczye78847x9ra2vV8Srbxne98J7fddtvcY7CErA12Ym1wJdYHO5lrbZw5c+ZPxxhv2e626xq/a2trOX369PV8SraxtbWVjY2NucdgCVkb7MTa4EqsD3Yy19qoqm/udJvDHgAAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgjZW5BwCu3drmqWt+7LnjRxc4CQDcGOz5BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhpNcADcUJ/YAYD/s+QUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaGPP8VtVt1TVF6vqyWn73qp6tqqer6rPVNXrD25MAADYv6vZ8/uxJM9dtv3xJJ8cY7w1yctJHlnkYAAAsGh7it+qujvJ0SS/Om1XkgeSfG66y4kkDx3EgAAAsCh73fP7y0l+JslfT9tvTvLKGOPitP2tJIcXPBsAACxUjTGufIeqDyT5kTHGv66qjSQ/neSjSX5vOuQhVXVPkt8cY7xjm8cfS3IsSVZXV9998uTJhf4AXL0LFy7k0KFDc4/BApw9/+2Ffr/VW5MXv7v7/e4/fPtCn/dq7OdnnnPuG53XDa7E+mAnc62NI0eOnBljrG9328oeHv++JD9aVT+S5A1J/l6SX0lyR1WtTHt/705yfrsHjzEeS/JYkqyvr4+NjY2r/wlYqK2trfj3cHP46OaphX6/R++/mE+c3f1l4dxHNhb6vFdjPz/znHPf6LxucCXWBztZxrWx62EPY4yfHWPcPcZYS/LhJL8zxvhIkmeSfGi628NJnjiwKQEAYAH28zm//y7Jv62q53PpGOBPLWYkAAA4GHs57OH7xhhbSbamy99I8p7FjwQAAAfDGd4AAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtrMw9ANDP2uapuUcAoCl7fgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaWJl7AODGs7Z5au4RAOCa2PMLAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAA
BtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgjV3jt6reUFW/X1V/VFVfqapfmK6/t6qerarnq+ozVfX6gx8XAACu3V72/H4vyQNjjB9I8s4kD1bVe5N8PMknxxhvTfJykkcObkwAANi/XeN3XHJh2nzd9M9I8kCSz03Xn0jy0IFMCAAAC7KnY36r6paq+lKSl5I8leSPk7wyxrg43eVbSQ4fzIgAALAYNcbY+52r7kjy+SQ/n+Tx6ZCHVNU9SX5zjPGObR5zLMmxJFldXX33yZMnFzE3+3DhwoUcOnRo7jFYgLPnv73Q77d6a/Lidxf6LW8a9x++fe4RZuV1gyuxPtjJXGvjyJEjZ8YY69vdtnI132iM8UpVPZPkB5PcUVUr097fu5Oc3+ExjyV5LEnW19fHxsbG1TwlB2Brayv+PdwcPrp5aqHf79H7L+YTZ6/qZaGNcx/ZmHuEWXnd4EqsD3ayjGtjL5/28JZpj2+q6tYk70/yXJJnknxoutvDSZ44qCEBAGAR9rKL564kJ6rqllyK5c+OMZ6sqq8mOVlV/zHJF5N86gDnBACAfds1fscY/zvJu7a5/htJ3nMQQwEAwEFwhjcAANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoI2VuQeAy61tnrrmx547fnSBk1w/+/mZAYCrY88vAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaWJl7AFiUtc1T1/zYc8ePLnASAGBZ2fMLAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoY9f4rap7quqZqvpqVX2lqj42Xf+mqnqqqr4+fX3jwY8LAADXbi97fi8meXSMcV+S9yb5yaq6L8lmkqfHGG9L8vS0DQAAS2vX+B1jvDDG+MPp8l8keS7J4SQfTHJiutuJJA8d1JAAALAIV3XMb1WtJXlXkmeTrI4xXphu+pMkqwudDAAAFqzGGHu7Y9WhJP89yS+OMX69ql4ZY9xx2e0vjzH+v+N+q+pYkmNJsrq6+u6TJ08uZnKu2YULF3Lo0KG5x9jW2fPfnuV57z98+yzPm8z3M29n9dbkxe/OPcVymnONLINlft1gftYHO5lrbRw5cuTMGGN9u9v2FL9V9bokTyb5rTHGL03XfS3Jxhjjhaq6K8nWGOPtV/o+6+vr4/Tp01f9A7BYW1tb2djYmHuMba1tnprlec8dPzrL8ybz/czbefT+i/nE2ZW5x1hKc66RZbDMrxvMz/pgJ3OtjaraMX738mkPleRTSZ57NXwnX0jy8HT54SRP7HdQAAA4SHvZxfO+JD+R5GxVfWm67ueSHE/y2ap6JMk3k/zYwYwIAACLsWv8jjH+Z5La4eYfXuw4AABwcJzhDQCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0MbK3APAMljbPDX3CADAdWDPLw
AAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0sTL3AAAALK+1zVPX/NjHH7xtgZMshj2/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgjZW5B+Dms7Z5au4R4Kayn9+pc8ePLnASgBufPb8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADa2DV+q+rTVfVSVX35suveVFVPVdXXp69vPNgxAQBg//ay5/fxJA++5rrNJE+PMd6W5OlpGwAAltqu8TvG+N0kf/aaqz+Y5MR0+USShxY8FwAALFyNMXa/U9VakifHGO+Ytl8ZY9wxXa4kL7+6vc1jjyU5liSrq6vvPnny5GIm54rOnv/2jret3pq8+N2dH3v/4dsP7LlZbrutjc7283txM/xOXO3amPPPa7+vYVy9Cxcu5NChQ3OPwQHZz+/kvbffMsvaOHLkyJkxxvp2t+07fqftl8cYux73u76+Pk6fPr3XudmHtc1TO9726P0X84mzKzvefu740QN7bpbbbmujs/38XtwMvxNXuzbm/PPa72sYV29raysbGxtzj8EB2c/v5OMP3jbL2qiqHeP3Wj/t4cWqumv65ncleelahwMAgOvlWuP3C0keni4/nOSJxYwDAAAHZy8fdfZrSf5XkrdX1beq6pEkx5O8v6q+nuSfTdsAALDUdj2Aa4zx4zvc9MMLngUAAA6UM7wBANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDfELAEAb4hcAgDbELwAAbYhfAADaWJl7AJbP2uapuUcAbnBeR4BlZc8vAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2nORiifmQeIDra67X3XPHj87yvNCRPb8AALQhfgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtOEMb7vY79l+nLUH4PpydkzgSuz5BQCgDfELAEAb4hcAgDbELwAAbYhfAADaEL8AALQhfgEAaEP8AgDQhvgFAKANZ3g7YM40BMBunE0Urh97fgEAaEP8AgDQhvgFAKAN8QsAQBviFwCANsQvAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0MbK3ANcD2ubp+YeAQBYkDnf188dPzrbc++HFvob9vwCANCG+AUAoA3xCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0EaLM7wB7JezI7HM9rM+5zpj2Y36O3Uj/lnzt9nzCwBAG+IXAIA2xC8AAG2IXwAA2hC/AAC0IX4BAGhD/AIA0Ib4BQCgDSe5AIDGnLTh+rlRT+xxs7HnFwCANsQvAABtiF8AANoQvwAAtCF+AQBoY1/xW1UPVtXXqur5qtpc1FAAAHAQrjl+q+qWJP8pyb9Icl+SH6+q+xY1GAAALNp+9vy+J8nzY4xvjDH+MsnJJB9czFgAALB4+4nfw0n+72Xb35quAwCApVRjjGt7YNWHkjw4xvhX0/ZPJPmnY4yfes39jiU5Nm2+PcnXrn1cFuTOJH869xAsJWuDnVgbXIn1wU
7mWhv/cIzxlu1u2M/pjc8nueey7bun6/6WMcZjSR7bx/OwYFV1eoyxPvccLB9rg51YG1yJ9cFOlnFt7Oewhz9I8raqureqXp/kw0m+sJixAABg8a55z+8Y42JV/VSS30pyS5JPjzG+srDJAABgwfZz2EPGGL+R5DcWNAvXj8NQ2Im1wU6sDa7E+mAnS7c2rvkvvAEAwI3G6Y0BAGhD/N7kqurTVfVSVX35suveVFVPVdXXp69vnHNG5lFV91TVM1X11ar6SlV9bLre+miuqt5QVb9fVX80rY1fmK6/t6qenU5p/5npLzvTUFXdUlVfrKonp21rg1TVuao6W1VfqqrT03VL954ifm9+jyd58DXXbSZ5eozxtiRPT9v0czHJo2OM+5K8N8lPTqcotz74XpIHxhg/kOSdSR6sqvcm+XiST44x3prk5SSPzDgj8/pYkucu27Y2eNWRMcY7L/t4s6V7TxG/N7kxxu8m+bPXXP3BJCemyyeSPHRdh2IpjDFeGGP84XT5L3LpjexwrI/2xiUXps3XTf+MJA8k+dx0vbXRVFXdneRokl+dtivWBjtbuvcU8dvT6hjjhenynyRZnXMY5ldVa0neleTZWB/k+/9b+0tJXkryVJI/TvLKGOPidBentO/rl5P8TJK/nrbfHGuDS0aS366qM9MZfpMlfE/Z10edceMbY4yq8pEfjVXVoST/Ncm/GWP8+aWdOJdYH32NMf4qyTur6o4kn0/yj2ceiSVQVR9I8tIY40xVbcw9D0vnh8YY56vq7yd5qqr+z+U3Lst7ij2/Pb1YVXclyfT1pZnnYSZV9bpcCt//PMb49elq64PvG2O8kuSZJD+Y5I6qenWnybantOem974kP1pV55KczKXDHX4l1gZJxhjnp68v5dJ/NL8nS/ieIn57+kKSh6fLDyd5YsZZmMl0nN6nkjw3xvily26yPpqrqrdMe3xTVbcmeX8uHRP+TJIPTXezNhoaY/zsGOPuMcZakg8n+Z0xxkdibbRXVbdV1d999XKSf57ky1nC9xQnubjJVdWvJdlIcmeSF5P8+yT/Lclnk/yDJN9M8mNjjNf+pThuclX1Q0n+R5Kz+Ztj934ul477tT4aq6p/kkt/MeWWXNpJ8tkxxn+oqn+US3v73pTki0n+5Rjje/NNypymwx5+eozxAWuDaQ18ftpcSfJfxhi/WFVvzpK9p4hfAADacNgDAABtiF8AANoQvwAAtCF+AQBoQ/wCANCG+AUAoA3xCwBAG+IXAIA2/h+yn/u5KdaecQAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 864x576 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "moUPnO2n70dW"
},
"source": [
""
],
"execution_count": null,
"outputs": []
}
]
}