@sunkay
Created October 15, 2020 16:10
Regression-linear-simple.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Regression-linear-simple.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyN/PMu30dV/ku0iAKIjrOzF",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/sunkay/b43c0014650186d320536bb86bef8951/regression-linear-simple.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tmNZ2_k9H0aw"
},
"source": [
"Simple linear regression using salary data"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QyyDcMBLI-Kt"
},
"source": [
"Data Preprocessing\n",
"\n",
"1. Import Libraries\n",
"2. Read Data Set\n",
"3. Split Training & Test Sets\n",
"4. No need to encode, since there are no text categories/features\n",
"5. No need to address missing values"
]
},
{
"cell_type": "code",
"metadata": {
"id": "VYFU5v0bH5I7"
},
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd "
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "r5nyZLJXIFJQ"
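},
"source": [
"# Load the salary data; the last column is the target\n",
"dataset = pd.read_csv('Salary_Data.csv')\n",
"X = dataset.iloc[:, :-1].values  # features: every column except the last\n",
"y = dataset.iloc[:, -1].values   # target: the last column"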
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "RY_S8NGWIkmt"
},
"source": [
"from sklearn.model_selection import train_test_split\n",
"X_train, X_test, y_train, y_test = \\\n",
" train_test_split(X, y, test_size = 0.2, random_state = 1)"
],
"execution_count": null,
"outputs": []
},
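{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of what `test_size = 0.2` and `random_state` do, the sketch below splits a made-up 30-row array rather than the real `Salary_Data.csv`. The names `X_demo`, `y_demo`, `X_tr` and `X_te` are hypothetical and exist only for this example. With 30 rows, a 20% test share should give 6 test rows and 24 training rows, and repeating the call with the same `random_state` should reproduce the same split."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Made-up data for illustration only: 30 rows, one feature column\n",
"X_demo = np.arange(30, dtype=float).reshape(-1, 1)\n",
"y_demo = 2.0 * X_demo.ravel() + 1.0\n",
"\n",
"# Hold out 20% of the rows for testing\n",
"X_tr, X_te, y_tr, y_te = train_test_split(\n",
"    X_demo, y_demo, test_size=0.2, random_state=1)\n",
"print(X_tr.shape, X_te.shape)\n",
"\n",
"# Repeating the call with the same random_state selects the same test rows\n",
"X_tr2, X_te2, y_tr2, y_te2 = train_test_split(\n",
"    X_demo, y_demo, test_size=0.2, random_state=1)\n",
"print(np.array_equal(X_te, X_te2))"
],
"execution_count": null,
"outputs": []
},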
{
"cell_type": "code",
"metadata": {
"id": "sj5xTkoRInO3"
},
"source": [
"print(X_train)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "qkuk4gAXJLky"
},
"source": [
"print(X_test)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Ix80DyvPJOMn",
"outputId": "7f3d79d1-3368-4700-d73c-abd5dd6478f1",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 54
}
},
"source": [
"from sklearn.linear_model import LinearRegression\n",
"regressor = LinearRegression()\n",
"regressor.fit(X_train, y_train)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ydIrJQmHKQQN"
},
"source": [
"y_pred = regressor.predict(X_test)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "xXsJ0Ps-LqOE",
"outputId": "85fcb8b2-f64f-40ec-a439-9170cc678a1a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"print(y_pred)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[75074.50510972 91873.8056381 62008.38247653 81607.56642631\n",
" 67608.14931932 89073.92221671]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rpPr4nOGL-OW"
},
"source": [
"Evaluate whether the predicted results are good enough\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "b_40PR87MDv8",
"outputId": "04cd6e43-7142-4d14-a26e-9dc6a9b99d86",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 295
}
},
"source": [
"plt.scatter(X_train, y_train, color = 'red')\n",
"plt.plot(X_train, regressor.predict(X_train))\n",
"plt.title('Salary vs. Experience (Training)')\n",
"plt.xlabel('Years of Experience')\n",
"plt.ylabel('Salary')\n",
"plt.show()"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "FIS8e-DZMlJ0",
"outputId": "ff65fbb7-ed67-4cc9-a1ab-5daab573b3f9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 295
}
},
"source": [
"plt.scatter(X_test, y_test, color = 'red')\n",
"# The fitted line is the same whichever set it is drawn over, so reuse X_train\n",
"plt.plot(X_train, regressor.predict(X_train))\n",
"plt.title('Salary vs. Experience (Test)')\n",
"plt.xlabel('Years of Experience')\n",
"plt.ylabel('Salary')\n",
"plt.show()"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
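{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plots give a visual check; a numeric score can complement them. The sketch below is one possible way to do that, using `r2_score` and `mean_absolute_error` from `sklearn.metrics`. It runs on synthetic data (an assumed linear trend plus noise), not on the salary dataset, and every `*_demo` name and the `model` variable are hypothetical. The same two calls could be applied to `y_test` and `y_pred` from this notebook."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.metrics import mean_absolute_error, r2_score\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Synthetic stand-in data: an assumed linear trend plus Gaussian noise\n",
"rng = np.random.default_rng(0)\n",
"X_demo = rng.uniform(1, 10, size=(30, 1))\n",
"y_demo = 25000 + 9000 * X_demo.ravel() + rng.normal(0, 5000, size=30)\n",
"\n",
"X_tr, X_te, y_tr, y_te = train_test_split(\n",
"    X_demo, y_demo, test_size=0.2, random_state=1)\n",
"model = LinearRegression().fit(X_tr, y_tr)\n",
"pred_demo = model.predict(X_te)\n",
"\n",
"# R^2: share of test-set variance explained (1.0 is a perfect fit)\n",
"r2 = r2_score(y_te, pred_demo)\n",
"# MAE: average absolute error, in the same units as the target\n",
"mae = mean_absolute_error(y_te, pred_demo)\n",
"print('R^2:', r2)\n",
"print('MAE:', mae)"
],
"execution_count": null,
"outputs": []
},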
{
"cell_type": "markdown",
"metadata": {
"id": "ub2zTGzLO4zO"
},
"source": [
"Making a single prediction (for example the salary of an employee with 12 years of experience)\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "wuPfWj78O5_k",
"outputId": "9d754c15-74f6-47cd-e050-9ca504e6a57b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"print(regressor.predict([[12]]))"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[137605.23485427]\n"
],
"name": "stdout"
}
]
},
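{
"cell_type": "markdown",
"metadata": {},
"source": [
"The double brackets in `[[12]]` are there because `predict` expects a 2-D array of shape (n_samples, n_features). The sketch below illustrates this on a tiny made-up dataset that follows 20000 + 10000 x exactly. The names `X_demo`, `y_demo` and `model` are hypothetical and the numbers are not from `Salary_Data.csv`."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"from sklearn.linear_model import LinearRegression\n",
"\n",
"# Made-up points lying exactly on y = 20000 + 10000 * x\n",
"X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])\n",
"y_demo = np.array([30000.0, 40000.0, 50000.0, 60000.0])\n",
"model = LinearRegression().fit(X_demo, y_demo)\n",
"\n",
"# One sample with one feature: a 1x1 nested list\n",
"single = model.predict([[12]])\n",
"# Equivalent: reshape a flat array into a single column\n",
"reshaped = model.predict(np.array([12]).reshape(-1, 1))\n",
"print(single.shape, single[0], reshaped[0])\n",
"\n",
"# A flat 1-D input is rejected by scikit-learn\n",
"try:\n",
"    model.predict([12])\n",
"except ValueError as err:\n",
"    print('1-D input rejected:', type(err).__name__)"
],
"execution_count": null,
"outputs": []
},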
{
"cell_type": "markdown",
"metadata": {
"id": "r2Hm2CIAPUbf"
},
"source": [
"Getting the final linear regression equation with the values of the coefficients\n",
"\n",
"y = b0 + b1 x, where b0 is `regressor.intercept_` and b1 is `regressor.coef_[0]`"
]
},
{
"cell_type": "code",
"metadata": {
"id": "LsWum86GPioK",
"outputId": "b6467936-7981-45be-dad9-f93adcf95b8f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"print(regressor.coef_)\n",
"print(regressor.intercept_)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[9332.94473799]\n",
"25609.89799835482\n"
],
"name": "stdout"
}
]
},
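{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, the printed slope and intercept can be plugged into y = b0 + b1 x by hand. The sketch below copies the two values from the output above (the variable names `b0`, `b1`, `years` and `salary` are just for this example) and evaluates the line at 12 years, which should agree with the earlier `regressor.predict([[12]])` result up to rounding."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Values copied from the printed coef_ and intercept_ output above\n",
"b1 = 9332.94473799       # slope: salary change per year of experience\n",
"b0 = 25609.89799835482   # intercept: predicted salary at 0 years\n",
"\n",
"years = 12\n",
"salary = b0 + b1 * years\n",
"print(salary)\n",
"\n",
"# Difference from the value printed by regressor.predict([[12]])\n",
"print(abs(salary - 137605.23485427))"
],
"execution_count": null,
"outputs": []
},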
{
"cell_type": "markdown",
"metadata": {
"id": "UT6tW5pvPmFv"
},
"source": [
"Salary = 9332.94 × YearsExperience + 25609.90"
]
}
]
}