@tharunpeddisetty
Last active May 26, 2021 07:16
Implementing Polynomial Regression for predicting the salaries of the employees based on their seniority level
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('/Users/tharunpeddisetty/Desktop/Position_Salaries.csv') #Add your file path
X = dataset.iloc[:,1:-1].values
y = dataset.iloc[:, -1].values
#Visualizing the data
plt.scatter(X,y,c='red')
plt.title('Polynomial Regression')
plt.xlabel("Position Level")
plt.ylabel('Salary')
plt.show()
# The dataset has only 10 rows and we want to predict level 6.5, so we train on the entire dataset (no train/test split)
#First training using Linear Regression
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X,y)
#Training using Polynomial Linear Regression
# We need to create a matrix of features containing the polynomial terms (x, x^2, x^3, x^4 for degree=4), not only the squared term
from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree=4)
X_poly=poly_reg.fit_transform(X)
lin_reg2 = LinearRegression()
lin_reg2.fit(X_poly,y)
#Visualizing the results of Linear Regression
plt.scatter(X,y,c='red')
plt.plot(X,lin_reg.predict(X), c='blue')
plt.title('Linear Regression')
plt.xlabel("Position Level")
plt.ylabel('Salary')
plt.show()
#Visualizing the results of Polynomial Regression
plt.scatter(X,y,c='red')
plt.plot(X,lin_reg2.predict(X_poly), c='blue')
plt.title('Polynomial Linear Regression')
plt.xlabel("Position Level")
plt.ylabel('Salary')
plt.show()
#Predicting 6.5 level result using Linear Regression
print(lin_reg.predict([[6.5]])) # predict expects a 2D array: each inner list is one sample (row), each value in it is one feature (column); e.g. [[6.5],[7.5]] predicts two levels
#Predicting 6.5 level result using Polynomial Linear Regression
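print(lin_reg2.predict(poly_reg.transform([[6.5]]))) # use transform (not fit_transform) to expand 6.5 into the same polynomial features the model was trained on; this model follows the data much more closely than the straight line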
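The script above leaves the feature expansion to PolynomialFeatures. As a rough illustration of what that step amounts to for a single input column, here is a minimal NumPy-only sketch. The helper names `poly_features` and `fit_and_predict` are hypothetical (they are not part of scikit-learn), and the fit uses plain least squares via `np.linalg.lstsq`, which should give the same fitted values as LinearRegression on this data but is not how the script itself does it.

```python
import numpy as np

# Levels and salaries copied from the Position_Salaries data in this gist
levels = np.arange(1, 11, dtype=float)
salaries = np.array([45000, 50000, 60000, 80000, 110000,
                     150000, 200000, 300000, 500000, 1000000], dtype=float)


def poly_features(x, degree):
    """Hypothetical helper: columns [1, x, x^2, ..., x^degree] for a 1-D input."""
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)


def fit_and_predict(x, y, degree, x_new):
    """Hypothetical helper: least-squares fit on the expanded features, then predict."""
    coeffs, *_ = np.linalg.lstsq(poly_features(x, degree), y, rcond=None)
    return poly_features(x_new, degree) @ coeffs


# Degree 1 corresponds to the straight-line model, degree 4 to the polynomial one
linear_pred = fit_and_predict(levels, salaries, 1, [6.5])[0]
poly_pred = fit_and_predict(levels, salaries, 4, [6.5])[0]

print(f"degree 1 prediction for level 6.5: {linear_pred:.2f}")
print(f"degree 4 prediction for level 6.5: {poly_pred:.2f}")
```

The new value 6.5 is expanded with the same function that built the training matrix, which is the reason the script should call `transform` rather than refit the feature generator at prediction time.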
Position,Level,Salary
Business Analyst,1,45000
Junior Consultant,2,50000
Senior Consultant,3,60000
Manager,4,80000
Country Manager,5,110000
Region Manager,6,150000
Partner,7,200000
Senior Partner,8,300000
C-level,9,500000
CEO,10,1000000
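To make the column selection in the script concrete, the sketch below parses the table above with the standard library and mirrors `dataset.iloc[:, 1:-1]` and `dataset.iloc[:, -1]` using list slices. It assumes the file is comma-separated with a `Position,Level,Salary` header; the `CSV_TEXT` name is only illustrative, and the script itself reads the file with pandas instead.

```python
import csv
import io

# Comma-separated copy of the table above (assumed layout of Position_Salaries.csv)
CSV_TEXT = """Position,Level,Salary
Business Analyst,1,45000
Junior Consultant,2,50000
Senior Consultant,3,60000
Manager,4,80000
Country Manager,5,110000
Region Manager,6,150000
Partner,7,200000
Senior Partner,8,300000
C-level,9,500000
CEO,10,1000000
"""

rows = list(csv.reader(io.StringIO(CSV_TEXT)))
header, records = rows[0], rows[1:]

# Like dataset.iloc[:, 1:-1]: every column except the first and the last,
# kept as a list of rows so the result stays two-dimensional.
X = [[float(value) for value in record[1:-1]] for record in records]

# Like dataset.iloc[:, -1]: only the last column, as a flat list.
y = [float(record[-1]) for record in records]

print(header[1:-1], X[:3])
print(header[-1], y[:3])
```

The Position column is dropped because Level already encodes the same information numerically, so only Level is used as the feature.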