@arangates
Created June 5, 2017 04:21
A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0). Note that the code below uses different names: m for the slope and b for the intercept.
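# Python 3 version: input() and print() replace raw_input and the print statement.

# Define the data. A list is used so that repeated points are not silently dropped.
data = []
count = int(input("Enter the number of data points: "))
if count < 2:
    raise SystemExit("At least two data points are needed to fit a line.")
for i in range(count):
    x = float(input("X" + str(i + 1) + ": "))
    y = float(input("Y" + str(i + 1) + ": "))
    data.append((x, y))

# Find the average x and y
avgx = sum(x for x, y in data) / len(data)
avgy = sum(y for x, y in data) / len(data)

# Find the sums of squared x deviations and of x-y cross deviations
totalxx = 0.0
totalxy = 0.0
for x, y in data:
    totalxx += (x - avgx) ** 2
    totalxy += (x - avgx) * (y - avgy)

# The slope is undefined when every x value is the same
if totalxx == 0:
    raise SystemExit("Cannot fit a line: all X values are identical.")

# Slope/intercept form
m = totalxy / totalxx
b = avgy - m * avgx
print("Best fit line:")
print("y = " + str(m) + "x + " + str(b))
x = float(input("Enter a value to calculate: "))
print("y = " + str(m * x + b))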
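
The same calculation can also be wrapped in a function so it can be checked without typing values at a prompt. The following is a minimal sketch written for Python 3. The name `fit_line` and its error handling are assumptions made for illustration and are not part of the original gist.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points.

    points is a sequence of (x, y) pairs. fit_line is a hypothetical helper
    name, not something defined in the original script.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two data points")
    # Averages of x and y
    avgx = sum(x for x, _ in points) / n
    avgy = sum(y for _, y in points) / n
    # Sum of squared x deviations and sum of x-y cross deviations
    totalxx = sum((x - avgx) ** 2 for x, _ in points)
    totalxy = sum((x - avgx) * (y - avgy) for x, y in points)
    if totalxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = totalxy / totalxx
    b = avgy - m * avgx
    return m, b


# Worked example: these four points lie exactly on y = 2x + 1.
m, b = fit_line([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)])
print("y = " + str(m) + "x + " + str(b))  # prints: y = 2.0x + 1.0
```

For these points the averages are 1.5 and 4.0, the two sums are 5.0 and 10.0, so the slope is 10.0/5.0 = 2.0 and the intercept is 4.0 - 2.0*1.5 = 1.0.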