@garymanley
Created December 29, 2017 21:36
Linear Regression
# -*- coding: utf-8 -*-
"""
Created on Sun Dec 28 16:19:53 2017
@author: garym
"""
import pandas as pd
import pyodbc as py
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
### set up database connection
conn_str = (
    r'Driver={SQL Server};'
    r'Server=localhost\SQLEXPRESS;'
    r'Database=RUNNING;'
    r'Trusted_Connection=yes;'
)
cnxn = py.connect(conn_str)
cursor = cnxn.cursor()
### Extract data from database
dfSummary = pd.read_sql("SELECT [Time] [fiveK], round(([Time]/3.11)*1.5*0.96,2) [Handicap] FROM [RUNNING].[dbo].[PloddersTest12122017]" , cnxn )
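### Sketch (not part of the original query): the same Handicap formula in plain Python,
### handy for sanity-checking rows without a database. handicap_from_time is a
### hypothetical helper name; SQL Server's ROUND and Python's round() can differ on
### exact ties, so treat the last digit as approximate.
def handicap_from_time(five_k_time):
    """Mirror round(([Time]/3.11)*1.5*0.96, 2) from the query above."""
    return round((five_k_time / 3.11) * 1.5 * 0.96, 2)
### Handicap is a fixed multiple of the 5K time (1.5*0.96/3.11, about 0.463),
### so the fitted line should come out with roughly that slope and an intercept near 0.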
### create data to predict (not strictly needed, the numbers could be typed in directly, but this is for future use)
dfTest = pd.read_sql("select 20 test" , cnxn )
## prep data
test = dfTest.values.reshape(1, -1)
### Set up data to be used for linear regression (x axis)
X = dfSummary['fiveK'] ### put as many variables here as needed
X = X.values.reshape(-1, 1)
### set up the target values to be predicted (y axis)
y = dfSummary['Handicap']
y = y.values.reshape(-1, 1)
## set up train test split (test size is kept small because the sample dataset is small)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=101)
## run and fit linear regression model
lm = LinearRegression()
lm.fit(X_train,y_train)
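### Sketch: read the fitted line back out of the model (describe_line is a hypothetical helper).
### Assumes a single feature and a 2-D y as set up above, so coef_ is indexed [0][0]
### and intercept_ is indexed [0]; other shapes would need different indexing.
def describe_line(model):
    slope = float(model.coef_[0][0])
    intercept = float(model.intercept_[0])
    return slope, intercept
### e.g. slope, intercept = describe_line(lm); print(slope, intercept)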
## predict and print prediction on the test data
predictions = lm.predict(X_test)
print(predictions)
## predict and print on my mock data
predictions = lm.predict(test)
print(predictions)