@changkun
Created June 27, 2017 06:11
Analytic solution for linear regression, implemented in Python
# Manual implementation using numpy
import numpy as np
# Generate a random dataset
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
# Notes on the functions used:
# 1. np.c_[np.array([1,2,3]), np.array([4,5,6])] --> array([[1, 4],[2, 5],[3, 6]])
# 2. np.ones((2, 1)) --> array([[ 1.], [ 1.]])
# 3. np.linalg.inv() computes the matrix inverse
# 4. X.T is the transpose
# 5. X.dot(Y) is matrix multiplication
X_b = np.c_[np.ones((100, 1)), X] # add x0 = 1 to every sample
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
X_new = np.array([[0], [2]])
X_new_b = np.c_[np.ones((2, 1)), X_new] # add x0 = 1 to every sample
y_predict = X_new_b.dot(theta_best)
print('theta_best via numpy: ', theta_best)
# Use the linear regression model from Scikit-Learn
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)
y_predict_sk = lin_reg.predict(X_new)
print('theta_best via sklearn: ', np.c_[lin_reg.intercept_, lin_reg.coef_].T)
# Plot the results with matplotlib
import matplotlib.pyplot as plt
plt.plot(X_new, y_predict, 'y-', lw=3)
plt.plot(X_new, y_predict_sk, 'r:', lw=3)
plt.plot(X, y, 'b.')
plt.axis([0, 2, 0, 15])
plt.show()
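The script above solves the normal equation by explicitly inverting X_b.T.dot(X_b). As a minimal sketch, and not part of the original gist, the same coefficients can usually be obtained without forming the inverse by using np.linalg.lstsq or np.linalg.pinv, which tend to behave better when the columns of X_b are nearly collinear. The seeded generator and the names theta_normal, theta_lstsq, and theta_pinv are assumptions introduced here for illustration.

```python
import numpy as np

# Seeded generator so the comparison is reproducible (an assumption; the
# script above uses the unseeded global np.random functions).
rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))

# Add the bias column x0 = 1 to every sample, as in the script above.
X_b = np.c_[np.ones((100, 1)), X]

# Normal equation with an explicit inverse (the approach used above).
theta_normal = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)

# Least-squares solver: returns the solution, residuals, rank, and singular values.
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X_b, y, rcond=None)

# Moore-Penrose pseudo-inverse applied to y.
theta_pinv = np.linalg.pinv(X_b).dot(y)

print('theta via normal equation:', theta_normal.ravel())
print('theta via lstsq:          ', theta_lstsq.ravel())
print('theta via pinv:           ', theta_pinv.ravel())
```

On a well-conditioned problem like this one, all three approaches should agree to within floating-point tolerance; the difference only starts to matter when X_b.T.dot(X_b) is close to singular.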