sklearn Notes 28: Principles of Linear Regression

Updated: 2024-10-09 00:48:55


Full code: sklearn code 20, 1 - Linear regression for Boston house price prediction

Derivatives commonly used in linear regression:

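Basic pattern of linear regression: there is usually a straight line that passes through (or close to) the data points; that line is the linear regression equation.
The essence of machine learning: solving equations.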

Concepts related to the least squares method:

Definition of the least squares method:
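
The figure that originally followed is not reproduced here. As a hedged summary of the standard textbook formulation (the symbols $X$, $y$, $w$, $m$ are assumed notation, not taken from the post), least squares picks the weights that minimise the sum of squared differences between predictions and targets:

```latex
\min_{w} \; J(w) \;=\; \lVert Xw - y \rVert_2^2 \;=\; \sum_{i=1}^{m} \left( x_i^{\top} w - y_i \right)^2
```

Here $X$ is the $m \times n$ data matrix with rows $x_i^{\top}$, $y$ is the vector of targets, and $w$ is the vector of slopes.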

Import the required packages

import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn import datasets

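Load the data

# Boston house prices
# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older version
boston = datasets.load_boston()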
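
For readers on a scikit-learn release where `load_boston` is no longer available, the rest of these notes can still be followed with any numeric regression dataset. The sketch below builds a synthetic stand-in with the same shape (506 samples, 13 features). The helper name `make_housing_like_data`, the weights, and the noise level are all hypothetical and chosen only for illustration; the values are not real house prices.

```python
import numpy as np

def make_housing_like_data(n_samples=506, n_features=13, noise=1.0, seed=0):
    """Hypothetical stand-in for datasets.load_boston(): synthetic linear data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    true_w = rng.uniform(-3, 3, size=n_features)   # made-up slopes
    true_b = 22.5                                  # made-up intercept
    y = X @ true_w + true_b + rng.normal(scale=noise, size=n_samples)
    # Mimic the dict-style access used in these notes: boston['data'], boston['target']
    return {"data": X, "target": y}

boston_like = make_housing_like_data()
print(boston_like["data"].shape, boston_like["target"].shape)
```

Assigning `boston = boston_like` would let the later cells run unchanged, although the fitted coefficients will of course differ from those of the real dataset.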

Construct a simple one-feature dataset

X = np.linspace(0,10,50).reshape(-1,1)
X

y = np.random.randint(2,8,size = 1)*X
y

Compute the slope

y/X   # the first entry is 0/0 = nan because X[0] is 0; every other entry equals the slope

lr = LinearRegression()
lr.fit(X,y)
# coefficient: the slope
# w -> weight
lr.coef_

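# The same slope computed with matrix operations from linear algebra (the normal equation)
np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

# Matrix multiplication is not commutative, but it is associative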
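
A minimal sketch of the two points above, assuming a fixed slope of 3 instead of a random one so the result is reproducible. The helper `normal_equation` is a name introduced here for illustration; it uses `np.linalg.solve` rather than an explicit inverse, which is a common numerical preference and not what the notes do.

```python
import numpy as np

def normal_equation(X, y):
    """Solve (X^T X) w = X^T y for w; assumes X has full column rank."""
    return np.linalg.solve(X.T @ X, X.T @ y)

X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 3 * X                                   # slope fixed at 3 (assumption)

w = normal_equation(X, y)
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]   # NumPy's least-squares solver
print(w, w_lstsq)

# Matrix products: order matters, grouping does not
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 3]])
commutes = np.array_equal(A @ B, B @ A)
associates = np.array_equal((A @ B) @ C, A @ (B @ C))
print(commutes, associates)
```

Because the product is associative, `inv(X.T @ X) @ X.T @ y` can be evaluated in any grouping, but the factors must stay in that order.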

Derivation
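
The original derivation figure is not reproduced here. A standard derivation of the matrix formula used in the code, written with assumed notation ($X$ the data matrix, $y$ the targets, $w$ the weights) and assuming $X^{\top}X$ is invertible, goes as follows:

```latex
\begin{aligned}
J(w) &= (Xw - y)^{\top}(Xw - y)
      = w^{\top}X^{\top}Xw - 2\,w^{\top}X^{\top}y + y^{\top}y \\
\nabla_w J(w) &= 2\,X^{\top}Xw - 2\,X^{\top}y = 0 \\
X^{\top}Xw &= X^{\top}y \\
w &= (X^{\top}X)^{-1}X^{\top}y
\end{aligned}
```

The last line is exactly the expression `np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)`.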

Worked example

a = np.random.randint(0,10,size = (3,2))
a

a.T   # transpose

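np.linalg.inv(a)    # a is 3x2, not square, so it has no inverse; this call raises LinAlgError

b = a.T.dot(a)    # multiply the transpose by a to get a square 2x2 matrix (a.dot(a.T) would be 3x3 but singular, rank at most 2)
np.linalg.inv(b)   # inverse; it exists when the columns of a are linearly independent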
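
The sketch below makes the shapes and ranks explicit, using a fixed matrix instead of a random one (an assumption made so the outcome is deterministic). It shows that a non-square matrix cannot be inverted, that `a.T @ a` is a small invertible square matrix, and that the product in the other order is square but rank deficient.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])           # shape (3, 2)

try:
    np.linalg.inv(a)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False                 # inv only accepts square matrices

gram = a.T @ a                       # shape (2, 2): the matrix used in the normal equation
gram_inv = np.linalg.inv(gram)

outer = a @ a.T                      # shape (3, 3), but its rank cannot exceed 2
print(inverted)
print(np.linalg.matrix_rank(gram), np.linalg.matrix_rank(outer))
```

This is why the normal equation uses `X.T.dot(X)`: for a data matrix with more rows than columns, that is the product that can be full rank.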

The Boston house price data

boston

Parameters of the linear regression function

X = boston['data']
y = boston['target']

X.shape

Train on the data and predict

from sklearn.model_selection import train_test_split
# Split the dataset, keeping 80% of the samples for training
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2)
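
To make the split less of a black box, here is a rough NumPy illustration of what a shuffled 80/20 split does. It is not scikit-learn's implementation; `simple_train_test_split` and its `seed` argument are hypothetical names used only in this sketch.

```python
import numpy as np

def simple_train_test_split(X, y, test_size=0.2, seed=0):
    """Shuffle the row indices, then hold out a fraction of the rows for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(np.ceil(len(X) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X_demo = np.arange(200).reshape(100, 2)   # 100 samples, 2 features
y_demo = np.arange(100)
X_tr, X_te, y_tr, y_te = simple_train_test_split(X_demo, y_demo, test_size=0.25)
print(X_tr.shape, X_te.shape)
```

Each row stays paired with its target because the same index arrays are applied to both `X` and `y`.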

Fitting without an intercept

lr = LinearRegression(fit_intercept=False)
lr.fit(X_train,y_train)
# The number of slopes equals the number of features; as shown above, there are 13 features
display(lr.coef_,lr.intercept_)

# Predictions from the model
lr.predict(X_test).round(2)[:25]   # round to two decimals and show only the first 25 values

w = lr.coef_
w

# Result computed directly from the formula
X_test.dot(w).round(2)[:25]

# The 'true' house prices, for comparison (this is only a prediction exercise on a standard dataset)
y_test[:25]
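
The comparison above can be checked programmatically. The sketch below uses synthetic data with 13 features as a stand-in (an assumption, since the Boston loader may be unavailable) and confirms that, with `fit_intercept=False`, the model's predictions are just the matrix product of the features and the learned weights.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 13))              # synthetic stand-in with 13 features
true_w = rng.uniform(-2, 2, size=13)        # made-up weights
y = X @ true_w + rng.normal(scale=0.1, size=100)

lr = LinearRegression(fit_intercept=False)
lr.fit(X, y)

manual = X @ lr.coef_                       # the formula: y_hat = X w
same = np.allclose(lr.predict(X), manual)
print(same, lr.intercept_)
```

With no intercept the fitted line is forced through the origin, so `lr.intercept_` is reported as `0.0`.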

Prediction with an intercept

lr = LinearRegression(fit_intercept=True)
lr.fit(X_train,y_train)
display(lr.coef_,lr.intercept_)

lr.predict(X_test).round(2)[:15]

# Result of building the equation from the slopes and the intercept and evaluating it
(X_test.dot(lr.coef_) + lr.intercept_).round(2)[:15]
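
One common way to see where the intercept comes from is to append a column of ones to the feature matrix and reuse the normal equation; the last weight then plays the role of the intercept. The sketch below does this on noise-free synthetic data (the weights and intercept are made up), so the recovered values can be compared with the ones used to generate it. This illustrates the idea and is not a description of how scikit-learn computes the fit internally.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])         # made-up slopes
true_b = 4.0                                # made-up intercept
y = X @ true_w + true_b                     # noise-free, so the fit is exact

# Append a column of ones so the intercept becomes one more weight
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
theta = np.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y)
w, b = theta[:-1], theta[-1]

pred = X @ w + b                            # slopes times features, plus intercept
print(w.round(4), round(float(b), 4))
```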


Published: 2024-03-05 21:15:13
