Linear regression implementation: weight values grow to Inf

Problem description

I am implementing a program that performs linear regression on the following dataset:

www.rossmanchance/iscam2/data/housing.txt

My program is as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def abline(X,theta,Y):
    yValues=calcH(X,theta)
    plt.xlim(0, 5000)
    plt.ylim(0, 2000000)
    plt.xlabel("sqft")
    plt.ylabel("price")
    plt.gca().set_aspect(0.001, adjustable='box')
    plt.plot(X,Y,'.',X, yValues, '-')
    plt.show()

def openFile(fileR):
    f=pd.read_csv(fileR,sep="\t")
    header=f.columns.values
    prediction=f["price"]
    X=f["sqft"]
    gradientDescent(0.0005,100,prediction,X)

def calcH(X,theta):
    h=np.dot(X,theta)
    return h

def calcC(X,Y,theta):
    d=((calcH(X,theta)-Y)**2).mean()/2
    return d

def gradientDescent(learningRate,itera, Y, X):
    t0=[]
    t1=[]
    cost=[]
    theta=np.zeros(2)
    X=np.column_stack((np.ones(len(X)),X))
    for i in range(itera):
        h_theta=calcH(X,theta)
        theta0=theta[0]-learningRate*(Y-h_theta).mean()
        theta1=theta[1]-learningRate*((Y-h_theta)*X[:,1]).mean()
        theta=np.array([theta0,theta1])
        j=calcC(X,Y,theta)
        t0.append(theta0)
        t1.append(theta1)
        cost.append(j)
        if (i%10==0):
            print ("iteration ",i,"cost ",j,"theta ",theta)
    abline(X,theta,Y)

The problem is that the values of theta end up as Inf. I tested with only 3 iterations, and some of the values are as follows:

iteration 0 cost 9.948977633931098e+21 theta [-2.47365759e+04 -6.10382173e+07]
iteration 1 cost 7.094545903263138e+32 theta [-6.46495395e+09 -1.62995849e+13]
iteration 2 cost 5.059070733255204e+43 theta [-1.72638812e+15 -4.35260862e+18]

I would like to predict the price based on the variable sqft. I am basically following the formulas given by Andrew Ng in his Coursera ML course, and by differentiating the cost term I derived the update rule used in the code.
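For reference, the cost function and gradient descent update for one feature can be sketched as follows. This is the standard textbook form, consistent with calcC in the program above, and not necessarily the exact expression the asker derived; the symbols $m$ (number of examples), $\alpha$ (learning rate) and $x_0^{(i)} = 1$ are assumed notation.

```latex
\begin{aligned}
h_\theta(x) &= \theta_0 + \theta_1 x \\
J(\theta_0, \theta_1) &= \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
\frac{\partial J}{\partial \theta_j} &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}, \qquad j \in \{0, 1\} \\
\theta_j &:= \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\end{aligned}
```

Note that the residual is $h_\theta(x^{(i)}) - y^{(i)}$. Writing it as $y^{(i)} - h_\theta(x^{(i)})$ while keeping the minus sign in front of $\alpha$ moves the parameters up the gradient instead of down.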

Update: I have added a function to plot my data and, strangely, the plots I get are not correct, because my predictions seem to keep going up. But when I plot the relationship between sqft and price, it is clearly linear.

What am I doing wrong?

Thanks

Recommended answer

I replicated your results. Apart from some stylistic issues and the swapped order of (Y-h_theta) and (h_theta - Y) (as pointed out in one of the comments), the code itself is correct. The problem is that the numbers are massive, so every update overshoots the minimum: the parameters oscillate between extremes, each step trying to "counteract" the previous one with an even bigger step in the opposite direction. A much lower learning rate could work. In real-world applications you would also normalize your data to address some of these issues.
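To make this concrete, here is a minimal sketch of both points, not a drop-in replacement for the program in the question. It uses synthetic data as a stand-in for housing.txt (the values and the names standardize and gradient_descent are made up for illustration), computes the residual as h_theta - Y, and standardizes sqft before running gradient descent. The same routine on the raw sqft values with the question's learning rate of 0.0005 is included to show the blow-up.

```python
import numpy as np

def standardize(x):
    """Rescale x to zero mean and unit variance; also return the mean and std used."""
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma, mu, sigma

def gradient_descent(x, y, learning_rate, iterations):
    """Batch gradient descent for the model y ~ theta[0] + theta[1] * x."""
    X = np.column_stack((np.ones(len(x)), x))
    theta = np.zeros(2)
    for _ in range(iterations):
        error = X @ theta - y  # h_theta - Y (not Y - h_theta)
        theta = theta - learning_rate * (X.T @ error) / len(y)
    return theta

# Synthetic stand-in for the housing data (hypothetical values, not the real file).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 5000, size=200)
price = 50_000 + 200 * sqft + rng.normal(0, 20_000, size=200)

# Raw feature with the learning rate from the question: every step overshoots.
with np.errstate(over="ignore", invalid="ignore"):
    theta_raw = gradient_descent(sqft, price, learning_rate=0.0005, iterations=100)

# Standardized feature: a moderate learning rate converges.
sqft_scaled, mu, sigma = standardize(sqft)
theta_scaled = gradient_descent(sqft_scaled, price, learning_rate=0.1, iterations=500)

# Map the fitted parameters back to the original sqft scale.
slope = theta_scaled[1] / sigma
intercept = theta_scaled[0] - slope * mu

# Closed-form least squares fit for comparison.
ref_slope, ref_intercept = np.polyfit(sqft, price, 1)

print("raw feature, lr=0.0005:", theta_raw)
print("gradient descent:      ", intercept, slope)
print("least squares:         ", ref_intercept, ref_slope)
```

On the raw feature the effective step is proportional to the mean of sqft squared, which is in the millions, so each update is thousands of times too large and the parameters overflow. After standardization the curvature of the cost is about 1 in both directions, so a learning rate around 0.1 is comfortably stable.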
