I have recently been playing around with R's regression functions/packages. I'm wondering, is there a way that I could force my regression coefficients to sum to a particular value? I understand that forcing the coefficients might create a poor fit, but nevertheless I'm looking for a way to do it.
Sorry, I don't have much code as I have had no success doing this, but I'm trying for something like this:
    b
         [,1]
    [1,]    2
    [2,]    6
    [3,]    4
    [4,]    7
    [5,]    8

    A
         [,1] [,2] [,3]
    [1,]    2    3    4
    [2,]    7    5    5
    [3,]    5    5    3
    [4,]    7    8    9
    [5,]    8    9    9

I want a function to create a model for b using A:
    constrainedcoefs <- function(A, b, coefsum) {
      fit <- nnls(A, b)
      # ideally with sum(coef(fit)) == coefsum
    }

Does anyone know of a way to force the sum of coef(fit) to be some user-defined value, or of a package with this feature? I have only found packages that let me define upper and lower bounds, and some discussion of ways to get sum(coef(fit)) = 1.
Best answer
If you have input data like
    dd <- structure(list(b = c(2L, 6L, 4L, 7L, 8L), A1 = c(2L, 7L, 5L, 7L, 8L),
        A2 = c(3L, 5L, 5L, 8L, 9L), A3 = c(4L, 5L, 3L, 9L, 9L)),
        .Names = c("b", "A1", "A2", "A3"), class = "data.frame",
        row.names = c(NA, -5L))

You could try using nls with

    nls(b ~ a1*A1 + a2*A2 + (1-a1-a2)*A3, dd, lower=0, upper=1,
        algorithm="port", start=c(a1=.3, a2=.3))

Here we require the coefficients to sum to 1, so we really only have two free parameters: once we know a1 and a2, a3 is 1 - a1 - a2. We use the "port" algorithm because it allows us to specify lower and upper bounds, which keep a1 and a2 from going below 0 or above 1. With this data I got

    Nonlinear regression model
      model: b ~ a1 * A1 + a2 * A2 + (1 - a1 - a2) * A3
       data: dd
        a1     a2 
    0.7647 0.0000 
     residual sum-of-squares: 1.059

So the parameters are (0.7647, 0.0000, 0.2353).
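The question asked for an arbitrary user-defined sum, not just 1. Here is a minimal sketch of how the same substitution might be generalized. The helper name `constrained_coefs` and its arguments are hypothetical, not from any package. It assumes the data frame has columns named b, A1, A2 and A3, and that coefsum is positive so that the bounds 0 and coefsum make sense. As in the call above, the bounds apply only to a1 and a2; a3 is whatever remains.

```r
# Data from the answer above, rebuilt with data.frame() for readability.
dd <- data.frame(
  b  = c(2, 6, 4, 7, 8),
  A1 = c(2, 7, 5, 7, 8),
  A2 = c(3, 5, 5, 8, 9),
  A3 = c(4, 5, 3, 9, 9)
)

# Hypothetical helper: fit b ~ a1*A1 + a2*A2 + a3*A3 subject to
# a1 + a2 + a3 == coefsum, by substituting a3 = coefsum - a1 - a2.
# Assumes coefsum > 0 and columns named b, A1, A2, A3.
constrained_coefs <- function(data, coefsum) {
  fit <- nls(
    b ~ a1 * A1 + a2 * A2 + (coefsum - a1 - a2) * A3,
    data      = data,
    start     = c(a1 = coefsum / 3, a2 = coefsum / 3),
    lower     = 0,          # bounds on the free parameters a1 and a2 only
    upper     = coefsum,
    algorithm = "port"
  )
  est <- coef(fit)
  # Recover the third coefficient from the sum constraint.
  c(est, a3 = coefsum - sum(est))
}

res <- constrained_coefs(dd, coefsum = 1)
print(res)
print(sum(res))
```

With coefsum = 1 this is the same model as the nls call above, so it should reproduce those estimates. For other targets, only the constant in the formula and the upper bound change.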
Of course, this type of regression seems very unusual so be very careful about the inferences you make based on model fit.
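Since the model is linear in the coefficients, the same substitution can also be checked with plain lm when the 0-to-1 bounds are not needed. This is a sketch of a different, unbounded fit, not the nls approach above: it moves coefsum * A3 to the left-hand side and regresses on the differences A1 - A3 and A2 - A3. The variable `coefsum` is an assumed name for the target sum.

```r
# Same data as in the answer above.
dd <- data.frame(
  b  = c(2, 6, 4, 7, 8),
  A1 = c(2, 7, 5, 7, 8),
  A2 = c(3, 5, 5, 8, 9),
  A3 = c(4, 5, 3, 9, 9)
)

coefsum <- 1  # assumed target for a1 + a2 + a3

# b = a1*A1 + a2*A2 + (coefsum - a1 - a2)*A3 rearranges to
# b - coefsum*A3 = a1*(A1 - A3) + a2*(A2 - A3), a no-intercept linear model.
fit <- lm(I(b - coefsum * A3) ~ 0 + I(A1 - A3) + I(A2 - A3), data = dd)

est   <- unname(coef(fit))
coefs <- c(a1 = est[1], a2 = est[2], a3 = coefsum - sum(est))
print(coefs)
print(sum(coefs))  # equals coefsum by construction
```

On this data the unbounded fit puts a2 slightly below zero (about -0.05). That is consistent with the bounded nls fit pinning a2 at its lower bound of 0.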