How should I get the coefficients of a lasso model?

Updated: 2024-10-17 11:26:54

Problem description

Here is my code:

    library(MASS)
    library(caret)

    df <- Boston
    set.seed(3721)
    # Note: createFolds() returns held-out indices by default, so passing them
    # as `index` trains each resample on roughly 51 rows (see the output below)
    cv.10.folds <- createFolds(df$medv, k = 10)
    lasso_grid <- expand.grid(fraction = c(1, 0.1, 0.01, 0.001))
    lasso <- train(medv ~ ., data = df,
                   preProcess = c("center", "scale"),
                   method = "lasso",
                   tuneGrid = lasso_grid,
                   trControl = trainControl(method = "cv", number = 10,
                                            index = cv.10.folds))
    lasso

Unlike with a linear model, I cannot find the coefficients of the lasso regression model in summary(lasso). How can I get them? Or should I use glmnet instead?

Recommended answer

When you train with method = "lasso", caret calls enet from the elasticnet package:

    lasso$finalModel$call
    elasticnet::enet(x = as.matrix(x), y = y, lambda = 0)

The vignette states:

"The LARS-EN algorithm computes the complete elastic net solution simultaneously for ALL values of the shrinkage parameter in the same computational cost as a least squares fit"

lasso$finalModel$beta.pure holds all 16 sets of coefficients, one row for each of the 16 L1 norm values stored in lasso$finalModel$L1norm:

    length(lasso$finalModel$L1norm)
    [1] 16
    dim(lasso$finalModel$beta.pure)
    [1] 16 13
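To see how these solutions differ, it can help to count the non-zero coefficients at each step of the path. The following is a minimal sketch rather than the answer's own code. It assumes the MASS and elasticnet packages are installed and, as a simplification, fits enet directly on predictors standardized with scale() instead of going through caret. The names x, y, fit and path are illustrative.

```r
library(MASS)        # Boston data set
library(elasticnet)  # enet()

# Assumption: standardize the predictors with scale() to mimic caret's
# preProcess = c("center", "scale"); this bypasses caret entirely.
x <- scale(as.matrix(Boston[, names(Boston) != "medv"]))
y <- Boston$medv

# lambda = 0 gives the pure lasso path, as in lasso$finalModel$call
fit <- enet(x, y, lambda = 0)

# beta.pure has one row per step; count the non-zero coefficients in each
path <- data.frame(
  L1norm   = as.numeric(fit$L1norm),
  n_active = rowSums(fit$beta.pure != 0)
)
path
```

The path should start with no active predictors and add (or occasionally drop) predictors as the L1 norm grows.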

You can also see them with predict:

    predict(lasso$finalModel, type = "coef")
    $s
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16

    $fraction
     [1] 0.00000000 0.06666667 0.13333333 0.20000000 0.26666667 0.33333333
     [7] 0.40000000 0.46666667 0.53333333 0.60000000 0.66666667 0.73333333
    [13] 0.80000000 0.86666667 0.93333333 1.00000000

    $mode
    [1] "step"

    $coefficients
              crim        zn       indus      chas        nox       rm        age
    0   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 0.000000 0.00000000
    1   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 0.000000 0.00000000
    2   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 1.677765 0.00000000
    3   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 2.571071 0.00000000
    4   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 2.716138 0.00000000
    5   0.00000000 0.0000000  0.00000000 0.2586083  0.0000000 2.885615 0.00000000
    6  -0.05232643 0.0000000  0.00000000 0.3543411  0.0000000 2.953605 0.00000000
    7  -0.13286554 0.0000000  0.00000000 0.4095229  0.0000000 2.984026 0.00000000
    8  -0.21665925 0.0000000  0.00000000 0.5196189 -0.5933941 3.003512 0.00000000
    9  -0.32168140 0.3326103  0.00000000 0.6044308 -1.0246080 2.973693 0.00000000
    10 -0.33568474 0.3771889 -0.02165730 0.6165190 -1.0728128 2.967696 0.00000000
    11 -0.42820289 0.4522827 -0.09212253 0.6407298 -1.2474934 2.932427 0.00000000
    12 -0.62605363 0.7005114  0.00000000 0.6574277 -1.5655601 2.832726 0.00000000
    13 -0.88747102 1.0150162  0.00000000 0.6856705 -1.9476465 2.694820 0.00000000
    14 -0.91679342 1.0613165  0.09956489 0.6837833 -2.0217269 2.684401 0.00000000
    15 -0.92906457 1.0826390  0.14103943 0.6824144 -2.0587536 2.676877 0.01948534

The hyperparameter that caret tunes is the fraction of the maximum L1 norm. In the result you provided, the selected value is 1, i.e. the maximum:

    lasso
    The lasso

    506 samples
     13 predictor

    Pre-processing: centered (13), scaled (13)
    Resampling: Cross-Validated (10 fold)
    Summary of sample sizes: 51, 51, 51, 50, 51, 50, ...
    Resampling results across tuning parameters:

      fraction  RMSE      Rsquared   MAE
      0.001     9.182599  0.5075081  6.646013
      0.010     9.022117  0.5075081  6.520153
      0.100     7.597607  0.5572499  5.402851
      1.000     6.158513  0.6033310  4.140362

    RMSE was used to select the optimal model using the smallest value.
    The final value used for the model was fraction = 1.

To get the coefficients for the optimal fraction:

    predict(lasso$finalModel, type = "coef", s = 16)
    $s
    [1] 16

    $fraction
    [1] 1

    $mode
    [1] "step"

    $coefficients
           crim          zn       indus        chas         nox          rm
    -0.92906457  1.08263896  0.14103943  0.68241438 -2.05875361  2.67687661
            age         dis         rad         tax     ptratio       black
     0.01948534 -3.10711605  2.66485220 -2.07883689 -2.06264585  0.85010886
          lstat
    -3.74733185
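Hard-coding s = 16 works here only because the selected fraction is 1, which corresponds to the last step of the path. A more general approach, sketched below, passes the tuned fraction itself to predict with mode = "fraction". The sketch assumes the MASS, caret and elasticnet packages are installed, and it refits the model from the question so that it runs on its own. The names best_fraction and best_coef are illustrative.

```r
library(MASS)   # Boston data set
library(caret)  # train(); method = "lasso" needs the elasticnet package

df <- Boston
set.seed(3721)
cv.10.folds <- createFolds(df$medv, k = 10)
lasso <- train(medv ~ ., data = df,
               preProcess = c("center", "scale"),
               method = "lasso",
               tuneGrid = expand.grid(fraction = c(1, 0.1, 0.01, 0.001)),
               trControl = trainControl(method = "cv", number = 10,
                                        index = cv.10.folds))

# Fraction selected by resampling, stored in the one-row bestTune data frame
best_fraction <- lasso$bestTune$fraction

# Evaluate the path at that fraction of the maximum L1 norm
best_coef <- predict(lasso$finalModel, type = "coefficients",
                     s = best_fraction, mode = "fraction")$coefficients

# Keep only the predictors the lasso did not shrink to zero
best_coef[best_coef != 0]
```

Because of preProcess = c("center", "scale"), these coefficients apply to the centered and scaled predictors, not to the raw columns of Boston.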