sklearn RidgeCV with sample_weight

Updated: 2024-10-25 18:22:35

Problem description

I'm trying to do a weighted ridge regression with sklearn. However, the code breaks when I call the fit method. The exception I get is:

Exception: Data must be 1-dimensional

But I'm sure (I checked with print statements) that the data I'm passing has the right shapes.

print temp1.shape    # (781, 21)
print temp2.shape    # (781,)
print weights.shape  # (781,)
result = RidgeCV(normalize=True).fit(temp1, temp2, sample_weight=weights)

What could be going wrong?

Here's the whole output:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-65-a5b1eba5d9cf> in <module>()
     22
     23
---> 24 result=RidgeCV(normalize=True).fit(temp2,temp1, sample_weight=weights)
     25
     26

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in fit(self, X, y, sample_weight)
    868                                   gcv_mode=self.gcv_mode,
    869                                   store_cv_values=self.store_cv_values)
--> 870             estimator.fit(X, y, sample_weight=sample_weight)
    871             self.alpha_ = estimator.alpha_
    872             if self.store_cv_values:

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in fit(self, X, y, sample_weight)
    793                               else alpha)
    794             if error:
--> 795                 out, c = _errors(weighted_alpha, y, v, Q, QT_y)
    796             else:
    797                 out, c = _values(weighted_alpha, y, v, Q, QT_y)

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in _errors(self, alpha, y, v, Q, QT_y)
    685         w = 1.0 / (v + alpha)
    686         c = np.dot(Q, self._diag_dot(w, QT_y))
--> 687         G_diag = self._decomp_diag(w, Q)
    688         # handle case where y is 2-d
    689         if len(y.shape) != 1:

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in _decomp_diag(self, v_prime, Q)
    672     def _decomp_diag(self, v_prime, Q):
    673         # compute diagonal of the matrix: dot(Q, dot(diag(v_prime), Q^T))
--> 674         return (v_prime * Q ** 2).sum(axis=-1)
    675
    676     def _diag_dot(self, D, B):

/usr/local/lib/python2.7/dist-packages/pandas/core/ops.pyc in wrapper(left, right, name)
    531             return left._constructor(wrap_results(na_op(lvalues, rvalues)),
    532                                      index=left.index, name=left.name,
--> 533                                      dtype=dtype)
    534     return wrapper
    535

/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
    209             else:
    210                 data = _sanitize_array(data, index, dtype, copy,
--> 211                                        raise_cast_failure=True)
    212
    213             data = SingleBlockManager(data, index, fastpath=True)

/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   2683     elif subarr.ndim > 1:
   2684         if isinstance(data, np.ndarray):
-> 2685             raise Exception('Data must be 1-dimensional')
   2686         else:
   2687             subarr = _asarray_tuplesafe(data, dtype=dtype)

Exception: Data must be 1-dimensional

Solution

The error seems to be due to sample_weight being a pandas Series rather than a NumPy array:

import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV

# Random data with the same shapes as in the question
temp1 = pd.DataFrame(np.random.rand(781, 21))
temp2 = pd.Series(temp1.sum(1))
weights = pd.Series(1 + 0.1 * np.random.rand(781))

result = RidgeCV(normalize=True).fit(temp1, temp2, sample_weight=weights)
# Exception: Data must be 1-dimensional

If you use a numpy array instead, the error goes away:

result = RidgeCV(normalize=True).fit(temp1, temp2, sample_weight=weights.values)
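If the weights may arrive as a list, a Series, or an array, one way to apply the same workaround consistently is to convert them once before calling fit. The following is only a minimal sketch: `to_weight_array`, `X`, and `raw_weights` are hypothetical names introduced for illustration and are not part of scikit-learn or the original code. It relies on `np.asarray`, which returns a plain ndarray for array-like input, including a pandas Series.

```python
import numpy as np


def to_weight_array(sample_weight, n_samples):
    """Return sample_weight as a plain 1-D float ndarray of length n_samples.

    Hypothetical helper (not part of scikit-learn). np.asarray drops any
    pandas wrapper, so later arithmetic uses ordinary NumPy broadcasting.
    """
    weights = np.asarray(sample_weight, dtype=float)
    if weights.ndim != 1:
        raise ValueError(
            "sample_weight must be 1-dimensional, got ndim=%d" % weights.ndim
        )
    if weights.shape[0] != n_samples:
        raise ValueError(
            "expected %d weights, got %d" % (n_samples, weights.shape[0])
        )
    return weights


# Stand-in data with the same shapes as in the question (assumed values).
rng = np.random.RandomState(0)
X = rng.rand(781, 21)
raw_weights = list(1 + 0.1 * rng.rand(781))  # any 1-D array-like works here

weights = to_weight_array(raw_weights, n_samples=X.shape[0])
print(type(weights), weights.shape)
```

With the names from the answer, the call would then read RidgeCV(normalize=True).fit(temp1, temp2, sample_weight=to_weight_array(weights, len(temp1))), which should behave like passing weights.values.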

This seems to be a bug; I've opened a scikit-learn issue to report this.
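The traceback hints at why a Series trips this code path. At line 674, `(v_prime * Q ** 2)` multiplies a 1-D vector by a 2-D matrix, so the intermediate product is 2-D, and the pandas frames that follow show pandas trying to wrap that product in a Series. The sketch below reproduces only the NumPy side of that expression with small stand-in arrays (the values and the size `n` are assumptions, not data from the question) to show that with plain ndarrays the intermediate is 2-D and the final result is 1-D again.

```python
import numpy as np

rng = np.random.RandomState(0)
n = 5
Q = rng.rand(n, n)     # stand-in for the matrix Q in the traceback
v_prime = rng.rand(n)  # stand-in for the 1-D vector passed as v_prime

# Same expression as line 674 of ridge.py in the traceback.
intermediate = v_prime * Q ** 2        # broadcasts to a 2-D (n, n) array
diag_fast = intermediate.sum(axis=-1)  # reduces back to 1-D, shape (n,)

# Reference: the explicit diagonal of dot(Q, dot(diag(v_prime), Q^T)).
diag_ref = np.diag(Q.dot(np.diag(v_prime)).dot(Q.T))

print(intermediate.shape, diag_fast.shape)
print(np.allclose(diag_fast, diag_ref))
```

When `v_prime` is an ndarray, the 2-D intermediate is harmless. When it is a Series, the multiplication is dispatched to pandas, which, judging from the traceback, rejects the 2-D result before the sum can reduce it.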

Published: 2023-11-27 01:48:27

Article link: https://www.elefans.com/category/jswz/34/1636068.html