What is the rule/process when a function is passed to pandas apply() through a lambda vs. called directly? Examples below. Without the lambda, apparently, the entire Series ( df[column name] ) is passed to the "test" function, which raises an error when it tries to do a boolean operation on a Series.
If the same function is called via lambda it works. apply() iterates over the rows, passing each one in as "x", and x[ column name ] returns a single value for that column in the current row.
It's like lambda is removing a dimension. Anyone have an explanation or point to the specific doc on this? Thanks.
Example 1, with lambda, works fine:

    print("probPredDF columns:", probPredDF.columns)

    def test(x, y):
        if x == y:
            r = 'equal'
        else:
            r = 'not equal'
        return r

    probPredDF.apply(lambda x: test(x['yTest'], x['yPred']), axis=1).head()

Example 1 output:

    probPredDF columns: Index([0, 1, 'yPred', 'yTest'], dtype='object')
    Out[215]:
    0    equal
    1    equal
    2    equal
    3    equal
    4    equal
    dtype: object

Example 2, without lambda, raises the boolean-operation-on-Series error:

    print("probPredDF columns:", probPredDF.columns)

    def test(x, y):
        if x == y:
            r = 'equal'
        else:
            r = 'not equal'
        return r

    probPredDF.apply(test(probPredDF['yTest'], probPredDF['yPred']), axis=1).head()

Example 2 output:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Recommended answer
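There is nothing magic about a lambda. It is just a function that is defined inline and has no name. apply() expects a callable and, with axis=1, calls it once per row with that row as its single argument. In Example 2 you are not passing a function at all: test( probPredDF['yTest'], probPredDF['yPred'] ) is evaluated immediately, with the two whole columns, before apply() ever runs, and that is where the error comes from. You can use a named function anywhere a lambda is expected, but it must also take one parameter. You need to do something like...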
Define it as:

    def wrapper(x):
        return test(x['yTest'], x['yPred'])

Use it as:

    probPredDF.apply(wrapper, axis=1)