I am experimenting with the where() method in pandas. I ran the simple example from the documentation page, with other being a pd.Series, and I got NaNs that I cannot explain.

The example dataframe is:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])

The where() clause is:

    m = df % 3 == 0
    n = pd.Series([100, 200])
    df.where(m, n, axis = 1)

The method returns the following dataframe:

         A    B
    0  0.0  NaN
    1  NaN  3.0
    2  NaN  NaN
    3  6.0  NaN
    4  NaN  9.0

I was expecting to see 100 in A and 200 in B instead of NaNs.

Could you explain the NaNs? Your advice will be appreciated.
Best answer
Your Series, n, does not have the appropriate labels:

    n
    Out:
    0    100
    1    200
    dtype: int64

If you pass this as the other parameter with axis = 1, pandas aligns the index of the Series with the column labels of the DataFrame, so it would only be used on columns named 0 and 1. Since df has no such columns, the aligned replacement values are missing, and NaN is shown wherever the condition is False. However, if you change the labels:

    n.index = ['A', 'B']
    n
    Out:
    A    100
    B    200
    dtype: int64

Now it will work as you expect:

    df.where(m, n, axis = 1)
    Out:
         A    B
    0    0  200
    1  100    3
    2  100  200
    3    6  200
    4  100    9
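For anyone who wants to reproduce both behaviours in one script, here is a minimal sketch. The names n_unlabeled, n_labeled, aligned, misaligned and fixed are only illustrative and do not appear in the question. The reindex call is there just to make the label alignment visible; it is not necessarily how where() does it internally.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=["A", "B"])
m = df % 3 == 0

# Default integer labels (0, 1) do not match the column labels ('A', 'B').
n_unlabeled = pd.Series([100, 200])

# Looking the Series up by df's column labels finds nothing, so both
# entries come back as NaN. This mirrors the alignment described above.
aligned = n_unlabeled.reindex(df.columns)

# Replacement values are therefore missing wherever m is False.
misaligned = df.where(m, n_unlabeled, axis=1)

# Building the Series with the DataFrame's own column labels avoids
# having to relabel it afterwards.
n_labeled = pd.Series([100, 200], index=df.columns)
fixed = df.where(m, n_labeled, axis=1)

print(aligned)
print(misaligned)
print(fixed)
```

Passing index=df.columns when constructing the Series is equivalent to setting n.index = ['A', 'B'] afterwards, and it keeps working if the columns are later renamed.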