I have df:

    import pandas as pd

    df = pd.DataFrame({'a':[7,8,9], 'b':[1,3,5], 'c':[5,3,6]})
    print (df)
       a  b  c
    0  7  1  5
    1  8  3  3
    2  9  5  6

Then I rename the first value like this:

    df.columns.values[0] = 'f'

Everything seems fine:
    print (df)
       f  b  c
    0  7  1  5
    1  8  3  3
    2  9  5  6
    print (df.columns)
    Index(['f', 'b', 'c'], dtype='object')
    print (df.columns.values)
    ['f' 'b' 'c']

If I select b, it works fine:

    print (df['b'])
    0    1
    1    3
    2    5
    Name: b, dtype: int64

But if I select a, it returns column f:
    print (df['a'])
    0    7
    1    8
    2    9
    Name: f, dtype: int64

And if I select f, I get a KeyError:

    print (df['f'])
    #KeyError: 'f'
    print (df.info())
    #KeyError: 'f'

What is the problem? Can somebody explain it? Or is it a bug?
Recommended answer

You aren't expected to alter the values attribute.

Try df.columns.values = ['a', 'b', 'c'] and you get:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-61-e7e440adc404> in <module>()
    ----> 1 df.columns.values = ['a', 'b', 'c']

    AttributeError: can't set attribute
That's because pandas detects that you are trying to set the attribute and stops you.

However, it can't stop you from changing the underlying values object itself.

When you use rename, pandas follows up with a bunch of cleanup. I've pasted the source below.

Ultimately, what you've done is alter the values without initiating the cleanup. You can initiate it yourself with a follow-up call to _data.rename_axis (an example can be seen in the source below). This forces the cleanup to run, and then you can access ['f']:
    df._data = df._data.rename_axis(lambda x: x, 0, True)
    df['f']

    0    7
    1    8
    2    9
    Name: f, dtype: int64

Moral of the story: probably not a great idea to rename a column this way.
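For comparison, here is a minimal sketch of renaming through the public API instead of writing into columns.values. It reuses the df from the question; the names renamed and reassigned are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9], 'b': [1, 3, 5], 'c': [5, 3, 6]})

# Option 1: rename returns a new DataFrame with the mapped column labels.
renamed = df.rename(columns={'a': 'f'})

# Option 2: assign a complete new list of labels to the columns attribute
# (done on a copy here so the original df keeps its labels).
reassigned = df.copy()
reassigned.columns = ['f', 'b', 'c']

print(renamed['f'])
print(reassigned['f'])
```

Both routes go through pandas' own setters, so selecting ['f'] afterwards should behave like any normal column lookup.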
But this story gets weirder.

This is fine:

    df = pd.DataFrame({'a':[7,8,9], 'b':[1,3,5], 'c':[5,3,6]})
    df.columns.values[0] = 'f'
    df['f']

    0    7
    1    8
    2    9
    Name: f, dtype: int64

This is not fine:

    df = pd.DataFrame({'a':[7,8,9], 'b':[1,3,5], 'c':[5,3,6]})
    print(df)
    df.columns.values[0] = 'f'
    df['f']

    KeyError:
It turns out we can modify the values attribute prior to displaying df, and it will apparently run all the initialization upon the first display. If you display it prior to changing the values attribute, it will error out.
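One way to picture this order dependence is a label lookup that is built lazily on first use and then cached. The sketch below is a toy model in plain Python, not pandas code: LazyIndex, its values property and its get_loc method are hypothetical stand-ins, and the idea that displaying the frame triggers the first lookup is an assumption based on the behaviour shown above.

```python
class LazyIndex:
    """Toy index: a list of labels plus a lazily built label -> position map."""

    def __init__(self, labels):
        self._labels = list(labels)
        self._lookup = None  # built on first use, then reused as-is

    @property
    def values(self):
        # No setter is defined, so rebinding .values raises AttributeError,
        # but the returned list can still be mutated in place.
        return self._labels

    def get_loc(self, label):
        if self._lookup is None:
            self._lookup = {name: pos for pos, name in enumerate(self._labels)}
        return self._lookup[label]


# Case 1: mutate the labels before anything has built the lookup.
fresh = LazyIndex(['a', 'b', 'c'])
fresh.values[0] = 'f'
fresh_pos = fresh.get_loc('f')  # lookup is built now and already sees 'f'

# Case 2: something builds the lookup first, then the labels are mutated.
stale = LazyIndex(['a', 'b', 'c'])
stale.get_loc('b')              # builds and caches the lookup
stale.values[0] = 'f'
stale_pos_a = stale.get_loc('a')         # old label still resolves
stale_label = stale.values[stale_pos_a]  # but the stored label now reads 'f'
try:
    stale.get_loc('f')
    stale_has_f = True
except KeyError:
    stale_has_f = False         # new label is unknown to the cached lookup

# Rebinding the attribute itself is blocked, as with columns.values.
try:
    stale.values = ['x', 'y', 'z']
    rebinding_blocked = False
except AttributeError:
    rebinding_blocked = True
```

In this toy model the second case mirrors the symptoms in the question: selecting 'a' still works but reports the label 'f', and selecting 'f' raises a KeyError.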
Weirder still:

    df = pd.DataFrame({'a':[7,8,9], 'b':[1,3,5], 'c':[5,3,6]})
    print(df)
    df.columns.values[0] = 'f'
    df['f'] = 1
    df['f']

       f  f
    0  7  1
    1  8  1
    2  9  1

As if we didn't already know that this was a bad idea...
Source for rename:

    def rename(self, *args, **kwargs):

        axes, kwargs = self._construct_axes_from_arguments(args, kwargs)
        copy = kwargs.pop('copy', True)
        inplace = kwargs.pop('inplace', False)

        if kwargs:
            raise TypeError('rename() got an unexpected keyword '
                            'argument "{0}"'.format(list(kwargs.keys())[0]))

        if com._count_not_none(*axes.values()) == 0:
            raise TypeError('must pass an index to rename')

        # renamer function if passed a dict
        def _get_rename_function(mapper):
            if isinstance(mapper, (dict, ABCSeries)):

                def f(x):
                    if x in mapper:
                        return mapper[x]
                    else:
                        return x
            else:
                f = mapper

            return f

        self._consolidate_inplace()
        result = self if inplace else self.copy(deep=copy)

        # start in the axis order to eliminate too many copies
        for axis in lrange(self._AXIS_LEN):
            v = axes.get(self._AXIS_NAMES[axis])
            if v is None:
                continue
            f = _get_rename_function(v)

            baxis = self._get_block_manager_axis(axis)
            result._data = result._data.rename_axis(f, axis=baxis, copy=copy)
            result._clear_item_cache()

        if inplace:
            self._update_inplace(result._data)
        else:
            return result.__finalize__(self)