Updated: 2024-10-10 15:24:15

Pulling multiple, non-consecutive index values from a Pandas DataFrame

I've created a pandas DataFrame by reading a file with scipy.io in the following way (file.sav is an IDL structure created on a different machine; scipy.io turns it into a standard Python dictionary):

    from scipy import io
    import pandas as pd
    import numpy as np

    tmp = io.readsav('file.sav', python_dict=True)
    df = pd.DataFrame(tmp, index=tmp['shots'].astype('int32'))

The DataFrame contains a set of values (from file.sav) and, as its index, a series of integers of the form 19999, 20000, 30000, etc. Now I would like to take a subset of these index values, say

    df.loc[[19999, 20000]]

For some reason I get errors of the form

    raise ValueError('Cannot index with multidimensional key')

plus others, and at the end

    ValueError: Big-endian buffer not supported on little-endian compiler

But I've checked that both the machine I'm working on and the machine that created file.sav are little-endian, so I don't think this is the problem.
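The byte order of the machine and the byte order of the arrays in memory are separate things, so it can help to inspect the arrays directly. The following is a minimal sketch of such a check. The helper name `is_native` and the sample `shots` array are hypothetical stand-ins (assumed to be big-endian float32, matching the `>f4` dtype reported in the answer), not data from file.sav.

```python
import sys

import numpy as np


def is_native(arr):
    """Return True if the array's dtype uses the machine's own byte order."""
    return np.asarray(arr).dtype.isnative


# Hypothetical stand-in for tmp['shots'] (assumption: stored as big-endian float32).
shots = np.array([19999, 20000, 30000], dtype=">f4")

print("machine byte order:", sys.byteorder)
print("array dtype:", shots.dtype)           # a leading '>' marks big-endian
print("native byte order?", is_native(shots))
```

If the dtype of a column starts with `>` on a little-endian machine, the data is big-endian regardless of which machine wrote the file.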

Best answer

Your input file is big-endian. See here for how to convert it: http://pandas.pydata.org/pandas-docs/dev/gotchas.html#byte-ordering-issues

Compare the dtypes before and after the conversion:

    In [7]: df.dtypes
    Out[7]:
    a        >f4
    b        >f4
    c        >f4
    shots    >f4
    dtype: object

    In [9]: df.apply(lambda x: x.values.byteswap().newbyteorder())
    Out[9]:
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 100 entries, 20000 to 20099
    Data columns (total 4 columns):
    a        100  non-null values
    b        100  non-null values
    c        100  non-null values
    shots    100  non-null values
    dtypes: float32(4)

    In [10]: df.apply(lambda x: x.values.byteswap().newbyteorder()).dtypes
    Out[10]:
    a        float32
    b        float32
    c        float32
    shots    float32
    dtype: object

Also set the index after you do this, rather than in the constructor:

    df.set_index('shots', inplace=True)
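The session above calls `newbyteorder()` on the array itself, which NumPy 2.0 and later no longer provide. As a hedged alternative, the sketch below converts each column with `astype` to the native-order version of its dtype, then sets the index and selects the rows. The helper `to_native` and the small `tmp` dictionary are hypothetical stand-ins for the result of `io.readsav`, not the asker's data.

```python
import numpy as np
import pandas as pd


def to_native(arr):
    """Return the array in the machine's native byte order (values unchanged)."""
    arr = np.asarray(arr)
    if arr.dtype.isnative:
        return arr
    # newbyteorder("=") on the dtype gives the same type in native order;
    # astype then rewrites the bytes so the values stay the same.
    return arr.astype(arr.dtype.newbyteorder("="))


# Hypothetical stand-in for the dict returned by io.readsav (assumption: big-endian float32).
tmp = {
    "a": np.array([1.0, 2.0, 3.0], dtype=">f4"),
    "shots": np.array([19999, 20000, 30000], dtype=">f4"),
}

# Convert first, build the frame, and only then set the index.
df = pd.DataFrame({name: to_native(col) for name, col in tmp.items()})
df["shots"] = df["shots"].astype("int32")
df = df.set_index("shots")

subset = df.loc[[19999, 20000]]
print(subset)
```

Converting before constructing the DataFrame means the index is built from native-order integers, which is the ordering the answer recommends.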

Published: 2023-08-05 18:59:00

Article link: https://www.elefans.com/category/jswz/34/1437192.html