I am trying to filter the columns of a pandas DataFrame based on whether they have a datetime dtype. I can figure out which ones do by inspecting df.dtypes, but then I would have to parse that output or select the columns manually. I want to select the datetime columns automatically. Here is what I have so far as an example; in this case I would want to select only the 'date_col' column.
import pandas as pd

df = pd.DataFrame([['Feb-2017', 1, 2], ['Mar-2017', 1, 2], ['Apr-2017', 1, 2], ['May-2017', 1, 2]],
                  columns=['date_str', 'col1', 'col2'])
df['date_col'] = pd.to_datetime(df['date_str'])
df.dtypes

Output:

date_str            object
col1                 int64
col2                 int64
date_col    datetime64[ns]
dtype: object

Recommended answer
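pandas has a handy method called select_dtypes, which takes include, exclude, or both as parameters and filters the DataFrame's columns by dtype. In this case you want to include columns of dtype np.datetime64 (the strings 'datetime' and 'datetime64' work as well). Note that this matches only timezone-naive datetime columns; timezone-aware columns are selected with 'datetimetz'. To filter by integers, use [np.integer]; for floats, use [np.floating]; to filter by numeric columns only, use [np.number]. You can also list concrete types such as [np.int64, np.int32, np.int16] or [np.float64, np.float32, np.float16], but avoid the old np.int and np.float aliases, which were removed in NumPy 1.24.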
In:

import numpy as np

df.select_dtypes(include=[np.datetime64])

Out:

    date_col
0 2017-02-01
1 2017-03-01
2 2017-04-01
3 2017-05-01

In:

df.select_dtypes(include=[np.number])

Out:

   col1  col2
0     1     2
1     1     2
2     1     2
3     1     2
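If you want the column names rather than a filtered DataFrame, something along these lines should work. This is a minimal sketch, not the only way to do it: the sample data, the extra timezone-aware column 'date_utc', and the variable names are assumptions made up for illustration, and the explicit format string is only there to avoid relying on date-format inference.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, modelled on the question's frame.
df = pd.DataFrame(
    {
        "date_str": ["Feb-2017", "Mar-2017"],
        "col1": [1, 1],
        "col2": [2.5, 3.5],
    }
)
df["date_col"] = pd.to_datetime(df["date_str"], format="%b-%Y")
# Assumed extra column: a timezone-aware copy, to show the difference.
df["date_utc"] = df["date_col"].dt.tz_localize("UTC")

# Timezone-naive datetime columns only.
naive_cols = df.select_dtypes(include=["datetime"]).columns.tolist()

# Naive and timezone-aware datetime columns together.
all_date_cols = df.select_dtypes(include=["datetime", "datetimetz"]).columns.tolist()

# Equivalent per-column check using the pandas type-checking helper.
checked_cols = [c for c in df.columns if pd.api.types.is_datetime64_any_dtype(df[c])]

# Abstract NumPy types cover every integer or float width at once.
int_cols = df.select_dtypes(include=[np.integer]).columns.tolist()
float_cols = df.select_dtypes(include=[np.floating]).columns.tolist()

print(naive_cols, all_date_cols, int_cols, float_cols)
```

The resulting lists can be passed straight back into df[...] to select those columns, which avoids parsing the output of df.dtypes by hand.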