I have a dataset of samples covering multiple days, each with a timestamp. I want to select rows within a specific time window, e.g. all rows that were generated between 1 pm and 3 pm on every day.
This is a sample of my data in a pandas dataframe:
    22   22   2018-04-12T20:14:23Z  2018-04-12T21:14:23Z  0  6370.1
    23   23   2018-04-12T21:14:23Z  2018-04-12T21:14:23Z  0  6368.8
    24   24   2018-04-12T22:14:22Z  2018-04-13T01:14:23Z  0  6367.4
    25   25   2018-04-12T23:14:22Z  2018-04-13T01:14:23Z  0  6365.8
    26   26   2018-04-13T00:14:22Z  2018-04-13T01:14:23Z  0  6364.4
    27   27   2018-04-13T01:14:22Z  2018-04-13T01:14:23Z  0  6362.7
    28   28   2018-04-13T02:14:22Z  2018-04-13T05:14:22Z  0  6361.0
    29   29   2018-04-13T03:14:22Z  2018-04-13T05:14:22Z  0  6359.3
    ..   ...  ...                   ...                   ...  ...
    562  562  2018-05-05T08:13:21Z  2018-05-05T09:13:21Z  0  6300.9
    563  563  2018-05-05T09:13:21Z  2018-05-05T09:13:21Z  0  6300.7
    564  564  2018-05-05T10:13:14Z  2018-05-05T13:13:14Z  0  6300.2
    565  565  2018-05-05T11:13:14Z  2018-05-05T13:13:14Z  0  6299.9
    566  566  2018-05-05T12:13:14Z  2018-05-05T13:13:14Z  0  6299.6

How do I achieve that? I need to ignore the date and evaluate only the time component. I could loop over the dataframe and check each datetime that way, but there must be a simpler way to do it.
I converted messageDate, which was read as a string, to a datetime with

    df["messageDate"] = pd.to_datetime(df["messageDate"])

But after that I got stuck on how to filter on the time only.
Any input is appreciated.
Recommended answer

Datetime columns expose a DatetimeProperties object (the .dt accessor), from which you can extract datetime.time values and filter on them:
    import datetime
    import pandas as pd

    df = pd.DataFrame(
        [
            '2018-04-12T12:00:00Z', '2018-04-12T14:00:00Z', '2018-04-12T20:00:00Z',
            '2018-04-13T12:00:00Z', '2018-04-13T14:00:00Z', '2018-04-13T20:00:00Z'
        ],
        columns=['messageDate']
    )

    # Parse the ISO 8601 strings into datetimes (the trailing Z marks them as UTC).
    df["messageDate"] = pd.to_datetime(df["messageDate"])

    # Compare only the time-of-day component, ignoring the date.
    # Comparing full time values (rather than .dt.hour <= 15) keeps
    # samples from 15:01 to 15:59 out of a 1 pm to 3 pm window.
    times = df['messageDate'].dt.time
    time_mask = (times >= datetime.time(13, 0)) & (times <= datetime.time(15, 0))

    df[time_mask]
    # Selects rows 1 and 4, the 14:00:00 sample on each of the two days.
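As an alternative to building a boolean mask, pandas also offers DataFrame.between_time, which selects rows by time of day when the frame is indexed by its timestamps. This is not part of the answer above, only a minimal sketch of the same selection; the "value" column and its numbers are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "messageDate": [
            "2018-04-12T12:00:00Z", "2018-04-12T14:00:00Z", "2018-04-12T20:00:00Z",
            "2018-04-13T12:00:00Z", "2018-04-13T14:00:00Z", "2018-04-13T20:00:00Z",
        ],
        # Hypothetical measurement column, only here so the result is easy to read.
        "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)
df["messageDate"] = pd.to_datetime(df["messageDate"])

# between_time requires a DatetimeIndex, so index the frame by the timestamp.
indexed = df.set_index("messageDate")

# Keep rows whose time of day falls between 13:00 and 15:00, on any date.
selected = indexed.between_time("13:00", "15:00")
print(selected)
```

By default both endpoints are included, so a sample stamped exactly 13:00:00 or 15:00:00 would be kept. The times are compared in the timezone of the index, which is UTC for the strings above.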