How to check whether a condition has lasted for more than 15 minutes?

Updated: 2024-10-24 05:23:12

Problem description

Below is a sample of the dataset:

Date	Value

2020-01-01 01:35	50
2020-01-01 01:41	49
2020-01-01 01:46	50

I want to check whether 'Value' was equal to 50 for a continuous period of 15 minutes. If so, I want to extract the date on which that happened. To illustrate what I mean by a continuous period, assume I want to check whether the value is equal to 50 for a continuous period of 5 minutes (instead of 15). Data that satisfies this condition would look like this:

Date	Value

2020-01-01 01:35	50
2020-01-01 01:36	50
2020-01-01 01:37	50
2020-01-01 01:38	50
2020-01-01 01:39	50

I would then add the date 2020-01-01 to a list, because in the data above the value was equal to 50 for a continuous period of 5 minutes (or more).

Recommended answer

I am posting the code for 5 minutes so that the output matches your desired output. Change 300 to 900 for 15 minutes. Steps:

Convert df['Date'] to datetime so that two timestamps can be subtracted to get the time difference between them.

Group df by date and call f on each group.

In f: max_continuous_range gives the length of the longest segment where the value is 50. f returns True if that length is 5 minutes or more. If f returns True, the date is appended to the list.

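Use:

import numpy as np
import pandas as pd
from datetime import timedelta

def f(g):
    # True where the value equals 50
    mask = (g['Value'] == 50)
    # Sum the gaps between consecutive matching rows, then add one minute
    # so that n rows spaced one minute apart count as n minutes.
    # Note: the cumulative sum is not reset between separate runs of 50 in the same group.
    max_continuous_range = (np.max(np.cumsum(g['Date'].where(mask).diff())) + timedelta(minutes = 1))
    # 300 seconds = 5 minutes; use 900 for 15 minutes
    return max_continuous_range.seconds >= 300

df['Date'] = pd.to_datetime(df['Date'])
# One group per calendar day
groups = df.groupby(df['Date'].dt.date, as_index = False)
# Keep the dates whose group satisfies f
final_list = [str(idx) for idx, g in groups if f(g)]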

Input:

               Date  Value
0  2020-01-01 01:35     40
1  2020-01-01 01:36     50
2  2020-01-01 01:37     50
3  2020-01-01 01:38     50
4  2020-01-01 01:39     50
5  2020-01-01 01:40     50
6  2020-01-01 01:41     40
7  2020-01-01 01:42     40

Output:

>>> final_list
['2020-01-01']
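One caveat with the cumulative-sum approach: the sum is not reset between separate runs of 50, so two short runs on the same day can add up to the threshold. Below is a minimal sketch of a variant that measures each run on its own. It is not the answer's method: the function name `dates_with_long_run` and its parameters are hypothetical, and it assumes one row per minute. As in the answer's code, one sample period is added to each run's span, and missing minutes inside a run are not treated as breaks.

```python
import pandas as pd

def dates_with_long_run(df, target=50, min_duration="5min", sample_period="1min"):
    """Return dates (as strings) on which Value == target held for one
    unbroken run of rows spanning at least min_duration."""
    df = df.copy()
    df["Date"] = pd.to_datetime(df["Date"])
    df = df.sort_values("Date").reset_index(drop=True)

    mask = df["Value"].eq(target)
    if not mask.any():
        return []

    # A new run starts whenever the mask flips or the calendar day changes.
    day = df["Date"].dt.normalize()
    run_id = (mask.ne(mask.shift()) | day.ne(day.shift())).cumsum()

    # First and last timestamp of every run of matching rows.
    spans = df[mask].groupby(run_id[mask])["Date"].agg(["min", "max"])

    # Assumption: each row stands for one sample period, so add it once per run.
    duration = spans["max"] - spans["min"] + pd.Timedelta(sample_period)
    long_runs = spans[duration >= pd.Timedelta(min_duration)]
    return sorted({str(d) for d in long_runs["min"].dt.date})
```

Calling `dates_with_long_run(df)` on the input above should give the same list as `final_list`; passing `min_duration="15min"` corresponds to changing 300 to 900.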

Inside f(g):

mask: True where the value is 50.

0    False
1     True
2     True
3     True
4     True
5     True
6    False
7    False

g['Date'].where(mask) puts NaT wherever mask is not True.

0                   NaT
1   2020-01-01 01:36:00
2   2020-01-01 01:37:00
3   2020-01-01 01:38:00
4   2020-01-01 01:39:00
5   2020-01-01 01:40:00
6                   NaT
7                   NaT

.diff gives the difference between two consecutive elements; the result is NaT if either element is NaT. Result after g['Date'].where(mask).diff():

0               NaT
1               NaT
2   0 days 00:01:00
3   0 days 00:01:00
4   0 days 00:01:00
5   0 days 00:01:00
6               NaT
7               NaT
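The intermediate results so far can be reproduced on the sample input. This is a small sketch in which `g` stands in for the single group that f receives; the names `masked` and `steps` are only for illustration.

```python
import pandas as pd

# Sample input from above: eight rows, one minute apart.
g = pd.DataFrame({
    "Date": pd.date_range("2020-01-01 01:35", periods=8, freq="min"),
    "Value": [40, 50, 50, 50, 50, 50, 40, 40],
})

mask = g["Value"] == 50          # True where the value is 50
masked = g["Date"].where(mask)   # NaT wherever mask is False
steps = masked.diff()            # NaT if either neighbour is NaT

print(mask)
print(masked)
print(steps)
```

Row 1 is NaT after .diff even though its value is 50, because the row before it was masked out.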

Now the cumulative sum of the differences between consecutive timestamps gives the total elapsed time. After np.cumsum(...):