How to group a time series of sequential events (with gaps between events) to find the duration of each group

Updated: 2024-10-25 05:24:49

Problem description

I have a data set in R with a series of people, events that occur and an assigned time that they occur in seconds, starting from 0. It looks similar to this:

    event seconds person
        1     0.0    Bob
        2    15.0    Bob
        3    28.5    Bob
        4    32.0    Joe
        5    38.0    Joe
        6    41.0    Joe
        7    42.5    Joe
        8    55.0   Anne
        9    58.0   Anne

I need to filter for each name, and that means the ordered events will not be sequential for each person.

An example of what this looks like (notice how Bob is not involved in events 4-40, etc.):

    event seconds person
        1     0.0    Bob
        2    15.0    Bob
        3    28.5    Bob
       41   256.0    Bob
       42   261.0    Bob
       43   266.0    Bob
       44   268.5    Bob
       45   272.0    Bob
       46   273.0    Bob
       49   569.0    Bob
       80   570.5    Bob
       81   581.0    Bob

Events that are sequential and related are separated by an increment of 1. I would like to find the duration of each group of related events: for example, events 1-3 form a group lasting 28.5 seconds, and events 41-46 form another group lasting 17 seconds. This is needed for every name listed in the person column.

I have tried filtering the names using dplyr and then finding the difference between event rows, using as.matrix, and determining where the increment is greater than 1 (indicating it's no longer part of the current sequence of events). I haven't found a way to assign the max and min based off of this to determine the duration of related events. The solution does not need to involve this step though, but it was the closest I could come.

The end goal is to plot the non-contiguous time durations for each person to have a visual representation of each person's event involvement for the entire data set.

Thank you.

Recommended answer

Suppose first we have just Bob's rows of the dataframe, called bob. We will assume bob is already ordered by event, increasing.

Along the same lines as you mentioned (looking at diff(event) > 1), you can additionally use cumsum to group each event to the 'run' of events it belongs to:

    library(plyr)
    bob2 <- mutate(bob, start = c(1, diff(bob$event) > 1), run=cumsum(start))

       event seconds person start run
    1      1     0.0    Bob     1   1
    2      2    15.0    Bob     0   1
    3      3    28.5    Bob     0   1
    4     41   256.0    Bob     1   2
    5     42   261.0    Bob     0   2
    6     43   266.0    Bob     0   2
    7     44   268.5    Bob     0   2
    8     45   272.0    Bob     0   2
    9     46   273.0    Bob     0   2
    10    49   569.0    Bob     1   3
    11    80   570.5    Bob     1   4
    12    81   581.0    Bob     0   4

start indicates whether this starts a run of sequential events, and run is which such set of events we are in.
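To see the diff/cumsum idea in isolation, here is a minimal base R sketch on a plain vector. It assumes the event numbers are already sorted in increasing order and reuses Bob's event numbers from the question; it needs no packages.

```r
# Bob's event numbers from the example above, already sorted.
event <- c(1, 2, 3, 41, 42, 43, 44, 45, 46, 49, 80, 81)

# diff() gives the gap to the previous event. The first event has no
# predecessor, so it is marked as the start of a run by hand.
start <- c(TRUE, diff(event) > 1)

# cumsum() over a logical vector counts the starts seen so far,
# which is exactly the id of the run each event belongs to.
run <- cumsum(start)

print(data.frame(event, start, run))
```

Every gap larger than 1 bumps the counter, so an isolated event such as 49 ends up in a run of its own.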

Then you can just find the duration:

    ddply(bob2, .(run), summarize, length=diff(range(seconds)))

      run length
    1   1   28.5
    2   2   17.0
    3   3    0.0
    4   4   10.5

Now supposing you have your original dataframe with everyone mixed together in it, we can use ddply again to split it up by person:

    # label the runs within each person (df must be ordered by event)
    tmp <- ddply(df, .(person), transform, run=cumsum(c(1, diff(event) != 1)))
    # head()/tail() pick the first and last event of each run
    ddply(tmp, .(person, run), summarize, length=diff(range(seconds)),
          start_event=head(event, 1), end_event=tail(event, 1))

      person run length start_event end_event
    1   Anne   1    3.0           8         9
    2    Bob   1   28.5           1         3
    3    Bob   2   17.0          41        46
    4    Bob   3    0.0          49        49
    5    Bob   4   10.5          80        81
    6    Joe   1   10.5           4         7
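For the plot described in the question, one possible sketch in base R is shown below. It is not part of the original answer: the column names start_sec and end_sec, the runs data frame, and the choice of one horizontal lane per person are assumptions made for illustration. The data are the rows from the question.

```r
# Example data: the rows from the question (Bob's longer table plus Joe and Anne).
df <- data.frame(
  event   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 41, 42, 43, 44, 45, 46, 49, 80, 81),
  seconds = c(0, 15, 28.5, 32, 38, 41, 42.5, 55, 58,
              256, 261, 266, 268.5, 272, 273, 569, 570.5, 581),
  person  = c(rep("Bob", 3), rep("Joe", 4), rep("Anne", 2), rep("Bob", 9))
)

# Sort, then label runs within each person (same cumsum idea as above).
df <- df[order(df$person, df$event), ]
df$run <- ave(df$event, df$person,
              FUN = function(e) cumsum(c(1, diff(e) != 1)))

# One row per (person, run) with its first and last timestamp.
# Both aggregate() calls group identically, so their rows line up.
runs <- aggregate(seconds ~ person + run, data = df, FUN = min)
names(runs)[3] <- "start_sec"
runs$end_sec <- aggregate(seconds ~ person + run, data = df, FUN = max)$seconds

# Draw one horizontal lane per person and one segment per run.
people <- sort(unique(runs$person))
y <- match(runs$person, people)
plot(NA, xlim = range(df$seconds), ylim = c(0.5, length(people) + 0.5),
     xlab = "seconds", ylab = "", yaxt = "n")
axis(2, at = seq_along(people), labels = people, las = 1)
segments(runs$start_sec, y, runs$end_sec, y, lwd = 6, lend = "butt")

# Single-event runs have zero length, so mark them with a point instead.
single <- runs$start_sec == runs$end_sec
points(runs$start_sec[single], y[single], pch = 16)
```

Each person gets a row of bars whose gaps show the periods in which that person was not involved in any event.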

Note: my df is your bob table rbind-ed to your other table and then passed through unique() (just to show it works when there is more than one run per person). There is probably a clever way to combine the two ddply calls (or to use the dplyr pipe-y syntax, which I am not familiar with), but I do not know what it is.
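As for the dplyr pipe syntax mentioned in the note, a single pipeline along the following lines should give the same per-person summary. This is a sketch rather than part of the original answer: it assumes dplyr 1.0 or later is installed (for the .groups argument), and the name runs is chosen here for illustration.

```r
library(dplyr)

# Example data: the rows from the question (Bob's longer table plus Joe and Anne).
df <- data.frame(
  event   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 41, 42, 43, 44, 45, 46, 49, 80, 81),
  seconds = c(0, 15, 28.5, 32, 38, 41, 42.5, 55, 58,
              256, 261, 266, 268.5, 272, 273, 569, 570.5, 581),
  person  = c(rep("Bob", 3), rep("Joe", 4), rep("Anne", 2), rep("Bob", 9))
)

runs <- df %>%
  arrange(person, event) %>%                      # events must be ordered within person
  group_by(person) %>%
  mutate(run = cumsum(c(1, diff(event) != 1))) %>% # new run whenever the gap is not 1
  group_by(person, run) %>%
  summarise(
    length      = diff(range(seconds)),           # duration of the run in seconds
    start_event = first(event),
    end_event   = last(event),
    .groups     = "drop"
  )

print(runs)
```

Here first() and last() come from dplyr itself. Loading plyr and dplyr in the same session makes one package mask the other's mutate and summarise, so this sketch deliberately loads only dplyr.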

Published: 2023-10-13 09:45:44
Link: https://www.elefans.com/category/jswz/34/1487605.html