I am working on an analysis of animal locations that requires the locations for each animal to be 60 minutes or more apart. Time differences between locations of different animals do not matter. The data set has a list of animal IDs and the date and time of each location; an example is below.
For example, for animal 6 below, starting at the 16:19 location, the code would iterate through the locations until it finds one that is 60+ minutes after 16:19. In this case that is the 17:36 location. The code would then start from the 17:36 location to find the next location 60+ minutes later (18:52), and so on. Each of the locations that are 60+ minutes apart would then be extracted to a separate data frame.
I have written a loop in R to subset the data, but the code does not account for a change in date when calculating whether locations are 60 minutes or more apart.
I have been exploring the lubridate package, which seems like it may offer an easier way to subset my data. However, I have not yet found a way to subset the data to my specifications using this package. Any suggestions for using lubridate or an alternative method would be greatly appreciated.
Thank you in advance for your consideration.
    > data(locdata)
    > view(locdata)
    id  date       time
    6   30-Jun-09  16:19
    6   30-Jun-09  16:31
    6   30-Jun-09  17:36
    6   30-Jun-09  17:45
    6   30-Jun-09  18:00
    6   30-Jun-09  18:52
    6   7-Aug-10   5:30
    6   7-Aug-10   5:45
    6   7-Aug-10   6:00
    6   7-Aug-10   6:45
    23  30-Jun-09  17:15
    23  30-Jun-09  17:38
    23  30-Jun-09  17:56
    23  30-Jun-09  20:00
    23  30-Jun-09  22:19
    23  18-Jul-11  16:22
    23  18-Jul-11  17:50
    23  18-Jul-11  18:15

The output from the example data above would look like this:
    id  date       time
    6   30-Jun-09  16:19
    6   30-Jun-09  17:36
    6   30-Jun-09  18:52
    6   7-Aug-10   5:30
    6   7-Aug-10   6:45
    23  30-Jun-09  17:15
    23  30-Jun-09  20:00
    23  30-Jun-09  22:19
    23  18-Jul-11  16:22
    23  18-Jul-11  17:50

Recommended answer

If I understood you correctly, I think you're looking for something along these lines:
    library(dplyr)
    library(lubridate)

    locdata %>%
      mutate(timestamp = dmy_hm(paste(date, time))) %>%
      group_by(id, date) %>%
      mutate(delta = timestamp - lag(timestamp))

If you haven't used dplyr or magrittr before, the syntax above may be unclear. The %>% operator passes the result of each computation to the next function, so the above code does the following:
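The steps can be read directly off the pipeline. Below is an annotated version run on three rows of the example data; the names `sample_locs` and `annotated` are chosen here for illustration only and are not part of the original answer.

```r
library(dplyr)
library(lubridate)

# Three rows from the example data; `sample_locs` is a hypothetical name.
sample_locs <- data.frame(
  id   = c(6, 6, 6),
  date = c("30-Jun-09", "30-Jun-09", "30-Jun-09"),
  time = c("16:19", "16:31", "17:36"),
  stringsAsFactors = FALSE
)

annotated <- sample_locs %>%
  # 1. Paste date and time together and parse the result as a date-time.
  mutate(timestamp = dmy_hm(paste(date, time))) %>%
  # 2. Treat each animal and each date as a separate group.
  group_by(id, date) %>%
  # 3. Within each group, subtract the previous row's timestamp
  #    from the current row's timestamp.
  mutate(delta = timestamp - lag(timestamp))
```

The first row of each group has no previous row, so its `delta` is `NA`.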
If you want to save the output, change the first line to something like results <- locdata %>%.
Based on your updated question and the modified data, I believe this works:
    locdata %>%
      mutate(timestamp = dmy_hm(paste(date, time))) %>%
      group_by(id, date) %>%
      mutate(delta = timestamp - first(timestamp),
             steps = as.numeric(floor(delta / 3600)),
             change = ifelse(is.na(steps - lag(steps)), 1, steps - lag(steps))) %>%
      filter(change > 0) %>%
      select(id, date, timestamp)

Output:
    Source: local data frame [10 x 3]
    Groups: id, date

       id      date           timestamp
    1   6 30-Jun-09 2009-06-30 16:19:00
    2   6 30-Jun-09 2009-06-30 17:36:00
    3   6 30-Jun-09 2009-06-30 18:52:00
    4   6  7-Aug-10 2010-08-07 05:30:00
    5   6  7-Aug-10 2010-08-07 06:45:00
    6  23 30-Jun-09 2009-06-30 17:15:00
    7  23 30-Jun-09 2009-06-30 20:00:00
    8  23 30-Jun-09 2009-06-30 22:19:00
    9  23 18-Jul-11 2011-07-18 16:22:00
    10 23 18-Jul-11 2011-07-18 17:50:00

How it works:
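The arithmetic behind the `steps` and `change` columns can be checked by hand. Here is a small base-R sketch using the seconds elapsed for animal 6 on 30-Jun-09 (0, 720, 4620, 5160, 6060, 9180); the variable names are chosen for illustration.

```r
# Seconds elapsed since the first location of the group (animal 6, 30-Jun-09).
delta_secs <- c(0, 720, 4620, 5160, 6060, 9180)

# Number of whole hours elapsed since the first location.
steps <- floor(delta_secs / 3600)

# Change in the hour count relative to the previous row. The first row has
# no previous row, so it is set to 1 and is always kept.
change <- c(1, diff(steps))

# Rows where the hour count increased are the ones the filter keeps.
kept_rows <- which(change > 0)
```

Here `steps` is 0, 0, 1, 1, 1, 2 and `change` is 1, 0, 1, 0, 0, 1, so rows 1, 3 and 6 (16:19, 17:36 and 18:52) are kept.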
To get comfortable with how it works, drop the filter and select from the end and inspect the output:
    Source: local data frame [18 x 7]
    Groups: id, date

       id      date  time           timestamp      delta steps change
    1   6 30-Jun-09 16:19 2009-06-30 16:19:00     0 secs     0      1
    2   6 30-Jun-09 16:31 2009-06-30 16:31:00   720 secs     0      0
    3   6 30-Jun-09 17:36 2009-06-30 17:36:00  4620 secs     1      1
    4   6 30-Jun-09 17:45 2009-06-30 17:45:00  5160 secs     1      0
    5   6 30-Jun-09 18:00 2009-06-30 18:00:00  6060 secs     1      0
    6   6 30-Jun-09 18:52 2009-06-30 18:52:00  9180 secs     2      1
    7   6  7-Aug-10  5:30 2010-08-07 05:30:00     0 secs     0      1
    8   6  7-Aug-10  5:45 2010-08-07 05:45:00   900 secs     0      0
    9   6  7-Aug-10  6:00 2010-08-07 06:00:00  1800 secs     0      0
    10  6  7-Aug-10  6:45 2010-08-07 06:45:00  4500 secs     1      1
    11 23 30-Jun-09 17:15 2009-06-30 17:15:00     0 secs     0      1
    12 23 30-Jun-09 17:38 2009-06-30 17:38:00  1380 secs     0      0
    13 23 30-Jun-09 17:56 2009-06-30 17:56:00  2460 secs     0      0
    14 23 30-Jun-09 20:00 2009-06-30 20:00:00  9900 secs     2      2
    15 23 30-Jun-09 22:19 2009-06-30 22:19:00 18240 secs     5      3
    16 23 18-Jul-11 16:22 2011-07-18 16:22:00     0 secs     0      1
    17 23 18-Jul-11 17:50 2011-07-18 17:50:00  5280 secs     1      1
    18 23 18-Jul-11 18:15 2011-07-18 18:15:00  6780 secs     1      0
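One caveat worth checking against your own data: the pipeline above counts whole hours from the first location in each group, which is not always the same as "60+ minutes since the last kept location". For instance, locations at 0, 110 and 125 minutes fall into hours 0, 1 and 2, so all three would be kept even though the last two are only 15 minutes apart. The following is a minimal base-R sketch of the sequential rule as described in the question, not the method from the answer above; `thin_by_gap` is a hypothetical helper name, and it assumes the timestamps are already sorted.

```r
# Keep the first location, then every location that is at least `gap_mins`
# minutes after the most recently kept one.
# `thin_by_gap` is a hypothetical helper; `times` must be sorted POSIXct.
thin_by_gap <- function(times, gap_mins = 60) {
  keep <- logical(length(times))
  last_kept <- NULL
  for (i in seq_along(times)) {
    if (is.null(last_kept) ||
        as.numeric(difftime(times[i], last_kept, units = "mins")) >= gap_mins) {
      keep[i] <- TRUE
      last_kept <- times[i]
    }
  }
  keep
}

# Animal 6 on 30-Jun-09 from the example data.
times <- as.POSIXct(
  c("2009-06-30 16:19", "2009-06-30 16:31", "2009-06-30 17:36",
    "2009-06-30 17:45", "2009-06-30 18:00", "2009-06-30 18:52"),
  format = "%Y-%m-%d %H:%M", tz = "UTC"
)
kept <- times[thin_by_gap(times)]
```

For these six locations the kept times are 16:19, 17:36 and 18:52, matching the expected output in the question. Because the comparison is made on full date-times, a change in date needs no special handling; the helper could presumably be applied per animal inside the dplyr pipeline with group_by(id) followed by filter(thin_by_gap(timestamp)).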