Fetching distinct rows for overlapping timestamps in SQL Server

Updated: 2024-10-20 07:54:09

Problem description

I have the following result set which I get from SQL Server:

employeeNumber | start_date | start_time | end_date   | end_time
---------------+------------+------------+------------+---------
123            | 10-03-2020 | 18:13:55   | 10-03-2020 | 22:59:46
123            | 10-03-2020 | 18:24:22   | 10-03-2020 | 22:59:51
123            | 10-03-2020 | 23:24:22   | 10-03-2020 | 23:59:51
123            | 11-03-2020 | 18:25:25   | 11-03-2020 | 20:59:51
123            | 12-03-2020 | 18:40:22   | 12-03-2020 | 22:59:52

In some cases I have multiple rows covering the same overlapping time (rows 1 and 2 above), but with slightly different start and end times (differing by seconds or minutes).

My query is a simple select that fetches the data from the source table. What can I add to the where clause to fetch distinct rows for such overlapping timestamp rows? That is, for the above query I would want the result set to return the following:

employeeNumber | start_date | start_time | end_date   | end_time
---------------+------------+------------+------------+---------
123            | 10-03-2020 | 18:13:55   | 10-03-2020 | 22:59:46
123            | 10-03-2020 | 23:24:22   | 10-03-2020 | 23:59:51
123            | 11-03-2020 | 18:25:25   | 11-03-2020 | 20:59:51
123            | 12-03-2020 | 18:40:22   | 12-03-2020 | 22:59:52

Below is my query:

select employeeNumber, start_date, start_time, end_date, end_time
from emp_data
where employeeNumber = 123
order by employeeNumber;

I could probably make do with fetching only the first record of each overlapping set, but what would the where clause be?

Any help is appreciated as I am not very familiar with SQL Server.

Solution

This is complicated. You need to keep track of "starts" and "ends". I am going to assume that your columns are datetimes or something similar that can be combined into a single column:

with e as (
      select e.employeeNumber, v.dt, sum(v.inc) as inc,
             -- running total of starts (+1) and ends (-1); 0 means no interval is open
             sum(sum(v.inc)) over (partition by e.employeeNumber order by v.dt) as in_outs
      from emp_data e cross apply
           (values (start_date + start_time, 1),
                   (end_date + end_time, -1)
           ) v(dt, inc)
      group by e.employeeNumber, v.dt
     )
select employeeNumber, min(dt) as start_datetime, max(dt) as end_datetime
from (select e.*,
             sum(case when in_outs = 0 then 1 else 0 end) over (partition by employeeNumber order by dt) as grp
      from e
     ) e
-- the row where in_outs returns to 0 closes an island, so it is grouped with the rows before it
group by employeeNumber, grp - case when in_outs = 0 then 1 else 0 end;

Here is a db<>fiddle.

What is this doing?

  • First, the separate date and time columns are combined into single datetime values.
  • Then the columns are unpivoted and identified as starts and ends, along with +1 or -1 to indicate whether the employee is "entering" or "exiting" at that time.
  • These increments are accumulated into a running total per employee.
  • Now you have a gaps-and-islands problem, where you want to find continuous periods of "in"s. The "islands" are identified using a cumulative count of the points where the running total drops back to zero.
  • Then each island is aggregated into one row with its earliest and latest datetime.
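
To make the steps above easier to trace, here is a minimal sketch of the same +1/-1 running-total idea in Python, run against the sample rows from the question. It is only an illustration of the logic, not a replacement for the query: the names `ROWS`, `FMT` and `merge_overlaps` are made up for this example, and the date and time columns are assumed to be already combined into one string per endpoint.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%d-%m-%Y %H:%M:%S"  # assumed day-month-year layout, as in the question

# Sample rows from the question: (employeeNumber, start, end).
ROWS = [
    (123, "10-03-2020 18:13:55", "10-03-2020 22:59:46"),
    (123, "10-03-2020 18:24:22", "10-03-2020 22:59:51"),
    (123, "10-03-2020 23:24:22", "10-03-2020 23:59:51"),
    (123, "11-03-2020 18:25:25", "11-03-2020 20:59:51"),
    (123, "12-03-2020 18:40:22", "12-03-2020 22:59:52"),
]


def merge_overlaps(rows):
    """Merge overlapping intervals per employee using a +1/-1 running total."""
    # Unpivot: each row contributes +1 at its start and -1 at its end.
    # Increments sharing a timestamp are summed, like the GROUP BY in the CTE.
    events = defaultdict(int)
    for emp, start, end in rows:
        events[(emp, datetime.strptime(start, FMT))] += 1
        events[(emp, datetime.strptime(end, FMT))] -= 1

    merged = []
    running = defaultdict(int)  # per-employee running total ("in_outs")
    island_start = {}           # start of the currently open island, per employee
    for (emp, dt), inc in sorted(events.items()):
        if running[emp] == 0:
            island_start[emp] = dt  # nothing was open, so a new island begins
        running[emp] += inc
        if running[emp] == 0:
            # The running total is back to zero: the island closes here.
            merged.append((emp, island_start.pop(emp), dt))
    return merged


for emp, start, end in merge_overlaps(ROWS):
    print(emp, start.strftime(FMT), end.strftime(FMT))
# 123 10-03-2020 18:13:55 10-03-2020 22:59:51
# 123 10-03-2020 23:24:22 10-03-2020 23:59:51
# 123 11-03-2020 18:25:25 11-03-2020 20:59:51
# 123 12-03-2020 18:40:22 12-03-2020 22:59:52
```

Note that this merges rows 1 and 2 into a single interval, so the first result ends at 22:59:51 (the later of the two overlapping end times) rather than at 22:59:46 as in the first source row.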

EDIT:

You can replace the cumulative sum that computes grp with a correlated subquery:

from (select e.*,
             (select sum(case when e2.in_outs = 0 then 1 else 0 end)
              from e e2
              where e2.employeeNumber = e.employeeNumber and
                    e2.dt <= e.dt
             ) as grp
      from e
     ) e
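
As a rough check that the two formulations agree, the sketch below computes grp both ways in Python for a single employee. The `events` list holds made-up (dt, in_outs) pairs with integer timestamps, and the function names are hypothetical; the point is only that a running sum and a per-row count over all rows with dt <= the current dt give the same numbers.

```python
# Hypothetical (dt, in_outs) pairs for one employee, ordered by dt.
events = [(1, 1), (2, 2), (3, 1), (4, 0), (5, 1), (6, 0)]


def grp_running(events):
    """Cumulative sum, as the windowed sum(...) over (order by dt) computes it."""
    total, out = 0, []
    for _, in_outs in events:
        total += 1 if in_outs == 0 else 0
        out.append(total)
    return out


def grp_correlated(events):
    """Per-row count over all rows with dt <= current dt, as the subquery does."""
    return [
        sum(1 for dt2, in_outs2 in events if dt2 <= dt and in_outs2 == 0)
        for dt, _ in events
    ]


print(grp_running(events))     # [0, 0, 0, 1, 1, 2]
print(grp_correlated(events))  # [0, 0, 0, 1, 1, 2]
```

The correlated form rescans the earlier rows for every row, so it does more work than the running sum as the number of events per employee grows.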


Published: 2023-07-18 04:41:14
Article link: https://www.elefans.com/category/jswz/34/1141114.html
Tags: time, SQL, Server
