SQL将日期缩小到“开始

编程入门 行业动态 更新时间:2024-10-27 12:29:14
SQL将日期缩小到“开始 - 结束”范围(SQL reduce dates to range “start - end”)

我在表中有几个日期重复行:

ID STATE DATE ---------------------------- id01 connected 2015-04-04 id01 connected 2015-04-05 id01 connected 2015-04-08 id01 disconect 2015-04-11 id01 disconect 2015-04-12 id01 connected 2015-04-13

我想要一个带有“开始日期”和“结束日期”的查询,结果如下:

ID STATE START DATE END DATE ---------------------------------------- id01 connected 2015-04-04 2015-04-10 id01 disconect 2015-04-11 2015-04-12 id01 connected 2015-04-13 XXXXXXXXXX

最后一个“结束日期”并不重要(最后一个值,null,now()...)

最重要的是检测更改日期(在此示例中,2015-04-10没有行,2015-04-13中的状态相同)。

可行的解决方案? (无效)

SELECT ID, STATE, MIN(date), MAX(date) FROM TABLE GROUP BY ID, STATE;

合并间隔无效:

ID STATE START DATE END DATE ---------------------------------------- id01 connected 2015-04-04 XXXXXXXXXX id01 disconect 2015-04-11 2015-04-12

查询已在Impala中运行(类似SQL92)

I have repeat rows in a table with several dates:

ID STATE DATE ---------------------------- id01 connected 2015-04-04 id01 connected 2015-04-05 id01 connected 2015-04-08 id01 disconect 2015-04-11 id01 disconect 2015-04-12 id01 connected 2015-04-13

I want a query with "start date" and "end date", with this result:

ID STATE START DATE END DATE ---------------------------------------- id01 connected 2015-04-04 2015-04-10 id01 disconect 2015-04-11 2015-04-12 id01 connected 2015-04-13 XXXXXXXXXX

The last "end date" it's not important (last value, null, now()...)

The most important is detect date of change (in this example no row for 2015-04-10, and the same state happends in 2015-04-13).

Posible solution? (not valid)

SELECT ID, STATE, MIN(date), MAX(date) FROM TABLE GROUP BY ID, STATE;

Isn't valid because merge intervals:

ID STATE START DATE END DATE ---------------------------------------- id01 connected 2015-04-04 XXXXXXXXXX id01 disconect 2015-04-11 2015-04-12

Query has run in Impala (similar SQL92)

最满意答案

Impala支持窗口功能。 这个问题是一个“缺口和孤岛”问题,因此可以使用行数的差异来解决:

select id, state, min(date) as start_date, max(date) as end_date from (select t.*, row_number() over (partition by id order by date) as seqnum_id, row_number() over (partition by id, state order by date) as seqnum_isd from table t ) t group by id, state, (seqnum_id - seqnum_isd);

差异的逻辑并不困难,但是当你第一次学习它时却很棘手。 它有助于运行子查询并查看行号值是什么 - 以及差异定义每个组的原因。

Impala supports window functions. This problem is a "gap-and-islands" problem, so it can be solved using a difference of row numbers:

select id, state, min(date) as start_date, max(date) as end_date from (select t.*, row_number() over (partition by id order by date) as seqnum_id, row_number() over (partition by id, state order by date) as seqnum_isd from table t ) t group by id, state, (seqnum_id - seqnum_isd);

The logic for the difference is not difficult, but tricky when you first learn it. It helps to run the subquery and see what the row number values are -- and why the difference defines each group.

更多推荐

本文发布于:2023-08-03 12:28:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1391439.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:日期   SQL

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!