这是一个老问题 - 在Vertica中寻找最佳解决方案。 想象一下带有列的表格: -
A, B, C, D, E列AD是整数或变量,列E是具有缺省值GETUTCDATE()的timestamptz列。
表格的样本内容: -
1, 2, "AAA", 4, 1404305559 1, 2, "BBB", 23, 1404305633 1, 2, "CCC", 62, 1404305705 <-- the max entry for (1,2,"CCC") 1, 2, "AAA", 123, 1404305740 <-- the max entry for (1,2,"AAA") 1, 2, "BBB", 91, 1404305778 <-- the max entry for (1,2,"BBB")因此,可能会出现复合(A,B,C)值的重复行(列D是值,列E是时间戳)。
我想要一个结果集,显示每个唯一(A,B,C)组合的最新行及其值。 因此,上述结果将如下所示: -
1, 2, "CCC", 62, 1404305705 1, 2, "AAA", 123, 1404305740 1, 2, "BBB", 91, 1404305778This is an old problem - looking for the best solution in Vertica. Imagine a table with columns:-
A, B, C, D, EColumns A-D are ints or varchars and column E is a timestamptz column that has a default value of GETUTCDATE().
Sample content of the table:-
1, 2, "AAA", 4, 1404305559 1, 2, "BBB", 23, 1404305633 1, 2, "CCC", 62, 1404305705 <-- the max entry for (1,2,"CCC") 1, 2, "AAA", 123, 1404305740 <-- the max entry for (1,2,"AAA") 1, 2, "BBB", 91, 1404305778 <-- the max entry for (1,2,"BBB")So potentially there are repeating rows for the composite (A,B,C) value (with column D being a value and column E the timestamp).
I'd like a resultset that showed, for each unique (A,B,C) combination, the latest row and its value. Hence the resultset for the above would look like:-
1, 2, "CCC", 62, 1404305705 1, 2, "AAA", 123, 1404305740 1, 2, "BBB", 91, 1404305778最满意答案
我们来设置示例数据:
CREATE TABLE public.test ( A int, B int, C varchar, D int, E int ); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'AAA', 4, 1404305559); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'BBB', 23, 1404305633); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'CCC', 62, 1404305705); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'AAA', 123, 1404305740); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'BBB', 91, 1404305778); COMMIT;我们将使用RANK函数根据A, B, C对每行进行排序A, B, C然后对E排序,并仅返回顶部的行(排名为1)。
SELECT a.a, a.b, a.c, a.d, a.e FROM (SELECT a, b, c, d, e, RANK() OVER ( PARTITION BY a, b, c ORDER BY e DESC) AS rank FROM public.test) a WHERE a.rank = 1;这返回:
A | B | C | D | E ---+---+-----+-----+------------ 1 | 2 | CCC | 62 | 1404305705 1 | 2 | AAA | 123 | 1404305740 1 | 2 | BBB | 91 | 1404305778Let's set up the sample data:
CREATE TABLE public.test ( A int, B int, C varchar, D int, E int ); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'AAA', 4, 1404305559); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'BBB', 23, 1404305633); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'CCC', 62, 1404305705); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'AAA', 123, 1404305740); INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'BBB', 91, 1404305778); COMMIT;We'll use the RANK function to rank each row based on A, B, C and sort on E and return only the rows that are at the top (have a rank of 1).
SELECT a.a, a.b, a.c, a.d, a.e FROM (SELECT a, b, c, d, e, RANK() OVER ( PARTITION BY a, b, c ORDER BY e DESC) AS rank FROM public.test) a WHERE a.rank = 1;This returns:
A | B | C | D | E ---+---+-----+-----+------------ 1 | 2 | CCC | 62 | 1404305705 1 | 2 | AAA | 123 | 1404305740 1 | 2 | BBB | 91 | 1404305778更多推荐
发布评论