将查询中的单个列与多列聚合

编程入门行业动态更新时间:2024-10-26 20:32:04

本文介绍了将查询中的单个列与多列聚合的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！问题描述

当查询中有许多其他列时，是否有适当的方法来聚合单个列?

Is there a proper way to aggregate a single column when I have many other columns in the query?

我试过这个答案有效，但我的查询变得更加冗长.

I've tried this answer which works, but my query has become a lot more verbose.

我当前的查询如下所示:

My current query looks like this:

SELECT t1.foo1, t1.foo2, t2.foo3, t2.foo4, string_agg(t3.aggregated_field, ', ') FROM tbl1 t1 LEFT JOIN tbl2 t2 ON t1.id = t2.fkeyid LEFT JOIN tbl3 t3 ON t2.id = t3.fkeyid GROUP BY t1.foo1, t1.foo2, t2.foo3, t2.foo4, t2.foo5, t2.foo6 ORDER BY t2.foo5, t2.foo6

查询有更多的字段和LEFT JOIN，重要的部分是所有这些字段都有 1 到 1 或 1 到 0 的关系，除了我想要的 1 到 n 的一个字段聚合，由上面伪查询中的 t3.aggregated_field 表示.

The query has many more fields and LEFT JOINs, the important part is that all these fields have 1 to 1 or 1 to 0 relationship except one field that is 1 to n which I want to aggregate, represented by t3.aggregated_field in the pseudo-query above.

当我使用聚合函数时，SELECT 和 ORDER BY 中列出的所有字段必须是聚合的或 GROUP BY 的一部分代码> 子句.这使我的查询方式比现在更加冗长.

As I'm using an aggregate function, all fields listed in the SELECT and ORDER BY must be either aggregated or part of the GROUP BY clause. This makes my query way more verbose than it already is.

即假设foo1为主键，当该字段重复时，除aggregated_field外的其他字段也相等.我希望这些重复的行作为带有聚合字段值的单行结果.(基本上是带有聚合列的 select distinct)

That is, assuming foo1 is a primary key, when this field is repeated, all others except aggregated_field are also equal. I want these repeated rows as a single row result with the aggregated field value. (basically a select distinct with an aggregated column)

是否有更好的方法来做到这一点(无需将所有其他字段放在 GROUP BY 中)，或者我应该在后端迭代结果集，为每个字段执行查询获取这个 1 到 n 关系的行?

Is there a better way to do this (without having to put all other fields in the GROUP BY) or should I just iterate over the result set in my back-end executing a query for each row fetching this 1 to n relationship?

服务器正在运行 PostgreSQL 9.1.9，更具体地说:

The server is running PostgreSQL 9.1.9, more specifically:

PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu，由 gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54) 编译，64 位

PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit

推荐答案

简单查询

使用 PostgreSQL 9.1 或更高版本，这可以简单得多.正如在这个密切相关的答案中所解释的:

Simple query

This can be much simpler with PostgreSQL 9.1 or later. As explained in this closely related answer:

PGError:错误:在对象及其 has_many 对象的 AR 查询的 WHERE 子句中不允许聚合

GROUP BY一个表的主键就足够了.由于:

It is enough to GROUP BY the primary key of a table. Since:

foo1 是主键

.. 您可以将示例简化为:

.. you can simplify your example to:

SELECT foo1, foo2, foo3, foo4, foo5, foo6, string_agg(aggregated_field, ', ') FROM tbl1 GROUP BY 1 ORDER BY foo7, foo8; -- have to be spelled out, since not in select list!

多表查询

但是，既然您已经:

Query with multiple tables

However, since you have:

更多的字段和 LEFT JOIN，重要的部分是所有这些字段都具有 1 到 1 或 1 到 0 的关系，除了我想要聚合的 1 到 n 的一个字段

many more fields and LEFT JOINs, the important part is that all these fields have 1 to 1 or 1 to 0 relationship except one field that is 1 to n which I want to aggregate

..先聚合，后加入应该更快更简单:

SELECT t1.foo1, t1.foo2, ... , t2.bar1, t2.bar2, ... , a.aggregated_col FROM tbl1 t1 LEFT JOIN tbl2 t2 ON ... ... LEFT JOIN ( SELECT some_id, string_agg(agg_col, ', ') AS aggregated_col FROM agg_tbl a ON ... GROUP BY some_id ) a ON a.some_id = ?.some_id ORDER BY ...

这样，您查询的大部分内容根本不需要聚合.

This way the big portion of your query does not need aggregation at all.

我最近在 SQL Fiddle 中提供了一个测试用例来证明这个相关答案中的观点:

I recently provided a test case in an SQL Fiddle to prove the point in this related answer: