I am trying to calculate an aggregate function for a field for a subset of rows in a table. The problem is that I'd like to find the mean of every combination of rows taken k at a time --- so for all the rows, I'd like to find (say) the mean of every combination of 10 rows. So:
    id | count
    ----|------
    1  | 5
    2  | 3
    3  | 6
    ...
    30 | 16

should give me the mean of ids 1..10; ids 1, 3..11; ids 1, 4..12, and so on. I know this will yield a lot of rows.
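To put a number on "a lot of rows", here is a quick sketch in Python. It only counts the combinations for the 30-row, k = 10 case described above; the variable names are illustrative and not part of the original question.

```python
from math import comb

n_rows = 30  # rows in the example table (ids 1..30)
k = 10       # rows taken per combination

# Number of distinct k-row subsets, i.e. the binomial coefficient C(30, 10)
n_combinations = comb(n_rows, k)
print(n_combinations)  # 30045015
```

So any approach that materializes every combination has to produce roughly 30 million result rows for this input.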
There are SO answers for finding combinations from arrays. I could do this programmatically by taking the 30 ids 10 at a time and then SELECTing them. Is there a way to do this with PARTITION BY, TABLESAMPLE, or another function (something like Python's itertools.combinations())? (As far as I can tell, TABLESAMPLE by itself won't guarantee which subset of rows I am selecting.)
Recommended answer

The method described in the cited answer is static. A more convenient solution may be to use recursion.
Example data:

    drop table if exists my_table;
    create table my_table(id int primary key, number int);
    insert into my_table values
        (1, 5), (2, 3), (3, 6), (4, 9), (5, 2);

Query which finds the 2-element subsets of a 5-element set (k-combinations with k = 2):
    with recursive recur as (
        select id, array[id] as combination, array[number] as numbers, number as sum
        from my_table
        union all
        select t.id, combination || t.id, numbers || t.number, sum + number
        from my_table t
        join recur r on r.id < t.id
        and cardinality(combination) < 2              -- param k
    )
    select combination, numbers, sum/2.0 as average   -- param k
    from recur
    where cardinality(combination) = 2                -- param k

     combination | numbers |      average
    -------------+---------+--------------------
     {1,2}       | {5,3}   | 4.0000000000000000
     {1,3}       | {5,6}   | 5.5000000000000000
     {1,4}       | {5,9}   | 7.0000000000000000
     {1,5}       | {5,2}   | 3.5000000000000000
     {2,3}       | {3,6}   | 4.5000000000000000
     {2,4}       | {3,9}   | 6.0000000000000000
     {2,5}       | {3,2}   | 2.5000000000000000
     {3,4}       | {6,9}   | 7.5000000000000000
     {3,5}       | {6,2}   | 4.0000000000000000
     {4,5}       | {9,2}   | 5.5000000000000000
    (10 rows)

The same query for k = 3 gives:
     combination | numbers |      average
    -------------+---------+--------------------
     {1,2,3}     | {5,3,6} | 4.6666666666666667
     {1,2,4}     | {5,3,9} | 5.6666666666666667
     {1,2,5}     | {5,3,2} | 3.3333333333333333
     {1,3,4}     | {5,6,9} | 6.6666666666666667
     {1,3,5}     | {5,6,2} | 4.3333333333333333
     {1,4,5}     | {5,9,2} | 5.3333333333333333
     {2,3,4}     | {3,6,9} | 6.0000000000000000
     {2,3,5}     | {3,6,2} | 3.6666666666666667
     {2,4,5}     | {3,9,2} | 4.6666666666666667
     {3,4,5}     | {6,9,2} | 5.6666666666666667
    (10 rows)

Of course, you can remove numbers from the query if you do not need them.
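For comparison, the client-side route mentioned in the question can be sketched in Python with itertools.combinations. This is not the recursive query itself, only a way to cross-check its output on the same five sample rows; the function name combination_means and the hard-coded rows list are assumptions made for illustration.

```python
from itertools import combinations

# Same sample data as my_table above, as (id, number) pairs
rows = [(1, 5), (2, 3), (3, 6), (4, 9), (5, 2)]

def combination_means(rows, k):
    """Return (ids, numbers, average) for every k-row combination, ordered by id."""
    result = []
    for combo in combinations(sorted(rows), k):
        ids = tuple(r[0] for r in combo)
        numbers = tuple(r[1] for r in combo)
        result.append((ids, numbers, sum(numbers) / k))
    return result

for ids, numbers, average in combination_means(rows, 2):
    print(ids, numbers, average)
```

With k = 2 this yields the same ten combinations and averages as the query, starting with ids (1, 2) and average 4.0. The SQL version keeps the work inside the database, which matters more as the table and k grow.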