Calculate the Max of the Sum of an annotated field over a grouped-by query in Django ORM?

Updated: 2024-10-28 22:23:29

Problem description

To keep it simple, I have four tables (A, B, Category and Relation). The Relation table stores the Intensity of A in B, and Category stores the type of B.

A <--- Relation ---> B ---> Category

(So the relation between A and B is n to n, while the relation between B and Category is n to 1.)

I need an ORM query that groups Relation records by Category and A, then calculates the Sum of Intensity in each (Category, A) group (this seems simple so far). Then I want to annotate each row with the Max of the calculated Sums within its Category.
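To make the intended result concrete, here is a minimal pure-Python sketch of the computation described above. It does not use Django; the sample rows and the B-to-Category mapping are made up for illustration, and the output key `max_intensity` is a hypothetical name for the final annotation.

```python
from collections import defaultdict

# Hypothetical sample rows of the Relation table: (a_id, b_id, intensity).
relations = [
    (1, 10, 5),
    (1, 11, 3),
    (2, 10, 4),
    (2, 12, 7),
    (1, 12, 2),
]
# Hypothetical mapping from each B to its Category (n to 1).
category_of_b = {10: "x", 11: "x", 12: "y"}

# Step 1: Sum of Intensity per (category, a) group.
ac_sum = defaultdict(int)
for a_id, b_id, intensity in relations:
    ac_sum[(category_of_b[b_id], a_id)] += intensity

# Step 2: Max of those sums within each category.
max_per_category = {}
for (category, _a_id), total in ac_sum.items():
    current = max_per_category.get(category, total)
    max_per_category[category] = max(current, total)

# Step 3: annotate every (category, a) row with its category's maximum.
rows = [
    {
        "category": category,
        "a": a_id,
        "AcSum": total,
        "max_intensity": max_per_category[category],
    }
    for (category, a_id), total in sorted(ac_sum.items())
]

for row in rows:
    print(row)
```

The second grouping step is the part that a single chained `annotate()` call cannot express, because it aggregates over values that are themselves aggregates.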

My code is something like:

A.objects.values('B_id').annotate(AcSum=Sum(Intensity)).annotate(Max(AcSum))

Which throws the error:

django.core.exceptions.FieldError: Cannot compute Max('AcSum'): 'AcSum' is an aggregate

The django-group-by package produces the same error.

For further information, please also see this Stack Overflow question.

I am using Django 2 and PostgreSQL.

Is there a way to achieve this using the ORM? If not, what would be the solution using a raw SQL expression?

Update

After a lot of struggling, I found out that what I wrote was indeed an aggregation; what I actually want is the maximum AcSum over the As in each category. So I suppose I have to group the result once more after the AcSum calculation. Based on this insight, I found a Stack Overflow question that asks about the same concept (it was asked 1 year and 2 months ago and has no accepted answer).

Chaining another values('id') to the queryset works neither as a group-by nor as a filter for the output attributes; it removes AcSum from the set. Adding AcSum to values() is also not an option, because it changes the grouped-by result set.

I think what I am trying to do is regroup the grouped-by query based on the fields inside a column (i.e. id). Any thoughts?

Solution

You can't compute an aggregate of an aggregate such as Max(Sum()); it's not valid in SQL, whether you're using the ORM or not. Instead, you have to join the table to itself to find the maximum, which you can do with a subquery. The code below looks right to me, but keep in mind that I don't have anything to run it on, so it might not be perfect.

from django.db.models import OuterRef, Q, Subquery, Sum

annotation = {
    'AcSum': Sum('intensity')
}
# The basic query is on Relation grouped by A and Category, annotated
# with the Sum of intensity
query = Relation.objects.values('a', 'b__category').annotate(**annotation)
# The subquery is joined to the outer query on the Category
sub_filter = Q(b__category=OuterRef('b__category'))
# The subquery is grouped by A and Category and annotated with the Sum
# of intensity, which is then ordered descending so that when a LIMIT 1
# is applied, you get the Max.
subquery = (
    Relation.objects.filter(sub_filter)
    .values('a', 'b__category')
    .annotate(**annotation)
    .order_by('-AcSum')
    .values('AcSum')[:1]
)
query = query.annotate(max_intensity=Subquery(subquery))

This should generate SQL like:

SELECT a_id, category_id,
       (SELECT SUM(U0.intensity) AS AcSum
        FROM RELATION U0
        JOIN B U1 on U0.b_id = U1.id
        WHERE U1.category_id = B.category_id
        GROUP BY U0.a_id, U1.category_id
        ORDER BY SUM(U0.intensity) DESC
        LIMIT 1
       ) AS max_intensity
FROM Relation
JOIN B on Relation.b_id = B.id
GROUP BY Relation.a_id, B.category_id
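To check the shape of this SQL without a Django project, the following sketch runs a query of the same form with Python's built-in sqlite3 module. The table layout and sample rows are assumptions made for illustration (the real schema is not shown in the question), the outer query also selects the per-group sum for comparison, and the backend here is SQLite rather than PostgreSQL. It first confirms that a nested aggregate is rejected by the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; only the columns used by the query exist.
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE relation (
        id INTEGER PRIMARY KEY,
        a_id INTEGER,
        b_id INTEGER REFERENCES b(id),
        intensity INTEGER
    );
    INSERT INTO b (id, category_id) VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO relation (a_id, b_id, intensity) VALUES
        (1, 10, 5), (1, 11, 3), (2, 10, 4), (2, 12, 7), (1, 12, 2);
""")

# An aggregate of an aggregate is refused at the SQL level.
try:
    conn.execute("SELECT MAX(SUM(intensity)) FROM relation GROUP BY a_id")
    nested_aggregate_ok = True
except sqlite3.OperationalError:
    nested_aggregate_ok = False

# Correlated subquery: for each outer (a, category) group, recompute the
# per-A sums inside the same category and keep the largest one.
sql = """
    SELECT r.a_id, b.category_id, SUM(r.intensity) AS AcSum,
           (SELECT SUM(u0.intensity)
            FROM relation u0
            JOIN b u1 ON u0.b_id = u1.id
            WHERE u1.category_id = b.category_id
            GROUP BY u0.a_id, u1.category_id
            ORDER BY SUM(u0.intensity) DESC
            LIMIT 1) AS max_intensity
    FROM relation r
    JOIN b ON r.b_id = b.id
    GROUP BY r.a_id, b.category_id
    ORDER BY b.category_id, r.a_id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)  # (a_id, category_id, AcSum, max_intensity)
```

Each output row carries both its own sum and the largest sum found in its category, which is the annotation the question asks for.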

It may be more performant to eliminate the join in the subquery by using a backend-specific feature such as array_agg (Postgres) or GROUP_CONCAT (MySQL) to collect the Relation ids that are grouped together in the outer query. But I don't know what backend you're using.
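Since the question also asks about raw SQL, one further option is a derived table: compute the per-(category, A) sums in an inner query and take the Max in an outer query. This is a different technique from the correlated subquery in the answer, and it returns one row per category instead of annotating every (A, category) row. The sketch below uses Python's built-in sqlite3 module with a hypothetical schema and sample data, not PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE relation (
        id INTEGER PRIMARY KEY,
        a_id INTEGER,
        b_id INTEGER REFERENCES b(id),
        intensity INTEGER
    );
    INSERT INTO b (id, category_id) VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO relation (a_id, b_id, intensity) VALUES
        (1, 10, 5), (1, 11, 3), (2, 10, 4), (2, 12, 7), (1, 12, 2);
""")

# The inner query produces one sum per (category, a); the outer query
# then aggregates those sums again, which is allowed because the inner
# result is treated as an ordinary table.
sql = """
    SELECT category_id, MAX(ac_sum) AS max_intensity
    FROM (
        SELECT b.category_id AS category_id,
               r.a_id AS a_id,
               SUM(r.intensity) AS ac_sum
        FROM relation r
        JOIN b ON r.b_id = b.id
        GROUP BY b.category_id, r.a_id
    ) AS per_a
    GROUP BY category_id
    ORDER BY category_id
"""
max_per_category = conn.execute(sql).fetchall()
print(max_per_category)
```

If the per-row annotation is still needed, the derived table can be joined back to the grouped Relation query on category_id.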

Published: 2023-11-22 17:28:51
Link: https://www.elefans.com/category/jswz/34/1618359.html