查找某年获奖次数最多的电影

编程入门行业动态更新时间:2024-10-26 10:37:54

本文介绍了查找某年获奖次数最多的电影-代码重复的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！问题描述

我正尝试编写一个查询（PostgreSQL）以获取 2012年获奖次数最多的电影。

I am trying to write a query (PostgreSQL) to get "Movies with highest number of awards in year 2012."

我有以下表格：

CREATE TABLE Award( ID_AWARD bigserial CONSTRAINT Award_pk PRIMARY KEY, award_name VARCHAR(90), category VARCHAR(90), award_year integer, CONSTRAINT award_unique UNIQUE (award_name, category, award_year)); CREATE TABLE AwardWinner( ID_AWARD integer, ID_ACTOR integer, ID_MOVIE integer, CONSTRAINT AwardWinner_pk PRIMARY KEY (ID_AWARD));

然后我写了以下查询，给出了正确的结果，但是我有很多代码重复

And I written following query, which gives correct results, but there's is quite a lot of code duplication I think.

select * from (select id_movie, count(id_movie) as awards from Award natural join awardwinner where award_year = 2012 group by id_movie) as SUB where awards = (select max(count) from (select id_movie, count(id_movie) from Award natural join awardwinner where award_year = 2012 group by id_movie) as SUB2);

所以 SUB 和 SUB2 是完全相同的子查询。有更好的方法吗？

So SUB and SUB2 are exactly the same subquery. Is there a better way to do this?

推荐答案

获取所有获奖影片

Get all winning movies

SELECT id_movie, awards FROM ( SELECT aw.id_movie, count(*) AS awards ,rank() OVER (ORDER BY count(aw.id_movie) DESC) AS rnk FROM award a JOIN awardwinner aw USING (id_award) WHERE a.award_year = 2012 GROUP BY aw.id_movie ) sub WHERE rnk = 1;

要点

这应该比到目前为止的建议更简单，更快捷。使用 EXPLAIN ANALYZE 进行测试。

在某些情况下，CTE有助于避免代码重复。但是现在还不行：子查询可以很好地完成工作，而且通常更快。

There are cases where CTEs are instrumental to avoid code duplication. But not in this time: a subquery does the job just fine and is usually faster.

您可以在同一查询上的聚合函数上运行窗口函数水平。这就是它起作用的原因：

You can run a window function OVER an aggregate function on the same query level. That's why this works:
rank() OVER (ORDER BY count(aw.id_movie) DESC) AS rnk
我建议在JOIN条件下使用显式列名，而不要使用 NATURAL JOIN ，如果以后更改/向基础表中添加列，则很容易损坏。使用使用
I'd suggest to use explicit column names in the JOIN condition instead of NATURAL JOIN, which is prone to breakage if you later change / add columns to the underlying tables. The JOIN condition with USING is almost as short, but doesn't break as easily.

因为几乎是短的，但是却不那么容易破解。因为 id_movie 不能为NULL （由JOIN条件以及pk的一部分决定），使用 count（*）较短，但速度稍快。结果相同。

Since id_movie cannot be NULL (ruled out by the JOIN condition and also part of the pk) it is shorter ans slightly faster to use count(*) instead. Same result.

更快，但是，如果您只需要一个赢家：

Shorter and faster, yet, if you only need one winner:
SELECT aw.id_movie, count(*) AS awards FROM award a JOIN awardwinner aw USING (id_award) WHERE a.award_year = 2012 GROUP BY 1 ORDER BY 2 DESC, 1 -- as tie breaker LIMIT 1
使用位置引用（ 1 ， 2 ）作为速记。我向 id_movie 添加了 ORDER BY 作为决胜局，以防多部电影获胜。

Using positional references (1, 2) here as shorthand. I added id_movie to ORDER BY as tie breaker in case multiple movies should qualify for the win.