Grouping specific rows with GROUP BY

Updated: 2024-10-26 12:22:29

SCHEMA

I have the following setup in a MySQL database:

    CREATE TABLE items (
        id SERIAL,
        name VARCHAR(100),
        group_id INT,
        price DECIMAL(10,2),
        KEY items_group_id_idx (group_id),
        PRIMARY KEY (id)
    );

    INSERT INTO items VALUES
        (1, 'Item A', NULL, 10),
        (2, 'Item B', NULL, 20),
        (3, 'Item C', NULL, 30),
        (4, 'Item D', 1, 40),
        (5, 'Item E', 2, 50),
        (6, 'Item F', 2, 60),
        (7, 'Item G', 2, 70);

PROBLEM

I need to select:

All items whose group_id is NULL, and

The lowest-priced item from each group identified by group_id.

EXPECTED RESULTS

    +----+--------+----------+-------+
    | id | name   | group_id | price |
    +----+--------+----------+-------+
    |  1 | Item A |     NULL | 10.00 |
    |  2 | Item B |     NULL | 20.00 |
    |  3 | Item C |     NULL | 30.00 |
    |  4 | Item D |        1 | 40.00 |
    |  5 | Item E |        2 | 50.00 |
    +----+--------+----------+-------+

POSSIBLE SOLUTION 1: Two queries with UNION ALL

    SELECT id, name, group_id, price
    FROM items
    WHERE group_id IS NULL
    UNION ALL
    SELECT id, name, group_id, MIN(price)
    FROM items
    WHERE group_id IS NOT NULL
    GROUP BY group_id;

    /* EXPLAIN */
    +------+--------------+------------+------+--------------------+--------------------+---------+-------+------+----------------------------------------------+
    | id   | select_type  | table      | type | possible_keys      | key                | key_len | ref   | rows | Extra                                        |
    +------+--------------+------------+------+--------------------+--------------------+---------+-------+------+----------------------------------------------+
    | 1    | PRIMARY      | items      | ref  | items_group_id_idx | items_group_id_idx | 5       | const | 3    | Using where                                  |
    | 2    | UNION        | items      | ALL  | items_group_id_idx | NULL               | NULL    | NULL  | 7    | Using where; Using temporary; Using filesort |
    | NULL | UNION RESULT | <union1,2> | ALL  | NULL               | NULL               | NULL    | NULL  | NULL |                                              |
    +------+--------------+------------+------+--------------------+--------------------+---------+-------+------+----------------------------------------------+

However, having two queries is undesirable: the WHERE clause will contain more complex conditions, and I need to sort the final results.

POSSIBLE SOLUTION 2: GROUP BY on expression (reference)

    SELECT id, name, group_id, MIN(price)
    FROM items
    GROUP BY CASE WHEN group_id IS NOT NULL THEN group_id ELSE RAND() END;

    /* EXPLAIN */
    +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
    +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
    | 1  | SIMPLE      | items | ALL  | NULL          | NULL | NULL    | NULL | 7    | Using temporary; Using filesort |
    +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+

Solution 2 seems faster and simpler to use, but I'm wondering whether there is a better approach in terms of performance.

UPDATE:

According to the documentation referenced by @axiac, this query is illegal in SQL-92 and earlier and may work only in MySQL.
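
The restriction behind this note is that SQL-92 requires every selected column to appear in GROUP BY or inside an aggregate, and id and name in Solution 2 do neither. One standard-conforming way to pick the cheapest item per group is to aggregate in a derived table and join back to the base table. The sketch below is not from the original thread: it runs the query through Python's built-in sqlite3 module as a stand-in for MySQL, and the aliases i, m, and min_price are illustrative names.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id       INTEGER PRIMARY KEY,
        name     VARCHAR(100),
        group_id INT,
        price    DECIMAL(10,2)
    );
    INSERT INTO items VALUES
        (1, 'Item A', NULL, 10),
        (2, 'Item B', NULL, 20),
        (3, 'Item C', NULL, 30),
        (4, 'Item D', 1, 40),
        (5, 'Item E', 2, 50),
        (6, 'Item F', 2, 60),
        (7, 'Item G', 2, 70);
""")

# The derived table selects only the grouped column and an aggregate,
# so no non-aggregated column is left ambiguous.
cheapest = conn.execute("""
    SELECT i.id, i.name, i.group_id, i.price
    FROM items i
    JOIN (
        SELECT group_id, MIN(price) AS min_price
        FROM items
        WHERE group_id IS NOT NULL
        GROUP BY group_id
    ) m ON m.group_id = i.group_id AND m.min_price = i.price
    ORDER BY i.id
""").fetchall()

for row in cheapest:
    print(row)
```

Note that this form returns every item that ties for the lowest price in a group, and it leaves out the rows whose group_id is NULL, so it covers only the second half of the requirement.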
ANSWER
According to this answer by @axiac, a better solution in terms of compatibility and performance is shown below.

It is also explained in the SQL Antipatterns book, Chapter 15: Ambiguous Groups.

To improve performance, a composite index is also added on (group_id, price, id).
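
As a rough sketch of that index, the statement below uses a hypothetical name, items_group_price_id_idx, since the original post does not give one. It is executed through Python's built-in sqlite3 module as a stand-in for MySQL; this simple form of CREATE INDEX is written the same way in both.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        id       INTEGER PRIMARY KEY,
        name     VARCHAR(100),
        group_id INT,
        price    DECIMAL(10,2)
    )
""")

# Hypothetical index name; the column order follows the post: (group_id, price, id).
conn.execute(
    "CREATE INDEX items_group_price_id_idx ON items (group_id, price, id)"
)

# Read the index definition back from SQLite's catalog.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('items')")]
index_columns = [
    row[2] for row in conn.execute("PRAGMA index_info('items_group_price_id_idx')")
]
print(index_names, index_columns)
```

The column order is deliberate: group_id comes first because the join matches on it by equality, while price and id follow because they are only compared. With all three columns in the index, the lookup on the joined copy of the table can plausibly be answered from the index alone.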

SOLUTION

    SELECT a.id, a.name, a.group_id, a.price
    FROM items a
    LEFT JOIN items b
        ON a.group_id = b.group_id
        AND (a.price > b.price OR (a.price = b.price AND a.id > b.id))
    WHERE b.price IS NULL;

See the explanation of how it works for more details.

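By accident, as a side effect, this query also works in my case, where I need to include ALL records whose group_id is NULL and the lowest-priced item from each group. Rows with a NULL group_id never satisfy a.group_id = b.group_id, so they find no match in b and pass the b.price IS NULL filter.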
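
The NULL behaviour can be checked directly. The sketch below runs the same self-join against the sample data through Python's built-in sqlite3 module, used here as a stand-in for MySQL (an assumption; the original post ran it on MySQL). The ORDER BY a.id clause is added only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id       INTEGER PRIMARY KEY,
        name     VARCHAR(100),
        group_id INT,
        price    DECIMAL(10,2)
    );
    INSERT INTO items VALUES
        (1, 'Item A', NULL, 10),
        (2, 'Item B', NULL, 20),
        (3, 'Item C', NULL, 30),
        (4, 'Item D', 1, 40),
        (5, 'Item E', 2, 50),
        (6, 'Item F', 2, 60),
        (7, 'Item G', 2, 70);
""")

# Comparing NULL with NULL yields NULL (unknown), which a join condition
# treats as "no match"; Python receives it as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# A row from "a" survives only if no row "b" in the same group is cheaper
# (or equally priced with a smaller id).
rows = conn.execute("""
    SELECT a.id, a.name, a.group_id, a.price
    FROM items a
    LEFT JOIN items b
        ON a.group_id = b.group_id
        AND (a.price > b.price OR (a.price = b.price AND a.id > b.id))
    WHERE b.price IS NULL
    ORDER BY a.id
""").fetchall()

kept_ids = [row[0] for row in rows]
print(null_equals_null, kept_ids)
```

The a.id > b.id tie-breaker is what keeps a group down to a single row when two of its items share the lowest price.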

RESULT

    +----+--------+----------+-------+
    | id | name   | group_id | price |
    +----+--------+----------+-------+
    |  1 | Item A |     NULL | 10.00 |
    |  2 | Item B |     NULL | 20.00 |
    |  3 | Item C |     NULL | 30.00 |
    |  4 | Item D |        1 | 40.00 |
    |  5 | Item E |        2 | 50.00 |
    +----+--------+----------+-------+

EXPLAIN

    +----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
    | id | select_type | table | type | possible_keys                 | key                | key_len | ref                        | rows | Extra                    |
    +----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
    | 1  | SIMPLE      | a     | ALL  | NULL                          | NULL               | NULL    | NULL                       | 7    |                          |
    | 1  | SIMPLE      | b     | ref  | PRIMARY,id,items_group_id_idx | items_group_id_idx | 5       | agi_development.a.group_id | 1    | Using where; Using index |
    +----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+

Published: 2023-10-24 08:27:41

Article link: https://www.elefans.com/category/jswz/34/1523444.html
