MySQL:将表连接到自身,消除重复的行(MySQL: Joining a table to itself, eliminating duplicate rows)

编程入门 行业动态 更新时间:2024-10-28 10:33:18
MySQL:将表连接到自身,消除重复的行(MySQL: Joining a table to itself, eliminating duplicate rows)

我有一个将表连接到自身的查询。 结果包含重复的行(排序)。 此查询的目的是生成最常购买的产品列表。 考虑这个查询:

SELECT o1.ITEM ,o2.ITEM as ITEM2 ,o3.ITEM AS ITEM3 ,count(DISTINCT o1.ORDERNUM) as oCount FROM orders o1 INNER JOIN orders o2 ON o2.ORDERNUM = o1.ORDERNUM AND o2.ITEM != o1.ITEM LEFT OUTER JOIN orders o3 ON o3.ORDERNUM = o1.ORDERNUM AND o3.ITEM != o2.ITEM AND o3.ITEM != o1.ITEM GROUP BY o1.ITEM, o2.ITEM, o3.ITEM ORDER BY oCount DESC

前12个结果:

+-------------+-------------+-------------+--------+ | ITEM | ITEM2 | ITEM3 | oCount | +-------------+-------------+-------------+--------+ | 02B13.04.GP | 77A04.10 | 45A04.04.GP | 54 | | 02B13.04.GP | 45A04.04.GP | 77A04.10 | 54 | | 77A04.10 | 45A04.04.GP | 02B13.04.GP | 54 | | 45A04.04.GP | 02B13.04.GP | 77A04.10 | 54 | | 77A04.10 | 02B13.04.GP | 45A04.04.GP | 54 | | 45A04.04.GP | 77A04.10 | 02B13.04.GP | 54 | | 57B01.01.GP | 57B01.11.GP | 57B01.10.GP | 12 | | 57B01.10.GP | 57B01.11.GP | 57B01.01.GP | 12 | | 57B01.01.GP | 57B01.10.GP | 57B01.11.GP | 12 | | 57B01.10.GP | 57B01.01.GP | 57B01.11.GP | 12 | | 57B01.11.GP | 57B01.10.GP | 57B01.01.GP | 12 | | 57B01.11.GP | 57B01.01.GP | 57B01.10.GP | 12 |

请注意,前6个结果是相同的连接,顺序不同。 第二个6个结果具有相同的问题(整个结果都是如此)。 我的目标是为每个项目组创建一条记录,而不是每个项目组的每个组合都有一行。

我怎样才能避免这些重复的结果呢?

此外,对于这种查询的更有效方法的任何建议都是受欢迎的(我想添加一个额外的连接,但是有1,000,000个订单,资源需求已经失控)。

================================================

编辑:回答Darshan的问题

你可以分享表结构:

该表包含所有订单的行。 如果订单包含多个产品,则每个产品将有一行(给定订单的多行)。 此查询中唯一关注的列是:

ORDERNUM CHAR : Order Number ITEM CHAR : SKU for the item QTY INT : Quantity purchased ORDDATE DATETIME : Order Date

结果返回:我需要的就是我在上面的结果示例中列出的内容。 目标是获得最常购买的产品列表。

I have a query that connects a table to itself. The results contain duplicate rows (sort of). The objective of this query is to produce a list of products most frequently purchased together. Consider this query:

SELECT o1.ITEM ,o2.ITEM as ITEM2 ,o3.ITEM AS ITEM3 ,count(DISTINCT o1.ORDERNUM) as oCount FROM orders o1 INNER JOIN orders o2 ON o2.ORDERNUM = o1.ORDERNUM AND o2.ITEM != o1.ITEM LEFT OUTER JOIN orders o3 ON o3.ORDERNUM = o1.ORDERNUM AND o3.ITEM != o2.ITEM AND o3.ITEM != o1.ITEM GROUP BY o1.ITEM, o2.ITEM, o3.ITEM ORDER BY oCount DESC

And the first 12 results:

+-------------+-------------+-------------+--------+ | ITEM | ITEM2 | ITEM3 | oCount | +-------------+-------------+-------------+--------+ | 02B13.04.GP | 77A04.10 | 45A04.04.GP | 54 | | 02B13.04.GP | 45A04.04.GP | 77A04.10 | 54 | | 77A04.10 | 45A04.04.GP | 02B13.04.GP | 54 | | 45A04.04.GP | 02B13.04.GP | 77A04.10 | 54 | | 77A04.10 | 02B13.04.GP | 45A04.04.GP | 54 | | 45A04.04.GP | 77A04.10 | 02B13.04.GP | 54 | | 57B01.01.GP | 57B01.11.GP | 57B01.10.GP | 12 | | 57B01.10.GP | 57B01.11.GP | 57B01.01.GP | 12 | | 57B01.01.GP | 57B01.10.GP | 57B01.11.GP | 12 | | 57B01.10.GP | 57B01.01.GP | 57B01.11.GP | 12 | | 57B01.11.GP | 57B01.10.GP | 57B01.01.GP | 12 | | 57B01.11.GP | 57B01.01.GP | 57B01.10.GP | 12 |

Note that the first 6 results are the same connections, in a different order. The second 6 results have the same issue (and this continues throughout the results). My goal is to have a single record for each item group, not a single row for each combination of each item group.

How can I avoid these repeated results?

Also any advice on a more efficient approach to this query would be welcome (I'd like to add an additional join, but with 1,000,000 orders the resource requirements are getting out of hand).

================================================

EDIT: To answer Darshan's questions

Can you share the table structure:

The table contains the lines for all the orders. If an order contains multiple products, there will be a line for each product (multiple lines for a given order). The only columns of concern in this query are:

ORDERNUM CHAR : Order Number ITEM CHAR : SKU for the item QTY INT : Quantity purchased ORDDATE DATETIME : Order Date

Results returned: All I need is what I listed in the result sample above. The objective is to get a list of the products that are purchased together the most often.

最满意答案

你想要做的是消除重复的行,无论位置如何; 一招,因为你总是拥有所有项目组合,就是根据一个谓词item1 <item2 <item3来过滤结果

这是一个可能的解决方案:

SELECT a.item, b.item, c.item, count(*) from `orders` a left join orders b on a.ordernum = b.ordernum and a.item <> b.item left join orders c on a.ordernum = c.ordernum and a.item <> c.item and b.item <> c.item where a.item < b.item and b.item < c.item group by a.item, b.item, c.item order by count(*) desc

What you want to do is to eliminate duplicated rows regardless of the position; one trick, since you always have all the combinations of items is to filter the results according to a predicate that says item1 < item2 < item3

Here is a possible solution:

SELECT a.item, b.item, c.item, count(*) from `orders` a left join orders b on a.ordernum = b.ordernum and a.item <> b.item left join orders c on a.ordernum = c.ordernum and a.item <> c.item and b.item <> c.item where a.item < b.item and b.item < c.item group by a.item, b.item, c.item order by count(*) desc

更多推荐

本文发布于:2023-07-21 23:20:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1214883.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:连接到   Joining   MySQL   table   rows

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!