What is the best way to delete duplicate values from a MySQL table?

Updated: 2024-10-23 09:30:02

Problem description

I have the following SQL to delete duplicate values from a table:

DELETE p1
FROM `ProgramsList` p1, `ProgramsList` p2
WHERE p1.CustId = p2.CustId
  AND p1.CustId = 1
  AND p1.`Id` > p2.`Id`
  AND p1.`ProgramName` = p2.`ProgramName`;

Id is auto-incremented.
For a given CustId, ProgramName must be unique (currently it is not).
The above SQL takes about 4 to 5 hours to complete with about 1,000,000 records.

Could anyone suggest a quicker way of deleting duplicates from a table?

Recommended answer

First, you might try adding indexes to the ProgramName and CustId fields if you don't already have them.
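As a minimal sketch of that suggestion, a single composite index covering both columns could be added as shown below. The index name idx_cust_program is a placeholder chosen for illustration. Whether one composite index or two separate indexes works better depends on the other queries that hit the table. If ProgramName is a TEXT column, MySQL also requires a prefix length on it in the index definition.

```sql
-- Hypothetical index name; CustId and ProgramName are the columns from the question.
-- A composite index lets MySQL find rows sharing the same CustId and ProgramName
-- without scanning the whole table for every candidate row.
ALTER TABLE ProgramsList
    ADD INDEX idx_cust_program (CustId, ProgramName);
```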

De-Duping

You can group your records to identify dupes and, as you do that, grab the minimum ID value for each group. Then just delete all records whose ID is not one of the MinIDs.
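Before deleting anything, the grouping step can be run on its own to preview what would be affected. This is a sketch only; the Copies alias is made up for illustration, and MinID is the id that would survive in each group.

```sql
-- Lists every (CustID, ProgramName) group that has more than one row,
-- together with the id that would be kept and the number of copies found.
SELECT CustID,
       ProgramName,
       MIN(id)  AS MinID,
       COUNT(*) AS Copies
FROM ProgramsList
GROUP BY CustID, ProgramName
HAVING COUNT(*) > 1;
```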

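In-Clause Method

MySQL rejects a subquery that reads directly from the table being deleted from (error 1093), so the grouped select is wrapped in a derived table here.

DELETE FROM ProgramsList
WHERE id NOT IN (
    SELECT MinID FROM (
        SELECT MIN(id) AS MinID
        FROM ProgramsList
        GROUP BY ProgramName, CustID
    ) AS keep_rows
);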

Join Method

You may have to run this more than once if there are many members per group, since each run removes only the row with the highest id from each duplicated group.

DELETE P
FROM ProgramsList AS P
INNER JOIN (
    SELECT COUNT(*) AS Count, MAX(id) AS MaxID
    FROM ProgramsList
    GROUP BY ProgramName, CustID
) AS A ON A.MaxID = P.id
WHERE A.Count >= 2;
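If repeated runs are inconvenient, one possible single-pass variant joins each row to the smallest id of its group and deletes every row with a larger id. This is not part of the original answer, only a sketch. It assumes neither ProgramName nor CustID contains NULLs, since NULL values never compare equal in the join and such rows would be left untouched.

```sql
-- Variant of the join method: keep the lowest id per (ProgramName, CustID)
-- and delete all other rows of the group in one statement.
DELETE P
FROM ProgramsList AS P
INNER JOIN (
    SELECT ProgramName, CustID, MIN(id) AS MinID
    FROM ProgramsList
    GROUP BY ProgramName, CustID
) AS A
   ON A.ProgramName = P.ProgramName
  AND A.CustID = P.CustID
WHERE P.id > A.MinID;
```

Trying it on a copy of the table first is a sensible precaution, because the delete cannot be undone outside a transaction.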

Some people have performance issues with the In-Clause method and some don't. It depends a lot on your indexes and such. If one is too slow, try the other.
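One way to get a hint about which case applies is to ask MySQL for the query plan of the grouping step that both methods share. This is a small sketch; the output layout differs between MySQL versions, so no sample output is shown.

```sql
-- Shows how MySQL intends to execute the grouping used by both methods.
EXPLAIN
SELECT MIN(id) AS MinID
FROM ProgramsList
GROUP BY ProgramName, CustID;
```

Roughly speaking, a plan that uses an index on the grouped columns is a good sign, while a full table scan with a temporary table on a large table suggests adding the index first.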

Related: stackoverflow/a/4192849/127880

Published: 2023-11-22 03:26:44
Link: https://www.elefans.com/category/jswz/34/1615800.html