I have a table in my MySQL database that has no index or primary key. I cannot change the schema of the table (it is not "my" table). Because the table stores data that arrives in intervals, it contains duplicates.
For example:
+------------+-----------+------+------+--------+--------+
| first_seen | last_seen | type | name | hitcnt | data   |
+------------+-----------+------+------+--------+--------+
| 15:12:02   | 16:02:32  | 5    | foo  | 3      | difank |
| 19:52:23   | 22:06:20  | 5    | foo  | 4      | difank |
+------------+-----------+------+------+--------+--------+

Now I would like to "reduce" this to:
+------------+-----------+------+------+--------+--------+
| first_seen | last_seen | type | name | hitcnt | data   |
+------------+-----------+------+------+--------+--------+
| 15:12:02   | 22:06:20  | 5    | foo  | 7      | difank |
+------------+-----------+------+------+--------+--------+

And I would like to do this "in situ" (i.e. in place) if possible.
Using GROUP BY, MIN(), MAX(), etc. I can write a query that returns exactly what I want to end up with:
SELECT MIN(first_seen), MAX(last_seen), type, name, SUM(hitcnt) AS hit, data
FROM <table>
GROUP BY type, name, data
ORDER BY hit DESC, type;

The question is: how can I replace the existing data (efficiently) with the result of that query?
Do I have to use a temporary table (i.e. move the data to a temporary table, truncate the existing table, and SELECT INTO from the temporary table)? Can I do this in a transaction (to prevent data loss if something goes wrong)? Are there other (better?) options than a temporary table?

Best answer
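If <table> is the same table as table_name, the SELECT has to run before the table is emptied; reading from a table that has just been truncated returns no rows. Save the aggregated result first (tmp_merged is a placeholder name), then truncate and reload:

CREATE TEMPORARY TABLE tmp_merged AS
SELECT MIN(first_seen), MAX(last_seen), type, name, SUM(hitcnt) AS hit, data
FROM <table>
GROUP BY type, name, data
ORDER BY hit DESC, type;

TRUNCATE TABLE table_name;

INSERT INTO table_name (column1.....)
SELECT * FROM tmp_merged;

Make sure the number of columns in the INSERT and SELECT statements matches.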
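On the transaction part of the question: in MySQL, TRUNCATE TABLE causes an implicit commit and cannot be rolled back, so a rollback-safe variant would use DELETE instead (on a transactional engine such as InnoDB). The following is only a minimal sketch of that idea, not a definitive implementation. It runs the statements against an in-memory SQLite database as a stand-in for MySQL so that it is self-contained; the table name `hits`, the temporary table `merged`, and the helper `collapse_duplicates` are hypothetical names chosen for illustration.

```python
import sqlite3


def collapse_duplicates(conn, table):
    """Replace the rows of `table` with one merged row per (type, name, data).

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        # Snapshot the aggregated rows before touching the original table.
        cur.execute(
            "CREATE TEMPORARY TABLE merged AS "
            "SELECT MIN(first_seen) AS first_seen, MAX(last_seen) AS last_seen, "
            "       type, name, SUM(hitcnt) AS hitcnt, data "
            f"FROM {table} GROUP BY type, name, data"
        )
        # DELETE rather than TRUNCATE, so the step can still be rolled back.
        cur.execute(f"DELETE FROM {table}")
        cur.execute(
            f"INSERT INTO {table} "
            "(first_seen, last_seen, type, name, hitcnt, data) "
            "SELECT first_seen, last_seen, type, name, hitcnt, data FROM merged"
        )
        # SQLite syntax; in MySQL this would be DROP TEMPORARY TABLE merged.
        cur.execute("DROP TABLE merged")
        cur.execute("COMMIT")
    except Exception:
        # Any failure restores the original rows.
        cur.execute("ROLLBACK")
        raise


# isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE hits (first_seen TEXT, last_seen TEXT, type INTEGER, "
    "name TEXT, hitcnt INTEGER, data TEXT)"
)
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("15:12:02", "16:02:32", 5, "foo", 3, "difank"),
        ("19:52:23", "22:06:20", 5, "foo", 4, "difank"),
    ],
)

collapse_duplicates(conn, "hits")
rows = conn.execute("SELECT * FROM hits").fetchall()
print(rows)  # [('15:12:02', '22:06:20', 5, 'foo', 7, 'difank')]
```

The sketch does not guard against new rows arriving while it runs; if the table is written to concurrently, some form of locking would be needed on top of this.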