为什么此简单的联接查询与子查询相比显着更快?

编程入门行业动态更新时间:2024-10-26 15:15:31

本文介绍了为什么此简单的联接查询与子查询相比显着更快?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！问题描述

我有两个桌子. order_details是100,000行，而outbound是10,000行.

I have two tables. order_details which is 100,000 rows, and outbound which is 10,000 rows.

我需要将它们加入到名为order_number的列上，这两个列上都是VARCHAR(50).出站表中的order_number不是唯一的.

I need to join them on a column called order_number, which is a VARCHAR(50) on both. order_number is not unique in the outbound table.

CREATE TABLE `outbound` ( `outbound_id` int(12) NOT NULL, `order_number` varchar(50) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE `order_details` ( `order_details_id` int(12) NOT NULL, `order_number` varchar(50) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

这是我最初的查询，运行时间超过60秒:

This is my initial query, and it takes well over 60 seconds to run:

SELECT o.order_number FROM outbound o INNER JOIN order_details od ON o.order_number = od.order_number

此查询获得相同的结果，并且运行时间不到一秒钟:

This query gets the same results and takes less than a second to run:

SELECT o.order_number FROM outbound o INNER JOIN ( SELECT order_number FROM order_details ) od ON (o.order_number = od.order_number)

这使我感到惊讶，因为通常子查询的速度要慢得多.

This is surprising to me because usually sub-queries are significantly slower.

运行EXPLAIN(我仍在学习如何理解)显示，子查询版本使用derived2表，使用索引，并且该索引为auto_key0.我不够聪明，不知道如何解释这一点，以了解为什么这会产生重大影响.

Running EXPLAIN (which I'm still learning how to understand) shows that the sub query version uses a derived2 table, that it is using an index, and that index is auto_key0. I'm not savvy enough to know how to interpret this to understand why this makes a significant difference.

我正在通过命令行运行这些查询.

I am running these queries over command line.

我正在为Linux(x86_64)CentOS运行MySQL Ver 14.14 Distrib 5.6.35.

I am running MySQL Ver 14.14 Distrib 5.6.35, for Linux (x86_64) CentOS.

总结:

为什么这个简单的联接查询与子查询相比显着更快?

推荐答案

我对MySQL的了解非常有限.但这是我的想法:

My knowledge of MySQL is very limited. But these are my thoughts:

您的表没有索引. 然后，该联接必须读取整个第二张表以便对第一张表的每一行进行比较.

Your tables don't have indexes. Then the join has to read the entire second table in order to compare, for each row of the first table.

子查询一次读取第二张表并创建一个索引，然后它不需要为第一张表的每一行读取整个第二张表.它只需要检查索引，这会更快.

The subquery reads the second table once and creates an index, then it doesn't need to read the entire second table for each row of the first table. It only has to check the index, which is much more faster.

要验证我是否正确，请尝试在两个表中为order_number列创建索引(CREATE INDEX ...)，然后再次运行这两个查询.您的第一个查询只需要不到一秒钟的时间，而不是一分钟.

To verify if I'm ritght or not, try creating indexes for the column order_number in your two tables (CREATE INDEX ... ), and run again this two queries. Your first query should only take less than a second instead of a minute.