Suppose I have a table of customers and a table of purchases. Each purchase belongs to one customer. I want to get a list of all customers along with their last purchase in one SELECT statement. What is the best practice? Any advice on building indexes?
Please use these table/column names in your answer:
- customer: id, name
- purchase: id, customer_id, item_id, date
And in more complicated situations, would it be beneficial, performance-wise, to denormalize the database by putting the last purchase into the customer table?
If the (purchase) id is guaranteed to be sorted by date, can the statements be simplified by using something like LIMIT 1?
Recommended answer
This is an example of the greatest-n-per-group problem, which comes up regularly on StackOverflow.
Here's how I usually recommend solving it:
    SELECT c.*, p1.*
    FROM customer c
    JOIN purchase p1 ON (c.id = p1.customer_id)
    LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND
        (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
    WHERE p2.id IS NULL;

Explanation: given a row p1, there should be no row p2 with the same customer and a later date (or, in the case of ties, a later id). When we find that to be true, p1 is the most recent purchase for that customer.
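To see the anti-join at work, here is a minimal sketch that runs the query on a few made-up rows. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for whatever RDBMS you use; the sample data and column types are hypothetical, and only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the actual RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO purchase VALUES
        (10, 1, 100, '2024-01-05'),
        (11, 1, 101, '2024-03-01'),
        (12, 2, 102, '2024-02-10'),
        (13, 2, 103, '2024-02-10');
""")

# Same logic as the query in the answer, with explicit columns so the
# two id columns stay distinguishable in the result tuples.
LAST_PURCHASE_SQL = """
    SELECT c.id, c.name, p1.id, p1.item_id, p1.date
    FROM customer c
    JOIN purchase p1 ON c.id = p1.customer_id
    LEFT OUTER JOIN purchase p2
      ON c.id = p2.customer_id
     AND (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id))
    WHERE p2.id IS NULL
    ORDER BY c.id
"""

rows = conn.execute(LAST_PURCHASE_SQL).fetchall()
for row in rows:
    print(row)
```

Bob's two purchases share a date, so the tie is broken by the higher id (13). Carol has no purchases and does not appear, because the first join is an inner join; if customers without purchases must be listed too, changing that join to a LEFT OUTER JOIN should keep them with NULL purchase columns.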
Regarding indexes, I'd create a compound index in purchase over the columns (customer_id, date, id). That may allow the outer join to be done using a covering index. Be sure to test on your platform, because optimization is implementation-dependent. Use the features of your RDBMS to analyze the optimization plan, e.g. EXPLAIN on MySQL.
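As a rough illustration of that workflow, the sketch below creates the compound index and asks the database for its plan. It assumes SQLite, whose counterpart to MySQL's EXPLAIN is EXPLAIN QUERY PLAN; the index name is made up, and the plan text differs between engines and versions, so treat the output as something to inspect rather than a guaranteed result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        item_id INTEGER,
        date TEXT
    );
    -- Compound index from the answer; the index name is hypothetical.
    CREATE INDEX idx_purchase_customer_date_id
        ON purchase (customer_id, date, id);
""")

# Ask SQLite how it intends to execute the query, without running it.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT c.*, p1.*
    FROM customer c
    JOIN purchase p1 ON c.id = p1.customer_id
    LEFT OUTER JOIN purchase p2
      ON c.id = p2.customer_id
     AND (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id))
    WHERE p2.id IS NULL
""").fetchall()

# The last column of each plan row is a human-readable description.
for step in plan:
    print(step[-1])

# Confirm the index exists in the schema catalog.
index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```

Whether the optimizer actually picks the index depends on the engine and on table statistics, so it is worth repeating the check with realistic data volumes.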
Some people use subqueries instead of the solution I show above, but I find my solution makes it easier to resolve ties.
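For comparison, here is one common subquery formulation; it is an assumption about what such alternatives look like, not something spelled out in the answer. It uses a correlated subquery with ORDER BY and LIMIT 1, which SQLite, MySQL and PostgreSQL accept but some other systems do not. The sample rows are hypothetical, and the sketch checks that both approaches agree on them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        item_id INTEGER,
        date TEXT
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO purchase VALUES
        (10, 1, 100, '2024-01-05'),
        (11, 1, 101, '2024-03-01'),
        (12, 2, 102, '2024-02-10'),
        (13, 2, 103, '2024-02-10');
""")

# Join-based query from the answer (explicit columns for readability).
JOIN_SQL = """
    SELECT c.id, c.name, p1.id, p1.item_id, p1.date
    FROM customer c
    JOIN purchase p1 ON c.id = p1.customer_id
    LEFT OUTER JOIN purchase p2
      ON c.id = p2.customer_id
     AND (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id))
    WHERE p2.id IS NULL
    ORDER BY c.id
"""

# Subquery alternative: pick the newest purchase id per customer.
# Ties on date are resolved by the secondary sort on id.
SUBQUERY_SQL = """
    SELECT c.id, c.name, p.id, p.item_id, p.date
    FROM customer c
    JOIN purchase p ON p.id = (
        SELECT p2.id
        FROM purchase p2
        WHERE p2.customer_id = c.id
        ORDER BY p2.date DESC, p2.id DESC
        LIMIT 1
    )
    ORDER BY c.id
"""

join_rows = conn.execute(JOIN_SQL).fetchall()
subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
print(join_rows == subquery_rows)
```

In this form the tie-breaking lives in the ORDER BY clause. Subquery variants built on MAX(date) alone have no such secondary sort, so they can return more than one row per customer when dates collide.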