我有两个表- books 和 images . books 具有诸如 id , name , releasedate , purchasecount 之类的列. images 具有 bookid (与书籍中的 id 相同,基本上一本书可以包含多张图像.尽管我没有设置任何其他键约束), bucketid , poster (每个记录都指向某个存储桶中某个 bookid 的图像文件).
表架构:
我的查询被命名,从名称与该名称匹配的books表中找到前十本书(按 purchasecount 的数量排序),并针对该书返回任何一本(最好是首先)从 images 表中记录( 很明显,这可以通过运行两个查询来解决,方法是运行第一个查询,并使用其结果查询图像表,但这会很慢,因此我想使用"join"和子查询一次性完成.但是,我所尝试的并没有给我正确的结果: 有人可以在这里建议一个最佳查询来获取所需的结果集(包括任何建议来更改表架构以缩短搜索时间)吗? 更新的提琴:
结果
+ ---- + --------------- ++ -------------------------------- +---------- +|id |采购数量|海报|桶|+ ---- + --------------- + --------------------- + ---------- +|4 |678 |NULL |NULL ||6 |500 |NULL |NULL ||5 |459 |swt |11 ||1 |456 |blah_blah_tie_break |111 ||3 |77 |qwqqe |14 ||2 |11 |z |81 |+ ---- + --------------- + --------------------- + ---------- +交叉连接的意义仅在于引入和设置2个变量的起始值.就是这样.
结果是按 purchasecount 降序排列的前十本书,其中包含 images 中的信息(如果存在)(否则为 NULL )最高评价的图像.选定的图像遵循平局决胜规则,如上面可视化部分所述,使用 rownum 来选择第一个规则.
最终思想由于给定的示例数据没有可用于搜索的有用的书名,因此我将其留给OP在最后的适当的 where 子句中插入.这部分微不足道.哦,对主键的大宽度做些关于架构的事情.但这目前还没主题.
I have two tables - books and images. books has columns like id, name, releasedate, purchasecount. images has bookid (which is same as the id in books, basically one book can have multiple images. Although I haven't set any foreign key constraint), bucketid, poster (each record points to an image file in a certain bucket, for a certain bookid).
Table schema:
My query is, given a name, find the top ten books (sorted by number of purchasecount) from the books table whose name matches that name, and for that book, return any (preferably the first) record (bucketid and poster) from the images table.
Obviously this can be solved by two queries by running the first, and using its results to query the images table, but that will be slow, so I want to use 'join' and subquery to do it in one go. However, what I am trying is not giving me correct results:
select books.id,books.name,year(releasedate),purchasecount,bucketid,poster from books inner join (select bucketid,bookid, poster from images) t on t.bookid = books.id where name like "%foo%" order by purchasecount desc limit 2;Can anybody suggest an optimal query to fetch the result set as desired here (including any suggestion to change the table schema to improve search time) ?
Updated fiddle: sqlfiddle/#!9/17c5a8/1.
The example query should return two results - fooe and fool, and one (any of the multiple posters corresponding to each book) poster for each result. However I am not getting correct results. Expected:
fooe - 1973 - 459 - 11 - swt (or fooe - 1973 - 459 - 11 - pqr)
fool - 1963 - 456 - 12 - xxx (or fool - 1963 - 456 - 111 - qwe)
解决方案I agree with Strawberry about the schema. We can discuss ideas for better performance and all that. But here is my take on how to solve this after a few chats and changes to the question.
Note below the data changes to deal with various boundary conditions which include books with no images in that table, and tie-breaks. Tie-breaks meaning using the max(upvotes). The OP changed the question a few times and added a new column in the images table.
Modified quetion became return 1 row make per book. Scratch that, always 1 row per book even if there are no images. The image info to return would be the one with max upvotes.
Books table create table books ( id int primary key, name varchar(1000), releasedate date, purchasecount int ) ENGINE=InnoDB; insert into books values(1,"fool","1963-12-18",456); insert into books values(2,"foo","1933-12-18",11); insert into books values(3,"fooherty","1943-12-18",77); insert into books values(4,"eoo","1953-12-18",678); insert into books values(5,"fooe","1973-12-18",459); insert into books values(6,"qoo","1983-12-18",500);Data Changes from original question.
Mainly the new upvotes column.
The below includes a tie-break row added.
create table images ( bookid int, poster varchar(150) primary key, bucketid int, upvotes int -- a new column introduced by OP ) ENGINE=InnoDB; insert into images values (1,"xxx",12,27); insert into images values (5,"pqr",11,0); insert into images values (5,"swt",11,100); insert into images values (2,"yyy",77,65); insert into images values (1,"qwe",111,69); insert into images values (1,"blah_blah_tie_break",111,69); insert into images values (3,"qwqqe",14,81); insert into images values (1,"qqawe",8,45); insert into images values (2,"z",81,79);Visualization of a Derived Table
This is just to assist in visualizing an inner piece of the final query. It demonstrates the gotcha for tie-break situations, thus the rownum variable. That variable is reset to 1 each time the bookid changes otherwise it increments. In the end (our final query) we only want rownum=1 rows so that max 1 row is returned per book (if any).
Final Query select b.id,b.purchasecount,xDerivedImages2.poster,xDerivedImages2.bucketid from books b left join ( select i.bookid,i.poster,i.bucketid,i.upvotes, @rn := if(@lastbookid = i.bookid, @rn + 1, 1) as rownum, @lastbookid := i.bookid as dummy from ( select bookid,max(upvotes) as maxup from images group by bookid ) xDerivedImages join images i on i.bookid=xDerivedImages.bookid and i.upvotes=xDerivedImages.maxup cross join (select @rn:=0,@lastbookid:=-1) params order by i.bookid ) xDerivedImages2 on xDerivedImages2.bookid=b.id and xDerivedImages2.rownum=1 order by b.purchasecount desc limit 10Results
+----+---------------+---------------------+----------+ | id | purchasecount | poster | bucketid | +----+---------------+---------------------+----------+ | 4 | 678 | NULL | NULL | | 6 | 500 | NULL | NULL | | 5 | 459 | swt | 11 | | 1 | 456 | blah_blah_tie_break | 111 | | 3 | 77 | qwqqe | 14 | | 2 | 11 | z | 81 | +----+---------------+---------------------+----------+The significance of the cross join is merely to introduce and set starting values for 2 variables. That is all.
The results are the top ten books in descending order of purchasecount with the info from images if it exists (otherwise NULL) for the most upvoted image. The image selected honors tie-break rules picking the first one as mentioned above in the Visualization section with rownum.
Final ThoughtsI leave it to the OP to wedge in the appropriate where clause at the end as the sample data given had no useful book name to search on. That part is trivial. Oh, and do something about the schema for the large width of your primary keys. But that is off-topic at the moment.
更多推荐
MySQL复杂子查询表述
发布评论