I'm having a hard time finding the right query (or queries) here. When you want to find related items using tags in MySQL, you can use a 'common tag count' to find the items that are most similar.
Say my schema looks like this:
- tags(tag_id, title)
- articles(article_id, some_text)
- articles_tags(tag_id, article_id)
Then you can get items and sort them on the tags they have in common with, for example, article 2, like this:
    SELECT at1.article_id, COUNT(at1.tag_id) AS common_tag_count
    FROM articles_tags AS at1
    INNER JOIN articles_tags AS at2 ON at1.tag_id = at2.tag_id
    WHERE at2.article_id = 2
    GROUP BY at1.article_id
    HAVING at1.article_id != 2
    ORDER BY common_tag_count DESC;
But in my situation, there's a challenge. I want to find similar articles based on multiple articles instead of one (something like a "read history"). And if two of those articles both have tag X, I want tag X to become more important.
So basically, I'm looking for a way to do a common_tag_count match, but with a weight for each tag. Does anyone have an idea how to accomplish this?
Recommended answer
To get the tags used by the multiple articles, including how often each tag is used, you can use this query:
    SELECT tag_id, COUNT(article_id) AS tag_weight
    FROM articles_tags
    WHERE article_id IN ( /* Read articles */ 1, 2 )
    GROUP BY tag_id;
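If you want to check the weighting step locally, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module instead of MySQL. The sample rows and the helper name tag_weights are made up for illustration (they are not the fiddle's data), and SQLite is only a stand-in for MySQL here.

```python
import sqlite3

# Hypothetical sample data (not taken from the fiddle):
# article 1 has tags 1 and 2, article 2 has tag 1, article 3 has tag 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles_tags (tag_id INTEGER, article_id INTEGER)")
conn.executemany(
    "INSERT INTO articles_tags (tag_id, article_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (1, 2), (3, 3)],
)


def tag_weights(conn, read_ids):
    """Return {tag_id: number of read articles that carry that tag}."""
    if not read_ids:
        return {}  # avoid building an empty IN () list
    placeholders = ", ".join("?" for _ in read_ids)
    sql = f"""
        SELECT tag_id, COUNT(article_id) AS tag_weight
        FROM articles_tags
        WHERE article_id IN ({placeholders})
        GROUP BY tag_id
    """
    return dict(conn.execute(sql, list(read_ids)).fetchall())


weights = tag_weights(conn, [1, 2])
print(weights)
```

With articles 1 and 2 as the read history, tag 1 gets weight 2 (both articles have it) and tag 2 gets weight 1. Tag 3 does not appear at all, because no read article uses it.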
To get the similar articles based on that selection, use the tag-weight query as a subquery in a join similar to the one you already have:
    SELECT articles.article_id, articles.title, SUM(tag_weights.tag_weight)
    FROM articles
    JOIN articles_tags ON articles_tags.article_id = articles.article_id
    JOIN (
        SELECT tag_id, COUNT(article_id) AS tag_weight
        FROM articles_tags
        WHERE article_id IN ( /* Read articles */ 1, 2 )
        GROUP BY tag_id
    ) AS tag_weights ON articles_tags.tag_id = tag_weights.tag_id
    WHERE articles.article_id NOT IN ( /* Read articles */ 1, 2 )
    GROUP BY articles.article_id
    ORDER BY SUM(tag_weights.tag_weight) DESC;
We're adding an extra JOIN here on the subquery, which provides the tag weights. Each candidate article's score is the sum of the weights of the tags it shares with the read articles, and the ORDER BY puts the 'best' results first.
Demo: www.sqlfiddle/#!2/b35432/2/1 (articles 1 and 2 are read, giving tag 1 a weight of 2, tag 2 a weight of 1).
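To try the ranking without the fiddle, here is a rough sketch of the full query run against an in-memory SQLite database from Python. Everything about the data is an assumption made for illustration: the titles, the tag assignments, the helper name similar_articles, and the extra tie-break on article_id are not part of the original answer. The read articles are 1 and 2, which gives tag 1 a weight of 2 and tag 2 a weight of 1, as in the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; a title column is assumed because the query selects it.
conn.executescript("""
    CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles_tags (tag_id INTEGER, article_id INTEGER);
    INSERT INTO articles (article_id, title) VALUES
        (1, 'Read A'), (2, 'Read B'), (3, 'Article 3'),
        (4, 'Article 4'), (5, 'Article 5'), (6, 'Article 6');
    INSERT INTO articles_tags (tag_id, article_id) VALUES
        (1, 1), (2, 1), (1, 2),   -- read articles: tag 1 twice, tag 2 once
        (1, 3), (2, 3),           -- article 3 shares tags 1 and 2
        (1, 4),                   -- article 4 shares tag 1
        (2, 5), (3, 5),           -- article 5 shares tag 2 only
        (3, 6);                   -- article 6 shares nothing
""")


def similar_articles(conn, read_ids):
    """Rank unread articles by the summed weight of tags shared with read_ids."""
    if not read_ids:
        return []
    ph = ", ".join("?" for _ in read_ids)
    sql = f"""
        SELECT articles.article_id, articles.title,
               SUM(tag_weights.tag_weight) AS score
        FROM articles
        JOIN articles_tags ON articles_tags.article_id = articles.article_id
        JOIN (
            SELECT tag_id, COUNT(article_id) AS tag_weight
            FROM articles_tags
            WHERE article_id IN ({ph})
            GROUP BY tag_id
        ) AS tag_weights ON articles_tags.tag_id = tag_weights.tag_id
        WHERE articles.article_id NOT IN ({ph})
        GROUP BY articles.article_id
        ORDER BY score DESC, articles.article_id  -- tie-break is an addition
    """
    # The id list is bound twice: once for the subquery, once for NOT IN.
    return conn.execute(sql, list(read_ids) * 2).fetchall()


ranked = similar_articles(conn, [1, 2])
for article_id, title, score in ranked:
    print(article_id, title, score)
```

With this data, article 3 scores 3 (2 for tag 1 plus 1 for tag 2), article 4 scores 2, and article 5 scores 1. Article 6 never shows up because the inner join drops articles that share no tag with the read history.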