Mongoose unique results

Updated: 2024-10-28 18:34:32

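I'm looking to form a query that will pull a set of results from my Mongo database, but remove or ignore results that have a duplicate field value.

Here is the scenario: I'm pulling many results from the Spotify API and storing them in my database. Due to the nature of what I am doing, I end up pulling many of the same albums, and these albums share an id field. Note that this is not the Mongo _id field.

What I want is to avoid returning the same album multiple times when the user builds a query that could include these duplicates.

Here is my current query, which does what I want but doesn't filter out the duplicates:

Albums.aggregate([
  { $match : { source_region : { $in: countries } } },
  { $skip : offset },
  { $limit : limit }
])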

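At first I was using the more typical Collection.find().sort() and so on, and came across distinct, but you can't use sort, limit, etc. with distinct.

I've also tried using $group, but that seems to return only the field I specify, so when I try something like:

{ $group : { _id : null, uniqueValues : { $addToSet : "$id" } } }

the only field returned is the id field, when I need about 10-20 fields associated with that album.

If anybody could point me in the right direction, that would be great!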

Update 1

Here is an example of some documents in the collection:

{ _id : ObjectId("5ad965a8bc349952904f7f31"), id : "0nEsaNZGpk0HIgY3OGCyR6", title : "some album", artist : "some artist" },
{ _id : ObjectId("665fhFHJFjdjfud7d6f6"), id : "5JUSBHF&55sdfhjkf86sd", title : "another album", artist : "another artist" },
{ _id : ObjectId("56&DFHJFHJJFJSgh76sdghhsd"), id : "0nEsaNZGpk0HIgY3OGCyR6", title : "some album", artist : "some artist" }

So if this were my data, I would want to return only one of the documents that share the Spotify-generated id field.


Best answer

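Since you've gone pretty silent, we'll just have to make some presumptions then.

With nothing else to go on, other than that you expect "one" property in your documents to define "unique" (besides _id, which already does), what you would do is something like this:

Albums.aggregate([
  { "$group": { "_id": "$uniqueProp", "doc": { "$first": "$$ROOT" } } },
  { "$replaceRoot": { "newRoot": "$doc" } },
  { "$skip": offset },
  { "$limit": limit }
])

Or whatever other manipulation you want to do.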
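For the query in the question, the pieces might be combined roughly as follows. This is a sketch rather than the answerer's exact code: buildUniqueAlbumsPipeline is a hypothetical helper, id is assumed to be the Spotify identifier field described in the question, and the two $sort stages are an addition, included because $group does not guarantee the order of its output and $skip/$limit need a stable order to page consistently.

```javascript
// Hypothetical helper: builds the aggregation pipeline as plain data.
function buildUniqueAlbumsPipeline(countries, offset, limit) {
  return [
    // Same filter as the original query in the question.
    { $match: { source_region: { $in: countries } } },
    // Assumption: order by _id so "$first" picks a predictable document.
    { $sort: { _id: 1 } },
    // One group per Spotify id; keep the first whole document seen.
    { $group: { _id: "$id", doc: { $first: "$$ROOT" } } },
    // Promote the saved document back to the top level.
    { $replaceRoot: { newRoot: "$doc" } },
    // Assumption: sort again, because $group output order is not guaranteed.
    { $sort: { _id: 1 } },
    { $skip: offset },
    { $limit: limit },
  ];
}

const pipeline = buildUniqueAlbumsPipeline(["US", "GB"], 0, 20);
console.log(JSON.stringify(pipeline, null, 2));
```

With the model from the question, the result would be passed along as Albums.aggregate(buildUniqueAlbumsPipeline(countries, offset, limit)).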

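With a $group pipeline stage, the _id property is what determines the "uniqueness" of the results you "group by". No two output documents ever share the same value for this key. You can even use a compound value:

{ "$group": {
  "_id": { "firstField": "$firstField", "secondField": "$secondField" },
  "doc": { "$first": "$$ROOT" }
} }

So whatever is in there comes out unique.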
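As an illustration of the single-key and compound-key forms, a small helper could generate either stage from a list of field names. The name uniqueByStage and the example field names are assumptions made for this sketch; only the shape of the resulting $group stage comes from the answer.

```javascript
// Hypothetical helper: build a $group stage keyed on one or more fields.
function uniqueByStage(fields) {
  if (fields.length === 0) {
    throw new Error("need at least one field");
  }
  // A single field becomes "$field"; several become { field: "$field", ... }.
  const key =
    fields.length === 1
      ? "$" + fields[0]
      : Object.fromEntries(fields.map((f) => [f, "$" + f]));
  // Keep the first whole document of every group.
  return { $group: { _id: key, doc: { $first: "$$ROOT" } } };
}

console.log(JSON.stringify(uniqueByStage(["id"])));
console.log(JSON.stringify(uniqueByStage(["title", "artist"])));
```

The compound form treats two documents as duplicates only when every listed field matches.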

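Whenever you are "grouping", you need an "accumulator" for anything other than the _id key. So here we use $first to take the first value encountered in each group for the expression we specify, and $$ROOT as that expression to refer to the whole document.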
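To make the effect of $first with $$ROOT concrete, the same idea can be mimicked in plain JavaScript, without MongoDB. This only illustrates the semantics: firstPerKey is a hypothetical function, and the sample data is modelled on the question with made-up numeric _id values.

```javascript
// In-memory analogue of grouping on a key and keeping the first whole document.
function firstPerKey(docs, keyField) {
  const firstSeen = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    // "$first" semantics: only the first document per key is kept.
    if (!firstSeen.has(key)) {
      firstSeen.set(key, doc);
    }
  }
  // Equivalent of $replaceRoot: return the documents, not the groups.
  return [...firstSeen.values()];
}

// Sample data modelled on the question (the _id values are made up).
const albums = [
  { _id: 1, id: "0nEsaNZGpk0HIgY3OGCyR6", title: "some album", artist: "some artist" },
  { _id: 2, id: "5JUSBHF&55sdfhjkf86sd", title: "another album", artist: "another artist" },
  { _id: 3, id: "0nEsaNZGpk0HIgY3OGCyR6", title: "some album", artist: "some artist" },
];

const unique = firstPerKey(albums, "id");
console.log(unique.map((d) => d._id)); // [ 1, 2 ]
```

Every field of the surviving document is preserved, which is what the $addToSet attempt in the question could not do.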

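Modern releases (MongoDB 3.4 and later) have $replaceRoot to clean up the document. If you don't have that, you can either $project every field or simply use the output under the "doc" property.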
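For the $project alternative, the stage has to name each field and pull it out of "doc". A rough sketch, assuming a hypothetical helper projectFromDoc and an illustrative field list (the real documents in the question have 10-20 fields):

```javascript
// Hypothetical helper: a $project stage that lifts named fields out of "doc".
function projectFromDoc(fields) {
  const projection = {};
  for (const field of fields) {
    // "$doc.title" reads the title field of the embedded document.
    projection[field] = "$doc." + field;
  }
  return { $project: projection };
}

// Assumption: these are the fields to keep; list every needed field in practice.
const stage = projectFromDoc(["_id", "id", "title", "artist"]);
console.log(JSON.stringify(stage));
```

This stage would take the place of the $replaceRoot stage in the pipeline, at the cost of having to maintain the field list by hand.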



Published: 2023-07-31 14:15:00

Article link: https://www.elefans.com/category/jswz/34/1344946.html