Can I avoid $unwind/aggregation on a large array by using $elemMatch and a regular query?

Updated: 2024-10-25 14:35:28

Problem description

I have a collection of documents (call it 'logs') which looks similar to this:

{
  "_id" : ObjectId("52f523892491e4d58e85d70a"),
  "ds_id" : "534d35d72491de267ca08e96",
  "eT" : NumberLong(1391784000),
  "vars" : [
    { "n" : "ActPow", "val" : 73.4186401367188, "u" : "kWh", "dt" : "REAL", "cM" : "AVE", "Q" : 99 },
    { "n" : "WinSpe", "val" : 3.06327962875366, "u" : "m/s", "dt" : "REAL", "cM" : "AVE", "Q" : 99 }
  ]
}

The vars array holds about 150 subdocuments, not just the two I have shown above. What I'd like to do now is to run a query which retrieves the val of the two subdocuments in the vars array that I have shown above.

Using the aggregation framework, I've been able to come up with the following:

db.logs.aggregate([
  { $match : {
      ds_id : "534d35d72491de267ca08e96",
      eT : { $lt : 1391784000 },
      vars : { $elemMatch : { n : "PowCrvVld", val : 3 } }
  } },
  { $unwind : "$vars" },
  { $match : { "vars.n" : { $in : ["WinSpe", "ActPow"] } } },
  { $project : { "vars.n" : 1, N : 1 } }
]);

While this works, I run up against the 16MB limit when running larger queries. Seeing as I have about 150 subdocuments in the vars array, I'd also like to avoid $unwind if it's possible.

Using a regular query and using $elemMatch I have been able to retrieve ONE of the values:

db.logs.TenMinLog.find(
  {
    ds_id : "534d35d72491de267ca08e96",
    eT : { $lt : 1391784000 },
    vars : { $elemMatch : { n : "PowCrvVld", val : 3 } }
  },
  {
    ds_id : 1,
    vars : { $elemMatch : { n : "ActPow", cM : "AVE" } }
  }
);

What my question comes down to is whether there's a way to use $elemMatch on an array multiple times in the <projection> part of find. If not, is there another way to easily retrieve those two subdocuments without using $unwind? I am also open to other, more performant approaches that I may not be aware of. Thanks!

Recommended answer

I'd strongly consider a move to MongoDB version 2.6. Aggregation has been enhanced to return a cursor, which removes the 16MB limit on the result set:

Changed in version 2.6:

The db.collection.aggregate() method returns a cursor and can return result sets of any size. Previous versions returned all results in a single document, and the result set was subject to a size limit of 16 megabytes.

http://docs.mongodb.org/manual/core/aggregation-pipeline/
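
As a rough illustration of what working with the returned cursor could look like, the sketch below drains any cursor-like object that exposes hasNext() and next() (as the shell's aggregation cursor does) and groups the unwound values by variable name. The helper name collectVals, the output shape, and the assumption that the pipeline's $project keeps "vars.val" are all illustrative and not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer): drain a cursor-like
// object and group the unwound values by variable name.
function collectVals(cursor) {
  var byName = {};
  while (cursor.hasNext()) {
    var doc = cursor.next();   // one unwound document per call
    var v = doc.vars;          // after $unwind, vars is a single subdocument
    if (!byName[v.n]) {
      byName[v.n] = [];
    }
    byName[v.n].push(v.val);   // assumes the pipeline projects "vars.val"
  }
  return byName;
}

// Assumed usage in the 2.6 mongo shell (not executed here):
//   var cursor = db.logs.aggregate(pipeline, { cursor: { batchSize: 100 } });
//   var vals = collectVals(cursor);
```

Because the results are streamed in batches, no single response document has to hold the whole result set.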

Also there are a number of enhancements that you may find useful for more complex aggregation queries:

Aggregation Enhancements

The aggregation pipeline adds the ability to return result sets of any size, either by returning a cursor or writing the output to a collection. Additionally, the aggregation pipeline supports variables and adds new operations to handle sets and redact data.
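
One way the variable support and set operators mentioned above might be combined to trim the vars array without $unwind is sketched below. It is only a sketch under stated assumptions: the helper name buildVarsFilterStage is hypothetical, the stage relies on the 2.6 operators $map, $cond, $or, $eq and $setDifference, and because $setDifference treats its inputs as sets, duplicate subdocuments collapse and element order is not guaranteed.

```javascript
// Hypothetical helper (not from the original answer): build a $project stage
// that keeps only the vars entries whose "n" is one of `names`.
function buildVarsFilterStage(names) {
  // One equality test per wanted name, evaluated against the $map variable "v".
  var tests = names.map(function (name) {
    return { $eq: ["$$v.n", name] };
  });
  return {
    $project: {
      ds_id: 1,
      eT: 1,
      vars: {
        $setDifference: [
          {
            $map: {
              input: "$vars",
              as: "v",
              // Keep the subdocument on a match, otherwise emit false.
              in: { $cond: [{ $or: tests }, "$$v", false] }
            }
          },
          [false] // strip the false placeholders left by non-matching entries
        ]
      }
    }
  };
}

// Assumed usage in the 2.6 mongo shell (not executed here):
//   db.logs.aggregate([
//     { $match: { ds_id: "534d35d72491de267ca08e96", eT: { $lt: 1391784000 } } },
//     buildVarsFilterStage(["WinSpe", "ActPow"])
//   ]);
```

Each input document stays a single output document, so the pipeline never multiplies the roughly 150 array entries into separate documents.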

The db.collection.aggregate() now returns a cursor, which enables the aggregation pipeline to return result sets of any size. Aggregation pipelines now support an explain operation to aid analysis of aggregation operations. Aggregation can now use a more efficient external-disk-based sorting process.
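
The features listed above are switched on through the options document passed as the second argument to aggregate(). The small helper below is a hypothetical convenience (its name and input shape are assumptions) that assembles such a document from the explain, allowDiskUse and cursor options available in 2.6.

```javascript
// Hypothetical helper (not from the original answer): assemble the options
// document for db.collection.aggregate(pipeline, options) in MongoDB 2.6.
function aggregateOptions(opts) {
  opts = opts || {};
  var out = {};
  if (opts.explain) {
    out.explain = true;        // return the plan instead of running the pipeline
  }
  if (opts.allowDiskUse) {
    out.allowDiskUse = true;   // allow stages such as $sort to spill to disk
  }
  if (opts.batchSize !== undefined) {
    out.cursor = { batchSize: opts.batchSize }; // initial cursor batch size
  }
  return out;
}

// Assumed usage in the 2.6 mongo shell (not executed here):
//   db.logs.aggregate(pipeline, aggregateOptions({ allowDiskUse: true, batchSize: 100 }));
//   db.logs.aggregate(pipeline, aggregateOptions({ explain: true }));
```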

New pipeline stages: