I'm evaluating the computational cost of an algorithm that involves some MongoDB aggregation queries, so I'm trying to figure out the cost of each operator I use. The cost of the whole query is then just the sum of these, since the stages are applied in cascade.
I concluded that the cost of $project, $match and $unwind is O(n), with n being the number of documents in the collection: I don't have any indexes, so every document has to be scanned.
Now my question is: what about the cost of the new $lookup operator? It performs a left join over two collections, so my first guess is that it more or less computes the Cartesian product of the two collections, hence the cost should be something like O(n * m), where m is the size of the second collection. Am I right? Does MongoDB do something more efficient? Do you have any reference on this topic?
Recommended answer
$lookup is effectively an $in query against the referenced collection, where the value of $in is the set of localField values coming from the pipeline that need to be looked up.
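To make that equivalence concrete, here is a minimal sketch in plain Python (no driver needed) that builds a $lookup stage and the $in filter it roughly corresponds to. The helper names and the collection and field names (`customers`, `customer_id`, `_id`) are made up for illustration, and the sketch assumes every pipeline document carries a scalar localField value.

```python
def lookup_stage(from_coll, local_field, foreign_field, as_field):
    """Build a $lookup stage document (standard four-field equality form)."""
    return {
        "$lookup": {
            "from": from_coll,
            "localField": local_field,
            "foreignField": foreign_field,
            "as": as_field,
        }
    }


def equivalent_in_filter(pipeline_docs, local_field, foreign_field):
    """Build the $in filter described above: match foreign documents whose
    foreign_field is one of the localField values seen in the pipeline.
    Assumes each document has local_field and that its value is hashable."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    values = list(dict.fromkeys(doc[local_field] for doc in pipeline_docs))
    return {foreign_field: {"$in": values}}


# Hypothetical pipeline documents from an "orders" collection.
orders = [
    {"_id": 1, "customer_id": "a"},
    {"_id": 2, "customer_id": "b"},
    {"_id": 3, "customer_id": "a"},
]

stage = lookup_stage("customers", "customer_id", "_id", "customer")
query = equivalent_in_filter(orders, "customer_id", "_id")
```

The filter in `query` is what would be run against the hypothetical `customers` collection, which is why an index on the foreignField matters for the cost.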
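If the foreignField is indexed, each such lookup costs O(log(m)), where m is the size of the referenced collection, so joining n input documents costs roughly O(n * log(m)). If the foreignField isn't indexed, each lookup has to scan the referenced collection in O(m), which gives the O(n * m) you guessed.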
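The difference between the two cases can be illustrated with a toy cost model in Python. This is only a sketch of the complexity argument, not MongoDB's actual implementation: a sorted list searched with `bisect` stands in for a B-tree index, and all function and field names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def lookup_scan(local_docs, foreign_docs, local_field, foreign_field):
    """Left join without an index: every input document scans all foreign
    documents, so the comparison count is len(local_docs) * len(foreign_docs)."""
    comparisons = 0
    joined = []
    for doc in local_docs:
        matches = []
        for other in foreign_docs:
            comparisons += 1
            if other[foreign_field] == doc[local_field]:
                matches.append(other)
        joined.append({**doc, "matches": matches})
    return joined, comparisons


def lookup_indexed(local_docs, foreign_docs, local_field, foreign_field):
    """Left join with a sorted key list standing in for an index on
    foreign_field: each input document costs two binary searches, O(log m)."""
    ordered = sorted(foreign_docs, key=lambda d: d[foreign_field])
    keys = [d[foreign_field] for d in ordered]
    joined = []
    for doc in local_docs:
        lo = bisect_left(keys, doc[local_field])
        hi = bisect_right(keys, doc[local_field])
        joined.append({**doc, "matches": ordered[lo:hi]})
    return joined


# Hypothetical data: n = 4 input documents, m = 5 foreign documents.
local = [{"_id": i, "cust": i % 3} for i in range(4)]
foreign = [{"_id": j, "cid": j} for j in range(5)]

scan_result, scan_comparisons = lookup_scan(local, foreign, "cust", "cid")
indexed_result = lookup_indexed(local, foreign, "cust", "cid")
```

Both functions return the same joined documents; only the amount of work per input document differs. Building the sorted list here costs O(m log(m)) once, which corresponds to the index already existing in the database rather than to per-query work.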