I'm trying to count the number of occurrences of a string in an array in my collection using Mongoose. My "schema" looks like this:
var ThingSchema = new Schema({
  tokens: [ String ]
});
My objective is to get the top 10 "tokens" in the "Thing" collection, which can contain multiple values per document. For example:
var documentOne = {
    _id: ObjectId('50ff1299a6177ef9160007fa')
  , tokens: [ 'foo' ]
}

var documentTwo = {
    _id: ObjectId('50ff1299a6177ef9160007fb')
  , tokens: [ 'foo', 'bar' ]
}

var documentThree = {
    _id: ObjectId('50ff1299a6177ef9160007fc')
  , tokens: [ 'foo', 'bar', 'baz' ]
}

var documentFour = {
    _id: ObjectId('50ff1299a6177ef9160007fd')
  , tokens: [ 'foo', 'baz' ]
}
...would give me this result:
[ foo: 4, bar: 2, baz: 2 ]
I'm considering either MapReduce or the aggregation framework for this, but I'm not sure which is the better option.
Recommended answer
Aha, I've found the solution. MongoDB's aggregation framework lets us run a series of stages over a collection. Of particular note is $unwind, which emits one document per element of an array field, so the elements can be grouped and counted en masse.
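To make the effect of $unwind followed by $group concrete, here is a rough plain-JavaScript imitation that needs no database. The helper names unwindTokens and countByToken are made up for illustration, and the small integer _id values are stand-ins for the ObjectIds in the question; this only mimics what the stages do and is not how MongoDB implements them.

```javascript
// Plain-JavaScript imitation of $unwind followed by $group (no MongoDB needed).
// The integer _id values are stand-ins for the ObjectIds in the question.
const things = [
  { _id: 1, tokens: ['foo'] },
  { _id: 2, tokens: ['foo', 'bar'] },
  { _id: 3, tokens: ['foo', 'bar', 'baz'] },
  { _id: 4, tokens: ['foo', 'baz'] }
];

// Imitates { $unwind: '$tokens' }: one output document per array element.
function unwindTokens(docs) {
  return docs.flatMap(doc =>
    doc.tokens.map(token => ({ _id: doc._id, tokens: token }))
  );
}

// Imitates $group with count: { $sum: 1 }: add 1 per document for each token.
function countByToken(unwound) {
  const counts = {};
  for (const doc of unwound) {
    counts[doc.tokens] = (counts[doc.tokens] || 0) + 1;
  }
  return counts;
}

const unwound = unwindTokens(things);
const counts = countByToken(unwound);

console.log(unwound.length); // 8
console.log(counts);         // { foo: 4, bar: 2, baz: 2 }
```

The four documents become eight single-token documents after the unwind step, which is what makes a per-token count possible in the grouping step.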
MongooseJS exposes this very accessibly on a model. Using the example above, this looks as follows:
Thing.aggregate([
    { $match: { /* Query can go here, if you want to filter results. */ } }
  , { $project: { tokens: 1 } } /* select the tokens field as something we want to "send" to the next command in the chain */
  , { $unwind: '$tokens' } /* this emits one document per array element, ready for counting */
  , { $group: { /* execute 'grouping' */
          _id: { token: '$tokens' } /* using the 'token' value as the _id */
        , count: { $sum: 1 } /* add 1 for every document in the group */
      }
    }
], function(err, topTopics) {
  if (err) return console.error(err);
  console.log(topTopics);
  // e.g. [ { _id: { token: 'foo' }, count: 4 }, { _id: { token: 'bar' }, count: 2 }, { _id: { token: 'baz' }, count: 2 } ]
  // (order is not guaranteed without a $sort stage)
});
It is noticeably faster than MapReduce in preliminary tests across ~200,000 records, and thus likely scales better, but this is only after a cursory glance. YMMV.
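The pipeline above counts every token but does not by itself return a "top 10". One way to get there, sketched under the assumption that sorting by count and truncating is what is wanted, is to append $sort and $limit stages. The helper name buildTopTokensPipeline is hypothetical; it only builds the stage objects, so it can be inspected without a database connection.

```javascript
// Hypothetical helper: builds the aggregation stages as plain objects.
function buildTopTokensPipeline(limit) {
  return [
    { $unwind: '$tokens' },                              // one document per token
    { $group: { _id: '$tokens', count: { $sum: 1 } } },  // count per distinct token
    { $sort: { count: -1, _id: 1 } },                    // highest count first, ties by token
    { $limit: limit }                                    // keep only the first `limit` rows
  ];
}

const pipeline = buildTopTokensPipeline(10);
console.log(JSON.stringify(pipeline, null, 2));

// Usage with the model from the question (needs a live connection):
// Thing.aggregate(pipeline, function (err, topTokens) { console.log(topTokens); });
```

The secondary sort key on _id is an optional choice here; it just makes the order of tokens with equal counts deterministic.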