Mongo aggregation within intervals of time

Updated: 2024-10-27 18:26:32
Problem description

I have some log data stored in a mongo collection that includes basic information such as a request_id and the time it was added to the collection, for example:

{ "_id" : ObjectId("55ae6ea558a5d3fe018b4568"), "request_id" : "030ac9f1-aa13-41d1-9ced-2966b9a6g5c3", "time" : ISODate("2015-07-21T16:00:00.00Z") }

I was wondering if I could use the aggregation framework to aggregate some statistical data. I would like to get the counts of the objects created within each interval of N minutes for the last X hours.

So the output I need for 10-minute intervals over the last hour should look something like the following:

{ "_id" : 0, "time" : ISODate("2015-07-21T15:00:00.00Z"), "count" : 67 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:10:00.00Z"), "count" : 113 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:20:00.00Z"), "count" : 40 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:30:00.00Z"), "count" : 10 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:40:00.00Z"), "count" : 32 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:50:00.00Z"), "count" : 34 }

I would use that to get data for graphs.

Any suggestions are appreciated!

Recommended answer

There are a couple of ways to approach this, depending on which output format best suits your needs. The main caveat is that the aggregation framework itself cannot return a value "cast" as a date, but it can return values that are easily reconstructed into a Date object when you process the results in your API.

The first approach is to use the "Date Aggregation Operators" available to the aggregation framework:

db.collection.aggregate([
    { "$match": { "time": { "$gte": startDate, "$lt": endDate } }},
    { "$group": {
        "_id": {
            "year": { "$year": "$time" },
            "dayOfYear": { "$dayOfYear": "$time" },
            "hour": { "$hour": "$time" },
            "minute": {
                "$subtract": [
                    { "$minute": "$time" },
                    { "$mod": [ { "$minute": "$time" }, 10 ] }
                ]
            }
        },
        "count": { "$sum": 1 }
    }}
])

This returns a composite key for _id containing all the values you need to reconstruct a date. Alternatively, if the range always falls within a single hour, you can group on just the "minute" part and work out the actual date from the startDate of your range selection.
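To illustrate how such a composite key can be turned back into a date on the client, here is a minimal sketch. The helper name compositeKeyToDate is hypothetical, and it assumes the key carries exactly the year, dayOfYear, hour and minute fields from the pipeline above and that the date operators were evaluated in UTC (their default).

```javascript
// Hypothetical helper, not part of any MongoDB API.
// Rebuilds a UTC Date from a composite _id of the form
// { year, dayOfYear, hour, minute }.
function compositeKeyToDate(key) {
  // Date.UTC normalizes an out-of-range day of month, so passing month 0
  // (January) with day 202 resolves to the 202nd day of the year.
  return new Date(Date.UTC(key.year, 0, key.dayOfYear, key.hour, key.minute));
}

const bucket = compositeKeyToDate({ year: 2015, dayOfYear: 202, hour: 15, minute: 10 });
console.log(bucket.toISOString()); // 2015-07-21T15:10:00.000Z
```

Passing the day of the year as the "day of January" avoids any month-length table, since the Date arithmetic handles the overflow.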

Or you can use plain "date math" to get the milliseconds since the epoch, which can be fed directly to a Date constructor:

db.collection.aggregate([
    { "$match": { "time": { "$gte": startDate, "$lt": endDate } }},
    { "$group": {
        "_id": {
            "$subtract": [
                { "$subtract": [ "$time", new Date(0) ] },
                { "$mod": [
                    { "$subtract": [ "$time", new Date(0) ] },
                    1000 * 60 * 10
                ]}
            ]
        },
        "count": { "$sum": 1 }
    }}
])
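The question asks for N-minute intervals over the last X hours, while the pipelines here hard-code 10 minutes and leave startDate and endDate undefined. The sketch below shows one way they might be parameterized. The names buildIntervalPipeline and floorToInterval are hypothetical, and the current time is passed in as an argument so the result is reproducible.

```javascript
// Hypothetical helper: builds the date-math pipeline for "the last `hours`
// hours in buckets of `minutes` minutes", counted back from `now`.
function buildIntervalPipeline(now, hours, minutes) {
  const intervalMs = 1000 * 60 * minutes;
  const endDate = new Date(now.getTime());
  const startDate = new Date(now.getTime() - 1000 * 60 * 60 * hours);
  // Expression for milliseconds since the epoch, evaluated by the server.
  const epochMs = { $subtract: ["$time", new Date(0)] };
  return [
    { $match: { time: { $gte: startDate, $lt: endDate } } },
    { $group: {
        // Round each timestamp down to the start of its interval.
        _id: { $subtract: [epochMs, { $mod: [epochMs, intervalMs] }] },
        count: { $sum: 1 }
    } }
  ];
}

// The same round-down arithmetic in plain JavaScript, for comparison.
function floorToInterval(date, minutes) {
  const intervalMs = 1000 * 60 * minutes;
  return new Date(date.getTime() - (date.getTime() % intervalMs));
}

const pipeline = buildIntervalPipeline(new Date("2015-07-21T16:00:00Z"), 1, 10);
console.log(pipeline[0].$match.time.$gte.toISOString()); // 2015-07-21T15:00:00.000Z
console.log(floorToInterval(new Date("2015-07-21T15:17:42Z"), 10).toISOString()); // 2015-07-21T15:10:00.000Z
```

The resulting array would be passed to db.collection.aggregate(pipeline). Note that the buckets are aligned to the epoch, so the boundaries land on round clock times only when the interval divides the hour evenly.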

In all cases, what you do not want to do is use $project before applying $group. As a pipeline stage, $project must cycle through all of the selected documents and transform their content.

This takes time and adds to the total execution time of the query. You can simply compute the key directly in $group, as shown above.

Or, if you insist on getting a real Date object back without any post-processing, you can always use mapReduce, since its JavaScript functions can construct a Date for the key. It is slower than the aggregation framework, though, and of course does not return a cursor:

db.collection.mapReduce(
    function() {
        // Round the timestamp down to the start of its 10-minute interval
        var date = new Date(
            this.time.valueOf() - ( this.time.valueOf() % ( 1000 * 60 * 10 ) )
        );
        emit(date, 1);
    },
    function(key, values) {
        return Array.sum(values);
    },
    { "out": { "inline": 1 } }
)

Your best bet is using aggregation though, as transforming the response is quite easy:

db.collection.aggregate([
    { "$match": { "time": { "$gte": startDate, "$lt": endDate } }},
    { "$group": {
        // Group on milliseconds since the epoch so that _id is a single
        // number that the Date constructor accepts
        "_id": {
            "$subtract": [
                { "$subtract": [ "$time", new Date(0) ] },
                { "$mod": [
                    { "$subtract": [ "$time", new Date(0) ] },
                    1000 * 60 * 10
                ]}
            ]
        },
        "count": { "$sum": 1 }
    }}
]).forEach(function(doc) {
    doc._id = new Date(doc._id);
    printjson(doc);
})

And then you have your interval grouping output with real Date objects.
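Since the stated goal is graph data, one possible post-processing step is sketched below. It assumes the results were grouped on milliseconds since the epoch and that _id arrives as a plain number (or something Number() can convert). The function name toSeries is hypothetical. Because $group emits no document for an interval with no matches and does not guarantee output order, the sketch walks the requested range itself and fills missing intervals with a count of 0.

```javascript
// Hypothetical post-processing step: `docs` is an array of { _id, count }
// results, where _id is the interval start in milliseconds since the epoch.
function toSeries(docs, startDate, endDate, minutes) {
  const intervalMs = 1000 * 60 * minutes;
  // Index the counts by interval start for constant-time lookup.
  const counts = new Map(docs.map(doc => [Number(doc._id), doc.count]));
  const series = [];
  // Begin at the interval containing startDate and step forward in order.
  const first = startDate.getTime() - (startDate.getTime() % intervalMs);
  for (let t = first; t < endDate.getTime(); t += intervalMs) {
    series.push({ time: new Date(t), count: counts.get(t) || 0 });
  }
  return series;
}

// Sample results in arbitrary order, with most intervals missing.
const sampleDocs = [
  { _id: Date.UTC(2015, 6, 21, 15, 10), count: 113 },
  { _id: Date.UTC(2015, 6, 21, 15, 0), count: 67 }
];
const series = toSeries(
  sampleDocs,
  new Date("2015-07-21T15:00:00Z"),
  new Date("2015-07-21T16:00:00Z"),
  10
);
console.log(series.length); // 6
```

Each entry has the { time, count } shape shown in the question, in ascending time order.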

Published: 2023-10-16 11:23:45
Article link: https://www.elefans.com/category/jswz/34/1497417.html