I need to extract a part of a string that matches a regex and return it.
I have a collection of documents like these:
{"_id" : 12121, "fileName" : "apple.doc"},
{"_id" : 12125, "fileName" : "rap.txt"},
{"_id" : 12126, "fileName" : "tap.pdf"},
{"_id" : 12127, "fileName" : "cricket.txt"}
I need to extract all file extensions and return {".doc", ".txt", ".pdf"}.
I am trying to use the $regex operator to find the substrings and aggregate on the results, but I am unable to extract the required part and pass it down the pipeline.
I have tried something like this without success:
aggregate([
  { $match: { "name": { $regex: '/\.[0-9a-z]+$/i', "$options": "i" } } },
  { $group: { _id: null, tot: { $push: "$name" } } }
])

Recommended answer
This will be possible in the upcoming version of MongoDB (as of this writing) using the aggregation framework and the $indexOfCP operator. Until then, your best bet is MapReduce.
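// Emit each document's _id with everything from the first "." onward.
// This assumes every fileName contains exactly one dot.
var mapper = function() {
    emit(this._id, this.fileName.substring(this.fileName.indexOf(".")));
};

// Each _id is emitted once, so the reduce function is never invoked and can stay empty.
db.coll.mapReduce(
    mapper,
    function(key, values) {},
    { "out": { "inline": 1 } }
)["results"]

Which yields: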
[ { "_id" : 12121, "value" : ".doc" }, { "_id" : 12125, "value" : ".txt" }, { "_id" : 12126, "value" : ".pdf" }, { "_id" : 12127, "value" : ".txt" } ]
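The mapper emits one value per document, so ".txt" still appears twice, while the question asks for a set. As a rough illustration of the remaining step, here is a plain JavaScript sketch that runs outside MongoDB on the sample documents. The helper name `extensionOf` is hypothetical, and it uses `lastIndexOf` rather than `indexOf`, on the assumption that a name such as "archive.tar.gz" should give ".gz".

```javascript
// Sample documents copied from the question (last _id as in the output above).
const docs = [
  { _id: 12121, fileName: "apple.doc" },
  { _id: 12125, fileName: "rap.txt" },
  { _id: 12126, fileName: "tap.pdf" },
  { _id: 12127, fileName: "cricket.txt" },
];

// Hypothetical helper: returns the text from the last "." onward,
// or null when the name contains no dot at all.
function extensionOf(fileName) {
  const dot = fileName.lastIndexOf(".");
  return dot === -1 ? null : fileName.substring(dot);
}

// A Set keeps each extension once, in first-seen order.
const extensions = [
  ...new Set(
    docs.map((d) => extensionOf(d.fileName)).filter((e) => e !== null)
  ),
];

console.log(extensions); // [ '.doc', '.txt', '.pdf' ]
```

The same de-duplication could be applied to the inline "results" array returned by mapReduce, by mapping over each entry's "value" field.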
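For completeness, here is the solution using the aggregation framework *

db.coll.aggregate([
    // Keep only documents whose fileName ends in a dot followed by letters or digits.
    { "$match": { "fileName": /\.[0-9a-z]+$/i } },
    { "$group": {
        "_id": null,
        "extensions": {
            "$push": {
                // Take everything from the first "." to the end of the string.
                "$substr": [
                    "$fileName",
                    { "$indexOfCP": [ "$fileName", "." ] },
                    -1
                ]
            }
        }
    }}
])

Which produces: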
{ "_id" : null, "extensions" : [ ".doc", ".txt", ".pdf", ".txt" ] }
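Because $push keeps the duplicate ".txt", the result is a list, not the set the question asks for. One possible variant swaps in $addToSet. This is a sketch only, not tested against a live server, and it assumes a MongoDB version where $indexOfCP, $substrCP and $strLenCP are available. The names `firstDot` and `pipeline` are illustrative.

```javascript
// Sketch only: field names come from the question. With a live server this
// would be run as db.coll.aggregate(pipeline), where coll is the collection
// name used in the answer.
const firstDot = { $indexOfCP: ["$fileName", "."] };

const pipeline = [
  // Keep only names that end in a dot followed by letters or digits.
  { $match: { fileName: /\.[0-9a-z]+$/i } },
  {
    $group: {
      _id: null,
      // $addToSet stores each distinct extension once (order is not guaranteed).
      extensions: {
        $addToSet: {
          $substrCP: [
            "$fileName",
            firstDot,
            // Remaining length = total code points minus the index of the dot.
            { $subtract: [{ $strLenCP: "$fileName" }, firstDot] },
          ],
        },
      },
    },
  },
];

// Print the $group stage for inspection.
console.log(JSON.stringify(pipeline[1], null, 2));
```

As in the answer, the extension is taken from the first dot, so a name with several dots would yield everything after the first one.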
* Current development version of MongoDB (as of this writing).