Mongo convert embedded document to array

Updated: 2024-10-28 12:20:09

Is there a way to convert a nested document structure into an array? Below is an example:

Input

"experience" : { "0" : { "duration" : "3 months", "end" : "August 2012", "organization" : { "0" : { "name" : "Bank of China", "profile_url" : "http://www.linkedin.com/company/13801" } }, "start" : "June 2012", "title" : "Intern Analyst" } },

Expected Output:

"experience" : [ { "duration" : "3 months", "end" : "August 2012", "organization" : { "0" : { "name" : "Bank of China", "profile_url" : "http://www.linkedin.com/company/13801" } }, "start" : "June 2012", "title" : "Intern Analyst" } ],

Currently I use a script that iterates over each document, converts the field to an array, and then updates the document. This takes a long time. Is there a better way of doing this?

Best answer

You still need to iterate over the documents, but you should write the changes back using bulk operations rather than one update at a time.

For MongoDB 2.6 and greater:

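var bulk = db.collection.initializeUnorderedBulkOp(),
    count = 0;

// Select only documents where "experience" is not yet an array.
// Assumes every matched document has an embedded "experience" document.
db.collection.find({
    "$where": "return !Array.isArray(this.experience)"
}).forEach(function(doc) {
    // Queue an update that wraps the entry under key "0" in an array
    bulk.find({ "_id": doc._id }).updateOne({
        "$set": { "experience": [doc.experience["0"]] }
    });
    count++;

    // Send the queued updates every 1000 documents, then start a new batch
    if ( count % 1000 == 0 ) {
        bulk.execute();
        bulk = db.collection.initializeUnorderedBulkOp();
    }
});

// Send whatever is left in the final, partial batch
if ( count % 1000 != 0 )
    bulk.execute();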
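
The update above keeps only the entry stored under key "0", which matches the example in the question. If some documents hold several entries ("0", "1", and so on), a helper along these lines could build the full array. This is a minimal sketch: the name numericKeyedObjectToArray is hypothetical and not part of the original answer, and it assumes that keys which are not plain digits should be dropped.

```javascript
// Hypothetical helper (not from the original answer): turn an object with
// numeric string keys such as { "0": a, "1": b } into the array [a, b].
function numericKeyedObjectToArray(obj) {
    // Return arrays and non-objects unchanged so the helper is safe to re-run.
    if (Array.isArray(obj) || obj === null || typeof obj !== "object") {
        return obj;
    }
    return Object.keys(obj)
        // Assumption: only keys made of digits are array positions; others are dropped.
        .filter(function(key) { return /^\d+$/.test(key); })
        // Sort numerically so that "10" comes after "2".
        .sort(function(a, b) { return Number(a) - Number(b); })
        .map(function(key) { return obj[key]; });
}

var sample = {
    "1": { "title": "Analyst" },
    "0": { "title": "Intern Analyst" },
    "10": { "title": "Manager" }
};
var converted = numericKeyedObjectToArray(sample);
```

Inside the forEach callback, the "$set" value would then become numericKeyedObjectToArray(doc.experience) instead of [doc.experience["0"]].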
 

Or, in MongoDB 3.2 and greater, the bulkWrite() method is preferred:

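var ops = [];

// Select only documents where "experience" is not yet an array.
// Assumes every matched document has an embedded "experience" document.
db.collection.find({
   "$where": "return !Array.isArray(this.experience)"
}).forEach(function(doc) {
   // Queue an update that wraps the entry under key "0" in an array
   ops.push({
       "updateOne": {
           "filter": { "_id": doc._id },
           "update": { "$set": { "experience": [doc.experience["0"]] } }
       }
   });

   // Send the queued updates once 1000 have accumulated, then reset the queue
   if ( ops.length == 1000 ) {
       db.collection.bulkWrite(ops, { "ordered": false });
       ops = [];
   }
});

// Send whatever is left in the final, partial batch
if ( ops.length > 0 )
    db.collection.bulkWrite(ops, { "ordered": false });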
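
The batching pattern in the snippet above can be checked without a server. The sketch below is only an illustration: writeInBatches and fakeCollection are hypothetical names, and the fake object merely records the size of each batch it receives instead of talking to MongoDB.

```javascript
// Hypothetical helper: send `ops` to `collection.bulkWrite` in unordered
// batches of at most `batchSize`, returning how many batches were sent.
function writeInBatches(collection, ops, batchSize) {
    var batches = 0;
    for (var i = 0; i < ops.length; i += batchSize) {
        collection.bulkWrite(ops.slice(i, i + batchSize), { "ordered": false });
        batches++;
    }
    return batches;
}

// Stand-in for a real collection so the sketch runs anywhere;
// it only remembers how many operations each call carried.
var fakeCollection = {
    batchSizes: [],
    bulkWrite: function(batch, options) {
        this.batchSizes.push(batch.length);
    }
};

// Build 2500 placeholder update operations.
var ops = [];
for (var n = 0; n < 2500; n++) {
    ops.push({
        "updateOne": {
            "filter": { "_id": n },
            "update": { "$set": { "experience": ["entry " + n] } }
        }
    });
}

var sent = writeInBatches(fakeCollection, ops, 1000);
```

With 2500 queued updates and a batch size of 1000, the helper makes three calls (1000, 1000, and 500 operations) rather than 2500 separate updates. Unlike the loop above, this helper needs all operations in memory first, so for a large collection flushing inside the forEach, as the answer does, is probably the better fit.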
 

So when writing back to the database while iterating a cursor, bulk write operations with "unordered" set are the way to go. Each batch of 1000 updates costs a single request and response instead of 1000, which removes a lot of overhead, and "unordered" means the server can apply the writes in parallel rather than strictly one after another. Together these make the job considerably faster.
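
As a side note beyond the answer above, which targets MongoDB 2.6 and 3.2: on MongoDB 4.2 and later, updates may take an aggregation pipeline, so the conversion could probably run entirely on the server without a client-side loop. The following is a hedged sketch, not the original author's method. It assumes MongoDB 4.2+, the placeholder collection name db.collection used above, and that the stored field order of "0", "1", ... is the desired array order.

```javascript
// Assumption: MongoDB 4.2+ (pipeline-style updates); not from the original answer.
// Match documents whose "experience" is an embedded document, not an array.
var filter = {
    "experience": { "$type": "object", "$not": { "$type": "array" } }
};

// Replace "experience" with the values of its fields, in stored order.
var pipeline = [
    { "$set": {
        "experience": {
            "$map": {
                "input": { "$objectToArray": "$experience" },
                "as": "entry",
                "in": "$$entry.v"
            }
        }
    } }
];

// `db` only exists inside the mongo shell; the guard lets the snippet load elsewhere.
if (typeof db !== "undefined") {
    db.collection.updateMany(filter, pipeline);
}
```

Nested fields such as "organization" are left as they are, which matches the expected output in the question. Trying this on a copy of the collection first would be prudent.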

Published: 2023-07-31 15:08:00
Source: https://www.elefans.com/category/jswz/34/1344967.html