I have to migrate 5 million records from PostgreSQL to MongoDB.
I tried using mongify for this, but as it runs on Ruby, and I am not at all acquainted with Ruby, I couldn't solve the errors it raised.
So I tried writing code myself in Node.js that would first convert the PostgreSQL data into JSON and then insert that JSON into MongoDB. But this failed because it consumed a lot of RAM, and no more than 13,000 records could be migrated.
Then I thought of writing the code in Java because of its garbage collector. It works fine in terms of RAM utilization, but the speed is very slow (around 10,000 records/hour). At this rate it would take days to migrate my data.
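For context, both symptoms described above usually trace back to the same two decisions: loading the whole result set into memory at once (the RAM blow-up), and inserting documents one at a time (the low throughput). The common fix is to stream rows lazily and write them in batches, e.g. with pymongo's `insert_many`. A minimal sketch of the batching logic, with hypothetical `chunked`/`migrate` helpers and an in-memory list standing in for a MongoDB collection:

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def migrate(rows, insert_many, batch_size=1000):
    """Stream `rows` into `insert_many` in fixed-size batches.

    `rows` can be any lazy iterator (e.g. a server-side PostgreSQL
    cursor), so only one batch is ever held in memory at a time.
    Returns the number of documents written.
    """
    written = 0
    for batch in chunked(rows, batch_size):
        insert_many(batch)  # e.g. a pymongo collection's insert_many
        written += len(batch)
    return written

# Demo with a generator as the source and a plain list as the sink.
sink = []
count = migrate(({"id": i} for i in range(2500)), sink.extend, batch_size=1000)
print(count)  # 2500
```

In a real migration you would pass a psycopg2 server-side (named) cursor as `rows` and `collection.insert_many` as the sink; the batching structure stays the same.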
So, is there a more efficient and faster way of doing this? Would a Python program be faster than the Java program? Or is there any other ready-made tool available for doing this?
My system configuration is: OS - Windows 7 (64-bit), RAM - 4 GB, i3 processor.
Answer

Seems like I am late to the party. However, this might come in handy to somebody, someday!
The following Python-based migration framework should come in handy:
github/datawrangl3r/pg2mongo
Regarding your performance concerns: the migration of each JSON object is handled dynamically, and there shouldn't be any memory-lock issues when you use the above framework.
Hope it helps!