I am developing a web system that handles a very large set of small images, about 100 million images of 50 KB to 200 KB each, stored on ReiserFS.
For now, it is very difficult to back up and sync such a large number of small files.
My question: is it a good idea to store these small images in a key/value store or another NoSQL database such as GridFS (MongoDB), Tokyo Tyrant, or Voldemort, to gain performance and get better backup support?
Recommended answer
First off, have a look at this: Storing a million images in the filesystem. While it isn't about backups, it is a worthwhile discussion of the topic at hand.
And yes, large numbers of small files are pesky: they take up inodes and require space for filenames, etc. (and it takes time to back up all this metadata). Basically, it sounds like you have the serving of the files figured out; if you run it on nginx, with Varnish in front or similar, you can hardly make it any faster. Adding a database under that will only make things more complicated, including when it comes to backing up. Instead, I would suggest working harder on an in-place FS backup strategy.
First off, have you tried rsync with the -az switches (archive and compression, respectively)? It tends to be highly effective, as it doesn't transfer the same files again and again.
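As a minimal sketch of that rsync suggestion, the function below wraps the -az invocation. The function name and the example paths in the usage note are hypothetical, not from the original answer; only the -a and -z switches come from it.

```shell
#!/bin/sh
# sync_images SRC DEST: mirror SRC into DEST with rsync.
# SRC and DEST are placeholders; DEST may be a local path or user@host:path.
sync_images() {
    src=$1
    dest=$2
    # -a: archive mode (recursive, keeps permissions, times, symlinks)
    # -z: compress data during the transfer
    # The trailing slash on SRC copies its contents rather than the directory itself.
    rsync -az "$src"/ "$dest"/
}
```

A call might look like `sync_images /srv/images backup@destination.example.tld:/backup/images`, where both paths are assumptions. On a second run, files whose size and modification time are unchanged are skipped, which is where the savings come from.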
Alternatively, my suggestion would be to tar + gz into a number of files. Roughly (and assuming you have them in different sub-folders):
for prefix in */; do
    prefix=${prefix%/}    # strip the trailing slash from the directory name
    # stream one compressed archive per sub-folder to the backup host
    tar -cf - "$prefix" | gzip -c -9 | ssh destination.example.tld "cat > backup_$(date +%Y-%m-%d)_$prefix.tar.gz"
done
This will create a number of .tar.gz files that are easily transferred without too much overhead.
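The loop above streams each archive straight to the remote host. A variant that may be easier to retry after a failed transfer writes the archives to a local directory first and ships them afterwards. The sketch below assumes one archive per immediate sub-folder; the function name and directory arguments are illustrative, not part of the original answer.

```shell
#!/bin/sh
# archive_prefixes SRC OUT: write OUT/backup_<date>_<prefix>.tar.gz for each
# immediate subdirectory of SRC. Keep OUT outside SRC so it is not archived.
archive_prefixes() {
    src=$1
    out=$2
    stamp=$(date +%Y-%m-%d)
    mkdir -p "$out" || return 1
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue    # no subdirectories: the glob stayed literal
        prefix=$(basename "$dir")
        # -C makes the paths inside the archive relative to SRC.
        # The pipeline's exit status is gzip's; tar errors are only printed.
        tar -cf - -C "$src" "$prefix" | gzip -c -9 \
            > "$out/backup_${stamp}_${prefix}.tar.gz" || return 1
    done
}
```

After something like `archive_prefixes /srv/images /var/backups/images` (both paths hypothetical), the resulting handful of large files can be copied with rsync or scp far more cheaply than millions of individual images.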