Splitting a big file into smaller ones based on a line [closed]

Updated: 2024-10-24 20:19:38

I have a pretty big file (more than 20GB) and I'd like to split it into smaller ones, like multiple files of 2GB.

One constraint: I have to split before a specific line, one that starts with Recno::

I'm using Python, but if there is another solution, in shell for example, I'm open to it.

This is what the big file looks like:

bigfile.txt (20GB)

    Recno:: 0 some data...
    Recno:: 1 some data...
    Recno:: 2 some data...
    Recno:: 3 some data...
    Recno:: 4 some data...
    Recno:: 5 some data...
    Recno:: x some more data...

This is what I want:

file1.txt (2 GB +/-)

    Recno:: 0 some data...
    Recno:: 1 some data...

file2.txt (2GB +/-)

    Recno:: 2 some data...
    Recno:: 4 some data...
    Recno:: 5 some data...

And so on, and so on...

Thanks!

Best answer

You could do something like this:

    import sys

    try:
        _, size, file = sys.argv
        size = int(size)
    except ValueError:
        sys.exit('Usage: splitter.py <size in bytes> <filename to split>')

    with open(file) as infile:
        count = 0
        current_size = 0
        # you could do something more
        # fancy with the name like use
        # os.path.splitext
        outfile = open(file+'_0', 'w+')
        for line in infile:
            # only start a new file at a record boundary,
            # once the current file has passed the size limit
            if current_size > size and line.startswith('Recno'):
                outfile.close()
                count += 1
                current_size = 0
                outfile = open(file+'_{}'.format(count), 'w+')
            current_size += len(line)
            outfile.write(line)
        outfile.close()
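
One detail worth knowing about the script above: because the file is opened in text mode, len(line) counts characters, not bytes, so the parts can drift from the requested size when the data contains multi-byte characters. Below is a minimal sketch of the same idea in binary mode, wrapped in a function. The name split_before_marker and its parameters are hypothetical, chosen for illustration, and the Recno marker is taken from the sample data in the question.

```python
def split_before_marker(path, max_bytes, marker=b"Recno"):
    """Split `path` into numbered parts (path_0, path_1, ...).

    A new part is started only at a line beginning with `marker`,
    and only once the current part has grown past `max_bytes`.
    Returns the list of part paths that were written.
    """
    parts = []
    outfile = None
    current_size = 0
    try:
        # Binary mode: len(line) is a byte count and newlines are untouched.
        with open(path, "rb") as infile:
            for line in infile:
                at_boundary = current_size > max_bytes and line.startswith(marker)
                if outfile is None or at_boundary:
                    if outfile is not None:
                        outfile.close()
                    part_path = "{}_{}".format(path, len(parts))
                    parts.append(part_path)
                    outfile = open(part_path, "wb")
                    current_size = 0
                outfile.write(line)
                current_size += len(line)
    finally:
        # Make sure the last part is closed even if reading fails.
        if outfile is not None:
            outfile.close()
    return parts
```

For the file in the question, a call such as split_before_marker('bigfile.txt', 2 * 1024**3) would aim for parts of roughly 2 GB. As in the original script, each part can exceed the limit by up to one record, since the split waits for the next marker line.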

Published: 2023-07-30 00:08:00
Source: https://www.elefans.com/category/jswz/34/1320842.html