According to the gzip specification, the uncompressed file size is stored in the last 4 bytes of a .gz file.
I created two files:
dd if=/dev/urandom of=500M bs=1024 count=500000
dd if=/dev/urandom of=5G bs=1024 count=5000000
I gzipped them:
gzip 500M 5G
I checked the last 4 bytes with:
tail -c4 500M.gz|od -I (returns 512000000, as expected)
tail -c4 5G.gz|od -I (returns 825032704, which was not expected)
It seems that crossing the invisible 32-bit barrier makes the value written into ISIZE complete nonsense, which is more annoying than if they had used some error bit instead.
Does anyone know of a way to get the uncompressed file size from a .gz file without extracting it?
Thanks
Spec: www.gzip/zlib/rfc-gzip.html
Edit: if anyone wants to try it out, you could use /dev/zero instead of /dev/urandom.
Recommended answer
No. The only way to get the exact size of a compressed stream is to actually decompress it (even if you write everything to /dev/null and just count the bytes).
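As a minimal sketch of that approach, the data can be decompressed to a pipe and counted, so nothing is written to disk. The function name gz_uncompressed_size is hypothetical and used only for illustration; gzip -dc and wc -c are the standard tools.

```shell
# Hypothetical helper: print the exact uncompressed size of a .gz file.
# The whole stream is still decompressed, so this costs as much CPU time
# as extracting it, but it needs no extra disk space.
gz_uncompressed_size() {
    # -d decompress, -c write to stdout; tr strips padding some wc versions add
    gzip -dc -- "$1" | wc -c | tr -d ' '
}
```

Calling gz_uncompressed_size 5G.gz on the file from the question should report 5120000000, since the count does not pass through the 32-bit ISIZE field.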
It's worth noting that ISIZE is defined as
ISIZE (Input SIZE) This contains the size of the original (uncompressed) input data modulo 2^32.
in the gzip RFC, so it isn't actually breaking at the 32-bit barrier; what you're seeing is expected behavior.
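The numbers in the question can be checked against that definition. The sketch below uses two hypothetical helpers: expected_isize computes a size modulo 2^32, and stored_isize reads the trailing 4 bytes as an unsigned integer. It assumes a shell with 64-bit arithmetic (bash, or sh on a 64-bit system) and a little-endian host, because ISIZE is stored little-endian and od reads in host byte order.

```shell
# Hypothetical helper: the value ISIZE should hold for a given uncompressed size.
expected_isize() {
    echo $(( $1 % 4294967296 ))   # 4294967296 = 2^32
}

# Hypothetical helper: read the ISIZE field stored in the last 4 bytes of a .gz file.
# Assumes a little-endian host (od interprets the bytes in host order).
stored_isize() {
    tail -c 4 -- "$1" | od -An -tu4 | tr -d ' '
}

expected_isize 512000000     # 500000 * 1024  -> prints 512000000
expected_isize 5120000000    # 5000000 * 1024 -> prints 825032704
```

The second result matches the "unexpected" value from the question: 5120000000 - 4294967296 = 825032704. For a single-member file, stored_isize 5G.gz should print the same number, which is why ISIZE alone cannot distinguish an 825032704-byte file from a 5120000000-byte one.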