I am working on a personal project and I'm wondering whether certain characters take up more data in a text file than others. I need to choose a character to separate items in my file, and if a 0 uses fewer bytes than a ! or something, it would be best to use that. I know all characters have an ASCII value, but would a lower ASCII value mean the character can be stored in fewer bytes?
This might be an incredibly stupid question, but I don't see any information on the topic online so I came here to check.
Thanks!
Best answer
Whether one character takes up more space than another depends on the character encoding you are using. Some encodings are variable-width [1], and UTF-8 is one of them. In UTF-8, the standard ASCII characters (code points 0 to 127) are all 1 byte wide, whereas characters outside that range take multiple bytes (2 to 4 under the current standard, RFC 3629; the original design allowed up to 6) [2].
In your example of '0' and '!': both are standard ASCII and therefore both are 1 byte wide in UTF-8, so a lower ASCII value does not save any space.
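If you want to check this yourself, here is a minimal Python sketch that measures how many bytes a character needs once encoded as UTF-8. The helper name `utf8_len` and the sample characters are just illustrative choices, not anything from your project.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes needed to store ch in UTF-8."""
    return len(ch.encode("utf-8"))


# '0' and '!' are ASCII; the other samples lie outside the ASCII range.
for ch in ["0", "!", "é", "€", "😀"]:
    print(repr(ch), utf8_len(ch))
```

Any ASCII separator (comma, tab, '|', '0', '!') costs the same single byte in UTF-8, so you can pick whichever one is least likely to appear inside your items.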
References:
[1] Variable Width Encoding (Wikipedia)
[2] UTF-8 Description (Wikipedia)