Do different characters take more or less data?

Updated: 2024-06-14 17:04:03

I am working on a personal project and I'm wondering if certain characters take up more data in a text file than others. I need to choose a character to separate items in my file, and if a 0 uses fewer bytes than a ! or something, it would be best to use that. I know all characters have an ASCII value, but would a lower ASCII value mean the character can be stored in fewer bytes?

This might be an incredibly stupid question, but I don't see any information on the topic online so I came here to check.

Thanks!

Best answer

Whether one character takes up more space than another depends on the character encoding you are using. Some encodings are variable-width [1], and UTF-8 is one of them. In UTF-8, every ASCII character (code points 0 to 127) is exactly 1 byte wide, whereas characters outside ASCII take 2 to 4 bytes [2]. (The original UTF-8 design allowed sequences of up to 6 bytes, but RFC 3629 limits it to 4.)

In your example, '0' and '!' are both ASCII, so each is 1 byte wide in UTF-8. A lower ASCII value does not make a character any smaller, so neither is cheaper to use as a separator.
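If you want to check this yourself, here is a minimal Python sketch that encodes a few characters as UTF-8 and counts the resulting bytes. The helper name `utf8_len` and the sample characters are just illustrative choices, not anything from the question.

```python
def utf8_len(text: str) -> int:
    """Return how many bytes `text` occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Two ASCII characters from the question, then three non-ASCII ones.
for ch in ["0", "!", "é", "€", "😀"]:
    print(repr(ch), utf8_len(ch))
# '0' 1
# '!' 1
# 'é' 2
# '€' 3
# '😀' 4
```

Any ASCII separator costs the same single byte, so the choice can be based on which character never appears inside your items.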

References:

[1] Variable Width Encoding (Wikipedia)
[2] UTF-8 Description (Wikipedia)

Published: 2023-04-24 21:06:00

Article link: https://www.elefans.com/category/dzcp/d214228914c9eaec77a5892ab5f2619e.html