Java InputStream encoding/charset

Updated: 2024-10-28 13:21:00

Problem description

Running the following (example) code

import java.io.*;
public class test {
    public static void main(String[] args) throws Exception {
        byte[] buf = {-27};
        InputStream is = new ByteArrayInputStream(buf);
        BufferedReader r = new BufferedReader(
            new InputStreamReader(is, "ISO-8859-1"));
        String s = r.readLine();
        System.out.println("test.java:9 [byte] (char)" + (char)s.getBytes()[0] + " (int)" + (int)s.getBytes()[0]);
        System.out.println("test.java:10 [char] (char)" + (char)s.charAt(0) + " (int)" + (int)s.charAt(0));
        System.out.println("test.java:11 string below");
        System.out.println(s);
        System.out.println("test.java:13 string above");
    }
}

gives me this output

test.java:9 [byte] (char)? (int)63
test.java:10 [char] (char)? (int)229
test.java:11 string below
?
test.java:13 string above

How do I retain the correct byte value (-27) in the line-9 printout? And consequently receive the expected output of the System.out.println(s) command (å).

Recommended answer

If you want to retain byte values, don't use a Reader at all, ideally. To represent arbitrary binary data in text and convert it back to binary data later, you should use base16 or base64 encoding.
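As a rough illustration of the base64 suggestion, here is a minimal sketch using the standard java.util.Base64 class. The class name Base64RoundTrip and the helpers toText and fromText are made up for this example and are not from the original post.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical example class: round-trips raw bytes through Base64 text.
public class Base64RoundTrip {
    // Encode arbitrary bytes as ASCII-only Base64 text.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode Base64 text back into the original bytes.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] buf = {-27};            // same byte as in the question
        String text = toText(buf);     // safe to store or print in any charset
        byte[] back = fromText(text);  // identical to buf
        System.out.println(text + " -> " + Arrays.toString(back));
    }
}
```

Because the Base64 alphabet is plain ASCII, the text form should survive whatever default charset the platform happens to use.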

However, to explain what's going on, when you call s.getBytes() that's using the default character encoding, which apparently doesn't include Unicode character U+00E5.
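The effect can be reproduced without relying on the platform default by passing a charset explicitly. The sketch below assumes, purely for illustration, that the asker's default behaved like US-ASCII; the post does not say which charset it actually was. UnmappableChar and firstByte are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: shows what getBytes does with an unmappable character.
public class UnmappableChar {
    // Encode the string with the given charset and return its first byte.
    static byte firstByte(String s, Charset cs) {
        return s.getBytes(cs)[0];
    }

    public static void main(String[] args) {
        String s = "\u00e5"; // U+00E5, the character å
        System.out.println("default charset: " + Charset.defaultCharset());
        // US-ASCII cannot represent U+00E5, so the encoder substitutes '?' (63).
        System.out.println("US-ASCII:   " + firstByte(s, StandardCharsets.US_ASCII));
        // ISO-8859-1 maps U+00E5 to the single byte 0xE5, which is -27 as a signed byte.
        System.out.println("ISO-8859-1: " + firstByte(s, StandardCharsets.ISO_8859_1));
    }
}
```

The 63 in the question's line-9 output is the same '?' substitution, while charAt(0) still reports 229 because the String itself holds the right character.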

If you call s.getBytes("ISO-8859-1") everywhere instead of s.getBytes() I suspect you'll get back the right byte value... but relying on ISO-8859-1 for this is kinda dirty IMO.
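A minimal sketch of that ISO-8859-1 round trip, reusing the question's Reader setup. Latin1RoundTrip and readLatin1 are hypothetical names, and the UTF-8 PrintStream at the end assumes the terminal itself decodes UTF-8, which the original post does not state.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: decode and re-encode with the same charset.
public class Latin1RoundTrip {
    // Decode the bytes as ISO-8859-1 through a Reader, as the question does.
    static String readLatin1(byte[] buf) throws Exception {
        BufferedReader r = new BufferedReader(
            new InputStreamReader(new ByteArrayInputStream(buf), StandardCharsets.ISO_8859_1));
        return r.readLine();
    }

    public static void main(String[] args) throws Exception {
        String s = readLatin1(new byte[] {-27});
        // Encoding with the charset used for decoding gives the original byte back.
        byte b = s.getBytes(StandardCharsets.ISO_8859_1)[0];
        System.out.println("(int)" + b);
        // Assumption: the console expects UTF-8; otherwise å may still show as '?'.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(s);
    }
}
```

One reason this remains fragile for binary data: readLine() drops line terminators, so bytes 10 and 13 would not come back from this path.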

Published: 2023-11-09 19:30:53

Article link: https://www.elefans.com/category/jswz/34/1573232.html