Running the following (example) code

    import java.io.*;
    public class test {
        public static void main(String[] args) throws Exception {
            byte[] buf = {-27};
            InputStream is = new ByteArrayInputStream(buf);
            BufferedReader r = new BufferedReader(
                    new InputStreamReader(is, "ISO-8859-1"));
            String s = r.readLine();
            System.out.println("test.java:9 [byte] (char)" + (char)s.getBytes()[0] + " (int)" + (int)s.getBytes()[0]);
            System.out.println("test.java:10 [char] (char)" + (char)s.charAt(0) + " (int)" + (int)s.charAt(0));
            System.out.println("test.java:11 string below");
            System.out.println(s);
            System.out.println("test.java:13 string above");
        }
    }

gives me this output:

    test.java:9 [byte] (char)? (int)63
    test.java:10 [char] (char)? (int)229
    test.java:11 string below
    ?
    test.java:13 string above

How do I retain the correct byte value (-27) in the line-9 printout? And consequently receive the expected output of the System.out.println(s) command (å).
Recommended answer
If you want to retain byte values, don't use a Reader at all, ideally. To represent arbitrary binary data in text and convert it back to binary data later, you should use base16 or base64 encoding.
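As a minimal sketch of the base64 route, assuming Java 8 or later (where `java.util.Base64` is available): the class name `Base64RoundTrip` and the helpers `toText` and `toBytes` are made up for illustration and are not part of the original question.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
public class Base64RoundTrip {
    // Encode arbitrary bytes as ASCII-only Base64 text.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode Base64 text back into the original bytes.
    static byte[] toBytes(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] original = {-27};
        String text = toText(original);     // safe to pass through any Reader/Writer
        byte[] restored = toBytes(text);

        System.out.println(text);                               // 5Q==
        System.out.println(restored[0]);                        // -27
        System.out.println(Arrays.equals(original, restored));  // true
    }
}
```

Because the encoded text contains only ASCII characters, it should survive any charset conversion along the way, which is why this is more robust than relying on a particular character encoding.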
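However, to explain what's going on: when you call s.getBytes(), it uses the platform's default character encoding, which apparently can't represent Unicode character U+00E5 (å). The character is therefore replaced with '?', which is the byte value 63 you see in the line-9 output.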
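The effect can be reproduced without depending on the machine's settings. The sketch below assumes the asker's default encoding behaves like US-ASCII (the question does not say which one it is); `DefaultCharsetDemo` and `encodeAscii` are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
public class DefaultCharsetDemo {
    // Encode with a charset that has no mapping for U+00E5.
    // String.getBytes(Charset) substitutes the charset's replacement
    // byte for unmappable characters; for US-ASCII that is '?'.
    static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "\u00e5"; // å, the char produced by decoding byte -27 as ISO-8859-1

        // Shows which encoding the no-argument s.getBytes() would use here.
        System.out.println("default charset: " + Charset.defaultCharset());

        System.out.println(encodeAscii(s)[0]); // 63, i.e. '?'
    }
}
```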
If you call s.getBytes("ISO-8859-1") everywhere instead of s.getBytes() I suspect you'll get back the right byte value... but relying on ISO-8859-1 for this is kinda dirty IMO.
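A small sketch of that suggestion, using the `StandardCharsets.ISO_8859_1` constant (Java 7+) instead of the string name so no checked exception is involved. The class name `ExplicitCharset` and the helpers `decode` and `encode` are made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class ExplicitCharset {
    // ISO-8859-1 maps each byte 0x00..0xFF to the char U+0000..U+00FF.
    static String decode(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Encoding with the same charset reverses that mapping.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] buf = {-27};
        String s = decode(buf);
        byte[] back = encode(s);

        System.out.println((int) s.charAt(0)); // 229
        System.out.println(back[0]);           // -27
    }
}
```

This only addresses the byte value. Whether System.out.println(s) actually displays å still depends on the encoding System.out and the console use, which this sketch does not change.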