Is it a good idea to use Unicode symbols as Java identifiers?

Updated: 2024-10-25 04:24:33

Problem description

I have a snippet of code that looks like this:

    double Δt = lastPollTime - pollTime;
    double α = 1 - Math.exp(-Δt / τ);
    average += α * (x - average);

Just how bad an idea is it to use Unicode characters in Java identifiers? Or is this perfectly acceptable?

Solution

It's a bad idea, for various reasons.

  • Many people's keyboards do not support these characters. If I were to maintain that code on a QWERTY keyboard (or any other layout without Greek letters), I'd have to copy and paste those characters all the time.

  • Some people's editors or terminals might not display these characters properly. For example, some editors (unfortunately) still default to some ISO-8859 (Latin) variant. The main reason why ASCII is still so prevalent is that it nearly always works.

  • Even if the characters can be rendered properly, they may cause confusion. Straight from Sun (emphasis mine):

    Identifiers that have the same external appearance may yet be different. For example, the identifiers consisting of the single letters LATIN CAPITAL LETTER A (A, \u0041), LATIN SMALL LETTER A (a, \u0061), GREEK CAPITAL LETTER ALPHA (A, \u0391), CYRILLIC SMALL LETTER A (a, \u0430) and MATHEMATICAL BOLD ITALIC SMALL A (a, \ud835\udc82) are all different.

    ...

    Unicode composite characters are different from the decomposed characters. For example, a LATIN CAPITAL LETTER A ACUTE (Á, \u00c1) could be considered to be the same as a LATIN CAPITAL LETTER A (A, \u0041) immediately followed by a NON-SPACING ACUTE (´, \u0301) when sorting, but these are different in identifiers.

    This is in no way an imaginary problem: α (U+03b1 GREEK SMALL LETTER ALPHA) and ⍺ (U+237a APL FUNCTIONAL SYMBOL ALPHA) are different characters!

  • There is no easy way to tell which characters are valid. The characters from your code work, but when I use the APL FUNCTIONAL SYMBOL ALPHA my Java compiler complains about "illegal character: \9082", even though the functional symbol would be more appropriate in this code. There seems to be no solid rule about which characters are acceptable, other than asking Character.isJavaIdentifierPart().

  • Even if you get it to compile, it seems doubtful that all Java virtual machine implementations have been rigorously tested with Unicode identifiers. If these characters are only used for variables in method scope, they should get compiled away, but if they are class members, they will end up in the .class file as well, possibly breaking your program on buggy JVM implementations.
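The points above about look-alike characters and identifier validity can be checked directly. The following is a minimal sketch, not a definitive tool: the class name IdentifierCheck and the helpers isValidIdentifier and sameAfterNfc are hypothetical names made up for this illustration. It relies only on the standard Character.isJavaIdentifierStart, Character.isJavaIdentifierPart and java.text.Normalizer APIs, and it does not check for reserved keywords.

```java
import java.text.Normalizer;

public class IdentifierCheck {

    /**
     * Hypothetical helper: true if the first code point may start a Java
     * identifier and every following code point may be part of one.
     * Reserved words such as "class" are NOT rejected here.
     */
    public static boolean isValidIdentifier(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so supplementary characters are handled.
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    /**
     * Hypothetical helper: true if the two strings are equal once both are
     * normalized to NFC. The compiler does no such normalization; this only
     * flags names a human reader would likely take to be the same.
     */
    public static boolean sameAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String greekAlpha = "\u03b1";   // GREEK SMALL LETTER ALPHA
        String aplAlpha = "\u237a";     // APL FUNCTIONAL SYMBOL ALPHA
        String composed = "\u00c1";     // LATIN CAPITAL LETTER A ACUTE
        String decomposed = "A\u0301";  // A followed by a combining acute

        // The Greek letter is accepted, the APL symbol is not.
        System.out.println(isValidIdentifier(greekAlpha)); // true
        System.out.println(isValidIdentifier(aplAlpha));   // false

        // Both spellings are legal identifiers, yet they are different names.
        System.out.println(isValidIdentifier(composed));   // true
        System.out.println(isValidIdentifier(decomposed)); // true
        System.out.println(composed.equals(decomposed));   // false
        System.out.println(sameAfterNfc(composed, decomposed)); // true
    }
}
```

A check like sameAfterNfc could be run over the identifiers in a code base to flag pairs that differ only in normalization form. It would not catch cross-script look-alikes such as Latin A versus Greek capital alpha, which stay distinct under NFC.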

Published: 2023-10-21 17:29:31
Article link: https://www.elefans.com/category/jswz/34/1514883.html