How do I get the numeric value of a Unicode character in C#?
For example, if the Tamil character அ (U+0B85) is given, the output should be 2949 (i.e. 0x0B85).
- C++: How to get the decimal value of a Unicode character in C++
- Java: How can I get a Unicode character's code?
Some characters require multiple code points. In these examples, encoded as UTF-16, each code unit is still in the Basic Multilingual Plane:
- (i.e. U+0072 U+0327 U+030C)
- (i.e. U+0072 U+0338 U+0327 U+0316 U+0317 U+0300 U+0301 U+0302 U+0308 U+0360)
The larger point is that one "character" can require more than one UTF-16 code unit; it can require more than two, or more than three. One "character" can require dozens of Unicode code points, which in C#'s UTF-16 strings means more than one char. One character can require 17 chars.
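To make the distinction concrete, here is a minimal sketch that compares the number of UTF-16 code units in the first combining sequence above with the number of text elements .NET reports for it. The class and helper names (CombiningDemo, CodeUnitCount, TextElementCount) are made up for illustration; only string.Length and System.Globalization.StringInfo are real .NET APIs.

```csharp
using System;
using System.Globalization;

static class CombiningDemo
{
    // Hypothetical helper: number of UTF-16 code units (char values) in the string.
    public static int CodeUnitCount(string s) => s.Length;

    // Hypothetical helper: number of text elements as segmented by StringInfo
    // (a base character plus its combining marks counts as one element).
    public static int TextElementCount(string s) => new StringInfo(s).LengthInTextElements;

    static void Main()
    {
        // U+0072 U+0327 U+030C: "r" followed by two combining marks
        string s = "r\u0327\u030C";

        Console.WriteLine(CodeUnitCount(s));    // 3
        Console.WriteLine(TextElementCount(s)); // 1
    }
}
```

Each of the three char values still has its own numeric value; they just combine into a single displayed "character".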
My question was about converting a char into its UTF-16 encoding value. Even if an entire string of 17 chars represents only one "character", I still want to know how to convert each UTF-16 code unit into a numeric value.
For example:
    String s = "அ";
    int i = Unicode(s[0]);
where Unicode returns the integer value, as defined by the Unicode standard, for the first character of the input expression.
Recommended answer
It's basically the same as Java. If you've got it as a char, you can just convert to int implicitly:
    char c = '\u0b85';
    // Implicit conversion: char is basically a 16-bit unsigned integer
    int x = c;
    Console.WriteLine(x); // Prints 2949
If you've got it as part of a string, just get that single character first:
    string text = GetText();
    int x = text[2]; // Or whatever...
Note that characters not in the Basic Multilingual Plane will be represented as two UTF-16 code units. There is support in .NET for finding the full Unicode code point, but it's not simple.
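As a rough sketch of what finding the full code point can look like, the following walks a string and yields one integer per Unicode code point, combining surrogate pairs with char.ConvertToUtf32. The CodePoints class and its Enumerate method are hypothetical names chosen for this example, and the sketch assumes the input is well-formed UTF-16 (ConvertToUtf32 throws an ArgumentException on an unpaired surrogate).

```csharp
using System;
using System.Collections.Generic;

static class CodePoints
{
    // Hypothetical helper: yields each Unicode code point in a UTF-16 string.
    public static IEnumerable<int> Enumerate(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            // Returns the value of a single non-surrogate char, or combines the
            // surrogate pair starting at index i into one code point.
            int codePoint = char.ConvertToUtf32(s, i);

            // A valid pair occupies two char values, so skip the low surrogate.
            if (char.IsHighSurrogate(s[i]))
                i++;

            yield return codePoint;
        }
    }

    static void Main()
    {
        // U+0B85 (in the BMP, one char) followed by U+1F600 (outside the BMP, two chars)
        string text = "\u0B85\U0001F600";

        foreach (int cp in Enumerate(text))
            Console.WriteLine($"U+{cp:X4} = {cp}");
        // prints "U+0B85 = 2949" then "U+1F600 = 128512"
    }
}
```

For characters inside the Basic Multilingual Plane this gives the same number as the implicit char-to-int conversion; it only differs when a surrogate pair is involved.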