How do I get the decimal value of a Unicode character in C#?

Updated: 2024-10-25 12:23:24
Question

How do I get the numeric value of a Unicode character in C#?

For example, given the Tamil character அ (U+0B85), the output should be 2949 (i.e. 0x0B85).

  • C++: How to get decimal value of a unicode character in c++
  • Java: How can I get a Unicode character's code?

Some characters require multiple code points. In these examples, encoded as UTF-16, each code unit is still in the Basic Multilingual Plane:

  • (i.e. U+0072 U+0327 U+030C)
  • (i.e. U+0072 U+0338 U+0327 U+0316 U+0317 U+0300 U+0301 U+0302 U+0308 U+0360)
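To make the first sequence above concrete, here is a minimal sketch that compares the number of UTF-16 code units with the number of text elements. The class and member names (CombiningDemo, Sample, CodeUnitCount, TextElementCount) are made up for illustration; only string.Length and System.Globalization.StringInfo come from .NET.

```csharp
using System;
using System.Globalization;

public static class CombiningDemo
{
    // Hypothetical sample: U+0072 (r) followed by two combining marks,
    // U+0327 and U+030C, as in the first list item above.
    public const string Sample = "\u0072\u0327\u030C";

    // Number of UTF-16 code units (C# char values) in the string.
    public static int CodeUnitCount(string s) => s.Length;

    // Number of text elements, which roughly correspond to
    // user-perceived characters.
    public static int TextElementCount(string s) =>
        new StringInfo(s).LengthInTextElements;
}
```

For Sample, CodeUnitCount should give 3 while TextElementCount gives 1: three char values, one perceived character.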

The larger point is that one "character" can require more than one UTF-16 code unit. It can require more than two, or more than three.

In other words, one "character" can require dozens of Unicode code points. In UTF-16 in C#, that means more than one char. A single character can require 17 char values.

My question is about converting a char into its UTF-16 encoding value. Even if an entire string of 17 char values represents only one "character", I still want to know how to convert each UTF-16 code unit into a numeric value.

For example:

    String s = "அ";
    int i = Unicode(s[0]);

Here Unicode stands for a function that returns the integer value, as defined by the Unicode standard, of the first character of the input expression.

Recommended answer

It's basically the same as Java. If you've got it as a char, you can just convert to int implicitly:

    char c = '\u0b85';
    // Implicit conversion: char is basically a 16-bit unsigned integer
    int x = c;
    Console.WriteLine(x); // Prints 2949

If you've got it as part of a string, just get that single character first:

    string text = GetText();
    int x = text[2]; // Or whatever...
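Since the question asks for the value of every UTF-16 code unit in a string, the same conversion can be applied to each char in turn. The following is only a sketch: the CodeUnits class and its Values method are hypothetical names, and the cast is the same char-to-int conversion shown above.

```csharp
using System;
using System.Linq;

public static class CodeUnits
{
    // Returns the numeric value of every UTF-16 code unit in the string,
    // in order. Each char converts directly to an int.
    public static int[] Values(string text) =>
        text.Select(c => (int)c).ToArray();
}
```

With this sketch, CodeUnits.Values("அ") yields a single element, 2949, and CodeUnits.Values("r\u0327\u030C") yields 114, 807 and 780 (0x0072, 0x0327 and 0x030C).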

Note that characters not in the basic multilingual plane will be represented as two UTF-16 code units. There is support in .NET for finding the full Unicode code point, but it's not simple.
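One possible way to get full code points, sketched here under the assumption that the input contains no unpaired surrogates, is to walk the string with char.ConvertToUtf32 and skip the second half of each surrogate pair. The CodePoints class and Enumerate method are illustrative names, not part of .NET.

```csharp
using System;
using System.Collections.Generic;

public static class CodePoints
{
    // Returns the Unicode code points of the string, combining each
    // surrogate pair into a single value.
    // Assumption: the string is well-formed UTF-16. char.ConvertToUtf32
    // throws ArgumentException on an unpaired surrogate.
    public static List<int> Enumerate(string text)
    {
        var result = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            // Reads one code point starting at index i.
            result.Add(char.ConvertToUtf32(text, i));

            // A high surrogate means the code point used two chars,
            // so skip the low surrogate that follows.
            if (char.IsHighSurrogate(text[i]))
            {
                i++;
            }
        }
        return result;
    }
}
```

For a character in the Basic Multilingual Plane such as அ, this gives the same 2949 as the plain conversion. For a character outside it, such as "\U0001F600", it gives one value, 128512 (0x1F600), instead of two surrogate code units.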

Published: 2023-11-16 16:22:04
Article link: https://www.elefans.com/category/jswz/34/1605902.html