Why can't my application display Unicode characters correctly?

Updated: 2024-10-23 19:21:17

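I decided to convert my Win32 C++ application to a Unicode build, but now Arabic, Chinese, and Japanese text shows up as unreadable characters.

First:

Without Unicode, Arabic displays fine in edit boxes and window titles:

    // Edit control, created as a child of the main window hWnd
    HWND hEdit = CreateWindowEx(WS_EX_CLIENTEDGE, "Edit", "ا ب ت ث ج ح خ د ذ",
        WS_CHILD | WS_VISIBLE | WS_BORDER | ES_MULTILINE,
        10, 10, 300, 200, hWnd, (HMENU)100, GetModuleHandle(NULL), NULL);
    // Title of the main window
    SetWindowText(hWnd, "صباح الخير");

The output looks right and everything works (without Unicode).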

With Unicode:

I added this before including the headers:

    #define UNICODE
    #include <windows.h>

Now, in the window procedure:

    case WM_CREATE: {
        HWND hEdit = CreateWindowExW(WS_EX_CLIENTEDGE, L"Edit", L"ا ب ت ث ج ح خ د ذ",
            WS_CHILD | WS_VISIBLE | WS_BORDER | ES_MULTILINE,
            10, 10, 300, 200, hWnd, (HMENU)100, GetModuleHandle(NULL), NULL);
        // Even if I send a message to change the text, I get unreadable characters!
    }
    break;
    case WM_LBUTTONDBLCLK: {
        SendDlgItemMessageW(hWnd, 100, WM_SETTEXT, 0, (LPARAM)L"السلام عليكم");
        // Unreadable characters here too
    }
    break;

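As you can see, with Unicode the controls cannot display Arabic characters correctly.

The important part: after the control is created, if I delete its content with Backspace and type Arabic text by hand, it displays correctly. So why does it fail when the text is set through a function such as SetWindowTextW()?

Please help. Thank you.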

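Best answer

Make sure to save the source file as UTF-16 or as UTF-8 with a BOM. Otherwise, many Windows applications assume the ANSI encoding (the default localized Windows code page). You can also check for compiler switches that force UTF-8 for source files. For example, the MS Visual Studio 2015 compiler has a /utf-8 switch, so saving with a BOM is not required.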
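If your editor has no "save with BOM" option, a small script can do the re-save. The following is only a sketch in Python (the language of the demo further down), assuming the file is already valid UTF-8; the helper name resave_with_bom and the file name demo.cpp are made up for illustration.

```python
import codecs
import os
import tempfile


def resave_with_bom(path):
    """Rewrite a UTF-8 text file so that it starts with exactly one BOM."""
    # "utf-8-sig" strips a leading BOM on read (if any) and writes one on save.
    # newline="" keeps the file's line endings untouched.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


# Demo on a throwaway file; the name demo.cpp is hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.cpp")
source = 'MessageBoxW(NULL, L"ا ب ت", L"中文", MB_OK);\r\n'
with open(demo_path, "wb") as f:
    f.write(source.encode("utf-8"))  # plain UTF-8, no BOM

resave_with_bom(demo_path)
with open(demo_path, "rb") as f:
    raw = f.read()

print(raw.startswith(codecs.BOM_UTF8))  # True
```

Running it a second time on the same file leaves the content unchanged, because the BOM is stripped on read before being written again.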

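Here's a simple example saved first as UTF-8 and then as UTF-8 with a BOM, compiled with the Microsoft Visual Studio compiler each time. Note that there is no need to define UNICODE if you hard-code the W versions of the APIs and use L"" for wide strings:

    #include <windows.h>

    int main()
    {
        MessageBoxW(NULL, L"ا ب ت ث ج ح خ د ذ", L"中文", MB_OK);
    }

Result (UTF-8, no BOM). The compiler assumed the ANSI encoding (Windows-1252) and decoded the wide strings incorrectly.

[Image: message box with corrupted text]

Result (UTF-8 with BOM). The compiler detects the BOM and decodes the source as UTF-8, which generates the correct data for the wide strings.

[Image: message box with correct text]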
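The two results can be imitated with a few lines of Python. This is a rough model of the behavior described above, not the compiler's actual algorithm: decode_source is a hypothetical helper, and the cp1252 fallback assumes a Western European system code page.

```python
import codecs


def decode_source(data, fallback="cp1252"):
    """Honor a BOM if one is present, otherwise fall back to the ANSI code page."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")  # the utf-16 codec consumes the BOM itself
    return data.decode(fallback)      # assumption: the system code page is 1252


literal = 'L"中文"'
plain = literal.encode("utf-8")        # saved as UTF-8 without a BOM
with_bom = codecs.BOM_UTF8 + plain     # saved as UTF-8 with a BOM

print(decode_source(plain) == literal)     # False: bytes misread as Windows-1252
print(decode_source(with_bom) == literal)  # True: the BOM selects UTF-8
```

The file content is identical in both cases; only the three BOM bytes at the start change how it is interpreted.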

A little Python code demonstrating the decode error:

    >>> s = '中文,ا ب ت ث ج ح خ د ذ'
    >>> print(s.encode('utf8').decode('Windows-1252'))
    ä¸æ–‡,Ø§ Ø¨ Øª Ø« Ø¬ Ø Ø® Ø¯ Ø°
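Looking at the code points makes the damage easier to see: every UTF-8 byte turns into its own character, so the wide string ends up longer than intended and holds different values. A short sketch along the same lines as the demo above (the variable names are illustrative):

```python
intended = "中文"
# What the string becomes when its UTF-8 bytes are read as Windows-1252
misread = intended.encode("utf-8").decode("cp1252")

print([f"U+{ord(c):04X}" for c in intended])
# ['U+4E2D', 'U+6587']
print([f"U+{ord(c):04X}" for c in misread])
# ['U+00E4', 'U+00B8', 'U+00AD', 'U+00E6', 'U+2013', 'U+2021']

# For this input the mistake is reversible, which confirms the diagnosis.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired == intended)  # True
```

U+00AD is a soft hyphen, which is usually invisible; that is why the garbled output appears to have fewer characters than it really contains.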

Published: 2023-07-31 23:52:00

Link: https://www.elefans.com/category/jswz/34/1351411.html