Pattern objects not matching with different languages

Updated: 2024-10-23 06:24:01

I have the following regular expression, which works fine when the user's input is in English. But it always fails when the input contains Portuguese characters.

Pattern p = Pattern.compile("^[a-zA-Z]*$");
Matcher matcher = p.matcher(fieldName);
if (!matcher.matches()) { .... }

Is there any way to get the pattern object to recognise valid Portuguese characters such as ÁÂÃÀÇÉÊÍÓÔÕÚç....?

Thanks

Best answer

You want a regular expression that matches the class of all alphabetic letters. Across all the scripts of the world there are loads of those, but luckily we can tell Java 6's RE engine that we're after a letter, and it will use the magic of Unicode classes to do the rest. In particular, the L class (written \p{L}) matches all types of letters: upper, lower and “oh, that concept doesn't apply in my language”:

Pattern p = Pattern.compile("^\\p{L}*$"); // the rest is identical, so won't repeat it...

When reading the docs, remember that backslashes need to be doubled when placed in a Java string literal, to stop the Java compiler from interpreting them as something else. (Also be aware that this RE is not suitable for things like validating the names of people, which is an entirely different and much more difficult problem.)
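
To see the difference between the two patterns side by side, here is a minimal sketch. The class name LetterPatternDemo and its two helper methods are made up for illustration and are not part of the original question or answer. The NFC normalization step is also an addition of mine: \p{L} does not match combining marks, so input that arrives in decomposed form (for example "a" followed by a combining tilde) would otherwise be rejected. The test strings use \u escapes so the example does not depend on the source file encoding.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public class LetterPatternDemo {
    // Pattern from the question: ASCII letters only.
    private static final Pattern ASCII_LETTERS = Pattern.compile("^[a-zA-Z]*$");

    // Pattern from the answer: any Unicode letter (general category L).
    // The backslash is doubled because this is a Java string literal.
    private static final Pattern UNICODE_LETTERS = Pattern.compile("^\\p{L}*$");

    // Hypothetical helper: true if s consists only of ASCII letters (or is empty).
    public static boolean isAsciiLettersOnly(String s) {
        return ASCII_LETTERS.matcher(s).matches();
    }

    // Hypothetical helper: true if s consists only of Unicode letters (or is empty).
    public static boolean isLettersOnly(String s) {
        // Assumption: compose to NFC first, so that a base letter followed by
        // a combining accent becomes a single precomposed letter where one exists.
        String composed = Normalizer.normalize(s, Normalizer.Form.NFC);
        return UNICODE_LETTERS.matcher(composed).matches();
    }

    public static void main(String[] args) {
        String portuguese = "A\u00e7\u00e3o"; // "Acao" with c-cedilla and a-tilde
        String decomposed = "a\u0303";        // "a" + combining tilde
        String withDigits = "abc123";

        System.out.println(isAsciiLettersOnly(portuguese));
        System.out.println(isLettersOnly(portuguese));
        System.out.println(isLettersOnly(decomposed));
        System.out.println(isLettersOnly(withDigits));
    }
}
```

Note that both patterns use *, so an empty string also matches. If the field must not be empty, + in place of * would be the usual change.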

Published: 2023-08-07 17:16:00

Article link: https://www.elefans.com/category/jswz/34/1465097.html