What is the difference between /\p{Alpha}/i and /\p{L}/i in Ruby?

Updated: 2024-10-27 20:24:13

Question

I'm trying to build a regexp in Ruby to match alphabetic characters in UTF-8, such as ñ, í, ó, ú, ü, etc. I know that /\p{Alpha}/i works and that /\p{L}/i works too, but what's the difference?

Recommended answer

They seem to be equivalent. (Sometimes; see the end of this answer.)

It seems that Ruby has supported \p{Alpha} since version 1.9. In POSIX, \p{Alpha} is equal to \p{L&} (for regular expressions with Unicode support). That class matches all characters that have both an uppercase and a lowercase variant. Unicase letters would not be matched by it (while they would be matched by \p{L}).

This does not seem to be true for Ruby, though. I picked a random Arabic character to test with, since Arabic has a unicase alphabet:

  • \p{L} (any letter) matches.
  • The case-specific classes \p{Lu}, \p{Ll}, and \p{Lt} don't match, as expected.
  • \p{L&} doesn't match, as expected.
  • \p{Alpha} matches.
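
The checks listed above can be reproduced locally with a short script. This is only a minimal sketch, assuming Ruby 2.4 or later (for `match?` and `transform_values`) and a UTF-8 source file. The character U+0628 (ARABIC LETTER BEH) is my own pick, since the answer does not say which Arabic character was tested, and \p{L&} is left out of the sketch.

```ruby
# encoding: UTF-8
# U+0628 ARABIC LETTER BEH: an assumed stand-in for the "random Arabic
# character" mentioned in the answer. Arabic letters have no case.
char = "\u0628"

# Property classes to try, keyed by a printable label.
checks = {
  "\\p{L}"     => /\p{L}/,     # any letter
  "\\p{Lu}"    => /\p{Lu}/,    # uppercase letter
  "\\p{Ll}"    => /\p{Ll}/,    # lowercase letter
  "\\p{Lt}"    => /\p{Lt}/,    # titlecase letter
  "\\p{Alpha}" => /\p{Alpha}/, # POSIX-style alphabetic class
}

# Map each label to true/false depending on whether the class matches.
results = checks.transform_values { |re| re.match?(char) }

results.each do |name, matched|
  puts "#{name}: #{matched ? 'matches' : 'no match'}"
end
```

Run as a script, this should report a match for \p{L} and \p{Alpha} and no match for the three case-specific classes, in line with the list above.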

That seems to be a very good indication that \p{Alpha} is just an alias for \p{L} in Ruby. On Rubular you can also see that \p{Alpha} was not available in Ruby 1.8.7.

Note that the i modifier is irrelevant in either case, because both \p{Alpha} and \p{L} match uppercase and lowercase characters anyway.
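
One quick way to see that the i modifier changes nothing here is to scan the same mixed-case string with and without it. The following is a minimal sketch, assuming Ruby 2.4 or later and a UTF-8 source file; the sample string and the variable names are illustrative only.

```ruby
# encoding: UTF-8
# Mixed-case accented letters: ñ Ñ í Í ü Ü (written as escapes so the
# source stays ASCII-only).
sample = "\u00F1\u00D1\u00ED\u00CD\u00FC\u00DC"

# The two classes from the question, each with and without the i modifier.
patterns = {
  "/\\p{L}/"      => /\p{L}/,
  "/\\p{L}/i"     => /\p{L}/i,
  "/\\p{Alpha}/"  => /\p{Alpha}/,
  "/\\p{Alpha}/i" => /\p{Alpha}/i,
}

# Collect every single-character match each pattern finds in the sample.
scans = patterns.transform_values { |re| sample.scan(re) }

scans.each do |name, found|
  puts "#{name}: #{found.length} of #{sample.length} characters matched"
end

# True when all four patterns found exactly the same characters.
all_same = scans.values.uniq.length == 1
puts "all patterns agree: #{all_same}"
```

For this sample, all four patterns pick up every character, so adding or dropping the i modifier makes no visible difference.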

Aha, there is a difference! I just found a PDF about Ruby's new regex engine (in use as of Ruby 1.9, as stated above). \p{Alpha} is available regardless of encoding (and will probably just match [A-Za-z] if there is no Unicode support), while \p{L} is specifically a Unicode property. That means \p{Alpha} behaves just as it does in POSIX regexes, with the difference that in Ruby it corresponds to \p{L}, whereas in POSIX it corresponds to \p{L&}.

Published: 2023-07-29 09:04:12
Article link: https://www.elefans.com/category/jswz/34/1239329.html