Perl split and regular expression question

Updated: 2024-10-24 22:19:08

Problem description

I have a line of text, for example

This "is" a test "of very interesting" problems "that can" be solved

And I'm trying to split it so that my array @goodtext would contain however many strings there are from quoted sections. So my array would contain the following:

$goodtext[0]  is
$goodtext[1]  of very interesting
$goodtext[2]  that can

The number of quoted sections in each line varies, unfortunately...

Recommended answer

Under the reasonable assumption that the quotes are not nested:

my @quoted = $string =~ /"([^"]+)"/g;

or, if you need to be able to do some processing while collecting them

my @quoted;
while ($string =~ /"([^"]+)"/g) {
    # ... process $1 here as needed
    push @quoted, $1;
}

Note that we need the closing ", even though [^"]+ will match up to it anyway. This is so that the engine consumes it and gets past it, so the next match of " is indeed the next opening one.
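The effect of the closing quote can be checked on the sample line from the question. This is a minimal sketch: the variable name @misaligned is made up for illustration, and the second pattern deliberately omits the closing quote to show what goes wrong.

```perl
use strict;
use warnings;

# Sample line from the question.
my $string = 'This "is" a test "of very interesting" problems "that can" be solved';

# With the closing quote in the pattern, only the quoted sections are captured.
my @goodtext = $string =~ /"([^"]+)"/g;

# Without it, each match stops just before a closing quote, so that quote
# is then taken as the next opening one and the unquoted text leaks in.
my @misaligned = $string =~ /"([^"]+)/g;

print join('|', @goodtext),   "\n";   # is|of very interesting|that can
print join('|', @misaligned), "\n";   # is| a test |of very interesting| problems |that can| be solved
```

The second list has six elements instead of three, because every stretch of text between two quote characters is picked up, whether or not it was meant to be quoted.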

If the quotations "can be "nested" as well" then you'd want Text::Balanced

As an aside, note the difference in behavior of the /g modifier in list and scalar contexts.

  • In the list context, imposed by the list assignment (to @quoted in the first example), with the /g modifier the match operator returns a list of all captures, or of all matches if there is no capturing in the pattern (no parens)

  • In the scalar context, when evaluated as the while condition (for example), its behavior with /g is more complex. After a match, the next time the regex runs it continues searching the string from the position of (one after) the previous match, thus iterating through matches.
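The two contexts can be compared side by side. The following is a small sketch on made-up sample data (the string 'a1 b2 c3' and all variable names are assumptions for illustration only):

```perl
use strict;
use warnings;

my $text = 'a1 b2 c3';    # hypothetical sample data

# Scalar context: each evaluation of the condition finds one match and
# remembers where it ended; pos() reports that offset.
my @ends;
while ($text =~ /[a-z]\d/g) {
    push @ends, pos($text);
}

# List context, pattern with a capture group: returns all captures.
my @captures = $text =~ /([a-z])\d/g;

# List context, no capture group: returns all whole matches.
my @matches = $text =~ /[a-z]\d/g;

print "@ends\n";        # 2 5 8
print "@captures\n";    # a b c
print "@matches\n";     # a1 b2 c3
```

The while loop ends when the match finally fails, which also resets the search position, so the list-context matches that follow start from the beginning of the string.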

Note that we don't need a loop for this (which is a subtle source of subtle bugs)

use feature 'say';

my $string = q(one simple string);
$string =~ /(\w+)/g; say $1;   #--> one
$string =~ /(\w+)/g; say $1;   #--> simple

If /g is missing from either regex we don't get this behavior; instead, "one" is printed both times.

See Global matching in perlretut and, for instance, the \G assertion in perlop and pos in perlfunc.
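As a rough illustration of how pos and \G relate to the scalar-context behavior described above, here is a short sketch. The variable names are invented for the example; the string is the one used in the answer.

```perl
use strict;
use warnings;

my $string = q(one simple string);

$string =~ /(\w+)/g;              # scalar-context match: finds "one"
my $first     = $1;
my $after_one = pos($string);     # offset just past "one"

$string =~ /\G\s*(\w+)/g;         # \G anchors the match at pos($string)
my $second = $1;                  # the word right after "one"

pos($string) = 0;                 # explicitly rewind the search position
$string =~ /(\w+)/g;
my $rewound = $1;                 # back to the first word

print "$first $after_one $second $rewound\n";   # one 3 simple one
```

Assigning to pos($string) is one way to make sure an earlier /g match elsewhere in the code does not affect where the next one starts.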

Published: 2023-10-27 18:59:25
Article link: https://www.elefans.com/category/jswz/34/1534261.html
Tags: regular expressions, Perl
