Why don't backreferences work in Python's re.sub when using a replacement function?

Updated: 2024-10-27 09:32:46

In Python 2.7, the following re.sub call uses a simple backreference:

re.sub('-{1,2}', r'\g<0> ', 'pro----gram-files')

It outputs the following string as expected:

'pro-- -- gram- files'

I would expect the following example to be identical, but it is not:

def dashrepl(matchobj):
    return r'\g<0> '

re.sub('-{1,2}', dashrepl, 'pro----gram-files')

This gives the following unexpected output:

'pro\\g<0> \\g<0> gram\\g<0> files'

Why do the two examples give different output? Did I miss something in the documentation that explains this? Is there any particular reason that this behavior is preferable to what I expected? Is there a way to use backreferences in a replacement function?

Accepted answer

There is a simpler way to get the result you want.

As you have already seen, your replacement function receives a match object as its argument.

Among other things, this object has a group() method, which you can use instead of a backreference:

def dashrepl(matchobj):
    return matchobj.group(0) + ' '

This gives exactly the result you expected.
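
As a quick check, the two approaches can be compared side by side. This is a minimal sketch; the variable names with_template and with_function are only illustrative and do not come from the question.

```python
import re

def dashrepl(matchobj):
    # group(0) is the whole matched text, the same text that \g<0>
    # refers to in a string replacement template.
    return matchobj.group(0) + ' '

# String repl: re.sub processes the \g<0> backreference itself.
with_template = re.sub('-{1,2}', r'\g<0> ', 'pro----gram-files')

# Function repl: the function builds the replacement from the match object.
with_function = re.sub('-{1,2}', dashrepl, 'pro----gram-files')

print(with_template)  # pro-- -- gram- files
print(with_function)  # pro-- -- gram- files
```

Both calls produce the same string, because the function appends the space to the matched text directly instead of relying on template expansion.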


But you are completely right that the docs are a bit confusing on this point.

They describe the repl argument as follows:

repl can be a string or a function; if it is a string, any backslash escapes in it are processed.

and

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

You could read this as saying that "the replacement string" returned by the function also undergoes backslash-escape processing.

But that processing is described only for the case where "it is a string", so on a careful reading the behavior is specified, even if that is not obvious at first glance.
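
To illustrate the difference, and as one possible answer to the question of whether backreferences can be used inside a replacement function: match objects also have an expand() method, which applies the same template processing that re.sub applies to a string repl. The answer above does not mention expand(), so treat this as a supplementary sketch; the function names literal_repl and expanding_repl are made up for illustration.

```python
import re

def literal_repl(matchobj):
    # The returned string is inserted as-is: \g<0> is not expanded,
    # because template processing only applies when repl is a string.
    return r'\g<0> '

def expanding_repl(matchobj):
    # Match.expand() performs the backslash/backreference substitution
    # explicitly, so \g<0> is replaced by the matched text.
    return matchobj.expand(r'\g<0> ')

literal = re.sub('-{1,2}', literal_repl, 'pro----gram-files')
expanded = re.sub('-{1,2}', expanding_repl, 'pro----gram-files')

print(literal)   # pro\g<0> \g<0> gram\g<0> files
print(expanded)  # pro-- -- gram- files
```

The first result is the same string the question shows; the interactive interpreter displays it with doubled backslashes because it prints the repr of the string. For a simple case like this one, matchobj.group(0) + ' ' is more direct; expand() is mainly useful when the template is supplied from elsewhere.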

Published: 2023-07-28 12:10:00
Link: https://www.elefans.com/category/jswz/34/1305190.html