Regular expression to match img tag's url

Updated: 2024-10-26 17:36:41

This regular expression:

<IMG\s([^"'>]+|'[^']*'|"[^"]*")+>

seems to run endlessly when given this text.

I would expect it to fail quickly without finding a match, because there is only one single quote in the text. I have seen this happen in C# and also in the Expresso regex tool. If the text is a lot shorter, it seems to work.
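
One likely reason a short text works while a long one appears to hang is that the group `([^"'>]+|...)+` can cut a run of ordinary characters into chunks in many different ways, and a backtracking engine may try all of them before it reports failure. The sketch below only counts those cuts. The function name `count_splits` is hypothetical, and the code models the ambiguity in the pattern rather than the actual .NET engine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Count the ways to cut a run of n characters into non-empty chunks.

    Each cut corresponds to one way the nested quantifier ([^"'>]+)+
    could consume the same run of characters.
    """
    if n == 0:
        return 1  # nothing left to consume: exactly one way (stop here)
    # The first chunk takes k characters; the remainder is split recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


# The count doubles with every extra character in the run.
for n in (5, 10, 20, 40):
    print(n, count_splits(n))
```

Under this model the number of candidate paths is 2^(n-1) for a run of n characters, which would explain why the same pattern feels instant on a short input and never finishes on a long one.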

Best answer

Other commenters have mentioned the pattern's complexity as the likely cause of the performance problem. I'd add that if you're trying to match something resembling an IMG tag, I think you want a regex more like this:

<IMG(\s+[a-z]+=('[^']*'|"[^"]*"|[^\s'">]+))+>

Of course, there are still valid HTML variations that this regex won't catch, such as a closing / (required in XHTML) or whitespace before the closing bracket. It will also accept some invalid cases, such as unsupported attribute names.
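
To make those caveats concrete, here is a small sketch that runs the suggested pattern over a few sample tags. It uses Python's `re` module rather than C#. The sample strings are invented for illustration, and the `re.IGNORECASE` flag is an assumption (the answer does not say whether matching should be case-insensitive).

```python
import re

# Pattern copied from the answer. IGNORECASE is an assumption so that
# both <IMG ...> and <img ...> are accepted.
IMG_TAG = re.compile(
    r"""<IMG(\s+[a-z]+=('[^']*'|"[^"]*"|[^\s'">]+))+>""",
    re.IGNORECASE,
)

# Invented sample inputs, one per case discussed above.
samples = [
    '<img src="a.png" alt=\'logo\'>',  # quoted attribute values
    '<IMG src=a.png width=10>',        # unquoted attribute values
    '<img src="a.png" />',             # XHTML-style closing slash
    "<img src='a.png alt=logo>",       # only one single quote
]

results = [bool(IMG_TAG.search(s)) for s in samples]
for sample, matched in zip(samples, results):
    print(matched, sample)
```

The first two samples match. The third is rejected because of the space and slash before the closing bracket, and the fourth is rejected because the single quote is never closed.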

Published: 2023-07-31 20:35:00

Link: https://www.elefans.com/category/jswz/34/1347298.html