Regular expression to match img tag's url

This regular expression:

<IMG\s([^"'>]+|'[^']*'|"[^"]*")+>

seems to run endlessly when given this text:

<img src=http://www.blahblahblah.com/houses/Images/ single_and_multi/roof/feb09/01_img_trrnjks_vol2009.jpg' />

I would expect it to fail quickly with no match, because the text contains only one single quote. I have seen this happen in C# and also in the Expresso regex tool. If the text is a lot shorter, it seems to work.
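
The slowdown can be reproduced on much shorter inputs. Below is a minimal sketch in Python, not the original C# code; Python's `re` module is a backtracking engine, as .NET's default regex engine is. The helper name `time_no_match` and the chosen lengths are illustrative assumptions. Because the inner `[^"'>]+` sits inside an outer `(...)+`, a run of ordinary characters can be split between iterations in exponentially many ways, and the engine tries every split before giving up.

```python
import re
import time

# The pattern from the question, compiled case-insensitively so "<img" matches "<IMG".
SLOW = re.compile(r"""<IMG\s([^"'>]+|'[^']*'|"[^"]*")+>""", re.IGNORECASE)


def time_no_match(n):
    """Time a failing search on '<img ' + n letters + one stray single quote."""
    text = "<img " + "a" * n + "'"
    start = time.perf_counter()
    result = SLOW.search(text)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the work.
    for n in (12, 14, 16, 18):
        result, elapsed = time_no_match(n)
        print(n, result, f"{elapsed:.4f}s")
```

The text in the question has roughly 90 characters before the stray quote, which is why the match appears never to finish.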

Best answer

Other commenters have mentioned the pattern's complexity as the likely cause of the performance problem. I'd add that if you're trying to match something resembling an IMG tag, I think you want a regex more like this:

<IMG(\s+[a-z]+=('[^']*'|"[^"]*"|[^\s'">]+))+>

Of course, there are still valid HTML variations that this regex won't catch, such as a closing / (required in XHTML) or whitespace before the closing bracket. It will also pass some invalid cases, such as unsupported attribute names.
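
As a rough check of the regex above, here is a small Python sketch (again, not the asker's C# code). The names `IMG_TAG`, `IMG_TAG_XHTML`, `BROKEN`, and `is_img_tag` are hypothetical, and the second pattern is my own assumed extension that tolerates whitespace and an optional `/` before `>`; it is not part of the answer.

```python
import re

# The answer's pattern; IGNORECASE lets "IMG" and [a-z] match either case.
IMG_TAG = re.compile(
    r"""<IMG(\s+[a-z]+=('[^']*'|"[^"]*"|[^\s'">]+))+>""", re.IGNORECASE
)

# Assumed extension (not in the answer): optional whitespace and "/" before ">".
IMG_TAG_XHTML = re.compile(
    r"""<IMG(\s+[a-z]+=('[^']*'|"[^"]*"|[^\s'">]+))+\s*/?>""", re.IGNORECASE
)

# The malformed tag from the question (only one single quote).
BROKEN = (
    "<img src=http://www.blahblahblah.com/houses/Images/ "
    "single_and_multi/roof/feb09/01_img_trrnjks_vol2009.jpg' />"
)


def is_img_tag(text, pattern=IMG_TAG):
    """Return True if the pattern finds an IMG-like tag anywhere in text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(is_img_tag(BROKEN))                                # malformed input
    print(is_img_tag("<img src=\"a.jpg\" alt='x'>"))         # plain HTML tag
    print(is_img_tag('<img src="a.jpg" />'))                 # closing slash
    print(is_img_tag('<img src="a.jpg" />', IMG_TAG_XHTML))  # extended pattern
```

Each attribute here must begin with whitespace and a name followed by `=`, so there is only one way to divide the input among repetitions of the outer group, and a failed match is rejected quickly instead of triggering the exponential backtracking seen with the original pattern.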

Published: 2023-07-31 20:35:00
Link: https://www.elefans.com/category/jswz/34/1347298.html