RegEx: do not match a certain character if it is inside quotes

Updated: 2024-10-23 23:34:43

Problem description

Disclosure: I have read this answer many times here on SO, and I know better than to use regex to parse HTML. This question is just to broaden my knowledge of regex.

Say I have this string:

some text <tag link="fo>o"> other text

I want to match the whole tag, but if I use <[^>]+> it only matches <tag link="fo>.
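To make the problem concrete, here is a minimal Python sketch that reproduces the behavior described above. The variable names (text, naive, naive_result) are only illustrative and are not part of the original question.

```python
import re

text = 'some text <tag link="fo>o"> other text'

# The naive pattern has no notion of quotes, so [^>]+ stops at the
# first '>' it meets, even though that '>' sits inside the attribute value.
naive = re.compile(r'<[^>]+>')
naive_result = naive.search(text).group(0)

print(naive_result)  # <tag link="fo>
```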

How can I make sure that a > inside quotes is ignored?

I can trivially write a parser with a while loop to do this, but I want to know how to do it with regex.
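For comparison, the while-loop approach mentioned above might look roughly like the following. This is only a sketch of one possible scanner, not code from the original question; the function name find_tag is hypothetical, and it assumes that both single and double quotes can delimit attribute values.

```python
def find_tag(text, start=0):
    """Return the first tag in text, treating '>' inside quotes as plain text."""
    begin = text.find('<', start)
    if begin == -1:
        return None
    quote = None  # the quote character currently open, if any
    i = begin + 1
    while i < len(text):
        ch = text[i]
        if quote:
            # Inside a quoted value: only the matching quote ends it.
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch == '>':
            return text[begin:i + 1]
        i += 1
    return None  # no closing '>' found outside quotes


tag = find_tag('some text <tag link="fo>o"> other text')
print(tag)  # <tag link="fo>o">
```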

Recommended answer

Regular expression:

<[^>]*?(?:(?:('|")[^'"]*?\1)[^>]*?)*>

Online demo:

regex101.com/r/yX5xS8

I know this regex might be a headache to look at, so here is my explanation:

<                 # Open HTML tag
[^>]*?            # Lazy negated character class for closing HTML tag
(?:               # Open outside non-capture group
  (?:             # Open inside non-capture group
    ('|")         # Capture group for quotes, backreference group 1
    [^'"]*?       # Lazy negated character class for quotes
    \1            # Backreference 1
  )               # Close inside non-capture group
  [^>]*?          # Lazy negated character class for closing HTML tag
)*                # Close outside non-capture group
>                 # Close HTML tag
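As a quick check, the expression can be tried in Python using re.VERBOSE, which allows the pattern to carry its explanation as inline comments. This is a sketch under the assumption that Python's re module is an acceptable stand-in for the regex flavor used in the demo; the names TAG, text, and full_tag are illustrative only.

```python
import re

# Same expression as in the answer, written in verbose mode so that
# whitespace is ignored and '#' starts a comment.
TAG = re.compile(r"""
    <                 # opening angle bracket of the tag
    [^>]*?            # lazily consume anything that is not '>'
    (?:               # outer non-capturing group, one pass per quoted value
        (?:
            ('|")     # group 1: the opening quote character
            [^'"]*?   # quoted content: anything that is not a quote
            \1        # the same quote character closes the value
        )
        [^>]*?        # text after the quoted value, up to a quote or '>'
    )*
    >                 # closing angle bracket of the tag
""", re.VERBOSE)

text = 'some text <tag link="fo>o"> other text'
full_tag = TAG.search(text).group(0)
print(full_tag)  # <tag link="fo>o">
```

Note that the quoted part, [^'"]*?, excludes both quote characters, so a value that contains the other kind of quote (for example an apostrophe inside a double-quoted value) is not treated as a single quoted string by this pattern.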

Published: 2023-11-10 21:59:21

Article link: https://www.elefans.com/category/jswz/34/1576574.html