Problem description
I want a regex that matches the '>' characters in the text, but not the '>' characters that close tags.
For example:
"<span>some >text< again some<some tag></some tag>vfs>vf</span>"
Should match the two > characters marked with | below:
<span>some >text< again some<some tag></some tag>vfs>vf</span>
...........|........................................|
where | marks each > to be matched.
For reference, I have prepared a regex which does the same thing for <.
Here is my regex: "/(?!<[^<>]*>)</"
Thanks in advance!
Recommended answer
If your requirements are simple (no quoted or escaped angle brackets, and no nested angle bracket pairs), then finding an unmatched opening bracket reduces to finding the first position of a substring that starts with an open bracket, contains no other brackets, and ends with either another open bracket or the end of the string.
In regex terms, that would be:
/(<)[^<>]*(?:$|<)/
Because you want to capture all of them and will be using preg_match_all, you need to wrap the pattern in a lookahead to catch overlapping matches:
/(?=(<)[^<>]*(?:$|<))/
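To illustrate why the lookahead matters, here is a minimal sketch comparing the two patterns on a short, made-up string (the string 'a<<b<' and the variable names are assumptions for illustration, not part of the original answer). Without the lookahead, each match consumes the '<' that terminates it, so an adjacent unmatched '<' is skipped.

```php
<?php
// Hypothetical test string containing three unmatched '<' characters.
$s = 'a<<b<';

// Without lookahead: the match starting at offset 1 consumes the '<' at
// offset 2, so that bracket can never start a match of its own.
preg_match_all('/(<)[^<>]*(?:$|<)/', $s, $m1, PREG_OFFSET_CAPTURE);
$consuming = array_column($m1[1], 1);

// With lookahead: every match is zero-width, so each '<' is tested
// independently and the overlapping case is found.
preg_match_all('/(?=(<)[^<>]*(?:$|<))/', $s, $m2, PREG_OFFSET_CAPTURE);
$lookahead = array_column($m2[1], 1);

echo "without lookahead: ", implode(', ', $consuming), "\n"; // 1, 4
echo "with lookahead:    ", implode(', ', $lookahead), "\n"; // 1, 2, 4
```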
Similarly, finding an unmatched right bracket reduces to finding the last character of a substring that starts at either the beginning of the string or a close bracket, ends with a close bracket, and has no brackets in between. Adding the lookahead gives:
/(?=(?:^|>)[^<>]*(>))/
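As a quick sanity check, the two patterns can be run against the unmodified string from the question. This is only a sketch: the helper name capture_offsets is hypothetical and not part of the original answer. It returns the offsets of capture group 1, which is where the stray bracket sits.

```php
<?php
// Hypothetical helper: offsets of capture group 1 for every match of $pattern.
function capture_offsets(string $pattern, string $subject): array {
    preg_match_all($pattern, $subject, $m, PREG_OFFSET_CAPTURE);
    return array_column($m[1], 1);
}

// The example string from the question, unchanged.
$s = '<span>some >text< again some<some tag></some tag>vfs>vf</span>';

$left  = capture_offsets('/(?=(<)[^<>]*(?:$|<))/', $s);
$right = capture_offsets('/(?=(?:^|>)[^<>]*(>))/', $s);

echo "unmatched <: ", implode(', ', $left), "\n";  // 16
echo "unmatched >: ", implode(', ', $right), "\n"; // 11, 52
```

The two offsets reported for '>' are the ones the question asks for: the '>' before "text" and the '>' inside "vfs>vf".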
I added a few extra brackets to your test strings to make sure the end-of-string and overlapping cases are caught, and included a replacement example:
<?php
// Replace each located bracket with $repl. Matches are processed from the
// last one backwards so earlier offsets stay valid as the string grows.
function locate_replace($x, $match_oc, $repl) {
    while ($mt = array_pop($match_oc)) {
        $sloc = $mt[1];                  // offset of the captured bracket
        $eloc = $sloc + strlen($mt[0]);  // offset just past the bracket
        $x = substr($x, 0, $sloc) . $repl . substr($x, $eloc);
    }
    return $x;
}

// Left angle brackets
$x = "<span>some >text< again<< some<some tag><</some tag>vfs>vf</span><<";
$y = preg_match_all('/(?=(<)[^<>]*(?:$|<))/', $x, $match, PREG_OFFSET_CAPTURE);
echo "Test: '{$x}'\n";
echo "Repl: '" . locate_replace($x, $match[1], '\<') . "'\n";
echo "There are {$y} extra left angle brackets at character positions:\n";
echo " " . implode(", ", array_column($match[1], 1)) . "\n\n";

// Right angle brackets
$x = "abc><span>some >text< again some<some tag></some tag>vfs>>vf</span>";
$y = preg_match_all('/(?=(?:^|>)[^<>]*(>))/', $x, $match, PREG_OFFSET_CAPTURE);
echo "Test: '{$x}'\n";
echo "Repl: '" . locate_replace($x, $match[1], '\>') . "'\n";
echo "There are {$y} extra right angle brackets at character positions:\n";
echo " " . implode(", ", array_column($match[1], 1)) . "\n";
?>
This produces:
Test: '<span>some >text< again<< some<some tag><</some tag>vfs>vf</span><<'
Repl: '<span>some >text\< again\<\< some<some tag>\<</some tag>vfs>vf</span>\<\<'
There are 6 extra left angle brackets at character positions:
16, 23, 24, 40, 65, 66
Test: 'abc><span>some >text< again some<some tag></some tag>vfs>>vf</span>'
Repl: 'abc\><span>some \>text< again some<some tag></some tag>vfs\>\>vf</span>'
There are 4 extra right angle brackets at character positions:
3, 15, 56, 57