Regex to match > characters that are not part of any tag

Updated: 2024-10-27 22:30:26

Question

I want a regex that matches the '>' characters in the text but does not match the > that closes a tag.

For example, given:

"<span>some >text< again some<some tag></some tag>vfs>vf</span>"

it should match the two > characters marked below:

<span>some >text< again some<some tag></some tag>vfs>vf</span>
...........|........................................|

where | marks each > to be matched.

For reference, I have already written a regex that does the same thing for <:

/(?!<[^<>]*>)</

Thanks in advance!

Recommended answer

If your requirements are simple (no quoted or escaped angle brackets, and no nested angle bracket pairs), then an unmatched opening bracket is the first character of a substring that starts with an open bracket, contains no other brackets, and ends with either another open bracket or the end of the string.

In regex terms, that is:

/(<)[^<>]*(?:$|<)/

Because you want to capture all of them and will be using preg_match_all, wrap the pattern in a lookahead so that overlapping matches are also found:

/(?=(<)[^<>]*(?:$|<))/
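To make the overlap issue concrete, here is a minimal sketch that runs both patterns on a string with adjacent stray brackets. The test string and the variable names ($plain, $ahead, and so on) are illustrative assumptions, not part of the original answer.

```php
<?php
// Illustrative input: contains stray '<' characters that sit next to each other.
$x = "<span>some >text< again<< some<some tag><</some tag>vfs>vf</span><<";

// Consuming pattern: each match swallows the terminating '<', so that
// bracket can no longer start a match of its own.
preg_match_all('/(<)[^<>]*(?:$|<)/', $x, $plain, PREG_OFFSET_CAPTURE);
$plain_offsets = array_column($plain[1], 1);

// Lookahead pattern: the match itself is zero-width, so the scan resumes
// one character later and every stray '<' gets its own chance to match.
preg_match_all('/(?=(<)[^<>]*(?:$|<))/', $x, $ahead, PREG_OFFSET_CAPTURE);
$ahead_offsets = array_column($ahead[1], 1);

echo implode(", ", $plain_offsets), "\n"; // 16, 24, 40, 65
echo implode(", ", $ahead_offsets), "\n"; // 16, 23, 24, 40, 65, 66
```

The consuming version misses the brackets at offsets 23 and 66 because each was already used as the terminator of the previous match.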

Similarly, an unmatched closing bracket is the last character of a substring that starts at either the beginning of the string or a close bracket, ends with a close bracket, and has no brackets in between. Wrapped in a lookahead, that gives:

/(?=(?:^|>)[^<>]*(>))/
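As a quick check, the pattern can be tried on the exact string from the question. This is only a sketch, and the variable names are assumptions chosen for illustration. Note that the position of each stray > comes from capture group 1, not from the zero-width match itself.

```php
<?php
// The example string from the question, unchanged.
$x = '<span>some >text< again some<some tag></some tag>vfs>vf</span>';

// Group 1 captures the stray '>'; PREG_OFFSET_CAPTURE records its byte offset.
preg_match_all('/(?=(?:^|>)[^<>]*(>))/', $x, $match, PREG_OFFSET_CAPTURE);
$offsets = array_column($match[1], 1);

echo implode(", ", $offsets), "\n"; // 11, 52
```

Offsets 11 and 52 are the two > characters the question asks for; the > characters that close <span>, <some tag>, </some tag> and </span> are skipped.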

I added a few extra brackets to your test string to make sure the end-of-string and overlapping cases are caught, along with a replacement example:

<?php
// Left angle brackets
$x = "<span>some >text< again<< some<some tag><</some tag>vfs>vf</span><<";
$y = preg_match_all('/(?=(<)[^<>]*(?:$|<))/', $x, $match, PREG_OFFSET_CAPTURE);
echo "Test: '{$x}'\n";
echo "Repl: '" . locate_replace($x, $match[1], '\<') . "'\n";
echo "There are {$y} extra left angle brackets at character positions:\n";
echo "  " . implode(", ", array_column($match[1], 1)) . "\n\n";

// Right angle brackets

$x = "abc><span>some >text< again some<some tag></some tag>vfs>>vf</span>";
$y = preg_match_all('/(?=(?:^|>)[^<>]*(>))/', $x, $match, PREG_OFFSET_CAPTURE);
echo "Test: '{$x}'\n";
echo "Repl: '" . locate_replace($x, $match[1], '\>') . "'\n";
echo "There are {$y} extra right angle brackets at character positions:\n";
echo "  " . implode(", ", array_column($match[1], 1)) . "\n";

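// Replace each captured bracket in $x with $repl.
// $match_oc is a list of [matched text, offset] pairs from PREG_OFFSET_CAPTURE.
// Matches are processed from last to first so that earlier offsets stay valid
// even when $repl is longer than the text it replaces.
function locate_replace($x, $match_oc, $repl) {
    while ($mt = array_pop($match_oc)) {
        $sloc = $mt[1];                     // start offset of the bracket
        $eloc = $sloc + strlen($mt[0]);     // offset just past the bracket
        $x = substr($x, 0, $sloc) . $repl . substr($x, $eloc);
    }
    return $x;
}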
?>

This produces:

Test: '<span>some >text< again<< some<some tag><</some tag>vfs>vf</span><<'
Repl: '<span>some >text\< again\<\< some<some tag>\<</some tag>vfs>vf</span>\<\<'
There are 6 extra left angle brackets at character positions:
  16, 23, 24, 40, 65, 66

Test: 'abc><span>some >text< again some<some tag></some tag>vfs>>vf</span>'
Repl: 'abc\><span>some \>text< again some<some tag></some tag>vfs\>\>vf</span>'
There are 4 extra right angle brackets at character positions:
  3, 15, 56, 57
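
If the goal is to turn stray brackets into HTML entities rather than backslash escapes, the two passes can be combined. The sketch below is one possible way to do that; the function names replace_at_offsets and escape_stray_brackets are hypothetical and do not appear in the original answer. The left pass runs first; since a stray < is by definition never the bracket immediately before a >, replacing it should not change which > characters count as stray.

```php
<?php
// Hypothetical helper: replaces each [text, offset] pair in $matches,
// working backwards so that earlier offsets remain valid.
function replace_at_offsets(string $text, array $matches, string $replacement): string
{
    foreach (array_reverse($matches) as [$matched, $offset]) {
        $text = substr($text, 0, $offset)
              . $replacement
              . substr($text, $offset + strlen($matched));
    }
    return $text;
}

// Hypothetical wrapper: escapes stray '<' first, then stray '>'.
// Same simplifying assumptions as above: no quoted, escaped or nested brackets.
function escape_stray_brackets(string $html): string
{
    preg_match_all('/(?=(<)[^<>]*(?:$|<))/', $html, $m, PREG_OFFSET_CAPTURE);
    $html = replace_at_offsets($html, $m[1], '&lt;');

    // Offsets are recomputed on the updated string for the second pass.
    preg_match_all('/(?=(?:^|>)[^<>]*(>))/', $html, $m, PREG_OFFSET_CAPTURE);
    return replace_at_offsets($html, $m[1], '&gt;');
}

echo escape_stray_brackets('<span>some >text< again some<some tag></some tag>vfs>vf</span>'), "\n";
// <span>some &gt;text&lt; again some<some tag></some tag>vfs&gt;vf</span>
```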

Published: 2023-05-01 10:25:25

Link: https://www.elefans.com/category/jswz/34/1408896.html