Write regex without negations

Updated: 2024-10-23 02:47:32
In a previous post I asked for help rewriting a regex without negation.

Starting regex:

https?:\/\/(?:.(?!https?:\/\/))+$

Ended up with:

https?:[^:]*$

This works fine, but I've noticed that if the URL contains a : other than the one after http or https, the regex no longer matches.

Here is a string which is not working:

sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2

Note the :query2 at the end.

How can I modify the second regex above so that it also matches URLs that contain a :?
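The failure can be reproduced with a short Go program. This is only a sketch: the helper name matchColonFree and the shortened test strings are made up for illustration, and Go is assumed because the answer targets Go's regexp package. Since [^:]*$ forbids any : between the scheme and the end of the string, the second input yields no match at all.

```go
package main

import (
	"fmt"
	"regexp"
)

// colonFree is the second regex from the question: after the scheme's ':'
// it allows no further ':' up to the end of the string.
var colonFree = regexp.MustCompile(`https?:[^:]*$`)

// matchColonFree (hypothetical helper) returns the match of colonFree in s,
// or "" when there is none.
func matchColonFree(s string) string {
	return colonFree.FindString(s)
}

func main() {
	// No extra ':' in the last link: the regex matches it.
	fmt.Printf("%q\n", matchColonFree("sometexthttp://websites.com/path/subpath/query2"))
	// Extra ':' before query2: no match, so the empty string is printed.
	fmt.Printf("%q\n", matchColonFree("sometexthttp://websites.com/path/subpath/:query2"))
}
```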

Expected output:

http://websites.com/path/subpath/:query2

I would also like to match everything up to the first occurrence of ?=param

Input: sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param

Output:

http://websites.com/path/subpath/cc:query2/text/

Best answer

Unfortunately, Go regex does not support lookarounds. However, you can get the last link with a trick: greedily match all possible links and other characters, then capture the last link with a capturing group:

^(?:https?://|.)*(https?://\S+?)(?:\?=|$)

Together with the lazy non-whitespace match \S+?, this also lets you capture the link only up to the ?=.

See regex demo and Go demo

package main

import (
	"fmt"
	"regexp"
)

// The greedy prefix skips earlier links; group 1 captures the last one.
var r = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)

func main() {
	fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2", -1)[0][1])
	fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param", -1)[0][1])
}

Results:

"http://websites.com/path/subpath/:query2"
"http://websites.com/path/subpath/cc:query2/text/"
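To see why the lazy quantifier matters, here is a small sketch comparing the pattern above with a variant that uses a greedy \S+. The greedy variant and the names lazyRe, greedyRe and capture are assumptions added for illustration; they are not part of the original answer. With the greedy form, \S+ runs to the end of the string and the $ alternative succeeds, so the ?=param suffix ends up inside the captured group.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// lazyRe is the pattern from the answer: \S+? stops at the first "?=".
	lazyRe = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
	// greedyRe is a hypothetical variant used only for comparison.
	greedyRe = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+)(?:\?=|$)`)
)

// capture returns group 1 of re in s, or "" when re does not match.
func capture(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	in := "sometexthttp://websites.com/path/subpath/cc:query2/text/?=param"
	fmt.Printf("%q\n", capture(lazyRe, in))   // link without the ?=param suffix
	fmt.Printf("%q\n", capture(greedyRe, in)) // link including the ?=param suffix
}
```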

If the last link can contain spaces, use .+? instead:

^(?:https?://|.)*(https?://.+?)(?:\?=|$)
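Both patterns can be wrapped in a small helper that also handles input without any link; indexing [0][1] on the result, as in the demo, would panic in that case. The following is a minimal sketch; the names noSpaces, withSpaces and lastURL, as well as the sample inputs, are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// noSpaces assumes the last link contains no whitespace.
	noSpaces = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
	// withSpaces allows spaces inside the last link.
	withSpaces = regexp.MustCompile(`^(?:https?://|.)*(https?://.+?)(?:\?=|$)`)
)

// lastURL returns the last link captured by re in s.
// The boolean is false when s contains no link.
func lastURL(re *regexp.Regexp, s string) (string, bool) {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	inputs := []string{
		"sometexthttp://websites.com/a/#q1sometexthttp://websites.com/b/cc:query2/text/?=param",
		"see http://websites.com/a and http://websites.com/my file/?=param",
		"no links here",
	}
	for _, in := range inputs {
		url, ok := lastURL(withSpaces, in)
		fmt.Printf("%q %v\n", url, ok)
	}
}
```

Returning a boolean alongside the string keeps "no link found" distinct from an empty capture, which seems preferable to letting the index expression panic.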

Published: 2023-07-04 11:05:00
Article link: https://www.elefans.com/category/jswz/34/1023585.html