A regular expression that can split a string having nested brackets that are the same

Updated: 2024-10-26 08:26:37

I know that a regular expression can be used to check for matching pairs of opening and closing brackets:

e.g. `a.[b.[c.d]].e` yields the values `a`, `[b.[c.d]]`, and `e`

How can I write a regular expression that can match opening and closing brackets when both are the same symbol?

e.g. `a.|b.|c.d||.e` would yield the values `a`, `|b.|c.d||`, and `e`
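To make the difficulty concrete, here is a minimal sketch of a pattern that copes with only one level of pipe quoting. The pattern and the name `flat-token` are assumptions made for illustration and do not come from the question. It matches either a `|...|` run with no inner pipe or a run of characters that are neither dots nor pipes.

```clojure
;; Illustrative pattern (an assumption, not from the original post):
;; either a pipe-quoted run with no inner pipes, or a plain run
;; containing neither dots nor pipes.
(def flat-token #"\|[^|]*\||[^.|]+")

;; One level of quoting is tokenised as intended.
(re-seq flat-token "a.|b.c|.d")
;; => ("a" "|b.c|" "d")

;; With nesting, the match ends at the first pipe it meets, because an
;; opening pipe and a closing pipe are the same character.
(re-seq flat-token "a.|b.|c.d||.e")
;; => ("a" "|b.|" "c" "d" "||" "e")
```

The second call shows the failure mode: without some notion of depth, the pattern cannot tell that the pipe after `b.` opens a new level instead of closing the first one.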

Update

Thanks for all the comments. I should give some context for the question. I basically want to mimic JavaScript syntax:

    a.hello is a["hello"] or a.hello
    a.|hello| is a[hello]
    a.|b.c.|d.e||.f.|g| is a[b.c[d.e]].f[g]
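The mappings above suggest a simple reading rule: a pipe that directly follows a dot opens a bracket, and any other pipe closes one. The sketch below applies that rule with two literal string replacements. The function name `pipes->brackets` is hypothetical, and the code assumes well-formed input, since it performs no balance checking.

```clojure
(require '[clojure.string :as str])

(defn pipes->brackets
  "Rewrites pipe-quoted dotted syntax into bracket syntax.
  Assumes the input is well formed; nothing is validated."
  [s]
  (-> s
      (str/replace ".|" "[")   ; a pipe right after a dot opens a bracket
      (str/replace "|" "]")))  ; every remaining pipe closes one

(pipes->brackets "a.|hello|")
;; => "a[hello]"

(pipes->brackets "a.|b.c.|d.e||.f.|g|")
;; => "a[b.c[d.e]].f[g]"
```

This is only a textual translation. It does not produce the list of top-level tokens asked for, but it shows that the "dot then pipe" rule is enough to disambiguate opening from closing pipes.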

So what I want to do is break the symbol into:

[`a`, `|b.c.|d.e||`, `f`, `|g|`]

and then recur through the parts that are pipe-quoted.

I've got an implementation of the syntax without pipes here:

https://github.com/zcaudate/purnam

I'm really hoping not to use a parser, mainly because I don't know how to write one and I don't think the problem justifies the added complexity. But if a regex can't cut it, I may have to.

Best answer

Thanks to @m.buettner and @rafal, this is my code in Clojure.

There is a normal mode and a pipe mode, following what m.buettner described:

Helpers:

    (defn conj-if-str [arr s]
      (if (empty? s) arr (conj arr s)))

    (defmacro case-let [[var bound] & body]
      `(let [~var ~bound]
         (case ~var ~@body)))
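As a quick illustration of what the two helpers do, the sketch below repeats their definitions so that it runs on its own and then exercises them. The `classify` function is hypothetical and exists only to show `case-let` binding a value and dispatching on it in one step.

```clojure
(defn conj-if-str [arr s]
  (if (empty? s) arr (conj arr s)))

(defmacro case-let [[var bound] & body]
  `(let [~var ~bound]
     (case ~var ~@body)))

;; conj-if-str skips empty strings, so no empty tokens are collected.
(conj-if-str ["a"] "")   ;; => ["a"]
(conj-if-str ["a"] "b")  ;; => ["a" "b"]

;; Hypothetical example: bind the first character, then dispatch on it.
(defn classify [s]
  (case-let [ch (first s)]
    nil :end
    \.  :dot
    \|  :pipe
    [:other ch]))

(classify "")    ;; => :end
(classify ".a")  ;; => :dot
(classify "x")   ;; => [:other \x]
```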

Pipe Mode:

    (declare split-dotted) ;; normal mode, defined below

    (defn split-dotted-pipe ;; pipe mode
      ([output current ss] (split-dotted-pipe output current ss 0))
      ([output current ss level]
       (case-let [ch (first ss)]
         ;; input ended inside a pipe-quoted segment
         nil (throw (Exception. "Cannot have an unpaired pipe"))
         ;; a bare pipe closes the innermost level; at level 0 it ends pipe mode
         \| (case level
              0 (trampoline split-dotted (conj output (str current "|")) "" (next ss))
              (recur output (str current "|") (next ss) (dec level)))
         ;; a dot followed by a pipe opens a nested level
         \. (case-let [nch (second ss)]
              nil (throw (Exception. "Incomplete dotted symbol"))
              \| (recur output (str current ".|") (nnext ss) (inc level))
              (recur output (str current "." nch) (nnext ss) level))
         (recur output (str current ch) (next ss) level))))

Normal Mode:

    (defn split-dotted
      ([ss] (split-dotted [] "" ss))
      ([output current ss]
       (case-let [ch (first ss)]
         ;; end of input: emit the last token, if any
         nil (conj-if-str output current)
         ;; a dot ends the current token; a following pipe switches to pipe mode
         \. (case-let [nch (second ss)]
              nil (throw (Exception. "Cannot have . at the end of a dotted symbol"))
              \| (trampoline split-dotted-pipe (conj-if-str output current) "|" (nnext ss))
              (recur (conj-if-str output current) (str nch) (nnext ss)))
         ;; a pipe that does not follow a dot is an error in normal mode
         \| (throw (Exception. "Cannot have | during split mode"))
         (recur output (str current ch) (next ss)))))

Tests:

    (fact "split-dotted"
      (js/split-dotted "a") => ["a"]
      (js/split-dotted "a.b") => ["a" "b"]
      (js/split-dotted "a.b.c") => ["a" "b" "c"]
      (js/split-dotted "a.||") => ["a" "||"]
      (js/split-dotted "a.|b|.c") => ["a" "|b|" "c"]
      (js/split-dotted "a.|b|.|c|") => ["a" "|b|" "|c|"]
      (js/split-dotted "a.|b.c|.|d|") => ["a" "|b.c|" "|d|"]
      (js/split-dotted "a.|b.|c||.|d|") => ["a" "|b.|c||" "|d|"]
      (js/split-dotted "a.|b.|c.d.|e|||.|d|") => ["a" "|b.|c.d.|e|||" "|d|"])

    (fact "split-dotted exceptions"
      (js/split-dotted "|a|") => (throws Exception)
      (js/split-dotted "a.") => (throws Exception)
      (js/split-dotted "a.|||") => (throws Exception)
      (js/split-dotted "a.|b.||") => (throws Exception))
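For comparison, the normal-mode and pipe-mode pair can also be folded into a single loop that tracks nesting depth, with a second function covering the "recur through the pipe-quoted parts" step from the question. This is a sketch under stated assumptions and not the author's code: the names `split-top-level` and `parse-dotted` are hypothetical, errors are raised with `ex-info`, and edge cases such as consecutive dots may behave differently from the functions above.

```clojure
(defn split-top-level
  "Splits s on dots that lie outside pipe quotes. A pipe directly after
  a dot opens a quote; any other pipe closes the innermost one."
  [s]
  (loop [chars (seq s), prev nil, depth 0, current "", out []]
    (if-let [ch (first chars)]
      (let [more (next chars)]
        (cond
          ;; a pipe directly after a dot opens a (possibly nested) quote
          (and (= ch \|) (= prev \.))
          (recur more ch (inc depth) (str current ch) out)

          ;; any other pipe closes the innermost quote
          (= ch \|)
          (cond
            (zero? depth) (throw (ex-info "Pipe outside a quote" {:input s}))
            (= depth 1)   (recur more ch 0 "" (conj out (str current ch)))
            :else         (recur more ch (dec depth) (str current ch) out))

          ;; a top-level dot ends the current token
          (and (= ch \.) (zero? depth))
          (recur more ch depth "" (cond-> out (seq current) (conj current)))

          ;; everything else, including dots inside quotes, is kept
          :else
          (recur more ch depth (str current ch) out)))
      (cond
        (pos? depth) (throw (ex-info "Unpaired pipe" {:input s}))
        (= prev \.)  (throw (ex-info "Trailing dot" {:input s}))
        :else        (cond-> out (seq current) (conj current))))))

(defn parse-dotted
  "Recursively expands pipe-quoted tokens into nested vectors."
  [s]
  (mapv (fn [token]
          (if (= (first token) \|)
            ;; strip the outer pipes and recur on what is inside
            (parse-dotted (subs token 1 (dec (count token))))
            token))
        (split-top-level s)))

(split-top-level "a.|b.c.|d.e||.f.|g|")
;; => ["a" "|b.c.|d.e||" "f" "|g|"]

(parse-dotted "a.|b.c.|d.e||.f.|g|")
;; => ["a" ["b" "c" ["d" "e"]] "f" ["g"]]
```

The depth counter plays the role of `level` in the pipe-mode function, and a depth of zero corresponds to normal mode. Keeping the previous character is what lets one loop apply the "pipe after a dot opens" rule without looking ahead.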

Published: 2023-08-01 13:59:00
Source: https://www.elefans.com/category/jswz/34/1359317.html