How to capture a list of tokens in an ANTLR3 tree grammar?

Take a dummy language as an example: it simply accepts one or more '!'. Its lexer and parser rules are:

    grammar Ns;

    options {
      output=AST;
      ASTLabelType=CommonTree;
    }

    tokens {
      NOTS;
    }

    @header {
      package test;
    }

    @lexer::header {
      package test;
    }

    ns
      : NOT+ EOF -> ^(NOTS NOT+)
      ;

    NOT
      : '!'
      ;

OK, as you can see, this represents a language that accepts '!', '!!!', '!!!!!' and so on.

I also defined some classes to represent the AST:

    // Not.java
    package test;

    public class Not {
        public static final Not SINGLETON = new Not();

        private Not() { }
    }

    // Ns.java
    package test;

    import java.util.ArrayList;
    import java.util.List;

    public class Ns {
        private List<Not> nots;

        // Adds one Not per character of the given string.
        public Ns(String nots) {
            this.nots = new ArrayList<Not>();
            for (int i = 0; i < nots.length(); i++) {
                this.nots.add(Not.SINGLETON);
            }
        }

        public String toString() {
            String ret = "";
            for (int i = 0; i < this.nots.size(); i++) {
                ret += "!";
            }
            return ret;
        }
    }

Here is the tree grammar:

    tree grammar NsTreeWalker;

    options {
      output = AST;
      tokenVocab = Ns;
      ASTLabelType = CommonTree;
    }

    @header {
      package test;
    }

    ns returns [Ns ret]
      : ^(NOTS n=NOT+) {$ret = new Ns($n.text);}
      ;

And here is the main class, with some sample input to test the generated classes:

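    package test;

    import java.io.ByteArrayInputStream;

    import org.antlr.runtime.ANTLRInputStream;
    import org.antlr.runtime.CommonTokenStream;
    import org.antlr.runtime.RecognitionException;
    import org.antlr.runtime.tree.CommonTree;
    import org.antlr.runtime.tree.CommonTreeNodeStream;

    public class Test {
        public static void main(String[] args) throws Exception {
            // Lex and parse the sample input "!!!" into an AST.
            ANTLRInputStream input = new ANTLRInputStream(
                    new ByteArrayInputStream("!!!".getBytes("utf-8")));
            NsLexer lexer = new NsLexer(input);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            NsParser parser = new NsParser(tokens);
            CommonTree root = (CommonTree) parser.ns().getTree();

            // Walk the AST and print the Ns object the walker builds.
            NsTreeWalker walker = new NsTreeWalker(new CommonTreeNodeStream(root));
            try {
                NsTreeWalker.ns_return r = walker.ns();
                System.out.println(r.ret);
            } catch (RecognitionException e) {
                e.printStackTrace();
            }
        }
    }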

But the output printed is '!' rather than the expected '!!!'. That is mainly because of this line:

ns returns [Ns ret] : ^(NOTS n=NOT+) {$ret = new Ns($n.text);};

The $n above captures only one '!'. I don't know how to capture all three '!' tokens, in other words, a list of '!' in $n. Could someone help? Thanks!

Best answer

Only one ! gets printed because your rule:

ns returns [Ns ret] : ^(NOTS n=NOT+) {$ret = new Ns($n.text);} ;

gets more or less translated as:

    Token n = null
    LOOP
      n = match NOT_token
    END
    return new Ns(n.text)

Therefore, n.text will always be just a single !, because each iteration overwrites the previous match.

What you need to do is collect these NOT tokens in a list. In ANTLR you can create a list of tokens using the += operator instead of the "single token" operator = (in a tree grammar, the list holds the matched tree nodes). So change your ns rule into:

ns returns [Ns ret] : ^(NOTS n+=NOT+) {$ret = new Ns($n);} ;

which gets translated as:

    List n = null
    LOOP
      if n is null: n = new list
      n.add(match NOT_token)
    END
    return new Ns(n)
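To make the difference concrete, here is a minimal plain-Java sketch of the two loops. It is not the code ANTLR actually generates: `LabelSketch`, `singleLabel` and `listLabel` are hypothetical names, and plain strings stand in for the matched NOT nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the generated walker; NOT ANTLR's real output.
final class LabelSketch {

    // Mimics n=NOT+ : every match overwrites the previous one.
    static String singleLabel(List<String> matched) {
        String n = null;
        for (String node : matched) {
            n = node;
        }
        return n; // only the last match survives
    }

    // Mimics n+=NOT+ : every match is appended to a lazily created list.
    static List<String> listLabel(List<String> matched) {
        List<String> n = null;
        for (String node : matched) {
            if (n == null) {
                n = new ArrayList<String>();
            }
            n.add(node);
        }
        return n; // all matches, or null if nothing was matched
    }

    public static void main(String[] args) {
        List<String> matched = Arrays.asList("!", "!", "!");
        System.out.println(singleLabel(matched));      // prints: !
        System.out.println(listLabel(matched).size()); // prints: 3
    }
}
```

With the single label the loop still runs three times, but only the last node is left by the time the action executes; the list label keeps all three.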

Be sure to change the constructor of your Ns class to take a List instead:

    public Ns(List nots) {
        this.nots = new ArrayList<Not>();
        for (Object o : nots) {
            this.nots.add(Not.SINGLETON);
        }
    }
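For reference, here is a self-contained sketch of the adjusted classes that can be run without ANTLR on the classpath. The `NsDemo` class and the use of strings in place of tree nodes are assumptions made for illustration, and the null check is an extra precaution rather than something the original code requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical driver: plain strings stand in for the nodes collected by n+=NOT+.
final class NsDemo {
    public static void main(String[] args) {
        Ns ns = new Ns(Arrays.asList("!", "!", "!"));
        System.out.println(ns); // prints: !!!
    }
}

final class Not {
    static final Not SINGLETON = new Not();

    private Not() { }
}

final class Ns {
    private final List<Not> nots = new ArrayList<>();

    // Takes the list built by a += label and stores one Not per element.
    // The element type is irrelevant here, so a wildcard list is accepted.
    Ns(List<?> matchedNodes) {
        if (matchedNodes != null) { // precaution: treat a missing list as empty
            for (Object ignored : matchedNodes) {
                nots.add(Not.SINGLETON);
            }
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nots.size(); i++) {
            sb.append('!');
        }
        return sb.toString();
    }
}
```

Only the size of the list matters to this constructor, which is why the loop variable is never used.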

After that, the output of your test class will be:

!!!

Good luck!
