ANTLR 4 
Cool Features 
by Alexander Vasiltsov
What features? 
● Recursive rules 
● Precedence 
● Associativity 
● Actions 
● Attributes 
● Semantic predicates 
● Commands
Recursive rules 
With direct left recursion:
expr : expr '*' expr
| expr '+' expr
| INT
| '(' expr ')'
;
The same operators without left recursion (parentheses omitted):
expr : addExpr;
addExpr : multExpr ('+' multExpr)*;
multExpr : atom ('*' atom)*;
atom : INT;
Precedence 
An alternative listed earlier in a rule has higher
precedence than the alternatives listed after it
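A minimal sketch of how alternative order decides grouping. The grammar name `PrecedenceDemo` and the token names `INT` and `WS` are chosen here for illustration only:

```antlr
grammar PrecedenceDemo;

// '*' is listed before '+', so it binds tighter:
// 1+2*3 is parsed as 1+(2*3), not (1+2)*3.
expr : expr '*' expr
     | expr '+' expr
     | INT
     | '(' expr ')'
     ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

Swapping the first two alternatives would make '+' bind tighter than '*'.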
Associativity 
By default, ANTLR associates operators left to right:
expr : expr '^' expr
| INT
;
The assoc option changes the direction of associativity:
expr : <assoc=right> expr '^' expr
| INT
;
assoc option 
expr : <assoc=right> expr '^' expr
| expr '*' expr
| expr '+' expr
| INT
| '(' expr ')'
;
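To make the effect concrete, here is a small sketch. The grammar name `AssocDemo` and the lexer rules are assumptions added for illustration; the comments describe how two sample inputs would be grouped:

```antlr
grammar AssocDemo;

// '^' is right-associative:        2^3^2 groups as 2^(3^2)
// '*' and '+' keep the default:    1+2+3 groups as (1+2)+3
expr : <assoc=right> expr '^' expr
     | expr '*' expr
     | expr '+' expr
     | INT
     ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

The option is attached to a single alternative, so it affects only the operator in that alternative.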
Actions 
rule : subrule {<some code in target language> } 
; 
rule2 : alternative1 {<some code for alternative1> } 
| alternative2 {<some code for alternative2> } 
; 
rule3 : subrule1 {<some code after subrule1 matched> } 
TOKEN {<some code after TOKEN matched> } 
subrule2 {<some code after whole rule matched> } 
;
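A concrete version of the pattern above might look like this sketch, assuming the C# target. The grammar name `ActionDemo` and the rule `greeting` are hypothetical:

```antlr
grammar ActionDemo;

@header {
using System;
}

// Each action runs as soon as the element before it has been matched.
greeting
    : 'hello' { Console.WriteLine("saw 'hello'"); }
      ID      { Console.WriteLine("name: " + $ID.text); }
    ;

ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;
```

For the input `hello world`, the first action fires after `hello` is consumed and the second after the `ID` token is consumed.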
Actions and attributes 
<rulename> [<arguments list>] 
returns [<return values>] 
locals [<local members>] 
@init { 
<init code> 
} 
@after { 
<finalization code> 
} 
: <rule description> 
;
Actions and attributes (2) 
rule [int i, string s] 
returns [string[] vals, int max] 
locals [int loc=0] 
@init { 
//some init code here 
} 
@after { 
//some post-processing code 
} 
: 'TOKENS'+ 
; 
public partial class RuleContext : ParserRuleContext { 
public int i; 
public string s; 
public string[] vals; 
public int max; 
public int loc; 
public RuleContext(ParserRuleContext parent, int invokingState, int i, string s) : base(parent, invokingState)
{ 
this.i = i; 
this.s = s; 
} 
... 
} 
public RuleContext rule(int i, string s) { 
//some init code here 
... //parsing logic here 
//some post-processing code 
}
Custom attributes 
parent_rule : rule[0, "test"] {Console.WriteLine("max: "+ $rule.max);} 
; 
rule [int i, string s] 
returns [string[] vals, int max] 
locals [int loc=0] 
@init { 
$ctx.vals = new[] {s}; 
$ctx.loc = i + 10; 
} 
@after { 
$ctx.max *= 10; 
} 
: (TOKEN { if ($ctx.loc > $TOKEN.text.Length) $ctx.max++;})+ 
;
Predefined token attributes 
Attribute  Type    Description
text       string  Text matched
type       int     Token type
line       int     Line number (counting from 1)
pos        int     Position in line (counting from 0)
index      int     Overall token offset inside input stream
channel    int     Channel where token was emitted
int        int     Token integer value
Predefined rule attributes 
Attribute       Type               Description
$ctx            ParserRuleContext  Context object
$ctx.GetText()  string             Text matched
$ctx.start      IToken             First token
$ctx.stop       IToken             Last token
Where else to place code 
grammar MyGrammar;
@header {
using System;
}
@lexer::header { ... }
@parser::header { ... }
@members {
private int myField;
public int MyProperty {get; set;}
public void MyMethod() {
//do something
}
}
@lexer::members { ... }
@parser::members { ... }
rule : subrule TOKEN+ ;
Bonus for C# 
● MyGrammar.g4.lexer.cs 
namespace MyNamespace 
{ 
partial class SimpleLexer 
{ 
} 
} 
● MyGrammar.g4.parser.cs 
namespace MyNamespace 
{ 
partial class SimpleParser 
{ 
} 
}
Semantic predicates
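The slide text for this topic did not survive, so the following is only a minimal sketch of the general syntax, `{<boolean expression>}?`, assuming the C# target. The grammar name `PredicateDemo`, the rule `decl` and the member `AllowEnums` are hypothetical:

```antlr
grammar PredicateDemo;

@parser::members {
    // Hypothetical switch, set by the calling code before parsing.
    public bool AllowEnums = true;
}

// The 'enum' alternative is viable only while the predicate is true.
decl
    : {AllowEnums}? 'enum' ID
    | 'class' ID
    ;

ID : [a-zA-Z_]+ ;
WS : [ \t\r\n]+ -> skip ;
```

When `AllowEnums` is false, the parser treats the first alternative as if it were not in the grammar.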
Lexer commands 
TokenName : alternative -> command-name[(parameter)] ;
Command      Effect
skip         Discards the token; it is not sent to the parser
type(T)      Sets the token type to T
channel(C)   Sends the token to channel C
mode(M)      Switches the lexer to mode M
pushMode(M)  Pushes the current mode onto the stack, then switches the lexer to mode M
popMode      Switches the lexer to the mode popped from the top of the stack
more         Matches the text but keeps scanning; the text becomes part of the next token
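The most common command is `skip`. A minimal sketch, with the grammar name `SkipDemo` and the rule names chosen for illustration:

```antlr
lexer grammar SkipDemo;

ID : [a-zA-Z]+ ;

// Whitespace is matched and then discarded; the parser never sees it.
WS : [ \t\r\n]+ -> skip ;
```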
type(T) 
Sets the token's type to the specified type T
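One typical use is to let several lexer rules produce the same token type. The sketch below assumes illustrative names (`TypeDemo`, `DOUBLE`, `SINGLE`, `STRING`):

```antlr
lexer grammar TypeDemo;

// STRING has no rule of its own, so it is declared here.
tokens { STRING }

// Both quoting styles are reported to the parser as STRING.
DOUBLE : '"' .*? '"'   -> type(STRING) ;
SINGLE : '\'' .*? '\'' -> type(STRING) ;
WS     : [ \t\r\n]+    -> skip ;
```

Parser rules then only need to refer to `STRING`, regardless of which quotes were used.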
channel(C) 
Sends the token to the specified channel
Predefined channels: 
● Token.DEFAULT_CHANNEL 
● Token.HIDDEN_CHANNEL
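Inside a grammar, the hidden channel is referred to by the name `HIDDEN`. A small sketch, with the grammar and rule names assumed for illustration:

```antlr
lexer grammar ChannelDemo;

ID : [a-zA-Z]+ ;

// Comments stay in the token stream, but on the hidden channel,
// so the parser ignores them while other tools can still read them.
COMMENT : '//' ~[\r\n]* -> channel(HIDDEN) ;

WS : [ \t\r\n]+ -> skip ;
```

Unlike `skip`, this keeps the comment tokens available, for example to a pretty-printer that walks the full token stream.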
mode(M), pushMode(M), popMode 
These commands switch the lexer to the specified mode.
Each mode contains its own set of rules.
Default mode: DEFAULT_MODE
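A sketch of a two-mode lexer, loosely modelled on tag-style markup. The names `ModeDemo`, `INSIDE` and the token names are assumptions. Note that modes are only allowed in a `lexer grammar`, not in a combined grammar:

```antlr
lexer grammar ModeDemo;

// Default mode: plain text until an opening angle bracket.
OPEN : '<' -> pushMode(INSIDE) ;
TEXT : ~'<'+ ;

mode INSIDE;

// These rules are active only between '<' and '>'.
CLOSE : '>' -> popMode ;
NAME  : [a-zA-Z]+ ;
WS    : [ \t\r\n]+ -> skip ;
```

`pushMode` and `popMode` are used as a pair here, so the lexer returns to whichever mode it was in when `<` was seen.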
more 
Matches the text but keeps scanning; the matched text becomes part of the next token emitted
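A common use of `more` is collecting a string literal piece by piece. The sketch below uses illustrative names (`MoreDemo`, `STR`, `LQUOTE`, `STRING`, `TEXT`):

```antlr
lexer grammar MoreDemo;

// The opening quote starts a token but does not finish it.
LQUOTE : '"' -> more, mode(STR) ;
WS     : [ \t\r\n]+ -> skip ;

mode STR;

// The closing quote ends the token; its text includes
// everything gathered by the 'more' rules before it.
STRING : '"' -> mode(DEFAULT_MODE) ;
TEXT   : .   -> more ;
```

Only `STRING` tokens reach the parser; `LQUOTE` and `TEXT` never appear as tokens of their own.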
For those who like hardcore 
It’s possible to override any method in the generated lexer and parser classes. Choose one that fits your
needs and act!

Binary Studio Academy PRO: ANTLR course by Alexander Vasiltsov (lesson 4)


Editor's Notes

  • #2 Semantic predicates Recursive rules Attributes Grammar commands Options