Vladimir Kozhaev
● Freelancer
● Development of programming
languages and development
tools.
● If you need a domain-specific
language, an IDE based on Eclipse
or IntelliJ IDEA, a parser, or a
compiler, do not hesitate to ask
me.
● email: vkozhaev@gmaill.com,
skype: vladimir.kozhaev,
blog: http://gamesdevandmath.blogspot.com
● ANTLR 4, DSL development in depth
ANTLR overview
● ANTLR (ANother Tool for Language
Recognition) is a powerful parser generator for
reading, processing, executing, or translating
structured text or binary files. It's widely used to
build languages, tools, and frameworks. From a
grammar, ANTLR generates a parser that can
build and walk parse trees.
● Currently available version: 4.5.3
● Java, C#, JavaScript, Python2, Python3 targets
● Some alternatives: JFlex, JLex, CookCC, AustenX
Why ANTLR?
● Written in Java
● Actively developed project
● Open source
● Wide range of tools and target languages
● Existing community
● Good book: The Definitive ANTLR 4 Reference
by Terence Parr
● Plenty of syntactic sugar: left recursion, right
associativity, etc.
What is ANTLR
● Parsers (for example, for log files)
● Translators (from JSON to XML,
pseudocode to Java)
● Domain Specific Languages (DSL)
ANTLR patterns
Lexemes vs fragments: what is the
difference?
● A fragment is never counted as a token; it
only serves to simplify a grammar.
● Fragments can only be referenced from lexer
rules.
● A NUMBER rule built from fragments always
yields a NUMBER token, regardless of whether it
matched "1234", "0xab12", or "0777"
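The fragment idea can be sketched in plain Java as a rough analogy. This is not ANTLR-generated code: the names DIGIT, HEX_DIGIT, OCT_DIGIT and tokenType are hypothetical, and java.util.regex stands in for the lexer. The helper strings play the role of fragments, and only the composed NUMBER pattern ever produces a token type.

```java
import java.util.regex.Pattern;

/** Illustrative analogy only: plain java.util.regex, not ANTLR-generated code. */
final class FragmentSketch {
    // "Fragments": reusable pieces that never become tokens by themselves.
    static final String DIGIT = "[0-9]";
    static final String HEX_DIGIT = "[0-9a-fA-F]";
    static final String OCT_DIGIT = "[0-7]";

    // The only "token rule": hex, octal, or decimal, all reported as NUMBER.
    static final Pattern NUMBER = Pattern.compile(
            "0[xX]" + HEX_DIGIT + "+"   // hexadecimal, e.g. 0xab12
            + "|0" + OCT_DIGIT + "+"    // octal, e.g. 0777
            + "|" + DIGIT + "+");       // decimal, e.g. 1234

    /** Returns "NUMBER" if the whole text is a number, otherwise null. */
    static String tokenType(String text) {
        return NUMBER.matcher(text).matches() ? "NUMBER" : null;
    }
}
```

Calling FragmentSketch.tokenType with "1234", "0xab12", or "0777" gives the same type each time; the caller never sees which helper pattern did the matching.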
Priority Rules
Several lexer rules can match the same input
text. In that case, the token type will be chosen
as follows:
● First, select the lexer rule which matches the
longest input
● If several lexer rules match the same input
length, choose the first one, based on definition
order
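The two priority rules above can be sketched in plain Java. This is a minimal illustration, not ANTLR's actual implementation: the rule names IF, ID and INT and the method firstTokenType are assumptions made for the example, and java.util.regex stands in for the generated lexer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch of "longest match, then first rule"; not ANTLR's implementation. */
final class PrioritySketch {
    // Hypothetical rules in definition order (LinkedHashMap keeps insertion order).
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("IF", Pattern.compile("if"));
        RULES.put("ID", Pattern.compile("[a-z]+"));
        RULES.put("INT", Pattern.compile("[0-9]+"));
    }

    /** Returns the type of the token that starts at the beginning of input, or null. */
    static String firstTokenType(String input) {
        String bestType = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // Strictly greater: on a tie, the rule defined earlier keeps the win.
            if (m.lookingAt() && m.end() > bestLength) {
                bestLength = m.end();
                bestType = rule.getKey();
            }
        }
        return bestType;
    }
}
```

With these rules, "if" is matched by both IF and ID with the same length, so the earlier rule IF wins, while "iffy" goes to ID because ID matches more characters.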
Error recovery strategies
Error recovery
Visitors, listeners, actions
● Example 1, with a visitor
● Example 2, with a listener
● Example 3, with an embedded action
● See https://github.com/vladimirkozhaev/https---github.com-vladimirkozhaev-calc_antlr
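The difference between the visitor and listener styles can be sketched without ANTLR. Everything below is hand-written for illustration and is not the code from the linked repository: Node, Num, Add, EvalVisitor, EvalListener and walk are hypothetical names, not ANTLR-generated classes. Embedded actions, which live inside the grammar file itself, are not covered by this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative sketch: hand-written tree, visitor and listener; not ANTLR-generated classes. */
final class TreeStylesSketch {
    interface Node { <T> T accept(Visitor<T> v); }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public <T> T accept(Visitor<T> v) { return v.visitNum(this); }
    }

    static final class Add implements Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    // Visitor: the visitor decides which children to visit and returns a value.
    interface Visitor<T> { T visitNum(Num n); T visitAdd(Add a); }

    static final class EvalVisitor implements Visitor<Integer> {
        public Integer visitNum(Num n) { return n.value; }
        public Integer visitAdd(Add a) { return a.left.accept(this) + a.right.accept(this); }
    }

    // Listener: a walker drives the traversal; callbacks return nothing,
    // so intermediate results are kept on an explicit stack.
    interface Listener { void exitNum(Num n); void exitAdd(Add a); }

    static final class EvalListener implements Listener {
        final Deque<Integer> stack = new ArrayDeque<>();
        public void exitNum(Num n) { stack.push(n.value); }
        public void exitAdd(Add a) { stack.push(stack.pop() + stack.pop()); }
    }

    /** Depth-first walk that fires the exit callback after a node's children. */
    static void walk(Node node, Listener listener) {
        if (node instanceof Add) {
            Add a = (Add) node;
            walk(a.left, listener);
            walk(a.right, listener);
            listener.exitAdd(a);
        } else {
            listener.exitNum((Num) node);
        }
    }

    static int evalWithVisitor(Node tree) { return tree.accept(new EvalVisitor()); }

    static int evalWithListener(Node tree) {
        EvalListener listener = new EvalListener();
        walk(tree, listener);
        return listener.stack.pop();
    }
}
```

Both styles compute the same result for the same tree; the visitor carries values through return types, while the listener needs its own state because its callbacks return nothing.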
Lexer vs parser rules
● Lexing and parsing are two steps
of the same recognition process
● Lexer rules split the input text
into tokens
● Parser rules handle higher-level
structure built from those tokens
Right associativity
expr : <assoc=right> expr '^' expr   // ^ operator is right associative
     | INT
     ;
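What right associativity means for evaluation can be sketched with a small hand-written evaluator. This is not what ANTLR generates: the class RightAssocSketch and its methods are hypothetical, and the input is assumed to be integers and '^' separated by single spaces.

```java
/** Illustrative sketch: hand-written evaluator for INT ('^' INT)*, grouped to the right. */
final class RightAssocSketch {
    private final String[] tokens;
    private int pos = 0;

    private RightAssocSketch(String input) {
        // Assumes tokens are separated by single spaces, e.g. "2 ^ 3 ^ 2".
        this.tokens = input.trim().split(" ");
    }

    static long eval(String input) {
        return new RightAssocSketch(input).expr();
    }

    // expr : INT ('^' expr)? ; recursing on the right operand gives right associativity.
    private long expr() {
        long base = Long.parseLong(tokens[pos++]);
        if (pos < tokens.length && tokens[pos].equals("^")) {
            pos++;                    // consume '^'
            long exponent = expr();   // everything to the right is the exponent
            return power(base, exponent);
        }
        return base;
    }

    // Simple repeated multiplication; assumes a non-negative exponent.
    private static long power(long base, long exponent) {
        long result = 1;
        for (long i = 0; i < exponent; i++) {
            result *= base;
        }
        return result;
    }
}
```

RightAssocSketch.eval("2 ^ 3 ^ 2") groups the input as 2 ^ (3 ^ 2) and returns 512; a left-associative grouping (2 ^ 3) ^ 2 would give 64 instead.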
Eclipse ANTLR IDE
Questions
● Any questions? Please ask.
