1. Hello ANTLR: ANother Tool for Language Recognition
2. Where can we use ANTLR?
3. Why not just use regular expressions?
4. Tools under ANTLR umbrella
5. ANTLR basic syntax
6. ANTLR on a real example
Using ANTLR on a real example: converting "string combined" queries into parameterized queries
Simon Wiki says: ANTLR (pronounced Antler), or ANother Tool for Language Recognition, is a parser generator that uses LL(*) parsing. ANTLR takes as input a grammar that specifies a language and generates as output source code for a recognizer for that language. A language is specified using a context-free grammar, which is expressed using Extended Backus–Naur Form (EBNF). ANTLR allows generating lexers, parsers, tree parsers, and combined lexer-parsers. Parsers can automatically generate abstract syntax trees, which can be further processed with tree parsers. ANTLR provides a single consistent notation for specifying lexers, parsers, and tree parsers. This is in contrast with other parser/lexer generators and adds greatly to the tool's ease of use.
Used in at least the following products:
• Drools, JBoss rule engine (DRL DSL)
• Hibernate, Java ORM (HQL DSL)
• NHibernate, .NET ORM (HQL DSL)
• Groovy, language for JVM
• Jython, language for JVM
Where do we need ANTLR?
• Parsing a text stream of formal data
• Parsing a text stream of incomplete formal data
• Complex parsing
• Parsing with good error handling
• Writing a Domain-Specific Language
• You have enough time and some data to parse...
Why not just use regular expressions?
• In most cases you should go with RegEx
• SO: “RegEx is a text search tool. If all you need to do is pull strings out of strings then it's often the hammer of choice.”
• SO: “ANTLR is a parser generator. If you need error messages and parse actions or any of the complicated things that come with an interpreter/compiler then it's a good option.”
• SO: “ANTLR has perfect support for "error-messages": they show line/column numbers and what was wrong. RegEx doesn't have this support.”
• ANTLR is something (a lot of things) on top of the regular expression language.
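To make the RegEx-versus-parser point concrete, here is a minimal sketch in Python. It is hand-written and does not use ANTLR or ANTLR-generated code; the names `tokenize`, `parse`, and `ParseError` are hypothetical. A regular expression does the token-level work, and a small recursive-descent parser on top of it handles nested function calls and reports the column where the input went wrong, which is the kind of error message a bare regex cannot give.

```python
import re

# Token-level work is a job for a regex: dotted names, parentheses, commas.
TOKEN = re.compile(r"[A-Za-z_][\w.]*|[(),]")


class ParseError(Exception):
    """Parse failure that remembers the 1-based column of the problem."""

    def __init__(self, message, column):
        super().__init__(f"column {column}: {message}")
        self.column = column


def tokenize(text):
    """Return a list of (token_text, column) pairs ending with an EOF marker."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(text, pos)
        if match is None:
            raise ParseError(f"unexpected character {text[pos]!r}", pos + 1)
        tokens.append((match.group(), pos + 1))
        pos = match.end()
    tokens.append(("<EOF>", len(text) + 1))
    return tokens


def parse_expr(tokens, i):
    """expr := NAME [ '(' [ expr (',' expr)* ] ')' ]  (nesting needs recursion)."""
    text, column = tokens[i]
    if text in ("(", ")", ",", "<EOF>"):
        raise ParseError(f"expected a name but found {text!r}", column)
    i += 1
    if tokens[i][0] != "(":
        return text, i  # a plain name, not a call
    i += 1
    args = []
    if tokens[i][0] != ")":
        while True:
            arg, i = parse_expr(tokens, i)
            args.append(arg)
            if tokens[i][0] != ",":
                break
            i += 1
    if tokens[i][0] != ")":
        raise ParseError(f"expected ')' but found {tokens[i][0]!r}", tokens[i][1])
    return (text, args), i + 1


def parse(text):
    """Parse one expression and require that nothing follows it."""
    tokens = tokenize(text)
    tree, i = parse_expr(tokens, 0)
    if tokens[i][0] != "<EOF>":
        raise ParseError(f"unexpected {tokens[i][0]!r}", tokens[i][1])
    return tree
```

For example, `parse("OUTER(INNER(a, b), c)")` returns a nested tuple tree, while `parse("OUTER(INNER(a, b)")` raises a `ParseError` pointing at the column where the closing parenthesis is missing. A parser generator such as ANTLR produces this kind of code from a grammar instead of having it written by hand.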
Real example. Test cases
• Query without any parameters
• Query with concat and variable
• Query with dotted and escaped table names and single quote in sql
• Query with function call and func args concat
• Query with function call with several func args
• Query with nested function call with several func args
• Query with concat and two variables
• Insert query with four params
• Query with dotted param and function name and function arg
• Endline symbol will be dropped from query
• Single line comment will be dropped from query
• Strip single quote only if it is next to a parameter
• Query with like keyword (FAILED)
• Refactor multiline query (FAILED)
Real example. Syntax tree
strsql = "SELECT * FROM TABLE_NAME WHERE FIRST_FIELD = " & DOTTED.PARAM_VAR & " AND SECOND_FIELD = " & DOTTED.FUNC_CALL(DOTTED.FUNC_ARG)
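The intended transformation of this statement can be sketched in Python. This is not the ANTLR grammar from the talk, just a hand-written illustration under assumptions: `split_concat` and `parameterize` are hypothetical names, the placeholder is assumed to be `?`, and every non-literal operand of `&` is assumed to become one parameter. It does not cover the harder test cases (single-quote stripping, comments, `LIKE`, multiline queries).

```python
def split_concat(expr):
    """Split an expression on '&' that sits outside string literals and parentheses."""
    parts, current = [], []
    depth, in_string = 0, False
    for ch in expr:
        if ch == '"':
            # A doubled quote ("") inside a literal toggles twice, so the state stays correct.
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "&" and depth == 0:
                parts.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parameterize(statement):
    """Turn `name = "sql" & value & ...` into (name, sql_with_placeholders, params)."""
    target, _, expr = statement.partition("=")  # split at the assignment, not inside the SQL
    sql, params = [], []
    for part in split_concat(expr):
        if len(part) >= 2 and part.startswith('"') and part.endswith('"'):
            # String literal: keep its text, unescaping doubled quotes.
            sql.append(part[1:-1].replace('""', '"'))
        else:
            # Variable or function call: replace it with a placeholder (assumed '?').
            sql.append("?")
            params.append(part)
    return target.strip(), "".join(sql), params


statement = (
    'strsql = "SELECT * FROM TABLE_NAME WHERE FIRST_FIELD = " & DOTTED.PARAM_VAR'
    ' & " AND SECOND_FIELD = " & DOTTED.FUNC_CALL(DOTTED.FUNC_ARG)'
)
name, sql, params = parameterize(statement)
print(name)    # strsql
print(sql)     # SELECT * FROM TABLE_NAME WHERE FIRST_FIELD = ? AND SECOND_FIELD = ?
print(params)  # ['DOTTED.PARAM_VAR', 'DOTTED.FUNC_CALL(DOTTED.FUNC_ARG)']
```

A character scanner like this gets brittle quickly as the test cases grow, which is the argument for describing the expression with a grammar and walking the resulting syntax tree instead.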