5/11/2021 Saeed Parsa 1
Compiler Design
Top down parsers
Saeed Parsa
Room 332,
School of Computer Engineering,
Iran University of Science & Technology
parsa@iust.ac.ir
Winter 2021
Who, when/where, and what?
• Who are we?
• Lecturer
• Saeed Parsa
• Associate Professor in IUST
• Research Area: Software Engineering, Software Testing,
Software Debugging, Reverse Engineering, etc.
• Email: parsa@iust.ac.ir
• More Information:
• http://parsa.iust.ac.ir/
• Slide Share
• https://www.slideshare.net/SaeedParsa
Top-Down Parsing
• A predictive parser is characterized by its ability to choose the production to apply solely on the basis of the next input symbol and the current nonterminal being processed.
• Top-down parsing starts with the start symbol and applies productions until it arrives at the desired string.
Predictive parsers
A predictive parser uses the next input symbol,
as a look-ahead to determine the production
rule for expanding the current nonterminal.
LL(1) Grammars
• A grammar is LL(1) if it can be parsed by considering only the current nonterminal and the next token of the input stream as look-ahead.
• Example: the following grammar is LL(1):
S ::= a S B | d B
B ::= b B | a
• Parsing table:
N.T. | a     | b   | d
S    | a S B |     | d B
B    | a     | b B |
Example
• Input string: adbaa
• Parsing: start from S and, at each step, use the parsing table to select the production for the current nonterminal and the look-ahead symbol.
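The table-driven parse of adbaa can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the names TABLE, NONTERMINALS and ll1_parse are assumptions introduced here, and the table entries are copied from the parsing table of the grammar S ::= a S B | d B, B ::= b B | a.

```python
# Parsing table for  S ::= a S B | d B ;  B ::= b B | a
# (hypothetical encoding: (nonterminal, look-ahead) -> right-hand side)
TABLE = {
    ("S", "a"): ["a", "S", "B"],
    ("S", "d"): ["d", "B"],
    ("B", "a"): ["a"],
    ("B", "b"): ["b", "B"],
}
NONTERMINALS = {"S", "B"}


def ll1_parse(tokens, start="S"):
    """Return the list of productions used, or raise SyntaxError."""
    stack = ["$", start]              # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]     # append the end-of-input marker
    pos, used = 0, []
    while stack:
        top, look = stack.pop(), tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {look})")
            used.append(f"{top} -> {' '.join(rhs)}")
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
        elif top == look:
            pos += 1                      # terminal (or $) matches: consume it
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return used


for production in ll1_parse("adbaa"):
    print(production)
```

Each step looks only at the symbol on top of the stack and one input token, which is exactly the LL(1) property.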
LL(1) Grammars: definition
• A grammar G is LL(1) if and only if, for every production
A → β1 | β2 | … | βn in G:
First(βi) ∩ First(βj) = ∅ for all i, j ∈ 1..n with i ≠ j
(if some βi can derive ε, then First(βj) ∩ Follow(A) = ∅ must also hold for every j ≠ i)
where
A → a β ⇒ a ∈ First(A), a ∈ Terminal Symbols & β is any string
A → B β ⇒ First(A) ⊇ First(B), B ∈ Nonterminal Symbols & …
A → β1 | β2 | … | βn ⇒ First(A) = First(β1) ∪ First(β2) ∪ … ∪ First(βn)
First Set: Definition
• Suppose A is a nonterminal. First(A) consists of the terminals that can begin a string derived from A.
if A ⇒* aβ, then a ∈ First(A)
if A ⇒* ε (A is nullable), then ε ∈ First(A)
E.g.
A → aB | CD
B → Bb | ε
C → c | ε
D → d
First(A) = First(aB) + First(CD)
= {a} + First(C) + First(D)     (First(D) is included because C is nullable)
= {a, c, d}
First(B) = {b, ε}
Another grammar
• Paziraee ::= PishGhaza Ghaza Deser Kotak
• PishGhaza ::= sup β1 | Ash β2 | panir β3 | salad β4 | ε
• Ghaza ::= Ash β5 | Abgusht | Pitza | Kabab | ε
• Deser ::= Bastani | SholehZard | Miveh | ε
• Kotak ::= Chomagh | Shamsir
o First(PishGhaza) = {sup, Ash, panir, salad, ε}; because PishGhaza is nullable, the look-ahead symbols that select it are {sup, Ash, panir, salad} + Follow(PishGhaza)
First Set: Properties
1. If X is a terminal or ε, then First(X) = {X}
2. Suppose X is a nonterminal and X → Y1 Y2 ... Yk
- if for some i, Y1...Yi-1 ⇒* ε, then First(X) ⊇ First(Yi) – {ε}
- if Y1...Yk ⇒* ε, then ε ∈ First(X)
Why exclude ε? Yi deriving ε does not make X nullable; X is nullable only when every one of Y1...Yk is.
E.g.
A → aB | CD
B → Bb | ε
C → c | ε
D → d
First(A) = {a, c, d}
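The two properties can be applied repeatedly until no First set grows any more. The sketch below is one possible Python rendering, under the assumption that a grammar is stored as a dictionary from nonterminal to a list of right-hand sides (an empty list stands for ε); the names first_sets, GRAMMAR and EPS are introduced here for illustration only.

```python
EPS = "ε"


def first_sets(grammar):
    """grammar: {nonterminal: [right-hand sides]}; each RHS is a list of symbols, [] is ε."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                        # iterate until a fixed point is reached
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[nt] |= sym_first - {EPS}     # property 2, first bullet
                    if EPS not in sym_first:
                        all_nullable = False
                        break
                if all_nullable:          # every Yi can vanish (or the RHS is empty)
                    first[nt].add(EPS)    # property 2, second bullet
                if len(first[nt]) != before:
                    changed = True
    return first


# The example grammar of the slide: A → aB | CD, B → Bb | ε, C → c | ε, D → d
GRAMMAR = {
    "A": [["a", "B"], ["C", "D"]],
    "B": [["B", "b"], []],
    "C": [["c"], []],
    "D": [["d"]],
}
FIRST = first_sets(GRAMMAR)
print(FIRST["A"], FIRST["B"])
```

The fixed-point loop also copes with the left-recursive rule B → Bb, because a nonterminal whose First set is still empty simply contributes nothing in that round.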
Example 1
G: A → a A | B b | C d a
B → e a
C → b C | d
First(B) = First(ea) = {e}
First(C) = First(bC) + First(d) = {b, d}
First(A) = {a} + First(B) + First(C) = {a, b, d, e}
G is LL(1) because:
A → a A | B b | C d a ⇒
First(aA) ∩ First(Bb) = {a} ∩ {e} = ∅
First(aA) ∩ First(Cda) = {a} ∩ {b, d} = ∅
First(Bb) ∩ First(Cda) = {e} ∩ {b, d} = ∅
C → b C | d ⇒
First(bC) ∩ First(d) = {b} ∩ {d} = ∅
Example 2
G: A → a A | a B d | C d a
C → d C | a
B → a e
G is not LL(1) because:
A → a A | a B d | C d a ⇒
1. First(aA) ∩ First(aBd) = {a} ≠ ∅
2. also First(Cda) = {d, a}, so First(Cda) ∩ {a} = {a} ≠ ∅
⇒ if (look-ahead == a ∧ current symbol == A)
then it is not possible to determine which production to choose
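The pairwise test used in Examples 1 and 2 is mechanical once the First set of each alternative is known. A small sketch follows; the function name first_conflicts and the dictionary layout are assumptions made for illustration, and the First sets are the ones worked out on the slides.

```python
from itertools import combinations


def first_conflicts(alt_first):
    """alt_first maps each alternative of one nonterminal to its First set.
    Returns every pair of alternatives whose First sets intersect."""
    return [(p, q, alt_first[p] & alt_first[q])
            for p, q in combinations(alt_first, 2)
            if alt_first[p] & alt_first[q]]


# Example 1: A → a A | B b | C d a
example1 = {"a A": {"a"}, "B b": {"e"}, "C d a": {"b", "d"}}
# Example 2: A → a A | a B d | C d a
example2 = {"a A": {"a"}, "a B d": {"a"}, "C d a": {"d", "a"}}

print(first_conflicts(example1))   # no conflicting pair
print(first_conflicts(example2))   # every pair shares the terminal a
```

An empty result only establishes the First-set condition; alternatives that can derive ε still need the Follow-set check.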
i. Left Factoring
G: A → a e A | a B d | C d a
C → d C | a
B → a e
1. Replace the leading C with its alternatives:
A → a e A | a B d | d C d a | a d a
C → d C | a
B → a e
2. Factor out the common prefix a:
A → a A' | d C d a
A' → e A | B d | d a
C → d C | a
B → a e
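Step 2 of the slide (factoring out a common first symbol) can be sketched as below. This is a simplified illustration: it factors on the first symbol only, not on the longest common prefix, and the function name left_factor and the A', A'' naming scheme are assumptions introduced here.

```python
def left_factor(nt, alternatives):
    """One factoring step on the first symbol. Returns {nonterminal: alternatives}."""
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {nt: []}
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nt].extend(group)              # nothing to factor
        else:
            new_nt = nt + "'" * len(result)       # A', A'', ... (hypothetical naming)
            result[nt].append([head, new_nt])
            result[new_nt] = [rhs[1:] for rhs in group]   # [] stands for ε
    return result


# A → a e A | a B d | d C d a | a d a   (after C has been substituted)
factored = left_factor("A", [["a", "e", "A"], ["a", "B", "d"],
                             ["d", "C", "d", "a"], ["a", "d", "a"]])
for lhs, alts in factored.items():
    print(lhs, "→", " | ".join(" ".join(alt) for alt in alts))
```

For the slide's grammar this yields A → a A' | d C d a and A' → e A | B d | d a, matching the hand derivation.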
Nullable
We say that a nonterminal X is nullable if the empty sequence can be derived from it:
X ⇒* ε then Nullable(X) = true
To be LL(1), for any production:
Y → α β γ ∧ β ⇒* ε
First(β) ∩ First(γ) should be empty.
E.g.
A → B b e d
B → b d | ε
First(B) ∩ Follow(B) = {b} ≠ ∅, so this grammar is not LL(1).
LL(1) grammars
A grammar is LL(1) if and only if:
• No two distinct productions with the same LHS can generate the same first terminal symbol (e.g. A → a α | a β is not LL(1)).
• No nullable symbol A has the same terminal symbol a in both its First and Follow sets.
• There is only one way to derive ε from a nullable symbol.
LL(1) grammars
Follow:
• A → αBβ ⇒ Follow(B) ⊇ First(β) – {ε}
• A → αBβC and Nullable(β) = true ⇒ Follow(B) = Follow(B) ∪ First(β) ∪ First(C)
• A → αBβ and Nullable(β) = true ⇒ Follow(B) = Follow(B) ∪ First(β) ∪ Follow(A)
(ε itself is never a member of a Follow set.)
Predict(A, a) = {A → β : a ∈ First(β)} ∪ {A → β : β is nullable and a ∈ Follow(A)}
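The Follow rules can be run as a fixed-point computation together with the First sets. The sketch below assumes the same dictionary encoding of a grammar as before (an empty list stands for ε); first_of, first_and_follow and the constants are illustrative names, and the sample grammar is S → aBDh, B → cC, C → bC | d, D → EF, E → g | ε, F → f | ε.

```python
EPS, END = "ε", "$"


def first_of(seq, first, grammar):
    """First set of a string of symbols, given the First sets of the nonterminals."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in grammar else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}          # every symbol was nullable (or seq is empty)


def first_and_follow(grammar, start):
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                      # $ always follows the start symbol
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                size = len(first[a])
                first[a] |= first_of(rhs, first, grammar)
                changed = changed or len(first[a]) != size
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue                # terminals have no Follow set
                    size = len(follow[b])
                    rest = first_of(rhs[i + 1:], first, grammar)
                    follow[b] |= rest - {EPS}   # A → αBβ: First(β) without ε
                    if EPS in rest:             # β nullable: add Follow(A)
                        follow[b] |= follow[a]
                    changed = changed or len(follow[b]) != size
    return first, follow


GRAMMAR = {
    "S": [["a", "B", "D", "h"]],
    "B": [["c", "C"]],
    "C": [["b", "C"], ["d"]],
    "D": [["E", "F"]],
    "E": [["g"], []],
    "F": [["f"], []],
}
FIRST, FOLLOW = first_and_follow(GRAMMAR, "S")
print(FOLLOW)
```

Because both kinds of sets only ever grow, the loop stops as soon as one full pass adds nothing.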
LL(1)
• G: S → a b | a d
Is G LL(1)? First(ab) ∩ First(ad) = {a} ≠ ∅
⇒ G is not LL(1) ⇒ left factoring should be applied
⇒ G: S → a A'
A' → b | d
• G: S → a B d
B → d e | ε
Is G LL(1)? First(B) ∩ Follow(B) = {d} ≠ ∅
⇒ G is not LL(1) ⇒ ε-productions should be removed
⇒ G: S → a d e d | a d
⇒ S → a d S'
S' → e d | ε
Example - 1
• Example: determine the Follow sets where required
S → a B C d | C A
B → b B | ε
C → a A e | ε
A → e
First(S) = {a} + First(C) + First(A) = {a, e}     (C is nullable, so First(A) is included)
First(B) = {b, ε} ⇒ look-ahead for B = {b} + Follow(B) = {b} + First(C d) = {b, a, d}
First(C) = {a, ε} ⇒ look-ahead for C = {a} + Follow(C) = {a} + {d} + First(A) = {a, d, e}
Example - 3
Production Rules
S -> aBDh
B -> cC
C -> bC | d
D -> EF
E -> g | λ
F -> f | λ
Follow sets
Follow(S) = {$}
Follow(B) = First(D h) = {g, f, h}
Follow(C) = Follow(B) = {g, f, h}
Follow(D) = {h}
Follow(F) = Follow(D) = {h}
Follow(E) = First(F) + Follow(D) = {f, h}
First sets (D, E and F are nullable, so their look-ahead sets add the Follow sets)
First(F) = {f, λ} ⇒ look-ahead for F = {f} + Follow(F) = {f, h}
First(E) = {g, λ} ⇒ look-ahead for E = {g} + Follow(E) = {g, f, h}
First(D) = {g, f, λ} ⇒ look-ahead for D = {g, f} + Follow(D) = {g, f, h}
Example - 4
Production Rules:
S -> aBDh
B -> cC
C -> bC | d
D -> EF | h
E -> g | λ
F -> f | λ
Follow(S) = {$}
Follow(B) = First(D h) = {g, f, h}
Follow(C) = Follow(B) = {g, f, h}
Follow(D) = {h}
Follow(F) = Follow(D) = {h}
Follow(E) = First(F) + Follow(D) = {f, h}
look-ahead for F = {f} + Follow(F) = {f, h}
look-ahead for E = {g} + Follow(E) = {g, f, h}
look-ahead for EF = {g, f} + Follow(D) = {g, f, h}
Is this grammar LL(1)?
C -> bC | d ⇒ First(bC) ∩ First(d) = ∅
D -> EF | h ⇒ look-ahead(EF) ∩ First(h) = {h} ≠ ∅, so the grammar is not LL(1).
⇒ D -> g F | F | h ⇒ D -> g f | g | f | λ | h
First(D) ∩ Follow(D) = { h }, so the λ alternative is removed:
⇒ D -> g f | g | f | h
⇒ S -> a B D h | a B h
⇒ S -> a B S'
⇒ S' -> D h | h
⇒ D -> g f | g | f | h
⇒ S' -> g f h | g h | f h | h h | h
The alternatives of S' still share the prefixes g and h, so left factoring remains to be applied.
Students’ question and my answers
Follow sets
• What?
The Follow set of a nonterminal A is the set of terminals that can come immediately after A, i.e. the First of the symbols that follow A.
• Why?
For a grammar G to be LL(1), as described before:
∀ B ∈ G: Nullable(B) ⇒ First(B) ∩ Follow(B) = ∅
• How?
A → α B β ⇒ Follow(B) = First(β)
if Nullable(β) ⇒ Follow(B) = First(β) + Follow(A)
• Examples? Let me finish with the description and then …
Follow Sets
• Notice:
• Input to a compiler is a source file;
• A source file, like any other file, ends with an end-of-file marker;
• The end-of-file marker is represented by $;
• Since a source program is supposed to be an instance of the start symbol, $ is always considered a member of the Follow set of the start symbol.
Follow Set Example 1
• Notice: always add the end-of-file marker, $, to Follow(start symbol).
Grammar, as used by the derivation: S → A S b | C and C → c C | ε, with First(A) = {a}.
S → A S b ⇒ Follow(S) ⊇ {b}
S is the start symbol ⇒ $ ∈ Follow(S)
⇒ Follow(S) = {b, $}
S → C ⇒ Follow(C) ⊇ Follow(S) ⇒ Follow(C) = {b, $}
C → c C | ε ⇒ look-ahead for C = {c} + Follow(C) = {c, b, $}
S → A S b ⇒ First(S) ⊇ First(A) = {a}
S → C ⇒ First(S) ⊇ First(C) = {c, ε}
S → A S b ⇒ Follow(A) ⊇ First(S b) = {a, c, b}     (S is nullable, so b is included)
Follow(A) = {a, c, b}
Follow(C) = {b, $}
Follow(S) = {b, $}
Follow Set Example 2
G: S → L B
L → id : | ε
B → i c t S E
E → e S | ε
E → e S | ε ⇒ look-ahead for E = {e} + Follow(E) = {e, $}
B → i c t S E ⇒ Follow(E) ⊇ Follow(B)
S → L B ⇒ Follow(B) ⊇ Follow(S)
B → i c t S E ⇒ Follow(S) = {$} + First(E) + Follow(B) = {e, $}
Follow(B) = Follow(S) = {e, $}
Follow(E) = Follow(B) = {e, $}
B → i c t S E ⇒ First(B) = {i}
S → L B ⇒ Follow(L) ⊇ First(B) = {i}
Example
Transform the following grammar into LL(1)
G: LabeledSt → Label Statement
Label → id : | ε
Statement → AssignmentSt | IfSt | WhileSt | CallSt
AssignmentSt → id := Expression
WhileSt → while Expression do Statement
IfSt → if Expression do Statement
CallSt → id ( Params )
1. The grammar is not LL(1) because:
Nullable(Label) ∧ First(Label) ∩ Follow(Label) = {id} ≠ ∅ ⇒ ¬LL(1)
First(Label) = {id, ε} ⇒ look-ahead for Label = {id} + Follow(Label) = {id} + First(Statement)
ii. Null Production Removal
LabeledSt → Label Statement
Label → id : | ε
1. Replace Label with its expansion
⇒ LabeledSt → id : Statement | Statement
2. First(id : Statement) ∩ First(Statement) = {id} ≠ ∅
⇒ Left factor into LL(1):
Statement → AssignmentSt | CallSt | IfSt | WhileSt
Statement → id := Expression | id ( Params ) | IfSt | WhileSt
Statement → id CallAssign | IfSt | WhileSt
CallAssign → := Expression | ( Params )
LabeledSt → id : Statement | id CallAssign | IfSt | WhileSt
LabeledSt → id LaCallAs | IfSt | WhileSt
LaCallAs → : Statement | CallAssign
iii. Left Recursion Elimination
A grammar is not LL(1) if:
• it includes a left-recursive production:
X → X 𝛼 | 𝛽
because: First(X 𝛼) = First(X) ⊇ First(𝛽) ⇒ First(X 𝛼) ∩ First(𝛽) ≠ ∅
• Left recursion is eliminated by converting the grammar into an equivalent right-recursive grammar.
X → X 𝛼 | 𝛽 is converted to
1. BNF: X → 𝛽 X'
X' → 𝛼 X' | ε
2. EBNF: X → 𝛽 {𝛼}
The forms are equivalent because each derives 𝛽 followed by zero or more 𝛼: 𝛽, 𝛽𝛼, 𝛽𝛼𝛼, …
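The BNF form of the transformation is easy to mechanize for immediate left recursion. The following Python sketch uses a list-of-lists encoding of the alternatives; eliminate_left_recursion is a name chosen here for illustration, and indirect left recursion is deliberately not handled.

```python
def eliminate_left_recursion(nt, alternatives):
    """X → X α | β   becomes   X → β X' ;  X' → α X' | ε   (immediate recursion only)."""
    recursive = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]   # the α parts
    others = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]       # the β parts
    if not recursive:
        return {nt: alternatives}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],   # [] stands for ε
    }


# E → E + T | E - T | T
converted = eliminate_left_recursion("E", [["E", "+", "T"], ["E", "-", "T"], ["T"]])
for lhs, alts in converted.items():
    print(lhs, "→", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

Applied to E → E + T | E - T | T this produces E → T E' and E' → + T E' | - T E' | ε.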
Example
Consider the arithmetic expressions grammar
E → E + T | E - T | T
T → T * F | T / F | F
F → Id | No | ( E )
1. Left factoring (BNF)
E → E + T | E - T | T ⇒ E → E E" | T
E" → + T | - T
T → T * F | T / F | F ⇒ T → T T" | F
T" → * F | / F
1. Left factoring (EBNF)
E → E + T | E - T | T ⇒ E → E (+ T | - T) | T
T → T * F | T / F | F ⇒ T → T (* F | / F) | F
2. Left Recursion Elimination (BNF)
E → E E" | T ⇒ E → T E'
E' → E" E' | ε
E' → + T E' | - T E' | ε
T → T T" | F ⇒ T → F T'
T' → * F T' | / F T' | ε
2. Left Recursion Elimination (EBNF)
E → E (+ T | - T) | T ⇒ E → T {+ T | - T}
T → T (* F | / F) | F ⇒ T → F {* F | / F}
- Equivalent G. (BNF)
E → T E'
E' → + T E' | - T E' | ε
T → F T'
T' → * F T' | / F T' | ε
F → Id | No | ( E )
- Equivalent G. (EBNF)
E → T {+ T | - T}
T → F {* F | / F}
F → Id | No | ( E )
How to transform to LL(1)
To ensure that a grammar is LL(1), we must do the following:
1. Eliminate any common left prefixes.
2. Eliminate any left recursion.
3. Eliminate nullable productions, if they cause a problem.
1. Left factoring:
A → αβ1 | αβ2 | 𝜹
is replaced with:
A → αA′ | 𝜹
A′ → β1 | β2
Or in extended BNF:
A → α (β1 | β2) | 𝜹
2. Left Recursion Elimination
X → X𝛼 | 𝛽
is converted to
X → 𝛽 X'
X' → 𝛼 X' | ε
Or in extended BNF:
X → 𝛽 {𝛼}
3. No nullable symbol A may have the same terminal symbol a in both its First and Follow sets.
Parse tables
• The key problem during predictive parsing is that of determining the production to be applied for a nonterminal.
• This is done by using a parsing table.
• A parsing table is a two-dimensional array M[A, a] where A is a nonterminal and a is a terminal or the symbol $, meaning "end of input string".
• The other inputs of a predictive parser are:
◦ The input buffer, which contains the string to be parsed followed by $.
◦ The stack, which holds grammar symbols; initially it contains $S (the end-of-input marker with the start symbol on top).
Example 1
• The purpose of the parsing table is to determine which production rule to use next.
• Consider the following grammar:
G1:
S → d A B | B a B
A → d A | B a
B → b B | ε
1. Transform the grammar into LL(1) form,
2. Use First and Follow sets to construct the parsing table,
3. Use the parsing table to parse given input strings.
Example 1-Continued
1. Check that G1 is in LL(1) form
- B → b B | ε
- First(B) = {b, ε} ⇒ First(B) ∩ Follow(B) should be empty.
- A → B a ⇒ a ∈ Follow(B)
- S → d A B ⇒ Follow(B) ⊇ Follow(S)
- It is always assumed that $ ∈ Follow(start symbol)
- ⇒ $ ∈ Follow(S) ⇒ $ ∈ Follow(B)
- ⇒ Follow(B) = {a} + {$} = {a, $}
- ⇒ First(B) ∩ Follow(B) = {b} ∩ {a, $} = ∅
- First(B) = {b, ε}, Follow(B) = {a, $} ⇒ look-ahead for B = {b, a, $}
- A → d A | B a
- First(dA) ∩ First(Ba) = {d} ∩ {b, a} = ∅
- First(A) = First(dA) ∪ First(Ba) = {d, b, a}
- S → d A B | B a B
- First(dAB) ∩ First(BaB) = {d} ∩ {b, a} = ∅
- First(S) = First(dAB) ∪ First(BaB) = {d} ∪ {b, a} = {d, b, a}
2. Use First and Follow sets to work out the parsing table
- First(S) = {d, b, a}
- First(A) = {d, b, a}
- First(B) = {b, ε}
- Follow(B) = {a, $}
     | d     | a     | b     | $
S    | d A B | B a B | B a B |
A    | d A   | B a   | B a   |
B    |       | ε     | b B   | ε
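Filling the table follows the Predict rule: a production A → β goes into M[A, a] for every a in First(β), and also for every a in Follow(A) when β can derive ε. The sketch below applies this to G1; the First and Follow sets are entered by hand from the example, and build_table, first_of and the dictionary names are assumptions made for illustration.

```python
EPS = "ε"

# G1: S → d A B | B a B ;  A → d A | B a ;  B → b B | ε   ([] stands for ε)
GRAMMAR = {
    "S": [["d", "A", "B"], ["B", "a", "B"]],
    "A": [["d", "A"], ["B", "a"]],
    "B": [["b", "B"], []],
}
FIRST = {"S": {"d", "a", "b"}, "A": {"d", "a", "b"}, "B": {"b", EPS}}
FOLLOW = {"B": {"a", "$"}}      # only the nullable nonterminal needs its Follow set here


def first_of(seq):
    """First set of a string of grammar symbols."""
    out = set()
    for sym in seq:
        f = FIRST.get(sym, {sym})       # a terminal is its own First set
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}


def build_table(grammar):
    table = {}
    for a, alternatives in grammar.items():
        for rhs in alternatives:
            f = first_of(rhs)
            lookaheads = (f - {EPS}) | (FOLLOW[a] if EPS in f else set())
            for t in lookaheads:
                if (a, t) in table:
                    raise ValueError(f"conflict at M[{a}, {t}]: grammar is not LL(1)")
                table[(a, t)] = rhs
    return table


TABLE = build_table(GRAMMAR)
for (nt, t), rhs in sorted(TABLE.items()):
    print(f"M[{nt}, {t}] = {' '.join(rhs) or EPS}")
```

A ValueError from build_table signals two productions competing for one cell, which is the table-level view of a grammar not being LL(1).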
Example 2
Build the parsing table for this grammar:
G2:
S → ( L ) | a
L → L S | S
1- Eliminate left recursion
G2:
S → ( L ) | a
L → S L'
L' → S L' | λ
Example 2-2
2- Define the First sets and, if required, the Follow sets for the nonterminals.
• First(L') = First(S) + {λ} = { (, a, λ }
• First(L) = First(S) = { (, a }
• Follow(L) = { ) }
• Follow(L') = Follow(L) = { ) }
Parsing table:
     | a    | (     | ) | $
S    | a    | ( L ) | - | -
L    | S L' | S L'  | - | -
L'   | S L' | S L'  | λ | -
Parse of the input (a(aa)):
Parse stack   | Input     | Rule
$ S           | (a(aa))$  | S → (L)
$ ) L (       | (a(aa))$  | Delete
$ ) L         | a(aa))$   | L → S L'
$ ) L' S      | a(aa))$   | S → a
$ ) L' a      | a(aa))$   | Delete
$ ) L'        | (aa))$    | L' → S L'
$ ) L' S      | (aa))$    | S → (L)
$ ) L' ) L (  | (aa))$    | Delete
$ ) L' ) L    | aa))$     | L → S L'
$ ) L' ) L' S | aa))$     | S → a
$ ) L' ) L' a | aa))$     | Delete
$ ) L' ) L'   | a))$      | L' → S L'
$ ) L' ) L' S | a))$      | S → a
$ ) L' ) L' a | a))$      | Delete
$ ) L' ) L'   | ))$       | L' → λ
$ ) L' )      | ))$       | Delete
$ ) L'        | )$        | L' → λ
$ )           | )$        | Delete
$             | $         | Accept
• The third homework : Insert your slides from this slide on
Exercise -1
1. Convert this grammar to LL(1)
G1:
S ::= ε | A S
A ::= id := id
A ::= if id then A
A ::= if id then A' else A
A' ::= id := id
A' ::= if id then A' else A'
Exercise -2
Solution
1. A ::= ABd | Aa | a
B ::= Be | b
i. Left factoring
A ::= A A" | a
A" ::= Bd | a
ii. Eliminate left recursion
A ::= A A" | a => A ::= a A'
A' ::= A" A' | ε
=> A' ::= Bd A' | a A' | ε
B ::= Be | b => B ::= b B'
B' ::= e B' | ε
2. A ::= A b | A c | a | b
i. Left factoring
=> A ::= A A" | a | b
A" ::= b | c
ii. Eliminate left recursion
=> A ::= a A' | b A'
A' ::= b A' | c A' | ε
3. A ::= A B | A c | a | aa
i. Left factoring
=> A ::= A A" | a D
A" ::= B | c
D ::= a | ε
ii. Eliminate left recursion
=> A ::= a D A'
A' ::= B A' | c A' | ε
Exercise -3
Consider the grammar G12
a) Point out all aspects of grammar G12 which are not LL(1).
b) Write a new grammar which accepts the same language, but avoids left recursion and common left prefixes.
c) Write the FIRST and FOLLOW sets for the new grammar.
d) Write out the LL(1) parse table for the new grammar.
e) Is the new grammar an LL(1) grammar? Explain your answer carefully.
Exercise -4
Consider the assignment statements grammar
A → id := E
E → E + T | E - T | T
T → T * F | T / F | F
F → Id | No | ( E )
• Convert the grammar to LL(1).
• Construct the parsing table for the grammar.
• Use the table to parse the statement: a := (b/c*3 – e*f)/2
Recursive descent parsers
• A recursive-descent parser is structured as a set of mutually recursive procedures, one for each nonterminal in the grammar.
• The procedure corresponding to nonterminal A recognizes an instance of A in the input stream.
• To recognize a nonterminal B on some right-hand side for A, the parser invokes the procedure corresponding to B.
• Thus, the grammar itself serves as a guide to the parser's implementation.
Recursive descent parsers
• To test for the presence of a nonterminal, say A, the code invokes a procedure named A.
• Suppose: A → a B D
public class Parser
{ private Symbols currentSymbol;
  Parser() { currentSymbol = nextSymbol(); A(); }
  public void A()
  { /* A → a B D */ Expect(Symbols.S_a); B(); D(); }
  public void Expect(Symbols expectedSymbol)
  { if (currentSymbol == expectedSymbol) currentSymbol = nextSymbol();
    else syntaxError(); }
  // B(), D(), nextSymbol() and syntaxError() are not shown
}
Recursive descent parsers
• For instance:
G: S → if E then S | if E then S else S | begin S L | print E
L → end | ; S L
E → i
• A recursive descent parser provides a procedure / method for each nonterminal A, with the same name as the nonterminal.
• There are three nonterminals, S, L and E, in the grammar.
• Three methods S(), L() and E() should be written.
• A lexical analyzer method nextSymbol() is invoked to get the next token from the input file.
• nextSymbol() copies the symbol into a global variable called currentSymbol.
• It is assumed that the next symbol is always available in currentSymbol before it is analyzed.
Recursive descent parsers
// S → if E then S | if E then S else S | begin S L | print E
public void S()
{ if (currentSymbol == "if")
  { nextSymbol(); E(); Expect("then"); S();
    // the optional else part is the left-factored remainder of the two if alternatives
    if (currentSymbol == "else") { nextSymbol(); S(); }
  }
  else if (currentSymbol == "begin") { nextSymbol(); S(); L(); }
  else if (currentSymbol == "print") { nextSymbol(); E(); }
  else { throw new IllegalTokenException("Procedure S() expected an 'if', 'begin' or 'print' token but received: "
         + currentSymbol); }
}
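A complete recognizer for this grammar, with all three procedures S, L and E, might look as follows in Python. It is a sketch under stated assumptions: tokens arrive as a list of strings, the class name Parser and its helper methods are introduced here, and the two if alternatives are left factored into one branch with an optional else.

```python
class Parser:
    """Recognizer for:
       S → if E then S [else S] | begin S L | print E
       L → end | ; S L
       E → i"""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]    # $ marks the end of the input
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def next_symbol(self):
        self.pos += 1

    def expect(self, token):
        if self.current != token:
            raise SyntaxError(f"expected {token!r}, found {self.current!r}")
        self.next_symbol()

    def S(self):
        if self.current == "if":
            self.next_symbol(); self.E(); self.expect("then"); self.S()
            if self.current == "else":        # optional else part
                self.next_symbol(); self.S()
        elif self.current == "begin":
            self.next_symbol(); self.S(); self.L()
        elif self.current == "print":
            self.next_symbol(); self.E()
        else:
            raise SyntaxError(f"S expected if, begin or print, found {self.current!r}")

    def L(self):
        if self.current == "end":
            self.next_symbol()
        else:
            self.expect(";"); self.S(); self.L()

    def E(self):
        self.expect("i")

    def parse(self):
        self.S()
        self.expect("$")                      # the whole input must be consumed
        return True


program = "begin print i ; if i then print i else print i end".split()
print(Parser(program).parse())
```

Each method mirrors one grammar rule, so the call stack at any moment corresponds to a path in the parse tree.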
Recursive descent parsers
• A recursive-descent parser is structured as a set of mutually recursive procedures, one for each nonterminal in the grammar.
• For instance, consider the arithmetic expressions grammar
G: E → E + T | E - T | T
T → T * F | T / F | F
F → Id | No | ( E )
1. Transform G into LL(1):
G: E → T E'
E' → + T E' | - T E' | ε
T → F T'
T' → * F T' | / F T' | ε
F → Id | No | ( E )
- Equivalent G. (EBNF)
G: E → T {+ T | - T}
T → F {* F | / F}
F → Id | No | ( E )
Recursive descent parsers
• The procedure corresponding to nonterminal A recognizes an instance of A in the input stream.
// E → T E'
public void E()
{ T(); EPrime(); }
• To recognize a nonterminal B on some right-hand side for A, the parser invokes the procedure corresponding to B.
// E' → + T E' | - T E' | ε
public void EPrime()
{ if (currentSymbol == s_Add)
  { /* E' → + T E' */ nextSymbol(); T(); EPrime(); }
  else if (currentSymbol == s_Sub)
  { /* E' → - T E' */ nextSymbol(); T(); EPrime(); }
  // otherwise E' → ε: nothing is consumed
}
From EBNF to BNF
• For building parsers (especially bottom-up), a BNF grammar is often better than EBNF. But it's easy to convert an EBNF grammar to BNF:
• Convert every repetition { E } to a fresh nonterminal X and add X ::= ε | E X.
• Convert every option [ E ] to a fresh nonterminal X and add X ::= ε | E.
• Convert every group ( E ) to a fresh nonterminal X and add X ::= E.
• We can even do away with alternatives by having several productions with the same nonterminal:
X ::= E | E'. becomes X ::= E. X ::= E'.
For a recursive descent parser it is easier to use extended BNF.
G: E → T {+ T | - T}
T → F {* F | / F}
F → Id | No | ( E )
public class Parser
{ private Symbols currentSymbol;
  Parser() { // Gets the next symbol, as currentSymbol, before calling E
    nextSymbol(); E(); }
  // E → T {+ T | - T}
  public void E( )
  { T();
    while (currentSymbol == Symbols.S_Add || currentSymbol == Symbols.S_Sub)
    { nextSymbol(); T(); }
  }
  // T → F {* F | / F}
  public void T( )
  { F();
    while (currentSymbol == Symbols.S_Mul || currentSymbol == Symbols.S_Div)
    { nextSymbol(); F(); }
  }
  // F → Id | No | ( E )
  public void F( )
  { if (currentSymbol == Symbols.S_Id || currentSymbol == Symbols.S_No) nextSymbol();
    else { /* F → ( E ) */
      Expect(Symbols.S_openPar); E(); Expect(Symbols.S_closePar); }
  }
  public void Expect(Symbols expectedSymbol)
  { if (currentSymbol == expectedSymbol) nextSymbol(); else syntaxError(); }
  public void nextSymbol( ) { … }   // reads the next token of the input file into currentSymbol
} // End of Parser class.
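The same EBNF-driven structure can be tried out in a few lines of Python. The sketch below is an illustration rather than the lecture's parser: the tokenizer, the class name ExprParser and the choice to return a nested tuple as a parse tree are all assumptions made here.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    """Split text into (kind, value) pairs; kinds are No, Id, or the operator itself."""
    tokens = []
    for number, ident, op in TOKEN.findall(text):
        if number:
            tokens.append(("No", number))
        elif ident:
            tokens.append(("Id", ident))
        elif not op.isspace():
            tokens.append((op, op))
    return tokens + [("$", "$")]          # end-of-input marker


class ExprParser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def kind(self):
        return self.tokens[self.pos][0]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, kind):
        if self.kind != kind:
            raise SyntaxError(f"expected {kind!r}, found {self.kind!r}")
        return self.advance()

    def E(self):                          # E → T {+ T | - T}
        node = self.T()
        while self.kind in ("+", "-"):
            op = self.advance()[0]
            node = (op, node, self.T())   # the loop makes the operators left-associative
        return node

    def T(self):                          # T → F {* F | / F}
        node = self.F()
        while self.kind in ("*", "/"):
            op = self.advance()[0]
            node = (op, node, self.F())
        return node

    def F(self):                          # F → Id | No | ( E )
        if self.kind in ("Id", "No"):
            return self.advance()[1]
        self.expect("(")
        node = self.E()
        self.expect(")")
        return node

    def parse(self):
        tree = self.E()
        self.expect("$")
        return tree


print(ExprParser("a + 2 * (b - 3)").parse())
```

Each EBNF repetition { … } becomes a while loop, which is why no E' or T' procedures are needed.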
A Mini Pascal Compiler
• Consider the mini-pascal grammar:
ProgramX → Program id ; BlockBody .
BlockBody → [ ConstantDefPart ] [ TypeDefPart ] [ VarDefPart ]
{ FunctionDef | ProcedureDef } CompoundStatement
ConstantDefPart → Const ConstantDef { ConstantDef }
ConstantDef → id = ( No | id ) ;
TypeDefPart → Type TypeDef { TypeDef }
TypeDef → id = ( integer | real | character )
VarDefPart → Var VarDef { VarDef }
VarDef → id : ( integer | real | character )
Mini Pascal R.D. parser
Begin
init(); // Initializes the Mini-Pascal parser
NextSymbol(); // Get a lookahead
ProgramX(); // Call the start-symbol function
End.
public class Parser
{ public Symbols currentSymbol;
  Parser(String SourceFile)
  { init(SourceFile); // Open Source and …
    NextSymbol(); // currentSymbol = next symbol
    ProgramX(); // Call the start symbol
  }
  …
}
Recursive descent parsers start by calling the procedure of the grammar's start symbol.
/* ProgramX → Program id ; BlockBody . */
public void ProgramX( )
{
Expect( S_Program ); // Expect the "program" keyword
Expect( S_id ); // Expect an identifier
Expect( S_Semi ); // Expect a semicolon
blockBody( ); // Invoke blockBody()
Expect( S_Dot ); // Expect a dot
}
Mini Pascal R.D. parser - 3
/* blockBody → [ constantDefPart ] [ typeDefPart ] [ varDefPart ]
{ functionDef | procedureDef } compoundStatement */
public void blockBody( )
{ if (currentSymbol == S_Const) constantDefPart();
  if (currentSymbol == S_Type) typeDefPart();
  if (currentSymbol == S_Var) varDefPart();
  while (currentSymbol == S_Procedure || currentSymbol == S_Function)
  { if (currentSymbol == S_Procedure) procedureDef();
    else functionDef();
  }
  compoundStatement( );
}
Mini Pascal R.D. parser - 4
/* constantDefPart → Const constantDef {constantDef} */
public void constantDefPart( )
{ Expect( S_Const );
  constantDef();
  // while currentSymbol is in First(constantDef)
  while (currentSymbol == S_Id) constantDef();
}
/* constantDef → id = ( No | id ) ; */
public void constantDef( )
{ Expect( S_Id ); Expect( S_Equal );
  if (currentSymbol == S_No) nextSymbol();
  else Expect( S_Id );
  Expect( S_Semicolon ); }
Error Recovery
Error recovery is the process of acting on an error in order to reduce its negative effect.
If the next symbol does not match the expected symbol, input symbols are skipped until the next expected symbol is found.
Error Recovery: definition
Internally the error recovery works as follows:
• The location of the syntax error is reported.
• If possible, the tokens that would be a legal continuation of the program are reported.
• The tokens that can serve to continue parsing are computed. A minimal sequence of tokens is skipped until one of these tokens is found.
Error recovery
• Consider the "Expect" method:
public void Expect( Symbols expectedSymbol )
{
if (currentSymbol == expectedSymbol)
nextSymbol( );
else
syntaxError( );
}
• We are going to complete the "syntaxError" method:
public void syntaxError( )
{
Console.WriteLine("Syntax Error");
nextSymbol(); //Get the next look-ahead symbol
}
Motivating Example .1
• Now, consider this grammar and the following code:
Motivating Example .2
/* compoundSt ::= begin Sts end */
procedure compoundSt( )
begin
Expect(S_begin);
Sts();
Expect(S_end);
end;
/* Sts ::= St ; Sts | ε */
procedure Sts( )
begin
if currentSymbol in First(St) then   /* otherwise Sts ::= ε */
begin
St( ); Expect(S_semicolon);
Sts();
end
end;
Motivating Example .3
/* St ::= ifSt | whileSt | assSt | compoundSt */
procedure St( )
begin
if (currentSymbol = s_if) then ifSt( )
else if (currentSymbol = s_while) then whileSt( )
else if (currentSymbol = s_id) then assSt( )
else compoundSt( );   /* compoundSt itself expects s_begin */
end;
Motivating Example .4
/* assSt ::= id := E */
procedure assSt( )
begin
nextSymbol( );
Expect(s_assign);
E()
end;
Input:
begin
Jf i = 5 then i := i+1;
while j< 5 di i := i*j;
end
Expected: S_id; look-ahead: i
Error Recovery: Approach
Suppose the parser is expecting a nonterminal, Yi, in this production:
X → Y1 Y2 … Yi … Yn
In fact the parser is expecting a terminal symbol s ∈ First(Yi).
The error recovery works as follows:
• Skip the next symbols, s, until arriving at a symbol s ∈ First(Yi+1 … Yn).
• Or, more generally, keep ignoring the next symbols, s, until arriving at a symbol s ∈ Stop(Yi)
• where ∀ i ∈ [1..n-1] ⇒ Stop(Yi) = First(Yi+1) + First(Yi+2) + … + First(Yn) + Stop(X)
• Stop(Yn) = Stop(X)
• Stop(Start Symbol) always includes the end-of-file marker, $.
Stop set
G1:
St → ifSt | whileSt | assSt | compoundSt
⇒ Stop(St) = [s_eof], since St is the start symbol,
⇒ Stop(ifSt) = Stop(whileSt) = Stop(assSt) = Stop(compoundSt) = Stop(St) = [s_eof]
compoundSt → begin Sts end
⇒ Stop(s_begin) = First(Sts) + [s_end] + Stop(compoundSt) = First(Sts) + [s_end, s_eof]
⇒ Stop(Sts) = [s_end] + Stop(compoundSt) = [s_eof, s_end]
⇒ Stop(s_end) = Stop(compoundSt) = [s_eof]
Sts → St ; Sts | St
⇒ Stop(St) = [s_semicolon] + First(St) + Stop(Sts)
= [s_semicolon] + First(ifSt) + First(whileSt) + First(assSt) + First(compoundSt) + [s_eof, s_end]
Error Recovery
• The "Expect" method is modified as follows:
public void Expect( Symbols expectedSymbol, HashSet<Symbols> Stop )
{
if (currentSymbol == expectedSymbol)
nextSymbol( );
else syntaxError( Stop );
}
• We are going to complete the "syntaxError" method:
public void syntaxError( HashSet<Symbols> Stop )
{
Console.WriteLine("Syntax Error");
// Stop always contains the end-of-file symbol, so the loop terminates
while ( !Stop.Contains( currentSymbol ) )
nextSymbol();
}
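The effect of passing stop sets down the call chain can be seen in a small runnable sketch. It uses a deliberately simplified grammar, compoundSt → begin { St ; } end with St → id := id, which is an assumption made here to keep the example short; the class RecoveringParser and its method names are likewise illustrative.

```python
class RecoveringParser:
    """compoundSt → begin { St ; } end ;  St → id := id   (simplified, hypothetical grammar)"""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]     # $ is the end-of-file marker
        self.pos = 0
        self.errors = []

    @property
    def current(self):
        return self.tokens[self.pos]

    def next_symbol(self):
        if self.current != "$":                # never move past the end marker
            self.pos += 1

    def syntax_error(self, expected, stop):
        self.errors.append(f"expected {expected}, found {self.current}")
        while self.current not in stop and self.current != "$":
            self.next_symbol()                 # skip until a stop symbol is reached

    def expect(self, token, stop):
        if self.current == token:
            self.next_symbol()
        else:
            self.syntax_error(token, stop)

    def compound(self, stop):
        self.expect("begin", {"id", "end"} | stop)
        self.statements({"end"} | stop)
        self.expect("end", stop)

    def statements(self, stop):
        while self.current == "id":
            self.statement({";"} | stop)
            self.expect(";", {"id"} | stop)

    def statement(self, stop):
        self.expect("id", {":="} | stop)
        self.expect(":=", {"id"} | stop)
        self.expect("id", stop)

    def parse(self):
        self.compound({"$"})
        return self.errors


parser = RecoveringParser("begin id := + + id ; end".split())
print(parser.parse())
```

After the error inside the first statement, the skipped tokens end at the semicolon, so the rest of the block is still parsed and only one error is reported.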
Error Recovery
The Expect method is modified as follows:
(* Expect compares expected symbol S with the current symbol *)
Procedure Expect ( ExpectedSymbol : Symbols , Stop : Set of Symbols ) ;
Begin
if Currentsymbol = ExpectedSymbol Then Nextsymbol
Else SyntaxError( Stop ) ;
End {Expect};
Procedure SyntaxError(Stop: Set of Symbols);
Begin
promptMsg('Syntax error at', LineNo, ColNo) ;
While not (CurrentSymbol in Stop) DO NextSymbol() ;
End{ SyntaxError };
Error Recovery
• The main body of the Mini-pascal compiler is modified as follows:
Program MiniPascalCompiler;
Type
Symbols = ( S_if , S_while, S_repeat , S_for, S_Case, S_then, S_else, S_do,
S_program, S_uses, S_interface, S_unit, S_begin, S_end,
S_label, S_const, S_type, S_var, S_procedure, S_function,
S_integer , S_real , S_char, S_array, S_record, S_pointer,
S_lt , S_gt , S_eq , S_le , S_ge, S_ne, S_add, S_sub, S_or, S_mul,
S_div, S_and, S_id, S_no, S_not, S_comma, S_colon, S_semicolon,
S_dot, S_OpBracket, S_ClBracket, S_OpCurlyB, S_ClCurlyB,
S_OpSquB, S_ClSquB );
Var
CurrentSymbol : Symbols ;
Error Recovery : MiniPascal
Begin
init( ) ; (* Initialize variables, open source and create target files *)
nextSymbol( ); (*Detects and saves the first lexicon in currentSymbol *)
ProgramX ([S_EOF] ); (* End of file marker is expected after the start symbol,
ProgramX *)
End.
Error Recovery : MiniPascal
(* ProgramX ::= Program id ‘;’ BlockBody ‘.’ *)
Procedure ProgramX ( Stop : Set of Symbols ) ;
Begin
Expect(S_program, [S_id , S_Semicolon] + First ( Blockbody ) + [ S_dot ] + Stop ) ;
Expect(S_id , [ S_Semicolon ] + First ( Blockbody ) + [ S_dot ] + Stop ) ;
Expect ( S_Semicolon , First ( Blockbody ) + [ S_dot ] + Stop ) ;
Blockbody ( [ S_dot ] + Stop ) ;
Expect ( S_dot , Stop ) ;
End;
Error Recovery : MiniPascal
(* BlockBody → [ ConstantDefPart ] [ TypeDefPart ] [ VarDefPart ]
{ FunctionDef | ProcedureDef } CompoundStatement *)
Procedure BlockBody( Stop : Set of Symbols);
Begin
if CurrentSymbol = S_const then
ConstantDefPart( Stop + [S_Type, S_Var, S_Procedure, S_Function, S_Begin] );
if CurrentSymbol = S_type then
TypeDefPart( Stop + [ S_Var, S_Procedure, S_Function, S_Begin] );
if CurrentSymbol = S_var then
VarDefPart( Stop + [ S_Procedure, S_Function, S_Begin] );
Example 1
• Convert this grammar into LL(1) form and write a recursive-descent parser for it:
G1:
S → aB | aC | dD
D → Da | Db | d
B → BC | b
C → Cd | d
G1 in EBNF, after left factoring and left-recursion elimination:
S → a (B | C) | dD
D → d { a | b }
B → b { C }
C → d {d}
First(S) = {a, d}
First(D) = {d}
First(B) = {b}
First(C) = {d}
Error Recovery : Example – 1.1
Begin
init();
NextSymbol();
S([S_EOF]);
End.
(* S → a (B | C) | dD *)
Procedure S( Stop : Set of Symbols);
Begin
If (CurrentSymbol = S_d) Then
Begin NextSymbol; D( Stop ); End
Else
Begin
Expect(S_a, [S_b, S_d] + Stop);
If (CurrentSymbol = S_b) Then B( Stop ) Else C( Stop );
End;
End;
Error Recovery : Example – 1.2
(* D → d { a | b } *)
Procedure D( Stop : Set of Symbols);
Begin
Expect(S_d, [S_a, S_b] + Stop);
While (CurrentSymbol = S_a) Or (CurrentSymbol = S_b) do NextSymbol;
End;
(* B → b { C } and C → d { d }, so B → b { d } *)
Procedure B( Stop : Set of Symbols);
Begin
Expect(S_b, [ S_d] + Stop);
While (CurrentSymbol = S_d) do NextSymbol;
End;
(* C → d { d } *)
Procedure C( Stop : Set of Symbols);
Begin
Expect(S_d, [ S_d] + Stop);
While (CurrentSymbol = S_d) do NextSymbol;
End;
Example 2
• Convert this grammar into LL(1) and write a recursive-descent parser for it:
G2:
S  Aa | Bd | Sc
A  Aa | 
B  Bb | 
A  {a} A is nullable
B  {b} B is nullable
First(A) = {a, }
Follow(A) = {a}
First(A)  Follow(A) = {a}  
First(B) = {b, }
Follow(B) = {d}
First(B)  Follow(B) = 
S  {a}a | Bd | Sc
S  {a{a}| Bd }c
First(S) = {a, b, c, d}
Example 2.1
G2:
S → Aa | Bd | Sc
A → Aa | ε
B → Bb | ε
G2 in LL(1) form (EBNF):
S → ( a{a} | {b}d ) {c}
/* S → ( a{a} | {b}d ) {c} */
public void S(HashSet<char> stop)
{
    // Every alternative is followed by {c}, so 'c' is a safe recovery symbol.
    var follow = new HashSet<char>(stop) { 'c' };
    if (currentSymbol == 'a')
        while (currentSymbol == 'a') nextSymbol();
    else
    {
        while (currentSymbol == 'b') nextSymbol();
        Expect('d', follow);
    }
    while (currentSymbol == 'c') nextSymbol();
}
Example 3
• Convert this grammar into LL(1) form and write a recursive-descent parser for it:
G3:
Expression → SimpleExp { RelOp SimpleExp }
RelOp → < | <= | = | <> | >= | > | IN
SimpleExp → Term { ( '+' | '-' | OR ) Term }
Term → Factor { ( '/' | '*' | DIV | AND ) Factor }
Factor → Number | NOT Factor | '(' Expression ')' | Variable
Variable → Identifier { '[' Dim ']' }
Dim → Expression { ',' Expression }
Example 3.1
Expression → SimpleExp { RelOp SimpleExp }   => First(Expression) = First(SimpleExp)
RelOp → < | <= | = | <> | >= | > | IN   => First(RelOp) = [ <, <=, =, <>, >=, >, IN ]
SimpleExp → Term { ( '+' | '-' | OR ) Term }   => First(SimpleExp) = First(Term)
Term → Factor { ( '/' | '*' | DIV | AND ) Factor }   => First(Term) = First(Factor)
Factor → Number | NOT Factor | '(' Expression ')' | Variable   => First(Factor) = [ Number, NOT, '(' ] + First(Variable)
Variable → Identifier { '[' Dim ']' }   => First(Variable) = [ Identifier ]
Dim → Expression { ',' Expression }   => First(Dim) = First(Expression)
First(Expression) = First(SimpleExp) = First(Term) = First(Factor) = [ Number, NOT, '(', Identifier ]
Example 3.2
Begin
  Init;
  NextSymbol;
  Expression( [S_Eof] );
End.
(* Expression → SimpleExp { RelOp SimpleExp } *)
Procedure Expression( Stop : Set of Symbols );
Begin
  FirstSetOfRelop := [ <, <=, =, <>, >=, >, IN ];
  FirstSimpleExp := [ Number, NOT, '(', Identifier ];
  SimpleExp( Stop + FirstSetOfRelop );
  While CurrentSymbol in FirstSetOfRelop Do
  Begin
    RelOp( Stop + FirstSimpleExp );
    SimpleExp( Stop + FirstSetOfRelop );
  End;
End {Expression};
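The remaining procedures of G3 follow the same pattern as Expression. The Python sketch below covers the whole grammar as a recognizer over a list of token names. It leaves out the stop sets, so it only accepts or rejects. The class `ExprParser`, the function `accepts`, and the token spellings ("Number", "Identifier", and so on) are assumptions made for illustration.

```python
REL_OPS = {"<", "<=", "=", "<>", ">=", ">", "IN"}
ADD_OPS = {"+", "-", "OR"}
MUL_OPS = {"/", "*", "DIV", "AND"}


class ExprParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    @property
    def cur(self):
        return self.tokens[self.pos]

    def eat(self, expected=None):
        if expected is not None and self.cur != expected:
            raise SyntaxError(f"expected {expected}, found {self.cur}")
        self.pos += 1

    def expression(self):  # Expression -> SimpleExp { RelOp SimpleExp }
        self.simple_exp()
        while self.cur in REL_OPS:
            self.eat()
            self.simple_exp()

    def simple_exp(self):  # SimpleExp -> Term { ('+' | '-' | OR) Term }
        self.term()
        while self.cur in ADD_OPS:
            self.eat()
            self.term()

    def term(self):  # Term -> Factor { ('/' | '*' | DIV | AND) Factor }
        self.factor()
        while self.cur in MUL_OPS:
            self.eat()
            self.factor()

    def factor(self):  # Factor -> Number | NOT Factor | '(' Expression ')' | Variable
        if self.cur == "Number":
            self.eat()
        elif self.cur == "NOT":
            self.eat()
            self.factor()
        elif self.cur == "(":
            self.eat()
            self.expression()
            self.eat(")")
        elif self.cur == "Identifier":
            self.variable()
        else:
            raise SyntaxError(f"unexpected {self.cur}")

    def variable(self):  # Variable -> Identifier { '[' Dim ']' }
        self.eat("Identifier")
        while self.cur == "[":
            self.eat()
            self.dim()
            self.eat("]")

    def dim(self):  # Dim -> Expression { ',' Expression }
        self.expression()
        while self.cur == ",":
            self.eat()
            self.expression()


def accepts(tokens):
    parser = ExprParser(tokens)
    try:
        parser.expression()
        parser.eat("EOF")
        return True
    except SyntaxError:
        return False


print(accepts(["Identifier", "[", "Number", "]", "<=", "Number", "+", "Number"]))
```

The four branches of `factor` correspond one to one with the four members of First(Factor) computed in Example 3.1.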
Example 4
• Convert this grammar into LL(1) form and write a recursive-descent parser for it:
G4:
S → L D
L → id : | ε
D → A | C | I | B
A → id := no + id
C → id ( )
B → begin T end
T → T ; S | S
I → if ( id > no ) goto id
Example 4.2
S → L D   => First(S) = First(L) + First(D), because L is nullable
L → id : | ε   => First(L) = [id, ε]   (nullable)
S → L D   => Follow(L) = First(D)
D → A | C | I | B   => First(D) = [id, if, begin]
=> First(A) ∩ First(C) = [id] ≠ ∅   → not LL(1)
=> First(L) ∩ Follow(L) = [id] ≠ ∅   → not LL(1)
=> D → id := no + id | id ( ) | I | B
Left factoring: D → id G | I | B
G → := no + id | ( )
Null-production elimination:
S → id : D | id G | I | B
Left factoring: S → id ( : D | G ) | I | B
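The two conflicts found by hand above can be checked mechanically. The following Python sketch computes First and Follow for the original G4 by fixed-point iteration and then intersects the relevant sets. The function `first_follow`, the `EPS` marker, and the dictionary encoding of the grammar (an empty list stands for an ε-production) are assumptions made for illustration.

```python
EPS = "ε"


def first_follow(grammar, start):
    """Compute First and Follow sets by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")

    def first_of(seq):
        out = set()
        for sym in seq:
            if sym not in grammar:          # terminal
                out.add(sym)
                return out
            out |= first[sym] - {EPS}
            if EPS not in first[sym]:
                return out
        out.add(EPS)                        # every symbol was nullable
        return out

    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                f = first_of(prod)
                if not f <= first[nt]:
                    first[nt] |= f
                    changed = True
                for i, sym in enumerate(prod):
                    if sym in grammar:
                        rest = first_of(prod[i + 1:])
                        add = rest - {EPS}
                        if EPS in rest:
                            add |= follow[nt]
                        if not add <= follow[sym]:
                            follow[sym] |= add
                            changed = True
    return first, follow


G4 = {
    "S": [["L", "D"]],
    "L": [["id", ":"], []],
    "D": [["A"], ["C"], ["I"], ["B"]],
    "A": [["id", ":=", "no", "+", "id"]],
    "C": [["id", "(", ")"]],
    "B": [["begin", "T", "end"]],
    "T": [["T", ";", "S"], ["S"]],
    "I": [["if", "(", "id", ">", "no", ")", "goto", "id"]],
}

first, follow = first_follow(G4, "S")
print(first["L"] & follow["L"])   # conflict on the nullable L
print(first["A"] & first["C"])    # conflict between two alternatives of D
```

Both intersections contain `id`, which is why the grammar needs left factoring and ε-elimination before a predictive parser can be written for it.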
Example 4.3
G4:
S → id ( : D | G ) | I | B   => First(S) = [id, if, begin]
D → id G | I | B   => First(D) = [id, if, begin]
G → := no + id | ( )   => First(G) = [:=, '(']
B → begin T end   => First(B) = [begin]
T → S { ; S }   => First(T) = [id, if, begin]
I → if ( id > no ) goto id   => First(I) = [if]
Example 4.4
// S → id ( : D | G ) | I | B   => First(S) = [id, if, begin]
public void S(HashSet<String> Stop)
{ HashSet<String> First = new HashSet<String>();
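The S method on this slide is cut off after its first statement. As a stand-in, here is a Python sketch of a recognizer for the LL(1) form of G4, with one function per nonterminal. It works on a list of token names and has no error recovery. The function `accepts_g4` and the token spellings are assumptions made for illustration, not a reconstruction of the missing C# code.

```python
def accepts_g4(tokens):
    """Return True if the token list is derivable from S in the LL(1) form of G4."""
    toks = list(tokens) + ["$"]
    pos = 0

    def cur():
        return toks[pos]

    def eat(expected):
        nonlocal pos
        if toks[pos] != expected:
            raise SyntaxError(f"expected {expected}, found {toks[pos]}")
        pos += 1

    def s():  # S -> id ( : D | G ) | I | B
        if cur() == "id":
            eat("id")
            if cur() == ":":
                eat(":")
                d()
            else:
                g()
        elif cur() == "if":
            i()
        else:
            b()

    def d():  # D -> id G | I | B
        if cur() == "id":
            eat("id")
            g()
        elif cur() == "if":
            i()
        else:
            b()

    def g():  # G -> := no + id | ( )
        if cur() == ":=":
            for tok in (":=", "no", "+", "id"):
                eat(tok)
        else:
            eat("(")
            eat(")")

    def b():  # B -> begin T end
        eat("begin")
        t()
        eat("end")

    def t():  # T -> S { ; S }
        s()
        while cur() == ";":
            eat(";")
            s()

    def i():  # I -> if ( id > no ) goto id
        for tok in ("if", "(", "id", ">", "no", ")", "goto", "id"):
            eat(tok)

    try:
        s()
        eat("$")
        return True
    except SyntaxError:
        return False


print(accepts_g4("begin id ( ) ; if ( id > no ) goto id end".split()))
```

After `id` has been consumed in `s`, the single token `:` decides between a labelled statement and the factored tail G, which is the point of the left-factoring step.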
The third homework : Insert your slides from this slide on
Exercise -1
1. Convert this grammar to LL(1) in EBNF and write a recursive-descent parser in Python or C#.
G1:
S ::= ε | A S
A ::= id := id
A ::= if id then A
A ::= if id then A' else A
A' ::= id := id
A' ::= if id then A' else A'
Notes:
1. A' could be removed.
2. The grammar is not acceptable.
3. If you substitute ε for S in the production S ::= A S, you will have S ::= {A}.
Exercise -2
Consider the grammar G12.
a) Point out all aspects of grammar G12 which are not LL(1).
b) Convert it into LL(1) form in EBNF.
c) Write the FIRST and FOLLOW sets for the new grammar.
d) Write out the LL(1) recursive-descent parser.
e) Do not forget error recovery.
Exercise -3
Consider the grammar for assignment statements:
A → id := E
E → E + T | E - T | T
T → T * F | T / F | F
F → id | no | ( E )
• Convert the grammar into LL(1) form, using EBNF.
• Write out the recursive-descent parser in C#, C++ or Python.
• Do not forget error recovery.
Parse-Tree Listeners & Visitors
• ANTLR provides support for two tree-walking mechanisms in its runtime library.
• By default, ANTLR generates a parse-tree listener interface that responds to events triggered by the built-in tree walker.
• Listeners resemble the SAX handlers used for XML, which receive notification of events such as startDocument and endDocument.
• With the -visitor option, ANTLR also generates a visitor interface, for applications that follow the visitor design pattern and control the tree walk themselves.
Parse-Tree Listeners
• To walk a tree and trigger calls into a listener, ANTLR's runtime provides the class ParseTreeWalker.
• ANTLR generates a ParseTreeListener subclass specific to each grammar, with enter and exit methods for each rule.
• As the walker encounters the node for rule assign, for example, it triggers enterAssign() and passes it the AssignContext parse-tree node.
• The beauty of the listener mechanism is that it is all automatic: we don't have to write a parse-tree walker, and our listener methods don't have to explicitly visit their children.
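The enter/exit protocol can be imitated in a few lines of plain Python. The sketch below does not use ANTLR's runtime: `Node`, `walk`, and `Trace` are hypothetical stand-ins for a parse-tree node, ParseTreeWalker, and a generated listener, written only to show the order in which the callbacks fire.

```python
class Node:
    """A parse-tree node: a rule name plus children (Node objects or token strings)."""

    def __init__(self, rule, children=()):
        self.rule = rule
        self.children = list(children)


def walk(listener, node):
    """Depth-first walk that calls enter_<rule> before and exit_<rule> after the children."""
    if isinstance(node, str):      # a token leaf: nothing to enter or exit
        return
    enter = getattr(listener, "enter_" + node.rule, None)
    if enter:
        enter(node)
    for child in node.children:
        walk(listener, child)
    leave = getattr(listener, "exit_" + node.rule, None)
    if leave:
        leave(node)


class Trace:
    """Listener that records the events it receives."""

    def __init__(self):
        self.events = []

    def enter_assign(self, node):
        self.events.append("enter assign")

    def exit_assign(self, node):
        self.events.append("exit assign")

    def enter_expr(self, node):
        self.events.append("enter expr")

    def exit_expr(self, node):
        self.events.append("exit expr")


tree = Node("assign", ["x", "=", Node("expr", ["1"])])
trace = Trace()
walk(trace, tree)
print(trace.events)
```

The listener methods never recurse into children themselves; the walker does that, which is the "all automatic" property described above.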
Parse-Tree Listeners
• The thick dashed line shows a depth-first walk of the parse tree.
• The thin dashed lines indicate the method call sequence among the visitor methods.
Build a language application
• The first step in building a language application is to create a grammar that describes the language's syntactic rules (the set of valid sentences).
• Run ANTLR (class org.antlr.v4.Tool) on the grammar file:
antlr4 ArrayInit.g4 # Generate parser and lexer using antlr4 alias
• From grammar ArrayInit.g4, ANTLR generates lots of files that we'd normally have to write by hand.
Write syntactic and lexical rules
starter/ArrayInit.g4
/** Grammars always start with a grammar header. This grammar is called
* ArrayInit and must match the filename: ArrayInit.g4
*/
grammar ArrayInit;
/** A rule called init that matches comma-separated values between {...}. */
init : '{' value (',' value)* '}' ; // must match at least one value
/** A value can be either a nested array/struct or a simple integer (INT) */
value : init
| INT
;
// parser rules start with lowercase letters, lexer rules with uppercase
INT : [0-9]+ ; // Define token INT as one or more digits
WS : [ \t\r\n]+ -> skip ; // Define whitespace rule, toss it out
Integrating a Generated Parser into a Java Program
Run the program
• The program generates LISP-like parse trees for a given input.
• Here's how to compile everything and run Test:
javac ArrayInit*.java Test.java
java Test
• Input
➾ {1,{2,3},4}
➾ EOF
• Output
❮ (init { (value 1) , (value (init { (value 2) , (value 3) })) , (value 4) })
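To see what the generated parser does for this input, the two rules of ArrayInit.g4 can also be written by hand. The Python sketch below is a small recursive-descent parser that prints the tree in the same LISP-like form. It does not use ANTLR. The functions `tokenize`, `parse_init`, `parse_value`, and `to_tree` are hypothetical names chosen for illustration.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[{},])")   # INT or one of { } , with leading whitespace skipped


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"bad character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_init(tokens, pos=0):
    """init : '{' value (',' value)* '}' ;  returns (tree_text, next_position)."""
    if pos >= len(tokens) or tokens[pos] != "{":
        raise SyntaxError("expected '{'")
    parts = ["{"]
    text, pos = parse_value(tokens, pos + 1)
    parts.append(text)
    while pos < len(tokens) and tokens[pos] == ",":
        text, pos = parse_value(tokens, pos + 1)
        parts += [",", text]
    if pos >= len(tokens) or tokens[pos] != "}":
        raise SyntaxError("expected '}'")
    parts.append("}")
    return "(init " + " ".join(parts) + ")", pos + 1


def parse_value(tokens, pos):
    """value : init | INT ;"""
    if pos < len(tokens) and tokens[pos] == "{":
        text, pos = parse_init(tokens, pos)
        return f"(value {text})", pos
    if pos < len(tokens) and tokens[pos].isdigit():
        return f"(value {tokens[pos]})", pos + 1
    raise SyntaxError("expected '{' or INT")


def to_tree(source):
    tokens = tokenize(source)
    text, pos = parse_init(tokens)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return text


print(to_tree("{1,{2,3},4}"))
```

Because First(init) = { '{' } and INT never starts with '{', `parse_value` can choose its alternative from the current token alone.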
ANTLR 4 with Python3 Detailed Example
• ANTLR4 introduced a handy listener-based API, but sometimes it's better not to use it.
https://dzone.com/articles/antlr-4-with-python-2-detailed-example
• As before, we run ANTLR on the grammar to generate code:
antlr4 -Dlanguage=Python3 arithmetic.g4
• This generates a lexer, a parser, and a base class for a listener.
• Here is the main body of the code:
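import antlr4
from arithmeticLexer import arithmeticLexer    # generated by ANTLR from arithmetic.g4
from arithmeticParser import arithmeticParser  # generated by ANTLR from arithmetic.g4

def main():
    lexer = arithmeticLexer(antlr4.StdinStream())
    stream = antlr4.CommonTokenStream(lexer)
    parser = arithmeticParser(stream)
    tree = parser.expression()
    handleExpression(tree)

if __name__ == '__main__':
    main()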
Iterate over the children
• The ANTLR API provides us with the means to iterate over the children of a node.
• We can walk through the children in order.
def handleExpression(expr):
    adding = True
    value = 0
    for child in expr.getChildren():
        if isinstance(child, antlr4.tree.Tree.TerminalNode):
            # An operator token: remember whether to add or subtract next.
            adding = child.getText() == "+"
        else:
            multValue = handleMultiply(child)
            if adding:
                value += multValue
            else:
                value -= multValue
    print("Parsed expression %s has value %s" % (expr.getText(), value))
Iterate over the children
• We iterate over the children; where we find a multiplying expression, we evaluate it.
• Where we find an operator, we use it to set a flag indicating the next operation to perform.
def handleMultiply(expr):
    multiplying = True
    value = 1
    for child in expr.getChildren():
        if isinstance(child, antlr4.tree.Tree.TerminalNode):
            # An operator token: remember whether to multiply or divide next.
            multiplying = child.getText() == "*"
        else:
            if multiplying:
                value *= int(child.getText())
            else:
                value /= int(child.getText())  # true division under Python 3
    return value
The place of IUST in the world
https://www.researchgate.net/publication/328099969_Software_Fault_Localisation_A_Systematic_Mapping_Study

  • 1. 5/11/2021 Saeed Parsa 1 Compiler Design Top down parsers Saeed Parsa Room 332, School of Computer Engineering, Iran University of Science & Technology parsa@iust.ac.ir Winter 2021
  • 2. Who, when/where, and what? • Who are we? • Lecturer • Saeed Parsa • Associate Professor in IUST • Research Area: Software Engineering, Software Testing, Software Debugging, Reverse Engineering, etc. • Email: parsa@iust.ac.ir • More Information: • http://parsa.iust.ac.ir/ • Slide Share • https://www.slideshare.net/SaeedParsa 5/11/2021 Saeed Parsa 2
  • 5. 5/11/2021 Saeed Parsa 5 Top-Down Parsing  A predictive parser is characterized by its ability to choose the production to apply solely on the basis of the next input symbol and the current nonterminal being processed.  Top down parsing, starts with the start symbol and apply the productions until arriving at the desired string.
  • 7. 5/11/2021 Saeed Parsa 7 Predictive parsers A predictive parser uses the next input symbol, as a look-ahead to determine the production rule for expanding the current nonterminal.
  • 9. 5/11/2021 Saeed Parsa 9 LL(1) Grammars  A grammar is LL(1) if it can be parsed by considering only one non- terminal and the next token, as look ahead, in the input stream.  Example: The following grammar is LL(1): S ::= a S B | d B B ::= b B | a N.T. a b d S a S B d B B a bB  Parsing table:
  • 10. 5/11/2021 Saeed Parsa 10  The following grammar is LL(1): S ::= a S B | d B B ::= b B | a N.T. a b d S a S B d B B a bB  Parsing table:  Input string: adbaa  Parsing: Use parsing table to select a production S Example
  • 11. 5/11/2021 Saeed Parsa 11 LL(1) Grammars defintion  A grammar, G, is LL(1) if and only if:  “A  𝛽1 | 𝛽2 | … | 𝛽𝑛" G  𝐹𝑖𝑟𝑠𝑡(𝛽𝑖), 𝐹𝑖𝑟𝑠𝑡(𝛽𝑗) = 𝜑 ,  𝑖, 𝑗  1. . 𝑛  i  j where A  a 𝛽  First(A) = { a } , a  Terminal Symbols & 𝛽 is any string A  B 𝛽  First(A) ⊇ First(B), B  Nonterminal Symbols & … A  𝛽1 | 𝛽2 | … | 𝛽𝑛  First(A) = First(𝛽1)  First(𝛽2)  …  First(𝛽𝑛)
  • 12. 5/11/2021 Saeed Parsa 12 First Set: Definition • Suppose A is a nonterminals, First(A) consists of the first terminals that can be derived from A. if A ⇒* aβ, then a ∈ First(A) if A ⇒*  (nullable), then  ∈ First(A) First(A) = First(aB)+ First(CD) = {a} + First(C) + First(D) = {a, c, d} A  aB | CD B  Bb |  C  c |  D  d Another grammar E.g. First(B) = {b, }
  • 13. 5/11/2021 Saeed Parsa 13  Paziraee ::= PishGhaza Ghaza Deser Kotak  PishGaza ::= sup β1| Ash β2| panir β3| salad β4|   Ghaza ::= Ash β5 | Abgusht | Pitza | Kabab |   Deser ::= Bastani | SholehZard | Miveh |   Kotak ::= Chomagh | Shamsir o First(PishGaza) ::={ sup, Ash, panir,  } = { sup, Ash, panir} + Follow(PishGhaza) o Ghaza ::= Ash | Abgusht | Pitza | Kabab |  o Deser ::= Bastani | SholehZard | Miveh |  o Kotak ::= Chomagh | Shamsir First Set: Definition
  • 14. 5/11/2021 Saeed Parsa 14 First Set: Properties 1. If X is a terminal or ε, then First(X) = {X} 2. Suppose X is a nonterminal and X  Y1Y2...Yk - if for some i, Y1...Yi-1 ⇒* ε , then First(X) ⊇ First(Yi) – {ε} - if Y1...Yk ⇒* ε, then ε ∈ First(X) First(A) = {a, c, d} A  aB | CD B  Bb | ε C  c | ε D  d Another grammar E.g. Why exclude it ?
  • 15. 5/11/2021 Saeed Parsa 15 First Set: Definition
  • 16. 5/11/2021 Saeed Parsa 16 Example 1  First(B) = First(ea) = {e} First(C) = First(bC) + First(d) = {b, d} First(A) = {a} + First(B) + First(C) = {a,b,d,e} G is LL(1) because: A  a A | B b | C d a First(aA)∩First(Bb), First(aA)∩First(Cda), First(Bb)∩First(Cda) = {a} ∩ {e} = 𝜑 , = {a} ∩ {b,d} = 𝜑, = {e} ∩ {b,d} = 𝜑 C  b C | d  First(bC) ∩ First(d) ={b} ∩ {d} = 𝜑
  • 17. 5/11/2021 Saeed Parsa 17 Example 2 G: A  a A | a B d| C d a C  d C | a B  a e G is not LL(1) because: A  a A | a B d | C d a  1. First(aA) ∩ First(a B b) = a  𝜑 2. also First(Cda) = {d, a}  {a}  if ( look-ahead == {a}  current symbol == A ) then it will not be possible to determine which production to choose
  • 18. i. Left Factoring
      G: A → a e A | a B d | C d a
         C → d C | a
         B → a e
      ⇒ (replace the leading C in A with its alternatives)
         A → a e A | a B d | d C d a | a d a
         C → d C | a
         B → a e
      ⇒ (factor out the common prefix a)
         A → a A’ | d C d a
         A’ → e A | B d | d a
         C → d C | a
         B → a e
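The factoring step on this slide can be sketched as a small function. This is a minimal illustration, assuming productions are encoded as lists of symbols; the names `left_factor` and `common_prefix` are hypothetical, and only one round of factoring on the first symbol is performed.

```python
def common_prefix(seqs):
    """Longest prefix shared by all symbol lists in seqs."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix

def left_factor(nt, alternatives):
    """One round of left factoring: A -> αβ1 | αβ2 | δ  becomes  A -> αA' | δ, A' -> β1 | β2."""
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {nt: []}
    count = 0
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nt].extend(group)              # nothing to factor
            continue
        count += 1
        new_nt = nt + "'" * count                 # fresh nonterminal A', A'', ...
        prefix = common_prefix(group)
        result[nt].append(prefix + [new_nt])
        result[new_nt] = [rhs[len(prefix):] for rhs in group]   # [] stands for ε
    return result

# A -> a e A | a B d | d C d a | a d a   (the slide's grammar after substituting C)
factored = left_factor("A", [["a", "e", "A"], ["a", "B", "d"],
                             ["d", "C", "d", "a"], ["a", "d", "a"]])
print(factored)
```

If the new nonterminal's alternatives still share a prefix, the same function can be applied to it again.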
  • 19. Nullable
      We say that a nonterminal X is nullable if the empty sequence can be derived from it:
        X ⇒* ε then Nullable(X) = true
      To be LL(1), for any production
        Y → α | β with α ⇒* ε
      First(α) ∩ First(β) should be empty, where the look-ahead set of the nullable alternative α also includes Follow(Y).
      Example:
        A → B b e d
        B → b d | ε
        First(B) ∩ Follow(B) = {b} ≠ ∅, so the grammar is not LL(1)
  • 20. LL(1) grammars
      A grammar is LL(1) if and only if:
        - No two distinct productions with the same LHS can generate the same first terminal symbol (e.g. A → a α | a β is not LL(1)).
        - No nullable symbol A has the same terminal symbol a in both its First and Follow sets for distinct production rules.
        - There is only one way to derive ε from a nullable symbol.
  • 21. LL(1) grammars
      Follow:
        - A → αBβ, then Follow(B) ⊇ First(β) – {ε}
        - A → αBβC and Nullable(β) = true, then Follow(B) = Follow(B) ∪ (First(β) – {ε}) ∪ (First(C) – {ε})
        - A → αBβ and Nullable(β) = true, then Follow(B) = Follow(B) ∪ (First(β) – {ε}) ∪ Follow(A)
      Predict(A, a) = {A → β : a ∈ First(β)} ∪ {A → β : β is nullable and a ∈ Follow(A)}
  • 22. LL(1)
      • G: S → a b | a d
        Is G LL(1)? First(ab) ∩ First(ad) = {a} ≠ ∅ ⇒ G is not LL(1) ⇒ left factoring should be applied
        ⇒ G: S → a A’
             A’ → b | d
      • G: S → a B d
           B → d e | ε
        Is G LL(1)? First(B) ∩ Follow(B) = {d} ≠ ∅ ⇒ G is not LL(1) ⇒ ε-productions should be removed
        ⇒ G: S → a d e d | a d
        ⇒ (left factoring) S → a d S’
                           S’ → e d | ε
  • 23. Example - 1
      • Example: determine the Follow set where required
          S → a B C d | C A
          B → b B | ε
          C → a A e | ε
          A → e
        First(S) = {a} + First(C)
        First(B) = {b, ε} = {b} + Follow(B) = {b} + First(C d) = {b, a, d}
        First(C) = {a, ε} = {a} + Follow(C) = {a} + {d} + First(A) = {a, d, e}
  • 24. Example - 2
      • Example: determine the Follow set where required
          S → a B C d | C A
          B → b B | ε
          C → a A e | ε
          A → e
        First(S) = {a} + First(C) = {a, …}
        First(B) = {b, ε} = {b} + Follow(B) = {b} + First(C d) = {b, a, d}
        First(C) = {a, ε} = {a} + Follow(C) = {a} + {d} + First(A) = {a, d, e}
  • 25. Example - 3
      Production Rules
        S -> aBDh
        B -> cC
        C -> bC | d
        D -> EF
        E -> g | λ
        F -> f | λ
      First sets
        First(D) = First(E) = {g, f, h}
        First(E) = {g} + Follow(E) = {g, f, h}
        First(F) = {f} + Follow(F) = {f, h}
      Follow sets
        Follow(S) = {$}
        Follow(B) = First(D) = {g, f, h}
        Follow(C) = Follow(B) = {g, f, h}
        Follow(D) = {h}
        Follow(E) = First(F) = {f, h}
        Follow(F) = Follow(D) = {h}
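The Follow sets of this example can be cross-checked mechanically. The sketch below assumes the usual textbook definitions (First sets keep ε as a member, and Follow sets never contain ε); the grammar encoding and the function names `first_of`, `first_sets` and `follow_sets` are illustrative assumptions.

```python
EPS, EOF = "ε", "$"

def first_of(seq, first, grammar):
    """First of a string of symbols, given the First sets of the nonterminals."""
    out = set()
    for sym in seq:
        if sym not in grammar:        # terminal
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                      # every symbol was nullable (or seq is empty)
    return out

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def follow_sets(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)            # $ always follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue
                    tail = first_of(rhs[i + 1:], first, grammar)
                    new = tail - {EPS}
                    if EPS in tail:   # the rest is nullable: Follow(lhs) follows sym too
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

grammar = {
    "S": [["a", "B", "D", "h"]],
    "B": [["c", "C"]],
    "C": [["b", "C"], ["d"]],
    "D": [["E", "F"]],
    "E": [["g"], []],
    "F": [["f"], []],
}
first = first_sets(grammar)
follow = follow_sets(grammar, "S", first)
for nt in grammar:
    print(nt, sorted(follow[nt]))
```

Note that the slide writes First(E) = {g} + Follow(E): it folds Follow into the look-ahead set of a nullable nonterminal, whereas this sketch keeps the two sets separate.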
  • 26. Example - 4
      Production Rules:
        S -> aBDh
        B -> cC
        C -> bC | d
        D -> EF | h
        E -> g | λ
        F -> f | λ
      First(D) = First(E) = {g, f, h}
      First(E) = {g} + Follow(E) = {g, f, h}
      Follow(E) = First(F) = {f, h}
      First(F) = {f} + Follow(F) = {f, h}
      Follow(F) = Follow(D) = {h}
      Follow(S) = {$}
      Follow(B) = First(D) = {g, f, h}
      Follow(C) = Follow(B) = {g, f, h}
      Follow(D) = {h}
      Is this grammar LL(1)?
        C -> bC | d: First(bC) ∩ First(d) = ∅
        D -> EF | h ⇒ First(EF) ∩ First(h) = {h} ≠ ∅
          ⇒ D -> g F | F | h
          ⇒ D -> g f | g | f | λ | h
        First(D) ∩ Follow(D) = { h } ≠ ∅
          ⇒ D -> g f | g | f | h
          ⇒ S -> a B D h | a B h
          ⇒ S -> a B S’
          ⇒ S’ -> D h | h
          ⇒ D -> gf | g | f | h
          ⇒ S’ -> g f h | g h | f h | h h | h
  • 27. Students’ questions and my answers
  • 28. Follow sets
      • What?
        The Follow set of a nonterminal A is the set of first terminal symbols that can come immediately after A.
      • Why?
        For a grammar G to be LL(1), as described before:
          ∀ B ∈ G | Nullable(B) ⇒ First(B) ∩ Follow(B) = ∅
      • How?
          A → α B β ⇒ Follow(B) = First(β)
          if nullable(β) ⇒ Follow(B) = First(β) + Follow(A)
      • Examples?
        Let me finish with the description and then …
  • 29. Follow sets
  • 30. Follow Sets
      Notice:
        - Input to a compiler is a source file;
        - A source file, like any other file, ends with an end-of-file marker;
        - The end-of-file marker is represented by $;
        - Since a source program is supposed to be an instance of the start symbol:
          $ is always considered a member of the Follow set of the start symbol.
  • 31. Follow Set Example 1
      • Notice: Always add the end-of-file marker, $, to Follow(start symbol)
  • 32. Follow Set Example 1
      C → cC   => First(C) = {c} + Follow(C)
      S → C    => Follow(C) ⊇ Follow(S)
      S → ASb  => Follow(S) ⊇ {b}
      S is the starting symbol => $ ∈ Follow(S) => Follow(S) = {b, $}
      S → C    => Follow(C) = {b, $}
      C → cC   => First(C) = {c} + Follow(C) = {c, b, $}
      S → ASb  => Follow(A) ⊇ First(S)
      S → ASb  => First(S) ⊇ First(A) = {a}
      S → C    => First(S) ⊇ First(C) = {c} + Follow(C) ⇒ First(S) = {a, c, b, $}
      S → ASb  => Follow(A) ⊇ First(S) = {a, c, b, $}
      Follow(A) = {a, c, b, $}
      Follow(C) = {b, $}
      Follow(S) = {b, $}
  • 33. Follow Set Example 2
      G: S → L B
         L → id : | ε
         B → i c t S E
         E → e S | ε
      E → e S | ε    => First(E) = {e} + Follow(E) = {e, $}
      B → i c t S E  => Follow(E) ⊇ Follow(B)
      S → L B        => Follow(B) ⊇ Follow(S)
      B → i c t S E  => Follow(S) = {$} + First(E) = {$} + {e, $} = {e, $}
      Follow(B) ⊇ Follow(S) = {e, $}
      S → L B        => Follow(L) ⊇ First(B) = {i}
      B → i c t S E  => First(B) = {i}
  • 34. Example
      Transform the following grammar into LL(1)
      G: LabeledSt → Label Statement
         Label → id : | ε
         Statement → AssignmentSt | IfSt | WhileSt | CallSt
         AssignmentSt → id := Expression
         WhileSt → while Expression do Statement
         IfSt → if Expression do Statement
         CallSt → id ( Params )
      1. The grammar is not LL(1) because:
         Nullable(Label) ∧ (First(Label) ∩ Follow(Label) = {id} ≠ ∅) ⇒ ¬LL(1)
         First(Label) = {id, ε} = {id} + Follow(Label) = {id} + First(Statement)
  • 35. ii. Null Production Removal
      LabeledSt → Label Statement
      Label → id : | ε
      1. Replace Label with its expansion ⇒ LabeledSt → id : Statement | Statement
      2. First(id : Statement) ∩ First(Statement) = {id} ≠ ∅ ⇒ Left factor into LL(1)
  • 36. ii. Null Production Removal
      LabeledSt → Label Statement
      Label → id : | ε
      ⇒ LabeledSt → id : Statement | Statement
      Statement → AssignmentSt | CallSt | IfSt | WhileSt
      ⇒ Statement → id := Expression | id ( Params ) | IfSt | WhileSt
      ⇒ Statement → id CallAssign | IfSt | WhileSt
         CallAssign → := Expression | ( Params )
      LabeledSt → id : Statement | id CallAssign | IfSt | WhileSt
      ⇒ LabeledSt → id LaCallAs | IfSt | WhileSt
         LaCallAs → : Statement | CallAssign
  • 37. iii. Left Recursion Elimination
      A grammar is not LL(1) if it includes a left-recursive production:
        X → X α | β
      because: First(X α) = First(X) ⊇ First(β) ⇒ First(X α) ∩ First(β) ≠ ∅
      Left recursion is eliminated by converting the grammar into an equivalent right-recursive grammar.
      X → X α | β is converted to
        1. BNF:  X → β X’
                 X’ → α X’ | ε
        2. EBNF: X → β {α}
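The BNF conversion rule on this slide is mechanical enough to sketch in a few lines. The function name `eliminate_left_recursion` and the list-of-symbols encoding are assumptions made for illustration; the sketch handles immediate left recursion only, which is the case the slide describes.

```python
def eliminate_left_recursion(nt, alternatives):
    """Rewrite X -> X α | β as X -> β X', X' -> α X' | ε (immediate recursion only)."""
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]   # the α parts
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]     # the β parts
    if not alphas:
        return {nt: alternatives}      # no left recursion: nothing to do
    new_nt = nt + "'"                  # fresh nonterminal X'
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],           # [] stands for ε
    }

# E -> E + T | E - T | T
print(eliminate_left_recursion("E", [["E", "+", "T"], ["E", "-", "T"], ["T"]]))
```

Applied to E → E + T | E - T | T, this produces E → T E' and E' → + T E' | - T E' | ε.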
  • 38. X → X α | β is equivalent to
        1. BNF:  X → β X’
                 X’ → α X’ | ε
        2. EBNF: X → β {α}
      Because both forms generate the same strings, a β followed by zero or more α:
        1. X → X α | β
        2. X → β {α}
  • 39. Consider the arithmetic expression grammar
        E → E + T | E - T | T
        T → T * F | T / F | F
        F → Id | No | ( E )
      1. Left factoring (BNF)
        E → E + T | E - T | T ⇒ E → E E” | T
                                 E” → + T | - T
        T → T * F | T / F | F ⇒ T → T T” | F
                                 T” → * F | / F
      1. Left factoring (EBNF)
        E → E + T | E - T | T ⇒ E → E (+ T | - T) | T
        T → T * F | T / F | F ⇒ T → T (* F | / F) | F
  • 40. Example
      X → X α | β is converted to
        1. BNF:  X → β X’
                 X’ → α X’ | ε
        2. EBNF: X → β {α}
      2. Left Recursion Elimination (BNF)
        E → E E” | T ⇒ E → T E’
                        E’ → E” E’ | ε
                      ⇒ E’ → + T E’ | - T E’ | ε
        T → T T” | F ⇒ T → F T’
                        T’ → * F T’ | / F T’ | ε
      2. Left Recursion Elimination (EBNF)
        E → E (+ T | - T) | T ⇒ E → T {+ T | - T}
        T → T (* F | / F) | F ⇒ T → F {* F | / F}
  • 41. Example
      Consider the arithmetic expression grammar
        E → E + T | E - T | T
        T → T * F | T / F | F
        F → Id | No | ( E )
      - Equivalent G. (BNF)
        E → T E’
        E’ → + T E’ | - T E’ | ε
        T → F T’
        T’ → * F T’ | / F T’ | ε
        F → Id | No | ( E )
      - Equivalent G. (EBNF)
        E → T {+ T | - T}
        T → F {* F | / F}
        F → Id | No | ( E )
  • 42. How to transform to LL(1)
      To ensure that a grammar is LL(1), we must do the following:
        1. Eliminate any common left prefixes,
        2. Eliminate any left recursion,
        3. Eliminate nullable productions, if they cause a problem.
      1. Left factoring:
        A → α β1 | α β2 | 𝜹 is replaced with:
          A → α A′ | 𝜹
          A′ → β1 | β2
        Or in extended BNF:
          A → α (β1 | β2) | 𝜹
  • 43. How to transform to LL(1)
      2. Left Recursion Elimination
        X → X α | β is converted to
          X → β X’
          X’ → α X’ | ε
        Or in extended BNF:
          X → β {α}
      3. No nullable symbol A has the same terminal symbol a in both its First and Follow sets for distinct production rules.
  • 44. Parse tables
      - The key problem during predictive parsing is that of determining the production to be applied for a nonterminal.
      - This is done by using a parsing table.
      - A parsing table is a two-dimensional array M[A, a], where A is a nonterminal and a is a terminal or the symbol $, meaning "end of input string".
      - The other inputs of a predictive parser are:
        ◦ The input buffer, which contains the string to be parsed followed by $.
        ◦ The stack, which contains a sequence of grammar symbols; initially it holds $S (the end-of-input marker below the start symbol).
  • 45. Example 1
      • The purpose of a parsing table is to determine which production rule to use next.
      • Consider the following grammar:
        G1: S → d A B | B a B
            A → d A | B a
            B → b B | ε
      1. Transform the grammar into LL(1) form,
      2. Use First and Follow sets to construct the parsing table,
      3. Use the parsing table to parse given input strings.
  • 46. Example 1 - Continued
      1. Convert G1 into the LL(1) form
        - B → b B | ε
        - First(B) = {b, ε} => First(B) ∩ Follow(B) should be empty.
        - A → B a => Follow(B) = {a}
        - S → d A B => Follow(B) = {a} + {$} = {a, $}
        - It is assumed that always: $ ∈ Follow(Start symbol)
        - => $ ∈ Follow(S) => $ ∈ Follow(dAB) => $ ∈ Follow(B)
        - => First(B) ∩ Follow(B) = {b} ∩ {a, $} = ∅
        - First(B) = {b, ε}, Follow(B) = {a, $}
        - First(B) = {b, a, $}   (as a look-ahead set, with ε replaced by Follow(B))
  • 47. Example 1 - Continued
      - A → d A | B a
      - First(dA) ∩ First(B a) = {d} ∩ {b, a} = ∅
      - First(A) = First(dA) ∪ First(B a) = {d, b, a}
      - S → d A B | B a B
      - First(dAB) ∩ First(BaB) = {d} ∩ {b, a} = ∅
      - First(S) = First(dAB) ∪ First(BaB), where First(BaB) = {b} ∪ First(a) because B is nullable
      - First(S) = {d} ∪ {b} ∪ {a} = {d, b, a}
      2. Use First sets to work out the parsing table
  • 48. Example 1 - Continued
      G1: S → d A B | B a B
          A → d A | B a
          B → b B | ε
      - First(S) = {d, b, a}
      - First(A) = {d, b, a}
      - First(B) = {b, ε} = {b, a, $}
      - Follow(B) = {a, $}
      Parsing table (empty cells are errors):
             d      a      b      $
        S    dAB    BaB    BaB
        A    dA     Ba     Ba
        B           ε      bB     ε
  • 49. Example 2
      Build the parsing table for this grammar:
        G2: S → ( L ) | a
            L → L S | S
      1- Eliminate left recursion
        G2: S → ( L ) | a
            L → S L’
            L’ → S L’ | λ
  • 50. Example 2-2
      2- Define the First sets and, if required, the Follow sets for the nonterminals.
        First(L’) = First(S) + {λ} = { (, a, λ }
        Follow(L’) = Follow(L) = { ) }
        First(L) = First(S) = { (, a }
        Follow(L) = { ) }   (in the original grammar, L → L S gives Follow(L) = { (, a } + { ) } = { (, a, ) })
      Parsing table:
               a       (       )       $
        S      a       (L)     -       -
        L      SL’     SL’     -       -
        L’     SL’     SL’     λ       -
      Parsing the input (a(aa)):
        Parse stack          Input        Rule
        $ S                  (a(aa))$     S → (L)
        $ ) L (              (a(aa))$     Delete
        $ ) L                a(aa))$      L → SL’
        $ ) L’ S             a(aa))$      S → a
        $ ) L’ a             a(aa))$      Delete
        $ ) L’               (aa))$       L’ → SL’
        $ ) L’ S             (aa))$       S → (L)
        $ ) L’ ) L (         (aa))$       Delete
        $ ) L’ ) L           aa))$        L → SL’
        $ ) L’ ) L’ S        aa))$        S → a
        $ ) L’ ) L’ a        aa))$        Delete
        $ ) L’ ) L’          a))$         L’ → SL’
        $ ) L’ ) L’ S        a))$         S → a
        $ ) L’ ) L’ a        a))$         Delete
        $ ) L’ ) L’          ))$          L’ → λ
        $ ) L’ )             ))$          Delete
        $ ) L’               )$           L’ → λ
        $ )                  )$           Delete
        $                    $
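The trace above follows the standard table-driven predictive parsing loop. A minimal sketch of that loop, assuming the parsing table of this example is encoded as a Python dictionary (the names `TABLE`, `NONTERMINALS` and `parse` are illustrative, and each input character is treated as one token):

```python
# Parsing table for  S -> ( L ) | a,  L -> S L',  L' -> S L' | λ
TABLE = {
    ("S", "("): ["(", "L", ")"],
    ("S", "a"): ["a"],
    ("L", "("): ["S", "L'"],
    ("L", "a"): ["S", "L'"],
    ("L'", "("): ["S", "L'"],
    ("L'", "a"): ["S", "L'"],
    ("L'", ")"): [],                  # L' -> λ
}
NONTERMINALS = {"S", "L", "L'"}

def parse(tokens, start="S"):
    """Return the list of productions applied, or raise SyntaxError."""
    stack = ["$", start]              # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {look!r}) at position {pos}")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))   # push the right-hand side, leftmost symbol on top
        elif top == look:
            pos += 1                      # "Delete": matched terminal is consumed
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r} at position {pos}")
    return applied

for lhs, rhs in parse("(a(aa))"):
    print(lhs, "->", " ".join(rhs) or "λ")
```

The productions it prints appear in the same order as the non-Delete rows of the trace.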
  • 51. Example 3
  • 53. The third homework: Insert your slides from this slide on
  • 54. Exercise - 1
      1. Convert this grammar to LL(1)
        G1: S ::= ε | A S
            A ::= id := id
            A ::= if id then A
            A ::= if id then A' else A
            A' ::= id := id
            A' ::= if id then A' else A'
  • 55. Exercise - 2
  • 56. Solution
      1. A ::= ABd | Aa | a
        i. Left factoring
             A ::= A A” | a
             A” ::= Bd | a
        ii. Eliminate left recursion
             A ::= A A” | a => A ::= a A’
                               A’ ::= A” A’ | ε => A’ ::= Bd A’ | a A’ | ε
             B ::= Be | b   => B ::= b B’
                               B’ ::= e B’ | ε
      2. A ::= A b | A c | a | b
        i. Left factoring => A ::= A A” | a | b
                             A” ::= b | c
        ii. Eliminate left recursion => A ::= a A’ | b A’
                                        A’ ::= b A’ | c A’ | ε
      3. A ::= A B | A c | a | aa
        i. Left factoring => A ::= A A” | a D
                             A” ::= B | c
                             D ::= a | ε
        ii. Eliminate left recursion => A ::= a D A’
                                        A’ ::= B A’ | c A’ | ε
  • 57. Exercise - 3
      Consider the grammar G12
        a) Point out all aspects of grammar G12 which are not LL(1).
        b) Write a new grammar which accepts the same language, but avoids left recursion and common left prefixes.
        c) Write the FIRST and FOLLOW sets for the new grammar.
        d) Write out the LL(1) parse table for the new grammar.
        e) Is the new grammar an LL(1) grammar? Explain your answer carefully.
  • 58. Exercise - 4
      Consider the assignment statements grammar
        A → id := E
        E → E + T | E - T | T
        T → T * F | T / F | F
        F → Id | No | ( E )
      Convert the grammar to LL(1). Construct the parsing table for the grammar. Use the table to parse the statement: a := (b/c*3 - e*f)/2
  • 59. Solution
  • 60. Solution
  • 62. Recursive descent parsers
      - A recursive-descent parser is structured as a set of mutually recursive procedures, one for each nonterminal in the grammar.
      - The procedure corresponding to nonterminal A recognizes an instance of A in the input stream.
      - To recognize a nonterminal B on some right-hand side for A, the parser invokes the procedure corresponding to B.
      - Thus, the grammar itself serves as a guide to the parser's implementation.
  • 63. Recursive descent parsers
      • To test for the presence of a nonterminal, say A, the code invokes a procedure named A.
      • Suppose: A → a B D
        public class Parser {
            private Symbols currentSymbol;

            Parser() {
                currentSymbol = nextSymbol();   // read the first look-ahead symbol
                A();                            // call the procedure of the start symbol
            }

            /* A → a B D */
            public void A() {
                Expect(Symbols.a);
                B();
                D();
            }

            public void Expect(Symbols expectedSymbol) {
                if (currentSymbol == expectedSymbol)
                    currentSymbol = nextSymbol();
                else
                    syntaxError();
            }
        }
  • 64. Recursive descent parsers
      For instance:
        G: S → if E then S | if E then S else S | begin S L | print E
           L → end | ; S L
           E → i
      - Recursive descent parsers develop a procedure / method for each nonterminal A, with the same name as the nonterminal.
      - There are three nonterminals, S, L, and E, in the grammar.
      - Three methods S(), L() and E() should be written.
      - A lexical analyzer method nextSymbol() is invoked to get the next token from the input file.
      - nextSymbol() copies the symbol into a global variable called currentSymbol.
      - It is assumed that the next symbol is always accessible via currentSymbol before it is analyzed.
  • 66. Recursive descent parsers
      // S → if E then S | if E then S else S | begin S L | print E
      public void S() {
          if (currentSymbol.equals("if")) {
              nextSymbol();
              E();
              expect("then");
              S();
              if (currentSymbol.equals("else")) {   // optional else part
                  nextSymbol();
                  S();
              }
          } else if (currentSymbol.equals("begin")) {
              nextSymbol();
              S();
              L();
          } else if (currentSymbol.equals("print")) {
              nextSymbol();
              E();
          } else {
              throw new IllegalTokenException(
                  "Procedure S() expected an 'if' or 'begin' or 'print' token "
                  + "but received: " + currentSymbol);
          }
      }
  • 67. Recursive descent parsers
      • A recursive-descent parser is structured as a set of mutually recursive procedures, one for each nonterminal in the grammar.
      • For instance, consider the arithmetic expression grammar
        G: E → E + T | E - T | T
           T → T * F | T / F | F
           F → Id | No | ( E )
      1. Transform G into LL(1):
        G: E → T E’
           E’ → + T E’ | - T E’ | ε
           T → F T’
           T’ → * F T’ | / F T’ | ε
           F → Id | No | ( E )
      - Equivalent G. (EBNF)
        G: E → T {+ T | - T}
           T → F {* F | / F}
           F → Id | No | ( E )
  • 68. Recursive descent parsers
      • The procedure corresponding to nonterminal A recognizes an instance of A in the input stream.
        // E → T E’
        public void E() {
            T();
            EPrime();
        }
      • To recognize a nonterminal B on some right-hand side for A, the parser invokes the procedure corresponding to B.
        // E’ → + T E’ | - T E’ | ε   (E’ is written EPrime, since ’ is not legal in a method name)
        public void EPrime() {
            if (currentSymbol == s_Add) {          // E’ → + T E’
                nextSymbol(); T(); EPrime();
            } else if (currentSymbol == s_Sub) {   // E’ → - T E’
                nextSymbol(); T(); EPrime();
            }
            // otherwise E’ → ε: consume nothing
        }
  • 69. From EBNF to BNF
      • For building parsers (especially bottom-up), a BNF grammar is often better than EBNF. But it is easy to convert an EBNF grammar to BNF:
        - Convert every repetition { E } to a fresh nonterminal X and add X ::= ε | E X.
        - Convert every option [ E ] to a fresh nonterminal X and add X ::= ε | E.
        - Convert every group ( E ) to a fresh nonterminal X and add X ::= E.
        - We can even do away with alternatives by having several productions with the same nonterminal:
          X ::= E | E’. becomes X ::= E. X ::= E’.
  • 70. From EBNF to BNF
      For a recursive descent parser it is easier to use extended BNF.
        G: E → T {+ T | - T}
           T → F {* F | / F}
           F → Id | No | ( E )
      public class Parser {
          private Symbols currentSymbol;

          Parser() {
              // Get the first symbol into currentSymbol before calling E
              nextSymbol();
              E();
          }

          // E → T {+ T | - T}
          public void E() {
              T();
              while (currentSymbol == S_Add || currentSymbol == S_Sub) {
                  nextSymbol();
                  T();
              }
          }
  • 71. From EBNF to BNF
          // T → F {* F | / F}
          public void T() {
              F();
              while (currentSymbol == S_Mul || currentSymbol == S_Div) {
                  nextSymbol();
                  F();
              }
          }

          // F → Id | No | ( E )
          public void F() {
              if (currentSymbol == S_Id || currentSymbol == S_No)
                  nextSymbol();
              else {   // F → ( E )
                  Expect(S_openPar);
                  E();
                  Expect(S_closePar);
              }
          }

          public void Expect(Symbols expectedSymbol) {
              if (currentSymbol == expectedSymbol)
                  nextSymbol();
              else
                  syntaxError();
          }

          // Reads the next symbol from the input file into currentSymbol
          public void nextSymbol() { … }
      } // End of Parser class
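The same EBNF-driven structure carries over directly to Python, one of the languages the exercises allow. The sketch below is a recognizer only (it builds no tree); the tokenizer, the class name `Parser`, and the symbol names 'Id', 'No' and '$' are assumptions made for illustration, and a syntax error simply raises `SyntaxError` rather than recovering.

```python
import re

def tokenize(text):
    """Map source text to symbols: 'Id', 'No', or the operator character itself."""
    symbols = []
    for lexeme in re.findall(r"\d+|[A-Za-z_]\w*|\S", text):
        if lexeme[0].isdigit():
            symbols.append("No")
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            symbols.append("Id")
        else:
            symbols.append(lexeme)
    return symbols + ["$"]            # end-of-input marker

class Parser:
    def __init__(self, text):
        self.symbols = tokenize(text)
        self.pos = 0

    @property
    def current(self):
        return self.symbols[self.pos]

    def next_symbol(self):
        self.pos += 1

    def expect(self, symbol):
        if self.current != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {self.current!r}")
        self.next_symbol()

    def parse(self):
        self.E()
        self.expect("$")              # the whole input must be consumed

    def E(self):                      # E -> T {+ T | - T}
        self.T()
        while self.current in ("+", "-"):
            self.next_symbol()
            self.T()

    def T(self):                      # T -> F {* F | / F}
        self.F()
        while self.current in ("*", "/"):
            self.next_symbol()
            self.F()

    def F(self):                      # F -> Id | No | ( E )
        if self.current in ("Id", "No"):
            self.next_symbol()
        else:
            self.expect("(")
            self.E()
            self.expect(")")

Parser("(b/c*3 - e*f)/2").parse()     # returns normally: the expression is accepted
```

Each EBNF repetition { ... } becomes a `while` loop whose condition tests the First set of the repeated part, exactly as in the Java-style code on the slides.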
  • 72. A Mini Pascal Compiler
      • Consider the mini-pascal grammar:
        ProgramX → Program id ; BlockBody .
        BlockBody → [ ConstantDefPart ] [ TypeDefPart ] [ VarDefPart ] { FunctionDef | ProcedureDef } CompoundStatement
        ConstantDefPart → Const ConstantDef { ConstantDef }
        ConstantDef → id = ( No | id ) ;
        TypeDefPart → Type TypeDef { TypeDef }
        TypeDef → id = ( integer | real | character )
        VarDefPart → Var VarDef { VarDef }
        VarDef → id : ( integer | real | character )
  • 73. A sample mini-pascal program
  • 74. Mini Pascal R.D. parser
      Begin
        init();        // Initializes the Mini-Pascal parser
        NextSymbol();  // Get a lookahead
        ProgramX();    // Call the start-symbol function
      End.

      public class Parser {
          public Symbols currentSymbol;

          Parser(String SourceFile) {
              init(SourceFile);   // Open Source and …
              NextSymbol();       // currentSymbol = next symbol;
              ProgramX();         // Call Start-symbol
          }
          …
      }
  • 75. Recursive descent parsers start by calling the procedure of the grammar's start symbol.
      /* ProgramX → Program id ; blockBody . */
      public void ProgramX() {
          Expect(S_Program);   // Expect the "program" keyword
          Expect(S_Id);        // Expect an identifier
          Expect(S_Semi);      // Expect a semicolon
          blockBody();         // Invoke blockBody()
          Expect(S_Dot);       // Expect a dot
      }
  • 76. Mini Pascal R.D. parser - 3
      /* blockBody → [ constantDefPart ] [ typeDefPart ] [ varDefPart ]
                     { functionDef | procedureDef } compoundStatement */
      public void blockBody() {
          if (currentSymbol == S_Const) constantDefPart();
          if (currentSymbol == S_Type) typeDefPart();
          if (currentSymbol == S_Var) varDefPart();
          while (currentSymbol == S_Procedure || currentSymbol == S_Function) {
              if (currentSymbol == S_Procedure) procedureDef();
              else functionDef();
          }
          compoundStatement();
      }
  • 77. Mini Pascal R.D. parser - 4
      /* constantDefPart → Const constantDef { constantDef } */
      public void constantDefPart() {
          Expect(S_Const);
          constantDef();
          // while currentSymbol in First(constantDef)
          while (currentSymbol == S_Id)
              constantDef();
      }

      /* constantDef → id = ( No | id ) ; */
      public void constantDef() {
          Expect(S_Id);
          Expect(S_Equal);
          if (currentSymbol == S_No)
              nextSymbol();
          else
              Expect(S_Id);
          Expect(S_Semicolon);
      }
  • 78. Error Recovery
      Error recovery is the process of acting on an error in order to reduce its negative effect.
      If the next symbol does not match the expected symbol, skip input symbols until the next expected symbol is observed.
  • 79. Error Recovery: definition
      Error recovery is the process of acting on an error in order to reduce its negative effect.
      Internally, error recovery works as follows:
        - The location of the syntax error is reported.
        - If possible, the tokens that would be a legal continuation of the program are reported.
        - The tokens that can serve to continue parsing are computed. A minimal sequence of tokens is skipped until one of these tokens is found.
  • 80. Error recovery
      • Consider the "Expect" method:
        public void Expect(Symbols expectedSymbol) {
            if (currentSymbol == expectedSymbol)
                nextSymbol();
            else
                syntaxError();
        }
      • We are going to complete the "syntaxError" method:
        public void syntaxError() {
            Console.WriteLine("Syntax Error");
            nextSymbol();   // Get the next look-ahead symbol
        }
  • 81. Motivating Example .1
      • Now, consider this grammar:
      • Consider the following code:
  • 82. Motivating Example .2
      /* compoundSt ::= begin Sts end */
      procedure compoundSt( )
      begin
        Expect(S_Begin);
        Sts();
        Expect(S_end);
      end;

      /* Sts ::= St ; Sts | ε */
      procedure Sts( )
      begin
        St( );
        Expect(S_semicolon);
        Sts();
      end;

      Look ahead: begin
  • 83. Motivating Example .3
      /* St ::= ifSt | whileSt | assSt | compoundSt */
      procedure St( )
      begin
        if (currentSymbol = s_if) then ifSt( )
        else if (currentSymbol = s_while) then whileSt( )
        else if (currentSymbol = s_id) then assSt( )
        else Expect(s_begin);
      end;

      Look ahead: begin
      Look ahead: jf
  • 84. Motivating Example .4
      /* assSt ::= id := E */
      procedure assSt( )
      begin
        nextSymbol( );
        Expect(s_assign);
        E()
      end;

      Input program:
        begin
          Jf i = 5 then i := i+1;
          while j< 5 di i := i*j;
        end

      public void Expect(Symbols expectedSymbol) {
          if (currentSymbol == expectedSymbol)
              nextSymbol();
          else
              syntaxError();
      }

      public void syntaxError() {
          Console.WriteLine("Syntax Error");
          nextSymbol();   // Get the next look-ahead symbol
      }

      Expected: S_id
      Look ahead: i
  • 85. Error Recovery: Approach
      Suppose the parser is expecting a nonterminal Yi in this production:
        X → Y1 Y2 … Yi … Yn
      In fact, the parser is expecting a terminal symbol s ∈ First(Yi).
      The error recovery works as follows:
        - Skip the next symbols, s, until arriving at a symbol s ∈ First(Yi+1 … Yn).
        - Or proceed by ignoring the next symbols, s, until arriving at a symbol s ∈ Stop(Yi),
          where ∀ i ∈ [1..n-1]: Stop(Yi) = ⋃ (j = i+1 .. n) First(Yj) + Stop(X)
        - Stop(Yn) = Stop(X)
        - Stop(Start Symbol) always includes the end-of-file marker, $.
  • 86. Stop set
      G1: St → ifSt | whileSt | assSt | compoundSt
        => Stop(St) = [s_eof], since St is the start symbol,
        => Stop(ifSt) = Stop(whileSt) = Stop(assSt) = Stop(compoundSt) = Stop(St) = [s_eof]
      compoundSt → begin Sts end
        => Stop(s_begin) = First(Sts) + [s_end] + Stop(compoundSt) = First(Sts) + [s_end, s_eof]
        => Stop(Sts) = [s_end] + Stop(compoundSt) = [s_eof, s_end]
        => Stop(s_end) = Stop(compoundSt) = [s_eof]
      Sts → St ; Sts | St
        => Stop(St) = [s_semicolon] + First(St) + Stop(Sts)
                    = [s_semicolon] + First(ifSt) + First(whileSt) + First(assSt) + First(compoundSt) + [s_eof, s_end]
  • 87. Error Recovery
      • The "Expect" method is modified as follows:
        public void Expect(Symbols expectedSymbol, HashSet<Symbols> Stop) {
            if (currentSymbol == expectedSymbol)
                nextSymbol();
            else
                syntaxError(Stop);
        }
      • We are going to complete the "syntaxError" method:
        public void syntaxError(HashSet<Symbols> Stop) {
            Console.WriteLine("Syntax Error");
            while (!Stop.Contains(currentSymbol))
                nextSymbol();
        }
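The stop-set scheme can be tried out on a tiny grammar. The following Python sketch assumes a simplified, hypothetical grammar (program → "program" id ";" block ".", block → "begin" "end") and pre-tokenized input; the class `RecoveringParser` and its method names are illustrative and not taken from the slides. Errors are collected in a list instead of being printed.

```python
class RecoveringParser:
    def __init__(self, symbols):
        self.symbols = list(symbols) + ["$"]   # "$" is the end-of-file marker
        self.pos = 0
        self.errors = []

    @property
    def current(self):
        return self.symbols[self.pos]

    def next_symbol(self):
        if self.current != "$":                # never move past the end-of-file marker
            self.pos += 1

    def syntax_error(self, expected, stop):
        self.errors.append(f"expected {expected!r}, found {self.current!r}")
        # Skip input until a symbol that can legally continue the parse is found.
        while self.current not in stop and self.current != "$":
            self.next_symbol()

    def expect(self, symbol, stop):
        if self.current == symbol:
            self.next_symbol()
        else:
            self.syntax_error(symbol, stop)

    # program -> "program" id ";" block "."
    def program(self, stop):
        first_block = {"begin"}
        self.expect("program", {"id", ";"} | first_block | {"."} | stop)
        self.expect("id", {";"} | first_block | {"."} | stop)
        self.expect(";", first_block | {"."} | stop)
        self.block({"."} | stop)
        self.expect(".", stop)

    # block -> "begin" "end"
    def block(self, stop):
        self.expect("begin", {"end"} | stop)
        self.expect("end", stop)

# The identifier after "program" is missing.
parser = RecoveringParser(["program", ";", "begin", "end", "."])
parser.program({"$"})
print(parser.errors)
```

Because ";" is in the stop set passed when the identifier is expected, nothing is skipped and the remaining input parses without further errors, so a single mistake yields a single message.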
  • 88. Error Recovery
      The Expect method is modified as follows:
      (* Expect compares the expected symbol with the current symbol *)
      Procedure Expect(ExpectedSymbol : Symbols; Stop : Set of Symbols);
      Begin
        if CurrentSymbol = ExpectedSymbol Then NextSymbol
        Else SyntaxError(Stop);
      End {Expect};

      Procedure SyntaxError(Stop : Set of Symbols);
      Begin
        promptMsg('Syntax error at', LineNo, ColNo);
        While not (CurrentSymbol in Stop) Do NextSymbol;
      End {SyntaxError};
  • 89. Error Recovery
      • The main body of the Mini-Pascal compiler is modified as follows:
        Program MiniPascalCompiler;
        Type Symbols = ( S_if, S_while, S_repeat, S_for, S_Case, S_then, S_else, S_do,
                         S_program, S_uses, S_interface, S_unit, S_begin, S_end,
                         S_label, S_const, S_type, S_var, S_procedure, S_function,
                         S_integer, S_real, S_char, S_array, S_record, S_pointer,
                         S_lt, S_gt, S_eq, S_le, S_ge, S_ne,
                         S_add, S_sub, S_or, S_mul, S_div, S_and,
                         S_id, S_no, S_not, S_comma, S_colon, S_semicolon, S_dot,
                         S_OpBracket, S_ClBracket, S_OpCurlyB, S_ClCurlyB, S_OpSquB, S_ClSquB );
        Var CurrentSymbol : Symbols;
  • 90. Error Recovery : MiniPascal
      Begin
        init( );             (* Initialize variables, open source and create target files *)
        nextSymbol( );       (* Detects and saves the first token in currentSymbol *)
        ProgramX([S_EOF]);   (* End of file marker is expected after the start symbol, ProgramX *)
      End.
  • 91. Error Recovery : MiniPascal
      (* ProgramX ::= Program id ';' BlockBody '.' *)
      Procedure ProgramX(Stop : Set of Symbols);
      Begin
        Expect(S_program, [S_id, S_Semicolon] + First(Blockbody) + [S_dot] + Stop);
        Expect(S_id, [S_Semicolon] + First(Blockbody) + [S_dot] + Stop);
        Expect(S_Semicolon, First(Blockbody) + [S_dot] + Stop);
        Blockbody([S_dot] + Stop);
        Expect(S_dot, Stop);
      End;
  • 92. Error Recovery : MiniPascal
      (* BlockBody → [ ConstantDefPart ] [ TypeDefPart ] [ VarDefPart ] { FunctionDef | ProcedureDef } CompoundStatement *)
      Procedure BlockBody(Stop : Set of Symbols);
      Begin
        if CurrentSymbol = S_const then
          ConstantDefPart(Stop + [S_Type, S_Var, S_Procedure, S_Function, S_Begin]);
        if CurrentSymbol = S_type then
          TypeDefPart(Stop + [S_Var, S_Procedure, S_Function, S_Begin]);
        if CurrentSymbol = S_var then
          VarDefPart(Stop + [S_Procedure, S_Function, S_Begin]);
  • 93. Example 1
      Convert this grammar into LL(1) form and write a recursive-descent parser for it:
        G1: S → aB | aC | dD
            D → Da | Db | d
            B → BC | b
            C → Cd | d
      ⇒
        G1: S → a (B | C) | dD      First(S) = {a, d}
            D → d { a | b }         First(D) = {d}
            B → b { C }             First(B) = {b}
            C → d { d }             First(C) = {d}
  • 94. Error Recovery : Example – 1.1
      Begin
        init();
        NextSymbol();
        S([S_EOF]);
      End.

      (* S → a (B | C) | dD *)
      Procedure S(Stop : Set of Symbols);
      Begin
        If CurrentSymbol = S_d Then
        Begin
          NextSymbol;
          D(Stop);
        End
        Else
        Begin
          Expect(S_a, [S_b, S_d] + Stop);
          If CurrentSymbol = S_b Then B(Stop)   (* S → a B *)
          Else C(Stop);                          (* S → a C *)
        End;
      End;
  • 95. Error Recovery : Example – 1.2
      (* D → d { a | b } *)
      Procedure D(Stop : Set of Symbols);
      Begin
        Expect(S_d, [S_a, S_b] + Stop);
        While (CurrentSymbol = S_a) Or (CurrentSymbol = S_b) Do
          If CurrentSymbol = S_a Then NextSymbol
          Else Expect(S_b, [S_a, S_b] + Stop);
      End;

      (* B → b { C }, which is b { d } because C → d { d } *)
      Procedure B(Stop : Set of Symbols);
      Begin
        Expect(S_b, [S_d] + Stop);
        While (CurrentSymbol = S_d) Do NextSymbol;
      End;

      (* C → d { d } *)
      Procedure C(Stop : Set of Symbols);
      Begin
        Expect(S_d, [S_d] + Stop);
        While (CurrentSymbol = S_d) Do NextSymbol;
      End;
  • 96. Example 2
      • Convert this grammar into LL(1) and write a recursive-descent parser for it:
        G2: S → Aa | Bd | Sc
            A → Aa | ε
            B → Bb | ε
      A → {a}   A is nullable
      B → {b}   B is nullable
      First(A) = {a, ε}   Follow(A) = {a}   First(A) ∩ Follow(A) = {a} ≠ ∅
      First(B) = {b, ε}   Follow(B) = {d}   First(B) ∩ Follow(B) = ∅
      S → {a} a | B d | S c
      ⇒ S → ( a {a} | {b} d ) { c }
      First(S) = {a, b, d}
  • 97. Example 2.1
      G2: S → Aa | Bd | Sc
          A → Aa | ε
          B → Bb | ε
      ⇒ G2: S → ( a {a} | {b} d ) { c }
      /* S → ( a {a} | {b} d ) { c } */
      public void S(HashSet<char> Stop) {
          HashSet<char> Follow = new HashSet<char>(Stop);
          Follow.Add('c');                       // a 'c' may follow the first part
          if (currentSymbol == 'a') {
              while (currentSymbol == 'a') nextSymbol();
          } else {
              while (currentSymbol == 'b') nextSymbol();
              Expect('d', Follow);
          }
          while (currentSymbol == 'c') nextSymbol();
      }
  • 98. Example 3
      Convert this grammar into LL(1) and write a recursive-descent parser for it:
        G3: Expression → SimpleExp { RelOp SimpleExp }
            RelOp → < | <= | = | <> | >= | > | IN
            SimpleExp → Term { ( '+' | '-' | Or ) Term }
            Term → Factor { ( '/' | '*' | DIV | AND ) Factor }
            Factor → Number | NOT Factor | '(' Expression ')' | Variable
            Variable → Identifier { '[' Dim ']' }
            Dim → Expression { ',' Expression }
  • 99. Example 3.1
      Expression → SimpleExp { RelOp SimpleExp }
        => First(Expression) = First(SimpleExp)
      RelOp → < | <= | = | <> | >= | > | IN
        => First(RelOp) = [<, <=, =, <>, >=, >, IN]
      SimpleExp → Term { ( '+' | '-' | Or ) Term }
        => First(SimpleExp) = First(Term)
      Term → Factor { ( '/' | '*' | DIV | AND ) Factor }
        => First(Term) = First(Factor)
      Factor → Number | NOT Factor | '(' Expression ')' | Variable
        => First(Factor) = [Number, Not, '('] + First(Variable)
      Variable → Identifier { '[' Dim ']' }
        => First(Variable) = [Identifier]
      Dim → Expression { ',' Expression }
        => First(Dim) = First(Expression)
      First(Expression) = First(SimpleExp) = First(Term) = First(Factor) = [Number, Not, '(', Identifier]
  • 100. Example 3.2
      Begin
        Init( );
        NextSymbol( );
        Expression([S_Eof]);
      End.

      (* Expression → SimpleExp { RelOp SimpleExp } *)
      procedure Expression(Stop : Set of Symbols);
      Begin
        FirstSetOfRelop := [ <, <=, =, <>, >=, >, IN ];
        FirstSimpleExp := [Number, Not, '(', Identifier];
        SimpleExp(Stop + FirstSetOfRelop);
        while CurrentSymbol in FirstSetOfRelop do
        Begin
          RelOp(Stop + FirstSimpleExp);
          SimpleExp(Stop + FirstSetOfRelop);
        End;
      End {Expression};
  • 101. Example 4
      • Convert this grammar into LL(1) form and write a recursive-descent parser for it:
        G4: S → L D
            L → id : | ε
            D → A | C | I | B
            A → id := no + id
            C → id ( )
            B → begin T end
            T → T ; S | S
            I → if ( id > no ) goto id
  • 102. Example 4.2
      S → L D             => First(S) = First(L)
      L → id : | ε        => First(L) = [id, ε], nullable
      S → L D             => Follow(L) = First(D)
      D → A | C | I | B   => First(D) = [id, if, begin]
                          => First(A) ∩ First(C) = [id] ≠ ∅ ⇒ Not LL(1)
                          => First(L) ∩ Follow(L) = [id] ≠ ∅ ⇒ Not LL(1)
      => D → id := no + id | id ( ) | I | B
      Left factoring:
        D → id G | I | B
        G → := no + id | ( )
      Null production elimination:
        S → id : D | id G | I | B
      Left factoring:
        S → id ( : D | G ) | I | B
  • 103. Example 4.3
      G4: S → id ( : D | G ) | I | B     => First(S) = [id, if, begin]
          D → id G | I | B               => First(D) = [id, if, begin]
          G → := no + id | ( )           => First(G) = [:=, ( ]
          B → begin T end                => First(B) = [begin]
          T → S { ; S }                  => First(T) = [id, if, begin]
          I → if ( id > no ) goto id     => First(I) = [if]
  • 104. Example 4.4
      // S → id ( : D | G ) | I | B      => First(S) = [id, if, begin]
      public void S(HashSet<String> Stop) {
          HashSet<String> First = new HashSet<String>();

      D → id G | I | B                => First(D) = [id, if, begin]
      G → := no + id | ( )            => First(G) = [:=, ( ]
      B → begin T end                 => First(B) = [begin]
      T → S { ; S }                   => First(T) = [id, if, begin]
      I → if ( id > no ) goto id      => First(I) = [if]
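One way to finish the recursive-descent parser for the LL(1) form of G4 is sketched below in Python. It is a sketch under assumptions rather than the slide's Java solution: the tokenizer, the class `G4Parser`, the exception `ParseError`, and the helper `accepts` are hypothetical names; `G` is taken as `:= no + id | ( )`, following `A → id := no + id`; and error recovery with stop sets is left out, so the first error aborts the parse.

```python
import re

KEYWORDS = {"if", "goto", "begin", "end"}


class ParseError(Exception):
    pass


def tokenize(text):
    """Map the input to token kinds: 'id', 'no', keywords, and punctuation.

    Characters the pattern does not match are ignored in this sketch.
    """
    kinds = []
    for lexeme in re.findall(r":=|[():;>+]|\w+", text):
        if lexeme.isdigit():
            kinds.append("no")
        elif lexeme[0].isalpha() and lexeme not in KEYWORDS:
            kinds.append("id")
        else:
            kinds.append(lexeme)          # keyword or punctuation
    return kinds + ["EOF"]


class G4Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def expect(self, kind):
        if self.current != kind:
            raise ParseError(f"expected {kind!r}, found {self.current!r}")
        self.pos += 1

    def parse(self):
        self.S()
        self.expect("EOF")

    def S(self):                          # S -> id ( ':' D | G ) | I | B
        if self.current == "id":
            self.expect("id")
            if self.current == ":":
                self.expect(":")
                self.D()
            else:
                self.G()
        else:
            self.I_or_B()

    def D(self):                          # D -> id G | I | B
        if self.current == "id":
            self.expect("id")
            self.G()
        else:
            self.I_or_B()

    def I_or_B(self):
        if self.current == "if":
            self.I()
        elif self.current == "begin":
            self.B()
        else:
            raise ParseError(f"unexpected {self.current!r}")

    def G(self):                          # G -> ':=' no '+' id | '(' ')'
        if self.current == ":=":
            for kind in (":=", "no", "+", "id"):
                self.expect(kind)
        else:
            self.expect("(")
            self.expect(")")

    def B(self):                          # B -> begin T end
        self.expect("begin")
        self.T()
        self.expect("end")

    def T(self):                          # T -> S { ';' S }
        self.S()
        while self.current == ";":
            self.expect(";")
            self.S()

    def I(self):                          # I -> if '(' id '>' no ')' goto id
        for kind in ("if", "(", "id", ">", "no", ")", "goto", "id"):
            self.expect(kind)


def accepts(text):
    try:
        G4Parser(tokenize(text)).parse()
        return True
    except ParseError:
        return False


print(accepts("lab: x := 1 + y"))
print(accepts("begin f ( ) ; if ( a > 3 ) goto lab end"))
```

Every decision is made from the current token alone, which is what the First sets above guarantee.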
  • 107. The third homework: insert your slides from this slide on.
  • 108. Exercise 1
      1. Convert this grammar to LL(1) in EBNF and write a recursive-descent parser in Python or C#.
      G1: S  ::= ε | A S
          A  ::= id := id
          A  ::= if id then A
          A  ::= if id then A' else A
          A' ::= id := id
          A' ::= if id then A' else A'
      Hints:
      1. A' could be removed.
      2. The grammar is not acceptable.
      3. If you substitute ε for S in the production S ::= A S, you will have S ::= { A }.
  • 109. Exercise 2
      Consider the grammar G12.
      a) Point out all aspects of grammar G12 that are not LL(1).
      b) Convert it into LL(1) in EBNF.
      c) Write the FIRST and FOLLOW sets for the new grammar.
      d) Write out the LL(1) recursive-descent parser.
      e) Do not forget error recovery.
  • 110. Exercise 3
      Consider the grammar for assignment statements:
          A → id := E
          E → E + T | E - T | T
          T → T * F | T / F | F
          F → Id | No | ( E )
      • Convert the grammar into LL(1), using EBNF.
      • Write out the recursive-descent parser in C#, C++, or Python.
      • Do not forget error recovery.
  • 112. Parse-Tree Listeners & Visitors
      • ANTLR provides support for two tree-walking mechanisms in its runtime library.
      • By default, ANTLR generates a parse-tree listener interface that responds to events triggered by the built-in tree walker.
      • The listeners receive notification of events like startDocument and endDocument.
      • ANTLR can also generate tree walkers that follow the visitor design pattern.
      • As the walker encounters the node for rule assign, for example, it triggers enterAssign() and passes it the AssignContext parse-tree node.
      • The beauty of the listener mechanism is that it's all automatic.
      • We don't have to write a parse-tree walker, and our listener methods don't have to explicitly visit their children.
  • 113. Parse-Tree Listeners
      • To walk a tree and trigger calls into a listener, ANTLR's runtime provides the class ParseTreeWalker.
      • ANTLR generates a ParseTreeListener subclass specific to each grammar, with enter and exit methods for each rule.
      • As the walker encounters the node for rule assign, for example, it triggers enterAssign() and passes it the AssignContext parse-tree node.
      • The beauty of the listener mechanism is that it's all automatic. We don't have to write a parse-tree walker, and our listener methods don't have to explicitly visit their children.
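To make the enter/exit callback order concrete, here is a small self-contained sketch of the listener idea. It does not use the ANTLR runtime and is not how ANTLR implements `ParseTreeWalker`; the classes `Node`, `Walker`, and `TraceListener` are hypothetical stand-ins that only imitate the behaviour described above.

```python
class Node:
    """Stand-in for a parse-tree node; `rule` is the grammar rule name."""

    def __init__(self, rule, children=()):
        self.rule = rule
        self.children = list(children)


class Walker:
    """Depth-first walk that fires enterX/exitX methods if the listener has them."""

    def walk(self, listener, node):
        suffix = node.rule[0].upper() + node.rule[1:]
        enter = getattr(listener, "enter" + suffix, None)
        if enter is not None:
            enter(node)
        for child in node.children:       # the walker visits the children,
            self.walk(listener, child)    # so the listener never has to
        leave = getattr(listener, "exit" + suffix, None)
        if leave is not None:
            leave(node)


class TraceListener:
    """Records the order in which the walker calls back."""

    def __init__(self):
        self.events = []

    def enterAssign(self, node):
        self.events.append("enterAssign")

    def exitAssign(self, node):
        self.events.append("exitAssign")

    def enterExpr(self, node):
        self.events.append("enterExpr")

    def exitExpr(self, node):
        self.events.append("exitExpr")


tree = Node("assign", [Node("expr")])
listener = TraceListener()
Walker().walk(listener, tree)
print(listener.events)
```

The listener methods contain no traversal code at all; the walker decides when each one runs.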
  • 114. Parse-Tree Listeners
      • The thick dashed line shows a depth-first walk of the parse tree.
      • The thin dashed lines indicate the method call sequence among the visitor methods.
  • 115. Build a language application
      • The first step in building a language application is to create a grammar that describes the language's syntactic rules (the set of valid sentences).
      • Run ANTLR (class org.antlr.v4.Tool) on the grammar file:
          antlr4 ArrayInit.g4   # Generate parser and lexer using the antlr4 alias
      • From the grammar ArrayInit.g4, ANTLR generates lots of files that we'd normally have to write by hand.
  • 116. Write syntactic and lexical rules
      starter/ArrayInit.g4

      /** Grammars always start with a grammar header. This grammar is called
       *  ArrayInit and must match the filename: ArrayInit.g4
       */
      grammar ArrayInit;

      /** A rule called init that matches comma-separated values between {...}. */
      init  : '{' value (',' value)* '}' ;  // must match at least one value

      /** A value can be either a nested array/struct or a simple integer (INT) */
      value : init
            | INT
            ;

      // parser rules start with lowercase letters, lexer rules with uppercase
      INT : [0-9]+ ;               // Define token INT as one or more digits
      WS  : [ \t\r\n]+ -> skip ;   // Define whitespace rule, toss it out
  • 117. Integrating a Generated Parser into a Java Program
  • 118. Run the program
      • The program generates LISP-like parse trees for a given input.
      • Here's how to compile everything and run Test:
          javac ArrayInit*.java Test.java
          java Test
      • Input:  {1,{2,3},4} followed by EOF
      • Output: (init { (value 1) , (value (init { (value 2) , (value 3) })) , (value 4) })
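To see where the LISP-like output comes from, the ArrayInit grammar can also be recognized by hand. The sketch below is a hypothetical hand-written recursive-descent equivalent in Python, not the code ANTLR generates; the class `ArrayInitRecognizer` and the function `to_string_tree` are illustrative names, and the tree is built directly as text.

```python
import re


def tokenize(text):
    """INT : [0-9]+ ; the punctuation { } , ; whitespace is skipped."""
    if re.sub(r"[0-9]+|[{},]|\s+", "", text):
        raise ValueError("input contains characters outside the grammar")
    return re.findall(r"[0-9]+|[{},]", text) + ["<EOF>"]


class ArrayInitRecognizer:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def match(self, expected):
        token = self.current
        if token != expected:
            raise ValueError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def init(self):
        # init : '{' value (',' value)* '}' ;
        parts = [self.match("{"), self.value()]
        while self.current == ",":
            parts.append(self.match(","))
            parts.append(self.value())
        parts.append(self.match("}"))
        return "(init " + " ".join(parts) + ")"

    def value(self):
        # value : init | INT ;
        if self.current == "{":
            return "(value " + self.init() + ")"
        if self.current.isdigit():
            token = self.current
            self.pos += 1
            return "(value " + token + ")"
        raise ValueError(f"unexpected {self.current!r}")


def to_string_tree(text):
    recognizer = ArrayInitRecognizer(text)
    tree = recognizer.init()
    recognizer.match("<EOF>")
    return tree


print(to_string_tree("{1,{2,3},4}"))
```

Each rule method returns the text for its own subtree, so the nesting of parentheses mirrors the nesting of rule invocations.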
  • 119. ANTLR 4 with Python3 Detailed Example
      • ANTLR4 introduced a handy listener-based API, but sometimes it's better not to use it.
      https://dzone.com/articles/antlr-4-with-python-2-detailed-example
  • 120. ANTLR 4 with Python3 Detailed Example
      • As before, we run ANTLR on the grammar to generate code:
          antlr4 -Dlanguage=Python3 arithmetic.g4
      • This generates a lexer, a parser, and a base class for a listener.
      • I'll give the main body of the code first:
          import antlr4
          from arithmeticLexer import arithmeticLexer    # generated by ANTLR
          from arithmeticParser import arithmeticParser  # generated by ANTLR

          def main():
              lexer = arithmeticLexer(antlr4.StdinStream())
              stream = antlr4.CommonTokenStream(lexer)
              parser = arithmeticParser(stream)
              tree = parser.expression()
              handleExpression(tree)

          if __name__ == '__main__':
              main()
      https://dzone.com/articles/antlr-4-with-python-2-detailed-example
  • 121. Iterate over the children
      • The ANTLR API provides us with the means to iterate over the children of a node.
      • We can walk through the children in order.
          def handleExpression(expr):
              adding = True
              value = 0
              for child in expr.getChildren():
                  if isinstance(child, antlr4.tree.Tree.TerminalNode):
                      # An operator: remember whether to add or subtract next
                      adding = child.getText() == "+"
                  else:
                      multValue = handleMultiply(child)
                      if adding:
                          value += multValue
                      else:
                          value -= multValue
              print("Parsed expression %s has value %s" % (expr.getText(), value))
  • 122. Iterate over the children
      • We iterate over the children; where we find a multiplying expression, we evaluate it.
      • Where we find an operator, we use it to set a flag indicating the next operation to perform.
          def handleMultiply(expr):
              multiplying = True
              value = 1
              for child in expr.getChildren():
                  if isinstance(child, antlr4.tree.Tree.TerminalNode):
                      # An operator: remember whether to multiply or divide next
                      multiplying = child.getText() == "*"
                  else:
                      if multiplying:
                          value *= int(child.getText())
                      else:
                          value /= int(child.getText())  # true division in Python 3

              return value
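The flag technique used by handleExpression and handleMultiply can be tried without generating a parser. The sketch below assumes a stand-in tree made of plain Python lists, where a string is an operator or a number and a nested list is a multiplying expression; `eval_expression` and `eval_multiply` are hypothetical names that only mirror the logic of the two functions above.

```python
def eval_multiply(children):
    """children: number and operator strings, e.g. ["8", "/", "4"]."""
    multiplying = True
    value = 1
    for child in children:
        if child in ("*", "/"):
            multiplying = child == "*"    # flag selects the next operation
        elif multiplying:
            value *= int(child)
        else:
            value /= int(child)           # true division, as in Python 3
    return value


def eval_expression(children):
    """children: multiplying expressions (lists) separated by "+" or "-"."""
    adding = True
    value = 0
    for child in children:
        if isinstance(child, str):        # an operator between terms
            adding = child == "+"
        else:
            term = eval_multiply(child)
            value = value + term if adding else value - term
    return value


# Stand-in for the parse tree of 2*3 + 8/4 - 1
tree = [["2", "*", "3"], "+", ["8", "/", "4"], "-", ["1"]]
print(eval_expression(tree))
```

Because each operator only sets a flag for the operand that follows it, the children are processed strictly left to right, which gives the usual left-associative result.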
  • 123. The place of IUST in the world
      https://www.researchgate.net/publication/328099969_Software_Fault_Localisation_A_Systematic_Mapping_Study