PARSING.ppt

TOP-DOWN PARSING
• The parse tree is constructed
– From the top
– From left to right
• Terminals are seen in order of appearance in the token stream:
t2 t5 t6 t8 t9

TOP-DOWN PARSING
Top-down parsers
• Recursive-Descent Parsing
– Backtracking is needed (if a choice of production rule does not work, we backtrack to try other alternatives).
– It is a general parsing technique, but not widely used.
– Not efficient.
• Predictive Parsing
– No backtracking.
– Efficient.
– Needs a special form of grammar (LL(1) grammars).
– Non-recursive (table-driven) predictive parsing is also known as LL(1) parsing.
– Recursive predictive parsing is a special form of recursive-descent parsing without backtracking.

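RECURSIVE-DESCENT PARSING (USES BACKTRACKING)
• Backtracking is needed.
• It tries to find the left-most derivation.
S → aBc
B → bc | b
Input: abc
• First attempt: S → aBc with B → bc. B consumes "bc", leaving nothing for the final c of S: fails, backtrack.
• Second attempt: S → aBc with B → b. B consumes "b" and the final c matches: success.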
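The two attempts on input abc can be sketched in C as follows. This is only an illustrative sketch for the grammar S → aBc, B → bc | b; the function names parse_B and parse_S and the convention of returning a position (or -1 on failure) are assumptions made for this example, not part of the slides.

```c
/* Try alternative `alt` of B at position `pos`.
   Returns the position after the match, or -1 if the alternative fails. */
int parse_B(const char *in, int pos, int alt)
{
    if (alt == 0) {                                  /* B -> bc */
        if (in[pos] == 'b' && in[pos + 1] == 'c')
            return pos + 2;
        return -1;
    }
    if (in[pos] == 'b')                              /* B -> b */
        return pos + 1;
    return -1;
}

/* S -> a B c. Returns 1 if the whole input is derived from S, else 0. */
int parse_S(const char *in)
{
    if (in[0] != 'a')
        return 0;
    for (int alt = 0; alt < 2; alt++) {
        int pos = parse_B(in, 1, alt);
        if (pos < 0)
            continue;                                /* this alternative of B failed */
        if (in[pos] == 'c' && in[pos + 1] == '\0')
            return 1;                                /* rest of S matched */
        /* otherwise: backtrack and try the next alternative of B */
    }
    return 0;
}
```

For the input abc, the first alternative B → bc leaves no c for S, so the loop backtracks and succeeds with B → b; parse_S("abc") returns 1.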

RECURSIVE DESCENT PARSING
• Consider the grammar
E → T + E | T
T → ( E ) | int | int * T
Input: int * int
• Start with the top-level non-terminal E
• Try the rules for E in order

RECURSIVE DESCENT PARSING. EXAMPLE (CONT.)
Try E → T + E
Then try a rule for T → ( E )
But ( does not match input token int.
Try T → int . Token matches.
But + after T does not match input token *
Try T → int * T
This will match but + after T will be unmatched
Has exhausted the choices for T
Backtrack to choose for another derivation of E

RECURSIVE DESCENT PARSING. EXAMPLE (CONT.)
Try E → T
Follow same steps as before for T
– And succeed with T → int * T and T → int
– With the following parse tree:
E
└─ T
   ├─ int
   ├─ *
   └─ T
      └─ int

RECURSIVE-DESCENT PARSING (BACKTRACKING PROBLEM)
• Consider the following productions
S → aAb
A → c | cd
Let the input string be acdb.
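A small C sketch of why this input needs backtracking: a parser that commits to the first alternative of A that matches (A → c) rejects acdb, while one that retries A → cd accepts it. The names A_alt, S_committed, and S_backtracking are hypothetical helpers introduced only for this illustration.

```c
/* Try alternative `alt` of A -> c | cd at position `pos`.
   Returns the position after the match, or -1 on failure. */
int A_alt(const char *in, int pos, int alt)
{
    if (in[pos] != 'c')
        return -1;
    if (alt == 0)
        return pos + 1;                              /* A -> c  */
    return in[pos + 1] == 'd' ? pos + 2 : -1;        /* A -> cd */
}

/* S -> a A b, keeping the first alternative of A that matches. */
int S_committed(const char *in)
{
    if (in[0] != 'a')
        return 0;
    int pos = -1;
    for (int alt = 0; alt < 2 && pos < 0; alt++)
        pos = A_alt(in, 1, alt);                     /* stop at the first match */
    return pos >= 0 && in[pos] == 'b' && in[pos + 1] == '\0';
}

/* S -> a A b, retrying the next alternative of A when the rest of S fails. */
int S_backtracking(const char *in)
{
    if (in[0] != 'a')
        return 0;
    for (int alt = 0; alt < 2; alt++) {
        int pos = A_alt(in, 1, alt);
        if (pos >= 0 && in[pos] == 'b' && in[pos + 1] == '\0')
            return 1;
    }
    return 0;
}
```

With the input acdb, S_committed returns 0 because A → c leaves d where b is expected, whereas S_backtracking returns 1 after retrying with A → cd.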

EXAMPLE 2
• Consider the following productions
S → BA | AB
A → a | SA
B → b | SB
w = abab
Parse w using recursive-descent parsing and identify the problem the recursive-descent parser runs into.

PREDICTIVE PARSER
• When rewriting a non-terminal in a derivation step, a predictive parser can uniquely choose a production rule by just looking at the current symbol in the input string.
A → α1 | ... | αn        input: ... a .......   (a is the current token)
• Unlike backtracking recursive descent, a predictive parser can “predict” which production to use.
– By looking at the next few tokens.
– No backtracking.

PREDICTIVE PARSER (EXAMPLE)
stmt → if ...... |
       while ...... |
       begin ...... |
       for .....
• When we are trying to rewrite the non-terminal stmt, if the current token is if, we have to choose the first production rule.
• When we are trying to rewrite the non-terminal stmt, we can uniquely choose the production rule by just looking at the current token.
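In code, this choice is typically a single switch on the current token. The sketch below assumes a hypothetical token enumeration (TOK_IF and so on) and returns the 1-based index of the stmt alternative to use; neither name comes from the slides.

```c
/* Hypothetical token kinds, introduced only for this illustration. */
enum token { TOK_IF, TOK_WHILE, TOK_BEGIN, TOK_FOR, TOK_OTHER };

/* Returns the 1-based index of the stmt alternative selected by the
   current token, or 0 if no alternative starts with it (syntax error). */
int choose_stmt_production(enum token current)
{
    switch (current) {
    case TOK_IF:    return 1;   /* stmt -> if ...    */
    case TOK_WHILE: return 2;   /* stmt -> while ... */
    case TOK_BEGIN: return 3;   /* stmt -> begin ... */
    case TOK_FOR:   return 4;   /* stmt -> for ...   */
    default:        return 0;   /* no production applies */
    }
}
```

Because each alternative starts with a different token, one token of lookahead is enough and no alternative is ever tried twice.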

CONSTRUCTING THE LL(1) PARSING TABLE

EXAMPLE
A → BC
B → DE
D → FG
F → HI
H → xY
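Each right-hand side starts with a non-terminal until H → xY is reached, so
First(A) = First(B) = First(D) = First(F) = First(H) = {x}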

TASK
Write the First and Follow sets for the following grammar:
S -> Ty
T -> AB
T -> sT
A -> aA
A -> λ
B -> bB
B -> λ

• Example 2.
Calculate the First and Follow functions for the given grammar:
S → aBDh
B → cC
C → bC / ∈
D → EF
E → g / ∈
F → f / ∈

• Solution-
The first and follow functions are as follows-
First Functions-
First(S) = { a }
First(B) = { c }
First(C) = { b , ∈ }
First(D) = { First(E) – ∈ } ∪ First(F) = { g , f , ∈ }
First(E) = { g , ∈ }
First(F) = { f , ∈ }
Follow Functions-
Follow(S) = { $ }
Follow(B) = { First(D) – ∈ } ∪ First(h) = { g , f , h }
Follow(C) = Follow(B) = { g , f , h }
Follow(D) = First(h) = { h }
Follow(E) = { First(F) – ∈ } ∪ Follow(D) = { f , h }
Follow(F) = Follow(D) = { h }
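The First sets above can be checked mechanically with a fixed-point iteration. The following C sketch makes several assumptions that are not in the slides: non-terminals are single upper-case letters, terminals are single lower-case letters, an empty right-hand side encodes the empty string, and each set is stored as a bitmask. The names EPS, TERM, struct prod, first_of_string, and compute_first are all invented for this example.

```c
#include <ctype.h>
#include <stddef.h>

#define EPS (1u << 26)                  /* bit 26 stands for the empty string */
#define TERM(c) (1u << ((c) - 'a'))     /* bit for a terminal 'a'..'z' */

struct prod { char lhs; const char *rhs; };   /* rhs "" encodes the empty string */

/* The grammar of Example 2. */
const struct prod EXAMPLE2[] = {
    {'S', "aBDh"}, {'B', "cC"}, {'C', "bC"}, {'C', ""},
    {'D', "EF"},   {'E', "g"},  {'E', ""},   {'F', "f"}, {'F', ""},
};
const size_t EXAMPLE2_LEN = sizeof EXAMPLE2 / sizeof EXAMPLE2[0];

/* First of the symbol string `s`, using the current sets in `first`. */
unsigned first_of_string(const char *s, const unsigned first[26])
{
    unsigned out = 0;
    for (; *s; s++) {
        if (islower((unsigned char)*s))
            return out | TERM(*s);      /* a terminal ends the scan */
        unsigned f = first[*s - 'A'];
        out |= f & ~EPS;
        if (!(f & EPS))
            return out;                 /* this non-terminal is not nullable */
    }
    return out | EPS;                   /* every symbol was nullable, or s is empty */
}

/* Repeat over all productions until no First set grows any further. */
void compute_first(const struct prod *g, size_t n, unsigned first[26])
{
    for (int i = 0; i < 26; i++)
        first[i] = 0;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (size_t i = 0; i < n; i++) {
            unsigned *set = &first[g[i].lhs - 'A'];
            unsigned add = first_of_string(g[i].rhs, first);
            if ((*set | add) != *set) {
                *set |= add;
                changed = 1;
            }
        }
    }
}
```

After calling compute_first(EXAMPLE2, EXAMPLE2_LEN, first), the entry first['D' - 'A'] holds the bits for g, f, and the empty string, in agreement with First(D) in the solution.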

• Calculate the First and Follow functions for the given grammar:
S → AaAb / BbBa
A → ∈
B → ∈

• Example 3 (# denotes the empty string).
E -> T R
R -> + T R | #
T -> F Y
Y -> * F Y | #
F -> ( E ) | i
Output:
First(E) = { (, i }
First(R) = { +, # }
First(T) = { (, i }
First(Y) = { *, # }
First(F) = { (, i }
Follow(E) = { $, ) }
Follow(R) = { $, ) }
Follow(T) = { +, $, ) }
Follow(Y) = { +, $, ) }
Follow(F) = { *, +, $, ) }

E → T X
X → + E
X → ε
T → int Y
T → ( E )
Y → * T
Y → ε

LL(1) GRAMMAR
Grammar 1:
1. Q -> aQbQ
2. Q -> bQaQ
3. Q -> Ɛ
Grammar 2:
1. S->ab
2. S->Ɛ
3. B->bC
4. B->Ɛ
5. C->cS
6. C->Ɛ

NON-RECURSIVE PREDICTIVE PARSING -- LL(1)
PARSER

• Non-recursive predictive parsing is table-driven.
• It is a top-down parser.
• It is also known as an LL(1) parser.
[Figure: block diagram of the non-recursive predictive parser with its input buffer, stack, and output.]

EXAMPLE PARSE TABLE CONSTRUCTION
S → B c | D B
B → a b | c S
D → d | ε
For this grammar:
• Construct the FIRST and FOLLOW sets
• Apply the algorithm to calculate the parse table

EXAMPLE PARSE TABLE CONSTRUCTION
X     FIRST(X)       FOLLOW(X)
---------------------------------------------------
D     { d, ε }       { a, c }
B     { a, c }       { c, $ }
S     { a, c, d }    { $, c }
Bc    { a, c }
DB    { d, a, c }
ab    { a }
cS    { c }
d     { d }
ε     { ε }

PARSE TABLE
      a          b     c          d     $
S     Bc, DB           Bc, DB     DB
B
D     ε                ε
Finish filling in the table.

LL(1) PARSER
input buffer
• The string to be parsed. We assume that its end is marked with a special symbol $.
stack
• Contains the grammar symbols.
• At the bottom of the stack there is a special end-marker symbol $.
• Initially the stack contains only the symbol $ and the starting symbol S ($S is the initial stack).
• When the stack is emptied (i.e. only $ is left in the stack), parsing is completed.

LL(1) PARSER
output
• A production rule representing a step of the (left-most) derivation of the string in the input buffer.
parsing table
• A two-dimensional array M[A,a].
• Each row is a non-terminal symbol.
• Each column is a terminal symbol or the special symbol $.
• Each entry holds a production rule.

LL(1) PARSER – PARSER ACTIONS
• The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are both $ → the parser halts (successful completion).
2. If X and a are the same terminal symbol (different from $) → the parser pops X from the stack and moves to the next symbol in the input buffer.
3. If X is a non-terminal → the parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X → Y1Y2...Yk, it pops X from the stack and pushes Yk, Yk-1, ..., Y1 onto the stack (so Y1 ends up on top). The parser also outputs the production rule X → Y1Y2...Yk to represent a step of the derivation.
4. None of the above → error.
– All empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is also an error case.

LL(1) PARSER
EXAMPLE TO PARSE ID+ID
stack        input     output
$E           id+id$    E → TE’
$E’T         id+id$    T → FT’
$E’T’F       id+id$    F → id
$E’T’id      id+id$
$E’T’        +id$      T’ → ε
$E’          +id$      E’ → +TE’
$E’T+        +id$
$E’T         id$       T → FT’
$E’T’F       id$       F → id
$E’T’id      id$
$E’T’        $         T’ → ε
$E’          $         E’ → ε
$            $         accept

Parsing table:
      id          +             $
E     E → TE’
E’                E’ → +TE’     E’ → ε
T     T → FT’
T’                T’ → ε        T’ → ε
F     F → id

LL(1) PARSER – ANOTHER EXAMPLE
S → aBa
B → bB | ε
w = abba
stack    input    output
$S       abba$    S → aBa
$aBa     abba$
$aB      bba$     B → bB
$aBb     bba$
$aB      ba$      B → bB
$aBb     ba$
$aB      a$       B → ε
$a       a$
$        $        accept, successful completion

LL(1) Parsing Table:
     a           b           $
S    S → aBa
B    B → ε       B → bB

LL(1) PARSER – ANOTHER EXAMPLE (CONT.)
Outputs: S → aBa   B → bB   B → bB   B → ε
Derivation (left-most): S ⇒ aBa ⇒ abBa ⇒ abbBa ⇒ abba
Parse tree:
S
├─ a
├─ B
│  ├─ b
│  └─ B
│     ├─ b
│     └─ B
│        └─ ε
└─ a
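The stack trace for w = abba can be reproduced with a small table-driven driver. The C sketch below hard-codes the parsing table for S → aBa, B → bB | ε. It assumes that grammar symbols are single characters (upper-case for non-terminals) and that the input is passed without the trailing $. The names table and ll1_parse are chosen for this example only.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Parsing table M[X,a] for S -> aBa, B -> bB | eps.
   Returns the right-hand side to push, or NULL for an empty (error) entry. */
const char *table(char nonterminal, char lookahead)
{
    if (nonterminal == 'S' && lookahead == 'a') return "aBa";
    if (nonterminal == 'B' && lookahead == 'b') return "bB";
    if (nonterminal == 'B' && lookahead == 'a') return "";   /* B -> eps */
    return NULL;
}

/* Returns 1 if `input` (given without the end marker) is accepted, else 0. */
int ll1_parse(const char *input)
{
    char stack[256];
    int top = 0;
    size_t n = strlen(input), ip = 0;

    stack[top++] = '$';                     /* initial stack: $S */
    stack[top++] = 'S';
    for (;;) {
        char X = stack[top - 1];
        char a = ip < n ? input[ip] : '$';  /* $ once the input is used up */
        if (X == '$')
            return ip == n;                 /* action 1: accept only at end of input */
        if (isupper((unsigned char)X)) {    /* action 3: expand a non-terminal */
            const char *rhs = table(X, a);
            if (rhs == NULL)
                return 0;                   /* empty table entry: error */
            size_t k = strlen(rhs);
            top--;                          /* pop X */
            if ((size_t)top + k > sizeof stack)
                return 0;
            for (size_t i = k; i > 0; i--)  /* push Yk ... Y1, so Y1 is on top */
                stack[top++] = rhs[i - 1];
            printf("%c -> %s\n", X, k ? rhs : "eps");
        } else if (X == a) {                /* action 2: match a terminal */
            top--;
            ip++;
        } else {
            return 0;                       /* action 4: terminal mismatch */
        }
    }
}
```

Calling ll1_parse("abba") returns 1 and prints the four productions in the same order as the output column of the trace.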

RECURSIVE DESCENT PREDICTIVE PARSING
Original grammar:
PROGRAM → begin DECLIST comma STATELIST end
DECLIST → d semi DECLIST
DECLIST → d
STATELIST → s semi STATELIST
STATELIST → s
After left factoring, the grammar is changed to
PROGRAM → begin DECLIST comma STATELIST end
DECLIST → dX
X → semi DECLIST | ε
STATELIST → sY
Y → semi STATELIST | ε

First(X) = {semi, ε}    Follow(X) = {comma}
First(Y) = {semi, ε}    Follow(Y) = {end}
Write a function for each non-terminal.

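/* token holds the current lookahead. lexical() returns the next token and
   error() reports a syntax error and does not return; both are assumed to
   be provided elsewhere. */
void PROGRAM(void);
void DECLIST(void);
void X(void);
void STATELIST(void);
void Y(void);

int main(void)
{
    token = lexical();              /* read the first token */
    PROGRAM();
    return 0;
}

/* PROGRAM → begin DECLIST comma STATELIST end */
void PROGRAM(void)
{
    if (token != begin) error();
    token = lexical();
    DECLIST();
    if (token != comma) error();
    token = lexical();
    STATELIST();
    if (token != end) error();
}

/* DECLIST → dX */
void DECLIST(void)
{
    if (token != d) error();
    token = lexical();
    X();
}

/* X → semi DECLIST | ε */
void X(void)
{
    if (token == semi)
    {
        token = lexical();
        DECLIST();
    }
    else if (token == comma)
        ;                           /* X → ε: comma is in Follow(X), do nothing */
    else
        error();
}

/* STATELIST → sY */
void STATELIST(void)
{
    if (token != s) error();
    token = lexical();
    Y();
}

/* Y → semi STATELIST | ε */
void Y(void)
{
    if (token == semi)
    {
        token = lexical();
        STATELIST();
    }
    else if (token == end)
        ;                           /* Y → ε: end is in Follow(Y), do nothing */
    else
        error();
}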

CHANGING RECURSION INTO ITERATION
Change the productions into an extended notation that includes the * (repetition) operator.
PROGRAM → begin DECLIST comma STATELIST end
DECLIST → dX
STATELIST → sY
becomes
PROGRAM → begin DECLIST comma STATELIST end
DECLIST → d (semi d)*
STATELIST → s (semi s)*

/* DECLIST → d (semi d)* */
void DECLIST()
{
    if (token != d) error();
    token = lexical();
    while (token == semi)           /* one iteration per "semi d" */
    {
        token = lexical();
        if (token != d) error();
        token = lexical();
    }
}

/* STATELIST → s (semi s)* */
void STATELIST()
{
    if (token != s) error();
    token = lexical();
    while (token == semi)           /* one iteration per "semi s" */
    {
        token = lexical();
        if (token != s) error();
        token = lexical();
    }
}

Removal of recursion is not always possible. A context-free grammar might contain middle recursion, which cannot be replaced by iteration. For example, in the grammar below the rule F → ‘(’ E ‘)’ is middle-recursive, because E appears between two terminals:
E → E ‘+’ T
E → T
T → T ‘*’ F
T → F
F → ‘(’ E ‘)’
F → ‘x’

Transforming the grammar into LL(1)
The left-recursive grammar
E → E ‘+’ T
E → T
T → T ‘*’ F
T → F
F → ‘(’ E ‘)’
F → ‘x’
becomes, after eliminating left recursion,
E → TX
X → ‘+’ TX | ε
T → FY
Y → ‘*’ FY | ε
F → ‘(’ E ‘)’ | ‘x’
Replacing recursion by iteration, where possible, we have
E → T (‘+’ T)*
T → F (‘*’ F)*
F → ‘(’ E ‘)’ | ‘x’

/* token, lexical() and error() are assumed to be provided elsewhere;
   error() is assumed not to return. */
void E(void);
void T(void);
void F(void);

/* E → T (‘+’ T)* */
void E(void)
{
    T();
    while (token == plus)
    {
        token = lexical();
        T();
    }
}

/* T → F (‘*’ F)* */
void T(void)
{
    F();
    while (token == times)
    {
        token = lexical();
        F();
    }
}

/* F → ‘(’ E ‘)’ | ‘x’ (the middle recursion stays recursive) */
void F(void)
{
    if (token == obracket)
    {
        token = lexical();
        E();
        if (token == cbracket)
            token = lexical();
        else
            error();
    }
    else if (token == x)
        token = lexical();
    else
        error();
}

int main(void)
{
    token = lexical();              /* read the first token */
    E();
    return 0;
}
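For experimenting without a separate lexer, the same three functions can be written over a plain character string. This is a minimal sketch under stated assumptions: every token is a single character (x, +, *, and the two parentheses), the current character plays the role of token, and an error sets a flag rather than calling error(). The names cur, ok, expr, term, factor, and accepts are introduced here for illustration.

```c
static const char *cur;   /* next unread character, i.e. the current token */
static int ok;            /* cleared when a syntax error is found */

static void expr(void);

/* F -> '(' E ')' | 'x' */
static void factor(void)
{
    if (*cur == '(') {
        cur++;
        expr();
        if (*cur == ')')
            cur++;
        else
            ok = 0;                 /* missing closing bracket */
    } else if (*cur == 'x') {
        cur++;
    } else {
        ok = 0;                     /* neither '(' nor 'x' */
    }
}

/* T -> F ('*' F)* */
static void term(void)
{
    factor();
    while (*cur == '*') {
        cur++;
        factor();
    }
}

/* E -> T ('+' T)* */
static void expr(void)
{
    term();
    while (*cur == '+') {
        cur++;
        term();
    }
}

/* Returns 1 if the whole string `s` is derived from E, else 0. */
int accepts(const char *s)
{
    cur = s;
    ok = 1;
    expr();
    return ok && *cur == '\0';
}
```

For example, accepts("(x+x)*x") returns 1, while accepts("x+") returns 0 because no factor follows the plus sign.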
