Bottom-Up Parsing
Rightmost derivation: E ⇒ T ⇒ T * F ⇒ T * id ⇒ F * id ⇒ id * id
Reduction: the same steps traced in reverse, from id * id back to E
E → E + T | T
T → T * F | F
F → ( E ) | id
Handle Pruning
Handle
A handle is a substring of grammar symbols in a
right-sentential form that matches a right-hand
side of a production
Right Sentential Form   Handle   Reducing Production
id1 * id2               id1      F → id
F * id2                 F        T → F
T * id2                 id2      F → id
T * F                   T * F    T → T * F
E → E + T | T
T → T * F | F
F → ( E ) | id
Handle Pruning
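A rightmost derivation in reverse can be obtained by “handle pruning”
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm … ⇒rm γn-1 ⇒rm γn = ω
Handle definition
If S ⇒*rm αAω ⇒rm αβω, then the production A → β, in the position following α, is a handle of αβω
[Figure: parse tree with root S and an interior node A above β, with α to the left and ω to the right. Caption: A handle A → β in the parse tree for αβω]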
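Example: Handle
Grammar
S → a A B e
A → A b c | b
B → d
Handle pruning (each step reduces a handle):
a b b c d e
a A b c d e
a A d e
a A B e
S
NOT a handle: in a A b c d e the second b matches the right-hand side of A → b, but reducing it is not handle pruning, because further reductions will fail (the result is not a sentential form):
a b b c d e
a A b c d e
a A A c d e
… ?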
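The handle example can be replayed mechanically. The following is a minimal Python sketch, assuming sentential forms are lists of symbols; the helper names `reduce_at` and `possible_reductions` are hypothetical and not part of the slides. It shows that pruning handles reaches S, while reducing the second b leads to a form in which no reduction sequence can succeed.

```python
# Grammar from the slide: S -> a A B e, A -> A b c | b, B -> d
PRODUCTIONS = [
    ("S", ["a", "A", "B", "e"]),
    ("A", ["A", "b", "c"]),
    ("A", ["b"]),
    ("B", ["d"]),
]

def reduce_at(form, pos, head, body):
    """Replace the occurrence of body at index pos of form by head."""
    if form[pos:pos + len(body)] != body:
        raise ValueError(f"{body} does not occur at position {pos} of {form}")
    return form[:pos] + [head] + form[pos + len(body):]

def possible_reductions(form):
    """All (pos, head, body) where a right-hand side occurs in form."""
    return [(i, head, body)
            for head, body in PRODUCTIONS
            for i in range(len(form) - len(body) + 1)
            if form[i:i + len(body)] == body]

sentence = "a b b c d e".split()

# Pruning the handle at every step ends in the start symbol.
good = reduce_at(sentence, 1, "A", ["b"])              # a A b c d e
good = reduce_at(good, 1, "A", ["A", "b", "c"])        # a A d e
good = reduce_at(good, 2, "B", ["d"])                  # a A B e
good = reduce_at(good, 0, "S", ["a", "A", "B", "e"])   # S

# The second b matches A -> b, but it is not a handle.
bad = reduce_at(sentence, 1, "A", ["b"])               # a A b c d e
bad = reduce_at(bad, 2, "A", ["b"])                    # a A A c d e
stuck = reduce_at(bad, 4, "B", ["d"])                  # a A A c B e

print(" ".join(good))
print(" ".join(stuck), "-> reductions left:", possible_reductions(stuck))
```

A substring that matches a right-hand side is only a candidate; it is a handle only if reducing it yields another right-sentential form.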
Shift-Reduce Parsing
Shift-reduce parsing is a form of bottom-up parsing
A stack holds grammar symbols
An input buffer holds the rest of the string to be parsed
Shift-reduce parser actions
shift
reduce
accept
error
Shift-Reduce Parsing
Shift
Shift the next input symbol onto the top of the stack.
Reduce
The right end of the string to be reduced must be at the top
of the stack.
Locate the left end of the string within the stack and decide
with what nonterminal to replace the string.
Accept
Announce successful completion of parsing
Error
Discover a syntax error and call recovery routine
Shift-Reduce Parsing
Stack        Input         Action
$            id1 * id2 $   shift
$ id1        * id2 $       reduce by F → id
$ F          * id2 $       reduce by T → F
$ T          * id2 $       shift
$ T *        id2 $         shift
$ T * id2    $             reduce by F → id
$ T * F      $             reduce by T → T * F
$ T          $             reduce by E → T
$ E          $             accept
E → E + T | T
T → T * F | F
F → ( E ) | id
Example: Shift-Reduce Parsing
Grammar:
S → a A B e
A → A b c | b
B → d
Shift-reduce corresponds to a rightmost derivation:
S ⇒rm a A B e
  ⇒rm a A d e
  ⇒rm a A b c d e
  ⇒rm a b b c d e
Reducing a sentence:
a b b c d e
a A b c d e
a A d e
a A B e
S
[Figure: four snapshots of the parse tree growing bottom-up over a b b c d e, one per reduction. At each step the reduced substring matches a production’s right-hand side.]
Conflicts During Shift-Reduce Parsing
Conflict types
shift-reduce
reduce-reduce
Shift-reduce and reduce-reduce conflicts are
caused by
The limitations of the LR parsing method (even
when the grammar is unambiguous)
Ambiguity of the grammar
Shift-Reduce Conflict
Ambiguous grammar:
S → if E then S
  | if E then S else S
  | other
Stack               Input       Action
$ …                 … $         …
$ … if E then S     else … $    shift or reduce?
Shift else onto if E then S, or reduce if E then S?
Resolve in favor of shift, so that else matches the closest if
LR
LR parsers are table-driven, much like the nonrecursive LL parsers.
Reasons for using LR parsing
LR parsing is the most general nonbacktracking shift-reduce parsing method known, yet it can be implemented as efficiently as other shift-reduce methods.
LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written.
An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods.
LR(0)
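An LR parser makes shift-reduce decisions by maintaining states that keep track of where it is in a parse.
An item of a grammar G is a production of G with a dot at some position of the body.
Example
The production A → X Y Z yields four items:
A → . X Y Z
A → X . Y Z
A → X Y . Z
A → X Y Z .
In the item A → X . Y Z, the part before the dot (X) is already on the stack; the part after the dot (Y Z) is what the parser next expects to derive from the input.
Note that the production A → ε has only one item, [A → •]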
LR(0)
Canonical LR(0) Collection
A collection of sets of LR(0) items
Provides the basis for constructing a DFA that is used to make parsing decisions.
This DFA is called the LR(0) automaton
Constructing the canonical LR(0) collection for a grammar requires
An augmented grammar
If G is a grammar with start symbol S, then G’ is the augmented grammar for G, with new start symbol S’ and new production S’ → S
The closure function
The goto function
Function Closure
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by two rules:
1. Add every item in I to closure(I).
2. If A → α . Bβ is in closure(I) and B → γ is a production, then add the item B → . γ to closure(I). Apply this rule until no more new items can be added to closure(I).
Divide all the sets of items into two classes
Kernel items
initial item S’ → . S, and all items whose dots are not at the left end.
Nonkernel items
All items with their dots at the left end, except for S’ → . S
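The two closure rules translate almost directly into a worklist loop. The following is a minimal Python sketch, assuming an item is represented as a (head, body, dot) tuple and using the augmented expression grammar from these slides; the representation and the names `GRAMMAR`, `closure`, and `show` are illustrative choices, not part of the slides.

```python
# Augmented expression grammar: E' -> E, E -> E + T | T, T -> T * F | F, F -> ( E ) | id
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """Closure of a set of LR(0) items (head, body, dot)."""
    result = set(items)              # rule 1: every item of I is in closure(I)
    worklist = list(items)
    while worklist:                  # rule 2, applied until nothing new appears
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result

def show(item):
    """Render an item with '.' marking the dot position."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"

I0 = closure({("E'", ("E",), 0)})

# Kernel items: the initial item, plus items whose dot is not at the left end.
kernel = {i for i in I0 if i[2] > 0 or i[0] == "E'"}
nonkernel = I0 - kernel

for line in sorted(show(i) for i in I0):
    print(line)
```

Only the kernel items need to be stored for a state, since the nonkernel items can be regenerated by running the closure again.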
Example
The grammar G
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
Let I = { E’ → . E } , then
closure(I) = {
E’ → . E
E → . E + T
E → . T
T → . T * F
T → . F
F → . ( E )
F → . id }
Exercise
The grammar G
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
Let I = { E → E + . T }
Function Goto
Function Goto(I, X)
I is a set of items
X is a grammar symbol
Goto(I, X) is defined to be the closure of the set of all items [A → αX • β] such that [A → α • Xβ] is in I.
The Goto function is used to define the transitions in the LR(0) automaton for a grammar.
Example
I ={
E’ → E .
E → E . + T }
Goto (I, +) = {
E → E + . T
T → . T * F
T → . F
F → . (E)
F → . id
}
The grammar G
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
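The Goto example can be checked with a short Python sketch. It assumes the same illustrative (head, body, dot) item representation as before and repeats the closure so that the block is self-contained; none of these names come from the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """Closure of a set of LR(0) items (head, body, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result

def goto(items, symbol):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(head, body, dot + 1)
             for head, body, dot in items
             if dot < len(body) and body[dot] == symbol}
    return closure(moved)

I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
J = goto(I, "+")
for head, body, dot in sorted(J):
    print(head, "->", " ".join(body[:dot] + (".",) + body[dot:]))
```

An empty result, as for `goto(I, "*")` here, means the automaton has no transition on that symbol from the state.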
Constructing the LR(0) Collection
1. The grammar is augmented with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → •S]}) }
(this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that GOTO(I, X) ∉ C and GOTO(I, X) ≠ ∅, add the set of items GOTO(I, X) to C
4. Repeat 3 until no more sets can be added to C
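The four steps can be sketched in Python as a loop over a growing list of states. This is an illustrative sketch for the expression grammar of these slides, assuming (head, body, dot) items; the order of `SYMBOLS` is chosen here so that the state numbers come out the same as in the parse traces on the following slides, and is otherwise arbitrary.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]  # N ∪ T, in a fixed order

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result

def goto(items, symbol):
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == symbol})

def canonical_collection():
    """Return (states, transitions) of the LR(0) automaton."""
    start = frozenset(closure({("E'", ("E",), 0)}))    # step 2
    states, index, transitions = [start], {start: 0}, {}
    for state in states:                               # list grows while we iterate
        for x in SYMBOLS:                              # step 3
            target = frozenset(goto(state, x))
            if not target:
                continue                               # GOTO(I, X) is empty
            if target not in index:                    # GOTO(I, X) not yet in C
                index[target] = len(states)
                states.append(target)
            transitions[(index[state], x)] = index[target]
    return states, transitions                         # step 4: loop ends when C is stable

states, transitions = canonical_collection()
print(len(states), "states")
```

Using frozensets lets each item set serve as a dictionary key, which makes the "already in C" test a constant-time lookup.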
Example: The Parse of id * id
Line  STACK       SYMBOLS    INPUT      ACTION
(1)   0           $          id * id $  shift to 5
(2)   0 5         $ id       * id $     reduce by F → id
(3)   0 3         $ F        * id $     reduce by T → F
(4)   0 2         $ T        * id $     shift to 7
(5)   0 2 7       $ T *      id $       shift to 5
(6)   0 2 7 5     $ T * id   $          reduce by F → id
(7)   0 2 7 10    $ T * F    $          reduce by T → T * F
(8)   0 2         $ T        $          reduce by E → T
(9)   0 1         $ E        $          accept
Structure of the LR Parsing Table
The parsing table consists of two parts:
A parsing-action function ACTION
A goto function GOTO
The entry ACTION[i, a], for state i and terminal a, has one of four forms:
Shift j, where j is a state.
The parser shifts input a onto the stack, but uses state j to represent a.
Reduce A → β.
The parser reduces β on the top of the stack to head A.
Accept
The parser accepts the input and finishes parsing.
Error
Structure of the LR Parsing Table (1)
The GOTO function, GOTO[Ii, A], is defined on sets of items.
If GOTO[Ii, A] = Ij, then GOTO also maps state i and nonterminal A to state j.
LR-Parser Configurations
Configuration (= LR parser state):
($ s0 s1 s2 … sm, ai ai+1 … an $), a pair of stack contents and remaining input
It represents the right-sentential form X1 X2 … Xm ai ai+1 … an, where Xk is the grammar symbol represented by state sk
If action[sm, ai] = shift s then push s (representing ai) and advance the input:
(s0 s1 s2 … sm, ai ai+1 … an $) ⊢ (s0 s1 s2 … sm s, ai+1 … an $)
If action[sm, ai] = reduce A → β and goto[sm-r, A] = s with r = |β| then pop r symbols and push s (representing A):
(s0 s1 s2 … sm, ai ai+1 … an $) ⊢ (s0 s1 s2 … sm-r s, ai ai+1 … an $)
If action[sm, ai] = accept then stop
If action[sm, ai] = error then attempt recovery
Example LR Parse Table
Grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id
         action                          goto
state    id   +    *    (    )    $      E   T   F
0        s5             s4               1   2   3
1             s6                  acc
2             r2   s7        r2   r2
3             r4   r4        r4   r4
4        s5             s4               8   2   3
5             r6   r6        r6   r6
6        s5             s4                   9   3
7        s5             s4                       10
8             s6             s11
9             r1   s7        r1   r1
10            r3   r3        r3   r3
11            r5   r5        r5   r5
Legend: s5 = shift and go to state 5; r1 = reduce by production #1
Example
Grammar
0. S → E
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id
Line  STACK         SYMBOLS    INPUT             ACTION
(1)   0                        id * id + id $    shift 5
(2)   0 5           id         * id + id $       reduce 6 goto 3
(3)   0 3           F          * id + id $       reduce 4 goto 2
(4)   0 2           T          * id + id $       shift 7
(5)   0 2 7         T *        id + id $         shift 5
(6)   0 2 7 5       T * id     + id $            reduce 6 goto 10
(7)   0 2 7 10      T * F      + id $            reduce 3 goto 2
(8)   0 2           T          + id $            reduce 2 goto 1
(9)   0 1           E          + id $            shift 6
(10)  0 1 6         E +        id $              shift 5
(11)  0 1 6 5       E + id     $                 reduce 6 goto 3
(12)  0 1 6 3       E + F      $                 reduce 4 goto 9
(13)  0 1 6 9       E + T      $                 reduce 1 goto 1
(14)  0 1           E          $                 accept
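The trace for id * id + id can be reproduced with a small table-driven driver. The following Python sketch encodes the ACTION and GOTO tables for the expression grammar as dictionaries; the encoding (strings such as "s5" and "r6", and a `PRODUCTIONS` map from production number to head and body length) is an assumption made for illustration.

```python
# Production number -> (head, length of body), for grammar rules 1..6
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def row(**cells):
    return dict(cells)

SHIFT_ID_LP = {"id": "s5", "(": "s4"}
ACTION = {
    0: SHIFT_ID_LP,
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {t: "r4" for t in "+*)$"},
    4: SHIFT_ID_LP,
    5: {t: "r6" for t in "+*)$"},
    6: SHIFT_ID_LP,
    7: SHIFT_ID_LP,
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {t: "r3" for t in "+*)$"},
    11: {t: "r5" for t in "+*)$"},
}
GOTO = {0: row(E=1, T=2, F=3), 4: row(E=8, T=2, F=3),
        6: row(T=9, F=3), 7: row(F=10)}

def parse(tokens):
    """Run the LR driver and return the list of actions taken."""
    stack, pos, trace = [0], 0, []
    tokens = tokens + ["$"]
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                       # empty table entry: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            trace.append("accept")
            return trace
        if act[0] == "s":                     # shift: push state, advance input
            stack.append(int(act[1:]))
            pos += 1
            trace.append(f"shift {act[1:]}")
        else:                                 # reduce: pop |body| states, then goto
            number = int(act[1:])
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            trace.append(f"reduce {number} goto {stack[-1]}")

for step in parse(["id", "*", "id", "+", "id"]):
    print(step)
```

The stack holds only states; the SYMBOLS column of the trace is implied by them, because each state is entered on exactly one grammar symbol.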
SLR Grammars
SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing
SLR eliminates some conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A)
Grammar:
S → E
E → id + E
E → id
State I0:
S → •E
E → •id + E
E → •id
State I2 = goto(I0, id):
E → id • + E
E → id •
In state I2: shift on + (to I3 = goto(I2, +)); FOLLOW(E) = {$}, thus reduce E → id only on $
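The claim FOLLOW(E) = {$} can be checked with a fixed-point computation. The following Python sketch computes FIRST and FOLLOW for this small grammar; it assumes a grammar without ε-productions, which keeps both functions short, and the names used are illustrative.

```python
# S -> E, E -> id + E | id   (no ε-productions, which this sketch relies on)
GRAMMAR = [("S", ["E"]), ("E", ["id", "+", "E"]), ("E", ["id"])]
NONTERMINALS = {head for head, _ in GRAMMAR}

def first_sets():
    """FIRST for each nonterminal; only the first body symbol matters without ε."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first

def follow_sets(start="S"):
    """FOLLOW for each nonterminal, with $ following the start symbol."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, x in enumerate(body):
                if x not in NONTERMINALS:
                    continue
                if i + 1 < len(body):          # something follows x in the body
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                          # x ends the body: inherit FOLLOW(head)
                    new = follow[head]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow

follow = follow_sets()
print(follow["E"])
```

Because + is not in FOLLOW(E), the SLR table for state I2 holds a shift on + and a reduce on $ in different cells, so the LR(0) shift-reduce conflict disappears.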
SLR Parsing Table
Reductions do not fill entire rows
Otherwise the same as LR(0)
Grammar:
1. S → E
2. E → id + E
3. E → id
state    id   +    $      E
0        s2               1
1                  acc
2             s3   r3
3        s2               4
4                  r2
In state 2: shift on +; FOLLOW(E) = {$}, thus reduce on $
Constructing SLR Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = {I0, I1, …, In} of sets of LR(0) items
3. State i is constructed from Ii:
If [A → α•aβ] ∈ Ii and goto(Ii, a) = Ij then set action[i, a] = shift j
If [A → α•] ∈ Ii then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S’)
If [S’ → S•] is in Ii then set action[i, $] = accept
4. If goto(Ii, A) = Ij then set goto[i, A] = j (this fills the goto table)
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S]
Example SLR Grammar and LR(0) Items
Augmented grammar:
0. C’ → C
1. C → A B
2. A → a
3. B → a
State I0 (start) = closure({[C’ → •C]}):
C’ → •C
C → •A B
A → •a
State I1 (final) = goto(I0, C) = closure({[C’ → C•]}):
C’ → C•
State I2 = goto(I0, A):
C → A•B
B → •a
State I3 = goto(I0, a):
A → a•
State I4 = goto(I2, B):
C → A B•
State I5 = goto(I2, a):
B → a•
Example SLR Parsing Table
Grammar:
0. C’ → C
1. C → A B
2. A → a
3. B → a
state    a    $      C   A   B
0        s3          1   2
1             acc
2        s5                  4
3        r2
4             r1
5             r3
DFA transitions (state 0 is the start state): goto(0, C) = 1, goto(0, A) = 2, goto(0, a) = 3, goto(2, B) = 4, goto(2, a) = 5
SLR and Ambiguity
Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR (it may still be LR(1))
Consider for example the unambiguous grammar
S → L = R | R
L → * R | id
R → L
FOLLOW(R) = {=, $}
I0:
S’ → •S
S → •L=R
S → •R
L → •*R
L → •id
R → •L
I1:
S’ → S•
I2:
S → L•=R
R → L•
I3:
S → R•
I4:
L → *•R
R → •L
L → •*R
L → •id
I5:
L → id•
I6:
S → L=•R
R → •L
L → •*R
L → •id
I7:
L → *R•
I8:
R → L•
I9:
S → L=R•
Conflict in state 2:
action[2,=] = s6 (shift, from S → L•=R)
action[2,=] = r5 (reduce by R → L, because = is in FOLLOW(R))
The grammar has no SLR parsing table
SLR Versus LR(1)
Split the SLR states by adding LR(1) lookahead
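Unambiguous grammar
S → L = R | R
L → * R | id
R → L
I2:
S → L•=R
R → L•
action[2,=] = s6
Should not reduce on =, because no right-sentential form begins with R =
[Figure: state I2 split into its two items, R → L• and S → L•=R, which the LR(1) lookaheads tell apart]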
LR(1) Items
An LR(1) item
[A → α•β, a]
contains a lookahead terminal a, meaning: α is already on top of the stack, expect to see βa
For items of the form
[A → α•, a]
the lookahead a is used to reduce A → α only if the next input is a
For items of the form
[A → α•β, a]
with β ≠ ε the lookahead has no effect
The Closure Operation for LR(1) Items
1. Start with closure(I) = I
2. If [A → α•Bβ, a] ∈ closure(I) then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → •γ, b] to closure(I) if not already there
3. Repeat 2 until no new items can be added
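The LR(1) closure differs from the LR(0) closure only in how the lookahead b is chosen. The following Python sketch uses the grammar S’ → S, S → C C, C → c C | d that appears on the next slides, with items represented as (head, body, dot, lookahead) tuples; the representation is illustrative, and the `first` helper assumes a grammar without ε-productions.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}

def first(symbols):
    """FIRST of a nonempty symbol string, assuming no ε-productions."""
    result, seen, stack = set(), set(), [symbols[0]]
    while stack:
        x = stack.pop()
        if x not in GRAMMAR:           # terminal (or the end marker $)
            result.add(x)
        elif x not in seen:            # nonterminal: look at its bodies once
            seen.add(x)
            stack.extend(body[0] for body in GRAMMAR[x])
    return result

def closure(items):
    """Closure of a set of LR(1) items (head, body, dot, lookahead)."""
    result, worklist = set(items), list(items)       # step 1
    while worklist:                                  # step 3
        head, body, dot, a = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:  # item [A -> α•Bβ, a]
            B, beta = body[dot], body[dot + 1:]
            for b in first(beta + (a,)):             # b in FIRST(βa)
                for gamma in GRAMMAR[B]:
                    item = (B, gamma, 0, b)
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return result

I0 = closure({("S'", ("S",), 0, "$")})
for head, body, dot, a in sorted(I0):
    print(f"[{head} -> {' '.join(body[:dot] + ('•',) + body[dot:])}, {a}]")
```

Items that differ only in their lookahead are stored separately here; the slides abbreviate such pairs with the c/d notation.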
The Goto Operation for LR(1) Items
1. For each item [A → α•Xβ, a] ∈ I, add the set of items closure({[A → αX•β, a]}) to goto(I,X) if not already there
2. Repeat step 1 until no more items can be added to goto(I,X)
Example
Let I={ (S’ → •S, $) }
I0 = closure(I) = {
S’ → •S, $
S → • C C, $
C → •c C, c/d
C → •d, c/d
}
goto(I0, S) = closure( {S’ → S •, $ } )
= {S’ → S •, $ } = I1
The grammar G
S’ → S
S → C C
C → c C | d
Exercise
Let I={ (S → C •C, $) }
I2 = closure(I) = ?
I3 = goto(I2, c) = ?
The grammar G
S’ → S
S → C C
C → c C | d
Construction of the Sets of LR(1) Items
1. Augment the grammar with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → •S, $]}) }
(this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C
Construction of the Canonical LR(1) Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = {I0, I1, …, In} of sets of LR(1) items
3. State i of the parser is constructed from Ii:
If [A → α•aβ, b] ∈ Ii and goto(Ii, a) = Ij then set action[i,a] = shift j
If [A → α•, a] ∈ Ii then set action[i,a] = reduce A → α (apply only if A ≠ S’)
If [S’ → S•, $] is in Ii then set action[i,$] = accept
4. If goto(Ii, A) = Ij then set goto[i,A] = j
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S, $]
Example
The grammar G
0. S’ → S
1. S → C C
2. C → c C
3. C → d
         ACTION           GOTO
state    c    d    $      S   C
0        s3   s4          1   2
1                  acc
2        s6   s7              5
3        s3   s4              8
4        r3   r3
5                  r1
6        s6   s7              9
7                  r3
8        r2   r2
9                  r2
Example Grammar and LR(1) Items
Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
Augment with S’ → S
LR(1) items (next slide)
Example LR(1) Parsing Table
Grammar:
1. S’ → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L
state    id   *    =    $      S   L   R
0        s5   s4               1   2   3
1                       acc
2                  s6   r6
3                       r3
4        s5   s4                   8   7
5                  r5   r5
6        s12  s11                  10  9
7                  r4   r4
8                  r6   r6
9                       r2
10                      r6
11       s12  s11                  10  13
12                      r5
13                      r4
LALR(1) Grammars
LR(1) parsing tables have many states
LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size
Less powerful than LR(1)
Will not introduce shift-reduce conflicts, because shifts do not use lookaheads
May introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages
SLR and LALR tables for a grammar always have the same number of states, typically far fewer than the LR(1) table.
For a language like C: SLR and LALR > 100 states, LR(1) > 1000 states
Constructing LALR Parsing Tables
Two ways
1. Construction of the LALR parsing table from the sets of LR(1) items.
Merge (union) the states that have the same core
Requires much space and time
2. Construction of the LALR parsing table from the sets of LR(0) items
Efficient
Used in practice.
Example
LALR(1)
         ACTION           GOTO
state    c    d    $      S   C
0        s36  s47         1   2
1                  acc
2        s36  s47             5
36       s36  s47             89
47       r3   r3   r3
5                  r1
89       r2   r2   r2
LR(1)
         ACTION           GOTO
state    c    d    $      S   C
0        s3   s4          1   2
1                  acc
2        s6   s7              5
3        s3   s4              8
4        r3   r3
5                  r1
6        s6   s7              9
7                  r3
8        r2   r2
9                  r2
Constructing LALR(1) Parsing Tables
1. Construct sets of LR(1) items
2. Combine LR(1) sets with sets of items that share the same first part (the same core)
I4:
[L → *•R, =]
[R → •L, =]
[L → •*R, =]
[L → •id, =]
I11:
[L → *•R, $]
[R → •L, $]
[L → •*R, $]
[L → •id, $]
Merged set:
[L → *•R, =/$]
[R → •L, =/$]
[L → •*R, =/$]
[L → •id, =/$]
(=/$ is shorthand for two items in the same set)
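Merging by core can be sketched in a few lines of Python. The sketch assumes LR(1) items are (head, body, dot, lookahead) tuples and uses the two item sets I4 and I11 exactly as they are listed on this slide; `core` and `merge_by_core` are illustrative names.

```python
from collections import defaultdict

def core(state):
    """The first parts of the items: each item without its lookahead."""
    return frozenset((head, body, dot) for head, body, dot, _ in state)

def merge_by_core(states):
    """Union all LR(1) item sets that share the same core."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return list(groups.values())

def item_set(lookahead):
    """The four items of I4 / I11 with the given lookahead."""
    return {
        ("L", ("*", "R"), 1, lookahead),
        ("R", ("L",), 0, lookahead),
        ("L", ("*", "R"), 0, lookahead),
        ("L", ("id",), 0, lookahead),
    }

I4, I11 = item_set("="), item_set("$")
merged = merge_by_core([I4, I11])

# Group the lookaheads per core item to recover the =/$ shorthand.
lookaheads = defaultdict(set)
for head, body, dot, a in merged[0]:
    lookaheads[(head, body, dot)].add(a)
for (head, body, dot), las in sorted(lookaheads.items()):
    print(f"[{head} -> {' '.join(body[:dot] + ('•',) + body[dot:])}, {'/'.join(sorted(las))}]")
```

The merged state has the same core as its parts, so shift actions are unaffected; only the reduce lookaheads grow, which is why merging can introduce reduce-reduce conflicts but not shift-reduce conflicts.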
Example LALR(1) Grammar
Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
Augment with S’ → S
LALR(1) items (next slide)
Example LALR(1) Parsing Table
Grammar:
1. S’ → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L
state    id   *    =    $      S   L   R
0        s5   s4               1   2   3
1                       acc
2                  s6   r6
3                       r3
4        s5   s4                   9   7
5                  r5   r5
6        s5   s4                   9   8
7                  r4   r4
8                       r2
9                  r6   r6
LL, SLR, LR, LALR Summary
LL parse tables are computed using FIRST/FOLLOW
Nonterminals × terminals → productions
LR parsing tables are computed using closure/goto
LR states × terminals → shift/reduce actions
LR states × nonterminals → goto state transitions
A grammar is
LL(1) if its LL(1) parse table has no conflicts
SLR if its SLR parse table has no conflicts
LR(1) if its LR(1) parse table has no conflicts
LALR(1) if its LALR(1) parse table has no conflicts