Chapter 4-2
Chang Chi-Chung
2007.06.07
Bottom-Up Parsing
- LR methods (Left-to-right scan, Rightmost derivation in reverse)
  - LR(0), SLR, Canonical LR = LR(1), LALR
- Other special cases:
  - Shift-reduce parsing
  - Operator-precedence parsing
Operator-Precedence Parsing
- Special case of shift-reduce parsing
- See textbook section 4.6
Bottom-Up Parsing
E → T → T * F → T * id → F * id → id * id
rightmost derivation
reduction
E → E + T | T
T → T * F | F
F → ( E ) | id
Handle Pruning
- Handle
  - A handle is a substring of grammar symbols in a right-sentential form that matches the right-hand side of a production

Right Sentential Form   Handle   Reducing Production
id1 * id2               id1      F → id
F * id2                 F        T → F
T * id2                 id2      F → id
T * F                   T * F    T → T * F
E → E + T | T
T → T * F | F
F → ( E ) | id
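Handle Pruning
- A rightmost derivation in reverse can be obtained by “handle pruning”
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm … ⇒rm γn-1 ⇒rm γn = ω
- Handle definition
  - If S ⇒*rm αAω ⇒rm αβω, then the production A → β in the position following α is a handle of αβω
[Figure: a handle A → β in the parse tree for αβω; S is the root, A derives β, with α to its left and ω to its right]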
Example: Handle
Grammar:
S → a A B e
A → A b c | b
B → d

Handle pruning that succeeds:
a b b c d e
a A b c d e
a A d e
a A B e
S

Reducing the wrong substring:
a b b c d e
a A b c d e
a A A c d e
… ?
Here the second b is NOT a handle, because further reductions will fail (the result is not a sentential form).
Shift-Reduce Parsing
- Shift-reduce parsing is a form of bottom-up parsing
- A stack holds grammar symbols
- An input buffer holds the rest of the string to be parsed.
- Shift-reduce parser actions
  - shift
  - reduce
  - accept
  - error
Shift-Reduce Parsing
 Shift
 Shift the next input symbol onto the top of the stack.
 Reduce
 The right end of the string to be reduced must be at the top
of the stack.
 Locate the left end of the string within the stack and decide
with what nonterminal to replace the string.
 Accept
 Announce successful completion of parsing
 Error
 Discover a syntax error and call recovery routine
Shift-Reduce Parsing
Stack        Input         Action
$            id1 * id2 $   shift
$ id1        * id2 $       reduce by F → id
$ F          * id2 $       reduce by T → F
$ T          * id2 $       shift
$ T *        id2 $         shift
$ T * id2    $             reduce by F → id
$ T * F      $             reduce by T → T * F
$ T          $             reduce by E → T
$ E          $             accept
E → E + T | T
T → T * F | F
F → ( E ) | id
Example: Shift-Reduce Parsing
Grammar:
S → a A B e
A → A b c | b
B → d

Shift-reduce parsing corresponds to a rightmost derivation in reverse:
S ⇒rm a A B e
  ⇒rm a A d e
  ⇒rm a A b c d e
  ⇒rm a b b c d e

Reducing a sentence:
a b b c d e
a A b c d e
a A d e
a A B e
S

[Figure: the parse tree for a b b c d e grown bottom-up in four steps; at each step the reduced substring matches a production’s right-hand side]
Conflicts During Shift-Reduce Parsing
- Conflict types
  - shift-reduce
  - reduce-reduce
- Shift-reduce and reduce-reduce conflicts are caused by
  - The limitations of the LR parsing method (even when the grammar is unambiguous)
  - Ambiguity of the grammar
Shift-Reduce Conflict
Grammar:
E → E + E
E → E * E
E → ( E )
E → id

Find handles to be reduced:
Stack       Input        Action
$           id+id*id$    shift
$id         +id*id$      reduce E → id
$E          +id*id$      shift
$E+         id*id$       shift
$E+id       *id$         reduce E → id
$E+E        *id$         shift (or reduce?)
$E+E*       id$          shift
$E+E*id     $            reduce E → id
$E+E*E      $            reduce E → E * E
$E+E        $            reduce E → E + E
$E          $            accept

How to resolve conflicts?
Shift-Reduce Conflict
Ambiguous grammar:
S → if E then S
  | if E then S else S
  | other

Stack               Input      Action
$ …                 … $        …
$ … if E then S     else … $   shift or reduce?

The parser can either shift else onto if E then S, or reduce if E then S.
Resolve in favor of shift, so that else matches the closest if.
Reduce-Reduce Conflict
Grammar:
C → A B
A → a
B → a

Stack   Input   Action
$       aa$     shift
$a      a$      reduce A → a or B → a ?

Resolve in favor of reduce A → a, otherwise we’re stuck (unable to move)!
LR
- LR parsers are table-driven
  - Much like the nonrecursive LL parsers.
- Reasons for using LR parsing
  - LR parsing is the most general nonbacktracking shift-reduce parsing method known, yet it can be implemented as efficiently as other shift-reduce methods.
  - LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written.
  - An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
  - The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods.
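LR(0)
- An LR parser makes shift-reduce decisions by maintaining states to keep track of where it is in a parse.
- An item of a grammar G is a production of G with a dot at some position of the body.
- Example: the production A → X Y Z yields four items
A → . X Y Z
A → X . Y Z
A → X Y . Z
A → X Y Z .
- In the item A → X . Y Z, the part before the dot (X) is already on the stack, and the part after the dot (Y Z) is what the parser next expects to derive from the input.
- Note that the production A → ε has one item [A → •]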
LR(0)
- Canonical LR(0) collection
  - One collection of sets of LR(0) items
  - Provides the basis for constructing a DFA that is used to make parsing decisions.
  - LR(0) automaton
- The canonical LR(0) collection for a grammar is built from
  - The augmented grammar
    - If G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S
  - Closure function
  - Goto function
Use of the LR(0) Automaton
Function Closure
- If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I.
- Create closure(I) by the two rules:
  - Add every item in I to closure(I).
  - If A → α . Bβ is in closure(I) and B → γ is a production, then add the item B → . γ to closure(I). Apply this rule until no more new items can be added to closure(I).
- Divide the items into two classes
  - Kernel items
    - The initial item S’ → . S, and all items whose dots are not at the left end.
  - Nonkernel items
    - All items with their dots at the left end, except for S’ → . S
Example
 The grammar G
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
 Let I = { E’ → . E } , then
closure(I) = {
E’ → . E
E → . E + T
E → . T
T → . T * F
T → . F
F → . ( E )
F → . id }
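
The closure rules can be sketched in a few lines of Python. This is only an illustrative sketch: the item encoding (head, body, dot position) and the names GRAMMAR, closure and show are assumptions made here, not part of the slides.

```python
# Expression grammar from the example; "E'" is the augmented start symbol.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """Return closure(I) for a set of LR(0) items (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        # Rule 2: the dot stands in front of a nonterminal B.
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)  # B -> . gamma
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result

def show(item):
    """Format an item with a '.' at the dot position."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"

I0 = closure({("E'", ("E",), 0)})
for line in sorted(show(item) for item in I0):
    print(line)
```

Running it prints the seven items of closure({ E’ → . E }) listed above, in sorted order.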
Exercise
 The grammar G
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
 Let I = { E → E + . T }
Function Goto
- Function Goto(I, X)
  - I is a set of items
  - X is a grammar symbol
- Goto(I, X) is defined to be the closure of the set of all items [A → α X . β] such that [A → α . Xβ] is in I.
- The Goto function is used to define the transitions in the LR(0) automaton for a grammar.
Example
 I = {
E’ → E .
E → E . + T }
 Goto (I, +) = {
E → E + . T
T → . T * F
T → . F
F → . (E)
F → . id
}
The grammar G
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
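
A possible Python sketch of Goto for this example follows. The tuple encoding of items and the function names are illustrative assumptions; closure is repeated so that the block runs on its own.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """closure(I) for LR(0) items encoded as (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (head, body, dot + 1)
        for (head, body, dot) in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)

# I = { E' -> E . ,  E -> E . + T }
I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
J = goto(I, "+")
```

Here J contains E → E + . T together with the four nonkernel items for T and F, matching Goto(I, +) in the example.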
Constructing the LR(0) Collection
1. The grammar is augmented with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → •S]}) } (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that GOTO(I, X) ∉ C and GOTO(I, X) ≠ ∅, add the set of items GOTO(I, X) to C
4. Repeat 3 until no more sets can be added to C
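
The four steps can be sketched in Python as a worklist loop. This is a minimal sketch under assumptions: the item encoding, the function names, and the order in which states are discovered are choices made here, so the state numbers it produces need not match the numbering used on the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
# All grammar symbols (N union T), sorted only to make the run deterministic.
SYMBOLS = sorted(set(GRAMMAR) | {s for bodies in GRAMMAR.values() for b in bodies for s in b})

def closure(items):
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)

def goto(items, symbol):
    return closure({(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol})

def canonical_collection():
    """Steps 2-4: start from closure({S' -> .S}) and add GOTO sets until stable."""
    states = [closure({("E'", ("E",), 0)})]
    for state in states:  # the list grows while we iterate over it
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states

C = canonical_collection()
print(len(C), "sets of LR(0) items")
```

For the expression grammar this yields 12 sets of items, which agrees with the states 0 to 11 used in the parse tables of this chapter.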
Example: The Parse of id * id
Line  STACK      SYMBOLS    INPUT        ACTION
(1)   0          $          id * id $    shift to 5
(2)   0 5        $ id       * id $       reduce by F → id
(3)   0 3        $ F        * id $       reduce by T → F
(4)   0 2        $ T        * id $       shift to 7
(5)   0 2 7      $ T *      id $         shift to 5
(6)   0 2 7 5    $ T * id   $            reduce by F → id
(7)   0 2 7 10   $ T * F    $            reduce by T → T * F
(8)   0 2        $ T        $            reduce by E → T
(9)   0 1        $ E        $            accept
Model of an LR Parser
Structure of the LR Parsing Table
- The parsing table consists of two parts:
  - A parsing-action function ACTION
  - A goto function GOTO
- The ACTION function entry ACTION[i, a], for state i and terminal a (or $), has one of four forms:
  - Shift j, where j is a state.
    - The parser shifts input a onto the stack, but uses state j to represent a.
  - Reduce A → β.
    - The parser reduces β on the top of the stack to head A.
  - Accept
    - The parser accepts the input and finishes parsing.
  - Error
Structure of the LR Parsing Table(1)
 The GOTO Function, GOTO[Ii, A], defined on
sets of items.
 If GOTO[Ii, A] = Ij, then GOTO also maps a state i
and a nonterminal A to state j.
LR-Parser Configurations
Configuration (= LR parser state):
    ($ s0 s1 s2 … sm, ai ai+1 … an $)      stack of states, remaining input
    ($ X1 X2 … Xm, ai ai+1 … an $)         the same stack written as grammar symbols
- If action[sm, ai] = shift s, then push s (representing ai) and advance the input:
    (s0 s1 s2 … sm, ai ai+1 … an $) ⇒ (s0 s1 s2 … sm s, ai+1 … an $)
- If action[sm, ai] = reduce A → β and goto[sm-r, A] = s with r = |β|, then pop r symbols and push s (representing A):
    (s0 s1 s2 … sm, ai ai+1 … an $) ⇒ (s0 s1 s2 … sm-r s, ai ai+1 … an $)
- If action[sm, ai] = accept, then stop
- If action[sm, ai] = error, then attempt recovery
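Example LR Parse Table
Grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

         action                                goto
state    id    +     *     (     )     $       E    T    F
0        s5    -     -     s4    -     -       1    2    3
1        -     s6    -     -     -     acc     -    -    -
2        -     r2    s7    -     r2    r2      -    -    -
3        -     r4    r4    -     r4    r4      -    -    -
4        s5    -     -     s4    -     -       8    2    3
5        -     r6    r6    -     r6    r6      -    -    -
6        s5    -     -     s4    -     -       -    9    3
7        s5    -     -     s4    -     -       -    -    10
8        -     s6    -     -     s11   -       -    -    -
9        -     r1    s7    -     r1    r1      -    -    -
10       -     r3    r3    -     r3    r3      -    -    -
11       -     r5    r5    -     r5    r5      -    -    -

sN: shift and go to state N (for example, s5 = shift & goto 5)
rN: reduce by production N (for example, r1 = reduce by production #1)
-: blank entry (error)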
Example
Grammar
0. S → E
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

Line   STACK      SYMBOLS   INPUT             ACTION
(1)    0                    id * id + id $    shift 5
(2)    0 5        id        * id + id $       reduce 6 goto 3
(3)    0 3        F         * id + id $       reduce 4 goto 2
(4)    0 2        T         * id + id $       shift 7
(5)    0 2 7      T *       id + id $         shift 5
(6)    0 2 7 5    T * id    + id $            reduce 6 goto 10
(7)    0 2 7 10   T * F     + id $            reduce 3 goto 2
(8)    0 2        T         + id $            reduce 2 goto 1
(9)    0 1        E         + id $            shift 6
(10)   0 1 6      E +       id $              shift 5
(11)   0 1 6 5    E + id    $                 reduce 6 goto 3
(12)   0 1 6 3    E + F     $                 reduce 4 goto 9
(13)   0 1 6 9    E + T     $                 reduce 1 goto 1
(14)   0 1        E         $                 accept
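
The table-driven loop behind this trace can be sketched as follows. The ACTION and GOTO dictionaries transcribe the example LR parse table for the expression grammar; the dictionary layout, the parse function, and its return value (the list of production numbers used in reductions) are assumptions made for illustration.

```python
# Production number -> (head, length of body), for productions 1-6.
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

REDUCE_ON = ("+", "*", ")", "$")
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {a: "r4" for a in REDUCE_ON},
    4: {"id": "s5", "(": "s4"},
    5: {a: "r6" for a in REDUCE_ON},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {a: "r3" for a in REDUCE_ON},
    11: {a: "r5" for a in REDUCE_ON},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def parse(tokens):
    """Run the LR driver; return the production numbers used, in order."""
    stack = [0]                      # stack of states, s0 at the bottom
    tokens = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:              # blank table entry
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return reductions
        if act[0] == "s":            # shift: push the state, advance the input
            stack.append(int(act[1:]))
            pos += 1
        else:                        # reduce: pop |body| states, then goto
            number = int(act[1:])
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)

print(parse("id * id + id".split()))
```

For id * id + id the reductions come out as 6, 4, 6, 3, 2, 6, 4, 1, the same sequence as in the ACTION column of the trace.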
SLR Grammars
- SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing
- SLR eliminates some conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A)

Grammar:
S → E
E → id + E
E → id

State I0:
S → •E
E → •id + E
E → •id

State I2 = goto(I0, id):
E → id• + E
E → id•

In I2: FOLLOW(E) = {$}, thus reduce on $; shift on + (goto(I2, +) = I3)
SLR Parsing Table
- Reductions do not fill entire rows
- Otherwise the same as LR(0)

Grammar:
1. S → E
2. E → id + E
3. E → id

         action              goto
state    id    +     $       E
0        s2    -     -       1
1        -     -     acc     -
2        -     s3    r3      -
3        s2    -     -       4
4        -     -     r2      -

In state 2: FOLLOW(E) = {$}, thus reduce on $; shift on +
Constructing SLR Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = {I0, I1, …, In} of LR(0) items
3. State i is constructed from Ii
   - If [A → α•aβ] ∈ Ii and goto(Ii, a) = Ij then set action[i, a] = shift j
   - If [A → α•] ∈ Ii then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S’)
   - If [S’ → S•] is in Ii then set action[i, $] = accept
4. If goto(Ii, A) = Ij then set goto[i, A] = j (this fills the goto table)
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S]
Example SLR Grammar and LR(0) Items
Augmented grammar:
0. C’ → C
1. C → A B
2. A → a
3. B → a

I0 = closure({[C’ → •C]})
I1 = goto(I0, C) = closure({[C’ → C•]})
…

State I0 (start):
C’ → •C
C → •A B
A → •a
State I1 = goto(I0, C) (final):
C’ → C•
State I2 = goto(I0, A):
C → A•B
B → •a
State I3 = goto(I0, a):
A → a•
State I4 = goto(I2, B):
C → A B•
State I5 = goto(I2, a):
B → a•
Example SLR Parsing Table
Grammar:
0. C’ → C
1. C → A B
2. A → a
3. B → a

         action        goto
state    a     $       C    A    B
0        s3    -       1    2    -
1        -     acc     -    -    -
2        s5    -       -    -    4
3        r2    -       -    -    -
4        -     r1      -    -    -
5        -     r3      -    -    -

[Figure: the LR(0) automaton for states I0 to I5 of this grammar, with transitions 0 -C-> 1, 0 -A-> 2, 0 -a-> 3, 2 -B-> 4, 2 -a-> 5; state 0 is the start state]
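
The SLR construction for this small grammar can be sketched in Python. Several things here are assumptions for illustration: items are encoded as (production number, dot), FOLLOW is written out by hand for this grammar rather than computed, and states are numbered in discovery order, which differs from the I0 to I5 numbering on the slide.

```python
# Numbered productions of the augmented grammar: 0. C' -> C, 1. C -> A B, ...
GRAMMAR = [("C'", ("C",)), ("C", ("A", "B")), ("A", ("a",)), ("B", ("a",))]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = sorted({s for _, body in GRAMMAR for s in body})
# FOLLOW sets worked out by hand for this grammar (an assumption of the sketch).
FOLLOW = {"C": {"$"}, "A": {"a"}, "B": {"$"}}

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for q, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    worklist.append((q, 0))
    return frozenset(result)

def goto(items, x):
    return closure({(p, d + 1) for p, d in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == x})

def set_action(action, state, terminal, value):
    """Store a table entry; two different entries in one cell is a conflict."""
    if action.get((state, terminal), value) != value:
        raise ValueError(f"conflict in state {state} on {terminal!r}")
    action[state, terminal] = value

def slr_table():
    states = [closure({(0, 0)})]
    action, goto_table = {}, {}
    for i, state in enumerate(states):  # the list grows while we iterate
        for x in SYMBOLS:
            target = goto(state, x)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if x in NONTERMINALS:
                goto_table[i, x] = j
            else:
                set_action(action, i, x, ("shift", j))
        for prod, dot in state:
            head, body = GRAMMAR[prod]
            if dot == len(body):
                if prod == 0:
                    set_action(action, i, "$", ("accept",))
                else:
                    for a in FOLLOW[head]:
                        set_action(action, i, a, ("reduce", prod))
    return states, action, goto_table

states, action, goto_table = slr_table()
```

No conflict is raised, so the grammar is SLR: the reduction by A → a is entered only under a, and the reduction by B → a only under $, which is what separates the two reductions.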
SLR and Ambiguity
- Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR (it may still be LR(1))
- Consider for example the unambiguous grammar
S → L = R | R
L → * R | id
R → L
FOLLOW(R) = {=, $}

I0: S’ → •S, S → •L=R, S → •R, L → •*R, L → •id, R → •L
I1: S’ → S•
I2: S → L•=R, R → L•
I3: S → R•
I4: L → *•R, R → •L, L → •*R, L → •id
I5: L → id•
I6: S → L=•R, R → •L, L → •*R, L → •id
I7: L → *R•
I8: R → L•
I9: S → L=R•

In I2, S → L•=R gives action[2,=] = s6, while R → L• with = in FOLLOW(R) gives action[2,=] = r5 (reduce by R → L): a shift-reduce conflict.
The grammar has no SLR parsing table
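
The claim FOLLOW(R) = {=, $} can be checked with a short fixed-point computation. This is a simplified sketch that assumes the grammar has no ε-productions (true for this grammar); the dictionary encoding and function names are illustrative.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}

def first_sets(grammar):
    """FIRST for each nonterminal, assuming no epsilon-productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def follow_sets(grammar, start):
    """FOLLOW for each nonterminal, assuming no epsilon-productions."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, symbol in enumerate(body):
                    if symbol not in grammar:
                        continue  # FOLLOW is only defined for nonterminals
                    if i + 1 < len(body):
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[head]  # symbol ends the body
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow

FOLLOW = follow_sets(GRAMMAR, "S'")
print(sorted(FOLLOW["R"]))
```

Because = is in FOLLOW(R), the SLR rule places the reduction by R → L under = in state 2, in the same cell as the shift.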
LR(1) Grammars
- SLR is too simple
- LR(1) parsing uses lookahead to avoid unnecessary conflicts in the parsing table
- LR(1) item = LR(0) item + lookahead
LR(0) item: [A → α•β]
LR(1) item: [A → α•β, a]
SLR Versus LR(1)
- Split the SLR states by adding LR(1) lookahead
- Unambiguous grammar
S → L = R | R
L → * R | id
R → L
I2:
S → L•=R
R → L•
action[2,=] = s6
Should not reduce on =, because no right-sentential form begins with R=
Split: with LR(1) lookaheads, R → L• is reduced only on $, while S → L•=R shifts on =
LR(1) Items
- An LR(1) item [A → α•β, a] contains a lookahead terminal a, meaning: α is already on top of the stack, expect to see βa
- For items of the form [A → α•, a] the lookahead a is used to reduce by A → α only if the next input is a
- For items of the form [A → α•β, a] with β ≠ ε the lookahead has no effect
The Closure Operation for LR(1) Items
1. Start with closure(I) = I
2. If [A → α•Bβ, a] ∈ closure(I) then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → •γ, b] to closure(I) if not already there
3. Repeat 2 until no new items can be added
The Goto Operation for LR(1) Items
1. For each item [A → α•Xβ, a] ∈ I, add the set of items closure({[A → αX•β, a]}) to goto(I, X) if not already there
2. Repeat step 1 until no more items can be added to goto(I, X)
Example
 Let I= { (S’ → •S, $) }
 I0 = closure(I) = {
S’ → •S, $
S → • C C, $
C → •c C, c/d
C → •d, c/d
}
 goto(I0, S) = closure( {S’ → S •, $ } )
= {S’ → S •, $ } = I1
The grammar G
S’ → S
S → C C
C → c C | d
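
The LR(1) closure for this grammar can be sketched in Python. The sketch assumes a grammar without ε-productions, so FIRST(βa) is FIRST of the first symbol of β when β is non-empty and {a} otherwise; the 4-tuple item encoding (head, body, dot, lookahead) and the names used are illustrative choices.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}

def first_sets():
    """FIRST for each nonterminal, assuming no epsilon-productions."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first[body[0]] if body[0] in GRAMMAR else {body[0]}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

FIRST = first_sets()

def closure(items):
    """closure(I) for LR(1) items (head, body, dot, lookahead)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot, lookahead = worklist.pop()
        if dot == len(body) or body[dot] not in GRAMMAR:
            continue
        b, beta = body[dot], body[dot + 1:]
        # FIRST(beta a): first symbol of beta decides, else the lookahead a.
        if beta:
            lookaheads = FIRST[beta[0]] if beta[0] in GRAMMAR else {beta[0]}
        else:
            lookaheads = {lookahead}
        for production in GRAMMAR[b]:
            for terminal in lookaheads:
                item = (b, production, 0, terminal)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result

I0 = closure({("S'", ("S",), 0, "$")})
for item in sorted(I0):
    print(item)
```

The result has six items, since the shorthand c/d in the example stands for two items each.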
Exercise
 Let I= { (S → C •C, $) }
 I2 = closure(I) = ?
 I3 = goto(I2, c) = ?
The grammar G
S’ → S
S → C C
C → c C | d
Construction of the sets of LR(1) Items
1. Augment the grammar with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → •S, $]}) } (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C
LR(1) Automaton
Construction of the Canonical LR(1) Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = {I0, I1, …, In} of LR(1) items
3. State i of the parser is constructed from Ii
   - If [A → α•aβ, b] ∈ Ii and goto(Ii, a) = Ij then set action[i, a] = shift j
   - If [A → α•, a] ∈ Ii then set action[i, a] = reduce A → α (apply only if A ≠ S’)
   - If [S’ → S•, $] is in Ii then set action[i, $] = accept
4. If goto(Ii, A) = Ij then set goto[i, A] = j
5. Repeat 3 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S, $]
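Example
The grammar G
S’ → S
S → C C
C → c C | d
(productions numbered 1. S → C C, 2. C → c C, 3. C → d)

         ACTION              GOTO
state    c     d     $       S    C
0        s3    s4    -       1    2
1        -     -     acc     -    -
2        s6    s7    -       -    5
3        s3    s4    -       -    8
4        r3    r3    -       -    -
5        -     -     r1      -    -
6        s6    s7    -       -    9
7        -     -     r3      -    -
8        r2    r2    -       -    -
9        -     -     r2      -    -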
Example Grammar and LR(1) Items
- Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
- Augment with S’ → S
- LR(1) items:

I0:  [S’ → •S, $]       goto(I0,S)=I1
     [S → •L=R, $]      goto(I0,L)=I2
     [S → •R, $]        goto(I0,R)=I3
     [L → •*R, =/$]     goto(I0,*)=I4
     [L → •id, =/$]     goto(I0,id)=I5
     [R → •L, $]        goto(I0,L)=I2
I1:  [S’ → S•, $]
I2:  [S → L•=R, $]      goto(I2,=)=I6
     [R → L•, $]
I3:  [S → R•, $]
I4:  [L → *•R, =/$]     goto(I4,R)=I7
     [R → •L, =/$]      goto(I4,L)=I8
     [L → •*R, =/$]     goto(I4,*)=I4
     [L → •id, =/$]     goto(I4,id)=I5
I5:  [L → id•, =/$]
I6:  [S → L=•R, $]      goto(I6,R)=I9
     [R → •L, $]        goto(I6,L)=I10
     [L → •*R, $]       goto(I6,*)=I11
     [L → •id, $]       goto(I6,id)=I12
I7:  [L → *R•, =/$]
I8:  [R → L•, =/$]
I9:  [S → L=R•, $]
I10: [R → L•, $]
I11: [L → *•R, $]       goto(I11,R)=I13
     [R → •L, $]        goto(I11,L)=I10
     [L → •*R, $]       goto(I11,*)=I11
     [L → •id, $]       goto(I11,id)=I12
I12: [L → id•, $]
I13: [L → *R•, $]
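Example LR(1) Parsing Table
Grammar:
1. S’ → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L

         action                  goto
state    id    *     =     $     S    L    R
0        s5    s4    -     -     1    2    3
1        -     -     -     acc   -    -    -
2        -     -     s6    r6    -    -    -
3        -     -     -     r3    -    -    -
4        s5    s4    -     -     -    8    7
5        -     -     r5    r5    -    -    -
6        s12   s11   -     -     -    10   9
7        -     -     r4    r4    -    -    -
8        -     -     r6    r6    -    -    -
9        -     -     -     r2    -    -    -
10       -     -     -     r6    -    -    -
11       s12   s11   -     -     -    10   13
12       -     -     -     r5    -    -    -
13       -     -     -     r4    -    -    -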
LALR(1) Grammars
- LR(1) parsing tables have many states
- LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size
- Less powerful than LR(1)
  - Will not introduce shift-reduce conflicts, because shifts do not use lookaheads
  - May introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages
- SLR and LALR tables for a grammar always have the same number of states, typically far fewer than the LR(1) table.
  - For a language like C, SLR and LALR tables have more than 100 states, while the LR(1) table has more than 1000
Constructing LALR Parsing Tables
- Two ways
  - Construction of the LALR parsing table from the sets of LR(1) items.
    - Merge (union) the states that have the same core
    - Requires much space and time
  - Construction of the LALR parsing table from the sets of LR(0) items
    - Efficient
    - Used in practice.
Example
LR(1):
         ACTION              GOTO
state    c     d     $       S    C
0        s3    s4    -       1    2
1        -     -     acc     -    -
2        s6    s7    -       -    5
3        s3    s4    -       -    8
4        r3    r3    -       -    -
5        -     -     r1      -    -
6        s6    s7    -       -    9
7        -     -     r3      -    -
8        r2    r2    -       -    -
9        -     -     r2      -    -

LALR(1):
         ACTION              GOTO
state    c     d     $       S    C
0        s36   s47   -       1    2
1        -     -     acc     -    -
2        s36   s47   -       -    5
36       s36   s47   -       -    89
47       r3    r3    r3      -    -
5        -     -     r1      -    -
89       r2    r2    r2      -    -
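
The step from the LR(1) table to the LALR(1) table can be sketched in Python by building the LR(1) sets of items for S → C C, C → c C | d and then merging the sets that have the same core (the items with lookaheads stripped). This is the space-hungry "from LR(1) items" method, shown only for illustration; the item encoding and all names are assumptions, and the grammar is assumed to have no ε-productions.

```python
from collections import defaultdict

GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}
SYMBOLS = sorted(set(GRAMMAR) | {s for bodies in GRAMMAR.values() for b in bodies for s in b})

def first_sets():
    """FIRST for each nonterminal, assuming no epsilon-productions."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first[body[0]] if body[0] in GRAMMAR else {body[0]}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

FIRST = first_sets()

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot, lookahead = worklist.pop()
        if dot == len(body) or body[dot] not in GRAMMAR:
            continue
        beta = body[dot + 1:]
        if beta:
            lookaheads = FIRST[beta[0]] if beta[0] in GRAMMAR else {beta[0]}
        else:
            lookaheads = {lookahead}
        for production in GRAMMAR[body[dot]]:
            for terminal in lookaheads:
                item = (body[dot], production, 0, terminal)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)

def goto(items, x):
    return closure({(h, b, d + 1, a) for (h, b, d, a) in items
                    if d < len(b) and b[d] == x})

def lr1_collection():
    states = [closure({("S'", ("S",), 0, "$")})]
    for state in states:  # the list grows while we iterate
        for x in SYMBOLS:
            target = goto(state, x)
            if target and target not in states:
                states.append(target)
    return states

def merge_by_core(states):
    """Union all LR(1) sets whose items agree once lookaheads are dropped."""
    merged = defaultdict(set)
    for state in states:
        core = frozenset((h, b, d) for (h, b, d, _) in state)
        merged[core] |= state
    return list(merged.values())

lr1_states = lr1_collection()
lalr_states = merge_by_core(lr1_states)
print(len(lr1_states), "LR(1) states,", len(lalr_states), "LALR(1) states")
```

The counts are 10 and 7, matching the two tables: three pairs of LR(1) states (3 and 6, 4 and 7, 8 and 9) share a core and are merged.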
Constructing LALR(1) Parsing Tables
1. Construct sets of LR(1) items
2. Combine LR(1) sets with sets of items that share the same first part (the core)

I4:
[L → *•R, =]
[R → •L, =]
[L → •*R, =]
[L → •id, =]
I11:
[L → *•R, $]
[R → •L, $]
[L → •*R, $]
[L → •id, $]
Combined:
[L → *•R, =/$]
[R → •L, =/$]
[L → •*R, =/$]
[L → •id, =/$]
(=/$ is shorthand for two items in the same set)
Example LALR(1) Grammar
- Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
- Augment with S’ → S
- LALR(1) items:

I0: [S’ → •S, $]       goto(I0,S)=I1
    [S → •L=R, $]      goto(I0,L)=I2
    [S → •R, $]        goto(I0,R)=I3
    [L → •*R, =/$]     goto(I0,*)=I4
    [L → •id, =/$]     goto(I0,id)=I5
    [R → •L, $]        goto(I0,L)=I2
I1: [S’ → S•, $]
I2: [S → L•=R, $]      goto(I2,=)=I6
    [R → L•, $]
I3: [S → R•, $]
I4: [L → *•R, =/$]     goto(I4,R)=I7
    [R → •L, =/$]      goto(I4,L)=I9
    [L → •*R, =/$]     goto(I4,*)=I4
    [L → •id, =/$]     goto(I4,id)=I5
I5: [L → id•, =/$]
I6: [S → L=•R, $]      goto(I6,R)=I8
    [R → •L, $]        goto(I6,L)=I9
    [L → •*R, $]       goto(I6,*)=I4
    [L → •id, $]       goto(I6,id)=I5
I7: [L → *R•, =/$]
I8: [S → L=R•, $]
I9: [R → L•, =/$]      (shorthand for the two items [R → L•, =] and [R → L•, $])
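Example LALR(1) Parsing Table
Grammar:
1. S’ → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L

         action                  goto
state    id    *     =     $     S    L    R
0        s5    s4    -     -     1    2    3
1        -     -     -     acc   -    -    -
2        -     -     s6    r6    -    -    -
3        -     -     -     r3    -    -    -
4        s5    s4    -     -     -    9    7
5        -     -     r5    r5    -    -    -
6        s5    s4    -     -     -    9    8
7        -     -     r4    r4    -    -    -
8        -     -     -     r2    -    -    -
9        -     -     r6    r6    -    -    -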
LL, SLR, LR, LALR Summary
- LL parse tables
  - Nonterminals × terminals → productions
  - Computed using FIRST/FOLLOW
- LR parsing tables computed using closure/goto
  - LR states × terminals → shift/reduce actions
  - LR states × nonterminals → goto state transitions
- A grammar is
  - LL(1) if its LL(1) parse table has no conflicts
  - SLR if its SLR parse table has no conflicts
  - LR(1) if its LR(1) parse table has no conflicts
  - LALR(1) if its LALR(1) parse table has no conflicts
LL, SLR, LR, LALR Grammars
[Figure: nested classes of grammars: LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1), with LL(1) also contained in LR(1)]
YACC
yacc specification (yacc.y) → Yacc or Bison compiler → y.tab.c
y.tab.c → C compiler → a.out
input stream → a.out → output stream

Predicting Stock Market Trends Using Machine Learning and Deep Learning Algorithms: A Comparative Analysis of Continuous and Binary Data

  • 1.
  • 2.
    Bottom-Up Parsing  LRmethods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR  Other special cases:  Shift-reduce parsing  Operator-precedence parsing
  • 3.
    Operator-Precedence Parsing  Specialcase of shift-reduce parsing  See textbook section 4.6
  • 4.
    Bottom-Up Parsing E →T → T * F → T * id → F * id → id * id rightmost derivation reduction E → E + T | T T → T * F | F F → ( E ) | id
  • 5.
    Handle Pruning  Handle A handle is a substring of grammar symbols in a right-sentential form that matches a right-hand side of a production Right Sentential Form Handle Reducing Production id1 * id2 id1 F → id F * id2 F T → F T * id2 id2 F → id T * F T * F E → T * F E → E + T | T T → T * F | F F → ( E ) | id
  • 6.
    Handle Pruning  Arightmost derivation in reverse can be obtai ned by “handle pruning” S = γ0 →rm γ1 → rm γ2 →rm …. →rm γn-1 →rm γn =ω  Handle definition  S →*rm αAω →*rm αβω S A α ω β A handle A →β in the parse tree for αβω
  • 7.
    Example: Handle Handle Grammar S a A B e A  A b c | b B  d NOT a handle, because further reductions will fail (result is not a sentential form) a b b c d e a A b c d e a A A e … ? a b b c d e a A b c d e a A d e a A B e S
  • 8.
    Shift-Reduce Parsing  Shift-ReduceParsing is a form of bottom-up parsing  A stack holds grammar symbols  An input buffer holds the rest of the string to parsed.  Shift-Reduce parser action  shift  reduce  accept  error
  • 9.
    Shift-Reduce Parsing  Shift Shift the next input symbol onto the top of the stack.  Reduce  The right end of the string to be reduced must be at the top of the stack.  Locate the left end of the string within the stack and decide with what nonterminal to replace the string.  Accept  Announce successful completion of parsing  Error  Discover a syntax error and call recovery routine
  • 10.
    Shift-Reduce Parsing Stack InputAction $ id1 * id2 $ shift $id1 * id2 $ reduce by F → id $F * id2 $ reduce by T → F $T * id2 $ shift $T * id2 $ shift $T * id2 $ reduce by F → id $T * F $ reduce by T → T * F $T $ reduce by E → T $E $ accept E → E + T | T T → T * F | F F → ( E ) | id
  • 11.
    Example: Shift-Reduce Parsing Grammar: S a A B e A  A b c | b B  d Shift-reduce corresponds to a rightmost derivation: S rm a A B e rm a A d e rm a A b c d e rm a b b c d e Reducing a sentence: a b b c d e a A b c d e a A d e a A B e S S a b b c d e A A B a b b c d e A A B a b b c d e A A a b b c d e A These match production’s right-hand sides
  • 12.
    Conflicts During Shift-ReduceParsing  Conflicts Type  shift-reduce  reduce-reduce  Shift-reduce and reduce-reduce conflicts are caused by  The limitations of the LR parsing method (even when the grammar is unambiguous)  Ambiguity of the grammar
  • 13.
    Shift-Reduce Conflict Stack $ $id $E $E+ $E+id $E+E $E+E* $E+E*id $E+E*E $E+E $E Input id+id*id$ +id*id$ +id*id$ id*id$ *id$ *id$ id$ $ $ $ $ Action shift reduce E id shift shift reduce E  id shift (or reduce?) shift reduce E  id reduce E  E * E reduce E  E + E accept Grammar E  E + E E  E * E E  ( E ) E  id Find handles to be reduced How to resolve conflicts?
  • 14.
    Shift-Reduce Conflict Stack $… $…if Ethen S Input …$ else…$ Action … shift or reduce? Ambiguous grammar: S  if E then S | if E then S else S | other Resolve in favor of shift, so else matches closest if Shift else to if E then S or Reduce if E then S
  • 15.
    Reduce-Reduce Conflict Stack $ $a Input aa$ a$ Action shift reduce A a or B  a ? Grammar C  A B A  a B  a Resolve in favor of reduce A  a, otherwise we’re stuck! 動彈不得
  • 16.
    LR  LR parserare table-driven  Much like the nonrecursive LL parsers.  The reasons of the using the LR parsing  An LR-parsering method is the most general nonbacktracki ng shift-reduce parsing method known, yet it can be imple mented as efficiently as other.  LR parsers can be constructed to recognize virtually all pro gramming-language constructs for which context-free gram mars can be written.  An LR parser can detect a syntactic error as soon as it is p ossible to do so on a left-to-right scan of the input.  The class of grammars that can be parsed using LR metho ds is a proper superset of the class of grammars that can b e parsed with predictive or LL methods.
  • 17.
    LR(0)  An LRparser makes shift-reduce decisions b y maintaining states to keep track.  An item of a grammar G is a production of G with a dot at some position of the body.  Example A → X Y Z A → . X Y Z A → X . Y Z A → X Y . Z A → X Y Z . A → X . Y Z stack next derivations with input strings items Note that production A   has one item [A  •]
  • 18.
    LR(0)  Canonical LR(0)Collection  One collection of sets of LR(0) items  Provide the basis for constructing a DFA that is us ed to make parsing decisions.  LR(0) automation  The canonical LR(0) collection for a grammar  Augmented the grammar  If G is a grammar with start symbol S, then G’ is the aug mented grammar for G with new start symbol S’ and new production S’ → S  Closure function  Goto function
  • 19.
    Use of theLR(0) Automaton
  • 20.
    Function Closure  IfI is a set of items for a grammar G, then closure(I) is th e set of items constructed from I.  Create closure(I) by the two rules:  add every item in I to closure(I)  If A→α . Bβ is in closure(I) and B →γ is a production, then add the item B → . γ to closure(I). Apply this rule untill no more new items can be added to closure(I).  Divide all the sets of items into two classes  Kernel items  initial item S’ → . S, and all items whose dots are not at the left end.  Nonkernel items  All items with their dots at the left end, except for S’ → . S
  • 21.
    Example  The grammarG E’ → E E → E + T | T T → T * F | F F → ( E ) | id  Let I = { E’ → . E } , then closure(I) = { E’ → . E E → . E + T E → . T T → . T * F T → . F F → . ( E ) F → . id }
  • 22.
    Exercise  The grammarG E’ → E E → E + T | T T → T * F | F F → ( E ) | id  Let I = { E → E + . T }
  • 23.
    Function Goto  FunctionGoto(I, X)  I is a set of items  X is a grammar symbol  Goto(I, X) is defined to be the closure of the set of all items [A  α X‧β] such that [A  α‧ Xβ] is in I.  Goto function is used to define the transitions in th e LR(0) automation for a grammar.
  • 24.
    Example  I ={ E’ → E . E → E . + T }  Goto (I, +) = { E → E + . T T → . T * F T → . F F → . (E) F → . id } The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id
  • 25.
    Constructing the LR(0)Collection 1. The grammar is augmented with a new start symbol S’ and production S’S 2. Initially, set C = closure({[S’•S]}) (this is the start state of the DFA) 3. For each set of items I  C and each gramm ar symbol X  (NT) such that GOTO(I, X)  C and goto(I, X)  , add the set of items GOTO(I, X) to C 4. Repeat 3 until no more sets can be added t o C
  • 26.
    Example: The Parseof id * id Line STACK SYMBOLS INPUT ACTION (1) 0 $ id * id $ shift to 5 (2) 0 5 $ id * id $ reduce by F → id (3) 0 3 $ F * id $ reduce by T → F (4) 0 2 $ T * id $ shift to 7 (5) 0 2 7 $ T * id $ shift to 5 (6) 0 2 7 5 $ T * id $ reduce by F → id (7) 0 2 7 10 $ T * F $ reduce by T → T * F (8) 0 2 $ T $ reduce by E → T (9) 0 1 $ E $ accept
  • 27.
    Model of anLR Parser
  • 28.
    Structure of theLR Parsing Table  Parsing Table consists of two parts:  A parsing-action function ACTION  A goto function GOTO  The Action function, Action[i, a], have one of four for ms:  Shift j, where j is a state.  The action taken by the parser shifts input a to the stack, but u se state j to represent a.  Reduce A→β.  The action of the parser reduces β on the top of the stack to h ead A.  Accept  The parser accepts the input and finishes parsing.  Error
  • 29.
    Structure of theLR Parsing Table(1)  The GOTO Function, GOTO[Ii, A], defined on sets of items.  If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.
  • 30.
    LR-Parser Configurations Configuration (= LR parser state): ($s0 s1 s2 … sm, ai ai+1 … an $) stack input ($ X1 X2 … Xm, ai ai+1 … an $) If action[sm, ai] = shift s then push s (ai), and advance input (s0 s1 s2 … sm, ai ai+1 … an $)  (s0 s1 s2 … sm s, ai+1 … an $) If action[sm, ai] = reduce A   and goto[sm-r, A] = s with r = || then pop r symbols, and push s ( push A ) ( (s0 s1 s2 … sm, ai ai+1 … an $)  (s0 s1 s2 … sm-r s, ai ai+1 … an $) If action[sm, ai] = accept then stop If action[sm, ai] = error then attempt recovery
  • 31.
    Example LR ParseTable Grammar: 1. E  E + T 2. E  T 3. T  T * F 4. T  F 5. F  ( E ) 6. F  id s5 s4 s6 acc r2 s7 r2 r2 r4 r4 r4 r4 s5 s4 r6 r6 r6 r6 s5 s4 s5 s4 s6 s11 r1 s7 r1 r1 r3 r3 r3 r3 r5 r5 r5 r5 id + * ( ) $ 0 1 2 3 4 5 6 7 8 9 10 11 E T F 1 2 3 8 2 3 9 3 10 shift & goto 5 reduce by production #1 action goto state
  • 32.
    Line STACK SYMBOLSINPUT ACTION (1) 0 id * id + id $ shift 5 (2) 0 5 id * id + id $ reduce 6 goto 3 (3) 0 3 F * id + id $ reduce 4 goto 2 (4) 0 2 T * id + id $ shift 7 (5) 0 2 7 T * id + id $ shift 5 (6) 0 2 7 5 T * id + id $ reduce 6 goto 10 (7) 0 2 7 10 T * F + id $ reduce 3 goto 2 (8) 0 2 T + id $ reduce 2 goto 1 (9) 0 1 E + id $ shift 6 (10 ) 0 1 6 E + id $ shift 5 (11 ) 0 1 6 5 E + id $ reduce 6 goto 3 (12 ) 0 1 6 3 E + F $ reduce 4 goto 9 (13 0 1 6 9 E + T $ reduce 1 goto 1 Grammar 0. S  E 1. E  E + T 2. E  T 3. T  T * F 4. T  F 5. F  ( E ) 6. F  id Example
  • 33.
    SLR Grammars  SLR(Simple LR): a simple extension of LR(0) shift-reduce parsing  SLR eliminates some conflicts by populating the parsing table with reductions A on symbols in FOLLOW(A) S  E E  id + E E  id State I0: S  •E E  •id + E E  •id State I2: E  id•+ E E  id• goto(I0,id) goto(I3,+) FOLLOW(E)={$} thus reduce on $ Shift on +
  • 34.
    SLR Parsing Table Reductions do not fill entire rows  Otherwise the same as LR(0) s2 acc s3 r3 s2 r2 id + $ 0 1 2 3 4 E 1 4 1. S  E 2. E  id + E 3. E  id FOLLOW(E)={$} thus reduce on $ Shift on +
  • 35.
    Constructing SLR ParsingTables  Augment the grammar with S’ S  Construct the set C={I0, I1, …, In} of LR(0) items  State i is constructed from Ii  If [A•a]  Ii and goto(Ii, a)=Ij then set action[i, a]=shift j  If [A•]  Ii then set action[i,a]=reduce A for all a  FOLL OW(A) (apply only if AS’)  If [S’S•] is in Ii then set action[i,$]=accept  If goto(Ii, A)=Ij then set goto[i, A]=j set goto table  Repeat 3-4 until no more entries added  The initial state i is the Ii holding item [S’•S]
  • 36.
    Example SLR Grammarand LR(0) Items Augmented grammar: 0. C’  C 1. C  A B 2. A  a 3. B  a State I0: C’  •C C  •A B A  •a State I1: C’  C• State I2: C  A•B B  •a State I3: A  a• State I4: C  A B• State I5: B  a• goto(I0,C) goto(I0,a) goto(I0,A) goto(I2,a) goto(I2,B) I0 = closure({[C’  •C]}) I1 = goto(I0,C) = closure({[C’  C•]}) … start final
  • 37.
    Example SLR ParsingTable s3 acc s5 r2 r1 r3 a $ 0 1 2 3 4 5 C A B 1 2 4 State I0: C’  •C C  •A B A  •a State I1: C’  C• State I2: C  A•B B  •a State I3: A  a• State I4: C  A B• State I5: B  a• 1 2 4 5 3 0 start a A C B a Grammar: 0. C’  C 1. C  A B 2. A  a 3. B  a
  • 38.
    SLR and Ambiguity Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR, maybe LR(1)  Consider for example the unambiguous grammar S  L = R | R L  * R | id R  L FOLLOW(R) = {=, $} I0: S’  •S S  •L=R S  •R L  •*R L  •id R  •L I1: S’  S• I2: S  L•=R R  L• I3: S  R• I4: L  *•R R  •L L  •*R L  •id I5: L  id• I6: S  L=•R R  •L L  •*R L  •id I7: L  *R• I8: R  L• I9: S  L=R• action[2,=]=s6 action[2,=]=r5 no Has no SLR parsing table
  • 39.
    LR(1) Grammars  SLRtoo simple  LR(1) parsing uses lookahead to avoid unnec essary conflicts in parsing table  LR(1) item = LR(0) item + lookahead LR(0) item [A•] LR(1) item [A•, a]
  • 40.
    SLR Versus LR(1) Split the SLR states by adding LR(1) lookahead  Unambiguous grammar S  L = R | R L  * R | id R  L I2: S  L•=R R  L• action[2,=]=s6 Should not reduce, because no right-sentential form begins with R= split R  L• S  L•=R
  • 41.
    LR(1) Items  AnLR(1) item [A•, a] contains a lookahead terminal a, meaning  a lready on top of the stack, expect to see a  For items of the form [A•, a] the lookahead a is used to reduce A only if the next input is a  For items of the form [A•, a] with  the lookahead has no effect
  • 42.
    The Closure Operationfor LR(1) Items  Start with closure(I) = I  If [A•B, a]  closure(I) then for each production B in the grammar and each terminal b  FIRST(a) add the item [B•, b] to I if not already in I  Repeat 2 until no new items can be added
  • 43.
    The Goto Operationfor LR(1) Items  For each item [A•X, a]  I, add the set o f items closure({[AX•, a]}) to goto(I,X) if not already there  Repeat step 1 until no more items can be ad ded to goto(I,X)
  • 44.
    Example  Let I={ (S’ → •S, $) }  I0 = closure(I) = { S’ → •S, $ S → • C C, $ C → •c C, c/d C → •d, c/d }  goto(I0, S) = closure( {S’ → S •, $ } ) = {S’ → S •, $ } = I1 The grammar G S’ → S S → C C C → c C | d
  • 45.
    Exercise  Let I={ (S → C •C, $) }  I2 = closure(I) = ?  I3 = goto(I2, c) = ? The grammar G S’ → S S → C C C → c C | d
  • 46.
    Construction of thesets of LR(1) Items  Augment the grammar with a new start symbo l S’ and production S’S  Initially, set C = closure({[S’•S, $]}) (this is the start state of the DFA)  For each set of items I  C and each grammar symbol X  (NT) such that goto(I, X)  C and goto(I, X)  , add the set of items goto(I, X) t o C  Repeat 3 until no more sets can be added to C
  • 47.
  • 48.
    Construction of theCanonical LR(1) Parsing Tables  Augment the grammar with S’S  Construct the set C={I0,I1,…,In} of LR(1) items  State i of the parser is constructed from Ii  If [A•a, b]  Ii and goto(Ii,a)=Ij then set action[i,a]=shift j  If [A•, a]  Ii then set action[i,a]=reduce A (apply only if AS’)  If [S’S•, $] is in Ii then set action[i,$]=accept  If goto(Ii,A)=Ij then set goto[i,A]=j  Repeat 3 until no more entries added  The initial state i is the Ii holding item [S’•S,$]
  • 49.
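    Example
    The grammar G (productions numbered as used by the reduce entries):
        0. S’ → S
        1. S → C C
        2. C → c C
        3. C → d

    state |      ACTION      |  GOTO
          |  c     d     $   |  S   C
    ------+------------------+--------
      0   |  s3    s4        |  1   2
      1   |              acc |
      2   |  s6    s7        |      5
      3   |  s3    s4        |      8
      4   |  r3    r3        |
      5   |              r1  |
      6   |  s6    s7        |      9
      7   |              r3  |
      8   |  r2    r2        |
      9   |              r2  |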
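To see how the table above drives a parse, here is a small table-driven shift-reduce loop in Python. It is a sketch under stated assumptions: the dictionary encoding of ACTION and GOTO, the string form of the entries ("s3", "r2", "acc"), and the function name `parse` are choices made for this illustration; the production numbers 1 to 3 follow the reduce entries of the table.

```python
# Production number -> (left-hand side, length of right-hand side):
# (1) S -> C C, (2) C -> c C, (3) C -> d
PRODUCTIONS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}

ACTION = {
    0: {"c": "s3", "d": "s4"},
    1: {"$": "acc"},
    2: {"c": "s6", "d": "s7"},
    3: {"c": "s3", "d": "s4"},
    4: {"c": "r3", "d": "r3"},
    5: {"$": "r1"},
    6: {"c": "s6", "d": "s7"},
    7: {"$": "r3"},
    8: {"c": "r2", "d": "r2"},
    9: {"$": "r2"},
}
GOTO = {0: {"S": 1, "C": 2}, 2: {"C": 5}, 3: {"C": 8}, 6: {"C": 9}}


def parse(tokens):
    """Return the production numbers in the order they are reduced,
    or None if the table has no entry (syntax error)."""
    stack = [0]                      # stack of states, start state 0
    tokens = list(tokens) + ["$"]    # append the end marker
    pos, reductions = 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            return None              # empty entry: error
        if act == "acc":
            return reductions
        if act[0] == "s":            # shift: push the state, consume input
            stack.append(int(act[1:]))
            pos += 1
        else:                        # reduce: pop |rhs| states, then goto
            prod = int(act[1:])
            lhs, length = PRODUCTIONS[prod]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(prod)


print(parse("cdd"))  # [3, 2, 3, 1]
print(parse("cd"))   # None
```

The reduction sequence 3, 2, 3, 1 for the input c d d is the rightmost derivation S → C C → C d → c C d → c d d read in reverse. The input c d is rejected in state 4, where the table has no entry for $.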
  • 50.
    Example Grammar and LR(1) Items
    - Unambiguous LR(1) grammar:
        S → L = R | R
        L → * R | id
        R → L
    - Augment with S’ → S
    - LR(1) items (next slide)
  • 51.
    I0:  [S’ → •S, $]       goto(I0,S)=I1
         [S → •L=R, $]      goto(I0,L)=I2
         [S → •R, $]        goto(I0,R)=I3
         [L → •*R, =/$]     goto(I0,*)=I4
         [L → •id, =/$]     goto(I0,id)=I5
         [R → •L, $]        goto(I0,L)=I2
    I1:  [S’ → S•, $]
    I2:  [S → L•=R, $]      goto(I2,=)=I6
         [R → L•, $]
    I3:  [S → R•, $]
    I4:  [L → *•R, =/$]     goto(I4,R)=I7
         [R → •L, =/$]      goto(I4,L)=I8
         [L → •*R, =/$]     goto(I4,*)=I4
         [L → •id, =/$]     goto(I4,id)=I5
    I5:  [L → id•, =/$]
    I6:  [S → L=•R, $]      goto(I6,R)=I9
         [R → •L, $]        goto(I6,L)=I10
         [L → •*R, $]       goto(I6,*)=I11
         [L → •id, $]       goto(I6,id)=I12
    I7:  [L → *R•, =/$]
    I8:  [R → L•, =/$]
    I9:  [S → L=R•, $]
    I10: [R → L•, $]
    I11: [L → *•R, $]       goto(I11,R)=I13
         [R → •L, $]        goto(I11,L)=I10
         [L → •*R, $]       goto(I11,*)=I11
         [L → •id, $]       goto(I11,id)=I12
    I12: [L → id•, $]
    I13: [L → *R•, $]
  • 52.
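    Example LR(1) Parsing Table
    Grammar:
        1. S’ → S
        2. S → L = R
        3. S → R
        4. L → * R
        5. L → id
        6. R → L

    state |         ACTION          |    GOTO
          |  id    *     =     $    |  S   L   R
    ------+-------------------------+------------
      0   |  s5    s4               |  1   2   3
      1   |                    acc  |
      2   |              s6    r6   |
      3   |                    r3   |
      4   |  s5    s4               |      8   7
      5   |              r5    r5   |
      6   |  s12   s11              |      10  9
      7   |              r4    r4   |
      8   |              r6    r6   |
      9   |                    r2   |
      10  |                    r6   |
      11  |  s12   s11              |      10  13
      12  |                    r5   |
      13  |                    r4   |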
  • 53.
    LALR(1) Grammars
    - LR(1) parsing tables have many states
    - LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size
    - Less powerful than LR(1)
      - Will not introduce shift-reduce conflicts, because shifts do not use lookaheads
      - May introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages
    - SLR and LALR tables for a grammar always have the same number of states, which is never more (and usually far fewer) than the LR(1) table has
    - For a language like C: SLR and LALR > 100 states, LR(1) > 1000 states
  • 54.
    Constructing LALR Parsing Tables
    Two ways:
    1. Construction of the LALR parsing table from the sets of LR(1) items
       - Union the states
       - Requires much space and time
    2. Construction of the LALR parsing table from the sets of LR(0) items
       - Efficient
       - Used in practice
  • 55.
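    Example

    LALR(1):
    state |      ACTION      |  GOTO
          |  c     d     $   |  S   C
    ------+------------------+--------
      0   |  s36   s47       |  1   2
      1   |              acc |
      2   |  s36   s47       |      5
      36  |  s36   s47       |      89
      47  |  r3    r3    r3  |
      5   |              r1  |
      89  |  r2    r2    r2  |

    LR(1):
    state |      ACTION      |  GOTO
          |  c     d     $   |  S   C
    ------+------------------+--------
      0   |  s3    s4        |  1   2
      1   |              acc |
      2   |  s6    s7        |      5
      3   |  s3    s4        |      8
      4   |  r3    r3        |
      5   |              r1  |
      6   |  s6    s7        |      9
      7   |              r3  |
      8   |  r2    r2        |
      9   |              r2  |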
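The step from the LR(1) table to the LALR(1) table in this example can be reproduced mechanically once it is known which states share a core. The Python sketch below assumes the groups {3, 6}, {4, 7}, {8, 9} are given; the dictionary encoding, the string state names such as "36", and the function `merge_tables` are assumptions made for this illustration. It raises an error if two merged rows disagree, which is how a reduce-reduce conflict introduced by merging would show up.

```python
# LR(1) table for S -> C C, C -> c C | d (entries as in the example)
ACTION = {
    0: {"c": "s3", "d": "s4"},
    1: {"$": "acc"},
    2: {"c": "s6", "d": "s7"},
    3: {"c": "s3", "d": "s4"},
    4: {"c": "r3", "d": "r3"},
    5: {"$": "r1"},
    6: {"c": "s6", "d": "s7"},
    7: {"$": "r3"},
    8: {"c": "r2", "d": "r2"},
    9: {"$": "r2"},
}
GOTO = {0: {"S": 1, "C": 2}, 2: {"C": 5}, 3: {"C": 8}, 6: {"C": 9}}
GROUPS = [(3, 6), (4, 7), (8, 9)]  # states with identical cores (assumed known)


def merge_tables(action, goto, groups):
    """Union the rows of each group and rename shift/goto targets."""
    name = {s: str(s) for s in action}
    for group in groups:
        for s in group:
            name[s] = "".join(str(g) for g in group)  # e.g. 3 and 6 -> "36"

    def rename(entry):
        # Only shift entries refer to a state number.
        return "s" + name[int(entry[1:])] if entry[0] == "s" else entry

    new_action, new_goto = {}, {}
    for s, row in action.items():
        target = new_action.setdefault(name[s], {})
        for sym, entry in row.items():
            entry = rename(entry)
            if target.setdefault(sym, entry) != entry:
                raise ValueError(f"conflict in state {name[s]} on {sym}")
    for s, row in goto.items():
        target = new_goto.setdefault(name[s], {})
        for sym, t in row.items():
            target[sym] = name[t]
    return new_action, new_goto


lalr_action, lalr_goto = merge_tables(ACTION, GOTO, GROUPS)
for state, row in lalr_action.items():
    print(state, row, lalr_goto.get(state, {}))
```

For this grammar no conflict is raised, and the result has the seven states 0, 1, 2, 36, 47, 5, 89 shown in the LALR(1) table.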
  • 56.
    Constructing LALR(1) Parsing Tables
    1. Construct sets of LR(1) items
    2. Combine LR(1) sets with sets of items that share the same first part (the same core)

    I4:  [L → *•R, =/$]        I11: [L → *•R, $]
         [R → •L, =/$]              [R → •L, $]
         [L → •*R, =/$]             [L → •*R, $]
         [L → •id, =/$]             [L → •id, $]

    Combined:
         [L → *•R, =/$]
         [R → •L, =/$]
         [L → •*R, =/$]
         [L → •id, =/$]

    [L → *•R, =/$] is shorthand for the two items [L → *•R, =] and [L → *•R, $] in the same set
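Combining sets that share the same first part amounts to grouping the LR(1) sets by their core (the items with lookaheads stripped) and taking the union of each group. The Python sketch below shows this on a handful of sets from the example grammar (I4, I5, I9, I11, I12 of the LR(1) collection); the tuple encoding and the names `core` and `merge_by_core` are assumptions for this illustration, and a full construction would also have to redirect the goto transitions to the merged sets.

```python
from collections import defaultdict


def item_set(protos, lookaheads):
    """Expand (lhs, rhs, dot) cores with each lookahead: the a/b shorthand."""
    return frozenset((lhs, rhs, dot, la)
                     for lhs, rhs, dot in protos for la in lookaheads)


STAR_R = [("L", ("*", "R"), 1), ("R", ("L",), 0),
          ("L", ("*", "R"), 0), ("L", ("id",), 0)]

# A subset of the LR(1) sets of items, keyed by state number.
LR1_STATES = {
    4: item_set(STAR_R, "=$"),
    11: item_set(STAR_R, "$"),
    5: item_set([("L", ("id",), 1)], "=$"),
    12: item_set([("L", ("id",), 1)], "$"),
    9: item_set([("S", ("L", "=", "R"), 3)], "$"),
}


def core(state):
    """The first part of every item: production and dot, without lookahead."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


def merge_by_core(states):
    """Union all sets with equal cores; keys are tuples of merged numbers."""
    groups = defaultdict(list)
    for number, state in states.items():
        groups[core(state)].append(number)
    return {
        tuple(sorted(numbers)): frozenset().union(*(states[n] for n in numbers))
        for numbers in groups.values()
    }


merged = merge_by_core(LR1_STATES)
print(sorted(merged))  # [(4, 11), (5, 12), (9,)]
```

The merged set for (4, 11) carries the lookaheads =/$ on every item, which matches I4 in the LALR(1) item listing.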
  • 57.
    Example LALR(1) Grammar
    - Unambiguous LR(1) grammar:
        S → L = R | R
        L → * R | id
        R → L
    - Augment with S’ → S
    - LALR(1) items (next slide)
  • 58.
    I0:  [S’ → •S, $]       goto(I0,S)=I1
         [S → •L=R, $]      goto(I0,L)=I2
         [S → •R, $]        goto(I0,R)=I3
         [L → •*R, =/$]     goto(I0,*)=I4
         [L → •id, =/$]     goto(I0,id)=I5
         [R → •L, $]        goto(I0,L)=I2
    I1:  [S’ → S•, $]
    I2:  [S → L•=R, $]      goto(I2,=)=I6
         [R → L•, $]
    I3:  [S → R•, $]
    I4:  [L → *•R, =/$]     goto(I4,R)=I7
         [R → •L, =/$]      goto(I4,L)=I9
         [L → •*R, =/$]     goto(I4,*)=I4
         [L → •id, =/$]     goto(I4,id)=I5
    I5:  [L → id•, =/$]
    I6:  [S → L=•R, $]      goto(I6,R)=I8
         [R → •L, $]        goto(I6,L)=I9
         [L → •*R, $]       goto(I6,*)=I4
         [L → •id, $]       goto(I6,id)=I5
    I7:  [L → *R•, =/$]
    I8:  [S → L=R•, $]
    I9:  [R → L•, =/$]      (shorthand for the two items [R → L•, =] and [R → L•, $])
  • 59.
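    Example LALR(1) Parsing Table
    Grammar:
        1. S’ → S
        2. S → L = R
        3. S → R
        4. L → * R
        5. L → id
        6. R → L

    state |         ACTION          |    GOTO
          |  id    *     =     $    |  S   L   R
    ------+-------------------------+------------
      0   |  s5    s4               |  1   2   3
      1   |                    acc  |
      2   |              s6    r6   |
      3   |                    r3   |
      4   |  s5    s4               |      9   7
      5   |              r5    r5   |
      6   |  s5    s4               |      9   8
      7   |              r4    r4   |
      8   |                    r2   |
      9   |              r6    r6   |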
  • 60.
    LL, SLR, LR, LALR Summary
    - LL parse tables
      - Nonterminals × terminals → productions
      - Computed using FIRST/FOLLOW
    - LR parsing tables
      - LR states × terminals → shift/reduce actions
      - LR states × nonterminals → goto state transitions
      - Computed using closure/goto
    - A grammar is
      - LL(1) if its LL(1) parse table has no conflicts
      - SLR if its SLR parse table has no conflicts
      - LR(1) if its LR(1) parse table has no conflicts
      - LALR(1) if its LALR(1) parse table has no conflicts
  • 61.
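    LL, SLR, LR, LALR Grammars
    [Figure: diagram of the grammar classes LL(1), LR(0), SLR, LALR(1), and LR(1); LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1), and LL(1) ⊂ LR(1)]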
  • 62.