4. 4
Parse Tree
Represents the syntactic structure of a string according to some grammar.
"1+2+3"
exp → exp + exp
exp → exp - exp
exp → number
⇒
exp
  exp
    exp
      num
        1
    +
    exp
      num
        2
  +
  exp
    num
      3
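As a rough illustration, the tree for 1+2+3 can be written down as nested Python tuples. The tuple layout (label, child, ...) and the helper names num and leaves are assumptions made for this sketch; they do not come from the slides. Reading the leaves from left to right gives back the original string.

```python
# A parse tree node is (label, child, child, ...); a leaf is a plain string.
# This layout is an assumption chosen for illustration only.

def num(text):
    """Build the chain exp -> num -> text for a single number."""
    return ("exp", ("num", text))

# exp -> exp + exp, with the left exp expanded once more: (1 + 2) + 3
tree = ("exp",
        ("exp", num("1"), "+", num("2")),
        "+",
        num("3"))

def leaves(node):
    """Collect the leaves of the tree from left to right."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:
        result.extend(leaves(child))
    return result

print("".join(leaves(tree)))
```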
5. 5
Why Parse Tree?
Although this tree structure is cumbersome for us, it is very convenient for computers.
"<b>welcome to web page</b>"
html → elt html
html → ε
elt → word
elt → to html tc
to → <word>
tc → </word>
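[Figure: parse tree of the page above. The root html expands to elt html. The first elt expands to to, html, tc, where to matches <b> and tc matches </b>. The inner html starts with the elt for the word "welcome" and continues (…) with the remaining words, ending in html → ε.]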
6. 6
How Parse Tree?
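Top-down vs. Bottom-up
[Figure: the parse tree of "1+2+3" shown twice. In the top-down tree the construction steps are numbered ① to ⑤ from the root exp toward the leaves 1 2 3. In the bottom-up tree they are numbered ① to ⑤ from the leaves toward the root.]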
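One way to see the contrast is to list the tree nodes in the order each strategy would create them. The sketch below assumes a top-down parser creates a node before its children and a bottom-up parser creates it after them. The nested-tuple tree layout and the function names are illustrative assumptions, not something the slides define.

```python
# Parse tree of 1+2+3 as nested tuples: (label, child, ...); leaves are strings.
tree = ("exp",
        ("exp", ("exp", ("num", "1")), "+", ("exp", ("num", "2"))),
        "+",
        ("exp", ("num", "3")))

def text(node):
    """Return the part of the input covered by a node."""
    if isinstance(node, str):
        return node
    return "".join(text(child) for child in node[1:])

def top_down(node):
    """Order of node creation when parents come before children."""
    if isinstance(node, str):
        return []
    order = [f"{node[0]}[{text(node)}]"]
    for child in node[1:]:
        order.extend(top_down(child))
    return order

def bottom_up(node):
    """Order of node creation when children come before parents."""
    if isinstance(node, str):
        return []
    order = []
    for child in node[1:]:
        order.extend(bottom_up(child))
    order.append(f"{node[0]}[{text(node)}]")
    return order

print(top_down(tree))   # starts at the root exp[1+2+3]
print(bottom_up(tree))  # starts at num[1] and ends at the root
```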
7. 7
How Parse Tree?
STATE   ACTION              GOTO
        ,     a     $       LIST   ELE
0             s3            1      2
1       s4          acc
2       r2
3       r3          r3
4             s3                   5
5                   r1
STEP  STACK               INPUT  ACTION              TREE
1     0                   a, a$  shift 3             Node(a)
2     0 a 3               , a$   reduce 3            Tree(3)
3     0 ELE               , a$   GOTO 2
4     0 ELE 2             , a$   reduce 2            Tree(2)
5     0 LIST              , a$   GOTO 1
6     0 LIST 1            , a$   shift 4             Node(,)
7     0 LIST 1 , 4        a$     shift 3             Node(a)
8     0 LIST 1 , 4 a 3    $      reduce 3            Tree(3)
9     0 LIST 1 , 4 ELE    $      GOTO 5, reduce 1    Tree(1)
10    0 LIST              $      GOTO 1
11    0 LIST 1            $      accept              Return
Resulting tree (circled numbers are the steps that create each node):
LIST ⑨
  LIST ④
    ELEMENT ②
      a ①
  , ⑥
  ELEMENT ⑧
    a ⑦
G = ({LIST, ELEMENT}, {",", a}, P, LIST)
P:
(1) LIST → LIST , ELEMENT
(2) LIST → ELEMENT
(3) ELEMENT → a
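The shift/reduce steps on this slide can be sketched as a small table-driven loop. This is only an illustration under stated assumptions: the dictionary layout, the function name parse and the tuple-shaped tree nodes are made up for the sketch, and the reduce entries of states 2 and 5 are filled in for both "," and "$", which goes beyond what the flattened table shows.

```python
# ACTION[(state, lookahead)] -> ("s", next_state) | ("r", production) | ("acc",)
# Assumption: states 2 and 5 reduce on both "," and "$".
ACTION = {
    (0, "a"): ("s", 3),
    (1, ","): ("s", 4), (1, "$"): ("acc",),
    (2, ","): ("r", 2), (2, "$"): ("r", 2),
    (3, ","): ("r", 3), (3, "$"): ("r", 3),
    (4, "a"): ("s", 3),
    (5, ","): ("r", 1), (5, "$"): ("r", 1),
}
GOTO = {(0, "LIST"): 1, (0, "ELE"): 2, (4, "ELE"): 5}

# production number -> (left-hand side, number of symbols on the right)
PRODUCTIONS = {1: ("LIST", 3), 2: ("LIST", 1), 3: ("ELE", 1)}

def parse(tokens):
    """Bottom-up parse; returns the tree as nested tuples (label, child, ...)."""
    tokens = list(tokens) + ["$"]
    states = [0]   # stack of parser states
    nodes = []     # stack of tree fragments built so far
    pos = 0
    while True:
        look = tokens[pos]
        entry = ACTION.get((states[-1], look))
        if entry is None:
            raise SyntaxError(f"unexpected {look!r} in state {states[-1]}")
        if entry[0] == "s":                 # shift: push token and new state
            states.append(entry[1])
            nodes.append(look)
            pos += 1
        elif entry[0] == "r":               # reduce: pop the body, push the head
            head, length = PRODUCTIONS[entry[1]]
            children = nodes[-length:]
            del nodes[-length:]
            del states[-length:]
            nodes.append((head, *children))
            states.append(GOTO[(states[-1], head)])
        else:                               # accept
            return nodes[0]

print(parse(["a", ",", "a"]))
```

Keeping a second stack of tree fragments next to the state stack is one simple way to build the tree during reductions; other designs store the fragment together with the state in a single stack.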
8. 8
Abstract Syntax Tree
Does not represent every detail that appears in the real syntax.
[Figure: the parse tree of "1+2+3" with its exp and num nodes]
⇒
+
  +
    1
    2
  3
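A minimal sketch of that simplification, assuming the parse tree is stored as nested tuples (label, child, ...) with string leaves. The layout and the function name to_ast are assumptions for illustration. Nodes with a single child are skipped, and an exp with three children becomes (operator, left, right).

```python
# Parse tree of 1+2+3: (label, child, ...); leaves are plain strings.
parse_tree = ("exp",
              ("exp", ("exp", ("num", "1")), "+", ("exp", ("num", "2"))),
              "+",
              ("exp", ("num", "3")))

def to_ast(node):
    """Collapse a parse tree into a smaller abstract syntax tree."""
    label, *children = node
    if label == "num":
        return int(children[0])          # keep only the value of the number
    if len(children) == 1:
        return to_ast(children[0])       # skip chain nodes such as exp -> num
    left, op, right = children           # exp -> exp + exp
    return (op, to_ast(left), to_ast(right))

print(to_ast(parse_tree))
```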
9. 9
Python Lex-Yacc
A computer program that generates a parser.
Input → Yacc → Parser
Token → Parser → Parse Tree
Definition section
%%
Rules section
%%
C code section
18. 18
Tracking Line Number
PLY tracks the line number and position of all tokens.
def p_exp(p):
    'exp : exp PLUS exp'
    line = p.lineno(2)    # line number of the PLUS token
    index = p.lexpos(2)   # position of the PLUS token