Unit-IV
SYNTAX DIRECTED TRANSLATION
Syntax Directed Definitions, Intermediate Code
Generation, Representation and Implementation,
Types and Declarations, Type Checking, Control
Flow Statements, Backpatching, Procedures.
Outline
• Introduction
• Variants of syntax trees
• Three-address code
• Types and declarations
• Translation of expressions
• Boolean expressions
• Flow of control statements
• Backpatching with Boolean expressions
• Backpatching with flow of control statements
• Case statements
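Intermediate Code Generation
• Translates the source program into an “intermediate language” that is:
  • simple,
  • CPU independent,
  • yet close in spirit to machine language.
• Position in the compiler:
  Parser -> (parse tree) -> Intermediate code generator -> (intermediate code) -> Code generator
• Regular expressions (RE) and context-free grammars (CFG) describe the lexical and syntactic structure of the language.
• Syntax-directed translation works on top of the recognizer for the language defined by the RE and CFG.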
ICG
Benefits
1. Retargeting is facilitated: a new back end can be attached to the existing front end.
2. Machine-independent code optimization can be applied.
3. Intermediate code is closer to the target program than the source program is.
4. Conversion from the source program to intermediate code, and construction of the later phases, becomes easier.
ICG
Topics covered under intermediate code generation:
• Intermediate languages (representation)
  a) High level (syntax tree, DAG, postfix)
  b) Lower level (three-address code)
• Syntax directed definitions and intermediate code for
  a) Assignment
  b) Declaration
  c) Boolean expressions
  d) Flow of control
  e) Arrays
  f) Backpatching
• Storage organisation strategy (symbol table)
Intermediate Language Representation
High level:
• syntax trees
• DAG (directed acyclic graph)
• postfix notation
Lower level:
• three-address code (quadruples)
  • we will use quadruples to discuss intermediate code generation
  • quadruples are close to machine instructions, but they are not actual machine instructions
• some programming languages have well-defined intermediate languages
  • Java: Java Virtual Machine
  • Prolog: Warren Abstract Machine
  • in fact, there are byte-code emulators to execute instructions in these intermediate languages
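Types of Intermediate Languages
• High-level graphical representations.
• Consider the assignment a := b * -c + b * -c.
Syntax tree:
assign
  a
  +
    *
      b
      uminus
        c
    *
      b
      uminus
        c
DAG (the common subexpression b * -c appears once and is shared by both operands of +):
assign
  a
  +
    *   (left and right operand of +)
      b
      uminus
        c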
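Steps for Syntax tree construction
• Consider the assignment a := b * -c + b * -c.
[Figure: syntax tree for a := b * -c + b * -c]
• The tree is built bottom-up with three functions:
1) mknode(op, left, right): creates an operator node with label op and pointers to its left and right operands.
2) mkleaf(id, entry): creates an identifier node with label id and a pointer to the symbol-table entry for the identifier.
3) mkleaf(num, value): creates a number node with label num and a val field holding the value of the number.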
Steps for Syntax tree construction
Example: a - 4 + c
p1 := mkleaf(id, entry a)
p2 := mkleaf(num, 4)
p3 := mknode('-', p1, p2)
p4 := mkleaf(id, entry c)
p5 := mknode('+', p3, p4)
Resulting tree:
p5: +
  p3: -
    p1: id (entry a)
    p2: num 4
  p4: id (entry c)
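The construction steps for a - 4 + c can be sketched in Python. This is a minimal illustration, not a definitive implementation: the `Node` class, its field names, and the `show` helper are assumptions made for the example, and a plain string stands in for a pointer to a symbol-table entry.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """One syntax-tree node (hypothetical layout chosen for this sketch)."""
    label: str                            # operator, "id", or "num"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Union[str, int, None] = None   # symbol-table entry name or number


def mknode(op, left, right=None):
    # Interior node: an operator with pointers to its operands.
    return Node(op, left, right)


def mkleaf(label, value):
    # Leaf node: an identifier (entry name) or a number (its value).
    return Node(label, value=value)


def show(n):
    # Fully parenthesised rendering, used only to inspect the tree.
    if n.left is None:
        return str(n.value)
    if n.right is None:
        return f"({n.label} {show(n.left)})"
    return f"({show(n.left)} {n.label} {show(n.right)})"


p1 = mkleaf("id", "a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)
print(show(p5))  # ((a - 4) + c)
```

The root p5 is the last node created, which reflects the bottom-up order of the steps.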
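Steps for DAG construction
• A directed acyclic graph (DAG) identifies common subexpressions: a node is created only if an identical node does not already exist.
• For a := b * -c + b * -c:
p1 := mkleaf(id, entry a)
p2 := mkleaf(id, entry b)
p3 := mkleaf(id, entry c)
p4 := mknode(uminus, p3, -)
p5 := mknode('*', p2, p4)
…..
…..
…..
DAG (nodes are labelled p1 to p7 in the figure):
assign
  a
  +
    *   (left and right operand of +)
      b
      uminus
        c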
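One common way to get this sharing is to look up each node's signature before creating it. The sketch below assumes that approach; the `nodes` and `table` names and the use of integer node ids instead of pointers are choices made for the illustration, not taken from the slides.

```python
nodes = {}   # signature (op, left, right) -> node id
table = []   # node id -> signature


def mknode(op, left=None, right=None):
    # Return the existing node with this signature, or create a new one.
    key = (op, left, right)
    if key not in nodes:
        nodes[key] = len(table)
        table.append(key)
    return nodes[key]


def mkleaf(label, value):
    # A leaf is a node whose only "child" is its entry name or value.
    return mknode(label, value)


# a := b * -c + b * -c
a = mkleaf("id", "a")
left = mknode("*", mkleaf("id", "b"), mknode("uminus", mkleaf("id", "c")))
right = mknode("*", mkleaf("id", "b"), mknode("uminus", mkleaf("id", "c")))
root = mknode("assign", a, mknode("+", left, right))

print(left == right)   # True: the second b * -c reuses the first node
print(len(table))      # 7
```

The second request for b * -c returns the node built for the first, so the DAG ends up with seven nodes where the syntax tree has eleven.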
Postfix Notation
[Figure: syntax tree for a := b * -c + b * -c]
• Postfix notation is a linearised representation of the syntax tree.
• It is obtained by a postorder (left, right, root) traversal of the tree:
a b c uminus * b c uminus * + assign
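The traversal can be sketched in a few lines of Python. The nested-tuple encoding of the tree, (operator, child, ...) with identifiers as plain strings, is an assumption made to keep the example short.

```python
def postfix(node):
    # Leaves are identifier names; interior nodes are (op, child, ...) tuples.
    if isinstance(node, str):
        return [node]
    op, *children = node
    out = []
    for child in children:      # visit the children left to right ...
        out += postfix(child)
    out.append(op)              # ... then emit the operator (the root)
    return out


tree = ("assign", "a",
        ("+",
         ("*", "b", ("uminus", "c")),
         ("*", "b", ("uminus", "c"))))

print(" ".join(postfix(tree)))
# a b c uminus * b c uminus * + assign
```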
Three address code-lower level representation
• In three-address code there is at most one operator on the right side of an instruction.
• Example: a + a * (b - c) + (b - c) * d
[Figure: DAG for the expression; the node for b - c is shared by both * nodes]
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
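The instruction sequence above can be produced by walking the expression bottom-up and allocating a fresh temporary for each operator. The sketch below is one possible way to do this; `gen_tac`, the tuple encoding (op, left, right), and the cache that makes b - c be computed only once are assumptions of the example.

```python
import itertools


def gen_tac(expr):
    """Return three-address instructions for a nested (op, left, right) tuple."""
    code = []
    cache = {}                       # subexpression -> temporary holding it
    counter = itertools.count(1)

    def visit(e):
        if isinstance(e, str):       # a plain name needs no instruction
            return e
        if e in cache:               # common subexpression: reuse its temporary
            return cache[e]
        op, left, right = e
        a, b = visit(left), visit(right)
        t = f"t{next(counter)}"
        code.append(f"{t} = {a} {op} {b}")
        cache[e] = t
        return t

    visit(expr)
    return code


bc = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", bc)), ("*", bc, "d"))
for line in gen_tac(expr):
    print(line)
# t1 = b - c
# t2 = a * t1
# t3 = a + t2
# t4 = t1 * d
# t5 = t3 + t4
```

Each emitted instruction has one operator on its right side, and the cache plays the role that the shared node plays in the DAG.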
Data structures for three address codes (Implementation)
• Quadruples
  • Have four fields: op, arg1, arg2 and result
• Triples
  • Temporaries are not used; instead, references to instructions are made
• Indirect triples
  • In addition to the triples, a list of pointers to triples is used
Example
• a = b * minus c + b * minus c
Three address code:
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Quadruples:
      op      arg1   arg2   result
(0)   minus   c             t1
(1)   *       b      t1     t2
(2)   minus   c             t3
(3)   *       b      t3     t4
(4)   +       t2     t4     t5
(5)   =       t5            a
Triples:
      op      arg1   arg2
(0)   minus   c
(1)   *       b      (0)
(2)   minus   c
(3)   *       b      (2)
(4)   +       (1)    (3)
(5)   =       a      (4)
Indirect triples (a list of pointers into the triple table):
instruction   pointer
35            (0)
36            (1)
37            (2)
38            (3)
39            (4)
40            (5)
      op      arg1   arg2
(0)   minus   c
(1)   *       b      (0)
(2)   minus   c
(3)   *       b      (2)
(4)   +       (1)    (3)
(5)   =       a      (4)
Implementations of 3-address statements
a = b * uminus c + b * uminus c
• Quadruples
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
      op       arg1   arg2   result
(0)   uminus   c             t1
(1)   *        b      t1     t2
(2)   uminus   c             t3
(3)   *        b      t3     t4
(4)   +        t2     t4     t5
(5)   :=       t5            a
Temporary names must be entered into the symbol table as they are created.
Implementations of 3-address statements, II
a = b * minus c + b * minus c
• Triples
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
      op       arg1   arg2
(0)   uminus   c
(1)   *        b      (0)
(2)   uminus   c
(3)   *        b      (2)
(4)   +        (1)    (3)
(5)   assign   a      (4)
Temporary names are not entered into the symbol table.
Implementations of 3-address statements
• Indirect Triples
Triples:
       op       arg1   arg2
(14)   uminus   c
(15)   *        b      (14)
(16)   uminus   c
(17)   *        b      (16)
(18)   +        (15)   (17)
(19)   assign   a      (18)
Statement list (pointers to triples):
      statement
(0)   (14)
(1)   (15)
(2)   (16)
(3)   (17)
(4)   (18)
(5)   (19)
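The relationship between the three implementations can be sketched in Python by starting from the quadruples and deriving the other two forms. The function name `to_triples`, the tuple layouts, and the use of `None` for an empty field are assumptions of this illustration.

```python
quads = [                       # (op, arg1, arg2, result)
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", None, "a"),
]


def to_triples(quads):
    """Replace each temporary by a reference to the triple that computes it."""
    where = {}                  # temporary name -> index of its triple
    triples = []

    def ref(x):
        return f"({where[x]})" if x in where else x

    for op, arg1, arg2, result in quads:
        if op == ":=":
            triples.append(("assign", result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            where[result] = len(triples) - 1
    return triples


triples = to_triples(quads)
# Indirect triples: a separate statement list that points into the triple table.
statements = list(range(len(triples)))

for i in statements:
    print(f"({i})", triples[i])
```

With quadruples the temporaries are explicit names; with triples a value is named by the position of the instruction that computes it. In the indirect form, execution order is held in the `statements` list, so statements can be reordered without renumbering the references inside the triples.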
Other types of 3-address statements
• e.g. ternary operations like
x[i] := y    and    x := y[i]
• require two or more entries in the triple structure, e.g.
x[i] := y
      op       arg1   arg2
(0)   []=      x      i
(1)   assign   (0)    y
x := y[i]
      op       arg1   arg2
(0)   =[]      y      i
(1)   assign   x      (0)
Types of Three-Address Statements.
Assignment statement (binary operator): x := y op z
Assignment statement (unary operator): x := op z
Copy statement: x := z
Unconditional jump: goto L
Conditional jump: if x relop y goto L
Stack operations: push / pop
More Advanced:
Procedure:
param x1
param x2
…
param xn
call p,n
Index Assignments:
x:=y[i]
x[i]:=y
Address and Pointer Assignments:
x:=&y
x:=*y
*x:=y
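As a small illustration of the procedure-call form, the sketch below emits the `param` / `call` sequence for a call with n arguments. The helper name `gen_call` is hypothetical, and the arguments are assumed to be names or temporaries that have already been evaluated.

```python
def gen_call(proc, args):
    """Emit one 'param' per argument, then 'call p,n' with the argument count."""
    code = [f"param {arg}" for arg in args]
    code.append(f"call {proc},{len(args)}")
    return code


for line in gen_call("p", ["x1", "x2", "x3"]):
    print(line)
# param x1
# param x2
# param x3
# call p,3
```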
Syntax directed translation
• Specifies the translation of a programming language (PL) construct in terms of attributes associated with its syntactic components.
Definition:
• A notational framework that converts a parse tree into an intermediate form for PL constructs such as declarations, flow-of-control statements, assignments, etc.
• It is an extension of a CFG.
• It allows subroutines or semantic actions to be attached to the productions of a CFG.
• The computation of the value associated with a grammar symbol is called translation.
Translations on the parse tree
Production        Semantic action
E -> E1 + E2      {E.val = E1.val + E2.val}
E -> digit        {E.val = digit}
Example: w = 1 + 2 + 3
Parse tree annotated with the values computed bottom-up:
E (E.val = 6)
  E (E.val = 3)
    E (E.val = 1)   digit 1
    +
    E (E.val = 2)   digit 2
  +
  E (E.val = 3)     digit 3
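Attribute types
• Synthesized attribute: the value at a node is computed from the attributes of its children.
  E -> E1 + E2   {E.val = E1.val + E2.val}
• Inherited attribute: the value at a node is computed from the attributes of its parent and/or siblings.
  A -> xyz       {y.val = 2 * A.val}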
Synthesized attribute
Desk calculator program (e.g. 3*5+4 n, where n marks the end of the line)
Production      Semantic action
L -> E n        print(E.val)
E -> E1 + T     E.val := E1.val + T.val
E -> T          E.val := T.val
T -> T1 * F     T.val := T1.val * F.val
T -> F          T.val := F.val
F -> ( E )      F.val := E.val
F -> digit      F.val := digit.lexval
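The desk calculator rules can be sketched as a small evaluator in which each function returns the synthesized attribute val of its nonterminal. This is an illustration under stated assumptions: it uses recursive descent with loops in place of the left-recursive productions, accepts single-digit operands only, and leaves out the end marker n and all error handling.

```python
def evaluate(text):
    """Compute E.val for an expression over single digits, +, * and ( )."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def F():
        tok = advance()
        if tok == "(":               # F -> ( E ):  F.val := E.val
            val = E()
            advance()                # consume ')'
            return val
        return int(tok)              # F -> digit:  F.val := digit.lexval

    def T():
        val = F()                    # T -> F:      T.val := F.val
        while peek() == "*":         # T -> T1 * F: T.val := T1.val * F.val
            advance()
            val = val * F()
        return val

    def E():
        val = T()                    # E -> T:      E.val := T.val
        while peek() == "+":         # E -> E1 + T: E.val := E1.val + T.val
            advance()
            val = val + T()
        return val

    return E()


print(evaluate("3*5+4"))   # 19
```

Every value flows upward from the digits to the root, which is what makes val a synthesized attribute.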
Inherited attribute
Syntax directed definition for declaration
Production      Semantic action
D -> T L        L.in := T.type
T -> int        T.type := integer
T -> real       T.type := real
L -> L1 , id    L1.in := L.in
                addtype(id.entry, L.in)
L -> id         addtype(id.entry, L.in)
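Declarations
(example: real id1, id2, id3)
Annotated parse tree; the numbers in parentheses are the node numbers of the dependency graph:
D
  T   T.type = real (4)
    real
  L   (5) L.in = real (6)
    L   (7) L.in = real (8)
      L   (9) L.in = real (10)
        id1 (1)
      ,
      id2 (2)
    ,
    id3 (3)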
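The way L.in carries the type down the list of identifiers can be sketched as follows. The dictionary `symtab`, the function names `declaration` and `decl_list`, and the list-of-names representation of L are assumptions of this example; only `addtype` and the attribute names come from the definition above.

```python
symtab = {}   # identifier entry -> type


def addtype(entry, typ):
    # Record the type of an identifier in the symbol table.
    symtab[entry] = typ


def decl_list(ids, inherited):
    """L -> L1 , id | id, with `inherited` playing the role of L.in."""
    if len(ids) > 1:
        decl_list(ids[:-1], inherited)   # L1.in := L.in
    addtype(ids[-1], inherited)          # addtype(id.entry, L.in)


def declaration(type_keyword, ids):
    """D -> T L: T.type is synthesized, then handed down as L.in."""
    t_type = {"int": "integer", "real": "real"}[type_keyword]
    decl_list(ids, t_type)               # L.in := T.type


declaration("real", ["id1", "id2", "id3"])
print(symtab)   # {'id1': 'real', 'id2': 'real', 'id3': 'real'}
```

The type is computed once at T and passed downward as a parameter, in contrast to val in the desk calculator, which is returned upward.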
