• Syntax Trees: Structure
• 1. Expressions:
• leaves are identifiers or constants;
• internal nodes are labeled with operators;
• the children of a node are its operands.
• 2. Statements:
• a node’s label indicates what kind of statement it is;
• the children correspond to the components of the statement.
Syntax-Directed Translation of Abstract Syntax Trees:
Production          Semantic Rule
S → id := E         S.nptr := mknode(‘:=’, mkleaf(id, id.entry), E.nptr)
E → E1 + E2         E.nptr := mknode(‘+’, E1.nptr, E2.nptr)
E → E1 * E2         E.nptr := mknode(‘*’, E1.nptr, E2.nptr)
E → – E1            E.nptr := mknode(‘uminus’, E1.nptr)
E → ( E1 )          E.nptr := E1.nptr
E → id              E.nptr := mkleaf(id, id.entry)
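The rules above can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the `Node` class, the `show` helper, and the use of a plain string in place of a symbol-table entry are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                                  # operator name, or 'id' for a leaf
    children: List["Node"] = field(default_factory=list)
    entry: Optional[str] = None                 # stands in for id.entry (leaves only)


def mkleaf(label: str, entry: str) -> Node:
    """Create a leaf for an identifier."""
    return Node(label, [], entry)


def mknode(label: str, *children: Node) -> Node:
    """Create an interior node labeled with an operator."""
    return Node(label, list(children))


def show(n: Node) -> str:
    """Render the tree in prefix form (hypothetical helper for inspection)."""
    if n.entry is not None:
        return n.entry
    return "(" + " ".join([n.label] + [show(c) for c in n.children]) + ")"


def b_times_minus_c() -> Node:
    return mknode("*", mkleaf("id", "b"), mknode("uminus", mkleaf("id", "c")))


# Tree for: a := b * -c + b * -c
tree = mknode(":=", mkleaf("id", "a"),
              mknode("+", b_times_minus_c(), b_times_minus_c()))
print(show(tree))
```

Each call mirrors one semantic rule: the parser would invoke `mknode` or `mkleaf` when it reduces by the corresponding production.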
2) Postfix notation: operations on values stored on an
operand stack (similar to JVM bytecode)
• a := b * – c + b * – c
• Postfix expression: a b c uminus * b c uminus * + assign
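A postfix string is simply a post-order walk of the syntax tree. The sketch below assumes the tree is encoded as nested Python tuples of the form `(operator, operand, ...)` with plain strings as leaves; that encoding is chosen for the example and is not part of the slides.

```python
def postfix(tree):
    """Return the postfix token list for a tuple-encoded syntax tree."""
    if isinstance(tree, str):          # leaf: identifier or constant
        return [tree]
    op, *operands = tree
    tokens = []
    for child in operands:             # operands first, left to right
        tokens += postfix(child)
    return tokens + [op]               # operator last


term = ("*", "b", ("uminus", "c"))
stmt = ("assign", "a", ("+", term, term))
print(" ".join(postfix(stmt)))
```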
3) Three-address code (e.g. triples and quads):
• Instructions are of the form ‘x = y op z’, where x, y, z
are variables, constants, or “temporaries”.
• At most one operator is allowed on the RHS, so there are no
“built-up” expressions.
Types of Three-Address Statements:
1. Assignment statements of the form x := y op z, where op is a binary
arithmetic or logical operation.
2. Assignment instructions of the form x := op y, where op is a unary
operation. Essential unary operations include unary minus, logical
negation, shift operators, and conversion operators that, for
example, convert a fixed-point number to a floating-point number.
3. Copy statements of the form x := y, where the value of y is
assigned to x.
4. The unconditional jump goto L. The three-address statement with
label L is the next to be executed.
5. Conditional jumps such as if x relop y goto L. This instruction
applies a relational operator (<, =, >=, etc.) to x and y, and
executes the statement with label L next if x stands in relation
relop to y. If not, the three-address statement following if x relop y
goto L is executed next, as in the usual sequence.
6. Indexed assignments of the form x := y[i] and x[i] := y. The first
of these sets x to the value in the location i memory units
beyond location y. The statement x[i] := y sets the contents of
the location i units beyond x to the value of y. In both these
instructions, x, y, and i refer to data objects.
7. Procedure calls: param x and call p, n for calling a procedure p
with n parameters, and return y, where y is the value returned by the
procedure:
param x1
param x2
. . .
param xn
call p, n
8. Pointer assignments: x := &y, x := *y, and *x := y, where &y stands
for the address of y, and *y for the value stored at the location that y points to.
a = b + c * d;
    t1 = c * d
    t2 = b + t1
    a = t2
while (a < b) { a = b * c / 25; } a = b - 23^2;
L1: if a < b goto L2
    goto L3
L2: t1 = b * c
    t2 = t1 / 25
    a = t2
    goto L1
L3: t3 = 23^2
    t4 = b - t3
    a = t4
Syntax-Directed Translation into Three-Address Code:
• E.place: the name that will hold the value of E.
• E.code: holds the three-address code
statements that evaluate E.
• Use function newtemp, which returns a new
temporary variable that we can use.
• Use function gen to generate a single three-address
statement given the necessary
information (variable names and operations).
• Synthesized attributes:
• S.code : three-address code for S
• S.begin : label to start of S or nil
• S.after : label to end of S or nil
• E.code : three-address code for E
• E.place : a name holding the value of E
Production       Semantic Rules
S → id := E      S.code := E.code || gen(id.place ‘:=’ E.place)
E → E1 + E2      E.place := newtemp;
                 E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘+’ E2.place)
E → E1 * E2      E.place := newtemp;
                 E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘*’ E2.place)
E → – E1         E.place := newtemp;
                 E.code := E1.code || gen(E.place ‘:=’ ‘uminus’ E1.place)
E → ( E1 )       E.place := E1.place;
                 E.code := E1.code
E → id           E.place := id.place;
                 E.code := ‘’
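As a rough illustration of these rules, the following sketch walks a tuple-encoded expression and appends three-address statements to a list. The `Translator` class, the tuple encoding, and the choice to allocate each temporary after its operands have been translated are assumptions made for the example.

```python
import itertools


class Translator:
    def __init__(self):
        self._counter = itertools.count(1)
        self.code = []                      # plays the role of the concatenated E.code

    def newtemp(self):
        return f"t{next(self._counter)}"

    def gen(self, text):
        self.code.append(text)

    def expr(self, e):
        """Emit code for e and return E.place."""
        if isinstance(e, str):              # E -> id: no code, place is the name
            return e
        if e[0] == "uminus":                # E -> - E1
            operand = self.expr(e[1])
            place = self.newtemp()
            self.gen(f"{place} := uminus {operand}")
            return place
        op, left, right = e                 # E -> E1 + E2 | E1 * E2
        left_place = self.expr(left)
        right_place = self.expr(right)
        place = self.newtemp()
        self.gen(f"{place} := {left_place} {op} {right_place}")
        return place

    def assign(self, name, e):              # S -> id := E
        self.gen(f"{name} := {self.expr(e)}")


tr = Translator()
term = ("*", "b", ("uminus", "c"))
tr.assign("a", ("+", term, term))
print("\n".join(tr.code))
```

Parentheses need no case of their own here because the tuple nesting already records grouping, which matches the rule E → ( E1 ) passing E1.place and E1.code through unchanged.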
Implementations of Three-Address Statements
• Quadruples:
• A quadruple is a record structure with four fields, which we
call op, arg1, arg2, and result. The op field contains an internal
code for the operator. The three-address statement
x := y op z is represented by placing y in arg1, z in arg2, and x
in result.
• Example
Three-address code:
t1 := – c
t2 := b * t1
t3 := – c
t4 := b * t3
t5 := t2 + t4
a := t5
#     op       arg1   arg2   result
(0)   uminus   c             t1
(1)   *        b      t1     t2
(2)   uminus   c             t3
(3)   *        b      t3     t4
(4)   +        t2     t4     t5
(5)   :=       t5            a
• Triples:
• In quadruples, all temporary names are entered into the symbol table. To
avoid entering temporary names into the symbol table, we can refer to
a temporary value by the position of the statement that computes it.
A triple is a data structure with three fields: op, arg1, and arg2.
#     op       arg1   arg2
(0)   uminus   c
(1)   *        b      (0)
(2)   uminus   c
(3)   *        b      (2)
(4)   +        (1)    (3)
(5)   assign   a      (4)
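The relationship between the two representations can be sketched as a small conversion routine. It assumes quadruples are given as `(op, arg1, arg2, result)` tuples and that every non-assignment result is a temporary; both are simplifications made for this example.

```python
def quads_to_triples(quads):
    """Convert (op, arg1, arg2, result) quadruples to (op, arg1, arg2) triples."""
    position = {}        # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # Replace a temporary by a reference to the position of its triple.
        return f"({position[arg]})" if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == ":=":
            triples.append(("assign", result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples


quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", None, "a"),
]
for i, triple in enumerate(quads_to_triples(quads)):
    print(f"({i})", triple)
```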
• Ternary operations like x[i] := y and x := y[i]
require two entries each:
(a) x[i] := y
#     op       arg1   arg2
(0)   [ ]=     x      i
(1)   assign   (0)    y
(b) x := y[i]
#     op       arg1   arg2
(0)   =[ ]     y      i
(1)   assign   x      (0)
• Indirect Triples:
• Another implementation of three-address code
lists pointers to triples, rather than
the triples themselves. This implementation is naturally
called indirect triples.
#     statement        #      op       arg1   arg2
(0)   (14)             (14)   uminus   c
(1)   (15)             (15)   *        b      (14)
(2)   (16)             (16)   uminus   c
(3)   (17)             (17)   *        b      (16)
(4)   (18)             (18)   +        (15)   (17)
(5)   (19)             (19)   assign   a      (18)
Fig. 6.13: Indirect triples
Declarations
P -> MD { }
M -> epsilon {offset:=0 }
D -> id : T { addtype(id.entry, T.type, offset);
              offset := offset + T.width }
T -> char { T.type := char; T.width := 1 }
T -> integer { T.type := integer; T.width := 4 }
T -> array [ num ] of T1
            { T.type := array(1..num.val, T1.type);
              T.width := num.val * T1.width }
T -> ^T1 { T.type := pointer(T1.type); T.width := 4 }
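The offset bookkeeping can be illustrated with a short sketch. Types are encoded here as strings or tuples, and a list of several declarations is processed in sequence; the encoding, the `layout` name, and the multi-declaration loop are assumptions made for the example (the grammar above shows a single declaration).

```python
def width(t):
    """Width in bytes of a type expression, using the widths from the scheme."""
    if t == "char":
        return 1
    if t == "integer":
        return 4
    if t[0] == "pointer":              # ("pointer", target_type)
        return 4
    if t[0] == "array":                # ("array", num, element_type)
        _, num, elem = t
        return num * width(elem)
    raise ValueError(f"unknown type: {t!r}")


def layout(decls):
    """decls: list of (name, type). Returns ({name: (type, offset)}, total width)."""
    offset = 0                         # M -> epsilon { offset := 0 }
    table = {}
    for name, t in decls:
        table[name] = (t, offset)      # addtype(id.entry, T.type, offset)
        offset += width(t)             # offset := offset + T.width
    return table, offset


decls = [
    ("c", "char"),
    ("n", "integer"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "char")),
]
table, total = layout(decls)
for name, (t, off) in table.items():
    print(name, off)
```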
Assignment Statements
Production       Semantic Rules
S → id := E      p := lookup(id.name);
                 if p ≠ nil then emit(p ‘:=’ E.place)
                 else error
E → E1 + E2      E.place := newtemp;
                 emit(E.place ‘:=’ E1.place ‘+’ E2.place)
E → E1 * E2      E.place := newtemp;
                 emit(E.place ‘:=’ E1.place ‘*’ E2.place)
E → – E1         E.place := newtemp;
                 emit(E.place ‘:=’ ‘uminus’ E1.place)
E → ( E1 )       E.place := E1.place
E → id           p := lookup(id.name);
                 if p ≠ nil then E.place := p
                 else error
Array
• Grammar for addressing array elements:
S -> L := E
E -> E + E | ( E ) | L
L -> Elist ] | id
Elist -> Elist , E | id [ E
• Synthesized attributes:
E.place        name of temp holding value of E
Elist.array    array name
Elist.place    name of temp holding index value
Elist.ndim     number of array dimensions
L.place        lvalue (= name of temp)
L.offset       index into array (= name of temp); null indicates a
               non-array simple id
• S -> L := E { if L.offset = null then
                  emit(L.place ‘:=’ E.place)
                else
                  emit(L.place[L.offset] ‘:=’ E.place) }
• E -> E1 + E2 { E.place := newtemp();
                 emit(E.place ‘:=’ E1.place ‘+’ E2.place) }
• E -> ( E1 ) { E.place := E1.place }
• E -> L { if L.offset = null then
             E.place := L.place
           else
             E.place := newtemp();
             emit(E.place ‘:=’ L.place[L.offset]) }
• L -> Elist ] { L.place := newtemp();
                 L.offset := newtemp();
                 emit(L.place ‘:=’ c(Elist.array));
                 emit(L.offset ‘:=’ Elist.place ‘*’ width(Elist.array)) }
• L -> id { L.place := id.place; L.offset := null }
• Elist -> Elist1 , E
               { t := newtemp(); m := Elist1.ndim + 1;
                 emit(t ‘:=’ Elist1.place ‘*’ limit(Elist1.array, m));
                 emit(t ‘:=’ t ‘+’ E.place);
                 Elist.array := Elist1.array; Elist.place := t;
                 Elist.ndim := m }
• Elist -> id [ E { Elist.array := id.place; Elist.place := E.place;
                    Elist.ndim := 1 }
Translation scheme for addressing array elements
• X[i, j] := Y[i + j, k] + z
• The maximum dimensions of X are [d1, d2] and of Y are [d3, d4].
• The intermediate three-address code for this statement can be written by
intuition as follows:
1. t1 := i * d2
2. t2 := t1 + j
3. t3 := c(X)
4. t4 := t2 * width(X)
5. t5 := i + j
6. t6 := t5 * d4
7. t7 := t6 + k
8. t8 := c(Y)
9. t9 := t7 * width(Y)
10. t10 := t8[t9]
11. t11 := t10 + z
12. t3[t4] := t11
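The index arithmetic in steps 1, 2 and 4 (and again in 6, 7 and 9) follows the Elist rules: multiply the running index by the size of the next dimension, add the next subscript, and finally scale by the element width. A small numeric sketch, with the function name and the list-based arguments assumed for the example:

```python
def element_offset(indices, dims, width):
    """Row-major byte offset relative to the array's base constant c(array)."""
    e = indices[0]                            # Elist -> id [ E
    for m in range(1, len(indices)):          # Elist -> Elist1 , E
        # dims[m] is the size of dimension m+1, i.e. limit(array, m+1)
        e = e * dims[m] + indices[m]
    return e * width                          # L.offset := Elist.place * width(array)


# X[i, j] with i = 2, j = 3, dimensions [4, 5] and 4-byte elements
print(element_offset([2, 3], [4, 5], 4))
```

Lower bounds do not appear because the scheme folds them into the constant c(array).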
• A[B] = C[D[2]] + E * F, where A, C, and D are arrays with 4-byte elements
• Three-address code for the statement above:
t1 = address of A
t2 = B * 4
t3 = address of D
t4 = 2 * 4
t5 = t3 [ t4]
t6 = address of C
t7 = t5 * 4
t8 = t6[t7]
t9 = E * F
t10 = t8 + t9
t1[t2] = t10
Type conversions within assignments:
E -> E1 + E2 { E.place := newtemp;
if E1.type = integer and E2.type = integer then begin
emit (E.place ':=' E1.place 'int+' E2.place);
E.type := integer;
end
else if E1.type = real and E2.type = real then begin
emit (E.place ':=' E1.place 'real+' E2.place);
E.type := real;
end
else if E1.type = integer and E2.type = real then begin
u := newtemp;
emit (u ':=' 'inttoreal' E1.place);
emit (E.place ':=' u 'real+' E2.place );
E.type := real;
end
else if E1.type = real and E2.type = integer then begin
u := newtemp;
emit (u ':=' 'inttoreal' E2.place);
emit (E.place ':=' E1.place 'real+' u);
E.type := real;
end
else
E.type := type_error
}
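A direct transcription of this rule into Python might look like the sketch below. Operands are passed as `(place, type)` pairs and the `Emitter` class is a hypothetical stand-in for the translator's `newtemp` and `emit`.

```python
import itertools


class Emitter:
    def __init__(self):
        self._temps = itertools.count(1)
        self.code = []

    def newtemp(self):
        return f"t{next(self._temps)}"

    def emit(self, text):
        self.code.append(text)

    def add(self, e1, e2):
        """e1, e2 are (place, type) pairs; returns (place, type) for E1 + E2."""
        (p1, t1), (p2, t2) = e1, e2
        place = self.newtemp()                       # E.place := newtemp
        if t1 == "integer" and t2 == "integer":
            self.emit(f"{place} := {p1} int+ {p2}")
            return place, "integer"
        if t1 == "real" and t2 == "real":
            self.emit(f"{place} := {p1} real+ {p2}")
            return place, "real"
        if t1 == "integer" and t2 == "real":
            u = self.newtemp()                       # widen the left operand
            self.emit(f"{u} := inttoreal {p1}")
            self.emit(f"{place} := {u} real+ {p2}")
            return place, "real"
        if t1 == "real" and t2 == "integer":
            u = self.newtemp()                       # widen the right operand
            self.emit(f"{u} := inttoreal {p2}")
            self.emit(f"{place} := {p1} real+ {u}")
            return place, "real"
        return place, "type_error"


em = Emitter()
result = em.add(("i", "integer"), ("y", "real"))
print(result)
print("\n".join(em.code))
```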
Boolean Expressions:
• Boolean expressions are used either to compute logical
values or as conditional expressions in flow-of-control
statements.
• We consider Boolean Expressions with the following
grammar:
• E → E or E | E and E | not E | (E) | id relop id | true | false
• There are two methods to evaluate Boolean Expressions:
• 1. Numerical Representation: Encode true with ‘1’ and
false with ‘0’ and we proceed analogously to arithmetic
expressions.
• 2. Jumping Code: We represent the value of a Boolean
Expression by a position reached in a program.
Numerical Representation of Boolean Expressions:
• The translation for a or (b and (not c)) is:
t1 := not c
t2 := b and t1
t3 := a or t2
• A relational expression such as a<b is equivalent to the conditional
statement if a<b then 1 else 0 and its translation involves jumps to
labeled statements:
100 : if a < b goto 103
101 : t := 0
102 : goto 104
103 : t := 1
104 :
Translation scheme using a numerical representation for Booleans
E -> E1 or E2 { E.place := newtemp;
                emit(E.place ‘:=’ E1.place ‘or’ E2.place) }
E -> E1 and E2 { E.place := newtemp;
                 emit(E.place ‘:=’ E1.place ‘and’ E2.place) }
E -> not E1 { E.place := newtemp; emit(E.place ‘:=’ ‘not’ E1.place) }
E -> (E1) { E.place := E1.place }
E -> id1 relop id2 { E.place := newtemp;
                     emit(‘if’ id1.place relop.op id2.place ‘goto’ nextstat + 3);
                     emit(E.place ‘:=’ ‘0’);
                     emit(‘goto’ nextstat + 2);
                     emit(E.place ‘:=’ ‘1’) }
E -> true { E.place := newtemp; emit(E.place ‘:=’ ‘1’) }
E -> false { E.place := newtemp; emit(E.place ‘:=’ ‘0’) }
Three-address code for a < b or c < d and e < f using the above translation
scheme:
100: if a < b goto 103
101: t1 := 0
102: goto 104
103: t1 := 1
104: if c < d goto 107
105: t2 := 0
106: goto 108
107: t2 := 1
108: if e < f goto 111
109: t3 := 0
110: goto 112
111: t3 := 1
112: t4 := t2 and t3
113: t5 := t1 or t4
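The listing can be reproduced with a small sketch of the numerical scheme. The `NumericBool` class, the tuple encoding of expressions, and the starting statement number 100 are assumptions made for this example; `nextstat` is computed from the number of statements emitted so far.

```python
class NumericBool:
    def __init__(self, start=100):
        self.start = start
        self.code = []
        self.ntemp = 0

    @property
    def nextstat(self):
        return self.start + len(self.code)

    def newtemp(self):
        self.ntemp += 1
        return f"t{self.ntemp}"

    def emit(self, text):
        self.code.append(f"{self.nextstat}: {text}")

    def expr(self, e):
        """Emit code for e and return E.place."""
        if isinstance(e, str):                      # plain Boolean variable
            return e
        kind = e[0]
        if kind in ("or", "and"):
            left, right = self.expr(e[1]), self.expr(e[2])
            place = self.newtemp()
            self.emit(f"{place} := {left} {kind} {right}")
        elif kind == "not":
            operand = self.expr(e[1])
            place = self.newtemp()
            self.emit(f"{place} := not {operand}")
        else:                                       # ("rel", op, id1, id2)
            _, op, a, b = e
            place = self.newtemp()
            self.emit(f"if {a} {op} {b} goto {self.nextstat + 3}")
            self.emit(f"{place} := 0")
            self.emit(f"goto {self.nextstat + 2}")
            self.emit(f"{place} := 1")
        return place


g = NumericBool()
g.expr(("or", ("rel", "<", "a", "b"),
              ("and", ("rel", "<", "c", "d"), ("rel", "<", "e", "f"))))
print("\n".join(g.code))
```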
Flow of Control Statements
S -> if E then S
| if E then S1 else S2
| while E do S
• We associate with E two labels using inherited attributes:
• 1. E.true, the label to which control flows if E is true;
• 2. E.false, the label to which control flows if E is false.
• We associate to S the inherited attribute S.next that
represents the label attached to the first statement after
the code for S.
S → if E then S1
{ E.true := newlabel;
  E.false := S.next;
  S1.next := S.next;
  S.code := E.code
         || gen(E.true ‘:’)
         || S1.code
}
(a) SDD for if statements
S → if E then S1 else S2
{ E.true := newlabel;
  E.false := newlabel;
  S1.next := S.next;
  S2.next := S.next;
  S.code := E.code
         || gen(E.true ‘:’)
         || S1.code
         || gen(‘goto’ S.next)
         || gen(E.false ‘:’)
         || S2.code
}
(b) SDD for if-else statements
S → while E do S1
{ S.begin := newlabel;
  E.true := newlabel;
  E.false := S.next;
  S1.next := S.begin;
  S.code := gen(S.begin ‘:’)
         || E.code
         || gen(E.true ‘:’)
         || S1.code
         || gen(‘goto’ S.begin)
}
(c) SDD for while statements
Control-Flow Translation of Boolean Expressions
• Boolean expressions are translated into a sequence of conditional and
unconditional jumps to either E.true or E.false.
• a < b
• The code is of the form:
if a < b goto E.true
goto E.false
• E1 or E2
• If E1 is true then E is true, so E1.true = E.true. Otherwise, E2 must be evaluated,
so E1.false is set to the label of the first statement in the code for E2.
• E1 and E2
• Analogous considerations apply: if E1 is false then E is false, so E1.false = E.false.
Otherwise, E2 must be evaluated, so E1.true is set to the label of the first statement
in the code for E2.
• not E1
• We just interchange the true and false exits of E1 with those of E.
Syntax-directed definition to produce three-address code for Booleans
Production           Semantic Rules
E → E1 or E2         E1.true := E.true; E1.false := newlabel;
                     E2.true := E.true; E2.false := E.false;
                     E.code := E1.code || gen(E1.false ‘:’) || E2.code
E → E1 and E2        E1.true := newlabel; E1.false := E.false;
                     E2.true := E.true; E2.false := E.false;
                     E.code := E1.code || gen(E1.true ‘:’) || E2.code
E → not E1           E1.true := E.false; E1.false := E.true;
                     E.code := E1.code
E → ( E1 )           E1.true := E.true; E1.false := E.false;
                     E.code := E1.code
E → id1 relop id2    E.code := gen(‘if’ id1.addr relop.op id2.addr ‘goto’ E.true) ||
                     gen(‘goto’ E.false)
E → true             E.code := gen(‘goto’ E.true)
E → false            E.code := gen(‘goto’ E.false)
a < b or c < d and e < f
     if a < b goto Ltrue
     goto L1
L1 : if c < d goto L2
     goto Lfalse
L2 : if e < f goto Ltrue
     goto Lfalse
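The jumping code for this expression can be generated by passing the inherited labels down the tree. The sketch below is an illustration under assumptions: expressions are nested tuples, `newlabel` is modeled by a label generator, and labels are emitted on their own lines.

```python
import itertools


def jumping_code(e, true, false, labels):
    """Return the instruction list for e, given inherited labels true and false."""
    kind = e[0]
    if kind == "or":
        mid = next(labels)                          # E1.false := newlabel
        return (jumping_code(e[1], true, mid, labels)
                + [f"{mid}:"]
                + jumping_code(e[2], true, false, labels))
    if kind == "and":
        mid = next(labels)                          # E1.true := newlabel
        return (jumping_code(e[1], mid, false, labels)
                + [f"{mid}:"]
                + jumping_code(e[2], true, false, labels))
    if kind == "not":                               # swap the two exits
        return jumping_code(e[1], false, true, labels)
    _, op, a, b = e                                 # ("rel", op, id1, id2)
    return [f"if {a} {op} {b} goto {true}", f"goto {false}"]


labels = (f"L{i}" for i in itertools.count(1))
expr = ("or", ("rel", "<", "a", "b"),
              ("and", ("rel", "<", "c", "d"), ("rel", "<", "e", "f")))
code = jumping_code(expr, "Ltrue", "Lfalse", labels)
print("\n".join(code))
```

Note that `goto L1` immediately followed by `L1:` is redundant; the scheme produces it anyway and leaves its removal to a later optimization pass.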
while a < b do
    if c < d then
        x := y + z
    else
        x := y – z
L1 :    if a < b goto L2
        goto Lnext
L2 :    if c < d goto L3
        goto L4
L3 :    t1 := y + z
        x := t1
        goto L1
L4 :    t2 := y – z
        x := t2
        goto L1
Lnext :
L0: if a < c goto L1
    goto snext
L1: if b < d goto L2
    goto snext
L2: if a == 1 goto L3
    goto L4
L3: t1 = c + 1
    c = t1
    goto L0
L4: if a <= d goto L5
    goto L0
L5: t2 = a + 2
    a = t2
    goto L4
if a < b then
    while c > d do
        x = x + y;
else
    do
        p = p + q
    while (e <= f)
    if a < b goto L1
    goto L2
L1: if c > d goto L3
    goto snext
L3: t1 = x + y
    x = t1
    goto L1
L2: t2 = p + q
    p = t2
    if e <= f goto L2
snext:
Mixed Mode Boolean Expressions
• Boolean Expressions often contain Arithmetic sub-expressions e.g. (a+b)<c.
• On the other hand, if true = 1 and false = 0, then (a<b)+(b<a) can be an
Arithmetic expression with value 0 if a=b and 1 otherwise.
• The method of representing Boolean Expressions by Jumping code is still a good
option.
• Consider the following grammar:
• E → E+E | E and E | E relop E | id
• E+E, produces an arithmetic result, and the arguments can be mixed;
• E and E, produces a Boolean result, and both arguments must be Boolean;
• E relop E, produces a Boolean result, and the arguments can be mixed;
• id is assumed of type arithmetic.
• To generate code we use a synthesized attribute E.type, which will
be either arith or bool.
• Boolean Expressions will have inherited attributes E.true and
E.false useful for the jumping code.
• Arithmetic Expressions will have the synthesized attribute E.place
standing for the (temporary) variable holding the value of E.
• The global variable nextstat gives the index of the next three-
address code statement and is incremented by gen.
E → E1 + E2   E.type := arith;
if E1.type = arith and E2.type = arith then begin
    /* arithmetic addition */
    E.place := newtemp;
    E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘+’ E2.place)
end
else if E1.type = arith and E2.type = bool then begin
    E.place := newtemp;
    E2.true := newlabel;
    E2.false := newlabel;
    E.code := E1.code || E2.code ||
              gen(E2.true ‘:’ E.place ‘:=’ E1.place ‘+’ 1) ||
              gen(‘goto’ nextstat + 1) ||
              gen(E2.false ‘:’ E.place ‘:=’ E1.place)
end
        code to evaluate E into t
        goto test
L1 :    code for S1
        goto next
L2 :    code for S2
        goto next
        …
Ln–1 :  code for Sn–1
        goto next
Ln :    code for Sn
        goto next
test :  if t = V1 goto L1
        if t = V2 goto L2
        …
        if t = Vn–1 goto Ln–1
        goto Ln
next:
        code to evaluate E into t
        if t ≠ V1 goto L1
        code for S1
        goto next
L1:     if t ≠ V2 goto L2
        code for S2
        goto next
L2:
        …………
Ln-2:   if t ≠ Vn-1 goto Ln-1
        code for Sn-1
        goto next
Ln-1:   code for Sn
next:
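The second layout (a chain of inequality tests) can be sketched as a generator of instruction strings. Everything here is an assumption made for the example: the function name, the pre-translated statement bodies passed in as lists of strings, the non-empty default body, and the ASCII `!=` used in place of ≠.

```python
def translate_switch(eval_code, t, cases, default, next_label="next"):
    """Emit a chain of tests: cases is a list of (value, body), default is a body."""
    out = list(eval_code)                 # code to evaluate E into t
    pending = ""                          # label to attach to the next emitted line
    for k, (value, body) in enumerate(cases, start=1):
        out.append(f"{pending}if {t} != {value} goto L{k}")
        out += body                       # code for Sk
        out.append(f"goto {next_label}")
        pending = f"L{k}: "
    first, *rest = default                # default body is assumed non-empty
    out.append(pending + first)
    out += rest
    out.append(f"{next_label}:")
    return out


code = translate_switch(
    ["t1 = a + b"], "t1",
    [(2, ["x = y"]), (9, ["t5 = y - 1", "x = t5"])],
    ["t6 = y + 1", "x = t6"],
)
print("\n".join(code))
```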
switch(a+b)
begin
case 2: x= y
case 5: switch x
begin
case 0: a = b+1
case 1: a = b+3
default: a = 2
end
case 9: x = y–1
default : x = y + 1
end
t1 = a + b
if t1 ≠ 2 goto L1
x = y
goto next
L1: if t1 ≠ 5 goto L2
t2 = x
if t2 ≠ 0 goto L3
t3 = b + 1
a = t3
goto next1
L3: if t2 ≠ 1 goto L4
t4 = b+3
a = t4
goto next1
L4: a=2
next1: goto next
L2: if t1 ≠ 9 goto L5
t5 = y–1
x = t5
goto next
L5: t6 = y + 1
x = t6
next:
      t1 = a – b
      goto down
L1:   if a < b goto L2
      goto next
L2:   t2 = y + 5
      x = t2
      goto L1
L3:   t3 = b + 1
      a = t3
      goto next
L5:   a = 5
      goto next
down: if t1 = 4 goto L1
      if t1 = 6 goto L3
      goto L5
next:
Backpatching
• The easiest way to implement syntax-directed definitions is to use two
passes. First, a syntax tree is constructed; it is then traversed in depth-first
order to compute the translations given in the definition.
• The main problem in generating three address codes in a single pass
for Boolean expressions and flow of control statements is that we may
not know the labels that control must go to at the time jump
statements are generated.
• This problem is solved by generating a series of branch statements
with the targets of the jumps temporarily left unspecified.
• Each such statement will be put on a list of goto statements whose
labels will be filled in when the proper label can be determined.
• This subsequent filling of addresses for the determined labels is called
backpatching.
E -> E or M E
   | E and M E
   | not E
   | (E)
   | id relop id
   | true
   | false
M -> ε
Synthesized attributes:
E.code        three-address code
E.truelist    backpatch list for jumps on true
E.falselist   backpatch list for jumps on false
M.quad        location of current three-address quad
Backpatch Operations with Lists
• makelist(i) creates a new list containing three
address location i, returns a pointer to the list.
• merge(p1, p2) concatenates lists pointed to by
p1 and p2, returns a pointer to the concatenated
list.
• backpatch(p, i) inserts i as the target label for
each of the statements in the list pointed to by p
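These three operations can be sketched with Python lists of instruction indices. The `QuadArray` class, the extra first argument to `backpatch`, and the `relop` helper (which emits the usual pair of unfilled jumps for `id relop id`) are assumptions made for this example, as is numbering quads from 0.

```python
class QuadArray:
    """Instruction array; a jump whose target is None is still unfilled."""

    def __init__(self):
        self.instrs = []                    # each entry is [text, target]

    @property
    def nextquad(self):
        return len(self.instrs)

    def emit(self, text, target=None):
        self.instrs.append([text, target])

    def listing(self):
        return [f"{i}: {text} {'_' if target is None else target}"
                for i, (text, target) in enumerate(self.instrs)]


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(quads, p, i):
    for index in p:                         # fill in i as the target of each jump
        quads.instrs[index][1] = i


def relop(quads, a, op, b):
    """Emit two unfilled jumps and return (truelist, falselist)."""
    truelist = makelist(quads.nextquad)
    falselist = makelist(quads.nextquad + 1)
    quads.emit(f"if {a} {op} {b} goto")
    quads.emit("goto")
    return truelist, falselist


# a < b or c < d
quads = QuadArray()
e1_true, e1_false = relop(quads, "a", "<", "b")
m_quad = quads.nextquad                     # marker M records where E2 starts
e2_true, e2_false = relop(quads, "c", "<", "d")
backpatch(quads, e1_false, m_quad)
truelist, falselist = merge(e1_true, e2_true), e2_false
print("\n".join(quads.listing()))
```

The jumps that remain unfilled are exactly those on `truelist` and `falselist`; an enclosing statement fills them in once its own labels are known.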
M → ε { M.quad := nextquad }
E → E1 or M E2 { backpatch(E1.falselist, M.quad);
                 E.truelist := merge(E1.truelist, E2.truelist);
                 E.falselist := E2.falselist }
E → E1 and M E2 { backpatch(E1.truelist, M.quad);
                  E.truelist := E2.truelist;
                  E.falselist := merge(E1.falselist, E2.falselist) }
E → not E1 { E.truelist := E1.falselist; E.falselist := E1.truelist }
E → (E1) { E.truelist := E1.truelist; E.falselist := E1.falselist }
Flow-of-Control Statements using backpatching
• We now show how backpatching can be used to translate flow-of-control
statements in one pass.
S → if E then S1
|if E then S1 else S2
| while E do S1
| begin L end
| A
L → L1 ; S | S
Here S denotes a statement, L a statement list, A an assignment
statement, and E a boolean expression
S → A { S.nextlist := nil }
S → begin L end { S.nextlist := L.nextlist }
S → if E then M S1 { backpatch(E.truelist, M.quad);
                     S.nextlist := merge(E.falselist, S1.nextlist) }
L → L1 ; M S { backpatch(L1.nextlist, M.quad);
               L.nextlist := S.nextlist }
L → S { L.nextlist := S.nextlist }
M → ε { M.quad := nextquad }
S → if E then M1 S1 N else M2 S2
    { backpatch(E.truelist, M1.quad);
      backpatch(E.falselist, M2.quad);
      S.nextlist := merge(S1.nextlist,
                          merge(N.nextlist, S2.nextlist)) }
S → while M1 E do M2 S1
    { backpatch(S1.nextlist, M1.quad);
      backpatch(E.truelist, M2.quad);
      S.nextlist := E.falselist;
      emit(‘goto’ M1.quad) }
N → ε { N.nextlist := makelist(nextquad);
        emit(‘goto _’) }
S → call id ( Elist )
    { for each item p on queue do emit(‘param’ p);
      emit(‘call’ id.place) }
Elist → Elist1 , E { append(E.place, queue) }
Elist → E { queue := initqueue(E.place) }
Consider the function call
x = f (0, y+1) – 1
The corresponding intermediate code is given by
t1 = y + 1
param t1
param 0
call f, 2
retrieve t2
t3 = t2 –1
x = t3