Compiler Design
UNIT-III
Syntax-Directed Translation (SDT)
&
Intermediate-Code Generation
Syllabus: 3-1
Syntax-Directed Translation
3.1.1. Syntax-Directed Definitions (SDD)
3.1.2. Evaluation Orders for SDD's
3.1.3. Applications of Syntax-Directed Translation
3.1.4. Syntax-Directed Translation Schemes
3.1.5. Implementing L-Attributed SDD‘s
Syllabus: 3-2
Intermediate-Code Generation
3.2.1. Variants of Syntax Trees
3.2.2. Three-Address Code
3.2.3. Types and Declarations
3.2.4. Type Checking
3.2.5. Control Flow
3.2.6. Switch-Statements
3.2.7. Intermediate Code for Procedures
Compiler Design
UNIT-III: Syntax-Directed Translation
Lecture-29. Syntax-Directed Definitions (SDD)
Syntax-Directed Definitions (SDD)
• Inherited and Synthesized Attributes
• Evaluating an SDD at the Nodes of a Parse Tree
Semantic Analysis
• Let us concentrate on the third phase of the compiler,
called semantic analysis.
• The main goal of semantic analysis is to check the
correctness of the program and enable its proper execution.
• We know that the job of the parser is only to verify that
the input program consists of tokens arranged in
syntactically valid combinations.
• In semantic analysis we check whether those combinations
form a sensible set of instructions in the programming
language.
Semantic Analysis
Definition:
• Semantic analysis is the third phase of the compiler,
acting as an interface between the syntax analysis phase
and the code generation phase.
• It accepts the parse tree from the syntax analysis phase,
adds semantic information to it, and performs certain
checks based on this information.
• It also helps in constructing the symbol table with the
appropriate information.
Semantic Analysis
Some of the actions performed in the semantic analysis
phase are:
• Type checking, i.e., the number and types of the arguments
in a function call must match those in the function header of
the function definition; otherwise a semantic error results.
• Object binding, i.e., associating variables with their
respective function definitions.
• Automatic type conversion of integers in mixed-mode
operations.
• Helping in intermediate code generation.
• Displaying appropriate error messages.
Semantic Analysis
The semantics of a language can be described very easily
using two notations namely:
• Syntax Directed Definition (SDD)
• Syntax Directed Translation (SDT)
Note: Consider the production E → E + T. To distinguish
the E on the LHS of the production from the E on the RHS,
we write E1 on the RHS, as shown below:
E → E1 + T
Semantic Analysis
Let us consider the production, its derivation and
corresponding parse tree as shown below:
Production:
E → E1 + T
Derivation:
E => E1 + T
Parse Tree:
E
├── E1
├── +
└── T
Semantic Analysis
• The non-terminal E on LHS of the production is called
“head of the production”.
• The string of grammar symbols “E1 + T” on RHS of
the production is called “body of the
production”.
• So, in the derivation, head of the
production will be the parent node and
the symbols that represent body of the
production will be the children nodes.
Syntax Directed Definition (SDD)
Definition:
A Syntax Directed Definition (SDD) is a
context-free grammar with attributes and
semantic rules. The attributes are associated
with grammar symbols whereas the semantic
rules are associated with productions. The
semantic rules are used to compute the attribute
values.
Syntax Directed Definition (SDD)
Example:
A simple Syntax Directed Definition (SDD) for the
production E → E1 + T can be written as shown below:
Production          Semantic Rule
E → E1 + T          E.val = E1.val + T.val
where val is an attribute. Observe that a semantic rule is
associated with the production, and the attribute name val
is associated with each non-terminal used in the rule.
Syntax Directed Definition (SDD)
Attribute: Definition
An attribute is a property of a programming language
construct. Attributes are always associated with grammar
symbols. If X is a grammar symbol and ‘a’ is an attribute,
then X.a denotes the value of attribute ‘a’ at a particular
node X in the parse tree. If we implement the nodes of the
parse tree as records or structures, then an attribute of
X can be implemented as a field in the record or
structure.
Syntax Directed Definition (SDD)
Attribute: Examples
• Ex 1: If val is the attribute associated with a non-
terminal E, then E.val gives the value of attribute
val at a node E in the parse tree.
• Ex 2: If lexval is the attribute associated with a
terminal digit, then digit.lexval gives the
value of attribute lexval at a node digit in the
parse tree.
• Ex 3: If syn is the attribute associated with a non-
terminal F, then F.syn gives the value of attribute
syn at a node F in the parse tree.
Syntax Directed Definition (SDD)
Attribute:
Typical examples of attributes are:
• The data types associated with variables such as int,
float, char etc.
• The value of an expression
• The location of a variable in memory
• The object code of a function or procedure
• The number of significant digits in a number and so
on.
Syntax Directed Definition (SDD)
Semantic rule: Definition
A rule that describes how to compute the value of an
attribute associated with a grammar symbol using the
attribute values of other grammar symbols is called a
semantic rule.
For example, consider the production E → E1 + T. The
attribute value of E, which is on the LHS of the production
and is denoted by E.val, can be calculated by adding the
attribute values of E1 and T, which are on the RHS of the
production and are denoted by E1.val and T.val:
E.val = E1.val + T.val // Semantic rule
Inherited and Synthesized Attributes
The attribute value for a node in the parse tree may
depend on information from its children nodes or its
sibling nodes or parent nodes. Based on how the
attribute values are obtained we can classify the
attributes.
There are two types of attributes namely:
• Synthesized attribute (S-attribute)
• Inherited attribute (I-attribute)
Inherited and Synthesized Attributes
Synthesized Attribute (S-attribute):
Definition:
The attribute value for a non-terminal A derived from the
attribute values of its children or itself is called
synthesized attribute. Thus, the attribute
values of synthesized attributes are passed
up from children to the parent node in bottom-up
manner.
Inherited and Synthesized Attributes
Synthesized Attribute (S-attribute):
Example:
Consider the production E → E1 + T. Suppose the
attribute value val of E on the LHS (head) of the production
is obtained by adding the attribute values E1.val and
T.val appearing on the RHS (body) of the production,
as shown below:
Production          Semantic Rule
E → E1 + T          E.val = E1.val + T.val
Inherited and Synthesized Attributes
Synthesized Attribute (S-attribute):
Example:
Parse tree with attribute values:
Production Semantic Rule
E → E1 + T E.val = E1.val + T.val
E.val = 30
├── E1.val = 10
├── +
└── T.val = 20
Inherited and Synthesized Attributes
Synthesized Attribute (S-attribute):
Example:
Parse tree with attribute values:
E.val = 30
├── E1.val = 10
├── +
└── T.val = 20
Now, the attribute val of E, which appears at the head of
the production, is a synthesized attribute. This is because
the value of E.val, which is 30, is obtained from its
children by adding the attribute values 10 and 20, as shown
in the parse tree above.
Inherited and Synthesized Attributes
Inherited Attribute (I-attribute):
Definition:
The attribute value of a non-terminal A derived from the
attribute values of its siblings or from its parent or itself
is called inherited attribute. Thus, the
attribute values of inherited attributes are
passed from siblings or from parent to children in top-
down manner.
Inherited and Synthesized Attributes
Inherited Attribute (I-attribute):
Example:
Consider the production: D → T V which is used for a
single declaration such as:
int sum
In the production, D stands for declaration, T stands for
type such as int and V stands for the variable sum as in
above declaration.
Inherited and Synthesized Attributes
Inherited Attribute (I-attribute):
Example:
The production, semantic rule and parse tree along with
attribute values are shown below:
Production Semantic Rule
D → T V V.inh = T.type
Inherited and Synthesized Attributes
Inherited Attribute (I-attribute):
Example:
Parse tree with attribute values:
D
├── T.type = int
└── V.inh = int
    └── id.entry
Inherited and Synthesized Attributes
Inherited Attribute (I-attribute):
Example:
Observe the following points from the above parse tree:
• The type int obtained from the lexical analyzer is
already stored in T.type whose value is transferred
to its sibling V. This can be done using:
V.inh = T.type
Since the attribute value for V is obtained from its sibling,
it is an inherited attribute, and the attribute is denoted by inh.
Inherited and Synthesized Attributes
Inherited Attribute (I-attribute):
Example:
Observe the following points from the above parse tree:
• On similar lines, the value int stored in V.inh is
transferred to its child id.entry; hence entry is an
inherited attribute of id, and its value is denoted by
id.entry.
Note: With the help of the annotated parse tree, it is
very easy for us to construct SDD for a given grammar.
Inherited and Synthesized Attributes
Annotated parse tree:
Definition:
A parse tree showing the attribute values of each node is
called annotated parse tree. The terminals in the
annotated parse tree can have only synthesized
attribute values and they are obtained directly from the
lexical analyzer. So, there are no semantic rules in SDD
(Syntax Directed Definition) to get the lexical values into
terminals of the annotated parse tree. The other
nodes in the annotated parse tree may be either
synthesized or inherited attributes.
Note: Terminals can never have inherited attributes.
Inherited and Synthesized Attributes
Annotated parse tree:
Example:
Consider the partial annotated parse tree shown below:
E.val = 30
├── E1.val = 10
├── +
└── T.val = 20
In this partial annotated parse tree, the attribute values
10, 20 and 30 are stored in E1.val, T.val and E.val
respectively.
Inherited and Synthesized Attributes
Example-1:
Write the SDD for a simple desk calculator and show
annotated parse tree for the expression (3+4)*(5+6)n
S → En
E → E + T | E - T | T
T → T * F | T / F | F
F → ( E )| digit
Inherited and Synthesized Attributes
Example: Solution
The given grammar is shown below:
S → En
E → E + T | E - T | T
T → T * F | T / F | F
F → ( E )| digit
The above grammar generates arithmetic expressions,
parenthesized or un-parenthesized, with the operators +, -,
* and /. For the sake of convenience, let us first consider
the part of the grammar written as shown below:
Inherited and Synthesized Attributes
Example: Solution
S → Tn
T → T * F | T / F | F
F → digit
Using the above productions we can generate un-parenthesized
expressions over the * and / operators, such as 3*4 or
3*4*5. The annotated parse tree for evaluating the
expression 3*5 is shown below:
Inherited and Synthesized Attributes
Example: Solution
S.val = 15
├── T.val = 15
│   ├── T1.val = 3
│   │   └── F.val = 3
│   │       └── digit.lexval = 3
│   ├── *
│   └── F.val = 5
│       └── digit.lexval = 5
└── n (EOF)
Inherited and Synthesized Attributes
Example: Solution
It is very easy to see how the values 3 and 5 are moved
from bottom to top till we reach the root node to get the
value 15. The rules to get the value 15 from the
productions used are shown below:
Productions          Semantic Rules
F → digit            F.val = digit.lexval
T → F                T.val = F.val
T → T1 * F           T.val = T1.val * F.val
S → T n              S.val = T.val
Inherited and Synthesized Attributes
Example: Solution
On similar lines we can write the semantic rules for the
following productions as shown below:
Productions Semantic Rules
S → En S.val = E.val
E → E1 + T E.val = E1.val + T.val
E → T E.val = T.val
F → ( E ) F.val = E.val
Inherited and Synthesized Attributes
Example: Solution
Now, the final SDD along with productions and
semantic rules is shown below:
Productions          Semantic Rules
S → E n              S.val = E.val
E → E1 + T           E.val = E1.val + T.val
E → E1 - T           E.val = E1.val - T.val
E → T                E.val = T.val
T → T1 * F           T.val = T1.val * F.val
T → T1 / F           T.val = T1.val / F.val
T → F                T.val = F.val
F → ( E )            F.val = E.val
F → digit            F.val = digit.lexval
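The following short Python sketch (my own illustration, not part of the lecture; the tuple-based tree representation and the evaluate helper are assumptions) applies the semantic rules above in a bottom-up (postorder) walk of a parse tree for (3+4)*(5+6):

def evaluate(node):
    # Apply the semantic rule of the production used at this node (postorder).
    kind = node[0]
    if kind == 'digit':                       # F -> digit : F.val = digit.lexval
        return node[1]
    if kind == 'paren':                       # F -> ( E ) : F.val = E.val
        return evaluate(node[1])
    left = evaluate(node[1])                  # evaluate the children first ...
    right = evaluate(node[2])                 # ... then combine their values
    if kind == '+': return left + right       # E -> E1 + T : E.val = E1.val + T.val
    if kind == '-': return left - right       # E -> E1 - T
    if kind == '*': return left * right       # T -> T1 * F : T.val = T1.val * F.val
    if kind == '/': return left / right       # T -> T1 / F
    raise ValueError(kind)

tree = ('*', ('paren', ('+', ('digit', 3), ('digit', 4))),
             ('paren', ('+', ('digit', 5), ('digit', 6))))
print(evaluate(tree))                         # 77, the value reaching S.val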
Inherited and Synthesized Attributes
Example: The annotated parse tree for the expression (3+4)*(5+6)n
consisting of attribute values for each non-terminal is shown below:
S.val = 77
├── E.val = 77
│   └── T.val = 77
│       ├── T1.val = 7
│       │   └── F.val = 7
│       │       ├── (
│       │       ├── E.val = 7
│       │       │   ├── E1.val = 3
│       │       │   │   └── T.val = 3
│       │       │   │       └── F.val = 3
│       │       │   │           └── digit.lexval = 3
│       │       │   ├── +
│       │       │   └── T.val = 4
│       │       │       └── F.val = 4
│       │       │           └── digit.lexval = 4
│       │       └── )
│       ├── *
│       └── F.val = 11
│           ├── (
│           ├── E.val = 11
│           │   ├── E1.val = 5
│           │   │   └── T.val = 5
│           │   │       └── F.val = 5
│           │   │           └── digit.lexval = 5
│           │   ├── +
│           │   └── T.val = 6
│           │       └── F.val = 6
│           │           └── digit.lexval = 6
│           └── )
└── n (EOF)
Evaluating an SDD at the Nodes of a Parse Tree
We can easily obtain an SDD using the following steps:
Step 1: Construct the parse tree.
Step 2: Use the rules to evaluate the attributes at all the
nodes of the parse tree.
Step 3: Obtain the attribute values for each non-terminal
and write the semantic rules for each production. When the
complete annotated parse tree is ready, we will have the
complete SDD.
Evaluating an SDD at the Nodes of a Parse Tree
How do we construct an annotated parse tree?
In what order do we evaluate attributes?
• If we want to evaluate an attribute of a node of a parse tree, it is
necessary to evaluate all the attributes upon which its value depends.
• If all attributes are synthesized, then we must evaluate the
attributes of all of a node's children before we can evaluate
the attribute of the node itself.
• With synthesized attributes, we can evaluate the attributes
in any bottom-up order.
• Whether attributes are synthesized or inherited, there is no
single order in which they have to be evaluated; there can be
one or more orders in which the evaluation can be done.
Evaluating an SDD at the Nodes of a Parse Tree
Circular dependency
• If the attribute value of a parent node depends on the
attribute value of child node and vice-versa, then we
say, there exists a circular dependency between the
child node and parent node. In this situation, it is not
possible to evaluate the attribute of either parent node
or the child node since one value depends on another
value.
Evaluating an SDD at the Nodes of a Parse Tree
Circular dependency
• For example, consider the non-terminal A with synthesized
attribute A.s and non-terminal B with inherited attribute
B.i, with the following productions and semantic rules:
Productions          Semantic Rules
A → B                A.s = B.i
                     B.i = A.s + 6
Partial annotated parse tree:
A.s
 └── B.i
Evaluating an SDD at the Nodes of a Parse Tree
Circular dependency
• In the above semantic rules, s is a synthesized attribute
and i is an inherited attribute.
• The two semantic rules are circular in nature.
• To compute A.s we require the value of B.i, and to compute
the value of B.i we require the value of A.s.
So, it is impossible to evaluate either the value of A.s or
the value of B.i without first evaluating the other.
Productions Semantic Rules
A → B A.s = B.i
B.i = A.s + 6
Evaluating an SDD at the Nodes of a Parse Tree
Why evaluate inherited attributes
Consider the following grammar
T → T * F | F
F → digit
Using the above productions we can generate un-parenthesized
expressions over the * operator, such as 3*4 or 3*4*5. The
above grammar has left recursion, so it is suitable for a
bottom-up parser such as an LR parser. The annotated parse
tree for evaluating the expression 3*5 is shown below:
Evaluating an SDD at the Nodes of a Parse Tree
Why evaluate inherited attributes
It is very easy to see how the values 3 and 5 are moved
from bottom to top until we reach the root node to get the
value 15, as shown in the tree below.
T.val = 15
├── T1.val = 3
│   └── F.val = 3
│       └── digit.lexval = 3
├── *
└── F.val = 5
    └── digit.lexval = 5
Evaluating an SDD at the Nodes of a Parse Tree
Why evaluate inherited attributes
Thus, working bottom-up, the values 3 and 5 are moved
upwards to get the result 15 using the semantic rules
associated with each production.
Productions          Semantic Rules
F → digit            F.val = digit.lexval
T → F                T.val = F.val
T → T1 * F           T.val = T1.val * F.val
Evaluating an SDD at the Nodes of a Parse Tree
Example:
Obtain SDD for the following grammar using top-down
approach:
S → En
E → E + T | T
T → T * F | F
F → ( E )| digit
and obtain annotated parse tree for the
expression (3 + 4) * (5 + 6)n
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
The given grammar has left recursion and hence it is not
suitable for top-down parser. To make it suitable for top-
down parsing, we have to eliminate left recursion. After
eliminating left recursion, the following grammar is
obtained:
S → En
E → TE’
E’ → +TE’| ϵ
T → FT’
T’ → *FT’| ϵ
F → ( E )| digit
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
Note:
The variables S, E, T and F are present both in the given
grammar and in the grammar obtained after eliminating left
recursion. So, for the variables S, E, T and F we use the
attribute name val, and for the new variables E’ and T’ we
use syn for the synthesized attribute and inh for the
inherited attribute.
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
Consider the following productions:
S → En
F → ( E )| digit
They do not have left recursion and they are retained in
the grammar which is obtained after eliminating left
recursion. So, we can compute the attribute value of
LHS (head) from the attribute value of RHS (i.e.,
children) for the above productions and hence they have
synthesized attributes.
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
The productions, semantic rules and type of the attribute
are shown below:
Production          Semantic Rule            Type
S → E n             S.val = E.val            Synthesized
F → ( E )           F.val = E.val            Synthesized
F → digit           F.val = digit.lexval     Synthesized
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
Consider the following productions and draw the
annotated parse tree for the expression 2*3 with flow of
information as shown below:
Productions
T → F T’
T’ → * F T’ | ϵ
F → digit
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
Annotated parse tree for the expression 2*3:
T.val = 6
├── F.val = 2
│   └── digit.lexval = 2
└── T’.inh = 2, T’.syn = 6
    ├── *
    ├── F.val = 3
    │   └── digit.lexval = 3
    └── T1’.inh = 6, T1’.syn = 6
        └── ϵ
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
By following the flow of information (the dotted arrows in
the annotated parse tree), we can write the semantic rules
for the corresponding productions as shown below:
Information flow                         Semantic Rule               Production
F.val = 2 is copied to T’.inh            T’.inh = F.val              T → F T’
T’.inh * F.val is copied to T1’.inh      T1’.inh = T’.inh * F.val    T’ → * F T1’
T1’.inh is copied to T1’.syn             T’.syn = T’.inh             T’ → ϵ
T1’.syn is moved up to its parent T’     T’.syn = T1’.syn            T’ → * F T1’
T’.syn is moved up to its parent T       T.val = T’.syn              T → F T’
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
The above productions and their respective semantic rules
are shown below:
Production          Semantic Rules
T → F T’            T’.inh = F.val
                    T.val = T’.syn
T’ → * F T1’        T1’.inh = T’.inh * F.val
                    T’.syn = T1’.syn
T’ → ϵ              T’.syn = T’.inh
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
Similar to the above, we can write the semantic rules for
the other productions as shown below:
Production          Semantic Rules
E → T E’            E’.inh = T.val
                    E.val = E’.syn
E’ → + T E1’        E1’.inh = E’.inh + T.val
                    E’.syn = E1’.syn
E’ → ϵ              E’.syn = E’.inh
Evaluating an SDD at the Nodes of a Parse Tree
Example: Solution
Combining all productions and semantic rules, we can write
the final SDD as shown below:
Production          Semantic Rule               Type
S → E n             S.val = E.val               Synthesized
E → T E’            E’.inh = T.val              Inherited
                    E.val = E’.syn              Synthesized
E’ → + T E1’        E1’.inh = E’.inh + T.val    Inherited
                    E’.syn = E1’.syn            Synthesized
E’ → ϵ              E’.syn = E’.inh             Synthesized
T → F T’            T’.inh = F.val              Inherited
                    T.val = T’.syn              Synthesized
T’ → * F T1’        T1’.inh = T’.inh * F.val    Inherited
                    T’.syn = T1’.syn            Synthesized
T’ → ϵ              T’.syn = T’.inh             Synthesized
F → ( E )           F.val = E.val               Synthesized
F → digit           F.val = digit.lexval        Synthesized
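As a rough sketch of how this L-attributed SDD can be evaluated during top-down parsing (my own Python illustration; the function names and the simplified token list are assumptions), each inherited attribute becomes a parameter of a parsing function and each synthesized attribute becomes its return value:

def parse_E(tokens):                 # E -> T E' :  E'.inh = T.val ; E.val = E'.syn
    t_val, rest = parse_T(tokens)
    return parse_Eprime(t_val, rest)

def parse_Eprime(inh, tokens):       # E' -> + T E1' : E1'.inh = E'.inh + T.val
    if tokens and tokens[0] == '+':
        t_val, rest = parse_T(tokens[1:])
        return parse_Eprime(inh + t_val, rest)
    return inh, tokens               # E' -> epsilon : E'.syn = E'.inh

def parse_T(tokens):                 # T -> F T' :  T'.inh = F.val ; T.val = T'.syn
    f_val, rest = parse_F(tokens)
    return parse_Tprime(f_val, rest)

def parse_Tprime(inh, tokens):       # T' -> * F T1' : T1'.inh = T'.inh * F.val
    if tokens and tokens[0] == '*':
        f_val, rest = parse_F(tokens[1:])
        return parse_Tprime(inh * f_val, rest)
    return inh, tokens               # T' -> epsilon : T'.syn = T'.inh

def parse_F(tokens):                 # F -> ( E ) | digit
    if tokens[0] == '(':
        val, rest = parse_E(tokens[1:])
        return val, rest[1:]         # skip the closing ')'
    return tokens[0], tokens[1:]     # digit.lexval

val, _ = parse_E(['(', 3, '+', 4, ')', '*', '(', 5, '+', 6, ')'])
print(val)                           # 77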
Evaluating an SDD at the Nodes of a Parse Tree
The annotated parse tree for the expression (3+4)*(5+6)n is
built in the same way, with the value of each attribute shown
at the corresponding node.
Summary...
Syntax-Directed Definitions (SDD)
• Inherited and Synthesized Attributes
• Evaluating an SDD at the Nodes of a Parse Tree
Reading: Aho2, Section 5.1.1 & 5.1.2
Next Lecture: Evaluation Orders for SDD's
Compiler Design
UNIT-III: Syntax-Directed Translation
Lecture-30. Evaluation Orders for SDD's
Evaluation Orders for SDD's
• Dependency Graphs
• Ordering the Evaluation of Attributes
• S-Attributed Definitions
• L-Attributed Definitions
• Semantic Rules with Controlled Side Effects
Evaluation Orders for SDD's
• The order in which the attribute values in a parse tree are
evaluated using the semantic rules can easily be obtained
with the help of a dependency graph.
• While an annotated parse tree shows the values of the
attributes, a dependency graph helps us determine how those
values can be computed.
Evaluation Orders for SDD‘s
Dependency Graphs
Definition:
• A graph that shows the flow of information which
helps in computation of various attribute values in a
particular parse tree is called dependency
graph.
• An edge from one attribute instance to another
attribute instance indicates that the attribute value of
the first is needed to compute value of the second.
Evaluation Orders for SDD‘s
Dependency Graphs
For example:
Consider the following production and rule:
In bottom-up parser, the attribute value of LHS (head)
depends on the attribute value of RHS (body of the
production or children in the parse tree). So, the attribute
value of E is obtained from its children E1 and T.
Production Semantic Rule
E → E1 + T E.val = E1.val + T.val
Evaluation Orders for SDD‘s
Dependency Graphs
The portion of the dependency graph for this production can
be drawn as shown below. The parse-tree edges connect E to
its children E1, + and T; the attribute nodes val attached to
E1 and T have dependency edges pointing to the val node of E.
E (val)
├── E1 (val)  ──→ E.val
├── +
└── T (val)   ──→ E.val
Evaluation Orders for SDD‘s
Dependency Graphs
Example:
For the grammar
T → T * F | F
F → digit
obtain the SDD and the dependency graph for the annotated
parse tree of the expression 3*5.
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
The grammar after eliminating left recursion is shown
below:
T → FT’
T’ → *FT1’
T’ → ϵ
F → digit
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
The SDD for the given grammar can be written as shown
below:
Production          Semantic Rule               Type
T → F T’            T’.inh = F.val              Inherited
                    T.val = T’.syn              Synthesized
T’ → * F T1’        T1’.inh = T’.inh * F.val    Inherited
                    T’.syn = T1’.syn            Synthesized
T’ → ϵ              T’.syn = T’.inh             Synthesized
F → digit           F.val = digit.lexval        Synthesized
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
The annotated parse tree for 3*5 is shown below:
T.val = 15
├── F.val = 3
│   └── digit.lexval = 3
└── T’.inh = 3, T’.syn = 15
    ├── *
    ├── F.val = 5
    │   └── digit.lexval = 5
    └── T1’.inh = 15, T1’.syn = 15
        └── ϵ
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
The dependency graph for the annotated parse tree is shown
below. The nodes of the dependency graph are represented by
the numbers 1 through 9.
Parse-tree node          Attribute nodes
T                        9: T.val
├── F                    3: F.val
│   └── digit            1: digit.lexval
└── T’                   5: T’.inh    8: T’.syn
    ├── *
    ├── F                4: F.val
    │   └── digit        2: digit.lexval
    └── T1’              6: T1’.inh   7: T1’.syn
        └── ϵ
Dependency edges: 1→3, 2→4, 3→5, 5→6, 4→6, 6→7, 7→8, 8→9
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
Observe the following points from the above dependency
graph:
• Nodes 1 and 2 represent the attribute lexval associated
with the two leaves labeled digit.
• Nodes 3 and 4 represent the attribute val associated with
the two nodes labeled F.
• The edges to node 3 from 1 and to node 4 from 2 result
from the semantic rule that defines F.val in terms of
digit.lexval.
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
• In fact, F.val equals digit.lexval, but the edge
represents dependence, not equality.
• Nodes 5 and 6 represent the inherited attribute T’.inh
associated with each of the occurrences of nonterminal T’.
• The edge to 5 from 3 is due to the rule T’.inh =
F.val, which defines T’.inh at the right child of the root
from F.val at the left child.
• We see edges to 6 from node 5 for T’.inh and from node 4
for F.val, because these values are multiplied to evaluate the
attribute inh at node 6.
Evaluation Orders for SDD‘s
Dependency Graphs
Example: Solution
• Nodes 7 and 8 represent the synthesized attribute syn
associated with the occurrences of T’.
• The edge to node 7 from 6 is due to the semantic rule
T’.syn = T’.inh associated with production T’ → ϵ.
• The edge to node 8 from 7 is due to a semantic rule
associated with production T’ → * F T1’.
• Finally, node 9 represents the attribute T.val.
• The edge to 9 from 8 is due to the semantic rule, T.val =
T’.syn, associated with production T → F T’.
Evaluation Orders for SDD‘s
Ordering the evaluation of attributes
• The dependency graph characterizes the possible orders in
which we can evaluate the attributes at the various nodes of
a parse tree.
• If the dependency graph has an edge from node M to node N,
then the attribute corresponding to M must be evaluated
before the attribute of N.
• Thus, the only allowable orders of evaluation are those
sequences of nodes N1,N2,…Nk such that if there is an edge
of the dependency graph from Ni to Nj , then i < j.
• Such an ordering embeds a directed graph into a linear
order, and is called a topological sort of the graph.
Evaluation Orders for SDD‘s
Ordering the evaluation of attributes
• If there is any cycle in the graph, then there are no topological sorts;
that is, there is no way to evaluate the SDD on this parse tree.
• If there are no cycles, however, then there is always at least one
topological sort.
• To see why, since there are no cycles, we can surely
find a node with no edge entering.
• For if there were no such node, we could proceed from predecessor
to predecessor until we came back to some node we had already
seen, yielding a cycle.
• Make this node the first in the topological order, remove it from the
dependency graph, and repeat the process on the remaining nodes.
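A small Python sketch of exactly this procedure (Kahn's algorithm; the code and the explicit edge list are my own illustration, using the dependency graph of 3*5 from the earlier example):

from collections import defaultdict

edges = [(1, 3), (2, 4), (3, 5), (5, 6), (4, 6), (6, 7), (7, 8), (8, 9)]
nodes = set(n for e in edges for n in e)

indegree = defaultdict(int)
succs = defaultdict(list)
for u, v in edges:
    indegree[v] += 1
    succs[u].append(v)

order = []
ready = sorted(n for n in nodes if indegree[n] == 0)   # nodes with no entering edge
while ready:
    n = ready.pop(0)                                   # make it next in the order
    order.append(n)
    for m in succs[n]:                                 # remove its outgoing edges
        indegree[m] -= 1
        if indegree[m] == 0:
            ready.append(m)

if len(order) != len(nodes):
    print("cycle: no topological sort exists")
else:
    print(order)        # e.g. [1, 2, 3, 4, 5, 6, 7, 8, 9]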
Evaluation Orders for SDD‘s
Ordering the evaluation of attributes
Example:
Consider the dependency graph shown below:
(This is the dependency graph for 3*5 shown earlier, with
nodes numbered 1 through 9 and edges 1→3, 2→4, 3→5, 5→6,
4→6, 6→7, 7→8, 8→9.)
Evaluation Orders for SDD‘s
Ordering the evaluation of attributes
Example:
• The dependency graph shown above has no cycles.
• One topological sort is the order in which the nodes
have already been numbered: 1, 2, 3, 4, 5,
6, 7, 8, 9.
• Notice that every edge of the graph goes from a node
to a higher-numbered node, so this order is surely a
topological sort.
• There are other topological sorts as well, such as 1,
3, 5, 2, 4, 6, 7, 8, 9.
Evaluation Orders for SDD‘s
S-Attributed Definitions
• As mentioned earlier, given an SDD, it is very hard to
tell whether there exist any parse trees whose
dependency graphs have cycles.
• In practice, translations can be implemented using
classes of SDD's that guarantee an evaluation order,
since they do not permit dependency graphs with
cycles.
Evaluation Orders for SDD‘s
S-Attributed Definitions
The two classes of SDD’s that guarantee an
evaluation order are:
• S-attributed definition
• L-attributed definition
Evaluation Orders for SDD‘s
S-Attributed Definitions
An SDD is S-attributed if every attribute is
synthesized.
In an S-attributed SDD, each semantic rule computes an
attribute value of the non-terminal at the head of the
production (its LHS) from the attribute values of the
grammar symbols in the body of the production (its RHS).
Evaluation Orders for SDD‘s
S-Attributed Definitions
Example:
Consider the following SDD:
Productions Semantic Rules
S → En S.val = E.val
E → E1 + T E.val = E1.val + T.val
E → T E.val = T.val
T → T1 * F T.val = T1.val * F.val
T → F T.val = F.val
F → ( E ) F.val = E.val
F → digit F.val = digit.lexval
Evaluation Orders for SDD‘s
S-Attributed Definitions
Example:
The SDD shown above is an example of an S-attributed
definition. Each of the attributes S.val, E.val, T.val and
F.val is synthesized, and hence the SDD is S-attributed.
Evaluation Orders for SDD‘s
S-Attributed Definitions
• When an SDD is S-attributed, we can evaluate
its attributes in any bottom-up order of the nodes of
the parse tree.
• It is often especially simple to evaluate the attributes
by performing a postorder traversal of the parse tree
and evaluating the attributes at a node N when the
traversal leaves N for the last time.
Evaluation Orders for SDD‘s
S-Attributed Definitions
• That is, we apply the function postorder, defined
below, to the root of the parse tree:
postorder(N) {
    for ( each child C of N, from the left )
        postorder(C);
    evaluate the attributes associated with node N;
}
Evaluation Orders for SDD‘s
S-Attributed Definitions
• S-attributed definitions can be implemented
during bottom-up parsing, since a bottom-up parser
corresponds to a postorder traversal.
• Specifically, postorder corresponds exactly to the
order in which an LR parser reduces a production
body to its head.
Evaluation Orders for SDD‘s
L-Attributed Definitions
• The second class of SDD's is called L-attributed
definitions.
• The idea behind this class is that, between the
attributes associated with a production body,
dependency-graph edges can go from left to right, but
not from right to left (hence “L-attributed”).
Evaluation Orders for SDD‘s
L-Attributed Definitions
Definition:
An SDD is L-attributed if each of its attributes is either:
1. Synthesized, or
2. Inherited, but with the rules limited as follows. Suppose that
there is a production A → X1 X2 … Xn, and that there is an
inherited attribute Xi.a computed by a rule associated with
this production. Then the rule may use only:
a) Inherited attributes associated with the head A.
b) Either inherited or synthesized attributes associated with the
occurrences of the symbols X1, X2, …, Xi-1 located to the left of Xi.
c) Inherited or synthesized attributes associated with this occurrence of
Xi itself, but only in such a way that there are no cycles in the
dependency graph formed by the attributes of this Xi.
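The positional part of condition 2 can be pictured with a small Python sketch (mine, not from the slides; it deliberately ignores the finer point that only inherited attributes of the head may be used): a rule defining an inherited attribute of Xi may only refer to the head or to symbols at positions up to i.

def rule_is_l_attributed(target_index, used):
    # used lists the positions of the symbols whose attributes the rule reads;
    # position -1 stands for the head A, 0..n-1 for X1..Xn in the body.
    for pos in used:
        if pos == -1 or pos <= target_index:   # head, a symbol to the left, or Xi itself
            continue
        return False                           # a symbol to the right of Xi: not allowed
    return True

# B.i = f(C.c, A.s) in A -> B C: B is X1 (position 0), C is X2 (position 1).
print(rule_is_l_attributed(0, [1, -1]))        # False: C.c lies to the right of B
# T'.inh = F.val in T -> F T': F is X1 (position 0), T' is X2 (position 1).
print(rule_is_l_attributed(1, [0]))            # True: F lies to the left of T'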
Evaluation Orders for SDD‘s
L-Attributed Definitions
Example-1:
Consider the SDD shown below:
Production          Semantic Rule               Type
T → F T’            T’.inh = F.val              Inherited
                    T.val = T’.syn              Synthesized
T’ → * F T1’        T1’.inh = T’.inh * F.val    Inherited
                    T’.syn = T1’.syn            Synthesized
T’ → ϵ              T’.syn = T’.inh             Synthesized
F → digit           F.val = digit.lexval        Synthesized
Evaluation Orders for SDD‘s
L-Attributed Definitions
Example-1:
The SDD shown above is L-attributed. To see why,
consider the semantic rules for inherited
attributes, which are repeated here for convenience:
The first of these rules defines the inherited attribute
T’.inh using only F.val, and F appears to the left
of T’ in the production body, as required.
Production Semantic Rule Type
T → F T’ T’.inh = F.val Inherited
T’ → * F T1’ T1’.inh = T’.inh * F.val Inherited
Evaluation Orders for SDD‘s
L-Attributed Definitions
Example-1:
The second rule defines T1’.inh using the inherited
attribute T’.inh associated with the head, and
F.val, where F appears to the left of T1’in the
production body.
In each of these cases, the rules use information “from
above or from the left,” as required by the class.
The remaining attributes are synthesized.
Hence, the SDD is L-attributed.
Evaluation Orders for SDD‘s
L-Attributed Definitions
Example-2:
Any SDD containing the following production and rules
cannot be L-attributed:
The first rule, A.s = B.b, is a legitimate rule in either an
S-attributed or L-attributed SDD. It defines a synthesized
attribute A.s in terms of an attribute at a child (that is, a
symbol within the production body).
Production          Semantic Rules
A → B C             A.s = B.b
                    B.i = f(C.c, A.s)
Evaluation Orders for SDD‘s
L-Attributed Definitions
Example-2:
The second rule defines an inherited attribute B.i, so the
entire SDD cannot be S-attributed.
Further, although the rule is legal, the SDD cannot be L-
attributed, because the attribute C.c is used to help
define B.i, and C is to the right of B in the production body.
While attributes at siblings in a parse tree may be used in L-
attributed SDD's, they must be to the left of the
symbol whose attribute is being defined.
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
In practice, translations involve side effects:
1. A desk calculator might print a result;
2. A code generator might enter the type of an identifier into
a symbol table.
• With SDD's, we strike a balance between attribute
grammars and translation schemes.
• Attribute grammars have no side effects and allow any
evaluation order consistent with the dependency graph.
• Translation schemes impose left-to-right evaluation and
allow semantic actions to contain any program fragment.
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Attribute grammar:
An SDD without any side effects is called an attribute grammar.
The semantic rules in an attribute grammar define the value
of an attribute purely in terms of the values of other
attributes and constants.
Attribute grammars have the following properties:
• They do not have any side effects.
• They allow any evaluation order consistent with
dependency graph.
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
How to control side effects in SDD?
The side effects in SDD's can be controlled in one of the
following ways:
• Permitting side effects when attribute evaluation based on
any topological sort of the dependency graph produces a
correct translation.
• Impose constraints in the evaluation order so that the same
translation is produced for any allowable order.
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example:
Consider the SDD for the desk calculator program. This SDD
does not have any side effects. Now, let us consider the
first production and its semantic rule:
Production          Semantic Rule
S → E n             S.val = E.val
Let us modify the desk calculator to print a result. Instead
of the rule S.val = E.val, which saves the result in the
synthesized attribute S.val, consider:
Production          Semantic Rule
S → E n             print(E.val)
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example:
• Semantic rules that are executed for their side effects, such
as print (E.val), will be treated as the definitions of
dummy synthesized attributes associated with the head of
the production.
• The modified SDD produces the same translation under
any topological sort, since the print statement is executed
at the end, after the result is computed into E.val.
Production Semantic Rule
S → En print (E.val)
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example Problem:
Write the SDD for a simple type declaration and write the
annotated parse tree and dependency graph for the
declaration “float a, b, c”
D → T L
T → int | float
L → L1, id | id
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example Problem:
Syntax-directed definition for simple
type declarations:
Production          Semantic Rules
1. D → T L          L.inh = T.type
2. T → int          T.type = integer
3. T → float        T.type = float
4. L → L1 , id      L1.inh = L.inh
                    addType(id.entry, L.inh)
5. L → id           addType(id.entry, L.inh)
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example Problem:
Productions 4 and 5 also have a rule in which a function
addType is called with two arguments:
1. id.entry, a lexical value that points to a symbol-table
object, and
2. L.inh, the type being assigned to every identifier on the
list.
We suppose that function addType properly installs the
type L.inh as the type of the represented identifier.
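A minimal Python sketch (hypothetical; symbol_table, add_type and decl are stand-ins, not the compiler's real interface) of how these rules thread L.inh down the list and call addType for the declaration float a, b, c:

symbol_table = {}

def add_type(entry, typ):            # plays the role of addType(id.entry, L.inh)
    symbol_table[entry] = typ

def decl(type_name, id_list):        # D -> T L : L.inh = T.type
    inh = type_name                  # T.type computed by T -> int | float
    for name in id_list:             # L -> L1 , id | id : L1.inh = L.inh; addType(...)
        add_type(name, inh)

decl('float', ['a', 'b', 'c'])
print(symbol_table)                  # {'a': 'float', 'b': 'float', 'c': 'float'}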
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example Problem:
A dependency graph for the input string float id1,
id2, id3 is shown below:
D
├── T                      4: T.type
│   └── float
└── L                      5: L.inh    6: entry
    ├── L                  7: L.inh    8: entry
    │   ├── L              9: L.inh   10: entry
    │   │   └── id1        1: entry
    │   ├── ,
    │   └── id2            2: entry
    ├── ,
    └── id3                3: entry
Evaluation Orders for SDD‘s
Semantic Rules with Controlled Side Effects
Example Problem:
• Numbers 1 through 10 represent the nodes of the
dependency graph.
• Nodes 1, 2, and 3 represent the attribute entry
associated with each of the leaves labeled id.
• Nodes 6, 8, and 10 are the dummy attributes that
represent the application of the function addType to a
type and one of these entry values.
• Node 4 represents the attribute T.type, and is actually
where attribute evaluation begins. This type is then passed
to nodes 5, 7, and 9 representing L.inh associated with
each of the occurrences of the nonterminal L.
Summary...
Evaluation Orders for SDD's
• Dependency Graphs
• Ordering the Evaluation of Attributes
• S-Attributed Definitions
• L-Attributed Definitions
• Semantic Rules with Controlled Side Effects
Reading: Aho2, Section 5.2.1 to 5.2.5
Next Lecture: Applications of Syntax-Directed Translation (SDT)
Compiler Design
UNIT-III: Syntax-Directed Translation
Lecture-31. Applications of Syntax-Directed Translation (SDT)
Applications of Syntax-Directed Translation (SDT)
• Construction of Syntax Trees
• The Structure of a Type
Syntax-Directed Translation (SDT)
Definition:
The Syntax Directed Translation (in short SDT)
is a context free grammar with embedded semantic actions.
The semantic actions are nothing but the sequence of steps
or program fragments that will be carried out when that
production is used in the derivation.
The SDTs are used:
• To build syntax tree for programming constructs
• To translate infix expression into postfix notation
• To evaluate expressions
Construction of Syntax Trees
• The main application in this section is the construction of
syntax trees.
• Since some compilers use syntax trees as an intermediate
representation, a common form of SDD turns its input
string into a tree.
• To complete the translation to intermediate code, the
compiler may then walk the syntax tree, using another set
of rules that are in effect an SDD on the syntax tree rather
than the parse tree.
• The syntax tree is also called abstract
syntax tree. The parse tree is also called
concrete syntax tree.
Construction of Syntax Trees
Definition: Syntax tree
• A syntax tree also called abstract syntax
tree is a compressed form of parse tree which is used
to represent language constructs.
• In a syntax tree for an expression, each interior node
represents an operator and the children of the node
represent the operands of the operator.
• In general, any programming construct can be handled by
making up an operator for the construct and treat
semantically meaningful components of the construct as
operands.
Construction of Syntax Trees
Example: Syntax tree
For the following grammar show the parse tree and syntax
tree for the expression 3*5+4
E → E + T | E – T | T
T → T * F | T / F | F
F → ( E ) | digit | id
Construction of Syntax Trees
Example: Syntax tree
Parse tree for the expression 3*5+4
E
├── E
│   └── T
│       ├── T1
│       │   └── F
│       │       └── 3
│       ├── *
│       └── F
│           └── 5
├── +
└── T
    └── F
        └── 4
Construction of Syntax Trees
Example: Syntax tree
Syntax tree for the expression 3*5+4
+
├── *
│   ├── 3
│   └── 5
└── 4
Construction of Syntax Trees
We consider two SDD's for constructing syntax trees
for expressions.
1. The first, an S-attributed definition, is
suitable for use during bottom-up parsing.
2. The second, L-attributed, is suitable for use
during top-down parsing.
Construction of Syntax Trees
Syntax tree for S-attributed definition:
A syntax-tree node representing an expression E1 + E2
has label + and two children representing the
subexpressions E1 and E2.
We shall implement the nodes of a syntax tree by objects
with a suitable number of fields.
Each object will have an op field that is the label of the
node.
Construction of Syntax Trees
Syntax tree for S-attributed definition:
The objects will have additional fields, and are created by
two constructors:
1. Leaf(op, val): called only for terminals; it creates a
leaf node with two fields:
• the op field holds the label of the node;
• the val field holds the lexical value obtained from the
lexical analyzer.
2. Node(op, c1, c2, …, ck): called to create interior nodes,
with the fields:
• the op field holds the label of the node;
• c1, c2, …, ck refer to the children of the node labeled op.
Construction of Syntax Trees
Syntax tree for S-attributed definition:
Example:
Obtain the semantic rules to construct a syntax tree for
simple arithmetic expression grammar.
E → E1 + T
E → E1 – T
E → T
T → ( E )
T → id
T → number
Construction of Syntax Trees
Syntax tree for S-attributed definition:
Example:
The S-attributed definition for constructing syntax trees
for simple expressions is shown below:
Productions          Semantic Rules
E → E1 + T           E.node = new Node(‘+’, E1.node, T.node)
E → E1 - T           E.node = new Node(‘-’, E1.node, T.node)
E → T                E.node = T.node
T → ( E )            T.node = E.node
T → id               T.node = new Leaf(id, id.entry)
T → num              T.node = new Leaf(num, num.val)
This S-attributed definition constructs syntax trees for a
simple expression grammar involving only the binary operators
+ and -. As usual, these operators are at the same precedence
level and are jointly left associative. All nonterminals have
one synthesized attribute, node, which represents a node of
the syntax tree.
Construction of Syntax Trees
Syntax tree for S-attributed definition:
Example:
• Every time the first production E → E1 + T is used, its
rule creates a node with ‘+’ for op and two children,
E1.node and T.node, for the subexpressions.
• The second production has a similar rule.
• For production 3, E → T , no node is created, since
E.node is the same as T.node.
Construction of Syntax Trees
Syntax tree for S-attributed definition:
Example:
• Similarly, no node is created for production 4, T → (E).
The value of T.node is the same as E.node, since
parentheses are used only for grouping; they influence the
structure of the parse tree and the syntax tree, but once
their job is done, there is no further need to retain them in
the syntax tree.
• The last two T-productions have a single terminal on
the right. We use the constructor Leaf to create a suitable
node, which becomes the value of T.node.
Construction of Syntax Trees
Syntax tree for S-attributed definition:
Example:
The annotated parse tree depicting the construction of the
syntax tree for the arithmetic expression a - 4 + c is shown
below:
E.node
├── E.node
│   ├── E.node
│   │   └── T.node
│   │       └── id (a)
│   ├── -
│   └── T.node
│       └── num (4)
├── +
└── T.node
    └── id (c)
Construction of Syntax Trees
Syntax tree for S-attributed definition:
Example:
Steps in the construction of the syntax tree for a–4+c:
1) p1 = new Leaf (id, entry-a);
2) p2 = new Leaf (num, 4);
3) p3 = new Node(‘-’, p1, p2);
4) p4 = new Leaf (id, entry-c);
5) p5 = new Node(‘+’, p3, p4);
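A small Python sketch of these five steps (the Leaf and Node constructor names follow the slides; the rest, including the entry strings, is illustrative only):

class Leaf:
    def __init__(self, op, val):
        self.op, self.val = op, val              # op is the label, val the lexical value

class Node:
    def __init__(self, op, *children):
        self.op, self.children = op, children    # op is the label, children the subtrees

p1 = Leaf('id', 'entry-a')        # 1) leaf for a
p2 = Leaf('num', 4)               # 2) leaf for 4
p3 = Node('-', p1, p2)            # 3) node for a - 4
p4 = Leaf('id', 'entry-c')        # 4) leaf for c
p5 = Node('+', p3, p4)            # 5) root node for (a - 4) + c

print(p5.op, p5.children[0].op)   # '+' '-' : the root and its left child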
Construction of Syntax Trees
Syntax tree for L-attributed definition:
• The method of constructing syntax tree for
L-attributed definition remains same as the method
of constructing syntax tree for S-attributed
definition.
• The functions Leaf() and Node() are used with same
number and type of parameters.
Construction of Syntax Trees
Syntax tree for L-attributed definition:
Example:
Obtain the semantic rules to construct a syntax tree for
simple arithmetic expression grammar using top-down
approach with operators + and -.
E → E1 + T
E → E1 – T
E → T
T → ( E )
T → id
T → number
Construction of Syntax Trees
Syntax tree for L-attributed definition:
Example:
The given grammar is not suitable for top-down parser since
it has left recursion. After eliminating left recursion, we get
the following grammar:
E → T E’
E’ → + T E1’
E’ → - T E1’
E’ → ϵ
T → ( E )
T → id
T → number
Construction of Syntax Trees
Syntax tree for L-attributed definition:
Example:
The L-attributed definition for constructing
syntax trees during top-down parsing is shown below:
Productions          Semantic Rules
E → T E’             E.node = E’.syn
                     E’.inh = T.node
E’ → + T E1’         E1’.inh = new Node(‘+’, E’.inh, T.node)
                     E’.syn = E1’.syn
E’ → - T E1’         E1’.inh = new Node(‘-’, E’.inh, T.node)
                     E’.syn = E1’.syn
E’ → ϵ               E’.syn = E’.inh
T → ( E )            T.node = E.node
T → id               T.node = new Leaf(id, id.entry)
T → num              T.node = new Leaf(num, num.val)
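A rough Python sketch (my own, with simplified token handling and tuple-based Leaf/Node helpers) of how the inherited attribute E’.inh threads the partially built syntax tree through a top-down parse of a - 4 + c, so that the left-associative tree is still produced:

def Leaf(op, val):          return ('leaf', op, val)
def Node(op, left, right):  return (op, left, right)

def E(tokens):                          # E -> T E' : E'.inh = T.node; E.node = E'.syn
    node, rest = T(tokens)
    return Eprime(node, rest)

def Eprime(inh, tokens):                # E' -> + T E1' | - T E1' | epsilon
    if tokens and tokens[0] in '+-':
        op, (tnode, rest) = tokens[0], T(tokens[1:])
        return Eprime(Node(op, inh, tnode), rest)   # E1'.inh = new Node(op, E'.inh, T.node)
    return inh, tokens                  # E' -> epsilon : E'.syn = E'.inh

def T(tokens):                          # T -> id | num (parenthesised case omitted)
    tok = tokens[0]
    kind = 'num' if isinstance(tok, int) else 'id'
    return Leaf(kind, tok), tokens[1:]

tree, _ = E(['a', '-', 4, '+', 'c'])
print(tree)   # ('+', ('-', ('leaf', 'id', 'a'), ('leaf', 'num', 4)), ('leaf', 'id', 'c'))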
Construction of Syntax Trees
Syntax tree for L-attributed definition:
Example:
Dependency graph for a - 4 + c, with the SDD shown above:
Parse-tree node          Attribute nodes
E                        13: E.node
├── T                    2: T.node
│   └── id (a)           1: entry
└── E’                   5: inh    12: syn
    ├── -
    ├── T                4: T.node
    │   └── num (4)      3: val
    └── E’               6: inh    11: syn
        ├── +
        ├── T            8: T.node
        │   └── id (c)   7: entry
        └── E’           9: inh    10: syn
            └── ϵ
The Structure of a Type
What is the use of inherited attributes
(I-attributes)?
• For top-down parsing, the grammar should not have left
recursion. If the grammar has left recursion, we have to
eliminate it, and the resulting grammar needs inherited
attributes.
• Inherited attributes are useful when the structure of the
parse tree differs from the abstract syntax of the input;
attributes can then be used to carry information from one
part of the parse tree to another.
• But sometimes, even though the grammar does not have
left recursion, the language itself demands inherited
attributes.
The Structure of a Type
What is the use of inherited attributes
(I-attributes)?
This can be explained by considering array type as shown
below:
Example:
Give the Syntax Directed Translation (SDT) of type
int [2][3] and also give the semantic rules for the respective
productions.
T → B C
B → int
B → float
C → [num] C1
C → ϵ
The Structure of a Type
Example:
In C, the type int [2][3] can be read as, “array of 2
arrays of 3 integers.” The corresponding type expression
array(2, array(3, integer)) is represented by
the tree shown in below Fig. The operator array takes two
parameters, a number and a type. If types are represented by
trees, then this operator returns a tree node labeled array
with two children for a number and a type.
array
├── 2
└── array
    ├── 3
    └── integer
Fig. Type expression for int[2][3]
The Structure of a Type
Example: The SDD is shown below:
Nonterminal T generates either a basic type or an array type.
Nonterminal B generates one of the basic types int and float.
T generates a basic type when T derives BC and C derives ϵ.
Otherwise, C generates array components consisting of a
sequence of integers, each integer surrounded by brackets.
Productions          Semantic Rules
T → B C              T.t = C.t
                     C.b = B.t
B → int              B.t = integer
B → float            B.t = float
C → [ num ] C1       C.t = array(num.val, C1.t)
                     C1.b = C.b
C → ϵ                C.t = C.b
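A brief Python sketch of these rules (illustrative only; array, C and T here are plain functions standing in for the semantic rules): the inherited attribute b carries the basic type down, and the synthesized attribute t assembles the array type on the way back up.

def array(size, elem):                    # type constructor used by C -> [num] C1
    return ('array', size, elem)

def C(dims, b):                           # b is the inherited attribute C.b
    if not dims:                          # C -> epsilon : C.t = C.b
        return b
    return array(dims[0], C(dims[1:], b)) # C.t = array(num.val, C1.t), C1.b = C.b

def T(basic, dims):                       # T -> B C : C.b = B.t ; T.t = C.t
    return C(dims, basic)

print(T('integer', [2, 3]))               # ('array', 2, ('array', 3, 'integer'))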
The Structure of a Type
Example:
The nonterminals B and T have a synthesized attribute t
representing a type. The nonterminal C has two
attributes: an inherited attribute b and a synthesized
attribute t. The inherited b attributes pass a basic type
down the tree, and the synthesized t attributes
accumulate the result.
The Structure of a Type
Example:
An annotated parse tree for the input string int[2][3] is
shown in below Fig.
T.t = array(2, array(3, integer))
├── B.t = integer
│   └── int
└── C.b = integer, C.t = array(2, array(3, integer))
    ├── [ 2 ]
    └── C.b = integer, C.t = array(3, integer)
        ├── [ 3 ]
        └── C.b = integer, C.t = integer
            └── ϵ
Fig. Syntax-Directed Translation of array types
Summary...
Applications of Syntax-Directed Translation (SDT)
• Construction of Syntax Trees
• The Structure of a Type
Reading: Aho2, Section 5.3.1 & 5.3.2
Next Lecture: Syntax-Directed Translation (SDT) Schemes
Compiler Design
UNIT-III: Syntax-Directed Translation
Lecture-32. Syntax-Directed Translation (SDT) Schemes
Syntax-Directed Translation (SDT) Schemes
• Postfix Translation Schemes
• Parser-Stack Implementation of Postfix SDT's
• SDT's With Actions Inside Productions
Syntax-Directed Translation (SDT) Schemes
Syntax-Directed Translation schemes (SDTs) are a
complementary notation to Syntax-Directed Definitions (SDDs).
All of the applications of Syntax-Directed Definitions can be
implemented using Syntax-Directed Translation schemes.
Definition:
A Syntax-Directed Translation scheme (SDT) is
a context-free grammar with program fragments embedded within
production bodies. The program fragments are called semantic
actions and can appear at any position within a production body.
By convention, we place curly braces around actions; if braces
are needed as grammar symbols, then we quote them.
Syntax-Directed Translation (SDT) Schemes
• Any SDT can be implemented by first building a parse tree
and then performing the actions in a left-to-right depth-
first order; that is, during a preorder traversal.
• Typically, SDT's are implemented during parsing, without
building a parse tree.
• The use of SDT's to implement two important classes of
SDD's:
1. The underlying grammar is LR-parsable, and the SDD
is S-attributed.
2. The underlying grammar is LL-parsable, and the SDD
is L-attributed.
Syntax-Directed Translation (SDT) Schemes
• The semantic rules in an SDD can be converted into an
SDT with actions that are executed at the right time.
During parsing, an action in a production body is executed
as soon as all the grammar symbols to the left of the action
have been matched.
• SDT's that can be implemented during parsing can be
characterized by introducing distinct marker nonterminals
in place of each embedded action; each marker M has only
one production, M → ϵ . If the grammar with marker non-
terminals can be parsed by a given method, then the SDT
can be implemented during parsing.
Postfix Translation Schemes
• The simplest SDD implementation arises when we can parse
the grammar bottom-up and the SDD is S-attributed.
• An SDT is constructed such that the actions to be executed
are placed at the end of the production and are executed
only when the RHS of the production is reduced to LHS of
the production i.e., reduction of the body to the head of the
production.
• The SDT’s with all actions at the right end of the
production bodies are called postfix SDT’s or
postfix syntax-directed translations.
Postfix Translation Schemes
Example:
Obtain postfix SDT implementation of the desk calculator to
evaluate the given expression.
Solution: SDT can be easily obtained by looking at the SDD
shown below:
Productions Semantic Rules
S → En S.val = E.val
E → E1 + T E.val = E1.val + T.val
E → T E.val = T.val
T → T1 * F T.val = T1.val * F.val
T → F T.val = F.val
F → ( E ) F.val = E.val
F → digit F.val = digit.lexval
Postfix Translation Schemes
Example: Solution
The Postfix SDT implementation of the desk calculator
is shown below:
Productions Actions
S → E n {print(E.val);}
E → E1 + T {E.val = E1.val + T.val;}
E → T {E.val = T.val;}
T → T1 * F {T.val = T1.val * F.val;}
T → F {T.val = F.val;}
F → ( E ) {F.val = E.val;}
F → digit {F.val = digit.lexval;}
Parser-Stack Implementation of Postfix SDT's
• Postfix SDT's can be implemented during LR
parsing by executing the actions when reductions
occur.
• The attribute(s) of each grammar symbol can be put
on the stack in a place where they can be found during
the reduction.
• The best plan is to place the attributes along with the
grammar symbols (or the LR states that represent these
symbols) in records on the stack itself.
Parser-Stack Implementation of Postfix SDT's
In the following Fig. , the parser stack contains records with
a field for a grammar symbol (or parser state) and, below it, a
field for an attribute. The three grammar symbols XYZ are
on top of the stack; perhaps they are about to be reduced
according to a production like A → XYZ. Here, we show
X.x as the one attribute of X, and so on.
          State/grammar symbol    Synthesized attribute(s)
top       Z                       Z.z
top-1     Y                       Y.y
top-2     X                       X.x
Fig: Parser stack with a field for synthesized attributes
Parser-Stack Implementation of Postfix SDT's
• If the attributes are all synthesized, and the actions
occur at the ends of the productions, then we can
compute the attributes for the head when we reduce
the body to the head.
• If we reduce by a production such as A → XYZ, then
we have all the attributes of X, Y , and Z available, at
known positions on the stack, as shown in the above
Fig.
• After the action, A and its attributes are at the top of
the stack, in the position of the record for X.
Parser-Stack Implementation of Postfix SDT's
Example: Implementing the desk calculator on a bottom-
up parsing stack
Productions          Actions
S → E n              { print(stack[top-1].val);
                       top = top - 1; }
E → E1 + T           { stack[top-2].val = stack[top-2].val + stack[top].val;
                       top = top - 2; }
E → T
T → T1 * F           { stack[top-2].val = stack[top-2].val * stack[top].val;
                       top = top - 2; }
T → F
F → ( E )            { stack[top-2].val = stack[top-1].val;
                       top = top - 2; }
F → digit
Parser-Stack Implementation of Postfix SDT's
Example:
• Suppose that the stack is kept in an array of records
called stack, with top a cursor to the top of the stack.
• Thus, stack[top] refers to the top record on the
stack, stack[top - 1] to the record below that,
and so on.
• Also, we assume that each record has a field called
val, which holds the attribute of whatever grammar
symbol is represented in that record.
• Thus, we may refer to the attribute E.val that
appears at the third position on the stack as
stack[top - 2].val.
Compiler Design Lecture-32 13
Parser-Stack Implementation of Postfix SDT's
Example:
• For instance, in the second production, E → E1 + T,
we go two positions below the top to get the value of
E1, and we find the value of T at the top. The resulting
sum is placed where the head E will appear after the
reduction, that is, two positions below the current top.
The reason is that after the reduction, the three
topmost stack symbols are replaced by one. After
computing E.val, we pop two symbols off the top of
the stack, so the record where we placed E.val will
now be at the top of the stack.
Compiler Design Lecture-32 14
Parser-Stack Implementation of Postfix SDT's
Example:
• In the third production, E → T, no action is necessary, because the length of the stack does not change and the value of T.val at the stack top simply becomes the value of E.val.
• The same observation applies to the productions T → F and F → digit.
• Production F → ( E ) is slightly different. Although the value does not change, two positions are removed from the stack during the reduction, so the value has to move to the position it will occupy after the reduction.
Compiler Design Lecture-32 15
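To make the stack manipulation concrete, the following is a minimal Python sketch. It is not part of the original slides, and the record layout, the shift/reduce helper names, and the hand-driven reduction sequence are illustrative assumptions. Each stack record holds a grammar symbol and a val field, each reduce function performs exactly the action given in the table for its production, and running the sequence for the input 3 * 5 + 4 n prints 19.

    stack = []                                 # each record: {'sym': ..., 'val': ...}

    def shift(sym, val=None):
        stack.append({'sym': sym, 'val': val})

    def reduce_F_digit():                      # F -> digit : the value stays in place
        stack[-1]['sym'] = 'F'

    def reduce_T_F():                          # T -> F : no action needed
        stack[-1]['sym'] = 'T'

    def reduce_E_T():                          # E -> T : no action needed
        stack[-1]['sym'] = 'E'

    def reduce_T_T_mul_F():                    # T -> T * F
        stack[-3]['val'] = stack[-3]['val'] * stack[-1]['val']
        stack[-3]['sym'] = 'T'
        del stack[-2:]                         # pop the records for * and F

    def reduce_E_E_plus_T():                   # E -> E + T
        stack[-3]['val'] = stack[-3]['val'] + stack[-1]['val']
        stack[-3]['sym'] = 'E'
        del stack[-2:]                         # pop the records for + and T

    def reduce_S_E_n():                        # S -> E n
        print(stack[-2]['val'])
        del stack[-1:]                         # pop the record for n
        stack[-1]['sym'] = 'S'

    # hand-driven shift/reduce sequence for the input 3 * 5 + 4 n
    shift('digit', 3); reduce_F_digit(); reduce_T_F()
    shift('*'); shift('digit', 5); reduce_F_digit(); reduce_T_T_mul_F(); reduce_E_T()
    shift('+'); shift('digit', 4); reduce_F_digit(); reduce_T_F(); reduce_E_E_plus_T()
    shift('n'); reduce_S_E_n()                 # prints 19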
SDT's With Actions Inside Productions
• An action may be placed at any position within the body of a
production.
• It is performed immediately after all symbols to its left are
processed.
• Thus, if we have a production B → X {a} Y , the action
a is done after we have recognized X (if X is a terminal) or
all the terminals derived from X (if X is a nonterminal).
More precisely,
• If the parse is bottom-up, then we perform action a as soon
as this occurrence of X appears on the top of the parsing
stack.
• If the parse is top-down, we perform a just before we attempt to expand this occurrence of Y (if Y is a nonterminal) or check for Y on the input (if Y is a terminal).
Compiler Design Lecture-32 16
SDT's With Actions Inside Productions
Example: Problematic SDT for infix-to-prefix translation
during parsing
As an extreme example of a problematic SDT, suppose that
we turn our desk-calculator running example into an SDT
that prints the prefix form of an expression, rather than
evaluating the expression. The productions and actions are
shown below:
1) S → E n
2) E → { print(‘+’); } E1 + T
3) E → T
4) T → { print(‘*’); } T1 * F
5) T → F
6) F → ( E )
7) F → digit { print(digit.lexval); }
Compiler Design Lecture-32 17
SDT's With Actions Inside Productions
Example: Problematic SDT for infix-to-prefix translation
during parsing
• Unfortunately, it is impossible to implement this SDT
during either top-down or bottom-up parsing, because the
parser would have to perform critical actions, like printing
instances of * or +, long before it knows whether these
symbols will appear in its input.
• Using marker nonterminals M2 and M4 for the actions in
productions 2 and 4, respectively, on input that is a digit,
a shift-reduce parser has conflicts between reducing by
M2 → ϵ, reducing by M4 → ϵ, and shifting the digit.
Compiler Design Lecture-32 18
SDT's With Actions Inside Productions
Any SDT can be implemented as follows:
1. Ignoring the actions, parse the input and produce a parse
tree as a result.
2. Then, examine each interior node N, say one for
production A → α. Add additional children to N for the
actions in α, so the children of N from left to right have
exactly the symbols and actions of α.
3. Perform a preorder traversal of the tree, and as soon as a
node labeled by an action is visited, perform that action.
Compiler Design Lecture-32 19
SDT's With Actions Inside Productions
For instance, the following Fig. shows the parse tree for expression 3
* 5 + 4 with actions inserted. If we visit the nodes in preorder, we
get the prefix form of the expression: + * 3 5 4.
[Figure: parse tree for 3 * 5 + 4 n. The root (labelled L) has children E and n; the actions { print(‘+’); }, { print(‘*’); }, { print(3); }, { print(5); }, and { print(4); } appear as extra children of the E, T, and F nodes to which they belong.]
Fig: Parse tree with actions embedded
Compiler Design Lecture-32 20
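The three steps above (build the parse tree, attach each action as an extra child, then run the actions during a preorder traversal) can be sketched in a few lines of Python. This is only an illustration, not taken from the slides: parse-tree nodes are nested tuples, embedded actions are zero-argument callables, and the tree literal below encodes the figure for 3 * 5 + 4.

    def preorder(node):
        if callable(node):                     # an embedded action: execute it
            node()
        elif isinstance(node, tuple):          # interior node: visit children left to right
            for child in node[1:]:
                preorder(child)
        # plain terminals such as '*', '+', 'n' need no work

    tree = ('L',
            ('E',
             lambda: print('+', end=' '),          # action of production 2
             ('E', ('T',
                    lambda: print('*', end=' '),   # action of production 4
                    ('T', ('F', lambda: print(3, end=' '))),
                    '*',
                    ('F', lambda: print(5, end=' ')))),
             '+',
             ('T', ('F', lambda: print(4, end=' ')))),
            'n')

    preorder(tree)                             # prints: + * 3 5 4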
Summary...
Syntax-Directed Translation (SDT) Schemes
• Postfix Translation Schemes
• Parser-Stack Implementation of Postfix SDT's
• SDT's With Actions Inside Productions
Reading: Aho2, Section 5.4.1 to 5.4.3
Next Lecture: Implementing L-Attributed SDD's
Compiler Design Lecture-33 1
Compiler Design
UNIT-III: Syntax-Directed Translation
Lecture-33
Implementing L-Attributed SDD's
Implementing L-Attributed SDD's
• Translation During Recursive-Descent Parsing
• On-The-Fly Code Generation
• L-Attributed SDD's and LL Parsing
• Bottom-Up Parsing of L-Attributed SDD's
Compiler Design Lecture-33 2
Implementing L-Attributed SDD's
The following methods do translation by traversing a
parse tree:
1. Build the parse tree and annotate. This method works
for any noncircular SDD whatsoever.
2. Build the parse tree, add actions, and execute the
actions in preorder. This approach works for any L-
attributed definition. We discussed how to turn
an L-attributed SDD into an SDT; in
particular, we discussed how to embed actions into
productions based on the semantic rules of such an
SDD.
Compiler Design Lecture-33 3
Implementing L-Attributed SDD's
We discuss the following methods for translation during
parsing:
3. Use a recursive-descent parser with one function for
each nonterminal. The function for nonterminal A
receives the inherited attributes of A as arguments
and returns the synthesized attributes of A.
4. Generate code on the fly, using a recursive-descent
parser.
5. Implement an SDT in conjunction with an LL-parser.
The attributes are kept on the parsing stack, and the
rules fetch the needed attributes from known
locations on the stack.
Compiler Design Lecture-33 4
Implementing L-Attributed SDD's
We discuss the following methods for translation during
parsing:
6. Implement an SDT in conjunction with an LR-parser.
This method may be surprising, since the SDT for an
L-attributed SDD typically has actions in the
middle of productions, and we cannot be sure during
an LR parse that we are even in that production
until its entire body has been constructed. We shall
see, however, that if the underlying grammar is LL,
we can always handle both the parsing and
translation bottom-up.
Compiler Design Lecture-33 5
Translation During Recursive-Descent Parsing
A recursive-descent parser has a function A for each
nonterminal A, as discussed in Top-down parser. We can
extend the parser into a translator as follows:
a) The arguments of function A are the inherited
attributes of nonterminal A.
b) The return-value of function A is the collection of
synthesized attributes of nonterminal A.
Compiler Design Lecture-33 6
Translation During Recursive-Descent Parsing
In the body of function A, we need to both parse and
handle attributes:
1. Decide upon the production used to expand A.
2. Check that each terminal appears on the input when it
is required. We shall assume that no backtracking is
needed, but the extension to recursive-descent
parsing with backtracking can be done by restoring
the input position upon failure.
3. Preserve, in local variables, the values of all
attributes needed to compute inherited attributes for
nonterminals in the body or synthesized attributes for
the head nonterminal.
Compiler Design Lecture-33 7
Translation During Recursive-Descent Parsing
In the body of function A, we need to both parse and
handle attributes:
4. Call functions corresponding to nonterminals in the
body of the selected production, providing them with
the proper arguments. Since the underlying SDD is
L-attributed, we have already computed these
attributes and stored them in local variables.
Compiler Design Lecture-33 8
Translation During Recursive-Descent Parsing
Example:
Let us consider the SDD and SDT for while statements:
S → while ( C ) S1
Here, S is the nonterminal that generates all kinds of
statements, including if-statements, assignment
statements, and others. In this example, C stands for a
conditional expression- a boolean expression that
evaluates to true or false.
Compiler Design Lecture-33 9
Translation During Recursive-Descent Parsing
Example:
SDD for while-statements:
S → while ( C ) S1
Semantic rules for S → while ( C ) S1:
    L1 = new()
    L2 = new()
    S1.next = L1
    C.false = S.next
    C.true = L2
    S.code = label || L1 || C.code || label || L2 || S1.code
Compiler Design Lecture-33 10
Translation During Recursive-Descent Parsing
Example:
SDT for while-statements:
S → while ( { L1 = new(); L2 = new(); C.false = S.next; C.true = L2; }
      C )   { S1.next = L1; }
      S1    { S.code = label || L1 || C.code || label || L2 || S1.code; }
Compiler Design Lecture-33 11
Translation During Recursive-Descent Parsing
Example:
Pseudocode for the relevant parts of the function S is shown below [implementing while-statements with a recursive-descent parser]:
string S(label next) {
    string Scode, Ccode;    /* local variables holding code fragments */
    label L1, L2;           /* the local labels */
    if ( current input == token while ) {
        advance input;
        check ‘(‘ is next on the input, and advance;
Compiler Design Lecture-33 12
Translation During Recursive-Descent Parsing
Example:
Implementing while-statements with a recursive-descent parser
        L1 = new();
        L2 = new();
        Ccode = C(next, L2);
        check ‘)’ is next on the input, and advance;
        Scode = S(L1);
        return("label" || L1 || Ccode || "label" || L2 || Scode);
    }
    else /* other statement types */
}
Compiler Design Lecture-33 13
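For readers who want to execute the idea, here is a compact Python rendering of the pseudocode above. It is a sketch under simplifying assumptions that are not in the slides: a global token list stands in for the input, new() is a counter-based label generator, and C() is a stub that consumes a single cond token and returns placeholder jumping code.

    tokens = ['while', '(', 'cond', ')', 'other']
    pos = 0
    label_count = 0

    def new():                                  # fresh-label generator
        global label_count
        label_count += 1
        return 'L' + str(label_count)

    def match(expected):                        # check the next token and advance
        global pos
        assert tokens[pos] == expected
        pos += 1

    def C(false_label, true_label):             # stub for conditional expressions
        match('cond')
        return 'if cond goto ' + true_label + '\n' + 'goto ' + false_label + '\n'

    def S(next_label):                          # argument = inherited attribute S.next
        if tokens[pos] == 'while':
            match('while'); match('(')
            L1 = new(); L2 = new()
            Ccode = C(next_label, L2)           # C.false = S.next, C.true = L2
            match(')')
            Scode = S(L1)                       # S1.next = L1
            return ('label ' + L1 + '\n' + Ccode +
                    'label ' + L2 + '\n' + Scode)   # return value = synthesized S.code
        else:
            match(tokens[pos])                  # other statement types (stub)
            return 'stmt\n'

    print(S('Lexit'))

The printed result (label L1, if cond goto L2, goto Lexit, label L2, stmt) has exactly the shape S.code = label || L1 || C.code || label || L2 || S1.code prescribed by the SDD.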
On-The-Fly Code Generation
• The construction of long strings of code that are
attribute values, is undesirable for several reasons,
including the time it could take to copy or move long
strings.
• In common cases such as our running code generation
example, we can instead incrementally generate pieces
of the code into an array or output file by executing
actions in an SDT.
Compiler Design Lecture-33 14
On-The-Fly Code Generation
Example: On-the-fly recursive-descent code generation for while-statements
We can modify the function for while-statements to emit elements of the main translation S.code instead of saving them for concatenation into a return value of S.code. The revised function S is shown below:
void S(label next) {
    label L1, L2;           /* the local labels */
    if ( current input == token while ) {
        advance input;
        check ‘(‘ is next on the input, and advance;
        L1 = new();
        L2 = new();
Compiler Design Lecture-33 15
On-The-Fly Code Generation
Example:
        print(“label”, L1);
        C(next, L2);
        check ‘)’ is next on the input, and advance;
        print(“label”, L2);
        S(L1);
    }
    else /* other statement types */
}
Compiler Design Lecture-33 16
On-The-Fly Code Generation
Example: SDT for on-the-fly code
generation for while statements
S → while (  { L1 = new(); L2 = new();
               C.false = S.next; C.true = L2;
               print(“label”, L1); }
      C )    { S1.next = L1; print(“label”, L2); }
      S1
Compiler Design Lecture-33 17
L-Attributed SDD's and LL Parsing
• Suppose that an L-attributed SDD is based on an
LL-grammar and that we have converted it to an SDT
with actions embedded in the productions.
• We can then perform the translation during LL
parsing by extending the parser stack to hold actions
and certain data items needed for attribute evaluation.
Typically, the data items are copies of attributes.
• In addition to records representing terminals and
nonterminals, the parser stack will hold action-records
representing actions to be executed and synthesize-
records to hold the synthesized attributes for
nonterminals.
Compiler Design Lecture-33 18
L-Attributed SDD's and LL Parsing
We use the following two principles to manage attributes
on the stack:
1. The inherited attributes of a nonterminal A are placed
in the stack record that represents that nonterminal. The
code to evaluate these attributes will usually be
represented by an action-record immediately above the
stack record for A; in fact, the conversion of L-
attributed SDD's to SDT's ensures that the action-
record will be immediately above A.
2. The synthesized attributes for a nonterminal A are
placed in a separate synthesize-record that is
immediately below the record for A on the stack.
Compiler Design Lecture-33 19
L-Attributed SDD's and LL Parsing
• This strategy places records of several types on the parsing
stack, trusting that these variant record types can be
managed properly as subclasses of a “stack-record” class.
In practice, we might combine several records into one, but
the ideas are perhaps best explained by separating data used
for different purposes into different records.
• Action-records contain pointers to code to be executed.
Actions may also appear in synthesize-records; these
actions typically place copies of the synthesized attribute(s)
in other records further down the stack, where the value of
that attribute will be needed after the synthesize-record and
its attributes are popped off the stack.
Compiler Design Lecture-33 20
L-Attributed SDD's and LL Parsing
Example-1: Expansion of S according to
the while-statement production
S → while ( C ) S1
Figure (a) shows the situation as we are about to use the while-
production to expand S, because the lookahead symbol on the input
is while. The record at the top of stack is for S, and it contains
only the inherited attribute S.next, which we suppose has the
value x. Since we are now parsing top-down, we show the stack top
at the left, according to our usual convention.
top →  [ S | next = x ]
Fig (a).
Compiler Design Lecture-33 21
L-Attributed SDD's and LL Parsing
Example-1: Expansion of S according to
the while-statement production
S → while ( C ) S1
Stack immediately after the expansion (top at the left):
    [ while ]   [ ( ]   [ Action | snext = x, L1 = ?, L2 = ? ]   [ C | false = ?, true = ? ]   [ ) ]   [ Action | al1 = ?, al2 = ? ]   [ S1 | next = ? ]
The first action-record carries the code
    L1 = new();
    L2 = new();
    stack[top - 1].false = snext;
    stack[top - 1].true = L2;
    stack[top - 3].al1 = L1;
    stack[top - 3].al2 = L2;
    print(“label”, L1);
and the second action-record carries the code
    stack[top - 1].next = al1;
    print(“label”, al2);
Fig (b).
Compiler Design Lecture-33 22
L-Attributed SDD's and LL Parsing
Example-1: Expansion of S according to
the while-statement production
S → while ( C ) S1
Figure (b) shows the situation immediately after we have
expanded S. There are action-records in front of the
nonterminals C and S1, corresponding to the actions in the
underlying SDT for on-the-fly code generation for while
statements.
The record for C has room for inherited attributes true and
false, while the record for S1 has room for attribute next, as
all S-records must. We show values for these fields as ?,
because we do not yet know their values.
Compiler Design Lecture-33 23
L-Attributed SDD's and LL Parsing
Example-2: Expansion of S with synthesized
attribute constructed on the stack
S → while ( C ) S1
top →  [ S | next = x ]   [ Synthesize S.code | code = ? ]
(the record for S holds its data; the synthesize-record below it will collect the code for S)
Fig (a).
Compiler Design Lecture-33 24
L-Attributed SDD's and LL Parsing
Example-2: Expansion of S with synthesized
attribute constructed on the stack
Stack immediately after the expansion (top at the left), now with synthesize-records:
    [ while ]   [ ( ]   [ Action | L1 = ?, L2 = ? ]   [ C | false = ?, true = ? ]   [ Synthesize C.code | code = ? ]   [ ) ]   [ S1 | next = ? ]   [ Synthesize S1.code | code = ?, Ccode = ?, l1 = ?, l2 = ? ]   [ Synthesize S.code | code = ? ]
The action-record carries the code
    L1 = new();
    L2 = new();
    stack[top - 1].true = L2;
    stack[top - 4].next = L1;
    stack[top - 5].l1 = L1;
    stack[top - 5].l2 = L2;
the synthesize-record for C.code carries the action
    stack[top - 3].Ccode = code;
and the synthesize-record for S1.code carries the action
    stack[top - 1].code = “label” || l1 || Ccode || “label” || l2 || code;
Fig (b).
Compiler Design Lecture-33 25
Bottom-Up Parsing of L-Attributed SDD's
We can do bottom-up every translation that we can do top-
down. More precisely, given an L-attributed SDD on an LL
grammar, we can adapt the grammar to compute the same SDD
on the new grammar during an LR parse. The “trick” has three
parts:
1. Start with the SDT constructed earlier, which places
embedded actions before each nonterminal to compute its
inherited attributes and an action at the end of the
production to compute synthesized attributes.
2. Introduce into the grammar a marker nonterminal in place
of each embedded action. Each such place gets a distinct
marker, and there is one production for any marker M,
namely M → ϵ.
Compiler Design Lecture-33 26
Bottom-Up Parsing of L-Attributed SDD's
3. Modify the action a if marker nonterminal M replaces it
in some production A → α {a} β, and associate with
M → ϵ an action a’ that
a) Copies, as inherited attributes of M, any attributes of A or symbols of α that action a needs.
b) Computes attributes in the same way as a, but makes those
attributes be synthesized attributes of M.
Compiler Design Lecture-33 27
Bottom-Up Parsing of L-Attributed SDD's
Example-1:
Suppose that there is a production A → B C in an LL
grammar, and the inherited attribute B.i is computed from
inherited attribute A.i by some formula B.i = f(A.i).
That is, the fragment of an SDT we care about is
A → {B.i = f(A.i); } B C
We introduce marker M with inherited attribute M.i and
synthesized attribute M.s. The former will be a copy of A.i
and the latter will be B.i. The SDT will be written
A → M B C
M → ϵ {M.i = A.i; M.s = f(M.i); }
Compiler Design Lecture-33 28
Bottom-Up Parsing of L-Attributed SDD's
Example-2:
Consider the SDT for while-statements as shown below:
S → while ( { L1 = new(); L2 = new(); C.false = S.next; C.true = L2; }
      C )   { S1.next = L1; }
      S1    { S.code = label || L1 || C.code || label || L2 || S1.code; }
Compiler Design Lecture-33 29
Bottom-Up Parsing of L-Attributed SDD's
Example-2:
Let us turn the above SDT into an SDT that can operate with
an LR parse of the revised grammar. We introduce a
marker M before C and a marker N before S1, so the
underlying grammar becomes
S → while ( M C ) N S1
M → ϵ
N → ϵ
Compiler Design Lecture-33 30
Bottom-Up Parsing of L-Attributed SDD's
Example-2: LR parsing stack after
reduction of ϵ to M
Code executed during reduction of ϵ to M:
    L1 = new();
    L2 = new();
    C.true = L2;
    C.false = stack[top - 3].next;
Stack (top at the right):
    [ ? | S.next ]   [ while ]   [ ( ]   [ M | C.true, C.false, L1, L2 ]   (top)
Compiler Design Lecture-33 31
Bottom-Up Parsing of L-Attributed SDD's
Example-2: Stack just before reduction
of the while-production body to S
Stack (top at the right):
    [ ? | S.next ]   [ while ]   [ ( ]   [ M | C.true, C.false, L1, L2 ]   [ C | C.code ]   [ ) ]   [ N | S1.next ]   [ S1 | S1.code ]   (top)
Compiler Design Lecture-33 32
Summary
Implementing L-Attributed SDD's
• Translation During Recursive-Descent Parsing
• On-The-Fly Code Generation
• L-Attributed SDD's and LL Parsing
• Bottom-Up Parsing of L-Attributed SDD's
Reading: Aho2, Section 5.5.1 to 5.5.4
Next Lecture: Intermediate-Code Generation
Compiler Design Lecture-34 1
Compiler Design
UNIT-III(2): INTERMEDIATE-CODE GENERATION
Lecture-34
Variants of Syntax Trees
Outline:
• Introduction to Intermediate-Code Generation
• Directed Acyclic Graphs (DAG’S) for
Expressions
• The Value-Number Method for Constructing
DAG’s
Compiler Design Lecture-34 2
Intermediate-Code Generation
• In the analysis-synthesis model of a compiler, the front end
analyzes a source program and creates an intermediate
representation, from which the back end generates target
code.
• Ideally, details of the source language are confined to the
front end, and details of the target machine to the back end.
• With a suitably defined intermediate representation, a
compiler for language i and machine j can then be built by
combining the front end for language i with the back end
for machine j.
• This approach to creating a suite of compilers can save a considerable amount of effort: m × n compilers can be built by writing just m front ends and n back ends. For example, 3 front ends and 4 back ends, 7 components in all, yield 3 × 4 = 12 compilers.
Compiler Design Lecture-34 3
Intermediate-Code Generation
Logical structure of a compiler front end
is shown in below fig:
Parser → Static Checker → Intermediate Code Generator → (intermediate code) → Code Generator
(front end: parser, static checker, intermediate code generator; back end: code generator)
Fig: Logical structure of a compiler front end
Compiler Design Lecture-34 4
Intermediate-Code Generation
• Static checking includes type checking, which
ensures that operators are applied to compatible operands.
• It also includes any syntactic checks that remain after
parsing.
• For example, static checking assures that a break-
statement in C is enclosed within a while-, for-, or
switch-statement; an error is reported if such an
enclosing statement does not exist. The approach in this
chapter can be used for a wide range of intermediate
representations, including syntax trees and three-address
code.
Compiler Design Lecture-34 5
Intermediate-Code Generation
• The term “three-address code” comes from instructions of
the general form x = y op z with three addresses: two
for the operands y and z and one for the result x.
• In the process of translating a program in a given source
language into code for a given target machine, a compiler
may construct a sequence of intermediate representations,
as shown in below Fig.
Source Program → High Level Intermediate Representation → ... → Low Level Intermediate Representation → Target Code
Fig: A compiler might use a sequence of intermediate representations
Compiler Design Lecture-34 6
Intermediate-Code Generation
• High-level representations are close to the
source language and low-level representations
are close to the target machine.
• Syntax trees are high level; they depict the
natural hierarchical structure of the source program and
are well suited to tasks like static type checking.
• A low-level representation is suitable for
machine-dependent tasks like register allocation
and instruction selection.
• Three-address code can range from high- to low-
level, depending on the choice of operators.
Compiler Design Lecture-34 7
Intermediate-Code Generation
• For expressions, the differences between syntax
trees and three-address code are superficial.
• For looping statements, for example, a syntax tree
represents the components of a statement, whereas
three-address code contains labels and jump
instructions to represent the flow of control, as in
machine language.
• The choice or design of an intermediate
representation varies from compiler to compiler.
• An intermediate representation may either
be an actual language or it may consist of internal data
structures that are shared by phases of the compiler.
Compiler Design Lecture-34 8
Intermediate-Code Generation
• C is a programming language, yet it is often used as an
intermediate form because it is flexible, it compiles into
efficient machine code, and its compilers are widely
available.
• The original C++ compiler consisted of a front end that
generated C, treating a C compiler as a back end.
Compiler Design Lecture-34 9
Variants of Syntax Trees
• Nodes in a syntax tree represent constructs in the
source program; the children of a node represent the
meaningful components of a construct.
• A Directed Acyclic Graph (hereafter called a
DAG) for an expression identifies the common
subexpressions (subexpressions that occur more than
once) of the expression.
• As we shall see in this section, DAG's can be constructed
by using the same techniques that construct syntax
trees.
Compiler Design Lecture-34 10
Directed Acyclic Graphs for Expressions
• Like the syntax tree for an expression, a DAG has
leaves corresponding to atomic operands and interior
nodes corresponding to operators.
• The difference is that a node N in a DAG has more than one
parent if N represents a common subexpression; in a
syntax tree, the tree for the common subexpression
would be replicated as many times as the subexpression
appears in the original expression.
• Thus, a DAG not only represents expressions more
clearly, it gives the compiler important clues regarding the
generation of efficient code to evaluate the expressions.
Compiler Design Lecture-34 11
Directed Acyclic Graphs for Expressions
Example-1: The following Figure. shows the DAG for
the expression a + a * (b - c) + (b - c) * d
[Figure: a DAG with interior nodes +, +, *, *, and -, and leaves a, b, c, d; the leaf a and the node for b - c each have two parents.]
Fig: DAG for the expression a + a * (b - c) + (b - c) * d
Compiler Design Lecture-34 12
Directed Acyclic Graphs for Expressions
Example-1:
• The leaf for a has two parents, because a appears twice
in the expression.
• More interestingly, the two occurrences of the common
subexpression b-c are represented by one node, the node
labeled -. That node has two parents, representing its two
uses in the subexpressions a*(b-c) and (b-c)*d.
• Even though b and c appear twice in the complete
expression, their nodes each have one parent, since both
uses are in the common subexpression b-c.
Compiler Design Lecture-34 13
Directed Acyclic Graphs for Expressions
Example-2: Syntax-Directed Definition
(SDD) to produce syntax trees or DAG's
Productions Semantic Rules
E → E1 + T E.node = new Node(‘+’, E1.node, T.node)
E → E1 - T E.node = new Node(‘-’, E1.node, T.node)
E → T E.node = T.node
T → ( E ) T.node = E.node
T → id T.node = new Leaf (id, id.entry)
T → num T.node = new Leaf (num, num.val)
Compiler Design Lecture-34 14
Directed Acyclic Graphs for Expressions
Example-2: Steps for constructing the DAG
of the above SDD
The sequence of steps shown below constructs the DAG as
shown in the above Fig., provided Node and Leaf return
an existing node, if possible.
1) p1 = new Leaf (id, entry-a);
2) p2 = new Leaf (id, entry-a) = p1;
3) p3 = new Leaf (id, entry-b);
4) p4 = new Leaf (id, entry-c);
5) p5 = new Node(‘-’, p3, p4);
6) p6 = new Node(‘*’, p1, p5);
Compiler Design Lecture-34 15
Directed Acyclic Graphs for Expressions
Example-2: Steps for constructing the DAG
of the above SDD
7)  p7 = new Node(‘+’, p1, p6);
8)  p8 = new Leaf (id, entry-b) = p3;
9)  p9 = new Leaf (id, entry-c) = p4;
10) p10 = new Node(‘-’, p3, p4) = p5;
11) p11 = new Leaf (id, entry-d);
12) p12 = new Node(‘*’, p5, p11);
13) p13 = new Node(‘+’, p7, p12);
Compiler Design Lecture-34 16
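The effect of Node and Leaf returning an existing node can be sketched in Python. This is an illustration only; the nodes dictionary, the field names, and the use of id() to identify children are assumptions of the sketch rather than part of the slides. Building the expression with these constructors yields a single shared node for b - c, just as steps 8, 9 and 10 above reuse p3, p4, and p5.

    nodes = {}                                  # signature -> node object

    def Leaf(label, value):
        key = ('leaf', label, value)
        if key not in nodes:
            nodes[key] = {'label': label, 'value': value}
        return nodes[key]

    def Node(op, left, right):
        key = ('node', op, id(left), id(right)) # children identified by object identity
        if key not in nodes:
            nodes[key] = {'label': op, 'left': left, 'right': right}
        return nodes[key]

    # a + a * (b - c) + (b - c) * d
    a = Leaf('id', 'a'); b = Leaf('id', 'b'); c = Leaf('id', 'c'); d = Leaf('id', 'd')
    bc  = Node('-', b, c)                       # p5 in the step list
    p7  = Node('+', a, Node('*', a, bc))
    p13 = Node('+', p7, Node('*', bc, d))
    print(Node('-', Leaf('id', 'b'), Leaf('id', 'c')) is bc)   # True: the subexpression is shared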
The Value-Number Method for Constructing DAG’s
The nodes of a syntax tree or DAG are stored in an
array of records, as shown in below Fig.
Fig (a): DAG for the expression i = i + 10. The = node has two children, the leaf for i and a + node; the + node's children are the leaf for i and the leaf for 10.
Fig (b): Array of records
    1   id      (pointer to the symbol-table entry for i)
    2   num     10
    3   +       1   2
    4   =       1   3
    5   ...
Compiler Design Lecture-34 17
The Value-Number Method for Constructing DAG’s
• Each row of the array represents one record, and therefore
one node.
• In each record, the first field is an operation code,
indicating the label of the node.
• In Fig. (b), leaves have one additional field, which holds
the lexical value (either a symbol-table pointer or a
constant, in this case), and interior nodes have two
additional fields indicating the left and right children.
• In this array, we refer to nodes by giving the integer index
of the record for that node within the array. This integer
historically has been called the value number for the node
or for the expression represented by the node.
Compiler Design Lecture-34 18
The Value-Number Method for Constructing DAG’s
• For instance, in the above Fig., the node labeled + has value number 3, and its left and right children have value numbers 1 and 2, respectively.
• In practice, we could use pointers to records or references to
objects instead of integer indexes, but we shall still refer to the
reference to a node as its ”value number.”
• If stored in an appropriate data structure, value numbers help us
construct expression DAG's efficiently.
• Suppose that nodes are stored in an array, as in above Fig., and
each node is referred to by its value number.
• Let the signature of an interior node be the triple < op, l,
r>, where op is the label, l its left child's value number, and r
its right child's value number. A unary operator may be assumed
to have r = 0.
Compiler Design Lecture-34 19
The Value-Number Method for Constructing DAG’s
Algorithm : The value-number method for
constructing the nodes of a DAG.
INPUT: Label op, node l, and node r.
OUTPUT: The value number of a node in the array with
signature < op, l, r >.
METHOD:
• Search the array for a node M with label op, left child l,
and right child r.
• If there is such a node, return the value number of M.
• If not, create in the array a new node N with label op, left
child l, and right child r, and return its value number.
Compiler Design Lecture-34 20
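A direct Python transcription of this algorithm is shown below. It is a sketch; the tuple record layout and the convention that a leaf uses 0 for its missing child are assumptions. The list index of a record serves as its value number (indices start at 0 here, while the figure numbers records from 1).

    nodes = []                                  # nodes[i] is the record with value number i

    def value_number(op, l, r):
        signature = (op, l, r)
        for i, record in enumerate(nodes):      # search the array for an existing node
            if record == signature:
                return i
        nodes.append(signature)                 # otherwise create a new node
        return len(nodes) - 1

    # i = i + 10
    n_i   = value_number('id', 'i', 0)
    n_ten = value_number('num', 10, 0)
    n_add = value_number('+', n_i, n_ten)
    n_asg = value_number('=', n_i, n_add)
    print(n_i, n_ten, n_add, n_asg)             # 0 1 2 3
    print(value_number('+', n_i, n_ten))        # 2 again: the existing + node is reused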
The Value-Number Method for Constructing DAG’s
• Although the above algorithm yields the desired output, searching the entire array every time we are asked to locate one node is expensive, especially if the array holds expressions from an entire program.
• A more efficient approach is to use a hash table, in which
the nodes are put into ”buckets,” each of which
typically will have only a few nodes.
• The hash table is one of several data structures that
support dictionaries efficiently.
Compiler Design Lecture-34 22
The Value-Number Method for Constructing DAG’s
• A dictionary is an abstract data type that allows us to
insert and delete elements of a set, and to determine
whether a given element is currently in the set.
• A good data structure for dictionaries, such as a hash table,
performs each of these operations in time that is constant
or close to constant, independent of the size of the set.
• To construct a hash table for the nodes of a DAG, we need
a hash function h that computes the index of the bucket
for a signature < op, l, r >, in a way that distributes
the signatures across buckets, so that it is unlikely that any
one bucket will get much more than a fair share of the
nodes.
Compiler Design Lecture-34 23
The Value-Number Method for Constructing DAG’s
• The bucket index h(op, l, r) is computed deterministically from op, l, and r, so that we may repeat the calculation and always get the same bucket index for node < op, l, r >.
• The buckets can be implemented as linked lists, as shown
in below Fig.
[Figure: an array of bucket headers indexed by hash value (0, ..., 9, ..., 20, ...). One bucket's header points to a linked list of cells, and the cells hold the value numbers (25, 3, and 2 in the figure) of the nodes that hash to that bucket.]
Fig: Data structure for searching buckets
Compiler Design Lecture-34 24
The Value-Number Method for Constructing DAG’s
• An array, indexed by hash value, holds the bucket headers,
each of which points to the first cell of a list.
• Within the linked list for a bucket, each cell holds the
value number of one of the nodes that hash to that bucket.
That is, node < op, l, r > can be found on the list whose header is at index h(op, l, r) of the array.
• Thus, given the input node op, l, and r, we compute the bucket index h(op, l, r) and search the list of cells in this bucket for the given input node.
Compiler Design Lecture-34 25
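The bucket scheme can be sketched with an ordinary Python dictionary standing in for the array of bucket headers. This is again an illustration; NUM_BUCKETS and the use of Python's built-in hash as h are assumptions. Each bucket holds a short list of (signature, value number) pairs, so only that list has to be searched.

    NUM_BUCKETS = 16
    buckets = {}                                # bucket index -> list of (signature, value number)
    nodes = []

    def value_number(op, l, r):
        signature = (op, l, r)
        h = hash(signature) % NUM_BUCKETS       # bucket index h(op, l, r)
        for sig, vn in buckets.get(h, []):      # search only this bucket
            if sig == signature:
                return vn
        nodes.append(signature)                 # new node: record it and remember its bucket
        vn = len(nodes) - 1
        buckets.setdefault(h, []).append((signature, vn))
        return vn

    print(value_number('+', 1, 2))              # 0: a new node is created
    print(value_number('+', 1, 2))              # 0: found in its bucket, no new node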
Summary
Variants of Syntax Trees
• Introduction to Intermediate-Code Generation
• Directed Acyclic Graphs (DAG’S) for
Expressions
• The Value-Number Method for Constructing
DAG’s
Reading: Aho2, Section 6.1: 6.1.1 & 6.1.2
Next Lecture: Three-Address Code
Compiler Design Lecture-35 1
Compiler Design
UNIT-III(2): INTERMEDIATE-CODE GENERATION
Lecture-35
Three-Address Code
Outline:
• Addresses and Instructions
• Quadruples
• Triples
• Static Single-Assignment Form
Compiler Design Lecture-35 2
Three-Address Code
In three-address code, there is at most one operator on the right
side of an instruction; that is, no built-up arithmetic expressions
are permitted. Thus a source-language expression like x + y *
z might be translated into the sequence of three-address
instructions:
t1 = y * z
t2 = x + t1
where t1 and t2 are compiler-generated temporary names.
This unraveling of multi-operator arithmetic expressions and of nested flow-of-control statements makes three-address code desirable for target-code generation and optimization.
The use of names for the intermediate values computed by a
program allows three-address code to be rearranged easily.
Compiler Design Lecture-35 3
Three-Address Code
Example:
Three-address code is a linearized representation of a syntax
tree or a DAG in which explicit names correspond to the interior
nodes of the graph. A DAG and its corresponding three-address
code is shown below:
[Figure: the same DAG as before, with interior nodes +, +, *, *, and -, and the node for b - c shared.]
Fig: DAG for the expression a + a * (b - c) + (b - c) * d
Compiler Design Lecture-35 4
Three-Address Code
Example:
Given expression: a + a * (b - c) + (b - c) * d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
Fig: Three-address code
Compiler Design Lecture-35 5
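As a small illustration (not from the slides; the nested-tuple DAG representation and the helpers new_temp and gen are assumptions of this sketch), the listing above can be reproduced by walking the DAG and emitting one instruction per interior node, reusing the temporary of any node that has already been visited.

    temp_count = 0
    def new_temp():                             # generate t1, t2, ...
        global temp_count
        temp_count += 1
        return 't' + str(temp_count)

    visited = {}                                # id(node) -> temporary already holding its value

    def gen(node, code):
        if isinstance(node, str):               # a name is already an address
            return node
        if id(node) in visited:                 # shared DAG node: reuse its temporary
            return visited[id(node)]
        op, left, right = node
        l = gen(left, code)
        r = gen(right, code)
        t = new_temp()
        code.append(t + ' = ' + l + ' ' + op + ' ' + r)
        visited[id(node)] = t
        return t

    bc = ('-', 'b', 'c')                        # the common subexpression, built once
    expr = ('+', ('+', 'a', ('*', 'a', bc)), ('*', bc, 'd'))
    code = []
    gen(expr, code)
    print('\n'.join(code))                      # t1 = b - c ... t5 = t3 + t4, as above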
Addresses and Instructions
• Three-address code is built from two concepts: addresses
and instructions.
• In object-oriented terms, these concepts correspond to
classes, and the various kinds of addresses and
instructions correspond to appropriate subclasses.
• Alternatively, three-address code can be implemented
using records with fields for the addresses; records called
quadruples and triples.
Compiler Design Lecture-35 6
Addresses and Instructions
An address can be one of the following:
• A name. For convenience, we allow source-program names to
appear as addresses in three-address code. In an
implementation, a source name is replaced by a pointer to its
symbol-table entry, where all information about the name is
kept.
• A constant. In practice, a compiler must deal with many
different types of constants and variables. Type conversions
within expressions are considered.
• A compiler-generated temporary. It is useful,
especially in optimizing compilers, to create a distinct name
each time a temporary is needed. These temporaries can be
combined, if possible, when registers are allocated to variables.
Compiler Design Lecture-35 7
Addresses and Instructions
List of the common three-address instruction
forms:
1. Assignment instructions of the form x = y op z,
where op is a binary arithmetic or logical operation, and
x, y, and z are addresses.
2. Assignments of the form x = op y, where op is a
unary operation. Essential unary operations include unary
minus, logical negation, and conversion operators that,
for example, convert an integer to a floating-point
number.
3. Copy instructions of the form x = y, where x
is assigned the value of y.
Compiler Design Lecture-35 8
Addresses and Instructions
4. An unconditional jump goto L. The three-address
instruction with label L is the next to be executed.
5. Conditional jumps of the form if x goto L and
ifFalse x goto L. These instructions execute the
instruction with label L next if x is true and false,
respectively. Otherwise, the following three-address instruction in sequence is executed next, as usual.
6. Conditional jumps such as if x relop y goto L,
which apply a relational operator (<, ==, >=, etc.) to x
and y, and execute the instruction with label L next if x
stands in relation relop to y. If not, the three-address
instruction following if x relop y goto L is
executed next, in sequence.
Compiler Design Lecture-35 9
Addresses and Instructions
7. Procedure calls and returns are implemented using the following
instructions: param x for parameters; call p, n and y =
call p, n for procedure and function calls, respectively; and
return y, where y, representing a returned value, is optional.
Their typical use is as the sequence of three-address instructions
param x1
param x2
.....
param xn
call p, n
generated as part of a call of the procedure p(x1, x2, ..., xn). The
integer n, indicating the number of actual parameters in “call p, n,”
is not redundant because calls can be nested. That is, some of the first
param statements could be parameters of a call that comes after p
returns its value; that value becomes another parameter of the later
call.
Compiler Design Lecture-35 10
Addresses and Instructions
8. Indexed copy instructions of the form x = y[i] and
x[i]= y. The instruction x = y[i] sets x to the value
in the location i memory units beyond location y. The
instruction x[i]= y sets the contents of the location i
units beyond x to the value of y.
9. Address and pointer assignments of the form x = &y, x
= *y, and *x = y. The instruction x = &y sets the r-
value of x to be the location (l-value) of y. l-value
and r-value are appropriate on the left and right sides of
assignments, respectively. In the instruction x = *y, y is
a pointer or a temporary whose r-value is a location.
The r-value of x is made equal to the contents of that
location. Finally, *x = y sets the r-value of the object
pointed to by x to the r-value of y.
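To see several of these instruction forms working together, consider the source fragment do i = i + 1; while (a[i] < v);. One plausible translation, given here only as an illustration and assuming 8-byte array elements and a symbolic label L, combines an assignment, a copy, an indexed copy, and a conditional jump:
    L:  t1 = i + 1
        i = t1
        t2 = i * 8
        t3 = a[t2]
        if t3 < v goto L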
  • 4. Compiler Design Lecture-29 4 Semantic Analysis • Let us concentrate on the third phase of the compiler called semantic analysis. • The main goal of the semantic analysis is to check the correctness of program and enable proper execution. • We know that the job of the parser is only to verify that the input program consists of tokens arranged on syntactically valid combination. • In semantic analysis we check whether they form a sensible set of instructions in the programming language.
  • 5. Compiler Design Lecture-29 5 Semantic Analysis Definition: • Semantic analysis is the third phase of the compiler which acts as an interface between syntax analysis phase and code generation phase. • It accepts the parse tree from the syntax analysis phase and adds the semantic information to the parse tree and performs certain checks based on this information. • It also helps constructing the symbol table with appropriate information.
  • 6. Compiler Design Lecture-29 6 Semantic Analysis Some of the actions performed semantic analysis phase are: • Type checking i.e., number and type of arguments in function call and in function header of function definition must be same. Otherwise, it results in semantic error. • Object binding i.e., associating variables with respective function definitions. • Automatic type conversion of integers in mixed mode of operations. • Helps in intermediate code generation. • Display appropriate error messages.
  • 7. Compiler Design Lecture-29 7 Semantic Analysis The semantics of a language can be described very easily using two notations namely: • Syntax Directed Definition (SDD) • Syntax Directed Translation (SDT) Note: Consider the production E → E + T. To distinguish E on LHS of the production and E on RHS of the production, we use E1 on RHS of the production as shown below: E → E1 + T
  • 8. Compiler Design Lecture-29 8 Semantic Analysis Let us consider the production, its derivation and corresponding parse tree as shown below: Production: E → E1 + T Derivation: E => E1 + T Parse Tree: E E1 T +
  • 9. Compiler Design Lecture-29 9 Semantic Analysis • The non-terminal E on LHS of the production is called “head of the production”. • The string of grammar symbols “E1 + T” on RHS of the production is called “body of the production”. • So, in the derivation, head of the production will be the parent node and the symbols that represent body of the production will be the children nodes.
  • 10. Compiler Design Lecture-29 10 Syntax Directed Definition (SDD) Definition: A Syntax Directed Definition (SDD)is a context free grammar with attributes and semantic rules. The attributes are associated with grammar symbols whereas the semantic rules are associated with productions. The semantic rules are used to compute the attribute values.
  • 11. Compiler Design Lecture-29 11 Syntax Directed Definition (SDD) Example: A simple Syntax Directed Definition (SDD)for the production E → E1 + T can be written as shown below: Observe that a semantic rule is associated with production where the attribute name val is associated with each non-terminal used in the rule. Production Semantic Rule E → E1 + T E.val = E1.val + T.val Where val is attribute
  • 12. Compiler Design Lecture-29 12 Syntax Directed Definition (SDD) Attribute: Definition An attribute is a property of a programming language construct. Attributes are always associated with grammar symbols. If X is a grammar symbol and ‘a’ is the attribute, then X.a denotes the value of attribute ‘a’ at a particular node X in the parse tree. If we implement the nodes of the parse tree by records or using structures, then the attribute of X can be implemented as a field in the record or a structure.
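To make this idea concrete, a minimal C++ sketch is shown below (the names ParseNode and symbol are purely illustrative and are not part of these notes); the attribute X.val is simply an ordinary field of the node record:

#include <string>
#include <vector>

// A minimal sketch: each parse-tree node keeps its grammar symbol, its
// children, and an attribute value stored as an ordinary field of the record.
struct ParseNode {
    std::string symbol;               // the grammar symbol X
    std::vector<ParseNode*> children; // children of X in the parse tree
    int val = 0;                      // the attribute X.val, held as a field
};

int main() {
    ParseNode digitNode{"digit", {}, 3};   // a leaf with its lexical value
    ParseNode f{"F", {&digitNode}, 0};
    f.val = digitNode.val;                 // F.val = digit.lexval
}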
  • 13. Compiler Design Lecture-29 13 Syntax Directed Definition (SDD) Attribute: Examples • Ex 1: If val is the attribute associated with a non- terminal E, then E.val gives the value of attribute val at a node E in the parse tree. • Ex 2: If lexval is the attribute associated with a terminal digit, then digit.lexval gives the value of attribute lexval at a node digit in the parse tree. • Ex 3: If syn is the attribute associated with a non- terminal F, then F.syn gives the value of attribute syn at a node F in the parse tree.
  • 14. Compiler Design Lecture-29 14 Syntax Directed Definition (SDD) Attribute: Typical examples of attributes are: • The data types associated with variables such as int, float, char etc. • The value of an expression • The location of a variable in memory • The object code of a function or procedure • The number of significant digits in a number and so on.
  • 15. Compiler Design Lecture-29 15 Syntax Directed Definition (SDD) Semantic rule: Definition The rule that describes how to compute the attribute values of the attributes associated with a grammar symbol using attribute values of other grammar symbols is called a semantic rule. For example, consider the production E → E1 + T. The attribute value of E, which is on LHS of the production and is denoted by E.val, can be calculated by adding the attribute values of E1 and T, which are on RHS of the production and are denoted by E1.val and T.val, as shown below: E.val = E1.val + T.val // Semantic rule
  • 16. Compiler Design Lecture-29 16 Inherited and Synthesized Attributes The attribute value for a node in the parse tree may depend on information from its children nodes or its sibling nodes or parent nodes. Based on how the attribute values are obtained we can classify the attributes. There are two types of attributes namely: • Synthesized attribute (S-attribute) • Inherited attribute (I-attribute)
  • 17. Compiler Design Lecture-29 17 Inherited and Synthesized Attributes Synthesized Attribute (S-attribute): Definition: The attribute value for a non-terminal A derived from the attribute values of its children or itself is called synthesized attribute. Thus, the attribute values of synthesized attributes are passed up from children to the parent node in bottom-up manner.
  • 18. Compiler Design Lecture-29 18 Inherited and Synthesized Attributes Synthesized Attribute (S-attribute): Example: Consider the production: E → E1 + T. Suppose the attribute value val of E on LHS (head) of the production is obtained by adding the attribute values E1.val and T.val appearing on the RHS (body) of the production as shown below: Production Semantic Rule E → E1 + T E.val = E1.val + T.val
  • 19. Compiler Design Lecture-29 19 Inherited and Synthesized Attributes Synthesized Attribute (S-attribute): Example: Parse tree with attribute values: Production Semantic Rule E → E1 + T E.val = E1.val + T.val E.val = 30 E1.val = 10 T.val = 20 +
  • 20. Compiler Design Lecture-29 20 Inherited and Synthesized Attributes Synthesized Attribute (S-attribute): Example: Parse tree with attribute values: Now, attribute val with respect to E appearing on head of the production is called synthesized attribute. This is because, the value of E.val which is 30, is obtained from the children by adding the attribute values 10 and 20 as shown in above parse tree. E.val = 30 E1.val = 10 T.val = 20 +
  • 21. Compiler Design Lecture-29 21 Inherited and Synthesized Attributes Inherited Attribute (I-attribute): Definition: The attribute value of a non-terminal A derived from the attribute values of its siblings or from its parent or itself is called inherited attribute. Thus, the attribute values of inherited attributes are passed from siblings or from parent to children in top- down manner.
  • 22. Compiler Design Lecture-29 22 Inherited and Synthesized Attributes Inherited Attribute (I-attribute): Example: Consider the production: D → T V which is used for a single declaration such as: int sum In the production, D stands for declaration, T stands for type such as int and V stands for the variable sum as in above declaration.
  • 23. Compiler Design Lecture-29 23 Inherited and Synthesized Attributes Inherited Attribute (I-attribute): Example: The production, semantic rule and parse tree along with attribute values are shown below: Production Semantic Rule D → T V V.inh = T.type
  • 24. Compiler Design Lecture-29 24 Inherited and Synthesized Attributes Inherited Attribute (I-attribute): Example: Parse tree with attribute values: D T.type = int V.inh = int id.entry
  • 25. Compiler Design Lecture-29 25 Inherited and Synthesized Attributes Inherited Attribute (I-attribute): Example: Observe the following points from the above parse tree: • The type int obtained from the lexical analyzer is already stored in T.type whose value is transferred to its sibling V. This can be done using: V.inh = T.type Since attribute value for V is obtained from its sibling, it is inherited attribute and its attribute is denoted by inh.
  • 26. Compiler Design Lecture-29 26 Inherited and Synthesized Attributes Inherited Attribute (I-attribute): Example: Observe the following points from the above parse tree: • On similar line, the value int stored on V.inh is transferred to its child id.entry and hence entry is inherited attribute of id and attribute value is denoted by id.entry. Note: With the help of the annotated parse tree, it is very easy for us to construct SDD for a given grammar.
  • 27. Compiler Design Lecture-29 27 Inherited and Synthesized Attributes Annotated parse tree: Definition: A parse tree showing the attribute values of each node is called annotated parse tree. The terminals in the annotated parse tree can have only synthesized attribute values and they are obtained directly from the lexical analyzer. So, there are no semantic rules in SDD (Syntax Directed Definition) to get the lexical values into terminals of the annotated parse tree. The other nodes in the annotated parse tree may be either synthesized or inherited attributes. Note: Terminals can never have inherited attributes.
  • 28. Compiler Design Lecture-29 28 Inherited and Synthesized Attributes Annotated parse tree: Example: Consider the partial annotated parse tree shown below: In the above partial annotated parse tree, the attribute values 10, 20 and 30 are stored in E1.val, T.val and E.val respectively. E.val = 30 E1.val = 10 T.val = 20 +
  • 29. Compiler Design Lecture-29 29 Inherited and Synthesized Attributes Example-1: Write the SDD for a simple desk calculator and show annotated parse tree for the expression (3+4)*(5+6)n S → En E → E + T | E - T | T T → T * F | T / F | F F → ( E )| digit
  • 30. Compiler Design Lecture-29 30 Inherited and Synthesized Attributes Example: Solution The given grammar is shown below: S → En E → E + T | E - T | T T → T * F | T / F | F F → ( E ) | digit The above grammar generates arithmetic expressions consisting of parenthesized or un-parenthesized expressions with the operators +, -, * and /. For the sake of convenience, let us consider part of the grammar written as shown below:
  • 31. Compiler Design Lecture-29 31 Inherited and Synthesized Attributes Example: Solution S → Tn T → T * F | T / F | F F → digit Using the above productions we can generate an un-parenthesized expression consisting of only the * and / operators, such as 3*4 or 3*4*5. The annotated parse tree for evaluating the expression 3*5 is shown below:
  • 32. Compiler Design Lecture-29 32 Inherited and Synthesized Attributes Example: Solution S.val = 15 T.val = 15 n (EOF) T1.val = 3 F.val = 5 * F.val = 3 digit.lexval = 3 digit.lexval = 5
  • 33. Compiler Design Lecture-29 33 Inherited and Synthesized Attributes Example: Solution It is very easy to see how the values 3 and 5 are moved from bottom to top till we reach the root node to get the value 15. The rules to get the value 15 from the productions used are shown below: Productions Semantic Rules F → digit F.val = digit.lexval T → F T.val = F.val T → T1 * F T.val = T1.val* F.val S → Tn S.val = T.val
  • 34. Compiler Design Lecture-29 34 Inherited and Synthesized Attributes Example: Solution On similar lines we can write the semantic rules for the following productions as shown below: Productions Semantic Rules S → En S.val = E.val E → E1 + T E.val = E1.val + T.val E → T E.val = T.val F → ( E ) F.val = E.val
  • 35. Compiler Design Lecture-29 35 Inherited and Synthesized Attributes Example: Solution Now, the final SDD along with productions and semantic rules is shown below: Productions Semantic Rules S → En S.val = E.val E → E1 + T E.val = E1.val + T.val E → E1 - T E.val = E1.val - T.val E → T E.val = T.val T → T1 * F T.val = T1.val * F.val T → T1 / F T.val = T1.val / F.val T → F T.val = F.val F → ( E ) F.val = E.val F → digit F.val = digit.lexval
  • 36. Compiler Design Lecture-29 36 Inherited and Synthesized Attributes Example: The annotated parse tree for the expression (3+4)*(5+6)n consisting of attribute values for each non-terminal is shown below: S.val = 77 T.val = 77 n (EOF) T1.val = 7 F.val = 11 * F.val = 7 E.val = 7 E.val = 77 ( ) E1.val = 3 T.val = 4 + T.val = 3 F.val = 3 digit.lexval = 3 F.val = 4 digit.lexval = 4 E.val = 11 ( ) E1.val = 5 T.val = 6 + T.val = 5 F.val = 5 digit.lexval = 5 F.val = 6 digit.lexval = 6
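To illustrate how these synthesized val attributes can be computed bottom-up, a minimal C++ sketch is shown below (the Expr structure and eval function are hypothetical names, and the parse tree is simplified to an expression tree); it computes each node's val only after the val of its children is known and prints 77, matching the annotated parse tree above:

#include <iostream>

// Minimal sketch: a node carries the synthesized attribute 'val',
// computed from the children's 'val' exactly as the semantic rules state.
struct Expr {
    char op;            // '+', '*', or 'd' for a digit leaf
    int  lexval;        // digit.lexval for leaves
    Expr *left, *right; // children (null for leaves)
};

int eval(Expr* e) {                         // bottom-up (postorder) evaluation
    if (e->op == 'd') return e->lexval;     // F.val = digit.lexval
    int l = eval(e->left), r = eval(e->right);
    return (e->op == '+') ? l + r : l * r;  // E.val = E1.val + T.val, etc.
}

int main() {
    Expr d3{'d',3,nullptr,nullptr}, d4{'d',4,nullptr,nullptr};
    Expr d5{'d',5,nullptr,nullptr}, d6{'d',6,nullptr,nullptr};
    Expr s1{'+',0,&d3,&d4}, s2{'+',0,&d5,&d6};   // (3+4) and (5+6)
    Expr prod{'*',0,&s1,&s2};                    // (3+4)*(5+6)
    std::cout << eval(&prod) << "\n";            // prints 77
}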
  • 37. Compiler Design Lecture-29 37 Evaluating an SDD at the Nodes of a Parse Tree We can easily obtain an SDD using the following steps: Step 1: Construct the parse tree. Step 2: Use the rules to evaluate attributes of all the nodes of the parse tree. Step 3: Obtain the attribute values for each non-terminal and write the semantic rules for each production. When complete annotated parse tree is ready, we will have the complete SDD.
  • 38. Compiler Design Lecture-29 38 Evaluating an SDD at the Nodes of a Parse Tree How do we construct an annotated parse tree? In what order do we evaluate attributes? • If we want to evaluate an attribute at a node of a parse tree, it is necessary to first evaluate all the attributes upon which its value depends. • If all attributes are synthesized, then we must evaluate the attributes of all of its children before we can evaluate the attribute of the node itself. • With synthesized attributes, we can evaluate attributes in any bottom-up order. • Whether the attributes are synthesized or inherited, there is no single order in which the attributes have to be evaluated; there can be one or more orders in which the evaluation can be done.
  • 39. Compiler Design Lecture-29 39 Evaluating an SDD at the Nodes of a Parse Tree Circular dependency • If the attribute value of a parent node depends on the attribute value of child node and vice-versa, then we say, there exists a circular dependency between the child node and parent node. In this situation, it is not possible to evaluate the attribute of either parent node or the child node since one value depends on another value.
  • 40. Compiler Design Lecture-29 40 Evaluating an SDD at the Nodes of a Parse Tree Circular dependency • For example, consider the non-terminal A with synthesized attribute A.s and non-terminal B with inherited attribute B.i with following productions and semantic rules: Partial annotated parse tree: Productions Semantic Rules A → B A.s = B.i B.i = A.s + 6 A.s B.i
  • 41. Compiler Design Lecture-29 41 Evaluating an SDD at the Nodes of a Parse Tree Circular dependency • In the above semantic rule s is synthesized attribute and i is inherited attribute. • The above two semantic rules are circular in nature. • To compute A.s we require the values of B.i and to compute the value of B.i, we require the value of A.s. So, it is impossible to evaluate either the value of A.s or the value of B.i without evaluating other. Productions Semantic Rules A → B A.s = B.i B.i = A.s + 6
  • 42. Compiler Design Lecture-29 42 Evaluating an SDD at the Nodes of a Parse Tree Why evaluate inherited attributes Consider the following grammar T → T * F | F F → digit Using the above productions we can generate an un-parenthesized expression consisting of only the * operator, such as 3*4 or 3*4*5. The above grammar has left-recursion and it is suitable for a bottom-up parser such as an LR parser. The annotated parse tree for evaluating the expression 3*5 is shown below:
  • 43. Compiler Design Lecture-29 43 Evaluating an SDD at the Nodes of a Parse Tree Why evaluate inherited attributes It is very easy to see how the values 3 and 5 are moved from bottom to top till we reach the root node to get the value 15 as shown in the above tree. T.val = 15 T1.val = 3 F.val = 5 * F.val = 3 digit.lexval = 3 digit.lexval = 5
  • 44. Compiler Design Lecture-29 44 Evaluating an SDD at the Nodes of a Parse Tree Why evaluate inherited attributes Thus, in a bottom-up manner, the values 3 and 5 are moved upwards to get the result 15 using the semantic rules associated with each production. Semantic Rules Productions F.val = digit.lexval F → digit T.val = F.val T → F T.val = T1.val * F.val T → T1 * F
  • 45. Compiler Design Lecture-29 45 Evaluating an SDD at the Nodes of a Parse Tree Example: Obtain SDD for the following grammar using top-down approach: S → En E → E + T | T T → T * F | F F → ( E )| digit and obtain annotated parse tree for the expression (3 + 4) * (5 + 6)n
  • 46. Compiler Design Lecture-29 46 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution The given grammar has left recursion and hence it is not suitable for top-down parser. To make it suitable for top- down parsing, we have to eliminate left recursion. After eliminating left recursion, the following grammar is obtained: S → En E → TE’ E’ → +TE’| ϵ T → FT’ T’ → *FT’| ϵ F → ( E )| digit
  • 47. Compiler Design Lecture-29 47 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution Note: The variables S, E, T and F are present both in given grammar and grammar obtained after eliminating left recursion. So, only for the variables S, E, T and F we use the attribute name v (stands for val) and for all other variables we use s for synthesized attribute and i for inherited attribute.
  • 48. Compiler Design Lecture-29 48 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution Consider the following productions: S → En F → ( E )| digit They do not have left recursion and they are retained in the grammar which is obtained after eliminating left recursion. So, we can compute the attribute value of LHS (head) from the attribute value of RHS (i.e., children) for the above productions and hence they have synthesized attributes.
  • 49. Compiler Design Lecture-29 49 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution The productions, semantic rules and type of the attribute are shown below: Production Semantic Rule Type S → E n S.v = E.v Synthesized F → ( E ) F.v = E.v Synthesized F → d F.v = digit.lexval Synthesized
  • 50. Compiler Design Lecture-29 50 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution Consider the following productions and draw the annotated parse tree for the expression 2*3 with flow of information as shown below: Productions T → F T’ T’ → * F T’ | ϵ F → digit
  • 51. Compiler Design Lecture-29 51 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution Annotated parse tree for the expression 2*3: T.val = 6 T’.inh = 2 F.val = 2 digit.lexval = 2 digit.lexval = 3 * F.val = 3 T1’.inh = 6 ϵ T’.syn = 6 T’.syn = 6
  • 52. Compiler Design Lecture-29 52 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution By following the dotted arrow lines, we can write the various semantic rules for the corresponding productions as shown below: Semantic Rule Production F.val = 2 is copied to T’.inh T’.inh = F.val T → F T’ T’.inh * F.val is copied to T1’.inh T1’.inh = T’.inh * F.val T’ → * F T’ T1’.inh is copied to T1’.syn T’.syn = T’.inh T’ → ϵ T1’.syn is moved to its parent T’ T’.syn = T1’.syn T’ → * F T’ T’.syn is moved to its parent T T.val = T’.syn T → F T’
  • 53. Compiler Design Lecture-29 53 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution The above productions and their respective rules along with the type of attribute are shown below: Production Semantic Rules T → F T’ T’.inh = F.val T.val = T’.syn T’ → * F T’ T1’.inh = T’.inh * F.val T’.syn = T1’.syn T’ → ϵ T’.syn = T’.inh
  • 54. Compiler Design Lecture-29 54 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution Similar to the above, we can write the semantic rules for the other productions as shown below: Production Semantic Rules E → T E’ E’.inh = T.val E.val = E’.syn E’ → + T E’ E1’.inh = E’.inh + T.val E’.syn = E1’.syn E’ → ϵ E’.syn = E’.inh
  • 55. Compiler Design Lecture-29 55 Evaluating an SDD at the Nodes of a Parse Tree Example: Solution Combining all productions and semantic rules, we can write the final SDD as shown below: Production Semantic Rule Type S → E n S.v = E.v Synthesized E → T E’ E’.inh = T.val Inherited E.val = E’.syn Synthesized E’ → + T E’ E1’.inh = E’.inh + T.val Inherited E’.syn = E1’.syn Synthesized E’ → ϵ E’.syn = E’.inh Synthesized T → F T’ T’.inh = F.val Inherited T.val = T’.syn Synthesized T’ → * F T’ T1’.inh = T’.inh * F.val Inherited T’.syn = T1’.syn Synthesized T’ → ϵ T’.syn = T’.inh Synthesized F → ( E ) F.v = E.v Synthesized F → d F.v = digit.lexval Synthesized
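A common way to realize such an L-attributed SDD in a top-down parser is to pass inherited attributes as parameters of the parsing routines and return synthesized attributes as results. The following minimal C++ sketch is one such rendering (hypothetical names; it covers only the T, T’ and F productions, single-digit operands and the * operator):

#include <iostream>
#include <string>

// Minimal sketch of the T / T' / F part of the SDD above:
// inherited attributes are parameters, synthesized attributes are return values.
static std::string input;   // e.g. "2*3*4"
static std::size_t pos = 0;

int F() {                                   // F -> digit, F.val = digit.lexval
    return input[pos++] - '0';
}

int Tprime(int inh) {                       // T' receives T'.inh as a parameter
    if (pos < input.size() && input[pos] == '*') {   // T' -> * F T1'
        ++pos;
        int t1inh = inh * F();              // T1'.inh = T'.inh * F.val
        return Tprime(t1inh);               // T'.syn = T1'.syn
    }
    return inh;                             // T' -> epsilon: T'.syn = T'.inh
}

int T() {                                   // T -> F T'
    int inh = F();                          // T'.inh = F.val
    return Tprime(inh);                     // T.val = T'.syn
}

int main() {
    input = "2*3*4";
    std::cout << T() << "\n";               // prints 24
}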
  • 56. Compiler Design Lecture-29 56 Evaluating an SDD at the Nodes of a Parse Tree The annotated parse tree that shows the value of each attribute while evaluating the expression (3+4)*(5+6) is shown below:
  • 57. Compiler Design Lecture-29 57 Summary... Syntax-Directed Definitions (SDD) • Inherited and Synthesized Attributes • Evaluating an SDD at the Nodes of a Parse Tree Reading: Aho2, Section 5.1.1 & 5.1.2 Next Lecture: Evaluation Orders for SDD's
  • 58. Compiler Design Lecture-30 1 Compiler Design UNIT-III: Syntax-Directed Translation Lecture-30. Evaluation Orders for SDD's Evaluation Orders for SDD's • Dependency Graphs • Ordering the Evaluation of Attributes • S-Attributed Definitions • L-Attributed Definitions • Semantic Rules with Controlled Side Effects
  • 59. Compiler Design Lecture-30 2 Evaluation Orders for SDD's • The evaluation order to find attribute values in a parse tree using semantic rules can be easily obtained with the help of dependency graph. • While annotated parse tree shows the values of attributes, a dependency graph helps us to determine how those values can be computed.
  • 60. Compiler Design Lecture-30 3 Evaluation Orders for SDD‘s Dependency Graphs Definition: • A graph that shows the flow of information which helps in computation of various attribute values in a particular parse tree is called dependency graph. • An edge from one attribute instance to another attribute instance indicates that the attribute value of the first is needed to compute value of the second.
  • 61. Compiler Design Lecture-30 4 Evaluation Orders for SDD‘s Dependency Graphs For example: Consider the following production and rule: In bottom-up parser, the attribute value of LHS (head) depends on the attribute value of RHS (body of the production or children in the parse tree). So, the attribute value of E is obtained from its children E1 and T. Production Semantic Rule E → E1 + T E.val = E1.val + T.val
  • 62. Compiler Design Lecture-30 5 Evaluation Orders for SDD‘s Dependency Graphs The portion of a dependency graph for this production can be written as shown below: In the above figure, the dotted lines along with nodes connected to them represent the parse tree. The shaded nodes represented as val with solid arrows originating from one node and ends in another node is the dependency graph. E E1 T val val val +
  • 63. Compiler Design Lecture-30 6 Evaluation Orders for SDD‘s Dependency Graphs Example: An example of a complete dependency graph is shown in the figure below. Consider the grammar: T → T * F | F F → digit Obtain the dependency graph for the annotated parse tree.
  • 64. Compiler Design Lecture-30 7 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution The grammar after eliminating left recursion is shown below: T → FT’ T’ → *FT1’ T’ → ϵ F → digit
  • 65. Compiler Design Lecture-30 8 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution The SDD for the given grammar can be written as shown below: Production Semantic Rule Type T → F T’ T’.inh = F.val Inherited T.val = T’.syn Synthesized T’ → * F T1’ T1’.inh = T’.inh * F.val Inherited T’.syn = T1’.syn Synthesized T’ → ϵ T’.syn = T’.inh Synthesized F → digit F.val = digit.lexval Synthesized
  • 66. Compiler Design Lecture-30 9 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution The annotated parse tree for 3*5 is shown below: T.val = 15 F.val= 3 T’.inh = 3 T’.syn = 15 digit.lexval = 3 F.val = 5 * T1’.inh = 15 T1’.syn= 15 digit.lexval = 5 ϵ
  • 67. Compiler Design Lecture-30 10 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution The dependency graph for the annotated parse tree is shown below. The nodes of the dependency graph, represented by the numbers 1 through 9 T 9 val F 3 val inh 5 T’ 8 syn digit 1 lexval F 4 val * inh 6 T’ 7 syn digit 2 lexval ϵ
  • 68. Compiler Design Lecture-30 11 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution Observe the following points from the above dependency graph: • Nodes 1 and 2 represent the attribute lexval associated with the two leaves labeled digit. • Nodes 3 and 4 represent the attribute val associated with the two nodes labeled F. • The edges to node 3 from 1 and to node 4 from 2 result from the semantic rule that defines F.val in terms of digit.lexval.
  • 69. Compiler Design Lecture-30 12 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution • In fact, F.val equals digit.lexval, but the edge represents dependence, not equality. • Nodes 5 and 6 represent the inherited attribute T’.inh associated with each of the occurrences of nonterminal T’. • The edge to 5 from 3 is due to the rule T’.inh = F.val, which defines T’.inh at the right child of the root from F.val at the left child. • We see edges to 6 from node 5 for T’.inh and from node 4 for F.val, because these values are multiplied to evaluate the attribute inh at node 6.
  • 70. Compiler Design Lecture-30 13 Evaluation Orders for SDD‘s Dependency Graphs Example: Solution • Nodes 7 and 8 represent the synthesized attribute syn associated with the occurrences of T’. • The edge to node 7 from 6 is due to the semantic rule T’.syn = T’.inh associated with production T’ → ϵ. • The edge to node 8 from 7 is due to a semantic rule associated with production T’ → * F T1’. • Finally, node 9 represents the attribute T.val. • The edge to 9 from 8 is due to the semantic rule, T.val = T’.syn, associated with production T → F T’.
  • 71. Compiler Design Lecture-30 14 Evaluation Orders for SDD‘s Ordering the evaluation of attributes • The dependency graph characterizes the possible orders in which we can evaluate the attributes at the various nodes of a parse tree. • If the dependency graph has an edge from node M to node N, then the attribute corresponding to M must be evaluated before the attribute of N. • Thus, the only allowable orders of evaluation are those sequences of nodes N1,N2,…Nk such that if there is an edge of the dependency graph from Ni to Nj , then i < j. • Such an ordering embeds a directed graph into a linear order, and is called a topological sort of the graph.
  • 72. Compiler Design Lecture-30 15 Evaluation Orders for SDD‘s Ordering the evaluation of attributes • If there is any cycle in the graph, then there are no topological sorts; that is, there is no way to evaluate the SDD on this parse tree. • If there are no cycles, however, then there is always at least one topological sort. • To see why, since there are no cycles, we can surely find a node with no edge entering. • For if there were no such node, we could proceed from predecessor to predecessor until we came back to some node we had already seen, yielding a cycle. • Make this node the first in the topological order, remove it from the dependency graph, and repeat the process on the remaining nodes.
  • 73. Compiler Design Lecture-30 16 Evaluation Orders for SDD‘s Ordering the evaluation of attributes Example: Consider the dependency graph shown below: T 9 val F 3 val inh 5 T’ 8 syn digit 1 lexval F 4 val * inh 6 T’ 7 syn digit 2 lexval ϵ
  • 74. Compiler Design Lecture-30 17 Evaluation Orders for SDD‘s Ordering the evaluation of attributes Example: • The dependency graph shown above has no cycles. • One topological sort is the order in which the nodes have already been numbered: 1, 2, 3, 4, 5, 6, 7, 8, 9. • Notice that every edge of the graph goes from a node to a higher-numbered node, so this order is surely a topological sort. • There are other topological sorts as well, such as 1, 3, 5, 2, 4, 6, 7, 8, 9.
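The procedure just described is a topological sort (Kahn's algorithm). A minimal C++ sketch is shown below, assuming the eight dependency edges of the example graph above; it prints one valid evaluation order such as 1 2 3 4 5 6 7 8 9:

#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// Minimal sketch: Kahn's algorithm on the 9-node dependency graph above.
// An edge (u, v) means the attribute at node u must be evaluated before v.
int main() {
    const int n = 9;
    std::vector<std::pair<int,int>> edges = {
        {1,3},{2,4},{3,5},{5,6},{4,6},{6,7},{7,8},{8,9}};
    std::vector<std::vector<int>> adj(n + 1);
    std::vector<int> indeg(n + 1, 0);
    for (auto [u, v] : edges) { adj[u].push_back(v); ++indeg[v]; }

    std::queue<int> ready;                       // nodes with no entering edge
    for (int v = 1; v <= n; ++v)
        if (indeg[v] == 0) ready.push(v);

    while (!ready.empty()) {                     // emit one evaluation order
        int u = ready.front(); ready.pop();
        std::cout << u << ' ';
        for (int v : adj[u])
            if (--indeg[v] == 0) ready.push(v);
    }
    std::cout << '\n';                           // e.g. 1 2 3 4 5 6 7 8 9
}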
  • 75. Compiler Design Lecture-30 18 Evaluation Orders for SDD‘s S-Attributed Definitions • As mentioned earlier, given an SDD, it is very hard to tell whether there exist any parse trees whose dependency graphs have cycles. • In practice, translations can be implemented using classes of SDD's that guarantee an evaluation order, since they do not permit dependency graphs with cycles.
  • 76. Compiler Design Lecture-30 19 Evaluation Orders for SDD‘s S-Attributed Definitions The two classes of SDD’s that guarantee an evaluation order are: • S-attributed definitions • L-attributed definitions
  • 77. Compiler Design Lecture-30 20 Evaluation Orders for SDD‘s S-Attributed Definitions An SDD is S-attributed if every attribute is synthesized. In an S-attributed SDD, each semantic rule computes the attribute value of the non-terminal at the head of the production (its LHS) from the attribute values of the grammar symbols in the body of the production (its RHS).
  • 78. Compiler Design Lecture-30 21 Evaluation Orders for SDD‘s S-Attributed Definitions Example: Consider the following SDD: Productions Semantic Rules S → En S.val = E.val E → E1 + T E.val = E1.val + T.val E → T E.val = T.val T → T1 * F T.val = T1.val * F.val T → F T.val = F.val F → ( E ) F.val = E.val F → digit F.val = digit.lexval
  • 79. Compiler Design Lecture-30 22 Evaluation Orders for SDD‘s S-Attributed Definitions Example: The SDD shown above is an example of an S- attributed definition. Each attribute, S.val, E.val, T.val, and F.val are synthesized attributes and hence the SDD is an S- attributed.
  • 80. Compiler Design Lecture-30 23 Evaluation Orders for SDD‘s S-Attributed Definitions • When an SDD is S-attributed, we can evaluate its attributes in any bottom-up order of the nodes of the parse tree. • It is often especially simple to evaluate the attributes by performing a postorder traversal of the parse tree and evaluating the attributes at a node N when the traversal leaves N for the last time.
  • 81. Compiler Design Lecture-30 24 Evaluation Orders for SDD‘s S-Attributed Definitions • That is, we apply the function postorder, defined below, to the root of the parse tree: postorder(N) { for ( each child C of N, from the left ) postorder(C); evaluate the attributes associated with node N; }
  • 82. Compiler Design Lecture-30 25 Evaluation Orders for SDD‘s S-Attributed Definitions • S-attributed definitions can be implemented during bottom-up parsing, since a bottom-up parser corresponds to a postorder traversal. • Specifically, postorder corresponds exactly to the order in which an LR parser reduces a production body to its head.
  • 83. Compiler Design Lecture-30 26 Evaluation Orders for SDD‘s L-Attributed Definitions • The second class of SDD's is called L-attributed definitions. • The idea behind this class is that, between the attributes associated with a production body, dependency-graph edges can go from left to right, but not from right to left (hence “L-attributed”).
  • 84. Compiler Design Lecture-30 27 Evaluation Orders for SDD‘s L-Attributed Definitions Definition: An SDD is L-attributed if each of its attributes is either: 1. Synthesized, or 2. Inherited, but with the rules limited as follows. Suppose that there is a production A → X1 X2 … Xn, and that there is an inherited attribute Xi.a computed by a rule associated with this production. Then the rule may use only: a) Inherited attributes associated with the head A. b) Either inherited or synthesized attributes associated with the occurrences of symbols X1, X2, ….,Xi-1 located to the left of Xi. c) Inherited or synthesized attributes associated with this occurrence of Xi itself, but only in such a way that there are no cycles in a dependency graph formed by the attributes of this Xi.
  • 85. Compiler Design Lecture-30 28 Evaluation Orders for SDD‘s L-Attributed Definitions Example-1: Consider the SDD shown below: Production Semantic Rule Type T → F T’ T’.inh = F.val Inherited T.val = T’.syn Synthesized T’ → * F T1’ T1’.inh = T’.inh * F.val Inherited T’.syn = T1’.syn Synthesized T’ → ϵ T’.syn = T’.inh Synthesized F → digit F.val = digit.lexval Synthesized
  • 86. Compiler Design Lecture-30 29 Evaluation Orders for SDD‘s L-Attributed Definitions Example-1: The SDD shown above is L-attributed. To see why, consider the semantic rules for inherited attributes, which are repeated here for convenience: The first of these rules defines the inherited attribute T’.inh using only F.val, and F appears to the left of T’ in the production body, as required. Production Semantic Rule Type T → F T’ T’.inh = F.val Inherited T’ → * F T1’ T1’.inh = T’.inh * F.val Inherited
  • 87. Compiler Design Lecture-30 30 Evaluation Orders for SDD‘s L-Attributed Definitions Example-1: The second rule defines T1’.inh using the inherited attribute T’.inh associated with the head, and F.val, where F appears to the left of T1’in the production body. In each of these cases, the rules use information “from above or from the left,” as required by the class. The remaining attributes are synthesized. Hence, the SDD is L-attributed.
  • 88. Compiler Design Lecture-30 31 Evaluation Orders for SDD‘s L-Attributed Definitions Example-2: Any SDD containing the following production and rules cannot be L-attributed: The first rule, A.s = B.b, is a legitimate rule in either an S-attributed or L-attributed SDD. It defines a synthesized attribute A.s in terms of an attribute at a child (that is, a symbol within the production body). Production Semantic Rule A → BC A.s = B.b; B.i = f(C.c, A.s)
  • 89. Compiler Design Lecture-30 32 Evaluation Orders for SDD‘s L-Attributed Definitions Example-2: The second rule defines an inherited attribute B.i, so the entire SDD cannot be S-attributed. Further, although the rule is legal, the SDD cannot be L- attributed, because the attribute C.c is used to help define B.i, and C is to the right of B in the production body. While attributes at siblings in a parse tree may be used in L- attributed SDD's, they must be to the left of the symbol whose attribute is being defined.
  • 90. Compiler Design Lecture-30 33 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects In practice, translations involve side effects: 1. A desk calculator might print a result; 2. A code generator might enter the type of an identifier into a symbol table. • With SDD's, we strike a balance between attribute grammars and translation schemes. • Attribute grammars have no side effects and allow any evaluation order consistent with the dependency graph. • Translation schemes impose left-to-right evaluation and allow semantic actions to contain any program fragment.
  • 91. Compiler Design Lecture-30 34 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Attribute grammar: An SDD without any side effects is called attribute grammar. The semantic rules in an attribute grammar define the value of an attribute purely in terms of the values of other attributes and constants. Attribute grammars have the following properties: • They do not have any side effects. • They allow any evaluation order consistent with dependency graph.
  • 92. Compiler Design Lecture-30 35 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects How to control side effects in SDD? The side effects in SDD's can be controlled in one of the following ways: • Permitting side effects when attribute evaluation based on any topological sort of the dependency graph produces a correct translation. • Impose constraints in the evaluation order so that the same translation is produced for any allowable order.
  • 93. Compiler Design Lecture-30 36 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example: Consider the SDD for the desk calculator program. This SDD does not have any side effects. Now, let us consider the first semantic rule and the corresponding production shown below: Let us modify the desk calculator to print a result. Instead of the rule S.val = E.val, which saves the result in the synthesized attribute S.val, consider: Production Semantic Rule S → En S.val = E.val Production Semantic Rule S → En print (E.val)
  • 94. Compiler Design Lecture-30 37 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example: • Semantic rules that are executed for their side effects, such as print (E.val), will be treated as the definitions of dummy synthesized attributes associated with the head of the production. • The modified SDD produces the same translation under any topological sort, since the print statement is executed at the end, after the result is computed into E.val. Production Semantic Rule S → En print (E.val)
  • 95. Compiler Design Lecture-30 38 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example Problem: Write the SDD for a simple type declaration and write the annotated parse tree and dependency graph for the declaration “float a, b, c” D → T L T → int | float L → L1, id | id
  • 96. Compiler Design Lecture-30 39 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example Problem: Syntax-directed definition for simple type declarations: Production Semantic Rule 1. D → T L L.inh = T.type 2. T → int T.type = integer 3. T → float T.type = float 4. L → L1, id L1.inh = L.inh addType(id.entry, L.inh) 5. L → id addType(id.entry, L.inh)
  • 97. Compiler Design Lecture-30 40 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example Problem: Productions 4 and 5 also have a rule in which a function addType is called with two arguments: 1. id.entry, a lexical value that points to a symbol-table object, and 2. L.inh, the type being assigned to every identifier on the list. We suppose that function addType properly installs the type L.inh as the type of the represented identifier.
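A minimal C++ sketch of this declaration translation is shown below (the names symtab, addType and declaration are assumptions, and the symbol table is modeled as a simple map); it processes the declaration float a, b, c:

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Minimal sketch of D -> T L with L.inh = T.type:
// the type is inherited by the identifier list and addType records it.
std::map<std::string, std::string> symtab;       // stands in for the symbol table

void addType(const std::string& id, const std::string& type) {
    symtab[id] = type;                           // install L.inh as the id's type
}

void declaration(const std::string& type, const std::vector<std::string>& ids) {
    // L.inh = T.type; every L -> L1 , id and L -> id then applies addType
    for (const auto& id : ids) addType(id, type);
}

int main() {
    declaration("float", {"a", "b", "c"});       // float a, b, c
    for (const auto& [id, t] : symtab)
        std::cout << id << " : " << t << "\n";
}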
  • 98. Compiler Design Lecture-30 41 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example Problem: A dependency graph for the input string float id1, id2, id3 is shown below. (Fig.: Dependency graph for the declaration float id1, id2, id3; its nodes are numbered 1 through 10 and are described on the next slide.)
  • 99. Compiler Design Lecture-30 42 Evaluation Orders for SDD‘s Semantic Rules with Controlled Side Effects Example Problem: • Numbers 1 through 10 represent the nodes of the dependency graph. • Nodes 1, 2, and 3 represent the attribute entry associated with each of the leaves labeled id. • Nodes 6, 8, and 10 are the dummy attributes that represent the application of the function addType to a type and one of these entry values. • Node 4 represents the attribute T.type, and is actually where attribute evaluation begins. This type is then passed to nodes 5, 7, and 9 representing L.inh associated with each of the occurrences of the nonterminal L.
  • 100. Compiler Design Lecture-30 43 Summary... Evaluation Orders for SDD's • Dependency Graphs • Ordering the Evaluation of Attributes • S-Attributed Definitions • L-Attributed Definitions • Semantic Rules with Controlled Side Effects Reading: Aho2, Section 5.2.1 to 5.2.5 Next Lecture: Applications of Syntax-Directed Translation (SDT)
  • 101. Compiler Design Lecture-31 1 Compiler Design UNIT-III: Syntax-Directed Translation Lecture-31 Applications of Syntax-Directed Translation (SDT) Applications of Syntax-Directed Translation (SDT) • Construction of Syntax Trees • The Structure of a Type
  • 102. Compiler Design Lecture-31 2 Syntax-Directed Translation (SDT) Definition: The Syntax Directed Translation (in short SDT) is a context free grammar with embedded semantic actions. The semantic actions are nothing but the sequence of steps or program fragments that will be carried out when that production is used in the derivation. The SDTs are used: • To build syntax tree for programming constructs • To translate infix expression into postfix notation • To evaluate expressions
  • 103. Compiler Design Lecture-31 3 Construction of Syntax Trees • The main application in this section is the construction of syntax trees. • Since some compilers use syntax trees as an intermediate representation, a common form of SDD turns its input string into a tree. • To complete the translation to intermediate code, the compiler may then walk the syntax tree, using another set of rules that are in effect an SDD on the syntax tree rather than the parse tree. • The syntax tree is also called abstract syntax tree. The parse tree is also called concrete syntax tree.
  • 104. Compiler Design Lecture-31 4 Construction of Syntax Trees Definition: Syntax tree • A syntax tree also called abstract syntax tree is a compressed form of parse tree which is used to represent language constructs. • In a syntax tree for an expression, each interior node represents an operator and the children of the node represent the operands of the operator. • In general, any programming construct can be handled by making up an operator for the construct and treat semantically meaningful components of the construct as operands.
  • 105. Compiler Design Lecture-31 5 Construction of Syntax Trees Example: Syntax tree For the following grammar show the parse tree and syntax tree for the expression 3*5+4 E → E + T | E – T | T T → T * F | T / F | F F → ( E ) | digit | id
  • 106. Compiler Design Lecture-31 6 Construction of Syntax Trees Example: Syntax tree Parse tree for the expression 3*5+4 E T T T1 F * F 3 5 E + F 4
  • 107. Compiler Design Lecture-31 7 Construction of Syntax Trees Example: Syntax tree Syntax tree for the expression 3*5+4 + 4 3 5 *
  • 108. Compiler Design Lecture-31 8 Construction of Syntax Trees We consider two SDD's for constructing syntax trees for expressions. 1. The first, an S-attributed definition, is suitable for use during bottom-up parsing. 2. The second, L-attributed, is suitable for use during top-down parsing.
  • 109. Compiler Design Lecture-31 9 Construction of Syntax Trees Syntax tree for S-attributed definition: A syntax-tree node representing an expression E1 + E2 has label + and two children representing the subexpressions E1 and E2. We shall implement the nodes of a syntax tree by objects with a suitable number of fields. Each object will have an op field that is the label of the node.
  • 110. Compiler Design Lecture-31 10 Construction of Syntax Trees Syntax tree for S-attributed definition: The objects will have additional fields as follows: 1. Leaf (op, val): This function is called only for the terminals and it is used to create only leaf nodes containing two fields namely: • op field holds the label for the node • val field holds the lexical value obtained from the lexical analyzer. 2. Node(op, c1, c2,…, ck): This function is called to create only interior nodes with various fields namely: • op field holds the label for the node. • c1, c2,…., ck refer to children for the node labeled op.
  • 111. Compiler Design Lecture-31 11 Construction of Syntax Trees Syntax tree for S-attributed definition: Example: Obtain the semantic rules to construct a syntax tree for simple arithmetic expression grammar. E → E1 + T E → E1 – T E → T T → ( E ) T → id T → number
  • 112. Compiler Design Lecture-31 12 Construction of Syntax Trees Syntax tree for S-attributed definition: Example: The S-attributed definition for Constructing syntax trees for simple expressions is shown below: The S-attributed definition shown above constructs syntax trees for a simple expression grammar involving only the binary operators + and -. As usual, these operators are at the same precedence level and are jointly left associative. All nonterminals have one synthesized attribute node, which represents a node of the syntax tree. Productions Semantic Rules E → E1 + T E.node = new Node(‘+’, E1.node, T.node) E → E1 - T E.node = new Node(‘-’, E1.node, T.node) E → T E.node = T.node T → ( E ) T.node = E.node T → id T.node = new Leaf (id, id.entry) T → num T.node = new Leaf (num, num.val)
  • 113. Compiler Design Lecture-31 13 Construction of Syntax Trees Syntax tree for S-attributed definition: Example: • Every time the first production E → E1 + T is used, its rule creates a node with ‘+’ for op and two children, E1.node and T.node, for the subexpressions. • The second production has a similar rule. • For production 3, E → T , no node is created, since E.node is the same as T.node.
  • 114. Compiler Design Lecture-31 14 Construction of Syntax Trees Syntax tree for S-attributed definition: Example: • Similarly, no node is created for production 4, T → (E). The value of T.node is the same as E.node, since parentheses are used only for grouping; they influence the structure of the parse tree and the syntax tree, but once their job is done, there is no further need to retain them in the syntax tree. • The last two T-productions have a single terminal on the right. We use the constructor Leaf to create a suitable node, which becomes the value of T.node.
  • 115. Compiler Design Lecture-31 15 Construction of Syntax Trees Syntax tree for S-attributed definition: Example: The annotated parse tree depicting the construction of a syntax tree for the arithmetic expression a-4+c is shown below: E.node E.node T.node + E.node T.node - T.node id id num
  • 116. Compiler Design Lecture-31 16 Construction of Syntax Trees Syntax tree for S-attributed definition: Example: Steps in the construction of the syntax tree for a–4+c: 1) p1 = new Leaf (id, entry-a); 2) p2 = new Leaf (num, 4); 3) p3 = new Node(‘-’, p1, p2); 4) p4 = new Leaf (id, entry-c); 5) p5 = new Node(‘+’, p3, p4);
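A minimal C++ sketch of these five steps is shown below; the concrete shapes of the Leaf and Node constructors are assumptions, since the notes do not fix their implementation:

#include <string>
#include <vector>

// Minimal sketch of the Leaf/Node constructors used in the steps above.
struct SyntaxNode {
    std::string op;                        // label of the node ('+', '-', id, num)
    std::string lexeme;                    // lexical value for leaves
    std::vector<SyntaxNode*> children;     // children for interior nodes
};

SyntaxNode* Leaf(const std::string& op, const std::string& value) {
    return new SyntaxNode{op, value, {}};
}

SyntaxNode* Node(const std::string& op, SyntaxNode* c1, SyntaxNode* c2) {
    return new SyntaxNode{op, "", {c1, c2}};
}

int main() {
    SyntaxNode* p1 = Leaf("id", "a");      // 1) leaf for identifier a
    SyntaxNode* p2 = Leaf("num", "4");     // 2) leaf for the constant 4
    SyntaxNode* p3 = Node("-", p1, p2);    // 3) node for a - 4
    SyntaxNode* p4 = Leaf("id", "c");      // 4) leaf for identifier c
    SyntaxNode* p5 = Node("+", p3, p4);    // 5) node for (a - 4) + c, the root
    (void)p5;                              // p5 is the root of the syntax tree
}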
  • 117. Compiler Design Lecture-31 17 Construction of Syntax Trees Syntax tree for L-attributed definition: • The method of constructing syntax tree for L-attributed definition remains same as the method of constructing syntax tree for S-attributed definition. • The functions Leaf() and Node() are used with same number and type of parameters.
  • 118. Compiler Design Lecture-31 18 Construction of Syntax Trees Syntax tree for L-attributed definition: Example: Obtain the semantic rules to construct a syntax tree for simple arithmetic expression grammar using top-down approach with operators + and -. E → E1 + T E → E1 – T E → T T → ( E ) T → id T → number
  • 119. Compiler Design Lecture-31 19 Construction of Syntax Trees Syntax tree for L-attributed definition: Example: The given grammar is not suitable for top-down parser since it has left recursion. After eliminating left recursion, we get the following grammar: E → TE’ E’ → +TE1 ’ E’ → -TE1 ’ E’ → ϵ T → ( E ) T → id T → number
  • 120. Compiler Design Lecture-31 20 Construction of Syntax Trees Syntax tree for L-attributed definition: Example: The L-attributed definition for constructing syntax trees during top-down parsing is shown below: Productions Semantic Rules E → TE’ E.node = E’.syn E’.inh = T.node E’ → +TE1’ E1’.inh = new Node(‘+’, E’.inh, T.node) E’.syn = E1’.syn E’ → -TE1’ E1’.inh = new Node(‘-’, E’.inh, T.node) E’.syn = E1’.syn E’ → ϵ E’.syn = E’.inh T → ( E ) T.node = E.node T → id T.node = new Leaf (id, id.entry) T → num T.node = new Leaf (num, num.val)
  • 121. Compiler Design Lecture-31 21 Construction of Syntax Trees Syntax tree for L-attributed definition: Example: Dependency graph for a–4+c, with the SDD shown above is shown below: E 13 node T 2 node inh 5 E’ 12 syn id 1 entry T 4 node - inh 6 E’ 11 syn num 3 val T 8 node + inh 9 E’ 10 syn ϵ id 7 entry
  • 122. Compiler Design Lecture-31 22 The Structure of a Type What is the use of inherited attributes (I-attributes)? • During top-down parsing, the grammar should not have left recursion. If the grammar has left recursion, we have to eliminate left recursion. The resulting grammar needs inherited attributes. • Inherited attributes are useful when the structure of the parse tree differs from the abstract syntax of the input; attributes can then be used to carry information from one part of the parse tree to another. • But sometimes, even though the grammar does not have left recursion, the language itself demands inherited attributes.
  • 123. Compiler Design Lecture-31 23 The Structure of a Type What is the use of inherited attributes (I-attributes)? This can be explained by considering array type as shown below: Example: Give the Syntax Directed Translation (SDT) of type int [2][3] and also give the semantic rules for the respective productions. T → B C B → int B → float C → [num] C1 C → ϵ
  • 124. Compiler Design Lecture-31 24 The Structure of a Type Example: In C, the type int [2][3] can be read as, “array of 2 arrays of 3 integers.” The corresponding type expression array(2, array(3, integer)) is represented by the tree shown in below Fig. The operator array takes two parameters, a number and a type. If types are represented by trees, then this operator returns a tree node labeled array with two children for a number and a type. array 2 array 3 integer Fig. Type expression for int[2][3]
  • 125. Compiler Design Lecture-31 25 The Structure of a Type Example: The SDD is shown below: Nonterminal T generates either a basic type or an array type. Nonterminal B generates one of the basic types int and float. T generates a basic type when T derives BC and C derives ϵ. Otherwise, C generates array components consisting of a sequence of integers, each integer surrounded by brackets. Productions Semantic Rules T → B C T.t = C.t C.b = B.t B → int B.t = integer B → float B.t = float C → [num] C1 C.t = array (num.val, C1.t) C1.b = C.b C → ϵ C.t = C.b
  • 126. Compiler Design Lecture-31 26 The Structure of a Type Example: The nonterminals B and T have a synthesized attribute t representing a type. The nonterminal C has two attributes: an inherited attribute b and a synthesized attribute t. The inherited b attributes pass a basic type down the tree, and the synthesized t attributes accumulate the result.
  • 127. Compiler Design Lecture-31 27 The Structure of a Type Example: An annotated parse tree for the input string int[2][3] is shown in below Fig. T.t = array(2, array(3, integer)) B.t = integer C.b = integer C.t = array(2, array(3, integer)) int 2 [ ] C.b = integer C.t = array(3, integer) 3 [ ] C.b = integer C.t = integer ϵ Fig. Syntax-Directed Translation of array types
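A minimal C++ sketch of this type translation is shown below (the function C is a hypothetical stand-in for the nonterminal C: its parameter b plays the role of the inherited attribute C.b and its return value the synthesized attribute C.t); it prints array(2, array(3, integer)) for int[2][3]:

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Minimal sketch: build the type expression for int[2][3] by passing the basic
// type down (C.b) and wrapping it in array(...) on the way back up (C.t).
std::string C(const std::vector<int>& dims, std::size_t i, const std::string& b) {
    if (i == dims.size()) return b;                       // C -> epsilon: C.t = C.b
    // C -> [num] C1: C.t = array(num.val, C1.t), C1.b = C.b
    return "array(" + std::to_string(dims[i]) + ", " + C(dims, i + 1, b) + ")";
}

int main() {
    std::string t = C({2, 3}, 0, "integer");              // T -> B C with B.t = integer
    std::cout << t << "\n";   // prints array(2, array(3, integer))
}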
  • 128. Compiler Design Lecture-31 28 Summary... Applications of Syntax-Directed Translation (SDT) • Construction of Syntax Trees • The Structure of a Type Reading: Aho2, Section 5.3.1 & 5.3.2 Next Lecture: Syntax-Directed Translation (SDT) Schemes
  • 129. Compiler Design Lecture-32 1 Compiler Design UNIT-III: Syntax-Directed Translation Lecture-32 Syntax-Directed Translation (SDT) Schemes Syntax-Directed Translation (SDT) Schemes • Postfix Translation Schemes • Parser-Stack Implementation of Postfix SDT's • SDT's With Actions Inside Productions
  • 130. Compiler Design Lecture-32 2 Syntax-Directed Translation (SDT) Schemes Syntax-Directed Translation schemes (SDT's) are a complementary notation to Syntax-Directed Definitions (SDD). All of the applications of Syntax-Directed Definitions can be implemented using Syntax-Directed Translation schemes. Definition: A Syntax-Directed Translation scheme (SDT) is a context-free grammar with program fragments embedded within production bodies. The program fragments are called semantic actions and can appear at any position within a production body. By convention, we place curly braces around actions; if braces are needed as grammar symbols, then we quote them.
  • 131. Compiler Design Lecture-32 3 Syntax-Directed Translation (SDT) Schemes • Any SDT can be implemented by first building a parse tree and then performing the actions in a left-to-right depth- first order; that is, during a preorder traversal. • Typically, SDT's are implemented during parsing, without building a parse tree. • The use of SDT's to implement two important classes of SDD's: 1. The underlying grammar is LR-parsable, and the SDD is S-attributed. 2. The underlying grammar is LL-parsable, and the SDD is L-attributed.
  • 132. Compiler Design Lecture-32 4 Syntax-Directed Translation (SDT) Schemes • The semantic rules in an SDD can be converted into an SDT with actions that are executed at the right time. During parsing, an action in a production body is executed as soon as all the grammar symbols to the left of the action have been matched. • SDT's that can be implemented during parsing can be characterized by introducing distinct marker nonterminals in place of each embedded action; each marker M has only one production, M → ϵ . If the grammar with marker non- terminals can be parsed by a given method, then the SDT can be implemented during parsing.
  • 133. Compiler Design Lecture-32 5 Postfix Translation Schemes • The simplest SDD implementations occur when we can parse the grammar bottom-up and the SDD is S-attributed. • In that case, an SDT is constructed in which each action is placed at the end of its production and is executed when the body is reduced to the head, i.e., when the RHS of the production is reduced to the LHS. • SDT's with all actions at the right ends of the production bodies are called postfix SDT's or postfix syntax-directed translations.
  • 134. Compiler Design Lecture-32 6 Postfix Translation Schemes Example: Obtain a postfix SDT implementation of the desk calculator that evaluates a given expression. Solution: The SDT is obtained directly from the SDD shown below:
Productions          Semantic Rules
S → E n              S.val = E.val
E → E1 + T           E.val = E1.val + T.val
E → T                E.val = T.val
T → T1 * F           T.val = T1.val * F.val
T → F                T.val = F.val
F → ( E )            F.val = E.val
F → digit            F.val = digit.lexval
  • 135. Compiler Design Lecture-32 7 Postfix Translation Schemes Example: Solution: The postfix SDT implementation of the desk calculator is shown below:
Productions          Actions
S → E n              {print(E.val);}
E → E1 + T           {E.val = E1.val + T.val;}
E → T                {E.val = T.val;}
T → T1 * F           {T.val = T1.val * F.val;}
T → F                {T.val = F.val;}
F → ( E )            {F.val = E.val;}
F → digit            {F.val = digit.lexval;}
  • 136. Compiler Design Lecture-32 8 Parser-Stack Implementation of Postfix SDT's • Postfix SDT's can be implemented during LR parsing by executing the actions when reductions occur. • The attribute(s) of each grammar symbol can be put on the stack in a place where they can be found during the reduction. • The best plan is to place the attributes along with the grammar symbols (or the LR states that represent these symbols) in records on the stack itself.
  • 137. Compiler Design Lecture-32 9 Parser-Stack Implementation of Postfix SDT's In the following Fig. , the parser stack contains records with a field for a grammar symbol (or parser state) and, below it, a field for an attribute. The three grammar symbols XYZ are on top of the stack; perhaps they are about to be reduced according to a production like A → XYZ. Here, we show X.x as the one attribute of X, and so on. Z Z.z Y Y.y X X.x State/grammar symbol Synthesized attribute(s) Fig: Parser stack with a field for synthesized attributes top top-1 top-2
  • 138. Compiler Design Lecture-32 10 Parser-Stack Implementation of Postfix SDT's • If the attributes are all synthesized, and the actions occur at the ends of the productions, then we can compute the attributes for the head when we reduce the body to the head. • If we reduce by a production such as A → XYZ, then we have all the attributes of X, Y , and Z available, at known positions on the stack, as shown in the above Fig. • After the action, A and its attributes are at the top of the stack, in the position of the record for X.
  • 139. Compiler Design Lecture-32 11 Parser-Stack Implementation of Postfix SDT's Example: Implementing the desk calculator on a bottom-up parsing stack
Productions          Actions
S → E n              {print(stack[top-1].val); top = top-1;}
E → E1 + T           {stack[top-2].val = stack[top-2].val + stack[top].val; top = top-2;}
E → T
T → T1 * F           {stack[top-2].val = stack[top-2].val * stack[top].val; top = top-2;}
T → F
F → ( E )            {stack[top-2].val = stack[top-1].val; top = top-2;}
F → digit
  • 140. Compiler Design Lecture-32 12 Parser-Stack Implementation of Postfix SDT's Example: • Suppose that the stack is kept in an array of records called stack, with top a cursor to the top of the stack. • Thus, stack[top] refers to the top record on the stack, stack[top - 1] to the record below that, and so on. • Also, we assume that each record has a field called val, which holds the attribute of whatever grammar symbol is represented in that record. • Thus, we may refer to the attribute E.val that appears at the third position on the stack as stack[top - 2].val.
  • 141. Compiler Design Lecture-32 13 Parser-Stack Implementation of Postfix SDT's Example: • For instance, in the second production, E → E1 + T, we go two positions below the top to get the value of E1, and we find the value of T at the top. The resulting sum is placed where the head E will appear after the reduction, that is, two positions below the current top. The reason is that after the reduction, the three topmost stack symbols are replaced by one. After computing E.val, we pop two symbols off the top of the stack, so the record where we placed E.val will now be at the top of the stack.
  • 142. Compiler Design Lecture-32 14 Parser-Stack Implementation of Postfix SDT's Example: • In the third production, E → T, no action is necessary, because the length of the stack does not change, and the value of T.val at the stack top will simply become the value of E.val. • The same observation applies to the productions T → F and F → digit. • Production F → ( E ) is slightly different. Although the value does not change, two positions are removed from the stack during the reduction, so the value has to move to the position it will occupy after the reduction.
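To make the mechanics concrete, here is a rough simulation of how the val fields evolve while the input 3 * 5 + 4 n is reduced with the postfix actions of the table above. The helper names shift and reduce_to are my own, not part of the text:

stack = []                                      # each record: [symbol, val]

def shift(symbol, val=None):
    stack.append([symbol, val])

def reduce_to(head, body_len, action=None):
    body = stack[len(stack) - body_len:]        # records for the production body
    del stack[len(stack) - body_len:]
    stack.append([head, action(body) if action else body[0][1]])

shift('digit', 3); reduce_to('F', 1); reduce_to('T', 1)
shift('*'); shift('digit', 5); reduce_to('F', 1)
reduce_to('T', 3, lambda b: b[0][1] * b[2][1])  # T -> T * F
reduce_to('E', 1)
shift('+'); shift('digit', 4); reduce_to('F', 1); reduce_to('T', 1)
reduce_to('E', 3, lambda b: b[0][1] + b[2][1])  # E -> E + T
shift('n')
print(stack[-2][1])                             # prints 19, the value kept for E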
  • 143. Compiler Design Lecture-32 15 SDT's With Actions Inside Productions • An action may be placed at any position within the body of a production. • It is performed immediately after all symbols to its left are processed. • Thus, if we have a production B → X {a} Y, the action a is done after we have recognized X (if X is a terminal) or all the terminals derived from X (if X is a nonterminal). More precisely, • If the parse is bottom-up, then we perform action a as soon as this occurrence of X appears on the top of the parsing stack. • If the parse is top-down, we perform a just before we attempt to expand this occurrence of Y (if Y is a nonterminal) or check for Y on the input (if Y is a terminal).
  • 144. Compiler Design Lecture-32 16 SDT's With Actions Inside Productions Example: Problematic SDT for infix-to-prefix translation during parsing As an extreme example of a problematic SDT, suppose that we turn our desk-calculator running example into an SDT that prints the prefix form of an expression, rather than evaluating the expression. The productions and actions are shown below:
1) S → E n
2) E → {print(‘+’);} E1 + T
3) E → T
4) T → {print(‘*’);} T1 * F
5) T → F
6) F → ( E )
7) F → digit {print(digit.lexval);}
  • 145. Compiler Design Lecture-32 17 SDT's With Actions Inside Productions Example: Problematic SDT for infix-to-prefix translation during parsing • Unfortunately, it is impossible to implement this SDT during either top-down or bottom-up parsing, because the parser would have to perform critical actions, like printing instances of * or +, long before it knows whether these symbols will appear in its input. • Using marker nonterminals M2 and M4 for the actions in productions 2 and 4, respectively, on input that is a digit, a shift-reduce parser has conflicts between reducing by M2 → ϵ, reducing by M4 → ϵ, and shifting the digit.
  • 146. Compiler Design Lecture-32 18 SDT's With Actions Inside Productions Any SDT can be implemented as follows: 1. Ignoring the actions, parse the input and produce a parse tree as a result. 2. Then, examine each interior node N, say one for production A → α. Add additional children to N for the actions in α, so the children of N from left to right have exactly the symbols and actions of α. 3. Perform a preorder traversal of the tree, and as soon as a node labeled by an action is visited, perform that action.
  • 147. Compiler Design Lecture-32 19 SDT's With Actions Inside Productions For instance, the following Fig. shows the parse tree for expression 3 * 5 + 4 with actions inserted (children indented under their parents). If we visit the nodes in preorder, we get the prefix form of the expression: + * 3 5 4.
S
  E
    { print(‘+’); }
    E
      T
        { print(‘*’); }
        T
          F
            digit 3
            { print(3); }
        *
        F
          digit 5
          { print(5); }
    +
    T
      F
        digit 4
        { print(4); }
  n
Fig: Parse tree with actions embedded
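Steps 2 and 3 above can be expressed in a few lines. The sketch below is purely illustrative (the tuple representation of nodes and the helper names are assumptions): action nodes appear among the children, and a preorder traversal fires each action as it is visited, producing + * 3 5 4.

def preorder(node):
    if callable(node):                 # an action node: execute it
        node()
    elif isinstance(node, tuple):      # an interior node: (label, children...)
        for child in node[1:]:
            preorder(child)
    # terminal leaves such as '+', '*', 'n' need no action here

def p(text):                           # helper building a print action
    return lambda: print(text, end=' ')

tree = ('S',
        ('E', p('+'),
              ('E', ('T', p('*'),
                          ('T', ('F', '3', p('3'))),
                          '*',
                          ('F', '5', p('5')))),
              '+',
              ('T', ('F', '4', p('4')))),
        'n')

preorder(tree)                         # prints: + * 3 5 4
print()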
  • 148. Compiler Design Lecture-32 20 Summary... Syntax-Directed Translation (SDT) Schemes • Postfix Translation Schemes • Parser-Stack Implementation of Postfix SDT's • SDT's With Actions Inside Productions Reading: Aho2, Section 5.4.1 to 5.4.3 Next Lecture: Implementing L-Attributed SDD's
  • 149. Compiler Design Lecture-33 1 Compiler Design UNIT-III: Syntax-Directed Translation Lecture-33 Implementing L-Attributed SDD's Implementing L-Attributed SDD's • Translation During Recursive-Descent Parsing • On-The-Fly Code Generation • L-Attributed SDD's and LL Parsing • Bottom-Up Parsing of L-Attributed SDD's
  • 150. Compiler Design Lecture-33 2 Implementing L-Attributed SDD's The following methods do translation by traversing a parse tree: 1. Build the parse tree and annotate. This method works for any noncircular SDD whatsoever. 2. Build the parse tree, add actions, and execute the actions in preorder. This approach works for any L-attributed definition. We discussed how to turn an L-attributed SDD into an SDT; in particular, we discussed how to embed actions into productions based on the semantic rules of such an SDD.
  • 151. Compiler Design Lecture-33 3 Implementing L-Attributed SDD's We discuss the following methods for translation during parsing: 3. Use a recursive-descent parser with one function for each nonterminal. The function for nonterminal A receives the inherited attributes of A as arguments and returns the synthesized attributes of A. 4. Generate code on the fly, using a recursive-descent parser. 5. Implement an SDT in conjunction with an LL-parser. The attributes are kept on the parsing stack, and the rules fetch the needed attributes from known locations on the stack.
  • 152. Compiler Design Lecture-33 4 Implementing L-Attributed SDD's We discuss the following methods for translation during parsing: 6. Implement an SDT in conjunction with an LR-parser. This method may be surprising, since the SDT for an L-attributed SDD typically has actions in the middle of productions, and we cannot be sure during an LR parse that we are even in that production until its entire body has been constructed. We shall see, however, that if the underlying grammar is LL, we can always handle both the parsing and translation bottom-up.
  • 153. Compiler Design Lecture-33 5 Translation During Recursive-Descent Parsing A recursive-descent parser has a function A for each nonterminal A, as discussed in Top-down parser. We can extend the parser into a translator as follows: a) The arguments of function A are the inherited attributes of nonterminal A. b) The return-value of function A is the collection of synthesized attributes of nonterminal A.
  • 154. Compiler Design Lecture-33 6 Translation During Recursive-Descent Parsing In the body of function A, we need to both parse and handle attributes: 1. Decide upon the production used to expand A. 2. Check that each terminal appears on the input when it is required. We shall assume that no backtracking is needed, but the extension to recursive-descent parsing with backtracking can be done by restoring the input position upon failure. 3. Preserve, in local variables, the values of all attributes needed to compute inherited attributes for nonterminals in the body or synthesized attributes for the head nonterminal.
  • 155. Compiler Design Lecture-33 7 Translation During Recursive-Descent Parsing In the body of function A, we need to both parse and handle attributes: 4. Call functions corresponding to nonterminals in the body of the selected production, providing them with the proper arguments. Since the underlying SDD is L-attributed, we have already computed these attributes and stored them in local variables.
  • 156. Compiler Design Lecture-33 8 Translation During Recursive-Descent Parsing Example: Let us consider the SDD and SDT for while statements: S → while ( C ) S1 Here, S is the nonterminal that generates all kinds of statements, including if-statements, assignment statements, and others. In this example, C stands for a conditional expression, that is, a boolean expression that evaluates to true or false.
  • 157. Compiler Design Lecture-33 9 Translation During Recursive-Descent Parsing Example: SDD for while-statements:
Production:      S → while ( C ) S1
Semantic Rules:  L1 = new(); L2 = new();
                 S1.next = L1; C.false = S.next; C.true = L2;
                 S.code = label || L1 || C.code || label || L2 || S1.code
  • 158. Compiler Design Lecture-33 10 Translation During Recursive-Descent Parsing Example: SDT for while-statements, with each action embedded at the position where it must execute:
S → while ( {L1 = new(); L2 = new(); C.false = S.next; C.true = L2;}
      C ) {S1.next = L1;}
      S1 {S.code = label || L1 || C.code || label || L2 || S1.code;}
  • 159. Compiler Design Lecture-33 11 Translation During Recursive-Descent Parsing Example: A pseudocode of the relevant parts of the function S shown below: [Implementing while-statements with a recursive-descent parser] string S(label next) { string Scode, Ccode; /* local variables holding code fragments */ label L1, L2; /* the local labels */ if ( current input == token while ) { advance input; check ‘(‘ is next on the input, and advance;
  • 160. Compiler Design Lecture-33 12 Translation During Recursive-Descent Parsing Example: Implementing while-statements with a recursive-descent parser L1 = new(); L2 = new(); Ccode = C(next, L2); check ‘)’ is next on the input, and advance; Scode = S(L1); return("label" || L1 || Ccode || "label" || L2 || Scode); } else /* other statement types */ }
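For concreteness, a compact runnable rendering of this pseudocode is sketched below. The token handling and the placeholder condition parser C are simplifications of my own, not part of the text; a real parser would recognize full conditions and statement forms.

tokens = []                 # the remaining input tokens
label_count = 0

def new():                  # new(): generate a fresh label
    global label_count
    label_count += 1
    return f"L{label_count}"

def match(t):
    assert tokens and tokens[0] == t, f"expected {t}"
    tokens.pop(0)

def C(false_lbl, true_lbl):            # placeholder for the condition parser
    cond = tokens.pop(0)
    return f"if {cond} goto {true_lbl}\ngoto {false_lbl}\n"

def S(next_lbl):
    if tokens and tokens[0] == "while":
        match("while"); match("(")
        L1, L2 = new(), new()
        Ccode = C(next_lbl, L2)        # C.false = S.next, C.true = L2
        match(")")
        Scode = S(L1)                  # S1.next = L1
        return f"label {L1}\n{Ccode}label {L2}\n{Scode}"
    else:                              # other statement types: one token here
        stmt = tokens.pop(0)
        return f"{stmt}\ngoto {next_lbl}\n"

tokens = ["while", "(", "x<10", ")", "x=x+1"]
print(S("Lnext"))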
  • 161. Compiler Design Lecture-33 13 On-The-Fly Code Generation • The construction of long strings of code that are attribute values is undesirable for several reasons, including the time it could take to copy or move long strings. • In common cases such as our running code-generation example, we can instead incrementally generate pieces of the code into an array or output file by executing the actions in an SDT.
  • 162. Compiler Design Lecture-33 14 On-The-Fly Code Generation Example: On-the-fly recursive-descent code generation for while-statements. We can modify the function for while-statements to emit the elements of the translation S.code as they are generated, instead of saving them for concatenation into a return value. The revised function S is shown below: void S(label next) { label L1, L2; /* the local labels */ if ( current input == token while ) { advance input; check ‘(‘ is next on the input, and advance; L1 = new(); L2 = new();
  • 163. Compiler Design Lecture-33 15 On-The-Fly Code Generation Example: print(“label”, L1); C(next, L2); check ‘)’ is next on the input, and advance; print(“label”, L2); S(L1); } else /* other statement types */ }
  • 164. Compiler Design Lecture-33 16 On-The-Fly Code Generation Example: SDT for on-the-fly code generation for while statements Productions Semantic Rules S → while ( {L1 = new(); L2 = new(); C.false = S.next; C.true = L2; print(“label”, L1);} C ) { S1.next = L1; print(“label”, L2); } S1
  • 165. Compiler Design Lecture-33 17 L-Attributed SDD's and LL Parsing • Suppose that an L-attributed SDD is based on an LL-grammar and that we have converted it to an SDT with actions embedded in the productions. • We can then perform the translation during LL parsing by extending the parser stack to hold actions and certain data items needed for attribute evaluation. Typically, the data items are copies of attributes. • In addition to records representing terminals and nonterminals, the parser stack will hold action-records representing actions to be executed and synthesize-records to hold the synthesized attributes for nonterminals.
  • 166. Compiler Design Lecture-33 18 L-Attributed SDD's and LL Parsing We use the following two principles to manage attributes on the stack: 1. The inherited attributes of a nonterminal A are placed in the stack record that represents that nonterminal. The code to evaluate these attributes will usually be represented by an action-record immediately above the stack record for A; in fact, the conversion of L-attributed SDD's to SDT's ensures that the action-record will be immediately above A. 2. The synthesized attributes for a nonterminal A are placed in a separate synthesize-record that is immediately below the record for A on the stack.
  • 167. Compiler Design Lecture-33 19 L-Attributed SDD's and LL Parsing • This strategy places records of several types on the parsing stack, trusting that these variant record types can be managed properly as subclasses of a “stack-record” class. In practice, we might combine several records into one, but the ideas are perhaps best explained by separating data used for different purposes into different records. • Action-records contain pointers to code to be executed. Actions may also appear in synthesize-records; these actions typically place copies of the synthesized attribute(s) in other records further down the stack, where the value of that attribute will be needed after the synthesize-record and its attributes are popped off the stack.
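A minimal sketch of the record kinds such an extended stack might hold is shown below; the class names and fields are illustrative only, and real implementations often merge several record types into one, as noted above.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SymbolRecord:          # terminal or nonterminal, carrying inherited attributes
    symbol: str
    inherited: dict = field(default_factory=dict)

@dataclass
class ActionRecord:          # code to run when this record reaches the stack top
    code: Callable[[list], None]   # receives the stack so it can copy attributes around

@dataclass
class SynthesizeRecord:      # sits immediately below its nonterminal, collects synthesized attrs
    for_symbol: str
    synthesized: dict = field(default_factory=dict)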
  • 168. Compiler Design Lecture-33 20 L-Attributed SDD's and LL Parsing Example-1: Expansion of S according to the while-statement production S → while ( C ) S1 Figure (a) shows the situation as we are about to use the while-production to expand S, because the lookahead symbol on the input is while. The record at the top of stack is for S, and it contains only the inherited attribute S.next, which we suppose has the value x. Since we are now parsing top-down, we show the stack top at the left, according to our usual convention. Fig (a): the record for S, holding next = x, is at the top of the stack.
  • 169. Compiler Design Lecture-33 21 L-Attributed SDD's and LL Parsing Example-1: Expansion of S according to the while-statement production S → while ( C ) S1
Fig (b): the stack after the expansion, top at the left:
while
(
Action (snext = x, L1 = ?, L2 = ?) with code: { L1 = new(); L2 = new(); stack[top - 1].false = snext; stack[top - 1].true = L2; stack[top - 3].al1 = L1; stack[top - 3].al2 = L2; print(“label”, L1); }
C (false = ?, true = ?)
)
Action (al1 = ?, al2 = ?) with code: { stack[top - 1].next = al1; print(“label”, al2); }
S1 (next = ?)
  • 170. Compiler Design Lecture-33 22 L-Attributed SDD's and LL Parsing Example-1: Expansion of S according to the while-statement production S → while ( C ) S1 Figure (b) shows the situation immediately after we have expanded S. There are action-records in front of the nonterminals C and S1, corresponding to the actions in the underlying SDT for on-the-fly code generation for while statements. The record for C has room for inherited attributes true and false, while the record for S1 has room for attribute next, as all S-records must. We show values for these fields as ?, because we do not yet know their values.
  • 171. Compiler Design Lecture-33 23 L-Attributed SDD's and LL Parsing Example-2: Expansion of S with synthesized attributes constructed on the stack, for S → while ( C ) S1. Fig (a): the stack holds, top first, the record for S (data: next = x) and, immediately below it, a synthesize-record for S.code (data: code = ?); each record has a data part and an actions part.
  • 172. Compiler Design Lecture-33 24 L-Attributed SDD's and LL Parsing Example-2: Expansion of S with synthesized attributes constructed on the stack
Fig (b): the stack after the expansion, top at the left:
while
(
Action (L1 = ?, L2 = ?) with code: { L1 = new(); L2 = new(); stack[top - 1].true = L2; stack[top - 4].next = L1; stack[top - 5].l1 = L1; stack[top - 5].l2 = L2; }
C (false = ?, true = ?)
Synthesize C.code (code = ?) with action: { stack[top - 3].Ccode = code; }
)
S1 (next = ?)
Synthesize S1.code (code = ?, Ccode = ?, l1 = ?, l2 = ?) with action: { stack[top - 1].code = “label” || l1 || Ccode || “label” || l2 || code; }
Synthesize S.code (code = ?)
  • 173. Compiler Design Lecture-33 25 Bottom-Up Parsing of L-Attributed SDD's We can do bottom-up every translation that we can do top-down. More precisely, given an L-attributed SDD on an LL grammar, we can adapt the grammar to compute the same SDD on the new grammar during an LR parse. The “trick” has three parts: 1. Start with the SDT constructed earlier, which places embedded actions before each nonterminal to compute its inherited attributes and an action at the end of the production to compute synthesized attributes. 2. Introduce into the grammar a marker nonterminal in place of each embedded action. Each such place gets a distinct marker, and there is one production for any marker M, namely M → ϵ.
  • 174. Compiler Design Lecture-33 26 Bottom-Up Parsing of L-Attributed SDD's 3. Modify the action a, if marker nonterminal M replaces it in some production A → α {a} β, and associate with M → ϵ an action a’ that a) Copies, as inherited attributes of M, any attributes of A or symbols of α that action a needs. b) Computes attributes in the same way as a, but makes those attributes be synthesized attributes of M.
  • 175. Compiler Design Lecture-33 27 Bottom-Up Parsing of L-Attributed SDD's Example-1: Suppose that there is a production A → B C in an LL grammar, and the inherited attribute B.i is computed from inherited attribute A.i by some formula B.i = f(A.i). That is, the fragment of an SDT we care about is A → {B.i = f(A.i); } B C We introduce marker M with inherited attribute M.i and synthesized attribute M.s. The former will be a copy of A.i and the latter will be B.i. The SDT will be written A → M B C M → ϵ {M.i = A.i; M.s = f(M.i); }
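For a concrete, purely illustrative instance of this trick, suppose A.i is a nesting depth and f simply adds one to it. The original fragment and its marker form would be:
A → {B.i = A.i + 1;} B C
which becomes
A → M B C
M → ϵ {M.i = A.i; M.s = M.i + 1;}
During the LR parse, B.i is then found as M.s in the stack record for M, which sits below the records for the symbols derived from B.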
  • 176. Compiler Design Lecture-33 28 Bottom-Up Parsing of L-Attributed SDD's Example-2: Consider the SDT for while-statements, with its actions embedded at the positions where they must execute:
S → while ( {L1 = new(); L2 = new(); C.false = S.next; C.true = L2;}
      C ) {S1.next = L1;}
      S1 {S.code = label || L1 || C.code || label || L2 || S1.code;}
  • 177. Compiler Design Lecture-33 29 Bottom-Up Parsing of L-Attributed SDD's Example-2: Let us turn the above SDT into an SDT that can operate with an LR parse of the revised grammar. We introduce a marker M before C and a marker N before S1, so the underlying grammar becomes S → while ( M C ) N S1 M → ϵ N → ϵ
  • 178. Compiler Design Lecture-33 30 Bottom-Up Parsing of L-Attributed SDD's Example-2: LR parsing stack after the reduction of ϵ to M (top at the right): [? : S.next] [while] [(] [M : C.true, C.false, L1, L2] ← top. Code executed during the reduction of ϵ to M: L1 = new(); L2 = new(); C.true = L2; C.false = stack[top - 3].next;
  • 179. Compiler Design Lecture-33 31 Bottom-Up Parsing of L-Attributed SDD's Example-2: Stack just before the reduction of the while-production body to S (top at the right): [? : S.next] [while] [(] [M : C.true, C.false, L1, L2] [C : C.code] [)] [N : S1.next] [S1 : S1.code] ← top
  • 180. Compiler Design Lecture-33 32 Summary Implementing L-Attributed SDD's • Translation During Recursive-Descent Parsing • On-The-Fly Code Generation • L-Attributed SDD's and LL Parsing • Bottom-Up Parsing of L-Attributed SDD's Reading: Aho2, Section 5.5.1 to 5.5.4 Next Lecture: Intermediate-Code Generation
  • 181. Compiler Design Lecture-34 1 Compiler Design UNIT-III(2): INTERMEDIATE-CODE GENERATION Lecture-34 Variants of Syntax Trees Outline: • Introduction to Intermediate-Code Generation • Directed Acyclic Graphs (DAG’S) for Expressions • The Value-Number Method for Constructing DAG’s
  • 182. Compiler Design Lecture-34 2 Intermediate-Code Generation • In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code. • Ideally, details of the source language are confined to the front end, and details of the target machine to the back end. • With a suitably defined intermediate representation, a compiler for language i and machine j can then be built by combining the front end for language i with the back end for machine j. • This approach to creating suites of compilers can save a considerable amount of effort: m × n compilers can be built by writing just m front ends and n back ends.
  • 183. Compiler Design Lecture-34 3 Intermediate-Code Generation The logical structure of a compiler front end is shown in the figure below: Parser → Static Checker → Intermediate Code Generator → (intermediate code) → Code Generator. The parser, static checker, and intermediate code generator form the front end; the code generator belongs to the back end. Fig: Logical structure of a compiler front end
  • 184. Compiler Design Lecture-34 4 Intermediate-Code Generation • Static checking includes type checking, which ensures that operators are applied to compatible operands. • It also includes any syntactic checks that remain after parsing. • For example, static checking assures that a break- statement in C is enclosed within a while-, for-, or switch-statement; an error is reported if such an enclosing statement does not exist. The approach in this chapter can be used for a wide range of intermediate representations, including syntax trees and three-address code.
  • 185. Compiler Design Lecture-34 5 Intermediate-Code Generation • The term “three-address code” comes from instructions of the general form x = y op z with three addresses: two for the operands y and z and one for the result x. • In the process of translating a program in a given source language into code for a given target machine, a compiler may construct a sequence of intermediate representations, as shown in the figure below: Source Program → High Level Intermediate Representation → … → Low Level Intermediate Representation → Target Code. Fig: A compiler might use a sequence of intermediate representations
  • 186. Compiler Design Lecture-34 6 Intermediate-Code Generation • High-level representations are close to the source language and low-level representations are close to the target machine. • Syntax trees are high level; they depict the natural hierarchical structure of the source program and are well suited to tasks like static type checking. • A low-level representation is suitable for machine-dependent tasks like register allocation and instruction selection. • Three-address code can range from high- to low- level, depending on the choice of operators.
  • 187. Compiler Design Lecture-34 7 Intermediate-Code Generation • For expressions, the differences between syntax trees and three-address code are superficial. • For looping statements, for example, a syntax tree represents the components of a statement, whereas three-address code contains labels and jump instructions to represent the flow of control, as in machine language. • The choice or design of an intermediate representation varies from compiler to compiler. • An intermediate representation may either be an actual language or it may consist of internal data structures that are shared by phases of the compiler.
  • 188. Compiler Design Lecture-34 8 Intermediate-Code Generation • C is a programming language, yet it is often used as an intermediate form because it is flexible, it compiles into efficient machine code, and its compilers are widely available. • The original C++ compiler consisted of a front end that generated C, treating a C compiler as a back end.
  • 189. Compiler Design Lecture-34 9 Variants of Syntax Trees • Nodes in a syntax tree represent constructs in the source program; the children of a node represent the meaningful components of a construct. • A Directed Acyclic Graph (hereafter called a DAG) for an expression identifies the common subexpressions (subexpressions that occur more than once) of the expression. • As we shall see in this section, DAG's can be constructed by using the same techniques that construct syntax trees.
  • 190. Compiler Design Lecture-34 10 Directed Acyclic Graphs for Expressions • Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators. • The difference is that a node N in a DAG has more than one parent if N represents a common subexpression; in a syntax tree, the tree for the common subexpression would be replicated as many times as the subexpression appears in the original expression. • Thus, a DAG not only represents expressions more clearly, it gives the compiler important clues regarding the generation of efficient code to evaluate the expressions.
  • 191. Compiler Design Lecture-34 11 Directed Acyclic Graphs for Expressions Example-1: The following figure shows the DAG for the expression a + a * (b - c) + (b - c) * d. Fig: DAG for the expression a + a * (b - c) + (b - c) * d
  • 192. Compiler Design Lecture-34 12 Directed Acyclic Graphs for Expressions Example-1: • The leaf for a has two parents, because a appears twice in the expression. • More interestingly, the two occurrences of the common subexpression b-c are represented by one node, the node labeled -. That node has two parents, representing its two uses in the subexpressions a*(b-c) and (b-c)*d. • Even though b and c appear twice in the complete expression, their nodes each have one parent, since both uses are in the common subexpression b-c.
  • 193. Compiler Design Lecture-34 13 Directed Acyclic Graphs for Expressions Example-2: Syntax-Directed Definition (SDD) to produce syntax trees or DAG's Productions Semantic Rules E → E1 + T E.node = new Node(‘+’, E1.node, T.node) E → E1 - T E.node = new Node(‘-’, E1.node, T.node) E → T E.node = T.node T → ( E ) T.node = E.node T → id T.node = new Leaf (id, id.entry) T → num T.node = new Leaf (num, num.val)
  • 194. Compiler Design Lecture-34 14 Directed Acyclic Graphs for Expressions Example-2: Steps for constructing the DAG of the above SDD The sequence of steps shown below constructs the DAG as shown in the above Fig., provided Node and Leaf return an existing node, if possible. 1) p1 = new Leaf (id, entry-a); 2) p2 = new Leaf (id, entry-a) = p1; 3) p3 = new Leaf (id, entry-b); 4) p4 = new Leaf (id, entry-c); 5) p5 = new Node(‘-’, p3, p4); 6) p6 = new Node(‘*’, p1, p5);
  • 195. Compiler Design Lecture-34 15 Directed Acyclic Graphs for Expressions Example-2: Steps for constructing the DAG of the above SDD 7) p7 = new Node(‘+’, p1, p6); 8) p8 = new Leaf (id, entry-b) = p3; 9) p9 = new Leaf (id, entry-c) = p4; 10)p10 = new Node(‘-’, p3, p4) = p5; 11)p11 = new Leaf (id, entry-d); 12)p12 = new Node(‘*’, p5, p11); 13)p13 = new Node(‘+’, p7, p12);
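These steps can be mimicked with a small memoizing constructor. The sketch below is illustrative only (the dictionary-based lookup stands in for the search that Node and Leaf must perform); it makes the repeated calls of steps 2, 8, 9, and 10 return the nodes already built.

nodes = {}                                  # signature -> node already built

def Leaf(kind, entry):
    return nodes.setdefault((kind, entry), ("leaf", kind, entry))

def Node(op, left, right):
    return nodes.setdefault((op, left, right), ("node", op, left, right))

a = Leaf("id", "entry-a"); b = Leaf("id", "entry-b"); c = Leaf("id", "entry-c")
d = Leaf("id", "entry-d")
p5 = Node("-", b, c)                        # steps 5 and 10 yield this same node
p13 = Node("+", Node("+", a, Node("*", a, p5)), Node("*", p5, d))
print(Leaf("id", "entry-a") is a, Node("-", b, c) is p5)    # True True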
  • 196. Compiler Design Lecture-34 16 The Value-Number Method for Constructing DAG’s The nodes of a syntax tree or DAG are stored in an array of records, as shown in the figure below. Fig (a): DAG for the expression i = i + 10 (an = node whose children are the leaf for i and a + node; the + node’s children are the leaf for i and the leaf num 10). Fig (b): the array of records:
1   id    (points to the symbol-table entry for i)
2   num   10
3   +     1   2
4   =     1   3
5   …
  • 197. Compiler Design Lecture-34 17 The Value-Number Method for Constructing DAG’s • Each row of the array represents one record, and therefore one node. • In each record, the first field is an operation code, indicating the label of the node. • In Fig. (b), leaves have one additional field, which holds the lexical value (either a symbol-table pointer or a constant, in this case), and interior nodes have two additional fields indicating the left and right children. • In this array, we refer to nodes by giving the integer index of the record for that node within the array. This integer historically has been called the value number for the node or for the expression represented by the node.
  • 198. Compiler Design Lecture-34 18 The Value-Number Method for Constructing DAG’s • For instance, in above Fig. , the node labeled + has value number 3, and its left and right children have value numbers 1 and 2, respectively. • In practice, we could use pointers to records or references to objects instead of integer indexes, but we shall still refer to the reference to a node as its ”value number.” • If stored in an appropriate data structure, value numbers help us construct expression DAG's efficiently. • Suppose that nodes are stored in an array, as in above Fig., and each node is referred to by its value number. • Let the signature of an interior node be the triple < op, l, r>, where op is the label, l its left child's value number, and r its right child's value number. A unary operator may be assumed to have r = 0.
  • 199. Compiler Design Lecture-34 19 The Value-Number Method for Constructing DAG’s Algorithm : The value-number method for constructing the nodes of a DAG. INPUT: Label op, node l, and node r. OUTPUT: The value number of a node in the array with signature < op, l, r >. METHOD: • Search the array for a node M with label op, left child l, and right child r. • If there is such a node, return the value number of M. • If not, create in the array a new node N with label op, left child l, and right child r, and return its value number.
  • 201. Compiler Design Lecture-34 21 The Value-Number Method for Constructing DAG’s • Although the above algorithm yields the desired output, searching the entire array every time we are asked to locate one node is expensive, especially if the array holds expressions from an entire program. • A more efficient approach is to use a hash table, in which the nodes are put into ”buckets,” each of which typically will have only a few nodes. • The hash table is one of several data structures that support dictionaries efficiently.
  • 202. Compiler Design Lecture-34 22 The Value-Number Method for Constructing DAG’s • A dictionary is an abstract data type that allows us to insert and delete elements of a set, and to determine whether a given element is currently in the set. • A good data structure for dictionaries, such as a hash table, performs each of these operations in time that is constant or close to constant, independent of the size of the set. • To construct a hash table for the nodes of a DAG, we need a hash function h that computes the index of the bucket for a signature < op, l, r >, in a way that distributes the signatures across buckets, so that it is unlikely that any one bucket will get much more than a fair share of the nodes.
  • 203. Compiler Design Lecture-34 23 The Value-Number Method for Constructing DAG’s • The bucket index h(op, l, r) is computed deterministically from op, l, and r, so that we may repeat the calculation and always get to the same bucket index for node < op, l, r >. • The buckets can be implemented as linked lists, as shown in the figure below. Fig: Data structure for searching buckets (an array of bucket headers, indexed by hash value, each pointing to a linked list of cells; each cell holds the value number of one node that hashes to that bucket).
  • 204. Compiler Design Lecture-34 24 The Value-Number Method for Constructing DAG’s • An array, indexed by hash value, holds the bucket headers, each of which points to the first cell of a list. • Within the linked list for a bucket, each cell holds the value number of one of the nodes that hash to that bucket. That is, node < op, l, r > can be found on the list whose header is at index h(op, l, r) of the array. • Thus, given the input node op, l, and r, we compute the bucket index h(op, l, r) and search the list of cells in this bucket for the given input node.
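A minimal sketch of the value-number method is shown below; a Python dict plays the role of the hash table keyed on the signature < op, l, r >, and the variable names are my own. It rebuilds the array of records for i = i + 10 from Fig (a)/(b), with leaves storing their lexical value in place of a left child.

records = []        # records[v-1] = (op, l, r); value numbers start at 1
table = {}          # signature -> value number

def value_number(op, l=0, r=0):
    sig = (op, l, r)
    if sig in table:               # node already exists: return its value number
        return table[sig]
    records.append(sig)            # otherwise create a new node
    table[sig] = len(records)
    return table[sig]

i   = value_number("id", "entry-i")     # value number 1
ten = value_number("num", 10)           # value number 2
s   = value_number("+", i, ten)         # value number 3
asg = value_number("=", i, s)           # value number 4
print(i, ten, s, asg, value_number("+", i, ten))   # 1 2 3 4 3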
  • 205. Compiler Design Lecture-34 25 Summary Variants of Syntax Trees • Introduction to Intermediate-Code Generation • Directed Acyclic Graphs (DAG’S) for Expressions • The Value-Number Method for Constructing DAG’s Reading: Aho2, Section 6.1: 6.1.1 & 6.1.2 Next Lecture: Three-Address Code
  • 206. Compiler Design Lecture-35 1 Compiler Design UNIT-III(2): INTERMEDIATE-CODE GENERATION Lecture-35 Three-Address Code Outline: • Addresses and Instructions • Quadruples • Triples • Static Single-Assignment Form
  • 207. Compiler Design Lecture-35 2 Three-Address Code In three-address code, there is at most one operator on the right side of an instruction; that is, no built-up arithmetic expressions are permitted. Thus a source-language expression like x + y * z might be translated into the sequence of three-address instructions: t1 = y * z t2 = x + t1 where t1 and t2 are compiler-generated temporary names. This unraveling of multi-operator arithmetic expressions and of nested flow-of-control statements makes three-address code desirable for target-code generation and optimization. The use of names for the intermediate values computed by a program allows three-address code to be rearranged easily.
  • 208. Compiler Design Lecture-35 3 Three-Address Code Example: Three-address code is a linearized representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph. A DAG and its corresponding three-address code are shown below. Fig: DAG for the expression a + a * (b - c) + (b - c) * d
  • 209. Compiler Design Lecture-35 4 Three-Address Code Example: Given expression: a + a * (b - c) + (b - c) * d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
Fig: Three-address code
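The translation above can be produced by a postorder walk of the DAG that allocates one temporary per interior node and reuses the temporary of a shared node. The sketch below is illustrative only: the tuple node representation and the names are assumptions, and sharing is detected by object identity, so the b - c node must be the same object at both of its uses.

temps = {}          # id(interior node) -> temporary name
code = []

def gen(node):
    if isinstance(node, str):          # a leaf: its address is the name itself
        return node
    if id(node) in temps:              # shared subexpression already emitted
        return temps[id(node)]
    op, left, right = node
    l_addr = gen(left)
    r_addr = gen(right)
    t = f"t{len(temps) + 1}"
    temps[id(node)] = t
    code.append(f"{t} = {l_addr} {op} {r_addr}")
    return t

bc = ("-", "b", "c")                                    # shared node for b - c
root = ("+", ("+", "a", ("*", "a", bc)), ("*", bc, "d"))
gen(root)
print("\n".join(code))                                  # matches the listing above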
  • 210. Compiler Design Lecture-35 5 Addresses and Instructions • Three-address code is built from two concepts: addresses and instructions. • In object-oriented terms, these concepts correspond to classes, and the various kinds of addresses and instructions correspond to appropriate subclasses. • Alternatively, three-address code can be implemented using records with fields for the addresses; such records are called quadruples and triples.
  • 211. Compiler Design Lecture-35 6 Addresses and Instructions An address can be one of the following: • A name. For convenience, we allow source-program names to appear as addresses in three-address code. In an implementation, a source name is replaced by a pointer to its symbol-table entry, where all information about the name is kept. • A constant. In practice, a compiler must deal with many different types of constants and variables. Type conversions within expressions are considered. • A compiler-generated temporary. It is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed. These temporaries can be combined, if possible, when registers are allocated to variables.
  • 212. Compiler Design Lecture-35 7 Addresses and Instructions List of the common three-address instruction forms: 1. Assignment instructions of the form x = y op z, where op is a binary arithmetic or logical operation, and x, y, and z are addresses. 2. Assignments of the form x = op y, where op is a unary operation. Essential unary operations include unary minus, logical negation, and conversion operators that, for example, convert an integer to a floating-point number. 3. Copy instructions of the form x = y, where x is assigned the value of y.
  • 213. Compiler Design Lecture-35 8 Addresses and Instructions 4. An unconditional jump goto L. The three-address instruction with label L is the next to be executed. 5. Conditional jumps of the form if x goto L and ifFalse x goto L. These instructions execute the instruction with label L next if x is true and false, respectively. Otherwise, the following three-address instruction in sequence is executed next, as usual. 6. Conditional jumps such as if x relop y goto L, which apply a relational operator (<, ==, >=, etc.) to x and y, and execute the instruction with label L next if x stands in relation relop to y. If not, the three-address instruction following if x relop y goto L is executed next, in sequence.
  • 214. Compiler Design Lecture-35 9 Addresses and Instructions 7. Procedure calls and returns are implemented using the following instructions: param x for parameters; call p, n and y = call p, n for procedure and function calls, respectively; and return y, where y, representing a returned value, is optional. Their typical use is as the sequence of three-address instructions param x1 param x2 ..... param xn call p, n generated as part of a call of the procedure p(x1, x2,..xn). The integer n, indicating the number of actual parameters in “call p, n,” is not redundant because calls can be nested. That is, some of the first param statements could be parameters of a call that comes after p returns its value; that value becomes another parameter of the later call.
  • 215. Compiler Design Lecture-35 10 Addresses and Instructions 8. Indexed copy instructions of the form x = y[i] and x[i]= y. The instruction x = y[i] sets x to the value in the location i memory units beyond location y. The instruction x[i]= y sets the contents of the location i units beyond x to the value of y. 9. Address and pointer assignments of the form x = &y, x = *y, and *x = y. The instruction x = &y sets the r- value of x to be the location (l-value) of y. l-value and r-value are appropriate on the left and right sides of assignments, respectively. In the instruction x = *y, y is a pointer or a temporary whose r-value is a location. The r-value of x is made equal to the contents of that location. Finally, *x = y sets the r-value of the object pointed to by x to the r-value of y.
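As a purely illustrative example combining several of these instruction forms (it is not taken from the text), the source fragment if (x < y) a[i] = *p; might be translated as follows, assuming 4-byte array elements so that the byte offset is i * 4:
        if x < y goto L1
        goto L2
L1:     t1 = *p
        t2 = i * 4
        a[t2] = t1
L2:     ...
Here form 6 provides the relational jump, form 4 the unconditional jump, form 9 the pointer dereference t1 = *p, form 1 the offset computation, and form 8 the indexed copy a[t2] = t1.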