Intermediate Code Representations
 Front end - produces an intermediate representation (IR)
 Middle end - transforms the IR into an equivalent IR that runs more efficiently
 Back end - transforms the IR into native code
 IR encodes the compiler’s knowledge of the program
 Middle end usually consists of several passes
[Figure: Source Code -> Front End -> IR -> Middle End -> IR -> Back End -> Target Code]
Intermediate Code Representations
 Decisions in IR design affect the speed and efficiency of the compiler
 Some important IR properties
     Ease of generation
     Ease of manipulation
     Procedure size
     Freedom of expression
     Level of abstraction
 The importance of different properties varies between compilers
Types of Intermediate Representations
Three major categories:
 Structural
     Graphically oriented
     Heavily used in source-to-source translators
     Tend to be large
     Examples: trees, DAGs
 Linear
     Pseudo-code for an abstract machine
     Level of abstraction varies
     Simple, compact data structures
     Easier to rearrange
     Examples: three-address code, stack machine code
 Hybrid
     Combination of graphs and linear code
     Example: control-flow graph
Structural
DAGs and Syntax Trees
 It is sometimes beneficial to create a DAG instead of a tree for expressions.
 A DAG makes common subexpressions explicit, so the code generator can compute each of them only once.
 Example: a+a*(b-c)+(b-c)*d, where a and b-c each occur twice and are shared in the DAG.
SDD for creating DAGs
Production         Semantic Rules
1) E -> E1 + T     E.node = new Node('+', E1.node, T.node)
2) E -> E1 - T     E.node = new Node('-', E1.node, T.node)
3) E -> T          E.node = T.node
4) T -> (E)        T.node = E.node
5) T -> id         T.node = new Leaf(id, id.entry)
6) T -> num        T.node = new Leaf(num, num.val)
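The rules above yield a DAG rather than a tree only if Node and Leaf return an already existing node whenever an identical one has been created before. A minimal sketch of that reuse, assuming nodes are identified by their signature; the names `DagBuilder`, `leaf`, `node` and `build_example` are hypothetical, and `*` is handled the same way as `+` and `-` even though the grammar above has no production for it:

```python
class DagBuilder:
    """Creates expression nodes, reusing structurally identical ones."""

    def __init__(self):
        self.nodes = []   # node id -> signature
        self.index = {}   # signature -> node id

    def _intern(self, signature):
        # Return the existing node for this signature, or create a new one.
        if signature not in self.index:
            self.index[signature] = len(self.nodes)
            self.nodes.append(signature)
        return self.index[signature]

    def leaf(self, kind, value):
        # Plays the role of "new Leaf(id, id.entry)" / "new Leaf(num, num.val)".
        return self._intern((kind, value))

    def node(self, op, left, right):
        # Plays the role of "new Node(op, E1.node, T.node)".
        return self._intern((op, left, right))


def build_example():
    """Apply the semantic rules bottom-up for a + a*(b-c) + (b-c)*d."""
    dag = DagBuilder()
    a = dag.leaf("id", "a")
    first = dag.node("-", dag.leaf("id", "b"), dag.leaf("id", "c"))
    left = dag.node("+", a, dag.node("*", dag.leaf("id", "a"), first))
    again = dag.node("-", dag.leaf("id", "b"), dag.leaf("id", "c"))
    root = dag.node("+", left, dag.node("*", again, dag.leaf("id", "d")))
    return dag, first, again, root
```

In this sketch the second occurrence of b-c maps to the same node id as the first, so the DAG ends up with 9 nodes where the corresponding syntax tree would have 13.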
Linear
Three-address code
 In three-address code there is at most one operator on the right side of an instruction
 Example, for a+a*(b-c)+(b-c)*d:
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
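The sequence above can be produced by a post-order walk of the expression that gives each distinct subexpression one temporary. A minimal sketch, assuming expressions are nested tuples of the form (op, left, right) with variable names as strings; the function name `gen_three_address` and this tuple encoding are illustrative choices, not part of the slides:

```python
def gen_three_address(expr):
    """Return three-address instructions for a nested-tuple expression."""
    code = []
    names = {}  # subexpression -> temporary that already holds its value

    def visit(e):
        if isinstance(e, str):      # a variable is its own address
            return e
        if e in names:              # common subexpression: reuse the temporary
            return names[e]
        op, left, right = e
        l = visit(left)
        r = visit(right)
        temp = f"t{len(code) + 1}"  # one new temporary per emitted instruction
        code.append(f"{temp} = {l} {op} {r}")
        names[e] = temp
        return temp

    visit(expr)
    return code


# a + a*(b-c) + (b-c)*d, with + associating to the left
B_MINUS_C = ("-", "b", "c")
EXPR = ("+", ("+", "a", ("*", "a", B_MINUS_C)), ("*", B_MINUS_C, "d"))

for line in gen_three_address(EXPR):
    print(line)
```

Because b-c is looked up in `names` on its second visit, t1 is computed once and reused in t4, matching the five instructions listed above.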
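Forms of three-address instructions
 x = y op z (binary operation)
 x = op y (unary operation)
 x = y (copy)
 goto L (unconditional jump)
 if x goto L and ifFalse x goto L (conditional jumps)
 if x relop y goto L (conditional jump on a relational operator)
 Procedure calls using:
     param x
     call p,n
     y = call p,n
 x = y[i] and x[i] = y (indexed copies)
 x = &y and x = *y and *x = y (address and pointer assignments)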
Example
 do i = i+1; while (a[i] < v);
The index is multiplied by 8 on the assumption that each element of a occupies 8 units of storage.

Symbolic labels:
L:   t1 = i + 1
     i = t1
     t2 = i * 8
     t3 = a[t2]
     if t3 < v goto L

Position numbers:
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
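To see how the position-numbered form executes, the loop can be run by a tiny interpreter. This is only a sketch under stated assumptions: the tuple encoding of instructions, the names `run`, `OPS` and `PROGRAM`, and the element width of 8 are all choices made for illustration, and only the instruction forms used in this example are supported:

```python
import operator

OPS = {"+": operator.add, "*": operator.mul, "<": operator.lt}


def run(program, env, arrays, width=8):
    """Execute instructions keyed by position number until control falls off the end."""

    def val(operand):
        # Integer operands are constants; strings name variables in env.
        return operand if isinstance(operand, int) else env[operand]

    pc = min(program)
    while pc in program:
        instr = program[pc]
        pc += 1
        kind = instr[0]
        if kind == "binop":        # x = y op z
            _, x, y, op, z = instr
            env[x] = OPS[op](val(y), val(z))
        elif kind == "copy":       # x = y
            _, x, y = instr
            env[x] = val(y)
        elif kind == "load":       # x = y[i], where i is an offset in storage units
            _, x, y, i = instr
            env[x] = arrays[y][val(i) // width]
        elif kind == "if_relop":   # if x relop y goto L
            _, x, op, y, target = instr
            if OPS[op](val(x), val(y)):
                pc = target
        else:
            raise ValueError(f"unknown instruction: {instr!r}")
    return env


PROGRAM = {
    100: ("binop", "t1", "i", "+", 1),
    101: ("copy", "i", "t1"),
    102: ("binop", "t2", "i", "*", 8),
    103: ("load", "t3", "a", "t2"),
    104: ("if_relop", "t3", "<", "v", 100),
}

result = run(PROGRAM, {"i": -1, "v": 5}, {"a": [1, 3, 5, 7]})
print(result["i"])
```

With these inputs the loop stops at the first element that is not less than v, leaving i at 2.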
Hybrid
Control Flow
Boolean expressions are often used to:
 Alter the flow of control.
 Compute logical values.
Short-Circuit Code
Flow-of-Control Statements
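In short-circuit code the Boolean operators do not appear in the generated instructions; they are translated into jumps, so the second operand of || or && is evaluated only when it is needed. A minimal sketch for an if statement, assuming conditions are encoded as nested tuples ("rel", "or", "and", "not"); the functions `gen_jumps` and `translate_if`, the tuple encoding and the label numbering are hypothetical, and no redundant jumps are removed:

```python
import itertools


def gen_jumps(expr, true_label, false_label, code, fresh):
    """Emit jumps to true_label or false_label depending on expr's value."""
    kind = expr[0]
    if kind == "rel":
        _, left, op, right = expr
        code.append(f"if {left} {op} {right} goto {true_label}")
        code.append(f"goto {false_label}")
    elif kind == "or":
        _, first, second = expr
        next_test = fresh()   # reached only when the first operand is false
        gen_jumps(first, true_label, next_test, code, fresh)
        code.append(f"{next_test}:")
        gen_jumps(second, true_label, false_label, code, fresh)
    elif kind == "and":
        _, first, second = expr
        next_test = fresh()   # reached only when the first operand is true
        gen_jumps(first, next_test, false_label, code, fresh)
        code.append(f"{next_test}:")
        gen_jumps(second, true_label, false_label, code, fresh)
    elif kind == "not":
        gen_jumps(expr[1], false_label, true_label, code, fresh)
    else:
        raise ValueError(f"unknown expression: {expr!r}")


def translate_if(cond, body):
    """Translate 'if (cond) body' into three-address code with jumps."""
    labels = (f"L{n}" for n in itertools.count(1))
    fresh = lambda: next(labels)
    after = fresh()   # label following the whole statement
    then = fresh()    # label of the statement body
    code = []
    gen_jumps(cond, then, after, code, fresh)
    code.append(f"{then}: {body}")
    code.append(f"{after}:")
    return code


# if (x < 100 || x > 200 && x != y) x = 0;
COND = ("or", ("rel", "x", "<", "100"),
        ("and", ("rel", "x", ">", "200"), ("rel", "x", "!=", "y")))

for line in translate_if(COND, "x = 0"):
    print(line)
```

The output contains some jumps to the immediately following instruction (for example "goto L3" directly before "L3:"); a later pass or a fall-through-aware translation scheme could remove them.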
