INTERMEDIATE CODE GENERATION
INTRODUCTION
• In computing, code generation is the process by which a
compiler's code generator converts some intermediate
representation of source code into a form (e.g., machine code)
that can be readily executed by a machine.
• The front end of a compiler translates a source program into a
machine-independent intermediate code.
• The back end of the compiler then uses this intermediate code to
generate the target code (which can be understood by the
machine).
MACHINE INDEPENDENT INTERMEDIATE CODE
• Because the intermediate code is machine independent, the
compiler is more portable: only the back end has to change for a
new target machine.
• It is easier to improve the performance of the generated code
by applying optimizations to the intermediate code.
INTERMEDIATE REPRESENTATION
HIGH LEVEL IR
• High-level intermediate code representation is very close to
the source language itself.
• It is not preferred for target machine optimization.
LOW LEVEL IR
• Low-level intermediate code representation is close to the
target machine.
• It is good for machine-dependent optimizations.
COMMONLY USED INTERMEDIATE CODE REPRESENTATIONS
POSTFIX NOTATION
• The postfix notation for the expression “a + b” places the
operator at the right end as ab +.
• No parentheses are needed in postfix notation because the
position and arity (number of arguments) of the operators
permit only one way to decode a postfix expression.
• In postfix notation the operator follows its operands.
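The claim that a postfix expression can be decoded in only one way can be illustrated with a stack-based evaluator. The following is a minimal sketch, not part of the original slides: the function name eval_postfix, the whitespace-separated token format, and the sample variable values are assumptions made for illustration.

```python
def eval_postfix(tokens, env):
    """Evaluate a postfix token list with a stack; env maps names to values."""
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            # A binary operator always consumes the two most recent values,
            # so no parentheses are needed to decide what it applies to.
            right = stack.pop()
            left = stack.pop()
            stack.append(ops[tok](left, right))
        else:
            stack.append(env[tok])
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


# (a - b) * (c + d) + (a - b) written in postfix (hypothetical sample values)
postfix = "a b - c d + * a b - +".split()
result = eval_postfix(postfix, {"a": 5, "b": 3, "c": 2, "d": 4})
print(result)  # 14
```

Each operator finds its operands directly on top of the stack, which is why the arity of the operators is enough to decode the expression.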
THREE ADDRESS CODE
• A statement involving no more than three references (two for
operands and one for the result) is known as a three address
statement.
• A three address statement has the form x = y op z, where x, y,
and z each have an address (memory location).
• It can be represented in 3 forms: quadruples, triples, and
indirect triples.
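As a rough sketch of how three address statements arise, the code below walks a nested expression and emits one statement per operator, introducing a temporary for each intermediate result. The function name gen_tac, the tuple encoding (op, left, right), and the temporary names r1, r2 are assumptions chosen for illustration.

```python
import itertools


def gen_tac(target, expr):
    """Return three address statements for `target = expr`.

    expr is either a variable name (str) or a tuple (op, left, right).
    """
    counter = itertools.count(1)
    code = []

    def walk(node):
        if isinstance(node, str):
            return node  # a plain operand needs no statement
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"r{next(counter)}"  # fresh temporary for this sub-expression
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    value = walk(expr)
    code.append(f"{target} = {value}")
    return code


# a = b + c * d
tac = gen_tac("a", ("+", "b", ("*", "c", "d")))
for line in tac:
    print(line)
# r1 = c * d
# r2 = b + r1
# a = r2
```

Every emitted statement has at most one operator and at most three addresses, which is the defining property of the form.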
QUADRUPLES
• Each instruction in the quadruples representation is divided into
four fields: operator, arg1, arg2, and result.

      Operator   Argument 1   Argument 2   Result
(1)   *          c            d            r1
(2)   +          b            r1           r2
(3)   =          r2                        a

Example: quadruples for a = b + c * d
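One possible in-memory form of the quadruple table is a list of four-field records. This is only a sketch: the class name Quad and the helper quad_to_text are hypothetical, and the empty second argument of the assignment is modelled as None.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]  # None when the operator takes a single argument
    result: str


# Quadruples for a = b + c * d
quads = [
    Quad("*", "c", "d", "r1"),
    Quad("+", "b", "r1", "r2"),
    Quad("=", "r2", None, "a"),
]


def quad_to_text(q):
    """Render one quadruple as a three address statement."""
    if q.op == "=":
        return f"{q.result} = {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"


lines = [quad_to_text(q) for q in quads]
print(lines)  # ['r1 = c * d', 'r2 = b + r1', 'a = r2']
```

Because every result is named explicitly, quadruples can be reordered without touching the other entries, at the cost of creating many temporaries.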
TRIPLES
• Each instruction in the triples representation has three fields:
op, arg1, and arg2.
• The result of each sub-expression is denoted by the position of
the triple that computes it.

      Operator   Argument 1   Argument 2
(1)   *          c            d
(2)   +          b            (1)
(3)   =          a            (2)

Example: triples for a = b + c * d
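The idea of referring to a result by position can be sketched as follows. The encoding is an assumption made for illustration: integer arguments refer to the 1-based position of an earlier triple, string arguments are variable names, and the assignment triple is assumed to name its target a as arg1. The function name run_triples and the sample values are hypothetical.

```python
def run_triples(triples, env):
    """Execute triples in order; returns a copy of env with assignments applied."""
    env = dict(env)
    values = {}  # position of a triple -> value it computed

    def val(arg):
        # An int is a reference to an earlier triple, a str is a variable.
        return values[arg] if isinstance(arg, int) else env[arg]

    for pos, (op, arg1, arg2) in enumerate(triples, start=1):
        if op == "=":
            env[arg1] = val(arg2)
        elif op == "+":
            values[pos] = val(arg1) + val(arg2)
        elif op == "*":
            values[pos] = val(arg1) * val(arg2)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return env


# Triples for a = b + c * d
triples = [
    ("*", "c", "d"),  # (1)
    ("+", "b", 1),    # (2) second argument is the result of triple (1)
    ("=", "a", 2),    # (3) assign the result of triple (2) to a
]
result = run_triples(triples, {"b": 2, "c": 3, "d": 4})
print(result["a"])  # 14
```

No temporaries appear, but moving a triple changes its position and therefore invalidates every reference to it.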
INDIRECT TRIPLES
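• This representation is an enhancement over the triples
representation. Instead of relying on position alone, it keeps a
separate list of pointers to the triples, and that list gives
the order of execution.
• This enables an optimizer to freely re-position instructions by
rearranging the pointer list, without renumbering the references
inside the triples.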
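A small sketch can show why the extra level of indirection helps. It uses a slightly larger expression than the running example, a = c * d + (b + e), because reordering needs two independent sub-expressions. The function name run_indirect, the 0-based indices, and the sample values are assumptions for illustration only.

```python
def run_indirect(order, triples, env):
    """Execute the triples in the sequence given by the pointer list `order`."""
    env = dict(env)
    values = {}  # index of a triple -> value it computed

    def val(arg):
        # An int refers to a triple by its index in the table, a str is a variable.
        return values[arg] if isinstance(arg, int) else env[arg]

    for idx in order:
        op, arg1, arg2 = triples[idx]
        if op == "=":
            env[arg1] = val(arg2)
        elif op == "+":
            values[idx] = val(arg1) + val(arg2)
        elif op == "*":
            values[idx] = val(arg1) * val(arg2)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return env


# Triple table for a = c * d + (b + e); it is never modified below.
triples = [
    ("*", "c", "d"),  # 0
    ("+", "b", "e"),  # 1
    ("+", 0, 1),      # 2
    ("=", "a", 2),    # 3
]
env = {"b": 2, "c": 3, "d": 4, "e": 5}

original = run_indirect([0, 1, 2, 3], triples, env)
reordered = run_indirect([1, 0, 2, 3], triples, env)  # only the list changed
print(original["a"], reordered["a"])  # 19 19
```

With plain triples, swapping the first two instructions would also require rewriting the references in the third one; here only the pointer list changes.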
SYNTAX TREE
• A syntax tree is nothing more than a condensed form of a parse
tree.
• The operator and keyword nodes of the parse tree are moved
to their parents, and a chain of single productions is replaced
by a single link in the syntax tree.
• The internal nodes are operators and the leaf nodes are
operands.
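The structure described above can be sketched with a small node class. This is an illustrative assumption, not the slides' own code: the names Node and postorder are hypothetical. A post-order walk of the tree also yields the postfix form of the expression, which ties the two representations together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                      # operator for internal nodes, name for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def postorder(node):
    """Visit children first, then the node itself."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.label]


# Syntax tree for a = b + c * d
tree = Node("=",
            Node("a"),
            Node("+",
                 Node("b"),
                 Node("*", Node("c"), Node("d"))))

postfix = " ".join(postorder(tree))
print(postfix)  # a b c d * + =
```

The multiplication sits deeper in the tree than the addition, so operator precedence is encoded by the tree shape rather than by parentheses.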
THANK YOU

Editor's Notes

  • #3 In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine. In the analysis-synthesis model of a compiler, the front end translates a source program into a machine-independent intermediate code, and the back end then uses this intermediate code to generate the target code (which can be understood by the machine).
  • #4 If a compiler translates the source language directly to its target machine language, without generating intermediate code, then a full native compiler is required for each new machine. Intermediate code eliminates the need for a new full compiler for every unique machine by keeping the analysis portion the same for all the compilers. The second part of the compiler, synthesis, is changed according to the target machine. It also becomes easier to improve code performance by applying code optimization techniques to the intermediate code.
  • #5 If we generate machine code directly from source code, then for n target machines we will have n optimisers and n code generators, but if we have a machine-independent intermediate code, we will have only one optimiser. Intermediate code can be either language specific (e.g., bytecode for Java) or language independent (three-address code).
  • #6 High Level IR - High-level intermediate code representation is very close to the source language itself. It can be easily generated from the source code, and we can easily apply code modifications to enhance performance. But for target machine optimization, it is less preferred. Low Level IR - This one is close to the target machine, which makes it suitable for register and memory allocation, instruction set selection, etc. It is good for machine-dependent optimizations. Intermediate code can be either language specific (e.g., bytecode for Java) or language independent (three-address code).
  • #8 The ordinary (infix) way of writing the sum of a and b is with the operator in the middle: a + b. The postfix notation for the same expression places the operator at the right end as ab +. In general, if e1 and e2 are any postfix expressions, and + is any binary operator, the result of applying + to the values denoted by e1 and e2 is denoted in postfix notation by e1 e2 +. No parentheses are needed in postfix notation because the position and arity (number of arguments) of the operators permit only one way to decode a postfix expression. In postfix notation the operator follows its operands. Example: the postfix representation of the expression (a - b) * (c + d) + (a - b) is: ab - cd + * ab - +.
  • #9 A statement involving no more than three references (two for operands and one for the result) is known as a three address statement. A sequence of three address statements is known as three address code. A three address statement has the form x = y op z, where x, y, and z each have an address (memory location). Sometimes a statement might contain fewer than three references, but it is still called a three address statement. Example: the three address code for the expression a = b + c * d is r1 = c * d; r2 = b + r1; a = r2, where r1 and r2 are temporary variables. It can be represented in 3 forms: quadruples, triples, and indirect triples.
  • #13 A syntax tree is nothing more than a condensed form of a parse tree. The operator and keyword nodes of the parse tree are moved to their parents, and a chain of single productions is replaced by a single link in the syntax tree. The internal nodes are operators and the leaf nodes are operands. To form a syntax tree, put parentheses in the expression; this way it is easy to recognize which operand should come first.