Principles of Compiler Design
1. SRI RAMAKRISHNA INSTITUTE OF TECHNOLOGY,
COIMBATORE-10
(An Autonomous Institution)
Department of Information Technology
Academic Year 2021-22 III Year 5th Semester
UITC013 - PRINCIPLES OF COMPILER DESIGN
M. MARIMUTHU, AP/IT
2. What are Compilers?
• A compiler is a program that translates a
program written in one language (the source
language) into an equivalent program in another
language (the target language).
3. Compilers
• Translates from one representation of the
program to another
• Typically from high level source code to low
level machine code or object code.
• Source code is normally optimized for human
readability.
– Expressive: matches our notion of languages
– Redundant to help avoid programming errors
4. Compilers…(2)
• Machine code is optimized for hardware.
– Redundancy is reduced
– Information about the intent is lost.
7. Phases of a Compiler
• A compiler operates in phases, each of which
transforms the source program from one
representation into another representation.
• Each phase communicates with the error handler.
• Each phase communicates with the symbol table.
8. Symbol Table Management
• An essential function of a compiler is to record
the identifiers used in the source program and
collect information about various attributes of
each identifier.
• These attributes may provide information about
the storage allocated for an identifier, its type, its
scope and, in the case of procedure names, such
things as the number and types of its arguments,
the method of passing each argument and the
type returned.
9. Symbol Table Management…
• A symbol table is a data structure containing a
record for each identifier, with fields for the
attributes of the identifier.
• The data structure allows us to find the record
for each identifier quickly and to store or
retrieve data from that record quickly.
• When an identifier in the source program is
detected by the lexical analyzer, the identifier
is entered into the symbol table.
10. Symbol Table Management…
• However, the attributes of an identifier cannot
normally be determined during lexical analysis.
For example, in a Pascal declaration like
var a, b, c: real;
the type “real” is not known when a, b and c are
seen by the lexical analyzer.
• The remaining phases enter information about
identifiers into the symbol table and use this
information in various ways.
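The division of work described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names SymbolRecord, SymbolTable, enter, lookup and set_type are hypothetical and do not come from the slides. A dictionary stands in for the data structure because it gives fast lookup by identifier name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SymbolRecord:
    """One record per identifier (hypothetical layout)."""
    name: str
    type: Optional[str] = None  # unknown until a later phase fills it in
    scope: str = "global"


class SymbolTable:
    def __init__(self) -> None:
        # A dict gives fast lookup of a record by identifier name.
        self._records: Dict[str, SymbolRecord] = {}

    def enter(self, name: str) -> SymbolRecord:
        """Used by the lexical analyzer: create the record if it is new."""
        return self._records.setdefault(name, SymbolRecord(name))

    def lookup(self, name: str) -> Optional[SymbolRecord]:
        return self._records.get(name)

    def set_type(self, name: str, type_name: str) -> None:
        """Used by a later phase once the declaration has been analyzed."""
        self._records[name].type = type_name


# Lexical analysis of "var a, b, c: real;" enters the names only.
table = SymbolTable()
for ident in ("a", "b", "c"):
    table.enter(ident)
types_after_lexing = [table.lookup(n).type for n in ("a", "b", "c")]

# A later phase sees ": real" and records the type.
for ident in ("a", "b", "c"):
    table.set_type(ident, "real")
types_after_analysis = [table.lookup(n).type for n in ("a", "b", "c")]
```

After lexical analysis every type field is still empty; only after the later phase runs do the records carry the type "real".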
11. Error Detection and Reporting
• Each phase can encounter errors. However,
after detecting an error, a phase must deal
with that error so that compilation can
proceed, allowing further errors in the source
program to be detected.
• The lexical phase can detect errors where the
characters remaining in the input do not form
any token of the language.
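One simple way for the lexical phase to deal with such an error is to report it, skip the offending character and keep scanning, so that later errors are still found. The sketch below assumes a tiny token set (numbers, identifiers, ":=" and a few operators); the function name scan and the message format are illustrative choices, not part of the course material.

```python
import re

# Assumed token set for the illustration: numbers, identifiers,
# the assignment symbol and a few single-character operators.
TOKEN_RE = re.compile(r"(\d+)|([A-Za-z_]\w*)|(:=|[-+*/();])")


def scan(source):
    """Return (tokens, errors); scanning continues after each error."""
    tokens, errors = [], []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # No token of the language starts here: report and recover.
            errors.append(
                f"position {pos}: unexpected character {source[pos]!r}"
            )
            pos += 1  # skip the offending character and keep scanning
            continue
        tokens.append(match.group(0))
        pos = match.end()
    return tokens, errors


tokens, errors = scan("a := b @ c $ 20")
```

Both bad characters are reported in a single run, and the valid tokens around them are still delivered to the next phase.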
12. Error Detection and Reporting
• Errors where the token stream violates the
structure rules of the language are
detected by the syntax analysis phase.
• During semantic analysis the compiler tries to
detect constructs that have the right syntactic
structure but no meaning to the operation
involved.
13. Error Detection and Reporting
• All phases may encounter errors, such as:
• Lexical Phase – Unable to proceed because
the next token in the source program is
misspelled.
• Syntax Phase – Structure of the statement
violates the rules of the programming language.
• Semantic Phase – No meaning in the operation
involved.
14. Error Detection and Reporting
• Intermediate Code Generation – Operands have
incompatible data types.
• Code Optimizer – Certain statements may
never be reached.
• Code Generation – Constant is too large.
• Symbol Table – Multiply declared variables.
15. Lexical Analysis
• The lexical analysis phase reads the characters
in the source program and groups them into a
stream of tokens.
• Each token represents a logically cohesive
sequence of characters, such as an identifier, a
keyword, a punctuation character or an
operator.
• The character sequence forming a token is
called the lexeme for the token.
16. Lexical Analysis
• Certain tokens are augmented by a “lexical
value”. The lexical analyzer not only generates
a token, but also enters the lexeme into the
symbol table.
• E.g., consider the statement
a := b + c * 20
The representation of the above statement
after lexical analysis is
id1 := id2 + id3 * 20
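A minimal sketch of such a lexical analyzer in Python follows. The token names (NUM, ID, ASSIGN, OP) and the functions tokenize and render are assumptions made for the illustration. A plain list plays the role of the symbol table, so the n-th distinct identifier is printed as id followed by n.

```python
import re

# Assumed token classes for this illustration.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def tokenize(source):
    """Group characters into tokens and enter identifiers in a table.

    Note: finditer silently skips characters that match no pattern;
    a real lexer would report them as errors.
    """
    symbols = []  # symbols[i] is the lexeme of identifier number i + 1
    tokens = []
    for match in MASTER_RE.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ID":
            if lexeme not in symbols:
                symbols.append(lexeme)
            tokens.append(("ID", symbols.index(lexeme) + 1))
        elif kind == "NUM":
            tokens.append(("NUM", int(lexeme)))  # the "lexical value"
        else:
            tokens.append((kind, lexeme))
    return tokens, symbols


def render(tokens):
    """Print identifiers as id1, id2, ... and other tokens as written."""
    return " ".join(f"id{v}" if k == "ID" else str(v) for k, v in tokens)


tokens, symbols = tokenize("a:= b+c*20")
print(render(tokens))  # id1 := id2 + id3 * 20
```

The symbol list ends up as ["a", "b", "c"], which is why a, b and c are rendered as id1, id2 and id3.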
17. Syntax Analysis
• This phase receives the tokens generated by the previous
phase (Lexical Analysis) as input and produces a
hierarchical structure called a syntax tree or parse tree as
output.
• It checks whether the statements conform to the syntax
of the programming language.
• A parse tree represents the syntactic structure of the
input. A syntax tree is a compressed representation of
the parse tree in which the operators appear as the
interior nodes and the operands as their children.
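To make the syntax tree concrete, here is a small recursive-descent parser sketch in Python. It assumes a toy grammar with only ":=", "+" and "*" (with "*" binding tighter than "+"), takes tokens as plain strings, and represents each interior node as a tuple (operator, left, right). None of these choices come from the slides; they are one simple way to illustrate the idea.

```python
def parse_assignment(tokens):
    """Parse  target := expr  and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            wanted = expected or "an operand"
            raise SyntaxError(f"expected {wanted}, found {tok!r}")
        pos += 1
        return tok

    def factor():
        tok = expect()
        if tok in ("+", "*", ":="):
            raise SyntaxError(f"expected an operand, found {tok!r}")
        return tok  # operands are the leaves of the tree

    def term():
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())  # operator becomes interior node
        return node

    def expr():
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = factor()
    expect(":=")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment(["id1", ":=", "id2", "+", "id3", "*", "20"])
# tree == (":=", "id1", ("+", "id2", ("*", "id3", "20")))
```

A token stream that violates the grammar, such as ["id1", ":=", "+"], raises SyntaxError instead of producing a tree.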
18. Semantic Analysis
• This phase checks the syntax tree of the
source program for semantic errors and
gathers type information for the
subsequent phases.
• Type checking: for each operator, the
operands are checked to determine whether they
are acceptable for that operator.
• For example, in a strictly typed language such
as Pascal, a character value cannot be
used directly in an arithmetic expression.
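Type checking can be sketched as a recursive walk over the syntax tree. The Python example below assumes trees are tuples of the form (operator, left, right), integer literals are Python ints, and identifier types are supplied in a dictionary. The function name check and the type names "integer", "real" and "char" are illustrative assumptions.

```python
ARITHMETIC = {"+", "-", "*"}
NUMERIC = {"integer", "real"}


def check(node, symbol_types):
    """Return the type of an expression tree or raise TypeError."""
    if isinstance(node, int):
        return "integer"
    if isinstance(node, str):
        return symbol_types[node]  # type recorded for the identifier
    op, left, right = node
    left_type = check(left, symbol_types)
    right_type = check(right, symbol_types)
    if op not in ARITHMETIC:
        raise TypeError(f"unknown operator {op!r}")
    for operand_type in (left_type, right_type):
        if operand_type not in NUMERIC:
            raise TypeError(
                f"operand of type {operand_type} is not acceptable "
                f"for '{op}'"
            )
    # Mixing integer and real yields real in this sketch.
    return "real" if "real" in (left_type, right_type) else "integer"


tree = ("+", "b", ("*", "c", 20))
result_type = check(tree, {"b": "real", "c": "real"})
```

With "c" declared as "char" instead, the same call raises TypeError, which is the kind of semantic error this phase reports.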
19. Intermediate Code Generation
• This phase converts the output of semantic
analyzer into an intermediate representation.
• The intermediate representation should have two
important properties:
– It should be easy to produce
– It should be easy to translate into the target program
• Some of the intermediate forms are
– Syntax Trees
– Postfix Notation
– Three address code – each instruction has at most three operands
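As an illustration of three address code, the Python sketch below walks a syntax tree bottom-up and emits one instruction per operator, storing each intermediate result in a fresh temporary. The tuple tree shape, the temporary names t1, t2, ... and the function name to_three_address are assumptions for the example.

```python
from itertools import count


def to_three_address(tree):
    """Translate (":=", target, expr) into three-address instructions."""
    code = []
    temps = count(1)  # source of fresh temporary numbers

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)  # a leaf: identifier or constant
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(temps)}"
        # One operator and at most three operands per instruction.
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    op, target, value = tree
    if op != ":=":
        raise ValueError("expected an assignment at the root")
    code.append(f"{target} := {emit(value)}")
    return code


code = to_three_address((":=", "id1", ("+", "id2", ("*", "id3", 20))))
# code == ["t1 := id3 * 20", "t2 := id2 + t1", "id1 := t2"]
```

The multiplication is emitted first because the walk visits operands before their operator, so the instruction order already respects operator precedence.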
20. Code Optimization
• The object program should be small and run
fast. Hence this phase transforms the
intermediate representation of the source
program into more efficient code by
eliminating unnecessary statements and
operations.
21. Code Optimization - Examples
• Detection and removal of dead (unreachable)
code.
• Local optimization: elimination of common
subexpressions.
• Loop optimization: finding loop-invariant
computations and moving them out of the loop.
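Elimination of common subexpressions within one basic block can be sketched as follows. The Python example assumes three-address instructions are given as tuples (dest, op, arg1, arg2) and invents a "copy" operation for the rewritten instruction; both are illustrative assumptions, not a fixed format.

```python
def eliminate_common_subexpressions(block):
    """Local CSE over one basic block of (dest, op, arg1, arg2) tuples."""
    available = {}  # (op, arg1, arg2) -> name currently holding the value
    result = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if key in available:
            # Same expression already computed: reuse its value.
            result.append((dest, "copy", available[key], None))
        else:
            result.append((dest, op, a, b))
        # dest has a new value, so forget every expression that reads
        # dest or whose value was stored in dest.
        available = {
            k: v
            for k, v in available.items()
            if dest not in (k[1], k[2]) and v != dest
        }
        if dest not in (a, b) and key not in available:
            available[key] = dest
    return result


block = [
    ("t1", "*", "b", "c"),
    ("t2", "+", "a", "t1"),
    ("t3", "*", "b", "c"),  # b * c is computed a second time
    ("t4", "+", "t3", "d"),
]
optimized = eliminate_common_subexpressions(block)
```

In the result, the third instruction becomes ("t3", "copy", "t1", None). If b or c had been reassigned between the two multiplications, the expression would no longer be available and the instruction would be left unchanged.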
22. Code Generation
• This phase converts the optimized
intermediate code into target code, which can
be either assembly code or machine code.
• This phase also allocates memory locations for
the variables used in the program (allocation
of registers and memory).
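A very naive code generator can be sketched in Python as below. The target is a hypothetical machine with a single register R1 and made-up LOAD, ADD, SUB, MUL and STORE instructions; it is not the instruction set of any real processor. Every variable and temporary is assumed to live in memory, so no real register allocation takes place.

```python
# Hypothetical opcodes for an imaginary one-register machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def operand(value):
    """Render integer constants as immediates (#n), names as addresses."""
    return f"#{value}" if isinstance(value, int) else value


def generate(block):
    """Translate (dest, op, arg1, arg2) tuples into assembly lines."""
    asm = []
    for dest, op, a, b in block:
        asm.append(f"LOAD  R1, {operand(a)}")             # R1 <- arg1
        asm.append(f"{OPCODES[op]:<5} R1, {operand(b)}")  # R1 <- R1 op arg2
        asm.append(f"STORE {dest}, R1")                   # dest <- R1
    return asm


asm = generate([
    ("t1", "*", "id3", 20),
    ("id1", "+", "id2", "t1"),
])
for line in asm:
    print(line)
```

A real code generator would keep t1 in a register rather than storing it and loading it again, which is where register allocation comes in.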