COMPILER 
Achyuth Dorasala 
30-Sep-2014
INDEX 
TOPIC 
1.COMPILER-DEFINITION 
2.COMPILER-PHASES 
A.ANALYSIS PHASE 
i.LEXICAL ANALYSIS 
ii.SYNTAX ANALYSIS 
iii.SEMANTIC ANALYSIS 
B.SYNTHESIS PHASE 
i.INTERMEDIATE CODE GENERATOR 
ii.CODE OPTIMIZER 
iii.CODE GENERATOR 
3.STRUCTURE OF A COMPILER 
i.FRONT END 
ii.MIDDLE END 
iii.BACK END 
4.COMPILER OUTPUT 
5.NATIVE COMPILER 
6.CROSS COMPILER 
7.VIRTUAL MACHINE 
8.HARDWARE COMPILATION 
9.COMPILER CORRECTNESS 
10.RELATED TECHNIQUES 
11.CROSS COMPILER 
COMPILER: 
A compiler translates a program written in a high-level language into machine-level language. 
COMPILER HAS TWO PHASES: 
A.ANALYSIS PHASE 
B.SYNTHESIS PHASE 
A.ANALYSIS PHASE: 
It divides the source program into its constituent pieces and creates an intermediate 
representation. 
An intermediate representation is any data structure that can represent the program 
without loss of information, so that its execution can be conducted accurately. It serves as the 
common interface among the compiler components. 
The analysis phase can be divided into three parts: 
1.Lexical Analysis 
2.Syntax Analysis 
3.Semantic Analysis 
1.LEXICAL ANALYSIS: 
The program is treated as a single sequence of characters. The lexical analyzer reads the 
program from left to right and groups the sequence of characters into tokens. 
TOKENS 
A token is the smallest individual element in a program (identifiers, keywords, 
constants, operators). 
The lexical analyzer is the first phase of the compiler. Its input is the source 
code. It identifies the different tokens in the source program. It is also called “linear 
phase”, “linear analysis” or “scanning”. Its output is sent to the syntax analyzer. 
For instance, given the statement 
add = a + b ; 
the lexical analyzer reports: 
identifiers: add, a, b 
operators: =, + 
separators: ; 
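The scanning step can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular compiler implements its scanner: the category names follow the example above, while the regular expressions and the function name tokenize are assumptions made for this sketch. Keywords are not handled.

```python
import re

# Token categories follow the slide's example; the patterns are assumptions.
TOKEN_SPEC = [
    ("CONSTANT",   r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[=+\-*/]"),
    ("SEPARATOR",  r";"),
    ("SKIP",       r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan source left to right and return a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for kind, text in tokenize("add = a + b ;"):
    print(kind, text)
```

The list of (kind, text) pairs is what would be handed to the syntax analyzer.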
2.SYNTAX ANALYSIS: 
It is the second phase of the compiler. It is also called parsing. It groups the 
tokens of the source program into grammatical productions; in short, it produces a parse tree. 
It checks whether each operator sits between two operands or not, 
for instance, sum = a + b ; 
Here the operator is ‘+’. The parser checks only the structure around the operator, not the 
types of the operands: if one operand is an integer and the other is a string, it will not 
report any error. It is also called “Hierarchical Analysis” because it generates a parse tree 
representation of the tokens. 
ASSIGNMENT STATEMENT 
            = 
          /   \ 
IDENTIFIER     EXPRESSION 
   sum             + 
                 /   \ 
        OPERAND       OPERAND 
          ‘a’           ‘b’ 
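A parse tree like this one can be built by a small hand-written parser. The sketch below is only an illustration under stated assumptions: tokens arrive as (kind, text) tuples, the grammar covers just an assignment of operands joined by ‘+’, and the nested-tuple tree shape and the name parse_assignment are hypothetical choices for this example.

```python
def parse_assignment(tokens):
    """Parse IDENTIFIER '=' operand ('+' operand)* ';' into a nested tuple tree."""
    pos = 0

    def expect(kind, text=None):
        # Consume the next token if it has the required kind (and text, if given).
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok_kind, tok_text = tokens[pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"unexpected token {tok_text!r}")
        pos += 1
        return tok_text

    def operand():
        if pos < len(tokens) and tokens[pos][0] in ("IDENTIFIER", "CONSTANT"):
            return ("operand", expect(tokens[pos][0]))
        raise SyntaxError("operand expected")

    target = expect("IDENTIFIER")
    expect("OPERATOR", "=")
    expr = operand()
    while pos < len(tokens) and tokens[pos] == ("OPERATOR", "+"):
        expect("OPERATOR", "+")
        expr = ("+", expr, operand())  # the operator must sit between two operands
    expect("SEPARATOR", ";")
    return ("=", ("identifier", target), expr)

tokens = [("IDENTIFIER", "sum"), ("OPERATOR", "="), ("IDENTIFIER", "a"),
          ("OPERATOR", "+"), ("IDENTIFIER", "b"), ("SEPARATOR", ";")]
print(parse_assignment(tokens))
```

The parser rejects a statement such as sum = a + ; because an operand is missing, but it accepts any operands regardless of their types.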
3.SEMANTIC ANALYSIS: 
The syntax analyzer only creates the parse tree. The semantic analyzer checks the meaning 
of the statement represented by the parse tree. It compares the information in one part of the 
parse tree with that in another part. 
Semantic analysis is used for the following: 
1.maintaining the symbol table for each program. 
2.checking for semantic errors in the source program. 
3.collecting type information for code generation. 
4.reporting the compilation errors. 
5.handing the checked tree to the later phases, which generate the object code. 
For add = a + b, semantic analysis checks the data types of the first and second operands and 
checks that the number of operands matches what the operator requires. 
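The type check can be sketched as follows. This is a minimal illustration, not a full semantic analyzer: the nested-tuple tree shape, the symbol table as a plain dictionary from names to type names, and the function name check_assignment are all assumptions made for this example, and only named operands are handled.

```python
def check_assignment(tree, symbol_table):
    """Return the type of an assignment tree, or raise TypeError on a semantic error."""
    def type_of(node):
        if node[0] in ("identifier", "operand"):
            name = node[1]
            if name not in symbol_table:
                raise TypeError(f"undeclared name {name!r}")
            return symbol_table[name]
        if node[0] == "+":
            if len(node) != 3:  # '+' is binary: exactly two operands
                raise TypeError("'+' needs exactly two operands")
            left, right = type_of(node[1]), type_of(node[2])
            if left != right:
                raise TypeError(f"cannot add {left} and {right}")
            return left
        raise TypeError(f"unknown node {node[0]!r}")

    _, target, expr = tree
    target_type, expr_type = type_of(target), type_of(expr)
    if target_type != expr_type:
        raise TypeError(f"cannot assign {expr_type} to {target_type}")
    return target_type

tree = ("=", ("identifier", "add"), ("+", ("operand", "a"), ("operand", "b")))
print(check_assignment(tree, {"add": "int", "a": "int", "b": "int"}))
```

If b were declared as a string in the symbol table, the same call would raise a TypeError, which is the kind of error the parser alone cannot detect.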
B.SYNTHESIS PHASE: 
It generates the target program from the intermediate representation. The synthesis 
phase can be divided into three parts: 
1.Intermediate code generator 
2.Code optimizer 
3.Code generator 
1.INTERMEDIATE CODE GENERATOR: 
An intermediate code is generated as a program for an abstract machine. The 
intermediate code should be easy to translate into the target program. 
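One common form of intermediate code is three-address code, where each instruction has at most one operator. The sketch below shows one way to produce it from a tree; the nested-tuple tree shape, the temporary names t1, t2, ... and the function name to_three_address are assumptions for illustration only.

```python
def to_three_address(tree):
    """Flatten an assignment tree into a list of three-address instructions."""
    code = []
    temp_count = 0

    def emit(node):
        # Return the name that holds the value of node, emitting code as needed.
        nonlocal temp_count
        if node[0] in ("identifier", "operand"):
            return node[1]
        op, left, right = node
        left_name, right_name = emit(left), emit(right)
        temp_count += 1
        temp = f"t{temp_count}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    result = emit(expr)
    code.append(f"{target[1]} = {result}")
    return code

tree = ("=", ("identifier", "add"), ("+", ("operand", "a"), ("operand", "b")))
for line in to_three_address(tree):
    print(line)
```

Each line is simple enough to map onto one or two machine instructions, which is what makes this form easy to translate.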
2.CODE OPTIMIZER: 
It improves the intermediate code so that faster-running code is generated. 
Different compilers apply different optimization techniques. 
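Constant folding is one simple example of such an optimization: an expression whose operands are all constants is computed at compile time. The sketch below assumes the intermediate code is a list of strings of the form "x = a op b" and handles only non-negative integer literals; the instruction format and the name fold_constants are assumptions for this example, not a technique named in the slides.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Replace instructions like 't1 = 2 * 3' with their computed value."""
    folded = []
    for line in code:
        target, expr = [part.strip() for part in line.split("=", 1)]
        parts = expr.split()
        both_constant = (len(parts) == 3 and parts[1] in OPS
                         and parts[0].isdigit() and parts[2].isdigit())
        if both_constant:
            value = OPS[parts[1]](int(parts[0]), int(parts[2]))
            folded.append(f"{target} = {value}")
        else:
            folded.append(f"{target} = {expr}")  # leave everything else unchanged
    return folded

print(fold_constants(["t1 = 2 * 3", "x = t1 + a"]))
```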
3.CODE GENERATOR: 
It generates the target code, typically assembly code. 
Here, a separate memory location is allotted to each variable. 
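The idea can be sketched with a toy generator. Everything about the output is hypothetical: the mnemonics LOAD, ADD, SUB, MUL and STORE, the single register R0, the 4-byte slot size and the function name generate_assembly are invented for this illustration and do not correspond to any real instruction set.

```python
def generate_assembly(code):
    """Translate 'x = a op b' / 'x = a' instructions into a made-up assembly dialect."""
    addresses = {}  # variable name -> memory offset, allotted on first use

    def slot(name):
        if name not in addresses:
            addresses[name] = len(addresses) * 4  # assume 4-byte slots
        return f"[{addresses[name]}]"

    def operand(text):
        # Integer literals become immediates; names become memory references.
        return f"#{text}" if text.isdigit() else slot(text)

    mnemonics = {"+": "ADD", "-": "SUB", "*": "MUL"}
    asm = []
    for line in code:
        target, expr = [part.strip() for part in line.split("=", 1)]
        parts = expr.split()
        asm.append(f"LOAD R0, {operand(parts[0])}")
        if len(parts) == 3:
            asm.append(f"{mnemonics[parts[1]]} R0, {operand(parts[2])}")
        asm.append(f"STORE R0, {slot(target)}")
    return asm, addresses

asm, addresses = generate_assembly(["t1 = a + b", "add = t1"])
print("\n".join(asm))
print(addresses)
```

The returned addresses dictionary shows the separate location given to each variable, including the temporary t1.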
STRUCTURE OF A COMPILER: 
1.FRONT END 
2.MIDDLE END 
3.BACK END 
1.FRONT END: 
It verifies syntax and semantics, and generates an intermediate 
representation of the source code for processing by the middle end. It performs type 
checking by collecting type information. It reports errors and warnings, if any, in a useful 
way. The front end includes lexical analysis, syntax analysis, and semantic analysis. 
2.MIDDLE END: 
It performs optimizations, including the removal of useless or unreachable code, and 
generates another intermediate representation for the back end. 
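Removal of unreachable code can be sketched on a flat instruction list. The instruction format is an assumption for this example: a line ending in a colon is a label, and lines starting with "goto " or "return" are unconditional jumps. The sketch is deliberately conservative and treats every label as reachable.

```python
def remove_unreachable(code):
    """Drop instructions that follow an unconditional jump until the next label."""
    kept = []
    reachable = True
    for line in code:
        if line.endswith(":"):  # a label may be a jump target, so code is reachable again
            reachable = True
        if reachable:
            kept.append(line)
        if line.startswith(("goto ", "return")):
            reachable = False   # nothing after an unconditional jump runs by fall-through
    return kept

program = ["x = 1", "return x", "x = 2", "end:", "y = 3"]
print(remove_unreachable(program))
```

Here the instruction x = 2 can never execute because it directly follows a return, so it is dropped.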
3.BACK END: 
It generates the assembly code. 
SOME OF THE COMPANIES WHICH PROVIDE COMPILERS ARE: 
1. ABSINT 
2. FORTRAN 77 AND FORTRAN 90 FOR WINNT, WIN95, WIN3.1, LINUX, AND MACINTOSH. 
C/C++ FOR MACINTOSH. 
3. ADA CORE TECHNOLOGIES, AND SOME OTHER COMPANIES 
4. A WELL-DOCUMENTED EXAMPLE IS NIKLAUS WIRTH'S PL/0 COMPILER, WHICH WIRTH 
USED TO TEACH COMPILER CONSTRUCTION IN THE 1970S 
COMPILER OUTPUT 
One classification of compilers is by the platform on which their generated code 
executes. This is known as the target platform. 
NATIVE COMPILER: 
A native compiler is one whose output is intended to run directly on the same type of 
computer and operating system that the compiler itself runs on. 
CROSS COMPILER: 
The output of a cross compiler is designed to run on a different 
platform. Cross compilers are often used when developing software for embedded 
systems that are not intended to support a software development environment. 
The output of a compiler that produces code for a virtual machine (VM) 
may or may not be executed on the same platform as the compiler that produced it. For this 
reason, such compilers are not usually classified as native or cross compilers. 
VIRTUAL MACHINE: 
A virtual machine is a software implementation of a machine (for 
example, a computer) that executes programs like a physical machine (a hardware-based 
device such as a personal computer). 
HARDWARE COMPILATION: 
The output of some compilers may target computer hardware at a 
very low level, for example an application-specific integrated circuit,… 
Such compilers are known as hardware compilers because the source code they compile 
determines the final configuration of the hardware. 
COMPILER CORRECTNESS: 
Compiler correctness is the branch of software engineering that deals 
with trying to show that a compiler behaves according to its language specification. 
Techniques include developing the compiler using formal methods and using rigorous 
testing (often called compiler validation) on an existing compiler. 
RELATED TECHNIQUES: 
Assembly language is a type of low-level language, and a program that 
compiles it is more commonly known as an assembler, with the inverse program known as a 
disassembler. 
A program that translates from a low-level language to a high-level 
language is a decompiler. 
A program that translates between high-level languages is usually 
called a language translator, source-to-source translator, language converter, or 
language rewriter. The last term is usually applied to translations that do not involve a 
change of language. 
CROSS COMPILER: 
A program that translates into an object code format that is not 
supported on the compilation machine is called a cross compiler and is commonly used to 
prepare code for embedded applications.
THANK YOU
