Introduction to Compiler
Dr.P.Amudha
Associate Professor
Dept. of CSE
Introduction
• To reduce the complexity of designing and building
computers, nearly all of them are made to execute relatively
simple commands.
• A program is built by combining these very simple commands
in what is called machine language.
• Since this is a tedious and error-prone process, most
programming is instead done using a high-level
programming language.
• This language can be very different from the machine
language that the computer can execute, so some means of
bridging the gap is required.
• This is where the compiler comes in.
Contd.,
• Using a high-level language for programming has a large impact
on how fast programs can be developed.
• The main reasons for this are:
• Compared to machine language, the notation used by
programming languages is closer to the way humans think about
problems.
• The compiler can spot some obvious programming mistakes.
• Programs written in a high-level language tend to be shorter
than equivalent programs written in machine language.
• Another advantage of using a high-level language is that
the same program can be compiled to many different machine
languages and, hence, run on many different machines.
Translator
• A program written in a high-level language is called
source code. To convert the source code into machine
code, translators are needed.
• A translator takes a program written in the source language as
input and converts it into a program in the target language as
output.
• It also detects and reports errors during translation.
The roles of a translator are:
• Translating the high-level language program input into an
equivalent machine language program.
• Providing diagnostic messages wherever the programmer
violates the specification of the high-level language.
Different types of translators
Compiler
• A compiler is a translator that converts programs in a high-level
language into a low-level language. It translates the entire program and
reports the errors in the source program encountered during translation.
Interpreter
• An interpreter is a translator that processes a high-level language
program statement by statement and reports an error as soon as it is
encountered.
• It directly executes the operations specified in the source program on the
input given by the user, rather than producing a target program.
• It can often give better error diagnostics than a compiler.
Assembler
• An assembler is a translator that translates assembly language code into
machine language code.
S.No. | Compiler | Interpreter
1 | Performs the translation of a program as a whole. | Performs statement-by-statement translation.
2 | Execution is faster. | Execution is slower.
3 | Requires more memory, as linking is needed for the generated intermediate object code. | Memory usage is efficient, as no intermediate object code is generated.
4 | Debugging is hard, as the error messages are generated only after scanning the entire program. | It stops translation when the first error is met. Hence, debugging is easy.
5 | Programming languages like C and C++ use compilers. | Programming languages like Python, BASIC, and Ruby use interpreters.
Why learn about compilers?
a) It is considered a topic that you should know in order to be
“well-cultured” in computer science.
b) A good craftsman should know his tools, and compilers are
important tools for programmers and computer scientists.
c) The techniques used for constructing a compiler are useful
for other purposes as well.
d) There is a good chance that a programmer or computer
scientist will need to write a compiler or interpreter for a
domain-specific language
Compilers and Interpreters
• “Compilation”
– Translation of a program written in a source
language into a semantically equivalent
program written in a target language
[Figure: Source Program → Compiler → Target Program, with the compiler also producing error messages; the Target Program then maps Input to Output.]
Compilers and Interpreters
(cont’d)
[Figure: Source Program and Input → Interpreter → Output, with the interpreter also producing error messages.]
• “Interpretation”
– Performing the operations implied by the
source program
The Analysis-Synthesis Model of
Compilation
• There are two parts to compilation:
– Analysis determines the operations implied by
the source program which are recorded in a tree
structure
– Synthesis takes the tree structure and translates
the operations therein into the target program
Other Tools that Use the
Analysis-Synthesis Model
• Editors (syntax highlighting)
• Pretty printers (e.g. Doxygen)
• Static checkers (e.g. Lint and Splint)
• Interpreters
• Text formatters (e.g. TeX and LaTeX)
• Silicon compilers (e.g. VHDL)
• Query interpreters/compilers (Databases)
Language Processing System
[Figure: Skeletal Source Program → Preprocessor → Source Program → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker (together with Libraries and Relocatable Object Files) → Absolute Machine Code]
Language Processing System
Preprocessor:
• A preprocessor, generally considered a part of the
compiler, is a tool that produces input for the compiler.
• It deals with macro processing, augmentation, file
inclusion, language extension, etc.
Compiler:
• A compiler translates a high-level language into
a low-level language such as assembly or machine language.
Assembler:
• An assembler translates assembly language programs
into machine code.
• The output of an assembler is called an object file,
which contains machine instructions together with
the data required to place these instructions in memory.
Contd.,
Loader:
• The loader is a part of the operating system and is
responsible for loading executable files into memory
and executing them.
• It calculates the size of a program (instructions and
data) and creates memory space for it.
• Cross-compiler: A compiler that runs on platform A
and is capable of generating executable code for
platform B is called a cross-compiler.
• Source-to-source Compiler: A compiler that takes the
source code of one programming language and
translates it into the source code of another
programming language is called a source-to-source
compiler.
Example
How a program, using a C compiler, is executed on
a host machine:
• The user writes a program in the C language (high-level
language).
• The C compiler compiles the program and
translates it into an assembly program (low-level
language).
• An assembler then translates the assembly
program into machine code (object code).
• A linker is used to link all the parts of the
program together for execution (executable
machine code).
• A loader loads all of them into memory, and
then the program is executed.
The Grouping of Phases
• Compiler front and back ends:
– Front end: analysis (machine independent)
– Back end: synthesis (machine dependent)
• Compiler passes:
– A collection of phases is done only once (single pass)
or multiple times (multi pass)
• Single pass: usually requires everything to be defined before
being used in source program
• Multi pass: compiler may have to keep entire program
representation in memory
Phases of compiler
• Lexical Analysis
A token is a string of characters, categorized according to
the rules as a symbol (e.g. IDENTIFIER, NUMBER, COMMA, etc.).
The process of forming tokens from an input stream of
characters is called tokenization, and the lexer categorizes
the tokens according to symbol type.
A token can look like anything that is useful for
processing an input text stream or text file.
Contd.,
• A lexical analyzer generally does nothing with combinations of
tokens, a task left for a parser. For example, a typical lexical
analyzer recognizes parentheses as tokens, but does nothing to
ensure that each '(' is matched with a ')'.
• In a compiler, linear analysis is called lexical analysis or scanning.
For example, in lexical analysis
the characters in the assignment statement position := initial + rate ∗ 60
would be grouped into the following tokens:
1. The identifier position.
2. The assignment symbol :=
3. The identifier initial.
4. The plus sign +
5. The identifier rate.
6. The multiplication sign ∗
7. The number 60
The blanks separating the characters of these tokens would be eliminated.
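The grouping above can be sketched with a small regular-expression scanner. This is a minimal illustration, not the lexer of any particular compiler: the token names, the TOKEN_SPEC table and the tokenize function are assumptions made for the example, and the ASCII character * stands in for ∗.

```python
import re

# Hypothetical token categories for the example statement.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),  # blanks are matched and then dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, eliminating blanks."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            # The remaining characters do not form any token of the language.
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, lexeme in tokenize("position := initial + rate * 60"):
    print(kind, lexeme)
```

Note that the scanner never checks how tokens combine; that is left to the parser.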
Syntax Analysis
• Hierarchical analysis is called parsing or syntax
analysis.
• It involves grouping the tokens of the source
program into grammatical phrases that are used by
the compiler to synthesize output.
• Usually, the grammatical phrases of the source
program are represented by a parse tree
• The hierarchical structure of a program is usually expressed
by recursive rules.
• For example, we might have the following rules as part of the
definition of expressions:
1. Any identifier is an expression.
2. Any number is an expression.
3. If expression1 and expression2 are expressions, then so are
expression1 + expression2
expression1 ∗ expression2
(expression1)
Rules (1) and (2) are non-recursive basic rules, while (3) defines
expressions in terms of operators applied to other expressions.
Thus, by rule (1), initial and rate are expressions.
By rule (2), 60 is an expression, while by rule (3), we can first
infer that rate∗60 is an expression and finally that initial +
rate∗60 is an expression.
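As a rough sketch, the recursive rules can be turned into a recursive-descent parser. The rules as stated do not fix operator precedence, so the code below assumes the usual layering in which ∗ binds tighter than +. The function name parse_expression, the nested-tuple tree encoding, and the use of * for ∗ are all choices made for this example.

```python
def parse_expression(tokens):
    """Parse a list of token strings into a nested-tuple parse tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":  # rule 3: (expression1)
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():  # rule 2: any number is an expression
            return ("num", advance())
        if tok.isidentifier():  # rule 1: any identifier is an expression
            return ("id", advance())
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() == "*":  # rule 3: expression1 * expression2
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":  # rule 3: expression1 + expression2
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse_expression(["initial", "+", "rate", "*", "60"]))
```

The tree for initial + rate ∗ 60 has + at the root, with rate ∗ 60 as its right subtree, which mirrors the order of inference described above.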
Semantic Analysis
• The semantic analysis phase checks the source program for
semantic errors and gathers type information for the subsequent
code-generation phase.
• It uses the hierarchical structure determined by the syntax-analysis
phase to identify the operators and operands of expressions and
statements.
• An important component of semantic analysis is type checking.
• Here the compiler checks that each operator has operands that are
permitted by the source language specification.
• For example, many programming language definitions require a
compiler to report an error every time a real number is used to
index an array.
• However, the language specification may permit some operand
coercions, for example, when a binary arithmetic operator is applied
to an integer and a real.
• In this case, the compiler may need to convert the integer to a real.
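The integer-to-real conversion can be sketched as a pass over the tree produced by syntax analysis. The nested-tuple tree encoding, the annotate function, the type names "integer" and "real", and the inttoreal node label are assumptions made for this illustration.

```python
def annotate(node, symbol_types):
    """Return (tree, type), inserting a conversion node where an integer meets a real.

    Leaves are ("id", name) or ("num", text); inner nodes are (operator, left, right).
    """
    kind = node[0]
    if kind == "num":
        return node, ("real" if "." in node[1] else "integer")
    if kind == "id":
        if node[1] not in symbol_types:
            raise TypeError(f"undeclared identifier {node[1]!r}")
        return node, symbol_types[node[1]]
    left, left_type = annotate(node[1], symbol_types)
    right, right_type = annotate(node[2], symbol_types)
    if left_type == right_type:
        return (kind, left, right), left_type
    # Mixed integer/real operands: wrap the integer side in a conversion node.
    if left_type == "integer":
        left = ("inttoreal", left)
    else:
        right = ("inttoreal", right)
    return (kind, left, right), "real"


tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", "60")))
typed_tree, result_type = annotate(tree, {"initial": "real", "rate": "real"})
print(typed_tree, result_type)
```

Here the literal 60 is the only integer operand, so it is the only leaf that gets wrapped.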
The Phases of a Compiler
• The six phases of compilation are lexical analysis, syntax
analysis, semantic analysis, intermediate code
generation, code optimization, and code generation.
• The six phases are divided into two groups:
1. Front end: depends on the source language and works on the
stream of tokens and the parse tree.
2. Back end: depends on the target machine and is independent of
the source code.
Symbol-Table Management:
• A symbol table is a data structure containing a record for
each identifier, with fields for the attributes of the
identifier.
• The data structure allows us to find the record for each
identifier quickly and to store or retrieve data from that
record quickly.
• The compiler uses the symbol table to manage
information about variables and their attributes.
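A minimal sketch of such a structure, assuming a hash table (a Python dict) keyed by identifier name. The class name SymbolTable, its methods, and the attribute names type and offset are hypothetical.

```python
class SymbolTable:
    """Maps each identifier to a record of attribute fields."""

    def __init__(self):
        self._records = {}

    def insert(self, name, **attributes):
        """Create the record for name if needed, then store attributes in it."""
        record = self._records.setdefault(name, {"name": name})
        record.update(attributes)
        return record

    def lookup(self, name):
        """Return the record for name, or None if it was never inserted."""
        return self._records.get(name)


table = SymbolTable()
table.insert("rate", type="real")   # e.g. recorded when the declaration is seen
table.insert("rate", offset=8)      # e.g. filled in by a later phase
print(table.lookup("rate"))
```

A dict gives expected constant-time insert and lookup, which matches the requirement that records be found quickly.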
Error Detection and Reporting
• Each phase can encounter errors.
• However, after detecting an error, a phase must somehow
deal with that error, so that compilation can proceed,
allowing further errors in the source program to be
detected.
• A compiler that stops when it finds the first error is not as
helpful as it could be.
• The syntax and semantic analysis phases usually handle a
large fraction of the errors detectable by the compiler.
• The lexical phase can detect errors where the characters
remaining in the input do not form any token of the
language.
• Errors where the token stream violates the structure rules
(syntax) of the language are determined by the syntax
analysis phase.
• Intermediate Code Generation:
• An intermediate representation of the final machine
language code is produced.
• This phase bridges the analysis and synthesis phases of
translation.
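One common intermediate representation is three-address code, in which each instruction has at most one operator on its right-hand side. The slides do not name a specific representation, so the choice is an assumption here, as are the function to_three_address, the nested-tuple tree encoding, and the temporary names t1, t2.

```python
from itertools import count


def to_three_address(target, tree):
    """Emit three-address instructions that assign the value of tree to target.

    Leaves are ("id", name) or ("num", text); inner nodes are (operator, left, right).
    """
    code = []
    temps = count(1)

    def emit(node):
        if node[0] in ("id", "num"):
            return node[1]
        left = emit(node[1])
        right = emit(node[2])
        temp = f"t{next(temps)}"  # fresh temporary for this operator
        code.append(f"{temp} = {left} {node[0]} {right}")
        return temp

    result = emit(tree)
    code.append(f"{target} = {result}")
    return code


tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", "60")))
for line in to_three_address("position", tree):
    print(line)
```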
• Code Optimization:
• This is an optional phase designed to improve the intermediate
code so that the output runs faster and takes less space.
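Constant folding is one simple example of such an improvement: operations whose operands are all known at compile time are evaluated by the compiler. The sketch below assumes a nested-tuple expression tree with integer literals and only the + and * operators; the function name fold_constants is hypothetical.

```python
def fold_constants(node):
    """Replace operator nodes whose operands are both integer literals by one literal.

    Leaves are ("id", name) or ("num", text); inner nodes are ("+" or "*", left, right).
    """
    if node[0] in ("id", "num"):
        return node
    left = fold_constants(node[1])
    right = fold_constants(node[2])
    if left[0] == "num" and right[0] == "num":
        a, b = int(left[1]), int(right[1])
        value = a + b if node[0] == "+" else a * b
        return ("num", str(value))
    return (node[0], left, right)


print(fold_constants(("+", ("id", "x"), ("*", ("num", "2"), ("num", "30")))))
```

After folding, the multiplication no longer has to be performed at run time.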
• Code Generation:
• The last phase of translation is code generation. A number
of optimizations to reduce the length of the machine language
program are carried out during this phase.
• The output of the code generator is the machine language
program for the specified computer.
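Real code generators target a concrete instruction set. As a stand-in, the sketch below emits instructions for a hypothetical stack machine and includes a tiny interpreter for that machine so the result can be checked. The opcodes PUSH, LOAD, ADD and MUL, the functions generate and run, and the nested-tuple tree encoding are all invented for this illustration.

```python
def generate(node):
    """Return a list of (opcode, operand) instructions for a hypothetical stack machine."""
    if node[0] == "num":
        return [("PUSH", int(node[1]))]
    if node[0] == "id":
        return [("LOAD", node[1])]
    opcode = {"+": "ADD", "*": "MUL"}[node[0]]
    # Operands are evaluated first, then the operator combines the top two stack entries.
    return generate(node[1]) + generate(node[2]) + [(opcode, None)]


def run(program, memory):
    """Execute the instructions against a dict of variable values and return the result."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "LOAD":
            stack.append(memory[operand])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", "60")))
program = generate(tree)
print(program)
print(run(program, {"initial": 10, "rate": 2}))
```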
Compiler construction tools
1. Parser generators.
2. Scanner generators.
3. Syntax-directed translation engines.
4. Automatic code generators.
5. Data-flow analysis engines.
6. Compiler-construction toolkits.
Parser Generators
• Input: Grammatical description of a programming
language
Output: Syntax analyzers.
• Parser generator takes the grammatical description
of a programming language and produces a syntax
analyzer.
Scanner Generators
• Input: Regular expression description of the tokens of a
language
Output: Lexical analyzers.
• Scanner generator generates lexical analyzers from a regular
expression description of the tokens of a language.
Syntax-directed Translation Engines
• Input: Parse tree.
Output: Intermediate code.
• Syntax-directed translation engines produce collections of
routines that walk a parse tree and generate intermediate
code.
Automatic Code Generators
• Input: Intermediate language.
Output: Machine language.
• Code-generator takes a collection of rules that define the translation of each
operation of the intermediate language into the machine language for a
target machine.
Data-flow Analysis Engines
• Data-flow analysis engine gathers the information, that is, the values
transmitted from one part of a program to each of the other parts.
• Data-flow analysis is a key part of code optimization.
Compiler Construction Toolkits
• Compiler construction toolkits provide an integrated set of routines for
constructing the various phases of a compiler.
Cousins of Compiler
• Interpreters: discussed in detail in first lecture
• Preprocessors: They produce input for the compiler. They perform jobs
such as deleting comments, including files, and expanding macros.
• Assemblers: They are translators for assembly language. Sometimes
the compiler generates assembly language in symbolic form and then
hands it over to an assembler.
• Linkers: Both compilers and assemblers rely on linkers to collect code
that was separately compiled or assembled into object files and combine
it into a file that is directly executable.
• Loaders: They resolve all relocatable addresses relative to a base address.
Applications of compiler technology
1. Implementation of High-Level Programming Languages
2. Optimizations for Computer Architectures
• Parallelism
• Memory hierarchies
3. Design of New Computer Architectures
• RISC
• Specialized Architectures
4. Program Translations
• Binary Translation
• Hardware Synthesis
• Database Query Interpreters
• Compiled Simulation
5. Software Productivity Tools
• Type Checking
• Bounds Checking
• Memory-Management Tools
Complexity of compiler technology
• A compiler is possibly the most complex
system software
• The complexity arises from the fact that it is
required to map a programmer’s requirements
(in a HLL program) to architectural details
• It uses algorithms and techniques from a very
large number of areas in computer science
• Translates intricate theory into practice -
enables tool building
Compiler Algorithms
Makes practical application of
• Greedy algorithms - register allocation
• Heuristic search - list scheduling
• Graph algorithms - dead code elimination, register allocation
• Dynamic programming - instruction selection
• Optimization techniques - instruction scheduling
• Finite automata - lexical analysis
• Pushdown automata - parsing
• Fixed point algorithms - data-flow analysis
• Complex data structures - symbol tables, parse trees, data
dependence graphs
• Computer architecture - machine code generation
Context Free Grammars
• Grammars are used to describe the syntax of a
programming language. They specify the structure of
expressions and statements.
• stmt -> if (expr) then stmt
where stmt denotes statements, expr denotes
expressions.
Types of grammar
• Type 0 grammar
• Type 1 grammar
• Type 2 grammar
• Type 3 grammar
• Context free grammar is also called Type 2
grammar.
A context free grammar G is defined by a 4-tuple,
G = (V, T, P, S)
where,
• G - Grammar
• V - Set of variables
• T - Set of terminals
• P - Set of productions
• S - Start symbol
• It produces a Context Free Language (CFL), which is the collection of
strings of terminals that can be derived from the start symbol of the
grammar in one or more steps:
L(G) = { w ∈ T* | S ⇒* w }
where,
• L - Language
• G - Grammar
• w - Input string
• S - Start symbol
• T - Set of terminals
Conventions
Terminals are symbols from which strings
are formed.
• Lowercase letters, e.g., a, b, c.
• Operators, e.g., +, -, *.
• Punctuation symbols, e.g., comma, parenthesis.
• Digits, i.e., 0, 1, 2, ..., 9.
• Boldface strings, e.g., id, if.
Non-terminals are syntactic variables that denote a set
of strings.
• Uppercase letters, e.g., A, B, C.
• Lowercase italic names, e.g., expr, stmt.
• The start symbol is the head of the production stated first in the
grammar.
• A production is of the form LHS -> RHS (or) head ->
body, where the head contains only one non-terminal and the body
contains a sequence of terminals and non-terminals.
• (eg.) Let G be,
Context Free Grammars vs Regular Expressions
• Grammars are more powerful than regular expressions.
• Every construct that can be described by a regular
expression can be described by a grammar but not vice-
versa.
• Every regular language is a context free language, but
the reverse does not hold.
(e.g.)
• RE = (a | b)*abb (set of strings ending with abb).
• Grammar
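To illustrate that a regular expression can also be described by a grammar, the sketch below compares (a | b)*abb with one right-linear grammar for the same language. The grammar shown is an assumed example and is not necessarily the one from the original slide; the names GRAMMAR, derives and matches_regex are hypothetical.

```python
import re

PATTERN = re.compile(r"(a|b)*abb")

# Assumed grammar for strings over {a, b} that end in abb:
#   A0 -> a A0 | b A0 | a A1
#   A1 -> b A2
#   A2 -> b A3
#   A3 -> epsilon
# Each production is stored as (terminal, next non-terminal); ("", None) is epsilon.
GRAMMAR = {
    "A0": [("a", "A0"), ("b", "A0"), ("a", "A1")],
    "A1": [("b", "A2")],
    "A2": [("b", "A3")],
    "A3": [("", None)],
}


def matches_regex(string):
    """Return True if the whole string matches (a|b)*abb."""
    return PATTERN.fullmatch(string) is not None


def derives(string, start="A0"):
    """Return True if the grammar derives string from the start symbol."""
    current = {start}  # non-terminals that may remain after the prefix read so far
    for symbol in string:
        current = {
            target
            for nonterminal in current
            for terminal, target in GRAMMAR[nonterminal]
            if terminal == symbol
        }
    return any(("", None) in GRAMMAR[nonterminal] for nonterminal in current)


for s in ["abb", "aabb", "babb", "ab", "abba", ""]:
    print(repr(s), matches_regex(s), derives(s))
```

Both checks agree on every string tried, which is what the claim about regular expressions and grammars predicts.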
Syntax directed definition
• Syntax directed definition specifies the values of
attributes by associating semantic rules with the
grammar productions.
• It is a context free grammar with attributes and
rules together which are associated with grammar
symbols and productions respectively.
• The process of syntax directed translation is two-
fold:
• Construction of syntax tree
• Computing values of attributes at each node by
visiting the nodes of syntax tree.
• Semantic actions
• Semantic actions are fragments of code that are
embedded within production bodies in a syntax
directed translation scheme.
• They are usually enclosed within curly braces ({ }).
• They can occur anywhere in a production body, but usually
appear at the end of the production.
• (e.g.)
• E ---> E1 + T { print '+' }
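The effect of the action in E ---> E1 + T { print '+' } can be sketched by translating sums of digits to postfix. Two things are assumed beyond the text: a second production T ---> digit { print digit }, and a loop in place of the left-recursive rule, since left recursion cannot be used directly in a top-down parser. The function name to_postfix is hypothetical, and the actions append to a list instead of printing so that the output can be returned.

```python
def to_postfix(tokens):
    """Translate tokens such as ['9', '+', '5', '+', '2'] to a postfix string."""
    output = []
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected a digit")
        output.append(tokens[pos])  # assumed action: T ---> digit { print digit }
        pos += 1

    term()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        term()
        output.append("+")  # action at the end of E ---> E1 + T { print '+' }
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return " ".join(output)


print(to_postfix(["9", "+", "5", "+", "2"]))
```

Because the action sits at the end of the production body, the operator is emitted only after both operands have been translated.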
Types of translation
L-attributed translation
• It performs translation during parsing itself.
• No need of explicit tree construction.
• L represents 'left to right'.
S-attributed translation
• It is performed in connection with bottom up
parsing.
• 'S' represents synthesized.
Types of attributes
Inherited attributes
• It is defined by the semantic rule associated with the
production at the parent of the node.
• Its value is computed only from attribute values at the node's
parent, its siblings, and the node itself.
• The non-terminal concerned must be in the body of the
production.
Synthesized attributes
• It is defined by the semantic rule associated with the
production at the node.
• Its value is computed only from attribute values at the node's
children and the node itself.
• The non-terminal concerned must be in the head of the production.
• Terminals have synthesized attributes, which are the lexical
values (denoted by lexval) generated by the lexical analyzer.
Syntax directed definition of a simple desk calculator
Production          Semantic rules
L ---> E n          L.val = E.val
E ---> E1 + T       E.val = E1.val + T.val
E ---> T            E.val = T.val
T ---> T1 * F       T.val = T1.val × F.val
T ---> F            T.val = F.val
F ---> (E)          F.val = E.val
F ---> digit        F.val = digit.lexval
Syntax-directed definition with inherited attributes
Production          Semantic rules
D ---> T L          L.inh = T.type
T ---> int          T.type = integer
T ---> float        T.type = float
L ---> L1 , id      L1.inh = L.inh
                    addType(id.entry, L.inh)
L ---> id           addType(id.entry, L.inh)
• Symbol T is associated with a synthesized attribute type.
• Symbol L is associated with an inherited attribute inh.
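A small sketch of how the type flows from T down the list of identifiers. The tuple encoding of the L subtree, the functions declare and add_types, and the use of a plain dict as the symbol table are assumptions for this example; addType is modelled as a dict assignment.

```python
def add_types(l_node, inh, symbol_table):
    """Apply the rules for L to one node.

    l_node is ("L", l1_node, name) for L ---> L1 , id, or ("L", name) for L ---> id.
    """
    if len(l_node) == 3:
        _, l1_node, name = l_node
        add_types(l1_node, inh, symbol_table)  # L1.inh = L.inh
    else:
        _, name = l_node
    symbol_table[name] = inh  # addType(id.entry, L.inh)


def declare(type_keyword, l_node):
    """Apply the rules for D ---> T L and return the filled symbol table."""
    t_type = {"int": "integer", "float": "float"}[type_keyword]  # T.type (synthesized)
    symbol_table = {}
    add_types(l_node, t_type, symbol_table)  # L.inh = T.type
    return symbol_table


# Parse tree of the identifier list in "float x, y, z"
print(declare("float", ("L", ("L", ("L", "x"), "y"), "z")))
```

The type is computed once at T and then passed unchanged down every L node, which is what makes inh an inherited attribute.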
Types of Syntax Directed Definitions
• S-attributed Definitions
• Syntax directed definition that involves only synthesized
attributes is called S-attributed.
• Attribute values for the non-terminal at the head is
computed from the attribute values of the symbols at the
body of the production.
• The attributes of an S-attributed SDD can be evaluated in
bottom-up order of the nodes of the parse tree,
i.e., by performing a post-order traversal of the parse tree
and evaluating the attributes at a node when the traversal leaves
that node for the last time.
Production          Semantic rules
L ---> E n          L.val = E.val
E ---> E1 + T       E.val = E1.val + T.val
E ---> T            E.val = T.val
T ---> T1 * F       T.val = T1.val × F.val
T ---> F            T.val = F.val
F ---> (E)          F.val = E.val
F ---> digit        F.val = digit.lexval
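The post-order evaluation can be sketched as a recursive function over a parse tree for the desk-calculator productions. The tuple encoding of the tree (a non-terminal name followed by its children, with terminals as strings and digit leaves as ("digit", lexval)) and the function name val are assumptions for this example.

```python
def val(node):
    """Compute the synthesized attribute val of a parse-tree node in post-order."""
    if node[0] == "digit":
        return node[1]  # F.val = digit.lexval
    children = node[1:]
    # Children are evaluated before their parent; plain strings are terminals without val.
    vals = [val(child) for child in children if isinstance(child, tuple)]
    terminals = [child for child in children if isinstance(child, str)]
    if "+" in terminals:
        return vals[0] + vals[1]  # E.val = E1.val + T.val
    if "*" in terminals:
        return vals[0] * vals[1]  # T.val = T1.val x F.val
    return vals[0]  # E ---> T, T ---> F, F ---> (E), L ---> E n


# Parse tree for the input 3 * 5 + 4 n
tree = ("L",
        ("E",
         ("E", ("T", ("T", ("F", ("digit", 3))), "*", ("F", ("digit", 5)))),
         "+",
         ("T", ("F", ("digit", 4)))),
        "n")
print(val(tree))
```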
• L-attributed Definitions
• A syntax directed definition in which the edges of the dependency
graph between attributes of symbols in a production body can go from
left to right but not from right to left is called an L-attributed definition.
• Attributes of L-attributed definitions may be either synthesized or
inherited.
• If an attribute is inherited, it must be computed only from:
• Inherited attributes associated with the production head.
• Inherited or synthesized attributes associated with the symbols
located to the left of the symbol whose attribute is being computed.
• Inherited or synthesized attributes associated with the symbol
under consideration itself, in such a way that no cycles are
formed in the dependency graph.
Production            Semantic rules
T ---> F T'           T'.inh = F.val
T' ---> * F T1'       T1'.inh = T'.inh × F.val
In production 1, the inherited attribute T'.inh is computed from the value
of F, which is to its left.
In production 2, the inherited attribute T1'.inh is computed from T'.inh,
associated with its head, and the value of F, which appears to its left
in the production.
That is, to compute an inherited attribute, an SDD must use information
either from above or from the left.
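The two rules can be sketched as a pair of recursive functions in which the inherited attribute is passed down as an argument. To return a result, the sketch assumes two things the table does not show: a production T' ---> epsilon that ends the chain, and a synthesized attribute that carries the accumulated product back up. The function names t and t_prime and the list-of-values input are hypothetical.

```python
def t_prime(factor_values, inh):
    """T' ---> * F T1' | epsilon, with inh holding the product accumulated so far."""
    if not factor_values:
        return inh  # assumed rule for T' ---> epsilon: pass the inherited value back up
    f_val, rest = factor_values[0], factor_values[1:]
    return t_prime(rest, inh * f_val)  # T1'.inh = T'.inh x F.val


def t(factor_values):
    """T ---> F T', where factor_values lists the F.val values, e.g. [3, 5] for 3 * 5."""
    return t_prime(factor_values[1:], factor_values[0])  # T'.inh = F.val


print(t([3, 5]))
print(t([2, 3, 4]))
```

Each call uses only its own argument (information from above) and the factor it has just read (information from the left), which is the L-attributed restriction.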

MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
ViniHema
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Pipe Restoration Solutions
 
Runway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptxRunway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptx
SupreethSP4
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 

Recently uploaded (20)

HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Runway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptxRunway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptx
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 

Compiler an overview

  • 2. Introduction • To reduce the complexity of designing and building them, computers are made to execute relatively simple commands. • A program for a computer must be built by combining these very simple commands in what is called machine language. • Since this is a tedious and error-prone process, most programming is instead done in a high-level programming language. • This language can be very different from the machine language that the computer can execute, so some means of bridging the gap is required. • This is where the compiler comes in.
  • 3. Contd., • Using a high-level language for programming has a large impact on how fast programs can be developed. • The main reasons for this are: • Compared to machine language, the notation used by programming languages is closer to the way humans think about problems. • The compiler can spot some obvious programming mistakes. • Programs written in a high-level language tend to be shorter than equivalent programs written in machine language. • Another advantage of using a high-level language is that the same program can be compiled to many different machine languages and, hence, run on many different machines.
  • 4. Translator • A program written in a high-level language is called source code. To convert the source code into machine code, translators are needed. • A translator takes a program written in a source language as input and converts it into a program in a target language as output. • It also detects and reports errors during translation. Roles of a translator are: • Translating the high-level language program input into an equivalent machine language program. • Providing diagnostic messages wherever the programmer violates the specification of the high-level language.
  • 5. Different types of translators Compiler • A compiler is a translator that converts programs in a high-level language to a low-level language. It translates the entire program and reports the errors in the source program encountered during translation. Interpreter • An interpreter is a translator that converts programs in a high-level language to a low-level language. It translates line by line and reports an error as soon as it is encountered during translation. • It directly executes the operations specified in the source program on the input given by the user. • It gives better error diagnostics than a compiler.
  • 6. Assembler • An assembler is a translator that translates assembly language code into machine language code.
  • 7. S.No. | Compiler | Interpreter
    1 | Translates the program as a whole. | Translates statement by statement.
    2 | Execution is faster. | Execution is slower.
    3 | Requires more memory, as linking is needed for the generated intermediate object code. | Memory usage is efficient, as no intermediate object code is generated.
    4 | Debugging is hard, as error messages are generated only after scanning the entire program. | Translation stops when the first error is met, so debugging is easy.
    5 | Languages like C and C++ use compilers. | Languages like Python, BASIC, and Ruby use interpreters.
  • 8. Why learn about compilers? a) It is considered a topic that you should know in order to be “well-cultured” in computer science. b) A good craftsman should know his tools, and compilers are important tools for programmers and computer scientists. c) The techniques used for constructing a compiler are useful for other purposes as well. d) There is a good chance that a programmer or computer scientist will need to write a compiler or interpreter for a domain-specific language
  • 9. Compilers and Interpreters • “Compilation”: translation of a program written in a source language into a semantically equivalent program written in a target language. [Diagram: Source Program → Compiler → Target Program, with the compiler emitting error messages; Input → Target Program → Output]
  • 10. Compilers and Interpreters (cont’d) • “Interpretation”: performing the operations implied by the source program. [Diagram: Source Program and Input → Interpreter → Output, with the interpreter emitting error messages]
  • 11. The Analysis-Synthesis Model of Compilation • There are two parts to compilation: – Analysis determines the operations implied by the source program which are recorded in a tree structure – Synthesis takes the tree structure and translates the operations therein into the target program
  • 12. Other Tools that Use the Analysis-Synthesis Model • Editors (syntax highlighting) • Pretty printers (e.g. Doxygen) • Static checkers (e.g. Lint and Splint) • Interpreters • Text formatters (e.g. TeX and LaTeX) • Silicon compilers (e.g. VHDL) • Query interpreters/compilers (Databases)
  • 13. Language Processing System [Diagram: Skeletal Source Program → Preprocessor → Source Program → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker (together with Libraries and Relocatable Object Files) → Absolute Machine Code]
  • 14. Language Processing System Preprocessor: • A preprocessor, generally considered a part of the compiler, is a tool that produces input for compilers. • It deals with macro-processing, augmentation, file inclusion, language extension, etc. Compiler: • A compiler translates a high-level language into a low-level machine language. Assembler: • An assembler translates assembly language programs into machine code. • The output of an assembler is called an object file, which contains a combination of machine instructions as well as the data required to place these instructions in memory.
  • 15. Contd., Loader: • The loader is a part of the operating system and is responsible for loading executable files into memory and executing them. • It calculates the size of a program's instructions and data and creates memory space for them. • Cross-compiler: A compiler that runs on platform A and is capable of generating executable code for platform B is called a cross-compiler. • Source-to-source compiler: A compiler that takes the source code of one programming language and translates it into the source code of another programming language is called a source-to-source compiler.
  • 16. Example: how a program is executed on a host machine using a C compiler. • The user writes a program in C (a high-level language). • The C compiler compiles the program and translates it to an assembly program (a low-level language). • An assembler then translates the assembly program into machine code (an object file). • A linker links all the parts of the program together for execution (executable machine code). • A loader loads all of them into memory, and then the program is executed.
  • 17. The Grouping of Phases • Compiler front and back ends: – Front end: analysis (machine independent) – Back end: synthesis (machine dependent) • Compiler passes: – A collection of phases is done only once (single pass) or multiple times (multi pass) • Single pass: usually requires everything to be defined before being used in source program • Multi pass: compiler may have to keep entire program representation in memory
  • 18. Phases of compiler • Lexical Analysis A token is a string of characters, categorized according to the rules as a symbol (e.g. IDENTIFIER, NUMBER, COMMA, etc.). The process of forming tokens from an input stream of characters is called tokenization and the lexer categorizes them according to symbol type. A token can look like anything that is useful for processing an input text stream or text file.
  • 19. Contd., • A lexical analyzer generally does nothing with combinations of tokens, a task left for a parser. For example, a typical lexical analyzer recognizes parentheses as tokens, but does nothing to ensure that each '(' is matched with a ')'. • In a compiler, linear analysis is called lexical analysis or scanning. For example, in lexical analysis the characters in the assignment statement position := initial + rate * 60 would be grouped into the following tokens: 1. The identifier position. 2. The assignment symbol := 3. The identifier initial. 4. The plus sign + 5. The identifier rate. 6. The multiplication sign * 7. The number 60. The blanks separating the characters of these tokens would be eliminated.
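The grouping into tokens described on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the category names (IDENTIFIER, ASSIGN, and so on) and the function name `tokenize` are hypothetical, and a real scanner would cover the full token set of its language.

```python
import re

# Hypothetical token categories; the names are illustrative, not from the slides.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),  # blanks are recognised and then discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, dropping the blanks between tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            # The remaining input does not form any token: a lexical error.
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, lexeme in tokenize("position := initial + rate * 60"):
    print(kind, lexeme)
```

Note that the sketch never checks how tokens combine; an unmatched parenthesis, for instance, would be a matter for the parser.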
  • 20. Syntax Analysis • Hierarchical analysis is called parsing or syntax analysis. • It involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to synthesize output. • Usually, the grammatical phrases of the source program are represented by a parse tree
  • 21. • The hierarchical structure of a program is usually expressed by recursive rules. • For example, we might have the following rules as part of the definition of expressions: 1. Any identifier is an expression. 2. Any number is an expression. 3. If expression1 and expression2 are expressions, then so are expression1 + expression2 expression1 ∗ expression2 (expression1) Rules (1) and (2) are non-recursive basic rules, while (3) defines expressions in terms of operators applied to other expressions. Thus, by rule (1), initial and rate are expressions. By rule (2), 60 is an expression, while by rule (3), we can first infer that rate∗60 is an expression and finally that initial + rate∗60 is an expression.
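The recursive rules above map naturally onto a recursive-descent parser. The sketch below is one possible reading, not the slides' method: it assumes the usual convention that ∗ binds tighter than +, splits rule (3) into the hypothetical helpers `expr`, `term`, and `factor`, and represents the tree as nested tuples.

```python
import re


def parse(text):
    """Parse an expression into a nested tuple tree such as ("+", "a", ("*", "b", "60"))."""
    # Crude tokenizer for the sketch: characters it does not know are silently skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        # Rules (1) and (2): an identifier or a number; also the parenthesised form of rule (3).
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return advance()

    def term():
        # expression1 * expression2, grouped to the left.
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expression1 + expression2, grouped to the left.
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("initial + rate * 60"))
```

The tree for initial + rate ∗ 60 mirrors the inference on the slide: rate ∗ 60 is recognised as an expression first and then becomes the right operand of the sum.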
  • 22. Semantic Analysis • The semantic analysis phase checks the source program for semantic errors and gathers type information for the subsequent code-generation phase. • It uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements. • An important component of semantic analysis is type checking. • Here the compiler checks that each operator has operands that are permitted by the source language specification. • For example, many programming language definitions require a compiler to report an error every time a real number is used to index an array. • However, the language specification may permit some operand coercions, for example, when a binary arithmetic operator is applied to an integer and a real. • In this case, the compiler may need to convert the integer to a real.
  • 24. • The Phases of a Compiler
  • 25. • The six phases of compilation: lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation. • The six phases are divided into two groups: 1. Front end: works on the stream of tokens and the parse tree 2. Back end: dependent on the target, independent of the source code. Symbol-Table Management: • A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier. • The data structure allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly. • The compiler uses the symbol table to manage information about variables and their attributes.
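A symbol table as described here can be sketched with a hash map, which gives the quick lookup and update the slide asks for. The class name `SymbolTable`, its methods, and the attribute names `type` and `offset` are assumptions made for illustration; real compilers also handle nested scopes, which this sketch leaves out.

```python
class SymbolTable:
    """One record per identifier, each record holding that identifier's attributes."""

    def __init__(self):
        self._records = {}  # identifier name -> dict of attributes

    def insert(self, name, **attributes):
        """Create the record for name if needed, then store or update its attributes."""
        record = self._records.setdefault(name, {})
        record.update(attributes)
        return record

    def lookup(self, name):
        """Return the record for name, or None if the identifier is unknown."""
        return self._records.get(name)


table = SymbolTable()
table.insert("rate", type="real")  # e.g. recorded when the declaration is analysed
table.insert("rate", offset=8)     # e.g. filled in later by storage allocation
print(table.lookup("rate"))
```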
  • 26. Error Detection and Reporting • Each phase can encounter errors. • However, after detecting an error, a phase must somehow deal with that error, so that compilation can proceed, allowing further errors in the source program to be detected. • A compiler that stops when it finds the first error is not as helpful as it could be. • The syntax and semantic analysis phases usually handle a large fraction of the errors detectable by the compiler. • The lexical phase can detect errors where the characters remaining in the input do not form any token of the language. • Errors where the token stream violates the structure rules (syntax) of the language are determined by the syntax analysis phase.
  • 27. • Intermediate Code Generation: • An intermediate representation of the final machine language code is produced. • This phase bridges the analysis and synthesis phases of translation. • Code Optimization: • This is an optional phase designed to improve the intermediate code so that the output runs faster and takes less space. • Code Generation: • The last phase of translation is code generation. A number of optimizations to reduce the length of the machine language program are carried out during this phase. • The output of the code generator is the machine language program for the specified computer.
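As an illustration of intermediate code generation, the sketch below walks an expression tree and emits three-address code, one common intermediate representation. The slides do not name a specific representation, so the tuple-shaped tree, the temporary names t1, t2, ..., and the function name `generate` are all assumptions.

```python
def generate(tree):
    """Return (code, result) where code is a list of three-address instructions."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, str):
            return node  # a leaf: an identifier or a number
        op, left, right = node
        left_name = emit(left)    # operands are translated first ...
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"      # ... then the operator writes into a fresh temporary
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result


# Tree for: initial + rate * 60
code, result = generate(("+", "initial", ("*", "rate", "60")))
print("\n".join(code))
```

An optimizer would then work on these instructions, and the code generator would map each one to machine instructions for the target.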
  • 29. Compiler construction tools 1. Parser generators. 2. Scanner generators. 3. Syntax-directed translation engines. 4. Automatic code generators. 5. Data-flow analysis engines. 6. Compiler-construction toolkits. Parser Generators • Input: Grammatical description of a programming language Output: Syntax analyzers. • Parser generator takes the grammatical description of a programming language and produces a syntax analyzer.
  • 30. Scanner Generators • Input: Regular expression description of the tokens of a language. Output: Lexical analyzers. • A scanner generator generates lexical analyzers from a regular expression description of the tokens of a language. Syntax-directed Translation Engines • Input: Parse tree. Output: Intermediate code. • Syntax-directed translation engines produce collections of routines that walk a parse tree and generate intermediate code.
  • 31. Automatic Code Generators • Input: Intermediate language. Output: Machine language. • A code generator takes a collection of rules that define the translation of each operation of the intermediate language into the machine language for a target machine. Data-flow Analysis Engines • A data-flow analysis engine gathers information about how values are transmitted from one part of a program to each of the other parts. • Data-flow analysis is a key part of code optimization. Compiler Construction Toolkits • Compiler construction toolkits provide an integrated set of routines for constructing the various phases of a compiler.
  • 32. Cousins of Compiler • Interpreters: discussed in detail in the first lecture. • Preprocessors: They produce input for the compiler. They perform jobs such as deleting comments, including files, and expanding macros. • Assemblers: They are translators for assembly language. Sometimes the compiler generates assembly language in symbolic form and then hands it over to an assembler. • Linkers: Both compilers and assemblers rely on linkers to collect code that was separately compiled or assembled into object files and combine it into a file that is directly executable. • Loaders: They resolve all relocatable addresses relative to a base address.
  • 33. Applications of compiler technology 1. Implementation of High-Level Programming 2. Optimizations for Computer Architectures • Parallelism • Memory hierarchies 3. Design of New Computer Architectures • RISC • Specialized Architectures 4. Program Translations • Binary Translation • Hardware Synthesis • Database Query Interpreters • Compiled Simulation 5. Software Productivity Tools • Type Checking • Bounds Checking • Memory-Management Tools
  • 34. Complexity of compiler technology • A compiler is possibly the most complex system software • The complexity arises from the fact that it is required to map a programmer’s requirements (in a HLL program) to architectural details • It uses algorithms and techniques from a very large number of areas in computer science • Translates intricate theory into practice - enables tool building
  • 35. Compiler Algorithms Makes practical application of • Greedy algorithms - register allocation • Heuristic search - list scheduling • Graph algorithms - dead code elimination, register allocation • Dynamic programming - instruction selection • Optimization techniques - instruction scheduling • Finite automata - lexical analysis • Pushdown automata - parsing • Fixed point algorithms - data-flow analysis • Complex data structures - symbol tables, parse trees, data dependence graphs • Computer architecture - machine code generation
  • 36. Context Free Grammars • Grammars are used to describe the syntax of a programming language. A grammar specifies the structure of expressions and statements. • stmt -> if (expr) then stmt where stmt denotes statements and expr denotes expressions. Types of grammar • Type 0 grammar • Type 1 grammar • Type 2 grammar • Type 3 grammar • A context free grammar is also called a Type 2 grammar.
  • 37. A context free grammar G is defined by a four-tuple, G = (V, T, P, S), where • G - Grammar • V - Set of variables • T - Set of terminals • P - Set of productions • S - Start symbol • It produces a Context Free Language (CFL), which is the collection of strings of terminals derived from the start symbol of the grammar in one or more steps: $L(G) = \{\, w \in T^{*} \mid S \Rightarrow^{*} w \,\}$ where • L - Language • G - Grammar • w - Input string • S - Start symbol • T - Set of terminals
  • 38. Conventions Terminals are symbols from which strings are formed. • Lowercase letters, e.g., a, b, c. • Operators, e.g., +, -, *. • Punctuation symbols, e.g., comma, parenthesis. • Digits, i.e., 0, 1, 2, ..., 9. • Boldface strings, e.g., id, if. Non-terminals are syntactic variables that denote a set of strings. • Uppercase letters, e.g., A, B, C. • Lowercase italic names, e.g., expr, stmt.
  • 39. • Start symbol is the head of the production stated first in the grammar. • Production is of the form LHS ->RHS (or) head -> body, where head contains only one non-terminal and body contains a collection of terminals and non-terminals. • (eg.) Let G be,
  • 40. Context Free Grammars vs Regular Expressions • Grammars are more powerful than regular expressions. • Every construct that can be described by a regular expression can be described by a grammar, but not vice versa. • Every regular language is a context free language, but the reverse does not hold. (eg.) • RE = (a | b)*abb (set of strings ending with abb). • Grammar
  • 41. Syntax directed definition • A syntax directed definition specifies the values of attributes by associating semantic rules with the grammar productions. • It is a context free grammar together with attributes and rules, which are associated with grammar symbols and productions respectively. • The process of syntax directed translation is two-fold: • Construction of the syntax tree • Computing the values of attributes at each node by visiting the nodes of the syntax tree.
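To make the comparison concrete, the sketch below checks the slide's regular expression (a | b)*abb with Python's `re` module and, as a contrast, recognises a language that a grammar can describe but a regular expression cannot. The grammar S -> a S b | ε is my own illustrative choice, not the grammar from the slide, and the function names are hypothetical.

```python
import re

# The slide's regular expression: strings over {a, b} that end with abb.
ENDS_WITH_ABB = re.compile(r"(a|b)*abb")


def matches_regex(s):
    """True if the whole string s is described by (a|b)*abb."""
    return ENDS_WITH_ABB.fullmatch(s) is not None


def matches_grammar(s):
    """Recognise S -> a S b | empty, i.e. n a's followed by exactly n b's.

    Matching the counts needs unbounded memory, which is what puts this
    language beyond regular expressions but within context free grammars.
    """
    if s == "":
        return True
    return len(s) >= 2 and s[0] == "a" and s[-1] == "b" and matches_grammar(s[1:-1])


print(matches_regex("aababb"), matches_regex("abba"))
print(matches_grammar("aaabbb"), matches_grammar("aabbb"))
```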
  • 42. • Semantic actions • Semantic actions are fragments of code which are embedded within production bodies by syntax directed translation. • They are usually enclosed within curly braces ({ }). • It can occur anywhere in a production but usually at the end of production. • (eg.) • E---> E1 + T {print ‘+’}
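The effect of the action in E ---> E1 + T {print '+'} can be seen in a small infix-to-postfix translator. This is a sketch under assumptions: it adds a hypothetical production T ---> digit {print digit}, replaces the left recursion with a loop, and collects the "printed" symbols in a list so the result can be returned.

```python
def to_postfix(text):
    """Translate sums of single digits, e.g. "9 + 5 + 2", into postfix notation."""
    tokens = [ch for ch in text if not ch.isspace()]
    output = []
    pos = 0

    def term():
        # Assumed production T ---> digit {print digit}
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("digit expected")
        output.append(tokens[pos])
        pos += 1

    term()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        term()
        output.append("+")  # the action {print '+'} runs after E1 and T are processed

    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return " ".join(output)


print(to_postfix("9 + 5 + 2"))
```

Because the action sits at the end of the production body, the operator is emitted only after both operands, which is what produces postfix order.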
  • 43. Types of translation L-attributed translation • It performs translation during parsing itself. • No need of explicit tree construction. • L represents 'left to right'. • S-attributed translation • It is performed in connection with bottom up parsing. • 'S' represents synthesized.
  • 44. Types of attributes Inherited attributes • An inherited attribute is defined by the semantic rule associated with the production at the parent of the node. • Its value depends only on attribute values at the node's parent, its siblings, and the node itself. • The non-terminal concerned must be in the body of the production. Synthesized attributes • A synthesized attribute is defined by the semantic rule associated with the production at the node. • Its value depends only on attribute values at the node's children and the node itself. • The non-terminal concerned must be in the head of the production. • Terminals have synthesized attributes, which are the lexical values (denoted by lexval) generated by the lexical analyzer.
  • 45. Syntax directed definition of simple desk calculator
    Production | Semantic rules
    L ---> E n | L.val = E.val
    E ---> E1 + T | E.val = E1.val + T.val
    E ---> T | E.val = T.val
    T ---> T1 * F | T.val = T1.val × F.val
    T ---> F | T.val = F.val
    F ---> (E) | F.val = E.val
    F ---> digit | F.val = digit.lexval
  • 46. Syntax-directed definition-inherited attributes
    Production | Semantic rules
    D ---> T L | L.inh = T.type
    T ---> int | T.type = integer
    T ---> float | T.type = float
    L ---> L1 , id | L1.inh = L.inh; addType(id.entry, L.inh)
    L ---> id | addType(id.entry, L.inh)
    • Symbol T is associated with a synthesized attribute type. • Symbol L is associated with an inherited attribute inh.
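The flow of the inherited attribute in this definition can be sketched as follows. The function name `declare`, the dictionary standing in for the symbol table, and the simplified string handling are assumptions; the point is only that T.type is computed once and handed to every identifier in the list as L.inh.

```python
def declare(declaration):
    """Process a declaration such as "float a, b, c" and return name -> type."""
    type_name, _, names = declaration.strip().partition(" ")
    if type_name not in ("int", "float"):
        raise SyntaxError(f"unknown type {type_name!r}")

    # T ---> int | float : synthesized attribute T.type
    t_type = "integer" if type_name == "int" else "float"

    # D ---> T L : L.inh = T.type, then passed along the list of identifiers
    l_inh = t_type
    symtab = {}
    for name in names.split(","):
        name = name.strip()
        if not name:
            raise SyntaxError("identifier expected")
        symtab[name] = l_inh  # addType(id.entry, L.inh)
    return symtab


print(declare("float a, b, c"))
```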
  • 47. Types of Syntax Directed Definitions • S-attributed Definitions • A syntax directed definition that involves only synthesized attributes is called S-attributed. • Attribute values for the non-terminal at the head are computed from the attribute values of the symbols in the body of the production. • The attributes of an S-attributed SDD can be evaluated in bottom-up order of the nodes of the parse tree, • i.e., by performing a post-order traversal of the parse tree and • evaluating the attributes at a node when the traversal leaves that node for the last time.
  • 48. Production | Semantic rules
    L ---> E n | L.val = E.val
    E ---> E1 + T | E.val = E1.val + T.val
    E ---> T | E.val = T.val
    T ---> T1 * F | T.val = T1.val × F.val
    T ---> F | T.val = F.val
    F ---> (E) | F.val = E.val
    F ---> digit | F.val = digit.lexval
  • 49. • L-attributed Definitions • A syntax directed definition in which the edges of the dependency graph for the attributes in a production body can go from left to right but not from right to left is called an L-attributed definition. • Attributes of L-attributed definitions may be either synthesized or inherited. • If an attribute is inherited, it must be computed from: • Inherited attributes associated with the production head. • Inherited or synthesized attributes associated with the symbols located to the left of the attribute being computed. • Inherited or synthesized attributes associated with the symbol under consideration itself, in such a way that no cycles are formed in the dependency graph.
  • 50. Production | Semantic rules
    T ---> F T' | T'.inh = F.val
    T' ---> * F T1' | T1'.inh = T'.inh × F.val
    In production 1, the inherited attribute T'.inh is computed from the value of F, which is to its left. In production 2, the inherited attribute T1'.inh is computed from T'.inh, associated with its head, and the value of F, which appears to its left in the production. That is, an inherited attribute must be computed using information from above or from the left in the SDD.
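Bottom-up evaluation of this S-attributed definition amounts to a post-order traversal. The sketch below is a simplified illustration: it assumes the parse tree has already been reduced to nested tuples (the chain productions E ---> T, T ---> F, and F ---> (E) simply pass val through), and the node shapes and the name `evaluate` are hypothetical.

```python
def evaluate(node):
    """Compute the synthesized attribute val of a node by post-order traversal."""
    kind = node[0]
    if kind == "digit":
        return node[1]           # F.val = digit.lexval
    left = evaluate(node[1])     # children are evaluated first ...
    right = evaluate(node[2])
    if kind == "+":
        return left + right      # E.val = E1.val + T.val
    if kind == "*":
        return left * right      # T.val = T1.val × F.val
    raise ValueError(f"unknown node kind {kind!r}")


# Tree for the input 3 * 5 + 4 followed by the end marker n
tree = ("+", ("*", ("digit", 3), ("digit", 5)), ("digit", 4))
print(evaluate(tree))  # L.val = E.val
```

Each node's val is available exactly when the traversal leaves the node for the last time, which is the evaluation order described for S-attributed definitions.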
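The two rules above can be evaluated during a single left-to-right parse, with the inherited attribute passed down as a function argument. The slide shows only the inh rules; the sketch additionally assumes the usual completion with an empty production for T' that returns the accumulated value, and the names `parse_T`, `F`, and `T_prime` are hypothetical.

```python
def parse_T(tokens):
    """Evaluate a product such as ["3", "*", "5", "*", "2"] with an L-attributed scheme."""
    pos = 0

    def F():
        # F ---> digit, with F.val = digit.lexval
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("digit expected")
        val = int(tokens[pos])
        pos += 1
        return val

    def T_prime(inh):
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            f_val = F()
            return T_prime(inh * f_val)  # T1'.inh = T'.inh × F.val
        return inh                       # assumed rule for the empty production

    val = T_prime(F())                   # T'.inh = F.val
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return val


print(parse_T(["3", "*", "5", "*", "2"]))
```

Every value passed into `T_prime` comes either from the caller (information from above) or from an `F` already parsed to its left, which is the restriction that makes the definition L-attributed.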