Introduction to Compilers
M.Vijaya
Assistant Professor
Dept. of CSE, VJIT
What is a Compiler?
A compiler is a translator program that takes a program
written in a high-level language (HLL), the source program, and translates it into an
equivalent program in a machine-level language (MLL), the target program. An important
part of a compiler's job is reporting errors to the programmer.
“Interpretation”
Performing the operations implied by the source program
The Analysis-Synthesis Model of Compilation
There are two parts to compilation:
⚫Analysis determines the operations implied by the
source program which are recorded in a tree structure
⚫Synthesis takes the tree structure and translates the
operations therein into the target program
Preprocessors, Compilers,
Assemblers, and Linkers
There are two parts to compilation: analysis and
synthesis.
⚫The analysis part breaks up the source program into
pieces and creates an intermediate representation of
the source program.
⚫The synthesis part constructs the desired target
program from the intermediate representation.
The Grouping of Phases
Compiler front and back ends:
Front end: analysis (machine independent)
Back end: synthesis (machine dependent)
Compiler passes: A collection of phases is done only once
(single pass) or multiple times (multi pass)
Single pass: usually requires everything to be defined
before being used in the source program
Multi pass: the compiler may have to keep the entire program
representation in memory
Phases of Compiler
⚫ A compiler operates in phases.
⚫ Each phase transforms the source program from one representation
to another.
Six phases:
– Lexical Analyser
– Syntax Analyser
– Semantic Analyser
– Intermediate code generation
– Code optimization
– Code Generation
• Symbol table and error handling interact with the six phases.
• Some of the phases may be grouped together.
These phases are illustrated by considering the following statement:
position := initial + rate * 60
Lexical Analysis (Scanning)
Reads the characters of the source program and groups them into
words, called lexemes (the basic units of syntax).
Produces the words and recognises what sort they are.
The output for each lexeme is called a token and is a pair of the form
<token_name, attribute value>
Eg: position := initial + rate * 60
The lexical analyser keeps a symbol table.
Lexical analysis eliminates white space.
Speed is important, so a specialised tool is often used: e.g., flex, a tool for
generating scanners (programs which recognise lexical patterns in text).
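The token pairs described above can be sketched in a few lines of Python. This is only an illustration, not what flex generates: the token names (`id`, `number`, `assign`, `op`) and the function `tokenize` are assumptions made for this example, and the attribute of an `id` token is taken to be its 1-based position in the symbol table.

```python
import re

# Hypothetical token names; the slides only fix the <token_name, attribute> shape.
TOKEN_SPEC = [
    ("number", r"[0-9]+"),
    ("id", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("assign", r":="),
    ("op", r"[+\-*/]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table) for one line of source text."""
    symbol_table = []   # identifiers in order of first appearance
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"lexical error at position {pos}: {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":          # white space is eliminated
            continue
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((kind, lexeme))
    return tokens, symbol_table

tokens, table = tokenize("position := initial + rate * 60")
```

Here the three identifiers become `("id", 1)`, `("id", 2)` and `("id", 3)`, and the symbol table holds their names in that order.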
Syntax Analysis
⚫Syntax (or syntactic) Analysis (Parsing)
⚫Imposes a hierarchical structure on the token stream.
⚫Uses the output of the lexical analyzer to create a tree-like
structure of the token stream: the syntax tree.
⚫The tree shows the order in which operations are
performed.
⚫The ordering is based on the precedence of the mathematical
operations.
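As a rough sketch of how precedence shapes the tree, the following Python parser handles only `:=`, `+` and `*` over a list of token strings. The grammar, the nested-tuple tree shape and the function names are assumptions made for this illustration; real parsers are usually generated or far more general.

```python
def parse_expression(tokens):
    """Build a nested-tuple syntax tree (op, left, right) from token strings."""
    pos = 0

    def parse_term():
        # term -> operand ('*' operand)*   ('*' binds tighter than '+')
        nonlocal pos
        node = tokens[pos]
        pos += 1
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            right = tokens[pos]
            pos += 1
            node = ("*", node, right)
        return node

    # expr -> term ('+' term)*
    node = parse_term()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        node = ("+", node, parse_term())
    return node

def parse_assignment(tokens):
    # stmt -> id ':=' expr
    target, assign, *rest = tokens
    if assign != ":=":
        raise SyntaxError("expected ':='")
    return (":=", target, parse_expression(rest))

tree = parse_assignment("position := initial + rate * 60".split())
```

Because `*` is handled one level deeper than `+`, the multiplication `rate * 60` ends up lower in the tree and is therefore performed first.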
Semantic Analysis :
•Collects context (semantic) information, checks for semantic
errors, and annotates nodes of the tree with the results.
•Ensures that the components of a program fit together
meaningfully.
•Also gathers type information and saves it either in the syntax tree or
the symbol table.
Intermediate Code Generation:-
An intermediate representation of the final machine
language code is produced. This phase bridges the analysis
and synthesis phases of translation.
The intermediate representation should have two
important properties.
⚫It should be easy to produce.
⚫It should be easy to translate into target machine code.
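One common intermediate representation is three-address code. The sketch below, which is an assumption of this example rather than something the slides specify, flattens a nested-tuple syntax tree `(op, left, right)` into such instructions; the temporary names `t1`, `t2` and the function name are hypothetical.

```python
def to_three_address(tree):
    """Flatten an assignment tree (':=', target, expr) into three-address code."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, str):      # leaf: a name or a constant
            return node
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"           # fresh temporary for this operation
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    code.append(f"{target} = {emit(expr)}")
    return code

tree = (":=", "position", ("+", "initial", ("*", "rate", "60")))
for line in to_three_address(tree):
    print(line)
```

Each instruction has at most one operator, which is what makes this form easy to produce from a tree and easy to map onto machine instructions.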
Code Optimization:-
•The goal is to improve the intermediate code so that code generation is more
effective and the target code performs better.
• The result is faster or shorter code, or code that consumes less power.
Code Generation:-
⚫The last phase of translation is code generation.
⚫A number of optimizations that reduce the length of the
machine language program are carried out during this
phase. The output of the code generator is the machine
language program for the specified computer.
⚫Target, machine-specific properties may be used to
optimize the code.
⚫ Finally, machine code and associated information
required by the Operating System are generated.
Symbol Table Management
This records the variable names used in the source program.
This stores the attributes of each name.
Eg: name, its type, its scope
Method of passing each argument (by value or by reference)
Return type
A symbol table can be implemented as either a linear list or a hash
table.
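A hash-table symbol table can be sketched with a Python dictionary. The class name, the `(scope, name)` key and the fallback to a `"global"` scope are assumptions made for this illustration; the slides only say which attributes are stored.

```python
class SymbolTable:
    """Hash-table symbol table keyed by (scope, name); a sketch only."""

    def __init__(self):
        self._entries = {}

    def declare(self, name, type_, scope="global"):
        key = (scope, name)
        if key in self._entries:
            # Corresponds to the "multiple declared identifiers" error.
            raise KeyError(f"multiple declaration of {name!r} in scope {scope!r}")
        self._entries[key] = {"name": name, "type": type_, "scope": scope}
        return self._entries[key]

    def lookup(self, name, scope="global"):
        # Try the given scope first, then fall back to the global scope.
        return self._entries.get((scope, name)) or self._entries.get(("global", name))

table = SymbolTable()
table.declare("rate", "float")
entry = table.lookup("rate", scope="main")   # found through the global fallback
```

A linear list would support the same two operations, but each lookup would scan the entries one by one instead of hashing the key.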
Error Handling Routine: During compilation, errors may
occur in any of the phases below:
Lexical analyzer: Wrongly spelled tokens
Syntax analyzer: Missing parenthesis
Intermediate code generator: Mismatched operands for an operator
Code Optimizer: When the statement is not reachable
Code Generator: Unreachable statements
Symbol tables: Error of multiple declared identifiers
Front end
Front end of a compiler consists of the phases
• Lexical analysis.
• Syntax analysis.
• Semantic analysis.
• Intermediate code generation.
Back end
Back end of a compiler contains
• Code optimization.
• Code generation.
Front End
• The front end comprises the phases which are dependent on the
input (source language)
and independent of the target machine (target language).
• It includes lexical and syntactic analysis, symbol table
management, semantic analysis
and the generation of intermediate code.
• Code optimization can also be done by the front end.
• It also includes error handling at the phases concerned.
Back End
• The back end comprises those phases of the compiler that are
dependent on the target machine and independent of the source
language.
• This includes code optimization and code generation.
• In addition to this, it also encompasses error handling and symbol
table management operations.
Passes
A compiler pass is one traversal of the entire program by the
compiler. Compilers are of two types:
•Single Pass Compiler
•Two Pass Compiler or Multi Pass Compiler.
1. Single Pass Compiler:
If all the phases of the compiler are grouped into
a single module, it is known as a single pass compiler.
⚫In a single pass compiler all 6 phases are grouped in a
single module. Some points about single pass compilers:
⚫A one pass/single pass compiler is a compiler
that passes through each compilation unit
exactly once.
⚫A single pass compiler is faster and smaller than a multi
pass compiler.
⚫A disadvantage of a single pass compiler is that the code it
produces is usually less efficient than that of a multi pass compiler.
⚫A single pass compiler processes the
input exactly once, going directly from lexical analysis
to code generation, and then going back for the next read.
⚫In a two pass compiler, the first pass is referred to as the
(a). Front end
(b). Analytic part
(c). Platform independent part
⚫The second pass is referred to as the
(a). Back end
(b). Synthesis part
(c). Platform dependent part
Compiler vs interpreter
Compiler | Interpreter
Scans the entire program first and translates it into machine code | Scans the program line by line and executes each line as it is read
Debugging is slow | Debugging is faster
Errors are reported after scanning the whole program | Errors are reported after scanning each line
Shows all errors and warnings at the same time | Shows one error at a time
Execution time is less | Execution time is more
Used by languages such as C, C++ etc. | Used by languages such as Python etc.; Java is compiled to bytecode, which a virtual machine then runs
The role of Lexical Analyzer
⚫ 1) Lexical analyzer functions (or) role of Lexical Analyzer
⚫ 2) Interaction between LA and SA
⚫ 3) Tokens, lexemes and patterns
⚫ 1) Lexical analyzer functions (or) role of Lexical Analyzer
a) It provides a stream of tokens
b) It removes comments from the source program (SP)
c) It removes white space characters such as blank spaces, tabs
and newline characters.
d) It counts the line numbers of the source program
e) It generates a symbol table
f) It provides error messages
2)Interaction between LA and SA
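⚫ 3) Tokens, lexemes and patterns
⚫ Token: a pair of components of the form
⚫ <token name, attribute value>
⚫ Token name: identifier/variable, operator, keyword or constant
⚫ Attribute value: a pointer which points to the symbol table entry
⚫ Lexeme: a sequence of characters in the source program that matches the pattern of a token
⚫ Pattern: a rule, usually written as a regular expression, that describes the lexemes of a token
⚫ An identifier matches (l/_)(l/d/_)*, where l is a letter, d is a digit and / means "or":
[a-zA-Z_][a-zA-Z0-9_]*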
⚫Lexemes and tokens for a=b+c;
Lexeme | Token
a | <id,1>
= | <assign> or <=>
b | <id,2>
+ | <+>
c | <id,3>
; |
⚫Symbol table
Entry | Value of identifier
id1 |
id2 |
id3 |
Regular Expressions
⚫Regular expressions are used to denote regular
languages. An expression is regular if it is built by the following rules:
⚫ɸ is a regular expression for the regular language ɸ (the empty language).
⚫ɛ is a regular expression for the regular language {ɛ}.
⚫If a ∈ Σ, a is a regular expression with language {a}.
⚫If a and b are regular expressions, a + b is also a regular
expression, whose language is the union of the languages of a and b (for two symbols a and b, {a,b}).
⚫If a and b are regular expressions, ab (concatenation of a
and b) is also a regular expression.
⚫If a is a regular expression, a* (a repeated 0 or more times) is also
a regular expression.
Operations
The various operations on languages are:
⚫Union of two languages L and M is written as
⚫L U M = {s | s is in L or s is in M}
⚫Concatenation of two languages L and M is written as
⚫LM = {st | s is in L and t is in M}
⚫The Kleene closure of a language L is written as
⚫L* = the set of strings formed by concatenating zero or more strings of L.
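The three operations can be tried out on small finite languages represented as Python sets of strings. This is a minimal sketch: the function names are made up for this example, and because L* is infinite whenever L contains a non-empty string, `kleene_closure` only builds the strings made from a bounded number of pieces.

```python
def union(L, M):
    # L U M = {s | s is in L or s is in M}
    return L | M

def concat(L, M):
    # LM = {st | s is in L and t is in M}
    return {s + t for s in L for t in M}

def kleene_closure(L, max_repeats):
    """Finite approximation of L*: strings built from at most max_repeats pieces of L."""
    result = {""}          # zero occurrences gives the empty string
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, L)
        result |= layer
    return result

print(sorted(union({"a"}, {"b"})))
print(sorted(concat({"a", "ab"}, {"b"})))
print(sorted(kleene_closure({"a"}, 2)))
```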
⚫Types of Grammars-
⚫Grammars are classified on different basis as-
Finite automata
⚫A finite automaton is a state machine that takes a string of
symbols as input and changes its state accordingly.
A finite automaton is a recognizer for the language of a regular expression.
⚫The mathematical model of a finite automaton consists of:
⚫Finite set of states (Q)
⚫Finite set of input symbols (Σ)
⚫One start state (q0)
⚫Set of final states (qf)
⚫Transition function (δ)
DFA refers to deterministic
finite automata. ... In a DFA,
there is only one path for a specific
input from the current state to
the next state.
A finite automaton is
called an NFA (nondeterministic finite automaton) when there
may be many paths for a specific input from the
current state to the next state.
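The DFA model can be simulated directly from its transition function. The sketch below is an assumed example, not taken from the slides: a two-state DFA for identifiers, where the state names `q0` and `q1`, the character classes and the function names are all chosen for illustration. A missing table entry plays the role of a dead state.

```python
def classify(ch):
    """Map a character to an input class of the DFA (ASCII only)."""
    if ch == "_" or (ch.isascii() and ch.isalpha()):
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Transition function delta as a table: (state, input class) -> next state.
DELTA = {
    ("q0", "letter"): "q1",
    ("q1", "letter"): "q1",
    ("q1", "digit"): "q1",
}
START = "q0"
FINAL = {"q1"}

def accepts(text):
    state = START
    for ch in text:
        state = DELTA.get((state, classify(ch)))
        if state is None:      # no transition: reject (implicit dead state)
            return False
    return state in FINAL

print(accepts("a123"), accepts("1xab"))
```

Since each `(state, input)` pair has at most one entry, there is only one path to follow, which is what makes the automaton deterministic.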
Steps for converting NFA with ε to DFA:
⚫Step 1: We will take the ε-closure for the starting state
of NFA as a starting state of DFA.
⚫Step 2: Find the states for each input symbol that can
be traversed from the present. That means the union
of transition value and their closures for each state of
NFA present in the current state of DFA.
⚫Step 3: If we found a new state, take it as current state
and repeat step 2.
⚫Step 4: Repeat Step 2 and Step 3 until there is no new
state present in the transition table of DFA.
⚫Step 5: Mark the states of DFA as a final state which
contains the final state of NFA.
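The five steps above can be sketched as a worklist algorithm in Python. The representation is an assumption of this example: `delta` maps `(state, symbol)` to a set of NFA states, `eps` maps a state to the set of states reachable by one ε-move, and each DFA state is a `frozenset` of NFA states. The small NFA at the end is made up for the demonstration.

```python
def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def nfa_to_dfa(start, finals, delta, eps, alphabet):
    start_set = epsilon_closure({start}, eps)            # Step 1
    dfa_delta, seen, worklist = {}, {start_set}, [start_set]
    while worklist:                                      # Steps 3 and 4
        current = worklist.pop()
        for symbol in alphabet:                          # Step 2
            moved = set()
            for s in current:
                moved |= delta.get((s, symbol), set())
            target = epsilon_closure(moved, eps)
            if not target:
                continue                                 # no transition on this symbol
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    dfa_finals = {S for S in seen if S & finals}         # Step 5
    return start_set, dfa_finals, dfa_delta

# Hypothetical NFA for a*b: 0 --eps--> 1, 1 --a--> 1, 1 --b--> 2 (final).
eps = {0: {1}}
delta = {(1, "a"): {1}, (1, "b"): {2}}
dfa_start, dfa_finals, dfa_delta = nfa_to_dfa(0, {2}, delta, eps, "ab")
```

For this NFA the DFA start state is {0, 1}, and the only final DFA state is {2}.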
1. Convert the given NFA into its equivalent DFA.
2. Construct an NFA from the regular expression r = (a/b)*abb(a/b).
Input Buffering in Compiler Design
⚫The lexical analyzer scans the input from left to right
one character at a time. It uses two pointers, the begin
pointer (bp) and the forward pointer (fp), to keep track of the
portion of the input scanned.
⚫Initially both pointers point to the first character
of the input string.
⚫The forward pointer moves ahead to search for the end of the
lexeme. As soon as a blank space is encountered, it
indicates the end of the lexeme. In the example, as soon as
fp encounters a blank space the lexeme “int” is
identified.
⚫When fp encounters white space, it ignores it and
moves ahead. Then both the begin pointer (bp) and
the forward pointer (fp) are set at the start of the
next token.
⚫One Buffer Scheme
⚫Two buffer scheme
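The begin/forward pointer movement can be sketched on an in-memory string. This ignores buffering entirely (no one-buffer or two-buffer scheme) and, like the description above, simplifies by treating white space as the only lexeme boundary; the function name is an assumption of this example.

```python
def scan_lexemes(text):
    """Split text into whitespace-separated lexemes using begin/forward pointers."""
    lexemes = []
    bp = fp = 0                     # both pointers start at the first character
    n = len(text)
    while bp < n:
        if text[bp].isspace():      # skip white space between lexemes
            bp += 1
            fp = bp
            continue
        while fp < n and not text[fp].isspace():
            fp += 1                 # forward pointer searches for the lexeme end
        lexemes.append(text[bp:fp])
        bp = fp                     # both pointers restart after the lexeme
    return lexemes

print(scan_lexemes("int a , b ;"))
```

A real scanner ends a lexeme when the pattern stops matching rather than only at white space, so that input such as `a=b+c;` is still split correctly.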
Regular expressions
⚫Regular expressions are a notation to
represent lexeme patterns for a token.
⚫They are used to represent the language
for lexical analyzer.
⚫They assist in finding the type of token
that accounts for a particular lexeme.
Regular Expression for identifiers
⚫ Ex: a
⚫ Aa
⚫ a123
1)Ex: 9
2)Ex :123
3)Ex: .25 /e
4)E+2/e-2/e2 /e
5)6.33E4
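Regular expressions like these can be used to decide which token a lexeme belongs to. In the sketch below the identifier pattern is the one given earlier in the slides, while the number pattern (digits, optional fraction, optional exponent) is an assumption reconstructed from the examples; the function name is hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
# Assumed number pattern: digits, optional fraction, optional exponent.
NUMBER = re.compile(r"[0-9]+(\.[0-9]+)?(E[+-]?[0-9]+)?")

def classify_lexeme(lexeme):
    """Return the token type whose pattern matches the whole lexeme."""
    if IDENTIFIER.fullmatch(lexeme):
        return "identifier"
    if NUMBER.fullmatch(lexeme):
        return "number"
    return "lexical error"

for lexeme in ["a", "Aa", "a123", "9", "123", "6.33E4", "1xab"]:
    print(lexeme, "->", classify_lexeme(lexeme))
```

A lexeme such as `1xab` matches neither pattern in full, so it is reported as a lexical error.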
Lex-lexical analyzer generator
⚫ Lex is a program that generates lexical analyzers. It is used with
the YACC parser generator.
⚫ A lexical analyzer is a program that transforms an input
stream into a sequence of tokens.
⚫ Lex reads a specification of the lexical analyzer and produces
C source code that implements it.
The function of Lex is as follows:
⚫ First, a specification of the lexical analyzer is written as a program lex.l
in the Lex language. Then the Lex compiler processes lex.l and
produces a C program lex.yy.c.
⚫ Finally, the C compiler compiles lex.yy.c and produces an
object program a.out.
⚫ a.out is the lexical analyzer that transforms an input stream into a
sequence of tokens.
Lex file format
A Lex program is separated into three sections by %%
delimiters. The format of a Lex source file is as follows:
{ definitions } //declaration part
%%
{ rules } //rule part pattern {action}
%%
{ user subroutines } //auxiliary part
⚫Definitions include declarations of constants, variables and
regular definitions.
⚫Rules are statements of the form p1 {action1} p2
{action2} ... pn {actionn}.
⚫Each pi is a regular expression,
and actioni describes what the lexical
analyzer should do when pattern pi matches a lexeme.
⚫User subroutines are auxiliary procedures needed by the
actions. They can be compiled separately and loaded
with the lexical analyzer.
%option noyywrap
%{
#include <stdio.h>
%}
%%
^[a-zA-Z_][a-zA-Z0-9_]*  { printf("Valid Identifier"); }
%%
int main(void)
{
    yylex();
    return 0;
}
/* case conversion: swap upper-case and lower-case letters */
%option noyywrap
lower [a-z]
CAPS [A-Z]
space [ \t\n]
%%
{lower} { printf("%c", yytext[0] - 32); }
{CAPS}  { printf("%c", yytext[0] + 32); }
{space} ECHO;
.       ECHO;
%%
int main(void)
{
    yylex();
    return 0;
}
/* 1. program to identify numbers, letters, operators and special symbols */
%option noyywrap
digit [0-9]
letter [a-zA-Z]
ops [-+*/=%]
sp_sym [^a-zA-Z0-9+\-*/=% \t\n]
%%
{letter}+ { printf("%s is a string\n", yytext); }
{digit}+  { printf("%s is a digit\n", yytext); }
{ops}+    { printf("%s is an operator\n", yytext); }
{sp_sym}+ { printf("%s is a special symbol\n", yytext); }
%%
int main(void)
{
    yylex();
    return 0;
}
Lexical Error
⚫This type of error is detected during the lexical analysis
phase.
⚫A lexical error is a sequence of characters that does not
match the pattern of any token. Lexical phase errors are
found while the source program is being scanned, not while the compiled program runs.
Lexical phase errors can be:
⚫Spelling error.
⚫Unmatched string.
⚫Appearance of illegal characters.
⚫Exceeding length of identifier.
In this code, 1xab is neither a number nor
an identifier. So this code will show the
lexical error.

Introduction to Compilers

  • 1. Introduction to Compilers M.Vijaya Assistant Professor Dept. of CSE, VJIT
  • 2. What is Compiler? Compiler is a translator program that translates a program written in (HLL) the source program and translates it into an equivalent program in (MLL) the target program. As an important part of a compiler is error showing to the programmer.
  • 3. “Interpretation” Performing the operations implied by the source program
  • 4. The Analysis-Synthesis Model of Compilation There are two parts to compilation: ⚫Analysis determines the operations implied by the source program which are recorded in a tree structure ⚫Synthesis takes the tree structure and translates the operations therein into the target program
  • 6. There are two parts to compilation: analysis and synthesis. ⚫The analysis part breaks up the source program into pieces and creates an intermediate representation of the source program. ⚫The synthesis part constructs the desired target program from the intermediate representation.
  • 7.
  • 8. The Grouping of Phases Compiler front and back ends: Compiler passes: Front end: analysis (machine independent) Back end: synthesis (machine dependent) Compiler passes: A collection of phases is done only once (single pass) or multiple times (multi pass) Single pass: usually requires everything to be defined before being used in source program Multi pass: compiler may have to keep entire program representation in memory
  • 10. Phases of Compiler ⚫ Compiler operates in phases. ⚫ Each phase transforms the source program from one representation to another. Six phases: – Lexical Analyser – Syntax Analyser – Semantic Analyser – Intermediate code generation – Code optimization – Code Generation • Symbol table and error handling interact with the six phases. • Some of the phases may be grouped together. These phases are illustrated by considering the following statement: position := initial + rate * 60
  • 11. Lexical Analysis (Scanning) Reads characters in the source program and groups them into words - lexeme (basic unit of syntax) Produces words and recognises what sort they are. The output is called token and is a pair of the form <token_name, attribute value> Eg: Position = initial + rate * 60 keep a symbol table. Lexical analysis eliminates white spaces Speed is important - use a specialised tool: e.g., flex - a tool for generating scanners: programs which recognise lexical patterns in text;
  • 12. Syntax Analysis ⚫Syntax (or syntactic) Analysis (Parsing) ⚫Imposes a hierarchical structure on the token stream. ⚫Use output of lexical analyzer to create a tree like structure of the token stream – syntax tree. ⚫The tree shows the order in which operations are performed ⚫Ordering based on the precedence of mathematical operations
  • 13. Semantic Analysis : Semantic Analysis : •Collects context (semantic) information, checks for semantic errors, and annotates nodes of the tree with the results. •Ensures that component of a program fit together meaningfully. Also gathers type information and saves either in syntax tree or symbol table.
  • 14. Intermediate Code Generations:- An intermediate representation of the final machine language code is produced. This phase bridges the analysis and synthesis phases of translation. intermediate representation should have two important properties. ⚫It should be easy to produce. ⚫It should be easy to translate into target machine code.
  • 15. Code Optimization:- •The goal is to improve the intermediate code and, thus, the effectiveness of code generation and the performance of the target code get improved. • Faster, shorter code and even that consumes less power.
  • 16. Code Generation:- ⚫The last phase of translation is code generation. ⚫A number of optimizations to Reduce the length of machine language program are carried out during this phase. The output of the code generator is the machine language program of the specified computer. ⚫Target, machine-specific properties may be used to optimize the code. ⚫ Finally, machine code and associated information required by the Operating System are generated.
  • 17. Symbol Table Management This record variable name used in the source p This stores attributes of each name Eg: name, its type, its scope Method of passing each argument(by value or by reference) Return type Implementation of symbol table can be done is either linear list or hash table. Error Handling Routine: In the compiler design process error may occur in all the below-given phases: Lexical analyzer: Wrongly spelled tokens Syntax analyzer: Missing parenthesis Intermediate code generator: Mismatched operands for an operator Code Optimizer: When the statement is not reachable Code Generator: Unreachable statements Symbol tables: Error of multiple declared identifiers
  • 18.
  • 19. Front end Front end of a compiler consists of the phases • Lexical analysis. • Syntax analysis. • Semantic analysis. • Intermediate code generation. Back end Back end of a compiler contains • Code optimization. • Code generation. Front End • Front end comprises of phases which are dependent on the input (source language) and independent on the target machine (target language). • It includes lexical and syntactic analysis, symbol table management, semantic analysis and the generation of intermediate code. • Code optimization can also be done by the front end. • It also includes error handling at the phases concerned.
  • 20. Back End • Back end comprises of those phases of the compiler that are dependent on the target machine and independent on the source language. • This includes code optimization, code generation. • In addition to this, it also encompasses error handling and symbol table management operations.
  • 21. Passes A Compiler pass refers to the traversal of a compiler through the entire program. Compiler pass are two types: •Single Pass Compiler •Two Pass Compiler or Multi Pass Compiler. 1. Single Pass Compiler: If we combine or group all the phases of compiler design in a single module known as single pass compiler.
  • 22. In the diagram above, all six phases are grouped in a single module. Some points about single-pass compilers:
    ⚫ A one-pass (single-pass) compiler passes through each part of each compilation unit exactly once.
    ⚫ A single-pass compiler is faster and smaller than a multi-pass compiler.
    ⚫ Its disadvantage is that the code it generates is typically less efficient than that of a multi-pass compiler.
    ⚫ A single-pass compiler processes the input exactly once, going directly from lexical analysis to code generation for each piece of input and then going back to read the next piece.
  • 23. ⚫ First pass: referred to as (a) the front end, (b) the analysis part, (c) platform independent.
    ⚫ Second pass: referred to as (a) the back end, (b) the synthesis part, (c) platform dependent.
  • 24. Compiler vs. interpreter
    Compiler | Interpreter
    It scans the entire program first and translates it into machine code. | It scans the program line by line, translating and executing each line in turn.
    Debugging is slower. | Debugging is faster.
    Errors are reported after the whole program has been scanned. | Errors are reported as each line is scanned.
    It shows all errors and warnings at the same time. | It shows one error at a time.
    Execution time is shorter. | Execution time is longer.
    Compilers are used by languages such as C and C++. | Interpreters are used by languages such as Python (and by Java for its bytecode).
  • 25. The role of the Lexical Analyzer
    ⚫ 1) Functions (role) of the lexical analyzer
    ⚫ 2) Interaction between the lexical analyzer (LA) and the syntax analyzer (SA)
    ⚫ 3) Tokens, lexemes and patterns
    1) Functions (role) of the lexical analyzer:
      a) It provides a stream of tokens.
      b) It removes comments from the source program (SP).
      c) It removes whitespace characters such as blanks, tabs and newline characters.
      d) It keeps count of the line numbers of the source program.
      e) It builds the symbol table by entering identifiers into it.
      f) It reports error messages.
  • 26. The role of the Lexical Analyzer 2) Interaction between the lexical analyzer (LA) and the syntax analyzer (SA)
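  • 27. 3) Tokens, lexemes and patterns
    ⚫ Token: a pair of components, <token name, attribute value>.
    ⚫ Token name: identifier (variable), operator, keyword or constant.
    ⚫ Attribute value: a pointer to the entry for the token in the symbol table.
    ⚫ Lexeme: the sequence of characters in the source program that forms an instance of a token.
    ⚫ Pattern: a description, usually a regular expression, of the form the lexemes of a token may take.
    ⚫ Identifier pattern: (l | _)(l | d | _)*, where l is a letter and d is a digit, i.e. [a-zA-Z_][a-zA-Z0-9_]*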
  • 28. Lexemes for the statement a=b+c;
    Lexeme | Token
    a | <id,1>
    = | <assign>
    b | <id,2>
    + | <+>
    c | <id,3>
    ; | (none listed)
    Symbol table: entries id1, id2 and id3, each holding the value of the identifier.
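The lexeme-to-token mapping for a=b+c; can be reproduced with a small scanner. This is a sketch under assumptions: the function name `tokenize`, the tuple representation of tokens, and the use of a plain list as the symbol table are illustrative choices, and only the operators that appear in the example are handled.

```python
import re

# Identifier pattern from the slides: a letter or underscore,
# followed by letters, digits or underscores.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def tokenize(source):
    """Return (tokens, symbol_table) for a tiny assignment language."""
    symbol_table = []  # position i holds the lexeme behind <id, i+1>
    tokens = []
    pos = 0
    while pos < len(source):
        ch = source[pos]
        if ch.isspace():
            pos += 1
            continue
        match = IDENTIFIER.match(source, pos)
        if match:
            lexeme = match.group()
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # The attribute value points at the symbol table entry.
            tokens.append(("id", symbol_table.index(lexeme) + 1))
            pos = match.end()
        elif ch == "=":
            tokens.append(("assign", None))
            pos += 1
        elif ch in "+;":
            tokens.append((ch, None))
            pos += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    return tokens, symbol_table
```

Calling `tokenize("a=b+c;")` yields the identifier tokens `("id", 1)`, `("id", 2)` and `("id", 3)`, with the symbol table holding `a`, `b` and `c` in that order.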
  • 29. Regular Expressions
    ⚫ Regular expressions are used to denote regular languages. An expression is regular if it is built by the following rules:
    ⚫ ∅ is a regular expression denoting the empty language ∅.
    ⚫ ε is a regular expression denoting the language {ε}.
    ⚫ If a ∈ Σ, then a is a regular expression denoting the language {a}.
    ⚫ If a and b are regular expressions, then a + b is also a regular expression, denoting the union of their languages (for single symbols a and b, the language {a, b}).
    ⚫ If a and b are regular expressions, then ab (the concatenation of a and b) is also a regular expression.
    ⚫ If a is a regular expression, then a* (zero or more repetitions of a) is also a regular expression.
  • 30. Operations: the various operations on languages are:
    ⚫ The union of two languages L and M is written as L ∪ M = {s | s is in L or s is in M}.
    ⚫ The concatenation of two languages L and M is written as LM = {st | s is in L and t is in M}.
    ⚫ The Kleene closure of a language L is written as L* and denotes zero or more concatenations of L.
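These three operations can be tried out on small finite languages represented as Python sets of strings. This is only a sketch: the function names are assumptions, and because L* is infinite for any language containing a non-empty string, `kleene_closure` takes a hypothetical `max_power` bound and returns only the strings built from at most that many concatenations.

```python
def union(L, M):
    """L ∪ M = {s | s is in L or s is in M}."""
    return L | M

def concat(L, M):
    """LM = {st | s is in L and t is in M}."""
    return {s + t for s in L for t in M}

def kleene_closure(L, max_power):
    """Finite approximation of L*: the union of L^0 .. L^max_power."""
    result = {""}   # L^0 contains only the empty string
    power = {""}
    for _ in range(max_power):
        power = concat(power, L)   # L^(i+1) = L^i L
        result |= power
    return result
```

For example, `concat({"a", "b"}, {"c"})` gives `{"ac", "bc"}`, and `kleene_closure({"a"}, 3)` gives the empty string together with `a`, `aa` and `aaa`.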
  • 31. Types of Grammars: grammars are classified on different bases as follows:
  • 32. Finite automata
    ⚫ A finite automaton is a state machine that takes a string of symbols as input and changes its state accordingly. A finite automaton is a recognizer for the languages denoted by regular expressions.
    ⚫ The mathematical model of a finite automaton consists of:
    ⚫ A finite set of states (Q)
    ⚫ A finite set of input symbols (Σ)
    ⚫ One start state (q0)
    ⚫ A set of final states (qf)
    ⚫ A transition function (δ)
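The model above maps directly onto a few lines of code. The following sketch simulates a deterministic automaton whose transition function δ is stored as a dictionary; the function names and the example machine (strings over {a, b} that end in abb) are assumptions chosen for illustration.

```python
def run_dfa(start, finals, delta, text):
    """Feed text to the automaton; accept if it stops in a final state."""
    state = start
    for symbol in text:
        if (state, symbol) not in delta:
            return False  # no transition defined: reject
        state = delta[(state, symbol)]
    return state in finals

# Example machine: Q = {0, 1, 2, 3}, Σ = {a, b}, q0 = 0, final states {3}.
# It accepts exactly the strings over {a, b} that end in "abb".
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}

def ends_with_abb(text):
    return run_dfa(0, {3}, DELTA, text)
```

Here `ends_with_abb("aabb")` is accepted because the run 0, 1, 1, 2, 3 ends in the final state, while `ends_with_abb("abba")` is rejected because one more `a` moves the machine from state 3 back to state 1.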
  • 33. DFA stands for deterministic finite automaton. In a DFA, there is only one path for a given input symbol from the current state to the next state. A finite automaton is called an NFA (nondeterministic finite automaton) when there can be several paths for a given input symbol from the current state to the next state.
  • 34. Steps for converting an NFA with ε-moves to a DFA:
    ⚫ Step 1: Take the ε-closure of the start state of the NFA as the start state of the DFA.
    ⚫ Step 2: For each input symbol, find the set of states that can be reached from the current DFA state. That is, take the union of the transitions on that symbol, together with their ε-closures, over every NFA state contained in the current DFA state.
    ⚫ Step 3: If a new state is found, take it as the current state and repeat Step 2.
    ⚫ Step 4: Repeat Step 2 and Step 3 until no new state appears in the transition table of the DFA.
    ⚫ Step 5: Mark as final every DFA state that contains a final state of the NFA.
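The five steps above can be sketched in code as follows. This is an illustration under assumptions, not a reference implementation: the NFA is given as a dictionary from (state, symbol) to a set of states, ε-moves use the empty string as their symbol, and each DFA state is a `frozenset` of NFA states. All names are hypothetical.

```python
from collections import deque

EPSILON = ""  # assumed marker for ε-moves in the transition dictionary

def epsilon_closure(states, delta):
    """All NFA states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in delta.get((state, EPSILON), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)

def nfa_to_dfa(start, finals, delta, alphabet):
    """Return (dfa_start, dfa_finals, dfa_delta)."""
    dfa_start = epsilon_closure({start}, delta)           # Step 1
    dfa_delta, dfa_finals = {}, set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:                                          # Steps 3 and 4
        current = queue.popleft()
        if current & finals:                              # Step 5
            dfa_finals.add(current)
        for symbol in alphabet:                           # Step 2
            moved = set()
            for state in current:
                moved |= delta.get((state, symbol), set())
            target = epsilon_closure(moved, delta)
            if not target:
                continue  # no transition on this symbol
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_start, dfa_finals, dfa_delta
```

As a usage example, take an NFA for a*b with states 0, 1, 2, an ε-move from 0 to 1, a loop on `a` at state 1, a move on `b` from 1 to 2, and final state 2. Passing `delta = {(0, ""): {1}, (1, "a"): {1}, (1, "b"): {2}}` produces a DFA whose start state is the set {0, 1}.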
  • 35. 1. Convert the given NFA into its equivalent DFA.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40. Construct an NFA from the regular expression r = (a|b)*abb(a|b).
  • 41.
  • 42. Input Buffering in Compiler Design
    ⚫ The lexical analyzer scans the input from left to right, one character at a time. It uses two pointers, the begin pointer (bp) and the forward pointer (fp), to keep track of the portion of the input that has been scanned.
    ⚫ Initially both pointers point to the first character of the input string, as shown below.
  • 43.
  • 44. ⚫ The forward pointer (fp) moves ahead to search for the end of the lexeme. As soon as a blank space is encountered, it indicates the end of the lexeme. In the example above, as soon as fp encounters a blank space, the lexeme "int" is identified.
    ⚫ When fp encounters white space, it ignores it and moves ahead. Then both the begin pointer (bp) and the forward pointer (fp) are set to the start of the next token.
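The two-pointer scan described above can be sketched as follows. It is a simplification that follows the slide literally: lexemes are assumed to be separated by white space only, and the function name `scan_lexemes` is hypothetical. A real scanner would also end a lexeme at operators and punctuation.

```python
def scan_lexemes(text):
    """Split text into lexemes using a begin pointer and a forward pointer."""
    lexemes = []
    bp = 0                      # begin pointer: start of the current lexeme
    n = len(text)
    while bp < n:
        if text[bp].isspace():
            bp += 1             # skip white space between lexemes
            continue
        fp = bp                 # forward pointer starts at the begin pointer
        while fp < n and not text[fp].isspace():
            fp += 1             # move ahead until the end of the lexeme
        lexemes.append(text[bp:fp])
        bp = fp                 # both pointers continue from the same place
    return lexemes
```

For the input `int a = 10 ;` the first lexeme found is `int`, exactly as in the slide: fp advances from the `i` until it meets the blank, and the characters between bp and fp form the lexeme.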
  • 46. Regular expressions
    ⚫ Regular expressions are a notation for representing the lexeme patterns of a token.
    ⚫ They are used to describe the language that the lexical analyzer recognizes.
    ⚫ They help determine the type of token that a particular lexeme belongs to.
  • 47. Regular expression for identifiers. Examples: a, Aa, a123
  • 48. Regular expression examples for numbers: 1) 9  2) 123  3) .25 | ε  4) E+2 | E-2 | E2 | ε  5) 6.33E4
  • 50. Lex: a lexical analyzer generator
    ⚫ Lex is a program that generates lexical analyzers. It is used with the YACC parser generator.
    ⚫ A lexical analyzer is a program that transforms an input stream into a sequence of tokens.
    ⚫ Lex reads a specification of the lexical analyzer and produces as output C source code that implements it.
    The function of Lex is as follows:
    ⚫ First, a specification of the lexical analyzer is written in the Lex language as a program lex.l. The Lex compiler then processes lex.l and produces a C program lex.yy.c.
    ⚫ Finally, the C compiler compiles lex.yy.c and produces an object program a.out.
    ⚫ a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.
  • 51. Lex file format: a Lex program is separated into three sections by %% delimiters. The format of a Lex source file is as follows:
    { definitions }         // declaration part
    %%
    { rules }               // rule part: pattern {action}
    %%
    { user subroutines }    // auxiliary part
    ⚫ Definitions include declarations of constants, variables and regular definitions.
    ⚫ Rules are statements of the form p1 {action1} p2 {action2} ... pn {actionn}.
    ⚫ Here pi is a regular expression and actioni describes what the lexical analyzer should do when pattern pi matches a lexeme.
    ⚫ User subroutines are auxiliary procedures needed by the actions. They can be compiled separately and loaded with the lexical analyzer.
  • 52.
    %{
    #include <stdio.h>
    %}
    %option noyywrap
    %%
    ^[a-zA-Z_][a-zA-Z0-9_]*   { printf("Valid Identifier"); }
    %%
    int main(void) { yylex(); return 0; }
  • 53.
    /* case conversion */
    %option noyywrap
    lower   [a-z]
    CAPS    [A-Z]
    space   [ \t\n]
    %%
    {lower}   { printf("%c", yytext[0] - 32); /* lowercase to uppercase */ }
    {CAPS}    { printf("%c", yytext[0] + 32); /* uppercase to lowercase */ }
    {space}   ECHO;
    .         ECHO;
    %%
    int main(void) { yylex(); return 0; }
  • 54.
    /* 1. Program to identify numbers, letters, operators and special symbols */
    %option noyywrap
    digit    [0-9]
    letter   [a-zA-Z]
    ops      [-+*/=%]
    sp_sym   [^a-zA-Z0-9+\-*/=% \t\n]
    %%
    {letter}+   { printf("%s is a string\n", yytext); }
    {digit}+    { printf("%s is a digit\n", yytext); }
    {ops}+      { printf("%s is an operator\n", yytext); }
    {sp_sym}+   { printf("%s is a special symbol\n", yytext); }
    %%
    int main(void) { yylex(); return 0; }
  • 55. Lexical Error
    ⚫ This type of error is detected during the lexical analysis phase, while the source program is being scanned.
    ⚫ A lexical error is a sequence of characters that does not match the pattern of any token.
    A lexical phase error can be:
    ⚫ A spelling error.
    ⚫ An unmatched string.
    ⚫ The appearance of illegal characters.
    ⚫ An identifier that exceeds the permitted length.
  • 56. Errors by phase:
    ⚫ Lexical analyzer: wrongly spelled tokens
    ⚫ Syntax analyzer: missing parenthesis
    ⚫ Intermediate code generator: mismatched operands for an operator
    ⚫ Code optimizer: when a statement is not reachable
    ⚫ Code generator: unreachable statements
    ⚫ Symbol table: identifiers declared multiple times
  • 57. In this code, 1xab is neither a number nor an identifier, so the code produces a lexical error.
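The 1xab case can be checked with a small classifier that tests a whole lexeme against each token pattern in turn. This is a sketch under assumptions: the two patterns (unsigned integers and identifiers) and the name `classify` are illustrative, and a real lexical analyzer would have many more token classes.

```python
import re

# Assumed token patterns: unsigned integers and identifiers only.
TOKEN_PATTERNS = [
    ("number", re.compile(r"[0-9]+")),
    ("identifier", re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")),
]

def classify(lexeme):
    """Return the token class whose pattern matches the whole lexeme."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(lexeme):
            return name
    # No token pattern matches: this is a lexical error.
    return "lexical error"
```

With these patterns, `classify("1xab")` reports a lexical error because the lexeme starts with a digit, so it cannot be an identifier, yet it contains letters, so it cannot be a number. By contrast, `classify("xab1")` is an identifier and `classify("123")` is a number.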