SlideShare a Scribd company logo
20IT011-COMPILER DESIGN
Text Book :
Alfred V Aho, Monica S Lam, Ravi Sethi and Jeffrey D Ullman, “Compilers - Principles,
Techniques and Tools”, Pearson Education, 2nd Edition, 2018.
COURSE OUTCOME
CO1: Ability to understand the functionalities of the different phases of compiler.
CO2: Ability to understand syntax-directed translation and run-time environment.
CO3: Ability to implement code optimization techniques.
CO4: Ability to apply different parsing algorithms to develop the parsers for a
given grammar.
CO5: Ability to design a scanner and a parser using LEX and YACC tools.
UNIT I
INTRODUCTION TO COMPILERS
UNIT-I SYLLABUS
Structure of a compiler – Lexical Analysis – Role
of Lexical Analyzer -Input Buffering –
Specification of Tokens – Recognition of Tokens-
Lex – Finite Automata-Regular Expressions to
Automata – Minimizing DFA.
STRUCTURE OF A COMPILER
• Definition: A program that accept as input a program text in a certain language
and produces as output a program text in another language (Grune et all)
• Compiler is a translator program that reads a program written in one language -
the source language- and translates it into an equivalent program in another
language-the target language.
COMPILATION PROCESS
PHASES OF A COMPILER
Lexical Analysis
• In a compiler, linear analysis is called lexical analysis or scanning.
• It takes source code as input
• It reads one character at a time and convert into lexemes( represent in the
form of tokens)
Example: position = initial + rate * 60
✓ The identifier position.
✓ The assignment symbol =.
✓ The identifier initial.
✓ The plus sign.
✓ The identifier rate.
✓ The multiplication sign.
✓ The number 60.
<Id,Position> <=,> <Id,initial> <+,> <Id,rate><*,> <num,60>
Contd..
• Needs to record each id attribute: keep a symbol table
• Lexical Analysis eliminates white spaces, etc..
Syntax Analysis
• It is called as parsing or Syntax Analysis
• It takes token as input and generate parse tree as output.
• The parser checks that the expression made by tokens is
syntactically correct or not.
• Context free grammars formalize the rules and guide syntax analysis.
Semantic Analysis
• It checks the parse tree follows the rules of language
• It keeps tracks of identifiers, types and expressions
• Output is annotated as tree syntax
Intermediate Code Generator
• After semantic analysis, some compilers generate an explicit
intermediate representation of the source program.
• This representation should be easy to produce and easy to translate
into the target program.
Code Optimization
• It is used to improve the intermediate code
• So the output can run fast and take less space
• It removes unnecessary lines of code and arrange the sequence in
order to speed up the execution.
Code Generation
• The final phase of the compiler is the generation of target code,
consisting of relocatable machine code or assembly code.
• The intermediate instructions are each translated into sequence of
machine instructions that perform the same task.
Symbol table Manager
• An essential function of a compiler is to record the identifiers used in
the source program and collect its information.
• A symbol table is a data structure containing a record for each
identifier with fields for attributes.(such as, its type, its scope, if
procedure names then the number and type of arguments etc.,)
• The data structure allows to find the record for each identifier and
store or retrieve data from that record quickly.
Example: sum=initial + value * 10 1 sum
2 initial
3 value
4 ….
Error Handling and Reporting
• Each phase can encounter errors. After detecting an error, a phase must deal that
error, so that compilation can proceed ,allowing further errors to be detected.
• Lexical phase can detect error where the characters remaining in the input do not
form any token.
• The syntax and semantic phases handle large fraction of errors. The stream of
tokens violates the syntax rules are determined by syntax phase.
• During semantic, the compiler tries to detect constructs that have the right
syntactic structure but no meaning to the operation involved.
Role of Lexical Analyser
Role of lexical Analyzer
• As the first phase of a compiler, the main task of the lexical analyzer is
to read the input characters of the source program, group them into
lexemes, and produce as output a sequence of tokens for each lexeme
in the source program.
• The stream of tokens is sent to the parser for syntax analysis.
• It is common for the lexical analyzer to interact with the symbol table
as well.
• When the lexical analyzer discovers a lexeme constituting an
identifier, it needs to enter that lexeme into the symbol table.
Contd..
• In some cases, information regarding the kind of identifier may be
read from the symbol table by the lexical analyzer to assist it in
determining the proper token it must pass to the parser.
Interaction between the lexical analyzer and the parser
• The interaction is implemented by having the parser call the lexical
analyzer.
• The call, suggested by the getNextToken command, causes the
lexical analyzer to read characters from its input until it can identify
the next lexeme and produce for it the next token, which it returns to
the parser.
• Since the lexical analyzer is the part of the compiler that reads the
source text, it may perform certain other tasks besides identification of
lexemes
Contd..
• One such task is stripping out comments and whitespace (blank,
newline, tab, and perhaps other characters that are used to separate
tokens in the input).
• Another task is correlating error messages generated by the compiler
with the source program.
• lexical analyzers are divided into a cascade of two processes:
• Scanning consists of the simple processes that do not require tokenization of the
input, such as deletion of comments and compaction of consecutive whitespace
characters into one.
• Lexical analysis proper is the more complex portion, where the scanner produces the
sequence of tokens as output.
Lexical Analysis and Parsing
• Simplicity of design – avoid complexity to work together .
• Improving Compiler Efficiency - A separate lexical analyzer allows us to apply
specialized techniques that serve only the lexical task, not the job of parsing. In
addition, specialized buffering techniques for reading input characters can speed
up the compiler significantly.
• Enhancing Compiler Portability -Input-device-specific peculiarities can be
restricted to the lexical analyzer.
Tokens, Patterns, Lexemes
• A token is a pair consisting of a token name and an optional
attribute value.
• A pattern is a description of the form that the lexemes of a token
may take.
• A lexeme is a sequence of characters in the source program that
matches the pattern for a token and is identified by the lexical analyzer
as an instance of that token.
Example
Attributes for tokens
Lexical Errors
• Some errors are out of power of lexical analyzer to recognize:
✓fi(a==f(x))..
• However it may be able to recognize errors like:
✓d=2r
• Such errors are recognized when no pattern for tokens matches a character
sequence
Example
Error Recovery
• Panic Mode: Successive characters are ignored until we reach to a well
formed token
• Delete one character from the remaining input
• Insert a missing character into the remaining input
• Replace a character by another character
• Transpose two adjacent characters
INPUT BUFFERING
Input Buffering
• Input buffer helps to find the correct lexeme.
• Sometimes lexical analyzer needs to look ahead some symbols to decide about
the token to return
- in C language: we need to look after -,= or < to decide what token to return
- in FORTRAN: DO 5 I=1.25
• We need to introduce a two buffer scheme to handle large look-aheads safely
Buffer Pairs
Two pointers to the input are maintained:
I. Pointer lexemeBegin, marks the beginning of the current lexeme,
whose extent we are attempting to determine.
II. Pointer forward scans ahead until a pattern match is found
Sentinels
• A sentinel is a special character or token used to mark the end of
a sequence.
• In the context of a lexical analyzer, a sentinel can be utilized to
indicate the end of the input source code or the end of a specific
token.
An Introduction to the Compiler Designss
QUIZ
1. ______________ is the heart of the compiler.
2.what will be the next phase of Intermediate code generator?
3. What is the output of Lexical analyzer?
4. Token consist of pairs, what are they?
5. What is the command ,the parser generate to lexical analyzer?
6. What is the drawback in 1-Buffer?
7. What the two pointers used in Buffering technique?
8. What are the Back-end in Phases of compiler?
SPECIFICATION OF TOKENS
SPECIFICATION OF TOKENS
• In theory of compilation regular expressions are used to formalize the
specification of tokens.
• The specification of tokens depends on the pattern of the lexeme.
Example:
letter(letter+digit)*
• Each regular expression is a pattern specifying the form of strings.
Contd..
I) Alphabet, Strings, Languages
II) Operations on Language
III) Regular Expression
IV) Regular Definition
Strings and Languages
Strings and Languages
Symbol:
• A symbol is an abstract entity that we shall not define formally.
Examples:
• Letters: A|B|C…..|Z |a|b….|z
• Digit:0,1,2…9
• Special Characters
Contd..
Alphabet:
• Alphabet is a non-empty finite set of symbols. it is denoted by the
symbol Ɛ(Sigma).
Example:
• Ɛ={0,1} is the binary alphabet which consisting of digits
• Ɛ={a,b,c} is an alphabet which consisting of letters.
• ASCII is an alphabet ;it is used in many softwares.
• Unicode is an alphabet , which includes 1,00,000 characters.
Contd..
String:
• A String is defined as finite sequence of symbols over an alphabet Ɛ .
It is denoted as ‘w’.
• In language theory, the terms “sentence” and “word” are often used as
synonyms for “string”
Examples:
1.Alphabet Ɛ={ a, b }
Strings: w = {a, b, aa, ab, ba, bb, aaa, aab,……..}
2. Ɛ={ 0, 1 }
Strings: w = { 0, 1, 00, 01, 10, 11, 000, 001, 010, 011,100,101,110,111,
0000,....}
Different Notation used in a string
• Length
• Prefix:- any number of leading symbols
• Proper Prefix :- any number of leading symbols without € and
original string
• Suffix:- tailing symbols
• Proper Suffix:- any number of trailing symbols without € and
original string
• Substring
OPERATIONS ON LANGUAGES
Operations on Languages
REGULAR EXPRESSION
Definition
- A regular expression over Ɛ can be defined as
• ɸ is a regular expression for empty set
• Ɛ is a regular expression for null string {Ɛ}
• If ‘a’ is a symbol in Ɛ then ‘a’ is RE for the set {a}.
• If R and S are two RE then
• UNION of two R+E is a RE
• CONCATENATION of two RS is a RE
• KLEEN CLOSURE of any R* S* is a RE
Identity Rule or Algebric rule
EXAMPLE
Example
EXAMPLES
1) Design a RE to accept all possible combinations of a’s and b’s over
Ɛ={ a, b}
2) Design a RE which Starts with 1 and end with 0.

More Related Content

Similar to An Introduction to the Compiler Designss

Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
NesredinTeshome1
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
Sarmad Ali
 
Compiler Design.pptx
Compiler Design.pptxCompiler Design.pptx
Compiler Design.pptx
SouvikRoy149
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical Analysis
Nayemid4676
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
Radhika Talaviya
 
Lexical Analysis.pdf
Lexical Analysis.pdfLexical Analysis.pdf
Lexical Analysis.pdf
BiswanathSethi2
 
Compiler Design Material
Compiler Design MaterialCompiler Design Material
Compiler Design Material
Dr. C.V. Suresh Babu
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
sivaganesh293
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
sivaganesh293
 
Phases of Compiler.pptx
Phases of Compiler.pptxPhases of Compiler.pptx
Phases of Compiler.pptx
ssuser3b4934
 
Plc part 2
Plc  part 2Plc  part 2
Plc part 2
Taymoor Nazmy
 
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptxCOMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
MUSHAMHARIKIRAN6737
 
chapter4 end.pptx
chapter4 end.pptxchapter4 end.pptx
chapter4 end.pptx
Maryam522887
 
Compier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.pptCompier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.ppt
Apoorv Diwan
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
Himajanaidu2
 
Compiler an overview
Compiler  an overviewCompiler  an overview
Compiler an overview
amudha arul
 
LANGUAGE TRANSLATOR
LANGUAGE TRANSLATORLANGUAGE TRANSLATOR
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
TANZINTANZINA
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compiler
adilmehmood93
 
10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf
AnimeshMani4
 

Similar to An Introduction to the Compiler Designss (20)

Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
 
Compiler Design.pptx
Compiler Design.pptxCompiler Design.pptx
Compiler Design.pptx
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical Analysis
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
 
Lexical Analysis.pdf
Lexical Analysis.pdfLexical Analysis.pdf
Lexical Analysis.pdf
 
Compiler Design Material
Compiler Design MaterialCompiler Design Material
Compiler Design Material
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
 
Phases of Compiler.pptx
Phases of Compiler.pptxPhases of Compiler.pptx
Phases of Compiler.pptx
 
Plc part 2
Plc  part 2Plc  part 2
Plc part 2
 
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptxCOMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
 
chapter4 end.pptx
chapter4 end.pptxchapter4 end.pptx
chapter4 end.pptx
 
Compier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.pptCompier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.ppt
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
 
Compiler an overview
Compiler  an overviewCompiler  an overview
Compiler an overview
 
LANGUAGE TRANSLATOR
LANGUAGE TRANSLATORLANGUAGE TRANSLATOR
LANGUAGE TRANSLATOR
 
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compiler
 
10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf
 

Recently uploaded

RECENT DEVELOPMENTS IN RING SPINNING.pptx
RECENT DEVELOPMENTS IN RING SPINNING.pptxRECENT DEVELOPMENTS IN RING SPINNING.pptx
RECENT DEVELOPMENTS IN RING SPINNING.pptx
peacesoul123
 
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
sunnuchadda
 
21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY
21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY
21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY
PradeepKumarSK3
 
How to Formulate A Good Research Question
How to Formulate A  Good Research QuestionHow to Formulate A  Good Research Question
How to Formulate A Good Research Question
rkpv2002
 
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
rawankhanlove256
 
CGR-Unit-1 Basics of Computer Graphics.pdf
CGR-Unit-1 Basics of Computer Graphics.pdfCGR-Unit-1 Basics of Computer Graphics.pdf
CGR-Unit-1 Basics of Computer Graphics.pdf
Rugved Collection
 
printing of ic circuits.pdf
printing       of        ic     circuits.pdfprinting       of        ic     circuits.pdf
printing of ic circuits.pdf
chidambaramnatarajar
 
Business Development_ Identifying and Seizing Market Opportunities with Skyle...
Business Development_ Identifying and Seizing Market Opportunities with Skyle...Business Development_ Identifying and Seizing Market Opportunities with Skyle...
Business Development_ Identifying and Seizing Market Opportunities with Skyle...
Skyler Bloom
 
Traffic Engineering-MODULE-1 vtu syllabus.pptx
Traffic Engineering-MODULE-1 vtu syllabus.pptxTraffic Engineering-MODULE-1 vtu syllabus.pptx
Traffic Engineering-MODULE-1 vtu syllabus.pptx
mailmad391
 
EAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagne
EAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagneEAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagne
EAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagne
idelewebmestre
 
Presentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptx
Presentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptxPresentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptx
Presentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptx
Er. Kushal Ghimire
 
Response & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITHResponse & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITH
IIIT Hyderabad
 
Time-State Analytics: MinneAnalytics 2024 Talk
Time-State Analytics: MinneAnalytics 2024 TalkTime-State Analytics: MinneAnalytics 2024 Talk
Time-State Analytics: MinneAnalytics 2024 Talk
Evan Chan
 
Thermodynamics Digital Material basics subject
Thermodynamics Digital Material basics subjectThermodynamics Digital Material basics subject
Thermodynamics Digital Material basics subject
JigneshChhatbar1
 
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
amzhoxvzidbke
 
Introduction to IP address concept - Computer Networking
Introduction to IP address concept - Computer NetworkingIntroduction to IP address concept - Computer Networking
Introduction to IP address concept - Computer Networking
Md.Shohel Rana ( M.Sc in CSE Khulna University of Engineering & Technology (KUET))
 
FINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATA
FINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATAFINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATA
FINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATA
kevig
 
Tutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic conceptsTutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic concepts
anishacotta2
 
Stiffness Method for structure analysis - Truss
Stiffness Method  for structure analysis - TrussStiffness Method  for structure analysis - Truss
Stiffness Method for structure analysis - Truss
adninhaerul
 
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
Jim Mimlitz, P.E.
 

Recently uploaded (20)

RECENT DEVELOPMENTS IN RING SPINNING.pptx
RECENT DEVELOPMENTS IN RING SPINNING.pptxRECENT DEVELOPMENTS IN RING SPINNING.pptx
RECENT DEVELOPMENTS IN RING SPINNING.pptx
 
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
 
21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY
21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY
21EC63_Module1B.pptx VLSI design 21ec63 MOS TRANSISTOR THEORY
 
How to Formulate A Good Research Question
How to Formulate A  Good Research QuestionHow to Formulate A  Good Research Question
How to Formulate A Good Research Question
 
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
 
CGR-Unit-1 Basics of Computer Graphics.pdf
CGR-Unit-1 Basics of Computer Graphics.pdfCGR-Unit-1 Basics of Computer Graphics.pdf
CGR-Unit-1 Basics of Computer Graphics.pdf
 
printing of ic circuits.pdf
printing       of        ic     circuits.pdfprinting       of        ic     circuits.pdf
printing of ic circuits.pdf
 
Business Development_ Identifying and Seizing Market Opportunities with Skyle...
Business Development_ Identifying and Seizing Market Opportunities with Skyle...Business Development_ Identifying and Seizing Market Opportunities with Skyle...
Business Development_ Identifying and Seizing Market Opportunities with Skyle...
 
Traffic Engineering-MODULE-1 vtu syllabus.pptx
Traffic Engineering-MODULE-1 vtu syllabus.pptxTraffic Engineering-MODULE-1 vtu syllabus.pptx
Traffic Engineering-MODULE-1 vtu syllabus.pptx
 
EAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagne
EAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagneEAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagne
EAAP2023 : Durabilité et services écosystémiques de l'élevage ovin de montagne
 
Presentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptx
Presentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptxPresentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptx
Presentation slide on DESIGN AND FABRICATION OF MOBILE CONTROLLED DRAINAGE.pptx
 
Response & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITHResponse & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITH
 
Time-State Analytics: MinneAnalytics 2024 Talk
Time-State Analytics: MinneAnalytics 2024 TalkTime-State Analytics: MinneAnalytics 2024 Talk
Time-State Analytics: MinneAnalytics 2024 Talk
 
Thermodynamics Digital Material basics subject
Thermodynamics Digital Material basics subjectThermodynamics Digital Material basics subject
Thermodynamics Digital Material basics subject
 
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
 
Introduction to IP address concept - Computer Networking
Introduction to IP address concept - Computer NetworkingIntroduction to IP address concept - Computer Networking
Introduction to IP address concept - Computer Networking
 
FINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATA
FINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATAFINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATA
FINE-TUNING OF SMALL/MEDIUM LLMS FOR BUSINESS QA ON STRUCTURED DATA
 
Tutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic conceptsTutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic concepts
 
Stiffness Method for structure analysis - Truss
Stiffness Method  for structure analysis - TrussStiffness Method  for structure analysis - Truss
Stiffness Method for structure analysis - Truss
 
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
 

An Introduction to the Compiler Designss

  • 1. 20IT011-COMPILER DESIGN Text Book : Alfred V Aho, Monica S Lam, Ravi Sethi and Jeffrey D Ullman, “Compilers - Principles, Techniques and Tools”, Pearson Education, 2nd Edition, 2018.
  • 2. COURSE OUTCOME CO1: Ability to understand the functionalities of the different phases of compiler. CO2: Ability to understand syntax-directed translation and run-time environment. CO3: Ability to implement code optimization techniques. CO4: Ability to apply different parsing algorithms to develop the parsers for a given grammar. CO5: Ability to design a scanner and a parser using LEX and YACC tools.
  • 4. UNIT-I SYLLABUS Structure of a compiler – Lexical Analysis – Role of Lexical Analyzer -Input Buffering – Specification of Tokens – Recognition of Tokens- Lex – Finite Automata-Regular Expressions to Automata – Minimizing DFA.
  • 5. STRUCTURE OF A COMPILER • Definition: A program that accept as input a program text in a certain language and produces as output a program text in another language (Grune et all) • Compiler is a translator program that reads a program written in one language - the source language- and translates it into an equivalent program in another language-the target language.
  • 7. PHASES OF A COMPILER
  • 8. Lexical Analysis • In a compiler, linear analysis is called lexical analysis or scanning. • It takes source code as input • It reads one character at a time and convert into lexemes( represent in the form of tokens) Example: position = initial + rate * 60 ✓ The identifier position. ✓ The assignment symbol =. ✓ The identifier initial. ✓ The plus sign. ✓ The identifier rate. ✓ The multiplication sign. ✓ The number 60. <Id,Position> <=,> <Id,initial> <+,> <Id,rate><*,> <num,60>
  • 9. Contd.. • Needs to record each id attribute: keep a symbol table • Lexical Analysis eliminates white spaces, etc..
  • 10. Syntax Analysis • It is called as parsing or Syntax Analysis • It takes token as input and generate parse tree as output. • The parser checks that the expression made by tokens is syntactically correct or not. • Context free grammars formalize the rules and guide syntax analysis.
  • 11. Semantic Analysis • It checks the parse tree follows the rules of language • It keeps tracks of identifiers, types and expressions • Output is annotated as tree syntax
  • 12. Intermediate Code Generator • After semantic analysis, some compilers generate an explicit intermediate representation of the source program. • This representation should be easy to produce and easy to translate into the target program.
  • 13. Code Optimization • It is used to improve the intermediate code • So the output can run fast and take less space • It removes unnecessary lines of code and arrange the sequence in order to speed up the execution.
  • 14. Code Generation • The final phase of the compiler is the generation of target code, consisting of relocatable machine code or assembly code. • The intermediate instructions are each translated into sequence of machine instructions that perform the same task.
  • 15. Symbol table Manager • An essential function of a compiler is to record the identifiers used in the source program and collect its information. • A symbol table is a data structure containing a record for each identifier with fields for attributes.(such as, its type, its scope, if procedure names then the number and type of arguments etc.,) • The data structure allows to find the record for each identifier and store or retrieve data from that record quickly. Example: sum=initial + value * 10 1 sum 2 initial 3 value 4 ….
  • 16. Error Handling and Reporting • Each phase can encounter errors. After detecting an error, a phase must deal that error, so that compilation can proceed ,allowing further errors to be detected. • Lexical phase can detect error where the characters remaining in the input do not form any token. • The syntax and semantic phases handle large fraction of errors. The stream of tokens violates the syntax rules are determined by syntax phase. • During semantic, the compiler tries to detect constructs that have the right syntactic structure but no meaning to the operation involved.
  • 17. Role of Lexical Analyser
  • 18. Role of lexical Analyzer • As the first phase of a compiler, the main task of the lexical analyzer is to read the input characters of the source program, group them into lexemes, and produce as output a sequence of tokens for each lexeme in the source program. • The stream of tokens is sent to the parser for syntax analysis. • It is common for the lexical analyzer to interact with the symbol table as well. • When the lexical analyzer discovers a lexeme constituting an identifier, it needs to enter that lexeme into the symbol table.
  • 19. Contd.. • In some cases, information regarding the kind of identifier may be read from the symbol table by the lexical analyzer to assist it in determining the proper token it must pass to the parser.
  • 20. Interaction between the lexical analyzer and the parser • The interaction is implemented by having the parser call the lexical analyzer. • The call, suggested by the getNextToken command, causes the lexical analyzer to read characters from its input until it can identify the next lexeme and produce for it the next token, which it returns to the parser. • Since the lexical analyzer is the part of the compiler that reads the source text, it may perform certain other tasks besides identification of lexemes
  • 21. Contd.. • One such task is stripping out comments and whitespace (blank, newline, tab, and perhaps other characters that are used to separate tokens in the input). • Another task is correlating error messages generated by the compiler with the source program. • lexical analyzers are divided into a cascade of two processes: • Scanning consists of the simple processes that do not require tokenization of the input, such as deletion of comments and compaction of consecutive whitespace characters into one. • Lexical analysis proper is the more complex portion, where the scanner produces the sequence of tokens as output.
  • 22. Lexical Analysis and Parsing • Simplicity of design – avoid complexity to work together . • Improving Compiler Efficiency - A separate lexical analyzer allows us to apply specialized techniques that serve only the lexical task, not the job of parsing. In addition, specialized buffering techniques for reading input characters can speed up the compiler significantly. • Enhancing Compiler Portability -Input-device-specific peculiarities can be restricted to the lexical analyzer.
  • 23. Tokens, Patterns, Lexemes • A token is a pair consisting of a token name and an optional attribute value. • A pattern is a description of the form that the lexemes of a token may take. • A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token.
  • 26. Lexical Errors • Some errors are out of power of lexical analyzer to recognize: ✓fi(a==f(x)).. • However it may be able to recognize errors like: ✓d=2r • Such errors are recognized when no pattern for tokens matches a character sequence
  • 28. Error Recovery • Panic Mode: Successive characters are ignored until we reach to a well formed token • Delete one character from the remaining input • Insert a missing character into the remaining input • Replace a character by another character • Transpose two adjacent characters
  • 30. Input Buffering • Input buffer helps to find the correct lexeme. • Sometimes lexical analyzer needs to look ahead some symbols to decide about the token to return - in C language: we need to look after -,= or < to decide what token to return - in FORTRAN: DO 5 I=1.25 • We need to introduce a two buffer scheme to handle large look-aheads safely
  • 31. Buffer Pairs Two pointers to the input are maintained: I. Pointer lexemeBegin, marks the beginning of the current lexeme, whose extent we are attempting to determine. II. Pointer forward scans ahead until a pattern match is found
  • 32. Sentinels • A sentinel is a special character or token used to mark the end of a sequence. • In the context of a lexical analyzer, a sentinel can be utilized to indicate the end of the input source code or the end of a specific token.
  • 34. QUIZ 1. ______________ is the heart of the compiler. 2.what will be the next phase of Intermediate code generator? 3. What is the output of Lexical analyzer? 4. Token consist of pairs, what are they? 5. What is the command ,the parser generate to lexical analyzer? 6. What is the drawback in 1-Buffer? 7. What the two pointers used in Buffering technique? 8. What are the Back-end in Phases of compiler?
  • 36. SPECIFICATION OF TOKENS • In theory of compilation regular expressions are used to formalize the specification of tokens. • The specification of tokens depends on the pattern of the lexeme. Example: letter(letter+digit)* • Each regular expression is a pattern specifying the form of strings.
  • 37. Contd.. I) Alphabet, Strings, Languages II) Operations on Language III) Regular Expression IV) Regular Definition
  • 39. Strings and Languages Symbol: • A symbol is an abstract entity that we shall not define formally. Examples: • Letters: A|B|C…..|Z |a|b….|z • Digit:0,1,2…9 • Special Characters
  • 40. Contd.. Alphabet: • Alphabet is a non-empty finite set of symbols. it is denoted by the symbol Ɛ(Sigma). Example: • Ɛ={0,1} is the binary alphabet which consisting of digits • Ɛ={a,b,c} is an alphabet which consisting of letters. • ASCII is an alphabet ;it is used in many softwares. • Unicode is an alphabet , which includes 1,00,000 characters.
  • 41. Contd.. String: • A String is defined as finite sequence of symbols over an alphabet Ɛ . It is denoted as ‘w’. • In language theory, the terms “sentence” and “word” are often used as synonyms for “string” Examples: 1.Alphabet Ɛ={ a, b } Strings: w = {a, b, aa, ab, ba, bb, aaa, aab,……..} 2. Ɛ={ 0, 1 } Strings: w = { 0, 1, 00, 01, 10, 11, 000, 001, 010, 011,100,101,110,111, 0000,....}
  • 42. Different Notation used in a string • Length • Prefix:- any number of leading symbols • Proper Prefix :- any number of leading symbols without € and original string • Suffix:- tailing symbols • Proper Suffix:- any number of trailing symbols without € and original string • Substring
  • 46. Definition - A regular expression over Ɛ can be defined as • ɸ is a regular expression for empty set • Ɛ is a regular expression for null string {Ɛ} • If ‘a’ is a symbol in Ɛ then ‘a’ is RE for the set {a}. • If R and S are two RE then • UNION of two R+E is a RE • CONCATENATION of two RS is a RE • KLEEN CLOSURE of any R* S* is a RE
  • 47. Identity Rule or Algebric rule
  • 50. EXAMPLES 1) Design a RE to accept all possible combinations of a’s and b’s over Ɛ={ a, b} 2) Design a RE which Starts with 1 and end with 0.