SlideShare a Scribd company logo
COMPILER DESIGN
Myself Archana R
Assistant Professor In
Department Of Computer Science
SACWC.
I am here because I love to give
presentations.
IMPLEMENTATION OF LEXICAL ANALYZER
IMPLEMENTATION OF LEXICAL ANALYZER
Outline
• Specifying lexical structure using regular
expressions
• Finite automata
–Deterministic Finite Automata (DFAs)
–Non-deterministic Finite Automata (NFAs)
• Implementation of regular expressions
Reg Exp NFA DFA Tables
Notation
• For convenience, we use a variation (allow user- defined abbreviations)
in regular expression notation.
• Union: A + B  A | B
• Option: A +   A?
• Range: ‘a’+’b’+…+’z’  [a-z]
• Excluded range:
complement of [a-
z]
 [^a-z]
Regular Expressions in Lexical Specification
• Last lecture: a specification for the predicate,
s  L(R)
• But a yes/no answer is not enough !
• Instead: partition the input into tokens.
• We will adapt regular expressions to this goal.
Regular Expressions  Lexical Spec. (1)
1. Select a set of tokens
• Integer, Keyword, Identifier, OpenPar, ...
2. Write a regular expression (pattern) for the lexemes of
each token
• Integer = digit +
• Keyword = ‘if’ + ‘else’ + …
• Identifier = letter (letter + digit)*
• OpenPar = ‘(‘
• …
Regular Expressions  Lexical Spec. (2)
3. Construct R, matching all lexemes for all tokens
R = Keyword + Identifier + Integer + …
= R1 + R2 + R3 + …
Facts: If s  L(R) then s is a lexeme
– Furthermore s  L(Ri) for some “i”
– This “i” determines the token that is reported
Regular Expressions  Lexical Spec. (3)
4. Let input be x1…xn
• (x1 ... xn arecharacters)
• For 1  i  n check
x1…xi  L(R) ?
5. It must be thatt
x1…xi  L(Rj) for some j
(if there is a choice, pick a smallest such j)
6. Remove x1…xi from input and go to previous step
How to Handle Spaces and Comments?
1. We could create a token Whitespace
Whitespace = (‘ ’ + ‘n’ + ‘t’)+
– We could also add comments in there
– An input “ tn 5555 “ is transformed into Whitespace Integer Whitespace
2. Lexer skips spaces (preferred)
• Modify step 5 from before as follows:
It must be that xk ... xi L(Rj) for some j such that x1 ... xk-1 L(Whitespace)
• Parser is not bothered with spaces
Ambiguities (1)
• There are ambiguities in the algorithm
• How much input is used? What if
• x1…xi  L(R) and also
• x1…xK  L(R)
–Rule: Pick the longest possible substring
–The “maximal munch”
Ambiguities (2)
• Which token is used? What if
• x1…xi  L(Rj) and also
• x1…xi  L(Rk)
– Rule: use rule listed first (j if j < k)
• Example:
–R1 = Keyword and R2 = Identifier
–“if” matches both
–Treats “if” as a keyword not an identifier
Error Handling
• What if
No rule matches a prefix of input ?
• Problem: Can’t just get stuck …
• Solution:
–Write a rule matching all “bad” strings
–Put it last
• Lexer tools allow the writing of:
R = R1 + ... + Rn +Error
–Token Error matches if nothing else matches
Regular Languages & FiniteAutomata
Basic formal language theory result:
Regular expressions and finite automata both define the class of regular languages.
Thus, we are going to use:
• Regular expressions for specification
• Finite automata for implementation (automatic generation of lexical analyzers)
FiniteAutomata
• A finite automaton is a recognizer for the strings of a regular language
A finite automaton consists of
–A finite input alphabet 
–A set of states S
–A start state n
–A set of accepting states F  S
–A set of transitions state  input state
• Transition
s1 a s2
• Is read
In state s1 on input “a” go to states2
• If end of input (or no transition possible)
–If in accepting state  accept
–Otherwise  reject
Finite Automata State Graphs
• A state
a
• The start state
• An accepting state
• A transition
A Simple Example
• A finite automaton that accepts only “1”
1
Another Simple Example
• A finite automaton accepting any number of 1’s
followed by a single 0
• Alphabet: {0,1}
1
0
And Another Example
• Alphabet {0,1}
• What language does this recognize?
1
1
1
0
0 0
• Alphabet still { 0, 1 }
1
1
• The operation of the automaton is not completely
defined by the input
– On input “11” the automaton could be in either state
Epsilon Moves
• Another kind of transition: -moves

A B
• Machine can move from state A to state B without
reading input
Deterministic and Non-Deterministic Automata
• Deterministic Finite Automata (DFA)
–One transition per input per state
–No - moves
• Non-deterministic Finite Automata (NFA)
–Can have multiple transitions for one input in a given state
–Can have - moves
• Finite automata have finite memory
–Enough to only encode the current state
Execution of FiniteAutomata
• A DFA can take only one path through the state graph
– Completely determined by input
• NFAs can choose
–Whether to make  -moves
–Which of multiple transitions for a single input to take.
Acceptance of NFAs
• An NFA can get into multiple states
1
0
0
1
• Input: 1 0
1
• Rule: NFA accepts an input if it can get in a final state
NFA vs. DFA (1)
• NFAs and DFAs recognize the same set of languages
(regular languages)
• DFAs are easier to implement
– There are no choices to consider
NFA vs. DFA (2)
• For a given language the NFA can be simpler than the DFA
NF
A
1
0
0
0
DF
A
0
1
1
1
0 0
• DFA can be exponentially larger than NFA
Regular Expressions to FiniteAutomata
• High-level sketch
NFA
Regular expressions DFA
Lexical Specification Table-driven Implementation of
DFA
Regular Expressions to NFA(1)
• For each kind of reg. expr, define an NFA
– Notation: NFA for regular expression M
• For 
• For input a
M

a
Regular Expressions to NFA(2)
• ForAB
A B

• For A + B
A
B




Regular Expressions to NFA(3)
• ForA*
A



The Trick of NFA to DFA
• Simulate the NFA
• Each state of DFA
= a non-empty subset of states of the NFA
• Start state
= the set of NFA states reachable through  -moves from NFA start state
• Add a transition S  a S’ to DFA iff
–S’ is the set of NFA states reachable from any state in S after seeing the input a
•considering  -moves as well
IMPLEMENTATION
• A DFA can be implemented by a 2D table T
–One dimension is “states”
–Other dimension is “input symbols”
–For every transition Si  a Sk define T[i,a] =k
• DFA “execution”
–If in state Si and input a, read T[i,a] = k and skip to state Sk
–Very efficient
Table Implementation of a DFA
S
U
0
1
1
0 1
T
0 1
S T U
T T U
U T U
Implementation (Cont.)
• NFA → DFA conversion is at the heart of tools such
as lex, ML-Lex or flex.
• But, DFAs can be huge.
• In practice, lex/ML-Lex/flex-like tools trade off
speed for space in the choice of NFA and DFA
representations.
DFA and Lexer
Two differences:
• DFAs recognize lexemes. A lexer must return a type of acceptance (token
type) rather than simply an accept/reject indication.
• DFAs consume the complete string and accept or reject it. A lexer must
find the end of the lexeme in the input stream and then find the next one,
etc.
THANK YOU

More Related Content

What's hot

T9. Trust and reputation in multi-agent systems
T9. Trust and reputation in multi-agent systemsT9. Trust and reputation in multi-agent systems
T9. Trust and reputation in multi-agent systems
EASSS 2012
 
Clock synchronization in distributed system
Clock synchronization in distributed systemClock synchronization in distributed system
Clock synchronization in distributed system
Sunita Sahu
 
Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2
DigiGurukul
 
Design of a two pass assembler
Design of a two pass assemblerDesign of a two pass assembler
Design of a two pass assembler
Dhananjaysinh Jhala
 
Token, Pattern and Lexeme
Token, Pattern and LexemeToken, Pattern and Lexeme
Token, Pattern and Lexeme
A. S. M. Shafi
 
Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
Huawei Technologies
 
I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
vikas dhakane
 
Language for specifying lexical Analyzer
Language for specifying lexical AnalyzerLanguage for specifying lexical Analyzer
Language for specifying lexical Analyzer
Archana Gopinath
 
Recognition-of-tokens
Recognition-of-tokensRecognition-of-tokens
Recognition-of-tokens
Dattatray Gandhmal
 
knowledge representation using rules
knowledge representation using rulesknowledge representation using rules
knowledge representation using rules
Harini Balamurugan
 
Assemblers
AssemblersAssemblers
Assemblers
Dattatray Gandhmal
 
Code Optimization
Code OptimizationCode Optimization
Code Optimization
Akhil Kaushik
 
Multi Head, Multi Tape Turing Machine
Multi Head, Multi Tape Turing MachineMulti Head, Multi Tape Turing Machine
Multi Head, Multi Tape Turing Machine
Radhakrishnan Chinnusamy
 
A simple approach of lexical analyzers
A simple approach of lexical analyzersA simple approach of lexical analyzers
A simple approach of lexical analyzers
Archana Gopinath
 
Role-of-lexical-analysis
Role-of-lexical-analysisRole-of-lexical-analysis
Role-of-lexical-analysis
Dattatray Gandhmal
 
Syntax directed translation
Syntax directed translationSyntax directed translation
Syntax directed translation
Akshaya Arunan
 
Informed search
Informed searchInformed search
Informed search
Amit Kumar Rathi
 
Unification and Lifting
Unification and LiftingUnification and Lifting
Unification and Lifting
Megha Sharma
 
Parsing in Compiler Design
Parsing in Compiler DesignParsing in Compiler Design
Parsing in Compiler Design
Akhil Kaushik
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
United International University
 

What's hot (20)

T9. Trust and reputation in multi-agent systems
T9. Trust and reputation in multi-agent systemsT9. Trust and reputation in multi-agent systems
T9. Trust and reputation in multi-agent systems
 
Clock synchronization in distributed system
Clock synchronization in distributed systemClock synchronization in distributed system
Clock synchronization in distributed system
 
Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2
 
Design of a two pass assembler
Design of a two pass assemblerDesign of a two pass assembler
Design of a two pass assembler
 
Token, Pattern and Lexeme
Token, Pattern and LexemeToken, Pattern and Lexeme
Token, Pattern and Lexeme
 
Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
 
I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
 
Language for specifying lexical Analyzer
Language for specifying lexical AnalyzerLanguage for specifying lexical Analyzer
Language for specifying lexical Analyzer
 
Recognition-of-tokens
Recognition-of-tokensRecognition-of-tokens
Recognition-of-tokens
 
knowledge representation using rules
knowledge representation using rulesknowledge representation using rules
knowledge representation using rules
 
Assemblers
AssemblersAssemblers
Assemblers
 
Code Optimization
Code OptimizationCode Optimization
Code Optimization
 
Multi Head, Multi Tape Turing Machine
Multi Head, Multi Tape Turing MachineMulti Head, Multi Tape Turing Machine
Multi Head, Multi Tape Turing Machine
 
A simple approach of lexical analyzers
A simple approach of lexical analyzersA simple approach of lexical analyzers
A simple approach of lexical analyzers
 
Role-of-lexical-analysis
Role-of-lexical-analysisRole-of-lexical-analysis
Role-of-lexical-analysis
 
Syntax directed translation
Syntax directed translationSyntax directed translation
Syntax directed translation
 
Informed search
Informed searchInformed search
Informed search
 
Unification and Lifting
Unification and LiftingUnification and Lifting
Unification and Lifting
 
Parsing in Compiler Design
Parsing in Compiler DesignParsing in Compiler Design
Parsing in Compiler Design
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
 

Similar to Implementation of lexical analyser

Regular Expression to Finite Automata
Regular Expression to Finite AutomataRegular Expression to Finite Automata
Regular Expression to Finite Automata
Archana Gopinath
 
2_4 Finite Automata.ppt
2_4 Finite Automata.ppt2_4 Finite Automata.ppt
2_4 Finite Automata.ppt
Ratnakar Mikkili
 
Introduction to the theory of computation
Introduction to the theory of computationIntroduction to the theory of computation
Introduction to the theory of computationprasadmvreddy
 
Ch3.ppt
Ch3.pptCh3.ppt
Ch3.ppt
MDSayem35
 
Automata_Theory_and_compiler_design_UNIT-1.pptx.pdf
Automata_Theory_and_compiler_design_UNIT-1.pptx.pdfAutomata_Theory_and_compiler_design_UNIT-1.pptx.pdf
Automata_Theory_and_compiler_design_UNIT-1.pptx.pdf
TONY562
 
02. chapter 3 lexical analysis
02. chapter 3   lexical analysis02. chapter 3   lexical analysis
02. chapter 3 lexical analysisraosir123
 
Unit2 Toc.pptx
Unit2 Toc.pptxUnit2 Toc.pptx
Unit2 Toc.pptx
viswanath kani
 
Lexical
LexicalLexical
Lexical
baran19901990
 
Lecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.pptLecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.ppt
NderituGichuki1
 
Nondeterministic Finite Automata
Nondeterministic Finite Automata Nondeterministic Finite Automata
Nondeterministic Finite Automata
parmeet834
 
Finite automata-for-lexical-analysis
Finite automata-for-lexical-analysisFinite automata-for-lexical-analysis
Finite automata-for-lexical-analysis
Dattatray Gandhmal
 
Lexical analysis, syntax analysis, semantic analysis. Ppt
Lexical analysis, syntax analysis, semantic analysis. PptLexical analysis, syntax analysis, semantic analysis. Ppt
Lexical analysis, syntax analysis, semantic analysis. Ppt
ovidlivi91
 
Optimization of dfa
Optimization of dfaOptimization of dfa
Optimization of dfa
Kiran Acharya
 
TOC Introduction
TOC Introduction TOC Introduction
TOC Introduction
Thapar Institute
 
Lecture4 lexical analysis2
Lecture4 lexical analysis2Lecture4 lexical analysis2
Lecture4 lexical analysis2
Mahesh Kumar Chelimilla
 
Lexical Analyzer Implementation
Lexical Analyzer ImplementationLexical Analyzer Implementation
Lexical Analyzer Implementation
Akhil Kaushik
 

Similar to Implementation of lexical analyser (20)

Regular Expression to Finite Automata
Regular Expression to Finite AutomataRegular Expression to Finite Automata
Regular Expression to Finite Automata
 
SS UI Lecture 5
SS UI Lecture 5SS UI Lecture 5
SS UI Lecture 5
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
2_4 Finite Automata.ppt
2_4 Finite Automata.ppt2_4 Finite Automata.ppt
2_4 Finite Automata.ppt
 
Introduction to the theory of computation
Introduction to the theory of computationIntroduction to the theory of computation
Introduction to the theory of computation
 
Ch3.ppt
Ch3.pptCh3.ppt
Ch3.ppt
 
Automata_Theory_and_compiler_design_UNIT-1.pptx.pdf
Automata_Theory_and_compiler_design_UNIT-1.pptx.pdfAutomata_Theory_and_compiler_design_UNIT-1.pptx.pdf
Automata_Theory_and_compiler_design_UNIT-1.pptx.pdf
 
02. chapter 3 lexical analysis
02. chapter 3   lexical analysis02. chapter 3   lexical analysis
02. chapter 3 lexical analysis
 
Unit2 Toc.pptx
Unit2 Toc.pptxUnit2 Toc.pptx
Unit2 Toc.pptx
 
Lexical
LexicalLexical
Lexical
 
SS UI Lecture 6
SS UI Lecture 6SS UI Lecture 6
SS UI Lecture 6
 
Lecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.pptLecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.ppt
 
Nondeterministic Finite Automata
Nondeterministic Finite Automata Nondeterministic Finite Automata
Nondeterministic Finite Automata
 
Finite automata-for-lexical-analysis
Finite automata-for-lexical-analysisFinite automata-for-lexical-analysis
Finite automata-for-lexical-analysis
 
Lexical analysis, syntax analysis, semantic analysis. Ppt
Lexical analysis, syntax analysis, semantic analysis. PptLexical analysis, syntax analysis, semantic analysis. Ppt
Lexical analysis, syntax analysis, semantic analysis. Ppt
 
Optimization of dfa
Optimization of dfaOptimization of dfa
Optimization of dfa
 
TOC Introduction
TOC Introduction TOC Introduction
TOC Introduction
 
Lecture4 lexical analysis2
Lecture4 lexical analysis2Lecture4 lexical analysis2
Lecture4 lexical analysis2
 
Lexical Analyzer Implementation
Lexical Analyzer ImplementationLexical Analyzer Implementation
Lexical Analyzer Implementation
 

More from Archana Gopinath

Data Transfer & Manipulation.pptx
Data Transfer & Manipulation.pptxData Transfer & Manipulation.pptx
Data Transfer & Manipulation.pptx
Archana Gopinath
 
DP _ CO Instruction Format.pptx
DP _ CO Instruction Format.pptxDP _ CO Instruction Format.pptx
DP _ CO Instruction Format.pptx
Archana Gopinath
 
A Role of Lexical Analyzer
A Role of Lexical AnalyzerA Role of Lexical Analyzer
A Role of Lexical Analyzer
Archana Gopinath
 
minimization the number of states of DFA
minimization the number of states of DFAminimization the number of states of DFA
minimization the number of states of DFA
Archana Gopinath
 
Fundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopFundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and Hadoop
Archana Gopinath
 
Map reduce in Hadoop BIG DATA ANALYTICS
Map reduce in Hadoop BIG DATA ANALYTICSMap reduce in Hadoop BIG DATA ANALYTICS
Map reduce in Hadoop BIG DATA ANALYTICS
Archana Gopinath
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
Archana Gopinath
 
Hadoop
HadoopHadoop
Programming with R in Big Data Analytics
Programming with R in Big Data AnalyticsProgramming with R in Big Data Analytics
Programming with R in Big Data Analytics
Archana Gopinath
 
If statements in c programming
If statements in c programmingIf statements in c programming
If statements in c programming
Archana Gopinath
 
un Guided media
un Guided mediaun Guided media
un Guided media
Archana Gopinath
 
Guided media Transmission Media
Guided media Transmission MediaGuided media Transmission Media
Guided media Transmission Media
Archana Gopinath
 
Main Memory RAM and ROM
Main Memory RAM and ROMMain Memory RAM and ROM
Main Memory RAM and ROM
Archana Gopinath
 
Java thread life cycle
Java thread life cycleJava thread life cycle
Java thread life cycle
Archana Gopinath
 
PCSTt11 overview of java
PCSTt11 overview of javaPCSTt11 overview of java
PCSTt11 overview of java
Archana Gopinath
 

More from Archana Gopinath (15)

Data Transfer & Manipulation.pptx
Data Transfer & Manipulation.pptxData Transfer & Manipulation.pptx
Data Transfer & Manipulation.pptx
 
DP _ CO Instruction Format.pptx
DP _ CO Instruction Format.pptxDP _ CO Instruction Format.pptx
DP _ CO Instruction Format.pptx
 
A Role of Lexical Analyzer
A Role of Lexical AnalyzerA Role of Lexical Analyzer
A Role of Lexical Analyzer
 
minimization the number of states of DFA
minimization the number of states of DFAminimization the number of states of DFA
minimization the number of states of DFA
 
Fundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopFundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and Hadoop
 
Map reduce in Hadoop BIG DATA ANALYTICS
Map reduce in Hadoop BIG DATA ANALYTICSMap reduce in Hadoop BIG DATA ANALYTICS
Map reduce in Hadoop BIG DATA ANALYTICS
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 
Hadoop
HadoopHadoop
Hadoop
 
Programming with R in Big Data Analytics
Programming with R in Big Data AnalyticsProgramming with R in Big Data Analytics
Programming with R in Big Data Analytics
 
If statements in c programming
If statements in c programmingIf statements in c programming
If statements in c programming
 
un Guided media
un Guided mediaun Guided media
un Guided media
 
Guided media Transmission Media
Guided media Transmission MediaGuided media Transmission Media
Guided media Transmission Media
 
Main Memory RAM and ROM
Main Memory RAM and ROMMain Memory RAM and ROM
Main Memory RAM and ROM
 
Java thread life cycle
Java thread life cycleJava thread life cycle
Java thread life cycle
 
PCSTt11 overview of java
PCSTt11 overview of javaPCSTt11 overview of java
PCSTt11 overview of java
 

Recently uploaded

678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
CarlosHernanMontoyab2
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
TechSoup
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
Anna Sz.
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 

Recently uploaded (20)

678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 

Implementation of lexical analyser

  • 1. COMPILER DESIGN Myself Archana R Assistant Professor In Department Of Computer Science SACWC. I am here because I love to give presentations. IMPLEMENTATION OF LEXICAL ANALYZER
  • 3. Outline • Specifying lexical structure using regular expressions • Finite automata –Deterministic Finite Automata (DFAs) –Non-deterministic Finite Automata (NFAs) • Implementation of regular expressions Reg Exp NFA DFA Tables
  • 4. Notation • For convenience, we use a variation (allow user- defined abbreviations) in regular expression notation. • Union: A + B  A | B • Option: A +   A? • Range: ‘a’+’b’+…+’z’  [a-z] • Excluded range: complement of [a- z]  [^a-z]
  • 5. Regular Expressions in Lexical Specification • Last lecture: a specification for the predicate, s  L(R) • But a yes/no answer is not enough ! • Instead: partition the input into tokens. • We will adapt regular expressions to this goal.
  • 6. Regular Expressions  Lexical Spec. (1) 1. Select a set of tokens • Integer, Keyword, Identifier, OpenPar, ... 2. Write a regular expression (pattern) for the lexemes of each token • Integer = digit + • Keyword = ‘if’ + ‘else’ + … • Identifier = letter (letter + digit)* • OpenPar = ‘(‘ • …
  • 7. Regular Expressions  Lexical Spec. (2) 3. Construct R, matching all lexemes for all tokens R = Keyword + Identifier + Integer + … = R1 + R2 + R3 + … Facts: If s  L(R) then s is a lexeme – Furthermore s  L(Ri) for some “i” – This “i” determines the token that is reported
  • 8. Regular Expressions  Lexical Spec. (3) 4. Let input be x1…xn • (x1 ... xn arecharacters) • For 1  i  n check x1…xi  L(R) ? 5. It must be thatt x1…xi  L(Rj) for some j (if there is a choice, pick a smallest such j) 6. Remove x1…xi from input and go to previous step
  • 9. How to Handle Spaces and Comments? 1. We could create a token Whitespace Whitespace = (‘ ’ + ‘n’ + ‘t’)+ – We could also add comments in there – An input “ tn 5555 “ is transformed into Whitespace Integer Whitespace 2. Lexer skips spaces (preferred) • Modify step 5 from before as follows: It must be that xk ... xi L(Rj) for some j such that x1 ... xk-1 L(Whitespace) • Parser is not bothered with spaces
  • 10. Ambiguities (1) • There are ambiguities in the algorithm • How much input is used? What if • x1…xi  L(R) and also • x1…xK  L(R) –Rule: Pick the longest possible substring –The “maximal munch”
  • 11. Ambiguities (2) • Which token is used? What if • x1…xi  L(Rj) and also • x1…xi  L(Rk) – Rule: use rule listed first (j if j < k) • Example: –R1 = Keyword and R2 = Identifier –“if” matches both –Treats “if” as a keyword not an identifier
  • 12. Error Handling • What if No rule matches a prefix of input ? • Problem: Can’t just get stuck … • Solution: –Write a rule matching all “bad” strings –Put it last • Lexer tools allow the writing of: R = R1 + ... + Rn +Error –Token Error matches if nothing else matches
  • 13. Regular Languages & FiniteAutomata Basic formal language theory result: Regular expressions and finite automata both define the class of regular languages. Thus, we are going to use: • Regular expressions for specification • Finite automata for implementation (automatic generation of lexical analyzers)
  • 14. FiniteAutomata • A finite automaton is a recognizer for the strings of a regular language A finite automaton consists of –A finite input alphabet  –A set of states S –A start state n –A set of accepting states F  S –A set of transitions state  input state
  • 15. • Transition s1 a s2 • Is read In state s1 on input “a” go to states2 • If end of input (or no transition possible) –If in accepting state  accept –Otherwise  reject
  • 16. Finite Automata State Graphs • A state a • The start state • An accepting state • A transition
  • 17. A Simple Example • A finite automaton that accepts only “1” 1 Another Simple Example • A finite automaton accepting any number of 1’s followed by a single 0 • Alphabet: {0,1} 1 0
  • 18. And Another Example • Alphabet {0,1} • What language does this recognize? 1 1 1 0 0 0 • Alphabet still { 0, 1 } 1 1 • The operation of the automaton is not completely defined by the input – On input “11” the automaton could be in either state
  • 19. Epsilon Moves • Another kind of transition: -moves  A B • Machine can move from state A to state B without reading input
  • 20. Deterministic and Non-Deterministic Automata • Deterministic Finite Automata (DFA) –One transition per input per state –No - moves • Non-deterministic Finite Automata (NFA) –Can have multiple transitions for one input in a given state –Can have - moves • Finite automata have finite memory –Enough to only encode the current state
  • 21. Execution of FiniteAutomata • A DFA can take only one path through the state graph – Completely determined by input • NFAs can choose –Whether to make  -moves –Which of multiple transitions for a single input to take.
  • 22. Acceptance of NFAs • An NFA can get into multiple states 1 0 0 1 • Input: 1 0 1 • Rule: NFA accepts an input if it can get in a final state
  • 23. NFA vs. DFA (1) • NFAs and DFAs recognize the same set of languages (regular languages) • DFAs are easier to implement – There are no choices to consider
  • 24. NFA vs. DFA (2) • For a given language the NFA can be simpler than the DFA NF A 1 0 0 0 DF A 0 1 1 1 0 0 • DFA can be exponentially larger than NFA
  • 25. Regular Expressions to FiniteAutomata • High-level sketch NFA Regular expressions DFA Lexical Specification Table-driven Implementation of DFA
  • 26. Regular Expressions to NFA(1) • For each kind of reg. expr, define an NFA – Notation: NFA for regular expression M • For  • For input a M  a
  • 27. Regular Expressions to NFA(2) • ForAB A B  • For A + B A B    
  • 28. Regular Expressions to NFA(3) • ForA* A   
  • 29. The Trick of NFA to DFA • Simulate the NFA • Each state of DFA = a non-empty subset of states of the NFA • Start state = the set of NFA states reachable through  -moves from NFA start state • Add a transition S  a S’ to DFA iff –S’ is the set of NFA states reachable from any state in S after seeing the input a •considering  -moves as well
  • 30. IMPLEMENTATION • A DFA can be implemented by a 2D table T –One dimension is “states” –Other dimension is “input symbols” –For every transition Si  a Sk define T[i,a] =k • DFA “execution” –If in state Si and input a, read T[i,a] = k and skip to state Sk –Very efficient
  • 31. Table Implementation of a DFA S U 0 1 1 0 1 T 0 1 S T U T T U U T U
  • 32. Implementation (Cont.) • NFA → DFA conversion is at the heart of tools such as lex, ML-Lex or flex. • But, DFAs can be huge. • In practice, lex/ML-Lex/flex-like tools trade off speed for space in the choice of NFA and DFA representations.
  • 33. DFA and Lexer Two differences: • DFAs recognize lexemes. A lexer must return a type of acceptance (token type) rather than simply an accept/reject indication. • DFAs consume the complete string and accept or reject it. A lexer must find the end of the lexeme in the input stream and then find the next one, etc.