Lexical analysis - Compiler Design
Jeena Thomas, Asst Professor, CSE, SJCET Palai
» The scanning (lexical analysis) phase of a compiler
reads the source program as a file of characters
and divides it into tokens.
» It is usually implemented as a subroutine or
co-routine of the parser.
» It is part of the front end of the compiler.
» Each token is a sequence of characters that
represents a unit of information in the source
program.
» Keywords, which are fixed strings of letters, e.g. “if”,
“while”.
» Identifiers, which are user-defined strings
composed of letters and digits.
» Special symbols, such as arithmetic operators.
» Scanners perform a pattern-matching process.
» The techniques used to implement lexical analyzers
can also be applied to other areas such as query
languages and information retrieval systems.
» Since pattern-directed programming is widely
useful, a pattern-action language called Lex was
introduced for specifying lexical analyzers.
» In Lex, patterns are specified by regular
expressions, and a compiler for Lex can generate an
efficient finite-automaton recognizer for the
regular expressions.
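The pattern-action idea can be sketched in a few lines of Python. This is not Lex itself: the rule table `RULES`, the function `run_rules`, and the three patterns are hypothetical names chosen for illustration, and the sketch simply tries the rules in order at each position instead of building a finite automaton.

```python
import re

# Hypothetical pattern-action rules, in the spirit of a Lex specification:
# each rule pairs a regular expression with an action run on the matched text.
RULES = [
    (re.compile(r"[0-9]+"), lambda text: ("NUM", int(text))),
    (re.compile(r"[A-Za-z_][A-Za-z0-9_]*"), lambda text: ("ID", text)),
    (re.compile(r"\s+"), lambda text: None),  # action: skip whitespace
]


def run_rules(source):
    """Apply the first rule that matches at each position of the input."""
    pos, out = 0, []
    while pos < len(source):
        for pattern, action in RULES:
            m = pattern.match(source, pos)
            if m:
                result = action(m.group())
                if result is not None:
                    out.append(result)
                pos = m.end()
                break
        else:
            # No rule matched at this position.
            raise ValueError(f"no rule matches at position {pos}")
    return out
```

For example, `run_rules("count 42")` yields an `ID` entry for `count` followed by a `NUM` entry carrying the integer 42.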
» A software tool that automates the
construction of lexical analyzers allows people
with different backgrounds to use pattern
matching in their own areas.
» Jarvis [1976] used a lexical-analyzer generator to
create a program that recognizes imperfections in
printed circuit boards.
» The circuits were digitally scanned and
converted into “strings” of line segments at
different angles.
» The “lexical analyzer” looked for patterns
corresponding to imperfections in the string of
line segments.
» Such a tool can use the best-known pattern-matching
algorithms and thereby create efficient lexical
analyzers for people who are not experts in
pattern-matching techniques.
» The lexical analyzer is the first phase of a compiler.
» Its main task is to read input characters and produce as
output a sequence of tokens that the parser uses for syntax
analysis.
Type      Examples
ID        foo  n_14  last
NUM       73  00  517  082
REAL      66.1  .5  10.  1e67  5.5e-10
IF        if
COMMA     ,
NOTEQ     !=
LPAREN    (
RPAREN    )

Type                     Examples
comment                  /* ignored */
preprocessor directive   #include <foo.h>
                         #define NUMS 5, 6
macro                    NUMS
whitespace               \t \n \b
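The token types in the first table can be recognized with one regular expression per type. The following is a minimal sketch, assuming Python's `re` module; the names `TOKEN_SPEC`, `MASTER`, `KEYWORDS` and `tokenize` are hypothetical. Python tries alternatives in order rather than taking the longest match as Lex does, so `REAL` is listed before `NUM`, and keywords are separated from identifiers by a table lookup. Comments and whitespace are discarded; preprocessor directives and macros are not handled here.

```python
import re

# One (name, pattern) pair per token type; order matters because Python's
# alternation picks the first alternative that matches.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),
    ("WS", r"\s+"),
    ("REAL", r"(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?|[0-9]+[eE][+-]?[0-9]+"),
    ("NUM", r"[0-9]+"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NOTEQ", r"!="),
    ("COMMA", r","),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,  # let a comment span several lines
)
KEYWORDS = {"if": "IF"}


def tokenize(source):
    """Return (type, lexeme) pairs, dropping comments and whitespace."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("COMMENT", "WS"):
            continue
        if kind == "ID":
            kind = KEYWORDS.get(lexeme, "ID")  # keywords look like identifiers
        tokens.append((kind, lexeme))
    return tokens
```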
» Separating the input source code into tokens.
» Stripping unnecessary white space from
the source code.
» Removing comments from the source text.
» Keeping track of line numbers while scanning
newline characters. These line numbers are used
by the error handler when printing error messages.
» Preprocessing of macros.
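Line-number tracking, together with stripping white space and comments, might look like the sketch below. The names `SKIP`, `WORD` and `scan_with_lines` are hypothetical, and the token pattern is deliberately crude (identifiers, integers, or any single other character); only the bookkeeping is the point.

```python
import re

# Material the scanner discards: white space or a /* ... */ comment.
SKIP = re.compile(r"\s+|/\*.*?\*/", re.DOTALL)
# Simplified token pattern, assumed for illustration only.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[0-9]+|.")


def scan_with_lines(source):
    """Return (line_number, lexeme) pairs; line numbers start at 1."""
    line, pos, out = 1, 0, []
    while pos < len(source):
        m = SKIP.match(source, pos)
        if m:
            # Count the newlines inside the skipped text so that later
            # tokens (and error messages) carry the right line number.
            line += m.group().count("\n")
            pos = m.end()
            continue
        m = WORD.match(source, pos)
        out.append((line, m.group()))
        pos = m.end()
    return out
```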
» There are several reasons for separating the
analysis phase of compiling into lexical analysis
and parsing:
» It leads to a simpler parser design, because the
scanner can eliminate unnecessary tokens.
» Compilation efficiency is improved. Lexical analysis
can take a large share of compilation time because it
handles the source character by character, so
specialized buffering is used to speed it up.
» Portability of the compiler is enhanced, because the
specialized symbols and characters (language- and
machine-specific) are isolated in this phase.
» Three important, closely related terms are connected
with lexical analysis:
» Lexeme
» Token
» Pattern
» A token is a pair consisting of a token name and
an optional attribute value. Typical token names:
keywords, operators, identifiers, constants, literal strings,
punctuation symbols (such as commas and semicolons).
» A lexeme is a sequence of characters in the
source program that matches the pattern for a
token and is identified by the lexical analyzer as
an instance of that token. E.g. the lexemes of a
relational-operator token: {<, <=, >, >=, ==, <>}
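The "token name plus optional attribute" pair maps naturally onto a small record type. A minimal sketch follows; the class `Token`, the table `RELOP_ATTRIBUTE`, the helper `relop_token` and the attribute codes (`LT`, `LE`, ...) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Token:
    name: str                                          # token name, e.g. "relop"
    attribute: Optional[Union[str, int, float]] = None  # optional attribute value


# Every lexeme below is an instance of the same token name, "relop";
# the attribute records which operator was actually seen.
RELOP_ATTRIBUTE = {"<": "LT", "<=": "LE", ">": "GT", ">=": "GE", "==": "EQ", "<>": "NE"}


def relop_token(lexeme):
    """Build the token for a relational-operator lexeme."""
    return Token("relop", RELOP_ATTRIBUTE[lexeme])
```

A token such as a comma needs no attribute, so `Token("comma")` leaves `attribute` as `None`.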
» A pattern is a description of the form that
the lexemes of a token may take.
» It gives an informal or formal description of
a token.
» E.g. identifier
» It serves two purposes:
» It gives a precise description/specification of
tokens.
» It is used to automatically generate a lexical
analyzer.
» const pi = 3.1416;
» The substring pi is a lexeme for the token
“identifier.”
» x=x*(acc+123)
» 1.) Consider the statement “fi(a==f)”. Here “fi”
is a misspelled keyword. This error is not
detected during lexical analysis, because “fi” is
taken as an identifier. The error is detected in later
phases of compilation.
» 2.) If the lexical analyzer is unable to
continue with the process of compilation, it
resorts to panic-mode error recovery:
• Deleting successive characters from the
remaining input until a well-formed token is found.
» Other possible recovery actions:
• Deleting an extraneous character.
• Inserting a missing character.
• Replacing an incorrect character with a correct
character.
• Transposing two adjacent characters.
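Panic-mode recovery can be sketched as follows: when no token matches at the current position, characters are deleted until a token can start again. The token pattern `TOKEN` and the function `scan_panic_mode` are hypothetical; a real scanner would also report the error through the compiler's error handler.

```python
import re

# Hypothetical token set: identifiers, integers, and a few operators.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[0-9]+|[-+*/=()]")


def scan_panic_mode(source):
    """Return (tokens, errors); errors are (position, deleted_text) pairs."""
    tokens, errors, pos = [], [], 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(source, pos)
        if m:
            tokens.append(m.group())
            pos = m.end()
            continue
        # Panic mode: delete successive characters until white space is
        # reached or some token can be recognized again.
        start = pos
        while (pos < len(source)
               and not source[pos].isspace()
               and not TOKEN.match(source, pos)):
            pos += 1
        errors.append((start, source[start:pos]))
    return tokens, errors
```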
» Minimum-distance error correction is a strategy
that a lexical analyzer can follow to correct errors
in lexemes.
» The distance is the minimum number of
corrections needed to convert an invalid
lexeme into a valid lexeme.
» It is not generally used in practice
because it is too costly to implement.
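The "minimum number of corrections" can be computed by dynamic programming over the four single-character operations (insert, delete, replace, transpose adjacent characters). The sketch below uses the optimal-string-alignment variant of edit distance, which is one common choice and not necessarily the one the slides have in mind; `edit_distance`, `closest_keyword` and the keyword list are hypothetical.

```python
def edit_distance(a, b):
    """Fewest insertions, deletions, replacements and adjacent
    transpositions turning a into b (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # replace (or keep) a character
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transpose
    return d[-1][-1]


KEYWORDS = ["if", "while", "else"]  # assumed keyword list


def closest_keyword(lexeme):
    """Keyword at minimum distance from the given lexeme."""
    return min(KEYWORDS, key=lambda kw: edit_distance(lexeme, kw))
```

The table has one cell per pair of prefixes, so each comparison costs time proportional to the product of the two lengths, and it has to be repeated for every candidate lexeme.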
» Scanners are special pattern-matching
processors.
» Regular expressions (RE) are used to represent
patterns of strings of characters.
» A regular expression r is defined by the set of
strings that match it.
» This set is called the language generated by
the regular expression and is written
L(r).
» The set of symbols in the language is called the
alphabet of the language and is written Σ.
» An alphabet is a finite set of symbols.
» Example
» The set of alphabetic characters is
L = {A, …, Z, a, …, z} and the set of digits is
D = {0, 1, …, 9}.
» L ∪ D is a language (each symbol is viewed as a
string of length one).
» Strings over L ∪ D: Begin, Max1, max1, 123, ε, …
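A string is "over" an alphabet when every one of its symbols belongs to that alphabet. A minimal sketch of that check, with hypothetical names `LETTERS`, `DIGITS`, `ALPHABET` and `is_string_over`, and ASCII letters assumed for L:

```python
import string

LETTERS = set(string.ascii_letters)  # L = {A, ..., Z, a, ..., z}
DIGITS = set(string.digits)          # D = {0, 1, ..., 9}
ALPHABET = LETTERS | DIGITS          # L ∪ D


def is_string_over(s, alphabet):
    """True if every symbol of s is in the alphabet (the empty string ε qualifies)."""
    return all(ch in alphabet for ch in s)
```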
» Intersection
» L ∩ M = { s | s is in L and s is in M }
» Exponentiation
» Lⁱ = L Lⁱ⁻¹, with L⁰ = {ε}
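For finite languages these operations can be tried out directly with Python sets of strings. The helper names `concat` and `power` are hypothetical; the empty string `""` stands for ε, and intersection is simply the built-in set operator `&`.

```python
def concat(L, M):
    """Concatenation: every string of L followed by every string of M."""
    return {x + y for x in L for y in M}


def power(L, i):
    """Exponentiation, following L^i = L L^(i-1) with L^0 = {ε}."""
    result = {""}  # L^0 contains only the empty string
    for _ in range(i):
        result = concat(L, result)
    return result
```

For example, `power({"a", "b"}, 2)` gives the four strings of length two over a and b, and `{"a", "ab"} & {"ab", "b"}` gives `{"ab"}`.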
» Choice among alternatives
» Concatenation
» Repetition
» Choice is indicated by the metacharacter ‘|’ (vertical bar).
» r|s
» An R.E. that matches any string matched by either
r or s.
» L(r|s) = L(r) ∪ L(s)
» Consider L(r) = {a},
» L(s) = {b},
» L(t) = {ε},
» L(u) = {}.
» What do the following R.E. represent?
» (i) r|s
» (ii) r|t
» (iii) r|u
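One way to check answers to this exercise is to model each language as a Python set of strings, so that choice becomes set union. The variable names mirror the exercise, `choice` is a hypothetical helper, and `""` stands for ε.

```python
# The four languages from the exercise, as finite sets of strings.
r, s, t, u = {"a"}, {"b"}, {""}, set()


def choice(x, y):
    """L(x|y) = L(x) ∪ L(y)."""
    return x | y
```

Under this model, `choice(r, t)` contains both `"a"` and the empty string, while `choice(r, u)` is just `{"a"}`, because the empty language contributes nothing to a union.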
» Concatenation: rs
» It matches any string that is the concatenation of two
strings, the first of which matches r and the second
of which matches s.
» L(rs) = L(r) L(s)
» Consider L(r) = {a}, L(s) = {b}, L(t) = {ε}, L(u) = {},
L(v) = {c}. What do the following R.E. represent?
» (i) rs
» (ii) rt
» (iii) ru
» (iv) (r|s)v
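The same set-of-strings model can be used to check this exercise. `concat` is a hypothetical helper implementing L(rs) = L(r) L(s), and `""` stands for ε.

```python
def concat(x, y):
    """Every string of x followed by every string of y."""
    return {p + q for p in x for q in y}


# The five languages from the exercise.
r, s, t, u, v = {"a"}, {"b"}, {""}, set(), {"c"}

rs = concat(r, s)                 # (i)
rt = concat(r, t)                 # (ii): concatenating with ε changes nothing
ru = concat(r, u)                 # (iii): concatenating with {} gives {}
r_or_s_then_v = concat(r | s, v)  # (iv)
```

The contrast between (ii) and (iii) shows that {ε} and the empty language {} behave very differently under concatenation.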
» Repetition is also called Kleene closure.
» It represents any finite concatenation of strings,
each of which is a string from L(r).
» r*
» Let S = {a}; then L(a*) = {ε, a, aa, aaa, …}
» S* = {ε} ∪ S ∪ SS ∪ SSS ∪ … = ⋃ Sⁿ (n ≥ 0)
» Consider L(r) = {a},
» L(s) = {b}.
» What do the following R.E. represent?
» (i) r*
» (ii) (rs)*
» (iii) (r|s)*
» (iv) (r|ss)*
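A Kleene closure is infinite, but it can be explored up to a length bound. The sketch below, with hypothetical helpers `concat` and `star_up_to`, enumerates the strings of L* whose length does not exceed `max_len`; `""` stands for ε.

```python
def concat(x, y):
    """Every string of x followed by every string of y."""
    return {p + q for p in x for q in y}


def star_up_to(L, max_len):
    """All strings of L* of length at most max_len."""
    result = {""}    # ε is always in L*
    frontier = {""}  # strings found in the previous round
    while frontier:
        longer = {w for w in concat(frontier, L) if len(w) <= max_len}
        frontier = longer - result  # keep only strings not seen before
        result |= frontier
    return result


r, s = {"a"}, {"b"}
```

For instance, `star_up_to(r | concat(s, s), 3)` lists the strings of (r|ss)* up to length three: every b occurs as part of a pair "bb".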
» Repetition (highest precedence)
» Concatenation (left associative)
» Choice (lowest precedence)
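Python's `re` module applies the same precedence to these three operators, so it can be used to see how an unparenthesized expression is grouped. The helper `matches` is a hypothetical name; `re.fullmatch` requires the whole string to match.

```python
import re


def matches(pattern, text):
    """True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None


# a|bc* groups as a|(b(c*)): repetition binds tightest, choice loosest.
UNPARENTHESIZED = "a|bc*"
EXPLICIT = "a|(b(c*))"
OTHER_GROUPING = "(a|b)c*"
```

The string "ac" tells the groupings apart: it is rejected by `a|bc*` and by `a|(b(c*))`, but accepted by `(a|b)c*`.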
» One or more instances: r+
» Zero or one instance: r?
» Zero or more instances: r*
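The two shorthands add no expressive power: r+ describes the same language as rr*, and r? the same as r|ε. A small sketch that compares two patterns on every string up to a given length, using Python's `re` syntax (where an empty alternative plays the role of ε); the function `agree` and its default alphabet are assumptions.

```python
import re
from itertools import product


def agree(p, q, alphabet="ab", max_len=4):
    """True if patterns p and q accept exactly the same strings
    over the alphabet, for all lengths up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            w = "".join(chars)
            if (re.fullmatch(p, w) is None) != (re.fullmatch(q, w) is None):
                return False
    return True
```

Agreement up to a length bound is only evidence, not a proof, but a disagreement such as the one between `a+` and `a*` on the empty string is found immediately.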
» Regular expression (RE): represents a pattern of
strings of characters.
» Language (L(r)): the set of strings matched by r
» Alphabet (Σ): a set of symbols
» Metacharacter: a special character (not a part
of Σ) used in an R.E., e.g. *, |
» Basic R.E.: an R.E. consisting of only one character
» Regular language: a language defined by an RE
39
Jeena Thomas, Asst Professor, CSE, SJCET Palai

More Related Content

What's hot

Recursive Descent Parsing
Recursive Descent Parsing  Recursive Descent Parsing
Recursive Descent Parsing Md Tajul Islam
 
Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Aman Sharma
 
Lexical analyzer generator lex
Lexical analyzer generator lexLexical analyzer generator lex
Lexical analyzer generator lexAnusuya123
 
Syntax directed translation
Syntax directed translationSyntax directed translation
Syntax directed translationAkshaya Arunan
 
Lexical Analysis - Compiler Design
Lexical Analysis - Compiler DesignLexical Analysis - Compiler Design
Lexical Analysis - Compiler DesignAkhil Kaushik
 
1.10. pumping lemma for regular sets
1.10. pumping lemma for regular sets1.10. pumping lemma for regular sets
1.10. pumping lemma for regular setsSampath Kumar S
 
Chomsky classification of Language
Chomsky classification of LanguageChomsky classification of Language
Chomsky classification of LanguageDipankar Boruah
 
Ll(1) Parser in Compilers
Ll(1) Parser in CompilersLl(1) Parser in Compilers
Ll(1) Parser in CompilersMahbubur Rahman
 
Parsing in Compiler Design
Parsing in Compiler DesignParsing in Compiler Design
Parsing in Compiler DesignAkhil Kaushik
 

What's hot (20)

Compilers
CompilersCompilers
Compilers
 
Syntax analysis
Syntax analysisSyntax analysis
Syntax analysis
 
Lex
LexLex
Lex
 
Recursive Descent Parsing
Recursive Descent Parsing  Recursive Descent Parsing
Recursive Descent Parsing
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Lexical Analysis - Compiler design
Lexical Analysis - Compiler design
 
Asymptotic Notation
Asymptotic NotationAsymptotic Notation
Asymptotic Notation
 
Recognition-of-tokens
Recognition-of-tokensRecognition-of-tokens
Recognition-of-tokens
 
Types of Parser
Types of ParserTypes of Parser
Types of Parser
 
Top down parsing
Top down parsingTop down parsing
Top down parsing
 
Multi Head, Multi Tape Turing Machine
Multi Head, Multi Tape Turing MachineMulti Head, Multi Tape Turing Machine
Multi Head, Multi Tape Turing Machine
 
Lexical analyzer generator lex
Lexical analyzer generator lexLexical analyzer generator lex
Lexical analyzer generator lex
 
Syntax directed translation
Syntax directed translationSyntax directed translation
Syntax directed translation
 
Lexical Analysis - Compiler Design
Lexical Analysis - Compiler DesignLexical Analysis - Compiler Design
Lexical Analysis - Compiler Design
 
1.10. pumping lemma for regular sets
1.10. pumping lemma for regular sets1.10. pumping lemma for regular sets
1.10. pumping lemma for regular sets
 
Chomsky classification of Language
Chomsky classification of LanguageChomsky classification of Language
Chomsky classification of Language
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
 
Ll(1) Parser in Compilers
Ll(1) Parser in CompilersLl(1) Parser in Compilers
Ll(1) Parser in Compilers
 
Parsing in Compiler Design
Parsing in Compiler DesignParsing in Compiler Design
Parsing in Compiler Design
 
Role-of-lexical-analysis
Role-of-lexical-analysisRole-of-lexical-analysis
Role-of-lexical-analysis
 

Viewers also liked

Minimization of DFA
Minimization of DFAMinimization of DFA
Minimization of DFAkunj desai
 
Programming languages,compiler,interpreter,softwares
Programming languages,compiler,interpreter,softwaresProgramming languages,compiler,interpreter,softwares
Programming languages,compiler,interpreter,softwaresNisarg Amin
 
Bottom - Up Parsing
Bottom - Up ParsingBottom - Up Parsing
Bottom - Up Parsingkunj desai
 
Programming Languages / Translators
Programming Languages / TranslatorsProgramming Languages / Translators
Programming Languages / TranslatorsProject Student
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical AnalysisMunni28
 
NFA or Non deterministic finite automata
NFA or Non deterministic finite automataNFA or Non deterministic finite automata
NFA or Non deterministic finite automatadeepinderbedi
 
Intermediate code- generation
Intermediate code- generationIntermediate code- generation
Intermediate code- generationrawan_z
 
Language translator
Language translatorLanguage translator
Language translatorasmakh89
 
Intermediate code generation
Intermediate code generationIntermediate code generation
Intermediate code generationRamchandraRegmi
 

Viewers also liked (20)

Lexical analyzer
Lexical analyzerLexical analyzer
Lexical analyzer
 
Minimization of DFA
Minimization of DFAMinimization of DFA
Minimization of DFA
 
DFA Minimization
DFA MinimizationDFA Minimization
DFA Minimization
 
Programming languages,compiler,interpreter,softwares
Programming languages,compiler,interpreter,softwaresProgramming languages,compiler,interpreter,softwares
Programming languages,compiler,interpreter,softwares
 
NFA to DFA
NFA to DFANFA to DFA
NFA to DFA
 
Bottom - Up Parsing
Bottom - Up ParsingBottom - Up Parsing
Bottom - Up Parsing
 
Programming Languages / Translators
Programming Languages / TranslatorsProgramming Languages / Translators
Programming Languages / Translators
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical Analysis
 
Nfa vs dfa
Nfa vs dfaNfa vs dfa
Nfa vs dfa
 
optimization of DFA
optimization of DFAoptimization of DFA
optimization of DFA
 
Optimization of dfa
Optimization of dfaOptimization of dfa
Optimization of dfa
 
Dfa vs nfa
Dfa vs nfaDfa vs nfa
Dfa vs nfa
 
Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
 
NFA or Non deterministic finite automata
NFA or Non deterministic finite automataNFA or Non deterministic finite automata
NFA or Non deterministic finite automata
 
Intermediate code- generation
Intermediate code- generationIntermediate code- generation
Intermediate code- generation
 
Language translator
Language translatorLanguage translator
Language translator
 
Intermediate code generation
Intermediate code generationIntermediate code generation
Intermediate code generation
 
Translators(Compiler, Assembler) and interpreter
Translators(Compiler, Assembler) and interpreterTranslators(Compiler, Assembler) and interpreter
Translators(Compiler, Assembler) and interpreter
 
Lexical analyzer
Lexical analyzerLexical analyzer
Lexical analyzer
 
Code generation
Code generationCode generation
Code generation
 

Similar to Lexical analysis - Compiler Design

Lexical analysis
Lexical analysisLexical analysis
Lexical analysiswaqar ahmed
 
Structure of the compiler
Structure of the compilerStructure of the compiler
Structure of the compilerSudhaa Ravi
 
Compiler Designs
Compiler DesignsCompiler Designs
Compiler Designswasim liam
 
A Role of Lexical Analyzer
A Role of Lexical AnalyzerA Role of Lexical Analyzer
A Role of Lexical AnalyzerArchana Gopinath
 
Compiler Design
Compiler Design Compiler Design
Compiler Design waqar ahmed
 
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdfTANZINTANZINA
 
Chapter Three(1)
Chapter Three(1)Chapter Three(1)
Chapter Three(1)bolovv
 
Language for specifying lexical Analyzer
Language for specifying lexical AnalyzerLanguage for specifying lexical Analyzer
Language for specifying lexical AnalyzerArchana Gopinath
 
Ch 2.pptx
Ch 2.pptxCh 2.pptx
Ch 2.pptxwoldu2
 
Lexical analysis - Compiler Design
Lexical analysis - Compiler DesignLexical analysis - Compiler Design
Lexical analysis - Compiler DesignKuppusamy P
 
role of lexical anaysis
role of lexical anaysisrole of lexical anaysis
role of lexical anaysisSudhaa Ravi
 
CH 2.pptx
CH 2.pptxCH 2.pptx
CH 2.pptxObsa2
 
Chapter2CDpdf__2021_11_26_09_19_08.pdf
Chapter2CDpdf__2021_11_26_09_19_08.pdfChapter2CDpdf__2021_11_26_09_19_08.pdf
Chapter2CDpdf__2021_11_26_09_19_08.pdfDrIsikoIsaac
 
Lecture 02 lexical analysis
Lecture 02 lexical analysisLecture 02 lexical analysis
Lecture 02 lexical analysisIffat Anjum
 

Similar to Lexical analysis - Compiler Design (20)

Lexical analysis
Lexical analysisLexical analysis
Lexical analysis
 
Compiler construction
Compiler constructionCompiler construction
Compiler construction
 
Structure of the compiler
Structure of the compilerStructure of the compiler
Structure of the compiler
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
Compiler Designs
Compiler DesignsCompiler Designs
Compiler Designs
 
Parsing
ParsingParsing
Parsing
 
A Role of Lexical Analyzer
A Role of Lexical AnalyzerA Role of Lexical Analyzer
A Role of Lexical Analyzer
 
Pcd question bank
Pcd question bank Pcd question bank
Pcd question bank
 
Compiler Design
Compiler Design Compiler Design
Compiler Design
 
Lexical Analysis.pdf
Lexical Analysis.pdfLexical Analysis.pdf
Lexical Analysis.pdf
 
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
 
Chapter Three(1)
Chapter Three(1)Chapter Three(1)
Chapter Three(1)
 
Language for specifying lexical Analyzer
Language for specifying lexical AnalyzerLanguage for specifying lexical Analyzer
Language for specifying lexical Analyzer
 
Ch 2.pptx
Ch 2.pptxCh 2.pptx
Ch 2.pptx
 
Lexical analysis - Compiler Design
Lexical analysis - Compiler DesignLexical analysis - Compiler Design
Lexical analysis - Compiler Design
 
role of lexical anaysis
role of lexical anaysisrole of lexical anaysis
role of lexical anaysis
 
CH 2.pptx
CH 2.pptxCH 2.pptx
CH 2.pptx
 
Chapter2CDpdf__2021_11_26_09_19_08.pdf
Chapter2CDpdf__2021_11_26_09_19_08.pdfChapter2CDpdf__2021_11_26_09_19_08.pdf
Chapter2CDpdf__2021_11_26_09_19_08.pdf
 
Control structure
Control structureControl structure
Control structure
 
Lecture 02 lexical analysis
Lecture 02 lexical analysisLecture 02 lexical analysis
Lecture 02 lexical analysis
 

More from Muhammed Afsal Villan

Software engineering : Layered Architecture
Software engineering : Layered ArchitectureSoftware engineering : Layered Architecture
Software engineering : Layered ArchitectureMuhammed Afsal Villan
 
Bluetooth - Comprehensive Presentation
Bluetooth - Comprehensive PresentationBluetooth - Comprehensive Presentation
Bluetooth - Comprehensive PresentationMuhammed Afsal Villan
 
3D Graphics : Computer Graphics Fundamentals
3D Graphics : Computer Graphics Fundamentals3D Graphics : Computer Graphics Fundamentals
3D Graphics : Computer Graphics FundamentalsMuhammed Afsal Villan
 
Barcodes - Types, Working and Hardware
Barcodes - Types, Working and HardwareBarcodes - Types, Working and Hardware
Barcodes - Types, Working and HardwareMuhammed Afsal Villan
 
Direct memory access (dma) with 8257 DMA Controller
Direct memory access (dma) with 8257 DMA ControllerDirect memory access (dma) with 8257 DMA Controller
Direct memory access (dma) with 8257 DMA ControllerMuhammed Afsal Villan
 

More from Muhammed Afsal Villan (9)

Software engineering : Layered Architecture
Software engineering : Layered ArchitectureSoftware engineering : Layered Architecture
Software engineering : Layered Architecture
 
Software development process models
Software development process modelsSoftware development process models
Software development process models
 
Bluetooth - Comprehensive Presentation
Bluetooth - Comprehensive PresentationBluetooth - Comprehensive Presentation
Bluetooth - Comprehensive Presentation
 
3D Graphics : Computer Graphics Fundamentals
3D Graphics : Computer Graphics Fundamentals3D Graphics : Computer Graphics Fundamentals
3D Graphics : Computer Graphics Fundamentals
 
Properties of Fourier transform
Properties of Fourier transformProperties of Fourier transform
Properties of Fourier transform
 
Barcodes - Types, Working and Hardware
Barcodes - Types, Working and HardwareBarcodes - Types, Working and Hardware
Barcodes - Types, Working and Hardware
 
8255 Programmable parallel I/O
8255 Programmable parallel I/O 8255 Programmable parallel I/O
8255 Programmable parallel I/O
 
Direct memory access (dma) with 8257 DMA Controller
Direct memory access (dma) with 8257 DMA ControllerDirect memory access (dma) with 8257 DMA Controller
Direct memory access (dma) with 8257 DMA Controller
 
Programmable Timer 8253/8254
Programmable Timer 8253/8254Programmable Timer 8253/8254
Programmable Timer 8253/8254
 

Recently uploaded

GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...Bert Jan Schrijver
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainAbdul Ahad
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 

Recently uploaded (20)

GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software Domain
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 

Lexical analysis - Compiler Design

  • 1. Jeena Thomas, Asst Professor, CSE, SJCET Palai 1
  • 2. » The scanning/lexical analysis phase of a compiler performs the task of reading the source program as a file of characters and dividing up into tokens. » Usually implemented as subroutine or co-routine of parser. » Front end of compiler. 2 Jeena Thomas, Asst Professor, CSE, SJCET Palai
  • 3. » Each token is a sequence of characters that represents a unit of information in the source program. 3 Jeena Thomas, Asst Professor, CSE, SJCET Palai
  • 4. » Keywords which are fixed string of letters .eg: “if”, “while”. » Identifiers which are user-defined strings composed of letters and numbers. » Special symbols like arithmetic symbols. 4 Jeena Thomas, Asst Professor, CSE, SJCET Palai
  • 5. » Scanners perform a pattern-matching process.
       » The techniques used to implement lexical analyzers can also be applied to other areas, such as query languages and information retrieval systems.
       » Since pattern-directed programming is widely useful, a pattern-action language called Lex is used for specifying lexical analyzers.
       » In Lex, patterns are specified by regular expressions, and a compiler for Lex can generate an efficient finite-automaton recognizer for the regular expressions.
  • 6. » A software tool that automates the construction of lexical analyzers allows people with different backgrounds to use pattern matching in their own areas.
       » Jarvis [1976] used a lexical-analyzer generator to create a program that recognizes imperfections in printed circuit boards.
       » The circuits are digitally scanned and converted into “strings” of line segments at different angles.
       » The “lexical analyzer” looks for patterns corresponding to imperfections in the string of line segments.
  • 7. » A lexical-analyzer generator can use the best-known pattern-matching algorithms and thereby create efficient lexical analyzers for people who are not experts in pattern-matching techniques.
  • 8. » The lexical analyzer is the first phase of a compiler.
       » Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis.
  • 10. Type      Examples
         ID        foo  n_14  last
         NUM       73  00  517  082
         REAL      66.1  .5  10.  1e67  5.5e-10
         IF        if
         COMMA     ,
         NOTEQ     !=
         LPAREN    (
         RPAREN    )
  • 11. Type                      Examples
         comment                   /* ignored */
         preprocessor directive    #include <foo.h>
                                   #define NUMS 5, 6
         macro                     NUMS
         whitespace                \t  \n  \b
  • 12. » Separating the input source code into tokens.
        » Stripping unnecessary white space from the source code.
        » Removing comments from the source text.
        » Keeping track of line numbers while scanning newline characters. These line numbers are used by the error handler when printing error messages.
        » Preprocessing macros.
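The duties listed on slide 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method the slides prescribe: the token names, the `TOKEN_SPEC` table, the `KEYWORDS` set, and the `scan` function are all invented for the example, and macro preprocessing is left out. Python's `re` tries alternatives from left to right, so `REAL` is listed before `NUM`.

```python
import re

# Hypothetical token patterns for a small C-like language (an assumption,
# not taken from the slides). Earlier entries are tried first.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),        # removed from the token stream
    ("NEWLINE", r"\n"),               # used only to count lines
    ("SKIP",    r"[ \t]+"),           # white space is stripped
    ("REAL",    r"\d+\.\d*|\.\d+"),
    ("NUM",     r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("NOTEQ",   r"!="),
    ("OP",      r"[-+*/=(),;]"),
]
# re.DOTALL lets the comment pattern span several lines.
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC), re.DOTALL)
KEYWORDS = {"if", "while"}


def scan(source):
    """Return a list of (line_number, token_name, lexeme) triples."""
    tokens, line, pos = [], 1, 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"line {line}: unexpected character {source[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "COMMENT":
            line += lexeme.count("\n")    # keep line numbers correct
        elif kind != "SKIP":
            if kind == "ID" and lexeme in KEYWORDS:
                kind = lexeme.upper()     # keywords are reserved identifiers
            tokens.append((line, kind, lexeme))
        pos = m.end()
    return tokens
```

Calling `scan` on a two-line input with a comment in between yields only the real tokens, each tagged with the line it came from, which is the information an error handler would need.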
  • 13. » There are several reasons for separating the analysis phase of compiling into lexical analysis and parsing:
        » The parser design becomes simpler, because unnecessary tokens can be eliminated by the scanner.
        » Compilation becomes more efficient. Lexical analysis is one of the most time-consuming phases of compilation, so specialized buffering techniques are used to speed it up.
        » Compiler portability is enhanced, because specialized symbols and characters (language- and machine-specific) are isolated in this phase.
  • 14. » Three important, closely related terms are connected with lexical analysis:
        » Lexeme
        » Token
        » Pattern
  • 15. » A token is a pair consisting of a token name and an optional attribute value. Token names include keywords, operators, identifiers, constants, literal strings, and punctuation symbols (such as commas and semicolons).
        » A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token. E.g. the lexemes of a relational-operator token: {<, <=, >, >=, ==, <>}
  • 16. » A pattern is a description of the form that the lexemes of a token may take.
        » It gives an informal or formal description of a token.
        » E.g. identifier.
        » It serves two purposes:
        » It gives a precise description (specification) of tokens.
        » It is used to automatically generate a lexical analyzer.
  • 17. » const pi = 3.1416;
        » The substring pi is a lexeme for the token “identifier”.
  • 18. » x=x*(acc+123)
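As a rough illustration of how the statement on slide 18 breaks into (token name, lexeme) pairs, the sketch below scans it with Python's `re` module. The token names (`id`, `num`, `assign`, and so on), the `PATTERNS` table, and the `tokenize` function are assumptions made for this example; the slides do not fix a naming scheme.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    name: str      # token name, e.g. "id" or "num"
    lexeme: str    # the characters that matched the token's pattern


# Hypothetical patterns, just enough for this one statement.
PATTERNS = [
    ("id",     r"[A-Za-z][A-Za-z0-9]*"),
    ("num",    r"[0-9]+"),
    ("assign", r"="),
    ("mult",   r"\*"),
    ("plus",   r"\+"),
    ("lparen", r"\("),
    ("rparen", r"\)"),
]
SCANNER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in PATTERNS))


def tokenize(text):
    """Split text into Token(name, lexeme) pairs, left to right."""
    tokens, pos = [], 0
    while pos < len(text):
        m = SCANNER.match(text, pos)
        if m is None:
            raise ValueError(f"no pattern matches at position {pos}")
        tokens.append(Token(m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for tok in tokenize("x=x*(acc+123)"):
    print(f"{tok.name:8} {tok.lexeme}")
```

The lexemes `x`, `x`, and `acc` all map to the same token name `id`; the lexeme itself (or a symbol-table entry for it) plays the role of the attribute value.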
  • 20. » 1.) Consider the statement “fi(a==f)”. Here “fi” is a misspelled keyword. The lexical analyzer does not detect this error, because “fi” is a valid identifier; the error is detected in a later phase of compilation.
        » 2.) When the lexical analyzer cannot continue because no pattern matches the remaining input, it resorts to error recovery. In panic mode, it deletes successive characters from the remaining input until a token is detected. Other possible recovery actions are:
           • Deleting an extraneous character.
  • 21.    • Inserting a missing character.
           • Replacing an incorrect character with a correct one.
           • Transposing two adjacent characters.
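Panic-mode recovery as described above can be sketched as follows. The tiny token set and the function name `scan_with_panic_mode` are assumptions for illustration only. The example also shows why “fi” passes the scanner without complaint: it matches the identifier pattern.

```python
import re

# Hypothetical token patterns for the demonstration.
TOKEN = re.compile(
    r"(?P<ID>[A-Za-z_]\w*)|(?P<NUM>\d+)|(?P<OP>==|[()=+*])|(?P<WS>\s+)"
)


def scan_with_panic_mode(text):
    """Return (tokens, errors); errors holds (position, skipped_text) pairs."""
    tokens, errors, pos = [], [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            # Panic mode: delete successive characters until a token
            # can be recognized again (or the input ends).
            start = pos
            while pos < len(text) and TOKEN.match(text, pos) is None:
                pos += 1
            errors.append((start, text[start:pos]))
            continue
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens, errors


print(scan_with_panic_mode("fi(a==f)"))   # "fi" is accepted as an ID
print(scan_with_panic_mode("a = @#5"))    # "@#" is skipped and reported
```

The scanner keeps going after the bad characters, so one stray symbol does not hide later errors in the same file.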
  • 22. » Minimum-distance error correction is a strategy a lexical analyzer can follow to correct errors in lexemes.
        » The distance is the minimum number of corrections needed to convert an invalid lexeme into a valid one.
        » It is not generally used in practice because it is too costly to implement.
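To make the idea of a minimum number of corrections concrete, here is a small sketch that counts insertions, deletions, replacements, and transpositions of adjacent characters (the optimal-string-alignment variant of edit distance, chosen here as an assumption because it matches the four recovery actions listed earlier). The `KEYWORDS` list and `nearest_keyword` helper are hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of insert/delete/replace/transpose steps from a to b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # replace (or keep) a character
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transpose
    return d[-1][-1]


KEYWORDS = ["if", "while", "else"]       # hypothetical keyword list


def nearest_keyword(lexeme):
    """Return the keyword with the smallest edit distance to lexeme."""
    return min(KEYWORDS, key=lambda k: edit_distance(lexeme, k))


print(edit_distance("fi", "if"), nearest_keyword("fi"))
```

Each comparison fills a table whose size is the product of the two string lengths, and it must be repeated for every candidate lexeme, which gives a feel for why the slides call the approach costly.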
  • 24. » Scanners are special pattern-matching processors.
        » Regular expressions (RE) are used to represent patterns of strings of characters.
        » A regular expression r is defined by the set of strings that match it.
        » This set is called the language generated by the regular expression and is written L(r).
        » The set of symbols in the language is called the alphabet of the language and is written Σ.
  • 25. » An alphabet is a finite set of symbols.
        » Example:
        » The set of alphabetic characters is L = {A, …, Z, a, …, z} and the set of digits is D = {0, 1, …, 9}.
        » L ∪ D is a language.
        » Strings over L ∪ D: Begin, Max1, max1, 123, ε, …
  • 27. » Intersection
        » L ∩ M = { s | s is in L and s is in M }
        » Exponentiation
        » $L^{i} = L\,L^{i-1}$
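For finite languages these operations map directly onto Python sets, which can help when checking small examples by hand. The sketch below is only illustrative: the helper names `concat` and `power` and the sample languages `L` and `M` are assumptions, and the base case L^0 = {ε} is the standard one (ε is written as the empty string `""`).

```python
def concat(L, M):
    """Concatenation LM: every string of L followed by every string of M."""
    return {s + t for s in L for t in M}


def power(L, i):
    """Exponentiation: L^0 = {""} and L^i = L L^(i-1)."""
    result = {""}
    for _ in range(i):
        result = concat(L, result)
    return result


L = {"a", "b"}     # sample languages, chosen for the example
M = {"b", "c"}

print(L & M)             # intersection
print(concat(L, M))      # concatenation
print(power(L, 2))       # L^2
```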
  • 28. » Choice among alternatives
        » Concatenation
        » Repetition
  • 29. » Indicated by the metacharacter ‘|’ (vertical bar).
        » r|s
        » An R.E. that matches any string matched by either r or s.
        » L(r|s) = L(r) ∪ L(s)
  • 30. » Consider L(r) = {a}, L(s) = {b}, L(t) = {ε}, L(u) = {}.
        » What do the following R.E.s represent?
        » (i) r|s
        » (ii) r|t
        » (iii) r|u
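One way to check answers to this exercise is to model each language as a Python set and apply L(r|s) = L(r) ∪ L(s) directly. The variable names and the `choice` helper are assumptions for the sketch; ε is represented by the empty string.

```python
L_r, L_s, L_t, L_u = {"a"}, {"b"}, {""}, set()   # "" stands for ε


def choice(L1, L2):
    """Language of r|s, given the languages of r and s."""
    return L1 | L2


print(choice(L_r, L_s))   # (i)   r|s
print(choice(L_r, L_t))   # (ii)  r|t
print(choice(L_r, L_u))   # (iii) r|u
```

Case (iii) shows that the empty language is the identity for choice: r|u matches exactly what r matches.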
  • 31. » rs
        » Matches any string that is the concatenation of two strings, the first of which matches r and the second of which matches s.
        » L(rs) = L(r) L(s)
  • 32. » Consider L(r) = {a}, L(s) = {b}, L(t) = {ε}, L(u) = {}, L(v) = {c}. What do the following R.E.s represent?
        » (i) rs
        » (ii) rt
        » (iii) ru
        » (iv) (r|s)v
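The same set-based model can be used to check this exercise, using L(rs) = L(r) L(s). As before, the names are assumptions made for the sketch and ε is the empty string.

```python
L_r, L_s, L_t, L_u, L_v = {"a"}, {"b"}, {""}, set(), {"c"}


def concat(L1, L2):
    """Language of rs: each string of L1 followed by each string of L2."""
    return {x + y for x in L1 for y in L2}


print(concat(L_r, L_s))          # (i)   rs
print(concat(L_r, L_t))          # (ii)  rt
print(concat(L_r, L_u))          # (iii) ru
print(concat(L_r | L_s, L_v))    # (iv)  (r|s)v
```

Cases (ii) and (iii) contrast {ε} with the empty language: concatenating with {ε} changes nothing, while concatenating with {} leaves no strings at all.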
  • 33. » Also called Kleene closure.
        » Represents any finite concatenation of strings, each of which matches r.
        » r*
        » Let S = {a}; then L(a*) = {ε, a, aa, aaa, …}
        » S* = {ε} ∪ S ∪ SS ∪ SSS ∪ … = $\bigcup_{n \ge 0} S^{n}$
  • 34. » Consider L(r) = {a}, L(s) = {b}.
        » What do the following R.E.s represent?
        » (i) r*
        » (ii) (rs)*
        » (iii) (r|s)*
        » (iv) (r|ss)*
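A Kleene closure is infinite in general, so a program can only list its strings up to some length. The sketch below does that; the `star` helper and its `max_len` bound are assumptions made for the example.

```python
def concat(L1, L2):
    return {x + y for x in L1 for y in L2}


def star(L, max_len):
    """All strings of L* whose length is at most max_len."""
    result = {""}                 # ε is always in L*
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string from L.
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result


L_r, L_s = {"a"}, {"b"}
print(sorted(star(L_r, 3), key=len))                          # (i)   r*
print(sorted(star(concat(L_r, L_s), 4), key=len))             # (ii)  (rs)*
print(sorted(star(L_r | L_s, 2), key=len))                    # (iii) (r|s)*
print(sorted(star(L_r | concat(L_s, L_s), 3), key=len))       # (iv)  (r|ss)*
```

The listing for (iv) suggests the general description: strings of a's and b's in which every run of b's has even length.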
  • 35. » Repetition (highest precedence)
        » Concatenation (left associative)
        » Choice (lowest precedence)
  • 36. » One or more instances: r+
        » Zero or one instance: r?
        » Zero or more instances: r*
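The precedence rules and the shorthand operators can be tried out with Python's `re` module, whose syntax follows the same conventions for `|`, concatenation, `*`, `+`, and `?`. This is just a notation check; Python's engine is a backtracking matcher rather than the finite automaton a Lex-style tool would generate, and the `matches` helper is an assumption for the example.

```python
import re


def matches(pattern, text):
    """True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None


# Precedence: a|bc* is read as a|(b(c*)), not as (a|b)c* or (a|bc)*.
print(matches("a|bc*", "bcc"))     # b followed by zero or more c
print(matches("a|bc*", "ac"))      # would need (a|b)c*
print(matches("(a|b)c*", "ac"))    # parentheses override precedence

# Shorthand: r+ is rr*, and r? is r|ε (written "(a|)" here).
for w in ["", "a", "aa", "b"]:
    assert matches("a+", w) == matches("aa*", w)
    assert matches("a?", w) == matches("(a|)", w)
```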
  • 39. » Regular expression (RE): represents a pattern of strings of characters.
        » Language (L(r)): a set of strings.
        » Alphabet (Σ): a set of symbols.
        » Metacharacter: a special character (not part of Σ) used in an R.E., e.g. *, |.
        » Basic R.E.: an R.E. consisting of only one character.
        » Regular language: a language defined by an RE.