Lexical Analyzer
Topics to be covered
1. Interaction of scanner & parser
2. Token, Pattern & Lexemes
3. Input buffering
4. Specification of tokens
5. Regular expression & Regular definition
6. Transition diagram
7. Hard coding & automatic generation lexical analyzers
8. Finite automata
9. Regular expression to NFA using Thompson's rule
10. Conversion from NFA to DFA using subset construction method
11. DFA optimization
12. Conversion from regular expression to DFA
1. Interaction with Scanner & Parser
Interaction of scanner & parser
• Upon receiving a “Get next token” command from the parser, the lexical analyzer reads
input characters until it can identify the next token.
• The lexical analyzer also strips out comments and white space (blanks, tabs, and
newline characters) from the source program.
[Figure: Source Program -> Lexical Analyzer -> Parser. The parser sends “Get next token” to the lexical analyzer, which returns a Token; both consult the Symbol Table.]
Why separate lexical analysis & parsing?
1. Simplicity in design.
2. Improves compiler efficiency.
3. Enhances compiler portability.
2. Token, Pattern & Lexemes
Token, Pattern & Lexemes
Token
A sequence of characters having a collective meaning is known as a token.
Categories of tokens:
1. Identifier
2. Keyword
3. Operator
4. Special symbol
5. Constant
Pattern
The set of rules associated with a token is called its pattern.
Example: “non-empty sequence of digits”,
“letter followed by letters and digits”
Lexeme
A sequence of characters in the source program that matches the pattern for a token
is called a lexeme.
Example: Rate, DIET, count, Flag
Example: Token, Pattern & Lexemes
Example: total = sum + 45
Lexeme | Token
total | Identifier1
= | Operator1
sum | Identifier2
+ | Operator2
45 | Constant1
Lexemes of identifier: total, sum
Lexemes of operator: =, +
Lexemes of constant: 45
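The split of total = sum + 45 into tokens and lexemes can be sketched in Python. This is only an illustration: the category names, the patterns, and the function tokenize are assumptions made for this example, not part of the slides.

```python
import re

# Hypothetical token categories and patterns, chosen for this example only.
TOKEN_SPEC = [
    ("CONSTANT", r"\d+"),                      # non-empty sequence of digits
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),   # letter followed by letters and digits
    ("OPERATOR", r"[=+\-*/]"),
    ("SKIP", r"\s+"),                          # white space is stripped, not returned
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token, lexeme) pairs for the given source text."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for token, lexeme in tokenize("total = sum + 45"):
    print(token, lexeme)
```

Each pattern plays the role of the rule for its token, and the matched text is the lexeme.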
3. Input buffering
Input buffering
• There are mainly two techniques for input buffering:
1. Buffer pairs
2. Sentinels
• The lexical analyzer scans the input string from left to right, one character at a
time.
• The buffer is divided into two N-character halves, where N is the number of characters on
one disk block.
[Figure: Buffer pair holding the input "E = Mi * C * * 2 eof" across its two halves]
Buffer pairs
• The pointer lexeme_beginning marks the beginning of the current lexeme.
• The pointer forward scans ahead until a pattern match is found.
• Once the next lexeme is determined, forward is set to the character at its right end.
• lexeme_beginning is then set to the character immediately after the lexeme just found.
• If the forward pointer is at the end of the first buffer half, the second half is filled with N
input characters.
• If the forward pointer is at the end of the second buffer half, the first half is filled with N
input characters.
[Figure: buffer pair with the lexeme_beginning and forward pointers; forward moves from the first half into the second half]
Buffer pairs
Code to advance forward pointer
if forward at end of first half then begin
    reload second half;
    forward := forward + 1;
end
else if forward at end of second half then begin
    reload first half;
    move forward to beginning of first half;
end
else forward := forward + 1;
[Figure: the forward pointer advancing through the first half, into the second half, and back to the start of the first half]
Sentinels
• With buffer pairs we must check, each time we move the forward pointer, that we
have not moved off one of the buffers.
• Thus, for each character read, we make two tests: one for the end of the buffer and
one for the character itself.
• We can combine the buffer-end test with the test for the current character.
• We can reduce the two tests to one if we extend each buffer half to hold a sentinel
character at the end.
• The sentinel is a special character that cannot be part of the source program, and
a natural choice is the character EOF.
[Figure: buffer pair with an eof sentinel at the end of each half, plus the eof that marks the end of the input]
Sentinels
forward := forward + 1;
if forward = eof then begin
    if forward at end of first half then begin
        reload second half;
        forward := forward + 1;
    end
    else if forward at end of second half then begin
        reload first half;
        move forward to beginning of first half;
    end
    else terminate lexical analysis;  (eof inside a half marks the end of the input)
end
[Figure: the forward pointer meeting the eof sentinel at the end of the first half and continuing in the second half]
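The sentinel scheme above can be sketched as a small Python class. It is a minimal illustration under stated assumptions: the class name TwoBufferReader, the NUL character used as the sentinel, and the tiny half size n are all choices made for this example, and the lexeme_beginning pointer is left out.

```python
import io

EOF = "\0"  # sentinel; assumed never to occur in the source text


class TwoBufferReader:
    """Buffer pair with sentinels: two n-character halves, each followed by EOF."""

    def __init__(self, stream, n=4):
        self.stream = stream
        self.n = n
        # Layout: [0..n-1] first half, [n] sentinel,
        #         [n+1..2n] second half, [2n+1] sentinel.
        self.buf = [EOF] * (2 * n + 2)
        self._reload(0)
        self.forward = -1  # advanced before every read

    def _reload(self, start):
        chunk = self.stream.read(self.n)
        for i in range(self.n):
            # A short read leaves EOF inside the half, marking end of input.
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF

    def next_char(self):
        """Advance forward; return the next character, or None at end of input."""
        self.forward += 1
        if self.buf[self.forward] == EOF:         # the only test in the common case
            if self.forward == self.n:            # sentinel at end of first half
                self._reload(self.n + 1)
                self.forward += 1
            elif self.forward == 2 * self.n + 1:  # sentinel at end of second half
                self._reload(0)
                self.forward = 0
            else:
                return None                       # eof inside a half: input is done
            if self.buf[self.forward] == EOF:
                return None                       # the refilled half is empty
        return self.buf[self.forward]


def read_all(text, n=4):
    reader = TwoBufferReader(io.StringIO(text), n)
    out = []
    while (ch := reader.next_char()) is not None:
        out.append(ch)
    return "".join(out)


print(read_all("E = M * C ** 2"))
```

Only when the current character equals the sentinel does the code pay for the extra checks that decide which half to reload.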
4. Specification of tokens
Strings and languages
Term | Definition
Prefix of S | A string obtained by removing zero or more trailing symbols of string S. e.g., ban is a prefix of banana.
Suffix of S | A string obtained by removing zero or more leading symbols of string S. e.g., nana is a suffix of banana.
Substring of S | A string obtained by removing a prefix and a suffix from S. e.g., nan is a substring of banana.
Proper prefix, suffix and substring of S | Any nonempty string x that is, respectively, a prefix, suffix or substring of S such that S ≠ x.
Subsequence of S | A string obtained by removing zero or more not necessarily contiguous symbols from S. e.g., baaa is a subsequence of banana.
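The terms in the table can be checked with a few lines of Python. The helper names below are hypothetical and exist only for this sketch.

```python
def prefixes(s):
    """All prefixes of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]


def suffixes(s):
    """All suffixes of s, from s itself down to the empty string."""
    return [s[i:] for i in range(len(s) + 1)]


def substrings(s):
    """All substrings of s (remove a prefix and a suffix)."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def is_subsequence(x, s):
    """True if x can be obtained from s by deleting zero or more symbols."""
    remaining = iter(s)
    return all(ch in remaining for ch in x)


print("ban" in prefixes("banana"))
print("nana" in suffixes("banana"))
print("nan" in substrings("banana"))
print(is_subsequence("baaa", "banana"))
```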
Exercise
• Write prefix, suffix, substring, proper prefix, proper suffix and subsequence of
following string:
String: Compiler
Operations on languages
Operation | Definition
Union of L and M, written L ∪ M | L ∪ M = { s | s is in L or s is in M }
Concatenation of L and M, written LM | LM = { st | s is in L and t is in M }
Kleene closure of L, written L* | L* denotes “zero or more concatenations of” L.
Positive closure of L, written L+ | L+ denotes “one or more concatenations of” L.
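For finite languages these operations map directly onto Python sets. The sketch below is an illustration only; since L* and L+ are infinite in general, the hypothetical parameter max_n bounds the number of concatenations.

```python
def union(L, M):
    return L | M


def concat(L, M):
    return {s + t for s in L for t in M}


def kleene(L, max_n):
    """Strings of L* that use at most max_n concatenations of L."""
    result = {""}          # zero concatenations give the empty string
    current = {""}
    for _ in range(max_n):
        current = concat(current, L)
        result |= current
    return result


def positive(L, max_n):
    """Strings of L+ that use between 1 and max_n concatenations of L."""
    return concat(L, kleene(L, max_n - 1))


L = {"a", "b"}
M = {"0", "1"}
print(sorted(union(L, M)))
print(sorted(concat(L, M)))
print(sorted(kleene({"a"}, 3)))
print(sorted(positive({"a"}, 3)))
```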
5. Regular Expression & Regular Definition
Regular expression
• A regular expression is a sequence of characters that defines a pattern.
Notational shorthands
1. One or more instances: +
2. Zero or more instances: *
3. Zero or one instance: ?
4. Alphabet: Σ
Rules to define regular expression
1. ε is a regular expression that denotes {ε}, the set containing the empty string.
2. If a is a symbol in Σ, then a is a regular expression and L(a) = {a}.
3. Suppose r and s are regular expressions denoting the languages L(r) and L(s).
Then,
a. r | s is a regular expression denoting L(r) ∪ L(s)
b. rs is a regular expression denoting L(r)L(s)
c. r* is a regular expression denoting (L(r))*
d. (r) is a regular expression denoting L(r)
The language denoted by a regular expression is said to be a regular set.
Regular expression
• L = zero or more occurrences of a = a*
L = {ε, a, aa, aaa, aaaa, aaaaa, …} (infinite)
Regular expression
• L = one or more occurrences of a = a+
L = {a, aa, aaa, aaaa, aaaaa, …} (infinite)
Precedence and associativity of operators
Operator | Precedence | Associativity
Kleene * | 1 | left
Concatenation | 2 | left
Union | | 3 | left
Regular expression examples
1. 0 or 1
Strings: 0, 1
R.E. = 0 | 1
2. 0 or 11 or 111
Strings: 0, 11, 111
R.E. = 0 | 11 | 111
3. String having zero or more a.
Strings: ε, a, aa, aaa, aaaa …
R.E. = a*
4. String having one or more a.
Strings: a, aa, aaa, aaaa …
R.E. = a+
5. Regular expression over Σ = {a, b, c} that represents all strings of length 3.
Strings: abc, bca, bbb, cab, aba …
R.E. = (a|b|c)(a|b|c)(a|b|c)
6. All binary strings
Strings: 0, 11, 101, 10101, 1111 …
R.E. = (0 | 1)+
Regular expression examples
7. 0 or more occurrences of either a or b or both
Strings: ε, a, aa, abab, bab …
R.E. = (a | b)*
8. 1 or more occurrences of either a or b or both
Strings: a, aa, abab, bab, bbbaaa …
R.E. = (a | b)+
9. Binary no. ends with 0
Strings: 0, 10, 100, 1010, 11110 …
R.E. = (0 | 1)*0
10. Binary no. ends with 1
Strings: 1, 101, 1001, 10101, …
R.E. = (0 | 1)*1
11. Binary no. starts and ends with 1
Strings: 11, 101, 1001, 10101, …
R.E. = 1(0 | 1)*1
12. String starts and ends with same character
Strings: 00, 101, aba, baab …
R.E. = 1(0 | 1)*1 | 0(0 | 1)*0 for binary strings,
or a(a | b)*a | b(a | b)*b for strings over {a, b}
Regular expression examples
13. All strings of a and b starting with a
Strings: a, ab, aab, abb …
R.E. = a(a | b)*
14. String of 0 and 1 ends with 00
Strings: 00, 100, 000, 1000, 1100 …
R.E. = (0 | 1)*00
15. String ends with abb
Strings: abb, babb, ababb …
R.E. = (a | b)*abb
16. String starts with 1 and ends with 0
Strings: 10, 100, 110, 1000, 1100 …
R.E. = 1(0 | 1)*0
17. All binary strings with at least 3 characters where the 3rd character is zero
Strings: 000, 100, 1100, 1001 …
R.E. = (0|1)(0|1)0(0 | 1)*
18. Language which consists of exactly two b’s over the set Σ = {a, b}
Strings: bb, bab, aabb, abba …
R.E. = a*ba*ba*
Regular expression examples
19. The language with Σ = {a, b} such that the 3rd character from the right end of the string
is always a.
Strings: aaa, aba, aaba, abb …
R.E. = (a | b)*a(a|b)(a|b)
20. Any no. of a followed by any no. of b followed by any no. of c
Strings: ε, abc, aabbcc, aabc, abb …
R.E. = a*b*c*
21. String should contain at least three 1
Strings: 111, 01101, 0101110 …
R.E. = (0|1)*1(0|1)*1(0|1)*1(0|1)*
22. String should contain exactly two 1
Strings: 11, 0101, 1100, 010010, 100100 …
R.E. = 0*10*10*
23. Length of string should be at least 1 and at most 3
Strings: 0, 1, 11, 01, 111, 010, 100 …
R.E. = (0|1) | (0|1)(0|1) | (0|1)(0|1)(0|1)
24. No. of zeros should be a multiple of 3
Strings: 000, 010101, 110100, 000000, 100010010 …
R.E. = 1*(01*01*01*)*
(the leading 1* also admits strings of 1s with no zero at all, since 0 is a multiple of 3)
Regular expression examples
25. The language with Σ = {a, b, c} where the no. of a should be a multiple of 3
Strings: aaa, baaa, bacaba, aaaaaa …
R.E. = (b|c)*(a(b|c)*a(b|c)*a(b|c)*)*
26. Even no. of 0
Strings: 00, 0101, 0000, 100100 …
R.E. = 1*(01*01*)*
27. String should have odd length
Strings: 0, 010, 110, 000, 10010 …
R.E. = (0|1)((0|1)(0|1))*
28. String should have even length
Strings: 00, 0101, 0000, 100100 …
R.E. = ((0|1)(0|1))*
29. String starts with 0 and has odd length
Strings: 0, 010, 000, 00010 …
R.E. = 0((0|1)(0|1))*
30. String starts with 1 and has even length
Strings: 10, 1100, 1000, 100100 …
R.E. = 1(0|1)((0|1)(0|1))*
31. All strings that begin or end with 00 or 11
Strings: 00101, 10100, 110, 01011 …
R.E. = (00|11)(0 | 1)* | (0 | 1)*(00|11)
Regular expression examples
32. Language of all strings containing both 11 and 00 as substrings
Strings: 0011, 1100, 100110, 010011 …
R.E. = ((0|1)*00(0|1)*11(0|1)*) | ((0|1)*11(0|1)*00(0|1)*)
33. String ending with 1 and not containing 00
Strings: 011, 1101, 1011 …
R.E. = (1 | 01)+
34. Language of C identifier
Strings: area, i, redious, grade1 …
R.E. = (_ | L)(_ | L | D)*
where L is letter & D is digit
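A few of the expressions above can be sanity-checked with Python's re module, whose |, * and + operators have the same meaning as in this notation. The test strings and the helper failures are assumptions chosen for this sketch; re.fullmatch is used so that the whole string must belong to the language.

```python
import re

# (description, pattern, strings that must match, strings that must not match)
CASES = [
    ("ends with abb", r"(a|b)*abb", ["abb", "babb", "ababb"], ["ab", "abba"]),
    ("exactly two b", r"a*ba*ba*", ["bb", "bab", "aabb", "abba"], ["b", "bbb"]),
    ("3rd from right is a", r"(a|b)*a(a|b)(a|b)", ["aaa", "aba", "aaba", "abb"], ["bab", "ab"]),
    ("ends with 1, no 00", r"(1|01)+", ["011", "1101", "1011"], ["001", "10"]),
    ("odd length", r"(0|1)((0|1)(0|1))*", ["0", "010", "10010"], ["", "00"]),
    ("C identifier", r"(_|[A-Za-z])(_|[A-Za-z]|[0-9])*", ["area", "i", "grade1"], ["1grade", ""]),
]


def failures(cases):
    """Return every (description, string) pair whose result is unexpected."""
    bad = []
    for description, pattern, accepted, rejected in cases:
        for s in accepted:
            if re.fullmatch(pattern, s) is None:
                bad.append((description, s))
        for s in rejected:
            if re.fullmatch(pattern, s) is not None:
                bad.append((description, s))
    return bad


print(failures(CASES))
```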
Regular definition
• A regular definition gives names to certain regular expressions and uses those
names in other regular expressions.
• A regular definition is a sequence of definitions of the form:
d1 → r1
d2 → r2
……
dn → rn
where di is a distinct name & ri is a regular expression.
• Example: Regular definition for identifier
letter → A|B|C|…|Z|a|b|…|z
digit → 0|1|…|9
id → letter (letter | digit)*
Regular definition example
• Example: Unsigned Pascal numbers
3
5280
39.37
6.336E4
1.894E-4
2.56E+7
Regular Definition
digit → 0|1|…|9
digits → digit digit*
optional_fraction → . digits | ε
optional_exponent → (E (+|-|ε) digits) | ε
num → digits optional_fraction optional_exponent
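The regular definition for num translates almost line by line into a Python pattern built from named pieces. This is a sketch; the variable names mirror the definition, and is_num is a hypothetical helper.

```python
import re

digit = r"[0-9]"
digits = rf"{digit}{digit}*"
optional_fraction = rf"(\.{digits})?"       # . digits | ε
optional_exponent = rf"(E[+-]?{digits})?"   # (E (+|-|ε) digits) | ε
num = re.compile(rf"{digits}{optional_fraction}{optional_exponent}")


def is_num(text):
    """True if the whole text is an unsigned number under the definition above."""
    return num.fullmatch(text) is not None


for sample in ["3", "5280", "39.37", "6.336E4", "1.894E-4", "2.56E+7"]:
    print(sample, is_num(sample))
```

Building the pattern from smaller named strings keeps the correspondence with the definition visible, which is the point of a regular definition.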
6. Transition Diagram
Transition Diagram
• A stylized flowchart is called a transition diagram.
[Legend: the slide shows the symbols used for a state, a transition, a start state, and a final state; the symbols themselves were images.]
Transition Diagram : Relational operator
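[Figure: transition diagram for relational operators, given here as a transition list]
State 0 (start): on < go to state 1; on = go to state 5; on > go to state 6
State 1: on = go to state 2; on > go to state 3; on other go to state 4
State 2: return (relop, LE)
State 3: return (relop, NE)
State 4: return (relop, LT)
State 5: return (relop, EQ)
State 6: on = go to state 7; on other go to state 8
State 7: return (relop, GE)
State 8: return (relop, GT)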
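A hard-coded recognizer for this diagram might look like the following Python sketch. The function name relop and its return shape are assumptions; the state numbers in the comments follow the usual numbering of this diagram, and on an "other" edge the extra character is not consumed.

```python
def relop(text, pos=0):
    """Recognize a relational operator starting at text[pos].

    Returns ((token, attribute), next_pos), or None if no operator starts here.
    """
    ch = text[pos] if pos < len(text) else ""
    nxt = text[pos + 1] if pos + 1 < len(text) else ""
    if ch == "<":                              # state 1
        if nxt == "=":
            return ("relop", "LE"), pos + 2    # state 2
        if nxt == ">":
            return ("relop", "NE"), pos + 2    # state 3
        return ("relop", "LT"), pos + 1        # state 4: "other" is left unread
    if ch == "=":
        return ("relop", "EQ"), pos + 1        # state 5
    if ch == ">":                              # state 6
        if nxt == "=":
            return ("relop", "GE"), pos + 2    # state 7
        return ("relop", "GT"), pos + 1        # state 8: "other" is left unread
    return None


print(relop("<= b"))
print(relop("a <> b", 2))
```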
Transition diagram : Unsigned number
Examples: 3, 5280, 39.37, 1.894 E - 4, 2.56 E + 7, 45 E + 6, 96 E 2
[Figure: transition diagram for unsigned numbers with states 1 to 8, entered at "start"; edges are labeled digit, ".", E, "+ or -", and other.]
7. Hard coding & automatic generation of lexical analyzers
Hard coding and automatic generation of lexical analyzers
• Lexical analysis is about identifying patterns in the input.
• To recognize a pattern, a transition diagram is constructed and implemented directly in code.
• This is known as a hard-coded lexical analyzer.
• Example: to represent an identifier in ‘C’, the first character must be a letter and the other
characters are either letters or digits.
• To recognize this pattern, a hard-coded lexical analyzer works from a transition
diagram.
[Figure: transition diagram for identifiers with states 1 to 3, entered at "Start"; edges are labeled Letter and Letter or digit.]
• An automatically generated lexical analyzer is built by a tool that takes a special notation as input.
• For example, the lex compiler takes regular expressions as input and produces an analyzer
that finds the patterns matching those regular expressions.
8. Finite Automata
Finite Automata
• Finite automata are recognizers.
• An FA simply says “Yes” or “No” about each possible input string.
• A finite automaton is a mathematical model consisting of:
1. A set of states S
2. A set of input symbols Σ
3. A transition function move
4. An initial state S0
5. A set of final (accepting) states F
Types of finite automata
• Types of finite automata are:
• Nondeterministic finite automata (NFA):
There are no restrictions on the edges leaving a state. There can be several edges
with the same symbol as label, and some edges can be labeled with ε.
• Deterministic finite automata (DFA):
Each state has exactly one edge leaving it for each symbol.
[Figure: an NFA and a DFA, each drawn with states 1 to 4 and edges labeled a and b]
9. Regular expression to NFA using Thompson's rule
Regular expression to NFA using Thompson's rule
1. For ε, construct the NFA
start -> i --ε--> f
2. For a in Σ, construct the NFA
start -> i --a--> f
3. For regular expression st
start -> i [N(s)] [N(t)] f, with N(s) followed directly by N(t)
Ex: ab
1 --a--> 2 --b--> 3
Regular expression to NFA using Thompson's rule
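4. For regular expression s|t
A new start state i has ε edges to the start states of N(s) and N(t); the final states
of N(s) and N(t) have ε edges to a new final state f.
Ex: (a|b)
[Figure: NFA with states 1 to 6: ε edges lead from state 1 into the a branch and the b branch, and ε edges lead from both branches into state 6.]
5. For regular expression s*
A new start state i has ε edges to the start state of N(s) and to a new final state f; the
final state of N(s) has ε edges back to the start state of N(s) and on to f.
Ex: a*
[Figure: NFA with states 1 to 4: 1 --ε--> 2 --a--> 3 --ε--> 4, plus ε edges from 3 back to 2 and from 1 to 4.]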
Regular expression to NFA using Thompson's rule
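• a*b
[Figure: NFA for a*b: the a* fragment on states 1 to 4, followed by a b edge from state 4 to state 5.]
• b*ab
[Figure: NFA for b*ab: the b* fragment on states 1 to 4, followed by an a edge from state 4 to state 5 and a b edge from state 5 to state 6.]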
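The five rules can be sketched in Python over a small expression tree. Everything here is an assumption made for illustration: the tuple encoding ("sym", "cat", "alt", "star"), the function names, and one simplification, namely that concatenation links the two fragments with an ε edge instead of merging the final state of N(s) with the start state of N(t). The resulting NFA accepts the same language but has more states than the figures.

```python
from itertools import count

EPS = None  # label used for ε edges


def thompson(node, edges, ids):
    """Build an NFA fragment for node; return its (start, final) states."""
    kind = node[0]
    if kind == "sym":                          # rule 2: i --a--> f
        i, f = next(ids), next(ids)
        edges.append((i, node[1], f))
        return i, f
    s_i, s_f = thompson(node[1], edges, ids)
    if kind == "star":                         # rule 5
        i, f = next(ids), next(ids)
        edges.extend([(i, EPS, s_i), (s_f, EPS, s_i), (s_f, EPS, f), (i, EPS, f)])
        return i, f
    t_i, t_f = thompson(node[2], edges, ids)
    if kind == "cat":                          # rule 3, using an ε link
        edges.append((s_f, EPS, t_i))
        return s_i, t_f
    if kind == "alt":                          # rule 4
        i, f = next(ids), next(ids)
        edges.extend([(i, EPS, s_i), (i, EPS, t_i), (s_f, EPS, f), (t_f, EPS, f)])
        return i, f
    raise ValueError(f"unknown node kind {kind!r}")


def eps_closure(states, edges):
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for q, label, r in edges:
            if q == p and label is EPS and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(tree, text):
    """Simulate the NFA built from tree on text."""
    edges = []
    start, final = thompson(tree, edges, count(1))
    current = eps_closure({start}, edges)
    for ch in text:
        moved = {r for q, label, r in edges if q in current and label == ch}
        current = eps_closure(moved, edges)
    return final in current


A_STAR_B = ("cat", ("star", ("sym", "a")), ("sym", "b"))                 # a*b
B_STAR_AB = ("cat", ("cat", ("star", ("sym", "b")), ("sym", "a")), ("sym", "b"))  # b*ab
print(accepts(A_STAR_B, "aaab"), accepts(B_STAR_AB, "bbab"))
```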
Exercise
Convert following regular expression to NFA:
1. abba
2. bb(a)*
3. (a|b)*
4. a* | b*
5. a(a)*ab
6. aa*+ bb*
7. (a+b)*abb
8. 10(0+1)*1
9. (a+b)*a(a+b)
10. (0+1)*010(0+1)*
11. (010+00)*(10)*
12. 100(1)*00(0+1)*
10. Conversion from NFA to DFA using subset construction method
Subset construction algorithm
Input: An NFA N.
Output: A DFA D accepting the same language.
Method: The algorithm constructs a transition table Dtran for D. We use the following
operations:
Operation | Description
ε-closure(s) | Set of NFA states reachable from NFA state s on ε-transitions alone.
ε-closure(T) | Set of NFA states reachable from some NFA state s in T on ε-transitions alone.
Move(T, a) | Set of NFA states to which there is a transition on input symbol a from some NFA state s in T.
Subset construction algorithm
initially, ε-closure(s0) is the only state in Dstates and it is unmarked;
while there is an unmarked state T in Dstates do begin
    mark T;
    for each input symbol a do begin
        U = ε-closure(move(T, a));
        if U is not in Dstates then
            add U as an unmarked state to Dstates;
        Dtran[T, a] = U
    end
end
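The algorithm can be sketched in Python as follows. The NFA edge table is an assumption: it encodes an NFA for (a|b)*abb with states 0 to 10 that is consistent with the ε-closure and Move values worked out in the example slides; the names NFA, eps_closure, move and subset_construction are chosen for this sketch.

```python
EPS = "eps"

# Assumed NFA for (a|b)*abb: (state, label) -> set of next states.
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4}, (2, "a"): {3}, (4, "b"): {5},
    (3, EPS): {6}, (5, EPS): {6}, (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}


def eps_closure(T):
    """NFA states reachable from some state in T on ε-transitions alone."""
    stack, closure = list(T), set(T)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(T, a):
    """NFA states reachable on symbol a from some state in T."""
    return {t for s in T for t in NFA.get((s, a), ())}


def subset_construction(start, alphabet):
    d0 = eps_closure({start})
    dstates, unmarked, dtran = [d0], [d0], {}
    while unmarked:
        T = unmarked.pop(0)                  # mark T
        for a in alphabet:
            U = eps_closure(move(T, a))
            if U not in dstates:
                dstates.append(U)            # add U as an unmarked state
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran


dstates, dtran = subset_construction(0, "ab")
names = dict(zip(dstates, "ABCDE"))
for T in dstates:
    print(names[T], sorted(T), names[dtran[(T, "a")]], names[dtran[(T, "b")]])
```

Processing the unmarked states in first-in, first-out order names the DFA states in the order A, B, C, D, E.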
Conversion from NFA to DFA
Example: (a|b)*abb
[Figure: NFA for (a|b)*abb with states 0 to 10; edges are labeled a, b, and ε.]
Conversion from NFA to DFA
ε-closure(0) = {0, 1, 7, 2, 4}
= {0,1,2,4,7} ---- A
Conversion from NFA to DFA
A = {0, 1, 2, 4, 7}
Move(A,a) = {3,8}
ε-closure(Move(A,a)) = {3, 6, 7, 1, 2, 4, 8}
= {1,2,3,4,6,7,8} ---- B
States | a | b
A = {0,1,2,4,7} | B | -
B = {1,2,3,4,6,7,8} | - | -
Conversion from NFA to DFA
A = {0, 1, 2, 4, 7}
Move(A,b) = {5}
ε-closure(Move(A,b)) = {5, 6, 7, 1, 2, 4}
= {1,2,4,5,6,7} ---- C
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | - | -
C = {1,2,4,5,6,7} | - | -
Conversion from NFA to DFA
B = {1, 2, 3, 4, 6, 7, 8}
Move(B,a) = {3,8}
ε-closure(Move(B,a)) = {3, 6, 7, 1, 2, 4, 8}
= {1,2,3,4,6,7,8} ---- B
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | -
C = {1,2,4,5,6,7} | - | -
Conversion from NFA to DFA
B = {1, 2, 3, 4, 6, 7, 8}
Move(B,b) = {5,9}
ε-closure(Move(B,b)) = {5, 6, 7, 1, 2, 4, 9}
= {1,2,4,5,6,7,9} ---- D
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | - | -
D = {1,2,4,5,6,7,9} | - | -
Conversion from NFA to DFA
C = {1, 2, 4, 5, 6, 7}
Move(C,a) = {3,8}
ε-closure(Move(C,a)) = {3, 6, 7, 1, 2, 4, 8}
= {1,2,3,4,6,7,8} ---- B
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | -
D = {1,2,4,5,6,7,9} | - | -
Conversion from NFA to DFA
C = {1, 2, 4, 5, 6, 7}
Move(C,b) = {5}
ε-closure(Move(C,b)) = {5, 6, 7, 1, 2, 4}
= {1,2,4,5,6,7} ---- C
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | C
D = {1,2,4,5,6,7,9} | - | -
Conversion from NFA to DFA
D = {1, 2, 4, 5, 6, 7, 9}
Move(D,a) = {3,8}
ε-closure(Move(D,a)) = {3, 6, 7, 1, 2, 4, 8}
= {1,2,3,4,6,7,8} ---- B
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | C
D = {1,2,4,5,6,7,9} | B | -
Conversion from NFA to DFA
D = {1, 2, 4, 5, 6, 7, 9}
Move(D,b) = {5,10}
ε-closure(Move(D,b)) = {5, 6, 7, 1, 2, 4, 10}
= {1,2,4,5,6,7,10} ---- E
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | C
D = {1,2,4,5,6,7,9} | B | E
E = {1,2,4,5,6,7,10} | - | -
Conversion from NFA to DFA
E = {1, 2, 4, 5, 6, 7, 10}
Move(E,a) = {3,8}
ε-closure(Move(E,a)) = {3, 6, 7, 1, 2, 4, 8}
= {1,2,3,4,6,7,8} ---- B
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | C
D = {1,2,4,5,6,7,9} | B | E
E = {1,2,4,5,6,7,10} | B | -
Conversion from NFA to DFA
E = {1, 2, 4, 5, 6, 7, 10}
Move(E,b) = {5}
ε-closure(Move(E,b)) = {5, 6, 7, 1, 2, 4}
= {1,2,4,5,6,7} ---- C
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | C
D = {1,2,4,5,6,7,9} | B | E
E = {1,2,4,5,6,7,10} | B | C
Conversion from NFA to DFA
[Figure: DFA with states A, B, C, D, E; its edges on a and b are those of the transition table below.]
Note:
• The accepting state in the NFA is 10
• 10 is an element of E
• So, E is the accepting state in the DFA
Transition Table
States | a | b
A = {0,1,2,4,7} | B | C
B = {1,2,3,4,6,7,8} | B | D
C = {1,2,4,5,6,7} | B | C
D = {1,2,4,5,6,7,9} | B | E
E = {1,2,4,5,6,7,10} | B | C
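Running the resulting DFA is a simple table lookup, as this Python sketch shows. The dictionary layout and the function name accepts are assumptions; the transitions, start state A and accepting state E are taken from the table above.

```python
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
START, ACCEPTING = "A", {"E"}


def accepts(text):
    """Answer yes or no: is text in the language of (a|b)*abb?"""
    state = START
    for ch in text:
        if ch not in DTRAN[state]:
            return False            # symbol outside the alphabet {a, b}
        state = DTRAN[state][ch]
    return state in ACCEPTING


for sample in ["abb", "babb", "aabb", "ab", "abba"]:
    print(sample, accepts(sample))
```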
Exercise
• Convert following regular expression to DFA using subset construction method:
1. (a+b)*a(a+b)
2. (a+b)*ab*a
11. DFA optimization
DFA optimization
1. Construct an initial partition Π of the set of states with two groups: the
accepting states F and the non-accepting states S − F.
2. Apply the repartition procedure given after step 3 to Π to construct a new partition Πnew.
3. If Πnew = Π, let Πfinal = Π and continue with step (4). Otherwise, repeat
step (2) with Π = Πnew.
Repartition procedure:
for each group G of Π do begin
    partition G into subgroups such that two states s and t
    of G are in the same subgroup if and only if for all
    input symbols a, states s and t have transitions on a
    to states in the same group of Π.
    replace G in Πnew by the set of all subgroups formed.
end
DFA optimization
4. Choose one state in each group of the partition Π𝑓𝑖𝑛𝑎𝑙 as the representative for
that group. The representatives will be the states of 𝑀′. Let s be a
representative state, and suppose on input a there is a transition of 𝑀 from 𝑠 to
𝑡. Let 𝑟 be the representative of 𝑡′s group. Then 𝑀′ has a transition from 𝑠 to 𝑟
on 𝑎. Let the start state of 𝑀′ be the representative of the group containing start
state 𝑠0 of 𝑀, and let the accepting states of 𝑀′ be the representatives that are
in 𝐹.
5. If 𝑀′ has a dead state 𝑑, then remove 𝑑 from 𝑀′. Also remove any state not
reachable from the start state.
DFA optimization
Transition table
States | a | b
A | B | C
B | B | D
C | B | C
D | B | E
E | B | C
Partitioning
{A, B, C, D, E}
Nonaccepting states {A, B, C, D}, accepting states {E}
{A, B, C} {D} {E}
{A, C} {B} {D} {E}
• Now no more splitting is possible.
• If we choose A as the representative for
group {A, C}, then we obtain the reduced
transition table.
Optimized Transition Table
States | a | b
A | B | A
B | B | D
D | B | E
E | B | A
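Steps 1 to 4 of the optimization can be sketched in Python on the five-state table. The function names refine and minimize are hypothetical, and step 5 (removing dead and unreachable states) is not implemented because this DFA has none.

```python
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}


def refine(partition, dtran, alphabet):
    """One pass of the repartition procedure over every group."""
    group_of = {s: i for i, g in enumerate(partition) for s in g}
    new_partition = []
    for g in partition:
        buckets = {}
        for s in g:
            # States stay together only if, on every symbol, they move
            # to states in the same group of the current partition.
            key = tuple(group_of[dtran[s][a]] for a in alphabet)
            buckets.setdefault(key, []).append(s)
        new_partition.extend(buckets.values())
    return new_partition


def minimize(dtran, alphabet, accepting):
    states = sorted(dtran)
    partition = [[s for s in states if s in accepting],
                 [s for s in states if s not in accepting]]
    partition = [g for g in partition if g]
    while True:
        new_partition = refine(partition, dtran, alphabet)
        if len(new_partition) == len(partition):   # nothing was split
            break
        partition = new_partition
    rep = {s: g[0] for g in partition for s in g}  # first state represents its group
    reduced = {r: {a: rep[dtran[r][a]] for a in alphabet} for r in set(rep.values())}
    return partition, reduced


partition, reduced = minimize(DTRAN, "ab", {"E"})
print(sorted(partition))
for state in sorted(reduced):
    print(state, reduced[state]["a"], reduced[state]["b"])
```

Because refinement only ever splits groups, comparing the number of groups is enough to detect that Πnew = Π.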
12. Conversion from regular expression to DFA
Rules to compute nullable, firstpos, lastpos
• nullable(n)
• The subtree at node 𝑛 generates languages including the empty string.
• firstpos(n)
• The set of positions that can match the first symbol of a string generated by the subtree at node 𝑛.
• lastpos(n)
• The set of positions that can match the last symbol of a string generated by the subtree at node 𝑛.
• followpos(i)
• The set of positions that can follow position 𝑖 in the tree.
Rules to compute nullable, firstpos, lastpos
Node n | nullable(n) | firstpos(n) | lastpos(n)
A leaf labeled by ε | true | ∅ | ∅
A leaf with position i | false | {i} | {i}
Or node n = c1 | c2 | nullable(c1) or nullable(c2) | firstpos(c1) ∪ firstpos(c2) | lastpos(c1) ∪ lastpos(c2)
Concatenation node n = c1 . c2 | nullable(c1) and nullable(c2) | if (nullable(c1)) then firstpos(c1) ∪ firstpos(c2) else firstpos(c1) | if (nullable(c2)) then lastpos(c1) ∪ lastpos(c2) else lastpos(c2)
Star node n = c1* | true | firstpos(c1) | lastpos(c1)
Rules to compute followpos
1. If n is a concatenation node with left child c1 and right child c2, and i is a position
in lastpos(c1), then all positions in firstpos(c2) are in followpos(i).
2. If n is a * node and i is a position in lastpos(n), then all positions in firstpos(n) are in
followpos(i).
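The rules for nullable, firstpos, lastpos and followpos, and the DFA they lead to, can be sketched in Python. The tuple encoding of the syntax tree, the names analyze and build_dfa, and the choice of (a|b)*abb# as input are assumptions made for this illustration.

```python
from collections import defaultdict


def analyze(node, followpos):
    """Return (nullable, firstpos, lastpos) of node and fill in followpos."""
    kind = node[0]
    if kind == "leaf":                         # ("leaf", symbol, position)
        return False, {node[2]}, {node[2]}
    n1, f1, l1 = analyze(node[1], followpos)
    if kind == "star":
        for i in l1:                           # followpos rule 2
            followpos[i] |= f1
        return True, f1, l1
    n2, f2, l2 = analyze(node[2], followpos)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        for i in l1:                           # followpos rule 1
            followpos[i] |= f2
        return n1 and n2, (f1 | f2) if n1 else f1, (l1 | l2) if n2 else l2
    raise ValueError(f"unknown node kind {kind!r}")


def build_dfa(root, symbol_at, alphabet):
    followpos = defaultdict(set)
    _, first, _ = analyze(root, followpos)
    start = frozenset(first)                   # initial state = firstpos of root
    states, todo, dtran = [start], [start], {}
    while todo:
        S = todo.pop(0)
        for a in alphabet:
            # Union of followpos(i) over the positions i in S that carry symbol a.
            U = frozenset(p for i in S if symbol_at[i] == a for p in followpos[i])
            if not U:
                continue
            if U not in states:
                states.append(U)
                todo.append(U)
            dtran[(S, a)] = U
    return states, dtran, followpos


# Syntax tree for (a|b)*abb# with positions 1 to 6.
tree = ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2)))
for symbol, position in [("a", 3), ("b", 4), ("b", 5), ("#", 6)]:
    tree = ("cat", tree, ("leaf", symbol, position))
symbol_at = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}

states, dtran, followpos = build_dfa(tree, symbol_at, "ab")
names = dict(zip(states, "ABCD"))
for S in states:
    print(names[S], sorted(S), names[dtran[(S, "a")]], names[dtran[(S, "b")]])
```

A state that contains the position of # (position 6 here) is an accepting state.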
Conversion from regular expression to DFA
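Example: (a|b)*abb#
Step 1: Construct Syntax Tree
[Figure: syntax tree for (a|b)*abb#. The leaves, left to right, are a (position 1), b (position 2), a (3), b (4), b (5), # (6). Leaves 1 and 2 are joined by |, whose parent is *; the * node is then concatenated (.) in turn with leaves 3, 4, 5, and 6.]
Step 2: Nullable node
Here, * is the only nullable node.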
Conversion from regular expression to DFA
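Step 3: Calculate firstpos
Firstpos rules
A leaf with position i: {i}
Or node n = c1 | c2: firstpos(c1) ∪ firstpos(c2)
Star node n = c1*: firstpos(c1)
Concatenation node n = c1 . c2: if (nullable(c1)) then firstpos(c1) ∪ firstpos(c2) else firstpos(c1)
Node | firstpos
a (position 1) | {1}
b (position 2) | {2}
a|b | {1,2}
(a|b)* | {1,2}
a (position 3) | {3}
(a|b)*a | {1,2,3}
b (position 4) | {4}
(a|b)*ab | {1,2,3}
b (position 5) | {5}
(a|b)*abb | {1,2,3}
# (position 6) | {6}
(a|b)*abb# (root) | {1,2,3}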
Conversion from regular expression to DFA
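Step 3: Calculate lastpos
Lastpos rules
A leaf with position i: {i}
Or node n = c1 | c2: lastpos(c1) ∪ lastpos(c2)
Star node n = c1*: lastpos(c1)
Concatenation node n = c1 . c2: if (nullable(c2)) then lastpos(c1) ∪ lastpos(c2) else lastpos(c2)
Node | firstpos | lastpos
a (position 1) | {1} | {1}
b (position 2) | {2} | {2}
a|b | {1,2} | {1,2}
(a|b)* | {1,2} | {1,2}
a (position 3) | {3} | {3}
(a|b)*a | {1,2,3} | {3}
b (position 4) | {4} | {4}
(a|b)*ab | {1,2,3} | {4}
b (position 5) | {5} | {5}
(a|b)*abb | {1,2,3} | {5}
# (position 6) | {6} | {6}
(a|b)*abb# (root) | {1,2,3} | {6}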
Conversion from regular expression to DFA
Step 4: Calculate followpos
Concatenation node with c1 = (a|b)*abb (firstpos {1,2,3}, lastpos {5}) and c2 = # (firstpos {6}, lastpos {6}):
i = lastpos(c1) = {5}
firstpos(c2) = {6}
followpos(5) = {6}
Position | followpos
5 | 6
Conversion from regular expression to DFA
Step 4: Calculate followpos
Concatenation node with c1 = (a|b)*ab (firstpos {1,2,3}, lastpos {4}) and c2 = b at position 5 (firstpos {5}, lastpos {5}):
i = lastpos(c1) = {4}
firstpos(c2) = {5}
followpos(4) = {5}
Position | followpos
5 | 6
4 | 5
Conversion from regular expression to DFA
Step 4: Calculate followpos
Concatenation node with c1 = (a|b)*a (firstpos {1,2,3}, lastpos {3}) and c2 = b at position 4 (firstpos {4}, lastpos {4}):
i = lastpos(c1) = {3}
firstpos(c2) = {4}
followpos(3) = {4}
Position | followpos
5 | 6
4 | 5
3 | 4
Conversion from regular expression to DFA
Step 4: Calculate followpos
Concatenation node with c1 = (a|b)* (firstpos {1,2}, lastpos {1,2}) and c2 = a at position 3 (firstpos {3}, lastpos {3}):
i = lastpos(c1) = {1,2}
firstpos(c2) = {3}
followpos(1) = {3}
followpos(2) = {3}
Position | followpos
5 | 6
4 | 5
3 | 4
2 | 3
1 | 3
Conversion from regular expression to DFA
Step 4: Calculate followpos
Star node n = (a|b)* (firstpos {1,2}, lastpos {1,2}):
i = lastpos(n) = {1,2}
firstpos(n) = {1,2}
{1,2} is added to followpos(1) and to followpos(2), which already contain 3:
followpos(1) = {1,2,3}
followpos(2) = {1,2,3}
Position | followpos
5 | 6
4 | 5
3 | 4
2 | 1,2,3
1 | 1,2,3
Conversion from regular expression to DFA
Initial state = firstpos of root = {1,2,3} ----- A
State A
δ({1,2,3}, a) = followpos(1) ∪ followpos(3)
= {1,2,3} ∪ {4} = {1,2,3,4} ----- B
δ({1,2,3}, b) = followpos(2)
= {1,2,3} ----- A
Position | followpos
5 | 6
4 | 5
3 | 4
2 | 1,2,3
1 | 1,2,3
States | a | b
A = {1,2,3} | B | A
B = {1,2,3,4} | - | -
Conversion from regular expression to DFA
State B
δ({1,2,3,4}, a) = followpos(1) ∪ followpos(3)
= {1,2,3} ∪ {4} = {1,2,3,4} ----- B
δ({1,2,3,4}, b) = followpos(2) ∪ followpos(4)
= {1,2,3} ∪ {5} = {1,2,3,5} ----- C
State C
δ({1,2,3,5}, a) = followpos(1) ∪ followpos(3)
= {1,2,3} ∪ {4} = {1,2,3,4} ----- B
δ({1,2,3,5}, b) = followpos(2) ∪ followpos(5)
= {1,2,3} ∪ {6} = {1,2,3,6} ----- D
Position | followpos
5 | 6
4 | 5
3 | 4
2 | 1,2,3
1 | 1,2,3
States | a | b
A = {1,2,3} | B | A
B = {1,2,3,4} | B | C
C = {1,2,3,5} | B | D
D = {1,2,3,6} | - | -
Conversion from regular expression to DFA
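State D
δ({1,2,3,6}, a) = followpos(1) ∪ followpos(3)
= {1,2,3} ∪ {4} = {1,2,3,4} ----- B
δ({1,2,3,6}, b) = followpos(2)
= {1,2,3} ----- A
Position | followpos
5 | 6
4 | 5
3 | 4
2 | 1,2,3
1 | 1,2,3
States | a | b
A = {1,2,3} | B | A
B = {1,2,3,4} | B | C
C = {1,2,3,5} | B | D
D = {1,2,3,6} | B | A
[Figure: DFA with states A, B, C, D and the transitions of the table above. D contains position 6, the position of #, so D is the accepting state.]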
Conversion from regular expression to DFA
Construct DFA for following regular expression:
1. (c | d)*c#
Thank You

 
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Jigani Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bangal...
Jigani Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bangal...Jigani Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bangal...
Jigani Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bangal...
 
Design Inspiration for College by Slidesgo.pptx
Design Inspiration for College by Slidesgo.pptxDesign Inspiration for College by Slidesgo.pptx
Design Inspiration for College by Slidesgo.pptx
 
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
 
Recommendable # 971589162217 # philippine Young Call Girls in Dubai By Marina...
Recommendable # 971589162217 # philippine Young Call Girls in Dubai By Marina...Recommendable # 971589162217 # philippine Young Call Girls in Dubai By Marina...
Recommendable # 971589162217 # philippine Young Call Girls in Dubai By Marina...
 
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
 

WINSEM2022-23_CSI2005_TH_VL2022230504110_Reference_Material_II_22-12-2022_1.2_CSI2005_PCD_Module_1_Part_-2.pptx

  • 2.  Looping Topics to be covered 1. Interaction of scanner & parser 2. Token, Pattern & Lexemes 3. Input buffering 4. Specification of tokens 5. Regular expression & Regular definition 6. Transition diagram 7. Hard coding & automatic generation lexical analyzers 8. Finite automata 9. Regular expression to NFA using Thompson's rule 10. Conversion from NFA to DFA using subset construction method 11. DFA optimization 12. Conversion from regular expression to DFA
  • 3. 1. Interaction with Scanner & Parser
  • 4. Interaction of scanner & parser • Upon receiving a “Get next token” command from the parser, the lexical analyzer reads input characters until it can identify the next token. • The lexical analyzer also strips out comments and white space (blanks, tabs, and newline characters) from the source program. [Figure: source program, lexical analyzer, parser, and symbol table; the parser issues “Get next token” and the lexical analyzer returns a token.]
  • 5. Why separate lexical analysis & parsing? 1. Simplicity of design. 2. Improved compiler efficiency. 3. Enhanced compiler portability.
  • 6. 2. Token, Pattern & Lexemes
  • 7. Token, Pattern & Lexemes Token: a sequence of characters having a collective meaning is known as a token. Categories of tokens: 1. Identifier 2. Keyword 3. Operator 4. Special symbol 5. Constant. Pattern: the set of rules associated with a token is called a pattern. Example: “non-empty sequence of digits”, “letter followed by letters and digits”. Lexeme: the sequence of characters in a source program that matches the pattern for a token is called a lexeme. Example: Rate, DIET, count, Flag.
  • 8. Example: Token, Pattern & Lexemes Example: total = sum + 45 Tokens: Identifier1 (total), Operator1 (=), Identifier2 (sum), Operator2 (+), Constant1 (45). Lexemes of identifier: total, sum. Lexemes of operator: =, +. Lexemes of constant: 45.
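The token/lexeme split in the example above can be sketched with a small scanner. This is only an illustration: the category names follow the slide, but the regular expressions in `TOKEN_SPEC` and the function name `tokenize` are assumptions, not part of the original deck.

```python
import re

# Hypothetical token categories; the patterns are illustrative assumptions.
TOKEN_SPEC = [
    ("Constant", r"\d+"),
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Operator", r"[=+\-*/]"),
    ("Skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (token category, lexeme) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "Skip":      # white space is stripped, not returned
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("total = sum + 45"))
```

Each pair holds the token category first and the matched lexeme second, so `total` and `sum` are two lexemes of the same token category.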
  • 10. Input buffering • There are mainly two techniques for input buffering: 1. Buffer pairs 2. Sentinels • The lexical analyzer scans the input string from left to right, one character at a time. • The buffer is divided into two N-character halves, where N is the number of characters in one disk block. [Figure: buffer pair holding E = M * C ** 2 eof]
  • 11. Buffer pairs • The pointer lexeme_beginning marks the beginning of the current lexeme. • The pointer forward scans ahead until a pattern match is found. • Once the next lexeme is determined, forward is set to the character at its right end. • lexeme_beginning is then set to the character immediately after the lexeme just found. • If the forward pointer is at the end of the first buffer half, the second half is filled with N input characters. • If the forward pointer is at the end of the second buffer half, the first half is filled with N input characters. [Figure: buffer pair holding E = M * C ** 2 eof, with the lexeme_beginning and forward pointers]
  • 12. Buffer pairs Code to advance forward pointer:
      if forward at end of first half then begin
          reload second half;
          forward := forward + 1;
      end
      else if forward at end of second half then begin
          reload first half;
          move forward to beginning of first half;
      end
      else forward := forward + 1;
  • 13. Sentinels • In buffer pairs we must check, each time we move the forward pointer, that we have not moved off one of the buffers. • Thus, for each character read, we make two tests. • We can combine the buffer-end test with the test for the current character. • We can reduce the two tests to one if we extend each buffer half to hold a sentinel character at the end. • The sentinel is a special character that cannot be part of the source program, and a natural choice is the character eof. [Figure: buffer pair with an eof sentinel at the end of each half]
  • 14. Sentinels
      forward := forward + 1;
      if character at forward = eof then begin
          if forward at end of first half then begin
              reload second half;
              forward := forward + 1;
          end
          else if forward at end of second half then begin
              reload first half;
              move forward to beginning of first half;
          end
          else terminate lexical analysis;
      end
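The sentinel scheme can be tried out with a minimal sketch. Several things here are assumptions made for illustration: the sentinel is the NUL character, the half size `N` is tiny so that reloads actually happen, and the class name `SentinelBuffer` and helper `read_all` are hypothetical. The lexeme_beginning pointer is left out, so this shows only how forward advances.

```python
import io

EOF = "\0"   # assumed sentinel: a character that cannot occur in the source
N = 4        # tiny half size so that reloads are exercised

class SentinelBuffer:
    def __init__(self, stream):
        self.stream = stream
        # Layout: [0..N-1] first half, [N] sentinel,
        #         [N+1..2N] second half, [2N+1] sentinel.
        self.buf = [EOF] * (2 * N + 2)
        self._load(0)
        self.forward = -1

    def _load(self, start):
        chunk = self.stream.read(N)
        for i in range(N):
            # A short read leaves EOF inside the half: the real end of input.
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF

    def next_char(self):
        """Advance forward; return the next character, or None at end of input."""
        self.forward += 1
        if self.buf[self.forward] == EOF:        # the single test per character
            if self.forward == N:                # sentinel ending the first half
                self._load(N + 1)
                self.forward += 1
            elif self.forward == 2 * N + 1:      # sentinel ending the second half
                self._load(0)
                self.forward = 0
            else:
                return None                      # eof inside a half: terminate
            if self.buf[self.forward] == EOF:
                return None                      # the reload found no more input
        return self.buf[self.forward]

def read_all(text):
    buf = SentinelBuffer(io.StringIO(text))
    out = []
    while (ch := buf.next_char()) is not None:
        out.append(ch)
    return "".join(out)

print(read_all("E = M * C ** 2"))
```

In the common case only the comparison against the sentinel runs; the position checks are reached only when that comparison succeeds.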
  • 16. Strings and languages
      Prefix of S: a string obtained by removing zero or more trailing symbols of string S; e.g., ban is a prefix of banana.
      Suffix of S: a string obtained by removing zero or more leading symbols of string S; e.g., nana is a suffix of banana.
      Substring of S: a string obtained by removing a prefix and a suffix from S; e.g., nan is a substring of banana.
      Proper prefix, suffix and substring of S: any nonempty string x that is, respectively, a prefix, suffix or substring of S such that S ≠ x.
      Subsequence of S: a string obtained by removing zero or more not necessarily contiguous symbols from S; e.g., baaa is a subsequence of banana.
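The definitions above translate directly into a few helper functions. This is a small sketch; the function names are made up for illustration and the string banana is the one used in the slide.

```python
def prefixes(s):
    """All prefixes: remove zero or more trailing symbols."""
    return [s[:i] for i in range(len(s) + 1)]

def suffixes(s):
    """All suffixes: remove zero or more leading symbols."""
    return [s[i:] for i in range(len(s) + 1)]

def substrings(s):
    """All substrings: remove a prefix and a suffix."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}

def proper(items, s):
    """Keep only the nonempty strings different from s."""
    return [x for x in items if x and x != s]

def is_subsequence(x, s):
    """True if x can be obtained from s by deleting symbols anywhere."""
    remaining = iter(s)
    return all(ch in remaining for ch in x)

s = "banana"
print("ban" in prefixes(s), "nana" in suffixes(s), "nan" in substrings(s))
print(proper(prefixes(s), s))
print(is_subsequence("baaa", s))
```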
  • 17. Exercise • Write prefix, suffix, substring, proper prefix, proper suffix and subsequence of following string: String: Compiler
  • 18. Operations on languages
      Union of L and M, written L ∪ M: L ∪ M = { s | s is in L or s is in M }
      Concatenation of L and M, written LM: LM = { st | s is in L and t is in M }
      Kleene closure of L, written L*: L* denotes “zero or more concatenations of” L.
      Positive closure of L, written L+: L+ denotes “one or more concatenations of” L.
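For finite languages these operations can be tried with Python sets. The sketch below is illustrative only: because L* and L+ are infinite in general, the closures are cut off at an assumed bound `max_n` on the number of concatenations, and all function names are hypothetical.

```python
from itertools import product

def union(L, M):
    return L | M

def concat(L, M):
    return {s + t for s, t in product(L, M)}

def power(L, n):
    """L concatenated with itself n times; power(L, 0) is {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def kleene(L, max_n):
    """Zero or more concatenations of L, truncated at max_n."""
    out = set()
    for n in range(max_n + 1):
        out |= power(L, n)
    return out

def positive(L, max_n):
    """One or more concatenations of L, truncated at max_n."""
    out = set()
    for n in range(1, max_n + 1):
        out |= power(L, n)
    return out

L, M = {"a", "b"}, {"0", "1"}
print(sorted(union(L, M)))
print(sorted(concat(L, M)))
print(sorted(kleene({"a"}, 3)))
print(sorted(positive({"a"}, 3)))
```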
  • 19. 5. Regular Expression & Regular Definition
  • 20. Regular expression • A regular expression is a sequence of characters that defines a pattern. Notational shorthands: 1. One or more instances: + 2. Zero or more instances: * 3. Zero or one instance: ? 4. Alphabet: Σ
  • 21. Rules to define regular expression 1. ε is a regular expression that denotes {ε}, the set containing the empty string. 2. If a is a symbol in Σ, then a is a regular expression and L(a) = {a}. 3. Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then, a. (r) | (s) is a regular expression denoting L(r) ∪ L(s) b. (r)(s) is a regular expression denoting L(r)L(s) c. (r)* is a regular expression denoting (L(r))* d. (r) is a regular expression denoting L(r). The language denoted by a regular expression is said to be a regular set.
  • 22. Regular expression • L = zero or more occurrences of a = a* = {ε, a, aa, aaa, aaaa, aaaaa, ...} (infinite)
  • 23. Regular expression • L = one or more occurrences of a = a+ = {a, aa, aaa, aaaa, aaaaa, ...} (infinite)
  • 24. Precedence and associativity of operators
      Operator         Precedence   Associativity
      Kleene *         1            left
      Concatenation    2            left
      Union |          3            left
  • 25. Regular expression examples
      1. 0 or 1. Strings: 0, 1. R.E. = 0 | 1
      2. 0 or 11 or 111. Strings: 0, 11, 111. R.E. = 0 | 11 | 111
      3. String having zero or more a. Strings: ε, a, aa, aaa, aaaa, ... R.E. = a*
      4. String having one or more a. Strings: a, aa, aaa, aaaa, ... R.E. = a+
      5. Regular expression over Σ = {a, b, c} that represents all strings of length 3. Strings: abc, bca, bbb, cab, aba, ... R.E. = (a|b|c)(a|b|c)(a|b|c)
      6. All binary strings. Strings: 0, 11, 101, 10101, 1111, ... R.E. = (0|1)+
  • 26. Regular expression examples
      7. 0 or more occurrences of either a or b or both. Strings: ε, a, aa, abab, bab, ... R.E. = (a|b)*
      8. 1 or more occurrences of either a or b or both. Strings: a, aa, abab, bab, bbbaaa, ... R.E. = (a|b)+
      9. Binary no. ends with 0. Strings: 0, 10, 100, 1010, 11110, ... R.E. = (0|1)*0
      10. Binary no. ends with 1. Strings: 1, 101, 1001, 10101, ... R.E. = (0|1)*1
      11. Binary no. starts and ends with 1. Strings: 11, 101, 1001, 10101, ... R.E. = 1(0|1)*1
      12. String starts and ends with same character. Strings: 00, 101, aba, baab, ... R.E. = 1(0|1)*1 | 0(0|1)*0 for binary strings, or a(a|b)*a | b(a|b)*b over {a, b}
  • 27. Regular expression examples
      13. All strings of a and b starting with a. Strings: a, ab, aab, abb, ... R.E. = a(a|b)*
      14. String of 0 and 1 ends with 00. Strings: 00, 100, 000, 1000, 1100, ... R.E. = (0|1)*00
      15. String ends with abb. Strings: abb, babb, ababb, ... R.E. = (a|b)*abb
      16. String starts with 1 and ends with 0. Strings: 10, 100, 110, 1000, 1100, ... R.E. = 1(0|1)*0
      17. All binary strings with at least 3 characters whose 3rd character is zero. Strings: 000, 100, 1100, 1001, ... R.E. = (0|1)(0|1)0(0|1)*
      18. Language which consists of exactly two b's over the set Σ = {a, b}. Strings: bb, bab, aabb, abba, ... R.E. = a*ba*ba*
  • 28. Regular expression examples
      19. The language with Σ = {a, b} such that the 3rd character from the right end of the string is always a. Strings: aaa, aba, aaba, abb, ... R.E. = (a|b)*a(a|b)(a|b)
      20. Any no. of a followed by any no. of b followed by any no. of c. Strings: ε, abc, aabbcc, aabc, abb, ... R.E. = a*b*c*
      21. String should contain at least three 1. Strings: 111, 01101, 0101110, ... R.E. = (0|1)*1(0|1)*1(0|1)*1(0|1)*
      22. String should contain exactly two 1. Strings: 11, 0101, 1100, 010010, 100100, ... R.E. = 0*10*10*
      23. Length of string should be at least 1 and at most 3. Strings: 0, 1, 11, 01, 111, 010, 100, ... R.E. = (0|1) | (0|1)(0|1) | (0|1)(0|1)(0|1)
      24. No. of zeros should be a multiple of 3. Strings: 000, 010101, 110100, 000000, 100010010, ... R.E. = (1*01*01*01*)* (as written, this misses nonempty strings with no 0 at all, such as 11; 1*(01*01*01*)* also accepts those)
  • 29. Regular expression examples
      25. The language with Σ = {a, b, c} where the no. of a should be a multiple of 3. Strings: aaa, baaa, bacaba, aaaaaa, ... R.E. = ((b|c)*a(b|c)*a(b|c)*a(b|c)*)*
      26. Even no. of 0. Strings: 00, 0101, 0000, 100100, ... R.E. = (1*01*01*)*
      27. String should have odd length. Strings: 0, 010, 110, 000, 10010, ... R.E. = (0|1)((0|1)(0|1))*
      28. String should have even length. Strings: 00, 0101, 0000, 100100, ... R.E. = ((0|1)(0|1))*
      29. String starts with 0 and has odd length. Strings: 0, 010, 000, 00010, ... R.E. = 0((0|1)(0|1))*
      30. String starts with 1 and has even length. Strings: 10, 1100, 1000, 100100, ... R.E. = 1(0|1)((0|1)(0|1))*
      31. All strings that begin or end with 00 or 11. Strings: 00101, 10100, 110, 01011, ... R.E. = (00|11)(0|1)* | (0|1)*(00|11)
  • 30. Regular expression examples
      32. Language of all strings containing both 11 and 00 as substrings. Strings: 0011, 1100, 100110, 010011, ... R.E. = ((0|1)*00(0|1)*11(0|1)*) | ((0|1)*11(0|1)*00(0|1)*)
      33. String ending with 1 and not containing 00. Strings: 011, 1101, 1011, ... R.E. = (1|01)+
      34. Language of C identifiers. Strings: area, i, radius, grade1, ... R.E. = (_ | L)(_ | L | D)*, where L is a letter and D is a digit
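A few of the expressions above can be checked mechanically. The sketch below uses Python's `re` module, whose `|`, `*` and `+` have the same meaning as in the slides; the rejected strings and the name `check_all` are additions made for illustration.

```python
import re

# pattern, strings that should match, strings that should not match
EXAMPLES = {
    "ends with 00": (r"(0|1)*00", ["00", "100", "1100"], ["0", "001"]),
    "ends with abb": (r"(a|b)*abb", ["abb", "babb", "ababb"], ["ab", "abba"]),
    "exactly two 1": (r"0*10*10*", ["11", "0101", "100100"], ["1", "111"]),
    "3rd from right is a": (r"(a|b)*a(a|b)(a|b)",
                            ["aaa", "aba", "aaba", "abb"], ["bbb", "ab"]),
    "ends with 1, no 00": (r"(1|01)+", ["011", "1101", "1011"], ["100", "10"]),
}

def check_all():
    """Return True if every example accepts and rejects as listed."""
    for pattern, good, bad in EXAMPLES.values():
        if not all(re.fullmatch(pattern, s) for s in good):
            return False
        if any(re.fullmatch(pattern, s) for s in bad):
            return False
    return True

print(check_all())
```

`re.fullmatch` is used rather than `re.search` because a regular expression in the slides describes the whole string, not a fragment of it.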
  • 31. Regular definition • A regular definition gives names to certain regular expressions and uses those names in other regular expressions. • A regular definition is a sequence of definitions of the form d1 → r1, d2 → r2, ..., dn → rn, where each di is a distinct name and each ri is a regular expression. • Example: regular definition for identifier
      letter → A | B | C | ... | Z | a | b | ... | z
      digit → 0 | 1 | ... | 9
      id → letter (letter | digit)*
  • 32. Regular definition example • Example: unsigned Pascal numbers: 3, 5280, 39.37, 6.336E4, 1.894E-4, 2.56E+7. Regular definition:
      digit → 0 | 1 | ... | 9
      digits → digit digit*
      optional_fraction → . digits | ε
      optional_exponent → (E (+ | - | ε) digits) | ε
      num → digits optional_fraction optional_exponent
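The regular definition for unsigned numbers can be assembled piece by piece in the same way it is written. In this sketch each name is a Python string holding the corresponding `re` pattern; the `?` operator stands in for the "| ε" alternatives, and the helper `is_num` is a hypothetical name.

```python
import re

digit = r"[0-9]"
digits = rf"{digit}{digit}*"
optional_fraction = rf"(\.{digits})?"          # . digits | ε
optional_exponent = rf"(E[+-]?{digits})?"      # (E (+ | - | ε) digits) | ε
num = re.compile(rf"{digits}{optional_fraction}{optional_exponent}")

def is_num(text):
    """True if the whole text is an unsigned number under the definition."""
    return num.fullmatch(text) is not None

for sample in ["3", "5280", "39.37", "6.336E4", "1.894E-4", "2.56E+7", "1.", ".5"]:
    print(sample, is_num(sample))
```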
  • 34. Transition Diagram • A stylized flowchart is called a transition diagram. [Legend: symbols for a state, a transition, a start state, and a final state]
  • 35. Transition Diagram : Relational operator [Figure: transition diagram with states 0 to 8. From the start state: < followed by = returns (relop, LE); < followed by > returns (relop, NE); < followed by any other character returns (relop, LT); = returns (relop, EQ); > followed by = returns (relop, GE); > followed by any other character returns (relop, GT)]
  • 36. Transition diagram : Unsigned number Examples: 3, 5280, 39.37, 1.894E-4, 2.56E+7, 45E+6, 96E2 [Figure: transition diagram with states 1 to 8, entered at start, with edges labeled digit, ., E, + or -, and other]
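A transition diagram like the one for relational operators is commonly coded by hand as nested tests on the current character. The sketch below assumes the usual convention that an "other" edge consumes nothing, so the scanner position only moves past the operator itself; the function name and return shape are hypothetical.

```python
def relop(text, pos=0):
    """Recognize a relational operator starting at text[pos].

    Returns ((token, attribute), next_position), or None if no
    relational operator starts at pos.
    """
    ch = text[pos] if pos < len(text) else ""
    nxt = text[pos + 1] if pos + 1 < len(text) else ""
    if ch == "<":
        if nxt == "=":
            return ("relop", "LE"), pos + 2
        if nxt == ">":
            return ("relop", "NE"), pos + 2
        return ("relop", "LT"), pos + 1     # "other": do not consume it
    if ch == "=":
        return ("relop", "EQ"), pos + 1
    if ch == ">":
        if nxt == "=":
            return ("relop", "GE"), pos + 2
        return ("relop", "GT"), pos + 1     # "other": do not consume it
    return None

for sample in ["<=", "<>", "<a", "=", ">=", ">1"]:
    print(sample, relop(sample))
```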
  • 37. 7. Hard coding & automatic generation Lexical analyzers
  • 38. Hard coding and automatic generation lexical analyzers • Lexical analysis is about identifying patterns in the input. • To recognize a pattern, a transition diagram is constructed and coded by hand; this is known as a hard-coded lexical analyzer. • Example: to represent an identifier in ‘C’, the first character must be a letter and the other characters are either letters or digits. • To recognize this pattern, a hard-coded lexical analyzer works from a transition diagram. • An automatically generated lexical analyzer takes a special notation as input. • For example, the lex compiler tool takes regular expressions as input and finds the input that matches each regular expression. [Figure: transition diagram with states 1, 2, 3 and edges labeled Start, Letter, and Letter or digit]
  • 40. Finite Automata • Finite automata are recognizers. • An FA simply says “Yes” or “No” about each possible input string. • A finite automaton is a mathematical model consisting of: 1. Set of states S 2. Set of input symbols Σ 3. A transition function move 4. Initial state s0 5. Final states or accepting states F
  • 41. Types of finite automata • Nondeterministic finite automata (NFA): there are no restrictions on the edges leaving a state. Several edges can carry the same symbol as label, and some edges can be labeled with ε. • Deterministic finite automata (DFA): each state has exactly one edge leaving it for each symbol. [Figure: an example DFA and an example NFA, each with states 1 to 4 and edges labeled a and b]
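The five components of a finite automaton map naturally onto plain data. The automaton below is a hypothetical DFA, chosen for illustration, that accepts binary strings ending with 0; the state names `q0` and `q1` are assumptions. Because it is deterministic, `MOVE` has exactly one entry per (state, symbol) pair.

```python
# Hypothetical DFA for "binary strings ending with 0".
STATES = {"q0", "q1"}                     # set of states S
SIGMA = {"0", "1"}                        # set of input symbols
MOVE = {                                  # transition function move
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
START = "q0"                              # initial state
FINAL = {"q1"}                            # accepting states F

def accepts(text):
    """Answer "yes" (True) or "no" (False) for the input string."""
    state = START
    for ch in text:
        if ch not in SIGMA:
            return False
        state = MOVE[(state, ch)]
    return state in FINAL

for sample in ["0", "10", "1010", "1", "101", ""]:
    print(repr(sample), accepts(sample))
```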
  • 42. 9. Regular expression to NFA using Thompson's rule
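  • 43. Regular expression to NFA using Thompson's rule 1. For ε, construct the NFA: start → i -ε-> f. 2. For a in Σ, construct the NFA: start → i -a-> f. 3. For regular expression st, construct N(s) followed by N(t): the accepting state of N(s) becomes the start state of N(t). Ex: ab: 1 -a-> 2 -b-> 3.
  • 44. Regular expression to NFA using Thompson's rule 4. For regular expression s|t: a new start state i has ε-edges to the start states of N(s) and N(t), and the accepting states of N(s) and N(t) have ε-edges to a new accepting state f. Ex: (a|b) [Figure: NFA with states 1 to 6, one edge labeled a, one edge labeled b, and four ε-edges] 5. For regular expression s*: a new start state i has ε-edges to the start state of N(s) and to a new accepting state f; the accepting state of N(s) has ε-edges back to the start state of N(s) and on to f. Ex: a*: 1 -ε-> 2 -a-> 3 -ε-> 4, plus ε-edges 3 → 2 and 1 → 4.
  • 45. Regular expression to NFA using Thompson's rule • a*b [Figure: NFA with states 1 to 5: the a* fragment on states 1 to 4 followed by an edge 4 -b-> 5] • b*ab [Figure: NFA with states 1 to 6: the b* fragment on states 1 to 4 followed by 4 -a-> 5 -b-> 6]
  • 46. Exercise Convert the following regular expressions to NFA (here + between expressions denotes union, like |): 1. abba 2. bb(a)* 3. (a|b)* 4. a* | b* 5. a(a)*ab 6. aa*+ bb* 7. (a+b)*abb 8. 10(0+1)*1 9. (a+b)*a(a+b) 10. (0+1)*010(0+1)* 11. (010+00)*(10)* 12. 100(1)*00(0+1)*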
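Thompson's rules can be sketched as a recursive function over a small syntax tree. Everything about the representation is an assumption made for illustration: regular expressions are nested tuples such as `("cat", s, t)`, states are integers, and an edge label of `None` stands for ε. State numbers therefore differ from those drawn in the slides, although the number of states comes out the same.

```python
from itertools import count

class Thompson:
    def __init__(self):
        self.ids = count()
        self.edges = {}        # state -> list of (label, target); None is ε

    def new_state(self):
        s = next(self.ids)
        self.edges[s] = []
        return s

    def build(self, node):
        """Return (start, accept) of the NFA fragment for an AST node."""
        kind = node[0]
        if kind in ("eps", "sym"):
            i, f = self.new_state(), self.new_state()
            self.edges[i].append((node[1] if kind == "sym" else None, f))
            return i, f
        if kind == "cat":
            s_i, s_f = self.build(node[1])
            t_i, t_f = self.build(node[2])
            # Merge: the accept state of N(s) takes over the start of N(t).
            self.edges[s_f] = self.edges.pop(t_i)
            return s_i, t_f
        if kind == "alt":
            i = self.new_state()
            s_i, s_f = self.build(node[1])
            t_i, t_f = self.build(node[2])
            f = self.new_state()
            self.edges[i] += [(None, s_i), (None, t_i)]
            self.edges[s_f].append((None, f))
            self.edges[t_f].append((None, f))
            return i, f
        if kind == "star":
            i = self.new_state()
            s_i, s_f = self.build(node[1])
            f = self.new_state()
            self.edges[i] += [(None, s_i), (None, f)]
            self.edges[s_f] += [(None, s_i), (None, f)]
            return i, f
        raise ValueError(f"unknown node kind {kind!r}")

def eps_closure(edges, states):
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for label, t in edges[s]:
            if label is None and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def accepts(edges, start, accept, text):
    """Simulate the NFA on text by tracking the set of reachable states."""
    current = eps_closure(edges, {start})
    for ch in text:
        moved = {t for s in current for label, t in edges[s] if label == ch}
        current = eps_closure(edges, moved)
    return accept in current

A_STAR_B = ("cat", ("star", ("sym", "a")), ("sym", "b"))     # a*b
nfa = Thompson()
start, accept = nfa.build(A_STAR_B)
print(len(nfa.edges), accepts(nfa.edges, start, accept, "aab"))
```

Merging states for concatenation, rather than linking them with an extra ε-edge, keeps the state count in line with the hand-drawn examples.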
  • 47. 10. Conversion from NFA to DFA using subset construction method
  • 48. Subset construction algorithm Input: An NFA N. Output: A DFA D accepting the same language. Method: The algorithm constructs a transition table Dtran for D. We use the following operations:
      OPERATION       DESCRIPTION
      ε-closure(s)    Set of NFA states reachable from NFA state s on ε-transitions alone.
      ε-closure(T)    Set of NFA states reachable from some NFA state s in T on ε-transitions alone.
      move(T, a)      Set of NFA states to which there is a transition on input symbol a from some NFA state s in T.
  • 49. Subset construction algorithm
      initially ε-closure(s0) is the only state in Dstates and it is unmarked;
      while there is an unmarked state T in Dstates do begin
          mark T;
          for each input symbol a do begin
              U := ε-closure(move(T, a));
              if U is not in Dstates then
                  add U as an unmarked state to Dstates;
              Dtran[T, a] := U
          end
      end
  • 50. Conversion from NFA to DFA NFA for (a|b)*abb, with states 0 to 10. ε-edges: 0 → 1, 0 → 7, 1 → 2, 1 → 4, 3 → 6, 5 → 6, 6 → 1, 6 → 7. Labeled edges: 2 -a-> 3, 4 -b-> 5, 7 -a-> 8, 8 -b-> 9, 9 -b-> 10.
  • 51. Conversion from NFA to DFA ε-closure(0) = {0, 1, 7, 2, 4} = {0,1,2,4,7} ---- A
  • 52. Conversion from NFA to DFA A = {0,1,2,4,7}. Move(A,a) = {3,8}. ε-closure(Move(A,a)) = {3, 6, 7, 1, 2, 4, 8} = {1,2,3,4,6,7,8} ---- B. So Dtran[A, a] = B.
  • 53. Conversion from NFA to DFA A = {0,1,2,4,7}. Move(A,b) = {5}. ε-closure(Move(A,b)) = {5, 6, 7, 1, 2, 4} = {1,2,4,5,6,7} ---- C. So Dtran[A, b] = C.
  • 54. Conversion from NFA to DFA B = {1,2,3,4,6,7,8}. Move(B,a) = {3,8}. ε-closure(Move(B,a)) = {1,2,3,4,6,7,8} ---- B. So Dtran[B, a] = B.
  • 55. Conversion from NFA to DFA B = {1,2,3,4,6,7,8}. Move(B,b) = {5,9}. ε-closure(Move(B,b)) = {5, 6, 7, 1, 2, 4, 9} = {1,2,4,5,6,7,9} ---- D. So Dtran[B, b] = D.
  • 56. Conversion from NFA to DFA C = {1,2,4,5,6,7}. Move(C,a) = {3,8}. ε-closure(Move(C,a)) = {1,2,3,4,6,7,8} ---- B. So Dtran[C, a] = B.
  • 57. Conversion from NFA to DFA C = {1,2,4,5,6,7}. Move(C,b) = {5}. ε-closure(Move(C,b)) = {1,2,4,5,6,7} ---- C. So Dtran[C, b] = C.
  • 58. Conversion from NFA to DFA D = {1,2,4,5,6,7,9}. Move(D,a) = {3,8}. ε-closure(Move(D,a)) = {1,2,3,4,6,7,8} ---- B. So Dtran[D, a] = B.
  • 59. Conversion from NFA to DFA D = {1,2,4,5,6,7,9}. Move(D,b) = {5,10}. ε-closure(Move(D,b)) = {5, 6, 7, 1, 2, 4, 10} = {1,2,4,5,6,7,10} ---- E. So Dtran[D, b] = E.
  • 60. Conversion from NFA to DFA E = {1,2,4,5,6,7,10}. Move(E,a) = {3,8}. ε-closure(Move(E,a)) = {1,2,3,4,6,7,8} ---- B. So Dtran[E, a] = B.
  • 61. Conversion from NFA to DFA E = {1,2,4,5,6,7,10}. Move(E,b) = {5}. ε-closure(Move(E,b)) = {1,2,4,5,6,7} ---- C. So Dtran[E, b] = C.
  • 62. Conversion from NFA to DFA Transition table:
      States                   a   b
      A = {0,1,2,4,7}          B   C
      B = {1,2,3,4,6,7,8}      B   D
      C = {1,2,4,5,6,7}        B   C
      D = {1,2,4,5,6,7,9}      B   E
      E = {1,2,4,5,6,7,10}     B   C
      [Figure: DFA with states A to E drawn from the transition table; A is the start state]
      Note: • The accepting state in the NFA is 10. • 10 is an element of E. • So, E is the accepting state in the DFA.
  • 63. Exercise • Convert the following regular expressions to DFA using the subset construction method: 1. (a+b)*a(a+b) 2. (a+b)*ab*a
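The walk-through above can be reproduced with a short sketch of the subset construction. The NFA edges are taken to be the ones implied by the Move and ε-closure computations in the slides; the dictionary layout, the use of `None` for ε, and the function names are assumptions made for illustration. An empty `U` would become a dead state, which does not arise in this example.

```python
EPS = None   # label used here for ε-edges

# NFA for (a|b)*abb: (state, label) -> set of target states
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4}, (2, "a"): {3}, (4, "b"): {5},
    (3, EPS): {6}, (5, EPS): {6}, (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}

def eps_closure(T):
    stack, seen = list(T), set(T)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def move(T, a):
    return {t for s in T for t in NFA.get((s, a), ())}

def subset_construction(start, alphabet):
    first = frozenset(eps_closure({start}))
    dstates, unmarked, dtran = [first], [first], {}
    while unmarked:
        T = unmarked.pop(0)                      # mark T
        for a in alphabet:
            U = frozenset(eps_closure(move(T, a)))
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran

dstates, dtran = subset_construction(0, "ab")
name = {S: "ABCDE"[i] for i, S in enumerate(dstates)}
for S in dstates:
    print(name[S], sorted(S), name[dtran[(S, "a")]], name[dtran[(S, "b")]])
print("accepting:", [name[S] for S in dstates if 10 in S])
```

Processing the unmarked states in first-in, first-out order is what makes the discovered states line up with the names A to E used in the slides.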
  • 65. DFA optimization 1. Construct an initial partition Π of the set of states with two groups: the accepting states F and the non-accepting states S − F. 2. Apply the repartition procedure below to Π to construct a new partition Πnew. 3. If Πnew = Π, let Πfinal = Π and continue with step (4). Otherwise, repeat step (2) with Π = Πnew.
      Repartition procedure:
      for each group G of Π do begin
          partition G into subgroups such that two states s and t of G are in the same subgroup if and only if, for all input symbols a, states s and t have transitions on a to states in the same group of Π;
          replace G in Πnew by the set of all subgroups formed.
      end
  • 66. DFA optimization 4. Choose one state in each group of the partition Π𝑓𝑖𝑛𝑎𝑙 as the representative for that group. The representatives will be the states of 𝑀′. Let s be a representative state, and suppose on input a there is a transition of 𝑀 from 𝑠 to 𝑡. Let 𝑟 be the representative of 𝑡′s group. Then 𝑀′ has a transition from 𝑠 to 𝑟 on 𝑎. Let the start state of 𝑀′ be the representative of the group containing start state 𝑠0 of 𝑀, and let the accepting states of 𝑀′ be the representatives that are in 𝐹. 5. If 𝑀′ has a dead state 𝑑, then remove 𝑑 from 𝑀′. Also remove any state not reachable from the start state.
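  • 67. DFA optimization Transition table to optimize:
      States   a   b
      A        B   C
      B        B   D
      C        B   C
      D        B   E
      E        B   C
      Partitioning: {A, B, C, D, E} splits into nonaccepting states {A, B, C, D} and accepting states {E}. On b, D goes to E while A, B, C stay in the nonaccepting group, so {A, B, C, D} splits into {A, B, C} and {D}. On b, B goes to D while A and C go to C, so {A, B, C} splits into {A, C} and {B}. • Now no more splitting is possible. • If we choose A as the representative for group (AC), then we obtain the reduced transition table.
      Optimized transition table:
      States   a   b
      A        B   A
      B        B   D
      D        B   E
      E        B   A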
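The partition refinement can be sketched in a few lines using the transition table of this example. The function name `minimize` and the dictionary layout are assumptions; the sketch covers only the repartition loop, not the choice of representatives or the removal of dead and unreachable states.

```python
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
ACCEPTING = {"E"}

def minimize(dtran, accepting, alphabet):
    """Return the final partition of the states as a list of sets."""
    states = set(dtran)
    partition = [g for g in (states - accepting, set(accepting)) if g]
    while True:
        def group_of(s):
            return next(i for i, g in enumerate(partition) if s in g)
        new_partition = []
        for group in partition:
            # States land in the same bucket only if, for every symbol,
            # they move into the same group of the current partition.
            buckets = {}
            for s in sorted(group):
                key = tuple(group_of(dtran[s][a]) for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(buckets.values())
        # Refinement only ever splits groups, so equal size means no change.
        if len(new_partition) == len(partition):
            return new_partition
        partition = new_partition

groups = minimize(DTRAN, ACCEPTING, "ab")
print(sorted(sorted(g) for g in groups))
```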
  • 68. 12. Conversion from regular expression to DFA
  • 69. Rules to compute nullable, firstpos, lastpos • nullable(n): true if the subtree at node n generates a language that includes the empty string. • firstpos(n): the set of positions that can match the first symbol of a string generated by the subtree at node n. • lastpos(n): the set of positions that can match the last symbol of a string generated by the subtree at node n. • followpos(i): the set of positions that can follow position i in the tree.
  • 70. Rules to compute nullable, firstpos, lastpos
      A leaf labeled by ε: nullable(n) = true; firstpos(n) = ∅; lastpos(n) = ∅
      A leaf with position i: nullable(n) = false; firstpos(n) = {i}; lastpos(n) = {i}
      n = c1 | c2: nullable(n) = nullable(c1) or nullable(c2); firstpos(n) = firstpos(c1) ∪ firstpos(c2); lastpos(n) = lastpos(c1) ∪ lastpos(c2)
      n = c1 . c2: nullable(n) = nullable(c1) and nullable(c2); firstpos(n) = if (nullable(c1)) then firstpos(c1) ∪ firstpos(c2) else firstpos(c1); lastpos(n) = if (nullable(c2)) then lastpos(c1) ∪ lastpos(c2) else lastpos(c2)
      n = c1*: nullable(n) = true; firstpos(n) = firstpos(c1); lastpos(n) = lastpos(c1)
  • 71. Rules to compute followpos 1. If n is a concatenation node with left child c1 and right child c2, and i is a position in lastpos(c1), then all positions in firstpos(c2) are in followpos(i). 2. If n is a * node and i is a position in lastpos(n), then all positions in firstpos(n) are in followpos(i).
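The rules for nullable, firstpos, lastpos and followpos can be applied in one bottom-up pass over the syntax tree. In this sketch the tree is a nested tuple, a representation assumed for illustration: `("leaf", symbol, position)`, `("eps",)`, `("alt", c1, c2)`, `("cat", c1, c2)` and `("star", c1)`. The sample tree is the augmented expression (a|b)*abb#, with positions 1 to 6.

```python
from collections import defaultdict

def analyze(node, followpos):
    """Return (nullable, firstpos, lastpos); fill followpos in place."""
    kind = node[0]
    if kind == "eps":
        return True, set(), set()
    if kind == "leaf":
        return False, {node[2]}, {node[2]}
    if kind == "alt":
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        for i in l1:                       # followpos rule 1
            followpos[i] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f1, l1 = analyze(node[1], followpos)
        for i in l1:                       # followpos rule 2
            followpos[i] |= f1
        return True, f1, l1
    raise ValueError(f"unknown node kind {kind!r}")

def cat(*parts):
    """Left-associative chain of concatenation nodes."""
    tree = parts[0]
    for part in parts[1:]:
        tree = ("cat", tree, part)
    return tree

TREE = cat(
    ("star", ("alt", ("leaf", "a", 1), ("leaf", "b", 2))),
    ("leaf", "a", 3), ("leaf", "b", 4), ("leaf", "b", 5), ("leaf", "#", 6),
)
followpos = defaultdict(set)
nullable, firstpos, lastpos = analyze(TREE, followpos)
print(nullable, sorted(firstpos), sorted(lastpos))
print({i: sorted(p) for i, p in sorted(followpos.items())})
```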
  • 72. Conversion from regular expression to DFA Example: (a|b)*abb# Step 1: Construct the syntax tree. The leaves are numbered left to right: a = 1, b = 2, a = 3, b = 4, b = 5, # = 6. [Figure: syntax tree with | over positions 1 and 2, * above it, and a chain of concatenation (.) nodes adding positions 3 to 6] Step 2: Find the nullable nodes. Here, * is the only nullable node.
  • 73. Conversion from regular expression to DFA Step 3: Calculate firstpos
      Rules: a leaf with position i: {i}; n = c1 | c2: firstpos(c1) ∪ firstpos(c2); n = c1*: firstpos(c1); n = c1 . c2: if (nullable(c1)) then firstpos(c1) ∪ firstpos(c2) else firstpos(c1)
      Leaves: {1}, {2}, {3}, {4}, {5}, {6}
      a | b: {1,2}; (a|b)*: {1,2}; every concatenation node up to the root: {1,2,3}
  • 74. Conversion from regular expression to DFA Step 3: Calculate lastpos
      Rules: a leaf with position i: {i}; n = c1 | c2: lastpos(c1) ∪ lastpos(c2); n = c1*: lastpos(c1); n = c1 . c2: if (nullable(c2)) then lastpos(c1) ∪ lastpos(c2) else lastpos(c2)
      Leaves: {1}, {2}, {3}, {4}, {5}, {6}
      a | b: {1,2}; (a|b)*: {1,2}; concatenation nodes from the bottom up: {3}, {4}, {5}, {6} (root)
  • 75. Conversion from regular expression to DFA Step 4: Calculate followpos. Root concatenation node (c2 is #): i = lastpos(c1) = {5}, firstpos(c2) = {6}, so followpos(5) = {6}.
  • 76. Conversion from regular expression to DFA Step 4: Calculate followpos. Concatenation node whose c2 is b at position 5: i = lastpos(c1) = {4}, firstpos(c2) = {5}, so followpos(4) = {5}.
  • 77. Conversion from regular expression to DFA Step 4: Calculate followpos. Concatenation node whose c2 is b at position 4: i = lastpos(c1) = {3}, firstpos(c2) = {4}, so followpos(3) = {4}.
  • 78. Conversion from regular expression to DFA Step 4: Calculate followpos. Concatenation node with c1 = (a|b)* and c2 = a at position 3: i = lastpos(c1) = {1,2}, firstpos(c2) = {3}, so 3 is added to followpos(1) and to followpos(2).
  • 79. Conversion from regular expression to DFA Step 4: Calculate followpos. The * node n: i = lastpos(n) = {1,2}, firstpos(n) = {1,2}, so 1 and 2 are added to followpos(1) and to followpos(2).
      Position   followpos
      1          {1,2,3}
      2          {1,2,3}
      3          {4}
      4          {5}
      5          {6}
  • 80. Conversion from regular expression to DFA Initial state = firstpos of root = {1,2,3} ----- A. State A: δ({1,2,3}, a) = followpos(1) ∪ followpos(3) = {1,2,3} ∪ {4} = {1,2,3,4} ----- B. δ({1,2,3}, b) = followpos(2) = {1,2,3} ----- A.
  • 81. Conversion from regular expression to DFA State B: δ({1,2,3,4}, a) = followpos(1) ∪ followpos(3) = {1,2,3} ∪ {4} = {1,2,3,4} ----- B. δ({1,2,3,4}, b) = followpos(2) ∪ followpos(4) = {1,2,3} ∪ {5} = {1,2,3,5} ----- C. State C: δ({1,2,3,5}, a) = followpos(1) ∪ followpos(3) = {1,2,3} ∪ {4} = {1,2,3,4} ----- B. δ({1,2,3,5}, b) = followpos(2) ∪ followpos(5) = {1,2,3} ∪ {6} = {1,2,3,6} ----- D.
  • 82. Conversion from regular expression to DFA State D: δ({1,2,3,6}, a) = followpos(1) ∪ followpos(3) = {1,2,3} ∪ {4} = {1,2,3,4} ----- B. δ({1,2,3,6}, b) = followpos(2) = {1,2,3} ----- A.
      States           a   b
      A = {1,2,3}      B   A
      B = {1,2,3,4}    B   C
      C = {1,2,3,5}    B   D
      D = {1,2,3,6}    B   A
      [Figure: DFA with states A to D drawn from the table] D contains position 6 (the end marker #), so D is the accepting state.
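The state computation above follows a fixed recipe: for a state S and symbol a, take the union of followpos(p) over the positions p in S that carry a. A minimal sketch, using the followpos table of this example as given data (the names `FOLLOWPOS`, `SYMBOL_AT` and `build_dfa` are assumptions):

```python
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
END_POSITION = 6          # position of the end marker #

def build_dfa(start_positions, alphabet):
    start = frozenset(start_positions)
    states, work, dtran = [start], [start], {}
    while work:
        S = work.pop(0)
        for a in alphabet:
            # Union of followpos(p) for every position p in S labeled a.
            U = frozenset().union(
                *(FOLLOWPOS[p] for p in S if SYMBOL_AT[p] == a))
            if not U:
                continue              # no transition on a from this state
            if U not in states:
                states.append(U)
                work.append(U)
            dtran[(S, a)] = U
    return states, dtran

states, dtran = build_dfa({1, 2, 3}, "ab")   # firstpos of the root
name = {S: "ABCD"[i] for i, S in enumerate(states)}
for S in states:
    print(name[S], sorted(S), name[dtran[(S, "a")]], name[dtran[(S, "b")]])
print("accepting:", [name[S] for S in states if END_POSITION in S])
```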
  • 83. Conversion from regular expression to DFA Construct DFA for following regular expression: 1. (c | d)*c#