Chapter 4_Regular Expressions in Automata.pptx

Theory of Computation
Introduction
Dr. Krishnendu Rarhi
E: Krishnendu.e9621@cumail.in

Dr. Krishnendu Rarhi©
Terminologies
• Symbol: A symbol (often also called a character) is the smallest
building block; it can be any letter, digit, or picture.
• Example: a, b, c, 0, 1, ….
• Alphabet (Σ): An alphabet is a finite, non-empty set of symbols.
• Examples: Σ = {0, 1}; {0, 1, 2, …, 9}; {a, b, c}; {A, B, C, …, Z}.
• String: A string is a finite sequence of symbols from some
alphabet. A string is generally denoted by w, and the length of a
string is denoted by |w|.
Σ* is the set of all finite strings over Σ, including the empty string. Every language over Σ is
a subset of Σ*.
The empty string is the string with zero occurrences of symbols, represented as ε.

Formulation
Number of strings of length 2 that can be generated over the alphabet {a, b}:
a a
a b
b a
b b
Length of string |w| = 2
Number of strings = 4
Conclusion:
For the alphabet {a, b}, the number of strings of length n that can be generated is 2^n.
If the number of symbols in alphabet Σ is |Σ|, then the number of strings of length n over Σ is |Σ|^n.
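The counting rule can be checked by enumeration. The following is a minimal sketch; the helper name strings_of_length is an illustrative assumption, not part of the slides.

```python
from itertools import product

def strings_of_length(alphabet, n):
    """Return every string of length n over the given alphabet."""
    # product(alphabet, repeat=n) yields all n-tuples of symbols.
    return ["".join(symbols) for symbols in product(alphabet, repeat=n)]

words = strings_of_length("ab", 2)
print(words)       # ['aa', 'ab', 'ba', 'bb']
print(len(words))  # 4, which equals |Σ|^n = 2^2
```

For any alphabet and length, len(strings_of_length(alphabet, n)) equals len(alphabet) ** n.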

Terminologies
• Language: A language is a set of strings chosen from some
Σ*; in other words, a language is a subset of Σ*. A language
over Σ can be finite or infinite.
Example of a finite language:
L1 = { set of all strings of length 2 over {x, y} }
L1 = { xy, yx, xx, yy }
Example of an infinite language:
L2 = { set of all strings over {a, b} that start with 'b' }
L2 = { babb, baa, ba, bbb, baab, ....... }

Regular expression
• Regular expressions are used to denote regular languages.
Regular languages are the most restricted class of languages in the Chomsky hierarchy
and are accepted by finite automata.
An expression is regular if it is built by the following rules:
• ∅ is a regular expression for the regular language ∅.
• ε is a regular expression for the regular language {ε}.
• If a ∈ Σ (Σ represents the input alphabet), a is a regular expression with language {a}.
• If R and S are regular expressions, R + S is also a regular expression, with language L(R) ∪ L(S). For symbols a and b, a + b has language {a, b}.
• If R and S are regular expressions, RS (concatenation of R and S) is also a regular expression, with language L(R)L(S).
• If R is a regular expression, R* (0 or more repetitions of R) is also a regular expression.

Regular expression
Description / Regular Expression / Regular Language
• Set of vowels: regular expression a ∪ e ∪ i ∪ o ∪ u; regular language {a, e, i, o, u}
• a followed by 0 or more b: regular expression a.b*; regular language {a, ab, abb, abbb, abbbb, ….}
• Any no. of vowels followed by any no. of consonants: regular expression v*.c* (where v – vowels and c – consonants); regular language {ε, a, aou, aiou, b, abcd, …..}, where ε represents the empty string (in case of 0 vowels and 0 consonants)

Regular Expressions vs. Finite Automata
• Offers a declarative way to express the pattern of any string we want to accept
• E.g., 01*+ 10*
• Automata => more machine-like
< input: string , output: [accept/reject] >
• Regular expressions => more program syntax-like
• Unix environments heavily use regular expressions
• E.g., bash shell, grep, vi & other editors, sed
• Perl scripting – good for string processing
• Lexical analyzers such as Lex or Flex

Regular Expressions
[Figure: Regular expressions (syntactical expressions) = Finite automata (DFA, NFA, ε-NFA; automata/machines) = Regular languages (formal language classes)]

Language Operators
• Union of two languages:
• L U M = all strings that are either in L or M
• Note: A union of two languages produces a third language
• Concatenation of two languages:
• L . M = all strings that are of the form xy
s.t., x ∈ L and y ∈ M
• The dot operator is usually omitted
• i.e., LM is same as L.M
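For finite languages, both operators map directly onto set operations. This is a small sketch using Python sets; the sample languages L and M are made up for illustration.

```python
def union(L, M):
    """All strings that are in L or in M."""
    return L | M

def concat(L, M):
    """All strings xy with x taken from L and y taken from M."""
    return {x + y for x in L for y in M}

L = {"a", "ab"}
M = {"b", "bb"}
print(sorted(union(L, M)))   # ['a', 'ab', 'b', 'bb']
print(sorted(concat(L, M)))  # ['ab', 'abb', 'abbb']
```

The concatenation has three strings rather than four because a.bb and ab.b both produce abb.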

Kleene Closure (the * operator)
• Kleene Closure of a given language L:
• L^0 = {ε}
• L^1 = {w | w ∈ L}
• L^2 = { w1w2 | w1 ∈ L, w2 ∈ L (duplicates allowed)}
• L^i = { w1w2…wi | all w’s chosen are ∈ L (duplicates allowed)}
• (Note: the choice of each wi is independent)
• L* = ∪i≥0 L^i (arbitrary number of concatenations)
Example:
• Let L = {1, 00}
• L^0 = {ε}
• L^1 = {1, 00}
• L^2 = {11, 100, 001, 0000}
• L^3 = {111, 1100, 1001, 10000, 000000, 00001, 00100, 0011}
• L* = L^0 ∪ L^1 ∪ L^2 ∪ …
“i” here refers to how many strings to concatenate from the parent
language L to produce strings in the language L^i
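The powers of L in the example can be reproduced mechanically. The sketch below represents ε as the empty Python string; the function names power and kleene_star_up_to are illustrative assumptions, and the closure is necessarily cut off at a chosen maximum i because L* is usually infinite.

```python
def concat(L, M):
    """All strings xy with x in L and y in M."""
    return {x + y for x in L for y in M}

def power(L, i):
    """L^i: concatenate i strings chosen independently from L; L^0 is {ε}."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def kleene_star_up_to(L, max_i):
    """Finite approximation of L*: the union of L^0 .. L^max_i."""
    out = set()
    for i in range(max_i + 1):
        out |= power(L, i)
    return out

L = {"1", "00"}
print(sorted(power(L, 2)))  # ['0000', '001', '100', '11']
print(len(power(L, 3)))     # 8
```

Calling kleene_star_up_to(set(), 5) returns {""}, which matches the special case that the closure of the empty language contains only ε.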

Kleene Closure (special notes)
• L* is an infinite set iff |L| ≥ 1 and L ≠ {ε} (Why?)
• If L = {ε}, then L* = {ε} (Why?)
• If L = ∅, then L* = {ε} (Why?)
Σ* denotes the set of all words over an alphabet Σ
• Therefore, an abbreviated way of saying there is an
arbitrary language L over an alphabet Σ is:
• L ⊆ Σ*

Building Regular Expressions
• Let E be a regular expression and the language represented by E is
L(E)
• Then:
• L((E)) = L(E)
• L(E + F) = L(E) U L(F)
• L(E F) = L(E) L(F)
• L(E*) = (L(E))*

Example: how to use these regular expression properties and
language operators?
• L = { w | w is a binary string which does not contain two consecutive 0s or two consecutive 1s anywhere }
• E.g., w = 01010101 is in L, while w = 10010 is not in L
• Goal: Build a regular expression for L
• Four cases for w:
• Case A: w starts with 0 and |w| is even
• Case B: w starts with 1 and |w| is even
• Case C: w starts with 0 and |w| is odd
• Case D: w starts with 1 and |w| is odd
• Regular expression for the four cases:
• Case A: (01)*
• Case B: (10)*
• Case C: 0(10)*
• Case D: 1(01)*
• Since L is the union of all 4 cases:
• Reg Exp for L = (01)* + (10)* + 0(10)* + 1(01)*
• If we introduce ε then the regular expression can be simplified to:
• Reg Exp for L = (ε + 1)(01)*(ε + 0)
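Both expressions can be checked against the definition of L by brute force. The sketch below uses Python's re module, where union is written | instead of + and (ε + 1) is written 1?; the bound of 8 symbols and the helper names are arbitrary assumptions, so this is a sanity check, not a proof.

```python
import re
from itertools import product

# Union of the four cases, and the simplified form (ε + 1)(01)*(ε + 0).
FOUR_CASES = re.compile(r"(01)*|(10)*|0(10)*|1(01)*")
SIMPLIFIED = re.compile(r"1?(01)*0?")

def alternates(w):
    """True if w has no two equal adjacent symbols (the definition of L)."""
    return all(a != b for a, b in zip(w, w[1:]))

def check(max_len=8):
    """Compare both expressions with the definition on all short binary strings."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            expected = alternates(w)
            if bool(FOUR_CASES.fullmatch(w)) != expected:
                return False
            if bool(SIMPLIFIED.fullmatch(w)) != expected:
                return False
    return True

print(check())  # True
```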

Equivalence of Regular Expressions
• Equivalence is defined as two regular expressions describing or producing the same
language.
• For regular expressions S and R, if L(S) = L(R) then S ≡ R
• We can use regular expressions to show whether two languages produce the same
strings.
• Axioms:
• The associativity property for union: S+(R+T)≡(S+R)+T
• The commutativity property for union: S+R≡R+S
• The associativity property for concatenation: S×(R×T)≡(S×R)×T
• The identity property for union: S+∅≡S
• The identity property for concatenation: S×ε≡S
• The left distributivity property: S(R+T)≡SR+ST
• The right distributivity property: (S+T)R≡SR+TR
• The idempotence property of Kleene star: S∗∗≡S∗
• The annihilator property for concatenation: S×∅≡∅≡∅×S

Equivalence of Regular Expressions
• Let's check the equivalence of the following expressions: (0110+01)(10)∗ ≡ 01(10)∗
• Let's take the LHS: (0110+01)(10)∗
• Use the identity property for concatenation: ≡(0110+01ε)(10)∗
• Apply the left distributive property: ≡(01(10+ε))(10)∗
• Use the associative property for concatenation: ≡(01)((10+ε)(10)∗)
• Apply the right distributive property: ≡01(10(10)∗+ε(10)∗)
• Use the identity property of concatenation: ≡01(10(10)∗+(10)∗)
• Use the substitution property: ≡01(10)∗
• This is equal to the right-hand side of the equation. The last step holds because
L(10(10)∗) ⊆ L((10)∗), so the union 10(10)∗+(10)∗ denotes the same language as (10)∗.
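An algebraic derivation like this one can be cross-checked by comparing the two expressions on every string up to some length. The following is a rough sketch; same_language is a hypothetical helper, the length bound is an assumption, and agreement on short strings supports but does not prove equivalence. In Python's re syntax, + (union) becomes |.

```python
import re
from itertools import product

def same_language(pattern_a, pattern_b, alphabet="01", max_len=8):
    """Return (True, None) if both patterns accept exactly the same strings
    up to max_len, otherwise (False, first_string_where_they_differ)."""
    a, b = re.compile(pattern_a), re.compile(pattern_b)
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            w = "".join(symbols)
            if bool(a.fullmatch(w)) != bool(b.fullmatch(w)):
                return False, w
    return True, None

print(same_language(r"(0110|01)(10)*", r"01(10)*"))  # (True, None)
```

When the expressions differ, the second component is a concrete counterexample, which is often more convincing than an argument in words.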

PUMPING Lemma
• Informally, it says that every sufficiently long string in a regular language contains a
non-empty section that can be repeated (pumped) any number of times.
• In other words, it provides a way to break a given long input string
into three substrings, the middle one of which can be pumped.
• It gives a necessary condition for regularity, which is used to prove that a set of strings is not regular.
• Theorem: For any regular language L, there exists an integer P such that
every w in L with |w| ≥ P
can be broken into three strings, w = xyz, such that:
(1) |xy| ≤ P
(2) |y| ≥ 1
(3) for all k ≥ 0, the string xy^k z is also in L

Application of PUMPING Lemma
• Pumping lemma is to be applied to show that certain languages are
not regular.
• It should never be used to show a language is regular.
• If L is regular, it satisfies the Pumping lemma.
• If L does not satisfy the Pumping Lemma, it is not regular.

Application of PUMPING Lemma
Steps to prove that a language is not regular by using PL are as follows−
• step 1 − We have to assume that L is regular
• step 2 − So, the pumping lemma should hold for L.
• step 3 − It has to have a pumping length (say P).
• step 4 − All strings w in L with |w| ≥ P can be pumped.
• step 5 − Now find a string 'w' in L such that |w| ≥ P.
• step 6 − Divide w into xyz.
• step 7 − Show that xy^i z ∉ L for some i.
• step 8 − Then consider all ways that w can be divided into xyz.
• step 9 − Show that none of these can satisfy all the 3 pumping conditions at same
time.
• step 10 − w cannot be pumped = CONTRADICTION.

Finite Automata (FA) & Regular Expressions (Reg Ex)
To show that they are interchangeable, consider
the following theorems:
Theorem 1: For every DFA A there exists a regular
expression R such that L(R)=L(A)
Theorem 2: For every regular expression R there exists
an ε-NFA E such that L(E)=L(R)
[Figure: Kleene Theorem cycle. Reg Ex → ε-NFA (Theorem 2) → NFA → DFA → Reg Ex (Theorem 1)]

DFA to RE construction
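[Figure: Theorem 1, DFA → Reg Ex]
Example:
[Figure: DFA with states q0, q1, q2. q0 loops on 1 and moves to q1 on 0; q1 loops on 0 and moves to q2 on 1; q2 loops on 0,1. The path is labelled (1*) 0 (0*) 1 (0 + 1)*.]
Informally, trace all distinct paths (traversing cycles only once)
from the start state to each of the final states
and enumerate all the expressions along the way
Resulting regular expression: 1*00*1(0+1)* (segments: 1*, then 00*, then 1(0+1)*)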
Q) What is the language?
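One way to explore the question is to simulate the automaton and compare it with the expression. The sketch below assumes the transitions read off the example (q0 loops on 1 and goes to q1 on 0, q1 loops on 0 and goes to q2 on 1, q2 loops on both symbols), and it further assumes that q0 is the start state and q2 the only accepting state, which the extracted slide text does not state explicitly.

```python
import re
from itertools import product

# Transition table of the example DFA (start and accepting states are assumptions).
DELTA = {
    ("q0", "1"): "q0", ("q0", "0"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
START, ACCEPTING = "q0", {"q2"}

# The derived expression 1*00*1(0+1)*, written with | for union.
PATTERN = re.compile(r"1*00*1(0|1)*")

def dfa_accepts(w):
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING

def agree(max_len=10):
    """Check DFA and expression against each other on all short binary strings."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if dfa_accepts(w) != bool(PATTERN.fullmatch(w)):
                return False
    return True

print(agree())  # True
```

Under these assumptions the accepted strings are exactly those containing 01 as a substring.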

RE to ε-NFA construction
[Figure: Theorem 2, Reg Ex → ε-NFA]
Example: (0+1)*01(0+1)*
[Figure: ε-NFA built from three parts joined by ε-transitions: (0+1)*, then 01, then (0+1)*]

Regular Grammar & Regular Language
Regular Grammar : A grammar is regular if it has rules of the
form A -> a or A -> aB or A -> ε, where A and B are variables, a is a terminal, and ε denotes the
empty (NULL) string.
Regular Languages : A language is regular if it can be
expressed in terms of a regular expression.

Closure Property of Regular Language
• Union : If L1 and L2 are two regular languages, their union L1 ∪ L2
will also be regular. For example, L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1 ∪ L2 = {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} is also regular.
• Intersection : If L1 and L2 are two regular languages, their
intersection L1 ∩ L2 will also be regular. For example,
L1 = {a^m | m ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1 ∩ L2 = {ε} (the only string that is both a run of a's and a run of b's) is also regular.
• Concatenation : If L1 and L2 are two regular languages, their
concatenation L1.L2 will also be regular. For example,
L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1.L2 = {a^m b^n | m ≥ 0 and n ≥ 0} is also regular.

Closure Property of Regular Language
• Kleene Closure : If L1 is a regular language, its Kleene
closure L1* will also be regular. For example,
L1 = (a ∪ b)
L1* = (a ∪ b)*
• Complement : If L(G) is a regular language, its complement
L’(G) will also be regular. The complement of a language can be
found by subtracting the strings which are in L(G) from all
possible strings. For example, over Σ = {a},
L(G) = {a^n | n > 3}
L’(G) = {a^n | n <= 3}
Two regular expressions are equivalent if languages generated by them are same. For example,
(a+b*)* and (a+b)* generate same language. Every string which is generated by (a+b*)* is also
generated by (a+b)* and vice versa.

Examples
Which one of the following languages over the alphabet {0,1}
is described by the regular expression?
(0+1)*0(0+1)*0(0+1)*
(A) The set of all strings containing the substring 00.
(B) The set of all strings containing at most two 0’s.
(C) The set of all strings containing at least two 0’s.
(D) The set of all strings that begin and end with either 0 or 1.
Option A says that it must have substring 00. But 10101 is also a part of language but it does not
contain 00 as substring. So it is not correct option.
Option B says that it can have maximum two 0’s but 00000 is also a part of language. So it is not
correct option.
Option C says that it must contain at least two 0. In regular expression, two 0 are present. So this
is correct option.
Option D says that it contains all strings that begin and end with either 0 or 1. That describes every
non-empty string, but the expression cannot generate strings with fewer than two 0’s, such as 1 or 01. So it is not correct.

Examples
Which of the following languages is generated by given grammar?
S -> aS | bS | ∊
(A) {a^n b^m | n, m ≥ 0}
(B) {w ∈ {a,b}* | w has equal number of a’s and b’s}
(C) {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} ∪ {a^n b^n | n ≥ 0}
(D) {a,b}*
Option (A) says that it will have 0 or more a’s followed by 0 or more b’s. But S => bS => baS => ba is also
a part of the language. So (A) is not correct.
Option (B) says that it will have an equal number of a’s and b’s. But S => bS => b is also a part of the
language. So (B) is not correct.
Option (C) says either it will have 0 or more a’s or 0 or more b’s or a’s followed by b’s. But as shown
in option (A), ba is also part of language. So (C) is not correct.
Option (D) says it can have any number of a’s and any numbers of b’s in any order. So (D) is correct.

Examples
The regular expression 0*(10*)* denotes the same set as
(A) (1*0)*1*
(B) 0 + (0 + 10)*
(C) (0 + 1)* 10(0 + 1)*
(D) none of these
Two regular expressions are equivalent if languages generated by them are same.
Option (A) generates every string over {0,1}, and so does 0*(10*)*. So they are equivalent.
Option (B) cannot generate the string 1 (every 1 in 0 + (0 + 10)* is followed by a 0), but 0*(10*)* can. So
they are not equivalent.
Option (C) will have 10 as substring but 0*(10*)* may or may not. So they are not
equivalent.

Examples
The regular expression for the language having input alphabets a and
b, in which two a’s do not come together:
(A) (b + ab)* + (b +ab)*a
(B) a(b + ba)* + (b + ba)*
(C) both options (A) and (B)
(D) none of the above
Option (C), stating both options (A) and (B), is the correct answer to the stated question.
The language in the question can be expressed as L={ε,a,b,bb,ab,aba,ba,bab,baba,abab,…}.
In option (A) ‘ab’ is considered the building block for finding out the required regular expression.(b + ab)*
covers all cases of strings generated ending with ‘b’.(b + ab)*a covers all cases of strings generated ending
with a.
Applying similar logic for option (B) we can see that the regular expression is derived considering ‘ba’ as
the building block and it covers all cases of strings starting with a and starting with b.

Chomsky Hierarchy

Chomsky Hierarchy (Type 0: Unrestricted
Grammar)
• Type-0 grammars include all formal grammars. Type-0 languages
are recognized by Turing machines. These languages are also known as the
Recursively Enumerable languages.
Grammar productions are of the form
α -> β
where α is ( V + T)* V ( V + T)*
V : Variables
T : Terminals.
β is ( V + T)*
In type 0 there must be at least one variable on the left side of a production.
For example,
Sab –> ba
A –> S.
Here, the variables are S, A and the terminals are a, b.

Chomsky Hierarchy (Type 1: Context Sensitive
Grammar)
• Type-1 grammars generate the context-sensitive languages. The languages
generated by these grammars are recognized by Linear Bounded Automata.
In Type 1
I. First of all, a Type 1 grammar should be Type 0.
II. Grammar productions are of the form
α -> β; |α| <= |β| (the count of symbols in α is less than or equal to that in β)
For Example,
S –> AB
AB –> abc
B –> b

Chomsky Hierarchy (Type 2: Context Free
Grammar)
• Type-2 grammars generate the context-free languages. The language
generated by the grammar is recognized by a Pushdown automaton.
In Type 2,
1. First of all it should be Type 1.
2. The left-hand side of each production must be a single variable.
For example,
S –> AB
A –> a
B –> b

Chomsky Hierarchy (Type 3: Regular
Grammar)
• Type-3 grammars generate regular languages. These languages are
exactly all languages that can be accepted by a finite state
automaton. Type 3 is most restricted form of grammar.
Type 3 should be in the given form only :
V –> VT / T (left-regular grammar)
(or)
V –> TV /T (right-regular grammar)
for example:
S –> a
The above form is called as strictly regular grammar.
There is another form of regular grammar called extended regular grammar. In this
form :
V –> VT* / T*. (extended left-regular grammar)
(or)
V –> T*V /T* (extended right-regular grammar)
for example :
S –> ab.

Arden’s Theorem
• In order to find out a regular expression of a Finite Automaton, we
use Arden’s Theorem along with the properties of regular expressions.
Statement −
• Let P and Q be two regular expressions.
• If P does not contain null string, then R = Q + RP has a unique solution
that is R = QP*

Arden’s Theorem
• Proof −
R = Q + (Q + RP)P [After putting the value R = Q + RP]
= Q + QP + RPP
When we put the value of R recursively again and again, we get the
following equation −
R = Q + QP + QP^2 + QP^3 + …..
R = Q (ε + P + P^2 + P^3 + …. )
R = QP* [As P* represents (ε + P + P^2 + P^3 + ….) ]
Hence, proved.
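The identity can also be checked on concrete finite languages by cutting every set off at a maximum string length. This is only a sketch under that truncation assumption; the function names and the sample languages Q and P are made up for illustration, and ε is represented by the empty Python string.

```python
def concat(L, M, max_len):
    """All strings xy with x in L, y in M, keeping only those of length <= max_len."""
    return {x + y for x in L for y in M if len(x + y) <= max_len}

def star(P, max_len):
    """All strings of P* with length <= max_len."""
    result, frontier = {""}, {""}
    while frontier:
        # Extend only the newly found strings; stop when nothing new appears.
        frontier = concat(frontier, P, max_len) - result
        result |= frontier
    return result

def arden_holds(Q, P, max_len=8):
    """Check that R = QP* satisfies R = Q + RP on strings up to max_len."""
    if "" in P:
        raise ValueError("Arden's theorem requires that P does not contain ε")
    R = concat(Q, star(P, max_len), max_len)
    short_Q = {q for q in Q if len(q) <= max_len}
    return R == short_Q | concat(R, P, max_len)

print(arden_holds({"b"}, {"a", "ab"}))  # True
```

The truncation is safe here because P has no empty string, so every string of QP* is either in Q or is a strictly shorter string of QP* followed by one string of P.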

Designing RE from FA
Here the initial state and final state is q1.
The equations for the three states q1, q2,
and q3 are as follows −
• q1 = q1a + q3a + ε (the ε term is because q1 is
the initial state)
• q2 = q1b + q2b + q3b
• q3 = q2a

Designing RE from FA
Now, we will solve these three equations −
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (Substituting value of q3)
= q1b + q2(b + ab)
= q1b (b + ab)* (Applying Arden’s Theorem)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (Substituting value of q3)
= q1a + q1b(b + ab)*aa + ε (Substituting value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε (a+ b(b + ab)*aa)*
= (a + b(b + ab)*aa)*
Hence, the regular expression is (a + b(b + ab)*aa)*.
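The result can be sanity-checked against the automaton. The transition table below is reconstructed from the three state equations (a term such as q3a in the equation for q1 is read as a move from q3 to q1 on a), which is an assumption since the figure itself is not part of the text; q1 is both the start and the final state.

```python
import re
from itertools import product

# Transitions reconstructed from the equations for q1, q2 and q3.
DELTA = {
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q3", ("q2", "b"): "q2",
    ("q3", "a"): "q1", ("q3", "b"): "q2",
}

# (a + b(b + ab)*aa)* written with | for union.
PATTERN = re.compile(r"(a|b(b|ab)*aa)*")

def fa_accepts(w):
    """Run the automaton from q1 and accept if it ends in q1."""
    state = "q1"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state == "q1"

def agree(max_len=10):
    """Compare automaton and expression on all strings over {a, b} up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("ab", repeat=n):
            w = "".join(symbols)
            if fa_accepts(w) != bool(PATTERN.fullmatch(w)):
                return False
    return True

print(agree())  # True
```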

Designing FA from RE
• Even number of a’s : The regular expression for even
number of a’s is (b|ab*ab*)*. We can construct a finite
automata as shown in Figure
The above automata will accept all strings which have even number of a’s. For
zero a’s, it will be in q0 which is final state. For one ‘a’, it will go from q0 to q1
and the string will not be accepted. For two a’s at any positions, it will go from
q0 to q1 for 1st ‘a’ and q1 to q0 for second ‘a’. So, it will accept all strings with
even number of a’s.

Constructing FA from RE
Case 1 − For a regular expression ‘a’, we
can construct the following FA −
Case 2 − For a regular expression ‘ab’,
we can construct the following FA −
Case 3 − For a regular expression (a+b),
we can construct the following FA −
Case 4 − For a regular expression
(a+b)*, we can construct the following
FA −

Constructing FA from RE
• Method
• Step 1 Construct an NFA with Null
moves from the given regular
expression.
• Step 2 Remove Null transition from
the NFA and convert it into its
equivalent DFA.

• String with ‘ab’ as substring : The regular expression for
strings with ‘ab’ as substring is (a|b)*ab(a|b)*. We can
construct finite automata as shown in Figure
The above automata will accept all string which have ‘ab’ as substring. The
automata will remain in initial state q0 for b’s. It will move to q1 after reading
‘a’ and remain in same state for all ‘a’ afterwards. Then it will move to q2 if ‘b’
is read. That means, the string has read ‘ab’ as substring if it reaches q2.

• String with count of ‘a’ divisible by 3 : The language of strings over {a} with count of a
divisible by 3 is {a^(3n) | n >= 0}, with regular expression (aaa)*. We can construct
automata as shown in Figure
The above automata will accept all strings of the form a^(3n). The automata will remain
in initial state q0 for ɛ and it will be accepted. For string ‘aaa’, it will move from
q0 to q1 then q1 to q2 and then q2 to q0. For every set of three a’s, it will come
to q0, hence accepted. Otherwise, it will be in q1 or q2, hence rejected.
If we want to design a finite automata with number of a’s as 3n+1, same
automata can be used with final state as q1 instead of q0.
If we want to design a finite automata with language {a^(kn) | n >= 0}, k
states are required. We have used k = 3 in our example.
• Binary numbers divisible by 3 : The regular expression for binary numbers which are
divisible by three is (0|1(01*0)*1)*. The examples of binary number divisible by 3 are 0,
011, 110, 1001, 1100, 1111, 10010 etc. The DFA corresponding to binary number
divisible by 3 can be shown in Figure
The above automata will accept all binary numbers divisible by 3. For
1001, the automata will go from q0 to q1, then q1 to q2, then q2 to q1 and
finally q1 to q0, hence accepted. For 0111, the automata will go from q0 to
q0, then q0 to q1, then q1 to q0 and finally q0 to q1, hence rejected.
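The traces above are consistent with a DFA whose state records the remainder of the number read so far: reading bit b in state r leads to (2r + b) mod 3. Taking q0, q1, q2 to mean remainders 0, 1, 2 is an assumption, since the figure is not reproduced in the text. The sketch below compares that DFA and the regular expression with ordinary integer arithmetic.

```python
import re
from itertools import product

# (0|1(01*0)*1)* as given in the text.
PATTERN = re.compile(r"(0|1(01*0)*1)*")

def remainder_dfa_accepts(w):
    """State r is the value of the bits read so far, modulo 3."""
    r = 0
    for bit in w:
        r = (2 * r + int(bit)) % 3
    return r == 0

def agree(max_len=10):
    """Compare DFA and expression with int(w, 2) % 3 on all non-empty binary strings."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            expected = int(w, 2) % 3 == 0
            if remainder_dfa_accepts(w) != expected:
                return False
            if bool(PATTERN.fullmatch(w)) != expected:
                return False
    return True

print(agree())                        # True
print(remainder_dfa_accepts("1001"))  # True  (9 is divisible by 3)
print(remainder_dfa_accepts("0111"))  # False (7 is not)
```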

• String with regular expression (111 + 11111)* : The string accepted using
this regular expression will have 3, 5, 6(111 twice), 8 (11111 once and 111
once), 9 (111 thrice), 10 (11111 twice) and all other counts of 1 afterwards.
The DFA corresponding to given regular expression is given in Figure

Designing FA from RE (Example)
• What will be the minimum number of states for a DFA accepting strings with an odd
number of a’s?
The regular expression for an odd number of a’s is b*ab*(ab*ab*)* and the
corresponding automaton is given in Figure; the minimum number of
states is 2.

Pumping Lemma
Let L be a regular language. Then there exists a constant ‘c’ such that for every string w in L with
|w| ≥ c
we can break w into three strings, w = xyz, such that −
• |y| > 0
• |xy| ≤ c
• For all k ≥ 0, the string xy^k z is also in L.
Applications of Pumping Lemma
• Pumping Lemma is to be applied to show that certain languages are not regular. It should
never be used to show a language is regular.
• If L is regular, it satisfies Pumping Lemma.
• If L does not satisfy Pumping Lemma, it is non-regular.

Pumping Lemma
Method to prove that a language L is not regular
• At first, we have to assume that L is regular.
• So, the pumping lemma should hold for L.
• Use the pumping lemma to obtain a contradiction −
• Select w such that |w| ≥ c
• Select y such that |y| ≥ 1
• Select x such that |xy| ≤ c
• Assign the remaining string to z.
• Select k such that the resulting string is not in L.

Pumping Lemma
Prove that L = {a^i b^i | i ≥ 0} is not regular.
• At first, we assume that L is regular and n is the pumping constant (the number of states of a DFA for L).
• Let w = a^n b^n. Thus |w| = 2n ≥ n.
• By pumping lemma, let w = xyz, where |xy| ≤ n and |y| ≥ 1.
• Since |xy| ≤ n, x and y consist only of a’s. Let x = a^p, y = a^q, and z = a^r b^n, where p + q + r = n and q ≠ 0. Thus |y| ≠ 0.
• Let k = 2. Then xy^2 z = a^p a^(2q) a^r b^n.
• Number of a’s = (p + 2q + r) = (p + q + r) + q = n + q
• Hence, xy^2 z = a^(n+q) b^n. Since q ≠ 0, xy^2 z is not of the form a^n b^n.
• Thus, xy^2 z is not in L. Hence L is not regular.
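The argument can be illustrated for small values of n by trying every allowed split of w = a^n b^n. This is a sketch of the case analysis, not a replacement for the proof; the function names are illustrative assumptions.

```python
def in_L(w):
    """True if w has the form a^i b^i for some i >= 0."""
    half, rem = divmod(len(w), 2)
    return rem == 0 and w == "a" * half + "b" * half

def every_split_fails(n, k=2):
    """For w = a^n b^n, try every split w = xyz with |xy| <= n and |y| >= 1,
    and check that the pumped string x y^k z always leaves L."""
    w = "a" * n + "b" * n
    for xy_len in range(1, n + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            if in_L(x + y * k + z):
                return False
    return True

print(all(every_split_fails(n) for n in range(1, 11)))  # True
```

Pumping down with k = 0 fails in the same way, because removing y leaves fewer a's than b's.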

Designing Deterministic Finite Automata
• Problem-1: Construction of a DFA for the set of string over {a, b} such that length of the string
|w|=2 i.e, length of the string is exactly 2.
Explanation – The desired language will be like:
L = {aa, ab, ba, bb}
Here, State A represent set of all string of length zero (0), state B represent set of all string of
length one (1), state C represent set of all string of length two (2). State C is the final state and D
is the dead state it is so because after getting any alphabet as input it will not go into final state
ever.
The above automata will accept all the strings having the length of the string exactly 2. When the
length of the string is 1, then it will go from state A to B. When the length of the string is 2, then
it will go from state B to C and when the length of the string is greater than 2, then it will go from
state C to D (Dead state) and after it from state D TO D itself.
Number of states: n+2, where n is the required length (|w| = n)
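The n+2 state count can be made concrete by generating the automaton for an arbitrary n. In this sketch, states are numbered 0 to n+1 instead of being lettered A, B, C, D (an assumption for convenience): state i means that i symbols have been read, state n is the final state, and state n+1 is the dead state.

```python
def exact_length_dfa(n, alphabet="ab"):
    """Build a DFA accepting exactly the strings of length n over the alphabet."""
    dead = n + 1
    delta = {}
    for state in range(n + 2):
        for symbol in alphabet:
            # Every symbol advances the count; beyond n the DFA stays in the dead state.
            delta[(state, symbol)] = min(state + 1, dead)
    return delta, 0, {n}

def accepts(dfa, w):
    """Run the DFA on w and report whether it ends in a final state."""
    delta, state, final_states = dfa
    for symbol in w:
        state = delta[(state, symbol)]
    return state in final_states

dfa = exact_length_dfa(2)
print([w for w in ["", "a", "ab", "bb", "aba"] if accepts(dfa, w)])  # ['ab', 'bb']
```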

• Problem-2: Construction of a DFA for the set of strings over {a, b} such that
length of the string |w|>=2 i.e, length of the string should be at least 2.
L = {aa, ab, ba, bb, aaa, aab, aba, abb........}
Here, State A represents the set of all strings of length zero (0), state B represents the set of
all strings of length one (1), and state C represents the set of all strings of length two (2) or more. State C is the final state.
The above automata will accept all the strings having the length of the string at
least 2. When the length of the string is 1, then it will go from state A to B. When
the length of the string is 2, then it will go from state B to C and lastly when the
length of the string is greater than 2, then it will go from state C to C itself.
Number of states: n+1, where n is the minimum length (|w| >= n)

• Problem-3: Construction of a DFA for the set of strings over {a, b} such that length of the
string |w|<=2 i.e, length of the string is at most 2.
L = {ε, a, b, aa, ab, ba, bb}
Here, State A represents the set of all strings of length zero (0), state B represents the set of all strings
of length one (1), state C represents the set of all strings of length two (2). States A, B, C are the
final states and D is the dead state, because after reaching D no input can lead
back to a final state.
The above automata will accept all the strings having the length of the string at most 2.
When the length of the string is 1, then it will go from state A to B. When the length of the
string is 2, then it will go from state B to C and lastly when the length of the string is
greater than 2, then it will go from state C to D (Dead state).
Number of states: n+2, where n is the maximum length (|w| <= n)

Conversion of NFA to DFA
• An NFA can have zero, one or more than one move from a given state on a
given input symbol. An NFA can also have NULL moves (moves without input
symbol). On the other hand, DFA has one and only one move from a given
state on a given input symbol.
• Conversion from NFA to DFA
Suppose there is an NFA N = (Q, Σ, q0, δ, F) which recognizes a language L.
Then the DFA D = (Q’, Σ, q0, δ’, F’) can be constructed for language L as:
Step 1: Initially Q’ = ɸ.
Step 2: Add q0 to Q’.
Step 3: For each state in Q’, find the possible set of states for each input
symbol using transition function of NFA. If this set of states is not in Q’, add it
to Q’.
Step 4: The final states of the DFA will be all states which contain a final state of the NFA (a member of F).

Consider the following NFA shown in Figure
Following are the various parameters for NFA.
Q = { q0, q1, q2 }
Σ = { a, b }
F = { q2 }
δ (Transition Function of NFA)

• Step 1: Q’ = ɸ
Step 2: Q’ = {q0}
Step 3: For each state in Q’, find the states for each input
symbol.
Currently, state in Q’ is q0, find moves from q0 on input symbol
a and b using transition function of NFA and update the
transition table of DFA.
δ’ (Transition Function of DFA)
• Now { q0, q1 } will be considered as a single state. As its entry is
not in Q’, add it to Q’.
So Q’ = { q0, { q0, q1 } }

• Now, moves from state { q0, q1 } on different input symbols
are not present in transition table of DFA, we will calculate it
like:
δ’ ( { q0, q1 }, a ) = δ ( q0, a ) ∪ δ ( q1, a ) = { q0, q1 }
δ’ ( { q0, q1 }, b ) = δ ( q0, b ) ∪ δ ( q1, b ) = { q0, q2 }
Now we will update the transition table of DFA.
• Now { q0, q2 } will be considered as a single state. As its
entry is not in Q’, add it to Q’.
So Q’ = { q0, { q0, q1 }, { q0, q2 } }

• Now, moves from state {q0, q2} on different input symbols
are not present in transition table of DFA, we will calculate it
like:
δ’ ( { q0, q2 }, a ) = δ ( q0, a ) ∪ δ ( q2, a ) = { q0, q1 }
δ’ ( { q0, q2 }, b ) = δ ( q0, b ) ∪ δ ( q2, b ) = { q0 }
Now we will update the transition table of DFA.
• As there is no new state generated, we are done with the
conversion. Final state of DFA will be state which has q2 as
its component i.e., {q0, q2 }

• Following are the various parameters for DFA.
Q’ = { q0, { q0, q1 }, { q0, q2 } }
Σ = { a, b }
F’ = { { q0, q2 } } and transition function δ’ as shown above.
The final DFA for above NFA has been shown in Figure
Sometimes, it is not easy to convert regular expression to DFA. First you can convert regular
expression to NFA and then NFA to DFA.
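The four conversion steps can be sketched as a short worklist procedure. The NFA transition table used here is reconstructed from the unions computed in the worked example (δ(q0, a) = {q0, q1}, δ(q0, b) = {q0}, δ(q1, b) = {q2}, all other moves empty), which is an assumption because the original table is not reproduced in the text; the sketch also assumes an NFA without NULL moves.

```python
from collections import deque

# NFA transitions inferred from the worked example; missing entries mean "no move".
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}

def subset_construction(delta, start, finals, alphabet):
    """Convert an NFA (without NULL moves) into an equivalent DFA.
    Each DFA state is a frozenset of NFA states."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves of every member state (Step 3).
            target = frozenset(t for q in current for t in delta.get((q, symbol), ()))
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state is final if it contains any NFA final state (Step 4).
    dfa_finals = {s for s in seen if s & finals}
    return seen, dfa_delta, start_set, dfa_finals

states, dfa_delta, start, finals = subset_construction(NFA_DELTA, "q0", {"q2"}, "ab")
print(sorted(sorted(s) for s in states))  # [['q0'], ['q0', 'q1'], ['q0', 'q2']]
print(sorted(sorted(s) for s in finals))  # [['q0', 'q2']]
```

Using frozenset for DFA states lets a set of NFA states serve directly as a dictionary key, which mirrors how the worked example treats { q0, q1 } as a single state.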

Conversion of NFA to DFA (Example)
• The number of states in the minimal deterministic finite automaton
corresponding to the regular expression (0 + 1)* (10) is ___________.
First, we will make an NFA for the above expression. To make an NFA for (0 + 1)*,
NFA will be in same state q0 on input symbol 0 or 1. Then for concatenation, we
will add two moves (q0 to q1 for 1 and q1 to q2 for 0) as shown in Figure
