Regular expressions

Courtesy: Costas/Ullman 1
Regular Expressions

2
RE’s: Introduction
• Regular expressions describe
languages by an algebra.
• They describe exactly the regular
languages.
• If E is a regular expression, then L(E)
is the language it defines.
• We’ll describe RE’s and their
languages recursively.
Courtesy: Costas/Ullman

3
Operations on Languages
• RE’s use three operations: union,
concatenation, and Kleene star.
• The union of languages is the usual thing,
since languages are sets.
• Example: {01,111,10}{00, 01} =
{01,111,10,00}.

4
Concatenation
• The concatenation of languages L and
M is denoted LM.
• It contains every string wx such that
w is in L and x is in M.
• Example: {01,111,10}{00, 01} = {0100,
0101, 11100, 11101, 1000, 1001}.

5
Kleene Star
• If L is a language, then L*, the Kleene star
or just “star,” is the set of strings formed
by concatenating zero or more strings from
L, in any order.
• L* = {ε}  L  LL  LLL  …
• Example: {0,10}* = {ε, 0, 10, 00, 010, 100,
1010,…}

6
RE’s: Definition
• Basis 1: If a is any symbol, then a is a RE,
and L(a) = {a}.
– Note: {a} is the language containing one string,
and that string is of length 1.
• Basis 2: ε is a RE, and L(ε) = {ε}.
• Basis 3: ∅ is a RE, and L(∅) = ∅.

7
RE’s: Definition – (2)
• Induction 1: If E1 and E2 are regular
expressions, then E1+E2 is a regular
expression, and L(E1+E2) = L(E1)L(E2).
• Induction 2: If E1 and E2 are regular
expressions, then E1E2 is a regular
expression, and L(E1E2) = L(E1)L(E2).
• Induction 3: If E is a RE, then E* is a RE,
and L(E*) = (L(E))*.

Definition (continued)
For regular expressions and1r 2r
     2121 rLrLrrL 
     2121 rLrLrrL 
    ** 11 rLrL 
    11 rLrL 

9
Precedence of Operators
• Parentheses may be used wherever needed
to influence the grouping of operators.
• Order of precedence is * (highest), then
concatenation, then + (lowest).

Example
Regular expression:   *aba 
  *abaL      *aLbaL 
   *aLbaL 
       *aLbLaL 
       *aba 
  ,...,,,, aaaaaaba 
 ,...,,,...,,, baababaaaaaa

Example
Regular expression    bbabar  *
   ,...,,,,, bbbbaabbaabbarL 

Example
Regular expression     bbbaar **
  }0,:{ 22
 mnbbarL mn

Example
Regular expression *)10(00*)10( r
)(rL = { all strings containing substring 00 }

Example
Regular expression )0(*)011( r
)(rL = { all strings without substring 00 }

Equivalent Regular Expressions
Definition:
Regular expressions and
are equivalent if
1r 2r
)()( 21 rLrL 

Example
L = { all strings without substring 00 }
)0(*)011(1 r
)0(*1)0(**)011*1(2  r
LrLrL  )()( 21
1r 2rand
are equivalent
regular expressions

Regular Expressions
and
Regular Languages

Theorem
Languages
Generated by
Regular Expressions
Regular
Languages

Languages
Generated by
Regular Expressions
Regular
Languages

Languages
Generated by
Regular Expressions
Regular
Languages

Proof:

Proof - Part 1
r
)(rL
For any regular expression
the language is regular
Languages
Generated by
Regular Expressions
Regular
Languages

Proof by induction on the size of r

Induction Basis
Primitive Regular Expressions: ,,
Corresponding
NFAs
)()( 1  LML
)(}{)( 2  LML 
)(}{)( 3 aLaML 
regular
languages
a

Inductive Hypothesis
Suppose
that for regular expressions and ,
and are regular languages
1r 2r
)( 1rL )( 2rL

Inductive Step
We will prove:
 
 
 
  1
1
21
21
*
rL
rL
rrL
rrL


Are regular
Languages

By definition of regular expressions:
     
     
    
    11
11
2121
2121
**
rLrL
rLrL
rLrLrrL
rLrLrrL





)( 1rL )( 2rL
By inductive hypothesis we know:
and are regular languages
Regular languages are closed under:
   
   
  *1
21
21
rL
rLrL
rLrL Union
Concatenation
Star
We also know:

Therefore:
     
     
    ** 11
2121
2121
rLrL
rLrLrrL
rLrLrrL



Are regular
languages
)())(( 11 rLrL  is trivially a regular language
(by induction hypothesis)
End of Proof-Part 1

Using the regular closure of operations,
we can construct recursively the NFA
that accepts
M
)()( rLML 
Example: 21 rrr 
)()( 11 rLML 
)()( 22 rLML 
)()( rLML 



For any regular language there is
a regular expression with
Proof - Part 2
Languages
Generated by
Regular Expressions
Regular
Languages

L
r LrL )(
We will convert an NFA that accepts
to a regular expression
L

Since is regular, there is a
NFA that accepts it
L
M
LML )(
Take it with a single accept state

From construct the equivalent
Generalized Transition Graph
in which transition labels are regular expressions
M
Example:
a
ba,
c
M
a
ba 
c
Corresponding
Generalized transition graph

Another Example:
ba 
a
b
b
0q 1q 2q
ba,
a
b
b
0q 1q 2q
b
bTransition labels
are regular
expressions

Reducing the states:
ba 
a
b
b
0q 1q 2q
b
0q 2q
babb*
)(* babb 
Transition labels
are regular
expressions

Resulting Regular Expression:
0q 2q
babb*
)(* babb 
*)(**)*( bbabbabbr 
LMLrL  )()(

In General
Removing a state:
iq q jq
a b
cd
e
iq jq
dae* bce*
dce*
bae*
2-neighbors

iq jq
dae* bce*
dce*
bae*
iq q jq
a b
cd
e
kq
f g
kq
fge*
dge*
fae*
bge*
fce*
This can be generalized
to arbitrary number
of neighbors to q
3-neighbors

0q fq
1r
2r
3r
4r
*)*(* 213421 rrrrrrr 
LMLrL  )()(
The resulting regular expression:
By repeating the process until
two states are left, the resulting graph is
Initial graph Resulting graph
End of Proof-Part 2

Standard Representations
of Regular Languages
Regular Languages
DFAs
NFAs
Regular
Expressions

When we say: We are given
a Regular Language
We mean:
L
Language is in a standard
representation
L
(DFA, NFA, or Regular Expression)

Regular expressions

More Related Content

What's hot

Viewers also liked

Similar to Regular expressions

More from Shiraz316

Recently uploaded

Regular expressions