# Python & Perl: Finite State Automata & Regular Expressions; Perl's Operator m//

Published in: Technology

1. Python & Perl: Finite State Automata & Regular Expressions. Vladimir Kulyukin. www.youtube.com/vkedco www.vkedco.blogspot.com
2. Outline
    - Deterministic Finite Automata
    - Non-Deterministic Finite Automata
    - Regular Languages
    - Regular Expressions
    - Perl's Operator m//
3. Review: Common Algorithms
    - Brute-force algorithm
    - Rabin-Karp algorithm
    - Knuth-Morris-Pratt algorithm
    - Boyer-Moore algorithm
    - Finite State Automata (aka Regular Expressions)
4. Deterministic Finite Automata
5. Languages
    - A language is a set of strings over an alphabet
    - Σ* is the Kleene closure of Σ and denotes the set of all strings over Σ, including the empty string ε
    - Examples:
        - If Σ = {a}, then Σ* = {a}* = {ε, a, aa, aaa, aaaa, …}
        - If Σ = {0, 1}, then Σ* = {0, 1}* is an infinite set that includes ε and all strings of 0's and 1's
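
The Kleene closure is infinite, so a program can only list a finite prefix of it. The following is a minimal Python sketch, not part of the original slides; the function name `kleene_prefix` and the length bound `max_len` are illustrative assumptions.

```python
from itertools import product

def kleene_prefix(sigma, max_len):
    """Return all strings over sigma of length 0..max_len, shortest first.

    Hypothetical helper: Sigma* itself is infinite, so only a prefix is built.
    """
    strings = []
    for n in range(max_len + 1):
        # product(..., repeat=0) yields one empty tuple, which gives epsilon ("")
        for symbols in product(sorted(sigma), repeat=n):
            strings.append("".join(symbols))
    return strings

print(kleene_prefix({"a"}, 4))            # ['', 'a', 'aa', 'aaa', 'aaaa']
print(len(kleene_prefix({"0", "1"}, 3)))  # 1 + 2 + 4 + 8 = 15
```

The empty string `""` plays the role of ε here.
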
6. Deterministic Finite Automata
    - A DFA can be informally defined as a directed graph whose nodes are states and whose edges are transitions on specific symbols
    - A DFA has a unique start state and a (possibly empty) set of final, or accepting, states
    - A DFA processes the input string one symbol at a time. When the last symbol is read, the DFA is in a state that is either final or not. If the state is final, the DFA accepts (recognizes) the string. If the state is not final, the DFA rejects the string
7. DFA: Formal Definition. A DFA M is a 5-tuple, i.e. M = (Q, Σ, δ, q0, F), where
    - Q is a finite set of states;
    - Σ is an alphabet;
    - δ : Q × Σ → Q is the transition function;
    - q0 ∈ Q is the start state;
    - F ⊆ Q is the set of accepting (final) states.
8. Example: DFA M. [State diagram of M: states q0 and q1, transitions labeled a and b.] q0 is the start state. q1 is the final state.
9. Example: DFA M. M = (Q, Σ, δ, q0, F), where
    1. Q = {q0, q1};
    2. Σ = {a, b};
    3. F = {q1};
    4. δ(q0, a) = q1; δ(q0, b) = q0; δ(q1, a) = q1; δ(q1, b) = q0
10. Example: DFA M. The transition function δ can be represented as a table:

    | State | a  | b  |
    |-------|----|----|
    | q0    | q1 | q0 |
    | q1    | q1 | q0 |
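
The 5-tuple and the transition table of DFA M translate almost directly into a dictionary lookup. This is a hedged Python sketch rather than code from the deck; the constant names and the `accepts` function are assumptions made for illustration.

```python
# DFA M = (Q, SIGMA, DELTA, START, FINAL), transcribed from the transition table.
Q = {"q0", "q1"}
SIGMA = {"a", "b"}
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q0",
}
START = "q0"
FINAL = {"q1"}

def accepts(word):
    """Run M on word and report whether it ends in a final state."""
    state = START
    for symbol in word:
        if symbol not in SIGMA:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = DELTA[(state, symbol)]  # exactly one successor: deterministic
    return state in FINAL

print(accepts("baaba"))  # True: the run ends in q1
print(accepts("ab"))     # False: the run ends in q0
```

Because δ is a total function on Q × Σ, every lookup succeeds for symbols in the alphabet.
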
11. How DFA M Works
12. How DFA M Works: q0 is the start state of DFA M; the read head is placed to the left of the first symbol of the input. [Figure: current state q0; input tape b a a b a; transition table of δ.]
13. How DFA M Works: reading symbol b in state q0
14. How DFA M Works: staying in state q0 and moving right
15. How DFA M Works: reading a in state q0
16. How DFA M Works: changing state to q1 and moving right
17. How DFA M Works: reading a in state q1
18. How DFA M Works: staying in state q1 and moving right
19. How DFA M Works: reading b in state q1
20. How DFA M Works: changing to q0 and moving right
21. How DFA M Works: reading a in state q0
22. How DFA M Works: changing to q1 and moving right
23. How DFA M Works: end of input is reached; q1 is a final state; the input is accepted (recognized)
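
The walkthrough above can be replayed as a list of visited states. The sketch below is an illustrative Python version under the assumption that the input on the tape is `baaba`; the name `trace` is hypothetical.

```python
# Transition table of DFA M as a nested dictionary: DELTA[state][symbol].
DELTA = {
    "q0": {"a": "q1", "b": "q0"},
    "q1": {"a": "q1", "b": "q0"},
}

def trace(word, start="q0"):
    """Return the list of states visited, beginning with the start state."""
    states = [start]
    for symbol in word:
        states.append(DELTA[states[-1]][symbol])
    return states

run = trace("baaba")
for symbol, before, after in zip("baaba", run, run[1:]):
    print(f"read {symbol} in {before} -> {after}")
print("accepted" if run[-1] == "q1" else "rejected")  # accepted
```

Each printed line corresponds to one "reading" slide followed by its "moving right" slide.
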
24. String Membership Example. [State diagram of a DFA M with states q0 and q1 and transitions labeled a and b.] Which of the following strings are in L(M)?
    1. b
    2. ε
    3. ab
    4. abba
    5. ababaaaaba
25. Language of DFA M. [State diagram: q0 goes to q1 on a, b; q1 goes to q0 on a, b.] What is the language accepted by this DFA M?
26. Language of DFA M
    - Informal answer: all strings over {a, b} whose length is odd.
    - Formal answer: L(M) = { x | x ∈ {a, b}* and |x| = 2n + 1, n ∈ N }.
27. Two Sample DFAs. For each of the following languages, draw a DFA that accepts it:
    1. { x | x ∈ {a, b}* }
    2. { x ∈ {a, b}* | the number of a's in x is odd }
28. DFA M1. [State diagram: a single state q0 with a self-loop on a, b.]
29. DFA M2. [State diagram: q0 and q1 each have a self-loop on b; a moves from q0 to q1 and from q1 to q0.]
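
One way to gain confidence in a hand-drawn DFA is to compare it with the language's defining predicate on all short strings. The Python sketch below does this for M1 and M2. It assumes q0 is the start state of both machines, that q0 is accepting in M1, and that q1 is accepting in M2; the transcript does not state these, and the helper names `make_dfa` and `agrees` are hypothetical.

```python
from itertools import product

def make_dfa(delta, start, final):
    """Build an accepts(word) function from a transition dictionary."""
    def accepts(word):
        state = start
        for symbol in word:
            state = delta[(state, symbol)]
        return state in final
    return accepts

# M1: one state looping on a and b (q0 assumed accepting).
m1 = make_dfa({("q0", "a"): "q0", ("q0", "b"): "q0"}, "q0", {"q0"})

# M2: b keeps the state, a flips it (q1 assumed accepting).
m2 = make_dfa(
    {("q0", "a"): "q1", ("q0", "b"): "q0",
     ("q1", "a"): "q0", ("q1", "b"): "q1"},
    "q0", {"q1"},
)

def agrees(accepts, predicate, max_len=6):
    """Check the DFA against a predicate on every string up to max_len."""
    return all(
        accepts("".join(w)) == predicate("".join(w))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )

print(agrees(m1, lambda x: True))                    # True
print(agrees(m2, lambda x: x.count("a") % 2 == 1))   # True
```

A bounded check like this is evidence, not a proof, that the machine accepts the intended language.
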
30. Nondeterminism & Nondeterministic Finite State Automata (NFAs)
31. Practical Implication of Nondeterminism
    - A key computational implication of nondeterminism is the necessity of search
    - In a typical scenario, a legal sequence of steps is a subset of some finite set
    - Finding subsets brings us to the concept of the power set
32. Power Sets. Let S be a set. The power set of S is P(S) = { R | R ⊆ S }. Examples:
    - Let S = {a, b}. Then P(S) = { ∅, {a}, {b}, {a, b} }.
    - Let S = {1, 2, 3}. Then P(S) = { ∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3} }.
    - In general, P(S) has 2^n elements, where n is the size of S.
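
The power set can be enumerated by taking all combinations of each size from 0 to n. This is a small Python sketch using the standard `itertools` module; the function name `power_set` is an illustrative choice, not something from the slides.

```python
from itertools import chain, combinations

def power_set(s):
    """Return every subset of s as a list of sets, smallest subsets first."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [set(subset) for subset in subsets]

print(power_set({"a", "b"}))      # [set(), {'a'}, {'b'}, {'a', 'b'}]
print(len(power_set({1, 2, 3})))  # 8, i.e. 2**3
```
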
33. NFA Definition. An NFA M is a 5-tuple M = (Q, Σ, δ, q0, F), where
    - Q is a finite set of states;
    - Σ is an alphabet, i.e. a finite set of symbols;
    - δ : Q × (Σ ∪ {ε}) → P(Q) is the transition function;
    - q0 ∈ Q is the start state;
    - F ⊆ Q is the set of accepting states.
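34. NFA 1. [State diagram: q0 has a self-loop on a, b; q0 goes to q1 on a; q1 goes to q2 on a; q2 has a self-loop on a, b.] Transition table:

    | State | a        | b    |
    |-------|----------|------|
    | q0    | {q0, q1} | {q0} |
    | q1    | {q2}     | {}   |
    | q2    | {q2}     | {q2} |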
35. NFA 2. [State diagram: states q0, q1, q2 with transitions labeled a and a, b, as in NFA 1.] δ*(q0, aa) = {q0, q1, q2}
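
The extended transition function δ* can be computed by tracking the set of states the NFA may be in after each symbol. The sketch below is an illustrative Python version built from the NFA 1 transition table; it assumes no ε-transitions, and the name `delta_star` is hypothetical.

```python
# Transition table of NFA 1: DELTA[state][symbol] is a set of successor states.
DELTA = {
    "q0": {"a": {"q0", "q1"}, "b": {"q0"}},
    "q1": {"a": {"q2"}, "b": set()},
    "q2": {"a": {"q2"}, "b": {"q2"}},
}

def delta_star(states, word):
    """Return the set of states reachable from `states` by reading `word`."""
    current = set(states)
    for symbol in word:
        # Union of the successor sets of every state we might be in.
        current = set().union(*(DELTA[q][symbol] for q in current))
    return current

print(sorted(delta_star({"q0"}, "aa")))  # ['q0', 'q1', 'q2']
```

After the first a the NFA may be in q0 or q1; after the second a, q0 contributes {q0, q1} and q1 contributes {q2}.
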
36. NFA 3. [State diagram: q0 goes to q1 on ε and to q2 on ε; q1 has a self-loop on a; q2 has a self-loop on b.] L(M) = {a}* ∪ {b}*
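
Simulating an NFA with ε-transitions requires an ε-closure step before and after each symbol. The following Python sketch models NFA 3 under the assumption that q1 and q2 are the accepting states, which the transcript does not state explicitly; `EPS`, `eps_closure`, and `accepts` are hypothetical names.

```python
EPS = ""  # stands in for epsilon in the transition dictionary

# Assumed reading of the NFA 3 diagram.
DELTA = {
    ("q0", EPS): {"q1", "q2"},
    ("q1", "a"): {"q1"},
    ("q2", "b"): {"q2"},
}
FINAL = {"q1", "q2"}  # assumption: both branch states accept

def eps_closure(states):
    """Return all states reachable from `states` using only epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(word):
    current = eps_closure({"q0"})
    for symbol in word:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, symbol), set())
        current = eps_closure(moved)
    return bool(current & FINAL)

for w in ["", "aaa", "bb", "ab"]:
    print(repr(w), accepts(w))  # True, True, True, False
```
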
37. NFA vs. DFA
    - NFAs are simpler to write because they generally have fewer states and allow spontaneous (ε) transitions
    - However, they are not more powerful than DFAs, i.e. they accept the same regular languages as DFAs
    - For every NFA, one can construct a DFA that accepts the same language
38. Equivalence of NFAs and DFAs
    - Basic insight: a DFA can keep track of the states that the equivalent NFA may be in after reading each symbol of the input
    - Since the NFA may be in more than one state after reading a symbol, each state of the DFA must correspond to a subset of the NFA's states
    - The construction of an equivalent DFA from an NFA is called subset construction
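
Subset construction can be sketched as a worklist algorithm over sets of NFA states. The Python example below applies it to the NFA 1 transition table and only builds the reachable DFA states; it assumes an NFA without ε-transitions, and the function name `subset_construction` is an illustrative choice.

```python
NFA_DELTA = {
    "q0": {"a": {"q0", "q1"}, "b": {"q0"}},
    "q1": {"a": {"q2"}, "b": set()},
    "q2": {"a": {"q2"}, "b": {"q2"}},
}

def subset_construction(nfa_delta, start, alphabet):
    """Return (dfa_states, dfa_delta); each DFA state is a frozenset of NFA states."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    worklist = [start_set]
    while worklist:
        current = worklist.pop()
        for symbol in alphabet:
            # The DFA successor is the union of all NFA successors.
            target = frozenset(
                r for q in current for r in nfa_delta[q][symbol]
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return seen, dfa_delta

states, delta = subset_construction(NFA_DELTA, "q0", "ab")
print(len(states))  # 4 reachable subsets out of the 2**3 = 8 possible
print(sorted(delta[(frozenset({"q0", "q1"}), "a")]))  # ['q0', 'q1', 'q2']
```

A DFA state would be accepting whenever its subset contains an accepting NFA state; that step is omitted because the transcript does not list the accepting states of NFA 1.
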
39. Regular Languages & Expressions
40. Regular Languages. A language L is regular if and only if there exists a DFA M or an NFA N such that L(M) = L or L(N) = L
41. Regular Expressions
    - Regular expressions are programmatic equivalents of finite state automata
    - Regular expressions are compiled into finite state machines either at run time or at compile time
    - Regular expressions are often referred to as patterns
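
As a small illustration of a pattern standing in for an automaton, the Python sketch below writes the odd-length language over {a, b} as a regular expression and compiles it once with the standard `re` module. The pattern and the names `ODD_LENGTH` and `is_odd_length` are assumptions made for this example, and nothing is claimed here about how `re` represents the compiled pattern internally.

```python
import re

# One symbol followed by any number of symbol pairs: strings of odd length.
ODD_LENGTH = re.compile(r"[ab](?:[ab][ab])*")

def is_odd_length(word):
    """True if word is an odd-length string over {a, b}."""
    return ODD_LENGTH.fullmatch(word) is not None

for w in ["baaba", "ab", "", "abc"]:
    print(repr(w), is_odd_length(w))  # True, False, False, False
```

`fullmatch` requires the whole string to match, which mirrors a DFA reading its input to the end before deciding.
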
42. Perl's Operator m// (source at m_operator_01.pl)
43. Operator m//: Syntax
    - The match operator m is followed by a regular expression, aka a pattern, inside two matching delimiters: m/<regexp>/
    - To find matches for a regular expression in some text txt, write: txt =~ m/regexp/
    - =~ is the binding operator; it binds txt on the left to the regular expression of the match operator
44. Example

        my $txt = "
        The mind which is wrongly guided does even
        greater harm than the harm inflicted by an
        enemy upon his foe. Buddha
        ";
        if ( $txt =~ m/guided/ ) {
          print "guided is found\n";
        }
45. Different Pattern Delimiters

        ## The delimiters // are most common, but other
        ## delimiters are possible: m(regexp), m[regexp],
        ## or m{regexp}. For example:
        if ( $txt =~ m{guided} ) {
          print "guided is found\n";
        }
46. Advantage of Using // Delimiters

        ## A slight advantage of using // to delimit a
        ## regexp is that the m operator can be completely
        ## omitted. Thus, $txt =~ m/guided/ is the same
        ## as $txt =~ /guided/.
        if ( $txt =~ /guided/ ) {
          print "guided is found\n";
        }
47. Default Binding to \$_

        ## In the absence of the binding operator =~, m
        ## works on $_. In the following code segment /ab/
        ## is searched for in $_, which is iteratively bound
        ## to 001, ababab, 10ab01, 10001, and bcdefg.
        foreach ('001', 'ababab', '10ab01', '10001', 'bcdefg') {
          ## if /ab/ is the same as if $_ =~ /ab/
          print "ab is found in $_\n" if /ab/;
        }
48. Perl's m// Operator: Example

        ## Patterns can be bound to variables and properly interpolated. In this example, each
        ## value of $_ is treated as a pattern that is searched for in $txt. The output is
        ## guided is FOUND
        ## harm is FOUND
        ## Buddha is FOUND
        ## rightly is NOT FOUND
        foreach ('guided', 'harm', 'Buddha', 'rightly') {
          if ( $txt =~ /$_/ ) {
            print "$_ is FOUND\n";
          }
          else {
            print "$_ is NOT FOUND\n";
          }
        }
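
For comparison, a roughly equivalent search can be written in Python with the standard `re` module. This is a sketch rather than material from the slides: `re.search` plays the role of `$txt =~ /$_/`, and the function name `report` is hypothetical.

```python
import re

txt = """
The mind which is wrongly guided does even
greater harm than the harm inflicted by an
enemy upon his foe. Buddha
"""

def report(patterns, text):
    """Return one 'FOUND' / 'NOT FOUND' line per pattern."""
    lines = []
    for pattern in patterns:
        # Like Perl's interpolation, each word is treated as a regular expression.
        found = re.search(pattern, text) is not None
        lines.append(f"{pattern} is {'FOUND' if found else 'NOT FOUND'}")
    return lines

for line in report(["guided", "harm", "Buddha", "rightly"], txt):
    print(line)
```

If the words should be matched literally rather than as patterns, `re.escape(pattern)` can be applied before the search.
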
49. References
    - www.python.org
    - http://docs.python.org/2/
    - www.perl.org
    - http://perldoc.perl.org/
50. References
    - Davis, Weyuker, Sigal. Ch. 9. Computability, Complexity, and Languages, 2nd Edition, Academic Press
    - A. Brooks Weber. Ch. 2, 3. Formal Language: A Practical Introduction, Franklin, Beedle & Associates, Inc.