SlideShare a Scribd company logo
1 of 70
Download to read offline
Module 6
String Matching algorithms
Polynomial Time, Classes-P and NP
Strings
• A string is a sequence of characters
• Examples of strings:
– Java program
– HTML document
– DNA sequence
– Digitized image
• An alphabet S is the set of possible characters for a family of strings
• Example of alphabets:
– ASCII, Unicode, {0, 1}, {A, C, G, T}
• Let P be a string of size m
– A prefix of P is a substring formed by taking any no. of leading
symbols of string
– A suffix of P is a substring formed by taking any no. of trailing
symbols of string
String Operations
• S=“AGCTTGA”
• |S|=7, length of S
• Substring: Si,j=SiS i+1…Sj string fragment b/w indices i and j
– Example: S2,4=“GCT”
• Subsequence of S: deleting zero or more characters from S
– “ACT” and “GCTT” are subsequences.
• Let P be a string of size m
– i is any index b/w 0 and m-1
– A prefix of P is a substring of the type P[0 .. i]
– A suffix of P is a substring of the type P[i ..m - 1]
• Prefix of S:
– “AGCT” is a prefix of S.
• Suffix of S:
– “CTTGA” is a suffix of S.
3
shiwani gupta
Examples
4
shiwani gupta
String Matching Problem
• Also called as Text Matching or Pattern Matching
• Given a text string T of length n and a pattern string P of length m, the
exact string matching problem is to find all occurrences of P in T.
• Example: T=“AGCTTGA” P=“GCT”
• Applications:
– Text editors
– Search engines (Google, Openfind)
– Biological search
– Text processing when compiling programs
– Information Retrieval (eg. In dictionaries)
String Matching algorithms
• Naïve / Brute Force Method
• Boyer Moore Method
• Knuth Morris Pratt Method
• Rabin Karp Method
• String Matching with Finite Automata
A Brute-Force Algorithm
Time: O(mn) where m=|P| and n=|T|.
Brute-Force Algorithm
• The brute-force pattern matching
algorithm compares the pattern
P with the text T for each
possible shift of P relative to T,
until either
– a match is found, or
– all placements of the pattern
have been tried
• Brute-force pattern matching
runs in time (n-m+1)m=O(nm)
• Example of worst case:
– T = aaa … ah
– P = aaah
– may occur in images and
DNA sequences
– unlikely in English text
• Used for small P
Algorithm BruteForceMatch(T, P)
Input text T of size n and pattern
P of size m
Output starting index of a
substring of T equal to P or
-1 if no such substring exists
for i  0 to n - m
// test shift i of the pattern
j  0
while j < m and T[i + j] = P[j]
j  j + 1
if j = m
return i //match at i
else
break while loop
// mismatch
return -1 // no match anywhere
The Brute Force Algorithm
Task: refer slide 8
Run Brute Force manually and through algorithm
T=‘raman likes mango’ P=‘mango’
Analysis
Two-phase Algorithms
• Phase 1:Generate an array to indicate the moving direction.
• Phase 2:Make use of the array to move and match the string.
• Boyer-Moore Algorithm:
– Proposed by Boyer-Moore in 1977.
– https://www.javatpoint.com/daa-boyer-moore-algorithm
• KMP algorithm:
– Proposed by Knuth, Morris and Pratt in 1977.
– https://www.javatpoint.com/daa-knuth-morris-pratt-algorithm
11
shiwani gupta
Boyer-Moore Heuristics
• The Boyer-Moore’s pattern matching algorithm is based on two
heuristics
Looking-glass heuristic: Compare P with a substring of T moving
backwards
Character-jump heuristic: When a mismatch occurs at T[i] = c
– If P contains c, shift P to align the last occurrence of c in P with
T[i]
– Else, shift P to align P[0] with T[i + 1]
• Example
1
a p a t t e r n m a t c h i n g a l g o r i t h m
r i t h m
r i t h m
r i t h m
r i t h m
r i t h m
r i t h m
r i t h m
2
3
4
5
6
7
8
9
10
11
Boyer Moore Characteristics
• Works backwards
• How many positions ahead to start next search is based on the value
of character causing mismatch
• With each unsuccessful attempt to find match, information gained is
used to rule out as many positions of text as possible where strings
can’t match.
• It is most efficient and fastest for moderately sized alphabet and long
pattern.
Case 1: Bad character heuristics
Case 2: failure of bad character heuristic
leading to negative shift
Case 3: Bad character heuristics
Last-Occurrence Function
• Boyer-Moore’s algorithm preprocesses the pattern P and the alphabet
S to build the last-occurrence function L mapping S to integers, where
L(c) is defined as
– the largest index i such that P[i] = c or -1 if no such index exists
• Example:
P = abacab
• The last-occurrence function can be represented by an array indexed
by the numeric codes of the characters
• The last-occurrence function can be computed in time O(m + s),
where m is the size of P and s is the size of S
c a b c d
L(c) 4 5 3 -1
COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑ )
1. for each character a ∈ ∑
2. do λ [a] = 0
3. for j ← 1 to m
4. do λ [P [j]] ← j
5. Return λ
Last-Occurrence Function
m - j
i
j l
. . . . . . a . . . . . .
. . . . b a
. . . . b a
j
Case 3: j  1 + l
The Boyer-Moore Algorithm
Algorithm BoyerMooreMatch(T, P, S)
L  lastOccurenceFunction(P, S )
i  m - 1
j  m - 1
repeat
if T[i] = P[j]
if j = 0
return i { match at i }
else
i  i - 1
j  j - 1
else
{ character-jump }
l  L[T[i]]
i  i + m – min(j, 1 + l)
j  m - 1
until i > n - 1
return -1 { no match }
m - (1  l)
i
j
l
. . . . . . a . . . . . .
. a . . b .
. a . . b .
1  l
Case 1: 1 + l  j
Example (Case 1 and 3)
1
a b a c a a b a d c a b a c a b a a b b
2
3
4
5
6
7
8
9
10
12
a b a c a b
a b a c a b
a b a c a b
a b a c a b
a b a c a b
a b a c a b
11
13
i=5, j=5, l=4, i=5+6-min(5,1+4), i=6
i=6, j=5
i=4, j=3, l=4,
i=4+6-min(3,1+4),
i=7
i=7, j=5, l=-1,
i=8+6-min(5,1-1),
i=14
Analysis (Case 2): Good suffix heuristic
• Boyer-Moore’s algorithm runs in
time O(nm + s)
• Example of worst case:
– T = aaa … a
– P = baaa
• The worst case may occur in
images and DNA sequences but is
unlikely in English text
• Boyer-Moore’s algorithm is
significantly faster than the brute-
force algorithm on English text
• Longer the pattern, faster we
move on an average
11
1
a a a a a a a a a
2
3
4
5
6
b a a a a a
b a a a a a
b a a a a a
b a a a a a
7
8
9
10
12
13
14
15
16
17
18
19
20
21
22
23
24
Task
22
shiwani gupta
How many character comparisons will Boyer Moore algorithm
make to search for each of the following patterns in binary text?
Text: repeat “01110” 20 times
Pattern: (a) 01111 (b) 01110
KMP-Algorithm
• In Brute Force and Boyer Moore, we compare pattern characters that
don’t match in text.
• On occurrence of mismatch, throw away the info, restart comparison
for another set of characters from text.
• Thus again and again with next incremental position of text, the
characters are matched.
• In KMP when a mismatch occurs, word itself embodies info to
determine where next match would begin, thus bypassing re-
examination of previously matched characters.
• First linear time string matching algorithm.
Demo: https://people.ok.ubc.ca/ylucet/DS/KnuthMorrisPratt.html
The KMP Algorithm - Motivation
• Knuth-Morris-Pratt’s
algorithm compares the pattern
to the text in left-to-right, but
shifts the pattern more
intelligently than the brute-
force algorithm.
• It bypasses re-examination of
previously matched characters
• When a mismatch occurs, what
is the most we can shift the
pattern so as to avoid
redundant comparisons?
• Answer: the largest prefix of
P[0..j] that is a suffix of P[1..j]
x
j
. . a b a a b . . . . .
a b a a b a
a b a a b a
No need to
repeat these
comparisons
Resume
comparing
here
KMP Failure Function
• Knuth-Morris-Pratt’s algorithm
preprocesses the pattern to find
matches of prefixes of the
pattern with the pattern itself
• The failure function shows how
much of the beginning of string
matches to portion immediately
preceding failed comparison
• Prefix table gives you table of
skips per prefix ahead of time
• Knuth-Morris-Pratt’s algorithm
modifies the brute-force
algorithm so that if a mismatch
occurs at P[j]  T[i] we set j 
F(j - 1)
j 0 1 2 3 4 5
P[j] a b a a b a
F(j) 0 0 1 1 2 3
x
j
. . a b a a b . . . . .
a b a a b a
F(j-1)
a b a a b a
Task : Apply failure function code on above pattern
Computing the Failure Function
• Failure function encodes repeated
substrings inside the pattern itself
• The failure function can be
represented by an array and can be
computed in O(m) time
• The construction is similar to the
KMP algorithm itself
• At each iteration of the while-
loop, either
– i increases by one, or
– the shift amount i - j increases
by at least one (observe that F(j
- 1) < j)
• Hence, there are no more than 2m
iterations of the while-loop
Algorithm failureFunction(P)
F[0]  0
i  1
j  0
while i < m
if P[i] = P[j]
{we have matched j + 1
chars}
F[i]  j + 1
i  i + 1
j  j + 1
else if j > 0 then
{use failure function to
shift P}
j  F[j - 1]
else
F[i]  0 { no match }
i  i + 1
j 0 1 2 3 4 5
P[j] a b a c a b
Failure Function
j 0 1 2 3 4 5
P[j] a b a a b a
F(j) 0 0 1 1 2 3
F(0)=0
P[1] ≠ P[0]
j ≯0
F[1]=0
i=i+1=2
P[2]==P[0]
F[2]=1
P[3] ≠ P[1]
j>0
j=F[0]=0
P[3]==P[0]
F[3]=j+1=1
i=4, j=1
P[4]==P[1]
F[4]=1+1=2
i=5,j=2
P[5]==P[2]
F[5]=2+1=3
i=6,j=3 loop stops
The KMP Algorithm
• The failure function can be
represented by an array and can be
computed in O(m) time.
Independent of S
• At each iteration of the while-loop,
either
– i increases by one, or
– the shift amount i - j increases
by at least one (observe that F(j
- 1) < j)
• Hence, there are no more than 2n
iterations of the while-loop
• Thus, KMP’s algorithm runs in
optimal time O(m + n) under worst
case. It requires O(m) extra space.
Algorithm KMPMatch(T, P)
F  failureFunction(P)
i  0
j  0
while i < n
if T[i] = P[j]
if j = m - 1
return i - j { match }
else
i  i + 1
j  j + 1
else
if j > 0
j  F[j - 1]
else
i  i + 1
return -1 { no match }
Example
1
a b a c a a b a c a b a c a b a a b b
7
8
19
18
17
15
a b a c a b
16
14
13
2 3 4 5 6
9
a b a c a b
a b a c a b
a b a c a b
a b a c a b
10 11 12
c
j 0 1 2 3 4 5
P[j] a b a c a b
F(j) 0 0 1 0 1 2
j=F(j-1)=1, i remains same
j=F(j-1)=0, i remains same
j=0, i=i+1
j=F(j-1)=1, i remains same
An Example for KMP Algorithm
Phase 1: apply slide 27
Phase 2: apply slide 29
f(4–1)+1= f(3)+1=0+1=1
matched
shiwani gupta 31
Time Complexity of KMP Algorithm
• Time complexity: O(m+n) (analysis omitted)
– O(m) for computing function f
– O(n) for searching P
shiwani gupta 32
Definition of Rabin-Karp
A string search algorithm which compares a
string's hash values, rather than the strings
themselves.
For efficiency, the hash value of the next
position in the text is easily computed from the
hash value of the current position.
https://www.youtube.com/watch?v=qQ8vS2btsxI
Rabin-Karp
• The Rabin-Karp string searching algorithm calculates a hash value
for the pattern, and for each M-character subsequence of text to be
compared.
• If the hash values are unequal, the algorithm will calculate the hash
value for next M-character sequence.
• If the hash values are equal, the algorithm will do a Brute Force
comparison between the pattern and the M-character sequence.
• In this way, there is only one comparison per text subsequence, and
Brute Force is only needed when hash values match.
• Perhaps an example will clarify some things...
Rabin-Karp Math Example
• Let’s say that our alphabet consists of 10 letters.
• our alphabet = a, b, c, d, e, f, g, h, i, j
• Let’s say that “a” corresponds to 1, “b” corresponds to 2 and so
on.
The hash value for string “cah” would be ...
3*100 + 1*10 + 8*1 = 318
Rabin-Karp Example
• Hash value of “AAAAA” is 37
• Hash value of “AAAAH” is 100
How Rabin-Karp works
• Let characters in both arrays T and P be digits in radix-S
notation. S = (0,1,...,9)
• Let p be the value of the characters in P
• Choose a prime number q.
• Compute (p mod q)
– The value of p mod q is what we will be using to find all matches
of the pattern P in T.
• To compute p, we use Horner’s rule
• p = P[m] + 10(P[m-1] + 10(P[m-2] + … + 10(P[2] +
10(P[1]) …))
• which we can compute in time O(m).
How Rabin-Karp works
(continued)
Compute (T[s+1, .., s+m] mod q) for s = 0 .. n-m
Test against P only those sequences in T having the
same (mod q) value
(T[s+1, .., s+m] mod q) can be incrementally
computed by subtracting the high-order digit,
shifting, adding the low-order bit, all in modulo q
arithmetic.
A Rabin-Karp example
 Given T = 31415926535 and P = 26
 We choose q = 11
 P mod q = 26 mod 11 = 4
1
3 1
4 9
5 6
2 3
5 5
1
3 1
4 9
5 6
2 3
5 5
14 mod 11 = 3 not equal to 4
31 mod 11 = 9 not equal to 4
1
3 1
4 9
5 6
2 3
5 5
41 mod 11 = 8 not equal to 4
Rabin-Karp example continued
1
3 1
4 9
5 6
2 3
5 5
15 mod 11 = 4 equal to 4 -> spurious hit
1
3 1
4 9
5 6
2 3
5 5
59 mod 11 = 4 equal to 4 -> spurious hit
1
3 1
4 9
5 6
2 3
5 5
92 mod 11 = 4 equal to 4 -> spurious hit
1
3 1
4 9
5 6
2 3
5 5
26 mod 11 = 4 equal to 4 -> an exact match!!
1
3 1
4 9
5 6
2 3
5 5
65 mod 11 = 10 not equal to 4
Rabin-Karp example continued
As we can see, when a match is found, further testing is
done to ensure that a match has indeed been found.
1
3 1
4 9
5 6
2 3
5 5
53 mod 11 = 9 not equal to 4
1
3 1
4 9
5 6
2 3
5 5
35 mod 11 = 2 not equal to 4
Rabin-Karp Algorithm
pattern is M characters long
hash_p=hash value of pattern
hash_t=hash value of first M letters in body of text
do
if (hash_p == hash_t)
brute force comparison of pattern and selected section of text
hash_t = hash value of next section of text, one character over
while (end of text or brute force comparison == true)
Rabin-Karp Math
• Consider an M-character sequence as an M-digit number in base b,
where b is the number of letters in the alphabet. The text subsequence
t[i .. i+M-1] is mapped to the number
• Furthermore, given x(i) we can compute x(i+1) for the next
subsequence t[i+1 .. i+M] in constant time, as follows:
• In this way, we never explicitly compute a new value. We simply
adjust the existing value as we move over one character.
Rabin-Karp Mods
• If M is large, then the resulting value (bM) will be enormous. For this
reason, we hash the value by taking it mod a prime number q.
• The mod function (% in Java) is particularly useful in this case due to
several of its inherent properties:
[(x mod q) + (y mod q)] mod q = (x+y) mod q
(x mod q) mod q = x mod q
• For these reasons:
h(i)=((t[i] bM-1 mod q) +(t[i+1] bM-2 mod q) + ... +(t[i+M-1] mod q))mod q
h(i+1) =( h(i)  b mod q
Shift left one digit
-t[i]  bM mod q
Subtract leftmost digit
+t[i+M] mod q )
Add new rightmost digit
mod q
2. The Rabin-Karp Algorithm
Special Case
• Given a text T[1 .. n] of length n, a pattern P[1 .. m] of
length m ≤ n, both as arrays.
• Assume that elements of P and T are characters drawn
from a finite set of alphabets Σ.
• Where Σ = {0, 1, 2, . . . , 9}, so that each character is a
decimal digit.
• Now our objective is “finding all valid shifts with which
a given pattern P occurs in a text T”.
Notations: The Rabin-Karp Algorithm
Let us suppose that
• p denotes decimal value of a given pattern P[1 .. m]
• ts = decimal value of length m substring T[s + 1 .. s + m],
of given text T [1 .. n], for s = 0, 1, ..., n - m.
• It is very obvious that, ts = p if and only if
T [s + 1 .. s + m] = P[1 .. m];
thus, s is a valid shift if and only if ts = p.
• Now the question is how to compute p and ts efficiently
• Answer is Horner’s rule
Horner’s Rule
Example: Horner’s rule
[3, 4, 5] = 5 + 10(4 + 10(3)) = 5 + 10(4 + 30) = 5+340 = 345
p = P[3] + 10 (P[3 - 1] + 10(P[1])).
Formula
• We can compute p in time Θ(m) using this rule as
p = P[m] + 10 (P[m-1] + 10(P[m-2] + … + 10(P[2] + 10P[1]) ))
• Similarly t0 can be computed from T [1 .. m] in time Θ(m).
• To compute t1, t2, . . . , tn-m in time Θ(n - m), it suffices to
observe that ts+1 can be computed from ts in constant time.
Computing ts+1 from ts in constant time
• Text = [3, 1, 4, 1, 5, 2]; t0 = 31415
• m = 5; Shift = 0
• Old higher-order digit = 3
• New low-order digit = 2
• t1 = 10.(31415 – 104.T(1)) + T(5+1)
= 10.(31415 – 104.3) + 2
= 10(1415) + 2 = 14152
• ts+1 = 10(ts – T[s + 1] 10m-1 ) + T[s + m + 1])
• t1 = 10(t0 – T[1] 104) + T[0 + 5 + 1])
• Now t1, t2, . . . , tn-m can be computed in Θ(n - m)
3 1 4 1 5 2
Procedure: Computing ts+1 from ts
1. Subtract T[s + 1]10m-1 from ts, removes high-order digit
2. Multiply result by 10, shifts the number left one position
3. Add T [s + m + 1], it brings appropriate low-order digit.
ts+1 = (10(ts – T[s + 1] 10m-1 ) + T[s + m + 1])
Another issue and its treatment
• The only difficulty with the above procedure is that p and
ts may be too large to work with conveniently.
• Fortunately, there is a simple cure for this problem,
compute p and the ts modulo a suitable modulus q.
2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
7
mod 13
A window of length 5 is shaded.
The numerical value of window = 31415
31415 mod 13 = 7
Computing ts+1 from ts Modulo q = 13
Spurious Hits and their Elimination
• m = 5.
• p = 31415,
• Now, 31415 ≡ 7 (mod 13)
• Now, 67399 ≡ 7 (mod 13)
• Window beginning at position 7 = valid match; s = 6
• Window beginning at position 13 = spurious hit; s = 12
• After comparing decimal values, text comparison is
needed
0
2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
mod 13
8 9 3 11 1 7 8 4 5 10 11 7 9 11
…
…
…
Spurious hit
Valid match
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
2. The Rabin-Karp Algorithm
RABIN-KARP-MATCHER(T, P, d, q)
1 n ← length[T]
2 m ← length[P]
3 h ← dm-1 mod q
4 p ← 0
5 t0 ← 0
6 for i ← 1 to m Preprocessing.
7 do p ← (dp + P[i]) mod q
8 t0 ← (dt0 + T[i]) mod q
9 for s ← 0 to n - m Matching.
10 do if p = ts
11 then if P[1 .. m] = T [s + 1 .. s + m]
12 then print "Pattern occurs with shift" s
13 if s < n - m
14 then ts+1 ← (d(ts - T[s + 1]h) + T[s + m + 1]) mod q
Applications
Bioinformatics
– Used in looking for similarities of two or more proteins;
i.e. high sequence similarity usually implies significant
structural or functional similarity.
Alpha hemoglobin and beta hemoglobin are subunits that make up a
protein called hemoglobin in red blood cells. Notice the similarities
between the two sequences, which probably signify functional
similarity.
TASK
Input: S = "batmanandrobinarebat", pat = "bat"
Input: S = "abesdu", pat = "edu"
String Matching
Whenever you use a search engine, or a “find”
function, you are utilizing a string matching
program. Many of these programs create finite
automata in order to effectively search for your
string.
https://www.youtube.com/watch?v=JF48ymcpEzY
Finite Automata
A finite automaton is a quintuple (Q, , , s, F):
• Q: the finite set of states
• : the finite input alphabet
• : the “transition function” from Qx to Q
• s Q: the start state
• F  Q: the set of final (accepting) states
How it works
A finite automaton accepts strings in a specific
language. It begins in state q0 and reads characters
one at a time from the input string. It makes
transitions () based on these characters, and (if)
when it reaches the end of the tape and is in one of
the accepting states, that string is accepted by the
language.
Finite-Automaton-Matcher
The simple loop structure
implies a run time for a
string of length n is O(n).
However: this is only the
run time for the actual
string matching. It does
not include the time it
takes to compute the
transition function.
3. String Matching with Finite Automata
• A finite automaton M is a 5-tuple (Q, q0, A, Σ, δ), where
– Q is a finite set of states,
– q0 ∈ Q is the start state,
– A ∈ Q is a distinguished set of accepting states,
– Σ is a finite input alphabet,
– δ is a function from Q × Σ into Q, called the transition
function of M.
• String-matching automata are very efficient because it
examines each character exactly once, taking constant time.
• The matching time used after preprocessing the pattern to
build the automaton is therefore Θ(n).
input
state a b
0 1 0
1 0 0
0 1
a
a
b
b
State set Q = {0, 1}
Start state q0 = 0
Input alphabet Ʃ = {a, b}
A tabular representation of
transition function δ
• Q = {0, 1}, Ʃ = {a, b} and transition function is shown below
• A simple two-state finite automaton which accepts those
strings that end in an odd number of a’s.
Example : Transition Table and Finite Automata
input
state a b c P
0 1 0 0 a
1 1 2 0 b
2 3 0 0 a
3 1 4 0 b
4 5 0 0 a
5 1 4 6 c
6 7 0 0 a
7 1 2 0
• Pattern string P = ababaca.
• Edge towards right shows matching
• Edge towards left is for failure
• No edge for some state and some
alphabet means that edge hits initial state
0 1 2 3 4 5 6 7
a b a b a c a
a a
a a
b
b
String Matching Automata for given Pattern
String Matching using Finite Automata
i -- 1 2 3 4 5 6 7 8 9 10 11
T[i] -- a b a b a b a c a b a
state φ(Ti) 0 1 2 3 4 5 4 5 6 7 2 3
Finite Automata for Pattern P = ababaca
Text T = abababacaba.
0 1 2 3 4 5 6 7
a b a b a c a
a a
a a
b
b
3. String Matching with finite Automata
FINITE-AUTOMATON-MATCHER(T, δ, m)
1 n ← length[T]
2 q ← 0
3 for i ← 1 to n
4 do q ← δ(q, T[i])
5 if q = m
6 then print "Pattern occurs with shift" i - m
• Matching time on a text string of length n is Θ(n).
• Memory Usage: O(m|Σ|),
• Preprocessing Time: Best case: O(m|Σ|).
TASK
• P=AABA, T=AABAACAADAABAAABAA
• Suppose a finite automaton which accepts even
number of a's where ∑ = {a, b, c}
Suppose w is a string such as w=bcaabcaaabac
66
General Problems, Input Size and
Time Complexity
• Time complexity of algorithms :
polynomial time algorithm ("efficient algorithm") v.s.
exponential time algorithm ("inefficient algorithm")
f(n)  n 10 30 50
n 0.00001 sec 0.00003 sec 0.00005 sec
n5 0.1 sec 24.3 sec 5.2 mins
2n 0.001 sec 17.9 mins 35.7 yrs
67
“Hard” and “easy’ Problems
• Sometimes the dividing line between “easy” and “hard”
problems is a fine one. For example
– Find the shortest path in a graph from X to Y. (easy)
– Find the longest path in a graph from X to Y. (with no
cycles) (hard)
• View another way – as “yes/no” problems
– Is there a simple path from X to Y with weight <= M? (easy)
– Is there a simple path from X to Y with weight >= M? (hard)
– First problem can be solved in polynomial time.
– All known algorithms for the second problem (could) take
exponential time.
68
The Classes of P and NP
• The class P and Deterministic Turing Machine
• Given a decision problem X, if there is a
polynomial time Deterministic Turing Machine
program that solves X, then X is belong to P
• Informally, there is a polynomial time algorithm
to solve the problem
69
• One of the most important open problem in
theoretical compute science :
Is P=NP ?
Most likely “No”.
Currently, there are many known (decision)
problems in NP, and there is no solution to
show anyone of them in P.
70
• P: (Decision) problems solvable by deterministic
algorithms in polynomial time
• NP: (Decision) problems solved by non-deterministic
algorithms in polynomial time
• A group of (decision) problems,
(Satisfiability, 0/1 Knapsack,
Longest Path, Partition) have an
additional important property:
If any of them can be solved in
polynomial time, then they all can!
NP
P
NP-Complete
• These problems are called NP-complete problems.

More Related Content

What's hot

Lecture optimal binary search tree
Lecture optimal binary search tree Lecture optimal binary search tree
Lecture optimal binary search tree Divya Ks
 
Job sequencing with Deadlines
Job sequencing with DeadlinesJob sequencing with Deadlines
Job sequencing with DeadlinesYashiUpadhyay3
 
Adversarial search with Game Playing
Adversarial search with Game PlayingAdversarial search with Game Playing
Adversarial search with Game PlayingAman Patel
 
Bellman ford algorithm
Bellman ford algorithmBellman ford algorithm
Bellman ford algorithmRuchika Sinha
 
String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)Neel Shah
 
module5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdfmodule5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdfShiwani Gupta
 
Branch and bounding : Data structures
Branch and bounding : Data structuresBranch and bounding : Data structures
Branch and bounding : Data structuresKàŕtheek Jåvvàjí
 
Dynamic Programming - Matrix Chain Multiplication
Dynamic Programming - Matrix Chain MultiplicationDynamic Programming - Matrix Chain Multiplication
Dynamic Programming - Matrix Chain MultiplicationPecha Inc.
 
RABIN KARP ALGORITHM STRING MATCHING
RABIN KARP ALGORITHM STRING MATCHINGRABIN KARP ALGORITHM STRING MATCHING
RABIN KARP ALGORITHM STRING MATCHINGAbhishek Singh
 
Dijkstra s algorithm
Dijkstra s algorithmDijkstra s algorithm
Dijkstra s algorithmmansab MIRZA
 

What's hot (20)

Kmp
KmpKmp
Kmp
 
KMP String Matching Algorithm
KMP String Matching AlgorithmKMP String Matching Algorithm
KMP String Matching Algorithm
 
Lecture optimal binary search tree
Lecture optimal binary search tree Lecture optimal binary search tree
Lecture optimal binary search tree
 
Job sequencing with Deadlines
Job sequencing with DeadlinesJob sequencing with Deadlines
Job sequencing with Deadlines
 
Adversarial search with Game Playing
Adversarial search with Game PlayingAdversarial search with Game Playing
Adversarial search with Game Playing
 
Bellman ford algorithm
Bellman ford algorithmBellman ford algorithm
Bellman ford algorithm
 
String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)
 
String matching algorithm
String matching algorithmString matching algorithm
String matching algorithm
 
module5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdfmodule5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdf
 
SINGLE-SOURCE SHORTEST PATHS
SINGLE-SOURCE SHORTEST PATHS SINGLE-SOURCE SHORTEST PATHS
SINGLE-SOURCE SHORTEST PATHS
 
String operation
String operationString operation
String operation
 
Branch and bounding : Data structures
Branch and bounding : Data structuresBranch and bounding : Data structures
Branch and bounding : Data structures
 
Dynamic Programming - Matrix Chain Multiplication
Dynamic Programming - Matrix Chain MultiplicationDynamic Programming - Matrix Chain Multiplication
Dynamic Programming - Matrix Chain Multiplication
 
Backtracking
BacktrackingBacktracking
Backtracking
 
RABIN KARP ALGORITHM STRING MATCHING
RABIN KARP ALGORITHM STRING MATCHINGRABIN KARP ALGORITHM STRING MATCHING
RABIN KARP ALGORITHM STRING MATCHING
 
Minimum spanning tree
Minimum spanning treeMinimum spanning tree
Minimum spanning tree
 
Dijkstra s algorithm
Dijkstra s algorithmDijkstra s algorithm
Dijkstra s algorithm
 
5.1 greedy
5.1 greedy5.1 greedy
5.1 greedy
 
8 queens problem.pptx
8 queens problem.pptx8 queens problem.pptx
8 queens problem.pptx
 
Greedy Algorithms
Greedy AlgorithmsGreedy Algorithms
Greedy Algorithms
 

Similar to module6_stringmatchingalgorithm_2022.pdf

Chpt9 patternmatching
Chpt9 patternmatchingChpt9 patternmatching
Chpt9 patternmatchingdbhanumahesh
 
Pattern matching
Pattern matchingPattern matching
Pattern matchingshravs_188
 
Knuth morris pratt string matching algo
Knuth morris pratt string matching algoKnuth morris pratt string matching algo
Knuth morris pratt string matching algosabiya sabiya
 
PatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
PatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnPatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
PatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnRAtna29
 
StringMatching-Rabikarp algorithmddd.pdf
StringMatching-Rabikarp algorithmddd.pdfStringMatching-Rabikarp algorithmddd.pdf
StringMatching-Rabikarp algorithmddd.pdfbhagabatijenadukura
 
String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)Aditya pratap Singh
 
brown.ppt for identifying rabin karp algo
brown.ppt for identifying rabin karp algobrown.ppt for identifying rabin karp algo
brown.ppt for identifying rabin karp algoSadiaSharmin40
 
Rabin-Karp (2).ppt
Rabin-Karp (2).pptRabin-Karp (2).ppt
Rabin-Karp (2).pptUmeshThoriya
 
Gp 27[string matching].pptx
Gp 27[string matching].pptxGp 27[string matching].pptx
Gp 27[string matching].pptxSumitYadav641839
 
Boyer-Moore-algorithm-Vladimir.pptx
Boyer-Moore-algorithm-Vladimir.pptxBoyer-Moore-algorithm-Vladimir.pptx
Boyer-Moore-algorithm-Vladimir.pptxssuserf56658
 
Knutt Morris Pratt Algorithm by Dr. Rose.ppt
Knutt Morris Pratt Algorithm by Dr. Rose.pptKnutt Morris Pratt Algorithm by Dr. Rose.ppt
Knutt Morris Pratt Algorithm by Dr. Rose.pptsaki931
 
String matching algorithms-pattern matching.
String matching algorithms-pattern matching.String matching algorithms-pattern matching.
String matching algorithms-pattern matching.Swapan Shakhari
 

Similar to module6_stringmatchingalgorithm_2022.pdf (20)

Boyer more algorithm
Boyer more algorithmBoyer more algorithm
Boyer more algorithm
 
Chpt9 patternmatching
Chpt9 patternmatchingChpt9 patternmatching
Chpt9 patternmatching
 
Boyer more algorithm
Boyer more algorithmBoyer more algorithm
Boyer more algorithm
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
 
Pattern matching
Pattern matchingPattern matching
Pattern matching
 
Knuth morris pratt string matching algo
Knuth morris pratt string matching algoKnuth morris pratt string matching algo
Knuth morris pratt string matching algo
 
PatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
PatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnPatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
PatternMatching2.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
 
StringMatching-Rabikarp algorithmddd.pdf
StringMatching-Rabikarp algorithmddd.pdfStringMatching-Rabikarp algorithmddd.pdf
StringMatching-Rabikarp algorithmddd.pdf
 
IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION ALGORITHM
IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION  ALGORITHM  IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION  ALGORITHM
IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION ALGORITHM
 
Lec17
Lec17Lec17
Lec17
 
String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)
 
lec17.ppt
lec17.pptlec17.ppt
lec17.ppt
 
brown.ppt for identifying rabin karp algo
brown.ppt for identifying rabin karp algobrown.ppt for identifying rabin karp algo
brown.ppt for identifying rabin karp algo
 
Rabin-Karp (2).ppt
Rabin-Karp (2).pptRabin-Karp (2).ppt
Rabin-Karp (2).ppt
 
String matching, naive,
String matching, naive,String matching, naive,
String matching, naive,
 
Gp 27[string matching].pptx
Gp 27[string matching].pptxGp 27[string matching].pptx
Gp 27[string matching].pptx
 
Daa unit 5
Daa unit 5Daa unit 5
Daa unit 5
 
Boyer-Moore-algorithm-Vladimir.pptx
Boyer-Moore-algorithm-Vladimir.pptxBoyer-Moore-algorithm-Vladimir.pptx
Boyer-Moore-algorithm-Vladimir.pptx
 
Knutt Morris Pratt Algorithm by Dr. Rose.ppt
Knutt Morris Pratt Algorithm by Dr. Rose.pptKnutt Morris Pratt Algorithm by Dr. Rose.ppt
Knutt Morris Pratt Algorithm by Dr. Rose.ppt
 
String matching algorithms-pattern matching.
String matching algorithms-pattern matching.String matching algorithms-pattern matching.
String matching algorithms-pattern matching.
 

More from Shiwani Gupta

module4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfmodule4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfShiwani Gupta
 
module3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdfmodule3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdfShiwani Gupta
 
module2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdfmodule2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdfShiwani Gupta
 
module1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdfmodule1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdfShiwani Gupta
 
ML MODULE 1_slideshare.pdf
ML MODULE 1_slideshare.pdfML MODULE 1_slideshare.pdf
ML MODULE 1_slideshare.pdfShiwani Gupta
 
Functionsandpigeonholeprinciple
FunctionsandpigeonholeprincipleFunctionsandpigeonholeprinciple
FunctionsandpigeonholeprincipleShiwani Gupta
 
Uncertain knowledge and reasoning
Uncertain knowledge and reasoningUncertain knowledge and reasoning
Uncertain knowledge and reasoningShiwani Gupta
 
Knowledge based agent
Knowledge based agentKnowledge based agent
Knowledge based agentShiwani Gupta
 

More from Shiwani Gupta (20)

ML MODULE 6.pdf
ML MODULE 6.pdfML MODULE 6.pdf
ML MODULE 6.pdf
 
ML MODULE 5.pdf
ML MODULE 5.pdfML MODULE 5.pdf
ML MODULE 5.pdf
 
ML MODULE 4.pdf
ML MODULE 4.pdfML MODULE 4.pdf
ML MODULE 4.pdf
 
module4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfmodule4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdf
 
module3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdfmodule3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdf
 
module2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdfmodule2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdf
 
module1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdfmodule1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdf
 
ML MODULE 1_slideshare.pdf
ML MODULE 1_slideshare.pdfML MODULE 1_slideshare.pdf
ML MODULE 1_slideshare.pdf
 
ML MODULE 2.pdf
ML MODULE 2.pdfML MODULE 2.pdf
ML MODULE 2.pdf
 
ML Module 3.pdf
ML Module 3.pdfML Module 3.pdf
ML Module 3.pdf
 
Problem formulation
Problem formulationProblem formulation
Problem formulation
 
Simplex method
Simplex methodSimplex method
Simplex method
 
Functionsandpigeonholeprinciple
FunctionsandpigeonholeprincipleFunctionsandpigeonholeprinciple
Functionsandpigeonholeprinciple
 
Relations
RelationsRelations
Relations
 
Logic
LogicLogic
Logic
 
Set theory
Set theorySet theory
Set theory
 
Uncertain knowledge and reasoning
Uncertain knowledge and reasoningUncertain knowledge and reasoning
Uncertain knowledge and reasoning
 
Introduction to ai
Introduction to aiIntroduction to ai
Introduction to ai
 
Planning Agent
Planning AgentPlanning Agent
Planning Agent
 
Knowledge based agent
Knowledge based agentKnowledge based agent
Knowledge based agent
 

Recently uploaded

Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 

module6_stringmatchingalgorithm_2022.pdf

  • 1. Module 6 String Matching algorithms Polynomial Time, Classes-P and NP
  • 2. Strings • A string is a sequence of characters • Examples of strings: – Java program – HTML document – DNA sequence – Digitized image • An alphabet S is the set of possible characters for a family of strings • Example of alphabets: – ASCII, Unicode, {0, 1}, {A, C, G, T} • Let P be a string of size m – A prefix of P is a substring formed by taking any no. of leading symbols of string – A suffix of P is a substring formed by taking any no. of trailing symbols of string
  • 3. String Operations • S=“AGCTTGA” • |S|=7, length of S • Substring: Si,j=SiS i+1…Sj string fragment b/w indices i and j – Example: S2,4=“GCT” • Subsequence of S: deleting zero or more characters from S – “ACT” and “GCTT” are subsequences. • Let P be a string of size m – i is any index b/w 0 and m-1 – A prefix of P is a substring of the type P[0 .. i] – A suffix of P is a substring of the type P[i ..m - 1] • Prefix of S: – “AGCT” is a prefix of S. • Suffix of S: – “CTTGA” is a suffix of S. 3 shiwani gupta
  • 5. String Matching Problem • Also called as Text Matching or Pattern Matching • Given a text string T of length n and a pattern string P of length m, the exact string matching problem is to find all occurrences of P in T. • Example: T=“AGCTTGA” P=“GCT” • Applications: – Text editors – Search engines (Google, Openfind) – Biological search – Text processing when compiling programs – Information Retrieval (eg. In dictionaries)
  • 6. String Matching algorithms • Naïve / Brute Force Method • Boyer Moore Method • Knuth Morris Pratt Method • Rabin Karp Method • String Matching with Finite Automata
  • 7. A Brute-Force Algorithm Time: O(mn) where m=|P| and n=|T|.
  • 8. Brute-Force Algorithm • The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either – a match is found, or – all placements of the pattern have been tried • Brute-force pattern matching runs in time (n-m+1)m=O(nm) • Example of worst case: – T = aaa … ah – P = aaah – may occur in images and DNA sequences – unlikely in English text • Used for small P Algorithm BruteForceMatch(T, P) Input text T of size n and pattern P of size m Output starting index of a substring of T equal to P or -1 if no such substring exists for i  0 to n - m // test shift i of the pattern j  0 while j < m and T[i + j] = P[j] j  j + 1 if j = m return i //match at i else break while loop // mismatch return -1 // no match anywhere
  • 9. The Brute Force Algorithm Task: refer slide 8 Run Brute Force manually and through algorithm T=‘raman likes mango’ P=‘mango’
  • 11. Two-phase Algorithms • Phase 1:Generate an array to indicate the moving direction. • Phase 2:Make use of the array to move and match the string. • Boyer-Moore Algorithm: – Proposed by Boyer-Moore in 1977. – https://www.javatpoint.com/daa-boyer-moore-algorithm • KMP algorithm: – Proposed by Knuth, Morris and Pratt in 1977. – https://www.javatpoint.com/daa-knuth-morris-pratt-algorithm 11 shiwani gupta
  • 12. Boyer-Moore Heuristics • The Boyer-Moore’s pattern matching algorithm is based on two heuristics Looking-glass heuristic: Compare P with a substring of T moving backwards Character-jump heuristic: When a mismatch occurs at T[i] = c – If P contains c, shift P to align the last occurrence of c in P with T[i] – Else, shift P to align P[0] with T[i + 1] • Example 1 a p a t t e r n m a t c h i n g a l g o r i t h m r i t h m r i t h m r i t h m r i t h m r i t h m r i t h m r i t h m 2 3 4 5 6 7 8 9 10 11
  • 13. Boyer Moore Characteristics • Works backwards • How many positions ahead to start next search is based on the value of character causing mismatch • With each unsuccessful attempt to find match, information gained is used to rule out as many positions of text as possible where strings can’t match. • It is most efficient and fastest for moderately sized alphabet and long pattern.
  • 14. Case 1: Bad character heuristics
  • 15. Case 2: failure of bad character heuristic leading to negative shift
  • 16. Case 3: Bad character heuristics
  • 17. Last-Occurrence Function • Boyer-Moore’s algorithm preprocesses the pattern P and the alphabet S to build the last-occurrence function L mapping S to integers, where L(c) is defined as – the largest index i such that P[i] = c or -1 if no such index exists • Example: P = abacab • The last-occurrence function can be represented by an array indexed by the numeric codes of the characters • The last-occurrence function can be computed in time O(m + s), where m is the size of P and s is the size of S c a b c d L(c) 4 5 3 -1
  • 18. COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑ ) 1. for each character a ∈ ∑ 2. do λ [a] = 0 3. for j ← 1 to m 4. do λ [P [j]] ← j 5. Return λ Last-Occurrence Function
  • 19. m - j i j l . . . . . . a . . . . . . . . . . b a . . . . b a j Case 3: j  1 + l The Boyer-Moore Algorithm Algorithm BoyerMooreMatch(T, P, S) L  lastOccurenceFunction(P, S ) i  m - 1 j  m - 1 repeat if T[i] = P[j] if j = 0 return i { match at i } else i  i - 1 j  j - 1 else { character-jump } l  L[T[i]] i  i + m – min(j, 1 + l) j  m - 1 until i > n - 1 return -1 { no match } m - (1  l) i j l . . . . . . a . . . . . . . a . . b . . a . . b . 1  l Case 1: 1 + l  j
  • 20. Example (Case 1 and 3) 1 a b a c a a b a d c a b a c a b a a b b 2 3 4 5 6 7 8 9 10 12 a b a c a b a b a c a b a b a c a b a b a c a b a b a c a b a b a c a b 11 13 i=5, j=5, l=4, i=5+6-min(5,1+4), i=6 i=6, j=5 i=4, j=3, l=4, i=4+6-min(3,1+4), i=7 i=7, j=5, l=-1, i=8+6-min(5,1-1), i=14
  • 21. Analysis (Case 2): Good suffix heuristic • Boyer-Moore’s algorithm runs in time O(nm + s) • Example of worst case: – T = aaa … a – P = baaa • The worst case may occur in images and DNA sequences but is unlikely in English text • Boyer-Moore’s algorithm is significantly faster than the brute- force algorithm on English text • Longer the pattern, faster we move on an average 11 1 a a a a a a a a a 2 3 4 5 6 b a a a a a b a a a a a b a a a a a b a a a a a 7 8 9 10 12 13 14 15 16 17 18 19 20 21 22 23 24
  • 22. Task 22 shiwani gupta How many character comparisons will Boyer Moore algorithm make to search for each of the following patterns in binary text? Text: repeat “01110” 20 times Pattern: (a) 01111 (b) 01110
  • 23. KMP-Algorithm • In Brute Force and Boyer Moore, we compare pattern characters that don’t match in text. • On occurrence of mismatch, throw away the info, restart comparison for another set of characters from text. • Thus again and again with next incremental position of text, the characters are matched. • In KMP when a mismatch occurs, word itself embodies info to determine where next match would begin, thus bypassing re- examination of previously matched characters. • First linear time string matching algorithm. Demo: https://people.ok.ubc.ca/ylucet/DS/KnuthMorrisPratt.html
  • 24. The KMP Algorithm - Motivation • Knuth-Morris-Pratt’s algorithm compares the pattern to the text in left-to-right, but shifts the pattern more intelligently than the brute- force algorithm. • It bypasses re-examination of previously matched characters • When a mismatch occurs, what is the most we can shift the pattern so as to avoid redundant comparisons? • Answer: the largest prefix of P[0..j] that is a suffix of P[1..j] x j . . a b a a b . . . . . a b a a b a a b a a b a No need to repeat these comparisons Resume comparing here
  • 25.
  • 26. KMP Failure Function • Knuth-Morris-Pratt’s algorithm preprocesses the pattern to find matches of prefixes of the pattern with the pattern itself • The failure function shows how much of the beginning of string matches to portion immediately preceding failed comparison • Prefix table gives you table of skips per prefix ahead of time • Knuth-Morris-Pratt’s algorithm modifies the brute-force algorithm so that if a mismatch occurs at P[j]  T[i] we set j  F(j - 1) j 0 1 2 3 4 5 P[j] a b a a b a F(j) 0 0 1 1 2 3 x j . . a b a a b . . . . . a b a a b a F(j-1) a b a a b a Task : Apply failure function code on above pattern
  • 27. Computing the Failure Function • Failure function encodes repeated substrings inside the pattern itself • The failure function can be represented by an array and can be computed in O(m) time • The construction is similar to the KMP algorithm itself • At each iteration of the while- loop, either – i increases by one, or – the shift amount i - j increases by at least one (observe that F(j - 1) < j) • Hence, there are no more than 2m iterations of the while-loop Algorithm failureFunction(P) F[0]  0 i  1 j  0 while i < m if P[i] = P[j] {we have matched j + 1 chars} F[i]  j + 1 i  i + 1 j  j + 1 else if j > 0 then {use failure function to shift P} j  F[j - 1] else F[i]  0 { no match } i  i + 1
  • 28. j 0 1 2 3 4 5 P[j] a b a c a b Failure Function j 0 1 2 3 4 5 P[j] a b a a b a F(j) 0 0 1 1 2 3 F(0)=0 P[1] ≠ P[0] j ≯0 F[1]=0 i=i+1=2 P[2]==P[0] F[2]=1 P[3] ≠ P[1] j>0 j=F[0]=0 P[3]==P[0] F[3]=j+1=1 i=4, j=1 P[4]==P[1] F[4]=1+1=2 i=5,j=2 P[5]==P[2] F[5]=2+1=3 i=6,j=3 loop stops
  • 29. The KMP Algorithm • The failure function can be represented by an array and can be computed in O(m) time. Independent of S • At each iteration of the while-loop, either – i increases by one, or – the shift amount i - j increases by at least one (observe that F(j - 1) < j) • Hence, there are no more than 2n iterations of the while-loop • Thus, KMP’s algorithm runs in optimal time O(m + n) under worst case. It requires O(m) extra space. Algorithm KMPMatch(T, P) F  failureFunction(P) i  0 j  0 while i < n if T[i] = P[j] if j = m - 1 return i - j { match } else i  i + 1 j  j + 1 else if j > 0 j  F[j - 1] else i  i + 1 return -1 { no match }
  • 30. Example 1 a b a c a a b a c a b a c a b a a b b 7 8 19 18 17 15 a b a c a b 16 14 13 2 3 4 5 6 9 a b a c a b a b a c a b a b a c a b a b a c a b 10 11 12 c j 0 1 2 3 4 5 P[j] a b a c a b F(j) 0 0 1 0 1 2 j=F(j-1)=1, i remains same j=F(j-1)=0, i remains same j=0, i=i+1 j=F(j-1)=1, i remains same
  • 31. An Example for KMP Algorithm Phase 1: apply slide 27 Phase 2: apply slide 29 f(4–1)+1= f(3)+1=0+1=1 matched shiwani gupta 31
  • 32. Time Complexity of KMP Algorithm • Time complexity: O(m+n) (analysis omitted) – O(m) for computing function f – O(n) for searching P shiwani gupta 32
  • 33. Definition of Rabin-Karp A string search algorithm which compares a string's hash values, rather than the strings themselves. For efficiency, the hash value of the next position in the text is easily computed from the hash value of the current position. https://www.youtube.com/watch?v=qQ8vS2btsxI
  • 34. Rabin-Karp • The Rabin-Karp string searching algorithm calculates a hash value for the pattern, and for each M-character subsequence of text to be compared. • If the hash values are unequal, the algorithm will calculate the hash value for next M-character sequence. • If the hash values are equal, the algorithm will do a Brute Force comparison between the pattern and the M-character sequence. • In this way, there is only one comparison per text subsequence, and Brute Force is only needed when hash values match. • Perhaps an example will clarify some things...
  • 35. Rabin-Karp Math Example • Let’s say that our alphabet consists of 10 letters. • our alphabet = a, b, c, d, e, f, g, h, i, j • Let’s say that “a” corresponds to 1, “b” corresponds to 2 and so on. The hash value for string “cah” would be ... 3*100 + 1*10 + 8*1 = 318
  • 36. Rabin-Karp Example • Hash value of “AAAAA” is 37 • Hash value of “AAAAH” is 100
  • 37. How Rabin-Karp works • Let characters in both arrays T and P be digits in radix-S notation. S = (0,1,...,9) • Let p be the value of the characters in P • Choose a prime number q. • Compute (p mod q) – The value of p mod q is what we will be using to find all matches of the pattern P in T. • To compute p, we use Horner’s rule • p = P[m] + 10(P[m-1] + 10(P[m-2] + … + 10(P[2] + 10(P[1]) …)) • which we can compute in time O(m).
  • 38. How Rabin-Karp works (continued) Compute (T[s+1, .., s+m] mod q) for s = 0 .. n-m Test against P only those sequences in T having the same (mod q) value (T[s+1, .., s+m] mod q) can be incrementally computed by subtracting the high-order digit, shifting, adding the low-order bit, all in modulo q arithmetic.
  • 39. A Rabin-Karp example  Given T = 31415926535 and P = 26  We choose q = 11  P mod q = 26 mod 11 = 4 1 3 1 4 9 5 6 2 3 5 5 1 3 1 4 9 5 6 2 3 5 5 14 mod 11 = 3 not equal to 4 31 mod 11 = 9 not equal to 4 1 3 1 4 9 5 6 2 3 5 5 41 mod 11 = 8 not equal to 4
  • 40. Rabin-Karp example continued 1 3 1 4 9 5 6 2 3 5 5 15 mod 11 = 4 equal to 4 -> spurious hit 1 3 1 4 9 5 6 2 3 5 5 59 mod 11 = 4 equal to 4 -> spurious hit 1 3 1 4 9 5 6 2 3 5 5 92 mod 11 = 4 equal to 4 -> spurious hit 1 3 1 4 9 5 6 2 3 5 5 26 mod 11 = 4 equal to 4 -> an exact match!! 1 3 1 4 9 5 6 2 3 5 5 65 mod 11 = 10 not equal to 4
  • 41. Rabin-Karp example continued As we can see, when a match is found, further testing is done to ensure that a match has indeed been found. 1 3 1 4 9 5 6 2 3 5 5 53 mod 11 = 9 not equal to 4 1 3 1 4 9 5 6 2 3 5 5 35 mod 11 = 2 not equal to 4
  • 42. Rabin-Karp Algorithm pattern is M characters long hash_p=hash value of pattern hash_t=hash value of first M letters in body of text do if (hash_p == hash_t) brute force comparison of pattern and selected section of text hash_t = hash value of next section of text, one character over while (end of text or brute force comparison == true)
  • 43. Rabin-Karp Math • Consider an M-character sequence as an M-digit number in base b, where b is the number of letters in the alphabet. The text subsequence t[i .. i+M-1] is mapped to the number • Furthermore, given x(i) we can compute x(i+1) for the next subsequence t[i+1 .. i+M] in constant time, as follows: • In this way, we never explicitly compute a new value. We simply adjust the existing value as we move over one character.
  • 44. Rabin-Karp Mods • If M is large, then the resulting value (bM) will be enormous. For this reason, we hash the value by taking it mod a prime number q. • The mod function (% in Java) is particularly useful in this case due to several of its inherent properties: [(x mod q) + (y mod q)] mod q = (x+y) mod q (x mod q) mod q = x mod q • For these reasons: h(i)=((t[i] bM-1 mod q) +(t[i+1] bM-2 mod q) + ... +(t[i+M-1] mod q))mod q h(i+1) =( h(i)  b mod q Shift left one digit -t[i]  bM mod q Subtract leftmost digit +t[i+M] mod q ) Add new rightmost digit mod q
  • 45. 2. The Rabin-Karp Algorithm Special Case • Given a text T[1 .. n] of length n, a pattern P[1 .. m] of length m ≤ n, both as arrays. • Assume that elements of P and T are characters drawn from a finite set of alphabets Σ. • Where Σ = {0, 1, 2, . . . , 9}, so that each character is a decimal digit. • Now our objective is “finding all valid shifts with which a given pattern P occurs in a text T”.
  • 46. Notations: The Rabin-Karp Algorithm Let us suppose that • p denotes decimal value of a given pattern P[1 .. m] • ts = decimal value of length m substring T[s + 1 .. s + m], of given text T [1 .. n], for s = 0, 1, ..., n - m. • It is very obvious that, ts = p if and only if T [s + 1 .. s + m] = P[1 .. m]; thus, s is a valid shift if and only if ts = p. • Now the question is how to compute p and ts efficiently • Answer is Horner’s rule
  • 47. Horner’s Rule Example: Horner’s rule [3, 4, 5] = 5 + 10(4 + 10(3)) = 5 + 10(4 + 30) = 5+340 = 345 p = P[3] + 10 (P[3 - 1] + 10(P[1])). Formula • We can compute p in time Θ(m) using this rule as p = P[m] + 10 (P[m-1] + 10(P[m-2] + … + 10(P[2] + 10P[1]) )) • Similarly t0 can be computed from T [1 .. m] in time Θ(m). • To compute t1, t2, . . . , tn-m in time Θ(n - m), it suffices to observe that ts+1 can be computed from ts in constant time.
  • 48. Computing ts+1 from ts in constant time • Text = [3, 1, 4, 1, 5, 2]; t0 = 31415 • m = 5; Shift = 0 • Old higher-order digit = 3 • New low-order digit = 2 • t1 = 10.(31415 – 104.T(1)) + T(5+1) = 10.(31415 – 104.3) + 2 = 10(1415) + 2 = 14152 • ts+1 = 10(ts – T[s + 1] 10m-1 ) + T[s + m + 1]) • t1 = 10(t0 – T[1] 104) + T[0 + 5 + 1]) • Now t1, t2, . . . , tn-m can be computed in Θ(n - m) 3 1 4 1 5 2
  • 49. Procedure: Computing ts+1 from ts 1. Subtract T[s + 1]10m-1 from ts, removes high-order digit 2. Multiply result by 10, shifts the number left one position 3. Add T [s + m + 1], it brings appropriate low-order digit. ts+1 = (10(ts – T[s + 1] 10m-1 ) + T[s + m + 1]) Another issue and its treatment • The only difficulty with the above procedure is that p and ts may be too large to work with conveniently. • Fortunately, there is a simple cure for this problem, compute p and the ts modulo a suitable modulus q.
  • 50. 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1 7 mod 13 A window of length 5 is shaded. The numerical value of window = 31415 31415 mod 13 = 7 Computing ts+1 from ts Modulo q = 13
  • 51. Spurious Hits and their Elimination • m = 5. • p = 31415, • Now, 31415 ≡ 7 (mod 13) • Now, 67399 ≡ 7 (mod 13) • Window beginning at position 7 = valid match; s = 6 • Window beginning at position 13 = spurious hit; s = 12 • After comparing decimal values, text comparison is needed 0 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1 mod 13 8 9 3 11 1 7 8 4 5 10 11 7 9 11 … … … Spurious hit Valid match 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
  • 52. 2. The Rabin-Karp Algorithm RABIN-KARP-MATCHER(T, P, d, q) 1 n ← length[T] 2 m ← length[P] 3 h ← dm-1 mod q 4 p ← 0 5 t0 ← 0 6 for i ← 1 to m Preprocessing. 7 do p ← (dp + P[i]) mod q 8 t0 ← (dt0 + T[i]) mod q 9 for s ← 0 to n - m Matching. 10 do if p = ts 11 then if P[1 .. m] = T [s + 1 .. s + m] 12 then print "Pattern occurs with shift" s 13 if s < n - m 14 then ts+1 ← (d(ts - T[s + 1]h) + T[s + m + 1]) mod q
  • 53. Applications Bioinformatics – Used in looking for similarities of two or more proteins; i.e. high sequence similarity usually implies significant structural or functional similarity. Alpha hemoglobin and beta hemoglobin are subunits that make up a protein called hemoglobin in red blood cells. Notice the similarities between the two sequences, which probably signify functional similarity.
  • 54. TASK Input: S = "batmanandrobinarebat", pat = "bat" Input: S = "abesdu", pat = "edu"
  • 55. String Matching Whenever you use a search engine, or a “find” function, you are utilizing a string matching program. Many of these programs create finite automata in order to effectively search for your string. https://www.youtube.com/watch?v=JF48ymcpEzY
  • 56. Finite Automata A finite automaton is a quintuple (Q, , , s, F): • Q: the finite set of states • : the finite input alphabet • : the “transition function” from Qx to Q • s Q: the start state • F  Q: the set of final (accepting) states
  • 57. How it works A finite automaton accepts strings in a specific language. It begins in state q0 and reads characters one at a time from the input string. It makes transitions () based on these characters, and (if) when it reaches the end of the tape and is in one of the accepting states, that string is accepted by the language.
  • 58.
  • 59. Finite-Automaton-Matcher The simple loop structure implies a run time for a string of length n is O(n). However: this is only the run time for the actual string matching. It does not include the time it takes to compute the transition function.
  • 60. 3. String Matching with Finite Automata • A finite automaton M is a 5-tuple (Q, q0, A, Σ, δ), where – Q is a finite set of states, – q0 ∈ Q is the start state, – A ∈ Q is a distinguished set of accepting states, – Σ is a finite input alphabet, – δ is a function from Q × Σ into Q, called the transition function of M. • String-matching automata are very efficient because it examines each character exactly once, taking constant time. • The matching time used after preprocessing the pattern to build the automaton is therefore Θ(n).
  • 61. input state a b 0 1 0 1 0 0 0 1 a a b b State set Q = {0, 1} Start state q0 = 0 Input alphabet Ʃ = {a, b} A tabular representation of transition function δ • Q = {0, 1}, Ʃ = {a, b} and transition function is shown below • A simple two-state finite automaton which accepts those strings that end in an odd number of a’s. Example : Transition Table and Finite Automata
  • 62. input state a b c P 0 1 0 0 a 1 1 2 0 b 2 3 0 0 a 3 1 4 0 b 4 5 0 0 a 5 1 4 6 c 6 7 0 0 a 7 1 2 0 • Pattern string P = ababaca. • Edge towards right shows matching • Edge towards left is for failure • No edge for some state and some alphabet means that edge hits initial state 0 1 2 3 4 5 6 7 a b a b a c a a a a a b b String Matching Automata for given Pattern
  • 63. String Matching using Finite Automata i -- 1 2 3 4 5 6 7 8 9 10 11 T[i] -- a b a b a b a c a b a state φ(Ti) 0 1 2 3 4 5 4 5 6 7 2 3 Finite Automata for Pattern P = ababaca Text T = abababacaba. 0 1 2 3 4 5 6 7 a b a b a c a a a a a b b
  • 64. 3. String Matching with finite Automata FINITE-AUTOMATON-MATCHER(T, δ, m) 1 n ← length[T] 2 q ← 0 3 for i ← 1 to n 4 do q ← δ(q, T[i]) 5 if q = m 6 then print "Pattern occurs with shift" i - m • Matching time on a text string of length n is Θ(n). • Memory Usage: O(m|Σ|), • Preprocessing Time: Best case: O(m|Σ|).
  • 65. TASK • P=AABA, T=AABAACAADAABAAABAA • Suppose a finite automaton which accepts even number of a's where ∑ = {a, b, c} Suppose w is a string such as w=bcaabcaaabac
  • 66. 66 General Problems, Input Size and Time Complexity • Time complexity of algorithms : polynomial time algorithm ("efficient algorithm") v.s. exponential time algorithm ("inefficient algorithm") f(n) n 10 30 50 n 0.00001 sec 0.00003 sec 0.00005 sec n5 0.1 sec 24.3 sec 5.2 mins 2n 0.001 sec 17.9 mins 35.7 yrs
  • 67. 67 “Hard” and “easy’ Problems • Sometimes the dividing line between “easy” and “hard” problems is a fine one. For example – Find the shortest path in a graph from X to Y. (easy) – Find the longest path in a graph from X to Y. (with no cycles) (hard) • View another way – as “yes/no” problems – Is there a simple path from X to Y with weight <= M? (easy) – Is there a simple path from X to Y with weight >= M? (hard) – First problem can be solved in polynomial time. – All known algorithms for the second problem (could) take exponential time.
  • 68. 68 The Classes of P and NP • The class P and Deterministic Turing Machine • Given a decision problem X, if there is a polynomial time Deterministic Turing Machine program that solves X, then X is belong to P • Informally, there is a polynomial time algorithm to solve the problem
  • 69. 69 • One of the most important open problem in theoretical compute science : Is P=NP ? Most likely “No”. Currently, there are many known (decision) problems in NP, and there is no solution to show anyone of them in P.
  • 70. 70 • P: (Decision) problems solvable by deterministic algorithms in polynomial time • NP: (Decision) problems solved by non-deterministic algorithms in polynomial time • A group of (decision) problems, (Satisfiability, 0/1 Knapsack, Longest Path, Partition) have an additional important property: If any of them can be solved in polynomial time, then they all can! NP P NP-Complete • These problems are called NP-complete problems.