String-Matching Algorithms (UNIT-5)
1. String Matching :
Let the text be an array T[1..n] of length ‘n’.
Let the pattern be an array P[1..m] of length ‘m’.
The characters of T and P are drawn from a finite alphabet Σ.
Here P and T are called ‘Strings of Characters’.
The pattern P occurs with shift s in the text T
if 0 ≤ s ≤ n – m
and T[s+1..s+m] = P[1..m],
i.e., T[s+j] = P[j] for 1 ≤ j ≤ m.
If P occurs with shift s in T, then s is a VALID SHIFT.
Otherwise, s is an INVALID SHIFT.
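The definition above can be checked directly. The following is a minimal Python sketch, not part of the original slides; the function name occurs_with_shift and the sample strings are assumptions chosen for illustration. Python strings are 0-indexed, so the slides' T[s+j] becomes text[s + j - 1].

```python
def occurs_with_shift(text: str, pattern: str, s: int) -> bool:
    """Return True if s is a valid shift of pattern in text (hypothetical helper)."""
    n, m = len(text), len(pattern)
    # A shift is only meaningful when 0 <= s <= n - m.
    if not 0 <= s <= n - m:
        return False
    # Slides use 1-indexed arrays: T[s+j] == P[j] for 1 <= j <= m.
    return all(text[s + j - 1] == pattern[j - 1] for j in range(1, m + 1))


print(occurs_with_shift("abcab", "cab", 2))  # True
print(occurs_with_shift("abcab", "cab", 0))  # False
```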
The String-matching Problem is the problem of finding
all valid shifts with which a given pattern P occurs in a
given text T.
Ex-1 : Let text T : a b c a b a a b c a b a c
Let pattern P : a b a a
Find the number of valid shifts and the values of ‘s’.
Answer : Only one valid shift, s = 3, since T[4..7] = a b a a.
The symbol Σ* (read as ‘sigma-star’) is the set of all
finite-length strings formed using characters from
the alphabet Σ.
The zero-length string is called the ‘Empty String’. It is
denoted by ‘ɛ’ and also belongs to Σ*.
The length of a string ‘x’ is denoted |x|.
The concatenation of two strings x and y, denoted xy,
has length |x| + |y|.
A string ω is a prefix of a string x, denoted as ω ⊏ x,
if x = ωy for some string y ∊ Σ*.
Note that if ω ⊏ x, then |ω| ≤ |x|.
Similarly, a string ω is a suffix of a string x, denoted
as ω ⊐ x, if x = yω for some string y ∊ Σ*.
Note that if ω ⊐ x, then |ω| ≤ |x|.
Ex-2 : Let abcca be a string.
Here, ab ⊏ abcca and cca ⊐ abcca.
Note-1 : The empty string ɛ is both a suffix and a
prefix of every string.
Note-2 : Both ⊏ (prefix) and ⊐ (suffix) are transitive
relations.
Lemma : Suppose that x, y, and z are strings
such that x ⊐ z and y ⊐ z.
If |x| ≤ |y|, then x ⊐ y.
If |x| ≥ |y|, then y ⊐ x.
If |x| = |y|, then x = y.
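The prefix and suffix relations and the lemma can be tried out with Python's built-in str.startswith and str.endswith. This is only an illustrative sketch; the helper names is_prefix, is_suffix and lemma_holds are assumptions, not names from the slides.

```python
def is_prefix(w: str, x: str) -> bool:
    """w ⊏ x : x = w y for some string y."""
    return x.startswith(w)


def is_suffix(w: str, x: str) -> bool:
    """w ⊐ x : x = y w for some string y."""
    return x.endswith(w)


def lemma_holds(x: str, y: str, z: str) -> bool:
    """Check the lemma for one triple where x and y are both suffixes of z."""
    if not (is_suffix(x, z) and is_suffix(y, z)):
        raise ValueError("x and y must both be suffixes of z")
    if len(x) < len(y):
        return is_suffix(x, y)
    if len(x) > len(y):
        return is_suffix(y, x)
    return x == y  # equal lengths force equal strings


print(is_prefix("ab", "abcca"), is_suffix("cca", "abcca"))  # True True
print(is_prefix("", "abcca"), is_suffix("", "abcca"))       # True True
print(lemma_holds("ca", "cca", "abcca"))                    # True
```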
2. The Naïve String-matching Algorithm :
This algorithm finds all valid shifts using a loop that
checks the condition P[1..m] = T[s+1..s+m] for each
of the n – m + 1 possible values of s.
NAÏVE-STRING-MATCHER(T,P)
n = T.length
m = P.length
for s = 0 to n – m
    if P[1..m] == T[s+1..s+m]
        print “Pattern occurs with shift” s
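The pseudocode above translates almost line for line into Python. The sketch below is one possible rendering, not the slides' own code; the name naive_string_matcher and the choice to return a list instead of printing are assumptions. With 0-indexed strings, T[s+1..s+m] is the slice text[s:s + m].

```python
def naive_string_matcher(text: str, pattern: str) -> list[int]:
    """Return every valid shift s, trying all n - m + 1 candidates in order."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):          # s = 0 .. n - m
        if text[s:s + m] == pattern:    # P[1..m] == T[s+1..s+m]
            shifts.append(s)
    return shifts


# Ex-1 from the slides: T = abcabaabcabac, P = abaa
print(naive_string_matcher("abcabaabcabac", "abaa"))  # [3]
```

Each candidate shift costs up to m character comparisons, so the loop performs at most (n – m + 1)·m comparisons in total.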
Ex-3 : Let T = acaabc and P = aab.
Find the value of s.
Answer : s = 2
Ex-4 : Let T = 000010001010001
P = 0001
Find the values of ‘s’.
Answer : s = 1, 5 and 11
Ex-5 : Let T = a^n and P = a^m (the character a repeated n times and m times).
Answer : Every s from 0 to n – m is a valid shift,
i.e., there are n – m + 1 valid shifts.
3. The Rabin-Karp Algorithm :
Let Σ = {0, 1, 2, … , 9}
Here each character is a decimal digit, so
d = |Σ| = 10.
The string 31415 represents the number 31,415 in radix-d notation.
Let the text be T[1..n].
Let the pattern be P[1..m].
Let p denote the decimal value of P[1..m].
Let t_s denote the decimal value of the length-m substring
T[s+1..s+m], for s = 0, 1, 2, …, n – m.
• t_s = p iff T[s+1..s+m] = P[1..m]
• s is a valid shift iff t_s = p
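The value of p can be computed using
Horner’s rule as follows:
p is the number whose decimal digits are P[1] P[2] P[3] … P[m].
So, p = P[m] + 10 (P[m-1] + 10 (P[m-2] + … +
10 (P[2] + 10 P[1])…)).
Similarly, one can compute t_0 as follows :
t_0 = T[m] + 10 (T[m-1] + 10 (T[m-2] + … +
10 (T[2] + 10 T[1])…)).
We can compute t_{s+1} from t_s as follows :
t_{s+1} = 10 (t_s – 10^{m-1} T[s+1]) + T[s+m+1].
Subtracting 10^{m-1} T[s+1] removes the high-order digit, multiplying
by 10 shifts the window left by one digit, and adding T[s+m+1]
brings in the new low-order digit.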
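Both formulas can be sketched in a few lines of Python. The helper names horner_value and slide_window and the digit string "2359" are assumptions made for illustration; no modulus is applied yet, so the values are exact integers.

```python
def horner_value(digits: str) -> int:
    """Decimal value of a digit string, evaluated with Horner's rule."""
    value = 0
    for ch in digits:                 # most significant digit first
        value = 10 * value + int(ch)
    return value


def slide_window(t_s: int, leading_digit: int, incoming_digit: int, m: int) -> int:
    """t_{s+1} = 10 (t_s - 10^(m-1) T[s+1]) + T[s+m+1]."""
    return 10 * (t_s - 10 ** (m - 1) * leading_digit) + incoming_digit


text, m = "2359", 3
t0 = horner_value(text[:m])
print(t0)                                                # 235
print(slide_window(t0, int(text[0]), int(text[m]), m))   # 359
```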
Since p and t_s may be too large to fit in one computer word, all
values are computed modulo q. Let q be chosen so that dq fits in one
computer word; the above recurrence equation then becomes :
t_{s+1} = (d (t_s – T[s+1] h) + T[s+m+1]) mod q.
Here, h ≡ d^{m-1} (mod q),
i.e., h is the value of the digit “1” in the high-order
position of an m-digit text window.
Ex-6 : Let m = 5, t_s = 31415 and T[s+1] = 3.
Let T[s+m+1] = 2
So, t_{s+1} = 10 (t_s – 10^{m-1} T[s+1]) + T[s+m+1]
= 10 (31415 – 10^4 · 3) + 2 = 14150 + 2 = 14152
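The modular form of the update can be sketched as follows, reusing the digits of Ex-6. The modulus q = 13 is an assumption made for this illustration, and roll_hash is a hypothetical helper name. Python's % operator returns a non-negative result for a positive modulus, so the negative intermediate value needs no extra correction here; languages where % can return a negative value would need one.

```python
def roll_hash(t_s: int, leading_digit: int, incoming_digit: int,
              h: int, d: int = 10, q: int = 13) -> int:
    """t_{s+1} = (d (t_s - T[s+1] h) + T[s+m+1]) mod q."""
    return (d * (t_s - leading_digit * h) + incoming_digit) % q


m, d, q = 5, 10, 13          # q = 13 is an assumed modulus
h = pow(d, m - 1, q)         # h = d^(m-1) mod q
t_s = 31415 % q              # current window 31415, reduced mod q
t_next = roll_hash(t_s, 3, 2, h, d, q)

print(h, t_s, t_next)        # 3 7 8
print(14152 % q)             # 8, the same value computed directly
```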
The test t_s ≡ p (mod q) is a fast heuristic
test to rule out invalid shifts s.
For any value of ‘s’,
if t_s ≡ p (mod q) is TRUE
and P[1..m] = T[s+1..s+m] is FALSE
then ‘s’ is called a SPURIOUS HIT.
Note : a) If t_s ≡ p (mod q) is TRUE,
then t_s = p may or may not be TRUE, so the shift must be checked explicitly.
b) If t_s ≡ p (mod q) is FALSE,
then t_s ≠ p is definitely TRUE, so s is an invalid shift.
RABIN-KARP-MATCHER (T,P,d,q)
n = T.length
m = P.length
h = d^{m-1} mod q
p = 0
t_0 = 0
for i = 1 to m // preprocessing
    p = (d p + P[i]) mod q
    t_0 = (d t_0 + T[i]) mod q
for s = 0 to n – m // matching
    if p == t_s
        if P[1..m] == T[s+1..s+m]
            print “Pattern occurs with shift” s
    if s < n – m
        t_{s+1} = (d (t_s – T[s+1] h) + T[s+m+1]) mod q
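A possible Python rendering of the matcher is sketched below. It assumes the text and pattern are strings of decimal digits, as in this section. Unlike the pseudocode it also returns the spurious hits separately so they can be inspected; that extra return value, the function name, and the sample input are assumptions for illustration.

```python
def rabin_karp_matcher(text: str, pattern: str, d: int = 10, q: int = 13):
    """Return (valid_shifts, spurious_hits) for digit strings text and pattern."""
    n, m = len(text), len(pattern)
    valid, spurious = [], []
    if m == 0 or m > n:
        return valid, spurious
    h = pow(d, m - 1, q)                      # h = d^(m-1) mod q
    p = t = 0
    for i in range(m):                        # preprocessing (Horner's rule mod q)
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q
    for s in range(n - m + 1):                # matching
        if p == t:                            # hash hit: verify character by character
            if text[s:s + m] == pattern:
                valid.append(s)
            else:
                spurious.append(s)
        if s < n - m:                         # roll the window one digit to the right
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return valid, spurious


# Windows 41 and 15 are both congruent to 2 (mod 13); only 15 is a real match.
print(rabin_karp_matcher("3141592653", "15"))  # ([3], [2])
```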
Ex-7 : Let T = 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Let P = 3 1 4 1 5
Here n = 19, m = 5, d = 10,
q = 13, h = 10^4 mod 13 = 3
Initially p = 0, t_0 = 0
First for statement (preprocessing) :
i = 1 : p = 3    t_0 = 2
i = 2 : p = 5    t_0 = 10
i = 3 : p = 2    t_0 = 1
i = 4 : p = 8    t_0 = 6
i = 5 : p = 7    t_0 = 8
Second for statement (matching) :
 s   p   t_s   T[s+1..s+m]   p == t_s   Result         s < n – m   t_{s+1}
 0   7    8    23590         FALSE      ---            TRUE          9
 1   7    9    35902         FALSE      ---            TRUE          3
 2   7    3    59023         FALSE      ---            TRUE         11
 3   7   11    90231         FALSE      ---            TRUE          0
 4   7    0    02314         FALSE      ---            TRUE          1
 5   7    1    23141         FALSE      ---            TRUE          7
 6   7    7    31415         TRUE       valid match    TRUE          8
 7   7    8    14152         FALSE      ---            TRUE          4
 8   7    4    41526         FALSE      ---            TRUE          5
 9   7    5    15267         FALSE      ---            TRUE         10
10   7   10    52673         FALSE      ---            TRUE         11
11   7   11    26739         FALSE      ---            TRUE          7
12   7    7    67399         TRUE       spurious hit   TRUE          9
13   7    9    73992         FALSE      ---            TRUE         11
14   7   11    39921         FALSE      ---            FALSE        ---
Hence, there is only ONE VALID MATCH, at s = 6, and
only ONE SPURIOUS HIT, at s = 12.
4. The Knuth-Morris-Pratt Algorithm :
This algorithm is meant for ‘Pattern Matching’.
Here, the prefix function π for a pattern
encapsulates knowledge about how the pattern
matches against shifts of itself.
π[q] is the length of the longest proper prefix of P
that is also a suffix of P[1..q].
Ex-8 : Let the text string T and the pattern P be :
i : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
T : b a c b a b a b a c  a  c  a  c  a
q : 1 2 3 4 5 6 7
P : a b a b a c a
COMPUTE-PREFIX-FUNCTION (P) :
m = P.length
let π[1..m] be a new array
π[1] = 0
k = 0
for q = 2 to m
    while k > 0 and P[k+1] ≠ P[q]
        k = π[k]
    if P[k+1] == P[q]
        k = k + 1
    π[q] = k
return π
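The prefix-function pseudocode can be sketched in Python as below. The list pi is 0-indexed, so pi[q - 1] holds the slides' π[q]; the function name and the sample pattern "aabaaab" are assumptions for illustration.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """pi[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it (0-indexed version of the slides' π)."""
    m = len(pattern)
    pi = [0] * m
    k = 0                                    # characters currently matched
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]                    # fall back to a shorter prefix
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("aabaaab"))  # [0, 1, 0, 1, 2, 2, 3]
```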
Ex-8 (contd…)
q : 1 2 3 4 5 6 7
P : a b a b a c a
INIT : m = 7, π[1] = 0, k = 0
Step : q = 2 :
Here, k = 0 & P[k+1] = a & P[q] = b
So, while : FALSE & if : FALSE
Hence, π[2] = 0
Step : q = 3 :
Here, k = 0 & P[k+1] = a & P[q] = a
So, while : FALSE & if : TRUE ⇒ k = 1
Hence, π[3] = 1
Step : q = 4 :
Here, k = 1 & P[k+1] = b & P[q] = b
So, while : FALSE & if : TRUE ⇒ k = 2
Hence, π[4] = 2
Step : q = 5 :
Here, k = 2 & P[k+1] = a & P[q] = a
So, while : FALSE & if : TRUE ⇒ k = 3
Hence, π[5] = 3
Step : q = 6 :
Here, k = 3 & P[k+1] = b & P[q] = c
So, while : TRUE ⇒ k = 1 ( = π[3] )
Now k = 1 & P[k+1] = b & P[q] = c
So, while : TRUE ⇒ k = 0 ( = π[1] )
Now k = 0, so the while loop ends.
if : FALSE (P[1] = a ≠ P[6] = c)
Hence, π[6] = 0
Step : q = 7 :
Here, k = 0 & P[k+1] = a & P[q] = a
So, while : FALSE &
if : TRUE (P[1] == P[7]) ⇒ k = 1
Hence, π[7] = 1
Hence the π array is as follows :
q : 1 2 3 4 5 6 7
π : 0 0 1 2 3 0 1
Hence, COMPUTE-PREFIX-FUNCTION returns the array π = [0, 0, 1, 2, 3, 0, 1].
KMP-MATCHER (T,P) :
n = T.length
m = P.length
π = COMPUTE-PREFIX-FUNCTION(P)
q = 0
for i = 1 to n
    while q > 0 and P[q+1] ≠ T[i]
        q = π[q]
    if P[q+1] == T[i]
        q = q + 1
    if q == m
        print “Pattern occurs with shift” i – m
        q = π[q]
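The matcher can be sketched in Python along the same lines. The block below is self-contained, so it recomputes the prefix function inline; names and the sample input are assumptions for illustration. With a 0-indexed loop variable i, the slides' shift i – m becomes i - m + 1.

```python
def kmp_matcher(text: str, pattern: str) -> list[int]:
    """Return all valid shifts of pattern in text using the KMP scan."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    # Prefix function (0-indexed): pi[j] is the slides' π[j + 1].
    pi, k = [0] * m, 0
    for j in range(1, m):
        while k > 0 and pattern[k] != pattern[j]:
            k = pi[k - 1]
        if pattern[k] == pattern[j]:
            k += 1
        pi[j] = k
    shifts, q = [], 0                        # q = number of characters matched
    for i in range(n):
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]                    # next character does not match
        if pattern[q] == text[i]:
            q += 1                           # next character matches
        if q == m:                           # all of P matched
            shifts.append(i - m + 1)
            q = pi[q - 1]                    # look for the next (overlapping) match
    return shifts


print(kmp_matcher("abababa", "aba"))  # [0, 2, 4]
```

The text index i never moves backwards; only q falls back through π, which is why overlapping occurrences are still found in a single pass over T.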
Ex-8 (contd…)
KMP-MATCHER (T,P) :
INIT : n = 15, m = 7, π = [0, 0, 1, 2, 3, 0, 1], q = 0
Key : q is the value at the start of iteration i.
C1 : q > 0,  C2 : P[q+1] ≠ T[i],  wh : C1 and C2 (first test of the while loop)
if(1) : P[q+1] == T[i],  if(2) : q == m
-----------------------------------------------------------------------------------------------
 i   q   C1  C2  wh  q = π[q]   if(1)  q++     if(2)  print     q = π[q]
-----------------------------------------------------------------------------------------------
 1   0   F   T   F   ---        F      ---     F      ---       ---
 2   0   F   F   F   ---        T      q = 1   F      ---       ---
 3   1   T   T   T   q = 0      F      ---     F      ---       ---
 4   0   F   T   F   ---        F      ---     F      ---       ---
 5   0   F   F   F   ---        T      q = 1   F      ---       ---
 6   1   T   F   F   ---        T      q = 2   F      ---       ---
 7   2   T   F   F   ---        T      q = 3   F      ---       ---
 8   3   T   F   F   ---        T      q = 4   F      ---       ---
 9   4   T   F   F   ---        T      q = 5   F      ---       ---
10   5   T   F   F   ---        T      q = 6   F      ---       ---
11   6   T   F   F   ---        T      q = 7   T      shift 4   q = 1
12   1   T   T   T   q = 0      F      ---     F      ---       ---
13   0   F   F   F   ---        T      q = 1   F      ---       ---
14   1   T   T   T   q = 0      F      ---     F      ---       ---
15   0   F   F   F   ---        T      q = 1   F      ---       ---
-----------------------------------------------------------------------------------------------
Hence, the pattern occurs once in T, with shift s = 4 (T[5..11] = a b a b a c a).
