SlideShare a Scribd company logo
1 of 45
Approximate pattern matching Given string  x = abbacbbbababacabbbba  and  pattern  p = bbba  find all “almost”-occurrences of  p  ind  x x = a bba c bbba babacab b a bba 17 6 1
String distance ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
k-Approximate matching Given string  x  and pattern  p  find all indices in  x  where: i i+h d( x [ i..i+h ], p )  ≤   k d(   ,  )  ≤   k Generic problem for the various distance functions d
Generic Algorithm for  i=1..| x |: if  d( x [i..i'], p ) <= k for some i'>i: report match at i i i' Time usage: O(n 2   ∙  “time to calculate distance”) (But see Sect. 10.1 for a O(nm) dynamic programming algorithm that works for most distance functions)
The k-mismatch problem Let, for strings  x  and  y , | x |=| y |= n ,  d( x , y ) = |{i | i=1..n,  x [i] ≠ y [i]}| (the  Hamming  distance) The  k -mismatch problem: Given string  x  and pattern  p  find all indices in  x  where: i i+m-1 d( x [ i..i+m-1 ], p )  ≤   k d(   ,  )  ≤   k
Simple k-mismatch algorithm for  i=1..| x |: count = 0 for  j=1..| p |: if   x [i]!= p [j]: count = count + 1 if  count <= k: report match at i Time usage: O(| x || p| )
Bit-vector approach ,[object Object],[object Object],[object Object],[object Object],i j
Using state vector  s Notice that  s  holds information about more than one comparison! j s [j]==l j' s [j']==h Conceptually,  p  is positioned | p | places along  x : s  tries to match  p  at positions   i -| p |+1 ..  i
Using state vector  s When  s [| p |] ≤ k  and  i ≥ | p |, we have an occurrence of  p  in  x  at  i -| p |+1 s [| p |] <= k i i-| p |+1
Example: 2-mismatch x = babacbbbababacabbbba i=0 p = bbba s = 000  000 000 000 s [1]==0 p = bbba p = bbba p = bbba s [2]==0 s [3]==0 s [4]==0
Example: 2-mismatch x = babacbbbababacabbbba i=1 p = b bba s =000   000 000 000 s [1]==0 p = bbba p = bbba p = bbba s [2]==0 s [3]==0 s [4]==0
Example: 2-mismatch x = babacbbbababacabbbba i=2 p = b bba s =001  001  000 000 s [1]==1 p = b b ba p = bbba p = bbba s [2]==1 s [3]==0 s [4]==0
Example: 2-mismatch x = babacbbbababacabbbba i=3 p = b bba s =000  001 001  000 s [1]==0 p = b b ba p = b b b a p = bbba s [2]==1 s [3]==1 s [4]==0
Example: 2-mismatch x = babacbbbababacabbbba i=4 p = b bba s =001  001 010  001 s [1]==1 p = b b ba p = b b b a p = b b ba s [2]==1 s [3]==2 s [4]==1 Match at i-4+1=1
Example: 2-mismatch x = babacbbbababacabbbba i=5 p = b bba s =001  010 010 011 s [1]==1 p = bb ba p = b bb a p = b b ba s [2]==2 s [3]==2 s [4]==3
Example: 2-mismatch x = babacbbbababacabbbba i=6 p = b bba s =000  001 010 011 s [1]==0 p = b b ba p = bb b a p = b bba s [2]==1 s [3]==2 s [4]==3
Example: 2-mismatch x = babacbbbababacabbbba i=7 p = b bba s =000  000 001 011 s [1]==0 p = bb ba p = b bb a p = bb b a s [2]==0 s [3]==1 s [4]==3
Example: 2-mismatch x = babacbbbababacabbbba i=8 p = b bba s =000  000 001  010 s [1]==0 p = bb ba p = bbb a p = b bb a s [2]==0 s [3]==0 s [4]==2 Match at i-4+1=5
Example: 2-mismatch x = babacbbbababacabbbba i=9 p = b bba s =001  001 001  000 s [1]==1 p = b b ba p = bb b a p = bbba s [2]==1 s [3]==1 s [4]==0 Match at i-4+1=6
Constructing  s Let  s i  be the state vector in iteration  i . Then  s i [ j ]= s i-1 [ j -1] +  t where  t =0 if  p [ j ]= x [ i ] and  t =1 otherwise i j s i-1 [j-1] Special case:  s i [0] = 0 for all  i
Table t The number  t ,  where  t =0 if  p [ j ]= x [ i ] and  t =1 otherwise can be pre-calculated and stored in a matrix: t [h,j] = 0 log(| p |+1) if  p [j] == h 0 log(| p |+1)-1 1 if  p [j] != h with rows indexed by the alphabet and columns indexed by indices in  p , using log(| p |+1) bits per cell.
Table t Why use log(| p |+1) bits per cell? To be able to add  t [h] and  s  a word at a time, silly! s i-1 : t [h]: s i : + + 0 + + + + + + + + + + + + words
Table t for pattern p=bbba 1  2  3  4 t ['a']:  001 001 001 000 t ['b']:  000 000 000 001 t ['c']:  001 001 001 001
Example x = babacbbbababacabbbba i=0 p = bbba s =0 00 000 000 000 000 s 0 [1]==0 p = bbba p = bbba p = bbba s 0 [2]==0 s 0 [3]==0 s 0 [4]==0 s 0 [0]==0
Example x = babacbbbababacabbbba i=1 p = b bba   0 00 000 000 000 001 s 1 [1]==0 p = bbba p = bbba p = bbba s 1 [2]==0 s 1 [3]==0 s 1 [4]==1 s 1 [0]==0 t [b] 0 + 0 + 0 + 1 + s 0 [1]==0 s 0 [2]==0 s 0 [3]==0 s 0 [4]==0 s 0 [0]==0 s 1  =   0 00 000 000 000 000 s 0  =   0 00 000 000 001 t [b] =
Example x = babacbbbababacabbbba i=2 p = b bba   0 00 001 001 001 000 s 2 [1]==1 p = b b ba p = bbba p = bbba s 2 [2]==1 s 2 [3]==1 s 2 [4]==0 s 2 [0]==0 t [a] 1 + 1 + 1 + 0 + s 1 [1]==0 s 1 [2]==0 s 1 [3]==0 s 1 [4]==1 s 1 [0]==0 s 2  =   0 00 000 000 000 001 s 1  =   0 01 001 001 000 t [a] =
Example x = babacbbbababacabbbba i=3 p = b bba   0 00 000 001 001 010 s 3 [1]==0 p = b b ba p = b b b a p = bbba s 3 [2]==1 s 3 [3]==1 s 3 [4]==2 s 3 [0]==0 t [b] 0 + 0 + 0 + 1 + s 2 [1]==1 s 2 [2]==1 s 2 [3]==1 s 2 [4]==0 s 2 [0]==0 s 3  =   0 00 001 001 001 000 s 2  =   0 00 000 000 001 t [b] =
Example x = babacbbbababacabbbba i=4 p = b bba   0 00 001 001 010  001 s 4 [1]==1 p = b b ba p = b b b a p = b b ba s 4 [2]==1 s 4 [3]==2 s 4 [4]==1 s 4 [0]==0 t [a] 1 + 1 + 1 + 0 + s 3 [1]==0 s 3 [2]==1 s 3 [3]==1 s 3 [4]==2 s 3 [0]==0 s 4  =   0 00 000 001 001 010 s 3  =   0 01 001 001 000 t [a] = MATCH!
Example x = babacbbbababacabbbba i=5 p = b bba   0 00 001 010 010 011 s 5 [1]==1 p = bb ba p = b bb a p = b b ba s 5 [2]==2 s 5 [3]==2 s 5 [4]==3 s 5 [0]==0 t [c] 1 + 1 + 1 + 1 + s 4 [1]==1 s 4 [2]==1 s 4 [3]==2 s 4 [4]==1 s 4 [0]==0 s 5  =   0 00 001 001 010 001 s 4  =   0 01 001 001 001 t [c] =
Example x = babacbbbababacabbbba i=6 p = b bba   0 00 000 001 010 011 s 6 [1]==0 p = b b ba p = bb b a p = b bba s 6 [2]==1 s 6 [3]==2 s 6 [4]==3 s 6 [0]==0 t [b] 0 + 0 + 0 + 1 + s 5 [1]==1 s 5 [2]==2 s 5 [3]==2 s 5 [4]==3 s 5 [0]==0 s 6  =   0 00 001 010 010 011 s 5  =   0 00 000 000 001 t [b] =
Algorithm Preprocessing: b = log(| p |+1) for  h  in     and  j=1..| p |: t [h,j] = 0 b-1 1 for  j=1..| p |: t [ p [j],j] = 0 b s  = 0 b(| p |+1) Main: for  i=1..| x |: s  = ( s  >> b) +  t [ x [i]] if   s [| p |]<=k and i>=| p |: report i-| p |+1 as match
Runtime complexity ,[object Object],[object Object]
Improvements When  k   «  | p |, we are wasting bits (and time)! We count all mismatches, but we only need to know if there are more than  k  or not. We can replace the log(| p |+1) bit cells in  s  with log( k +1)+1 (for numbers 0..k and an overflow bit) and an additional vector,  o , remembering overflows
Updated operation We update the operation s  = ( s  >> b) +  t [ x [i]] where b=log(| p|+1), to  be s  = ( s  >> b) +  t [ x [i]] o  = ( o  >> b) | ( s  & OF_MASK) s  =  s  & NEG_OF_MASK where b=log( k +1)+1 and OF_MASK = (10 b-1 ) | p | NEG_OF_MASK = ~OF_MASK = (01 b ) | p |
Algorithm Preprocessing: b = log(k+1)+1 OF_MASK =  (10 b-1 ) | p |  ; NEG_OF_MASK = ~OF_MASK for  h  in     and  j=1..| p |:  t [h,j] = 0 b-1 1 for  j=1..| p |:  t [ p [j],j] = 0 b s  = 0 b(| p |+1)  ;  o  = 0 b(| p |+1) Main: for  i=1..| x |: s  = ( s  >> b) +  t [ x [i]] o  = ( o  >> b) | ( s  & OF_MASK) s  =  s  & NEG_OF_MASK if   s [| p |] +  o [| p |] <= k  and i>=| p |: report i-| p |+1 as match
Example x = babacacbababacabbbba i=0 p = bbba p = bbba p = bbba p = bbba s 0 == 000 000 000 000 000 o 0 == 000 000 000 000 000
Example x = babacacbababacabbbba i=1 p = b bba p = bbba p = bbba p = bbba ( s 0  >> b) == 000 000 000 000 000 t ['b'] ==  000 000 000 001 s 1' == 000 000 000 000 001 ( o 0  >> b) == 000 000 000 000 000 ( s 1'  & OF_MASK) == 000 000 000 000 000 o 1 == 000 000 000 000 000 s 1 == 000 000 000 000 001
Example x = babacacbababacabbbba i=2 p = b bba p = b b ba p = bbba p = bbba ( s 1  >> b) == 000 000 000 000 000 t ['a'] ==  001 001 001 000 s 2' == 000 001 001 001 000 ( o 1  >> b) == 000 000 000 000 000 ( s 2'  & OF_MASK) == 000 000 000 000 000 o 2 == 000 000 000 000 000 s 2 == 000 001 001 001 000
Example x = babacacbababacabbbba i=3 p = b bba p = b b ba p = b b b a p = bbba ( s 2  >> b) == 000 000 001 001 001 t ['b'] ==  000 000 000 001 s 3' == 000 000 001 001 010 ( o 2  >> b) == 000 000 000 000 000 ( s 3'  & OF_MASK) == 000 000 000 000 000 o 3 == 000 000 000 000 000 s 3 == 000 000 001 001 010
Example x = babacacbababacabbbba i=4 p = b bba p = b b ba p = b b b a p = b b ba ( s 3  >> b) == 000 000 000 001 001 t ['a'] ==  001 001 001 000 s 4' == 000 001 001 010 001 ( o 3  >> b) == 000 000 000 000 000 ( s 4'  & OF_MASK) == 000 000 000 000 000 o 4 == 000 000 000 000  000 s 4 == 000 001 001 010  001 MATCH!
Example x = babacacbababacabbbba i=5 p = b bba p = bb ba p = b bb a p = b b ba ( s 4  >> b) == 000 000 001 001 010 t ['c'] ==  001 001 001 001 s 5' == 000 001 010 010 011 ( o 4  >> b) == 000 000 000 000 000 ( s 5'  & OF_MASK) == 000 000 000 000 000 o 5 == 000 000 000 000 000 s 5 == 000 001 010 010 011
Example x = babacacbababacabbbba i=6 p = b bba p = bb ba p = bbb a p = b bb a ( s 5  >> b) == 000 000 001 010 010 t ['a'] ==  001 001 001 000 s 6' == 000 001 010 011 010 ( o 5  >> b) == 000 000 000 000 000 ( s 6'  & OF_MASK) == 000 000 000 000 000 o 6 == 000 000 000 000  000 s 6 == 000 001 010 011  010 MATCH!
Example x = babacacbababacabbbba i=7 p = b bba p = bb ba p = bbb a p = bbba ( s 6  >> b) == 000 000 001 010 011 t ['c'] ==  001 001 001 001 s 7' == 000 001 010 011 100 ( o 6  >> b) == 000 000 000 000 000 ( s 7'  & OF_MASK) == 000 000 000 000 100 o 7 == 000 000 000 000 100 s 7 == 000 001 010 011 000
Example x = babacacbababacabbbba i=8 p = b bba p = b b ba p = bb b a p = bbba ( s 7  >> b) == 000 000 001 010 011 t ['b'] ==  000 000 000 001 s 8' == 000 000 001 010 100 ( o 7  >> b) == 000 000 000 000 000 ( s 8'  & OF_MASK) == 000 000 000 000 100 o 8 == 000 000 000 000 100 s 8 == 000 000 001 010 000
Time complexity ,[object Object],[object Object]

More Related Content

What's hot

2016 10 mathematics_sample_paper_sa2_04_ans_w9tafd6
2016 10 mathematics_sample_paper_sa2_04_ans_w9tafd62016 10 mathematics_sample_paper_sa2_04_ans_w9tafd6
2016 10 mathematics_sample_paper_sa2_04_ans_w9tafd6Shreya Sharma
 
5 algebra of functions
5 algebra of functions5 algebra of functions
5 algebra of functionsTzenma
 
How to Find a Cartesian Product
How to Find a Cartesian ProductHow to Find a Cartesian Product
How to Find a Cartesian ProductDon Sevcik
 
1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers
1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers
1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting IntegersGogely The Great
 
4.6 more on log and exponential equations t
4.6 more on log and exponential equations t4.6 more on log and exponential equations t
4.6 more on log and exponential equations tmath260
 
1 ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.
1  ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.1  ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.
1 ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.Gogely The Great
 
Statistics assignment 4
Statistics assignment 4Statistics assignment 4
Statistics assignment 4Ishaq Ahmed
 
Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...
Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...
Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...Abu Bakar Soomro
 
Algebra cheat sheet
Algebra cheat sheetAlgebra cheat sheet
Algebra cheat sheetalex9803
 
2014 06 22_prml_2_4
2014 06 22_prml_2_42014 06 22_prml_2_4
2014 06 22_prml_2_4yakuzen
 
Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892drayertaurus
 
Melaka 1 ans 2010
Melaka 1 ans 2010Melaka 1 ans 2010
Melaka 1 ans 2010sooklai
 
Set theory solutions
Set theory solutionsSet theory solutions
Set theory solutionsGarden City
 
Tugas matematika kelompok
Tugas matematika kelompokTugas matematika kelompok
Tugas matematika kelompokachmadtrybuana
 
2.3 slopes and difference quotient t
2.3 slopes and difference quotient t2.3 slopes and difference quotient t
2.3 slopes and difference quotient tmath260
 
On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2
On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2
On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2eSAT Journals
 

What's hot (18)

2016 10 mathematics_sample_paper_sa2_04_ans_w9tafd6
2016 10 mathematics_sample_paper_sa2_04_ans_w9tafd62016 10 mathematics_sample_paper_sa2_04_ans_w9tafd6
2016 10 mathematics_sample_paper_sa2_04_ans_w9tafd6
 
5 algebra of functions
5 algebra of functions5 algebra of functions
5 algebra of functions
 
How to Find a Cartesian Product
How to Find a Cartesian ProductHow to Find a Cartesian Product
How to Find a Cartesian Product
 
1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers
1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers
1 ESO - Unit 04 - Exercises 1.4.2. - Adding and Subtracting Integers
 
4.6 more on log and exponential equations t
4.6 more on log and exponential equations t4.6 more on log and exponential equations t
4.6 more on log and exponential equations t
 
1 ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.
1  ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.1  ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.
1 ESO - Unit 04 - Exercises 1.4.3. - Integer Multiplication and Division.
 
Statistics assignment 4
Statistics assignment 4Statistics assignment 4
Statistics assignment 4
 
Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...
Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...
Set Operations Topic Including Union , Intersection, Disjoint etc De Morgans ...
 
Chapter activity plus-in-mathematics-9
Chapter activity plus-in-mathematics-9Chapter activity plus-in-mathematics-9
Chapter activity plus-in-mathematics-9
 
Algebra cheat sheet
Algebra cheat sheetAlgebra cheat sheet
Algebra cheat sheet
 
2014 06 22_prml_2_4
2014 06 22_prml_2_42014 06 22_prml_2_4
2014 06 22_prml_2_4
 
Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892
 
Melaka 1 ans 2010
Melaka 1 ans 2010Melaka 1 ans 2010
Melaka 1 ans 2010
 
Set theory solutions
Set theory solutionsSet theory solutions
Set theory solutions
 
Tugas matematika kelompok
Tugas matematika kelompokTugas matematika kelompok
Tugas matematika kelompok
 
Probability And Random Variable Lecture(3)
Probability And Random Variable Lecture(3)Probability And Random Variable Lecture(3)
Probability And Random Variable Lecture(3)
 
2.3 slopes and difference quotient t
2.3 slopes and difference quotient t2.3 slopes and difference quotient t
2.3 slopes and difference quotient t
 
On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2
On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2
On homogeneous biquadratic diophantineequation x4 y4=17(z2-w2)r2
 

Similar to Approximate Matching (String Algorithms 2007)

Tugas akhir modul 5 bilangan eva novianawati h.
Tugas akhir modul 5 bilangan eva novianawati h.Tugas akhir modul 5 bilangan eva novianawati h.
Tugas akhir modul 5 bilangan eva novianawati h.MyWife humaeroh
 
Kuliah 10 Vektor.pptx
Kuliah 10 Vektor.pptxKuliah 10 Vektor.pptx
Kuliah 10 Vektor.pptxZaenurRohman7
 
Statistics assignment 7
Statistics assignment 7Statistics assignment 7
Statistics assignment 7Ishaq Ahmed
 
Form 5 Additional Maths Note
Form 5 Additional Maths NoteForm 5 Additional Maths Note
Form 5 Additional Maths NoteChek Wei Tan
 
R eksponen&logaritma
R eksponen&logaritmaR eksponen&logaritma
R eksponen&logaritmaJoe Zidane
 
Matrix chain multiplication
Matrix chain multiplicationMatrix chain multiplication
Matrix chain multiplicationKiran K
 
Ejercicios resueltos de analisis matematico 1
Ejercicios resueltos de analisis matematico 1Ejercicios resueltos de analisis matematico 1
Ejercicios resueltos de analisis matematico 1tinardo
 
MT T4 (Bab 3: Fungsi Kuadratik)
MT T4 (Bab 3: Fungsi Kuadratik)MT T4 (Bab 3: Fungsi Kuadratik)
MT T4 (Bab 3: Fungsi Kuadratik)hasnulslides
 
cps170_bayes_nets.ppt
cps170_bayes_nets.pptcps170_bayes_nets.ppt
cps170_bayes_nets.pptFaizAbaas
 
Truth, deduction, computation lecture g
Truth, deduction, computation   lecture gTruth, deduction, computation   lecture g
Truth, deduction, computation lecture gVlad Patryshev
 
101 math short cuts [www.onlinebcs.com]
101 math short cuts [www.onlinebcs.com]101 math short cuts [www.onlinebcs.com]
101 math short cuts [www.onlinebcs.com]Itmona
 
Assessments for class xi
Assessments  for class  xi Assessments  for class  xi
Assessments for class xi indu psthakur
 
Real number system full
Real  number  system fullReal  number  system full
Real number system fullAon Narinchoti
 
Real number system full
Real  number  system fullReal  number  system full
Real number system fullAon Narinchoti
 
GCSE-CompletingTheSquare.pptx
GCSE-CompletingTheSquare.pptxGCSE-CompletingTheSquare.pptx
GCSE-CompletingTheSquare.pptxMitaDurenSawit
 

Similar to Approximate Matching (String Algorithms 2007) (20)

Tugas akhir modul 5 bilangan eva novianawati h.
Tugas akhir modul 5 bilangan eva novianawati h.Tugas akhir modul 5 bilangan eva novianawati h.
Tugas akhir modul 5 bilangan eva novianawati h.
 
Peretmuan iii iv sistem bilangan ( bagian pertama )
Peretmuan iii iv sistem bilangan ( bagian pertama )Peretmuan iii iv sistem bilangan ( bagian pertama )
Peretmuan iii iv sistem bilangan ( bagian pertama )
 
Kuliah 10 Vektor.pptx
Kuliah 10 Vektor.pptxKuliah 10 Vektor.pptx
Kuliah 10 Vektor.pptx
 
Statistics assignment 7
Statistics assignment 7Statistics assignment 7
Statistics assignment 7
 
Form 5 Additional Maths Note
Form 5 Additional Maths NoteForm 5 Additional Maths Note
Form 5 Additional Maths Note
 
Appt and reasoning
Appt and reasoningAppt and reasoning
Appt and reasoning
 
Peretmuan iii iv sistem bilangan
Peretmuan iii iv sistem bilanganPeretmuan iii iv sistem bilangan
Peretmuan iii iv sistem bilangan
 
R eksponen&logaritma
R eksponen&logaritmaR eksponen&logaritma
R eksponen&logaritma
 
Matrix chain multiplication
Matrix chain multiplicationMatrix chain multiplication
Matrix chain multiplication
 
Ejercicios resueltos de analisis matematico 1
Ejercicios resueltos de analisis matematico 1Ejercicios resueltos de analisis matematico 1
Ejercicios resueltos de analisis matematico 1
 
MT T4 (Bab 3: Fungsi Kuadratik)
MT T4 (Bab 3: Fungsi Kuadratik)MT T4 (Bab 3: Fungsi Kuadratik)
MT T4 (Bab 3: Fungsi Kuadratik)
 
cps170_bayes_nets.ppt
cps170_bayes_nets.pptcps170_bayes_nets.ppt
cps170_bayes_nets.ppt
 
V2.0
V2.0V2.0
V2.0
 
Truth, deduction, computation lecture g
Truth, deduction, computation   lecture gTruth, deduction, computation   lecture g
Truth, deduction, computation lecture g
 
101 math short cuts [www.onlinebcs.com]
101 math short cuts [www.onlinebcs.com]101 math short cuts [www.onlinebcs.com]
101 math short cuts [www.onlinebcs.com]
 
Xi ipa 2
Xi ipa 2Xi ipa 2
Xi ipa 2
 
Assessments for class xi
Assessments  for class  xi Assessments  for class  xi
Assessments for class xi
 
Real number system full
Real  number  system fullReal  number  system full
Real number system full
 
Real number system full
Real  number  system fullReal  number  system full
Real number system full
 
GCSE-CompletingTheSquare.pptx
GCSE-CompletingTheSquare.pptxGCSE-CompletingTheSquare.pptx
GCSE-CompletingTheSquare.pptx
 

More from mailund

Chapter 9 divide and conquer handouts with notes
Chapter 9   divide and conquer handouts with notesChapter 9   divide and conquer handouts with notes
Chapter 9 divide and conquer handouts with notesmailund
 
Chapter 9 divide and conquer handouts
Chapter 9   divide and conquer handoutsChapter 9   divide and conquer handouts
Chapter 9 divide and conquer handoutsmailund
 
Chapter 9 divide and conquer
Chapter 9   divide and conquerChapter 9   divide and conquer
Chapter 9 divide and conquermailund
 
Chapter 7 recursion handouts with notes
Chapter 7   recursion handouts with notesChapter 7   recursion handouts with notes
Chapter 7 recursion handouts with notesmailund
 
Chapter 7 recursion handouts
Chapter 7   recursion handoutsChapter 7   recursion handouts
Chapter 7 recursion handoutsmailund
 
Chapter 7 recursion
Chapter 7   recursionChapter 7   recursion
Chapter 7 recursionmailund
 
Chapter 5 searching and sorting handouts with notes
Chapter 5   searching and sorting handouts with notesChapter 5   searching and sorting handouts with notes
Chapter 5 searching and sorting handouts with notesmailund
 
Chapter 5 searching and sorting handouts
Chapter 5   searching and sorting handoutsChapter 5   searching and sorting handouts
Chapter 5 searching and sorting handoutsmailund
 
Chapter 5 searching and sorting
Chapter 5   searching and sortingChapter 5   searching and sorting
Chapter 5 searching and sortingmailund
 
Chapter 4 algorithmic efficiency handouts (with notes)
Chapter 4   algorithmic efficiency handouts (with notes)Chapter 4   algorithmic efficiency handouts (with notes)
Chapter 4 algorithmic efficiency handouts (with notes)mailund
 
Chapter 4 algorithmic efficiency handouts
Chapter 4   algorithmic efficiency handoutsChapter 4   algorithmic efficiency handouts
Chapter 4 algorithmic efficiency handoutsmailund
 
Chapter 4 algorithmic efficiency
Chapter 4   algorithmic efficiencyChapter 4   algorithmic efficiency
Chapter 4 algorithmic efficiencymailund
 
Chapter 3 introduction to algorithms slides
Chapter 3 introduction to algorithms slidesChapter 3 introduction to algorithms slides
Chapter 3 introduction to algorithms slidesmailund
 
Chapter 3 introduction to algorithms handouts (with notes)
Chapter 3 introduction to algorithms handouts (with notes)Chapter 3 introduction to algorithms handouts (with notes)
Chapter 3 introduction to algorithms handouts (with notes)mailund
 
Chapter 3 introduction to algorithms handouts
Chapter 3 introduction to algorithms handoutsChapter 3 introduction to algorithms handouts
Chapter 3 introduction to algorithms handoutsmailund
 
Ku 05 08 2009
Ku 05 08 2009Ku 05 08 2009
Ku 05 08 2009mailund
 
Association mapping using local genealogies
Association mapping using local genealogiesAssociation mapping using local genealogies
Association mapping using local genealogiesmailund
 
Neural Networks
Neural NetworksNeural Networks
Neural Networksmailund
 
Probability And Stats Intro
Probability And Stats IntroProbability And Stats Intro
Probability And Stats Intromailund
 
Probability And Stats Intro2
Probability And Stats Intro2Probability And Stats Intro2
Probability And Stats Intro2mailund
 

More from mailund (20)

Chapter 9 divide and conquer handouts with notes
Chapter 9   divide and conquer handouts with notesChapter 9   divide and conquer handouts with notes
Chapter 9 divide and conquer handouts with notes
 
Chapter 9 divide and conquer handouts
Chapter 9   divide and conquer handoutsChapter 9   divide and conquer handouts
Chapter 9 divide and conquer handouts
 
Chapter 9 divide and conquer
Chapter 9   divide and conquerChapter 9   divide and conquer
Chapter 9 divide and conquer
 
Chapter 7 recursion handouts with notes
Chapter 7   recursion handouts with notesChapter 7   recursion handouts with notes
Chapter 7 recursion handouts with notes
 
Chapter 7 recursion handouts
Chapter 7   recursion handoutsChapter 7   recursion handouts
Chapter 7 recursion handouts
 
Chapter 7 recursion
Chapter 7   recursionChapter 7   recursion
Chapter 7 recursion
 
Chapter 5 searching and sorting handouts with notes
Chapter 5   searching and sorting handouts with notesChapter 5   searching and sorting handouts with notes
Chapter 5 searching and sorting handouts with notes
 
Chapter 5 searching and sorting handouts
Chapter 5   searching and sorting handoutsChapter 5   searching and sorting handouts
Chapter 5 searching and sorting handouts
 
Chapter 5 searching and sorting
Chapter 5   searching and sortingChapter 5   searching and sorting
Chapter 5 searching and sorting
 
Chapter 4 algorithmic efficiency handouts (with notes)
Chapter 4   algorithmic efficiency handouts (with notes)Chapter 4   algorithmic efficiency handouts (with notes)
Chapter 4 algorithmic efficiency handouts (with notes)
 
Chapter 4 algorithmic efficiency handouts
Chapter 4   algorithmic efficiency handoutsChapter 4   algorithmic efficiency handouts
Chapter 4 algorithmic efficiency handouts
 
Chapter 4 algorithmic efficiency
Chapter 4   algorithmic efficiencyChapter 4   algorithmic efficiency
Chapter 4 algorithmic efficiency
 
Chapter 3 introduction to algorithms slides
Chapter 3 introduction to algorithms slidesChapter 3 introduction to algorithms slides
Chapter 3 introduction to algorithms slides
 
Chapter 3 introduction to algorithms handouts (with notes)
Chapter 3 introduction to algorithms handouts (with notes)Chapter 3 introduction to algorithms handouts (with notes)
Chapter 3 introduction to algorithms handouts (with notes)
 
Chapter 3 introduction to algorithms handouts
Chapter 3 introduction to algorithms handoutsChapter 3 introduction to algorithms handouts
Chapter 3 introduction to algorithms handouts
 
Ku 05 08 2009
Ku 05 08 2009Ku 05 08 2009
Ku 05 08 2009
 
Association mapping using local genealogies
Association mapping using local genealogiesAssociation mapping using local genealogies
Association mapping using local genealogies
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Probability And Stats Intro
Probability And Stats IntroProbability And Stats Intro
Probability And Stats Intro
 
Probability And Stats Intro2
Probability And Stats Intro2Probability And Stats Intro2
Probability And Stats Intro2
 

Recently uploaded

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

Approximate Matching (String Algorithms 2007)

  • 1. Approximate pattern matching Given string x = abbacbbbababacabbbba and pattern p = bbba find all “almost”-occurrences of p ind x x = a bba c bbba babacab b a bba 17 6 1
  • 2.
  • 3. k-Approximate matching Given string x and pattern p find all indices in x where: i i+h d( x [ i..i+h ], p ) ≤ k d( , ) ≤ k Generic problem for the various distance functions d
  • 4. Generic Algorithm for i=1..| x |: if d( x [i..i'], p ) <= k for some i'>i: report match at i i i' Time usage: O(n 2 ∙ “time to calculate distance”) (But see Sect. 10.1 for a O(nm) dynamic programming algorithm that works for most distance functions)
  • 5. The k-mismatch problem Let, for strings x and y , | x |=| y |= n , d( x , y ) = |{i | i=1..n, x [i] ≠ y [i]}| (the Hamming distance) The k -mismatch problem: Given string x and pattern p find all indices in x where: i i+m-1 d( x [ i..i+m-1 ], p ) ≤ k d( , ) ≤ k
  • 6. Simple k-mismatch algorithm for i=1..| x |: count = 0 for j=1..| p |: if x [i]!= p [j]: count = count + 1 if count <= k: report match at i Time usage: O(| x || p| )
  • 7.
  • 8. Using state vector s Notice that s holds information about more than one comparison! j s [j]==l j' s [j']==h Conceptually, p is positioned | p | places along x : s tries to match p at positions i -| p |+1 .. i
  • 9. Using state vector s When s [| p |] ≤ k and i ≥ | p |, we have an occurrence of p in x at i -| p |+1 s [| p |] <= k i i-| p |+1
  • 10. Example: 2-mismatch x = babacbbbababacabbbba i=0 p = bbba s = 000 000 000 000 s [1]==0 p = bbba p = bbba p = bbba s [2]==0 s [3]==0 s [4]==0
  • 11. Example: 2-mismatch x = babacbbbababacabbbba i=1 p = b bba s =000 000 000 000 s [1]==0 p = bbba p = bbba p = bbba s [2]==0 s [3]==0 s [4]==0
  • 12. Example: 2-mismatch x = babacbbbababacabbbba i=2 p = b bba s =001 001 000 000 s [1]==1 p = b b ba p = bbba p = bbba s [2]==1 s [3]==0 s [4]==0
  • 13. Example: 2-mismatch x = babacbbbababacabbbba i=3 p = b bba s =000 001 001 000 s [1]==0 p = b b ba p = b b b a p = bbba s [2]==1 s [3]==1 s [4]==0
  • 14. Example: 2-mismatch x = babacbbbababacabbbba i=4 p = b bba s =001 001 010 001 s [1]==1 p = b b ba p = b b b a p = b b ba s [2]==1 s [3]==2 s [4]==1 Match at i-4+1=1
  • 15. Example: 2-mismatch x = babacbbbababacabbbba i=5 p = b bba s =001 010 010 011 s [1]==1 p = bb ba p = b bb a p = b b ba s [2]==2 s [3]==2 s [4]==3
  • 16. Example: 2-mismatch x = babacbbbababacabbbba i=6 p = b bba s =000 001 010 011 s [1]==0 p = b b ba p = bb b a p = b bba s [2]==1 s [3]==2 s [4]==3
  • 17. Example: 2-mismatch x = babacbbbababacabbbba i=7 p = b bba s =000 000 001 011 s [1]==0 p = bb ba p = b bb a p = bb b a s [2]==0 s [3]==1 s [4]==3
  • 18. Example: 2-mismatch x = babacbbbababacabbbba i=8 p = b bba s =000 000 001 010 s [1]==0 p = bb ba p = bbb a p = b bb a s [2]==0 s [3]==0 s [4]==2 Match at i-4+1=5
  • 19. Example: 2-mismatch x = babacbbbababacabbbba i=9 p = b bba s =001 001 001 000 s [1]==1 p = b b ba p = bb b a p = bbba s [2]==1 s [3]==1 s [4]==0 Match at i-4+1=6
  • 20. Constructing s Let s i be the state vector in iteration i . Then s i [ j ]= s i-1 [ j -1] + t where t =0 if p [ j ]= x [ i ] and t =1 otherwise i j s i-1 [j-1] Special case: s i [0] = 0 for all i
  • 21. Table t The number t , where t =0 if p [ j ]= x [ i ] and t =1 otherwise can be pre-calculated and stored in a matrix: t [h,j] = 0 log(| p |+1) if p [j] == h 0 log(| p |+1)-1 1 if p [j] != h with rows indexed by the alphabet and columns indexed by indices in p , using log(| p |+1) bits per cell.
  • 22. Table t Why use log(| p |+1) bits per cell? To be able to add t [h] and s a word at a time, silly! s i-1 : t [h]: s i : + + 0 + + + + + + + + + + + + words
  • 23. Table t for pattern p=bbba 1 2 3 4 t ['a']: 001 001 001 000 t ['b']: 000 000 000 001 t ['c']: 001 001 001 001
  • 24. Example x = babacbbbababacabbbba i=0 p = bbba s =0 00 000 000 000 000 s 0 [1]==0 p = bbba p = bbba p = bbba s 0 [2]==0 s 0 [3]==0 s 0 [4]==0 s 0 [0]==0
  • 25. Example x = babacbbbababacabbbba i=1 p = b bba 0 00 000 000 000 001 s 1 [1]==0 p = bbba p = bbba p = bbba s 1 [2]==0 s 1 [3]==0 s 1 [4]==1 s 1 [0]==0 t [b] 0 + 0 + 0 + 1 + s 0 [1]==0 s 0 [2]==0 s 0 [3]==0 s 0 [4]==0 s 0 [0]==0 s 1 = 0 00 000 000 000 000 s 0 = 0 00 000 000 001 t [b] =
  • 26. Example x = babacbbbababacabbbba i=2 p = b bba 0 00 001 001 001 000 s 2 [1]==1 p = b b ba p = bbba p = bbba s 2 [2]==1 s 2 [3]==1 s 2 [4]==0 s 2 [0]==0 t [a] 1 + 1 + 1 + 0 + s 1 [1]==0 s 1 [2]==0 s 1 [3]==0 s 1 [4]==1 s 1 [0]==0 s 2 = 0 00 000 000 000 001 s 1 = 0 01 001 001 000 t [a] =
  • 27. Example x = babacbbbababacabbbba i=3 p = b bba 0 00 000 001 001 010 s 3 [1]==0 p = b b ba p = b b b a p = bbba s 3 [2]==1 s 3 [3]==1 s 3 [4]==2 s 3 [0]==0 t [b] 0 + 0 + 0 + 1 + s 2 [1]==1 s 2 [2]==1 s 2 [3]==1 s 2 [4]==0 s 2 [0]==0 s 3 = 0 00 001 001 001 000 s 2 = 0 00 000 000 001 t [b] =
  • 28. Example x = babacbbbababacabbbba i=4 p = b bba 0 00 001 001 010 001 s 4 [1]==1 p = b b ba p = b b b a p = b b ba s 4 [2]==1 s 4 [3]==2 s 4 [4]==1 s 4 [0]==0 t [a] 1 + 1 + 1 + 0 + s 3 [1]==0 s 3 [2]==1 s 3 [3]==1 s 3 [4]==2 s 3 [0]==0 s 4 = 0 00 000 001 001 010 s 3 = 0 01 001 001 000 t [a] = MATCH!
  • 29. Example x = babacbbbababacabbbba i=5 p = b bba 0 00 001 010 010 011 s 5 [1]==1 p = bb ba p = b bb a p = b b ba s 5 [2]==2 s 5 [3]==2 s 5 [4]==3 s 5 [0]==0 t [c] 1 + 1 + 1 + 1 + s 4 [1]==1 s 4 [2]==1 s 4 [3]==2 s 4 [4]==1 s 4 [0]==0 s 5 = 0 00 001 001 010 001 s 4 = 0 01 001 001 001 t [c] =
  • 30. Example x = babacbbbababacabbbba i=6 p = b bba 0 00 000 001 010 011 s 6 [1]==0 p = b b ba p = bb b a p = b bba s 6 [2]==1 s 6 [3]==2 s 6 [4]==3 s 6 [0]==0 t [b] 0 + 0 + 0 + 1 + s 5 [1]==1 s 5 [2]==2 s 5 [3]==2 s 5 [4]==3 s 5 [0]==0 s 6 = 0 00 001 010 010 011 s 5 = 0 00 000 000 001 t [b] =
  • 31. Algorithm Preprocessing: b = log(| p |+1) for h in  and j=1..| p |: t [h,j] = 0 b-1 1 for j=1..| p |: t [ p [j],j] = 0 b s = 0 b(| p |+1) Main: for i=1..| x |: s = ( s >> b) + t [ x [i]] if s [| p |]<=k and i>=| p |: report i-| p |+1 as match
  • 32.
  • 33. Improvements When k « | p |, we are wasting bits (and time)! We count all mismatches, but we only need to know if there are more than k or not. We can replace the log(| p |+1) bit cells in s with log( k +1)+1 (for numbers 0..k and an overflow bit) and an additional vector, o , remembering overflows
  • 34. Updated operation We update the operation s = ( s >> b) + t [ x [i]] where b=log(| p|+1), to be s = ( s >> b) + t [ x [i]] o = ( o >> b) | ( s & OF_MASK) s = s & NEG_OF_MASK where b=log( k +1)+1 and OF_MASK = (10 b-1 ) | p | NEG_OF_MASK = ~OF_MASK = (01 b ) | p |
  • 35. Algorithm Preprocessing: b = log(k+1)+1 OF_MASK = (10 b-1 ) | p | ; NEG_OF_MASK = ~OF_MASK for h in  and j=1..| p |: t [h,j] = 0 b-1 1 for j=1..| p |: t [ p [j],j] = 0 b s = 0 b(| p |+1) ; o = 0 b(| p |+1) Main: for i=1..| x |: s = ( s >> b) + t [ x [i]] o = ( o >> b) | ( s & OF_MASK) s = s & NEG_OF_MASK if s [| p |] + o [| p |] <= k and i>=| p |: report i-| p |+1 as match
  • 36. Example x = babacacbababacabbbba i=0 p = bbba p = bbba p = bbba p = bbba s 0 == 000 000 000 000 000 o 0 == 000 000 000 000 000
  • 37. Example x = babacacbababacabbbba i=1 p = b bba p = bbba p = bbba p = bbba ( s 0 >> b) == 000 000 000 000 000 t ['b'] == 000 000 000 001 s 1' == 000 000 000 000 001 ( o 0 >> b) == 000 000 000 000 000 ( s 1' & OF_MASK) == 000 000 000 000 000 o 1 == 000 000 000 000 000 s 1 == 000 000 000 000 001
  • 38. Example x = babacacbababacabbbba i=2 p = b bba p = b b ba p = bbba p = bbba ( s 1 >> b) == 000 000 000 000 000 t ['a'] == 001 001 001 000 s 2' == 000 001 001 001 000 ( o 1 >> b) == 000 000 000 000 000 ( s 2' & OF_MASK) == 000 000 000 000 000 o 2 == 000 000 000 000 000 s 2 == 000 001 001 001 000
  • 39. Example x = babacacbababacabbbba i=3 p = b bba p = b b ba p = b b b a p = bbba ( s 2 >> b) == 000 000 001 001 001 t ['b'] == 000 000 000 001 s 3' == 000 000 001 001 010 ( o 2 >> b) == 000 000 000 000 000 ( s 3' & OF_MASK) == 000 000 000 000 000 o 3 == 000 000 000 000 000 s 3 == 000 000 001 001 010
  • 40. Example x = babacacbababacabbbba i=4 p = b bba p = b b ba p = b b b a p = b b ba ( s 3 >> b) == 000 000 000 001 001 t ['a'] == 001 001 001 000 s 4' == 000 001 001 010 001 ( o 3 >> b) == 000 000 000 000 000 ( s 4' & OF_MASK) == 000 000 000 000 000 o 4 == 000 000 000 000 000 s 4 == 000 001 001 010 001 MATCH!
  • 41. Example x = babacacbababacabbbba i=5 p = b bba p = bb ba p = b bb a p = b b ba ( s 4 >> b) == 000 000 001 001 010 t ['c'] == 001 001 001 001 s 5' == 000 001 010 010 011 ( o 4 >> b) == 000 000 000 000 000 ( s 5' & OF_MASK) == 000 000 000 000 000 o 5 == 000 000 000 000 000 s 5 == 000 001 010 010 011
  • 42. Example x = babacacbababacabbbba i=6 p = b bba p = bb ba p = bbb a p = b bb a ( s 5 >> b) == 000 000 001 010 010 t ['a'] == 001 001 001 000 s 6' == 000 001 010 011 010 ( o 5 >> b) == 000 000 000 000 000 ( s 6' & OF_MASK) == 000 000 000 000 000 o 6 == 000 000 000 000 000 s 6 == 000 001 010 011 010 MATCH!
  • 43. Example x = babacacbababacabbbba i=7 p = b bba p = bb ba p = bbb a p = bbba ( s 6 >> b) == 000 000 001 010 011 t ['c'] == 001 001 001 001 s 7' == 000 001 010 011 100 ( o 6 >> b) == 000 000 000 000 000 ( s 7' & OF_MASK) == 000 000 000 000 100 o 7 == 000 000 000 000 100 s 7 == 000 001 010 011 000
  • 44. Example x = babacacbababacabbbba i=8 p = b bba p = b b ba p = bb b a p = bbba ( s 7 >> b) == 000 000 001 010 011 t ['b'] == 000 000 000 001 s 8' == 000 000 001 010 100 ( o 7 >> b) == 000 000 000 000 000 ( s 8' & OF_MASK) == 000 000 000 000 100 o 8 == 000 000 000 000 100 s 8 == 000 000 001 010 000
  • 45.