Approximate pattern matching
Given the string x = abbacbbbababacabbbba and the pattern p = bbba, find all “almost”-occurrences of p in x.
[Figure: x with the almost-occurrences of p highlighted, labelled 1, 6 and 17.]
String distance
k-Approximate matching
Given a string x and a pattern p, find all indices i in x where d(x[i..i+h], p) ≤ k for some h.
This is the generic problem for the various distance functions d.
Generic Algorithm
for i = 1..|x|:
    if d(x[i..i'], p) <= k for some i' > i:
        report match at i
Time usage: O(n² ∙ “time to calculate distance”)
(But see Sect. 10.1 for an O(nm) dynamic programming algorithm that works for most distance functions.)
The k-mismatch problem
For strings x and y with |x| = |y| = n, let d(x, y) = |{i | i = 1..n, x[i] ≠ y[i]}| (the Hamming distance).
The k-mismatch problem: given a string x and a pattern p with |p| = m, find all indices i in x where d(x[i..i+m-1], p) ≤ k.
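The Hamming distance defined above can be sketched in a few lines of Python. The function name `hamming` is chosen here for illustration, and the sketch assumes that unequal lengths should raise an error rather than be padded.

```python
def hamming(x: str, y: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(x) != len(y):
        # The definition only covers |x| = |y|; rejecting other input is an assumption.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(x, y))
```

With p = bbba, the window baba is at distance 1 and the window cbbb is at distance 2, so both are 2-mismatch occurrences.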
Simple k-mismatch algorithm
for i = 1..|x|-|p|+1:
    count = 0
    for j = 1..|p|:
        if x[i+j-1] != p[j]:
            count = count + 1
    if count <= k:
        report match at i
Time usage: O(|x||p|)
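A runnable Python sketch of the simple algorithm, comparing each window x[i..i+|p|-1] against p. The name `naive_k_mismatch` is illustrative; positions are reported 1-based to agree with the slides.

```python
def naive_k_mismatch(x: str, p: str, k: int) -> list[int]:
    """Return all 1-based positions i where x[i..i+|p|-1] differs from p in at most k places."""
    n, m = len(x), len(p)
    matches = []
    for i in range(n - m + 1):          # 0-based start of the window
        count = 0
        for j in range(m):
            if x[i + j] != p[j]:
                count += 1
        if count <= k:
            matches.append(i + 1)       # convert to the slides' 1-based indexing
    return matches
```

It does O(|x||p|) character comparisons and serves as a reference to check the bit-vector variants against.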
Bit-vector approach
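Using state vector s
Notice that s holds information about more than one comparison: s[j] counts the mismatches between p[1..j] and x[i-j+1..i], so in the figure s[j] == l for one alignment of p and s[j'] == h for another.
Conceptually, p is positioned at |p| places along x: s tries to match p at positions i-|p|+1 .. i.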
Using state vector s
When s[|p|] ≤ k and i ≥ |p|, we have an occurrence of p in x at position i-|p|+1.
Example: 2-mismatch
x = babacbbbababacabbbba, p = bbba, k = 2. State vector after reading x[i] (3-bit cells; cells whose alignment would start before x[1] are shown as 0):
i   s[1] s[2] s[3] s[4]   s (bits)
0   0    0    0    0      000 000 000 000
1   0    0    0    0      000 000 000 000
2   1    1    0    0      001 001 000 000
3   0    1    1    0      000 001 001 000
4   1    1    2    1      001 001 010 001   Match at i-4+1 = 1
5   1    2    2    3      001 010 010 011
6   0    1    2    3      000 001 010 011
7   0    0    1    3      000 000 001 011
8   0    0    0    2      000 000 000 010   Match at i-4+1 = 5
9   1    1    1    0      001 001 001 000   Match at i-4+1 = 6
Constructing s
Let s_i be the state vector in iteration i. Then s_i[j] = s_{i-1}[j-1] + t, where t = 0 if p[j] = x[i] and t = 1 otherwise.
Special case: s_i[0] = 0 for all i.
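The recurrence can be tried out with an ordinary list before any bits are packed. This is a minimal sketch: `next_state` is a hypothetical helper name, and cell 0 is stored explicitly so that the list index equals j.

```python
def next_state(prev: list[int], p: str, c: str) -> list[int]:
    """Apply s_i[j] = s_{i-1}[j-1] + t for j = 1..|p|, with s_i[0] = 0.

    prev holds s_{i-1}[0..|p|]; c is the text character x[i].
    """
    m = len(p)
    cur = [0] * (m + 1)                      # cur[0] stays 0 (the special case)
    for j in range(1, m + 1):
        t = 0 if p[j - 1] == c else 1        # p is 0-based in Python, 1-based on the slides
        cur[j] = prev[j - 1] + t
    return cur


# Feed the first four characters of x = babacbbbababacabbbba through the recurrence.
state = [0] * 5
for ch in "baba":
    state = next_state(state, "bbba", ch)
```

After these four steps `state[4]` is 1, which is the single mismatch between baba and bbba.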
Table t
The number t, where t = 0 if p[j] = x[i] and t = 1 otherwise, can be pre-calculated and stored in a matrix:
t[h,j] = 0^(log(|p|+1))       if p[j] == h
t[h,j] = 0^(log(|p|+1)-1) 1   if p[j] != h
with rows indexed by the alphabet and columns indexed by indices in p, using log(|p|+1) bits per cell. (Here 0^n denotes n zero bits, and log is base 2, rounded up to a whole number of bits.)
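Table t
Why use log(|p|+1) bits per cell? To be able to add t[h] and s a word at a time, silly! Each cell holds a count of at most |p|, so no cell ever carries into its neighbour, and each word-wide addition updates all the cells in that word at once.
[Figure: s_{i-1} and t[h] added word by word to give s_i; cell 0 of s_i is 0.]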
Table t for pattern p = bbba
j:        1   2   3   4
t['a']:  001 001 001 000
t['b']:  000 000 000 001
t['c']:  001 001 001 001
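The rows of this table can be built as packed integers. The sketch below assumes a layout in which cell 1 is the most significant cell and cell |p| the least significant, which is how the rows are printed on the slide. `build_t` and `show_cells` are names made up for this example.

```python
def build_t(p: str, alphabet: str) -> tuple[dict[str, int], int]:
    """Return (t, b): t[h] packs one b-bit cell per pattern index, 1 where p[j] != h."""
    m = len(p)
    b = m.bit_length()                 # equals ceil(log2(m + 1)) for m >= 1
    t = {}
    for h in alphabet:
        word = 0
        for j in range(m):             # cell 1 first, so it ends up most significant
            word = (word << b) | (0 if p[j] == h else 1)
        t[h] = word
    return t, b


def show_cells(word: int, b: int, cells: int) -> str:
    """Format a packed word as space-separated b-bit cells."""
    bits = format(word, f"0{b * cells}b")
    return " ".join(bits[i:i + b] for i in range(0, len(bits), b))
```

For p = bbba and the alphabet abc, `show_cells(t[h], b, 4)` gives back the three rows listed above.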
Example
Same x = babacbbbababacabbbba and p = bbba as in the 2-mismatch example, with b = 3. Each step computes s_i = (s_{i-1} >> b) + t[x[i]]. Cells of s_i are listed in the order s_i[0] s_i[1] s_i[2] s_i[3] s_i[4]; t[x[i]] has cells 1..4 only.
i   x[i]   t[x[i]]           s_i
0   -      -                 000 000 000 000 000
1   b      000 000 000 001   000 000 000 000 001
2   a      001 001 001 000   000 001 001 001 000
3   b      000 000 000 001   000 000 001 001 010
4   a      001 001 001 000   000 001 001 010 001   MATCH!
5   c      001 001 001 001   000 001 010 010 011
6   b      000 000 000 001   000 000 001 010 011
Algorithm
Preprocessing:
    b = log(|p|+1)
    for h in Σ and j = 1..|p|: t[h,j] = 0^(b-1) 1
    for j = 1..|p|: t[p[j],j] = 0^b
    s = 0^(b(|p|+1))
Main:
    for i = 1..|x|:
        s = (s >> b) + t[x[i]]
        if s[|p|] <= k and i >= |p|:
            report i-|p|+1 as match
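A Python sketch of this algorithm, using Python's arbitrary-precision integers in place of machine words. Several choices here are assumptions of the sketch, not taken from the slides: the function name, cell |p| sitting in the lowest b bits (so that `>> b` moves cell j-1 into cell j), and rows of t being built only for characters that occur in x or p.

```python
def bitvector_k_mismatch(x: str, p: str, k: int) -> list[int]:
    """k-mismatch search with one packed counter per alignment (non-empty p assumed)."""
    m = len(p)
    b = m.bit_length()                      # ceil(log2(m + 1)) bits per cell
    cell_mask = (1 << b) - 1

    # t[h]: cell j (at bit offset b*(m-j)) is 1 where p[j] != h, else 0.
    t = {}
    for h in set(x) | set(p):
        word = 0
        for j in range(m):
            word = (word << b) | (0 if p[j] == h else 1)
        t[h] = word

    s = 0
    matches = []
    for i, c in enumerate(x, start=1):
        s = (s >> b) + t[c]                 # shift every count one cell along, add mismatches
        # Cell |p| is the lowest cell; it is only meaningful once i >= |p|.
        if i >= m and (s & cell_mask) <= k:
            matches.append(i - m + 1)
    return matches
```

No cell can carry into its neighbour because cell j never exceeds j ≤ |p| < 2^b. On x = babacbbbababacabbbba with p = bbba and k = 2, the first three positions reported are 1, 5 and 6, as in the example slides.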
Runtime complexity
Improvements
When k « |p|, we are wasting bits (and time)! We count all mismatches, but we only need to know whether there are more than k.
We can replace the log(|p|+1)-bit cells in s with cells of log(k+1)+1 bits (for the numbers 0..k plus an overflow bit) and an additional vector, o, that remembers overflows.
Updated operation
We update the operation
    s = (s >> b) + t[x[i]]
where b = log(|p|+1), to be
    s = (s >> b) + t[x[i]]
    o = (o >> b) | (s & OF_MASK)
    s = s & NEG_OF_MASK
where b = log(k+1)+1 and
    OF_MASK = (1 0^(b-1))^|p|
    NEG_OF_MASK = ~OF_MASK = (0 1^(b-1))^|p|
Algorithm
Preprocessing:
    b = log(k+1)+1
    OF_MASK = (1 0^(b-1))^|p| ; NEG_OF_MASK = ~OF_MASK
    for h in Σ and j = 1..|p|: t[h,j] = 0^(b-1) 1
    for j = 1..|p|: t[p[j],j] = 0^b
    s = 0^(b(|p|+1)) ; o = 0^(b(|p|+1))
Main:
    for i = 1..|x|:
        s = (s >> b) + t[x[i]]
        o = (o >> b) | (s & OF_MASK)
        s = s & NEG_OF_MASK
        if s[|p|] + o[|p|] <= k and i >= |p|:
            report i-|p|+1 as match
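The overflow variant can be sketched the same way. As before, Python integers stand in for machine words and cell |p| is assumed to occupy the lowest b bits. The function name and the explicit construction of both masks are choices made for this sketch.

```python
def k_mismatch_with_overflow(x: str, p: str, k: int) -> list[int]:
    """k-mismatch search with narrow cells plus an overflow vector o (non-empty p assumed)."""
    m = len(p)
    b = k.bit_length() + 1                  # ceil(log2(k + 1)) value bits + 1 overflow bit
    cell_mask = (1 << b) - 1

    of_mask = 0                             # (1 0^(b-1)) repeated |p| times
    neg_of_mask = 0                         # (0 1^(b-1)) repeated |p| times
    for _ in range(m):
        of_mask = (of_mask << b) | (1 << (b - 1))
        neg_of_mask = (neg_of_mask << b) | ((1 << (b - 1)) - 1)

    t = {}
    for h in set(x) | set(p):
        word = 0
        for j in range(m):
            word = (word << b) | (0 if p[j] == h else 1)
        t[h] = word

    s = o = 0
    matches = []
    for i, c in enumerate(x, start=1):
        s = (s >> b) + t[c]
        o = (o >> b) | (s & of_mask)        # remember which alignments have overflowed
        s &= neg_of_mask                    # clear the overflow bits so they cannot carry
        if i >= m and (s & cell_mask) + (o & cell_mask) <= k:
            matches.append(i - m + 1)
    return matches
```

An overflowed cell contributes 2^(b-1) > k through o, so the test `s[|p|] + o[|p|] <= k` fails for it even though its counter in s has been reset.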
Example
x = babacacbababacabbbba, p = bbba, k = 2, b = 3, OF_MASK = 100 100 100 100. s_i' is the sum (s_{i-1} >> b) + t[x[i]] before masking, o_i = (o_{i-1} >> b) | (s_i' & OF_MASK), and s_i = s_i' & NEG_OF_MASK. Cells are listed in the order 0 1 2 3 4; t[x[i]] has cells 1..4 only.
i  x[i]  s_{i-1} >> b         t[x[i]]          s_i'                 o_i                  s_i
0  -     -                    -                -                    000 000 000 000 000  000 000 000 000 000
1  b     000 000 000 000 000  000 000 000 001  000 000 000 000 001  000 000 000 000 000  000 000 000 000 001
2  a     000 000 000 000 000  001 001 001 000  000 001 001 001 000  000 000 000 000 000  000 001 001 001 000
3  b     000 000 001 001 001  000 000 000 001  000 000 001 001 010  000 000 000 000 000  000 000 001 001 010
4  a     000 000 000 001 001  001 001 001 000  000 001 001 010 001  000 000 000 000 000  000 001 001 010 001  MATCH!
5  c     000 000 001 001 010  001 001 001 001  000 001 010 010 011  000 000 000 000 000  000 001 010 010 011
6  a     000 000 001 010 010  001 001 001 000  000 001 010 011 010  000 000 000 000 000  000 001 010 011 010  MATCH!
7  c     000 000 001 010 011  001 001 001 001  000 001 010 011 100  000 000 000 000 100  000 001 010 011 000
8  b     000 000 001 010 011  000 000 000 001  000 000 001 010 100  000 000 000 000 100  000 000 001 010 000
Time complexity

 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
 

Approximate Matching (String Algorithms 2007)

  • 1. Approximate pattern matching. Given string x = abbacbbbababacabbbba and pattern p = bbba, find all “almost”-occurrences of p in x. (Figure: x with three almost-occurrences of p highlighted, labeled 1, 6 and 17.)
  • 2.
  • 3. k-Approximate matching. Given string x and pattern p, find all indices i in x where d(x[i..i+h], p) ≤ k for some h. Generic problem for the various distance functions d.
  • 4. Generic Algorithm
        for i=1..|x|:
          if d(x[i..i'], p) <= k for some i'>i:
            report match at i
        Time usage: O(n² ∙ “time to calculate distance”). (But see Sect. 10.1 for an O(nm) dynamic programming algorithm that works for most distance functions.)
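The generic algorithm can be sketched in Python as follows. This is only an illustration: the distance function `d` is passed in as a parameter, and `edit_distance` (plain Levenshtein distance) is an assumed stand-in, since the slide that defines the distance functions did not survive in this transcript. The names `generic_matches` and `edit_distance` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance, used here only as an example distance function."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or keep
        prev = cur
    return prev[-1]


def generic_matches(x, p, k, d):
    """Report every 1-based start i where d(x[i..i'], p) <= k for some i' > i."""
    n = len(x)
    hits = []
    for i in range(n):
        # Try every end position after the start, as on the slide.
        if any(d(x[i:e + 1], p) <= k for e in range(i + 1, n)):
            hits.append(i + 1)
    return hits


# With k = 0 only exact occurrences are reported.
print(generic_matches("abbacbbbababacabbbba", "bbba", 0, edit_distance))
```

The two nested position loops give the O(n²) factor from the slide; each call to `d` adds the "time to calculate distance".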
  • 5. The k-mismatch problem. Let, for strings x and y with |x| = |y| = n, d(x, y) = |{i | i = 1..n, x[i] ≠ y[i]}| (the Hamming distance). The k-mismatch problem: given string x and pattern p with |p| = m, find all indices i in x where d(x[i..i+m-1], p) ≤ k.
  • 6. Simple k-mismatch algorithm
        for i=1..|x|-|p|+1:
          count = 0
          for j=1..|p|:
            if x[i+j-1] != p[j]:
              count = count + 1
          if count <= k:
            report match at i
        Time usage: O(|x||p|)
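A minimal Python sketch of the simple algorithm, assuming 0-based strings internally and 1-based reported positions to match the slides. The function name `k_mismatch_naive` is hypothetical.

```python
def k_mismatch_naive(x, p, k):
    """Return all 1-based positions i where Hamming(x[i..i+m-1], p) <= k."""
    n, m = len(x), len(p)
    matches = []
    for i in range(n - m + 1):
        # Count the mismatching positions in the window starting at i.
        count = sum(1 for j in range(m) if x[i + j] != p[j])
        if count <= k:
            matches.append(i + 1)  # convert to the slides' 1-based indexing
    return matches


print(k_mismatch_naive("babacbbbababacabbbba", "bbba", 2))
```

Every window is compared in full, which gives the O(|x||p|) running time.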
  • 7.
  • 8. Using state vector s. Notice that s holds information about more than one comparison! (Figure: two cells, s[j]==l and s[j']==h, each counting mismatches for a different alignment of p.) Conceptually, p is positioned at |p| places along x: s tries to match p at positions i-|p|+1 .. i.
  • 9. Using state vector s. When s[|p|] ≤ k and i ≥ |p|, we have an occurrence of p in x at i-|p|+1.
  • 10. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=0: s = 000 000 000 000 (s[1]==0, s[2]==0, s[3]==0, s[4]==0).
  • 11. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=1: s = 000 000 000 000 (s[1]==0, s[2]==0, s[3]==0, s[4]==0).
  • 12. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=2: s = 001 001 000 000 (s[1]==1, s[2]==1, s[3]==0, s[4]==0).
  • 13. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=3: s = 000 001 001 000 (s[1]==0, s[2]==1, s[3]==1, s[4]==0).
  • 14. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=4: s = 001 001 010 001 (s[1]==1, s[2]==1, s[3]==2, s[4]==1). Match at i-4+1=1.
  • 15. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=5: s = 001 010 010 011 (s[1]==1, s[2]==2, s[3]==2, s[4]==3).
  • 16. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=6: s = 000 001 010 011 (s[1]==0, s[2]==1, s[3]==2, s[4]==3).
  • 17. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=7: s = 000 000 001 011 (s[1]==0, s[2]==0, s[3]==1, s[4]==3).
  • 18. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=8: s = 000 000 000 010 (s[1]==0, s[2]==0, s[3]==0, s[4]==2). Match at i-4+1=5.
  • 19. Example: 2-mismatch, x = babacbbbababacabbbba, p = bbba, i=9: s = 001 001 001 000 (s[1]==1, s[2]==1, s[3]==1, s[4]==0). Match at i-4+1=6.
  • 20. Constructing s. Let s_i be the state vector in iteration i. Then s_i[j] = s_{i-1}[j-1] + t, where t = 0 if p[j] = x[i] and t = 1 otherwise. Special case: s_i[0] = 0 for all i.
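The recurrence can be tried out before any bit packing by keeping s as an ordinary Python list. This is a sketch under that assumption, not the packed representation the later slides use; the function name `k_mismatch_state_list` is hypothetical.

```python
def k_mismatch_state_list(x, p, k):
    """k-mismatch via s_i[j] = s_{i-1}[j-1] + t, with s stored as a list."""
    m = len(p)
    s = [0] * (m + 1)  # s[0] is the special cell that is always 0
    matches = []
    for i, c in enumerate(x, start=1):  # i is 1-based, as on the slides
        # Cell j takes the old value of cell j-1 plus 0/1 for p[j] vs x[i].
        s = [0] + [s[j - 1] + (0 if p[j - 1] == c else 1)
                   for j in range(1, m + 1)]
        if i >= m and s[m] <= k:
            matches.append(i - m + 1)
    return matches


print(k_mismatch_state_list("babacbbbababacabbbba", "bbba", 2))
```

Each iteration still touches all |p| cells one by one; the packed version replaces that inner work with a shift and an addition.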
  • 21. Table t. The number t, where t = 0 if p[j] = x[i] and t = 1 otherwise, can be pre-calculated and stored in a matrix: t[h,j] = 0^(log(|p|+1)) if p[j] == h, and t[h,j] = 0^(log(|p|+1)-1) 1 if p[j] != h (where 0^n denotes a run of n zero bits), with rows indexed by the alphabet and columns indexed by indices in p, using log(|p|+1) bits per cell.
  • 22. Table t. Why use log(|p|+1) bits per cell? To be able to add t[h] and s a word at a time, silly! (Figure: the cells of s_{i-1} and t[h] are added word by word to give s_i, with 0 entering as s_i[0].)
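A small sketch of why equal-width cells allow a single addition: if no cell can overflow its b bits, adding the packed integers gives the same result as adding cell by cell. The helpers `pack` and `unpack` are hypothetical names, and the cell values are taken from the i=3 step of the running example (p = bbba, b = 3).

```python
def pack(cells, b):
    """Pack small non-negative ints into one int, first cell most significant."""
    word = 0
    for c in cells:
        assert 0 <= c < (1 << b), "cell value must fit in b bits"
        word = (word << b) | c
    return word


def unpack(word, b, n):
    """Split an int back into n cells of b bits each."""
    mask = (1 << b) - 1
    return [(word >> (b * (n - 1 - j))) & mask for j in range(n)]


b = 3
shifted_s = pack([0, 0, 1, 1, 1], b)  # s_2 >> b in the running example
t_b = pack([0, 0, 0, 0, 1], b)        # t[b], padded with a leading zero cell
s_next = shifted_s + t_b              # one addition updates every cell
print(unpack(s_next, b, 5))           # [0, 0, 1, 1, 2]
```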
  • 23. Table t for pattern p=bbba
        j:        1   2   3   4
        t['a']:  001 001 001 000
        t['b']:  000 000 000 001
        t['c']:  001 001 001 001
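The table can be generated with a few lines of Python. This sketch assumes that the cell width log(|p|+1) is rounded up to a whole number of bits and that column 1 is stored in the most significant cell; `build_table` and `show` are hypothetical helper names.

```python
def build_table(p, alphabet):
    """Return (t, b): t maps each letter to its packed row, b is bits per cell."""
    m = len(p)
    b = m.bit_length()  # equals ceil(log2(m + 1)) for m >= 1
    t = {}
    for h in alphabet:
        word = 0
        for j in range(m):
            # Cell is 0 where p[j] equals h, 1 where it differs.
            word = (word << b) | (0 if p[j] == h else 1)
        t[h] = word
    return t, b


def show(word, b, cells):
    """Format a packed word as space-separated b-bit groups."""
    bits = format(word, "0{}b".format(b * cells))
    return " ".join(bits[i:i + b] for i in range(0, len(bits), b))


t, b = build_table("bbba", "abc")
for h in "abc":
    print("t['{}']: {}".format(h, show(t[h], b, 4)))
```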
  • 24. Example: x = babacbbbababacabbbba, p = bbba, i=0: s_0 = 000 000 000 000 000 (s_0[0]==0, s_0[1]==0, s_0[2]==0, s_0[3]==0, s_0[4]==0).
  • 25. Example: x = babacbbbababacabbbba, p = bbba, i=1 (x[1] = b): s_0 = 000 000 000 000 000; t[b] = 000 000 000 001 (adds 0, 0, 0, 1 to cells 1..4); s_1 = 000 000 000 000 001 (s_1[0]==0, s_1[1]==0, s_1[2]==0, s_1[3]==0, s_1[4]==1).
  • 26. Example: x = babacbbbababacabbbba, p = bbba, i=2 (x[2] = a): s_1 = 000 000 000 000 001; t[a] = 001 001 001 000 (adds 1, 1, 1, 0 to cells 1..4); s_2 = 000 001 001 001 000 (s_2[0]==0, s_2[1]==1, s_2[2]==1, s_2[3]==1, s_2[4]==0).
  • 27. Example: x = babacbbbababacabbbba, p = bbba, i=3 (x[3] = b): s_2 = 000 001 001 001 000; t[b] = 000 000 000 001 (adds 0, 0, 0, 1 to cells 1..4); s_3 = 000 000 001 001 010 (s_3[0]==0, s_3[1]==0, s_3[2]==1, s_3[3]==1, s_3[4]==2).
  • 28. Example: x = babacbbbababacabbbba, p = bbba, i=4 (x[4] = a): s_3 = 000 000 001 001 010; t[a] = 001 001 001 000 (adds 1, 1, 1, 0 to cells 1..4); s_4 = 000 001 001 010 001 (s_4[0]==0, s_4[1]==1, s_4[2]==1, s_4[3]==2, s_4[4]==1). MATCH!
  • 29. Example: x = babacbbbababacabbbba, p = bbba, i=5 (x[5] = c): s_4 = 000 001 001 010 001; t[c] = 001 001 001 001 (adds 1, 1, 1, 1 to cells 1..4); s_5 = 000 001 010 010 011 (s_5[0]==0, s_5[1]==1, s_5[2]==2, s_5[3]==2, s_5[4]==3).
  • 30. Example: x = babacbbbababacabbbba, p = bbba, i=6 (x[6] = b): s_5 = 000 001 010 010 011; t[b] = 000 000 000 001 (adds 0, 0, 0, 1 to cells 1..4); s_6 = 000 000 001 010 011 (s_6[0]==0, s_6[1]==0, s_6[2]==1, s_6[3]==2, s_6[4]==3).
  • 31. Algorithm
        Preprocessing:
          b = log(|p|+1)
          for h in Σ and j=1..|p|: t[h,j] = 0^(b-1) 1
          for j=1..|p|: t[p[j],j] = 0^b
          s = 0^(b(|p|+1))
        Main:
          for i=1..|x|:
            s = (s >> b) + t[x[i]]
            if s[|p|] <= k and i >= |p|: report i-|p|+1 as match
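The algorithm above can be sketched in Python with an arbitrary-precision integer standing in for the machine words. Two choices here are assumptions rather than part of the slides: the table is built only for letters that occur in p, and any other letter of x falls back to an all-mismatch row. The function name `k_mismatch_bitvector` is hypothetical.

```python
def k_mismatch_bitvector(x, p, k):
    """Packed k-mismatch search: s = (s >> b) + t[x[i]] per text position."""
    m = len(p)
    b = m.bit_length()           # ceil(log2(m + 1)) bits per cell
    cell_mask = (1 << b) - 1     # selects s[|p|], the least significant cell

    # Row used for letters that do not occur in p: every cell is 1.
    all_mismatch = 0
    for _ in range(m):
        all_mismatch = (all_mismatch << b) | 1

    # Preprocessing: one packed row per letter of the pattern.
    t = {}
    for h in set(p):
        word = 0
        for j in range(m):
            word = (word << b) | (0 if p[j] == h else 1)
        t[h] = word

    s = 0
    matches = []
    for i, c in enumerate(x, start=1):
        s = (s >> b) + t.get(c, all_mismatch)
        if i >= m and (s & cell_mask) <= k:
            matches.append(i - m + 1)
    return matches


print(k_mismatch_bitvector("babacbbbababacabbbba", "bbba", 2))
```

A cell for prefix length j never exceeds j ≤ |p|, so with b bits per cell the addition cannot carry into a neighbouring cell.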
  • 32.
  • 33. Improvements. When k ≪ |p|, we are wasting bits (and time)! We count all mismatches, but we only need to know if there are more than k or not. We can replace the log(|p|+1)-bit cells in s with log(k+1)+1-bit cells (for numbers 0..k and an overflow bit) and an additional vector, o, remembering overflows.
  • 34. Updated operation. We update the operation s = (s >> b) + t[x[i]], where b = log(|p|+1), to be
          s = (s >> b) + t[x[i]]
          o = (o >> b) | (s & OF_MASK)
          s = s & NEG_OF_MASK
        where b = log(k+1)+1, OF_MASK = (10^(b-1))^|p| and NEG_OF_MASK = ~OF_MASK = (01^(b-1))^|p|.
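A single update step can be written as a small Python function. This is a sketch: `overflow_step` is a hypothetical name, and the mask is rebuilt on every call only to keep the example self-contained. The usage lines feed in the s_6, o_6 and t['c'] values from the i=7 step of the overflow example.

```python
def overflow_step(s, o, t_row, b, m):
    """One iteration of the updated operation with b-bit cells and m = |p|."""
    # OF_MASK has the top bit of each of the m cells set: (1 0^(b-1))^m.
    of_mask = 0
    for _ in range(m):
        of_mask = (of_mask << b) | (1 << (b - 1))
    s = (s >> b) + t_row
    o = (o >> b) | (s & of_mask)  # remember which cells just overflowed
    s = s & ~of_mask              # clear the overflow bits in s
    return s, o


s7, o7 = overflow_step(0b000_001_010_011_010, 0, 0b001_001_001_001, 3, 4)
print(format(s7, "015b"), format(o7, "015b"))
```

Because every cell of s is below 2^(b-1) after masking, adding 0 or 1 can at most set the overflow bit and never carries into the next cell.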
  • 35. Algorithm
        Preprocessing:
          b = log(k+1)+1
          OF_MASK = (10^(b-1))^|p| ; NEG_OF_MASK = ~OF_MASK
          for h in Σ and j=1..|p|: t[h,j] = 0^(b-1) 1
          for j=1..|p|: t[p[j],j] = 0^b
          s = 0^(b(|p|+1)) ; o = 0^(b(|p|+1))
        Main:
          for i=1..|x|:
            s = (s >> b) + t[x[i]]
            o = (o >> b) | (s & OF_MASK)
            s = s & NEG_OF_MASK
            if s[|p|] + o[|p|] <= k and i >= |p|: report i-|p|+1 as match
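A Python sketch of the complete overflow variant follows. As assumptions beyond the slides, log(k+1) is rounded up to whole bits, letters of x that do not occur in p use an all-mismatch row, and the name `k_mismatch_overflow` is hypothetical.

```python
def k_mismatch_overflow(x, p, k):
    """k-mismatch search with log(k+1)+1-bit cells and an overflow vector o."""
    m = len(p)
    b = k.bit_length() + 1       # ceil(log2(k + 1)) value bits + 1 overflow bit
    cell_mask = (1 << b) - 1     # selects the least significant cell

    of_mask = 0                  # top bit of every cell
    all_mismatch = 0             # row for letters that do not occur in p
    for _ in range(m):
        of_mask = (of_mask << b) | (1 << (b - 1))
        all_mismatch = (all_mismatch << b) | 1

    t = {}
    for h in set(p):
        word = 0
        for j in range(m):
            word = (word << b) | (0 if p[j] == h else 1)
        t[h] = word

    s = o = 0
    matches = []
    for i, c in enumerate(x, start=1):
        s = (s >> b) + t.get(c, all_mismatch)
        o = (o >> b) | (s & of_mask)
        s = s & ~of_mask
        # An overflowed cell contributes 2^(b-1) > k, so it can never match.
        if i >= m and (s & cell_mask) + (o & cell_mask) <= k:
            matches.append(i - m + 1)
    return matches


print(k_mismatch_overflow("babacacbababacabbbba", "bbba", 2))
```

The cell width now depends only on k, so for small k more cells fit in each machine word than with log(|p|+1)-bit cells.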
  • 36. Example: x = babacacbababacabbbba, p = bbba, i=0: s_0 == 000 000 000 000 000; o_0 == 000 000 000 000 000.
  • 37. Example: x = babacacbababacabbbba, p = bbba, i=1: (s_0 >> b) == 000 000 000 000 000; t['b'] == 000 000 000 001; s_1' == 000 000 000 000 001; (o_0 >> b) == 000 000 000 000 000; (s_1' & OF_MASK) == 000 000 000 000 000; o_1 == 000 000 000 000 000; s_1 == 000 000 000 000 001.
  • 38. Example: x = babacacbababacabbbba, p = bbba, i=2: (s_1 >> b) == 000 000 000 000 000; t['a'] == 001 001 001 000; s_2' == 000 001 001 001 000; (o_1 >> b) == 000 000 000 000 000; (s_2' & OF_MASK) == 000 000 000 000 000; o_2 == 000 000 000 000 000; s_2 == 000 001 001 001 000.
  • 39. Example: x = babacacbababacabbbba, p = bbba, i=3: (s_2 >> b) == 000 000 001 001 001; t['b'] == 000 000 000 001; s_3' == 000 000 001 001 010; (o_2 >> b) == 000 000 000 000 000; (s_3' & OF_MASK) == 000 000 000 000 000; o_3 == 000 000 000 000 000; s_3 == 000 000 001 001 010.
  • 40. Example: x = babacacbababacabbbba, p = bbba, i=4: (s_3 >> b) == 000 000 000 001 001; t['a'] == 001 001 001 000; s_4' == 000 001 001 010 001; (o_3 >> b) == 000 000 000 000 000; (s_4' & OF_MASK) == 000 000 000 000 000; o_4 == 000 000 000 000 000; s_4 == 000 001 001 010 001. MATCH!
  • 41. Example: x = babacacbababacabbbba, p = bbba, i=5: (s_4 >> b) == 000 000 001 001 010; t['c'] == 001 001 001 001; s_5' == 000 001 010 010 011; (o_4 >> b) == 000 000 000 000 000; (s_5' & OF_MASK) == 000 000 000 000 000; o_5 == 000 000 000 000 000; s_5 == 000 001 010 010 011.
  • 42. Example: x = babacacbababacabbbba, p = bbba, i=6: (s_5 >> b) == 000 000 001 010 010; t['a'] == 001 001 001 000; s_6' == 000 001 010 011 010; (o_5 >> b) == 000 000 000 000 000; (s_6' & OF_MASK) == 000 000 000 000 000; o_6 == 000 000 000 000 000; s_6 == 000 001 010 011 010. MATCH!
  • 43. Example: x = babacacbababacabbbba, p = bbba, i=7: (s_6 >> b) == 000 000 001 010 011; t['c'] == 001 001 001 001; s_7' == 000 001 010 011 100; (o_6 >> b) == 000 000 000 000 000; (s_7' & OF_MASK) == 000 000 000 000 100; o_7 == 000 000 000 000 100; s_7 == 000 001 010 011 000.
  • 44. Example: x = babacacbababacabbbba, p = bbba, i=8: (s_7 >> b) == 000 000 001 010 011; t['b'] == 000 000 000 001; s_8' == 000 000 001 010 100; (o_7 >> b) == 000 000 000 000 000; (s_8' & OF_MASK) == 000 000 000 000 100; o_8 == 000 000 000 000 100; s_8 == 000 000 001 010 000.
  • 45.