SlideShare a Scribd company logo
1 of 36
Download to read offline
Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Dongbo Bu
January 8, 2010
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Problem Formulation
Problem Statement I
Closest String
Given a set S = {s1, s2, · · · , sn} of strings each length m, find a
center string s of length m minimizing d such that for every string
si ∈ S, d(s, si) ≤ d. Here d(s, si) is the Hamming distance
between s and si.
Closest Substring
Given a set S = {s1, s2, · · · , sn} of strings each length m and an
integer L, find a center string s of length L minimizing d such that
for every string si ∈ S there is a length L substring ti of si with
d(s, ti) ≤ d.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Problem Formulation
Example of Closest String
Given 4 strings s1, s2, s3, s4,
Optimal center string s = 011000
d =
4
max
i=1
d(si, s) = max{2, 2, 1, 2} = 2
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Problem Formulation
Example of Closest Substring
Given 4 strings s1, s2, s3, s4, L = 4,
Optimal center string s = 1100,
d =
4
max
i=1
d(ti, s) = max{1, 0, 1, 2} = 2
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Basic Idea
In polynomial time, we can enumerate all n
r strings for any fixed
r. We can prove that, on positions that r strings all agree, denote
as Q, it is a good approximate solution. We can also prove that,
|Q| ≥ m − r · dopt.
On other positions, we use LP + Random rounding technique to
obtain an approximate solution.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Algorithm I
Input : s1, s2, · · · , sn ∈ Σm, an integer r ≥ 2 and a small
number ǫ > 0.
Output : a center string s ∈ Σm
Algorithm :
1 for each r-element subset {si1 , si2 , · · · , sir } of the n input
strings do
1 let Q = {j|si1
[j] = si2
[j] = · · · = sir
[j]},
P = {1, 2, · · · , m} − Q
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Algorithm II
2 solve the following optimization problem
min d
s.t. d(si|P , y) + d(si|Q, si1
|Q) ≤ d, i = 1, 2, · · · , n
Random rounding the fractional solution ¯y to get a
approximation solution y. Use derandomization technique to
make this step deterministic rather than random.
3 let s′
|Q = si1
|Q, s′
|P = y. Calculate the radius of the solution
with s′
as the center string.
2 for i = 1, 2, · · · , n do
1 calculate the radius of the solution with si as the center string.
3 output the best solution of the above two steps.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Example of Closest String I
Suppose r = 2. In step 1, we should enumerate all 4
2 = 6 cases.
In each cases, we select 2 lines, calculate P and Q.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Example of Closest String II
In step 2, we fix s′|Q = 1000, solve the following optimization
problem :
min d
s.t. y10 + y11 = 1
y20 + y21 = 1
y11 + y21 + 0 ≤ d
y10 + y20 + 0 ≤ d
y11 + y20 + 2 ≤ d
y11 + y20 + 2 ≤ d
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Example of Closest String III
Solve this linear programming, and random rounding the fractional
solution to integer solution :
y10 = y21 = 1, y11 = y20 = 0
So,
s′
|P = 01, s′
= 011000
d =
4
max
i=1
d(si, s′
) = max{1, 1, 2, 3} = 3
Try all 4
2 = 6 cases, obtain the minimum, denote as d0. Then we
finish step 1.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Example of Closest String IV
In step 2, we calculate the radius when si is the center string.
d1 =
4
max
i=1
d(s1, si)
d2 =
4
max
i=1
d(s2, si)
d3 =
4
max
i=1
d(s3, si)
d4 =
4
max
i=1
d(s4, si)
Calculate the minimal radius in both step 1 and step 2.
d = min{d0, d1, d2, d3, d4}
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis I
Theorem
The above algorithm is a PTAS for Closest String problem.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis II
Proof.
Obviously, the time complexity of the algorithm is
O (nm)rnO(log |Σ|·r2/ǫ2) , which is polynomial in terms of n,m.
The proof of approximation guarantee is organized as 3 lemmas as
follows :
Lemma 1 proves s′|Q is a good approximation to s with
approximation rate 1 + 1
2r−1 .
Lemma 2 proves |P| < r · dopt.
Based on the above 2 lemmas, Lemma 3 proves step 1.2 obtains a
approximate solution with rate (1 + 1
2r−1 + ǫ).
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis III
Lemma 1
If maxi≤i,j≤n d(si, sj) > (1 + 1
2r−1 )dopt, then there exists r indices
1 ≤ i1, i2, · · · , ir ≤ n such that for any 1 ≤ l ≤ n,
d(sl|Q, si1 |Q) − d(sl|Q, s|Q) ≤
1
2r − 1
dopt
where Q is the set of positions that si1 , si2 , · · · , sir all agree.
if maxi≤i,j≤n d(si, sj) ≤ (1 + 1
2r−1 )dopt, then any si will be a
good center string. (Recall the step 2 of the algorithm)
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis IV
Lemma 2
Let P = {1, 2, · · · , m} − Q, then |P| ≤ r · dopt and
|Q| ≥ m − r · dopt.
Proof.
Let q ∈ P, then there exists some sij such that sij [q] = s[q]. Since
d(sij , s) ≤ dopt, each sij contributes at most dopt positions for P.
Thus |P| ≤ r · dopt.
this lemma gives the lower bound of dopt, dopt ≥ |P|
r , which is
essential for the analysis of LP + random rounding!
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis V
Lemma 3
Given a string s′ and a position set Q and P, |P| < r · dopt, such
that for any i = 1, 2, · · · , n,
d(si|Q, s′
|Q) − d(si|Q, s|Q) ≤ ρ · dopt,
then step 1.2 gives a solution with approximate rate (1 + ρ + ǫ) in
polynomial time for any fixed ǫ ≥ 0.
Before the proof, we can see that the conditions of this lemma are
all satisfied by lemma 2, where ρ = 1
2r−1 .
Proof
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis VI
Recall the optimization problem in step 1.2
min d
s.t. d(si|P , y) + d(si|Q, s′
|Q) ≤ d, i = 1, 2, · · · , n
First, we show that y = s|P is a solution with cost d ≤ (1 + ρ)dopt.
In fact, for i = 1, 2, · · · , n
d(si|P , s|P ) + d(si|Q, s′
|Q)
≤ d(si|P , s|P ) + d(si|Q, s|Q) + ρ · dopt
≤ (1 + ρ)dopt
Second, rewirte the optimization problem to an ILP problem. In
order for this, define 0-1 variables yj,a, 1 ≤ j ≤ |P|, a ∈ Σ,
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis VII
yj,a = 1 means y[j] = a; define χ(si[j], a) = 0 if si[j] = a and 1 if
si[j] = a. Then the above optimization problem can be formulated
as follows
min d
s.t. a∈Σ yj,a = 1, 1 ≤ j ≤ |P|
1≤j≤|P| a∈Σ χ(si[j], a)yj,a + d(si|Q, s′|Q) ≤ d, 1 ≤ i ≤ n
Solve it by LP to get a fractional solution ¯y with cost ¯d.
Random rounding ¯y to y′ by independently set y′
j,a = 1 and
y′
j,b = 0, b ∈ Σ − {a} with probability ¯yj,a.
So d(si|P , y′) =
|P|
i=1 a∈Σ χ(si[j], a)yj,a, which is the sum of
|P| independent random variables.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis VIII
E(d(si|P )) =
1≤j≤|P| a∈Σ
χ(si[j], a) ¯yj,a
≤ ¯d − d(si|Q, s′
|Q)
≤ (1 + ρ) · dopt − d(si|Q, s′
|Q)
Employ Chernoff Bound
Pr(X > µ + ǫn) ≤ exp(−
1
3
nǫ2
)
we have
Pr(d(si|P , y′
) > (1 + ρ)dopt − d(si|Q, s′
|Q) + ǫ′
|P|) ≤ exp(−
1
3
ǫ′2
|P|)
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest String Problem
Analysis IX
Let s′|P = y′ and consider all n strings, we claim
Pr(d(si, s′
) < (1 + ρ)dopt + ǫ′
|P| for all i) ≥ 1 − n exp(−
1
3
ǫ′2
|P|)
Use standard derandomization methods, we can obtain a
deterministic s′ that satisfies
d(si, s′
) < (1 + ρ)dopt + ǫ′
|P|, 1 ≤ i ≤ n
Recall that |P| < r · dopt, let ǫ′ = ǫ
r , we have
d(si, s′
) < (1 + ρ + ǫ)dopt, 1 ≤ i ≤ n
Then we finish the proof.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Basic Idea
We want to follow the algorithm of Closest String algorithm.
However, we do not know how to construct an optimization
problem, for the reason that we do not konw the optimal substring
in each string. Thus, the choice of a “good” substring from every
string si is the only obstacle on the way to the solution.
As we do in the algorithm of Closest String, first we try all the
choices of r substrings from S, we can assume that ti1 , ti2 , · · · , tir
is a good partial solution. After that, we randomly pick
O(log(mn)) positions from P. By trying all length |R| strings, we
can assume we know s|R. Then for each i ≤ i ≤ n, we find the
substring t′
i from si such that
f(t′
i) = d(s|R, t′
i|R) · |P|
|R| + d(ti1 |Q, t′
i|Q) is minimized. We use
t′
i, 1 ≤ i ≤ n to construct the optimization problem.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Algorithm for Closest Substring problem I
Input : s1, s2, · · · , sn ∈ Σm, an integer 1 ≤ L ≤ m, an integer
r ≥ 2 and a small number ǫ > 0.
Ouput : center string s
Algorithm :
1 for each r-element subset {ti1 , ti2 , · · · , tir }, where tij is a
substring of length L from sij do
1 let Q = {j|ti1
[j] = ti2
[j] = · · · = tir
[j]},
P = {1, 2, · · · , m} − Q
2 randomly select 4
ǫ2 log(mn) positions from P, denote as R
3 for every string x of length |R| do
1 for 1 ≤ i ≤ n, let t′
i be a length L substring of si minimizing
f(t′
i) = d(x, t′
i|R) · |P |
|R|
+ d(ti1 |Q, t′
i|Q).
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Algorithm for Closest Substring problem II
2 solve the optimization problem
min d
s.t. d(t′
i|P , y) + d(t′
i|Q, ti−1|Q) ≤ d, 1 ≤ i ≤ n
Random rounding the fractional solution ¯y to get a
approximation solution y. Use derandomization technique to
make this step deterministic rather than random.
3 Let s′
|Q = ti1 |Q and s′
|P = y. Let c = maxn
i=1 d(s′
, t′
i).
2 for every length L substring s′ of s1 do
1 Let c = maxn
i=1 minti
d(s′
, ti).
3 output the s′ with minimum c in step 1.3.3 and step 2.1.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Example of Closest Subtring I
Suppose r = 2. In step 1, we should enumerate all
4
2 × 3 × 3 = 54 cases. In each cases, we select 2 substring of
length 4 , calculate P and Q.
We fix s′|Q = 00.
Randomly select |R| = O(log(mn)) positions, say, |R| = 1. Then
enumerate all length |R| strings, say, s′|R = 0.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Example of Closest Subtring II
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Example of Closest Subtring III
Now, for each si, find out the t′
i minimizing
f(u) = d(u|R, s′|R) × 2 + d(u|Q, s′|Q).
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Example of Closest Subtring IV
Solve the following optimization problem :
min d
s.t. y10 + y11 = 1
y20 + y21 = 1
y10 + y21 + 0 ≤ d
y10 + y21 + 0 ≤ d
y10 + y21 + 2 ≤ d
y10 + y21 + 1 ≤ d
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Example of Closest Subtring V
Solve this linear programming, and random rounding the fractional
solution to integer solution :
y10 = y21 = 0, y11 = y20 = 1
So,
s′
= 1000
d =
4
max
i=1
d(ti, s′
) = max{0, 0, 2, 1} = 2
Try all 54 cases, obtain the minimum, denote as d0. Then we
finish step 1.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Example of Closest Subtring VI
In step 2, we calculate the radius when ti is the center string where
ti is a length L substring of si. denote the minimun d1. Calculate
the minimal radius in both step 1 and step 2.
d = min{d0, d1}
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis I
Theorem
The above algorithm is a PTAS for Closest Substring problem.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis II
Proof.
The time complexity of the algorithm is O (nm)O(log |Σ|/δ4) ,
which is polynomial in terms of n,m.
The proof of approximation guarantee is organized as 3 lemmas as
follows :
Lemma 1 proves s′|Q is a good approximation to s with
approximation rate 1 + 1
2r−1 .
Define s∗|P = s|P and s∗|Q = ti1 |Q, lemma 2 proves
d(s∗, t′
i) ≤ d(s∗, ti) + 2ǫ|P| for all 1 ≤ i ≤ n.
Based on the above 2 lemmas, Lemma 3 proves the algorithm
obtains a approximate solution with rate (1 + 1
2r−1 + 3ǫr).
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis III
Lemma 1
There exists ti1 , ti2 , · · · , tir chosen in step 1, such that for any
1 ≤ l ≤ n
d(tl|Q, ti1 |Q) − d(sl|Q, s|Q) ≤
1
2r − 1
· dopt
Proof.
The fact follow from Lemma 1 of Closest String directly.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis IV
Lemma 2
Define s∗|P = s|P and s∗|Q = ti1 |Q. Then we have, with high
probability
d(s∗
, t′
i) ≤ d(s∗
, ti) + 2ǫ|P|
for all 1 ≤ i ≤ n.
Proof.
The randomness comes from the randomly selected |R|. Use
standard method, we can derandomize it to make this
deterministic.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis V
Lemma 3
step 1.3.3 gives an approximation solution s′ with approximation
rate (1 + 1
2r−1 + 3ǫr).
Proof.
Recall the optimization problem in step 1.2
min d
s.t. d(t′
i|P , y) + d(t′
i|Q, s′
|Q) ≤ d, i = 1, 2, · · · , n
y = s|P is a feasible solution. Now we calculate its radius when s∗
is the center string. Recall that s∗|P = s|P and s∗|Q = ti1 |Q.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis VI
According to Lemma 2 and Lemma 1
d(s∗
, t′
i) ≤ d(s∗
, ti) + 2ǫ|P|
≤ d(s|P , ti|P ) + d(ti1 |Q, ti|Q) + 2ǫ|P|
≤ d(s|P , ti|P ) + d(s|Q, ti|Q) + 2ǫ|P| +
1
2r − 1
dopt
≤ (1 +
1
2r − 1
)dopt + 2ǫ|P|)
In a short word, y = s|P is a solution of the optimization problem
with cost at most (1 + 1
2r−1 )dopt + 2ǫ|P|).
Next, as we do in the Closest String problem, we rewirte the
optimization problem to an ILP and solve it and random rounding
the fractional solution ¯y to y. Define s′|Q = ti1 |Q and s′|P = y.
Dongbo Bu Closest String and Closest Substring Problems
Closest String and Closest Substring Problems
Algorithm for Closest Substring problem
Analysis VII
E(d(s′
, t′
i)) ≤ ¯d ≤ (1 +
1
2r − 1
+ 2ǫ|P|)dopt
So, chernoff bound ensures that, with high probability,
d(s′
, t′
i)
≤ (1 +
1
2r − 1
)dopt + 2ǫ|P| + ǫ|P|
≤ (1 +
1
2r − 1
)dopt + 3ǫ|P|
≤ (1 +
1
2r − 1
+ 3ǫr)dopt
After derandomization, we can obtain an approximation solution s′
with rate (1 + 1
2r−1 + 3ǫr)dopt.
Dongbo Bu Closest String and Closest Substring Problems

More Related Content

What's hot

Fixed points of contractive and Geraghty contraction mappings under the influ...
Fixed points of contractive and Geraghty contraction mappings under the influ...Fixed points of contractive and Geraghty contraction mappings under the influ...
Fixed points of contractive and Geraghty contraction mappings under the influ...IJERA Editor
 
Longest common subsequence(dynamic programming).
Longest common subsequence(dynamic programming).Longest common subsequence(dynamic programming).
Longest common subsequence(dynamic programming).munawerzareef
 
04 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x204 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x2MuradAmn
 
Differential Equations 4th Edition Blanchard Solutions Manual
Differential Equations 4th Edition Blanchard Solutions ManualDifferential Equations 4th Edition Blanchard Solutions Manual
Differential Equations 4th Edition Blanchard Solutions Manualqyfuc
 
Algorithm Design and Complexity - Course 6
Algorithm Design and Complexity - Course 6Algorithm Design and Complexity - Course 6
Algorithm Design and Complexity - Course 6Traian Rebedea
 
Applications of differential equations by shahzad
Applications of differential equations by shahzadApplications of differential equations by shahzad
Applications of differential equations by shahzadbiotech energy pvt limited
 
Bachelor_Defense
Bachelor_DefenseBachelor_Defense
Bachelor_DefenseTeja Turk
 
Longest Common Subsequence
Longest Common SubsequenceLongest Common Subsequence
Longest Common SubsequenceSyeda
 
Numerical Analysis and Its application to Boundary Value Problems
Numerical Analysis and Its application to Boundary Value ProblemsNumerical Analysis and Its application to Boundary Value Problems
Numerical Analysis and Its application to Boundary Value ProblemsGobinda Debnath
 
Longest Common Subsequence
Longest Common SubsequenceLongest Common Subsequence
Longest Common SubsequenceKrishma Parekh
 
Mark Girolami's Read Paper 2010
Mark Girolami's Read Paper 2010Mark Girolami's Read Paper 2010
Mark Girolami's Read Paper 2010Christian Robert
 
Unbiased Bayes for Big Data
Unbiased Bayes for Big DataUnbiased Bayes for Big Data
Unbiased Bayes for Big DataChristian Robert
 

What's hot (19)

Chapter 26 aoa
Chapter 26 aoaChapter 26 aoa
Chapter 26 aoa
 
Chapter 24 aoa
Chapter 24 aoaChapter 24 aoa
Chapter 24 aoa
 
Chapter 23 aoa
Chapter 23 aoaChapter 23 aoa
Chapter 23 aoa
 
Fixed points of contractive and Geraghty contraction mappings under the influ...
Fixed points of contractive and Geraghty contraction mappings under the influ...Fixed points of contractive and Geraghty contraction mappings under the influ...
Fixed points of contractive and Geraghty contraction mappings under the influ...
 
Cpsc125 ch6sec3
Cpsc125 ch6sec3Cpsc125 ch6sec3
Cpsc125 ch6sec3
 
Longest common subsequence(dynamic programming).
Longest common subsequence(dynamic programming).Longest common subsequence(dynamic programming).
Longest common subsequence(dynamic programming).
 
04 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x204 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x2
 
Differential Equations 4th Edition Blanchard Solutions Manual
Differential Equations 4th Edition Blanchard Solutions ManualDifferential Equations 4th Edition Blanchard Solutions Manual
Differential Equations 4th Edition Blanchard Solutions Manual
 
Algorithm Design and Complexity - Course 6
Algorithm Design and Complexity - Course 6Algorithm Design and Complexity - Course 6
Algorithm Design and Complexity - Course 6
 
Quantum Search and Quantum Learning
Quantum Search and Quantum Learning Quantum Search and Quantum Learning
Quantum Search and Quantum Learning
 
Applications of differential equations by shahzad
Applications of differential equations by shahzadApplications of differential equations by shahzad
Applications of differential equations by shahzad
 
Np cooks theorem
Np cooks theoremNp cooks theorem
Np cooks theorem
 
Bachelor_Defense
Bachelor_DefenseBachelor_Defense
Bachelor_Defense
 
Longest Common Subsequence
Longest Common SubsequenceLongest Common Subsequence
Longest Common Subsequence
 
Numerical Analysis and Its application to Boundary Value Problems
Numerical Analysis and Its application to Boundary Value ProblemsNumerical Analysis and Its application to Boundary Value Problems
Numerical Analysis and Its application to Boundary Value Problems
 
Longest Common Subsequence
Longest Common SubsequenceLongest Common Subsequence
Longest Common Subsequence
 
Mark Girolami's Read Paper 2010
Mark Girolami's Read Paper 2010Mark Girolami's Read Paper 2010
Mark Girolami's Read Paper 2010
 
Richard Everitt's slides
Richard Everitt's slidesRichard Everitt's slides
Richard Everitt's slides
 
Unbiased Bayes for Big Data
Unbiased Bayes for Big DataUnbiased Bayes for Big Data
Unbiased Bayes for Big Data
 

Similar to Closeszdfgjklfghjt.string

Nonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares MethodNonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares MethodTasuku Soma
 
Introduction to the AKS Primality Test
Introduction to the AKS Primality TestIntroduction to the AKS Primality Test
Introduction to the AKS Primality TestPranshu Bhatnagar
 
A Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment ProblemA Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment ProblemMary Calkins
 
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsSimplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsPer Kristian Lehre
 
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsSimplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsPK Lehre
 
module4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfmodule4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfShiwani Gupta
 
SMB_2012_HR_VAN_ST-last version
SMB_2012_HR_VAN_ST-last versionSMB_2012_HR_VAN_ST-last version
SMB_2012_HR_VAN_ST-last versionLilyana Vankova
 
Kolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametriclineKolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametriclineAlina Barbulescu
 
Ch3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhh
Ch3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhhCh3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhh
Ch3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhhdanielgetachew0922
 

Similar to Closeszdfgjklfghjt.string (20)

Nonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares MethodNonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares Method
 
Unit 3 daa
Unit 3 daaUnit 3 daa
Unit 3 daa
 
Introduction to the AKS Primality Test
Introduction to the AKS Primality TestIntroduction to the AKS Primality Test
Introduction to the AKS Primality Test
 
A Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment ProblemA Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
 
Lecture5
Lecture5Lecture5
Lecture5
 
big_oh
big_ohbig_oh
big_oh
 
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsSimplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution Algorithms
 
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsSimplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution Algorithms
 
module4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfmodule4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdf
 
SMB_2012_HR_VAN_ST-last version
SMB_2012_HR_VAN_ST-last versionSMB_2012_HR_VAN_ST-last version
SMB_2012_HR_VAN_ST-last version
 
Kolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametriclineKolev skalna2018 article-exact_solutiontoa_parametricline
Kolev skalna2018 article-exact_solutiontoa_parametricline
 
factoring
factoringfactoring
factoring
 
stochastic processes assignment help
stochastic processes assignment helpstochastic processes assignment help
stochastic processes assignment help
 
lecture6.ppt
lecture6.pptlecture6.ppt
lecture6.ppt
 
algorithm Unit 3
algorithm Unit 3algorithm Unit 3
algorithm Unit 3
 
Math
MathMath
Math
 
Least Squares method
Least Squares methodLeast Squares method
Least Squares method
 
Ch08
Ch08Ch08
Ch08
 
Ch3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhh
Ch3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhhCh3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhh
Ch3(1).pptxbbbbbbbbbbbbbbbbbbbhhhhhhhhhh
 
Kumegawa russia
Kumegawa russiaKumegawa russia
Kumegawa russia
 

Recently uploaded

Design-System - FinTech - Isadora Agency
Design-System - FinTech - Isadora AgencyDesign-System - FinTech - Isadora Agency
Design-System - FinTech - Isadora AgencyIsadora Agency
 
Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...
Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...
Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...samsungultra782445
 
Edward Boginsky's Trailblazing Contributions to Printing
Edward Boginsky's Trailblazing Contributions to PrintingEdward Boginsky's Trailblazing Contributions to Printing
Edward Boginsky's Trailblazing Contributions to PrintingEdward Boginsky
 
Branding in the Psychedelic Landscape Report.pdf
Branding in the Psychedelic Landscape Report.pdfBranding in the Psychedelic Landscape Report.pdf
Branding in the Psychedelic Landscape Report.pdfAlexandra Plesner
 
Spring Summer 26 Colors Trend Book Peclers Paris
Spring Summer 26 Colors Trend Book Peclers ParisSpring Summer 26 Colors Trend Book Peclers Paris
Spring Summer 26 Colors Trend Book Peclers ParisPeclers Paris
 
如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证ugzga
 
Academic Portfolio (2017-2021) .pdf
Academic Portfolio (2017-2021)      .pdfAcademic Portfolio (2017-2021)      .pdf
Academic Portfolio (2017-2021) .pdfM. A. Architects
 
如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证ugzga
 
TRose UXPA Experience Design Concord .pptx
TRose UXPA Experience Design Concord .pptxTRose UXPA Experience Design Concord .pptx
TRose UXPA Experience Design Concord .pptxtrose8
 
Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...
Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...
Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...drmarathore
 
How to Build a Simple Shopify Website
How to Build a Simple Shopify WebsiteHow to Build a Simple Shopify Website
How to Build a Simple Shopify Websitemark11275
 
ECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdf
ECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdfECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdf
ECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdfSarbjit Bahga
 
如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证ugzga
 
如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证ugzga
 
一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样
一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样
一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样yhavx
 
18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...
18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...
18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...Payal Garg #K09
 
如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证ugzga
 
如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证ugzga
 
Gamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad IbrahimGamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad Ibrahimamgadibrahim92
 
Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 

Recently uploaded (20)

Design-System - FinTech - Isadora Agency
Design-System - FinTech - Isadora AgencyDesign-System - FinTech - Isadora Agency
Design-System - FinTech - Isadora Agency
 
Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...
Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...
Abortion pills in Riyadh +966572737505 <> buy cytotec <> unwanted kit Saudi A...
 
Edward Boginsky's Trailblazing Contributions to Printing
Edward Boginsky's Trailblazing Contributions to PrintingEdward Boginsky's Trailblazing Contributions to Printing
Edward Boginsky's Trailblazing Contributions to Printing
 
Branding in the Psychedelic Landscape Report.pdf
Branding in the Psychedelic Landscape Report.pdfBranding in the Psychedelic Landscape Report.pdf
Branding in the Psychedelic Landscape Report.pdf
 
Spring Summer 26 Colors Trend Book Peclers Paris
Spring Summer 26 Colors Trend Book Peclers ParisSpring Summer 26 Colors Trend Book Peclers Paris
Spring Summer 26 Colors Trend Book Peclers Paris
 
如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(RUG毕业证书)格罗宁根大学毕业证成绩单本科硕士学位证留信学历认证
 
Academic Portfolio (2017-2021) .pdf
Academic Portfolio (2017-2021)      .pdfAcademic Portfolio (2017-2021)      .pdf
Academic Portfolio (2017-2021) .pdf
 
如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(Birmingham毕业证书)伯明翰大学学院毕业证成绩单本科硕士学位证留信学历认证
 
TRose UXPA Experience Design Concord .pptx
TRose UXPA Experience Design Concord .pptxTRose UXPA Experience Design Concord .pptx
TRose UXPA Experience Design Concord .pptx
 
Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...
Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...
Abortion pills in Kuwait 🚚+966505195917 but home delivery available in Kuwait...
 
How to Build a Simple Shopify Website
How to Build a Simple Shopify WebsiteHow to Build a Simple Shopify Website
How to Build a Simple Shopify Website
 
ECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdf
ECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdfECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdf
ECHOES OF GENIUS - A Tribute to Nari Gandhi's Architectural Legacy. .pdf
 
如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Bath毕业证书)巴斯大学毕业证成绩单本科硕士学位证留信学历认证
 
如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证
如何办理(ArtEZ毕业证书)ArtEZ艺术学院毕业证成绩单本科硕士学位证留信学历认证
 
一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样
一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样
一比一原版(ANU毕业证书)澳大利亚国立大学毕业证原件一模一样
 
18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...
18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...
18+ Young ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Girl Serviℂ...
 
如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(Columbia College毕业证书)纽约市哥伦比亚大学毕业证成绩单本科硕士学位证留信学历认证
 
如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UMN毕业证书)明尼苏达大学毕业证成绩单本科硕士学位证留信学历认证
 
Gamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad IbrahimGamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad Ibrahim
 
Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Semarang ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

Closeszdfgjklfghjt.string

  • 1. Closest String and Closest Substring Problems Closest String and Closest Substring Problems Dongbo Bu January 8, 2010 Dongbo Bu Closest String and Closest Substring Problems
  • 2. Closest String and Closest Substring Problems Problem Formulation Problem Statement I Closest String Given a set S = {s1, s2, · · · , sn} of strings each length m, find a center string s of length m minimizing d such that for every string si ∈ S, d(s, si) ≤ d. Here d(s, si) is the Hamming distance between s and si. Closest Substring Given a set S = {s1, s2, · · · , sn} of strings each length m and an integer L, find a center string s of length L minimizing d such that for every string si ∈ S there is a length L substring ti of si with d(s, ti) ≤ d. Dongbo Bu Closest String and Closest Substring Problems
  • 3. Closest String and Closest Substring Problems Problem Formulation Example of Closest String Given 4 strings s1, s2, s3, s4, Optimal center string s = 011000 d = 4 max i=1 d(si, s) = max{2, 2, 1, 2} = 2 Dongbo Bu Closest String and Closest Substring Problems
  • 4. Closest String and Closest Substring Problems Problem Formulation Example of Closest Substring Given 4 strings s1, s2, s3, s4, L = 4, Optimal center string s = 1100, d = 4 max i=1 d(ti, s) = max{1, 0, 1, 2} = 2 Dongbo Bu Closest String and Closest Substring Problems
  • 5. Closest String and Closest Substring Problems Algorithm for Closest String Problem Basic Idea In polynomial time, we can enumerate all n r strings for any fixed r. We can prove that, on positions that r strings all agree, denote as Q, it is a good approximate solution. We can also prove that, |Q| ≥ m − r · dopt. On other positions, we use LP + Random rounding technique to obtain an approximate solution. Dongbo Bu Closest String and Closest Substring Problems
  • 6. Closest String and Closest Substring Problems Algorithm for Closest String Problem Algorithm I Input : s1, s2, · · · , sn ∈ Σm, an integer r ≥ 2 and a small number ǫ > 0. Output : a center string s ∈ Σm Algorithm : 1 for each r-element subset {si1 , si2 , · · · , sir } of the n input strings do 1 let Q = {j|si1 [j] = si2 [j] = · · · = sir [j]}, P = {1, 2, · · · , m} − Q Dongbo Bu Closest String and Closest Substring Problems
  • 7. Closest String and Closest Substring Problems Algorithm for Closest String Problem Algorithm II 2 solve the following optimization problem min d s.t. d(si|P , y) + d(si|Q, si1 |Q) ≤ d, i = 1, 2, · · · , n Random rounding the fractional solution ¯y to get a approximation solution y. Use derandomization technique to make this step deterministic rather than random. 3 let s′ |Q = si1 |Q, s′ |P = y. Calculate the radius of the solution with s′ as the center string. 2 for i = 1, 2, · · · , n do 1 calculate the radius of the solution with si as the center string. 3 output the best solution of the above two steps. Dongbo Bu Closest String and Closest Substring Problems
  • 8. Closest String and Closest Substring Problems Algorithm for Closest String Problem Example of Closest String I Suppose r = 2. In step 1, we should enumerate all 4 2 = 6 cases. In each cases, we select 2 lines, calculate P and Q. Dongbo Bu Closest String and Closest Substring Problems
  • 9. Closest String and Closest Substring Problems Algorithm for Closest String Problem Example of Closest String II In step 2, we fix s′|Q = 1000, solve the following optimization problem : min d s.t. y10 + y11 = 1 y20 + y21 = 1 y11 + y21 + 0 ≤ d y10 + y20 + 0 ≤ d y11 + y20 + 2 ≤ d y11 + y20 + 2 ≤ d Dongbo Bu Closest String and Closest Substring Problems
  • 10. Closest String and Closest Substring Problems Algorithm for Closest String Problem Example of Closest String III Solve this linear programming, and random rounding the fractional solution to integer solution : y10 = y21 = 1, y11 = y20 = 0 So, s′ |P = 01, s′ = 011000 d = 4 max i=1 d(si, s′ ) = max{1, 1, 2, 3} = 3 Try all 4 2 = 6 cases, obtain the minimum, denote as d0. Then we finish step 1. Dongbo Bu Closest String and Closest Substring Problems
  • 11. Closest String and Closest Substring Problems Algorithm for Closest String Problem Example of Closest String IV In step 2, we calculate the radius when si is the center string. d1 = 4 max i=1 d(s1, si) d2 = 4 max i=1 d(s2, si) d3 = 4 max i=1 d(s3, si) d4 = 4 max i=1 d(s4, si) Calculate the minimal radius in both step 1 and step 2. d = min{d0, d1, d2, d3, d4} Dongbo Bu Closest String and Closest Substring Problems
  • 12. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis I Theorem The above algorithm is a PTAS for Closest String problem. Dongbo Bu Closest String and Closest Substring Problems
  • 13. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis II Proof. Obviously, the time complexity of the algorithm is O (nm)rnO(log |Σ|·r2/ǫ2) , which is polynomial in terms of n,m. The proof of approximation guarantee is organized as 3 lemmas as follows : Lemma 1 proves s′|Q is a good approximation to s with approximation rate 1 + 1 2r−1 . Lemma 2 proves |P| < r · dopt. Based on the above 2 lemmas, Lemma 3 proves step 1.2 obtains a approximate solution with rate (1 + 1 2r−1 + ǫ). Dongbo Bu Closest String and Closest Substring Problems
  • 14. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis III Lemma 1 If maxi≤i,j≤n d(si, sj) > (1 + 1 2r−1 )dopt, then there exists r indices 1 ≤ i1, i2, · · · , ir ≤ n such that for any 1 ≤ l ≤ n, d(sl|Q, si1 |Q) − d(sl|Q, s|Q) ≤ 1 2r − 1 dopt where Q is the set of positions that si1 , si2 , · · · , sir all agree. if maxi≤i,j≤n d(si, sj) ≤ (1 + 1 2r−1 )dopt, then any si will be a good center string. (Recall the step 2 of the algorithm) Dongbo Bu Closest String and Closest Substring Problems
  • 15. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis IV Lemma 2 Let P = {1, 2, · · · , m} − Q, then |P| ≤ r · dopt and |Q| ≥ m − r · dopt. Proof. Let q ∈ P, then there exists some sij such that sij [q] = s[q]. Since d(sij , s) ≤ dopt, each sij contributes at most dopt positions for P. Thus |P| ≤ r · dopt. this lemma gives the lower bound of dopt, dopt ≥ |P| r , which is essential for the analysis of LP + random rounding! Dongbo Bu Closest String and Closest Substring Problems
  • 16. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis V Lemma 3 Given a string s′ and a position set Q and P, |P| < r · dopt, such that for any i = 1, 2, · · · , n, d(si|Q, s′ |Q) − d(si|Q, s|Q) ≤ ρ · dopt, then step 1.2 gives a solution with approximate rate (1 + ρ + ǫ) in polynomial time for any fixed ǫ ≥ 0. Before the proof, we can see that the conditions of this lemma are all satisfied by lemma 2, where ρ = 1 2r−1 . Proof Dongbo Bu Closest String and Closest Substring Problems
  • 17. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis VI Recall the optimization problem in step 1.2 min d s.t. d(si|P , y) + d(si|Q, s′ |Q) ≤ d, i = 1, 2, · · · , n First, we show that y = s|P is a solution with cost d ≤ (1 + ρ)dopt. In fact, for i = 1, 2, · · · , n d(si|P , s|P ) + d(si|Q, s′ |Q) ≤ d(si|P , s|P ) + d(si|Q, s|Q) + ρ · dopt ≤ (1 + ρ)dopt Second, rewirte the optimization problem to an ILP problem. In order for this, define 0-1 variables yj,a, 1 ≤ j ≤ |P|, a ∈ Σ, Dongbo Bu Closest String and Closest Substring Problems
  • 18. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis VII yj,a = 1 means y[j] = a; define χ(si[j], a) = 0 if si[j] = a and 1 if si[j] = a. Then the above optimization problem can be formulated as follows min d s.t. a∈Σ yj,a = 1, 1 ≤ j ≤ |P| 1≤j≤|P| a∈Σ χ(si[j], a)yj,a + d(si|Q, s′|Q) ≤ d, 1 ≤ i ≤ n Solve it by LP to get a fractional solution ¯y with cost ¯d. Random rounding ¯y to y′ by independently set y′ j,a = 1 and y′ j,b = 0, b ∈ Σ − {a} with probability ¯yj,a. So d(si|P , y′) = |P| i=1 a∈Σ χ(si[j], a)yj,a, which is the sum of |P| independent random variables. Dongbo Bu Closest String and Closest Substring Problems
  • 19. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis VIII E(d(si|P )) = 1≤j≤|P| a∈Σ χ(si[j], a) ¯yj,a ≤ ¯d − d(si|Q, s′ |Q) ≤ (1 + ρ) · dopt − d(si|Q, s′ |Q) Employ Chernoff Bound Pr(X > µ + ǫn) ≤ exp(− 1 3 nǫ2 ) we have Pr(d(si|P , y′ ) > (1 + ρ)dopt − d(si|Q, s′ |Q) + ǫ′ |P|) ≤ exp(− 1 3 ǫ′2 |P|) Dongbo Bu Closest String and Closest Substring Problems
  • 20. Closest String and Closest Substring Problems Algorithm for Closest String Problem Analysis IX Let s′|P = y′ and consider all n strings, we claim Pr(d(si, s′ ) < (1 + ρ)dopt + ǫ′ |P| for all i) ≥ 1 − n exp(− 1 3 ǫ′2 |P|) Use standard derandomization methods, we can obtain a deterministic s′ that satisfies d(si, s′ ) < (1 + ρ)dopt + ǫ′ |P|, 1 ≤ i ≤ n Recall that |P| < r · dopt, let ǫ′ = ǫ r , we have d(si, s′ ) < (1 + ρ + ǫ)dopt, 1 ≤ i ≤ n Then we finish the proof. Dongbo Bu Closest String and Closest Substring Problems
  • 21. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Basic Idea We want to follow the algorithm of Closest String algorithm. However, we do not know how to construct an optimization problem, for the reason that we do not konw the optimal substring in each string. Thus, the choice of a “good” substring from every string si is the only obstacle on the way to the solution. As we do in the algorithm of Closest String, first we try all the choices of r substrings from S, we can assume that ti1 , ti2 , · · · , tir is a good partial solution. After that, we randomly pick O(log(mn)) positions from P. By trying all length |R| strings, we can assume we know s|R. Then for each i ≤ i ≤ n, we find the substring t′ i from si such that f(t′ i) = d(s|R, t′ i|R) · |P| |R| + d(ti1 |Q, t′ i|Q) is minimized. We use t′ i, 1 ≤ i ≤ n to construct the optimization problem. Dongbo Bu Closest String and Closest Substring Problems
  • 22. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Algorithm for Closest Substring problem I Input : s1, s2, · · · , sn ∈ Σm, an integer 1 ≤ L ≤ m, an integer r ≥ 2 and a small number ǫ > 0. Ouput : center string s Algorithm : 1 for each r-element subset {ti1 , ti2 , · · · , tir }, where tij is a substring of length L from sij do 1 let Q = {j|ti1 [j] = ti2 [j] = · · · = tir [j]}, P = {1, 2, · · · , m} − Q 2 randomly select 4 ǫ2 log(mn) positions from P, denote as R 3 for every string x of length |R| do 1 for 1 ≤ i ≤ n, let t′ i be a length L substring of si minimizing f(t′ i) = d(x, t′ i|R) · |P | |R| + d(ti1 |Q, t′ i|Q). Dongbo Bu Closest String and Closest Substring Problems
  • 23. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Algorithm for Closest Substring problem II 2 solve the optimization problem min d s.t. d(t′ i|P , y) + d(t′ i|Q, ti−1|Q) ≤ d, 1 ≤ i ≤ n Random rounding the fractional solution ¯y to get a approximation solution y. Use derandomization technique to make this step deterministic rather than random. 3 Let s′ |Q = ti1 |Q and s′ |P = y. Let c = maxn i=1 d(s′ , t′ i). 2 for every length L substring s′ of s1 do 1 Let c = maxn i=1 minti d(s′ , ti). 3 output the s′ with minimum c in step 1.3.3 and step 2.1. Dongbo Bu Closest String and Closest Substring Problems
  • 24. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Example of Closest Subtring I Suppose r = 2. In step 1, we should enumerate all 4 2 × 3 × 3 = 54 cases. In each cases, we select 2 substring of length 4 , calculate P and Q. We fix s′|Q = 00. Randomly select |R| = O(log(mn)) positions, say, |R| = 1. Then enumerate all length |R| strings, say, s′|R = 0. Dongbo Bu Closest String and Closest Substring Problems
  • 25. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Example of Closest Subtring II Dongbo Bu Closest String and Closest Substring Problems
  • 26. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Example of Closest Subtring III Now, for each si, find out the t′ i minimizing f(u) = d(u|R, s′|R) × 2 + d(u|Q, s′|Q). Dongbo Bu Closest String and Closest Substring Problems
  • 27. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Example of Closest Subtring IV Solve the following optimization problem : min d s.t. y10 + y11 = 1 y20 + y21 = 1 y10 + y21 + 0 ≤ d y10 + y21 + 0 ≤ d y10 + y21 + 2 ≤ d y10 + y21 + 1 ≤ d Dongbo Bu Closest String and Closest Substring Problems
  • 28. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Example of Closest Subtring V Solve this linear programming, and random rounding the fractional solution to integer solution : y10 = y21 = 0, y11 = y20 = 1 So, s′ = 1000 d = 4 max i=1 d(ti, s′ ) = max{0, 0, 2, 1} = 2 Try all 54 cases, obtain the minimum, denote as d0. Then we finish step 1. Dongbo Bu Closest String and Closest Substring Problems
  • 29. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Example of Closest Subtring VI In step 2, we calculate the radius when ti is the center string where ti is a length L substring of si. denote the minimun d1. Calculate the minimal radius in both step 1 and step 2. d = min{d0, d1} Dongbo Bu Closest String and Closest Substring Problems
  • 30. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis I Theorem The above algorithm is a PTAS for Closest Substring problem. Dongbo Bu Closest String and Closest Substring Problems
  • 31. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis II Proof. The time complexity of the algorithm is O (nm)O(log |Σ|/δ4) , which is polynomial in terms of n,m. The proof of approximation guarantee is organized as 3 lemmas as follows : Lemma 1 proves s′|Q is a good approximation to s with approximation rate 1 + 1 2r−1 . Define s∗|P = s|P and s∗|Q = ti1 |Q, lemma 2 proves d(s∗, t′ i) ≤ d(s∗, ti) + 2ǫ|P| for all 1 ≤ i ≤ n. Based on the above 2 lemmas, Lemma 3 proves the algorithm obtains a approximate solution with rate (1 + 1 2r−1 + 3ǫr). Dongbo Bu Closest String and Closest Substring Problems
  • 32. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis III Lemma 1 There exists ti1 , ti2 , · · · , tir chosen in step 1, such that for any 1 ≤ l ≤ n d(tl|Q, ti1 |Q) − d(sl|Q, s|Q) ≤ 1 2r − 1 · dopt Proof. The fact follow from Lemma 1 of Closest String directly. Dongbo Bu Closest String and Closest Substring Problems
  • 33. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis IV Lemma 2 Define s∗|P = s|P and s∗|Q = ti1 |Q. Then we have, with high probability d(s∗ , t′ i) ≤ d(s∗ , ti) + 2ǫ|P| for all 1 ≤ i ≤ n. Proof. The randomness comes from the randomly selected |R|. Use standard method, we can derandomize it to make this deterministic. Dongbo Bu Closest String and Closest Substring Problems
  • 34. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis V Lemma 3 step 1.3.3 gives an approximation solution s′ with approximation rate (1 + 1 2r−1 + 3ǫr). Proof. Recall the optimization problem in step 1.2 min d s.t. d(t′ i|P , y) + d(t′ i|Q, s′ |Q) ≤ d, i = 1, 2, · · · , n y = s|P is a feasible solution. Now we calculate its radius when s∗ is the center string. Recall that s∗|P = s|P and s∗|Q = ti1 |Q. Dongbo Bu Closest String and Closest Substring Problems
  • 35. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis VI According to Lemma 2 and Lemma 1 d(s∗ , t′ i) ≤ d(s∗ , ti) + 2ǫ|P| ≤ d(s|P , ti|P ) + d(ti1 |Q, ti|Q) + 2ǫ|P| ≤ d(s|P , ti|P ) + d(s|Q, ti|Q) + 2ǫ|P| + 1 2r − 1 dopt ≤ (1 + 1 2r − 1 )dopt + 2ǫ|P|) In a short word, y = s|P is a solution of the optimization problem with cost at most (1 + 1 2r−1 )dopt + 2ǫ|P|). Next, as we do in the Closest String problem, we rewirte the optimization problem to an ILP and solve it and random rounding the fractional solution ¯y to y. Define s′|Q = ti1 |Q and s′|P = y. Dongbo Bu Closest String and Closest Substring Problems
  • 36. Closest String and Closest Substring Problems Algorithm for Closest Substring problem Analysis VII E(d(s′ , t′ i)) ≤ ¯d ≤ (1 + 1 2r − 1 + 2ǫ|P|)dopt So, chernoff bound ensures that, with high probability, d(s′ , t′ i) ≤ (1 + 1 2r − 1 )dopt + 2ǫ|P| + ǫ|P| ≤ (1 + 1 2r − 1 )dopt + 3ǫ|P| ≤ (1 + 1 2r − 1 + 3ǫr)dopt After derandomization, we can obtain an approximation solution s′ with rate (1 + 1 2r−1 + 3ǫr)dopt. Dongbo Bu Closest String and Closest Substring Problems