ProbSem191214

Like a Needle in a Haystack: Rates and
Algorithms for Group Testing
Leonardo Baldassini
Joint work with Oliver Johnson, Matthew Aldridge and Karen Gunderson
19 December 2014

Warm-up examples
• Molecular biology (DNA screening)
• Recommender systems
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 1 / 24

Warm-up examples
• Molecular biology (DNA screening)
• Recommender systems
• Spectrum sensing
• High-throughput screening techniques
• Network tomography
• Cryptography and cyber security

DNA screening
Problem
Find a speciﬁc, rare sequence of nucleotides among many DNA samples

DNA screening
Problem
1
2
3
4

DNA screening
Problem
1
2
3
4
qPCR

DNA screening
Problem
1
2
3
4
qPCR
+
≥ 1 match
in sample

DNA screening
Problem
1
2
3
4
qPCR
+
≥ 1 match
in sample
−
no match
in sample

Recommender systems
Problem
Suggest songs to a user based on his past preferences.

Recommender systems
Problem
User selects
song

Recommender systems
Problem
User selects
song
RS proposes
song

Recommender systems
Problem
User selects
song
RS proposes
song
+
−

Recommender systems
Problem
User selects
song
RS proposes
song
+
−
RS adjusts guess
on stylis-
tic features

What do they have in common?
COMMON FEATURES

COMMON FEATURES
• Search for sparse property

COMMON FEATURES
• Property can be tested on
groups

COMMON FEATURES
• Property can be tested on
groups
GROUP TESTING
Search methods to recover a sparse
subset of items from a population
that share a feature which can be
detected on groups

Boolean group testing
0 1 0 0 0 0 1 1 0 0 1 0
defectives K
population
N

0 1 0 0 0 0 1 1 0 0 1 0
defectives K
population
N
0 0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
0 1 0 1 0 0 0 0 0 0 1 0
1 0 1 0 0 0 1 1 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0
test design
X

0 1 0 0 0 0 1 1 0 0 1 0
defectives K
population
N
0 0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
0 1 0 1 0 0 0 0 0 0 1 0
1 0 1 0 0 0 1 1 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0
test design
X
1
1
1
1
0
0
0
0
0
0
0
test oucomes
y

0 1 0 0 0 0 1 1 0 0 1 0
defectives K
population
N
0 0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
0 1 0 1 0 0 0 0 0 0 1 0
1 0 1 0 0 0 1 1 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0
test design
X
1
1
1
1
0
0
0
0
0
0
0
test oucomes
y
|N| = N, |K| = K
K = N1−β
, β > 0

0 1 0 0 0 0 1 1 0 0 1 0
defectives K
population
N
0 0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
0 1 0 1 0 0 0 0 0 0 1 0
1 0 1 0 0 0 1 1 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0
test design
X
1
1
1
1
0
0
0
0
0
0
0
test oucomes
y
|N| = N, |K| = K
K = N1−β
, β > 0
particularly good for
non-adaptive algorithms

Anatomy of an algorithm
N
K possible
defective
subsets
sequence of
test groups
X ∈ {0, 1}T×N
Encoding
Tests {xi yi }T
i=1 → K
Decoding
adaptivity
Recovered
set K

N
K possible
defective
subsets
sequence of
test groups
X ∈ {0, 1}T×N
Encoding
Tests {xi yi }T
i=1 → K
Decoding
adaptivity
Recovered
set K
log2
N
K bits label

N
K possible
defective
subsets
sequence of
test groups
X ∈ {0, 1}T×N
Encoding
Tests {xi yi }T
i=1 → K
Decoding
adaptivity
Recovered
set K
log2
N
K bits label
each test encodes
part of the label

Rate of group testing (algorithms)
RA =
log2
N
K
TA
bits per test

RA =
log2
N
K
TA
bits per test
RA measures how much we learn with each test, on average.

RA =
log2
N
K
TA
bits per test
R∗
A (β) supremum of rates for algorithm A.

RA =
log2
N
K
TA
bits per test
R∗
A (β) supremum of rates for algorithm A. (Yes, it’s bounded).

WHY DO WE LIKE RATES?

Universal upper bound for noiseless GT
Theorem (Aldridge, B., Johnson, 2013)
The probability of success of any algorithm A can be upper-bounded as
P(success) ≤
2TA
N
K
.

Two important consequences
P(success) ≤
2TA
N
K

P(success) ≤
2TA
N
K
• Successful algorithms use T ≥ log2
N
K tests
optimal algorithms use T = c log2
N
K tests

P(success) ≤
2TA
N
K
N
K tests
N
K tests
• R =
log2
N
K
T
“ranks” optimal algorithms

P(success) ≤
2TA
N
K
N
K tests
N
K tests
• R =
log2
N
K
T
“ranks” optimal algorithms
• Successful algorithms have R∗
A (β) ≤ 1

Capacity of GT

Capacity of GT
C C + ε
C − ε

Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
such that P(success) → 1
Direct part

Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
has P(success) < 1 − η, η > 0
Converse part

Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
Converse part
• C is inherent in the GT
model

Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
Converse part
model
• C measures the “hardness”
of a GT problem

Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
Converse part
model
• C measures the “hardness”
of a GT problem
• C quantifies an
efficiency/effectiveness
trade-off

Can we get to R = 1?

population

population
Positive, halve

population
Negative, discard

population
Ok, this is easy

population
Can’t be this...

population
Must be this!

population
Put these back into
the population

population
• Hwang, 1972 (HGBS)

population
• Achieves RHGBS = C = 1
THGBS ≤ K log2 N + (1 + log2 ln 2)K
− log K! + W
= O(K log N)

population
• Achieves RHGBS = C = 1
• Careful choice of sample size
is key

Can we get there...non-adaptively?

• Maybe

• Maybe
• What we did:

• Maybe
• What we did:
• Introduced and studied DD (Deﬁnite Defectives) and SSS (Smallest
Satisfying Set)
• ...hence Bernoulli sampling, xij ∼ Bern(p), p = 1
K+1
• Showed limitation of Bernoulli-based algorithms

DD: Deﬁnite Defectives

0 1 0 0 0 0 1 1 0 0 1 0
0 0 1 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
1 1 0 1 0 1 0 0 0 0 1 0
1 0 1 0 0 0 1 0 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 0 1 0 0 0
1
1
1
1
0
0
0
0
0
0
0
1. X Bernoulli design.

0 1 0 0 0 0 1 1 0 0 1 0
0 0 1 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
1 1 0 1 0 1 0 0 0 0 1 0
1 0 1 0 0 0 1 0 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 0 1 0 0 0
1
1
1
1
0
0
0
0
0
0
0
2. Look at negative tests. . .

0 1 0 0 0 0 1 1 0 0 1 0
0 0 1 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 1 1 0 0
1 1 0 1 0 1 0 0 0 0 1 0
1 0 1 0 0 0 1 0 0 0 0 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1
1 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 0 1 0 0 0
1
1
1
1
0
0
0
0
0
0
0
3. . . . classify items therein
as non-def. . . .

0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
as non-def. . . .
4. . . . and remove them from
X.

0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
as non-def. . . .
X.
5. Look at positive tests with
1! unclassiﬁed item:

0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
as non-def. . . .
X.
6. That’s deﬁnite defective!

0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
as non-def. . . .
X.
6. That’s deﬁnite defective!
TDD ≤ max {β, 1 − β} eK ln N
R∗
DD(β) ≥
1
e ln 2
min 1,
β
1 − β

What R∗
DD(β) looks like
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
Lower bound on R∗
DD(β)
Universal up-
per bound

Computing R∗
DD(β): core of the proof - construction

Computing R∗
• PD =
{items not in negative tests}

Computing R∗
• PD =
• |PD | = K + G, G intruding
non-defectives

Computing R∗
• PD =
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD

Computing R∗
• PD =
non-defectives
from PD
P(success) = P i∈K{Li = 0}

Computing R∗
• PD =
non-defectives
from PD
Bad news We can’t control G

Computing R∗
• PD =
non-defectives
from PD
Good news We can control G | M0,
M0 = # negative tests.

Computing R∗
• PD =
non-defectives
from PD
P(success) =
T
m0=0
T
m0
(1 − p)Km0
1 − (1 − p)K T−m0
×
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)

Computing R∗
• PD =
non-defectives
from PD
P(success) =
T
m0=0
T
m0
(1 − p)Km0
1 − (1 − p)K T−m0
×
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
there’s a sum in here, too!

Computing R∗
DD(β): core of the proof - concentration

Computing R∗
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
≥ max{0, 1 − K exp(Θ(T, m0))}

Computing R∗
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
≥ max{0, 1 − K exp(Θ(T, m0))}
• M0 ∈ (EM0 − ε, EM0 + ε) w.h.p.

Computing R∗
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
≥ max{0, 1 − K exp(Θ(T, m0))}
• M0 ∈ (EM0 − ε, EM0 + ε) w.h.p.
• Θ(T, m0) ≤ −(max{β, 1 − β}) ln N
for M0 close to EM0

Computing R∗
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
≥ max{0, 1 − K exp(Θ(T, m0))}
• M0 ∈ (EM0 − ε, EM0 + ε) w.h.p.
• Θ(T, m0) ≤ −(max{β, 1 − β}) ln N
for M0 close to EM0
P(success) ≥ P T 1 − p)K
− ε/e ≤ M0 ≤ T (1 − p)K
+ ε/e (1−N−δ
)

SSS: Smallest Satisfying Set
Group testing as LP:

minimize 1 z
subject to xt · z = 0 for t with yt = 0
xt · z ≥ 1 for t with yt = 1
(1 z ≥ K)
z ∈ {0, 1}N

minimize 1 z
(1 z ≥ K)
z ∈ {0, 1}N
• SSS brute-force searches
for a satisfying K

minimize 1 z
(1 z ≥ K)
z ∈ {0, 1}N
for a satisfying K
• Best possible algorithm
(sample-complexity-wise)

minimize 1 z
(1 z ≥ K)
z ∈ {0, 1}N
for a satisfying K
• Infeasible, but benchmark

minimize 1 z
(1 z ≥ K)
z ∈ {0, 1}N
for a satisfying K
• Infeasible, but benchmark
R∗
SSS(β) ≥
1
ln 2
max
α∈[ln 2,1]
min 2αe−α β
2 − β
, − ln 1 − 2e−α
+ 2e−2α
.

What R∗
SSS(β) looks like
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
Lower bound on R∗
SSS(β)

Separability
SSS may fail if more than one
subset satisﬁes constraints

Separability
Question:
Can we enforce
uniqueness in X ?

Separability
Question:
Can we enforce
uniqueness in X ?
Deﬁnition
X is K-separable if, for any K-subsets L, M ⊂ {1, . . . , N}, it is
i∈M
x(i)
=
j∈L
x(j)

Separability
Question:
Can we enforce
uniqueness in X ?
Deﬁnition
X is K-separable if, for any K-subsets L, M ⊂ {1, . . . , N}, it is
i∈M
x(i)
=
j∈L
x(j)
Notice that y = K x(i)
.

Almost separability
Building separable X requires
T = Ω(K2
log N) tests

Almost separability
T = Ω(K2
log N) tests
Idea:
What if we re-
lax separability?

Almost separability
T = Ω(K2
log N) tests
Idea:
What if we re-
lax separability?
Deﬁnition
X is ε-almost K-separable if there are at most ε N
K K-subsets that break
separability.

Almost separability
T = Ω(K2
log N) tests
Idea:
What if we re-
lax separability?
Deﬁnition
X is ε-almost K-separable if there are at most ε N
K K-subsets that break
separability.
• Almost-separable matrices exist
• Need T = O(K log N) tests to get one
• A Bernoulli test design is almost-separable w.h.p. (via concentration)

Family-wise upper bound via SSS

Consider SSS using TSSS tests. Then,
P(success) → 1 ⇒ TSSS >
(1 − β)e ln 2
β
log2
N
K
,
and, if the necessary condition is violated,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.

(1 − β)e ln 2
β
log2
N
K
,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
• “Weak converse”

(1 − β)e ln 2
β
log2
N
K
,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
• Applies to all Bernoulli-based algorithms

(1 − β)e ln 2
β
log2
N
K
,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
• Applies to all Bernoulli-based algorithms
• R∗
SSS(β) ≤ min 1, β
(1−β)e ln 2

Adaptivity gap
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
SSS -based upper bound

Adaptivity gap
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
Adaptivity gap

Adaptivity gap
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
Adaptivity gap
Non-adaptive Bernoulli-
based algorithms
can’t achieve R = 1

The moral

The moral
• Group testing

The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests

The moral
• Group testing
• Rates and capacities

The moral
• Group testing
• Granular information
• Capacities bound performance of algorithms

The moral
• Group testing
• SSS: bound the performance of any non-adaptive algorithm based on
Bernoulli sampling

The moral
• Group testing
Bernoulli sampling
• DD: order-optimal, rate-optimal (for dense problems)

The moral
• Group testing
Bernoulli sampling
• DD: order-optimal, rate-optimal (for dense problems)
• Future work: noise models, applications, non-identical GT,
non-independent GT, collateral open questions...

ProbSem191214

Recommended

Recommended

More Related Content

Featured

Featured (20)

ProbSem191214