Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
ProbSem191214
1. Like a Needle in a Haystack: Rates and
Algorithms for Group Testing
Leonardo Baldassini
Joint work with Oliver Johnson, Matthew Aldridge and Karen Gunderson
19 December 2014
2. Warm-up examples
• Molecular biology (DNA screening)
• Recommender systems
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 1 / 24
3. Warm-up examples
• Molecular biology (DNA screening)
• Recommender systems
• Spectrum sensing
• High-throughput screening techniques
• Network tomography
• Cryptography and cyber security
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 1 / 24
4. DNA screening
Problem
Find a specific, rare sequence of nucleotides among many DNA samples
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 2 / 24
5. DNA screening
Problem
Find a specific, rare sequence of nucleotides among many DNA samples
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 2 / 24
6. DNA screening
Problem
Find a specific, rare sequence of nucleotides among many DNA samples
1
2
3
4
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 2 / 24
7. DNA screening
Problem
Find a specific, rare sequence of nucleotides among many DNA samples
1
2
3
4
qPCR
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 2 / 24
8. DNA screening
Problem
Find a specific, rare sequence of nucleotides among many DNA samples
1
2
3
4
qPCR
+
≥ 1 match
in sample
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 2 / 24
9. DNA screening
Problem
Find a specific, rare sequence of nucleotides among many DNA samples
1
2
3
4
qPCR
+
≥ 1 match
in sample
−
no match
in sample
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 2 / 24
11. Recommender systems
Problem
Suggest songs to a user based on his past preferences.
User selects
song
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 3 / 24
12. Recommender systems
Problem
Suggest songs to a user based on his past preferences.
User selects
song
RS proposes
song
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 3 / 24
13. Recommender systems
Problem
Suggest songs to a user based on his past preferences.
User selects
song
RS proposes
song
+
−
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 3 / 24
14. Recommender systems
Problem
Suggest songs to a user based on his past preferences.
User selects
song
RS proposes
song
+
−
RS adjusts guess
on stylis-
tic features
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 3 / 24
15. Recommender systems
Problem
Suggest songs to a user based on his past preferences.
User selects
song
RS proposes
song
+
−
RS adjusts guess
on stylis-
tic features
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 3 / 24
16. What do they have in common?
COMMON FEATURES
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 4 / 24
17. What do they have in common?
COMMON FEATURES
• Search for sparse property
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 4 / 24
18. What do they have in common?
COMMON FEATURES
• Search for sparse property
• Property can be tested on
groups
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 4 / 24
19. What do they have in common?
COMMON FEATURES
• Search for sparse property
• Property can be tested on
groups
GROUP TESTING
Search methods to recover a sparse
subset of items from a population
that share a feature which can be
detected on groups
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 4 / 24
20. Boolean group testing
0 1 0 0 0 0 1 1 0 0 1 0
defectives K
population
N
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 5 / 24
27. Anatomy of an algorithm
N
K possible
defective
subsets
sequence of
test groups
X ∈ {0, 1}T×N
Encoding
Tests {xi yi }T
i=1 → K
Decoding
adaptivity
Recovered
set K
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 6 / 24
28. Anatomy of an algorithm
N
K possible
defective
subsets
sequence of
test groups
X ∈ {0, 1}T×N
Encoding
Tests {xi yi }T
i=1 → K
Decoding
adaptivity
Recovered
set K
log2
N
K bits label
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 6 / 24
29. Anatomy of an algorithm
N
K possible
defective
subsets
sequence of
test groups
X ∈ {0, 1}T×N
Encoding
Tests {xi yi }T
i=1 → K
Decoding
adaptivity
Recovered
set K
log2
N
K bits label
each test encodes
part of the label
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 6 / 24
30. Rate of group testing (algorithms)
RA =
log2
N
K
TA
bits per test
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 7 / 24
31. Rate of group testing (algorithms)
RA =
log2
N
K
TA
bits per test
RA measures how much we learn with each test, on average.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 7 / 24
32. Rate of group testing (algorithms)
RA =
log2
N
K
TA
bits per test
RA measures how much we learn with each test, on average.
R∗
A (β) supremum of rates for algorithm A.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 7 / 24
33. Rate of group testing (algorithms)
RA =
log2
N
K
TA
bits per test
RA measures how much we learn with each test, on average.
R∗
A (β) supremum of rates for algorithm A. (Yes, it’s bounded).
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 7 / 24
34. WHY DO WE LIKE RATES?
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 8 / 24
35. Universal upper bound for noiseless GT
Theorem (Aldridge, B., Johnson, 2013)
The probability of success of any algorithm A can be upper-bounded as
P(success) ≤
2TA
N
K
.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 9 / 24
37. Two important consequences
P(success) ≤
2TA
N
K
• Successful algorithms use T ≥ log2
N
K tests
optimal algorithms use T = c log2
N
K tests
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 10 / 24
38. Two important consequences
P(success) ≤
2TA
N
K
• Successful algorithms use T ≥ log2
N
K tests
optimal algorithms use T = c log2
N
K tests
• R =
log2
N
K
T
“ranks” optimal algorithms
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 10 / 24
39. Two important consequences
P(success) ≤
2TA
N
K
• Successful algorithms use T ≥ log2
N
K tests
optimal algorithms use T = c log2
N
K tests
• R =
log2
N
K
T
“ranks” optimal algorithms
• Successful algorithms have R∗
A (β) ≤ 1
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 10 / 24
40. Capacity of GT
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
41. Capacity of GT
C C + ε
C − ε
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
42. Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
such that P(success) → 1
Direct part
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
43. Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
such that P(success) → 1
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
has P(success) < 1 − η, η > 0
Converse part
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
44. Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
such that P(success) → 1
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
has P(success) < 1 − η, η > 0
Converse part
• C is inherent in the GT
model
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
45. Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
such that P(success) → 1
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
has P(success) < 1 − η, η > 0
Converse part
• C is inherent in the GT
model
• C measures the “hardness”
of a GT problem
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
46. Capacity of GT
C C + ε
C − ε
There exists an
algorithm with
lim inf
N→∞
log N
K
T
≥ C − ε
such that P(success) → 1
Direct part
Any algorithm with
lim inf
N→∞
log N
K
T
≥ C + ε
has P(success) < 1 − η, η > 0
Converse part
• C is inherent in the GT
model
• C measures the “hardness”
of a GT problem
• C quantifies an
efficiency/effectiveness
trade-off
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 11 / 24
47. Can we get to R = 1?
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
48. Can we get to R = 1?
population
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
49. Can we get to R = 1?
population
Positive, halve
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
50. Can we get to R = 1?
population
Negative, discard
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
51. Can we get to R = 1?
population
Positive, halve
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
52. Can we get to R = 1?
population
Ok, this is easy
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
53. Can we get to R = 1?
population
Can’t be this...
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
54. Can we get to R = 1?
population
Must be this!
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
55. Can we get to R = 1?
population
Put these back into
the population
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
56. Can we get to R = 1?
population
• Hwang, 1972 (HGBS)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
57. Can we get to R = 1?
population
• Hwang, 1972 (HGBS)
• Achieves RHGBS = C = 1
THGBS ≤ K log2 N + (1 + log2 ln 2)K
− log K! + W
= O(K log N)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
58. Can we get to R = 1?
population
• Hwang, 1972 (HGBS)
• Achieves RHGBS = C = 1
• Careful choice of sample size
is key
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 12 / 24
59. Can we get there...non-adaptively?
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 13 / 24
60. Can we get there...non-adaptively?
• Maybe
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 13 / 24
61. Can we get there...non-adaptively?
• Maybe
• What we did:
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 13 / 24
62. Can we get there...non-adaptively?
• Maybe
• What we did:
• Introduced and studied DD (Definite Defectives) and SSS (Smallest
Satisfying Set)
• ...hence Bernoulli sampling, xij ∼ Bern(p), p = 1
K+1
• Showed limitation of Bernoulli-based algorithms
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 13 / 24
68. DD: Definite Defectives
0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
1. X Bernoulli design.
2. Look at negative tests. . .
3. . . . classify items therein
as non-def. . . .
4. . . . and remove them from
X.
5. Look at positive tests with
1! unclassified item:
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 14 / 24
69. DD: Definite Defectives
0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
1. X Bernoulli design.
2. Look at negative tests. . .
3. . . . classify items therein
as non-def. . . .
4. . . . and remove them from
X.
5. Look at positive tests with
1! unclassified item:
6. That’s definite defective!
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 14 / 24
70. DD: Definite Defectives
0 1 0 0 0 0 1 1 0 0 1 0
0 0 0 0 0 1 1 0 0 1 0
0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
1
1
1
1
0
0
0
0
0
0
0
1. X Bernoulli design.
2. Look at negative tests. . .
3. . . . classify items therein
as non-def. . . .
4. . . . and remove them from
X.
5. Look at positive tests with
1! unclassified item:
6. That’s definite defective!
TDD ≤ max {β, 1 − β} eK ln N
R∗
DD(β) ≥
1
e ln 2
min 1,
β
1 − β
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 14 / 24
71. What R∗
DD(β) looks like
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
Lower bound on R∗
DD(β)
Universal up-
per bound
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 15 / 24
72. Computing R∗
DD(β): core of the proof - construction
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
73. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
74. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
75. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
76. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD
P(success) = P i∈K{Li = 0}
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
77. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD
P(success) = P i∈K{Li = 0}
Bad news We can’t control G
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
78. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD
P(success) = P i∈K{Li = 0}
Bad news We can’t control G
Good news We can control G | M0,
M0 = # negative tests.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
79. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD
P(success) = P i∈K{Li = 0}
Bad news We can’t control G
Good news We can control G | M0,
M0 = # negative tests.
P(success) =
T
m0=0
T
m0
(1 − p)Km0
1 − (1 − p)K T−m0
×
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
80. Computing R∗
DD(β): core of the proof - construction
• PD =
{items not in negative tests}
• |PD | = K + G, G intruding
non-defectives
For every defective i ∈ K:
Li = # tests with i and no item
from PD
P(success) = P i∈K{Li = 0}
Bad news We can’t control G
Good news We can control G | M0,
M0 = # negative tests.
P(success) =
T
m0=0
T
m0
(1 − p)Km0
1 − (1 − p)K T−m0
×
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)
there’s a sum in here, too!
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 16 / 24
81. Computing R∗
DD(β): core of the proof - concentration
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 17 / 24
82. Computing R∗
DD(β): core of the proof - concentration
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)
≥ max{0, 1 − K exp(Θ(T, m0))}
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 17 / 24
83. Computing R∗
DD(β): core of the proof - concentration
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)
≥ max{0, 1 − K exp(Θ(T, m0))}
• M0 ∈ (EM0 − ε, EM0 + ε) w.h.p.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 17 / 24
84. Computing R∗
DD(β): core of the proof - concentration
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)
≥ max{0, 1 − K exp(Θ(T, m0))}
• M0 ∈ (EM0 − ε, EM0 + ε) w.h.p.
• Θ(T, m0) ≤ −(max{β, 1 − β}) ln N
for M0 close to EM0
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 17 / 24
85. Computing R∗
DD(β): core of the proof - concentration
N−K
g=0
N − K
g
(1 − p)gm0
(1 − (1 − p)m0
)N−K−g
× P(success | M0 = m0, G = g)
≥ max{0, 1 − K exp(Θ(T, m0))}
• M0 ∈ (EM0 − ε, EM0 + ε) w.h.p.
• Θ(T, m0) ≤ −(max{β, 1 − β}) ln N
for M0 close to EM0
P(success) ≥ P T 1 − p)K
− ε/e ≤ M0 ≤ T (1 − p)K
+ ε/e (1−N−δ
)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 17 / 24
86. SSS: Smallest Satisfying Set
Group testing as LP:
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 18 / 24
87. SSS: Smallest Satisfying Set
Group testing as LP:
minimize 1 z
subject to xt · z = 0 for t with yt = 0
xt · z ≥ 1 for t with yt = 1
(1 z ≥ K)
z ∈ {0, 1}N
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 18 / 24
88. SSS: Smallest Satisfying Set
Group testing as LP:
minimize 1 z
subject to xt · z = 0 for t with yt = 0
xt · z ≥ 1 for t with yt = 1
(1 z ≥ K)
z ∈ {0, 1}N
• SSS brute-force searches
for a satisfying K
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 18 / 24
89. SSS: Smallest Satisfying Set
Group testing as LP:
minimize 1 z
subject to xt · z = 0 for t with yt = 0
xt · z ≥ 1 for t with yt = 1
(1 z ≥ K)
z ∈ {0, 1}N
• SSS brute-force searches
for a satisfying K
• Best possible algorithm
(sample-complexity-wise)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 18 / 24
90. SSS: Smallest Satisfying Set
Group testing as LP:
minimize 1 z
subject to xt · z = 0 for t with yt = 0
xt · z ≥ 1 for t with yt = 1
(1 z ≥ K)
z ∈ {0, 1}N
• SSS brute-force searches
for a satisfying K
• Best possible algorithm
(sample-complexity-wise)
• Infeasible, but benchmark
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 18 / 24
91. SSS: Smallest Satisfying Set
Group testing as LP:
minimize 1 z
subject to xt · z = 0 for t with yt = 0
xt · z ≥ 1 for t with yt = 1
(1 z ≥ K)
z ∈ {0, 1}N
• SSS brute-force searches
for a satisfying K
• Best possible algorithm
(sample-complexity-wise)
• Infeasible, but benchmark
R∗
SSS(β) ≥
1
ln 2
max
α∈[ln 2,1]
min 2αe−α β
2 − β
, − ln 1 − 2e−α
+ 2e−2α
.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 18 / 24
92. What R∗
SSS(β) looks like
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
Lower bound on R∗
SSS(β)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 19 / 24
93. Separability
SSS may fail if more than one
subset satisfies constraints
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 20 / 24
94. Separability
SSS may fail if more than one
subset satisfies constraints
Question:
Can we enforce
uniqueness in X ?
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 20 / 24
95. Separability
SSS may fail if more than one
subset satisfies constraints
Question:
Can we enforce
uniqueness in X ?
Definition
X is K-separable if, for any K-subsets L, M ⊂ {1, . . . , N}, it is
i∈M
x(i)
=
j∈L
x(j)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 20 / 24
96. Separability
SSS may fail if more than one
subset satisfies constraints
Question:
Can we enforce
uniqueness in X ?
Definition
X is K-separable if, for any K-subsets L, M ⊂ {1, . . . , N}, it is
i∈M
x(i)
=
j∈L
x(j)
Notice that y = K x(i)
.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 20 / 24
98. Almost separability
Building separable X requires
T = Ω(K2
log N) tests
Idea:
What if we re-
lax separability?
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 21 / 24
99. Almost separability
Building separable X requires
T = Ω(K2
log N) tests
Idea:
What if we re-
lax separability?
Definition
X is ε-almost K-separable if there are at most ε N
K K-subsets that break
separability.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 21 / 24
100. Almost separability
Building separable X requires
T = Ω(K2
log N) tests
Idea:
What if we re-
lax separability?
Definition
X is ε-almost K-separable if there are at most ε N
K K-subsets that break
separability.
• Almost-separable matrices exist
• Need T = O(K log N) tests to get one
• A Bernoulli test design is almost-separable w.h.p. (via concentration)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 21 / 24
101. Family-wise upper bound via SSS
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 22 / 24
102. Family-wise upper bound via SSS
Theorem (Aldridge, B., Johnson, 2014)
Consider SSS using TSSS tests. Then,
P(success) → 1 ⇒ TSSS >
(1 − β)e ln 2
β
log2
N
K
,
and, if the necessary condition is violated,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 22 / 24
103. Family-wise upper bound via SSS
Theorem (Aldridge, B., Johnson, 2014)
Consider SSS using TSSS tests. Then,
P(success) → 1 ⇒ TSSS >
(1 − β)e ln 2
β
log2
N
K
,
and, if the necessary condition is violated,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
• “Weak converse”
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 22 / 24
104. Family-wise upper bound via SSS
Theorem (Aldridge, B., Johnson, 2014)
Consider SSS using TSSS tests. Then,
P(success) → 1 ⇒ TSSS >
(1 − β)e ln 2
β
log2
N
K
,
and, if the necessary condition is violated,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
• “Weak converse”
• Applies to all Bernoulli-based algorithms
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 22 / 24
105. Family-wise upper bound via SSS
Theorem (Aldridge, B., Johnson, 2014)
Consider SSS using TSSS tests. Then,
P(success) → 1 ⇒ TSSS >
(1 − β)e ln 2
β
log2
N
K
,
and, if the necessary condition is violated,
TSSS ≤
(1 − β)e ln 2
β
log2
N
K
⇒ P(success) ≤
2
3
.
• “Weak converse”
• Applies to all Bernoulli-based algorithms
• R∗
SSS(β) ≤ min 1, β
(1−β)e ln 2
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 22 / 24
106. Adaptivity gap
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
SSS -based upper bound
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 23 / 24
107. Adaptivity gap
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
SSS -based upper bound
Adaptivity gap
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 23 / 24
108. Adaptivity gap
0.0 0.2 0.4 0.6 0.8 1.0
β
0.0
0.2
0.4
0.6
0.8
1.0
Rate
SSS -based upper bound
Adaptivity gap
Non-adaptive Bernoulli-
based algorithms
can’t achieve R = 1
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 23 / 24
110. The moral
• Group testing
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24
111. The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24
112. The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests
• Rates and capacities
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24
113. The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests
• Rates and capacities
• Granular information
• Capacities bound performance of algorithms
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24
114. The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests
• Rates and capacities
• Granular information
• Capacities bound performance of algorithms
• SSS: bound the performance of any non-adaptive algorithm based on
Bernoulli sampling
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24
115. The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests
• Rates and capacities
• Granular information
• Capacities bound performance of algorithms
• SSS: bound the performance of any non-adaptive algorithm based on
Bernoulli sampling
• DD: order-optimal, rate-optimal (for dense problems)
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24
116. The moral
• Group testing
• Collective detection of sparse properties
• Information-poor tests
• Rates and capacities
• Granular information
• Capacities bound performance of algorithms
• SSS: bound the performance of any non-adaptive algorithm based on
Bernoulli sampling
• DD: order-optimal, rate-optimal (for dense problems)
• Future work: noise models, applications, non-identical GT,
non-independent GT, collateral open questions...
Leo Baldassini Rates and Algorithms for Group Testing 19 December 2014 24 / 24