2. 2
The φ-Heavy Hitters Problem
[Cormode, Muthukrishnan, ACM TODS 2005]
§Tracking φ-heavy hitters in a dynamic multiset S of elements.
• φ-heavy hitter: an element of the universe U = [0..n) whose frequency exceeds φ|S|.
• Challenges: large alphabet, output sensitivity, high-speed operation
Input: a data stream of pairs (x_i, Δ_i) ∈ U × {±1} and real numbers ε, φ in [0, 1).
Task: Maintain frequency information about the elements to support
• QUERY(): return a set R ⊆ U such that R contains (1) every φ-heavy hitter and
(2) no element whose frequency is at most (φ − ε)N, where N = Σ_i Δ_i.
• INSERT(x)/DELETE(x): increment/decrement the frequency N_x of element x.
The (ε-approximate) φ-Heavy Hitters Problem in the turnstile model
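For reference, the problem statement above can be solved exactly by keeping one counter per distinct element. The following is a minimal baseline sketch for checking outputs, not the method of this talk: it uses space proportional to the number of distinct elements, which is what sketch-based structures avoid. The class name `ExactHeavyHitters` and its method names are hypothetical.

```python
from collections import Counter


class ExactHeavyHitters:
    """Exact baseline for the turnstile model (hypothetical helper, not from the talk)."""

    def __init__(self, phi):
        self.phi = phi
        self.freq = Counter()   # N_x for every element x currently present
        self.total = 0          # N = sum of all deltas seen so far

    def update(self, x, delta):
        """Process one stream pair (x, delta) with delta in {+1, -1}."""
        self.freq[x] += delta
        self.total += delta
        if self.freq[x] == 0:
            del self.freq[x]    # keep only elements with non-zero frequency

    def query(self):
        """Return every element whose frequency exceeds phi * N."""
        threshold = self.phi * self.total
        return {x for x, f in self.freq.items() if f > threshold}
```

With ε = 0 this baseline returns exactly the φ-heavy hitters, so it can serve as ground truth when measuring the precision and recall of an approximate structure.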
Model of computation: The standard w-bit word RAM
3. 3
Large Universes in Mobile Networks
The operation time of existing practical methods depends on
log |U| = log n (Large in practice!)
Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]
Hierarchical Count-Min Sketch [Cormode, Muthukrishnan, LATIN 2005]
Examples of universe U                                 log |U| = log n
                                                       IPv4    IPv6
IP addresses                                           32      128
Pairs of IP addresses                                  64      256
Five tuples (source/destination IP/Port + Protocol)    104     296
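The key widths in the table follow from simple arithmetic. The sketch below reproduces them, assuming 16-bit ports and an 8-bit protocol field (standard field sizes, not stated on the slide); the function name `key_bits` is hypothetical.

```python
IP_BITS = {"IPv4": 32, "IPv6": 128}
PORT_BITS = 16       # assumption: TCP/UDP port width
PROTOCOL_BITS = 8    # assumption: protocol number width


def key_bits(version):
    """Return log|U| in bits for the three example universes."""
    ip = IP_BITS[version]
    return {
        "ip": ip,
        "ip_pair": 2 * ip,
        # two addresses + two ports + protocol
        "five_tuple": 2 * ip + 2 * PORT_BITS + PROTOCOL_BITS,
    }
```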
Q. Can we eliminate dependency on log n from operation time?
4. 4
Main results
Key technique: Packed Bidirectional Counter Arrays
Our paper also proposes a "cached candidate technique" for improving CGT for arbitrary updates.
          This study        CGT: Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]
                            (b = 2)                (general b)
Update    O(r) amortized    O(log(n)r)             O(log_b(n)r)
Query     O(r²/ε)           O((log(n)+r)r/ε)       O((b log_b(n)+r)r/ε)
Space     O(log(n)r/ε)      O(log(n)r/ε)           O(b log_b(n)r/ε)
n: size of universe   δ: failure probability   r = log(1/(δφ))   b: any integer in [2..n]
Model of computation: The standard w-bit word RAM
5. 5
Related Work: Packed Counters
Maintaining an array of m = O(w) counters on the w-bit word RAM.
§Textbook solution for a single counter [Mehlhorn & Sanders, 2008]:
• Ops = inc/test or dec/test: O(1) space; O(m) amortized time.
§Nested counters [Grabowski & Fredriksson, IPL 2008]:
• Ops = inc/test: O(m) space; O(1) amortized time
§Trit counters [Bille & Thorup, SODA 2010]
• Ops = dec/reset/test: O(m) space; O(1) amortized time.
§Bidirectional counters [This talk]
• Ops = inc/dec/test: O(m) space; O(1) time for inc/dec (amortized) and test.
• Naïve bidirectional counters: O(m) space; O(m) time for all operations.
"test": ispositive (C[i] > 0), iszero (C[i] = 0), or isnegative (C[i] < 0)
7. 7
CGT: A Practical Data Structure
§Reports all φ-heavy hitters with probability at least 1 – δ for a specified δ.
§Idea: Each hash function h_i randomly partitions U into d = 2/ε subsets.
• A φ-heavy hitter x can be identified from each C[i, h_i(x), 0..m] with probability at least 1/2.
• Setting r = log(1/(δφ)) yields the desired failure probability δ of missing any φ-heavy hitter.
Combinatorial Group Testing [CM, ACM TODS 2005]
1. Three-dimensional counter array: C[1..r, 1..d, 0..m]
2. A set of universal hash functions: h1, ..., hr: U → [1..d]
r = log(1/(δφ))
d = 2/ε
m = 1 + lg n
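The three parameters above determine the shape of the counter array. A minimal sketch of the computation follows, assuming base-2 logarithms and rounding up to integers (the slide does not state either); `cgt_parameters` is a hypothetical name.

```python
import math


def cgt_parameters(n, eps, phi, delta):
    """Return (r, d, m) for universe size n, following r = log(1/(delta*phi)),
    d = 2/eps, m = 1 + lg n. Rounding up is an assumption of this sketch."""
    r = math.ceil(math.log2(1.0 / (delta * phi)))   # number of hash functions
    d = math.ceil(2.0 / eps)                        # buckets per hash function
    m = 1 + math.ceil(math.log2(n))                 # counters per bucket
    return r, d, m
```

For example, `cgt_parameters(2**32, eps=0.125, phi=0.25, delta=0.0625)` gives r = 6, d = 16, and m = 33, so the array holds r · d · m = 3168 counters.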
8. 8
CGT: A Practical Data Structure
CGT reduces both QUERY and UPDATE to three basic operations
on a bidirectional counter array C[1..m]:
INCREMENT(C, x):
  C[i] = C[i] + bit(x, i) for every i in [1..m].
DECREMENT(C, x):
  C[i] = C[i] − bit(x, i) for every i in [1..m].
ISPOSITIVE(C):
  Return z = Σ_i [C[i] > 0] · 2^i,
  where [X] is 1 (resp. 0) if X is true (resp. false).
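The three basic operations can be written down directly with one Python integer per counter. This is a sketch of the naïve O(m)-time-per-operation version only, useful as a reference when testing a faster implementation. Counters and bits are indexed from 0 here rather than from 1, and the class name `NaiveCounterArray` is hypothetical.

```python
class NaiveCounterArray:
    """m bidirectional counters, one Python int each; every operation takes O(m) time."""

    def __init__(self, m):
        self.c = [0] * m

    def increment(self, x):
        # C[i] += bit(x, i) for every i
        for i in range(len(self.c)):
            self.c[i] += (x >> i) & 1

    def decrement(self, x):
        # C[i] -= bit(x, i) for every i
        for i in range(len(self.c)):
            self.c[i] -= (x >> i) & 1

    def ispositive(self):
        # z has bit i set exactly when C[i] > 0
        return sum(1 << i for i, v in enumerate(self.c) if v > 0)
```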
9. 9
CGT: A Practical Data Structure
Combinatorial Group Testing [CM, ACM TODS 2005]
UPDATE(x, Δ): O(log(n)r) time
1. Add Δ to N
2. for i in [1..r] do:
3.   j ← h_i(x)
4.   Add Δ to C[i, j, 0]
5.   if Δ > 0 then: y ← x else: y ← ~x
6.   INCREMENT(C[i, j, 1..m], y)
7.   DECREMENT(C[i, j, 1..m], ~y)
QUERY(): O((log(n)+r)r/ε) time
1. for i in [1..r] do:
2.   for j in [1..d] do:
3.     // Each C[i, j, k] here plays the role of 2C[i, j, k] − C[i, j, 0] in the original CGT
4.     x ← ISPOSITIVE(C[i, j, 1..m])
5.     if min_{i′} C[i′, h_{i′}(x), 0] > φN then:
6.       report x as a φ-heavy hitter
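The UPDATE and QUERY procedures above can be sketched in Python with plain integer counters, which gives the O(log(n)r) update time of the baseline, not the packed version. Several details are assumptions of this sketch rather than part of the slides: the hash family ((a·x + b) mod p) mod d with the Mersenne prime p = 2^89 − 1, base-2 logarithms with rounding up, 0-based indices, and the name `CGTSketch`. Each per-bit counter is signed: it goes up by Δ when the bit of x is 1 and down by Δ when it is 0, so a positive counter means the bucket's majority has that bit set.

```python
import math
import random


class CGTSketch:
    """Unpacked CGT with signed per-bit counters (illustrative sketch only)."""

    def __init__(self, eps, phi, delta, nbits=64, seed=0):
        self.phi, self.m = phi, nbits
        self.r = max(1, math.ceil(math.log2(1.0 / (delta * phi))))
        self.d = math.ceil(2.0 / eps)
        self.p = (1 << 89) - 1                      # prime larger than the universe
        rng = random.Random(seed)
        self.ab = [(rng.randrange(1, self.p), rng.randrange(self.p))
                   for _ in range(self.r)]
        self.n_total = 0                            # N
        self.total = [[0] * self.d for _ in range(self.r)]       # C[i, j, 0]
        self.bits = [[[0] * self.m for _ in range(self.d)]
                     for _ in range(self.r)]                     # C[i, j, 1..m]

    def _h(self, i, x):
        a, b = self.ab[i]
        return ((a * x + b) % self.p) % self.d

    def update(self, x, delta):
        """Process (x, delta) with delta in {+1, -1}."""
        self.n_total += delta
        for i in range(self.r):
            j = self._h(i, x)                       # hash the original x
            self.total[i][j] += delta
            c = self.bits[i][j]
            for k in range(self.m):
                c[k] += delta if (x >> k) & 1 else -delta

    def query(self):
        """Return candidates whose estimated frequency exceeds phi * N."""
        result = set()
        threshold = self.phi * self.n_total
        for i in range(self.r):
            for j in range(self.d):
                c = self.bits[i][j]
                x = sum(1 << k for k in range(self.m) if c[k] > 0)  # ISPOSITIVE
                est = min(self.total[t][self._h(t, x)] for t in range(self.r))
                if est > threshold:
                    result.add(x)
        return result
```

The inner loop over k in `update` and the decoding loop in `query` are the O(m) costs that the question of doing INCREMENT/DECREMENT/ISPOSITIVE in o(m) time is aimed at.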
10. 10
CGT: A Practical Data Structure
Q. Can we implement INCREMENT/DECREMENT/ISPOSITIVE in o(m) time?
11. 11
Packed Bidirectional Counter Arrays
§Basic idea: Exploiting word-level parallelism of the w-bit word RAM
• Redundant binary representation of C[1..m] using digits {0, ±1, ±2}.
• The corresponding k-th digits of C[1..m] are packed into O(1) words.
• The packed k-th digits of C[1..m] are updated in O(1) time, once every 2^k updates.
• INCREMENT(C, x) and DECREMENT(C, x): O(1) amortized time each.
• ISPOSITIVE(C): O(1) time.
using O(m) space (compact!) for m = O(w)
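To illustrate what packing the k-th digits of all counters into one word buys, here is a minimal sketch that uses Python integers as stand-ins for machine words. It is plain two's-complement bit-slicing with ripple carry, a simpler technique swapped in for illustration, and not the redundant-digit scheme of this talk: under mixed increments and decrements the carry can ripple through every plane, whereas the talk's digits {0, ±1, ±2} with scheduled carries are stated to give O(1) amortized time. The class name, the `planes` parameter, and 0-based indexing are assumptions of the sketch.

```python
class BitSlicedCounters:
    """m signed counters stored as bit planes: plane[k] holds bit k of every counter."""

    def __init__(self, m, planes=16):
        self.mask = (1 << m) - 1
        self.plane = [0] * planes        # two's complement, `planes` bits per counter

    def increment(self, x):
        carry = x & self.mask            # counters selected by the 1-bits of x
        for k in range(len(self.plane)):
            if carry == 0:
                break
            # one word operation adds the carry to bit k of all m counters at once
            self.plane[k], carry = self.plane[k] ^ carry, self.plane[k] & carry

    def decrement(self, x):
        borrow = x & self.mask
        for k in range(len(self.plane)):
            if borrow == 0:
                break
            self.plane[k], borrow = self.plane[k] ^ borrow, ~self.plane[k] & borrow

    def ispositive(self):
        nonzero = 0
        for p in self.plane:
            nonzero |= p
        sign = self.plane[-1]            # top plane is the sign bit
        return nonzero & ~sign & self.mask

    def value(self, i):
        """Decode counter i (for testing only)."""
        w = len(self.plane)
        raw = sum(((p >> i) & 1) << k for k, p in enumerate(self.plane))
        return raw - (1 << w) if raw >> (w - 1) else raw
```

Each loop iteration touches all m counters with a constant number of word operations, which is the word-level parallelism the slide refers to; the number of iterations is what a carry schedule has to control.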
12. 12
Packed Bidirectional Counter Arrays
[Figure: layout of C[1], ..., C[i], ..., C[m] in the two representations.]
• Naïve bidirectional counters: m × w = O(w²) bits; O(w) time to access.
• Packed bidirectional counter array (w digits): each digit level takes m × O(1) = O(w) bits; O(1) time to access.
13. 13
Packed Bidirectional Counter Arrays
[Figure: structure of a packed bidirectional counter array. Legend: cells equal to 1, equal to 0, in {0, ±1}, or in {0, ±1, ±2}.]
• Packed redundant binary counters using digits {0, ±1, ±2}.
• Packed orders of magnitude for detecting sign inversion.
• Fixed-schedule carry propagation in O(1) amortized time [GF, IPL 2008][BT, SODA 2010].
[Figure: level(t) plotted against t = 1, ..., 9, with levels 0 to 3.]
level(t) = max{i | t mod 2^i = 0}
The k-th digits are updated once every 2^k updates, and they never overflow.
The t-th update:
1. Propagate carry bits
2. Fix orders of magnitude
Lemma (Packed Bidirectional Counters)
There exists an O(m)-space data structure for representing
an array C[1..m] of m bidirectional counters supporting
§INCREMENT/DECREMENT in O(1) amortized time
§ISPOSITIVE in O(1) time
on the standard w-bit word RAM with w ≥ m.
15. 15
Theorem
§Plugging packed bidirectional counters into CGT, we obtain:
There exists an O(lg(n)r/ε)-space randomized data structure
for solving the ε-approximate φ-heavy hitters problem with
§INSERT/DELETE in O(r) amortized time
§QUERY in O(r²/ε) time with probability at least 1 − δ
on the standard w-bit word RAM with w ≥ lg n. Here, n is
the universe size, δ is a failure probability, and r = lg(1/(δφ)).
16. 16
Experiments: Setup
§Data: 14 datasets of 10 M integers
• Universe: [0, 2^64).
• Zipf distribution of skewness z in { 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0 }.
• Threshold φ (= ε) in { 0.0001, 0.0005, 0.001, 0.005, 0.01}.
§Methods:
• Ours [This work]: Our proposed method with #rounds r = 4.
• CGT(b) [CM, TODS 2005]: Combinatorial Group Testing with branching factor b in { 2, 16 }.
• CMH(b) [CM, LATIN 2005]: Hierarchical Count-Min Sketch with branching factor b in { 2, 16 }.
CGT(b) and CMH(b) were configured as in [Cormode, Hadjieleftheriou, PVLDB 2008].
§Hardware:
• MacBook Pro with Intel® Core™ i7-8559 (2.7GHz) and 16GB main memory.
17. 17
Experiments: Precision
§Ours achieved competitive precision for skewness z ≥ 1.4.
• Ours output more false positives than the other methods for z < 1.4.
• For z < 1.4, ours should have used a larger ε to suppress false positives.
• All methods achieved 100% recall.
[Figure: precision (%) versus skewness z (0.8 to 2.0), one panel per threshold φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }. Methods: Ours, CGT(2) and CGT(16) [CM, TODS 2005], CMH(2) and CMH(16) [CM, LATIN 2005].]
18. 18
Experiments: Update time
§Ours achieved update throughput competitive with CMH(16).
• CMH(16) achieved the best and most stable update throughput.
• CGT(16) depended heavily on φ, even though it has no such dependence in theory.
• CGT(2) and CMH(2) were not competitive.
[Figure: update throughput (updates/msec) versus skewness z (0.8 to 2.0), one panel per threshold φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }. Methods: Ours, CGT(2) and CGT(16) [CM, TODS 2005], CMH(2) and CMH(16) [CM, LATIN 2005].]
Note: Median of 5 measured times is reported
19. 19
Experiments: Query time
§Ours achieved the best query throughput except for φ = 0.0001.
• Note: ε = φ and r = O(1) in our experiments.
• The CGT family (including ours) must examine Θ(1/φ) heavy-hitter candidates.
• The CMH family is output-sensitive: it is fast if the number of heavy hitters is less than 1/φ.
[Figure: query throughput (queries/msec) versus skewness z (0.8 to 2.0), one panel per threshold φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }, each with its own vertical scale. Methods: Ours, CGT(2) and CGT(16) [CM, TODS 2005], CMH(2) and CMH(16) [CM, LATIN 2005].]
20. 20
Conclusion
§The φ-Heavy Hitters Problem in the strict turnstile model.
We improved CGT [CM, ACM TODS 2005] in
• Update time: from O(log(n)r) to amortized O(r)
• Query time: from O((log(n)+r)r/ε) to O(r²/ε)
using the same O(log(n)r/ε) space for a universe of size n and r = log(1/(δφ)).
§Packed Bidirectional Counter Array:
• Extension of [GF, IPL 2008] and [BT, SODA 2010] to bidirectional counters.
• Ops = inc/dec/test: O(1) amortized inc/dec and O(1) test in compact space.
§Future work
• Extension of our method to arbitrary updates.