These slides explain the theta-join algorithms (1-Bucket-Theta and M-Bucket-I) proposed in Okcan and Riedewald, "Processing Theta-Joins using MapReduce" (SIGMOD 2011).
Paper link: http://www.ccs.neu.edu/home/mirek/papers/2011-SIGMOD-ParallelJoins.pdf
2. Processing pipeline at a reducer
Goal: We want to minimize job completion time. Since completion time is a function of both input and output, we need a way to model both the input to and the output from a reducer.
[ Pipeline diagram ] Receive mapper output → sort input by key → read input → run join algorithm → send join output. The first three stages take time as a function of the reducer's input size (the mapper output); the last two take time as a function of its output size (the join output).
3. Theta Join Model
Dataset S            Dataset T
S_id  Value          T_id  Value
1     5              1     5
2     6              2     5
3     6              3     6
4     8              4     8
5     8              5     8
6     10             6     10
Assuming join condition: S.value = T.value
4. Theta Join Model
[ Join Matrix M ]  X = tuple pair satisfying the join condition S.value = T.value

T →   5  5  6  8  8 10
S  5  X  X  .  .  .  .
   6  .  .  X  .  .  .
   6  .  .  X  .  .  .
   8  .  .  .  X  X  .
   8  .  .  .  X  X  .
  10  .  .  .  .  .  X
5. Theta Join Model (Examples)
The same datasets under three join conditions, and the matrix cells they mark:

Join condition: S.value <= T.value
T →   5  5  6  8  8 10
S  5  X  X  X  X  X  X
   6  .  .  X  X  X  X
   6  .  .  X  X  X  X
   8  .  .  .  X  X  X
   8  .  .  .  X  X  X
  10  .  .  .  .  .  X

Join condition: abs(S.value - T.value) < 2
T →   5  5  6  8  8 10
S  5  X  X  X  .  .  .
   6  X  X  X  .  .  .
   6  X  X  X  .  .  .
   8  .  .  .  X  X  .
   8  .  .  .  X  X  .
  10  .  .  .  .  .  X

Join condition: S.value = T.value
T →   5  5  6  8  8 10
S  5  X  X  .  .  .  .
   6  .  .  X  .  .  .
   6  .  .  X  .  .  .
   8  .  .  .  X  X  .
   8  .  .  .  X  X  .
  10  .  .  .  .  .  X
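The matrix itself is easy to materialize for small examples. A minimal sketch (our own Python, not from the paper) that builds the join matrix for the toy datasets and any of the conditions above:

# Build the boolean join matrix M for the toy datasets above, given an
# arbitrary theta predicate. (Illustrative sketch; names are ours.)
S = [5, 6, 6, 8, 8, 10]   # S.value column
T = [5, 5, 6, 8, 8, 10]   # T.value column

def join_matrix(S, T, theta):
    """M[i][j] is True iff the pair (S[i], T[j]) satisfies theta."""
    return [[theta(s, t) for t in T] for s in S]

M_leq  = join_matrix(S, T, lambda s, t: s <= t)
M_band = join_matrix(S, T, lambda s, t: abs(s - t) < 2)
M_eq   = join_matrix(S, T, lambda s, t: s == t)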
8. Goal Revisited
• We want to minimize job completion time.
• We need to assign every true cell of the join matrix to exactly one reducer (i.e., find a mapping from M to the set of reducers R).
• Goal: Find a mapping from the join matrix M to reducers that minimizes job completion time.
16. Mappings from join matrix to reducers
• There are many possible mappings from the join matrix to reducers.
• We will see which mapping is (close to) optimal in different cases, and algorithms to compute such a mapping.
17. Lemma
We will use the following lemma repeatedly to show how close to optimal each mapping is.

[ LEMMA 1 ] A reducer that is assigned c cells of the join matrix M will receive at least $2\sqrt{c}$ input tuples.

[ Proof ] Consider a reducer r that receives m records from T and n records from S. The c cells assigned to r must lie inside the m x n sub-rectangle spanned by those records, so $mn \ge c$. By the AM-GM inequality, the input size satisfies $m + n \ge 2\sqrt{mn} \ge 2\sqrt{c}$.
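Lemma 1 can also be sanity-checked by brute force. A tiny check of our own (not from the paper): for every rectangle shape (m, n) large enough to hold c cells, the input size m + n meets the bound.

# For every m x n rectangle with m * n >= c, verify m + n >= 2 * sqrt(c).
from math import sqrt

for c in range(1, 101):
    for m in range(1, c + 1):
        for n in range(1, c + 1):
            if m * n >= c:
                assert m + n >= 2 * sqrt(c) - 1e-9  # tolerance for float error
print("Lemma 1 bound holds for all c <= 100")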
19. Cross Product
• We first consider the cross product, where every pair of tuples from the two datasets satisfies the join condition. The join matrix is completely filled:

Join condition: S × T (cross product)
T →   5  5  6  8  8 10
S  5  X  X  X  X  X  X
   6  X  X  X  X  X  X
   6  X  X  X  X  X  X
   8  X  X  X  X  X  X
   8  X  X  X  X  X  X
  10  X  X  X  X  X  X
21. Cross Product
• Since all entries of the join matrix are true, the maximum-reducer-output (MRO) must satisfy $\mathrm{MRO} \ge |S||T|/r$. (Otherwise, some output tuples would not be mapped to any reducer.)
• Along with Lemma 1, this gives a lower bound for the maximum-reducer-input (MRI): $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$.

[ LEMMA 1 ] A reducer that is assigned c cells of the join matrix M will receive at least $2\sqrt{c}$ input tuples.
23. Cross Product
• We will refer back to these two lower bounds frequently to assess the quality of join mappings:
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$
24. Cross Product
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$

Case 1: Suppose |S| and |T| are multiples of $\sqrt{|S||T|/r}$, namely $|S| = c_S\sqrt{|S||T|/r}$ and $|T| = c_T\sqrt{|S||T|/r}$. Then partitioning the join matrix into squares of side $\sqrt{|S||T|/r}$ is an optimal mapping.

[ Proof ] Each region mapped to a reducer has output size $|S||T|/r$ and input size $2\sqrt{|S||T|/r}$, exactly matching the lower bounds above.
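A minimal sketch of the Case 1 square cover (our own Python, not the paper's code):

# Partition an |S| x |T| join matrix into squares of side sqrt(|S||T|/r)
# and report each region's input and output size. Assumes Case 1: the
# side length is an integer that divides both |S| and |T| exactly.
from math import isqrt

def square_cover(size_S, size_T, r):
    side = isqrt(size_S * size_T // r)
    regions = []
    for row in range(0, size_S, side):
        for col in range(0, size_T, side):
            regions.append({
                "rows": (row, row + side),   # which S-tuples this reducer needs
                "cols": (col, col + side),   # which T-tuples this reducer needs
                "input": 2 * side,           # side rows + side columns
                "output": side * side,       # every cell is true (cross product)
            })
    return regions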
26. Cross Product
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$

Suppose |S| = |T| = 6 and r = 9. The side length is $\sqrt{36/9} = 2$, so the 6 x 6 matrix is partitioned into nine 2 x 2 squares, one per reducer.

MRO = 4 = $|S||T|/r$
MRI = 4 = $2\sqrt{|S||T|/r}$

Both lower bounds are met, so the mapping is optimal.
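Running the square_cover sketch from the Case 1 slide on this instance reproduces these numbers (square_cover is the hypothetical helper defined in our earlier sketch):

# |S| = |T| = 6, r = 9: nine 2 x 2 regions, each with input 4 and output 4.
regions = square_cover(6, 6, 9)
assert len(regions) == 9
assert max(reg["input"] for reg in regions) == 4    # MRI
assert max(reg["output"] for reg in regions) == 4   # MRO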
30. Cross Product
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$

Case 2: Suppose the cardinality of one dataset is significantly greater than that of the other (WLOG, assume $|S| < |T|/r$). Then covering the matrix with r rectangles of size $|S| \times |T|/r$ is the optimal mapping.
(e.g., |S| = 3, |T| = 20, r = 5: each reducer gets a 3 x 4 rectangle, with output 12 = |S||T|/r and input 3 + 4 = 7.)
32. Cross Product
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$

Case 3: The remaining case, where $|T|/r \le |S| \le |T|$.
Let $c_S = \lfloor |S| / \sqrt{|S||T|/r} \rfloor$ and $c_T = \lfloor |T| / \sqrt{|S||T|/r} \rfloor$.
Then covering M with $\sqrt{|S||T|/r} \times \sqrt{|S||T|/r}$ squares is a mapping worse than an optimal mapping by a factor no greater than 4.
33. Cross Product
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$

If |S| and/or |T| is not a multiple of $\sqrt{|S||T|/r}$, stretch each square by a factor of $(1 + 1/c_S)$ vertically and/or $(1 + 1/c_T)$ horizontally so that the squares still cover M. Given $|T|/r \le |S| \le |T|$, we have $c_S \ge 1$ and $c_T \ge 1$, so each side of a region is at most $(1 + 1/c_S)\sqrt{|S||T|/r} \le 2\sqrt{|S||T|/r}$ (and likewise with $c_T$).
34. Cross Product
Properties: $\mathrm{MRO} \ge |S||T|/r$ and $\mathrm{MRI} \ge 2\sqrt{|S||T|/r}$

Hence $\mathrm{MRO} \le 4|S||T|/r$ and $\mathrm{MRI} \le 4\sqrt{|S||T|/r}$.
Comparing these with the lower bounds above, the MRO and MRI produced by this mapping are at most 4 times (twice, for MRI) their lower bounds.
35. Implementation
• Now that we know how to (nearly) optimally partition the join matrix, let's run it!
• However, when a mapper is given a record (from either S or T), it does NOT have enough information to know exactly where in the matrix (which row/column) the record belongs.
• We could run another preprocessing pass to gather that information, but this can be avoided with a randomized algorithm!
38. Mapping & Randomized Algorithm
Algorithm 1: Map (Theta-Join)
Input: input tuple x ∈ S ∪ T
1: if x ∈ S then
2:   matrixRow = random(1, |S|)
3:   for all regionID in lookup.getRegions(matrixRow) do
4:     Output(regionID, (x, "S"))
5: else
6:   matrixCol = random(1, |T|)
7:   for all regionID in lookup.getRegions(matrixCol) do
8:     Output(regionID, (x, "T"))

In words, for a record x ∈ S (WLOG; the T case is symmetric):
1. Given the record x,
2. pick a matrix row uniformly at random,
3. find all regions intersecting that row, and output (regionID, (x, "S")) for each.
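A minimal runnable sketch of this mapper (our own Python, not the paper's Hadoop implementation; the region list stands in for the lookup table shipped to mappers):

# Emit (regionID, (x, origin)) pairs for one input tuple. `regions` is a
# hypothetical list of (region_id, (row_lo, row_hi), (col_lo, col_hi))
# entries with 1-based, half-open ranges from the matrix-to-reducer mapping.
import random

def map_tuple(x, origin, size_S, size_T, regions):
    if origin == "S":
        row = random.randint(1, size_S)          # pick a matrix row at random
        return [(rid, (x, "S"))
                for rid, (rlo, rhi), _ in regions if rlo <= row < rhi]
    else:
        col = random.randint(1, size_T)          # pick a matrix column at random
        return [(rid, (x, "T"))
                for rid, _, (clo, chi) in regions if clo <= col < chi]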
41. Cross Product… NOT!
• We have verified that the 1-Bucket-Theta (1BT) algorithm is close to optimal when the join condition is a cross product.
• How does 1-Bucket-Theta perform when the join condition is NOT a cross product?
• We will compare the quality of 1-Bucket-Theta against ANY join algorithm.
44. 1BT vs ANY join algorithm
Let $1 \ge x > 0$. Consider any matrix-to-reducer mapping that has to cover at least $x|S||T|$ of the $|S||T|$ cells of the join matrix. Some reducer must then be assigned at least $x|S||T|/r$ cells, so by Lemma 1 it has $\mathrm{MRI} \ge 2\sqrt{x|S||T|/r}$.

[ LEMMA 1 ] A reducer that is assigned c cells of the join matrix M will receive at least $2\sqrt{c}$ input tuples.

As we have seen, 1BT guarantees $\mathrm{MRI} \le 4\sqrt{|S||T|/r}$. Hence,
$\frac{\mathrm{MRI}_{1BT}}{\mathrm{MRI}_{AnyJoinAlg}} \le \frac{4\sqrt{|S||T|/r}}{2\sqrt{x|S||T|/r}} = \frac{2}{\sqrt{x}}$
46. 1BT vs ANY join algorithm
When $x = 0.5$, the ratio is $2/\sqrt{0.5} \approx 2.83 < 3$.
Hence, compared to ANY join algorithm that assigns more than 50% of the matrix cells to reducers, the MRI of 1BT is at most 3 times the MRI of that algorithm.
48. M-Bucket-I
• From the previous slide, we see that instead of covering the entire matrix, mapping only smaller regions of it would yield a better MRI.
• Ideally, we would map only the cells satisfying the join condition, but that cannot be done without knowing the input statistics and/or the join condition.
• M-Bucket-I exploits input statistics to improve over the 1-Bucket-Theta algorithm.
51. M-Bucket-I
[ Step 1 ] Approximate Equi-Depth Histograms (a sketch follows below)
1) With probability n/|S|, sample approximately n records from S
2) Build k-quantiles (k buckets) from the sample, where k < n
3) Iterate through S and count the number of records in each bucket
4) Do the same for T, and build the join matrix accordingly
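A minimal sketch of this step (our own Python, not the paper's implementation; assumes the sample is non-empty):

# Sample the dataset, take sample quantiles as bucket boundaries, then
# count how many records of the full dataset fall into each bucket.
import bisect
import random

def equi_depth_histogram(records, n_sample, k):
    p = n_sample / len(records)
    sample = sorted(x for x in records if random.random() < p)
    # bucket boundaries = k-quantiles of the sample (k - 1 cut points)
    boundaries = [sample[(i * len(sample)) // k] for i in range(1, k)]
    counts = [0] * k
    for x in records:                    # one full pass over the data
        counts[bisect.bisect_right(boundaries, x)] += 1
    return boundaries, counts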
57. M-Bucket-I
[ Step 1 ] Approximate Equi-Depth Histograms
[ Figure: a 10 x 10 join matrix (S on the columns, T on the rows) for the join condition S.value = T.value. The histogram bucket boundaries fall after positions 2, 3, and 9 on the S axis and 1, 5, and 8 on the T axis, carving the matrix into buckets. ]
59. M-Bucket-I
[ Step 1 ] Approximate Equi-Depth Histograms
Only the matrix buckets whose S and T value ranges overlap can contain matching tuples. We now have candidate cells. How do we map these cells to reducers?
60. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
Algorithm: M-Bucket-I
Input: maxInput, r, M
1: row = 0
2: while row < M.noOfRows do
3:   (row, r) = CoverSubMatrix(row, maxInput, r, M)
4:   if r < 0 then
5:     return false
6: return true
61. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
Algorithm: CoverSubMatrix
Input: row_s, maxInput, r, M
1: maxScore = -1, rUsed = 0
2: for i = 1 to maxInput - 1 do
3:   R_i = CoverRows(row_s, row_s + i, maxInput, M)
4:   area = totalCandidateArea(row_s, row_s + i, M)
5:   score = area / R_i.size
6:   if score >= maxScore then
7:     bestRow = row_s + i
8:     rUsed = R_i.size
9: r = r - rUsed
10: return (bestRow + 1, r)
62. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
Algorithm: CoverRows
Input: row_f, row_l, maxInput, M
1: Regions = ∅; r = newRegion()
2: for all c_i in M.getColumns do
3:   if r.cap < c_i.candidateInputCosts then
4:     Regions = Regions ∪ r
5:     r = newRegion()
6:   r.Cells = r.Cells ∪ c_i.candidateCells
7: return Regions
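A simplified, runnable sketch of the row-covering idea (our own Python; it replaces the per-column candidateInputCosts bookkeeping with a plain "h rows + w columns" input-cost model):

# Pack the candidate cells of rows [row_f, row_l) into regions, scanning
# columns left to right and closing a region whenever adding the next
# column would exceed max_input. Assumes max_input > (row_l - row_f) + 1.
def cover_rows(M, row_f, row_l, max_input):
    h = row_l - row_f
    regions, cur_cols = [], []
    for col in range(len(M[0])):
        if not any(M[row][col] for row in range(row_f, row_l)):
            continue                       # no candidate cells in this column
        if h + len(cur_cols) + 1 > max_input and cur_cols:
            regions.append(cur_cols)       # region is full; start a new one
            cur_cols = []
        cur_cols.append(col)
    if cur_cols:
        regions.append(cur_cols)
    return regions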
68. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
Run the algorithm with r = 6 and maxInput = 5. Scores (candidate area / number of regions) for covers ending at each row:
cover rows 0..0: score = 4/1 = 4
cover rows 0..1: score = 13/3 = 4.33
cover rows 0..2: score = 22/4 = 5.5
cover rows 0..3: score = 31/7 = 4.43
We choose the cover with the highest score, here the one ending at row 2.
[ Figure: the four candidate covers (1)-(4) ]
69. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
The next pass starts at row 3 (score = 3), and so on until all candidate rows are covered.
[ Figure: the cover after the second pass ]
70. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
Run the algorithm with r = 6 and maxInput = 5. Final mapping!
[ Figure: the final cover uses 13 regions, labeled (1) through (13) ]
71. M-Bucket-I
[ Step 2 ] M-Bucket-I Algorithm
However, with maxInput = 5 we have mapped the candidate cells to 13 regions, i.e., more than r = 6 reducers.
We binary-search on maxInput until we find a mapping that uses at most r reducers.
[ Figure: the 13-region cover from the previous slide ]
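A minimal sketch of that outer search (our own Python; `cover` is a hypothetical helper standing in for the M-Bucket-I pass above, and we assume its region count is non-increasing in maxInput):

# Binary-search the smallest maxInput whose cover fits into r regions.
def find_max_input(M, r, cover):
    lo, hi = 2, len(M) + len(M[0])       # one region never needs more than |S| + |T| inputs
    while lo < hi:
        mid = (lo + hi) // 2
        if len(cover(M, mid)) <= r:      # fits: try a smaller input budget
            hi = mid
        else:                            # too many regions: raise the budget
            lo = mid + 1
    return lo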