GPU Acceleration of Set Similarity Joins
Mateus S. H. Cruz, Yusuke Kozawa
Toshiyuki Amagasa, Hiroyuki Kitagawa
September 2, 2015
Introduction
Tools
Proposal
Preprocessing
Signature Matrix
Join
Experiments
Summary
Outline
1 Introduction
2 Tools
3 Proposal
Preprocessing
Signature Matrix
Join
4 Experiments
5 Summary
Processing of Large Data
Sources: Social networks, online stores, sensors
Applications: Recommendation systems, data integration
Problem: Detect similar records
Duplicate elimination
Plagiarism detection
Example: "Tsukuba University" vs. "University of Tsukuba": same entity?
Set Similarity Join
Find similar records
Similarity threshold (δ)
Student
Name | Univ. Name
Bob | Tsukuba Univ.
Mary | Harvard Univ.
John | Harvard Univ.
Anna | Univ. of Berlin

University
Univ. Name | Country
Univ. of Tsukuba | Japan
Harvard Univ. | USA
Univ. of Berlin | Germany

δ = 0.6
Strings can be seen as sets of words (tokens)
University of Tsukuba → X = {University,of,Tsukuba}
Tsukuba University → Y = {Tsukuba,University}
Set similarity metric: Jaccard similarity (JS)
$$JS(X, Y) = \frac{|X \cap Y|}{|X \cup Y|} = \frac{2}{3} \approx 0.67$$
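The Jaccard computation above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `tokenize` and `jaccard` are illustrative, and splitting on whitespace is an assumed tokenization, not necessarily the one used by the authors.

```python
def tokenize(text):
    """Split a string into a set of word tokens (assumes whitespace-separated words)."""
    return set(text.split())


def jaccard(x, y):
    """Jaccard similarity |X ∩ Y| / |X ∪ Y|; two empty sets are treated as 0.0 here."""
    union = x | y
    if not union:
        return 0.0
    return len(x & y) / len(union)


X = tokenize("University of Tsukuba")
Y = tokenize("Tsukuba University")
print(round(jaccard(X, Y), 2))  # 0.67
```

With a threshold of δ = 0.6 this pair would be reported as similar, since 2/3 is above the threshold.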
δ = 0.7 ("Tsukuba Univ." and "Univ. of Tsukuba", with JS = 0.67, no longer match)
Problem: Expensive processing
Objective
Accelerate the processing of set similarity joins
Related Work
Serial similarity joins
Xiao et al., Efficient Similarity Joins for Near Duplicate Detection, TODS 2011
Parallel similarity joins using MapReduce
Vernica et al., Efficient Parallel Set-similarity Joins using MapReduce, SIGMOD 2010
Parallel similarity joins using GPU
Lieberman et al., A Fast Similarity Join Algorithm Using Graphics Processing Units, ICDE 2008
– Normed metric
Böhm et al., Index-supported Similarity Join on Graphics Processors, BTW 2009
– Euclidean distance
Graphics Processing Unit (GPU)
GPGPU: General-purpose Computing on GPUs
[Figure: Architecture of a modern GPU. Streaming multiprocessors SM0 ... SMm share the device memory; each SM has a shared memory and scalar processors SP0 ... SPn, each with its own registers.]
Challenges: Processing of text, limited memory
MinHash [1]
Estimates Jaccard similarity
Apply hash functions to sets and keep the minimum
Similar sets are likely to have the same minimum hash value
Use hash values to create signatures
Parts of signatures: bins
Good coupling with GPU
Efficient storage
Suitable for parallel processing
Li et al., GPU-based Minwise Hashing, WWW 2012
[1] Broder, On the Resemblance and Containment of Documents, Compression and Complexity of Sequences: Proceedings, 1997
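To illustrate the idea of keeping minimum hash values, here is a minimal Python sketch of the classic MinHash estimator with k independent hash functions. Note that this is the textbook variant, not the single-permutation, bin-based variant used in the slides that follow. The affine hash family, the modulus `PRIME`, the seed, and the token ids are all assumptions made for the example.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus (assumed)


def make_hashes(k, seed=0):
    """Draw k random affine hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]


def signature(token_ids, hashes):
    """For each hash function, keep the minimum hash over the (non-empty) set."""
    return [min((a * t + b) % PRIME for t in token_ids) for a, b in hashes]


def estimate(sig_x, sig_y):
    """Fraction of signature positions where the two minima coincide."""
    same = sum(1 for u, v in zip(sig_x, sig_y) if u == v)
    return same / len(sig_x)


hashes = make_hashes(k=256)
x = {0, 1, 2, 3}  # hypothetical token ids
y = {0, 1, 2, 4}  # shares three of five distinct tokens with x
print(estimate(signature(x, hashes), signature(y, hashes)))  # close to 3/5 in expectation
```

More hash functions give a tighter estimate at the cost of longer signatures, which is the same accuracy and performance trade-off that the number of bins controls later in the deck.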
MinHash: Characteristic Matrix
R0: database transactions are crucial
R1: important gains using gpu
S0: database transactions are important
S1: gpu are fast

Characteristic matrix
token | R0 | R1 | S0 | S1
database | 1 | 0 | 1 | 0
transactions | 1 | 0 | 1 | 0
are | 1 | 0 | 1 | 1
crucial | 1 | 0 | 0 | 0
important | 0 | 1 | 1 | 0
gains | 0 | 1 | 0 | 0
using | 0 | 1 | 0 | 0
gpu | 0 | 1 | 0 | 1
fast | 0 | 0 | 0 | 1
MinHash: Signature Matrix
1 Randomly permute rows
2 Save the index of the first 1 for each bin
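Permuted characteristic matrix, split into bins b0, b1, b2
bin | row | token | R0 | R1 | S0 | S1
b0 | [0] | fast | 0 | 0 | 0 | 1
b0 | [1] | important | 0 | 1 | 1 | 0
b0 | [2] | gains | 0 | 1 | 0 | 0
b1 | [3] | database | 1 | 0 | 1 | 0
b1 | [4] | are | 1 | 0 | 1 | 1
b1 | [5] | crucial | 1 | 0 | 0 | 0
b2 | [6] | gpu | 0 | 1 | 0 | 1
b2 | [7] | using | 0 | 1 | 0 | 0
b2 | [8] | transactions | 1 | 0 | 1 | 0

Signature Matrix (* = no 1 in that bin)
 | b0 | b1 | b2
R0 | * | 3 | 8
R1 | 1 | * | 6
S0 | 1 | 3 | 8
S1 | 0 | 4 | 6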
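The two steps above can be sketched in Python using the running example. The token sets and the permuted row order are taken from the slides; the function name `binned_signature`, the use of `None` for "*", and the assumption that the number of bins divides the number of rows are illustrative choices.

```python
# Token sets of the running example (R0, R1, S0, S1).
DOCS = {
    "R0": {"database", "transactions", "are", "crucial"},
    "R1": {"important", "gains", "using", "gpu"},
    "S0": {"database", "transactions", "are", "important"},
    "S1": {"gpu", "are", "fast"},
}
# Permuted row order from the slide: list position = permuted row index.
PERMUTED = ["fast", "important", "gains", "database", "are",
            "crucial", "gpu", "using", "transactions"]
EMPTY = None  # stands for "*": the bin holds no token of the document


def binned_signature(tokens, permuted, num_bins):
    """For each bin of consecutive permuted rows, keep the index of the first 1."""
    bin_size = len(permuted) // num_bins  # assumes num_bins divides the row count
    sig = [EMPTY] * num_bins
    for idx, token in enumerate(permuted):
        b = idx // bin_size
        # Rows are visited in increasing order, so the first hit in a bin is its minimum.
        if token in tokens and sig[b] is EMPTY:
            sig[b] = idx
    return sig


SIGNATURES = {name: binned_signature(toks, PERMUTED, 3) for name, toks in DOCS.items()}
for name, sig in SIGNATURES.items():
    print(name, sig)
```

Only one permutation is needed, and every document is reduced to a fixed-length row of small integers, which is what makes the representation compact.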
MinHash: Estimating the Similarity
b0 b1 b2
R0 * 3 8
R1 1 * 6
S0 1 3 8
S1 0 4 6
$$\mathrm{Sim}(X, Y) = \frac{\text{coinciding bins}}{\text{total bins}}$$
Sim(R0, S0) = 2/3 = 0.67 (Real similarity: 0.60)
Sim(R0, S1) = 0/3 = 0.00 (Real similarity: 0.17)
Sim(R1, S0) = 1/3 = 0.33 (Real similarity: 0.14)
Sim(R1, S1) = 1/3 = 0.33 (Real similarity: 0.17)
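The estimates can be recomputed from the signature matrix and compared with the exact Jaccard similarity of the underlying token sets. The sketch below is illustrative only: the names `estimated_sim` and `real_sim` are hypothetical, `None` stands for "*", and treating an empty bin as never matching is an assumption (no pair in this example has two empty bins in the same position, so the choice does not affect these numbers).

```python
SIG = {  # signature matrix from the slides; None stands for "*"
    "R0": [None, 3, 8],
    "R1": [1, None, 6],
    "S0": [1, 3, 8],
    "S1": [0, 4, 6],
}
DOCS = {  # token sets of the running example
    "R0": {"database", "transactions", "are", "crucial"},
    "R1": {"important", "gains", "using", "gpu"},
    "S0": {"database", "transactions", "are", "important"},
    "S1": {"gpu", "are", "fast"},
}


def estimated_sim(sig_x, sig_y):
    """Coinciding bins / total bins; an empty bin never counts as a match (assumed)."""
    hits = sum(1 for u, v in zip(sig_x, sig_y) if u is not None and u == v)
    return hits / len(sig_x)


def real_sim(x, y):
    """Exact Jaccard similarity of two non-empty token sets."""
    return len(x & y) / len(x | y)


for r in ("R0", "R1"):
    for s in ("S0", "S1"):
        est = estimated_sim(SIG[r], SIG[s])
        real = real_sim(DOCS[r], DOCS[s])
        print(f"Sim({r}, {s}): estimated {est:.2f}, real {real:.2f}")
```

With only three bins the estimate can take just four values (0, 1/3, 2/3, 1), which is why more bins are needed for higher precision.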
Execution Flow
[Figure: Execution flow across CPU and GPU. Input (R, S) → Preprocessing → Characteristic matrix → Signature matrix computation → Signature matrix → Similarity join → Array of similar pairs → Output formatter → Similar pairs (output).]
Preprocessing
Compact characteristic matrix
Based on the Compressed Row Storage (CRS) format
Reduces transferred data
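Characteristic matrix (token indices in brackets)
token | R0 | R1 | S0 | S1
database [0] | 1 | 0 | 1 | 0
transactions [1] | 1 | 0 | 1 | 0
are [2] | 1 | 0 | 1 | 1
crucial [3] | 1 | 0 | 0 | 0
important [4] | 0 | 1 | 1 | 0
gains [5] | 0 | 1 | 0 | 0
using [6] | 0 | 1 | 0 | 0
gpu [7] | 0 | 1 | 0 | 1
fast [8] | 0 | 0 | 0 | 1

Compact characteristic matrix
start (R0, R1, S0, S1, end): 0 4 8 12 15
tok: 0 1 2 3 | 4 5 6 7 | 0 1 2 4 | 2 7 8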
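A compact layout like the one above can be built with a short Python routine. This is a sketch under assumptions: the function names `to_compact` and `tokens_of` are hypothetical, and sorting the token indices within each document is a choice made here to match the arrays shown on the slide.

```python
VOCAB = ["database", "transactions", "are", "crucial", "important",
         "gains", "using", "gpu", "fast"]  # token index = list position
DOCS = [
    "database transactions are crucial",    # R0
    "important gains using gpu",            # R1
    "database transactions are important",  # S0
    "gpu are fast",                         # S1
]


def to_compact(docs, vocab):
    """Flatten the documents into (start, tok): tok[start[d]:start[d+1]] lists document d."""
    index = {token: i for i, token in enumerate(vocab)}
    start, tok = [0], []
    for text in docs:
        ids = sorted({index[word] for word in text.split()})  # distinct token indices
        tok.extend(ids)
        start.append(len(tok))
    return start, tok


def tokens_of(d, start, tok):
    """Token indices of document d, read back from the compact arrays."""
    return tok[start[d]:start[d + 1]]


start, tok = to_compact(DOCS, VOCAB)
print(start)  # [0, 4, 8, 12, 15]
print(tok)    # [0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 4, 2, 7, 8]
```

Only the positions of the 1s are stored, so the 9 × 4 matrix with 36 cells shrinks to 15 token indices plus 5 offsets; on sparser real data the saving in transferred bytes grows accordingly.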
Signature Matrix Computation
From characteristic matrix to signature matrix
Parallelization of MinHash
Store signatures in the shared memory
Access device memory using coalesced access
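[Figure: the token list of each document in the compact characteristic matrix (start = 0 4 8 12 15; tok = 0 1 2 3 4 5 6 7 0 1 2 4 2 7 8) is mapped to its row of the signature matrix: R0 = (*, 3, 8), R1 = (1, *, 6), S0 = (1, 3, 8), S1 = (0, 4, 6).]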
Similarity Join
Parallelize Nested-loop join (NLJ)
Store signatures from R in the shared memory
Read signatures from S using coalesced accesses
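[Figure: signatures R0 = (*, 3, 8), R1 = (1, *, 6), S0 = (1, 3, 8), S1 = (0, 4, 6). Block-level parallelization over the signatures; thread-level parallelization illustrated on the pair R0 (* 3 8) and S0 (1 3 8).]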
Result Output
The result size is initially unknown
Cannot allocate memory beforehand
Write conflicts between blocks
Three-phase scheme for result output [2]
1 - Execute the join and find the number of similar pairs for each block (example counts: 4 2 0 2)
2 - Execute a scan and obtain the initial writing positions for each block (example positions: 0 4 6 6)
3 - Allocate the result array, execute the join and output the similar pairs
[2] He et al., Relational Joins on Graphics Processors, SIGMOD 2008
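The three phases can be sketched sequentially in Python. The function name `three_phase_output`, the `matches_of` callback, and the demo data are hypothetical; on a GPU each block would run phases 1 and 3 concurrently, whereas this sketch simply loops over the blocks.

```python
from itertools import accumulate


def three_phase_output(blocks, matches_of):
    """blocks: one work item per block; matches_of(block) returns its similar pairs."""
    # Phase 1: run the join once, only counting the pairs found by each block.
    counts = [len(list(matches_of(b))) for b in blocks]
    # Phase 2: an exclusive scan of the counts gives each block its first write position.
    offsets = list(accumulate(counts, initial=0))[:-1]
    # Phase 3: allocate the result array, run the join again, write at the offsets.
    result = [None] * sum(counts)
    for b, offset in zip(blocks, offsets):
        for k, pair in enumerate(matches_of(b)):
            result[offset + k] = pair
    return counts, offsets, result


COUNTS = [4, 2, 0, 2]  # hypothetical per-block match counts, as in the slide


def demo_matches(block):
    """Stand-in for the join of one block: yields COUNTS[block] dummy pairs."""
    return [(block, k) for k in range(COUNTS[block])]


counts, offsets, result = three_phase_output(range(4), demo_matches)
print(counts)   # [4, 2, 0, 2]
print(offsets)  # [0, 4, 6, 6]
```

Because every block owns a disjoint range of the result array, no two blocks write to the same position, at the price of executing the join twice.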
Output Formatter
[Figure: execution flow; the output formatter turns the array of similar pairs into the final output of similar pairs.]
Setup
Environment
GPU (CUDA): NVIDIA K20Xm
CPU Serial: Intel Xeon CPU E5-1650
CPU Parallel (OpenMP): 12 threads
Datasets
Images
Abstracts
Transactions
Performance Comparison
Speedups
25 times faster than CPU Parallel
150 times faster than CPU Serial
[Figure: Overall performance (|R| = |S|). Elapsed time (s, log scale) versus |R| from 2^10 to 2^19 for CPU (Serial), CPU (Parallel), and GPU. Panels: (a) Images, (b) Abstracts, (c) Transactions.]
Performance Breakdown
Bottlenecks
CPU: Executing the join
GPU: Reading from disk
Short data transfer time
Execution time in seconds, Abstracts dataset, |R| = |S| = 524,288
Platform | Read from disk | Preprocessing | MinHash | Join | Data transfer | Total
GPU | 201 | 9 | 0.044 | 67 | 0.07 | 282
CPU (Parallel) | 201 | 9 | 0.146 | 585 | 0 | 800
CPU (Serial) | 194 | 9 | 0.927 | 2411 | 0 | 2619
Accuracy Evaluation
Varying number of bins
Trade-off: Accuracy and performance
Depends on the dataset
Abstracts dataset, |R| = |S| = 65,536
Number of Bins | Precision | Recall | Execution Time (s)
1 | 0.0000 | 0.9999 | 25.3
2 | 0.0275 | 0.9999 | 25.4
4 | 0.9733 | 0.9999 | 25.6
8 | 0.9994 | 0.9999 | 25.7
16 | 0.9998 | 1.0000 | 26.1
32 | 1.0000 | 1.0000 | 27.4
64 | 1.0000 | 1.0000 | 29.6
128 | 1.0000 | 1.0000 | 34.4
256 | 1.0000 | 1.0000 | 45.8
384 | 1.0000 | 1.0000 | 77.6
512 | 1.0000 | 1.0000 | 133.6
640 | 1.0000 | 1.0000 | 161.5
Summary
Processing large data
Detect similar items
Efficient scheme for set similarity joins
Jaccard similarity (MinHash)
GPU
High speedups: up to 150 times
Future work
Better join technique for GPUs
Multiple GPUs
Environment
Parameters
MinHash Alg.
Join Alg.
Q&A
Detailed Setup
GCC 4.4.7 -O3
NVCC 6.5 -O3 -use_fast_math
OpenMP 4.0
Component | Specification
CPU | Intel Xeon CPU E5-1650
CPU cores | 6 (12 threads with Hyper-Threading)
CPU clock | 3.50 GHz
Main memory | 32 GB
GPU | NVIDIA Tesla K20Xm
Scalar processors | 2688
Processor clock | 732 MHz
Global memory | 6 GB
Parameters
Not much impact on the performance/accuracy
Threads per block
Similarity threshold
Join selectivity
[Figure: three plots of elapsed time (s) on the Abstracts dataset, |R| = |S| = 131,072. GPU time versus number of threads per block (32 to 1024); GPU time versus similarity threshold (0.2 to 1.0); CPU (Serial), CPU (Parallel), and GPU time versus selectivity (0.01 to 0.5).]
Parallel MinHash Algorithm
Algorithm 1: Parallel MinHash
input : characteristic matrix CM (t × d: t tokens, d documents), number of bins b
output: signature matrix SM (d × b: d documents, b bins)
binSize ← t/b
for i ← 0 to d − 1 in parallel do // executed by blocks
    for j ← 0 to t − 1 in parallel do // executed by threads
        if CM[j][i] = 1 then
            h ← hash(j)
            binIdx ← h/binSize (integer division)
            SM[i][binIdx] ← min(SM[i][binIdx], h)
        end
    end
end
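A sequential Python rendering of the MinHash pseudocode, run on the compact (start, tok) arrays of the running example, might look as follows. Several things here are assumptions: the function name `minhash_signatures`, the use of the slide's row permutation as the hash function (`HASH_OF`), hashing the token index, and `None` as the marker for an empty bin.

```python
# Compact characteristic matrix of the running example (see the Preprocessing slide).
START = [0, 4, 8, 12, 15]
TOK = [0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 4, 2, 7, 8]
# Assumed hash: token index -> row index after the permutation shown earlier
# (database -> 3, transactions -> 8, are -> 4, crucial -> 5, important -> 1,
#  gains -> 2, using -> 7, gpu -> 6, fast -> 0).
HASH_OF = [3, 8, 4, 5, 1, 2, 7, 6, 0]


def minhash_signatures(start, tok, hash_of, num_bins):
    """Sequential stand-in for the parallel loops of the pseudocode."""
    bin_size = len(hash_of) // num_bins  # binSize <- t / b
    num_docs = len(start) - 1
    sm = [[None] * num_bins for _ in range(num_docs)]
    for i in range(num_docs):                    # on the GPU: one block per document
        for p in range(start[i], start[i + 1]):  # on the GPU: one thread per token
            h = hash_of[tok[p]]
            b = h // bin_size
            # Concurrent threads would need an atomic minimum for this update.
            if sm[i][b] is None or h < sm[i][b]:
                sm[i][b] = h
    return sm


SM = minhash_signatures(START, TOK, HASH_OF, num_bins=3)
print(SM)  # [[None, 3, 8], [1, None, 6], [1, 3, 8], [0, 4, 6]]
```

Iterating over the compact token list visits only the 1s of the characteristic matrix, so the inner loop does work proportional to the document length rather than to the vocabulary size.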
Parallel NLJ Algorithm
Algorithm 2: Parallel nested-loop join
input : collections R and S, similarity threshold δ
output: pairs of documents whose similarity is at least δ
foreach r ∈ R in parallel do // executed by blocks
    foreach s ∈ S do
        if Sim(r, s) ≥ δ then
            output(r, s)
        end
    end
end
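As a rough CPU analogue of the nested-loop join pseudocode, the sketch below hands each signature of R to a worker thread, which compares it against every signature of S. The thread pool merely stands in for GPU blocks; the names `sim`, `join_one`, and `parallel_nlj`, the worker count, and the rule that an empty bin (`None`) never matches are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def sim(sig_r, sig_s):
    """Coinciding bins / total bins; empty bins (None) never match (assumed)."""
    hits = sum(1 for u, v in zip(sig_r, sig_s) if u is not None and u == v)
    return hits / len(sig_r)


def join_one(task):
    """Work of one 'block': compare a single r against every s."""
    i, sig_r, s_sigs, delta = task
    return [(i, j) for j, sig_s in enumerate(s_sigs) if sim(sig_r, sig_s) >= delta]


def parallel_nlj(r_sigs, s_sigs, delta, workers=4):
    """Return index pairs (i, j) with Sim(R[i], S[j]) >= delta."""
    tasks = [(i, sig_r, s_sigs, delta) for i, sig_r in enumerate(r_sigs)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_r = list(pool.map(join_one, tasks))  # map preserves the order of R
    return [pair for pairs in per_r for pair in pairs]


R = [[None, 3, 8], [1, None, 6]]  # signatures of R0, R1
S = [[1, 3, 8], [0, 4, 6]]        # signatures of S0, S1
print(parallel_nlj(R, S, delta=0.6))  # [(0, 0)]
```

Each worker returns its own list of pairs, so no shared output buffer is needed here; on the GPU the three-phase output scheme plays that role.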