SlideShare a Scribd company logo
1 of 59
Graphs – rich and powerful
data representation
-
-
-
-
-
Social network
Human Disease Network
[Barabasi 2007]
Food Web [2007]
Terrorist Network
[Krebs 2002]Internet (AS) [2005]
Gene Regulatory Network
[Decourty 2008]
Protein Interactions
[breast cancer]
Political blogs
Power grid
Graphlets
H13 H14 H15 H16 H17H6H2
H7 H8 H9 H10 H11 H12H4H1 H3
H5
1
0
0.5
0.83
0.67
0.33
0.17
Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002]
The Structure and Function of Complex Networks [Newman – Siam Review 2003]
Small induced subgraphs
Graphlets
H13 H14 H15 H16 H17H6H2
H7 H8 H9 H10 H11 H12H4H1 H3
H5
1
0
0.5
0.83
0.67
0.33
0.17
Connected
Disconnected
Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002]
The Structure and Function of Complex Networks [Newman – Siam Review 2003]
Small induced subgraphs
Graphlets
k-graphlets = family of graphlets of size k
2-graphlets 3-graphlets 4-graphlets
H13 H14 H15 H16 H17H6H2
H7 H8 H9 H10 H11 H12H4H1 H3
H5
1
0
0.5
0.83
0.67
0.33
0.17
Connected
Disconnected
Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002]
The Structure and Function of Complex Networks [Newman – Siam Review 2003]
Small induced subgraphs
Graphlets
k-graphlets = family of graphlets of size k
motifs = frequently occurring subgraphs
2-graphlets 3-graphlets 4-graphlets
H13 H14 H15 H16 H17H6H2
H7 H8 H9 H10 H11 H12H4H1 H3
H5
1
0
0.5
0.83
0.67
0.33
0.17
Connected
Disconnected
Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002]
The Structure and Function of Complex Networks [Newman – Siam Review 2003]
Small induced subgraphs
Graphlets
k-graphlets = family of graphlets of size k
motifs = frequently occurring subgraphs
Applied to food web, genetic, neural, web, and other networks
Found distinct graphlets in each case
2-graphlets 3-graphlets 4-graphlets
H13 H14 H15 H16 H17H6H2
H7 H8 H9 H10 H11 H12H4H1 H3
H5
1
0
0.5
0.83
0.67
0.33
0.17
Connected
Disconnected
Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002]
The Structure and Function of Complex Networks [Newman – Siam Review 2003]
Small induced subgraphs
• Biological Networks
⎻ network alignment, protein function prediction
[Pržulj 2007][Milenković-Pržulj 2008] [Hulovatyy-Solava-Milenković 2014]
[Shervashidze et al. 2009][Vishwanathan et al. 2010]
• Social Networks
⎻ Triad analysis, role discovery, community detection
[Granovetter 1983][Holland-Leinhardt 1976][Rossi-Ahmed 2015]
[Ahmed et al. 2015][Xie-Kelley-Szymanski 2013]
• Internet AS [Feldman et al. 2008]
• Spam Detection
[Becchetti et al. 2008][Ahmed et al. 2016]
Applications of Graphlets
Useful for various machine learning tasks
e.g., Anomaly detection, Role Discovery, Relational Learning, Clustering etc.
Useful for a variety of ML tasks
• Graph-based anomaly detection
⎻ Unusual/malicious behavior detection
⎻ Emerging event and threat identification, …
• Graph-based semi-supervised learning, classification, …
• Link prediction and relationship strength estimation
• Graph similarity queries
⎻ Find similar nodes, edges, or graphs
• Subgraph detection and matching
Applications:
Higher-order network analysis and
modeling
Higher-order network structures
• Visualization – “spotting anomalies” [Ahmed et al.
ICDM 2014]
• Finding large cliques, stars, and other larger
network structures [Ahmed et al. KAIS 2015]
• Spectral clustering [Jure et al. Science 2016]
• Role discovery [Ahmed et al. 2016]
...
How
CPU/GP
Us
compare
CPU GPU
Large memory Memory is very limited
Few fast/powerful processing units Thousands of smaller processing units
Handles unbalanced jobs better Performs best with “balanced” workloads
Optimized for general computations
Optimized for simple repetitive calculations
at a very fast rate.
How
CPU/GP
Us
compare
CPU GPU
Large memory Memory is very limited
Few fast/powerful processing units Thousands of smaller processing units
Handles unbalanced jobs better Performs best with “balanced” workloads
Optimized for general computations
Optimized for simple repetitive calculations
at a very fast rate.
Combine advantages of both
INPUT: a large graph G=(V,E), set of graphlets 𝓗
PROBLEM: Find the number of embeddings
(appearances) of each graphlet 𝐻 𝑘 ∈ 𝓗 in G
Problem: global graphlet counting
(macro-level)
INPUT: a large graph G=(V,E), set of graphlets 𝓗
PROBLEM: Find the number of embeddings
(appearances) of each graphlet 𝐻 𝑘 ∈ 𝓗 in G
Problem: global graphlet counting
(macro-level)
Given an input graph G
- How many triangles in G?
- How many cliques of size 4-nodes in G?
- How many cycles of size 4-nodes in G?
INPUT: a large graph G=(V,E), set of graphlets 𝓗
PROBLEM: Find the number of embeddings
(appearances) of each graphlet 𝐻 𝑘 ∈ 𝓗 in G
Problem: global graphlet counting
(macro-level)
Given an input graph G
- How many triangles in G?
- How many cliques of size 4-nodes in G?
- How many cycles of size 4-nodes in G?
 Many applications require counting all k-vertex graphlets
 Recent research work
- Exact/approximation of global counts [Rahman et al. TKDE14] [Jha et al. WWW15]
- Scalable for massive graphs (billions of nodes/edges) ] [Ahmed et al. ICDM15,KAIS16]
INPUT: a large graph G=(V,E), set of graphlets ℋ
PROBLEM: Find the number of occurrences that
edge i is contained within 𝐻 𝑘, for all k = 1, … , |ℋ|
Role discovery, Relational Learning, Multi-label Classification
Problem: local graphlet counting
(micro-level)
Current work
• Enumerate all possible graphlets
- Exhaustive enumeration is too expensive
• Count graphlets for each node
- Expensive for large k [Shervashidze et al. – AISTAT 2009]
Not practical – scales only for graphs with few hundred/thousand nodes/edges
[Hočevar et al. – Bioinfo. 13]
Sequential
Current work
• Enumerate all possible graphlets
- Exhaustive enumeration is too expensive
• Count graphlets for each node
- Expensive for large k [Shervashidze et al. – AISTAT 2009]
Not practical – scales only for graphs with few hundred/thousand nodes/edges
[Hočevar et al. – Bioinfo. 13]
Sequential
Parallel
• Edge-centric graphlet counting (PGD)
⎻ Multi-core CPUs, large graphs
[Ahmed et al. ICDM 14, KAIS 15]
Current work
• Enumerate all possible graphlets
- Exhaustive enumeration is too expensive
• Count graphlets for each node
- Expensive for large k [Shervashidze et al. – AISTAT 2009]
Not practical – scales only for graphs with few hundred/thousand nodes/edges
[Hočevar et al. – Bioinfo. 13]
Sequential
Parallel
• Edge-centric graphlet counting (PGD)
⎻ Multi-core CPUs, large graphs
• Node-centric graphlet counting,
⎻ Single GPU, Handles only tiny graphs (ORCA-GPU)
[Ahmed et al. ICDM 14, KAIS 15]
[Milinković et al.]
Our approach
Hybrid parallel graphlet counting framework that
leverages all available CPUs and GPUs
Parallel Graphlet
Counting Framework
Single GPU
methods
Multi-GPU
methods
Hybrid CPU-GPU
methods
Algorithm classes
Our approach
Hybrid parallel graphlet
counting framework that
leverages all available
CPUs & GPUs
Hybrid Parallel
Graphlet Counting
Framework
Multi-GPU
methods
Hybrid CPU-GPU
methods
Algorithm classes
Single GPU
methods
Other key advantages:
• Edge-centric parallelization
⎻ Improved load balancing & lock-free
• Global and local graphlet counts
• Connected and disconnected graphlets
• Fine-grained parallelization
• Space-efficient
⫶
e
Overview of our approach
uv
SuSv
… T
uv
…
Ec
EDGE
Overview
nodes completing a triangle with edge (v, u)T =
nodes that form a 2-star with v (u)Sv (Su) =
Edge-centric, Parallel, Fast,
Space-efficient Framework
Our Approach –
(Edge-centric, parallel, space-efficient)
Searching Edge
Neighborhoods
For each edge
Find the triangles
Step 1
Our Approach –
(Edge-centric, parallel, space-efficient)
Searching Edge
Neighborhoods
For each edge
Find the triangles
Count a few
k-graphlets
For each edge,
count only:
k-cliques
k-cycles
tailed-triangles
Step 1 Step 2
Our Approach –
(Edge-centric, parallel, space-efficient)
Searching Edge
Neighborhoods
For each edge
Find the triangles
Count a few
k-graphlets
For each edge,
count only:
Count all other
graphlets
For each edge, use
combinatorial
relationships
to derive counts
of other graphlets
in constant time o(1)
k-cliques
k-cycles
tailed-triangles
Step 1 Step 2 Step 3
Our Approach – (Edge-centric, parallel, space-efficient)
Searching Edge
Neighborhoods
For each edge
Find the triangles
Count a few
k-graphlets
For each edge,
count only:
Count all other
graphlets
For each edge, use
combinatorial
relationships
to derive counts
of other graphlets
in constant time o(1)
k-cliques
k-cycles
tailed-triangles
Step 1 Step 2 Step 3
Step 4 Merge all counts
neighborhood runtimes (CPU)
Key Observations
Neighborhood runtimes
are power-lawed
The distribution of
graphlet runtimes for
edge neighborhoods
obey a power-law.
Most edge neighborhoods are fast with
runtimes that are approximately equal.
Key Observations
The distribution of
graphlet runtimes for
edge neighborhoods
obey a power-law.
Neighborhood runtimes
are power-lawed
HOWEVER, a handful of neighborhoods
are hard and take significantly longer.
Most edge neighborhoods are fast with
runtimes that are approximately equal.
Key Observations
The distribution of
graphlet runtimes for
edge neighborhoods
obey a power-law.
Neighborhood runtimes
are power-lawed
HOWEVER, a handful of neighborhoods
are hard and take significantly longer.
Most edge neighborhoods are fast with
runtimes that are approximately equal.
Key Observations
The distribution of
graphlet runtimes for
edge neighborhoods
obey a power-law.
Neighborhood runtimes
are power-lawed
QUESTION:
What is the “best” way to
partition neighborhoods
among CPUs and GPUs?
• “hardness” proxy 
edge deg., vol., ...
Our approach
• Order edges by “hardness” and partition into 3 sets:
Γ e1 Γ e 𝑀Γ e𝑗Γ e 𝑘
⋯ ⋯ ⋯
Πcpu Πgpu
Our approach
• Order edges by “hardness” and partition into 3 sets:
• Compute induced subgraphs centered at each edge
• CPU Workers: use hash table for o(1) lookups, O(N)
• GPU Workers: use binary search for o(log d) lookups
• When finished, dequeue next b edges:
• CPU: get b edges from FRONT of Πunproc
• GPU: get b edges from BACK of Πunproc
Preprocessing steps
Three simple and efficient preprocessing steps:
1) Sort vertices from smallest to largest degree 𝑓(∙) and
relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁
Preprocessing steps
Three simple and efficient preprocessing steps:
1) Sort vertices from smallest to largest degree 𝑓(∙) and
relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁
2) For each Γ v𝑖 , ∀𝑖 = 1, … , N, order the set of neighbors
Γ v𝑖 = … , w𝑗, … , w 𝑘 , … from smallest to largest deg.
Preprocessing steps
Three simple and efficient preprocessing steps:
1) Sort vertices from smallest to largest degree 𝑓(∙) and
relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁
2) For each Γ v𝑖 , ∀𝑖 = 1, … , N, order the set of neighbors
Γ v𝑖 = … , w𝑗, … , w 𝑘 , … from smallest to largest deg.
3) Given edge (v, u) ∈ E, ensure that 𝑓 v ≥ 𝑓 u
⎻ hence, v is always the vertex with largest degree, dv ≥ du
Preprocessing steps
• All of these steps are not required, but significantly improve
• Each step is extremely fast and lends itself to easy parallelization
Three simple and efficient preprocessing steps:
1) Sort vertices from smallest to largest degree 𝑓(∙) and
relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁
2) For each Γ v𝑖 , ∀𝑖 = 1, … , N, order the set of neighbors
Γ v𝑖 = … , w𝑗, … , w 𝑘 , … from smallest to largest deg.
3) Given edge (v, u) ∈ E, ensure that 𝑓 v ≥ 𝑓 u
⎻ hence, v is always the vertex with largest degree, dv ≥ du
Fine Granularity & Work Stealing
For a single edge (v, u) ∈ E,
I. Compute the sets 𝐓 and 𝐒 𝐮
II. Find the total 4-cliques using T
III. Find the total 4-cycles using 𝐒 𝐮
NOTE: (II) and (III) are independent  parallelize
Unrestricted counts
3-graphlets
4-graphlets
De = N −(|Sv|+ |Su| + |T|) − 2
Global counts
4-graphlets
3-graphlets
Time Complexity
K = number of edges
Δ = max degree
Tmax = max number of triangles incident to an edge in G
Smax = max number of 2-stars incident to an edge in G
Experiments
Connected 4-graphlet frequencies for a variety of the real-
world networks investigated from different domains.
Facebook networks
Social networks
Interaction networks
Network type
Collaboration networks
Brain networks
Web graphs
Technological/IP networks
Dense hard benchmark graphs
Validating edge
partitioning
0 2 4 6
x 10
4
0
0.05
0.1
0.15
0.2
Edge neighborhoods
Time(ms)
CPU
GPU
0 2 4 6
x 10
4
0
0.05
0.1
0.15
0.2
Edge neighborhoods
Time(ms)
• Edges partitioned by
“hardness”
• GPUs assigned sparser
neighborhoods
• Assigns edge neighborhoods
to “best” processor type
• Importance of initial
ordering
GPU workers assigned easy & balanced edge
neighborhoods (approx. equal runtimes)
CPU workers assigned difficult
unbalanced/skewed neighborhoods
Experiments: Improvement
Runtime improvement
over state-of-the-art
GPU: Uses a single multi-core GPU
Multi-GPU: Uses all available GPUs
Hybrid: Leverages all multi-core CPUs & GPUs
2 Intel Xeon CPUs (E5-2687) –
• 8 cores (3.10Ghz)
8 Titan Black NVIDIA GPUs –
• 2880 cores (889 Mhz), ~6GB
Experiments: Improvement
Runtime improvement
over state-of-the-art
Improvement:
significant at α = 0.01
GPU: Uses a single multi-core GPU
Multi-GPU: Uses all available GPUs
Hybrid: Leverages all multi-core CPUs & GPUs
Experiments: Improvement
Runtime improvement
over state-of-the-art
Improvement:
significant at α = 0.01
MEAN 8x 40x 126x
GPU: Uses a single multi-core GPU
Multi-GPU: Uses all available GPUs
Hybrid: Leverages all multi-core CPUs & GPUs
Comparing ORCA-GPU methods
Many problems with Orca-GPU:
• No “effective parallelization”, many parts dependent
• Requires synchronization throughout, locks
• No fine-grained parallelization
• Significant improvement
over Orca-GPU (at 𝛼 =
0.01)
Orca-GPU runtime /
runtime of proposed
method
ImprovementoverOrca-GPU
Varying the edge ordering
Ordering strategy significantly impacts performance
Space-efficient & comm.
avoidance
Average memory (MB) per GPU for three networks.
Applications
Ranking/spotting Large Cliques
via Graphlets
Ranking/spotting Large Cliques via
Graphlets
Ranking/spotting Large Cliques via
Graphlets
Ranking/spotting Large Cliques
via Graphlets
Ranking/spotting Large Stars via
Graphlets
Ranking/spotting Large Stars via
Graphlets
Framework & Algorithms
• Introduced hybrid graphlet counting approach that leverages all
available CPUs & GPUs
• First hybrid CPU-GPU approach for graphlet counting
• On average 126x faster than current methods
- Edge-centric computations (only requires access to edge neighborhood)
• Time and space-efficient
Applications
• Visual analytics and real-time graphlet mining
Summary
Research generously supported by:
Data: http://networkrepository.com

More Related Content

Similar to Leveraging Multiple GPUs and CPUs for Graphlet Counting in Large Networks

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingNesreen K. Ahmed
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCAapo Kyrölä
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesJason Riedy
 
Follow the money with graphs
Follow the money with graphsFollow the money with graphs
Follow the money with graphsStanka Dalekova
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Distributed graph summarization
Distributed graph summarizationDistributed graph summarization
Distributed graph summarizationaftab alam
 
Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...
Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...
Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...IJMER
 
mini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptxmini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptxtusharpawar803067
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
Design and Implementation of Mobile Map Application for Finding Shortest Dire...
Design and Implementation of Mobile Map Application for Finding Shortest Dire...Design and Implementation of Mobile Map Application for Finding Shortest Dire...
Design and Implementation of Mobile Map Application for Finding Shortest Dire...Eswar Publications
 
GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013
GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013
GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013Amazon Web Services
 
Graph Sample and Hold: A Framework for Big Graph Analytics
Graph Sample and Hold: A Framework for Big Graph AnalyticsGraph Sample and Hold: A Framework for Big Graph Analytics
Graph Sample and Hold: A Framework for Big Graph AnalyticsNesreen K. Ahmed
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2Mohit Garg
 
Start From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmStart From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmYu Liu
 
Apache Spark GraphX highlights.
Apache Spark GraphX highlights. Apache Spark GraphX highlights.
Apache Spark GraphX highlights. Doug Needham
 
On Sampling from Massive Graph Streams
On Sampling from Massive Graph StreamsOn Sampling from Massive Graph Streams
On Sampling from Massive Graph StreamsNesreen K. Ahmed
 

Similar to Leveraging Multiple GPUs and CPUs for Graphlet Counting in Large Networks (20)

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New Architectures
 
Follow the money with graphs
Follow the money with graphsFollow the money with graphs
Follow the money with graphs
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Distributed graph summarization
Distributed graph summarizationDistributed graph summarization
Distributed graph summarization
 
HalifaxNGGs
HalifaxNGGsHalifaxNGGs
HalifaxNGGs
 
Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...
Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...
Design and Implementation of Multiplier Using Kcm and Vedic Mathematics by Us...
 
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-CentersTowards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
 
mini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptxmini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptx
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
Design and Implementation of Mobile Map Application for Finding Shortest Dire...
Design and Implementation of Mobile Map Application for Finding Shortest Dire...Design and Implementation of Mobile Map Application for Finding Shortest Dire...
Design and Implementation of Mobile Map Application for Finding Shortest Dire...
 
GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013
GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013
GraphLab: Large-Scale Machine Learning on Graphs (BDT204) | AWS re:Invent 2013
 
Graph Sample and Hold: A Framework for Big Graph Analytics
Graph Sample and Hold: A Framework for Big Graph AnalyticsGraph Sample and Hold: A Framework for Big Graph Analytics
Graph Sample and Hold: A Framework for Big Graph Analytics
 
Graph Evolution Models
Graph Evolution ModelsGraph Evolution Models
Graph Evolution Models
 
Ijciet 10 01_183
Ijciet 10 01_183Ijciet 10 01_183
Ijciet 10 01_183
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Start From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmStart From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize Algorithm
 
Apache Spark GraphX highlights.
Apache Spark GraphX highlights. Apache Spark GraphX highlights.
Apache Spark GraphX highlights.
 
On Sampling from Massive Graph Streams
On Sampling from Massive Graph StreamsOn Sampling from Massive Graph Streams
On Sampling from Massive Graph Streams
 

Recently uploaded

GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 

Recently uploaded (20)

GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 

Leveraging Multiple GPUs and CPUs for Graphlet Counting in Large Networks

  • 1.
  • 2. Graphs – rich and powerful data representation - - - - - Social network Human Disease Network [Barabasi 2007] Food Web [2007] Terrorist Network [Krebs 2002]Internet (AS) [2005] Gene Regulatory Network [Decourty 2008] Protein Interactions [breast cancer] Political blogs Power grid
  • 3. Graphlets H13 H14 H15 H16 H17H6H2 H7 H8 H9 H10 H11 H12H4H1 H3 H5 1 0 0.5 0.83 0.67 0.33 0.17 Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002] The Structure and Function of Complex Networks [Newman – Siam Review 2003] Small induced subgraphs
  • 4. Graphlets H13 H14 H15 H16 H17H6H2 H7 H8 H9 H10 H11 H12H4H1 H3 H5 1 0 0.5 0.83 0.67 0.33 0.17 Connected Disconnected Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002] The Structure and Function of Complex Networks [Newman – Siam Review 2003] Small induced subgraphs
  • 5. Graphlets k-graphlets = family of graphlets of size k 2-graphlets 3-graphlets 4-graphlets H13 H14 H15 H16 H17H6H2 H7 H8 H9 H10 H11 H12H4H1 H3 H5 1 0 0.5 0.83 0.67 0.33 0.17 Connected Disconnected Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002] The Structure and Function of Complex Networks [Newman – Siam Review 2003] Small induced subgraphs
  • 6. Graphlets k-graphlets = family of graphlets of size k motifs = frequently occurring subgraphs 2-graphlets 3-graphlets 4-graphlets H13 H14 H15 H16 H17H6H2 H7 H8 H9 H10 H11 H12H4H1 H3 H5 1 0 0.5 0.83 0.67 0.33 0.17 Connected Disconnected Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002] The Structure and Function of Complex Networks [Newman – Siam Review 2003] Small induced subgraphs
  • 7. Graphlets k-graphlets = family of graphlets of size k motifs = frequently occurring subgraphs Applied to food web, genetic, neural, web, and other networks Found distinct graphlets in each case 2-graphlets 3-graphlets 4-graphlets H13 H14 H15 H16 H17H6H2 H7 H8 H9 H10 H11 H12H4H1 H3 H5 1 0 0.5 0.83 0.67 0.33 0.17 Connected Disconnected Network Motifs: Simple Building Blocks of Complex Networks [Milo et. al – Science 2002] The Structure and Function of Complex Networks [Newman – Siam Review 2003] Small induced subgraphs
  • 8. • Biological Networks ⎻ network alignment, protein function prediction [Pržulj 2007][Milenković-Pržulj 2008] [Hulovatyy-Solava-Milenković 2014] [Shervashidze et al. 2009][Vishwanathan et al. 2010] • Social Networks ⎻ Triad analysis, role discovery, community detection [Granovetter 1983][Holland-Leinhardt 1976][Rossi-Ahmed 2015] [Ahmed et al. 2015][Xie-Kelley-Szymanski 2013] • Internet AS [Feldman et al. 2008] • Spam Detection [Becchetti et al. 2008][Ahmed et al. 2016] Applications of Graphlets Useful for various machine learning tasks e.g., Anomaly detection, Role Discovery, Relational Learning, Clustering etc.
  • 9. Useful for a variety of ML tasks • Graph-based anomaly detection ⎻ Unusual/malicious behavior detection ⎻ Emerging event and threat identification, … • Graph-based semi-supervised learning, classification, … • Link prediction and relationship strength estimation • Graph similarity queries ⎻ Find similar nodes, edges, or graphs • Subgraph detection and matching
  • 10. Applications: Higher-order network analysis and modeling Higher-order network structures • Visualization – “spotting anomalies” [Ahmed et al. ICDM 2014] • Finding large cliques, stars, and other larger network structures [Ahmed et al. KAIS 2015] • Spectral clustering [Jure et al. Science 2016] • Role discovery [Ahmed et al. 2016] ...
  • 11. How CPU/GP Us compare CPU GPU Large memory Memory is very limited Few fast/powerful processing units Thousands of smaller processing units Handles unbalanced jobs better Performs best with “balanced” workloads Optimized for general computations Optimized for simple repetitive calculations at a very fast rate.
  • 12. How CPU/GP Us compare CPU GPU Large memory Memory is very limited Few fast/powerful processing units Thousands of smaller processing units Handles unbalanced jobs better Performs best with “balanced” workloads Optimized for general computations Optimized for simple repetitive calculations at a very fast rate. Combine advantages of both
  • 13. INPUT: a large graph G=(V,E), set of graphlets 𝓗 PROBLEM: Find the number of embeddings (appearances) of each graphlet 𝐻 𝑘 ∈ 𝓗 in G Problem: global graphlet counting (macro-level)
  • 14. INPUT: a large graph G=(V,E), set of graphlets 𝓗 PROBLEM: Find the number of embeddings (appearances) of each graphlet 𝐻 𝑘 ∈ 𝓗 in G Problem: global graphlet counting (macro-level) Given an input graph G - How many triangles in G? - How many cliques of size 4-nodes in G? - How many cycles of size 4-nodes in G?
  • 15. INPUT: a large graph G=(V,E), set of graphlets 𝓗 PROBLEM: Find the number of embeddings (appearances) of each graphlet 𝐻 𝑘 ∈ 𝓗 in G Problem: global graphlet counting (macro-level) Given an input graph G - How many triangles in G? - How many cliques of size 4-nodes in G? - How many cycles of size 4-nodes in G?  Many applications require counting all k-vertex graphlets  Recent research work - Exact/approximation of global counts [Rahman et al. TKDE14] [Jha et al. WWW15] - Scalable for massive graphs (billions of nodes/edges) ] [Ahmed et al. ICDM15,KAIS16]
  • 16. INPUT: a large graph G=(V,E), set of graphlets ℋ PROBLEM: Find the number of occurrences that edge i is contained within 𝐻 𝑘, for all k = 1, … , |ℋ| Role discovery, Relational Learning, Multi-label Classification Problem: local graphlet counting (micro-level)
  • 17. Current work • Enumerate all possible graphlets - Exhaustive enumeration is too expensive • Count graphlets for each node - Expensive for large k [Shervashidze et al. – AISTAT 2009] Not practical – scales only for graphs with few hundred/thousand nodes/edges [Hočevar et al. – Bioinfo. 13] Sequential
  • 18. Current work • Enumerate all possible graphlets - Exhaustive enumeration is too expensive • Count graphlets for each node - Expensive for large k [Shervashidze et al. – AISTAT 2009] Not practical – scales only for graphs with few hundred/thousand nodes/edges [Hočevar et al. – Bioinfo. 13] Sequential Parallel • Edge-centric graphlet counting (PGD) ⎻ Multi-core CPUs, large graphs [Ahmed et al. ICDM 14, KAIS 15]
  • 19. Current work • Enumerate all possible graphlets - Exhaustive enumeration is too expensive • Count graphlets for each node - Expensive for large k [Shervashidze et al. – AISTAT 2009] Not practical – scales only for graphs with few hundred/thousand nodes/edges [Hočevar et al. – Bioinfo. 13] Sequential Parallel • Edge-centric graphlet counting (PGD) ⎻ Multi-core CPUs, large graphs • Node-centric graphlet counting, ⎻ Single GPU, Handles only tiny graphs (ORCA-GPU) [Ahmed et al. ICDM 14, KAIS 15] [Milinković et al.]
  • 20. Our approach Hybrid parallel graphlet counting framework that leverages all available CPUs and GPUs Parallel Graphlet Counting Framework Single GPU methods Multi-GPU methods Hybrid CPU-GPU methods Algorithm classes
  • 21. Our approach Hybrid parallel graphlet counting framework that leverages all available CPUs & GPUs Hybrid Parallel Graphlet Counting Framework Multi-GPU methods Hybrid CPU-GPU methods Algorithm classes Single GPU methods Other key advantages: • Edge-centric parallelization ⎻ Improved load balancing & lock-free • Global and local graphlet counts • Connected and disconnected graphlets • Fine-grained parallelization • Space-efficient ⫶ e
  • 22. Overview of our approach
  • 23. uv SuSv … T uv … Ec EDGE Overview nodes completing a triangle with edge (v, u)T = nodes that form a 2-star with v (u)Sv (Su) = Edge-centric, Parallel, Fast, Space-efficient Framework
  • 24. Our Approach – (Edge-centric, parallel, space-efficient) Searching Edge Neighborhoods For each edge Find the triangles Step 1
  • 25. Our Approach – (Edge-centric, parallel, space-efficient) Searching Edge Neighborhoods For each edge Find the triangles Count a few k-graphlets For each edge, count only: k-cliques k-cycles tailed-triangles Step 1 Step 2
  • 26. Our Approach – (Edge-centric, parallel, space-efficient) Searching Edge Neighborhoods For each edge Find the triangles Count a few k-graphlets For each edge, count only: Count all other graphlets For each edge, use combinatorial relationships to derive counts of other graphlets in constant time o(1) k-cliques k-cycles tailed-triangles Step 1 Step 2 Step 3
  • 27. Our Approach – (Edge-centric, parallel, space-efficient) Searching Edge Neighborhoods For each edge Find the triangles Count a few k-graphlets For each edge, count only: Count all other graphlets For each edge, use combinatorial relationships to derive counts of other graphlets in constant time o(1) k-cliques k-cycles tailed-triangles Step 1 Step 2 Step 3 Step 4 Merge all counts
  • 28. neighborhood runtimes (CPU) Key Observations Neighborhood runtimes are power-lawed The distribution of graphlet runtimes for edge neighborhoods obey a power-law.
  • 29. Most edge neighborhoods are fast with runtimes that are approximately equal. Key Observations The distribution of graphlet runtimes for edge neighborhoods obey a power-law. Neighborhood runtimes are power-lawed
  • 30. HOWEVER, a handful of neighborhoods are hard and take significantly longer. Most edge neighborhoods are fast with runtimes that are approximately equal. Key Observations The distribution of graphlet runtimes for edge neighborhoods obey a power-law. Neighborhood runtimes are power-lawed
  • 31. HOWEVER, a handful of neighborhoods are hard and take significantly longer. Most edge neighborhoods are fast with runtimes that are approximately equal. Key Observations The distribution of graphlet runtimes for edge neighborhoods obey a power-law. Neighborhood runtimes are power-lawed QUESTION: What is the “best” way to partition neighborhoods among CPUs and GPUs? • “hardness” proxy  edge deg., vol., ...
  • 32. Our approach • Order edges by “hardness” and partition into 3 sets: Γ e1 Γ e 𝑀Γ e𝑗Γ e 𝑘 ⋯ ⋯ ⋯ Πcpu Πgpu
  • 33. Our approach • Order edges by “hardness” and partition into 3 sets: • Compute induced subgraphs centered at each edge • CPU Workers: use hash table for o(1) lookups, O(N) • GPU Workers: use binary search for o(log d) lookups • When finished, dequeue next b edges: • CPU: get b edges from FRONT of Πunproc • GPU: get b edges from BACK of Πunproc
  • 34. Preprocessing steps Three simple and efficient preprocessing steps: 1) Sort vertices from smallest to largest degree 𝑓(∙) and relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁
  • 35. Preprocessing steps Three simple and efficient preprocessing steps: 1) Sort vertices from smallest to largest degree 𝑓(∙) and relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁 2) For each Γ v𝑖 , ∀𝑖 = 1, … , N, order the set of neighbors Γ v𝑖 = … , w𝑗, … , w 𝑘 , … from smallest to largest deg.
  • 36. Preprocessing steps Three simple and efficient preprocessing steps: 1) Sort vertices from smallest to largest degree 𝑓(∙) and relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁 2) For each Γ v𝑖 , ∀𝑖 = 1, … , N, order the set of neighbors Γ v𝑖 = … , w𝑗, … , w 𝑘 , … from smallest to largest deg. 3) Given edge (v, u) ∈ E, ensure that 𝑓 v ≥ 𝑓 u ⎻ hence, v is always the vertex with largest degree, dv ≥ du
  • 37. Preprocessing steps • All of these steps are not required, but significantly improve • Each step is extremely fast and lends itself to easy parallelization Three simple and efficient preprocessing steps: 1) Sort vertices from smallest to largest degree 𝑓(∙) and relabel them s.t. 𝑓 v1 ≤ ⋯ ≤ 𝑓 v 𝑁 2) For each Γ v𝑖 , ∀𝑖 = 1, … , N, order the set of neighbors Γ v𝑖 = … , w𝑗, … , w 𝑘 , … from smallest to largest deg. 3) Given edge (v, u) ∈ E, ensure that 𝑓 v ≥ 𝑓 u ⎻ hence, v is always the vertex with largest degree, dv ≥ du
  • 38. Fine Granularity & Work Stealing For a single edge (v, u) ∈ E, I. Compute the sets 𝐓 and 𝐒 𝐮 II. Find the total 4-cliques using T III. Find the total 4-cycles using 𝐒 𝐮 NOTE: (II) and (III) are independent  parallelize
  • 39. Unrestricted counts 3-graphlets 4-graphlets De = N −(|Sv|+ |Su| + |T|) − 2
  • 41. Time Complexity K = number of edges Δ = max degree Tmax = max number of triangles incident to an edge in G Smax = max number of 2-stars incident to an edge in G
  • 43. Connected 4-graphlet frequencies for a variety of the real- world networks investigated from different domains. Facebook networks Social networks Interaction networks Network type Collaboration networks Brain networks Web graphs Technological/IP networks Dense hard benchmark graphs
  • 44. Validating edge partitioning 0 2 4 6 x 10 4 0 0.05 0.1 0.15 0.2 Edge neighborhoods Time(ms) CPU GPU 0 2 4 6 x 10 4 0 0.05 0.1 0.15 0.2 Edge neighborhoods Time(ms) • Edges partitioned by “hardness” • GPUs assigned sparser neighborhoods • Assigns edge neighborhoods to “best” processor type • Importance of initial ordering GPU workers assigned easy & balanced edge neighborhoods (approx. equal runtimes) CPU workers assigned difficult unbalanced/skewed neighborhoods
  • 45. Experiments: Improvement Runtime improvement over state-of-the-art GPU: Uses a single multi-core GPU Multi-GPU: Uses all available GPUs Hybrid: Leverages all multi-core CPUs & GPUs 2 Intel Xeon CPUs (E5-2687) – • 8 cores (3.10Ghz) 8 Titan Black NVIDIA GPUs – • 2880 cores (889 Mhz), ~6GB
  • 46. Experiments: Improvement Runtime improvement over state-of-the-art Improvement: significant at α = 0.01 GPU: Uses a single multi-core GPU Multi-GPU: Uses all available GPUs Hybrid: Leverages all multi-core CPUs & GPUs
  • 47. Experiments: Improvement Runtime improvement over state-of-the-art Improvement: significant at α = 0.01 MEAN 8x 40x 126x GPU: Uses a single multi-core GPU Multi-GPU: Uses all available GPUs Hybrid: Leverages all multi-core CPUs & GPUs
  • 48. Comparing ORCA-GPU methods Many problems with Orca-GPU: • No “effective parallelization”, many parts dependent • Requires synchronization throughout, locks • No fine-grained parallelization • Significant improvement over Orca-GPU (at 𝛼 = 0.01) Orca-GPU runtime / runtime of proposed method ImprovementoverOrca-GPU
  • 49. Varying the edge ordering Ordering strategy significantly impacts performance
  • 50. Space-efficient & comm. avoidance Average memory (MB) per GPU for three networks.
  • 58. Framework & Algorithms • Introduced hybrid graphlet counting approach that leverages all available CPUs & GPUs • First hybrid CPU-GPU approach for graphlet counting • On average 126x faster than current methods - Edge-centric computations (only requires access to edge neighborhood) • Time and space-efficient Applications • Visual analytics and real-time graphlet mining Summary
  • 59. Research generously supported by: Data: http://networkrepository.com