Hadi Sinaee, University of British Columbia, June 2021
Application Driven Graph Partitioning
Wenfei Fan, et al. SIGMOD’20
For Systopia’s reading group by Hadi Sinaee
University of British Columbia, Canada
June 11th, 2021
How do we do it?
What are the metrics for it?
Why do we partition a graph?!
[Figure: a Graph split into Sub graphs Part-1, Part-2, Part-3, ..., Part-K]
● A Graph
● K Units of Computation
How do we do the partitioning?
Which partitioning is considered a good one?
How do we do the partitioning?
● Edge Partitioning (Vertex-Cut)
● Vertex Partitioning (Edge-Cut)
● Hybrid Partitioning
Which partitioning is considered a good one?
1. Lower Replication
2. Well-Balanced Sub Graphs
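The two metrics above can be made concrete with a small sketch. It assumes a hypothetical edge partitioning stored as a dict from partition id to a list of edges; the function names and the toy graph are illustrative and do not come from the paper. Replication is measured as the average number of partitions holding a copy of a vertex, and balance as the largest partition size over the average size.

```python
from collections import defaultdict

def replication_factor(parts):
    """Average number of partitions that hold a copy of each vertex."""
    copies = defaultdict(set)
    for pid, edges in parts.items():
        for u, v in edges:
            copies[u].add(pid)
            copies[v].add(pid)
    return sum(len(p) for p in copies.values()) / len(copies)

def balance_factor(parts):
    """Largest partition size divided by the average size (1.0 is perfect)."""
    sizes = [len(edges) for edges in parts.values()]
    return max(sizes) / (sum(sizes) / len(sizes))

# Hypothetical toy graph: 4 edges split across 2 partitions.
parts = {
    0: [("a", "b"), ("a", "c")],
    1: [("a", "d"), ("c", "d")],
}
rf = replication_factor(parts)  # vertices a and c are replicated
bf = balance_factor(parts)      # both partitions hold 2 edges
```

Lower values of both numbers are better under these two classic metrics.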
https://twitter.com/NetflixFilm/status/1291442611962560513?s=20
Vertex Partitioning
Edge Partitioning
Hybrid Partitioning
Which partitioning is considered A GOOD ONE?!
● Workload?
● Access to nodes?
● ...
“For an algorithm of our interest, what
partitioning strategy fits it the best and
improves its parallel execution?”
Example - Common Neighbors (CN)
For a node u, CN produces triples of the form (u, v1, v2), (u, v1, v3), ..., (u, vi, vj).
How many triples of this form exist?
● Pick any two nodes from u’s incoming nodes
● In-Degree n: number of incoming nodes
● C(n,2) = n*(n-1)/2
[Figure: node u with neighbors V1 and V2]
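The count above can be checked by enumeration. This is a minimal sketch with a hypothetical node `u` and four made-up incoming neighbors; `cn_pairs` is an illustrative helper name, not something defined in the paper.

```python
from itertools import combinations
from math import comb

def cn_pairs(u, in_neighbors):
    """All triples (u, vi, vj) where vi and vj are distinct incoming neighbors of u."""
    return [(u, vi, vj) for vi, vj in combinations(in_neighbors, 2)]

# Hypothetical node u with in-degree n = 4.
in_neighbors = ["v1", "v2", "v3", "v4"]
triples = cn_pairs("u", in_neighbors)
n = len(in_neighbors)
expected = n * (n - 1) // 2  # C(n, 2)
```

The work per node grows quadratically with its in-degree, which is why a few high in-degree nodes can dominate the workload.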
Example - Vertex Partitioning and CN
● Partitioning 1: 5 Vertices + 9 Edges per fragment (Well-Balanced), but workload F1(10) > F2(2)
● Partitioning 2: F1: 3 Vertices + 6 Edges, F2: 7 Vertices + 11 Edges, but workload F1(6) == F2(6)
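A small sketch of why balanced sizes do not imply balanced CN workload. It assumes the workload of a fragment is the sum of C(d, 2) over the in-degrees d of its vertices; the in-degree lists are hypothetical and are not the graph drawn on the slide.

```python
from math import comb

def cn_workload(in_degrees):
    """Sum of C(d, 2) over the in-degrees of the vertices in one fragment."""
    return sum(comb(d, 2) for d in in_degrees)

# Hypothetical in-degrees, NOT the graph from the slide.
# Both fragments hold 5 vertices and 9 incoming edges in total.
f1 = [5, 1, 1, 1, 1]   # one hub vertex
f2 = [2, 2, 2, 2, 1]   # evenly spread in-degrees
w1 = cn_workload(f1)
w2 = cn_workload(f2)
```

Both fragments look identical to a size-based metric, yet the fragment with the hub does more than twice the work.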
What Happened?!
1. Picked An Algorithm (CN)
2. Defined A Cost Model For CN
3. Is this a good partitioning for CN?
   ● No
   ● Yes: We’re good!
What Happened?!
Input: A Graph Partition
1. Picked An Algorithm (A)
2. Defined A Cost Model For A
3. Is this a good partitioning for A?
   ● No: Repartition the graph
   ● Yes: We’re good!
What is the process for learning the cost model?
1. feature vector
2. cost function
3. training and testing
What is the feature vector?
● In-degree & out-degree (Each Partition)
● In-degree & out-degree (Graph)
● Number of mirrors across all partitions*
*When there are multiple copies of a node, we consider one of them as the master and the rest as mirrors!
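The quantities listed above can be packed into a per-vertex feature vector. This is only a sketch: the function name, argument names, and ordering are assumptions made for illustration, and it covers just the five quantities named on this slide.

```python
def feature_vector(in_local, out_local, in_global, out_global, mirrors):
    """Flatten the per-vertex metrics into a list of numbers.

    in_local / out_local:   degrees of the vertex inside its partition
    in_global / out_global: degrees of the vertex in the whole graph
    mirrors:                number of mirror copies across all partitions
    """
    return [in_local, out_local, in_global, out_global, mirrors]

# Hypothetical vertex: 2 of its 5 incoming edges are local, and it has 2 mirrors.
x = feature_vector(in_local=2, out_local=1, in_global=5, out_global=3, mirrors=2)
```

Comparing local and global degrees shows how much of a vertex's neighborhood lives in other partitions.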
What is the cost function?!
Cost Function (Algorithm, Partition i) = Computation Cost + Communication Cost
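A minimal sketch of the decomposition above. It assumes the cost of a partition is obtained by summing a per-vertex computation cost `h` and a per-vertex communication cost `g`; the two lambdas are hypothetical stand-ins, not the learned functions from the paper.

```python
def partition_cost(vertices, h, g):
    """Cost of one partition: summed computation cost plus summed communication cost."""
    computation = sum(h(x) for x in vertices)
    communication = sum(g(x) for x in vertices)
    return computation + communication

# Hypothetical stand-ins for the two cost functions.
h = lambda x: x[0] * (x[0] - 1) / 2   # e.g. pairs among local in-neighbors
g = lambda x: 2 * x[1]                # e.g. proportional to the number of mirrors

# Each vertex is a (local in-degree, number of mirrors) pair.
cost = partition_cost([(4, 0), (3, 1), (2, 2)], h, g)
```

Keeping the two terms separate makes it possible to balance computation and communication independently.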
Computation and Communication costs!
Features: x1, x2, x3, x4, x5, x6
Polynomial function of order P (here P = 2): expand (1 + x1 + x2 + x3 + x4 + x5 + x6)^P
Terms: x1, x1*x1, x1*x2, …, x6*x6
Weights: w1, w2, w3, …, wn
Cost: w1*x1 + w2*(x1*x1) + … + wn*(x6*x6)
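The expansion above can be generated mechanically. This sketch lists every monomial of degree 1 to P over the features and takes a weighted sum; the feature values and weights are made up, the constant term is left out, and the term ordering is one arbitrary choice that need not match the slide.

```python
from itertools import combinations_with_replacement
from math import prod

def polynomial_terms(x, p):
    """All monomials of degree 1..p over the features in x (constant term omitted)."""
    terms = []
    for degree in range(1, p + 1):
        for combo in combinations_with_replacement(x, degree):
            terms.append(prod(combo))
    return terms

def polynomial_cost(x, weights, p=2):
    """Weighted sum w1*t1 + w2*t2 + ... over the expanded terms."""
    terms = polynomial_terms(x, p)
    if len(weights) != len(terms):
        raise ValueError("need one weight per term")
    return sum(w * t for w, t in zip(weights, terms))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical feature values
terms = polynomial_terms(x, 2)        # 6 linear + 21 quadratic terms
cost = polynomial_cost(x, [1.0] * len(terms))
```

With 6 features and P = 2 there are 27 weights to learn, so the model stays small enough to fit with ordinary regression.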
How do we train our models?
● Training data set: computed costs for v, obtained by running algorithm A on real-world graphs and synthetic graphs
● Loss: Mean Squared Relative Error (MSRE)
● Prevent overfitting
● Learned weights: w1, w2, w3, …, wn
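For reference, a sketch of the MSRE loss under its usual definition: the mean of the squared error relative to the measured value. The exact form used in the paper, including any regularization term, is not shown on the slide, so treat this as an assumption; the sample numbers are made up.

```python
def msre(predicted, actual):
    """Mean squared relative error: mean of ((pred - actual) / actual)^2."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(((p - a) / a) ** 2 for p, a in zip(predicted, actual))
    return total / len(actual)

# Hypothetical measured costs and model predictions.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
loss = msre(predicted, actual)  # two predictions are off by 10%, one is exact
```

A relative error keeps high-cost vertices from dominating the loss, which matters when costs span several orders of magnitude.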
How to use our trained models?
From Edge-Cut to Hybrid-Cut (E2H)
Given:
1. Edge-Cut Part.
2. Learned h and g
3. Budget B
Goal: a hybrid partition that reduces cost by balancing computation and balancing communication
E2H Process:
● EMigrate:
  1. Find candidates (using BFS) and move e-cut nodes, with all their edges, from overloaded to underloaded partitions
  2. Continue until no more migration is needed
● ESplit: what happens if we cannot migrate anymore, e.g. because of high-degree nodes in power-law graphs? Select a node with a subset of its edges for the migration, i.e. cut the node into multiple v-cut nodes
● MAssign: assign master nodes in the border node set
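A heavily simplified sketch of the migration step, to make the control flow concrete. It is not the paper's algorithm: the per-vertex `cost` function stands in for the learned model, the BFS start vertex is chosen by hand, edges are assumed to travel with their vertex, and all names are hypothetical.

```python
from collections import deque

def bfs_order(adj, start, allowed):
    """Vertices of `allowed` reachable from `start`, in BFS order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return order

def emigrate(adj, parts, cost, budget, src, start):
    """Move vertices out of overloaded partition `src` until it fits the budget.

    parts: dict partition id -> set of vertices (modified in place).
    Returns the list of (vertex, destination) moves.
    """
    load = {pid: sum(cost(v) for v in vs) for pid, vs in parts.items()}
    moves = []
    for v in bfs_order(adj, start, set(parts[src])):
        if load[src] <= budget:
            break
        # Least-loaded other partition; if v does not fit there, it fits nowhere.
        dst = min((p for p in parts if p != src), key=lambda p: load[p])
        if load[dst] + cost(v) > budget:
            continue  # a split step would be needed for this vertex
        parts[src].remove(v)
        parts[dst].add(v)
        load[src] -= cost(v)
        load[dst] += cost(v)
        moves.append((v, dst))
    return moves

# Hypothetical graph and partition; cost = degree + 1.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"], "e": []}
parts = {0: {"a", "b", "c", "d"}, 1: {"e"}}
moves = emigrate(adj, parts, cost=lambda v: len(adj[v]) + 1, budget=8, src=0, start="d")
```

Walking candidates in BFS order tends to move connected groups of vertices together, which avoids creating many new cut edges.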
Example!
[Figure: nodes visited in BFS order are marked for migration and added if within our budget]
● EMigrate t3
● EMigrate t2: aborted since it exceeds F2 budget!
● ESplit t2
● MAssign
Parallelized E2H
● Initial Edge-Cut With N Partitions, spread over Worker 1, Worker 2, ..., Worker N
● *Shared Nothing Distributed State
● Overloaded Partition: Worker N. Underloaded Partitions: Worker i, ...
● EMigrate: the candidates c1, c2, c3 of the overloaded partition are offered to the underloaded partitions
● A candidate that is not accepted (c3 in the figure) is offered again
● The process continues until all candidates are either accepted by some workers or rejected by all of them!
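The termination rule above can be sketched as a sequential simulation. This is an assumption-laden toy, not the distributed protocol: workers are visited one after another instead of in parallel, acceptance is a simple remaining-budget check, and the candidate costs and worker budgets are made up.

```python
def offer_candidates(candidates, capacities):
    """Offer each candidate to the underloaded workers until accepted or rejected by all.

    candidates: dict candidate -> cost (hypothetical values).
    capacities: dict worker -> remaining budget (modified in place).
    Returns (accepted, rejected); accepted maps candidate -> worker.
    """
    accepted, pending = {}, list(candidates)
    for worker in capacities:
        still_pending = []
        for c in pending:
            if candidates[c] <= capacities[worker]:
                capacities[worker] -= candidates[c]
                accepted[c] = worker
            else:
                still_pending.append(c)
        pending = still_pending
        if not pending:
            break
    return accepted, pending

accepted, rejected = offer_candidates(
    {"c1": 3, "c2": 2, "c3": 4},
    {"worker1": 5, "worker2": 4},
)
```

Here `c3` does not fit on the first worker after `c1` and `c2` are taken, so it is passed on and accepted by the second one.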
Experiments - Setup
Graphs:
- LiveJournal (4.8M Nodes, 68M Edges)
- Twitter (42M Nodes, 1.5B Edges)
- UK Web (106M Nodes, 3.7B Edges)
Algorithms:
- Common Neighbors (CN)
- Triangle Counting (TC)
- PageRank (PR)
Cost model training:
- 80K Training Samples, 20K Tests
- PyTorch
- NVIDIA Tesla V100 GPU
Cluster:
- 32 machines in an HPC cluster
- 12 cores Xeon 2.2GHz + 128GB RAM
- 10Gbps NIC
Execution:
- Each partition is processed by 1 worker
- Each worker runs on 1 exclusive core
- Each experiment = avg (repeated 5 times)
Experiments - Reading!
[Figure: plot legend grouping the baselines into Edge-Cut Partitioners, Vertex-Cut Partitioners, and Hybrid Partitioners, plus their post-processed partitions]
● Edge-cut partitions are post-processed by the parallel version of E2H (ParE2H)
● Vertex-cut partitions are post-processed by the parallel version of V2H (ParV2H)
Experiments - Application Speedup of CN
[Figure: application speedup of CN; x-axis: number of partitions; arrows mark the better and worse directions]
Experiments - Application Speedup of TC
[Figure: application speedup of TC; series grouped into vertex-cut and edge-cut based; arrows mark the better and worse directions]
Experiments - Application Speedup of PR
[Figure: application speedup of PR; series grouped into vertex-cut and edge-cut based; arrows mark the better and worse directions]
Experiments - Scalability
[Figure: CN algorithm on G = Synthetic Graph; x-axis: number of partitions]
On Average:
- ParE2H takes ~12% of the total run-time
- ParV2H takes:
1. 0.1% of the HNE run-time
2. ~23% of the HGrid run-time
Experiments - Impact of different phases
EMigrate: CN(~68%), TC(~26%), PR(75%)
ESplit: CN(1.1 times), TC(2.7 times)
MAssign: CN, TC, PR ~ (20%-30%)
Experiments - Efficiency
Application driven graph partitioning - SIGMOD'20

  • 1. Hadi Sinaee, University of British Columbia, June 2021 Application Driven Graph Partitioning Wenfei Fan, et al. SIGMOD’20 For Systopia’s reading group by Hadi Sinaee University of British Columbia, Canada June 11th, 2021 1 How do we do it? What are the metrics for it?
  • 2. Hadi Sinaee, University of British Columbia, June 2021 Why do we partition a graph?! Graph Sub graph Sub graph . . . Sub graph Part-2 Part-3 Part-K Sub graph Part-1 ● A Graph ● K Units of Computation How do we do the partitioning? Which partitioning is considered a good one? 2
  • 3. Hadi Sinaee, University of British Columbia, June 2021 Edge Partitioning(VertexCut) Vertex Partitioning(EdgeCut) How do we do the partitioning? 3 Hybrid Partitioning
  • 4. Hadi Sinaee, University of British Columbia, June 2021 Which partitioning is considered a good one? 1. Lower Replication 2. Well-Balanced Sub Graphs 4
  • 5. Hadi Sinaee, University of British Columbia, June 2021 https://twitter.com/NetflixFilm/status/1291442611962560513?s=20 5 Vertex Partitioning Edge Partitioning Hybrid Partitioning
  • 6. Hadi Sinaee, University of British Columbia, June 2021 Which partitioning is considered A GOOD ONE ??! 6 + Workload? Access to nodes? ... v.s
  • 7. Hadi Sinaee, University of British Columbia, June 2021 “For an algorithm of our interest, what partitioning strategy fits it the best and improves its parallel execution?” 7
  • 8. Hadi Sinaee, University of British Columbia, June 2021 Example - Common Neighbors(CN) 8 (u, v1,v3) . . . (u, vi,vj) ... How many pairs of this form exists? Pick any two nodes from u’s incoming nodes In-Degree: number of incoming nodes C(n,2) = n*(n-1)/2 (u, v1,v2) u V 2 V 1
  • 9. Hadi Sinaee, University of British Columbia, June 2021 Example - Vertex Partitioning and CN 9 5 Vertices + 9 Edges (Well-Balanced) F1(10) > F2(2) (workload) F1(6) == F2(6) (workload) F1: 3 Vertices + 6 Edges F2: 7 Vertices + 11 Edges
  • 10. Hadi Sinaee, University of British Columbia, June 2021 What Happened?! 10 Picked An Algorithm (CN) Defined A Cost Model For CN Is this a good partitioning for CN? No Yes We’re good!
  • 11. Hadi Sinaee, University of British Columbia, June 2021 What Happened?! 11 Picked An Algorithm (A) Defined A Cost Model For CN Is this a good partitioning for CN? No Yes We’re good! Repartition the graph A Graph Partition
  • 12. Hadi Sinaee, University of British Columbia, June 2021 What is the process for learning the cost model? 12 feature vector cost function training and testing
  • 13. Hadi Sinaee, University of British Columbia, June 2021 What is the feature vector? 13 In-degree & out-degree (Each Partition) In-degree & out-degree (Graph) Number of mirrors across all partitions *When there are multiple copies of a node, we consider one of them as the master and the rest as mirrors!
  • 14. Hadi Sinaee, University of British Columbia, June 2021 What is the process for learning the cost model? 14 feature vector cost function training and testing
  • 15. Hadi Sinaee, University of British Columbia, June 2021 What is the cost function?! 15 Algorithm Partition i Computation Cost Communication Cost Cost Function (Partition i) = +
  • 16. Hadi Sinaee, University of British Columbia, June 2021 Computation and Communication costs! 16
  • 17. Hadi Sinaee, University of British Columbia, June 2021 Computation and Communication costs? 17
  • 18. Hadi Sinaee, University of British Columbia, June 2021 Computation and Communication costs? 18 X1 x2 x3 x4 x5 x6 (1+ x1+ x2+ x3+ x4+ x5+ x6)p=2 x1 x1*x1 x1*x2 … x6*x6 P=2 polynomial function of order P w1 w2 w3 … wn w1*x1 + w2*(x1*x1) + … wn*(x6*x6)
  • 19. Hadi Sinaee, University of British Columbia, June 2021 Computation and Communication costs? 19 dummy! polynomial function of order P
  • 20. Hadi Sinaee, University of British Columbia, June 2021 Computation and Communication costs? 20 *When there are multiple copies of a node, we consider one of them as the master and the rest as its mirrors!
  • 21. Hadi Sinaee, University of British Columbia, June 2021 What is the process for learning the cost model? 21 feature vector cost function training and testing polynomial function of order P
  • 22. Hadi Sinaee, University of British Columbia, June 2021 How do we train our models? 22 -training data set -computed costs for v Run algorithm A on Real-World Graphs and Synthesis Graphs. Prevent Overfitting Mean Squared Relative Error (MSRE) w1 w2 w3 … wn
  • 23. Hadi Sinaee, University of British Columbia, June 2021 What is the process for learning the cost model? 23 feature vector cost function training and testing polynomial function of order P
  • 24. Hadi Sinaee, University of British Columbia, June 2021 What Happened?! 24 Picked An Algorithm (A) Defined A Cost Model For CN Is this a good partitioning for CN? No Yes We’re good! Repartition the graph A Graph Partition
  • 25. Hadi Sinaee, University of British Columbia, June 2021 What Happened?! - CN 25 Picked An Algorithm (A) Defined A Cost Model For CN Is this a good partitioning for CN? No Yes We’re good! Repartition the graph A Graph Partition
  • 26. How to use our trained models? From Edge-Cut to Hybrid-Cut (E2H). E2H Process. Given: 1. Edge-Cut Part. 2. Learned h and g 3. Budget B. Goal: Hybrid Part. reduces: Balance Computation. Balance Communication. EMigrate: 1. Find candidates (using BFS), move (nodes and all their edges) from overloaded to underloaded partitions 2. Continue until no migration is needed!
  • 27. How to use our trained models? From Edge-Cut to Hybrid-Cut (E2H). E2H Process. Given: 1. Edge-Cut Part. 2. Learned h and g 3. Budget B. Goal: Hybrid Part. reduces: Balance Computation. Balance Communication. EMigrate: 1. Find candidates (using BFS), move (e-cut nodes and all their edges) from overloaded to underloaded partitions 2. Continue until no migration is needed! ESplit: What happens if we can't migrate anymore? e.g. high-degree nodes in power-law graphs. Selects a node with a subset of its edges for the migration.
  • 28. How to use our trained models? From Edge-Cut to Hybrid-Cut (E2H). E2H Process. Given: 1. Edge-Cut Part. 2. Learned h and g 3. Budget B. Goal: Hybrid Part. reduces: Balance Computation. Balance Communication. EMigrate: 1. Find candidates (using BFS), move (e-cut nodes and all their edges) from overloaded to underloaded partitions 2. Continue until no migration is needed! ESplit: Selects a node with a subset of its edges for the migration, i.e. cuts a node into multiple v-cut nodes. MAssign: Assign master nodes in the border node set.
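    The EMigrate step described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the name `emigrate` and all parameters are hypothetical, each node carries a fixed cost (in the paper the cost comes from the learned model and depends on the partition), and the BFS starts from nodes of the overloaded partition that border the underloaded one.

    ```python
    from collections import deque

    def emigrate(adj, owner, node_cost, src, dst, budget):
        """Move nodes from overloaded partition src to dst in BFS order, within budget."""
        load = {src: 0.0, dst: 0.0}
        for v, p in owner.items():
            load[p] = load.get(p, 0.0) + node_cost[v]
        # Seed the BFS with src nodes that already have a neighbour in dst.
        seeds = sorted(v for v in owner
                       if owner[v] == src and any(owner[u] == dst for u in adj[v]))
        queue, seen, moved = deque(seeds), set(seeds), []
        while queue and load[src] > budget:
            v = queue.popleft()
            if load[dst] + node_cost[v] > budget:
                continue  # migration aborted: it would exceed the budget of dst
            owner[v] = dst  # the node moves together with all of its edges
            load[src] -= node_cost[v]
            load[dst] += node_cost[v]
            moved.append(v)
            for u in adj[v]:
                if owner[u] == src and u not in seen:
                    seen.add(u)
                    queue.append(u)
        return moved
    ```

    A node whose migration is aborted here is the kind of candidate that ESplit would handle next by moving only a subset of its edges.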
  • 29. Example! marked for migration; add if it’s in our budget; BFS Order
  • 30. Example! marked for migration; add if it’s in our budget; BFS Order. EMigrate t3. EMigrate t2 aborted since it exceeds F2 budget!
  • 31. Example! ESplit t2. MAssign
  • 32. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ...
  • 33. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ... Worker i ... Underloaded Partitions. Worker N: Overloaded Partition
  • 34. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ... Worker i ... Underloaded Partitions. Worker N: Overloaded Partition. EMigrate c1 c2 c3 ?
  • 35. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ... Worker i ... Underloaded Partitions. Worker N: Overloaded Partition. EMigrate c1 c2 c3 c1 c2 c3
  • 36. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ... Worker i ... Underloaded Partitions. Worker N: Overloaded Partition. EMigrate c1 c2 c3 c1 c2 c3
  • 37. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ... Worker i ... Underloaded Partitions. Worker N: Overloaded Partition. EMigrate c3
  • 38. Parallelized E2H: Worker 1, Worker 2, ..., Worker N. *Shared Nothing, Distributed State. Initial Edge-Cut With N Partitions ... Worker i ... Underloaded Partitions. Worker N: Overloaded Partition. EMigrate c3 c3. The process continues until every candidate is either accepted by some worker or rejected by all of them!
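    The accept-or-reject loop on these slides can be simulated sequentially as below. This is a sketch under stated assumptions: the name `assign_candidates` is hypothetical, each underloaded worker is reduced to a single spare-capacity number, and workers are visited in a fixed order, whereas the real system exchanges messages between shared-nothing workers.

    ```python
    def assign_candidates(candidates, cost, spare):
        """Offer each candidate to the underloaded workers in turn.

        Returns (placement, rejected): placement maps candidate -> accepting worker,
        rejected lists the candidates that no worker could accept.
        """
        remaining = dict(spare)  # copy so the caller's capacities are not modified
        placement = {}
        pending = list(candidates)
        for worker in remaining:
            still_pending = []
            for c in pending:
                if cost[c] <= remaining[worker]:
                    remaining[worker] -= cost[c]  # worker accepts the candidate
                    placement[c] = worker
                else:
                    still_pending.append(c)  # rejected here, offered to the next worker
            pending = still_pending
            if not pending:
                break
        return placement, pending
    ```

    In the slide sequence, c1 and c2 are accepted first and only c3 is passed on, which is the situation this loop reproduces when the first worker runs out of spare capacity.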
  • 39. Experiments - Setup. Graphs: - LiveJournal (4.8M nodes, 68M edges) - Twitter (42M nodes, 1.5B edges) - UK Web (106M nodes, 3.7B edges). Algorithms: - Common Neighbours (CN) - Triangle Counting (TC) - PageRank (PR). Cost-model training: - 80K training samples, 20K test samples - PyTorch - NVIDIA Tesla V100 GPU. Cluster: - 32 machines in an HPC cluster - 12-core Xeon 2.2GHz + 128GB RAM - 10Gbps NIC - Each partition is processed by 1 worker - Each worker runs on 1 exclusive core - Each experiment = average of 5 repeated runs
  • 40. Experiments - Reading! Edge-Cut Partitioners; Vertex-Cut Partitioners; Hybrid Partitioner; post-processed by the parallel version of E2H (ParE2H); post-processed by the parallel version of V2H (ParV2H)
  • 41. Experiments - Reading! Edge-Cut Partitioners; Vertex-Cut Partitioners; Hybrid Partitioners; post-processed partitions
  • 42. Experiments - Application Speedup of CN (figure; x-axis: number of partitions; annotations: better, worse)
  • 43. Experiments - Application Speedup of TC (figure; annotations: better, worse; groups: vertex-cut, edge-cut based)
  • 44. Experiments - Application Speedup of PR (figure; annotations: better, worse; groups: vertex-cut, edge-cut based)
  • 45. Experiments - Scalability. On average: - ParE2H takes ~12% total run-time - ParV2H takes: 1. 0.1% HNE run-time 2. ~23% HGrid run-time. (Figure; x-axis: number of partitions; G = synthetic graph; CN algorithm)
  • 46. Experiments - Scalability. On average: - ParE2H takes ~12% total run-time - ParV2H takes: 1. 0.1% HNE run-time 2. ~23% HGrid run-time. (Figure; x-axis: number of partitions; G = synthetic graph; CN algorithm)
  • 47. Experiments - Impact of different phases. EMigrate: CN(~68%), TC(~26%), PR(75%). ESplit: CN(1.1 times), TC(2.7 times). MAssign: CN, TC, PR ~ (20%-30%)
  • 48. Experiments - Efficiency