Boosting Vertex-Cut Partitioning for
Streaming Graphs
Hooman Peiro Sajjad*, Amir H. Payberah†, Fatemeh Rahimian†, Vladimir Vlassov*, Seif Haridi†
* KTH Royal Institute of Technology † SICS Swedish ICT
5th IEEE International Congress on Big Data
Introduction
Graph Partitioning
Partition large graphs for applications such as:
• Complexity reduction, parallelization and distributed graph analysis
[Figure: a graph divided into four partitions, P1 to P4]
Partitioning Models
[Figure: an example graph split into partitions P1 and P2, shown for two partitioning models]
Vertex-cut partitioning is more efficient for power-law graphs
A Good Vertex-Cut Partitioning
• Low replication factor
• Balanced partitions with respect to the number of edges
Streaming Graph Partitioning
• Graph elements are assigned to partitions as they are being streamed
• No global knowledge
[Figure: streaming edges enter a partitioner, which assigns them to partitions P1, P2, ..., Pp]
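The streaming setting can be sketched as follows. This is a minimal illustration, not the method from the slides: the function name `stream_partition` and the least-loaded placement rule are assumptions chosen only to show that each edge is placed on arrival, using nothing but the state accumulated so far.

```python
from typing import Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]


def stream_partition(edges: Iterable[Edge], num_partitions: int) -> Iterator[Tuple[Edge, int]]:
    """Assign each edge to a partition the moment it arrives (hypothetical helper)."""
    sizes: List[int] = [0] * num_partitions  # only state: edges placed so far
    for edge in edges:
        # Placeholder rule (an assumption): pick the least-loaded partition,
        # lowest index on ties. No knowledge of future edges is used.
        p = min(range(num_partitions), key=lambda i: sizes[i])
        sizes[p] += 1
        yield edge, p


assignments = list(stream_partition([(1, 2), (2, 3), (3, 1), (1, 4)], num_partitions=2))
```

Because the partitioner is a generator, it never needs the whole graph in memory; a real policy would replace the placeholder rule with a score such as the ones discussed later in the deck.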
State-of-the-Art Partitioners
• Centralized partitioner: slow partitioning time, low replication factor
  • Single thread partitioner
  • Multi-threaded partitioner: each thread partitions a subset of the graph and shares the state information
• Distributed partitioner: fast partitioning time, high replication factor
  • Oblivious partitioners: several independent partitioners
Partitioning Time vs. Partition Quality
[Figure: partitioning time plotted against partition quality. The centralized partitioner (slow partitioning time, low replication factor) and the distributed partitioner (fast partitioning time, high replication factor) sit at opposite ends; HoVerCut fills the spot first marked "?".]
HoVerCut Framework
HoVerCut ...
• Streaming Vertex-Cut partitioner
• Horizontally and Vertically scalable
• Scales without degrading the quality of partitions
• Employs different partitioning policies
Architecture Overview
[Figure: subpartitioners 1 to n, each with its own edge stream, tumbling window, partitioning policy, core and local state, all communicating asynchronously with one shared state]
Architecture: Input
• Input graphs are streamed by their edges
• Each subpartitioner receives an exclusive subset of the edges
Architecture: Configurable Window
Subpartitioners collect a number of incoming edges in a window of a certain size.
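A tumbling window over the edge stream can be sketched as below. The name `tumbling_windows` is hypothetical, and flushing a final, partially filled window at the end of the stream is an assumption; the slides only say that edges are collected in a window of a configurable size.

```python
from typing import Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]


def tumbling_windows(edges: Iterable[Edge], window_size: int) -> Iterator[List[Edge]]:
    """Group a stream of edges into consecutive, non-overlapping windows."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    window: List[Edge] = []
    for edge in edges:
        window.append(edge)
        if len(window) == window_size:
            yield window
            window = []  # start a fresh window; windows never overlap
    if window:
        yield window  # assumption: flush leftover edges when the stream ends


windows = list(tumbling_windows([(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)], window_size=2))
```

With a window size of 1 every edge is handled on its own; larger windows let a subpartitioner touch the shared state once per batch rather than once per edge.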
Architecture: Partitioning Policy
Each subpartitioner assigns the edges to the partitions based on a given policy
Architecture: Local State
Each subpartitioner has a local state, which includes information about the edges processed locally:
• partial degree
• partitions of each vertex
• num. edges in each partition
Architecture: Shared State
Shared-state is the global state accessible by all subpartitioners.
It is read and updated through getState and putState.

Vertex Table
ID   Partial Degree   partitions
v1   12               p1
v2   50               p1,p2

Partition Table
ID   Num. of edges
p1   5000
p2   6500
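The two tables of the shared state could be modelled roughly as follows. Only the operation names (mirroring putState and getState) and the table columns come from the slides; the class layout, the method signatures, the lock and the 0-based partition indices are assumptions made for this sketch.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class VertexEntry:
    degree: int = 0                                      # partial degree seen so far
    partitions: Set[int] = field(default_factory=set)    # partitions holding a replica


class SharedState:
    """Hypothetical in-memory shared state: a vertex table and a partition table."""

    def __init__(self, num_partitions: int) -> None:
        self._lock = threading.Lock()
        self._vertices: Dict[str, VertexEntry] = {}
        self._sizes: List[int] = [0] * num_partitions

    def get_state(self, vids: Iterable[str]) -> Tuple[Dict[str, VertexEntry], List[int]]:
        """Return copies of the vertex rows for vids and of the partition table."""
        with self._lock:
            vt = {}
            for v in vids:
                entry = self._vertices.get(v, VertexEntry())
                vt[v] = VertexEntry(entry.degree, set(entry.partitions))
            return vt, list(self._sizes)

    def put_state(self, vertex_deltas: Dict[str, VertexEntry], size_deltas: Dict[int, int]) -> None:
        """Merge deltas: add degrees, union partition sets, add edge counts."""
        with self._lock:
            for v, delta in vertex_deltas.items():
                entry = self._vertices.setdefault(v, VertexEntry())
                entry.degree += delta.degree
                entry.partitions |= delta.partitions
            for p, added in size_deltas.items():
                self._sizes[p] += added


state = SharedState(num_partitions=2)
# Reproduce the example tables, with p1 and p2 mapped to indices 0 and 1.
state.put_state({"v1": VertexEntry(12, {0}), "v2": VertexEntry(50, {0, 1})}, {0: 5000, 1: 6500})
vt, pt = state.get_state(["v1", "v3"])
```

Returning copies from `get_state` keeps a subpartitioner's local working set separate from the shared tables, so local changes only become visible to others once they are sent back as deltas.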
Architecture: Core
The core is HoVerCut’s main algorithm, parametrised with the partitioning policy and the window size.
Vertex-Cut Partitioning Heuristics
For an edge with end-vertices u and v and for every partition p:
Score = ReplicationScore + LoadBalanceScore
Choose the partition that maximizes the Score.
State-of-the-Art Heuristics:
• Greedy
• HDRF
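The scoring rule can be sketched as a small selection function. The slides only give Score = ReplicationScore + LoadBalanceScore; the function name `choose_partition` and the two example score functions below (a replica count and a normalised free-capacity term) are assumptions used for illustration.

```python
from typing import Callable, List, Set


def choose_partition(
    num_partitions: int,
    replication_score: Callable[[int], float],
    load_balance_score: Callable[[int], float],
) -> int:
    """Return the partition with the highest combined score (first one on ties)."""
    return max(
        range(num_partitions),
        key=lambda p: replication_score(p) + load_balance_score(p),
    )


# Example state for one edge (u, v): u already has a replica in partition 0.
parts_u: Set[int] = {0}
parts_v: Set[int] = set()
sizes: List[int] = [10, 2]  # edges currently held by each partition


def replication(p: int) -> float:
    # Illustrative: one point per end-vertex already replicated in p.
    return float(p in parts_u) + float(p in parts_v)


def balance(p: int) -> float:
    # Illustrative: emptier partitions score higher, scaled to stay below 1.
    return (max(sizes) - sizes[p]) / (1 + max(sizes) - min(sizes))


best = choose_partition(len(sizes), replication, balance)
```

Here partition 0 scores 1.0 (one existing replica, no balance bonus) and partition 1 scores 8/9, so the edge goes to partition 0; a heavier weight on the balance term would flip the decision.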
Greedy vs. HDRF
Greedy: places end-vertices u and v of an edge in a partition that already has a replica of u or v.
HDRF (High Degree Replicated First): replicates the higher degree end-vertex.
[Figure: an edge (u, v) placed into partitions P1 and P2, once by Greedy and once by HDRF]
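The difference between the two heuristics shows up in their replication scores. The slides give no formulas, so the sketch below is an assumption: Greedy counts the end-vertices already present in a partition, and HDRF follows the commonly cited formulation in which each present end-vertex contributes 1 + (1 - θ), with θ being that vertex's share of the two partial degrees. Function names are hypothetical.

```python
from typing import Set


def greedy_replication_score(p: int, parts_u: Set[int], parts_v: Set[int]) -> float:
    """One point for each end-vertex that already has a replica in p."""
    return float(p in parts_u) + float(p in parts_v)


def hdrf_replication_score(
    p: int, parts_u: Set[int], parts_v: Set[int], deg_u: int, deg_v: int
) -> float:
    """Like Greedy, but a present low-degree end-vertex is worth more.

    Assumes partial degrees were already incremented for this edge,
    so deg_u + deg_v is positive.
    """
    total = deg_u + deg_v
    if total <= 0:
        raise ValueError("partial degrees must sum to a positive number")
    theta_u = deg_u / total
    theta_v = deg_v / total
    score = 0.0
    if p in parts_u:
        score += 1.0 + (1.0 - theta_u)
    if p in parts_v:
        score += 1.0 + (1.0 - theta_v)
    return score


# u is a high-degree vertex replicated in partition 0,
# v is a low-degree vertex replicated in partition 1.
parts_u, parts_v = {0}, {1}
deg_u, deg_v = 10, 2

greedy = [greedy_replication_score(p, parts_u, parts_v) for p in (0, 1)]
hdrf = [hdrf_replication_score(p, parts_u, parts_v, deg_u, deg_v) for p in (0, 1)]
```

Greedy sees a tie between the two partitions, while HDRF favours partition 1, where the low-degree vertex v lives; placing the edge there creates a new replica of the high-degree vertex u, which is the "high degree replicated first" behaviour.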
Partitioning a Window of Edges
vids: the set of vertex ids in the current window
edges: the set of edges in the current window
pt = get the partition table
vt = get the vertex table restricted to vids
for each e ∊ edges:
    u = e.src, v = e.dst
    increment vt(u).degree and vt(v).degree
    given a partition policy: select p based on vt(u), vt(v) and pt
    add p to vt(u).partitions and vt(v).partitions
    increment pt(p).size
end
update the shared state by sending vt, pt represented as deltas

Example deltas:

vt
ID   Degree   partitions
v1   +4       +p1
v2   +2       +p2

pt
ID   size
p1   +3
p2   +1
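The window procedure can be turned into a short runnable sketch. The data layout (plain dictionaries and a list), the function names and the `least_loaded` policy are assumptions; the loop itself follows the steps listed on the slide and returns the deltas that would be sent to the shared state.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str]
Policy = Callable[[str, str, Dict[str, int], Dict[str, Set[int]], List[int]], int]


def partition_window(
    edges: List[Edge],
    degrees: Dict[str, int],         # vt: partial degree per vertex (local copy)
    replicas: Dict[str, Set[int]],   # vt: partitions per vertex (local copy)
    sizes: List[int],                # pt: number of edges per partition (local copy)
    policy: Policy,
):
    """Assign every edge of one window and collect the resulting deltas."""
    deg_delta: Dict[str, int] = {}
    rep_delta: Dict[str, Set[int]] = {}
    size_delta: Dict[int, int] = {}
    assignments: List[Tuple[Edge, int]] = []
    for u, v in edges:
        for x in (u, v):  # increment vt(u).degree and vt(v).degree
            degrees[x] = degrees.get(x, 0) + 1
            deg_delta[x] = deg_delta.get(x, 0) + 1
        p = policy(u, v, degrees, replicas, sizes)  # select p from vt(u), vt(v), pt
        for x in (u, v):  # add p to vt(u).partitions and vt(v).partitions
            if p not in replicas.setdefault(x, set()):
                replicas[x].add(p)
                rep_delta.setdefault(x, set()).add(p)
        sizes[p] += 1  # increment pt(p).size
        size_delta[p] = size_delta.get(p, 0) + 1
        assignments.append(((u, v), p))
    return assignments, deg_delta, rep_delta, size_delta


def least_loaded(u, v, degrees, replicas, sizes) -> int:
    # Placeholder policy (an assumption): ignore replicas, balance load only.
    return min(range(len(sizes)), key=lambda p: sizes[p])


assignments, deg_delta, rep_delta, size_delta = partition_window(
    [("v1", "v2"), ("v1", "v3")], degrees={}, replicas={}, sizes=[0, 0], policy=least_loaded
)
```

Only the three delta dictionaries need to travel to the shared state, which keeps the update small compared with sending the full tables.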
Evaluation
Datasets
Dataset                            |V|    |E|
Autonomous systems (AS)            1.7M   11M
Pokec social network (PSN)         1.6M   22M
LiveJournal social network (LSN)   4.8M   48M
Orkut social network (OSN)         3.1M   117M
Partitions: 16
Evaluation Metrics
• Replication Factor (RF): the average number of replicas per vertex
• Load Relative Standard Deviation (LRSD): the relative standard deviation of the number of edges in each partition (LRSD=0 indicates equal size partitions)
• Partitioning time: the time it takes to partition a graph
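The two quality metrics can be computed as sketched below. The function names are hypothetical, and two details are assumptions because the slides do not state them: LRSD is reported as a percentage, and the population standard deviation is used.

```python
from statistics import mean, pstdev
from typing import Dict, List, Set


def replication_factor(replicas: Dict[str, Set[int]]) -> float:
    """Average number of partitions that hold a replica of a vertex."""
    return mean(len(parts) for parts in replicas.values())


def load_rsd(sizes: List[int]) -> float:
    """Relative standard deviation of partition sizes, in percent.

    Assumes at least one edge has been placed, so the mean is non-zero.
    """
    return 100.0 * pstdev(sizes) / mean(sizes)


# Values taken from the example vertex and partition tables shown earlier.
rf = replication_factor({"v1": {0}, "v2": {0, 1}})
lrsd = load_rsd([5000, 6500])
```

For the example tables this gives a replication factor of 1.5 and an LRSD of about 13%; perfectly balanced partitions yield an LRSD of 0.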
One Host: Summary
HoVerCut’s configuration:
Subpartitioners (threads) = 32
Window size = 32
Distributed Configuration
[Figure: results for AS (|V|=1.7M, |E|=11M) and OSN (|V|=3.1M, |E|=117M)]
Conclusion
• We presented HoVerCut, a parallel and distributed partitioner
• We can employ different partitioning policies in a scalable fashion
• We can scale HoVerCut to partition larger graphs without degrading the quality of partitions
Thank You!
