This document presents an experimental study of streaming graph partitioning algorithms. Its goals are to evaluate the benefits of more complex partitioning methods, to measure the partitioning overhead for an application, and to determine how partitioning quality affects application performance. Experiments were run on several datasets and on two platforms: an on-premises cluster and a virtualized environment at Amazon. The results show that hash partitioning achieves the highest throughput but produces more edge cuts than the other algorithms. Most methods achieved good load balance. Application performance depended on the partitioning, and iterative and streaming applications responded to it differently. Open questions remain around minimizing state requirements, using existing algorithms for continuous stream processing, and relaxing assumptions about the input graph.
VLDB 2018 presentation paper title: Streaming Graph Partitioning
1. STREAMING GRAPH PARTITIONING: An Experimental Study
Zainab Abbas (1), Vasiliki Kalavri (2), Paris Carbone (1), and Vladimir Vlassov (1)
(1) KTH Royal Institute of Technology, Stockholm, Sweden
(2) ETH Zurich, Switzerland
19. Contributions
• A unified comparison framework on Apache Flink
• Classification and experimental comparison
• Performance of partitioning algorithms
• Effect of algorithms on applications
28. Goals
Our work aims at identifying:
• 1) the benefits of using more complex partitioning methods
• 2) the partitioning overhead for an application
• 3) the effect of partitioning quality on the application performance.
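As a rough illustration of what "partitioning quality" can mean here, the sketch below hash-partitions the vertices of an edge stream and computes two common quality metrics: the fraction of cut edges and the load imbalance. It is a minimal sketch and not the framework used in the study; the function names, the CRC32 hash, and the ring-graph input are assumptions chosen for the example.

```python
import zlib
from collections import Counter


def hash_partition(vertex, k):
    """Map a vertex to one of k partitions with a stable (non-randomized) hash."""
    return zlib.crc32(str(vertex).encode("utf-8")) % k


def partition_stream(edges, k):
    """Read the edge stream once and place each vertex when it is first seen."""
    assignment = {}
    for u, v in edges:
        for x in (u, v):
            if x not in assignment:
                assignment[x] = hash_partition(x, k)
    return assignment


def edge_cut_fraction(edges, assignment):
    """Fraction of edges whose endpoints are in different partitions."""
    if not edges:
        return 0.0
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    return cut / len(edges)


def load_imbalance(assignment, k):
    """Largest partition size divided by the average size (1.0 is perfect)."""
    if not assignment:
        return 1.0
    loads = Counter(assignment.values())
    return max(loads.values()) / (len(assignment) / k)


# Hypothetical toy input: a ring of 8 vertices split into 2 partitions.
edges = [(i, (i + 1) % 8) for i in range(8)]
assignment = partition_stream(edges, 2)
print("edge-cut fraction:", edge_cut_fraction(edges, assignment))
print("load imbalance:", load_imbalance(assignment, 2))
```

Hash partitioning keeps no state beyond the hash function itself, which is consistent with its high throughput, but it ignores graph structure when placing vertices.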
29. Datasets and Setup
• On-premises cluster
• A virtualized environment at Amazon consisting of 17x r3.2xlarge EC2 instances
45. Conclusion
The trade-off between balancing and reducing cuts remains
Streaming and iterative applications behave differently
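The trade-off between balancing and reducing cuts can be seen on a toy graph. The sketch below is illustrative only and is not taken from the study: the ring graph, the three placement strategies, and all names are assumptions. It compares the edge-cut fraction and load imbalance of each strategy.

```python
from collections import Counter


def edge_cut_fraction(edges, assignment):
    """Fraction of edges whose endpoints are in different partitions."""
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    return cut / len(edges)


def load_imbalance(assignment, k):
    """Largest partition size divided by the average size (1.0 is perfect)."""
    loads = Counter(assignment.values())
    return max(loads.values()) / (len(assignment) / k)


n, k = 8, 2
ring = [(i, (i + 1) % n) for i in range(n)]  # hypothetical toy graph
vertices = range(n)

strategies = {
    # No cuts at all, but one partition holds every vertex.
    "single partition": {v: 0 for v in vertices},
    # Perfectly balanced, but every ring edge crosses partitions.
    "modulo (hash-like)": {v: v % k for v in vertices},
    # Balanced with few cuts, but relies on knowing the structure up front.
    "contiguous ranges": {v: v // (n // k) for v in vertices},
}

results = {
    name: (edge_cut_fraction(ring, a), load_imbalance(a, k))
    for name, a in strategies.items()
}
for name, (cut, imbalance) in results.items():
    print(f"{name:20s} cut={cut:.2f} imbalance={imbalance:.2f}")
```

On this input, the single-partition placement has no cuts but an imbalance of 2.0, while the modulo placement is perfectly balanced but cuts every edge. The contiguous placement does well on both only because it uses knowledge of the ring's layout, which a streaming partitioner usually does not have when edges arrive.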
46. Open Questions
• Can we design partitioning algorithms with minimal state requirements for modern stream processing engines?
• Can we adopt existing partitioning algorithms for continuous stream processing?
• Can we design algorithms with fewer constraints or assumptions about the input graph?
47. THANKS EVERYONE :)
Zainab Abbas
zainabab@kth.se
Vasiliki Kalavri
kalavriv@inf.ethz.ch
Paris Carbone
parisc@kth.se
Vladimir Vlassov
vladv@kth.se