This document presents a vertex-centric asynchronous belief propagation algorithm for large-scale graphs. It proposes running belief propagation in a single computational node by processing vertices in parallel through an asynchronous vertex-centric approach. The algorithm is shown to converge faster than previous approaches and can scale to larger graphs by utilizing multicore architectures. Experiments on graphs with millions of edges demonstrate that the algorithm achieves similar accuracy to previous methods but with significantly better runtime performance, making it suitable for inference on very large real-world graphs.
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
1. Introduction Belief Propagation Algorithm Methodology and Experiments Conclusions
Gabriel Gimenes
Hugo Gualdron
Jose F. Rodrigues-Jr
Instituto de Ciencias Matematicas e de Computacao
University of Sao Paulo - Sao Carlos
DamNet - 2016 ICDM Workshop, Barcelona, Spain
This work has financial support from FAPESP (grant 2014/25337-0)
Context
Ubiquitous data generation
Information availability: pros and cons
Web 2.0: users produce data rather than only consuming it
Relationships between elements
Facebook, Twitter, Amazon, GooglePlay, Email
Intuitive modelling: Graphs (Networks)
Problem
Analyzing large-scale networks – efficient and powerful
Some graphs (e.g., YahooWeb and Twitter) may not fit in memory
Naive processing: prohibitive
Alternative: distributed processing
complexity, infrastructure, cost
How to process in a single computational node?
Rationale
New approaches: taking advantage of multi-core architectures
Centralized → Decentralized
Vertex-centric processing techniques
Block-based processing
Asynchronous processing
Proposals: TurboGraph, GraphChi, X-Stream, MMap,
M-Flash, FlashGraph; Pregel, GraphLab, Giraph.
Vertex-centric paradigm
Vertex-centric model
procedure Graph scan(Graph G)
    for i = 1 to |V| do
        set_e ← set of edges adjacent to V[i]
        V[i].value ← f(set_e)
        for each edge e in set_e do
            e.value ← g(V[i].value, e.value)
Outer loop
procedure Graph processing
    while convergence criterion is not satisfied do
        Graph scan(G)
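The scan and outer loop above can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the names `graph_scan`, `graph_processing`, and `min_label_components` are hypothetical, and the choice of f = g = min (propagating the smallest vertex id through each connected component) is an assumed example workload. Because edge values are updated in place, later vertices in the same scan already see the new values, which is the asynchronous behaviour the model relies on.

```python
def graph_scan(vertex_values, edge_values, adjacency, f, g):
    """One pass over all vertices; returns True if any value changed."""
    changed = False
    for v in range(len(vertex_values)):
        edge_ids = adjacency[v]          # set_e: edges adjacent to V[v]
        if not edge_ids:
            continue                     # isolated vertex: nothing to read
        new_value = f([edge_values[e] for e in edge_ids])
        if new_value != vertex_values[v]:
            vertex_values[v] = new_value
            changed = True
        for e in edge_ids:
            updated = g(vertex_values[v], edge_values[e])
            if updated != edge_values[e]:
                edge_values[e] = updated  # visible to later vertices at once
                changed = True
    return changed


def graph_processing(vertex_values, edge_values, adjacency, f, g, max_scans=100):
    """Outer loop: scan until nothing changes (assumed convergence criterion)."""
    scans = 0
    while scans < max_scans:
        scans += 1
        if not graph_scan(vertex_values, edge_values, adjacency, f, g):
            break
    return scans


def min_label_components(num_vertices, edges):
    """Example workload: label each vertex with the smallest id in its component."""
    adjacency = [[] for _ in range(num_vertices)]
    edge_values = []
    for edge_id, (u, v) in enumerate(edges):
        adjacency[u].append(edge_id)
        adjacency[v].append(edge_id)
        edge_values.append(min(u, v))    # each edge starts with its smaller endpoint
    vertex_values = list(range(num_vertices))
    graph_processing(vertex_values, edge_values, adjacency, f=min, g=min)
    return vertex_values


labels = min_label_components(5, [(0, 1), (1, 2), (3, 4)])
```

Here vertices 0, 1, 2 form one component and 3, 4 another, so each vertex ends up carrying the smallest id of its own component.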
Algorithm
Belief propagation: a Bayesian inference method
Estimating the marginal probability distribution for
non-annotated nodes
Message passing: information travels from annotated to
unannotated nodes
Guilt-by-association, or "birds of a feather flock together"
Heterophily vs Homophily
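The difference between homophily and heterophily can be illustrated with a class-to-class coupling matrix, where entry (i, j) says how strongly a neighbor of class i suggests class j. The sketch below is an assumption-laden example: the numeric values are illustrative only, and the helper `center` (subtracting 1/k so each row sums to zero, as linearized formulations typically expect) is a hypothetical name rather than anything defined in the slides.

```python
def center(matrix):
    """Subtract 1/k from every entry so that each row sums to zero."""
    k = len(matrix)
    return [[entry - 1.0 / k for entry in row] for row in matrix]


# Illustrative values only (not taken from the slides).
homophily = [[0.8, 0.2],      # neighbors tend to share the same class
             [0.2, 0.8]]
heterophily = [[0.2, 0.8],    # neighbors tend to have opposite classes
               [0.8, 0.2]]

homophily_residual = center(homophily)
heterophily_residual = center(heterophily)
```

In the centered form, a positive diagonal signals homophily and a negative diagonal signals heterophily.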
Problem
Original algorithm proposed for trees - no loops
Loopy BP (Murphy et al.): generalization of the algorithm to graphs with loops
Problems with convergence and performance
Early applications in stereo-imaging and facial reconstruction
Evolution
Performance and scalability: distributed processing
Gonzalez et al. – distributed inefficiencies
Kang et al. – algorithm relevance for anti-malware and fraud
detection applications
Gatterbauer et al. – linear approximation, convergence
guarantees and better performance
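For reference, the linear approximation of Gatterbauer et al. (LinBP) is commonly written in matrix form as shown here. The slides do not define these symbols, so the reading below is an assumption: \hat{B} holds the centered beliefs of all vertices, \hat{E} the centered explicit beliefs of annotated vertices, A is the weighted adjacency matrix, D the diagonal matrix of squared-weight degrees, and \hat{H} the scaled, centered coupling matrix.

```latex
\hat{B} \;=\; \hat{E} \;+\; A\,\hat{B}\,\hat{H} \;-\; D\,\hat{B}\,\hat{H}^{2}
```

The three terms correspond to explicit beliefs, propagation from neighbors, and echo cancellation, respectively.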
Proposal and contributions
Algorithm: change of paradigm, asynchronous parallel
vertex-centric processing
Convergence: better convergence speed (number of iterations)
Scalability: commodity computer
Our algorithm
VC-LinBP
procedure VC-LinBP(G(V, E), VExplicit, H, h, t)
    set H = h · H
    set H2 = H^2
    repeat
        for each vertex in V do
            Update(vertex)
    until t iterations or convergence achieved
Our algorithm
Update
procedure Update(vertex)
    set degree = 0
    for each class c in vertex do                ▷ initializing vertex values for each class
        vertex.value(c) = 0
    for each incoming edge e to vertex do        ▷ processing incoming messages
        degree += e.weight^2
        for each class c_from do
            for each class c_to do
                vertex.value(c_to) += e.weight * e.value(c_from) * H(c_from, c_to)
    if vertex is not explicit then               ▷ echo cancellation of messages
        for each class c_from do
            for each class c_to do
                vertex.value(c_to) -= degree * vertex.value(c_from) * H2(c_from, c_to)
    else                                         ▷ adding explicit value of the vertex
        for each class c do
            vertex.value(c) += VExplicit(vertex)(c)
    for each outgoing edge e from vertex do      ▷ sending messages to neighbors
        for each class c do
            e.value(c) = vertex.value(c)
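The Update procedure and its driver loop can be sketched in plain Python as follows. This is a single-threaded illustration under stated assumptions, not the authors' GraphChi implementation: the function names, the edge-list layout (each undirected edge given as two directed `(src, dst, weight)` triples), and the `tol` convergence threshold are all hypothetical. One further assumption: the echo-cancellation term is computed from the accumulated value before any subtraction, whereas the pseudocode's in-place loop leaves that ordering open.

```python
def vec_mat(vec, mat):
    """Row vector times matrix: result[j] = sum_i vec[i] * mat[i][j]."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]


def update(v, beliefs, messages, in_edges, out_edges, weights, explicit, H, H2):
    k = len(H)
    degree = 0.0
    value = [0.0] * k                       # initialize values for each class
    for e in in_edges[v]:                   # process incoming messages
        w = weights[e]
        degree += w * w
        incoming = vec_mat(messages[e], H)
        value = [value[c] + w * incoming[c] for c in range(k)]
    if v not in explicit:                   # echo cancellation (assumed snapshot)
        echo = vec_mat(value, H2)
        value = [value[c] - degree * echo[c] for c in range(k)]
    else:                                   # add the explicit value of the vertex
        value = [value[c] + explicit[v][c] for c in range(k)]
    beliefs[v] = value
    for e in out_edges[v]:                  # send messages to neighbors
        messages[e] = list(value)


def vc_linbp(num_vertices, edges, explicit, H, h, t, tol=1e-12):
    k = len(H)
    H = [[h * x for x in row] for row in H]     # H = h * H
    H2 = [vec_mat(row, H) for row in H]         # H2 = H * H
    in_edges = [[] for _ in range(num_vertices)]
    out_edges = [[] for _ in range(num_vertices)]
    weights = []
    for edge_id, (src, dst, w) in enumerate(edges):
        out_edges[src].append(edge_id)
        in_edges[dst].append(edge_id)
        weights.append(w)
    messages = [[0.0] * k for _ in edges]
    beliefs = [[0.0] * k for _ in range(num_vertices)]
    for _ in range(t):
        previous = [list(b) for b in beliefs]
        for v in range(num_vertices):           # asynchronous: messages updated in place
            update(v, beliefs, messages, in_edges, out_edges, weights, explicit, H, H2)
        delta = max(abs(beliefs[v][c] - previous[v][c])
                    for v in range(num_vertices) for c in range(k))
        if delta < tol:                         # assumed convergence criterion
            break
    return beliefs


# Path graph 0 - 1 - 2 with vertex 0 annotated; all values are illustrative.
H = [[0.2, -0.2], [-0.2, 0.2]]
edges = [(0, 1, 1.0), (1, 0, 1.0), (1, 2, 1.0), (2, 1, 1.0)]
beliefs = vc_linbp(3, edges, {0: [0.1, -0.1]}, H, h=0.5, t=50)
```

With a homophily-style coupling matrix, the annotated vertex's preference for the first class spreads along the path and weakens with distance.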
Experiments
Efficiency and efficacy
i7 CPU 8 cores, 16GB RAM, 240GB SSD
Comparison with LinBP
2 versions: single- and multi-threaded
Utilizing the GraphChi framework