Graph Algorithms and MapReduce
Paolo Castagna
The words and opinions expressed here are my own, and do not, in any way, represent the views of my employer.
Why graphs?
I am an infornographer!
see: http://en.wikipedia.org/wiki/Infornography
... addicted to RDF
RDF is (just) a directed
labeled multigraph
[Figure: example RDF graph with nodes URI1, URI2, URI3 and URI4]
RDF processing
RDF parallel processing
MapReduce ?
“ ... almost no descriptions of graph
algorithms appear in the literature,
with the exception of a simplified
PageRank calculation and a naive
implementation of finding distances
from a specified node. ”
Graph Twiddling in a MapReduce World, Jonathan Cohen
RDF processing
Inference [1]
(?x p ?y) (?y q r) -> (?x rdf:type t)
(?x p ?y) (?y p ?z) -> (?x p ?z)
[1] Using a rule engine with forward rules only and a total materialization strategy.
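A minimal sketch of how the first rule might be applied as a single reduce-side join on ?y. The function names (map_triple, reduce_join, apply_rule) are hypothetical, the triples are plain Python tuples, and the in-memory dictionary only stands in for the shuffle phase; this is an illustration under those assumptions, not the rule engine referred to in the note.

```python
from collections import defaultdict

def map_triple(triple):
    """Emit (join_key, tagged_value) pairs for the two body patterns of the rule."""
    s, p, o = triple
    if p == "p":                   # matches (?x p ?y): key on ?y, remember ?x
        yield o, ("left", s)
    if p == "q" and o == "r":      # matches (?y q r): key on ?y
        yield s, ("right", None)

def reduce_join(key, values):
    """For one ?y, emit (?x rdf:type t) for every ?x if (?y q r) was seen."""
    values = list(values)
    if any(tag == "right" for tag, _ in values):
        for tag, x in values:
            if tag == "left":
                yield (x, "rdf:type", "t")

def apply_rule(triples):
    groups = defaultdict(list)     # stands in for the shuffle phase
    for triple in triples:
        for key, value in map_triple(triple):
            groups[key].append(value)
    inferred = set()
    for key, values in groups.items():
        inferred.update(reduce_join(key, values))
    return inferred
```

With total materialization, the inferred triples would be added to the input and the job repeated until nothing new is produced.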
Transitive closure
WARNINGS:
- Thinking in progress!
- Not implemented (yet)!
- Stop when no new edges are found
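One way the iteration could look, as a sketch only: each round joins edges (x, y) and (y, z) on the shared node y, and the driver stops when a round finds no new edges. The names closure_round and transitive_closure are hypothetical, and the in-memory dictionaries stand in for the map and shuffle phases of a real job.

```python
from collections import defaultdict

def closure_round(edges):
    """One map/shuffle/reduce pass: join edges (x, y) and (y, z) on y."""
    incoming = defaultdict(set)    # y -> {x : (x, y) is an edge}
    outgoing = defaultdict(set)    # y -> {z : (y, z) is an edge}
    for x, y in edges:             # "map": key each edge by both endpoints
        incoming[y].add(x)
        outgoing[x].add(y)
    derived = set()
    for y in incoming.keys() & outgoing.keys():   # "reduce": one group per y
        for x in incoming[y]:
            for z in outgoing[y]:
                derived.add((x, z))
    return derived

def transitive_closure(edges):
    closure = set(edges)
    while True:
        new_edges = closure_round(closure) - closure
        if not new_edges:          # stop when no new edges are found
            return closure
        closure |= new_edges
```

Because each round joins the whole closure with itself, path lengths can double per round, so the number of rounds grows with the logarithm of the longest path rather than with its length.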
Transitive reduction
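For reference, a plain in-memory sketch of transitive reduction, assuming the graph is a directed acyclic graph (where the reduction is unique). It is not a MapReduce formulation; the names reachable_from and transitive_reduction are hypothetical.

```python
from collections import defaultdict

def reachable_from(start, successors):
    """Return every node reachable from start via one or more edges."""
    seen, stack = set(), list(successors.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(successors.get(node, ()))
    return seen

def transitive_reduction(edges):
    """Drop every edge (u, v) that is implied by a longer path from u to v."""
    successors = defaultdict(set)
    for u, v in edges:
        successors[u].add(v)
    reduced = set(edges)
    for u, v in edges:
        # (u, v) is redundant if another successor w of u already reaches v
        if any(v in reachable_from(w, successors)
               for w in successors[u] if w != v):
            reduced.discard((u, v))
    return reduced
```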
MapReduce ?
PageRank
Lessons learned
#1
adjacency list
#2
moving the graph around at
each iteration is not ideal
#3
to communicate with all the
vertices, use the configuration
parameters of a subsequent
MapReduce job
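The three lessons can be illustrated with a sketch of one simplified PageRank iteration. Everything here is an assumption made for illustration: the record layout (rank, adjacency list), the function names, and the plain config dictionary that stands in for job configuration parameters. It also assumes every node has its own record, including nodes with no outgoing edges.

```python
from collections import defaultdict

def map_node(node, record):
    """Emit the adjacency list itself plus a rank share for each neighbour."""
    rank, neighbours = record
    yield node, ("graph", neighbours)       # lesson #2: the structure travels too
    for target in neighbours:
        yield target, ("rank", rank / len(neighbours))

def reduce_node(node, values, config):
    """Rebuild the record; config carries the values every vertex must see."""
    neighbours, incoming = [], 0.0
    for tag, payload in values:
        if tag == "graph":
            neighbours = payload
        else:
            incoming += payload
    d, n = config["damping"], config["num_nodes"]
    rank = (1 - d) / n + d * (incoming + config["dangling_mass"] / n)
    return node, (rank, neighbours)

def pagerank_iteration(graph, damping=0.85):
    # Driver side: global values are computed first, then handed to the job
    # as parameters (lesson #3).
    config = {
        "damping": damping,
        "num_nodes": len(graph),
        "dangling_mass": sum(r for r, adj in graph.values() if not adj),
    }
    groups = defaultdict(list)              # stands in for the shuffle phase
    for node, record in graph.items():
        for key, value in map_node(node, record):
            groups[key].append(value)
    return dict(reduce_node(k, v, config) for k, v in groups.items())
```

The driver would call pagerank_iteration repeatedly, feeding each result back in, until the ranks stop changing by more than a chosen tolerance.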
“ Pregel computes over large graphs
much faster than alternatives, and the
application programming interface is
easy to use. Implementing PageRank,
for example, takes only about 15 lines of
code... ”
Official Google Research Blog, Grzegorz Czajkowski
Apache Hamburg?
Graph algorithms
Graph search
- Depth First Search
- Breadth First Search
Directed (acyclic) graphs
- Reachability and Transitive Closure
- Topological Sorting
Minimum Spanning Tree
Shortest Paths
Network Flow
...
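As an illustration of the first item in the list above, breadth-first search can be sketched as a sequence of map/reduce rounds that propagate hop counts from a source node until nothing changes. The names map_node, reduce_node and bfs_distances are hypothetical, the adjacency dictionary is assumed to contain a key for every node, and the in-memory grouping stands in for the shuffle phase.

```python
from collections import defaultdict
import math

def map_node(node, record):
    """Re-emit the node's own state and offer distance + 1 to each neighbour."""
    distance, neighbours = record
    yield node, ("graph", neighbours)
    yield node, ("dist", distance)
    if distance != math.inf:
        for target in neighbours:
            yield target, ("dist", distance + 1)

def reduce_node(node, values):
    """Keep the adjacency list and the smallest distance seen so far."""
    neighbours, best = [], math.inf
    for tag, payload in values:
        if tag == "graph":
            neighbours = payload
        else:
            best = min(best, payload)
    return node, (best, neighbours)

def bfs_distances(adjacency, source):
    state = {n: (0 if n == source else math.inf, adj)
             for n, adj in adjacency.items()}
    while True:
        groups = defaultdict(list)          # stands in for the shuffle phase
        for node, record in state.items():
            for key, value in map_node(node, record):
                groups[key].append(value)
        new_state = dict(reduce_node(k, v) for k, v in groups.items())
        if new_state == state:              # no distance changed: done
            return {n: d for n, (d, _) in new_state.items()}
        state = new_state
```

Unreachable nodes keep a distance of math.inf. The number of rounds is bounded by the longest shortest path from the source, plus one final round that detects the fixed point.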