Graphs
 

Paolo Castagna talks about Graphs on Hadoop

Graphs: Presentation Transcript

  • Graph Algorithms and MapReduce Paolo Castagna The words and opinions expressed here are my own, and do not, in any way, represent the views of my employer.
  • Why graphs ?
  • I am an infornographer ! see: http://en.wikipedia.org/wiki/Infornography
  • ... addicted to RDF
  • RDF is (just) a directed labeled multigraph
  • RDF is (just) a directed labeled multigraph URI2 URI1 URI3 URI4
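To make the "directed labeled multigraph" reading concrete, here is a minimal sketch that stores triples as labeled edges. The node names URI1 to URI4 come from the slide; the edge labels `p` and `q` and the `adjacency` helper are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Each RDF triple (subject, predicate, object) is one directed edge
# subject -> object labeled with the predicate.
# The labels "p" and "q" are made up for this sketch.
triples = [
    ("URI1", "p", "URI2"),
    ("URI1", "q", "URI2"),  # second edge between the same nodes: a multigraph
    ("URI1", "p", "URI3"),
    ("URI3", "p", "URI4"),
]


def adjacency(triples):
    """Group outgoing (label, target) pairs by source node."""
    adj = defaultdict(list)
    for subject, predicate, obj in triples:
        adj[subject].append((predicate, obj))
    return dict(adj)


graph = adjacency(triples)
```

Keeping the label on every edge is what allows two distinct edges between URI1 and URI2 to coexist.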
  • RDF processing
  • RDF parallel processing
  • MapReduce ?
  • “ ... almost no descriptions of graph algorithms appear in the literature, with the exception of a simplified PageRank calculation and a naive implementation of finding distances from a specified node. ” Graph Twiddling in a MapReduce World, Jonathan Cohen
  • RDF processing: inference [1]
      (?x p ?y) (?y q r) -> (?x rdf:type t)
      (?x p ?y) (?y p ?z) -> (?x p ?z)
      [1] using a rule engine with forward rules only and a total materialization strategy
  • Transitive closure
  • MapReduce ?
  • Transitive closure: map
      input (adjacency lists): 1: 4, 6, 7 / 2: 5 / 3: 2, 4, 7 / 4: 1, 3, 6 / 5: 2, 3 / 6: 5 / 7: 3, 5
      map output for node 1: (1, >4) (1, >6) (1, >7) (4, <1) (6, <1) (7, <1)
  • Transitive closure: reduce
      reduce input for key 6: (6, <1) (6, <4) (6, >5)
      reduce output (new edges): 1: 5 and 4: 5
  • Transitive closure WARNINGS:
      - Thinking in progress !
      - Not implemented (yet) !
      - Stop when no new edges are found
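The slides mark this scheme as not implemented, so the following is only one possible reading of the map and reduce steps, simulated in plain Python rather than on Hadoop. The function names and the in-memory `shuffle` are assumptions of this sketch. The map emits each edge twice, once under its source (`>`) and once under its target (`<`). The reduce joins every incoming neighbour of a key with every outgoing one. The loop stops when no new edges are found.

```python
from collections import defaultdict

# Adjacency lists copied from the slide.
SLIDE_GRAPH = {1: [4, 6, 7], 2: [5], 3: [2, 4, 7], 4: [1, 3, 6],
               5: [2, 3], 6: [5], 7: [3, 5]}
SLIDE_EDGES = {(u, v) for u, vs in SLIDE_GRAPH.items() for v in vs}


def map_phase(edges):
    """Emit every edge u -> v under both endpoints."""
    for u, v in edges:
        yield u, (">", v)  # u has an outgoing edge to v
        yield v, ("<", u)  # v has an incoming edge from u


def shuffle(pairs):
    """Stand-in for the framework's shuffle: group values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Join incoming with outgoing neighbours: i -> key -> o gives i -> o."""
    incoming = [n for direction, n in values if direction == "<"]
    outgoing = [n for direction, n in values if direction == ">"]
    for i in incoming:
        for o in outgoing:
            yield i, o


def transitive_closure(edges):
    """Repeat map/shuffle/reduce until a round adds no new edge."""
    closure = set(edges)
    while True:
        groups = shuffle(map_phase(closure))
        found = {e for key, values in groups.items()
                 for e in reduce_phase(key, values)}
        if found <= closure:
            return closure
        closure |= found
```

Each round extends paths through one more intermediate node per join, so the number of rounds depends on the longest shortest path in the graph. Nodes that lie on a cycle end up with an edge to themselves.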
  • Transitive reduction
  • MapReduce ?
  • PageRank: lessons learned
  • #1 represent the graph as adjacency lists
  • #2 moving the graph around at each iteration is not ideal
  • #3 to communicate with all the vertices, use the configuration parameters of a subsequent MapReduce job
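The three lessons can be illustrated with a small in-memory simulation of one PageRank job per iteration. This is a sketch under stated assumptions, not the author's code: the damping factor of 0.85, the function names, and the `config` dictionary (standing in for a Hadoop job configuration) are all hypothetical, and every node is assumed to appear as a key of the adjacency-list dictionary.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed value, not taken from the slides


def pagerank_map(node, rank, neighbours):
    # Lesson 2: the adjacency list is re-emitted so the next iteration
    # still has the graph, which means shipping it around every time.
    yield node, ("graph", neighbours)
    for n in neighbours:
        yield n, ("rank", rank / len(neighbours))


def pagerank_reduce(node, values, config):
    neighbours, received = [], 0.0
    for kind, payload in values:
        if kind == "graph":
            neighbours = payload
        else:
            received += payload
    n = config["num_nodes"]
    # Lesson 3: global values (node count, rank held by nodes with no
    # out-links) reach every vertex through the job's configuration.
    share = config["dangling_mass"] / n
    rank = (1 - DAMPING) / n + DAMPING * (received + share)
    return node, rank, neighbours


def run_job(graph, ranks, config):
    """Simulate one map/shuffle/reduce pass over the whole graph."""
    groups = defaultdict(list)
    for node, neighbours in graph.items():
        for key, value in pagerank_map(node, ranks[node], neighbours):
            groups[key].append(value)
    return {node: pagerank_reduce(node, values, config)[1]
            for node, values in groups.items()}


def pagerank(graph, iterations=20):
    """graph: adjacency lists (lesson 1), e.g. {"a": ["b"], "b": []}."""
    n = len(graph)
    ranks = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        dangling = sum(ranks[v] for v, out in graph.items() if not out)
        config = {"num_nodes": n, "dangling_mass": dangling}
        ranks = run_job(graph, ranks, config)
    return ranks
```

On a real cluster the dangling mass would have to be collected by the previous job (for example with a counter) before being placed in the next job's configuration; here it is simply computed in the driver loop.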
  • “ Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code... ” Official Google Research Blog, Grzegorz Czajkowski
  • Apache Hamburg ?
  • Graph algorithms
      - Graph search: Depth First Search, Breadth First Search
      - Directed (acyclic) graphs: Reachability and Transitive Closure, Topological Sorting
      - Minimum Spanning Tree
      - Shortest Paths
      - Network Flow
      - ...
  • Apache Commons Graph (dormant)