Graphs

Paolo Castagna talks about Graphs on Hadoop

Published in: Technology

Transcript

  • 1. Graph Algorithms and MapReduce Paolo Castagna The words and opinions expressed here are my own, and do not, in any way, represent the views of my employer.
  • 2. Why graphs?
  • 3. I am an infornographer! See: http://en.wikipedia.org/wiki/Infornography
  • 4. ... addicted to RDF
  • 5. RDF is (just) a directed labeled multigraph
  • 6. RDF is (just) a directed labeled multigraph
  • 7. RDF is (just) a directed labeled multigraph URI2 URI1 URI3
  • 8. RDF is (just) a directed labeled multigraph URI2 URI1 URI3 URI4
  • 9. RDF processing
  • 10. RDF parallel processing
  • 11. MapReduce?
  • 12. “ ... almost no descriptions of graph algorithms appear in the literature, with the exception of a simplified PageRank calculation and a naive implementation of finding distances from a specified node. ” Graph Twiddling in a MapReduce World, Jonathan Cohen
  • 13. RDF processing: inference [1]
       (?x p ?y) (?y q r) -> (?x rdf:type t)
       (?x p ?y) (?y p ?z) -> (?x p ?z)
       [1] using a rule engine with forward rules only and a total materialization strategy
  • 14. Transitive closure
  • 15. Transitive closure
  • 16. MapReduce?
  • 17. Transitive closure
       Graph (adjacency list): 1: 4, 6, 7; 2: 5; 3: 2, 4, 7; 4: 1, 3, 6; 5: 2, 3; 6: 5; 7: 3, 5
       map input: 1: 4, 6, 7
       map output: (1, >4) (1, >6) (1, >7) (4, <1) (6, <1) (7, <1)
  • 18. Transitive closure
       Graph (adjacency list): 1: 4, 6, 7; 2: 5; 3: 2, 4, 7; 4: 1, 3, 6; 5: 2, 3; 6: 5; 7: 3, 5
       reduce input: (6, <1) (6, <4) (6, >5)
       reduce output: 1: 5; 4: 5
  • 19. Transitive closure. WARNINGS:
       - Thinking in progress!
       - Not implemented (yet)!
       - Stop when no new edges are found
  • 20. Transitive reduction
  • 21. Transitive reduction
  • 22. MapReduce?
  • 23. PageRank Lessons learned
  • 24. #1 adjacency list
  • 25. #2 moving the graph around at each iteration is not ideal
  • 26. #3 to communicate with all the vertices, use the configuration parameters of a subsequent MapReduce job
  • 27. “ Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code... ” Official Google Research Blog, Grzegorz Czajkowski
  • 28. “ Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code... ” Official Google Research Blog, Grzegorz Czajkowski
  • 29. Apache Hamburg?
  • 30. Graph algorithms
       - Graph search: Depth First Search, Breadth First Search
       - Directed (acyclic) graphs: Reachability and Transitive Closure, Topological Sorting
       - Minimum Spanning Tree
       - Shortest Paths
       - Network Flow
       - ...
  • 31. Apache Commons Graph (dormant)
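
The transitive closure idea on slides 17 to 19 can be sketched in a few lines. The slides themselves say it is "thinking in progress" and not implemented, so this is only one possible reading: the map step emits, for every edge u -> v, a pair keyed on u marked ">" (outgoing) and a pair keyed on v marked "<" (incoming); the reduce step joins the incoming and outgoing neighbours of each vertex into new two-hop edges; the job is repeated until no new edges are found. The function names (`map_edges`, `reduce_vertex`, `closure_step`, `transitive_closure`) are hypothetical, and the shuffle phase is simulated in memory with a dictionary rather than with Hadoop.

```python
from collections import defaultdict


def map_edges(vertex, successors):
    """Map: for each edge vertex -> s, emit one record per endpoint."""
    for s in successors:
        yield vertex, (">", s)   # vertex has an outgoing edge to s
        yield s, ("<", vertex)   # s has an incoming edge from vertex


def reduce_vertex(vertex, values):
    """Reduce: join edges u -> vertex and vertex -> w into edges u -> w."""
    incoming = [v for direction, v in values if direction == "<"]
    outgoing = [v for direction, v in values if direction == ">"]
    for u in incoming:
        for w in outgoing:
            yield u, w


def closure_step(graph):
    """One simulated MapReduce job; returns (new_graph, changed)."""
    grouped = defaultdict(list)          # stands in for the shuffle phase
    for vertex, successors in graph.items():
        for key, value in map_edges(vertex, successors):
            grouped[key].append(value)

    new_graph = {v: set(s) for v, s in graph.items()}
    changed = False
    for key, values in grouped.items():
        for u, w in reduce_vertex(key, values):
            if w not in new_graph.setdefault(u, set()):
                new_graph[u].add(w)
                changed = True
    return new_graph, changed


def transitive_closure(graph):
    """Repeat the job until no new edges are found (slide 19)."""
    graph = {v: set(s) for v, s in graph.items()}
    changed = True
    while changed:
        graph, changed = closure_step(graph)
    return graph


# The adjacency list shown on slides 17 and 18.
SLIDE_GRAPH = {
    1: [4, 6, 7], 2: [5], 3: [2, 4, 7], 4: [1, 3, 6],
    5: [2, 3], 6: [5], 7: [3, 5],
}
```

With the slide's graph, the records grouped under key 6 are `("<", 1)`, `("<", 4)` and `(">", 5)`, and `reduce_vertex` turns them into the edges 1 -> 5 and 4 -> 5, matching slide 18. Each iteration re-reads and re-emits the whole graph, which is the cost that lesson #2 on slide 25 points at.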
