Paolo Castagna talks about Graphs on Hadoop

- 1. Graph Algorithms and MapReduce, by Paolo Castagna. The words and opinions expressed here are my own and do not, in any way, represent the views of my employer.
- 2. Why graphs?
- 3. I am an infornographer! See: http://en.wikipedia.org/wiki/Infornography
- 4. ... addicted to RDF
- 5. RDF is (just) a directed labeled multigraph
- 6. RDF is (just) a directed labeled multigraph
- 7. RDF is (just) a directed labeled multigraph (figure: nodes URI1, URI2, URI3)
- 8. RDF is (just) a directed labeled multigraph (figure: nodes URI1, URI2, URI3, URI4)
- 9. RDF processing
- 10. RDF parallel processing
- 11. MapReduce?
- 12. “... almost no descriptions of graph algorithms appear in the literature, with the exception of a simplified PageRank calculation and a naive implementation of finding distances from a specified node.” Graph Twiddling in a MapReduce World, Jonathan Cohen
- 13. RDF processing: inference [1]
  - (?x p ?y) (?y q r) -> (?x rdf:type t)
  - (?x p ?y) (?y p ?z) -> (?x p ?z)
  - [1] using a rule engine with forward rules only and a total materialization strategy
- 14. Transitive closure
- 15. Transitive closure
- 16. MapReduce?
- 17. Transitive closure (map)
  - Graph (adjacency list): 1: 4, 6, 7; 2: 5; 3: 2, 4, 7; 4: 1, 3, 6; 5: 2, 3; 6: 5; 7: 3, 5
  - Map input: 1: 4, 6, 7
  - Map output: (1, >4), (1, >6), (1, >7), (4, <1), (6, <1), (7, <1)
- 18. Transitive closure (reduce)
  - Graph (adjacency list): 1: 4, 6, 7; 2: 5; 3: 2, 4, 7; 4: 1, 3, 6; 5: 2, 3; 6: 5; 7: 3, 5
  - Reduce input for key 6: (6, <1), (6, <4), (6, >5)
  - Reduce output: 1: 5 and 4: 5
- 19. Transitive closure WARNINGS:
  - Thinking in progress!
  - Not implemented (yet)!
  - Stop when no new edges are found
- 20. Transitive reduction
- 21. Transitive reduction
- 22. MapReduce?
- 23. PageRank: lessons learned
- 24. #1 adjacency list
- 25. #2 moving the graph around at each iteration is not ideal
- 26. #3 to communicate with all the vertices, use the configuration parameters of a subsequent MapReduce job
- 27. “Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code...” Official Google Research Blog, Grzegorz Czajkowski
- 28. “Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code...” Official Google Research Blog, Grzegorz Czajkowski
- 29. Apache Hamburg?
- 30. Graph algorithms
  - Graph search
    - Depth First Search
    - Breadth First Search
  - Directed (acyclic) graphs
    - Reachability and Transitive Closure
    - Topological Sorting
  - Minimum Spanning Tree
  - Shortest Paths
  - Network Flow
  - ...
- 31. Apache Common Graph (dormant)
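
The transitive closure slides (17 to 19) describe a map step that emits each edge twice, once tagged as outgoing (`>`) and once as incoming (`<`), and a reduce step that joins incoming and outgoing edges at each node, repeating until no new edges are found. The following is a minimal single-process sketch of that idea in plain Python, not Hadoop code and not the author's implementation (the slides state it was not implemented yet). The function names `map_phase`, `reduce_phase`, and `transitive_closure` are hypothetical, and a dictionary stands in for the shuffle.

```python
from collections import defaultdict

# Adjacency list taken from the transitive closure slides.
GRAPH = {1: [4, 6, 7], 2: [5], 3: [2, 4, 7], 4: [1, 3, 6],
         5: [2, 3], 6: [5], 7: [3, 5]}


def map_phase(edges):
    """For each edge u -> v, emit (u, '>' v) and (v, '<' u)."""
    for u, v in edges:
        yield u, (">", v)  # v is a successor of u
        yield v, ("<", u)  # u is a predecessor of v


def reduce_phase(key, values):
    """Join at `key`: every path a -> key -> b yields the edge a -> b."""
    preds = [node for tag, node in values if tag == "<"]
    succs = [node for tag, node in values if tag == ">"]
    for a in preds:
        for b in succs:
            yield a, b


def transitive_closure(graph):
    """Iterate map and reduce until no new edges are found."""
    edges = {(u, v) for u, vs in graph.items() for v in vs}
    while True:
        groups = defaultdict(list)  # stands in for the shuffle phase
        for key, value in map_phase(edges):
            groups[key].append(value)
        new_edges = set()
        for key, values in groups.items():
            new_edges.update(reduce_phase(key, values))
        new_edges -= edges
        if not new_edges:  # stopping condition from slide 19
            return edges
        edges |= new_edges


if __name__ == "__main__":
    closure = transitive_closure(GRAPH)
    print(sorted(closure))
```

Each iteration here only joins paths of length two, so the number of rounds grows with the length of the longest shortest path. In a real MapReduce setting every round would be a separate job, which is one reason the iteration count matters.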

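The PageRank lessons on slides 24 to 26 (use an adjacency list; moving the graph around at each iteration is not ideal) can be illustrated with a small single-process sketch. This is an assumption-laden illustration, not the code behind the talk: it uses the standard damped PageRank update with an assumed damping factor of 0.85, the names `pagerank_map`, `pagerank_reduce`, and `pagerank` are hypothetical, and it assumes every vertex appears as a key in the adjacency list and has at least one outgoing edge. Note how the map step has to re-emit each adjacency list so the next iteration still has the graph, which is the cost lesson #2 points at.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed value, not taken from the slides


def pagerank_map(node, rank, neighbours):
    """Emit the node's adjacency list plus a rank share per neighbour."""
    # The graph structure travels with the data on every iteration.
    yield node, ("graph", neighbours)
    for n in neighbours:
        yield n, ("rank", rank / len(neighbours))


def pagerank_reduce(node, values, num_nodes):
    """Sum incoming rank shares and recover the adjacency list."""
    neighbours, total = [], 0.0
    for tag, payload in values:
        if tag == "graph":
            neighbours = payload
        else:
            total += payload
    rank = (1 - DAMPING) / num_nodes + DAMPING * total
    return node, rank, neighbours


def pagerank(graph, iterations=20):
    """Run a fixed number of map and reduce rounds over `graph`."""
    # num_nodes is a global value every reducer needs; here it is a plain
    # argument, loosely standing in for a job configuration parameter.
    num_nodes = len(graph)
    state = {node: (1.0 / num_nodes, list(nbrs))
             for node, nbrs in graph.items()}
    for _ in range(iterations):
        groups = defaultdict(list)  # stands in for the shuffle phase
        for node, (rank, nbrs) in state.items():
            for key, value in pagerank_map(node, rank, nbrs):
                groups[key].append(value)
        state = {}
        for node, values in groups.items():
            node, rank, nbrs = pagerank_reduce(node, values, num_nodes)
            state[node] = (rank, nbrs)
    return {node: rank for node, (rank, _) in state.items()}


if __name__ == "__main__":
    graph = {1: [4, 6, 7], 2: [5], 3: [2, 4, 7], 4: [1, 3, 6],
             5: [2, 3], 6: [5], 7: [3, 5]}
    for node, rank in sorted(pagerank(graph).items()):
        print(node, round(rank, 4))
```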