Paolo Castagna talks about Graphs on Hadoop

Published in:
Technology


- 1. Graph Algorithms and MapReduce Paolo Castagna The words and opinions expressed here are my own, and do not, in any way, represent the views of my employer.
- 2. Why graphs?
- 3. I am an infornographer! See: http://en.wikipedia.org/wiki/Infornography
- 4. ... addicted to RDF
- 5. RDF is (just) a directed labeled multigraph
- 6. RDF is (just) a directed labeled multigraph
- 7. RDF is (just) a directed labeled multigraph URI2 URI1 URI3
- 8. RDF is (just) a directed labeled multigraph URI2 URI1 URI3 URI4
- 9. RDF processing
- 10. RDF parallel processing
- 11. MapReduce?
- 12. “... almost no descriptions of graph algorithms appear in the literature, with the exception of a simplified PageRank calculation and a naive implementation of finding distances from a specified node.” (Graph Twiddling in a MapReduce World, Jonathan Cohen)
- 13. RDF processing: inference [1]. Rule: (?x p ?y) (?y q r) -> (?x rdf:type t). Rule: (?x p ?y) (?y p ?z) -> (?x p ?z). [1] Using a rule engine with forward rules only and a total materialization strategy.
- 14. Transitive closure
- 15. Transitive closure
- 16. MapReduce?
- 17. Transitive closure (map). Graph as adjacency lists: 1: 4, 6, 7 | 2: 5 | 3: 2, 4, 7 | 4: 1, 3, 6 | 5: 2, 3 | 6: 5 | 7: 3, 5. Map input "1: 4, 6, 7" emits: (1, >4), (1, >6), (1, >7), (4, <1), (6, <1), (7, <1).
- 18. Transitive closure (reduce). Graph as adjacency lists: 1: 4, 6, 7 | 2: 5 | 3: 2, 4, 7 | 4: 1, 3, 6 | 5: 2, 3 | 6: 5 | 7: 3, 5. Reduce input for key 6: (6, <1), (6, <4), (6, >5). Reduce output: 1: 5 and 4: 5.
- 19. Transitive closure. Warnings: thinking in progress! Not implemented (yet)! Stop when no new edges are found.
- 20. Transitive reduction
- 21. Transitive reduction
- 22. MapReduce ?
- 23. PageRank: lessons learned
- 24. #1 adjacency list
- 25. #2 moving the graph around at each iteration is not ideal
- 26. #3 to communicate with all the vertices, use the configuration parameters of a subsequent MapReduce job
- 27. “Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code...” (Official Google Research Blog, Grzegorz Czajkowski)
- 28. “Pregel computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code...” (Official Google Research Blog, Grzegorz Czajkowski)
- 29. Apache Hamburg?
- 30. Graph algorithms. Graph search: Depth First Search, Breadth First Search. Directed (acyclic) graphs: Reachability and Transitive Closure, Topological Sorting. Minimum Spanning Tree. Shortest Paths. Network Flow. ...
- 31. Apache Commons Graph (dormant)
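
The transitive closure slides (17 to 19) show only one map record and one reduce key, and the talk itself says the idea was not implemented. The following is a minimal in-memory sketch of how that step might look, assuming that `>` marks an outgoing edge and `<` an incoming edge, and that the reducer joins every predecessor of a node with every successor. The function names (`map_record`, `reduce_key`, `closure_step`, `transitive_closure`) are hypothetical, and the shuffle phase is simulated with a dictionary rather than a real MapReduce runtime.

```python
from collections import defaultdict


def map_record(node, successors):
    """Map one adjacency-list record, e.g. 1: 4, 6, 7.

    For every edge node -> succ, emit an outgoing marker keyed by the
    source and an incoming marker keyed by the target.
    """
    for succ in successors:
        yield node, (">", succ)   # node has an edge to succ
        yield succ, ("<", node)   # succ has an edge from node


def reduce_key(node, values):
    """Join predecessors and successors of one node.

    If p -> node and node -> s both exist, emit the derived edge p -> s.
    """
    preds = sorted(v for tag, v in values if tag == "<")
    succs = sorted(v for tag, v in values if tag == ">")
    for p in preds:
        for s in succs:
            yield p, s


def closure_step(graph):
    """Run one simulated map/shuffle/reduce pass over the graph.

    Returns the enlarged graph and the number of edges that were new.
    """
    shuffled = defaultdict(list)  # stands in for the shuffle phase
    for node, successors in graph.items():
        for key, value in map_record(node, successors):
            shuffled[key].append(value)

    new_graph = {n: set(s) for n, s in graph.items()}
    added = 0
    for key, values in shuffled.items():
        for p, s in reduce_key(key, values):
            if s not in new_graph.setdefault(p, set()):
                new_graph[p].add(s)
                added += 1
    return new_graph, added


def transitive_closure(graph):
    """Repeat the pass until no new edges are found (slide 19)."""
    current = {n: set(s) for n, s in graph.items()}
    while True:
        current, added = closure_step(current)
        if added == 0:
            return current
```

Calling `reduce_key(6, [("<", 1), ("<", 4), (">", 5)])` yields the edges `(1, 5)` and `(4, 5)`, which matches the reduce output on slide 18. Because each pass joins paths that may already be longer than one edge, the number of passes grows roughly with the logarithm of the longest shortest path, not with its length.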

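The three PageRank lessons (slides 24 to 26) can be illustrated with a small simulated iteration. This is a sketch under stated assumptions, not the code behind the talk: the slides do not say which value is passed through the job configuration, so the rank mass of dangling vertices is used here as one plausible example. The damping factor 0.85 is a conventional choice, and all function names are hypothetical.

```python
from collections import defaultdict

DAMPING = 0.85  # assumption: conventional damping factor, not from the slides


def pagerank_map(node, rank, links):
    """Lesson #1: each record is a vertex with its adjacency list."""
    # Lesson #2: the structure has to be re-emitted at every iteration,
    # so the whole graph moves through the shuffle each time.
    yield node, ("links", links)
    for target in links:
        yield target, ("rank", rank / len(links))


def pagerank_reduce(node, values, n_nodes, dangling_mass):
    """Rebuild the vertex and compute its new rank.

    Lesson #3: n_nodes and dangling_mass are global values that every
    vertex needs; here they arrive as plain parameters, standing in for
    configuration parameters of the job (an assumption).
    """
    links = []
    incoming = 0.0
    for tag, payload in values:
        if tag == "links":
            links = payload
        else:
            incoming += payload
    rank = (1 - DAMPING) / n_nodes + DAMPING * (incoming + dangling_mass / n_nodes)
    return rank, links


def pagerank_iteration(state, dangling_mass):
    """One simulated job. state maps node -> (rank, links)."""
    shuffled = defaultdict(list)  # stands in for the shuffle phase
    for node, (rank, links) in state.items():
        for key, value in pagerank_map(node, rank, links):
            shuffled[key].append(value)

    new_state = {
        node: pagerank_reduce(node, values, len(state), dangling_mass)
        for node, values in shuffled.items()
    }
    # Collected for the configuration of the subsequent job.
    next_dangling = sum(rank for rank, links in new_state.values() if not links)
    return new_state, next_dangling


def pagerank(graph, iterations=20):
    """graph maps every node (including sinks) to a list of targets."""
    n = len(graph)
    state = {node: (1.0 / n, list(links)) for node, links in graph.items()}
    dangling = sum(rank for rank, links in state.values() if not links)
    for _ in range(iterations):
        state, dangling = pagerank_iteration(state, dangling)
    return {node: rank for node, (rank, _) in state.items()}
```

The sketch assumes every vertex appears as a key in `graph`, with an empty list for vertices that have no outgoing edges. The ranks sum to 1 after each iteration because the dangling mass is redistributed evenly instead of being lost.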