Giraph
 

© All Rights Reserved


    Giraph Presentation Transcript

    • Giraph (bt22dr@gmail.com)
    • Agenda
      • Introduction
      • BSP
      • Pregel
      • Giraph
    • Giraph
    • Giraph?
    • Apache Giraph
    • "Apache Giraph is an iterative graph processing system built for high scalability." (Source: http://giraph.apache.org/)
    • "Giraph originated as the open-source counterpart to Pregel." (Source: http://giraph.apache.org/)
    • Pregel?
    • Google Pregel
    • Google Pregel is a distributed system developed especially for large-scale graph processing
    • Various graph problems (vertices V, edges E)
    • Various graph problems: web graphs, social networks, similarity of news articles, disease outbreak paths, transportation routes, ... (Internet, Web 2.0)
    • Large-scale graphs: web graphs, social networks (Internet, Web 2.0, mobile, Internet of Things); billions of vertices, trillions of edges
    • Options?
      • Crafting a custom distributed infrastructure
      • Relying on an existing distributed computing platform
      • Using a single-computer graph algorithm library
      • Using an existing parallel graph system
      • Points noted on the slide: efficient processing of large graphs, locality of memory access, fault-tolerant platform, general-purpose system
    • Pregel: efficient processing of large graphs, locality of memory access, general-purpose system, fault-tolerant platform
    • BSP
    • "The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms." (Source: Bulk synchronous parallel - Wikipedia, the free encyclopedia)
    • BSP computer:
      • processors connected by a communication network
      • fast local memory
      • different threads of computation
      • series of global supersteps
    • A series of supersteps … … (cf. MapReduce: (map / reduce) + (map / reduce) + (map / reduce) + …)
    • superstep: independent; one-way; no ordering to consider; costly but attractive
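    • In superstep S, a user-defined function runs on each vertex V: it reads the messages sent to V in superstep S - 1 and sends messages that are received in superstep S + 1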
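
The superstep rule on this slide can be illustrated with a small single-process simulation. This is only a sketch under assumptions: `run_supersteps` and `ring_compute` are hypothetical names made up for illustration, not part of Pregel or Giraph, and the "barrier" is simply the point where the outbox becomes the next inbox.

```python
from collections import defaultdict


def run_supersteps(vertices, compute, max_supersteps=10):
    """Call compute(vertex, superstep, incoming) for every vertex, once per superstep.

    compute returns a list of (target_vertex, message) pairs. Messages sent in
    superstep S only become visible to their targets in superstep S + 1.
    Returns one {vertex: incoming_messages} snapshot per executed superstep.
    """
    inbox = defaultdict(list)   # messages that were sent in superstep S - 1
    trace = []
    for superstep in range(max_supersteps):
        trace.append({v: list(inbox[v]) for v in vertices})
        outbox = defaultdict(list)  # messages to be read in superstep S + 1
        for v in vertices:
            for target, message in compute(v, superstep, inbox[v]):
                outbox[target].append(message)
        inbox = outbox          # stands in for the global synchronization barrier
        if not outbox:          # nothing in transit: stop
            break
    return trace


def ring_compute(v, superstep, incoming):
    """Hypothetical vertex function: in superstep 0, send own id to the next vertex."""
    if superstep == 0:
        return [((v + 1) % 3, v)]
    return []
```

Running `run_supersteps([0, 1, 2], ring_compute)` shows every inbox empty in superstep 0, and each vertex seeing its predecessor's id only in superstep 1.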
    • Pregel Computation
      • Input: a directed graph
        - Vertex (V): vertex ID, value
        - Edge (E): value (weight), target vertex ID
      • Sequence of supersteps, repeated until termination (every vertex has voted to halt and no messages are in transit)
      • Output
        - the set of values explicitly output by the vertices
        - aggregated statistics mined from the graph
    • Max value example
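
A maximum-value propagation can be sketched as a single-process simulation. Everything here is an assumption made for illustration: `propagate_max` is a hypothetical helper, the four-vertex graph in the usage note is made up, and "vote to halt" is simplified so that a vertex runs again only when a message wakes it up.

```python
def propagate_max(values, edges):
    """Spread the largest vertex value through a directed graph, superstep by superstep.

    values: {vertex: int}, edges: {vertex: [out_neighbor, ...]}.
    Returns (final_values, number_of_supersteps_executed).
    """
    values = dict(values)
    inbox = {v: [] for v in values}
    active = set(values)            # every vertex is active in superstep 0
    superstep = 0
    while active:
        outbox = {v: [] for v in values}
        for v in active:
            best = max([values[v]] + inbox[v])
            if superstep == 0 or best > values[v]:
                values[v] = best
                for neighbor in edges.get(v, []):
                    outbox[neighbor].append(best)
            # otherwise the vertex sends nothing, i.e. it votes to halt
        inbox = outbox
        # a halted vertex is reactivated only by an incoming message
        active = {v for v in values if inbox[v]}
        superstep += 1
    return values, superstep
```

With `values = {"a": 3, "b": 6, "c": 2, "d": 1}` and `edges = {"a": ["b"], "b": ["a", "d"], "c": ["b", "d"], "d": ["c"]}`, every vertex ends with the value 6, and the computation stops once no messages remain in transit.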
    • MapReduce? (Source: http://prezi.com/zghqtkqstrg-/apache-giraph-berlin-buzzwords/)
    • Pregel API
      • Message Passing
      • Combiners
      • Aggregators
      • Topology Mutations
      • Input and Output
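
As a rough illustration of two of the API items above: a combiner merges messages bound for the same vertex before delivery, and an aggregator reduces per-vertex contributions to one global value. The function names `combine_messages` and `aggregate` are hypothetical, and both sketches assume the reduction function is commutative and associative.

```python
def combine_messages(messages, combiner):
    """Collapse all messages addressed to the same target into a single message.

    messages: iterable of (target_vertex, value) pairs.
    combiner: commutative, associative function of two values (e.g. min or max).
    """
    combined = {}
    for target, value in messages:
        if target in combined:
            combined[target] = combiner(combined[target], value)
        else:
            combined[target] = value
    return combined


def aggregate(contributions, reducer, initial):
    """Reduce one contribution per vertex to a single global value for a superstep."""
    result = initial
    for contribution in contributions:
        result = reducer(result, contribution)
    return result
```

For example, `combine_messages([("b", 3), ("b", 2), ("d", 6), ("d", 2)], min)` leaves one message per target, which reduces the traffic a shortest-path style algorithm would otherwise send.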
    • Master/Worker model (Source: http://de.slideshare.net/sscdotopen/introducing-apache-giraph-for-large-scale-graph-processing)
    • Fault Tolerance
      • Checkpointing
        - The master periodically instructs the workers to save the state of their partitions to persistent storage (e.g., vertex values, edge values, incoming messages)
      • Failure detection
        - Using regular "ping" messages
      • Recovery
        - The master reassigns graph partitions to the currently available workers
        - The workers all reload their partition state from the most recent available checkpoint
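
The checkpoint and recovery steps above can be sketched as follows. This is not how Pregel or Giraph store checkpoints: the JSON file layout, the file naming scheme, and the helper names `save_checkpoint`, `latest_checkpoint`, and `reassign_partitions` are all assumptions made for illustration, with a local directory standing in for persistent storage.

```python
import json
import os


def save_checkpoint(directory, superstep, partition_id, state):
    """Write one partition's state (vertex values, edge values, messages) to a file."""
    name = f"superstep-{superstep}-partition-{partition_id}.json"
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        json.dump(state, f)
    return path


def latest_checkpoint(directory, partition_id):
    """Return (superstep, state) of the most recent checkpoint for a partition, or None."""
    prefix = "superstep-"
    suffix = f"-partition-{partition_id}.json"
    best = None
    for name in os.listdir(directory):
        if name.startswith(prefix) and name.endswith(suffix):
            step = int(name[len(prefix):-len(suffix)])
            if best is None or step > best[0]:
                best = (step, name)
    if best is None:
        return None
    with open(os.path.join(directory, best[1])) as f:
        return best[0], json.load(f)


def reassign_partitions(partitions, available_workers):
    """Spread partitions round-robin over the workers that are still alive."""
    return {p: available_workers[i % len(available_workers)]
            for i, p in enumerate(partitions)}
```

After a failure, the master-side step would be `reassign_partitions(...)`, and each worker would call `latest_checkpoint(...)` for its partitions and resume from the returned superstep.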
    • Giraph
      • Open-source implementation of Pregel
      • Runs on Hadoop infrastructure
        - as a map-only job in Hadoop
      • Computation is executed in memory
      • Uses Apache ZooKeeper for synchronization
        - If it does not exist, the Hadoop file system is used instead
    • Giraph
      • Choose your graph generic types
        - Vertex ID (type I)
        - Vertex value (type V)
        - Edge value (type E)
        - Message value (type M)
      • Define how to load the graph into Giraph
        - Vertex Input Format
      • Define how to store the graph from Giraph
        - Vertex Output Format
      • Override the compute() method
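
The shape of these steps (four type parameters plus an overridden `compute()`) can be mirrored in a few lines. Giraph itself is a Java framework, and this Python sketch is not its API: the `Vertex` base class, its fields, and `MaxVertex` are hypothetical names that only illustrate the pattern.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Generic, Iterable, List, Tuple, TypeVar

I = TypeVar("I")  # vertex ID type
V = TypeVar("V")  # vertex value type
E = TypeVar("E")  # edge value type
M = TypeVar("M")  # message value type


@dataclass
class Vertex(ABC, Generic[I, V, E, M]):
    """Hypothetical base class: users pick I, V, E, M and override compute()."""
    vertex_id: I
    value: V
    edges: Dict[I, E] = field(default_factory=dict)  # target vertex ID -> edge value
    halted: bool = False

    @abstractmethod
    def compute(self, messages: Iterable[M]) -> List[Tuple[I, M]]:
        """Process incoming messages; return (target vertex ID, message) pairs to send."""


class MaxVertex(Vertex[str, int, None, int]):
    """Example subclass: keep the largest value seen and forward it when it changes."""

    def compute(self, messages):
        best = max([self.value, *messages])
        changed = best > self.value
        self.value = best
        self.halted = not changed  # vote to halt when nothing changed
        return [(target, best) for target in self.edges] if changed else []
```

A framework would additionally need the input and output formats listed above to build such vertices from stored data and to write their final values back out; those parts are left out of this sketch.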
    • Giraph (Shortest Path example)
      • generic types
      • compute() method
      • Input/output format
      • [Figure: example graph]
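
The code shown on these slides did not survive in the transcript, so the following is not the presenter's Giraph code. It is a plain single-process sketch of a Pregel-style single-source shortest path computation, assuming non-negative edge weights; `shortest_paths` is a hypothetical helper and the graph in the usage note is made up.

```python
import math


def shortest_paths(edges, source):
    """Vertex-centric single-source shortest paths, simulated superstep by superstep.

    edges: {vertex: {target_vertex: weight}} with non-negative weights.
    Returns {vertex: distance from source}; unreachable vertices stay at infinity.
    """
    vertices = set(edges) | {t for targets in edges.values() for t in targets}
    dist = {v: math.inf for v in vertices}
    inbox = {v: [] for v in vertices}
    active = set(vertices)              # every vertex runs in superstep 0
    while active:
        outbox = {v: [] for v in vertices}
        for v in active:
            # the source proposes distance 0; other vertices rely on their messages
            candidate = min(inbox[v] + [0 if v == source else math.inf])
            if candidate < dist[v]:
                dist[v] = candidate
                for target, weight in edges.get(v, {}).items():
                    outbox[target].append(candidate + weight)
            # a vertex that did not improve sends nothing and votes to halt
        inbox = outbox
        active = {v for v in vertices if inbox[v]}  # messages reactivate vertices
    return dist
```

For the made-up graph `{1: {2: 1, 3: 4}, 2: {3: 2, 4: 7}, 3: {4: 1}, 4: {}}` with source 1, vertex 3 first receives distance 4 directly and is later improved to 3 via vertex 2, which in turn improves vertex 4.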
    • References
      • Pregel: A System for Large-Scale Graph Processing
      • http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
      • http://giraph.apache.org
      • http://prezi.com/zghqtkqstrg-/apache-giraph-berlin-buzzwords/