Trinity: A Distributed Graph Engine on a Memory Cloud
Transcript

  • 1. Trinity: A Distributed Graph Engine on a Memory Cloud Speaker: LIN Qian http://www.comp.nus.edu.sg/~linqian/
  • 2. Graph applications Online query processing → low latency Offline graph analytics → high throughput
  • 3. Online queries Random data access e.g., BFS, sub-graph matching, …
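BFS illustrates why online queries stress random access: every hop dereferences an arbitrary set of neighbor ids. A minimal sketch over an in-memory adjacency dict (the dict stands in for the memory cloud; none of these names are Trinity's API):

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal over an adjacency-dict graph.

    Every neighbor lookup is a random access; in Trinity's setting each
    lookup may land on a different machine, which is why keeping the
    topology in RAM matters.
    """
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for u in graph.get(v, ()):
            if u not in visited:
                visited.add(u)
                queue.append(u)
    return order
```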
  • 4. Offline computations Performed iteratively
  • 5. Insight: keep the graph in memory, at least its topology
  • 6. Trinity Online query + Offline analytics
  • 7. Random data access problem in large graph computation Globally addressable distributed memory Random access abstraction
  • 8. Belief High-speed networks are increasingly available DRAM is getting cheaper In-memory solutions become practical
  • 9. “Trinity itself is not a system that comes with comprehensive built-in graph computation modules.”
  • 10. Trinity cluster
  • 11. Stack of Trinity system modules User define: Graph schema, Communication protocols, Computation paradigms
  • 12. Memory cloud Partition memory space into trunks Hashing
  • 13. Memory trunks 2^p > m 1. Trunk-level parallelism 2. Efficient hashing
  • 14. Hashing Key-value store p-bit value → i ∈ [0, 2^p − 1] Inner-trunk hash table
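The two-level scheme on slides 12-15 can be sketched as: a key hashes to one of 2^p trunks, and an addressing table maps each trunk to a machine (2^p > m gives trunk-level parallelism and cheap re-balancing). The constants, the md5 hash, and the round-robin placement are illustrative assumptions, not Trinity's actual layout:

```python
import hashlib

P = 8           # p-bit trunk space: 2**P trunks, with 2**P > MACHINES
MACHINES = 5    # m machines in the cluster

def trunk_of(key: str) -> int:
    """First level: hash a key to a p-bit trunk id i in [0, 2**P - 1]."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** P)

# Second level: an addressing table of 2**P slots maps each trunk to the
# machine currently holding it (round-robin here, purely for the sketch).
addressing_table = [i % MACHINES for i in range(2 ** P)]

def machine_of(key: str) -> int:
    """Resolve a key to the machine that owns its trunk."""
    return addressing_table[trunk_of(key)]
```

Because placement is a table lookup rather than `hash(key) % m`, a failed machine's trunks can be reassigned by editing table slots, which is where the scalability and fault-tolerance benefits of slide 15 come from.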
  • 15. Data partitioning and addressing Benefits: scalability, fault tolerance
  • 16. Modeling graphs Cell: value + schema Represent each node as a cell
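A cell as "value + schema" can be approximated as a typed record keyed by cell id in the key-value store. Field names here are invented for illustration; Trinity declares the real schema in TSL:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class NodeCell:
    """One graph node stored as a cell: a schema-typed value.

    `cell_id`, `out_links`, and `label` are hypothetical field names,
    not Trinity's schema.
    """
    cell_id: int
    out_links: List[int] = field(default_factory=list)
    label: str = ""

store: Dict[int, NodeCell] = {}   # key-value view of the memory cloud

def put(cell: NodeCell) -> None:
    store[cell.cell_id] = cell

def get(cell_id: int) -> NodeCell:
    return store[cell_id]
```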
  • 17. TSL Object-oriented cell manipulation Data integration Network communication
  • 18. Online queries Traversal-based New paradigm
  • 19. Vertex-centric offline analytics Restrictive vertex-centric model
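The restrictive vertex-centric model can be shown with a toy BSP loop: synchronous supersteps in which every vertex recomputes its state from its neighbors' previous state. This sketch propagates minimum labels (connected components); it illustrates the paradigm only and is not Trinity code:

```python
def vertex_centric_min_label(graph, supersteps=10):
    """Toy vertex-centric BSP: each vertex adopts the smallest label
    seen among itself and its neighbors, one superstep at a time, until
    no label changes (a connected-components computation)."""
    label = {v: v for v in graph}
    for _ in range(supersteps):
        changed = False
        new = {}
        for v, nbrs in graph.items():
            # the "compute" function may only read neighbor state and
            # update this vertex -- the restriction the slide mentions
            best = min([label[v]] + [label[u] for u in nbrs])
            new[v] = best
            changed |= best != label[v]
        label = new            # barrier: all updates become visible at once
        if not changed:
            break
    return label
```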
  • 20. Message passing optimization Create a bipartite partition of the local graph Buffer hub vertices
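The payoff of buffering hub vertices can be sketched as message aggregation: combine, per destination machine, all values bound for the same vertex before sending, so a high-degree hub receives one message per sender machine instead of one per in-edge. `partition` (vertex → machine) and `combine` are hypothetical names for this sketch:

```python
from collections import defaultdict

def batch_messages(edges, partition, combine=sum):
    """Locally combine all messages headed to the same remote vertex.

    `edges` is an iterable of (src, dst, value); returns one combined
    message per (dest_machine, dest_vertex) pair.
    """
    outbox = defaultdict(list)
    for src, dst, value in edges:
        outbox[(partition[dst], dst)].append(value)
    return {key: combine(vals) for key, vals in outbox.items()}
```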
  • 21. A new paradigm for offline analytics 1. Aggregate answers from local computations 2. Employ probabilistic inference
  • 22. Circular memory management • Aim to avoid memory gaps between a large number of key-value pairs
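The gap-avoidance idea can be sketched as an append-then-compact allocator: new cell payloads are appended at the head of one contiguous buffer, and compaction copies the live cells together so no gaps remain between key-value pairs. A toy model, not Trinity's allocator:

```python
class CircularTrunk:
    """Toy trunk allocator: append at the head, reclaim gaps by compacting."""

    def __init__(self):
        self.buf = bytearray()
        self.index = {}                      # key -> (offset, length)

    def put(self, key, payload: bytes) -> None:
        # always append; an overwritten key leaves a gap until compact()
        self.index[key] = (len(self.buf), len(payload))
        self.buf += payload

    def compact(self) -> None:
        # copy live cells into a fresh buffer, back to back
        new_buf, new_index = bytearray(), {}
        for key, (off, ln) in self.index.items():
            new_index[key] = (len(new_buf), ln)
            new_buf += self.buf[off:off + ln]
        self.buf, self.index = new_buf, new_index

    def get(self, key) -> bytes:
        off, ln = self.index[key]
        return bytes(self.buf[off:off + ln])
```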
  • 23. Fault tolerance Heartbeat-based failure detection BSP mode: checkpointing Asynchronous mode: “periodical interruption”
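Heartbeat-based failure detection amounts to recording the last beat from each machine and suspecting any machine that stays silent past a timeout. A minimal sketch; the timeout value and class names are assumptions:

```python
import time

TIMEOUT = 3.0   # seconds of silence before a machine is suspected (assumed)

class FailureDetector:
    """Track per-machine heartbeats; report machines past the timeout."""

    def __init__(self):
        self.last_beat = {}

    def heartbeat(self, machine, now=None):
        self.last_beat[machine] = time.monotonic() if now is None else now

    def failed(self, now=None):
        now = time.monotonic() if now is None else now
        return {m for m, t in self.last_beat.items() if now - t > TIMEOUT}
```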
  • 24. Performance
  • 25. Performance (cont.)
