Kineograph: Taking the Pulse of a Fast-Changing and Connected World

  1. Kineograph: Taking the Pulse of a Fast-Changing and Connected World. Speaker: LIN Qian, http://www.comp.nus.edu.sg/~linqian
  2. Information: time-sensitive, rich connections
  3. Challenges
  4. Challenge 1: Timeliness guarantees
  5. Challenge 2: Graph
  6. Challenge 3: Graph-mining
  7. Kineograph: distributed in-memory graph storage, incremental graph mining
  8. Architecture (diagram labels): master; progress table; continuous data feeds; ingest nodes; snapshooter; graph nodes (storage, computation); globally consistent snapshots; incremental computation on a static graph snapshot
  9. Graph computation, graph updates
  10. Graph nodes: storage layer, computation layer
  11. Storage layer: key/value store, logical partitions
  12. Graph partitioning: edge-cut, no locality consideration
  13. Snapshot: ingest nodes, graph nodes, global progress table
  14. Ingest node: graph-update operations, sequence number
  15. Epoch commit protocol
  16. Epoch commit (diagram labels): progress table with the global tx vector; ingest nodes s1 … sn; snapshooter; graph nodes with partition u and partition v. The epoch is specified by the progress table and the snapshooter.
  17. Graph update / compute pipeline (diagram labels): incoming tweets over time; snapshot construction Si-1, Si, Si+1; graph computation Ci per epoch; times ti-1, ti, ti’, ti’’; timeliness
  18. Consistency: no global serialization (different from 2PL or timestamp ordering)
  19. Atomicity (diagram: vertices u and v)
  20. Deterministic vertex creation
  21. Computation layer: incremental graph-mining
  22. Vertex-based computation model
  23. Incremental graph computation (flowchart labels): init; detect vertex status; compute new vertex values; change significantly? (Y/N); propagate updates; updates from other vertices; graph-scale aggregation
  24. Push model: sender-side aggregation
  25. Pull model: read a subset of neighbors
  26. Execution model: BSP + dynamic scheduling
  27. 3 apps: TunkRank, SP, K-exposure
  28. TunkRank
  29. SP
  30. K-exposure
  31. Fault tolerance among servers: Paxos-based solution
  32. Ingest node failure: incarnation number
  33. Fault tolerance @ storage layer: quorum-based replication
  34. Fault tolerance @ computation layer: roll back & re-execute, primary/backup replication
  35. Incremental expansion
  36. Decaying
  37. C#, 17,000 LOC
  38. Twitter feeds: 8M vertices, 29M edges; 100M tweets with 100K/sec; power-law
  39. Graph-update throughput
  40. Incremental vs. non-incremental
  41. Scalability
  42. Incoming data rate
  43. Failure recovery
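
The snapshot slides (ingest nodes, sequence numbers, global progress table, snapshooter, epoch commit) only name the moving parts, so the following is a minimal single-process sketch of how they might fit together. It is not the Kineograph implementation (which the slides describe as C#). The names `Partition`, `IngestNode`, and `take_snapshot`, the modulo placement of vertices, and the synchronous delivery of operations are all assumptions made for illustration.

```python
from collections import defaultdict


class Partition:
    """Hypothetical logical partition: buffers updates until an epoch commits."""

    def __init__(self):
        self.pending = defaultdict(list)  # ingest id -> [(sequence number, edge)]
        self.edges = set()                # edges visible in the committed snapshot

    def receive(self, ingest_id, seq, edge):
        self.pending[ingest_id].append((seq, edge))

    def commit(self, vector):
        """Apply buffered updates covered by the vector; keep the rest pending."""
        for ingest_id, upto in vector.items():
            buffered = self.pending[ingest_id]
            for seq, edge in sorted(buffered):
                if seq <= upto:
                    self.edges.add(edge)
            self.pending[ingest_id] = [(s, e) for s, e in buffered if s > upto]


class IngestNode:
    """Hypothetical ingest node: numbers each transaction and reports progress."""

    def __init__(self, ingest_id, partitions, progress_table):
        self.ingest_id = ingest_id
        self.partitions = partitions
        self.progress_table = progress_table
        self.next_seq = 1

    def ingest(self, src, dst):
        seq = self.next_seq
        self.next_seq += 1
        # Assumption: an edge update touches the partitions of both endpoints,
        # and integer vertex ids are placed by a simple modulo.
        for vertex in (src, dst):
            part = self.partitions[vertex % len(self.partitions)]
            part.receive(self.ingest_id, seq, (src, dst))
        # Progress is recorded only after every operation has been delivered.
        self.progress_table[self.ingest_id] = seq


def take_snapshot(progress_table, partitions):
    """Snapshooter step: freeze the progress vector and commit that epoch."""
    vector = dict(progress_table)  # one sequence number per ingest node
    for part in partitions:
        part.commit(vector)
    return vector


progress = {}
parts = [Partition(), Partition()]
s1 = IngestNode("s1", parts, progress)
s2 = IngestNode("s2", parts, progress)

s1.ingest(1, 2)
s2.ingest(2, 3)
vector = take_snapshot(progress, parts)  # epoch boundary
s1.ingest(3, 4)                          # arrives after the cut, stays pending
print(vector, parts[0].edges, dict(parts[0].pending))
```

In this sketch a transaction is visible in a snapshot on every partition it touches or on none of them, because all partitions commit against the same frozen vector. No global ordering across ingest nodes is needed, which is one way to read the "no global serialization" point on the consistency slide.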
