Ciel universal distributed execution engine

Presentation of the CIEL framework as described in the paper: "CIEL: a universal execution engine for distributed data-flow computing"

  1. CIEL universal distributed execution engine. Presenter: Emmanouil Dimogerontakis @{AdvDS}, EMDC, KTH, November 6, 2012. Sections: Motivation; CIEL; Skywriting; Optimizations & Fault Tolerance; Evaluation & Future work.
  2. Outline: (1) Motivation: Distributed Execution Engines. (2) CIEL: Dynamic Task Graphs, Architecture. (3) Skywriting. (4) Optimizations & Fault Tolerance. (5) Evaluation & Future work: Evaluation, Future Work, Conclusions.
  3. Distributed Execution Engines. Purpose: execute a task graph, providing task scheduling, data distribution, load balancing, and transparent fault tolerance. Figure: Task Graph.
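
To make "execute a task graph" concrete, here is a minimal single-process sketch in Python. It only shows the dependency-ordered execution of a static, acyclic graph; scheduling across machines, data distribution, load balancing, and fault tolerance are deliberately left out. The names `run_task_graph`, `tasks`, and `deps` are illustrative assumptions, not part of any engine's API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_graph(tasks, deps):
    """Run each task after all of its dependencies have produced a result.

    tasks: dict mapping a task name to a function of its dependencies' results
    deps:  dict mapping a task name to the list of names it depends on
    """
    results = {}
    # TopologicalSorter expects {node: predecessors}; it raises CycleError
    # if the graph is not acyclic.
    order = TopologicalSorter({name: deps.get(name, []) for name in tasks})
    for name in order.static_order():
        inputs = [results[d] for d in deps.get(name, [])]
        results[name] = tasks[name](*inputs)
    return results


# Hypothetical four-task graph: load -> (sort, total) -> report
tasks = {
    "load": lambda: [3, 1, 2],
    "sort": lambda xs: sorted(xs),
    "total": lambda xs: sum(xs),
    "report": lambda s, t: (s, t),
}
deps = {"sort": ["load"], "total": ["load"], "report": ["sort", "total"]}
results = run_task_graph(tasks, deps)
```

Because the whole graph is known before execution starts, the order can be computed up front. That is the "static" property the following slides identify as a limitation.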
  7. Limitations. Task graphs used up to now are static and acyclic. Resulting limitations: limited expressive power, poor performance, insufficient fault tolerance.
  9. Overview. Figure: Distributed Execution Engines comparison.
  11. CIEL. Why universal? It supports the same class of algorithms as a Turing machine (TM). How? By using dynamic task graphs.
  13. CIEL primitives: objects, references, tasks. Figure: A Task Graph (source: http://www.cl.cam.ac.uk/~dgm36/CIEL-NSDI-slides.pdf).
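
One way to picture the three primitives is as plain data structures. The Python sketch below is an illustration under assumptions: the class and field names (`Reference`, `Task`, `locations`, `expected_outputs`, `is_runnable`) are hypothetical, and objects are modelled simply as named byte strings in a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A name for an object plus the places it is stored (assumed layout)."""
    name: str
    locations: frozenset = frozenset()

    @property
    def is_future(self):
        # No known location yet: the object has not been produced.
        return not self.locations


@dataclass
class Task:
    """A unit of computation that depends on some objects and names its outputs."""
    task_id: str
    dependencies: list      # list of Reference
    expected_outputs: list  # list of object names

    def is_runnable(self, store):
        # Runnable once every dependency exists in the object store.
        return all(ref.name in store for ref in self.dependencies)


store = {"input": b"hello"}  # object store: name -> bytes
t = Task("t1", [Reference("input", frozenset({"worker-a"}))], ["out"])
u = Task("t2", [Reference("out")], ["final"])  # depends on t's future output
```

Here `t` can run immediately, while `u` has to wait until some task publishes the object named `out`.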
  14. Dynamic Task Graphs. Figure: A Dynamic Task Graph.
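
The difference from a static graph can be sketched as follows: a task may either produce its result or spawn a successor that takes over responsibility for that result, so the graph grows while it runs. This is a simplified single-process Python illustration; `run_dynamic`, the `("done", ...)` / `("spawn", ...)` tuples, and the halving example are assumptions made for the sketch, not CIEL's interface.

```python
from collections import deque


def run_dynamic(root, arg):
    """Run tasks from a queue. A task returns ('done', value) or
    ('spawn', fn, arg) to add a successor task that produces its output."""
    queue = deque([(root, arg)])
    executed = 0
    result = None
    while queue:
        fn, value = queue.popleft()
        executed += 1
        outcome = fn(value)
        if outcome[0] == "spawn":
            _, next_fn, next_arg = outcome
            queue.append((next_fn, next_arg))  # the graph grows at run time
        else:
            result = outcome[1]
    return result, executed


def halve_until_small(state):
    """Data-dependent iteration: keep spawning until the value drops below 1."""
    x, steps = state
    if x < 1.0:
        return ("done", steps)
    return ("spawn", halve_until_small, (x / 2.0, steps + 1))


result, executed = run_dynamic(halve_until_small, (10.0, 0))
```

The number of tasks depends on the input value and is not known when execution starts, which is what a static acyclic graph cannot express.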
  15. Master & Worker. Figure: CIEL Master. Figure: CIEL Worker.
  17. Architecture. Figure: CIEL Architecture.
  19. Creating Tasks with Skywriting. Figure: Spawning a new task. Figure: Dereferencing futures.
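
The figures for this slide are not in the transcript. As a rough analogy only, the spawn-then-dereference pattern resembles submitting work to a pool and later asking a future for its value. The sketch below uses Python's standard `concurrent.futures` module on one machine; it is not Skywriting syntax and says nothing about how CIEL schedules or resumes tasks. `square` and `sum_of_squares` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def sum_of_squares(values):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # "Spawn": each submit() returns a future immediately.
        futures = [pool.submit(square, v) for v in values]
        # "Dereference": result() waits until the value is available.
        return sum(f.result() for f in futures)


total = sum_of_squares([1, 2, 3, 4])
```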
  22. Optimizations: globally unique identifiers enable memoization; partially written objects can be streamed between tasks.
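
The link between unique identifiers and memoization can be sketched like this: if an output's name is derived deterministically from the task and the names of its inputs, a second request for the same computation maps to a name that already exists and nothing needs to run. The SHA-256 naming scheme and the `MemoizingStore` class below are assumptions chosen for illustration, not a description of how CIEL actually derives names.

```python
import hashlib


class MemoizingStore:
    """Object store that skips a task when its output name already exists."""

    def __init__(self):
        self.objects = {}
        self.executions = 0

    @staticmethod
    def output_name(task_name, input_names):
        # Assumed scheme: hash the task name and its input names, in order.
        digest = hashlib.sha256()
        digest.update(task_name.encode())
        for name in input_names:
            digest.update(b"\0" + name.encode())
        return digest.hexdigest()

    def run(self, task_name, fn, input_names):
        name = self.output_name(task_name, input_names)
        if name not in self.objects:  # not computed before
            inputs = [self.objects[n] for n in input_names]
            self.objects[name] = fn(*inputs)
            self.executions += 1
        return name


store = MemoizingStore()
store.objects["a"] = 2
store.objects["b"] = 3
first = store.run("add", lambda x, y: x + y, ["a", "b"])
second = store.run("add", lambda x, y: x + y, ["a", "b"])  # reuses the result
```

This only works if tasks are deterministic, since the stored object stands in for every later run with the same name.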
  23. Fault Tolerance. Client: no driver program. Worker: periodic heartbeat. Master: persistent logging, secondary masters, object table reconstruction.
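
For the worker case, a periodic heartbeat lets the master notice a silent worker and hand its tasks to someone else. The following Python sketch shows that bookkeeping under assumptions: the `HeartbeatMonitor` class, the 30-unit timeout, and passing the clock value in as `now` are all illustrative choices, not CIEL's implementation.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per worker and finds tasks to re-run."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # worker -> time of last heartbeat
        self.assigned = {}   # worker -> list of task ids

    def heartbeat(self, worker, now):
        self.last_seen[worker] = now

    def assign(self, worker, task):
        self.assigned.setdefault(worker, []).append(task)

    def reap(self, now):
        """Drop workers that have been silent too long; return their tasks."""
        to_retry = []
        for worker, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                to_retry.extend(self.assigned.pop(worker, []))
                del self.last_seen[worker]
        return to_retry


monitor = HeartbeatMonitor(timeout=30)
monitor.heartbeat("w1", now=0)
monitor.heartbeat("w2", now=0)
monitor.assign("w1", "task-a")
monitor.assign("w2", "task-b")
monitor.heartbeat("w2", now=40)  # w2 keeps reporting, w1 goes silent
retry = monitor.reap(now=45)
```

Passing `now` explicitly instead of reading the system clock keeps the sketch deterministic and easy to test.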
  25. Performance: comparison with a production system. Figure: Distributed Grep on Hadoop and CIEL.
  26. Performance of an Iterative Algorithm. Figure: K-means on Hadoop and CIEL with 20 workers.
  27. Overheads. Figure: Speedup of Binomial Options Pricing Model on 47 workers.
  28. Future Work: integrate CIEL with existing programming languages; partition master state; explore use of multiple cores (see [5]); explore use of non-deterministic parallelism (see [3]).
  29. Conclusions. CIEL [4, 1] and Skywriting [2] are not good for: sharing large amounts of data; fine-grain parallelization; fully automatic parallelism; a relational algebra environment; a distributed operating system. They are really good for: writing iterative algorithms; data-dependent control flow using dynamic task graphs; transparent fault tolerance and automatic distribution; scaling across hundreds of machines. Questions?
  32. References.
      [1] D.G. Murray. A distributed execution engine supporting data-dependent control flow. PhD thesis, Univ. of Cambridge, 2011.
      [2] D.G. Murray and S. Hand. Scripting the cloud with Skywriting. In Proceedings of the 2nd USENIX conference on Hot topics in cloud computing, pages 12–12. USENIX Association, 2010.
      [3] D.G. Murray and S. Hand. Non-deterministic parallelism considered useful. In HotOS XIII, 13th Workshop on Hot Topics in Operating Systems, 2011.
      [4] D.G. Murray, M. Schwarzkopf, C. Smowton, S. Smith, A. Madhavapeddy, and S. Hand. CIEL: a universal execution engine for distributed data-flow computing. In Proceedings of the 8th USENIX conference on Networked systems design and implementation, page 9. USENIX Association, 2011.
      [5] M. Schwarzkopf, D.G. Murray, and S. Hand. Condensing the cloud: running CIEL on many-core. Proceedings of EuroSys SFMA, 2011.
  33. Part I: Appendix.
  34. Appendix outline: (6) CIEL. (7) Skywriting. (8) Experiments.
  35. Hidden slide 1. Figure: Task and Object table maintained in Master node.
  37. Hidden slide 2. Figure: Spawning Tasks (source: http://www.cl.cam.ac.uk/~dgm36/CIEL-NSDI-slides.pdf).
  38. Hidden slide 3. Figure: Blocking on futures (source: http://www.cl.cam.ac.uk/~dgm36/CIEL-NSDI-slides.pdf).
  40. Hidden slide 4. Figure: Primary Master Failure.
