Byzantine Fault-Tolerant MapReduce in Cloud-of-Clouds (DISCCO 2012) - Navtalk


  1. Byzantine Fault-Tolerant MapReduce in Cloud-of-Clouds
     Joint work with: Miguel Correia, Marcelo Pasin, Alysson Bessani, Fernando Ramos, Paulo Verissimo
     Presenter: Pedro Costa (Navtalk)
  2. Motivation
     • How to count the number of words on the Internet?
     • How to do it with the help of a cloud-of-clouds (i.e., several clouds)?
     • Guarantee the integrity and availability of the data
  3. Outline
     • Introduction
       – MapReduce programming model
       – Fault tolerance in cloud-of-clouds
       – Three problems of the basic scheme
     • Our approach
       – Byzantine fault-tolerant MapReduce in clouds-of-clouds
     • Evaluation
  5. What is MapReduce?
     • Programming model + execution environment
       – Introduced by Google in 2004
       – Used for processing large data sets on clusters of servers
       – A few implementations available, used by many companies
     • Hadoop MapReduce, Apache's open-source MapReduce implementation
       – The most used, and the one we have been using
       – Includes HDFS, a distributed file system for large files
  6. MapReduce basic idea
     (Diagram: a file with all the words on the Internet flows through the map phase, which emits <word,1> pairs, and the reduce phase, which produces <word,n> counts; both phases run on task tracker servers)
     The job tracker detects and recovers crashed map/reduce tasks.
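The word-count flow above can be sketched with a minimal map/reduce pair. This is a plain-Python illustration of the programming model, not Hadoop's actual Java API; the function names are ours:

```python
from collections import defaultdict

def map_task(text):
    """Map phase: emit a <word, 1> pair for every word in the input split."""
    return [(word, 1) for word in text.split()]

def reduce_task(pairs):
    """Reduce phase: sum the counts emitted for each word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Word count over one tiny "split"
result = reduce_task(map_task("the cat saw the dog"))
```

In real MapReduce the pairs are partitioned by key between the two phases, so each reduce task only sees the keys assigned to it.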
  7. MapReduce components
     (Diagram: a Wordcount job distributed across task trackers TT1, TT2, and TT3)
  8. But there are more faults…
     • Problem: accidental faults may affect the correctness of MapReduce results
       – Task corruptions: memory errors, chipset errors, …
       – Cloud outages: MapReduce job interruptions (as reported in popular clouds)
     • Our goals:
       – Guarantee integrity and availability despite task corruptions and cloud outages
       – Develop a new model to run MapReduce in a cloud-of-clouds
       – Commercially feasible? Yes, but out of the scope of this presentation
     Tobias Kurze et al., Cloud federation. In Proceedings of the 2nd International Conference on Cloud Computing, GRIDs, and Virtualization (CLOUD COMPUTING 2011).
  9. Byzantine fault-tolerant MapReduce
     • Basic idea: replicate tasks in different clouds and vote on the results returned by the replicas
       – The set of clouds forms a cloud-of-clouds
       – Inputs are initially stored in all clouds (how they get there is not our problem)
     (Diagram: Cloud 1, Cloud 2, Cloud 3)
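The replicate-and-vote idea boils down to majority voting over replica results. A minimal sketch, not the authors' implementation (`vote` is a hypothetical helper): with up to t faulty clouds, a value returned by at least t+1 replicas must have come from at least one correct cloud.

```python
from collections import Counter

def vote(replica_results, t):
    """Accept a task result only if at least t+1 of the (up to 2t+1)
    replicas returned the same value; at least one of those t+1 copies
    is guaranteed to come from a correct cloud. Returns None otherwise."""
    value, n = Counter(replica_results).most_common(1)[0]
    return value if n >= t + 1 else None

# t = 1: one faulty cloud cannot outvote the two correct ones
accepted = vote(["42", "42", "corrupted"], t=1)
```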
  10. System model
      • The client is correct (it is not part of MapReduce)
      • Clouds: up to t clouds can arbitrarily corrupt all tasks and other modules they execute
      • Why use t and not f? Because t ≤ f
      • Next:
        – Basic BFT MapReduce scheme
        – Three problems of the basic scheme
        – Our approach: the full BFT MapReduce scheme
  11. MapReduce: map perspective
      (Diagram: the official MapReduce versus the cloud-of-clouds version, with map replicas in different clouds)
  12. MapReduce: reduce perspective
      (Diagram: the official MapReduce versus the cloud-of-clouds version, with reduce replicas in different clouds)
      But we can do better.
  13. Improvements over the basic version
      • Three problems have arisen:
        – Computation problem
        – Communication problem
        – Job execution control problem
      • Three solutions: our BFT MapReduce can be thought of as this basic version plus the following mechanisms:
        – Deferred execution (computation problem)
        – Digest communication (communication problem)
        – Distributed job tracker (job execution control problem)
  14. Problem 1: computation
      (Diagram: the same split 0 / part 0 processed by replicas in different clouds)
      Tasks are executed 2t+1 times.
  15. Solution 1: deferred execution
      • The computation problem is uncommon
      • The job tracker replicates tasks across t+1 clouds (the other t are kept on standby)
      • If the results differ or one cloud stops, request one more replica (up to t extra)
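The deferred-execution loop can be sketched as follows. This is our reading of the mechanism, not the authors' code; `run_replica` and the cloud list are hypothetical:

```python
from collections import Counter

def deferred_execution(run_replica, clouds, t):
    """Start the task in only t+1 clouds; if they disagree, launch one
    extra replica at a time (up to 2t+1 in total) until t+1 agree."""
    results = [run_replica(c) for c in clouds[:t + 1]]
    next_cloud = t + 1
    while True:
        value, n = Counter(results).most_common(1)[0]
        if n >= t + 1:
            return value  # t+1 matching results: accept
        if next_cloud >= min(len(clouds), 2 * t + 1):
            raise RuntimeError("no agreement after 2t+1 replicas")
        results.append(run_replica(clouds[next_cloud]))
        next_cloud += 1
```

In the common, fault-free case the loop exits right after the first t+1 executions, so the t standby clouds are never used and the 2t+1-fold execution cost of the basic scheme is avoided.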
  16. Problem 2: communication
      (Diagram: map outputs for split 0 / part 0 exchanged between replicas in different clouds)
      All this communication goes through the Internet (delay, cost)!
  17. Solution 2: transferring digests
      • Reduces must fetch the map task outputs
      • Intra-cloud fetch: the output is fetched normally
      • Inter-cloud fetch: only a hash of the output is fetched (the key idea)
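A reduce task can then validate the output it fetched within its own cloud against the digests fetched from the other clouds. A minimal sketch of that check; SHA-256 and the exact t+1 acceptance rule are our assumptions, not necessarily the paper's choices:

```python
import hashlib

def digest(data: bytes) -> str:
    """Fingerprint of a map output partition: a few dozen bytes cross
    the Internet instead of the partition itself."""
    return hashlib.sha256(data).hexdigest()

def output_validated(local_output: bytes, remote_digests, t: int) -> bool:
    """Accept the locally fetched output if its digest matches the
    digests reported by at least t other clouds (t+1 agreeing copies)."""
    d = digest(local_output)
    return 1 + sum(1 for r in remote_digests if r == d) >= t + 1
```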
  18. Problem 3: job execution control
      • The job tracker controls all task executions in the task trackers in all clouds
      • If the job tracker sits in one cloud, separated from many task trackers by the Internet:
        – Communication is slow
        – Large timeouts are needed to detect task tracker failures
        – …and it is a single point of failure (this is the case in MapReduce and Hadoop MapReduce)
  19. Solution 3: job execution control
      (Diagram: the client talks to a VJT; each cloud runs its own job tracker, which controls the task trackers in that cloud, and the job trackers are interconnected)
  20. Evaluation
  21. Setup and test
      Platform configuration:
      • 3 clouds
      • Each cloud has 3 nodes
      • 1 job tracker and 3 task trackers per cloud
      • All job trackers are interconnected
      Job submitted (Wordcount):
      • Input data: 26 chunks of 64 MB (1.5 GB in total)
      • Map tasks: 26
      • Reduce tasks: 120, 180, 360, 400
  22. Number of reduce tasks executed (no faults, t=1)

      Nr. Reduce tasks   Job duration (Official)   Job duration (CoC)   Diff
      120                00:15:35                  00:17:13             00:02:35
      180                00:19:35                  00:21:36             00:02:01
      360                00:31:12                  00:33:30             00:02:18
      400                00:33:37                  00:36:24             00:02:47
  23. Task details
      Official:                       BFT cloud-of-clouds (1 view):
      Map duration:    00:06:47       Map duration:    00:07:08
      Reduce duration: 00:13:18       Reduce duration: 00:14:46
  24. Conclusions
      • Our method guarantees integrity and availability despite task corruptions and cloud outages
      • BFT MapReduce in a cloud-of-clouds is feasible!
        – No need to execute in all 2t+1 clouds
        – Only digests are sent through the Internet (no "big data" transfers)
        – Job execution is controlled within each cloud
      Thank you