Hong Tang talks about Using Simulation for Large-scale Distributed System Verification and Debugging

Published in: Technology
1 Comment
  • Hi, I am a research student trying to install and configure Mumak on my own computer. Later I will simulate a Hadoop application with respect to data locality, since Mumak takes data locality into account.
    Could anyone please tell me where I can download the Mumak simulator and how to configure it on my computer?

    Many thanks


  1. 1. Mumak: Using Simulation for Large-scale Distributed System Verification and Debugging. Hong Tang, 2009.10, Hadoop User Group
  2. 2. Outline <ul><li>Motivations </li></ul><ul><li>Overview and Status </li></ul><ul><li>Architecture </li></ul><ul><li>Demo </li></ul><ul><li>Lessons and Experiences </li></ul><ul><li>Conclusions and Future Work </li></ul>
  3. 3. Motivations <ul><li>Large-scale distributed systems are hard to verify and debug </li></ul><ul><ul><li>Cannot afford a 2000-node cluster for every developer, feature enhancement, and bug fix </li></ul></ul><ul><ul><li>Time-consuming to run benchmarks </li></ul></ul><ul><ul><li>Hard to reproduce production workloads </li></ul></ul><ul><ul><li>Hard to reproduce corner-case conditions </li></ul></ul>
  4. 4. Motivations (cont.) <ul><li>JobTracker is a fertile area for experimentation </li></ul><ul><ul><li>Scheduling policies – we have four schedulers already </li></ul></ul><ul><ul><li>Synergy with HDFS block placement policies </li></ul></ul><ul><ul><li>Speculative execution policies </li></ul></ul><ul><ul><li>We want more people to help us innovate! </li></ul></ul><ul><li>But, JobTracker is too complex to modify correctly </li></ul><ul><ul><li>Many factors to consider: fairness, capacity/SLA guarantees, data locality, load balance, failure handling and recovery, etc </li></ul></ul><ul><ul><li>Many control knobs in current implementation with subtle interactions </li></ul></ul>
  5. 5. Mumak <ul><li>Discrete-event simulation </li></ul><ul><ul><li>Can simulate a cluster with thousands of nodes in one process </li></ul></ul><ul><ul><li>Does not perform actual IO or computation </li></ul></ul><ul><ul><li>Virtual clock “spins” faster than wall clock </li></ul></ul><ul><ul><li>Can reproduce behavior/performance with a degree of confidence </li></ul></ul><ul><li>Plugging in the real JobTracker and Scheduler </li></ul><ul><ul><li>No need to reimplement the scheduling policies </li></ul></ul><ul><ul><li>Inherits both features and bugs in JT and Scheduler </li></ul></ul><ul><li>Simulate all conditions of a production cluster </li></ul><ul><ul><li>Workload and cluster configuration generated by Rumen </li></ul></ul><ul><ul><li>Job submission, inter-arrival, dependencies, high-ram jobs, task execution </li></ul></ul><ul><ul><li>All kinds of failures and failure recovery logic </li></ul></ul><ul><ul><li>Resource contention </li></ul></ul>
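
The discrete-event idea on this slide can be illustrated with a minimal Java sketch. It is not Mumak's code: the class `MiniSimulation`, its `Event` type, and the methods `schedule`, `now`, and `run` are hypothetical names chosen for this example. The point it shows is that the virtual clock jumps straight to the timestamp of the next queued event, so a simulated two-hour task costs no wall-clock waiting.

```java
import java.util.PriorityQueue;

// Hypothetical names throughout; a sketch of discrete-event simulation, not Mumak's API.
final class MiniSimulation {
    // An event carries the virtual time at which it fires and the action to run.
    static final class Event implements Comparable<Event> {
        final long time;
        final long seq;        // tie-breaker: equal-time events fire in insertion order
        final Runnable action;

        Event(long time, long seq, Runnable action) {
            this.time = time;
            this.seq = seq;
            this.action = action;
        }

        @Override
        public int compareTo(Event other) {
            int byTime = Long.compare(time, other.time);
            return byTime != 0 ? byTime : Long.compare(seq, other.seq);
        }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private long now = 0;      // virtual time in milliseconds
    private long nextSeq = 0;

    long now() {
        return now;
    }

    // Queue an action to run delayMillis of virtual time from now.
    void schedule(long delayMillis, Runnable action) {
        if (delayMillis < 0) {
            throw new IllegalArgumentException("negative delay");
        }
        queue.add(new Event(now + delayMillis, nextSeq++, action));
    }

    // Process events in timestamp order until none remain; returns how many ran.
    int run() {
        int processed = 0;
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            now = e.time;      // jump the virtual clock; nothing ever sleeps
            e.action.run();
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        MiniSimulation sim = new MiniSimulation();
        // A task lasting two simulated hours, and a heartbeat after three simulated seconds.
        sim.schedule(2L * 60 * 60 * 1000, () -> System.out.println("task done at t=" + sim.now()));
        sim.schedule(3000, () -> System.out.println("heartbeat at t=" + sim.now()));
        sim.run();
    }
}
```

Actions may call `schedule` again from inside `run`, which is how a periodic heartbeat could be modeled in a sketch like this.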
  6. 6. Project Status <ul><li>Work-in-progress </li></ul><ul><li>First-cut version committed to Hadoop 0.21 </li></ul><ul><ul><li>Basic framework </li></ul></ul><ul><ul><li>Simplified task execution </li></ul></ul><ul><ul><li>No modeling of resource utilization or contention </li></ul></ul><ul><ul><li>Only individual task failures, no node failures or failure correlations </li></ul></ul><ul><ul><li>No job dependencies, nor speculative execution </li></ul></ul><ul><li>The Team </li></ul><ul><ul><li>Core devs: Arun Murthy, Anirban Dasgupta, Tamas Sarlos, Guanying Wang, Hong Tang </li></ul></ul><ul><ul><li>Collaborators: Dick King, Chris Douglas, Owen O’Malley </li></ul></ul>
  7. 7. Architecture [Diagram; labels recovered from the slide. Protocols: Client Protocol, InterTracker Protocol. Components: Simulation Engine, Event Queue, Simulated Job Tracker (Job Tracker, Sched), Simulated Job Client (Job Clients), Simulated Task Trackers (Task Trackers), Rumen, Cluster Story, Job Story Trace, Job Story Cache. Events: JobSubmissionEvent, HeartBeatEvent, TaskAttemptCompletionEvent, JobCompletionEvent. Other: Job Finalization.]
  8. 8. DEMO <ul><li>Build hadoop-mapreduce </li></ul><ul><ul><ul><li>% ant package </li></ul></ul></ul><ul><li>Run with checked-in traces </li></ul><ul><ul><ul><li>% cd build/hadoop-0.22.0-dev </li></ul></ul></ul><ul><ul><ul><li>% contrib/mumak/bin/ </li></ul></ul></ul><ul><ul><ul><li>src/contrib/mumak/src/test/data/19-jobs.trace.json.gz </li></ul></ul></ul><ul><ul><ul><li>src/contrib/mumak/src/test/data/19-jobs.topology.json.gz </li></ul></ul></ul>
  9. 9. Implementation Experience <ul><li>JobTracker is reasonably modular and amenable to a simulated environment </li></ul><ul><ul><li>RPC, Clock, DNS-Switch mapping are all interfaces </li></ul></ul><ul><ul><li>No sleep() in main JT code </li></ul></ul><ul><li>Usage of threads is localized and easy to factor out </li></ul><ul><ul><li>Asynchronous job initialization: made synchronous (AspectJ) </li></ul></ul><ul><li>Inheritance is necessary to extend/alter the behavior </li></ul><ul><ul><li>JobTracker, JobInProgress, LaunchTaskAction, TaskTrackerStatus </li></ul></ul><ul><ul><li>Convey extra information: virtual time, task execution time, etc. </li></ul></ul><ul><ul><li>Keeping up with changes to the base classes may be hard </li></ul></ul><ul><ul><ul><li>Example: A new variable added to JobTracker </li></ul></ul></ul><ul><li>Make the dependency between map & reduce tasks explicit </li></ul>
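
Why a clock behind an interface (and no sleep() calls) makes code easy to simulate can be shown with a small Java sketch. All names here (`TimeSource`, `WallTimeSource`, `VirtualTimeSource`, `HeartbeatMonitor`) are hypothetical and are not Hadoop's actual classes; the sketch only assumes that the component under test asks an injected object for the current time instead of reading the system clock directly.

```java
// Hypothetical names; a sketch of clock injection, not Hadoop's actual classes.
interface TimeSource {
    long nowMillis();
}

// Production flavor: delegates to the real system clock.
final class WallTimeSource implements TimeSource {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

// Simulation flavor: time moves only when the simulator says so.
final class VirtualTimeSource implements TimeSource {
    private long now;

    @Override
    public long nowMillis() {
        return now;
    }

    void advanceTo(long t) {
        if (t < now) {
            throw new IllegalArgumentException("time cannot move backwards");
        }
        now = t;
    }
}

// A component that decides whether a tracker's heartbeat is overdue.
// It never sleeps and never reads the system clock directly.
final class HeartbeatMonitor {
    private final TimeSource clock;
    private final long expiryMillis;
    private long lastHeartbeat;

    HeartbeatMonitor(TimeSource clock, long expiryMillis) {
        this.clock = clock;
        this.expiryMillis = expiryMillis;
        this.lastHeartbeat = clock.nowMillis();
    }

    void heartbeat() {
        lastHeartbeat = clock.nowMillis();
    }

    boolean isExpired() {
        return clock.nowMillis() - lastHeartbeat > expiryMillis;
    }

    public static void main(String[] args) {
        VirtualTimeSource clock = new VirtualTimeSource();
        HeartbeatMonitor monitor = new HeartbeatMonitor(clock, 10 * 60 * 1000L);
        clock.advanceTo(11 * 60 * 1000L);   // eleven simulated minutes pass instantly
        System.out.println("expired: " + monitor.isExpired());
    }
}
```

The same `HeartbeatMonitor` runs unchanged against `WallTimeSource` in production and `VirtualTimeSource` in a simulator, which is the property the slide attributes to the JobTracker.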
  10. 10. Mumak as a System Behavior Verifier
  11. 11. Mumak as a JobTracker Debugger <ul><li>MAPREDUCE-995: “ JobHistory should handle cases where task completion events are generated after job completion event ” </li></ul><ul><ul><li>Discovered when testing Mumak patch for 21 submission </li></ul></ul><ul><ul><li>Introduced by MAPREDUCE-157, committed one day earlier </li></ul></ul><ul><ul><li>Manifested as JobTracker crash due to IOException </li></ul></ul><ul><li>Root cause analysis </li></ul><ul><ul><li>Developer made a wrong assumption about the timing of events </li></ul></ul><ul><ul><ul><li>Assumed that when a job is marked as finished, no more heartbeat events related to the job would follow </li></ul></ul></ul><ul><ul><li>Led to a Closeable object being used after it was closed </li></ul></ul><ul><ul><li>To reproduce through benchmarking: need to inject a failed job and encounter “good” timing when an outstanding task completes after the job is marked as failed </li></ul></ul>
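
The event ordering behind this class of bug can be reproduced in a few lines of Java. The sketch below is illustrative only: `HistoryWriter` and `LateEventDemo` are hypothetical stand-ins, not Hadoop's JobHistory code, and the guarded handler is one possible defensive pattern rather than the fix actually applied for MAPREDUCE-995.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-job history writer; not Hadoop's JobHistory class.
final class HistoryWriter implements Closeable {
    private final List<String> lines = new ArrayList<>();
    private boolean closed;

    void write(String line) throws IOException {
        if (closed) {
            throw new IOException("write after close: " + line);
        }
        lines.add(line);
    }

    boolean isClosed() {
        return closed;
    }

    int size() {
        return lines.size();
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class LateEventDemo {
    // Unsafe handler: assumes no task event can arrive once the job is finished.
    static void onTaskCompletedUnsafe(HistoryWriter w, String taskId) throws IOException {
        w.write("TASK_FINISHED " + taskId);
    }

    // Defensive handler: drops events that arrive after the writer was closed.
    // Returns true if the event was recorded.
    static boolean onTaskCompletedSafe(HistoryWriter w, String taskId) throws IOException {
        if (w.isClosed()) {
            return false;
        }
        w.write("TASK_FINISHED " + taskId);
        return true;
    }

    public static void main(String[] args) throws IOException {
        HistoryWriter w = new HistoryWriter();
        onTaskCompletedUnsafe(w, "task_1");
        w.close();   // the job is marked as failed and its history is closed

        // An outstanding task completes afterwards.
        System.out.println("late event recorded? " + onTaskCompletedSafe(w, "task_2"));
        try {
            onTaskCompletedUnsafe(w, "task_2");
        } catch (IOException e) {
            System.out.println("unsafe handler failed: " + e.getMessage());
        }
    }
}
```

In a simulator the "close, then late completion" order can be scheduled deliberately, whereas on a real cluster it depends on hitting the right timing.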
  12. 12. Mumak as a JobTracker Profiling Benchmark <ul><li>Memory allocation pattern similar to real JobTracker, but at a much faster rate </li></ul><ul><li>Mumak overhead is less than 20-30% </li></ul><ul><li>Limitations: Cannot detect synchronization hotspots or sub-optimal IO or network operations </li></ul><ul><li>Findings through YourKit profiling </li></ul><ul><ul><li>Wasteful String concatenations in Log.debug() statements in mapred.ResourceEstimator.getEstimatedTotalMapOutputSize </li></ul></ul><ul><ul><li>Repetitive parsing of TaskTracker names to extract hostnames </li></ul></ul><ul><ul><li>Unnecessary exceptions from counter localization due to a removed properties file (regression introduced by H-5717) </li></ul></ul>
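
The first finding, string concatenation inside debug statements, is easy to demonstrate. The Java sketch below uses `java.util.logging` as a stand-in for whatever logging API the JobTracker used at the time, and `DebugLogCost`, `expensiveMessage`, and `quietLogger` are hypothetical names. It counts how often the message string is built while debug-level output is disabled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example; java.util.logging stands in for the JobTracker's logger.
final class DebugLogCost {
    static int builds = 0;   // how many times the message string was actually built

    static String expensiveMessage(long estimate) {
        builds++;
        return "estimated total map output size = " + estimate;
    }

    // A logger with FINE (debug-level) output switched off.
    static Logger quietLogger() {
        Logger log = Logger.getLogger("DebugLogCost");
        log.setLevel(Level.INFO);
        return log;
    }

    public static void main(String[] args) {
        Logger log = quietLogger();

        // Eager: the argument is evaluated first, then the logger discards it.
        log.fine(expensiveMessage(42));

        // Guarded: the message is built only if FINE is enabled.
        if (log.isLoggable(Level.FINE)) {
            log.fine(expensiveMessage(42));
        }

        // Lazy: the supplier runs only if FINE is enabled.
        log.fine(() -> expensiveMessage(42));

        System.out.println("message built " + builds + " time(s)");
    }
}
```

Only the eager call pays for the concatenation; in a hot path such as a per-heartbeat estimate, that wasted work shows up as allocation pressure in a profiler.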
  13. 13. Conclusions <ul><li>Mumak: A light-weight, versatile tool for MapReduce verification and debugging </li></ul><ul><ul><li>Verification of overall system behavior </li></ul></ul><ul><ul><li>A debugger for JobTracker / scheduler </li></ul></ul><ul><ul><li>A micro-benchmark to stress CPU and memory allocation </li></ul></ul><ul><ul><li>Strengths: </li></ul></ul><ul><ul><ul><li>Easy to set up and run </li></ul></ul></ul><ul><ul><ul><li>Faster than running a real benchmark: 1 min in Mumak ≈ 2 hrs on a 2000-node cluster </li></ul></ul></ul><ul><ul><ul><li>Realistically reproduces conditions and tests actual code </li></ul></ul></ul><ul><ul><ul><li>Can easily generate variants of the ordering of distributed events </li></ul></ul></ul><ul><ul><li>Limitations: No simulation of system services or threads </li></ul></ul><ul><ul><ul><li>Cannot debug synchronization problems among threads </li></ul></ul></ul><ul><ul><ul><li>Cannot reproduce OS-induced failures </li></ul></ul></ul>
  14. 14. What Next? <ul><li>Simulate more conditions </li></ul><ul><ul><li>Speculative execution </li></ul></ul><ul><ul><li>Resource contention </li></ul></ul><ul><ul><li>Node failures </li></ul></ul><ul><ul><li>Job dependencies </li></ul></ul><ul><li>Debug issues not resulting in hard-stop failures </li></ul><ul><ul><li>Fairness violation, starvation, utilization problems </li></ul></ul><ul><li>Patch validation: before-and-after comparison </li></ul><ul><ul><li>Making sure the patch does what it is supposed to do, and does not introduce negative side effects </li></ul></ul><ul><li>Use Mumak to stage unit tests </li></ul><ul><ul><li>Construct test cases by building synthetic job stories </li></ul></ul>
  15. 15. QUESTIONS?