Mumak

    Mumak - Presentation Transcript

    1. Mumak: Using Simulation for Large-scale Distributed System Verification and Debugging (Hong Tang, 2009.10, Hadoop User Group)
    2. Outline
      • Motivations
      • Overview and Status
      • Architecture
      • Demo
      • Lessons and Experiences
      • Conclusions and Future Work
    3. Motivations
      • Large-scale distributed systems are hard to verify and debug
        • Cannot afford a 2000-node cluster for every developer, feature enhancement, and bug fix
        • Time consuming to run benchmarks
        • Hard to reproduce production workload
        • Hard to reproduce corner case conditions
    4. Motivations (cont.)
      • JobTracker is a fertile area for experimentation
        • Scheduling policies – we have four schedulers already
        • Synergy with HDFS block placement policies
        • Speculative execution policies
        • We want more people to help us innovate!
      • But, JobTracker is too complex to modify correctly
        • Many factors to consider: fairness, capacity/SLA guarantees, data locality, load balance, failure handling and recovery, etc
        • Many control knobs in current implementation with subtle interactions
    5. Mumak
      • Discrete-event simulation
        • Can simulate a cluster with thousands of nodes in one process
        • Does not perform actual IO or computation
        • Virtual clock “spins” faster than wall clock
        • Can reproduce behavior/performance with a degree of confidence
      • Plugging in the real JobTracker and Scheduler
        • No need to reimplement the scheduling policies
        • Inherit both features and bugs in JT and Scheduler
      • Simulate all conditions of a production cluster
        • Workload and cluster configuration generated by Rumen
        • Job submission, inter-arrival, dependencies, high-ram jobs, task exec
        • All kinds of failures and failure recovery logic
        • Resource contention
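The discrete-event idea on this slide can be illustrated with a minimal sketch. It is not Mumak's actual code: the class `DiscreteEventSketch`, its `Event` type, and the event labels are hypothetical names chosen for illustration. Events sit in a queue ordered by virtual time, and the loop jumps the virtual clock straight to the next event instead of sleeping, which is why an hour of cluster time can be replayed almost instantly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a discrete-event loop; not Mumak's actual classes. */
public class DiscreteEventSketch {

    /** An event scheduled to fire at a given virtual time (milliseconds). */
    static final class Event implements Comparable<Event> {
        final long time;
        final long seq;     // insertion order, breaks ties deterministically
        final String label;

        Event(long time, long seq, String label) {
            this.time = time;
            this.seq = seq;
            this.label = label;
        }

        @Override
        public int compareTo(Event other) {
            int byTime = Long.compare(time, other.time);
            return byTime != 0 ? byTime : Long.compare(seq, other.seq);
        }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private long virtualNow = 0;
    private long nextSeq = 0;

    /** Adds an event; scheduling before the current virtual time is rejected. */
    void schedule(long time, String label) {
        if (time < virtualNow) {
            throw new IllegalArgumentException("cannot schedule in the past");
        }
        queue.add(new Event(time, nextSeq++, label));
    }

    long now() {
        return virtualNow;
    }

    /** Drains the queue in time order and returns "time:label" for each event. */
    List<String> run() {
        List<String> processed = new ArrayList<>();
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            virtualNow = e.time; // jump the clock; no sleeping, no real IO
            processed.add(virtualNow + ":" + e.label);
        }
        return processed;
    }

    public static void main(String[] args) {
        DiscreteEventSketch sim = new DiscreteEventSketch();
        sim.schedule(3_600_000L, "job-completion");
        sim.schedule(0L, "job-submission");
        sim.schedule(3_000L, "heartbeat");
        System.out.println(sim.run());
        System.out.println("virtual ms elapsed: " + sim.now());
    }
}
```

In a fuller simulator each event handler would typically schedule follow-up events (for example, the next heartbeat), but the clock-jumping loop stays the same.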
    6. Project Status
      • Work-in-progress
      • First-cut version committed to Hadoop 0.21
        • Basic framework
        • Simplified task execution
        • No modeling of resource utilization or contention
        • Only individual task failures, no node failures or failure correlations
        • No job dependencies, nor speculative execution
      • The Team
        • Core devs: Arun Murthy, Anirban Dasgupta, Tamas Sarlos, Guanying Wang, Hong Tang
        • Collaborators: Dick King, Chris Douglas, Owen O’Malley
    7. Architecture
      • [Architecture diagram; labels recovered from the figure]
      • Components: Simulation Engine, Event Queue, Simulated Job Tracker, Job Tracker, Sched, Simulated Task Tracker, Task Tracker, Simulated Job Client, Job Client
      • Rumen: Cluster Story, Job Story Trace, Job Story Cache
      • Protocols: Client Protocol, InterTracker Protocol
      • Events: JobSubmissionEvent, HeartBeatEvent, TaskAttemptCompletionEvent, JobCompletionEvent, Job Finalization
    8. DEMO
      • Build hadoop-mapreduce
          • % ant package
      • Run with checked-in traces
          • % cd build/hadoop-0.22.0-dev
          • % contrib/mumak/bin/mumak.sh src/contrib/mumak/src/test/data/19-jobs.trace.json.gz src/contrib/mumak/src/test/data/19-jobs.topology.json.gz
    9. Implementation Experience
      • JobTracker is reasonably modular and amenable to a simulated environment
        • RPC, Clock, DNS-Switch mapping are all interfaces
        • No sleep() in main JT code
      • Usage of threads is localized and easy to factor out
        • Asynchronous job initialization: made synchronous (AspectJ)
      • Inheritance is necessary to extend/alter the behavior
        • JobTracker, JobInProgress, LaunchTaskAction, TaskTrackerStatus
        • Convey extra information: virtual time, task execution time, etc
        • Keeping up with changes to the base classes may be hard
          • Example: A new variable added to JobTracker
      • Make the dependency between map and reduce tasks explicit
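The point about the clock being an interface and the main code never calling sleep() can be sketched as follows. This is a hedged illustration, not Hadoop's API: `ClockSketch`, `Clock`, `WallClock`, `VirtualClock`, and `HeartbeatMonitor` are hypothetical names. A component that only ever asks an injected clock for the time can run unchanged against a simulator-driven clock.

```java
/** Hypothetical sketch of a pluggable clock; names are not Hadoop's actual API. */
public class ClockSketch {

    /** The only way components in this sketch learn the current time. */
    interface Clock {
        long now();
    }

    /** Production-style clock backed by the system time. */
    static final class WallClock implements Clock {
        @Override
        public long now() {
            return System.currentTimeMillis();
        }
    }

    /** Simulation clock that only moves when the simulator advances it. */
    static final class VirtualClock implements Clock {
        private long now = 0;

        void advanceTo(long time) {
            if (time < now) {
                throw new IllegalArgumentException("virtual time cannot go backwards");
            }
            now = time;
        }

        @Override
        public long now() {
            return now;
        }
    }

    /** Stand-in for time-dependent logic, such as expiring a silent tracker. */
    static final class HeartbeatMonitor {
        private final Clock clock;
        private final long timeoutMs;
        private long lastHeartbeat;

        HeartbeatMonitor(Clock clock, long timeoutMs) {
            this.clock = clock;
            this.timeoutMs = timeoutMs;
            this.lastHeartbeat = clock.now();
        }

        void heartbeat() {
            lastHeartbeat = clock.now();
        }

        boolean isExpired() {
            return clock.now() - lastHeartbeat > timeoutMs;
        }
    }

    public static void main(String[] args) {
        VirtualClock clock = new VirtualClock();
        HeartbeatMonitor monitor = new HeartbeatMonitor(clock, 600_000L);
        clock.advanceTo(600_001L); // ten virtual minutes pass instantly
        System.out.println("expired: " + monitor.isExpired());
    }
}
```

Because the monitor never sleeps or reads the system time directly, swapping `WallClock` for `VirtualClock` requires no change to its logic, which is the property the slide credits for making the JobTracker amenable to simulation.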
    10. Mumak as a System Behavior Verifier
    11. Mumak as a JobTracker Debugger
      • MAPREDUCE-995: “ JobHistory should handle cases where task completion events are generated after job completion event ”
        • Discovered when testing the Mumak patch for submission to Hadoop 0.21
        • Introduced by MAPREDUCE-157, committed one day earlier
        • Manifested as JobTracker crash due to IOException
      • Root cause analysis
        • Developer made a wrong assumption about the timing of events
          • Assumed that when a job is marked as finished, no more heartbeat events related to the job would follow
        • Led to a Closeable object being used after it was closed
        • To reproduce through benchmarking: need to inject a failed job and encounter “good” timing when an outstanding task completes after the job is marked as failed
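The failure pattern described above (a task completion event arriving after the job has been finalized and its writer closed) can be sketched in a few lines. This is an illustration only: `LateEventSketch`, `HistoryWriter`, and both `record*` methods are hypothetical, and the guard shown is one possible way to tolerate late events, not necessarily the fix applied in MAPREDUCE-995.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a use-after-close caused by a late event. */
public class LateEventSketch {

    /** Stand-in for a per-job history writer that is closed at job completion. */
    static final class HistoryWriter implements Closeable {
        private final List<String> lines = new ArrayList<>();
        private boolean closed = false;

        void write(String line) throws IOException {
            if (closed) {
                throw new IOException("writer already closed");
            }
            lines.add(line);
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }

        List<String> lines() {
            return lines;
        }
    }

    /** Fragile handler: assumes no task event can follow job completion. */
    static void recordUnguarded(HistoryWriter writer, String event) throws IOException {
        writer.write(event);
    }

    /** Tolerant handler: drops events that arrive after the writer is closed. */
    static boolean recordGuarded(HistoryWriter writer, String event) throws IOException {
        if (writer.isClosed()) {
            return false; // late event, job already finalized
        }
        writer.write(event);
        return true;
    }

    public static void main(String[] args) throws IOException {
        HistoryWriter writer = new HistoryWriter();
        recordGuarded(writer, "task_1 completed");
        writer.close(); // job marked as failed and finalized
        System.out.println("late event recorded: "
            + recordGuarded(writer, "task_2 completed"));
        try {
            recordUnguarded(writer, "task_2 completed");
        } catch (IOException e) {
            System.out.println("unguarded handler failed: " + e.getMessage());
        }
    }
}
```

In a simulator the ordering "job finalized, then task completion" can be produced deterministically by scheduling the two events at chosen virtual times, instead of waiting for lucky timing on a real cluster.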
    12. Mumak as a JobTracker Profiling Benchmark
      • Memory allocation pattern similar to real JobTracker, but at much faster rate
      • Mumak overhead is less than 20-30%
      • Limitations: Cannot detect synchronization hotspots or sub-optimal IO or network operations
      • Findings through YourKit profiling
        • Wasteful String concatenations in Log.debug() statements in mapred.ResourceEstimator.getEstimatedTotalMapOutputSize
        • Repetitive parsing of TaskTracker names to extract hostnames
        • Unnecessary exceptions from counter localization due to a removed properties file (regression introduced by H-5717)
    13. Conclusions
      • Mumak: A light-weight, versatile tool for MapReduce verification and debugging
        • Verification of overall system behavior
        • A debugger for JobTracker / scheduler
        • A micro-benchmark to stress CPU and memory allocation
        • Strengths:
          • Easy to set up and run
          • Faster than running a real benchmark: 1 min of simulation ≈ 2 hrs on a 2000-node cluster
          • Realistically reproduces conditions and tests actual code
          • Can easily generate variants of ordering of distributed events
        • Limitations: No simulation of system services or threads
          • Cannot debug synchronization problems among threads
          • Cannot reproduce OS-induced failures
    14. What Next?
      • Simulate more conditions
        • Speculative execution
        • Resource contention
        • Node failures
        • Job dependencies
      • Debug issues not resulting in hard-stop failures
        • Fairness violation, starvation, utilization problems
      • Patch validation: before-and-after comparison
        • Making sure the patch does what it is supposed to do and does not introduce negative side effects
      • Use Mumak to stage unit tests
        • Construct testcases by building synthetic job stories
    15. QUESTIONS?