HW09 Hadoop Vaidya
Transcript

  • 1. Hadoop Vaidya Viraj Bhat ( [email_address] ) Suhas Gogate ( [email_address] ) Milind Bhandarkar ( [email_address] ) Cloud Computing & Data Infrastructure Group, Yahoo! Inc. Hadoop World October 2, 2009
  • 2. Hadoop & Job Optimization: Why?
    • Hadoop is a highly configurable commodity cluster computing framework
      • Performance tuning of Hadoop jobs is a significant challenge!
        • 165+ tunable parameters
        • Tuning one parameter can adversely affect others
    • Hadoop Job Optimization
      • Job Performance – User perspective
        • Reduce end-to-end execution time
        • Yield quicker analysis of data
      • Cluster Utilization – Provider perspective
        • Efficient sharing of cluster resources across multiple users
        • Increase overall throughput in terms of number of jobs/unit time
  • 3. Hadoop Vaidya -- A Rule-Based Performance Diagnostics Tool
    • Rule-based performance diagnosis of Map/Reduce (M/R) jobs
      • M/R performance analysis expertise is captured and provided as an input through a set of pre-defined diagnostic rules
      • Detects performance problems through postmortem analysis of a job, executing the diagnostic rules against its execution counters
      • Provides targeted advice for each detected performance problem
    • Extensible framework
      • You can add your own rules based on a rule template and the published job-counter data structures
      • Write complex rules by composing existing simpler rules
    Vaidya: an expert (versed in his own profession, esp. in medical science), skilled in the art of healing; a physician
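The real rules are Java classes in the Hadoop contrib tree; as a rough illustration of the rule-template idea, here is a small Python sketch. All class, method, and counter names below are hypothetical stand-ins, not the actual Vaidya API.

```python
# Hypothetical sketch of a rule-based diagnostic framework; class,
# method, and counter names are illustrative, not the actual
# org.apache.hadoop.vaidya API.

class DiagnosticRule:
    """A rule evaluates post-execution job counters to an impact score
    in [0, 1]; scores above the success threshold trip the rule."""

    title = "base rule"
    success_threshold = 0.0
    prescription = ""

    def evaluate(self, counters):
        raise NotImplementedError

    def diagnose(self, counters):
        impact = self.evaluate(counters)
        failed = impact > self.success_threshold
        return {
            "title": self.title,
            "result": "POSITIVE(FAILED)" if failed else "NEGATIVE(PASSED)",
            "severity": impact,
            "prescription": self.prescription if failed else "",
        }


class BalancedReducePartitioning(DiagnosticRule):
    title = "Balanced Reduce Partitioning"
    success_threshold = 0.20
    prescription = "Use an appropriate partitioning function"

    def evaluate(self, counters):
        # Impact approaches 1 as fewer reducers handle the bulk (85%)
        # of the reduce input records.  Assumed formula, not Vaidya's.
        busy = counters["busy_reduce_tasks"]
        total = counters["total_reduce_tasks"]
        return 1.0 - busy / total


report = BalancedReducePartitioning().diagnose(
    {"busy_reduce_tasks": 2, "total_reduce_tasks": 1000})
```

With 2 of 1000 reducers doing 85% of the work, the sketched impact is 0.998, well over the 0.20 threshold, so the rule reports POSITIVE(FAILED), in the spirit of the report element on slide 6.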
  • 4. Hadoop Vaidya: Status
    • Input Data used for evaluating the rules
      • Job History, Job Configuration (xml)
    • A Contrib project under Apache Hadoop
      • Available in Hadoop version 0.20.0
      • http://issues.apache.org/jira/browse/HADOOP-4179
    • Automated deployment for analysis of thousands of daily jobs on the Yahoo! Grids
      • Helps quickly identify inefficient user jobs that consume excessive resources and advise their owners appropriately
      • Helps certify user jobs before moving to production clusters (compliance)
  • 5. Diagnostic Test Rule
    <DiagnosticTest>
      <Title> Balanced Reduce Partitioning </Title>
      <ClassName>
        org.apache.hadoop.vaidya.postexdiagnosis.tests.BalancedReducePartitioning
      </ClassName>
      <Description>
        This rule tests as to how well the input to reduce tasks is balanced
      </Description>
      <Importance> High </Importance>
      <SuccessThreshold> 0.20 </SuccessThreshold>
      <Prescription> advice </Prescription>
      <InputElement>
        <PercentReduceRecords> 0.85 </PercentReduceRecords>
      </InputElement>
    </DiagnosticTest>
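A descriptor like the one above is plain XML, so it can be loaded with any XML parser. Here is a minimal standalone Python sketch (not the actual Vaidya loader) that extracts the fields driving the test:

```python
import xml.etree.ElementTree as ET

# Inline copy of the descriptor from the slide, trimmed for brevity.
RULE_XML = """
<DiagnosticTest>
  <Title> Balanced Reduce Partitioning </Title>
  <ClassName>
    org.apache.hadoop.vaidya.postexdiagnosis.tests.BalancedReducePartitioning
  </ClassName>
  <Importance> High </Importance>
  <SuccessThreshold> 0.20 </SuccessThreshold>
  <InputElement>
    <PercentReduceRecords> 0.85 </PercentReduceRecords>
  </InputElement>
</DiagnosticTest>
"""

root = ET.fromstring(RULE_XML)
rule = {
    "title": root.findtext("Title").strip(),
    "class_name": root.findtext("ClassName").strip(),
    "importance": root.findtext("Importance").strip(),
    # float() tolerates the surrounding whitespace in the element text.
    "threshold": float(root.findtext("SuccessThreshold")),
    "percent_reduce_records": float(
        root.findtext("InputElement/PercentReduceRecords")),
}
```

In the real tool the `ClassName` would then be instantiated reflectively and handed the `InputElement` parameters; this sketch stops at parsing.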
  • 6. Diagnostic Report Element
    <TestReportElement>
      <TestTitle> Balanced Reduce Partitioning </TestTitle>
      <TestDescription>
        This rule tests as to how well the input to reduce tasks is balanced
      </TestDescription>
      <TestImportance> HIGH </TestImportance>
      <TestResult> POSITIVE(FAILED) </TestResult>
      <TestSeverity> 0.98 </TestSeverity>
      <ReferenceDetails>
        * TotalReduceTasks: 1000
        * BusyReduceTasks processing 85% of total records: 2
        * Impact: 0.98
      </ReferenceDetails>
      <TestPrescription>
        * Use an appropriate partitioning function
        * For streaming jobs, consider the following partitioner and Hadoop config parameters:
        * org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
        * -jobconf stream.map.output.field.separator, -jobconf stream.num.map.output.key.fields
      </TestPrescription>
    </TestReportElement>
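The "BusyReduceTasks processing 85% of total records" figure in the report can be computed by sorting reducers by input record count and counting how many are needed to cover 85% of the total. A Python sketch of that idea (the actual Vaidya counter logic may differ):

```python
def busy_reducers(records_per_reducer, percent=0.85):
    """Smallest number of reducers that together account for `percent`
    of all reduce input records -- a sketch of the 'BusyReduceTasks'
    figure above; the real Vaidya computation may differ."""
    needed = percent * sum(records_per_reducer)
    running = 0
    for count, records in enumerate(
            sorted(records_per_reducer, reverse=True), start=1):
        running += records
        if running >= needed:
            return count
    return len(records_per_reducer)


# A badly skewed partitioning: 2 of 1000 reducers hold ~90% of records.
skewed = [45_000, 40_000] + [10] * 998
```

Here `busy_reducers(skewed)` returns 2, matching the pathological case in the report; for a perfectly uniform partition the count would instead be roughly 85% of the reducers, and the rule would pass.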
  • 7. Hadoop Vaidya Rules - Examples
    • Balanced Reduce Partitioning
      • Checks if intermediate data is well partitioned among reducers.
    • Map/Reduce tasks reading HDFS files as a side effect
      • Checks if HDFS files are being read as a side effect, creating an access bottleneck across map/reduce tasks
    • Percent Re-execution of Map/Reduce tasks
    • Map tasks data locality
      • Checks the percentage of map tasks scheduled local to their input data
    • Use of Combiner & Combiner efficiency
      • Checks whether there is potential benefit in using a combiner after the map stage
    • Intermediate data compression
      • Checks if intermediate data is compressed to lower the shuffle time
    • Currently there are 15 rules
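Two of these checks can be sketched over plain counter values and job configuration. The formulas and the 1 GB threshold below are illustrative assumptions, not the exact Vaidya rules; `mapred.compress.map.output` is the real pre-0.21 Hadoop property name for intermediate compression.

```python
def combiner_potential(map_output_records, reduce_input_records):
    """Fraction of map output records eliminated before the reduce
    stage (e.g. by a combiner).  If this stays near 0 and map output
    is large, adding a combiner may help.  Illustrative formula only;
    the real Vaidya rule may differ."""
    if map_output_records == 0:
        return 0.0
    return 1.0 - reduce_input_records / map_output_records


def should_compress_intermediate(conf, map_output_bytes, threshold=1 << 30):
    """Advise enabling intermediate compression when map output is
    uncompressed and the shuffled volume is large.  The 1 GB default
    threshold is an arbitrary example value, not Vaidya's."""
    compressed = conf.get("mapred.compress.map.output", "false") == "true"
    return (not compressed) and map_output_bytes > threshold
```

For example, a job whose reduce input equals its map output gets a combiner potential of 0 (nothing collapsed yet), while an uncompressed job shuffling 2 GB trips the compression check.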
  • 8. Performance Analysis for a sample set of jobs (chart of results per Vaidya rule; total jobs analyzed = 794)
  • 9. Future Enhancements
    • Online progress analysis of the Map/Reduce jobs to improve utilization
    • Correlation of various prescriptions suggested by Hadoop Vaidya to detect larger performance bottlenecks
    • Proactive SLA monitoring
      • Detect jobs that are executing inefficiently, or that will eventually fail due to resource constraints, early enough to intervene
    • Integration with the Job History viewer
    • Production Job Certification
  • 11. Results of Hadoop Vaidya
    • Total jobs analyzed = 22602
    • Rules that yielded POSITIVE (TEST FAILED)
      • Balanced Reduce Partitioning (4247 jobs / 18.79%)
      • Impact of Map tasks re-execution (1 job)
      • Impact of Reduce tasks re-execution (8 jobs)
      • Map/Reduce tasks reading HDFS data as a side effect (20,570 jobs / 91%)
      • Map side disk spill (864 jobs / 3.8%)
