HW09 Hadoop Vaidya





  1. Hadoop Vaidya
     Viraj Bhat, Suhas Gogate, Milind Bhandarkar
     Cloud Computing & Data Infrastructure Group, Yahoo! Inc.
     Hadoop World, October 2, 2009
  2. Hadoop & Job Optimization: Why?
     - Hadoop is a highly configurable commodity cluster computing framework
       - Performance tuning of Hadoop jobs is a significant challenge!
         - 165+ tunable parameters
         - Tuning one parameter adversely affects others
     - Hadoop Job Optimization
       - Job Performance: user perspective
         - Reduce end-to-end execution time
         - Yield quicker analysis of data
       - Cluster Utilization: provider perspective
         - Efficient sharing of cluster resources across multiple users
         - Increase overall throughput in terms of number of jobs per unit time
  3. Hadoop Vaidya: Rule-Based Performance Diagnostics Tool
     - Rule-based performance diagnosis of M/R jobs
       - M/R performance analysis expertise is captured and provided as input through a set of pre-defined diagnostic rules
       - Detects performance problems through postmortem analysis of a job, executing the diagnostic rules against the job execution counters
       - Provides targeted advice for individual performance problems
     - Extensible framework
       - You can add your own rules
         - based on a rule template and the published job counter data structures
       - Write complex rules using existing simpler rules

     Vaidya: an expert (versed in his own profession, esp. in medical science), skilled in the art of healing, a physician
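The rule-template idea on this slide can be illustrated with a minimal Python sketch. It is not the actual Vaidya API (the real rules are Java classes inside Hadoop); the class names, the counter keys, and the `NEGATIVE(PASSED)` label are all assumptions made for illustration. It shows a base template, one simple rule, and a composite rule built from simpler ones.

```python
from abc import ABC, abstractmethod


class DiagnosticRule(ABC):
    """Hypothetical rule template (illustrative only, not the Vaidya Java API)."""

    def __init__(self, title, success_threshold):
        self.title = title
        self.success_threshold = success_threshold

    @abstractmethod
    def impact(self, counters):
        """Return a value in [0, 1]; higher means a more serious problem."""

    def evaluate(self, counters):
        value = self.impact(counters)
        # Assumption: the test is reported as failed once the impact
        # reaches the success threshold; the "passed" label is assumed.
        failed = value >= self.success_threshold
        result = "POSITIVE(FAILED)" if failed else "NEGATIVE(PASSED)"
        return {"title": self.title, "impact": value, "result": result}


class ReexecutionRule(DiagnosticRule):
    """Simple rule: fraction of launched tasks that were re-executed."""

    def __init__(self, title, success_threshold, kind):
        super().__init__(title, success_threshold)
        self.kind = kind  # "maps" or "reduces" (hypothetical counter suffix)

    def impact(self, counters):
        launched = counters["launched_" + self.kind]
        reexecuted = counters["reexecuted_" + self.kind]
        return reexecuted / launched if launched else 0.0


class WorstOfRule(DiagnosticRule):
    """Composite rule: reports the worst impact among simpler rules."""

    def __init__(self, title, success_threshold, rules):
        super().__init__(title, success_threshold)
        self.rules = rules

    def impact(self, counters):
        return max(rule.impact(counters) for rule in self.rules)


# Made-up counters for a single job.
counters = {
    "launched_maps": 10, "reexecuted_maps": 5,
    "launched_reduces": 4, "reexecuted_reduces": 0,
}
map_rule = ReexecutionRule("Map re-execution", 0.2, "maps")
reduce_rule = ReexecutionRule("Reduce re-execution", 0.2, "reduces")
combined = WorstOfRule("Task re-execution", 0.2, [map_rule, reduce_rule])
report = combined.evaluate(counters)
```

Keeping `impact` separate from `evaluate` lets a composite rule reuse the raw scores of simpler rules without re-applying their thresholds.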
  4. Hadoop Vaidya: Status
     - Input data used for evaluating the rules
       - Job History, Job Configuration (xml)
     - A contrib project under Apache Hadoop
       - Available in Hadoop version 0.20.0
       - http://issues.apache.org/jira/browse/HADOOP-4179
     - Automated deployment for analysis of thousands of daily jobs on the Yahoo! Grids
       - Helps quickly identify inefficient user jobs that consume more resources, so that their owners can be advised appropriately
       - Helps certify user jobs before moving to production clusters (compliance)
  5. Diagnostic Test Rule

         <DiagnosticTest>
           <Title>Balanced Reduce Partitioning</Title>
           <ClassName>org.apache.hadoop.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
           <Description>This rule tests as to how well the input to reduce tasks is balanced</Description>
           <Importance>High</Importance>
           <SuccessThreshold>0.20</SuccessThreshold>
           <Prescription>advice</Prescription>
           <InputElement>
             <PercentReduceRecords>0.85</PercentReduceRecords>
           </InputElement>
         </DiagnosticTest>
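To make the structure of such a rule definition concrete, here is a small Python sketch that reads the fields of one `<DiagnosticTest>` element using the standard library. The function name `load_rule` and the shape of the returned dictionary are illustrative assumptions; the tool itself reads these files in Java.

```python
import xml.etree.ElementTree as ET

# The rule definition from the slide, reproduced as a string.
RULE_XML = """
<DiagnosticTest>
  <Title>Balanced Reduce Partitioning</Title>
  <ClassName>org.apache.hadoop.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
  <Description>This rule tests as to how well the input to reduce tasks is balanced</Description>
  <Importance>High</Importance>
  <SuccessThreshold>0.20</SuccessThreshold>
  <Prescription>advice</Prescription>
  <InputElement>
    <PercentReduceRecords>0.85</PercentReduceRecords>
  </InputElement>
</DiagnosticTest>
"""


def load_rule(xml_text):
    """Hypothetical helper: turn one <DiagnosticTest> element into a dict."""
    root = ET.fromstring(xml_text.strip())
    # Every child of <InputElement> is treated as a numeric rule parameter
    # (an assumption that holds for the example on the slide).
    inputs = {
        child.tag: float(child.text)
        for child in root.find("InputElement")
    }
    return {
        "title": root.findtext("Title").strip(),
        "class_name": root.findtext("ClassName").strip(),
        "importance": root.findtext("Importance").strip(),
        "success_threshold": float(root.findtext("SuccessThreshold")),
        "inputs": inputs,
    }


rule = load_rule(RULE_XML)
```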
  6. Diagnostic Report Element

         <TestReportElement>
           <TestTitle>Balanced Reduce Partitioning</TestTitle>
           <TestDescription>This rule tests as to how well the input to reduce tasks is balanced</TestDescription>
           <TestImportance>HIGH</TestImportance>
           <TestResult>POSITIVE(FAILED)</TestResult>
           <TestSeverity>0.98</TestSeverity>
           <ReferenceDetails>
             * TotalReduceTasks: 1000
             * BusyReduceTasks processing 85% of total records: 2
             * Impact: 0.98
           </ReferenceDetails>
           <TestPrescription>
             * Use the appropriate partitioning function
             * For streaming job consider following partitioner and hadoop config parameters
             * org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
             * -jobconf stream.map.output.field.separator, -jobconf stream.num.map.output.key.fields
           </TestPrescription>
         </TestReportElement>
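The reference details in this report suggest how a partitioning-balance check might work: count how many of the busiest reduce tasks are needed to cover a given share of the records, then score the skew. The Python sketch below is one plausible formulation under that assumption. The slide does not give the exact formula the tool uses, so the function name, the `1 - busy/total` score, and the sample data are illustrative and will not necessarily reproduce the 0.98 shown above.

```python
def balanced_reduce_impact(records_per_reduce, percent_reduce_records=0.85):
    """Return (busy_tasks, impact) for a list of per-reducer input record counts.

    busy_tasks is the smallest number of reducers, taken from busiest to
    least busy, that together process at least percent_reduce_records of
    all records. The impact formula is an assumption, not taken from the
    slide: the fewer reducers carry the load, the closer it gets to 1.
    """
    total_tasks = len(records_per_reduce)
    total_records = sum(records_per_reduce)
    if total_tasks == 0 or total_records == 0:
        return 0, 0.0

    target = percent_reduce_records * total_records
    covered = 0
    busy_tasks = 0
    for records in sorted(records_per_reduce, reverse=True):
        covered += records
        busy_tasks += 1
        if covered >= target:
            break

    impact = 1.0 - busy_tasks / total_tasks
    return busy_tasks, impact


# Made-up data: one reducer receives most of the records.
skewed = balanced_reduce_impact([900, 40, 30, 30])
# Made-up data: records are spread evenly across reducers.
balanced = balanced_reduce_impact([250, 250, 250, 250])

# With the rule's SuccessThreshold of 0.20, the skewed job would be flagged.
SUCCESS_THRESHOLD = 0.20
skewed_fails = skewed[1] >= SUCCESS_THRESHOLD
balanced_fails = balanced[1] >= SUCCESS_THRESHOLD
```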
  7. Hadoop Vaidya Rules: Examples
     - Balanced Reduce Partitioning
       - Checks if intermediate data is well partitioned among reducers
     - Map/Reduce tasks reading HDFS files as a side effect
       - Checks if HDFS files are being read as a side effect, causing an access bottleneck across map/reduce tasks
     - Percent re-execution of Map/Reduce tasks
     - Map tasks data locality
       - Checks the % data locality for Map tasks
     - Use of Combiner & Combiner efficiency
       - Checks if there is potential benefit in using a combiner after the map stage
     - Intermediate data compression
       - Checks if intermediate data is compressed to lower the shuffle time
     - Currently there are 15 rules
  8. Performance Analysis for a Sample Set of Jobs
     [Chart not reproduced; axis label: Vaidya Rules; total jobs analyzed = 794]
  9. Future Enhancements
     - Online progress analysis of Map/Reduce jobs to improve utilization
     - Correlation of the various prescriptions suggested by Hadoop Vaidya to detect larger performance bottlenecks
     - Proactive SLA monitoring
       - Detect, early enough, jobs that are executing inefficiently or that would eventually fail due to resource constraints
     - Integration with the Job History viewer
     - Production job certification
  10. Results of Hadoop Vaidya
      - Total jobs analyzed = 22602
      - Rules which yielded POSITIVE (TEST FAILED)
        - Balanced Reduce Partitioning (4247 jobs / 18.79%)
        - Impact of Map tasks re-execution (1 job)
        - Impact of Reduce tasks re-execution (8 jobs)
        - Map/Reduce tasks reading HDFS data as a side effect (20570 jobs / 91%)
        - Map side disk spill (864 jobs / 3.8%)
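The percentages quoted on this slide follow directly from the job counts. A short Python sketch of the arithmetic, using only the numbers given above (the variable and function names are illustrative):

```python
TOTAL_JOBS = 22602

# Number of jobs for which each rule yielded POSITIVE (test failed).
FAILED_JOBS = {
    "Balanced Reduce Partitioning": 4247,
    "Impact of Map tasks re-execution": 1,
    "Impact of Reduce tasks re-execution": 8,
    "Map/Reduce tasks reading HDFS data as a side effect": 20570,
    "Map side disk spill": 864,
}


def percent_of_total(count, total=TOTAL_JOBS):
    """Share of analyzed jobs, as a percentage."""
    return 100.0 * count / total


for rule_name, count in FAILED_JOBS.items():
    print(f"{rule_name}: {count} jobs ({percent_of_total(count):.2f}%)")
```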