Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster slow?, Bikas Saha, Software Engineer, Hortonworks

This talk draws on our experience debugging and analyzing Hadoop jobs to describe methodical approaches to the problem and to present current and new tracing and tooling ideas that can help semi-automate parts of it.

  1. Why is my Hadoop* job slow?
     Bikas Saha, @bikassaha
     *Apache Hadoop, Falcon, Atlas, Tez, Sqoop, Flume, Kafka, Pig, Hive, HBase, Accumulo, Storm, Solr, Spark, Ranger, Knox, Ambari, ZooKeeper, Oozie, Zeppelin and the Hadoop elephant logo are trademarks of the Apache Software Foundation.
  2. Agenda
     • Metrics and Monitoring
     • Logging and Correlation
     • Tracing and Analysis
     © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  3. Metrics and Monitoring
     • Metrics as high level pointers
     • Ambari Metrics System
     • Ambari Grafana Integration
     • HBase, HDFS, YARN Dashboards
     • Metrics based alerting
  4. Metrics as high level pointers
     • Machine level metrics like CPU load
     • Application level metrics like HDFS counters
     • Metrics at a point in time
     • Metrics anomalies along a time series
     • Correlated anomalies
     • The problem: you need to know what to look for
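The ideas of "anomalies along a time series" and "correlated anomalies" on this slide can be illustrated with a small sketch. This is not how Ambari Metrics detects anomalies; it is a plain trailing-window z-score, and the function names, window size, threshold, and sample values are all assumptions made up for illustration.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices whose value is more than `threshold` standard
    deviations away from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as an anomaly.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

def correlated(anomalies_a, anomalies_b, slack=1):
    """Return anomalies in series A that have an anomaly in series B
    within `slack` samples (before or after)."""
    return [i for i in anomalies_a
            if any(abs(i - j) <= slack for j in anomalies_b)]

# Hypothetical samples: a machine-level metric and an application-level metric.
cpu_load = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 9.5, 1.0, 1.1, 1.0]
rpc_queue = [5, 6, 5, 5, 6, 5, 5, 48, 6, 5]

cpu_spikes = find_anomalies(cpu_load)
rpc_spikes = find_anomalies(rpc_queue)
print(cpu_spikes, rpc_spikes, correlated(cpu_spikes, rpc_spikes))
```

A point-in-time reading of either metric would look normal for most of the series; only the comparison against recent history, and then against the other series, points at where to look.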
  5. Ambari Metrics Service - Motivation
     • Limited Ganglia capabilities
     • OpenTSDB – GPL license and needs a Hadoop cluster
     • Need service level aggregation as well as time based aggregation
     • Alerts based on metrics system
     • Ability to scale past 1,000 nodes
     • Ability to perform analytics based on a use case
     • Allow fine grained control over aspects like retention, collection intervals, and aggregation
     • Pluggable and extensible
     First version released with Ambari 2.0.0
  6. Ambari Grafana Integration
     • Open source dashboard builder integrated with AMS
     • Available from Ambari 2.2.2
     • Pre-defined host level and service level (HDFS, HBase, YARN, etc.) dashboards
     • Added to Ambari through API after upgrade
  7. HBase Dashboard
  8. HDFS Dashboard
  9. YARN Dashboard
  10. Metrics based Alerting
      • Top N support to quickly identify potential offenders
      • Alerting based on time series
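As a rough illustration of the two bullets on this slide, the sketch below ranks hosts by a metric to find the top N offenders and raises an alert only when a series stays above a threshold for several consecutive samples. The host names, values, and function names are hypothetical, and this does not reflect the Ambari alerting API.

```python
import heapq

def top_n(metric_by_host, n=3):
    """Return the n (host, value) pairs with the largest metric value."""
    return heapq.nlargest(n, metric_by_host.items(), key=lambda kv: kv[1])

def sustained_breach(series, threshold, consecutive=3):
    """True if `series` exceeds `threshold` for at least `consecutive`
    samples in a row, which filters out one-off spikes."""
    run = 0
    for value in series:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return True
    return False

# Hypothetical per-host load averages.
load = {"host1": 0.4, "host2": 7.9, "host3": 1.2, "host4": 3.3}
print(top_n(load, n=2))
print(sustained_breach([1, 9, 1, 9, 9, 9], threshold=8))
```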
  11. Agenda
      • Metrics and Monitoring
      • Logging and Correlation
      • Tracing and Analysis
  12. Logging and Correlation
      • HDFS, YARN Audit logs
      • Caller Context
      • YARN Application Timeline Service
      • Lineage tracking of operations across workloads
      • Ambari Log Search
  13. HDFS Audit Logs and Caller Context
      FSNamesystem.audit: allowed=true ugi=userA (auth:SIMPLE) ip=/172.22.68.32 cmd=create src=/tmp/in/_temporary/1/_temporary/attempt_14644848874070_0009_m_009995_0/part-m-09995 dst=null perm=root:hdfs:rw-r--r-- proto=rpc callerContext=tez_ta:attempt_1464484887407_0009_1_00_009995_0
      FSNamesystem.audit: allowed=true ugi=userA (auth:SIMPLE) ip=/172.22.68.33 cmd=create src=/tmp/in2/_temporary/1/_temporary/attempt_1464484887407_0011_m_000097_0/part-m-00097 dst=null perm=root:hdfs:rw-r--r-- proto=rpc callerContext=mr_attempt_1464484887407_0011_m_000097_0
      FSNamesystem.audit: allowed=true ugi=userB (auth:SIMPLE) ip=/172.22.68.34 cmd=create src=/tmp/in2/_temporary/1/_temporary/attempt_1464484887407_0011_m_000095_0/part-m-00095 dst=null perm=root:hdfs:rw-r--r-- proto=rpc callerContext=mr_attempt_1464484887407_0011_m_000095_0
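One way to use the callerContext field in these audit lines is to attribute NameNode operations back to the YARN application that issued them. The sketch below parses the three lines from the slide and counts create calls per application. The regular expressions and helper names are assumptions for illustration, and deriving an application ID from the timestamp and sequence number embedded in the attempt ID is a convention assumed here, not something the slide states.

```python
import re
from collections import Counter

# key=value pairs; a value runs until the next " key=" or end of line,
# so "ugi=userA (auth:SIMPLE)" stays in one piece.
FIELD = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")
# Assumed convention: attempt_<clusterTimestamp>_<appSequence>_...
ATTEMPT = re.compile(r"attempt_(\d+)_(\d+)")

def parse_audit(line):
    """Split one HDFS audit line into a dict of its key=value fields."""
    return dict(FIELD.findall(line))

def app_id(caller_context):
    """Map a task-attempt caller context to an application ID, or None."""
    m = ATTEMPT.search(caller_context)
    return f"application_{m.group(1)}_{m.group(2)}" if m else None

PREFIX = "FSNamesystem.audit: allowed=true "
SUFFIX = " dst=null perm=root:hdfs:rw-r--r-- proto=rpc "
LINES = [
    PREFIX + "ugi=userA (auth:SIMPLE) ip=/172.22.68.32 cmd=create "
    "src=/tmp/in/_temporary/1/_temporary/"
    "attempt_14644848874070_0009_m_009995_0/part-m-09995" + SUFFIX +
    "callerContext=tez_ta:attempt_1464484887407_0009_1_00_009995_0",
    PREFIX + "ugi=userA (auth:SIMPLE) ip=/172.22.68.33 cmd=create "
    "src=/tmp/in2/_temporary/1/_temporary/"
    "attempt_1464484887407_0011_m_000097_0/part-m-00097" + SUFFIX +
    "callerContext=mr_attempt_1464484887407_0011_m_000097_0",
    PREFIX + "ugi=userB (auth:SIMPLE) ip=/172.22.68.34 cmd=create "
    "src=/tmp/in2/_temporary/1/_temporary/"
    "attempt_1464484887407_0011_m_000095_0/part-m-00095" + SUFFIX +
    "callerContext=mr_attempt_1464484887407_0011_m_000095_0",
]

records = [parse_audit(line) for line in LINES]
creates_per_app = Counter(app_id(r["callerContext"]) for r in records)
for app, count in creates_per_app.most_common():
    print(app, count)
```

Run over a full audit log instead of three lines, the same grouping would show which applications generate the most NameNode traffic.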
  14. ResourceManager Audit Logs and Caller Context
      resourcemanager.RMAuditLogger: USER=userA IP=172.22.68.32 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1464484887407_0001 CALLERCONTEXT=PIG-pigSmoke.sh-8a052588-0013-4e39-83b1-ebad699d8e2e
      resourcemanager.RMAuditLogger: USER=userA IP=172.22.68.30 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1464484887407_0009 CALLERCONTEXT=CLI
      resourcemanager.RMAuditLogger: USER=userB IP=172.22.68.34 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1464484887407_0008 CALLERCONTEXT=mr_attempt_1464484887407_0007_m_000000_0
      resourcemanager.RMAuditLogger: USER=userB IP=172.22.68.30 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1464484887407_0012 CALLERCONTEXT=HIVE_SSN_ID:f3aadf99-9e36-494b-84a1-99b685ac344b
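The ResourceManager audit lines can be correlated in a similar way: the CALLERCONTEXT of a submission says what launched the application, and when that context is itself a task attempt, it points at a parent application. The following sketch works on the four lines from the slide as they appear in this transcript (space separated); the regular expressions and helper names are assumptions, as is the mapping from attempt ID to application ID.

```python
import re

# Upper-case KEY=value pairs; values such as "Submit Application Request"
# contain spaces, so a value runs until the next " KEY=" or end of line.
FIELD = re.compile(r"([A-Z]+)=(.*?)(?=\s+[A-Z]+=|$)")
# Assumed convention: attempt_<clusterTimestamp>_<appSequence>_...
ATTEMPT = re.compile(r"attempt_(\d+)_(\d+)")

def parse_rm_audit(line):
    """Split one RMAuditLogger line into a dict of its KEY=value fields."""
    return dict(FIELD.findall(line))

def parent_app(caller_context):
    """Return the application whose task submitted this one, if any."""
    m = ATTEMPT.search(caller_context)
    return f"application_{m.group(1)}_{m.group(2)}" if m else None

HEAD = "resourcemanager.RMAuditLogger: "
MID = " OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS "
LINES = [
    HEAD + "USER=userA IP=172.22.68.32" + MID +
    "APPID=application_1464484887407_0001 "
    "CALLERCONTEXT=PIG-pigSmoke.sh-8a052588-0013-4e39-83b1-ebad699d8e2e",
    HEAD + "USER=userA IP=172.22.68.30" + MID +
    "APPID=application_1464484887407_0009 CALLERCONTEXT=CLI",
    HEAD + "USER=userB IP=172.22.68.34" + MID +
    "APPID=application_1464484887407_0008 "
    "CALLERCONTEXT=mr_attempt_1464484887407_0007_m_000000_0",
    HEAD + "USER=userB IP=172.22.68.30" + MID +
    "APPID=application_1464484887407_0012 "
    "CALLERCONTEXT=HIVE_SSN_ID:f3aadf99-9e36-494b-84a1-99b685ac344b",
]

records = [parse_rm_audit(line) for line in LINES]
submitted_by = {r["APPID"]: r["CALLERCONTEXT"] for r in records}
parents = {app: parent_app(ctx) for app, ctx in submitted_by.items()}
for app, parent in parents.items():
    print(app, "<-", parent or submitted_by[app])
```

In this sample, application 0008 was submitted from a map task of application 0007, while the others came from Pig, the CLI, and a Hive session.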
  15. YARN Application Timeline Service
      • YARN service for fine grained application level tracing
      • Enables complex metadata to be recorded as the YARN app makes progress
      • Allows retrieval of this timeline data based on filters
      • Can be used to drive limited online analytics and extensive post-hoc analysis
  16. Lineage Tracking using YARN Timeline
      • Timeline:8188/ws/v1/timeline/TEZ_DAG_ID/dag_1464484887407_0013_1
        dagContext: { callerId: "root_20160529021115_006f8007-5840-4c64-9970-c1b506f68db2", callerType: "HIVE_QUERY_ID", context: "HIVE", description: "select user, count(visit_id) as visits from users group by user order by visits" }
      • Timeline:8188/ws/v1/timeline/HIVE_QUERY_ID/root_20160529021115_006f8007-5840-4c64-9970-c1b506f68db2
        hiveContext: { callerId: "workflow_abcd", callerType: "OOZIE_ID", context: "OOZIE", description: "Daily ETL Summary Job" }
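The two lookups on this slide form a chain: the Tez DAG names the Hive query that called it, and the Hive query names the Oozie workflow that called it. A minimal sketch of walking that chain follows. To stay runnable without a cluster it uses an in-memory stand-in for the Timeline server whose records are flattened to the four context fields shown on the slide; the real response layout, the `http` scheme, and all function names here are assumptions.

```python
def timeline_url(host, entity_type, entity_id):
    """Build a lookup URL in the shape shown on the slide (scheme assumed)."""
    return f"http://{host}:8188/ws/v1/timeline/{entity_type}/{entity_id}"

QUERY_ID = "root_20160529021115_006f8007-5840-4c64-9970-c1b506f68db2"

# Stand-in for the Timeline server: (entity type, entity id) -> caller context.
FAKE_TIMELINE = {
    ("TEZ_DAG_ID", "dag_1464484887407_0013_1"): {
        "callerId": QUERY_ID,
        "callerType": "HIVE_QUERY_ID",
        "context": "HIVE",
        "description": "select user, count(visit_id) as visits "
                       "from users group by user order by visits",
    },
    ("HIVE_QUERY_ID", QUERY_ID): {
        "callerId": "workflow_abcd",
        "callerType": "OOZIE_ID",
        "context": "OOZIE",
        "description": "Daily ETL Summary Job",
    },
}

def lineage(entity_type, entity_id, fetch=FAKE_TIMELINE.get, max_hops=10):
    """Follow callerType/callerId links upward until no record is found.

    `fetch` takes an (entity_type, entity_id) tuple and returns a context
    dict or None; swap in a function that queries a real server.
    """
    chain = []
    for _ in range(max_hops):  # guard against accidental cycles
        ctx = fetch((entity_type, entity_id))
        if ctx is None:
            break
        chain.append((ctx["context"], ctx["callerId"], ctx["description"]))
        entity_type, entity_id = ctx["callerType"], ctx["callerId"]
    return chain

chain = lineage("TEZ_DAG_ID", "dag_1464484887407_0013_1")
for context, caller_id, description in chain:
    print(f"{context}: {caller_id} ({description})")
```

Starting from a slow DAG, the chain answers which query and which scheduled workflow it belongs to.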
  17. Ambari Log Search
  18. Ambari Log Search
  19. Agenda
      • Metrics and Monitoring
      • Logging and Correlation
      • Tracing and Analysis
  20. Tracing and Analysis
      • Use Big Data methods to solve Big Data problems
      • Apache Zeppelin as analytical tool
      • Hive/Tez/YARN notebook for analysis
  21. Zeppelin for Ad-hoc Analytics
  22. YARN Analyzer
  23. Tez Analyzer
  24. Tez Analyzer
  25. Tez Analyzer
  26. Thank You
