De-Bugging Hive with Hadoop-in-the-Cloud
    Presentation Transcript

    • DEBUGGING HIVE WITH HADOOP IN THE CLOUD Soam Acharya, David Chaiken, Denis Sheahan, Charles Wimmer Altiscale, Inc. June 4, 2014
    • WHO ARE WE? • Altiscale: Infrastructure Nerds! • Hadoop As A Service • Rack and build our own Hadoop clusters • Provide a suite of Hadoop tools o Hive, Pig, Oozie o Others as needed: R, Python, Spark, Mahout, Impala, etc. • Monthly billing plan: compute, storage • https://www.altiscale.com • @Altiscale #HadoopSherpa
    • TALK ROADMAP • Our Platform and Perspective • Hadoop 2 Primer • Hadoop Debugging Tools • Accessing Logs in Hadoop 2 • Hive + Hadoop Architecture • Hive Logs • Hive Issues + Case Studies • Conclusion: Making Hive Easier to Use
    • OUR DYNAMIC PLATFORM • Hadoop 2.0.5 => Hadoop 2.2.0 • Hive 0.10 => Hive 0.12 • Hive, Pig and Oozie most commonly used tools • Working with customers on: Spark, Stinger (Hive 0.13 + Tez), Impala, …
    • ALTISCALE PERSPECTIVE • Service provider o Hadoop Dialtone! o Keep Hadoop/Hive + other tools running o Service Level Agreements target application-level metrics o Multiple clusters/customers o Operational scalability o Multi-tenancy • Operational approach o How to use Hadoop 2 cluster tools and logs to debug and to tune o This talk will not focus on query optimization
    • QUICK PRIMER – HADOOP 2 • [Diagram: a Hadoop 2 cluster with a NameNode, Secondary NameNode, and Resource Manager coordinating Hadoop slaves, each running a Node Manager and a Data Node]
    • QUICK PRIMER – HADOOP 2 YARN • Resource Manager (per cluster) o Manages job scheduling and execution o Global resource allocation • Application Master (per job) o Manages task scheduling and execution o Local resource allocation • Node Manager (per-machine agent) o Manages the lifecycle of task containers o Reports to RM on health and resource usage
    • HADOOP 1 VS HADOOP 2 • No more JobTrackers, TaskTrackers • YARN ~ Operating System for Clusters o MapReduce is implemented as a YARN application o Bring on the applications! (Spark is just the start…) • Should be Transparent to Hive users
    • HADOOP 2 DEBUGGING TOOLS • Monitoring o System state of cluster:  CPU, Memory, Network, Disk  Nagios, Ganglia, Sensu!  Collectd, statd, Graphite o Hadoop level  HDFS usage  Resource usage: • Container memory allocated vs used • # of jobs running at the same time • Long running tasks
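    A quick way to pull the Hadoop-level numbers above is the Resource Manager REST API; a minimal sketch, assuming a stock Hadoop 2 setup with the RM web UI on its default port 8088 (the hostname is a placeholder):
      # Cluster-wide resource usage: allocatedMB vs. availableMB, apps running, etc.
      curl -s http://resourcemanager.example.com:8088/ws/v1/cluster/metrics
      # Per-application view, to spot long-running tasks and concurrent jobs:
      curl -s http://resourcemanager.example.com:8088/ws/v1/cluster/apps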
    • HADOOP 2 DEBUGGING TOOLS • Hadoop logs o Daemon logs: Resource Manager, NameNode, DataNode o Application logs: Application Master, MapReduce tasks o Job history file: resources allocated during job lifetime o Application configuration files: store all Hadoop application parameters • Source code instrumentation
    • ACCESSING LOGS IN HADOOP 2 • To view the logs for a job, click on the link under the ID column in Resource Manager UI.
    • ACCESSING LOGS IN HADOOP 2 • To view application top level logs, click on logs. • To view individual logs for the mappers and reducers, click on History.
    • ACCESSING LOGS IN HADOOP 2 • Log output for the entire application.
    • ACCESSING LOGS IN HADOOP 2 • Click on the Map link for mapper logs and the Reduce link for reducer logs.
    • ACCESSING LOGS IN HADOOP 2 • Clicking on a single link under Name provides an overview for that particular map job.
    • ACCESSING LOGS IN HADOOP 2 • Finally, clicking on the logs link will take you to the log output for that map job.
    • ACCESSING LOGS IN HADOOP 2 • Fun, fun, donuts, and more fun…
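    If clicking through all of these screens gets tedious, the same aggregated logs can be pulled from the command line; a sketch, assuming YARN log aggregation is enabled on the cluster:
      # Fetch every container log for one application in a single stream:
      yarn logs -applicationId <application_id> > app.log
      # Then search it for trouble:
      grep -i "error" app.log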
    • HIVE + HADOOP 2 ARCHITECTURE • [Diagram: the Hive CLI and JDBC/ODBC clients (Kettle, Hue, …) connect via Hiveserver to Hive 0.10+ and the Hive Metastore, which front a Hadoop 2 cluster]
    • HIVE LOGS • Query log location, from /etc/hive/hive-site.xml:
      <property>
        <name>hive.querylog.location</name>
        <value>/home/hive/log/${user.name}</value>
      </property>
      Sample entry:
      SessionStart SESSION_ID="soam_201402032341" TIME="1391470900594"
    • HIVE CLIENT LOGS • /etc/hive/hive-log4j.properties: o hive.log.dir=/var/log/hive/${user.name}
      2014-05-29 19:51:09,830 INFO parse.ParseDriver (ParseDriver.java:parse(179)) - Parsing command: select count(*) from dogfood_job_data
      2014-05-29 19:51:09,852 INFO parse.ParseDriver (ParseDriver.java:parse(197)) - Parse Completed
      2014-05-29 19:51:09,852 INFO ql.Driver (PerfLogger.java:PerfLogEnd(124)) - </PERFLOG method=parse start=1401393069830 end=1401393069852 duration=22>
      2014-05-29 19:51:09,853 INFO ql.Driver (PerfLogger.java:PerfLogBegin(97)) - <PERFLOG method=semanticAnalyze>
      2014-05-29 19:51:09,890 INFO parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(8305)) - Starting Semantic Analysis
      2014-05-29 19:51:09,892 INFO parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(8340)) - Completed phase 1 of Semantic Analysis
      2014-05-29 19:51:09,892 INFO parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1060)) - Get metadata for source tables
      2014-05-29 19:51:09,906 INFO parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1167)) - Get metadata for subqueries
      2014-05-29 19:51:09,909 INFO parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1187)) - Get metadata for destination tables
    • HIVE METASTORE LOGS • /etc/hive-metastore/hive-log4j.properties: o hive.log.dir=/service/log/hive-metastore/${user.name}
      2014-05-29 19:50:50,179 INFO metastore.HiveMetaStore (HiveMetaStore.java:logInfo(454)) - 200: source:/10.252.18.94 get_table : db=default tbl=dogfood_job_data
      2014-05-29 19:50:50,180 INFO HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(239)) - ugi=chaiken ip=/10.252.18.94 cmd=source:/10.252.18.94 get_table : db=default tbl=dogfood_job_data
      2014-05-29 19:50:50,236 INFO metastore.HiveMetaStore (HiveMetaStore.java:logInfo(454)) - 200: source:/10.252.18.94 get_table : db=default tbl=dogfood_job_data
      2014-05-29 19:50:50,236 INFO HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(239)) - ugi=chaiken ip=/10.252.18.94 cmd=source:/10.252.18.94 get_table : db=default tbl=dogfood_job_data
      2014-05-29 19:50:50,261 INFO metastore.HiveMetaStore (HiveMetaStore.java:logInfo(454)) - 200: source:/10.252.18.94 get_table : db=default tbl=dogfood_job_data
    • HIVE ISSUES + CASE STUDIES • Hive Issues o Hive client out of memory o Hive map/reduce task out of memory o Hive metastore out of memory o Hive launches too many tasks • Case Studies: o Hive “stuck” job o Hive “missing directories” o Analyze Hive Query Execution
    • HIVE CLIENT OUT OF MEMORY • Memory-intensive client-side Hive query (map-side join):
      Number of reduce tasks not specified. Estimated from input data size: 999
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
        set mapred.reduce.tasks=<number>
      java.lang.OutOfMemoryError: Java heap space
        at java.nio.CharBuffer.wrap(CharBuffer.java:350)
        at java.nio.CharBuffer.wrap(CharBuffer.java:373)
        at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:138)
    • HIVE CLIENT OUT OF MEMORY • Set HADOOP_HEAPSIZE before launching the Hive client:
      HADOOP_HEAPSIZE=<new heapsize> hive <fileName>
      • Watch out for the HADOOP_CLIENT_OPTS issue in hive-env.sh!
      • Know how much memory is available on the machine running the client, and don't exceed it or claim a disproportionate share:
      $ free -m
                   total       used       free     shared    buffers     cached
      Mem:          1695       1388        306          0         60        424
      -/+ buffers/cache:        903        791
      Swap:          895        101        794
    • HIVE TASK OUT OF MEMORY • Query spawns MapReduce jobs that run out of memory • How to find this issue? o Hive diagnostic message o Hadoop MapReduce logs
    • HIVE TASK OUT OF MEMORY • Fix is to increase task RAM allocation… set mapreduce.map.memory.mb=<new RAM allocation>; set mapreduce.reduce.memory.mb=<new RAM allocation>; • Also watch out for… set mapreduce.map.java.opts=-Xmx<heap size>m; set mapreduce.reduce.java.opts=-Xmx<heap size>m; • Not a magic bullet – requires manual tuning • Increase in individual container memory size: o Decrease in overall containers that can be run o Decrease in overall parallelism
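    For example, a sketch of paired settings (the 4096/8192 MB values and the ~80% heap-to-container ratio are illustrative assumptions, not from the deck):
      set mapreduce.map.memory.mb=4096;
      set mapreduce.map.java.opts=-Xmx3276m;      -- heap ~80% of the 4096 MB container
      set mapreduce.reduce.memory.mb=8192;
      set mapreduce.reduce.java.opts=-Xmx6553m;   -- leave headroom for non-heap memory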
    • HIVE METASTORE OUT OF MEMORY • Out of memory issues not necessarily dumped to logs • Metastore can become unresponsive • Can’t submit queries • Restart with a higher heap size: export HADOOP_HEAPSIZE in hcat_server.sh • After notifying hive users about downtime: service hcat restart
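    A minimal sketch of that fix, assuming a 2048 MB heap is enough for the metastore workload:
      # In hcat_server.sh (exact path varies by install):
      export HADOOP_HEAPSIZE=2048
      # After notifying Hive users of the downtime:
      service hcat restart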
    • HIVE LAUNCHES TOO MANY TASKS • Typically a function of the input data set • Lots of little files
    • HIVE LAUNCHES TOO MANY TASKS • Set mapred.max.split.size to appropriate fraction of data size • Also verify that hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
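    For example (the 256 MB split size is an illustrative assumption, not a recommendation from the deck):
      set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
      set mapred.max.split.size=268435456;   -- 256 MB, in bytes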
    • CASE STUDY: HIVE STUCK JOB From an Altiscale customer: “This job [jobid] has been running now for 41 hours. Is it still progressing or has something hung up the map/reduce so it’s just spinning? Do you have any insight?”
    • HIVE STUCK JOB
      1. Received the jobId, application_1382973574141_4536, from the client.
      2. Logged into the client cluster.
      3. Pulled up the Resource Manager.
      4. Entered part of the jobId (4536) in the search box.
      5. Clicked on the link: application_1382973574141_4536
      6. On the resulting Application Overview page, clicked on the "Application Master" link next to "Tracking URL".
    • HIVE STUCK JOB
      7. On the resulting MapReduce Application page, clicked on the Job Id (job_1382973574141_4536).
      8. The resulting MapReduce Job page displayed detailed status of the mappers, including 4 failed mappers.
      9. Clicked on the "4" link in the Failed column of the Maps row.
      10. The title of the next page was "FAILED Map attempts in job_1382973574141_4536."
      11. Each failed mapper generated an error message.
      12. Buried in the 16th line: Caused by: java.io.FileNotFoundException: File does not exist: hdfs://opaque_hostname:8020/HiveTableDir/FileName.log.date.seq
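    With log aggregation enabled, the same error can be surfaced without the multi-screen click path; a sketch:
      yarn logs -applicationId application_1382973574141_4536 | grep -B 2 "FileNotFoundException"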
    • HIVE STUCK JOB • The job was stuck for a day or so, retrying a mapper that would never finish successfully. • During the job, the customer's colleague realized an input file was corrupted and deleted it. • The colleague did not anticipate the effect of removing corrupted data on a running job. • Hadoop didn't make it easy to find out: o RM => search => application link => AM overview page => MR Application page => MR Job page => Failed jobs page => parse long logs o Task retry without hope of success
    • HIVE “MISSING DIRECTORIES” From an Altiscale customer: “One problem we are seeing after the [Hive Metastore] restart is that we lost quite a few directories in [HDFS]. Is there a way to recover these?”
    • HIVE “MISSING DIRECTORIES” • Obtained a list of “missing” directories from the customer: o /hive/biz/prod/* • Confirmed they were missing from HDFS • Searched the NameNode audit log for the block IDs that belonged to the missing directories:
      13/07/24 21:10:08 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hive/biz/prod/incremental/carryoverstore/postdepuis/lmt_unmapped_pggroup_schema._COPYING_. BP-798113632-10.251.255.251-1370812162472 blk_3560522076897293424_2448396{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.251.255.177:50010|RBW], ReplicaUnderConstruction[10.251.255.174:50010|RBW], ReplicaUnderConstruction[10.251.255.169:50010|RBW]]}
    • HIVE “MISSING DIRECTORIES” • Used the block ID to locate the exact time of file deletion in the NameNode logs:
      13/07/31 08:10:33 INFO hdfs.StateChange: BLOCK* addToInvalidates: blk_3560522076897293424_2448396 to 10.251.255.177:50010 10.251.255.169:50010 10.251.255.174:50010
      • Used the time of deletion to inspect the Hive logs
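    The search itself is ordinary grep over the NameNode logs; a sketch, with the log path left as a placeholder since it varies by install:
      # Find block IDs allocated under the missing directory:
      grep "allocateBlock: /hive/biz/prod" <namenode-log>
      # Find when a given block was invalidated, i.e., deleted:
      grep "addToInvalidates: blk_3560522076897293424" <namenode-log>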
    • HIVE “MISSING DIRECTORIES”
      QueryStart QUERY_STRING="create database biz_weekly location '/hive/biz/prod'" QUERY_ID="usrprod_20130731043232_0a40fd32-8c8a-479c-ba7d-3bd8a2698f4b" TIME="1375245164667"
      :
      QueryEnd QUERY_STRING="create database biz_weekly location '/hive/biz/prod'" QUERY_ID="usrprod_20130731043232_0a40fd32-8c8a-479c-ba7d-3bd8a2698f4b" QUERY_RET_CODE="0" QUERY_NUM_TASKS="0" TIME="1375245166203"
      :
      QueryStart QUERY_STRING="drop database biz_weekly" QUERY_ID="usrprod_20130731073333_e9acf35c-4f07-4f12-bd9d-bae137ae0733" TIME="1375256014799"
      :
      QueryEnd QUERY_STRING="drop database biz_weekly" QUERY_ID="usrprod_20130731073333_e9acf35c-4f07-4f12-bd9d-bae137ae0733" QUERY_NUM_TASKS="0" TIME="1375256014838"
    • HIVE “MISSING DIRECTORIES” • In effect, user “usrprod” issued: At 2013-07-31 04:32:44: create database biz_weekly location '/hive/biz/prod' At 2013-07-31 07:33:24: drop database biz_weekly • This is functionally equivalent to: hdfs dfs -rm -r /hive/biz/prod
    • HIVE “MISSING DIRECTORIES” • The customer manually placed their own data in /hive – the warehouse directory managed and controlled by Hive • The customer used CREATE and DROP database commands in their code o Hive deletes database and table locations in /hive with impunity • Why didn’t the deleted data end up in .Trash? o Trash collection was not turned on in the configuration settings o It is now, but DROP needs a -skipTrash option (HIVE-6469)
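    Enabling trash is a one-line HDFS setting; a sketch of the core-site.xml entry (the 1440-minute retention value is an assumption, not from the deck):
      <property>
        <name>fs.trash.interval</name>
        <value>1440</value> <!-- minutes that deleted files are retained in .Trash -->
      </property>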
    • HIVE “MISSING DIRECTORIES” • Hadoop forensics: piece together disparate sources… o Hadoop daemon logs (NameNode) o Hive query and metastore logs o Hadoop config files • Need better tools to correlate the different layers of the system: Hive client, Hive metastore, MapReduce job, YARN, HDFS, operating system metrics, … By the way… operating any distributed system would be totally insane without NTP and a standard time zone (UTC).
    • CASE STUDY – ANALYZE QUERY • Customer provided Hive query + data sets (100GBs to ~5 TBs) • Needed help optimizing the query • Didn’t rewrite query immediately • Wanted to characterize query performance and isolate bottlenecks first
    • ANALYZE AND TUNE EXECUTION • Ran the original query on the datasets in our environment: o Two M/R stages: Stage-1, Stage-2 • Long-running reducers ran out of memory o set mapreduce.reduce.memory.mb=5120 o Reduced the number of available container slots and extended reduce time • Query failed to launch Stage-2 with an out-of-memory error o set HADOOP_HEAPSIZE=1024 on the client machine • Stage-2 spawned 250,000 mappers, which caused failures o set mapred.max.split.size=5368709120 to reduce the number of mappers
    • ANALYSIS: HOW TO VISUALIZE? • Next challenge: how to visualize job execution? • Existing Hadoop/Hive logs are not sufficient for this task • Wrote internal tools to: o parse job history files o plot mapper and reducer execution (see the sketch below)
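    The raw data behind such plots lives in the MapReduce job history (.jhist) files, which Hadoop can also dump as text; a sketch, with the history file path left as a placeholder and noting that flag support varies by Hadoop version:
      # Print job, task, and attempt timings from a history file:
      mapred job -history all <job-history-file>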
    • ANALYSIS: MAP STAGE-1
    • ANALYSIS: REDUCE STAGE-1 • [Plot: a single reduce task]
    • ANALYSIS: MAP STAGE-2
    • ANALYSIS: REDUCE STAGE-2
    • ANALYZE EXECUTION: FINDINGS • A lone, long-running reducer in the first stage of the query • Analyzed the input data: o The query split input data by userId o Bucketized the input data by userId o Found one very large bucket: the “invalid” userId o Discussed the “invalid” userId with the customer • An error value is a common pattern! o Need to differentiate between “don’t know and don’t care” and “don’t know and do care.”
    • CONCLUSIONS • Hive + Hadoop debugging can get very complex o Sifting through many logs and screens o Automatic transmission versus manual transmission • Static partitioning induced by the Java Virtual Machine has benefits but also creates challenges • Where there are difficulties, there’s opportunity o Better tooling o Better instrumentation o Better integration of disparate logs and metrics • Hadoop as a Service: aggregate and share expertise • We need to learn from the traditional database community!
    • QUESTIONS? COMMENTS?