Page1 © Hortonworks Inc. 2014
Tez: UI & Debugging
Fall 2014
Version 1.0
gopalv@apache.org
Page2 © Hortonworks Inc. 2014 FOR: BAY AREA HADOOP USER GROUP MEETUP #49
TEZ (nomenclature)
• DAG
• Vertex
• Task
• Attempt
• Container
• Edge
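As a rough mental model of how these terms nest, here is a minimal sketch. The class and field names are hypothetical and chosen for illustration only; they are not the Tez API. It assumes the usual relationship: a DAG holds vertices joined by edges, a vertex runs as tasks, a task may be retried as several attempts, and each attempt runs in a container.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Attempt:
    attempt_id: str
    container_id: str  # container the attempt ran in (containers can be reused)
    succeeded: bool


@dataclass
class Task:
    task_id: str
    attempts: List[Attempt] = field(default_factory=list)

    def succeeded(self) -> bool:
        # A task is done once any one of its attempts succeeds.
        return any(a.succeeded for a in self.attempts)


@dataclass
class Vertex:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Edge:
    source: str  # name of the producing vertex
    target: str  # name of the consuming vertex


@dataclass
class DAG:
    name: str
    vertices: List[Vertex] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def failed_tasks(self) -> List[Tuple[str, str]]:
        # (vertex name, task id) for every task with no successful attempt.
        return [
            (v.name, t.task_id)
            for v in self.vertices
            for t in v.tasks
            if not t.succeeded()
        ]
```

A failed attempt does not by itself fail the task; in this model the task only counts as failed when none of its attempts succeeded.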
Directed Acyclic Graphs
How to view raw DAGs from logs
• Tez Application logs contain .dot files in Graphviz format
• To generate images: dot -Tpng -o dag.png dag.dot
• OR javascript version: http://people.apache.org/~gopalv/dagviz/
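For readers who have not seen Graphviz input before, the sketch below writes a small DAG as generic DOT text. It only illustrates the file format; the exact contents of the .dot files Tez writes are not shown here, and the function name `to_dot` and the vertex names are assumptions made for the example.

```python
from typing import Iterable, Tuple


def _quote(name: str) -> str:
    # DOT identifiers with spaces must be double-quoted; escape inner quotes.
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(dag_name: str, edges: Iterable[Tuple[str, str]]) -> str:
    """Render (source, target) vertex pairs as a Graphviz digraph."""
    lines = [f"digraph {_quote(dag_name)} {{"]
    for source, target in edges:
        lines.append(f"  {_quote(source)} -> {_quote(target)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical two-edge DAG, written to dag.dot for rendering.
    text = to_dot("example", [("Map 3", "Map 1"), ("Map 1", "Reducer 2")])
    with open("dag.dot", "w") as handle:
        handle.write(text + "\n")
```

The resulting dag.dot can then be rendered with `dot -Tpng -o dag.png dag.dot`, provided Graphviz is installed.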
TEZ-8 JIRA & branch
• TEZ UI for progress tracking and history
• https://issues.apache.org/jira/browse/TEZ-8
• https://github.com/apache/tez/tree/TEZ-8
• UI-centric branch
Tez UI: Landing page
Tez UI: DAG view
Tez UI: Vertex view
Tez UI: Vertex -> Tasks view
Tez UI: Task logs
Task logs
Tez UI: Task counters
Task counters
Tez UI: Task counters
Search for counters
Tez UI: Per-edge shuffle counters
Map 3 to Map 1 only
Tez UI: Payload view
Tez UI: Failed DAGs (diagnostic)
Tez UI: Failed tasks indication
Failed tasks
Tez UI: Failed tasks
Tez UI: Failed attempts
Post-hoc/Ad-hoc analysis helpers
• tez/tez-tools ships with two helper tools
• swimlanes
• tez-tfile-parser
Swimlanes
• ./yarn-swimlanes.sh application_1415860665053_0098
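To give a feel for what a swimlane chart conveys, the sketch below draws one text row per container and marks the time spans in which an attempt was running. It illustrates the idea only and is not the tool's implementation or output format; the record layout `(container, attempt, start, end)` and the function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Run = Tuple[float, float, str]  # (start, end, attempt id)


def build_lanes(attempts: Iterable[Tuple[str, str, float, float]]) -> Dict[str, List[Run]]:
    """Group (container, attempt, start, end) records into one lane per container."""
    lanes: Dict[str, List[Run]] = defaultdict(list)
    for container, attempt, start, end in attempts:
        lanes[container].append((start, end, attempt))
    # Sort each lane by start time so reuse of a container reads left to right.
    return {container: sorted(runs) for container, runs in lanes.items()}


def render(lanes: Dict[str, List[Run]], width: int = 40) -> List[str]:
    """Draw each lane as a row of '.' (idle) and '#' (attempt running)."""
    t0 = min(start for runs in lanes.values() for start, _, _ in runs)
    t1 = max(end for runs in lanes.values() for _, end, _ in runs)
    scale = (width - 1) / max(t1 - t0, 1)
    rows = []
    for container in sorted(lanes):
        row = ["."] * width
        for start, end, _ in lanes[container]:
            first = int((start - t0) * scale)
            last = int((end - t0) * scale)
            for i in range(first, last + 1):
                row[i] = "#"
        rows.append(f"{container} {''.join(row)}")
    return rows
```

Gaps of `.` inside a lane show a container sitting idle between attempts, which is the kind of pattern a swimlane view makes easy to spot.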
TFile parser
• Tez logs can be parsed with Pig
• Lets us treat our logs exactly like we treat our big data
• Processing using "pig -x tez" + UDFs [1]
rawLogs = load '/app-logs/root/logs/application_1409012059361_0539/*' using
org.apache.tez.tools.TFileLoader() as (machine:chararray, key:chararray, line:chararray);
[1] - https://github.com/rajeshbalamohan/tez_log_parser/blob/master/src/main/resources/pig/udf.groovy
TFile parser (contd)
• For instance, parsing the INFO logs for shuffle (time taken + machine)
Problematic machine
TFile parser (node/rack traffic at 350 nodes)
Problematic machine
Fetcher in node-100 is always slow
(irrespective of where it's pulling data from)
Other faulty nodes
Map output served from node-100 to node-120
to any node is always slow
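The reasoning behind these callouts can be sketched as follows: group fetch times once by the fetching node and once by the serving node, then flag nodes whose median is far above the cluster-wide median. This is a hedged illustration, not the analysis used in the talk. It assumes the shuffle timings have already been extracted from the logs into `(source_node, fetcher_node, seconds)` records; that record layout, the function name `slow_nodes`, and the `factor` threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable, List, Tuple


def slow_nodes(
    fetches: Iterable[Tuple[str, str, float]], factor: float = 3.0
) -> Tuple[List[str], List[str]]:
    """Return (slow fetchers, slow sources) relative to the overall median.

    fetches: (source_node, fetcher_node, seconds) per shuffle fetch.
    A node is flagged when its own median exceeds factor * overall median.
    """
    by_fetcher = defaultdict(list)
    by_source = defaultdict(list)
    all_times = []
    for source, fetcher, seconds in fetches:
        by_source[source].append(seconds)
        by_fetcher[fetcher].append(seconds)
        all_times.append(seconds)

    threshold = factor * median(all_times)
    slow_fetchers = sorted(n for n, t in by_fetcher.items() if median(t) > threshold)
    slow_sources = sorted(n for n, t in by_source.items() if median(t) > threshold)
    return slow_fetchers, slow_sources
```

A node that shows up only as a slow fetcher is slow no matter where it pulls from; a node that shows up only as a slow source is slow at serving its map output to everyone else.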
Questions?
• Thanks to all Tez contributors for their efforts!
• FYI, the Hadoop Summit 2015 (Europe) call for papers is out
