Apache Tez: Accelerating Hadoop Query Processing
Bikas Saha
@bikassaha

© Hortonworks Inc. 2013

Page 1
Tez – Introduction
• Distributed execution framework
targeted towards data-processing
applications.
• Based on expressing a computation
as a dataflow graph.
• Highly customizable to meet a
broad spectrum of use cases.
• Built on top of YARN – the resource
management framework for
Hadoop.
• Open source Apache incubator
project and Apache licensed.

Page 2
Tez – Design Themes
• Empowering End Users
• Execution Performance


Page 3
Tez – Empowering End Users
• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying deployment


Page 4
Tez – Empowering End Users
• Expressive dataflow definition APIs
[Diagram: Distributed Sort expressed as a dataflow graph with a Preprocessor Stage, a Sampler (Samples in, Ranges out), a Partition Stage and an Aggregate Stage; each stage runs Task-1 and Task-2.]

Page 5
Tez – Empowering End Users
• Flexible Input-Processor-Output runtime model
– Construct physical runtime executors dynamically by connecting
different inputs, processors and outputs.
– End goal is to have a library of inputs, outputs and processors that can
be programmatically composed to generate useful tasks.
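The composition idea can be sketched in a few lines of Python. This is an illustration only, not the Tez API: `Task`, `list_input`, `sorted_output`, `plain_output`, `map_processor` and `reduce_processor` are hypothetical stand-ins for a library of inputs, processors and outputs, and in-memory lists stand in for HDFS and shuffle data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

Record = Tuple[str, int]


@dataclass
class Task:
    """A runnable task: one input, one processor and one output wired together."""
    name: str
    read: Callable[[], Iterator[Record]]                      # input
    process: Callable[[Iterable[Record]], Iterable[Record]]   # processor
    write: Callable[[Iterable[Record]], None]                 # output

    def run(self) -> None:
        self.write(self.process(self.read()))


def list_input(source: List[Record]) -> Callable[[], Iterator[Record]]:
    # Hypothetical input: reads whatever `source` holds when the task runs.
    return lambda: iter(source)


def sorted_output(sink: List[Record]) -> Callable[[Iterable[Record]], None]:
    # Hypothetical output: stores records sorted by key.
    return lambda records: sink.extend(sorted(records))


def plain_output(sink: List[Record]) -> Callable[[Iterable[Record]], None]:
    # Hypothetical output: stores records in arrival order.
    return lambda records: sink.extend(records)


def map_processor(records: Iterable[Record]) -> Iterator[Record]:
    # Emit (lower-cased word, 1) for every input record.
    for word, _ in records:
        yield (word.lower(), 1)


def reduce_processor(records: Iterable[Record]) -> Iterable[Record]:
    # Sum the values for each key, keeping first-seen key order.
    totals = {}
    for key, value in records:
        totals[key] = totals.get(key, 0) + value
    return totals.items()


raw = [("B", 0), ("a", 0), ("b", 0)]
shuffle: List[Record] = []
result: List[Record] = []

# The same building blocks are composed into two different tasks.
mapper = Task("Mapper", list_input(raw), map_processor, sorted_output(shuffle))
final_reduce = Task("FinalReduce", list_input(shuffle), reduce_processor, plain_output(result))

mapper.run()
final_reduce.run()
print(result)
```

Swapping `reduce_processor` for a join-style processor, or `plain_output` for `sorted_output`, would give a different task without touching the other pieces, which is the point of the model.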

[Diagram: three example tasks, Mapper, FinalReduce and IntermediateJoiner, composed from the inputs HDFSInput, ShuffleInput, Input1 and Input2, the processors MapProcessor, ReduceProcessor and JoinProcessor, and the outputs FileSortedOutput and HDFSOutput.]

Page 6
Tez – Empowering End Users
• Data type agnostic
– Tez is only concerned with the movement of data: files and streams of
bytes.
– Clean separation between the logical application layer and the physical
framework layer. This design is important for Tez to be a platform for a
variety of applications.

[Diagram: a Tez Task moves Bytes to and from a File or a Stream; the User Code inside the task sees them as Key Value pairs or Tuples.]

Page 7
Tez – Empowering End Users
• Simplifying deployment
– Tez is a completely client-side application.
– No deployments to do. Simply upload the Tez libraries to any accessible
FileSystem and change the local Tez configuration to point to them.
– Enables running different versions concurrently. Easy to test new
functionality while keeping stable versions for production.
– Leverages YARN local resources.
[Diagram: HDFS holds Tez Lib 1 and Tez Lib 2; the TezClient on each Client Machine, and the TezTasks it starts on the Node Managers, use one of the two library versions.]

Page 8
Tez – Empowering End Users
• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying usage
With powerful APIs come great responsibilities.
Tez is a framework on which end-user applications can be built.


Page 9
Tez – Execution Performance
• Performance gains over MapReduce
• Optimal resource management
• Plan reconfiguration at runtime
• Dynamic physical data flow decisions


Page 10
Tez – Execution Performance
• Performance gains over MapReduce
– Eliminate the replicated write barrier between successive computations.
– Eliminate the job launch overhead of workflow jobs.
– Eliminate the extra stage of map reads in every workflow job.
– Eliminate the queue and resource contention suffered by workflow jobs that
are started after a predecessor job completes.

[Diagram: side-by-side comparison of a Pig/Hive workflow on Tez (Pig/Hive - Tez) and on MapReduce (Pig/Hive - MR).]

Page 11
Tez – Execution Performance
• Plan reconfiguration at runtime
– Dynamic runtime concurrency control based on data size, user operator
resources, available cluster resources and locality.
– Advanced changes in dataflow graph structure.
– Progressive graph construction in concert with user optimizer.
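As a rough illustration of runtime concurrency control, the reducer count can be derived from the data size actually observed. This is a minimal sketch, not Tez code: `choose_reducer_count` is a hypothetical helper and the 1 GB-per-reducer target is an assumed tuning value. It considers data size only and ignores the other factors listed above (operator resources, cluster resources, locality).

```python
GB = 1024 ** 3


def choose_reducer_count(total_bytes: int, bytes_per_reducer: int,
                         planned: int, minimum: int = 1) -> int:
    """Return how many reducers to run, never more than originally planned."""
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-total_bytes // bytes_per_reducer)
    return max(minimum, min(planned, wanted))


# Planned for 100 reducers, but only 10 GB of data arrived (assumed 1 GB per reducer).
reducers = choose_reducer_count(10 * GB, 1 * GB, planned=100)
print(reducers)
```

Clamping to `planned` reflects that the partition count is fixed by the upstream stage, so the decision can only shrink the downstream stage.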

[Diagram: Stage 1 runs 50 maps over HDFS Blocks and produces 100 partitions; Stage 2 is planned with 100 reducers. With only 10 GB of data, Stage 2 is reconfigured at runtime from 100 to 10 reducers. YARN Resources are shown as a further input to the decision.]

Page 12
Tez – Execution Performance
• Optimal resource management
– Reuse YARN containers to launch new tasks.
– Reuse YARN containers to enable shared objects across tasks.
– TezSession to encapsulate all this for the user
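Container reuse with shared objects can be pictured as follows. This is a toy model under stated assumptions, not the Tez implementation: `Container`, `make_task` and `load_lookup_table` are hypothetical names, and a plain dictionary stands in for the shared object registry.

```python
class Container:
    """Toy stand-in for a reused YARN container that hosts several tasks in turn."""

    def __init__(self, container_id: str) -> None:
        self.container_id = container_id
        self.shared_objects = {}   # survives from one task to the next
        self.tasks_run = []

    def run_task(self, task_name, task_fn):
        result = task_fn(self.shared_objects)
        self.tasks_run.append(task_name)
        return result


loads = {"count": 0}


def load_lookup_table():
    # Pretend this is expensive, for example reading a side table from storage.
    loads["count"] += 1
    return {"us": "United States", "in": "India"}


def make_task(code: str):
    def task(shared):
        # The first task in the container pays the load cost; later tasks reuse it.
        if "lookup" not in shared:
            shared["lookup"] = load_lookup_table()
        return shared["lookup"][code]
    return task


container = Container("container_1")
first = container.run_task("TezTask1", make_task("us"))
second = container.run_task("TezTask2", make_task("in"))
print(first, second, loads["count"])
```

Both tasks run in the same container, so the lookup table is loaded once rather than once per task.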

[Diagram: the Tez Application Master sends Start Task to a TezTask Host running in a YARN Container, receives Task Done, then sends Start Task again; TezTask1 and TezTask2 run in the same YARN Container and have access to its Shared Objects.]

Page 13
Tez – Execution Performance
• Dynamic physical data flow decisions
– Decide the type of physical byte movement and storage on the fly.
– Store intermediate data on distributed store, local store or in-memory.
– Transfer bytes via blocking files or streaming and the spectrum in
between.
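One way to picture the decision is a small policy function. The policy below is an assumption made for illustration, not the rule Tez applies: `choose_transport`, the memory budget and the `must_survive_node_loss` flag are hypothetical.

```python
from enum import Enum


class Transport(Enum):
    IN_MEMORY = "in-memory"
    LOCAL_FILE = "local file"
    DISTRIBUTED_STORE = "distributed store"


def choose_transport(output_bytes: int, memory_budget_bytes: int,
                     must_survive_node_loss: bool = False) -> Transport:
    """Pick where a producer's intermediate output should live."""
    if must_survive_node_loss:
        # Assumed rule: only the distributed store outlives the producer's node.
        return Transport.DISTRIBUTED_STORE
    if output_bytes <= memory_budget_bytes:
        # Small outputs can be handed to the consumer without touching disk.
        return Transport.IN_MEMORY
    return Transport.LOCAL_FILE


MB = 1024 ** 2
small = choose_transport(output_bytes=4 * MB, memory_budget_bytes=64 * MB)
large = choose_transport(output_bytes=900 * MB, memory_budget_bytes=64 * MB)
print(small.value, large.value)
```

Because the function is evaluated per producer once its output size is known, the same edge in the graph can end up using different transports for different tasks.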
[Diagram: At Runtime, a Producer passes its output to the Consumer through a Local File, while a Producer (small size) passes it In-Memory.]

Page 14
Tez – Automatic Reduce Parallelism
• Event Model
– Map tasks send data statistics events to the Reduce Vertex Manager.
• Vertex Manager
– Pluggable user logic that understands the data statistics and can formulate the correct parallelism. Advises vertex controller on parallelism.
[Diagram: Map Vertex, Vertex Manager, Vertex State Machine, App Master and Reduce Vertex, with a Cancel Task action.]

Page 15
Tez – Automatic Reduce Parallelism
• Event Model
– Map tasks send data statistics events to the Reduce Vertex Manager.
• Vertex Manager
– Pluggable user logic that understands the data statistics and can formulate the correct parallelism. Advises vertex controller on parallelism.
[Diagram: as on the previous slide, now with Data Size Statistics sent to the Vertex Manager.]

Page 16
Tez – Automatic Reduce Parallelism
• Event Model
– Map tasks send data statistics events to the Reduce Vertex Manager.
• Vertex Manager
– Pluggable user logic that understands the data statistics and can formulate the correct parallelism. Advises vertex controller on parallelism.
[Diagram: Map Vertex, Vertex Manager, Vertex State Machine, App Master and Reduce Vertex, with Data Size Statistics sent to the Vertex Manager, followed by Set Parallelism, Re-Route and Cancel Task actions.]

Page 17
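The event flow on the three preceding slides can be sketched as below. This is a simplified model, not the Tez classes: `ReduceVertex` and `SketchVertexManager` are hypothetical, statistics are plain lists of per-partition byte counts, and the manager waits for every map task before deciding, which is an assumption made to keep the example short.

```python
class ReduceVertex:
    """Toy reduce vertex: knows its parallelism and how partitions map to tasks."""

    def __init__(self, parallelism: int) -> None:
        self.parallelism = parallelism
        self.routing = {p: p for p in range(parallelism)}  # partition -> reducer task
        self.cancelled = []

    def set_parallelism(self, new_parallelism: int, routing: dict) -> None:
        # Reducer tasks beyond the new parallelism are no longer needed.
        self.cancelled = list(range(new_parallelism, self.parallelism))
        self.parallelism = new_parallelism
        self.routing = routing


class SketchVertexManager:
    """Collects data size statistics from map tasks, then advises the vertex."""

    def __init__(self, vertex: ReduceVertex, expected_maps: int,
                 bytes_per_reducer: int) -> None:
        self.vertex = vertex
        self.expected_maps = expected_maps
        self.bytes_per_reducer = bytes_per_reducer
        self.partition_bytes = [0] * vertex.parallelism
        self.reports = 0

    def on_statistics_event(self, partition_sizes) -> None:
        for partition, size in enumerate(partition_sizes):
            self.partition_bytes[partition] += size
        self.reports += 1
        if self.reports == self.expected_maps:
            self._reconfigure()

    def _reconfigure(self) -> None:
        old = self.vertex.parallelism
        total = sum(self.partition_bytes)
        wanted = max(1, -(-total // self.bytes_per_reducer))  # ceiling division
        wanted = min(old, wanted)
        group = -(-old // wanted)           # partitions handled by each reducer
        new_parallelism = -(-old // group)  # reducers actually needed
        routing = {p: p // group for p in range(old)}  # re-route partitions
        self.vertex.set_parallelism(new_parallelism, routing)


vertex = ReduceVertex(parallelism=4)
manager = SketchVertexManager(vertex, expected_maps=2, bytes_per_reducer=100)

manager.on_statistics_event([10, 20, 30, 40])
after_first_event = vertex.parallelism   # still 4: one map task has not reported yet
manager.on_statistics_event([10, 20, 30, 40])
print(vertex.parallelism, vertex.routing, vertex.cancelled)
```

Keeping the sizing logic in the manager and the state change in the vertex mirrors the split on the slide between pluggable user logic and the vertex controller.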
Tez – Now and Next


Page 18
Tez – Bridge the Data Spectrum
[Diagram: two query plans over a Fact Table and Dimension Tables 1 to 3. The typical pattern in a TPC-DS query chains two Broadcast Joins and a Shuffle Join, producing Result Tables 1, 2 and 3; the other plan uses a Broadcast join for small data sets.]
Based on data size, the query optimizer can run either plan as a single Tez job.

Page 19
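The size-based choice between the two join styles on the preceding slide might look like this in an optimizer. This is a sketch under assumptions: `choose_join`, `plan_joins`, the 1 GB broadcast limit and the table sizes are all hypothetical and are not taken from Hive, Pig or Tez.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def choose_join(dimension_bytes: int, broadcast_limit_bytes: int) -> str:
    """Broadcast small tables to every task; shuffle both sides otherwise."""
    return "broadcast" if dimension_bytes <= broadcast_limit_bytes else "shuffle"


def plan_joins(dimension_sizes, broadcast_limit_bytes: int):
    """Return (table name, join type) for each dimension table, in join order."""
    return [(name, choose_join(size, broadcast_limit_bytes))
            for name, size in dimension_sizes]


# Hypothetical sizes: two small dimension tables and one large one.
dimensions = [
    ("Dimension Table 1", 50 * MB),
    ("Dimension Table 2", 200 * MB),
    ("Dimension Table 3", 40 * GB),
]
plan = plan_joins(dimensions, broadcast_limit_bytes=1 * GB)
print(plan)
```

With these sizes the plan is two broadcast joins followed by a shuffle join; whichever mix the optimizer picks, the whole chain can still be submitted as one dataflow graph.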
Tez – Current status
• Apache Incubator Project
– Rapid development. Over 800 JIRAs opened. Over 600 resolved.
– Growing community of contributors and users

• Focus on stability
– Testing and quality are the highest priority.
– Code ready and deployed on multi-node environments.

• Support for a wide variety of DAG topologies
– Already functionally equivalent to MapReduce. Existing MapReduce
jobs can be executed on Tez with few or no changes.
– Hive retargeted to use Tez for execution of queries (HIVE-4660).
– Pig to use Tez for execution of scripts (PIG-3446).


Page 20
Tez – Roadmap
• Richer DAG support
– Support for co-scheduling
– Efficient iterations

• Performance optimizations
– More efficiencies in transfer of data
– Improve session performance

• Usability
– Stability and testability
– Recovery and history
– Tools for performance analysis and debugging


Page 21
Tez – Community
• Early adopters and code contributors welcome
– Adopters to drive more scenarios. Contributors to make them happen.
– Hive and Pig communities are on board and making great progress: HIVE-4660
and PIG-3446

• Tez meetup for developers and users
– http://www.meetup.com/Apache-Tez-User-Group

• Technical blog series
– http://hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-dataprocessing/ (will soon be available on the Apache Wiki)

• Useful links
– Work tracking: https://issues.apache.org/jira/browse/TEZ
– Code: https://github.com/apache/incubator-tez
– Developer list: dev@tez.incubator.apache.org
– User list: user@tez.incubator.apache.org
– Issues list: issues@tez.incubator.apache.org

Page 22
Tez – Takeaways
• Distributed execution framework that works on computations
represented as dataflow graphs
• Naturally maps to execution plans produced by query
optimizers
• Customizable execution architecture designed to enable
dynamic performance optimizations at runtime
• Works out of the box with the platform figuring out the hard
stuff
• Spans the spectrum from interactive latency to batch
• Open source Apache project – your use-cases and code are
welcome
• It works and is already being used by Hive and Pig

Page 23
Tez
Thanks for your time and attention!
Deep dive on Tez video at
http://www.infoq.com/presentations/apache-tez
Questions?
@bikassaha


Page 24

February 2014 HUG : Tez Details and Insides
