Tez Data Processing over Yarn

2. Tez – Introduction
• Distributed execution framework targeted towards data-processing applications.
• Based on expressing a computation as a dataflow graph.
• Highly customizable to meet a broad spectrum of use cases.
• Built on top of YARN – the resource management framework for Hadoop.
• Open source Apache project and Apache licensed.
© Hortonworks Inc. 2014
3. Hadoop 1 -> Hadoop 2
[Diagram: In Hadoop 1.0, Pig (data flow), Hive (SQL) and others (Cascading) sit directly on MapReduce, which handles both cluster resource management and data processing, on top of HDFS (redundant, reliable storage). In Hadoop 2.0, YARN takes over cluster resource management on top of HDFS2; Tez is the data-flow execution engine hosting Pig, Hive (SQL) and others (Cascading); MapReduce remains for batch, Storm handles real-time stream processing, and HBase/Accumulo handle online data processing.]

Monolithic:
• Resource Management
• Execution Engine
• User API

Layered:
• Resource Management – YARN
• Execution Engine – Tez
• User API – Hive, Pig, Cascading, Your App!
4. Tez – Problems that it addresses
• Expressing the computation
• Direct and elegant representation of the data processing flow
• Performance
• Late binding: make decisions as late as possible, using real data available at runtime
• Leverage the resources of the cluster efficiently
• Just works out of the box!
• Customizable engine that lets applications tailor the job to their specific requirements
• Operational simplicity
• Painless to operate, experiment with, and upgrade
5. Tez – Simplifying Operations
• Tez is a pure YARN application. Easy and safe to try it out!
• No deployments to do, no servers to run
• Enables running different versions concurrently. Easy to test new functionality while keeping stable versions for production.
• Leverages YARN local resources.
[Diagram: two client machines each run a TezClient; each picks a different Tez library version (Tez Lib 1, Tez Lib 2) from HDFS, and the TezTasks launched under the YARN Node Managers use the version their client chose.]
6. Tez – Expressing the computation
Distributed data processing jobs typically look like DAGs (Directed Acyclic Graphs).
• Vertices in the graph represent data transformations
• Edges represent data movement from producers to consumers
[Diagram: a distributed sort expressed as a DAG. A Sampler feeds samples/ranges into a Preprocessor Stage (Task-1, Task-2), which feeds a Partition Stage (Task-1, Task-2), which feeds an Aggregate Stage (Task-1, Task-2).]
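The vertex/edge model above can be sketched in plain Java. This is a toy illustration, not the Tez API; the class `ToyDag` and its methods are invented here. Vertices are named transformations, edges record producer-to-consumer data movement, and a topological order yields a valid execution order:

```java
import java.util.*;

// Toy DAG: vertices are named transformations, edges are producer -> consumer.
// NOT the Tez API, just an illustration of the dataflow-graph idea.
public class ToyDag {
    private final Map<String, List<String>> consumers = new LinkedHashMap<>();

    public ToyDag vertex(String name) {
        consumers.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    public ToyDag edge(String producer, String consumer) {
        vertex(producer);
        vertex(consumer);
        consumers.get(producer).add(consumer);
        return this;
    }

    // Kahn's algorithm: yields a valid execution order, fails on cycles.
    public List<String> topologicalOrder() {
        Map<String, Integer> indegree = new HashMap<>();
        consumers.keySet().forEach(v -> indegree.put(v, 0));
        consumers.values().forEach(cs -> cs.forEach(c -> indegree.merge(c, 1, Integer::sum)));
        Deque<String> ready = new ArrayDeque<>();
        for (String v : consumers.keySet())
            if (indegree.get(v) == 0) ready.add(v);
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String c : consumers.get(v))
                if (indegree.merge(c, -1, Integer::sum) == 0) ready.add(c);
        }
        if (order.size() != consumers.size()) throw new IllegalStateException("cycle");
        return order;
    }

    public static void main(String[] args) {
        // The distributed-sort shape: sampler/preprocessor -> partition -> aggregate.
        List<String> order = new ToyDag()
            .edge("sampler", "partition")
            .edge("preprocessor", "partition")
            .edge("partition", "aggregate")
            .topologicalOrder();
        System.out.println(order); // prints [sampler, preprocessor, partition, aggregate]
    }
}
```

Kahn's algorithm is used here because it also detects accidental cycles, which a DAG must not contain.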
7. MR is a 2-vertex subset of Tez

8. But Tez is so much more
9. Tez – Expressing the computation
Tez defines the following APIs to define the work:
• DAG API
• Defines the structure of the data processing and the relationships between producers and consumers
• Enables definition of complex data flow pipelines using simple graph connection APIs. Tez expands the logical DAG at runtime.
• This is how all the tasks in the job get specified
• Runtime API
• Defines the interface through which the framework and application code interact with each other
• Application code transforms data and moves it between tasks
• This is how we specify what actually executes in each task on the cluster nodes
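To make the Runtime API idea concrete, here is a plain-Java sketch of the contract's shape. The names here (`Processor`, `execute`) are invented, not the actual Tez Runtime interfaces: the framework wires up named inputs and outputs, and application code only transforms records between them:

```java
import java.util.*;
import java.util.function.Consumer;

// Toy sketch of a runtime contract between framework and app code.
// NOT the real Tez Runtime API: the framework wires inputs/outputs,
// the processor only transforms records.
public class ToyRuntime {
    interface Processor {
        void run(Map<String, List<String>> inputs, Map<String, Consumer<String>> outputs);
    }

    // Example app code: upper-cases every record from input "src" to output "sink".
    static final Processor UPPERCASE = (inputs, outputs) -> {
        Consumer<String> sink = outputs.get("sink");
        for (String record : inputs.get("src")) sink.accept(record.toUpperCase());
    };

    // "Framework" side: builds the channels and invokes the processor.
    public static List<String> execute(Processor p, List<String> records) {
        List<String> collected = new ArrayList<>();
        p.run(Map.of("src", records), Map.of("sink", collected::add));
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(execute(UPPERCASE, List.of("tez", "yarn"))); // prints [TEZ, YARN]
    }
}
```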
10. Tez – DAG API
[Diagram: a DAG with vertices map1, map2, reduce1, reduce2 and join1; the map-to-reduce edges are Scatter_Gather (bipartite, sequential), and reduce1/reduce2 feed join1.]

// Define DAG
DAG dag = DAG.create("sessionize");
// Define Vertex-M1
Vertex m1 = Vertex.create("M_1",
    ProcessorDescriptor.create(MapProcessor_1.class.getName()));
// Define Vertex-R1
Vertex r1 = Vertex.create("R_1",
    ProcessorDescriptor.create(ReduceProcessor_1.class.getName()));
…
// Define Edge (edge between m1 and r1)
Edge e1 = Edge.create(m1, r1,
    OrderedPartitionedKVEdgeConfig.newBuilder(…).build()
        .createDefaultEdgeProperty());
// Connect them
dag.addVertex(m1).addVertex(r1).addEdge(e1)…

Defines the global processing flow
12. Tez – Logical DAG expansion at Runtime
[Diagram: the logical vertices map1, map2, reduce1, reduce2 and join1 are each expanded into multiple parallel tasks at runtime.]
14. Tez – Library of Inputs and Outputs
[Diagram: a classical 'Map' is an HDFS Input feeding a Map Processor with a Sorted Output; a classical 'Reduce' is a Shuffle Input feeding a Reduce Processor with an HDFS Output; an intermediate 'Reduce' for Map-Reduce-Reduce pipelines is a Shuffle Input feeding a Reduce Processor with a Sorted Output.]
• What is built in?
– Hadoop InputFormat/OutputFormat
– OrderedPartitioned Key-Value Input/Output
– UnorderedPartitioned Key-Value Input/Output
– Key-Value Input/Output
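The idea behind an ordered, partitioned key-value output can be sketched in plain Java. This illustrates only the concept (the class and method names are invented; Tez's real implementation differs): records are partitioned by key hash and kept key-sorted within each partition, which is the layout a downstream shuffle input consumes:

```java
import java.util.*;

// Concept sketch: partition records by key hash, then keep each partition
// sorted by key. Illustrates what an "ordered, partitioned" KV output
// produces for a shuffle; NOT Tez's actual implementation.
public class OrderedPartitionedSketch {
    public static List<TreeMap<String, List<Integer>>> write(
            List<Map.Entry<String, Integer>> records, int numPartitions) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) partitions.add(new TreeMap<>());
        for (Map.Entry<String, Integer> kv : records) {
            // Hash-partition on the key, like a partitioner in a shuffle.
            int p = Math.floorMod(kv.getKey().hashCode(), numPartitions);
            partitions.get(p)
                      .computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                      .add(kv.getValue());
        }
        return partitions; // each TreeMap is one partition, already key-sorted
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> recs = List.of(
            Map.entry("b", 1), Map.entry("a", 2), Map.entry("b", 3));
        for (TreeMap<String, List<Integer>> part : write(recs, 2))
            System.out.println(part);
    }
}
```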
15. Tez – Container Re-Use
• Reuse YARN containers/JVMs to launch new tasks
• Reduce scheduling and launching delays
• Shared in-memory data across tasks
• JVM JIT friendly execution
[Diagram: the Tez Application Master, in its own YARN container, signals "Start Task"/"Task Done" to a TezTask host where TezTask1 and TezTask2 run one after another in the same YARN container/JVM and access shared objects.]
16. Container reuse
• Tez-specific feature
• Run an entire DAG using the same containers
• Different vertices use the same container
• Saves time talking to YARN for new containers
17. Tez – Sessions
[Diagram: a client starts a session and submits DAGs to a long-lived Application Master holding a task scheduler, a container pool, a shared object registry, and pre-warmed JVMs.]
• Standard concepts of pre-launch and pre-warm applied
• Key for interactive queries
• Represents a connection between the user and the cluster
• Multiple DAGs/queries executed in the same AM
• Containers re-used across queries
• Takes care of data locality and releasing resources when idle
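The shared object registry in the session diagram can be sketched as a simple name-to-object cache that tasks from successive DAGs consult before recomputing expensive state. This is a toy sketch with invented names, not Tez's actual registry API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Toy sketch of a session-scoped shared object registry: tasks running in the
// same reused JVM cache expensive objects (e.g. a loaded dictionary) across
// DAGs instead of rebuilding them. NOT Tez's actual registry API.
public class SharedRegistrySketch {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    public <T> T getOrCreate(String key, Supplier<T> factory) {
        // Build the object only if no earlier task already cached it.
        return (T) cache.computeIfAbsent(key, k -> factory.get());
    }

    public static void main(String[] args) {
        SharedRegistrySketch registry = new SharedRegistrySketch();
        // First DAG's task builds the object...
        String v1 = registry.getOrCreate("dict", () -> "expensive-to-build");
        // ...a later DAG's task in the same session gets the cached instance.
        String v2 = registry.getOrCreate("dict", () -> "never-built-again");
        System.out.println(v1 == v2); // prints true
    }
}
```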
18. Tez – Re-Use in Action (In Session)
Task Execution Timeline
19. Tez – Auto Reduce Parallelism
21. Tez – End User Benefits
• Better performance
• Framework performance + application performance
• Better utilization of cluster resources
• Efficient use of allocated resources
• Better predictability of results
• Minimized queuing delays
• Reduced load on HDFS
• Removes unnecessary HDFS writes
• Reduced network usage
• Efficient data transfer using new data patterns
• Increased developer productivity
• Lets the user concentrate on application logic instead of Hadoop internals
22. Tez – Real World Examples
23. Tez – Broadcast Edge
SELECT ss.ss_item_sk, avg_price, inv.inv_quantity_on_hand
FROM (select avg(ss_sales_price) as avg_price, ss_item_sk from store_sales
      group by ss_item_sk) ss
JOIN inventory inv
ON (inv.inv_item_sk = ss.ss_item_sk);
[Diagram: with MR, the store_sales scan with group-by/aggregation writes its output to HDFS, and a second job scans inventory and the aggregated output for a shuffle join. With Tez, the aggregation output flows directly to the join tasks over a broadcast edge, with no intermediate HDFS write.]
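The broadcast edge above amounts to a broadcast hash join: the small side (the aggregated averages) is shipped whole to every task scanning the large side, avoiding a shuffle. A plain-Java sketch of the idea (illustrative only; not Hive or Tez code, and all names are invented):

```java
import java.util.*;

// Concept sketch of a broadcast hash join: the small (broadcast) side is
// loaded into a hash map that every task probes while scanning the big side.
// Illustrative only; NOT Hive or Tez code.
public class BroadcastJoinSketch {
    // smallSide: item_sk -> avg_price (broadcast to all tasks).
    // bigSide:   (item_sk, quantity_on_hand) rows, scanned locally per task.
    public static List<String> join(Map<Integer, Double> smallSide,
                                    List<int[]> bigSide) {
        List<String> out = new ArrayList<>();
        for (int[] row : bigSide) {
            Double avgPrice = smallSide.get(row[0]); // probe the broadcast table
            if (avgPrice != null)
                out.add(row[0] + "," + avgPrice + "," + row[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Double> avgPrices = Map.of(1, 9.5, 2, 3.0);
        List<int[]> inventory = List.of(new int[]{1, 100}, new int[]{3, 50});
        System.out.println(join(avgPrices, inventory)); // prints [1,9.5,100]
    }
}
```

Broadcasting pays off when one side is small enough to fit in memory on every task; otherwise a shuffle (scatter-gather) join is needed.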
24. Tez – Multiple Outputs
FROM (SELECT * FROM store_sales, date_dim WHERE ss_sold_date_sk = d_date_sk and d_year = 2000)
INSERT INTO TABLE t1 SELECT distinct ss_item_sk
INSERT INTO TABLE t2 SELECT distinct ss_customer_sk;
[Diagram: Hive on MR map-joins date_dim/store_sales, materializes the join on HDFS, then runs two more MR jobs to compute the distincts. Hive on Tez runs a single DAG: a broadcast join (scan date_dim, join store_sales) feeds two reducer branches that compute the distinct items and distinct customers, with no intermediate materialization on HDFS.]
Hive: multi-insert queries
25. Tez – Data at scale
[Chart: Hive TPC-DS benchmark, scale 10 TB.]
26. Tez – what if you can’t get enough containers?
• 78 vertices + 8374 tasks on 50 YARN containers
27. Tez – Designed for big, busy clusters
• Number of stages in the DAG
• The more stages in the DAG, the better Tez performs relative to MR.
• Cluster/queue capacity
• The more congested a queue is, the better Tez performs relative to MR, due to container reuse.
• Size of intermediate output
• The larger the intermediate output, the better Tez performs relative to MR, due to reduced HDFS usage (cross-rack traffic).
• Size of data in the job
• For smaller data and more stages, Tez performs better relative to MR, since launch overhead is a larger percentage of total time for small jobs.
• Move workloads from gateway boxes to the cluster
• Move as much work as possible to the cluster by modelling it via the job DAG. Exploit the parallelism and resources of the cluster.
28. Tez – Adoption
• Hive
• Hadoop standard for declarative access via a SQL-like interface
• Pig
• Hadoop standard for procedural scripting and pipeline processing
• Cascading
• Developer-friendly Java API and SDK
• Scalding (Scala API on Cascading)
• Commercial vendors
• ETL: use Tez instead of MR or custom pipelines
• Analytics vendors: use Tez as a target platform for scaling parallel analytical tools to large data-sets
29. Tez – Community
• Early adopters and code contributors welcome
– Adopters to drive more scenarios. Contributors to make them happen.
• Technical blog series
– http://hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-data-processing
• Useful links
– Work tracking: https://issues.apache.org/jira/browse/TEZ
– Code: https://github.com/apache/tez
– Developer list: dev@tez.apache.org
– User list: user@tez.apache.org
– Issues list: issues@tez.apache.org
Editor's Notes
• Container reuse
• Fault tolerance
• Recovery
• Routing data efficiently
• Elasticity
• It is hard to expect the framework to do the last bit of optimization. Sometimes the user would like to instruct the framework on what needs to happen at runtime; Tez allows such customizations.
• It is easy to operate, experiment with, and upgrade a Tez deployment. This is hugely important, since the previous MR deployment required downtime for the entire cluster. Talk a little bit about custom edges as well; Hive has written its own processor.