Tez Data Processing over Yarn

2. Tez – Introduction
• Distributed execution framework targeted towards data-processing applications.
• Based on expressing a computation as a dataflow graph.
• Highly customizable to meet a broad spectrum of use cases.
• Built on top of YARN – the resource management framework for Hadoop.
• Open source Apache project and Apache licensed.
© Hortonworks Inc. 2014
3. Hadoop 1 -> Hadoop 2
[Diagram: In Hadoop 1.0, Pig (data flow), Hive (SQL) and others (Cascading) sit directly on MapReduce, which handles both cluster resource management and data processing, on top of HDFS (redundant, reliable storage). In Hadoop 2.0, YARN takes over cluster resource management on top of HDFS2; Tez is the data-flow execution engine hosting Pig, Hive (SQL) and others (Cascading); MapReduce remains for batch, Storm handles real-time stream processing, and HBase/Accumulo handle online data processing.]

Monolithic:
• Resource Management
• Execution Engine
• User API

Layered:
• Resource Management – YARN
• Execution Engine – Tez
• User API – Hive, Pig, Cascading, Your App!
4. Tez – Problems that it addresses
• Expressing the computation
• Direct and elegant representation of the data processing flow
• Performance
• Late binding: make decisions as late as possible, using real data available at runtime
• Leverage the resources of the cluster efficiently
• Just works out of the box!
• Customizable engine that lets applications tailor the job to their specific requirements
• Operational simplicity
• Painless to operate, experiment with, and upgrade
5. Tez – Simplifying Operations
• Tez is a pure YARN application. Easy and safe to try it out!
• No deployments to do, no servers to run
• Enables running different versions concurrently. Easy to test new functionality while keeping stable versions for production.
• Leverages YARN local resources.
[Diagram: two client machines each run a TezClient; each picks a different Tez library version (Tez Lib 1, Tez Lib 2) from HDFS, and the TezTasks launched under the YARN Node Managers use the version their client chose.]
6. Tez – Expressing the computation
Distributed data processing jobs typically look like DAGs (Directed Acyclic Graphs).
• Vertices in the graph represent data transformations
• Edges represent data movement from producers to consumers
[Diagram: a distributed sort expressed as a DAG. A Sampler feeds samples/ranges into a Preprocessor Stage (Task-1, Task-2), which feeds a Partition Stage (Task-1, Task-2), which feeds an Aggregate Stage (Task-1, Task-2).]
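The vertex/edge model above can be sketched in plain Java. This is a toy illustration, not the Tez API; the class `ToyDag` and its methods are invented here. Vertices are named transformations, edges record producer-to-consumer data movement, and a topological order yields a valid execution order:

```java
import java.util.*;

// Toy DAG: vertices are named transformations, edges are producer -> consumer.
// NOT the Tez API, just an illustration of the dataflow-graph idea.
public class ToyDag {
    private final Map<String, List<String>> consumers = new LinkedHashMap<>();

    public ToyDag vertex(String name) {
        consumers.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    public ToyDag edge(String producer, String consumer) {
        vertex(producer);
        vertex(consumer);
        consumers.get(producer).add(consumer);
        return this;
    }

    // Kahn's algorithm: yields a valid execution order, fails on cycles.
    public List<String> topologicalOrder() {
        Map<String, Integer> indegree = new HashMap<>();
        consumers.keySet().forEach(v -> indegree.put(v, 0));
        consumers.values().forEach(cs -> cs.forEach(c -> indegree.merge(c, 1, Integer::sum)));
        Deque<String> ready = new ArrayDeque<>();
        for (String v : consumers.keySet())
            if (indegree.get(v) == 0) ready.add(v);
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String c : consumers.get(v))
                if (indegree.merge(c, -1, Integer::sum) == 0) ready.add(c);
        }
        if (order.size() != consumers.size()) throw new IllegalStateException("cycle");
        return order;
    }

    public static void main(String[] args) {
        // The distributed-sort shape: sampler/preprocessor -> partition -> aggregate.
        List<String> order = new ToyDag()
            .edge("sampler", "partition")
            .edge("preprocessor", "partition")
            .edge("partition", "aggregate")
            .topologicalOrder();
        System.out.println(order); // prints [sampler, preprocessor, partition, aggregate]
    }
}
```

Kahn's algorithm is used here because it also detects accidental cycles, which a DAG must not contain.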
7. MR is a 2-vertex subset of Tez

8. But Tez is so much more
9. Tez – Expressing the computation
Tez defines the following APIs to define the work:
• DAG API
• Defines the structure of the data processing and the relationships between producers and consumers
• Enables definition of complex data flow pipelines using simple graph connection APIs. Tez expands the logical DAG at runtime.
• This is how all the tasks in the job get specified
• Runtime API
• Defines the interface through which the framework and application code interact with each other
• Application code transforms data and moves it between tasks
• This is how we specify what actually executes in each task on the cluster nodes
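To make the Runtime API idea concrete, here is a plain-Java sketch of the contract's shape. The names here (`Processor`, `execute`) are invented, not the actual Tez Runtime interfaces: the framework wires up named inputs and outputs, and application code only transforms records between them:

```java
import java.util.*;
import java.util.function.Consumer;

// Toy sketch of a runtime contract between framework and app code.
// NOT the real Tez Runtime API: the framework wires inputs/outputs,
// the processor only transforms records.
public class ToyRuntime {
    interface Processor {
        void run(Map<String, List<String>> inputs, Map<String, Consumer<String>> outputs);
    }

    // Example app code: upper-cases every record from input "src" to output "sink".
    static final Processor UPPERCASE = (inputs, outputs) -> {
        Consumer<String> sink = outputs.get("sink");
        for (String record : inputs.get("src")) sink.accept(record.toUpperCase());
    };

    // "Framework" side: builds the channels and invokes the processor.
    public static List<String> execute(Processor p, List<String> records) {
        List<String> collected = new ArrayList<>();
        p.run(Map.of("src", records), Map.of("sink", collected::add));
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(execute(UPPERCASE, List.of("tez", "yarn"))); // prints [TEZ, YARN]
    }
}
```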
10. Tez – DAG API
[Diagram: a DAG with vertices map1, map2, reduce1, reduce2 and join1; the map-to-reduce edges are Scatter_Gather (bipartite, sequential), and reduce1/reduce2 feed join1.]

// Define DAG
DAG dag = DAG.create("sessionize");
// Define Vertex-M1
Vertex m1 = Vertex.create("M_1",
    ProcessorDescriptor.create(MapProcessor_1.class.getName()));
// Define Vertex-R1
Vertex r1 = Vertex.create("R_1",
    ProcessorDescriptor.create(ReduceProcessor_1.class.getName()));
…
// Define Edge (edge between m1 and r1)
Edge e1 = Edge.create(m1, r1,
    OrderedPartitionedKVEdgeConfig.newBuilder(…).build()
        .createDefaultEdgeProperty());
// Connect them
dag.addVertex(m1).addVertex(r1).addEdge(e1)…

Defines the global processing flow
12. Tez – Logical DAG expansion at Runtime
[Diagram: the logical vertices map1, map2, reduce1, reduce2 and join1 are each expanded into multiple parallel tasks at runtime.]
14. Tez – Library of Inputs and Outputs
[Diagram: a classical 'Map' is an HDFS Input feeding a Map Processor with a Sorted Output; a classical 'Reduce' is a Shuffle Input feeding a Reduce Processor with an HDFS Output; an intermediate 'Reduce' for Map-Reduce-Reduce pipelines is a Shuffle Input feeding a Reduce Processor with a Sorted Output.]
• What is built in?
– Hadoop InputFormat/OutputFormat
– OrderedPartitioned Key-Value Input/Output
– UnorderedPartitioned Key-Value Input/Output
– Key-Value Input/Output
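The idea behind an ordered, partitioned key-value output can be sketched in plain Java. This illustrates only the concept (the class and method names are invented; Tez's real implementation differs): records are partitioned by key hash and kept key-sorted within each partition, which is the layout a downstream shuffle input consumes:

```java
import java.util.*;

// Concept sketch: partition records by key hash, then keep each partition
// sorted by key. Illustrates what an "ordered, partitioned" KV output
// produces for a shuffle; NOT Tez's actual implementation.
public class OrderedPartitionedSketch {
    public static List<TreeMap<String, List<Integer>>> write(
            List<Map.Entry<String, Integer>> records, int numPartitions) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) partitions.add(new TreeMap<>());
        for (Map.Entry<String, Integer> kv : records) {
            // Hash-partition on the key, like a partitioner in a shuffle.
            int p = Math.floorMod(kv.getKey().hashCode(), numPartitions);
            partitions.get(p)
                      .computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                      .add(kv.getValue());
        }
        return partitions; // each TreeMap is one partition, already key-sorted
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> recs = List.of(
            Map.entry("b", 1), Map.entry("a", 2), Map.entry("b", 3));
        for (TreeMap<String, List<Integer>> part : write(recs, 2))
            System.out.println(part);
    }
}
```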
15. Tez – Container Re-Use
• Reuse YARN containers/JVMs to launch new tasks
• Reduce scheduling and launching delays
• Shared in-memory data across tasks
• JVM JIT friendly execution
[Diagram: the Tez Application Master, in its own YARN container, signals "Start Task"/"Task Done" to a TezTask host where TezTask1 and TezTask2 run one after another in the same YARN container/JVM and access shared objects.]
16. Container reuse
• Tez-specific feature
• Run an entire DAG using the same containers
• Different vertices use the same container
• Saves time talking to YARN for new containers
17. Tez – Sessions
[Diagram: a client starts a session and submits DAGs to a long-lived Application Master holding a task scheduler, a container pool, a shared object registry, and pre-warmed JVMs.]
• Standard concepts of pre-launch and pre-warm applied
• Key for interactive queries
• Represents a connection between the user and the cluster
• Multiple DAGs/queries executed in the same AM
• Containers re-used across queries
• Takes care of data locality and releasing resources when idle
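The shared object registry in the session diagram can be sketched as a simple name-to-object cache that tasks from successive DAGs consult before recomputing expensive state. This is a toy sketch with invented names, not Tez's actual registry API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Toy sketch of a session-scoped shared object registry: tasks running in the
// same reused JVM cache expensive objects (e.g. a loaded dictionary) across
// DAGs instead of rebuilding them. NOT Tez's actual registry API.
public class SharedRegistrySketch {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    public <T> T getOrCreate(String key, Supplier<T> factory) {
        // Build the object only if no earlier task already cached it.
        return (T) cache.computeIfAbsent(key, k -> factory.get());
    }

    public static void main(String[] args) {
        SharedRegistrySketch registry = new SharedRegistrySketch();
        // First DAG's task builds the object...
        String v1 = registry.getOrCreate("dict", () -> "expensive-to-build");
        // ...a later DAG's task in the same session gets the cached instance.
        String v2 = registry.getOrCreate("dict", () -> "never-built-again");
        System.out.println(v1 == v2); // prints true
    }
}
```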
18. Tez – Re-Use in Action (In Session)
Task Execution Timeline
19. Tez – Auto Reduce Parallelism
21. Tez – End User Benefits
• Better performance
• Framework performance + application performance
• Better utilization of cluster resources
• Efficient use of allocated resources
• Better predictability of results
• Minimized queuing delays
• Reduced load on HDFS
• Removes unnecessary HDFS writes
• Reduced network usage
• Efficient data transfer using new data patterns
• Increased developer productivity
• Lets the user concentrate on application logic instead of Hadoop internals
22. Tez – Real World Examples
23. Tez – Broadcast Edge
SELECT ss.ss_item_sk, avg_price, inv.inv_quantity_on_hand
FROM (select avg(ss_sales_price) as avg_price, ss_item_sk from store_sales
      group by ss_item_sk) ss
JOIN inventory inv
ON (inv.inv_item_sk = ss.ss_item_sk);
[Diagram: with MR, the store_sales scan with group-by/aggregation writes its output to HDFS, and a second job scans inventory and the aggregated output for a shuffle join. With Tez, the aggregation output flows directly to the join tasks over a broadcast edge, with no intermediate HDFS write.]
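The broadcast edge above amounts to a broadcast hash join: the small side (the aggregated averages) is shipped whole to every task scanning the large side, avoiding a shuffle. A plain-Java sketch of the idea (illustrative only; not Hive or Tez code, and all names are invented):

```java
import java.util.*;

// Concept sketch of a broadcast hash join: the small (broadcast) side is
// loaded into a hash map that every task probes while scanning the big side.
// Illustrative only; NOT Hive or Tez code.
public class BroadcastJoinSketch {
    // smallSide: item_sk -> avg_price (broadcast to all tasks).
    // bigSide:   (item_sk, quantity_on_hand) rows, scanned locally per task.
    public static List<String> join(Map<Integer, Double> smallSide,
                                    List<int[]> bigSide) {
        List<String> out = new ArrayList<>();
        for (int[] row : bigSide) {
            Double avgPrice = smallSide.get(row[0]); // probe the broadcast table
            if (avgPrice != null)
                out.add(row[0] + "," + avgPrice + "," + row[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Double> avgPrices = Map.of(1, 9.5, 2, 3.0);
        List<int[]> inventory = List.of(new int[]{1, 100}, new int[]{3, 50});
        System.out.println(join(avgPrices, inventory)); // prints [1,9.5,100]
    }
}
```

Broadcasting pays off when one side is small enough to fit in memory on every task; otherwise a shuffle (scatter-gather) join is needed.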
24. Tez – Multiple Outputs
FROM (SELECT * FROM store_sales, date_dim WHERE ss_sold_date_sk = d_date_sk and d_year = 2000)
INSERT INTO TABLE t1 SELECT distinct ss_item_sk
INSERT INTO TABLE t2 SELECT distinct ss_customer_sk;
[Diagram: Hive on MR map-joins date_dim/store_sales, materializes the join on HDFS, then runs two more MR jobs to compute the distincts. Hive on Tez runs a single DAG: a broadcast join (scan date_dim, join store_sales) feeds two reducer branches that compute the distinct items and distinct customers, with no intermediate materialization on HDFS.]
Hive: multi-insert queries
25. Tez – Data at scale
[Chart: Hive TPC-DS benchmark, scale 10 TB.]
26. Tez – what if you can’t get enough containers?
• 78 vertices + 8374 tasks on 50 YARN containers
27. Tez – Designed for big, busy clusters
• Number of stages in the DAG
• The more stages in the DAG, the better Tez performs relative to MR.
• Cluster/queue capacity
• The more congested a queue is, the better Tez performs relative to MR, due to container reuse.
• Size of intermediate output
• The larger the intermediate output, the better Tez performs relative to MR, due to reduced HDFS usage (cross-rack traffic).
• Size of data in the job
• For smaller data and more stages, Tez performs better relative to MR, since launch overhead is a larger percentage of total time for small jobs.
• Move workloads from gateway boxes to the cluster
• Move as much work as possible to the cluster by modelling it via the job DAG. Exploit the parallelism and resources of the cluster.
28. Tez – Adoption
• Hive
• Hadoop standard for declarative access via a SQL-like interface
• Pig
• Hadoop standard for procedural scripting and pipeline processing
• Cascading
• Developer-friendly Java API and SDK
• Scalding (Scala API on Cascading)
• Commercial vendors
• ETL: use Tez instead of MR or custom pipelines
• Analytics vendors: use Tez as a target platform for scaling parallel analytical tools to large data-sets
29. Tez – Community
• Early adopters and code contributors welcome
– Adopters to drive more scenarios. Contributors to make them happen.
• Technical blog series
– http://hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-data-processing
• Useful links
– Work tracking: https://issues.apache.org/jira/browse/TEZ
– Code: https://github.com/apache/tez
– Developer list: dev@tez.apache.org
– User list: user@tez.apache.org
– Issues list: issues@tez.apache.org
Editor's Notes
• Container reuse
• Fault tolerance
• Recovery
• Routing data efficiently
• Elasticity
• It is hard to expect the framework to do the last bit of optimization. Sometimes the user would like to instruct the framework on what needs to happen at runtime; Tez allows such customizations.
• It is easy to operate, experiment with, and upgrade a Tez deployment. This is hugely important, since the previous MR deployment required downtime for the entire cluster. Talk a little bit about custom edges as well; Hive has written its own processor.