Apache Tez : Accelerating Hadoop Query Processing


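Jeff Markham, Technical Director for APAC at Hortonworks, introduces Tez. Tez is software that replaces MapReduce to accelerate query processing on Hadoop. The talk covers why Tez was built, how it is structured, how its optimizations work, and how much performance has improved.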

Published in: Technology

Apache Tez : Accelerating Hadoop Query Processing

  1. Apache Tez: Accelerating Hadoop Query Processing
     Jeff Markham, Technical Director, APAC, Hortonworks
  2. Tez – Introduction
     • Distributed execution framework targeted towards data-processing applications.
     • Based on expressing a computation as a dataflow graph.
     • Built on top of YARN, the resource management framework for Hadoop.
     • Open source Apache Incubator project, Apache licensed.
     © Hortonworks Inc. 2013
  3. YARN: Taking Hadoop Beyond Batch
     [Diagram: two software stacks side by side.]
     • Hadoop 1.0, MapReduce as base: Pig (data flow), Hive (SQL) and others (Cascading) run on MapReduce (cluster resource management and data processing), which runs on HDFS (redundant, reliable storage).
     • Hadoop 2.0, Apache Tez as base: batch (MapReduce), data flow (Pig), SQL (Hive) and others (Cascading) run on Tez (execution engine), next to online data processing (HBase, Accumulo) and real-time stream processing (Storm). All of them run on YARN (cluster resource management), which runs on HDFS2 (redundant, reliable storage).
  4. Apache Tez (“Speed”)
     • Replaces MapReduce as the primitive for Pig, Hive, Cascading etc.
       – Lower latency for interactive queries
       – Higher throughput for batch queries
       – 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft
     • Task with pluggable Input, Processor and Output: Tez Task = <Input, Processor, Output>
     • YARN ApplicationMaster runs a DAG of Tez Tasks
  5. Tez: Building blocks for scalable data processing
     • Classical ‘Map’: HDFS Input → Map Processor → Sorted Output
     • Classical ‘Reduce’: Shuffle Input → Reduce Processor → HDFS Output
     • Intermediate ‘Reduce’ for Map-Reduce-Reduce: Shuffle Input → Reduce Processor → Sorted Output
  6. Hive-on-MR vs. Hive-on-Tez
     Tez avoids unneeded writes to HDFS.
       SELECT a.x, AVERAGE(b.y) AS avg FROM a JOIN b ON (a.id = b.id) GROUP BY a
       UNION
       SELECT x, AVERAGE(y) AS AVG FROM c GROUP BY x ORDER BY AVG;
     [Diagram: the same query plan drawn twice. Hive – MR shows chains of map (M) and reduce (R) stages with HDFS between them; Hive – Tez shows the M and R stages connected directly. Operator labels in the figure: SELECT a.state; SELECT a.state, c.itemId; SELECT b.id; SELECT c.price; JOIN (a, c); JOIN(a, b); GROUP BY a.state; COUNT(*); AVERAGE(c.price).]
  7. Tez Sessions … because Map/Reduce query startup is expensive
     • Tez Sessions
       – Hot containers ready for immediate use
       – Removes task and job launch overhead (~5s – 30s)
     • Hive
       – Session launch/shutdown in background (seamless, user not aware)
       – Submits query plan directly to Tez Session
     Native Hadoop service, not ad-hoc
  8. Tez Delivers Interactive Query - Out of the Box!
     Feature | Description | Benefit
     • Tez Session | Overcomes Map-Reduce job-launch latency by pre-launching the Tez AppMaster. | Latency
     • Tez Container Pre-Launch | Overcomes Map-Reduce latency by pre-launching hot containers ready to serve queries. | Latency
     • Tez Container Re-Use | Finished maps and reduces pick up more work rather than exiting. Reduces latency and eliminates difficult split-size tuning. Out of box performance! | Latency
     • Runtime re-configuration of DAG | Runtime query tuning by picking aggregation parallelism using online query statistics. | Throughput
     • Tez In-Memory Cache | Hot data kept in RAM for fast access. | Latency
     • Complex DAGs | Tez Broadcast Edge and Map-Reduce-Reduce pattern improve query scale and throughput. | Throughput
  9. Tez – Design Themes
     • Empowering End Users
     • Execution Performance
  10. Tez – Empowering End Users
     • Expressive dataflow definition APIs
     • Flexible Input-Processor-Output runtime model
     • Data type agnostic
     • Simplifying deployment
  11. Tez – Empowering End Users
     • Expressive dataflow definition APIs
       – Enable definition of complex data flow pipelines using simple graph connection APIs. Tez expands the logical plan at runtime.
       – Targeted towards data processing applications like Hive and Pig, but not limited to them. Hive and Pig query plans map naturally to Tez dataflow graphs with no translation impedance.
     [Diagram: a dataflow graph expanded into tasks TaskA-1, TaskA-2, TaskB-1, TaskB-2, TaskC-1, TaskC-2, TaskD-1, TaskD-2, TaskE-1, TaskE-2.]
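The idea of declaring a logical graph and letting the framework expand it into tasks can be illustrated with a small toy model. This is a minimal sketch in plain Python, not the Tez API: the `Dag` class, its method names, and the edges used in the example are all hypothetical (the slide does not say how vertices A to E are connected).

```python
from dataclasses import dataclass, field


@dataclass
class Dag:
    """Hypothetical logical dataflow graph: named vertices plus directed edges."""
    parallelism: dict = field(default_factory=dict)  # vertex name -> task count
    edges: list = field(default_factory=list)        # (source, destination) pairs

    def add_vertex(self, name, parallelism):
        self.parallelism[name] = parallelism
        return self

    def connect(self, src, dst):
        self.edges.append((src, dst))
        return self

    def expand(self):
        """Expand every logical vertex into its tasks, producers before consumers."""
        indegree = {name: 0 for name in self.parallelism}
        for _, dst in self.edges:
            indegree[dst] += 1
        ready = sorted(name for name, deg in indegree.items() if deg == 0)
        tasks = []
        while ready:
            name = ready.pop(0)
            count = self.parallelism[name]
            tasks.extend(f"Task{name}-{i}" for i in range(1, count + 1))
            for src, dst in self.edges:
                if src == name:
                    indegree[dst] -= 1
                    if indegree[dst] == 0:
                        ready.append(dst)
        if len(tasks) != sum(self.parallelism.values()):
            raise ValueError("graph contains a cycle")
        return tasks


# Assumed wiring for illustration only: A and B both feed D.
dag = (
    Dag()
    .add_vertex("A", 2)
    .add_vertex("B", 2)
    .add_vertex("D", 2)
    .connect("A", "D")
    .connect("B", "D")
)
physical_tasks = dag.expand()
```

The caller only names vertices and connections; the per-vertex task count is a separate number, which is what allows a framework to change it at runtime without touching the graph definition.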
  12. Tez – Empowering End Users
     • Expressive dataflow definition APIs
     [Diagram: a distributed sort drawn as a dataflow graph. Labels in the figure: Preprocessor Stage (Task-1, Task-2), Samples, Sampler, Ranges, Partition Stage (Task-1, Task-2), Aggregate Stage (Task-1, Task-2).]
  13. Tez – Empowering End Users
     • Flexible Input-Processor-Output runtime model
       – Construct physical runtime executors dynamically by connecting different inputs, processors and outputs.
       – End goal is to have a library of inputs, outputs and processors that can be programmatically composed to generate useful tasks.
     [Diagram: three composed tasks.]
     • Mapper: HDFSInput → MapProcessor → FileSortedOutput
     • Reducer: ShuffleInput → ReduceProcessor → HDFSOutput
     • PairwiseJoin: Input1, Input2 → JoinProcessor → FileSortedOutput
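The Input-Processor-Output composition can be sketched with plain functions. This is an illustrative toy under stated assumptions, not Tez code: `make_task`, `list_input`, `map_processor`, `reduce_processor` and `sorted_output` are hypothetical names, and in-memory lists stand in for HDFS and shuffle storage.

```python
def make_task(read, process, write):
    """Compose a task from an input, a processor and an output (all hypothetical)."""
    def run():
        write(process(read()))
    return run


def list_input(records):
    # Input: yields whatever is in the backing list at the time the task runs.
    return lambda: iter(records)


def map_processor(lines):
    # Processor: emit (word, 1) for every word, as a classical map would.
    for line in lines:
        for word in line.split():
            yield (word, 1)


def reduce_processor(pairs):
    # Processor: sum the counts per key, as a classical reduce would.
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return totals.items()


def sorted_output(sink):
    # Output: write records to the sink in sorted order.
    def write(records):
        sink.extend(sorted(records))
    return write


shuffled, result = [], []
mapper = make_task(list_input(["a b", "a"]), map_processor, sorted_output(shuffled))
reducer = make_task(list_input(shuffled), reduce_processor, sorted_output(result))
mapper()
reducer()
```

Swapping any one of the three parts, for example a different processor between the same input and output, yields a new kind of task without changing the other two, which is the point of the library-of-parts goal on the slide.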
  14. Tez – Empowering End Users
     • Data type agnostic
       – Tez is only concerned with the movement of data: files and streams of bytes.
       – Does not impose any data format on the user application. An MR application can use key-value pairs on top of Tez. Hive and Pig can use tuple-oriented formats that are natural and native to them.
     [Diagram: the Tez Task layer moves bytes as files or streams; user code on top sees key-value pairs or tuples.]
  15. Tez – Empowering End Users
     • Simplifying deployment
       – Tez is a completely client-side application.
       – No deployments to do. Simply upload to any accessible FileSystem and change the local Tez configuration to point to it.
       – Enables running different versions concurrently. Easy to test new functionality while keeping stable versions for production.
       – Leverages YARN local resources.
     [Diagram: HDFS holds Tez Lib 1 and Tez Lib 2; a TezClient on each client machine points at one of them, and TezTasks run on the Node Managers.]
  16. Tez – Empowering End Users
     • Expressive dataflow definition APIs
     • Flexible Input-Processor-Output runtime model
     • Data type agnostic
     • Simplifying usage
     With great power APIs come great responsibilities :)
     Tez is a framework on which end user applications can be built.
  17. Tez – Execution Performance
     • Performance gains over Map Reduce
     • Optimal resource management
     • Plan reconfiguration at runtime
     • Dynamic physical data flow decisions
  18. Tez – Execution Performance
     • Performance gains over Map Reduce
       – Eliminate replicated write barrier between successive computations.
       – Eliminate job launch overhead of workflow jobs.
       – Eliminate extra stage of map reads in every workflow job.
       – Eliminate queue and resource contention suffered by workflow jobs that are started after a predecessor job completes.
     [Diagram: Pig/Hive on Tez compared with Pig/Hive on MR.]
  19. Tez – Execution Performance
     • Plan reconfiguration at runtime
       – Dynamic runtime concurrency control based on data size, user operator resources, available cluster resources and locality.
       – Advanced changes in dataflow graph structure.
       – Progressive graph construction in concert with user optimizer.
     [Diagram: inputs are HDFS blocks and YARN resources. Planned: Stage 1 with 50 maps and 100 partitions, Stage 2 with 100 reducers. At runtime only 10 GB of data is seen, so Stage 2 is reduced from 100 to 10 reducers.]
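The 100-to-10 reducer example can be worked through in a few lines. This is a sketch of one possible sizing rule, not the rule Tez uses: the function names are hypothetical, and the 1 GB-per-reducer target is an assumption chosen so that 10 GB yields the 10 reducers shown in the figure.

```python
GB = 1024 ** 3


def choose_reducer_count(total_bytes, planned_reducers, bytes_per_reducer):
    """Pick a reducer count from the observed data size (assumed sizing rule)."""
    if total_bytes < 0 or planned_reducers < 1 or bytes_per_reducer < 1:
        raise ValueError("sizes must be non-negative and counts positive")
    wanted = (total_bytes + bytes_per_reducer - 1) // bytes_per_reducer  # ceiling
    return max(1, min(planned_reducers, wanted))


def assign_partitions(num_partitions, num_reducers):
    """Give each reducer a contiguous, near-equal range of partition indices."""
    base, extra = divmod(num_partitions, num_reducers)
    groups, start = [], 0
    for reducer in range(num_reducers):
        size = base + (1 if reducer < extra else 0)
        groups.append(range(start, start + size))
        start += size
    return groups


# Stage 1 wrote 100 partitions, but only 10 GB of data showed up at runtime.
reducers = choose_reducer_count(10 * GB, planned_reducers=100, bytes_per_reducer=1 * GB)
groups = assign_partitions(100, reducers)
```

Because the map side already wrote 100 partitions, shrinking the reduce side does not require re-running Stage 1: each of the remaining reducers simply reads several partitions instead of one.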
  20. Tez – Execution Performance
     • Optimal resource management
       – Reuse YARN containers to launch new tasks.
       – Reuse YARN containers to enable shared objects across tasks.
     [Diagram: the Tez Application Master sends Start Task to a YARN container running TezTask1, receives Task Done, then sends Start Task for TezTask2 to the same container. Shared objects live in the TezTask host inside the YARN container.]
  21. Tez – Execution Performance
     • Dynamic physical data flow decisions
       – Decide the type of physical byte movement and storage on the fly.
       – Store intermediate data on a distributed store, on a local store or in memory.
       – Transfer bytes via blocking files or streaming, and the spectrum in between.
     [Diagram: by default the producer writes a local file that the consumer reads; at runtime, a producer with small output hands it to the consumer in memory instead.]
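A decision of this kind can be pictured as a small policy function evaluated once the producer's output size is known. The sketch below is purely illustrative: the `Store` enum, the `choose_store` function, its parameters and its thresholds are assumptions, and the slides do not describe the actual policy.

```python
from enum import Enum


class Store(Enum):
    IN_MEMORY = "in memory"
    LOCAL_FILE = "local store"
    DISTRIBUTED = "distributed store"


def choose_store(output_bytes, memory_budget_bytes, must_survive_node_loss=False):
    """Pick where a producer's intermediate output goes (hypothetical policy)."""
    if must_survive_node_loss:
        # Only a replicated store outlives the machine that produced the data.
        return Store.DISTRIBUTED
    if output_bytes <= memory_budget_bytes:
        # Small outputs skip the disk entirely.
        return Store.IN_MEMORY
    return Store.LOCAL_FILE


MB = 1024 ** 2
small = choose_store(4 * MB, memory_budget_bytes=64 * MB)
large = choose_store(512 * MB, memory_budget_bytes=64 * MB)
durable = choose_store(4 * MB, memory_budget_bytes=64 * MB, must_survive_node_loss=True)
```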
  22. Tez – Sessions
     • Key for interactive queries
     • Analogous to a database session: represents a connection between the user and the cluster
     • Run multiple DAGs / queries in the same session
     • Maintains a pool of reusable containers for low-latency execution of tasks within and across queries
     • Takes care of data locality and releasing resources when idle
     • Session caches in the Application Master and in the container pool reduce recomputation and re-initialization
     [Diagram: the client starts a session and submits DAGs to the Application Master, which contains a task scheduler and a container pool; pooled containers hold a pre-warmed JVM and a shared object registry.]
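The benefit of keeping containers alive across queries can be shown with a toy session that counts container launches. This is a minimal sketch, not the Tez client API: the `Session` class, `submit_dag`, `prewarm` and `width` are hypothetical names, and "running" a task is reduced to pairing it with a container.

```python
import itertools


class Session:
    """Toy stand-in for a session that keeps containers alive between DAGs."""

    def __init__(self, prewarm=0):
        self._ids = itertools.count(1)
        self.launched = 0  # how many containers were ever started
        self._idle = [self._launch() for _ in range(prewarm)]

    def _launch(self):
        self.launched += 1
        return f"container-{next(self._ids)}"

    def submit_dag(self, tasks, width):
        """Run tasks in waves of at most `width`, taking idle containers first."""
        placements = []
        for start in range(0, len(tasks), width):
            wave = tasks[start:start + width]
            held = [self._idle.pop() if self._idle else self._launch() for _ in wave]
            placements.extend(zip(wave, held))
            self._idle.extend(held)  # task done: container goes back to the pool
        return placements

    def close(self):
        """Release every pooled container and report how many were released."""
        released = len(self._idle)
        self._idle.clear()
        return released


session = Session(prewarm=2)
first = session.submit_dag(["t1", "t2", "t3"], width=2)
second = session.submit_dag(["q1", "q2"], width=2)
launches_after_two_dags = session.launched
released = session.close()
```

In this model two pre-warmed containers serve five tasks across two DAGs with no further launches, which mirrors the slide's claim that the pool lowers latency within and across queries.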
  23. Tez – Benchmark Performance
     Significant (but not all) speed-ups due to Tez:
     • DAG support and runtime graph reconfiguration enable utilizing the parallelism of the cluster
     • Tez Session and container re-use enable efficient and low-latency execution
  24. Tez – Performance Analysis
     [Figure: execution timeline of TPC-DS Query 27 with Hive on Tez, annotated as follows.]
     • Tez Session populates container pool
     • AM: dimension table calculation and HDFS split generation in parallel
     • Dimension tables broadcast to Hive MapJoin tasks
     • Final Reducer prelaunched and fetches completed inputs
  25. Tez – Current status
     • Apache Incubator Project
       – Rapid development. Over 600 JIRAs opened. Over 400 resolved.
       – Growing community of contributors and users.
     • Focus on stability
       – Testing and quality are highest priority.
       – Code ready and deployed on multi-node environments.
     • Support for a vast topology of DAGs
       – Already functionally equivalent to Map Reduce. Existing Map Reduce jobs can be executed on Tez with few or no changes.
       – Hive re-targeted to use Tez for execution of queries (HIVE-4660).
       – Work started on Pig to use Tez for execution of scripts (PIG-3446).
  26. Tez – Roadmap
     • Richer DAG support
       – Support for co-scheduling and streaming
       – Better fault tolerance with checkpoints
     • Performance optimizations
       – More efficiencies in transfer of data
       – Improve session performance
     • Usability
       – Stability and testability
       – Recovery and history
       – Tools for performance analysis and debugging
  27. Tez – Key Takeaways
     • Distributed execution framework that works on computations represented as dataflow graphs
     • Naturally maps to execution plans produced by query optimizers
     • Customizable execution architecture designed to enable dynamic performance optimizations at runtime
     • Works out of the box, with the platform figuring out the hard stuff
     • Spans the spectrum from interactive latency to batch
     • Open source Apache project: your use-cases and code are welcome
     • It works and is already being used by Hive and Pig
  28. Thank You!