Page1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
LLAP: long-lived execution in Hive
Sergey Shelukhin
LLAP: long-lived execution in Hive
• Stinger recap and even faster queries
• LLAP: overview
• Query fragment execution
• IO elevator and caching
• Performance
• Current status and future directions
• Query fragment API
Hive performance recap
• Stinger: An Open Roadmap to improve Apache Hive’s
performance 100x
• Delivered in 100% Apache Open Source
• Stinger.Next: Enterprise SQL at Hadoop Scale
• Launched in September 2014, phase 1 delivered in 2015
Timeline: Hive 0.10 (batch processing) → 100–150x query speedup from the vectorized SQL engine, Tez execution engine, ORC columnar format, and cost-based optimizer → Hive 0.14 (human interactive, ~5 seconds)
The road ahead to sub-second queries
• Startup costs are now a key bottleneck
• Example: JVM takes 100s of ms to start up
• Vectorized code can benefit from JIT optimization
• JIT optimizer needs (run)time to do its work
• Improved operator performance shifts the bottleneck to IO
• Reading data is serialized with data processing (no overlap)
• Reading from HDFS is relatively expensive
• Large machines provide opportunities for data sharing
• Both between parallel computation (sharing) and serial (caching)
LLAP: overview
What is LLAP?
• Hybrid execution with daemons in Hive
• Eliminates startup costs for tasks
• Allows the JIT optimizer to have time to optimize
• Multi-threaded execution of vectorized
operator pipelines
• Also allows sharing of metadata, map join tables, etc.
• Asynchronous IO elevator and caching
• Reduces IO cost and parallelizes IO and processing
• Can be spindle-aware; other IO optimizations
• Query fragment API
Diagram: a node runs a single LLAP process hosting a cache and multiple query fragments, reading data from HDFS.
What LLAP isn't
• Not a Hive execution engine (like Tez, MR, Spark…)
• Execution engines provide coordination and scheduling
• Some work (e.g. large shuffles) can still be scheduled in containers
• Not a storage layer
• Daemons are stateless and read (and cache) data from HDFS
• Does not supersede existing Hive
• Container-based execution still fully supported
Example execution: MR vs Tez vs Tez+LLAP
Diagram: the same query under three models.
• Map–Reduce: map tasks read HDFS, and intermediate results are written back to HDFS between stages.
• Tez: an optimized pipeline; intermediate results flow between tasks without HDFS round trips.
• Tez with LLAP: resident processes on the nodes serve fragments out of an in-memory columnar cache.
LLAP in your cluster
• LLAP daemons run on existing YARN
• Apache Slider is used for provisioning and recovery
• Easy to bring up, tear down, and share clusters
• Resource management via YARN delegation model (WIP)
• LLAP and containers dynamically balance resource usage (WIP)
Benefits unrelated to performance (WIP)
• Concurrent query execution and priority enforcement
• Access control, including column-level security
• ACID improvements
• Can be used externally via the API
• Will be usable e.g. by Spark, Pig, Cascading, …
Query fragment API
Query Fragment API - overview
• Hadoop RPC, protobuf are used to send fragments
• Fragments are "physical algebra": operators, metadata, input
sources and output channels
• Results are returned asynchronously via output channels
• Hive will produce fragments for LLAP as part of physical
optimization
• Other applications can compile their own physical algebra
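A hedged sketch of what a fragment submission might look like. The real LLAP API sends protobuf messages over Hadoop RPC; JSON stands in for the wire format here, and every field and class name below is illustrative, not the actual API.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical shape of a fragment submission (illustrative names only).
@dataclass
class Fragment:
    query_id: str
    operators: list       # the "physical algebra": operator tree, serialized
    input_sources: list   # e.g. HDFS splits to read
    output_channel: str   # where results are streamed back asynchronously

    def to_wire(self) -> bytes:
        return json.dumps(asdict(self)).encode("utf-8")  # protobuf stand-in

frag = Fragment(
    query_id="q1",
    operators=["Scan(store_sales)", "Filter(ss_quantity > 10)"],
    input_sources=["hdfs://nn/warehouse/store_sales/part-00000"],
    output_channel="channel-42",
)
decoded = json.loads(frag.to_wire())
print(decoded["query_id"])  # q1
```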
Query Fragment API – algebra
• Operators: Scan, Filter, Group By, Hash/Merge join, etc.
• Operators may include statistics for local optimization
• Expressions: comparison, arithmetic, Hive built-in functions
• All Hive datatypes
• Complex types like map/list/etc. – WIP
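To make "physical algebra" concrete, here is a minimal sketch of a Scan → Filter → Group By pipeline over in-memory rows. The operator names match the list above; the implementation is purely illustrative, not how LLAP evaluates operators.

```python
# Illustrative operator tree: Scan -> Filter -> GroupBy.
def scan(rows):
    yield from rows                      # stand-in for reading a table

def filt(child, pred):
    return (r for r in child if pred(r))  # row-at-a-time filter

def group_by(child, key, agg_col):
    out = {}                             # sum(agg_col) grouped by key
    for r in child:
        out[r[key]] = out.get(r[key], 0) + r[agg_col]
    return out

rows = [
    {"store": "a", "qty": 5},
    {"store": "a", "qty": 20},
    {"store": "b", "qty": 30},
]
plan = group_by(filt(scan(rows), lambda r: r["qty"] > 10), "store", "qty")
print(plan)  # {'a': 20, 'b': 30}
```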
Query Fragment API – client API
• Encapsulates creation, submission of query fragments
• Also helps with IO from LLAP
• Getting vectorized record readers, batches, etc.
• Working with output channels (cancellation, availability of records,
failure)
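A sketch of the output-channel side of such a client API, assuming a push model with end-of-stream and cancellation. All class and method names here are hypothetical, not the real client API.

```python
import queue
import threading

# Hypothetical output channel: the daemon pushes record batches
# asynchronously; the client iterates until end-of-stream or cancels.
class OutputChannel:
    _EOS = object()  # end-of-stream sentinel

    def __init__(self):
        self._q = queue.Queue()
        self.cancelled = False

    def push(self, batch):   # daemon side
        self._q.put(batch)

    def close(self):         # daemon side: signal end of stream
        self._q.put(self._EOS)

    def cancel(self):        # client side
        self.cancelled = True

    def batches(self):       # client side: blocks until records arrive
        while not self.cancelled:
            item = self._q.get()
            if item is self._EOS:
                return
            yield item

ch = OutputChannel()
t = threading.Thread(target=lambda: (ch.push([1, 2]), ch.push([3]), ch.close()))
t.start()
got = list(ch.batches())
t.join()
print(got)  # [[1, 2], [3]]
```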
Query execution
LLAP: Query Execution
• Overview of query execution
• Scheduling
• Coordination via Tez
• What fragments run in LLAP vs containers
• Future work
Tez + LLAP – overview
• Hive on Tez already proven to perform well
• Tez being enhanced to allow it to coordinate work to external
systems (TEZ-2003)
• Pluggable Scheduling
• Pluggable communication – custom execution specifications, protocols
• DAG coordination remains unchanged
• Hive Operators / Tez Runtime components used for Processing
and data transfer
Deciding on where query components run
• Fragments can run in LLAP, regular containers, AM (as threads)
• Decision made by the Hive Client
• Configurable – all in LLAP, none in LLAP, intelligent mix
• Criteria for running in LLAP (in auto mode)
• No user code (or only blessed user code)
• Data source – HDFS
• ORC and vectorized execution (for now)
• Others can still run in LLAP in "all" mode, w/o IO elevator and cache
• Data size limitations (avoid heavy / long running processing within LLAP)
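The auto-mode criteria above can be sketched as a placement function. The threshold, flag names, and field names are invented for illustration; they are not actual Hive configuration.

```python
# Illustrative placement decision for a fragment (made-up field names).
def runs_in_llap(fragment, mode="auto", max_input_bytes=1 << 30):
    if mode == "none":
        return False
    if mode == "all":
        return True  # runs in LLAP even without IO elevator/cache benefits
    # mode == "auto": be selective, per the criteria above
    return (not fragment["has_user_code"]      # no (unblessed) user code
            and fragment["source"] == "hdfs"   # data source is HDFS
            and fragment["format"] == "orc"    # ORC + vectorized for now
            and fragment["vectorized"]
            and fragment["input_bytes"] <= max_input_bytes)  # size limit

f = {"has_user_code": False, "source": "hdfs", "format": "orc",
     "vectorized": True, "input_bytes": 10 * 1024 * 1024}
print(runs_in_llap(f))          # True
print(runs_in_llap(f, "none"))  # False
```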
So…
Diagram: the same DAG under three setups, each coordinated by a Tez AM: plain Tez (all map and reduce tasks in containers), Tez with LLAP in "auto" mode (a mix of fragments in LLAP daemons and in containers), and Tez with LLAP in "all" mode (all fragments in LLAP).
Scheduling for LLAP in Tez AM
• Greedy scheduling per query – assumes entire cluster available
• Schedule work to preferred location (HDFS locality)
• Multiple independent queries set the same preferred location if accessing the
same data (improves cache locality)
• LLAP Daemons schedule fragments independently – across
multiple queries
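One way to get the "same preferred location for the same data" behavior is deterministic placement, e.g. hashing a split's identity to a daemon, so independent queries reading the same data land on the same node and hit its cache. This is only a sketch of the idea, not the actual scheduler.

```python
import hashlib

# Deterministic split -> daemon mapping (illustrative).
def preferred_daemon(split_path, daemons):
    h = int(hashlib.md5(split_path.encode()).hexdigest(), 16)
    return daemons[h % len(daemons)]

daemons = ["node1", "node2", "node3"]
# Two independent queries reading the same split agree on placement:
a = preferred_daemon("/warehouse/store_sales/part-0", daemons)
b = preferred_daemon("/warehouse/store_sales/part-0", daemons)
print(a == b)  # True
```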
Queuing fragments
• LLAP daemon has a number of executors
(think containers)
• Wait queue with pluggable priority
• Geared towards low latency queries (default)
• Models estimated work left in query
• Sequencing within a query handled via topological
order
• Fragment start time factors into scheduling decision
Diagram: four executors running fragments (Q1 Reducer 2, two copies of Q1 Map 1, Q3 Map 19), with more Q1 and Q3 fragments waiting in the queue.
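The wait-queue policy can be sketched as a priority heap where fragments from queries with less estimated work left sort first (favoring low-latency queries), with fragment start time as a tiebreaker. The scoring below is invented for illustration.

```python
import heapq

# Illustrative priority: (estimated work left in the query, start time).
def priority(frag):
    return (frag["work_left"], frag["start_time"])

wait_queue = []
for frag in [
    {"name": "Q1 Reducer 2", "work_left": 1,  "start_time": 10},
    {"name": "Q3 Map 19",    "work_left": 90, "start_time": 5},
    {"name": "Q1 Map 1",     "work_left": 1,  "start_time": 8},
]:
    heapq.heappush(wait_queue, (priority(frag), frag["name"]))

# The nearly-done query Q1 runs before the long-running Q3:
order = [heapq.heappop(wait_queue)[1] for _ in range(len(wait_queue))]
print(order)  # ['Q1 Map 1', 'Q1 Reducer 2', 'Q3 Map 19']
```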
LLAP Scheduling – pipelining and preemption
• A fragment can run when inputs are not yet available (for pipelining)
• A fragment is "finishable" if all the source data is ready
• If the data is not ready, it may never free the executor
• Non-finishable fragments can be preempted
• Improves throughput, prevents deadlocks
Diagram: interactive-query maps fill the executors while a wide query's reduce sits with only 10 of its 100 mappers done; the non-finishable reduce is preempted so the interactive maps can run.
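The preemption rule can be sketched as: when every executor is busy and a finishable fragment is waiting, evict a running non-finishable fragment back to the queue. Names and structure below are illustrative, not LLAP's actual scheduler code.

```python
# Illustrative admission with preemption of non-finishable fragments.
def admit(running, waiting_frag, num_executors):
    """Return (new running list, preempted victim or None)."""
    if len(running) < num_executors:
        return running + [waiting_frag], None      # free slot available
    if waiting_frag["finishable"]:
        for i, frag in enumerate(running):
            if not frag["finishable"]:             # evict a stalled fragment
                return running[:i] + running[i + 1:] + [waiting_frag], frag
    return running, None                           # keep waiting in queue

running = [
    {"name": "wide reduce", "finishable": False},  # only 10/100 mappers done
    {"name": "interactive map 1", "finishable": True},
]
running, victim = admit(
    running, {"name": "interactive map 2", "finishable": True}, num_executors=2)
print(victim["name"])  # wide reduce
```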
IO elevator and other internals
LLAP: IO elevator and other internals
• Asynchronous IO and decompression
• Off-heap data caching
• File metadata caching
• Map join table sharing
• Better JIT usage thanks to persistent daemons
Asynchronous IO
• Currently, Hive IO and input
decoding is interleaved
with processing
• Remote HDFS reads are
expensive
• Even local disk might be
• Data decompression and
decoding is expensive
Asynchronous IO
• With IO elevator, reading,
decoding and processing are
parallel
• IO threads can be spindle
aware (WIP)
• Depending on workload, IO
and processing threads can
balance resource usage
(throttle IO, etc.) (WIP)
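The elevator idea in miniature: a dedicated IO thread reads and decodes batches into a bounded queue while the executor thread consumes them, so IO and processing overlap instead of alternating; the bounded queue also gives natural backpressure for throttling. This is a sketch of the pattern, not the actual implementation.

```python
import queue
import threading

# IO side: stand-in for read + decompress + decode into columnar batches.
def io_elevator(splits, out, batch_size=2):
    for split in splits:
        out.put([f"{split}:row{i}" for i in range(batch_size)])
    out.put(None)  # end of stream

# Processing side: stand-in for the vectorized operator pipeline.
def executor(inp, results):
    while (batch := inp.get()) is not None:
        results.extend(batch)

pipe = queue.Queue(maxsize=4)  # bounded: backpressure can throttle IO
results = []
t = threading.Thread(target=io_elevator, args=(["s1", "s2"], pipe))
t.start()
executor(pipe, results)
t.join()
print(len(results))  # 4
```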
Caching and off-heap data
• Decompressed data is cached off-heap
• Simplifies memory management, mitigates some GC problems
• Saves HDFS and decompression costs, esp. on dimension tables
• In the future, processing cached data directly will be possible, avoiding copies
• Replacement policy is pluggable
• Currently, simple local policies are used e.g. FIFO, LRFU
• Other policies possible (e.g. workflow-adaptable, or lazily
coordinated for better cache affinity)
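A hedged sketch of an LRFU ("least recently/frequently used") policy like the one mentioned above: each cached buffer carries a combined recency/frequency score that decays exponentially with time, and the lowest-scoring buffer is evicted. The decay parameter and structure are illustrative only.

```python
LAMBDA = 0.5  # 0 -> pure LFU; larger values behave more like LRU

class LrfuCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.score = {}   # key -> (combined score, last-touch time)
        self.clock = 0

    def _crf(self, key):
        # Current score: past score decayed by elapsed time.
        crf, last = self.score[key]
        return crf * (0.5 ** (LAMBDA * (self.clock - last)))

    def touch(self, key):
        self.clock += 1
        old = self._crf(key) if key in self.score else 0.0
        if key not in self.score and len(self.score) >= self.capacity:
            victim = min(self.score, key=self._crf)  # evict lowest score
            del self.score[victim]
        self.score[key] = (old + 1.0, self.clock)

cache = LrfuCache(capacity=2)
for key in ["a", "a", "a", "b", "c"]:  # "a" is hot, "b" was touched once
    cache.touch(key)
print(sorted(cache.score))  # ['a', 'c']: the cold 'b' was evicted
```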
Cache size vs operator memory requirement
• Cache space takes away from operator space
• Sort buffers, hash join tables, GBY buffers take space
• Tradeoff between HDFS reads and operator speed
• Depends on workflow, dataset size, etc.
• New vectorization changes in Hive will speed up operators and
allow for larger cache
Other benefits
• File metadata and indexes are cached
• Much faster PPD application for selective queries – no HDFS reads
• Same replacement as data cache (but higher priority)
• Map join hash tables, fragment plans are shared
• Multiple tasks do not all generate the table or deserialize the plans
• Better use of JIT optimizer
• Because the daemons are persistent, JIT has more time to kick in
• Especially good with vectorization!
Performance
Setup
• 13 physical machines (12 cores, 40 GB RAM each)
• Note – smaller cluster than previous Tez perf runs
• TPCDS 200, interactive queries
• Both – ORC, vectorized, Hadoop 2.8, queries via HS2 w/JMeter
• TEZ: Hive 1.2 + Tez 0.8 (snapshot)
• Pre-warm and container reuse enabled
• LLAP: Branch in pre-alpha stage + Tez 0.8 (snapshot)
• Bias towards executors – small cache
• Otherwise no tuning
Summary
• NOTE - in early stage – pre-alpha-release perf results
• Still, interactive queries are already 1.5-4 times faster
• First query result after launching CLI significantly improved
• In real life, LLAP daemons would also already be warm
• Parallel queries are already better
• Lots of work still ahead – epic locks in Kryo, Log4j, HDFS, HiveServer2;
better object sharing, better priority enforcement
• Should be much faster in short order
Query execution time
Chart: per-query execution time in seconds (0–35) for 14 TPC-DS queries (query55, query42, query52, query3, query12, query27, query26, query7, query19, query96, query43, query15, query82, query13), Hive 1.2.0 vs Hive (LLAP).
Parallel query execution
• 8 users, 4 parallel executors on HS2
• Tez: 50% of serial time; LLAP alpha: 41% of serial time
Chart: total execution time in seconds (0–300) for the 13 queries, serial vs parallel, Hive 1.2.0 vs Hive (LLAP).
Current status and future directions
Current status
• Putting the finishing touches on the CTP (alpha release)
• Watch Hortonworks blog, and Apache Hive mailing lists, for details!
• The basic features are functional
• Currently only on Tez; IO only on vectorized and ORC
• AKA the fastest Hive setup possible
• Lots of performance improvement not yet realized
• Lots of advanced features are WIP or planned
Work in progress
• Further performance improvement
• Concurrent query execution improvements
• Better vectorized operators (join, group by, …)
• Defining the API
Future work
• Security, including column level security
• Tighter integration with YARN, e.g. resource delegation
• Guaranteed capacities for better SLAs, maybe with a central scheduler
• Dynamic daemon sizing with off-heap storage
• ACID support
• Better (maybe centrally coordinated) locality and caching
• Temp tables, intermediate query results in LLAP
• Interleaving of fragment execution
• Past processing is not lost (unlike with preemption)
• A rogue / badly scheduled query will not hog the system
Questions?
Interested? Stop by the Hortonworks booth to learn more