1© Cloudera, Inc. All rights reserved.
Faster Batch Processing with
Hive-on-Spark
Santosh Kumar | Cloudera
Rui Li | Intel
Agenda
• What is Hive-on-Spark?
• Using Hive-on-Spark
• Performance Metrics
• Configuration & Tuning
• What’s Next?
• Q&A
Apache Spark
Flexible, in-memory data processing for Hadoop
Easy Development
• Rich APIs for Scala, Java, and Python
• Interactive shell
Flexible Extensible API
• APIs for different types of workloads:
• Batch
• Streaming
• Machine Learning
• Graph
Fast Batch & Stream Processing
• In-Memory processing and caching
Spark Takes Advantage of Memory
• Resilient Distributed Datasets (RDD)
• In-memory data-structure partitioned across a set of machines
• Can fall back to disk when data-set does not fit in memory
• Created by parallel transformations on data in stable storage
• Provides fault-tolerance through concept of lineage
Introduction
• Enables Hive to use Spark as underlying execution engine
• Motivations
• Consolidation of Spark as execution engine
• Better performance
• Increased adoption of Hive (e.g. for Spark users)
• Community effort by Cloudera, IBM, Intel, MapR, and others
Choosing the Right SQL Engine
Know Your Audience, Know Your Use Case
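• Batch Processing
• BI and SQL Analytics
• Procedural Development
[Figure: engine logos matched to each use case; only the labels "SQL", "OR", and "Impala" survive in the extracted text]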
Current State of Hive-on-Spark (HoS)
• Fully supported production release in C5.7
• Functional parity with Hive-on-MapReduce (HoMR)
• Average 3x performance improvement vs HoMR
• Automatic configuration and optimizations via Cloudera Manager
• Strong early user base
• Early commitment for future collaboration from Intel and others
Design Principles
• Minimize impact on existing code path
• Minimizes functional and performance impact
• Minimizes maintenance
• Maximizes support for Hive features – current as well as future
• Spark invoked only at execution layer
• HoS produces a logical operator plan similar to HoMR's
• Logical plan runs on low-level Spark primitives
• Minimizes usage of advanced Spark primitives
Getting Started with Hive-on-Spark
Configuration
• Minimal configurations needed
• Via Cloudera Manager: Set “Spark on YARN Service” (internally sets
spark.master=yarn-cluster)
• Set hive.execution.engine=spark per service or query
• Only yarn-cluster is supported
• Cloudera Manager auto-configures most configurations
• Configuration & Tuning Guide available on Docs
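As a minimal sketch of the per-query setting above, the following Python helper prefixes a query with a SET statement for hive.execution.engine. The helper name with_engine and the sample query are hypothetical, and the accepted engines are limited to the two compared in these slides.

```python
# Hypothetical helper: prefixes a HiveQL query with a per-query engine override.
VALID_ENGINES = {"mr", "spark"}  # only the two engines compared in these slides


def with_engine(query, engine="spark"):
    """Return a script that sets hive.execution.engine, then runs the query."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    # Normalize the query so the script always ends with exactly one semicolon.
    body = query.strip().rstrip(";")
    return f"SET hive.execution.engine={engine};\n{body};"


script = with_engine("SELECT COUNT(*) FROM status_updates")
print(script)
```

The resulting text could be passed to a Hive client such as Beeline. A service-wide default would be set through Cloudera Manager instead.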
Performance
Avg. ~3X faster than Hive-on-MapReduce
More Suitable
• Complex workloads w/ multiple MR stages, e.g. filter followed by JOIN followed by GROUP BY
• Disk-bound workloads w/ multiple disk reads/writes
• Workloads requiring mins to hours for completion
Less Suitable
• Simple workloads, e.g. select *
• CPU-bound workloads, e.g. complex UDFs
• Workloads typically requiring <1 min
Query Execution: Background
Input
• status_updates(userid int, status string, ds string)
• profiles(userid int, school string, gender int)
Output
• school_summary(school string, cnt int, ds string)
• gender_summary(gender int, cnt int, ds string)
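The slides show this query only as operator diagrams, so the sketch below assumes its shape: status_updates is joined to profiles on userid for one day (ds), and the joined rows are then grouped twice, once by school and once by gender. The sample rows and the summarize function are made up for illustration. The point is that a single join feeds two aggregations.

```python
from collections import Counter

# Made-up sample rows, shaped like the input tables on the slide.
status_updates = [  # (userid, status, ds)
    (1, "hello", "2016-04-01"),
    (2, "lunch", "2016-04-01"),
    (2, "home", "2016-04-01"),
    (3, "bye", "2016-04-01"),
]
profiles = {  # userid -> (school, gender)
    1: ("MIT", 0),
    2: ("Stanford", 1),
    3: ("MIT", 1),
}


def summarize(updates, profiles, ds):
    """Join updates to profiles once, then aggregate the joined rows twice."""
    joined = [
        profiles[userid]
        for userid, _status, row_ds in updates
        if row_ds == ds and userid in profiles
    ]
    school_cnt = Counter(school for school, _gender in joined)
    gender_cnt = Counter(gender for _school, gender in joined)
    # Output rows follow the (key, cnt, ds) layout of the summary tables.
    school_summary = sorted((s, c, ds) for s, c in school_cnt.items())
    gender_summary = sorted((g, c, ds) for g, c in gender_cnt.items())
    return school_summary, gender_summary


school_summary, gender_summary = summarize(status_updates, profiles, "2016-04-01")
print(school_summary)
print(gender_summary)
```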
Query Execution: MapReduce
[Figure, shown across three slides: operator plan for the query on MapReduce]
FileSinkOperator (disk write) and TableScanOperator (disk read)
are very costly
Query Execution: Hive-on-Spark
Costly Steps Removed
[Figure, shown across two slides: operator plan for the same query on Hive-on-Spark]
Optimization for Resource Management:
Long-Lived Executors (LLE)
• MR: Each query is an independent YARN application
• Spark: Each SQL session is a long-lived YARN application
• First query of a session spawns a YARN app
• Subsequent queries re-use same YARN app as well as containers
• Session disconnect shuts down YARN app and releases container resources
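A toy cost model may help show why session-scoped applications matter. It only counts YARN application start-up time, and both the 20-second start-up cost and the 10-query session are made-up parameters, not measurements from the talk.

```python
def startup_overhead_s(num_queries, app_startup_s, reuse_app):
    """Total YARN application start-up time paid over one SQL session.

    reuse_app=False models MR (one application per query);
    reuse_app=True models HoS (one application per session).
    """
    if num_queries < 0 or app_startup_s < 0:
        raise ValueError("inputs must be non-negative")
    launches = min(num_queries, 1) if reuse_app else num_queries
    return launches * app_startup_s


# Illustrative numbers only: 10 queries, 20 s to start each application.
mr_overhead = startup_overhead_s(10, 20.0, reuse_app=False)
hos_overhead = startup_overhead_s(10, 20.0, reuse_app=True)
print(mr_overhead, hos_overhead)
```

Under these assumptions the start-up cost is paid once per session rather than once per query, which is why the benefit grows with the number of queries in a session.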
Long-Lived Executors Details
• Each Hive user session submits a Spark application to YARN
• Spark YARN Application:
• YARN containers = Spark executors (executors live in YARN containers)
• YARN Application Master = RemoteDriver
• Submits Spark ‘jobs’, aka Hive queries, to Spark executors
• Connects back to HS2 to report job progress from Spark executors
[Figure: User1 and User2 each open a session (Session1, Session2) on HiveServer2; each session has its own application in the YARN cluster, with an AM (RemoteDriver1, RemoteDriver2) and containers (executors)]
Configuration and Tuning
Hive-on-Spark
Spark Configuration
• Size of executors
• Bigger and fewer executors
• Thread contention
• GC pressure
• Smaller and more executors
• Less memory efficient
• Bigger start-up overhead
Spark Configuration
• CPU
• Around 5-7 cores per executor
• Memory
• Leave 10% for OS cache
• Executor memory overhead
• Tune case by case
• Can be heavily used by Netty
• Usually 15% - 20%
• Around 3GB per core
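The rules of thumb above can be turned into rough arithmetic. This is a sketch under stated assumptions: the function size_executors is hypothetical, the overhead percentage is applied to each executor's share of node memory (the slide does not say what the percentage is relative to), and the 24-core, 96 GB node is an invented example.

```python
def size_executors(node_cores, node_mem_gb, cores_per_executor=6,
                   os_reserve=0.10, overhead_fraction=0.15):
    """Rough per-node executor sizing from the slide's rules of thumb."""
    if cores_per_executor <= 0 or node_cores < cores_per_executor:
        raise ValueError("node has fewer cores than one executor needs")
    executors = node_cores // cores_per_executor
    usable_gb = node_mem_gb * (1.0 - os_reserve)   # leave ~10% for the OS cache
    container_gb = usable_gb / executors           # memory share per executor
    overhead_gb = container_gb * overhead_fraction # assumed: fraction of that share
    heap_gb = container_gb - overhead_gb
    return {
        "executors_per_node": executors,
        "executor_cores": cores_per_executor,
        "executor_memory_gb": round(heap_gb, 2),
        "memory_overhead_gb": round(overhead_gb, 2),
        "heap_gb_per_core": round(heap_gb / cores_per_executor, 2),
    }


plan = size_executors(node_cores=24, node_mem_gb=96)
for key, value in plan.items():
    print(key, value)
```

In this example the heap works out to roughly 3 GB per core, in line with the guideline. The two memory figures would map to spark.executor.memory and spark.yarn.executor.memoryOverhead after unit conversion.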
Spark Configuration
• Serialization
• spark.serializer – Kryo performs better and is REQUIRED by HoS
• spark.kryo.referenceTracking – disable to avoid a Java performance issue
• Shuffle
• spark.shuffle.compress
• spark.shuffle.spill.compress
• Trade CPU for I/O
• Increase number of reducers
Partitioning
• Number of mappers
• Inputformat
• mapreduce.input.fileinputformat.split.maxsize
• Number of reducers
• hive.exec.reducers.bytes.per.reducer
• mapreduce.job.reduces
• HoS tends to launch more reducers
• Merge small files
• hive.merge.sparkfiles
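To illustrate how the reducer settings interact, here is a simplified sketch of a size-based reducer estimate in the style Hive uses for MapReduce. It is not the HoS implementation, which the slide notes tends to launch more reducers. The max_reducers cap stands in for hive.exec.reducers.max, a property not named on the slide, and the value 1009 is chosen for illustration.

```python
def estimate_reducers(input_bytes, bytes_per_reducer,
                      max_reducers=1009, fixed_reducers=-1):
    """Simplified size-based reducer estimate.

    bytes_per_reducer ~ hive.exec.reducers.bytes.per.reducer
    fixed_reducers    ~ mapreduce.job.reduces (a positive value overrides the estimate)
    max_reducers      ~ hive.exec.reducers.max (assumed cap, not on the slide)
    """
    if fixed_reducers > 0:
        return fixed_reducers
    if bytes_per_reducer <= 0:
        raise ValueError("bytes_per_reducer must be positive")
    estimate = -(-input_bytes // bytes_per_reducer)  # integer ceiling division
    return max(1, min(max_reducers, estimate))


GIB = 1024 ** 3
MIB = 1024 ** 2
print(estimate_reducers(10 * GIB, 256 * MIB))
print(estimate_reducers(10 * GIB, 64 * MIB))
print(estimate_reducers(10 * GIB, 256 * MIB, fixed_reducers=8))
```

Lowering bytes_per_reducer is the lever that raises the reducer count for a given input size.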
Hive Configuration
• General optimizations
• Enable vectorization
• Enable CBO
• Automatic map join conversion
• Map side aggregation
• Etc.
Hive Configuration
• Map join
• hive.auto.convert.join.noconditionaltask.size
• HoS doesn’t support conditional map join yet
• HoS uses raw data size as small table size – different from MR
• hive.stats.collect.rawdatasize
• Skew join
• Compile time – same as MR
• Runtime - HoS will split the original task at join
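The difference in size semantics can change whether a join is converted. The sketch below is a simplification: it assumes MR compares on-disk size (the totalSize table statistic) while HoS compares rawDataSize, as the slide describes, and it uses invented statistics for a compressed table. The function name and the 10 MB threshold are illustrative.

```python
def qualifies_for_map_join(small_table_sizes, threshold_bytes):
    """True when the combined size of the small tables is under the threshold.

    threshold_bytes ~ hive.auto.convert.join.noconditionaltask.size
    """
    return sum(small_table_sizes) < threshold_bytes


# Invented statistics for one compressed small table.
table_stats = {"totalSize": 8_000_000, "rawDataSize": 40_000_000}
threshold = 10_000_000  # illustrative threshold, in bytes

mr_converts = qualifies_for_map_join([table_stats["totalSize"]], threshold)
hos_converts = qualifies_for_map_join([table_stats["rawDataSize"]], threshold)
print("MR converts:", mr_converts)
print("HoS converts:", hos_converts)
```

With these numbers the same table passes the check under the on-disk size but not under the raw data size, so a threshold tuned for MR may need to be raised for HoS.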
Resource Allocation
• Static allocation
• spark.executor.instances
• Won’t release until session is closed
• Recommended for benchmarking
• Dynamic allocation
• spark.dynamicAllocation.enabled
• spark.dynamicAllocation.initialExecutors
• spark.dynamicAllocation.minExecutors
• spark.dynamicAllocation.maxExecutors
• Number of executors per Spark application scales up and down
• Suited for multi-tenancy scenarios (multi-session)
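A small sketch of assembling the dynamic allocation settings follows. Property names are the ones documented by Spark (spark.dynamicAllocation.*). The function name, the ordering check on the three executor counts, and the sample values are assumptions made for this sketch. The spark.shuffle.service.enabled entry is not on the slide; it is included because dynamic allocation on YARN relies on the external shuffle service.

```python
def dynamic_allocation_conf(min_executors, initial_executors, max_executors):
    """Build Spark settings for dynamic executor allocation."""
    # Sanity check chosen for this sketch, not copied from Spark's own validation.
    if not 0 <= min_executors <= initial_executors <= max_executors:
        raise ValueError("expected 0 <= min <= initial <= max")
    if max_executors == 0:
        raise ValueError("max_executors must be positive")
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Not on the slide: needed so executors can be removed without losing shuffle files.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.initialExecutors": str(initial_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


conf = dynamic_allocation_conf(min_executors=1, initial_executors=2, max_executors=20)
for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```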
Resource Allocation
• Pre-warm containers
• hive.prewarm.enabled
• spark.scheduler.maxRegisteredResourcesWaitingTime
• spark.scheduler.minRegisteredResourcesRatio
• Attempts to achieve better parallelism
• Considerable delay for the start-up job
• Not recommended for short-lived sessions
Configuration and Tuning Summary
• Number and size of executors are the most important determinants of
performance
• Resolve query performance/failures by allocating more executors with
more CPU and RAM
• spark.executor.instances, spark.executor.cores, spark.executor.memory,
spark.yarn.executor.memoryOverhead
• Cloudera Manager takes care of most of the optimizations
• Most Hive config settings are applicable to HoS, but a few have different
semantics
• See Config and Tuning Guide for details
Roadmap
• Additional Optimizations
• Dynamic Partition Pruning
• Vectorization support
• Cost-Based Optimizer
• Others – Caching RDDs across queries, Optimize self join/union etc.
• Supportability Enhancements
• Better support for debugging and logging
• More informative stage description in WebUI
• Others: Improve Hue integration, additional metrics specific to HoS etc.
• Rebase to Spark 2.0 and Parquet 1.8
More Information & Next Steps
Get Started
• Download C5.7: www.cloudera.com/downloads
Release Notes
• www.cloudera.com/documentation/enterprise/latest/topics/rg_release_notes.html
Training Classes
• university.cloudera.com
Questions?

Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
