Admission Control in Impala
Matthew Jacobs | @mattjacobs | mj@cloudera.com

What’s the Problem
• Too many concurrent queries -> oversubscription
• All queries take more time
• Application layer can throttle queries?
• Not much you can do before Impala 1.3

So what do we do?
• Add an admission control mechanism to Impala!
• Throttle incoming requests
• Queue requests when the workload increases
• Execute queued requests when resources become available

What about Yarn?
• Yarn is a resource manager for Hadoop
• Assumes jobs are composed of tasks, and tasks can be restarted
• Must ask for all resources up front; resources “trickle in”
• Non-trivial overhead: each job creates an “application master” (AM)
  • But the cost is small compared to long batch jobs
• Great for MapReduce and MapReduce-like workloads
• Not a good fit for:
  • Low-latency, high-volume workloads
  • Gang scheduling, where “parts of jobs” can’t be restarted

Llama Bridges the Gap
• “Long Lived Application Master”
• Long-running AMs
• Creates fake requests to acquire the necessary resources
• Provides a “gang scheduling” abstraction: waits for all resources
• Offers a resource expansion mechanism -> no need to ask for everything up front
• Offers a throttling mechanism
• Caches Yarn containers -> lower latency
• Looks like a square peg in a round hole…
• To be fair, multi-level scheduling is a hard problem!

Impala + Llama + Yarn?
• Good for Impala sharing resources with other frameworks
• Good general-purpose resource management solution
However:
• Not everyone wants or needs to run Yarn and Llama
• Still requires round-trips to a central server
  • Increases query latency
  • Unlikely to scale to the most demanding latency/throughput requirements
• Impala should have a fast, built-in throttling mechanism

Impala Admission Control
• Throttles the number of concurrent requests or the memory in use
• Fast
• Decentralized
• Works without Yarn/Llama
• Works with CDH4/CDH5

Design Overview
• Configure one or more resource “pools”
  • Max # concurrent queries, max memory, max queue size
• Each impalad is capable of making admission decisions
  • No new single bottleneck or single point of failure
• Incoming queries are executed, queued, or rejected (see the sketch below)
  • Queued if too many queries are running OR there isn’t enough memory
  • Rejected if the queue is full
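
A minimal sketch of that decision flow, in Python with hypothetical names (PoolConfig, PoolStats); the real logic lives in Impala’s C++ admission controller, not in this form:

    from collections import namedtuple

    # Hypothetical containers for the per-pool limits and stats described above.
    PoolConfig = namedtuple("PoolConfig", "max_requests max_queued mem_limit")
    PoolStats = namedtuple("PoolStats", "num_running num_queued mem_estimate")

    def admission_decision(cfg, stats, query_mem_estimate):
        # Too many running queries, or not enough memory left -> try to queue.
        over_requests = stats.num_running >= cfg.max_requests
        over_memory = (cfg.mem_limit > 0 and
                       stats.mem_estimate + query_mem_estimate > cfg.mem_limit)
        if over_requests or over_memory:
            # Queue if the queue has room; otherwise reject.
            return "QUEUE" if stats.num_queued < cfg.max_queued else "REJECT"
        return "ADMIT"

    # A full pool queues the query (values match the log excerpt later in the deck):
    print(admission_decision(PoolConfig(20, 50, -1), PoolStats(20, 7, 800), 42))  # QUEUE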

Localized Admission Decisions
• Requests are admitted or queued locally
• Each impalad keeps track of local state
  • # queries, pool memory, local queue size
• Disseminates local stats via the statestore -> global state
• Uses the cached global state in admission decisions (aggregated roughly as sketched below)
• Decisions are fast; negligible impact on query latency
• No single point of failure
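
A rough sketch of how the cached global view can be formed, assuming each impalad publishes per-pool stats through the statestore (names and numbers are illustrative, not Impala’s actual topic format):

    # Cached statestore entries: per-impalad stats for one pool (illustrative).
    local_stats = {
        "impalad-1": {"num_running": 5, "num_queued": 0, "mem_estimate_mb": 200},
        "impalad-2": {"num_running": 3, "num_queued": 2, "mem_estimate_mb": 120},
    }

    # Each impalad sums the cached per-host entries into a (possibly stale)
    # global view and makes its local admission decision against the totals.
    def aggregate(stats_by_host):
        totals = {"num_running": 0, "num_queued": 0, "mem_estimate_mb": 0}
        for host in stats_by_host.values():
            for key in totals:
                totals[key] += host[key]
        return totals

    print(aggregate(local_stats))  # {'num_running': 8, 'num_queued': 2, 'mem_estimate_mb': 320}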

Localized Admission Decisions (II)
• Using cached global state -> may “over-admit”
  • E.g. multiple impalads each think one more request can be admitted, and admit it before receiving updated state
• Configured pool limits are therefore “soft” limits
  • The amount of over-admission is a function of the submission rate and its distribution across impalads
• Not a big problem in practice
  • May occasionally admit a few extra queries
  • Can increase the statestore heartbeat frequency
  • Can leave some buffer in the configured pool limits

Pool Limits
• Max memory
  • Many workloads are limited by memory
  • Impalads kill queries when they run out of memory anyway
• Max number of concurrent queries
  • A generic mechanism, not tied to a specific resource (e.g. memory)
  • Not as good a fit if the workload is heterogeneous
  • Queries may still be killed if impalads run out of memory

Memory Limits
• Impalads track memory hierarchically
  • Per-process memory
    • Queries killed when the limit is hit
  • Per-pool memory
    • For admission control
  • Per-query memory
[Diagram: the memory tracker hierarchy — a Process tracker at the root, pool trackers (Pool1, Pool2) below it, and query trackers (Query1, Query2) below their pool]
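
A toy Python sketch of that hierarchy (Impala’s real implementation is a C++ memory-tracker tree; the class and method names here are made up):

    # Consumption reported at a query tracker propagates up through its
    # pool tracker to the process tracker, so each level sees its total.
    class Tracker:
        def __init__(self, name, parent=None):
            self.name, self.parent, self.consumed = name, parent, 0

        def consume(self, nbytes):
            node = self
            while node is not None:
                node.consumed += nbytes
                node = node.parent

    process = Tracker("process")
    pool1 = Tracker("pool1", parent=process)
    query1 = Tracker("query1", parent=pool1)
    query1.consume(42 * 1024 * 1024)  # 42 MB is counted at query, pool, and process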

Memory Limits (II)
• Admission decisions need more than current memory usage
  • Incoming queries use no memory yet
  • Recently admitted queries haven’t ramped up yet
• Use memory estimates from planning
  • Estimate pool memory usage from actual usage & the estimates
  • Accounts for the future memory usage of recently started queries
• Admit if:
  pool mem estimate + query mem estimate < pool limit
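
For instance, plugging in the numbers from the log excerpt later in the deck (an 800 MB pool estimate and a 42 MB query estimate) against an assumed 2 GB pool limit:

    pool_mem_estimate = 800   # MB: running + recently admitted queries
    query_mem_estimate = 42   # MB: planner estimate for the incoming query
    pool_limit = 2048         # MB: assumed here for illustration

    # Admit if: pool mem estimate + query mem estimate < pool limit
    print(pool_mem_estimate + query_mem_estimate < pool_limit)  # True: 842 < 2048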

Memory Limits (III)
• Not perfect: query memory estimates are wrong
  • A hard problem; planning will never produce perfect estimates
• Usually overly conservative
  • Leads to underutilization
  • But at least queries won’t be killed
    • Less likely to hit the process memory limit
• Workarounds
  • Increase the pool memory limit
  • Override the estimate with the “MEM_LIMIT” query option (example below)
• Future improvement: update estimates as the query executes
  • Query memory usage will approach the updated estimate
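
For example, in impala-shell (the value is illustrative, and ‘sales’ is a hypothetical table):

    -- Override the planner's memory estimate for this session's queries.
    SET MEM_LIMIT=2000000000;  -- ~2 GB, in bytes
    SELECT count(*) FROM sales;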

Request Pools
• Modeled after Yarn resource queues
• Same configuration as Yarn queues
  • A single configuration for both Yarn & Impala
  • You usually want the same resource allocations mapped to an organization
    • E.g. HR gets 10%, Finance gets 30%, Eng gets 60%

Request Pools (II)
• Users are mapped to pools using the placement policy
• Users are authorized using the specified ACLs
• Pools are defined hierarchically
  • ACLs are inherited
  • Currently only enforces limits on leaf pools (IMPALA-905)

Request Pool General Configuration
• Uses the Yarn + Llama configs
  • Yarn fair scheduler allocation configuration (fair-scheduler.xml)
  • Llama configuration (llama-site.xml)
  • Only some of the configuration properties are used
• See the documentation for sample config files (a sketch follows below)
• Cloudera Manager has a nice UI for this configuration
  • No need to touch the XML files
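
A sketch of the relevant pieces, modeled on the documented formats; the pool name and all values are illustrative, not from the deck:

    <!-- fair-scheduler.xml: pool definitions, memory limits, and ACLs -->
    <allocations>
      <queue name="root">
        <queue name="default">
          <maxResources>40000 mb, 0 vcores</maxResources>
          <aclSubmitApps>*</aclSubmitApps>
        </queue>
      </queue>
    </allocations>

    <!-- llama-site.xml: per-pool throttling limits read by admission control -->
    <property>
      <name>llama.am.throttling.maximum.placed.reservations.root.default</name>
      <value>20</value>
    </property>
    <property>
      <name>llama.am.throttling.maximum.queued.reservations.root.default</name>
      <value>50</value>
    </property>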
Placement Rule Configuration
Please change the default values
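
Placement rules use the Yarn fair scheduler’s queuePlacementPolicy element; a sketch with an illustrative rule set:

    <!-- In fair-scheduler.xml: first honor an explicitly requested pool,
         then try the user's primary group, then fall back to the default pool. -->
    <queuePlacementPolicy>
      <rule name="specified"/>
      <rule name="primaryGroup" create="false"/>
      <rule name="default"/>
    </queuePlacementPolicy>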

Easy Config Path (Singleton Pool Only)
• If only one pool is needed, a separate (easy) configuration path exists
• Uses a single “default” pool
• No Yarn/Llama configs involved (not even accepted)
• Configure the pool limits with impalad flags (example below):
  • default_pool_max_queued
  • default_pool_max_requests
  • default_pool_mem_limit
• Doesn’t work with CM5.0; fixed in CM5.0.1
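
For example, starting impalad with the three flags (the values are illustrative; see the documentation for the accepted formats):

    impalad \
      --default_pool_max_requests=20 \
      --default_pool_max_queued=50 \
      --default_pool_mem_limit=40g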
Submitting to a Pool
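
A query is routed to a pool by the placement policy, or the user can name one explicitly with the REQUEST_POOL query option; for example, in impala-shell (pool and table names are illustrative):

    -- Route this session's queries to the root.test pool.
    SET REQUEST_POOL=root.test;
    SELECT count(*) FROM sales;  -- admitted under root.test's limits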

“Debugging” Admission Control Decisions
• Rejections and timeouts return error messages
• Metrics
  • Exposed in the impalad web UI: /metrics (example request below)
  • Will be available in CM5.1
• Query profile includes the admission result
• Impalad logs have lots of useful information, e.g.:

admission-controller.cc:259] Schedule for id=c541aae43af74ed1:afdec812127f8097 in pool_name=root.test/admin PoolConfig(max_requests=20 max_queued=50 mem_limit=-1.00 B) query cluster_mem_estimate=42.00 MB
admission-controller.cc:265] Stats: pool=root.test/admin Total(num_running=20, num_queued=7, mem_usage=239.07 MB, mem_estimate=800.00 MB) Local(num_running=20, num_queued=7, mem_usage=239.07 MB, mem_estimate=800.00 MB)
admission-controller.cc:303] Queuing, query id=c541aae43af74ed1:afdec812127f8097
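
The same metrics can be fetched from the impalad debug web server; this assumes the default web UI port of 25000 and a hypothetical hostname:

    # Dump all impalad metrics, including the admission controller's,
    # from the debug web server (default port 25000).
    curl http://impalad-host:25000/metrics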

Metrics
[Screenshot: admission control metrics in the impalad web UI]

Query Profile Information
[Screenshot: the admission result in a query profile]

Some Notes
• Queue timeout
  • Defaults to 60 seconds; change with --queue_wait_timeout_ms (example below)
• Running with Yarn/Llama
  • Same configs; “hard limits” enforced by Yarn+Llama
• Disabled by default for CDH4
  • Hue (<CDH4.6) doesn’t close queries
  • Enable with an impalad flag (see --disable_admission_control)
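
For example, to let queries wait in the queue for five minutes instead of one (the value is illustrative):

    impalad --queue_wait_timeout_ms=300000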
Matthew Jacobs
@mattjacobs
mj@cloudera.com
