Admission Control in Impala
May 21, 2014
Transcript of "Admission Control in Impala"
1. Admission Control in Impala
Matthew Jacobs | @mattjacobs | firstname.lastname@example.org
2. What’s the Problem
©2014 Cloudera, Inc. All rights reserved.
• Too many concurrent queries -> oversubscription
• All queries take more time
• Application layer can throttle queries?
• Not much you can do before Impala 1.3
3. So what do we do?
• Add an admission control mechanism to Impala!
• Throttle incoming requests
• Queues requests when workload increases
• Queued requests executed when resources available
4. What about Yarn?
• Yarn is a resource manager for Hadoop
• Assumes jobs are composed of tasks, tasks can be restarted
• Need to ask for all resources up front, resources “trickle in”
• Non-trivial overhead: job creates “application master” (AM)
  • But cost is small compared to long batch jobs
• Great for MR, things like MR
• Not good for
  • Low-latency, high volume workloads
  • Gang scheduling, “parts of jobs” can’t be restarted
5. Llama Bridges the Gap
• “Long Lived Application Master”
• Long running AMs
• Create fake requests to acquire necessary resources
• Provides a “gang scheduling” abstraction, waits for all resources
• Offers a resource expansion mechanism -> don’t need to ask for all up front
• Offers a throttling mechanism
• Caches Yarn containers -> lower latency
• Looks like a square peg in a round hole…
• To be fair, multi-level scheduling is a hard problem!
6. Impala + Llama + Yarn?
• Good for Impala sharing resources with other frameworks
• Good general purpose resource mgmt solution
However:
• Not everyone wants/needs to run Yarn and Llama
• Still requires round-trips to a central server
  • Increases query latency
  • Unlikely to scale for the most demanding latency/throughput requirements
• Impala should have a fast, built-in throttling mechanism
7. Impala Admission Control
• Throttle number of concurrent requests or memory
• Fast
• Decentralized
• Works without Yarn/Llama
• Works with CDH4/CDH5
8. Design Overview
• Configure one or more resource “pools”
  • Max # concurrent queries, max memory, max queue size
• Each Impalad capable of making admission decisions
  • No new single bottleneck/single point of failure
• Incoming queries are executed, queued, or rejected
  • Queue if too many queries OR not enough memory
  • Reject if queue is full
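The execute/queue/reject rule on this slide can be sketched in a few lines of Python. This is an illustrative model only, not Impala's implementation: `PoolConfig`, `PoolStats`, and `decide` are hypothetical names, and treating a negative `mem_limit` as "no limit" is an assumption. The memory test follows the "Admit if" rule given later in the deck.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ADMIT = "admit"
    QUEUE = "queue"
    REJECT = "reject"


@dataclass
class PoolConfig:
    max_requests: int  # max number of concurrent queries
    max_queued: int    # max queue size
    mem_limit: int     # max pool memory in bytes; negative = no limit (assumption)


@dataclass
class PoolStats:
    num_running: int
    num_queued: int
    mem_estimate: int  # estimated pool memory usage in bytes


def decide(cfg: PoolConfig, stats: PoolStats, query_mem_estimate: int) -> Decision:
    """Hypothetical admission rule: admit, else queue, else reject."""
    too_many_queries = stats.num_running >= cfg.max_requests
    not_enough_memory = (
        cfg.mem_limit >= 0
        and stats.mem_estimate + query_mem_estimate >= cfg.mem_limit
    )
    if not (too_many_queries or not_enough_memory):
        return Decision.ADMIT
    # The query has to wait; reject it only if the queue is already full.
    if stats.num_queued >= cfg.max_queued:
        return Decision.REJECT
    return Decision.QUEUE
```

For example, a pool configured with `max_requests=20` and `max_queued=50` that already has 20 running and 7 queued queries would queue the next request under this model.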
9. Localized Admission Decisions
• Requests admitted or queued locally
• Each Impalad keeps track of local state
  • # queries, pool memory, local queue size
• Disseminates local stats via statestore -> global state
• Uses cached global state in admission decisions
• Decisions are fast; negligible impact on query latency
• No single point of failure
10. Localized Admission Decisions (II)
• Using cached global state -> may “over-admit”
  • E.g. multiple impalads think 1 request can be admitted and admit before receiving updated state
• Configured pool limits are “soft” limits
  • Fn(Submission rate, distribution across impalads)
• Not a big problem in practice
  • May occasionally admit a few extra queries
  • Can increase statestore heartbeat frequency
  • Can add some buffer to configured pool limits
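A toy simulation makes the over-admission case concrete. It is a sketch under stated assumptions: the `Impalad` class, its fields, and `statestore_update` are hypothetical stand-ins for the real daemons and statestore, and only the concurrent-query limit is modeled.

```python
class Impalad:
    """Toy model of one daemon making local admission decisions."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.local_running = 0          # queries admitted by this daemon
        self.cached_remote_running = 0  # last known count from the other daemons

    def try_admit(self) -> bool:
        # Decide using local state plus the (possibly stale) cached global state.
        if self.local_running + self.cached_remote_running < self.max_requests:
            self.local_running += 1
            return True
        return False


def statestore_update(impalads: list) -> None:
    """Stand-in for a statestore heartbeat: refresh every daemon's cached view."""
    total = sum(d.local_running for d in impalads)
    for d in impalads:
        d.cached_remote_running = total - d.local_running


a, b = Impalad(max_requests=1), Impalad(max_requests=1)
admitted = [a.try_admit(), b.try_admit()]  # both decide on stale state
running_before_update = a.local_running + b.local_running
statestore_update([a, b])
admitted_after_update = a.try_admit()      # now sees the other daemon's query
```

With a pool limit of 1, both daemons admit a query before the update arrives, so two queries run against a limit of one. Once the cached state is refreshed, further admissions are refused, which is why the limit behaves as a soft one.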
11. Pool Limits
• Max memory
  • Many workloads are limited by memory
  • Impalads kill queries when running out of memory, anyway
• Max number of concurrent queries
  • Generic mechanism, not resource specific (e.g. memory)
  • Not as good if workload is heterogeneous
  • Queries may still be killed if impalads run out of memory
12. Memory Limits
• Impalads track memory hierarchically
  • Per-process memory
    • Queries killed when limit is hit
  • Per-pool memory
    • For admission control
  • Per-query memory
[Diagram: memory hierarchy with Process at the top, Pool1 (Query1, Query2) and Pool2 beneath it]
13. Memory Limits (II)
• Admission decisions need more than memory usage
  • Incoming queries use no memory yet
  • Queries recently admitted haven’t ramped up yet
• Use memory estimates from planning
• Estimate pool memory usage with actual usage & estimates
  • Accounts for future memory usage of recently started queries
Admit if: Pool mem estimate + query mem estimate < pool limit
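The "Admit if" rule can be sketched as follows. The slide does not say how actual usage and planning estimates are combined; taking the larger of the two for each running query is an assumption made here for illustration, and `pool_mem_estimate` and `can_admit` are hypothetical names.

```python
def pool_mem_estimate(running):
    """Estimate pool memory from (actual_usage, planning_estimate) pairs.

    Assumption: each running query counts as max(actual, estimate), so a
    recently admitted query that has not ramped up still reserves its estimate.
    """
    return sum(max(actual, estimate) for actual, estimate in running)


def can_admit(running, query_mem_estimate, pool_limit):
    """Admit if: pool mem estimate + query mem estimate < pool limit."""
    return pool_mem_estimate(running) + query_mem_estimate < pool_limit


# Values in MB. The first query was admitted recently and has barely ramped up.
running = [(100, 400), (500, 300)]
pool_estimate = pool_mem_estimate(running)
small_query_ok = can_admit(running, 50, 1000)
large_query_ok = can_admit(running, 100, 1000)
```

Under this assumption the pool estimate is 900 MB even though actual usage is only 600 MB, so a 50 MB query fits under a 1000 MB limit while a 100 MB query does not.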
14. Memory Limits (III)
• Not perfect, query mem estimates are wrong
  • Hard problem; never have perfect estimates from planning
• Usually overly conservative
  • Leads to underutilization
  • But at least queries won’t be killed
  • Less likely to hit process mem limit
• Workarounds
  • Increase pool mem limit
  • Override with “MEM_LIMIT” query option
• Future improvement: Update estimates as query executes
  • Query mem usage will approach the updated estimate
15. Request Pools
• Modeled after Yarn resource queues
• Same configuration as Yarn queues
  • Have a single configuration for Yarn & Impala
• Usually want to have the same resource allocations mapped to an organization
  • E.g. HR gets 10%, Finance gets 30%, Eng gets 60%
16. Request Pools (II)
• Users are mapped to pools using the placement policy
• Users are authorized using the specified ACLs
• Pools are defined hierarchically
  • ACLs are inherited
  • Currently only enforces limits on leaf pools (IMPALA-905)
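A small model of hierarchical pools with inherited ACLs might look like the sketch below. The `Pool` class and its methods are hypothetical, the pool and user names are made up, and the inheritance rule (a user listed on any ancestor is authorized for the descendants) is an assumption about what "inherited" means here.

```python
from typing import Optional


class Pool:
    """Hypothetical request pool in a hierarchy, e.g. root -> root.eng."""

    def __init__(self, name: str, parent: Optional["Pool"] = None, acl=()) -> None:
        self.name = name
        self.parent = parent
        self.acl = set(acl)  # users explicitly allowed on this pool
        self.children = []
        if parent is not None:
            parent.children.append(self)

    @property
    def full_name(self) -> str:
        if self.parent is None:
            return self.name
        return f"{self.parent.full_name}.{self.name}"

    @property
    def is_leaf(self) -> bool:
        # Per the slide, limits are currently enforced only on leaf pools.
        return not self.children

    def is_authorized(self, user: str) -> bool:
        # Walk up the hierarchy: ACLs on ancestors apply to descendants (assumption).
        pool = self
        while pool is not None:
            if user in pool.acl:
                return True
            pool = pool.parent
        return False


root = Pool("root", acl={"admin"})
eng = Pool("eng", parent=root, acl={"alice"})
hr = Pool("hr", parent=root, acl={"bob"})
```

Here `admin` is authorized on `root.eng` and `root.hr` through the root ACL, while `alice` is authorized only on `root.eng`.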
17. Request Pool General Configuration
• Uses Yarn + Llama configs
  • Yarn fair scheduler allocation configuration (fair-scheduler.xml)
  • Llama configuration (llama-site.xml)
  • Only some of the configuration properties are used
  • See the documentation for sample config files
• Cloudera Manager has a nice UI to configure
  • No need to touch the xml files
19. Placement Rule Configuration
Please change the default values
20. Easy Config Path (Singleton Pool Only)
• If only 1 pool is needed, a separate (easy) configuration path exists
• Uses a single “default” pool
• No Yarn/Llama configs involved (not even accepted)
• Configure the pool limits with impalad flags:
  • default_pool_max_queued
  • default_pool_max_requests
  • default_pool_mem_limit
• Doesn’t work with CM5.0, fixed in CM5.0.1
21. Submitting to a Pool
22. “Debugging” Admission Control Decisions
• Rejections and timeouts return error messages
• Metrics
  • Exposed in impalad web UI: /metrics
  • Will be available in CM5.1
• Query profile has admission result
• Impalad logs have lots of useful information
admission-controller.cc:259] Schedule for id=c541aae43af74ed1:afdec812127f8097 in pool_name=root.test/admin PoolConfig(max_requests=20 max_queued=50 mem_limit=-1.00 B) query cluster_mem_estimate=42.00 MB
admission-controller.cc:265] Stats: pool=root.test/admin Total(num_running=20, num_queued=7, mem_usage=239.07 MB, mem_estimate=800.00 MB) Local(num_running=20, num_queued=7, mem_usage=239.07 MB, mem_estimate=800.00 MB)
admission-controller.cc:303] Queuing, query id=c541aae43af74ed1:afdec812127f8097
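When scanning impalad logs for admission decisions, a small parser for the `Stats:` line can help. The sketch below uses only Python's standard `re` module; the regular expression is written against the single sample line shown on this slide, so it is an assumption that other Impala versions use the same layout. `STATS_GROUP` and `parse_stats` are hypothetical names.

```python
import re

# Matches the Total(...) and Local(...) groups of the sample "Stats:" log line.
STATS_GROUP = re.compile(
    r"(Total|Local)\(num_running=(\d+), num_queued=(\d+), "
    r"mem_usage=([\d.]+ \w+), mem_estimate=([\d.]+ \w+)\)"
)


def parse_stats(line: str) -> dict:
    """Return {"Total": {...}, "Local": {...}} for one admission-controller Stats line."""
    result = {}
    for scope, running, queued, usage, estimate in STATS_GROUP.findall(line):
        result[scope] = {
            "num_running": int(running),
            "num_queued": int(queued),
            "mem_usage": usage,        # kept as text, e.g. "239.07 MB"
            "mem_estimate": estimate,
        }
    return result


line = (
    "admission-controller.cc:265] Stats: pool=root.test/admin "
    "Total(num_running=20, num_queued=7, mem_usage=239.07 MB, mem_estimate=800.00 MB) "
    "Local(num_running=20, num_queued=7, mem_usage=239.07 MB, mem_estimate=800.00 MB)"
)
stats = parse_stats(line)
```

For the sample line, the pool already has 20 running queries against `max_requests=20`, which is consistent with the `Queuing` message that follows it in the log.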
23. Metrics
24. Query Profile Information
25. Some Notes
• Queue timeout
  • Defaults to 60sec, change with --queue_wait_timeout_ms
• Running with Yarn/Llama
  • Same configs: “hard limits” enforced by Yarn+Llama
• Disabled by default for CDH4
  • Hue (<CDH4.6) doesn’t close queries
  • Enable with impalad flag (see --disable_admission_control)
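The queue timeout can be pictured as a bounded wait. The sketch below is a simplified model, not Impala's code: a `threading.Event` stands in for "resources became available", and `wait_in_queue` is a hypothetical helper. Only the 60-second default and the millisecond unit come from the slide.

```python
import threading

DEFAULT_QUEUE_WAIT_TIMEOUT_MS = 60_000  # 60 sec default, per the slide


def wait_in_queue(admitted: threading.Event,
                  timeout_ms: int = DEFAULT_QUEUE_WAIT_TIMEOUT_MS) -> str:
    """Block until the query is admitted or the queue timeout expires."""
    # Event.wait returns True if the event was set, False on timeout.
    if admitted.wait(timeout=timeout_ms / 1000.0):
        return "admitted"
    return "timed out"


slot_available = threading.Event()
first_try = wait_in_queue(slot_available, timeout_ms=10)   # nothing frees up
slot_available.set()                                       # a running query finishes
second_try = wait_in_queue(slot_available, timeout_ms=10)
```

A query that times out in the queue gets an error message back, as noted on the debugging slide.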
26. Matthew Jacobs | @mattjacobs | email@example.com