SCALING IMPALA
Manish Maheshwari | Strata London 2019 | #StrataData
manish@cloudera.com
2 © Cloudera, Inc. All rights reserved.
AGENDA
• Impala overview
• KRPC Improvements
• Scaling issues and solutions
• Understanding query profiles
• Key Takeaways
Apache Impala
• Open source
• Fast
  • Massively parallel processing (MPP)
  • C++, run time code generation, streaming
• Flexible
  • Multiple storage engines (HDFS, S3, ADLS, Apache Kudu, …)
  • Multiple file formats (Parquet, Text, Sequence, Avro, ORC, …)
• Enterprise-grade
  • Authorization, authentication, lineage tracing, auditing, encryption
  • >1400 customers, >97000 machines
• Scalable (Now even more!)
  • Large clusters with 400+ nodes
Impala Architecture
[Figure: three layers. Execution: Impala daemons, each with a Query Compiler, Query Coordinator, Query Executor and metadata cache (FE in Java, BE in C++). Metadata: Hive MetaStore, Sentry, HDFS NameNode, StateStore, Catalog. Storage: HDFS, Kudu, S3, ADLS, HBase.]
Select Query flow in Impala
• Request arrives via ODBC/JDBC
• Planner turns request into collection of plan fragments
• Coordinator initiates execution on remote Impala daemons
• Intermediate results are streamed between Impala daemons
• Query results are streamed back to the client
[Figure: a SQL app connects over ODBC/JDBC to one Impala daemon; numbered steps 0-7 trace the query across the Impala daemons (each with Query Compiler, Query Coordinator, Query Executor and metadata) and the storage layer (HDFS, Kudu, S3/ADLS, HBase).]
What’s KRPC?
• Replaces Thrift RPC for inter-daemon communication for certain RPCs, from CDH 5.15 onwards
• Reduces the number of connections in the cluster
• Reduces stress on MIT KDC / AD
• KRPC supports both synchronous and asynchronous RPCs
• KRPC supports connection multiplexing
• One connection per direction between every pair of hosts
• Fixed-size thread pool maintained by KRPC internally
• Query performance under concurrent execution improves by 2x-3x on average
[Figures: KRPC Performance, KRPC Stability, KRPC Throughput]
Scaling Issues
• I got 20 more use cases to onboard
• And our data volumes just went up 10x
• And we have 10x more tables and 100x more partitions
• And we have 10x more concurrent queries
• And our ETL is now real time
• Sure, just add more nodes, right?
• OK, did that too... My queries are still 3x slower ☹
Metadata/Catalog Cache Woes
• The Impala Catalog and daemons cache the HMS metadata plus HDFS block locations
• Memory = num of tables * 5KB + num of partitions * 2KB + num of files * 750B + num of file blocks * 300B + sum(incremental col stats per table)
• Incremental stats
  • For each table, num columns * num partitions * 400B
• E.g. a large telco has over 56K tables
  • and partitions and files and blocks and replicas…
  • Catalog memory approx. 80 GB, GC times go up significantly
• Reduced memory for query execution
• OOM issues on the catalogd
• Long metadata loading time, and a long time for the StateStore (SS) to send the catalog to all the daemons
• Slow DDL and DML statements; even `describe table` requires the catalog cache for the whole table
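The memory formula on this slide can be turned into a quick back-of-the-envelope calculator. This is only a sketch: the function names are made up for illustration, 1 KB is assumed to mean 1024 bytes, and the per-object sizes are the rough figures quoted on the slide, not measured values.

```python
# Rough catalog memory estimate, following the per-object sizes on the slide.
# Assumption: 1 KB = 1024 bytes; all names here are illustrative only.

def incremental_stats_bytes(num_columns, num_partitions):
    """Incremental stats for one table: num columns * num partitions * 400B."""
    return num_columns * num_partitions * 400


def estimate_catalog_memory_bytes(tables, partitions, files, blocks,
                                  incremental_stats_total=0):
    """tables*5KB + partitions*2KB + files*750B + blocks*300B + incremental stats."""
    return (tables * 5 * 1024
            + partitions * 2 * 1024
            + files * 750
            + blocks * 300
            + incremental_stats_total)


if __name__ == "__main__":
    # Hypothetical cluster: the table count echoes the telco example, the rest is invented.
    stats = 1_000 * incremental_stats_bytes(num_columns=50, num_partitions=2_000)
    total = estimate_catalog_memory_bytes(
        tables=56_000, partitions=5_000_000, files=50_000_000,
        blocks=60_000_000, incremental_stats_total=stats)
    print(f"approx. {total / 1024**3:.1f} GiB of catalog metadata")
```

Because every term grows linearly with object counts, the estimate makes it easy to see which of tables, partitions, files or incremental stats dominates for a given cluster.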
[Figure: CatalogD loads tables A, B, C, D from HMS and the Namenode, and the StateStore daemon sends the full catalog (A, B, C, D) to every daemon.]
Metadata/Catalog Cache - Solutions
• Regularly merge small files
• And run a Refresh Table
• Avoid data ingestion processes that produce many small files
• Use a larger block size (can be over 2GB)
• Optimal partitioning strategy
• Don’t over or under partition
• HDFS file handle cache
• Consider using HBase, Kudu, …
• If nothing works, just delete the data 😜
Dedicated Coordinators
• Coordinators - compile the queries and create the execution plan
  • Need table metadata!!
• Executors - execute the query plans and send the results to other executors / the coordinator
  • Do not need table metadata!!
• Benefits
  • Executors need less memory
  • Statestore does not need to send metadata to all nodes
  • Faster metadata updates and propagation
  • Coordinator nodes don’t need to be datanodes
Dedicated Coordinators
[Figure: CatalogD loads table metadata (tables A, B, C ... Z) from HMS and the Namenode; the StateStore daemon sends the catalog (A, B, C, D) only to the coordinators, while the executors hold no table metadata.]
Dedicated Coordinators – Best Practices
• Rule of thumb – 1 coordinator per 50 executors
• Start with just one coordinator. Run it on an edge node (it can run on datanodes too)
• Needs at least a few disks to write any spill data
• Add another coordinator when CPU / network utilization > 80%
• When using a load balancer, double the coordinator count, but set only half as active and the rest as backup
• Use sticky connections on the load balancer
• Increase fe_service_threads on the coordinators to handle client connections
• Increase the Java heap of the coordinators according to the catalog size
• Increase num_metadata_loading_threads
  • Default is 16. Increase if you have lots of tables and partitions.
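The sizing rules above (1 coordinator per 50 executors, doubled behind a load balancer with half kept as backup) can be written down as a small planning helper. The function name and return shape are hypothetical; treat the result as a planning ceiling, since the slide also recommends starting with one coordinator and adding more only when utilization exceeds 80%.

```python
import math

# Sketch of the coordinator sizing rule of thumb; names are illustrative only.

def coordinator_count(num_executors, behind_load_balancer=False,
                      executors_per_coordinator=50):
    """Return how many coordinators to plan for as active and as backup."""
    active = max(1, math.ceil(num_executors / executors_per_coordinator))
    # Behind a load balancer: double the count, half active and half backup.
    backup = active if behind_load_balancer else 0
    return {"active": active, "backup": backup}


if __name__ == "__main__":
    print(coordinator_count(120))
    print(coordinator_count(120, behind_load_balancer=True))
```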
Further Catalog Cache Improvements (In Beta)
• On-demand metadata for coordinators from the CatalogD
• Metadata LRU cache (forget old tables)
• Metadata release on memory pressure
• Smart cache invalidation
• Compressed incremental stats
See IMPALA-7127 for full list of improvements
[Figure: CatalogD loads tables A, B, C ... Z from HMS + Namenode and receives HMS notifications; each coordinator caches only what it was asked for: Coordinator #1 holds A after `describe A`, Coordinator #2 holds B after `describe B`, Coordinator #3 holds C after `describe C`.]
HMS Notifications
[Figure: Hive Metastore, HDFS NameNode, StateStore and Catalog exchanging metadata and lightweight HMS notifications with the query coordinators.]
Metadata on-demand (In Beta)
• Use the local catalog cache
  • Set --catalog_topic_mode=minimal on the catalog daemons
  • Set --use_local_catalog=true on the coordinators
• Time-based catalog cache eviction
  • Set invalidate_tables_timeout_s on both catalogd and coordinators
  • E.g. invalidate_tables_timeout_s=3600 will invalidate tables that have not been used for more than 1 hour
• Memory-based catalog cache eviction
  • Set invalidate_tables_on_memory_pressure=true on both catalogd and impalad
  • When the memory pressure reaches 60% of JVM heap size after a Java garbage collection in catalogd, Impala invalidates 10% of the least recently used tables
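To make the memory-based eviction rule concrete, here is a toy model of it. It is not Impala's implementation: the class, its methods and the byte accounting are invented for illustration, and only the 60% threshold and the 10% least-recently-used eviction come from the slide.

```python
from collections import OrderedDict

# Toy model of memory-pressure eviction; not Impala's actual code.

class TableCacheSketch:
    def __init__(self, heap_bytes, pressure_fraction=0.6, evict_fraction=0.1):
        self.heap_bytes = heap_bytes
        self.pressure_fraction = pressure_fraction
        self.evict_fraction = evict_fraction
        self._tables = OrderedDict()  # name -> size, least recently used first

    def touch(self, name, size_bytes):
        """Record that a table was loaded or used, making it most recently used."""
        self._tables[name] = size_bytes
        self._tables.move_to_end(name)

    def used_bytes(self):
        return sum(self._tables.values())

    def after_gc(self):
        """Called after a (simulated) GC; returns the names of evicted tables."""
        if not self._tables:
            return []
        if self.used_bytes() < self.pressure_fraction * self.heap_bytes:
            return []
        # Evict 10% of the cached tables, oldest first (at least one).
        count = max(1, int(len(self._tables) * self.evict_fraction))
        evicted = []
        for _ in range(count):
            name, _size = self._tables.popitem(last=False)
            evicted.append(name)
        return evicted


if __name__ == "__main__":
    cache = TableCacheSketch(heap_bytes=1000)
    for i in range(10):
        cache.touch(f"t{i}", 70)
    print(cache.after_gc())
```

Tables that are queried regularly keep moving to the recent end of the ordering, so under this rule they tend to survive eviction while rarely used tables are dropped and reloaded on demand.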
Impala Architecture – Coordinators and Executors
[Figure: three layers. Execution: Impala coordinator(s) with Query Compiler, Query Coordinator, Query Executor and metadata cache (FE in Java, BE in C++), plus many Impala executors that run only a Query Executor. Metadata: Hive MetaStore, Sentry, HDFS NameNode, StateStore, Catalog. Storage: HDFS, Kudu, S3, ADLS, HBase.]
Admission Control Woes
• Impala Admission Control (IAC) not enabled / default memory limit not set for each pool
• Memory estimation is heuristics-based and not 100% accurate. Worse if table stats are unavailable
  • Group by estimates can be particularly off when there is a large number of group by columns
  • Mem estimate = NDV of group by column 1 * NDV of group by column 2 * … * NDV of group by column n
• Under-admission due to higher than required memory reserved on each daemon
• Queries will OOM out unnecessarily
• Query Status: Admission for query exceeded timeout 60000ms in pool root.nprd_tst_hadoop_data_appl_readonly. Queued reason: Not enough aggregate memory available in pool root.nprd_tst_hadoop_data_appl_readonly with max mem resources 150.00 GB. Needed 40.00 GB but only 30.00 GB was available.
• Always enable IAC
• Limit the amount of memory used by an individual query using a per-query mem_limit
  • Set it from Impala shell / Hue - set mem_limit=<per query limit>
• Set a default memory limit per pool
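A short calculation shows why the group by estimate can overshoot. The sketch below multiplies the per-column NDVs as described on the slide; the function name and the example numbers are made up, and the real planner applies more logic than this.

```python
import math

# Sketch of the NDV-product heuristic for group by cardinality.
# Illustrative only; the numbers in the example are invented.

def group_by_cardinality_estimate(ndvs):
    """Product of the number of distinct values of each group by column."""
    return math.prod(ndvs)


if __name__ == "__main__":
    # Hypothetical columns: customer_id, day_of_year, country.
    estimate = group_by_cardinality_estimate([1000, 365, 50])
    table_rows = 1_000_000
    # The product can exceed the row count of the table itself,
    # which is one way the memory estimate ends up far too high.
    print(estimate, estimate > table_rows)
```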
Query Concurrency Woes
• Impala Admission Control is decentralized: each coordinator makes an independent decision on the basis of the last known running queries on the cluster, as communicated by the StateStore (SS)
• This makes IAC decisions fast, but they might be a little imprecise during times of heavy load across many daemons. This is called over-admission.
• Using dedicated coordinators limits this over-admission of queries!!
• You get controlled query concurrency, and thus each query runs faster and overall query throughput is higher
• Ideal total query concurrency = # cores on the executors / datanodes
Impala Resource Pools
Resource Pools Design -
• 10 Impala daemons, 200 GB per daemon – 2 TB total, 8 tenants
• OK. Let's divide the memory among tenants. Everyone gets what they pay for. Good design, right?
Issues
• Unused memory cannot be used by other tenants
• Busy tenants queue up queries in admission control, causing overall "slowness" in query execution
• Small tenants running large queries will spill to disk until the spill-to-disk limit and eventually OOM out
• How bad does it get - 25+ tenants, only 2/3 active at any given time
Scaling Impala Resource Pools
• Design resource pools according to peak memory needed
• Use simple grouping to create small, medium and large pools
• Use Cloudera Manager APIs for user chargeback if needed
Metadata Operations
• Invalidate Metadata
  • Runs asynchronously to discard the loaded metadata catalog cache; the metadata load will be triggered by any subsequent queries
  • Should be run when
    • New tables are created / tables are dropped by Hive/Spark
    • Block locations are changed by the HDFS load balancer
• Recover Partitions
  • Scans HDFS to check if any new partition directories were added and caches block metadata for those files
• Refresh Table / Refresh Table Partition
  • Reloads metadata for the table from HMS and does an incremental reload of the file and block metadata
  • Should be run after
    • Adding/removing/overwriting files in partitions via Hive/Spark
    • Running operations like ALTER TABLE
Scaling Metadata Operations
• How bad does it get?
  • 18K Invalidate tables per day.
• What not to do
• No commands are needed if operations/ETL run in Impala
• Always run refresh <table> <partition> when adding data
• Recover partitions when partitions are added
• Refresh Table for other changes
• Limit Invalidate Metadata (IM) to <Table> only
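The guidance on which metadata command to run can be captured in a small lookup that an ETL wrapper might call after each load. The function name and the change labels ("data_added", "partitions_added", "table_created_or_dropped") are hypothetical; the returned strings use the standard Impala statements REFRESH, ALTER TABLE ... RECOVER PARTITIONS and INVALIDATE METADATA.

```python
# Sketch: pick the cheapest metadata command for a given external change.
# The change labels and the function name are assumptions for illustration.

def metadata_command(change, table, partition_spec=None, made_in_impala=False):
    """Return the Impala statement to run, or None if nothing is needed."""
    if made_in_impala:
        # No commands are needed if the operation/ETL ran in Impala itself.
        return None
    if change == "data_added":
        if partition_spec:
            return f"REFRESH {table} PARTITION ({partition_spec})"
        return f"REFRESH {table}"
    if change == "partitions_added":
        return f"ALTER TABLE {table} RECOVER PARTITIONS"
    if change == "table_created_or_dropped":
        # Keep INVALIDATE METADATA scoped to a single table.
        return f"INVALIDATE METADATA {table}"
    # Any other change made outside Impala: a table-level refresh.
    return f"REFRESH {table}"


if __name__ == "__main__":
    print(metadata_command("data_added", "sales.orders", "year=2019"))
```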
Automatic Metadata Sync (In Beta)
• CatalogD polls Hive Metastore (HMS) notification events
  • Invalidates the tables when it receives ALTER TABLE events, or ALTER, ADD or DROP events for their partitions
  • Adds the tables or databases when it receives CREATE TABLE or CREATE DATABASE events
  • Removes the tables from catalogd when it receives DROP TABLE or DROP DATABASE events
• Operations that do not generate events in HMS, such as adding new data to existing tables/partitions from Spark, are not supported
  • I.e. Load / Insert still needs a refresh table partition
Automatic Metadata Sync (In Beta)
• To disable the event-based HMS sync for a new database, set the impala.disableHmsSync database property in Hive
  • CREATE DATABASE <name> WITH DBPROPERTIES ('impala.disableHmsSync'='true');
• To enable or disable the event-based HMS sync for a new table
  • CREATE TABLE <name> (<columns>) TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');
• To change the event-based HMS sync at the table level
  • ALTER TABLE <name> SET TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');
• When both table and database level properties are set, the table level property takes precedence.
• If the property is changed from true (meaning events are skipped) to false (meaning events are not skipped), issue a manual INVALIDATE METADATA command to reset.
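The precedence rule between the table-level and database-level property can be expressed in a few lines. This is a sketch only: the function name is invented, and it assumes that when neither level sets impala.disableHmsSync, events are processed (sync is not disabled).

```python
# Sketch of how table-level and database-level impala.disableHmsSync combine.
# Assumption: an unset property means sync is NOT disabled.

PROPERTY = "impala.disableHmsSync"


def events_processed(db_props, table_props):
    """True if HMS events for the table are processed, False if they are skipped."""
    # The table-level property takes precedence over the database-level one.
    value = table_props.get(PROPERTY, db_props.get(PROPERTY, "false"))
    return value.lower() != "true"


if __name__ == "__main__":
    db = {PROPERTY: "true"}       # sync disabled for the whole database
    table = {PROPERTY: "false"}   # but re-enabled for this one table
    print(events_processed(db, table))
```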
Scaling Compute Stats
• Compute Stats is very CPU-intensive; its cost depends on the number of rows, the number of data files, the total size of the data files, and the file format
• For partitioned tables, the numbers are calculated per partition, and as totals for the whole table
• Limit the columns: only compute stats on columns involved in filters, join conditions, group by or partition by clauses
• Re-compute stats only when there is > 30% data change
• Run compute stats on weekends/nights. Not needed after every data load.
• If you reload a complete new set of data for a table, but the number of rows and number of distinct values for each column is relatively unchanged from before, you do not need to recompute stats for the table
• Use enable_stats_extrapolation (experimental)
Set Statistics Manually
• Quick fix as part of data load, while compute stats can be scheduled on weekends
• Set total number of rows. Applies to both unpartitioned and partitioned tables.
  • alter table <table_name> set tblproperties('numRows'='new_value', 'STATS_GENERATED_VIA_STATS_TASK'='true');
• Set total number of rows for a specific partition. Applies to partitioned tables only. You must specify all the partition key columns in the PARTITION clause.
  • alter table <table_name> partition (keycol1=val1,keycol2=val2...) set tblproperties('numRows'='new_value', 'STATS_GENERATED_VIA_STATS_TASK'='true');
• Column stats:
  • ALTER TABLE <table_name> SET COLUMN STATS <col_name> ('numDVs'='100')
  • Compute numDVs with "SELECT NDV(col)"
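If row counts are known at load time, the statements above can be generated by the load job itself. The helpers below are a sketch with invented names; they only build the SQL text shown on the slide, do no escaping, and assume table, column and partition names come from trusted code rather than user input.

```python
# Sketch: build the manual-statistics statements as strings.
# Helper names are illustrative; inputs are assumed to be trusted identifiers.

def set_table_row_count_sql(table, num_rows, partition=None):
    """ALTER TABLE statement setting numRows for a table or a single partition."""
    part = ""
    if partition:
        # All partition key columns must be listed, e.g. {"year": 2019, "region": "emea"}.
        spec = ", ".join(
            f"{key}='{value}'" if isinstance(value, str) else f"{key}={value}"
            for key, value in partition.items())
        part = f" partition ({spec})"
    return (f"alter table {table}{part} set tblproperties("
            f"'numRows'='{int(num_rows)}', "
            f"'STATS_GENERATED_VIA_STATS_TASK'='true')")


def set_column_ndv_sql(table, column, ndv):
    """ALTER TABLE statement setting the number of distinct values of a column."""
    return f"ALTER TABLE {table} SET COLUMN STATS {column} ('numDVs'='{int(ndv)}')"


if __name__ == "__main__":
    print(set_table_row_count_sql("sales", 1000, {"year": 2019, "region": "emea"}))
    print(set_column_ndv_sql("sales", "customer_id", 100))
```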
Other Scalability Considerations
• Use star schemas, integer join keys
• Check for hot spotting - increase the replication factor for master data / frequently queried data
• Avoid casts – implicit or explicit (easily over 10% improvement for larger volumes)
• Increase RUNTIME_FILTER_WAIT_TIME_MS for complicated queries, but coordinators need to do more work
• Use HDFS file handle cache
• Give the OS enough free memory to cache data blocks
• Set default compression codec - improves disk read performance
• Use high CPU nodes, fast processors
• Impala clusters - DistCp the data from remote cluster
BI Tools
• Always, always close queries
  • idle_query_timeout = 60
  • idle_session_timeout = 1800
• Use handcrafted SQL
• Use different pools for different queries and encourage use of set mem_limit;
• Use JDBC over Kerberos authentication mechanism
Understanding Query Profiles
• Impala query profiles can be retrieved from Cloudera Manager, the Impala coordinator web UI, or from the command line by executing `profile`
• Includes nanosecond timers for all operations on all nodes
• Quite detailed and exhaustive, but the basics are easy
• We can easily answer -
  • What’s the bottleneck for this query?
  • Why is this run fast but that run slow?
  • How can I tune to improve this query’s performance?
Understanding Query Profiles
• Always check the Impala version and the default query options set
• Check warnings
• Query state – running, cancelled
• Check query type – Query, DDL, etc.
Understanding Query Profiles
• Check per-node peak memory usage
  • Tells you how much the memory limit should be for these queries
  • Shows skews in memory usage among nodes
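Once the per-node peak memory figures have been copied out of a profile, a few lines are enough to spot skew and find the highest peak. The input format (a dict of node name to peak bytes) and the function name are assumptions; parsing the profile text itself is out of scope here.

```python
# Sketch: summarise per-node peak memory taken from a query profile.
# The input dict is assumed to have been extracted by hand or by another tool.

def peak_memory_summary(per_node_peak_bytes):
    """Return the max peak, the mean peak and their ratio as a skew indicator."""
    peaks = list(per_node_peak_bytes.values())
    max_peak = max(peaks)
    mean_peak = sum(peaks) / len(peaks)
    return {
        "max_peak": max_peak,    # lower bound for a workable per-query mem_limit
        "mean_peak": mean_peak,
        "skew": max_peak / mean_peak,  # close to 1.0 means evenly balanced
    }


if __name__ == "__main__":
    gib = 1024 ** 3
    print(peak_memory_summary({"n1": 6 * gib, "n2": 2 * gib, "n3": 2 * gib, "n4": 2 * gib}))
```

A skew well above 1.0 suggests one node is doing a disproportionate share of the work, which is worth checking against data skew in the summary section of the profile.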
Understanding Query Profiles
• For completed queries, read the summary in detail
• Check what’s taking the max time and max memory; check for skews in data
• Check row estimates; depending on whether stats are available or not, these can be skewed
• Check the join order, which is determined entirely by total size (#rows * column width)
• Try to ensure that, after partition pruning, the RHS is smaller than the LHS
• Broadcast joins are the default; partitioned joins are used for large tables of roughly equal size
Understanding Query Profiles
• Read the query timeline in detail
• Check which step is taking the most time and why
• Usual culprits
  • Metadata load
  • Completed admission
  • ClientFetchWaitTimer
  • First dynamic filter received
  • Last row fetched
Understanding Query Profiles
• Check each plan fragment
• Tells us what it did and how many hosts it ran on
• How much data it processed
• Partition pruning stats for HDFS scans
• Parquet push down predicates
Key Takeaways
• Always use dedicated coordinators/executors
• IAC should be enabled and memory limit set
• Metadata management is significantly improved
• Zero touch metadata coming soon
• Follow best practices for Impala queries and performance tuning – refer to the Impala cookbook
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 

Strata London 2019 Scaling Impala

  • 1. SCALING IMPALA Manish Maheshwari | Strata London 2019 | #StrataData manish@cloudera.com
  • 2. AGENDA
      • Impala overview
      • KRPC Improvements
      • Scaling issues and solutions
      • Understanding query profiles
      • Key Takeaways
      © Cloudera, Inc. All rights reserved.
  • 3. Apache Impala
      • Open source
      • Fast
          • Massively parallel processing (MPP)
          • C++, runtime code generation, streaming
      • Flexible
          • Multiple storage engines (HDFS, S3, ADLS, Apache Kudu, …)
          • Multiple file formats (Parquet, Text, Sequence, Avro, ORC, …)
      • Enterprise-grade
          • Authorization, authentication, lineage tracing, auditing, encryption
          • >1400 customers, >97000 machines
      • Scalable (Now even more!)
          • Large clusters with 400+ nodes
  • 4. Impala Architecture
      [Diagram: Impala daemons, each with a Query Compiler, Query Coordinator, Query Executor and Metadata Cache (FE in Java, BE in C++); metadata layer: Hive MetaStore, Sentry, HDFS NameNode, StateStore, Catalog; execution layer: Impala daemons; storage layer: HDFS, Kudu, S3, ADLS, HBase]
  • 5. Select Query flow in Impala
      • Request arrives via ODBC/JDBC
      • Planner turns request into collection of plan fragments
      • Coordinator initiates execution on remote Impala daemons
      • Intermediate results are streamed between Impala daemons
      • Query results are streamed back to the client
      [Diagram: a SQL app connects over ODBC/JDBC to an Impala daemon; numbered steps 0-7 flow across Impala daemons (each with Query Compiler, Query Coordinator, Query Executor and Metadata) on top of HDFS, Kudu, S3/ADLS and HBase]
  • 6. What’s KRPC?
      • Replaces Thrift RPC for inter-daemon communication in CDH 5.15+ for certain RPCs
          • Reduces # of connections in cluster
          • Reduces stress on MIT KDC / AD
      • KRPC supports both synchronous and asynchronous RPCs
      • KRPC supports connection multiplexing
          • One connection per direction between every pair of hosts
      • Fixed size thread pool maintained by KRPC internally
      • Query performance during concurrent execution improves by 2x-3x on average
  • 7. KRPC Performance
  • 8. KRPC Stability
  • 9. KRPC Throughput
  • 10. Scaling Issues
      • I've got 20 more use cases to onboard
      • And our data volumes just went up 10x
      • And we have 10x more tables and 100x more partitions
      • And we have 10x more concurrent queries
      • And our ETL is now real time
      • Sure, just add more nodes, right?
      • OK, did that too... My queries are still 3x slower ☹
  • 11. Metadata/Catalog Cache Woes
      • Impala Catalog and Daemons cache the HMS Metadata + HDFS Block locations
      • Memory = num of tables * 5KB + num of partitions * 2KB + num of files * 750B + num of file blocks * 300B + sum(incremental col stats per table)
      • Incremental stats
          • For each table, num columns * num partitions * 400B
      • e.g. A large telco has over 56K tables
          • and partitions and files and blocks and replicas…
          • Catalog memory approx. 80 GB, GC times go up significantly
      • Reduced memory for query execution
      • OOM issues on the catalogd
      • Long metadata loading time, and a long time for the StateStore (SS) to send the catalog to all the daemons
      • Slow DDL and DML statements; even `describe table` requires the catalog cache for the whole table
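The sizing formula on this slide can be turned into a quick back-of-the-envelope calculator. The sketch below is only an illustration: the function names are hypothetical, KB is taken as 1024 bytes (the slide does not say), and the per-object sizes are the approximate figures quoted above, not measured values.

```python
# Approximate per-object sizes quoted on the slide, in bytes.
# Assumption: 1 KB = 1024 bytes; the slide does not specify.
TABLE_BYTES = 5 * 1024
PARTITION_BYTES = 2 * 1024
FILE_BYTES = 750
BLOCK_BYTES = 300
INCREMENTAL_STATS_BYTES = 400  # per column, per partition


def incremental_stats_bytes(num_columns, num_partitions):
    """Incremental column stats for one table: columns * partitions * 400B."""
    return num_columns * num_partitions * INCREMENTAL_STATS_BYTES


def catalog_memory_bytes(num_tables, num_partitions, num_files, num_blocks,
                         incremental_stats=0):
    """Rough catalog cache footprint, following the slide's formula.

    incremental_stats is the sum of incremental_stats_bytes() over all
    tables that use incremental stats.
    """
    return (num_tables * TABLE_BYTES
            + num_partitions * PARTITION_BYTES
            + num_files * FILE_BYTES
            + num_blocks * BLOCK_BYTES
            + incremental_stats)


if __name__ == "__main__":
    # Hypothetical table: 100 columns, 10,000 partitions, 100,000 files,
    # one block per file.
    stats = incremental_stats_bytes(100, 10_000)
    total = catalog_memory_bytes(1, 10_000, 100_000, 100_000, stats)
    print(f"incremental stats: {stats / 1e6:.0f} MB, total: {total / 1e6:.0f} MB")
```

For this hypothetical table the incremental stats term dominates the total, which is why wide, heavily partitioned tables are the ones to watch.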
  • 12. Metadata/Catalog Cache Woes
      [Diagram: HMS (Tables A, B, C, D) and the Namenode feed CatalogD (A,B,C,D); via the StateStore daemon, every Impala daemon holds a copy of A,B,C,D]
  • 13. Metadata/Catalog Cache - Solutions
      • Regularly merge small files
          • And run a Refresh Table
      • Avoid data ingestion processes that produce many small files
      • Use a larger block size (can be over 2GB)
      • Optimal partitioning strategy
          • Don’t over or under partition
      • HDFS file handle cache
      • Consider using HBase, Kudu, …
      • If nothing works, just delete the data 😜
  • 14. Dedicated Coordinators
      • Coordinators: compile the queries and create the execution plan
          • Need table metadata!!
      • Executors: execute the query plans and send the results to other executors / the coordinator
          • Do not need table metadata!!
      • Benefits
          • Executors need less memory
          • Statestore does not need to send metadata to all nodes
          • Faster metadata updates and propagation
          • Coordinator nodes don’t need to be datanodes
  • 15. Dedicated Coordinators
      [Diagram: HMS (Tables A, B, C, … Z) and the Namenode feed CatalogD (A,B,C,D); via the StateStore daemon, only the Coordinators hold A,B,C,D; the Executors hold no table metadata]
  • 16. Dedicated Coordinators – Best Practices
      • Rule of thumb: 1 coordinator per 50 executors
      • Start with just one coordinator. Run it on an edge node. (Can run on datanodes too)
          • Needs at least a few disks to write any spill data
      • Add another coordinator when CPU / network utilization > 80%
      • When using a load balancer, double the coordinator count, but set only half as active and the rest as backup
          • Use sticky connections on the load balancer
      • Increase fe_service_threads on the coordinators to handle client connections
      • Increase the Java heap of the coordinators according to the catalog size
      • Increase num_metadata_loading_threads
          • Default is 16. Increase if you have lots of tables and partitions.
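As a planning aid, the rule of thumb and the load-balancer guidance above can be written as a small helper. This is a sketch under stated assumptions: the function name and return shape are hypothetical, and the result is a planning ceiling rather than a starting point, since the slide recommends starting with one coordinator and adding more only when utilization exceeds 80%.

```python
import math


def coordinators_needed(num_executors, executors_per_coordinator=50,
                        behind_load_balancer=False):
    """Rule-of-thumb coordinator count: 1 coordinator per 50 executors.

    With a load balancer, the count is doubled and the extra half is kept
    as backup, as suggested on the slide.
    """
    if num_executors < 1:
        raise ValueError("num_executors must be at least 1")
    active = max(1, math.ceil(num_executors / executors_per_coordinator))
    backup = active if behind_load_balancer else 0
    return {"active": active, "backup": backup}


if __name__ == "__main__":
    print(coordinators_needed(120))
    print(coordinators_needed(120, behind_load_balancer=True))
```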
  • 17. Further Catalog Cache Improvements (In Beta)
      • On-Demand metadata for coordinators from the CatalogD
      • Metadata LRU cache (forget old tables)
      • Metadata release on memory pressure
      • Smart Cache Invalidation
      • Compressed Incremental Stats
      • See IMPALA-7127 for full list of improvements
      [Diagram: HMS + Namenode hold Tables A, B, C, … Z and send HMS Notifications to CatalogD; Coordinator #1 caches only A after `impala-shell> describe A`, Coordinator #2 only B after `describe B`, Coordinator #3 only C after `describe C`]
  • 18. HMS Notifications
      [Diagram: Hive Metastore, HDFS NameNode, StateStore and Catalog; HMS Notifications reach the Catalog, and lightweight notifications are passed on to the Query Coordinators, each of which keeps its own Metadata]
  • 19. Metadata on-demand (In Beta)
      • Use the local catalog cache
          • Set --catalog_topic_mode=minimal on the catalog daemons
          • Set --use_local_catalog=true on the coordinators
      • Time-based catalog cache eviction
          • Set invalidate_tables_timeout_s on both catalogd and coordinators
          • E.g. invalidate_tables_timeout_s=3600 invalidates tables that have gone unused for more than 1 hour
      • Memory-based catalog cache eviction
          • Set invalidate_tables_on_memory_pressure=true on both catalogd and impalad
          • When memory pressure reaches 60% of the JVM heap size after a Java garbage collection in catalogd, Impala invalidates 10% of the least recently used tables
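The memory-based eviction rule described on this slide (60% heap threshold, 10% of least recently used tables) can be illustrated with a few lines of Python. This is only a model of the policy as stated on the slide, not Impala's implementation; the function name, the `last_used` mapping and the rounding-up of the 10% figure are assumptions made for the example.

```python
import math


def tables_to_invalidate(last_used, heap_used, heap_max,
                         pressure_percent=60, evict_percent=10):
    """Pick the tables an LRU policy would invalidate under memory pressure.

    last_used maps table name -> last access time (any sortable value).
    Returns an empty list while heap usage is below the threshold;
    otherwise returns the least recently used evict_percent of tables.
    """
    if heap_max <= 0:
        raise ValueError("heap_max must be positive")
    if heap_used * 100 < heap_max * pressure_percent:
        return []
    # Assumption: round up so that at least one table is evicted.
    count = math.ceil(len(last_used) * evict_percent / 100)
    oldest_first = sorted(last_used, key=last_used.get)
    return oldest_first[:count]


if __name__ == "__main__":
    # 20 hypothetical tables; t0 was used longest ago.
    last_used = {f"t{i}": i for i in range(20)}
    print(tables_to_invalidate(last_used, heap_used=70, heap_max=100))
    print(tables_to_invalidate(last_used, heap_used=50, heap_max=100))
```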
  • 20. Impala Architecture – Coordinators and Executors
      [Diagram: Impala Coordinator(s), each with Query Compiler, Query Coordinator, Query Executor and Metadata Cache (FE in Java, BE in C++), in front of many Impala Executors running only the Query Executor; metadata layer: Hive MetaStore, Sentry, HDFS NameNode, StateStore, Catalog; storage layer: HDFS, Kudu, S3, ADLS, HBase]
  • 21. Admission Control Woes
      • Impala Admission Control (IAC) not enabled / default memory limit not set for each pool
      • Memory estimation is heuristics-based and not 100% accurate; it is worse if table stats are unavailable
          • GROUP BY estimates can be particularly off when there is a large number of group-by columns
          • Mem estimate = NDV of group by column 1 * NDV of group by column 2 * … * NDV of group by column n
      • Under-admission due to higher than required memory reserved on each daemon
      • Queries will OOM out unnecessarily
      • Example: Query Status: Admission for query exceeded timeout 60000ms in pool root.nprd_tst_hadoop_data_appl_readonly. Queued reason: Not enough aggregate memory available in pool root.nprd_tst_hadoop_data_appl_readonly with max mem resources 150.00 GB. Needed 40.00 GB but only 30.00 GB was available.
      • Always enable IAC
      • Limit the amount of memory used by an individual query using a per-query mem_limit
          • Set it from Impala shell / Hue: set mem_limit=<per query limit>
      • Set a default memory limit per pool
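To see why the group-by estimate on this slide can run away, it helps to multiply a few NDVs by hand. The sketch below uses hypothetical column NDVs and a hypothetical row count; comparing the product against the row count is an observation about the data (a table cannot contain more groups than rows), not a description of what the Impala planner does.

```python
import math


def naive_group_estimate(ndvs):
    """Product of per-column NDVs, as in the slide's group-by estimate."""
    return math.prod(ndvs)


def overestimate_factor(ndvs, table_rows):
    """How far the NDV product exceeds the hard upper bound on groups.

    The real number of groups can never exceed the number of rows, so a
    factor above 1.0 means the product is certainly too high.
    """
    if table_rows <= 0:
        raise ValueError("table_rows must be positive")
    return naive_group_estimate(ndvs) / table_rows


if __name__ == "__main__":
    # Hypothetical GROUP BY customer_id, day_of_year, product_category
    ndvs = [1000, 365, 50]
    print(naive_group_estimate(ndvs))            # product of the NDVs
    print(overestimate_factor(ndvs, 2_000_000))  # versus a 2M-row table
```

With correlated columns the true number of groups is usually far below the product, which is one reason a per-query mem_limit is a safer guard than relying on the estimate.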
  • 22. Query Concurrency Woes
      • Impala Admission Control is decentralized: each coordinator makes an independent decision on the basis of the last known running queries on the cluster, as communicated by the StateStore (SS)
      • This makes IAC decisions fast, but they might be a little imprecise during times of heavy load across many daemons. This is called over-admission.
      • Using dedicated coordinators limits this over-admission of queries!!
      • You get controlled query concurrency, so each query runs faster and overall query throughput is higher
      • Ideal total query concurrency = # cores on the executors / datanodes
  • 23. Impala Resource Pools
      • Resource Pools Design
          • 10 Impala daemons, 200 GB per daemon = 2 TB total, 8 tenants
          • OK. Let's divide the memory among the tenants. Everyone gets what they pay for. Good design, right?
      • Issues
          • Unused memory cannot be used by other tenants
          • Busy tenants queue up queries in admission control, causing overall "slowness" in query execution
          • Small tenants running large queries will spill to disk until the spill-to-disk limit and eventually OOM out
      • How bad does it get? 25+ tenants, only 2/3 active at any given time
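The cost of statically carving memory into tenant pools is easy to quantify. The sketch below assumes an equal split between tenants (the slide only says everyone gets what they pay for) and reads "2/3 active" as two or three tenants being active at once; both are assumptions, and the function names are hypothetical.

```python
def per_tenant_share_gb(num_daemons, mem_per_daemon_gb, num_tenants):
    """Memory each tenant gets if the cluster is split equally."""
    if num_tenants < 1:
        raise ValueError("num_tenants must be at least 1")
    return num_daemons * mem_per_daemon_gb / num_tenants


def reserved_but_idle_fraction(num_tenants, active_tenants):
    """Fraction of cluster memory locked in pools of inactive tenants."""
    if num_tenants < 1 or not 0 <= active_tenants <= num_tenants:
        raise ValueError("need 0 <= active_tenants <= num_tenants, num_tenants >= 1")
    return (num_tenants - active_tenants) / num_tenants


if __name__ == "__main__":
    # Slide example: 10 daemons * 200 GB, 8 tenants
    print(per_tenant_share_gb(10, 200, 8))
    # "How bad does it get": 25 tenants, 3 of them active
    print(reserved_but_idle_fraction(25, 3))
```

Under these assumptions most of the cluster memory sits in pools nobody is using, while the few active tenants queue or spill inside their own small share.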
  • 24. Scaling Impala Resource Pools
      • Design resource pools according to peak memory needed
      • Use simple grouping to create small, medium and large pools
      • Use Cloudera Manager APIs for user chargeback if needed
  • 25. Metadata Operations
      • Invalidate Metadata
          • Runs asynchronously to discard the loaded metadata in the catalog cache; the metadata load is triggered by any subsequent query
          • Should be run when
              • New tables are created / tables are dropped by Hive/Spark
              • Block locations are changed by the HDFS load balancer
      • Recover Partitions
          • Scans HDFS to check whether any new partition directories were added and caches block metadata for those files
      • Refresh Table / Refresh Table Partition
          • Adding/removing/overwriting files in partitions via Hive/Spark
          • Running operations like ALTER TABLE
          • Reloads metadata for the table from HMS and does an incremental reload of the file and block metadata
  • 26. Scaling Metadata Operations
      • How bad does it get?
          • 18K invalidate-table operations per day
      • What not to do
          • No commands are needed if operations/ETL run in Impala
          • Always run REFRESH <table> PARTITION <partition> when adding data
          • Recover partitions when partitions are added
          • Refresh Table for other changes
          • Limit INVALIDATE METADATA (IM) to a single <table> only
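The guidance on the two metadata slides amounts to a lookup from "what changed" to "which statement to run". The helper below is a hypothetical sketch of that lookup: the change labels and function name are invented for illustration, while the emitted statements use standard Impala syntax (`INVALIDATE METADATA <table>`, `ALTER TABLE <table> RECOVER PARTITIONS`, `REFRESH <table> [PARTITION (...)]`).

```python
def metadata_statement(change, table, partition_spec=None):
    """Return the cheapest metadata statement for a kind of change.

    change is one of the hypothetical labels below; partition_spec is an
    already formatted spec such as "dt='2019-05-01'". Returns None when
    no statement is needed.
    """
    if change == "changed_in_impala":
        # Impala already knows about changes it made itself.
        return None
    if change in ("table_created_externally", "table_dropped_externally",
                  "blocks_rebalanced"):
        # Always scoped to one table, never a global invalidate.
        return f"INVALIDATE METADATA {table}"
    if change == "partition_dirs_added":
        return f"ALTER TABLE {table} RECOVER PARTITIONS"
    if change == "files_changed_externally":
        if partition_spec:
            return f"REFRESH {table} PARTITION ({partition_spec})"
        return f"REFRESH {table}"
    raise ValueError(f"unknown change type: {change}")


if __name__ == "__main__":
    print(metadata_statement("files_changed_externally", "sales.orders",
                             "dt='2019-05-01'"))
    print(metadata_statement("partition_dirs_added", "sales.orders"))
```

Putting the partition-level REFRESH first in an ingestion job keeps the reload incremental, which is the point of avoiding thousands of invalidates per day.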
  • 27. Automatic Metadata Sync (In Beta)
      • CatalogD polls Hive Metastore (HMS) notification events
      • Invalidates the tables when it receives ALTER TABLE events, or ALTER, ADD, or DROP events for their partitions
      • Adds the tables or databases when it receives CREATE TABLE or CREATE DATABASE events
      • Removes the tables from catalogd when it receives DROP TABLE or DROP DATABASE events
      • Operations that do not generate events in HMS, such as adding new data to existing tables/partitions from Spark, are not supported
          • I.e., Load / Insert still needs a refresh of the table partition
  • 28. Automatic Metadata Sync (In Beta)
      • To disable the event-based HMS sync for a new database, set the impala.disableHmsSync database property in Hive
          • CREATE DATABASE <name> WITH DBPROPERTIES ('impala.disableHmsSync'='true');
      • To enable or disable the event-based HMS sync for a new table
          • CREATE TABLE <name> ... TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');
      • To change the event-based HMS sync at the table level
          • ALTER TABLE <name> SET TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');
      • When both table-level and database-level properties are set, the table-level property takes precedence
      • If the property is changed from true (meaning events are skipped) to false (meaning events are not skipped), issue a manual INVALIDATE METADATA command to reset
  • 29. Scaling Compute Stats
      • Compute Stats is very CPU-intensive; the cost depends on the number of rows, the number of data files, the total size of the data files, and the file format
      • For partitioned tables, the numbers are calculated per partition, and as totals for the whole table
      • Limit the number of columns: only compute stats on columns involved in filters, join conditions, group by or partition by clauses
      • Re-compute stats only when there is > 30% data change
      • Run compute stats on weekends/nights. Not needed after every data load.
      • If you reload a complete new set of data for a table, but the number of rows and number of distinct values for each column is relatively unchanged from before, you do not need to recompute stats for the table
      • Use enable_stats_extrapolation (experimental)
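The "> 30% data change" rule can be checked mechanically in a load pipeline before scheduling COMPUTE STATS. This is a minimal sketch: the function name is hypothetical, and measuring change as changed rows relative to the row count at the last stats run is an assumption, since the slide does not say how the change should be measured.

```python
def should_recompute_stats(rows_at_last_stats, rows_changed, threshold_percent=30):
    """True when more than threshold_percent of the rows have changed.

    rows_changed counts rows added, removed or rewritten since stats were
    last computed (an assumed definition of "data change").
    """
    if rows_at_last_stats < 0 or rows_changed < 0:
        raise ValueError("row counts must be non-negative")
    if rows_at_last_stats == 0:
        # No baseline: any data at all means stats are missing or stale.
        return rows_changed > 0
    # Integer arithmetic avoids floating point edge cases at the boundary.
    return rows_changed * 100 > rows_at_last_stats * threshold_percent


if __name__ == "__main__":
    print(should_recompute_stats(1_000_000, 350_000))
    print(should_recompute_stats(1_000_000, 100_000))
```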
  • 30. Set Statistics Manually
      • Quick fix as part of data load, while compute stats can be scheduled on weekends
      • Set total number of rows. Applies to both unpartitioned and partitioned tables.
          • alter table <table_name> set tblproperties('numRows'='new_value', 'STATS_GENERATED_VIA_STATS_TASK'='true');
      • Set total number of rows for a specific partition. Applies to partitioned tables only. You must specify all the partition key columns in the PARTITION clause.
          • alter table <table_name> partition (keycol1=val1,keycol2=val2...) set tblproperties('numRows'='new_value', 'STATS_GENERATED_VIA_STATS_TASK'='true');
      • Column stats:
          • ALTER TABLE <table_name> SET COLUMN STATS <col_name> ('numDVs'='100')
          • Compute numDVs with "SELECT NDV(col)"
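When stats are set as part of a data load, the statements above are typically generated by the loading job. The helpers below are a hypothetical sketch of that step: the function names are invented, they only build the statement text shown on the slide, and partition values are expected to be passed as ready-made SQL literals (for example "'2019-05-01'" with its quotes for a string key).

```python
def set_row_count_sql(table, num_rows, partition=None):
    """Build the ALTER TABLE statement that sets numRows manually.

    partition is an optional dict of partition key -> SQL literal and must
    list every partition key column of the table.
    """
    part = ""
    if partition:
        spec = ", ".join(f"{key}={value}" for key, value in partition.items())
        part = f" PARTITION ({spec})"
    return (f"ALTER TABLE {table}{part} SET TBLPROPERTIES "
            f"('numRows'='{num_rows}', 'STATS_GENERATED_VIA_STATS_TASK'='true')")


def set_column_ndv_sql(table, column, num_distinct):
    """Build the statement that sets a column's number of distinct values."""
    return (f"ALTER TABLE {table} SET COLUMN STATS {column} "
            f"('numDVs'='{num_distinct}')")


if __name__ == "__main__":
    print(set_row_count_sql("sales", 1000))
    print(set_row_count_sql("sales", 50, {"year": 2019, "month": 5}))
    print(set_column_ndv_sql("sales", "customer_id", 100))
```

The table and column names here are placeholders; in a real job they should come from trusted configuration, since the helpers do no escaping.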
  • 31. Other Scalability Considerations
      • Use star schemas, integer join keys
      • Check for hot spotting: increase the replication factor for master data / frequently queried data
      • Avoid casts, implicit or explicit (easily over 10% improvement for larger volumes)
      • Increase RUNTIME_FILTER_WAIT_TIME_MS for complicated queries, but note that coordinators need to do more work
      • Use HDFS file handle cache
      • Give the OS enough free memory to cache data blocks
      • Set default compression codec; this improves disk read performance
      • Use high CPU nodes, fast processors
      • Impala clusters: DistCp the data from the remote cluster
  • 32. BI Tools
      • Always, always close queries
          • idle_query_timeout = 60
          • idle_session_timeout = 1800
      • Use handcrafted SQL
      • Use different pools for different queries and encourage use of set mem_limit;
      • Use JDBC over Kerberos authentication mechanism
  • 33. Understanding Query Profiles
      • Impala query profiles can be retrieved from Cloudera Manager, the Impala coordinator web UI, or the command line by executing `profile`
      • Includes nanosecond timers for all operations on all nodes
      • Quite detailed and exhaustive, but the basics are easy
      • We can easily answer:
          • What’s the bottleneck for this query?
          • Why is this run fast but that run slow?
          • How can I tune to improve this query’s performance?
  • 34. Understanding Query Profiles
      • Always check the Impala version and the default query options set
      • Check warnings
      • Query state: running, cancelled
      • Check query type: Query, DDL, etc.
  • 35. Understanding Query Profiles
      • Check per-node peak memory usage
          • Tells you what the memory limit should be for these queries
          • Shows skews in memory usage among nodes
  • 36. Understanding Query Profiles
      • For completed queries, read the summary in detail
      • Check what’s taking the max time and max memory, and check for skews in data
      • Check row estimates; depending on whether stats are available, these can be skewed
      • Check join order: it is determined entirely by total size (#rows * column width)
          • Try to ensure that, after partition pruning, the RHS is smaller than the LHS
      • Broadcast joins are the default; partitioned joins are used for large tables of roughly equal size
  • 37. Understanding Query Profiles
      • Read the query timeline in detail
      • Check which step is taking the most time and why
      • Usual culprits
          • Metadata load
          • Completed admission
          • ClientFetchWaitTimer
          • First dynamic filter received
          • Last row fetched
  • 38. Understanding Query Profiles
      • Check each plan fragment
          • Tells us what it did and how many hosts it ran on
          • How much data it processed
          • Partition pruning stats for HDFS scans
          • Parquet push down predicates
  • 39. Key Takeaways
      • Always use dedicated coordinators/executors
      • IAC should be enabled and memory limit set
      • Metadata management is significantly improved
          • Zero touch metadata coming soon
      • Follow best practices for Impala queries and performance tuning; refer to the Impala cookbook
  • 40. Rate today’s session