Query Compilation in Impala

Alexander Behm | Software Engineer
May 2014 @ Impala User Group

Flow of a SQL Query
[Diagram: Client → SQL Text → Impala Frontend (Java), which compiles the query → Executable Plan → Impala Backend (C++), which executes the query → Query Results → Client. The compile step in the frontend is the focus of this talk.]

Query Compilation
[Diagram: Client → SQL Text → Query Compiler → Executable Plan. Inside the query compiler: SQL Parsing → (Parse Tree) → Semantic Analysis → (Parse Tree + Analyzer) → Query Planning.]

Query Parsing
SELECT c1, SUM(c2)
FROM t1 JOIN t2 USING(id)
WHERE c3 > 10 GROUP BY c1
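[Diagram: parse tree for the query above. Root: SelectStmt, with children SelectList (ColRef, AggExpr over a ColRef), TableRefs (two TableRefs, the second with a UsingClause containing a ColRef), WhereClause (BinaryPredicate over a ColRef and an IntLiteral), and GroupByClause (ColRef).]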
• Applies SQL grammar, reports syntax errors
• Produces parse tree capturing syntactic structure of query
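As a rough illustration of what such a parse tree might look like in memory, the sketch below builds the tree for the example query by hand. The class names mirror the node labels on the slide, but the fields and the use of Python dataclasses are assumptions made for illustration; they are not Impala's actual Java frontend classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColRef:
    name: str


@dataclass
class IntLiteral:
    value: int


@dataclass
class AggExpr:
    func: str
    arg: ColRef


@dataclass
class BinaryPredicate:
    op: str
    left: object
    right: object


@dataclass
class TableRef:
    table: str
    # Columns named in a USING clause (empty for the first table).
    using: List[ColRef] = field(default_factory=list)


@dataclass
class SelectStmt:
    select_list: list
    table_refs: List[TableRef]
    where: Optional[BinaryPredicate]
    group_by: List[ColRef]


# Hand-built tree for:
#   SELECT c1, SUM(c2) FROM t1 JOIN t2 USING(id) WHERE c3 > 10 GROUP BY c1
stmt = SelectStmt(
    select_list=[ColRef("c1"), AggExpr("SUM", ColRef("c2"))],
    table_refs=[TableRef("t1"), TableRef("t2", using=[ColRef("id")])],
    where=BinaryPredicate(">", ColRef("c3"), IntLiteral(10)),
    group_by=[ColRef("c1")],
)
```

Note that the tree records only structure: nothing here says whether c1 exists or which table it belongs to. That is left to semantic analysis.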

Semantic Analysis…
• Precondition: Query is syntactically valid. Analysis operates on parse tree.
• Consults table metadata
• Do t1 and t2 exist? Does c1 exist in t1 or t2 (or both → error)? Does id exist in t1 and t2?
• Does the user have privileges to SELECT from t1?
• Checks type compatibility of expressions, adds implicit casts
• c3 > 10 → c3 > cast(10 as bigint)
• SQL rules (semantic, not syntactic)
• Does c1 appear in the GROUP BY clause?
SELECT c1, SUM(c2)
FROM t1 JOIN t2 USING(id)
WHERE c3 > 10 GROUP BY c1
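Two of the checks above, column resolution and implicit casts, can be sketched in a few lines. This is a minimal sketch under stated assumptions: the CATALOG dictionary, its column types, the helper names, and the literal type INT are all hypothetical and chosen only to reproduce the example on the slide.

```python
class AnalysisError(Exception):
    """Raised when a query is syntactically valid but semantically wrong."""


# Hypothetical table metadata: table name -> {column name: type}.
CATALOG = {
    "t1": {"id": "INT", "c1": "STRING", "c3": "BIGINT"},
    "t2": {"id": "INT", "c2": "DOUBLE"},
}


def resolve_column(col, tables, catalog=CATALOG):
    """Return (table, type) for an unqualified column, or raise AnalysisError."""
    for table in tables:
        if table not in catalog:
            raise AnalysisError(f"unknown table: {table}")
    owners = [t for t in tables if col in catalog[t]]
    if not owners:
        raise AnalysisError(f"unknown column: {col}")
    if len(owners) > 1:
        raise AnalysisError(f"ambiguous column: {col}")
    return owners[0], catalog[owners[0]][col]


def add_implicit_cast(expr_sql, expr_type, target_type):
    """Wrap expr_sql in a cast when its type differs from the target type."""
    if expr_type == target_type:
        return expr_sql
    return f"cast({expr_sql} as {target_type.lower()})"


# c3 resolves to t1 and is a BIGINT, so the literal 10 gets a cast.
table, col_type = resolve_column("c3", ["t1", "t2"])
rewritten = f"c3 > {add_implicit_cast('10', 'INT', col_type)}"
```

With this catalog, `rewritten` is `c3 > cast(10 as bigint)`, while resolving the unqualified column `id` raises an ambiguity error because it exists in both tables.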

… Semantic Analysis
• Expression substitution for views
• Resolve column references against base tables
• Preparation for Planning
• Register state in analyzer for correct predicate assignment during planning
• Register predicates (WHERE, HAVING, ON, USING, etc.)
• Register outer-joined tables
• Compute value-transfer graph and equivalence classes for predicate inference
• (…)
• Postcondition: Query is valid. An executable plan can be produced.
Example of view substitution:
SELECT c1, SUM(c2)
FROM (SELECT dept AS c1, revenue AS c2,
month AS c3 FROM t1) AS v
WHERE c3 > 10
GROUP BY c1
is resolved against the base table as:
SELECT dept, SUM(revenue)
FROM t1
WHERE month > 10
GROUP BY dept
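The substitution step can be pictured as applying a map from view columns to base-table expressions. The sketch below does this on SQL text with a regular expression, which is a simplification chosen for brevity; an analyzer would more plausibly substitute on expression trees. The function name and the map are illustrative assumptions.

```python
import re


def substitute(expr_sql, smap):
    """Replace each whole-word view column in expr_sql with its base expression."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, smap)) + r")\b")
    return pattern.sub(lambda m: smap[m.group(1)], expr_sql)


# Substitution map taken from the inline view: alias -> base-table column.
smap = {"c1": "dept", "c2": "revenue", "c3": "month"}

select_list = [substitute(e, smap) for e in ["c1", "SUM(c2)"]]
where = substitute("c3 > 10", smap)
group_by = substitute("c1", smap)

resolved = (
    f"SELECT {', '.join(select_list)} FROM t1 "
    f"WHERE {where} GROUP BY {group_by}"
)
```

Here `resolved` is `SELECT dept, SUM(revenue) FROM t1 WHERE month > 10 GROUP BY dept`, matching the rewritten query on the slide.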

Query Planning: Goals
• Generate executable plan (“tree” of operators)
• Maximize scan locality using HDFS DataNode (DN) block metadata
• Minimize data movement
• Full distribution of operators
• Query operators
• Scan, HashJoin, HashAggregation, Union, TopN, Exchange

Query Planning: Overview
[Diagram: Semantic Analysis → (Parse Tree + Analyzer) → Query Planner → Executable Plan. Inside the query planner: Walk Parse Tree → (Single-node Plan) → Parallelize & Fragment.]

Query Planning: Single-Node Plan
• Four major functions:
1. Parse Tree → Plan Tree
2. Assigns predicates to lowest plan node
3. Optimizes join order
4. Prunes irrelevant columns

Parse Tree → Single-Node Plan Tree
[Diagram: single-node plan tree for the query below, built from the operators TopN, Agg, two HashJoins, and Scan: t1, Scan: t2, Scan: t3.]
SELECT t1.dept, SUM(t2.revenue)
FROM LargeHdfsTable t1
JOIN HugeHdfsTable t2 ON (t1.id1 = t2.id)
JOIN SmallHbaseTable t3 ON (t1.id2 = t3.id)
WHERE t3.category = 'Online' AND t1.id1 > 10
GROUP BY t1.dept
HAVING COUNT(t2.revenue) > 10
ORDER BY revenue LIMIT 10

Predicate Assignment & Inference
[Diagram: the single-node plan tree annotated with the predicates assigned to each node.]
• Agg: GROUP BY t1.dept, COUNT(t2.revenue) > 10
• HashJoins: t1.id1 = t2.id and t1.id2 = t3.id
• Scan: t1: id1 > 10
• Scan: t3: category = 'Online'
• Scan: t2: id > 10 (inferred predicate)
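The inferred predicate on t2 follows from the join condition t1.id1 = t2.id: any bound on t1.id1 also holds for t2.id. A minimal sketch of this idea, assuming inner joins only and using a union-find structure to build equivalence classes, is shown below. The function names and the tuple encoding of predicates are hypothetical, and the value-transfer rules needed for outer joins are deliberately left out.

```python
def equivalence_classes(equalities):
    """Group columns connected by equality predicates (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in equalities:
        parent[find(a)] = find(b)

    classes = {}
    for col in list(parent):
        classes.setdefault(find(col), set()).add(col)
    return list(classes.values())


def infer_predicates(equalities, predicates):
    """Copy each (column, op, constant) predicate to its equivalent columns."""
    inferred = []
    classes = equivalence_classes(equalities)
    for col, op, const in predicates:
        for cls in classes:
            if col in cls:
                inferred += [(other, op, const)
                             for other in sorted(cls) if other != col]
    return inferred


equalities = [("t1.id1", "t2.id"), ("t1.id2", "t3.id")]
predicates = [("t1.id1", ">", 10), ("t3.category", "=", "Online")]
inferred = infer_predicates(equalities, predicates)
```

For the example query this yields a single inferred predicate, `t2.id > 10`. The predicate on t3.category is not copied anywhere because category takes part in no equality.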

Join-Order Optimization
• Inner joins are commutative and associative
• Query results correct independent of execution order
• Query execution costs vary dramatically!
• Hash table sizes, network transfers, number of hash lookups
• Join-order optimization
• Impala only considers left-deep join trees
• (Right join input is a table, not another join)
• Find cheapest valid join order
• Relies heavily on table and column statistics
• Limitation: Choice of join order independent of join strategy

Invalid Join Orders
[Diagram: the six left-deep join trees for t1, t2, t3, one per join order. Each tree has two HashJoins over Scan: t1, Scan: t2, Scan: t3, under GROUP BY t1.dept.]
• Order: t1, t2, t3
• Order: t1, t3, t2
• Order: t2, t1, t3
• Order: t2, t3, t1
• Order: t3, t1, t2
• Order: t3, t2, t1
Callout: No explicit or implicit predicate between t2 and t3, so the orders that join t2 and t3 first (t2, t3, t1 and t3, t2, t1) are invalid.
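The validity rule can be checked mechanically: in a left-deep order, every table after the first must share a join predicate with at least one table already joined. The sketch below enumerates all six orders and filters them this way. The function name and the representation of join predicates as unordered table pairs are assumptions for illustration.

```python
from itertools import permutations


def is_valid_order(order, join_edges):
    """True if each table joins against something already in the left input."""
    joined = {order[0]}
    for table in order[1:]:
        if not any(frozenset((table, t)) in join_edges for t in joined):
            return False  # would require a cross product
        joined.add(table)
    return True


# Join predicates from the example: t1.id1 = t2.id and t1.id2 = t3.id.
join_edges = {frozenset(("t1", "t2")), frozenset(("t1", "t3"))}

all_orders = list(permutations(["t1", "t2", "t3"]))
valid = [o for o in all_orders if is_valid_order(o, join_edges)]
invalid = [o for o in all_orders if not is_valid_order(o, join_edges)]
```

Four of the six orders survive. The two rejected ones are exactly those that place t2 and t3 next to each other at the start.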

• Impala’s Implementation:
1. Heuristic
• Order tables descending by size
• Best plan typically has largest table on the left (if valid)
2. Plan enumeration & costing
• Generate all possible join orders starting from a given left-most table (starting with largest one)
• Ignore invalid join orders
• Estimate intermediate result sizes (key!)
• Choose plan that minimizes intermediate result sizes
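The two steps above can be sketched as a small search. Everything numeric here is made up: the row counts, the per-join selectivities, and the cardinality formula (left rows × right rows × selectivity) are textbook-style assumptions, not Impala's actual estimator. Falling back to the next-largest table when the largest one yields no valid plan is also an interpretation of "starting with largest one".

```python
from itertools import permutations
from math import prod

# Hypothetical statistics for the example tables.
ROWS = {"t1": 1_000_000, "t2": 50_000_000, "t3": 1_000}
JOIN_SELECTIVITY = {
    frozenset(("t1", "t2")): 1e-6,
    frozenset(("t1", "t3")): 1e-3,
}


def plan_cost(order):
    """Sum of estimated intermediate result sizes, or None if the order is invalid."""
    joined = [order[0]]
    rows = float(ROWS[order[0]])
    total = 0.0
    for table in order[1:]:
        sels = [JOIN_SELECTIVITY[frozenset((table, t))]
                for t in joined if frozenset((table, t)) in JOIN_SELECTIVITY]
        if not sels:
            return None  # no join predicate: invalid order
        rows = rows * ROWS[table] * prod(sels)
        total += rows
        joined.append(table)
    return total


def choose_join_order():
    """Try left-most tables from largest to smallest; return the cheapest valid order."""
    for leftmost in sorted(ROWS, key=ROWS.get, reverse=True):
        rest = [t for t in ROWS if t != leftmost]
        costed = []
        for tail in permutations(rest):
            order = (leftmost,) + tail
            cost = plan_cost(order)
            if cost is not None:
                costed.append((cost, order))
        if costed:
            return min(costed)[1]
    return None


best = choose_join_order()
```

With these numbers the largest table is t2, and the only valid order that starts with it is t2, t1, t3, so that order is chosen.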

Query Planning: Distributed Plans
• Distributed Aggregation
• Pre-aggregation where data is first materialized
• Merge-aggregation partitioned by grouping columns
• Distinct aggregation: additional level of pre- and merge aggregation
• Distributed Top-N
• Initial Top-N where data is first materialized
• Final Top-N at coordinator
• Distributed Union
• Pre-aggregation/top-n placed into plans of each union operand
• Union-operand plans executed in parallel, merged via exchange
• Above strategies are currently fixed in Impala
• Independent of column/table stats
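The aggregation and Top-N strategies above follow the same pattern: do as much work as possible where the data is first materialized, then combine the partial results. The sketch below simulates this in a single process for a SUM grouped by one key. The node data, function names, and number of partitions are illustrative assumptions, and exchanges are modeled as plain Python lists.

```python
import heapq
from collections import defaultdict


def pre_aggregate(rows):
    """Pre-aggregation: partial SUM per group on the node that scanned the rows."""
    partial = defaultdict(int)
    for key, value in rows:
        partial[key] += value
    return dict(partial)


def merge_aggregate(partials, num_partitions):
    """Merge-aggregation: partials are hash-partitioned on the grouping key."""
    partitions = [defaultdict(int) for _ in range(num_partitions)]
    for partial in partials:
        for key, value in partial.items():
            partitions[hash(key) % num_partitions][key] += value
    return [dict(p) for p in partitions]


def top_n(rows, n):
    """Top-N by aggregated value, largest first."""
    return heapq.nlargest(n, rows, key=lambda kv: kv[1])


# Rows as (dept, revenue), as scanned on two hypothetical nodes.
node_a = [("toys", 5), ("books", 2), ("toys", 1)]
node_b = [("books", 3), ("games", 4)]

partials = [pre_aggregate(node_a), pre_aggregate(node_b)]
merged = merge_aggregate(partials, num_partitions=2)

# Initial Top-N on each partition, final Top-N at the coordinator.
initial_tops = [top_n(p.items(), 2) for p in merged]
final = top_n([kv for top in initial_tops for kv in top], 2)
```

Each group ends up in exactly one partition, so the per-partition Top-N lists can be merged safely at the coordinator. For this data the final result is toys with 6 followed by books with 5.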

Query Planning: Distributed Joins
• Broadcast Join
• Join is co-located with left input
• Broadcast right input to all nodes executing join
• Build hash table on right input, streaming probe from left input
• → Preferred for small right side (relative to left side)
• Partitioned Join
• Both tables hash-partitioned on join columns
• Same build/probe procedure as above
• → Preferred for joins where both left and right side are large
• Cost-based decision based on table/column stats
• Minimize required network transfer

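Query Planning: Distributed Plans
[Diagram: the single-node plan (TopN, Agg, two HashJoins, scans of t1, t2, t3) shown next to its distributed counterpart. In the distributed plan, Scan: t1 and Scan: t2 run at the HDFS DataNodes (DN) and Scan: t3 runs at the HBase RegionServers (RS). Exchanges are labeled hash t1.id1, hash t2.id, Broadcast, hash t1.custid, and Merge. The aggregation is split into Pre-Agg and MergeAgg, and the Top-N into an initial TopN and a final TopN at the coordinator.]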

Explain Example: TPCDS Q42
SELECT d.d_year, i.i_category_id, i.i_category, SUM(ss_ext_sales_price) AS total_sales
FROM store_sales ss
JOIN date_dim d
ON (ss.ss_sold_date_sk = d.d_date_sk)
JOIN item i
ON (ss.ss_item_sk = i.i_item_sk)
WHERE i.i_manager_id = 1 AND d.d_moy = 12 AND d.d_year = 1998
GROUP BY d.d_year, i.i_category_id, i.i_category
ORDER BY total_sales DESC, d_year, i_category_id, i_category
LIMIT 100

+---------------------------------------------------------------------+
| Explain String |
+---------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=3.76GB VCores=3 |
| |
| 12:TOP-N [LIMIT=100] |
| 11:EXCHANGE [PARTITION=UNPARTITIONED] |
| 06:TOP-N [LIMIT=100] |
| 10:AGGREGATE [MERGE FINALIZE] |
| 09:EXCHANGE [PARTITION=HASH(d.d_year,i.i_category_id,i.i_category)] |
| 05:AGGREGATE |
| 04:HASH JOIN [INNER JOIN, BROADCAST] |
| |--08:EXCHANGE [BROADCAST] |
| | 02:SCAN HDFS [tpcds1000gb.item i] |
| | 01:SCAN HDFS [tpcds1000gb.date_dim d] |
| 00:SCAN HDFS [tpcds1000gb.store_sales ss] |
+---------------------------------------------------------------------+
set num_nodes=0;

Conclusion
• Cost-based choice of join order and strategy
• Critical for performance
• Relies on table and column stats
• Other plan optimizations currently independent of stats
• Likely to expand plan choices in the future
• Likely to increase reliance on stats
• Helpful Impala commands
• compute stats
• show table/column stats
• explain query/insert stmt
• set explain_level=[0-3]
• set num_nodes=1 → show single-node plan

Try It Out!
• Questions/comments?
• Download: cloudera.com/impala
• Email: impala-user@cloudera.org
• Join: groups.cloudera.org
