SlideShare a Scribd company logo
1 of 32
Download to read offline
+
Cost-based Query Optimization
Maryann Xue (Intel)
Julian Hyde (Hortonworks)
Hadoop Summit, San Jose
June 2016
•@maryannxue
•Apache Phoenix PMC member
•Intel
•@julianhyde
•Apache Calcite VP
•Hortonworks
What is Apache Phoenix?
• A relational database layer for Apache HBase
– Query engine
• Transforms SQL queries into native HBase API calls
• Pushes as much work as possible onto the cluster for parallel
execution
– Metadata repository
• Typed access to data stored in HBase tables
– Transaction support
– Table Statistics
– A JDBC driver
Advanced Features
• Secondary indexes
• Strong SQL standard compliance
• Windowed aggregates
• Connectivity (e.g. remote JDBC driver, ODBC driver)
Created architectural pain… We decided to do it right!
Example 1: Optimizing Secondary Indexes
How we match secondary
indexes in Phoenix 4.8:
What about both?
SELECT * FROM Emp ORDER BY name
SELECT * FROM Emp WHERE empId > 100
CREATE TABLE Emps(empId INT PRIMARY KEY, name VARCHAR(100));



CREATE INDEX I_Emps_Name ON Emps(name);
SELECT * FROM Emp

WHERE empId > 100 ORDER BY name
Q1
Q2
Q3
I_Emps_Name
Emps
We need to make a cost-based decision! Statistics can help.
?
Phoenix + Calcite
• Both are Apache projects
• Involves changes to both projects
• Work is being done on a branch of Phoenix, with changes to Calcite
as needed
• Goals:
– Remove code! (Use Calcite’s SQL parser, validator)
– Improve planning (Faster planning, faster queries)
– Improve SQL compliance
– Some “free” SQL features (e.g. WITH, scalar subquery, FILTER)
– Close to full compatibility with current Phoenix SQL and APIs
• Status: beta, expected GA: late 2016
Current Phoenix Architecture
Parser
Algebra
Phoenix Schema
Stage 1: ParseNode tree
Stage 2: Normalization,
secondary index rewrite
Stage 3: Expression tree
HBase Data
Runtime
Query Plan
Calcite Architecture
Parser
Algebra
Schema SPI Operators,

Rules,

Statistics,
Cost model
Data
Engine
Data
Engine
Data
Engine
Phoenix + Calcite Architecture
Parser
Algebra
Phoenix Schema Logical + Phoenix Operators,

Builtin + Phoenix Rules,

Phoenix Statistics,
Phoenix Cost model
Data
JDBC (optional)
HBase Data
Phoenix Runtime
Data
Other (optional)
Query Plan
Cost-based Query Optimizer

with Apache Calcite
• Base all query optimization decisions on cost
– Filter push down; range scan vs. skip scan
– Hash aggregate vs. stream aggregate vs. partial stream aggregate
– Sort optimized out; sort/limit push through; fwd/rev/unordered
scan
– Hash join vs. merge join; join ordering
– Use of data table vs. index table
– All above (any many others) COMBINED
• Query optimizations are modeled as pluggable rules
Calcite Algebra
SELECT products.name, COUNT(*)

FROM sales

JOIN products USING (productId)

WHERE sales.discount IS NOT NULL

GROUP BY products.name

ORDER BY COUNT(*) DESC
scan
[products]
scan
[sales]
join
filter
aggregate
sort
translate SQL to
relational
algebra
Example 2: FilterIntoJoinRule
SELECT products.name, COUNT(*)

FROM sales

JOIN products USING (productId)

WHERE sales.discount IS NOT NULL

GROUP BY products.name

ORDER BY COUNT(*) DESC
scan
[products]
scan
[sales]
join
filter
aggregate
sort
scan
[products]
scan
[sales]
filter’
join’
aggregate
sort
FilterIntoJoinRule
translate SQL to
relational
algebra
Example 3: Phoenix Joins
• Hash join vs. Sort merge join
– Hash join good for: either input is small
– Sort merge join good for: both inputs are big
– Hash join downside: potential OOM
– Sort merge join downside: extra sorting required sometimes
• Better to exploit the sortedness of join input
• Better to exploit the sortedness of join output
Example 3: Calcite Algebra
SELECT empid, e.name, d.name, location

FROM emps AS e

JOIN depts AS d USING (deptno)

ORDER BY d.deptno
scan
[emps]
scan
[depts]
join
sort
project
translate SQL to
relational
algebra
Example 3: Plan Candidates
scan
[emps]
scan
[depts]
hash-join
sort
project
scan
[emps]
scan
[depts]
sort
merge-join
projectCandidate 1:
hash-join
*also what standalone
Phoenix compiler
would generate.
Candidate 2:
merge-join
1. Very little difference in all other operators: project, scan, hash-join or merge-join
2. Candidate 1 would sort “emps join depts”, while candidate 2 would only sort “emps”
Win
SortRemoveRule
sorted on [deptno]
SortRemoveRule
sorted on [e.deptno]
Example 3: Improved Plan
scan ‘depts’
send ‘depts’ over to RS
& build hash-cache
scan ‘emps’ hash-join ‘depts’
sort joined table on ‘e.deptno’
scan ‘emps’
merge-join ‘emps’ and ‘depts’
sort by ‘deptno’
scan ‘depts’
Old vs. New
1. Exploited the sortedness of join input
2. Exploited the sortedness of join output
(and now, a brief look at Calcite)
Apache Calcite
• Apache top-level project since October, 2015
• Query planning framework
– Relational algebra, rewrite

rules
– Cost model & statistics
– Federation via adapters
– Extensible
• Packaging
– Library
– Optional SQL parser, JDBC server
– Community-authored rules, adapters
Embedded Adapters Streaming
Apache Drill
Apache Hive
Apache Kylin
Apache Phoenix*
Cascading
Lingual
Apache Cassandra*
Apache Spark
CSV
In-memory
JDBC
JSON
MongoDB
Splunk
Web tables
Apache Flink*
Apache Samza
Apache Storm
Apache Calcite Avatica
• Database connectivity
stack
• Self-contained sub-
project of Calcite
• Fast, open, stable
• Powers Phoenix Query
Server
Calcite – APIs and SPIs
Cost, statistics
RelOptCost
RelOptCostFactory
RelMetadataProvider
• RelMdColumnUniquensss
• RelMdDistinctRowCount
• RelMdSelectivity
SQL parser
SqlNode

SqlParser

SqlValidator
Transformation rules
RelOptRule
• MergeFilterRule
• PushAggregateThroughUnionRule
• 100+ more
Global transformations
• Unification (materialized view)
• Column trimming
• De-correlation
Relational algebra
RelNode (operator)
• TableScan
• Filter
• Project
• Union
• Aggregate
• …
RelDataType (type)
RexNode (expression)
RelTrait (physical property)
• RelConvention (calling-convention)
• RelCollation (sortedness)
• TBD (bucketedness/distribution)
JDBC driver (Avatica)
Metadata
Schema
Table
Function
• TableFunction
• TableMacro
Lattice
Calcite Planning Process
SQL
parse
tree
Planner
RelNode
Graph
Sql-to-Rel Converter
SqlNode
! RelNode
+ RexNode
Node for each node in Input
Plan
Each node is a Set of
alternate Sub Plans
Set further divided into
Subsets: based on traits like
sortedness
1. Plan Graph
Rule: specifies an Operator
sub-graph to match and logic
to generate equivalent ‘better’
sub-graph
New and original sub-graph
both remain in contention
2. Rules
RelNodes have Cost &
Cumulative Cost
3. Cost Model
Used to plug in Schema,
cost formulas
Filter selectivity
Join selectivity
NDV calculations
4. Metadata Providers
Rule Match Queue
Best RelNode Graph
Translate to
runtime
Logical Plan
Based on “Volcano” & “Cascades” papers [G. Graefe]
Add Rule matches to Queue
Apply Rule match transformations
to plan graph
Iterate for fixed iterations or until
cost doesn’t change
Match importance based on cost of
RelNode and height
Views and materialized views
• A view is a named
relational expression,
stored in the catalog,
that is expanded
while planning a
query.
• A materialized view is an equivalence,
stored in the catalog, between a table
and a relational expression.



The planner substitutes the table into
queries where it will help, even if the
queries do not reference the
materialized view.
Query using a view
Scan [Emps]
Join [$0, $5]
Project [$0, $1, $2, $3]
Filter [age >= 50]
Aggregate [deptno, min(salary)]
Scan [Managers]
Aggregate [manager]
Scan [Emps]
SELECT deptno, min(salary)

FROM Managers

WHERE age >= 50

GROUP BY deptno
CREATE VIEW Managers AS

SELECT *

FROM Emps 

WHERE EXISTS (

SELECT *

FROM Emps AS underling

WHERE underling.manager = emp.id)
view scan to
be expanded
After view expansion
Scan [Emps] Aggregate [manager]
Join [$0, $5]
Project [$0, $1, $2, $3]
Filter [age >= 50]
Aggregate [deptno, min(salary)]
Scan [Emps]
SELECT deptno, min(salary)

FROM Managers

WHERE age >= 50

GROUP BY deptno
CREATE VIEW Managers AS

SELECT *

FROM Emps 

WHERE EXISTS (

SELECT *

FROM Emps AS underling

WHERE underling.manager = emp.id)
can be pushed
down
Materialized view
Scan [Emps]
Aggregate [deptno, gender,

COUNT(*), SUM(sal)]
Scan [EmpSummary]
=
Scan [Emps]
Filter [deptno = 10 AND gender = ‘M’]
Aggregate [COUNT(*)]
CREATE MATERIALIZED VIEW EmpSummary AS

SELECT deptno,

gender,

COUNT(*) AS c,

SUM(sal) AS s

FROM Emps

GROUP BY deptno, gender
SELECT COUNT(*)

FROM Emps

WHERE deptno = 10

AND gender = ‘M’
Materialized view, step 2: Rewrite query to
match
Scan [Emps]
Aggregate [deptno, gender,

COUNT(*), SUM(sal)]
Scan [EmpSummary]
=
Scan [Emps]
Filter [deptno = 10 AND gender = ‘M’]
Aggregate [deptno, gender,

COUNT(*) AS c, SUM(sal) AS s]
Project [c]
CREATE MATERIALIZED VIEW EmpSummary AS

SELECT deptno,

gender,

COUNT(*) AS c,

SUM(sal) AS s

FROM Emps

GROUP BY deptno, gender
SELECT COUNT(*)

FROM Emps

WHERE deptno = 10

AND gender = ‘M’
Materialized view, step 3: Substitute table
Scan [Emps]
Aggregate [deptno, gender,

COUNT(*), SUM(sal)]
Scan [EmpSummary]
=
Filter [deptno = 10 AND gender = ‘M’]
Project [c]
Scan [EmpSummary]
CREATE MATERIALIZED VIEW EmpSummary AS

SELECT deptno,

gender,

COUNT(*) AS c,

SUM(sal) AS s

FROM Emps

GROUP BY deptno, gender
SELECT COUNT(*)

FROM Emps

WHERE deptno = 10

AND gender = ‘M’
(and now, back to Phoenix)
Example 1, Revisited: Secondary Index
Optimizer internally creates a mapping (query, table) equivalent to:
Scan [Emps]
Filter [deptno BETWEEN 100 and 150]
Project [deptno, name]
Sort [deptno]
CREATE MATERIALIZED VIEW I_Emp_Deptno AS

SELECT deptno, empno, name

FROM Emps

ORDER BY deptno
Scan [Emps]
Project [deptno, empno, name]
Sort [deptno, empno, name]
Filter [deptno BETWEEN 100 and 150]
Project [deptno, name]
Scan
[I_Emp_Deptno]
1,000
1,000
200
1600 1,000
1,000
200
very simple
cost based
on row-count
Beyond Phoenix 4.8

with Apache Calcite
• Get the missing SQL support
– WITH, UNNEST, Scalar subquery, etc.
• Materialized views
– To allow other forms of indices (maybe defined as external), e.g., a
filter view, a join view, or an aggregate view.
• Interop with other Calcite adapters
– Already used by Drill, Hive, Kylin, Samza, etc.
– Supports any JDBC source
– Initial version of Drill-Phoenix integration already working
Drillix: Interoperability with Apache Drill
SELECT deptno, sum(salary) FROM emps GROUP BY deptno
Stage 1:
Local Partial aggregation
Stage 3:
Final aggregation
Stage 2:
Shuffle partial results
Drill Aggregate [deptno, sum(salary)]
Drill Shuffle [deptno]
Phoenix Aggregate [deptno, sum(salary)]
Phoenix TableScan [emps]
Phoenix Tables on HBase
Thank you! Questions?
@maryannxue
@julianhyde
http://phoenix.apache.org
http://calcite.apache.org

More Related Content

What's hot

Using LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache ArrowUsing LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache Arrow
DataWorks Summit
 

What's hot (20)

Apache Calcite overview
Apache Calcite overviewApache Calcite overview
Apache Calcite overview
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache Calcite
 
Smarter Together - Bringing Relational Algebra, Powered by Apache Calcite, in...
Smarter Together - Bringing Relational Algebra, Powered by Apache Calcite, in...Smarter Together - Bringing Relational Algebra, Powered by Apache Calcite, in...
Smarter Together - Bringing Relational Algebra, Powered by Apache Calcite, in...
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemThe Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
 
Streaming SQL with Apache Calcite
Streaming SQL with Apache CalciteStreaming SQL with Apache Calcite
Streaming SQL with Apache Calcite
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
 
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
 
Incremental View Maintenance with Coral, DBT, and Iceberg
Incremental View Maintenance with Coral, DBT, and IcebergIncremental View Maintenance with Coral, DBT, and Iceberg
Incremental View Maintenance with Coral, DBT, and Iceberg
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Using LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache ArrowUsing LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache Arrow
 
iceberg introduction.pptx
iceberg introduction.pptxiceberg introduction.pptx
iceberg introduction.pptx
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
SQL on everything, in memory
SQL on everything, in memorySQL on everything, in memory
SQL on everything, in memory
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 

Similar to Cost-based Query Optimization in Apache Phoenix using Apache Calcite

phoenix-on-calcite-nyc-meetup
phoenix-on-calcite-nyc-meetupphoenix-on-calcite-nyc-meetup
phoenix-on-calcite-nyc-meetup
Maryann Xue
 
Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
avniS
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
Paul Chao
 

Similar to Cost-based Query Optimization in Apache Phoenix using Apache Calcite (20)

phoenix-on-calcite-nyc-meetup
phoenix-on-calcite-nyc-meetupphoenix-on-calcite-nyc-meetup
phoenix-on-calcite-nyc-meetup
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineering
 
Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
 
Chapter15
Chapter15Chapter15
Chapter15
 
MongoDB Aggregation MongoSF May 2011
MongoDB Aggregation MongoSF May 2011MongoDB Aggregation MongoSF May 2011
MongoDB Aggregation MongoSF May 2011
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for Programmers
 
Data Science
Data ScienceData Science
Data Science
 
Pig Experience
Pig ExperiencePig Experience
Pig Experience
 
The Evolution of a Relational Database Layer over HBase
The Evolution of a Relational Database Layer over HBaseThe Evolution of a Relational Database Layer over HBase
The Evolution of a Relational Database Layer over HBase
 
Meetup Junio Data Analysis with python 2018
Meetup Junio Data Analysis with python 2018Meetup Junio Data Analysis with python 2018
Meetup Junio Data Analysis with python 2018
 
Hands on Mahout!
Hands on Mahout!Hands on Mahout!
Hands on Mahout!
 
Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
 
Polyalgebra
PolyalgebraPolyalgebra
Polyalgebra
 
SSRS - PPS - MOSS Profile
SSRS - PPS - MOSS ProfileSSRS - PPS - MOSS Profile
SSRS - PPS - MOSS Profile
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
 
Lecture 2 part 3
Lecture 2 part 3Lecture 2 part 3
Lecture 2 part 3
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTO
 

More from Julian Hyde

More from Julian Hyde (20)

Building a semantic/metrics layer using Calcite
Building a semantic/metrics layer using CalciteBuilding a semantic/metrics layer using Calcite
Building a semantic/metrics layer using Calcite
 
Cubing and Metrics in SQL, oh my!
Cubing and Metrics in SQL, oh my!Cubing and Metrics in SQL, oh my!
Cubing and Metrics in SQL, oh my!
 
Adding measures to Calcite SQL
Adding measures to Calcite SQLAdding measures to Calcite SQL
Adding measures to Calcite SQL
 
Morel, a data-parallel programming language
Morel, a data-parallel programming languageMorel, a data-parallel programming language
Morel, a data-parallel programming language
 
Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...
 
Morel, a Functional Query Language
Morel, a Functional Query LanguageMorel, a Functional Query Language
Morel, a Functional Query Language
 
What to expect when you're Incubating
What to expect when you're IncubatingWhat to expect when you're Incubating
What to expect when you're Incubating
 
Open Source SQL - beyond parsers: ZetaSQL and Apache Calcite
Open Source SQL - beyond parsers: ZetaSQL and Apache CalciteOpen Source SQL - beyond parsers: ZetaSQL and Apache Calcite
Open Source SQL - beyond parsers: ZetaSQL and Apache Calcite
 
Efficient spatial queries on vanilla databases
Efficient spatial queries on vanilla databasesEfficient spatial queries on vanilla databases
Efficient spatial queries on vanilla databases
 
Don't optimize my queries, organize my data!
Don't optimize my queries, organize my data!Don't optimize my queries, organize my data!
Don't optimize my queries, organize my data!
 
Spatial query on vanilla databases
Spatial query on vanilla databasesSpatial query on vanilla databases
Spatial query on vanilla databases
 
Lazy beats Smart and Fast
Lazy beats Smart and FastLazy beats Smart and Fast
Lazy beats Smart and Fast
 
Data profiling with Apache Calcite
Data profiling with Apache CalciteData profiling with Apache Calcite
Data profiling with Apache Calcite
 
A smarter Pig: Building a SQL interface to Apache Pig using Apache Calcite
A smarter Pig: Building a SQL interface to Apache Pig using Apache CalciteA smarter Pig: Building a SQL interface to Apache Pig using Apache Calcite
A smarter Pig: Building a SQL interface to Apache Pig using Apache Calcite
 
Data Profiling in Apache Calcite
Data Profiling in Apache CalciteData Profiling in Apache Calcite
Data Profiling in Apache Calcite
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 

Recently uploaded

Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
FIDO Alliance
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
Muhammad Subhan
 

Recently uploaded (20)

Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
 
UiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overviewUiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overview
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 

Cost-based Query Optimization in Apache Phoenix using Apache Calcite

  • 1. + Cost-based Query Optimization Maryann Xue (Intel) Julian Hyde (Hortonworks) Hadoop Summit, San Jose June 2016
  • 2. •@maryannxue •Apache Phoenix PMC member •Intel •@julianhyde •Apache Calcite VP •Hortonworks
  • 3. What is Apache Phoenix? • A relational database layer for Apache HBase – Query engine • Transforms SQL queries into native HBase API calls • Pushes as much work as possible onto the cluster for parallel execution – Metadata repository • Typed access to data stored in HBase tables – Transaction support – Table Statistics – A JDBC driver
  • 4. Advanced Features • Secondary indexes • Strong SQL standard compliance • Windowed aggregates • Connectivity (e.g. remote JDBC driver, ODBC driver) Created architectural pain… We decided to do it right!
  • 5. Example 1: Optimizing Secondary Indexes How we match secondary indexes in Phoenix 4.8: What about both? SELECT * FROM Emp ORDER BY name SELECT * FROM Emp WHERE empId > 100 CREATE TABLE Emps(empId INT PRIMARY KEY, name VARCHAR(100));
 
 CREATE INDEX I_Emps_Name ON Emps(name); SELECT * FROM Emp
 WHERE empId > 100 ORDER BY name Q1 Q2 Q3 I_Emps_Name Emps We need to make a cost-based decision! Statistics can help. ?
  • 6. Phoenix + Calcite • Both are Apache projects • Involves changes to both projects • Work is being done on a branch of Phoenix, with changes to Calcite as needed • Goals: – Remove code! (Use Calcite’s SQL parser, validator) – Improve planning (Faster planning, faster queries) – Improve SQL compliance – Some “free” SQL features (e.g. WITH, scalar subquery, FILTER) – Close to full compatibility with current Phoenix SQL and APIs • Status: beta, expected GA: late 2016
  • 7. Current Phoenix Architecture Parser Algebra Phoenix Schema Stage 1: ParseNode tree Stage 2: Normalization, secondary index rewrite Stage 3: Expression tree HBase Data Runtime Query Plan
  • 8. Calcite Architecture Parser Algebra Schema SPI Operators,
 Rules,
 Statistics, Cost model Data Engine Data Engine Data Engine
  • 9. Phoenix + Calcite Architecture Parser Algebra Phoenix Schema Logical + Phoenix Operators,
 Builtin + Phoenix Rules,
 Phoenix Statistics, Phoenix Cost model Data JDBC (optional) HBase Data Phoenix Runtime Data Other (optional) Query Plan
  • 10. Cost-based Query Optimizer
 with Apache Calcite • Base all query optimization decisions on cost – Filter push down; range scan vs. skip scan – Hash aggregate vs. stream aggregate vs. partial stream aggregate – Sort optimized out; sort/limit push through; fwd/rev/unordered scan – Hash join vs. merge join; join ordering – Use of data table vs. index table – All above (any many others) COMBINED • Query optimizations are modeled as pluggable rules
  • 11. Calcite Algebra SELECT products.name, COUNT(*)
 FROM sales
 JOIN products USING (productId)
 WHERE sales.discount IS NOT NULL
 GROUP BY products.name
 ORDER BY COUNT(*) DESC scan [products] scan [sales] join filter aggregate sort translate SQL to relational algebra
  • 12. Example 2: FilterIntoJoinRule SELECT products.name, COUNT(*)
 FROM sales
 JOIN products USING (productId)
 WHERE sales.discount IS NOT NULL
 GROUP BY products.name
 ORDER BY COUNT(*) DESC scan [products] scan [sales] join filter aggregate sort scan [products] scan [sales] filter’ join’ aggregate sort FilterIntoJoinRule translate SQL to relational algebra
  • 13. Example 3: Phoenix Joins • Hash join vs. Sort merge join – Hash join good for: either input is small – Sort merge join good for: both inputs are big – Hash join downside: potential OOM – Sort merge join downside: extra sorting required sometimes • Better to exploit the sortedness of join input • Better to exploit the sortedness of join output
  • 14. Example 3: Calcite Algebra SELECT empid, e.name, d.name, location
 FROM emps AS e
 JOIN depts AS d USING (deptno)
 ORDER BY d.deptno scan [emps] scan [depts] join sort project translate SQL to relational algebra
  • 15. Example 3: Plan Candidates scan [emps] scan [depts] hash-join sort project scan [emps] scan [depts] sort merge-join projectCandidate 1: hash-join *also what standalone Phoenix compiler would generate. Candidate 2: merge-join 1. Very little difference in all other operators: project, scan, hash-join or merge-join 2. Candidate 1 would sort “emps join depts”, while candidate 2 would only sort “emps” Win SortRemoveRule sorted on [deptno] SortRemoveRule sorted on [e.deptno]
  • 16. Example 3: Improved Plan scan ‘depts’ send ‘depts’ over to RS & build hash-cache scan ‘emps’ hash-join ‘depts’ sort joined table on ‘e.deptno’ scan ‘emps’ merge-join ‘emps’ and ‘depts’ sort by ‘deptno’ scan ‘depts’ Old vs. New 1. Exploited the sortedness of join input 2. Exploited the sortedness of join output
  • 17. (and now, a brief look at Calcite)
  • 18. Apache Calcite • Apache top-level project since October, 2015 • Query planning framework – Relational algebra, rewrite
 rules – Cost model & statistics – Federation via adapters – Extensible • Packaging – Library – Optional SQL parser, JDBC server – Community-authored rules, adapters Embedded Adapters Streaming Apache Drill Apache Hive Apache Kylin Apache Phoenix* Cascading Lingual Apache Cassandra* Apache Spark CSV In-memory JDBC JSON MongoDB Splunk Web tables Apache Flink* Apache Samza Apache Storm
  • 19. Apache Calcite Avatica • Database connectivity stack • Self-contained sub- project of Calcite • Fast, open, stable • Powers Phoenix Query Server
  • 20. Calcite – APIs and SPIs Cost, statistics RelOptCost RelOptCostFactory RelMetadataProvider • RelMdColumnUniquensss • RelMdDistinctRowCount • RelMdSelectivity SQL parser SqlNode
 SqlParser
 SqlValidator Transformation rules RelOptRule • MergeFilterRule • PushAggregateThroughUnionRule • 100+ more Global transformations • Unification (materialized view) • Column trimming • De-correlation Relational algebra RelNode (operator) • TableScan • Filter • Project • Union • Aggregate • … RelDataType (type) RexNode (expression) RelTrait (physical property) • RelConvention (calling-convention) • RelCollation (sortedness) • TBD (bucketedness/distribution) JDBC driver (Avatica) Metadata Schema Table Function • TableFunction • TableMacro Lattice
  • 21. Calcite Planning Process SQL parse tree Planner RelNode Graph Sql-to-Rel Converter SqlNode ! RelNode + RexNode Node for each node in Input Plan Each node is a Set of alternate Sub Plans Set further divided into Subsets: based on traits like sortedness 1. Plan Graph Rule: specifies an Operator sub-graph to match and logic to generate equivalent ‘better’ sub-graph New and original sub-graph both remain in contention 2. Rules RelNodes have Cost & Cumulative Cost 3. Cost Model Used to plug in Schema, cost formulas Filter selectivity Join selectivity NDV calculations 4. Metadata Providers Rule Match Queue Best RelNode Graph Translate to runtime Logical Plan Based on “Volcano” & “Cascades” papers [G. Graefe] Add Rule matches to Queue Apply Rule match transformations to plan graph Iterate for fixed iterations or until cost doesn’t change Match importance based on cost of RelNode and height
  • 22. Views and materialized views • A view is a named relational expression, stored in the catalog, that is expanded while planning a query. • A materialized view is an equivalence, stored in the catalog, between a table and a relational expression.
 
 The planner substitutes the table into queries where it will help, even if the queries do not reference the materialized view.
  • 23. Query using a view Scan [Emps] Join [$0, $5] Project [$0, $1, $2, $3] Filter [age >= 50] Aggregate [deptno, min(salary)] Scan [Managers] Aggregate [manager] Scan [Emps] SELECT deptno, min(salary)
 FROM Managers
 WHERE age >= 50
 GROUP BY deptno CREATE VIEW Managers AS
 SELECT *
 FROM Emps 
 WHERE EXISTS (
 SELECT *
 FROM Emps AS underling
 WHERE underling.manager = emp.id) view scan to be expanded
  • 24. After view expansion Scan [Emps] Aggregate [manager] Join [$0, $5] Project [$0, $1, $2, $3] Filter [age >= 50] Aggregate [deptno, min(salary)] Scan [Emps] SELECT deptno, min(salary)
 FROM Managers
 WHERE age >= 50
 GROUP BY deptno CREATE VIEW Managers AS
 SELECT *
 FROM Emps 
 WHERE EXISTS (
 SELECT *
 FROM Emps AS underling
 WHERE underling.manager = emp.id) can be pushed down
  • 25. Materialized view Scan [Emps] Aggregate [deptno, gender,
 COUNT(*), SUM(sal)] Scan [EmpSummary] = Scan [Emps] Filter [deptno = 10 AND gender = ‘M’] Aggregate [COUNT(*)] CREATE MATERIALIZED VIEW EmpSummary AS
 SELECT deptno,
 gender,
 COUNT(*) AS c,
 SUM(sal) AS s
 FROM Emps
 GROUP BY deptno, gender SELECT COUNT(*)
 FROM Emps
 WHERE deptno = 10
 AND gender = ‘M’
  • 26. Materialized view, step 2: Rewrite query to match Scan [Emps] Aggregate [deptno, gender,
 COUNT(*), SUM(sal)] Scan [EmpSummary] = Scan [Emps] Filter [deptno = 10 AND gender = ‘M’] Aggregate [deptno, gender,
 COUNT(*) AS c, SUM(sal) AS s] Project [c] CREATE MATERIALIZED VIEW EmpSummary AS
 SELECT deptno,
 gender,
 COUNT(*) AS c,
 SUM(sal) AS s
 FROM Emps
 GROUP BY deptno, gender SELECT COUNT(*)
 FROM Emps
 WHERE deptno = 10
 AND gender = ‘M’
  • 27. Materialized view, step 3: Substitute table Scan [Emps] Aggregate [deptno, gender,
 COUNT(*), SUM(sal)] Scan [EmpSummary] = Filter [deptno = 10 AND gender = ‘M’] Project [c] Scan [EmpSummary] CREATE MATERIALIZED VIEW EmpSummary AS
 SELECT deptno,
 gender,
 COUNT(*) AS c,
 SUM(sal) AS s
 FROM Emps
 GROUP BY deptno, gender SELECT COUNT(*)
 FROM Emps
 WHERE deptno = 10
 AND gender = ‘M’
  • 28. (and now, back to Phoenix)
  • 29. Example 1, Revisited: Secondary Index Optimizer internally creates a mapping (query, table) equivalent to: Scan [Emps] Filter [deptno BETWEEN 100 and 150] Project [deptno, name] Sort [deptno] CREATE MATERIALIZED VIEW I_Emp_Deptno AS
 SELECT deptno, empno, name
 FROM Emps
 ORDER BY deptno Scan [Emps] Project [deptno, empno, name] Sort [deptno, empno, name] Filter [deptno BETWEEN 100 and 150] Project [deptno, name] Scan [I_Emp_Deptno] 1,000 1,000 200 1600 1,000 1,000 200 very simple cost based on row-count
  • 30. Beyond Phoenix 4.8
 with Apache Calcite • Get the missing SQL support – WITH, UNNEST, Scalar subquery, etc. • Materialized views – To allow other forms of indices (maybe defined as external), e.g., a filter view, a join view, or an aggregate view. • Interop with other Calcite adapters – Already used by Drill, Hive, Kylin, Samza, etc. – Supports any JDBC source – Initial version of Drill-Phoenix integration already working
  • 31. Drillix: Interoperability with Apache Drill SELECT deptno, sum(salary) FROM emps GROUP BY deptno Stage 1: Local Partial aggregation Stage 3: Final aggregation Stage 2: Shuffle partial results Drill Aggregate [deptno, sum(salary)] Drill Shuffle [deptno] Phoenix Aggregate [deptno, sum(salary)] Phoenix TableScan [emps] Phoenix Tables on HBase