SlideShare a Scribd company logo
Dynamic filtering for
Presto join optimisation
Roman Zeyde
Presto Conference Israel 2019
Agenda
Existing join optimization techniques
Dynamic filtering description
Implementation details
Performance analysis
Existing join optimization techniques
Happen during planning phase:
• Join reordering
• Join distribution type (distributed vs. broadcast)
Depend on cost-based optimizer (need column
statistics)
• Should be enabled via session parameters
• Can be estimated using ANALYZE statement
Example: join reordering
SELECT * FROM items JOIN sales ON sales.item_id = items.id;
Prefer keeping the "smaller" table on the right-hand side of the join:
Join
(item_id=id)
Join
(item_id=id)
Scan
sales
Scan
items
Scan
items
Scan
sales
Example: broadcast join
If the right-hand side table is "small", it can be replicated to
all join workers - saving the CPU and network cost of left-
hand side repartitioning:
Join worker
Join worker
Join workerLeft-hand side
Right-hand side
Join worker
Join worker
Example: distributed join
Otherwise, both tables are repartitioned using the join key,
allowing joins with larger right-hand side tables:
Join workerLeft-hand side
Right-hand side
Dynamic filtering - introduction
Consider the following query:
SELECT * FROM sales JOIN items
ON sales.item_id = items.id
WHERE items.price > 1000;
Assumptions:
● sales table is large
● items scan results in a few rows
(due to predicate pushdown)
Most of the scanned sales rows will be
discarded during the join (i.e. high selectivity).
How can we optimize this use-case?
Join
(item_id=id)
Scan
sales
Scan
items
[price>1000]
Dynamic filtering - description
1. Collect relevant id values during items scan
2. Construct dynamic filter F using the collected ids
3. Apply dynamic predicate pushdown using F to
sales scan
Benefits:
• Connector may optimize the scan given F
• Most sales rows are not touched by Presto
• CPU & network savings for large tables
Requirements:
• F cannot be too large (memory-wise)
• F need to "back propagate" into sales scan in runtime
Join
(item_id=id)
Scan
items
[price>1000]
Scan
sales
[item_id∈F]
(3)
(1)
(2)
Construct
dynamic
filter F
Implementation details - Qubole et al.
Supports both distributed and broadcast joins, but
requires significant changes in Presto:
• Add plan nodes and optimizer rules for dynamic
filter collection and application
• New coordinator REST endpoint for dynamic filter
collection from worker nodes.
• Allow connectors to prune partitions during split
generation (when dynamic filter is ready)
More details can be found here:
qubole.com/blog/sql-join-optimizations-qubole-presto
(https://docs.google.com/document/d/1TOlxS8ZAXSIHR5ftHbPsgUkuUA-ky-odwmPdZARJrUQ)
Implementation - Varada
When broadcast join is used, sales'
ScanFilterAndProject and items' HashBuilder
operators run at the same process:
• Add a "pass-through" operator to collect
build-side ids.
• When ready, pushdown the resulting
predicate F into sales page source.
No changes needed at the planner, optimizer and
coordinator!
Implemented as a patch on top of
github.com/prestosql/presto (currently work-in-
progress).
ScanFilterAndProject
sales
[item_id∈F]
ScanFilterAndProject
items
[price>1000]
Exchange
Exchange
Collect
F:=F∪{id}
LookupJoin
[item_id=id]
HashBuilder
[id]
TaskOutput
Performance analysis - benchmark
Consider the following query (based on TPC-DS sf10000 dataset):
SELECT ss_item_sk FROM store_sales JOIN customer
ON ss_customer_sk = c_customer_sk
WHERE c_customer_id = 'AAAAAAAAMCOOKLCA';
• store_sales contains 27.7B rows
• customer contains 65M rows
• Query result contains 334 rows
Performance analysis - results
Regular join Dynamic filtering improvement
Execution time 25 sec 0.9 sec x27 faster
CPU time 57.4 min 7.8 sec x440 lower
Peak total memory 261 MB 2.2 MB x118 lower
Data read (from connector) 258 GB 3.3 kB x78M lower
Tested on Varada cluster (with CBO enabled):
Up next in Presto improvements
• Distributed Joins - extend dynamic filtering
• Aggregation Pushdown
• Coordinator HA
Thank you!

More Related Content

What's hot

Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
Flink Forward
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
ScyllaDB
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
Flink Forward
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Adam Doyle
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
Databricks
 
Understanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
Databricks
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
Cloudera, Inc.
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Databricks
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Databricks
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
Databricks
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
Databricks
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
Databricks
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
Alluxio, Inc.
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
Alexander Korotkov
 
Delta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdfDelta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdf
karansharma62792
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Databricks
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
Gleb Kanterov
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
Altinity Ltd
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Databricks
 

What's hot (20)

Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
 
Understanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
Delta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdfDelta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdf
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 

Similar to Dynamic filtering for presto join optimisation

Advanced Analytics using Apache Hive
Advanced Analytics using Apache HiveAdvanced Analytics using Apache Hive
Advanced Analytics using Apache Hive
Murtaza Doctor
 
Rough cut connect2-xyz
Rough cut connect2-xyzRough cut connect2-xyz
Rough cut connect2-xyz
Brij Consulting, LLC
 
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
James Jara Portfolio 2014  - Enterprise datagrid - Part 3James Jara Portfolio 2014  - Enterprise datagrid - Part 3
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
James Jara
 
CIM MODULE 2.pptx
CIM MODULE 2.pptxCIM MODULE 2.pptx
CIM MODULE 2.pptx
RithwikRanjan
 
Mutable data @ scale
Mutable data @ scaleMutable data @ scale
Mutable data @ scale
Ori Reshef
 
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
NGA Human Resources
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculator
solarisyourep
 
SplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunkLive! Advanced Session
SplunkLive! Advanced Session
Splunk
 
Oracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data TemplateOracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data Template
Edi Yanto
 
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
ijceronline
 
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA
 
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
Piyush Kumar
 
Sprint 50 review
Sprint 50 reviewSprint 50 review
Sprint 50 review
ManageIQ
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework manager
maxonlinetr
 
Sql query analyzer & maintenance
Sql query analyzer & maintenanceSql query analyzer & maintenance
Sql query analyzer & maintenance
nspyrenet
 
Unifying your data management with Hadoop
Unifying your data management with HadoopUnifying your data management with Hadoop
Unifying your data management with Hadoop
Jayant Shekhar
 
Maximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereMaximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL Anywhere
SAP Technology
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Grega Kespret
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performance
Guy Harrison
 
Summer '23 Tips.pptx
Summer '23 Tips.pptxSummer '23 Tips.pptx
Summer '23 Tips.pptx
Carl Brundage
 

Similar to Dynamic filtering for presto join optimisation (20)

Advanced Analytics using Apache Hive
Advanced Analytics using Apache HiveAdvanced Analytics using Apache Hive
Advanced Analytics using Apache Hive
 
Rough cut connect2-xyz
Rough cut connect2-xyzRough cut connect2-xyz
Rough cut connect2-xyz
 
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
James Jara Portfolio 2014  - Enterprise datagrid - Part 3James Jara Portfolio 2014  - Enterprise datagrid - Part 3
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
 
CIM MODULE 2.pptx
CIM MODULE 2.pptxCIM MODULE 2.pptx
CIM MODULE 2.pptx
 
Mutable data @ scale
Mutable data @ scaleMutable data @ scale
Mutable data @ scale
 
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculator
 
SplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunkLive! Advanced Session
SplunkLive! Advanced Session
 
Oracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data TemplateOracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data Template
 
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
 
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
 
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
 
Sprint 50 review
Sprint 50 reviewSprint 50 review
Sprint 50 review
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework manager
 
Sql query analyzer & maintenance
Sql query analyzer & maintenanceSql query analyzer & maintenance
Sql query analyzer & maintenance
 
Unifying your data management with Hadoop
Unifying your data management with HadoopUnifying your data management with Hadoop
Unifying your data management with Hadoop
 
Maximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereMaximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL Anywhere
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performance
 
Summer '23 Tips.pptx
Summer '23 Tips.pptxSummer '23 Tips.pptx
Summer '23 Tips.pptx
 

Recently uploaded

[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
Jason Yip
 
"What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w..."What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w...
Fwdays
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
christinelarrosa
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Neo4j
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
christinelarrosa
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
From Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMsFrom Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMs
Sease
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
Fwdays
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
UiPathCommunity
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
Ajin Abraham
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 

Recently uploaded (20)

[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
"What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w..."What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w...
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
From Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMsFrom Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMs
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 

Dynamic filtering for presto join optimisation

  • 1. Dynamic filtering for Presto join optimisation Roman Zeyde Presto Conference Israel 2019
  • 2. Agenda Existing join optimization techniques Dynamic filtering description Implementation details Performance analysis
  • 3. Existing join optimization techniques Happen during planning phase: • Join reordering • Join distribution type (distributed vs. broadcast) Depend on cost-based optimizer (need column statistics) • Should be enabled via session parameters • Can be estimated using ANALYZE statement
  • 4. Example: join reordering SELECT * FROM items JOIN sales ON sales.item_id = items.id; Prefer keeping the "smaller" table on the right-hand side of the join: Join (item_id=id) Join (item_id=id) Scan sales Scan items Scan items Scan sales
  • 5. Example: broadcast join If the right-hand side table is "small", it can be replicated to all join workers - saving the CPU and network cost of left- hand side repartitioning: Join worker Join worker Join workerLeft-hand side Right-hand side
  • 6. Join worker Join worker Example: distributed join Otherwise, both tables are repartitioned using the join key, allowing joins with larger right-hand side tables: Join workerLeft-hand side Right-hand side
  • 7. Dynamic filtering - introduction Consider the following query: SELECT * FROM sales JOIN items ON sales.item_id = items.id WHERE items.price > 1000; Assumptions: ● sales table is large ● items scan results in a few rows (due to predicate pushdown) Most of the scanned sales rows will be discarded during the join (i.e. high selectivity). How can we optimize this use-case? Join (item_id=id) Scan sales Scan items [price>1000]
  • 8. Dynamic filtering - description 1. Collect relevant id values during items scan 2. Construct dynamic filter F using the collected ids 3. Apply dynamic predicate pushdown using F to sales scan Benefits: • Connector may optimize the scan given F • Most sales rows are not touched by Presto • CPU & network savings for large tables Requirements: • F cannot be too large (memory-wise) • F need to "back propagate" into sales scan in runtime Join (item_id=id) Scan items [price>1000] Scan sales [item_id∈F] (3) (1) (2) Construct dynamic filter F
  • 9. Implementation details - Qubole et al. Supports both distributed and broadcast joins, but requires significant changes in Presto: • Add plan nodes and optimizer rules for dynamic filter collection and application • New coordinator REST endpoint for dynamic filter collection from worker nodes. • Allow connectors to prune partitions during split generation (when dynamic filter is ready) More details can be found here: qubole.com/blog/sql-join-optimizations-qubole-presto (https://docs.google.com/document/d/1TOlxS8ZAXSIHR5ftHbPsgUkuUA-ky-odwmPdZARJrUQ)
  • 10. Implementation - Varada When broadcast join is used, sales' ScanFilterAndProject and items' HashBuilder operators run at the same process: • Add a "pass-through" operator to collect build-side ids. • When ready, pushdown the resulting predicate F into sales page source. No changes needed at the planner, optimizer and coordinator! Implemented as a patch on top of github.com/prestosql/presto (currently work-in- progress). ScanFilterAndProject sales [item_id∈F] ScanFilterAndProject items [price>1000] Exchange Exchange Collect F:=F∪{id} LookupJoin [item_id=id] HashBuilder [id] TaskOutput
  • 11. Performance analysis - benchmark Consider the following query (based on TPC-DS sf10000 dataset): SELECT ss_item_sk FROM store_sales JOIN customer ON ss_customer_sk = c_customer_sk WHERE c_customer_id = 'AAAAAAAAMCOOKLCA'; • store_sales contains 27.7B rows • customer contains 65M rows • Query result contains 334 rows
  • 12. Performance analysis - results Regular join Dynamic filtering improvement Execution time 25 sec 0.9 sec x27 faster CPU time 57.4 min 7.8 sec x440 lower Peak total memory 261 MB 2.2 MB x118 lower Data read (from connector) 258 GB 3.3 kB x78M lower Tested on Varada cluster (with CBO enabled):
  • 13. Up next in Presto improvements • Distributed Joins - extend dynamic filtering • Aggregation Pushdown • Coordinator HA

Editor's Notes

  1. CBO is supported today by Presto Hive connector (using Hive statistics).
  2. Since hash-join requires reading the right-hand side table into memory, we would like to estimate the expected sizes' and reorder the join accordingly. It can be done manually - or automatically (using CBO) via connector-provided statistics.
  3. Broadcast join optimization allows to save network cost for LHS repartitioning at the expense of RHS replication. Can be set manually, or via CBO (by enumerating the possible join types and choosing the one with lowest cost).
  4. If we knew the item IDs during the planning, we could use predicate pushdown to propagate them into the connector.
  5. So instead, we need to construct the predicate in run-time (before starting the LHS scan). We re-use existing predicate pushdown mechanism, which allows us to skip most of the LHS table (can be done efficiently in our case).
  6. The coordination problem is much simpler in this case.
  7. Note: we don't know the join key during the plan, so regular predicate pushdown doesn't work. There is a single customer that matches RHS, so the results are highly selective.
  8. Same query, same hardware - without / with dynamic filtering. These results show that dynamic filtering may significantly improve the performance of highly-selective queries, by making relatively small changes in Presto.
  9. We are planning to continue the work on dynamic filtering, as well as adding support for aggregation pushdown and coordinator high-availability.