Pessimistic Cardinality Estimation:
Presented by Zhuomin Liu
PingCAP.com
Introduction & Background
● How to determine a join order
○ Binary join reordering algorithm: needs a good row-count estimation model.
○ Intermediate join size estimation: needs a precise cardinality calculation.
○ Cardinality of a join plan: how can it be estimated?
○ Naive solution:
join cardinality = max(left key cardinality, right key cardinality)
● What does cardinality underestimation cause?
○ Actual join result sizes exceed the estimates.
○ The optimizer may choose a very slow join order.
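To make the underestimation concrete, here is a minimal sketch that compares the naive estimate with the true join size. It assumes that "key cardinality" means the number of join-key values (rows) on each side, which is only one possible reading of the slide. The function names and the sample keys are hypothetical.

```python
from collections import Counter


def naive_join_estimate(left_keys, right_keys):
    # Assumption: "key cardinality" is read as the number of rows on each side.
    return max(len(left_keys), len(right_keys))


def true_join_size(left_keys, right_keys):
    # Exact equi-join size: for every key, multiply its frequency on both sides.
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    return sum(n * right_counts[k] for k, n in left_counts.items())


left = [1, 1, 1, 2]   # hypothetical join-key column of the left input
right = [1, 1, 3]     # hypothetical join-key column of the right input

print(naive_join_estimate(left, right))  # 4
print(true_join_size(left, right))       # 6
```

Under this reading the naive formula is exact for a key/foreign-key join but falls below the true size once a key repeats on both sides.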
Introduction & Background
● What does the paper do?
○ Proposes an approach to tightening join cardinality upper bounds.
○ Gives a method for generating entropic bounding formulas.
○ Introduces a partition budgeting strategy to control space and time complexity.
○ Demonstrates the approach on multiple datasets.
● Prerequisite: hypergraphs
Approach
● The Bound Sketch
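○ $R(A)$: a relation with attributes $A$
○ $H$: a random hash function $D \mapsto \{1, \dots, B\}$
○ $B$: size of the hash function's range $\{1, \dots, B\}$
○ $b$: an array of hash values, $b \in [B]^{A}$, where $[B] = \{1, \dots, B\}$
○ $b[a]$: hash value of attribute $a$
○ $R_b$: the subset of the relation instance $R$ consisting of those tuples $t \in R$ that hash to $b$

$$R_b = \{\, t \in R : H(t[a]) = b[a],\ \forall a \in A \,\}$$

$$c_{R_b} = |R_b|$$

$$d_{R_b}^{a} = \max_{v \in D \,:\, H(v) = b[a]} |\{\, t : t[a] = v \,\}|$$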
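The definitions above can be sketched in code as follows. This is a minimal illustration, not the paper's implementation. It assumes integer attribute values, uses `v % num_buckets` as a stand-in hash (so bucket ids run from 0 to B-1 instead of 1 to B), and counts degrees inside each bucket $R_b$. The function name `bound_sketch` and the sample relation are hypothetical.

```python
from collections import Counter, defaultdict


def bound_sketch(relation, attrs, num_buckets):
    """Map each non-empty bucket b to (c, d): c = |R_b|, d[a] = max degree of a in R_b."""

    # Stand-in hash (an assumption): integer value modulo the bucket count.
    def hash_value(v):
        return v % num_buckets

    # Group tuples by the array of hash values of all their attributes.
    buckets = defaultdict(list)
    for t in relation:
        b = tuple(hash_value(v) for v in t)
        buckets[b].append(t)

    sketch = {}
    for b, tuples in buckets.items():
        # For each attribute, the largest number of bucket tuples sharing one value.
        degrees = {
            a: max(Counter(t[i] for t in tuples).values())
            for i, a in enumerate(attrs)
        }
        sketch[b] = (len(tuples), degrees)
    return sketch


R = [(1, 2), (1, 4), (3, 4), (2, 2), (1, 3)]  # hypothetical instance of R(x, y)
for b, (count, degrees) in sorted(bound_sketch(R, ("x", "y"), 2).items()):
    print(b, count, degrees)
```

With two buckets per attribute, bucket (1, 0) holds three tuples and has a maximum degree of 2 on both attributes. Empty buckets are simply absent from the result.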
Approach
● The Bound Sketch example
Relation $R(x, y)$
Hash function $h(x) = x \bmod 2$
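The slide's own data is not preserved in this transcript, so the following worked example uses a hypothetical instance. It assumes a two-attribute relation $R(x, y)$, applies the mod-2 hash to both attributes, and counts degrees inside each bucket.

```latex
\[
R = \{(1,2),\ (1,4),\ (3,4),\ (2,2),\ (1,3)\}, \qquad h(v) = v \bmod 2
\]
\[
\begin{array}{c|l|c|c|c}
b = (h(x), h(y)) & R_b & c_{R_b} & d_{R_b}^{x} & d_{R_b}^{y} \\ \hline
(1,0) & \{(1,2),\ (1,4),\ (3,4)\} & 3 & 2 & 2 \\
(1,1) & \{(1,3)\} & 1 & 1 & 1 \\
(0,0) & \{(2,2)\} & 1 & 1 & 1 \\
(0,1) & \emptyset & 0 & 0 & 0
\end{array}
\]
```

In bucket $(1,0)$ the value $x = 1$ occurs twice and the value $y = 4$ occurs twice, which gives both maximum degrees of 2.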
Approach
● Cardinality Bounds example
○ $Q(x, y, z) \text{ :- } R(x, y),\ S(y, z),\ T(z, x)$
○ random variables $X, Y, Z$ for $x, y, z$
○ joint entropy $h(x, y, z) = \log(|Q(x, y, z)|)$
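As a sketch of how the entropy view turns into a cardinality bound, take $(X, Y, Z)$ uniformly distributed over the output tuples of the triangle query. The notation here is assumed for illustration: $c_R = |R|$, and $d_S^{y}$ is the largest number of tuples of $S$ that share a single $y$ value.

```latex
\begin{align*}
\log |Q| &= h(X, Y, Z)
  && \text{uniform distribution over the output} \\
&= h(X, Y) + h(Z \mid X, Y)
  && \text{chain rule} \\
&\le h(X, Y) + h(Z \mid Y)
  && \text{dropping a condition cannot lower entropy} \\
&\le \log c_R + \log d_S^{y}
  && (X, Y) \in R,\ \text{and each } y \text{ has at most } d_S^{y} \text{ partners } z \text{ in } S
\end{align*}
\[
\Longrightarrow \quad |Q| \le c_R \cdot d_S^{y}
\]
```

Choosing a different starting relation, or conditioning on a different shared attribute, gives the other product-form bounds in the same way.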
Approach
● Cardinality Bounds example
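$$|Q(D)| \le \min\left\{\; c_R\, d_S^{y},\;\; c_S\, d_T^{z},\;\; c_T\, d_R^{x},\;\; c_R\, d_T^{x},\;\; c_S\, d_R^{y},\;\; c_T\, d_S^{z},\;\; (c_R)^{\frac{1}{2}} (c_S)^{\frac{1}{2}} (c_T)^{\frac{1}{2}} \;\right\}$$

$$|Q(D)| \le \sum_{b \in [B]^{3}} \min\left\{\; c_{R_b}\, d_{S_b}^{y},\;\; c_{S_b}\, d_{T_b}^{z},\;\; c_{T_b}\, d_{R_b}^{x},\;\; c_{R_b}\, d_{T_b}^{x},\;\; c_{S_b}\, d_{R_b}^{y},\;\; c_{T_b}\, d_{S_b}^{z},\;\; (c_{R_b})^{\frac{1}{2}} (c_{S_b})^{\frac{1}{2}} (c_{T_b})^{\frac{1}{2}} \;\right\}$$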
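The two bounds for the triangle query can be compared on a small input with the sketch below. It is an illustration under stated assumptions, not the paper's code: attribute values are integers, the hash is `v % B`, every attribute is hashed with the same number of buckets, and all function names and data are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def stats(rel):
    """Return (count, max degree of column 0, max degree of column 1)."""
    if not rel:
        return 0, 0, 0
    d0 = max(Counter(t[0] for t in rel).values())
    d1 = max(Counter(t[1] for t in rel).values())
    return len(rel), d0, d1


def triangle_bound(R, S, T):
    """Minimum over the seven formulas for Q(x,y,z) :- R(x,y), S(y,z), T(z,x)."""
    cR, dRx, dRy = stats(R)
    cS, dSy, dSz = stats(S)
    cT, dTz, dTx = stats(T)
    return min(cR * dSy, cS * dTz, cT * dRx,
               cR * dTx, cS * dRy, cT * dSz,
               sqrt(cR * cS * cT))


def partitioned_bound(R, S, T, B):
    """Sum the per-bucket bound over all b = (bx, by, bz) in [B]^3."""
    def split(rel):
        parts = defaultdict(list)
        for u, v in rel:
            parts[(u % B, v % B)].append((u, v))  # stand-in hash: value mod B
        return parts

    Rp, Sp, Tp = split(R), split(S), split(T)
    total = 0.0
    for bx in range(B):
        for by in range(B):
            for bz in range(B):
                total += triangle_bound(Rp.get((bx, by), []),
                                        Sp.get((by, bz), []),
                                        Tp.get((bz, bx), []))
    return total


def true_triangles(R, S, T):
    """Exact output size, for comparison."""
    s_by_y = defaultdict(list)
    for y, z in S:
        s_by_y[y].append(z)
    t_set = set(T)
    return sum(1 for x, y in R for z in s_by_y[y] if (z, x) in t_set)


# Hypothetical data: a dense cluster on even values plus one odd tuple.
R = S = T = [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1)]
print(triangle_bound(R, S, T))        # 10
print(partitioned_bound(R, S, T, 2))  # 9.0
print(true_triangles(R, S, T))        # 9
```

On this input the partitioned sum matches the true size, while the unpartitioned bound is 10. Hashing every attribute as done here is not guaranteed to tighten the bound on every input; on other data the sum can come out larger than the unpartitioned minimum.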
Approach
● Selection Propagation and Processing
Approach
● How should the hash partition budget be allocated?
○ A fixed size for each attribute?
○ A fixed total size summed over all attributes?
● Hash Partition Budgeting
○ Fix a maximum size for each attribute.
○ Distribute the budget to the attributes that a relation covers unconditionally.
Approach
● Bound Formula Generation
○ Enumerate all relation cover sets in the hypergraph.
○ For each relation in a cover set, calculate its contribution to the formula.
● Example
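The slide's example is not preserved in this transcript. As a stand-in, here is a simplified sketch of cover enumeration on the triangle query. It is not the paper's algorithm. It assumes that the first relation of a cover contributes a count term `c_R`, that every later relation contributes a degree term `d_R^a` conditioned on one attribute `a` that is already covered, and that a relation sharing no covered attribute contributes a count term. Fractional formulas such as the square-root bound are not generated. All names are hypothetical.

```python
def bound_formulas(hypergraph):
    """Enumerate product-form bounding formulas for a query hypergraph.

    hypergraph maps a relation name to the tuple of attributes it covers.
    """
    all_attrs = frozenset(a for attrs in hypergraph.values() for a in attrs)
    found = set()

    def extend(covered, terms, used):
        if covered == all_attrs:
            found.add(" * ".join(terms))
            return
        for rel, attrs in hypergraph.items():
            attrs = frozenset(attrs)
            # Skip relations already used or adding no new attribute.
            if rel in used or attrs <= covered:
                continue
            shared = covered & attrs
            if not shared:
                # Unconditional cover: the relation contributes its count.
                options = [f"c_{rel}"]
            else:
                # Conditional cover: degree with respect to one covered attribute.
                options = [f"d_{rel}^{a}" for a in sorted(shared)]
            for term in options:
                extend(covered | attrs, terms + [term], used | {rel})

    extend(frozenset(), [], frozenset())
    return sorted(found)


triangle = {"R": ("x", "y"), "S": ("y", "z"), "T": ("z", "x")}
for formula in bound_formulas(triangle):
    print(formula)
```

For the triangle query this prints six formulas, for example `c_R * d_S^y`, one for each choice of starting relation and second relation.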
Evaluation
● Progressive Bound Tightness
○ 45 GooglePlus community edge-sets
○ Postgres 9.6.6 instance
○ Non-cryptographic Murmur3 hash algorithm
Evaluation
● Plan Execution Time Improvement
Thank you!
