DB reading group
May. 16, 2018 Keisuke Suzuki
Today’s paper
Efficiently Compiling Efficient Query Plans for Modern Hardware
● Thomas Neumann, VLDB 2011
○ Creator of HyPer
■ Main memory RDBMS for mixed OLTP and OLAP workloads
● Topic: query execution on modern CPUs
Query processing on RDBMS
Scope of this paper
Executing relational algebraic plans
Ri: relation
σ: selection
Γ: aggregation
⋈: natural join
● Executor variations
○ Compiled vs. interpreted
○ Pipelining vs. block processing
○ Pull vs. push
● ref: CMU Advanced Database Systems - 03 Query Compilation
Volcano style execution
● interpreted + pipelining + pull
● Pros
○ easy to implement
○ no materialization
● Cons
○ poor cache locality
○ expensive virtual function calls (one per next() call)
● popular in disk-based DBMS
○ e.g. PostgreSQL
● performs much worse than hand-written code on modern systems
Figure labels: 1. next(), 2. next(), 3. a tuple, 4. a tuple
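
The pull flow on this slide can be sketched in a few lines of Python. This is a minimal illustration and not PostgreSQL's or any real system's executor; the `Scan` and `Select` classes, the dict-based rows, and the use of `None` as the end-of-input marker are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional

Row = dict  # hypothetical tuple representation: column name -> value


class Scan:
    """Leaf operator: hands out one stored row per next() call."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[Row]:
        return next(self._it, None)  # None signals end of input (assumption)


class Select:
    """Filter operator: pulls from its child until a row passes the predicate."""

    def __init__(self, child, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[Row]:
        # Every loop iteration is a dynamically dispatched call into the child.
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None


def run(root) -> list:
    """Drive the plan from the root: one next() call per output row."""
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


r1 = [{"x": 7, "a": 1}, {"x": 3, "a": 2}, {"x": 7, "a": 3}]
plan = Select(Scan(r1), lambda row: row["x"] == 7)
result = run(plan)
```

Each output row costs at least one call per operator in the plan, which is the per-tuple overhead listed under Cons.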
Related work: MonetDB/X100
● interpreted + block processing + pull
● Pros
○ better locality than Volcano
● Cons
○ virtual function calls
○ unnecessary tuple copies can occur at operator boundaries,
e.g. tuples with x <> 7 at step 3
● still slower than hand-written code
Figure labels: 1. next_chunk(), 2. next_chunk(), 3. chunk of tuples, 4. chunk of tuples
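
A chunk-at-a-time variant of the same pull interface might look like the sketch below. It is not MonetDB/X100 code; `ChunkScan`, `ChunkSelect`, the plain integer column, and the chunk size are assumptions chosen for illustration.

```python
from typing import Callable, List, Optional, Sequence


class ChunkScan:
    """Leaf operator: returns up to chunk_size values per next_chunk() call."""

    def __init__(self, values: Sequence[int], chunk_size: int) -> None:
        self.values = values
        self.chunk_size = chunk_size
        self.pos = 0

    def next_chunk(self) -> Optional[List[int]]:
        if self.pos >= len(self.values):
            return None  # end of input (assumption)
        # The chunk is materialized in full, including values that the
        # downstream filter will discard (the x <> 7 case on the slide).
        chunk = list(self.values[self.pos:self.pos + self.chunk_size])
        self.pos += self.chunk_size
        return chunk


class ChunkSelect:
    """Filter operator: one call per chunk instead of one call per tuple."""

    def __init__(self, child: ChunkScan, predicate: Callable[[int], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next_chunk(self) -> Optional[List[int]]:
        while (chunk := self.child.next_chunk()) is not None:
            kept = [v for v in chunk if self.predicate(v)]  # tight inner loop
            if kept:
                return kept
        return None


x_values = [7, 3, 7, 1, 7, 9]
plan = ChunkSelect(ChunkScan(x_values, chunk_size=4), lambda v: v == 7)
chunks = []
while (chunk := plan.next_chunk()) is not None:
    chunks.append(chunk)
```

The call overhead is paid once per chunk, but whole chunks still cross the operator boundary before the filter sees them.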
Proposed method
● compiled + block processing + push
○ tuples are pushed to the next pipeline breaker (e.g. hash, aggregation, …)
● Pros
○ good locality
○ no virtual function calls
○ generated query execution code is easy to parallelize
■ SIMD
■ multithreading
Figure label: 1. loop the filter over R1 tuples
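
What "pushing tuples to the next pipeline breaker" amounts to can be sketched as one fused loop. This is a hand-written stand-in for generated code, written in Python for readability; the function name, the dict-based rows, and the choice of a count/sum aggregation over a column `a` are assumptions.

```python
def count_and_sum_matching(r1):
    """Scan, filter, and aggregate fused into one loop with no operator calls."""
    count = 0
    total = 0
    for row in r1:             # scan R1
        if row["x"] == 7:      # filter inlined into the scan loop
            count += 1         # aggregation state updated in place
            total += row["a"]
    return count, total


r1 = [{"x": 7, "a": 1}, {"x": 3, "a": 2}, {"x": 7, "a": 3}]
result = count_and_sum_matching(r1)
```

There are no next() calls and no intermediate chunks here. The tuple being processed stays in local variables until it reaches the aggregation.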
Translate algebraic plan into code fragments?
Translation: Pull based
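// Every operator exposes the same pull interface.
interface Node { Tuple next(); }
// Join pulls from both of its children.
class JoinNode implements Node {
Node left, right;
Tuple next() { .. }
}
// Selection pulls from its single child.
class SelectNode implements Node {
Node child;
Tuple next() { .. }
}
// Scan is the leaf and reads the base relation.
class ScanNode implements Node {
Tuple next() { .. }
}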
● Simple pipelining of operator nodes
● Tree structure
Translation: Proposed method
● Not tree structure
● Ambiguous operator boundaries
Producer / Consumer interface
● produce()
○ asks the operator to produce results
● consume(attributes, source)
○ called to push results toward the operator
● Flow
1. call produce() of root operator
2. recursively call produce() until
reaching leaf operator
3. leaf operator generates results
4. recursively call consume()
until reaching root operator
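
The flow above can be sketched as a tiny code generator in which produce() and consume() emit source text instead of processing tuples. This is an illustration under assumptions, not HyPer's implementation: the operators emit Python source rather than LLVM IR, the `Output` root operator and the `generate` helper are invented for the example, and consume() here takes only the source operator, leaving out the attribute list.

```python
class Scan:
    """Leaf operator: produce() emits the loop over the base table."""

    def __init__(self, table: str) -> None:
        self.table = table
        self.parent = None

    def produce(self, out: list, indent: str) -> None:
        out.append(f"{indent}for t in tables[{self.table!r}]:")
        self.parent.consume(out, indent + "    ", self)  # push toward the root


class Select:
    """Selection: emits nothing in produce(), an `if` in consume()."""

    def __init__(self, child, condition: str) -> None:
        self.child = child
        self.condition = condition
        self.parent = None
        child.parent = self

    def produce(self, out: list, indent: str) -> None:
        self.child.produce(out, indent)

    def consume(self, out: list, indent: str, source) -> None:
        out.append(f"{indent}if {self.condition}:")
        self.parent.consume(out, indent + "    ", self)


class Output:
    """Hypothetical root operator: appends every tuple it receives."""

    def __init__(self, child) -> None:
        self.child = child
        child.parent = self

    def produce(self, out: list, indent: str) -> None:
        self.child.produce(out, indent)

    def consume(self, out: list, indent: str, source) -> None:
        out.append(f"{indent}result.append(t)")


def generate(root) -> str:
    """Step 1 of the flow: call produce() on the root and collect the code."""
    lines = ["def query(tables):", "    result = []"]
    root.produce(lines, "    ")
    lines.append("    return result")
    return "\n".join(lines)


plan = Output(Select(Scan("R1"), "t['x'] == 7"))
source = generate(plan)
namespace = {}
exec(source, namespace)  # stands in for compiling and loading the generated code
rows = namespace["query"]({"R1": [{"x": 7}, {"x": 3}]})
```

Printing `source` shows a single nested loop: the operator objects exist only at code generation time and leave no calls behind in the generated function.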
Example
⋈{a=b}.produce
-> σ{x=7}.produce
-> scan{R1}.produce
(read tuples from R1)
-> σ{x=7}.consume
(select tuples with x = 7)
-> ⋈{a=b}.consume
(materialize tuples in hash table)
Materialization breaks the loop
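
The effect of the materialization point can be sketched as two separate loops. The slide shows only the R1 side of the join; the probe side (a relation `R2` with join column `b` and a payload column `y`) and the function name are assumptions added to make the example complete.

```python
from collections import defaultdict


def join_r1_r2(r1, r2):
    # Pipeline 1: scan R1, select x = 7, materialize into the join hash table.
    hash_table = defaultdict(list)
    for t1 in r1:
        if t1["x"] == 7:
            hash_table[t1["a"]].append(t1)
    # The hash table must be complete before any probe, so the loop ends here.

    # Pipeline 2 (assumed probe side): scan R2 and probe with a = b.
    result = []
    for t2 in r2:
        for t1 in hash_table.get(t2["b"], []):
            result.append({**t1, **t2})
    return result


r1 = [{"x": 7, "a": 1}, {"x": 3, "a": 2}, {"x": 7, "a": 3}]
r2 = [{"b": 1, "y": "p"}, {"b": 2, "y": "q"}, {"b": 3, "y": "r"}]
joined = join_r1_r2(r1, r2)
```

Within each loop a tuple flows through all operators of its pipeline without being stored. Only the hash table separates the two pipelines.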
Generating Machine Code
● At first: generate C++ code -> compile -> load as shared library
○ their system (HyPer) is written in C++
○ Bad: slow compilation (multiple seconds)
○ Bad: C++ does not offer total control over the generated code
● Next: mixed LLVM and C++ code
○ drive and connect operators with LLVM code and call pre-compiled C++
functions for complex processing (e.g. disk I/O, memory allocation)
○ good: fast compilation (a few milliseconds)
○ good: producing assembly through LLVM is more robust than writing it by hand
Performance Tuning
● Branch prediction
○ a branch that is true nearly 0% or nearly 100% of the time is cheap
○ a branch that is true about 50% of the time is expensive
● Hash table lookup: the hash value mostly exists, but mostly without collisions
○ -> 1st iteration true, 2nd iteration false
○ 20% faster
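
Why the loop shape matters can be illustrated by counting branch outcomes per branch site. The sketch assumes the tuning refers to rewriting a single `while (entry)` probe loop as `if (entry) do { ... } while (entry)`. It only counts outcomes and does not measure prediction cost, which Python cannot show. The function names and the chain-length input are hypothetical.

```python
from collections import Counter


def probe_single_loop(chain_lengths):
    """One `while entry:` branch site handles both the 1st and the 2nd test."""
    outcomes = Counter()
    for n in chain_lengths:
        remaining = n
        while True:
            taken = remaining > 0
            outcomes[("while", taken)] += 1
            if not taken:
                break
            remaining -= 1  # visit the entry, advance to the next one
    return outcomes


def probe_split_loop(chain_lengths):
    """Separate sites: `if` for the first entry, `do-while` for collisions."""
    outcomes = Counter()
    for n in chain_lengths:
        remaining = n
        taken = remaining > 0
        outcomes[("if", taken)] += 1
        if taken:
            while True:  # do { ... } while (entry)
                remaining -= 1
                again = remaining > 0
                outcomes[("do-while", again)] += 1
                if not again:
                    break
    return outcomes


# 100 probes that each find exactly one entry (match exists, no collision).
single = probe_single_loop([1] * 100)
split = probe_split_loop([1] * 100)
```

In the single-loop form the one branch site is true half the time. In the split form the `if` is always true and the `do-while` test is always false for this input, so each site falls into the cheap case.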
Performance on OLTP / OLAP
● OLTP: small performance improvement
○ low selectivity (queries touch only a small number of tuples)
● OLAP: big performance improvement
Criticism: Maintainability of operator templates
● Template expansion easily becomes too complex
○ The code base grows as more and more optimizations are added
○ One of the major reasons that the pull (iterator) model is preferred
● low-level language (LLVM IR)
Some studies follow up on this problem
● e.g. Building Efficient Query Engines in a High-Level Language
