Hypothetical Partitioning for PostgreSQL
Yuzuko Hosoya
at POSTGRESOPEN SV 2018
1.
Hypothetical Partitioning for PostgreSQL
POSTGRESOPEN SV 2018, 7th of September
Yuzuko Hosoya, NTT Open Source Software Center
Copyright©2018 NTT Corp. All Rights Reserved.
2.
Who am I?
•Yuzuko Hosoya (@pyyycha), from Tokyo, Japan
•Work
⁃NTT Open Source Software Center
⁃Working on HypoPG
⁃Interested in Planner/Partitioning
•Like Dancing (Classical Ballet/Jazz)
3.
Agenda
I. Introduction of HypoPG
II. Why Hypothetical Partitioning?
III. HypoPG Usage & Architecture
IV. Demo
V. Future Works
VI. Summary
4.
Introduction of HypoPG
5.
Introduction of HypoPG
Released by: Julien Rouhaud
Latest version: HypoPG 1.1.2
Supports: PostgreSQL 9.2 and above
GitHub: https://github.com/HypoPG/hypopg
Documentation: https://hypopg.readthedocs.io
6.
HypoPG 1.X
•Supports hypothetical indexes
•Allows users to define hypothetical indexes on real tables and data
•If hypothetical indexes are defined on a table, the planner considers them along with any real indexes
•Outputs query plans and costs with EXPLAIN as if the hypothetical indexes actually existed
•Helps with index design tuning
7.
HypoPG 2 (still under development)
•Supports hypothetical partitioning (v10~)
•Allows users to try/simulate different hypothetical partitioning schemes on real tables and data
•Outputs query plans and costs with EXPLAIN using hypothetical partitioning schemes
•Helps with partitioning design tuning
8.
Why Hypothetical Partitioning?
9.
PostgreSQL Partitioning
•Declarative Partitioning was a long-desired feature
•It will be further improved with new PostgreSQL versions
v9.6: Partitioning by a combination of the following features
⁃Table Inheritance
⁃Check Constraint
⁃Trigger
v10: Declarative Partitioning was introduced
⁃Partitioning enabled by the CREATE TABLE statement
⁃RANGE/LIST Partitioning
v11: Many more partitioning enhancements
⁃HASH Partitioning
⁃Partitioned Index
⁃Faster Partition Pruning
⁃Partition-wise Join/Aggregate
and so on
10.
Query Performance and Partitioning Schemes
•The most IMPORTANT factor affecting query performance over a partitioned table is the partitioning scheme
⁃Which tables to partition
⁃Which strategy to choose
⁃Which columns to specify as the partition key
⁃How many partitions to create
11.
Partition Pruning
•Do not scan unnecessary partitions, to reduce the amount of data read from disk
    SELECT * FROM orders WHERE date='January';
[Figure: orders table with 12 partitions, January to December; the January partition is scanned, the others are NOT scanned]
12.
Partition Pruning
•Do not scan unnecessary partitions, to reduce the amount of data read from disk
•Optimal partitioning scheme: partition by the column specified in the WHERE clause
    SELECT * FROM orders WHERE date='January';
[Figure: orders table with 12 partitions, January to December, partitioned by date]
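The pruning idea on this slide can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions: the sample rows, the `order_id` and `price` fields, and the helper names are hypothetical, and the real planner prunes from partition bounds in the catalog rather than from dictionary keys.

```python
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def partition_by_month(rows):
    """LIST-partition rows on their 'date' column, one partition per month."""
    partitions = {month: [] for month in MONTHS}
    for row in rows:
        partitions[row["date"]].append(row)
    return partitions


def select_where_date(partitions, month):
    """Answer WHERE date = month, scanning only the partition that can match."""
    scanned = [name for name in partitions if name == month]  # pruning step
    result = [row for name in scanned for row in partitions[name]]
    return result, scanned


# Hypothetical sample data standing in for the orders table.
orders = [
    {"order_id": 1, "date": "January", "price": 10},
    {"order_id": 2, "date": "March", "price": 25},
    {"order_id": 3, "date": "January", "price": 40},
]
partitions = partition_by_month(orders)
rows, scanned = select_where_date(partitions, "January")
print(scanned)                        # ['January']
print([r["order_id"] for r in rows])  # [1, 3]
```

Only one of the twelve partitions is touched, which is why partitioning on the WHERE column pays off for this query.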
13.
Partition-wise Join
•Joining pairs of partitions might be faster than joining large tables
    SELECT c.name FROM orders o LEFT JOIN customers c ON o.cust_id = c.cust_id WHERE o.date = 'September';
[Figure: orders and customers, each split into partitions 1-1000 ... 5001-6000; matching partitions are joined pairwise]
14.
Partition-wise Join
•Joining pairs of partitions might be faster than joining large tables
•Optimal partitioning scheme: partition by the column specified in the JOIN clause
    SELECT c.name FROM orders o LEFT JOIN customers c ON o.cust_id = c.cust_id WHERE o.date = 'September';
[Figure: orders and customers, both partitioned by cust_id into partitions 1-1000 ... 5001-6000]
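As a rough illustration of why matching partition bounds matter, the following Python sketch joins two tables pairwise, partition by partition, and checks that the result equals the join over the whole tables. The bounds, rows, and function names are hypothetical, and `cust_id` is assumed to be unique in customers.

```python
def range_partition(rows, key, bounds):
    """Split rows by range; each bound is (low, high), low inclusive, high exclusive."""
    parts = {bound: [] for bound in bounds}
    for row in rows:
        for low, high in bounds:
            if low <= row[key] < high:
                parts[(low, high)].append(row)
                break
    return parts


def left_join(orders, customers):
    """LEFT JOIN orders to customers on cust_id, returning (order_id, name) pairs."""
    name_by_id = {c["cust_id"]: c["name"] for c in customers}  # cust_id assumed unique
    return [(o["order_id"], name_by_id.get(o["cust_id"])) for o in orders]


BOUNDS = [(1, 1001), (1001, 2001)]  # hypothetical, identical for both tables
orders = [
    {"order_id": 1, "cust_id": 5},
    {"order_id": 2, "cust_id": 1500},
    {"order_id": 3, "cust_id": 999},   # no matching customer
]
customers = [{"cust_id": 5, "name": "Ann"}, {"cust_id": 1500, "name": "Bob"}]

o_parts = range_partition(orders, "cust_id", BOUNDS)
c_parts = range_partition(customers, "cust_id", BOUNDS)

# Join each pair of matching partitions, then concatenate the partial results.
pairwise = [pair for bound in BOUNDS for pair in left_join(o_parts[bound], c_parts[bound])]
whole = left_join(orders, customers)
print(sorted(pairwise) == sorted(whole))  # True
```

The pairwise result is only correct because both tables use the same bounds on the join column, so a row can never match a row in a different partition.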
15.
Partition-wise Aggregate
•Performing aggregation per partition and combining the results might be faster than performing aggregation on a large table
    SELECT date, sum(price) FROM orders GROUP BY date;
[Figure: orders partitions January to December, each aggregated separately (Agg) and then combined]
16.
Partition-wise Aggregate
•Performing aggregation per partition and combining the results might be faster than performing aggregation on a large table
•Optimal partitioning scheme: partition by the column specified in the GROUP BY clause
    SELECT date, sum(price) FROM orders GROUP BY date;
[Figure: orders partitions January to December, partitioned by date]
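The aggregate case can be sketched the same way in Python: aggregate each partition on its own, then merge the partial results. The rows and helper name are hypothetical; this models the idea only, not the executor.

```python
from collections import Counter


def aggregate(rows):
    """SELECT date, sum(price) ... GROUP BY date over one set of rows."""
    totals = Counter()
    for row in rows:
        totals[row["date"]] += row["price"]
    return totals


# Hypothetical sample data standing in for the orders table.
orders = [
    {"date": "January", "price": 10},
    {"date": "March", "price": 25},
    {"date": "January", "price": 40},
]

# Partition by the GROUP BY column.
partitions = {}
for row in orders:
    partitions.setdefault(row["date"], []).append(row)

# Aggregate each partition independently, then combine the partial results.
combined = Counter()
for part in partitions.values():
    combined.update(aggregate(part))

print(dict(combined))  # {'January': 50, 'March': 25}
```

Because the partition key equals the GROUP BY key, every group lives in exactly one partition and the combine step never has to merge two partial sums for the same group.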
17.
Question: How do you partition these tables?
•Assume this situation
    SELECT * FROM orders WHERE date='January';
    SELECT c.name FROM orders o LEFT JOIN customers c ON o.cust_id = c.cust_id WHERE o.date = 'September';
[Figure: orders and customers tables. Partition by cust_id? Partition by date? Partition by order_id?]
18.
It's VERY HARD …
•There are often multiple choices of partitioning scheme
•Query plans and costs are largely affected by data size and parameter settings
19.
It's HARD …
•There are often multiple choices of partitioning scheme
•Query plans and costs are largely affected by data size and parameter settings
Do you want to know how your queries behave before actually partitioning tables?
20.
Hypothetical Partitioning is Helpful!
•You can quickly check how your queries would behave if certain tables were partitioned, without actually partitioning any tables or wasting any resources
•You can try different partitioning schemes in a short time
21.
HypoPG Usage & Architecture
22.
How to Create Hypothetical Partitioning Schemes
•Need to have an actual plain table
•Create hypothetical partitioning schemes using the following HypoPG functions:
⁃For a hypothetical partitioned table:
    hypopg_partition_table('table_name', 'PARTITION BY strategy(column)')
⁃For hypothetical partitions:
    hypopg_add_partition('table_name', 'PARTITION OF parent FOR VALUES partition_bound', 'PARTITION BY strategy(column)')
NOTE: the 3rd argument is only for subpartitioning
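To make the argument shapes concrete, here is a small Python sketch that assembles calls to the two functions named on this slide. Several things are assumptions rather than facts from the talk: that the functions are invoked through a plain `SELECT`, the partition name `hypo_orders_jan`, and the date bounds. HypoPG 2 was still under development, so the final API may differ.

```python
def sql_literal(text):
    """Quote text as a SQL string literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def partition_table_call(table, strategy, column):
    """Build a call that declares `table` as hypothetically partitioned."""
    clause = f"PARTITION BY {strategy}({column})"
    return f"SELECT hypopg_partition_table({sql_literal(table)}, {sql_literal(clause)});"


def add_partition_call(name, parent, bound, subpartition=None):
    """Build a call that adds one hypothetical partition under `parent`."""
    args = [sql_literal(name),
            sql_literal(f"PARTITION OF {parent} FOR VALUES {bound}")]
    if subpartition is not None:  # 3rd argument: only for subpartitioning
        args.append(sql_literal(subpartition))
    return f"SELECT hypopg_add_partition({', '.join(args)});"


# Hypothetical names and bounds, for illustration only.
print(partition_table_call("hypo_orders", "RANGE", "date"))
print(add_partition_call("hypo_orders_jan", "hypo_orders",
                         "FROM ('2018-01-01') TO ('2018-02-01')"))
```

The quoting helper matters because the partition bound itself contains string literals, which have to be doubled when the whole clause is passed as one text argument.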
23.
How to Create Hypothetical Statistics
•Create hypothetical statistics for hypothetical partitions using the following HypoPG function:
    hypopg_analyze('table_name', percentage for sampling)
•The table specified in the first argument must be a root table
•Statistics are created for all leaf partitions related to the given table
24.
How hypopg_analyze() Works
    hypopg_analyze('orders', 70);
•For each partition, get the partition bounds, including those of its ancestors if any
[Figure: orders partitioned by date (January, February, ..., December); January subpartitioned by cust_id (1-1000, 1000-2000). Partition bound: date = 'January', cust_id >= 1 AND cust_id < 1000]
25.
How hypopg_analyze() Works
    hypopg_analyze('orders', 70);
•For each partition, get the partition bounds, including those of its ancestors if any
•Generate a WHERE clause according to the partition bound
[Figure: orders partitioned by date (January, February, ..., December); January subpartitioned by cust_id (1-1000, 1000-2000). Partition bound: date = 'January' and cust_id >= 1 and cust_id < 1000. Generated clause: WHERE date = 'January' AND cust_id >= 1 AND cust_id < 1000]
26.
How hypopg_analyze() Works
    hypopg_analyze('orders', 70);
•For each partition, get the partition bounds, including those of its ancestors if any
•Generate a WHERE clause according to the partition bound
•Compute statistics on the sample data obtained with TABLESAMPLE and the WHERE clause
[Figure: orders partitioned by date (January, February, ..., December); January subpartitioned by cust_id (1-1000, 1000-2000). Partition bound: date = 'January' and cust_id >= 1 and cust_id < 1000. Generated clause: WHERE date = 'January' AND cust_id >= 1 AND cust_id < 1000]
27.
TABLESAMPLE Clause
    SELECT select_expression FROM table_name TABLESAMPLE sampling_method(percentage) WHERE condition
•Gets sample data from a target table according to the percentage
•Two sampling methods
⁃SYSTEM: picks data by page
⁃BERNOULLI: picks data by row
•If a WHERE clause is specified, sampled rows that do not satisfy the WHERE condition are eliminated
[Figure: target table, then sampling, then filtering]
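The difference between the two sampling methods, and the fact that filtering happens after sampling, can be imitated in Python. This is a simplified model with hypothetical data: a "page" is just a list of rows, and the random draws only approximate what PostgreSQL does internally.

```python
import random


def bernoulli_sample(pages, percentage, rng):
    """BERNOULLI-style: keep each row independently with probability percentage/100."""
    return [row for page in pages for row in page
            if rng.random() * 100 < percentage]


def system_sample(pages, percentage, rng):
    """SYSTEM-style: keep or drop whole pages with probability percentage/100."""
    return [row for page in pages
            if rng.random() * 100 < percentage
            for row in page]


def sample_then_filter(pages, method, percentage, predicate, seed=0):
    """Sample first, then eliminate rows that fail the WHERE condition."""
    rng = random.Random(seed)
    sampled = method(pages, percentage, rng)
    return [row for row in sampled if predicate(row)]


# Hypothetical table: 4 pages of 5 rows each, cust_id 1..20.
pages = [[{"cust_id": 5 * p + i + 1} for i in range(5)] for p in range(4)]


def in_partition(row):
    return row["cust_id"] < 10


full = sample_then_filter(pages, bernoulli_sample, 100, in_partition)
print(len(full))  # 9: a 100% sample keeps every row, then the filter applies
half = sample_then_filter(pages, system_sample, 50, in_partition)
print(all(in_partition(row) for row in half))  # True
```

Since the filter runs on the sample, a small percentage combined with a selective WHERE condition can leave very few rows to compute statistics from.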
28.
How hypopg_analyze() Works
    hypopg_analyze('orders', 70);
•For each partition, get the partition bounds, including those of its ancestors if any
•Generate a WHERE clause according to the partition bound
•Compute statistics on the sample data obtained with TABLESAMPLE and the WHERE clause
[Figure: orders partitioned by date (January, February, ..., December); January subpartitioned by cust_id (1-1000, 1000-2000). Partition bound: date = 'January' and cust_id >= 1 and cust_id < 1000. Generated clause: WHERE date = 'January' AND cust_id >= 1 AND cust_id < 1000. The orders table is sampled, filtered, and then statistics are computed]
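The first two steps can be sketched in Python as a walk up the partition tree that collects bound conditions and joins them into one WHERE clause. The `Partition` class and helper names are hypothetical, bounds are modelled as ready-made condition strings, and the choice of BERNOULLI as the default is an assumption, since the slides do not say which sampling method hypopg_analyze uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Partition:
    name: str
    bound: Optional[str] = None           # condition implied by this partition's bound
    parent: Optional["Partition"] = None  # None for the root table


def bound_chain(partition):
    """Collect bound conditions from the root down to the given partition."""
    conditions = []
    node = partition
    while node is not None:
        if node.bound:
            conditions.append(node.bound)
        node = node.parent
    return list(reversed(conditions))


def sampling_query(root, leaf, percentage, method="BERNOULLI"):
    """Build the sampling query for one leaf; the method is an assumed default."""
    where = " AND ".join(bound_chain(leaf))
    return f"SELECT * FROM {root.name} TABLESAMPLE {method}({percentage}) WHERE {where}"


orders = Partition("orders")
january = Partition("orders_january", "date = 'January'", orders)
leaf = Partition("orders_january_1", "cust_id >= 1 AND cust_id < 1000", january)

print(sampling_query(orders, leaf, 70))
```

With the deck's example this prints `SELECT * FROM orders TABLESAMPLE BERNOULLI(70) WHERE date = 'January' AND cust_id >= 1 AND cust_id < 1000`, which is the sample a statistics pass for that leaf would be computed from.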
29.
HypoPG Architecture
HypoPG uses four kinds of hooks
[Figure: EXPLAIN goes in and a query plan comes out; PostgreSQL holds the actual table T and its stat, HypoPG holds the hypothetical partitions T1, T2, T3 and their stats, looked up by table OID in the hypothetical scheme/statistics]
30.
HypoPG Architecture
(1) Using ProcessUtility_hook, HypoPG detects an EXPLAIN without the ANALYZE option and saves it in a local flag for the following steps
[Figure: EXPLAIN goes in and a query plan comes out; PostgreSQL holds the actual table T and its stat, HypoPG holds the hypothetical partitions T1, T2, T3 and their stats, looked up by table OID in the hypothetical scheme/statistics]
31.
HypoPG Architecture
(2) Using get_relation_info_hook, the hypothetical partitioning scheme is injected if the target table is hypothetically partitioned
[Figure: EXPLAIN goes in and a query plan comes out; PostgreSQL holds the actual table T and its stat, HypoPG holds the hypothetical partitions T1, T2, T3 and their stats, looked up by table OID in the hypothetical scheme/statistics]
32.
HypoPG Architecture
(3) Using get_relation_stats_hook, hypothetical statistics are injected, if they were created in advance, so that estimates are correct
[Figure: EXPLAIN goes in and a query plan comes out; PostgreSQL holds the actual table T and its stat, HypoPG holds the hypothetical partitions T1, T2, T3 and their stats, looked up by table OID in the hypothetical scheme/statistics]
33.
HypoPG Architecture
(4) Using ExecutorEnd_hook, the EXPLAIN flag is removed. Finally, a query plan using the hypothetical partitioning scheme is displayed!
[Figure: EXPLAIN goes in and a query plan comes out; PostgreSQL holds the actual table T and its stat, HypoPG holds the hypothetical partitions T1, T2, T3 and their stats, looked up by table OID in the hypothetical scheme/statistics]
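The four hook steps can be pictured as a small state machine. The Python class below is a toy model only: the real hooks are C function pointers inside PostgreSQL, the class, method, and partition names are hypothetical, and EXPLAIN detection here is a naive string check.

```python
class HypoSession:
    """Toy model of the four hook steps; not HypoPG's actual C implementation."""

    def __init__(self, schemes, stats):
        self.schemes = schemes      # table name -> hypothetical partition names
        self.stats = stats          # partition name -> hypothetical row estimate
        self.explain_flag = False

    def process_utility(self, statement):
        # (1) Flag a plain EXPLAIN; EXPLAIN ANALYZE would really run the query.
        head = statement.strip().upper().split("SELECT", 1)[0]
        self.explain_flag = head.startswith("EXPLAIN") and "ANALYZE" not in head

    def get_relation_info(self, table):
        # (2) Inject the hypothetical scheme only while the flag is set.
        if self.explain_flag and table in self.schemes:
            return self.schemes[table]
        return [table]

    def get_relation_stats(self, relation):
        # (3) Inject hypothetical statistics if they were created in advance.
        if self.explain_flag:
            return self.stats.get(relation)
        return None

    def executor_end(self):
        # (4) Clear the flag so later statements see the real table again.
        self.explain_flag = False

    def run(self, statement, table):
        self.process_utility(statement)
        plan = [(rel, self.get_relation_stats(rel))
                for rel in self.get_relation_info(table)]
        self.executor_end()
        return plan


session = HypoSession({"orders": ["orders_jan", "orders_feb"]},
                      {"orders_jan": 120, "orders_feb": 95})
print(session.run("EXPLAIN SELECT * FROM orders", "orders"))
# [('orders_jan', 120), ('orders_feb', 95)]
print(session.run("EXPLAIN ANALYZE SELECT * FROM orders", "orders"))
# [('orders', None)]
```

Gating every injection on the flag is what keeps ordinary queries, and EXPLAIN ANALYZE, from ever seeing partitions that do not physically exist.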
34.
What You Can Do
•Simulate RANGE/LIST/HASH Partitioning
•Simulate SELECT queries
⁃Partition Pruning, Partition-wise Join/Aggregate, N-way Join, Parallel Query
•Simulate multi-level hypothetical partitioning
•Simulate hypothetical indexes on hypothetical partitions
35.
Limitations
•Supports only plain tables (not partitioned tables)
•Supports only SELECT queries
•Does not support PostgreSQL 10 yet
36.
DEMO time!
•Create a hypothetical partitioning scheme and execute some simple queries
⁃A simple customers table to be partitioned actually:
    customers (cust_id integer PRIMARY KEY, name TEXT, address TEXT)
⁃A simple orders table to be partitioned hypothetically:
    hypo_orders (orders_id int PRIMARY KEY, cust_id int, price int, date date)
NOTE: for convenience, this table is named hypo_orders to quickly identify it in the plans.
37.
Future Works & Summary
38.
What's next?
•Support PostgreSQL 10
•Have a better integration in PostgreSQL core to ease our limitations
⁃Support UPDATE/DELETE/INSERT queries
⁃Support already partitioned tables
•Add automated advisor feature
39.
Summary
•HypoPG 2 will support hypothetical partitioning
⁃Allows users to try/simulate different hypothetical partitioning schemes on real tables and data
⁃Outputs query plans and costs with EXPLAIN using hypothetical partitioning schemes
•Hypothetical partitioning helps you design partitioning schemes
⁃You can quickly check how your queries would behave if certain tables were partitioned, without actually partitioning any tables or wasting any resources
⁃You can try different partitioning schemes in a short time
We'd be happy to get some feedback!
40.
THANK YOU!
Yuzuko Hosoya
hosoya.yuzuko@lab.ntt.co.jp