© 2013 EDB All rights reserved. 1
Implementing Parallelism in PostgreSQL
Robert Haas | PGCon 2014
• Between 1996 and 2004, single-threaded CPU performance on SPECint and SPECfp benchmarks increased by >50% per year. Between 2004 and 2012, it increased by ~21% per year.
− http://preshing.com/20120208/a-look-back-at-single-threaded-cpu-performance/
• Single-threaded 7-zip performance was only 39% faster on 2 x Intel Xeon L5640 (March 16, 2010; 2.7 GHz, 12 MB cache) than on 4 x AMD Opteron 880 (September 26, 2005; 2.4 GHz, 2 MB cache). That's only 7.7% per year.
− http://www.anandtech.com/show/6825/inside-anandtech-2013-cpu-performance
Parallelism: Why? (1)
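The per-year figure for the 7-zip comparison can be checked with a compound-growth calculation. The sketch below assumes the gain compounds annually and uses a 365.25-day year; the function name is made up for illustration. With the two release dates quoted on the slide it lands a little under 8% per year, in line with the slide's number.

```python
from datetime import date

def annualized_growth(total_gain: float, start: date, end: date) -> float:
    """Compound annual growth rate implied by a total fractional gain."""
    years = (end - start).days / 365.25  # assumption: average year length
    return (1.0 + total_gain) ** (1.0 / years) - 1.0

# The 39% gain and both dates are the ones quoted on the slide.
rate = annualized_growth(0.39, date(2005, 9, 26), date(2010, 3, 16))
print(f"{rate:.1%} per year")
```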
• Dell Configuration Tool (as of 2014-05-15):
− 2x Intel® Xeon® E7-4890 v2 Processor 2.8GHz, 37.5M Cache, 8.0 GT/s QPI, Turbo, 15 Core, 155W [add $7,735.68]
− 2x Intel® Xeon® E7-8893 v2 Processor 3.4GHz, 37.5M Cache, 8.0 GT/s QPI, Turbo, 6 Core, 155W [add $8,410.30]
Parallelism: Why? (2)
Hash Join
  Join Cond: foo.x = bar.x
  → Seq Scan on foo
      Filter: something_complicated
  → Hash
      → Seq Scan on bar
• One backend could run the Seq Scan and apply the filter condition; it could then stream the results to another backend to perform the Hash Join.
Parallel Query: Inter-Node
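The inter-node idea (one process scans and filters, another joins) is a producer/consumer pipeline. The sketch below is only a toy model: Python threads and a bounded `queue.Queue` stand in for backends and a message queue, and the tables, filter, and function names are invented for illustration. It is not how PostgreSQL implements anything.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream

def scan_and_filter(rows, predicate, out):
    """Producer: scan rows in order, apply the filter, stream the survivors."""
    for row in rows:
        if predicate(row):
            out.put(row)
    out.put(DONE)

def hash_join(inner_rows, stream, key):
    """Consumer: build a hash table on the inner side, probe with streamed rows."""
    table = {}
    for row in inner_rows:
        table.setdefault(row[key], []).append(row)
    joined = []
    while True:
        outer = stream.get()
        if outer is DONE:
            return joined
        for inner in table.get(outer[key], []):
            joined.append((outer, inner))

# Hypothetical stand-ins for tables foo and bar.
foo = [{"x": i, "payload": i * i} for i in range(10)]
bar = [{"x": i, "tag": f"b{i}"} for i in range(0, 10, 2)]

channel = queue.Queue(maxsize=4)  # bounded, like a fixed-size message queue
producer = threading.Thread(
    target=scan_and_filter,
    args=(foo, lambda r: r["payload"] % 3 == 0, channel),
)
producer.start()
result = hash_join(bar, channel, "x")
producer.join()
```

Because the channel is bounded, the scanning side blocks when the joining side falls behind, which is the back-pressure a pipe would give.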
Hash Join
  Join Cond: foo.x = bar.x
  → Seq Scan on foo
      Filter: something_complicated
  → Hash
      → Seq Scan on bar
• Multiple backends could cooperate to perform Seq Scan – or Hash Join.
Parallel Query: Intra-Node
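For intra-node parallelism, several workers split one Seq Scan between them. One way to picture this is a shared counter from which each worker claims the next range of blocks. In the sketch below, threads stand in for backends, a lock-protected integer stands in for an atomic counter in shared memory, and the table layout and all names are hypothetical.

```python
import threading

class BlockAllocator:
    """Hands out consecutive block ranges; the lock stands in for an atomic counter."""

    def __init__(self, nblocks, chunk):
        self.nblocks = nblocks
        self.chunk = chunk
        self.next_block = 0
        self.lock = threading.Lock()

    def claim(self):
        """Return the next unclaimed range of blocks, or None when the scan is done."""
        with self.lock:
            if self.next_block >= self.nblocks:
                return None
            start = self.next_block
            self.next_block = min(start + self.chunk, self.nblocks)
            return range(start, self.next_block)

def scan_worker(alloc, table, predicate, out):
    """Keep claiming block ranges until none are left, collecting matching rows."""
    while True:
        blocks = alloc.claim()
        if blocks is None:
            return
        for b in blocks:
            out.extend(row for row in table[b] if predicate(row))

# Hypothetical table: 16 blocks holding 4 integer "rows" each.
table = [[b * 4 + i for i in range(4)] for b in range(16)]
alloc = BlockAllocator(nblocks=len(table), chunk=3)
partials = [[] for _ in range(4)]  # one private result list per worker
threads = [
    threading.Thread(target=scan_worker, args=(alloc, table, lambda r: r % 5 == 0, p))
    for p in partials
]
for t in threads:
    t.start()
for t in threads:
    t.join()
matches = sorted(row for p in partials for row in p)
```

Which worker scans which range varies from run to run, but every block is claimed exactly once, so the merged result is the same as a serial scan.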
• CREATE INDEX
− Parallel Heap Scan
− Parallel Sort
• VACUUM
− Parallel Heap Scan
− Worker Per Index (suggestion from Andres and Heikki)
Parallel Maintenance / DDL Commands
• Processes – Not Threads
− None of our fundamental subsystems are thread-safe (e.g. palloc/pfree, ereport, syscache, relcache, buffer manager).
− Making them thread-safe would add synchronization overhead even in the single-threaded case – and also bugs.
• Started By Postmaster – Not created via fork()
− Can't fork() on Windows, where many of our users are.
− Currently, all backends are direct children of the postmaster; seems preferable to keep it that way.
Architectural Overview (1)
• Shared Memory – Not Pipes or Files
− Files would cause more system calls and more I/O.
− Pipes are a good paradigm, but shared memory is more flexible.
− We can use shared memory to emulate a pipe if we need to – see shm_mq. (This also dodges platform dependencies.)
• Dynamic Shared Memory – Not Main Segment
− For an application such as parallel sort, we might need a LOT of memory, like a terabyte. We can't pre-reserve that!
• Dynamic Shared Memory Could Be At a Different Address in Every Process
− No good, general technique exists for mapping a segment at the same address in every process.
Architectural Overview (2)
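Two points from the shared-memory slide can be shown together: a pipe can be emulated inside a flat block of memory, and when a segment may sit at a different address in each process, everything in it has to be located by offset, not by pointer. The sketch below is a toy single-producer, single-consumer ring queue over a `bytearray` that stands in for a shared segment. The layout and names are invented for illustration and are not the shm_mq format, and a real queue would need memory barriers and a way to wait.

```python
import struct

U64 = struct.Struct("<Q")   # 8-byte counter
LEN = struct.Struct("<I")   # 4-byte length prefix for each message
READ_OFF, WRITE_OFF, DATA_OFF = 0, 8, 16  # offsets from the start of the segment

class RingQueue:
    """Byte queue stored entirely inside `buf`, addressed only by offsets."""

    def __init__(self, buf, create=False):
        self.buf = buf
        self.capacity = len(buf) - DATA_OFF
        if create:
            U64.pack_into(buf, READ_OFF, 0)
            U64.pack_into(buf, WRITE_OFF, 0)

    def _counter(self, off):
        return U64.unpack_from(self.buf, off)[0]

    def _put(self, pos, data):
        for i, byte in enumerate(data):
            self.buf[DATA_OFF + (pos + i) % self.capacity] = byte

    def _get(self, pos, n):
        return bytes(self.buf[DATA_OFF + (pos + i) % self.capacity] for i in range(n))

    def send(self, payload):
        """Append one message; return False if there is not enough free space."""
        rd, wr = self._counter(READ_OFF), self._counter(WRITE_OFF)
        need = LEN.size + len(payload)
        if need > self.capacity - (wr - rd):
            return False  # full; a real queue would wait here
        self._put(wr, LEN.pack(len(payload)) + payload)
        U64.pack_into(self.buf, WRITE_OFF, wr + need)  # only the sender moves this
        return True

    def receive(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        rd, wr = self._counter(READ_OFF), self._counter(WRITE_OFF)
        if rd == wr:
            return None
        (n,) = LEN.unpack(self._get(rd, LEN.size))
        payload = self._get(rd + LEN.size, n)
        U64.pack_into(self.buf, READ_OFF, rd + LEN.size + n)  # only the receiver moves this
        return payload

segment = bytearray(DATA_OFF + 32)         # stand-in for a shared memory segment
sender = RingQueue(segment, create=True)
receiver = RingQueue(segment)              # a second "attachment" to the same bytes
sender.send(b"tuple-1")
sender.send(b"tuple-2")
first = receiver.receive()
```

Nothing in the queue stores a memory address, so the same bytes would be readable wherever the segment happened to be mapped.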
• Basic Facilities (done in 9.4)
• Plumbing (some work done/in progress)
• Parallel Environment (a little unpublished work done)
• Parallel Execution (some study/thought)
• Parallel Planning (no idea yet)
What Do We Need To Build?
• Dynamic Background Workers (done in 9.4)
• Dynamic Shared Memory (done in 9.4)
Basic Facilities
• DSM Table of Contents (done in 9.4)
− I just mapped this dynamic shared memory segment; how do I figure out what it contains?
• Message Queueing (done in 9.4)
− How does a background worker send tuples, errors, notices, etc. to a user backend?
• Error Propagation (working on it)
− Common infrastructure to make using message queueing easy.
• Shared Memory Allocator (early draft posted)
• Shared Hash Table (someday)
Plumbing
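The table-of-contents question ("I just mapped this segment; how do I figure out what it contains?") comes down to a small directory at a known place in the segment that maps agreed-upon keys to offsets. The sketch below illustrates that idea over a `bytearray`. The layout, the magic number, the key value, and the class name are all assumptions made for this example and do not describe PostgreSQL's actual structures.

```python
import struct

TOC_HEADER = struct.Struct("<II")  # magic number, number of entries
TOC_ENTRY = struct.Struct("<QQ")   # key, offset from the start of the segment

class SegmentToc:
    """Directory at offset 0 of a segment mapping integer keys to offsets."""

    def __init__(self, buf, magic, max_entries=8, create=False):
        self.buf = buf
        self.max_entries = max_entries
        if create:
            TOC_HEADER.pack_into(buf, 0, magic, 0)
        else:
            found, _ = TOC_HEADER.unpack_from(buf, 0)
            if found != magic:
                raise ValueError("segment does not carry the expected magic number")

    def insert(self, key, offset):
        magic, n = TOC_HEADER.unpack_from(self.buf, 0)
        if n >= self.max_entries:
            raise MemoryError("table of contents is full")
        TOC_ENTRY.pack_into(self.buf, TOC_HEADER.size + n * TOC_ENTRY.size, key, offset)
        TOC_HEADER.pack_into(self.buf, 0, magic, n + 1)  # publish the entry last

    def lookup(self, key):
        _, n = TOC_HEADER.unpack_from(self.buf, 0)
        for i in range(n):
            k, off = TOC_ENTRY.unpack_from(self.buf, TOC_HEADER.size + i * TOC_ENTRY.size)
            if k == key:
                return off
        return None

MAGIC = 0x50474D51        # hypothetical value agreed on by both sides
KEY_QUERY_TEXT = 1        # hypothetical key

# The creating process lays out the segment and records where things are.
segment = bytearray(512)
creator = SegmentToc(segment, MAGIC, create=True)
data_start = TOC_HEADER.size + 8 * TOC_ENTRY.size  # first byte after the directory
text = b"SELECT 1"
segment[data_start:data_start + len(text)] = text
creator.insert(KEY_QUERY_TEXT, data_start)

# The attaching process knows only the magic number and the key.
attached = SegmentToc(segment, MAGIC)
offset = attached.lookup(KEY_QUERY_TEXT)
recovered = bytes(segment[offset:offset + len(text)])
```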
• Make the Background Worker Look Enough Like a Regular User Backend To Do Useful Work
− Copy Relevant State (e.g. User, Database, Snapshot)
• Useful Work Doesn't Mean Everything
− Some operations seem fundamentally unsafe in a parallel context (e.g. calling a user-defined function that sets a GUC).
− Some operations could theoretically be made safe, but we might not bother (e.g. setseed() + random()).
− Even if we share lots of state, arbitrary user-supplied code can never be safe; must label unsafe functions.
Parallel Environment
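The point about labeling unsafe functions suggests a simple gate: a piece of work may go to a background worker only if every function it calls carries a "safe" label, and anything unlabeled is treated as unsafe. The sketch below illustrates that rule. The label names, the catalog contents, and the helper are assumptions for this example, not an existing interface.

```python
PARALLEL_SAFE, PARALLEL_UNSAFE = "safe", "unsafe"

# Hypothetical catalog of labels; the assignments are illustrative only.
FUNCTION_LABELS = {
    "lower": PARALLEL_SAFE,
    "abs": PARALLEL_SAFE,
    "setseed": PARALLEL_UNSAFE,     # depends on per-backend random state
    "random": PARALLEL_UNSAFE,
    "set_config": PARALLEL_UNSAFE,  # changes a GUC
}

def can_run_in_worker(functions_called):
    """True only if every function is explicitly labeled safe."""
    return all(
        FUNCTION_LABELS.get(name, PARALLEL_UNSAFE) == PARALLEL_SAFE
        for name in functions_called
    )

ok = can_run_in_worker(["lower", "abs"])
blocked = can_run_in_worker(["lower", "random"])
unknown = can_run_in_worker(["my_user_function"])  # unlabeled user code
```

Defaulting unlabeled functions to unsafe matches the slide's argument that arbitrary user-supplied code cannot be assumed safe.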
• User ID and Database
• GUCs
• Transaction State
• Current and Active Snapshot
• Combo CID Hash
Parallel Environment: What To Copy
• Sequence Operations
• Generation of Invalidation Messages
• Cursor Operations
• Large Object Manipulation
• LISTEN/NOTIFY
• Access to Temporary Buffers
• Prepared Statements
Parallel Environment: What To Prohibit
• Background Workers Can't Rely on User Backend To Hold Necessary Locks
− The user backend might die or be killed before the background worker terminates.
• If Background Workers Re-Lock The Same Relations, Parallel Query Might Self Deadlock
− User backend locks X; another process queues for a conflicting lock on X; background worker tries to re-lock X.
• Probably Need a Concept of Locking Groups Inside the Lock Manager
Parallel Environment: Lock Management
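The self-deadlock scenario can be replayed with a toy lock that has two modes and a FIFO wait queue. In the sketch below a new request waits if it conflicts with a current holder or with an earlier waiter. With the locking-group rule switched on, a request from a process whose group already holds the lock is granted at once. The two-mode model, the class, and the process names are simplifications invented for this example and are far cruder than a real lock manager.

```python
from collections import deque

def conflicts(a, b):
    """Two lock modes conflict unless both are 'share'."""
    return a == "exclusive" or b == "exclusive"

class Lock:
    def __init__(self, use_groups):
        self.use_groups = use_groups
        self.holders = []        # (process, group, mode)
        self.waiters = deque()   # (process, group, mode), first in, first out

    def acquire(self, process, group, mode):
        """Return True if granted immediately, False if the request had to queue."""
        if self.use_groups and any(g == group for _, g, _ in self.holders):
            # Simplification: group members never block each other
            # and may bypass the wait queue.
            self.holders.append((process, group, mode))
            return True
        blocked_by_holder = any(conflicts(mode, m) for _, _, m in self.holders)
        blocked_by_waiter = any(conflicts(mode, m) for _, _, m in self.waiters)
        if blocked_by_holder or blocked_by_waiter:
            self.waiters.append((process, group, mode))
            return False
        self.holders.append((process, group, mode))
        return True

def scenario(use_groups):
    """User backend locks X, another process queues, then the worker re-locks X."""
    x = Lock(use_groups)
    x.acquire("user backend", group=1, mode="share")
    x.acquire("other process", group=2, mode="exclusive")  # queues behind the holder
    return x.acquire("background worker", group=1, mode="share")

without_groups = scenario(use_groups=False)
with_groups = scenario(use_groups=True)
```

Without groups the worker queues behind the other process, which waits for the user backend, which in turn waits for its worker to finish: a cycle. With groups the worker is granted the lock and the cycle never forms.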
• This is the “easy” part.
• Parallel sorting algorithms are described in the literature and well-understood.
• For parallel sequential scan, grab blocks or block ranges in alternation.
• Amdahl's Law: If α is the fraction of running time a program spends executing serially, the maximum speedup from parallelism is 1/α.
Parallel Execution
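Amdahl's Law in its usual finite form gives the speedup with n workers as 1 / (α + (1 − α) / n), which approaches the 1/α bound as n grows. The sketch below evaluates it for a few worker counts. The serial fraction of 0.1 is an arbitrary example value, not a measurement.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Speedup with `workers` processes when `serial_fraction` of the run time is serial."""
    if not 0.0 < serial_fraction <= 1.0 or workers < 1:
        raise ValueError("need 0 < serial_fraction <= 1 and workers >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

alpha = 0.1  # example value: 10% of the run time cannot be parallelized
speedups = {n: amdahl_speedup(alpha, n) for n in (1, 2, 4, 8, 16, 1024)}
limit = 1.0 / alpha  # the bound the slide quotes

for n, s in speedups.items():
    print(f"{n:5d} workers: {s:5.2f}x (limit {limit:.0f}x)")
```

Even with 1024 workers the speedup stays below the limit, which is why shrinking the serial part matters more than adding workers past a certain point.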
• This is probably hard.
• Right now, we do costing based on estimating the page access costs (CPU and I/O) and tuple processing costs.
• For parallelism, need to consider worker startup costs and IPC costs.
• A plan that's a little cheaper for me might be much more expensive in total.
Parallel Query Planning
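The planning concern can be made concrete with a toy cost model that adds a per-worker startup charge and a per-tuple IPC charge to the usual page and tuple costs. It reports two numbers: the cost the query itself experiences, and the total summed over all processes. Every constant, the formula, and the names below are invented for illustration and are not PostgreSQL's cost model.

```python
from dataclasses import dataclass

@dataclass
class CostModel:
    page_cost: float = 1.0           # per page accessed (CPU and I/O)
    tuple_cost: float = 0.01         # per tuple processed
    worker_startup: float = 1000.0   # per background worker launched
    ipc_cost: float = 0.1            # per tuple sent back to the user backend

def serial_cost(m, pages, tuples):
    return pages * m.page_cost + tuples * m.tuple_cost

def parallel_costs(m, pages, tuples, out_tuples, workers):
    """Return (cost as seen by this query, total cost summed over all processes)."""
    work = serial_cost(m, pages, tuples)
    startup = workers * m.worker_startup
    ipc = out_tuples * m.ipc_cost
    elapsed = startup + work / workers + ipc  # assumes the work divides evenly
    total = startup + work + ipc
    return elapsed, total

m = CostModel()

# Large scan: parallelism is cheaper for this query but costs more overall.
big_serial = serial_cost(m, pages=100_000, tuples=10_000_000)
big_elapsed, big_total = parallel_costs(
    m, pages=100_000, tuples=10_000_000, out_tuples=10_000, workers=4
)

# Small scan: startup cost dominates, so the serial plan wins outright.
small_serial = serial_cost(m, pages=100, tuples=10_000)
small_elapsed, small_total = parallel_costs(
    m, pages=100, tuples=10_000, out_tuples=100, workers=4
)
```

In the large case the parallel plan looks cheaper to the query that runs it while consuming more resources in total, which is the tension described in the last bullet.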
• Any questions?
Thanks.
