Benchmarking for PostgreSQL workloads in Kubernetes (part 2)
Gabriele Bartolini
#109 Data on Kubernetes (DOK) Webinar
16 December 2021
2021 Copyright © EnterpriseDB Corporation All Rights Reserved
Today’s speakers
Gabriele Bartolini
VP of Cloud Native at EDB
PostgreSQL user since ~2000
• Community member since 2006
• Co-founder of PostgreSQL Europe
Previously at 2ndQuadrant, from 2008 to 2020
• Co-founder
• Head of Global Support
• Cloud Native Initiative Lead
• Founding member of Barman
DevOps evangelist
Twitter: @_GBartolini_ / @EDBPostgres
EDB and Kubernetes
● Major sponsor of the PostgreSQL project
● Kubernetes Certified Service Provider (KCSP)
● Silver Member of CNCF & Linux Foundation
● Platinum founding sponsor of the Data on Kubernetes Community
We have contributed to the PostgreSQL community every year since 2006, making major feature contributions.
We had 32 contributors in PostgreSQL 14, including 7 code committers and 3 core members.
Bringing PostgreSQL to Kubernetes
Agenda
• Key takeaways from DoK #58
• A day in the life of a Postgres transaction
• Recommended architectures
• Our methodology
• Conclusions
DoK #58 webinar
Recap
Why Kubernetes? Why PostgreSQL?
Cloud native culture with a highly versatile SQL driven database
● “Cloud Native” is much more than tools (Kubernetes)
○ Patterns/architectures (microservices, operators, ...)
○ Principles/culture (devops/lean/agile, velocity, automation, pervasive quality and security processes, …)
● Kubernetes is becoming popular for stateful workloads, including databases:
○ Please refer to dok.community/dokc-2021-report/ for details
○ Reasons: storage classes, local persistent volumes, the operator pattern
● PostgreSQL is based on 25+ years of evolutionary innovation
○ Linux : Operating System = Postgres : Database
○ Database of the year in 2017, 2018, and 2020 at db-engines.com
○ Some of its main features:
■ Native streaming replication, both physical and logical, sync and async, cascading
■ Online Continuous Backup and Point In Time Recovery
■ Declarative Partitioning
■ Parallel queries
■ Extensibility and extensions (e.g. PostGIS)
■ JSON support (SQL/noSQL hybrid databases)
■ ACID transactions
Benchmarking PostgreSQL
Workloads, storage and the database
● Storage
○ Write Ahead Log (WAL), or historically xlog
■ Sequential writes and fsync
○ Shared buffers cleaning
■ By the checkpointer, the background writer (bgwriter), or an individual backend
■ Random writes (and OS cache)
○ Page reads
■ Random reads
○ Optimization: Table scans
■ Sequential reads
○ IOPS and throughput capping in cloud environments
● Database
○ Workloads: in-memory, OLTP, and OLAP
○ Initial focus: TPS on large OLTP workloads (RAM < DB size)
○ pgbench
● We introduced cnp-bench
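As a rough illustration of why capping matters for random page I/O, the sketch below converts an IOPS cap into the maximum throughput reachable when every operation moves a single 8kB page. The 5,000 IOPS figure and the function name are hypothetical, chosen only for the example; real limits depend on the provider and the volume type.

```python
PAGE_SIZE_BYTES = 8 * 1024  # PostgreSQL's default page size (8kB)


def max_throughput_mib_s(iops_cap, block_size_bytes=PAGE_SIZE_BYTES):
    """Upper bound on throughput (MiB/s) when each I/O moves one block."""
    return iops_cap * block_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical cap, for illustration only.
    print(f"{max_throughput_mib_s(5000):.1f} MiB/s with 8kB random I/O")
```

Sequential workloads such as WAL writes and table scans can use larger requests, so the same volume may be limited by its throughput cap rather than by its IOPS cap.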
Know your storage
You need to trust it
● Make sure you benchmark your storage before you go into production
● Make sure you benchmark your storage before you test your database
○ Storage can become your bottleneck
■ If your storage is slow, your database will be slow
● Please refer to DoK #58 for more information on storage benchmarking
○ Use cnp-bench
○ Use fio directly
A day in the life of a transaction
(VERY simplified view)
Disclaimer
Postgres internals are more complex than this. The following is a simplified view for clarity.
Another “brick” in the WAL
[Diagram. DISCLAIMER: simplified view for didactic purposes]
● Postgres backend
● Shared Buffers (Postgres cache): 8kB pages
● Disk (pg_wal): WAL file segments, usually 16MB in size
○ Transaction log: sequential writes, fsync-ed
○ WAL file segment: ready to be recycled
● Checkpoint: regularly the database cache (“dirty pages”) is flushed to disk
● Disk (PGDATA): 8kB pages
○ e.g. random writes, random reads, seq scans
Recommended architectures for PostgreSQL
PostgreSQL architectures in Business Continuity
Always plan for benchmarking the final production architecture
● Start with one instance to spot the major bottlenecks
○ e.g. storage
● Then move to a real life production architecture
● Consider your Business Continuity goals
○ Disaster recovery - primarily focused on Recovery Point Objective (RPO)
○ High Availability - primarily focused on Recovery Time Objective (RTO)
○ Plan your production database architectures with both RTO and RPO in mind
● PostgreSQL provides the fundamental blocks for Business Continuity
○ Continuous backup and Point In Time Recovery
■ Base backups
■ WAL archiving
○ Native streaming replication based on the Write Ahead Log (WAL)
● The WAL is central in PostgreSQL
● To keep your data safe, managing the above in Kubernetes requires an operator written by Postgres experts
The criticality of the WAL in day-to-day Postgres
[Diagram. DISCLAIMER: simplified view for didactic purposes]
● Postgres backend and Shared Buffers (Postgres cache)
● Disk (PGDATA)
● Disk (pg_wal): WAL file segments
○ Ready to be archived: archive_command ships them to the WAL archive
○ wal_sender(s): streaming replication to the Replicas (standby)
○ Ready to be recycled
● Potential Bottlenecks!
Recommended architecture
[Diagram]
● Kubernetes cluster / namespace
○ zone 1: Node with local storage, hosting the Primary
○ zone 2: Node with local storage, hosting the Sync Standby
○ zone 3: Node with local storage, hosting the Potential Sync Standby
● Streaming Replication
● Continuous backup (WAL archiving) to the base backups and WAL archive
● restore_command
Some potential bottlenecks and issues
A summary of the major issues that bottlenecks can cause
● WAL writing: local storage
○ Slow system
● WAL archiving: serialized process, network, remote storage, compression (if applicable)
○ A bottleneck causes WAL files to pile up on the volume where pg_wal is located, eventually causing Postgres to halt
● Streaming replication: network, remote storage
○ wal_keep_segments (up to PostgreSQL 12) / wal_keep_size (PostgreSQL 13 and later)
■ Beyond this threshold, WAL files are recycled on the primary and the standby falls out of sync
○ replication slots
■ The primary keeps track of the location in the WAL needed by a standby and retains the WAL files
● Same issue as WAL archiving: WAL files pile up and Postgres risks halting
○ synchronous replication
■ A bottleneck here slows down writes on the primary
■ If all synchronous standby servers are down, the primary stops accepting writes (never use a single synchronous standby)
● Restore command: serialized process, network, remote storage, decompression (if applicable)
○ When a standby cannot start streaming replication, it relies on WAL files from the archive
○ Delayed standby - possible impact on RPO and RTO in case of failover of the primary
● Standby replay: single process
○ Delayed standby - possible impact on RPO and RTO in case of failover of the primary
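The pile-up risk described above for WAL archiving and replication slots can be sketched with simple arithmetic: if WAL is produced faster than it is archived (or consumed by a standby), the backlog grows until the pg_wal volume is full. The following is a deliberately simplified model, not how Postgres accounts for WAL internally; the function name and all the rates are hypothetical values for illustration.

```python
from typing import Optional


def seconds_until_wal_volume_full(
    wal_generated_mb_s: float,
    wal_archived_mb_s: float,
    free_space_mb: float,
) -> Optional[float]:
    """Seconds until the pg_wal volume fills up, or None if archiving keeps up.

    Simplified model: constant rates, no recycling effects, no checkpoints.
    """
    backlog_growth_mb_s = wal_generated_mb_s - wal_archived_mb_s
    if backlog_growth_mb_s <= 0:
        return None  # the archiver is at least as fast as WAL generation
    return free_space_mb / backlog_growth_mb_s


if __name__ == "__main__":
    # Hypothetical figures: 80 MB/s produced, 50 MB/s archived, 300,000 MB free.
    remaining = seconds_until_wal_volume_full(80, 50, 300_000)
    print(f"pg_wal volume full in about {remaining / 3600:.1f} hours")
```

Feeding the model with rates observed during a benchmark gives a first estimate of how long a sustained write peak can last before the primary is at risk.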
Dedicated resources
Prefer shared nothing architectures, even in Kubernetes
● If you can, dedicate a Kubernetes node to one Postgres instance only
○ Take advantage of Pod scheduling capabilities and availability zones (where available)
■ pod affinity/anti-affinity
■ node selectors
■ tolerations
○ Properly set resource requests and limits
■ Guaranteed QoS is recommended
● If you can, use local storage on the dedicated node
○ Benchmark throughput
○ In the public cloud, watch out for IOPS limitations
● Cost/benefit analysis
○ One more reason why benchmarking is fundamental to proper and effective capacity planning
○ It’s your choice, and yours only
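Kubernetes assigns the Guaranteed QoS class to a Pod only when every container sets CPU and memory limits and its requests equal those limits. A minimal sketch of that check on a Pod-like dictionary is shown below. The function name and the sample values are hypothetical; the check compares quantities as plain strings (so "7" and "7000m" would not match), and it looks only at the containers it is given.

```python
def is_guaranteed_qos(containers):
    """True if every container sets CPU and memory requests equal to its limits."""
    if not containers:
        return False
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            # Both the limit and an identical request must be present.
            if name not in limits or requests.get(name) != limits[name]:
                return False
    return True


if __name__ == "__main__":
    # Hypothetical container spec sized like a dedicated database node.
    postgres = {
        "name": "postgres",
        "resources": {
            "requests": {"cpu": "7", "memory": "56Gi"},
            "limits": {"cpu": "7", "memory": "56Gi"},
        },
    }
    print(is_guaranteed_qos([postgres]))
```

A Pod whose requests are lower than its limits would fall into the Burstable class instead, which makes it a more likely candidate for eviction under node pressure.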
Our methodology
How we’re benchmarking Postgres on K8s
Observe and let numbers and diagrams help you discover issues
● We rely on:
○ cnp-sandbox
■ Prometheus, Grafana, Cloud Native PostgreSQL operator (EDB)
○ cnp-bench
■ on existing clusters
○ pg_stat_statements
● You can use your own PostgreSQL setup
○ Your favorite operator
● You can use your favorite observability tools
○ Your own Prometheus/Grafana
○ Something else (you should know what to look for now!)
A sandbox for Cloud Native PostgreSQL
cnp-sandbox is an open source Helm chart
● Deploys a sandbox environment in Kubernetes with:
○ Prometheus
○ Grafana
○ Cloud Native PostgreSQL with:
■ a selection of PostgreSQL metrics for the native Prometheus exporter in CNP
■ a custom Grafana dashboard developed by EDB for Cloud Native PostgreSQL
● Main goals:
○ Evaluate Cloud Native PostgreSQL’s observability with Prometheus and Grafana
○ Integrate benchmarks with real-time collected data
● Suitable for pre-production and staging environments
○ Production environments should have their own Prometheus and Grafana installations
○ Metrics and dashboards can be reused
● URL: github.com/EnterpriseDB/cnp-sandbox
Deployment
helm repo add cnp-sandbox https://enterprisedb.github.io/cnp-sandbox/
helm repo update
helm upgrade --install cnp-sandbox cnp-sandbox/cnp-sandbox
cnp-bench
Benchmarking the storage and a PostgreSQL database
● Storage benchmarking with fio
● Database benchmarking and stress testing with:
○ pgbench
○ HammerDB
● Can be run against an existing Postgres database
○ Integrated with Cloud Native PostgreSQL, including pgBouncer for connection pooling
● Suitable for pre-production and staging environments
● URL: https://github.com/EnterpriseDB/cnp-bench
An example of pgbench initialization
cnp:
  existingCluster: true
  existingCredentials: pg-14-app
  existingHost: pg-14-rw
  existingDatabase: pgbench
  image: quay.io/enterprisedb/postgresql:14.1
pgbench:
  nodeSelector:
    workload: pgbench
  scaleFactor: 8000
  initialize: true
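To get a feel for what a scale factor of 8000 means, the sketch below computes the row counts that pgbench creates per table (each unit of scale adds 100,000 rows to pgbench_accounts, 10 to pgbench_tellers and 1 to pgbench_branches). The size estimate simply divides the roughly 120GB reported in the notes of this deck by the scale factor; it is an approximation, and the function names are hypothetical.

```python
ACCOUNTS_PER_SCALE_UNIT = 100_000  # rows in pgbench_accounts per unit of scale
TELLERS_PER_SCALE_UNIT = 10        # rows in pgbench_tellers per unit of scale
BRANCHES_PER_SCALE_UNIT = 1        # rows in pgbench_branches per unit of scale


def pgbench_row_counts(scale_factor):
    """Rows created by pgbench initialization for a given scale factor."""
    return {
        "pgbench_accounts": scale_factor * ACCOUNTS_PER_SCALE_UNIT,
        "pgbench_tellers": scale_factor * TELLERS_PER_SCALE_UNIT,
        "pgbench_branches": scale_factor * BRANCHES_PER_SCALE_UNIT,
    }


def mb_per_scale_unit(total_size_gb, scale_factor):
    """Approximate database size (decimal MB) per unit of scale factor."""
    return total_size_gb * 1000 / scale_factor


if __name__ == "__main__":
    print(pgbench_row_counts(8000))
    print(f"about {mb_per_scale_unit(120, 8000):.0f} MB per unit of scale")
```

The per-unit figure is handy for choosing a scale factor that makes the database larger than the available RAM, which is the OLTP case this deck focuses on.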
An example of pgbench run
cnp:
  existingCluster: true
  existingCredentials: pg-14-app
  existingHost: pg-14-rw
  existingDatabase: pgbench
  image: quay.io/enterprisedb/postgresql:14.1
pgbench:
  nodeSelector:
    workload: pgbench
  initialize: false
  skipVacuum: true
  reportLatencies: true
  time: 600
  clients: 64
  jobs: 128
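For readers who want to reproduce a similar run without the chart, the sketch below builds a plain pgbench command line from the same settings. The mapping from the values above to pgbench options (-n, -r, -T, -c, -j) is an assumption made for illustration, not a description of what cnp-bench does internally; connection options such as host and credentials are left out.

```python
import shlex


def pgbench_args(settings, database):
    """Build a pgbench argument list from chart-like settings (assumed mapping)."""
    args = ["pgbench"]
    if settings.get("skipVacuum"):
        args.append("-n")  # do not vacuum before running the test
    if settings.get("reportLatencies"):
        args.append("-r")  # report per-statement latencies
    args += ["-T", str(settings["time"])]     # duration in seconds
    args += ["-c", str(settings["clients"])]  # concurrent client sessions
    args += ["-j", str(settings["jobs"])]     # pgbench worker threads
    args.append(database)
    return args


if __name__ == "__main__":
    run = {
        "skipVacuum": True,
        "reportLatencies": True,
        "time": 600,
        "clients": 64,
        "jobs": 128,
    }
    print(shlex.join(pgbench_args(run, "pgbench")))
```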
Notes:
● 5 x 10min pgbench tests
● scale factor 8000 (120GB)
● 3 dedicated nodes:
○ AKS Standard_E8s_v4
○ 7 cores/56Gi RAM
○ Guaranteed QoS
○ Premium P80 storage class
● 1 MinSync replication
● Azure Blob Container (backup)
● pgbench on AKS Standard_D64s_v4
Another example showing ~ 13k tps with 32 cores
Bottleneck: serialized WAL archiving
Anticipate and avoid this scenario!
36k WAL files piled up!
What if the primary dies now?
Parallel archiving (1)
Remediation: parallel WAL archiving and large segment size (64MB)
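To put these numbers in perspective, the sketch below estimates how much disk space a backlog of roughly 36,000 WAL files takes, and how many files the same volume of WAL would need with 64MB segments. It assumes the backlog was made of default 16MB segments, which the slides do not state explicitly; the function names are hypothetical.

```python
def backlog_size_gib(wal_files, segment_size_mib=16):
    """Disk space (GiB) taken by a backlog of WAL segments."""
    return wal_files * segment_size_mib / 1024


def files_for_same_volume(wal_files, old_segment_mib, new_segment_mib):
    """Number of segments needed to hold the same WAL volume at a new size."""
    return wal_files * old_segment_mib // new_segment_mib


if __name__ == "__main__":
    print(f"{backlog_size_gib(36_000):.1f} GiB waiting to be archived")
    print(f"{files_for_same_volume(36_000, 16, 64)} files with 64MB segments")
```

Fewer, larger files mean fewer archive_command invocations for the same amount of WAL, which is one reason a larger segment size can help alongside parallel archiving.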
Parallel archiving (2)
Might be OK (bulk loads or vacuums)
There’s more ...
What’s next with cnp-bench
Our plan for H1 2022
● Manage an increasing number of client connections
● Manage repetitions
● Support custom pgbench scripts
● Improve support for HammerDB
● Introduce application level benchmarking
○ Web application load generation with hey
○ Front end scalability
About Cloud Native PostgreSQL
The Kubernetes operator from EDB
● It is currently closed source
○ Available for trials
● Fully declarative
● Integrated with the Kubernetes API server (no external tool for failover)
● Directly manages persistent volumes
● Our intention is to open source Cloud Native PostgreSQL in 2022
● It is the component that manages PostgreSQL in the data layer of BigAnimal
Conclusions
Why is benchmarking PostgreSQL important?
● Data is the most important asset of an organization
● Data can live in Kubernetes, in reliable databases like PostgreSQL
● Don’t leave anything to chance
○ Benchmark your storage and know its limits
○ Benchmark your database and know its limits
● Benchmark before you go to production
○ You might not be able to benchmark when in production
● Strongly consider dedicating storage and nodes to a single PostgreSQL instance
○ First benchmark the single node, and focus on the storage primarily
○ Then benchmark the high availability cluster, with continuous backup and replicas
■ Pay attention to WAL archiving, streaming, WAL restore, replay, and so on …
● Consider introducing failover and switchover events into your benchmarks (chaos)
○ Observe the cluster and always consider your RPO and RTO goals
● Study Postgres, love Postgres!
○ There are so many features you might not know that Postgres already has!
Thank you!
DoK #109 webinar - Benchmarking for PostgreSQL workloads in Kubernetes (part 2)
Gabriele Bartolini - @_GBartolini_
Watch part 1!

why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?Watsoo Telematics
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 

Recently uploaded (20)

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 

Benchmarking for PostgreSQL workloads in Kubernetes

  • 1. Benchmarking for PostgreSQL workloads in Kubernetes (part 2)
    Gabriele Bartolini
    #109 Data on Kubernetes (DOK) Webinar
    16 December 2021
  • 2. Today’s speakers
    Gabriele Bartolini
    VP of Cloud Native at EDB
    PostgreSQL user since ~2000
    • Community member since 2006
    • Co-founder of PostgreSQL Europe
    Previously at 2ndQuadrant, from 2008 to 2020
    • Co-founder
    • Head of Global Support
    • Cloud Native Initiative Lead
    • Founding member of Barman
    DevOps evangelist
    Twitter: @_GBartolini_ / @EDBPostgres
  • 3. EDB and Kubernetes
    Major sponsor of the PostgreSQL project
    Kubernetes Certified Service Provider (KCSP)
    Silver Member of CNCF & Linux Foundation
    Platinum founding sponsor of the Data on Kubernetes Community
    We have contributed to the PostgreSQL community every year since 2006, making major feature contributions.
    We had 32 contributors in PostgreSQL 14, including 7 code committers and 3 core members.
    Bringing PostgreSQL to Kubernetes
  • 4. Agenda
    • Key takeaways from DoK #58
    • A day in the life of a Postgres transaction
    • Recommended architectures
    • Our methodology
    • Conclusions
  • 7. Why Kubernetes? Why PostgreSQL?
    Cloud native culture with a highly versatile SQL driven database
    ● “Cloud Native” is much more than tools (Kubernetes)
      ○ Patterns/architectures (microservices, operators, ...)
      ○ Principles/culture (devops/lean/agile, velocity, automation, pervasive quality and security processes, …)
    ● Kubernetes is becoming popular for stateful workloads, including databases:
      ○ Please refer to dok.community/dokc-2021-report/ for details
      ○ Reasons: storage classes, local persistent volumes, the operator pattern
    ● PostgreSQL is based on 25+ years of evolutionary innovation
      ○ Linux : Operating System = Postgres : Database
      ○ Database of the year in 2017, 2018, and 2020 at db-engines.com
      ○ Some of its main features:
        ■ Native streaming replication, both physical and logical, sync and async, cascading
        ■ Online Continuous Backup and Point In Time Recovery
        ■ Declarative Partitioning
        ■ Parallel queries
        ■ Extensibility and extensions (e.g. PostGIS)
        ■ JSON support (SQL/noSQL hybrid databases)
        ■ ACID transactions
  • 8. Benchmarking PostgreSQL
    Workloads, storage and the database
    ● Storage
      ○ Write Ahead Log (WAL), or historically xlog
        ■ Sequential writes and fsync
      ○ Shared buffers cleaning
        ■ By checkpoint, bgwriter, or the single backend
        ■ Random writes (and OS cache)
      ○ Page reads
        ■ Random reads
      ○ Optimization: Table scans
        ■ Sequential reads
      ○ Capping on cloud environments
    ● Database
      ○ Workloads: in-memory, OLTP, and OLAP
      ○ Initial focus: TPS on large OLTP workloads (RAM < DB size)
      ○ pgbench
    ● We introduced cnp-bench
  • 9. Know your storage
    You need to trust it
    ● Make sure you benchmark your storage before you go in production
    ● Make sure you benchmark your storage before you test your database
      ○ Storage can become your bottleneck
        ■ If your storage is slow, your database will be slow
    ● Please refer to DoK #58 for more information on storage benchmarking
      ○ Use cnp-bench
      ○ Use fio directly
  • 10. A day in the life of a transaction (VERY simplified view)
  • 11. Disclaimer
    Postgres internals are more complex than this. The following is a simplified view for clarity.
  • 12. Another “brick” in the WAL (diagram; simplified view for didactic purposes)
    ● Postgres backend and Shared Buffers (Postgres cache), holding 8kB pages
    ● Disk (pg_wal): the transaction log, made of WAL file segments, usually 16MB in size
      ○ Sequential writes, fsync-ed
      ○ Older segments are ready to be recycled
    ● Checkpoint: regularly the database cache is flushed on disk (“dirty pages”)
    ● Disk (PGDATA): 8kB pages
      ○ Random writes, random reads, seq scans
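The two sizes on the diagram (8kB blocks, WAL segments usually 16MB) can be related with a line of arithmetic. This is a minimal sketch that assumes those default sizes; both are configurable, and the names used here are illustrative only.

```python
# Illustrative arithmetic only: constants are the usual defaults, not measured values.
BLOCK_SIZE = 8 * 1024              # 8kB block
WAL_SEGMENT_SIZE = 16 * 1024**2    # WAL segment, usually 16MB


def blocks_per_segment(segment_size: int = WAL_SEGMENT_SIZE,
                       block_size: int = BLOCK_SIZE) -> int:
    """Number of block-sized chunks that fit in one WAL segment."""
    return segment_size // block_size


print(blocks_per_segment())              # 2048
print(blocks_per_segment(64 * 1024**2))  # 8192, for a 64MB segment
```

Each segment is therefore a fairly small unit of sequential, fsync-ed writing, which is why a write-heavy workload can produce many segments in a short time.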
  • 14. PostgreSQL architectures in Business Continuity
    Always plan for benchmarking the final production architecture
    ● Start with one instance to spot the major bottlenecks
      ○ e.g. storage
    ● Then move to a real life production architecture
    ● Consider your Business Continuity goals
      ○ Disaster recovery - primarily focused on Recovery Point Objective (RPO)
      ○ High Availability - primarily focused on Recovery Time Objective (RTO)
      ○ Plan your production database architectures with both RTO and RPO in mind
    ● PostgreSQL provides the fundamental blocks for Business Continuity
      ○ Continuous backup and Point In Time Recovery
        ■ Base backups
        ■ WAL archiving
      ○ Native streaming replication based on the Write Ahead Log (WAL)
    ● The WAL is central in PostgreSQL
    ● To keep your data safe, managing the above in Kubernetes requires an operator written by Postgres experts
  • 15. The criticality of the WAL in day-to-day Postgres (diagram; simplified view for didactic purposes)
    ● Postgres backend, Shared Buffers (Postgres cache), Disk (PGDATA) and Disk (pg_wal) with its WAL file segments
    ● WAL file segments are either ready to be archived or ready to be recycled
    ● archive_command ships segments to the WAL archive
    ● wal_sender(s) feed streaming replication to the Replicas (standby)
    ● Both paths are potential bottlenecks!
  • 16. Recommended architecture (diagram)
    ● One Kubernetes cluster / namespace
    ● Three nodes with local storage, one per zone (zone 1, zone 2, zone 3), hosting the Primary, a Sync Standby and a Potential Sync Standby
    ● Streaming Replication between the instances
    ● Continuous backup (WAL archiving) to a store holding base backups and the WAL archive
    ● restore_command to fetch WAL files back from the archive
  • 17. Some potential bottlenecks and issues
    A summary of the major issues that bottlenecks can cause
    ● WAL writing: local storage
      ○ Slow system
    ● WAL archiving: serialized process, network, remote storage, compression (if applicable)
      ○ A bottleneck causes WAL files to pile up on the volume where pg_wal is, causing Postgres to halt
    ● Streaming replication: network, remote storage
      ○ wal_keep_segments/wal_keep_size
        ■ Beyond this threshold, WAL files are recycled on the primary and the standby falls out of sync
      ○ replication slots
        ■ The primary keeps track of the location in the WAL needed by a standby and keeps the WAL file
          ● Same issue as WAL archiving - WAL files pile up and Postgres risks halting
      ○ synchronous replication
        ■ A bottleneck here slows down writes on the primary
        ■ If all synchronous standby servers are down, the primary stops accepting writes (never use a single synchronous standby)
    ● Restore command: serialized process, network, remote storage, decompression (if applicable)
      ○ A standby cannot start streaming replication and relies on WAL files from the archive
      ○ Delayed standby - possible impact on RPO and RTO in case of failover of the primary
    ● Standby replay: single process
      ○ Delayed standby - possible impact on RPO and RTO in case of failover of the primary
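The WAL archiving bottleneck above can be put into rough numbers. The sketch below is a back-of-the-envelope estimate, not a model of Postgres itself: it assumes constant rates, ignores segment recycling and checkpoints, and every figure in the usage example is hypothetical.

```python
from typing import Optional


def seconds_until_wal_volume_full(free_bytes: float,
                                  wal_write_rate: float,
                                  archive_rate: float) -> Optional[float]:
    """Estimate how long the pg_wal volume lasts while archiving lags behind.

    Rates are in bytes per second. Returns None when archiving keeps up
    with WAL production, so no backlog builds.
    """
    backlog_rate = wal_write_rate - archive_rate
    if backlog_rate <= 0:
        return None
    return free_bytes / backlog_rate


GIB = 1024**3
MIB = 1024**2

# Hypothetical inputs: 100 GiB free, 80 MiB/s of WAL written, 30 MiB/s archived.
print(seconds_until_wal_volume_full(100 * GIB, 80 * MIB, 30 * MIB))  # 2048.0
```

With these made-up inputs the volume would fill in about 34 minutes, which is shorter than many benchmark runs. The same reasoning applies to a replication slot held by a standby that has stopped consuming WAL.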
  • 18. Dedicated resources
    Prefer shared nothing architectures, even in Kubernetes
    ● If you can, dedicate a Kubernetes node to one Postgres instance only
      ○ Take advantage of Pod scheduling capabilities and availability zones (where available)
        ■ pod affinity/anti-affinity
        ■ node selectors
        ■ tolerations
      ○ Properly set resource requests and limits
        ■ Guaranteed QoS is recommended
    ● If you can, use local storage on the dedicated node
      ○ Benchmark throughput
      ○ In the public cloud, watch out for IOPS limitations
    ● Costs/benefits analysis
      ○ One more reason why benchmarking is fundamental in proper and effective capacity planning
      ○ It’s your choice, and yours only
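The “Guaranteed QoS” recommendation follows from how Kubernetes classifies pods by their resource requests and limits. The function below is a simplified sketch of that rule, written for illustration; `qos_class` and the dictionary layout are hypothetical, and quantities are compared as plain strings, so "7" and "7000m" would wrongly count as different.

```python
def qos_class(containers: list) -> str:
    """Classify a pod from a list of {'requests': {...}, 'limits': {...}} dicts."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Sample input using 7 cores and 56Gi, with requests equal to limits.
postgres = {"requests": {"cpu": "7", "memory": "56Gi"},
            "limits": {"cpu": "7", "memory": "56Gi"}}
print(qos_class([postgres]))  # Guaranteed
```

Setting requests equal to limits for both CPU and memory on every container is what places the pod in the Guaranteed class. Anything less lands in Burstable or BestEffort, where the instance competes with its neighbours for resources.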
  • 20. How we’re benchmarking Postgres on K8s
    Observe and let numbers and diagrams help you discover issues
    ● We rely on:
      ○ cnp-sandbox
        ■ Prometheus, Grafana, Cloud Native PostgreSQL operator (EDB)
      ○ cnp-bench
        ■ on existing clusters
      ○ pg_stat_statements
    ● You can use your own PostgreSQL setup
      ○ Your favorite operator
    ● You can use your favorite observability tools
      ○ Your own Prometheus/Grafana
      ○ Something else (you should know what to look for now!)
  • 21. A sandbox for Cloud Native PostgreSQL
    cnp-sandbox is an open source helm chart
    ● Deploys a sandbox environment in Kubernetes with:
      ○ Prometheus
      ○ Grafana
      ○ Cloud Native PostgreSQL with:
        ■ a selection of PostgreSQL metrics for the native Prometheus exporter in CNP
        ■ a custom Grafana dashboard developed by EDB for Cloud Native PostgreSQL
    ● Main goals:
      ○ Evaluate Cloud Native PostgreSQL’s observability with Prometheus and Grafana
      ○ Integrate benchmarks with real-time collected data
    ● Suitable for pre-production and staging environments
      ○ Production environments should have their own Prometheus and Grafana installations
      ○ Metrics and dashboards can be reused
    ● URL: github.com/EnterpriseDB/cnp-sandbox
  • 22. Deployment
    helm repo add cnp-sandbox https://enterprisedb.github.io/cnp-sandbox/
    helm repo update
    helm upgrade --install cnp-sandbox cnp-sandbox/cnp-sandbox
  • 24. cnp-bench
    Benchmarking the storage and a PostgreSQL database
    ● Storage benchmarking with fio
    ● Database benchmarking and stress testing with:
      ○ pgbench
      ○ HammerDB
    ● Can be run against an existing Postgres database
      ○ Integrated with Cloud Native PostgreSQL, including pgBouncer for connection pooling
    ● Suitable for pre-production and staging environments
    ● URL: https://github.com/EnterpriseDB/cnp-bench
  • 25. An example of pgbench initialization
    cnp:
      existingCluster: true
      existingCredentials: pg-14-app
      existingHost: pg-14-rw
      existingDatabase: pgbench
      image: quay.io/enterprisedb/postgresql:14.1
    pgbench:
      nodeSelector:
        workload: pgbench
      scaleFactor: 8000
      initialize: true
  • 26. An example of pgbench run
    cnp:
      existingCluster: true
      existingCredentials: pg-14-app
      existingHost: pg-14-rw
      existingDatabase: pgbench
      image: quay.io/enterprisedb/postgresql:14.1
    pgbench:
      nodeSelector:
        workload: pgbench
      initialize: false
      skipVacuum: true
      reportLatencies: true
      time: 600
      clients: 64
      jobs: 128
  • 27. Notes:
    ● 5 x 10min pgbench tests
    ● scale factor 8000 (120GB)
    ● 3 dedicated nodes:
      ○ AKS Standard_E8s_v4
      ○ 7 cores/56Gi RAM
      ○ Guaranteed QoS
      ○ Premium P80 storage class
    ● 1 MinSync replication
    ● Azure Blob Container (backup)
    ● pgbench on AKS Standard_D64s_v4
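The scale factor in these notes translates directly into table size, since pgbench creates 100,000 rows in pgbench_accounts per unit of scale factor. The sketch below only does that arithmetic with the figures from the notes; the helper names are illustrative.

```python
# pgbench creates 100,000 pgbench_accounts rows per unit of scale factor.
ROWS_PER_SCALE_UNIT = 100_000


def accounts_rows(scale_factor: int) -> int:
    """Rows in pgbench_accounts for a given scale factor."""
    return scale_factor * ROWS_PER_SCALE_UNIT


def mb_per_scale_unit(db_size_gb: float, scale_factor: int) -> float:
    """Average database size per scale unit, in MB (1 GB taken as 1000 MB)."""
    return db_size_gb * 1000 / scale_factor


print(accounts_rows(8000))           # 800000000
print(mb_per_scale_unit(120, 8000))  # 15.0
```

At roughly 15 MB per scale unit, the 120GB data set is about twice the 56Gi of RAM given to each instance. That keeps the test in the “RAM < DB size” territory named earlier as the initial focus for large OLTP workloads.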
  • 28. Another example showing ~13k TPS with 32 cores
  • 29. Bottleneck: serialized WAL archiving
    Anticipate and avoid this scenario!
    36k piled WALs! What if the primary dies now?
  • 30. Parallel archiving (1)
    Remediation: parallel WAL archiving and large segment size (64MB)
  • 31. Parallel archiving (2)
    Might be OK (bulk loads or vacuums)
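To get a feel for the “36k piled WALs” scenario and for the effect of a larger segment size, the backlog can be converted into bytes and file counts. This sketch assumes the piled-up segments had the usual 16MB size, which the slides do not state, and treats “36k” as exactly 36,000.

```python
import math


def backlog_gib(segments: int, segment_size_mib: int = 16) -> float:
    """Size of a WAL backlog in GiB for a given segment count and size."""
    return segments * segment_size_mib / 1024


def segments_for(volume_mib: int, segment_size_mib: int) -> int:
    """Number of segment files needed to hold a given volume of WAL."""
    return math.ceil(volume_mib / segment_size_mib)


piled = 36_000                         # "36k piled WALs"
print(backlog_gib(piled))              # 562.5
print(segments_for(piled * 16, 64))    # 9000
```

Under that assumption the unarchived backlog is over half a terabyte of WAL that exists only on the primary, which is why the slide asks what happens if the primary dies. The same volume in 64MB segments is a quarter as many files, so a per-file archiving step runs four times less often.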
  • 33. What’s next with cnp-bench
    Our plan for H1/2022
    ● Manage increasing number of client connections
    ● Manage repetitions
    ● Support custom pgbench scripts
    ● Improve support for HammerDB
    ● Introduce application level benchmarking
      ○ Web application load generation with hey
      ○ Front end scalability
  • 34. About Cloud Native PostgreSQL
    The Kubernetes operator from EDB
    ● It is currently closed source
      ○ Available for trials
    ● Fully declarative
    ● Integrated with the Kubernetes API server (no external tool for failover)
    ● Directly manages persistent volumes
    ● Our intention is to open source Cloud Native PostgreSQL in 2022
    ● It is the component that manages PostgreSQL in the data layer of BigAnimal
  • 35. Conclusions
    Why is benchmarking PostgreSQL important?
    ● Data is the most important asset of an organization
    ● Data can live in Kubernetes, in reliable databases like PostgreSQL
    ● Don’t leave anything to chance
      ○ Benchmark your storage and know its limits
      ○ Benchmark your database and know its limits
    ● Benchmark before you go to production
      ○ You might not be able to benchmark when in production
    ● Strongly consider dedicating storage and nodes to a single PostgreSQL instance
      ○ First benchmark the single node, and focus on the storage primarily
      ○ Then benchmark the high availability cluster, with continuous backup and replicas
        ■ Pay attention to WAL archiving, streaming, WAL restore, replay, and so on …
    ● Evaluate introduction of failover and switchover events in benchmarks (chaos)
      ○ Observe the cluster and always consider your RPO and RTO goals
    ● Study Postgres, love Postgres!
      ○ There are so many features you might not know that Postgres already has!
  • 36. Thank you!
    DoK #109 webinar - Benchmarking for PostgreSQL workloads in Kubernetes (part 2)
    Gabriele Bartolini - @_GBartolini_
    Watch part 1!