SlideShare a Scribd company logo
1 of 29
Download to read offline
etcd based
PostgreSQL HA cluster
TL;DR: 

github.com/compose/template-etcd-based-postgres-ha
Introduction
Chris Winslett
@winsletts
compose.io
reading the top 5 comments on Imgur since 2012
How we started using PostgreSQL
MongoDB was a primary datastore
launched project to understand financial
metrics
required data exploration, which is brutal in
MongoDB
Our database product
our platform runs databases
these databases scale
automatically as a customer
increases data size
Our database product
could we run PostgreSQL on our
platform?
Database operational requirements
• replicated
• highly-available
• no human interaction for failover
• minimize core-engine
modifications
• customers use entire
deployment
Tools investigated
repmgr with pgpool II
required human interaction for failover
does not use PostgreSQL streaming
pgpool was flakey on failover
Tools investigated
PostgreSQL streaming replication
no automatic failover
Tools investigated
bi-directional replication
i.e. master-master
only runs on one database per cluster
requires a patch on core engine
is automated failover too
ambitious with PostgreSQL?
Learned from tools investigation
PostgreSQL should not be the canonical
store of its own state, investigated:
serf - not consensus based
consul - runs with consensus
etcd - run with conensus
Consul
we built the prototype on Consul
using:
locking
sessions
health checks
code at: https://github.com/MongoHQ/consul_ha
Consul
Code at: https://github.com/MongoHQ/
consul_ha
Tight coupling between:
Consul interaction
and
HA decision loop
Consul Diagram 1
Final Consul Diagram
Consul Results
amazing
automatically growing and shrinking Consul
clusters
health checks to prevent unhealthy
secondaries from acquiring locks
Consul
until, we ran into massive swap
allocation.
40 GB swap allocation.
fine for prototypes, not for production.
Results from Consul
HA PostgreSQL is possible
but, we need a tool which uses our
resources more wisely.
Switch to etcd
because of what we’d learned in
Consul, the switch to etcd took a
day to have a working sample
Modern etcd diagram
Start
Connect to
etcd?
Is data
directory
empty?
yes
Win race to set
initialization
key?
yes
Initialize
database
Take over
lead TTL
key
Start
PostgreSQL
as a
leaderless
Secondary
no
yes
Leader
owns key?
pg_basebackup
from leader
Do I own
leader
key?
Acquire
leader lock?
yes
Update
leader
TTL lock
yes
Promote
to leader
Is leader
key
owned?
no
Am I
following
the correct
leader?
yes
Am I the
healthiest
member?
no
Am I the
leader?
no
Wait 30
seconds
yes
yes
no
yes
Start
Postgres
wait 5
seconds
no
wait 5
seconds
no
follow
proper
leader
no
yes
Running Loop
Start
Postgres
Start-up Process
etcd features used
concensus
recursivettl
prevValue
prevExist
https://coreos.com/docs/distributed-configuration/
etcd-api/
etcd: recursive
used to find all members known
to a cluster
etcd: ttl
used with our keep alive from a
PostgreSQL runner
etcd: prevValue
used in conjunction with TTL to
ensure the leader remains the
leader when updating the TTL
etcd: prevExist
used to create a deployment
initialization race
Improved with etcd
removed tight coupling in classes:
HA decision process
etcd state interaction
PostgreSQL handler
Issues with etcd
overly aggressive about consensus
instructions for optimization at
https://coreos.com/docs/cluster-
management/debugging/etcd-
tuning/
Issues with etcd
overly aggressive about
consensus
we quit running etcd along side
PostgreSQL because we wanted
expanding PostgreSQL clusters
Time for live demo?

More Related Content

What's hot

Fluentd and PHP
Fluentd and PHPFluentd and PHP
Fluentd and PHP
chobi e
 
Operating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesOperating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with Kubernetes
Jonathan Katz
 

What's hot (20)

Tuning Linux for MongoDB
Tuning Linux for MongoDBTuning Linux for MongoDB
Tuning Linux for MongoDB
 
From zero to hero - Easy log centralization with Logstash and Elasticsearch
From zero to hero - Easy log centralization with Logstash and ElasticsearchFrom zero to hero - Easy log centralization with Logstash and Elasticsearch
From zero to hero - Easy log centralization with Logstash and Elasticsearch
 
Ignacy Kowalczyk
Ignacy KowalczykIgnacy Kowalczyk
Ignacy Kowalczyk
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log system
 
Shootout at the AWS Corral
Shootout at the AWS CorralShootout at the AWS Corral
Shootout at the AWS Corral
 
DCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDCSF 19 eBPF Superpowers
DCSF 19 eBPF Superpowers
 
Shootout at the PAAS Corral
Shootout at the PAAS CorralShootout at the PAAS Corral
Shootout at the PAAS Corral
 
How Typepad changed their architecture without taking down the service
How Typepad changed their architecture without taking down the serviceHow Typepad changed their architecture without taking down the service
How Typepad changed their architecture without taking down the service
 
Os Webb
Os WebbOs Webb
Os Webb
 
Fluentd and PHP
Fluentd and PHPFluentd and PHP
Fluentd and PHP
 
Testing Wi-Fi with OSS Tools
Testing Wi-Fi with OSS ToolsTesting Wi-Fi with OSS Tools
Testing Wi-Fi with OSS Tools
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
 
PGConf APAC 2018 - Tale from Trenches
PGConf APAC 2018 - Tale from TrenchesPGConf APAC 2018 - Tale from Trenches
PGConf APAC 2018 - Tale from Trenches
 
Chaos Engineering for Docker
Chaos Engineering for DockerChaos Engineering for Docker
Chaos Engineering for Docker
 
Container Monitoring with Sysdig
Container Monitoring with SysdigContainer Monitoring with Sysdig
Container Monitoring with Sysdig
 
Operating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesOperating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with Kubernetes
 
High availability for puppet - 2016
High availability for puppet - 2016High availability for puppet - 2016
High availability for puppet - 2016
 
Configuration Management - Finding the tool to fit your needs
Configuration Management - Finding the tool to fit your needsConfiguration Management - Finding the tool to fit your needs
Configuration Management - Finding the tool to fit your needs
 
Golang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyGolang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war story
 

Viewers also liked

Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized EnvironmentsBest Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
Jignesh Shah
 

Viewers also liked (16)

PostgreSQL High-Availability and Geographic Locality using consul
PostgreSQL High-Availability and Geographic Locality using consulPostgreSQL High-Availability and Geographic Locality using consul
PostgreSQL High-Availability and Geographic Locality using consul
 
[NYC Meetup] Docker at Nuxeo
[NYC Meetup] Docker at Nuxeo[NYC Meetup] Docker at Nuxeo
[NYC Meetup] Docker at Nuxeo
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized EnvironmentsBest Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 
Production Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated WorldProduction Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated World
 
Consul - service discovery and others
Consul - service discovery and othersConsul - service discovery and others
Consul - service discovery and others
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
 
KubeCon EU 2016: Full Automatic Database: PostgreSQL HA with Kubernetes
KubeCon EU 2016: Full Automatic Database: PostgreSQL HA with KubernetesKubeCon EU 2016: Full Automatic Database: PostgreSQL HA with Kubernetes
KubeCon EU 2016: Full Automatic Database: PostgreSQL HA with Kubernetes
 
Dynamic Database Credentials: Security Contingency Planning
Dynamic Database Credentials: Security Contingency PlanningDynamic Database Credentials: Security Contingency Planning
Dynamic Database Credentials: Security Contingency Planning
 
FreeBSD: Dev to Prod
FreeBSD: Dev to ProdFreeBSD: Dev to Prod
FreeBSD: Dev to Prod
 
PostgreSQL Replication in 10 Minutes - SCALE
PostgreSQL Replication in 10  Minutes - SCALEPostgreSQL Replication in 10  Minutes - SCALE
PostgreSQL Replication in 10 Minutes - SCALE
 
범용 PaaS 플랫폼 mesos(mesosphere)
범용 PaaS 플랫폼 mesos(mesosphere)범용 PaaS 플랫폼 mesos(mesosphere)
범용 PaaS 플랫폼 mesos(mesosphere)
 
ストリーム処理を支えるキューイングシステムの選び方
ストリーム処理を支えるキューイングシステムの選び方ストリーム処理を支えるキューイングシステムの選び方
ストリーム処理を支えるキューイングシステムの選び方
 
gRPC: The Story of Microservices at Square
gRPC: The Story of Microservices at SquaregRPC: The Story of Microservices at Square
gRPC: The Story of Microservices at Square
 
Ingesting Drone Data into Big Data Platforms
Ingesting Drone Data into Big Data Platforms Ingesting Drone Data into Big Data Platforms
Ingesting Drone Data into Big Data Platforms
 

Similar to etcd based PostgreSQL HA Cluster

Similar to etcd based PostgreSQL HA Cluster (20)

Prototype4Production Presented at FOSSASIA2015 at Singapore
Prototype4Production Presented at FOSSASIA2015 at SingaporePrototype4Production Presented at FOSSASIA2015 at Singapore
Prototype4Production Presented at FOSSASIA2015 at Singapore
 
Whats wrong with postgres | PGConf EU 2019 | Craig Kerstiens
Whats wrong with postgres | PGConf EU 2019 | Craig KerstiensWhats wrong with postgres | PGConf EU 2019 | Craig Kerstiens
Whats wrong with postgres | PGConf EU 2019 | Craig Kerstiens
 
The Secrets of The FullStack Ninja - Part A - Session I
The Secrets of The FullStack Ninja - Part A - Session IThe Secrets of The FullStack Ninja - Part A - Session I
The Secrets of The FullStack Ninja - Part A - Session I
 
Presto@Uber
Presto@UberPresto@Uber
Presto@Uber
 
“Practical DevOps by a small team of devs” by Ilgvars Jēcis from FinoTech  at...
“Practical DevOps by a small team of devs” by Ilgvars Jēcis from FinoTech  at...“Practical DevOps by a small team of devs” by Ilgvars Jēcis from FinoTech  at...
“Practical DevOps by a small team of devs” by Ilgvars Jēcis from FinoTech  at...
 
Distributed Tracing
Distributed TracingDistributed Tracing
Distributed Tracing
 
Distributed tracing 101
Distributed tracing 101Distributed tracing 101
Distributed tracing 101
 
My "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails ProjectsMy "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails Projects
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic Tool
 
Performance Tuning with XHProf
Performance Tuning with XHProfPerformance Tuning with XHProf
Performance Tuning with XHProf
 
Big data made easy with a Spark
Big data made easy with a SparkBig data made easy with a Spark
Big data made easy with a Spark
 
Why we love pgpool-II and why we hate it!
Why we love pgpool-II and why we hate it!Why we love pgpool-II and why we hate it!
Why we love pgpool-II and why we hate it!
 
Git workflows á la-carte, Presenation at jdays2013 www.jdays.se by Nicola Pao...
Git workflows á la-carte, Presenation at jdays2013 www.jdays.se by Nicola Pao...Git workflows á la-carte, Presenation at jdays2013 www.jdays.se by Nicola Pao...
Git workflows á la-carte, Presenation at jdays2013 www.jdays.se by Nicola Pao...
 
Midwest PHP 2017 DevOps For Small team
Midwest PHP 2017 DevOps For Small teamMidwest PHP 2017 DevOps For Small team
Midwest PHP 2017 DevOps For Small team
 
Big Data made easy with a Spark
Big Data made easy with a SparkBig Data made easy with a Spark
Big Data made easy with a Spark
 
Engage 2019: The good, the bad and the ugly: a not so objective view on front...
Engage 2019: The good, the bad and the ugly: a not so objective view on front...Engage 2019: The good, the bad and the ugly: a not so objective view on front...
Engage 2019: The good, the bad and the ugly: a not so objective view on front...
 
Performant Django - Ara Anjargolian
Performant Django - Ara AnjargolianPerformant Django - Ara Anjargolian
Performant Django - Ara Anjargolian
 
Useful PostgreSQL Extensions
Useful PostgreSQL ExtensionsUseful PostgreSQL Extensions
Useful PostgreSQL Extensions
 
Build next generation apps with eyes and ears using Google Chrome
Build next generation apps with eyes and ears using Google ChromeBuild next generation apps with eyes and ears using Google Chrome
Build next generation apps with eyes and ears using Google Chrome
 
Stacktrace Berlin RC.2
Stacktrace Berlin RC.2Stacktrace Berlin RC.2
Stacktrace Berlin RC.2
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

etcd based PostgreSQL HA Cluster