1
Alexander Kukushkin
PGConf US 2017, Jersey City
Patroni - HA PostgreSQL made easy
2
ABOUT ME
Alexander Kukushkin
Database Engineer @ZalandoTech
Email: alexander.kukushkin@zalando.de
Twitter: @cyberdemn
3
ZALANDO AT A GLANCE
● ~3.6 billion EUR net sales in 2016
● ~165 million visits per month
● >12,000 employees in Europe
● 50% return rate across all categories
● ~20 million active customers
● ~200,000 product choices
● >1,500 brands
● 15 countries
5
ZALANDO TECHNOLOGY
BERLIN
DORTMUND
DUBLIN
HELSINKI
ERFURT
MÖNCHENGLADBACH
HAMBURG
6
ZALANDO TECHNOLOGY
● > 150 databases in DC
● > 130 databases on AWS
● > 1600 tech employees
● We are hiring!
7
POSTGRESQL
● Rock-solid by default
● Transactional DDL
● Standard-compliant modern SQL
● Blazing performance
● PostgreSQL is a community
The world’s most advanced open-source database
8
RUNNING DATABASES AT SCALE
10
CLOUD DATABASES
● Rapid deployments
● Commodity hardware (cattle vs pets)
● Standard configuration and automatic tuning
12
AUTOMATIC FAILOVER
“PostgreSQL does not provide the system software required to identify a failure on the primary and notify the standby database server.”
CC0 Public Domain
13
EXISTING AUTOMATIC FAILOVER SOLUTIONS
● Promote a replica when the master is not responding
○ Split brain/potentially many masters
● Use one monitor node to make decisions
○ Monitor node is a single point of failure
○ Former master needs to be killed (STONITH)
● Use multiple monitor nodes
○ Distributed consistency problem
14
DISTRIBUTED CONSISTENCY PROBLEM
https://www.flickr.com/photos/kevandotorg
15
PATRONI APPROACH
● Use Distributed Configuration System (DCS): Etcd, Zookeeper or Consul
● Built-in distributed consensus (RAFT, Zab)
● Session/TTL to expire data (e.g. the master key)
● Key-value storage for cluster information
● Atomic operations (CAS)
● Watches for important keys
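The DCS properties listed above (TTL expiry, atomic create, compare-and-swap) can be sketched with a toy in-memory store. `ToyDCS` and its method names are hypothetical and exist only for illustration; this is not the Etcd, ZooKeeper, or Consul API, and it provides no distribution or consensus.

```python
import time


class ToyDCS:
    """Toy in-memory key-value store with TTL and CAS (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable clock, handy for experiments
        self._data = {}      # key -> (value, expires_at or None)

    def _live_entry(self, key):
        """Return the stored entry, dropping it first if its TTL has passed."""
        entry = self._data.get(key)
        if entry is None:
            return None
        _, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]
            return None
        return entry

    def _store(self, key, value, ttl):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._live_entry(key)
        return None if entry is None else entry[0]

    def create(self, key, value, ttl=None):
        """Atomic create: succeeds only when the key does not exist."""
        if self._live_entry(key) is not None:
            return False
        self._store(key, value, ttl)
        return True

    def compare_and_swap(self, key, expected, value, ttl=None):
        """Update only when the key exists and currently holds `expected`."""
        entry = self._live_entry(key)
        if entry is None or entry[0] != expected:
            return False
        self._store(key, value, ttl)
        return True
```

A key created with a TTL disappears unless its owner keeps refreshing it, which is what lets a dead master's key expire on its own.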
16
DCS STRUCTURE
● /service/cluster/
○ config
○ initialize
○ members/
■ dbnode1
■ dbnode2
○ leader
○ optime/
■ leader
○ failover
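As a rough illustration of the layout above, the helper below builds the key paths for one cluster. The function name `cluster_keys` and the default `/service` namespace argument are assumptions made for this sketch; only the path shapes come from the slide.

```python
def cluster_keys(cluster, members, namespace="/service"):
    """Return the DCS key paths for one cluster (hypothetical helper)."""
    base = f"{namespace}/{cluster}"
    keys = {
        name: f"{base}/{name}"
        for name in ("config", "initialize", "leader", "failover")
    }
    keys["optime/leader"] = f"{base}/optime/leader"
    # One key per cluster member, named after the node.
    keys["members"] = {m: f"{base}/members/{m}" for m in members}
    return keys


layout = cluster_keys("testcluster", ["dbnode1", "dbnode2"])
print(layout["leader"])
print(layout["members"]["dbnode2"])
```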
17
KEYS THAT NEVER EXPIRE
● initialize
○ "key": "/service/testcluster/initialize",
"value": "6303731710761975832"
● optime/leader
○ "key": "/service/testcluster/optime/leader",
"value": "67393608"
● config
○ "key": "/service/testcluster/config",
"value": "{\"postgresql\":{\"parameters\":{\"max_connections\":\"200\"}}}"
18
KEYS WITH TTL
● leader
○ "key": "/service/testcluster/leader",
"value": "dbnode2",
"ttl": 22
● members
○ "key": "/service/testcluster/members/dbnode2",
"value": "{\"role\":\"master\",\"state\":\"running\",\"xlog_location\":67393608,
\"conn_url\":\"postgres://172.17.0.3:5432/postgres\",
\"api_url\":\"http://172.17.0.3:8008/patroni\"}",
"ttl": 22
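The member value is itself a JSON document stored as a string, so a consumer has to decode it before use. A minimal sketch, using the value shown on the slide; the helper name `parse_member` is hypothetical.

```python
import json

# Value of /service/testcluster/members/dbnode2 as shown on the slide.
MEMBER_VALUE = (
    '{"role":"master","state":"running","xlog_location":67393608,'
    '"conn_url":"postgres://172.17.0.3:5432/postgres",'
    '"api_url":"http://172.17.0.3:8008/patroni"}'
)


def parse_member(name, raw_value):
    """Decode a member key's JSON value and attach the node name."""
    member = json.loads(raw_value)
    member["name"] = name
    return member


member = parse_member("dbnode2", MEMBER_VALUE)
print(member["role"], member["xlog_location"], member["api_url"])
```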
19
BOOTSTRAPPING OF A NEW CLUSTER
● Initialization race
● initdb by the winner of the initialization race
● Waiting for the leader key by the rest of the nodes
● Bootstrapping of non-leader nodes (pg_basebackup)
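The bootstrap sequence can be sketched as a small decision function. This is a simplification under stated assumptions: a plain dict stands in for the DCS, `dict.setdefault` stands in for an atomic create, and the node name is stored in the `initialize` key (the slides show a numeric value there). The function `bootstrap_step` is hypothetical and only returns the name of the action a node would take.

```python
def bootstrap_step(node, dcs):
    """Return the bootstrap action for `node`, given a dict standing in for the DCS."""
    if dcs.get("leader") == node:
        return "running as leader"
    # Initialization race: only the first writer gets its value stored.
    winner = dcs.setdefault("initialize", node)
    if winner == node:
        # The winner runs initdb and then takes the leader key.
        dcs["leader"] = node
        return "initdb"
    if "leader" not in dcs:
        # Somebody else won, but has not finished initializing yet.
        return "wait for leader key"
    return "pg_basebackup from " + dcs["leader"]


dcs = {}
for node in ("dbnode1", "dbnode2"):
    print(node, "->", bootstrap_step(node, dcs))
```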
20
EVENT LOOP OF A RUNNING CLUSTER (MASTER)
● Update the leader key, or demote if the update failed
● Write optime/leader (xlog position)
● Update the member key
● Add/delete replication slots for other members
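One pass of the master loop could look roughly like the sketch below. It assumes a plain dict in place of the DCS (so there is no TTL refresh or real compare-and-swap), and the function name `master_cycle`, its return shape, and the short key names are all hypothetical.

```python
def master_cycle(name, dcs, xlog_position, existing_slots):
    """Run one master-loop pass over a dict standing in for the DCS.

    Returns (status, slots_to_add, slots_to_drop).
    """
    # A real DCS update would be a compare-and-swap that also refreshes the TTL.
    if dcs.get("leader") != name:
        return "demote", set(), set()
    dcs["leader"] = name
    dcs["optime/leader"] = str(xlog_position)
    dcs["members/" + name] = {
        "role": "master",
        "state": "running",
        "xlog_location": xlog_position,
    }
    # One replication slot per other member; drop slots of departed members.
    prefix = "members/"
    wanted = {key[len(prefix):] for key in dcs if key.startswith(prefix)} - {name}
    return "ok", wanted - existing_slots, existing_slots - wanted


dcs = {"leader": "dbnode2", "members/dbnode1": {"role": "replica"}}
print(master_cycle("dbnode2", dcs, 67393608, {"oldnode"}))
```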
21
EVENT LOOP OF A RUNNING CLUSTER (REPLICA)
● Check that the cluster has a leader
○ Check that recovery.conf points to the correct leader
○ Join the leader race if a leader is not present
● Add/delete replication slots for cascading replicas
● Update the member key
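The replica side can be sketched in the same style. Again a dict stands in for the DCS, slot handling for cascading replicas is left out, and `replica_cycle` with its `following` argument (the node recovery.conf currently points to) is a hypothetical name chosen for this example.

```python
def replica_cycle(name, dcs, following):
    """Return the action a replica takes in one loop pass (illustration only)."""
    leader = dcs.get("leader")
    # The member key is refreshed on every pass, whatever the outcome.
    dcs["members/" + name] = {"role": "replica", "state": "running"}
    if leader is None:
        return "join leader race"
    if following != leader:
        return "repoint recovery.conf to " + leader
    return "keep following " + leader


dcs = {"leader": "dbnode2"}
print(replica_cycle("dbnode1", dcs, following="dbnode2"))
print(replica_cycle("dbnode1", dcs, following="dbnode3"))
```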
22
LEADER RACE
● Check whether the member is the healthiest
○ Evaluate its xlog position against all other members
● Try to acquire the leader lock
● Promote itself to master after acquiring the lock
23
LEADER RACE
● Node A: CREATE ("/leader", "A", ttl=30, prevExists=False) → Success, A promotes itself
● Node B: CREATE ("/leader", "B", ttl=30, prevExists=False) → Fail
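The race works because the create is atomic: however the two requests interleave, exactly one of them succeeds. A minimal sketch with threads, where a lock-protected object stands in for the DCS; `LeaderKey` and `race` are hypothetical names, and TTL handling is omitted.

```python
import threading


class LeaderKey:
    """Stand-in for the /leader key with create-if-absent semantics."""

    def __init__(self):
        self._lock = threading.Lock()
        self.holder = None

    def create(self, name):
        # The lock makes check-and-set atomic, as the DCS does for CREATE.
        with self._lock:
            if self.holder is None:
                self.holder = name
                return True
            return False


def race(names):
    """Let every node try to create the leader key at about the same time."""
    key = LeaderKey()
    outcome = {}

    def run(name):
        outcome[name] = "promote" if key.create(name) else "stay replica"

    threads = [threading.Thread(target=run, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return key.holder, outcome


holder, outcome = race(["A", "B"])
print(holder, outcome)
```

Which node wins depends on scheduling; the point is only that there is never more than one winner.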
24
LIVE DEMO
25
PATRONI FEATURES
● Manual and Scheduled Failover
● Synchronous mode
● Attach the old master with pg_rewind
● Customizable replica creation methods
● Linux watchdog support (coming soon)
● Pause (maintenance) mode
● patronictl
26
DYNAMIC CONFIGURATION
● Change Patroni/PostgreSQL parameters via Patroni REST API
○ Store them in DCS and apply dynamically on all nodes
● Ensure identical configuration of the following parameters on all members:
○ ttl, loop_wait, retry_timeout, maximum_lag_on_failover
○ wal_level, hot_standby
○ max_connections, max_prepared_transactions, max_locks_per_transaction, max_worker_processes, track_commit_timestamp, wal_log_hints
○ wal_keep_segments, max_replication_slots
● Inform the user that PostgreSQL needs to be restarted (pending_restart flag)
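A configuration change is sent as a JSON document to the REST API's `/config` endpoint with the PATCH method. The sketch below only builds the request and does not send it; the helper name `build_config_patch` is hypothetical, and the host and port are borrowed from the member key example earlier in the deck, so they are assumptions about your environment.

```python
import json
import urllib.request


def build_config_patch(api_base, changes):
    """Build (but do not send) a PATCH request against the /config endpoint."""
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url=api_base.rstrip("/") + "/config",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


request = build_config_patch(
    "http://172.17.0.3:8008",  # assumed address of one cluster member
    {"postgresql": {"parameters": {"max_connections": "200"}}},
)
print(request.get_method(), request.full_url)
# To apply the change, pass the request to urllib.request.urlopen(request).
```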
27
BUILDING HA POSTGRESQL BASED ON PATRONI
● Client traffic routing
○ Patroni callbacks
○ confd + HAProxy, PgBouncer
● Backup and recovery
○ WAL-E, Barman
● Monitoring
○ Nagios, Zabbix, ZMON
Image by flickr user https://www.flickr.com/photos/brickset/
28
SPILO: DOCKER + PATRONI + WAL-E + AWS/K8S
29
SPILO DEPLOYMENT
30
AUTOMATIC FAILOVER IS HARD
31
WHEN SHOULD THE MASTER DEMOTE ITSELF?
● Chances of data loss vs write availability
● Avoiding too many master switches (retry_timeout, loop_wait, ttl)
● 2 x retry_timeout + loop_wait < ttl
● Zookeeper and Consul session duration quirks
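The timing rule above is easy to check mechanically before rolling out a configuration. A small sketch; the function name `timing_is_safe` is hypothetical and the numbers in the usage lines are illustrative values, not recommendations.

```python
def timing_is_safe(ttl, loop_wait, retry_timeout):
    """Check the rule from the slide: 2 x retry_timeout + loop_wait < ttl."""
    return 2 * retry_timeout + loop_wait < ttl


print(timing_is_safe(ttl=30, loop_wait=10, retry_timeout=5))
print(timing_is_safe(ttl=30, loop_wait=10, retry_timeout=10))
```

With the second set of values the left-hand side equals the TTL, so the strict inequality on the slide is not satisfied.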
32
CHOOSING A NEW MASTER
● Reliability/performance of the host or connection
○ nofailover tag
● XLOG position
○ highest xlog position = the best candidate
○ xlog > leader/optime - maximum_lag_on_failover
■ maximum_lag_on_failover > size of WAL segment (16MB) for disaster recovery
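The selection rules above can be sketched as a filter followed by a maximum. The member dictionaries, the `tags` field layout, and the function names `eligible` and `best_candidate` are assumptions made for this example; positions are in bytes, as in the `xlog_location` value shown earlier.

```python
def eligible(member, leader_optime, maximum_lag_on_failover):
    """A member may run for leader unless tagged nofailover or lagging too far."""
    if member.get("tags", {}).get("nofailover"):
        return False
    return member["xlog_location"] > leader_optime - maximum_lag_on_failover


def best_candidate(members, leader_optime, maximum_lag_on_failover):
    """Pick the eligible member with the highest xlog position, or None."""
    candidates = [
        m for m in members
        if eligible(m, leader_optime, maximum_lag_on_failover)
    ]
    return max(candidates, key=lambda m: m["xlog_location"], default=None)


members = [
    {"name": "dbnode1", "xlog_location": 67393000},
    {"name": "dbnode3", "xlog_location": 67393608, "tags": {"nofailover": True}},
    {"name": "dbnode4", "xlog_location": 50000000},
]
winner = best_candidate(members, leader_optime=67393608,
                        maximum_lag_on_failover=1048576)
print(winner["name"] if winner else None)
```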
33
ATTACHING THE OLD MASTER BACK AS REPLICA
● Diverged timelines after the former master crashes
● pg_rewind
○ use_pg_rewind
○ remove_data_directory_on_rewind_failure
34
USEFUL LINKS
● Spilo: https://github.com/zalando/spilo
● Confd: http://www.confd.io
● Etcd: https://github.com/coreos/etcd
● RAFT: http://thesecretlivesofdata.com/raft/
35
Questions?
https://github.com/zalando/patroni

Patroni - HA PostgreSQL made easy

  • 1. 1 Alexander Kukushkin PGConf US 2017, Jersey City Patroni - HA PostgreSQL made easy
  • 2. 2 ABOUT ME Alexander Kukushkin Database Engineer @ZalandoTech Email: alexander.kukushkin@zalando.de Twitter: @cyberdemn
  • 3. 3 ZALANDO AT A GLANCE ~3.6billion EURO net sales 2016 ~165 million visits per month >12,000 employees in Europe 50% return rate across all categories ~20 million active customers ~200,000 product choices >1,500 brands 15 countries
  • 6. 6 ZALANDO TECHNOLOGY ● > 150 databases in DC ● > 130 databases on AWS ● > 1600 tech employees ● We are hiring!
  • 7. 7 POSTGRESQL ● Rock-solid by default ● Transactional DDL ● Standard-compliant modern SQL ● Blazing performance ● PostgreSQL is a community The world’s most advanced open-source database
  • 10. 10 CLOUD DATABASES ● Rapid deployments ● Commodity hardware (cattle vs pets) ● Standard configuration and automatic tuning
  • 11. 11
  • 12. AUTOMATIC FAILOVER: “PostgreSQL does not provide the system software required to identify a failure on the primary and notify the standby database server.” (Image: CC0 Public Domain)
  • 13. EXISTING AUTOMATIC FAILOVER SOLUTIONS
      ● Promote a replica when the master is not responding
        ○ Split brain/potentially many masters
      ● Use one monitor node to make decisions
        ○ Monitor node is a single point of failure
        ○ Former master needs to be killed (STONITH)
      ● Use multiple monitor nodes
        ○ Distributed consistency problem
  • 15. PATRONI APPROACH
      ● Use Distributed Configuration System (DCS): Etcd, Zookeeper or Consul
      ● Built-in distributed consensus (RAFT, Zab)
      ● Session/TTL to expire data (i.e. master key)
      ● Key-value storage for cluster information
      ● Atomic operations (CAS)
      ● Watches for important keys
  • 16. DCS STRUCTURE
      ● /service/cluster/
        ○ config
        ○ initialize
        ○ members/
          ■ dbnode1
          ■ dbnode2
        ○ leader
        ○ optime/
          ■ leader
        ○ failover
  • 17. KEYS THAT NEVER EXPIRE
      ● initialize
        ○ "key": "/service/testcluster/initialize", "value": "6303731710761975832"
      ● leader/optime
        ○ "key": "/service/testcluster/optime/leader", "value": "67393608"
      ● config
        ○ "key": "/service/testcluster/config", "value": "{"postgresql":{"parameters":{"max_connections":"200"}}}"
  • 18. KEYS WITH TTL
      ● leader
        ○ "key": "/service/testcluster/leader", "value": "dbnode2", "ttl": 22
      ● members
        ○ "key": "/service/testcluster/members/dbnode2", "value": "{"role":"master","state":"running","xlog_location":67393608, "conn_url":"postgres://172.17.0.3:5432/postgres", "api_url":"http://172.17.0.3:8008/patroni"}", "ttl": 22
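The member key on slide 18 holds a JSON document, so any tool that reads the DCS can decode it directly. A minimal sketch, using the value shown on the slide; the `describe` helper is a hypothetical name introduced here for illustration and is not part of Patroni:

```python
import json

# Value copied from the members/dbnode2 key shown on the slide.
member_value = (
    '{"role":"master","state":"running","xlog_location":67393608,'
    '"conn_url":"postgres://172.17.0.3:5432/postgres",'
    '"api_url":"http://172.17.0.3:8008/patroni"}'
)

member = json.loads(member_value)


def describe(name, member):
    """Hypothetical helper: one-line summary of a decoded member key."""
    return f"{name}: {member['role']} ({member['state']}) at xlog {member['xlog_location']}"


summary = describe("dbnode2", member)
print(summary)
```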
  • 19. BOOTSTRAPPING OF A NEW CLUSTER
      ● Initialization race
      ● initdb by a winner of an initialization race
      ● Waiting for the leader key by the rest of the nodes
      ● Bootstrapping of non-leader nodes (pg_basebackup)
  • 20. EVENT LOOP OF A RUNNING CLUSTER (MASTER)
      ● Update the leader key or demote if update failed
      ● Write the leader/optime (xlog position)
      ● Update the member key
      ● Add/delete replication slots for other members
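One pass of the master loop on slide 20 might be sketched as follows. This is an illustration under stated assumptions, not Patroni's actual code: `InMemoryDCS`, `update_leader`, `write` and `run_master_cycle` are hypothetical names, TTL handling is omitted, and the replication slot step is left out.

```python
import json


class InMemoryDCS:
    """Hypothetical stand-in for a DCS client; TTL handling is omitted."""

    def __init__(self, leader=None):
        self.keys = {}
        if leader is not None:
            self.keys["leader"] = leader

    def update_leader(self, name):
        # Succeeds only while `name` still owns the leader key.
        return self.keys.get("leader") == name

    def write(self, key, value):
        self.keys[key] = value


def run_master_cycle(name, xlog_position, dcs, demote):
    """One pass of the master loop; replication slot handling is left out."""
    if not dcs.update_leader(name):
        demote()  # the leader key could not be updated: step down
        return "demoted"
    dcs.write("optime/leader", str(xlog_position))
    member = {"role": "master", "state": "running", "xlog_location": xlog_position}
    dcs.write(f"members/{name}", json.dumps(member))
    return "leader"


demoted = []
healthy = InMemoryDCS(leader="dbnode2")
result_ok = run_master_cycle("dbnode2", 67393608, healthy, lambda: demoted.append("dbnode2"))

lost = InMemoryDCS(leader="dbnode1")
result_lost = run_master_cycle("dbnode2", 67393608, lost, lambda: demoted.append("dbnode2"))
```

Checking the leader key first means a node that has lost the key stops writing cluster state before doing anything else.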
  • 21. EVENT LOOP OF A RUNNING CLUSTER (REPLICA)
      ● Check that the cluster has a leader
        ○ Check recovery.conf points to the correct leader
        ○ Join the leader race if a leader is not present
      ● Add/delete replication slots for cascading replicas
      ● Update the member key
  • 22. LEADER RACE
      ● Check whether the member is the healthiest
        ○ Evaluate its xlog position against all other members
      ● Try to acquire the leader lock
      ● Promote itself to become a master after acquiring the lock
  • 23. LEADER RACE (diagram): node A issues CREATE (“/leader”, “A”, ttl=30, prevExists=False) and node B issues CREATE (“/leader”, “B”, ttl=30, prevExists=False). Only one request succeeds, and that node promotes itself; the other request fails.
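The create-if-absent call with a TTL from slide 23 can be imitated in memory to show why only one node wins the race. This is a sketch, not a real Etcd, Consul or Zookeeper client: `FakeDCS` and its methods are hypothetical, and in a real DCS the atomicity comes from the consensus protocol rather than from single-threaded Python.

```python
import time


class FakeDCS:
    """In-memory stand-in for a DCS; NOT a real Etcd/Consul/Zookeeper client."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # TTL elapsed: the key is gone
            entry = None
        return entry

    def create(self, key, value, ttl):
        """Create the key only if it does not exist (like prevExists=False)."""
        if self._live(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None


# A fake clock makes the TTL expiry deterministic.
now = [0.0]
dcs = FakeDCS(clock=lambda: now[0])

a_won = dcs.create("/leader", "A", ttl=30)  # first writer takes the key
b_won = dcs.create("/leader", "B", ttl=30)  # key exists, so this fails

now[0] = 31.0  # A never refreshed the key and the TTL has passed
b_retry = dcs.create("/leader", "B", ttl=30)
```

The last call shows the role of the TTL: if the holder stops refreshing the key, it expires and another node can acquire it.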
  • 25. PATRONI FEATURES
      ● Manual and Scheduled Failover
      ● Synchronous mode
      ● Attach the old master with pg_rewind
      ● Customizable replica creation methods
      ● Linux watchdog support (coming soon)
      ● Pause (maintenance) mode
      ● patronictl
  • 26. DYNAMIC CONFIGURATION
      ● Change Patroni/PostgreSQL parameters via Patroni REST API
        ○ Store them in DCS and apply dynamically on all nodes
      ● Ensure identical configuration of the following parameters on all members:
        ○ ttl, loop_wait, retry_timeout, maximum_lag_on_failover
        ○ wal_level, hot_standby
        ○ max_connections, max_prepared_transactions, max_locks_per_transaction, max_worker_processes, track_commit_timestamp, wal_log_hints
        ○ wal_keep_segments, max_replication_slots
      ● Inform the user that PostgreSQL needs to be restarted (pending_restart flag)
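A configuration change through the REST API could look roughly like the request built below. The slide does not name the endpoint; the `/config` path with the PATCH method is an assumption taken from the Patroni REST API documentation as I understand it and should be checked against the version in use. The host and port come from the `api_url` on slide 18, and `build_config_patch` is a hypothetical helper. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_config_patch(api_base, changes):
    """Hypothetical helper: build (but do not send) a config PATCH request."""
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base}/config",  # assumed endpoint path
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


request = build_config_patch(
    "http://172.17.0.3:8008",
    {"postgresql": {"parameters": {"max_connections": "200"}}},
)

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```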
  • 27. BUILDING HA POSTGRESQL BASED ON PATRONI (Image by flickr user https://www.flickr.com/photos/brickset/)
      ● Client traffic routing
        ○ patroni callbacks
        ○ confd + haproxy, pgbouncer
      ● Backup and recovery
        ○ WAL-E, barman
      ● Monitoring
        ○ Nagios, zabbix, zmon
  • 28. SPILO: DOCKER + PATRONI + WAL-E + AWS/K8S
  • 31. WHEN SHOULD THE MASTER DEMOTE ITSELF?
      ● Chances of data loss vs write availability
      ● Avoiding too many master switches (retry_timeout, loop_wait, ttl)
      ● 2 x retry_timeout + loop_wait < ttl
      ● Zookeeper and Consul session duration quirks
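The inequality on slide 31 is easy to check before rolling out a configuration. A minimal sketch; `timing_is_safe` is a hypothetical helper and the numbers are illustrative values, not recommended settings:

```python
def timing_is_safe(ttl, loop_wait, retry_timeout):
    """Check the rule from the slide: 2 x retry_timeout + loop_wait < ttl."""
    return 2 * retry_timeout + loop_wait < ttl


# Illustrative values only (seconds).
ok = timing_is_safe(ttl=30, loop_wait=10, retry_timeout=5)        # 20 < 30
too_tight = timing_is_safe(ttl=30, loop_wait=10, retry_timeout=10)  # 30 < 30 is false
```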
  • 32. CHOOSING A NEW MASTER
      ● Reliability/performance of the host or connection
        ○ nofailover tag
      ● XLOG position
        ○ highest xlog position = the best candidate
        ○ xlog > leader/optime - maximum_lag_on_failover
          ■ maximum_lag_on_failover > size of WAL segment (16MB) for disaster recovery
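The selection rules on slide 32 can be combined into a small filter. This is a sketch of the stated criteria only, not Patroni's implementation: the member layout (a `tags` dictionary next to `xlog_location`), the function names and all numbers except the leader optime from slide 17 are assumptions made for illustration.

```python
def eligible(member, leader_optime, maximum_lag_on_failover):
    """Apply the nofailover tag and the lag rule from the slide."""
    if member.get("tags", {}).get("nofailover"):
        return False
    return member["xlog_location"] > leader_optime - maximum_lag_on_failover


def pick_candidate(members, leader_optime, maximum_lag_on_failover):
    """Return the eligible member with the highest xlog position, or None."""
    candidates = {
        name: m
        for name, m in members.items()
        if eligible(m, leader_optime, maximum_lag_on_failover)
    }
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name]["xlog_location"])


leader_optime = 67393608  # value of optime/leader from slide 17
max_lag = 1048576         # hypothetical maximum_lag_on_failover, in bytes

members = {
    "dbnode1": {"xlog_location": 67393508},                              # slightly behind
    "dbnode3": {"xlog_location": 60000000},                              # too far behind
    "dbnode4": {"xlog_location": 67393608, "tags": {"nofailover": True}},  # excluded by tag
}

best = pick_candidate(members, leader_optime, max_lag)
none_left = pick_candidate({"dbnode3": members["dbnode3"]}, leader_optime, max_lag)
```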
  • 33. ATTACHING THE OLD MASTER BACK AS REPLICA
      ● Diverged timelines after the former master crash
      ● pg_rewind
        ○ use_pg_rewind
        ○ remove_data_directory_on_rewind_failure
  • 34. USEFUL LINKS
      ● Spilo: https://github.com/zalando/spilo
      ● Confd: http://www.confd.io
      ● Etcd: https://github.com/coreos/etcd
      ● RAFT: http://thesecretlivesofdata.com/raft/