PostgreSQL Replication
22.09.2016
─
Suyog Shirgaonkar
Iconnect360 Sdn Bhd
The intermark vista tower
Kuala Lumpur, Malaysia 50400
Overview
When you’re using PostgreSQL, you can set up streaming replication quite easily in order
to keep your data redundant on two or more nodes. But we can also leverage the copies
of the data to balance the load on the primary database server. This document explains
how to set up a redundant PostgreSQL database with load balancing.
Goals
1. Achieve the redundancy using built-in streaming replication of PostgreSQL.
2. Use the slave data nodes to load balance the read queries.
Specifications
To achieve redundancy, I’ve chosen to use the built-in streaming replication of
PostgreSQL. Another tool, repmgr, will be used for easier control of the replication
between the redundant PostgreSQL instances. repmgr doesn’t replicate data itself, but
it lets you easily control the replication and the standby server(s) and monitor the
status of the whole replication process.
In order to use both (or more, if you like) copies of the database, I will use
pgpool. Pgpool pools the connections to the nodes, monitors their status, and
triggers failover (via repmgr) if needed. Pgpool can also load balance traffic
based on the type of SQL query. For example, a SELECT query can perfectly well be
executed on a slave (read-only) node, saving resources on the master (read-write) node.
Milestones
I. Setup
We will set up two nodes as a redundant PostgreSQL DB (a primary and a standby)
and one server which will do the pooling and distribution. The pool is a single
point of failure and ideally should also be redundant; for this case, the most
important thing is that all data stays available in case of failure.
As we can see in the above scheme, SQL queries will be distributed to the primary and
standby based on the type of query. Data written to the master gets replicated to the
standby.
In case of a failure of the primary, the standby should replace the primary and become
read-write. This will make sure that the database stays available for all the applications.
Meanwhile, one can investigate what happened to the old primary and, after a
checkup, start it as the new standby.
The version of PostgreSQL used in this document is 9.4, but one can easily replace
that with any other supported (recent) PostgreSQL version.
II. Prerequisites
Firewall and SELinux
We assume that iptables/firewalld and SELinux are disabled on the servers; we will not
cover those parts, as the objective of this document is only PostgreSQL failover
replication.
Hostname
Before we get started with the setup, it’s important that all nodes in the setup can
reach each other via their hostnames. If that’s not an option, you could use IPs
instead of hostnames, but that would make this explanation less clear.
Public Key Authentication
Another prerequisite is that all nodes can connect to each other over SSH as the
postgres user without a password prompt. SSH will be used to rsync the data from the
primary to the standby and to initiate a failover from pgpool. Password-less SSH can be
achieved with public key authentication.
To set up SSH authentication for the postgres user, we first need to create the user.
Then we generate a new RSA keypair on each node.
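The keypair-generation snapshot isn’t reproduced here; assuming the hostnames node1, node2 and pgpool used throughout this document, the steps on each node might look like:

```shell
# Run on every node (node1, node2, pgpool) as root:
useradd postgres                             # skip if the packages already created it
su - postgres
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa     # empty passphrase for unattended access
cat ~/.ssh/id_rsa.pub                        # collect this public key from every node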
As you can see, I did a cat of the public part of the keypair on each node. To allow all
machines to connect to each other and accept each other’s keys, we need to add the
generated public keys of all hosts to /var/lib/pgsql/.ssh/authorized_keys:
In the snapshot below, I added each of the generated public RSA keys to the
authorized_keys file on node1. Don’t forget the last line, which changes the ownership
to the postgres user, or SSH will not accept this file.
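A sketch of what that file ends up containing (the actual key material is elided):

```
# /var/lib/pgsql/.ssh/authorized_keys on node1 - one line per node
ssh-rsa AAAA... postgres@node1
ssh-rsa AAAA... postgres@node2
ssh-rsa AAAA... postgres@pgpool
```

And the ownership fix that sshd requires:

```shell
chown postgres:postgres /var/lib/pgsql/.ssh/authorized_keys
chmod 600 /var/lib/pgsql/.ssh/authorized_keys
```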
Since I want fully unattended access (no password or any other prompt), I will also
add all hosts to the known_hosts file. This prevents the host-fingerprint question on
the first connection:
Since the authorized_keys and known_hosts files on node1 are valid for all hosts, I’ll
copy these files to the other nodes.
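These two steps can be sketched as follows (hostnames as assumed above):

```shell
# On node1: pre-seed known_hosts so the first connection asks no fingerprint question
su - postgres -c 'ssh-keyscan node1 node2 pgpool >> ~/.ssh/known_hosts'

# Copy both files to the other nodes
su - postgres -c 'scp ~/.ssh/authorized_keys ~/.ssh/known_hosts postgres@node2:.ssh/'
su - postgres -c 'scp ~/.ssh/authorized_keys ~/.ssh/known_hosts postgres@pgpool:.ssh/'
```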
It’s also a good idea to test the SSH connection between all hosts as the postgres user,
from everywhere:
The root user on the pgpool server also executes commands on the postgres nodes; to
allow password-less authentication there, we copy the postgres user’s public key to the
root user’s authorized_keys:
III. Repository
The standard CentOS repositories do not contain pgpool or repmgr, and on CentOS 7 the
shipped PostgreSQL version is 9.2. Because we want to use PostgreSQL 9.4, repmgr and
pgpool, we need to add the PGDG repository to yum.
Let’s add the repo on all nodes:
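A sketch of adding the PGDG 9.4 repository (the exact RPM version in the URL may have changed since this was written):

```shell
# On all nodes:
yum install -y https://download.postgresql.org/pub/repos/yum/9.4/redhat/rhel-7-x86_64/pgdg-centos94-9.4-3.noarch.rpm
```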
Setup the primary PostgreSQL node
Now that we’re finished with all prerequisites, it’s time to really get started. The
first step is to install the PostgreSQL database on the master and configure it for
replication.
Let’s start by installing the required packages:
I have experienced some problems with the latest version of repmgr (repmgr-3.1.5-1 at
the time of writing), especially when recovering a failed node, so I used the previous
version (repmgr94-2.0.2-4); with the older version all problems seemed to be
resolved.
One can find the older version link here:
https://mirror.its.sfu.ca/mirror/CentOS-Third-Party/pgrpm/pgrpm-94/redhat/rhel-7-x86_6
4/repmgr94-2.0.2-4.rhel7.x86_64.rpm
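The installation step might then look like this (package names are the usual PGDG ones; the repmgr URL is the one given above):

```shell
# On node1 (repeat later on node2):
yum install -y postgresql94-server postgresql94-contrib
yum install -y https://mirror.its.sfu.ca/mirror/CentOS-Third-Party/pgrpm/pgrpm-94/redhat/rhel-7-x86_64/repmgr94-2.0.2-4.rhel7.x86_64.rpm
```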
After the installation of PostgreSQL and repmgr, let’s initialize PostgreSQL:
Configure database access by editing /var/lib/pgsql/9.4/data/pg_hba.conf
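The screenshot of the file isn’t reproduced; entries along these lines (hostnames and the 10.0.10.0/24 range used elsewhere in this document) would match the description below:

```
# /var/lib/pgsql/9.4/data/pg_hba.conf (appended entries, sketch)
host  repmgr       repmgr       node1          trust
host  repmgr       repmgr       node2          trust
host  replication  repmgr       node1          trust
host  replication  repmgr       node2          trust
host  all          pgpool-user  10.0.10.0/24   md5
host  all          all          10.0.10.0/24   md5
```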
The above configuration allows the DB nodes to connect to each other, and the pgpool
server to connect with the pgpool user. Entries with method trust allow connections
without a password; the remaining users have to authenticate with a password (method
md5). With this configuration, the pgpool server communicates with the DB nodes using a
password for DB operations.
Configure the PostgreSQL and streaming replication by editing
/var/lib/pgsql/9.4/data/postgresql.conf
It’s best to merge the changed parameters into the existing default postgresql.conf,
since it contains many useful comments. I have only listed the effective settings here
to keep things clear.
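The effective settings in the original were a screenshot; a typical set for 9.4 streaming replication with repmgr 2.x looks roughly like this (values are illustrative, not the author’s exact ones):

```
# /var/lib/pgsql/9.4/data/postgresql.conf (replication-related settings, sketch)
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 10
wal_keep_segments = 1000        # size this to your write volume
hot_standby = on
archive_mode = on
archive_command = 'cd .'        # no-op placeholder; we rely on streaming
shared_preload_libraries = 'repmgr_funcs'   # needed if you run repmgrd
```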
Create a directory for repmgr configuration files:
Configure the repmgr by editing the /var/lib/pgsql/repmgr/repmgr.conf
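The file contents were shown as a screenshot; for repmgr 2.x a minimal node1 configuration looks like (the cluster name is assumed; on node2 use node=2, node_name=node2 and host=node2):

```
# /var/lib/pgsql/repmgr/repmgr.conf on node1 (sketch)
cluster=pg_cluster
node=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr'
pg_bindir=/usr/pgsql-9.4/bin/
```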
Now set the owner of the files to the postgres user:
Enable and start PostgreSQL on the master(node1):
Create required users for replication and repmgr and the repmgr DB:
The last step is to register node1 as the master node in repmgr:
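A sketch of these steps (user names follow the rest of the document; the pgpool user’s password secret is the one referenced later when configuring pgpool):

```shell
systemctl enable postgresql-9.4 && systemctl start postgresql-9.4

su - postgres
createuser -s repmgr                               # repmgr management/replication user
createdb repmgr -O repmgr                          # repmgr metadata database
psql -c "CREATE USER \"pgpool-user\" WITH PASSWORD 'secret';"

repmgr -f /var/lib/pgsql/repmgr/repmgr.conf master register
```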
Setup the Standby PostgreSQL node
Setting up the standby node doesn’t require a lot of configuration. On the first sync
with the primary, most of the configuration is taken over from there.
As with the primary, the first step is to install the necessary packages:
Once all the packages are installed, we can sync the configuration and content of the
primary with the standby.
In case you experience problems with the synchronization, delete the contents of the
data directory first and let the clone recreate it: the PostgreSQL backup/restore
process doesn’t overwrite existing content, so the directory has to be cleared before
running the command.
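With repmgr 2.x, the sync can be sketched as follows (run on node2 as postgres, with PostgreSQL stopped):

```shell
# Clear any stale contents first - the clone does not overwrite existing files
rm -rf /var/lib/pgsql/9.4/data/*

repmgr -D /var/lib/pgsql/9.4/data -d repmgr -U repmgr --verbose standby clone node1
```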
After the synchronization with the primary, configure repmgr as we did on the primary,
starting with the directory for the configuration files:
Then edit /var/lib/pgsql/repmgr/repmgr.conf, and don’t forget the last step of changing
the owner of the file.
Now let’s start and enable the PostgreSQL service on the standby server:
As the last step in the standby node setup, register the standby in repmgr:
Test the replication
Before we test, let’s first look at the status of repmgr:
This looks good. At this point, updates on the master should be replicated to the
slave, which we can verify by creating a new database on the primary node (node1).
Before we create a database on the primary, let’s take a look at the list of databases
on the standby node:
Now, let’s create a database on the primary node:
List the databases on the slave node using the same command we ran on the primary:
As you can see, the new database test got replicated to the standby:
As one more test, to check whether the standby really is in read-only mode, we will
actually try to create a database on the standby:
The above result looks good: we don’t want anybody updating the standby directly;
everything should pass via the primary and get replicated to the standby.
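The test can be sketched as follows (the error text is what a 9.4 hot standby reports):

```shell
# On the primary:
psql -h node1 -U postgres -c 'CREATE DATABASE test;'

# On the standby - the new database shows up in the list:
psql -h node2 -U postgres -l

# Writing to the standby is refused:
psql -h node2 -U postgres -c 'CREATE DATABASE test2;'
# ERROR:  cannot execute CREATE DATABASE in a read-only transaction
```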
Setup Pgpool
Set up pgpool to distribute queries to the primary and standby, and to make sure that
we fail over to the standby in case the primary isn’t available anymore.
As with the DB nodes, we start by installing the necessary packages:
Configure pgpool by copying the sample configuration file:
Edit the newly copied file /etc/pgpool-II-94/pgpool.conf; don’t replace the whole
content, but change the settings listed below:
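The edited settings were shown as screenshots; a sketch of the relevant parameters (names are standard pgpool-II ones, values follow this document’s hosts and users) is:

```
# /etc/pgpool-II-94/pgpool.conf (settings to change, sketch)
listen_addresses = '*'
port = 9999

backend_hostname0 = 'node1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/9.4/data'
backend_hostname1 = 'node2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/9.4/data'

enable_pool_hba = on
pool_passwd = 'pool_passwd'

master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_user = 'pgpool-user'
sr_check_password = 'secret'

load_balance_mode = on

health_check_period = 10
health_check_user = 'pgpool-user'
health_check_password = 'secret'

failover_command = '/etc/pgpool-II-94/failover.sh %d %H'
```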
As mentioned in the settings above, failover.sh should be executed on failover, so we
need to create this script as /etc/pgpool-II-94/failover.sh:
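The script itself was a screenshot; a reconstruction consistent with the behaviour described later (logging to /tmp/pgpool_failover.log and promoting the standby via repmgr over password-less SSH) might be:

```shell
#!/bin/bash
# /etc/pgpool-II-94/failover.sh - called by pgpool as: failover.sh %d %H
FAILED_NODE=$1     # id of the node that went down
NEW_MASTER=$2      # hostname of the new master candidate
LOGFILE=/tmp/pgpool_failover.log

echo "$(date) - failover: node ${FAILED_NODE} down, promoting ${NEW_MASTER}" >> "$LOGFILE"

# Promote the surviving standby (password-less SSH, see prerequisites)
ssh postgres@"${NEW_MASTER}" \
    "repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote" >> "$LOGFILE" 2>&1
```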
Make sure that this script is executable:
We also use pool_hba.conf for access control, so create this file in /etc/pgpool-II-94
with the content below:
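A sketch matching the description below:

```
# /etc/pgpool-II-94/pool_hba.conf
# TYPE  DATABASE  USER  CIDR-ADDRESS    METHOD
host    all       all   10.0.10.0/24    md5
```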
The above configuration allows all users to connect from the IP range 10.0.10.0/24
using a password.
Every user that needs to connect via pgpool needs to be added in pool_passwd. We need
to create this file and change the owner to postgres.
In the above snapshot we added the pgpool-user, which we created earlier, to the file.
We need to repeat this last step for every user that needs access to the database.
The last step is to allow connections via PCP to manage pgpool. This requires a similar
approach as with pool_passwd.
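Both files can be populated with pg_md5 (syntax as in pgpool-II 3.x; the password secret matches the user created earlier):

```shell
# pool_passwd: register pgpool-user (repeat for every DB user)
pg_md5 --md5auth --username=pgpool-user secret
# (add -f /etc/pgpool-II-94/pgpool.conf if pool_passwd is not found automatically)
chown postgres:postgres /etc/pgpool-II-94/pool_passwd

# pcp.conf: PCP admin user as username:md5hash
echo "postgres:$(pg_md5 secret)" >> /etc/pgpool-II-94/pcp.conf
```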
This completes the pgpool configuration.
Let’s start and enable the pgpool service:
Test pgpool
Hopefully everything has gone well so far; we should now have a working installation of
pgpool that will execute queries on the replicated PostgreSQL nodes.
Let’s test if pgpool is working by executing a query on the database via pgpool:
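A query through pgpool (port 9999 as configured above, using the pgpool-user created earlier) can be sketched as:

```shell
psql -h pgpool -p 9999 -U pgpool-user -d postgres -c 'SELECT version();'
```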
We should get the same result from any other host that can reach the pgpool server.
Pgpool has a range of show commands that allow us to query the status of
pgpool itself. They are issued as normal SQL but are intercepted by pgpool.
Check the status of nodes connected to pgpool:
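A reconstructed example of the status query and its output (column layout varies slightly between pgpool versions):

```shell
psql -h pgpool -p 9999 -U pgpool-user -d postgres -c 'show pool_nodes;'
#  node_id | hostname | port | status | lb_weight |  role
# ---------+----------+------+--------+-----------+---------
#  0       | node1    | 5432 | 2      | 0.5       | primary
#  1       | node2    | 5432 | 2      | 0.5       | standby
```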
As we can see, node1 is the primary and node2 is the standby node. The status value 2
means the node is online.
Test failover
To test the failover scenario, we bring down the primary node and check whether the
standby node becomes the master:
After the failover, the new state of our replication looks like this:
Status from pgpool-server:
In the above screenshot we can see that the former slave (node2) has become the new
primary, and the former primary (node1) is now a standby with an offline status (3).
Result in repmgr:
The failover.sh script we defined was called when the primary became unavailable; in
it, we specified /tmp/pgpool_failover.log as the log file.
Below are the contents of that file after the failover:
As we can see, the database stays available via pgpool and the former standby became
read-write.
Recover after failover to master-slave replication
After the failover, we are in a non-redundant situation. Everything keeps working for
users of the database, but we need to get back to a primary-standby configuration.
The new state of replication will be:
The first thing is to troubleshoot why the primary DB became unavailable; once the
problem is resolved, we can start using that node as the new standby server:
Make sure that database is stopped on node1:
Then we sync the data from the new primary (node2) to the previously failed node
(node1). If the database was updated in the meantime, we need to make sure those
changes get replicated to node1 before we start it as a standby:
Also, as mentioned earlier, the PostgreSQL backup/restore procedure doesn’t overwrite
the data in the /var/lib/pgsql/9.4/data directory. I simply deleted all the data in the
directory and, with the help of the repmgr command, synced all the updated data from
the new primary node (node2):
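A sketch of the recovery, mirroring the original standby setup but cloning from node2:

```shell
# On node1, as root:
systemctl stop postgresql-9.4          # make sure the old primary is down
rm -rf /var/lib/pgsql/9.4/data/*       # the clone will not overwrite stale data

su - postgres -c 'repmgr -D /var/lib/pgsql/9.4/data -d repmgr -U repmgr --verbose standby clone node2'
```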
With the new standby node1 ready, let’s start the service; repmgr automatically notices
that a new standby has become available:
For pgpool, we need to re-attach the failed node in order for it to be visible and
usable as a standby node. The password requested by the command below is secret, which
we set while creating the pgpool user.
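With the pre-3.5 pcp positional syntax (timeout, host, PCP port, user, password, node id), re-attaching node1 (backend id 0) might look like:

```shell
pcp_attach_node 10 localhost 9898 postgres secret 0
```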
At this point we are back in a redundant state, where our old standby (node2) functions
as the primary and the old primary (node1) functions as a standby. Both machines are
now equal; we can leave the situation as it is and continue this way, or revert to the
original roles.
Recover the original situation
In case we want to go back to the initial setup (node1 as primary and node2 as
standby), we can initiate a manual failover by stopping the current primary (node2) and
recovering it for use as a standby:
Once the master (node2) is stopped, a failover is triggered and node1 is assigned as
primary again:
Now, sync node2 with the new primary (node1) and start it:
Re-attach node2 to pgpool in order to use it as a standby:
After going through all the above steps, we are back in the initial situation where
node1 acts as the primary and node2 acts as the standby.

More Related Content

What's hot

Replication in PostgreSQL tutorial given in Postgres Conference 2019
Replication in PostgreSQL tutorial given in Postgres Conference 2019Replication in PostgreSQL tutorial given in Postgres Conference 2019
Replication in PostgreSQL tutorial given in Postgres Conference 2019Abbas Butt
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performancePostgreSQL-Consulting
 
Query Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQLQuery Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQLChristian Antognini
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQLJim Mlodgenski
 
Postgresql stored procedure
Postgresql stored procedurePostgresql stored procedure
Postgresql stored procedureJong Woo Rhee
 
OpenGurukul : Database : PostgreSQL
OpenGurukul : Database : PostgreSQLOpenGurukul : Database : PostgreSQL
OpenGurukul : Database : PostgreSQLOpen Gurukul
 
How to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleHow to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleMariaDB plc
 
Flexviews materialized views for my sql
Flexviews materialized views for my sqlFlexviews materialized views for my sql
Flexviews materialized views for my sqlJustin Swanhart
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsAlexander Korotkov
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundMasahiko Sawada
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergendistributed matters
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaPostgreSQL-Consulting
 
Advanced Sharding Techniques with Spider (MUC2010)
Advanced Sharding Techniques with Spider (MUC2010)Advanced Sharding Techniques with Spider (MUC2010)
Advanced Sharding Techniques with Spider (MUC2010)Kentoku
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL AdministrationEDB
 

What's hot (20)

Replication in PostgreSQL tutorial given in Postgres Conference 2019
Replication in PostgreSQL tutorial given in Postgres Conference 2019Replication in PostgreSQL tutorial given in Postgres Conference 2019
Replication in PostgreSQL tutorial given in Postgres Conference 2019
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
Crunchy containers
Crunchy containersCrunchy containers
Crunchy containers
 
Query Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQLQuery Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQL
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Backup and-recovery2
Backup and-recovery2Backup and-recovery2
Backup and-recovery2
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
Postgresql stored procedure
Postgresql stored procedurePostgresql stored procedure
Postgresql stored procedure
 
PostgreSQL
PostgreSQLPostgreSQL
PostgreSQL
 
OpenGurukul : Database : PostgreSQL
OpenGurukul : Database : PostgreSQLOpenGurukul : Database : PostgreSQL
OpenGurukul : Database : PostgreSQL
 
How to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleHow to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScale
 
Flexviews materialized views for my sql
Flexviews materialized views for my sqlFlexviews materialized views for my sql
Flexviews materialized views for my sql
 
Postgresql
PostgresqlPostgresql
Postgresql
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
 
Advanced Sharding Techniques with Spider (MUC2010)
Advanced Sharding Techniques with Spider (MUC2010)Advanced Sharding Techniques with Spider (MUC2010)
Advanced Sharding Techniques with Spider (MUC2010)
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
 

Viewers also liked

Inside PostgreSQL Shared Memory
Inside PostgreSQL Shared MemoryInside PostgreSQL Shared Memory
Inside PostgreSQL Shared MemoryEDB
 
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...Command Prompt., Inc
 
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlPostgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlReactive.IO
 
Dba PostgreSQL desde básico a avanzado parte2
Dba PostgreSQL desde básico a avanzado parte2Dba PostgreSQL desde básico a avanzado parte2
Dba PostgreSQL desde básico a avanzado parte2EQ SOFT EIRL
 
Managing replication of PostgreSQL, Simon Riggs
Managing replication of PostgreSQL, Simon RiggsManaging replication of PostgreSQL, Simon Riggs
Managing replication of PostgreSQL, Simon RiggsFuenteovejuna
 
What Makes Great Infographics
What Makes Great InfographicsWhat Makes Great Infographics
What Makes Great InfographicsSlideShare
 
TEDx Manchester: AI & The Future of Work
TEDx Manchester: AI & The Future of WorkTEDx Manchester: AI & The Future of Work
TEDx Manchester: AI & The Future of WorkVolker Hirsch
 
Masters of SlideShare
Masters of SlideShareMasters of SlideShare
Masters of SlideShareKapost
 
STOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
STOP! VIEW THIS! 10-Step Checklist When Uploading to SlideshareSTOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
STOP! VIEW THIS! 10-Step Checklist When Uploading to SlideshareEmpowered Presentations
 
10 Ways to Win at SlideShare SEO & Presentation Optimization
10 Ways to Win at SlideShare SEO & Presentation Optimization10 Ways to Win at SlideShare SEO & Presentation Optimization
10 Ways to Win at SlideShare SEO & Presentation OptimizationOneupweb
 
How To Get More From SlideShare - Super-Simple Tips For Content Marketing
How To Get More From SlideShare - Super-Simple Tips For Content MarketingHow To Get More From SlideShare - Super-Simple Tips For Content Marketing
How To Get More From SlideShare - Super-Simple Tips For Content MarketingContent Marketing Institute
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShareSlideShare
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShareSlideShare
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksSlideShare
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShareSlideShare
 

Viewers also liked (16)

Inside PostgreSQL Shared Memory
Inside PostgreSQL Shared MemoryInside PostgreSQL Shared Memory
Inside PostgreSQL Shared Memory
 
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...
 
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlPostgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
 
Dba PostgreSQL desde básico a avanzado parte2
Dba PostgreSQL desde básico a avanzado parte2Dba PostgreSQL desde básico a avanzado parte2
Dba PostgreSQL desde básico a avanzado parte2
 
Managing replication of PostgreSQL, Simon Riggs
Managing replication of PostgreSQL, Simon RiggsManaging replication of PostgreSQL, Simon Riggs
Managing replication of PostgreSQL, Simon Riggs
 
What Makes Great Infographics
What Makes Great InfographicsWhat Makes Great Infographics
What Makes Great Infographics
 
TEDx Manchester: AI & The Future of Work
TEDx Manchester: AI & The Future of WorkTEDx Manchester: AI & The Future of Work
TEDx Manchester: AI & The Future of Work
 
Masters of SlideShare
Masters of SlideShareMasters of SlideShare
Masters of SlideShare
 
STOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
STOP! VIEW THIS! 10-Step Checklist When Uploading to SlideshareSTOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
STOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
 
You Suck At PowerPoint!
You Suck At PowerPoint!You Suck At PowerPoint!
You Suck At PowerPoint!
 
10 Ways to Win at SlideShare SEO & Presentation Optimization
10 Ways to Win at SlideShare SEO & Presentation Optimization10 Ways to Win at SlideShare SEO & Presentation Optimization
10 Ways to Win at SlideShare SEO & Presentation Optimization
 
How To Get More From SlideShare - Super-Simple Tips For Content Marketing
How To Get More From SlideShare - Super-Simple Tips For Content MarketingHow To Get More From SlideShare - Super-Simple Tips For Content Marketing
How To Get More From SlideShare - Super-Simple Tips For Content Marketing
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & Tricks
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShare
 

Similar to ProstgreSQLFailoverConfiguration

PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...
PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...
PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...Puppet
 
Percona Cluster Installation with High Availability
Percona Cluster Installation with High AvailabilityPercona Cluster Installation with High Availability
Percona Cluster Installation with High AvailabilityRam Gautam
 
How To Install Openbravo ERP 2.50 MP43 in Ubuntu
How To Install Openbravo ERP 2.50 MP43 in UbuntuHow To Install Openbravo ERP 2.50 MP43 in Ubuntu
How To Install Openbravo ERP 2.50 MP43 in UbuntuWirabumi Software
 
[MathWorks] Versioning Infrastructure
[MathWorks] Versioning Infrastructure[MathWorks] Versioning Infrastructure
[MathWorks] Versioning InfrastructurePerforce
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2benjaminwootton
 
Oracle 11g Installation With ASM and Data Guard Setup
Oracle 11g Installation With ASM and Data Guard SetupOracle 11g Installation With ASM and Data Guard Setup
Oracle 11g Installation With ASM and Data Guard SetupArun Sharma
 
9 steps to install and configure postgre sql from source on linux
9 steps to install and configure postgre sql from source on linux9 steps to install and configure postgre sql from source on linux
9 steps to install and configure postgre sql from source on linuxchinkshady
 
How to Replicate PostgreSQL Database
How to Replicate PostgreSQL DatabaseHow to Replicate PostgreSQL Database
How to Replicate PostgreSQL DatabaseSangJin Kang
 
RAC-Installing your First Cluster and Database
RAC-Installing your First Cluster and DatabaseRAC-Installing your First Cluster and Database
RAC-Installing your First Cluster and DatabaseNikhil Kumar
 
Postgre sql best_practices
Postgre sql best_practicesPostgre sql best_practices
Postgre sql best_practicesJacques Kostic
 
Postgresql quick guide
Postgresql quick guidePostgresql quick guide
Postgresql quick guideAshoka Vanjare
 
Scale Apache with Nginx
Scale Apache with NginxScale Apache with Nginx
Scale Apache with NginxBud Siddhisena
 
Ilya Kosmodemiansky - An ultimate guide to upgrading your PostgreSQL installa...
Ilya Kosmodemiansky - An ultimate guide to upgrading your PostgreSQL installa...Ilya Kosmodemiansky - An ultimate guide to upgrading your PostgreSQL installa...
Ilya Kosmodemiansky - An ultimate guide to upgrading your PostgreSQL installa...PostgreSQL-Consulting
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Denish Patel
 
How to install and configure LEMP stack
How to install and configure LEMP stackHow to install and configure LEMP stack
How to install and configure LEMP stackRootGate
 
Tutorial to setup OpenStreetMap tileserver with customized boundaries of India
Tutorial to setup OpenStreetMap tileserver with customized boundaries of IndiaTutorial to setup OpenStreetMap tileserver with customized boundaries of India
Tutorial to setup OpenStreetMap tileserver with customized boundaries of IndiaArun Ganesh
 

Similar to ProstgreSQLFailoverConfiguration (20)

PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...
PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...
PuppetConf 2016: An Introduction to Measuring and Tuning PE Performance – Cha...
 
Percona Cluster Installation with High Availability
Percona Cluster Installation with High AvailabilityPercona Cluster Installation with High Availability
Percona Cluster Installation with High Availability
 
How To Install Openbravo ERP 2.50 MP43 in Ubuntu
How To Install Openbravo ERP 2.50 MP43 in UbuntuHow To Install Openbravo ERP 2.50 MP43 in Ubuntu
How To Install Openbravo ERP 2.50 MP43 in Ubuntu
 

PostgreSQL Failover Configuration

standby. In case of a failure of the primary, the standby should replace the primary and become read-write. This makes sure that the database stays available for all applications. In the meantime, one can investigate what happened to the old primary and, after a checkup, start it as the new standby. The PostgreSQL version used in this document is 9.4, but one can easily replace it with any other supported (recent) PostgreSQL version.
II. Prerequisites

Firewall and SELinux

We assume that iptables/firewalld and SELinux are disabled on the servers; we will not cover those parts, as the objective of this document is only PostgreSQL failover replication.

Hostname

Before we get started with the setup, it's important that all nodes can reach each other via their hostnames. If that's not an option, you could use IPs instead of hostnames, but that would make this explanation less clear.

Public Key Authentication

Another prerequisite is that all nodes can connect to each other over SSH as the postgres user without a password prompt. SSH will be used to rsync the data from the primary to the standby and to initiate a failover from pgpool. Password-less SSH can be achieved with public key authentication.

To set up SSH authentication for the postgres user, we first need to create the user. Next, we generate a new RSA keypair on each of the nodes and print (cat) the public part of each keypair. To allow all machines to connect to each other and accept each other's keys, we add the generated public keys of all hosts to /var/lib/pgsql/.ssh/authorized_keys.
I added each of the generated public RSA keys to the authorized_keys file on node1. Don't forget, as the last step, to change the ownership of the file to the postgres user, or SSH will not accept it. Since I really want unattended access (no password or any other prompt), I also add all hosts to the known_hosts file. This avoids the question about accepting each host's fingerprint on the first connection. Since the authorized_keys and known_hosts files on node1 are now valid for all other hosts, I copy these files to the other nodes.
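The key setup described above was shown only as screenshots. A minimal sketch of the per-node steps follows; on the real servers the postgres user's home is /var/lib/pgsql, so the directory would be /var/lib/pgsql/.ssh, while a temporary directory is used here purely for illustration:

```shell
# Stand-in for /var/lib/pgsql/.ssh on a real node.
SSH_DIR="$(mktemp -d)/.ssh"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Generate an RSA keypair without a passphrase (repeat on every node).
ssh-keygen -q -t rsa -N '' -f "$SSH_DIR/id_rsa"

# Collect the public keys of all hosts into authorized_keys.
# In the real setup you append node1's, node2's and the pgpool
# server's id_rsa.pub here, not just the local one.
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

On the real nodes the resulting files are then owned by postgres (chown postgres:postgres) and copied to the other hosts, for example with scp.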
It's also a good idea to test the SSH connection between all hosts as the postgres user, from every node to every other node. In addition, the root user on the pgpool server executes commands on the postgres nodes; to allow password-less authentication there as well, we copy the postgres user's public key to the root user.

III. Repository

The standard CentOS repositories do not contain pgpool or repmgr, and on CentOS 7 the packaged PostgreSQL version is 9.2. Because we want to use PostgreSQL 9.4, repmgr and pgpool, we need to add the PGDG repository to Yum on all nodes.
Setup the primary PostgreSQL node

Now that we are finished with all prerequisites, it's time to really get started. The first step is to install the PostgreSQL database on the master and configure it for replication. Let's start with installing the required packages.

I have experienced some problems with the latest version of repmgr (repmgr-3.1.5-1 at the time of writing), especially when recovering a failed node, so I used a previous version (repmgr94-2.0.2-4); with the older version all problems seemed to be resolved. The older version can be found here:
https://mirror.its.sfu.ca/mirror/CentOS-Third-Party/pgrpm/pgrpm-94/redhat/rhel-7-x86_64/repmgr94-2.0.2-4.rhel7.x86_64.rpm

After the installation of PostgreSQL and repmgr, let's initialize PostgreSQL. Then configure database access by editing /var/lib/pgsql/9.4/data/pg_hba.conf.
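The pg_hba.conf screenshot is lost; a sketch consistent with the access rules described below might look like this (the hostnames, the 10.0.10.0/24 subnet and the user names pgpool-user and repmgr are assumptions, not taken from the original):

```
# TYPE   DATABASE      USER         ADDRESS          METHOD
local    all           all                           trust
host     all           all          127.0.0.1/32     trust
# the DB nodes trust each other (repmgr and replication traffic)
host     all           repmgr       node1            trust
host     all           repmgr       node2            trust
host     replication   repmgr       node1            trust
host     replication   repmgr       node2            trust
# everything else, including pgpool-user, must present a password
host     all           all          10.0.10.0/24     md5
```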
The pg_hba.conf configuration allows the DB nodes to connect from the other DB nodes, as well as from the pgpool server with the pgpool user. Entries with the trust method let those users connect without a password, while the remaining users have to authenticate with a password (since their method is md5). With this configuration the pgpool server can connect to the DB nodes with a password for DB operations.

Next, configure PostgreSQL and streaming replication by editing /var/lib/pgsql/9.4/data/postgresql.conf.
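The effective postgresql.conf settings were shown only as a screenshot. For PostgreSQL 9.4 streaming replication they would typically be along these lines (the concrete values are assumptions):

```
listen_addresses = '*'
wal_level = hot_standby       # WAL detailed enough for a hot standby
max_wal_senders = 10          # walsender slots for replication clients
wal_keep_segments = 100       # retain WAL for standbys that lag behind
hot_standby = on              # allow read-only queries on the standby
archive_mode = on
archive_command = 'cd .'      # placeholder command, no real archiving
```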
It's best to merge the changed parameters with the existing, default contents of postgresql.conf, since the default file contains many useful comments; only the effective settings are shown here to keep things clear.

Next, create a directory for the repmgr configuration files and configure repmgr by editing /var/lib/pgsql/repmgr/repmgr.conf. Then set the owner of these files to the postgres user.

Enable and start PostgreSQL on the master (node1), and create the required users for replication and repmgr, as well as the repmgr database.
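The repmgr.conf edited above is another lost screenshot. In the repmgr 2.0.x style used in this document, node1's file might look like this (the cluster name and conninfo values are assumptions):

```
cluster=pg_cluster
node=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr'
pg_bindir=/usr/pgsql-9.4/bin/
```

The standby gets the same file with node=2, node_name=node2 and host=node2 in the conninfo.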
The last step is to register node1 as the master node in repmgr.

Setup the Standby PostgreSQL node

Setting up the standby node doesn't require a lot of configuration; on the first sync with the primary, most of the configuration is taken over from there. As with the primary, the first step is to install the necessary packages.

Once all packages are installed, we can sync the configuration and content of the primary to the standby. If you experience problems with the synchronization, first delete the data and recreate it: the PostgreSQL backup/restore process doesn't overwrite existing content, so the data directory has to be emptied before running the command again.
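The registration and the initial sync were shown as screenshots. With repmgr 2.0.x the session would look roughly like this (run as postgres; paths and the --force option are assumptions):

```
# on node1: register the master
$ repmgr -f /var/lib/pgsql/repmgr/repmgr.conf master register

# on node2: empty the data directory, then clone from the primary
$ rm -rf /var/lib/pgsql/9.4/data/*
$ repmgr -f /var/lib/pgsql/repmgr/repmgr.conf \
      -D /var/lib/pgsql/9.4/data --force standby clone node1
```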
After the synchronization with the primary, configure repmgr just as we did on the primary: create the directory for the configuration files, edit /var/lib/pgsql/repmgr/repmgr.conf, and don't forget, as the last step, to change the owner of the file. Now start and enable the PostgreSQL service on the standby server. As the last step of the standby setup, register the standby in repmgr.

Test the replication

Before we test, let's first look at the status reported by repmgr. If that looks good, updates on the master should be replicated to the slave, which we can verify by creating a new database on the primary node (node1).
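The status screenshot is lost; repmgr's cluster overview would look roughly like this (output abridged and illustrative, format varies by repmgr version):

```
$ repmgr -f /var/lib/pgsql/repmgr/repmgr.conf cluster show
Role      | Connection String
* master  | host=node1 user=repmgr dbname=repmgr
  standby | host=node2 user=repmgr dbname=repmgr
```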
Before we create a database on the primary, let's take a look at the list of databases on the standby node. Now create a database on the primary node, then list the databases on the slave node using the same command as before. As you can see, the new database test got replicated to the standby.

We do one more test to check whether the standby is really in read-only mode, by actually trying to create a database on the standby. The result looks good: we don't want anybody to update the standby directly; everything should pass via the primary and get replicated to the standby.
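The test itself was shown as screenshots; an equivalent session would be along these lines (run as postgres; the error text is PostgreSQL's standard message for write attempts on a server in recovery):

```
# on node1 (primary): create a test database
$ createdb test

# on node2 (standby): the new database shows up
$ psql -c '\l'

# on node2: write attempts are rejected
$ psql -c 'CREATE DATABASE test2;'
ERROR:  cannot execute CREATE DATABASE in a read-only transaction
```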
Setup Pgpool

We set up pgpool in order to distribute the queries to the primary and the standby, and to make sure that we fail over to the standby in case the primary isn't available anymore. As with the DB nodes, we start by installing the necessary packages.

Configure pgpool by copying the sample configuration file, then edit the copy, /etc/pgpool-II-94/pgpool.conf. Don't replace the whole contents of the file; only edit the parameters listed. In those settings we specified that failover.sh should be executed on failover, so we need to create this script as /etc/pgpool-II-94/failover.sh and make sure it is executable.
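The failover script was only shown as a screenshot. A minimal sketch of such a script is written out below, assuming pgpool passes the failed node id, the old primary's id and the new master host (%d %P %H in pgpool.conf's failover_command); the argument order and the promote command are assumptions based on common repmgr/pgpool setups, and /tmp is used here only so the sketch is self-contained (the real path is /etc/pgpool-II-94/failover.sh):

```shell
# pgpool would call this as: failover.sh <failed_id> <old_primary_id> <new_master_host>
cat > /tmp/failover.sh <<'EOF'
#!/bin/bash
FAILED_NODE_ID="$1"
OLD_PRIMARY_ID="$2"
NEW_MASTER_HOST="$3"
LOG=/tmp/pgpool_failover.log

echo "$(date) failover: node ${FAILED_NODE_ID} failed" >> "$LOG"

# Only promote the standby when the failed node was the primary.
if [ "$FAILED_NODE_ID" = "$OLD_PRIMARY_ID" ]; then
    echo "$(date) promoting standby on ${NEW_MASTER_HOST}" >> "$LOG"
    ssh postgres@"$NEW_MASTER_HOST" \
        "repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote" \
        >> "$LOG" 2>&1
fi
EOF
chmod +x /tmp/failover.sh
```

When the failed node is not the primary, the script only logs the event and no promotion happens.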
We also need pool_hba.conf for access control, so create this file in /etc/pgpool-II-94. The configuration allows all users to connect from the IP range 10.0.10.0/24 using a password.

Every user that needs to connect via pgpool has to be added to pool_passwd. We create this file and change its owner to postgres, then add the pgpool user we created earlier. Repeat this last step for every user that needs access to the database.

The last step is to allow connections via PCP to manage pgpool, which requires a similar approach as pool_passwd. This completes the pgpool configuration; start and enable the pgpool service.

Test pgpool

If all went well so far, we have a working installation of pgpool that will execute queries on the replicated PostgreSQL nodes. Let's test whether pgpool works by executing a query on the database via pgpool.
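A pool_hba.conf consistent with the rule described above could be as small as this (only the subnet comes from the text):

```
# allow password (md5) authentication from the application network
host    all    all    10.0.10.0/24    md5
```

Entries in pool_passwd are typically generated with pgpool's pg_md5 utility, e.g. `pg_md5 --md5auth --username=pgpool-user secret`; the user name and password here are the ones this document uses as examples.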
We get the same result from another host that has access to the pgpool server. Pgpool has a range of show commands available that let us query the status of pgpool itself; they are executed like normal SQL but are intercepted by pgpool. Checking the status of the nodes connected to pgpool, we can see that node1 is the primary and node2 the standby, and that status 2 means a node is online.

Test failover

To test the failover scenario, we bring down the primary node and see whether the standby node becomes the master.
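The node-status check uses one of those show commands; abridged and illustrative (the exact columns vary by pgpool version):

```
$ psql -h pgpool -p 9999 -U pgpool-user -c 'show pool_nodes;'
 node_id | hostname | port | status | lb_weight |  role
---------+----------+------+--------+-----------+---------
 0       | node1    | 5432 | 2      | 0.500000  | primary
 1       | node2    | 5432 | 2      | 0.500000  | standby
```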
After the failover, the status from the pgpool server shows the new state of our replication: the former slave (node2) has become the new primary, and the former primary (node1) has become a standby with an offline status (3).
repmgr reports the same result. In failover.sh, the script we defined and which was called when the primary went down, we specified /tmp/pgpool_failover.log as the log file; its contents after the failover confirm what happened. As we can see, the database stays available via pgpool and the former standby became read-write.

Recover after failover to master-slave replication

After the failover we are in a non-redundant situation. Everything keeps working for the users of the database, but we need to get back to a primary-standby configuration.
The first thing is to troubleshoot why the primary DB became unavailable; once the problem is resolved, we can start using that node as the new standby server. Make sure that the database is stopped on node1. Then we sync the data from the new primary (node2) to the previously failed node (node1). If the database was updated in the meantime, we need to make sure those changes get replicated to node1 before we start it as a standby.

Also, as mentioned earlier, the PostgreSQL backup/restore procedure doesn't overwrite the data in the /var/lib/pgsql/9.4/data directory. I simply deleted all the data in the directory and
with the help of the repmgr command synced all the updated data from the new primary node (node2). With the new standby node1 ready, start the service; repmgr notices automatically that a new standby has become available.

For pgpool, we need to re-attach the failed node in order for it to be visible and usable as a standby. The password requested by the re-attach command is secret, which we set while creating the pgpool user.

At this point we are back in a redundant state, where our old standby (node2) functions as the primary and the old primary (node1) functions as a standby. Both machines are now equal; we can leave the situation as it is and continue this way, or we can revert to the original roles.
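Re-attaching goes through pgpool's PCP interface. With the pcp command syntax of pgpool-II 3.x (timeout, host, PCP port, user, password, node id; the concrete values here are assumptions), re-attaching node1 (node id 0) looks like:

```
$ pcp_attach_node 10 localhost 9898 pgpool secret 0
```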
Recover the original situation

In case we want to go back to the initial setup (node1 as primary, node2 as standby), we can initiate a manual failover by stopping the current primary (node2) and then recovering it for use as a standby. Once the master (node2) is stopped, a failover is triggered and node1 is assigned as primary again. Now sync node2 with the new primary (node1) and start it.
Re-attach node2 to pgpool in order to use it as a standby. After going through all the above steps, we are back in the initial situation where node1 acts as the primary and node2 acts as the standby.