Hot Streaming Replication in Postgres v9.3
Setup, Failover and Rebuilding the Master made easy with Postgres v9.3
20/6/2014
PREPARE THE INSTANCE
• Install Postgres on the servers which are going to hold the primary and secondary databases
• Set up and configure the database cluster on the primary server
• In this example (both clusters run on the same host, on different ports):
  - Primary DB server: Dbserver1, 192.168.160.155
    Data directory: /opt/PostgresPlus/9.3AS/data
    Port: 5445
  - Standby DB server: Dbserver2, 192.168.160.155
    Data directory: /opt/PostgresPlus/9.3AS/data2
    Port: 5446
EDIT postgresql.conf AND pg_hba.conf ON MASTER
• wal_level = hot_standby (mandatory)
• max_wal_senders = 3 (mandatory; must be a positive integer)
• wal_keep_segments = 128 (optional, depending on load)
• wal_sender_timeout = 5s (optional; this setting was called replication_timeout before 9.3)
• hot_standby = on (takes effect only on the hot standby server)
• Add an entry in pg_hba.conf:
  host replication enterprisedb 192.168.160.155/32 trust
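Put together, the changes above look like this (values as on this slide; trust authentication is for demo purposes only):

```conf
# postgresql.conf on the master
wal_level = hot_standby        # mandatory for streaming replication
max_wal_senders = 3            # mandatory; one per standby plus some spare
wal_keep_segments = 128        # optional; size according to load
wal_sender_timeout = 5s        # optional; called replication_timeout before 9.3

# postgresql.conf on the standby
hot_standby = on               # allow read-only queries on the standby

# pg_hba.conf on the master
host  replication  enterprisedb  192.168.160.155/32  trust
```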
TAKE A BACKUP AND RESTORE ON SECONDARY
• Take a backup of the primary DB instance/cluster
• pg_basebackup can take a hot backup directly from the running primary; an archived base backup plus WALs is an alternative source
• Restore it to create the DB cluster/instance on the secondary server
• Change the port number in the new DB cluster if required
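A minimal sketch of the backup step, using the host, port, and paths from the example setup. The command is only assembled and echoed here, since actually running it needs a live primary:

```shell
# Values from the example setup above.
PRIMARY_HOST=192.168.160.155
PRIMARY_PORT=5445
STANDBY_DATA=/opt/PostgresPlus/9.3AS/data2

# -X stream pulls the WAL generated during the backup over the same
# connection, so no archived WAL is needed for a consistent copy.
CMD="pg_basebackup -h $PRIMARY_HOST -p $PRIMARY_PORT -U enterprisedb -D $STANDBY_DATA -X stream"
echo "$CMD"
```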
CREATE recovery.conf ON THE SECONDARY SERVER
• standby_mode = 'on' # mandatory
• primary_conninfo = 'host=192.168.160.155 port=5445 user=enterprisedb password=password'
• recovery_target_timeline = 'latest' # optional, but important for rebuilding
• trigger_file = '/opt/PostgresPlus/9.3AS/data2/recover.trigger' # optional
• Note: pg_basebackup has the -R option to create a default recovery.conf file while taking the backup:
  pg_basebackup -h 192.168.160.155 -p 5445 -U enterprisedb -D /opt/PostgresPlus/9.3AS/data2 -R
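The settings above assembled into a recovery.conf (the password is a placeholder):

```conf
# recovery.conf in the standby's data directory (data2)
standby_mode = 'on'                                             # mandatory
primary_conninfo = 'host=192.168.160.155 port=5445 user=enterprisedb password=password'
recovery_target_timeline = 'latest'                             # important for re-mastering later
trigger_file = '/opt/PostgresPlus/9.3AS/data2/recover.trigger'  # standby promotes when this file appears
```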
START THE SERVERS
• Start the secondary server
• Its log will warn that the primary server is not available; ignore that
• Start the primary server
TEST REPLICATION
• On the primary (assuming a test table replication_test, already holding the value 1):
edb=# insert into replication_test values (2);
INSERT 0 1
• On the secondary:
edb=# select * from replication_test;
 test_column
-------------
           1
           2
(2 rows)
• The secondary server is read-only:
edb=# insert into replication_test values (3);
ERROR:  cannot execute INSERT in a read-only transaction
TRIGGERING THE FAILOVER
• After a failure on the primary, create the recovery trigger file on the standby (manually here, but this can be scripted too), or promote it with pg_ctl:
touch /opt/PostgresPlus/9.3AS/data2/recover.trigger
# or:
pg_ctl promote -D /opt/PostgresPlus/9.3AS/data2
• Logic to script the above step (the probe must use psql; pg_ctl cannot run queries):
while psql -h 192.168.160.155 -p 5445 -c "select 1"; do
  sleep $connection_wait_time
done
touch /opt/PostgresPlus/9.3AS/data2/recover.trigger
• Once promotion completes, recovery.conf is renamed to recovery.done
• Connect to the secondary DB and execute an insert to confirm the failover:
edb=# insert into replication_test values (4);
INSERT 0 1
• Or execute select pg_is_in_recovery(); (output must be "f") to confirm recovery is complete
• Point the application/virtual IP to the new database server
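The scripted step above can be sketched as a small watcher. check_primary is stubbed to fail immediately so the sketch runs without a server; in production it would be the psql probe shown in the comment, and TRIGGER would be the trigger_file path from recovery.conf:

```shell
PRIMARY_HOST=192.168.160.155
PRIMARY_PORT=5445
TRIGGER=/tmp/recover.trigger   # stand-in for /opt/PostgresPlus/9.3AS/data2/recover.trigger
WAIT=5

check_primary() {
  # Real probe would be:
  # psql -h "$PRIMARY_HOST" -p "$PRIMARY_PORT" -U enterprisedb -c "select 1" >/dev/null 2>&1
  false   # stub: pretend the primary is already down
}

# Poll until the primary stops answering, then drop the trigger file;
# the standby notices it and promotes itself.
while check_primary; do
  sleep "$WAIT"
done
touch "$TRIGGER"
echo "created $TRIGGER"
```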
TRIGGERING THE SWITCHOVER
• Disconnect all applications from the primary node
• Shut down the primary database
• Create the recovery trigger file on the standby (or promote it with pg_ctl):
touch /opt/PostgresPlus/9.3AS/data2/recover.trigger
• Once promotion completes, recovery.conf is renamed to recovery.done
• Connect to the secondary DB and execute an insert:
edb=# insert into replication_test values (4);
INSERT 0 1
• Or execute select pg_is_in_recovery(); (output must be "f") to confirm recovery is complete
• Point the application/virtual IP to the new database server
HANDLING MULTIPLE REPLICAS
• In v9.3, re-mastering does not require rebuilding the slaves
• In v9.3, timeline switches are part of the WAL, so they can be replicated as well
• Timeline switches happen during PITR or when a slave is promoted
• Other replicas can be reconfigured and restarted to receive WAL from the new primary, without rebuilding them from scratch
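Reconfiguring a surviving replica to follow the new primary is then just a primary_conninfo change and a restart. The host and port below are illustrative, assuming the promoted standby from this example (port 5446):

```conf
# recovery.conf on a remaining replica, after the standby's promotion
standby_mode = 'on'
primary_conninfo = 'host=192.168.160.155 port=5446 user=enterprisedb password=password'
recovery_target_timeline = 'latest'   # follow the new timeline created by the promotion
```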
RE-MASTERING
• Before failover: Srv1 is the master; Srv2 and Srv3 are slaves
• After failover: Srv1 has crashed; Srv2 is the new master; Srv3 stays a slave
• Reconfigure Srv3 to pull WAL from Srv2 and restart it
• No need to rebuild Srv3 or restore archives: the timeline switch info is received from Srv2 via streamed WAL
REBUILDING THE MASTER
• If the old primary needs to be added back to the cluster as a slave, it need not be rebuilt
• Prior to v9.3 you either needed WAL archives to add a lost primary back as a slave, or you had to rebuild it
• In v9.3, timeline switches are part of the WAL, so they can be replicated as well
• As long as all WAL since the failure is available, you can add the lost master back without downtime or rebuilding
• Copy recovery.done from the new primary into the lost primary's data directory as recovery.conf
• Update the connection information and start the old primary instance as the new hot standby
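For the last two steps, the old primary's recovery.conf ends up pointing at the promoted standby. The values below assume the single-host example from earlier slides (new primary on port 5446):

```conf
# recovery.conf in /opt/PostgresPlus/9.3AS/data on the old primary
standby_mode = 'on'
primary_conninfo = 'host=192.168.160.155 port=5446 user=enterprisedb password=password'
recovery_target_timeline = 'latest'   # essential: the old master must follow the new timeline
```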
MONITORING THE REPLICATION
• Check whether the current node is a master or a slave:
  SELECT pg_is_in_recovery();
• See the current snapshot on master and slave:
  SELECT txid_current_snapshot();
• Get the latest replication status from the pg_stat_replication view
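A typical check on the master, selecting a few of the columns the 9.3 pg_stat_replication view provides (run via psql; the output depends on your setup):

```sql
-- On either node: returns 'f' on the master, 't' on a standby
SELECT pg_is_in_recovery();

-- On the master: one row per connected standby
SELECT application_name, client_addr, state,
       sent_location, replay_location, sync_state
FROM pg_stat_replication;
```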
Sameer Kumar
Ashnik Pte Ltd, Singapore
www.ashnik.com | sameer.kumar@ashnik.com
www.slideshare.net/sameerkasi200x |
www.twitter.com/sameerkasi200x
Follow my blog: pgpen.blogspot.com
