Advanced PostgreSQL backup & recovery methods
Anastasia Lubennikova
Postgres@CERN 2020
1
Agenda
- Why backup?
- What is a good backup tool?
- Overview of advanced backup features
- Overview of PostgreSQL backup tools
Spoiler: this talk doesn’t contain any benchmarks.
2
Why do you need a backup?
- To restore the database after an accident
- hardware failure
- software bug
- human error
- To set up a new replica
- To create a test environment
- To inspect data from the past
3
What are the options?
- replica is not a backup
- dump a.k.a. “logical backup”
- storage snapshots
- pg_basebackup
- set of custom scripts
- PostgreSQL specific backup tools
4
What makes a good backup tool?
- Convenience
- out-of-the-box automation of various routines
- documentation & support
- convenient and stable API
- Performance
- parallel execution
- compression
- incremental & differential backups
- WAL prefetch
5
What backup tools exist?
- Barman
- pgBackRest
- pg_probackup
- WAL-G
- BART
- part of the “EDB Advanced Server”
- requires pg_basebackup
6
Who is who? Barman
- https://www.pgbarman.org/
- 2ndQuadrant
- GPL v 3.0
- Python
- first release: 2011
- Two methods: basebackup & rsync
Notable features:
Synchronous streaming for “zero data loss”.
7
Who is who? pgBackRest
- https://pgbackrest.org/
- Crunchy Data
- MIT License
- C
- first release: 2014
Notable features:
Performance optimizations for large backups.
8
Who is who? pg_probackup
- https://github.com/postgrespro/pg_probackup
- Postgres Professional
- PostgreSQL License
- C
- first release: 2017 (based on pg_arman)
Notable features:
Page-level incremental backups and built-in validation.
9
Who is who? WAL-G
- https://github.com/wal-g/wal-g
- introduced by Citus Data,
now maintained by Yandex Cloud team
- Apache License, Version 2.0
- Go
- first release: 2017 (“based on” WAL-E)
Notable features:
Out-of-the-box support for various cloud storage services.
10
Feature list
1. Documentation & Support
2. Backup management
3. WAL archive management
4. Incremental backups
5. Compression and parallel execution
6. Remote backup
7. Cloud backup
8. Advanced restore options
9. Backup validation
10. Backup retention
11
1. Documentation & Support
12
Documentation
Barman User guide & command reference.
Great overview of backup architectures
pgBackRest User guide & command reference
pg_probackup User guide & command reference
WAL-G README
13
Installation
Barman Linux packages, Build from source
pgBackRest Linux packages, Build from source
pg_probackup Linux packages, Build from source,
Windows installer
WAL-G Linux binary, Build from source
14
Support: bug fixes
Barman https://github.com/2ndquadrant-it/barman/issues
pgBackRest https://github.com/pgbackrest/pgbackrest/issues
pg_probackup https://github.com/postgrespro/pg_probackup/issues
WAL-G https://github.com/wal-g/wal-g/issues
15
Commercial support
Barman 2ndQuadrant
pgBackRest CrunchyData
pg_probackup Postgres Professional
WAL-G
16
2. Backup management
17
Set up new PostgreSQL instance
Barman server
configuration files
pgBackRest stanza
configuration files
pg_probackup instance
configuration files, set-config command
WAL-G -
config via environment variables
18
Backup information
Barman plain
pgBackRest plain, json
+ postgresql table
pg_probackup plain, json
+ detailed wal archive info
WAL-G plain, json
19
3. WAL archive management
20
WAL archive management
Barman rsync / get-wal
pgBackRest archive-push / archive-get
archive-async
pg_probackup archive-push / archive-get
WAL-G wal-push / wal-fetch
wal prefetch
21
Streaming backups
- Recovery Point Objective (RPO):
"maximum targeted period in which data might be lost
from an IT service due to a major incident"
- “RPO = 0” (Zero data loss)
can be achieved by synchronous WAL streaming
- replication slot
prevents the removal of WAL that is not yet received
(PostgreSQL feature)
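
To illustrate how a replication slot holds WAL back, here is a toy model in Python. It is only a sketch under stated assumptions: WAL segments are plain integers, `checkpoint_segment` stands for the oldest segment the server itself still needs, and `slot_needs` maps a hypothetical slot name to the oldest segment its consumer has not yet received. None of these names come from PostgreSQL.

```python
def removable_segments(segments, checkpoint_segment, slot_needs):
    """Return the WAL segments that may be removed in this toy model.

    A segment is removable only if it is older than both the server's own
    horizon and the oldest segment any replication slot still needs.
    """
    horizon = min([checkpoint_segment, *slot_needs.values()])
    return [s for s in sorted(segments) if s < horizon]


segments = [1, 2, 3, 4, 5, 6]

# Without a slot, everything older than the checkpoint may be recycled,
# even if the backup host has not received it yet.
print(removable_segments(segments, 5, {}))

# With a slot whose consumer still needs segment 3, segments 3 and 4 are kept.
print(removable_segments(segments, 5, {"backup_slot": 3}))
```

The flip side, visible in the model, is that a slot whose consumer stops receiving WAL keeps the horizon pinned, so WAL accumulates on the data server.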
22
Streaming backups
Barman streaming_archiver (pg_receivewal)
replication slot
pgBackRest
pg_probackup backup --stream
replication slot
WAL-G
23
4. Incremental backups
Full backup includes all data files.
Differential backup contains changes since last full backup.
Incremental backup contains changes since last backup.
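
The difference between the three backup types shows up at restore time, in the chain of backups that has to be read. The following Python sketch resolves that chain for a small, made-up catalog; the `Backup` class, its fields and the backup ids are hypothetical and do not reflect any tool's catalog format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backup:
    backup_id: str
    mode: str               # "full", "diff" or "incr"
    parent: Optional[str]   # id of the backup this one depends on


def restore_chain(backups, target_id):
    """Return the ids needed to restore `target_id`, oldest first."""
    by_id = {b.backup_id: b for b in backups}
    chain = []
    current = by_id[target_id]
    while True:
        chain.append(current.backup_id)
        if current.mode == "full":
            break
        current = by_id[current.parent]
    return list(reversed(chain))


catalog = [
    Backup("B1", "full", None),
    Backup("B2", "incr", "B1"),   # changes since B1
    Backup("B3", "incr", "B2"),   # changes since B2
    Backup("B4", "diff", "B1"),   # changes since the last full backup
]

print(restore_chain(catalog, "B3"))   # the whole incremental chain
print(restore_chain(catalog, "B4"))   # only the full backup and the differential
```

A differential backup always has a chain of length two, while an incremental chain grows with every backup taken since the last full one.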
24
Incremental backup methods
- DELTA - read everything, back up what changed
- independent method
- read load on data server
- PAGE - scan WAL to determine changed blocks
- requires WAL archive
- minimal load on data server
- PTRACK - remember changed blocks in a map
- requires core patch
- minimal load during backup
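
The trade-off between these methods is mostly about how many pages must be read to find the changed ones. The Python sketch below contrasts a DELTA-style scan with a lookup in a change map, as PTRACK keeps one (a block list obtained by scanning WAL, as in PAGE, would be used the same way). It assumes, purely for illustration, that a changed page is recognized by an LSN newer than the start of the parent backup; the data layout and function names are hypothetical.

```python
def delta_backup(pages, parent_start_lsn):
    """DELTA-style: read every page, keep those changed since the parent backup."""
    pages_read = 0
    changed = {}
    for block_no, (lsn, data) in pages.items():
        pages_read += 1
        if lsn > parent_start_lsn:
            changed[block_no] = data
    return changed, pages_read


def tracked_backup(pages, changed_blocks):
    """Map-based: read only the blocks already known to have changed."""
    pages_read = 0
    changed = {}
    for block_no in sorted(changed_blocks):
        pages_read += 1
        changed[block_no] = pages[block_no][1]
    return changed, pages_read


# block number -> (page LSN, page contents)
pages = {0: (100, b"a"), 1: (250, b"b"), 2: (90, b"c"), 3: (300, b"d")}

print(delta_backup(pages, parent_start_lsn=200))
print(tracked_backup(pages, changed_blocks={1, 3}))
```

Both calls produce the same set of changed pages; only the number of pages read from the data server differs.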
25
Incremental backups
Barman file-level incremental (DELTA)
pgBackRest file-level incremental (DELTA)
file-level differential (DELTA)
pg_probackup page-level incremental:
DELTA, PAGE, PTRACK
WAL-G page-level incremental (DELTA)
26
5. Compression and parallel execution
27
6. Remote backup
Barman SSH
pgBackRest SSH
pg_probackup SSH
WAL-G
28
7. Cloud backup
29
Backup to cloud storage
Barman scripts to ship backups to S3
pgBackRest Amazon S3
+ encryption
pg_probackup
WAL-G Amazon S3, Google Cloud Storage,
Azure Storage, Swift Object Storage
+ encryption
30
Extra backup features
- Backup from standby (All tools)
- to reduce load on master data server
- Resume backup (only pgBackRest)
31
8. Advanced restore options. PITR
Restore to a certain moment in time.
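
As a rough illustration of what a restore to a point in time involves, the Python sketch below picks the base backup for a given recovery target: the newest backup completed at or before the target, with WAL then replayed from that backup up to the target. The function name and the use of completion timestamps are assumptions made for the example; real tools work with LSNs and their own catalog data.

```python
from datetime import datetime


def plan_pitr(backup_finish_times, target_time):
    """Return (base backup time, target time) for a point-in-time restore."""
    usable = [t for t in backup_finish_times if t <= target_time]
    if not usable:
        raise ValueError("no backup is old enough for this recovery target")
    base = max(usable)
    # WAL between `base` and `target_time` must be available in the archive.
    return base, target_time


backups = [datetime(2020, 1, 1), datetime(2020, 1, 8), datetime(2020, 1, 15)]
base, target = plan_pitr(backups, datetime(2020, 1, 10, 12, 0))
print("restore backup from", base, "then replay WAL until", target)
```

The sketch also shows why the WAL archive matters: without the segments between the base backup and the target, the restore can only reach the state of the base backup itself.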
32
Point-in-time recovery
Barman recovery target options
pgBackRest recovery target options
pg_probackup recovery target options
WAL-G
33
Partial restore
Barman
pgBackRest restore selected databases
pg_probackup restore selected databases
WAL-G
34
9. Backup validation
35
Validate backups
Barman DIY with custom hooks
on backup & restore
pgBackRest page checksums on backup
pg_probackup page checksums on backup
validate on demand
check instance
WAL-G
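
The idea behind page-level validation can be sketched in a few lines of Python. This is not how any of the tools implement it: PostgreSQL stores its own checksum inside each page header, whereas the sketch uses `zlib.crc32` as a stand-in and keeps the expected values in a separate list. The 8192-byte page size matches the PostgreSQL default block size; the function names are hypothetical.

```python
import zlib

PAGE_SIZE = 8192  # default PostgreSQL block size


def checksum_pages(data, page_size=PAGE_SIZE):
    """Compute one checksum per page (CRC32 as a stand-in algorithm)."""
    return [zlib.crc32(data[i:i + page_size])
            for i in range(0, len(data), page_size)]


def find_corrupt_pages(data, expected, page_size=PAGE_SIZE):
    """Return the numbers of pages whose checksum differs from the expected one."""
    actual = checksum_pages(data, page_size)
    return [n for n, (a, e) in enumerate(zip(actual, expected)) if a != e]


original = bytes(PAGE_SIZE * 3)          # three zero-filled pages
expected = checksum_pages(original)

damaged = bytearray(original)
damaged[PAGE_SIZE + 10] ^= 0xFF          # flip one byte in the second page
print(find_corrupt_pages(bytes(damaged), expected))
```

Checking at page granularity means a validation run can report exactly which blocks of which file are damaged, rather than only flagging the whole file.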
36
10. Backup retention
37
10. Backup retention. Redundancy = 3
38
10. Backup retention. Window = 7 days
39
Retention policy
Barman retention_policy = REDUNDANCY
retention_policy = RECOVERY WINDOW
pgBackRest redundancy
pg_probackup --retention-redundancy
--retention-window
WAL-G redundancy: retain N
window: delete before
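
The two policies from the previous slides, redundancy = 3 and window = 7 days, can be sketched in Python as follows. The sketch considers full backups only and assumes that a window policy must also keep the newest backup older than the window, since that backup is needed to recover to the start of the window. The function names and dates are made up for the example.

```python
from datetime import datetime, timedelta


def keep_by_redundancy(backup_times, redundancy):
    """Keep the newest `redundancy` full backups."""
    if redundancy < 1:
        raise ValueError("redundancy must be at least 1")
    return sorted(backup_times)[-redundancy:]


def keep_by_window(backup_times, now, window_days):
    """Keep the backups needed to recover to any point within the window."""
    cutoff = now - timedelta(days=window_days)
    ordered = sorted(backup_times)
    inside = [t for t in ordered if t >= cutoff]
    older = [t for t in ordered if t < cutoff]
    # The newest backup before the cutoff is the base for restores
    # targeting the beginning of the window, so it is kept as well.
    return older[-1:] + inside


backup_times = [datetime(2020, 1, d) for d in (1, 5, 9, 12, 14)]
now = datetime(2020, 1, 15)

print(keep_by_redundancy(backup_times, 3))
print(keep_by_window(backup_times, now, 7))
```

With these dates the window policy keeps one backup more than the redundancy policy: the one taken on 5 January, just outside the 7-day window.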
40
Backup pinning
Barman
pgBackRest
pg_probackup ttl=0
WAL-G backup-mark
41
Archive retention
Barman
pgBackRest Archive Retention
--repo-retention-archive
pg_probackup delete --expired --wal
--wal-depth=1
WAL-G
42
Backup merging
Save space by merging old incremental backups.
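
Conceptually, merging replays each incremental backup on top of its parent so that the chain collapses into a single full backup. A minimal Python sketch of that idea, assuming each backup is modelled as a mapping from block number to page contents (a simplification that is not any tool's on-disk format):

```python
def merge_chain(full, increments):
    """Merge incremental backups (oldest first) into a copy of the full backup."""
    merged = dict(full)
    for increment in increments:
        # Newer versions of a block overwrite older ones.
        merged.update(increment)
    return merged


full = {0: "a0", 1: "b0", 2: "c0"}
incr_1 = {1: "b1"}
incr_2 = {1: "b2", 2: "c2"}

print(merge_chain(full, [incr_1, incr_2]))
```

After the merge, the intermediate versions of a block (such as `"b1"` here) are no longer stored, which is where the space saving comes from.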
43
Backup merging
Barman
pgBackRest
pg_probackup merge
--merge-expired
WAL-G
44
45
Conclusion
                                  Barman (rsync)  pgBackRest  pg_probackup  WAL-G
Support                           +               +           +             +
Backup management                 +               +           +             -
WAL management                    +               +           +             +
Incremental backup                +               +           +             +
Compression & parallel execution  +               +           +             +
46
Conclusion
                                  Barman (rsync)  pgBackRest  pg_probackup  WAL-G
Remote backup                     +               +           +             +
Cloud backup                      -               +           -             +
Advanced restore                  +               +           +             -
Backup validation                 +               +           +             -
Backup retention                  +               +           +             +
47
