Successfully reported this slideshow.

More Related Content

Related Books

Free with a 14 day trial from Scribd

See all

Replication Solutions for PostgreSQL

  1. 1. Replication Solutions for PostgreSQL Peter Eisentraut petere@postgresql.org
  2. 2. What's in a Term? • Replication? • Clustering? • High availability? • Failover? • Standby? • Putting data on more than one computer 2
  3. 3. Space of Possibilities • Goals • What do you want to achieve? • Techniques • How can this be implemented? • Solutions • What software is available to do this? 3
  4. 4. Goals • High availability • Performance • Read • Write • Wide-area networks • Offline peers 4
  5. 5. Goal: High Availability • No one wants “low availability”! • Provisions for system failures • Software faults • Hardware faults • External interference 5
  6. 6. Goal: Read Performance • Applications with: • many readers (e.g., library information system) • resource-intensive readers (e.g., data warehousing) • Distribute readers on more hardware. • Most often, one physical machine is enough. 6
  7. 7. Goal: Write Performance • Applications with: • Many writers • Distribute writers on more hardware? • Constraint checking, conflict resolution?!? • Faster writing contradicts replication. • Partition, don't replicate! • RAID 0/striping is not replication – it makes things “worse”. • RAID 10 is a good idea, but not the topic here. 7
  8. 8. Goal: Wide-Area Networks • Faster access across WANs • Reading? • Local copies • Writing? • Synchronization? 8
  9. 9. Goal: Offline Peers • Synchronize data with laptops, handhelds, ... • “Road warriors” • May be considered very-low-latency WANs 9
  10. 10. Techniques • Replication • Master/Slave • Asynchronous • Synchronous • Multi-Master • Asynchronous • Synchronous • Proxy • Standby system 10
  11. 11. Technique: Replication Master/Slave Asynchronous • High(er) availability(?) • Read performance • Load spreading, load balancing • Offline peers M asy c n S (unidirectional sync.) 11
  12. 12. Technique: Replication Master/Slave Synchronous • High availibility • Better read performance • Worse write performance M sy c n S 12
  13. 13. Technique: Replication Multi-Master Asynchronous • Read performance • Faster access across WANs • Manage offline peers M M • Requires conflict asy c n resolution mechanism 13
  14. 14. Technique: Replication Multi-Master Synchronous • “Holy grail of replication” • High availability • Read performance • Difficult to get good M M write performance sy c n 14
  15. 15. Technique: Proxy • High availability • Read performance • Proxy instance should Proxy be redundant • Transparent to the application C C 15
  16. 16. Technique: Standby System • High availability M sy c n S 16
  17. 17. Constraints • Hardware • Operating system • Application 17
  18. 18. No Built-In Solution? • FIXME 18
  19. 19. Solutions • Slony-I, -II • PGCluster • DBMirror • pgpool • WAL replication • Sequoia • DRBD • Shared storage 19
  20. 20. Solution: Slony-I (Slony ← слоны ← elephants) • Asynchronous master/slave replication • Multiple slaves, cascading possible • Particularly useful for: • Read performance (load balancing with pgpool) • Limited form of high availability • Offline slaves via file-based log shipping http://www.slony.info/ 20
  21. 21. Solution: Slony-II • Synchronous master/master replication? • See Gavin Sherry's session for details 21
  22. 22. Solution: PGCluster • Synchronous master/master replication • Replicates the query string • Particularly useful for: • Load balancing • High availability http://pgcluster.projects.postgresql.org/ 22
  23. 23. Solution: DBMirror • Asynchronous master/slave replication • Very simple (compared to Slony-I) • Particularly useful for: • Read performance • Offline peers contrib/dbmirror/ in PostgreSQL source tree 23
  24. 24. Solution: pgpool • Connection pool daemon for PostgreSQL • Supports simple proxying • Useful as frontend for Slony-I http://pgpool.projects.postgresql.org/ 24
  25. 25. Solution: WAL Replication • Use the “archived” WAL logs for “recovery” on a standby system • Disadvantages: • Only full database cluster replication • Master and slave must be binary-compatible • Rather slow across network • Useful for: • High availability 25
  26. 26. Solution: Sequoia • Formerly C[lustered]-JDBC • Proxy offering clustering, load balancing and failover services • Particularly useful for: • High availability • Read performance • Currently only for Java/JDBC applications http://sequoia.continuent.org/ 26
  27. 27. Solution: DRBD • File system (block device) replication • Linux kernel module • Standby system • Useful for: • High availability • Secure any service, not just a database system http://www.drbd.org/ 27
  28. 28. Solution: Shared Storage • NAS, iSCSI, Fiberchannel, ... • Available from many vendors • Standby system • Useful for: • High availability • Secure any service, not just a database system • Single storage system is a possible point of failure 28
  29. 29. Summary • Plenty of solutions for diverse applications • Make a (project) plan. 29
  30. 30. Suggestions • Minimum for any production installation: • Sensible disk clustering • RAID 10 • Tablespace management • Separate disk(s) for WAL • DRBD or shared storage • Slony-I for load balancing or warehousing • Java developers consider Sequoia 30
  31. 31. Outlook • Slony-II • WAL replication management • XA support • More packaging efforts 31
  32. 32. The End Replication Solutions for PostgreSQL 32

×