
Replication Solutions for PostgreSQL

Published in: Technology

  1. Replication Solutions for PostgreSQL, Peter Eisentraut <petere@postgresql.org>
  2. What's in a Term? • Replication? • Clustering? • High availability? • Failover? • Standby? • Putting data on more than one computer
  3. Space of Possibilities • Goals: What do you want to achieve? • Techniques: How can this be implemented? • Solutions: What software is available to do this?
  4. Goals • High availability • Performance (read, write) • Wide-area networks • Offline peers
  5. Goal: High Availability • No one wants “low availability”! • Provisions for system failures: software faults, hardware faults, external interference
  6. Goal: Read Performance • Applications with many readers (e.g., library information system) or resource-intensive readers (e.g., data warehousing) • Distribute readers on more hardware. • Most often, one physical machine is enough.
  7. Goal: Write Performance • Applications with many writers • Distribute writers on more hardware? • Constraint checking, conflict resolution?!? • Faster writing contradicts replication. • Partition, don't replicate! • RAID 0/striping is not replication; it makes things “worse”. • RAID 10 is a good idea, but not the topic here.
  8. Goal: Wide-Area Networks • Faster access across WANs • Reading? Local copies • Writing? Synchronization?
  9. Goal: Offline Peers • Synchronize data with laptops, handhelds, ... • “Road warriors” • May be considered very-high-latency WANs
  10. Techniques • Replication: master/slave (asynchronous, synchronous), multi-master (asynchronous, synchronous) • Proxy • Standby system
  11. Technique: Replication, Master/Slave, Asynchronous • High(er) availability(?) • Read performance • Load spreading, load balancing • Offline peers (unidirectional sync.) • [Diagram: M and S, link labeled “async”]
  12. Technique: Replication, Master/Slave, Synchronous • High availability • Better read performance • Worse write performance • [Diagram: M and S, link labeled “sync”]
  13. Technique: Replication, Multi-Master, Asynchronous • Read performance • Faster access across WANs • Manage offline peers • Requires conflict resolution mechanism • [Diagram: M and M, link labeled “async”]
  14. Technique: Replication, Multi-Master, Synchronous • “Holy grail of replication” • High availability • Read performance • Difficult to get good write performance • [Diagram: M and M, link labeled “sync”]
  15. Technique: Proxy • High availability • Read performance • Proxy instance should be redundant • Transparent to the application • [Diagram: Proxy, C, C]
  16. Technique: Standby System • High availability • [Diagram: M and S, link labeled “sync”]
  17. Constraints • Hardware • Operating system • Application
  18. No Built-In Solution? • FIXME
  19. Solutions • Slony-I, -II • PGCluster • DBMirror • pgpool • WAL replication • Sequoia • DRBD • Shared storage
  20. Solution: Slony-I (Slony ← слоны ← elephants) • Asynchronous master/slave replication • Multiple slaves, cascading possible • Particularly useful for: • Read performance (load balancing with pgpool) • Limited form of high availability • Offline slaves via file-based log shipping • http://www.slony.info/
  21. Solution: Slony-II • Synchronous master/master replication? • See Gavin Sherry's session for details
  22. Solution: PGCluster • Synchronous master/master replication • Replicates the query string • Particularly useful for: • Load balancing • High availability • http://pgcluster.projects.postgresql.org/
  23. Solution: DBMirror • Asynchronous master/slave replication • Very simple (compared to Slony-I) • Particularly useful for: • Read performance • Offline peers • contrib/dbmirror/ in PostgreSQL source tree
  24. Solution: pgpool • Connection pool daemon for PostgreSQL • Supports simple proxying • Useful as frontend for Slony-I • http://pgpool.projects.postgresql.org/
  25. Solution: WAL Replication • Use the “archived” WAL logs for “recovery” on a standby system • Disadvantages: • Only full database cluster replication • Master and slave must be binary-compatible • Rather slow across network • Useful for: • High availability
  26. Solution: Sequoia • Formerly C[lustered]-JDBC • Proxy offering clustering, load balancing and failover services • Particularly useful for: • High availability • Read performance • Currently only for Java/JDBC applications • http://sequoia.continuent.org/
  27. Solution: DRBD • File system (block device) replication • Linux kernel module • Standby system • Useful for: • High availability • Secure any service, not just a database system • http://www.drbd.org/
  28. Solution: Shared Storage • NAS, iSCSI, Fibre Channel, ... • Available from many vendors • Standby system • Useful for: • High availability • Secure any service, not just a database system • Single storage system is a possible point of failure
  29. Summary • Plenty of solutions for diverse applications • Make a (project) plan.
  30. Suggestions • Minimum for any production installation: • Sensible disk clustering • RAID 10 • Tablespace management • Separate disk(s) for WAL • DRBD or shared storage • Slony-I for load balancing or warehousing • Java developers consider Sequoia
  31. Outlook • Slony-II • WAL replication management • XA support • More packaging efforts
  32. The End • Replication Solutions for PostgreSQL
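
The asynchronous master/slave technique (slide 11) combined with a proxy (slide 15) amounts to sending writes to the master and spreading reads over the slaves. The following is a minimal Python sketch of that routing decision only, not of how pgpool or Slony-I work internally. The class name `ReadWriteRouter`, the node labels, and the keyword-based classification are all assumptions made for illustration.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: reads go to slaves round-robin, all else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # cycle() yields the slaves in order, forever; None means "no slaves".
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        """Return the node label that should execute the statement."""
        words = sql.split()
        first_word = words[0].lower() if words else ""
        # Only plain SELECT statements are treated as reads. This is a
        # simplification: SELECT ... FOR UPDATE or a SELECT that calls a
        # function with side effects would still need the master.
        if first_word == "select" and self._slaves is not None:
            return next(self._slaves)
        # Safe default: anything that is not clearly a read goes to the master.
        return self.master


router = ReadWriteRouter("master", ["slave1", "slave2"])
for statement in ["SELECT * FROM books", "UPDATE books SET title = 'x'"]:
    print(statement, "->", router.route(statement))
```

Because the replication is asynchronous, a read routed to a slave may not yet see a write that was just committed on the master; an application that needs to read its own writes would have to send those reads to the master as well.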

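
Slide 7 advises "Partition, don't replicate!" for write performance. One common way to partition is to hash a row key and let the hash pick the node that owns the row, so each write touches exactly one machine. The sketch below assumes that approach; the function name `partition_for` and the node labels are hypothetical, and none of the solutions listed in the slides is claimed to work this way.

```python
import hashlib


def partition_for(key, nodes):
    """Pick the node that owns `key` by hashing the key (illustrative only)."""
    if not nodes:
        raise ValueError("at least one node is required")
    # A cryptographic hash is used here only because it is stable across
    # processes and machines, unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-a", "node-b", "node-c"]
for customer_id in ["1001", "1002", "1003"]:
    print(customer_id, "->", partition_for(customer_id, nodes))
```

With plain modulo hashing, changing the number of nodes moves most keys to a different node, so a real deployment would need a plan for redistributing data.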