2. What's in a Term?
• Replication?
• Clustering?
• High availability?
• Failover?
• Standby?
• Putting data on more than one computer
3. Space of Possibilities
• Goals
• What do you want to achieve?
• Techniques
• How can this be implemented?
• Solutions
• What software is available to do this?
5. Goal: High Availability
• No one wants “low availability”!
• Provisions for system failures
• Software faults
• Hardware faults
• External interference
6. Goal: Read Performance
• Applications with:
• many readers (e.g., library information system)
• resource-intensive readers (e.g., data warehousing)
• Distribute readers across more hardware.
• Most often, though, one physical machine is enough.
7. Goal: Write Performance
• Applications with:
• Many writers
• Distribute writers on more hardware?
• Constraint checking, conflict resolution?!?
• Faster writing contradicts replication: every write must be repeated on every copy.
• Partition, don't replicate!
• RAID 0/striping is not replication – it makes things “worse”.
• RAID 10 is a good idea, but not the topic here.
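The “partition, don't replicate” point can be sketched in a few lines of Python. The three in-memory dictionaries stand in for hypothetical database nodes; the hash routing is what matters: each write lands on exactly one node instead of being repeated on all of them.

```python
# Sketch: route each write to exactly one shard instead of
# replicating every write to every node (toy in-memory "nodes").
from zlib import crc32

shards = [dict(), dict(), dict()]  # stand-ins for three database nodes

def shard_for(key: str) -> dict:
    # Stable hash, so the same key always lands on the same node.
    return shards[crc32(key.encode()) % len(shards)]

def write(key: str, value: str) -> None:
    shard_for(key)[key] = value    # one write hits one node only

def read(key: str) -> str:
    return shard_for(key)[key]

write("alice", "42")
write("bob", "17")
```

Write throughput now scales with the number of nodes, at the price of losing redundancy: each row exists on one node only.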
8. Goal: Wide-Area Networks
• Faster access across WANs
• Reading?
• Local copies
• Writing?
• Synchronization?
9. Goal: Offline Peers
• Synchronize data with laptops, handhelds, ...
• “Road warriors”
• May be considered very-high-latency WANs
11. Technique: Replication
Master/Slave Asynchronous
• High(er) availability(?)
• Read performance
• Load spreading, load balancing
• Offline peers (unidirectional sync.)
[Diagram: M -> S (async)]
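The asynchronous master/slave shape can be modeled as a change log the master appends to and the slave drains later. This is a toy model (not any particular product): the point is that the master's write returns before the slave has seen anything.

```python
# Toy asynchronous master/slave replication: the master commits
# locally and queues the change; the slave applies it later.
from collections import deque

class Master:
    def __init__(self):
        self.data = {}
        self.log = deque()             # changes not yet shipped

    def write(self, key, value):
        self.data[key] = value         # commit locally first ...
        self.log.append((key, value))  # ... replicate some time later

class Slave:
    def __init__(self):
        self.data = {}

def sync(master, slave):
    # Unidirectional: the slave only ever applies the master's log.
    while master.log:
        key, value = master.log.popleft()
        slave.data[key] = value

m, s = Master(), Slave()
m.write("x", 1)
# Until sync() runs, reads on the slave are stale -- that lag is
# the price of not slowing down the master's writes.
sync(m, s)
```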
12. Technique: Replication
Master/Slave Synchronous
• High availability
• Better read performance
• Worse write performance
[Diagram: M -> S (sync)]
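The write-performance cost of the synchronous variant is visible even in a toy sketch: a write is not done until both copies are updated, so every write pays for the (here simulated, in reality network) round trip to the slave.

```python
class SyncPair:
    """Toy synchronous master/slave: a write completes only after
    BOTH copies are updated."""
    def __init__(self):
        self.master, self.slave = {}, {}

    def write(self, key, value):
        self.master[key] = value
        # In a real system this is a network round trip; the client
        # blocks until the slave has confirmed the change.
        self.slave[key] = value

    def read(self, key):
        # Either copy can serve reads: hence the better read
        # performance, and no stale data on the slave.
        return self.slave[key]

p = SyncPair()
p.write("k", 5)
```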
13. Technique: Replication
Multi-Master Asynchronous
• Read performance
• Faster access across WANs
• Manage offline peers
• Requires a conflict resolution mechanism
[Diagram: M <-> M (async)]
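One common (though not the only) conflict resolution policy is timestamp-based “last write wins”. A minimal sketch, assuming each version carries a (value, timestamp) pair:

```python
# Sketch: "last write wins" conflict resolution between two
# masters that both changed the same row while disconnected.

def resolve(local, remote):
    # Each version is (value, timestamp); keep the newer one.
    return local if local[1] >= remote[1] else remote

a = {"row1": ("written on master A", 100)}
b = {"row1": ("written on master B", 105)}

# Merge master B's changes into master A's view of the data.
merged = {key: resolve(a[key], b[key]) for key in a}
```

Real systems must also handle deletes, clock skew, and rows that exist on only one side; this only shows the core decision.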
14. Technique: Replication
Multi-Master Synchronous
• “Holy grail of replication”
• High availability
• Read performance
• Difficult to get good write performance
[Diagram: M <-> M (sync)]
15. Technique: Proxy
• High availability
• Read performance
• Proxy instance should be redundant
• Transparent to the application
[Diagram: Proxy fronting two nodes (C, C)]
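The proxy technique can be sketched as follows. Everything here is hypothetical (the lists stand in for database connections, and the SELECT check is a crude placeholder for real query analysis): writes go to the master, reads are spread round-robin over the replicas, and the client only ever talks to the proxy.

```python
# Toy proxy: transparent to the client, which issues plain SQL;
# the proxy decides which node actually runs each statement.
import itertools

class Proxy:
    def __init__(self, master, replicas):
        self.master = master
        self._next = itertools.cycle(replicas)  # round-robin reads

    def execute(self, sql):
        if sql.lstrip().upper().startswith("SELECT"):
            node = next(self._next)   # load-balance reads
        else:
            node = self.master        # all writes go to the master
        node.append(sql)              # stand-in for running the query

master, r1, r2 = [], [], []
p = Proxy(master, [r1, r2])
p.execute("INSERT INTO t VALUES (1)")
p.execute("SELECT * FROM t")
p.execute("SELECT * FROM t")
```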
23. Solution: DBMirror
• Asynchronous master/slave replication
• Very simple (compared to Slony-I)
• Particularly useful for:
• Read performance
• Offline peers
contrib/dbmirror/ in PostgreSQL source tree
24. Solution: pgpool
• Connection pool daemon for PostgreSQL
• Supports simple proxying
• Useful as frontend for Slony-I
http://pgpool.projects.postgresql.org/
25. Solution: WAL Replication
• Use the “archived” WAL logs for “recovery” on a standby system
• Disadvantages:
• Only full database cluster replication
• Master and slave must be binary-compatible
• Rather slow across network
• Useful for:
• High availability
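On the master side this boils down to shipping each filled WAL segment to an archive the standby can restore from. A minimal configuration sketch; the archive path is made up, and a production `archive_command` should be more careful than a plain `cp`:

```ini
# postgresql.conf on the master (path is an example)
archive_command = 'cp %p /mnt/standby_archive/%f'

# recovery.conf on the standby
restore_command = 'cp /mnt/standby_archive/%f %p'
```

Here `%p` expands to the WAL segment's path and `%f` to its file name; the standby simply replays the archived segments it is handed.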
26. Solution: Sequoia
• Formerly C[lustered]-JDBC
• Proxy offering clustering, load balancing, and failover services
• Particularly useful for:
• High availability
• Read performance
• Currently only for Java/JDBC applications
http://sequoia.continuent.org/
27. Solution: DRBD
• File system (block device) replication
• Linux kernel module
• Standby system
• Useful for:
• High availability
• Secure any service, not just a database system
http://www.drbd.org/
28. Solution: Shared Storage
• NAS, iSCSI, Fibre Channel, ...
• Available from many vendors
• Standby system
• Useful for:
• High availability
• Secure any service, not just a database system
• The single storage system is a possible point of failure
29. Summary
• Plenty of solutions for diverse applications
• Make a (project) plan.
30. Suggestions
• Minimum for any production installation:
• Sensible disk clustering
• RAID 10
• Tablespace management
• Separate disk(s) for WAL
• DRBD or shared storage
• Slony-I for load balancing or warehousing
• Java developers: consider Sequoia
31. Outlook
• Slony-II
• WAL replication management
• XA support
• More packaging efforts