We will show the advantages of having a geo-distributed database cluster and how to create one using Galera Cluster for MySQL. We will also discuss the configuration and status variables involved and how to deal with typical WAN conditions such as slow, untrusted, or unreliable links, high latency, and packet loss. We will demonstrate a multi-region cluster on Amazon EC2 and perform throughput and latency measurements in real time.
2. Agenda
• A very quick overview of Galera Cluster
• What is a geo-distributed database and why use it?
• Galera’s approach
• Configuration considerations
• AWS demo
3. Galera Cluster Overview
Synchronous
– each transaction is immediately replicated on all nodes at commit
– no stale slaves
Multi-Master
– read from and write to any node
– automatic transaction conflict detection
Replication
– a copy of the entire dataset is available on all nodes
– new nodes can join automatically
For MySQL
– based on a modified version of MySQL (5.5 and 5.6, with 5.7 coming up)
– InnoDB storage engine
4. And more …
• Recovers from node failures within seconds
• Data consistency protections
– avoids reading stale data
– prevents unsafe data modifications
• Cloud and WAN support
5. What is a Geo-distributed Database Cluster?
• There are database nodes in different physical locations
– multiple data centers, regions, continents …
• Nodes work together as a single entity
– rather than being in a subordinate relationship
6. Why Have a Geo-Distributed Database?
• Distribute global data globally
• Bring data closer to the users
• Go beyond availability zones and achieve multi-datacenter redundancy
– multiple availability zones can fail at the same time
• Use multiple cloud providers
8. Galera’s Approach
• Single logical MySQL database
– behaves as a single entity with multiple connection points
• Each node has a complete replica of the database
– can respond to any read request without delay
– removes latency for many operations
– may reduce the number of caching layers required
• Each node is a master
– no primary/secondary relationship
– no need to promote a secondary to master on master failure
9. Galera Features for WAN
• Optimized network protocol
– packets exchanged over WAN only at transaction commit time
• Topology-aware replication
– each transaction is sent to each datacenter only once
– if needed, node synchronizes with nearest neighbors
• Traffic encryption
• Detection and automatic eviction of unreliable nodes
– node will be evicted if it repeatedly suffers network issues
– it will not be allowed to rejoin without manual intervention
10. What Data Can Take Advantage of Synchronous WAN Replication?
• Global in nature
– configuration data, authentication databases, SSO, etc.
– e.g. OpenStack's Keystone and Glance databases
• High read-to-write ratio
(in a distributed system, consistent writes require communication)
• Very high consistency requirements
– financial data, payments, bank accounts
• High write availability requirements
– writes must be possible at all times (without violating consistency)
11. Designing Your Cluster Topology
• Use an odd number of data centers
• If two data centers, run a Galera arbitrator
• Consider multiple nodes per datacenter
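With only two data centers, the Galera arbitrator (garbd) provides the tie-breaking vote so that a datacenter outage cannot split the cluster into two equal halves. A minimal sketch of its configuration on Debian/Ubuntu; the node addresses and cluster name below are placeholders:

```ini
# /etc/default/garb -- Galera arbitrator daemon settings
# Addresses and cluster name are illustrative placeholders.
GALERA_NODES="10.0.1.10:4567 10.0.2.10:4567"
GALERA_GROUP="my_cluster"
# Optional provider options, e.g. giving the arbitrator its own WAN segment:
GALERA_OPTIONS="gmcast.segment=3"
```

The arbitrator participates in quorum voting but stores no data, so it can run on a small instance in a third location.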
12. Latency Considerations
• Delay at commit time is generally equal to max RTT
– the highest latency dominates the overall response time
• A client can commit a maximum of 1/RTT transactions per second
– consolidate updates into larger transactions
– larger connection pool may be required
• In multi-master setups, you can successfully update a given row at most 1/RTT times per second
– beyond that, conflicts can occur and an error will be returned to the client
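As a back-of-the-envelope check, the commit-rate ceiling follows directly from the RTT. A small sketch (the 319 ms figure matches the longest WAN latency used later in the demo):

```python
def max_commits_per_second(rtt_ms: float, connections: int = 1) -> float:
    """Upper bound on commits/second when each commit waits one max-RTT.

    A single connection can start a new commit only after the previous
    one has been acknowledged, so it is limited to 1/RTT commits/s.
    Independent connections commit in parallel, so the ceiling scales
    with the pool size -- which is why a larger pool may be required.
    """
    rtt_s = rtt_ms / 1000.0
    return connections / rtt_s

# One connection over a 319 ms WAN link: ~3 commits/second.
print(round(max_commits_per_second(319), 2))        # ~3.13
# A pool of 32 connections raises the ceiling to ~100/second.
print(round(max_commits_per_second(319, 32), 2))    # ~100.31
```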
13. Bandwidth/Throughput Considerations
• All links between nodes are important for overall performance
• Galera slows down commits to what the network is able to handle
• Full snapshot transfers (SST) across the WAN are bandwidth-intensive
– have more than one node at each location
14. Configuration
• Configure gmcast.segment = ID
– each location should have a separate ID
• Review default values for:
– evs.inactive_timeout (15 seconds); evs.suspect_timeout (5 seconds)
• Size gcache appropriately
– to avoid snapshot transfers (SST) over WAN
• Set up optional auto-eviction
• Set up optional encryption
– SST encryption is configured separately
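The points above can be collected into a single wsrep_provider_options line. A sketch, with illustrative values rather than recommendations (the EVS timeouts shown are the defaults; certificate paths are placeholders):

```ini
# my.cnf fragment -- values are illustrative, tune for your WAN
[mysqld]
# gmcast.segment: one ID per datacenter, so each transaction crosses
#   the WAN once per segment.
# evs.suspect_timeout / evs.inactive_timeout: defaults shown; raise on
#   flaky links to avoid spurious node drops.
# gcache.size: large enough to serve IST instead of a full SST over WAN.
# evs.auto_evict: evict a node after this many delayed-list entries.
# socket.ssl_*: encrypt replication traffic (SST encryption is separate).
wsrep_provider_options="gmcast.segment=1;evs.suspect_timeout=PT5S;evs.inactive_timeout=PT15S;gcache.size=2G;evs.auto_evict=5;socket.ssl_cert=/etc/mysql/cert.pem;socket.ssl_key=/etc/mysql/key.pem"
```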
15. Network Configuration
• Use static/reserved public IPs
• Open firewall ports: 3306 (MySQL client), 4567 (replication), 4568 (IST), 4444 (SST)
– but not to the entire world
• Settings that use the public IPs:
– wsrep_cluster_address
– wsrep_node_address
• Settings that use the private IPs:
– ist.recv_bind
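Putting the address settings together for one node, a sketch in which all IPs are placeholders (public IPs from the documentation ranges, a private IP for the local network):

```ini
# my.cnf fragment for one node -- all IP addresses are placeholders
[mysqld]
# Public IPs: how nodes in other datacenters reach the cluster and this node.
wsrep_cluster_address="gcomm://203.0.113.10,198.51.100.20,192.0.2.30"
wsrep_node_address="203.0.113.10"
# Private IP: bind incremental state transfer (IST) to the local interface.
wsrep_provider_options="ist.recv_bind=10.0.1.10"
```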
17. Demo
• EC2 nodes in US East, Brazil and Australia
– m4.large instances (2 virtual CPUs, 8GB RAM, $0.12/hour)
– round-trip latencies (from the demo diagram):
US East (Virginia) ↔ Brazil (Sao Paulo): 119 ms
US East (Virginia) ↔ Australia (Sydney): 229 ms
Brazil (Sao Paulo) ↔ Australia (Sydney): 319 ms