Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison

Galera Cluster for MySQL, Percona XtraDB Cluster and MariaDB Galera Cluster (the three “flavours” of Galera Cluster) make use of the Galera WSREP libraries to handle synchronous replication. MySQL Cluster is the official clustering solution from Oracle, while Galera Cluster for MySQL is slowly but surely establishing itself as the de facto clustering solution in the wider MySQL ecosystem.

In this webinar, we look at these alternatives and present an unbiased view of their strengths and weaknesses, and of the use cases that fit each one.

This webinar will cover the following:

MySQL Cluster architecture: strengths and limitations
Galera Architecture: strengths and limitations
Deployment scenarios
Data migration
Read and write workloads (Optimistic/pessimistic locking)
WAN/Geographical replication
Schema changes
Management and monitoring

Published in: Technology

  1. 1. MySQL Cluster (NDB) vs Galera for MySQL
     December 11, 2014
     Alex Yu, alex@severalnines.com
  2. 2. Webinar Housekeeping
     • This webinar is being recorded
     • A link to the recording and slides will be posted on severalnines.com
     • We welcome questions: enter them into the chat box and we will respond at the end of the presentation
     • Think of something later? Email Severalnines at info@severalnines.com
     Copyright Severalnines AB
  3. 3. Agenda
     • MySQL Cluster (NDB) and Galera architecture overview
     • Read and write workloads
     • Deployment scenarios
     • WAN/geographical replication
     • Data migration and schema changes
     • Management and performance monitoring
  4. 4. MySQL Cluster (NDB Storage Engine)
     • Distributed, shared-nothing, realtime database cluster
     • Multi-master, auto-sharding, in-memory and disk-data storage
     • Near-linear scalability with transparent load balancing
     • SQL and NoSQL interfaces
     • Local and geographical replication
     • Synchronous and asynchronous replication
     • 99.999% availability, no single point of failure
     • Telecom “Carrier Grade” legacy
  5. 5. MySQL Cluster Application Examples
     • Subscriber databases (Telecom HLR/HSS systems)
       • Massive volume of write traffic (location and updates)
       • Response time < 3ms
     • eCommerce
       • Payment processing and fulfilment
       • High batch and realtime loads
     • Service Delivery Platforms
       • High volume of traffic
       • Mixed read/write loads
  6. 6. MySQL Cluster Architecture
     [Diagram: web app clients, SQL-based clients (MySQL client/server protocol), NDB API (C++) applications and a management client (MGM C API), in front of two SQL nodes, four data nodes and two management nodes]
     • SQL Nodes
     • Data Nodes
     • Management Nodes (default arbitrator)
     • Synchronous replication within a node group
  7. 7. Automatic Sharding
     [Diagram: table T with 8 rows; 4 data nodes in 2 node groups]
  8. 8. Automatic Sharding (cont.)
     [Diagram: table T with 8 rows; data nodes 1 to 4 in 2 node groups]
     • Sharding is based on hashing the primary key or a user-defined key
     • Each node stores the primary fragment for one partition and the backup fragment for another
     • Number of node groups = number of data nodes / number of replicas
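The node-group arithmetic and hash-based row placement on this slide can be sketched in a few lines of Python. This is only an illustration: `node_group_count` and `partition_for_key` are hypothetical helper names, and CRC32 stands in for NDB's internal hash function, which the slides do not describe.

```python
import zlib


def node_group_count(data_nodes: int, replicas: int) -> int:
    """Number of node groups = number of data nodes / number of replicas."""
    if data_nodes <= 0 or replicas <= 0 or data_nodes % replicas != 0:
        raise ValueError("data nodes must be a positive multiple of replicas")
    return data_nodes // replicas


def partition_for_key(primary_key: object, partitions: int) -> int:
    """Map a key to a partition. CRC32 is a stand-in, not NDB's real hash."""
    return zlib.crc32(str(primary_key).encode("utf-8")) % partitions


if __name__ == "__main__":
    # The slide's example: 4 data nodes with 2 replicas give 2 node groups.
    print("node groups:", node_group_count(data_nodes=4, replicas=2))
    for pk in range(1, 9):  # table T with 8 rows
        print(f"row {pk} -> partition {partition_for_key(pk, partitions=4)}")
```

The same key always lands in the same partition, which is what lets a primary key lookup be routed to a single data node.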
  13. 13. Automatic Sharding (cont.)
     [Diagram: 4 partitions on 4 data nodes in 2 node groups; each partition has a primary fragment and a secondary fragment]
     • The cluster is fully operational as long as at least 1 node is up in each node group
     • If all nodes in a single node group are gone, the cluster shuts down gracefully
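The availability rule on this slide amounts to a simple check over node groups. The sketch below assumes a hypothetical layout in which data nodes 1 and 2 form one node group and data nodes 3 and 4 form the other; the slides do not state which nodes share a group.

```python
def cluster_is_operational(node_groups, alive_nodes):
    """True if every node group still has at least one live data node."""
    alive = set(alive_nodes)
    return all(any(node in alive for node in group) for group in node_groups)


if __name__ == "__main__":
    groups = [[1, 2], [3, 4]]  # assumed layout: 4 data nodes, 2 node groups
    print(cluster_is_operational(groups, alive_nodes=[1, 3]))  # one node per group
    print(cluster_is_operational(groups, alive_nodes=[1, 2]))  # a whole group lost
```

Under this rule a 4-node cluster can lose half of its data nodes and keep running, provided the two failures are in different node groups.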
  15. 15. Primary Key Requests
     [Diagram: web clients, two SQL nodes, four data nodes and two management nodes]
     • A PK lookup goes directly to the node with the primary fragment
     • Parallel operations, transparent load balancing
  16. 16. Joins, Index and Table Scans
     [Diagram: web clients, two SQL nodes, four data nodes and two management nodes]
     • Table and index scans run in parallel on all nodes
     • Joins execute on the data nodes; merged results are sent back to the SQL node
  17. 17. Migration to MySQL Cluster
     • Limitations
       • 14K row size; 512 attributes (columns + indexes) per table
       • 32 attributes per key; only the first 3072 bytes of a column can be used for an index
       • No fulltext or spatial indexes; temporary tables cannot be created using the NDB storage engine
     • Every table must have a Primary Key
       • A hidden PK is created automatically if none is defined
     • In-memory or disk-based tables
       • Does the dataset exceed the available system memory for the cluster?
     • Network, Local and Global Checkpoint
       • Write intensive, so dimension the disk subsystem accordingly
       • Dedicated >= 1Gb/s networking
     • ALTER TABLE … ENGINE NDB
       • Alternatively: MySQL Replication, Backup & Restore
  18. 18. Deployment Scenarios
     [Diagram: a row of SQL nodes above a row of data nodes]
     • Scale up to 48 data nodes
     • Limit of 255 nodes in total (regardless of type)
  19. 19. Deployment Scenarios (cont.)
     [Diagram: a Master cluster (West) and a Slave/Standby cluster (East), each with two SQL nodes and four data nodes and synchronous replication within a node group; the link between the clusters is labelled MySQL Asynchronous Replication and Single Point of Failure]
     • Multiple replication topologies available
       • Master - Master (bi-directional), with conflict detection and resolution
       • Master - Slave(s)
       • Circular
       • etc.
  20. 20. Deployment Scenarios (cont.)
     [Diagram: a Master cluster (West) and a Slave/Standby cluster (East), linked by a Primary and a Secondary/Standby replication channel; callout: START SLAVE only on Primary!]
     • Master - Slave(s)
     • Standby replication channel
     • Manual failover
  21. 21. Deployment Scenarios (cont.)
     [Diagram: two Master clusters (West and East), linked by a Primary and a Secondary/Standby replication channel]
     • Master - Master (bi-directional)
     • Conflict detection and resolution
       • “Timestamp based”
       • Row by row, not transaction based
  22. 22. Galera Cluster for MySQL
     [Diagram: clients send R/W traffic through a load balancer (LB) to three MySQL [WSREP] nodes linked by Galera Replication (Synchronous)]
     • Synchronous (virtually) multi-master replication
       • Read and write on any node
       • No master failover! No slave lag!
       • Guaranteed write consistency
       • Cluster-wide conflict resolution (certification)
       • Automatic node provisioning
     • Highly available and scalable
       • No SPOF
       • Read and write scalability (parallel applier threads)
       • Geographical replication (mix of MySQL async and Galera sync)
     • Codership, Percona XtraDB Cluster, MariaDB Galera Cluster
  23. 23. Galera Cluster for MySQL (cont.)
     [Diagram: clients send R/W traffic through a load balancer (LB) to three MySQL [WSREP] nodes linked by Galera Replication (Synchronous)]
     • Recommended minimum of 3 nodes
       • Network partition/split-brain
     • Blocking SST (rsync, mysqldump)
     • Higher probability of “deadlocks”
       • Cluster-wide optimistic locking
       • Locking conflicts detected at commit
       • First to commit succeeds
     • Replication performance depends on
       • Network latency
       • Performance of the “slowest” or the farthest node (RTT)
       • Number of deployed nodes
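The "first to commit succeeds" behaviour of cluster-wide optimistic locking can be illustrated with a toy certifier. This is a deliberately simplified model and not Galera's actual certification code: the `Certifier` class, its sequence numbers and its string row keys are all assumptions made for the example.

```python
class Certifier:
    """Toy first-committer-wins check over sets of modified row keys."""

    def __init__(self):
        self.seqno = 0          # sequence number of the last committed writeset
        self.last_write = {}    # row key -> seqno of the last commit touching it

    def certify(self, snapshot_seqno, keys):
        """Commit if no key was written after the transaction's snapshot."""
        for key in keys:
            if self.last_write.get(key, 0) > snapshot_seqno:
                return False    # conflict: the application sees a deadlock error
        self.seqno += 1
        for key in keys:
            self.last_write[key] = self.seqno
        return True


if __name__ == "__main__":
    certifier = Certifier()
    # Two transactions on different nodes start from the same snapshot (seqno 0)
    # and update the same row; only the first one to commit is accepted.
    print(certifier.certify(0, {"row:1"}))
    print(certifier.certify(0, {"row:1"}))
```

Because the conflict is only discovered at commit time, applications writing to several nodes at once should be prepared to retry transactions that fail this way.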
  24. 24. Galera Concepts
     [Diagram: three MySQL [WSREP] nodes forming the Primary Component]
     • Primary Component (PC)
       • The whole cluster is a PC during normal operation
     • Node and network failures
       • Split the cluster into several components
       • Only the PC can continue to modify state
     • A quorum algorithm is invoked to select a PC during cluster partitioning
       • Majority rules
       • The minority tries to reconnect with the PC
  25. 25. Galera Concepts (cont.)
     • State Snapshot Transfer (SST)
       • A transfer of a consistent snapshot of a node's state corresponding to a certain GTID
       • Initializes the state of a newly joining cluster node from an already initialized node (the donor)
     • Incremental State Transfer (IST)
       • Catch up with the cluster by replaying missing transactions
       • Known initial node state
       • Enough transactions cached at the donor
       • gcache.size < database size
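The choice between IST and SST described on this slide can be written as a small decision function. It is a sketch under stated assumptions: `choose_state_transfer` is a hypothetical helper, plain integers stand in for GTID sequence numbers, and the donor's writeset cache is modelled as one contiguous range starting at `donor_cache_first`.

```python
def choose_state_transfer(joiner_seqno, donor_cache_first):
    """Return "IST" if the joiner can catch up from the donor's cache, else "SST".

    joiner_seqno: last transaction applied by the joiner, or None if unknown.
    donor_cache_first: oldest transaction still held in the donor's cache.
    """
    if joiner_seqno is None:
        return "SST"  # no known initial state: a full snapshot is required
    if joiner_seqno + 1 >= donor_cache_first:
        return "IST"  # every missing transaction is still cached at the donor
    return "SST"      # the gap is older than the cache: fall back to a snapshot


if __name__ == "__main__":
    print(choose_state_transfer(None, 100))  # brand-new node
    print(choose_state_transfer(150, 100))   # short outage
    print(choose_state_transfer(50, 100))    # outage outlived the cache
```

This is why the cache size matters for maintenance windows: the longer a node is expected to be away, the more transactions the donor has to retain for the node to rejoin without a blocking SST.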
  26. 26. High Latency Network
     [Diagram: three MySQL [WSREP] nodes in DC1 and three in DC2]
     • Galera 2.x WAN replication (MySQL 5.5)
       • Point-to-point connection for all nodes!
       • Transaction latency depends on the slowest link
  27. 27. High Latency Network (cont.)
     [Diagram: three MySQL [WSREP] nodes in DC1 (gmcast.segment = 1) and three in DC2 (gmcast.segment = 2), connected through a segment gateway]
     • Galera 3.x WAN optimization (MySQL 5.6)
       • “Cluster” segment ID to group nodes by location
       • Replication between segments goes over a single connection
       • Replication writesets are distributed peer to peer within each segment
       • The segment connection/gateway can change per transaction
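The effect of segments on WAN traffic can be estimated with a small counting model that follows the wording of the two High Latency Network slides: without segments the originating node sends a writeset to every remote node, while with segments a single copy goes to each remote segment and is redistributed locally. The function name and the dictionary layout are assumptions made for this illustration.

```python
def wan_copies_per_writeset(segments, origin):
    """Count cross-datacenter copies of one writeset in both models.

    segments: mapping of segment name -> number of nodes in that segment.
    origin: name of the segment where the transaction commits.
    """
    remote = {name: size for name, size in segments.items() if name != origin}
    return {
        "point_to_point": sum(remote.values()),  # one copy per remote node
        "segmented": len(remote),                # one copy per remote segment
    }


if __name__ == "__main__":
    layout = {"DC1": 3, "DC2": 3}  # the slide's example: 3 nodes per datacenter
    print(wan_copies_per_writeset(layout, origin="DC1"))
```

For the six-node example this reduces the copies crossing the WAN link from three to one per writeset, although commit latency is still bounded by the round trip to the remote datacenter.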
  29. 29. Network Partition/Split Brain
     [Diagram: clients behind a load balancer; MySQL [WSREP] nodes in DC1 and DC2; a Galera Arbitrator in DC3 acting as replication relay]
     • Quorum-based system
       • The “majority” (>50%) partition continues operation
       • The “minority” partition blocks operations until it is reconnected with the Primary Component
     • Use an odd number of nodes
       • Minimum 3 (5, 7, 9, etc.)
     • Galera Arbitrator (garbd)
       • Useful if you have an even number of nodes
       • Nodes across DCs
       • Replication relay
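The majority rule on this slide reduces to a one-line comparison. The sketch below treats every member, including an arbitrator, as having one equal vote; it ignores refinements such as weighted votes or nodes that leave gracefully, so read it as a simplification rather than Galera's exact quorum calculation.

```python
def has_quorum(partition_size, cluster_size):
    """A partition stays primary only with strictly more than 50% of the votes."""
    return 2 * partition_size > cluster_size


if __name__ == "__main__":
    # 3 nodes split 2/1: the pair keeps running, the single node blocks.
    print(has_quorum(2, 3), has_quorum(1, 3))
    # 4 nodes split 2/2: neither half has a majority, so both block.
    print(has_quorum(2, 4), has_quorum(2, 4))
    # 4 nodes plus garbd (5 votes): the half that still sees garbd wins.
    print(has_quorum(3, 5), has_quorum(2, 5))
```

The 2/2 case is the reason for the odd-number recommendation: adding an arbitrator in a third location turns an even split into a 3/2 split with a clear winner.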
  30. 30. Migration to Galera Cluster
     • Only the InnoDB storage engine
       • Limited MyISAM support, not recommended
     • Every table should have a Primary Key
       • DELETE operations are unsupported on tables without a primary key
       • Rows in tables without a primary key may appear in a different order on different nodes (for certification, an MD5 pseudo key is computed from the full row)
     • Transaction size
       • A writeset is processed as a single memory-resident buffer
       • Extremely large transactions, e.g. LOAD DATA, can affect performance
       • wsrep_load_data_splitting = ON | OFF # 10K inserts/transaction
       • wsrep_max_ws_rows (128K), wsrep_max_ws_size (1GB)
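The idea behind splitting a large load into 10K-row transactions can be mimicked on the application side. The generator below is only an illustration of the batching pattern; `split_rows` is a hypothetical helper and does not reproduce how the server implements wsrep_load_data_splitting.

```python
def split_rows(rows, chunk_size=10_000):
    """Yield consecutive slices of at most chunk_size rows each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


if __name__ == "__main__":
    rows = list(range(25_000))  # stand-in for rows read from a data file
    for number, chunk in enumerate(split_rows(rows), start=1):
        # Each chunk would be inserted and committed as its own transaction.
        print(f"transaction {number}: {len(chunk)} rows")
```

Smaller transactions keep each writeset well below the row and size limits and avoid holding one very large buffer in memory on every node.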
  31. 31. Migration to Galera Cluster (cont.)
     • Auto increments
       • Managed automatically
         • Node-1: 1, 4, 7
         • Node-2: 2, 5, 8
         • Node-3: 3, 6, 9
       • The auto-increment sequence has gaps if inserts hit different nodes randomly
     • Triggers fire only on the Galera node that executes the transaction
     • Events fire on all nodes
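The per-node sequences on this slide follow from giving each node its own starting offset and a common step equal to the cluster size, which is what the MySQL variables auto_increment_offset and auto_increment_increment express. A minimal sketch, with `node_ids` as a hypothetical helper name:

```python
from itertools import count, islice


def node_ids(offset, increment, how_many):
    """First auto-increment values generated by one node."""
    return list(islice(count(offset, increment), how_many))


if __name__ == "__main__":
    cluster_size = 3
    for node in range(1, cluster_size + 1):
        # offset = node number, increment = cluster size
        print(f"Node-{node}:", node_ids(node, cluster_size, 3))
```

Since no two nodes can generate the same value, inserts never collide on the key, but consecutive inserts routed to different nodes leave gaps in the sequence.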
  32. 32. Schema Changes
     • DDLs are replicated in statement format
     • Two main methods
       • TOI: Total Order Isolation
       • RSU: Rolling Schema Upgrade
       • wsrep_osu_method = TOI | RSU
     • wsrep_desync=ON + wsrep_on=OFF
       • Disconnect from the cluster and stop writeset replication (standalone MySQL server)
     • Dropping a node
       • Set global wsrep_cluster_address=gcomm://
       • Joining must be through IST
     • Percona Toolkit
       • pt-online-schema-change
  33. 33. Schema Changes (cont.)
     • TOI: Total Order Isolation
       • Default DDL replication method
       • Strict consistency: all nodes get the same change
       • No schema backwards compatibility
       • Strict commit order forces every transaction to wait until the DDL is completed
       • Cluster performance degradation
  34. 34. Schema Changes (cont.)
     • RSU: Rolling Schema Upgrade
       • Desynchronizes the node from replication until the DDL completes
       • Incoming replication is buffered; nothing is replicated out of the node
       • After the DDL completes, the node automatically rejoins the cluster and catches up on missed transactions from the writeset cache (gcache.size)
       • Potentially no cluster performance degradation
       • Schema changes need to be backwards compatible
         • Applications should be able to use both old and new schemas
       • Only one RSU operation at a time
       • Rolling the operation across the cluster is manual
  35. 35. Deployment Scenarios
     [Diagram: users connect over http to two HAProxy load balancers (VIP), which send R/W traffic to three MySQL [WSREP] nodes linked by Galera Replication (Synchronous); an admin reaches ClusterControl over http]
     http://support.severalnines.com/entries/23612682-Install-HAProxy-and-Keepalived-Virtual-IP- subnet
  36. 36. Deployment Scenarios (cont.)
     Galera as MySQL Slave
     [Diagram: a MySQL [Master] feeds one of three MySQL [WSREP] nodes, acting as slave, over MySQL Replication]
     • Replication events can be bundled to commit as a single group
     • Fewer waits for replication synchronization
     • wsrep_mysql_replication_bundle=n
       • Groups n MySQL replication transactions into one large transaction
  37. 37. Deployment Scenarios (cont.)
     Galera as MySQL replication Master
     [Diagram: three MySQL [WSREP] nodes, two of them labelled Master, feeding MySQL [Slave] servers over MySQL Replication; labels DC1 and DC2]
     • Backups & Reports
     • Disaster Recovery
  38. 38. Deployment Scenarios (cont.)
     Disaster Recovery
     [Diagram: a Master Galera cluster of three MySQL [WSREP] nodes in DC1 and a Standby Galera cluster of three MySQL [WSREP] nodes in DC2, linked by MySQL Replication]
     • “Manual” replication failover
     • Slave lag
  39. 39. Deployment Scenarios (cont.)
     Multi-Source Sink
     [Diagram: three MySQL [WSREP] nodes (one Master, two Slaves), two MySQL [Master] servers and a MySQL [Slave], connected by MySQL Replication]
     http://www.severalnines.com/blog/multi-source-replication-galera-cluster-mysql
  40. 40. Severalnines - ClusterControl
     • Monitor and manage heterogeneous database clusters
       • MySQL Cluster, Galera Cluster for MySQL, MongoDB
     • Automatic
       • Node and cluster recovery
     • Scheduled backups
     • Add/remove nodes
     • Create single DB node and cluster
     • Alerts/email
     • Host and DB metrics
  44. 44. Thank You!
     • Severalnines recorded webinars
       • http://www.severalnines.com/resources/webinars
     • Severalnines Blog
       • www.severalnines.com/blog
     • Galera Cluster for MySQL Intro
       • http://www.severalnines.com/clustercontrol-mysql-galera-tutorial
     • MySQL Cluster Training
       • http://www.severalnines.com/mysql-cluster-training
     • More questions? Contact us at:
       • info@severalnines.com
