Galera cluster for MySQL - Introduction Slides
 


This set of slides gives you an overview of Galera, configuration basics and deployment best practices.

The following topics are covered:
- Concepts
- Node provisioning
- Network partitioning
- Configuration example
- Benchmarks
- Deployment best practices
- Galera monitoring and management

    Presentation Transcript

    • Multi-Master Synchronous Replication: Galera Cluster for MySQL
      August 2012. Alex Yu, VP Products, alex@severalnines.com. Confidential
    • Agenda
      - About Severalnines
      - What is Galera Replication?
      - Galera Concepts
      - Node Provisioning
      - Network partitioning/Split brain
      - Configuration Example
      - Benchmarks & Performance Metrics
      - Best Practices
      - Monitoring and Management
      Copyright Severalnines AB
    • About Us
      - Stockholm, Tokyo and Singapore
      - Database Automation and DBaaS software vendor
      - Over 7,000 deployments to date
      - Commercial product launched Q1 2011
      - Winner Best Startup EuroCloud Europe 2011
      - Launched Europe's first Data Cloud in Nov 2011
      - Press coverage 2011: CIO Magazine, eWeek, PC-World, IDG News, Le Figaro, LeMondeInformatique, heise.de, Computerwelt, silicon.de, etc.
    • What is Galera Replication?
      - Synchronous (virtually) multi-master replication
      - Read and write on any node
      - No master failover! No slave lag!
      - Guaranteed write consistency
      - Cluster-wide conflict resolution (certification)
      - Highly available and scalable
      - No SPOF
      - Read and write scalability (parallel applier threads)
      - Geographical replication (mix MySQL async and Galera sync)
      - Cluster (group communication protocol)
      - Automatic node provisioning, QoS
      [Diagram: Application, MySQL Server, WSREP API, WSREP provider (wsrep plugin), Replication]
    • Copyright Severalnines ABGalera Cluster for MySQL Codership patches for MySQL  Binaries and source available at launchpad InnoDB (& MyISAM experimental) Client Client Client  No need to change DB schema/queries  Local queries LB Parallel Replication!  Multiple Applier Threads (1-512) R/W R/W R/W MySQL MySQL MySQL  Row events, row level locks [WSREP] [WSREP] [WSREP] Asynchronous Replication Galera Replication (Synchronous)  In/Out of the cluster Confidential 5
    • Galera Cluster for MySQL cont.
      - Higher probability of "deadlocks"
      - Cluster-wide optimistic locking
      - Locking conflicts are detected at commit
      - First to commit succeeds
      - Minimum 3 nodes required
      - The "donor" node blocks writes during a full sync of a joining/recovering node
      - The 3rd node is then still available for service
      - Gotcha: 2 recovering nodes will block the last node
      - Replication performance depends on:
      - Network latency
      - Performance of the "slowest" or the farthest node (RTT)
      - Number of deployed nodes
      [Diagram: three clients behind a load balancer (LB) with read/write access to three MySQL [WSREP] nodes joined by Galera Replication (synchronous)]
    • Synchronous Replication
      [Sequence diagram: Node 1 runs transaction t1 (BEGIN, statements, COMMIT request). At commit the write set (WS) is sent as a replication event and certified; the result is OK or conflict, and COMMIT returns (ACK) as a commit or a rollback. The time between the COMMIT request and its return is the commit response time. Node 2 and Node 3 each receive the WS, certify it and apply the event, so the transaction is applied virtually synchronously. All nodes 100% in sync.]
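The "first to commit succeeds" rule from the slides can be illustrated with a toy certification check. This is a minimal sketch under stated assumptions, not Galera's implementation: the `WriteSet` and `Certifier` names, the row-key strings, and the idea of tracking one "last writer" seqno per key are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class WriteSet:
    """Hypothetical write set: the row keys a transaction modified and the
    last cluster seqno it had seen when it started."""
    keys: frozenset
    snapshot_seqno: int


@dataclass
class Certifier:
    """Toy certification index: row key -> seqno of the last committed writer."""
    seqno: int = 0
    last_writer: dict = field(default_factory=dict)

    def certify(self, ws: WriteSet) -> bool:
        # Conflict: a key was committed by a write set this transaction never saw.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.snapshot_seqno:
                return False  # the originating node would roll back
        # No conflict: assign the next seqno and record this writer.
        self.seqno += 1
        for key in ws.keys:
            self.last_writer[key] = self.seqno
        return True


cert = Certifier()
t1 = WriteSet(frozenset({"accounts:1"}), snapshot_seqno=0)
t2 = WriteSet(frozenset({"accounts:1"}), snapshot_seqno=0)  # same row as t1
t3 = WriteSet(frozenset({"accounts:2"}), snapshot_seqno=0)  # different row
results = [cert.certify(ws) for ws in (t1, t2, t3)]
```

Because every node processes write sets in the same order, each node reaches the same verdict independently: here `t1` wins, `t2` conflicts on the same row, and `t3` passes because it touches a different row.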
    • Copyright Severalnines ABGalera Concepts Application State  A set of data that application decides to replicate  Default is the whole MySQL databases. Every node is a complete replica  Application state is identified by a Global Transaction ID Global Transaction ID (GTID)  f7720ae0-6f9b-11e1-0800-598d1b386dce:32520198989  CLUSTER/HISTORY/STATE UUID:TRX/STATE/SEQNO  All replicated transactions can be uniquely referenced in any node Initial state: f7720ae0-6f9b-11e1-0800-598d1b386dce:0 Undefined state: 00000000-0000-0000-0000-000000000000:-1 Confidential 8
    • Galera Concepts cont.
      Primary Component (PC)
      - The whole cluster is a PC during normal operation
      - Node and network failures split the cluster into several components
      - Only the PC can continue to modify state
      - A quorum algorithm is invoked to select a PC during cluster partitioning
      - Majority rules
      - The minority tries to reconnect with the PC
      [Diagram: three MySQL [WSREP] nodes forming the Primary Component]
    • Copyright Severalnines ABGalera Concepts cont. State Snapshot Transfer - SST  A transfer of a consistent snapshot of a node state corresponding to a certain GTID  Initialize the state of a newly joining cluster node from an already initialized node (donor) Incremental State Transfer - IST  Catch up with the cluster by replaying missing transactions  Known initial node state  Enough transactions cached at the donor Confidential 10
    • Copyright Severalnines ABGalera Concepts cont. Node Failures  A peer crash is indistinguishable from network failure  A node is considered failed when it no longer can be communicated with Node health verified by receiving messages or keepalives  evs.inactive_timeout  sets the timeout after which node is considered inactive (dead)  evs.suspect_timeout  sets the timeout after which the node can be pronounced dead if everyone else agrees Confidential 11
    • Copyright Severalnines ABGalera Concepts cont. LAN vs WAN replication  No notion of local or remote node  Works as long as TCP works May need tuning to be more tolerant to network latency/issues Network params sample  evs.keepalive_period = PT3S  evs.inactive_check_period = PT10S  evs.suspect_timeout = PT30S  evs.inactive_timeout = PT1M  evs.consensus_timeout = PT1M Confidential 12
    • Copyright Severalnines ABNode Provisioning Automatic node (re)synchronization A ‘donor’ is chosen to provision a ‘joiner’ node  ‘Donor’ node is blocked (write operations) until SST completes State Snapshot Transfer - SST  Scriptable interface  mysqldump (slow)  rsync (fast)  Percona Xtrabackup (faster and non-blocking) Confidential 13
    • Node Provisioning cont.
      [Diagram: three clients connect through a load balancer to Node 1 and Node 2, both MySQL [WSREP]]
    • Node Provisioning cont.
      [Diagram: a "joiner", Node 3 (MySQL [WSREP]), appears next to Node 1 and Node 2]
    • Node Provisioning cont.
      [Diagram: the "joiner" Node 3 is started with wsrep_cluster_address=Node 2, sends an SST request and waits in rsync receive]
    • Node Provisioning cont.
      [Diagram: Node 2 is in "donor mode" and runs rsync send to Node 3's rsync receive. Write operations on Node 2 are blocked]
    • Node Provisioning cont.
      [Diagram: Node 3 catches up and joins Node 1 and Node 2 behind the load balancer]
    • Copyright Severalnines ABNetwork Partitioning/Split Brain Quorum based system  “Majority >50%” partition continues operation  “Minority” partition blocks operations  Until reconnected with Primary Component Use odd number of nodes  Minimum 3 (5, 7, 9 etc) Galera Arbitrator (garbd)  Useful if you have even number of nodes  Nodes across DCs  Replication relay Confidential 19
    • Network Partitioning/Split Brain cont.
      [Diagram: clients and a load balancer in front of three MySQL [WSREP] nodes spread across DC1 and DC2, together forming the Primary Component]
    • Network Partitioning/Split Brain cont.
      [Diagram: DC1 and DC2 are partitioned. Which side is the Primary Component? The side without quorum blocks operations until reconnected with the PC]
    • Network Partitioning/Split Brain cont.
      [Diagram: MySQL [WSREP] nodes in DC1 and DC2, with a Galera Arbitrator in DC3]
    • Network Partitioning/Split Brain cont.
      [Diagram: MySQL [WSREP] nodes in DC1 and DC2, with a Galera Arbitrator in DC3 acting as a replication relay]
    • Network Partitioning/Split Brain cont.
      [Diagram: DC1 and DC2 are partitioned while a Galera Arbitrator runs in DC3. Which side is the Primary Component?]
    • Galera Configuration Example
      [mysqld]
      wsrep_provider=/usr/lib64/galera/libgalera_smm.so
      # NOTE: This must be changed to peer address ASAP!
      wsrep_cluster_address=gcomm://
      # wsrep_node_address is optional; no value is recoverable from the slide
      wsrep_node_name=node1
      wsrep_provider_options="gcache.size=1G;socket.ssl_key=my_key;socket.ssl_cert=my_cert"
      wsrep_slave_threads=16
      wsrep_sst_method=xtrabackup
      wsrep_sst_auth=root:
      innodb_buffer_pool_size=1G
      innodb_log_file_size=256M
      innodb_autoinc_lock_mode=2
      innodb_flush_log_at_trx_commit=0
      innodb_doublewrite=0
      innodb_file_per_table=1
      binlog_format=ROW
      datadir=/var/lib/mysql
      log-bin = mysql-bin
      server-id = 2
      relay-log = mysql-relay-bin
      #read-only = 1
      log-slave-updates = 1
    • Copyright Severalnines ABwsrep variables wsrep_provider  Path to wsrep provider library wsrep_cluster_address  URI form:gcomm://another_node_address?opt1=val1&opt2=val2  gcomm:// special meaning. Initialize the cluster (never leave it in my.cnf) wsrep_node_address  An optional address of the node. A short-cut way to configure listen addresses for replication and state transfers  By default it will be initialized to the first network interface returned by ifconfig. This could be unreliable.  For best results initialize it explicitly Confidential 26
    • Copyright Severalnines ABwsrep variables cont. wsrep_node_name  An optional name for the node. It will be used in logging and to identify the desired donor for state transfer  Default it will be initialized to hostname wsrep_provider_options  Semicolon-separated list of options specific to provider  Ex:  gcache.size – a size of the permanent transaction on-disk cache  socket.ssl_key, socket.ssl_cert – SSL key and certificate files Confidential 27
    • Copyright Severalnines ABwsrep variables cont. wsrep_slave_threads  Parallel applying threads (1-512)  >1 requires certain InnoDB settings. Applying of STATEMENT-based events is always serialized wsrep_sst_method  Base package contains scripts for mysqldump, rsync and xtrabackup based state snapshot transfers. Own scripts can be used  Default is mysqldump Confidential 28
    • Performance Metrics
      - wsrep_flow_control_paused: fraction of the time replication was paused
      - wsrep_flow_control_sent: how many times this node paused replication
      - wsrep_local_recv_queue_avg: average length of the slave trx queue, a sign of a slave-side bottleneck
      - wsrep_cert_deps_distance: how many transactions can be applied in parallel
      - wsrep_local_send_queue_avg: a sign of a network bottleneck
    • Number of conflicts/"deadlocks"
      - wsrep_last_committed: last committed transaction
      - wsrep_local_cert_failures, wsrep_local_bf_aborts: rollbacks, conflicts detected
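The status counters above lend themselves to a simple automated check. The sketch below works on a plain dictionary, such as one built from the rows of SHOW GLOBAL STATUS LIKE 'wsrep_%'. The `diagnose` function and both threshold values are assumptions chosen for illustration, not recommendations from the slides.

```python
def diagnose(status: dict, paused_limit: float = 0.1,
             queue_limit: float = 1.0) -> list:
    """Return human-readable findings for a dict of wsrep status values."""
    findings = []
    # Status values arrive as strings, so convert before comparing.
    if float(status.get("wsrep_flow_control_paused", 0)) > paused_limit:
        findings.append("replication is often paused by flow control")
    if float(status.get("wsrep_local_recv_queue_avg", 0)) > queue_limit:
        findings.append("receive queue is building up: slave-side bottleneck")
    if float(status.get("wsrep_local_send_queue_avg", 0)) > queue_limit:
        findings.append("send queue is building up: network bottleneck")
    conflicts = (int(status.get("wsrep_local_cert_failures", 0))
                 + int(status.get("wsrep_local_bf_aborts", 0)))
    if conflicts:
        findings.append(f"{conflicts} conflicts/rollbacks detected")
    return findings


healthy = diagnose({"wsrep_flow_control_paused": "0.0",
                    "wsrep_local_recv_queue_avg": "0.02"})
slow_applier = diagnose({"wsrep_flow_control_paused": "0.35",
                         "wsrep_local_recv_queue_avg": "12.5",
                         "wsrep_local_bf_aborts": "3"})
```

The conflict counters are cumulative, so a monitoring job would more usefully compare them between two samples than against zero.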
    • Benchmarks: sysbench, tps
      [Chart not included in transcript] Source: http://codership.com/content/whats-difference-kenneth
    • Benchmarks: sysbench, latency
      [Chart not included in transcript] Source: http://codership.com/content/whats-difference-kenneth
    • Benchmarks: Comparing NDB vs Galera (two slides)
      [Charts not included in transcript] Note: no optimizations were done for the NDB storage engine (neither DB schema nor queries). Source: http://codership.com/content/whats-difference-kenneth
    • Best Practices
      - Dedicated switch/network for Galera nodes (1 Gbit min)
      - Connection pools/load balancing within the applications
      - Gives best performance
      - Use static/elastic IPs for the Galera nodes
      - Con: need to handle node membership changes
      - Con: JDBC/PHP etc. are not aware of Galera-specific node states
      - Load balancers
      - Hardware, e.g., F5
      - Software load balancers
      - HAProxy with Galera-specific health check scripts
      - IP dispatching in the kernel, for example Linux LVS
      - GLB (Galera Load Balancer)
      - Con: need to set up LB redundancy
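A Galera-specific health check typically looks at node state rather than just whether MySQL accepts connections. The function below is a hedged sketch of that decision, working on a dictionary of wsrep status values. The name `is_routable` and the `allow_donor` switch are invented for this example; the state codes used are 4 for Synced and 2 for Donor/Desynced.

```python
SYNCED = 4  # wsrep_local_state value for a fully synced node
DONOR = 2   # wsrep_local_state value for a Donor/Desynced node


def is_routable(status: dict, allow_donor: bool = False) -> bool:
    """Decide whether a load balancer should send traffic to this node."""
    # Nodes outside the Primary Component must not serve traffic.
    if status.get("wsrep_cluster_status") != "Primary":
        return False
    if status.get("wsrep_ready") != "ON":
        return False
    state = int(status.get("wsrep_local_state", 0))
    return state == SYNCED or (allow_donor and state == DONOR)


synced = {"wsrep_cluster_status": "Primary", "wsrep_ready": "ON",
          "wsrep_local_state": "4"}
donor = dict(synced, wsrep_local_state="2")
partitioned = dict(synced, wsrep_cluster_status="non-Primary")
```

A check script wrapped around this logic would report the node as up or down to the load balancer. Whether a donor should keep receiving traffic depends on the SST method, since some methods block writes on the donor.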
    • Copyright Severalnines ABBest Practices cont. Reference Node Client Client Client  Act as a ‘donor’ node  Backup node  No client connections LB R/W R/W R/W MySQL [WSREP] ... MySQL [WSREP] MySQL [WSREP] Donor & Backup Node Confidential 36
    • Copyright Severalnines ABBest Practices cont. Minimize probability of deadlocks  Writes go only to 1 Node  Applications use connection pool or Client Client Client load balancer on read only nodes  Have 1 “reference” Node for write failover LB and donor R R W MySQL [WSREP] ... MySQL [WSREP] MySQL [WSREP] “Master” Node Confidential 37
    • Galera Limitations
      - MyISAM replication is experimental
      - DDL statements are replicated at statement level
      - Writes to other table types, including system (mysql.*) tables, are not replicated
      - CREATE USER... is replicated, but a direct INSERT INTO mysql.user... is not
      - Non-deterministic functions like NOW() are not supported
      - The query log cannot be directed to a table
      - LOCK/UNLOCK TABLES cannot be supported in multi-master setups
      - The same applies to lock functions (GET_LOCK(), RELEASE_LOCK()...)
      - Maximum allowed transaction size is defined by wsrep_max_ws_rows and wsrep_max_ws_size
      - XA transactions cannot be supported due to possible rollback on commit
    • Monitoring and Management
    • ClusterControl
      - Host Monitoring (CPU, RAM, Disk, Network)
      - Configuration Management
      - DB Metrics Monitoring
      - Performance Management
      - DB Resources Monitoring
      - Database Upgrades/Downgrades
      - Cluster-wide Query Analyzer
      - Online Scaling of MySQL Servers
      - Schema Management
      - Configurable Resource Thresholds
      - Replication Fail-over
      - Alarms and Email Notifications
      - Clusterware: Process Management and Automated Recovery
      - Backup Scheduling
      - Manual start/stop of Nodes
      - Real-time Performance Probes
    • Configurators
      [Screenshot not included in transcript]
    • Galera Configurator (three slides)
      [Screenshots not included in transcript]
    • Deploy Galera Cluster with HAProxy
      cd ~/s9s-galera-2.10/mysql/scripts/install
      ./deploy.sh 2>&1 | tee -a cc.log
      wget http://severalnines.com/downloads/s9s-haproxy.tar.gz
      tar zxvf s9s-haproxy.tar.gz
      cd haproxy
      ./install-haproxy.sh <lb host> <rhel|debian> galera
      done...
    • Resources
      - Severalnines MySQL Galera Configurator: http://www.severalnines.com/resources/configurator
      - Supported platforms (MySQL Galera): http://support.severalnines.com/entries/21589522-verified-and-supported-operating-systems
      - Galera limitations: http://support.severalnines.com/entries/21692388-limitations-in-galera-replication-for-mysql
      - ClusterControl server requirements: http://support.severalnines.com/entries/20614858-server-requirements-on-premise-amis-other-images