XtraDB Cluster Chiba.pm #2 at 2013/03/23 yoku0825
＼Hello!／
• I’m yoku0825, working as a DBA for my company’s web services.
• Husband of my wife :)
• Father of my son :)
• I love MySQL too much, so much that I can’t even log in to PostgreSQL or Oracle :(
• Maybe a Perl Monger, though I seldom write code!!
Percona XtraDB Cluster
Clustered by
 – Multi-Master
 – (Virtual) Synchronous
 – Row-Level Parallel
Write Set Replication (wsrep)
This makes the MySQL replication topology flat: no node is designated as Master or Slave.
wsrep also provides automated full physical and incremental logical data synchronization.
wsrep
• wsrep is the concept that realizes the features described on the following pages.
 – The wsrep implementation by Codership is named Galera Replication and takes the form of a patch for MySQL (there is an implementation for PostgreSQL too, but I don’t know about it). MySQL compiled with the Galera Replication patch is distributed by Codership under the name Galera Cluster for MySQL.
  • http://www.codership.com/content/using-galera-cluster
 – Percona Server compiled with the Galera Replication patch is named Percona XtraDB Cluster.
  • http://www.percona.com/software/percona-xtradb-cluster
 – MariaDB compiled with the Galera Replication patch is named MariaDB Galera Cluster.
  • https://kb.askmonty.org/en/what-is-mariadb-galera-cluster/
Multi Master
• All nodes are writable
 – Of course, this is technically true of standard MySQL Replication too.
 – But if you write to a table on a Replication Slave, those updates (INSERT, UPDATE, DELETE, and so on) are replicated to no node other than that server’s own Slaves. This breaks consistency, so we normally forbid writes to a Replication Slave with the read_only option.
 – The wsrep library certifies consistency whenever you write to a table on any node, using (Virtual) Synchronous Replication.
(Virtual) Synchronous
• All nodes have the same data (more or less) at any time
 – When you write data to a node, the wsrep library hooks the InnoDB API and writes wsrep’s own log (Galera Cache) in row-based binary log format before it answers the commit request.
 – The Galera Cache entry is pushed through the communication path, and every other node checks: “Can this be executed with integrity, without an update conflict?”
 – Only when wsrep gets that confirmation (Certification) from all other nodes is the update finally committed. If wsrep receives a Certification Failure instead, the transaction is rolled back (= returns Error: 1213 “Deadlock found when trying to get lock.”) and no node applies the update, which keeps the data consistent.
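Because a Certification Failure reaches the client as Error 1213, an application writing to the cluster may want to retry the transaction. The following is only a minimal sketch of that idea: `DatabaseError`, `run_with_retry`, and `flaky_commit` are hypothetical names made up for illustration, not part of any MySQL driver, and the failing transaction is simulated.

```python
import time

ER_LOCK_DEADLOCK = 1213  # error code the slide says is returned on Certification Failure


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL errno."""

    def __init__(self, errno, msg):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction(); retry only when it fails with error 1213."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Any other error, or a failure on the last attempt, goes to the caller.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * attempt)


# Simulated transaction: certification fails twice, then the commit succeeds.
attempts = []


def flaky_commit():
    attempts.append(1)
    if len(attempts) < 3:
        raise DatabaseError(ER_LOCK_DEADLOCK, "Deadlock found when trying to get lock")
    return "committed"


result = run_with_retry(flaky_commit)
print(result, len(attempts))
```

With a real driver, the `except` clause would catch that driver's own exception type and read the error code from it.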
(Virtual) Synchronous
• Why is it called “Virtual” Synchronous?
 – There is a little lag between Certification and the real write to disk.
 – The server that first received the write request writes the data to disk and notifies the others that “this data is committed”. All other servers write to disk only after that notification (so that the transaction can still be rolled back until then).
Parallel in Row-Level
• Every update logged in Galera Cache is in row-based binary log format
 – This isolates each Certification at row level, so Certifications can run in multiple threads.
 – Statement-Based Replication cannot detect that replication has lost data consistency. Have you ever faced Error: 1032 “Can’t find record in `tablename`”?
• Not only Galera Cache but also the “binary log” used for standard replication is row-based.
 – So it inherits some weak points of standard RBR.
• MySQL 5.6 adds a multi-threaded SQL_Thread, but it parallelizes only across separate databases.
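The row-level isolation above can be pictured with a toy model. This is only a sketch under assumptions, not how the wsrep library is implemented: a write set is modeled as a list of hypothetical `(table, primary_key, change)` tuples, and two write sets are treated as conflicting only when they touch the same row.

```python
def row_keys(write_set):
    """Return the set of (table, primary_key) pairs that a write set touches."""
    return {(table, pk) for table, pk, _change in write_set}


def conflicts(ws_a, ws_b):
    """Two write sets conflict only if they touch at least one common row."""
    return bool(row_keys(ws_a) & row_keys(ws_b))


# Hypothetical write sets: ws1 and ws2 touch different rows, ws1 and ws3 the same row.
ws1 = [("users", 1, {"name": "alice"})]
ws2 = [("users", 2, {"name": "bob"})]
ws3 = [("users", 1, {"name": "carol"})]

print(conflicts(ws1, ws2))  # False: different rows, safe to handle in parallel
print(conflicts(ws1, ws3))  # True: same row, one of them must fail
```

A statement-based log carries no such row keys, which is why this kind of check needs row-based logging.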
State Snapshot Transfer (SST)
• A full data copy from a node already in the cluster to a joining node, automated by the wsrep library.
 – You can choose “rsync” (physical copy, needs a lock), “mysqldump” (logical copy, needs a lock too), or a “custom script”.
 – XtraDB Cluster can use xtrabackup (physical copy, without a lock) as the “custom script”. This makes adding a new node, or bringing back a node that went down, lock-free and almost fully automatic.
• If wsrep detects an “impossible” (= cannot be certified) update because the data is mismatched, it executes SST at that point and keeps the cluster consistent.
Incremental State Transfer (IST)
• An incremental data copy from a node already in the cluster to a joining node, automated by the wsrep library.
 – When a cluster node leaves and comes back shortly afterwards, wsrep doesn’t need SST as long as the other nodes have enough Galera Cache to cover the whole downtime.
 – IST needs no locks and no reads of the physical data files, because it uses only Galera Cache. And since Galera Cache holds the most recent updates to the cluster, it is probably still hot in cache, so IST is much faster than SST.
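The choice between IST and SST can be sketched as a simple coverage check. This is a simplified illustration, not the real wsrep logic: it assumes every update has an increasing sequence number, and `choose_state_transfer`, `joiner_seqno`, and `gcache_first_seqno` are hypothetical names introduced here.

```python
def choose_state_transfer(joiner_seqno, gcache_first_seqno):
    """Return "IST" if the donor's Galera Cache covers everything the joiner missed.

    joiner_seqno: last update the joining node has applied, or None for a new node.
    gcache_first_seqno: oldest update still held in the donor's Galera Cache.
    """
    if joiner_seqno is None:
        return "SST"  # a brand-new node has no data to build on
    first_missing = joiner_seqno + 1
    # IST works only if the cache still holds the first update the joiner lacks.
    return "IST" if gcache_first_seqno <= first_missing else "SST"


print(choose_state_transfer(100, 50))    # short downtime, cache covers the gap
print(choose_state_transfer(100, 500))   # long downtime, cache no longer covers it
print(choose_state_transfer(None, 50))   # new node joining for the first time
```

In this model a larger Galera Cache tolerates a longer downtime before a full SST becomes necessary.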
Flat Topology + Automated SST =
• You won’t be bothered at all when
 – “OMG! Master is down! Switch over now!”
  • There’s nothing to do, as long as you’ve already prepared your LVS configuration.
 – “OMG! The data on Master and Slave are different!”
  • wsrep has already fixed it before you notice; all you see is the fact that SST was done.
 – “OMG! Today is Christmas but I don’t have a girlfriend!”
  • I’m very, veeeeery sorry. If I hadn’t introduced XtraDB Cluster, you could have gone on without noticing that.
Combination with Standard Replication
• wsrep hooks only the InnoDB write path, which means wsrep is separate from MySQL standard Replication.
 – Each node in an XtraDB Cluster can be a Slave of another mysqld.
 – Any mysqld can be a Slave of a node in an XtraDB Cluster.
 – But an XtraDB Cluster node does not write updates that arrive through Galera Cache to its binary log (and the other way around), so you have to put log-slave-updates in your my.cnf.
 – Also be aware that wsrep does not support (more precisely, it ignores) updates to anything other than InnoDB.
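As a rough illustration of the my.cnf point above, this sketch renders a `[mysqld]` fragment for a cluster node that also acts as a replication Master. `log-slave-updates` comes from the slide; `server-id` and `log-bin` are standard MySQL options, but their values here are assumptions, and `render_mycnf` is a hypothetical helper.

```python
def render_mycnf(options):
    """Render a [mysqld] section; an option whose value is None is written as a bare flag."""
    lines = ["[mysqld]"]
    for name, value in options.items():
        lines.append(name if value is None else f"{name} = {value}")
    return "\n".join(lines) + "\n"


fragment = render_mycnf({
    "server-id": 101,           # assumed value; must be unique among replicating servers
    "log-bin": "mysql-bin",     # assumed base name for the binary log
    "log-slave-updates": None,  # also log updates that arrived through replication
})
print(fragment)
```

The rest of the node's configuration (wsrep settings and so on) is left out on purpose.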