
Introduction to XtraDB Cluster



Slides from #2, held on 2013/03/23.



  1. 1. XtraDB Cluster #2 at 2013/03/23 yoku0825
  2. 2. \Hello!/ • I'm yoku0825, working as a DBA for my company's web services. • Husband of my wife :) • Father of my son :) • I love MySQL so much that I can't even log in to PostgreSQL or Oracle :( • Perhaps a Perl Monger, though I seldom write code!!
  3. 3. Today, I'll talk about Percona XtraDB Cluster
  4. 4. Today, I'll talk about Percona XtraDB Cluster (read aloud: "Percona Extra-DB Cluster")
  5. 5. Do you know Percona LLC? Their best-known works are Percona Toolkit, a set of utility tools for MySQL, and Percona Server, an enhanced MySQL fork.
  6. 6. XtraDB Cluster is Percona LLC's implementation of Galera Replication. What's Galera Replication? It will come up again later.
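  7. 7. Anyway, it just sounds good, doesn't it? "Extra-DB Cluster"
  8. 8. Back when I was on a benchmarking spree, I went to the company cafeteria and
  9. 9. "Oh, an XtraDB." "Huh?" "Hot, please." "Huh?" "Huh?"
  10. 10. orz
  11. 11. That's how good it sounds, right? :D Although "XtraDB" on its own is the name of a storage engine.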
  12. 12. Percona XtraDB Cluster • Clustered by Write Set Replication (wsrep), which is – Multi Master – (Virtual) Synchronous – Parallel in Row-Level • So it makes a flat MySQL Replication topology (no node is designated as Master or Slave). • And wsrep performs automated full-physical and incremental-logical data synchronization.
  13. 13. wsrep • wsrep is the concept that realizes the features described on the following pages. – The wsrep implementation by Codership is named Galera Replication. It takes the form of a patch for MySQL (there is an implementation for PostgreSQL too, but I don't know about it). MySQL compiled with the Galera Replication patch is distributed by Codership as Galera Cluster for MySQL. – Percona Server compiled with the Galera Replication patch is named Percona XtraDB Cluster. – MariaDB compiled with the Galera Replication patch is named MariaDB Galera Cluster.
  14. 14. Multi Master • All nodes are writable – Of course, the same is true of standard MySQL Replication. – But if you write to a table on a Replication Slave, those updates (INSERT, UPDATE, DELETE, and so on) are not replicated to any node other than that Slave's own Slaves. This breaks consistency, so we normally forbid writes on a Replication Slave with the read_only option variable. – The wsrep library certifies consistency when you write to a table on any node, by using (Virtual) Synchronous Replication.
  15. 15. (Virtual) Synchronous • All nodes have (roughly) the same data at any time – When you write data to a node, the wsrep library hooks the InnoDB API and writes wsrep's own binary log (the Galera Cache) in row-based binary log format before it answers the commit request. – The Galera Cache entry is pushed through the communication path, and every other node checks "Can this be executed with integrity, without conflicting updates?" – Only when wsrep gets that confirmation (Certification) from all other nodes is the update finally committed. Otherwise, when wsrep receives a Certification Failure, the transaction is rolled back (= it returns Error 1213 "Deadlock found when trying to get lock.") and no node applies the update, which keeps the cluster consistent.
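To make the Certification step a little more concrete, here is a toy model in Python. It is only a sketch of the general idea (a write set fails if any of its rows was committed after the transaction's snapshot), not Galera's actual algorithm; the names `WriteSet`, `Certifier`, and `seen_seqno` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WriteSet:
    """Row keys touched by one transaction, plus the last seqno it had seen."""
    keys: frozenset
    seen_seqno: int


@dataclass
class Certifier:
    """Toy certifier; the idea is that every node runs the same check in the same order."""
    seqno: int = 0
    last_writer: dict = field(default_factory=dict)  # row key -> seqno of last commit

    def certify(self, ws: WriteSet) -> bool:
        # Conflict: some row was committed after this transaction's snapshot.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.seen_seqno:
                return False  # the client would see Error 1213 here
        # No conflict: assign the next global seqno and record the writer.
        self.seqno += 1
        for key in ws.keys:
            self.last_writer[key] = self.seqno
        return True


cert = Certifier()
a = WriteSet(frozenset({("t1", 1)}), seen_seqno=0)
b = WriteSet(frozenset({("t1", 1)}), seen_seqno=0)  # same row, same snapshot as a
c = WriteSet(frozenset({("t1", 2)}), seen_seqno=0)  # different row
results = [cert.certify(a), cert.certify(b), cert.certify(c)]
print(results)  # [True, False, True]
```

Because the check works on row keys, `a` and `c` do not conflict even though they touch the same table, while `b` loses to `a`.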
  16. 16. (Virtual) Synchronous • Why is it called "Virtual" Synchronous? – There is a little lag between Certification and the real write to disk. – The server that first received the write request writes the data to disk and notifies the others that the data is committed. All other servers write to disk only after that notification (so that the transaction can still be rolled back until then).
  17. 17. Parallel in Row-Level • Every update logged in the Galera Cache is a row-based binary log entry – This isolates each Certification at row level, so Certifications can run in multiple threads. – Statement-Based Replication can't even detect that replication has lost data consistency. Have you ever faced Error 1032 "Can't find record in `tablename`"? • Not only the Galera Cache but also the "binary log" used for standard replication is row-based. – So it inherits some weak points of standard RBR. • MySQL 5.6 introduces a multi-threaded SQL_Thread, but it parallelizes only at database level.
  18. 18. State Snapshot Transfer (SST) • It is a full data copy from a node already in the cluster to a joining node, automated by the wsrep library. – You can choose "rsync" (physical copy, needs a lock), "mysqldump" (logical copy, needs a lock too), or a "custom script". – XtraDB Cluster can use xtrabackup (physical copy, without a lock) as the "custom script". This makes adding a new node, or bringing back a node that went down, lock-free and almost fully automatic. • If wsrep detects an "impossible" update (one that cannot be certified) because the data is mismatched, wsrep executes SST right away and keeps the cluster consistent.
  19. 19. Incremental State Transfer (IST) • It is an incremental data copy from a node already in the cluster to a joining node, automated by the wsrep library. – When a node leaves the cluster and comes back shortly afterwards, wsrep doesn't need SST as long as the Galera Cache covers the whole downtime. – IST needs no locks and no reads of the physical data files, because it uses only the Galera Cache. And since the Galera Cache holds the most recent updates for the cluster, it is probably still hot in memory, so IST is much faster than SST.
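The "does the Galera Cache cover the downtime?" rule can be sketched as a small decision function. This is my own simplified illustration, assuming the cache holds a contiguous range of sequence numbers; `choose_state_transfer` and its parameters are hypothetical names, not wsrep API.

```python
from typing import Optional


def choose_state_transfer(joiner_seqno: Optional[int],
                          cache_first: int,
                          cache_last: int) -> str:
    """Return "IST" if the donor's cache covers what the joiner missed, else "SST".

    joiner_seqno: last seqno the joining node applied (None for a brand-new node).
    cache_first / cache_last: oldest and newest seqno still held in the donor's cache.
    """
    if joiner_seqno is None:
        return "SST"  # no data at all, so a full copy is unavoidable
    next_needed = joiner_seqno + 1
    # The first missing write set must still be in the cache
    # (or the joiner is already fully caught up).
    if cache_first <= next_needed <= cache_last + 1:
        return "IST"
    return "SST"


print(choose_state_transfer(950, 900, 1000))   # short downtime
print(choose_state_transfer(100, 900, 1000))   # cache no longer reaches back far enough
print(choose_state_transfer(None, 900, 1000))  # new node
```

In other words, a larger cache lets a node stay away longer before it has to fall back to a full SST.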
  20. 20. Flat Topology + Automated SST = • You won't be bothered at all when – "OMG! The Master is down! We have to switch over now!" • There's nothing to do if you've already prepared your LVS configuration. – "OMG! The data differs between Master and Slave!" • wsrep has already fixed it before you even notice; all you see is the fact that an SST was done. – "OMG! Today is Christmas but I don't have a girlfriend!" • I'm very veeeeery sorry. If I hadn't introduced XtraDB Cluster, you could have gone on without noticing that.
  21. 21. Combination with Standard Replication • wsrep hooks only the InnoDB write path, which means wsrep is separate from standard MySQL Replication. – Each node in an XtraDB Cluster can be a Slave of another mysqld. – Any mysqld can be a Slave of a node in an XtraDB Cluster. – But an XtraDB Cluster node doesn't write a binary log for updates it receives through the Galera Cache, and vice versa, so you have to put log-slave-updates in your my.cnf. – And you have to know that wsrep doesn't support (more precisely, "ignores") anything except updates to InnoDB.
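As a rough illustration of the my.cnf side of this, a cluster node that should also act as a Master for standard replication might carry something like the fragment below. The server-id value and the log file base name are placeholders I made up; only log-slave-updates comes from the slide itself, and the other options are the usual companions for row-based binary logging.

```ini
[mysqld]
# Placeholder value; must be unique among all servers in the replication setup.
server-id = 101
# Write a binary log on this node (base name is a placeholder).
log-bin = mysql-bin
# wsrep assumes row-based logging.
binlog_format = ROW
# Also log updates that arrived through replication, not only local writes.
log-slave-updates
```

The wsrep settings themselves (provider library, cluster address, SST method) are left out here on purpose, since they depend on the installation.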
  22. 22. How about it? Do you feel a bit excited (waku-waku)?
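  23. 23. Of course, it has (plenty of) weaknesses too • The load skews toward I/O. • There is little Japanese-language information about the wsrep part. • You have to tinker with innobackupex to make it work with xtrabackup. • Without xtrabackup, SST takes a read lock. • RBR is assumed, so the binary logs get huge, and tables without a Primary Key are miserable. • The binary log exists only on the node that received the update. • Data outside InnoDB (XtraDB) is not synchronized in the first place.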
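  24. 24. Performance? • Hammering it with tpcc-mysql, it came out at roughly 30% below Semi-Sync (M/S). • Watch the network bandwidth and where the binary logs are stored, too. • There is probably some overhead from going through LVS as well. • Parallel replication needs a fair number of cores (one row uses one thread, so a query that updates 10,000 rows...). • Conflicting updates across servers end in a Certification Failure (treated as a Deadlock) rather than a lock wait, so the application has to handle it.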
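Handling the Certification Failure on the application side usually comes down to retrying the whole transaction when Error 1213 comes back. The sketch below shows that pattern without any real database driver: `DbError` is a hypothetical stand-in for a driver exception that exposes the MySQL error code, and `flaky_txn` only simulates two failures in a row.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code returned on Certification Failure


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error code."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(txn, max_attempts=3, backoff_s=0.0):
    """Run txn(); retry only on error 1213, re-raise anything else."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DbError as e:
            if e.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)  # simple linear backoff


calls = {"n": 0}


def flaky_txn():
    # Simulated transaction: fails certification twice, then commits.
    calls["n"] += 1
    if calls["n"] < 3:
        raise DbError(ER_LOCK_DEADLOCK, "Deadlock found when trying to get lock")
    return "committed"


result = run_with_retry(flaky_txn)
print(result, calls["n"])  # committed 3
```

The important point is that the retry has to re-run the entire transaction from the start, because the rolled-back work is gone.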
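  25. 25. ( ´-`).oO(In terms of difficulty, it's probably a close match with ndbcluster (= MySQL Cluster)) I expect some will disagree.
  26. 26. But it looks fun, doesn't it! Above all, it sounds good.
  27. 27. By the way, have you all noticed?
  28. 28. Actually, this isn't a talk about Perl. “Percona” =~ /^Per/ But “Percona” != “Perl”
  29. 29. The throwaway MySQL monitoring tool I talked about in #1
  30. 30. I gave up on it orz. I mean, Cacti already exists.
  31. 31. But it comes in handy now and then • When I want to drill down further than Cacti. • I've quietly added features to it on Fridays. • I also quietly use the one that counts slow log entries.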
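  32. 32. Oh, and one more thing
  33. 33. At the MyNA meeting the other day I said "I'm a Vimmer", but
  34. 34. That was a lie. Well, not exactly a lie: I'm a reluctant Vimmer who can only use Vim. My .vimrc contains nothing but gtags-related settings.
  35. 35. TeraPad is the best! I even write Perl scripts in TeraPad.
  36. 36. Thank you for listening. No punchline.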