Tungsten Replicator tutorial

Full tutorial of Tungsten Replicator installation and management

  1. ©Continuent 2013. Using Tungsten Replicator to solve replication problems
     Neil Armitage, Cluster Implementation Engineer, Continuent
     Giuseppe Maxia, QA Director, Continuent
     Monday, April 22, 2013
  2. About us
     • Neil Armitage
       • Continuent Tungsten Deployment and Support Engineer, Continuent, Inc
       • 20 years development and DB experience
     • Giuseppe Maxia, a.k.a. "The Data Charmer"
       • QA Director, Continuent, Inc
       • 25 years development and DB experience
       • Long-time MySQL community member; Oracle ACE Director
  3. Tungsten Replicator
     • Global transaction ID
     • Multiple masters
     • Multiple sources
     • Flexible topologies
     • Parallel replication
     • Heterogeneous replication
     • ... and more
  4. What Tungsten Replicator is NOT
     • Automated management
     • Automatic failover
     • Transparent connections
     • All the above (and more) are available with a commercial solution named Continuent Tungsten (a.k.a. Tungsten Enterprise)
  5. What are we talking about?
     • Requirements
     • Components
     • Installation
     • Topologies
     • Administration
     • Troubleshooting
  6. Tungsten Replicator Concepts
     • Replicator: the replication engine
     • Role: master, slave, direct slave
     • Service: a.k.a. "pipeline"
     • Stage: extract, queue, apply
  7. Tungsten Replicator Components
     • THL: Transaction History Log
     • Service schema: makes the node crash proof
     • Properties file: service definition
     • Tools: ruling from a centralized location
  8. Tungsten Replicator in a nutshell
     [Diagram: the master (host1) turns its binlog into a THL, and the slave (host2) keeps its own THL. Each THL event carries origin, seqno, and eventid; the seqno is the global transaction ID. Both nodes are labeled with trep_commit_seqno.]
  9. Planning
     • Hosts
     • Topology
     • Stand-alone or taking over
  10. [Diagram: available topologies: master-slave, fan-in slave, all-masters, star, and heterogeneous (MySQL and Oracle)]
  11. Installation
  12. Installation
      • System requirements
      • Validate first
      • Deploying from a single location
  13. Installation - tools
      • tools/tungsten-installer
      • tools/configure-service
      • tools/update
      • (Using the cookbook recipes, you hardly see them)
  14. Tungsten in practice: Installation
  15. Installation
      • Check the requirements
      • Get the binaries
      • Expand the tarball
      • Run cookbook
  16. Requirements
      • Java JRE or JDK (Sun/Oracle or OpenJDK)
      • Ruby 1.8 (only during installation)
      • ssh access to the same user in all nodes
      • MySQL user with all privileges
  17. Installation - Choices
      • --master-slave
      • --direct
  18. master-slave
      [Diagram: the master on host1 reads its binlog into a THL; the slaves on host2 and host3 each keep their own THL]
  19. direct
      [Diagram: the master on host1 has only its binlog; the slaves on host2 and host3 each have a relay log and a THL]
  20. Overview of Virtual Machines
      • Copy zip files from USB key
      • Expand on local disk
      • Start all 4 machines in VirtualBox
  21. Virtual Machines
      • 4 nodes, host1 to host4
      • Running CentOS 6.3 and Percona 5.5
      • Root and tungsten password = 'password'
      • localhost port 2222 redirects to 22 on hosts
          ssh -p 2222 tungsten@localhost
  22. VERY important definitions
      • Staging directory:
        • Where you unpack the software and run the installer
        • There is generally only one, in one host
        • Can be discarded after installation
      • Installation directory:
        • Where your installed software will go
        • There is one for every host
  23. Example
      • Staging directory: $HOME/tungsten-replicator-2.0.8-167 (on one host)
      • Installation directory: /opt/replication (on each of host1, host2, host3)
  24. Requirements: how to
      • Step by step: how it happened
  25. Installing VMs
      • Step-by-step demo
  26. Overview of Tungsten cookbook
  27. tungsten cookbook
          tungsten-replicator-2.0.8-167
          |
          +--/cluster-home
          +--/cookbook
          +--/tools
          +--/tungsten-replicator
  28. tungsten cookbook
          tungsten-replicator-2.0.8-167
          |
          +--/cookbook
             |
             +--COMMON_NODES.sh
             +--USER_VALUES.sh
             +--NODES_MASTER_SLAVE.sh
             +--install_master_slave
             +--show_cluster
             +--test_cluster
             ...
  29. tungsten cookbook
          tungsten-replicator-2.0.8-167
          |
          +--/cookbook
             |
             +--COMMON_NODES.sh
             +--USER_VALUES.sh
             +--NODES_ALL_MASTERS.sh
             +--install_all_masters
             +--show_cluster
             +--test_cluster
             ...
  30. tungsten cookbook
          tungsten-replicator-2.0.8-167
          |
          +--/cookbook
             |
             +--COMMON_NODES.sh
             +--USER_VALUES.sh
             +--NODES_STAR.sh
             +--install_star
             +--show_cluster
             +--test_cluster
             ...
  31. tungsten cookbook
          tungsten-replicator-2.0.8-167
          |
          +--/cookbook
             |
             +--COMMON_NODES.sh
             +--USER_VALUES.sh
             +--NODES_FAN_IN.sh
             +--install_fan_in
             +--show_cluster
             +--test_cluster
             ...
  32. tungsten cookbook
          $ cat COMMON_NODES.sh
          export NODE1=host1
          export NODE2=host2
          export NODE3=host3
          export NODE4=host4
  33. tungsten cookbook
          $ cat USER_VALUES.sh
          # User defined values for the cluster to be installed.
          export TUNGSTEN_BASE=$HOME/installs/cookbook
          export DATABASE_USER=tungsten
          export BINLOG_DIRECTORY=/var/lib/mysql
          export MY_CNF=/etc/my.cnf
          export DATABASE_PASSWORD=secret
          export DATABASE_PORT=3306
          export TUNGSTEN_SERVICE=cookbook
          export RMI_PORT=10000
          export THL_PORT=2112
          export START_OPTION=start
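Before running an installer against these values, it can help to confirm that none of them is empty. The following is a minimal sketch and not part of the cookbook: the function name `check_user_values` is hypothetical, and the variable list is simply the one shown in USER_VALUES.sh above.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not part of the Tungsten cookbook):
# report every variable from USER_VALUES.sh that is unset or empty.
check_user_values() {
    missing=""
    for name in TUNGSTEN_BASE DATABASE_USER BINLOG_DIRECTORY MY_CNF \
                DATABASE_PASSWORD DATABASE_PORT TUNGSTEN_SERVICE \
                RMI_PORT THL_PORT START_OPTION
    do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all values set"
}
```

A possible use is to source the values file first and then call the function: `. cookbook/USER_VALUES.sh && check_user_values`.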
  34. Getting started: VALIDATE FIRST
          export VERBOSE=1
          ./cookbook/check_cookbook
          ./cookbook/validate_cluster
  35. Sample master-slave installation
      • edit cookbook/COMMON_NODES.sh
      • edit cookbook/USER_VALUES.sh
      • run cookbook/install_master_slave
      • and then:
        • run cookbook/show_cluster
        • run cookbook/test_cluster
  36. What does the installation do
      • 1: Validate all servers (host1, host2, host3, host4)
      • Report all errors
  37. What does the installation do
      • 1: (again) Validate all servers (host1, host2, host3, host4)
  38. What does the installation do
      • 2: Install Tungsten in all servers (host1, host2, host3, host4)
      • Directories under $HOME/tinstall/: config/, releases/, relay/, thl/, tungsten/, backups/
  39. Example (from manual installation)
          ssh r2 chmod 444 $HOME/tinstall
          ./tools/tungsten-installer \
              --master-slave \
              --master-host=r1 \
              --datasource-user=tungsten \
              --datasource-password=secret \
              --service-name=dragon \
              --home-directory=$HOME/tinstall \
              --thl-directory=$HOME/tinstall/logs \
              --relay-directory=$HOME/tinstall/relay \
              --cluster-hosts=r1,r2,r3,r4 \
              --start
          ERROR >> qa.r2.continuent.com >> /home/tungsten/tinstall is not writeable
  40. Example
          ssh r2 chmod 755 $HOME/tinstall
          ./tools/tungsten-installer \
              --master-slave \
              --master-host=r1 \
              --datasource-user=tungsten \
              --datasource-password=secret \
              --service-name=dragon \
              --home-directory=$HOME/tinstall \
              --thl-directory=$HOME/tinstall/logs \
              --relay-directory=$HOME/tinstall/relay \
              --cluster-hosts=r1,r2,r3,r4 \
              --start
          # no errors
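The failure in the first example comes from an installation directory that the installer cannot write to. A local check along the following lines can catch that condition early. This is only a sketch: `check_install_dir` is a hypothetical helper, and the real installer runs its own validation on every host.

```shell
#!/bin/sh
# Hypothetical local check (the real installer validates every host itself):
# succeed only if the given installation directory exists and is writable.
check_install_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "ERROR >> $dir does not exist"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "ERROR >> $dir is not writeable"
        return 1
    fi
    echo "OK >> $dir"
}
```

On each host it could be called as `check_install_dir "$HOME/tinstall"` before the installer is launched.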
  41. After installation: a tour of the cookbook utilities
  42. General principles (1)
      • Scripts without extension are designed to be launched by users
        • e.g. ./cookbook/help
        • ./cookbook/install_master_slave
      • Scripts with extension ".sh" are either for internal use only or deprecated
      • ./cookbook/install_* scripts can be used before installing. Almost everything else requires an installed topology
  43. General principles (2)
      • After installation there is a file CURRENT_TOPOLOGY in the staging directory
      • cookbook scripts can be used either from the staging directory or from the installation directory
  44. Cookbook tour: help and checks
          ./cookbook/check_cookbook
          ./cookbook/help
          ./cookbook/readme
  45. Cookbook tour: Getting information
          ./cookbook/show_cluster
          ./cookbook/paths
          ./cookbook/backups
          ./cookbook/services
          ./cookbook/query_node {node} {query}
          ./cookbook/query_all_nodes {query}
  46. Cookbook tour: Inspecting replication
          ./cookbook/replicator
          ./cookbook/trepctl
          ./cookbook/thl
          ./cookbook/show_conf
          ./cookbook/edit_conf
          ./cookbook/show_log
          ./cookbook/vimlog
          ./cookbook/emacslog
  47. Cookbook tour: testing tools
          ./cookbook/test_cluster
          ./cookbook/start_load [start|stop]
          ./cookbook/test_all_topologies
  48. Cookbook tour: powerful admin tools
          ./cookbook/heartbeat
          ./cookbook/switch
          ./cookbook/add_node_master_slave
          ./cookbook/add_node_star
          ./cookbook/copy_backup
          ./cookbook/clear_cluster # <--- CAUTION!
  49. More installation
  50. DRY-RUN
      • Method to simulate installation
      • Does NOT perform installation
      • Does NOT even do validation
      • It only shows the commands used to install
      • Allows you to get the commands and do an installation manually (e.g. when you can't ssh between nodes)
  51. DRY-RUN
          export DRYRUN=1
          ./cookbook/install_master_slave
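The dry-run idea described above, printing commands instead of executing them, is a common shell pattern. The sketch below illustrates the pattern in general; it is not the cookbook's own code, and the helper name `run` is an assumption made for this example.

```shell
#!/bin/sh
# Generic dry-run wrapper (an illustration, not the cookbook's own code).
# With DRYRUN set to a non-empty value, print the command instead of running it.
run() {
    if [ -n "${DRYRUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With `DRYRUN=1`, a call such as `run ssh host2 uptime` only prints `ssh host2 uptime`; with DRYRUN empty or unset, the command is executed.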
  52. Intro to multi-master installation
  53. How tungsten-installer Works for Basic Master/Slave Deployment
      [Diagram: working from the staging copy of the files, the installer checks prerequisites, copies code, and configures db1, db2, and db3]
  54. From Master/Slave Replication ...
      [Diagram: db1, db2, and db3 each run a replicator with service alpha]
      • Install master and slaves on the whole cluster: tungsten-installer
  55. To Multi-Master
      [Diagram: db1 and db2 each run a replicator with services alpha and bravo]
      • Install master on db1: tungsten-installer
      • Install master on db2: tungsten-installer
      • Install slave service on db1: configure-service
      • Install slave service on db2: configure-service
  56. tungsten-installer master 1
          TUNGSTEN_HOME=/home/tungsten/installs/cookbook
          ./tools/tungsten-installer \
              --master-slave \
              --master-host=$MASTER1 \
              --datasource-port=3306 \
              --datasource-user=tungsten \
              --datasource-password=secret \
              --datasource-log-directory=/var/lib/mysql \
              --service-name=alpha \
              --home-directory=$TUNGSTEN_HOME \
              --cluster-hosts=$MASTER1 \
              --start
      • Creating service alpha
      • Notice: --cluster-hosts has only one host
  57. tungsten-installer master 2
          TUNGSTEN_HOME=/home/tungsten/installs/cookbook
          ./tools/tungsten-installer \
              --master-slave \
              --master-host=$MASTER2 \
              --datasource-port=3306 \
              --datasource-user=tungsten \
              --datasource-password=secret \
              --datasource-log-directory=/var/lib/mysql \
              --service-name=bravo \
              --home-directory=$TUNGSTEN_HOME \
              --cluster-hosts=$MASTER2 \
              --start
      • Creating service bravo
      • Notice: --cluster-hosts has only one host
  58. Configure Service master 1
          TUNGSTEN_HOME=/home/tungsten/installs/cookbook
          $TUNGSTEN_HOME/tungsten/tools/configure-service -C \
              --quiet \
              --host=$MASTER1 \
              --datasource=$MASTER1 \
              --local-service-name=alpha \
              --role=slave \
              --service-type=remote \
              --release-directory=$TUNGSTEN_HOME/tungsten \
              --skip-validation-check=THLStorageCheck \
              --master-thl-host=$MASTER2 \
              --master-thl-port=2112 \
              --svc-start bravo
      • Notice: bravo is the master service in host 2
  59. Configure Service master 2
          TUNGSTEN_HOME=/home/tungsten/installs/cookbook
          $TUNGSTEN_HOME/tungsten/tools/configure-service -C \
              --quiet \
              --host=$MASTER2 \
              --datasource=$MASTER2 \
              --local-service-name=bravo \
              --role=slave \
              --service-type=remote \
              --release-directory=$TUNGSTEN_HOME/tungsten \
              --skip-validation-check=THLStorageCheck \
              --master-thl-host=$MASTER1 \
              --master-thl-port=2112 \
              --svc-start alpha
      • Notice: alpha is the master service in host 1
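The four commands above follow a pattern that extends to more than two masters: each host runs one master service of its own, plus one remote slave service for every other master. The sketch below only prints that plan and calls no Tungsten tool; the function name `all_masters_plan` and the `host:service` argument format are assumptions made for the illustration.

```shell
#!/bin/sh
# Print which services an all-masters topology needs on each host.
# Arguments are host:service pairs, e.g. db1:alpha db2:bravo.
# This only prints a plan; it does not call any Tungsten tool.
all_masters_plan() {
    for local_pair in "$@"; do
        local_host=${local_pair%%:*}
        local_service=${local_pair#*:}
        echo "$local_host: master service $local_service"
        for remote_pair in "$@"; do
            remote_host=${remote_pair%%:*}
            remote_service=${remote_pair#*:}
            if [ "$remote_host" != "$local_host" ]; then
                echo "$local_host: slave service $remote_service (THL from $remote_host)"
            fi
        done
    done
}

all_masters_plan db1:alpha db2:bravo
```

For two masters this lists four services, matching the two tungsten-installer calls and the two configure-service calls shown in the slides. With N masters the count grows to N master services plus N × (N − 1) slave services.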
  60. From Master/Slave Replication ...
      [Diagram: db1, db2, and db3 each run a replicator with service db1]
          ./cookbook/install_master_slave
  61. How Do I Install Fan-In Replication?
      [Diagram: db1 runs service db1, db2 runs service db2, and db3 runs both services db1 and db2]
          ./cookbook/install_fan_in
  62. How Do I Install Multi-Master?
      [Diagram: db1 and db2 each run a replicator with services db1 and db2]
          ./cookbook/install_all_masters
  63. How Do I Extend Multi-Master?
      [Diagram: db1, db2, and db3 each run a replicator with services db1, db2, and db3]
  64. How Do I Extend Multi-Master?
      [Diagram: db1, db2, db3, and db4 each run a replicator with services db1, db2, db3, and db4]
  65. How Do I Install a Star Topology?
      [Diagram: db3 is the HUB and runs services db1, db2, and db3; db1 runs services db1 and db3; db2 runs services db2 and db3]
          ./cookbook/install_star
  66. How Do I Extend a Star Topology?
      [Diagram: db4 joins as a spoke running services db3 and db4; the hub db3 gains service db4]
  67. How Do I Extend a Star Topology?
      [Diagram: db5 joins as a spoke running services db3 and db5; the hub db3 gains service db5]
  68. BI-DIR: the painless way
      • edit cookbook/COMMON_NODES.sh
      • edit cookbook/USER_VALUES.sh
      • remove two nodes
      • edit the variables in cookbook/NODES_ALL_MASTERS.sh
      • cookbook/install_all_masters
  69. Multiple masters
      • fan-in
      • Steps:
        • install a master service in each node
        • install a slave service for each master in the fan-in node
      • or:
        • cookbook/install_fan_in
  70. Multiple masters
      • star topology
      • Steps:
        • install a master service in each server
        • in the hub, install a slave service for each spoke
        • in each spoke, install a slave service for the hub, using bypass option
      • cookbook/install_star
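The star steps above can be written out as a service plan. The sketch below prints which services each node needs and runs nothing against Tungsten; `star_plan` is a hypothetical name, and services are named after their master host as in the star diagrams.

```shell
#!/bin/sh
# Print the services a star topology needs, following the steps above.
# First argument: the hub host; remaining arguments: the spoke hosts.
# Services are named after their master host, as in the diagrams.
star_plan() {
    hub="$1"
    shift
    echo "$hub (hub): master service $hub"
    for spoke in "$@"; do
        echo "$hub (hub): slave service $spoke"
    done
    for spoke in "$@"; do
        echo "$spoke (spoke): master service $spoke"
        echo "$spoke (spoke): slave service $hub"
    done
}

star_plan db3 db1 db2
```

Adding a spoke adds three services in this model (one master and one slave service on the new spoke, one slave service on the hub), whereas every existing node is touched when an all-masters topology grows.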
  71. Taking Over from Standard Replication
      • cookbook/install_standard_replicaton
      • cookbook/takeover
  72. Replication Management
  73. Common Commands
      • replicator
      • trepctl
      • thl
      • the Tungsten service schema
  74. replicator
      • It's the service provider
      • You launch it once when you start
      • You may restart it when you change config
  75. trepctl
      • Tungsten Replicator ConTroLler
      • It's the driving seat for your replication
      • You can start, update, and stop services
      • You can get specific info
  76. trepctl
      • Tungsten Replicator Controller
      • put services online or offline
      • check status
      • skip events
      • inspect internals
      • change roles
      • heartbeat
      • backup/restore
      • ... and a lot more
  77. thl
      • Transaction History Log
      • Gives you access to the Tungsten transaction history logs
  78. thl
      • Transaction History Log
      • info
      • index
      • list (total or a specific event, or by range)
      • purge
  79. Tungsten service schema
      • one for each service
      • named "tungsten_SERVICE_NAME"
      • e.g. tungsten_alpha, tungsten_dragon
      • Most important table: trep_commit_seqno
  80. Looking at the tungsten service db
          select * from tungsten_dragon.trep_commit_seqno\G
          ******************* 1. row *******************
          task_id: 0
          seqno: 102
          fragno: 0
          last_frag: 1
          source_id: qa.r1.continuent.com
          epoch_number: 0
          eventid: mysql-bin.000002:0000000000018903;0
          applied_latency: 0
          update_timestamp: 2012-02-06 05:56:12
          shard_id: tungsten_dragon
          extract_timestamp: 2012-02-06 05:56:09
  81. Where are the tools
      • in the tungsten directory: $TUNGSTEN_BASE/tungsten/tungsten-replicator/bin
          replicator # the daemon
          trepctl    # replicator controller
          thl        # transaction history log tool
  82. Starting and stopping the replicator
          cd $TUNGSTEN_BASE/tungsten/tungsten-replicator/bin
          ./replicator status
          Tungsten Replicator Service is running (PID:32400).
          ./replicator stop
          Stopping Tungsten Replicator Service...
          Stopped Tungsten Replicator Service.
          ./replicator start
          Starting Tungsten Replicator Service...
          ... or ./cookbook/replicator ...
  83. Checking replicator vitals
          trepctl services
          Processing services command...
          NAME VALUE
          ---- -----
          appliedLastSeqno: -1 # bad sign?
          appliedLatency : -1.0
          role : slave
          serviceName : dragon
          serviceType : local
          started : true
          state : ONLINE
          Finished services command...
  84. Sending a heartbeat
          trepctl -host $MASTER_HOST heartbeat
          trepctl services
          Processing services command...
          NAME VALUE
          ---- -----
          appliedLastSeqno: 102
          appliedLatency : 3.139
          role : slave
          serviceName : dragon
          serviceType : local
          started : true
          state : ONLINE
          Finished services command...
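When these checks are scripted, a single value usually has to be pulled out of the "name : value" listing. The sketch below does that with awk. The helper name `trep_field` is an assumption, and the sample text is copied from the slide so that the example runs without a replicator.

```shell
#!/bin/sh
# Extract one field from "trepctl services" or "trepctl status" style output
# ("name : value" lines) read on standard input.
trep_field() {
    awk -F':' -v key="$1" '
        {
            name = $1
            gsub(/[ \t]/, "", name)          # drop padding around the field name
            if (name == key) {
                # Everything after the first colon is the value (it may contain colons).
                value = substr($0, index($0, ":") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit
            }
        }'
}

# Sample text copied from the slide above.
sample='appliedLastSeqno: 102
appliedLatency  : 3.139
role            : slave
serviceName     : dragon
state           : ONLINE'

printf '%s\n' "$sample" | trep_field appliedLastSeqno
printf '%s\n' "$sample" | trep_field state
```

Against a live replicator the same helper could be fed directly, for example `trepctl services | trep_field appliedLatency`.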
  85. Replicator status (1)
          trepctl status
          Processing status command...
          NAME VALUE
          ---- -----
          appliedLastEventId : mysql-bin.000002:0000000000018903;0
          appliedLastSeqno : 102
          appliedLatency : 3.139
          clusterName : default
          currentEventId : NONE
          currentTimeMillis : 1328504342058
          dataServerHost : qa.r4.continuent.com
          extensions :
          latestEpochNumber : 0
          masterConnectUri : thl://qa.r1.continuent.com:2112/
          masterListenUri : thl://qa.r4.continuent.com:2112/
          maximumStoredSeqNo : 102
          minimumStoredSeqNo : 0
          [...]
  86. Replicator status (2)
          [...]
          offlineRequests : NONE
          pendingError : NONE
          pendingErrorCode : NONE
          pendingErrorEventId : NONE
          pendingErrorSeqno : -1
          pendingExceptionMessage: NONE
          resourcePrecedence : 99
          rmiPort : 10000
          role : slave
          seqnoType : java.lang.Long
          serviceName : dragon
          serviceType : local
          simpleServiceName : dragon
          siteName : default
          sourceId : qa.r4.continuent.com
          state : ONLINE
          timeInStateSeconds : 245.215
          uptimeSeconds : 245.539
          Finished status command...
  87. A failover scenario I: MySQL native replication
  88. 1. One master, two slaves
      • Loading the "employees" test database
  89. 2. Master goes away
      • Stop replication
      • Slaves are updated at different levels
          # 2
          select count(*) from titles
          333,145
          # 3
          select count(*) from titles
          443,308
  90. 3. Look into Slave #2 binary logs
      • find the last transaction
  91. 4. Look into Slave #3 binary logs
      1. find the transaction that was last in slave #2
      2. Recognize that last transaction in the log of slave #3 (This can actually take you a LOOOONG TIME)
      3. Get the position immediately after this transaction
      4. (e.g. 134000 in file mysql-bin.000018)
  92. 5. Promote Slave #3 to master
      • in slave #2
          CHANGE MASTER TO
              master_host='slave_3_IP',
              master_user='slavename',
              master_password='slavepassword',
              master_log_file='mysql-bin.000018',
              master_log_pos=134000;
  93. A failover scenario II: Tungsten Replicator
  94. 1. One master, two slaves
      • loading the 'employees' test database
  95. 2. Master goes away
      • Stop replication
      • Slaves are updated at different levels
          # 2
          select count(*) from titles
          333,145
          # 3
          select count(*) from titles
          443,308
  96. 3. No need to find the last transaction
          # simply change roles
          trepctl -host slave3 setrole -role master
          trepctl -host slave2 setrole -role slave -uri thl://slave3
          trepctl -host slave3 online
          State: ONLINE
          trepctl -host slave2 online
          State: GOING-ONLINE:SYNCHRONIZING
  97. 4. Check that the slave has synchronized
          # new master
          select seqno from tungsten.trep_commit_seqno;
          78
          # new slave
          select seqno from tungsten.trep_commit_seqno;
          64
  98. 4. Tell the replicator to hurry up
          # new master
          trepctl -host slave3 flush
          Master log is synchronized with database at log sequence number: 78
          # new slave
          trepctl -host slave2 wait -applied 78
          ONLINE
          select seqno from tungsten.trep_commit_seqno;
          78
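Because every node records the same global sequence number, choosing which slave to promote reduces to comparing integers instead of searching binary logs. The following sketch picks the most advanced slave from a list of "host seqno" pairs. The function name `most_advanced` is hypothetical, and the step that collects the pairs is deliberately left out.

```shell
#!/bin/sh
# Given "host seqno" lines on standard input, print the host with the highest
# seqno, i.e. the slave that has applied the most transactions.
# How the lines are collected (for instance by querying trep_commit_seqno on
# each slave) is left out of this sketch.
most_advanced() {
    awk 'NR == 1 || $2 + 0 > best { best = $2 + 0; host = $1 }
         END { if (NR > 0) print host }'
}

printf '%s\n' 'slave2 64' 'slave3 78' | most_advanced
```

With the values from the scenario above (64 on slave2, 78 on slave3), the sketch selects slave3, which is the node promoted in the slides. On a tie, the first host listed is kept.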
  99. 4. ... and we're done
          # new master
          select count(*) from employees.titles
          count(*)
          443308
          # new slave:
          count(*)
          443308
  100. Planned role switch
           cookbook/install_master_slave
           cookbook/switch
  101. Switching roles in master/slave replication (1)
       [Diagram: db1, db2, and db3 each run a replicator with service db1; all three are online]
  102. Switching roles in master/slave replication (2)
       [Diagram: one node is offline, the other two are online]
  103. Switching roles in master/slave replication (3)
       [Diagram: one node is offline, the other two are online] Wait for transactions to be applied
  104. Switching roles in master/slave replication (4)
       [Diagram: all three nodes are offline] Slaves go offline
  105. Switching roles in master/slave replication (5)
       [Diagram: all three nodes are offline] Slave is promoted. Notice: 2 masters, but offline
  106. Switching roles in master/slave replication (6)
       [Diagram: all three nodes are offline] Old master becomes slave
  107. Switching roles in master/slave replication (7)
       [Diagram: all three nodes are offline] Slaves are directed to new master
  108. Switching roles in master/slave replication (8)
       [Diagram: all three nodes are online] All nodes go online, using new master
  109. Tungsten GTID vs MySQL 5.6 GTID
       • What is GTID
       • How it works in Tungsten
       • How it works (or not) in MySQL 5.6
  110. Without global transaction ID
       [Diagram: servers A, B, and C (one master, two slaves); the same commit sits at a different binlog position on each server]
  111. With global transaction ID
       [Diagram: servers A, B, and C (one master, two slaves); the same commit carries id #200 on every server]
  112. Tungsten and global transaction ID: activation
       • (none): active by default
  113. Tungsten and global transaction ID: status
           trepctl status
           Processing status command...
           NAME VALUE
           ---- -----
           appliedLastEventId : mysql-bin.000002:0000000000001442;0
           appliedLastSeqno : 6
           appliedLatency : 0.862
           clusterName : default
           currentEventId : NONE
           currentTimeMillis : 1354304680923
           dataServerHost : qa.r4.continuent.com
  114. Tungsten and global transaction ID: seeing transactions
           thl list -seqno 6
           SEQ# = 6 / FRAG# = 0 (last frag)
           - TIME = 2012-11-30 20:44:35.0
           - EPOCH# = 0
           - EVENTID = mysql-bin.000002:0000000000001442;0
           - SOURCEID = qa.r1.continuent.com
           - SQL(0) = insert into test.v1 values (1, 'inserted by node #1') /* ___SERVICE___ = [cookbook] */
  115. Tungsten and global transaction ID: changing master connection
           trepctl offline
           trepctl online -seqno 105
  116. Tungsten and global transaction ID: crash-safe slave tables
           mysql -e "select * from tungsten_cookbook.trep_commit_seqno\G"
           *************************** 1. row ***************************
           task_id: 0
           seqno: 6
           fragno: 0
           last_frag: 1
           source_id: qa.r1.continuent.com
           epoch_number: 0
           eventid: mysql-bin.000002:0000000000001442;0
           applied_latency: 0
           update_timestamp: 2012-11-30 20:44:35
           shard_id: test
           extract_timestamp: 2012-11-30 20:44:35
  117. Tungsten and global transaction ID: crash-safe tables and parallel replication
           mysql -e "select seqno, source_id, shard_id, update_timestamp from tungsten_cookbook.trep_commit_seqno"
           +-------+----------------------+----------+---------------------+
           | seqno | source_id            | shard_id | update_timestamp    |
           +-------+----------------------+----------+---------------------+
           |     7 | qa.r1.continuent.com | db1      | 2012-11-30 20:54:14 |
           |     8 | qa.r1.continuent.com | db2      | 2012-11-30 20:54:14 |
           |     9 | qa.r1.continuent.com | db3      | 2012-11-30 20:54:14 |
           |    10 | qa.r1.continuent.com | db4      | 2012-11-30 20:54:14 |
           |    11 | qa.r1.continuent.com | db5      | 2012-11-30 20:54:14 |
           |    12 | qa.r1.continuent.com | db6      | 2012-11-30 20:54:14 |
           |    13 | qa.r1.continuent.com | db7      | 2012-11-30 20:54:14 |
           |    14 | qa.r1.continuent.com | db8      | 2012-11-30 20:54:14 |
           |    15 | qa.r1.continuent.com | db9      | 2012-11-30 20:54:14 |
           |    16 | qa.r1.continuent.com | db10     | 2012-11-30 20:54:14 |
           +-------+----------------------+----------+---------------------+
  118. MySQL 5.6 and global transaction ID: activation
           mysqld --log-slave-updates --gtid-mode=on --enforce-gtid-consistency
       • WARNING: before MySQL 5.6.10, it was --disable-gtid-unsafe-statements
  119. MySQL 5.6 and global transaction ID: seeing transactions
           #121203 11:15:49 server id 1 end_log_pos 344 CRC32 0x45b25c8f GTID [commit=yes]
           SET @@SESSION.GTID_NEXT= '7A77A490-3D3A-11E2-8CC9-7DCF9991097B:2'/*!*/;
           # at 344
           #121203 11:15:49 server id 1 end_log_pos 423 CRC32 0x873c8fac Query thread_id=3 exec_time=0 error_code=0
           SET TIMESTAMP=1354533349/*!*/;
           BEGIN
           /*!*/;
           # at 423
           #121203 11:15:49 server id 1 end_log_pos 522 CRC32 0xb4bf4372 Query thread_id=3 exec_time=0 error_code=0
           SET TIMESTAMP=1354533349/*!*/;
           insert into t1 values (1)
  120. MySQL 5.6 and global transaction ID: status
           show slave status\G
           *************************** 1. row ***************************
           Slave_IO_State: Waiting for master to send event
           Master_Host: 127.0.0.1
           Master_User: rsandbox
           Master_Port: 13233
           Connect_Retry: 60
           Master_Log_File: mysql-bin.000002
           Read_Master_Log_Pos: 1837
           Relay_Log_File: mysql_sandbox13234-relay-bin.000005
           Relay_Log_Pos: 2047
           Relay_Master_Log_File: mysql-bin.000002
           ...
           Retrieved_Gtid_Set: 46E13434-3B28-11E2-BF47-6C626DA07446:1-7
           Executed_Gtid_Set: 46E13434-3B28-11E2-BF47-6C626DA07446:1-7
  121. MySQL 5.6 and global transaction ID: changing master connection
           CHANGE MASTER TO master_log_file='mysql-bin-000003', master_log_pos=1234
           # No global transaction ID is used
  122. MySQL 5.6 and global transaction ID: crash-safe slave table
           select * from slave_relay_log_info\G
           ********************* 1. row ********************
           Number_of_lines: 7
           Relay_log_name: ./mysql_sandbox13234-relay-bin.000005
           Relay_log_pos: 2047
           Master_log_name: mysql-bin.000002
           Master_log_pos: 1837
           Sql_delay: 0
           Number_of_workers: 5
           Id: 1
           # NO Global transaction ID is used!
  123. MySQL 5.6 and global transaction ID: crash-safe slave table + parallel
           select * from mysql.slave_worker_info\G
           Id: 12
           Relay_log_name: ./mysql_sandbox13234-relay-bin.000007
           Relay_log_pos: 4299
           Master_log_name: mysql-bin.000002
           Master_log_pos: 7155
           Checkpoint_relay_log_name: ./mysql_sandbox13234-relay-bin.000007
           Checkpoint_relay_log_pos: 1786
           Checkpoint_master_log_name: mysql-bin.000002
           Checkpoint_master_log_pos: 4642
           Checkpoint_seqno: 9
           Checkpoint_group_size: 64
           Checkpoint_group_bitmap: ?
           # NO Global transaction ID is used!
  124. Filters
  125. Tungsten Replication Service
       [Diagram: a pipeline of three stages, each made of Extract, Filter, and Apply. Data flows from the master DBMS through the Transaction History Log and an in-memory queue to the slave DBMS.]
  126. Restrict replication to some schemas and tables
           ./tools/tungsten-installer --master-slave -a \
               ... \
               --svc-extractor-filters=replicate \
               "--property=replicator.filter.replicate.do=test,*.foo" \
               ... \
               --start-and-report
           # test="test.*" -> same drawback as binlog-do-db in MySQL
           # *.foo = table foo in any database
           # employees.dept_codes,employees.salaries => safest way
  127. Exclude some schemas and tables from replication
           ./tools/tungsten-installer --master-slave -a \
               ... \
               --svc-extractor-filters=replicate \
               "--property=replicator.filter.replicate.ignore=test,*.foo" \
               ... \
               --start-and-report
           # test="test.*" -> same drawback as binlog-ignore-db in MySQL
           # *.foo = table foo in any database
           # employees.dept_codes,employees.salaries => safest way
           # DO NOT MIX .do and .ignore!
           # (you can do it, but it may not do what you mean)
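The pattern lists in the two filter examples can be read as follows: a bare schema name stands for every table in that schema, and `*` acts as a wildcard. The sketch below is a simplified model of that reading, written with shell glob matching. It is an illustration only and not Tungsten's matching code; `matches_any` is a hypothetical name.

```shell
#!/bin/sh
# Simplified model of the pattern list used by the replicate filter:
# a bare schema name stands for "schema.*", and "*" is a wildcard.
# This is an illustration only, not Tungsten's actual matching code.
matches_any() {
    object="$1"      # schema.table
    patterns="$2"    # comma-separated list, e.g. 'test,*.foo'
    old_ifs=$IFS
    IFS=','
    set -f           # keep "*" in patterns from expanding to file names
    for pattern in $patterns; do
        case "$pattern" in
            *.*) ;;                       # already in schema.table form
            *)   pattern="$pattern.*" ;;  # bare schema name
        esac
        case "$object" in
            $pattern) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Reading the list as a ".do" list: a match means the object is replicated.
matches_any test.t1 'test,*.foo'      && echo "test.t1: replicated"
matches_any sales.foo 'test,*.foo'    && echo "sales.foo: replicated"
matches_any sales.orders 'test,*.foo' || echo "sales.orders: filtered out"
```

For an `.ignore` list the same test applies with the opposite outcome: a match means the object is dropped.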
  128. Change name of replicated schema
           -a --svc-applier-filters=dbtransform \
               --property=replicator.filter.dbtransform.from_regex1=stores \
               --property=replicator.filter.dbtransform.to_regex1=playground
           # from_regex1=stores -> name of the schema in the master
           # to_regex1=playground -> name of the schema in the slave
           # WARNING: requires "USE schema_name" to work properly.
  129. Multi-master: Conflict prevention
  130. CONFLICTS
  131. What's a conflict
       • Data modified by several sources (masters)
       • Creates one or more of:
         • data loss (unwanted delete)
         • data inconsistency (unwanted update)
         • duplicated data (unwanted insert)
         • replication break
  132. Data duplication
       [Diagram: masters alpha, bravo, and charlie share a table with columns id, name, amount and rows (1, Joe, 100), (2, Frank, 110), (3, Sue, 100). Two masters each insert a row with id 4, (4, Matt, 130) and (4, Matt, 140): BREAKS REPLICATION.]
  133. auto_increment offsets are not a remedy
       • A popular recipe
       • auto_increment_increment + auto_increment_offset
       • They don't prevent conflicts
       • They hide duplicates
  134. Hidden data duplication
       [Diagram: the same table, with alpha using offset 1, bravo offset 2, and charlie offset 3. Two INSERTs of the same record land under different ids, (13, Matt, 130) and (11, Matt, 140), so the duplicate goes unnoticed.]
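The ids in the diagram follow from how MySQL hands out auto-increment values: each server picks the smallest value above the current maximum in the series offset, offset + increment, offset + 2 × increment, and so on. The sketch below computes that value. The function name `next_id` is hypothetical, and the increment of 10 is an assumption, chosen because it reproduces the ids 11 and 13 shown in the diagram.

```shell
#!/bin/sh
# Next auto-increment value MySQL would hand out on a server, given the
# current highest id, auto_increment_increment and auto_increment_offset:
# the smallest value above the current maximum in the series
# offset, offset + increment, offset + 2 * increment, ...
next_id() {
    max="$1"; increment="$2"; offset="$3"
    gap=$(( max + 1 - offset ))
    if [ "$gap" -le 0 ]; then
        echo "$offset"
    else
        steps=$(( (gap + increment - 1) / increment ))   # ceiling division
        echo $(( offset + steps * increment ))
    fi
}

# The table in the diagram ends at id 3; an increment of 10 is assumed here.
echo "alpha   (offset 1) inserts Matt as id $(next_id 3 10 1)"
echo "charlie (offset 3) inserts Matt as id $(next_id 3 10 3)"
```

Both inserts succeed because the ids differ, which is exactly the problem: the same record now exists twice and no error points to it.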
  135. Data inconsistency
       Masters: alpha, bravo, charlie
       id  name   amount
       1   Joe    100
       2   Frank  110
       3   Sue    100
       UPDATE: (3, Sue, 105)
       UPDATE: (3, Sue, 108)
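The inconsistency comes from the order in which each master sees the two updates. The sketch below is a simplified Python model, assuming each master applies its own update first and the replicated one afterwards; `apply_updates` is a hypothetical helper.

```python
def apply_updates(row, updates):
    """Apply a sequence of (column, value) updates to a copy of the row."""
    row = dict(row)
    for column, value in updates:
        row[column] = value
    return row

original = {"id": 3, "name": "Sue", "amount": 100}
update_a = ("amount", 105)  # issued on one master
update_b = ("amount", 108)  # issued on another master at about the same time

# Each master applies its local update first, then the one replicated from its peer.
first_master = apply_updates(original, [update_a, update_b])
second_master = apply_updates(original, [update_b, update_a])
print(first_master["amount"], second_master["amount"])
```

Both masters report success, replication keeps running, and the two copies of record 3 now disagree.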
  136. Data loss
       Masters: alpha, bravo, charlie
       id  name   amount
       1   Joe    100
       2   Frank  110
       3   Sue    100
       UPDATE: (3, Sue, 108)
       DELETE: record #3
       MAY BREAK REPLICATION
  137. conflict handling strategies
       • resolving
         • after the fact
         • Needs information that is missing in async replication
       • avoiding
         • requires synchronous replication with 2pc
       • preventing
         • setting and enforcing a split sources policy
       • Transforming and resolving
         • all records are converted to INSERTs
         • conflicts are resolved within a given time window
       Labels attached to the strategies on the slide: "used by Tungsten", "planned for future use", "planned for future use"
  138. Multi-master: Conflict prevention
  139. Tungsten conflict prevention in a nutshell
       1. define the rules (which master can update which database)
       2. tell Tungsten the rules
       3. define the policy (error, drop, warn, or accept)
       4. Let Tungsten enforce your rules
  140. Tungsten Conflict prevention facts
       • Sharded by database
       • Defined dynamically
       • Applied on the slave services
       • methods:
         • error: make replication fail
         • drop: drop silently
         • warn: drop with warning
  141. Tungsten conflict prevention applicability
       • unknown shards
         • The schema being updated is not planned
         • actions: accept, drop, warn, error
       • unwanted shards
         • the schema is updated from the wrong master
         • actions: accept, drop, warn, error
       • whitelisted shards
         • can be updated by any master
  142. Conflict prevention directives
       --svc-extractor-filters=shardfilter
       replicator.filter.shardfilter.unknownShardPolicy=error
       replicator.filter.shardfilter.unwantedShardPolicy=error
       replicator.filter.shardfilter.enforceHomes=false
       replicator.filter.shardfilter.allowWhitelisted=false
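The rules and policies can be pictured as a small decision function evaluated on each slave service. The Python sketch below is a simplified model, not Tungsten's shardfilter code. `decide` and `SHARD_MAP` are hypothetical names, and the handling of a whitelisted shard when whitelisting is disabled is an assumption. Both flags default to true here only so that the rules have a visible effect.

```python
# Which master owns which database (shard); "whitelisted" means any master.
SHARD_MAP = {
    "employees": "alpha",
    "buildings": "bravo",
    "vehicles": "charlie",
    "test": "whitelisted",
}

def decide(shard, service, shard_map, unknown_policy="error",
           unwanted_policy="error", enforce_homes=True, allow_whitelisted=True):
    """Return the action for an event of `shard` arriving through slave `service`."""
    master = shard_map.get(shard)
    if master is None:
        return unknown_policy   # unknown shard: the schema was not planned
    if master == "whitelisted":
        # Assumption: a disabled whitelist falls back to the unknown-shard policy.
        return "accept" if allow_whitelisted else unknown_policy
    if enforce_homes and master != service:
        return unwanted_policy  # unwanted shard: updated from the wrong master
    return "accept"

print(decide("employees", "alpha", SHARD_MAP))  # rightful owner
print(decide("employees", "bravo", SHARD_MAP))  # wrong master
print(decide("test", "bravo", SHARD_MAP))       # whitelisted shard
```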
  143. conflict prevention in a star topology
       Host1: master alpha, database employees
       Host2: master bravo, database buildings
       Host3: master charlie (hub), database vehicles
       alpha updates employees: ✔ ✔
  144. conflict prevention in a star topology
       Host1: master alpha, database employees
       Host2: master bravo, database buildings
       Host3: master charlie (hub), database vehicles
       alpha updates vehicles: ✗
  145. conflict prevention in an all-masters topology
       Host1: master alpha, database employees
       Host2: master bravo, database buildings
       Host3: master charlie, database vehicles
       alpha updates employees: ✔ ✔
  146. conflict prevention in an all-masters topology
       Host1: master alpha, database employees
       Host2: master bravo, database buildings
       Host3: master charlie, database vehicles
       charlie updates vehicles: ✔ ✔
  147. conflict prevention in an all-masters topology
       Host1: master alpha, database employees
       Host2: master bravo, database buildings
       Host3: master charlie, database vehicles
       bravo updates employees: ✗ ✗
  148. conflict prevention in an all-masters topology
       Host1: master alpha, database employees
       Host2: master bravo, database buildings
       Host3: master charlie, database vehicles
       charlie updates employees: ✗ ✗
  149. setting conflict prevention rules
       trepctl -host host1 -service charlie shard -insert < shards.map
       cat shards.map
       shard_id   master       critical
       employees  alpha        false
       buildings  bravo        false
       vehicles   charlie      false
       test       whitelisted  false
       # charlie is slave service in host 1
  150. setting conflict prevention rules
       trepctl -host host2 -service charlie shard -insert < shards.map
       cat shards.map
       shard_id   master       critical
       employees  alpha        false
       buildings  bravo        false
       vehicles   charlie      false
       test       whitelisted  false
       # charlie is slave service in host 2
  151. setting conflict prevention rules
       trepctl -host host3 -service alpha shard -insert < shards.map
       trepctl -host host3 -service bravo shard -insert < shards.map
       cat shards.map
       shard_id   master       critical
       employees  alpha        false
       buildings  bravo        false
       vehicles   charlie      false
       test       whitelisted  false
       # alpha and bravo are slave services in host 3
  152. Conflict prevention demo
       • reminder
         • Server #1 can update "employees"
         • Server #2 can update "buildings"
         • Server #3 can update "vehicles"
  153. Sample correct operation (1)
       mysql #1> create table employees.names( ... )
       # all servers receive the table
       # all servers keep working well
  154. Sample correct operation (2)
       mysql #2> create table buildings.homes( ... )
       # all servers receive the table
       # all servers keep working well
  155. Sample incorrect operation (1)
       mysql #2> create table employees.nicknames( ... )
       # Only server #2 receives the table
       # slave service in hub gets an error
       # slave service in #1 does not receive anything
  156. sample incorrect operation (2)
       #3 $ trepctl services | simple_services
       alpha [slave]
       seqno: 7 - latency: 0.136 - ONLINE
       bravo [slave]
       seqno: -1 - latency: -1.000 - OFFLINE:ERROR
       charlie [master]
       seqno: 66 - latency: 0.440 - ONLINE
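Output in this shape is easy to scan with a script when a host runs many services. The Python sketch below parses the summary shown on this slide and lists the services that are not ONLINE. It assumes the two-line-per-service layout exactly as it appears here. `parse_services` and `failed` are hypothetical helpers, not part of the Tungsten tools.

```python
import re

SERVICE_RE = re.compile(r"^(\S+)\s+\[(\w+)\]$")
STATE_RE = re.compile(r"^seqno:\s*(-?\d+)\s+-\s+latency:\s*(-?[\d.]+)\s+-\s+(\S+)$")

SAMPLE = """\
alpha [slave]
seqno: 7 - latency: 0.136 - ONLINE
bravo [slave]
seqno: -1 - latency: -1.000 - OFFLINE:ERROR
charlie [master]
seqno: 66 - latency: 0.440 - ONLINE
"""

def parse_services(text):
    """Turn the summary into {service: {role, seqno, latency, state}}."""
    services, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        header = SERVICE_RE.match(line)
        if header:
            current = header.group(1)
            services[current] = {"role": header.group(2)}
            continue
        state = STATE_RE.match(line)
        if state and current:
            services[current].update(
                seqno=int(state.group(1)),
                latency=float(state.group(2)),
                state=state.group(3),
            )
    return services

def failed(services):
    """Names of the services whose state is anything other than ONLINE."""
    return [name for name, info in services.items() if info.get("state") != "ONLINE"]

print(failed(parse_services(SAMPLE)))
```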
  157. sample incorrect operation (3)
       #3 $ trepctl -service bravo status
       NAME                     VALUE
       ----                     -----
       appliedLastEventId     : NONE
       appliedLastSeqno       : -1
       appliedLatency         : -1.0
       (...)
       offlineRequests        : NONE
       pendingError           : Stage task failed: q-to-dbms
       pendingErrorCode       : NONE
       pendingErrorEventId    : mysql-bin.000002:0000000000001241;0
       pendingErrorSeqno      : 7
       pendingExceptionMessage: Rejected event from wrong shard: seqno=7 shard ID=employees shard master=alpha service=bravo
       (...)
  158. Fixing the issue
       mysql #1> drop table if exists employees.nicknames;
       mysql #1> create table if not exists employees.nicknames ( ... ) ;
       #3 $ trepctl -service bravo online -skip-seqno 7
       # all servers receive the new table
  159. Sample whitelisted operation
       mysql #2> create table test.hope4best( ... )
       mysql #1> insert into test.hope4best values ( ... )
       # REMEMBER: test was explicitly whitelisted
       # All servers get the new table and records
       # But there is no protection against conflicts
  160. administration
  161. Viewing THL Events
       thl info
       log directory = /home/tungsten/installs/master_slave/thl/dragon/
       min seq# = 0
       max seq# = 101
       events = 101
  162. viewing THL events
       thl index
       LogIndexEntry thl.data.0000000001(0:102)
  163. viewing THL events
       thl index
       [...]
       LogIndexEntry thl.data.0000000001(0:18)
       LogIndexEntry thl.data.0000000002(19:33)
       LogIndexEntry thl.data.0000000003(34:35)
       LogIndexEntry thl.data.0000000004(36:3641)
       LogIndexEntry thl.data.0000000005(3642:3712)
       LogIndexEntry thl.data.0000000006(3713:3838)
       LogIndexEntry thl.data.0000000007(3839:3949)
       LogIndexEntry thl.data.0000000008(3950:4011)
       LogIndexEntry thl.data.0000000009(4012:4039)
       LogIndexEntry thl.data.0000000010(4040:4057)
       LogIndexEntry thl.data.0000000011(4058:4067)
       LogIndexEntry thl.data.0000000012(4068:4073)
       LogIndexEntry thl.data.0000000013(4074:4085)
       LogIndexEntry thl.data.0000000014(4086:4095)
       LogIndexEntry thl.data.0000000015(4096:4101)
       LogIndexEntry thl.data.0000000016(4102:4111)
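Each index entry names a THL file and the first and last sequence number it holds, so locating the file for a given seqno is a range lookup. The Python sketch below works on a few of the entries listed on this slide. `file_for_seqno` is a hypothetical helper, and the parsing assumes the `name(first:last)` layout shown above.

```python
import re

ENTRY_RE = re.compile(r"LogIndexEntry\s+(\S+)\((\d+):(\d+)\)")

INDEX_OUTPUT = """\
LogIndexEntry thl.data.0000000001(0:18)
LogIndexEntry thl.data.0000000002(19:33)
LogIndexEntry thl.data.0000000003(34:35)
LogIndexEntry thl.data.0000000004(36:3641)
LogIndexEntry thl.data.0000000005(3642:3712)
"""

def file_for_seqno(index_output, seqno):
    """Return the THL file whose seqno range contains `seqno`, or None."""
    for name, first, last in ENTRY_RE.findall(index_output):
        if int(first) <= seqno <= int(last):
            return name
    return None

print(file_for_seqno(INDEX_OUTPUT, 3700))
print(file_for_seqno(INDEX_OUTPUT, 5000))
```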
  164. viewing THL events
       thl list -seqno 102
       [...]
       SEQ# = 102 / FRAG# = 0 (last frag)
       - TIME = 2012-02-06 05:56:09.0
       - EPOCH# = 0
       - EVENTID = mysql-bin.000002:0000000000018903;0
       - SOURCEID = qa.r1.continuent.com
       - METADATA = [mysql_server_id=10;is_metadata=true;service=dragon;shard=tungsten_dragon;heartbeat=NONE]
       - TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
       - OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1, foreign_key_checks = 1, unique_checks = 1, sql_mode = IGNORE_SPACE, character_set_client = 8, collation_connection = 8, collation_server = 8]
       - SCHEMA = tungsten_dragon
       - SQL(0) = UPDATE tungsten_dragon.heartbeat SET source_tstamp="2012-02-06 05:56:09", salt= 2, name= "NONE" WHERE id= 1 /*___SERVICE___ = [dragon] */
  165. Skipping a THL Event
       trepctl online -skip-seqno 1092
       trepctl online -skip-seqno 1092,1093,1094
       # see example
  166. Adding a Member
       • Let's see the cookbook, and use it
  167. parallel replication
  168. Replicator Pipeline Architecture
       [Diagram of the pipeline inside the Tungsten Replicator process. Labels: MySQL binlog, Transaction History Log (THL), shard.list file, assign shard ID, parallel queue, "channels", extract/apply stages, slave DBMS]
  169. Parallel replication facts
       ✓ Sharded by database
       ✓ Good choice for slave lag problems
       ❖ Bad choice for single database projects
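Because work is sharded by database, every event for a given database goes to the same channel. The Python sketch below illustrates that property with a simple hash. The choice of `crc32` and the helper `channel_for` are assumptions made for illustration, and Tungsten's actual shard-to-channel assignment may differ.

```python
import zlib

def channel_for(shard_id: str, channels: int) -> int:
    """Map a shard (database name) to one of `channels` apply channels."""
    return zlib.crc32(shard_id.encode("utf-8")) % channels

CHANNELS = 10

# A single-database workload: every event lands in the same channel.
single = {channel_for("only_db", CHANNELS) for _ in range(1000)}
print(len(single))

# A multi-database workload can use several channels at once.
many = {channel_for(f"db{i}", CHANNELS) for i in range(1, 31)}
print(len(many))
```

With only one database there is only one busy channel, which is why parallel replication does not help single database projects.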
  170. Parallel Replication test
       [Diagram: binary logs feeding a MySQL slave (STOPPED) and a Tungsten slave (OFFLINE); replicator alpha, direct:alpha (slave)]
       Concurrent sysbench on 30 databases running for 1 hour
       TOTAL DATA: 130 GB
       RAM per server: 20 GB
       Slaves will have 1 hour lag
  171. measuring results
       [Diagram: binary logs feeding a MySQL slave (START) and a Tungsten slave (ONLINE); replicator alpha, direct:alpha (slave)]
       Recording catch-up time
  172. MySQL native replication
       slave catch up in 04:29:30
  173. Tungsten parallel replication
       slave catch up in 00:55:40
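The two catch-up times (04:29:30 for MySQL native replication, 00:55:40 for Tungsten parallel replication) can be turned into a speedup figure with a few lines of Python. The helper `to_seconds` is illustrative, and the result applies only to this particular test.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

native = to_seconds("04:29:30")    # MySQL native replication
parallel = to_seconds("00:55:40")  # Tungsten parallel replication
speedup = round(native / parallel, 2)
print(native, parallel, speedup)
```

In this test the Tungsten slave caught up a little under five times faster than the native slave.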
  174. Parallel replication made simpler
       FROM HERE ....
  175. Parallel replication made simpler
       TO HERE
  176. Parallel replication made simpler
  177. parallel replication direct slave facts
       ✓ No need to install Tungsten on the master
       ✓ Tungsten runs only on the slave
       ✓ Replication can revert to native slave with two commands (trepctl offline; start slave)
       ✓ Native replication can continue on other slaves
       ❖ Failover (either native or Tungsten) becomes a manual task
  178. installing parallel replication
       • MORE_OPTIONS=--channels=10
       • ./cookbook/install_master_slave
  179. Checking parallel replication
       trepctl status
       trepctl status -name tasks
       trepctl status -name shards
       trepctl status -name stores
  180. Parallel replication demo
  181. Troubleshooting
  182. Identify the Failed Component
       • Steps
         1. trepctl services
         2. trepctl -service SVC_NAME status
         3. look at the logs
         4. Take action
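The first two steps lend themselves to a small helper: once the state of each service is known, it lists the status command to run for every service that is not ONLINE. This Python sketch only builds the command strings and runs nothing. `triage_commands` is a hypothetical name, and the input dictionary is assumed to come from reading the `trepctl services` output.

```python
def triage_commands(service_states):
    """List the commands for steps 1 and 2, given {service_name: state}."""
    commands = ["trepctl services"]
    for name, state in service_states.items():
        if state != "ONLINE":
            # Step 2: inspect only the services that are in trouble.
            commands.append(f"trepctl -service {name} status")
    return commands

states = {"alpha": "ONLINE", "bravo": "OFFLINE:ERROR", "charlie": "ONLINE"}
for command in triage_commands(states):
    print(command)
```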
  183. reading the logs
       ls $TUNGSTEN_BASE/tungsten/tungsten-replicator/logs/
       trepsvc.log user.log
       ... or
       ./cookbook/show_log
       # let's see it in practice
  184. Parting thoughts
  185. Open source Tungsten Replicator now includes Oracle-to-MySQL and Oracle-to-Oracle extractors and appliers!
  186. Continuent Website: http://www.continuent.com
       Tungsten Replicator 2.0: http://code.google.com/p/tungsten-replicator
       Our Blogs:
       http://scale-out-blog.blogspot.com
       http://datacharmer.blogspot.com
       http://flyingclusters.blogspot.com
       560 S. Winchester Blvd., Suite 500
       San Jose, CA 95128
       Tel +1 (866) 998-3642
       Fax +1 (408) 668-1009
       e-mail: sales@continuent.com
