9 DevOps Tips for Going in Production with Galera Cluster for MySQL - Slides

Galera is a MySQL replication technology that can simplify the design of a high availability application stack. With a true multi-master MySQL setup, an application can now read and write from any database instance without worrying about master/slave roles, data integrity, slave lag or other drawbacks of asynchronous replication.

And that all sounds great until it’s time to go into production. Throw in a live migration from an existing database setup and devops life just got a bit more interesting ...

So if you are in devops, then this webinar is for you!

Operations is not so much about specific technologies, but about the techniques and tools you use to deploy and manage them. Monitoring, managing schema changes and pushing them in production, performance optimizations, configurations, version upgrades, backups; these are all aspects to consider – preferably before going live.

Let us guide you through 9 key tips to consider before taking Galera Cluster into production.

  1. Nine DevOps Tips For Going Into Production With Galera Cluster. Confidential. November 11, 2014. Johan Andersson & Jean-Jérôme Schmidt, Severalnines (johan@severalnines.com & jj@severalnines.com)
  2. Logistics
     - I'm Jean-Jérôme and I'll be your host for today's webinar
     - Feel free to ask any questions in the Questions section of this application or via the chat box
     - You can also contact me directly via the chat box or via email (jj@severalnines.com) during or after the webinar
     - Copyright Severalnines AB
  3. ClusterControl in a nutshell: Manage, Scale, Deploy, Monitor
  4. Supported Databases
     - SQL
       - MariaDB Cluster
       - MySQL Galera Cluster (Codership)
       - Percona XtraDB Cluster
       - MySQL Cluster (NDB)
       - MySQL Replication 5.6
       - Standalone MySQL/MariaDB
     - NoSQL
       - MongoDB Sharded Cluster
       - MongoDB Replica Set
       - TokuMX Cluster
  5. Customers
  6. Agenda
     - 101 Sanity Check
     - Operating System
     - Backup Strategies
     - Galera Recovery
     - Query Performance
     - Schema Changes
     - Security / Encryption
     - Reporting
     - Protecting from Disasters
  7. #1 – 101 Sanity Check (1/4)
     - Ensure ALL tables are InnoDB or XtraDB
       - InnoDB supports FULLTEXT indexes in MySQL 5.6
     - Ensure ALL tables have a PRIMARY KEY
       - If no PRIMARY KEY is defined you can do:
         ALTER TABLE table ADD COLUMN pkid BIGINT AUTO_INCREMENT PRIMARY KEY;
     - Ensure you have NO unbounded queries
       - E.g. UPDATE table SET x=x+1 (when there are many rows)
       - Update/delete in smaller batches (e.g. 1000 records)
       - Better support for huge (unbounded) queries is in the pipeline
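The batching advice above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the helper names, the table name `t1` and the key column `pkid` are assumptions, and the sketch assumes an integer primary key and that each generated statement is committed on its own (for example with autocommit enabled).

```python
def batch_ranges(min_id, max_id, batch_size=1000):
    """Yield inclusive (low, high) primary-key ranges covering min_id..max_id."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batched_update_statements(table, set_clause, pk, min_id, max_id, batch_size=1000):
    """Build one UPDATE per key range instead of a single unbounded UPDATE."""
    return [
        f"UPDATE {table} SET {set_clause} WHERE {pk} BETWEEN {low} AND {high};"
        for low, high in batch_ranges(min_id, max_id, batch_size)
    ]


# Hypothetical usage: 2500 rows are touched in three small write sets.
for statement in batched_update_statements("t1", "x = x + 1", "pkid", 1, 2500):
    print(statement)
```

Ranging over the primary key keeps every write set small and lets each batch use the index, which is also why the PRIMARY KEY requirement above matters.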
  8. #1 – 101 Sanity Check (2/4)
     - Ensure that the application can tolerate non-sequential auto increments.
     - Redirect deadlock-prone update queries on hot tables and rows to one of the Galera nodes:
       - E.g. UPDATE counter_tbl SET counter = counter + 1;
       - http://www.severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads
     - Use wsrep_sst_method=xtrabackup-v2
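To see why auto-increment values are non-sequential, here is a simplified model in Python. It assumes Galera's automatic auto-increment control is on, so that each node uses the cluster size as `auto_increment_increment` and its own node index as `auto_increment_offset`, and it assumes every insert is replicated before the next one arrives. The function names are illustrative only.

```python
def next_id(current_max, offset, increment):
    """Smallest value above current_max of the form offset + N * increment."""
    candidate = current_max + 1
    remainder = (candidate - offset) % increment
    if remainder:
        candidate += increment - remainder
    return candidate


def simulate_inserts(insert_order, cluster_size):
    """Return the ids assigned when inserts hit the given nodes (1-based) in order."""
    ids = []
    current_max = 0
    for node in insert_order:
        current_max = next_id(current_max, offset=node, increment=cluster_size)
        ids.append(current_max)
    return ids


# Two inserts on node 1, then one each on nodes 2 and 3, then node 1 again.
print(simulate_inserts([1, 1, 2, 3, 1], cluster_size=3))
```

In this model the ids 2 and 3 are never handed out, so an application that expects gap-free, consecutive ids will break.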
  9. #1 – 101 Sanity Check (3/4)
     - WAN environment?
       - [Note] WSREP: (0b066c90-d4fb-11e1-0800-96b2cf43aaf6, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.0.1.2:4567
       - [Note] WSREP: (0b066c90-d4fb-11e1-0800-96b2cf43aaf6, 'tcp://0.0.0.0:4567') reconnecting to df4e387f-d4e2-11e1-0800-2e6080299165 (tcp://10.0.1.2:4567), attempt 0
     - Increase timeouts
       - wsrep_provider_options='evs.keepalive_period=PT3S; evs.inactive_check_period=PT10S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; evs.install_timeout=PT1M; evs.send_window=1024; evs.user_send_window=512'
       - This relaxes how quickly a node is evicted from the cluster.
       - Default values are usually fine on networks with a ping time below 10-15 ms.
  10. #1 – 101 Sanity Check (4/4)
      - There is no reason to use any 5.5-based MySQL variants
      - Use MySQL 5.6 / MariaDB 10.x
        - Lots of bug fixes
        - Performance optimizations
          - Write-intensive workloads
          - Query optimizer enhancements
        - A good foundation for later upgrading to 5.7-based MySQL
      - Use the Galera 3.x series
  11. #2 – Operating System (1/2)
      - Swapping
        - echo 1 > /proc/sys/vm/swappiness
      - NUMA on multi-socket systems
        - Can lead to contention and strange lock-ups
        - Is it enabled? dmesg | grep -i numa
        - Grub boot option numa=off
        - … and other possibilities
      - Filesystem
        - Reduce writes by mounting with noatime
  12. #2 – Operating System (2/2)
      - In virtualized environments it is easy to over-commit resources on a single host.
      - Keep track of the host running the VMs
        - Is it heavily loaded?
        - CPU steal (check on the VMs)?
        - Is it swapping?
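One way to check CPU steal from inside a VM is to compare two samples of the aggregate `cpu` line in `/proc/stat` on Linux. The sketch below is an illustration with hypothetical function names; it relies on the standard field order of that line (user, nice, system, idle, iowait, irq, softirq, steal, then guest counters) and takes the two sampled lines as strings so it can be tried without a live system.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    counters = [int(value) for value in fields[1:]]
    if len(counters) < 8:
        raise ValueError("kernel does not report a steal counter")
    return counters


def steal_percent(before, after):
    """Percentage of CPU time stolen by the hypervisor between two samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Guest time (fields 9 and 10) is already counted in user/nice, so only
    # the first eight counters are summed for the total.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Read the current aggregate 'cpu' line (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

Taking one `read_cpu_line()` sample, sleeping a few seconds, taking a second sample and passing both to `steal_percent()` gives the figure for that interval. A value that stays high suggests the host is over-committed.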
  13. #3 – Backup
      - Percona XtraBackup
        - Online consistent backup
        - Full and incremental backups
        - Possible to back up individual databases and tables when innodb_file_per_table is used
        - Parallelism, compression and encryption
      - mysqldump
        - Use with --single-transaction for a consistent, online backup of InnoDB tables
        - May require tweaking of innodb_old_blocks_time and innodb_old_blocks_pct (default values in 5.6 are quite good)
      - S3 / Glacier or Swift can be used for offline/offsite storage
  14. #4 – Galera Recovery (IST) (1/3)
      - IST (Incremental State Transfer) is faster than SST (Snapshot State Transfer).
      - Each Galera node has a cache, the gcache.
        - Stores committed write sets
        - Circular buffer
      - If a node is down (crash, maintenance window) and then becomes a JOINER:
        - It sends the ID of its last applied write set to the DONOR
        - The DONOR checks if it can send the subsequent events from the gcache
          - Yes == IST (fast)
          - No == SST (slow). E.g. 3 TB of data is no fun to SST.
  15. #4 – Galera Recovery (IST) (2/3)
      - Dimension the gcache, for example to handle a maintenance window of 6 hours:
        - Writes to cluster per second: 1 MB/s
        - Maintenance window (seconds) = 6 hours * 60 * 60 = 21600 s
        - gcache size = 1 MB/s x 21600 s = 21600 MB ≈ 21 GB
        - Multiply the value by 1.5x or 2x to have a margin, e.g. 2x:
          - gcache.size=42G
          - wsrep_provider_options='gcache.size=42G;…'
  16. #4 – Galera Recovery (IST) (3/3)
      - How much do you write to the Galera Cluster?
        - Look at the sum of wsrep_replicated_bytes and wsrep_received_bytes and get the rate between two points in time.
        - In this example a node handles (109575 + 33758) bytes/second ≈ 140 KB/s
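The sizing arithmetic above can be sketched as follows. The function names are illustrative, and the two samples are assumed to be dictionaries holding the `wsrep_replicated_bytes` and `wsrep_received_bytes` status counters read at two points in time.

```python
import math

COUNTERS = ("wsrep_replicated_bytes", "wsrep_received_bytes")


def write_rate_bytes_per_sec(sample_a, sample_b, seconds_between):
    """Average bytes/second written between two status samples."""
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    delta = sum(sample_b[name] - sample_a[name] for name in COUNTERS)
    return delta / seconds_between


def gcache_size_bytes(rate_bytes_per_sec, window_seconds, margin=2.0):
    """gcache size needed to cover an outage of window_seconds, with a margin."""
    return math.ceil(rate_bytes_per_sec * window_seconds * margin)


def to_gib(size_bytes):
    """Convert bytes to GiB."""
    return size_bytes / 1024 ** 3


# Slide example: 1 MB/s of writes, 6-hour maintenance window, 2x margin.
size = gcache_size_bytes(1024 ** 2, 6 * 60 * 60, margin=2.0)
print(f"gcache.size of roughly {to_gib(size):.1f} GiB")
```

With the slide's numbers this comes out at about 42 GiB, in line with gcache.size=42G. Sampling the counters during peak write load rather than at a quiet time gives a safer estimate.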
  17. #4 – Galera Recovery (SST)
      - Two pitfalls to avoid with SSTs:
      - wsrep_sst_method=rsync
        - You may have to change the timeout in /usr/bin/wsrep_sst_rsync from timeout = 300 to 3600 (or bigger)
        - Otherwise the rsync daemons may time out when initializing the SST.
      - wsrep_sst_method=xtrabackup[-v2]
        - Uses the MySQL tmpdir by default.
        - If tmpdir is too small, the SST may fail on the donor because the transaction log simply does not fit.
        - You can set in my.cnf:
          [xtrabackup]
          tmpdir=/a/bigger/partition
        - How big does tmpdir need to be? [KB/s written to the node] x [backup time], similar to the gcache.
  18. #5 – Query Performance (1/5)
      - A number of things to watch out for:
        - Badly written queries or missing indexes
        - DML locking many records (BEGIN; SELECT * FROM t1 FOR UPDATE; …)
        - DML updating/deleting many records in one chunk
          - wsrep_max_ws_rows / wsrep_max_ws_size set upper limits
          - Update/delete in "small" batches of 1000-10000 records. Do not update 100000 records at once.
        - Deadlocks and deadlock-prone code
          - E.g. running two mysqldumps at the same time
          - Updating the very same record in a very hot table from multiple threads on multiple hosts
      - Use your favorite tool to detect the problems
  19. #5 – Query Performance (2/5)
      - When performance grinds to a halt, you want to know!
  20. #5 – Query Performance (3/5)
      - You may want to have an alarm notification (in 1.2.9)
  21. #5 – Query Performance (4/5)
      - If a deadlock happens, you want to tell your developers (in 1.2.9)
  22. #5 – Query Performance (5/5)
      - And see if Galera is clogged up
  23. #6 – Schema Changes (1/4)
      - Consider an upgrade from schema version V1 to version V2
      - There are two principal types of schema changes that can be performed:
        - Compatible
          - E.g. ALTER TABLE .. ADD COLUMN …, CREATE INDEX ..
          - Application(s) written for V1 will continue to work
          - Upgrade schema first, then applications
        - Incompatible
          - E.g. ALTER TABLE .. DROP COLUMN …
          - Application(s) cannot use V2
          - Must upgrade applications first to support V2, then upgrade schema to V2
  24. #6 – Schema Changes (2/4)
      - Galera supports multiple ways of upgrading a schema from V1 to V2
        - Total Order Isolation (TOI)
          - wsrep_osu_method=TOI
        - Rolling Schema Upgrade (RSU)
          - wsrep_osu_method=RSU
        - Desynching nodes (not covered here), but check out http://www.severalnines.com/blog/webinar-replay-slides-galera-cluster-best-practices-zero-downtime-schema-changes
  25. #6 – Schema Changes (3/4)
      - Total Order Isolation (TOI)
        - Default method
        - Executed in the same order with respect to other transactions on all Galera nodes.
        - Cluster behaves like a single MySQL server
        - OK for non-copying ALTER TABLE, for tiny, seldom-used tables (100s of records), or if application traffic is disabled.
          - ALTER TABLE … ADD INDEX .. / CREATE INDEX ..
        - Not OK for copying ALTERs, since the table is LOCKED.
          - May wreak havoc
  26. #6 – Schema Changes (4/4)
      - Rolling Schema Upgrade (RSU)
        - DDL is not replicated; it is executed on one node at a time:
          node1> SET GLOBAL wsrep_OSU_method=RSU;
          node1> ALTER TABLE ….
          // check that node1 is SYNCED
          node2> SET GLOBAL wsrep_OSU_method=RSU;
          node2> ALTER TABLE ….
        - The change MUST be a compatible schema change.
          - E.g. ALTER TABLE .. DROP COLUMN will wreak havoc.
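The node-by-node RSU procedure can be written down as an ordered plan. The following Python sketch only builds the list of steps and does not connect to any server. The function name is hypothetical, and switching back to TOI after the ALTER is an assumption added here (the slide shows only the switch to RSU). The sync check uses the `wsrep_local_state_comment` status variable.

```python
def rsu_plan(nodes, ddl):
    """Return ordered (node, statement) steps for a Rolling Schema Upgrade."""
    steps = []
    for node in nodes:
        # Stop replicating DDL from this node, then apply the change locally.
        steps.append((node, "SET GLOBAL wsrep_OSU_method='RSU';"))
        steps.append((node, ddl))
        # Assumption: restore the default method before moving on.
        steps.append((node, "SET GLOBAL wsrep_OSU_method='TOI';"))
        # Wait until this reports 'Synced' before touching the next node.
        steps.append((node, "SHOW STATUS LIKE 'wsrep_local_state_comment';"))
    return steps


# Hypothetical usage with a compatible change on a three-node cluster.
for node, statement in rsu_plan(["node1", "node2", "node3"],
                                "ALTER TABLE t1 ADD COLUMN c INT;"):
    print(f"{node}> {statement}")
```

Keeping the plan as data makes it easy to review before running it, and to stop the rollout if a node does not return to the Synced state.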
  27. #7 – Security / Encryption (1/2)
      - Encrypt replication links between Galera nodes, especially in public cloud environments / WAN setups.
        - Create a certificate and a key: http://www.severalnines.com/blog/performance-impact-encrypted-replication-galera-cluster-mysql
        - wsrep_provider_options='socket.ssl_cert=galera_rep.crt; socket.ssl_key=galera_rep.key;<rest of wsrep provider options>'
        - Requires a complete stop of all nodes.
        - Don't forget to set this option for garbd (the arbitrator)!
  28. #7 – Security / Encryption (2/2)
      - Nothing to do with Galera but…
      - Encrypt application links between Galera nodes and applications, especially in public/untrusted environments:
        - Dense howto: http://www.vmadmin.co.uk/linux/44-redhat/145-linuxmysqlencryption
  29. #8 – Reporting
      - Try to separate OLTP and OLAP if possible
        - Run reports off an async slave or dedicated node
        - Remember: huge queries eat CPU, RAM and disk.
        - Galera is not faster than its slowest node.
      - Watch out for reports with side effects
        - Large updates writing back?
      - Consider using Amazon Redshift (Data Warehouse)
        - Upload CSV files for processing.
        - http://www.severalnines.com/blog/data-warehouse-cloud-how-upload-mysql-data-amazon-redshift-reporting-and-analytics
  30. #9 – Protecting from Disasters (1/5)
      - Eventually a disaster will happen
        - Software bugs
        - Network / router upgrades
        - AZ down
        - Schema / software / hardware upgrade going wrong
        - Too many connections
  31. #9 – Protecting from Disasters (1/5)
      - One way of protecting against cluster failures is to use an asynchronous slave replicating from the Galera Cluster.
      - If the cluster fails, the asynchronous slave can take over and handle the application workload until the cluster error has been resolved.
      - Diagram: Galera Cluster in DC1 replicating to an async slave in DC2 (READ ONLY!)
  32. #9 – Protecting from Disasters (2/5)
      - Asynchronously replicated slave, pros and cons:
        - Pro: decoupled from the cluster
        - Con: data loss is possible (minimize with semi-sync replication, but write performance will suffer)
      - Set up one or more Galera nodes as Replication Masters (RM) and connect the Replication Slave (RS)
      - Diagram: Galera Cluster in DC1 with RM1 and RM2; asynchronous replication (semi-sync is optional) to slave RS1 in DC2 (READ ONLY!)
  33. #9 – Protecting from Disasters (3/5)
      - Using GTIDs (available in MySQL 5.6 and MariaDB 10.0) allows for easy fail-over from RM2 to RM1:
        slave> CHANGE MASTER TO MASTER_HOST='RM1', MASTER_AUTO_POSITION=1; START SLAVE;
      - Diagram: Galera Cluster in DC1 with RM1 and RM2; asynchronous replication (semi-sync is optional) to slave RS1 in DC2
  34. #9 – Protecting from Disasters (4/5)
      - Prepare each Galera Replication Master (MySQL 5.6 style):
        gtid_mode=ON
        enforce_gtid_consistency=1
        log_bin=binlog
        log_slave_updates=1
        expire_logs_days=7
        #loose_rpl_semi_sync_master_enabled=1
        server_id=X # must be a unique number
      - Prepare each Slave (MySQL 5.6 style):
        gtid_mode=ON
        enforce_gtid_consistency=1
        log_bin=binlog
        relay_log=relay-bin
        log_slave_updates=1
        expire_logs_days=7
        #loose_rpl_semi_sync_slave_enabled=1
        server_id=Y # must be a unique number
  35. #9 – Protecting from Disasters (5/5)
      - Enabling the binlog (log_slave_updates and log_bin) on the Galera nodes allows you to do Point In Time Recovery
        - Restore the last backup (must be newer than expire_logs_days)
        - Use the binary logs to roll forward until the last transaction.
  36. #9 – Protecting from Disasters (1/5)
      - A common problem is overload situations, which can originate from:
        - DDoS
        - The website loads slowly and users reload, creating more and more connections
        - Eventually the MySQL server runs out of connections (max_connections)
        - 5.6 only scales to 40 cores anyway
      - A good way of alleviating this is to use a proxy, e.g. HAProxy.
  37. #9 – Protecting from Disasters (3/5)
      - Diagram: HAProxy in front of the Galera Cluster queues incoming connections and limits the number of backend connections
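As a rough illustration of limiting backend connections, the Python sketch below renders an HAProxy `listen` section in which each Galera node gets a per-server `maxconn`, so excess connections wait in HAProxy's queue instead of exhausting `max_connections` on MySQL. The section name, addresses, limits and the helper function itself are assumptions, not taken from the slides.

```python
def haproxy_listen_section(name, bind, servers, backend_maxconn, queue_timeout="30s"):
    """Render an HAProxy 'listen' section that caps connections per backend server."""
    lines = [
        f"listen {name}",
        f"    bind {bind}",
        "    mode tcp",
        "    balance leastconn",
        # How long a connection may wait in the queue for a free backend slot.
        f"    timeout queue {queue_timeout}",
    ]
    for server_name, address in servers:
        # Connections beyond maxconn are queued by HAProxy, not sent to MySQL.
        lines.append(
            f"    server {server_name} {address} check maxconn {backend_maxconn}"
        )
    return "\n".join(lines)


# Hypothetical three-node cluster with 64 connections allowed per node.
print(haproxy_listen_section(
    "galera", "*:3307",
    [("db1", "10.0.0.1:3306"), ("db2", "10.0.0.2:3306"), ("db3", "10.0.0.3:3306")],
    backend_maxconn=64,
))
```

Choosing a per-server limit comfortably below each node's max_connections leaves room for administrative and replication connections.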
  39. Thank You!
      - Cluster Configurator: www.severalnines.com/config
      - ClusterControl: www.severalnines.com/clustercontrol
      - Severalnines Blog: www.severalnines.com/blog
      - Contact: jj@severalnines.com
