Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Robust ha solutions with proxysql

240 views

Published on

Robust HA Solutions - Native Support for PXC and InnoDB cluster in ProxySQL
This talk will illustrate and discuss several MySQL reference architectures that implement a different grade of tightly coupled database cluster.
We will show how ProxySQL implementation is a natural fit in all of them, and how easily it will provide additional stability and functionalities improvement.

Published in: Data & Analytics
  • Be the first to comment

  • Be the first to like this

Robust ha solutions with proxysql

  1. 1. 1 Robust HA Solutions - Native Support for PXC and InnoDB Cluster in ProxySQL Marco Tusa Percona
  2. 2. 2 2 • Open source enthusiast • Principal architect • Working in DB world over 25 years • Open source developer and community contributor About Me
  3. 3. 3 What this talk is about High level description of High Availability HA solutions Misses from current vanilla solutions Where ProxySQL fits It is NOT about Performance It is NOT an in depth description of HA or ProxySQL feature It is NOT an implementation guide
  4. 4. 4 • The need and dimension of HA or DR is related to the real need of your business. • We are pathologically online/connected, and often we expect to have over dimensioned HA or DR. • Business needs often do not require all the components to be ALWAYS available. How We Need to think about HA and DR
  5. 5. 5 Why We Need HA and DR Do: • Business needs • Technical challenges • Supportable solutions • Knowhow Don’t: • Choose based on the “shiny object” • Pick something you know nothing about • Choose by yourself and push it up or down • Use shortcuts, to accelerate deploying time. The first step to have a robust solution is to design the right solution for your business.
  6. 6. 6 • High Availability is to cover service availability in a location • 10 Gb Ethernet Best case scenario
 400 mt distance Max HA vs DR
  7. 7. 7 • Disaster Recovery is to assure we can restore service, in a geographically different location • Real speed may vary • Linear distance ~1000Km HA vs DR
  8. 8. 8 HA vs DR DO NOT MIX the solutions in your architecture http://www.tusacentral.com/joomla/index.php/mysql-blogs/204-how-not-to-do-mysql-high-availability-geographic-node-distribution-with-galera-based-replication-misuse http://www.tusacentral.com/joomla/index.php/mysql-blogs/205-mysql-high-availability-on-premises-a-geographically-distributed-scenario
  9. 9. 9 Replicate data is the key - Sync VS Async 1
 Data state 3 Different Data state
  10. 10. 10 Data Replication is the Base Tightly coupled database clusters • Datacentric approach (single state of the data, distributed commit) • Data is consistent in time cross nodes • Replication requires high performant link • Geographic distribution is forbidden • DR is not supported Loosely coupled database clusters • Single node approach (local commit) • Data state differs by node • Single node state does not affect the cluster • Replication link doesn’t need to be high performance • Geographic distribution is allow • DR is supported
  11. 11. 11 In MySQL ecosystem we have different solutions • NDB • Galera replication • InnoDB group replication • Basic Replication Different solutions
  12. 12. 12 Different solutions
  13. 13. 13 Good thing: • Very High availability (five 9s) • Write scale (for real not a fake) • Read scale • Data is in sync all the time Bad things: • Too be very efficient should
 stay in memory as much as possible • Complex to manage • Require deep understanding also from application point of view Understand solutions: NDB
  14. 14. 14 Replication by internal design Good thing: • Highly available • Read scale • Data is in almost in sync
 all the time Bad things: • Doesn’t Scale writes • More nodes more internal
 overhead • Network is a very impacting factor • 1 Master only (PLEASE!!!!!) Understand solutions: Galera
  15. 15. 15 Understand solutions: Group replication Replication by standard MySQL Good thing: • Highly available • Read scale • Data is in almost in sync
 all the time • More network tolerant Bad things: • Doesn’t Scale writes • 1 Master only (PLEASE!!!!!) When something ends to be a new things?
  16. 16. 16 Understand solutions: Replication Replication by internal design Good thing: • Each node is independent • Read scale • Low network impact Bad things: • Stale reads • Low HA • Consistent only in the master • Each node has its own data state
  17. 17. 17 1. No embedded R/W split • Writes and reads are bound to 1 node • Failing read channel will cause total failure 2. No recognition of possible stale reads • In case of replication delay, application can get wrong data 3. No node failure identification and no single entry point • Architecture must implement a double mechanism to check nodes status (topology manager) and entry point like VIP • Application must use a connector that allow multiple entry point in case of issue. (like Java connector) What they have in common?
  18. 18. 18 1. Embedded R/W split by group of servers (Host Group) • Writes and reads are automatically redirect to working node(s) 2. Identify stale node with the binary log reader • In case of replication delay, delayed node will not receive the read requests 3. Embedded node failure, automatic redirection • Proxy provide native status recognition for Galera and Group replication • Only when using basic replication, you need a topology manager • Application connect to ProxySQL no need to add VIP What ProxySQL can do to be better
  19. 19. 19 Native health checks read_only wsrep_local_recv_queue wsrep_desync wsrep_reject_queries wsrep_sst_donor_rejects_queries primary_partition ProxySQL for Galera • writer_hostgroup: the hostgroup ID that refers to the WRITER • backup_writer_hostgroup: the hostgoup ID referring to the Hostgorup that will contain the candidate servers • reader_hostgroup: The reader Hostgroup ID, containing the list of servers that need to be taken in consideration • offline_hostgroup: The Hostgroup ID that will eventually contain the writer that will be put OFFLINE • active: True[1]/False[0] if this configuration needs to be used or not • max_writers: This will contain the MAX number of writers you want to have at the same time. In a sane setup this should be always 1, but if you want to have multiple writers, you can define it up to the number of nodes. • writer_is_also_reader: If true [1] the Writer will NOT be removed from the reader HG • max_transactions_behind: The number of wsrep_local_recv_queue after which the node will be set OFFLINE. This must be carefully set, observing the node behaviour. • comment: I suggest to put some meaningful notes to identify what is what.
  20. 20. 20 I have these nodes 192.168.1.205 (Node1) 192.168.1.21 (Node2) 192.168.1.231 (node3) ProxySQL for Galera Will use these groups: Writer HG-> 100 Reader HG-> 101 BackupW HG-> 102 offHG HG-> 9101 INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.205’,100,3306,1000); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.205',101,3306,1000); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.21',101,3306,1000); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.231',101,3306,1000); insert into mysql_galera_hostgroups (writer_hostgroup,backup_writer_hostgroup,reader_hostgroup, offline_hostgroup,active,max_writers,writer_is_also_reader,max_transactions_behind) values (100,102,101,9101,1,1,1,16);
  21. 21. 21 ProxySQL for Galera
  22. 22. 21 ProxySQL for Galera
  23. 23. 21 ProxySQL for Galera
  24. 24. 22 ProxySQL for Group Replication Native health checks Read_only Viable_candidate Transaction_behind • writer_hostgroup: the hostgroup ID that refers to the WRITER • backup_writer_hostgroup: the hostgoup ID referring to the Hostgorup that will contain the candidate servers • reader_hostgroup: The reader Hostgroup ID, containing the list of servers that need to be taken in consideration • offline_hostgroup: The Hostgroup ID that will eventually contain the writer that will be put OFFLINE • active: True[1]/False[0] if this configuration needs to be used or not • max_writers: This will contain the MAX number of writers you want to have at the same time. In a sane setup this should be always 1, but if you want to have multiple writers, you can define it up to the number of nodes. • writer_is_also_reader: If true [1] the Writer will NOT be removed from the reader HG • max_transactions_behind: determines the maximum number of transactions behind the writers that ProxySQL should allow before shunning the node to prevent stale reads (this is determined by querying the transactions_behind field of the sys.gr_member_routing_candidate_status table in MySQL). • comment: I suggest to put some meaningful notes to identify what is what.
  25. 25. 23 I have these nodes 192.168.4.55 (Node1) 192.168.4.56 (Node2) 192.168.4.57 (node3) ProxySQL for Group Replication Will use these groups: Writer HG-> 400 Reader HG-> 401 BackupW HG-> 402 offHG HG-> 9401 INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('192.168.4.55',400,3306,10000,2000,'GR1'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('192.168.4.55',401,3306,100,2000,'GR1'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('192.168.4.56',401,3306,10000,2000,'GR2'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('192.168.4.57',401,3306,10000,2000,'GR2'); insert into mysql_group_replication_hostgroups (writer_hostgroup,backup_writer_hostgroup,reader_hostgroup, offline_hostgroup,active,max_writers,writer_is_also_reader,max_transactions_behind) values (400,402,401,9401,1,1,1,100); select * from mysql_server_group_replication_log order by 3 desc,1 limit 3 ; +--------------+------+------------------+-----------------+------------------+-----------+---------------------+-------+ | hostname | port | time_start_us | success_time_us | viable_candidate | read_only | transactions_behind | error | +--------------+------+------------------+-----------------+------------------+-----------+---------------------+-------+ | 192.168.4.57 | 3306 | 1569593421324355 | 3085 | YES | YES | 0 | NULL | | 192.168.4.56 | 3306 | 1569593421321825 | 2889 | YES | YES | 0 | NULL | | 192.168.4.55 | 3306 | 1569593421321435 | 2764 | YES | NO | 0 | NULL | +--------------+------+------------------+-----------------+------------------+-----------+---------------------+-------+
  26. 26. 24 ProxySQL for Group Replication
  27. 27. 24 ProxySQL for Group Replication
  28. 28. 24 ProxySQL for Group Replication
  29. 29. 24 ProxySQL for Group Replication
  30. 30. 25 Native health checks MySQL ping read_only Latency ProxySQL for NDB Simpler to configure, no replication lag either ON or OFF delete from mysql_servers where hostgroup_id in (300,301); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.107',300,3306,10000,2000,'DC1 writer'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.107',301,3306,10000,2000,'DC1 Reader'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.108',300,3306,10000,2000,'DC1 writer'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.108',301,3306,10000,2000,'DC1 Reader'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.109',300,3306,10000,2000,'DC1 writer'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.109',301,3306,10000,2000,'DC1 Reader'); INSERT INTO mysql_replication_hostgroups VALUES (300,301,'read_only','NDB_cluster'); LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK; Easy to monitor select b.weight, c.* from stats_mysql_connection_pool c left JOIN runtime_mysql_servers b ON c.hostgroup=b.hostgroup_id and c.srv_host=b.hostname and c.srv_port = b.port where hostgroup in (300,301) order by hostgroup,srv_host desc; +--------+-----------+------------+----------+--------+...+------------+ | weight | hostgroup | srv_host | srv_port | status |...| Latency_us | +--------+-----------+------------+----------+--------+...+------------+ | 10000 | 300 | 10.0.0.109 | 3306 | ONLINE |...| 561 | | 10000 | 300 | 10.0.0.108 | 3306 | ONLINE |...| 494 | | 10000 | 300 | 10.0.0.107 | 3306 | ONLINE |...| 457 | | 10000 | 301 | 10.0.0.109 | 3306 | ONLINE |...| 561 | | 10000 | 301 | 10.0.0.108 | 3306 | ONLINE |...| 494 | | 10000 | 301 | 10.0.0.107 | 3306 | ONLINE |...| 457 | +--------+-----------+------------+----------+--------+...+------------+
  31. 31. 26 ProxySQL for NDB
  32. 32. 26 ProxySQL for NDB
  33. 33. 26 ProxySQL for NDB
  34. 34. 27 ProxySQL for MySQL Replication Native health checks MySQL ping read_only Latency Stale READS with GTID delete from mysql_servers where hostgroup_id in (500,501); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.50',500,3306,10000,2000,'DC1 writer'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.50',501,3306,10000,2000,'DC1 Reader'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.51',501,3306,10000,2000,'DC1 Reader'); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections,comment) VALUES ('10.0.0.52',501,3306,10000,2000,'DC1 Reader'); INSERT INTO mysql_replication_hostgroups VALUES (500,501,'read_only', 'Simple replica'); LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK; Easy to monitor select b.weight, c.* from stats_mysql_connection_pool c left JOIN runtime_mysql_servers b ON c.hostgroup=b.hostgroup_id and c.srv_host=b.hostname and c.srv_port = b.port where hostgroup in (500,501) order by hostgroup,srv_host desc; select * from mysql_server_read_only_log order by time_start_us desc limit 3; +------------+------+------------------+-----------------+-----------+-------+ | hostname | port | time_start_us | success_time_us | read_only | error | +------------+------+------------------+-----------------+-----------+-------+ | 10.0.0.52 | 3306 | 1569604127710729 | 895 | 1 | NULL | | 10.0.0.51 | 3306 | 1569604127697866 | 542 | 1 | NULL | | 10.0.0.50 | 3306 | 1569604127685005 | 795 | 0 | NULL | +------------+------+------------------+-----------------+-----------+-------+
  35. 35. 28 ProxySQL for MySQL Replication
  36. 36. 28 ProxySQL for MySQL Replication
  37. 37. 28 ProxySQL for MySQL Replication
  38. 38. 28 ProxySQL for MySQL Replication
  39. 39. 29 ProxySQL is cool because: • Allow you to better distribute the load • Reduce the service downtime shifting to available node(s) • In case of writer down reads are served • With Binlog reader allow u to identify node with stale data also in basic replication • More stuff that we are not covering here… • Multiplexing; sharding; masking; firewalling etc.. In short your HA solution will be more resilient and flexible, which means that your business will be safer. Conclusions
  40. 40. 30
  41. 41. 31

×