Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

MySQL 5.7 clustering: The developer perspective


Published on

(Compiled from revised slides of previous presentations - skip if you know the old presentations)

A summary on clustering MySQL 5.7 with focus on the PHP clients view and the PHP driver. Which kinds on MySQL clusters are there, what are their goal, how does wich one scale, what extra work does which clustering technique put at the client and finally, how the PHP driver (PECL/mysqlnd_ms) helps you.

Published in: Technology
  • Be the first to comment

MySQL 5.7 clustering: The developer perspective

  1. 1. MySQL 5.7 clustering: The developer perspective Ulf Wendel, MySQL/Oracle
  2. 2. The speaker says... MySQL aims to be the web database. Web databases must scale easily. Using a distributed database and clustering is a must. In this talk, we'll explore the clients view on the task at the example of PHP: • Which cluster options exist? • What tasks are there from an developer perspective? • Which tier should handle the additional tasks • How the PHP driver helps you
  3. 3. Part 1 Cluster for MySQL
  4. 4. Availability • Cluster as a whole unaffected by loss of nodes Scalability • Geographic distribution • Scale size in terms of users and data • Database specific: read and/or write load Distribution Transparency • Access, Location, Migration, Relocation (while in use) • Replication • Concurrency, Failure Goals of distributed databases
  5. 5. The speaker says... A distributed database cluster strives for maximum availability and scalability while maintaining distribution transparency. MySQL Cluster has a shared-nothing design good enough for 99,999% (five minutes downtime per year). It scales from Rasperry Pi run in a briefcase to 1.2 billion write transactions per second on a 30 data nodes cluster (if using possibly unsupported low level APIs.) It offers full distribution transparency with the exception of partition relocation to be triggered manually but performed transparently by the cluster. True? On the other end of the distribution transparency scale is MySQL Replication. Why two products? What kind of clusters exist?
  6. 6. Where are transactions run? Primary Copy Update Anywhere When does synchronizatio n happen? Eager n/a (MySQL semi- synch Replication) MySQL Cluster 3rd party: Galera Lazy MySQL Replication/Fabric 3rd party: Tungsten MySQL Cluster Replication What kind of cluster?
  7. 7. The speaker says... A wide range of clusters can be categorized by asking where transactions are run and when replicas synchronize their data. Any eager solution ensures that all replicas are synchronized at any time: it offers strong consistency. A transaction cannot commit before synchronization is done. Please note, what it means to transaction rates: • Single computer tx rate ~ disk/fsync rate • Lazy cluster tx rate ~ disk/fsync rate • Eager cluster tx rate ~ network latency (RTT) Test: Would you deploy a synchronous (eager) cluster on a WAN, or prefer using an asynchronous (lazy) solution?
  8. 8. Lazy Primary Copy 010101001011010 101010110100101 101010010101010 101010110101011 101010110111101 Primary/Master Write Secondary/Slave Secondary/Slave Secondary/Slave Read Read Lazy synchronization: eventual consistency Primary Copy: where any transaction may run „asynchronous“
  9. 9. The speaker says... MySQL Replication falls into the category of lazy Primary Copy clusters. It is a rather unflexible solution as all updates must be sent to the primary. However, this simplifies concurrency control of conflicting, concurrent update transactions. Concurrency control is no different from a single database. Lazy replication can be fast. Transactions don't have to wait for synchronization of replicas. The price of the fast execution is the risk of stale reads and eventual consistency. Transactions can be lost when the primary crashes after commit and before any copy has been updated. (Workaround: MySQL semi-sync replication, which delays the commit until delivery to copy. Alternatively, use shared disk and standby.)
  10. 10. Read but not write scale out 010101001011010 101010110100101 101010010101010 101010110101011 101010110111101 Primary/Master Write Secondary/Slave Secondary/Slave Secondary/Slave Read Read Read Read
  11. 11. The speaker says... Primary copy is best suited for read dominated workloads, as commonly found in web applications. The more secondaries there are, the more reads can be served concurrently. Adding secondaries to the setup does not add (significant) load to the primary. Write scale out is not possible. By definition there is only one server that accepts update transactions. Eager Update Anywhere may offer some, limited write scale capability. MySQL Replication can also be used for: • Backup – perfom blocking backup on slave • High Availablility – for example, remote slaves • Warehousing – OLTP reporting on the slaves
  12. 12. Eager Update Anywhere 010101001011010 101010110100101 101010010101010 101010110101011 101010110111101 Replica Write Replica Replica Replica Read Eager synchronization: strong consistency Update Anywhere: any transaction can run on any replica „synchronous“
  13. 13. The speaker says... An eager update anywhere cluster improves distribution transparency and removes the risk of reading stale data. Transparency and flexibility is improved because any transaction can be directed to any replica. Synchronization happens as part of the commit, thus strong consistency is achieved. Remember: transaction rate ~ network RTT. Failure tolerance is better than with Primary Copy. There is no single point of failure – the primary - that can cause a total outage of the cluster. Nodes may fail without bringing the cluster down immediately. Concurrency control (synchronization) is complex as concurrent transactions from different replicas may conflict.
  14. 14. Replica W R Full read, limited write scale Replica Write Replica Read W R
  15. 15. The speaker says... Read transactions can execute locally in an eager update anywhere cluster. Adding replicas to the cluster, increases the clusters read throughput. Writes scale to some degree only. Update transactions can execute in parallel on multiple replicas and profit from the hardware capacity every additional replica adds to the cluster a a whole. But update transactions must be certified (test for conflicts) before commit and serialized (1SR) at latest at commit on all replicas. Every additional replica, every additional write adds load to all of the current replicas. (Using a quorum instead of ROWA only shifts the issue.)
  16. 16. Record 201 … 300 Record 101 … 200 Record 1 … 100 Replica Partial replication: write scale Replica Write, id = 1 Replica Read, id = 50 R, id = 127R, id = 176 W, id = 198 R, id = 150...300 W, id = 273 W, id = 204
  17. 17. The speaker says... Eager (synchronous) systems are subject to a hard limit: transactions are synchronized over network cables. Network cables are longer than HDD/Flash/RAM cables. Given the limited speed of light, ultimately, messaging on the network will always be slower than messaging on a single machine. However, to scale writes there is only one solution: divide- and-conquery, partial replication. The number of writes processed by a replica must be reduced and replicas must not wait for (many) other replicas before commit. Horizontal partitioning (sharding) can be used. The downside is extra network trafic for reads that involve multiple replicas.
  18. 18. NDB Data NodeNDB Data Node 100% distribution transparency at the SQL-level Partial repl.: MySQL Cluster MySQL SQL Clients NDB Storage Engine C++ API clients 4.3b read tx/s 1.2b write tx/s (in 2012)MySQL NDB Storage Engine NDB Data Node NDB Data Node
  19. 19. The speaker says... MySQL has two products that use the idea of partial replication to scale writes. The products have different designs for different use cases. MySQL Cluster is optimized for fast, reliable network connections, short transactions and real time responses. It prefers main memory over disk storage but also supports on-disk storage. From a SQL clients point of view, MySQL Cluster is nothing but a different storage engine for SQL tables created in a MySQL database. However, internally, it is an auto-sharding system. Guess what uneducated users did and got disappointed about in the past: JOINs … (MySQL Cluster 7.3 has much improved JOIN performance.)
  20. 20. MySQL Fabric Primary/Master Secondary/Slave Secondary/Slave Secondary/Slave Lazy synchronization: eventual consistency Backing Store MySQL Fabric (Master) Group Group (Shard) Group NEW: 5.7
  21. 21. The speaker says... MySQL Fabric is daemon for managing farms of MySQL servers. Farms consist of „groups“. A group either consists of any number of individual MySQL servers, or holds a MySQL Replication cluster. A group describing a replication cluster consists of a master and any number of slaves, as ever since. MySQL Fabric can setup, administrate and monitor groups. Once a MySQL Server has been installed, Fabric can take care of the replication setup details. Failure detectors (built-in: basic heartbeating) can be used for automatic failover of primaries. Failover is based on MySQL 5.6+ GTID logic, use MySQL 5.6.10 or newer. More
  22. 22. Scalability: sharding Primary/Master Slave Slave MySQL Fabric Shard Group: shard1 Backing Store Master Slave Shard Group: shard2 shard_key column 1 Abkhazia 2 Afghanistan shard_key column 11 Azerbaijan 12 Bahamas Setup, Monitor Split, Merge, Move RANGE, HASH, LIST etc NEW: 5.7
  23. 23. The speaker says... MySQL Fabric also uses partial replication/partitioning to scale both read and writes. It is build on top of MySQL Replication (lazy primary copy) which tolerates slow and unreliable network connections. Fabric supports range, hash or list based partitioning of tables using one column as a shard key. Each partition is assigned to a logical shard group, short: shard. Sharding is not transparent to SQL users. A sharded table exists on multiple replicas. Each replica holds a subset of the data. Clients must query all replicas they need data from.
  24. 24. A short, theoretical overview • Enough to get MySQL developers started Follow up for DBAs • Know your workload (OLTP/OLAP, read to write ratio, transactional requirements, complexity of queries, …) • Categorize by Availability, Scalability, Distribution Transparency • More on cluster theory and MySQL Cluster cluster-or-mysql-cluster Wrapping up
  25. 25. The speaker says... For reasons of brevity, the theoretical discussion of MySQL clustering options stops there. A rough catalog has been proposed under „Goals of distributed systems“ to further investigate the various options. Of course, there are interesting 3rd party options to discover – particularily for the DBA. However, to get a developer started with MySQL clustering this should have been more than enough background information.
  26. 26. Part 2 New client tasks
  27. 27. Tasks when using any cluster • Maintain connection pool • Load balance within candidates • Handle node failures • Ideally: auto-explore cluster Cluster independent tasks Client MySQL 1 MySQL 2 MySQL <n>
  28. 28. The speaker says... All PHP applications talking to a cluster of MySQL database servers are facing some common tasks. Replacing a single database with a database cluster means changing the 1:1 relation between the application and the database into a 1:n relation. 1:n puts an additional task on the application: find the right n, find the right server to talk to and so forth.
  29. 29. Task – Choose replica, where to run transaction? • Primary vs. Update Anywhere • Full vs. Partial Replication • If partial: transparent? Task – Consistency, when transactions synchronize • Lazy (asynchronous): eventual consistency, stale reads • Eager (synchronous): strong consistency Cluster dependent tasks
  30. 30. The speaker says... Further tasks may exist depending on the kind of cluster choosen. Generally speaking an eager update anywhere system is the easiest to use. However, it may not always fit the bill. Then lazy replication has to be used or even partial replication, which complicates matters and adds extra tasks to application developer.
  31. 31. Part 3 Which tier for the „load balancing“?
  32. 32. The speaker says... All of the following is a slightly revised and updated version of balancing-peclmysqlndms-14 Skip, if you know that presentation.
  33. 33. Simple 1:1 mapping ?! • Requires eager (synchronous) cluster • No fault tolerance: node failure = stack failure • Inefficient resource usage: no balancing Whole stack cluster HTTP ServerHTTP Server App Server (PHP) MySQL Node App Server (PHP) App Server (PHP) HTTP Server MySQL Node MySQL Node
  34. 34. The speaker says... In a synchronous cluster, for example, if using MySQL Cluster, all nodes have all the data. Thus, every application server could be assigned to one DBMS node. Easy, fast and good for MySQL Cluster but with limits. No good for asynchronous MySQL Replication. Limit: DBMS node failure includes application node failure. Clients should have additional failover logic. Limit: Over- and undercapacity of a DBMS node cannot be compensated. Clients cannot switch to more powerful nodes. Overcapacity of a MySQL node cannot be used by other clients. Different hardware size is ignored.
  35. 35. Free and open source MySQL Proxy • No application changes, handy 3rd party proxy scripts • MySQL node failure does not stop application server • Good resource usage, adaptive load balancing possible Proxy approach - Pro HTTP ServerHTTP Server App Server (PHP) MySQL Node App Server (PHP) App Server (PHP) HTTP Server MySQL Node MySQL Node MySQL Proxy
  36. 36. The speaker says... A transparent MySQL Proxy based solution requires no application changes. Clients connect to the Proxy using MySQL Client Server Protocol, as if the MySQL Proxy was a MySQL Server. Proxy can compensate for DBMS failures. It can react to temporary and permant outages. Proxy can adapt load balancing dynamically. Taking into account node hardware size, current node load, latency, location, … fantasy sets the limits here for self-written or 3rd party Proxy scripts.
  37. 37. Additional component • Complicates architecture, can be single point of failure • C and Lua (scripting) are no natural match for PHP lovers • Adds latency: from client to proxy to node Proxy approach - Con HTTP ServerHTTP Server App Server (PHP) MySQL Node App Server (PHP) App Server (PHP) HTTP Server MySQL Node MySQL Node MySQL Proxy
  38. 38. The speaker says... MySQL Proxy is a great performer! But, … MySQL Proxy adds complexity to the stack. MySQL Proxy needs to be managed. MySQL Proxy is build around C and Lua. C and PHP would be a better match for PHP users. Wrongly used, MySQL Proxy becomes a single point of failure. It is single threaded (about to change). This bares the risk of tasks (Proxy scripts) blocking each other. But, … MySQL Proxy adds latency. Although, it can be minimized significantly running MySQL Proxy on the App Server to avoid use of TCP/IP between PHP and Proxy.
  39. 39. Client application handles load balancing • No single point of failure • No added latency • Highly configurable and adaptive Client side load balancer - Pro HTTP ServerHTTP Server App Server (PHP) MySQL Node App Server (PHP) App Server (PHP) HTTP Server MySQL Node MySQL Node <?php .. ?> <?php ...?> <?php ?>
  40. 40. The speaker says... A client-side approach to the problem is promising, if client applications can be changed. It has most Pro's of the previous approaches. The load balancer is part of the client application. It scales by client and it fails by client. Scalability is given and there is no single point of failure. No additional latency occurs. Load balancing can be adaptive for good resource usage. DBMS node failures do not block clients, fail over is possible.
  41. 41. Client application handles load balancing • PHP is stateless: adaptive load balancing is difficult • Application must be designed for (specific) cluster use • No drop-in solution for existing applications Client side load balancer - Con HTTP ServerHTTP Server App Server (PHP) MySQL Node App Server (PHP) App Server (PHP) HTTP Server MySQL Node MySQL Node <?php .. ?> <?php ...?> <?php ?>
  42. 42. The speaker says... The major downside of a client-side application based solution is the need to change the application. Application changes must be done even for basic cluster use cases. Modifications to source code may not be possible. They may complicate upgrades of standard software. They may cause lots of work thus become expensive. Load balancing is part of PHP thus stateless. This is both a Pro and a Con. It is difficult to hint other clients about node failures. On the other hand, shared nothing is not a bad idea either.
  43. 43. Client side and driver-based: PECL/mysqlnd_ms • Pro: all previous • Synchronous, full replication cluster: no application changes • Asynchronous, full replication cluster: no changes for basics Is driver based the solution? HTTP ServerHTTP Server App Server (PHP) MySQL Node App Server (PHP) App Server (PHP) HTTP Server MySQL Node MySQL Node PECL/mysqlnd_ms PECL/mysqlnd_ms PECL/mysqlnd_ms
  44. 44. The speaker says... A driver based client-side solution has all Pro's! Considering load balancing aspect only, no application changes are required. If using MySQL Cluster, a synchronous cluster, no application changes are required. If using MySQL Replication, an asynchronous cluster that needs read-write splitting, the need for application changes is significantly reduced! Please, welcome the free and open source PECL/mysqlnd_ms driver plugin. The PECL mysqlnd_ms installation can be part of the next PHP deployment, done anyway for security considerations. No new procedure. If any issues, simply disable extension.
  45. 45. Part 4 The PHP driver
  46. 46. Plug, play and pray Application PHP APIs - mysql, mysqli, PDO_MYSQL mysqlnd (PHP 5.3+) Plugin MySQL Server libmysql
  47. 47. The speaker says... In late 2006, MySQL decided to develop a client server library written in C, tailored to the needs of PHP and offered to be integrated into PHP under the terms of the PHP license. The MySQL native driver for PHP was born. Mysqlnd is an alternative to using the MySQL Client Library with PHP. Both libraries are compiled into PHP. They work beneath the user APIs and are written in C like PHP. Very early statistics for monitoring and a plugin API have been added. The C-level plugin API allows plugins hook each and every C call inside the library. This is a comfortable way to add a MySQL proxy/interceptor to PHP without impacting PHP applications. All inside the PHP interpiler.
  48. 48. The plugin for all of you WordPress, Drupal, Symfony, Oxid, ZendFramework, ... mysql, mysqli, PDO_MYSQL mysqlnd Balancing Connection PoolingStatement Redirection PECL/mysqlnd_ms plugin Failover MySQL Server 1 MySQL Server 2 MySQL Server n
  49. 49. The speaker says... PECL/mysqlnd_ms is a plugin for the MySQL native driver for PHP (mysqlnd) library. The mysqlnd library is part of PHP as of version 5.3. As of 5.4 mysqlnd is a default. All three PHP MySQL APIs (mysql, mysqli, PDO_MySQL) can be compiled to use the mysqlnd library, thus all existing PHP MySQL application are supported by the plugin. From an applications point of view a mysqlnd plugin can act as a transparent proxy. Depending on
  50. 50. Anything missing? • All MySQL clusters supported – if not, file bug! • All PHP MySQL APIs, semi-transparent, drop-in • Connection pooling, lazy connections w some state alignment • Optional read-write splitting, transaction aware • Load Balancing, failover (optional: automatic/silent) • Service levels: consistency, slave lag limit, indirect: trottling • Optional client-side query result cache integration • Manual partitioning/sharding (direct query to node group) • Handling for transient errors • 150+ pages documentation, <15 bugs since introduced • Lines of code in tests ~ 2x plugin source code
  51. 51. The speaker says... No worry, it is not complicated. All the magic is beneath the hood, in the C code of PECL/mysqlnd_ms. The user interface is short and comprehensive. This is all what PECL/mysqlnd_ms is about: make using any kind of MySQL cluster easy! Let the load balancer do the work. If anything is missing for your favourite cluster, let us know. The biggest gap currently is Fabric, which is pre-production. The free and open source PECL/mysqlnd_ms is not the only way to get the job done. If you are happy with your current solution, stick with it. Compare features, copy ideas...
  52. 52. /* php.ini */ mysqlnd_ms.enable=1 mysqlnd_ms.config_file=/path/to/mysqlnd_ms_plugin_json.conf /* /path/to/mysqlnd_ms_plugin_json.conf contents */ { "myapp": { "master": { "master_0": { "host": "localhost", "socket": "/tmp/mysql.sock" }}, "slave": { "slave_0": { "host": "", "port": "3306" }}}} Speaking code: configuration
  53. 53. $mysqli = new mysqli("myapp", "username", "password", "database"); $pdo = new PDO('mysql:host=myapp;dbname=database', 'username', 'password'); $mysql = mysql_connect("myapp", "username", "password"); /* Statements will be run on the master */ $mysqli->query("DROP TABLE IF EXISTS test"); $mysqli->query("CREATE TABLE test(id INT)"); $mysqli->query("INSERT INTO test(id) VALUES (1)"); /* Goes to the slave - don't let slave lag fool you! */ $res = $mysqli->query("SELECT id FROM test"); var_dump($res->fetch_assoc()); $mysqli->close(); Connect and use... - as ever
  54. 54. Optional, can be diabled • Read: statement begins with SELECT or SQL slave hint • Read: transaction set read only • Custom: callback decides whether read or write Read/Write splitting Client Writes Reads (SELECT, SQL hint) Master SlaveSlave Slave
  55. 55. The speaker says... Some cluster require read/write splitting. If so, clients must direct all writes to the primary (master). The plugin can, optionally, do automatic read/write splitting. Any automatic decision of the plugin can be overruled. In this case using a hint or a callback. Detection of reads is driver and API dependent and works differently among different MySQL drivers. The PHP driver contains no full-fledged parser for SQL statement analysis. The existing APIs do neither force nor allow developers to set a connection read-only. If the application does not hint a read only transaction, the plugin checks if a statement begins with SELECT or a certain SQL hint. If not, its considered a write – a safe default choice.
  56. 56. No server switch allowed during a transaction • PHP 5.4+: monitoring of all API calls controlling transactions • SQL not checked, protocol does not announce boundaries yet • Use API calls only to control transactions, or SQL hint Redirections and transactions Client Writes Reads (SELECT, SQL hint) Master SlaveSlave Slave
  57. 57. The speaker says... Statement redirection for load balancing bares a pitfall: connection state. Session variables and transactions are part of the state. A load balancer must not switch servers in the middle of a transaction. When using primary copy, all transactions shall go to the primary (master) because it is unknown at the beginning of a transaction whether it includes updates. The MySQL Client Server protocol does not yet hint whether a transaction is active. Load balancers must either parse SQL to catch BEGIN, monitor API calls or be hinted by the application. PECL/mysqlnd_ms monitors the API and understands hints.
  58. 58. Learn: use API calls only to control transactions • We updated the mysqli API to cover all SQL trx options • When switching, plugin takes care of connection charset etc. • Hint: if using MySQL replication, set trx_stickiness=master • Hint: consider master_on_write = 1 or QoS (see below) Transaction aware code $mysqli = new mysqli("myapp", "username", "password", "database"); /* Disable autocommit, master used, no server switch allowed */ $mysqli->autocommit(FALSE); /* ... */ if (!$mysqli->commit()) { die(sprintf("[%d] %sn", $mysqli->errno, $mysqli->error)); } /* Transaction has ended, load balancing begins again */ $mysqli->autocommit(TRUE);
  59. 59. The speaker says... Use the API calls to control transactions. Give the database driver a chance to help you. Virtually any state change that you perform through API calls is handled by PECL/mysqlnd_ms. For example, if you change the charset with mysqli_set_charset(), the plugin makes sure that all opened connections use the same charset. We hide complexity, make cluster use easier. Consider using master_on_write=1. This will keep all reads followed by the first write on the master. Handy, if you use asynchronous cluster and eventual consistency is not enough. Wait for quality-of-service, its getting
  60. 60. Built-in or user-defined through callback • Round robin, Random, Random once (sticky) • 1.3 – kind of adaptive by setting maximum slave lag • 1.4 – nodes can be assigned weigths/probabilities • Future: monitors, such Fabric, could adapt weights dynamically Load Balancing Client Writes Load Balancing of Reads Master SlaveSlave Slave Weight = 2 Weight = 1 Weight = 1Weight = 1
  61. 61. The speaker says... The best and default load balancing stategy is the one with the lowest impact on the connection state: random once. Minimize switches for the life-span of your PHP script. Pick a random slave during startup and use it for all reads. The life-span of PHP is the of a web request. It is short. Two web requests will end up on two random slaves. Nodes with a weight of two will get twice as many requests as nodes with a weight of one. This is not only great if your nodes have different computing power but also to prefer local nodes over remote ones by setting a higher weight. Fabric or similar could drive this further to truly adaptive load balancing
  62. 62. Caution: connection state change • Default – raise exception, let application try again • Optional – silent failover to (another) master (blue) • 1.4 – remember failed, try many before giving up (red) Pooling: connection failover Client Writes Reads Master SlaveSlave Slave Any cluster Old illustrations
  63. 63. The speaker says... Connection state changes are what speaks against automatic, silent fail over once a connection has been established and fails. If your application is written in a way that automatic reconnect upon failure is allowed, you can enable it. Otherwise, handle the error, do not close the connection but run the next query to trigger a new load balancing attempt! Failed nodes can be remembered for the duration of a request. Automatic failover can either try to connect to the master or try to use another slave before failing over to the master. If using MySQL Cluster, the search logic still applies with the only difference that only masters are to be configured.
  64. 64. Do no open connection before needed • Delay connection establishment until statement execution • Reduce number of connections, reduce load • Great for legacy PHP apps, even without clustering • PoC Connection Multiplexing Plugin (PECL/mysqlnd_mux) Pooling: lazy connections Client Writes Reads Master SlaveSlave Slave
  65. 65. The speaker says... By default, to reduce the connection pool size of every PHP web request, connections are not opened before executing a statement. Imagine you have 50 concurrent web requests, you configured four slaves and, you are using random once. If not using lazy connections, the plugin opens 50 x 4 = 200 slave connections but uses at max 50 of them (random once). Every connection occupies resources – ever seen a PHP application opening connections before using? There are many badly written ones... Many connection state changing API calls are buffered and replayed when opening lazy to prevent pitfalls. Server support is missing.
  66. 66. Do not bother about node selection – we do • mysqlnd_ms_set_qos() • Eventual, session and strong consistency • Allow only certain slave lag, enable client-side caching Quality of service Database cluster as a service Client Node NodeNode Node PECL/mysqlnd_ms
  67. 67. The speaker says... The basics (load balancing, failover) are solved. Abstraction can start. We can start to see the cluster as a service. The load balancer knows how to use the cluster to achieve a certain quality of service. The application formulates its service quality requirements, the load balancer takes appropriate action. For example, state the consistency level needed. Is reading stale data from a slave allowed (eventual consistency) and if so, is there any limit on freshness? Having API calls to know the client demands is a great help for load balancer and cluster vendors!
  68. 68. Nodes may or may not serve the latest copies • Default with lazy (asynchronous) systems, MySQL Replication • Nodes may server stale copies or not have copies at all Recap: Eventual consistency MySQL Node MySQL Node MySQL Node GET X, NULL SET X = 1 GET X, X = 0
  69. 69. The speaker says... An eventual consistent node may or may not serve the latest copy. In fact, there is no promise that a particular copy is available from every node. Many systems that default to eventual consistency reach strong consistency over time. Eventually – after some time - all nodes get synchronized. This model is similar to that of a cache. Eventual consistency is good enough for browsing product catalogs or other infrequently changing contents in areas where stale data is acceptable. It is the default consistency level with MySQL Replication. Is eventual consistency evil? Depends on requirements. And, BTW, how do ATMs work – always connected to host?
  70. 70. One function call • More flexible than master_on_write = 1 config setting Setting QoS for consistency $mysqli = new mysqli("myapp", "username", "password", "database"); /* read-write splitting: master used */ $mysqli->query("INSERT INTO orders(item) VALUES ('elePHPant')"); /* Request session consistency: read your writes */ mysqlnd_ms_set_qos($mysqli, MYSQLND_MS_QOS_CONSISTENCY_SESSION); /* Plugin picks a node which has the changes, here: master */ $res = $mysqli->query("SELECT item FROM orders WHERE order_id = 1"); var_dump($res->fetch_assoc()); /* Back to eventual consistency: stale data allowed */ mysqlnd_ms_set_qos($mysqli, MYSQLND_MS_QOS_CONSISTENCY_EVENTUAL); /* Plugin picks any slave, stale data is allowed */ $res = $mysqli->query("SELECT item, price FROM specials");
  71. 71. The speaker says... An eager update anywhere replication system such as MySQL Cluster delivers strong consistency. Classic users of MySQL Replication achieve different level of consistency using the simple rules: eventual consistency – any slave, session and strong consistency – master only. Using the master for any level beyond eventual consistency has a negative impact on read scale out. MySQL 5.6 Global Transactions Identifiers help to achieve session consistency even when reading from slaves – reliable read your writes has become possible. PECL/mysqlnd_ms implements the node selection logic.
  72. 72. QoS may be used query slaves not lagging too far • MySQL estimates the replication lag MySQL Replication: eventual c. MySQL Master MySQL Slave 1 MySQL Slave 2 Slave lag = 1 second SET X = 9 GET X, X = 8 Slave lag = 7 seconds
  73. 73. The speaker says... A MySQL Replication slave is eventual consistent – it may or may not have the latest updates. Slaves can be filtered by replication lag: mysqlnd_ms_set_qos($conn, MYSQLND_MS_QOS_CONSISTENCY_EVENTUAL, MYSQLND_MS_QOS_OPTION_AGE, 5) filters out all slaves that report an estimated lag of more than five seconds. Note that a slave is a cache...
  74. 74. Automatically cache results no older than n seconds • Reduce slave load, reduce latency • Cache stores: process memory, Memcached, … • PHP 5.4+, PECL/mysqlnd_ms 1.4+, PECL/mysqlnd_qc 1.1+ QoS: Client cache integration MySQL Slave 1 MySQL Slave 2 Slave lag = 1 second Slave lag = 7 seconds Max_TTL = 10 – 1 = 9 TTL = MAX_AGE - 7TTL = MAX_AGE - 1 Cache 1st GET X, X = 8 2nd GET X, X = 8
  75. 75. The speaker says... If the load balancer knows that eventual consistency is good and a certain degree of stale data (age, slave lag) is allowed, a slave access can be transparently replaced with a cache access. This lowers the overall slave load in the cluster and reduces latency. Replacing a slave access with a cache access is transparent from an application point of view. PECL/mysqlnd_ms is using PECL/mysqlnd_qc (qc = query cache) for the caching. PECL/mysqlnd_qc can also be used as a standalone cache to store query results in Memcache,
  76. 76. $mysqli = new mysqli("myapp", "username", "password", "database"); /* no caching */ $res = $mysqli->query("SELECT headline FROM latest_news"); var_dump($res->fetch_all(MYSQLI_ASSOC)); /* use cache, if slave lag allows */ mysqlnd_ms_set_qos($mysqli, MYSQLND_MS_QOS_CONSISTENCY_EVENTUAL, MYSQLND_MS_QOS_OPTION_CACHE, 60)); $res = $mysqli->query("SELECT headline FROM old_news"); var_dump($res->fetch_all(MYSQLI_ASSOC)); Transparent caching
  77. 77. The speaker says... Complicated logic behind but easy to use – PHP should look out for new cluster aware APIs. Let's look beyond eventual consistency – even if using a lazy primary copy system such as MySQL Replication.
  78. 78. Combination of server id and sequence number • Emulation: PECL/mysqlnd_ms 1.2+, MySQL Proxy • Built-in (for failover use case): MySQL 5.6 • Alternative: MASTER_POS_WAIT() Global transaction identifier MySQL Master Log 7, Pos 34, GTID M:1: UPDATE x=1 Log 7, Pos 35, GTID M:2: UPDATE x=9 MySQL Slave 1 MySQL Slave 2 … , GTID M:1: UPDATE x=1… , GTID M:1: UPDATE x=1 … , GTID M:2: UPDATE x=9
  79. 79. The speaker says... A global transaction identifier is a cluster-wide unique transaction identifier. MySQL 5.6 can generate it automatically. MySQL Proxy and PECL/mysqlnd_ms 1.2 feature client-side emulations for use with any MySQL version. SQL can be used to access the GTIDs. GTIDs have been been created to make MySQL Replication failover easier. However, they are useful for load balancing as well in a primary copy system.
  80. 80. Use GTID to find slave, which as my write replicated • Check slave status using SQL • Reduce load on master MySQL Replication: session con. MySQL Master ..., GTID M:1: UPDATE x=1 ..., GTID M:2: UPDATE x=9 MySQL Slave 1 MySQL Slave 2 … , GTID M:1: UPDATE x=1… , GTID M:1: UPDATE x=1 … , GTID M:2: UPDATE x=9 SET X = 9 GET X, X = 9
  81. 81. The speaker says... Global transaction identifier help to find „up-to- date“ slaves that have already replicated the latest updates of a client. Thus, session consistency can now be achieved by reading from the master and selected „up-to-date“ slaves. This works with the GTID emulation of PECL/mysqlnd_ms 1.2+ and any MySQL version. And, it works with MySQL 5.6 that has built-in GTIDs. PECL/mysqlnd_ms 1.4 can either wait for a slave to catch up or search all slaves before considering the master. The wait effectively means throttling slaves to reduce slave lag!
  82. 82. Always, primary only • Only primary has all comitted transactions MySQL Replication: strong cons. MySQL Master MySQL Slave MySQL Slave SET X = 1 Client A GET X, X = 1 Client B
  83. 83. The speaker says... Configuring PECL/mysqlnd_ms for use with a MySQL Replication cluster and calling mysqlnd_ms_set_qos($conn, MYSQLND_MS_QOS_CONSISTENCY_STRONG) instructs PECL/mysqlnd_ms to use only the MySQL Replication master server for requests. In a lazy primary copy system there is only one node that is guaranteed to have all comitted updates: the primary. Please note, that its possible to achieve higher consistency levels than eventual consistency in an lazy primary copy system by appropriately choosing nodes.
  84. 84. THE END Contact:
  85. 85. The speaker says... Thank you for your attendance! Upcoming shows: Talk&Show! - YourPlace, any time PHP Summit Berlin, 02.-04.12.2013