The document discusses some key challenges with achieving high availability and scalability in distributed systems based on the CAP theorem. It explains that consistency, availability, and partition tolerance cannot all be guaranteed simultaneously. It then provides examples of how this manifests in real database systems like MySQL, PostgreSQL, Redis, and RabbitMQ. It discusses strategies for improving availability and scalability in systems like PostgreSQL clusters and MySQL Galera clusters, but also limitations and complexity involved.
MySQL 5.6 Global Transaction IDs - Use case: (session) consistency - Ulf Wendel
PECL/mysqlnd_ms is a transparent load balancer for PHP and MySQL. It can be used with any kind of MySQL cluster. If used with MySQL Replication it has some tricks to offer to break out of the default eventual consistency of the lazy primary copy design of MySQL Replication. It uses global transaction IDs to lower read load on the master while still offering session consistency. Users of MySQL 5.6 can use the server built-in global transaction ID feature; everybody else can use the driver built-in emulation that works with previous MySQL versions as well. Of course, it's a mysqlnd plugin and as such it works with all PHP MySQL APIs (mysql, mysqli, PDO_MySQL). Happy hacking!
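The session-consistency trick can be sketched in a few lines of Python (a conceptual illustration, not the plugin's C implementation; plain integers stand in for real MySQL 5.6 UUID:interval GTID sets):

```python
class SessionConsistentRouter:
    def __init__(self, replicas):
        # replica name -> last GTID the replica has applied
        self.replicas = dict(replicas)
        self.last_write_gtid = 0  # highest GTID this session wrote

    def record_write(self, gtid):
        self.last_write_gtid = gtid

    def pick_read_node(self):
        # Any replica that has caught up to our last write preserves
        # session consistency; otherwise fall back to the master.
        for name, applied in self.replicas.items():
            if applied >= self.last_write_gtid:
                return name
        return "master"

router = SessionConsistentRouter({"replica1": 5, "replica2": 9})
router.record_write(7)
print(router.pick_read_node())  # replica2: it has applied GTID >= 7
```

Reads only fall back to the master while no replica has caught up, which is exactly how GTIDs lower master read load without giving up session consistency.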
PECL/mysqlnd_mux adds multiplexing to all PHP MySQL APIs (mysql, mysqli, PDO_MySQL) compiled to use mysqlnd. Connection multiplexing refers to sharing one MySQL connection among multiple user connection handles, among multiple clients. Multiplexing reduces client-side connection overhead and minimizes the total number of concurrently open connections. The latter lowers the MySQL server load. As a highly specific optimization it has not only strong but also weak sides. See what this free plugin, still in its prototype stage, has to offer, and how it compares to other techniques such as pooling or persistent connections - what to use when tuning PHP MySQL to the extreme.
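How multiplexing differs from one-connection-per-handle can be illustrated with a toy Python model (an assumption-laden sketch, not mysqlnd_mux internals):

```python
class SharedConnection:
    """Stand-in for one real MySQL server connection."""
    def __init__(self):
        self.queries_run = 0

    def query(self, sql):
        self.queries_run += 1
        return f"result of {sql}"

class MultiplexedHandle:
    """User-facing handle; all handles share one SharedConnection."""
    _shared = None

    def __init__(self):
        if MultiplexedHandle._shared is None:
            MultiplexedHandle._shared = SharedConnection()
        self.conn = MultiplexedHandle._shared

    def query(self, sql):
        # Queries from every handle are serialized onto the single
        # connection - the source of both the saving (fewer open
        # connections) and the weak sides (no per-handle transaction
        # state, head-of-line blocking).
        return self.conn.query(sql)

h1, h2 = MultiplexedHandle(), MultiplexedHandle()
h1.query("SELECT 1")
h2.query("SELECT 2")
print(h1.conn is h2.conn)   # True: one server connection for both
```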
Clustering MySQL is a mainstream technology to handle today's web loads. Regardless of whether you choose MySQL Replication, MySQL Cluster or any other type of clustering solution, you will need a load balancer. PECL/mysqlnd_ms 1.4 is a driver-integrated load balancer for PHP. It works with all APIs, is free, semi-transparent, sits at the best possible layer in your stack and is loaded with features. Get an overview of the latest development version 1.4.
DIY: A distributed database cluster, or: MySQL Cluster - Ulf Wendel
Live from the International PHP Conference 2013: MySQL Cluster is a distributed, auto-sharding database offering 99.999% high availability. It runs on a Raspberry Pi as well as on a cluster of multi-core machines. A 30 node cluster was able to deliver 4.3 billion (not million) read transactions per second in 2012. Take a deeper look into the theory behind all the MySQL replication/clustering solutions (including 3rd party) and learn how they differ.
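Auto-sharding of the kind MySQL Cluster performs boils down to hashing the partition key to pick a data node; a toy Python version (the node count and the use of MD5 here are illustrative assumptions, not MySQL Cluster's actual node-group mapping):

```python
import hashlib

NODE_COUNT = 4  # illustrative; real clusters map keys to node groups

def data_node_for(primary_key):
    # Hash the partition key and map it onto one of the data nodes.
    # Every client computes the same mapping, so any node (or API
    # node) can route a request without a central directory.
    digest = hashlib.md5(str(primary_key).encode()).hexdigest()
    return int(digest, 16) % NODE_COUNT

print(data_node_for("user:1001"))  # stable node id in 0..3
```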
MySQL Group Replication provides a high availability multi-master replication solution for MySQL. It allows multiple MySQL instances to act as equal masters that can accept writes and remain available even if some instances fail. Transactions are synchronously committed across all members of the replication group to ensure consistency. Group Replication handles failure detection and recovery transparently through its use of group communication systems and built-in conflict detection. It provides a highly available, scalable and fully distributed database solution compared to traditional MySQL replication and clustering options.
MySQL 5.6 Global Transaction Identifier - Use case: Failover - Ulf Wendel
The document discusses how global transaction IDs (GTIDs) and PECL/mysqlnd_ms can improve MySQL replication and failover capabilities. GTIDs allow for easier identification of the most up-to-date transactions during failover. PECL/mysqlnd_ms can fail over client connections transparently when errors occur. While GTIDs and PECL/mysqlnd_ms improve availability, changes to the replication topology still require deploying updates to client configurations.
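With GTIDs, choosing the failover candidate reduces to comparing executed transaction sets; sketched here with integers standing in for executed-GTID sets:

```python
def pick_failover_candidate(replicas):
    # The most up-to-date replica is the one that has executed the
    # most transactions from the failed master. Real GTID sets are
    # UUID:interval ranges; integers keep the sketch readable.
    return max(replicas, key=replicas.get)

executed = {"replica1": 120, "replica2": 135, "replica3": 130}
print(pick_failover_candidate(executed))  # replica2
```

Before GTIDs, the same decision required correlating binary log file names and offsets across replicas, which is exactly the manual step GTIDs remove.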
MySQL 5.7 Fabric: Introduction to High Availability and Sharding - Ulf Wendel
MySQL 5.7 adds sharding to MySQL. The free and open source MySQL Fabric utility simplifies the management of MySQL clusters of any kind. This includes MySQL Replication setup, monitoring, automatic failover, switchover and so forth for High Availability. Additionally, it offers measures to shard a MySQL database over an arbitrary number of servers. Intelligent load balancers (updated drivers) take care of routing queries to the appropriate shards.
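Shard routing of the sort Fabric-aware drivers perform can be sketched as a range lookup (the shard map below is hypothetical, for illustration only):

```python
import bisect

# Hypothetical range-based shard map: rows with a shard key below
# each bound live on the corresponding server.
BOUNDS  = [1000, 2000, 3000]
SERVERS = ["shard1", "shard2", "shard3", "shard4"]

def server_for(shard_key):
    # bisect finds which range the key falls into; the driver then
    # opens (or reuses) a connection to that shard's HA group.
    return SERVERS[bisect.bisect_right(BOUNDS, shard_key)]

print(server_for(500))   # shard1
print(server_for(2500))  # shard3
```

Because the driver holds the shard map, queries go straight to the right server without an extra network hop through a middleware proxy.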
PoC: Using a Group Communication System to improve MySQL Replication HA - Ulf Wendel
High Availability solutions for MySQL Replication are either simple to use but introduce a single point of failure, or free of pitfalls but complex and hard to use. The Proof-of-Concept sketches a middle way. For monitoring, a group communication system is embedded into MySQL using a MySQL plugin, which eliminates the monitoring SPOF and is easy to use. Much emphasis is put on the often neglected client side. The PoC shows an architecture in which clients reconfigure themselves dynamically. No client deployment is required.
The mysqlnd replication and load balancing plugin - Ulf Wendel
The mysqlnd replication and load balancing plugin for mysqlnd makes using MySQL Replication from PHP much easier. The plugin takes care of read/write splitting, load balancing, failover and connection pooling. Lazy connections, a feature not only useful with replication, help reduce the MySQL server load. Like any other mysqlnd plugin, it operates mostly transparently from an application's point of view and can be used in a drop-in style.
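The read/write split such a driver plugin performs can be reduced to a statement classifier; a deliberately naive Python sketch (the real plugin also honours transactions, SQL hints and more):

```python
def is_read_only(statement):
    # Send SELECTs to a replica, everything else to the master.
    # A real splitter must also keep open transactions and statements
    # such as SELECT ... FOR UPDATE pinned to the master.
    return statement.lstrip().upper().startswith("SELECT")

print(is_read_only("SELECT * FROM t"))     # True  -> replica
print(is_read_only("UPDATE t SET c = 1"))  # False -> master
```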
The document discusses various topics related to the Tungsten Connector including:
- The role of the Connector in routing connections to the appropriate database nodes.
- Best practices for deploying Connectors in different topologies including on application servers, dedicated nodes, or database nodes with load balancing.
- How to perform zero-downtime maintenance on a Tungsten cluster by manually switching the master role between nodes using the cctrl utility.
- How Connectors route connections in a composite cluster with multiple local clusters and how affinity can be set to prefer local reads from a particular cluster.
- That a Connector can provide access to multiple clusters or composite clusters by configuring the dat
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6 - Severalnines
You'll learn how Galera integrates with MySQL 5.6 and Global Transaction IDs to enable cross-datacenter and cloud replication over high latency networks. The benefits are clear: a globally distributed MySQL setup across regions delivering availability and real-time responsiveness.
Galera Cluster for MySQL is a true multi-master MySQL replication plugin, and has been proven in mission-critical infrastructures of companies like Ping Identity, AVG Technologies, KPN and HP Cloud DNS. In this webcast you'll learn about the following Galera Cluster capabilities, including the latest innovations in the new 3.0 release:
Galera Cluster features and benefits
Support for MySQL 5.6
Integration with MySQL Global Transaction Identifiers
Mixing Galera synchronous replication and asynchronous MySQL replication
Deploying in WAN and Cloud environments
Handling high-latency networks
Management of Galera
MySQL Group Replication is a new 'synchronous', multi-master, auto-everything replication plugin for MySQL introduced with MySQL 5.7. It is the perfect tool for small 3-20 machine MySQL clusters to gain high availability and high performance. It stands for high availability because the failure of replicas doesn't stop the cluster. Failed nodes can rejoin the cluster and new nodes can be added in a fully automatic way - no DBA intervention required. It stands for high performance because multiple masters process writes, not just one as with MySQL Replication. Running applications on it is simple: no read-write splitting, no fiddling with eventual consistency and stale data. The cluster offers strong consistency (generalized snapshot isolation).
It is based on Group Communication principles, hence the name.
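The built-in conflict detection (certification) can be approximated in Python: first committer wins, and a transaction whose write set overlaps something committed after its snapshot was taken is rejected (a simplification of the actual certification protocol):

```python
class Certifier:
    def __init__(self):
        self.last_committed = {}  # row key -> commit version
        self.version = 0          # logical commit counter

    def certify(self, write_set, snapshot_version):
        # Abort if any row in the write set was committed by a
        # concurrent transaction after this one's snapshot was taken.
        if any(self.last_committed.get(k, 0) > snapshot_version
               for k in write_set):
            return False
        # Otherwise commit: record the new version for each row.
        self.version += 1
        for k in write_set:
            self.last_committed[k] = self.version
        return True

c = Certifier()
snap = c.version                      # two transactions, same snapshot
print(c.certify({"row1"}, snap))      # True: first committer wins
print(c.certify({"row1"}, snap))      # False: conflicting write set
print(c.certify({"row2"}, snap))      # True: disjoint write set
```

Because every member runs the same deterministic check in the same total order, all nodes reach the same commit/abort decision without locking across the network.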
Zero Downtime Schema Changes - Galera Cluster - Best Practices - Severalnines
Database schema changes are usually not popular among DBAs or sysadmins, not when you are operating a cluster and cannot afford to switch off the service during a maintenance window. There are different ways to perform schema changes, some procedures being more complicated than others.
Galera Cluster is great at making your MySQL database highly available, but are you concerned about schema changes? Is an ALTER TABLE statement something that requires a lot of advance scheduling? What is the impact on your database uptime?
This is a common question, since ALTER operations in MySQL usually cause the table to be locked and rebuilt – which can potentially be disruptive to your live applications. Fortunately, Galera Cluster has mechanisms to replicate DDL across its nodes.
In these slides, you will learn about the following:
How to perform Zero Downtime Schema Changes
2 main methods: TOI and RSU
Total Order Isolation: predictability and consistency
Rolling Schema Upgrades
pt-online-schema-change
Schema synchronization with re-joining nodes
Recommended procedures
Common pitfalls/user errors
The slides are courtesy of Seppo Jaakola, CEO, Codership - creators of Galera Cluster
This document summarizes and compares several solutions for multi-master replication in MySQL databases: Native MySQL replication, MySQL Cluster (NDB), Galera, and Tungsten. Native MySQL replication supports only limited topologies and has asynchronous replication. MySQL Cluster allows synchronous replication across two data centers but is limited to in-memory tables. Galera provides synchronous, row-based replication across multiple masters with automatic conflict resolution. Tungsten allows asynchronous multi-master replication to different database systems and automatic failover.
Topics covered in this presentation, which was used for user group meetings, conferences & webinars:
1. Galera Cluster for MySQL - overview
2. Release 3 New Features:
* WAN Replication
* 5.6 Global Transaction ID (GTID) Support
* MySQL Replication Support
* and more features
3. The Galera Cluster Project
MySQL native driver for PHP (mysqlnd) - Introduction and overview, Edition 2011 - Ulf Wendel
A quick overview on the MySQL native driver for PHP (mysqlnd) and its unique features. Edition 2011. What is mysqlnd, why use it, which plugins exist, where to find more information.... the current state. Expect a new summary every year.
- Galera is a MySQL clustering solution that provides true multi-master replication with synchronous replication and no single point of failure.
- It allows high availability, data integrity, and elastic scaling of databases across multiple nodes.
- Companies like Percona and MariaDB have integrated Galera to provide highly available database clusters.
Galera replication works by synchronizing data across multiple database servers so that any server can accept writes and all servers instantly reflect the new data. It uses global transaction IDs and group communication to replicate write sets in parallel to all nodes, ensuring consistency. Any node can join the cluster as long as it knows the cluster name and can find an active member to bootstrap from.
MySQL 5.7 clustering: The developer perspective - Ulf Wendel
(Compiled from revised slides of previous presentations - skip if you know the old presentations)
A summary on clustering MySQL 5.7 with focus on the PHP client's view and the PHP driver. Which kinds of MySQL clusters are there, what are their goals, how does each one scale, what extra work does each clustering technique put on the client and, finally, how does the PHP driver (PECL/mysqlnd_ms) help you.
Slides for the webinar held on January 21st 2014
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters
Galera Cluster, NDB Cluster, VIP with HAProxy and Keepalived, MongoDB Sharded Cluster, etc. all have their own availability models. We are aware of these availability models and will demonstrate in this webinar how to take corrective action in case of failures via our cluster management tool, ClusterControl.
In this webinar, Severalnines CTO Johan Andersson will show you how to leverage ClusterControl to detect failures in your database cluster and automatically repair them to maximize the availability of your database services. And Codership CEO Seppo Jaakola will be joining Johan to provide a deep-dive into Galera recovery internals.
Agenda:
Redundancy models for Galera, NDB and MongoDB/TokuMX
Failover & Recovery (Automatic vs Manual)
Zooming into Galera recovery procedures
Split brains in multi-datacenter setups
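The split-brain guard central to the agenda's last point is a plain majority test; sketched in Python:

```python
def has_quorum(visible_members, cluster_size):
    # A partition may keep serving writes only if it sees a strict
    # majority of the configured cluster - the standard guard against
    # split brain in multi-datacenter setups.
    return visible_members * 2 > cluster_size

print(has_quorum(2, 3))  # True: majority side keeps running
print(has_quorum(1, 3))  # False: minority side must stop
print(has_quorum(2, 4))  # False: even-sized clusters can tie
```

The third case is why odd cluster sizes (or an arbitrator node) are usually recommended: a 2-2 network split leaves no side with quorum.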
The document discusses the introduction of an HTTP plugin for MySQL. Key points:
- The plugin allows MySQL to communicate over HTTP and return data in JSON format, making it more accessible to web developers.
- It provides three HTTP APIs - SQL, CRUD, and key-document - that all return JSON and leverage the power of SQL.
- The initial release has some limitations but demonstrates the concept, with the goal of getting feedback to improve the APIs.
- The plugin acts as a proxy between HTTP and SQL, translating requests and allowing full access to MySQL's features via the SQL endpoint.
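The proxy idea - translating an HTTP CRUD request into SQL - can be sketched as follows (the URL scheme and SQL shape are assumptions for illustration, not the plugin's actual endpoints):

```python
def crud_url_to_sql(path):
    # Hypothetical mapping: /crud/<schema>/<table>/<key> becomes a
    # primary-key lookup returning a JSON document column.
    _, schema, table, key = path.strip("/").split("/")
    return f"SELECT doc FROM {schema}.{table} WHERE id = '{key}'"

print(crud_url_to_sql("/crud/myschema/mytable/42"))
# SELECT doc FROM myschema.mytable WHERE id = '42'
```

A production proxy would of course use parameterized statements rather than string interpolation; the point here is only the HTTP-to-SQL translation layer.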
Fine-tuning Group Replication for PerformanceVitor Oliveira
This presentation is an overview of Group Replication from the perspective of performance optimization. It shows the main moving parts, the available options and how they can be used to tune its behaviour, and also a few significant benchmark results.
How oracle 12c flexes its muscles against oracle 11g r2 finalAjith Narayanan
The document summarizes the new high availability features introduced in Oracle 12c, including Flex Cluster and Flex ASM. A Flex Cluster allows for hub and leaf nodes, where leaf nodes do not require direct access to shared storage. Flex ASM runs Automatic Storage Management (ASM) on fewer nodes, allows for larger disk groups, and isolates ASM traffic to improve performance. The new features provide more flexibility, scalability and ease of management for high availability.
Kafka Reliability - When it absolutely, positively has to be thereGwen (Chen) Shapira
Kafka provides reliability guarantees through replication and configuration settings. It replicates data across multiple brokers to protect against failures. Producers can ensure data is committed to all in-sync replicas through configuration settings like request.required.acks. Consumers maintain offsets and can commit after processing to prevent data loss. Monitoring is also important to detect any potential issues or data loss in the Kafka system.
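The interplay of producer acks and in-sync replicas can be modelled in a few lines (a simplification of Kafka's broker-side checks):

```python
def commit_succeeds(in_sync_replicas, min_insync, acks):
    # With acks="all" the leader accepts a write only while at least
    # min.insync.replicas replicas (leader included) are in sync;
    # with acks=1 the leader's own write suffices.
    if acks == "all":
        return in_sync_replicas >= min_insync
    return in_sync_replicas >= 1

print(commit_succeeds(3, 2, "all"))  # True: enough in-sync replicas
print(commit_succeeds(1, 2, "all"))  # False: write is rejected
```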
ProxySQL - High Performance and HA Proxy for MySQL - René Cannaò
High Availability proxy designed to solve real issues of MySQL setups from small to very large production environments.
Presentation at Percona Live Amsterdam 2015
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale - Colin Charles
This document discusses MySQL proxy technologies including MySQL Router, ProxySQL, and MariaDB MaxScale. It provides an overview of each technology, including when they were released, key features, and comparisons between them. ProxySQL is highlighted as a popular option currently with integration with Percona tools, while MySQL Router may become more widely used due to its support for MySQL InnoDB Cluster. MariaDB MaxScale is noted for its binlog routing capabilities. Overall the document aims to help people understand and choose between the different MySQL proxy options.
Get the best out of MySQL Cluster. The presentation covers:
- Tuning and optimization to exploit the auto-sharded, distributed design of MySQL Cluster
- Using Adaptive Query Localization to scale cross-shard JOINs
- Data access patterns, schema and query optimizations
- Recommended tuning parameters
Tune in to the on-demand webinar: http://www.mysql.com/news-and-events/on-demand-webinars/display-od-719.html
Best practices for MySQL/MariaDB Server/Percona Server High Availability - Colin Charles
Best practices for MySQL/MariaDB Server/Percona Server High Availability - presented at Percona Live Amsterdam 2016. The focus is on picking the right High Availability solution, discussing replication, handling failure (yes, you can achieve a quick automatic failover), proxies (there are plenty), HA in the cloud/geographical redundancy, sharding solutions, how newer versions of MySQL help you, and what to watch for next.
This was a short 25 minute talk, but we go into a bit of a history of MySQL, how the branches and forks appeared, what's sticking around today (branch? Percona Server. Fork? MariaDB Server). What should you use? Think about what you need today and what the roadmap holds.
Best practices for MySQL High AvailabilityColin Charles
The MariaDB/MySQL world is full of tradeoffs, and choosing a high availability (HA) solution is no exception. This session aims to look at all the alternatives in an unbiased way. Preference is of course only given to open source solutions.
How do you choose between: asynchronous/semi-synchronous/synchronous replication, MHA (MySQL high availability tools), DRBD, Tungsten Replicator, or Galera Cluster? Do you integrate Pacemaker and Heartbeat like Percona Replication Manager? The cloud brings even more fun, especially if you are dealing with a hybrid cloud and must think about geographical redundancy.
What about newer solutions like using Consul for MySQL HA?
When you’ve decided on your solution, how do you provision and monitor these solutions?
This and more will be covered in a walkthrough of MySQL HA options and when to apply them.
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Severalnines
Galera Cluster for MySQL, Percona XtraDB Cluster and MariaDB Cluster (the three “flavours” of Galera Cluster) make use of the Galera WSREP libraries to handle synchronous replication.MySQL Cluster is the official clustering solution from Oracle, while Galera Cluster for MySQL is slowly but surely establishing itself as the de-facto clustering solution in the wider MySQL eco-system.
In this webinar, we will look at all these alternatives and present an unbiased view on their strengths/weaknesses and the use cases that fit each alternative.
This webinar will cover the following:
MySQL Cluster architecture: strengths and limitations
Galera Architecture: strengths and limitations
Deployment scenarios
Data migration
Read and write workloads (Optimistic/pessimistic locking)
WAN/Geographical replication
Schema changes
Management and monitoring
MySQL High Availability Solutions - Feb 2015 webinarAndrew Morgan
How important is your data? Can you afford to lose it? What about just some of it? What would be the impact if you couldn’t access it for a minute, an hour, a day or a week?
Different applications can have very different requirements for High Availability. Some need 100% data reliability with 24x7x365 read & write access while many others are better served by a simpler approach with more modest HA ambitions.
MySQL has an array of High Availability solutions ranging from simple backups, through replication and shared storage clustering – all the way up to 99.999% available shared nothing, geographically replicated clusters. These solutions also have different ‘bonus’ features such as full InnoDB compatibility, in-memory real-time performance, linear scalability and SQL & NoSQL APIs.
The purpose of this presentation is to help you decide where your application sits in terms of HA requirements and discover which of the MySQL solutions best fit the bill. It will also cover what you need outside of the database to ensure High Availability – state of the art monitoring being a prime example.
Retaining Goodput with Query Rate LimitingScyllaDB
Distributed systems are usually optimized with particular workloads in mind. At the same time, the system should still behave in a sane way when the assumptions about workload do not hold - notably, one user shouldn't be able to ruin the whole system's performance. Buggy parts of the system can be a source of the overload as well, so it is worth considering overload protection on a per-component basis. For example, ScyllaDB's shared-nothing architecture gives it great scalability, but at the same time makes it prone to a "hot partition" problem: a single partition accessed with disproportionate frequency can ruin performance for other requests handled by the same shards. This talk will describe how we implemented rate limiting on a per-partition basis which reduces the performance impact in such a case, and how we reduced the CPU cost of handling failed requests such as timeouts (spoiler: it's about C++ exceptions).
Software architecture for data applicationsDing Li
The document provides an overview of software architecture considerations for data applications. It discusses sample data system components like Memcached, Redis, Elasticsearch, and Solr. It covers topics such as service level objectives, data models, query languages, graph models, data warehousing, machine learning pipelines, and distributed systems. Specific frameworks and technologies mentioned include Spark, Kafka, Neo4j, PostgreSQL, and ZooKeeper. The document aims to help understand architectural tradeoffs and guide the design of scalable, performant, and robust data systems.
This document discusses concepts related to client-server computing and database management systems (DBMS). It covers topics such as DBMS concepts and architecture, centralized and distributed systems, client-server systems, transaction servers, data servers, parallel and distributed databases, and network types. Key points include the definitions of centralized, client-server, and distributed systems. Transaction servers and data servers are described as two types of server system architectures. Issues related to parallelism such as speedup, scaleup, and factors limiting them are also covered.
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Clustrix
The document discusses scaling MySQL databases and alternatives to sharding. It begins by outlining the typical path organizations take to sharding MySQL as their data and usage grows over time. This involves continually upgrading hardware, adding read replicas, and eventually implementing sharding. The document then covers the challenges of sharding, such as data skew across shards, lack of ACID transactions, application changes required, and complex infrastructure needs. As an alternative, the document introduces ClustrixDB, a database that can scale write and read performance linearly just by adding more servers without sharding. It achieves this through automatic data distribution, query fan-out, and data rebalancing. Performance benchmarks show ClustrixDB vastly outscaling alternatives on Amazon
Pacemaker is a high availability cluster resource manager that can be used to provide high availability for MySQL databases. It monitors MySQL instances and replicates data between nodes using replication. If the primary MySQL node fails, Pacemaker detects the failure and fails over to the secondary node, bringing the MySQL service back online without downtime. Pacemaker manages shared storage and virtual IP failover to ensure connections are direct to the active MySQL node. It is important to monitor replication state and lag to ensure data consistency between nodes.
Locking and Race Conditions in Web ApplicationsAndrew Kandels
Mutexes, locks, transactions -- they all may seem more relevant in compiled languages, lower level drivers or in databases; however, race conditions can be of equal dilemma in modern web applications. Something as simple as a user double clicking a submit form can yield unexpected results. These problems are difficult to replicate and to test, so they often go undetected. They can occur with or without significant traffic. Finally, with NoSQL alternatives growing in popularity for storing data and as caching layers, we need new alternatives to database transactions and locking.
In this session, I will present situations which are vulnerable to race conditions, along with solutions. I'll also talk about locking approaches that are reliable, efficient and scalable.
HBase is an open-source implementation of Google's Bigtable storage system and is modeled after Bigtable. It is a distributed, scalable, big data store that allows for storage and retrieval of large amounts of data across clusters of commodity servers. HBase provides a key-value data model and uses Hadoop HDFS for storage. It allows for fast random reads and writes across billions of rows and millions of columns.
we will discuss important topics related to multi-master setups:
* Practical considerations when using Galera in a multi-master setup
* Evaluating the characteristics of your database workload
* Preparing your application for multi-master
* Detecting and dealing with transaction conflicts
The document discusses NoSQL databases and Cassandra. It provides background on the rise of NoSQL with the need for large web companies to handle big data in a distributed manner. It introduces the CAP theorem and explains that NoSQL databases sacrifice consistency to achieve availability and partition tolerance. Eventual consistency is described where updates eventually propagate throughout the system. Cassandra is summarized as an open source, distributed, column-oriented database developed at Facebook to be highly scalable and fault tolerant. It uses an eventual consistency model and is robust to failures.
The document discusses the CAP theorem and related concepts like PACELC, ACID, and BASE. It analyzes how different database systems like PostgreSQL, MongoDB, and a hybrid PostgreSQL/Salesforce/Heroku Connect system fit within these models. While CAP classifications can be imprecise, the key aspects to understand are the consistency, availability, and partition tolerance tradeoffs that distributed systems must make.
Talon systems - Distributed multi master replication strategySaptarshi Chatterjee
This document proposes a new approach to multi-master data replication called TalonStore. It describes existing replication strategies and identifies limitations. TalonStore uses an event-driven architecture where writes are published to a queue and all nodes subscribe independently. For reads, nodes constitute a quorum to enforce consistency if the majority agree. This allows parallel writes without locking, and eliminates single points of failure compared to traditional synchronous replication. The goal is to improve performance and availability for distributed databases while maintaining consistency.
NoSQL databases were developed to address the limitations of scaling relational databases for large datasets and distributed systems. The CAP theorem states that a distributed data store can only provide two of three properties: consistency, availability, and partition tolerance. Most NoSQL databases emphasize availability and partition tolerance over strong consistency, using eventual consistency models. Hadoop ecosystems like HBase, Hive and Pig provide scalable storage and processing of large datasets beyond the capabilities of relational databases.
This document discusses various data distribution models and consistency approaches in NoSQL databases. It covers replication, which copies data across nodes, and sharding, which distributes different data to different nodes. Replication can be master-slave, with one master node handling writes, or peer-to-peer, allowing writes to any node. Sharding and replication can be combined. The document also discusses consistency models and handling write-write conflicts through pessimistic locking or optimistic conditional updates.
This document provides an overview of NoSQL databases and key-value stores. It discusses why NoSQL databases were created, examples of different NoSQL categories like key-value stores and document stores. It then focuses on key-value stores like Memcached and MemcacheDB. Memcached is an in-memory key-value store while MemcacheDB provides persistence. Both use the BerkeleyDB for storage with MemcacheDB.
This document provides an overview of distributed key-value stores and Cassandra. It discusses key concepts like data partitioning, replication, and consistency models. It also summarizes Cassandra's features such as high availability, elastic scalability, and support for different data models. Code examples are given to demonstrate basic usage of the Cassandra client API for operations like insert, get, multiget and range queries.
This document provides an overview of distributed key-value stores and summarizes Cassandra in particular. It discusses how distributed key-value stores address the scalability limitations of relational databases by partitioning and replicating data across multiple servers. The document outlines some common distributed key-value store architectures and algorithms, such as Amazon's Dynamo, and describes how Cassandra implements these approaches. Examples of typical applications of distributed key-value stores and an overview of Cassandra's features and code samples are also provided.
Give you a brief overview of the product. - What is esProc SPL? And show some cases helping you to know what it uses for. Talk about why esProc works better. And overview its brief characteristics. After that, Introduce the main technical solutions which esProc is often used.
What to do when you have a perfect model for your software but you are constrained by an imperfect business model?
This talk explores the challenges of bringing modelling rigour to the business and strategy levels, and talking to your non-technical counterparts in the process.
2. CAP theorem
Presented as a conjecture at PODC 2000 (Brewer's conjecture)
Formalized and proved in 2002 by Seth Gilbert and Nancy Lynch (MIT)
Consistency, Availability and Partition tolerance cannot all be achieved at the same time in a distributed system; there is a tradeoff between these 3 properties:
1. Consistency (all nodes see the same data at the same time)
2. Availability (every request receives a response about whether it succeeded or failed)
3. Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)
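The tradeoff can be made concrete with a toy sketch (plain Python, not a real database; the `Replica` class and "CP"/"AP" modes are illustrative assumptions): two replicas of one key, where a write during a partition reaches only one of them, so a read on the other must either return stale data or refuse to answer.

```python
# Toy illustration: two replicas of a single value.
# When the link is up, writes propagate to both replicas.
# During a partition, a replica must choose: answer from local,
# possibly stale, state (gives up C) or refuse (gives up A).

class Replica:
    def __init__(self):
        self.value = None

    def read(self, partitioned, mode):
        if partitioned and mode == "CP":
            raise RuntimeError("unavailable during partition")
        return self.value  # may be stale under "AP"

def write(a, b, value, partitioned):
    a.value = value
    if not partitioned:
        b.value = value  # replication succeeds only when connected

a, b = Replica(), Replica()
write(a, b, 1, partitioned=False)
write(a, b, 2, partitioned=True)   # the update reaches only replica a

print(a.read(True, "AP"))          # 2
print(b.read(True, "AP"))          # 1 -- stale: available, not consistent
try:
    b.read(True, "CP")             # consistent by refusing to answer
except RuntimeError as e:
    print(e)                       # unavailable during partition
```

Neither choice is "wrong"; it is exactly the C-vs-A tradeoff under partition that the theorem forces.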
3. Definition
In simple terms: in an asynchronous network where messages may be lost (partition tolerance), it is impossible to implement a service that both provides consistent data and responds eventually to every request (availability) under every pattern of message loss.
4. Consistency:
• Data is consistent and the same for all nodes.
• All the nodes in the system see the same state of the data.
5. Availability:
• Every request to a non-failing node is processed and receives a response, whether it failed or succeeded.
8. In simple words:
● Consistency & Availability = some guarantees about data loss
● Consistency & Partition tolerance = scaling
Why do we need to care about this?
10. Jepsen (http://jepsen.io/)
• Black-box systems testing: bugs reproduced in Jepsen are observable in production, not theoretical. But tests are nondeterministic, and they cannot prove correctness, only find errors.
• Testing under distributed-systems failure modes: faulty networks, unsynchronized clocks, and partial failure. Most test suites only evaluate the behavior of healthy clusters.
• Generative testing: the framework constructs random operations, applies them to the system, and builds a concurrent history of their results. That history is checked against a model to establish its correctness. Generative (or property-based) tests often reveal edge cases with subtle combinations of inputs.
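A minimal sketch of the generative-testing idea (this is not Jepsen itself; the dict-backed "system", the reference model, and the seed are all illustrative assumptions): apply random operations to both the system under test and a trivially correct model, and flag any divergence as a counterexample.

```python
# Generative (property-based) testing sketch: random ops are applied
# to the system under test and to a reference model; a read that
# disagrees with the model is a found bug. Here the "system" is just
# a dict standing in for a key-value store.
import random

def generative_test(seed, n_ops=100):
    rng = random.Random(seed)
    system, model = {}, {}          # system under test vs. reference model
    for _ in range(n_ops):
        op = rng.choice(["write", "read"])
        key = rng.choice("abc")
        if op == "write":
            val = rng.randint(0, 9)
            system[key] = val
            model[key] = val
        else:
            # property: every read from the system matches the model
            if system.get(key) != model.get(key):
                return False        # counterexample found
    return True                     # no error found (not a proof!)

print(generative_test(42))          # True
```

As the slide notes, a passing run proves nothing; it only means this random schedule found no error.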
13. RDBMS (again, theory)
• Standardized with SQL
• Ubiquitous – widely used and understood
• Supports transactions
• High availability is achieved via replication:
• Master – Master
• Master – Slave
• Synchronous / Asynchronous
14. Why RDBMS is AC: ACID
Atomicity of an operation (transaction)
• "All or nothing" – if any part fails, the entire transaction fails.
Consistency
• The database will remain in a valid state after the transaction.
• Means adhering to the database rules (keys, uniqueness, etc.)
Isolation
• Two simultaneous transactions cannot interfere with each other (they execute as if run sequentially).
Durability
• Once a transaction is committed, it remains so indefinitely, even after power loss or crash (no caching).
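Atomicity is easy to see in a single-node database. The sketch below uses Python's built-in sqlite3 (a stand-in for any RDBMS; the `accounts` table is illustrative): a transaction that debits an account and then hits a constraint violation is rolled back entirely, leaving no partial effect.

```python
# "All or nothing": if one statement in a transaction fails, the
# rollback undoes everything the transaction did so far.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
con.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
con.commit()

try:
    con.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
    con.execute("INSERT INTO accounts VALUES (1, 999)")  # violates PRIMARY KEY
    con.commit()
except sqlite3.IntegrityError:
    con.rollback()  # the partial debit is undone too

balance = con.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)  # 100 -- the failed transaction left no trace
```

On a single node this is cheap; the next slide is about why the same guarantee gets expensive once it must hold across machines.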
15. ACID in Distributed Systems
• Proved problematic in big distributed systems.
• How do we guarantee ACID properties?
• Atomicity requires more thought – e.g. two-phase commit (and three-phase commit, Paxos…)
• Isolation requires holding all locks for the entire transaction duration – high lock contention!
• Complex
• Prone to failure – the algorithm must handle it (failure = outage during a write).
• Comes with high commit overhead.
17. Does it mean that we can't scale RDBMS out of the box?
18. But we have PG cluster!
But in a PG cluster only one node can write.
According to Amazon research, 2PC commit brings ~5% overhead for the master node, plus network delay and replica delay. So the cluster can only balance reads, via pgpool.
PG cluster is not about balancing load (at least not write load).
Okay, at least we have ACID. Right?
Well… almost. Even though the Postgres server is always consistent, the distributed system composed of the server and client together may not be consistent. It's possible for the client and server to disagree about whether or not a transaction took place.
19. PG cluster
Postgres' commit protocol, like most relational databases, is a special case of
two-phase commit, or 2PC. In the first phase, the client votes to commit (or
abort) the current transaction, and sends that message to the server. The server
checks to see whether its consistency constraints allow the transaction to
proceed, and if so, it votes to commit. It writes the transaction to storage and
informs the client that the commit has taken place (or failed, as the case may
be.) Now both the client and server agree on the outcome of the transaction.
What happens if the message acknowledging the commit is dropped before the
client receives it? Then the client doesn’t know whether the commit succeeded
or not! The 2PC protocol says that we must wait for the acknowledgement
message to arrive in order to decide the outcome. Waiting forever isn’t realistic
for real systems, so at some point the client will time out and declare an error
occurred. The commit protocol is now in an indeterminate state.
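The indeterminate state is worth simulating, because it surprises people: a timeout on commit does not mean the commit failed. A toy sketch (the function and its return values are illustrative, not Postgres APIs):

```python
# The server commits, but the acknowledgement may be lost in transit.
# The client's timeout then tells it nothing about the outcome: the
# transaction may or may not have taken place.

def run_transaction(ack_lost):
    server_committed = True                      # server applies the commit
    # Client sees the ack only if it arrives; otherwise it times out
    # and can report neither success nor failure.
    client_view = "unknown" if ack_lost else "committed"
    return server_committed, client_view

print(run_transaction(ack_lost=False))   # (True, 'committed')
print(run_transaction(ack_lost=True))    # (True, 'unknown') -- disagreement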
20. PG cluster + Jepsen + Withdraw
example
https://aphyr.com/posts/282-jepsen-postgres
21. But we have pg_shard for scaling load
https://www.citusdata.com/citus-products/pg-shard/pg-shard-quick-start-guide
Yes, but Postgres with pg_shard is not ACID!
Limitations:
• No transactional semantics for queries that span multiple shards. For example, you're a financial institution and you sharded your data based on customer_id. You'd now like to withdraw money from one customer's account and credit it to another's, in a single transaction block.
• No unique constraints on columns other than the partition key, and no foreign key constraints.
• Distributed JOINs also aren't supported in pg_shard.
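The first limitation follows directly from how hash sharding routes rows. A toy sketch (plain Python; `shard_for`, the modulo "hash", and the shard count are illustrative assumptions, not pg_shard internals): the two accounts in a transfer usually land on different shards, and a single local transaction cannot cover both.

```python
# Hash sharding on customer_id: each id maps to one shard. A transfer
# between two customers usually touches two shards, which would need
# distributed coordination (2PC etc.) that pg_shard does not provide
# for modifications.
N_SHARDS = 4

def shard_for(customer_id):
    return customer_id % N_SHARDS   # stand-in for the real hash function

def transfer(src, dst):
    shards = {shard_for(src), shard_for(dst)}
    if len(shards) == 1:
        return "single-shard transaction: OK"
    return f"spans shards {sorted(shards)}: no transactional semantics"

print(transfer(8, 12))   # both map to shard 0 -> single-shard transaction: OK
print(transfer(7, 12))   # shards 3 and 0 -> cannot be one transaction
```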
22. pg_shard
Frequently Asked Questions
How does pg_shard handle INSERT/UPDATE/DELETE commands?
pg_shard requires that any modifications (INSERTs, UPDATEs, or DELETEs) involve exactly one shard. In the UPDATE and DELETE case, this means commands must include a WHERE qualification on the partition column that restricts the query to a single shard. Such qualifications usually take the form of an equality clause on the table's partition column.
As for INSERT commands, the partition column of the row being inserted must be specified using an expression that can be reduced to a constant. For instance, a value such as 3, or even char_length('bob'), would be suitable, though rand() would not. In addition, INSERT commands must specify exactly one row to be inserted. Note that the above restriction implies that commands similar to "INSERT INTO table SELECT col_one, col_two FROM other_table" are not currently supported.
From an implementation standpoint, pg_shard determines the shard involved in a given INSERT, UPDATE, or DELETE command and then rewrites the SQL of that command to reference the shard table. The rewritten SQL is then sent to the placements for that shard to complete processing of the command.
How exactly does pg_shard distribute my data?
Rather than using hosts as the unit of distribution, pg_shard creates many small shards and places them across many hosts in a round-robin fashion. For example, a user might have eight hosts in their cluster but 256 shards with a replication factor of two. Shard one would be created on hosts A and B, shard two on B and C, and so forth.
The advantage of this approach is that the additional load incurred after a host failure is spread among many other hosts instead of falling entirely on a single replica.
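The placement scheme in the example above can be sketched in a few lines (a simplification of pg_shard's actual metadata handling; the consecutive-host rule mirrors the "shard one on A and B, shard two on B and C" example):

```python
# 256 shards, replication factor 2, 8 hosts: shard i's replicas go on
# consecutive hosts, wrapping around, so replicas spread evenly.
from collections import Counter

HOSTS, SHARDS, RF = 8, 256, 2
names = "ABCDEFGH"

placement = {s: [names[(s + r) % HOSTS] for r in range(RF)]
             for s in range(SHARDS)}

per_host = Counter(h for hosts in placement.values() for h in hosts)
print(placement[0], placement[1])   # ['A', 'B'] ['B', 'C']
print(per_host["A"])                # 64 -- 512 replicas spread over 8 hosts
```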
23. But MySQL Galera has a master-master cluster approach!
Multi-master replication means that applications update the same tables on different masters, and the changes replicate automatically between those masters.
Row-Based Replication to Avoid Data Drift
Replication depends on deterministic updates – a transaction that changes 10 rows on the original master should change exactly the same rows when it executes against a replica. Unfortunately, many SQL statements that are deterministic in master/slave replication are non-deterministic in multi-master topologies. Consider the following example, which gives a 10% raise to employees in department #35.
UPDATE emp SET salary = salary * 1.1 WHERE dep_id = 35;
If all masters add employees, then the number of employees who actually get the raise will vary depending on whether such additions have replicated to all masters. Your servers will very likely become inconsistent with statement replication. The fix is to enable row-based replication using binlog-format=row in my.cnf. Row replication transfers the exact row updates from each master to the others and eliminates ambiguity.
But this reduces performance dramatically.
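The drift described above can be simulated in a few lines (a toy model, not MySQL; the in-memory `emp` tables and the integer 10% raise are illustrative assumptions):

```python
# Two masters hold the emp table as {emp_id: (dep_id, salary)}.
# An INSERT on master A has not yet replicated to B when the raise
# statement runs. Statement replication re-executes the UPDATE on
# each master; row replication ships A's exact changed rows.

def raise_dep35_statement(table):
    # statement-based: re-run "salary * 1.1 WHERE dep_id = 35"
    # (10% raise done in integers to keep the toy example exact)
    return {eid: (dep, salary * 11 // 10 if dep == 35 else salary)
            for eid, (dep, salary) in table.items()}

master_a = {1: (35, 1000), 2: (35, 2000)}   # row 2 just inserted on A
master_b = {1: (35, 1000)}                  # ...not yet replicated to B

a2 = raise_dep35_statement(master_a)
b2 = raise_dep35_statement(master_b)
print(a2[2])            # (35, 2200) -- employee 2 got the raise on A only
# Row 2 later replicates to B with its pre-raise salary: drift.

# Row-based replication instead ships A's exact post-update rows:
master_b_row_based = dict(master_b)
master_b_row_based.update(a2)
print(master_b_row_based == a2)   # True -- the masters agree
```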
24. MySQL Galera
Prevent Key Collisions on INSERTs
For applications that use auto-increment keys, MySQL offers a useful trick to ensure that such keys do not collide between masters, using the auto-increment-increment and auto-increment-offset parameters in my.cnf. The following example ensures that auto-increment keys start at 1 and increment by 4 to give values like 1, 5, 9, etc. on this server.
server-id=1
auto-increment-offset = 1
auto-increment-increment = 4
This works as long as your applications use auto-increment keys faithfully. However, any table that either does not have a primary key, or where the key is not an auto-increment field, is suspect. You need to hunt them down and ensure the application generates a proper key that does not collide across masters, for example using UUIDs or by putting the server ID into the key. Here is a query on the MySQL information schema to help locate tables that do not have an auto-increment primary key.
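The offset/increment scheme above can be sketched outside MySQL to see why collisions are impossible (the `keygen` helper is illustrative): each of up to 4 masters draws from its own residue class modulo 4.

```python
# Server i generates keys offset_i + k * increment, so two servers
# with different offsets (mod increment) can never produce the same key.
from itertools import count, islice

def keygen(offset, increment=4):
    return (offset + k * increment for k in count())

server1 = list(islice(keygen(1), 4))   # [1, 5, 9, 13]
server2 = list(islice(keygen(2), 4))   # [2, 6, 10, 14]
print(server1, server2)
print(set(server1) & set(server2))     # set() -- disjoint by construction
```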
25. MySQL Galera
Semantic Conflicts in Applications
MySQL replication cannot resolve conflicts; you need to avoid them in your applications. Here are a few tips as you go about this.
First, avoid obvious conflicts. These include inserting data with the same keys on different masters (described above), updating rows in two places at once, or deleting rows that are updated elsewhere. Any of these can cause errors that will break replication or cause your masters to become out of sync. The good news is that many of these problems are not hard to detect and eliminate using properly formatted transactions. The bad news is that these are the easy conflicts. There are others that are much harder to address.
For example, accounting systems need to generate unbroken sequences of numbers for invoices. A common approach is to use a table that holds the next invoice number and increment it in the same transaction that creates a new invoice. Another accounting example is reports that need to read the value of accounts consistently, for example at monthly close. Neither example works off-the-shelf in a multi-master system with asynchronous replication, as both require some form of synchronization to ensure global consistency across masters. The same goes for the salary and balance examples above. These and other such cases may force substantial application changes. Some applications simply do not work with multi-master topologies for this reason.
26. MySQL Galera
Have a Plan for Sorting Out Mixed Up Data
Master/slave replication has its discontents, but at least sorting out messed up replicas is simple: re-provision from another slave
or the master. Not so with multi-master topologies: you can easily get into a situation where all masters have transactions you
need to preserve and the only way to sort things out is to track down differences and update masters directly. Here are some
thoughts on how to do this.
1. Ensure you have tools to detect inconsistencies. Tungsten has built-in consistency checking with the 'trepctl check'
command. You can also use the Percona Toolkit pt-table-checksum to find differences. Be forewarned that neither of
these works especially well on large tables and may give false results if more than one master is active when you run them.
2. Consider relaxing foreign key constraints. I love foreign keys because they keep data in sync. However, they can also
create problems for fixing messed up data, because the constraints may break replication or make it difficult to go table-
by-table when synchronizing across masters. There is an argument for being a little more relaxed in multi-master settings.
3. Switch masters off if possible. Fixing problems is a lot easier if you can quiesce applications on all but one master.
4. Know how to fix data. Being handy with SQL is very helpful for fixing up problems. I find SELECT INTO OUTFILE and LOAD
DATA INFILE quite handy for moving changes between masters. Don't forget SET SESSION SQL_LOG_BIN=0 to keep
changes from being logged and breaking replication elsewhere. There are also various synchronization tools like
pt-table-sync, but I do not know enough about them to make recommendations.
5. At this point it's probably worth mentioning commercial support. Unless you are a replication guru, it is very comforting to
have somebody to call when you are dealing with messed up masters. Even better, expert advice early on can help you
avoid problems in the first place.
27. MySQL Galera + Jepsen + Withdraw
https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
Imagine a system of two bank accounts, each with a balance of
$10.
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE
set autocommit=0
select * from accounts where id = 0
select * from accounts where id = 1
UPDATE accounts SET balance = 8 WHERE id = 0
UPDATE accounts SET balance = 12 WHERE id = 1
COMMIT
28. MySQL Galera + Jepsen + Withdraw
Case 1: T1 commits before T2’s start time. Operations from T1 and T2 cannot
interleave, by Lemma 1, because their intervals do not overlap.
Case 2: T1 and T2 operate on disjoint sets of accounts. They serialize trivially.
Case 3: T1 and T2 operate on intersecting sets of accounts, and T1 commits before T2
commits. Then T1 wrote data that T2 also wrote, and committed in T2’s interval,
which violates First-committer-wins. T2 must abort.
Case 4: T1 and T2 operate on intersecting sets of accounts, and T1 commits after T2
commits. Then T2 wrote data that T1 also wrote, and committed in T1’s interval,
which violates First-committer-wins. T1 must abort.
29. MySQL Galera + Jepsen + Withdraw
Read-only transactions trivially serialize with one another. Do they serialize with
respect to transfer transactions? The answer is yes: since every read-only transaction
sees only committed data in a Snapshot Isolation system, and commits no data itself,
it must appear to take place atomically at some time between other transactions.
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE
set autocommit=0
select * from accounts
COMMIT
31. MySQL Galera conclusion
The transfer transactions should have kept the total amount of
money at $20, but by the end of the test the totals all sum to
$22. In another run, 25% of the funds in the system
mysteriously vanished. These results remain stable after all other
transactions have ended: they are not a concurrency anomaly.
Dirty reads!
No first-committer-wins, no snapshot isolation. No snapshot
isolation, well… I’m not sure exactly what Galera does
guarantee.
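The anomaly above can be reproduced in miniature. The sketch below (plain Python, not Galera) shows how money appears when two transfers commit against the same snapshot and no first-committer-wins check aborts the second one:

```python
# Sketch: lost update when two transactions both commit against the same
# snapshot and no first-committer-wins rule aborts the second writer.
accounts = {0: 10, 1: 10}

snap1 = dict(accounts)   # T1's snapshot
snap2 = dict(accounts)   # T2's snapshot of the SAME state

# T1: move $2 from account 0 to account 1, computed on its snapshot.
t1_writes = {0: snap1[0] - 2, 1: snap1[1] + 2}
# T2: withdraw $3 from account 0, also computed on the stale snapshot.
t2_writes = {0: snap2[0] - 3}

# Under snapshot isolation T2 must abort (it wrote a row T1 also wrote,
# inside T1's interval).  If both commit anyway:
accounts.update(t1_writes)
accounts.update(t2_writes)

print(accounts)                # {0: 7, 1: 12}
print(sum(accounts.values()))  # 19, but correct bookkeeping says 17
```

Withdrawing $3 and moving $2 should leave {0: 5, 1: 12}, a total of $17; instead T1's debit is overwritten and $2 materializes out of nowhere.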
Master-master replication works for append-only DBs.
http://scale-out-blog.blogspot.com/2012/04/if-you-must-deploy-multi-master.html
http://www.onlamp.com/2016/04/20/advanced-mysql-
32.
33. We know that
Instagram uses Postgres,
Pinterest uses MySQL!
True!
https://engineering.pinterest.com/blog/sharding-pinterest-how-we-scaled-our-mysql-fleet
>>In 2011, we hit traction. By some estimates, we were growing
faster than any other previous startup. Around September
2011, every piece of our infrastructure was over capacity. We
had several NoSQL technologies, all of which eventually broke
catastrophically. We also had a boatload of MySQL slaves we
were using for reads, which caused lots of irritating bugs,
especially with caching.
34. Pinterest
How we sharded
Whatever we were going to build needed to meet our needs and be stable, performant and repairable. In other
words, it needed to not suck, and so we chose a mature technology as our base to build on, MySQL. We
intentionally ran away from auto-scaling newer technology like MongoDB, Cassandra and Membase, because
their maturity was simply not far enough along (and they were crashing in spectacular ways on us!).
Aside: I still recommend startups avoid the fancy new stuff — try really hard to just use MySQL. Trust me. I
have the scars to prove it.
MySQL is mature, stable and it just works. Not only do we use it, but it’s also used by plenty of other
companies pushing even bigger scale. MySQL supports our need for ordering data requests, selecting certain
ranges of data and row-level transactions. It has a hell of a lot more features, but we don’t need or use them.
But, MySQL is a single box solution, hence the need to shard our data. Here’s our solution:
We started with eight EC2 servers running one MySQL instance each:
35. Pinterest
How we sharded
So how do we distribute our data to these shards?
We created a 64 bit ID that contains the shard ID, the type of the containing data, and where this data is in the
table (local ID). The shard ID is 16 bits, type ID is 10 bits and local ID is 36 bits. The savvy additionology
experts out there will notice that only adds to 62 bits. My past in compiler and chip design has taught me
that reserve bits are worth their weight in gold. So we have two (set to zero).
ID = (shard ID << 46) | (type ID << 36) | (local ID<<0)
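The packing formula above can be sketched in a few lines of Python (function names and example values are illustrative, not Pinterest's code):

```python
# Sketch of the 64-bit ID layout described above:
# 2 reserved bits | 16-bit shard ID | 10-bit type ID | 36-bit local ID.

SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36

def pack_id(shard_id, type_id, local_id):
    assert shard_id < (1 << SHARD_BITS)
    assert type_id < (1 << TYPE_BITS)
    assert local_id < (1 << LOCAL_BITS)
    return (shard_id << 46) | (type_id << 36) | local_id

def unpack_id(packed):
    shard_id = (packed >> 46) & ((1 << SHARD_BITS) - 1)
    type_id = (packed >> 36) & ((1 << TYPE_BITS) - 1)
    local_id = packed & ((1 << LOCAL_BITS) - 1)
    return shard_id, type_id, local_id

pid = pack_id(3429, 1, 7075733)
print(unpack_id(pid))  # (3429, 1, 7075733)
```

Because the shard ID is embedded in every object ID, routing a lookup to the right MySQL instance is a pure bit-shift, with no lookup table to keep consistent.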
36. RabbitMQ
RabbitMQ is a distributed message queue,
and is probably the most popular open-source implementation
of the AMQP messaging protocol. It supports a wealth of
durability, routing, and fanout strategies, and combines excellent
documentation with well-designed protocol extensions.
38. RabbitMQ cluster + CAP
According to the table there is a choice between CP and CA, but in real
life CP means losing data
from http://www.rabbitmq.com/partitions.html
RabbitMQ clusters do not tolerate network partitions well. If you
are thinking of clustering across a WAN, don't. You should use
federation or the shovel instead.
However, sometimes accidents happen.
RabbitMQ stores information about queues, exchanges, bindings
etc in Erlang's distributed database, Mnesia.
39. RabbitMQ cluster and partitions
RabbitMQ also offers three ways to deal with network partitions automatically: pause-minority mode, pause-
if-all-down mode and autoheal mode. (The default behaviour is referred to as ignore mode).
In pause-minority mode RabbitMQ will automatically pause cluster nodes which determine themselves to be in
a minority (i.e. fewer or equal than half the total number of nodes) after seeing other nodes go down. It
therefore chooses partition tolerance over availability from the CAP theorem. This ensures that in the event of
a network partition, at most the nodes in a single partition will continue to run. The minority nodes will pause
as soon as a partition starts, and will start again when the partition ends.
In pause-if-all-down mode, RabbitMQ will automatically pause cluster nodes which cannot reach any of the
listed nodes. In other words, all the listed nodes must be down for RabbitMQ to pause a cluster node. This is
close to the pause-minority mode, however, it allows an administrator to decide which nodes to prefer, instead
of relying on the context. For instance, if the cluster is made of two nodes in rack A and two nodes in rack B,
and the link between racks is lost, pause-minority mode will pause all nodes. In pause-if-all-down mode, if the
administrator listed the two nodes in rack A, only nodes in rack B will pause. Note that it is possible the listed
nodes get split across both sides of a partition: in this situation, no node will pause. That is why there is an
additional ignore/autoheal argument to indicate how to recover from the partition.
In autoheal mode RabbitMQ will automatically decide on a winning partition if a partition is deemed to have
occurred, and will restart all nodes that are not in the winning partition. Unlike pause_minority mode it
therefore takes effect when a partition ends, rather than when one starts.
The winning partition is the one which has the most clients connected (or if this produces a draw, the one with
the most nodes; and if that still produces a draw then one of the partitions is chosen in an unspecified way).
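The decision rules above can be sketched as plain functions (a model of the documented behaviour, not RabbitMQ code; the tie-breaking in the "unspecified" case is arbitrary here):

```python
# Model of RabbitMQ's documented partition-handling rules (not its code).

def pause_minority(partition_size, total_nodes):
    """A node pauses if its partition holds half the nodes or fewer."""
    return partition_size * 2 <= total_nodes

def autoheal_winner(partitions):
    """partitions: list of (client_count, node_count), one per partition.
    Most clients wins; ties broken by node count; then arbitrarily."""
    return max(range(len(partitions)),
               key=lambda i: (partitions[i][0], partitions[i][1]))

# A 3-node cluster splits 2/1: the lone node pauses, the pair keeps running.
print(pause_minority(1, 3), pause_minority(2, 3))  # True False

# Autoheal: partition 1 has more clients, so partition 0 gets restarted.
print(autoheal_winner([(4, 2), (9, 1)]))  # 1
```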
40. How to scale?
Federation
Federation allows an exchange or queue on one broker to receive messages published to an exchange or queue on another (the
brokers may be individual machines, or clusters). Communication is via AMQP (with optional SSL), so for two exchanges or queues
to federate they must be granted appropriate users and permissions.
Federated exchanges are connected with one way point-to-point links. By default, messages will only be forwarded over a
federation link once, but this can be increased to allow for more complex routing topologies. Some messages may not be
forwarded over the link; if a message would not be routed to a queue after reaching the federated exchange, it will not be
forwarded in the first place.
Federated queues are similarly connected with one way point-to-point links. Messages will be moved between federated queues
an arbitrary number of times to follow the consumers.
Typically you would use federation to link brokers across the internet for pub/sub messaging and work queueing.
The Shovel
Connecting brokers with the shovel is conceptually similar to connecting them with federation. However, the shovel works at a
lower level.
Whereas federation aims to provide opinionated distribution of exchanges and queues, the shovel simply consumes messages
from a queue on one broker, and forwards them to an exchange on another.
Typically you would use the shovel to link brokers across the internet when you need more control than federation provides.
41. How to scale?
Horizontally!
We suggest a simpler way of scaling than federation or the shovel:
just start N independent clusters (as with MySQL or Postgres):
[Diagram: N independent RabbitMQ clusters, each fronted by its own gateways and served by its own backends]
44. Redis fast?
Exceptionally fast: Redis is very fast and can perform about
110,000 SETs per second and about 81,000 GETs per second (one
thread).
1. Operations are atomic: all Redis operations are atomic,
which ensures that if two clients access the server concurrently,
both will get the updated value. (Discuss CAS in Java.)
45. Redis fast?
Access by key is O(1); access by score is O(log(N)). For numerical
members, the value is the score. For string members, the score is
a hash of the string.
46. Redis scalable?
Yes!
Thanks to the simple data format (key -> value), where every
entry is located by hashing its key, it is very simple to shard by
hash range or by key range, with no additional effort compared
to, for example, MongoDB (speak about MongoDB indexes).
47. Redis scalable?
Yes!
Thanks to the simple data format (key -> value), where every entry is located by hashing its key, it is very
simple to shard by hash range or by key range, with no additional effort compared to, for example,
MongoDB (speak about MongoDB indexes).
approaches:
1. crc32: Proxy-assisted partitioning means that our clients send requests to a proxy that is able to speak
the Redis protocol, instead of sending requests directly to the right Redis instance. The proxy will make
sure to forward our request to the right Redis instance according to the configured partitioning schema,
and will send the replies back to the client. The Redis and Memcached proxy Twemproxy implements
proxy-assisted partitioning.
2. Redis Cluster: Query routing means that you can send your query to a random instance, and the instance
will make sure to forward your query to the right node. Redis Cluster implements a hybrid form of query
routing, with the help of the client (the request is not directly forwarded from one Redis instance to
another, but the client gets redirected to the right node).
Discuss how to configure this! & Presharding
http://redis.io/topics/cluster-tutorial
http://redis.io/topics/partitioning
http://docs.spring.io/spring-data/redis/docs/current/reference/html/#redis:sentinel
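The crc32-based partitioning from approach 1 can be sketched in a few lines (shard names and the helper function are illustrative; Twemproxy offers several hash and distribution options):

```python
# Sketch of proxy-style hash partitioning: crc32(key) mod N picks the shard.
import zlib

SHARDS = ["redis-a", "redis-b", "redis-c", "redis-d"]

def shard_for(key):
    """Map a key to one of N Redis instances."""
    return SHARDS[zlib.crc32(key.encode()) % len(SHARDS)]

# The same key always routes to the same instance.
assert shard_for("user:42") == shard_for("user:42")

# Keys spread roughly evenly across the four shards.
counts = {s: 0 for s in SHARDS}
for i in range(10000):
    counts[shard_for("user:%d" % i)] += 1
print(counts)
```

The obvious cost of the mod-N scheme is resharding: changing N remaps most keys, which is why the Redis docs suggest presharding or consistent hashing for clusters expected to grow.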
48. What about HA
Redis offers asynchronous primary->secondary replication. A single server is chosen as
the primary, which can accept writes. It relays its state changes to secondary servers,
which follow along. Asynchronous means that you don’t have to wait for a write to be
replicated before the primary returns a response to the client.
1. Sentinel
Sentinel tries to establish a quorum between Sentinel nodes, agree on which Redis
servers are alive, and promote any which appear to have failed. If we colocate the
Sentinel nodes with the Redis nodes, this should allow us to promote a new primary in
the majority component (should one exist).
2. Redis cluster (discuss about slots)!
http://redis.io/topics/replication
http://redis.io/topics/sentinel
http://redis.io/topics/cluster-tutorial
http://redis.io/topics/sentinel-clients
50. Eureka (pure AP algorithm)
Once the server starts receiving traffic, all of the operations performed on the server are
replicated to all of the peer nodes that the server knows about. If an operation fails for some
reason, the information is reconciled on the next heartbeat, which also gets replicated between
servers.
When the Eureka server comes up, it tries to get all of the instance registry information from a
neighboring node. If there is a problem getting the information from a node, the server tries all of
the peers before it gives up. If the server is able to successfully get all of the instances, it sets the
renewal threshold it should be receiving based on that information. If at any time the renewals
fall below the configured percentage (below 85% within 15 minutes), the server stops
expiring instances to protect the current instance registry information.
This is called self-preservation mode and is primarily used as a protection in scenarios where
there is a network partition between a group of clients and the Eureka server. In these scenarios,
the server tries to protect the information it already has. In the case of a mass outage, this may
cause clients to get instances that do not exist anymore. The clients must make sure they are
resilient to the Eureka server returning an instance that is non-existent or unresponsive. The
best protection in these scenarios is to time out quickly and try other servers.
What we do in balancer, gateway (file service, rabbitmq), backends (rabbitmq)
In the case where the server is not able to get the registry information from a neighboring node,
it waits for a few minutes (5 minutes) so that the clients can register their information.
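The 85% rule can be written down as a one-line check (a model of the documented behaviour; in real Eureka both the threshold and the window are configurable, and the function name here is invented):

```python
# Model of Eureka's self-preservation check (not Netflix code).
RENEWAL_PERCENT_THRESHOLD = 0.85  # documented default

def self_preservation_active(renewals_last_window, expected_renewals):
    """Stop expiring instances when renewals drop below 85% of expected,
    e.g. because a partition cut off a group of clients."""
    return renewals_last_window < expected_renewals * RENEWAL_PERCENT_THRESHOLD

# 100 instances heartbeating twice per window -> 200 expected renewals.
print(self_preservation_active(150, 200))  # True: stop expiring registrations
print(self_preservation_active(190, 200))  # False: normal expiry continues
```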
51. Eureka (AP)
What happens during network outages between Peers?
In the case of network outages between peers, following things may happen
1. The heartbeat replications between peers may fail and the server detects this
situation and enters into a self-preservation mode protecting the current state.
2. The situation autocorrects itself after the network connectivity is restored to a
stable state. When the peers are able to communicate fine, the registration
information is automatically transferred to the servers that do not have them.
The bottom line is, during the network outages, the server tries to be as resilient as
possible, but there is a possibility of clients having different views of the servers during
that time
52. Zookeeper is based on ZAB, a Paxos-like consensus algorithm, and chooses consistency
That means it uses quorum transactions for sharing state and cannot
stay available on the minority side of a partition
While Eureka sends its entire state all the time
Transactions?
Eureka vs Zookeeper CAP
53. 1. Eureka integrates better with other NetflixOSS components
(Ribbon especially).
2. ZooKeeper is hard. We've gotten pretty good at it, but it
requires care and feeding.
https://tech.knewton.com/blog/2014/12/eureka-shouldnt-use-
zookeeper-service-discovery/
Eureka vs Zookeeper
57. Each Component Scaling Capability
Type             CAP                        Best for
Platform module  Independent; stateless     HA & Performance
Redis            CP                         Performance
Weave DNS        AP                         HA w/o consistency
Docker Swarm     CA                         HA
RabbitMQ         Queues replicated          HA & slight Performance
                 across nodes
Eureka           AP                         HA w/o consistency
Conf service     Stateless                  HA
58. Reminder
1. L1 cache reference 0.3 ns
2. Branch mispredict 3 ns
3. L2 cache reference 7 ns
4. Mutex lock/unlock 80 ns
5. Main memory reference 100 ns
6. Compress 1K bytes with Zippy 10,000 ns
7. Send 2K bytes over 1 Gbps network 20,000 ns
8. Read 1 MB sequentially from memory 250,000 ns
9. Round trip within same datacenter 500,000 ns
10. Read 1 MB sequentially from network 5,000,000 ns
11. Disk seek 10,000,000 ns
12. Read 1 MB sequentially from disk 30,000,000 ns
13. Send packet CA->Netherlands->CA 150,000,000 ns
59. Reminder 2
Ensure your design works if scale changes by 10X or 20X,
but remember that the right solution for X is often not optimal for 100X.
60. Eventual Consistency
Along with the CAP conjecture, Brewer suggested a new consistency model: BASE (Basically
Available, Soft state, Eventual consistency).
• The BASE model gives up on Consistency from the CAP theorem.
• This model is optimistic and accepts eventual consistency, in contrast to ACID: given enough
time, all nodes will become consistent and every request will return the same responses.
• Brewer points out that ACID and BASE are two extremes, and one can choose from a range of
options in balancing consistency and availability (consistency models).
• Basically Available - the system does guarantee availability, in terms of the CAP theorem. It is
always available, but subsets of data may become unavailable for short periods of time.
• Soft state - the state of the system may change over time, even without input. Data does not
have to be consistent.
• Eventual Consistency - the system will become consistent at some point in the future. ACID,
by contrast, enforces consistency immediately after every operation.