To get better replication speed and less lag, MySQL implements parallel replication in the same schema, also known as LOGICAL_CLOCK. But fully benefiting from this feature is not as simple as just enabling it.
In this talk, I explain in detail how this feature works. I also cover how to optimize parallel replication and the improvements made in MySQL 8.0 and back-ported to 5.7 (Write Sets), which greatly improve the potential for parallel execution on replicas (but require row-based replication, RBR).
Come to this talk to get all the details about MySQL 5.7 and 8.0 Parallel Replication.
Up to MySQL 5.5, replication was not crash safe: after an unclean shutdown, it would fail with “duplicate key” or “row not found” errors, or might generate silent data corruption. It looks like 5.6 is much better, right? The short answer is maybe: in the simplest case, it is possible to achieve replication crash safety, but it is not the default setting. MySQL 5.7 is not much better, 8.0 has better defaults, but it is still not replication crash-safe by default, and it is still easy to get things wrong.
Crash safety is impacted by replication positioning (File+Position or GTID), type (single-threaded or MTS), MTS settings (Database or Logical Clock, and with or without slave preserve commit order), the sync-ing of relay logs, the presence of binary logs, log-slave-updates and the sync-ing of binary logs. This is very complicated stuff and even the manual is sometimes confused about it.
In this talk, I will explain the impact of the above and help you find the path to crash safety nirvana. I will also give details about replication internals, so you might learn a thing or two.
MySQL Parallel Replication: inventory, use-case and limitations | Jean-François Gagné
In the last 24 months, MySQL/MariaDB replication speed has improved a lot thanks to parallel replication. MySQL and MariaDB have different types of parallel replication; in this talk, I present the different implementations, with their limitations and the corresponding tuning parameters. I cover what to do to make parallel replication faster and what to avoid for maximizing parallel replication benefits. I also present benchmark results from real Booking.com workloads. Finally, I discuss some deployments at Booking.com that take advantage of parallel replication speed improvements.
This tutorial covers all parallel replication implementations in MariaDB 10.0 and 10.1 and MySQL 5.6, 5.7 and 8.0 (including how it works in Group Replication).
MySQL and MariaDB have different types of parallel replication. In this tutorial, we present the different implementations that allow us to understand their limitations and tuning parameters. We cover how to make parallel replication faster and what to avoid for maximizing its benefits. We also present tests from Booking.com workloads.
Some of the subjects that are covered are group commit and optimistic parallel replication in MariaDB, the parallelism interval of MySQL and its Write Set optimization, and the “slowing down the master to speed up the slave” optimization.
After this tutorial, you will know everything you need to implement and tune parallel replication in your environment. But more importantly, we will show how you can test parallel replication benefit in a non-disruptive way before deployment.
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator | Jean-François Gagné
Of course there is no such thing as perfect service discovery, and we will see why in the talk. However, the way ProxySQL is deployed in this case minimizes the risk for split-brains, and this is why I qualify it as almost perfect. But let’s step back a little...
MySQL alone is not a high availability solution. To provide resilience to primary failure, other components need to be integrated with MySQL. At MessageBird, these additional components are ProxySQL and Orchestrator. In this talk, we describe how ProxySQL is architected to provide close to perfect Service Discovery and how this, combined with Orchestrator, allows for automatic failover. The talk presents the details of the integration of MySQL, ProxySQL and Orchestrator in Google Cloud (and it would be easy to re-implement a similar architecture at other cloud vendors or on-premises). We will also cover lessons learned from the 2 years this architecture has been in production. Come to this talk to learn more about MySQL high availability, ProxySQL and Orchestrator.
MySQL has multiple timeout variables to control its operations. This presentation focuses on the purpose of each timeout variable and how it can be used.
MySQL performance can be improved by tuning queries, server options, and hardware. Traditionally it was an area of responsibility for three different roles: Development, DBA, and System Administrators. Now DevOps handle all of these. But there is a gap. Knowledge gained by MySQL DBAs after years of focusing on a single product is hard to gain when you focus on more than one. This is why I am doing this session. I will show a minimal but most effective set of options to improve MySQL performance. For illustrations, I will use real user stories gained from my Support experience and Percona Kubernetes operators for PXC and MySQL.
Since 5.7.2, MySQL implements parallel replication in the same schema, also known as LOGICAL_CLOCK (DATABASE based parallel replication is also implemented in 5.6 but this is not covered in this talk). In early 5.7 versions, parallel replication was based on group commit (like MariaDB) and 5.7.6 changed that to intervals.
Intervals are more complicated but they are also more powerful. In this talk, I will explain in detail how they work and why intervals are better than group commit. I will also cover how to optimize parallel replication in MySQL 5.7 and what improvements are coming in MySQL 8.0.
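To make the interval idea concrete, here is a minimal sketch (not the server's actual scheduler). It assumes each transaction in the binary log carries the two numbers MySQL 5.7 writes there, `last_committed` and `sequence_number`, and that a replica may start a transaction once everything up to its `last_committed` has committed. The `Trx` class and `schedule_rounds` function are hypothetical names used for illustration; the sketch also assumes contiguous sequence numbers starting at 1, in-order dispatch, and workers that finish in lockstep rounds, which real workers do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trx:
    sequence_number: int  # position of the transaction in the binary log
    last_committed: int   # must be committed on the replica before this one starts


def schedule_rounds(trxs):
    """Group transactions into rounds that could run in parallel on a replica.

    Simplification (assumption): every transaction of a round commits before
    the next round is dispatched, and dispatch stops at the first transaction
    whose dependency is not yet satisfied.
    """
    pending = sorted(trxs, key=lambda t: t.sequence_number)
    rounds = []
    lwm = 0  # low-water mark: all sequence numbers <= lwm have committed
    while pending:
        batch = []
        for trx in pending:
            if trx.last_committed > lwm:
                break  # this transaction must wait, and so must those after it
            batch.append(trx.sequence_number)
        if not batch:
            raise ValueError("transaction depends on one that is not in the input")
        rounds.append(batch)
        lwm = batch[-1]  # the whole contiguous prefix has now committed
        pending = pending[len(batch):]
    return rounds


example = [Trx(1, 0), Trx(2, 0), Trx(3, 1), Trx(4, 2), Trx(5, 4)]
print(schedule_rounds(example))  # [[1, 2], [3, 4], [5]]
```

In this example, transaction 3 only depends on transaction 1, so with intervals it is not tied to a commit group boundary; the lockstep rounds above still understate the parallelism a real replica could reach.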
The presentation covers improvements made to the redo logs in MySQL 8.0 and their impact on MySQL performance and operations. It covers MySQL versions up to 8.0.30.
MySQL Database Monitoring: Must, Good and Nice to Have | Sveta Smirnova
It is very easy to find if a database installation is having issues. You only need to enable Operating System monitoring. A disk, memory, or CPU usage change will alert you about the problems. But they would not show *why* the trouble happens. You need the help of database-specific monitoring tools.
As a Support Engineer, I am always very upset when I handle complaints about database behavior that come without database-specific monitoring data, because I cannot help!
There are two reasons database and system administrators do not enable the necessary instrumentation. The first is a natural or expected performance impact. The second is a lack of knowledge of what needs to be enabled to resolve a particular issue.
In this talk, I will cover both concerns.
I will show which monitoring instruments will give information on what causes disk, memory, or CPU problems.
I will teach you how to use them.
I will uncover which performance impact these instruments have.
I will use both the MySQL command-line client and the open-source graphical tool Percona Monitoring and Management (PMM) for the examples.
Have you ever needed to get some additional write throughput from MySQL? If yes, you probably found that setting sync_binlog to 0 (and innodb_flush_log_at_trx_commit to 2) gives you an extra performance boost. Like all such easy optimisations, it comes at a cost. This talk explains how this tuning works, presents its consequences and makes recommendations to avoid them. This will bring us to the details of how MySQL commits transactions and how those are replicated to slaves. Come to this talk to learn how to get the benefit of this tuning the right way and to learn some replication internals.
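As a rough illustration of the cost side of this tuning, the sketch below flags the two relaxed settings in a dictionary of server variables. The variable names `sync_binlog` and `innodb_flush_log_at_trx_commit` are real MySQL settings; the `durability_risks` helper and the plain-dict input are hypothetical, and treating a missing key as 1 is an assumption of this sketch.

```python
def durability_risks(settings):
    """Return warnings for settings that trade durability for write throughput."""
    risks = []
    # sync_binlog = 1 syncs the binary log to disk at every commit;
    # 0 leaves the syncing to the operating system.
    if settings.get("sync_binlog", 1) != 1:
        risks.append("binary log is not synced to disk at every commit")
    # innodb_flush_log_at_trx_commit = 1 flushes the redo log at every commit;
    # 2 writes it at commit but flushes it to disk only about once per second.
    if settings.get("innodb_flush_log_at_trx_commit", 1) != 1:
        risks.append("InnoDB redo log is not flushed to disk at every commit")
    return risks


for warning in durability_risks({"sync_binlog": 0, "innodb_flush_log_at_trx_commit": 2}):
    print(warning)
```

With both settings relaxed, an operating system crash can lose recently committed transactions from the binary log, from InnoDB, or from both, which is where the replication consequences come from.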
Replication Troubleshooting in Classic VS GTID | Mydbops
This presentation will assist you in troubleshooting the most common MySQL replication issues, with a simple comparison of how they can be solved in the different replication methods (Classic VS GTID).
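MySQL DB Redundancy for a 24/7, 365-Day Service.
We look at the options for MySQL redundancy and talk about the concerns we ran into while operating them.
Contents
1. Why DB redundancy is needed
2. Redundancy options
- HW redundancy
- MySQL Replication redundancy
3. Failures when operating redundancy
4. DNS and VIP
5. Comparison of MySQL redundancy solutions
Audience
- Infrastructure engineers who run MySQL in service
- Developers interested in MySQL redundancy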
MySQL InnoDB Cluster - Advanced Configuration & Operations | Frederic Descamps
MySQL InnoDB Cluster is a very easy HA solution to deploy. However, it is also a very customizable solution able to respond to most needs. During this session I will give an overview of settings that you may tune, like those related to quorum loss and level of consistency, but also some you may not know, like how to change the recovery system or the effect of increasing the event horizon. We will also discuss maintenance operations like how to stream large transactions, how to deal with DDL in multi-primary environments...
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale | Colin Charles
As proxies (and database routers) go, the first one I ever used was the now deprecated MySQL Proxy. Since then, I've managed to use MariaDB MaxScale quite a bit (including its fork AirBnB MaxScale), played around with ProxySQL recently, and also started taking a look at MySQL Router. In this quick 20-minute overview, we'll discuss why these three exist, compare their features, and look at how to pick the right tool for the job.
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options which allow precisely tuning what to instrument. More than 100 consumers store collected data.
In this tutorial, we will try out all the important instruments. We will provide a test environment and a few typical problems which could hardly be solved without Performance Schema. You will not only learn how to collect and use this information but also gain hands-on experience with it.
Tutorial at Percona Live Austin 2019
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)... | Jean-François Gagné
Since 5.7.2, MySQL implements parallel replication in the same schema, also known as LOGICAL_CLOCK (DATABASE based parallel replication is also implemented in 5.6 but this is not covered in this talk). In early 5.7 versions, parallel replication was based on group commit (like MariaDB) and 5.7.6 changed that to intervals.
Intervals are more complicated but they are also more powerful. In this talk, I will explain in detail how they work and why intervals are better than group commit. I will also cover how to optimize parallel replication in MySQL 5.7 and what improvements are coming in MySQL 8.0. I will also explain why Group Replication is replicating faster than standard asynchronous replication.
Come to this talk to get all the details about MySQL 5.7 Parallel Replication.
MySQL Replication Evolution -- Confoo Montreal 2017 | Dave Stokes
MySQL Replication has evolved since the early days of simple async master/slave replication, adding better security, high availability, and now InnoDB Cluster.
Up to MySQL 5.5, replication was not crash safe: it would fail with “dup.key” or “not found” error (or data corruption). So 5.6 is better, right? Maybe: it is possible, but not the default. MySQL 5.7 is not much better, 8.0 has safer defaults but it is still easy to get things wrong.
Crash safety is impacted by positioning (File+Pos or GTID), type (single/multi-threaded), MTS settings (Db/Logical Clock, and preserve commit order), the sync-ing of relay logs, the presence of binlogs, log-slave-updates and their sync-ing. This is complicated and even the manual is confused about it.
In this talk, I will explain the above with details on replication internals, so you might learn a thing or two.
Up to MySQL 5.5, replication was not crash safe: after a crash, it would fail with "duplicate key" or "row not found" error, or might generate silent data corruption. It looks like 5.6 is much better, right? The short answer is maybe: in the simplest case, it is possible to achieve replication crash safety but it is not the default setting. MySQL 5.7 is not much better, 8.0 has safer defaults but it is still easy to get things wrong.
Crash safety is impacted by replication positioning (File+Pos or GTID), type (single-threaded or MTS), MTS settings (Database or Logical Clock, and with or without slave preserve commit order), the sync-ing of relay logs, the presence of binary logs, log-slave-updates and their sync-ing. This is very complicated stuff and even the manual is confused about it.
In this talk, I will explain the impact of the above and help you find the path to crash safety nirvana. I will also give details about replication internals, so you might learn a thing or two.
ConFoo MySQL Replication Evolution : From Simple to Group Replication | Dave Stokes
MySQL Replication has been around for many years, but how well do you understand it? Do you know about read/write splitting, RBR vs SBR style replication, and InnoDB Cluster?
[db tech showcase Tokyo 2014] B15: Scalability with MariaDB and MaxScale by ... | Insight Technology, Inc.
Scalability with MariaDB and MaxScale talks about MariaDB 10 and MaxScale, a pluggable router for your queries. These technologies were developed at MariaDB Corporation and made open source, and they will help scale your MariaDB and MySQL workloads.
MySQL Scalability and Reliability for Replicated Environment | Jean-François Gagné
You have a working application that is using MySQL: great! At the beginning, you are probably using a single database instance, and maybe – but not necessarily – you have replication for backups, but you are not reading from slaves yet. Scalability and reliability were not the main focus in the past, but they are starting to be a concern. Soon, you will have many databases and you will have to deal with replication lag. This talk will present how to tackle the transition.
We mostly cover standard/asynchronous replication, but we will also touch on Galera and Group Replication. We present how to adapt the application to become replication-friendly, which facilitates reading from and failing over to slaves. We also present solutions for managing read views at scale and enabling read-your-own-writes on slaves. We also touch on vertical and horizontal sharding for when deploying bigger servers is not possible anymore.
Are UNIQUE and FOREIGN KEYs still possible at scale, what are the downsides of AUTO_INCREMENTs, how to avoid overloading replication, what are the limits of archiving, … Come to this talk to get answers and to leave with tools for tackling the challenges of the future.
NoSQL on MySQL - MySQL Document Store by Vadim Tkachenko | Data Con LA
Abstract: Should you use SQL or a NoSQL engine? With MySQL Document Store you can do both. In this talk we will introduce MySQL Document Store and discuss its advantages and downsides compared to purpose-built document store database engines such as MongoDB.
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R... | Miguel Araújo
MySQL Webinar, presented on the 25th of April, 2024.
Summary:
MySQL solutions enable the deployment of diverse Database Architectures tailored to specific needs, including High Availability, Disaster Recovery, and Read Scale-Out.
With MySQL Shell's AdminAPI, administrators can seamlessly set up, manage, and monitor these solutions, ensuring efficiency and ease of use in their administration. MySQL Router, on the other hand, provides transparent routing from the application traffic to the backend servers in the architectures, requiring minimal configuration.
Completely built in-house and supported by Oracle, these solutions have been adopted by enterprises of all sizes for their business-critical applications.
In this presentation, we'll delve into various database architecture solutions to help you choose the right one based on your business requirements, focusing on technical details and the latest features to maximize the potential of these solutions.
As we go through the era of IoT, which involves far more and much bigger data, we need a faster database. MySQL 5.7 gives you 3x the speed of its predecessor and is able to reach 1.6M QPS on our SELECT benchmark.
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7) | Aurimas Mikalauskas
Is my MySQL server configured properly? Should I run Community MySQL, MariaDB, Percona or WebScaleSQL? How many innodb buffer pool instances should I run? Why should I NOT use the query cache? How do I size the innodb log file size and what IS that innodb log anyway? All answers are inside.
Aurimas Mikalauskas is a former Percona performance consultant and architect currently writing and teaching at speedemy.com. He's been involved with MySQL since 1999, scaling and optimizing MySQL backed systems since 2004 for companies such as BBC, EngineYard, famous social networks and small shops like EstanteVirtual, Pine Cove and hundreds of others.
Additional content mentioned in the presentation can be found here: http://speedemy.com/17
OSDC 2018 | Scaling & High Availability MySQL learnings from the past decade+... | NETWAYS
The MySQL world is full of tradeoffs and choosing a High Availability (HA) solution is no exception. This session aims to look at all of the alternatives in an unbiased manner. While the landscape will be covered, including but not limited to MySQL replication, MHA, DRBD, Galera Cluster, etc., the focus of the talk will be what is recommended for today, and what to look out for. Thus, this will include extensive deep-dive coverage of ProxySQL, semi-sync replication, Orchestrator, MySQL Router, and Galera Cluster variants like Percona XtraDB Cluster and MariaDB Galera Cluster. I will also touch on group replication.
Learn how we do this for our 4000+ customers!
MySQL Replication Update -- Zendcon 2016 | Dave Stokes
This presentation covers how MySQL replication works at both a conceptual level and a how-to-do-it level, plus information on other replication options like Group Replication and Multi Master.
Similar to MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK) (20)
You deployed automation, enabled automatic database master failover and tested it many times: great, you can now sleep at night without being paged by a failing server. However, when you wake up in the morning, things might not have gone the way you expect. This talk will be about such a surprise.
Once upon a time, a failure brought down a MySQL master database. Automation kicked in and fixed things. However, a fancy failure, combined with human errors, an edge-case recovery, and a lack of oversight in tooling and scripting, led to a split-brain and data corruption. This talk will go into details about the convoluted (but still real-world) sequence of events that led to this disaster. I cover what could have avoided the split-brain and what could have made data reconciliation easier.
MySQL/MariaDB Parallel Replication: inventory, use-case and limitations | Jean-François Gagné
In the last 24 months, MySQL/MariaDB replication speed has improved a lot, thanks to parallel replication. MySQL and MariaDB Server have different types of parallel replication; in this talk, I present the different implementations which will allow us to understand their limitations and tuning parameters. I am covering how to make parallel replication faster and what to avoid for maximizing its benefits. I also present benchmark results from Booking.com workloads. Finally, I discuss some deployments at Booking.com that take advantage of parallel replication speed improvements.
This short talk will be about an incident that kept DBAs working on a weekend. Two bugs, one in our application code and one in the database, joined forces and almost brought down Booking.com. And this occurred at one of the worst possible times. Curious about what happened? Come to this talk to learn more.
You’ve deployed automation, enabled automatic master failover and tested it many times: great, you can now sleep at night without being paged by a failing server. However, when you wake up in the morning, things might not have gone the way you expected. This talk will be about such a surprise.
Once upon a time, a failure brought down a master. Automation kicked in and fixed things. However, a fancy failure, combined with human errors, an edge-case recovery, and a lack of oversight in automation, led to a split-brain. This talk will go into details about the convoluted - but still real-world - sequence of events that led to this disaster. I will cover what could have avoided the split-brain and what could have made it easier to fix.
MySQL/MariaDB replication is asynchronous. You can make replication faster by using better hardware (faster CPU, more RAM, or quicker disks), or you can use parallel replication to remove its single-threaded limitation; but lag can still happen. This talk is not about making replication faster; it is about how to deal with its asynchronous nature, including the (in-)famous lag.
We will start by explaining the consequences of asynchronous replication and how/when lag can happen. Then, we will present the solution used at Booking.com to both avoid creating lag and minimize the consequences of stale reads on slaves (hint: this solution does not mean reading from the master because this does not scale).
Once all of the above is well understood, we will discuss how Booking.com’s solution can be improved: this solution was designed years ago and we would do this differently if starting from scratch today. Finally, I will present an innovative way to avoid lag: the no-slave-left-behind MariaDB patch.
MySQL Parallel Replication: inventory, use-case and limitations, by Jean-François Gagné
In the last 24 months, MySQL replication speed has improved a lot thanks to implementing parallel replication. MySQL and MariaDB have different types of parallel replication; in this talk, I present in detail the different implementations, with their limitations and the corresponding tuning parameters. I also present benchmark results from real Booking.com workloads. Finally, I discuss some deployments at Booking.com that benefit from parallel replication speed improvements.
MySQL Parallel Replication: inventory, use-cases and limitations, by Jean-François Gagné
In the last 24 months, MySQL replication speed has improved a lot thanks to implementing parallel replication. MySQL and MariaDB have different types of parallel replication; in this talk, I present in detail the different implementations, with their limitations and the corresponding tuning parameters (covering MySQL 5.6, MariaDB 10.0, MariaDB 10.1 and MySQL 5.7). I also present benchmark results from real Booking.com workloads. Finally, I discuss some deployments at Booking.com that benefit from parallel replication speed improvements.
Riding the Binlog: an in Deep Dissection of the Replication Stream, by Jean-François Gagné
Binary Logs are the cornerstone of MySQL Replication, but are they fully understood? To start apprehending this, we can think of the binary logs as a transport for a Stream of Transactions. Traveling from master to slave, sometimes via Intermediate Masters, this stream evolves: it can shrink by the application of filters, can grow by the addition of slave-local transactions, and two streams can merge by the usage of multi-source replication. After presenting the binary logs Stream Model, the different MySQL use-cases will be mapped to the model, which can serve as a validation of the model. After this validation, the model will be used to make predictions on new use-cases/features that could emerge in the future.
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
1. MySQL Parallel Replication:
all the 5.7 and 8.0 details
(LOGICAL_CLOCK)
by Jean-François Gagné
System and MySQL Expert at HubSpot
Presented at Percona Live Austin 2022
jfg.mysql AT gmail DOT com
Twitter: @jfg956
2. Benchmark Preview
(MySQL Parallel Replication Logical Clock – Percona Live Austin 2022)
Replication throughput (TPS) without and with parallel replication
3. Terminology [1 of 2]
MySQL Terminology Updates in 2020:
https://dev.mysql.com/blog-archive/mysql-terminology-updates/
• master / slave → primary / replica
• the master of a slave → the source of a replica
This presentation uses the above terminology, with the additions below:
• Multi-Threaded Slave (MTS) → Multi-Threaded Replication (MTR)
• intermediary master → intermediary source or non-primary source
(Exceptions for citing content from before the terminology change)
4. Terminology [2 of 2]
From MySQL 8.0.22 (not in 5.7), new commands were introduced:
• START/STOP SLAVE → START/STOP REPLICA
• And more…
• Some missing: START REPLICA UNTIL SQL_AFTER_MTS_GAPS
From 8.0.26 (not in 5.7 either), new variables were introduced:
• log_slave_updates → log_replica_updates
• slave_parallel_workers → replica_parallel_workers
• And more…
Because this talk applies to 5.7, and not everyone is on 8.0-latest,
this presentation uses old command and variable names
5. Summary
• Replication reminders
• Why parallel replication is important (and complicated)
• How to enable parallel replication and what is LOGICAL_CLOCK
• How parallel replication works on replicas
• How dependency tracking works on primaries
• Benchmarks
• Closing thoughts
6. MySQL Replication
• The primary records transactions in a journal (binary logs), and replicas:
• download the journal and save it locally in the relay logs (IO thread)
• execute the relay logs on their local database (SQL thread)
• can also write binlogs to be themselves a source (log-slave-updates)
7. Other Replication Concepts: Durability
Persistence / Durability (D of ACID):
• Is data lost after a crash?
(different types of crash: MySQL process (mysqld) or OS (Linux))
• Parameters controlling durability: sync_binlog and trx_commit
(trx_commit is short for innodb_flush_log_at_trx_commit)
• High Durability (HD): no data lost after crashes
(sync_binlog = 1 and trx_commit = 1)
(most expensive / slowest because of a flush to disk after each trx)
• No Durability (ND, sometimes called low durability)
(sync_binlog = 0 and trx_commit = 2)
(faster, with data lost on OS crash, but no loss on mysqld crash)
• (Data lost for mysqld crash in case trx_commit = 0 but not covered here)
8. Other Replication Concepts: Group Commit
Group Commit is an optimization making high durability faster
• Before Group Commit (GC), each trx needed its own disk flush
• With GC, many trx can be persisted in the same flush
→ better write throughput on primaries with concurrency
• (GC was introduced in MariaDB xx and then in MySQL 5.6)
• (Single-threaded replication cannot Group Commit)
• (MariaDB 10.0 introduced Replica Group Commit)
9. From Single to Multi-Threaded Replication
Without MTR, a single CPU is used for applying writes on replicas:
• on the primary, many CPUs are used for writes (one per session)
Without MTR, a single read IO can be “in flight” for writes on replicas:
• this is a bottleneck, especially for high latency disks (cloud)
• on the primary, many IOs can be in flight for writes (one per session)
Without MTR, no group commit on replicas:
• on the primary, a single flush can persist many trx in binlogs and redo logs
• on replicas, flushing each trx is a bottleneck
Multi-Threaded Replication is a solution to all the above problems
10. Challenges of Multi-Threaded Replication
But running transactions in parallel on replicas is not trivial:
• Updating a row cannot be run in parallel with the insert of this row
• MTR implementation needs to keep data consistency
Enabling MTR is also not straightforward:
• Many parameters and complex tuning
MTR also still has some rough edges:
• In 5.7, START SLAVE UNTIL not supported with MTR (ok from 8.0.0)
• The second-to-latest Percona Server 5.7 crashes on STOP SLAVE with MTR
(PS-8030: 5.7.27 to .36 affected and 8.0.17 to .21, Oracle MySQL not affected)
• I have also seen a deadlock with PS 5.7.36, and rumors of another crash
(to be investigated, unknown if both in Percona Server and Oracle MySQL)
11. Solutions for MTR Data Consistency
• MySQL 5.6: schema based parallel replication
• MariaDB 10.0: group commit based parallel replication
• MariaDB 10.1: optimistic parallel replication
• Improvement on 10.0, probably the fastest solution, but high CPU cost
• MySQL 5.7: same schema parallel replication (Logical Clock)
• Not as fast as optimistic, but not wasteful in CPU
• MySQL 8.0: Write-Set improvement (still Logical Clock)
• Lot to say, including need for RBR and back-ported in 5.7.22
This talk is about Logical Clock in 5.7 and 8.0
13. Enabling Parallel Replication
Set slave_parallel_type to LOGICAL_CLOCK
• This is the new default in 8.0.27
(was DATABASE before (schema-based solution of 5.6), including in 5.7)
(in 8.0.26, this variable is deprecated for dropping DATABASE support)
And set slave_preserve_commit_order to ON
• This is also the new default in 8.0.27 (was OFF before, including in 5.7)
(more about this parameter later in the talk)
And set slave_parallel_workers to 2 or more
• The new default is 4 in 8.0.27 (was 0 before, including in 5.7)
These enable MTR, but are not sufficient for best throughput
14. What is LOGICAL_CLOCK ?
From the MySQL Reference Manual for LOGICAL_CLOCK:
• Transactions that are part of the same binary log group commit on a
source are applied in parallel on a replica.
• But this is not 100% exact (Bug#85977)
When using LOGICAL_CLOCK, MTR schedules trx on the replica
using dependency tracking information from the binary logs:
• We will call this information Intervals
• By default, trx intervals are generated with Commit Timestamps
(which is close to, but not strictly the same as, group commit)
• Much better intervals can be generated with Write Set
(and Write Set is completely different from group commit)
15. Intervals
Each trx is tagged with two numbers in the binlogs (starting with 5.7):
• sequence_number: increasing id for each trx (not to be confused with GTID)
• last_committed: id of the latest trx on which this trx depends
The last_committed / sequence_number pair is the trx interval
Here is an example of intervals from a real system:
mysqlbinlog $f | awk '$11 ~ "last_committed"{print $1,$2,$11,$12}'
[...]
#170206 20:08:33 last_committed=6201 sequence_number=6203
#170206 20:08:33 last_committed=6203 sequence_number=6204
#170206 20:08:33 last_committed=6203 sequence_number=6205
#170206 20:08:33 last_committed=6203 sequence_number=6206
#170206 20:08:33 last_committed=6205 sequence_number=6207
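The same extraction can be sketched in Python. This is a minimal illustration, assuming the mysqlbinlog output has already been read into a list of lines; the name `parse_intervals` is hypothetical and not part of any MySQL tooling.

```python
import re

# Matches the two dependency-tracking fields printed by mysqlbinlog.
INTERVAL_RE = re.compile(r"last_committed=(\d+)\s+sequence_number=(\d+)")


def parse_intervals(lines):
    """Return a list of (last_committed, sequence_number) tuples."""
    intervals = []
    for line in lines:
        match = INTERVAL_RE.search(line)
        if match:
            intervals.append((int(match.group(1)), int(match.group(2))))
    return intervals


# The five lines from the example above.
sample = [
    "#170206 20:08:33 last_committed=6201 sequence_number=6203",
    "#170206 20:08:33 last_committed=6203 sequence_number=6204",
    "#170206 20:08:33 last_committed=6203 sequence_number=6205",
    "#170206 20:08:33 last_committed=6203 sequence_number=6206",
    "#170206 20:08:33 last_committed=6205 sequence_number=6207",
]

intervals = parse_intervals(sample)
print(intervals)
```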
16. Parallel Scheduling on Replicas [1 of 4]
A transaction can be started when its last_committed and all
previous transactions have been executed
The intervals below mean that:
• When transaction #6203 (and all previous) have been executed,
transactions #6204, #6205 and #6206 can be started
• When transaction #6205 (and all previous) have been executed,
transaction #6207 can be started
#170206 20:08:33 last_committed=6201 sequence_number=6203
#170206 20:08:33 last_committed=6203 sequence_number=6204
#170206 20:08:33 last_committed=6203 sequence_number=6205
#170206 20:08:33 last_committed=6203 sequence_number=6206
#170206 20:08:33 last_committed=6205 sequence_number=6207
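The scheduling rule above can be sketched as a small Python function. This is a simplified model of the rule as stated on this slide, not the actual replica scheduler; `runnable` is a hypothetical name, and the intervals are the ones listed above.

```python
# (last_committed, sequence_number) pairs from the example above.
intervals = [
    (6201, 6203),
    (6203, 6204),
    (6203, 6205),
    (6203, 6206),
    (6205, 6207),
]


def runnable(intervals, executed_up_to):
    """Sequence numbers that may start once every transaction up to and
    including `executed_up_to` has been executed."""
    return [
        seq
        for last_committed, seq in intervals
        if seq > executed_up_to and last_committed <= executed_up_to
    ]


after_6203 = runnable(intervals, 6203)  # 6204, 6205 and 6206 can start
after_6205 = runnable(intervals, 6205)  # 6207 can now start as well
print(after_6203, after_6205)
```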
17. Parallel Scheduling on Replicas [2 of 4]
Trx #6204 and #6205 (and #6206) can be run in parallel but #6205
could complete before #6204
slave_preserve_commit_order (SPCO) does what you expect:
• If SPCO is OFF, #6205 can commit before #6204
• If SPCO is ON, #6205 waits for #6204 completion before committing
SPCO to OFF generates data states that never existed on the primary
• Use with care (you probably want ON most of the time)
#170206 20:08:33 last_committed=6201 sequence_number=6203
#170206 20:08:33 last_committed=6203 sequence_number=6204
#170206 20:08:33 last_committed=6203 sequence_number=6205
#170206 20:08:33 last_committed=6203 sequence_number=6206
#170206 20:08:33 last_committed=6205 sequence_number=6207
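The effect of slave_preserve_commit_order can be illustrated with a toy Python model. It assumes that #6204, #6205 and #6206 start together and that each finishes executing at a made-up time; `commit_order` and the finish times are hypothetical and only serve the illustration.

```python
def commit_order(finish_times, preserve_commit_order):
    """finish_times maps sequence_number -> time at which execution ends.
    Returns the sequence numbers in the order in which they commit."""
    if preserve_commit_order:
        # Each trx waits for all earlier sequence numbers to commit first.
        return sorted(finish_times)
    # Otherwise a trx commits as soon as its own execution ends.
    return sorted(finish_times, key=lambda seq: (finish_times[seq], seq))


# Hypothetical execution end times: #6204 is the slowest of the three.
finish_times = {6204: 5, 6205: 2, 6206: 3}

spco_off = commit_order(finish_times, preserve_commit_order=False)
spco_on = commit_order(finish_times, preserve_commit_order=True)
print(spco_off, spco_on)
```

With SPCO OFF, the model commits #6205 and #6206 before #6204, which is the kind of state that never existed on the primary.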
18. Parallel Scheduling on Replicas [3 of 4]
SPCO to OFF generates data states that never existed on the primary
• Use with care (you probably want ON most of the time)
Also, SPCO to OFF generates temporary gaps in GTIDs
• For filling the gaps, use START SLAVE UNTIL SQL_AFTER_MTS_GAPS
SPCO to ON needed log_slave_updates until 8.0.19 (Bug#75396)
Before 5.7.33 and 8.0.23, SPCO to ON could deadlock (Bug#89247)
• If replication gets stuck, kill -9 mysqld (no data lost unless trx_commit = 0)
• Thanks to Percona / Venkatesh for fixing this, but…
• But I have recently seen deadlocks in PS 5.7.36: not fully fixed
19. Parallel Scheduling on Replicas [4 of 4]
When run in parallel, trx could fail to execute (below from the doc):
• If replication fails to execute a trx because of an InnoDB deadlock or
the trx's execution time exceeded innodb_lock_wait_timeout…
Such failed trx will be retried (below more from the doc):
• …it automatically retries slave_transaction_retries times before
stopping with an error.
The column COUNT_TRANSACTIONS_RETRIES from the P_S
table replication_applier_status indicates how many retries
have happened, and the table replication_applier_status_by_worker
contains details about the retries
20. Reminder: What is LOGICAL_CLOCK ?
From the MySQL Reference Manual for LOGICAL_CLOCK:
• Transactions that are part of the same binary log group commit on a
source are applied in parallel on a replica.
• But this is not 100% exact (Bug#85977)
When using LOGICAL_CLOCK, MTR schedules trx on the replica
using dependency tracking information from the binary logs:
• We will call this information Intervals
• By default, trx intervals are generated with Commit Timestamps
(which is close to, but not strictly the same as, group commit)
• Much better intervals can be generated with Write Set
(and Write Set is completely different from group commit)
21. Commit Timestamps [1 of 3]
Commit Timestamps (CTS) is the default for 5.7 and 8.0
(binlog_transaction_dependency_tracking = COMMIT_ORDER)
It is a dependency tracking method exploiting parallelism on the
source to generate the intervals:
• While running a DML, the current trx notes the last committed trx
(this is where last_committed from mysqlbinlog is coming from)
(this is close to Group Commit, but not strictly the same)
Without tuning, CTS does not produce efficient intervals
(by default, MTR will not give much better replication speed)
22. Commit Timestamps [2 of 3]
Below, the update id=3 ran while the update id=1 was not yet committed:
• Insert id=4 and update id=1 can run in parallel on a replica
• (Note that nothing commits at the same time → no group commit)
Session #1: CREATE TABLE t(id INT PRIMARY KEY, v INT);
Session #1: INSERT INTO t VALUES (1,1),(2,2),(3,3);
Session #2: BEGIN; UPDATE t SET v = 11 WHERE id = 1;
Session #3: UPDATE t SET v = 33 WHERE id = 3;
Session #1: INSERT INTO t VALUES (4,4);
Session #2: COMMIT;
#220410 11:21:47 last_committed=0 sequence_number=1 -- CREATE
#220410 11:21:49 last_committed=1 sequence_number=2 -- INSERT id=1,2,3
#220410 11:21:55 last_committed=2 sequence_number=3 -- UPDATE id=3
#220410 11:21:59 last_committed=3 sequence_number=4 -- INSERT id=4
#220410 11:22:03 last_committed=2 sequence_number=5 -- UPDATE id=1
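The intervals above can be reproduced with a toy Python model of Commit Timestamps. It follows the description on the previous slide (a trx notes the last committed trx while running its DML, and gets its sequence_number at commit). The class `CommitOrderTracker` is hypothetical and ignores many details of the real implementation.

```python
class CommitOrderTracker:
    """Toy model of COMMIT_ORDER dependency tracking on the source."""

    def __init__(self):
        # sequence_number of the latest committed trx (0 = none yet).
        self.last_sequence_number = 0

    def run_dml(self):
        """Called while a trx runs its DML: note the last committed trx."""
        return self.last_sequence_number

    def commit(self, noted_last_committed):
        """Assign the next sequence_number and return the trx interval."""
        self.last_sequence_number += 1
        return (noted_last_committed, self.last_sequence_number)


tracker = CommitOrderTracker()
binlog = []
binlog.append(("CREATE", tracker.commit(tracker.run_dml())))
binlog.append(("INSERT id=1,2,3", tracker.commit(tracker.run_dml())))
update_id_1 = tracker.run_dml()  # session #2: BEGIN; UPDATE (no commit yet)
binlog.append(("UPDATE id=3", tracker.commit(tracker.run_dml())))  # session #3
binlog.append(("INSERT id=4", tracker.commit(tracker.run_dml())))  # session #1
binlog.append(("UPDATE id=1", tracker.commit(update_id_1)))  # session #2: COMMIT

for statement, (last_committed, sequence_number) in binlog:
    print(f"last_committed={last_committed} "
          f"sequence_number={sequence_number} -- {statement}")
```

Because the update of id=1 noted trx #2 before the other sessions committed, it keeps last_committed=2 even though it commits last.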
23. Commit Timestamps [3 of 3]
More binlog: note that the DML order is different from the commit order
mysqlbinlog $f |
awk '$11 ~ "last_committed"{print $1,$2,$11,$12} $10 ~ "_rows|Xid"{print $1,$2,$10}'
#220410 11:21:47 last_committed=0 sequence_number=1 -- CREATE
#220410 11:21:49 last_committed=1 sequence_number=2 -- INSERT id=1,2,3
#220410 11:21:49 Write_rows:
#220410 11:21:49 Xid
#220410 11:21:55 last_committed=2 sequence_number=3 -- UPDATE id=3
#220410 11:21:55 Update_rows:
#220410 11:21:55 Xid
#220410 11:21:59 last_committed=3 sequence_number=4 -- INSERT id=4
#220410 11:21:59 Write_rows:
#220410 11:21:59 Xid
#220410 11:22:03 last_committed=2 sequence_number=5 -- UPDATE id=1
#220410 11:21:52 Update_rows:
#220410 11:22:03 Xid
24. Tuning Commit Timestamp
If very few trx run in parallel, or if commit is quick (auto-commit, very
fast disks or low durability), CTS does not identify much parallelism
(low durability: sync_binlog != 1 and trx_commit != 1)
A solution is to delay commit for having more transactions running
in parallel, which leads to better interval generation:
• binlog_group_commit_sync_delay
• binlog_group_commit_sync_no_delay_count
I call this slowing down the primary to speed up the replicas
But performing this tuning just by looking at replica speed is tedious
25. Intervals Quality
To ease the tuning, I came up with a metric for interval quality:
• the Average Modified Interval Length (AMIL)
It is based on the following observation:
• the larger the intervals are, the better MTR throughput will be
All the details about the AMIL are in
http://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html
Computing the AMIL requires parsing the output of mysqlbinlog
to extract interval information; MySQL should make this easier:
• Bug#85965: Expose, on the master, counters for monitoring // information quality
• Bug#85966: Expose, on slaves, counters for monitoring // information quality
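As a rough illustration of the observation above, the Python sketch below computes a plain average interval length, taking the length of an interval to be sequence_number minus last_committed. This is an assumption made for the example and is NOT the AMIL: the "modified" adjustment is described in the linked post and is not reproduced here. The name `mean_interval_length` is hypothetical.

```python
def mean_interval_length(intervals):
    """Plain average of (sequence_number - last_committed).

    Simplified stand-in for illustration only; it does not apply the
    adjustment that defines the AMIL.
    """
    lengths = [seq - last_committed for last_committed, seq in intervals]
    return sum(lengths) / len(lengths)


# (last_committed, sequence_number) pairs from the Intervals slide.
intervals = [
    (6201, 6203),
    (6203, 6204),
    (6203, 6205),
    (6203, 6206),
    (6205, 6207),
]

average = mean_interval_length(intervals)
print(average)
```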
26. Tuning Commit Timestamps with AMIL [1 of 2]
Monitoring the AMIL provides immediate tuning feedback:
• After increasing binlog_group_commit_sync_delay, the AMIL
should increase (if not, increase more for better parallel replication,
or roll back to avoid penalizing commit latency)
But while doing this, also look at the commit rate of the primary:
• If the commit rate noticeably drops, consider reverting the tuning
(it is penalizing the application throughput)
But penalizing throughput might be ok for solving replication lag:
• lowering the primary throughput from 200 to 100 tps might be
acceptable to increase replica throughput from 50 to 150 tps
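The arithmetic behind this trade-off can be spelled out in a few lines of Python. The sketch assumes sustained rates and uses the tps figures from the bullet above; `backlog_growth` is a hypothetical helper.

```python
def backlog_growth(primary_tps, replica_tps):
    """Transactions per second by which the replica falls behind
    (positive value) or catches up (negative value)."""
    return primary_tps - replica_tps


before_tuning = backlog_growth(primary_tps=200, replica_tps=50)
after_tuning = backlog_growth(primary_tps=100, replica_tps=150)
print(before_tuning, after_tuning)
```

Before tuning, the replica accumulates 150 unapplied trx every second; after tuning, it can absorb the primary's writes with 50 tps to spare, so lag stops growing.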
27. Tuning Commit Timestamps with AMIL [2 of 2]
AMIL without and with tuning on four primaries:
(speed up the replicas by increasing binlog_group_commit_sync_delay)
28. Other Commit Timestamp Tuning
To alleviate the drop in throughput caused by the delay,
increase the number of operations done in parallel in MySQL
Also, more transactions run in parallel by the application lead to
more parallelism identified by Commit Timestamps and better
replication speed on replicas
29. CTS and Non-Primary Sources
Without MTR, CTS leads to no parallelism identification on replicas
Similarly, even with MTR and compared to the primary,
less parallelism is identified on replicas by Commit Timestamps:
• MTR loses efficiency when replicating from a non-primary source
Also, after promoting a replica as the new primary, its binlogs have
smaller intervals than the ones from the old primary:
• Using these leads to less efficient MTR (e.g. after restoring a backup)
With SBR, there is no good solution to that (except Binlog Servers)
https://medium.com/booking-com-infrastructure/better-parallel-replication-for-mysql-14e2d7857813
30. Write Set [1 of 4]
If using RBR, a better dependency tracking method can be used:
• Write Set was developed for Group Replication (in MySQL 5.7)
• It can also be used for dependency tracking (introduced in 8.0)
• It was back-ported in 5.7.22 (thanks Oracle for this back-port)
The Write Set is a data structure used to generate intervals:
• It stores a hash of all primary keys (and unique keys) modified by trx
(tunable size with binlog_transaction_dependency_history_size)
The intervals generated by Write Set on replicas are as good as on
the primary (and better if the primary is using CTS)
(Write Set can be used on an RBR replica of an SBR source)
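To make the data structure concrete, here is a toy model of how a Write Set history can generate intervals. It is a sketch under assumptions, not the server implementation: `WriteSetHistory` and its methods are hypothetical names, keys are plain tuples where the server stores hashes, a trx without keys (DDL) is treated as depending on everything before it, and the size check is a simplification of what binlog_transaction_dependency_history_size controls.

```python
class WriteSetHistory:
    """Toy model of Write Set dependency tracking (not the MySQL implementation)."""

    def __init__(self, history_size=25000):
        self.history_size = history_size  # stands in for binlog_transaction_dependency_history_size
        self.last_writer = {}             # key -> sequence_number of the last trx that wrote it
        self.history_start = 0            # lower bound for last_committed
        self.sequence_number = 0

    def commit(self, keys):
        """Return (last_committed, sequence_number) for a trx modifying `keys`."""
        self.sequence_number += 1
        seq = self.sequence_number
        if not keys:
            # No usable write set (e.g. DDL): depend on the previous trx
            # and make every later trx depend on this one.
            self.last_writer.clear()
            self.history_start = seq
            return seq - 1, seq
        # Depend on the most recent trx that wrote any of the same keys.
        known = [self.last_writer[k] for k in keys if k in self.last_writer]
        last_committed = max([self.history_start] + known)
        for k in keys:
            self.last_writer[k] = seq
        if len(self.last_writer) > self.history_size:
            # History full: forget it, later trx depend on this one at least.
            self.last_writer.clear()
            self.history_start = seq
        return last_committed, seq


history = WriteSetHistory()
intervals = [
    history.commit([]),                              # CREATE TABLE t
    history.commit([("t", 1), ("t", 2), ("t", 3)]),  # INSERT id=1,2,3
    history.commit([("t", 3)]),                      # UPDATE id=3
    history.commit([("t", 4)]),                      # INSERT id=4
    history.commit([("t", 1)]),                      # UPDATE id=1
]
print(intervals)
```

For this workload the sketch yields the intervals (0,1), (1,2), (2,3), (1,4), (2,5): the insert of id=4 only depends on the CREATE, because no earlier trx in the history wrote that key.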
31. Write Set [2 of 4]
Enabling Write Set dependency tracking (on both primary and replicas):
• Set BTDT to WRITESET (or WRITESET_SESSION)
(BTDT for binlog_transaction_dependency_tracking)
• This needs TWSE to XXHASH64 (or MURMUR32)
(TWSE for transaction_write_set_extraction)
• Which also needs binlog_format to ROW
With BTDT to WRITESET, a second trx run by a single session does
not depend on the first:
• With BTDT to WRITESET, SPCO should be enabled on replica
• Or set BTDT to WRITESET_SESSION on the primary
(not a good setting for replicas, we will come back to this)
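The chain of prerequisites above (BTDT needs TWSE, which needs RBR) can be checked mechanically. A minimal sketch, assuming the values of the three system variables have already been fetched into a dict (for example from SHOW GLOBAL VARIABLES); `write_set_prerequisite_errors` is a hypothetical helper, only the variable names and values come from MySQL.

```python
WRITE_SET_MODES = ("WRITESET", "WRITESET_SESSION")
EXTRACTION_HASHES = ("XXHASH64", "MURMUR32")


def write_set_prerequisite_errors(variables):
    """Return a list of problems; an empty list means Write Set tracking can work."""
    value = {name: str(setting).upper() for name, setting in variables.items()}
    errors = []
    if value["binlog_transaction_dependency_tracking"] not in WRITE_SET_MODES:
        errors.append("binlog_transaction_dependency_tracking is not WRITESET or WRITESET_SESSION")
    if value["transaction_write_set_extraction"] not in EXTRACTION_HASHES:
        errors.append("transaction_write_set_extraction is not XXHASH64 or MURMUR32")
    if value["binlog_format"] != "ROW":
        errors.append("binlog_format is not ROW")
    return errors


# Hypothetical settings of a server still using SBR.
sbr_server = {
    "binlog_transaction_dependency_tracking": "WRITESET",
    "transaction_write_set_extraction": "XXHASH64",
    "binlog_format": "STATEMENT",
}
print(write_set_prerequisite_errors(sbr_server))
```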
32. Write Set [3 of 4]
Below, the updates of id=3 and id=1 and the insert of id=4 can execute in parallel:
• Without SPCO (and the update), insert id=4 could commit before id=1,2,3
SET GLOBAL binlog_transaction_dependency_tracking = WRITESET;
Session #1: CREATE TABLE t(id INT PRIMARY KEY, v INT);
Session #1: INSERT INTO t VALUES (1,1),(2,2),(3,3);
Session #2: BEGIN; UPDATE t SET v = 11 WHERE id = 1;
Session #3: UPDATE t SET v = 33 WHERE id = 3;
Session #1: INSERT INTO t VALUES (4,4);
Session #2: COMMIT;
#220410 12:01:19 last_committed=0 sequence_number=1 -- CREATE
#220410 12:01:21 last_committed=1 sequence_number=2 -- INSERT id=1,2,3
#220410 12:01:29 last_committed=2 sequence_number=3 -- UPDATE id=3
#220410 12:01:34 last_committed=1 sequence_number=4 -- INSERT id=4
#220410 12:01:38 last_committed=2 sequence_number=5 -- UPDATE id=1
33. Barriers in Parallel Scheduling on Replicas
Below, the update id=3 (seq_no 3) depends on the multi-value insert:
• Parallel transaction scheduling blocks until the transaction it depends on completes
#220410 12:01:19 last_committed=0 sequence_number=1 -- CREATE
#220410 12:01:21 last_committed=1 sequence_number=2 -- INSERT id=1,2,3
#220410 12:01:29 last_committed=2 sequence_number=3 -- UPDATE id=3
#220410 12:01:34 last_committed=1 sequence_number=4 -- INSERT id=4
#220410 12:01:38 last_committed=2 sequence_number=5 -- UPDATE id=1
A dependent transaction acts as a barrier in parallel scheduling:
• Even if insert id=4 could be run at the same time as the multi-value insert,
the update id=3 prevents this
This might change in a future MySQL version
(it is also the M in AMIL – Modified)
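The barrier effect can be reproduced with a small simulation of the five transactions of this slide. This is a sketch under assumptions: `schedule` and `modified_interval_lengths` are hypothetical helpers, every trx takes one time unit, there are unlimited workers, and the modified interval length is taken to be sequence_number minus the running maximum of last_committed (my reading of the barrier described above).

```python
def schedule(intervals, barrier=True, duration=1):
    """Return {sequence_number: start_time} for trx given as (last_committed, sequence_number).

    A trx may start once every trx with sequence_number <= last_committed has finished.
    With barrier=True, scheduling is sequential: a trx cannot start before the
    previous trx in the binlog has started.
    """
    start, end = {}, {}
    previous_start = 0
    for last_committed, seq in intervals:
        ready = max((end[s] for s in end if s <= last_committed), default=0)
        begin = max(ready, previous_start) if barrier else ready
        start[seq], end[seq] = begin, begin + duration
        previous_start = begin
    return start


def modified_interval_lengths(intervals):
    """Interval lengths once earlier, larger last_committed values act as barriers."""
    lengths, highest = [], 0
    for last_committed, seq in intervals:
        highest = max(highest, last_committed)
        lengths.append(seq - highest)
    return lengths


binlog = [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)]
print(schedule(binlog, barrier=True))   # insert id=4 (seq 4) starts with the updates
print(schedule(binlog, barrier=False))  # without barrier, it starts with the multi-value insert
lengths = modified_interval_lengths(binlog)
amil = sum(lengths) / len(lengths)
print(lengths, amil)
```

With the barrier, trx 3, 4 and 5 all start at time 2; without it, trx 4 could start at time 1, together with the multi-value insert. The modified interval lengths are 1, 1, 1, 2, 3, an average of 1.6 for this tiny sample.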
34. Write Set [4 of 4]
If running with slave_preserve_commit_order to OFF,
WRITESET_SESSION avoids inconsistencies: insert id=4 runs after id=1,2,3
SET GLOBAL binlog_transaction_dependency_tracking = WRITESET_SESSION;
Session #1: CREATE TABLE t(id INT PRIMARY KEY, v INT);
Session #1: INSERT INTO t VALUES (1,1),(2,2),(3,3);
Session #2: BEGIN; UPDATE t SET v = 11 WHERE id = 1;
Session #3: UPDATE t SET v = 33 WHERE id = 3;
Session #1: INSERT INTO t VALUES (4,4);
Session #2: COMMIT;
#220410 12:12:30 last_committed=0 sequence_number=1 -- CREATE
#220410 12:12:32 last_committed=1 sequence_number=2 -- INSERT id=1,2,3
#220410 12:12:39 last_committed=2 sequence_number=3 -- UPDATE id=3
#220410 12:12:43 last_committed=2 sequence_number=4 -- INSERT id=4
#220410 12:12:47 last_committed=2 sequence_number=5 -- UPDATE id=1
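The difference between the two listings can be derived from one rule: with WRITESET_SESSION, a trx also depends on the previous trx of the same session. A minimal sketch of that rule, where `add_session_dependencies` is a hypothetical helper and the input is the WRITESET listing annotated with the session that ran each trx:

```python
def add_session_dependencies(trxs):
    """Apply the session rule to (session, last_committed, sequence_number) tuples.

    Returns (last_committed, sequence_number) pairs where last_committed is at
    least the sequence_number of the previous trx run by the same session.
    """
    last_seq_of_session = {}
    result = []
    for session, last_committed, seq in trxs:
        last_committed = max(last_committed, last_seq_of_session.get(session, 0))
        result.append((last_committed, seq))
        last_seq_of_session[session] = seq
    return result


# (session, last_committed, sequence_number) as generated with WRITESET.
writeset = [
    (1, 0, 1),  # Session #1: CREATE
    (1, 1, 2),  # Session #1: INSERT id=1,2,3
    (3, 2, 3),  # Session #3: UPDATE id=3
    (1, 1, 4),  # Session #1: INSERT id=4
    (2, 2, 5),  # Session #2: UPDATE id=1
]
print(add_session_dependencies(writeset))
```

Only the insert of id=4 changes: its last_committed moves from 1 to 2, because Session #1 previously ran the multi-value insert (sequence_number 2).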
35. Write Set and Non-Primary Sources
We can enable Write Set on an intermediary source:
• Even on a single-threaded replica
• But we should use BTDT to WRITESET because
WRITESET_SESSION would create new dependencies
(without MTR, all intervals of WRITESET_SESSION are of size one)
• This solves the CTS interval degradation on intermediary sources
This was done to get the first (E*) benchmark results (in 2017):
• Primary in 5.7 (pre .22),
intermediary source in 8.0 with WRITESET, replica with MTR
This is useful to “try” MTR before a big migration:
• SBR to RBR for Write Set, or 5.6 to 5.7 for MTR
36. AMIL for Write Set [1 of 2]
AMIL from a CTS primary and a Write Set intermediary source:
• The primary does not need RBR, only the intermediary source does
• Write Set AMIL is usually better than untuned and tuned CTS
37. AMIL for Write Set [2 of 2]
Sometimes, combining delay with Write Set gives a better AMIL:
• Discovered by accident, tuning opportunity?, more research needed
(Write Set AMIL below 3x better with delay compared to without)
38. More Tuning: Smaller Transaction Sizes
Because transaction scheduling sometimes/often blocks (barriers
and dependencies), some replica workers might be idle or waiting
for another transaction to complete:
• In the worst case, many workers are waiting for a single big trx
These big trx reduce the potential for speedups:
• Avoiding big transactions in the application will make MTR faster
Ironically, when commit was expensive, or to amortize round-trips to
the database, doing many things in a single (big) trx was / is the right
thing to do, but this makes MTR run slower
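A rough bound shows why one big trx limits the speedup. This is an idealized sketch, not a model of the MySQL scheduler: `best_case_speedup` is a hypothetical helper that ignores dependencies and barriers entirely, so real speedups are lower than what it returns. The durations are made-up units.

```python
def best_case_speedup(durations, workers):
    """Upper bound on the speedup over single-threaded apply, ignoring dependencies.

    The parallel run cannot finish faster than the total work spread evenly
    over the workers, nor faster than the single longest trx.
    """
    total = sum(durations)
    best_parallel_time = max(total / workers, max(durations))
    return total / best_parallel_time


small_trx = [1] * 100             # 100 trx of 1 unit each
with_big_trx = [50] + [1] * 50    # same total work, but one trx takes 50 units

print(best_case_speedup(small_trx, workers=4))
print(best_case_speedup(with_big_trx, workers=4))
```

With the same total work and four workers, the bound drops from 4x to 2x as soon as half of the work sits in a single trx, and adding more workers does not help.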
39. Recommended Settings for MTR Readiness
Without RBR, CTS is the only available dependency tracking
→ not worth “preparing” for MTR (implement and tune when needed)
With RBR and MySQL 8.0:
• binlog_transaction_dependency_tracking to WRITESET
• slave_parallel_type to LOGICAL_CLOCK (new default in 8.0.27)
• slave_preserve_commit_order to ON (new default in 8.0.27)
With RBR and MySQL 5.7, all above plus:
• transaction_write_set_extraction to XXHASH64
(this is the default in 8.0)
With these, faster replication only needs updating slave_parallel_workers
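The recommended settings above can be turned into a readiness check. A minimal sketch, assuming the current values were collected into a dict beforehand; `settings_to_change` and the `current` values are hypothetical, only the variable names and recommended values come from the slide.

```python
RECOMMENDED_80 = {
    "binlog_transaction_dependency_tracking": "WRITESET",
    "slave_parallel_type": "LOGICAL_CLOCK",
    "slave_preserve_commit_order": "ON",
}
# With MySQL 5.7, all of the above plus the write set extraction hash.
RECOMMENDED_57 = {**RECOMMENDED_80, "transaction_write_set_extraction": "XXHASH64"}


def settings_to_change(current, recommended):
    """Return the recommended values that differ from the current ones."""
    return {
        name: wanted
        for name, wanted in recommended.items()
        if str(current.get(name, "")).upper() != wanted
    }


# Hypothetical values read from a 5.7 server.
current = {
    "binlog_transaction_dependency_tracking": "COMMIT_ORDER",
    "slave_parallel_type": "DATABASE",
    "slave_preserve_commit_order": "OFF",
    "transaction_write_set_extraction": "XXHASH64",
}
changes = settings_to_change(current, RECOMMENDED_57)
print(changes)
```

An empty result means the server is ready for MTR and only slave_parallel_workers remains to be raised when faster replication is needed.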
40. Thoughts on MySQL 8.0.27 new Defaults
In MySQL 8.0.27 we have:
• slave_parallel_type to LOGICAL_CLOCK
• slave_preserve_commit_order to ON (was OFF before)
• and slave_parallel_workers to 4 in 8.0.27 (was 0 before)
But intervals are still generated with Commit Timestamps:
• Having 4 workers will probably not improve replication speed much
• But this exposes users to MTR rough edges
Enabling parallel replication by default might be a little early,
and once these bugs are fixed, Write Set should also be enabled
41. Benchmark
Benchmarks done a few years ago and at HubSpot:
• Used 0 to 256 threads (slave_parallel_workers)
• E1 to E8: different environments / workloads (2017-2018)
• F1 to F4: more environments / workloads (from HubSpot)
• HD: High Durability: sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1
• ND: No Durability: sync_binlog = 0 and innodb_flush_log_at_trx_commit = 2
• IS: Intermediary Source: with log_slave_updates
• NO: No Order: slave_preserve_commit_order = OFF
• (WO: with order: slave_preserve_commit_order = ON)
• (RB: replica with binary logs: without log_slave_updates)
42. Benchmark Presentation [1 of 2]
E5 IS-HD Single-Threaded: 6138 seconds
E5 IS-ND Single-Threaded: 2238 seconds
Going from HD to ND for E5 IS gives a speedup of 2.74
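The speedup quoted above is the ratio of the two single-threaded timings. As a quick check with the numbers from this slide:

```python
hd_seconds = 6138  # E5 IS-HD single-threaded
nd_seconds = 2238  # E5 IS-ND single-threaded

speedup = hd_seconds / nd_seconds
print(f"{speedup:.2f}")  # prints 2.74
```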
43. Benchmark Presentation [2 of 2]
All in a single graph:
• HD and ND: previous speedups
• HD vs ND single-threaded: the speedup from HD to ND
44. Old Benchmark Setup
The test environments involve three nodes:
+---+ +---+ +---+
| A | -------> | B | -------> | C |
+---+ +---+ +---+
• A is a production primary (MySQL 5.6 or 5.7: this was in 2017-2018)
• Some of the primaries are SBR, some are RBR
• B is a MySQL 8.0.3 Intermediary Source in RBR with Write Set
(transaction_write_set_extraction = XXHASH64)
(binlog_transaction_dependency_tracking = WRITESET)
• C is a replica where timing is done
45. Previous Benchmark Results [1 of 2]
46. Previous Benchmark Results [2 of 2]
Observations:
• High variation of speedups, very workload dependent
(probably possible to do better if spending time optimizing the application)
• HD gives better speedups than ND (normal because group commit)
• ND gives some interesting speedups, but do not expect 16x faster
(it is not possible to use all the CPUs of modern hardware for writes on a replica)
47. HubSpot Benchmark Setup
The test environments involve two nodes:
+---+ +---+
| A | -------> | B |
+---+ +---+
• A is a production primary (Percona Server 5.7.36 with Write Set)
(transaction_write_set_extraction = XXHASH64)
(binlog_transaction_dependency_tracking = WRITESET)
• B is a replica where timing is done
48. HubSpot Benchmarks
Observations:
• Higher HD speedups for F{1,2} than E*
• Better HD vs ND single-threaded
for F* than E* (F1 being much better)
Something is different between E and F.
49. Benchmark Discussion
The previous benchmarks ran on local SSDs, HubSpot runs on EBS in AWS
• Disk-sync is more expensive for HubSpot
• Which explains the better HD results
• Also explains the better HD vs ND single-threaded speedup
• (Also makes benchmarking hard because of AWS variance)
(HubSpot is running production with ND + semi-sync)
Also, HubSpot is running with the previously recommended settings:
• Write Set enabled on primaries and replicas
• slave_parallel_type to LOGICAL_CLOCK
• slave_preserve_commit_order to ON
→ Ready for MTR by updating slave_parallel_workers
50. Other Benchmarks Results
I also have results for:
• WO, and NO vs WO
• RB, and IS vs RB
Too long to present everything, contact me if interested
TL;DR:
• NO is faster than WO
• RB is faster than IS
• Both highly variable depending on workload and durability
51. Other Parallel Replication Subjects
MySQL Parallel Replication Simulator
https://blog.koehntopp.info/2021/11/08/mysql-parallel-replication.html
Replica Group Commit (Slave Group Commit in MariaDB)
https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-2-slave-group-commit-459026a141d2
• Replicating trx sequentially, in many threads for Group Committing
• Almost as good as ND single-threaded speed, with benefit of HD
Optimistic Parallel Replication (in MariaDB)
https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-4-more-benchmarks-in-production-49ee255043ab
• Go beyond scheduling barriers on replica, achieving replication prefetching
Facebook partial transaction parallel execution in MySQL 5.6
https://www.percona.com/live/e18/sessions/faster-mysql-replication-using-row-dependencies
• Using Write Set on replicas for fine-grained parallel statement execution
52. Links [1 of 3]
Content related to this talk from the speaker:
• A Metric for Tuning Parallel Replication in MySQL 5.7
https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html
• Better Parallel Replication for MySQL (Binlog Server)
https://medium.com/booking-com-infrastructure/better-parallel-replication-for-mysql-14e2d7857813
• An update on Write Set (parallel replication) bug fix in MySQL 8.0
https://jfg-mysql.blogspot.com/2018/01/an-update-on-write-set-parallel-replication-bug-fix-in-mysql-8-0.html
Binlog Server content from the speaker:
• MySQL Slave Scaling (and more)
https://medium.com/booking-com-infrastructure/mysql-slave-scaling-and-more-a09d88713a20
• Abstracting Binlog Servers and MySQL Master Promotion without Reconfiguring all Slaves
https://medium.com/booking-com-infrastructure/abstracting-binlog-servers-and-mysql-master-promotion-without-reconfiguring-all-slaves-44be1febc8a0
Other parallel replication content from the speaker:
• Slave Group Commit
https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-2-slave-group-commit-459026a141d2
• Optimistic Parallel Replication
https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-4-more-benchmarks-in-production-49ee255043ab
53. Links [2 of 3]
Related bugs (some fixed, some still open) for the full history:
• Bug#75396: Allow slave_preserve_commit_order without log-slave-updates.
• Bug#81840: Automatic Replication Recovery Does Not Handle Lost Events (applies to MTR)
• Bug#85965: Expose, on the master, counters for monitoring // information quality.
• Bug#85966: Expose, on slaves, counters for monitoring // information quality.
• Bug#85977: The doc. for LOGICAL_CLOCK wrongly references Group Commit
• Bug#89247: Deadlock with MTS when slave_preserve_commit_order = ON
• PS-8030: Multithreaded replica crashes [at STOP SLAVE] inside rli::cannot_safely_rollback()
• I think there is still a deadlock bug in 5.7 (at least in Percona Server 5.7.36),
troubleshooting in progress at HubSpot
• Also, there might be other bugs
54. Links [3 of 3]
And more related content:
• Solving MySQL Replication Lag with LOGICAL_CLOCK and Calibrated Delay
https://orangematter.solarwinds.com/2017/01/13/solving-mysql-replication-lag-with-logical_clock-and-calibrated-delay/
• How to Fix a Lagging MySQL Replication
https://thoughts.t37.net/fixing-a-very-lagging-mysql-replication-db6eb5a6e15d
• Estimating potential for MySQL 5.7 parallel replication
https://www.percona.com/blog/2016/02/10/estimating-potential-for-mysql-5-7-parallel-replication/
• How Binary Logs (and Filesystems) Affect MySQL Performance
https://www.percona.com/blog/2018/05/04/how-binary-logs-and-filesystems-affect-mysql-performance/
• A Dive Into MySQL Multi-Threaded Replication
https://www.percona.com/blog/a-dive-into-mysql-multi-threaded-replication/
• MySQL Parallel Replication Simulator
https://blog.koehntopp.info/2021/11/08/mysql-parallel-replication.html
55. Thanks !
by Jean-François Gagné
System and MySQL Expert at HubSpot
Presented at Percona Live Austin 2022
jfg.mysql AT gmail DOT com
Twitter: @jfg956