Avoiding the ring of death
Why you shouldn’t use MySQL Ring/Circular replication
Aishvarya Verma
What is Ring/Circular Replication
• Ring (circular) replication is a multi-master topology in which the
nodes of the cluster are arranged in a ring
• Each node is a master, i.e. it accepts writes, which are then
propagated serially to all the other nodes around the ring
What is the thought behind using this?
• Multiple nodes are expected to provide High Availability
• Spreading the writes across multiple nodes is expected to provide High Scalability
• Example: most enterprises hold data for different companies,
many users/accounts and different geographies. So, the idea is
to spread the load across multiple servers and keep the data available
on all servers for high availability
• BUT DOES IT ACTUALLY PROVIDE THESE BENEFITS?
• NO. It does the exact opposite of what it is believed to provide
CON#1 : Multiple points of failure
• Each time a server goes down, the whole chain is impacted. So,
availability is even poorer than with multiple independent servers
• Things get worse when a DB event from a failed node is still being
replicated to the other nodes. It can go into an infinite loop, because
only the originating (now failed) node could have stopped the event
from propagating further
• When a failed node is out of the chain, the load on the other servers
increases. If the failed node is not recovered in time and traffic is
high, this can have a domino effect: if one or more further servers
fail, the whole system can choke and crash completely
• There is no single master copy of the data. So, it is complex to
recover a failed node
• A chain is only as strong as its weakest link!
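The infinite-loop case can be illustrated with a toy model. This is only a sketch, assuming the default behaviour where a server skips replicated events stamped with its own server ID; the function name and the list-based ring are hypothetical and not part of MySQL.

```python
def applications_before_drop(ring, origin_id, first_receiver, max_hops=12):
    """Follow one event around the ring.

    ring           -- server IDs currently in the ring, in replication order
    origin_id      -- server ID stamped on the event
    first_receiver -- server ID of the node that receives the event first
    Returns how many nodes applied the event before it was dropped, or
    None if it was still circulating after max_hops hops.
    """
    pos = ring.index(first_receiver)
    for applied in range(max_hops):
        if ring[pos] == origin_id:
            # The originator recognises its own event and stops it.
            return applied
        # Any other node applies the event and forwards it to its neighbour.
        pos = (pos + 1) % len(ring)
    return None  # nobody in the ring owns this server ID


if __name__ == "__main__":
    # Healthy ring 1 -> 2 -> 3 -> 1: the event stops when it returns to node 1.
    print(applications_before_drop([1, 2, 3], origin_id=1, first_receiver=2))
    # Node 1 has failed and the ring was re-pointed as 2 -> 3 -> 2.
    print(applications_before_drop([2, 3], origin_id=1, first_receiver=2))
```

In the second call no remaining node owns server ID 1, so the event is never filtered out and the function returns None.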
CON#2 : Write/Read scalability mirage
• If a single server can handle W writes/sec, then it seems fair to assume
that 3 servers will give 3x the write capacity = 3W. Do they?
• No, because replication puts extra load on the whole system
• A lot of the computing resources on each server are now consumed by
applying the writes made on the other 2 servers
• Example: assume W = 1000 w/s
As each server must also process the writes from the other 2
servers to stay in sync, at peak each server can accept only
about 333 w/s of its own, because it needs to process about 667 w/s for the
other 2 servers. So the cluster write speed in the worst case is still
bound by a single server's write handling capacity
• Adding another node does more harm than good: each server's own share
of writes shrinks further, while the cluster total stays the same
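The arithmetic in the example above can be sketched in a few lines. This is a simplified model, assuming that applying a replicated write costs a server as much as accepting a local one; the function names are illustrative and not part of any MySQL tooling.

```python
def own_write_budget(per_server_capacity, nodes):
    """Writes/sec one node can accept from its own clients when it must
    also apply every write that originates on the other nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return per_server_capacity / nodes


def cluster_write_capacity(per_server_capacity, nodes):
    """Total client writes/sec the whole ring can absorb in this model."""
    return own_write_budget(per_server_capacity, nodes) * nodes


if __name__ == "__main__":
    W = 1000  # writes/sec a single server can handle
    for n in (1, 3, 4):
        own = round(own_write_budget(W, n))
        total = round(cluster_write_capacity(W, n))
        print(f"{n} node(s): {own} w/s own writes per node, {total} w/s cluster total")
```

Under this assumption the cluster total stays at W however many nodes are added, while each node's own budget drops to W/N.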
CON#3 : Write conflicts
• Duplicate key errors break the replication chain. They typically occur
when two nodes generate the same AUTO_INCREMENT key value, and they can
bring your replication ring to its knees
• Each time the replication chain breaks, there is the additional complexity of
removing that node from the chain, recovering it and adding
it back to the chain
• Inconsistent data is also possible if multiple users are allowed to
write/update the same row on multiple nodes. One way to reduce
this is to map each user to a single server for writes and
allow reads from other nodes only for load balancing. But even this
doesn't guarantee that an incorrect data state will never be observed
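A toy simulation shows how the duplicate key error arises. It is a sketch under simplifying assumptions: the `Node` class and `exchange` helper are hypothetical, and the counter ignores how a real server adjusts AUTO_INCREMENT when it applies replicated rows. The staggered variant mimics the effect of MySQL's `auto_increment_increment` and `auto_increment_offset` settings, which are commonly used to keep generated keys apart.

```python
class Node:
    """Toy stand-in for one master with its own AUTO_INCREMENT counter."""

    def __init__(self, name, offset=1, increment=1):
        self.name = name
        self.next_id = offset
        self.increment = increment
        self.rows = {}  # primary key -> value

    def insert_local(self, value):
        """Insert a row using the node's own counter for the key."""
        key = self.next_id
        self.next_id += self.increment
        self.rows[key] = value
        return key, value

    def apply_replicated(self, key, value):
        """Apply a row received from another node; fail on a key clash."""
        if key in self.rows:
            raise KeyError(f"duplicate key {key} on {self.name}")
        self.rows[key] = value


def exchange(a, b):
    """Both nodes insert one row at the same time, then replicate to each
    other. Returns True if replication succeeds, False on a duplicate key."""
    row_a = a.insert_local("from " + a.name)
    row_b = b.insert_local("from " + b.name)
    try:
        b.apply_replicated(*row_a)
        a.apply_replicated(*row_b)
    except KeyError:
        return False
    return True


if __name__ == "__main__":
    # Same counter on both nodes: both rows get key 1, replication breaks.
    print(exchange(Node("A"), Node("B")))
    # Staggered counters: A uses 1, 3, 5, ... and B uses 2, 4, 6, ...
    print(exchange(Node("A", offset=1, increment=2), Node("B", offset=2, increment=2)))
```

Staggering only avoids clashes on generated keys; it does nothing for two users updating the same row on different nodes.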
CON#4 : Under-utilized server resources
• Table and index sizes are huge, because about 67% of the data on each
node (in a 3-node ring) is there only as a result of replication
• Because of this, the performance of the DB server(s) goes down
• Server resources are wasted on replicating data from
other servers and are strained by the additional load from this data
• In summary, server resources can be used more efficiently by
spending them on work related to the active data on each node
Solution : Keep it simple, silly
• Sharding : Split your data across multiple nodes, so that if a
server does go down only a subsection of the system is affected
• Use an active-passive multi-master setup : Each active node now holds all the
data for a portion of your users/system, so each portion can have its own
independent backup on a passive node
• In case of failure the passive master becomes the active master, while the
failed master is recovered and added back as the passive slave
• Note : This is not the only solution, but it shows how a simple setup can be
better than a complex ring topology
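The sharded active-passive layout can be sketched as follows. This is a minimal illustration, not a production router: the `ShardPair` class, the `shard_for` function, the host names and the choice of a CRC32 hash are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ShardPair:
    """One shard: an active master plus its passive standby."""
    active: str
    passive: str

    def fail_over(self):
        """Promote the passive master; the old active rejoins as passive."""
        self.active, self.passive = self.passive, self.active


def shard_for(account_id, shards):
    """Map an account to one shard pair.

    zlib.crc32 is used because it is stable across processes, unlike
    Python's built-in hash() for strings.
    """
    index = zlib.crc32(account_id.encode("utf-8")) % len(shards)
    return shards[index]


if __name__ == "__main__":
    shards = [
        ShardPair(active="db1-a", passive="db1-b"),
        ShardPair(active="db2-a", passive="db2-b"),
    ]
    pair = shard_for("account-42", shards)
    print("writes go to", pair.active)
    pair.fail_over()  # only this shard is affected by the failure
    print("after failover writes go to", pair.active)
```

A failure touches only the accounts mapped to that pair. Modulo hashing is the simplest mapping, but it makes later re-splitting harder; a lookup table from account to shard is a common alternative.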
Active-Passive Masters : Advantages
• Provides failover without putting extra load on the other master pairs
• Provides scalability : If a server X is attracting heavy load, a new
active-passive pair can be added and part of the load from X can be
moved to the new pair
• Replication delay is reduced significantly, because each master has a dedicated slave
• Hardware upgrades can differ between pairs, depending
on the load experienced by each of them
• Queries perform better on each node because of the smaller table
and index sizes on each active master
Will it multiply cost? Not really
• Let's say we currently have a cluster of 4 nodes, with each node
having 32 CPU cores and 3.6 TB of storage (3x the actual DB size)
• If we split this into an active-passive configuration, we will need 8
nodes, i.e. 4 masters and 4 slaves
• But as we are now sharding our data across these nodes, the DB size
on each node should reduce by a factor of 4, to 0.9 TB. We can
also choose to reduce the CPU cores and RAM by a factor of 2, as
each node now has less data to process
• So, let's compare the resource cost of this topology change
Cost comparison
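Using only the figures stated above (4 ring nodes with 32 cores and 3.6 TB each, versus 8 sharded nodes with 16 cores and 0.9 TB each), the totals can be worked out as a quick sanity check. The `totals` helper is illustrative, and RAM and instance prices are left out because no figures are given for them.

```python
def totals(nodes, cores_per_node, storage_tb_per_node):
    """Aggregate CPU cores and storage across a cluster."""
    return {
        "nodes": nodes,
        "cores": nodes * cores_per_node,
        "storage_tb": round(nodes * storage_tb_per_node, 2),
    }


if __name__ == "__main__":
    ring = totals(nodes=4, cores_per_node=32, storage_tb_per_node=3.6)
    pairs = totals(nodes=8, cores_per_node=16, storage_tb_per_node=0.9)
    print("ring             :", ring)
    print("active-passive x4:", pairs)
```

With these inputs the total core count is unchanged and the total provisioned storage is halved, even though the node count doubles.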
Performance comparison
• The cost of 4 new servers might seem like a deterrent, but if this setup is
implemented on AWS EC2 with smaller instances, it need not cost more than the
current setup on AWS EC2
• The storage needed on each node is reduced to 1/N of the current size, where N is the
number of shards. This is because each node stores only the data it actually
serves, which amounts to 1/N of the current DB size on each master
in the cluster
• As the storage size reduces, so do the table and index sizes
• As the ratio of RAM to DB size increases, more of the data fits in memory,
which should give better memory performance
• The CPU cores available per TB of data increase, which should give better
performance
• If we use SSDs for storage, we can achieve even better performance
Conclusion
• Ring replication is not the right option for high availability
and scalability, and is not recommended for these use cases
• The active-passive master configuration is not the only solution; it is
compared here only to show the inefficiencies of the current ring
replication strategy
• It shows that just by sharding our data and re-allocating our compute
resources, we can achieve much better performance, with more
stability and efficiency in the cluster
References
• https://www.packtpub.com/books/content/setting-mysql-replication-high-availability
• https://www.percona.com/blog/2014/10/07/mysql-ring-replication-why-it-is-a-bad-option/
• http://www.onlamp.com/2006/04/20/advanced-mysql-replication.html
• https://www.safaribooksonline.com/library/view/effective-mysql-replication/9780071791861/

Avoiding the ring of death

  • 1. Avoiding the ring of death Why you shouldn’t use MySQL Ring/Circular replication Aishvarya Verma
  • 2. What is Ring/Circular Replication • Ring or circular replication is a multi-master topology with the nodes of the cluster organized in a ring or circular manner • Each node is a master, i.e. it accepts writes, which are then propagated to all the other nodes serially,
  • 3. What is the thought behind using this? • It is thought that the multiple nodes provide High Availability • Spreading the writes to multiple nodes should provide High Scalability • Example : Most enterprises cater to data for different companies, many users/accounts and from different geographies. So, the idea is to spread out the load to multiple servers and have the data available on all servers for high availability • BUT DOES IT ACTUALLY PROVIDE THESE BENEFITS ??? • NO. It actually does the exact opposite of what it is believed to provide
  • 4. CON#1 : Multiple Points of failure • Each time a server goes down, the whole chain is impacted. So, availability is even poor compared to multiple individual servers • Things can get real trippy when a DB event from a failed node is being replicated to other nodes. It will go into an infinite loop, coz only the failed node could have stopped the event from propagating further • When a failed node is out of the chain, there is increased load on the other servers. If the failed node is not recovered in time, and there is high traffic it will have a domino effect and if one or more servers fail, the whole system can be choked and can completely crash • No single master data. So, its complex to recover a failed node • Chain is only as strong as its weakest link !!
  • 5. CON#2 Write/Read scalability Mirage • If single server is able to handle W writes/sec then its fair to assume that using 3 servers we will get 3x the write capacity = 3W. Is it?? • No, because the replication puts extra load on the whole system • Lot of computing resources on each server will now be consumed to deal with the writes on the other 2 servers. • Example: Assume W = 1000 w/s Now, as each server needs to process the writes on the other 2 servers to be in sync, at peak time each server will only be able to have 330 w/s of its own, as it will need to process 660 w/s for the other 2 servers and so the cluster write speed in worst case is still bound by each server’s write handling capacity. • Even if we add another node, it will do more harm than good
  • 6. CON#3 Write conflicts • Duplicate key errors break the replication chain. This is caused by AUTO_INCREMENT of keys and can literally bring your replication ring to its knees • Each time replication chain breaks, there is additional complexity of removing that node from the chain and then recovering it and adding it back to the chain • Inconsistent data is also possible, if multiple users are allowed to write/update same row of data on multiple nodes. One way to avoid this is to have users mapped to only a single server for writes and allow reads from other nodes only for load balancing. But, even this doesn’t guarantee that incorrect data state will not be observed.
  • 7. CON#4: Under-utilized server resources • Table and index sizes are huge because of the extra data added by replication: in a 3-node ring, roughly 67% of the data on each node originates from the other two nodes. • Because of this, the performance of the DB server(s) goes down. • Server resources are wasted on replicating data from other servers and are strained by the additional load this data brings. • In summary, server resources can be used more efficiently if each node works only on the data that is active on that node.
  • 8. Solution: Keep it simple, silly • Sharding: split your data across multiple nodes, so that if a server does go down, only a subsection of the system is affected. • Use an active-passive multi-master setup: each node now holds all the data for a portion of your users/system, so each can have its own independent backup. • In case of failure, the passive master becomes the active master, while the failed master is recovered and added back as the passive node. • Note: this is not the only solution, but it shows how a simple setup can be better than a complex ring topology.
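A minimal sketch of how sharding with active-passive pairs might look from the application side. Everything here is an assumption made for illustration: the host names, the `ShardPair` class, and the modulo-based mapping of user IDs to shards. The slides do not prescribe a particular sharding key or failover mechanism.

```python
from dataclasses import dataclass


@dataclass
class ShardPair:
    """One shard: an active master plus a passive standby (hypothetical host names)."""
    active: str
    passive: str

    def fail_over(self) -> None:
        """Promote the passive node; the failed node rejoins later as the passive one."""
        self.active, self.passive = self.passive, self.active


def shard_for(user_id: int, pairs: list[ShardPair]) -> ShardPair:
    """Map a user to exactly one shard (simple modulo mapping, an assumption)."""
    return pairs[user_id % len(pairs)]


if __name__ == "__main__":
    pairs = [ShardPair(f"shard{i}-a", f"shard{i}-b") for i in range(4)]

    print(shard_for(5, pairs).active)  # user 5 lives on shard 1

    # Shard 1's active master fails: only users on shard 1 are affected,
    # and their traffic moves to the promoted standby.
    pairs[1].fail_over()
    print(shard_for(5, pairs).active)
    print(shard_for(6, pairs).active)  # shard 2 is untouched
```

The point of the design is isolation: a failure changes the routing for one shard only, and no other pair takes on extra load.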
  • 9. Active-passive masters: Advantages • Provides failover without putting extra load on the other master pairs. • Provides scalability: if a server X is attracting heavy load, a new active-passive pair can be added and part of the load from X can be moved to the new pair. • Replication delay is reduced significantly because each master has a dedicated slave. • Hardware can be upgraded differently for different pairs, depending on the load each of them experiences. • Queries perform better on each node because table and index sizes on each active master are smaller.
  • 10. Will it multiply cost? Not really • Let's say we currently have a cluster of 4 nodes, each with 32 CPU cores and 3.6 TB of storage (3x the actual DB size). • If we split this into an active-passive configuration, we will need 8 nodes, i.e. 4 masters and 4 slaves. • But because we are now sharding our data across these nodes, the DB size on each node should shrink by a factor of 4, to 0.9 TB. We can also choose to reduce the CPU cores and RAM by a factor of 2, as each node now has less data to process. • So, let's compare the resource cost for this topology change.
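The comparison set up on this slide can be worked through with the figures it gives. The sketch below uses only the numbers from the slide (4 nodes with 32 cores and 3.6 TB each, versus 8 nodes with cores halved and storage cut to 0.9 TB); RAM is left out because the slide gives no RAM figure, and the `Topology` class is a helper invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Aggregate resources of a cluster layout (illustrative helper)."""
    name: str
    nodes: int
    cores_per_node: int
    storage_tb_per_node: float

    @property
    def total_cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def total_storage_tb(self) -> float:
        return self.nodes * self.storage_tb_per_node


# Current ring: every node carries the full data set.
ring = Topology("ring", nodes=4, cores_per_node=32, storage_tb_per_node=3.6)

# Proposed layout: 4 active + 4 passive nodes, cores halved, data sharded 4 ways.
pairs = Topology("active-passive", nodes=8, cores_per_node=16, storage_tb_per_node=0.9)

if __name__ == "__main__":
    for t in (ring, pairs):
        print(f"{t.name}: {t.nodes} nodes, {t.total_cores} cores, "
              f"{t.total_storage_tb:.1f} TB storage, "
              f"{t.cores_per_node / t.storage_tb_per_node:.1f} cores per TB")
```

With these inputs the total core count is the same for both layouts, total storage is halved, and the number of cores available per TB of data on each node goes up.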
  • 12. Performance comparison • The cost of 4 new servers might seem like a deterrent, but if this setup is implemented on AWS EC2, it should not cost more than the current setup on AWS EC2. • The storage needed on each node is reduced by a factor of N, where N is the number of master nodes. This is because each master stores only the data it actually needs, which amounts to 1/N of the current DB size. • As the storage size shrinks, so do the table and index sizes. • As the ratio of RAM to DB size increases, memory performance should improve. • The number of CPU cores available per TB of data increases, which should also improve performance. • Using SSDs for storage can improve performance even further.
  • 13. Conclusion • Ring replication is not the right option for high availability and scalability, and it is not recommended for these use cases. • An active-passive master configuration is not the only solution; it is compared here only to show the inefficiencies of the current ring replication strategy. • The comparison shows that just by sharding our data and re-allocating our compute resources, we can achieve much better performance, with a more stable and efficient cluster.
  • 14. References • https://www.packtpub.com/books/content/setting-mysql-replication-high-availability • https://www.percona.com/blog/2014/10/07/mysql-ring-replication-why-it-is-a-bad-option/ • http://www.onlamp.com/2006/04/20/advanced-mysql-replication.html • https://www.safaribooksonline.com/library/view/effective-mysql-replication/9780071791861/