Flexible transactional scale for the connected world.
Challenges to Scaling MySQL:
Best Practices for Creating High Availability
Dave A. Anselmi @AnselmiDave
Director of Product Management
Questions for Today
PROPRIETARY & CONFIDENTIAL 2
o What is high availability and when is it needed?
o What’s the difference between high availability and fault tolerance?
o How is it possible to survive a multi-node failure in MySQL?
o What are the best practices for achieving high availability with MySQL?
o What are the costs of achieving HA? What can be the most cost-effective
strategy?
HA: As You Scale, Your Exposure Grows
[Diagram: scale (growth/success) plotted against time, moving from LAMP stack to AWS, Azure, RAX, GCE, etc., to private cloud]
o Reach limit: app too slow; lost users
– Migrate to bigger machine
o Reach limit (again): app too slow; lost users
– Read slaves, then sharding, etc.
– Add more hardware & DBAs
– Refactor code / hardwired app
– Result: more expensive, higher risk, lost revenue
o Ongoing:
– Refactoring hardware
– Data balancing
– Shard maintenance
o Repeat: migrate to bigger machine
What do we mean by “High Availability”?
Availability by the 9s
Source: https://www.percona.com/blog/2016/06/07/choosing-mysql-high-availability-solutions/
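The arithmetic behind “availability by the 9s” can be sketched as below. This is a minimal illustration, not taken from the slides: the function names are made up for the example, and a year is assumed to be 365 days.

```python
# Sketch: convert an availability level into the downtime it allows per year.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def nines(n: int) -> float:
    """Availability percentage with n nines, e.g. nines(3) is about 99.9."""
    return 100 * (1 - 10 ** -n)


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes per year a system may be down at the given availability."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for n in range(1, 6):
        pct = nines(n)
        minutes = downtime_minutes_per_year(pct)
        print(f"{n} nines ({pct:.3f}%): {minutes:,.1f} minutes/year")
```

Under these assumptions, 99.9% allows about 525.6 minutes (roughly 8.8 hours) of downtime a year, while 99.999% allows about 5.3 minutes.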
“High Availability” vs. “Fault Tolerance”
o High Availability – Minimize system downtime
– Trade-off: aim for as high a “9s” level as the budget allows
– Goal: least amount of data loss possible
o Fault Tolerance – System cannot go down
– Arrays of redundant hardware, and automated failover systems
– Cost is very high
Oracle:
– A high availability system minimizes the time when the system is down, or
unavailable and maximizes the time when it is running, or available.
IBM:
– A fault tolerant environment has no service interruption but a significantly
higher cost.
– A highly available environment has a minimal service interruption.
Fault Tolerance vs. High Availability
o Fault Tolerance:
– Failover application processes,
including heartbeat
– Shared storage layer, multiple
participants
o High Availability:
– Multiple redundant shared-nothing servers
– Replication to keep in sync
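Why redundant servers raise availability can be sketched with a simple probability model. This is an illustrative assumption rather than something the slides state: failures are treated as independent, failover is treated as instantaneous, and the service counts as up while at least one server is up. Real deployments violate all three to some degree, so the result is an upper bound.

```python
# Sketch: availability of n redundant servers under an idealized model.
# Assumptions: independent failures, instant failover, service is up
# while at least one server is up.


def combined_availability(single: float, n: int) -> float:
    """Probability that at least one of n servers is up.

    `single` is the availability of one server as a fraction (0.99 = 99%).
    """
    if not 0 <= single <= 1:
        raise ValueError("single must be a fraction between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 - (1 - single) ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} server(s) at 99% each: {combined_availability(0.99, n):.6f}")
```

In this model two 99% servers give about 99.99%, which is why replication between shared-nothing servers is the usual route to more 9s.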
High Availability rather than Fault-Tolerance
o MySQL systems are rarely fault tolerant
– High cost of fault tolerance is prohibitive
o Most MySQL systems use replication for HA/DR
o Galera isn’t fault tolerant
– Certification replication provides synchronous replication
between nodes
– Availability is enforced over consistency: the write-set can be
committed on the local node before the rest of the cluster has
committed (Jepsen)
Challenges to Deploying HA Systems
Four Challenges to HA
1. Tech/DevOps always wants HA:
1. Throughput and uptime are their core metrics/KPIs
2. Business/Finance demands justification of the costs:
1. Redundant servers reflect underutilized resources
2. Redundant servers are considered “wasted budget”
3. Cloud/PaaS/IaaS can imply more HA than they provide
– “Architected for 11x 9s”
4. Tension between scale and HA
1. Ideally, each new server would provide scale and redundancy
2. In practice, result is mixed; so choice is usually for scale
Realities of HA in the Cloud
o “Promise” vs. “Reality” of the cloud
– Promise of the cloud: web scale
– Reality of the cloud: TANSTAAFL (“There ain’t no such thing as a free lunch”)
o “Doesn’t the cloud provide HA automatically?”
– MBAs: literally taught “DevOps just wants to spend $$”
• “We don’t need redundancy: we’re on the cloud, & the cloud is 5x 9s, right?”
• “S3 is architected for 11x 9s, right?”
• “We’re on Amazon, it’s backed up”
o MUST deploy redundant hardware in the cloud
– If it’s not on your bill, you haven’t provisioned it
o “Success” of AWS marketing => exposed workloads
– 2/28/2017 four-hour S3 outage: even though “the cloud” has lots of hardware,
that does NOT mean your systems are highly available, let alone fault tolerant
“Obvious” Critical Workloads needing HA
o E-Commerce
– Black Friday/Cyber Monday, Singles’ Day, Back to School, flash
sales, etc.
– 80% of Revenue in 2 months
– Provisioning > 3x capacity for 2 months
o Finance
– System of Record
– “Money changing hands”
o Healthcare
– “Life/death decisions” & DSS
Assessing Your Workload’s Exposure
o Downtime: how much new business lost?
o How much does brand awareness/damage cost?
o Lost data = what kind of cost?
– Orders unfulfilled, unhappy customers
– Missing/stale reports, unhappy executives
o Not just e-commerce:
– Internal critical DSS Reports => top bank runs 2x 100+ node
sharded arrays
• DSS needs to be near-real time
• What if a shard fails, or the data is old?
Business Case for HA
The “insurance” of HA offsets multiple costs:
o Opportunity cost
– Each missed visitor was potentially a customer or referral
o Single sale cost
– Each missed sale is a tangible missed $-value
o Customer lifetime cost
– Unhappy customers who find sites they like better, won’t return
o Market/brand cost
– All customers use social media: communication “force multiplier”
– “If you make customers unhappy in the physical world, they might each tell six friends. If
you make customers unhappy on the internet, they can each tell 6,000.” – Jeff Bezos
– W. Edwards Deming said “5” and “20”…
– Call it “Customer Satisfaction at Web-Scale”
Strategies to Make MySQL Deployments HA
MySQL HA is usually Replication-based
o Redundant servers
– Goal: get HA and more scale
– Some level of consistency
o Read slave or DR – data is still ‘seconds behind master’
– Async or semisynchronous
o Certification
– Strong consistency as long as only a single master accepts writes
o Group Replication
– Strong consistency as long as only a single master accepts writes
Consistency Ramifications to High Availability
o Async Replication (Master/Slave):
– Replication-based: latency between master and slaves
– Always some number of transactions which COMMIT on Master aren’t
represented on the Slave
– “Trade latency for throughput.” OK for your workload?
o Sync Replication:
– Certification Replication: certificate is transmitted, local master commits
before ACK, other nodes commit in background
– Cloud Spanner & CockroachDB: time-based optimization for replicated
partitions
o Strong Consistency
– Every node is in identical, global transactional state at all times
– All nodes (at least two) containing data associated with the transaction are
durably updated before application receives ACK
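The difference between the asynchronous and strongly consistent cases can be sketched with a toy model. This is an illustration only, not how MySQL or any specific product implements replication: `Node`, `commit_async`, `commit_sync`, and the fixed `lag` are hypothetical names and simplifications.

```python
# Toy model: which ACKed transactions are missing after the master fails?
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    log: list = field(default_factory=list)  # transactions durably applied


def commit_async(master: Node, replica: Node, txns: list, lag: int) -> list:
    """Master ACKs every commit at once; the replica trails by `lag` transactions."""
    master.log.extend(txns)
    applied = max(len(txns) - lag, 0)
    replica.log.extend(txns[:applied])
    return list(txns)  # everything was ACKed to the application


def commit_sync(master: Node, replica: Node, txns: list) -> list:
    """ACK a transaction only after both nodes have durably applied it."""
    acked = []
    for txn in txns:
        master.log.append(txn)
        replica.log.append(txn)
        acked.append(txn)
    return acked


def lost_on_failover(acked: list, survivor: Node) -> list:
    """Transactions the application saw ACKed but the survivor never applied."""
    return [txn for txn in acked if txn not in survivor.log]


if __name__ == "__main__":
    txns = [f"txn{i}" for i in range(1, 11)]
    m, r = Node("master"), Node("replica")
    print("async lost:", lost_on_failover(commit_async(m, r, txns, lag=3), r))
    m, r = Node("master"), Node("replica")
    print("sync lost: ", lost_on_failover(commit_sync(m, r, txns), r))
```

In the asynchronous run the last three ACKed transactions exist only on the failed master; in the synchronous run nothing ACKed is lost, at the cost of waiting for the second node on every commit.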
Different Replication Strategies for HA
Each approach is listed with its details, pros, and cons.
Read Slave(s)
– Details: Add a “Slave” read-server(s) to “Master” database server (e.g. “DR” node or cluster)
– Pros: Easy setup; single-master simplicity
– Cons: Async == Slave is usually behind master; eventually consistent
Master/Master
– Details: Both Masters are Slaves to each other
– Pros: Allows updates to both masters
– Cons: Async == Slave is usually behind master; eventually consistent
Certification Replication
– Details: Multi-Master cluster using synchronous replication
– Pros: Allows multiple masters to be close in state
– Cons: Sync == Other nodes need to commit the certification. Window of skew exists (much shorter than async)
Group Replication
– Details: (1) Single-Primary, with automatic leader election; (2) Multi-Primary, i.e. similar to certification replication
– Pros: Allows multiple masters to be close in state
– Cons: Sync == Other nodes need to commit the certification. Window of skew exists (much shorter than async)
MySQL Deployment Architectures
[Diagram: four shards, SHARD01 (A-G), SHARD02 (H-M), SHARD03 (N-S), SHARD04 (T-Z), each with a DR copy covering the same key range]
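The range-sharded layout in the diagram can be sketched as a small routing function. This is a hypothetical illustration: the shard names, the “-DR” suffix, and the rule of routing on the first letter of the key are assumptions made for the example, not details from the deck.

```python
# Sketch: route a key to one of four range shards (A-G, H-M, N-S, T-Z),
# falling back to that shard's DR copy if the primary is marked failed.
from bisect import bisect_left

# Inclusive upper bound of each shard's key range (assumed from the diagram).
SHARD_UPPER_BOUNDS = ["G", "M", "S", "Z"]
SHARD_NAMES = ["SHARD01", "SHARD02", "SHARD03", "SHARD04"]


def shard_for(key: str) -> str:
    """Return the shard whose range contains the key's first letter."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key must start with a letter A-Z: {key!r}")
    return SHARD_NAMES[bisect_left(SHARD_UPPER_BOUNDS, first)]


def route(key: str, failed: frozenset = frozenset()) -> str:
    """Return the primary shard for the key, or its DR copy if the primary failed."""
    shard = shard_for(key)
    return f"{shard}-DR" if shard in failed else shard


if __name__ == "__main__":
    for name in ("anselmi", "Hopper", "quorum", "Zed"):
        print(name, "->", route(name, failed=frozenset({"SHARD02"})))
```

The sketch also shows the HA exposure of sharding: each shard is its own single point of failure, so every shard needs its own redundant copy.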
HA Strategies per Architecture
Each replication approach is listed against the MySQL deployment it is added to (Single Node, Read Slave(s), Master/Master, Sharding).
Read Slave(s)
– Single Node: Each read slave adds read scale + HA; eventual consistency
– Read Slave(s): N/A
– Master/Master: Secondary master is effectively same state as a read slave
– Sharding: Each shard has a read slave; eventual consistency
Master/Master
– Single Node: No HA benefit over Read Slave
– Read Slave(s): Secondary master is effectively same state as a read slave
– Master/Master: N/A
– Sharding: Each shard in Master/Master; eventual consistency
Certification Replication
– Single Node: Nodes are closer in state than read slave
– Read Slave(s): Nodes are closer in state than read slave
– Master/Master: Nodes are closer in state than Master/Master
– Sharding: Each shard in Master/Master using certification replication
Group Replication
– Single Node: Automatic Master election; group members are closer in state than read slave
– Read Slave(s): Automatic Master election; group members are closer in state than read slave
– Master/Master: Group members are closer in state than Master/Master
– Sharding: Each shard using group replication; automatic Master election
How ClustrixDB Provides High Availability
ClustrixDB:
ACID Compliant
Transactions & Joins
Optimized for OLTP
Built-In High Availability
Flex-Up and Flex-Down
Minimal DB Admin
o Write + Read Linear Scale-Out
o Automatically Highly Available
o MySQL-Compatible
Automatic High Availability
o Planned or Unplanned Outages
– Planned: “soft-fail” the node(s)
– Single minimal “database pause” to regain
quorum
o At least 2 instances of the data distributed
across all the nodes
– All data instances fully in sync at all times
o Data is automatically rebalanced across
the cluster
– Tables are online for reads and writes
– MVCC for lockless reads while writing
[Diagram: ClustrixDB cluster holding data slices S1 to S5, with two copies of each slice placed on different nodes]
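The idea of keeping at least two instances of every piece of data on different nodes can be sketched as follows. This is a generic round-robin illustration and not ClustrixDB's actual placement or rebalancing algorithm; the three-node cluster, node names, and function names are assumptions for the example.

```python
# Sketch: place two copies of each slice on different nodes, then check
# which slices remain readable after node failures.
# Assumption: simple round-robin placement on a hypothetical 3-node cluster.


def place_slices(slices: list, nodes: list, copies: int = 2) -> dict:
    """Copy k of slice i goes to node (i + k) mod len(nodes)."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    placement = {node: [] for node in nodes}
    for i, slice_name in enumerate(slices):
        for k in range(copies):
            placement[nodes[(i + k) % len(nodes)]].append(slice_name)
    return placement


def available_slices(placement: dict, failed: set) -> set:
    """Slices that still have at least one copy on a surviving node."""
    return {
        slice_name
        for node, held in placement.items()
        if node not in failed
        for slice_name in held
    }


if __name__ == "__main__":
    slices = ["S1", "S2", "S3", "S4", "S5"]
    placement = place_slices(slices, ["node1", "node2", "node3"])
    print(placement)
    print("after losing node1:", sorted(available_slices(placement, {"node1"})))
```

With two copies per slice, any single node can fail without making a slice unavailable; surviving a simultaneous two-node failure would need more copies per slice.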
Questions for Today
o What is high availability and when is it needed?
– Redundancy to minimize downtimes
– Financial, health, and other critical workloads
o What’s the difference between high availability and fault tolerance?
– High availability: minimize downtime
– Fault tolerance: zero downtime
o How is it possible to survive a multi-node failure in MySQL?
– Multiple server redundancy
– Maintaining strong consistency requires synchronous data replication
between servers
Questions for Today
o What are the best practices for achieving high availability with
MySQL?
– Synchronous replication: can affect performance or scale
– Asynchronous replication: can affect data consistency
o What are the costs of achieving HA? What can be the most cost-effective
strategy?
– Redundancy of servers: CAPEX & OPEX for DevOps
– License/support costs: ramps up by # of servers
– Ideally: each server provides scale + HA
QUESTIONS?
THANK YOU!

Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Recently uploaded (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Tech Talk Series, Part 4: How do you achieve high availability in a MySQL environment?

  • 1. Flexible transactional scale for the connected world. Challenges to Scaling MySQL: Best Practices for Creating High Availability Dave A. Anselmi @AnselmiDave Director of Product Management
  • 2. Questions for Today
      o What is high availability and when is it needed?
      o What’s the difference between high availability and fault tolerance?
      o How is it possible to survive a multi-node failure in MySQL?
      o What are the best practices for achieving high availability with MySQL?
      o What are the costs of achieving HA? What can be the most cost-effective strategy?
  • 3. HA: As You Scale, Your Exposure Grows
      [Diagram: scale (growth/success) plotted against time]
      o Start: LAMP stack on AWS, Azure, RAX, GCE, etc., or a private cloud
      o Reach limit: app too slow; lost users. Migrate to a bigger machine
      o Reach limit (again): app too slow; lost users
      o Read slaves, then sharding, etc.:
        – Add more hardware & DBAs
        – Refactor code / hardwired app
        – More expensive, higher risk, lost revenue
      o Ongoing: refactoring hardware, data balancing, shard maintenance
      o Repeat: migrate to a bigger machine
  • 4. What do we mean by “High Availability”?
  • 5. Availability by the 9s (source: https://www.percona.com/blog/2016/06/07/choosing-mysql-high-availability-solutions/)
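
The "9s" on this slide translate directly into a yearly downtime budget. The sketch below is a minimal illustration of that arithmetic, not something taken from the deck: the function name `downtime_minutes_per_year` is hypothetical, and a 365-day year is assumed.

```python
"""Downtime budget implied by an availability target ("the 9s")."""

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines.

    Two nines means 99%, three nines means 99.9%, and so on.
    """
    unavailability = 10 ** (-nines)  # e.g. 0.001 for three nines
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(2, 6):
        availability = 100 * (1 - 10 ** (-n))
        print(f"{n} nines ({availability:.3f}%): "
              f"{downtime_minutes_per_year(n):,.1f} minutes/year")
```

Under these assumptions, "5x 9s" leaves a little over five minutes of downtime per year, while "2x 9s" leaves more than three and a half days.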
  • 6. “High Availability” –vs– “Fault Tolerance”
      o High Availability: minimize system downtime
        – Trade-off: as high a “9s” level as can be budgeted
        – Goal: least amount of data loss possible
      o Fault Tolerance: system cannot go down
        – Arrays of redundant hardware and automated failover systems
        – Cost is very high
      o ORCL:
        – A high availability system minimizes the time when the system is down, or unavailable, and maximizes the time when it is running, or available.
      o IBM:
        – A fault tolerant environment has no service interruption but a significantly higher cost.
        – A highly available environment has a minimal service interruption.
  • 7. Fault Tolerance –vs– High Availability
      o Fault Tolerance:
        – Failover application processes, including heartbeat
        – Shared storage layer, multiple participants
      o High Availability:
        – Multiple redundant shared-nothing servers
        – Replication to keep in sync
  • 8. High Availability rather than Fault Tolerance
      o MySQL systems are rarely fault tolerant
        – High cost of fault tolerance is prohibitive
      o Most MySQL systems use replication for HA/DR
      o Galera isn’t fault tolerant
        – Certification replication provides synchronous replication between nodes
        – Availability is enforced over consistency: the write-set can be committed on the local node before the rest of the cluster has committed (Jepsen)
  • 10. Four Challenges to HA
      1. Tech/DevOps always wants HA:
        – Throughput & uptime are their core metrics/KPIs
      2. Business/Finance demands justification of the costs:
        – Redundant servers reflect underutilized resources
        – Redundant servers are considered “wasted budget”
      3. Cloud/PaaS/IaaS can imply more HA than they provide
        – “Architected for 11x 9s”
      4. Tension between scale and HA
        – Ideally, each new server would provide scale and redundancy
        – In practice, the result is mixed, so the choice is usually for scale
  • 11. Realities of HA in the Cloud
      o “Promise” –vs– “Reality” of the cloud
        – Promise of the cloud: web scale
        – Reality of the cloud: TANSTAAFL
      o “Doesn’t the cloud provide HA automatically?”
        – MBAs: literally taught “DevOps just wants to spend $$”
          • “We don’t need redundancy: we’re on the cloud, & the cloud is 5x 9s, right?”
          • “S3 is architected for 11x 9s, right?”
          • “We’re on Amazon, it’s backed up”
      o MUST deploy redundant hardware in the cloud
        – If it’s not on your bill, you haven’t provisioned it
      o “Success” of AWS Marketing → Exposed Workload
        – 2/28/2017 4-hour S3 outage: even though “the cloud” has lots of hardware, that does NOT mean your systems are fault tolerant, let alone HA
  • 12. “Obvious” Critical Workloads needing HA
      o E-Commerce
        – Black Friday/Cyber Monday, Singles’ Day, Back to School, flash sales, etc.
        – 80% of Revenue in 2 months
        – Provisioning > 3x capacity for 2 months
      o Finance
        – System of Record
        – “Money changing hands”
      o Healthcare
        – “Life/death decisions” & DSS
  • 13. Assessing Your Workload’s Exposure
      o Downtime: how much new business lost?
      o How much does brand awareness/damage cost?
      o Lost data = what kind of cost?
        – Orders unfulfilled, unhappy customers
        – Missing/stale reports, unhappy executives
      o Not just e-commerce:
        – Internal critical DSS Reports => top bank runs 2x 100+ node sharded arrays
          • DSS needs to be near-real time
          • What if a shard fails, or the data is old?
  • 14. Business Case for HA
      The “insurance” of HA offsets multiple costs:
      o Opportunity cost
        – Each missed visitor was potentially a customer or referral
      o Single sale cost
        – Each missed sale is a tangible missed $-value
      o Customer lifetime cost
        – Unhappy customers who find sites they like better won’t return
      o Market/brand cost
        – All customers use social media: communication “force multiplier”
        – “If you make customers unhappy in the physical world, they might each tell six friends. If you make customers unhappy on the internet, they can each tell 6,000.” – Jeff Bezos
        – W. Edwards Deming said “5” and “20”…
        – Call it “Customer Satisfaction at Web-Scale”
  • 15. Strategies to Make MySQL Deployments HA
  • 16. MySQL HA is usually Replication-based
      o Redundant servers
        – Goal: get HA and more scale
        – Some level of consistency
      o Read slave or DR: data is still ‘seconds behind master’
        – Async or Semistrict
      o Certification
        – Strong consistency as long as only a single master accepts writes
      o Group Replication
        – Strong consistency as long as only a single master accepts writes
  • 17. Consistency Ramifications to High Availability
      o Async Replication (Master/Slave):
        – Replication-based: latency between master and slaves
        – There are always some transactions that COMMIT on the Master but aren’t yet represented on the Slave
        – “Trade latency for throughput.” OK for your workload?
      o Sync Replication:
        – Certification Replication: certificate is transmitted, local master commits before ACK, other nodes commit in background
        – Cloud Spanner & CockroachDB: time-based optimization for replicated partitions
      o Strong Consistency
        – Every node is in identical, global transactional state at all times
        – All nodes (at least two) containing data associated with the transaction are durably updated before the application receives ACK
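
The difference between the async and strong-consistency cases comes down to when the application receives its ACK. The toy model below is only a sketch of that idea under simplifying assumptions: the `Node` and `Cluster` classes and every method name are hypothetical, there is no network or durability layer, and certification replication (where the local master commits before the other nodes) is not modelled.

```python
"""Toy model of asynchronous vs. synchronous replication acknowledgement."""

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    log: list = field(default_factory=list)  # transactions applied on this node


@dataclass
class Cluster:
    primary: Node
    replicas: list
    pending: list = field(default_factory=list)  # committed on primary only

    def commit_async(self, txn: str) -> str:
        """ACK as soon as the primary commits; replicas catch up later."""
        self.primary.log.append(txn)
        self.pending.append(txn)
        return "ACK"

    def commit_sync(self, txn: str) -> str:
        """ACK only after every replica has applied the transaction."""
        self.replicate_pending()  # keep replica logs in commit order
        self.primary.log.append(txn)
        for replica in self.replicas:
            replica.log.append(txn)
        return "ACK"

    def replicate_pending(self) -> None:
        """Background apply of queued transactions on the replicas."""
        for txn in self.pending:
            for replica in self.replicas:
                replica.log.append(txn)
        self.pending.clear()

    def lost_on_primary_failure(self) -> list:
        """Acknowledged transactions missing from the most current replica."""
        best = max(self.replicas, key=lambda r: len(r.log))
        return [t for t in self.primary.log if t not in best.log]


if __name__ == "__main__":
    cluster = Cluster(Node("master"), [Node("slave-1"), Node("slave-2")])
    cluster.commit_sync("t1")
    cluster.commit_async("t2")
    # t2 was acknowledged but exists only on the master at this point.
    print(cluster.lost_on_primary_failure())
    cluster.replicate_pending()
    print(cluster.lost_on_primary_failure())
```

In this model, a master failure between `commit_async` and `replicate_pending` loses an acknowledged transaction, which is the window the slide describes for Master/Slave setups.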
  • 18. Different Replication Strategies for HA
      o Read Slave(s)
        – Details: add a “Slave” read-server(s) to the “Master” database server (e.g. “DR” node or cluster)
        – Pros: easy setup; single-master simplicity
        – Cons: Async == Slave is usually behind master; eventually consistent
      o Master/Master
        – Details: both Masters are Slaves to each other
        – Pros: allows updates to both masters
        – Cons: Async == Slave is usually behind master; eventually consistent
      o Certification Replication
        – Details: multi-master cluster using synchronous replication
        – Pros: allows multiple masters to be close in state
        – Cons: Sync == other nodes need to commit the certification. Window of skew exists (much shorter than async)
      o Group Replication
        – Details: (1) Single-Primary, with automatic leader election; (2) Multi-Primary, i.e. similar to certification replication
        – Pros: allows multiple masters to be close in state
        – Cons: Sync == other nodes need to commit the certification. Window of skew exists (much shorter than async)
  • 19. MySQL Deployment Architectures
      [Diagram: four shards (SHARD01 to SHARD04) covering key ranges A-G, H-M, N-S, T-Z, each with a DR copy of the same range]
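
The sharded layout in this diagram implies that the application (or a routing layer) must pick a shard from the key. The sketch below shows one simple way that could look; it is an assumption for illustration only. The slide gives the four ranges and four shard labels, but the pairing of range to shard and the function name `shard_for` are hypothetical.

```python
"""Range-based shard routing by the first letter of a key (illustrative)."""

# Hypothetical mapping: the slide shows the ranges and the shard labels, but
# which shard holds which range is an assumption made for this sketch.
SHARD_RANGES = [
    ("A", "G", "SHARD01"),
    ("H", "M", "SHARD02"),
    ("N", "S", "SHARD03"),
    ("T", "Z", "SHARD04"),
]


def shard_for(key: str) -> str:
    """Return the shard whose letter range contains the key's first letter."""
    if not key:
        raise ValueError("key must be non-empty")
    first = key[0].upper()
    for low, high, shard in SHARD_RANGES:
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard covers keys starting with {first!r}")


if __name__ == "__main__":
    for customer in ("anselmi", "Magento", "percona", "zebra"):
        print(customer, "->", shard_for(customer))
```

Hardwiring a table like `SHARD_RANGES` into the application is the kind of coupling that makes later rebalancing and shard maintenance expensive.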
  • 20. HA Strategies per Architecture (HA strategy applied to each MySQL deployment approach)
      o Read Slave(s)
        – Single Node: each read slave adds read scale + HA; eventual consistency
        – Read Slave(s): N/A
        – Master/Master: secondary master is effectively same state as a read slave
        – Sharding: each shard has a read slave; eventual consistency
      o Master/Master
        – Single Node: no HA benefit over Read Slave
        – Read Slave(s): secondary master is effectively same state as a read slave
        – Master/Master: N/A
        – Sharding: each shard in Master/Master; eventual consistency
      o Certification Replication
        – Single Node: nodes are closer in state than read slave
        – Read Slave(s): nodes are closer in state than read slave
        – Master/Master: nodes are closer in state than Master/Master
        – Sharding: each shard in Master/Master using certification replication
      o Group Replication
        – Single Node: automatic Master election; group members are closer in state than read slave
        – Read Slave(s): automatic Master election; group members are closer in state than read slave
        – Master/Master: group members are closer in state than Master/Master
        – Sharding: each shard using Group Replication; automatic Master election
  • 21. How ClustrixDB Provides High Availability
  • 22. ClustrixDB:
      [Diagram labels: ACID Compliant; Transactions & Joins; Optimized for OLTP; Built-In High Availability; Flex-Up and Flex-Down; Minimal DB Admin]
      o Write + Read Linear Scale-Out
      o Automatically Highly Available
      o MySQL-Compatible
  • 23. Automatic High Availability
      o Planned or Unplanned Outages
        – Planned: “soft-fail” the node(s)
        – Single minimal “database pause” to regain quorum
      o At least 2 instances of the data distributed across all the nodes
        – All data instances fully in sync at all times
      o Data is automatically rebalanced across the cluster
        – Tables are online for reads and writes
        – MVCC for lockless reads while writing
      [Diagram: ClustrixDB cluster in which each data slice S1 to S5 appears twice, on different nodes]
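
The idea of keeping at least two instances of every slice on different nodes can be checked with a few lines of code. The sketch below is an illustration under stated assumptions, not ClustrixDB's actual placement or rebalancing algorithm: the round-robin rule and the names `place_replicas` and `survives_failure` are hypothetical.

```python
"""Place two copies of each data slice on different nodes (illustrative)."""


def place_replicas(slices, nodes, copies=2):
    """Round-robin placement: copy k of slice i goes to node (i + k) mod n."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    placement = {node: set() for node in nodes}
    for i, s in enumerate(slices):
        for k in range(copies):
            placement[nodes[(i + k) % len(nodes)]].add(s)
    return placement


def survives_failure(placement, failed):
    """True if every slice still has a copy on some node not in `failed`."""
    all_slices = set().union(*placement.values())
    surviving = set().union(
        *(held for node, held in placement.items() if node not in failed)
    )
    return surviving == all_slices


if __name__ == "__main__":
    layout = place_replicas(["S1", "S2", "S3", "S4", "S5"], ["n1", "n2", "n3"])
    for node in layout:
        print(node, sorted(layout[node]), survives_failure(layout, {node}))
    print("n1+n2 down:", survives_failure(layout, {"n1", "n2"}))
```

With two copies per slice, this layout tolerates the loss of any one node; losing two nodes at once can leave some slices with no surviving copy, so surviving multi-node failures would need more copies per slice.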
  • 24. Questions for Today
      o What is high availability and when is it needed?
        – Redundancy to minimize downtimes
        – Financial, health, and other critical workloads
      o What’s the difference between high availability and fault tolerance?
        – High availability: minimize downtime
        – Fault tolerance: zero downtime
      o How is it possible to survive a multi-node failure in MySQL?
        – Multiple server redundancy
        – Maintaining strong consistency requires synchronous data replication between servers
  • 25. Questions for Today
      o What are the best practices for achieving high availability with MySQL?
        – Synchronous replication: can affect performance or scale
        – Asynchronous replication: can affect data consistency
      o What are the costs of achieving HA? What can be the most cost-effective strategy?
        – Redundancy of servers: CAPEX & OPEX for DevOps
        – License/support costs: ramp up with the number of servers
        – Ideally: each server provides scale + HA

Editor's Notes

  1. With each additional server or node, you add complexity and fragility
  2. Here’s what “5x 9’s” really means, etc. Typical production systems target 5x 9’s
  3. Let’s define some terms… ORCL: https://docs.oracle.com/cd/E17904_01/core.1111/e10106/intro.htm#ASHIA712 IBM: https://www.ibm.com/support/knowledgecenter/SSPHQG_7.2.1/com.ibm.powerha.concepts/ha_concepts_fault.htm
  4. https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
  5. At the risk of making generalizations…