Scaling MySQL in AWS 
All Your Base Conf, Oxford, UK 
Laine Campbell, Co-Founder 
October 17th, 2014
Who am I? 
DB Architect, Entrepreneur and super hero… 
Good hair, not overly clever, female pronouns
Who am I? 
Humor - Self-effacing, self-aggrandizing, dry… 
Cajun, roguish, and I destroy everything I touch...
Who am I? 
When in doubt, laugh
Who am I? 
Happy Belated Ada Lovelace Day
Agenda 
● Amazon options for implementation 
● MySQL scaling patterns 
● Resiliency 
● Round it out 
● War Stories
RDS and EC2/MySQL 
A love story...
Amazon RDS (DBaaS) 
Basic Operations Managed 
Ease of Deployment 
Supports Scaling via Replication 
Resilient via Replication, EBS RAID, Multi-AZ, Multi-Region
What is Multi-AZ? 
Automatic Failover, spread across availability zones 
Significantly reduces impact of operations such as: 
● Backups 
● Replica Builds 
● Patching
Managed Operations 
● Backups and Recovery 
● Provisioning 
● Patching 
● Auto Failover 
● Replication
What does it cost? 
● Instance tax: 35% - 37% 
● Provisioned IOPS: 15% 
● Lack of transparency - can increase downtime
What does it cost? 
● Multi AZ Master - db.r3.8xlarge, 1 yr reserved 
○ $17,312 
● 3 replicas - db.r3.8xlarge, 1 yr reserved 
○ $25,968 
● Provisioned IOPS - general purpose (new) 
○ 1 TB dataset = $6,144 (5 copies) 
● Total Cost RDS = $49,424 
● Off of RDS = $34,726
Cost Thoughts 
● RDS costs an extra $14,698/yr 
○ DBA costs $144K/yr = $108/hour (time off, productivity, retention/churn) 
○ Equals 136 hours of DBA time (3.5 weeks) 
● Automating via EC2 is a one time job, RDS tax is ALWAYS 
○ 5 clusters costs you 680 hours of DBA time (17 weeks) 
○ 5 clusters, 3 years = 2,040 hours, or 51 weeks 
○ What can your DBA do with an extra 51 weeks?
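The arithmetic behind these cost slides can be checked with a short sketch. The dollar figures are copied from the slides; the 40-hour work week and the function names are assumptions added for illustration.

```python
# Figures copied from the slides; the 40-hour week is an assumption.
RDS_MASTER = 17_312        # Multi-AZ master, db.r3.8xlarge, 1 yr reserved
RDS_REPLICAS = 25_968      # 3 replicas, 1 yr reserved
STORAGE = 6_144            # 1 TB dataset, 5 copies
OFF_RDS_TOTAL = 34_726     # same cluster built off of RDS
DBA_HOURLY = 108           # $144K/yr loaded rate from the slide
HOURS_PER_WEEK = 40        # assumption, not stated on the slide


def rds_premium() -> int:
    """Yearly extra cost of one cluster on RDS versus off of RDS."""
    return RDS_MASTER + RDS_REPLICAS + STORAGE - OFF_RDS_TOTAL


def dba_hours(clusters: int = 1, years: int = 1) -> int:
    """DBA hours the premium would buy, rounded per cluster as on the slide."""
    per_cluster = round(rds_premium() / DBA_HOURLY)
    return per_cluster * clusters * years


print(rds_premium())                                       # 14698
print(dba_hours())                                         # 136
print(dba_hours(5), dba_hours(5) / HOURS_PER_WEEK)         # 680 17.0
print(dba_hours(5, 3), dba_hours(5, 3) / HOURS_PER_WEEK)   # 2040 51.0
```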
Other Costs 
● Lock-in: 
○ In 5.6, you can replicate out, making this moot. 
○ Once out, you have to automate anyway; you could have spent that effort at the beginning. 
● Lack of Visibility: 
○ DTrace, tcpdump, top, vmstat, etc. 
● Lack of Control: 
○ Data Security, Shared Environments, Backups???? 
○ Restarts due to exploits, etc...
Amazon EC2, Roll Your Own 
Build your own automation: 
● Provisioning, with replication 
● Configuration management 
● Backup and recovery 
● Instance registration 
Other DBaaS Options such as Tesora Trove 
● Community, free 
● Enterprise starts at $25,000 for 50 instances 
● Don’t forget RDS, 5 clusters, 1 year = $73,440
Choosing RDS vs EC2
Why use RDS? 
● Legacy apps that cannot use 5.6, and can accept SLAs below 99.65% 
● Low volume/traffic applications for companies who choose not to have their own operations expertise
Why use EC2? 
● Want MariaDB or XtraDB Variants 
● Want more flexibility in multi-region setups (before 5.6) 
● Want portability to other clouds 
● High performance and scale needs that require access to the OS, and to the FULL DB instance
Storage Options in AWS 
Type | RDS | EC2 | Persistent | Max IOPS | Max Throughput | Cost 
Pure SSD | No | Yes | No | 100,000 | 390 MBps | Free w/ instance cost 
EBS General Purpose SSD | Yes | Yes | Yes | 3,000 | 128 MBps | $.10/GB/month 
EBS PIOPS SSD | Yes | Yes | Yes | 4,000 | 128 MBps | $.125/GB/month + $.065/PIOPS/month 
EBS Magnetic | Yes | Yes | Yes | 40-200 | 40-90 MBps | $.05/GB/month + $.05/million IOPS
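To compare the EBS options on price, the per-month rates in the storage table can be turned into a small calculator. This is a rough sketch: the function names and the example inputs (1 TB, 4,000 PIOPS, 100 million requests) are illustrative assumptions, and the rates are the 2014 figures from the slide, not current pricing.

```python
# Rates are the per-month figures from the storage table on the slide.
GP_SSD_PER_GB = 0.10
PIOPS_PER_GB = 0.125
PIOPS_PER_IOPS = 0.065
MAGNETIC_PER_GB = 0.05
MAGNETIC_PER_MILLION_IO = 0.05


def gp_ssd_monthly(gb: float) -> float:
    """General Purpose SSD: flat price per GB."""
    return gb * GP_SSD_PER_GB


def piops_monthly(gb: float, provisioned_iops: int) -> float:
    """PIOPS SSD: price per GB plus price per provisioned IOPS."""
    return gb * PIOPS_PER_GB + provisioned_iops * PIOPS_PER_IOPS


def magnetic_monthly(gb: float, million_requests: float) -> float:
    """Magnetic: price per GB plus price per million I/O requests."""
    return gb * MAGNETIC_PER_GB + million_requests * MAGNETIC_PER_MILLION_IO


# 1 TB volume; the IOPS and request counts are illustrative inputs.
print(f"{gp_ssd_monthly(1024):.2f}")            # 102.40
print(f"{piops_monthly(1024, 4000):.2f}")       # 388.00
print(f"{magnetic_monthly(1024, 100):.2f}")     # 56.20
# Five copies of 1 TB on General Purpose SSD for a year:
print(f"{gp_ssd_monthly(1024) * 12 * 5:.2f}")   # 6144.00
```

The last line reproduces the $6,144 storage figure from the RDS cost slide (1 TB dataset, 5 copies).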
Scaling MySQL at AWS
My Definition of Scaling 
● Capacity is elastic and automated 
● Performance stays reasonably consistent 
● Availability scales 
● Resiliency scales 
● Operational visibility scales 
● Backup and recovery scales
MySQL Workload Scaling 
● Break out workloads to their own clusters 
○ To facilitate sharding large data-sets horizontally 
○ To segregate specific workload characteristics 
● Evaluate each workload’s read/write needs 
○ Total dataset size and growth 
○ Data change delta (updates/deletes)
MySQL Workload Scaling 
● User Login/Profile (shard candidate): 1 TB (1MM users), 20,000 IOPS peak, 10% write / 90% read 
● User Content (shard candidate): 5 TB, 500,000 IOPS peak, 1% write / 99% read 
● Site Metadata: 5 GB, 500 IOPS peak, 25% write / 75% read
MySQL Workload Scaling 
Determine sharding size based on constraints 
○ AWS Write IO 
○ Replication Limits 
○ Tolerance for large numbers of systems 
○ Budget
MySQL Workload Scaling 
Sharding Topology: 
● Schema:Shard relationship 1:1 
● Instance:Schema relationship 1:N 
● Host:Instance relationship 1:1 
[Diagram: Host 1 runs Instance 1, which holds Shard 1 and Shard 2]
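The topology above can be sketched as a lookup from a user id to its shard, schema, instance, and host. This is a minimal illustration, not the speaker's implementation: the CRC32 hash, the shard count of 10, the two-schemas-per-instance packing, and the schema and host names are all assumptions.

```python
import zlib

# Hypothetical layout: 10 hashable shards, two schemas per instance,
# one instance per host. Schema and host names are made up.
NUM_SHARDS = 10
SHARDS_PER_INSTANCE = 2


def shard_for(user_id: int) -> int:
    """Map a user id to a shard with a stable hash (CRC32 here)."""
    return zlib.crc32(str(user_id).encode("ascii")) % NUM_SHARDS


def locate(user_id: int) -> dict:
    """Resolve shard -> schema (1:1) -> instance (N schemas each) -> host (1:1)."""
    shard = shard_for(user_id)
    instance = shard // SHARDS_PER_INSTANCE
    return {
        "shard": shard,
        "schema": f"user_profile_{shard:02d}",
        "instance": instance,
        "host": f"db-host-{instance + 1}",
    }


print(locate(42))
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the mapping identical across application servers.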
MySQL Workload Scaling 
User Profile Data - 10 shards, hashable 
○ Per shard: 2,000 IOPS (500 MB), 200 wps (50 MB), 100 GB storage 
■ Two replication threads/shards per cluster 
○ SSD General Purpose EBS, 2 x 300 GB volumes, striped 
■ 1,800 IOPS, 6,000 burst, 800 MBps 
○ 3,600 rps (900 MBps) requires 2 replicas (500 MBps), +1 for redundancy 
○ 1 master, 1 failover, 3 replicas per cluster x 5 clusters = 25 hosts, r3.2xlarge 
■ Memory = 61 GB > active dataset 
■ Assumes read/write splitting 
○ Total Cost = $48,600 instances + $18,000 storage = $66,600
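The sizing on this slide follows from a few multiplications, sketched below. The workload figures come from the slide; the 3 IOPS per GB baseline for General Purpose SSD is an assumption brought in to explain the 1,800 IOPS figure, and the function names are illustrative.

```python
import math

# Slide figures for the User Profile workload.
SHARDS = 10
SHARDS_PER_CLUSTER = 2          # two replication threads/shards per cluster
HOSTS_PER_CLUSTER = 1 + 1 + 3   # master + failover + replicas
READS_PER_SHARD = 2_000 - 200   # total IOPS minus writes per second
INSTANCE_COST = 48_600
STORAGE_COST = 18_000

# Assumption: General Purpose SSD baseline of 3 IOPS per provisioned GB.
GP_SSD_IOPS_PER_GB = 3


def clusters() -> int:
    """Number of clusters needed to hold all shards."""
    return math.ceil(SHARDS / SHARDS_PER_CLUSTER)


def total_hosts() -> int:
    """Hosts across all clusters."""
    return clusters() * HOSTS_PER_CLUSTER


def striped_baseline_iops(volumes: int, gb_each: int) -> int:
    """Baseline IOPS of a stripe is the sum of its volumes' baselines."""
    return volumes * gb_each * GP_SSD_IOPS_PER_GB


def cluster_reads_per_second() -> int:
    """Read load one cluster's replicas must absorb."""
    return READS_PER_SHARD * SHARDS_PER_CLUSTER


print(clusters(), total_hosts())          # 5 25
print(striped_baseline_iops(2, 300))      # 1800
print(cluster_reads_per_second())         # 3600
print(INSTANCE_COST + STORAGE_COST)       # 66600
```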
MySQL Workload Scaling 
● Keep Schema:Shard relationship 1:1 
● Change Schema:Instance relationship from 1:N to 1:1 
● Change Instance:Host relationship from 1:1 to 1:N 
[Diagram: Host 1 runs Instance 1 (Shard 1) and Instance 2 (Shard 2)]
MySQL Workload Scaling 
Summary 
○ Final constraint is write IOPS 
○ Sharding eliminates that constraint 
○ There are ways to reduce reads: 
■ Caching 
○ There are ways to reduce writes: 
■ Throttling concurrency with queuing and loose coupling to keep write IO down 
■ Compression! Native or application-based 
○ Moving storage to ephemeral SSD saves $$$$ 
■ If you truly can leverage backup and recovery with small datasets
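One way to picture "throttling concurrency with queuing and loose coupling" is a buffer that accepts writes immediately and releases them to the database in bounded batches. The class below is a hypothetical sketch, not something from the talk: `WriteThrottle`, its `sink` callback, and the batch size are all illustrative, and a real deployment would more likely use a message queue between the application and the database.

```python
from collections import deque
from typing import Callable, Deque, List


class WriteThrottle:
    """Buffer writes and release them in bounded batches.

    Producers enqueue rows and return immediately (loose coupling).
    A worker calls flush() on its own schedule, so the database sees
    at most max_batch rows per flush however bursty the producers are.
    """

    def __init__(self, sink: Callable[[List[dict]], None], max_batch: int = 100):
        self._sink = sink              # stand-in for the real batched INSERT
        self._max_batch = max_batch
        self._pending: Deque[dict] = deque()

    def enqueue(self, row: dict) -> None:
        self._pending.append(row)

    def flush(self) -> int:
        """Send up to max_batch pending rows to the sink; return how many."""
        count = min(self._max_batch, len(self._pending))
        batch = [self._pending.popleft() for _ in range(count)]
        if batch:
            self._sink(batch)
        return len(batch)

    def __len__(self) -> int:
        return len(self._pending)


written: List[List[dict]] = []
throttle = WriteThrottle(written.append, max_batch=3)
for i in range(7):
    throttle.enqueue({"id": i})
while len(throttle):
    throttle.flush()
print([len(batch) for batch in written])   # [3, 3, 1]
```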
MySQL Workload Scaling 
[Diagram: Shard 1 and Shard 2, each with one master, two replicas, and one failover]
Resiliency Layers 
● Sharding 
○ Each shard carries y% of traffic, where y = 100 / number of shards 
○ E.g., with 64 shards, one shard lost = 1.56% of traffic 
● EBS Snapshots 
○ Rapid redeployment of failed nodes (kill, rebuild)
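The sharding formula above is easy to tabulate. A minimal sketch, assuming traffic is spread evenly across shards (the function name is illustrative):

```python
def traffic_share_per_shard(shards: int) -> float:
    """Percent of traffic affected when one evenly loaded shard is lost."""
    if shards < 1:
        raise ValueError("shards must be >= 1")
    return 100.0 / shards


for n in (4, 16, 64):
    print(n, traffic_share_per_shard(n))
# 4 25.0
# 16 6.25
# 64 1.5625
```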
Type of Change | EC2 | RDS Master (Non Multi-AZ) | RDS Master (Multi-AZ) | RDS Replica 
Instance resize up/down | Rolling migrations | Moderate downtime | Minimal downtime | Moderate downtime (take out of service) 
EBS <-> PIOPS | Severe perf impact | Severe perf impact | Minor perf impact | Severe perf impact (take out of service) 
PIOPS amount change | Minor perf impact | Minor perf impact | Minor perf impact | Severe perf impact (take out of service) 
Disk space change (add) | Severe perf impact | Severe perf impact | Minor perf impact | Severe perf impact (take out of service) 
Disk space change (reduce) | Rolling migrations | Severe downtime (promote from replica) | Minimal downtime | Moderate downtime (take out of service)
Power Up - Resiliency via Geography
MySQL Workload Scaling 
[Diagram: US East 1 holds the S1 master and two replicas across AZ 1, AZ 2, and AZ 3. US West 1 holds the S2 failover, with future replicas in AZ 2 and AZ 3. Two Nginx nodes are shown.] 
During failover, spawn replicas
Cluster Management and Failover 
● Roll your own…. 
○ Config Mgmt 
○ Automation Scripting 
○ HAProxy 
● Spend some dough 
○ RDS 
○ Continuent Tungsten 
○ ScaleArc 
● Bleeding edge but clotting 
○ Galera, via MariaDB or XtraDB Cluster 
● Bleeding edge and fresh 
○ OpenStack, Trove and Tesora
Rounding it Out: Operational Visibility 
● Monitoring and Alerting: Sensu (not Nagios) 
● Time Series Trending: Graphite or OpenTSDB 
● Graphing of Data: Grafana 
● Log Aggregation and Management: Logstash or Splunk 
● Application Monitoring: New Relic or AppDynamics
Rounding it Out: Backup and Recovery 
● Sharding keeps systems small and agile 
○ Snapshots for tactical kill and build 
○ S3 for longer-term 
○ Glacier forever 
○ Offsite for Legal
Questions and Follow up 
○ lcampbell@pythian.com 
○ www.pythian.com 
○ www.linkedin.com/lainecampbell 
○ www.adafoundation.org

Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Recently uploaded (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Scaling MySQL in Amazon Web Services

  • 1. Scaling MySQL in AWS All Your Base Conf, Oxford, UK Laine Campbell, Co-Founder October 17th, 2014
  • 2. Who am I? DB Architect, Entrepreneur and super hero… Good hair, not overly clever, female pronouns
  • 3. Who am I? Humor - Self effacing, self aggrandizing, dry… Cajun, rogueish, and I destroy everything I touch...
  • 4. Who am I? When in doubt, laugh
  • 5. Who am I? Happy Belated Ada Lovelace Day
  • 6. Agenda ● Amazon options for implementation ● MySQL scaling patterns ● Resiliency ● Round it out ● War Stories
  • 7. RDS and EC2/MySQL A love story...
  • 8. Amazon RDS (DBaaS) Basic Operations Managed Ease of Deployment Supports Scaling via Replication Resilient via Replication, EBS RAID, Multi-AZ, Multi-Region
  • 9. What is Multi-AZ? Automatic Failover, spread across availability zones Significantly reduces impact of operations such as: ● Backups ● Replica Builds ● Patching
  • 10. Managed Operations ● Backups and Recovery ● Provisioning ● Patching ● Auto Failover ● Replication
  • 11. What does it cost? ● Instance tax: 35% - 37% ● Provisioned IOPS: 15% ● Lack of transparency - can increase downtime
  • 12. What does it cost? ● Multi AZ Master - db.r3.8xlarge, 1 yr reserved ○ $17,312 ● 3 replicas - db.r3.8xlarge, 1 yr reserved ○ $25,968 ● Provisioned IOPS - general purpose (new) ○ 1 TB dataset = $6,144 (5 copies) ● Total Cost RDS = $49,424 ● Off of RDS = $34,726
  • 13. Cost Thoughts ● RDS costs an extra $14,698/yr ($49,424 vs. $34,726) ○ DBA costs $144K/yr = $108/hour (time off, productivity, retention/churn) ○ Equals 136 hours of DBA time (3.5 weeks) ● Automating via EC2 is a one-time job, RDS tax is ALWAYS ○ 5 clusters cost you 680 hours of DBA time (17 weeks) ○ 5 clusters, 3 years = 2,040 hours, or 51 weeks ○ What can your DBA do with an extra 51 weeks?
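
The arithmetic on slides 12 and 13 can be checked with a short script. This is only a sketch built from the slide figures; the 40-hour work week and all names are assumptions made for illustration, not part of the talk.

```python
# Figures are taken from slides 12 and 13; names are illustrative.
RDS_TOTAL = 17_312 + 25_968 + 6_144   # Multi-AZ master + 3 replicas + storage
EC2_TOTAL = 34_726                    # "Off of RDS" figure from slide 12
DBA_HOURLY = 108                      # loaded DBA cost per hour (slide 13)
HOURS_PER_WEEK = 40                   # assumption: standard work week


def rds_premium_hours(clusters: int = 1, years: int = 1) -> int:
    """Return the DBA hours that the RDS premium would pay for."""
    premium = RDS_TOTAL - EC2_TOTAL                       # per cluster, per year
    hours_per_cluster_year = round(premium / DBA_HOURLY)  # rounded as on the slide
    return hours_per_cluster_year * clusters * years


if __name__ == "__main__":
    for clusters, years in [(1, 1), (5, 1), (5, 3)]:
        hours = rds_premium_hours(clusters, years)
        weeks = hours / HOURS_PER_WEEK
        print(f"{clusters} cluster(s), {years} yr: {hours} h = {weeks:.1f} weeks")
```

Rounding the per-cluster hours first mirrors how the slide appears to arrive at 680 and 2,040 hours.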
  • 14. Other Costs ● Lock-in: ○ In 5.6, you can replicate out, making this moot. ○ You have to automate once out, you could have spent this in the beginning. ● Lack of Visibility: ○ Dtrace, TCPDump, Top, VMStat, etc… ● Lack of Control: ○ Data Security, Shared Environments, Backups???? ○ Restarts due to exploits, etc...
  • 15. Amazon EC2, Roll Your Own Build your own automation: ● Provisioning, with replication ● Configuration management ● Backup and recovery ● Instance registration Other DBaaS Options such as Tesora Trove ● Community, free ● Enterprise starts at $25,000 for 50 instances ● Don’t forget RDS, 5 clusters, 1 year = $73,440
  • 17. Why use RDS? ● Legacy apps that cannot use 5.6, and can accept < 99.65% SLAs ● Low volume/traffic applications for companies who choose to not have their own operations expertise
  • 18. Why use EC2? ● Want MariaDB or XtraDB Variants ● Want more flexibility in multi-region setups (before 5.6) ● Want portability to other clouds ● High performance and scale needs that require access to the OS, and to the FULL DB instance
  • 19. Storage Options in AWS

      Type                   | RDS | EC2 | Persistent | Max IOPS | Max Throughput | Cost
      Pure SSD               | No  | Yes | No         | 100,000  | 390 MBps       | Free w/Instance Cost
      EBS General Purpose SSD| Yes | Yes | Yes        | 3,000    | 128 MBps       | $.10/GB/month
      EBS PIOPS SSD          | Yes | Yes | Yes        | 4,000    | 128 MBps       | $.125/GB/month + $.065/PIOPS/month
      EBS Magnetic           | Yes | Yes | Yes        | 40-200   | 40-90 MBps     | $.05/GB/month + $.05/million IOPS

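
A rough way to turn the slide 19 prices into yearly storage costs is sketched below. The prices are the 2014 figures from the slide; the function names and the 1 TB = 1,024 GB convention are assumptions. With those assumptions, the General Purpose SSD price reproduces the $6,144 storage figure from slide 12.

```python
# 2014 prices from slide 19 (USD); names are illustrative.
GP_SSD_PER_GB_MONTH = 0.10      # EBS General Purpose SSD
PIOPS_PER_GB_MONTH = 0.125      # EBS Provisioned IOPS SSD, storage part
PIOPS_PER_IOPS_MONTH = 0.065    # EBS Provisioned IOPS SSD, per provisioned IOPS


def gp_ssd_yearly_cost(dataset_gb: float, copies: int) -> float:
    """Yearly cost of keeping `copies` copies of a dataset on GP SSD."""
    return round(dataset_gb * GP_SSD_PER_GB_MONTH * 12 * copies, 2)


def piops_yearly_cost(dataset_gb: float, iops: int, copies: int) -> float:
    """Yearly cost on PIOPS volumes: storage charge plus IOPS charge."""
    monthly = dataset_gb * PIOPS_PER_GB_MONTH + iops * PIOPS_PER_IOPS_MONTH
    return round(monthly * 12 * copies, 2)


if __name__ == "__main__":
    # 1 TB dataset, 5 copies (master, failover, 3 replicas)
    print(gp_ssd_yearly_cost(1024, 5))
    print(piops_yearly_cost(1024, 4000, 5))
```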
  • 21. My Definition of Scaling ● Capacity is elastic and automated ● Performance stays reasonably consistent ● Availability scales ● Resiliency scales ● Operational visibility scales ● Backup and recovery scales
  • 22. MySQL Workload Scaling ● Break out workloads to their own clusters ○ To facilitate sharding large data-sets horizontally ○ To segregate specific workload characteristics ● Evaluate each workload’s read/write needs ○ Total dataset size and growth ○ Data change delta (updates/deletes)
  • 23. MySQL Workload Scaling ● User Login/Profile: 1 TB (1MM User), 20,000 iops peak, 10% write, 90% read ● User Content: 5 TB, 500,000 iops peak, 1% write, 99% read ● Site Metadata: 5 GB, 500 iops peak, 25% write, 75% read ● Shard Candidate ● Shard Candidate
  • 24. MySQL Workload Scaling Determine sharding size based on constraints ○ AWS Write IO ○ Replication Limits ○ Tolerance for large numbers of systems ○ Budget
  • 25. MySQL Workload Scaling Sharding Topology: ● Schema:Shard relationship 1:1 ● Instance:Schema relationship 1:N ● Host:Instance relationship 1:1 ● Diagram: Host 1 contains Instance 1, which holds Shard 1 and Shard 2
  • 26. MySQL Workload Scaling User Profile Data - 10 shards, hashable ○ Per Shard: 2,000 (500MB) iops, 200 wps (50MB), 100 GB storage ■ Two replication threads/shards per cluster ○ SSD General Purpose EBS, 2 x 300 GB Volumes Striped ■ 1,800 IOPS, 6000 burst, 800 MBps ○ 3,600 rps (900 MBps) requires 2 replicas (500 MBps), +1 for redundancy ○ 1 master, 1 failover, 3 replicas = 25 hosts, r3.2xlarge ■ Memory = 61 GB > active dataset ■ Assumes read/write splitting ○ Total Cost = $48,600 instances + $18,000 storage = $66,600
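
Slide 26 describes the user profile data as "10 shards, hashable" with two shards per cluster. One possible way to route a user to a shard and cluster is sketched below. The choice of CRC32, the modulo scheme, and the function names are assumptions for illustration, not the method used in the talk.

```python
import zlib

NUM_SHARDS = 10          # from slide 26
SHARDS_PER_CLUSTER = 2   # "Two replication threads/shards per cluster"


def shard_for(user_id: int) -> int:
    """Map a user id to a shard with a stable hash (same result on every run)."""
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    return digest % NUM_SHARDS


def cluster_for(shard: int) -> int:
    """Map a shard number to the cluster that hosts it."""
    return shard // SHARDS_PER_CLUSTER


if __name__ == "__main__":
    for uid in (1, 42, 1_000_000):
        shard = shard_for(uid)
        print(f"user {uid} -> shard {shard} -> cluster {cluster_for(shard)}")
```

Plain modulo hashing is simple, but changing the shard count remaps most keys, so a lookup table or consistent hashing may be preferable if resharding is expected.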
  • 27.
  • 28. MySQL Workload Scaling ● Keep Schema:Shard relationship 1:1 ● Change Schema:Instance relationship from 1:N to 1:1 ● Change Instance:Host relationship from 1:1 to 1:N ● Diagram: Host 1 contains Instance 1 (Shard 1) and Instance 2 (Shard 2)
  • 29.
  • 30. MySQL Workload Scaling Summary ○ Final Constraint is Write IOPS ○ Sharding eliminates Constraint ○ There are ways to reduce those reads: ■ Caching ○ There are ways to reduce those writes: ■ Throttling concurrency with queuing and loose coupling to keep write IO down ■ Compression! Native or application based ○ Moving storage to ephemeral SSD saves $$$$ ■ If you truly can leverage backup and recovery with small datasets
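
The "throttling concurrency with queuing" point on slide 30 can be illustrated with a small worker pool that drains a queue, so that no more than a fixed number of writes run at once. This is a minimal sketch; `run_throttled`, `apply_write`, and the default limit are hypothetical names and values, and a real system would more likely use a durable message queue.

```python
import queue
import threading
from typing import Callable, Iterable


def run_throttled(jobs: Iterable, apply_write: Callable,
                  max_concurrent_writers: int = 4) -> None:
    """Apply every job, with at most `max_concurrent_writers` running at once."""
    pending: queue.Queue = queue.Queue()
    for job in jobs:
        pending.put(job)

    def worker() -> None:
        # Each worker pulls jobs until the queue is empty, then exits.
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return
            apply_write(job)

    threads = [threading.Thread(target=worker)
               for _ in range(max_concurrent_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    run_throttled(range(10), lambda job: print("write", job),
                  max_concurrent_writers=2)
```

The number of worker threads is the concurrency cap, so write IO against the master stays bounded regardless of how many jobs are queued.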
  • 31. MySQL Workload Scaling ● Shard 1: Master, Replica, Replica, Failover ● Shard 2: Master, Replica, Replica, Failover
  • 32. Resiliency Layers ● Sharding ○ Shard N = y% of traffic, where y = 1/Shards*100 ○ E.g., with 64 shards, one shard lost = 1.56% ● EBS Snapshots ○ Rapid redeployment of failed nodes (kill, rebuild)
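
The formula on slide 32, y = 1/Shards*100, can be written out as a small helper. The sketch assumes traffic is spread evenly across shards, which the slide implies but does not state; the function name is illustrative.

```python
def traffic_lost_pct(shards: int, failed: int = 1) -> float:
    """Percentage of traffic affected when `failed` of `shards` shards are down.

    Assumes every shard carries an equal share of traffic.
    """
    if shards <= 0:
        raise ValueError("shards must be positive")
    if not 0 <= failed <= shards:
        raise ValueError("failed must be between 0 and shards")
    return failed / shards * 100


if __name__ == "__main__":
    for n in (1, 10, 64):
        print(f"{n} shards, one lost: {traffic_lost_pct(n):.2f}% of traffic")
```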
  • 33. Impact by type of change

      Type of Change             | EC2                | RDS Master (Non Multi-AZ)               | RDS Master (Multi-AZ) | RDS Replica
      Instance resize up/down    | Rolling Migrations | Moderate Downtime                       | Minimal Downtime      | Moderate Downtime (take out of service)
      EBS <-> PIOPS              | Severe Perf impact | Severe Perf impact                      | Minor Perf impact     | Severe Perf impact (take out of service)
      PIOPS Amount Change        | Minor Perf impact  | Minor Perf impact                       | Minor Perf impact     | Severe Perf impact (take out of service)
      Disk Space Change (add)    | Severe Perf impact | Severe Perf impact                      | Minor Perf impact     | Severe Perf impact (take out of service)
      Disk Space Change (reduce) | Rolling Migrations | Severe Downtime (promote from replica)  | Minimal Downtime      | Moderate Downtime (take out of service)

  • 34. Power Up - Resiliency via Geography
  • 35. MySQL Workload Scaling S1 Master US East 1 Replica Replica S2 Failover During failover, spawn replicas US West 1 AZ 2 AZ 3 AZ 1 Future Replica Future Replica AZ 2 AZ 3 Nginx Nginx
  • 36. Cluster Management and Failover ● Roll your own…. ○ Config Mgmt ○ Automation Scripting ○ HAProxy ● Spend some dough ○ RDS ○ Continuent Tungsten ○ Scalearc ● Bleeding edge but clotting ○ Galera, via MariaDB or XtraDB Cluster ● Bleeding edge and fresh ○ Openstack, Trove and Tesora
  • 37. Rounding it Out: Operational Visibility ● Monitoring and Alerting: Sensu (not Nagios) ● Time Series Trending: Graphite or OpenTSDB ● Graphing of Data: Grafana ● Log Aggregation and Management: Logstash or Splunk ● Application Monitoring: New Relic or AppDynamics
  • 38. Rounding it Out: Backup and Recovery ● Sharding keeps systems small and agile ○ Snapshots for tactical kill and build ○ S3 for longer-term ○ Glacier forever ○ Offsite for Legal
  • 39. Questions and Follow up ○ lcampbell@pythian.com ○ www.pythian.com ○ www.linkedin.com/lainecampbell ○ www.adafoundation.org