Migrating to and
living on
RDS/Aurora
life after Datacenters
About me
Balázs Pőcze
● I came from the operations world (ops/devops/sre)
● I have worked as a DBA for 4 years
● Currently I work for Gizmodo
● @banyek
● https://github.com/banyek
● http://blog.balazspocze.me
“Move to the cloud they said, it will be fun, they said...”
Why migrate to RDS
- It is AWS native
- A lot of complexity is handled by Amazon
- It is Someone Else’s Problem (SEP ™)
- You have someone to blame
- It just works!
Why not migrate to RDS
- It’s not always the same approach as you’d choose
- You can’t access certain parts of your system
- You have your own enlightened toolset which you
want to use
- You are not happy using a black box system, where you
can only hope it will work somehow
What is Aurora
RDS/Aurora
- MySQL compatible HA database cluster
- No replication between the nodes; they share a common
storage, and the redo logs are shared
- This means there’s no ‘real’ replication lag between the
write node and the readers; in our case it is
around 12-16 ms
- The cluster has a writer and a reader endpoint; the writer
does not serve reads from the reader endpoint
- A cluster can have up to 16 nodes
- The nodes don’t have to be the same size
Creating cluster
Using the web interface
- DON’T!
- Just don’t.
- It is not reproducible
- There is no clean code you can just read and hand to
someone else
- You can export it to a cloudformation template
Awscli / boto3
- Amazon native
- Flexible
- Relatively easy
- Not the most reusable
Puppet/ Chef / Ansible
We tried to use puppet, but it wasn't a success - for us
- A lot of recreations happened instead of modifying
resources
- The newest API was not supported
- It worked, but with some limitations
Terraform
- No vendor lock
- It needs an Atlas server to work properly
- We had a bad experience
- Constantly changes
- Not always the latest API in use
- It always recreated and destroyed the cluster instead of modifying
CloudFormation
- Native Amazon solution
- Could be written in YAML or in JSON
- Works with changesets
- The Amazon infrastructure fully supports it
- With CodePipeline and CodeBuild you can create a real
neat pipeline!
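A minimal sketch of what such a template might contain, generated from Python here so it can be reviewed and diffed like any other code. The resource names (`AuroraCluster`, `AuroraInstance1`), the parameter names, the instance class and the engine name are illustrative assumptions, not values from this talk; check them against the Aurora version you actually run.

```python
import json


def aurora_template(instance_class="db.r4.large", engine="aurora-mysql"):
    """Build a small CloudFormation template: one Aurora cluster, one instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            # Credentials are passed in at deploy time, never hard-coded.
            "MasterUsername": {"Type": "String"},
            "MasterPassword": {"Type": "String", "NoEcho": True},
        },
        "Resources": {
            "AuroraCluster": {
                "Type": "AWS::RDS::DBCluster",
                "Properties": {
                    "Engine": engine,
                    "MasterUsername": {"Ref": "MasterUsername"},
                    "MasterUserPassword": {"Ref": "MasterPassword"},
                },
            },
            "AuroraInstance1": {
                "Type": "AWS::RDS::DBInstance",
                "Properties": {
                    "Engine": engine,
                    # Attach the instance to the cluster defined above.
                    "DBClusterIdentifier": {"Ref": "AuroraCluster"},
                    "DBInstanceClass": instance_class,
                },
            },
        },
    }


template_json = json.dumps(aurora_template(), indent=2)
print(template_json)
```

The printed JSON can be stored in version control and deployed through a change set, which is what makes this approach reproducible compared to clicking through the web interface.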
Sizing the nodes
- Always check Amazon’s node parameters, because with
MySQL the bottleneck is often not the CPU but the
storage speed (Aurora using EBS) or the network!
Loading data
mysqldump / mysqlpump
- --single-transaction
- --master-data=2
- --hex-blob
- Don’t forget the interactive_timeout parameter!
- When uploading to S3 you have to take care of splitting the dump into parts
- mysqlpump can dump in parallel; it has existed since MySQL 5.7
- The data loading is still single threaded
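The options above can be assembled as in the following sketch. The host, user and database names are placeholders; the password is deliberately left out (an option file is the usual place for it), and `--hex-blob` is the mysqldump option that writes binary columns in hexadecimal notation.

```python
import shlex


def mysqldump_command(host, user, database):
    """Return the argv list for a consistent logical dump with binlog coordinates."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent InnoDB snapshot without locking tables
        "--master-data=2",       # write binlog file/position into the dump as a comment
        "--hex-blob",            # dump binary columns in hexadecimal notation
        database,
    ]


cmd = mysqldump_command("replica.example.internal", "dumper", "appdb")
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list keeps it safe to hand to `subprocess.run` without going through a shell.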
Mydumper / Myloader
- Parallel dumper, way faster than mysqldump (in my case
the data loading took around 5 hrs instead of 26!)
- No --single-transaction, so you have to take care
of consistency yourself
- My recommendation: do it on a replica, because you’ll have the
binlog file name and position as well (STOP SLAVE; mydumper
… ; START SLAVE)
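The STOP SLAVE / mydumper / START SLAVE sequence can be sketched as below. This is an illustration under assumptions: the `run` callable, the output directory and the thread count are hypothetical, and connection options are expected to come from a client option file. The point of the sketch is the `try`/`finally`, which restarts replication even when the dump fails.

```python
def dump_from_replica(run, outputdir, threads=4):
    """Stop replication, dump with mydumper, and always restart replication.

    `run` is any callable that executes an argv list (for example
    subprocess.check_call); it is injected so the flow can be dry-run.
    """
    run(["mysql", "-e", "STOP SLAVE;"])
    try:
        # Show the replication coordinates the dump will correspond to.
        run(["mysql", "-e", "SHOW SLAVE STATUS\\G"])
        run(["mydumper", f"--threads={threads}", f"--outputdir={outputdir}"])
    finally:
        # Restart replication even if the dump raised an error.
        run(["mysql", "-e", "START SLAVE;"])


# Dry run: collect the commands instead of executing them.
planned = []
dump_from_replica(planned.append, "/backup/dump")
for argv in planned:
    print(" ".join(argv))
```

Passing `subprocess.check_call` as `run` would execute the same sequence for real.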
S3 - csv
Since Aurora version 1.8 you are able to load data directly from S3
with the ‘LOAD DATA FROM S3’ command. (This command is
similar to LOAD DATA INFILE regarding the usage and data
formats. To create a compatible dump, use SELECT INTO
OUTFILE …)
More info:
https://goo.gl/BA2fGc
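A small sketch of how such a statement might be put together. The bucket, key and table name are made up, and the separators assume a plain comma-separated file; the cluster also needs permission to read the bucket, which is not shown here.

```python
def load_from_s3_sql(s3_uri, table, field_sep=",", line_sep="\\n"):
    """Build a LOAD DATA FROM S3 statement for a CSV-style file.

    Values are interpolated directly, so only pass trusted input.
    """
    return (
        f"LOAD DATA FROM S3 '{s3_uri}' "
        f"INTO TABLE {table} "
        f"FIELDS TERMINATED BY '{field_sep}' "
        f"LINES TERMINATED BY '{line_sep}'"
    )


stmt = load_from_s3_sql("s3://example-bucket/dumps/users.csv", "users")
print(stmt)
```

The clauses after the S3 URI mirror LOAD DATA INFILE, which is why a SELECT INTO OUTFILE dump with matching separators loads cleanly.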
S3 - xtrabackup
According to the documentation it is possible to load data from
MySQL 5.5 and 5.6 that was backed up with xtrabackup
● However it never worked for me
More info:
https://goo.gl/KUBZUY
AWS Database Migration Service
It is designed to migrate from “any” database to RDS
● Supports schema migration
● After the schemas are moved, it populates the tables through
replication
https://aws.amazon.com/dms/
Instance type
- Because the network bandwidth does matter
- Index creation
Compatibility
MariaDB?
- No, it is not.
- It looks like MariaDB from the perspective of GTIDs, but it
is not
- For example, we had to break GTID replication, because MariaDB
does not use the same GTID implementation as native MySQL;
MariaDB can, theoretically, fail back to it. Aurora can’t.
Time Zones
- Aurora databases are in UTC time zone
- Like everything else in AWS, actually
- You can change it
- But if you do, and you replicate to an external DB, check whether the
MySQL time zone tables are installed there - this can break replication
https://goo.gl/LoVkUt
Replication
Not designed to replicate
- You can replicate, in or out, but I don’t recommend it
- My recommendation is to use replication only for
migration
- It’s not safe
- I had broken and unfixable replication because of a node restart
(just because a config change happened!)
RDS managed tables in MySQL
schema
- They do not appear in a mysqldump
- When you create a new host, create these tables
manually, (SHOW CREATE TABLE …) or your replication
will break
Aurora (RDS) as master
- Check the hostname of the cluster before setting up
replication; it might be too long for the data dictionary (in
MySQL 5.6 the hostname is stored in a VARCHAR(60) field):
kinja-staging-rdsstack-aaaaaaaaaaaa-auroracluster-bbbbbbbbbbbb.cluster-cbrmrdukwwf3.us-east-1.rds.amazonaws.com
- Just create a CNAME record, or use the IP address instead
- I never tested what happens during/after a failover
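A quick pre-flight check along these lines can catch the problem before replication is configured. The 60-character limit comes from the VARCHAR(60) field mentioned above; the CNAME `aurora-staging.db.example.com` is a hypothetical alias used only for illustration.

```python
MAX_MASTER_HOST_LEN = 60  # the VARCHAR(60) limit mentioned above


def fits_master_host(hostname, limit=MAX_MASTER_HOST_LEN):
    """Return True when the hostname fits into the replication host field."""
    return len(hostname) <= limit


endpoint = (
    "kinja-staging-rdsstack-aaaaaaaaaaaa-auroracluster-bbbbbbbbbbbb"
    ".cluster-cbrmrdukwwf3.us-east-1.rds.amazonaws.com"
)
alias = "aurora-staging.db.example.com"  # hypothetical CNAME pointing at the endpoint

for name in (endpoint, alias):
    print(len(name), fits_master_host(name), name)
```

Generated endpoint names from CloudFormation stacks tend to be long, so a check like this is worth running in the same script that sets up the replica.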
Security & Access
Internet facing cluster
- Try not to expose it on the internet
- I don’t have to tell you that there are always vulnerabilities, right?
- If you do expose it, please at least always use SSL
- You can force SSL connections with MySQL
Management node
- It is good to have a management node where you can
access your database
- Preconfigured work environment, plenty of disk space
for dumps, preconfigured connection parameters etc.
- Shouldn’t be an expensive node (unless you want to
utilize a lot of CPU or disk IO)
- Should spin up quickly and be destroyed when not
needed (hey, this is cloud, right?)
IAM Users
- You can authenticate IAM users, and grant privileges
inside of RDS but WHY WOULD YOU DO THIS?!
- I think it is way better to separate db users from IAM users
- Yes, this applies to AD as well
Create your own users
- Use the cluster’s own user only for administrative
purposes
- Restrict your app users*
- Separate Read/Write and administrative users
* This is not Aurora/RDS only!
Scaling Aurora
Native autoscaling?
- Sadly there is no autoscaling built in, however it would be
nice to have one
- You can create your own
- You can add a node with the API
- You can delete a node with the API
- You can have events generated by CPU usage or number of queries
etc.
Cluster addresses
- One dns address for read/write
- One dns address for read
- Each host has its own DNS name
- If you remove a node, its address will stay in DNS for 60
seconds (default TTL for Aurora cluster addresses)
Bring your own Autoscaler
(BYOA)
- Endless possibilities!
- Haproxy/ProxySQL/DNS frontend
- My solution is a simple DNS-based autoscaler that manages the cluster
address in Route53
- When I add a node, it creates the node and puts the node’s address
into DNS once the node is available
- When I remove a node, it removes the address from DNS first, and
starts deleting the node only after the TTL has passed
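The ordering described above (publish a node only once it is available; unpublish, wait out the TTL, then delete) can be sketched as follows. This is not the author's implementation: `cloud` and `dns` are hypothetical adapters that would wrap the RDS and Route53 APIs, and the fake versions below exist only so the flow can be run without AWS.

```python
import time


class DnsAutoscaler:
    """Keeps a DNS record set in sync with the reader nodes of a cluster."""

    def __init__(self, cloud, dns, ttl_seconds=60, sleep=time.sleep):
        self.cloud = cloud
        self.dns = dns
        self.ttl_seconds = ttl_seconds
        self.sleep = sleep

    def scale_out(self, node_id, poll_seconds=30):
        self.cloud.create_node(node_id)
        # Only publish the node once it can actually serve traffic.
        while not self.cloud.is_available(node_id):
            self.sleep(poll_seconds)
        self.dns.add(self.cloud.address_of(node_id))

    def scale_in(self, node_id):
        # Unpublish first, then wait out the TTL so cached lookups expire.
        self.dns.remove(self.cloud.address_of(node_id))
        self.sleep(self.ttl_seconds)
        self.cloud.delete_node(node_id)


class FakeCloud:
    """Stand-in for the node API; a node becomes available on the 2nd poll."""

    def __init__(self):
        self.polls = {}
        self.log = []

    def create_node(self, node_id):
        self.polls[node_id] = 0
        self.log.append(("create", node_id))

    def is_available(self, node_id):
        self.polls[node_id] += 1
        return self.polls[node_id] >= 2

    def address_of(self, node_id):
        return f"{node_id}.example.internal"

    def delete_node(self, node_id):
        del self.polls[node_id]
        self.log.append(("delete", node_id))


class FakeDns:
    """Stand-in for the DNS record set behind the cluster address."""

    def __init__(self):
        self.records = set()

    def add(self, address):
        self.records.add(address)

    def remove(self, address):
        self.records.discard(address)


naps = []  # records requested sleeps instead of actually waiting
cloud, dns = FakeCloud(), FakeDns()
scaler = DnsAutoscaler(cloud, dns, ttl_seconds=60, sleep=naps.append)
scaler.scale_out("reader-2")
print(sorted(dns.records))
scaler.scale_in("reader-2")
print(sorted(dns.records), naps)
```

Injecting `sleep` keeps the TTL wait explicit and lets the sequence be exercised instantly; a real run would simply use the default `time.sleep`.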
Living without SUPER
Stored procedures (a lot)
● mysql.rds_set_external_master
● mysql.rds_reset_external_master
● mysql.rds_start_replication
● mysql.rds_stop_replication
● mysql.rds_skip_repl_error
● mysql.rds_next_master_log
● mysql.rds_innodb_buffer_pool_dump_now
● mysql.rds_innodb_buffer_pool_load_now
● mysql.rds_innodb_buffer_pool_load_abort
● mysql.rds_set_configuration
● mysql.rds_show_configuration
● mysql.rds_kill
● mysql.rds_kill_query
● mysql.rds_rotate_general_log
● mysql.rds_rotate_slow_log
● mysql.rds_enable_gsh_collector
● mysql.rds_set_gsh_collector
● mysql.rds_disable_gsh_collector
● mysql.rds_collect_global_status_history
● mysql.rds_enable_gsh_rotation
● mysql.rds_set_gsh_rotation
● mysql.rds_disable_gsh_rotation
● mysql.rds_rotate_global_status_history
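As a hedged illustration of how a few of these procedures are called from application code, the helpers below return a statement plus its parameters for a DB-API cursor. The `%s` placeholder style is the one used by common Python MySQL drivers; the helper names and all argument values are made up for the example.

```python
def set_external_master_call(host, port, user, password, log_file, log_pos, use_ssl=False):
    """Return (sql, params) for pointing the instance at an external master."""
    sql = "CALL mysql.rds_set_external_master(%s, %s, %s, %s, %s, %s, %s)"
    return sql, (host, port, user, password, log_file, log_pos, int(use_ssl))


def kill_call(thread_id):
    """Return (sql, params) replacing KILL, which needs SUPER."""
    return "CALL mysql.rds_kill(%s)", (thread_id,)


# Procedures without arguments can be called without parentheses.
START_REPLICATION = "CALL mysql.rds_start_replication"

sql, params = set_external_master_call(
    "source-db.example.internal", 3306, "repl", "secret", "mysql-bin.000123", 4
)
print(sql, params)
print(*kill_call(4242))
print(START_REPLICATION)
```

With a driver cursor this would be used as `cursor.execute(sql, params)`, which keeps the password and host out of string concatenation.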
Restricted commands
- SET GLOBAL READ_ONLY = 0/1
- This is managed by RDS
Read-only variables
- All variables that need SUPER to change act like
read-only variables
- You can change them in the parameter group
- And restart the node for the change to take effect
Configuration
DB parameter group/Cluster
parameter group
- Nodes can have different DB parameter groups; not all of
them have to be the same (like read_only)
- Cluster parameter groups have options which have to be
the same across the entire cluster (like timezone,
character encoding)
Read only parameters and
cloudformation
- When you change a read only parameter, the cluster
node gets in a ‘pending change’ state and you can reboot
it manually to apply the changes
- When the cluster/node has an associated parameter group
in cloudformation, and you change a read only value, your
node will be rebooted automatically
- To avoid this, don’t associate parameter groups with
instances/clusters via cloudformation
- You can have blue/green param group pairs
Backup
Automated backups
- They’re created via snapshotting the instances
- They are stored in S3 on the backend, but you can’t access them as objects, only
through RDS
- You can’t keep them “forever”
- You can’t set retention policy and storage class
- Quick -> Slow -> Long term (Glacier)
Long term backups
- Create logical dumps
(mysqldump/mysqlpump/mydumper/SELECT INTO
OUTFILE, SELECT INTO S3)
- You can provision a host for this, do the backup, and copy it to s3,
and remove the backup host (awscli/boto3)
- You can set up storage class for the long term backups
- WARNING: Glacier is freaking expensive if you want to
access your backup. No joke.
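The Quick -> Slow -> Long term (Glacier) progression for your own dumps can be expressed as an S3 lifecycle configuration. The sketch below only builds the configuration document; the rule ID, the `mysql-dumps/` prefix and the day counts are assumptions chosen for illustration, and the result would be handed to the S3 bucket lifecycle API (for example through boto3).

```python
import json


def backup_lifecycle(prefix="mysql-dumps/", ia_after_days=30,
                     glacier_after_days=90, expire_after_days=None):
    """Build an S3 lifecycle configuration for long-term logical backups."""
    rule = {
        "ID": "long-term-db-backups",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            # Quick -> slow: move to infrequent-access storage first.
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            # Slow -> long term: archive to Glacier later.
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }
    if expire_after_days is not None:
        # Optional: delete dumps after the retention period.
        rule["Expiration"] = {"Days": expire_after_days}
    return {"Rules": [rule]}


lifecycle = backup_lifecycle(expire_after_days=3650)
print(json.dumps(lifecycle, indent=2))
```

Keeping the most recent dumps in the standard class means a routine test restore does not trigger Glacier retrieval costs.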
ALWAYS TEST YOUR BACKUPS
IF YOUR BACKUP IS NOT TESTED WITH RESTORE, THAT
BACKUP IS A NON-EXISTENT BACKUP. PERIOD.
Monitoring & Logging
Cloudwatch
- You can access host metrics from cloudwatch
- like CPU, Memory, Disk IO
- Not all the MySQL metrics are accessible
- You can create nice dashboards
- As well as nice alerting based on events
Third party monitoring tools
- Vividcortex
- PMM
- Newrelic
These tools can connect to mysql port, and collect metrics via
performance_schema. Check their documentation about the
details, be security-aware. If you can monitor via a host or a
container I believe the second one is better. (KISS)
Monitoring
- Cloudwatch is good, but not all the metrics can be
reached by it.
- Vividcortex should connect via mysql protocol using
performance_schema
Logging
- Hell.
- You can access them from the web interface
- You can access them with awscli
- (maybe there’s a logger backend somewhere?)
Thank you for your time!

Migrating and living on rds aurora

  • 1. Migrating to and living on RDS/Aurora life after Datacenters
  • 2. About me Balázs Pőcze ● I came from the operations world (ops/devops/sre) ● I have worked as a DBA for 4 years ● Currently I work for Gizmodo ● @banyek ● https://github.com/banyek ● http://blog.balazspocze.me
  • 3. “Move to the cloud they said, it will be fun, they said...”
  • 4. Why migrate to RDS - It is AWS native - A lot of complexity is handled by Amazon - It is Someone Else’s Problem (SEP ™) - You have someone to blame - It just works!
  • 5. Why not migrate to RDS - It’s not always the approach you’d choose yourself - You can’t access certain parts of your system - You have your own enlightened toolset which you want to use - You are not happy using a black box system and just hoping it will work somehow
  • 7. RDS/Aurora - MySQL compatible HA database cluster - No replication between the nodes: they share a common storage, and the redo logs are shared - This means there’s no ‘real’ replication lag between the write node and the readers; in our case it is around 12-16 ms - The cluster has a writer and a reader endpoint; the writer is not read through the reader endpoint - A cluster can have up to 16 nodes - The nodes don’t have to be the same size
  • 9. Using the web interface - DON’T! - Just don’t. - It is not reproducible - There is no clean code you can just read and hand to someone else - You can export it to a CloudFormation template
  • 10. Awscli / boto3 - Amazon native - Flexible - Relatively easy - Not the most reusable
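A minimal sketch of the boto3 route, assuming a MySQL-compatible Aurora cluster whose nodes all share one instance class. The identifiers, the node naming scheme and the `aurora-mysql` engine name are assumptions for illustration, not taken from the talk; boto3 is imported only when `apply` is called, so the request builders can be inspected without AWS access.

```python
def cluster_request(cluster_id, master_user, master_password):
    """Keyword arguments for boto3's rds.create_db_cluster()."""
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-mysql",  # assumption: MySQL-compatible Aurora
        "MasterUsername": master_user,
        "MasterUserPassword": master_password,
    }


def instance_requests(cluster_id, instance_class, count):
    """Keyword arguments for rds.create_db_instance(), one dict per node."""
    return [
        {
            # Hypothetical naming scheme: <cluster>-node-<n>
            "DBInstanceIdentifier": f"{cluster_id}-node-{i}",
            "DBInstanceClass": instance_class,
            "Engine": "aurora-mysql",
            "DBClusterIdentifier": cluster_id,
        }
        for i in range(1, count + 1)
    ]


def apply(cluster_kwargs, instance_kwargs_list):
    """Send the requests to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builders work without it

    rds = boto3.client("rds")
    rds.create_db_cluster(**cluster_kwargs)
    for kwargs in instance_kwargs_list:
        rds.create_db_instance(**kwargs)
```

Keeping the request dictionaries separate from the API calls makes the script a little easier to review and reuse, which is the weak spot of this approach.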
  • 11. Puppet / Chef / Ansible We tried to use Puppet, but it wasn't a success for us - A lot of recreations happened instead of modifying resources - The newest API was not supported - It worked, but with some limitations
  • 12. Terraform - No vendor lock-in - It needs an Atlas server to work properly - We had a bad experience - Constantly changes - Not always the latest API in use - It always destroyed and recreated the cluster instead of modifying it
  • 13. CloudFormation - Native Amazon solution - Can be written in YAML or in JSON - Works with changesets - The Amazon infrastructure fully supports it - With CodePipeline and CodeBuild you can create a really neat pipeline!
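As a rough illustration of the CloudFormation approach, the sketch below generates a small JSON template with one `AWS::RDS::DBCluster` and a few `AWS::RDS::DBInstance` resources. The logical names (`AuroraCluster`, `AuroraNode0`, ...), the default instance class and the engine name are assumptions; a real template would also need networking, parameter groups and so on.

```python
import json


def aurora_template(instance_class="db.r5.large", readers=1):
    """Build a template dict with one cluster and readers + 1 instances."""
    resources = {
        "AuroraCluster": {
            "Type": "AWS::RDS::DBCluster",
            "Properties": {
                "Engine": "aurora-mysql",  # assumption: MySQL-compatible Aurora
                "MasterUsername": {"Ref": "MasterUsername"},
                "MasterUserPassword": {"Ref": "MasterUserPassword"},
            },
        }
    }
    # Every instance joins the same cluster; Aurora decides which is the writer.
    for i in range(readers + 1):
        resources[f"AuroraNode{i}"] = {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {
                "Engine": "aurora-mysql",
                "DBClusterIdentifier": {"Ref": "AuroraCluster"},
                "DBInstanceClass": instance_class,
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "MasterUsername": {"Type": "String"},
            # NoEcho keeps the password out of console and API output.
            "MasterUserPassword": {"Type": "String", "NoEcho": True},
        },
        "Resources": resources,
    }


if __name__ == "__main__":
    print(json.dumps(aurora_template(), indent=2))
```

The printed JSON can be kept in version control and deployed through a changeset, so the modification is reviewed before it touches the cluster.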
  • 14. Sizing the nodes - Always check Amazon’s node parameters, because with MySQL the bottleneck is often not the CPU but the storage speed (Aurora using EBS) or the network!
  • 16. mysqldump / mysqlpump - --single-transaction - --master-data=2 - --hex-blob - Don’t forget the interactive_timeout parameter! - When uploading to S3 you have to take care of splitting the dump up - mysqlpump can dump in parallel; it has existed since 5.7 - The data loading is still single-threaded
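The mysqldump options above can be assembled into a command line as in this sketch. The host, user and database names are placeholders, and it assumes credentials come from an option file such as `~/.my.cnf` rather than the command line. Note that `--hex-blob` is the mysqldump option that writes binary columns in hexadecimal.

```python
import shlex


def mysqldump_command(host, user, database):
    """Argument list for a consistent, replication-ready dump of one schema."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent InnoDB snapshot without locking
        "--master-data=2",       # record binlog file/position as a comment
        "--hex-blob",            # dump binary columns as hex literals
        database,
    ]


if __name__ == "__main__":
    # Placeholder names; pass the list to subprocess.run() to execute it.
    print(shlex.join(mysqldump_command("db.example.internal", "backup", "shop")))
```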
  • 17. Mydumper / Myloader - Parallel dumper, way faster than mysqldump (in my case the data loading took around 5 hrs instead of 26!) - No --single-transaction, so you have to take care of consistency yourself - My recommendation: do it on a replica, because you’ll have the binlog file name and position as well (STOP SLAVE; mydumper … ; START SLAVE)
  • 18. S3 - csv Since version 1.8 you are able to load data directly from S3 with the ‘LOAD DATA FROM S3’ command. (This command is similar to LOAD DATA INFILE regarding usage and data formats. To create a compatible dump, use SELECT INTO OUTFILE …) More info: https://goo.gl/BA2fGc
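For illustration, a statement for a comma-separated file could be generated like this. The bucket, key and table name are made up, the field and line terminators are assumptions that must match how the dump was written, and the cluster also needs permission to read the bucket.

```python
def load_from_s3_sql(s3_uri, table):
    """Build a LOAD DATA FROM S3 statement for a comma-separated file."""
    # Very conservative checks, since the values are interpolated into SQL.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    if "'" in s3_uri or "\\" in s3_uri:
        raise ValueError(f"unexpected S3 URI: {s3_uri!r}")
    return (
        f"LOAD DATA FROM S3 '{s3_uri}' "
        f"INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' "
        "LINES TERMINATED BY '\\n'"
    )


if __name__ == "__main__":
    print(load_from_s3_sql("s3://my-bucket/dump/orders.csv", "orders"))
```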
  • 19. S3 - xtrabackup According to the documentation it is possible to load MySQL 5.5 and 5.6 data that was backed up with xtrabackup ● However, it never worked for me More info: https://goo.gl/KUBZUY
  • 20. AWS Database Migration Service It is designed to migrate from “any” database to RDS ● Supports schema migration ● After the schemas are moved it populates the tables through replication https://aws.amazon.com/dms/
  • 21. Instance type - Because the network bandwidth does matter - Index creation
  • 23. MariaDB? - No, it is not. - From the perspective of GTIDs it looks like MariaDB, but it is not - For example, we had to break GTID replication: MariaDB doesn’t use the same GTID implementation as native MySQL, but it can fall back to it, theoretically. Aurora can’t.
  • 24. Time Zones - Aurora databases are in the UTC time zone - As is everything else in AWS - You can change it - But if you do, and you replicate to an external DB, check that the MySQL time zone tables are installed there, otherwise this can break replication https://goo.gl/LoVkUt
  • 26. Not designed to replicate - You can replicate, in or out, but I don’t recommend it - My recommendation is to use replication only for migration - It’s not safe - I ended up with broken, unfixable replication because of a node restart (caused by nothing more than a config change!)
  • 27. RDS managed tables in the mysql schema - They do not appear in a mysqldump - When you create a new host, create these tables manually (SHOW CREATE TABLE …) or your replication will break
  • 28. Aurora (RDS) as master - Check the hostname of the cluster before setting up replication; it might be too long for the data dictionary (in MySQL 5.6 the hostname is a VARCHAR(60) field), e.g. kinja-staging-rdsstack-aaaaaaaaaaaa-auroracluster-bbbbbbbbbbbb.cluster-cbrmrdukwwf3.us-east-1.rds.amazonaws.com - Just create a CNAME record, or use an IP address instead - I never tested what happens during/after a failover
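A quick pre-flight check along these lines can catch an overlong endpoint before replication is configured. This is only a sketch: the 60-character limit is the figure quoted on the slide, and the short CNAME in the usage example is hypothetical.

```python
MAX_HOST_LENGTH = 60  # limit quoted above for MySQL 5.6


def fits_master_host(hostname, limit=MAX_HOST_LENGTH):
    """Return True if the hostname is short enough to be stored as the master host."""
    return len(hostname) <= limit


if __name__ == "__main__":
    endpoint = (
        "kinja-staging-rdsstack-aaaaaaaaaaaa-auroracluster-bbbbbbbbbbbb"
        ".cluster-cbrmrdukwwf3.us-east-1.rds.amazonaws.com"
    )
    # The generated cluster endpoint is far over the limit; a CNAME is not.
    for name in (endpoint, "aurora-staging.example.internal"):
        print(len(name), fits_master_host(name), name)
```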
  • 30. Internet facing cluster - Try not to expose it on the internet - I don’t have to tell you that there are always vulnerabilities, right? - If you must, please at least always use SSL - You can force SSL connection with mysql -
  • 31. Management node - It is good to have a management node from which you can access your database - Preconfigured work environment, plenty of disk space for dumps, preconfigured connection parameters etc. - It shouldn’t be an expensive node (unless you want to utilize a lot of CPU or disk IO) - It should spin up quickly and be destroyed when not needed (hey, this is cloud, right?)
  • 32. IAM Users - You can authenticate IAM users, and grant privileges inside of RDS but WHY WOULD YOU DO THIS?! - I think it is way better to separate db users from IAM users - Yes, this applies to AD as well
  • 33. Create your own users - Use the cluster’s own user only for administrative purposes - Restrict your app users* - Separate read/write and administrative users * This is not Aurora/RDS only!
  • 35. Native autoscaling? - Sadly there is no autoscaling built in, although it would be nice to have - You can create your own - You can add a node with the API - You can delete a node with the API - You can have events generated by CPU usage, number of queries etc.
  • 36. Cluster addresses - One DNS address for read/write - One DNS address for read - Each host has its own DNS name - If you remove a node, its address will stay in DNS for 60 seconds (default TTL for Aurora cluster addresses)
  • 37. Bring your own Autoscaler (BYOA) - Endless possibilities! - Haproxy/ProxySQL/DNS frontend - My solution is a simple DNS-based autoscaler that manages the cluster address in Route53. When I add a node, it creates the node and puts its address into DNS once it is available. When I remove a node, it removes the address from DNS first and starts deleting the node only after the TTL has passed
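The ordering described above (DNS last when adding, DNS first when removing) can be sketched as plain data, independent of any AWS call. The CPU thresholds, the reader limits and the step names are hypothetical; this is not the author's autoscaler, only an outline of the same idea.

```python
def desired_readers(current, cpu_percent, low=30.0, high=70.0,
                    minimum=1, maximum=15):
    """Pick the next reader count from average CPU (thresholds are assumptions)."""
    if cpu_percent > high:
        return min(current + 1, maximum)
    if cpu_percent < low:
        return max(current - 1, minimum)
    return current


def add_node_plan(node):
    """Create the node first; publish it in DNS only once it is available."""
    return [
        ("create_instance", node),
        ("wait_until_available", node),
        ("add_dns_record", node),
    ]


def remove_node_plan(node, ttl_seconds=60):
    """Unpublish the node, wait out the DNS TTL, then delete it."""
    return [
        ("remove_dns_record", node),
        ("sleep", ttl_seconds),
        ("delete_instance", node),
    ]


if __name__ == "__main__":
    print(desired_readers(current=2, cpu_percent=85.0))
    for step in remove_node_plan("reader-3"):
        print(step)
```

Waiting for the TTL before deleting means clients that still hold the cached address are not sent to a node that no longer exists.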
  • 39. Stored procedures (a lot) ● mysql.rds_set_external_master ● mysql.rds_reset_external_master ● mysql.rds_start_replication ● mysql.rds_stop_replication ● mysql.rds_skip_repl_error ● mysql.rds_next_master_log ● mysql.rds_innodb_buffer_pool_dump_now ● mysql.rds_innodb_buffer_pool_load_now ● mysql.rds_innodb_buffer_pool_load_abort ● mysql.rds_set_configuration ● mysql.rds_show_configuration ● mysql.rds_kill ● mysql.rds_kill_query ● mysql.rds_rotate_general_log ● mysql.rds_rotate_slow_log ● mysql.rds_enable_gsh_collector ● mysql.rds_set_gsh_collector ● mysql.rds_disable_gsh_collector ● mysql.rds_collect_global_status_history ● mysql.rds_enable_gsh_rotation ● mysql.rds_set_gsh_rotation ● mysql.rds_disable_gsh_rotation ● mysql.rds_rotate_global_status_history
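As an example of how these procedures are used, the sketch below builds the call that points an RDS/Aurora instance at an external master. The host, user, password and binlog coordinates are placeholders; the seven arguments are host, port, replication user, password, binlog file, binlog position and an SSL flag.

```python
def sql_literal(value):
    """Render an int as-is and anything else as a quoted SQL string."""
    if isinstance(value, int):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def set_external_master_sql(host, port, user, password,
                            log_file, log_pos, use_ssl=False):
    """Build the CALL statement for mysql.rds_set_external_master."""
    args = [host, port, user, password, log_file, log_pos, int(use_ssl)]
    rendered = ", ".join(sql_literal(a) for a in args)
    return f"CALL mysql.rds_set_external_master({rendered});"


if __name__ == "__main__":
    # Placeholder values; the binlog coordinates come from the dump.
    print(set_external_master_sql(
        "source.example.com", 3306, "repl", "secret", "mysql-bin.000042", 120
    ))
    print("CALL mysql.rds_start_replication;")
```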
  • 40. Restricted commands - SET GLOBAL READ_ONLY = 0/1 - This is managed by RDS
  • 41. Read-only variables - All variables that need SUPER to change act like read-only variables - You can change them in the parameter group - And restart the node for the change to take effect
  • 43. DB parameter group/Cluster parameter group - Nodes can have different DB parameter groups; they don’t all have to be the same (like read_only) - Cluster parameter groups hold options which have to be the same across the entire cluster (like timezone, character encoding)
  • 44. Read only parameters and CloudFormation - When you change a read only parameter, the cluster node gets into a ‘pending change’ state and you can reboot it manually to apply the changes - When the cluster/node has an associated parameter group in CloudFormation, and you change a read only value, your node will be rebooted automatically - To avoid this, don’t associate parameter groups with instances/clusters via CloudFormation - You can have blue/green param group pairs
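Outside CloudFormation, such a change can be staged through the API so that the reboot stays under your control. The sketch below only builds the `Parameters` list that boto3's `modify_db_parameter_group` expects; the parameter used in the example and the group name mentioned in the comment are illustrative assumptions.

```python
def parameter_changes(settings, apply_method="pending-reboot"):
    """Build the Parameters list for rds.modify_db_parameter_group()."""
    if apply_method not in ("immediate", "pending-reboot"):
        raise ValueError(f"unknown apply method: {apply_method!r}")
    return [
        {
            "ParameterName": name,
            "ParameterValue": str(value),
            # pending-reboot: the value takes effect at the next manual reboot
            "ApplyMethod": apply_method,
        }
        for name, value in sorted(settings.items())
    ]


if __name__ == "__main__":
    changes = parameter_changes({"performance_schema": 1})
    print(changes)
    # With boto3 and a hypothetical group name this would be sent as:
    #   boto3.client("rds").modify_db_parameter_group(
    #       DBParameterGroupName="my-node-params", Parameters=changes)
```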
  • 46. Automated backups - They’re created by snapshotting the instances - They are stored in S3 as the backend, but you can’t access them as objects, only through RDS - You can’t keep them “forever” - You can’t set a retention policy or storage class - Quick -> Slow -> Long term (Glacier)
  • 47. Long term backups - Create logical dumps (mysqldump/mysqlpump/mydumper/SELECT INTO OUTFILE, SELECT INTO OUTFILE S3) - You can provision a host for this, do the backup, copy it to S3, and remove the backup host (awscli/boto3) - You can set the storage class for the long term backups - WARNING: Glacier is freaking expensive if you want to access your backup. No joke.
  • 48. ALWAYS TEST YOUR BACKUPS IF YOUR BACKUP IS NOT TESTED WITH RESTORE, THAT BACKUP IS A NON-EXISTENT BACKUP. PERIOD. IF YOUR BACKUP IS NOT TESTED WITH RESTORE, THAT BACKUP IS A NON-EXISTENT BACKUP. PERIOD. IF YOUR BACKUP IS NOT TESTED WITH RESTORE, THAT BACKUP IS A NON-EXISTENT BACKUP. PERIOD.
  • 50. CloudWatch - You can access host metrics from CloudWatch, like CPU, memory, disk IO - Not all the MySQL metrics are accessible - You can create nice dashboards - As well as nice alerting based on events
  • 51. Third party monitoring tools - VividCortex - PMM - New Relic These tools can connect to the MySQL port and collect metrics via performance_schema. Check their documentation for the details, and be security-aware. If you can monitor via a host or a container I believe the second one is better. (KISS)
  • 52. Monitoring - CloudWatch is good, but not all the metrics can be reached by it. - VividCortex should connect via the MySQL protocol using performance_schema
  • 53. Logging - Hell. - You can access them from the web interface - You can access them with awscli - (maybe there’s a logger backend somewhere?)
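Pulling a log file through the API instead of the web interface might look like the sketch below. It assumes a boto3 RDS client is passed in and pages through `download_db_log_file_portion` until no more data is pending; the instance identifier and log file name in the comment are placeholders.

```python
def download_log(rds_client, instance_id, log_file_name):
    """Fetch a complete RDS log file by following the pagination marker."""
    chunks = []
    marker = "0"  # "0" asks for the beginning of the file
    while True:
        page = rds_client.download_db_log_file_portion(
            DBInstanceIdentifier=instance_id,
            LogFileName=log_file_name,
            Marker=marker,
        )
        chunks.append(page.get("LogFileData") or "")
        if not page.get("AdditionalDataPending"):
            break
        marker = page["Marker"]
    return "".join(chunks)


# Usage with boto3 (placeholder identifiers):
#   import boto3
#   text = download_log(boto3.client("rds"), "my-node-1",
#                       "slowquery/mysql-slowquery.log")
```

From there the text can be shipped to whatever log backend is already in use.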
  • 54. Thank you for your time!