Matt Lord - Principal Engineer, Vitess @
Vitess: VReplication
@mattalord
Standing on the Shoulders of a MySQL Giant
Agenda
@vitessio
❖ Vitess Overview
❖ VReplication Overview
❖ What’s New and Resources
Vitess
A database clustering system for horizontal scaling of MySQL
• CNCF graduated project
• Open source, Apache 2.0 license
• Contributors from around the community
• Written in Golang
Cloud-native distributed database
- Runs in Kubernetes; Vitess Operator (VTOP)
Scalable
Highly available
Durability guarantees
Illusion of “single database”
- Single dedicated connection
- MySQL 5.7 or 8.0
- Compatible with frameworks / ORMs etc.
Vitess
Vitess Serves Millions of QPS in Production
Concepts
Keyspace
- Logical database
Shard
- Subset or partition of a logical database
Cell
- Failure domain (e.g. DC or AZ)
Vitess Architecture Basics
A common replicated database cluster with primary and replicas
Vitess Architecture Basics
Each MySQL server is assigned a vttablet
- A daemon/sidecar
- Controls the mysqld process
- Interacts with the mysqld server
- Typically on same host as mysqld
Vitess Architecture Basics
In production you have multiple keyspaces, each with 1 or more shards
Vitess Architecture Basics
User and application traffic is routed via vtgate
- A smart, stateless proxy
- Speaks the MySQL protocol
- Impersonates a monolithic MySQL server
- Relays queries to vttablets
- Coordinates scatter-gather queries when needed
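A scatter-gather query can be pictured with a small sketch: the same filter runs against every shard, and the partial results are merged and ordered in one place. This is only an in-memory illustration; the shard contents, the `scatter_gather` helper, and the column names are assumptions made for the example, not vtgate's implementation.

```python
# Two in-memory "shards" standing in for commerce/-80 and commerce/80-.
# The rows are made-up sample data.
SHARDS = {
    "-80": [{"order_id": 1, "customer_id": 4, "price": 30}],
    "80-": [
        {"order_id": 2, "customer_id": 9, "price": 12},
        {"order_id": 3, "customer_id": 7, "price": 55},
    ],
}


def scatter_gather(predicate, order_by):
    """Run the same filter on every shard, then merge and sort the results."""
    # Scatter: each shard evaluates the predicate on its own rows.
    partials = [[row for row in rows if predicate(row)] for rows in SHARDS.values()]
    # Gather: combine the per-shard results and apply the global ordering.
    merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda row: row[order_by])


expensive = scatter_gather(lambda row: row["price"] > 20, "price")
print([row["order_id"] for row in expensive])  # prints [1, 3]
```

A query that filters on the sharding key would not need this fan-out, since it can be sent to a single shard.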
Vitess Architecture Basics
A Vitess deployment will run multiple vtgate servers for scale-out
Vitess Architecture Basics
vtgate will transparently route queries to the correct keyspaces, shards, and vttablets
[Diagram labels: app, app; commerce shard 0, commerce shard 1, internal shard 1 (unsharded); ?]
Vitess Architecture Basics
Queries routed based on schema & sharding scheme (vindexes)
[Diagram labels: app, app; commerce/-80, commerce/80-, internal/-]
USE commerce;
SELECT order_id, price
FROM orders
WHERE customer_id=4;
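As a rough illustration of how a sharding key such as customer_id can be mapped to one of the shards named above, the sketch below hashes the key to an 8-byte keyspace ID and picks the shard whose key range contains it. Shard names like -80 and 80- denote hex key ranges with an inclusive start and an exclusive end. The hash used here (SHA-256 truncated to 8 bytes) is a stand-in chosen for the example, not the function a Vitess vindex uses, and the helper names are hypothetical.

```python
import hashlib

SHARDS = ["-80", "80-"]  # key ranges of the commerce keyspace in the example


def keyspace_id(sharding_key: int) -> bytes:
    """Map a sharding key to an 8-byte keyspace ID (stand-in hash, not Vitess's)."""
    return hashlib.sha256(str(sharding_key).encode()).digest()[:8]


def in_range(ksid: bytes, shard: str) -> bool:
    """True if ksid falls inside the shard's hex range [start, end)."""
    start_hex, end_hex = shard.split("-")
    start = bytes.fromhex(start_hex)  # empty means "from the lowest key"
    end = bytes.fromhex(end_hex)      # empty means "through the highest key"
    if start and ksid < start:
        return False
    if end and ksid >= end:
        return False
    return True


def route(sharding_key: int, shards=SHARDS) -> str:
    """Return the name of the shard that owns the given sharding key."""
    ksid = keyspace_id(sharding_key)
    for shard in shards:
        if in_range(ksid, shard):
            return shard
    raise ValueError("no shard covers this keyspace ID")


print(route(4))  # the shard that would serve WHERE customer_id=4 in this sketch
```

Because the ranges -80 and 80- together cover the whole key space, every key resolves to exactly one shard.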
Vitess Architecture Basics
topo: distributed key/value store and coordination service
- Stores the state of vitess: schemas, shards, sharding scheme, tablets, roles, etc.
- Provides a shared locking service
- etcd/ZooKeeper/Consul/Kubernetes
- Small dataset, mostly cached by vtgate
vtctld: control daemon
- Runs ad hoc operations
- API server
- Reads/writes state in topo
- Uses locks in topo
- Operates on vttablets
Vitess Architecture Summary
Agenda
❖ Vitess Overview
❖ VReplication Overview
❖ What’s New and Resources
VReplication
A Framework For Creating And Managing Data Streams And Workflows
When data matches some defined criteria then execute a defined Workflow:
● Sharding
● Filtered Replication
● Transformations
● Materialized Views
● Online Migrations / Schema Changes
● Event Streams / CDC / Job Queues
● ...
Getting Data Into Vitess
● Add a tablet for the external/unmanaged MySQL instance
● Add this temporary tablet to Vitess
● Use MoveTables to move [and shard] the data into a Vitess managed keyspace
[Diagram labels: okdb (RDS), Unmanaged Tablet; leetdb/-80, leetdb/80-]
Vertical / Functional Sharding
● You have many/all tables in a single keyspace
● You want to move some of the tables to a new keyspace
● You want to achieve this without making significant changes to your application or incurring downtime
● Use MoveTables to split the tables into N keyspaces
[Diagram labels: alldata/-; products/-, orders/-]
Horizontal Sharding
● Going from 1 to 2 shards, 2 to 4, 4 to 8, to … as your dataset and usage grow
○ Can also shrink by merging shards and/or keyspaces when needed
● Add new tablets to manage new keyspace range splits
● Use Reshard to redistribute the data
[Diagram labels: leetdb/-80, leetdb/80-; leetdb/-40, leetdb/40-80, leetdb/80-c0, leetdb/c0-]
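The shard names in the diagram follow from halving each key range: -80 becomes -40 and 40-80, and 80- becomes 80-c0 and c0-. A minimal sketch of that arithmetic is below. The `split_range` helper is hypothetical and only handles one-byte hex boundaries; it shows how the names are derived, not how Reshard moves data.

```python
from typing import Tuple


def split_range(shard: str) -> Tuple[str, str]:
    """Split a one-byte hex key range such as "-80" or "80-" at its midpoint."""
    start_hex, end_hex = shard.split("-")
    start = int(start_hex, 16) if start_hex else 0x00
    end = int(end_hex, 16) if end_hex else 0x100  # an open end covers through 0xff
    mid = (start + end) // 2
    if mid == start:
        raise ValueError("range too narrow to split at one-byte granularity")
    mid_hex = format(mid, "02x")
    return f"{start_hex}-{mid_hex}", f"{mid_hex}-{end_hex}"


print(split_range("-"))    # prints ('-80', '80-')
print(split_range("-80"))  # prints ('-40', '40-80')
print(split_range("80-"))  # prints ('80-c0', 'c0-')
```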
VDiff
- Runs a diff between the source and target shards
- One VDiff per workflow
- This is a blocking call
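One way to picture a row-level diff between a source and a target is a merge-style comparison of two streams ordered by primary key. The sketch below is an illustration under that assumption; the `diff_rows` function and its report keys are hypothetical names, and it does not reproduce how VDiff takes its snapshots or reads from tablets.

```python
def diff_rows(source, target):
    """Compare two iterables of (pk, row) pairs, each sorted by pk."""
    report = {"missing_in_target": [], "extra_in_target": [], "mismatched": []}
    src, tgt = iter(source), iter(target)
    s, t = next(src, None), next(tgt, None)
    while s is not None or t is not None:
        if t is None or (s is not None and s[0] < t[0]):
            # Row exists only on the source side.
            report["missing_in_target"].append(s[0])
            s = next(src, None)
        elif s is None or t[0] < s[0]:
            # Row exists only on the target side.
            report["extra_in_target"].append(t[0])
            t = next(tgt, None)
        else:
            # Same primary key on both sides: compare the row contents.
            if s[1] != t[1]:
                report["mismatched"].append(s[0])
            s, t = next(src, None), next(tgt, None)
    return report


source = [(1, "a"), (2, "b"), (3, "c")]
target = [(1, "a"), (3, "x"), (4, "d")]
print(diff_rows(source, target))
# prints {'missing_in_target': [2], 'extra_in_target': [4], 'mismatched': [3]}
```

Reading both sides in primary-key order keeps memory use constant regardless of table size, which is why the sketch streams rather than loading both tables.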
Materialized Views
● Real-time views into a [transformed] [subset] of data from db1 in db2
● Aggregated views on certain data to perform analytics against
● Local copies of a “global” lookup table (e.g. country, state, postcodes)
● This data is automatically kept correct and up-to-date
● Use Materialize to set up the materialized view
[Diagram labels: alldata/-; products/-, orders/-]
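To make the idea of an aggregated view that stays up to date more concrete, the sketch below applies a stream of row change events to a running per-product total. Everything here is assumed for illustration: the event shape, the `SalesView` class, and the `product_id` and `price` columns are hypothetical, and prices are integers (for example cents) to keep the arithmetic exact.

```python
from collections import defaultdict


class SalesView:
    """Aggregated view: order count and revenue per product (hypothetical schema)."""

    def __init__(self):
        self.totals = defaultdict(lambda: {"orders": 0, "revenue": 0})

    def _add(self, row, sign):
        # sign is +1 when a row enters the view and -1 when it leaves.
        agg = self.totals[row["product_id"]]
        agg["orders"] += sign
        agg["revenue"] += sign * row["price"]

    def apply(self, event):
        """Apply one change event; an update is a removal plus an addition."""
        kind = event["type"]
        if kind == "insert":
            self._add(event["after"], +1)
        elif kind == "delete":
            self._add(event["before"], -1)
        elif kind == "update":
            self._add(event["before"], -1)
            self._add(event["after"], +1)
        else:
            raise ValueError(f"unknown event type: {kind}")


view = SalesView()
first = {"product_id": 1, "price": 10}
second = {"product_id": 1, "price": 5}
view.apply({"type": "insert", "after": first})
view.apply({"type": "insert", "after": second})
view.apply({"type": "update", "before": second, "after": {"product_id": 1, "price": 7}})
view.apply({"type": "delete", "before": first})
print(dict(view.totals))  # prints {1: {'orders': 1, 'revenue': 7}}
```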
Online Schema Changes
● Non-blocking, monitorable, revertable, cancelable, configurable throttling
● Supports typical SQL statements as well as declarative schema changes
● Resilient to failures, failovers, and topology changes
● Lazy, phased cleanup (to avoid cost of dropping large tables in prod)
● Uses VReplication, driven by primary tablet in each shard
$ vtctlclient ApplySchema -ddl_strategy "online" -sql "..." <keyspace>
● Has its own SQL statements:
mysql> show vitess_migrations; alter vitess_migration …;
mysql> show vitess_migration_logs; …
● Has its own set of vtctl commands:
$ vtctlclient OnlineDDL <keyspace> [show,retry,cancel] [<migration_uuid>,all,running,complete,failed]
Change Data Capture / Event Streams
● Use a Vitess managed message bus, job queue, or event stream
○ For example, managing “offline” processing data pipelines
● CREATE a standard [sharded] table with required fields and comments
○ https://vitess.io/docs/reference/features/messaging
● DMLs against the table generate events
● SUBSCRIBERs receive and acknowledge events via SQL or gRPC
[Diagram labels: app, app; commerce/-80, commerce/80-, internal/-]
STREAM *
FROM vt_job_queue;
INSERT INTO vt_job_queue …;
UPDATE vt_job_queue
SET time_acked = NOW()
WHERE id = 100;
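The insert, stream, and acknowledge cycle shown in the SQL above can be mimicked with a tiny in-memory queue: a row stays eligible for delivery until its time_acked is set. This is a sketch of the pattern only. The `JobQueue` class and the `message` field are assumptions for the example; only `id` and `time_acked` come from the statements above.

```python
import time


class JobQueue:
    """In-memory stand-in for a message table such as vt_job_queue."""

    def __init__(self):
        self.rows = {}

    def insert(self, job_id, message):
        # Mirrors INSERT: a new row starts out unacknowledged.
        self.rows[job_id] = {"id": job_id, "message": message, "time_acked": None}

    def stream(self):
        """Mirrors STREAM: return the rows that have not been acknowledged yet."""
        return [row for row in self.rows.values() if row["time_acked"] is None]

    def ack(self, job_id):
        # Mirrors UPDATE ... SET time_acked = NOW() WHERE id = job_id.
        self.rows[job_id]["time_acked"] = time.time()


queue = JobQueue()
queue.insert(100, "resize image")
queue.insert(101, "send email")
queue.ack(100)
print([row["id"] for row in queue.stream()])  # prints [101]
```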
Any New Built-in Workflows...
● This will grow over time based on new use cases that present themselves
● Most recently, Vitess Native OnlineDDL
● Or your own custom workflows and pipelines! For example:
https://medium.com/bolt-labs/streaming-vitess-at-bolt-f8ea93211c3f
Safety Mechanisms
● VDiff
○ Compare the full set of logical rows on both sides using a consistent snapshot
● Limit impact on production traffic
○ Configurable tablet throttling
○ --tablet_types, --max_replication_lag, --filtered_replication_wait_time, --vreplication_copy_phase_max_innodb_history_list_length, --vreplication_copy_phase_max_mysql_replication_lag, …
● -cells to avoid cross-AZ traffic
● Custom RoutingRules to make cutovers seamless for apps/users
● Rollbacks via $ vtctl ReverseTraffic
...
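The role routing rules play in a cutover, and in a rollback, can be sketched as a table-to-keyspace map that is flipped in one step. The class below is a simplified illustration using the alldata and products keyspace names from the vertical sharding example; the `RoutingRules` class and its method names are hypothetical and do not reproduce the rule format Vitess stores in its topology service.

```python
class RoutingRules:
    """Maps a table name to the keyspace that should serve it (illustrative only)."""

    def __init__(self, source, target, tables):
        self.source = source
        self.target = target
        # Until cutover, every moved table still resolves to the source keyspace.
        self.rules = {table: source for table in tables}

    def switch_traffic(self):
        """Cutover: point every moved table at the target keyspace."""
        self.rules = {table: self.target for table in self.rules}

    def reverse_traffic(self):
        """Rollback: point every moved table back at the source keyspace."""
        self.rules = {table: self.source for table in self.rules}

    def resolve(self, table):
        return self.rules[table]


rules = RoutingRules(source="alldata", target="products", tables=["product"])
print(rules.resolve("product"))  # prints alldata
rules.switch_traffic()
print(rules.resolve("product"))  # prints products
rules.reverse_traffic()
print(rules.resolve("product"))  # prints alldata
```

Because applications keep issuing the same queries while only the map changes, the cutover and the rollback are invisible to them in this model.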
VDiff2
(Diffs done on tablets)
Agenda
❖ Vitess Overview
❖ VReplication Overview
❖ What’s New and Resources
New* and Upcoming Features
- 14.0 release GA late June 2022
- VReplication based Online DDL - 10.0 (GA in 14.0)
- Continuous benchmarking - since 11.0
- More supported query constructs - (MySQL 8.0) ongoing
- Gen4 query planner - GA in 13.0 (default in 14.0)
- MySQL compatible collations - 13.0
- Multi-column VIndexes - 13.0
- VTAdmin
- Automatic failure detection and handling (VTorc)
- VDiff2, running on tablets (and other improvements)
- Distributed x-shard transactions
*ish
Resources
Docs: vitess.io/docs/
Code: github.com/vitessio/vitess
Slack: vitess.slack.com
Thank you!
"ML in Production",Oleksandr Bagan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

Vitess VReplication: Standing on the Shoulders of a MySQL Giant

  • 1. Matt Lord - Principal Engineer, Vitess (@mattalord) Vitess: VReplication - Standing on the Shoulders of a MySQL Giant
  • 2. Agenda @vitessio ❖ Vitess Overview ❖ VReplication Overview ❖ What’s New and Resources
  • 3. Agenda @vitessio ❖ Vitess Overview ❖ VReplication Overview ❖ What’s New and Resources
  • 4. Vitess A database clustering system for horizontal scaling of MySQL • CNCF graduated project • Open source, Apache 2.0 license • Contributors from around the community • Written in Golang @vitessio
  • 5. Cloud-native distributed database - Runs in Kubernetes; Vitess Operator (VTOP) Scalable Highly available Durability guarantees Illusion of “single database” - Single dedicated connection - MySQL 5.7 or 8.0 - Compatible with frameworks / ORMs etc. @vitessio Vitess
  • 6. Vitess Serves Millions of QPS in Production @vitessio
  • 7. Concepts Keyspace - Logical database Shard - Subset or partition of a logical database Cell - Failure domain (e.g. DC or AZ) @vitessio
  • 8. Vitess Architecture Basics A common replicated database cluster with primary and replicas @vitessio
  • 9. Vitess Architecture Basics Each MySQL server is assigned a vttablet - A daemon/sidecar - Controls the mysqld process - Interacts with the mysqld server - Typically on same host as mysqld @vitessio
  • 10. Vitess Architecture Basics In production you have multiple keyspaces, each with 1 or more shards @vitessio
  • 11. Vitess Architecture Basics User and application traffic is routed via vtgate - A smart, stateless proxy - Speaks the MySQL protocol - Impersonates a monolithic MySQL server - Relays queries to vttablets - Coordinates scatter-gather queries when needed @vitessio
  • 12. Vitess Architecture Basics A vitess deployment will run multiple vtgate servers for scale out @vitessio
  • 13. Vitess Architecture Basics vtgate will transparently route queries to the correct keyspaces, shards, and vttablets app app commerce shard 0 commerce shard 1 internal shard 1 (unsharded) ? @vitessio
  • 14. Vitess Architecture Basics Queries routed based on schema & sharding scheme (vindexes) app app commerce/-80 commerce/80- internal/- USE commerce; SELECT order_id, price FROM orders WHERE customer_id=4; (keyspace ID: d2fd8867d50d2dfe) @vitessio
  • 15. Vitess Architecture Basics topo: distributed key/value store and coordination service - Stores the state of vitess: schemas, shards, sharding scheme, tablets, roles, etc. - Provides a shared locking service - etcd/ZooKeeper/Consul/Kubernetes - Small dataset, mostly cached by vtgate @vitessio commerce/-80 commerce/80- internal/-
  • 16. vtctld: control daemon - Runs ad hoc operations - API server - Reads/writes state in topo - Uses locks in topo - Operates on vttablets @vitessio Vitess Architecture Basics commerce/-80 commerce/80- internal/-
  • 18. Agenda ❖ Vitess Overview ❖ VReplication Overview ❖ What’s New and Resources @vitessio
  • 19. VReplication A Framework For Creating And Managing Data Streams And Workflows When data matches some defined criteria then execute a defined Workflow: ● Sharding ● Filtered Replication ● Transformations ● Materialized Views ● Online Migrations / Schema Changes ● Event Streams / CDC / Job Queues ● ... @vitessio
  • 20. ● Add a tablet for the external/unmanaged MySQL instance ● Add this temporary tablet to Vitess ● Use MoveTables to move [and shard] the data into a Vitess managed keyspace Getting Data Into Vitess @vitessio leetdb/-80 leetdb/80- okdb (RDS) Unmanaged Tablet
  • 21. ● You have many/all tables in a single keyspace ● You want to move some of the tables to a new keyspace ● You want to achieve this without making significant changes to your application or incurring downtime ● Use MoveTables to split the tables into N keyspaces @vitessio alldata/- Vertical / Functional Sharding products/- orders/-
  • 22. ● Going from 1 to 2 shards, 2 to 4, 4 to 8, to … as your dataset and usage grows ○ Can also shrink by merging shards and/or keyspaces when needed ● Add new tablets to manage new keyspace range splits ● Use Reshard to redistribute the data Horizontal Sharding @vitessio leetdb/-80 leetdb/80- leetdb/80-c0 leetdb/c0- leetdb/-40 leetdb/40-80
  • 23. VDiff @vitessio - Runs a diff between the source and target shards - One VDiff per workflow - This is a blocking call
  • 24. ● Real-time views into a [transformed] [subset] of data from db1 in db2 ● Aggregated views on certain data to perform analytics against ● Local copies of a “global” lookup table (e.g. country,state,postcodes) ● This data is automatically kept correct and up-to-date ● Use Materialize to setup the materialized view @vitessio products/- alldata/- orders/- Materialized Views
  • 25. ● Non-blocking, monitorable, revertable, cancelable, configurable throttling ● Supports typical SQL statements as well as declarative ● Resilient to failures, failovers, and topology changes ● Lazy, phased cleanup (to avoid cost of dropping large tables in prod) ● Uses VReplication, driven by primary tablet in each shard $ vtctlclient ApplySchema -ddl_strategy "online" -sql "..." <keyspace> ● Has its own SQL statements: mysql> show vitess_migrations; alter vitess_migration …; mysql> show vitess_migration_logs; … ● Has its own set of vtctl commands: $ vtctlclient OnlineDDL <keyspace> [show,retry,cancel] [<migration_uuid>,all,running,complete,failed] Online Schema Changes @vitessio
  • 26. ● Use a Vitess managed message bus, job queue, or event stream ○ For example, managing “offline” processing data pipelines ● CREATE a standard [sharded] table with required fields and comments ○ https://vitess.io/docs/reference/features/messaging ● DMLs against the table generate events ● SUBSCRIBERs receive and acknowledge events via SQL or gRPC Change Data Capture / Event Streams @vitessio app app commerce/-80 commerce/80- internal/- STREAM * FROM vt_job_queue; INSERT INTO vt_job_queue …; UPDATE vt_job_queue SET time_acked = NOW() WHERE id = 100;
  • 27. ● This will grow over time based on new use cases that present themselves ● Most recently, Vitess Native OnlineDDL ● Or your own custom workflows and pipelines! For example: https://medium.com/bolt-labs/streaming-vitess-at-bolt-f8ea93211c3f Any New Built-in Workflows... @vitessio
  • 28. ● VDiff ○ Compare the full set of logical rows on both sides using a consistent snapshot ● Limit impact on production traffic ○ Configurable tablet throttling ○ --tablet_types, --max_replication_lag, --filtered_replication_wait_time, --vreplication_copy_phase_max_innodb_history_list_length, --vreplication_copy_phase_max_mysql_replication_lag, … ● -cells to avoid cross-AZ traffic ● Custom RoutingRules to make cutovers seamless for apps/users ● Rollbacks via $ vtctl ReverseTraffic ... Safety Mechanisms @vitessio
  • 29. VDiff2 (Diffs done on tablets) @vitessio
  • 30. Agenda @vitessio ❖ Vitess Overview ❖ VReplication Overview ❖ What’s New and Resources
  • 31. New* and Upcoming Features (*ish): 14.0 release GA late June 2022; VReplication based Online DDL, 10.0 (GA in 14.0); Continuous benchmarking, since 11.0; More supported query constructs (MySQL 8.0), ongoing; Gen4 query planner, GA in 13.0 (default in 14.0); MySQL compatible collations, 13.0; Multi-column VIndexes, 13.0; VTAdmin; Automatic failure detection and handling (VTOrc); VDiff2, running on tablets (and other improvements); Distributed x-shard transactions @vitessio
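
Slides 14 and 22 both rely on shard names being key ranges: a query is routed to the shard whose range contains the row's keyspace ID, and resharding replaces one range with smaller ones. The following is a minimal Python sketch of those two ideas, not Vitess code. The function names `shard_for_keyspace_id` and `split_shard` are hypothetical, and the sketch assumes the value `d2fd8867d50d2dfe` on slide 14 is the keyspace ID that the vindex produced for `customer_id=4`. In Vitess the keyspace ID is computed by the vindex itself, which this sketch does not attempt to reproduce.

```python
from typing import List, Optional, Tuple


def parse_range(shard: str) -> Tuple[bytes, Optional[bytes]]:
    """Parse a shard name such as '-80', '80-c0', 'c0-' or '-' into (start, end).

    An empty start means "from the lowest key"; end=None means "no upper bound".
    """
    start_hex, end_hex = shard.split("-")
    start = bytes.fromhex(start_hex)
    end = bytes.fromhex(end_hex) if end_hex else None
    return start, end


def shard_for_keyspace_id(keyspace_id_hex: str, shards: List[str]) -> str:
    """Return the shard whose half-open range [start, end) contains the keyspace ID."""
    kid = bytes.fromhex(keyspace_id_hex)
    for shard in shards:
        start, end = parse_range(shard)
        # Byte strings compare lexicographically, which matches range ordering.
        if start <= kid and (end is None or kid < end):
            return shard
    raise ValueError("no shard covers keyspace ID " + keyspace_id_hex)


def split_shard(shard: str) -> Tuple[str, str]:
    """Split one key range at its midpoint, e.g. '-80' -> ('-40', '40-80')."""
    start_hex, end_hex = shard.split("-")
    width = max(len(start_hex), len(end_hex), 2) // 2  # range width in bytes
    lo = int(start_hex.ljust(width * 2, "0"), 16) if start_hex else 0
    hi = int(end_hex.ljust(width * 2, "0"), 16) if end_hex else 256 ** width
    if hi - lo < 2:
        # Range too narrow to halve at this precision: add one more byte.
        lo, hi, width = lo * 256, hi * 256, width + 1
    mid_hex = format((lo + hi) // 2, "0{}x".format(width * 2))
    return start_hex + "-" + mid_hex, mid_hex + "-" + end_hex


if __name__ == "__main__":
    kid = "d2fd8867d50d2dfe"  # assumed keyspace ID for customer_id=4 (slide 14)
    print(shard_for_keyspace_id(kid, ["-80", "80-"]))
    print(split_shard("-80"), split_shard("80-"))
```

With the two-shard layout from slide 14 the example ID falls in `80-`, and splitting `-80` and `80-` gives the four ranges shown on slide 22. An actual reshard is carried out with the Reshard workflow described on that slide; the sketch only shows the range arithmetic behind the shard names.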