Vitess provides a large set of features that allow you to use and manage a scalable set of MySQL database instances across custom partitions or shards of your dataset as if it were a single logical database. One of the key components used within Vitess is called VReplication.
In this talk, we'll cover what VReplication is and how it relates to MySQL replication. VReplication leverages the technologies you're already familiar with while expanding on them to add a set of powerful primitives and abstractions. These support an ever-growing list of high-level features such as sharding and resharding of tables, materialized views, online DDL, change streams (CDC), and message or job queues.
This talk should leave a MySQL user/operator with a good understanding of what VReplication could do for them and when they may want to use it.
4. Vitess
A database clustering system for horizontal scaling of MySQL
• CNCF graduated project
• Open source, Apache 2.0 license
• Contributors from around the community
• Written in Golang
@vitessio
5. Cloud-native distributed database
- Runs in Kubernetes; Vitess Operator (VTOP)
Scalable
Highly available
Durability guarantees
Illusion of “single database”
- Single dedicated connection
- MySQL 5.7 or 8.0
- Compatible with frameworks / ORMs etc.
9. Vitess Architecture Basics
Each MySQL server is assigned a vttablet
- A daemon/sidecar
- Controls the mysqld process
- Interacts with the mysqld server
- Typically on same host as mysqld
11. Vitess Architecture Basics
User and application traffic is routed via vtgate
- A smart, stateless proxy
- Speaks the MySQL protocol
- Impersonates a monolithic MySQL server
- Relays queries to vttablets
- Coordinates scatter-gather queries when needed
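As a rough illustration of the scatter-gather step, here is a minimal sketch. It is not vtgate's implementation: the shard contents, `query_shard`, and `scatter_gather` are hypothetical names made up for this example. Each shard answers the same query with rows already in order, and the proxy merges the ordered streams.

```python
import heapq

# Hypothetical per-shard data: each shard holds a disjoint slice of `orders`,
# stored as (order_id, price) rows.
SHARDS = {
    "commerce/-80": [(1, 10.0), (4, 7.5), (9, 3.0)],
    "commerce/80-": [(2, 12.0), (3, 1.5), (8, 20.0)],
}


def query_shard(rows, min_price):
    """Stand-in for one shard running: WHERE price >= min_price ORDER BY order_id."""
    return sorted(r for r in rows if r[1] >= min_price)


def scatter_gather(shards, min_price):
    """Send the same query to every shard, then merge the ordered results."""
    partials = [query_shard(rows, min_price) for rows in shards.values()]
    return list(heapq.merge(*partials))


result = scatter_gather(SHARDS, 5.0)
print(result)
```

Merging pre-sorted partial results keeps the proxy stateless: it never needs the full table, only one ordered stream per shard.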
13. Vitess Architecture Basics
vtgate will transparently route queries to the correct keyspaces, shards, and vttablets
[Diagram labels: app, app; commerce shard 0; commerce shard 1; internal shard 1 (unsharded); "?"]
14. Vitess Architecture Basics
Queries routed based on schema & sharding scheme (vindexes)
[Diagram labels: app, app; commerce/-80, commerce/80-, internal/-]
USE commerce;
SELECT order_id, price
FROM orders
WHERE customer_id=4;
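To make the routing idea concrete, here is a small sketch of how a column value could be mapped to one of the key ranges shown on the slide. It rests on stated assumptions: `stand_in_hash` uses SHA-256 purely for illustration and is not the hash a Vitess vindex uses, and `in_key_range` and `route` are hypothetical helper names.

```python
import hashlib

SHARDS = ["-80", "80-"]  # key ranges of the commerce keyspace on the slide


def stand_in_hash(customer_id):
    """Map a column value to an 8-byte keyspace ID.

    Assumption: SHA-256 is used here only for illustration; it is not the
    hash function Vitess vindexes use.
    """
    return hashlib.sha256(str(customer_id).encode()).digest()[:8]


def in_key_range(shard, keyspace_id):
    """True if keyspace_id falls in the half-open range [start, end)."""
    start_hex, end_hex = shard.split("-")
    start, end = bytes.fromhex(start_hex), bytes.fromhex(end_hex)
    # An empty bound means "open": no lower limit or no upper limit.
    return keyspace_id >= start and (end == b"" or keyspace_id < end)


def route(keyspace_id, shards=SHARDS):
    """Return the single shard whose key range covers keyspace_id."""
    matches = [s for s in shards if in_key_range(s, keyspace_id)]
    if len(matches) != 1:
        raise ValueError("shard ranges must cover each keyspace ID exactly once")
    return matches[0]


# Which shard would serve WHERE customer_id=4 under the stand-in hash?
target = route(stand_in_hash(4))
print(target)
```

The unsharded `internal/-` keyspace is the degenerate case: with `shards=["-"]`, every keyspace ID routes to the same shard.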
15. Vitess Architecture Basics
topo: distributed key/value store and coordination service
- Stores the state of Vitess: schemas, shards, sharding scheme, tablets, roles, etc.
- Provides a shared locking service
- etcd/ZooKeeper/Consul/Kubernetes
- Small dataset, mostly cached by vtgate
[Diagram labels: commerce/-80, commerce/80-, internal/-]
16. Vitess Architecture Basics
vtctld: control daemon
- Runs ad hoc operations
- API server
- Reads/writes state in topo
- Uses locks in topo
- Operates on vttablets
[Diagram labels: commerce/-80, commerce/80-, internal/-]
19. VReplication
A Framework For Creating And Managing Data Streams And Workflows
When data matches some defined criteria then execute a defined Workflow:
● Sharding
● Filtered Replication
● Transformations
● Materialized Views
● Online Migrations / Schema Changes
● Event Streams / CDC / Job Queues
● ...
20. Getting Data Into Vitess
● Add a tablet for the external/unmanaged MySQL instance
● Add this temporary tablet to Vitess
● Use MoveTables to move [and shard] the data into a Vitess managed keyspace
[Diagram labels: okdb (RDS); Unmanaged Tablet; leetdb/-80, leetdb/80-]
21. Vertical / Functional Sharding
● You have many/all tables in a single keyspace
● You want to move some of the tables to a new keyspace
● You want to achieve this without making significant changes to your application or incurring downtime
● Use MoveTables to split the tables into N keyspaces
[Diagram labels: alldata/-; products/-, orders/-]
22. Horizontal Sharding
● Going from 1 to 2 shards, 2 to 4, 4 to 8, to … as your dataset and usage grow
○ Can also shrink by merging shards and/or keyspaces when needed
● Add new tablets to manage new keyspace range splits
● Use Reshard to redistribute the data
[Diagram labels: leetdb/-80, leetdb/80-; leetdb/80-c0, leetdb/c0-; leetdb/-40, leetdb/40-80]
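The key range names in the diagram follow from halving each range. The sketch below shows that arithmetic only; it is not how Reshard is invoked. `split_key_range` is a hypothetical helper, and it assumes single-byte hex bounds like the ones on the slide.

```python
def split_key_range(shard):
    """Split a key range such as "-80" into two equal halves.

    Assumption: bounds are single-byte hex values, as in the slide's examples.
    """
    start_hex, end_hex = shard.split("-")
    start = int(start_hex, 16) if start_hex else 0x00
    end = int(end_hex, 16) if end_hex else 0x100  # open upper bound
    if end - start < 2:
        raise ValueError(f"range {shard!r} is too small to split at one byte")
    mid_hex = f"{(start + end) // 2:02x}"
    return f"{start_hex}-{mid_hex}", f"{mid_hex}-{end_hex}"


for shard in ["-", "-80", "80-"]:
    print(shard, "->", split_key_range(shard))
```

Applied to `-80` and `80-`, this yields the four target ranges in the diagram: `-40`, `40-80`, `80-c0`, and `c0-`.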
23. VDiff
- Runs a diff between the source and target shards
- One VDiff per workflow
- This is a blocking call
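Conceptually, a diff between source and target boils down to comparing rows by primary key. The following is a simplified in-memory sketch of that comparison, not VDiff's actual algorithm; `diff_tables` and the sample rows are hypothetical.

```python
def diff_tables(source_rows, target_rows):
    """Compare two row sets, each a dict of primary key -> column values."""
    report = {"missing_in_target": [], "extra_in_target": [], "mismatched": []}
    for pk in sorted(source_rows.keys() | target_rows.keys()):
        if pk not in target_rows:
            report["missing_in_target"].append(pk)
        elif pk not in source_rows:
            report["extra_in_target"].append(pk)
        elif source_rows[pk] != target_rows[pk]:
            report["mismatched"].append(pk)
    return report


# Hypothetical sample data: row 2 differs, row 3 was not copied, row 4 is extra.
source = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
target = {1: ("alice", 10), 2: ("bob", 25), 4: ("dave", 40)}
report = diff_tables(source, target)
print(report)
```

A real comparison has to read both sides at a consistent point in time; this sketch sidesteps that by using static dictionaries.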
24. Materialized Views
● Real-time views into a [transformed] [subset] of data from db1 in db2
● Aggregated views on certain data to perform analytics against
● Local copies of a “global” lookup table (e.g. country, state, postcodes)
● This data is automatically kept correct and up-to-date
● Use Materialize to set up the materialized view
[Diagram labels: products/-, alldata/-, orders/-]
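One way to picture how an aggregated view stays up to date is to apply each row change from the source to a running total. This is an illustrative sketch only, not the Materialize workflow itself; the event shape `(op, before, after)`, the column names, and `apply_event` are assumptions made for the example.

```python
from collections import defaultdict


def apply_event(view, event):
    """Apply one row change to a per-customer order total (the view)."""
    op, before, after = event
    if op in ("update", "delete"):
        # Remove the old row's contribution.
        view[before["customer_id"]] -= before["price"]
    if op in ("insert", "update"):
        # Add the new row's contribution.
        view[after["customer_id"]] += after["price"]


view = defaultdict(int)
events = [
    ("insert", None, {"order_id": 1, "customer_id": 4, "price": 10}),
    ("insert", None, {"order_id": 2, "customer_id": 4, "price": 5}),
    ("update", {"order_id": 2, "customer_id": 4, "price": 5},
               {"order_id": 2, "customer_id": 4, "price": 8}),
    ("insert", None, {"order_id": 3, "customer_id": 7, "price": 3}),
    ("delete", {"order_id": 1, "customer_id": 4, "price": 10}, None),
]
for event in events:
    apply_event(view, event)
print(dict(view))
```

Because each event carries both the old and new row images, the aggregate can be maintained incrementally without rescanning the source table.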
25. Online Schema Changes
● Non-blocking, monitorable, revertible, cancelable, with configurable throttling
● Supports typical SQL statements as well as declarative ones
● Resilient to failures, failovers, and topology changes
● Lazy, phased cleanup (to avoid cost of dropping large tables in prod)
● Uses VReplication, driven by primary tablet in each shard
$ vtctlclient ApplySchema -ddl_strategy "online" -sql "..." <keyspace>
● Has its own SQL statements:
mysql> show vitess_migrations; alter vitess_migration …;
mysql> show vitess_migration_logs; …
● Has its own set of vtctl commands:
$ vtctlclient OnlineDDL <keyspace> [show,retry,cancel] [<migration_uuid>,all,running,complete,failed]
26. Change Data Capture / Event Streams
● Use a Vitess managed message bus, job queue, or event stream
○ For example, managing “offline” processing data pipelines
● CREATE a standard [sharded] table with required fields and comments
○ https://vitess.io/docs/reference/features/messaging
● DMLs against the table generate events
● SUBSCRIBERs receive and acknowledge events via SQL or gRPC
[Diagram labels: app, app; commerce/-80, commerce/80-, internal/-]
STREAM *
FROM vt_job_queue;
INSERT INTO vt_job_queue …;
UPDATE vt_job_queue
SET time_acked = NOW()
WHERE id = 100;
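The insert, stream, and acknowledge cycle above can be modeled with a small in-memory stand-in. This sketch only mirrors the semantics suggested by the statements on the slide; the `JobQueue` class and its methods are hypothetical and do not use any Vitess API.

```python
class JobQueue:
    """In-memory stand-in for a message table with a time_acked column."""

    def __init__(self):
        self.rows = {}    # id -> {"payload": ..., "time_acked": None or timestamp}
        self.next_id = 1

    def insert(self, payload):
        """Like INSERT: add a message that has not been acknowledged yet."""
        msg_id = self.next_id
        self.rows[msg_id] = {"payload": payload, "time_acked": None}
        self.next_id += 1
        return msg_id

    def stream(self):
        """Like STREAM: return pending (unacknowledged) messages in id order."""
        return [(msg_id, row["payload"])
                for msg_id, row in sorted(self.rows.items())
                if row["time_acked"] is None]

    def ack(self, msg_id, now):
        """Like UPDATE ... SET time_acked: mark one message as handled."""
        self.rows[msg_id]["time_acked"] = now


queue = JobQueue()
first = queue.insert("resize-image")
second = queue.insert("send-email")
pending_before = queue.stream()
queue.ack(first, now=1700000000)
pending_after = queue.stream()
print(pending_before, pending_after)
```

Acknowledged messages stay in the table but drop out of the stream, which is what lets a subscriber resume after a restart without reprocessing finished work.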
27. Any New Built-in Workflows...
● This will grow over time based on new use cases that present themselves
● Most recently, Vitess Native OnlineDDL
● Or your own custom workflows and pipelines! For example:
https://medium.com/bolt-labs/streaming-vitess-at-bolt-f8ea93211c3f
28. Safety Mechanisms
● VDiff
○ Compare the full set of logical rows on both sides using a consistent snapshot
● Limit impact on production traffic
○ Configurable tablet throttling
○ -tablet_types, -max_replication_lag, -filtered_replication_wait_time, -vreplication_copy_phase_max_innodb_history_list_length, -vreplication_copy_phase_max_mysql_replication_lag, …
● -cells to avoid cross-AZ traffic
● Custom RoutingRules to make cutovers seamless for apps/users
● Rollbacks via $ vtctl ReverseTraffic
...
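The idea behind lag-based throttling is to pause bulk copying whenever replicas fall too far behind. Below is a minimal sketch of that idea, assuming a caller-supplied lag probe; `copy_with_throttle` and its parameters are hypothetical and do not correspond to the Vitess flags listed above.

```python
def copy_with_throttle(batches, get_lag, max_lag, sleep, max_waits=100):
    """Copy row batches, pausing while reported lag exceeds max_lag.

    get_lag and sleep are injected so the loop can be exercised without
    a real database or real delays.
    """
    copied, waits = [], 0
    for batch in batches:
        while get_lag() > max_lag:
            if waits >= max_waits:
                raise TimeoutError("lag did not recover; giving up")
            sleep(1.0)  # back off before checking lag again
            waits += 1
        copied.extend(batch)
    return copied, waits


# Simulated lag readings in seconds: healthy, two slow readings, then healthy.
lags = iter([0, 5, 5, 0, 0])
slept = []
copied, waits = copy_with_throttle(
    batches=[[1, 2], [3, 4], [5]],
    get_lag=lambda: next(lags),
    max_lag=2,
    sleep=slept.append,
)
print(copied, waits)
```

Checking lag before every batch keeps the copy from amplifying a problem that production traffic is already experiencing.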