All the ways to (not) break your database in production
1. The best ways to (not)
break your database in
production.
Rotem Tamir
Go Israel Meetup
September 5th, 2022
2. Hello, world!
Rotem Tamir (37), father of two.
● Co-founder @ Ariga
● Ent, a Linux Foundation Project. (11.5k ⭐)
● Atlas. (1.8k ⭐)
● Ex-Nexar, Ex-ironSource
@rotemtam @_rtam
3. Agenda (and why we’re presenting here)
● How can schema migrations break production?
● CI for migration files with Atlas + Demo
● Building a linter for schema migration files with Go
● How Go tests go test (and why we adopted the same solution)
4. Why does it matter?
Velocity:
● Experiment
● Respond to change
● Reduce “inventory”
Stability:
● DB downtime is expensive
● Data loss is disastrous
5. Why are (SQL) database outages hard to manage?
● Maintaining shared, persistent state
● Scaling out is neither trivial nor fast
● Rollbacks aren’t trivial
● Rebuild and deploy doesn’t work
7. Destructive changes
● Changes to a database schema that
result in loss of data.
● Accidental deletion of data can be
disastrous.
● Cannot be undone by reverse DDL
statements.
● MySQL/MariaDB do not support
transactional DDL, worsening the
effect of failed migrations.
ALTER TABLE `users`
DROP COLUMN
`email_address`;
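A linter can flag statements like the one above before they reach production. The following is a minimal sketch in Go, assuming a purely text-based heuristic; the `isDestructive` helper and its regular expression are hypothetical and written for illustration. A real linter would more likely inspect parsed schema changes than raw SQL text.

```go
package main

import (
	"fmt"
	"regexp"
)

// destructive matches DROP TABLE, DROP SCHEMA, DROP DATABASE and DROP COLUMN.
// Assumption: a text-level heuristic is good enough for this illustration.
// It misses MySQL's short form "ALTER TABLE t DROP col" (no COLUMN keyword).
var destructive = regexp.MustCompile(`(?is)\bDROP\s+(TABLE|SCHEMA|DATABASE|COLUMN)\b`)

// isDestructive reports whether stmt appears to drop a table, schema,
// database or column. It is a hypothetical helper, not part of any library.
func isDestructive(stmt string) bool {
	return destructive.MatchString(stmt)
}

func main() {
	stmts := []string{
		"ALTER TABLE `users` DROP COLUMN `email_address`;",
		"ALTER TABLE `users` ADD COLUMN `age` int;",
	}
	for _, s := range stmts {
		fmt.Printf("destructive=%v  %s\n", isDestructive(s), s)
	}
}
```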
8. Breaking changes
Changes to a database schema
that break the contract between
the consuming applications
(backends) and the database.
Also called “backwards
incompatible changes”.
Trivia: Even adding a column can
be a breaking change! (How?)
ALTER TABLE `users`
RENAME COLUMN `email_address` TO
`email`;
9. Table copy
Changes to a database schema
that lock the table against
operations while its data is being
restructured on disk.
If the operation is heavy and takes
a lot of time, this is effectively a self
inflicted Denial-of-Service.
ALTER TABLE `users`
MODIFY `email_address`
varchar(100);
In MySQL, changing a non-virtual column's type
requires a physical rewrite (copy) of the table.
10. Data-dependent changes
Changes to a database schema
that may fail due to constraint
violations and break a deployment
sequence.
Since not all DDL statements can
run in transactions, databases may
be left in an unknown state.
ALTER TABLE `example`.`orders`
ADD UNIQUE INDEX `idx_name`
(`name`);
This statement is considered data-dependent
because if the orders table contains duplicate values
in the name column, the uniqueness constraint
cannot be added.
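One way to reduce this risk is to check the data before running the migration. The sketch below builds a MySQL query that lists values appearing more than once in a column; the helper names `quoteIdent` and `duplicatesQuery` are hypothetical. NULLs are excluded on the assumption that the target is a MySQL unique index, which permits multiple NULL values.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent quotes a MySQL identifier with backticks, doubling any
// backtick embedded in the name. Hypothetical helper for this sketch.
func quoteIdent(name string) string {
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

// duplicatesQuery returns a query that lists the non-NULL values occurring
// more than once in column of table. An empty result suggests that adding
// a unique index on that column will not fail on existing data.
func duplicatesQuery(table, column string) string {
	t, c := quoteIdent(table), quoteIdent(column)
	return fmt.Sprintf(
		"SELECT %s, COUNT(*) AS n FROM %s WHERE %s IS NOT NULL GROUP BY %s HAVING COUNT(*) > 1",
		c, t, c, c,
	)
}

func main() {
	fmt.Println(duplicatesQuery("orders", "name"))
}
```

The check is only advisory: rows inserted between the check and the migration can still introduce duplicates.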
17. Supporting many types of databases
● We have common checks, but each database is different (even
different versions of the same engine behave differently).
● How do we avoid switch/case statement maintenance hell?
18. database/sql: Global driver registries
https://github.com/lib/pq/blob/master/conn.go
https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/database/sql/sql.go
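The registry pattern used by database/sql can be sketched as follows. This is an illustration of the idea, not code from database/sql or Atlas: the `Analyzer` interface, the `Register` and `Open` functions and the `mysqlAnalyzer` type are all hypothetical. As with `sql.Register`, registering a nil value or the same name twice panics.

```go
package main

import (
	"fmt"
	"sync"
)

// Analyzer is a hypothetical interface for engine-specific checks.
type Analyzer interface {
	Analyze(stmt string) []string
}

var (
	mu        sync.RWMutex
	analyzers = make(map[string]Analyzer)
)

// Register makes an analyzer available under the given name.
// It panics if a is nil or if name is already registered.
func Register(name string, a Analyzer) {
	mu.Lock()
	defer mu.Unlock()
	if a == nil {
		panic("lint: Register analyzer is nil")
	}
	if _, dup := analyzers[name]; dup {
		panic("lint: Register called twice for analyzer " + name)
	}
	analyzers[name] = a
}

// Open looks up a registered analyzer by name.
func Open(name string) (Analyzer, error) {
	mu.RLock()
	a, ok := analyzers[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("lint: unknown analyzer %q (forgotten import?)", name)
	}
	return a, nil
}

// mysqlAnalyzer stands in for an engine-specific implementation that would
// normally live in its own package and be pulled in with a blank import.
type mysqlAnalyzer struct{}

func (mysqlAnalyzer) Analyze(stmt string) []string {
	return []string{"mysql: checked " + stmt}
}

// init registers the analyzer as a side effect of the package being loaded.
func init() { Register("mysql", mysqlAnalyzer{}) }

func main() {
	a, err := Open("mysql")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(a.Analyze("ALTER TABLE users ADD COLUMN age int"))
}
```

The calling code never switches on the engine name. It asks the registry for an implementation, and each engine adds itself from its own package.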
19. gocloud.dev
● An open source project building libraries and tools to improve the
experience of developing for the cloud with Go.
● Maintained by Google.
● Go CDK provides commonly used, vendor-neutral generic APIs that
you can deploy across cloud providers.
20. Blob storage
● Change implementations
easily
● Just change the URL scheme:
○ mem://
○ s3://
○ gs:// (Google Cloud Storage)
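Choosing an implementation from the URL scheme can be sketched with the standard library alone. This is not the gocloud.dev API: `Bucket`, `OpenBucket`, `memBucket` and `remoteBucket` are hypothetical stand-ins, only two schemes are wired up, and no real storage is contacted.

```go
package main

import (
	"fmt"
	"net/url"
)

// Bucket is a hypothetical, minimal blob-store interface.
type Bucket interface {
	Kind() string
}

// memBucket stands in for an in-memory implementation.
type memBucket struct{}

func (memBucket) Kind() string { return "in-memory" }

// remoteBucket stands in for a cloud-backed implementation.
type remoteBucket struct{ provider, name string }

func (b remoteBucket) Kind() string { return b.provider + " bucket " + b.name }

// openers maps a URL scheme to a constructor for that implementation.
var openers = map[string]func(u *url.URL) (Bucket, error){
	"mem": func(*url.URL) (Bucket, error) { return memBucket{}, nil },
	"s3": func(u *url.URL) (Bucket, error) {
		return remoteBucket{provider: "s3", name: u.Host}, nil
	},
}

// OpenBucket parses rawURL and dispatches on its scheme.
func OpenBucket(rawURL string) (Bucket, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	open, ok := openers[u.Scheme]
	if !ok {
		return nil, fmt.Errorf("no opener registered for scheme %q", u.Scheme)
	}
	return open(u)
}

func main() {
	for _, raw := range []string{"mem://", "s3://my-bucket"} {
		b, err := OpenBucket(raw)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(raw, "->", b.Kind())
	}
}
```

Under this design, switching from an in-memory bucket in tests to a cloud bucket in production is a configuration change rather than a code change.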
22. Benefits of global driver registries
● Decoupling - infra code does not need to know which
database it is working against.
● Ecosystem - drivers can be contributed without being part of the
“central” codebase
23. Testing CLI applications with testscript
● “Hidden gem” inside the Go
standard library codebase
● Used to test the Go toolchain
● Maintained by rogpeppe in
go-internal
● Uses txtar - a trivial
text-based archive
format.
● Works in the Go
Playground!
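To show how simple the txtar format is, here is a simplified parser sketch. It is not the real implementation, which lives in golang.org/x/tools/txtar; the `Archive`, `File`, `Parse` and `markerName` names below are local to this example. The assumed layout is a leading comment section followed by files, each introduced by a marker line of the form `-- name --`. In a testscript file, the comment section holds the script itself.

```go
package main

import (
	"fmt"
	"strings"
)

// File is one named file inside the archive.
type File struct {
	Name string
	Data string
}

// Archive is a leading comment followed by zero or more files.
type Archive struct {
	Comment string
	Files   []File
}

// markerName returns the file name if line is a marker such as "-- a.txt --".
func markerName(line string) (string, bool) {
	if len(line) < len("-- x --") ||
		!strings.HasPrefix(line, "-- ") || !strings.HasSuffix(line, " --") {
		return "", false
	}
	name := strings.TrimSpace(line[3 : len(line)-3])
	return name, name != ""
}

// Parse splits s into the comment section and the files that follow it.
func Parse(s string) Archive {
	var a Archive
	var buf strings.Builder
	name, inFile := "", false
	// flush stores the text collected so far as the comment or as a file.
	flush := func() {
		if inFile {
			a.Files = append(a.Files, File{Name: name, Data: buf.String()})
		} else {
			a.Comment = buf.String()
		}
		buf.Reset()
	}
	for _, line := range strings.SplitAfter(s, "\n") {
		if n, ok := markerName(strings.TrimRight(line, "\r\n")); ok {
			flush()
			name, inFile = n, true
			continue
		}
		buf.WriteString(line)
	}
	flush()
	return a
}

func main() {
	a := Parse("exec cat hello.txt\nstdout hello\n-- hello.txt --\nhello\n")
	fmt.Printf("script: %q\n", a.Comment)
	for _, f := range a.Files {
		fmt.Printf("file %s: %q\n", f.Name, f.Data)
	}
}
```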
24. Using testscript
● Sandbox env per test
● Custom commands
● Example (test_race.txt)
● Atlas Lint example