2. Migrations
● Migration is a set of database instruction.
● They describe database changes.
Rails migrations
● Rails Migration allows you to use Ruby to define changes to
your database schema, making it possible to use a version
control system to keep things synchronized with the actual
code.
3. ● Adding a column
● Backfilling data
● Removing a column
● Changing the type of a column
● Renaming a column
● Renaming a table
● Creating a table with the force option
● Adding a check constraint
● Setting NOT NULL on an existing column
● Executing SQL directly
Some of the unsafe migrations
4. ● Adding an index non-concurrently
● Adding a reference
● Adding a foreign key
● Adding a json column
Postgres-specific checks
8. DB locks
● Locks are a mechanism for ensuring multiple operations
don’t update the same row at the same time.
● There are 8 different lock modes, ranging from ACCESS
SHARE (anyone can read and write data) to ACCESS
EXCLUSIVE (no one else is permitted to read data).
● Certain database migrations will obtain an ACCESS
EXCLUSIVE lock, and prevent the rest of your application
from reading data until the migration completes.
16. Because,
● Of the locking mode it uses and can and will cause
downtime if you have enough rows in your database and
enough traffic on the system.
● Though Postgres 11 actually addresses this problem in
certain circumstances. Adding a static default value no
longer requires obtaining a table level access exclusive
locks. But note the caveat, under certain circumstances.
● For example adding backfilling a new UUID column will
obtain that lock.
21. DB transactions
● Transactions combine multiple database operations into a
single, “all-or-nothing” operation.
● They provide four guarantees: atomicity, consistency,
isolation, and durability (“ACID”).
● Consistency and isolation are guaranteed by locks.
● When a a row is being updated, an exclusive lock is issued,
and no one else can update that same row until the first
update is complete.
22. DB transactions
● Locks are issued on a first-come, first-served basis, and live
for the duration of a transaction, even if the statement that
requested the lock has already executed.
● Migrations are automatically wrapped in a transaction.
● So for most of your database operations this might not be a
problem, as it usually happens in a the order of milliseconds.
● But when you have to perform millions of database
operations on a very large datasets.
26. So, how this transactions affect migrations
● Our columns were added, with row 1 we are not actually
locking the entire table, but instead the first row is locked,
mark it true and move on. Even though it was successful, as
I mentioned, that lock doesn't get released until your
transaction get committed.
30. disable_ddl_transaction!
● It disables that global transaction.
● It is implicitly enabled but you can explicitly when you're
running a particular migration.
● So, we write a separate migration and run once the column
was added.
● Rather than marking every single user inside/outside a
transaction, we iterate users in batches and wrapping each
individual batch inside of a transaction.
● Batch size defaults to 1000 of course it's configurable based
on your individual needs.
31. What’s the difference??
● This transaction that is updating 1000 rows is gonna
complete and commit much faster than a transaction
updating 10 million rows.
● That changes your lag time from minutes to order of
seconds or even lesser where an individual subset of users
might receive a slightly delayed response.
● So, users most likely won't even notice that anything
happened.
● So our rule of thumb here is???
36. Indexing will
● Interfere with regular operation of a database.
● Locks the table to be indexed against writes and performs
the entire index build with a single scan of the table.
● Have a severe effect if the system is a live production
database.
● Very large tables can take many hours to be indexed, and
even for smaller tables, an index build can lock out writers
for periods that are unacceptably long for a production
system.
39. algorithm: :concurrently
● Waits for all existing transactions that could potentially
modify or use the index to terminate.
● Requires more total work than a standard index build and
takes significantly longer to complete.
● Useful for adding new indexes in a production environment.
● Of course, the extra CPU and I/O load imposed by the index
creation might slow other operations.
41. L = λ * W
Little’s law
Concurrency Throughput Response Time
4 = 100 * 40 ms
Concurrent requests Req’s / sec Response Time
42. Concurrency
● Every application has a theoretical maximum level of
concurrency it can support at any given time.
● Your database obeys the same principles. How fast your
queries are, and how large your connection pool is,
determines how many queries you can concurrently handle.
● Requests start queueing when they arrive faster than your
application, or its database, can respond to them.
● If a database operation blocks many requests for a long
time, your entire application will grind to a halt.
44. DB Performance
● You don't have to understand the performance
characteristics of the application.
● But you have to understand how they change during before
and after your migration.
● You have to do this on a regular basis.
● If we had an understanding on the effects of the migration
even before we migrate them live, makes an advantage on
us to not drop on outages.
46. Gems
● To help your database healthy and still can add schema
changes.
● Static analysis will warn in advance about certain unsafe
migrations.
● Catch problems at dev time, not deploy time.
● ankane/strong_migrations
● LendingHome/zero_downtime_migrations
● Not technically a gem, but: Gitlab migration helpers
47. Strong migrations
● Catch unsafe migrations in development
● Detects potentially dangerous operations
● Prevents them from running by default
● Provides instructions on safer ways to do what you want
● Supports for PostgreSQL, MySQL, and MariaDB
50. Application Performance Monitoring
● Understanding your application's baseline performance is
critical to understanding how migrations will change its
performance characteristics.
51. Takeaways
● DON’T add columns with a default value.
● DON’T backfill data inside a transaction.
● DON’T mix schema and data changes in the same migration.
● DO add Postgres indexes concurrently.
● DO monitor and test database performance before, during,
and after migrations.