6. My assumptions:
● Schema migrations are frequent.
● Automated schema migration is a goal.
● Stage environment is similar enough to
production for testing.
● Writing a small amount of code is ok.
7. No tool is perfect.
DBAs should drive migration tool
choice.
Choose a tool that your developers like.
Or, don't hate.
8. Part 0: #dbaproblems
Part 1: Why we should work with developers
on migrations
Part 2: Picking the right migration tool
Part 3: Using Alembic
Part 4: Lessons Learned
Part 5: Things Alembic could learn
12. Changing a CHECK constraint
on 1000+ partitions.
http://tinyurl.com/q5cjh45
13. What sucked about this:
● Wasn't the first time (see 2012 bugs)
● Change snuck into partitioning UDF
Jan-April 2013
● No useful audit trail
● Some partitions affected, not others
● Error dated back to 2010
● Wake up call to examine process!
17. What was awesome:
● Used Alembic to manage the change
● Tested in stage
● Experimentation revealed which
partitions could be modified without
deadlocking
● Rolled out change with a regular release
during normal business hours
18. Process with Alembic:
1. Make changes to model.py or
raw_sql files
2. Run: alembic revision --autogenerate
3. Edit revision file
4. Commit changes
5. Run migration on stage after
auto-deploy of a release
20. Problems Alembic solved:
● Easy-to-deploy migrations including
UDFs for dev and stage
● Can embed raw SQL, issue multi-
commit changes
● Includes downgrades
21. Problems Alembic solved:
● Enables database change discipline
● Enables code review discipline
● Revisions are decoupled from release
versions and branch commit order
22. Problems Alembic solved (continued):
● 100k+ lines of code removed
● No more post-deploy schema
checkins
● Enabling a tested, automated stage
deployment
● Separated schema definition from
version-specific configuration
23. Photo courtesy of secure.flickr.com/photos/lambj
HAPPY
AS A CAT IN A BOX
24. Part 1: Why we should work
with developers on migrations
31. Database systems resist change because:
● Exist at the center of multiple systems
● Stability is a core competency
● Schema often is the only API between
components
40. Migrations are for:
● Communicating change
● Communicating process
● Executing change in a controlled,
repeatable way with developers and
operations
43. Questions to ask:
● How often does your schema change?
● Can the migrations be run without you?
● Can you test a migration before you run
it in production?
44. Questions to ask:
● Can developers create a new schema
without your help?
● How hard is it to get from an old
schema to a new one using the tool?
● Are change rollbacks a standard use of
the tool?
45. What does our system need to do?
● Communicate change
● Apply changes in the correct order
● Apply a change only once
● Use raw SQL where needed
● Provide a single interface for change
● Rollback gracefully
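The ordering and apply-once requirements can be illustrated with a minimal sketch. This is not how Alembic is implemented: the schema_version table, the MIGRATIONS list and the apply_pending function are hypothetical names, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical ordered list of (revision id, raw SQL) pairs.
MIGRATIONS = [
    ("001", "CREATE TABLE reports (id INTEGER PRIMARY KEY)"),
    ("002", "ALTER TABLE reports ADD COLUMN product TEXT"),
]


def apply_pending(conn, migrations):
    """Apply each migration at most once, in list order; return the ids applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (revision TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT revision FROM schema_version")}
    applied = []
    for revision, sql in migrations:
        if revision in done:
            continue  # already recorded, so skip it
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_version (revision) VALUES (?)", (revision,)
        )
        conn.commit()
        applied.append(revision)
    return applied


conn = sqlite3.connect(":memory:")
first_run = apply_pending(conn, MIGRATIONS)
second_run = apply_pending(conn, MIGRATIONS)
print(first_run, second_run)
```

The second call finds both revisions already recorded and applies nothing, which is the "apply a change only once" property.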
46. How you are going to feel
about the next slide:
49. A good ORM provides:
● One source of truth about the schema
● Reusable components
● Database version independence
● Ability to use raw SQL
50. And good ORM stewardship:
● Fits with existing tooling and
developer workflows
● Enables partnership with developers
● Integrates with a testing framework
51. And:
● Gives you a new way to think about
schemas
● Develops compassion for how
horrible ORMs can be
● Gives you developer-friendly
vocabulary for discussing why ORM-
generated code is often terrible
54. https://alembic.readthedocs.org
revision: a single migration
down_revision: previous migration
upgrade: apply 'upgrade' change
downgrade: apply 'downgrade' change
offline mode: emit raw SQL for a change
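To make the vocabulary concrete: each revision names its predecessor through down_revision, so the history forms a chain that can be walked from the newest revision back to the first. Below is a minimal sketch of that idea in plain Python. The revision ids and the upgrade_order helper are made up for illustration, and the sketch ignores branching, which real histories can contain.

```python
# Hypothetical revisions: id -> down_revision (None marks the first migration).
REVISIONS = {
    "a1": None,
    "b2": "a1",
    "c3": "b2",
}


def upgrade_order(revisions, head):
    """Walk down_revision links from head to the base, then reverse them."""
    order = []
    current = head
    while current is not None:
        order.append(current)
        current = revisions[current]
    return list(reversed(order))


print(upgrade_order(REVISIONS, "c3"))  # ['a1', 'b2', 'c3']
```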
55. Installing and using:
virtualenv venv-alembic
. venv-alembic/bin/activate
pip install alembic
alembic init alembic
vi alembic.ini
alembic revision -m "new"
alembic upgrade head
alembic downgrade -1
57. Helper functions?
Put your helper functions in a custom
library and add this to env.py:
import myproj.migrations
58. Ignore certain schemas or partitions?
In env.py:
import re

def include_symbol(tablename, schema):
    # Skip date-suffixed partitions such as reports_20130401
    return (schema in (None, "bixie") and
            re.search(r'_\d{8}$', tablename) is None)
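A quick check of how this filter behaves, assuming the intended pattern is an underscore followed by eight digits (a date suffix) at the end of a partition name. The table names below are made-up examples.

```python
import re


def include_symbol(tablename, schema):
    # Keep tables in the default schema or "bixie", and skip
    # date-suffixed partitions.
    return (schema in (None, "bixie")
            and re.search(r"_\d{8}$", tablename) is None)


print(include_symbol("reports", None))            # True
print(include_symbol("reports_20130401", None))   # False
print(include_symbol("crashes", "bixie"))         # True
print(include_symbol("reports", "other_schema"))  # False
```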
59. Manage User Defined Functions?
Chose to use raw SQL files
3 directories, 128 files:
procs/ types/ views/
import os

codepath = '/socorro/external/pg/raw_sql/procs'

def load_stored_proc(op, filelist):
    # Execute each raw SQL file found under the procs directory
    app_path = os.getcwd() + codepath
    for filename in filelist:
        sqlfile = os.path.join(app_path, filename)
        with open(sqlfile, 'r') as stored_proc:
            op.execute(stored_proc.read())
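One way to exercise a loader like this without a database is to pass in a stand-in for Alembic's op object. The sketch below is a dry run under stated assumptions: FakeOp, the temporary directory and the file name example_proc.sql are invented for the example, and this variant takes the directory as an argument instead of using codepath.

```python
import os
import tempfile


class FakeOp:
    """Hypothetical stand-in for alembic.op that records SQL instead of running it."""

    def __init__(self):
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)


def load_stored_proc(op, proc_dir, filelist):
    # Read each raw SQL file and hand its contents to op.execute().
    for filename in filelist:
        with open(os.path.join(proc_dir, filename), "r") as stored_proc:
            op.execute(stored_proc.read())


with tempfile.TemporaryDirectory() as proc_dir:
    # Write a throwaway SQL file so the example is self-contained.
    with open(os.path.join(proc_dir, "example_proc.sql"), "w") as f:
        f.write(
            "CREATE OR REPLACE FUNCTION example_proc() RETURNS int "
            "AS 'SELECT 1' LANGUAGE sql;"
        )
    op = FakeOp()
    load_stored_proc(op, proc_dir, ["example_proc.sql"])

print(len(op.executed))
```

Inside a real migration, the same function would receive the op module that the revision file imports.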
60. Stamping database revision?
from alembic.config import Config
from alembic import command
alembic_cfg = Config("/path/to/yourapp/alembic.ini")
command.stamp(alembic_cfg, "head")
62. Always roll forward.
1. Put migrations in a separate commit
from schema changes.
2. Revert commits for schema change,
leave migration commit in-place for
downgrade support.
63. Store schema objects in the smallest,
reasonable, composable unit.
1. Use an ORM for core schema.
2. Put types, UDFs and views in separate
files.
3. Consider storing the schema in a
separate repo from the application.
64. Write tests. Run them every time.
1. Write a simple tool to create a new
schema from scratch.
2. Write a simple tool to generate fake
data.
3. Write tests for these tools.
4. When anything fails, add a test.
66. 1. Understand partitions
2. Never apply a DEFAULT to a new
column
3. Help us manage UDFs better
4. INDEX CONCURRENTLY
5. Prettier syntax for multi-commit
sequences
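On the DEFAULT item: in PostgreSQL versions before 11, adding a column with a DEFAULT rewrote the whole table under an exclusive lock. A common workaround is to split the change into separately committed steps. The sketch below only builds the SQL strings. The function name, table, column and batch bounds are hypothetical, it assumes an integer id column for batching, and it is not something Alembic generates.

```python
def add_column_without_default(table, column, coltype, default_sql, id_ranges):
    """Return SQL that adds a column, backfills it in batches, then sets a default."""
    statements = [f"ALTER TABLE {table} ADD COLUMN {column} {coltype}"]
    for low, high in id_ranges:
        # Each batch is meant to run in its own transaction to keep locks short.
        statements.append(
            f"UPDATE {table} SET {column} = {default_sql} "
            f"WHERE id >= {low} AND id < {high} AND {column} IS NULL"
        )
    # Setting the default afterwards only affects rows inserted later.
    statements.append(
        f"ALTER TABLE {table} ALTER COLUMN {column} SET DEFAULT {default_sql}"
    )
    return statements


steps = add_column_without_default(
    "reports", "release_channel", "text", "'release'", [(0, 1000), (1000, 2000)]
)
for sql in steps:
    print(sql + ";")
```

Each returned statement would be executed and committed on its own, which is the kind of multi-commit sequence the last item in the list refers to.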