3. Zalando SE
• One of the biggest fashion retailers in Europe
• 15 European countries
• Millions of transactions every day
• Relies on hundreds of PostgreSQL databases
17. Postgres is insanely great!
• Fully ACID-compliant
• Transactional DDL
• Sprocs in different languages
• Schemas and search_path support
18. Sqitch
• Utility for writing incremental database changes
• Works on top of your VCS
• Deploy/Revert/Verify
• Explicit order of changes with sqitch.plan
• Test on staging, bundle and deploy to production
19. First steps
$ git init .
Initialized empty Git repository in /Users/alexk/devel/conference/2015/pgconf.us/.git/
$ sqitch --engine pg init pgconf.us --uri http://pgconf.us
Created sqitch.conf
Created sqitch.plan
Created deploy/
Created revert/
Created verify/
20. Deploy/Revert/Verify
$ sqitch add approle -n 'add pgconf role'
Created deploy/approle.sql
Created revert/approle.sql
Created verify/approle.sql
Added "approle" to sqitch.plan
29. Dependencies
$ sqitch revert db:pg:conference_test
Revert all changes from db:pg:conference_test? [Yes] y
- approle .. ok
$ sqitch deploy
Deploying changes to db:pg:conference_test
+ approle .... ok
+ appschema .. ok
30. Plan your trip
$ cat sqitch.plan
%syntax-version=1.0.0
%project=pgconf.us
%uri=http://pgconf.us/2015/
approle 2015-03-25T04:05:04Z Oleksii Kliukin <alexk@hintbits.com> # add a role definition for the app
appschema [approle] 2015-03-25T12:06:12Z Oleksii Kliukin <alexk@hintbits.com> # define application schema
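The plan file above lists changes in deployment order, with required changes in square brackets. As a rough illustration (this is not Sqitch's own parser; `parse_plan` and `check_order` are hypothetical helper names, and only the simple entry shape shown on this slide is handled), a short Python sketch that checks every change appears after the changes it requires:

```python
import re

# Matches the simple entry shape shown on the slide:
#   name [dep1 dep2] timestamp planner <email> # note
# Real plan files allow more syntax (tags, reworked changes, conflicts),
# which this sketch deliberately ignores.
ENTRY = re.compile(r"^(?P<name>[^\s\[\]%#@]+)(?:\s+\[(?P<deps>[^\]]*)\])?\s+\S+")


def parse_plan(text):
    """Return a list of (change_name, [required_changes]) in plan order."""
    changes = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, pragmas such as %project=..., and comments.
        if not line or line.startswith(("%", "#")):
            continue
        match = ENTRY.match(line)
        if match:
            deps = (match.group("deps") or "").split()
            changes.append((match.group("name"), deps))
    return changes


def check_order(changes):
    """Return (change, dependency) pairs where the dependency is not listed earlier."""
    seen = set()
    problems = []
    for name, deps in changes:
        problems.extend((name, dep) for dep in deps if dep not in seen)
        seen.add(name)
    return problems


PLAN = """\
%syntax-version=1.0.0
%project=pgconf.us
approle 2015-03-25T04:05:04Z Oleksii Kliukin <alexk@hintbits.com> # add a role definition for the app
appschema [approle] 2015-03-25T12:06:12Z Oleksii Kliukin <alexk@hintbits.com> # define application schema
"""

if __name__ == "__main__":
    print(parse_plan(PLAN))
    print(check_order(parse_plan(PLAN)))
```

An empty result from `check_order` means the listed order satisfies the declared dependencies.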
31. Branches
$ git checkout master
Already on 'master'
$ git merge speaker
Auto-merging sqitch.plan
CONFLICT (content): Merge conflict in sqitch.plan
Automatic merge failed; fix conflicts and then commit the result.
33. Plan file conflicts
• git rebase
• use union merge for the plan file
echo sqitch.plan merge=union > .gitattributes
• sqitch rebase to revert/deploy all changes
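To make the union merge idea concrete: when two branches each append entries to the plan file, a union merge keeps the lines from both sides instead of reporting a conflict. The Python sketch below is a simplified model of that outcome, not Git's actual merge algorithm; `union_merge_appends` is a hypothetical name, it assumes both sides only appended lines to a common base, and the plan entries are abbreviated to bare change names.

```python
def union_merge_appends(base, ours, theirs):
    """Merge two line lists that each only appended lines to a common base.

    Keeps our appended lines first, then their appended lines, skipping
    lines that are already present.
    """
    if ours[:len(base)] != base or theirs[:len(base)] != base:
        raise ValueError("both sides must start with the common base")
    merged = list(ours)
    for line in theirs[len(base):]:
        if line not in merged:
            merged.append(line)
    return merged


if __name__ == "__main__":
    base = ["approle", "appschema"]
    ours = base + ["talk"]        # change added on one branch (illustrative name)
    theirs = base + ["speaker"]   # change added on the other branch (illustrative name)
    print(union_merge_appends(base, ours, theirs))
```

The merged order is only one of several valid orders, which is why the database still has to be brought in line with the merged plan afterwards (the sqitch rebase bullet above).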
37. Staging -> production
• sqitch tag @release1.0
• sqitch bundle --to @release1.0
• cd bundle && sqitch deploy db:pg:production
38. Reworking
• Edit your changes in-place
• Do not go through a revert/deploy cycle (e.g. on production)
• The old version of the change is duplicated and renamed to change@tag
• The deploy script of the old version becomes the revert script of the new one [by default]
• Requires a tag between the old and reworked changes
39. Reworking
$ sqitch rework set_talk -n 'process description field'
Added "set_talk [set_talk@v1.0]" to sqitch.plan.
Modify these files as appropriate:
* deploy/set_talk.sql
* revert/set_talk.sql
* verify/set_talk.sql
(Callout: these files hold the new version of the change.)
40. Reworking
$ git status
On branch master
Changes not staged for commit:
modified: revert/set_talk.sql
modified: sqitch.plan
Untracked files:
deploy/set_talk@v1.0.sql
revert/set_talk@v1.0.sql
verify/set_talk@v1.0.sql
(Callouts: the untracked set_talk@v1.0 files are the original changes; revert/set_talk.sql is the original deployment script.)
41. Versioning
A package written by Hubert ‘depesz’ Lubaczewski:
https://github.com/depesz/Versioning
Instead of making changes on development server, then finding differences between production and development, deciding which ones should be installed on production, and finding a way to install them - you start with writing diffs themselves!
44. Track changes
postgres@test:~$ psql -f beer_not_null.sql -d testdb
BEGIN
register_patch
----------------
(0 rows)
ALTER TABLE
COMMIT
postgres@test:~$ psql -f beer_not_null.sql -d testdb
BEGIN
psql:beer_not_null.sql:2: ERROR: Patch beer_not_null.sql is already applied!
CONTEXT: SQL function "register_patch" statement 1
psql:beer_not_null.sql:3: ERROR: current transaction is aborted, commands ignored until end of transaction block
ROLLBACK
45. Rollbacks
Rollback patches are recommended for ‘dangerous’ changes.
BEGIN;
SELECT _v.unregister_patch('beer_not_null.sql');
ALTER TABLE public.bar ALTER COLUMN beer DROP NOT NULL;
COMMIT;
47. Metadata
   Column    |           Type           |       Modifiers
-------------+--------------------------+------------------------
 patch_name  | text                     | primary key
 applied_tsz | timestamp with time zone | not null default now()
 applied_by  | text                     | not null
 requires    | text[]                   |
 conflicts   | text[]                   |
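The metadata columns above suggest how patch tracking behaves: a patch name can be registered only once, and the requires and conflicts arrays are checked against patches already applied. The following Python sketch models that behaviour in memory. It is an illustration only, not the package's PL/pgSQL code; `PatchRegistry` and `PatchError` are hypothetical names, and apart from the "already applied" text seen on the earlier slide, the error messages are made up.

```python
from datetime import datetime, timezone


class PatchError(Exception):
    """Raised when a patch cannot be registered or unregistered."""


class PatchRegistry:
    """In-memory stand-in for the patches table (hypothetical, for illustration)."""

    def __init__(self):
        self.patches = {}  # patch_name -> metadata row

    def register_patch(self, name, requires=(), conflicts=(), applied_by="postgres"):
        # patch_name is the primary key, so a second registration must fail.
        if name in self.patches:
            raise PatchError(f"Patch {name} is already applied!")
        missing = [p for p in requires if p not in self.patches]
        if missing:
            raise PatchError(f"Missing prerequisite(s) for {name}: {missing}")
        clashing = [p for p in conflicts if p in self.patches]
        if clashing:
            raise PatchError(f"Conflicting patch(es) for {name}: {clashing}")
        self.patches[name] = {
            "applied_tsz": datetime.now(timezone.utc),
            "applied_by": applied_by,
            "requires": list(requires),
            "conflicts": list(conflicts),
        }

    def unregister_patch(self, name):
        # Used by rollback patches to forget a previously applied patch.
        if name not in self.patches:
            raise PatchError(f"Patch {name} is not applied!")
        del self.patches[name]


if __name__ == "__main__":
    registry = PatchRegistry()
    registry.register_patch("beer_not_null.sql")
    try:
        registry.register_patch("beer_not_null.sql")
    except PatchError as error:
        print(error)
```

In the real package the check and the schema change share one transaction, so a failed registration also aborts the DDL that follows it, as the psql session on the "Track changes" slide shows.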
48. Honorable mentions
• pgTAP - a database unit testing framework
http://pgtap.org
• SEM - schema evolution manager
https://github.com/gilt/schema-evolution-manager
• FDIFF - consistent sproc modifications
https://github.com/trustly/fdiff
51. Data access
[Diagram: three boxes labeled "sprocs API"]
Applications access data in PostgreSQL databases by calling stored procedures.
52. Deployment procedures
• Weekly deployment cycle
• Use release branches in git
• Changes to schema, sprocs API and Java code
• No application or database downtime
53. Release branches in git
[Directory tree diagram. Labels: database, 10_data, 20_api, 03_types, 04_tables, db_diffs, 03_types, 05_stored_procedures, R14_00_42, R14_00_42]
Use numbers to indicate the order compatible with sort -V
54. Deployment stages
• DB Diffs
• Sprocs API
• Java code
• Post-deployment actions (e.g. data migrations)
55. Database diffs rollout
• 'Versioning' package by depesz
• Changes developed by feature teams
• Reviewed by database engineers
• Schema changes should not break a running app
• Expensive locks should be avoided
56. Avoiding exclusive locks
BEGIN;
SELECT _v.register_patch('TDO-3575.camp');
ALTER TABLE zcm_data.media_placement
  ADD COLUMN mp_last_modified timestamptz;
ALTER TABLE zcm_data.media_placement
  ALTER COLUMN mp_last_modified SET DEFAULT now();
COMMIT;
-- make the column not null
UPDATE zcm_data.media_placement
SET mp_last_modified = mp_created
WHERE mp_last_modified IS NULL;
ALTER TABLE zcm_data.media_placement
ALTER COLUMN mp_last_modified SET NOT NULL;
Steps to add a not-null column with a default value:
• add a column
• set default value
• commit a transaction
• update the new column with the default value
• set the new column not null
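The steps above follow a fixed pattern, so they can be templated. Below is a minimal Python sketch that emits the statements in that order for any table and column. `not_null_column_steps` is a hypothetical helper, not part of any tool named in the talk. It leaves out the patch registration call, and it pastes identifiers and expressions into the SQL text unescaped, so it is only suitable for trusted input.

```python
def not_null_column_steps(table, column, col_type, default_expr, backfill_expr):
    """Return SQL statements, in order, for adding a NOT NULL column with a default.

    The column is added as nullable and the default is set in a short
    transaction; backfilling and SET NOT NULL follow as separate steps.
    """
    return [
        "BEGIN;",
        f"ALTER TABLE {table} ADD COLUMN {column} {col_type};",
        f"ALTER TABLE {table} ALTER COLUMN {column} SET DEFAULT {default_expr};",
        "COMMIT;",
        f"UPDATE {table} SET {column} = {backfill_expr} WHERE {column} IS NULL;",
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL;",
    ]


if __name__ == "__main__":
    statements = not_null_column_steps(
        "zcm_data.media_placement",
        "mp_last_modified",
        "timestamptz",
        "now()",
        "mp_created",
    )
    print("\n".join(statements))
```

On a large table the single UPDATE would itself be split into chunks, which is what the data migration tooling later in the talk is for.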
57. Single definition for a database object
BEGIN;
SELECT _v.register_patch('FOO-1235.add_bar.sql_diff');
-- add table bar
\i database/baz/10_data/01_bar.sql
COMMIT;
database/baz/10_data/bar/01_bar.sql
-- 01_bar.sql
-- the column type of b_id is missing in the source; integer is an assumption
CREATE TABLE bar(b_id integer PRIMARY KEY,
                 b_name TEXT);
58. Lifecycle of a database diff
• Produced by the feature team developer, tested locally
• Applied on integration DB by the author
• Reviewed by 2 Database Engineers
• Applied to Release Staging
• Tested by the QA team as a part of the feature
• Applied to patch staging and LIVE databases
61. search_path based deployments
• API is deployed after the schema changes
• Multiple versions of API sprocs co-exist in a database
• Active version is chosen with search_path
• Fast rollbacks
• Almost no in-place sprocs modification
62. postgres=# SET search_path = pgconf_eu_2014;
SET
postgres=# select hello();
hello
---------------
Hello Madrid!
(1 row)
postgres=# SET search_path = pgconf_us_2015;
SET
postgres=# select hello();
hello
---------------
Hello New York!
(1 row)
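The session above shows the same call resolving to different functions depending on search_path. A toy Python model of that lookup follows, assuming a plain "first schema on the path that has the name wins" rule. `resolve` and `SCHEMAS` are hypothetical names, and real PostgreSQL resolution also involves implicitly searched schemas and argument types, which are not modelled here.

```python
def resolve(function_name, search_path, schemas):
    """Return the first function called function_name found along search_path."""
    for schema in search_path:
        functions = schemas.get(schema, {})
        if function_name in functions:
            return functions[function_name]
    raise LookupError(f"function {function_name}() does not exist")


# Two versions of the same API living side by side, one schema per version.
SCHEMAS = {
    "pgconf_eu_2014": {"hello": lambda: "Hello Madrid!"},
    "pgconf_us_2015": {"hello": lambda: "Hello New York!"},
}

if __name__ == "__main__":
    print(resolve("hello", ["pgconf_eu_2014"], SCHEMAS)())
    print(resolve("hello", ["pgconf_us_2015"], SCHEMAS)())
```

Switching versions, or rolling back, only changes the path a client uses; neither schema's functions are modified.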
63. New API deployment
APP INSTANCE 1
APP INSTANCE 2
Initially, all instances are running the same version of Java code and PostgreSQL API
64. New API deployment
New sprocs API is deployed but inactive, old one is still in use by the application
APP INSTANCE 1
APP INSTANCE 2
65. New API deployment
One instance of the application is switched to the new API via the search_path
APP INSTANCE 1
APP INSTANCE 2
67. Bootstrapping databases
• Database objects are defined in SQL files in git
• Every change produces both a db diff and an edit of a SQL file
• Order of bootstrapping is controlled by file names
00_create_schema.sql
04_tables/05_newsletter.sql
• One line to bootstrap the fresh testing database
find . -name '*.sql' | sort -V | xargs cat | psql -d testdb -1f -
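Ordering by file name works because numeric prefixes are compared as numbers rather than as text. The Python sketch below approximates that ordering with a natural-sort key. It is not a re-implementation of `sort -V`, which has further rules that are not modelled here. `version_key` is a hypothetical helper, and apart from the two file names taken from this slide, the paths are made up for illustration.

```python
import re


def version_key(path):
    """Split a path into text and number runs so that digits compare numerically.

    re.split with a capturing group alternates text, digits, text, ...,
    so the parts at odd positions are always the digit runs.
    """
    parts = re.split(r"(\d+)", path)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


if __name__ == "__main__":
    files = [
        "./10_data/01_bar.sql",          # illustrative path
        "./2_extensions.sql",            # illustrative path
        "./04_tables/05_newsletter.sql",
        "./00_create_schema.sql",
    ]
    for name in sorted(files, key=version_key):
        print(name)
```

A plain lexicographic sort would place `./2_extensions.sql` after `./10_data/01_bar.sql`; zero-padded prefixes such as `04_` avoid that difference altogether.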
68. Tooling for data migrations
• Split updates or deletes in chunks
• Run vacuum after a given number of changes
• Control the load on the database server
• Multi-threaded to run on multiple shards at once
• Limits for max load and number of rows per minute
69. environment: integration
database: testdb
getid: select id from public.foo
update: update public.bar SET foo_count = 1 WHERE foo_id = %(id)s
commitrows: 20
maxload: 3.0
vacuum_cycles: 10
vacuum_delay: 100
vacuum_table: bar
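The configuration above implies a loop: fetch the ids, apply the update in batches of `commitrows`, commit after each batch, and vacuum periodically. A minimal Python sketch of that loop follows. It is not the actual tool. `run_migration` and its callbacks are hypothetical, reading `vacuum_cycles` as "vacuum after this many commits" is an assumption, and `maxload` and `vacuum_delay` are not modelled.

```python
def chunked(ids, size):
    """Yield consecutive slices of ids with at most size elements each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def run_migration(ids, apply_update, commit, vacuum, commitrows=20, vacuum_cycles=10):
    """Apply apply_update to every id in batches and return the number of commits.

    apply_update receives a dict such as {"id": 42}, matching the %(id)s
    placeholder in the config; commit and vacuum are callbacks standing in
    for a database driver.
    """
    commits = 0
    for chunk in chunked(ids, commitrows):
        for row_id in chunk:
            apply_update({"id": row_id})
        commit()
        commits += 1
        # Assumed meaning of vacuum_cycles: vacuum once every N commits.
        if commits % vacuum_cycles == 0:
            vacuum()
    return commits


if __name__ == "__main__":
    log = []
    total = run_migration(
        list(range(45)),
        apply_update=lambda params: log.append(("update", params["id"])),
        commit=lambda: log.append(("commit",)),
        vacuum=lambda: log.append(("vacuum",)),
        commitrows=20,
        vacuum_cycles=2,
    )
    print(total, log.count(("commit",)), log.count(("vacuum",)))
```

Keeping each transaction small limits how long row locks are held, and the periodic vacuum reclaims the dead rows that the updates leave behind.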
70. A sneak peek into the future
• Automated testing of schema changes
• Schema comparison tool to quickly spot schema anomalies
• Handle dependencies between migrations and schema changes
• Stay tuned!