CI/CD of relational databases
Jasmin Fluri
About me
Jasmin Fluri
Database / Automation Engineer
Schaltstelle GmbH
Database Development
Development Process
Automation and Tooling
Continuous Integration
Continuous Delivery
Software Engineering
Pipelining / Provisioning
@jasminfluri
«The person that builds / fixes the
CI/CD pipeline.»
Why am I talking about CI/CD of relational DBs?
[Timeline: 2009, 2011, 2012, 2015, 2017, 2018, 2019, 2021, 2022]
Stations: Database Engineer in a Java Environment; First Data Mart Project; First DWH Project; Another DWH Project; Automated Provisioning of Servers; Automated XP -> Win7 Migration; Another DB Project; Research Project on DB CI/CD (2021); Finished Research (2022)
Tooling: CI/CD / Ansible / Git / Docker / Flyway
Deployment styles along the way: Manual Deployment / no VCS; Manual Deployment / SVN; Manual Deployment / Git; automated CI/CD!
… a short story about a database change!
Disclaimer
Persons and processes are fictitious.
Similarities with real projects are coincidental
and not intended.
In our development project…
… consisting of a Spring Boot application …
… and an Oracle database …
… work 4 developers!
CI/CD of the Spring Boot App
is set up consistently!
Testing is an integral part!
But it looks different with the
database...
No team member has been there
from the beginning; the code was
taken over from their
predecessors.
The skills are mainly in the
development of the Spring Boot
app.
In the event of an emerging database change ...
... the team either tries to abstract it on the application side ...
... or to avoid it completely!
If it is nevertheless unavoidable...
... someone in the team must write a migration script (dbchange.sql)!
The script is sent to the DBA by e-mail.
The DBA executes the script manually!
The execution changes the access logic for the application, so the application must be redeployed.
... Why is this approach problematic?
• SQL scripts are not available under VCS!
• Because the script is sent by email, we do not know exactly when the change is made!
• Manual execution leaves a lot of room for error!
• No versioned access layer, which leads to downtime!
• No regression tests on the DB side!
What happens when DB changes are avoided?
… it appears that most large enterprises
actually care about minimizing application
maintenance of existing production systems.
That causes them to utilize “bad” schemas,
and generally to allow “database decay”.
– Stonebraker et al. (2016)
If we neglect our (DB) applications, they
will disintegrate!
Often you also hear…
"... we don't avoid DB changes, but we have fixed release windows!"
What effects do fixed release windows have?
With fixed release windows …
• Features that are done need to wait … this causes dependencies among features.
• The risk increases because many features are deployed at once … releases get bigger and bigger!
• Short feedback cycles are not possible … feedback is only available at the release window.
The longer we wait, the
greater the risk of
deployment!
Source: https://kromatic.com/blog/the-risk-of-building-the-wrong-stuff/
Features are coupled!
Mental load of
developers increases!
Risk increases!
We should deliver features when
they are ready! Only in
production do we know how they
behave in reality.
What do we know about releasing
software?
«The faster your teams can make changes to your software, the sooner you can deliver value to your customers, run experiments, and receive valuable feedback.»
– State of DevOps Report 2022
A high level of software delivery performance requires
technical preconditions!
Technical capabilities build upon one another.
Continuous delivery and version control amplify each other’s
ability to promote high levels of software delivery performance.
Combining continuous delivery, loosely-coupled architecture,
version control, and continuous integration fosters software
delivery performance that is greater than the sum of its parts.
State of DevOps Report 2022
https://services.google.com/fh/files/misc/2022_state_of_devops_report.pdf
How is software delivery performance measured?
5 Key Metrics!
• Lead Time for Changes (low)
• MTTR (low)
• Change Failure Rate (low)
• Deployment Frequency (high)
• Deployment Reliability (high)
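As a rough illustration of how four of these metrics could be derived from deployment records, here is a minimal sketch. The `Deployment` record, its field names and the `delivery_metrics` function are assumptions made for this example and are not taken from the State of DevOps Report; reliability is left out because it needs operational data beyond deployments.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause an incident?
    restored_at: Optional[datetime] = None  # when service was restored


def hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def delivery_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute four delivery metrics for a non-empty list of deployments."""
    failures = [d for d in deployments if d.failed]
    restore_times = [
        hours(d.deployed_at, d.restored_at) for d in failures if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(hours(d.committed_at, d.deployed_at) for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": mean(restore_times) if restore_times else 0.0,
    }


# Made-up sample data: four deployments in four days, one of them failing.
SAMPLE = [
    Deployment(datetime(2022, 1, 3, 9), datetime(2022, 1, 3, 11)),
    Deployment(datetime(2022, 1, 4, 9), datetime(2022, 1, 4, 13),
               failed=True, restored_at=datetime(2022, 1, 4, 14)),
    Deployment(datetime(2022, 1, 5, 9), datetime(2022, 1, 5, 12)),
    Deployment(datetime(2022, 1, 6, 9), datetime(2022, 1, 6, 12)),
]

if __name__ == "__main__":
    print(delivery_metrics(SAMPLE, period_days=4))
```

Small, frequent deployments push the first two numbers in the right direction; the last two show how often a change hurts and how quickly the team recovers.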
High-performing teams recover
much faster from incidents.
Why is that?
Small changes = small risk = small impact = small feedback loops
Source: NYTimes – The Best Path to Long-Term Change Is Slow, Simple and Boring
What are the positive effects of small changes?
Small changes have minor effects! It's easy to see all the
elements that are affected!
Small changes carry only a small risk!
Small changes can be more easily undone or corrected if
they are incorrect.
Goal of software releases:
We want to ship small changes
continuously to get fast and continuous
feedback!
... deploying small changes is thus a good practice!
... but in database development it is not very common!
Database development in numbers! (Studies from 2015-2022)
• More than half of all database applications have no data quality or application testing!
• Less than half of DB development projects have automated development workflows!
• Lack of automation of DB CI/CD is one of the most common bottlenecks in releasing changes.
• Less than 30% of DB development projects have both automated testing and static code analysis.
How is CI/CD for applications structured (non-DB)?
Continuous Integration
Developer → (push) → Version Control System → (trigger pipeline on commit or on merge request) → Continuous Integration Server → (execute) → Continuous Integration Pipeline
1 - Checkout source code
2 - Build and test of the application
3 - System tests, static code analysis and metrics
4 - Reports and notifications
5 - Notify
• File-based development
• New versions replace old ones.
• No state!
• No order of changes!
How is database CI/CD different?
if (system == database){
build = installation;
}
Continuous Integration
Developer → (push) → Version Control System → (trigger pipeline on commit or on merge request) → Continuous Integration Server → (execute) → Continuous Integration Pipeline
1 - Checkout source code
2 - Build and test of the database
3 - System tests, static code analysis and metrics
4 - Reports and notifications
5 - Notify
• File-based development, generated code, exported code
• The changes have a defined sequence!
• Builds are always installation of changes!
• A lot of state!!
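To make "build = installation" concrete, here is a minimal sketch of an installer that applies changes in a defined sequence and remembers what is already installed. It uses SQLite from the Python standard library purely so the example is runnable; the `MIGRATIONS` list, the `schema_history` table and the `install` function are assumptions for illustration and do not describe any specific migration tool.

```python
import sqlite3

# Hypothetical, ordered changes: (version, SQL). In a real project these would be files.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
    (3, "CREATE INDEX idx_customer_email ON customer (email)"),
]


def install(conn: sqlite3.Connection, migrations=MIGRATIONS) -> list[int]:
    """Apply all changes that are not yet recorded; return the versions applied."""
    # The database itself stores which versions it has seen: this is the "state".
    conn.execute("CREATE TABLE IF NOT EXISTS schema_history (version INTEGER PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_history")}
    newly_applied = []
    for version, sql in sorted(migrations):  # the defined sequence
        if version in applied:
            continue                         # already installed, skip it
        conn.execute(sql)
        conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(install(db))  # first run installs every change
    print(install(db))  # second run finds nothing left to install
```

Unlike an application build, running the same "build" twice does not produce the same result: the second run depends on what the first one left behind in the database.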
What are the preconditions for good database CI/CD?
Before you start building a database CI/CD pipeline you need…
• Everything stored under version control
• Static Code Analysis / Linting
• Database Migration Tool
• Automated Tests
• Decoupled Codebase (DB vs APP)
(1) Version Control
… without Version Control there’s no single source of truth!
Everything that belongs to the DB must be stored in version control!
🤞 36.6%
Database source code in version control
Example Project Structure
myproject/
├── …
├── docs/
│   ├── documentation.md
├── db/                  DDL Code (states)
│   ├── 0_sequences/
│   ├── 1_tables/
│   ├── 2_mviews/
│   ├── 3_views/
│   ├── 4_packages/
│   └── 5_utils/
├── migrations/          Migration Scripts (deltas)
│   ├── scripts/
└── db-tests/            Test Code and Test Data
    ├── packages/
    └── data/
• DDLs contain the object definitions (state based approach)
• Migration scripts allow upgrading to the next version (migration based approach)
• Test Code tests our database logic and its behavior!
BUT: Pure file-based development is uncommon in DB development.
The choice of your database migration tool will affect how you store your source code!
Passive version control (generating or exporting code from the DB) introduces the risk that we forget to export things that we have changed inside the DB.
The majority of people use a mix of generating and writing migration scripts!
Only 23% of developers use a file-based approach!
In order to achieve robust deployments, migration scripts must be repeatable!
What happens if migration scripts are not repeatable?
Non-repeatable migration script: If it fails midway, we need a new script with the remaining changes (and corrections). Otherwise, rerunning the script would immediately run into an error.
Repeatable migration script: If it fails, we fix it and run it again. It continues where it failed, executes the remaining migrations and skips the ones already applied.
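The difference can be sketched with two variants of the same change. This is only an illustration under assumptions: it uses SQLite so that it runs anywhere, and the table `orders`, the helper `column_exists` and both `migrate_*` functions are made up for this example. Other databases need different guards to check the current state.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Look up the current state of the schema instead of assuming it."""
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def migrate_repeatable(conn: sqlite3.Connection) -> None:
    """Every step checks the current state first, so a rerun skips finished steps."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)")
    if not column_exists(conn, "orders", "currency"):
        conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'CHF'")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_orders_currency ON orders (currency)")
    conn.commit()


def migrate_non_repeatable(conn: sqlite3.Connection) -> None:
    """No guards: a second run fails on the very first statement."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'CHF'")
    conn.commit()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    migrate_repeatable(db)
    migrate_repeatable(db)  # safe to run again
    other = sqlite3.connect(":memory:")
    migrate_non_repeatable(other)
    try:
        migrate_non_repeatable(other)
    except sqlite3.OperationalError as error:
        print("rerun failed:", error)
```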
Branching strategies in DB development!
DB changes are related to infrastructure changes. They often build on each other and only make sense in sequence.
The problem with feature branches
Tests on branches are useless if the branches do not integrate the changes on main!
Trunk-based development and Continuous Integration
[Diagram: the Integration Pipeline triggers the Delivery Pipeline; Artefact Repository]
How do code reviews work when you’re doing trunk-based development?
Asynchronous code reviews can take a lot of time!
«There’s no time for Pair Programming»
Source: https://twitter.com/d_stepanovic
(2) Static Code Analysis / Linting
To be able to perform code reviews, you need standards! These standards should be enforced automatically!
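What "enforced automatically" could look like in its simplest form is sketched below. The rules in `RULES` are hypothetical team standards chosen for the example, and a regex check like this is only a stand-in: a real project would more likely plug a dedicated SQL linter into the pipeline.

```python
import re

# Hypothetical team standards: (rule name, pattern that violates it, message).
RULES = [
    ("no-select-star", re.compile(r"\bselect\s+\*", re.IGNORECASE), "list columns explicitly"),
    ("no-trailing-whitespace", re.compile(r"[ \t]+$", re.MULTILINE), "remove trailing whitespace"),
    ("no-tabs", re.compile(r"\t"), "indent with spaces"),
]


def lint(sql: str) -> list[str]:
    """Return one finding per rule violation, with the line it occurs on."""
    findings = []
    for name, pattern, message in RULES:
        for match in pattern.finditer(sql):
            line_no = sql.count("\n", 0, match.start()) + 1
            findings.append(f"line {line_no}: {name}: {message}")
    return findings


if __name__ == "__main__":
    script = "SELECT *\nFROM customer \nWHERE id = 1\n"
    problems = lint(script)
    for problem in problems:
        print(problem)
    # In a pipeline, a non-zero exit code would fail the build.
    raise SystemExit(1 if problems else 0)
```

Once the standards are checked by the pipeline, reviewers can concentrate on the change itself instead of on formatting.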
(3) Database Migration Tool
Database Schema Evolution
Initial DDL → Database Version 1 → (DDL & DML) → Database Version 2 → (DDL & DML) → Database Version 3 → …
Database Migration Tools
🤞 14.7%
Changeset formats – just use SQL!
🤞 43.7%
(4) Automated Database Tests
Projects without automated testing spend about 25% of their team's capacity on manual testing.
Testing Tools and Frameworks
Unit Testing in the application
- API Tests
- Integration Tests
- Unit Tests of Backends
Unit Testing in the database
- Unit Tests
- Integration Tests
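A database-side test can be as small as the sketch below. It is an illustration under assumptions: the schema in `SCHEMA` is invented for the example, and SQLite plus Python's `unittest` are used only because they run without any setup. A project on another database would use a test framework that fits that database.

```python
import sqlite3
import unittest

# Hypothetical schema under test: a constraint and a view carry the logic.
SCHEMA = """
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
CREATE VIEW rich_account AS
    SELECT id, balance FROM account WHERE balance >= 1000;
"""


class AccountSchemaTest(unittest.TestCase):
    def setUp(self):
        # Every test gets a fresh database, so tests cannot influence each other.
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(SCHEMA)

    def tearDown(self):
        self.conn.close()

    def test_negative_balance_is_rejected(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO account (id, balance) VALUES (1, -5)")

    def test_view_only_lists_rich_accounts(self):
        self.conn.executemany(
            "INSERT INTO account (id, balance) VALUES (?, ?)", [(1, 50), (2, 5000)]
        )
        rows = self.conn.execute("SELECT id FROM rich_account").fetchall()
        self.assertEqual(rows, [(2,)])


def run_tests() -> bool:
    """Run the test case and report whether everything passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccountSchemaTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()


if __name__ == "__main__":
    raise SystemExit(0 if run_tests() else 1)
```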
(5) Decoupled Codebase
What happens when release cycles are tied to other teams (or components)?
[Diagram: Team A, Team B, Team C; Release 1, Release 2, Release 3, Release 4]
This is not Continuous Delivery!
Why is it important to be able to deploy changes at any time?
• We don’t want to execute deployments outside of business hours.
  − Because if something goes wrong, nobody is available to help fix things.
  − We deserve our free time.
• Being able to deploy continuously reduces stress inside the team.
  − Deployment isn’t seen as a «special event» - as it shouldn’t be.
What is an abstraction layer?
Application A, Application B, Application C (consume services)
Service Layer (Database) (provides services)
Infrastructure
The database schema is an API - changes to it can be "breaking" for consumers!
If the database schema changes, the applications have to adapt to the change if there is no versioned access layer!
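One common way to build a versioned access layer is with views, sketched here. Everything named in the example (the `person` tables, the views `api_v1_person` and `api_v2_person`, both functions) is an assumption for illustration, and SQLite is used only to keep it runnable. The point is that the table changes shape while the version 1 view keeps returning what existing consumers expect.

```python
import sqlite3


def create_v1(conn: sqlite3.Connection) -> None:
    """Initial schema: applications only read through the view, never the table."""
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
        INSERT INTO person VALUES (1, 'Ada Lovelace');
        CREATE VIEW api_v1_person AS SELECT id, full_name FROM person;
    """)


def migrate_to_v2(conn: sqlite3.Connection) -> None:
    """Split the name column, keep the v1 contract alive and add a v2 contract."""
    conn.executescript("""
        BEGIN;
        CREATE TABLE person_new (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        INSERT INTO person_new
            SELECT id,
                   substr(full_name, 1, instr(full_name, ' ') - 1),
                   substr(full_name, instr(full_name, ' ') + 1)
            FROM person;
        DROP VIEW api_v1_person;
        DROP TABLE person;
        -- Version 1 still returns full_name, now computed from the new columns.
        CREATE VIEW api_v1_person AS
            SELECT id, first_name || ' ' || last_name AS full_name FROM person_new;
        -- Version 2 exposes the new shape for consumers that are ready for it.
        CREATE VIEW api_v2_person AS
            SELECT id, first_name, last_name FROM person_new;
        COMMIT;
    """)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_v1(db)
    migrate_to_v2(db)
    print(db.execute("SELECT * FROM api_v1_person").fetchall())
    print(db.execute("SELECT * FROM api_v2_person").fetchall())
```

An application built against version 1 does not need to be redeployed when the table changes; it can move to version 2 on its own schedule.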
Coupling of release cycles must be eliminated!
[Diagram: Application A consumes services; DB Service XY provides services; Infrastructure; Version 1, Version 2, Version 3, Version 4; Team A, Team B]
Teams want to work independently and continuously!
… but the problem is … only a few already have an abstraction layer!
🤞 64.8%
Now we know all preconditions to build a CI/CD pipeline for our database!
Let's see how it could be built!
Some important things…
• Ideally, the developer can create a new development environment by clicking a button.
• Check your code before you deploy it!
• Store your artefacts in an artefact repository – no more VCS access!
• Backup / Snapshot your environment before installing changes!
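The snapshot step can be sketched as follows. This is a minimal illustration, assuming a SQLite database so that the standard library's `Connection.backup` can take the snapshot; the function `install_with_snapshot` is hypothetical, and on a server database the snapshot would be taken with that product's own backup or restore-point mechanism.

```python
import sqlite3


def install_with_snapshot(conn: sqlite3.Connection, change_sql: str) -> sqlite3.Connection:
    """Snapshot the database, apply the change, fall back to the snapshot on failure."""
    snapshot = sqlite3.connect(":memory:")
    conn.backup(snapshot)   # copy the current state before touching anything
    try:
        conn.executescript(change_sql)
        snapshot.close()    # change succeeded, snapshot no longer needed
        return conn
    except sqlite3.Error:
        conn.close()        # discard the half-changed database
        return snapshot     # continue with the untouched copy


def table_names(conn: sqlite3.Connection) -> list[str]:
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript("CREATE TABLE t (x INTEGER); INSERT INTO t VALUES (1);")
    # The second statement fails, so the first one must not survive either.
    db = install_with_snapshot(db, "CREATE TABLE u (x); INSERT INTO missing VALUES (1);")
    print(table_names(db))
```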
What effects does CI/CD have in database development?
• Not automating integration and deployment is a change preventer!
• The number of deployments increases more than fivefold once teams introduce automated pipelines.
• The change failure rate decreases by more than 75% once automated pipelines exist.
• The cognitive load of developers decreases if they can rely on automated pipelines!
@jasminfluri
@jasminfluri
jasminfluri
Thank you for your time!
What questions do you have?
Resources / References
Accelerate Book
https://itrevolution.com/book/accelerate/
DORA 2022
https://cloud.google.com/blog/products/devops-sre/dora-2022-accelerate-state-of-devops-report-now-out
DORA 2021
https://services.google.com/fh/files/misc/state-of-devops-2021.pdf
NYTimes – The Best Path to Long-Term Change Is Slow, Simple and Boring
https://www.nytimes.com/2017/07/31/your-money/the-best-path-to-long-term-change-is-slow-simple-and-boring.html
Martin Fowler – Continuous Integration
https://www.martinfowler.com/articles/continuousIntegration.htm
Stonebraker – Database Decay
http://people.csail.mit.edu/dongdeng/papers/bigdata2016-decay.pdf
Dragan Stepanovic – Twitter
https://twitter.com/d_stepanovic
