From 100 to 1,000+ deployments a day
Pat Hermens
Amsterdam | April 2-3, 2019
From 100 to 1,000+ deployments
What does it take to get to the next level?
Pat Hermens
Development Manager
Coding for 15+ years
Father & husband
Rotterdam, Netherlands
@phermens
pat@coolblue.nl
Why 1,000+ deployments/day?
¯_(ツ)_/¯
“It depends”
100 / 1,000 / 10,000; so what?
http://www.cs.utexas.edu/users/ewd/ewd02xx/ewd249.pdf
… Apparently we are too much trained to disregard
differences in scale, to treat them as “gradual differences that
are not essential.”
We tell ourselves that what we can do once, we can also do
twice and by induction, we fool ourselves into believing that
we can do it as many times as needed, but this is just not true!
— Edsger Dijkstra, 1969
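To make the difference in scale concrete, here is a small sketch of the average gap between deployments at each of the three rates above. It assumes deployments are spread evenly over a 24-hour day, which is an illustrative simplification and not a figure from the talk; the function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def seconds_between_deployments(deployments_per_day: int) -> float:
    """Average gap between deployments, assuming an even spread over 24 hours."""
    if deployments_per_day <= 0:
        raise ValueError("deployments_per_day must be positive")
    return SECONDS_PER_DAY / deployments_per_day


for rate in (100, 1_000, 10_000):
    gap = seconds_between_deployments(rate)
    print(f"{rate:>6,} deployments/day -> one every {gap:,.2f} seconds")
```

Under that assumption, 100 a day is one deployment roughly every 14 minutes, 1,000 a day is one roughly every 86 seconds, and 10,000 a day is one roughly every 9 seconds.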
Prerequisites...
The “faster to master” checklist
Developing
⬜ Coding guidelines
⬜ Code reviews
⬜ Code analysis
⬜ Short-lived branches
⬜ Feature toggles
Monitoring
⬜ Defined key metrics
⬜ Measure – create dashboards
⬜ Proactivity – add alerting (& act!)
Testing
⬜ Unit tests
⬜ Module tests
⬜ Integration tests
⬜ End-to-end tests
⬜ Test automation
Deploying
⬜ Continuous Integration
⬜ Continuous Delivery
⬜ Infrastructure as Code
The “faster to master” checklist
Developing
☑ Coding guidelines
☑ Code reviews
⬜ Code analysis
☑ Short-lived branches
⬜ Feature toggles
Monitoring
☑ Defined key metrics
⬜ Measure – create dashboards
⬜ Proactivity – add alerting (& act!)
Testing
☑ Unit tests
⬜ Module tests
⬜ Integration tests
⬜ End-to-end tests
☑ Test automation
Deploying
☑ Continuous Integration
⬜ Continuous Delivery
⬜ Infrastructure as Code
The “faster to master” checklist
Developing
☑ Coding guidelines
☑ Code reviews
⬜ Code analysis
☑ Short-lived branches
☑ Feature toggles
Monitoring
☑ Defined key metrics
☑ Measure – create dashboards
⬜ Proactivity – add alerting (& act!)
Testing
☑ Unit tests
☑ Module tests
⬜ Integration tests
☑ End-to-end tests
☑ Test automation
Deploying
☑ Continuous Integration
☑ Continuous Delivery
⬜ Infrastructure as Code
The “faster to master” checklist
Developing
☑ Coding guidelines
☑ Code reviews
☑ Code analysis
☑ Short-lived branches
☑ Feature toggles
Monitoring
☑ Defined key metrics
☑ Measure – create dashboards
☑ Proactivity – add alerting (& act!)
Testing
☑ Unit tests
☑ Module tests
☑ Integration tests
☑ End-to-end tests
☑ Test automation
Deploying
☑ Continuous Integration
☑ Continuous Delivery
☑ Infrastructure as Code
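As one illustration of the “Feature toggles” item, here is a minimal sketch of a toggle read from an environment variable. The `FEATURE_` naming convention, the `is_enabled` helper and the `new-checkout` toggle are all hypothetical and chosen for this example; the talk does not say how toggles were implemented.

```python
import os


def is_enabled(toggle_name: str, default: bool = False) -> bool:
    """Return True if the toggle is switched on via an environment variable.

    Hypothetical convention: toggle "new-checkout" maps to FEATURE_NEW_CHECKOUT.
    """
    env_key = "FEATURE_" + toggle_name.upper().replace("-", "_")
    raw = os.environ.get(env_key)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def checkout_flow() -> str:
    # Both code paths are merged and deployed; the toggle decides which one runs.
    if is_enabled("new-checkout"):
        return "new checkout"
    return "old checkout"
```

A toggle like this lets unfinished work be merged on a short-lived branch and deployed while switched off, so deploying and releasing become separate decisions.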
Story time!
Responsibility Autonomy
Failure Ownership
Responsibility
[Diagram, built up over several slides: a central “Hosting & Deployment” (“DevOps”) box surrounded first by Teams A-D, then Teams A-F, then Teams A-J. In the final frame, twelve team boxes surround a box labelled “Deployment”.]
Autonomy
Serilog + Splunk + DataDog
$ $ $ $ $
UDP
TCP
[Diagram repeated: a central “Hosting & Deployment” (“DevOps”) box surrounded by Teams A-J.]
AWS Lambda stats for Vanessa Optima Prime (as of 2019-04-01)
Last 30 days
Last 7 days
Ownership
Build System
Development team tooling choices
Infrastructure team tooling choices
What’s next?
¯_(ツ)_/¯
Fail early, fail often,
but always fail forward.
Failure
[REDACTED]
[REDACTED]
NOTE: 20 cubic centimeters
NOTE: Now 50cc!
NOTE: 49 cubic centimeters
NOTE: More than 49cc’s!
NOTE: Now 45cc!
#war-room
The “faster to master” checklist
Developing
⬜ Coding guidelines
⬜ Code reviews
⬜ Code analysis
⬜ Short-lived branches
⬜ Feature toggles
Monitoring
⬜ Defined key metrics
⬜ Measure – create dashboards
⬜ Proactivity – add alerting (& act!)
Testing
⬜ Unit tests
⬜ Module tests
⬜ Integration tests
⬜ End-to-end tests
⬜ Test automation
Deploying
⬜ Continuous Integration
⬜ Continuous Delivery
⬜ Infrastructure as Code
The “100 to 1,000” checklist
Responsibility
⬜ Can I control my deployments?
⬜ Have I removed all bottlenecks?
⬜ Do I have good support systems?
⬜ Am I collaborating with my peers?
Autonomy
⬜ Freedom to innovate/improve?
⬜ Am I in control and independent?
⬜ Can I find out what went wrong?
⬜ Do I have the skills to fix it?
Ownership
⬜ Agreed interfaces (or contracts)?
⬜ Out-of-the-box solutions?
⬜ Tooling is 100% automated?
⬜ Local build == “Production” build?
Failure
⬜ Monitoring = Alerting = Action
⬜ Take pride in improving
⬜ Small deliverables, quickly
⬜ Non-blocking deployments
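The “Monitoring = Alerting = Action” item can be read as: every measured metric has a threshold, and every threshold has something that happens when it is crossed. The sketch below illustrates that reading only; the `AlertRule` type, the `evaluate` function, the metric name and the threshold are all hypothetical, and no real monitoring product's API is being modelled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AlertRule:
    metric: str
    threshold: float
    action: Callable[[str, float], None]  # what to do when the rule fires


def evaluate(rule: AlertRule, value: float) -> bool:
    """Run the rule's action when the measured value exceeds the threshold."""
    if value > rule.threshold:
        rule.action(rule.metric, value)
        return True
    return False


# Illustrative usage: the "action" here just records the alert in a list.
triggered = []
rule = AlertRule("error_rate_percent", 1.0, lambda m, v: triggered.append((m, v)))
evaluate(rule, 0.4)  # below threshold: nothing happens
evaluate(rule, 2.5)  # above threshold: the action runs
```

Tying the action to the rule itself is the design point: a dashboard that nobody is paged for, or an alert with no defined response, stops at monitoring.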
Questions?
https://sddconf.com/brands/sdd/library/DOLLARD-Debugging.pdf
Thanks!
@phermens hermens.com.au
pat@coolblue.nl careersatcoolblue.com

Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019