Getting a System to Production
... and keeping it there
Eoin Woods

eoin.woods@endava.com

www.eoinwoods.info
Content
Introducing Production Systems
What Goes Wrong in Production?
Solutions for Production Systems
Conclusions
Production Systems
What is a production system?
Any system being used for real work
Why is Productionisation Hard?
No one teaches you about production
who do you talk to?
what do they want?
what is the definition of “done”?
Production is difficult for developers
hard to access, interrogate, debug, change, ...
A new cast of characters
Development: Developers, Users
Production: Users, Developers, Auditors, Operations, Acquirers, Infrastructure, Business Management
Production is constrained
Highly controlled
Content is all valuable
Change can be difficult
Production is unpredictable
Production is highly visible!
You don’t own production
What goes wrong?
Performance surprises
Interactive load
Batch time surprises
System abusers!
“all transactions this year”,
“average since 1967”, ...
Environment bombshells
Constraints and contention
Unexpected behaviour
Integration points
Failures happen
Software defects
Platform failures
Environment failures
Security tangles
Security is simple in Development
Much more complex in Production!
Finding Solutions
Key requirements for production
Functionally correct
does what the business process requires
Stability
behaves predictably in all situations
Capacity
can process the workload required (at all times)
Security
limits access to those who are authorised to have it
Solution Framework
Requirements: Correctness, Stability, Capacity, Security
Solution areas: Design Principles, Technology, Processes
Examples: Simplicity, Resource Governor, Threat Modelling
General Principles
One Team
Automate
Measure and Improve (feedback loops)
Good Enough over Perfection
Timeless principles … today led by CD and DevOps
Example: Achieving Stability
Stability - design principles
Fail quickly
fail fast, timeouts
Isolate problems
flow control, circuit breakers, bulkheads, asynchronous integration
Ensure steady state operation
housekeeping, predictable resource allocation, governors, throttling
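A minimal sketch of the "fail quickly" principle, assuming a dependency call that may hang. The names `call_with_timeout`, `fast_dependency` and `slow_dependency` are hypothetical, introduced only for illustration; the caller stops waiting once the deadline passes instead of blocking indefinitely.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_s, *args):
    """Run func(*args) in a worker thread and give up waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        # Raises concurrent.futures.TimeoutError if the result is not ready in time.
        return future.result(timeout=timeout_s)
    finally:
        # Do not block on the worker: the caller fails fast, the work itself is not cancelled.
        pool.shutdown(wait=False)


def fast_dependency():
    """Hypothetical healthy dependency."""
    return "ok"


def slow_dependency():
    """Hypothetical dependency that responds too slowly."""
    time.sleep(0.3)
    return "late"
```

Note that the slow call keeps running in the background in this sketch; a real system would usually also need a way to cancel or abandon the underlying work.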
Stability - technology solutions
Timeouts
Circuit Breaker
Fail fast
Bulkhead
Governor
Housekeeping
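As one illustration of the bulkhead idea, the sketch below caps how many calls may be in flight against a single dependency, so that a slow dependency cannot absorb every worker. The `Bulkhead` class and its limit are assumptions made for this example, not part of the original slides.

```python
import threading


class Bulkhead:
    """Limit concurrent calls into one dependency; reject extra calls immediately."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, func, *args):
        # Non-blocking acquire: a full bulkhead rejects rather than queues.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full: request rejected")
        try:
            return func(*args)
        finally:
            self._slots.release()
```

Rejecting instead of queueing is a design choice here: it keeps the failure local and visible, in line with failing fast.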
Example - Circuit Breaker
[State diagram: states Clear, Checking and Tripped; transitions labelled err_returned, timeout, and err_returned && err_count > 10]
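The state machine can be sketched in code. The slide text does not say which transition joins which states, so this sketch assumes: Clear moves to Tripped once errors exceed the limit, Tripped moves to Checking after a timeout, and Checking returns to Clear on success or to Tripped on another error. The class name, the injected `clock` parameter and the choice to count consecutive errors are all assumptions.

```python
import time


class CircuitBreaker:
    """Sketch of a Clear / Checking / Tripped circuit breaker (assumed transitions)."""

    def __init__(self, max_errors=10, reset_timeout_s=30.0, clock=time.monotonic):
        self.state = "Clear"
        self.err_count = 0
        self.max_errors = max_errors
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._tripped_at = None

    def call(self, func, *args):
        if self.state == "Tripped":
            if self._clock() - self._tripped_at < self.reset_timeout_s:
                raise RuntimeError("circuit tripped: failing fast")
            self.state = "Checking"  # timeout elapsed: allow one trial call
        try:
            result = func(*args)
        except Exception:
            self.err_count += 1
            # A failed trial call, or too many errors while Clear, trips the breaker.
            if self.state == "Checking" or self.err_count > self.max_errors:
                self.state = "Tripped"
                self._tripped_at = self._clock()
            raise
        # Any success closes the circuit and resets the (consecutive) error count.
        self.state = "Clear"
        self.err_count = 0
        return result
```

Injecting the clock keeps the timeout behaviour testable without sleeping.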
Stability - process principles
Repeatability
defined processes, practice scenarios, pre-live environments
Automation
automate the routine, automate the difficult
allow the human back in the loop on demand
Transparency
logging, monitoring, alerts, trends
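To make the transparency point concrete, here is a small sketch of a sliding-window error-rate check that logs a warning when a threshold is crossed. The class name, window size, threshold and logger name are hypothetical; a production system would more likely use its existing monitoring platform.

```python
import logging
from collections import deque

logger = logging.getLogger("prod.monitor")  # hypothetical logger name


class ErrorRateMonitor:
    """Track the last `window` request outcomes and alert on a high error rate."""

    def __init__(self, window=100, threshold=0.05):
        self._outcomes = deque(maxlen=window)  # old outcomes fall off automatically
        self.threshold = threshold

    def error_rate(self):
        if not self._outcomes:
            return 0.0
        return self._outcomes.count(False) / len(self._outcomes)

    def record(self, ok):
        """Record one outcome; return True if an alert was raised."""
        self._outcomes.append(bool(ok))
        rate = self.error_rate()
        if rate > self.threshold:
            logger.warning("error rate %.1f%% exceeds %.1f%%",
                           rate * 100, self.threshold * 100)
            return True
        return False
```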
Stability - process automation
Logging & Metrics
Monitoring
Automation
Achieving Capacity
Capacity - design principles
Minimise workload
efficiency is important
Flatten the peaks
move workload around
Design for the large (scalability)
understand where the time goes
multiply by a million
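A rough worked example of "understand where the time goes, multiply by a million": take measured per-item costs for each stage and scale them to production volume. The stage names and timings below are invented purely for illustration.

```python
def project_runtime_s(stage_costs_s, items):
    """Scale per-item stage costs (seconds) up to `items` items."""
    return {stage: cost * items for stage, cost in stage_costs_s.items()}


# Hypothetical per-item measurements from a small test run.
measured = {"parse": 0.0002, "lookup": 0.002, "write": 0.0005}

projected = project_runtime_s(measured, 1_000_000)
slowest_stage = max(projected, key=projected.get)
total_minutes = sum(projected.values()) / 60
```

With these assumed figures a cost of 2 ms per lookup becomes roughly 2000 seconds at a million items, which is why the dominant stage is the first place to look.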
Capacity - technology solutions
Measure and minimise
understand where the work is
Caching and pre-computing
reduce the work to be done
Sharding and partitioning
separate workload to allow scale
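One common way to partition workload is to hash a key onto a fixed number of shards. The sketch below assumes string keys and a hypothetical `shard_for` helper; it uses `hashlib` rather than the built-in `hash()` so the mapping is stable across processes.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard index in [0, shard_count)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count
```

A simple modulo scheme like this moves most keys when `shard_count` changes, so it suits cases where the number of shards is fixed in advance.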
Capacity - solutions
Segment Timings
Static cache
Lookaside cache
Precompute
Result set caching
Phased batch
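A lookaside cache can be sketched as follows: the caller checks the cache first and only goes to the slow source on a miss. The `LookasideCache` class and its `loader` callback are assumptions for this example; expiry and size limits are left out to keep it short.

```python
class LookasideCache:
    """Check the cache first; load from the backing source only on a miss."""

    def __init__(self, loader):
        self._loader = loader  # function that fetches a value from the slow source
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._loader(key)
        self._store[key] = value
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the source data changes."""
        self._store.pop(key, None)
```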
Moving Work Around
[Figure: two charts of Utilisation (0 to 100) over an x-axis running from 0 to 24]
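Assuming the two charts contrast a peaky utilisation profile with a flattened one, the idea can be sketched as scheduling deferrable work into the quietest periods. The function name, the load figures in the usage note and the `cap` parameter are all hypothetical.

```python
def flatten_peaks(fixed_load, deferrable_total, cap):
    """Place deferrable work into the quietest periods first, never exceeding cap.

    Returns the new per-period load and any work that could not be placed.
    """
    schedule = list(fixed_load)
    remaining = deferrable_total
    # Visit periods from least to most loaded (order fixed before any work is placed).
    for period in sorted(range(len(schedule)), key=lambda p: schedule[p]):
        if remaining <= 0:
            break
        room = max(0, cap - schedule[period])
        placed = min(room, remaining)
        schedule[period] += placed
        remaining -= placed
    return schedule, remaining
```

For example, `flatten_peaks([10, 10, 80, 90, 20, 10], 100, cap=60)` fills the two quietest periods up to the cap and leaves the existing peak of 90 unchanged.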
Capacity - process principles
Model and estimate
Test capacity on realistic environments
allows model calibration
Monitoring and trend analysis
tests theory against reality
spots impending storms before they hit
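Trend analysis can be as simple as fitting a straight line to recent utilisation samples and estimating when it will reach a limit. This is a sketch under the assumption of evenly spaced samples and roughly linear growth; the function name is hypothetical.

```python
def periods_until_limit(samples, limit):
    """Least-squares linear trend over evenly spaced samples.

    Returns the estimated number of periods after the last sample until
    `limit` is reached, or None if the trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (limit - intercept) / slope  # x position where the line hits the limit
    return max(0.0, crossing - (n - 1))
```

Real workloads are rarely this linear, so an estimate like this is best treated as an early warning to investigate rather than a forecast.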
Achieving Security
Security - design principles
What they don’t have won’t hurt you
least privilege - grant the minimum needed
Security needs simplicity
what you can’t analyse you can’t be sure about
Don’t put your eggs in one basket
separate privileges to avoid total breaches
Fail safely
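Least privilege and failing safely can be combined in a deny-by-default permission check, sketched below. The role names, permission strings and `is_allowed` function are invented for illustration.

```python
# Hypothetical roles, each granted only the permissions it needs.
ROLE_PERMISSIONS = {
    "report_reader": frozenset({"report:read"}),
    "report_admin": frozenset({"report:read", "report:write"}),
}


def is_allowed(role, permission):
    """Return True only for an explicitly granted permission; deny on any error."""
    try:
        return permission in ROLE_PERMISSIONS[role]
    except (KeyError, TypeError):
        # Unknown or malformed role: fail safely by denying access.
        return False
```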
Security - solutions
Authentication & Roles
Least privilege / separation
Privacy (TLS)
Isolation (firewalls & zones)
Trust (certs)
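For the TLS and certificate items, a minimal sketch of a client-side TLS configuration using Python's standard `ssl` module. The helper name `make_client_context` and the choice of TLS 1.2 as the minimum version are assumptions for this example.

```python
import ssl


def make_client_context():
    """Build a client TLS context that verifies server certificates and hostnames."""
    # The default context enables certificate verification and hostname checking.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to standard library clients that accept one, such as `http.client.HTTPSConnection(host, context=make_client_context())`.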
Security - processes
Model threats to identify mitigation
Define policy to know what to protect
Test security as well as functions
Security - techniques
Security Model
Threat Model
Summary
Production is just different
it’s not yours and you need to respect that
Production is demanding
Correctness
Stability
Capacity
Security
Summary (ii)
Solutions for each requirement by area
principles
technologies
processes
Summary (iii)
Production requirements and principles
go back to the age of the mainframe
CD and DevOps the latest incarnation
welcome attention from developers
new tech enabling new possibilities
breaking down silos to make it happen
Books
Software Systems Architecture: Working with Stakeholders Using Viewpoints and Perspectives, Second Edition
Nick Rozanski and Eoin Woods
Eoin Woods

www.eoinwoods.info

@eoinwoodz
Thank you.
Questions?
Acknowledgements
http://www.icons-land.com
http://www.alamy.com/
http://www.42u.com

Getting Your System to Production and Keeping it There