In this talk, we explore five practical, tried-and-tested, real world techniques for improving operability with many kinds of software systems, including cloud, Serverless, on-premise, and IoT.
Logging as a live diagnostics vector with sparse Event IDs
Operational checklists and ‘Run Book dialogue sheets’ as a discovery mechanism for teams
Endpoint healthchecks as a way to assess runtime dependencies and complexity
Correlation IDs beyond simple HTTP calls
Lightweight ‘User Personas’ as drivers for operational dashboards
Based on our work in many industry sectors, we will share our experience of helping teams to improve the operability of their software systems through
Required audience experience
Some experience of building web-scale systems or industrial IoT/embedded systems would be helpful.
Objective of the talk
We will share our experience of helping teams to improve the operability of their software systems. Attendees will learn some practical operability approaches and how teams can expand their understanding and awareness of operability through these simple, team-friendly techniques.
From a talk given at Continuous Lifecycle London 2018: https://continuouslifecycle.london/sessions/practical-team-focused-operability-techniques-for-distributed-systems/
Run Book dialogue sheets
Checklists for typical
runbooktemplate.infoRun Book dialogue sheets
Hours of operation
During what hours does the service or system actually need to operate? Can portions or features of the
system be unavailable at times if needed?
Hours of operation - core features
(e.g. 03:00-01:00 GMT+0)
Hours of operation - secondary features
(e.g. 07:00-23:00 GMT+0)
Data and processing flows
How and where does data flow through the system? What controls or triggers data flows?
(e.g. mobile requests / scheduled batch jobs / inbound IoT sensor data )
Lightweight user personas
explore the UX and needs of
different roles for rapid
decisions via dashboards
use modern logging, Run Book
dialogue sheets, endpoint
healthchecks, correlation IDs,
and user personas as
team collaboration techniques
Team Guide to
Matthew Skelton & Rob Thatcher
Download a free sample chapter
•Team Guide to Software Operability by Matthew
Skelton and Rob Thatcher (Skelton Thatcher
Publications, 2016) http://operabilitybook.com/
•Run Book template & Run Book dialogue sheets