How to handle incidents, downtime & outages
Devopsdays, Amsterdam 2015
David Mytton, Founder, Server Density
Cost of uptime?
$2.9bn

Q1: 2015
$870m

Q1: 2015
$4.1bn

Q1: 2015
How much are you spending?
Expect downtime
• Prepare
• Respond
• Postmortem
Prepare
• On call
• Primary/secondary
• Reachability
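The primary/secondary rota and the reachability point can be illustrated with a small sketch. This is an assumption-laden example, not the speaker's tooling: `Responder`, `page_in_order`, and the `try_reach` callback are hypothetical names standing in for whatever paging mechanism a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Responder:
    name: str
    role: str  # "primary" or "secondary" (hypothetical labels)


def page_in_order(
    responders: Sequence[Responder],
    try_reach: Callable[[Responder], bool],
) -> Optional[Responder]:
    """Page each responder in rota order; return the first who acknowledges.

    try_reach is a placeholder for the real paging mechanism (phone, SMS,
    push) and returns True when the person acknowledges the page.
    """
    for responder in responders:
        if try_reach(responder):
            return responder
    return None  # nobody was reachable: the caller should escalate further


if __name__ == "__main__":
    rota = [Responder("alice", "primary"), Responder("bob", "secondary")]
    # Simulate the primary being unreachable so the secondary is paged.
    print(page_in_order(rota, lambda r: r.role != "primary"))
```

Returning `None` instead of raising keeps the "nobody answered" case explicit, so the caller has to decide what escalation means for their team.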
Prepare
• On call
• Off call
• Docs
• Searchable
• Independent
Prepare
• Key info
• Team contacts
• Vendor contacts
• Key credentials
Prepare
• Key info
• Unexpected situations
• Communication
• Internet access
• Support access
Respond
• First responder
1. Load incident response checklist
2. Log into Ops War Room
3. Log incident in JIRA
4. Begin investigation
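The four first-responder steps above could be encoded as an ordered checklist so that none is skipped under pressure. The sketch below is only an illustration: the `Checklist` class and its methods are hypothetical and do not talk to JIRA or any chat system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Steps copied from the slide, in order.
FIRST_RESPONDER_STEPS: Tuple[str, ...] = (
    "Load incident response checklist",
    "Log into Ops War Room",
    "Log incident in JIRA",
    "Begin investigation",
)


@dataclass
class Checklist:
    steps: Tuple[str, ...] = FIRST_RESPONDER_STEPS
    completed: List[str] = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        """Return the next outstanding step, or None when all are done."""
        if len(self.completed) < len(self.steps):
            return self.steps[len(self.completed)]
        return None

    def complete(self, step: str) -> None:
        """Mark a step done; refuse steps taken out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)


if __name__ == "__main__":
    checklist = Checklist()
    while (step := checklist.next_step()) is not None:
        print("doing:", step)
        checklist.complete(step)
```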
Respond
• Key response principles
• Log everything
• Frequent public updates
• Gather the team
• Escalate!
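"Log everything" is easiest to honour when recording an entry is cheap. A minimal sketch of a timestamped incident log follows. `IncidentLog` and its fields are assumptions made for illustration, and a real team would more likely log straight into its chat room or ticket.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class IncidentLog:
    # Each entry is (timestamp, author, message).
    entries: List[Tuple[datetime, str, str]] = field(default_factory=list)

    def record(self, author: str, message: str,
               when: Optional[datetime] = None) -> None:
        """Append an entry, stamping it with the current UTC time by default."""
        stamp = when if when is not None else datetime.now(timezone.utc)
        self.entries.append((stamp, author, message))

    def timeline(self) -> str:
        """Render entries in chronological order, one per line."""
        ordered = sorted(self.entries, key=lambda entry: entry[0])
        return "\n".join(
            f"{stamp.isoformat(timespec='seconds')} {author}: {message}"
            for stamp, author, message in ordered
        )


if __name__ == "__main__":
    log = IncidentLog()
    log.record("alice", "Alert received, starting investigation")
    log.record("alice", "Posted first public status update")
    print(log.timeline())
```

A timeline captured this way can later feed the "tell the story" part of the postmortem.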
Postmortem
• Within a few days
• Tell the story
• Appropriate technical detail
• What failed, why?
• How it’s going to be fixed
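The postmortem points (the story, what failed, why, and the fix) could be checked mechanically before publishing. This is a rough sketch under stated assumptions: the section keys and the `missing_sections` helper are hypothetical, not a format prescribed by the talk.

```python
from typing import Dict, List

# Hypothetical section keys derived from the slide's bullet points.
REQUIRED_SECTIONS = ("story", "what_failed", "why", "fix")


def missing_sections(draft: Dict[str, str]) -> List[str]:
    """Return the required sections that are absent or blank in the draft."""
    return [
        name for name in REQUIRED_SECTIONS
        if not draft.get(name, "").strip()
    ]


if __name__ == "__main__":
    draft = {
        "story": "Timeline of the outage as customers experienced it.",
        "what_failed": "Primary database node.",
        "why": "",
    }
    print(missing_sections(draft))
```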
Thank you very much
david@serverdensity.com
@davidmytton
