What To Do When It All Goes So Wrong
David Levy, AdventuresInSql.com
SQL Saturday #67, Chicago
About Me: More than 11 years in IT; SQL Server DBA for over 3 years; Previous Life as Developer; Blogger at http://adventuresinsql.com; Syndicated on SQLServerCentral.com; Syndicated on SQLServerPedia.com; @dave_levy on Twitter.
EMERGENCY! Peak Time of Peak Sales Day; Typical Hourly Sales $100K/HR; Order Entry Screen is Locked Up; Users Report Slowness Initially; Now the “Sales Center” Application is Just “Clocking”.
[Communicate] Let Everyone Know There is a Problem: Prevent Duplicated Efforts; Allows Others to Speak Up (Recent Changes, Related Issues). http://www.freedigitalphotos.net/images/view_photog.php?photogid=1983
[Communicate] Send Up a Flare: Send to an IT Only Distribution Group; Keep the Subject Line General; Provide Broad Overview Including Systems Impacted, Major Symptoms Including Error Messages, Number of People Impacted, Any Location Specific Information.
[Communicate] What Resources Do You Need? Subject Matter Experts; Specialized Equipment.
[Communicate] Never Assign Blame: Only State Facts.
[Communicate]
To: IT Emergencies
Subject: Sales Center Issues

Sales Center users are reporting that the Order Entry screen has quit responding. We are currently investigating the issue with the Sales Center Development Team. We will provide updates as we know more.
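The flare message above can be assembled from the fields the slides list (systems, symptoms, people impacted, locations). The following is only a minimal sketch using Python's standard `email.message.EmailMessage`; the function name `build_flare`, the addresses, and the sample values are hypothetical and not part of the talk, and actually sending the message is left out.

```python
from email.message import EmailMessage


def build_flare(subject, systems, symptoms, people_impacted, locations,
                sender="dba-oncall@example.com",      # hypothetical address
                to="it-emergencies@example.com"):     # hypothetical IT-only group
    """Build a facts-only first notification; it assigns no blame."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject  # keep the subject line general
    body = [
        "Systems impacted: " + ", ".join(systems),
        "Major symptoms: " + "; ".join(symptoms),
        "People impacted: " + str(people_impacted),
        "Locations: " + ", ".join(locations),
        "We will provide updates as we know more.",
    ]
    msg.set_content("\n".join(body))
    return msg


# Sample values are made up for illustration.
flare = build_flare(
    subject="Sales Center Issues",
    systems=["Sales Center"],
    symptoms=["Order Entry screen has quit responding"],
    people_impacted="unknown",
    locations=["all"],
)
print(flare)
```

Keeping the body to a fixed set of fields makes it harder to slip speculation or blame into the first message.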
[Collect] What Are the Symptoms? What Locations are Involved?
[Collect] What Systems are Involved? SQL Server; AS400; Mainframe; Web Farm; Major Network Components like Load Balancers.
[Collect] What Has Changed? Look at Change Control Calendar; Talk to Primary On-Calls for Related Systems.
[Collect] Anything in the Logs? Windows Logs; Application Specific Logs; Custom Exception Handling Systems.
[Collect] What are Performance Indicators Showing? Perfmon; SQL Wait Stats; Third-party tools.
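SQL Server wait statistics (exposed through the `sys.dm_os_wait_stats` view) are cumulative, so a single reading says little about what is happening right now. One common approach, sketched here as an assumption rather than as the method used in the talk, is to take two snapshots a short time apart and look at the difference. The snapshots below are plain dictionaries of wait type to `wait_time_ms` with made-up numbers; collecting them from the server is left out.

```python
def top_waits(before, after, limit=3):
    """Return the wait types that grew the most between two snapshots.

    `before` and `after` map wait type -> cumulative wait_time_ms.
    """
    deltas = {}
    for wait_type, total_ms in after.items():
        delta = total_ms - before.get(wait_type, 0)
        if delta > 0:
            deltas[wait_type] = delta
    # Largest increase first.
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:limit]


# Hypothetical snapshots taken a minute apart.
before = {"LCK_M_X": 1_000, "PAGEIOLATCH_SH": 50_000, "CXPACKET": 20_000}
after = {"LCK_M_X": 91_000, "PAGEIOLATCH_SH": 52_000, "CXPACKET": 20_000}

top = top_waits(before, after)
for wait_type, delta_ms in top:
    print(f"{wait_type}: +{delta_ms} ms")
```

In this made-up sample the lock wait dominates the interval, which would point the investigation toward blocking rather than disk.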
[Process] Analyze Collected Information: Are There Any Obvious Signs of Trouble? Can the Problem be Linked to a Change? Can Any Patterns be Identified?
[Process] Prove It Is Your Issue: Shows Humility; Shows Respect for Everyone Else’s Time; Avoid Appearing Arrogant.
[Process] Prove It Is Your Issue: Construct Tests to Prove Theories in Order of Likelihood Until Problem Proven or Theories Exhausted; Faster than arguing about what it is not; How can you know it is not your issue?
[Process] List Potential Actions: Rank by effort, confidence, level of risk; Develop action plans for best options and re-rank; Each potential action should have a rollback plan.
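The ranking step can be made concrete with a small sketch. The slides name the criteria (effort, confidence, level of risk) but not how to weigh them, so the ordering used here (highest confidence first, then lowest risk, then lowest effort) is an assumption, as are the 1-to-5 scales, the `Action` type, and the sample actions.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    effort: int       # 1 (trivial) to 5 (large); hypothetical scale
    confidence: int   # 1 (a guess) to 5 (near certain)
    risk: int         # 1 (safe) to 5 (dangerous)
    rollback: str     # how to undo the change


def rank(actions):
    """Order candidate actions; refuse any that lack a rollback plan."""
    missing = [a.name for a in actions if not a.rollback.strip()]
    if missing:
        raise ValueError("No rollback plan for: " + ", ".join(missing))
    return sorted(actions, key=lambda a: (-a.confidence, a.risk, a.effort))


# Sample candidates with made-up scores.
candidates = [
    Action("Add index", effort=3, confidence=3, risk=3, rollback="Drop the index"),
    Action("Restart app servers", effort=1, confidence=2, risk=2,
           rollback="Nothing to undo; restart again if needed"),
    Action("Back out weekend patch", effort=2, confidence=4, risk=2,
           rollback="Reapply the patch"),
]

ranked = rank(candidates)
for position, action in enumerate(ranked, start=1):
    print(position, action.name)
```

Re-running `rank` after the action plans are fleshed out covers the "re-rank" step, since the scores usually shift once the real effort is known.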
[Process] Define Measures: What will indicate things have gotten better? Adding this index will reduce Disk IO by 10 million reads per second; The execution time of query x will drop from 6 minutes to 50 milliseconds.
[Process] Define Measures: What will indicate things have gotten worse? Disk IO may go up; The execution time of query x may go up; Adding this index may slow inserts from the order upload process.
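Writing the measures down before the change makes the result easy to judge afterwards. A minimal sketch, assuming lower values are better for every measure: the `Measure` type and the three outcome labels are hypothetical, while the 6 minutes and 50 milliseconds come from the query x example above.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    name: str
    baseline: float  # value before the change
    target: float    # value expected if the change works (lower is better)


def evaluate(measure, observed):
    """Classify an observed value against the baseline and the target."""
    if observed <= measure.target:
        return "better"
    if observed > measure.baseline:
        return "worse"
    return "partial"  # improved, but short of the stated target


# Query x: 6 minutes before the change, 50 ms expected after it.
query_x = Measure("query x execution time (ms)", baseline=6 * 60 * 1000, target=50)

print(evaluate(query_x, 45))       # better
print(evaluate(query_x, 400_000))  # worse
print(evaluate(query_x, 200_000))  # partial
```

A "worse" result is the signal to use the rollback plan; a "partial" result feeds back into collecting more data.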
[Respond] Communicate Your Intentions; Make the Change (Follow a written plan; Make a single change; A single person should make the change; Document any additional steps taken); Start Over by Collecting More Data.
[The War Room] Signs You Need to Convene a War Room: Having Trouble Finding Anything Wrong; 30 Minutes Without Progress; An Issue Appears to Span Multiple Systems; Having Difficulty Getting People Engaged.
[The War Room] Get Everyone in a Room: No Changes Made Outside the Room; No Heroes (Watch out for people doing a lot of typing; Avoid changes that take more than a few minutes); Have a Call-in Number for Remote Coworkers.
[The War Room] Have a Technology Kit: Old Switch; Patch Cords; Mice + Mouse Pads; Power Strips.
[The War Room] Monitor Your Guest List: 1-2 Representatives From Each Team; Try to Keep Management Out; Watch for Disruptive People.
[Communicate]
To: IT Emergencies
Subject: Sales Center Issues

We are convening a war room for the Sales Center issue. Everyone working on the issue please meet in the North Conference Room. Remote/WFH coworkers should dial into the conference bridge 888-888-1234, participant code: 1234.
[The War Room] White Board the Issue: Every System Gets Own Column; Write All Facts on White Board; Closed Items Get Crossed Out Not Erased; Include a Resolution for Each Closed Item.
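The whiteboard rules map onto a simple data structure: one column per system, items that are never deleted, and a required resolution when an item is closed. The sketch below is only an illustration for remote participants or for the later write-up; the class names, the text rendering, and the sample facts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    fact: str
    resolution: Optional[str] = None  # set when the item is closed

    @property
    def closed(self):
        return self.resolution is not None


class Whiteboard:
    def __init__(self):
        self.columns = {}  # system name -> list of Item, in the order written

    def add(self, system, fact):
        item = Item(fact)
        self.columns.setdefault(system, []).append(item)
        return item

    def close(self, item, resolution):
        # Closed items are crossed out, not erased, and must say why.
        if not resolution.strip():
            raise ValueError("A closed item needs a resolution")
        item.resolution = resolution

    def render(self):
        lines = []
        for system, items in self.columns.items():
            lines.append(system)
            for item in items:
                if item.closed:
                    lines.append(f"  [x] {item.fact} -> {item.resolution}")
                else:
                    lines.append(f"  [ ] {item.fact}")
        return "\n".join(lines)


# Sample facts are made up for illustration.
board = Whiteboard()
blocking = board.add("SQL Server", "Blocking seen on order tables")
board.add("AS400", "Patch applied over the weekend")
board.close(blocking, "Symptom, not cause; cleared once the app reconnected")
print(board.render())
```

Because closed items stay in the list, the rendered board doubles as the record to capture when the war room closes out.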
[The War Room] Share the Floor: Likely Issue Owner Has the Lead; Make Sure Everyone is Heard; Contributing Often Involves Staying Out of the Way; Don’t Be Afraid to Fade Back and Run the Whiteboard.
[The War Room] Never Call “Not-It” and Leave: Not Helpful; You May be Wrong; Appears Arrogant.
[The War Room] Keep an Eye On Time: Provide Regular Updates to Management; Bring in Food Around Meal Times (Raises Spirits; Brings in More People to Help).
[Communicate]
To: IT Emergencies
Subject: Sales Center Issues Update

The Sales Center war room is still going. We are currently looking into a driver issue with IBM. All necessary resources have been engaged.
[The War Room] Keep People in Reserve: Each Team Should Divide up the Day; Rotate People In and Out; Send Someone Home Early to Come in Early.
[The War Room] Closing Out: Communicate Resolution; Capture Contents of Whiteboard; Clean Up Room.
[Communicate]
To: IT Emergencies
Subject: Sales Center Issues Resolved

The Sales Center issue has been resolved. The issue was caused by a patch that was applied over the weekend. Now that it has been backed out, everything has returned to normal.
Questions?