Incident Consequence Analysis
Incident consequence analysis for use in the major incident process

Published in: Business, Economy & Finance

Incident Consequence Analysis
<Description of major incident>
Service desk references: ######

This is reported as a <Minor, Major> Incident. <Business units> affected in <location>. <x> minutes unavailable and/or <x> minutes degraded. <Resolution>. <Service> affected by <cause>. <No, blank> further root cause analysis required. Escalated to <escalations>.

This incident affected the company <less than, the same as, greater than> usual. The outage was <less than, blank, greater than> normal. The risk is <less than, blank, greater than> average.

Prepared by: <first name, surname>

<Major Incident Dashboard>
Rolling Incident averages: Classification – <Norm>, Outage analysis – <Norm>, Risk management – <Norm>
This incident was <calculation> less than the norm using the Incident User Metric.

Resources
<Job descriptions and names of resources involved>

Service affected
<Name of service from catalogue>

Description of incident
<Description of incident>

Resolution
<Description of resolution>

Timelines & details
Incident: dd/mm hhmm
Detected: hhmm
Repair: hhmm
Recover: hhmm
Restore: hhmm
Resolution: hhmm
Workaround: <Description>
Diagnosis: hhmm
Escalations: Problem management: dd/mm <Any extra details>
Notification / Report: dd/mm

Time analysis
<Graph of Expanded Incident Lifecycle>

Incident Breakdowns
<Pie chart of incident breakdown by service>
<Pie chart of incident breakdown by cause>

Time unavailable/degraded
<x> minutes unavailable, <x> minutes degraded
MTTR=<x> minutes, MTBF=<x> days, MTBSI=<x> days.

Incident User Metric Cost of downtime analysis: <x>
<Incident User Metric skyline>
<Last 10 Incidents – ROC analysis>

Classification (<x>%), Outage analysis (<x>%), Risk Management (<x>%)
S    CR   OP   U    P    IT   B    I    V    CM
<x>  <x>  <x>  <x>  <x>  <x>  <x>  <x>  <x>  <x>

Classification
Scope (S): <input from major incident draft template>
Credibility (CR): <input from major incident draft template>
Operations (OP): <input from major incident draft template>
Urgency (U): <input from major incident draft template>
Prioritization (P): <input from major incident draft template>

Outage analysis
IT service outage analysis (IT): <input from major incident draft template>
Business service outage analysis (B): <input from major incident draft template>

Risk management
Risk impact (I): Best practice CIA analysis (CRAMM) – Confidentiality (unauthorized disclosure), Integrity (unauthorized modification or misuse), Availability (destruction or loss).
  <input from major incident draft template>
Risk vulnerability (V): What are the chances of the outage occurring considering loss, error or failure?
  <input from major incident draft template>
Countermeasures (CM): What is being done to prevent this from happening again?
  <input from major incident draft template>

Closure
Escalations: Please note that if no comments or questions are received within 5 working days, this report is classed as Accepted.
  <input from major incident draft template>

Example

Incident Consequence Analysis
Email outage in Pofadder
Service desk references: 555772

This is reported as a Minor Incident. All Business units affected in Pofadder. 12 minutes unavailable and 238 minutes degraded. Mail server recycled. Messaging affected by bug. No further root cause analysis required. Escalated to Infrastructure Manager.

This incident affected the company less than usual. The outage was normal. The risk is less than average.

Prepared by: Ronald Bartels

Rolling Incident averages: Classification – 69%, Outage analysis – 49%, Risk management – 54%
This incident was 66% less than the norm using the Incident User Metric.

Resources
Service Level Manager (M Mouse), Regional Infrastructure team leader (D Duck).

Service affected
Messaging

Description of incident
IT customers located in the Pofadder office experienced slow delivery of mail messages to other regions and business units. IT customers were unable to confirm instructions or send credit minutes via email. The inbound and outbound queues on the Exchange server were not flowing. Documents scanned and emailed via multi-function devices were largely affected where the document size was over 1.5 MB. The log file gave a specific error code, for which the knowledge base suggested several resolutions (http://support.microsoft.com/kb/329617). The bad mail folder was cleared and the SMTP service was restarted. However, this did not clear the issue, and normal operations resumed only when the mail server was power cycled.

Resolution
Mail server recycled.

Timelines & details
Incident: 9/10 09h30
Detected: 11h25
Repair: 13h15
Recover: 13h17
Restore: 13h35
Resolution: 13h35, server restarted
Workaround: Failed - Bad mail folder cleared and SMTP service restarted.
Diagnosis: 13h06
Escalations: Problem management: 9/10
Notification / Report: 9/10

Time analysis
<Graph of Expanded Incident Lifecycle>

Incident Breakdowns
Incident breakdown by Service (affected: Messaging): Ecommerce, Monitoring, Printing, Third party, Operations, Backups, Service desk, Storage, AD, Documents, Security, Intranet, Hosting, Payments, Voice, Messaging, Data networks
Incident breakdown by Cause (caused by: Bug): Change, Capacity, Process, Vendor, Hardware, Bug, Environmental, Service Provider, Carrier, Configuration, Component Failure

Time unavailable/degraded
12 minutes unavailable, 238 minutes degraded
MTTR=238 minutes, MTBF=8 days, MTBSI=7 days.

Incident User Metric Cost of downtime analysis: 347

Classification (60%), Outage analysis (50%), Risk Management (41%)
S    CR   OP   U    P    IT   B    I    V    CM
2    2    1    4    3    3    1    2    2    1

Classification
Scope (S): Less than 25% of customers affected*
Credibility (CR): Multiple business units affected negatively
Operations (OP): Some interference with normal completion of work
Urgency (U): Underway and could not be stopped
Prioritization (P): High - Technicians respond immediately, assess the situation, and may interrupt other staff working low or medium priority jobs for assistance.

Outage analysis
IT service outage analysis (IT): Major - App, server, link (network or voice) unavailable for greater than 1 hour or degraded for greater than 4 hours
Business service outage analysis (B): Minor - Financial loss with a visible impact on profitability but no real effect, greater than $10k or some embarrassment or rule or process breaches or medical treatment

Risk management
Risk impact (I): Best practice CIA analysis (CRAMM) – Confidentiality (unauthorized disclosure), Integrity (unauthorized modification or misuse), Availability (destruction or loss).
  Confidentiality=Confidential, Integrity=High, Availability=Moderate
Risk vulnerability (V): What are the chances of the outage occurring considering loss, error or failure?
  Low loss probability
  Moderate error probability
  Moderate failure probability
Countermeasures (CM): What is being done to prevent this from happening again?
  Proactive monitoring of environment. Refer to the knowledge base at service desk. Antivirus service locks up SMTP Service when BadMail queue reaches a specific size. Add check to daily check list to monitor BadMail folder size.

Closure
Escalations: Please note that if no comments or questions are received within 5 working days, this report is classed as Accepted.
  No further root cause analysis required. Escalated to Infrastructure Manager.
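
The section percentages in the example (60%, 50%, 41%) are consistent with summing the factor scores in each section and dividing by the maximum possible, if every factor is scored on a scale of 1 to 4. The slide does not state the scale or the rounding rule, so both are assumptions here, as are the function and variable names. The sketch below, in Python, reproduces those percentages under that assumption and also derives the phase durations of the incident lifecycle from the example timeline, ordered chronologically.

```python
from datetime import datetime

# Assumption: every factor is scored 1-4 and percentages are rounded down.
# The slide states neither; this simply reproduces the example's figures.
MAX_SCORE = 4

SECTIONS = {
    "Classification": ["S", "CR", "OP", "U", "P"],
    "Outage analysis": ["IT", "B"],
    "Risk management": ["I", "V", "CM"],
}


def section_percentages(scores):
    """Return each section's score as a whole percentage of its maximum."""
    result = {}
    for name, factors in SECTIONS.items():
        total = sum(scores[f] for f in factors)
        result[name] = 100 * total // (MAX_SCORE * len(factors))
    return result


def phase_minutes(timeline):
    """Minutes between consecutive same-day timestamps written as 'HHhMM'."""
    parsed = [(label, datetime.strptime(t, "%Hh%M")) for label, t in timeline]
    return {
        f"{a} -> {b}": int((tb - ta).total_seconds() // 60)
        for (a, ta), (b, tb) in zip(parsed, parsed[1:])
    }


# Factor scores from the example dashboard.
example_scores = {"S": 2, "CR": 2, "OP": 1, "U": 4, "P": 3,
                  "IT": 3, "B": 1, "I": 2, "V": 2, "CM": 1}

# Timestamps from the example, sorted into chronological order.
example_timeline = [
    ("Incident", "09h30"), ("Detected", "11h25"), ("Diagnosis", "13h06"),
    ("Repair", "13h15"), ("Recover", "13h17"), ("Restore", "13h35"),
]

if __name__ == "__main__":
    print(section_percentages(example_scores))
    print(phase_minutes(example_timeline))
```

Keeping the scale and section membership in two constants makes it easy to swap in the real scoring rules from the major incident draft template if they differ from this guess.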
