The secret to Major Incident Management is minor incident management
Bob Fishman
RobertFishman25@gmail.com
508-259-1467
A well-run Network Operations Center (NOC) keeps your business running smoothly
Performance
Minimize service interruptions
Rapid recovery
Ongoing support and maintenance
Well supported business functions
Prevent, detect, respond
A good NOC should be able to deal with even catastrophic
situations, like natural disasters, smoothly, confidently and
quickly.
How do we make a NOC run smoothly? By managing the
little stuff very, very well.
What makes a NOC Rock?
MONITORING
Smart monitoring means the right alerts with the right information.
TICKETING
Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
PROCESS
Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
TRAINING
Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
COMMUNICATION
Communication is essential between team members, OEMs, and the business. The leader sets a tone and process, and everyone participates.
PEOPLE
A NOC needs capable, dedicated, trained people that feel like a team even when they aren’t all in the same location.
Monitoring
Smart monitoring means the right alerts with the right information.
Alerts should either be:
• Automatically ticketed and properly assigned
• Automatically ticketed and closed when cleared
• Discarded
Eyes on Glass should be avoided where possible
• Monotonous periods of inactivity lead to less-than-optimal human performance
• If Eyes on Glass is needed, a strict process must be followed for which events get ticketed
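The three alert outcomes listed above can be sketched as a small routing function. This is a minimal illustration, not a description of any particular monitoring tool: the `Alert` fields (`actionable`, `self_clearing`) and the rule that maps them to an outcome are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    TICKET_AND_ASSIGN = "ticket_and_assign"
    TICKET_AUTO_CLOSE = "ticket_auto_close"
    DISCARD = "discard"


@dataclass(frozen=True)
class Alert:
    source: str
    actionable: bool     # assumption: someone has to do something about it
    self_clearing: bool  # assumption: the monitor later sends a matching "clear"


def triage(alert: Alert) -> Disposition:
    """Map an alert to exactly one of the three outcomes."""
    if not alert.actionable:
        return Disposition.DISCARD
    if alert.self_clearing:
        # Ticket is opened automatically and closed when the clear arrives.
        return Disposition.TICKET_AUTO_CLOSE
    return Disposition.TICKET_AND_ASSIGN
```

Because every alert ends in one of three known states, nothing depends on a person watching a screen to decide what gets ticketed.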
Ticketing
Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
• No work should be done that isn’t ticketed. Why? Tickets should contain a trail left by engineers. The ticket is an important record of what was done, by whom and why.
• Un-ticketed work leads to memory and procedural gaps that cause issues. Furthermore, it means your team is losing track of how they spend their time. That rarely ends well.
• Ticket types should include:
o Incidents, Service Requests, Changes and Problems
• Tickets containing thorough comments can lead to great Knowledge Base articles
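As a rough illustration of the "trail left by engineers", a ticket can be modeled as a record that carries its type and an append-only list of work notes. The class and field names below are hypothetical and do not correspond to any specific ticketing product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class TicketType(Enum):
    INCIDENT = "incident"
    SERVICE_REQUEST = "service_request"
    CHANGE = "change"
    PROBLEM = "problem"


@dataclass
class WorkNote:
    engineer: str  # by whom
    action: str    # what was done
    reason: str    # why
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Ticket:
    ticket_id: str
    kind: TicketType
    summary: str
    notes: list = field(default_factory=list)

    def log(self, engineer: str, action: str, reason: str) -> None:
        """Append a work note; notes are never edited or removed."""
        self.notes.append(WorkNote(engineer, action, reason))

    def trail(self) -> list:
        """Human-readable trail, usable as raw material for a KB article."""
        return [f"{n.engineer}: {n.action} ({n.reason})" for n in self.notes]
```

A post-closure review can then read `trail()` in order and see what was done, by whom and why.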
Process
Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
• Documented processes, particularly for lower-level engineers
• Process leads to repeatable, scalable, measurable outcomes with fewer errors
o The outcomes will contain fewer errors, and those errors can be reported on
• Undocumented process becomes institutional knowledge, and that knowledge may be lost when employees leave
• All work notes must be in the ticket
• If it isn’t in the ticket, it didn’t happen
• When, how and to whom to escalate the incident
• Well-defined shift hand-over steps and documentation
• When, in what format and to whom communications must be sent
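One way to make "when, how and to whom to escalate" unambiguous is to write the policy down as data rather than prose. The sketch below assumes a simple time-based policy; the thresholds, roles and channels are placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # when: minutes since the incident was opened
    notify: str         # to whom
    channel: str        # how


# Hypothetical policy; every value here is a placeholder.
POLICY = [
    EscalationStep(0, "on-duty NOC engineer", "ticket queue"),
    EscalationStep(30, "shift lead", "phone"),
    EscalationStep(60, "engineering on-call", "page"),
]


def due_steps(minutes_open: int, policy=POLICY) -> list:
    """Return every escalation step that should have fired by now."""
    return [step for step in policy if minutes_open >= step.after_minutes]
```

For example, `due_steps(45)` returns the first two steps, so whoever is on duty gets the same answer regardless of shift.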
Training
Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
• Train for professional development
• A more knowledgeable workforce
• Ability to promote from within
• Train so employees understand the corporate values and responsibilities
• Helps the company communicate legal issues such as sexual harassment and safety to employees
Communication
Communication is essential between team members, OEMs, and the business. The leader sets a tone and process, and everyone participates.
• Well-documented communication processes
• Escalation to the next level up and notification of that escalation
• Stakeholder
• Internal
• External
• The who, what, when and how of each step
• Verbal communication
• All occurrences should be documented with the ticket
• Shift hand-overs
People
A NOC needs capable, dedicated, trained people that feel like a team even when they aren’t all in the same location.
• The core of any organization is its people
• Retain your best talent
• People must be working towards a common goal defined by the corporate entity
• They need defined duties
• Timely and accurate feedback is essential
• Trained employees feel empowered by the organization and able to move up in it
• This is a win-win scenario
What makes a NOC Rock?
MONITORING and TICKETING
Should need no changes if the system ticketing process correctly identifies Major Incidents (MIs) or P1s
PROCESS
• A validated MI needs its own process and documentation
• Who owns the ticket BECAUSE it is an MI
• Who notifies Engineering that there is an MI
• There should be a RACI document for a Major Incident
TRAINING
• A Major Incident is no place for training
• Lower-level engineers can join the bridge and/or the shared video of troubleshooting, BUT this is a higher-level issue
• Engineers should be experienced and trained prior to being part of a Major Incident
COMMUNICATION
• This process and associated tools should be well defined before expecting an MI to be handled properly
• Who is responsible for communicating when, to whom and how
• Who owns the escalation of the incident, if needed
PEOPLE
• People should be prepared for any MI
• They should understand the goals and SLAs
• Major Incidents can be stressful; understand how this may affect your staff
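The RACI document for a Major Incident mentioned above can be kept as structured data and checked automatically. The sketch below is illustrative only: the roles, activities and assignments are invented for the example, and the check follows the common RACI convention of exactly one Accountable and at least one Responsible per activity.

```python
# Hypothetical RACI matrix for a Major Incident; all roles and
# assignments are placeholders, not taken from any real organization.
RACI = {
    "own the MI ticket": {
        "incident manager": "R", "NOC manager": "A",
        "engineering": "C", "business": "I",
    },
    "notify engineering": {
        "NOC engineer": "R", "incident manager": "A", "engineering": "I",
    },
    "communicate status": {
        "incident manager": "R", "NOC manager": "A", "business": "I",
    },
    "escalate the incident": {
        "incident manager": "R", "NOC manager": "A", "engineering": "C",
    },
}


def validate(raci: dict) -> list:
    """Return activities lacking exactly one 'A' or at least one 'R'."""
    problems = []
    for activity, roles in raci.items():
        codes = list(roles.values())
        if codes.count("A") != 1 or "R" not in codes:
            problems.append(activity)
    return problems


def who(raci: dict, activity: str, code: str) -> list:
    """List the roles holding a given RACI code for an activity."""
    return sorted(role for role, c in raci[activity].items() if c == code)
```

Running `validate(RACI)` before an incident, rather than during one, surfaces any activity that nobody owns.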
Good incident handling is the key to handling Major Incidents properly
Preparing for Major Incidents by taking care of the “normal” incidents will make your NOC Rock
There actually is no secret to Major Incident Management
To the end user there are no minor incidents
20Pa g e
There actually is no secret to
Major Incident Management
To the end user there are
no minor incidents
Bob Fishman
RobertFishman25@gmail.com
508-259-1467

More Related Content

What's hot

How to Build an Invincible Incident Management Plan
How to Build an Invincible Incident Management PlanHow to Build an Invincible Incident Management Plan
How to Build an Invincible Incident Management Plan
DevOps.com
 
ITIL Incident Management Workflow - Process Guide
	 ITIL Incident Management Workflow - Process Guide	 ITIL Incident Management Workflow - Process Guide
ITIL Incident Management Workflow - Process Guide
Flevy.com Best Practices
 
Technical Escalations Best Practices
Technical Escalations Best PracticesTechnical Escalations Best Practices
Technical Escalations Best Practicesmagalong
 
Incident Management Best Practices
Incident Management Best PracticesIncident Management Best Practices
Incident Management Best Practices
TechExcel
 
Network Operations Center
Network Operations Center  Network Operations Center
Network Operations Center
Muhannad Kalbouneh
 
ITIL Incident Management Workflow PowerPoint Presentation Slides
ITIL Incident Management Workflow PowerPoint Presentation SlidesITIL Incident Management Workflow PowerPoint Presentation Slides
ITIL Incident Management Workflow PowerPoint Presentation Slides
SlideTeam
 
Most Recent updatedResume Vaibhav
Most Recent updatedResume VaibhavMost Recent updatedResume Vaibhav
Most Recent updatedResume VaibhavVaibhav Sawant
 
A Practical Approach to Incident Management for SaaS/PaaS
A Practical Approach to Incident Management for SaaS/PaaSA Practical Approach to Incident Management for SaaS/PaaS
A Practical Approach to Incident Management for SaaS/PaaS
Michael Weber
 
Service now vulnerability patching_move
Service now vulnerability patching_moveService now vulnerability patching_move
Service now vulnerability patching_move
Subrat Kumar Dash
 
2011 09 18 United "Platitudes, reality and promise"
2011 09 18 United "Platitudes, reality and promise"2011 09 18 United "Platitudes, reality and promise"
2011 09 18 United "Platitudes, reality and promise"Gene Kim
 
Incident Management PowerPoint Presentation Slides
Incident Management PowerPoint Presentation SlidesIncident Management PowerPoint Presentation Slides
Incident Management PowerPoint Presentation Slides
SlideTeam
 
Credit Union Cyber Security
Credit Union Cyber SecurityCredit Union Cyber Security
Credit Union Cyber Security
Stacy Willis
 
Bright talk running a cloud - final
Bright talk   running a cloud - finalBright talk   running a cloud - final
Bright talk running a cloud - finalAndrew White
 
18 Ways Incident Management Systems Create Order (And Why It Matters)
18 Ways Incident Management Systems Create Order (And Why It Matters)18 Ways Incident Management Systems Create Order (And Why It Matters)
18 Ways Incident Management Systems Create Order (And Why It Matters)
24/7 Software
 
Getting Started with Business Continuity
Getting Started with Business ContinuityGetting Started with Business Continuity
Getting Started with Business Continuity
Stephen Cobb
 
Best Practices in Disaster Recovery Planning and Testing
Best Practices in Disaster Recovery Planning and TestingBest Practices in Disaster Recovery Planning and Testing
Best Practices in Disaster Recovery Planning and Testing
Axcient
 
Software and Tear
Software and TearSoftware and Tear
Software and Tear
Josh Howell
 
Liberate Your IT Team
Liberate Your IT TeamLiberate Your IT Team
Liberate Your IT Teamvblackwell
 
Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...
Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...
Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...
VAST
 
Mastering disaster e book Telehouse
Mastering disaster e book TelehouseMastering disaster e book Telehouse
Mastering disaster e book Telehouse
Telehouse
 

What's hot (20)

How to Build an Invincible Incident Management Plan
How to Build an Invincible Incident Management PlanHow to Build an Invincible Incident Management Plan
How to Build an Invincible Incident Management Plan
 
ITIL Incident Management Workflow - Process Guide
	 ITIL Incident Management Workflow - Process Guide	 ITIL Incident Management Workflow - Process Guide
ITIL Incident Management Workflow - Process Guide
 
Technical Escalations Best Practices
Technical Escalations Best PracticesTechnical Escalations Best Practices
Technical Escalations Best Practices
 
Incident Management Best Practices
Incident Management Best PracticesIncident Management Best Practices
Incident Management Best Practices
 
Network Operations Center
Network Operations Center  Network Operations Center
Network Operations Center
 
ITIL Incident Management Workflow PowerPoint Presentation Slides
ITIL Incident Management Workflow PowerPoint Presentation SlidesITIL Incident Management Workflow PowerPoint Presentation Slides
ITIL Incident Management Workflow PowerPoint Presentation Slides
 
Most Recent updatedResume Vaibhav
Most Recent updatedResume VaibhavMost Recent updatedResume Vaibhav
Most Recent updatedResume Vaibhav
 
A Practical Approach to Incident Management for SaaS/PaaS
A Practical Approach to Incident Management for SaaS/PaaSA Practical Approach to Incident Management for SaaS/PaaS
A Practical Approach to Incident Management for SaaS/PaaS
 
Service now vulnerability patching_move
Service now vulnerability patching_moveService now vulnerability patching_move
Service now vulnerability patching_move
 
2011 09 18 United "Platitudes, reality and promise"
2011 09 18 United "Platitudes, reality and promise"2011 09 18 United "Platitudes, reality and promise"
2011 09 18 United "Platitudes, reality and promise"
 
Incident Management PowerPoint Presentation Slides
Incident Management PowerPoint Presentation SlidesIncident Management PowerPoint Presentation Slides
Incident Management PowerPoint Presentation Slides
 
Credit Union Cyber Security
Credit Union Cyber SecurityCredit Union Cyber Security
Credit Union Cyber Security
 
Bright talk running a cloud - final
Bright talk   running a cloud - finalBright talk   running a cloud - final
Bright talk running a cloud - final
 
18 Ways Incident Management Systems Create Order (And Why It Matters)
18 Ways Incident Management Systems Create Order (And Why It Matters)18 Ways Incident Management Systems Create Order (And Why It Matters)
18 Ways Incident Management Systems Create Order (And Why It Matters)
 
Getting Started with Business Continuity
Getting Started with Business ContinuityGetting Started with Business Continuity
Getting Started with Business Continuity
 
Best Practices in Disaster Recovery Planning and Testing
Best Practices in Disaster Recovery Planning and TestingBest Practices in Disaster Recovery Planning and Testing
Best Practices in Disaster Recovery Planning and Testing
 
Software and Tear
Software and TearSoftware and Tear
Software and Tear
 
Liberate Your IT Team
Liberate Your IT TeamLiberate Your IT Team
Liberate Your IT Team
 
Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...
Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...
Kept up by Potential IT Disasters? Your Guide to Disaster Recovery as a Servi...
 
Mastering disaster e book Telehouse
Mastering disaster e book TelehouseMastering disaster e book Telehouse
Mastering disaster e book Telehouse
 

Similar to Major Incident - make your NOC Rock

How to Digitally Transform Your Internal Operations
How to Digitally Transform Your Internal OperationsHow to Digitally Transform Your Internal Operations
How to Digitally Transform Your Internal Operations
Integrify
 
Seminar on Process Documentation.pptx
Seminar on Process Documentation.pptxSeminar on Process Documentation.pptx
Seminar on Process Documentation.pptx
NioAbaoCasyao
 
KV_ResumeAttachment_Updated 24112015
KV_ResumeAttachment_Updated 24112015KV_ResumeAttachment_Updated 24112015
KV_ResumeAttachment_Updated 24112015Chau Kek Voon
 
Management Science - Krimzen Tech
Management Science - Krimzen TechManagement Science - Krimzen Tech
Management Science - Krimzen Tech
DarrenTofu
 
Process Management by Jan Mohammed.pptx
Process Management by Jan Mohammed.pptxProcess Management by Jan Mohammed.pptx
Process Management by Jan Mohammed.pptx
JanMohammed3
 
NARCA Presentation - IT Best Practice
NARCA Presentation - IT Best PracticeNARCA Presentation - IT Best Practice
NARCA Presentation - IT Best PracticeBrenda Majewski
 
How to Drive Efficiency and Reduce Risk with Investigative Case Management So...
How to Drive Efficiency and Reduce Risk with Investigative Case Management So...How to Drive Efficiency and Reduce Risk with Investigative Case Management So...
How to Drive Efficiency and Reduce Risk with Investigative Case Management So...
Case IQ
 
Management Science - Krimzen Tech
Management Science - Krimzen TechManagement Science - Krimzen Tech
Management Science - Krimzen Tech
DarrenTofu
 
Continous auditing and risk monitoring 9 23-09
Continous auditing and risk monitoring  9 23-09Continous auditing and risk monitoring  9 23-09
Continous auditing and risk monitoring 9 23-09
Gaiani (CarnCorpAudit)
 
Business process mapping
Business process mappingBusiness process mapping
Business process mapping
DAVIS THOMAS
 
Xero
XeroXero
Xero
Robson52
 
IT In The Park 2016
IT In The Park 2016IT In The Park 2016
IT In The Park 2016
Ray Bugg
 
December GIP Monthly Meeting
December GIP Monthly MeetingDecember GIP Monthly Meeting
December GIP Monthly Meeting
Cole Wirpel
 
ADDO19 - Automate or not from the beginning that is the question
ADDO19 - Automate or not from the beginning that is the questionADDO19 - Automate or not from the beginning that is the question
ADDO19 - Automate or not from the beginning that is the question
Enrique Carbonell
 
How mature are your processes? The stages of eDiscovery evolution
How mature are your processes? The stages of eDiscovery evolutionHow mature are your processes? The stages of eDiscovery evolution
How mature are your processes? The stages of eDiscovery evolution
Matthew Altass
 
Service catalogue presentation
Service catalogue presentationService catalogue presentation
Service catalogue presentation
subtitle
 
S&OP maturity comes prior to advance planning software
S&OP maturity comes prior to advance planning softwareS&OP maturity comes prior to advance planning software
S&OP maturity comes prior to advance planning software
Tristan Wiggill
 
Planning for an Oil & Gas Operation Well Life Cycle Framework
Planning for an Oil & Gas Operation Well Life Cycle FrameworkPlanning for an Oil & Gas Operation Well Life Cycle Framework
Planning for an Oil & Gas Operation Well Life Cycle Framework
Jeff Dyk
 

Similar to Major Incident - make your NOC Rock (20)

How to Digitally Transform Your Internal Operations
How to Digitally Transform Your Internal OperationsHow to Digitally Transform Your Internal Operations
How to Digitally Transform Your Internal Operations
 
Seminar on Process Documentation.pptx
Seminar on Process Documentation.pptxSeminar on Process Documentation.pptx
Seminar on Process Documentation.pptx
 
KV_ResumeAttachment_Updated 24112015
KV_ResumeAttachment_Updated 24112015KV_ResumeAttachment_Updated 24112015
KV_ResumeAttachment_Updated 24112015
 
The SID
The SIDThe SID
The SID
 
Management Science - Krimzen Tech
Management Science - Krimzen TechManagement Science - Krimzen Tech
Management Science - Krimzen Tech
 
Process Management by Jan Mohammed.pptx
Process Management by Jan Mohammed.pptxProcess Management by Jan Mohammed.pptx
Process Management by Jan Mohammed.pptx
 
NARCA Presentation - IT Best Practice
NARCA Presentation - IT Best PracticeNARCA Presentation - IT Best Practice
NARCA Presentation - IT Best Practice
 
How to Drive Efficiency and Reduce Risk with Investigative Case Management So...
How to Drive Efficiency and Reduce Risk with Investigative Case Management So...How to Drive Efficiency and Reduce Risk with Investigative Case Management So...
How to Drive Efficiency and Reduce Risk with Investigative Case Management So...
 
Management Science - Krimzen Tech
Management Science - Krimzen TechManagement Science - Krimzen Tech
Management Science - Krimzen Tech
 
Continous auditing and risk monitoring 9 23-09
Continous auditing and risk monitoring  9 23-09Continous auditing and risk monitoring  9 23-09
Continous auditing and risk monitoring 9 23-09
 
Business process mapping
Business process mappingBusiness process mapping
Business process mapping
 
Xero
XeroXero
Xero
 
IT In The Park 2016
IT In The Park 2016IT In The Park 2016
IT In The Park 2016
 
December GIP Monthly Meeting
December GIP Monthly MeetingDecember GIP Monthly Meeting
December GIP Monthly Meeting
 
ADDO19 - Automate or not from the beginning that is the question
ADDO19 - Automate or not from the beginning that is the questionADDO19 - Automate or not from the beginning that is the question
ADDO19 - Automate or not from the beginning that is the question
 
How mature are your processes? The stages of eDiscovery evolution
How mature are your processes? The stages of eDiscovery evolutionHow mature are your processes? The stages of eDiscovery evolution
How mature are your processes? The stages of eDiscovery evolution
 
E gov championship workshop bangalore 21082013
E gov championship workshop bangalore 21082013E gov championship workshop bangalore 21082013
E gov championship workshop bangalore 21082013
 
Service catalogue presentation
Service catalogue presentationService catalogue presentation
Service catalogue presentation
 
S&OP maturity comes prior to advance planning software
S&OP maturity comes prior to advance planning softwareS&OP maturity comes prior to advance planning software
S&OP maturity comes prior to advance planning software
 
Planning for an Oil & Gas Operation Well Life Cycle Framework
Planning for an Oil & Gas Operation Well Life Cycle FrameworkPlanning for an Oil & Gas Operation Well Life Cycle Framework
Planning for an Oil & Gas Operation Well Life Cycle Framework
 

Recently uploaded

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Major Incident - make your NOC Rock

  • 1. Is minor incident management the secret to Major Incident Management? Bob Fishman, RobertFishman25@gmail.com, 508-259-1467
  • 2. A well-run Network Operations Center (NOC) keeps your business running smoothly:
      o Performance
      o Minimize service interruptions
      o Rapid recovery
      o Ongoing support and maintenance
      o Well-supported business functions
      o Prevent, detect, respond
    A good NOC should be able to deal with even catastrophic situations, like natural disasters, smoothly, confidently and quickly. How do we make a NOC run smoothly? By managing the little stuff very, very well.
  • 3. What makes a NOC Rock?
      o Monitoring: Smart monitoring means the right alerts with the right information.
      o Ticketing: Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
  • 4. What makes a NOC Rock? (continued)
      o Process: Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
      o Training: Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
  • 5. What makes a NOC Rock? (continued)
      o Communication: Communication is essential between team members, OEMs, and the business. The leader sets a tone and process, and everyone participates.
      o People: A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
  • 6. Monitoring: Smart monitoring means the right alerts with the right information.
    Every alert should be one of:
      o Automatically ticketed and properly assigned
      o Automatically ticketed and closed when cleared
      o Discarded
    "Eyes on Glass" should be avoided where possible:
      o Monotonous periods of inactivity lead to less-than-optimal human performance
      o If Eyes on Glass is needed, a strict process must govern which events get ticketed
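The three alert outcomes on this slide can be sketched as a small routing function. This is only an illustration, not something from the deck: the `Alert` fields, the `Disposition` names, and the decision order in `route_alert` are all assumptions about how a monitoring system might tell the three cases apart.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    TICKET_AND_ASSIGN = "ticket and assign"
    TICKET_AUTO_CLOSE = "ticket, close when cleared"
    DISCARD = "discard"


@dataclass
class Alert:
    source: str
    actionable: bool     # assumed field: does someone need to act on this alert?
    self_clearing: bool  # assumed field: will the monitor send a "clear" event?


def route_alert(alert: Alert) -> Disposition:
    """Map every alert to exactly one of the three outcomes listed on the slide."""
    if not alert.actionable:
        # Non-actionable events never become tickets.
        return Disposition.DISCARD
    if alert.self_clearing:
        # Ticket for the record, and close automatically once the alert clears.
        return Disposition.TICKET_AUTO_CLOSE
    # Everything else is ticketed and assigned to a person or queue.
    return Disposition.TICKET_AND_ASSIGN
```

The point of the sketch is that no alert is left for a human to triage by watching a screen: each one gets a disposition from a rule.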
  • 7. Ticketing: Ticket each manual request and only actionable monitored events. Post-closure review of tickets is how we learn and improve.
      o No work should be done that isn’t ticketed. Why? Tickets should contain the trail left by engineers. The ticket is an important record of what was done, by whom and why.
      o Un-ticketed work leads to memory and procedural gaps that cause issues. Furthermore, it means your team is losing track of how they spend their time. That rarely ends well.
      o Ticket types should include Incidents, Service Requests, Changes and Problems
      o Tickets containing thorough comments can lead to great Knowledge Base articles
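As a rough sketch of what "a trail left by engineers" could look like as data, the following models a ticket with the four ticket types from the slide and an append-only list of work notes. The class and field names (`Ticket`, `WorkNote`, `log`) and the sample entry are hypothetical and do not come from any particular ticketing product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class TicketType(Enum):
    INCIDENT = "Incident"
    SERVICE_REQUEST = "Service Request"
    CHANGE = "Change"
    PROBLEM = "Problem"


@dataclass
class WorkNote:
    engineer: str  # who did it
    action: str    # what was done
    reason: str    # why it was done
    at: datetime   # when it was recorded (UTC)


@dataclass
class Ticket:
    ticket_type: TicketType
    summary: str
    notes: list[WorkNote] = field(default_factory=list)

    def log(self, engineer: str, action: str, reason: str) -> None:
        """Append a work note recording what was done, by whom, and why."""
        self.notes.append(
            WorkNote(engineer, action, reason, datetime.now(timezone.utc))
        )


# Hypothetical example entry.
ticket = Ticket(TicketType.INCIDENT, "Link down on core switch")
ticket.log("alice", "Reseated uplink optic", "Interface showed CRC errors")
```

Because every note carries who, what, why and when, a closed ticket can later be reviewed or turned into a Knowledge Base article without relying on anyone's memory.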
  • 8. Process: Every system needs a well-documented process. Good processes mean good responses. Good documentation means a consistent response no matter who’s on duty.
      o Documented processes, particularly for lower-level engineers
      o Process leads to repeatable, scalable, measurable outcomes: they contain fewer errors, and those errors can be reported on
      o Undocumented process becomes institutional knowledge, and that knowledge may be lost when employees leave
      o All work notes must be in the ticket
      o If it isn’t in the ticket, it didn’t happen
      o When, how and to whom to escalate the incident
      o Well-defined shift hand-over steps and documentation
      o When, in what format and to whom communications must be sent
  • 9. Training: Training is as much about expectations and approach as it is about specific knowledge and processes. Good training makes for good teams.
      o Train for professional development
      o A more knowledgeable workforce
      o Ability to promote from within
      o Train so employees understand the corporate values and responsibilities
      o Helps the company communicate legal issues such as sexual harassment and safety to employees
  • 10. Communication: Communication is essential between team members, OEMs, and the business. The leader sets a tone and process, and everyone participates.
      o Well-documented communication processes
      o Escalation to the next level up and notification of that escalation
      o Stakeholders: internal and external
      o The who, what, when and how of each step
      o Verbal communication: all occurrences should be documented in the ticket
      o Shift hand-overs
  • 11. People: A NOC needs capable, dedicated, trained people who feel like a team even when they aren’t all in the same location.
      o The core of any organization is its people
      o Retain your best talent
      o People must be working towards a common goal defined by the corporate entity
      o They need defined duties
      o Timely and accurate feedback is essential
      o Trained employees feel empowered by the organization and empowered to move up in it
      o This is a win-win scenario
  • 12. What makes a NOC Rock? Monitoring, Ticketing, Process, Training, Communication, People.
  • 13. Monitoring and Ticketing: Should need no changes if the system ticketing process correctly identifies Major Incidents or P1s.
  • 14. Process:
      o A validated Major Incident (MI) needs its own process and documentation
      o Who owns the ticket BECAUSE it is an MI
      o Who notifies Engineering that there is an MI
      o There should be a RACI document for a Major Incident
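A RACI document assigns each activity the roles that are Responsible, Accountable, Consulted and Informed. As a minimal sketch of how such a document might be kept in a checkable form, the matrix below covers the two ownership questions from the slide. The role names, the assignments, and the `validate_raci` helper are hypothetical; the check applies a common RACI convention (exactly one Accountable and at least one Responsible per activity), which the deck itself does not spell out.

```python
# Hypothetical RACI matrix for a Major Incident (MI).
# R = Responsible, A = Accountable, C = Consulted, I = Informed.
RACI = {
    "Own the MI ticket": {
        "Incident Manager": "A",
        "NOC Shift Lead": "R",
        "Engineering": "C",
        "Business Stakeholders": "I",
    },
    "Notify Engineering of the MI": {
        "Incident Manager": "A",
        "NOC Shift Lead": "R",
        "Engineering": "I",
    },
}


def validate_raci(raci: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means the matrix passes."""
    problems = []
    for activity, assignments in raci.items():
        codes = list(assignments.values())
        accountable = codes.count("A")
        if accountable != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable, found {accountable}"
            )
        if "R" not in codes:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems
```

Running `validate_raci(RACI)` before an incident happens is one way to catch an activity that nobody owns while there is still time to fix the document.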
  • 15. Training:
      o A Major Incident is no place for training
      o Lower-level engineers can join the bridge and/or the shared video of troubleshooting, but this is a higher-level issue
      o Engineers should be experienced and trained prior to being part of a Major Incident
  • 16. Communication:
      o This process and associated tools should be well defined before expecting an MI to be handled properly
      o Who is responsible for communicating when, to whom and how
      o Who owns the escalation of the incident, if needed
  • 17. People:
      o People should be prepared for any MI
      o They should understand the goals and SLAs
      o Major Incidents can be stressful; understand how this may affect your staff
  • 18. Good incident handling is the key to handling Major Incidents properly. Preparing for Major Incidents by taking care of the “normal” incidents will make your NOC Rock.
  • 19. There actually is no secret to Major Incident Management. To the end user there are no minor incidents. Bob Fishman, RobertFishman25@gmail.com, 508-259-1467