2. 2
Service Operation
Key Concepts
• Event
• Service Request
• Self Help
Functions
• Service Desk
• Technical Management
• IT Operations Management
• Applications Management
Processes
• Event Management
• Incident Management
• Request Fulfillment
• Problem Management
• Access Management
• Operation Management
3. 3
Processes
• Event Management is the process that monitors all events that occur
through the IT infrastructure to allow for normal operation and also to
detect and escalate exception conditions.
• Incident Management concentrates on restoring the service to users as
quickly as possible, in order to minimize business impact.
• Request Fulfillment involves the management of customer or user
requests that are not generated as an incident from an unexpected
service delay or disruption.
• Access Management is the process of granting authorized users the
right to use a service, while restricting access to non-authorized users.
4. 4
Operation Management
• In addition, there are several other processes that will be executed or
supported during Service Operation, but which are driven during other
phases of the Service Management Lifecycle. The operational aspects of
these processes will be discussed in the final part of this chapter and
include:
− Change Management, a major process which should be closely linked to
Configuration Management and Release Management. These topics are
primarily covered in the Service Transition publication.
− Capacity and Availability Management, the operational aspects of which are
covered in this publication, but which are covered in more detail in the Service
Design publication.
− Financial Management, which is covered in the Service Strategy publication.
− Knowledge Management, which is covered in the Service Transition
publication.
− IT Service Continuity, which is covered in the Service Design publication.
− Service Reporting and Measurement, which are covered in the Continual
Service Improvement publication.
6. 6
Event Management
• An event can be defined as any detectable or discernible occurrence that
has significance for the management of the IT Infrastructure or the
delivery of IT service and evaluation of the impact a deviation might
cause to the services.
• Events are typically notifications created by an IT service, Configuration
Item (CI) or monitoring tool.
• Effective Service Operation is dependent on knowing the status of the
infrastructure and detecting any deviation from normal or expected
operation. This is provided by good monitoring and control systems,
which are based on two types of tools:
− active monitoring tools that poll key CIs to determine their status and
availability. Any exceptions will generate an alert that needs to be
communicated to the appropriate tool or team for action
− passive monitoring tools that detect and correlate operational alerts or
communications generated by CIs.
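As a rough illustration of active monitoring, the sketch below polls a list of CIs once and raises an alert for each one that does not respond. The CI names, the `is_up` check and the alert text are hypothetical; a real tool would use ping, SNMP or an agent instead of the lookup table used here.

```python
from typing import Callable, Dict, List


def poll_cis(cis: List[str], is_up: Callable[[str], bool]) -> Dict[str, str]:
    """Poll each CI once and return an alert for every CI that fails the check."""
    alerts: Dict[str, str] = {}
    for ci in cis:
        if not is_up(ci):
            # An exception condition: hand this to the appropriate tool or team.
            alerts[ci] = "exception: no response to poll"
    return alerts


# Hypothetical status table standing in for real ping/SNMP checks.
status = {"switch-01": True, "fileserver-02": False}
alerts = poll_cis(list(status), lambda ci: status[ci])
print(alerts)
```

A passive tool would invert this flow: the CIs send their own notifications and the tool only correlates them.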
7. 7
Event Management
• The ability to detect events, make sense of them and determine the
appropriate control action is provided by Event Management. Event
Management is therefore the basis for Operational Monitoring and
Control.
• In addition, if these events are programmed to communicate operational
information as well as warnings and exceptions, they can be used as a
basis for automating many routine Operations Management activities, for
example executing scripts on remote devices, or submitting jobs for
processing, or even dynamically balancing the demand for a service
across multiple devices to enhance performance.
• Event Management therefore provides the entry point for the execution of
many Service Operation processes and activities.
• In addition, it provides a way of comparing actual performance and
behavior against design standards and SLAs. As such, Event
Management also provides a basis for Service Assurance and Reporting;
and Service Improvement. This is covered in detail in the Continual
Service Improvement publication.
8. 8
Event Management - Objectives
• Detect Events, make sense of them, and determine the
appropriate control action
• Event Management is the basis for Operational Monitoring
and Control
9. 9
Scope
• Event Management can be applied to any aspect of Service
Management that needs to be controlled and which can be
automated.
• These include:
− Configuration Items:
• Some CIs will be included because they need to stay in a constant
state (e.g. a switch on a network needs to stay on and Event
Management tools confirm this by monitoring responses to ‘pings’).
• Some CIs will be included because their status needs to change
frequently and Event Management can be used to automate this
and update the CMS (e.g. the updating of a file server).
− Environmental conditions (e.g. fire and smoke detection)
− Software license monitoring for usage to ensure optimum/legal license
utilization and allocation
− Security (e.g. intrusion detection)
− Normal activity (e.g. tracking the use of an application or the
performance of a server).
10. 10
Event Management – Basic Concepts
• Event
An alert or notification created by any IT Service,
Configuration Item or monitoring tool. For example a batch
job has completed. Events typically require IT Operations
personnel to take actions, and often lead to Incidents being
logged.
• Event Management
The Process responsible for managing Events throughout
their Lifecycle.
11. 11
Value to business
• Event Management’s value to the business is generally indirect;
however, it is possible to determine the basis for its value as follows:
− Event Management provides mechanisms for early detection of incidents. In
many cases it is possible for the incident to be detected and assigned to the
appropriate group for action before any actual service outage occurs.
− Event Management makes it possible for some types of automated activity to
be monitored by exception – thus removing the need for expensive and
resource intensive real-time monitoring, while reducing downtime.
• When integrated into other Service Management processes (such as, for
example, Availability or Capacity Management), Event Management can
signal status changes or exceptions that allow the appropriate person or
team to perform early response, thus improving the performance of the
process. This, in turn, will allow the business to benefit from more
effective and more efficient Service Management overall.
• Event Management provides a basis for automated operations, thus
increasing efficiencies and allowing expensive human resources to be
used for more innovative work, such as designing new or improved
functionality or defining new ways in which the business can exploit
technology for increased competitive advantage.
12. 12
Policies, principles and basic concepts
• There are many different types of events:
• Events that signify regular operation
− Notification that a scheduled workload has completed
− An e-mail has reached its intended recipient
• Events that signify an exception
− A user attempts to log on to an application with the incorrect password
− A device’s CPU is above the acceptable utilization rate
• Events that signify unusual, but not exceptional, operation.
− A server’s memory utilization reaches within 5% of its highest
acceptable performance level
13. 13
Example of Event Categories
• Informational: This refers to an event that does not require any action and does not
represent an exception. They are typically stored in the system or service log files and kept
for a predetermined period. Informational events are typically used to check on the status of
a device or service, or to confirm the successful completion of an activity. Examples of
informational events include:
− A user logs onto an application
− A job in the batch queue completes successfully
− A device has come online
− A transaction is completed successfully.
• Warning: A warning is an event that is generated when a service or device is approaching a
threshold. Warnings are intended to notify the appropriate person, process or tool so that
the situation can be checked and the appropriate action taken to prevent an exception.
Warnings are not typically raised for a device failure. Examples of warnings are:
− Memory utilization on a server is currently at 65% and increasing. If it reaches 75%,
response times will be unacceptably long and the OLA for that department will be
breached.
− The collision rate on a network has increased by 15% over the past hour.
• Exception: An exception means that a service or device is currently operating abnormally
(however that has been defined). Typically, this means that an OLA and SLA have been
breached and the business is being impacted. Exceptions could represent a total failure,
impaired functionality or degraded performance. Examples of exceptions include:
− A server is down
− Response time of a standard transaction across the network has slowed to more than
15 seconds
− A segment of the network is not responding to routine requests.
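The three categories can be sketched as a simple threshold check. The example below reuses the memory figures from the warning example above (warning at 65%, breach at 75%) purely as illustrative defaults; the function and enum names are assumptions, and real thresholds would come from the relevant OLA or SLA.

```python
from enum import Enum


class Category(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


def classify_memory(utilization_pct: float,
                    warn_at: float = 65.0,
                    breach_at: float = 75.0) -> Category:
    """Map a memory utilization reading to an event category."""
    if utilization_pct >= breach_at:
        return Category.EXCEPTION      # operating abnormally, target breached
    if utilization_pct >= warn_at:
        return Category.WARNING        # approaching the threshold
    return Category.INFORMATIONAL      # normal operation, log only


print(classify_memory(70.0))
```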
16. 16
Event Management
[Figure: Information & Warnings. Diagram labels: Information, Log, Warning, Auto Response, Alert, Human Intervention, Incident/Problem/Change?, Incident, Problem, RFC]
17. 17
Event Management - Roles
• Event management roles are filled by people in the
following functions
−Service Desk
−Technical Management
−Application Management
−IT Operations Management
19. 19
Purpose/Goals/Objectives
• The primary goal of the Incident Management process is to
restore normal service operation as quickly as possible and
minimize the adverse impact on business operations, thus
ensuring that the best possible levels of service quality and
availability are maintained. ‘Normal service operation’ is
defined here as service operation within SLA limits.
20. 20
Incident Management - Scope
• Managing any disruption or potential disruption to live IT
services
• Incidents are identified
− Directly by users through the Service Desk
− Through an interface from Event Management to Incident
Management tools
• Reported and/or logged by technical staff
21. 21
Incident Management – Business value
• The value of Incident Management includes:
− Quicker incident resolution leads to higher availability of service
• The ability to detect and resolve incidents, which results in lower downtime to the business, which in turn means
higher availability of the service. This means that the business is able to exploit the functionality of the service
as designed.
− Align IT activity to business priorities
• The ability to align IT activity to real-time business priorities. This is because Incident Management includes the
capability to identify business priorities and dynamically allocate resources as necessary.
− Identifying potential improvements leads to improved quality
• The ability to identify potential improvements to services. This happens as a result of understanding what
constitutes an incident and also from being in contact with the activities of business operational staff.
− The Service Desk can, during its handling of incidents, identify
additional service or training requirements found in IT or the business.
22. 22
Policies, principles and basic concepts
• An Incident
− An unplanned interruption or reduction in the quality of an IT Service
− Any event which could affect an IT Service in the future is also an
Incident
• Timescales
• Incident Models
• Major Incidents
24. 24
Incident Identification
INCIDENT IDENTIFICATION
− Work cannot begin on dealing with an incident until it is known that an
incident has occurred.
− It is usually unacceptable, from a business perspective, to wait until a
user is impacted and contacts the Service Desk.
− Ideally, incidents should be resolved before they have an impact on
users!
INCIDENT LOGGING
− All incidents must be fully logged and date/time stamped, regardless
of whether they are raised through a Service Desk telephone call or
whether automatically detected via an event alert.
25. 25
Incident Management – Interfaces
• Problem Management
• Service Asset and Configuration Management (SACM)
• Change Management
• Capacity Management
• Availability Management
• Service Level Management
26. 26
Incident Management – Key Metrics
• Total number of incidents (as a control measure)
• Breakdown of incidents at each stage
(for example, logged, WIP, closed, etc.)
• Size of incident backlog
• Mean elapsed time to resolution
• % resolved by the Service Desk (first-line fix)
• % handled within agreed response time
• % resolved within agreed Service Level Agreement target
• No. and % of Major Incidents
• No. and % of incidents correctly assigned
• Average cost of incident handling
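A few of these metrics can be computed directly from incident records. The sketch below is a minimal illustration; the `Incident` fields and the four-hour default target are assumptions, not values taken from ITIL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Incident:
    opened: datetime
    resolved: Optional[datetime]            # None while the incident is still open
    resolved_by_service_desk: bool = False  # first-line fix
    sla_target: timedelta = timedelta(hours=4)  # assumed resolution target


def pct(part: int, whole: int) -> float:
    """Percentage helper that avoids division by zero."""
    return 100.0 * part / whole if whole else 0.0


def key_metrics(incidents: List[Incident]) -> Dict[str, object]:
    resolved = [i for i in incidents if i.resolved is not None]
    durations = [i.resolved - i.opened for i in resolved]
    mean = sum(durations, timedelta()) / len(durations) if durations else None
    return {
        "total": len(incidents),
        "backlog": len(incidents) - len(resolved),
        "mean_resolution": mean,
        "pct_first_line": pct(sum(i.resolved_by_service_desk for i in resolved), len(resolved)),
        "pct_within_sla": pct(sum(d <= i.sla_target for i, d in zip(resolved, durations)), len(resolved)),
    }
```

Percentages are taken over resolved incidents only, so open incidents affect the backlog figure but not the resolution rates.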
27. 27
Incident Management – Roles
• Incident Manager
− May be performed by Service Desk Supervisor
• Super Users
− Usually Service Desk Analysts
• Second-Line Support
• Third-Line Support (Technical Management, IT Operations,
Applications Management, Third-party suppliers)
28. 28
Incident Management – Challenges
• Ability to detect incidents as quickly as possible (dependency
on Event Management)
• Ensuring all incidents are logged
• Ensuring previous history is available (Incidents, Problems,
Known Errors, Changes)
• Integration with Configuration Management System, Service
Level Management, and Known Error Database (CMS, SLM,
KEDB)
30. 30
Request Fulfillment
• The term ‘Service Request’ is used as a generic description
for many varying types of demands that are placed upon the
IT Department by the users.
• Many of these are actually small changes – low risk,
frequently occurring, low cost, etc.
− a request to change a password
− a request to install an additional software application onto a particular
workstation
− a request to relocate some items of desktop equipment
− a question requesting information
• Their scale and frequent, low-risk nature means that they are
better handled by a separate process, rather than being
allowed to congest and obstruct the normal Incident and
Change Management processes.
31. 31
Purpose/Goals/Objectives
• Request Fulfillment is the process of dealing with Service
Requests from the users.
• The objectives of the Request Fulfillment process include:
− To provide a channel for users to request and receive standard
services for which a pre-defined approval and qualification process
exists
− To provide information to users and customers about the availability of
services and the procedure for obtaining them
− To source and deliver the components of requested standard services
(e.g. licences and software media)
− To assist with general information, complaints or comments.
32. 32
Scope
• The process needed to fulfill a request will vary depending upon exactly
what is being requested – but can usually be broken down into a set of
activities that have to be performed.
• Some organizations will be comfortable to let the Service Requests be
handled through their Incident Management processes (and tools) – with
Service Requests being handled as a particular type of ‘incident’ (using a
high-level categorization system to identify those ‘incidents’ that are in
fact Service Requests).
33. 33
Value to business
• The value of Request Fulfillment is to provide quick and effective access
to standard services which business staff can use to improve their
productivity or the quality of business services and products.
• Request Fulfillment effectively reduces the bureaucracy involved in
requesting and receiving access to existing or new services, thus also
reducing the cost of providing these services.
• Centralizing fulfillment also increases the level of control over these
services. This in turn can help reduce costs through centralized
negotiation with suppliers, and can also help to reduce the cost of
support.
34. 34
Policies, principles and basic concepts
• Service Request
− A request from a User for information or advice, or for a Standard
Change, for example:
• To reset a password, or to provide standard IT Services for a new
User.
• Request Model
• Self-Help
35. 35
Request Fulfillment – Roles
• Not usually dedicated staff
• Service Desk staff
• Incident Management staff
• Service Operations teams
38. 38
Purpose/Goals/Objectives
• Problem Management is the process responsible for
managing the lifecycle of all problems.
• The primary objectives of Problem Management are:
− To prevent problems and resulting incidents from happening
− To eliminate recurring incidents
− To minimize the impact of incidents that cannot be prevented
39. 39
Scope
• Problem Management includes the activities required to diagnose the
root cause of incidents and to determine the resolution to those
problems.
− It is also responsible for ensuring that the resolution is implemented through
the appropriate control procedures, especially Change Management and
Release Management.
• Problem Management will also maintain information about problems and
the appropriate workarounds and resolutions, so that the organization is
able to reduce the number and impact of incidents over time.
− In this respect, Problem Management has a strong interface with Knowledge
Management, and tools such as the Known Error Database will be used for
both.
• Although Incident and Problem Management are separate processes,
they are closely related and will typically use the same tools, and may
use similar categorization, impact and priority coding systems. This will
ensure effective communication when dealing with related incidents and
problems.
40. 40
Value to business
• Problem Management works together with Incident Management and
Change Management to ensure that IT service availability and quality are
increased.
• When incidents are resolved, information about the resolution is
recorded. Over time, this information is used to speed up the resolution
time and identify permanent solutions, reducing the number and
resolution time of incidents.
• This results in less downtime and less disruption to business critical
systems.
• Additional value is derived from the following:
− Higher availability of IT services
− Higher productivity of business and IT staff
− Reduced expenditure on workarounds or fixes that do not work
− Reduction in cost of effort in fire-fighting or resolving repeat incidents.
41. 41
Policies, principles and basic concepts
• Problem
− The unknown cause of one or more incidents
• Problem Models
• Workaround
• Known Error
• Known Error Database
42. 42
Two major processes in Problem Management
Problem Management consists of two major processes:
− Reactive Problem Management
• Resolution of underlying cause(s)
• Covered in Service Operation
− Pro-active Problem Management
• Prevention of future problems
• Generally undertaken as part of CSI
43. 43
Problem Management – Roles
• Problem Manager
• Supported by technical groups
− Technical Management
− IT Operations
− Applications Management
− Third-party suppliers
45. 45
Access Management
• Access Management is the process of granting authorized
users the right to use a service, while preventing access to
non-authorized users.
• It has also been referred to as Rights Management or
Identity Management in different organizations.
46. 46
Purpose/Goals/Objectives
• Granting authorized users the right to use a service
− Access Management provides the right for users to be able to use a
service or group of services.
• Preventing access by non-authorized users
− It is therefore the execution of policies and actions defined in Security
and Availability Management.
47. 47
Scope
• Access Management is effectively the execution of both Availability and
Information Security Management, in that it enables the organization to
manage the confidentiality, availability and integrity of the organization’s
data and intellectual property.
• Access Management ensures that users are given the right to use a
service, but it does not ensure that this access is available at all agreed
times – this is provided by Availability Management.
• Access Management is a process that is executed by all Technical and
Application Management functions and is usually not a separate function.
However, there is likely to be a single control point of coordination,
usually in IT Operations Management or on the Service Desk.
• Access Management can be initiated by a Service Request through the
Service Desk.
48. 48
Value to business
• Access Management provides the following value:
− Controlled access to services ensures that the organization is able to
maintain more effectively the confidentiality of its information
− Employees have the right level of access to execute their jobs
effectively
− There is less likelihood of errors being made in data entry or in the
use of a critical service by an unskilled user (e.g. production control
systems)
− The ability to audit use of services and to trace the abuse of services
− The ability to revoke access rights more easily when needed – an
important security consideration
− May be needed for regulatory compliance (e.g. SOX, HIPAA, COBIT).
49. 49
Policies, principles and basic concepts
• Access refers to the level and extent of a service’s functionality or data
that a user is entitled to use.
• Identity refers to the information about a user that distinguishes them as
an individual and which verifies their status within the organization. By
definition, the Identity of a user is unique to that user.
• Rights (also called privileges) refer to the actual settings whereby a user
is provided access to a service or group of services. Typical rights, or
levels of access, include read, write, execute, change, delete.
• Services or service groups. Most users do not use only one service, and
users performing a similar set of activities will use a similar set of
services. Instead of providing access to each service for each user
separately, it is more efficient to be able to grant each user – or group of
users – access to the whole set of services that they are entitled to use at
the same time.
• Directory Services refers to a specific type of tool that is used to manage
access and rights.
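The relationship between users, service groups and rights can be sketched with two lookup tables. All names below (`SERVICE_GROUPS`, `USER_GROUPS`, the users and services) are hypothetical; in practice this data would live in a directory service.

```python
from typing import Dict, Set

# Service group -> service -> rights granted by membership of that group.
SERVICE_GROUPS: Dict[str, Dict[str, Set[str]]] = {
    "finance": {"ledger": {"read", "write"}, "reporting": {"read"}},
}

# User identity -> service groups the user belongs to.
USER_GROUPS: Dict[str, Set[str]] = {"alice": {"finance"}}


def rights_for(user: str, service: str) -> Set[str]:
    """Collect the rights a user holds on a service across all of their groups."""
    rights: Set[str] = set()
    for group in USER_GROUPS.get(user, set()):
        rights |= SERVICE_GROUPS.get(group, {}).get(service, set())
    return rights


def can(user: str, service: str, right: str) -> bool:
    return right in rights_for(user, service)


print(can("alice", "ledger", "write"))
```

Granting access by group rather than per service means that adding a user to one group gives them the whole set of services for their role in a single step.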
50. 50
Activities
• Requesting access
• Verification
• Providing rights
• Monitoring identity status
• Logging and tracking access
• Removing or restricting rights
51. 51
Access Management – Roles
• Not usually dedicated staff
• Access management is an execution of Availability
Management and Information Security Management
• Service Desk staff
• Technical Management staff
• Application Management staff
• IT Operations staff
53. 53
Operational Activities – Other Processes
• There are operational activities in other processes, as
follows:
− Change Management
− Configuration Management
− Release Management
− Capacity Management
− Availability Management
− Knowledge Management
− Financial Management
− IT Service Continuity Management
56. 56
Service Desk
• Primary point of contact
• Deals with all user issues (incidents, requests, standard
changes)
• Coordinates actions across the IT organization to meet user
requirements
• Different options (Local, Centralized, Virtual, Follow-the-Sun,
specialized groups)
57. 57
Types of Service Desk
• Local Service Desk
− Designed to support local business needs
− Support is usually in the same location as the business it is supporting
• Central Service Desk
− Designed to support multiple locations
− The desk is in a central location while the business is distributed
− Could provide support to local desks
• Virtual Service Desk
− Location of SD analysts is invisible to the customers
− This may include some element of ‘home working’
− Common processes and procedures should exist
61. 61
Service Desk objectives
• Logging and categorizing Incidents, Service Requests and some
categories of change
• First line investigation and diagnosis
• Escalation
• Communication with Users and IT Staff
• Closing calls
• Customer satisfaction
• Update the CMS if so agreed
62. 62
Service Desk staffing
• Correct number and qualifications at any given time
considering:
− Customer expectations and business requirements
− Number of users to support, their language and skills
− Coverage period, out-of-hours, time zones/locations, travel time
− Processes and procedures in place
• Minimum qualifications
− Interpersonal skills
− Business understanding
− IT understanding
− Skill sets
• Customer and Technical emphasis, Expert
63. 63
Service Desk metrics
• Periodic evaluations of health, maturity, efficiency,
effectiveness and any opportunity to improve
• Realistic and carefully chosen – total number of calls is not in
itself good or bad
• Some examples:
− First-line resolution rate
− Average time to resolve and/or escalate an incident
− Total costs for the period divided by total call duration minutes
− The number of calls broken down by time of day and day of week,
combined with the average call-time
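Two of the example metrics reduce to simple ratios. The sketch below shows one way to compute them; the function names and the sample figures are illustrative assumptions.

```python
def first_line_resolution_rate(resolved_first_line: int, total_resolved: int) -> float:
    """Percentage of resolved calls that were fixed at the Service Desk."""
    return 100.0 * resolved_first_line / total_resolved if total_resolved else 0.0


def cost_per_call_minute(total_cost: float, total_call_minutes: float) -> float:
    """Total costs for the period divided by total call duration in minutes."""
    if total_call_minutes <= 0:
        raise ValueError("total_call_minutes must be positive")
    return total_cost / total_call_minutes


# Made-up figures for one reporting period.
print(first_line_resolution_rate(60, 80))
print(cost_per_call_minute(12000.0, 4000.0))
```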
64. 64
Applications Management
• Manages Applications throughout their Lifecycle
• Performed by any department, group or team managing and
supporting operational Applications
• Role in the design, testing and improvement of Applications
that form part of IT Services
• Involved in development projects, but not usually the same
as the Application Development teams
• Provides resources throughout the lifecycle
• Guidance to IT Operations Management
65. 65
Applications Management – Objectives
• Well designed, resilient, cost effective applications
• Ensuring availability of functionality
• Maintain operational applications
• Support during application failures
66. 66
Applications Management – Roles
• Application Manager / Team leaders
• Applications Analyst / Architect
• Note: Application Management teams are usually aligned to
the applications they manage
67. 67
Technical Management
• The groups, departments or teams that provide technical
expertise and overall management of the IT Infrastructure
− Custodians of technical knowledge and expertise related to managing
the IT Infrastructure
− Provide the actual resources to support the IT Service Management
Lifecycle
− Perform many of the common activities already outlined
− Execute most ITSM processes
68. 68
Technical Management Organization
• Technical teams are usually aligned to the technology they
manage
• Can include operational activities
• Examples:
− Mainframe management
− Server Management
− Internet / Web Management
− Network Management
− Database Administration
69. 69
Technical Management – Objectives
• Design of resilient, cost-effective infrastructure configuration
• Maintenance of the infrastructure
• Support during technical failures
71. 71
IT Operations Management
• The department, group or team of people responsible for
performing the organization’s day-to-day operational
activities, such as:
− Console Management
− Job Scheduling
− Backup and Restore
− Print and Output management
− Performance of maintenance activities
− Facilities Management
− Operations Bridge
− Network Operations Center
− Monitoring the infrastructure, applications and services
72. 72
IT Operations Management - Objectives
• Maintaining the “status quo” to achieve infrastructure stability
• Identify opportunities to improve operational performance
and save costs
• Initial diagnosis and resolution of operational Incidents
73. 73
IT Operations Management – Roles
• IT Operations Manager
• Shift Leaders
• IT Operations Analysts
• IT Operators
Editor's Notes
Key Message: Incident Management includes any event which disrupts, or which could disrupt, a service. This includes events which are communicated directly by users, either through the Service Desk or through an interface from Event Management to Incident Management tools. Incidents can also be reported and/or logged by technical staff (if, for example, they notice something untoward with a hardware or network component they may report or log an incident and refer it to the Service Desk).
This does not mean, however, that all events are incidents. Many classes of events are not related to disruptions at all, but are indicators of normal operation or are simply informational.
Although both incidents and service requests are reported to the Service Desk, this does not mean that they are the same. Service requests do not represent a disruption to agreed service, but are a way of meeting the customer’s needs and may be addressing an agreed target in an SLA. Service requests are dealt with by the Request Fulfillment process.
Key Message: The business value of Incident Management includes:
− Quicker incident resolution – Lower downtime to the business, which in turn means higher availability of the service.
− Improved quality – Ability to identify potential improvements to Services. This happens as a result of understanding what constitutes an Incident and also from being in contact with the activities of Business operational staff.
− Reduced support costs – Capability to identify business priorities and dynamically allocate the right resources as necessary, therefore avoiding resource misuse.
Incident Management is highly visible to the business, and it is therefore easier to demonstrate its value than most areas in Service Operation. For this reason, Incident Management is often one of the first processes to be implemented in Service Management projects. The added benefit of doing this is that Incident Management can be used to highlight other areas that need attention – thereby providing a justification for expenditure on implementing other processes.
Purpose: Key Message:
INCIDENT
In ITIL terminology, an ‘Incident’ is defined as: “An unplanned interruption to an IT service or reduction in the quality of an IT service”. The failure of a configuration item that has not yet impacted service is also an incident, for example the failure of one disk from a mirror set.
TIMESCALES
Timescales must be agreed for all incident-handling stages (these will differ depending upon the priority level of the incident). They should be based upon the overall incident response and resolution targets within SLAs, and captured as targets within OLAs and Underpinning Contracts (UCs). All support groups should be made fully aware of these timescales.
INCIDENT MODELS
Many incidents are not new: they involve dealing with something that has happened before and may well happen again. For this reason, many organizations will find it helpful to pre-define ‘standard’ Incident Models and apply them to appropriate incidents when they occur. An Incident Model is a way of pre-defining the steps that should be taken to handle a process (in this case a process for dealing with a particular type of incident) in an agreed way. Support tools can then be used to manage the required process. This will ensure that ‘standard’ incidents are handled in a pre-defined path and within pre-defined timescales. Incidents that require specialized handling can be treated in this way (for example, security-related incidents can be routed to Information Security Management, and capacity- or performance-related incidents can be routed to Capacity Management).
The Incident Model should include:
• The steps that should be taken to handle the incident
• The chronological order these steps should be taken in, with any dependencies or co-processing defined
• Responsibilities: who should do what
• Timescales and thresholds for completion of the actions
• Escalation procedures: who should be contacted and when
• Any necessary evidence-preservation activities (particularly relevant for security- and capacity-related incidents).
The models should be input to the incident-handling support tools in use, and the tools should then automate the handling, management and escalation of the process.
MAJOR INCIDENT
A separate procedure, with shorter timescales and greater urgency, must be used for ‘major’ incidents. A definition of what constitutes a major incident must be agreed and ideally mapped on to the overall incident prioritization system, so that such incidents are dealt with through the major incident process.
Additional Information: Transition to Next Slide:
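The elements of an Incident Model listed above could be captured in a support tool roughly as follows. This is a minimal sketch, not a description of any particular tool: the class names `ModelStep` and `IncidentModel`, the method names, and every time limit and contact in the example are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import List, Optional


@dataclass
class ModelStep:
    order: int             # chronological position of the step
    action: str            # what should be done
    responsible: str       # who should do it
    time_limit: timedelta  # threshold for completing the action
    escalate_to: str       # who to contact when the threshold is exceeded
    depends_on: List[int] = field(default_factory=list)  # steps that must finish first


@dataclass
class IncidentModel:
    name: str
    steps: List[ModelStep]
    preserve_evidence: bool = False  # e.g. for security-related incidents

    def ordered_steps(self) -> List[ModelStep]:
        """Return the steps in the chronological order they should be taken."""
        return sorted(self.steps, key=lambda s: s.order)

    def escalation_contact(self, order: int, elapsed: timedelta) -> Optional[str]:
        """Return who to contact if the given step has exceeded its time limit."""
        for step in self.steps:
            if step.order == order and elapsed > step.time_limit:
                return step.escalate_to
        return None


# Illustrative values only; a real model would use the organization's agreed targets.
security_model = IncidentModel(
    name="Security-related incident",
    steps=[
        ModelStep(2, "Route to Information Security Management", "Service Desk",
                  timedelta(minutes=30), "Service Desk Manager", depends_on=[1]),
        ModelStep(1, "Log the incident and preserve evidence", "Service Desk",
                  timedelta(minutes=15), "Service Desk Supervisor"),
    ],
    preserve_evidence=True,
)
```

Keeping the time limit and escalation contact on each step means the timescales agreed in SLAs, OLAs and UCs can be checked step by step rather than only for the incident as a whole.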
Purpose: Key Message:
The process to be followed during incident management includes the following steps:
• Identification
• Logging
• Categorization
• Prioritization
• Initial diagnosis
• Escalation
• Investigation and diagnosis
• Resolution and recovery
• Closure
Additional Information: Transition to Next Slide:
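The step sequence above can be represented as an ordered enumeration. This is a sketch under a simplifying assumption: it treats the process as strictly linear, whereas in practice some steps (escalation, for instance) may be skipped or repeated. The names `IncidentStage` and `next_stage` are hypothetical.

```python
from enum import Enum
from typing import Optional


class IncidentStage(Enum):
    IDENTIFICATION = 1
    LOGGING = 2
    CATEGORIZATION = 3
    PRIORITIZATION = 4
    INITIAL_DIAGNOSIS = 5
    ESCALATION = 6
    INVESTIGATION_AND_DIAGNOSIS = 7
    RESOLUTION_AND_RECOVERY = 8
    CLOSURE = 9


def next_stage(stage: IncidentStage) -> Optional[IncidentStage]:
    """Return the stage that follows `stage`, or None after closure."""
    if stage is IncidentStage.CLOSURE:
        return None
    return IncidentStage(stage.value + 1)
```

For example, `next_stage(IncidentStage.LOGGING)` returns `IncidentStage.CATEGORIZATION`.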
Purpose: Key Message:
INCIDENT IDENTIFICATION
Work cannot begin on dealing with an incident until it is known that an incident has occurred. It is usually unacceptable, from a business perspective, to wait until a user is impacted and contacts the Service Desk. As far as possible, all key components should be monitored so that failures or potential failures are detected early and the incident management process can be started quickly. Ideally, incidents should be resolved before they have an impact on users.
INCIDENT LOGGING
All incidents must be fully logged and date/time stamped, regardless of whether they are raised through a Service Desk telephone call or detected automatically via an event alert. All relevant information relating to the nature of the incident must be logged so that a full historical record is maintained, and so that if the incident has to be referred to other support group(s), they will have all relevant information to hand to assist them. The information needed for each incident is likely to include:
• Unique reference number
• Incident categorization (often broken down into between two and four levels of sub-categories)
• Incident urgency
• Incident impact
• Incident prioritization
• Date/time recorded
• Name/ID of the person and/or group recording the incident
• Method of notification (telephone, automatic, e-mail, in person, etc.)
• Name/department/phone/location of user
• Call-back method (telephone, mail, etc.)
• Description of symptoms
• Incident status (active, waiting, closed, etc.)
• Related CI
• Support group/person to which the incident is allocated
• Related problem/Known Error
• Activities undertaken to resolve the incident
• Resolution date and time
• Closure category
• Closure date and time.
Additional Information: Transition to Next Slide:
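The logging fields listed above map naturally onto a record structure. The following is a minimal sketch covering a subset of those fields; the names `IncidentRecord` and `log_incident`, the `INC000001` reference format, and the in-memory counter are all assumptions made for illustration, not features of any specific Service Desk tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count
from typing import List, Optional

_reference_numbers = count(1)  # hypothetical source of unique reference numbers


@dataclass
class IncidentRecord:
    reference: str                      # unique reference number
    recorded_at: datetime               # date/time recorded
    recorded_by: str                    # person or group recording the incident
    notification_method: str            # telephone, automatic, e-mail, in person
    user: str                           # user affected
    symptoms: str                       # description of symptoms
    categories: List[str] = field(default_factory=list)  # category levels
    urgency: Optional[str] = None
    impact: Optional[str] = None
    priority: Optional[str] = None
    status: str = "active"              # active, waiting, closed
    related_ci: Optional[str] = None
    assigned_to: Optional[str] = None   # support group or person
    activities: List[str] = field(default_factory=list)  # actions taken so far
    resolved_at: Optional[datetime] = None
    closure_category: Optional[str] = None
    closed_at: Optional[datetime] = None


def log_incident(recorded_by: str, notification_method: str,
                 user: str, symptoms: str) -> IncidentRecord:
    """Create a date/time-stamped record, whatever the source of the incident."""
    return IncidentRecord(
        reference=f"INC{next(_reference_numbers):06d}",
        recorded_at=datetime.now(timezone.utc),
        recorded_by=recorded_by,
        notification_method=notification_method,
        user=user,
        symptoms=symptoms,
    )
```

The same function serves both a Service Desk call (`notification_method="telephone"`) and an event alert (`notification_method="automatic"`), which reflects the requirement that every incident is logged and time-stamped regardless of how it was raised.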
Purpose: Key Message:
Note, however, that there is a significant difference here: an incident is usually an unplanned event, whereas a Service Request is usually something that can and should be planned. Therefore, in an organization where large numbers of Service Requests have to be handled, and where the actions to be taken to fulfill those requests are very varied or specialized, it may be appropriate to handle Service Requests as a completely separate work stream, and to record and manage them as a separate record type. This may be particularly appropriate if the organization has chosen to widen the scope of the Service Desk beyond IT-related issues and to use the desk as a focal point for other types of request for service.
Additional Information: Transition to Next Slide:
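One way to picture "a separate record type" is to give incidents and Service Requests their own structures and queues. The sketch below is illustrative only; the class names, fields and the `split_work_streams` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Incident:
    reference: str
    symptoms: str        # unplanned: described by what went wrong


@dataclass
class ServiceRequest:
    reference: str
    requested_item: str  # planned: described by what the user is asking for


Record = Union[Incident, ServiceRequest]


def split_work_streams(records: List[Record]) -> Tuple[List[Incident], List[ServiceRequest]]:
    """Separate a mixed list of records into an incident queue and a request queue."""
    incidents = [r for r in records if isinstance(r, Incident)]
    requests = [r for r in records if isinstance(r, ServiceRequest)]
    return incidents, requests
```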
Purpose: Key Message:
Problem Management consists of two major processes:
• Reactive Problem Management, which is generally executed as part of Service Operation and is therefore covered in this publication.
• Proactive Problem Management, which is initiated in Service Operation but generally driven as part of Continual Service Improvement (see the Continual Service Improvement publication for fuller details).
Additional Information: Transition to Next Slide: