Processes in Service Operation
       Event Management
Event Management
• GOAL
  – To detect Events, make sense of them, and determine
    the appropriate control action to be provided by
    Event Management
• OBJECTIVE
  – To provide the entry point for the execution of many
    Service Operation processes and activities
• SCOPE
  – Any aspect of Service Management that needs to be
    controlled and which can be automated
Concept : Event
An event is a change of state that is significant for the
management of a Configuration Item or service
 • This term is also used to mean an alert or notification.
 • Events typically require IT Operations personnel to take
   action, and often lead to Incidents being logged
 For example…
 • Action Needed
 • Backup In Progress
 • Status: In Maintenance
 • Status: Unavailable
 • Status: Processing
 • Status: Available
 • Unauthorized Access
 • Service Degraded
Concept Of Alert
An Alert is a warning that…
 • A threshold has been reached
 • Something has changed
 • A failure has occurred

 • Alerts are often created and managed by system
   management tools and are managed by the Event
   Management process
 • The purpose of an alert is to ensure that the person with
   the skills appropriate to deal with the event is notified
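The threshold case can be sketched in a few lines of Python. This is only an illustration: the `Event` and `Alert` shapes, the threshold values and the team names in `SKILL_ROUTING` are all assumptions, not part of any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """A change of state reported for a Configuration Item (hypothetical shape)."""
    ci_name: str
    metric: str
    value: float


@dataclass
class Alert:
    """A warning routed to whoever has the skills to deal with the event."""
    ci_name: str
    message: str
    notify: str


# Hypothetical thresholds and skill-based routing, for illustration only.
THRESHOLDS = {"disk_used_pct": 90.0, "cpu_used_pct": 95.0}
SKILL_ROUTING = {"disk_used_pct": "storage-team", "cpu_used_pct": "server-team"}


def raise_alert(event: Event) -> Optional[Alert]:
    """Return an Alert when the event reaches its threshold, else None."""
    limit = THRESHOLDS.get(event.metric)
    if limit is None or event.value < limit:
        return None
    return Alert(
        ci_name=event.ci_name,
        message=f"{event.metric} reached {event.value} (threshold {limit})",
        # Fall back to a general queue when no specific skill group is known.
        notify=SKILL_ROUTING.get(event.metric, "service-desk"),
    )
```

Events below their threshold, or for metrics with no threshold defined, produce no alert, which is one simple way of keeping notifications limited to the people who need them.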
Key Metrics
• No. of events by:
   – Category
   – Significance
• No. and % of events
   –   That required human intervention and whether this was performed
   –   That resulted in Incidents or Changes
   –   Caused by existing Problems or Known Errors
   –   Compared with the number of Incidents
• No. and % of:
   –   Repeated or duplicated events
   –   Events indicating performance issues
   –   Events indicating potential availability issues
   –   Each type of event per platform or application
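As a rough sketch of how a few of these metrics could be derived from an event log, assuming each log entry is a hypothetical tuple of (category, required human intervention, resulted in Incident):

```python
from collections import Counter

# Hypothetical event log entries:
# (category, required_human_intervention, resulted_in_incident)
EVENT_LOG = [
    ("network", False, False),
    ("storage", True, False),
    ("network", True, True),
    ("backup", False, False),
]


def event_metrics(log):
    """Summarise an event log into counts and percentages."""
    total = len(log)

    def pct(count):
        # Guard against an empty log to avoid dividing by zero.
        return 100.0 * count / total if total else 0.0

    needed_human = sum(1 for _, human, _ in log if human)
    became_incident = sum(1 for _, _, incident in log if incident)
    return {
        "total": total,
        "by_category": Counter(category for category, _, _ in log),
        "pct_human_intervention": pct(needed_human),
        "pct_resulting_in_incident": pct(became_incident),
    }
```

The category names and the tuple layout are placeholders; a real event management tool would supply its own fields.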
Implementation Challenges
• Obtain funding
• Correct level of filtering
• Rolling out necessary monitoring agents
• Acquiring necessary skills
Service Operation Processes

     Incident Management
Incident Management
• GOAL
  – To restore normal service operation as quickly as possible
    and minimize the adverse impact on business operations
• OBJECTIVE
  – To ensure that the best possible levels of service quality
    and availability are maintained
• SCOPE
  – Incident Management includes any Event which disrupts,
    or which could disrupt, a service. This includes Events
    which are communicated directly by users, either through
    the Service Desk or through an interface from Event
    Management to Incident Management tools
Basic Concepts
• Timescales
• Incident model
• Major Incidents
       NOTE: People sometimes use loose terminology and/or
        confuse a Major Incident with a Problem. In reality, an
        Incident remains an Incident forever – it may grow in
        impact or priority to become a Major Incident, but an
        Incident never ‘becomes a Problem’. A Problem is the
        unknown cause of one or more Incidents and remains a
        separate entity always.
Key Metrics
• Total number of Incidents
   – Breakdown at each stage
• Mean elapsed time to achieve Incident resolution or
  circumvention, broken down by impact code
• Percentage of Incidents handled within agreed response
  time
• Incident response-time
• Average cost per Incident
• Number and percentage of:
   – Major incidents, backlog, incorrectly assigned or categorized
   – Resolved remotely, without the need for a visit
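Two of these metrics can be sketched as follows. The record layout (impact code, minutes to respond, minutes to resolve) and the 30-minute response target are assumptions made for the example only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical incident records:
# (impact_code, minutes_to_respond, minutes_to_resolve)
INCIDENTS = [
    ("high", 10, 120),
    ("high", 40, 300),
    ("low", 20, 60),
]

AGREED_RESPONSE_MINUTES = 30  # assumed target, not taken from any real SLA


def pct_within_response_target(records, target_minutes):
    """Percentage of Incidents handled within the agreed response time."""
    if not records:
        return 0.0
    met = sum(1 for _, respond, _ in records if respond <= target_minutes)
    return 100.0 * met / len(records)


def mean_resolution_by_impact(records):
    """Mean elapsed time to resolution, broken down by impact code."""
    grouped = defaultdict(list)
    for impact, _, resolve in records:
        grouped[impact].append(resolve)
    return {impact: mean(times) for impact, times in grouped.items()}
```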
Implementation Challenges
• The ability to detect Incidents as early as possible
• Convincing all staff (technical teams as well as
  users) that all Incidents must be logged
• Availability of information about Problems and
  Known Errors
• Integration into the:
         – Configuration Management system (CMS)
         – Service Level Management process (SLM)
         – Service Knowledge Management System (SKMS)
Service Operation Processes

      Request Fulfillment
Request Fulfillment
• GOAL
   – To deal with Service Requests from the users/customers
• OBJECTIVE
   – To provide a channel for users to request and receive standard services
     for which a predefined approval and qualification process exists
   – To provide information to users and customers about the availability of
     services and the procedure for obtaining them
   – To source and deliver the components of requested standard services
     (e.g. licenses and software media)
   – To assist with general information, complaints or comments
• SCOPE
   – Each organization will need to decide and document which requests it
     will handle through the Request Fulfillment process and which others
     will have to go through a more formal Change Management process
Concept of the Service Request
The request from a user for information, advice, a
  standard change or access to an IT service.

  – For Example :
     • To reset a password
     • To provide standard
       IT services for a user


Service requests are usually handled by a Service
  Desk and do not require an RFC to be submitted.
Concept of the Request Model
The Request Model is a way of predefining the steps that
  should be taken to handle a process (in this case a
  process for dealing with a particular type of request) in
  an agreed way

Support tools can be used to manage the required process.
  This will ensure that standard requests are handled in a
  predefined path and within predefined timescales
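A Request Model can be pictured as a small data structure holding the predefined steps and timescale. The sketch below is a minimal illustration; the `RequestModel` class, the password-reset steps and the 4-hour target are hypothetical, not taken from any specific tool.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class RequestModel:
    """Predefined path and timescale for one type of Service Request."""
    request_type: str
    steps: Tuple[str, ...]
    target_hours: int


# Hypothetical model for a password reset; steps and target are assumptions.
PASSWORD_RESET = RequestModel(
    request_type="password_reset",
    steps=("verify identity", "reset password", "confirm with user", "close request"),
    target_hours=4,
)


def next_step(model: RequestModel, completed: Sequence[str]) -> Optional[str]:
    """Return the next predefined step, or None when the request is finished.

    Raises ValueError if the completed steps deviate from the predefined path.
    """
    if tuple(completed) != model.steps[: len(completed)]:
        raise ValueError("steps must follow the predefined path")
    if len(completed) == len(model.steps):
        return None
    return model.steps[len(completed)]
```

Rejecting out-of-order steps is one way a support tool could enforce the agreed path; a real tool might instead allow skips with an approval.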
Key Metrics
• Average cost
• Backlog
• Requests that met SLA vs. those that did not meet SLA
• Satisfaction surveys
Implementation Challenges
• Clearly defining and documenting the type of requests
  that will be handled within the Request Fulfillment
  process (and those that will either go through the
  Service Desk and be handled as Incidents or will need
  to go through formal Change Management), so that all
  parties are absolutely clear on the scope
• Establishing self-help front-end capabilities that allow
  the users to interface successfully with the Request
  Fulfillment process
Service Operation Processes

     Problem Management
Problem Management
• GOAL
  – To diagnose the root cause of incidents, to determine
    the resolution to those problems and to implement
    resolutions through appropriate control procedures
• OBJECTIVE
  – Primarily to prevent problems and resulting Incidents,
    eliminate recurring Incidents and to minimize the
    impact of Incidents that cannot be prevented
• SCOPE
  – The Management of the lifecycle of all problems
Problem Management
Problem Management is the process responsible
  for managing the Lifecycle of all Problems

Problem Management consists of two major processes :

        1.    Reactive Problem Management is generally executed as
              part of Service Operation and is, therefore, covered in the
              Service Operation book

        2.    Proactive Problem Management is initiated in Service
              Operation, but is generally driven as part of Continual
              Service Improvement.
Problem
The unknown cause of one or more Incidents. The
 cause is not usually known at the time a Problem
 Record is created, and the Problem Management
  process is responsible for further investigation
Problem Investigation & Diagnosis

•   Chronological Analysis
•   Pain Value Analysis
•   Kepner and Tregoe
•   Brainstorming
•   Ishikawa Diagrams
•   Pareto Analysis
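Of these techniques, Pareto Analysis lends itself most directly to a short worked example: rank causes by frequency and keep the few that account for most of the failures. The sketch below assumes the input is simply a list of cause labels, one per Incident, and uses the conventional 80% cut-off as a default; both are assumptions for illustration.

```python
from collections import Counter


def pareto(causes, cutoff_pct=80.0):
    """Return (cause, count, cumulative %) rows for the most frequent causes.

    Rows are listed from most to least frequent and stop as soon as the
    cumulative percentage reaches cutoff_pct.
    """
    counts = Counter(causes).most_common()
    total = sum(n for _, n in counts)
    rows = []
    cumulative = 0
    for cause, n in counts:
        cumulative += n
        cumulative_pct = 100.0 * cumulative / total
        rows.append((cause, n, round(cumulative_pct, 1)))
        if cumulative_pct >= cutoff_pct:
            break
    return rows
```

For instance, with six Incidents attributed to "network", three to "disk" and one to "config", the function returns the "network" and "disk" rows, since together they cover 90% of the Incidents.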
Workaround
• A technique which reduces or eliminates the
  impact of an incident or problem for which a
  full resolution is not yet available

For Example…
• Restarting a failed Configuration Item
• Rerouting workload

• Workarounds for incidents that do not have
  associated problem records are documented in
  the incident record

[Figure: incident record (Incident # xxxx, Category, Steps 1 to 4) held in shared data]
Known Error
A Problem that has a documented Root Cause and a
  Workaround
  Problem + Root Cause + Workaround = Known Error
Known Errors are created and managed throughout their
  lifecycle by Problem Management. Known Errors may
  also be identified by development or suppliers
Known Error Database (KEDB)
• A database containing all Known Error records
• The purpose is to store previous knowledge of
  Incidents and Problems, and how they were
  overcome, to allow quicker diagnosis and
  resolution if they recur
• This database is created by Problem
  Management and used by Incident and
  Problem Management
• The Known Error Database is part of the
  Service Knowledge Management System
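To make the idea concrete, a KEDB can be sketched as a searchable collection of Known Error records. The `KnownError` fields, the `KEDB` class and the symptom-matching rule below are all assumptions for illustration; real tools use far richer records and search.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class KnownError:
    """A Problem with a documented root cause and workaround (hypothetical fields)."""
    error_id: str
    symptoms: FrozenSet[str]
    root_cause: str
    workaround: str


class KEDB:
    """Minimal in-memory Known Error Database."""

    def __init__(self) -> None:
        self._records: List[KnownError] = []

    def add(self, record: KnownError) -> None:
        # Problem Management creates and maintains the records.
        self._records.append(record)

    def search(self, symptom: str) -> List[KnownError]:
        # Incident Management looks up records by an observed symptom.
        wanted = symptom.strip().lower()
        return [r for r in self._records if wanted in r.symptoms]
```

A Service Desk analyst who finds a matching record can apply its workaround straight away instead of diagnosing the Incident from scratch.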
Concept of The Problem Model
A problem model is a way of predefining the steps that
  should be taken to handle a process (in this case a
  process for dealing with a particular type of problem) in
  an agreed way

Support tools can then be used to manage the required
  process. This will ensure that ‘standard’ problems are
  handled in a pre-defined path and within pre-defined
  timescales
Key Metrics
• Total problems recorded
• % of problems resolved within SLA
• # or % of problems that exceed resolution targets
• Aged problems
• Average cost per problem
• # of Major problems identified
• # of Major problem reviews conducted
• Known Errors added to KEDB
Implementation Challenges
• The establishment of an effective Incident Management process
  and tools
• Formal interfaces and common practices between the two
  processes
• Links between Incident and Problem Management tools
• The ability to relate Incident and Problem Management Records
• Second and third-line Staff need to have a good working
  relationship with first-line staff
• Business Impact is well understood by staff undertaking
  investigation of problems
• Problem Management is able to use all Knowledge and
  Configuration Management resources available
Service Operation Processes

      Access Management
Access Management
• GOAL
  – To execute the policies and actions defined in
    Security and Availability Management.
• OBJECTIVE
  – To provide the entry rights for users to be able to use
    a service or group of services
• SCOPE
  – Access Management ensures that users are given the
    rights to use the service, but it does not ensure this
    access is available at all agreed times
Concepts
•   Access
•   Identity
•   Rights (also called privileges)
•   Services or service groups
•   Directory services
Key Metrics
Number of…

          – Requests for access (Service Request, RFC, etc.)
          – Incidents requiring a reset of access rights
          – Incidents caused by incorrect access settings

Instances of access granted: by service, user,
   department, etc.
Implementation Challenges
• Provision of a database of all users and the
  rights that they have been granted
• The ability to…
        – Verify the identity of a user
        – Verify the identity of the approving person or body
        – Verify that a user qualifies for access to a specific
          service
        – Link multiple access rights to an individual user
        – Determine the status of the user at any time
        – Manage changes to a user’s access requirements
        – Restrict access rights to unauthorized users
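Several of these abilities can be pictured together in a small register of users and rights. This is a minimal in-memory sketch under stated assumptions: the `AccessRegister` class and its method names are hypothetical, and identity verification is reduced to a membership check.

```python
from collections import defaultdict


class AccessRegister:
    """Hypothetical in-memory register of users and the rights granted to them."""

    def __init__(self, known_users, approvers):
        self.known_users = set(known_users)
        self.approvers = set(approvers)
        # user -> {(service, right), ...}; links multiple rights to one user
        self.rights = defaultdict(set)

    def grant(self, user, service, right, approved_by):
        # Verify the identity of the user and of the approving person or body.
        if user not in self.known_users:
            raise PermissionError(f"unknown identity: {user}")
        if approved_by not in self.approvers:
            raise PermissionError(f"{approved_by} may not approve access")
        self.rights[user].add((service, right))

    def revoke_all(self, user):
        # Restrict access, for example when a user's status changes.
        self.rights.pop(user, None)

    def has_right(self, user, service, right):
        return (service, right) in self.rights.get(user, set())
```

In practice this data would normally live in a directory service rather than in application memory.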
