RDrew ITIL Presentation
 

  • ITIL stands for the “Information Technology Infrastructure Library”. It was developed by the CCTA (Central Computer and Telecommunications Agency) in the late 1980s and has become the de facto standard in Service Management. The CCTA is now the OGC (Office of Government Commerce), part of HM Treasury and the authority for best practices within Government. ITIL underpins ISO 9000 quality standards and is considered a fast path toward quality certification. It is called a “library” because the information and guidance on implementing each process in an organization can be found in books.
  • Business goals are the referee between innovation and formalization. This slide illustrates the dynamic action between the two. Formalization is the red tape, procedures, etc. Innovation is the business’s adaptability to the global economy. The red arrow is the dynamic tension, and the referee in this game is the Business Goals. In IT we often speak of the PPT triangle: people, processes and technology are key ingredients in IT service delivery, and achieving a balance between them is often challenging. It is more than just people, processes and technology, though. These are tied to IT strategy, which is tied to corporate strategy. Draw the triangle on a flip chart: P P T
  • This is a modified slide incorporating additional arrows to illustrate the more normal flow of operations. The process view has been added to show the similarities and differences from the department view of things. Animation was added to this slide Each item comes up when is pressed The process view department view is dimmed Instead of having to go thru management to get things done, people are empowered to perform the required activities and boundaries (silos) between functional groups are removed
  • This is the end in mind. After covering the service support processes, this slide will be revisited and the delegates will have a much better understanding of the whole picture. This slide illustrates the relationships between the support processes. There is a similar one for the delivery processes. One must not conclude, however, that delivery and support processes do not share information; on the contrary, they do. It is simply that a diagram attempting to illustrate the relationships between all processes would be too busy.
  • Ownership, Monitoring, Tracking & Communication: The Service Desk is responsible for owning and overseeing the resolution of all outstanding Incidents. The Service Desk is 1st line Incident Management.
    Incident Detection & Recording: Symptoms, basic diagnostic data, and information about the related Configuration Item should be included in Incident records during detection and recording.
    Classification & Initial Support: Classification is the process of identifying the reason for the Incident. Many Incidents are regularly experienced and the appropriate resolution actions are well known, in which case the Service Desk should apply the known fix as soon as possible. This is not always the case, however, and a procedure (Investigation & Diagnosis) for matching Incident classification data against that for Problems and Known Errors is necessary (Incident Matching). Successful matching gives access to proven resolution actions, which should require no further investigation effort.
    Service Request: Specific to the Service Desk, non-failure related, follows an established path. Does not leave first line.
    Investigation & Diagnosis: Investigation and diagnosis may become an iterative process, starting with a different specialist support group and following elimination of a previous possible cause. It may involve multi-site support groups and support staff from different vendors. It may continue overnight with a new shift of support staff taking over the next day. All this demands a rigorous, disciplined approach and a comprehensive record of actions taken with corresponding results.
    Resolution & Recovery: After successful execution of the resolution or some circumvention activity, service recovery can be effected and recovery actions carried out, often by specialist staff (second- or third-level support). The Incident Management system should allow for the recording of events and actions during the resolution and recovery activity.
    Incident Closure: The Service Desk must ensure the following before closure can occur:
    • details of the action taken to resolve the Incident are concise and readable
    • classification is complete and accurate according to root cause
    • resolution/action is agreed with the Customer, verbally or, preferably, by email or in writing
    • all details applicable to this phase of the Incident control are recorded, such that: the Customer is satisfied; cost-centre project codes are allocated; the time spent on the Incident is recorded; the person, date and time of closure are recorded.
  • Here is an example of a discussion you can have with the delegates. Although the Service Desk is not a process as per ITIL, let’s use it as an example:
    • What is the goal of the Service Desk?
    • Assume you have more than one Service Desk in your organization. Who ensures that all help desks do things the same way? A: the process owner.
    • How can you say you have a quality Service Desk? What is quality? A: fit for purpose. Timex and Rolex, which is a better product?
    • What are some KPIs for a Service Desk?
    • What are a few activities of a Service Desk?
    • Inputs: What does a Service Desk need to know to do a good job?
    • Outputs: What does a Service Desk produce?
    • What resources does a Service Desk need to have at its disposal? A: phone system, escalation, call display, call tracking, knowledge base, equipment used at customer sites.
    • What roles will you find in a Service Desk? A: manager, team leader, supervisor, customer liaison, escalation management, agent, etc.
  • Give examples where they can be shared. For example, Priority and Categorization models should be shared by IM, PM, and ChM. One difference between the models would be the response times: IM may be 2 hrs, PM may be 2 days, etc.
  • Talk through the slide points. Actions have to come from goals and objectives.
  • Animation on this slide helps the trainer show the delegates the flow of incident management and its relationship to other processes. The animation sequence is:
    • Sources of incidents: incidents can be detected by many sources, not just the service desk.
    • Service request procedure: these are often FAQs, IMACs and “how-to” questions. They still have to be recorded but have a separate procedure.
    • Consulting the CMDB: ideally the service would have access to auto-populate the incident ticket and for investigation/diagnosis.
    • Inputs and outputs with problem management: lightly introduce the problem management process.
    • Inputs and outputs with change management: an incident can require an RFC to resolve it.
    • Resolution and workarounds leave the process, closing the loop back.
  • This is a new slide to illustrate the linkage with all other support levels. It is complementary to the previous two. The diagram is taken from the book and replaces 2 slides that were part of the previous version. The arrows from 2nd, 3rd and Nth level support back towards first level indicate that information, education and training will be provided back to first level when required. In many organizations, only the service desk (i.e. usually 1st level) can close an incident ticket. The animation sequence is:
    • 1st level: see slide xx (activities)
    • 2nd level: functional escalation + activities expected from 2nd level
    • 3rd level: functional escalation + activities expected from 3rd level
    • Nth level: functional escalation until someone finds a temporary solution, fix or workaround
  • No changes to this slide. This is a prioritization matrix that was developed by Pink Elephant consultants for a client. Therefore, the matrix itself is not defined within ITIL, but demonstrates how the principles of “Impact” and “Urgency” were practically applied to an organization. Be sure to mention that “expected effort” could be used as a tie breaker. This grid is used as a guideline only. Impact and urgency can move along the grid over time.
  • Categorization is one of the most important aspects of ensuring that proper management information is available. Category trees need to be maintained and must not be too cumbersome. Impact, urgency and expected effort should be consistent across Incident, Problem and Change Management.
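    As a rough illustration of how impact and urgency can be combined into a priority, with expected effort as a tie breaker, here is a minimal Python sketch. The three-level scale, the additive scoring rule, the field names and the "less effort first" tie-break are all assumptions made for illustration; they are not the client matrix shown on the slide, and the matrix is not defined by ITIL.

```python
# Hypothetical scale: the slide's actual matrix is client-specific and not reproduced here.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact, urgency):
    """Map impact and urgency to a priority from 1 (highest) to 5 (lowest)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


def work_order(tickets):
    """Sort tickets by priority; expected effort breaks ties (assumed: less effort first)."""
    return sorted(
        tickets,
        key=lambda t: (priority(t["impact"], t["urgency"]), t["expected_effort_hours"]),
    )


# Illustrative tickets only; identifiers and figures are made up.
tickets = [
    {"id": "INC-1", "impact": "medium", "urgency": "high", "expected_effort_hours": 4},
    {"id": "INC-2", "impact": "high", "urgency": "medium", "expected_effort_hours": 1},
    {"id": "INC-3", "impact": "low", "urgency": "low", "expected_effort_hours": 2},
]
print([t["id"] for t in work_order(tickets)])  # ['INC-2', 'INC-1', 'INC-3']
```

    Because the same `priority` function could be reused for Incident, Problem and Change records, a sketch like this also shows one way to keep the model consistent across the three processes while letting each process attach its own response times.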
  • Exercise: Incident matching. Create a grid on a flip chart for teams to enter matches, then look at the differences and similarities and discuss why items are or are not matched. The incident ticket and description are not the challenge; the challenge is the problem ticket. Guide delegates through this slide, but don’t make a major issue out of remembering it. This process will be different in each organization. The main message is that it should exist and that this is what ITIL regards as best practice.
  • Proactive: Proactive activities aim to minimize the adverse impact on the business of Incidents and Problems that are caused by errors within the IT Infrastructure, and to prevent the recurrence of Incidents related to these errors. The activity involved in reaching this goal is root cause analysis of the Incidents, followed by actions to improve or correct the situation. Proactive Problem Management is concerned with identifying and solving Problems and Known Errors before Incidents occur in the first place. Book: “Analysing Incidents over differing time periods.”
    Reactive: This is concerned with solving Problems in response to one or more Incidents. Book: “Analysing Incidents as they occur.”
  • This illustrates the reactive activities within Problem Management. The attendees should be very comfortable with this information now.
  • The occurrence of Incidents is unavoidable: increasingly complex IT infrastructures will always cause Incidents and interruptions to normal service. Reactive Problem Control is concerned with the first three phases of the Problem Control process. Problem Identification and Recording takes place when:
    • matching the Incident and CIs to existing Problems and Known Errors is unsuccessful during initial Incident support and classification
    • analysis of Incident data reveals recurrent Incidents
    • analysis of Incident data reveals Incidents that are not yet matched to existing Problems or Known Errors
    • analysis of the IT infrastructure indicates a Problem that could potentially lead to Incidents
    • a major or significant Incident (serious and adverse impact on services to the Customer) occurs for which a structural solution has to be found.
    Problems should be recorded in the CMDB and linked to any related Incident records. Solutions and workarounds for Incidents should be recorded in the relevant Problem records so that others can access them should related Incidents occur.
    NOTES: Some Problems will be identified from other processes, e.g. Capacity and Availability Management, which are concerned with detection and avoidance of problems and incidents in the IT infrastructure. It may not be cost effective to put this possibly expensive resource skill into place for some Incidents and Problems. If this is the case, a dummy Problem record can be introduced in the CMDB, related to all connected Incidents, Known Errors, RFCs and CIs.
    Problem Classification: Problem classification is similar to Incident classification (see next slides).
  • Error control covers the processes involved in the successful correction of Known Errors. The objective is to change IT components to remove Known Errors affecting the IT infrastructure and thus to prevent any recurrence of Incidents. Many IT departments are concerned with error control, and it spans both the live and development environments. It directly interfaces with, and operates alongside, Change Management processes. The figure shows the three phases of the error control process. The monitoring and tracking phase covers the entire Problem/error life-cycle.
  • Diagnosis frequently reveals that the cause of a Problem is not an error in a registered CI (hardware, software item, documentation or procedure) but is procedural. Incorrect release of a version of a program is one example. These situations result in Problem closure with an appropriate categorization code. Problems of this type do not automatically achieve the formal status of Known Error. To ensure that these Problems are followed up and that action is taken to address them, consider creating a dummy CI record for the offending procedure and re-classifying the Problem as a Known Error, or raise an RFC. Diagnosis showing the cause to be a fault in a registered CI should automatically change the status of the Problem into a Known Error. At this point the error control system and procedures take over. As indicated earlier, the objectives of Problem investigation frequently conflict with those of Incident resolution. For example, Problem investigation may require detailed diagnostics data, which is available only when an Incident has occurred; its capture may significantly delay the restoration of normal services. Be sure to liaise closely with Incident control and the computer operations or network control functions to get a balanced view of the right time for such actions.
  • Problem Management staff perform an initial assessment of the means of resolving the error, in collaboration with specialist staff. If necessary, they then complete an RFC according to Change Management procedures. The priority of the RFC is determined by the urgency and impact of the error on the business. The RFC identifier should be included in the Known Error record and vice versa in order to maintain a full audit trail, or the two records should be linked. The final stages of error resolution (impact analysis, detailed assessment of the resolution action to be carried out, amendment of the item in error, and testing of the Change) are under the control of Change Management. In extreme circumstances, authorization and execution of an urgent resolution may be necessary.
  • This just supports the previous slides and explains how the live environment and the development environment interact to cope with Known Errors and resolutions.
    Error control in the software environments: The processes of Problem and error control are essentially the same in the live and development environments. The support tools described earlier for Problem Management in the live environment are precisely those required in the development environment. The figure shows how there is a cyclical relationship between error control in the live and development environments. Interlocking and integrated Problem Management systems facilitate the handling of this situation. Errors found during live operations result in an accumulation of RFCs. The Release strategy (Release Management) allows for the eventual creation of a Release to incorporate authorized Changes for the amendment of system facilities. Development staff should be aware of all Known Errors and Problems that are associated with the package Release. They are required to delete Known Errors as they are corrected, but they add any newly introduced errors from the development activity itself to a revised errors database (or CMDB). Upon implementation of a new Release, this revised errors database replaces the database of the previous Release as the live version. The cycle then repeats itself as new errors are discovered in live operation.
  • Trend Analysis: “Incident and Problem analysis reports provide information for proactive measures to improve service quality.” Trends such as:
    • the post-change occurrence of particular Problem types
    • incipient faults of a particular type
    • recurring Problems of a particular type or with an individual item
    • the need for more customer training or better documentation
    It may reveal:
    • Problems occurring on one platform that may also occur on another platform. For example, a Problem concerning network software on a midrange system may well be of significance on a mainframe.
    • The existence of recurring problems. For example, if three routers are substituted serially because of the same failure, it may indicate that the router type concerned is not appropriate and should be replaced by another type. When a software application is involved, complete redevelopment might be necessary, which would be classed as a major change.
  • Targeting preventive action: Proactive Problem Management can estimate the business-related impact of Incidents in a specific Problem area in order to target preventive actions. “With this concept a pain value is given to each Incident category on the basis of a formula, taking into account, for instance:
    • the volume of Incidents
    • the number of customers impacted
    • the duration and related costs of resolving the Incidents
    • the cost to the business – this being perhaps the most important factor of all.”
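    The notes do not give the pain value formula itself, so the following Python sketch is only one possible reading: a weighted sum over the four factors listed. The weights, field names and sample figures are hypothetical and would need to be agreed within each organization.

```python
from dataclasses import dataclass


@dataclass
class CategoryStats:
    """Figures collected per Incident category (field names are illustrative)."""
    name: str
    incident_count: int        # volume of Incidents
    customers_impacted: int    # number of customers impacted
    resolution_hours: float    # duration of resolving the Incidents
    business_cost: float       # cost to the business, in currency units


# Hypothetical weights; the source only says a formula exists, not what it is.
WEIGHTS = {"volume": 1.0, "customers": 2.0, "hours": 0.5, "cost": 0.01}


def pain_value(stats, weights=WEIGHTS):
    """Weighted sum of the four factors named in the notes (assumed formula)."""
    return (
        weights["volume"] * stats.incident_count
        + weights["customers"] * stats.customers_impacted
        + weights["hours"] * stats.resolution_hours
        + weights["cost"] * stats.business_cost
    )


def rank_by_pain(categories):
    """Return categories ordered from most to least painful."""
    return sorted(categories, key=pain_value, reverse=True)


# Made-up sample data for two Incident categories.
categories = [
    CategoryStats("printing", 90, 20, 25.0, 500.0),
    CategoryStats("email", 40, 120, 30.0, 5000.0),
]
for category in rank_by_pain(categories):
    print(category.name, round(pain_value(category), 1))
```

    With these assumed weights the email category ranks above printing even though it has fewer Incidents, which reflects the point in the notes that volume alone should not decide where preventive action is targeted.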
  • Each incident is the result of an error in the IT infrastructure. To be able to discover the error, a problem is defined, and this problem can be researched and diagnosed. As soon as the CI that causes the error is found, the problem changes into a known error. A known error is removed by means of a change, and Problem Management ensures that an RFC is sent to Change Management. To remove the error, a CI must, for example, be replaced or repaired. Only after a successful change has the cause of the incidents (the error) been removed, and only then is it guaranteed that these incidents will not recur. Only then has a structural solution to the incidents been applied. In the interests of clarity, the differences between the Help Desk and Problem Management are explained below:
    Help Desk: The objective of the Help Desk process is to resolve incidents as quickly as possible. Incidents arise during the utilization of the IT infrastructure and all incidents are recorded.
    Problem Management: Problem Management focuses on introducing structural improvements to the IT infrastructure. The main concern here is not that a problem is solved quickly, as with the Help Desk, but that the solution is reliable and robust. A problem is defined on the basis of the incidents recorded by the Help Desk.
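    The progression described above (problem, then known error, then RFC, then a successful change) can be sketched as a small state machine. This Python sketch is an illustration only: the status names and the strictly linear set of allowed transitions are assumptions, and real tools typically allow more paths (for example, closing a problem without a change).

```python
from enum import Enum


class Status(Enum):
    """Hypothetical status names for a Problem record."""
    PROBLEM = "problem"            # cause not yet identified
    KNOWN_ERROR = "known error"    # faulty CI identified
    RFC_RAISED = "RFC raised"      # change requested via Change Management
    RESOLVED = "resolved"          # change implemented successfully


# Assumed linear lifecycle, following the order given in the notes.
ALLOWED = {
    Status.PROBLEM: {Status.KNOWN_ERROR},
    Status.KNOWN_ERROR: {Status.RFC_RAISED},
    Status.RFC_RAISED: {Status.RESOLVED},
    Status.RESOLVED: set(),
}


def advance(current, target):
    """Return the new status, or raise ValueError if the step is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


status = Status.PROBLEM
for step in (Status.KNOWN_ERROR, Status.RFC_RAISED, Status.RESOLVED):
    status = advance(status, step)
print(status.value)  # resolved
```

    Encoding the transitions in a table makes it explicit that, in this model, a problem cannot be marked resolved until a known error has been identified and a change has been requested.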

Presentation Transcript

  • ITIL ® IT Service Management Overview Ron Drew, PMP IT Executive Consultant
  • ITIL Overview Agenda
    • What is ITIL?
    • ITIL Key Concepts
    • Operational ITIL Processes
    • Tactical ITIL Processes
    • Process Design
  • What Is ITIL?
    • The Information Technology Infrastructure Library (ITIL) provides a framework of “Best Practice” guidance for IT Service Management. It is the most widely used such framework in the world and the de facto standard in IT Service Management.
    A framework developed by the United Kingdom’s Office of Government Commerce (OGC) and captured in a series of books
  • What is IT Service Management?
    • IT Service Management is a top-down, business driven approach to the management of IT that specifically addresses the strategic business value generated by the IT organization and the need to deliver a high quality IT service.
    • IT Service Management is designed to focus on the people, processes and technology issues that IT organizations face.
  • ITIL: Origins & Evolution
    • Late 1980s
      • UK government project started
      • CCTA (OGC) involved in development plus practitioner and consulting organizations
      • Organizations outside of government became interested
      • First books published
    • Early 1990s
      • The library completed
    • Late 1990s
      • Generally accepted as the de facto standard for IT service management worldwide
  • ITIL: Origins & Evolution
    • 2000-2005
      • British Standards Institution “Specification for IT Service Management” (BS 15000)
      • Australian Standard 8018
      • ISO (ISO 20000)
      • Vendor community supports ITIL and is developing products and practices in support of the framework
      • All major service management software vendors moving towards ITIL Service Management compatibility
    • Where is it going?
      • Refresh project initiated late 2005
      • New books published May 30, 2007
  • Who has Adopted ITIL?
    • ITIL has been incorporated with Service Management Framework in some major companies:
  • ITIL: Library Series
    The library is a set of books:
    • Service Desk
    • Incident Management
    • Problem Management
    • Change Management
    • Release Management
    • Configuration Management
    • Service Level Management
    • Availability Management
    • Capacity Management
    • Financial Management for IT Services
    • IT Service Continuity Management
    • Service Delivery
    • Infrastructure Management
    • Applications Management
    • Security Management
  • ITIL: Publication Framework
  • Relationship Between Innovation & Formalization
    Diagram labels: Innovation, Formalization, Business Goals, Process Effectiveness, IT Strategy, People/Culture, Technology, Processes
    • The right time & place
    • The right quality & quantity
    “right” = at a reasonable cost
  • Business Alignment
    Diagram labels: Corporate Vision; Corporate Mission; Corporate SWOT; Corporate Strategies and Goals; IT Vision; IT Mission; IT SWOT; IT Strategic Goals; IT Tactical Goals; IT Operational Goals
    Questions posed: Are we working towards achieving our vision? Are we operating within the purpose and scope? What needs to be measured?
    Groupings: Business Objectives, IT Objectives, Performance Indicators
  • Systems Versus Service Management
    • Systems Management
      • Isolated systems
      • Technology and asset focused
      • Systems monitoring
      • IT perspective
    • Service Management
      • Service as experienced / consumed
      • Technology transparent to customer
      • From customer perspective
  • Service Versus System
    Diagram components: Mainframe, Gateway, WAN, Router, Server, Hub, PCs, together delivering the e-mail service
    System availabilities: 99.96%, 98%, 97.5%, 96%
    Availability = Ops * WAN * LAN * Desktop
    Total availability of the e-mail service = 91.69%
    • Software Layer
    • Infrastructure Layer
    • Procedure Layer
    • People Layer
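    The arithmetic behind the 91.69% figure on the Service Versus System slide can be checked with a short Python sketch. It assumes the simple series model stated on the slide (the service is up only when every component is up, so availabilities multiply). Which percentage belongs to which of Ops, WAN, LAN and Desktop is an assumption based on the order in which the figures appear; the product is the same whichever way they are assigned.

```python
from math import prod


def serial_availability(component_availabilities):
    """Availability of a service whose components must all be up (series model)."""
    return prod(component_availabilities)


# Figures from the slide; the mapping of values to component names is assumed.
components = {"ops": 0.9996, "wan": 0.98, "lan": 0.975, "desktop": 0.96}

email_service = serial_availability(components.values())
print(f"{email_service:.2%}")  # 91.69%
```

    The result is lower than the availability of the weakest single component (96%), which is the point of the slide: the customer experiences the service end to end, not the individual systems.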
  • Service Portfolio Management
    • DEFINE: Collect information and inventories of existing services. Establish the requirements for the requested service, and establish the business case for implementing the service.
    • ANALYZE: Review the long-term business goals, and determine what services are required to meet those goals. Then analyze the requested service for financial viability, operational capability, and technical feasibility to determine how the organization is going to get there. (You may decide to obtain the service from an outsourcer rather than develop it internally.)
  • Service Portfolio Management
    • APPROVE: Make a decision to retain, replace, renew, or retire the services.
    • CHARTER: Communicate action items to the organization to implement the approved service, and allocate budget and resources.
    The Define, Analyze, and Approve steps are described in the ITIL V3 Service Strategy book. The Charter step is discussed in the ITIL V3 Service Design book.
  • Configuration Management System (CMS)
    The CMS was introduced in ITIL V3. It provides a strong foundation for the service catalog. The CMS is an ecosystem that feeds, manages, analyzes, and presents the information contained in the Configuration Management Database (CMDB), which is another fundamental component of ITIL. Although the CMDB is depicted in the ITIL books as merely a core component of the CMS, a well-architected, federated CMDB implements much of the functionality of the CMS.
  • Service Catalog
    The Service Catalog is a subset of the service portfolio that is visible to customers.
    • Provides the service consumer view.
    • Services Available to consumers
    • Shows relationships of services to business units/processes
    • Provides IT view.
    • Makeup of services
    • Shows relationships of services to enterprise infrastructure elements that support them.
  • Processes vs. Departments
    Department view: IT Service Desk, 2nd Dept., 3rd Dept.
    Process view: Step 1, Step 2, Step 3, Step 4, Step 5
  • Information Technology Infrastructure Library (ITIL) Overview
  • Benefits of Adopting ITIL
    • ITIL best practices allow you to create services that are standardized and repeatable
    • Enhanced Customer satisfaction as service providers know and deliver what is expected of them
    • Overall improved quality of business operations by ensuring the IT processes align to the business processes
    • Model compatible with PMBOK, SixSigma, ISO
    • More reliable business support provided by processes such as Incident and Change Management – as well as the Service Desk function
    • Provides a common language, guidelines for the establishment of roles, responsibilities and skills requirements
    • Increases productivity of Business and Customer staff because of more reliable and available IT services
      • Lowers the cost of delivering services
      • Enables customer expectation setting and satisfaction
      • Ensures consistent, enhanced service quality
  • ITIL Service Support Process Model
    Diagram elements:
    • Participants: The Business, Customers & Users; Service Desk; Management Tools; CMDB
    • Processes: Incident, Problem, Change, Release, Configuration
    • Flows: Difficulties, Queries, Enquiries; Communication, Updates, Workarounds; Incidents; Service Requests; Problems, Known Errors; RFCs/Change Documentation; Changes; Releases; Release Documentation; CIs, Relationships
    • Outputs: Service Reports, Incident Statistics, Audit Reports; Problem Statistics, Trend Analysis, Problem Reports, Problem Reviews, Diagnostic Aids, Audit Reports; Change Schedule, CAB Minutes, Change Statistics, Change Reviews, Audit Reports; Release Schedule, Release Statistics, Release Reviews, Secure Library, Testing Standards, Audit Reports; CMDB Reports, CMDB Statistics, Policy/Standards, Audit Reports
  • ITIL Management/Process Model (Another View)
  • Service Support Processes
    • Service Desk:
      • Provide a strategic central point of contact for customers and an operational single point of contact for managing incidents to resolution – actually a function, not a process.
    • Incident Management:
      • Restore normal service operation as quickly as possible and minimize adverse impact on business operations within the defined parameters of service level agreements.
    • Problem Management:
      • Minimize the impact of incidents and problems on the business that are caused by errors in the IT infrastructure and to prevent recurrence of incidents related to these errors.
    • Change Management:
      • Ensure standardized methods and procedures are used for efficient and prompt handling of all changes, to minimize the impact of change-related incidents upon service quality and thereby improve day-to-day operations.
  • Service Support Processes
    • Release Management:
      • Holistic view of a Change to an IT service to ensure all aspects of a release, both technical and non-technical are considered together.
    • Configuration Management:
      • Identify, record and report on all IT components that are under the control and scope of Configuration Management.
    • Service Level Management:
      • Improve and maintain IT Service Quality through a repeated cycle of agreeing, monitoring, and reporting on IT Service achievements, and instigate actions to eradicate poor service.
    • Financial Management for IT:
      • Provide cost-effective stewardship of the IT assets and resources used in providing IT services.
  • Service Support Processes
    • Capacity Management:
      • Ensure IT Infrastructure capacity matches evolving business demands in a cost-effective and timely manner. Balance cost against capacity and supply against demand.
    • IT Service Continuity:
      • Support Business Continuity Management (BCM) by ensuring business defined critical IT technical and service facilities can be recovered within predefined timeframes.
    • Availability Management:
      • Optimize the capability of the IT Infrastructure, services and partners to deliver a cost-effective and sustainable level of availability, enabling business objectives to be met.
  • ITIL Implementation
  • The Service Delivery Process Model (diagram)
    • The Business, Customers & Users: queries, enquiries, communication, updates and reports exchanged with Service Level Management
    • Service Level Management: SLAs, OLAs, SLRs, service requests, service catalogue, SIP, exception reports, audit reports
    • Requirements, targets and achievements exchanged between Service Level Management and the other processes
    • Capacity: Capacity Plan, CDB, targets/thresholds, capacity reports, schedule, audit reports
    • Availability: Availability Plan, AMDB, design criteria, targets/thresholds, reports, audit reports
    • IT Service Continuity: IT Continuity Plans, BIA & risk analysis, define requirements, control centers, DR contacts, reports, audit reports
    • IT Financial Management: financial plans, types & models, costs & charges, reports, budgets & forecasts, audit reports
    • Management Tools: alerts, exceptions, changes
  • Operational Process Design Assumptions:
    • Demonstrated Sr. level commitment
    • Dedicated resources in place
    • No major political constraints
    • Single location or region
    • Timeline diagram (under 6 months in total), with phases labeled What, Who and How and durations of 6-8 weeks, 1-2 weeks and 16 weeks
      • Deliverables shown: High Level Process Model; Authority Matrices; Procedures / Policy Documents; functional tool requirements; process workshops; skills development; Communication Plan; Management Information (Quality Norms and Metrics)
  • ITIL Process Design
    • High Level Process
    • High Level Policies
    • Procedures
    • Roles and Responsibilities (ARCI Matrix)
    • Detailed Work Instructions
    • CSFs, KPIs and Metrics
    • Reporting
  • Generic Process Model
    • Process: a connected series of actions, activities, changes etc. performed by agents with the intent of satisfying a purpose or achieving a goal
    • Process Control: Process Owner; Process Goal; Quality Parameters & Key Performance Indicators
    • Process: Input & Input Specifications; Activities & Sub-Processes; Output & Output Specifications
    • Process Enablers: Resources; Roles
  • Policies
    • Shared by multiple processes
    • Describe the agreed decisions made by a group or organization
    • Frequently used to describe quality control measures
  • Detailed Workflow & Procedures
    • Develop Detailed Workflow and Procedures
      • Inputs and outputs of each procedure activity
      • Procedures on what needs to happen
      • Detailed activity description
      • Work instructions when required
    • Options when documenting
      • Show roles with each process activity
      • Show tools used for each process activity
  • Roles & Responsibilities
    • Define the roles required to execute the new process activities and procedures
    • Identify major tasks and responsibilities for each role
    • Identify skill requirements for each role
      • Identify roles that are new to the organization and begin working with HR to get approval on these roles
      • Identify major changes to existing roles and begin working with HR to get approval on these roles
    • Map the roles and responsibilities back to the organization (Authority Matrix – ARCI / Accountable, Responsible, Consulted and Informed)
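An authority matrix of this kind can be sanity-checked mechanically. The following is a minimal sketch, assuming a matrix stored as a dictionary from activity to role assignments; the activity and role names are hypothetical, and the rule "exactly one Accountable and at least one Responsible per activity" is a common ARCI/RACI convention rather than something stated on the slide.

```python
from typing import Dict, List

# Hypothetical shape: activity -> {role name: one of "A", "R", "C", "I"}
AuthorityMatrix = Dict[str, Dict[str, str]]

VALID_CODES = {"A", "R", "C", "I"}


def check_arci(matrix: AuthorityMatrix) -> List[str]:
    """Return human-readable findings; an empty list means no issues were found."""
    findings: List[str] = []
    for activity, assignments in matrix.items():
        codes = list(assignments.values())
        invalid = sorted({c for c in codes if c not in VALID_CODES})
        if invalid:
            findings.append(f"{activity}: unknown code(s) {invalid}")
        # Assumed convention: one and only one Accountable role per activity.
        if codes.count("A") != 1:
            findings.append(
                f"{activity}: expected exactly one Accountable, found {codes.count('A')}"
            )
        # Assumed convention: someone must actually do the work.
        if codes.count("R") < 1:
            findings.append(f"{activity}: no Responsible role assigned")
    return findings


# Hypothetical example data.
example = {
    "Record incident": {"Service Desk Analyst": "R", "Incident Manager": "A", "Customer": "I"},
    "Close incident": {"Service Desk Analyst": "R", "Customer": "C"},
}
findings = check_arci(example)
```

Here the second activity is flagged because no role is marked Accountable.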
  • Measurement Framework
    • Balanced Score Card Approach: in order to understand something, you must look at it in more than one way
    • Perspectives: Financial, Customer, Innovation, Internal
    • Example Goals: Improve Quality Of Service; Improve Management Control; Employ New Technology; Reduce Cost
    • Process Measures:
      • Value
      • Quality
      • Performance
      • Compliance
  • Service Support KPIs (Process: Core Objective, followed by Example Core KPIs and Category)
    • Incident: Restore service degradations to expected level ASAP
      • # of Incidents by category, priority and resolution type by LOB
      • # of Incidents restored within SLA Targets
      • Category: Quality, Performance
    • Problem: Identify systemic Infrastructure Errors and eliminate them to minimize impact and improve availability
      • # of problems identified & root cause determined with solution or workaround
      • # of repeat incidents by category trending downwards
      • Category: Quality, Value
    • Change: Handle changes efficiently while minimizing impact to service delivery
      • # of changes by type / category / Group / Customer (emergency changes trending down)
      • # of changes that have resulting incidents, or fail and have to be backed out
      • Category: Quality, Value
    • Config.: Identify / control / manage IT resources within a Configuration Management Database
      • % of CMDB data population and accuracy vs actual, according to scope
      • % growth or change by CI type over an elapsed time period
      • Category: Quality, Value
    • Release: Ensure production readiness, quality and authorization of new or modified CIs and their planned deployment
      • # of releases by type that satisfy release management criteria when submitted to Change
      • # of releases that bypass the process
      • Category: Quality, Compliance
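One of the Incident Management KPIs above, the number of Incidents restored within SLA targets, is often reported as a percentage. The sketch below shows one way to compute it; the `Incident` record shape and its field names are assumptions made for illustration, not part of the presentation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Incident:
    """Hypothetical minimal incident record."""
    category: str
    minutes_to_restore: float
    sla_target_minutes: float


def pct_restored_within_sla(incidents: Iterable[Incident]) -> float:
    """Percentage of incidents whose restore time met the SLA target."""
    items = list(incidents)
    if not items:
        return 0.0  # assumption: report 0% when there is nothing to measure
    met = sum(1 for i in items if i.minutes_to_restore <= i.sla_target_minutes)
    return 100.0 * met / len(items)


# Hypothetical sample: three of the four incidents met their target.
sample = [
    Incident("email", 30, 60),
    Incident("email", 90, 60),
    Incident("network", 15, 240),
    Incident("print", 60, 60),
]
result = pct_restored_within_sla(sample)
```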
  • Service Delivery KPIs (Process: Core Objective, followed by Example Core KPI and Category)
    • SLM: Define services; agree on level, scope, quality, performance; monitor & manage
      • % score of customer satisfaction survey trends up over time
      • Category: Value
    • Avail.: Define and plan for service availability to meet or exceed stated business requirements through process, technology and people resource planning and implementation
      • % of service availability within SLA negotiated requirements
      • Category: Quality
    • Capacity: Current and future resources are greater than or equal to demand, but excess is planned
      • % of components that breach tolerance thresholds in correspondence to planned capacity levels for components and complete IT systems
      • Category: Quality
    • ITSCM: Recover IT systems to normal state in an alternate way after a disaster within an expected timeframe
      • % of systems that fail recovery test
      • Time to execute test of plan and recover IT services in a contingency state against expected targets
      • Category: Quality, Performance
    • Finance: Plan for and deliver IT Services within a forecasted budget against actual cost
      • % deviation of forecasted versus actual cost of IT services within defined tolerance limits (% of deviation, $ of deviation)
      • Category: Compliance
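The Finance KPI above (deviation of forecasted versus actual cost, in both $ and %, against a tolerance limit) reduces to simple arithmetic. A minimal sketch follows; the function names and the 5% tolerance in the usage lines are illustrative assumptions.

```python
from typing import Tuple


def cost_deviation(forecast: float, actual: float) -> Tuple[float, float]:
    """Return (deviation in currency units, deviation as a % of forecast)."""
    if forecast == 0:
        raise ValueError("forecast must be non-zero to compute a percentage")
    amount = actual - forecast
    return amount, 100.0 * amount / forecast


def within_tolerance(forecast: float, actual: float, tolerance_pct: float) -> bool:
    """True if the absolute % deviation does not exceed the tolerance limit."""
    _, pct = cost_deviation(forecast, actual)
    return abs(pct) <= tolerance_pct


# Hypothetical figures: forecast 100,000 and actual 105,000.
amount, pct = cost_deviation(100_000, 105_000)
ok = within_tolerance(100_000, 105_000, tolerance_pct=5.0)
```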
  • Let’s Look at Incident Management
    • To restore normal service operation as quickly as possible and minimize the adverse impact on business operations
    • Incident management can be performed by 1st, 2nd, 3rd,…, Nth level support teams, including 3rd party vendors and management
    • An incident is a deviation from the expected standard operation of a system or service
  • Incident Management Activities
    • Incident Detection & Recording
    • Classification & Initial Support
      • Service Request? Yes: Service Request Procedure; No: continue
    • Investigation & Diagnosis
    • Resolution & Recovery
    • Incident Closure
    • Ownership, Monitoring, Tracking & Communication (throughout)
  • Incident Management Process
    • Incidents enter the process from: Service Desk; Computer Operations; Networking; Procedures; other sources of Incidents
    • Within the process: routing and monitoring
    • Interfaces: Service Request Procedures; Problem & Error DB (resolution / work-around); CMDB (configuration details); Change Management Process (RFC, resolution)
    • Resolutions / work-arounds leave the process
  • First, Second & Third Line Support
    • 1st level: Incident Detection & Recording; Service Request? (Yes: Service Request Procedure); Classification & Initial Support; Investigation & Diagnosis; Resolution & Recovery; Resolved? (Yes: Incident Closure; No: pass to 2nd level)
    • 2nd level: Investigation & Diagnosis; Resolution & Recovery; Resolved? (Yes: Incident Closure; No: pass to 3rd level)
    • 3rd level: Investigation & Diagnosis; Resolution & Recovery; Resolved? (Yes: Incident Closure; No: pass to Nth level)
    • Nth level: etc.
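The hand-off between support lines can be sketched as a simple loop: each level investigates and attempts resolution, and an unresolved Incident moves to the next level. This is an illustrative model only; the `SupportLevel` type and the lambda "resolvers" are hypothetical stand-ins for real support teams.

```python
from typing import Callable, List, Optional, Tuple

# Each level is (name, attempt); attempt returns True if the level resolved the incident.
SupportLevel = Tuple[str, Callable[[str], bool]]


def handle_incident(description: str, levels: List[SupportLevel]) -> Optional[str]:
    """Try each level in order; return the resolving level's name, or None."""
    for name, attempt in levels:
        # Investigation & Diagnosis and Resolution & Recovery are folded into attempt().
        if attempt(description):
            return name  # Resolved: proceed to Incident Closure
        # Not resolved: pass the incident to the next support level.
    return None


# Hypothetical support lines with toy resolution rules.
levels: List[SupportLevel] = [
    ("1st level", lambda d: "password" in d),
    ("2nd level", lambda d: "network" in d),
    ("3rd level", lambda d: True),
]
resolved_by = handle_incident("network outage in building 2", levels)
```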
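  • Prioritization (priority code by IMPACT and URGENCY; 1 is the highest priority)
    • High impact: High urgency = 1; Medium urgency = 2; Low urgency = 3
    • Medium impact: High urgency = 2; Medium urgency = 3; Low urgency = 4
    • Low impact: High urgency = 3; Medium urgency = 4; Low urgency = 5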
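The 3x3 matrix on this slide (values 1 to 5, with 1 the highest priority at high impact and high urgency) is consistent with a simple formula: priority = 7 - (impact + urgency), with Low = 1, Medium = 2, High = 3. The sketch below encodes that reading; the `Level` enum and function name are illustrative, and an organization may well use a different matrix.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> int:
    """Priority code from the slide's matrix: 1 is highest, 5 is lowest."""
    # Derived from the matrix values 3 2 1 / 4 3 2 / 5 4 3, which are
    # symmetric in impact and urgency.
    return 7 - (int(impact) + int(urgency))


top = priority(Level.HIGH, Level.HIGH)
bottom = priority(Level.LOW, Level.LOW)
```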
  • Escalation
    • Hierarchical (Authority)
    • Functional (Competence)
  • Categorization
    • The purpose of categories is to organize your information so you can manage more effectively
    • Provide the ability to perform layered reporting against a single category or a combination of categories
    • Categorization will help with defining your assignment paths, escalation paths, and work groups
    • Service
      • System
        • Configuration Item Attributes
  • Service Lifecycle Categories
  • Incident Matching Procedure
    • Incident alert: routine Incident?
      • Yes: assess Incident and follow routine procedure
      • No: check for a match on Known Errors Db
    • Match on Known Errors Db?
      • Yes: inform Customer of work-around; update Incident count on Known Error record; update Incident record with ID of Known Error; update Incident record with category data; extract resolution or circumvention action from Known Errors Db; support required?; execute resolution action
      • No: check for a match on Problem Db
    • Match on Problem Db?
      • Yes: update Incident count on Problem record; update Incident record with ID of Problem; update Incident record with category data; extract resolution or circumvention action from Problem Db; support required?; execute resolution action
      • No: new Problem alert; raise new record on Problem Db; allocate to PM Team
    • Process Incident / Problem
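The matching flow can be sketched in code: check whether the Incident is routine, then look for a match in the Known Errors database, then in the Problem database, and raise a new Problem record if nothing matches. Everything below is a simplified assumption: the record shapes, the exact-symptom lookup and the `PRB-n` identifier scheme are hypothetical, and real tools match on much richer data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Record:
    """Hypothetical Problem or Known Error record."""
    record_id: str
    symptom: str
    workaround: Optional[str] = None
    incident_count: int = 0


@dataclass
class Incident:
    symptom: str
    routine: bool = False
    linked_record: Optional[str] = None
    actions: List[str] = field(default_factory=list)


def match_incident(
    incident: Incident,
    known_errors: Dict[str, Record],
    problems: Dict[str, Record],
) -> str:
    """Return 'routine', 'known_error', 'problem' or 'new_problem'."""
    if incident.routine:
        incident.actions.append("follow routine procedure")
        return "routine"
    # Known Errors Db is consulted before the Problem Db.
    for label, db in (("known_error", known_errors), ("problem", problems)):
        record = db.get(incident.symptom)  # assumption: exact symptom match
        if record is not None:
            record.incident_count += 1                 # update incident count on record
            incident.linked_record = record.record_id  # update incident with record ID
            if record.workaround:
                incident.actions.append(f"apply work-around: {record.workaround}")
            return label
    # No match anywhere: raise a new Problem record and hand it to Problem Management.
    new_id = f"PRB-{len(problems) + 1}"
    problems[incident.symptom] = Record(new_id, incident.symptom, incident_count=1)
    incident.linked_record = new_id
    incident.actions.append("allocate to Problem Management team")
    return "new_problem"


known_errors = {"printer offline": Record("KE-1", "printer offline", "restart spooler")}
problems: Dict[str, Record] = {}
first = match_incident(Incident("printer offline"), known_errors, problems)
second = match_incident(Incident("vpn drops"), known_errors, problems)
third = match_incident(Incident("vpn drops"), known_errors, problems)
```

In this run the first Incident matches a Known Error, the second raises a new Problem, and the third matches that new Problem and increments its count.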
  • Handling Major Incidents
    • Notify the Problem Manager
    • The Problem Manager should arrange a formal meeting with interested parties (or regular meetings if necessary):
      • All key in-house support staff
      • Vendor support staff
      • IT Services management
      • Service Desk representative
      • Customer representative
  • Goal Of Problem Management
    • The goal of Problem Management is to minimize the adverse impact of Incidents and Problems on the business that are caused by errors within the IT Infrastructure, and to prevent recurrence of Incidents related to these errors
    • In order to achieve this goal, Problem Management seeks to get to the root cause of Incidents and then initiate actions to improve or correct the situation. Problem Management achieves this by being both reactive and proactive
  • Definitions
    • Problem
      • A condition identified from multiple incidents exhibiting common symptoms, or from a single significant Incident, indicative of a single error, for which the cause is unknown
    • Known Error
      • A condition identified by successful diagnosis of the root cause of a problem, when it is confirmed which CI is at fault
  • Reactive Problem Management
    • Problem Control: Problem Identification & Recording; Problem Classification; Problem Investigation & Diagnosis; Problem Resolution & Evolution To Known Error; with Tracking & Monitoring Of Problems throughout
    • Error Control: Error Identification & Recording; Error Assessment; Record Error Resolution; RFC; Change Successfully Implemented; Close Error & Associated Problems; with Tracking & Monitoring Of Errors throughout
  • Problem Control
    • The aim of Problem control is to identify the root cause, such as which CIs are at fault, and to provide the Service Desk with information and advice on Work-arounds when available
    • Problem Control activities: Problem Identification & Recording; Problem Classification; Problem Investigation & Diagnosis; RFC & Problem Resolution & Closure; with Tracking & Monitoring Of Problems throughout
  • Error Control
    • Error Control covers the processes involved in progressing Known Errors until they are eliminated by the successful implementation of a Change under the Change Management process
    • The objective of error control is to be aware of errors, to monitor them and to eliminate them when feasible and cost-justifiable
    • Error Control activities: Error Identification & Recording; Error Assessment; Record Error Resolution; RFC; Change Successfully Implemented; Close Error & Associated Problem(s); with Tracking & Monitoring Of Errors throughout
  • Problem Closure & Resolution
    • When a Problem has been resolved the Problem Manager should ensure that:
      • The details of the actions taken to resolve the problem are concise and readable
      • Classification is complete and accurate according to the root cause
      • Resolution/action is agreed with the Customer
      • All details applicable are recorded such that:
        • The Customer is satisfied
        • Cost-center project codes are allocated
        • Time spent on the problem is recorded
        • The person, date/time of closure is recorded
  • Error Assessment
    • The initial assessment may be in collaboration with specialist staff
    • Keep in mind maintenance targets for third party products
    • Follow an error cycle for software environments
  • Known Error Sources
    • There are two main sources of Known Errors:
      • Problem control subsystem
        • Problem record forms basis of Known Error record
      • Development activity
        • The data needs to be made available by the developers to the Problem or Incident Manager or other live environment custodian
  • The Error Cycle In The Live & Development Environments
    • Live Operations: Problems; Investigation & Diagnosis; Live Known Errors Db
    • Applications Development & Maintenance: Problems; Investigation & Diagnosis; Development Known Errors Db
    • Change Management: Requests For Change; Release
  • Proactive Problem Management
    • Proactive Problem Management is concerned with:
      • Identifying Problems and Known Errors
      • Resolving Problems and Known Errors
      • Before incidents occur
    • Problem prevention ranges from:
      • Prevention of individual problems
      • To strategic decisions
      • And supplying Customers with information to stop them from calling for assistance
  • Trend Analysis
    • Incident and problem analysis can reveal:
      • Trends such as the post-change occurrence of particular problem types
      • Emerging faults of a particular type
      • Recurring problems of a particular type or with an individual item
      • The need for more customer training or better documentation
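A first pass at this kind of trend analysis is simply counting Incidents per category and item and flagging those that recur. The sketch below assumes incidents are available as (category, configuration item) pairs; the sample data and the threshold of three occurrences are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple

IncidentKey = Tuple[str, str]  # (category, configuration item)


def recurring(
    incidents: Iterable[IncidentKey], threshold: int = 3
) -> List[Tuple[IncidentKey, int]]:
    """Return (category, item) pairs seen at least `threshold` times, most frequent first."""
    counts = Counter(incidents)
    return [(key, n) for key, n in counts.most_common() if n >= threshold]


# Hypothetical incident log.
log = (
    [("email", "mail01")] * 4
    + [("network", "sw02")] * 3
    + [("print", "prn07")]
)
hotspots = recurring(log)
```

The pairs returned are candidates for a Problem record or for proactive action such as customer training.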
  • Targeting Preventive Action
    • The business impact (the ‘pain factor’) must be identified, taking into account:
      • The volume of incidents
      • The number of Customers impacted
      • The duration and related costs of resolving the incidents
      • The cost to the business
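The slide names the inputs to the 'pain factor' but not how to combine them. One hedged way to compare candidate problems is to score each by its share of every factor's total, with equal weights by default. The scoring rule, the weights and the `Candidate` fields below are all assumptions for illustration, not an ITIL-defined formula.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

FACTORS = ("incident_volume", "customers_impacted", "resolution_cost", "business_cost")


@dataclass
class Candidate:
    """Hypothetical per-problem figures for the four factors on the slide."""
    name: str
    incident_volume: int
    customers_impacted: int
    resolution_cost: float  # duration and related costs of resolving the incidents
    business_cost: float    # cost to the business


def rank_by_pain(
    candidates: List[Candidate], weights: Optional[Dict[str, float]] = None
) -> List[Tuple[str, float]]:
    """Rank candidates by a weighted sum of their share of each factor's total."""
    weights = weights or {f: 1.0 for f in FACTORS}  # assumed: equal weights
    totals = {f: sum(getattr(c, f) for c in candidates) for f in FACTORS}
    scored = []
    for c in candidates:
        score = sum(
            weights[f] * getattr(c, f) / totals[f]
            for f in FACTORS
            if totals[f] > 0  # skip factors with no data to avoid dividing by zero
        )
        scored.append((c.name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidates.
ranking = rank_by_pain([
    Candidate("B", 10, 100, 1000.0, 1000.0),
    Candidate("A", 30, 100, 3000.0, 9000.0),
])
```

Using shares rather than raw values keeps counts and currency amounts comparable; the weights would need calibrating against what the business actually considers painful.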
  • Targeting Preventive Action
    • The appropriate actions may include:
      • Raising an RFC
      • Providing feedback regarding testing, procedures, training and documentation
      • Initiating Customer education and training
      • Initiating Service Support staff education and training
      • Ensuring adherence to Problem and Incident Management procedures
      • Process or procedural improvement
  • Problem Reviews
    • The appropriate people involved in the resolution should be called to the review:
      • All key in-house support staff
      • Vendor support staff
      • IT Services management
      • Service Desk representative
      • Customer representative
    • The review should determine:
      • What was done right
      • What was done wrong
      • What could be done better next time
      • How to prevent the Problem from happening again
  • From Incident(s) To A Problem To A Known Error To A Change
    • Incident Management: related Incidents are grouped through Incident Matching; a workaround provides a temporary solution
    • Problem Management: a Problem is raised; once the root cause is determined and the CI at fault is identified, the Problem evolves into an error record (Known Error)
    • Change Management: an RFC leads to a Change
  • ITIL Certifications
  • Thank You! [email_address] www.linkedin/in/rondrew