The Information Technology Infrastructure Library (ITIL) provides a framework of “Best Practice” guidance for IT Service Management. It is the most widely used framework of its kind in the world and the de facto standard in IT Service Management.
ITIL was developed by the United Kingdom’s Office of Government Commerce (OGC) and is captured in a series of books.
IT Service Management is a top-down, business-driven approach to the management of IT that specifically addresses the strategic business value generated by the IT organization and the need to deliver a high-quality IT service.
IT Service Management is designed to focus on the people, processes and technology issues that IT organizations face.
Relationship Between Innovation & Formalization
Diagram labels: Innovation, Formalization, Business Goals, Process Effectiveness, IT Strategy, People/Culture, Technology, Processes
The right time & place
The right quality & quantity
“Right” = at a reasonable cost
Business Alignment
Corporate level: Corporate Vision, Corporate Mission, Corporate SWOT, Corporate Strategies and Goals
IT level: IT Vision, IT Mission, IT SWOT, IT Strategic Goals, IT Tactical Goals, IT Operational Goals
Questions posed in the diagram: Are we working towards achieving our vision? Are we operating within the purpose and scope? What needs to be measured? (asked three times)
Diagram bands: Business Objectives, IT Objectives, Performance Indicators
Service Versus System
Components behind the e-mail service: Mainframe, Gateway, WAN, Router, Server, Hub, PCs
System Availability of the individual systems: 99.96%, 98%, 97.5%, 96%
Availability = Ops * WAN * LAN * Desktop
Total Availability of Email Service = 0.9996 * 0.98 * 0.975 * 0.96 = 91.69%
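The service availability figure can be checked with a few lines of Python. This is a minimal sketch that assumes the four systems sit in series, so the e-mail service is up only when all of them are up; the function name `service_availability` is illustrative and not part of ITIL.

```python
from math import prod


def service_availability(system_availabilities):
    """Return end-to-end availability for systems connected in series.

    Each entry is a fraction between 0 and 1. The service is available
    only when every system is available, so the fractions multiply.
    """
    return prod(system_availabilities)


# System availability figures quoted on the slide, written as fractions.
systems = [0.9996, 0.98, 0.975, 0.96]

total = service_availability(systems)
print(f"Total availability of e-mail service: {total:.2%}")  # 91.69%
```

The result is lower than the availability of the weakest single system (96%), which is the point of measuring the service rather than each system.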
Service Portfolio Management
DEFINE: Collect information and inventories of existing services. Establish the requirements for the requested service, and establish the business case for implementing the service.
ANALYZE: Review the long-term business goals, and determine what services are required to meet those goals. Then analyze the requested service for financial viability, operational capability, and technical feasibility to determine how the organization is going to get there. (You may decide to obtain the service from an outsourcer rather than develop it internally.)
APPROVE: Make a decision to retain, replace, renew, or retire the services.
CHARTER: Communicate action items to the organization to implement the approved service, and allocate budget and resources.
The Define, Analyze, and Approve steps are described in the ITIL V3 Service Strategy book. The Charter step is discussed in the ITIL V3 Service Design book.
Configuration Management System (CMS)
The CMS was introduced in ITIL V3. It provides a strong foundation for the service catalog. The CMS is an ecosystem that feeds, manages, analyzes, and presents the information contained in the Configuration Management Database (CMDB), which is another fundamental component of ITIL. Although the CMDB is depicted in the ITIL books as merely a core component of the CMS, a well-architected, federated CMDB implements much of the functionality of the CMS.
Service Catalog
The Service Catalog is a subset of the service portfolio that is visible to customers.
Provides the service consumer view.
Services Available to consumers
Shows relationships of services to business units/processes
Provides IT view.
Makeup of services
Shows relationships of services to enterprise infrastructure elements that support them.
Processes vs. Departments
Department View: IT Service Desk, 2nd Dept., 3rd Dept.
Process View: Step 1, Step 2, Step 3, Step 4, Step 5
Information Technology Infrastructure Library (ITIL) Overview
Service Desk: Provide a strategic central point of contact for customers and an operational single point of contact for managing incidents to resolution. (The Service Desk is actually a function, not a process.)
Incident Management: Restore normal service operation as quickly as possible and minimize adverse impact on business operations within the defined parameters of service level agreements.
Problem Management: Minimize the impact on the business of incidents and problems that are caused by errors in the IT infrastructure, and prevent recurrence of incidents related to these errors.
Change Management: Ensure standardized methods and procedures are used for efficient and prompt handling of all changes, to minimize the impact of change-related incidents on service quality and consequently improve day-to-day operations.
Timeline: 6 Months
High Level Process Model: 6-8 wks
Authority Matrices: 1-2 wks
Procedures / Policy Documents, Functional tool requirements, Process workshops, Skills development, Communication Plan: 16 Weeks
Management Information (Quality Norms and Metrics)
Phase labels: What, Who, How
Generic Process Model
Process: A connected series of actions, activities, changes etc. performed by agents with the intent of satisfying a purpose or achieving a goal.
Process Control: Process Owner, Process Goal, Quality Parameters & Key Performance Indicators
Process: Input & Input Specifications, Activities & Sub-Processes, Output & Output Specifications
Process Enablers: Resources, Roles
Define the roles required to execute the new process activities and procedures
Identify major tasks and responsibilities for each role
Identify skill requirements for each role
Identify new roles to the organization and begin working with HR to get approval on these roles
Identify major changes to existing roles and begin working with HR to get approval on these roles
Map the roles and responsibilities back to the organization (Authority Matrix – ARCI / Accountable, Responsible, Consulted and Informed)
Measurement Framework
Perspectives: Financial, Customer, Innovation, Internal
Example Goals: Improve Quality Of Service, Improve Management Control, Employ New Technology, Reduce Cost
Process Measures
Balanced Scorecard Approach
To understand something, you must look at it in more than one way.
Service Support KPIs

Process: Incident
Core Objective: Restore service degradations to expected level ASAP
Example Core KPIs: # of incidents by category, priority and resolution type by LOB; # of incidents restored within SLA targets
Category: Quality, Performance

Process: Problem
Core Objective: Identify systemic infrastructure errors and eliminate them to minimize impact and improve availability
Example Core KPIs: # of problems identified and root cause determined with solution or workaround; # of repeat incidents by category trending downwards
Category: Quality, Value

Process: Change
Core Objective: Handle changes efficiently while minimizing impact to service delivery
Example Core KPIs: # of changes by type / category / group / customer (emergency changes trending down); # of changes that have resulting incidents, or fail and have to be backed out
Category: Quality, Value

Process: Config.
Core Objective: Identify / control / manage IT resources within a Configuration Management Database
Example Core KPIs: % of CMDB data population and accuracy vs actual, according to scope; % growth or change by CI type over an elapsed time period
Category: Quality, Value

Process: Release
Core Objective: Ensure production readiness, quality and authorization of new or modified CIs and their planned deployment
Example Core KPIs: # of releases by type that satisfy release management criteria when submitted to Change; # of releases that bypass the process
Category: Quality, Compliance
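As an illustration of one Incident Management KPI listed here, the share of incidents restored within SLA targets can be computed from per-incident restore times. This is a sketch only: the input shape (pairs of actual restore minutes and SLA target minutes) and the function name are assumptions made for the example, not something the slides define.

```python
def pct_restored_within_sla(incidents):
    """Return the percentage of incidents restored within their SLA target.

    `incidents` is an iterable of (restore_minutes, sla_target_minutes)
    pairs; this record shape is an assumption made for the example.
    Returns None when there are no incidents to measure.
    """
    incidents = list(incidents)
    if not incidents:
        return None
    met = sum(1 for actual, target in incidents if actual <= target)
    return 100.0 * met / len(incidents)


# Hypothetical sample: three of the four incidents met their target.
sample = [(30, 60), (90, 60), (45, 45), (10, 240)]
print(pct_restored_within_sla(sample))  # 75.0
```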
Service Delivery KPIs

Process: SLM
Core Objective: Define services; agree on level, scope, quality, performance; monitor and manage
Example Core KPI: % score of customer satisfaction survey trends up over time (i.e. Customer Satisfaction Survey)
Category: Value

Process: Avail.
Core Objective: Define and plan for service availability to meet or exceed stated business requirements through process, technology and people resource planning and implementation
Example Core KPI: % of service availability within SLA negotiated requirements
Category: Quality

Process: Capacity
Core Objective: Current and future resources are greater than or equal to demand, but excess is planned
Example Core KPI: % of components that breach tolerance thresholds in correspondence to planned capacity levels for components and complete IT systems
Category: Quality

Process: ITSCM
Core Objective: Recover IT systems to normal state in an alternate way after a disaster within an expected timeframe
Example Core KPIs: % of systems that fail recovery test; time to execute test of plan and recover IT services in a contingency state against expected targets
Category: Quality, Performance

Process: Finance
Core Objective: Plan for and deliver IT services within a forecasted budget against actual cost
Example Core KPI: % deviation of forecasted versus actual cost of IT services within defined tolerance limits (% of deviation, $ of deviation)
Category: Compliance
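The Finance KPI (deviation of forecasted versus actual cost, in dollars and in percent, checked against a tolerance limit) lends itself to a short worked example. The sketch below assumes deviation is measured relative to the forecast; the function names and the sample figures are hypothetical.

```python
def cost_deviation(forecast, actual):
    """Return (dollar deviation, percent deviation) of actual vs forecast cost.

    Percent deviation is taken relative to the forecast, which is an
    assumption; the slides do not state the base.
    """
    dollars = actual - forecast
    percent = 100.0 * dollars / forecast
    return dollars, percent


def within_tolerance(forecast, actual, tolerance_pct):
    """True when the absolute percent deviation does not exceed the tolerance."""
    _, percent = cost_deviation(forecast, actual)
    return abs(percent) <= tolerance_pct


# Hypothetical figures: a 200,000 forecast against 210,000 actual spend.
print(cost_deviation(200_000, 210_000))         # (10000, 5.0)
print(within_tolerance(200_000, 210_000, 5.0))  # True
```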
To restore normal service operation as quickly as possible and minimize the adverse impact on business operations
Incident management can be performed by 1st, 2nd, 3rd,…, Nth level support teams, including 3rd party vendors and management
An incident is a deviation from the expected standard operation of a system or service
Incident Management Activities
Incident Detection & Recording
Classification & Initial Support
Service Request? Yes: Service Request Procedure. No: continue.
Investigation & Diagnosis
Resolution & Recovery
Incident Closure
Throughout: Ownership, Monitoring, Tracking & Communication
Incident Management Process
Incidents enter the process from: Service Desk, Computer Operations, Networking, Procedures, Other Sources Of Incidents
Interfaces: Service Request Procedures; Routing; Monitoring; CMDB (Configuration Details); Problem & Error DB (Resolution/Work-around); Change Management Process (RFC, Resolution)
Leaving the process: Resolutions/Work-arounds
First, Second & Third Line Support
1st level: Incident Detection & Recording; Service Request? (Yes: Service Request Procedure); Classification & Initial Support; Investigation & Diagnosis; Resolution & Recovery; Resolved? (Yes: Incident Closure. No: pass to 2nd level)
2nd level: Investigation & Diagnosis; Resolution & Recovery; Resolved? (Yes: Incident Closure. No: pass to 3rd level)
3rd level: Investigation & Diagnosis; Resolution & Recovery; Resolved? (Yes: Incident Closure. No: pass to Nth level)
Nth level: Etc.
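The level-by-level handoff can be sketched as a simple loop: each support level attempts resolution, and an unresolved incident passes to the next level. This is an illustrative model only; the `escalate` function, the `difficulty` field, and the resolver callables are assumptions introduced for the example.

```python
def escalate(incident, support_levels):
    """Pass an incident through support levels until one resolves it.

    `support_levels` is an ordered list of (name, try_resolve) pairs,
    where try_resolve(incident) returns True if that level resolved it.
    Returns the name of the resolving level, or None if no level did.
    """
    for name, try_resolve in support_levels:
        # Each level runs its own investigation, diagnosis and recovery.
        if try_resolve(incident):
            return name  # Resolved: the incident can be closed here.
    return None  # Still unresolved after the last configured level.


# Hypothetical levels: each can handle incidents up to a given difficulty.
levels = [
    ("1st level", lambda inc: inc["difficulty"] <= 1),
    ("2nd level", lambda inc: inc["difficulty"] <= 2),
    ("3rd level", lambda inc: inc["difficulty"] <= 3),
]

print(escalate({"difficulty": 2}, levels))  # 2nd level
```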
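Prioritization (priority from IMPACT and URGENCY; 1 = Highest Priority)
Urgency High: Impact High = 1, Impact Medium = 2, Impact Low = 3
Urgency Medium: Impact High = 2, Impact Medium = 3, Impact Low = 4
Urgency Low: Impact High = 3, Impact Medium = 4, Impact Low = 5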
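The priority matrix is symmetric in impact and urgency, so it can be reproduced with a small lookup. The sketch below scores Low, Medium and High as 0, 1 and 2 and subtracts their sum from 5. This formula happens to reproduce the nine values in the matrix, but it is an observation about this particular matrix and not an ITIL rule.

```python
LEVELS = {"low": 0, "medium": 1, "high": 2}


def priority(impact, urgency):
    """Return the priority (1 = highest, 5 = lowest) for an incident.

    Reproduces the 3x3 impact/urgency matrix: High/High gives 1,
    Low/Low gives 5, and each step down on either axis adds 1.
    """
    return 5 - (LEVELS[impact.lower()] + LEVELS[urgency.lower()])


print(priority("High", "High"))      # 1
print(priority("Medium", "Medium"))  # 3
print(priority("Low", "Low"))        # 5
```

An organization with an asymmetric matrix would replace the formula with an explicit table keyed on (impact, urgency).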
Incident Matching Procedure
Incident alert
Routine Incident? Yes: Assess Incident and follow routine procedure. No: continue.
Match on Known Errors Db? Yes: Update Incident count on Known Error record; Update Incident record with ID of Known Error; Update Incident record with category data; Extract resolution or circumvention action from Known Errors Db; Support required?; Execute resolution action. No: continue.
Match on Problem Db? Yes: Update Incident count on Problem Record; Update Incident record with ID of Problem; Update Incident record with category data; Extract resolution or circumvention action from Problem Db; Support required?; Execute resolution action. No: New Problem.
New Problem: Raise new record on Problem Db; Allocate to PM Team; Process Incident/Problem.
Also shown in the diagram: Inform Customer of Work-around.
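The matching flow can be sketched in code: check whether the incident is routine, then look for a match in the Known Errors database, then in the Problem database, and raise a new problem record if neither matches. Everything concrete here is an assumption made for illustration: the record fields, matching on an exact symptom string, the `PRB-n` identifier format, and counting the triggering incident on a newly raised problem.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """A Known Error or Problem record (fields are illustrative)."""
    record_id: str
    symptom: str
    workaround: Optional[str] = None
    incident_count: int = 0


@dataclass
class Incident:
    symptom: str
    routine: bool = False
    linked_record: Optional[str] = None
    action: Optional[str] = None


def find(db, symptom):
    """Return the first record with the same symptom, or None."""
    return next((r for r in db if r.symptom == symptom), None)


def match_incident(incident, known_errors, problems):
    """Match an incident and return which branch of the flow was taken."""
    if incident.routine:
        return "routine"  # Handled by the routine procedure instead.
    # Known Errors are checked before Problems, as in the procedure.
    for label, db in (("known_error", known_errors), ("problem", problems)):
        record = find(db, incident.symptom)
        if record is not None:
            record.incident_count += 1                  # update incident count
            incident.linked_record = record.record_id  # link incident to record
            incident.action = record.workaround        # extract resolution action
            return label
    # No match anywhere: raise a new problem record (ID format is made up).
    new = Record(f"PRB-{len(problems) + 1}", incident.symptom, incident_count=1)
    problems.append(new)
    incident.linked_record = new.record_id
    return "new_problem"


known_errors = [Record("KE-1", "mail down", "restart relay")]
problems = [Record("PRB-1", "slow login")]
print(match_incident(Incident("mail down"), known_errors, problems))
```

A real implementation would match on classification data and symptoms with some tolerance instead of exact string equality.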
The goal of Problem Management is to minimize the adverse impact of Incidents and Problems on the business that are caused by errors within the IT Infrastructure, and to prevent recurrence of Incidents related to these errors
In order to achieve this goal, Problem Management seeks to get to the root cause of Incidents and then initiate actions to improve or correct the situation. Problem Management achieves this by being both reactive and proactive
Problem: A condition identified from multiple incidents exhibiting common symptoms, or from a single significant Incident, indicative of a single error, for which the cause is unknown
Known Error: A condition identified by successful diagnosis of the root cause of a problem, when it is confirmed which CI is at fault
Reactive Problem Management
Problem Control: Problem Identification & Recording; Problem Classification; Problem Investigation & Diagnosis; Problem Resolution & Evolution To Known Error (with Tracking & Monitoring Of Problems throughout)
Error Control: Error Identification & Recording; Error Assessment; Record Error Resolution; RFC; Change Successfully Implemented; Close Error & Associated Problems (with Tracking & Monitoring Of Errors throughout)
The data needs to be made available by the developers to the Problem or Incident Manager or other live environment custodian
The Error Cycle In The Live & Development Environments
Live Operations: Problems; Investigation & Diagnosis; Live Known Errors Db; Requests For Change; Change Management
Applications Development & Maintenance: Problems; Investigation & Diagnosis; Development Known Errors Db; Release
From Incident(s) To A Problem To A Known Error To A Change
Incident Management: related incidents (X X X X) are grouped by Incident Matching
Problem Management: Problem; Root cause determined and CI at Fault identified; Problem evolves into error record (Known Error); Workaround (Temporary solution)
Change Management: RFC; Change