ITIL v3 story

This story explains ITIL v3 through a narrative that mostly covers the Service Transition and Service Operation phases.


  1. www.differ.cz
     Story of support and maintenance according ITIL v3
     Part I. Operational activities
     Jaroslav Procházka, www.differ.cz
     Version 1.0, August 2011
     © 2011 Jaroslav Procházka, www.differ.cz
  2. Story motivation

     Nowadays we are driven by strong rationality (logical, rational, scientific, verifiable facts matter) and forget the irrational aspects and emotions in human decision making. If humans are rational, why the hell do they buy Apple products? ;) The same holds for stories and their power. Stories have been part of our cultures for many thousands of years and are the best way to transfer knowledge; see sociological, psychological or cognitive studies, e.g. Campbell: The Hero with a Thousand Faces or Turner: The Literary Mind. All the old epics, the Bible or the story of Buddha are stories that attract us; we like to hear the same variations of the hero's journey again and again. And it is stories that can differentiate us, our service, product or company from the many other vendors providing the same. Stories matter.

     Another application of stories in business is in knowledge management and sharing. Be honest: how often do you use your logical, structured, fact-based knowledge base? How easy is it to remember the content, steps and outcomes of such a record in the longer term? Now compare it with a colleague's story dramatically describing the same situation (you could hear it in the kitchen, during lunch, in the pub). Which one is easier to remember and follow? Big and respected companies like Xerox [1], 3M or NASA use stories to store and share knowledge inside the company. Storytelling is also part of modern leadership.

     The next motivation that triggered me to write this e-book is how hard process frameworks like IBM RUP® or ITIL® are to understand. Such misunderstanding causes problems with support, operations and maintenance of IT infrastructure, leading to poor quality, lost revenue, and dissatisfied teams and customers. The goal of this e-book is to spread the service-driven (ITSM) philosophy and service thinking using stories. The story focuses on the principles and concepts described by ITIL v3. We start with an end user affected by an issue and also solve the hidden root cause (in ITIL terms, Incident and Problem Management). Proactive investigation of root causes is a weak point of many teams and companies. We'll emphasize the key ideas of this approach, no matter whether you call it Problem Management, Kaizen, TQM or CMMI. The story also covers Configuration Management, which takes care of IT infrastructure items, Change Management, which processes change requests, and Release and Deploy Management, which builds the change.

     The second part of this story, to follow soon, focuses on the tactical and strategic activities of ITSM, namely service thinking, the connection to business and its scenarios, predictions, proactive thinking, contracting and service measurement (the so-called SLA). This is the core of ITSM/ITIL thinking.

     [1] For more, see http://choo.fis.utoronto.ca/mgt/KM.xeroxCase.html and http://www.kmworld.com/Articles/Editorial/Feature/Best-Practices-Eureka!-Xerox-discovers-way-to-grow-community-knowledge.-.-And-customer-satisfaction-9140.aspx
  3. Note

     This e-book does not replace ITIL training or certification, nor will it enable you to set up the right environment. Its purpose is to raise awareness of ITIL version 3: what it is, how it can help solve my issues, and how it differs from version 2. This material can give busy people an insight into ITIL: typically salespeople, customer representatives, customers, architects, senior managers and other key people.

     If you have any comments or ideas on how to improve this e-book, or you would like your domain covered in the story too (only application management as an incident contributor is covered), please send them to me by email (jarek@differ.cz). I also hope this short story inspires you to write your team or unit knowledge base in the form of short stories! They are more memorable and fun to write, and thus bring better value to their creators and consumers ;)
  4. A short introduction

     ITSM means IT Service Management, so the story mostly introduces the concept of IT service thinking and the operational activities connected to it. The best known and most widely used ITSM framework today is ITIL (IT Infrastructure Library), a library that guides us in IT infrastructure management covering software, hardware, networking, people, etc. ITIL brings a process approach to ITSM, and its key benefit is a common terminology, which is very important for communication between IT and business and among the different vendors in the chain. At the time of writing (July 2011), ITIL exists in version 3, but a refresh is being prepared for release and will be called ITIL 2011. The key difference between versions 2 and 3 is the newly introduced service lifecycle (see picture), which starts with the idea and strategy (Service Strategy phase) and ends with daily use and support (Service Operation). The story in this e-book covers the concepts and processes of version 3, specifically the Service Transition and Service Operation parts.

     Daily operations and support deal with necessary activities such as monitoring, data backups, implementing law amendments (e.g. in ERP applications) or reflecting changes in the assembly process of production and assembly lines. Recording, assessing, implementing, testing and integrating a change or new functionality is one part of those activities. But specific actions are also needed when an application or service incident affects end users and thus the value we provide to the customer. Depending on the number of users affected or the importance of the application, the cost of an incident can be really huge: e.g. 30 minutes of a stopped assembly line can mean 20 cars not assembled and delivered = 20 cars x 15,000 EUR per car = 300,000 EUR lost in just 30 minutes!

     A specific domain that needs our attention and automation is physical changes to the IT infrastructure: hardware or network. If not secured and automated properly, they can cause severe incidents with huge financial impact. The following box shows a more detailed example calculation of the cost of an incident:
  5. Simple incident cost calculation:

       Employee cost ........ 100 EUR / h
       Headcount ............ 200 in total
       Incident length ...... 3 h

     30 people cannot work for 3 hours because of a system incident. The cost of the impact can be simply calculated: 30 employees x 3 h x 100 EUR/h = 9,000 EUR of costs in one day that generate no value for the customer! If you multiply this number by the total number of incidents per year, you can get a pretty high number that could cover, e.g., the yearly IT budget or the cost of a totally new system or assembly line.

     Because of this, we also need early, highly automated identification and detection of incidents. The yang to this yin is built-in proactive root cause identification and resolution (so-called Problem Management). The backend functionality needed for efficient monitoring and problem management provides knowledge about the infrastructure: hardware and software configurations, software versions, licenses, people's locations, access right policies, etc. Advanced teams use a (semi)automated knowledge base that stores and proposes already solved issues, incidents, problems or complicated changes with many dependencies.

     Starring:
       Mary ..... business user affected by the system incident
       Pete ..... application programmer
       John ..... system administrator
       Adam ..... Service Desk support specialist
       And other stars…
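     The incident cost arithmetic from the box above, and the assembly line example before it, can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, and the figure of 12 incidents per year is an assumed value, not one given in the e-book.

```python
def incident_cost(affected_people, hours, hourly_cost_eur):
    """Cost of lost working time: people x hours x hourly cost."""
    return affected_people * hours * hourly_cost_eur


def lost_production_value(units_not_produced, price_per_unit_eur):
    """Value of goods that could not be produced during an outage."""
    return units_not_produced * price_per_unit_eur


def yearly_cost(cost_per_incident_eur, incidents_per_year):
    """Scale a single incident's cost to a whole year."""
    return cost_per_incident_eur * incidents_per_year


if __name__ == "__main__":
    per_incident = incident_cost(affected_people=30, hours=3, hourly_cost_eur=100)
    print(per_incident)                      # 9000 EUR, as in the box
    print(lost_production_value(20, 15000))  # 300000 EUR, the assembly line example
    # Assumption for illustration only: one such incident per month.
    print(yearly_cost(per_incident, incidents_per_year=12))
```

     Even with such a simple model, plugging in your own headcount and hourly cost gives a first estimate to compare against the cost of monitoring and automation.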
  6. Typical scenario of daily operations

     Until now, Mary kept paper records of incoming orders. Although her company had implemented an information system for the assembly line and the economic agenda, order processing was not part of that project. Paper records are not very efficient and cause problems when an order needs to be found quickly. Archiving is also a bit problematic: orders fade and become harder and harder to read. Mary is happy because order processing was recently automated by a software program and integrated with the assembly line information system. Rework, searching and archiving issues have dropped to almost zero and Mary can enjoy her work.

     It's Monday morning. Mary uses an application called WarehouseAndOrders v1.1 to process orders for the assembly line, but after one hour of work the software client crashes and she cannot start it again. So she calls her friend Pete, an application programmer, to help her solve this incident [2]. Pete works for the IT company that delivers and operates this application and has known Mary since university. They are still friends and meet regularly. Pete is happy to hear from Mary again, so they have a little chat and, by the way, Mary also mentions the incident. Pete makes a note but forgets it immediately because of the heavy workload caused by an upcoming release. Mary waits for Pete's resolution and does some unimportant tasks so as not to be bored. She brings the incident up again at Monday lunch, which they usually have with their whole old university group. So as not to be embarrassed, Pete asks about the symptoms Mary observed in the morning (any error message, system behavior, etc.). But after a few of Mary's comments Pete immediately changes the subject. A few hours have passed since the incident, so Mary doesn't remember anything significant, and that annoys Pete. It is no surprise that Pete continues with his development tasks after lunch and forgets about Mary and her issue. Mary still cannot use the application and process incoming orders.

     [2] From here on we differentiate between the key terms incident and problem, because they have totally different meanings in ITIL terminology:
     An incident is an event that causes availability or quality problems of an IT service, or a part of it, as perceived by the end user. It can concern response time, number of processed transactions, volume, inaccessibility of the service, etc. An incident can usually be solved by a so-called workaround (typically a server or application restart), but this does not remove the hidden root cause! It only allows the service to operate again at the agreed quality.
     A problem is the hidden root cause of one or more incidents, which may or may not be evident yet. A problem can be solved only by a structural solution, e.g. a change in the IT infrastructure or a bug fix in the application source code.
  7. It is Monday afternoon and Mary calls Pete again to hear about progress. Pete gets angry because Mary keeps interrupting him. He needs to finish build testing for the upcoming deployment. Pete wants to get rid of Mary, so he stops testing and starts investigating the incident. Mary is still waiting and doing unimportant tasks while incoming orders go unprocessed. Mary realizes the day will not bring a solution and goes home early. Pete stays until 8 pm, busy identifying the infrastructure: what are the parts of this nasty program? Which servers does it run on? What middleware, databases and other connectors does it use?

     Tuesday morning is an important deadline for Pete: he needs to finish the new release package for deployment. That is why he comes in early even though he finished late the day before. Mary arrives later this morning, to be sure the incident is already solved and her time is not wasted. Pete focuses on finishing build testing and packaging. When ready, he continues with the incident investigation. Finally he finds out which servers run the WarehouseAndOrders application, and both are Linux servers! Thank the IT God, Pete is a Linux fan and a skilled Linux programmer, so he wants to start investigating, but the lack of an account immediately stops him. Pete is proactive and calls John (the system admin) to get any account that lets him in. John, as a good friend, shares the root account on the assumption that Pete will create a personal account and upload some new movies and mp3s. Why the hell would he otherwise ask for access to this server?

     Pete skips lunch today because of his heavy workload. Tuesday afternoon brings the following steps. Pete logs in to the Linux server as root and searches for the WarehouseAndOrders program directory and the other underlying applications and database servers. He plans to investigate the logs to learn more about the situation, but when starting MC (Midnight Commander) he accidentally notices that the server's hard drive is full. Because Pete is busy but wants to help Mary at the same time, he does not bother creating his own account and setting the rights, but just deletes some temp and log files as root. He calls John to restart the Oracle DB and the Apache Tomcat web server; both were down and both are used by the WarehouseAndOrders application. In fact, Pete does not want to waste time looking for the admin interface. What's more, it's John's responsibility anyway. John is confused by this request (Pete does not usually work for the customer using those servers), but he does what is asked without any notice to end users. After the restart, John tells Pete to check that everything works as expected.
  8. Pete can now call Mary to say that the WarehouseAndOrders application v1.1 is running again. Mary is very grateful, thanks Pete and starts processing the orders waiting in the queue. Pete forgets the whole story and continues with his assignment: the build needs to be tested and packaged for tomorrow's deployment. Pete stays in the office until 8 pm again to finish all required steps.

     Wednesday morning looks like an ordinary day, with Mary processing the orders in the queue. After 2 hours the same incident occurs again, which makes Mary angry. She calls Pete to ask if he knows anything about the issue; maybe he's improving the application, she assumes. But nobody answers the office phone. The reason is obvious to our reader, but not to Mary: Pete is traveling to the customer's premises to install the new release, because it cannot be done remotely. Mary does not resign herself to waiting; she calls Pete's mobile phone and explains the situation. Pete contacts John and quickly briefs him on the issue and its context. John finally understands why Pete asked for the Linux account and the server restart. The reason was not mp3s or new movies but an incident! At least John now knows some context, the servers used and the symptoms of the issue. Pete continues with the installation of the customer release, finally without any disturbances. John starts investigating the IT environment and notices the full server disk. He backs up selected log files to a different server for further analysis, deletes the originals and restarts the Oracle DB and only the failed instances of the Apache Tomcat web server. He tries the WarehouseAndOrders application and sees everything working, but he does not contact Mary until he is sure the incident will not occur again.

     John, as system admin, is surprised by the full server disk. There cannot be that many movies and mp3s stored on the server, he thinks aloud… He postpones lunch and starts investigating the incident's deeper root cause. How can the server disk be full? He writes a workaround script that regularly backs up selected log and temp files to a different server and then removes the originals. John wants to download the log files to his computer for further investigation and analysis, and only because of the long download time does he notice how big the Oracle DB log is! How the heck can today's Oracle log be almost 3 GB? He opens the log in its original server folder and after a few minutes of investigation notices programmer's error reports. After this finding he extends the workaround script to cover this log as well. The script is then quickly tested with the expected result, so nothing hinders its deployment. Only after this does John call Mary to tell her she can use the application again. John still wonders what error can cause such a huge Oracle log and whether it is the only contributor to the full disk. He searches Internet forums to see if somebody has already tackled a similar issue, but finds nothing. He reports the defect to Oracle Corporation and waits for a reply. Finally he can go for his Wednesday lunch.
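     The e-book does not show John's workaround script, so the following is only a minimal sketch of the idea (back up selected log and temp files, then remove the originals). It assumes the "different server" is reachable as a mounted directory, and the function name, file patterns and paths are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def archive_logs(log_dir, backup_dir, patterns=("*.log", "*.tmp")):
    """Copy matching files to backup_dir, then delete the originals.

    Returns the list of backup paths that were created.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    archived = []
    for pattern in patterns:
        for src in sorted(log_dir.glob(pattern)):
            if not src.is_file():
                continue
            dst = backup_dir / src.name
            shutil.copy2(src, dst)  # back up first, keeping timestamps
            src.unlink()            # remove the original only after the copy succeeded
            archived.append(dst)
    return archived


if __name__ == "__main__":
    # Demo in a throwaway directory; real paths would be site specific.
    with tempfile.TemporaryDirectory() as tmp:
        logs = Path(tmp) / "logs"
        logs.mkdir()
        (logs / "alert.log").write_text("programmer's error report\n")
        print([p.name for p in archive_logs(logs, Path(tmp) / "backup")])
```

     In the story the script runs regularly, which in practice would mean scheduling it, for example with cron. As the scenario stresses, this only keeps the service running; it does not remove the root cause.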
  9. Scenario conclusion

     Although some actions described in this scenario may seem striking and funny, many IT and non-IT organizations work this way. And if you discuss the topic with them and point out some anti-patterns, they are not aware of anything odd and are surprised by your remarks about efficiency and potential risks. Moreover, this story is our personal experience from previous assignments. Let's summarize the story:

       • Mary could not process orders for almost 2 days. This could affect the company's cash flow and reputation or even generate losses, but nobody cared.
       • Pete was frequently disturbed, switched context and was overloaded.
       • John, as system administrator, started to investigate the hidden root cause of the incident (doing his job) only 2 days after the incident was first discovered.
       • Because of the disturbances and Pete's tiredness, the build could contain unnoticed defects.
       • Pete accessed restricted production servers as root and deleted files there as root.
       • The same incident occurred again within a short time and affected the end user.
       • The hidden root cause generating the incident is still not uncovered and resolved.
  10. ITIL v3 scenario

      Let's discuss the same story following ITSM principles. This is how it looked after 3 months of implementation effort. The same stars perform the story, but the approach to incident resolution is different. We again focus on Service Transition and Service Operation activities.

      The situation with the IT systems is the same as in the first scenario. We start the story on Monday morning again, when Mary enters the office and starts to use the Orders&Warehousing IT service [3], no longer the WarehouseAndOrders application. She does not need to care about the different parts of the service, start a program client or renew licenses. She just uses her browser and a link to run the Orders&Warehousing IT service. The IT service works as expected and no warning symptoms occur. At the same time, standard monitoring and event reporting [4] is set up and working. Business users, Mary among them, do not even know about this monitoring. IT specialists together with Service Desk specialists set the thresholds for specific components, servers and their events. These events can trigger deeper investigation by a specialist or can automatically report an incident. This morning the monitoring system started to report several "lack of free disk space" events on the Orders&Warehousing IT service server [5]. Service Desk specialists started to investigate those events, but meanwhile the Orders&Warehousing IT service froze and stopped responding.

      Mary reports an incident using the Service Desk (SD) tool. The SD tool, together with the phone, is the single point of contact (SPOC) for communication with the IT service vendor. A reported incident record contains the incident description (observed symptoms), the priority for the user (e.g. only one user of the service is affected, a department or team is affected, or the whole company is affected) and the name of the service, chosen from the list of provided services. Reporting automatically notifies the Incident Manager of the relevant service (and/or customer). The Incident Manager does the first check of the incident record and assigns the expected category (e.g. hardware, network, application, premises, licenses) and a priority that reflects not only the end user's perception but also other services and the business impact. The resulting priority in this case is high although only Mary uses the service, because the service supports processing of incoming orders, and its unavailability can stop the assembly line and affect the company's business and cash flow. Adam is assigned to the incident because he is marked as free in the Service Desk dashboard; he is automatically notified about it, and so is Mary. All these steps take just a few minutes, approximately the time needed to read this page.

      [3] An IT service is a means of delivering value to the customer using IT resources. The customer gets the specific outcomes needed to run the business without owning and managing the costs and risks connected to IT. The customer does not care about software, hardware, networks, licenses, premises, people, upgrades and patches or monitoring. The customer just buys the IT service as a commodity, and an external or internal vendor takes care of operations, support and maintenance.
      [4] Typically log changes, state monitoring or user events are processed for incident triggering.
      [5] These activities are performed as part of Event Management and are tightly connected with monitoring and monitoring systems.
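      The incident record and the first check by the Incident Manager described above can be sketched as a small data structure. This is an illustration only, not how any particular Service Desk tool works: the three-level scale, the BUSINESS_IMPACT table, the rule "take the higher of user priority and business impact" and the choice of the first free specialist are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Incident:
    service: str                  # chosen from the list of provided services
    description: str              # observed symptoms
    user_priority: Level          # how widely users are affected, as reported
    category: str = "unassigned"  # e.g. hardware, network, application
    priority: Level = Level.LOW
    assignee: str = ""


# Hypothetical business criticality per service; in practice this would
# come from the service catalogue or the SLA.
BUSINESS_IMPACT = {
    "Orders&Warehousing": Level.HIGH,
    "Intranet/Internet": Level.MEDIUM,
}


def triage(incident, category, free_specialists):
    """First check: set category, resulting priority and assignee."""
    incident.category = category
    business = BUSINESS_IMPACT.get(incident.service, Level.LOW)
    # Assumed rule: the resulting priority is the higher of the two views.
    incident.priority = max(incident.user_priority, business)
    incident.assignee = free_specialists[0] if free_specialists else ""
    return incident


if __name__ == "__main__":
    reported = Incident("Orders&Warehousing", "service frozen", Level.LOW)
    print(triage(reported, "application", ["Adam"]))
```

      With these assumed values, Mary's report ends up with high priority even though only one user is affected, which matches the reasoning in the story.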
  11. [Figure: Incident reporting example using Outlook]
  12. [Figure: Incident reporting example using the Jira tool]

      Adam reads the notification and immediately starts investigating the incident. The first steps are the following checks:

        • Checking the Knowledge Base (KB): it contains solutions to existing problems and incidents. If well structured, readable and user friendly, the KB can ease and speed up incident resolution as well as knowledge sharing within the team.
        • Checking the Configuration Management System (CMS): it contains the description, version, location and bindings of IT infrastructure components (end user stations, servers, accessories). Such a system can significantly help with incident localization (which server or station is used by this service, and with what configuration and versions) and with root cause identification.
        • Checking the records and events of the automated monitoring tool (Event Management records). These functions are often performed by a specific team or department called the Control Desk.

      These tools allow quicker incident resolution and also require less technical skill from Service Desk specialists (the needed information is stored in the tool and does not have to be mined in a complicated way). Thanks to the CMS and the IT service catalogue, Adam knows which components are used to operate the Orders&Warehousing IT service (see the following table and figure).
  13. Example IT service catalogue records (the Configuration Items column is visible only to the IT vendor):

        IT service name: Orders&Warehousing
          Users: Mary, Management, …
          Responsibilities (users): reporting incidents using the Service Desk (tool or phone); participating in regular monthly SLA reviews
          Configuration Items (CI): WarehouseAndOrders v1.1, Tomcat 6, Oracle 9i, Red Hat Enterprise Linux 5, HW Server Prague, Net Switch S1, Net Switch S2

        IT service name: Intranet/Internet
          Users: All users
          Responsibilities: see internal rules for using Internet (link to intranet document)
          Configuration Items (CI): Internet Service Provider, Firewall Zone v3.2

      [Figure: CMS part: visual information about the IT service infrastructure (basically the visualized Configuration Items column of the table above)]

      Events in the IT infrastructure show insufficient (no) free disk space on the core operational server of the IT service. Adam backs up the temp and log files and investigates only the Oracle database and Tomcat web server logs, because he knows from the IT service catalogue that the service uses them. Thanks to the monitoring tools Adam also knows that only this service is down. He notices an oversized Oracle DB log consuming several GB of disk space. He backs up and deletes the Oracle log, restarts the service and tests its functionality. He also creates an automatic script that backs up and deletes the original Oracle DB log file at a regular interval (a so-called workaround). He verifies and installs the script, restarts the Oracle DB and the relevant Tomcat instance, and checks the monitoring tools, the IT service functionality and the backed-up file. Everything works, so Adam creates a problem record in the SD tool and assigns it, together with a link to the Oracle log file, to the Oracle group that solves Oracle-related problems. The problem ticket is raised to solve the deeper root cause. Adam only used an interim workaround for the incident that allows the IT service to run again. But why is the Oracle log so huge? What causes this? How to fix it? These questions are still unanswered. As a final step, Adam updates the incident record (work log and solution) and closes it. Mary is notified by e-mail that the incident is solved, so she knows she can start using the service again. Mary needs to try the service and, if the solution is OK, accept it (or acceptance can happen automatically after some period of time, so as not to annoy the end user). Mary accepts the solution because everything works well.

  14. It's still Monday, but already after lunch. Adam creates a Knowledge Base record describing this incident and its symptoms and attaches the workaround (script). This KB record is linked to the original incident record and to the created problem record too. The goal of the KB record is to speed up the resolution of similar incidents in the future.

      We used the Service Desk function (or tool) and the Event and Incident Management processes to register and process the incident record. Only the incident was solved in the story; the root cause is still unclear. The reader may notice how appropriate tools and monitoring make the incident management process much more efficient and quick. Thanks to them the incident is processed in several minutes and resolved in tens of minutes. Mary can continue with her work and there is no significant impact on the company's business (at least not the days lost in the previous story). But our job is not done yet. We need to uncover and solve the problem (the ITIL term for an unknown root cause) causing the incident. Let's continue the story, then, and uncover the hidden problem using the Problem Management process and implement the change using Change Management and Release and Deploy Management.
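      As a rough illustration of how the CMS and IT service catalogue lookup used in this scenario might work, the example catalogue records can be held in a simple mapping. The grouping of configuration items per service follows one reading of the example table, the service name "Intranet/Internet" is an assumption, and both function names are hypothetical.

```python
# Configuration items (CI) per IT service, taken from the example
# catalogue records; the exact grouping is an assumption.
SERVICE_CATALOGUE = {
    "Orders&Warehousing": [
        "WarehouseAndOrders v1.1",
        "Tomcat 6",
        "Oracle 9i",
        "Red Hat Enterprise Linux 5",
        "HW Server Prague",
        "Net Switch S1",
        "Net Switch S2",
    ],
    "Intranet/Internet": [
        "Internet Service Provider",
        "Firewall Zone v3.2",
    ],
}


def components_of(service):
    """Which configuration items does a service depend on?"""
    return list(SERVICE_CATALOGUE.get(service, []))


def services_using(configuration_item):
    """Which services are affected if this configuration item fails?"""
    return sorted(
        name for name, items in SERVICE_CATALOGUE.items()
        if configuration_item in items
    )


if __name__ == "__main__":
    print(components_of("Orders&Warehousing"))
    print(services_using("Oracle 9i"))
```

      The first lookup is what lets Adam restrict his investigation to the Oracle and Tomcat logs; the second is the reverse question a Service Desk would ask when a monitoring event names a single component.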
      The whole lifecycle and the process relations are depicted in the following figure:

      [Figure: Relations of the ITSM Service Operation and Service Transition processes introduced in our story]

      Since Adam created a problem ticket in the Service Desk for the Oracle database group, a Problem Management team is formed on demand. This team consists of skilled and experienced administrators and database programmers who are involved only in more complicated issues (Levels 2 and 3 in the Service Desk hierarchy model). The reason is the labor cost of those professionals. Rachel, an Oracle specialist, is notified as Problem Manager and starts to investigate the problem record as well as the incident record with its workaround, the Knowledge Base description and, mainly, the linked Oracle log file. Thanks to her knowledge of a "standard" Oracle log, she quickly uncovers programmer's error reports in this log. She's surprised that this could happen, because she has never seen it before. Rachel logs in to the Oracle defect reporting tool (the maintenance fee grants access to this database) and searches for the issue, but finds nothing. She is allowed to create a defect in the Oracle defect tool, so she does: she describes the log issue and attaches a log snapshot to demonstrate it. After several days Rachel receives a reply from Oracle informing her about a new patch released to fix this defect. Rachel creates a request for change (RfC) to implement this patch in the operational environment. The RfC includes the description, reason, importance and impact of the new patch.

  15. Now we get to the moment when the root cause has been identified and a solution exists. Before releasing to the production environment we need to approve the request (there could be upcoming conflicting or dependent changes), test it (there can be other contributors to this root cause) and finally deploy it. The Change Management and Release and Deploy Management processes and roles are responsible for these steps. Change assessment, testing and deployment could look like the activities described in the following paragraphs.

      The change request is assessed and approved by Change Manager Mike because no conflict or dependency with upcoming changes was found, implementation costs are very low and we save backup disk space once the workaround is removed. The proposed solution to the uncovered root cause is a structural one: it solves the issue at low cost and allows the workaround to be removed. The Oracle patch is first installed and tested in the testing environment (a mirror copy of the production environment) and is ready for production deployment only after all tests are finished and no other symptoms are observed. It seems that the Release and Deploy team can now finally distribute and deploy the patch to the production environment. But before they proceed, they need to prepare a strategy called a rollback plan. The Orders&Warehousing IT service is so important that the IT vendor cannot afford another incident in a row (it would definitely affect the SLA [6]). The rollback plan gives the team a strategy to use if the patch deployment fails. If that happens, we have to be able to restore the previous working version and configuration. A necessary input for the rollback plan is again the CMS, which contains information about the current versions of software and hardware systems and their configurations, and about the authorized storage of source, configuration and executable files.

      Now we are finally ready to deploy the patch to the production environment (really done-done). Deployment is done during the agreed so-called maintenance window. The IT vendor can make changes and stop services for maintenance purposes only during this time. In this case it is from 2.00 am to 3.00 am. After the team deploys the patch and runs verification tests in production, they remove the existing workaround (the backup and delete script) together with Rachel. After a check, Mike closes the RfC as successfully implemented. Rachel now updates the problem solution (Oracle patch) and closes the problem as successfully solved as well. She still needs to update the Knowledge Base record to keep all information synchronized. After that she's done.

      But this is not the end of the story yet. There is now a discrepancy between the real production environment (Oracle database patch, a micro version change) and the information about it in the CMS. We need to update this information in the CMS and the IT service catalogue to keep these tools useful.

      [6] SLA (Service Level Agreement) defines the agreed quality parameters and conditions under which the service is provided. It is usually an appendix to the contract rather than a formal contract itself.
  16. 16. Update can be done manually7 or using an automated tool8, depending on the vendor's automation maturity.

Figure: Simple rollback plan example

If we compare the first and second scenarios, we can see a big difference. Using more formal ITSM/ITIL procedures supported by automated tools allowed all necessary activities to be processed more efficiently and without needless emotions. We also solved the deeper root cause with a structural solution, not just an interim one that makes the IT infrastructure and its support and maintenance more complex. But do not take these statements as a rule or the only truth. There is a hidden trap in shifting our way of working from informal and ad hoc to process oriented. The trap is omitting or suppressing the human aspect and becoming only a ticket-driven machine, so commonly seen in big corporations.

Anyway, we can conclude the scenario as follows:
  • The incident was closed much earlier than in the first scenario.
  • The people responsible for incident solving did the job; no other IT roles, e.g. programmers, were disturbed.
  • The people involved in ITSM activities knew what to do and how to do it (this was also boosted by proper process automation).

7 We recommend a simple checklist as part of the change record (or work log) that enforces the manual update or reminds people of it.
8 The update can happen without any manual action (the monitoring system detects the change in the infrastructure and updates the information) or semi-automatically (a manual trigger for an automatic IT infrastructure audit).
  17. 17. • Incident root cause investigation and structural solution design (not just accepting a workaround) started on the very first day, with the aim of preventing recurring incidents.
  • Proper automated tools (have you noticed, no Excel was mentioned ;)) sped up diagnosis, information gathering and the incident resolution process.
  • Every step is recorded in the Service Desk tool, so it is easy to track or report all steps and actions performed.
  • An updated, user-friendly knowledge base (KB) could help with solving similar incidents or problems.
  18. 18. The whole story following the ITSM/ITILv3 processes is depicted in the following picture:

Figure: Flow starting with an identified incident (including reported events) and ending with the implementation of a structural solution to the identified problem (a workaround is not the final destination)

Story conclusion

As a result of this significant incident affecting the Orders&Warehousing IT service, an extra SLA review meeting between IT and business was held. The scope of this meeting was not only to review thresholds and actual values of service quality attributes but also the possible financial losses caused by this significant incident. It triggered additional actions on the IT vendor side that should lead to a better understanding of the business, improved capacity and load predictions, and proactive steps to uncover potential problems (in ITIL terminology). But these steps are already a trailer for the upcoming second part of this ITILv3 story ;)
  19. 19. Change history

Version  Date         Author           Change history
V1.0     August 2011  Jarek Procházka  First English version created
  20. 20. Differ! www.differ.cz

Improve your IT development, support, maintenance and operation using Agile and Lean practices

Articles and experience
  • Agile and Lean IT development, support and maintenance
  • Human aspect in IT
  • Agile and Lean management
  • Practical templates and checklists
  • Book reviews
  • Free e-books
  • ITIL in practice
  • Experience from projects

Services
  • Creative workshop
  • Lean workshop
  • Consultations
