itSMF Presentation March 2009


ICT Operations Automation IS Gershon's Panacea for Federal Government BAU Cost Reduction

Published in: Business, Technology

  1. Gershon’s Panacea? Automation. Jason Moore, 3rd March 2009
  2. Life After Gershon "Los Angeles, year 2029. All stealth bombers are upgraded with neural processors, becoming fully unmanned. One of them, Skynet, begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern Time, August 29th"
  3. What is automation? A machine replacing human effort.
  4. What is automation? Depending on the process, may still require human cognitive effort.
  5. What is automation? From fully automatic to human guided.
  6. What is automation? Requires INTEGRATION.
  7. Why automate?
      • Reduced business as usual costs.
      • Redirect resources to strategic and innovative activities.
      • Decreased operational risk and higher service levels.
      • Compliance with, and “real life” documentation of process and policy.
      • Repository of captured domain expertise reducing reliance on expert resources and contractors.
      • Supports the implementation of Virtualised Cloud Computing and GreenIT environments.
  8. What Drives The Return?
      • Full or partial automation of currently manual tasks
          • Faster execution.
          • Minimisation of manual errors.
          • Parallelisation of serialised tasks.
          • Full automation by integration of monitoring to infrastructure to service management.
      • SLA achievement improvement.
      • Ability to deliver more with same headcount.
      • Ability to deliver new services.
          • Fully automated self service.
          • Scale in and out computing.
          • Automated disaster recovery.
  9. Why automate? ~41% YoY Operations Cost Reduction
  10. How Do You Automate?
  11. How Do You Automate?
  12. How Do You Automate?
  13. How Do You Automate?
  14. How Do You Automate?
  15. How Do You Automate?
  16. How do you automate? (June 8, 2009)
      • WHAT are my people ACTUALLY doing?
          • Determine high frequency or high risk activity.
          • Understand human effort for that activity.
          • Understand the COST for each activity.
      • PICK A FEW – 80/20 is real and big bang doesn’t work.
      • Document the process.
          • Flow diagram, activities, decision points.
          • Business logic, integration points.
      • Detailed design
          • Data flows.
          • How to interface with IT systems?
          • Create flows and test.
      • Move to production
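The "pick a few" step on slide 16 can be sketched as a simple ranking: cost each activity, sort by annual cost, and take the top activities until they cover roughly 80% of the total. This is only an illustration; the `Activity` fields, the `pick_candidates` helper, and all sample figures are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One manual operations activity (all fields are illustrative)."""
    name: str
    occurrences_per_year: int
    minutes_per_occurrence: float
    cost_per_minute: float

    @property
    def annual_cost(self) -> float:
        return (self.occurrences_per_year
                * self.minutes_per_occurrence
                * self.cost_per_minute)


def pick_candidates(activities, coverage=0.8):
    """Return the costliest activities that together reach `coverage` of total cost."""
    ranked = sorted(activities, key=lambda a: a.annual_cost, reverse=True)
    total = sum(a.annual_cost for a in ranked)
    picked, running = [], 0.0
    for activity in ranked:
        if total == 0 or running >= coverage * total:
            break
        picked.append(activity)
        running += activity.annual_cost
    return picked


# Hypothetical sample data, not taken from the slides.
activities = [
    Activity("service restart", 30000, 2, 1.0),
    Activity("disk space cleanup", 40000, 6, 1.0),
    Activity("password reset", 5000, 3, 1.0),
    Activity("certificate renewal", 200, 30, 1.0),
]
candidates = pick_candidates(activities)
for activity in candidates:
    print(f"{activity.name}: {activity.annual_cost:,.0f} per year")
```

Ranking by cost alone ignores the "high risk" criterion the slide also mentions; a risk weight could be folded into the sort key if needed.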
  17. Fortune 50 Investment Banking Firm
      High Repeatability: 65% alerts can be automated
      Yearly Labor Cost Savings: $4+ million

      | Group | Alert type | Procedure | Actual occur. (5 months) | Est. occur. (annual) | Avg. savings/incident (minutes) | Potential annual savings |
      |---|---|---|---|---|---|---|
      | DSG | MS Exchange | Queue analysis and service restarts | 8974 | 21538 | 13 | $279,988.80 |
      | DSG | MS Exchange | Exchange server process stop, restart verify | 7247 | 17393 | 4 | $69,571.20 |
      | DSG | Win Services | Verify, Restart, Re-verify | 13224 | 31738 | 2 | $63,475.20 |
      | DSG | Win Threshold | Babysit some and escalate if consistent, for disk space - find usual suspects and clean | 16216 | 38918 | 6 | $233,510.40 |
      | DSG | Win Pinger | Verify, System type lookup, verify recent reboot, escalate if necessary | 7846 | 18830 | 1 | $18,830.40 |
      | DSG | Unix Pinger | Verify, System type lookup, verify recent reboot, escalate if necessary | 31969 | 76726 | 1 | $76,725.60 |
      | DSG | Unix reboot | Verify, correlate, close | 14679 | 35230 | 2 | $70,459.20 |
      | DSG | Unix Threshold | babysit and potentially escalate, if disk or page file, then free space or adjust | 6445 | 15468 | 6 | $92,808.00 |
      | DSG | Net App | Just verify and throw away some, remediate others | 28465 | 68316 | 5 | $341,580.00 |
      | DSG | Unix/syslog/pci | Verify, count and watchdog escalate and/or open SRS ticket | 71246 | 170990 | 3 | $512,971.20 |
      | DSG | Unix/syslog/sbus | Verify, count and watchdog escalate and/or open SRS ticket | 29620 | 71088 | 3 | $213,264.00 |
      | DSG | Est. Annual DSG Savings | | | | | $1,973,184.00 |
      | NSO | Network IP | Ping, lookup SMS, login to gateway, login to switch, run Cisco command, open/close ticket or escalate | 40753 | 97807 | 5 | $489,036.00 |
      | NSO | Network router | similar procedure to above | 37498 | 89995 | 5 | $449,976.00 |
      | NSO | Network switch | similar procedure to above | 17171 | 41210 | 5 | $206,052.00 |
      | NSO | Network Vitalnet | | 10957 | 26297 | 4 | $105,187.20 |
      | NSO | Network SNMP | | 9990 | 23976 | 3 | $71,928.00 |
      | NSO | Network BGP | | 9529 | 22870 | 3 | $68,608.80 |
      | NSO | Est. Annual NSO Savings | | | | | $1,390,788.00 |
      | PCO | DB Iwatch | Very simple verification/remediation procedure | 9362 | 22469 | 2 | $44,937.60 |
      | PCO | DB Liveback check | Remotely log onto DB servers, look in logs, run scripts, potentially kill/start processes | 7192 | 17261 | 5 | $86,304.00 |
      | PCO | DB Logmon check | Logon to machine, check running process(es), restart server process | 3156 | 7574 | 4 | $30,297.60 |
      | PCO | DB Repserver issues | Verify which of several issues this could be, typically run one or two commands to fix | 22724 | 54538 | 3 | $163,612.80 |
      | PCO | Autosys MAXRUN | Watchdog to see if job completes within acceptable threshold | 164521 | 394850 | 1 | $394,850.40 |
      | PCO | Est. Annual PCO Savings | | | | | $720,002.40 |
      | Total | | | | | | $4,083,974.40 |
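The savings figures on slide 17 appear to follow one formula: scale the 5-month occurrence count to 12 months, then multiply by the minutes saved per incident and a labour rate. The slide does not state the rate; a value of $1.00 per minute is an inference that reproduces the published figures, so treat `COST_PER_MINUTE` and the `annual_savings` helper below as assumptions.

```python
MONTHS_OBSERVED = 5      # length of the sample period on the slide
COST_PER_MINUTE = 1.00   # assumption: inferred from the figures, not stated


def annual_savings(occurrences_in_sample, minutes_saved_per_incident,
                   months_observed=MONTHS_OBSERVED,
                   cost_per_minute=COST_PER_MINUTE):
    """Annualise the sampled count, then price the minutes saved."""
    annual_occurrences = occurrences_in_sample * 12 / months_observed
    return annual_occurrences * minutes_saved_per_incident * cost_per_minute


# (5-month occurrences, minutes saved per incident) copied from slide 17.
rows = {
    "MS Exchange queue analysis": (8974, 13),
    "Win Services": (13224, 2),
    "Autosys MAXRUN": (164521, 1),
}
for name, (count, minutes) in rows.items():
    print(f"{name}: ${annual_savings(count, minutes):,.2f}")
```

Under this reading, a different labour rate or sample length changes only the two constants.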
  18. Example of Process Automation: Closed Loop Incident Management. The Problem
      ITIL/Incident and Problem Process View:
          1. Identify service performance degradation
          2. Troubleshoot problem to isolate root-cause
          3. Identify changes to be implemented
          4. Create Change Request to implement change
          5. Implement change and close CR
          6. Update CMDB
      Steps shown against the systems (Monitoring System, Configuration System, CMDB, Ticketing System):
          1. Identify service performance issue
          2. Gather data to identify root cause
          3. Create CR to make change
          4. Change implemented upon approval
          5. Close change request
          6. Update CMDB
      What ACTUALLY happens MANUALLY:
          • High level of human effort and expertise required – different groups handing information and activities to each other.
          • High potential for process variance.
          • Potential for change control to be circumvented.
  19. Example of Process Automation: Closed Loop Incident Management
      • Service down detected in Operations Manager
      • OO workflow takes ownership of OM alert
      • OO opens incident ticket in Service Manager
      • OO workflow performs diagnostics and repair procedure to fix service
      • OO workflow updates Service Manager ticket with full audit trail
      • OO closes ticket and alert
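The six steps on slide 19 can be sketched as one function that carries an alert through to a closed ticket. This is a minimal in-memory illustration, not the API of any monitoring or service desk product: `Alert`, `Ticket`, `diagnose_and_repair`, and the `services_up` dictionary are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str
    owner: str = ""
    is_open: bool = True


@dataclass
class Ticket:
    summary: str
    audit_trail: list = field(default_factory=list)
    status: str = "open"


def diagnose_and_repair(service, services_up):
    """Hypothetical repair: restart the service, then verify it is up."""
    steps = [f"checked {service}: down"]
    services_up[service] = True  # stand-in for a real restart
    steps.append(f"restarted {service}")
    state = "up" if services_up[service] else "down"
    steps.append(f"verified {service}: {state}")
    return services_up[service], steps


def closed_loop_incident(alert, services_up):
    alert.owner = "workflow"                    # take ownership of the alert
    ticket = Ticket(f"{alert.service} down")    # open an incident ticket
    fixed, steps = diagnose_and_repair(alert.service, services_up)
    ticket.audit_trail.extend(steps)            # record the full audit trail
    if fixed:
        ticket.status = "closed"                # close ticket and alert
        alert.is_open = False
    else:
        ticket.status = "escalated"             # hand over to a human
    return ticket


services_up = {"mail": False}
alert = Alert("mail")
ticket = closed_loop_incident(alert, services_up)
print(ticket.status, ticket.audit_trail)
```

The escalation branch is an addition for illustration; the slide only describes the successful path.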
  20. Example of Process Automation: Change and Release Management
      • Change request logged in Service Manager
      • Impact analysis and CAB approval automated by Release Control
      • Ticket approval sent back to SM and OO flow launched
      • OO triggers change execution across business service
      • Change validation and ticket close
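The change flow on slide 20 is essentially a gated sequence: nothing executes until approval is recorded, and the ticket closes only after validation. A minimal sketch follows, assuming hypothetical `approve`, `execute`, and `validate` callables in place of the real tools; the `State` names are also invented for illustration.

```python
from enum import Enum


class State(Enum):
    LOGGED = "logged"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"
    CLOSED = "closed"


def run_change(change, approve, execute, validate):
    """Walk a change request through the gated sequence; return its history."""
    history = [State.LOGGED]
    if not approve(change):          # impact analysis and approval gate
        history.append(State.REJECTED)
        return history
    history.append(State.APPROVED)
    execute(change)                  # change execution
    history.append(State.EXECUTED)
    if validate(change):             # validation before closing the ticket
        history.append(State.CLOSED)
    return history


# Hypothetical stand-ins for the approval, execution and validation steps.
def approve(change):
    return change["risk"] == "low"


def execute(change):
    change["applied"] = True


def validate(change):
    return change["applied"]


low_risk = {"id": "CR-1", "risk": "low", "applied": False}
high_risk = {"id": "CR-2", "risk": "high", "applied": False}
print([s.value for s in run_change(low_risk, approve, execute, validate)])
print([s.value for s in run_change(high_risk, approve, execute, validate)])
```

Because execution sits behind the approval check, a rejected change is never applied, which addresses the risk of change control being circumvented.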
  21. Example of Process Automation: Disaster Recovery
      • Opens ticket and validates approval status
      • Validates health and configuration of destination systems; disables monitoring and clustering
      • Performs failover tasks, validates success, and updates CMDB/ticket
      • Re-enables monitoring and clustering
      • Notifies stakeholders DR event complete and closes alerts/ticket
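The disaster recovery steps on slide 21 depend on ordering: pre-checks first, monitoring disabled only around the failover, and monitoring re-enabled whatever the outcome. The sketch below shows that shape with `try`/`finally`; the `run_failover` function, its parameters, and the log messages are hypothetical and stand in for real system calls.

```python
def run_failover(approved, target_healthy, failover, log):
    """Run a failover runbook, appending each step to `log`.

    `failover` is a hypothetical callable that performs the failover
    and returns True if validation succeeded.
    """
    if not approved:
        log.append("abort: ticket not approved")
        return False
    if not target_healthy:
        log.append("abort: destination systems unhealthy")
        return False

    log.append("disable monitoring and clustering")
    try:
        succeeded = failover()
        log.append("failover validated; CMDB and ticket updated"
                   if succeeded else "failover failed validation")
    finally:
        # Runs even if failover() raises, so monitoring is never left off.
        log.append("re-enable monitoring and clustering")

    if succeeded:
        log.append("notify stakeholders; close alerts and ticket")
    return succeeded


log = []
result = run_failover(True, True, lambda: True, log)
print(result)
print("\n".join(log))
```

Keeping the re-enable step in `finally` is a design choice for the sketch; the slide lists it as an ordinary step.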
  22. Example of Process Automation: Utility Computing – Scale In/Scale Out
      • OO gathers data from Application Performance Monitoring about the virtual environment
      • OO opens a ticket to provision a new VM
      • OO checks hypervisor capacity and provisions additional storage if necessary
      • OO triggers the provisioning of the new VM and configures the software
      • Ticket is automatically updated/closed
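The scale-out flow on slide 22 reduces to a threshold check followed by capacity checks before a VM is provisioned. The following is a minimal sketch under invented assumptions: the `Hypervisor` model, the response-time trigger, and the default VM size are all hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Hypervisor:
    """Toy capacity model of one host (illustrative only)."""
    free_cpus: int
    free_storage_gb: int
    vms: int = 0


def scale_out(response_ms, threshold_ms, host, vm_cpus=2, vm_storage_gb=50):
    """Provision one VM when response time exceeds the threshold."""
    actions = []
    if response_ms <= threshold_ms:
        return actions                      # performance is fine; do nothing
    actions.append("ticket opened")
    if host.free_cpus < vm_cpus:
        actions.append("ticket updated: insufficient CPU capacity")
        return actions
    if host.free_storage_gb < vm_storage_gb:
        shortfall = vm_storage_gb - host.free_storage_gb
        host.free_storage_gb += shortfall   # stand-in for provisioning storage
        actions.append(f"provisioned {shortfall} GB storage")
    host.free_cpus -= vm_cpus
    host.free_storage_gb -= vm_storage_gb
    host.vms += 1
    actions.append("VM provisioned and configured")
    actions.append("ticket closed")
    return actions


host = Hypervisor(free_cpus=4, free_storage_gb=20)
actions = scale_out(response_ms=900, threshold_ms=500, host=host)
print(actions)
```

Scale-in would be the mirror image: remove a VM and release its resources once demand stays below a lower threshold.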
  23. Gershon’s Checkboxes
      • Recommendation 3 – Tighten the management of ICT business as usual funding
          • Automation drastically reduces BAU ICT operations costs.
      • Recommendation 4 – Enhance the management of the APS ICT skills base
          • Automation captures expert “contractor” knowledge in process.
          • Automation removes the drudgery of ICT and diverts effort to innovation and strategic activities.
      • Recommendation 5 – Data Centres
          • Automatic control of virtualised environments enabling “cloud computing”.
      • Recommendation 7 – Sustainability of ICT
          • Automation delivers utility computing to match real time requirements.
          • Automatically shut down equipment (and power up).
  24. "Accept the challenges so that you can feel the exhilaration of victory." George S. Patton