Cisco Smart Services Automate Network Support and Operations
 

Cisco Smart Services Automate Network Support and Operations presentation from Cisco Live US 2013

Cisco Smart Services Automate Network Support and Operations Presentation Transcript

  • 1. Cisco Smart Services: Automate Your Network Support and Operations. Mynul Hoda, Sr. Technical Leader
  • 2. © 2013 Cisco and/or its affiliates. All rights reserved. BRKDCT-1379 Cisco Public
    Common Challenges
      – Our tools are strictly reactive.
      – Our IT resources are overspent on level 1 incidents.
      – We want to leverage Cisco’s expertise to improve customer satisfaction.
      – We want to apply Cisco automations to manage in-house.
      – We want to apply Cisco automations to co-manage remotely.
    (Diagram: UCS Server, UCS6100)
  • 3. Topics
      – CNOAS: What and Why
      – Reference Architecture
      – Demo and Examples
      – Use Cases: Best Practices
  • 4. Monday Morning Without CNOAS
      – CEO unable to join a WebEx meeting
      – Productivity Slows: Too late to fix the issue
      – Phone Not Working: Unable to join audio conference
      – Productivity Stops: Has to call IT but meeting already started
  • 5. Monday Morning Without CNOAS: The Engineer
      – Basic Steps: Must follow routine troubleshooting steps over the phone
      – End User Frustration: Need to connect to multiple systems
      – Engineer Frustration: Basic troubleshooting feels like a waste of time; need to connect to a number of systems
  • 6. Monday Morning Without CNOAS: The CIO
      – Dismal TTR: Time to resolution is too long
      – CSAT Low: Customers are dissatisfied
      – Inefficiency: Operations are slow
  • 7. Could it be Different?
      – Could problem-solving be proactive?
      – Could troubleshooting be automated?
      – Could resources be better utilized?
  • 8. Monday Morning With CNOAS: The CEO is Happy
      – Technology That Works: A well-functioning and healthy network facilitates flexible productivity
  • 9. Monday Morning With CNOAS: The Engineer
      – Automations That Liberate: Automated basic troubleshooting frees the human mind to tackle complex challenges
  • 10. Monday Morning With CNOAS: The CIO
      – A Solution That Makes Sense: Cisco Network Operations Automation Service gives you a flexible, in-house solution to network management
  • 11. What Really Happened Here? (possible root causes)
      – During the last change management window, the network admin forgot to bring up an interface, OR
      – A flapping switch port caused a Spanning Tree Protocol loop, OR
      – A machine on the network infected with a worm generated a lot of traffic
      – The problem is in the network, not on the phone or CUCM
  • 12. What Really Happened Here? (what the automation did)
      – Detected the problem immediately when it happened
      – Ran multiple workflows simultaneously, such as checking CPU, checking spanning tree stability, and performing traffic threshold analysis
      – Connected to the problematic phone and performed triage
      – Connected to the CUCM and performed triage
      – Proactively and accurately identified the problem as being on the network
      – Created an incident record
      – Proactively alerted the network admin and sought approval to fix the problem
      – Ultimately fixed the problem with approval from the network admin
      – Kept a record of all activities for the audit trail
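
The "ran multiple workflows simultaneously" step on slide 12 can be illustrated with a short Python sketch. This is not how Cisco Process Orchestrator implements it; the three check functions and their return values are hypothetical stand-ins for real device queries, and only the idea of running independent checks in parallel and collecting their results is taken from the slide.

```python
from concurrent.futures import ThreadPoolExecutor


def run_checks(checks):
    """Run all diagnostic checks concurrently; return {check name: result}."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        # result() blocks until each individual check has finished.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for real device queries.
# Each returns (healthy, detail).
def check_cpu():
    return True, "CPU utilization normal"


def check_spanning_tree():
    return False, "repeated topology changes"


def check_traffic_threshold():
    return True, "traffic within threshold"


results = run_checks({
    "cpu": check_cpu,
    "spanning_tree": check_spanning_tree,
    "traffic_threshold": check_traffic_threshold,
})

# Names of the checks that reported a problem, in a stable order.
failed = sorted(name for name, (healthy, _) in results.items() if not healthy)
print(failed)
```

Threads suit this kind of work because each check mostly waits on a remote device, so the total time approaches that of the slowest check rather than the sum of all of them.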
  • 13. What Really Happened Here?
      – End User: Freed from downtime
      – Engineer: Freed from basic troubleshooting tasks
      – CIO: Freed from firefighting
  • 14. Drivers for Network Operations Automation
  • 15. Why Cisco Network Operations Automation Service? Knowledge in documents; Drives Operational Excellence in the Customer Network
  • 16. Benefits for Network Operations and Engineering
      – Network Operations Center: instantly perform troubleshooting and diagnostic procedures
      – Network Operations Center: migrate tasks from level 3 employees to level 1 employees
      – Network Engineering: automate mundane and repetitive health checks
      – Service Desk: automate incident response and corrective action
  • 17. Topics: Cisco RMS for Data Center; CNOAS; Use Case and Automation Backend; What and Why; Reference Architecture; Demo and Examples; Use Cases: Best Practices; Reference Architecture; Architecture and Examples; Use Cases: Portal; What and Why
  • 18. Cisco Network Operations Automation Service Overview
      – Automation Packs Based on Cisco Best Practices: Routing and Switching; Security, Unified Communications; Data Center Networks; Integrated Services Router; Aggregation Services Router
      – Cisco Process Orchestrator
      – Adapters for Network Automation: Terminal (SSH/Telnet), Web Services, SNMP, Database, Windows, Remedy
      – Cisco Services: Day 0 Service Delivery, Day 1 Service Operations, Day 2 Service Optimization
  • 19. Topics: Cisco RMS for Data Center; CNOAS; Use Case and Automation Backend; What and Why; Reference Architecture; Demo and Examples; Use Cases: Best Practices; Reference Architecture; Architecture and Examples; Use Cases: Portal; What and Why
  • 20. Demo: Auto Response to Network Outage
  • 21. NOA Deployment Architecture
  • 22. How to Implement Using Scaled-Down Version?
  • 23. Setup – Nexus
  • 24. Setup – Nexus
  • 25. E-Mail Notification for Approval
  • 26. Approval Request
  • 27. Approval Request – Contd.
  • 28. Auto Remediation Success E-Mail
  • 29. Automation Summary
  • 30. Automation Summary – Contd.
  • 31. Automation Summary
  • 32. Cisco Process Orchestrator
  • 33. Demo: NOS Best Practice Auto Remediation (Fully Automated)
  • 34. Architecture - NOS Remediation with CNOAS
      1. Inventory/Configuration Data
      2. Analyze & Persist Data: Network Profiler (NP) / Network Performance Analytics (NPA)
      3. Alerts pulled daily via WebSvc API
      4. Get Alerts from NP Db
      5. Cisco Process Orchestrator (CPO) takes action based on Remediation instruction from the customer
      6. MTTR Data
    Other diagram labels: Remedy Ticketing System; Cisco Prime or NCCM; Audit; Remediation; Customer Environment; Network Performance Analytics (NPA)
  • 35. Automatic E-Mail to NCE of NOS Account
  • 36. NCE Reviews BP Exceptions, Publishes Important Ones For Customer to Review and Auto-Remediate
  • 37. End User Receives Automatic E-Mail to Review BP Exceptions to be Remediated
  • 38. End Customer Reviews BP Exceptions and Selects Exceptions for Auto-Remediation
  • 39. End User Reviews Exceptions to Auto-Remediate
  • 40. Exceptions Ready to be Remediated
  • 41. Approval E-Mail Sent
  • 42. Approval
  • 43. Successful Auto-Remediation of BP Exception
  • 44. Remedy Incident Record Auto Created with Resolution Summary (Automation Summary)
  • 45. Automation Summary
  • 46. Cisco Process Orchestrator
  • 47. Topics: Cisco RMS for Data Center; CNOAS; Use Case and Automation Backend; What and Why; Reference Architecture; Demo and Examples; Use Cases: Best Practices; Reference Architecture; Architecture and Examples; Use Cases: Portal; What and Why
  • 48. Health Check Automation – Avoid Outages
    Avoid Network Latency Due to Spanning Tree Loop Problem
    Before Automation
      – Manual checking
      – Error prone
      – Time intensive
      – Repetitive
      – Complex (CCIE Level Experience)
      – Monotonous activity
      – Configuration risk
    After Automation
      – Proactive approach
      – Rapid checking (5 mins vs. 45 mins)
      – Simultaneous device check
      – Easy, Consistent, and Accurate
      – Repeatable
      – Fast (5 mins)
    Manual
      – Execute show log
      – Parse HSRP protocol messages
      – Check for instability
      – Execute show CPU utilization, check for > 50%
      – Identify ports causing problem
      – Request permission to disable ports
      – Disable ports
    Automation
      – Examine switch log files
      – Examine HSRP protocol messages
      – Check for instability
      – Check CPU utilization
      – Check port flapping
      – Identify ports causing the problem
      – Request permission to disable ports
      – Disable ports
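
A minimal Python sketch of the automated health check on slide 48 follows. The 50% CPU threshold is from the slide. The event format (an ordered list of port and state pairs), the flap threshold of 3, and the `approve` callback are assumptions made for illustration; a real workflow would parse device logs and send an approval request. No device is touched here: the "disabled" list only records which ports would be disabled.

```python
from collections import Counter

CPU_THRESHOLD_PERCENT = 50  # from the slide: "check for > 50%"
FLAP_THRESHOLD = 3          # assumption: state changes that count as flapping


def count_transitions(events):
    """Count up/down state changes per port from ordered (port, state) events."""
    last_state = {}
    flaps = Counter()
    for port, state in events:
        if port in last_state and last_state[port] != state:
            flaps[port] += 1
        last_state[port] = state
    return flaps


def health_check(cpu_percent, events, approve):
    """Flag high CPU and flapping ports; 'disable' a port only if approved."""
    flaps = count_transitions(events)
    suspects = sorted(p for p, n in flaps.items() if n >= FLAP_THRESHOLD)
    report = {
        "cpu_high": cpu_percent > CPU_THRESHOLD_PERCENT,
        "suspect_ports": suspects,
        "disabled": [],
    }
    for port in suspects:
        # Request permission before acting, as on the slide.
        if approve(port):
            report["disabled"].append(port)
    return report


# Hypothetical sample data: one port changes state three times.
events = [
    ("Gi1/0/1", "up"), ("Gi1/0/1", "down"),
    ("Gi1/0/1", "up"), ("Gi1/0/1", "down"),
    ("Gi1/0/2", "up"),
]
report = health_check(72, events, approve=lambda port: True)
print(report)
```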
  • 49. Use Case: UC Phone Down
    Automating Troubleshooting support by the Service Desk
      – Problem: IP phone cannot make any calls due to no dial tone
      – Solution: allow the service desk to troubleshoot the problem
          – Check if the phone in question is registered with the CUCM
          – Retrieve IP address based on IP phone number from the CUCM
          – Connectivity test performed (ping the phone)
          – Identify the switch and switch port number using the HTTP web server query on the IP phone
          – Check the switch port interface for connectivity
      – Value: improves MTTR from 2 hours to 3 minutes for IP phone down problem. Migrates IP phone troubleshooting to level 1 Service Desk employees from level 3 personnel
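
The triage sequence on slide 49 can be sketched in Python as a chain of checks. Every callable passed in (`is_registered`, `lookup_ip`, `ping`, `find_switch_port`, `port_is_up`) is a hypothetical placeholder for a real CUCM, phone, or switch query; the slide does not specify those interfaces, and the wording of each finding is illustrative only.

```python
def triage_phone(number, is_registered, lookup_ip, ping,
                 find_switch_port, port_is_up):
    """Walk the slide's checks in order and stop at the first clear fault."""
    steps = []

    registered = is_registered(number)
    steps.append(("registered with CUCM", registered))

    ip = lookup_ip(number)
    steps.append(("IP address found", ip is not None))
    if ip is None:
        return {"steps": steps, "finding": "no IP address on record"}

    reachable = ping(ip)
    steps.append(("phone answers ping", reachable))
    if not reachable:
        # The phone's web server cannot be queried if the phone is unreachable.
        return {"steps": steps, "finding": "phone unreachable over the network"}

    switch, port = find_switch_port(ip)
    up = port_is_up(switch, port)
    steps.append((f"{switch} {port} is up", up))
    if not up:
        return {"steps": steps, "finding": "switch port down"}
    if not registered:
        return {"steps": steps, "finding": "reachable but not registered"}
    return {"steps": steps, "finding": "no fault found by basic triage"}


# Hypothetical stubs standing in for CUCM, phone, and switch queries.
result = triage_phone(
    "5551234",
    is_registered=lambda number: False,
    lookup_ip=lambda number: "10.0.0.25",
    ping=lambda ip: True,
    find_switch_port=lambda ip: ("sw-access-1", "Gi1/0/7"),
    port_is_up=lambda switch, port: False,
)
print(result["finding"])
```

Passing the queries in as functions keeps the decision logic testable without any lab equipment, and the recorded steps give the service desk the evidence behind the finding.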
  • 50. Use Case: Detecting Branch Failure due to Max Transmission Unit (MTU) in DMVPN network
    Diagnose failure at a branch
      – Problem: packet loss at a branch
      – Solution: diagnose problem by varying MTU size
          – Connect to edge router (Branch/HeadEnd)
          – Run the workflow with different MTU sizes
          – Identify correct MTU size which does not result in packets dropping
      – Value: Reduce MTTR due to branch outage
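
The slide says only that the workflow is run with different MTU sizes. One way to choose those sizes is a binary search, sketched below in Python. This is a swapped-in technique, not something the slide states, and it assumes that if a given size passes then every smaller size passes too. The `probe` callable is a hypothetical stand-in for a real test such as a ping with the don't-fragment bit set, and the default bounds of 576 and 1500 bytes are assumed values.

```python
def find_max_working_mtu(probe, low=576, high=1500):
    """Return the largest size in [low, high] for which probe(size) is True.

    Returns None if even the smallest size fails. Assumes probe is
    monotonic: once a size fails, every larger size fails as well.
    """
    if not probe(low):
        return None
    best = low
    lo, hi = low + 1, high
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe(mid):
            best = mid      # mid works; look for something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid drops packets; look for something smaller
    return best


# Simulated path that drops anything larger than 1400 bytes (made-up value).
simulated_limit = 1400
largest = find_max_working_mtu(lambda size: size <= simulated_limit)
print(largest)
```

Compared with stepping through sizes one at a time, the search needs roughly ten probes to cover the default range, which matters when each probe is a round trip to a remote branch router.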
  • 51. Use Case: UC – Proactively Prevent Outage on Unity
    Prevent Outages by Proactively Addressing Unity Disk Space Utilization
      – Problem: disk space on Unity server exceeded
      – Solution: proactively react to the disk space sizing
          – Monitoring system (CA, BMC, etc.) should generate an alert when disk space reaches a threshold of 90%
          – Shrink the report databases
          – Shrink the Unity databases
          – Check the disk utilization and exit if less than 70%
          – If necessary, move the oldest and largest files (>2M) to another directory (separate from the Unity databases)
          – Create an automation summary
      – Value: prevents customer satisfaction issues associated with a Unity Server down problem
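
The decision flow on slide 51 can be sketched as a planning function in Python. The 90% and 70% thresholds come from the slide. Reading ">2M" as 2 MiB, the amount of space each shrink step recovers, and the file names and sizes are all assumptions for illustration. The function only plans actions against simulated numbers and does not touch any server.

```python
ALERT_THRESHOLD = 90               # from the slide: alert at 90% utilization
EXIT_THRESHOLD = 70                # from the slide: exit if less than 70%
MIN_FILE_SIZE = 2 * 1024 * 1024    # assumption: ">2M" read as 2 MiB


def plan_remediation(usage_percent, shrink_savings, files, capacity_bytes):
    """Return (actions, final usage percent) for a simulated disk."""
    actions = []
    if usage_percent < ALERT_THRESHOLD:
        return actions, usage_percent  # no alert, nothing to do

    # Shrink the databases first, as on the slide.
    for name, saved_percent in shrink_savings:
        usage_percent -= saved_percent
        actions.append(f"shrink {name}")
    if usage_percent < EXIT_THRESHOLD:
        return actions, usage_percent

    # Move files larger than the cutoff, oldest first, larger first on ties.
    candidates = [f for f in files if f["size"] > MIN_FILE_SIZE]
    candidates.sort(key=lambda f: (f["mtime"], -f["size"]))
    for f in candidates:
        if usage_percent < EXIT_THRESHOLD:
            break
        usage_percent -= 100 * f["size"] / capacity_bytes
        actions.append(f"move {f['name']}")
    return actions, usage_percent


MIB = 1024 * 1024
# Hypothetical files on a simulated 100 MiB disk.
files = [
    {"name": "a.log", "size": 10 * MIB, "mtime": 100},
    {"name": "b.trc", "size": 1 * MIB, "mtime": 50},
    {"name": "c.bak", "size": 5 * MIB, "mtime": 200},
    {"name": "d.dmp", "size": 10 * MIB, "mtime": 300},
]
actions, final_usage = plan_remediation(
    92,
    [("report databases", 5), ("Unity databases", 7)],
    files,
    capacity_bytes=100 * MIB,
)
print(actions, final_usage)
```

The returned action list doubles as the raw material for the automation summary that the slide lists as the last step.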
  • 52. Use Cases – Virtual NOC
      – Node/Device Down Troubleshooting
      – Troubleshooting performance issue due to high CPU utilization
      – Latency troubleshooting across the network
      – Large branch outage due to LAN interface failure
      – Incorrect MTU detection and remediation in DMVPN network
      – Spanning Tree Protocol Loop Detection and Remediation
      – Top Talker detection and remediation (performance issue from Branch to Data Center)
      – Detect CRC error on switch port
      – End-to-end connection check
      – Diagnose high availability problem due to HSRP issue
      – Troubleshoot performance issue due to high CPU
      – Many users do not connect to the network due to VTP or routing issue
      – Diagnose end user cannot connect to the network
      – Diagnose multicast issue with user not receiving stream
      – Routing Authentication Problem Troubleshooting
      – Diagnose slow user response from branch office to server
      – Spanning Tree stability check
      – End user cannot connect to the network
      – Circuit troubleshooting
      – EIGRP route missing
      – Detect proper MTU size for end-to-end connection in a DMVPN network
      – ISDN Troubleshooting
      – Troubleshooting Switch Port Problem
      – Call Home – ASIC Port Problem
      – Call Home – Power Fan Problem
      – Slow user response from a branch to a server in a Data Center
  • 53. Use Cases – Auditing
      – Spanning Tree stability check
      – ISDN backup testing for routers
      – Validate FWSM failover configuration
      – Validate HSRP redundancy pairs configuration
      – Database comparison between SolarWinds (Orion) and LMS
      – Validate spanning tree and HSRP affinity match for redundant switches
      – Bulk export of running configuration from LMS
      – Database comparison for CA Spectrum and NCM
      – Routed HA check
  • 54. Use Cases – Best Practices
      – UC Phone Troubleshooting
      – List registered/unregistered phones
      – Cisco Telepresence call launch and diagnostics
      – Cisco Unity Connection server – excessive disk usage detection/reduce space
      – Emergency 911 call validation
      – Provision a UC phone
      – List IP phone by name and class
      – Debug voice/video gateway
      – Troubleshooting voice/video gateway using show commands
      – Troubleshooting CUCM – trace collection
      – Troubleshooting Cisco Unity – trace collection
      – Troubleshooting Cisco Unity Connection – trace collection
      – Troubleshooting Cisco emergency responder – trace collection
      – Cisco Telepresence inventory collection for mgmt
      – Cisco Telepresence software inventory collection for mgmt and compliance
      – Enable IP SLA responder for Cisco Telepresence CTS
      – Reset Telepresence peripheral hardware
      – Validate NIC is enabled on each Telepresence device
      – Monitor alarms on CUOM
      – Move/Add/Change/Delete on CUCM via CUPM
      – Tandberg videoconference detect & record unit version
  • 55. Final Thoughts: Move From Reactive to Proactive
      – Service Level Agreements/Operations (SLA/SLO) metrics like Mean Time To Restore (MTTR) are key evaluation criteria and can be dramatically improved with Day 2 automated operation services like RMS and CNOAS.
      – Cisco intellectual capital is captured in both automated services and brings substantial operational efficiencies.
      – Auto-Remediations, Auto-Ticket Enhancements, and Auto-Dynamic Checks can help mitigate the majority of level-one NOC incidents.
    Reactive to Proactive support is the key to sophisticated network operations, achieved by DIY with CNOAS added to your existing tools.
    (Diagram: UCS Server, UCS6100)