HP Automated Network Management

Best practices for managing a Business-Critical Network with Automated Network Management (ANM)

  • Here are some real-life examples of what happens when networks go down. These are examples of well-publicized network failures and the devastating impact they had. Companies can no longer afford to get by with manually managing their network configurations and change.
  • What NNMi discovers:
    – Nodes
    – Interfaces
    – IPv4 addresses and subnets
    – IPv6 addresses and subnets (coming after 8.10)
    – Layer 2 connections
    – Layer 3 subnets
    – VLANs
    – HSRP (NNMi Advanced)
    – Port aggregation (NNMi Advanced)
  • Logical network layout: sites; departments; data centers
    Physical layout: countries, states, cities; campus, building, data center; desk or equipment rack
    Connectivity: WAN; LAN
    Equipment: routers & switches; servers; PCs & phones; UPS & environmental
    Applications (DDM, etc.): telephony; web servers; application servers
    Addresses: IP address implementation
    Discovery information will be used in the “Organize” stage:
    – Different management policies for different categories
    – Possibly different people have different management responsibilities
    Be careful to identify devices consistently across tools:
    – Reconcile DNS, local hosts, sysname, and Windows naming
    – Discovery performance is heavily influenced by DNS performance
    – Remember the configurable naming options in NNMi: some will match on DNS, some on SysID
  • Configuring seeds with no Auto-Discovery Rules will allow only the seeded nodes to be discovered.
    Command Line Interface: nnmloadseeds.ovpl
    – nnmloadseeds.ovpl -f <seedfile>
      The seed file contains one line for each device. Example: 12.2.111.104 # Cisco5500
    – nnmloadseeds.ovpl -n <typeseeds>
      Seed identifiers can be either resolvable hostnames or IP addresses. Example: nnmloadseeds.ovpl -n cisco5500
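
The seed file described in the note above can be generated from an inventory list rather than typed by hand. The following is a minimal sketch in Python, assuming the layout implied by the single example in the note (an identifier, optionally followed by `#` and a comment). The helper names `is_valid_seed`, `format_seed_line`, and `build_seed_file` are hypothetical and not part of NNMi; the hostname check is deliberately loose.

```python
import ipaddress
import re
from typing import Iterable, Optional, Tuple

# Hypothetical helper pattern: a loose check for DNS-style hostnames.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_HOSTNAME_RE = re.compile(rf"^(?=.{{1,253}}$){_LABEL}(?:\.{_LABEL})*$")


def is_valid_seed(identifier: str) -> bool:
    """Accept an IP address or something that looks like a hostname."""
    try:
        ipaddress.ip_address(identifier)
        return True
    except ValueError:
        return bool(_HOSTNAME_RE.match(identifier))


def format_seed_line(identifier: str, comment: Optional[str] = None) -> str:
    """Render one seed-file line: identifier, then an optional '# comment'."""
    if not is_valid_seed(identifier):
        raise ValueError(f"not an IP address or hostname: {identifier!r}")
    return f"{identifier} # {comment}" if comment else identifier


def build_seed_file(seeds: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Return seed-file text with one line per device."""
    return "\n".join(format_seed_line(i, c) for i, c in seeds) + "\n"


if __name__ == "__main__":
    # Example inventory; the second entry is an invented hostname.
    print(build_seed_file([("12.2.111.104", "Cisco5500"),
                           ("core-rtr1.example.com", None)]), end="")
```

The resulting text would be saved to a file and passed to `nnmloadseeds.ovpl -f <seedfile>` as described in the note. Validating identifiers first avoids loading seeds that can never resolve.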
  • Discovery rules specify IP address ranges and SNMP system object ID (OID) ranges to include or exclude.
  • DNS mirror = Caching DNS Name Server
  • RAMS provides the discovery and monitoring for routed WANs (e.g. OSPF, BGP, EIGRP, etc.). Because RAMS works in the control plane of the routing protocols, it understands how the routed L3 network converges in real time. RAMS feeds this topology information to NNMi for a hyper-accurate L3 network topology.
  • Why we group things:
    – Groups serve as filters, allowing us to focus on relevant information
    – Groups can be used to define sets of management policies: status polling; performance polling; security policies; operator notification policies
    – Also can group by geographic hierarchy
    Suggested interface groups:
    – Supported business services
    – Geographic location
    – High-bandwidth local sites
    – Lower-bandwidth remote sites
    – Local & remote WAN edge interfaces
    – Additional groups per carrier
    – Data center connections
    – Strategic content monitoring locations
    Nodes, and the contained interfaces and configurations, can be grouped for effective policy management. Group based on:
    – Backup policies
    – Security policies
    – Traffic management policies
  • Just like NNMi, we can build groups manually, or dynamically with filters. We now have node groups, interface groups, and policy groups.
  • Polling is done for fault and performance; it’s done on nodes & interfaces.
    – SNMP performance metrics for a node include CPU, memory, and buffer utilization
    – You can enable tracking/charting of component health for each node group
    – Use your interface groups to collect group-specific traffic volume and error information
    – WAN interfaces are of high interest, since this is your interface to the carrier network
    – Enable Poll Unconnected Interfaces for the WAN edge interfaces
  • Important to track errors & utilization on links, and maybe CPU/memory on devices.
  • Can also use NA to configure tests
  • What’s broken: shows up in the Incidents (go to the incident browser).
  • RCA is based on state, status, adjacencies, and models.
    – Good examples: remote “islands”
    – Special add-ons for domains such as IPT, MPLS, and multicast
    – Node Down RCA; Interface Down; Island Down
  • Supports NetFlow and sFlow data collection. Does not do deep-packet inspection or de-duplication.

Transcript

  • 1. Session ID: BTOT-WE-0900/3. Twitter hashtag: #HPSWU. ©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
  • 2. HP Automated Network Management: Best Practices for Managing a Business-Critical Network. Ashish Kuthiala – Director, Product Marketing, Network Management Center
  • 3. Agenda
    • The State of the Industry
    • Why ANM?
    • A Practical Approach
      • Discover
      • Organize
      • Configure
      • Analyze
      • Optimize
    • Where to go for more information
  • 4. High-profile network outages (2008, 2009) underscore the importance of network management
    – Average cost of downtime is $70,000 per minute*
    – 8-Day IT Outage Would Cripple Most Companies**
    Sources: * Aberdeen; ** Gartner (2008)
  • 5. What is Driving the Need for Automated Network Management?
    Incident Management
    • Alert floods
    • Unnecessary escalations
    Change & Compliance Management
    • Manual and error-prone change
    • No audit trails
    Task Automation
    • Lack of processes for complex tasks such as …
    Virtualization Management
    • Inconsistent management of physical and virtual infrastructure
    Callouts: 75% of IT costs are labor (1); 40% of policy compliance problems found by customers (2); ANM can save 50%+ (3)
    Sources: (1) Forrester; (2) Aberdeen; (3) case studies at hp.com/go/ANLM
  • 6. Legacy Approaches aren’t Enough
    Managing networks today: missing automation, ineffective point tools, lack of integrations, siloed, manual → errors, disruption and cost
    Existing disparate network management products stop short of providing a solution.
    What’s needed is an intelligent and integrated approach to managing networks.
  • 7. Automated Network Management (ANM): Complete control of your infrastructure
    Products shown: Network Node Manager i; iSPI Perf Metrics; iSPI Perf QA; iSPI Perf Traffic; Network Automation; iSPI NET
    – Fault and availability monitoring: Improve network availability with a model-based network management solution
    – Change, configuration & compliance: Comprehensive network automation spanning tasks from provisioning and change mgmt. to compliance enforcement and reporting
    – Performance monitoring: Increase operator productivity and efficiency and reduce MTTR
    – Engineering Toolset: Automate common network engineering and network tool administrators’ tasks
  • 8. ANM Capabilities: Complete control of your infrastructure
    Capability areas: node discovery & configuration detection; unified fault & performance monitoring; deploying changes & maintaining compliance; diagnostic & workflow automation
    – Auto-discover network devices and capture audit trail of all device changes
      • Dynamic discovery for model-based management
      • Snapshot and store device configuration information
      • Export discovered topology into Visio
      • Real-time change
    – Monitor network device and interface metrics
      • Single poller for fault, availability and performance monitoring
      • Contextual launch from incident to performance reports
      • Navigation between Metric, QA and Traffic data
    – Automate changes and enforce, audit and report on compliance
      • Configuration and software changes
      • Enforce best practices and security standards
      • Easily remediate violations
    – Automate common tasks and processes for network engineers
      • Automate the process of capturing diagnostic information
      • Trap analytic capabilities
  • 9. Best Practice Cycle: Discover → Organize → Configure → Analyze → Optimize
  • 10. Discovery Discover
  • 11. Before You Start Discovery Discover
    – How many nodes are supported by your NNMi licenses?
    – How do you want to use these licenses: network infrastructure only, servers, printers, …?
    – How many geographic areas do you want to poll? Does a centralized poller meet your needs?
    – What’s the general addressing scheme for the network?
    – How much of this network is SNMPv2, how much is SNMPv3?
    – How much is IPv4, how much is IPv6?
    – Where should the management station connect into the network for best visibility?
    – Set up subnet-local DNS or local hosts
  • 12. What’s on my Network? Discovery Discover
    – NNMi discovery is controlled by rules
    – IP address ranges
    – Types of devices
    – Cards & ports
    • Watch out for DNS performance
    • Be consistent when identifying devices
  • 13. Global and Single-Server Discovery Discover
    – Great single NNMi server scalability:
      • Up to 25,000 nodes
      • 1 million discovered interfaces and 200,000 polled
      • Unmatched TCO
    – Global/Regional management
      • Available to meet hierarchical management structure requirements (regional level requiring different scopes and survivability for independent management).
      • Can be used to avoid polling over expensive WAN links (i.e., worldwide distributed polling).
      • To meet needs for a consolidated view of extremely large enterprises: can scale up to ~65,000 nodes in the GNM.
      • Efficient communication: SSL with 3 ports allows polling through firewalls.
  • 14. Setup: SNMP Discover
    – Supports SNMP v1, v2, & v3
    – Set appropriate timeouts for expected response time
    – Can be global or range-specific
    – Coordinates with Network Automation
  • 15. Setup: An Auto-Discovery Rule Discover
    – Priority for applying rules
    – Rule-specific Ping Sweep
    – Range with wildcard
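
The way prioritized include/exclude rules combine can be sketched in a few lines. This is an illustration only, not NNMi's rule engine: the `DiscoveryRule` class and `evaluate` function are hypothetical, CIDR blocks stand in for NNMi's own range and wildcard notation, and "lower ordering number wins" is an assumption. The default of "not discovered" follows the deck's statement that nothing is discovered by default.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DiscoveryRule:
    """Hypothetical stand-in for an auto-discovery rule."""
    name: str
    ordering: int   # assumption: lower number is applied first
    network: str    # CIDR block used in place of NNMi range syntax
    include: bool   # True = discover, False = ignore


def evaluate(address: str, rules: Iterable[DiscoveryRule]) -> bool:
    """Return True if the first matching rule (by ordering) includes the address."""
    ip = ipaddress.ip_address(address)
    for rule in sorted(rules, key=lambda r: r.ordering):
        if ip in ipaddress.ip_network(rule.network):
            return rule.include
    return False  # nothing is discovered by default


RULES = [
    DiscoveryRule("exclude-lab", ordering=1, network="10.2.50.0/24", include=False),
    DiscoveryRule("site-a", ordering=2, network="10.2.0.0/16", include=True),
]

if __name__ == "__main__":
    for addr in ("10.2.3.4", "10.2.50.7", "192.168.1.1"):
        print(addr, evaluate(addr, RULES))
```

Placing the narrow exclusion ahead of the broad inclusion is what keeps the lab subnet out while the rest of the site is discovered.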
  • 16. Setup: Carrier Connections Discover
    – Devices that are on either side of a Service Provider’s network or WAN
    – Subnet Connection Rules for subnets with prefix lengths between 28 and 31
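
Prefix lengths of 28 to 31 describe very small IPv4 subnets (16 down to 2 addresses), which is why two router interfaces sharing one can reasonably be assumed to face each other across a carrier link. The sketch below shows that arithmetic with Python's `ipaddress` module. The function `likely_carrier_link` is a hypothetical illustration of the idea, not NNMi's actual matching logic.

```python
import ipaddress


def subnet_size(prefix_len: int) -> int:
    """Number of addresses in an IPv4 subnet with the given prefix length."""
    return ipaddress.ip_network(f"0.0.0.0/{prefix_len}").num_addresses


def likely_carrier_link(if_a: str, if_b: str,
                        min_prefix: int = 28, max_prefix: int = 31) -> bool:
    """True when two interface addresses (given as 'ip/prefix') share one small subnet."""
    a = ipaddress.ip_interface(if_a)
    b = ipaddress.ip_interface(if_b)
    if a.ip == b.ip or a.network != b.network:
        return False
    return min_prefix <= a.network.prefixlen <= max_prefix


if __name__ == "__main__":
    for prefix in range(28, 32):
        print(f"/{prefix}: {subnet_size(prefix)} addresses")
    print(likely_carrier_link("192.0.2.1/30", "192.0.2.2/30"))
```

A /24 shared by two interfaces is rejected because many other devices could sit on the same segment, so a direct connection cannot be inferred from addressing alone.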
  • 17. Setup: Seed Nodes Discover
    – You must specify seed nodes to start the discovery process.
    – Best practice: Start with your management system’s gateway router, and expand outward 1 “ring” at a time
    – Discovery of a seed happens immediately. You will know right away if the SNMP communication is working.
  • 18. Make Discovery Faster Discover
    – Issue:
      − Discovery makes a lot of DNS calls
      − We want to avoid slow DNS response, and minimize DNS network traffic
    – Options:
      − Run a local DNS mirror on the management system, or on a system on the same subnet
      − Exclude “problem” hostnames and addresses
        • hostNoLookup.conf (FQDN or wildcards)
        • ipNoLookup.conf (IP addresses or wildcards)
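
The effect of the two exclusion files can be pictured as a pattern match performed before any DNS call. The sketch below is an illustration under stated assumptions: it treats each file as one pattern per line and interprets `*` as a shell-style wildcard via Python's `fnmatch`. The real file syntax and matching rules in NNMi may differ, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def load_patterns(text: str) -> List[str]:
    """Parse one pattern per line, skipping blank lines (assumed layout)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def should_skip_lookup(candidate: str, patterns: Iterable[str]) -> bool:
    """True when the hostname or IP address matches any exclusion pattern."""
    lowered = candidate.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


# Invented example contents for the two files named on the slide.
HOST_PATTERNS = load_patterns("*.lab.example.com\nprinter-old.example.com\n")
IP_PATTERNS = load_patterns("10.9.*.*\n")

if __name__ == "__main__":
    print(should_skip_lookup("sw7.lab.example.com", HOST_PATTERNS))
    print(should_skip_lookup("10.9.8.7", IP_PATTERNS))
```

Skipping the lookup for known-slow names keeps one misbehaving DNS zone from stalling discovery of everything else.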
  • 19. Discovery Results Discover
    – Routers and switches
    – Containment hierarchy of nodes (CPU, memory, interfaces, etc.)
    – L2 and L3 connections
    – VLANs
    – HSRP/VRRP (NNMi Advanced)
    – Port Aggregation (NNMi Advanced)
    – Other SNMP devices (if configured)
    – Other non-SNMP devices (if configured)
    – Virtual machine hosts and guests
  • 20. View Your Discovery Results Discover
    – Inventory
    – Topology Maps
    – Troubleshooting Views
  • 21. NNMi Advanced Features Discover
    – HSRP/VRRP
    – Port Aggregation
    – RAMS Stations/Integration
  • 22. Before You Do Anything Else Discover
    – Back up your network devices with a Network Automation Snapshot
      • Set up login rules in NA
      • Import NNMi devices with nnmimport.bat
      • Discover device drivers
      • Select all devices in Inventory
      • Back up the device configurations
  • 23. Discovery Summary Discover
    General
    – Nothing is discovered by default.
    – Spiral Discovery never ends.
    – Discovery includes inventory discovery and layer 2 connectivity discovery.
    – Seeds start the discovery or could be used to load a complete set of nodes.
    – Back up your devices before proceeding!
    Basic Steps
    1. Configure SNMP community strings
    2. Configure discovery rules with IP addresses & OIDs
    3. Configure interface discovery rules
    4. Specify seed nodes
  • 24. Getting Organized Organize HP Confidential
  • 25. Organize Organize
    Group objects by:
    • Business functionality
    • Geographic location
    • Management responsibility
    • Security requirements
    • Device type
    • Backup and policy requirements
    • Polling policy
  • 26. Node Group Hierarchies Organize
    • Group for:
      • Visualization
      • Filtering
      • Polling & management policies
  • 27. Grouping for an Operator Organize
  • 28. NNMi Group Membership Organize
  • 29. Interface Groups Organize
  • 30. Policy Groups Organize
    – Synchronize with NNMi
      • Import discovered node list from NNMi into HP Network Automation
      • Build policy-appropriate groups
  • 31. Network Automation Group Membership Organize
  • 32. Network Automation Group Membership Organize
  • 33. Configuration Configure
  • 34. Configure Configure
    Configuration steps:
    • Visualization of managed groups
    • Fault monitoring
    • Incident customization
    • Performance monitoring
    • Policy monitoring & enforcement
  • 35. Node Group Visualization Configure
    – All customizations are done in Node Group Map Settings in the Configuration workspace.
    Remember these node groups?
  • 36. Customizing Node Group Maps Configure
    – One of the existing node groups
    – L2, L3, or None
    – Order in the Topology Map workspace
    – Who can save map layout changes
    – Blank, or an existing interface group
    – Default refresh time = NN minutes
    – If there’s a serious problem on a node, make that node’s icon bigger
    – Allow connections from nodes to node groups
  • 37. Node Group Background Images Configure
  • 38. Background Images Configure
    – NNMi includes maps
      − OOB maps
        • http://hostname:port/nnmbg
      − User-supplied maps
        • http://hostname:port/nnmdocs/images
  • 39. User-Provided Maps Configure
  • 40. Preserving User Layouts Configure
    – Role-based permissions must be set by an Administrator for each node group.
    – An operator simply presses the Save Layout button.
  • 41. Enable Incidents & Notification Configure
    – Decide which incidents should be passed on to operators
    – Enable schedules & remote notification with AlarmPoint
    – Enable integration into OM or 3rd party apps
  • 42. Incident Processing Options Configure
    – Custom correlation rules
    – De-duplication
    – Suppression
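
De-duplication and suppression can be pictured as two filters applied to the raw event stream before anything reaches an operator. The sketch below is a simplified illustration, not NNMi's incident pipeline: the `Incident` class, the `process` function, and the choice of `(source, kind)` as the duplicate key are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass
class Incident:
    """Hypothetical incident record with a duplicate counter."""
    source: str
    kind: str
    count: int = 1


def process(events: Iterable[Tuple[str, str]],
            suppressed_kinds: FrozenSet[str] = frozenset()) -> List[Incident]:
    """Drop suppressed event kinds and merge repeats of the same (source, kind)."""
    merged: Dict[Tuple[str, str], Incident] = {}
    for source, kind in events:
        if kind in suppressed_kinds:
            continue  # suppression: never shown to the operator
        key = (source, kind)
        if key in merged:
            merged[key].count += 1  # de-duplication: bump the counter
        else:
            merged[key] = Incident(source, kind)
    return list(merged.values())


# Invented sample events for illustration.
EVENTS = [("rtr1", "InterfaceDown"), ("rtr1", "InterfaceDown"),
          ("sw2", "LinkFlap"), ("rtr1", "NodeDown")]

if __name__ == "__main__":
    for incident in process(EVENTS, frozenset({"LinkFlap"})):
        print(incident)
```

Four raw events become two incidents here, one of them carrying a duplicate count of 2, which is the kind of reduction that keeps alert floods manageable.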
  • 43. Network Monitoring: Fault & Performance Metrics Polling Configure
    • Set up appropriate fault polling policies for each of your node and interface groups
    • Policies may be based on:
      • Importance
      • Bandwidth
      • Device capabilities
    • NNMi includes Component Health polling
  • 44. Network Monitoring: Custom Polling Configure
    • Configure “custom” polling for network devices
    • Good candidates:
      • Power supplies
      • Environmental monitors
  • 45. Network Monitoring: Performance Metrics Thresholds Configure
    – Configure high and low thresholds for device and interface metrics
    – Threshold violations generate incidents. These incidents integrate with other BSM & 3rd party tools
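
The high/low threshold idea reduces to a range check whose failures become incidents. A minimal sketch follows; the function name `check_threshold`, the message wording, and the example metric name are hypothetical and do not reflect NNMi's incident format.

```python
from typing import Optional


def check_threshold(metric: str, value: float,
                    low: float, high: float) -> Optional[str]:
    """Return an incident message when value is outside [low, high], else None."""
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if value > high:
        return f"{metric} high threshold violated: {value} > {high}"
    if value < low:
        return f"{metric} low threshold violated: {value} < {low}"
    return None


if __name__ == "__main__":
    # Invented samples: a busy CPU, a normal one, and a suspiciously idle one.
    for sample in (95.0, 50.0, 1.0):
        print(check_threshold("cpu_util", sample, low=5.0, high=90.0))
```

A low threshold is useful on metrics such as interface utilization, where an unexpectedly quiet link can indicate a routing or upstream failure.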
  • 46. WAN Monitoring – Network device configuration Configure
    • Most of the configuration is done in the routers
    • The QA SPI will discover the information in these configurations
    Example device shown: LosAngeles
  • 47. Traffic Monitoring Configuration Configure
    Diagram components: NNMi; Metrics; Traffic; Reporting; Master Collector; Leaf Collectors; Network Devices
  • 48. Configure Traffic Master & Leaf Collectors Configure
    – Configure Master, Leaf, and collection parameters
  • 49. Enable Flow Monitoring Configure
    – Use Network Automation to configure traffic flow monitoring
      • For devices, specify Leaf Collectors
      • For each monitored interface, enable flow info
  • 50. Configure Policy Compliance Configure
    Checks for:
    • Security settings, such as ACLs
    • Console, telnet, ssh, and login/password rules
    • Devices with vulnerable versions of IOS
    Downloads security policies via a subscription model
  • 51. Configure Policy Compliance Configure
  • 52. Configure Policy Compliance: Software Versions Configure
  • 53. Analyze Analyze
  • 54. Analyze Analyze
    – What’s there?
    – What’s working?
    – What’s broken?
    – What are the trends?
    – Are there configuration/security issues?
  • 55. High-Level Views of Status Analyze
    – Node Group Status
    – Incident Browser
    – Inventory Views
    – Network Performance and Compliance Gauges
  • 56. Topology Analysis Analyze
    Use L3, L2, and Path views to analyze topology and connectivity issues
  • 57. VMware Hosts and Virtual Machines Analyze
    L2 maps will show a shared media icon connecting the access switch to discovered ESX hosts and guest VMs
    Map labels: Non-SNMP VM; ESX host; SNMP VM
  • 58. Find Attached Switch Port Analyze
    Map labels: Non-SNMP VM; SNMP VM
  • 59. Export Topology to Visio Analyze
  • 60. Network Link Discovery: Switches & APs Analyze
    – Look for security problems like cascaded switches and rogue wireless access points
    – Things look fine in an L3 view… but suspicious in an L2 view
  • 61. NNMi Root-Cause Analyze
    – NNMi does a lot of the hard work for you
    – NNMi’s conclusions are aided by the polling policies that you configured
    – NNMi polls devices in the neighborhood of a suspected fault to perform true RCA
    – Your Incident Processing rules customize the results
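
The "island" idea behind neighborhood-based root-cause analysis can be sketched as a graph search: an unresponsive node that borders the still-reachable part of the network is a candidate root cause, while unresponsive nodes behind it are merely in its shadow. The code below is a simplified illustration of that reasoning, not NNMi's causal engine. The topology, node names, and `analyze` function are invented for the example, and the management station itself is assumed to be responsive.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple


def analyze(topology: Dict[str, Iterable[str]], mgmt: str,
            unresponsive: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split unresponsive nodes into (root_causes, shadowed).

    topology is a symmetric adjacency map; mgmt is the management station.
    """
    reachable = {mgmt}
    root_causes: Set[str] = set()
    queue = deque([mgmt])
    while queue:  # breadth-first search through responsive nodes only
        node = queue.popleft()
        for neighbor in topology.get(node, ()):
            if neighbor in unresponsive:
                root_causes.add(neighbor)  # borders the reachable region
            elif neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)
    return root_causes, set(unresponsive) - root_causes


# Invented topology: a remote site hangs off a single WAN router.
TOPOLOGY = {
    "mgmt": ["core"],
    "core": ["mgmt", "wan_rtr", "access1"],
    "access1": ["core"],
    "wan_rtr": ["core", "remote_sw"],
    "remote_sw": ["wan_rtr", "remote_ap"],
    "remote_ap": ["remote_sw"],
}

if __name__ == "__main__":
    causes, shadowed = analyze(TOPOLOGY, "mgmt", {"wan_rtr", "remote_sw", "remote_ap"})
    print("root cause:", sorted(causes))
    print("in shadow:", sorted(shadowed))
```

Reporting one incident for `wan_rtr` instead of three separate node-down incidents is the practical payoff of this kind of analysis.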
  • 62. Real-Time Line Graphs (Actions -> Graphs) Analyze
    – Graphs provided for
      • Nodes
      • Interfaces
      • Incidents
      • Custom Polled Instances
      • MIB Expressions
    – Some Line Graphs are specific to a vendor or object type
    – Accessible from tables or maps
    – Test MIB Expression with Graph
  • 63. Graph of Interface Utilization Analyze
    – If a node is selected, all interfaces will be included on the graph, up to 20
  • 64. View Performance Metrics Analyze
    Device Metrics
    • Analyze the CPU, memory, & buffer utilization on your network devices
    • Start with a “Top N” report to find busiest devices
    Interface Metrics
    • Analyze the traffic volume, utilization, throughput, errors, and discards
    • Again, start with a “Top N” report
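
Interface utilization and a "Top N" ranking are both simple to compute from polled counters. The sketch below uses the common formula: octets transferred in the interval, times 8, divided by interval length times interface speed. It is an illustration rather than the iSPI's implementation; the function names and sample interface names are invented, and the modulo step assumes at most one counter wrap between polls.

```python
import heapq
from typing import Dict, List, Tuple


def utilization_percent(octets_start: int, octets_end: int,
                        interval_s: float, if_speed_bps: float,
                        counter_bits: int = 64) -> float:
    """Percent utilization from two octet-counter samples.

    The modulo tolerates a single counter wrap between the two polls.
    """
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 * 100.0 / (interval_s * if_speed_bps)


def top_n(samples: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n busiest entries, highest value first."""
    return heapq.nlargest(n, samples.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # 37,500,000 octets in 300 s on a 10 Mbit/s link.
    print(utilization_percent(0, 37_500_000, 300, 10_000_000))
    print(top_n({"Gi0/1": 72.5, "Gi0/2": 10.0, "Se0/0": 91.0}, 2))
```

Starting from the Top N list and drilling into individual interfaces mirrors the workflow the slide recommends.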
  • 65. Dissect the Network Traffic Analyze
    – Use Performance Metrics for volume, and then Performance Traffic for content analysis
    – Analyze by source, destination, and application
    Example applications shown: Email; Web; BitTorrent
  • 66. Traffic Top N Options Analyze
    – Interface ID
    – Interface Name
    – Qualified Interface Name
    – Node Name
    – Flow Version
    – IP Protocol
    – IP TOS
    – Application Name
    – Source Port
    – Destination Port
    – Source Host Name
    – Destination Host Name
    – Src Host Application & Destination Host
    – Destination Host & Application
    – Collector Name
  • 67. Monitor WAN Performance with QA SPI Analyze
    Network performance between sites (LosAngeles, Bejing): Latency; Loss; Jitter
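
The three WAN metrics named on the slide can be derived from a series of probe results. The sketch below is a simplified illustration: it assumes round-trip times in milliseconds with `None` marking a lost probe, and it defines jitter as the mean absolute difference between consecutive answered probes. That jitter definition is an assumption chosen for clarity; device-side probes and the QA SPI may calculate it differently.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Compute loss percentage, mean latency, and a simple jitter figure."""
    if not rtts_ms:
        raise ValueError("at least one probe result is required")
    answered = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    latency = mean(answered) if answered else None
    # Jitter here: mean absolute change between consecutive answered probes.
    changes = [abs(b - a) for a, b in zip(answered, answered[1:])]
    jitter = mean(changes) if changes else None
    return {"loss_pct": loss_pct, "latency_ms": latency, "jitter_ms": jitter}


if __name__ == "__main__":
    # Invented samples: five probes, the third one lost.
    print(summarize_probes([20.0, 22.0, None, 21.0, 25.0]))
```

Tracking all three together matters for voice and video, where a link with acceptable average latency can still be unusable because of loss or jitter.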
  • 68. Monitor Custom Polled Info Analyze
    Examples:
    • Power spikes & drops
    • UPS battery capacity
    • Temperature & humidity
  • 69. Look for Patterns Analyze
    Use Calendar, Line Chart, and Heat Chart
  • 70. Check for Policy Compliance Analyze
    – Understand which devices are in compliance with your security, content, & performance policies
    – Prepare & deploy corrective policies
  • 71. Optimize (Content Management) Optimize
  • 72. Optimize Optimize
    – Improve the performance & availability of business-critical services
    – Configure infrastructure for real-time applications such as voice & video
    – Identify the most troublesome devices and most frequent events
    – Drive down costs
  • 73. Tools for Optimization Optimize
    – Metrics Dashboard
    – Performance Charts with Volume & Errors
  • 74. Tools for Optimization Optimize
    – Traffic Analysis: Top N and High Traffic Hosts
      • Find the bandwidth hogs!
      • Identify business vs. non-business traffic
      • Identify when abuses are happening
      • Use tools like Heat Chart to find recurring patterns
      • Drill down into specific ports and addresses
  • 75. Tools for Optimization Optimize
    – Traffic Enforcement
      • Implement rules on firewalls
      • Enable priority queuing of business-critical and real-time traffic
      • Enable blocking or throttling of non-business traffic
        − Define permitted source, destination, port, and other attributes
        − Create corresponding Access Control Lists
  • 76. Where to go for more information
    – Add items Here
    HP CONFIDENTIAL - ENABLEMENT ONLY, NOT FOR CUSTOMER USE
  • 77. Continue the conversation with your peers at the HP Software Community: hp.com/go/swcommunity