Transcend NCS Network Troubleshooting Guide




  1. 1. 3Com Transcend® Network Control Services Version 5.0 for UNIX® Network Troubleshooting Guide. Network Management. Part No. 09-1500-000. Businesses run on networks and networks run with management.
  2. 2. 3Com Corporation, 5400 Bayfront Plaza, Santa Clara, California 95052-8145. Copyright © 1999, 3Com Corporation. All rights reserved. No part of this documentation may be reproduced in any form or by any means or used to make any derivative work (such as translation, transformation, or adaptation) without written permission from 3Com Corporation. 3Com Corporation reserves the right to revise this documentation and to make changes in content from time to time without obligation on the part of 3Com Corporation to provide notification of such revision or change. 3Com Corporation provides this documentation without warranty, term, or condition of any kind, either implied or expressed, including, but not limited to, the implied warranties, terms or conditions of merchantability, satisfactory quality, and fitness for a particular purpose. 3Com may make improvements or changes in the product(s) and/or the program(s) described in this documentation at any time. If there is any software on removable media described in this documentation, it is furnished under a license agreement included with the product as a separate document, in the hard copy documentation, or on the removable media in a directory file named LICENSE.TXT or !LICENSE.TXT. If you are unable to locate a copy, please contact 3Com and a copy will be provided to you. UNITED STATES GOVERNMENT LEGEND If you are a United States government agency, then this documentation and the software described herein are provided to you subject to the following: All technical data and computer software are commercial in nature and developed solely at private expense. Software is delivered as “Commercial Computer Software” as defined in DFARS 252.227-7014 (June 1995) or as a “commercial item” as defined in FAR 2.101(a) and as such is provided with only such rights as are provided in 3Com’s standard commercial license for the Software.
Technical data is provided with limited rights only as provided in DFAR 252.227-7015 (Nov 1995) or FAR 52.227-14 (June 1987), whichever is applicable. You agree not to remove or deface any portion of any legend provided on any licensed program or documentation contained in, or delivered to you in conjunction with, this User Guide. Portions of this documentation are reproduced in whole or in part with permission from (as appropriate). Unless otherwise indicated, 3Com registered trademarks are registered in the United States and may or may not be registered in other countries. 3Com, the 3Com logo, Boundary Routing, EtherDisk, EtherLink, EtherLink II, LinkBuilder, Net Age, NETBuilder, NETBuilder II, OfficeConnect, Parallel Tasking, SmartAgent, SuperStack, TokenDisk, TokenLink, LinkSwitch® 1000, LinkSwitch® 3000,Transcend, and ViewBuilder are registered trademarks of 3Com Corporation. ATMLink, AutoLink, CoreBuilder, DynamicAccess, FDDILink, NetProbe, and PACE are trademarks of 3Com Corporation. 3ComFacts is a service mark of 3Com Corporation. Artisoft and LANtastic are registered trademarks of Artisoft, Inc. Banyan and VINES are registered trademarks of Banyan Systems Incorporated. CompuServe is a registered trademark of CompuServe, Inc. DEC and PATHWORKS are registered trademarks of Digital Equipment Corporation. Intel and Pentium are registered trademarks of Intel Corporation. AIX, AT, IBM, NetView, and OS/2 are registered trademarks and Warp is a trademark of International Business Machines Corporation. Microsoft, MS-DOS, Windows, and Windows NT are registered trademarks of Microsoft Corporation. Novell and NetWare are registered trademarks of Novell, Inc. PictureTel is a registered trademark of PictureTel Corporation. UNIX is a registered trademark of X/Open Company, Ltd. in the United States and other countries. All other company and product names may be trademarks of the respective companies with which they are associated. 
Guide written by Patricia Johnson, Chris Flisher, Sarah Newman, and Adam Bell. Edited by Ben Mann Jr. Technical information provided by Dan Bailey, Bob McTague, Graeme Robertson, and Andrew Ward.
  3. 3. CONTENTS
      ABOUT THIS GUIDE: Finding Specific Information in This Guide; What to Expect from This Guide; Conventions; 3Com Device Name Changes; Related Documentation; 3Com Publications; User Guides; Help Systems; 3Com World Wide Web (WWW); Year 2000 Compliance
      PART I BEFORE TROUBLESHOOTING
      1 NETWORK TROUBLESHOOTING OVERVIEW: Introduction to Network Troubleshooting; About Connectivity Problems; About Performance Problems; Solving Connectivity and Performance Problems; Network Troubleshooting Framework; Troubleshooting Strategy; Recognizing Symptoms; User Comments; Network Management Software Alerts; Analyzing Symptoms; Understanding the Problem; Identifying and Testing the Cause of the Problem; Sample Problem Analysis; Equipment for Testing; Solving the Problem
  4. 4. 2 YOUR NETWORK TROUBLESHOOTING TOOLBOX: Transcend Applications; Transcend Central; Status Watch; Web Reporter; Address Tracker; LANsentry Manager; Traffix Manager; Device View; Network Management Platforms; 3Com SmartAgent Embedded Software; Other Commonly Used Tools; Ping; Strategies for Using Ping; Tips on Interpreting Ping Messages; Telnet; FTP and TFTP; Analyzers; Probes; Cable Testers
      3 STEPS TO ACTIVELY MANAGING YOUR NETWORK: Designing Your Network for Troubleshooting; Positioning Your SNMP Management Station; Using Probes; Monitoring Business-critical Networks; FDDI Backbone Monitoring; Internet WAN Link Monitoring; Switch Management Monitoring; Using Telnet, Serial Line, and Modem Connections; Using Communications Servers; Setting Up Redundant Management; Other Tips on Network Design; Management Station Configuration; More Tips; Preparing Devices for Management; Configuring Management Parameters
  5. 5. (Chapter 3, continued): Configuring Traps; Configuring Transcend NCS; Monitoring Devices; Setting Thresholds and Alarms; Setting Thresholds in Status Watch; Setting Thresholds and Alarms in LANsentry Manager; Refining Alarm Settings; Setting Alarms Based on a Baseline; Other Tips for Setting Thresholds and Alarms; Knowing Your Network; Knowing Your Network’s Configuration; Site Network Map; Logical Connections; Device Configuration Information; Other Important Data About Your Network; Identifying Your Network’s Normal Behavior; Baselining Your Network; Identifying Background Noise
      PART II NETWORK CONNECTIVITY PROBLEMS AND SOLUTIONS
      4 MANAGER-TO-AGENT COMMUNICATION: Manager-to-Agent Communication Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Verifying Management Configurations; Manager-to-Agent Communication Reference; IP Address; Gateway Address; Subnet Mask; SNMP Community Strings; SNMP Traps
      5 FDDI CONNECTIVITY: FDDI Connectivity Overview
  6. 6. (Chapter 5, continued): Understanding the Problem; Identifying the Problem; Solving the Problem; Monitoring FDDI Connections; Status Watch; Making Your FDDI Connections More Resilient; Implementing Dual Homing; Installing an Optical Bypass Unit; FDDI Connectivity Reference; Peer Wrap Condition; Twisted Ring Condition; Undesired Connection Attempt Event
      6 TOKEN RING CONNECTIVITY AND ERRORS: Token Ring Overview; Using Transcend Applications to Identify Problems and Symptoms; Using Token Ring Statistics Tool; Using LANsentry Manager; Using the Ring Station View; Using TR Network Analyzer Tool; Network Graphs; Active Station and Error Statistics List; Token Ring Status Tool; Token Ring Utilization Tool; Identifying and Solving Ring Errors; Troubleshooting Notes
      7 ATM AND LANE CONNECTIVITY: ATM and LANE Connectivity Overview; Color Status and Propagation; Device Level Troubleshooting; LANE Level Troubleshooting; ATM Network Level Troubleshooting; Virtual LANs Level Troubleshooting; Identifying VLAN Splits; Indications in the VLAN Map; Indications in the Backbone and Services Window
  7. 7. (Chapter 7, continued): Path Assistants for Identifying Connectivity and Performance Problems; LE Path Assistant; ATM Path Assistant; Tracing a VC Path Between Two ATM End Nodes; Examining Virtual Channels Across Layer 2 Topologies; Tracing the LAN Emulation Control VCCs Between Two LANE Clients
      PART III NETWORK PERFORMANCE PROBLEMS AND SOLUTIONS
      8 BANDWIDTH UTILIZATION: Bandwidth Utilization Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Identifying Utilization Problems; Status Watch; Generating Historical Utilization Reports; Web Reporter; Bandwidth Utilization Reference; ATM Utilization; Ethernet Utilization; FDDI Utilization; Token Ring Utilization
      9 BROADCAST STORMS: Broadcast Storms Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Identifying a Broadcast Storm; Status Watch; Traffix Manager; Disabling the Offending Interface; Address Tracker; Correcting Spanning Tree Misconfigurations
  8. 8. (Chapter 9, continued): Device View; Broadcast Storms Reference; Broadcast Packets; Multicast Packets
      10 DUPLICATE ADDRESSES: Duplicate Addresses Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Finding Duplicate MAC Addresses; Status Watch; Finding Duplicate IP Addresses; Address Tracker; LANsentry Manager; Duplicate Addresses Reference; Duplicate MAC Addresses; Duplicate IP Addresses
      11 ETHERNET PACKET LOSS: Ethernet Packet Loss Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Searching for Packet Loss; Status Watch; LANsentry Manager Network Statistics Graph; Device View; Ethernet Packet Loss Reference; Alignment Errors; Collisions; CRC Errors; Excessive Collisions; FCS Errors; Late Collisions; Nonstandard Ethernet Problems; Receive Discards
  9. 9. (Chapter 11, continued): Too Long Errors; Too Short Errors; Transmit Discards
      12 FDDI RING ERRORS: FDDI Ring Errors Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Identifying Ring Errors; Status Watch; FDDI Ring Errors Reference; Elasticity Buffer Error Condition; Frame Error Condition; Frames Not Copied Condition; Link Error Condition; MAC Neighbor Change Event
      13 NETWORK FILE SERVER TIMEOUTS: Network File Server Timeout Overview; Understanding the Problem; Identifying the Problem; Solving the Problem; Looking for Obvious Errors; Ping and Telnet; LANsentry Manager Alarms View; LANsentry Manager Statistics View; LANsentry Manager History View; Reproducing the Fault While Monitoring the Network; LANsentry Manager Top-N Graph; LANsentry Manager Packet Capture; LANsentry Manager Packet Decode; Address Tracker; LANsentry Manager Packet Decode; Correcting the Fault; Network File Server Timeouts Reference; Jabbering
  10. 10. (Chapter 13, continued): Network File System (NFS) Protocol
      14 MEASURING ATM NETWORK PERFORMANCE: Measuring Traffic Performance; Utilization Map; Displaying Link Traffic; Displaying Node Configuration; Configuring the Utilization Tool; Map Configuration; Polling Configuration; Communication Configuration; Measuring Device Level Performance; Using the History Graph; Displaying Statistics; Measuring Port Level Performance; Traffic; Utilization; Total Frames; Good Frames; Errored Frames; LANE Component Statistics; LES Statistics; LEC Statistics; LANE User
      PART IV REFERENCE
      15 SNMP IN NETWORK TROUBLESHOOTING: SNMP Operation; Manager/Agent Operation; SNMP Messages; Trap Reporting; Security; SNMP MIBs; MIB Tree; MIB-II
  11. 11. (Chapter 15, continued): RMON MIB; RMON2 MIB; 3Com Enterprise MIBs
      16 INFORMATION RESOURCES: Books; URLs
      INDEX
  13. 13. ABOUT THIS GUIDE
      This guide helps you to troubleshoot connectivity and performance problems on your network using Transcend® Network Management Software and other tools. This guide is intended for network administrators who understand networking technologies and how to integrate networking devices. You should have a working knowledge of:
      • Transmission Control Protocol/Internet Protocol (TCP/IP)
      • Simple Network Management Protocol (SNMP)
      • Network management platforms
      • 3Com devices on your network
      You should also be familiar with the interface and features of the Transcend Network Management Software that you have installed.
      With subsequent releases of Transcend management software, this guide will be updated with new troubleshooting information and additional Transcend troubleshooting tools. The most current version of this guide is on the 3Com Web site under the Support:
      Finding Specific Information in This Guide
      This guide, which is available online in Portable Document Format (PDF) and HyperText Markup Language (HTML) formats and in paper, is designed to be used online. For the online version, cross-references to other sections are indicated with links in blue, underlined text, which you can click. You can print any pages as needed.
  14. 14. Table 1 provides guidelines for navigating through this document.
      Table 1 Guidelines for Finding Specific Information in This Guide
      If you are looking for | See
      An introduction to network troubleshooting, information about troubleshooting tools, and guidelines for getting ready for management | Part I: “Before Troubleshooting” (Note: This part is recommended reading for users who are new to network management.)
      Specific troubleshooting scenarios to help you solve real network problems | Part II: “Network Connectivity Problems and Solutions”; Part III: “Network Performance Problems and Solutions”
      Useful background information to help you with troubleshooting tasks | Part IV: “Reference”
      What to Expect from This Guide
      This guide demonstrates how to troubleshoot problems on your network with the help of Transcend and other tools. It also shows you how to use Transcend to move beyond day-to-day troubleshooting to proactive network management.
      This guide is not intended to help you identify and correct problems with installation and use of Transcend software. For that type of troubleshooting, see:
      • The Transcend Network Control Services Installation Guide (for help with installation and startup problems)
      • The Help or user guide for a specific application (for information about troubleshooting application problems)
      This guide focuses on technologies to troubleshoot your network and demonstrates how these technologies are applied using Transcend management software.
      Conventions
      Table 2 and Table 3 list conventions that are used throughout this guide.
      Table 2 Notice Icons
      Notice Type | Description
      Information note | Information that describes important features or instructions
  15. 15. Table 2 Notice Icons (continued)
      Notice Type | Description
      Caution | Information that alerts you to potential loss of data or potential damage to an application, system, or device
      Warning | Information that alerts you to potential personal injury
      Table 3 Text Conventions
      Convention: Description
      Screen displays: This typeface represents information as it appears on the screen.
      Syntax: The word “syntax” means that you must evaluate the syntax provided and then supply the appropriate values for the placeholders that appear in angle brackets. Example: To enable RIPIP, use the following syntax:
          SETDefault !<port> -RIPIP CONTrol = Listen
        In this example, you must supply a port number for <port>.
      Commands: The word “command” means that you must enter the command exactly as shown and then press Return or Enter. Commands appear in bold. Example: To remove the IP address, enter the following command:
          SETDefault !0 -IP NETaddr =
      The words “enter” and “type”: When you see the word “enter” in this guide, you must type something, and then press Return or Enter. Do not press Return or Enter when an instruction simply says “type.”
      Keyboard key names: If you must press two or more keys simultaneously, the key names are linked with a plus sign (+). Example: Press Ctrl+Alt+Del
      Words in italics: Italics are used to:
        • Emphasize a point.
        • Denote a new term at the place where it is defined in the text.
        • Identify menu names, menu commands, and software button names. Examples: From the Help menu, select Contents. Click OK.
  16. 16. 3Com Device Name Changes
      The CoreBuilder™ family includes some 3Com devices that previously belonged to different 3Com brands. These devices are known by their new CoreBuilder names in the Transcend® NCS software. See Table 4.
      Table 4 3Com Device Name Changes
      Previous name | New name
      Cellplex® 7000 | CoreBuilder™ 7000
      LANplex 2500 | CoreBuilder 2500
      LANplex 6000 | CoreBuilder 6000
      ONcore hubs | CoreBuilder 5000 hubs
      ONcore Controller and Management modules | CoreBuilder 5000 Controller and Management modules
      ONcore FastModule | CoreBuilder 5000 FastModule
      ONcore SwitchModule | CoreBuilder 5000 SwitchModule
      Related Documentation
      The following documents provide background and related information about local-area networking and internetworking, SNMP-based network management, and 3Com enterprise computing technology. Most user guides and release notes are available in Adobe Acrobat Reader Portable Document Format (PDF) or HTML on the 3Com World Wide Web site:
      3Com Publications
      This guide is complemented by other 3Com documents, Help systems, and World Wide Web (WWW) documents.
      User Guides
      The following documents are shipped with your Transcend NCS software as printed books:
      • Transcend Network Control Services Introduction to Transcend Network Management, Version 5.0 for UNIX
      • Transcend Network Control Services Installation Guide, Version 5.0 for UNIX
      • Transcend Network Control Services Network Administration Guide, Version 5.0 for UNIX
  17. 17. • Transcend Management Software Network Troubleshooting Guide, Version 5.0 for UNIX
      • Transcend Network Control Services Release Notes, Version 5.0 for UNIX
      • Transcend Network Control Services on the Web Quick Tour, Version 5.0 for UNIX
      The following documents are shipped with your Transcend NCS software on the CD-ROM entitled Transcend Network Control Services Online Documentation Set:
      • Inventory Management
        • Transcend Network Control Services Transcend Central User Guide, Version 5.0 for UNIX
      • Configuration Management
        • Transcend Network Control Services Network Admin Tools User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services Device View User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services NETBuilder Management Application Suite User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services Token Ring Manager User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services Enterprise VLAN Manager User Guide, Version 5.0 for UNIX
        • Transcend Network Control Service PathBuilder Switch Manager User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services Total Control Manager/SNMP User Guide
      • Monitoring and Reporting
        • Transcend Network Control Services Status Watch User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services LANsentry Manager User Guide, Version 5.0 for UNIX
        • Transcend Network Control Services LANsentry Reporter User Guide, Version 5.0 for UNIX
  18. 18. Help Systems
      Each Transcend NCS application contains a Help system that describes how to use all the features of the application. Help includes window descriptions, step-by-step instructions, conceptual information, and troubleshooting tips for that application. You can access Help from:
      • The Help menu in any application by selecting Help Topics (in the Help Topics window, you can view the Contents and Index)
      • A Help button in windows and dialog boxes
      • Your 3Com/Transcnd/Help directory (or the directory that you have set for your Transcend software installation)
      3Com World Wide Web (WWW)
      The following 3Com Web resources provide additional information about Transcend Network Control Services:
      • 3Com Network Management Solution Center: Contains a range of information about 3Com’s network management solutions including Transcend Network Control Services, Total Control™ Manager, Transcend Traffix™ Manager, Transcend dRMON Edge Monitor, InfoVista, and Transcend Enterprise Monitor hardware probes for Ethernet and Token Ring networks.
      • 3Com Support: Provides access to technical support and includes data sheets, support tips, Frequently Asked Questions (FAQ) documents, user guides, release notes, and software downloads.
      • Document Center: Contains useful links to news, technical briefs, case studies, solutions guides, and product data sheets.
      • Technology Center: Contains up-to-the-minute white papers, strategic overviews, and in-depth tutorials about networking technologies and innovations.
      • Networking Glossary: Explains networking terms and acronyms.
  19. 19. Year 2000 Compliance
      For information on Year 2000 compliance and 3Com products, visit the 3Com Year 2000 Web page:
  21. 21. PART I: BEFORE TROUBLESHOOTING
      Chapter 1 Network Troubleshooting Overview
      Chapter 2 Your Network Troubleshooting Toolbox
      Chapter 3 Steps to Actively Managing Your Network
  22. 22. 1 NETWORK TROUBLESHOOTING OVERVIEW
      These sections introduce you to the concepts and practice of network troubleshooting:
      • Introduction to Network Troubleshooting
      • Network Troubleshooting Framework
      • Troubleshooting Strategy
      Introduction to Network Troubleshooting
      Network troubleshooting means recognizing and diagnosing networking problems with the goal of keeping your network running optimally. As a network administrator, your primary concern is maintaining connectivity of all devices (a process often called fault management). You also continually evaluate and improve your network’s performance. Because serious networking problems can sometimes begin as performance problems, paying attention to performance can help you address issues before they become serious.
      About Connectivity Problems
      Connectivity problems occur when end stations cannot communicate with other areas of your local area network (LAN) or wide area network (WAN). Using management tools, you can often fix a connectivity problem before users even notice it. Connectivity problems include:
      • Loss of connectivity: When users cannot access areas of your network, your organization’s effectiveness is impaired. Immediately correct any connectivity breaks.
      • Intermittent connectivity: Although users have access to network resources some of the time, they are still facing periods of downtime. Intermittent connectivity problems can indicate that your network is on the verge of a major break. If connectivity is erratic, investigate the problem immediately.
      • Timeout problems: Timeouts cause loss of connectivity, but are often associated with poor network performance.
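      The distinction between a complete loss of connectivity and intermittent connectivity can be sketched in a few lines of Python. This is only an illustration, not part of Transcend: the function name classify_connectivity and the idea of feeding it a list of probe outcomes (for example, the results of repeated Ping attempts that you record yourself) are assumptions made for the example.

```python
from typing import Sequence


def classify_connectivity(probe_results: Sequence[bool]) -> str:
    """Classify a series of reachability probes (True means a reply was received).

    Hypothetical helper: the category names follow the list above, but how
    the probe results are gathered is left to the reader.
    """
    if not probe_results:
        raise ValueError("at least one probe result is required")
    replies = sum(1 for ok in probe_results if ok)
    if replies == 0:
        return "loss of connectivity"
    if replies < len(probe_results):
        return "intermittent connectivity"
    return "connected"


if __name__ == "__main__":
    # Four probes, one of which went unanswered.
    print(classify_connectivity([True, False, True, True]))
```

      A series in which every probe fails points to a break that needs immediate correction; a mixed series is the erratic pattern that the text recommends investigating right away.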
  23. 23. About Performance Problems
      Your network has performance problems when it is not operating as effectively as it should. For example, response times may be slow, the network may not be as reliable as usual, and users may be complaining that it takes them longer to do their work. Some performance problems are intermittent, such as instances of duplicate addresses. Other problems can indicate a growing strain on your network, such as consistently high utilization rates.
      If you regularly examine your network for performance problems, you can extend the usefulness of your existing network configuration and plan network enhancements, instead of waiting for a performance problem to adversely affect the users’ productivity.
      Solving Connectivity and Performance Problems
      When you troubleshoot your network, you employ tools and knowledge already at your disposal. With an in-depth understanding of your network, you can use network software tools, such as “Ping”, and network devices, such as “Analyzers”, to locate problems, and then make corrections, such as swapping equipment or reconfiguring segments, based on your analysis.
      Transcend® provides another set of tools for network troubleshooting. These tools have graphical user interfaces that make managing and troubleshooting your network easier. With “Transcend Applications”, you can:
      • Baseline your network’s normal status to use as a basis for comparison when the network operates abnormally
      • Precisely monitor network events
      • Be notified immediately of critical problems on your network, such as a device losing connectivity
      • Establish alert thresholds to warn you of potential problems that you can correct before they affect your network
      • Resolve problems by disabling ports or reconfiguring devices
      See “Your Network Troubleshooting Toolbox” for details about each troubleshooting tool.
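      The idea of baselining normal behavior and then setting an alert threshold against it can be sketched as follows. This is a minimal illustration, not how Transcend computes thresholds: the function names and the rule "mean plus k standard deviations" (with k = 3 as the default) are assumptions chosen for the example.

```python
import statistics
from typing import Sequence


def baseline_threshold(samples: Sequence[float], k: float = 3.0) -> float:
    """Return an alert threshold derived from baseline samples.

    Assumption for this sketch: the threshold is the baseline mean plus
    k population standard deviations. The guide does not prescribe a formula.
    """
    mean = statistics.mean(samples)
    spread = statistics.pstdev(samples)
    return mean + k * spread


def exceeds_baseline(value: float, samples: Sequence[float], k: float = 3.0) -> bool:
    """True when a new reading is above the baseline-derived threshold."""
    return value > baseline_threshold(samples, k)


if __name__ == "__main__":
    # Hypothetical utilization readings (percent) taken during normal operation.
    normal_utilization = [10.0, 12.0, 11.0, 13.0, 14.0]
    print(exceeds_baseline(40.0, normal_utilization))
```

      A smaller k warns earlier but produces more false alarms; the right value depends on how noisy the segment normally is.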
  24. 24. Network Troubleshooting Framework
      The International Organization for Standardization (ISO) Open Systems Interconnection (OSI) reference model is the foundation of all network communications. This seven-layer structure provides a clear picture of how network communications work. Protocols (rules) govern communications between the layers of a single system and among several systems. In this way, devices made by different manufacturers or using different designs can use different protocols and still communicate.
      By understanding how network troubleshooting fits into the framework of the OSI model, you can identify at what layer problems are located and which type of troubleshooting tools to use. For example, unreliable packet delivery can be caused by a problem with the transmission media or with a router configuration. If you are receiving high rates of “FCS Errors” and “Alignment Errors”, which you can monitor with Status Watch, then the problem is probably located at the physical layer and not the network layer.
      Figure 1 shows how to troubleshoot the layers of the OSI model. Table 5 describes the data that the network management tools can collect as it relates to the OSI model layers.
      Table 5 Network Data and the OSI Model Layers
      Layer | Data Collected | Transcend NCS Tool Used
      Application, Presentation, Session, Transport | Protocol information and other Remote Monitoring (RMON) and RMON2 data | LANsentry Manager; Traffix Manager™ (for more detail)
      Network | Routing information | Status Watch; LANsentry Manager (for more detail); Traffix Manager (for more detail)
      Data Link | Traffic counts and other packet breakdowns | Status Watch; LANsentry Manager (for more detail)
      Physical | Error counts | Status Watch
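      Table 5 and the FCS/alignment example above amount to a lookup from symptom to layer to tool. The sketch below restates them as Python dictionaries. The names TOOLS_BY_LAYER, SYMPTOM_LAYER, and tools_for_symptom are hypothetical, and the symptom table contains only the two error types that the text explicitly ties to a layer; anything else would need to be added from your own experience.

```python
from typing import List, Optional, Tuple

# Table 5 restated as a lookup: OSI layer -> Transcend NCS tools.
TOOLS_BY_LAYER = {
    "application": ["LANsentry Manager", "Traffix Manager"],
    "presentation": ["LANsentry Manager", "Traffix Manager"],
    "session": ["LANsentry Manager", "Traffix Manager"],
    "transport": ["LANsentry Manager", "Traffix Manager"],
    "network": ["Status Watch", "LANsentry Manager", "Traffix Manager"],
    "data link": ["Status Watch", "LANsentry Manager"],
    "physical": ["Status Watch"],
}

# Only the symptoms that the text above assigns to a layer (illustrative, not exhaustive).
SYMPTOM_LAYER = {
    "fcs errors": "physical",
    "alignment errors": "physical",
}


def tools_for_symptom(symptom: str) -> Tuple[Optional[str], List[str]]:
    """Return (likely layer, tools to use), or (None, []) for an unknown symptom."""
    layer = SYMPTOM_LAYER.get(symptom.strip().lower())
    if layer is None:
        return None, []
    return layer, TOOLS_BY_LAYER[layer]


if __name__ == "__main__":
    print(tools_for_symptom("FCS Errors"))
```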
  25. 25. Figure 1 OSI Reference Model and Network Troubleshooting
      [Figure: the seven OSI layers, from Application (Layer 7) down to Physical (Layer 1), shown with example protocols (SNMP, Telnet, rlogin, FTP, TCP, UDP, IP, IPX, LLC, MAC, PHY, PMD, Ethernet, Token Ring, FDDI) and the troubleshooting tools that apply to them (SNMP console managers, analyzers, probes, Traffix™ Manager, LANsentry® Manager, Status Watch, and cable testing tools).]
      For information about network troubleshooting tools, see “Your Network Troubleshooting Toolbox”.
      Troubleshooting Strategy
      How do you know when you are having a network problem? The answer to this question depends on your site’s network configuration and on your network’s normal behavior. See “Knowing Your Network” for more information.
  26. 26. If you notice changes on your network, ask the following questions:
      • Is the change expected or unusual?
      • Has this event ever occurred before?
      • Does the change involve a device or network path for which you already have a backup solution in place?
      • Does the change interfere with vital network operations?
      • Does the change affect one or many devices or network paths?
      After you have an idea of how the change is affecting your network, you can categorize it as critical or noncritical. Both of these categories need resolution (except for changes that are one-time occurrences); the difference between the categories is the time that you have to fix the problem.
      By using a strategy for network troubleshooting, you can approach a problem methodically and resolve it with minimal disruption to network users. It is also important to have an accurate and detailed map of your current network environment. Beyond that, a good approach to problem resolution is:
      • Recognizing Symptoms
      • Understanding the Problem
      • Identifying and Testing the Cause of the Problem
      • Solving the Problem
      Recognizing Symptoms
      The first step to resolving any problem is to identify and interpret the symptoms. You may discover network problems in several ways. Users may complain that the network seems slow or that they cannot connect to a server. You may pass your network management station and notice that a node icon is red. Your beeper may go off and display the message: WAN connection down.
      User Comments
      Although you can often solve networking problems before users notice a change in their environment, you invariably get feedback from your users about how the network is running, such as:
      • They cannot print.
      • They cannot access the application server.
  27. 27. • It takes them much longer to copy files across the network than it usually does.
      • They cannot log on to a remote server.
      • When they send e-mail to another site, they get a routing error message.
      • Their system freezes whenever they try to Telnet.
      Network Management Software Alerts
      Network management software, as described in “Your Network Troubleshooting Toolbox”, can alert you to areas of your network that need attention. For example:
      • The application displays red (Warning) icons.
      • Your weekly Top-N utilization report (which indicates the 10 ports with the highest utilization rates) shows that one port is experiencing much higher utilization levels than normal.
      • You receive an e-mail message from your network management station that the threshold for broadcast and multicast packets has been exceeded.
      These signs usually provide additional information about the problem, allowing you to focus on the right area.
      Analyzing Symptoms
      When a symptom occurs, ask yourself these types of questions to narrow the location of the problem and to get more data for analysis:
      • To what degree is the network not acting normally (for example, does it now take one minute to perform a task that normally takes five seconds)?
      • On what subnetwork is the user located?
      • Is the user trying to reach a server, end station, or printer on the same subnetwork or on a different subnetwork?
      • Are many users complaining that the network is operating slowly or that a specific network application is operating slowly?
      • Are many users reporting network logon failures?
      • Are the problems intermittent? For example, some files may print with no problems, while other printing attempts generate error messages, make users lose their connections, and cause systems to freeze.
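      The Top-N utilization report mentioned under “Network Management Software Alerts” is, at its core, a sort of per-port utilization figures. The sketch below shows that ranking together with a simple threshold check. It is an illustration only: the function names and the port labels in the sample data are hypothetical, and real figures would come from your management station.

```python
from typing import Dict, List, Tuple


def top_n_ports(utilization_by_port: Dict[str, float], n: int = 10) -> List[Tuple[str, float]]:
    """Return the n ports with the highest utilization, highest first."""
    ranked = sorted(utilization_by_port.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def ports_over_threshold(utilization_by_port: Dict[str, float], threshold_percent: float) -> List[str]:
    """Return the names of ports whose utilization exceeds the threshold, sorted by name."""
    return sorted(
        port for port, percent in utilization_by_port.items() if percent > threshold_percent
    )


if __name__ == "__main__":
    # Hypothetical port labels and utilization percentages.
    sample = {"hub1/3": 12.5, "sw2/1": 87.0, "sw2/7": 45.2}
    print(top_n_ports(sample, n=2))
    print(ports_over_threshold(sample, 40.0))
```

      Comparing this week's ranking with last week's is what reveals the "much higher than normal" port that the report is meant to surface.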
  28. 28. Troubleshooting Strategy 29 Understanding the Networks are designed to move data from a transmitting device to a Problem receiving device. When communication becomes problematic, you must determine why data are not traveling as expected and then find a solution. The two most common causes for data not moving reliably from source to destination are: s The physical connection breaks (that is, a cable is unplugged or broken). s A network device is not working properly and cannot send or receive some or all data. Network management software can easily locate and report a physical connection break (layer 1 problem). It is more difficult to determine why a network device is not working as expected, which is often related to a layer 2 or a layer 3 problem. To determine why a network device is not working properly, look first for: s Valid service — Is the device configured properly for the type of service it is supposed to provide? For example, has Quality of Service (QoS), which is the definition of the transmission parameters, been established? s Restricted access — Is an end station supposed to be able to connect with a specific device or is that connection restricted? For example, is a firewall set up that prevents that device from accessing certain network resources? s Correct configuration — Is there a misconfiguration of IP address, subnet mask, gateway, or broadcast address? Network problems are commonly caused by misconfiguration of newly connected or configured devices. See “Manager-to-Agent Communication” for more information. Identifying and After you develop a theory about the cause of the problem, test your Testing the Cause of theory. The test must conclusively prove or disprove your theory. the Problem Two general rules of troubleshooting are: s If you cannot reproduce a problem, then no problem exists unless it happens again on its own. 
• If the problem is intermittent and you cannot replicate it, you can configure your network management software to catch the event in progress.
  29. 29.
For example, with “LANsentry Manager”, you can set alarms and automatic packet capture filters to monitor your network and inform you when the problem occurs again. See “Configuring Transcend NCS” for more information.

Although network management tools can provide a great deal of information about problems and their general location, you may still need to swap equipment or replace components of your network until you locate the exact trouble spot. After you test your theory, either fix the problem as described in “Solving the Problem” or develop another theory.

Sample Problem Analysis

This section illustrates the analysis phase of a typical troubleshooting incident. On your network, a user cannot access the mail server. You need to establish two areas of information:
• What you know: In this case, the user’s workstation cannot communicate with the mail server.
• What you do not know and need to test:
  - Can the workstation communicate with the network at all, or is the problem limited to communication with the server? Test by sending a “Ping” or by connecting to other devices.
  - Is the workstation the only device that is unable to communicate with the server, or do other workstations have the same problem? Test connectivity at other workstations.
  - If other workstations cannot communicate with the server, can they communicate with other network devices? Again, test the connectivity.

The analysis process follows these steps:
1. Can the workstation communicate with any other device on the subnetwork?
  • If no, then go to step 2.
  • If yes, determine if only the server is unreachable.
    - If only the server cannot be reached, this suggests a server problem. Confirm by doing step 2.
  30. 30.
    - If other devices cannot be reached, this suggests a connectivity problem in the network. Confirm by doing step 3.
2. Can other workstations communicate with the server?
  • If no, then most likely it is a server problem. Go to step 3.
  • If yes, then the problem is that the workstation is not communicating with the subnetwork. (This situation can be caused by workstation issues or a network issue with that specific station.)
3. Can other workstations communicate with other network devices?
  • If no, then the problem is likely a network problem.
  • If yes, the problem is likely a server problem.

When you determine whether the problem is with the server, subnetwork, or workstation, you can further analyze the problem, as follows:
• For a problem with the server: Examine whether the server is running, if it is properly connected to the network, and if it is configured appropriately.
• For a problem with the subnetwork: Examine any device on the path between the users and the server.
• For a problem with the workstation: Examine whether the workstation can access other network resources and if it is configured to communicate with that particular server.

Equipment for Testing

To help identify and test the cause of problems, have available:
• A laptop computer that is loaded with a terminal emulator, TCP/IP stack, TFTP server, CD-ROM drive (to read the online documentation), and some key network management applications, such as LANsentry® Manager. With the laptop computer, you can plug into any subnetwork to gather and analyze data about the segment.
• A spare managed hub to swap for any hub that does not have management. Swapping in a managed hub allows you to quickly spot which port is generating the errors.
• A single port probe to insert in the network if you are having a problem where you do not have management capability.
• Console cables for each type of connector, labeled and stored in a secure place.
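The three-step analysis in “Sample Problem Analysis” can be sketched as a small decision function. This is only an illustration of the reasoning, written in Python. The function name, its parameters, and the wording of the hints are hypothetical, and the three answers are assumed to come from connectivity tests that you have already run by hand (for example, with Ping).

```python
from typing import Tuple


def diagnose(ws_reaches_subnet: bool,
             others_reach_server: bool,
             others_reach_devices: bool) -> Tuple[str, str]:
    """Map the answers to steps 1, 2, and 3 to a likely problem area.

    ws_reaches_subnet    -- step 1: can the workstation reach any other device?
    others_reach_server  -- step 2: can other workstations reach the server?
    others_reach_devices -- step 3: can other workstations reach other devices?
    """
    if others_reach_server:
        # Step 2 answered "yes": the fault is local to this workstation.
        if ws_reaches_subnet:
            hint = "check how the workstation is configured for this server"
        else:
            hint = "check the workstation's network connection"
        return ("workstation", hint)
    if others_reach_devices:
        # Step 3 answered "yes": the network works, only the server fails.
        return ("server", "check that the server is running, connected, and configured")
    # Nobody reaches the server or other devices.
    return ("subnetwork", "examine the devices on the path between users and server")


# The user cannot reach the mail server, but everything else works.
print(diagnose(ws_reaches_subnet=True,
               others_reach_server=False,
               others_reach_devices=True))
```

The answer to step 1 does not change the verdict in this sketch. It only refines the hint for a workstation problem, which mirrors how the text uses step 1 to decide which confirmation step to run next.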
  31. 31.
Solving the Problem

Many device or network problems are straightforward to resolve, but others yield misleading symptoms. If one solution does not work, continue with another. A solution often involves:
• Upgrading software or hardware (for example, upgrading to a new version of agent software or installing Gigabit Ethernet devices)
• Balancing your network load by analyzing:
  - Which users communicate with which servers
  - What the user traffic levels are in different segments
  Based on these findings, you can decide how to redistribute network traffic.
• Adding segments to your LAN (for example, adding a new switch where utilization is continually high)
• Replacing faulty equipment (for example, replacing a module that has port problems or replacing a network card that has a faulty jabber protection mechanism)

To help solve problems, have available:
• Spare hardware equipment (such as modules and power supplies), especially for your critical devices
• A recent backup of your device configurations to reload if flash memory gets corrupted (which can sometimes happen due to a power outage)

Use the Network Admin Tools in the Transcend NCS application suite to save your software configurations and reload them to devices.
  32. 32.
CHAPTER 2: YOUR NETWORK TROUBLESHOOTING TOOLBOX

A robust network troubleshooting toolbox consists of items (such as network management applications, hardware devices, and other software) to recognize, diagnose, and solve networking problems. It contains:
• Transcend Applications
• Network Management Platforms
• 3Com SmartAgent Embedded Software
• Other Commonly Used Tools

Transcend Applications

Transcend® management software is optimized for managing 3Com devices and their attached networks. However, some applications, such as LANsentry® Manager, can manage any vendor’s networking equipment that complies with the Remote Monitoring (RMON) Management Information Base (MIB).

This section describes these Transcend applications, which you can use to troubleshoot your network:
• Transcend Central
• Status Watch
• Address Tracker
• LANsentry Manager
• Traffix Manager
• Device View

This guide primarily focuses on using these applications to troubleshoot your network.
  33. 33.
Transcend Central

Start with Transcend Central, which is an asset management and device grouping application, to understand what your network consists of and to control the Transcend NCS network management troubleshooting tools. Transcend Central is available as both a native Windows application and a Java application that you can access using a Web browser.

Using Transcend Central for troubleshooting, you can:
• Display an inventory of device, module, and port information.
• Group devices to make your troubleshooting tasks easier. By managing a collection of devices, you can simultaneously perform the same tasks on each device in a group and locate physical or logical problems on your network.
• Launch Transcend NCS applications, including some of your primary Transcend NCS troubleshooting tools:
  - Status Watch, which includes Web Reporter (from the Java version)
  - Address Tracker
  - LANsentry Manager
  - Traffix Manager
  - Device View

Status Watch

The Status Watch applications manage 3Com devices and their attached networks. They primarily poll for “MIB-II” data. Status Watch is a performance-monitoring application that allows you to monitor the operational status of your network devices and quickly identify any problems that require your attention. It works in conjunction with Web Reporter. See the Status Watch Help to learn which 3Com devices are supported.

Web Reporter

Web Reporter is a data-reporting application that runs in a World Wide Web (WWW) browser. It generates reports from data that Status Watch collects, allowing you to compare network statistics against a baseline.

Address Tracker

Address Tracker is an address collection and discovery application that:
• Polls managed devices for all MAC addresses
  34. 34.
• Polls managed devices and routers for IP addresses to perform MAC-to-IP address translation
• Uses Device View to disable troublesome ports

LANsentry Manager

LANsentry Manager is a set of integrated applications that displays and explores the real-time and historical data that RMON-compliant devices (probes) on the network capture. LANsentry Manager uses SNMP polling to gather RMON and RMON2 data from the probes. Use LANsentry Manager to:
• Monitor current performance of network segments
• See trends over time
• Spot signs of current problems
• Configure alarms to monitor for specific events
• Capture packets and display their contents

LANsentry Manager works with any device (from 3Com or other vendors) that supports the “RMON MIB” or the “RMON2 MIB”.

Traffix Manager

Traffix™ Manager is a performance-monitoring application that provides information about layer 2 (RMON) and layer 3 conversations between nodes. It helps you to assess traffic patterns on your network. Traffix Manager:
• Monitors all the stations that the RMON2-compliant probes encounter on your network
• Captures and stores RMON and RMON2 data for your network’s protocols and applications
• Displays traffic between stations in user-defined views of the network
• Graphs current or historical data on the devices selected
• Delivers reports for user-specified stations and time periods as PostScript to your printer or as HTML to your Web server
• Launches LANsentry Manager tools for in-depth analysis of a station or a conversation between stations
  35. 35.
You can use Traffix Manager to:
• Know your network: Understand overall flow patterns and interactions between systems, and determine how your network is really being used at the application level.
• Optimize your network: Gain an insight into traffic and application usage trends to help you optimize the use and placement of current network resources and make wise decisions about capacity planning and network growth.

Traffix Manager works with any device (from 3Com or other vendors) that supports the “RMON2 MIB”.

Device View

The Device View application is a device configuration tool. When you troubleshoot your network, you can use Device View to determine or change a device’s configuration. You can also use Device View to look at a device’s statistics and to set alarms. Device View manages only 3Com devices. See the Device View Help to learn which 3Com devices are supported.

You can also use Transcend Upgrade Manager, which is one of the Network Admin Tools applications, to perform bulk software upgrades on devices.

Network Management Platforms

As part of your troubleshooting toolbox, your network management platform is the first place to go to view the overall health of your network. With the platform, you can understand the logical configuration of your network and configure views of your network to understand how devices work together and the role that they play in the users’ work.

The network management platform that supports your Transcend software installation can provide valuable troubleshooting tools. Transcend runs on several platforms within the NT and UNIX environments. The platform discovers the devices. Transcend imports that information from the platform to populate the core database. Unless you are rediscovering, the user must manually update the platform.
  36. 36.
Using this device database, a map displays the graphical representation of your network. Each device on your network appears as a symbol (icon) on the map. You can configure views of your network to show devices on the same subnetworks or floors. You can monitor network performance and diagnose network performance and connectivity problems. You can also:
• Take a snapshot of your network in its normal state. The snapshot records the state of your network at a particular instant. If you later have network performance problems, you can compare the current state of your network to the snapshot.
• Quickly determine the connectivity status of a device by noting the color of its map symbol. Red usually means that communication with a device has ceased.
• Diagnose connectivity problems by determining whether two devices can communicate. If they can communicate, then examine the route between the devices, the number of packets that were sent and lost, and the roundtrip time between the two devices.
• Manage MIB information (for example, collecting and storing MIB data for trend analysis and graphing) using MIB queries. Transcend compiles MIBs and allows you to navigate up and down the “MIB Tree” to retrieve MIB objects from devices. You can set thresholds for MIB data and generate events when a threshold is exceeded.
• Configure the software to act on certain events. The Event Categories window informs you of any unexpected events (which arrive in the form of traps).

For more information, see the documentation that is shipped with your software.

3Com SmartAgent Embedded Software

Traditional Simple Network Management Protocol (SNMP) management places the burden of collecting network management information on the management station. In this traditional model, software agents collect information about throughput, record errors or packet overflows, and measure performance based on established thresholds. Through a polling process, agents pass this information to a centralized network management station whenever they receive an SNMP query. Management applications then make the data useful and alert the user if there are problems on the device.
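As an illustration of how a management application might turn polled counters into something useful, the following Python sketch computes link utilization from two samples of an interface octet counter and compares it with a threshold. MIB-II defines such counters (for example, ifInOctets, a 32-bit counter) and the interface speed (ifSpeed). The function name, the sample values, and the 40 percent threshold are assumptions, and the SNMP queries that would fetch the samples are omitted.

```python
def utilization_percent(prev_octets: int, curr_octets: int,
                        seconds: float, speed_bps: int,
                        counter_bits: int = 32) -> float:
    """Percent utilization between two samples of an octet counter."""
    if seconds <= 0 or speed_bps <= 0:
        raise ValueError("seconds and speed_bps must be positive")
    # Counters wrap to zero at 2**counter_bits; the modulo handles one wrap.
    delta_octets = (curr_octets - prev_octets) % (2 ** counter_bits)
    return delta_octets * 8 * 100.0 / (seconds * speed_bps)


# Hypothetical samples taken 60 seconds apart on a 10 Mbps Ethernet port.
THRESHOLD = 40.0  # percent; an assumed alert level
util = utilization_percent(1_000_000, 38_500_000, seconds=60, speed_bps=10_000_000)
print(f"utilization: {util:.1f}%")
if util > THRESHOLD:
    print("alert: utilization threshold exceeded")
```

Every figure in this calculation has to travel from the agent to the management station before it can be evaluated, which is the polling burden that the traditional model places on the station and the network.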
  37. 37.
For more information about traditional SNMP management, see “SNMP Operation”.

As a useful companion to traditional network management methods, 3Com’s SmartAgent® technology places management intelligence into the software agent that runs within a 3Com device. This scalable solution reduces the amount of computational load on the management station and helps minimize management-related network traffic.

SmartAgent software, which uses the “RMON MIB”, is self-monitoring, collecting and analyzing its own statistical, analytical, and diagnostic data. In this way, you can conduct network management by exception; that is, you are notified only if a problem occurs. Management by exception is unlike traditional SNMP management, in which the management software collects all data from the device through polling.

SmartAgent software works autonomously and reports to the network management station whenever an exceptional network event occurs. The software can also take direct action without involving the management station. Devices that contain SmartAgent software may be able to:
• Perform broadcast throttling to minimize the flow of broadcast traffic on your network
• Monitor the ratio of good frames to bad frames
• Switch a resilient link pair to the standby path if the primary path corrupts frames
• Report if traffic on vital segments drops below minimum usage levels
• Disable a port for five seconds to clear problems, and then automatically reconnect it

To configure these advanced SmartAgent software features, see your device documentation.

The Transcend NCS applications LANsentry Manager and Traffix Manager make the RMON data that the SmartAgent software collects more usable by summarizing and correlating important information.
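Management by exception can be illustrated with a simplified alarm that has a rising and a falling threshold, in the spirit of the alarm group in the RMON MIB. The Python sketch below is not SmartAgent code. The class name, the sample values, and the thresholds are hypothetical, and startup behavior and sampling types are left out.

```python
from typing import List, Optional


class ExceptionAlarm:
    """Report only threshold crossings, not every sample."""

    def __init__(self, rising: float, falling: float) -> None:
        if falling > rising:
            raise ValueError("falling threshold must not exceed rising threshold")
        self.rising = rising
        self.falling = falling
        self._high = False  # True after a rising event, until a falling event

    def sample(self, value: float) -> Optional[str]:
        """Return 'rising', 'falling', or None if nothing needs reporting."""
        if not self._high and value >= self.rising:
            self._high = True
            return "rising"
        if self._high and value <= self.falling:
            self._high = False
            return "falling"
        return None


# Hypothetical broadcast packets per second seen by an agent.
alarm = ExceptionAlarm(rising=800, falling=200)
samples = [100, 450, 900, 950, 600, 850, 150, 920]
events: List[Optional[str]] = [alarm.sample(v) for v in samples]
print(events)
```

Of the eight samples, only three produce an event. The gap between the two thresholds keeps a value that hovers near the rising threshold from generating a stream of repeated notifications.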
  38. 38.
Other Commonly Used Tools

These commonly used tools can also help you troubleshoot your network:
• Network software, such as Ping, Telnet, and FTP and TFTP. You can use these applications to troubleshoot, configure, and upgrade your system.
• Network monitoring devices, such as Analyzers and Probes.
• Tools, such as Cable Testers, for working on physical problems.

Many of the tools that are discussed in this section are only useful in TCP/IP networks.

Ping

Packet Internet Groper (Ping) allows you to quickly verify the connectivity of your network devices. Ping attempts to transmit a packet from one device to a station on the network, and listens for the response to ensure that it was correctly received. You can validate connections on the parts of your network by pinging different devices:
• A successful response indicates that a valid network path exists between your station and the remote host and that the remote host is active.
• Slower response times than normal can indicate that the path is congested or obstructed.
• A failed response indicates that a connection is broken somewhere; use the message to help locate the problem. See “Tips on Interpreting Ping Messages”.

Some network devices, like the CoreBuilder™ 5000, must be configured to be able to respond to Ping messages. If you are not receiving responses from a device, first make sure that it is set up to be a Ping responder.

Strategies for Using Ping

Follow these strategies for using Ping:
• Ping devices when your network is operating normally so that you have a performance baseline for comparison. See “Identifying Your Network’s Normal Behavior” for more information.
• Ping by IP address when:
  - You want to test devices on different subnetworks. This method allows you to Ping your network segments in an organized way, rather than having to remember all the hostnames and locations.
  39. 39.
  - Your Domain Name System (DNS) server is down and your system cannot look up host names properly. You can Ping with IP addresses even if you cannot access hostname information.
• Ping by hostname when you want to identify DNS server problems.
• To troubleshoot problems that involve large packet sizes, Ping the remote host repeatedly, increasing the packet size each time.
• To determine if a link is erratic, perform a continuous Ping (using ping -s on UNIX), which indicates the time that it takes the device to respond to each Ping.
• To determine a route taken to a destination, use the trace route function (traceroute on UNIX; tracert on Windows).
• Consider creating a Ping script that periodically sends a Ping to all necessary networking devices. If a Ping failure message is received, the script can perform some action to notify you of the problem, such as paging you.
• Use the Ping functions of your network management platform. For example, in your HP OpenView map, select a device and click the right mouse button to gain access to Ping functions.

Tips on Interpreting Ping Messages

Use the following Ping failure messages to troubleshoot problems:

No reply from <destination>
  Indicates that the destination routes are available but that there is a problem with the destination itself.

<destination> is unreachable
  Indicates that your system does not know how to get to the destination. This message means either that routing information to a different subnetwork is unavailable or that a device on the same subnetwork is down.

ICMP host unreachable from gateway
  Indicates that your system can transmit to the target address using a gateway, but that the gateway cannot forward the packet properly because either a device is misconfigured or the gateway is not operating.
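The Ping script suggested in “Strategies for Using Ping” could look something like the following Python sketch, which also applies the interpretations from “Tips on Interpreting Ping Messages”. Several things here are assumptions: the function names are hypothetical, the ping command is assumed to accept "-c 1" (as on Linux, BSD, and macOS; other systems use different options), the exact wording of failure messages varies between implementations, and the notification is a plain print rather than a pager.

```python
import subprocess
from typing import Callable, Dict, Iterable, Tuple


def classify_failure(message: str) -> str:
    """Map a Ping failure message to the interpretation given in the text."""
    text = message.lower()
    # Check the more specific gateway message before the general one.
    if "icmp host unreachable" in text:
        return "gateway cannot forward: check for a misconfigured device or a failed gateway"
    if "unreachable" in text:
        return "no known route: check routing information or a down device on the subnetwork"
    if "no reply" in text:
        return "route is available: check the destination itself"
    return "unrecognized message"


def system_ping(host: str) -> Tuple[bool, str]:
    """Send one Ping. Assumes a ping command that accepts '-c 1'."""
    try:
        result = subprocess.run(["ping", "-c", "1", host],
                                capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    return result.returncode == 0, result.stdout + result.stderr


def check_devices(hosts: Iterable[str],
                  ping: Callable[[str], Tuple[bool, str]] = system_ping,
                  notify: Callable[[str], None] = print) -> Dict[str, str]:
    """Ping each host once and notify about every failure."""
    failures: Dict[str, str] = {}
    for host in hosts:
        ok, output = ping(host)
        if not ok:
            failures[host] = classify_failure(output)
            notify(f"{host}: {failures[host]}")
    return failures


# Demonstration with a stand-in ping function, so nothing is sent on the network.
def fake_ping(host: str) -> Tuple[bool, str]:
    return host != "10.0.0.9", f"No reply from {host}"


check_devices(["10.0.0.1", "10.0.0.9"], ping=fake_ping)
```

Passing the ping and notify functions as parameters keeps the sketch testable without network access. To run it periodically, you could schedule it with cron and replace notify with whatever alerting mechanism you use.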
  40. 40.
Telnet

Telnet, which is a login and terminal emulation program for Transmission Control Protocol/Internet Protocol (TCP/IP) networks, is a common way to communicate with an individual device. You log in to the device (a remote host) and use that remote device as if it were a local terminal.

If you have established an out-of-band Telnet connection with a device, you can use Telnet to communicate with that device even if the network is unavailable. This feature makes Telnet one of the most frequently used network troubleshooting tools. Usually, all device statistics and configuration capabilities are accessible by using Telnet to connect to the device’s console. For more information about setting up an out-of-band connection, see “Using Telnet, Serial Line, and Modem Connections”.

You can invoke the Telnet application on your local system and set up a link to a Telnet process that is running on a remote host. You can then run a program that is located on a remote host as if you were working at the remote system.

FTP and TFTP

Most network devices support either the File Transfer Protocol (FTP) or the Trivial File Transfer Protocol (TFTP) for downloading updates of system software. Updating system software is often the solution to networking problems that are related to agent problems. Also, new software features may help correct a networking problem.

FTP provides flexibility and security for file transfer by:
• Accepting many file formats, such as ASCII and binary
• Using data compression
• Providing Read and Write access so that you can display, create, and delete files and directories
• Providing password protection

TFTP is a simple version of FTP that does not list directories or require passwords. TFTP only transfers files to and from a remote server.

Analyzers

An analyzer, which is often called a Sniffer, is a network device that collects network data on the segment to which it is attached, a process called packet capturing. Software on the device analyzes this data, which is a process referred to as protocol analysis. Most analyzers can interpret different types of protocol traffic, such as TCP/IP, AppleTalk, and Banyan VINES traffic.
  41. 41.
You usually use analyzers for reactive troubleshooting: when you see a problem somewhere on your network, you attach an analyzer to capture and interpret the data from that area. Analyzers are particularly helpful for identifying intermittent problems. For example, if your network backbone has experienced moments of instability that prevent users from logging on to the network, you can attach an analyzer to the backbone to capture the intermittent problems when they happen again.

Probes

Like an analyzer, a probe is a network device that collects network data. Depending on its type, a probe can collect data from multiple segments simultaneously. It stores the collected data and transfers the data to an analysis site when requested. Unlike an analyzer, a probe does not interpret data.

A probe can be either a stand-alone device or an agent in a network device. The Transcend Enterprise Monitor 500 series and the SuperStack® II Monitor series are stand-alone RMON probes. LANsentry Manager and Traffix Manager use data from probes that comply with the “RMON MIB” or the “RMON2 MIB”.

You can use a probe daily to determine the health of your network. The Transcend NCS applications can interpret and report this data, alerting you to possible problems so that you can proactively manage your network. For example, an RMON2 probe can help you to analyze traffic patterns on your network. Use this data to make decisions about reconfiguring devices and end stations as needed.

Cable Testers

Cable testers examine the electrical characteristics of the wiring. They are most commonly used to ensure that building wiring and cables meet Category 5, 4, and 3 standards. For example, network technologies such as Fast Ethernet require the cabling to meet Category 5 requirements. Testers are also used to find defective and broken wiring in a building.
  42. 42.
CHAPTER 3: STEPS TO ACTIVELY MANAGING YOUR NETWORK

These sections describe the steps that you can take to effectively troubleshoot your network when the need arises:
• Designing Your Network for Troubleshooting
• Preparing Devices for Management
• Configuring Transcend NCS
• Knowing Your Network

Designing Your Network for Troubleshooting

By designing your network for troubleshooting, you can access key devices on your network when your network is experiencing connectivity or performance problems. Having adequate management access depends on these design criteria:
• Position of the management station so that it can gather the greatest amount of network data through Simple Network Management Protocol (SNMP) polling
• Position of probes for distributed management of critical networks
• Ability to communicate with each device even when your management station cannot access the network

The following sections discuss how to design your network with the preceding criteria in mind:
• Positioning Your SNMP Management Station
• Using Probes
• Monitoring Business-critical Networks
• Using Telnet, Serial Line, and Modem Connections
• Using Communications Servers
• Setting Up Redundant Management
  43. 43.
• Other Tips on Network Design

Positioning Your SNMP Management Station

In a typical LAN, locate your management station directly off the backbone where it can conduct SNMP polling and manage network devices. The backbone is usually the optimum location for the management station because:
• The backbone is not subject to the failures of individual subnetworked routers or switches.
• In a partial network outage, the information collected by a backbone management station is probably more accurate than from a station in a routed subnetwork.
• The backbone is usually protected with redundant power and technologies, like Fiber Distributed Data Interface (FDDI), that correct their own problems. This redundancy ensures that the backbone remains operational, even when other areas of the network are having problems.
• The backbone is typically faster and has a higher bandwidth than other areas of your network, making it a more efficient location for a management station.

Make sure that the capacity of your backbone can accommodate the SNMP traffic that the management applications generate.

Figure 2 shows a management station that is set up at the network backbone and polling network devices.
  44. 44.
[Figure 2: SNMP Management at the Backbone. The management workstation attaches to the backbone through a NIC or network device; x = network devices that you want to poll.]

Although SNMP management from the backbone is a good way to keep track of what is happening on your network, do not rely on it exclusively. Because SNMP management occurs in-band (that is, SNMP traffic shares network bandwidth with data traffic), network troubleshooting using SNMP can become a problem in these ways:
• Very heavy data traffic or a break in the network can make it difficult or impossible for the management station to poll a device.
• Traffic that SNMP polling adds to the network may contribute to networking problems.

Using Probes

To minimize the frequency of SNMP traffic on your network, set up one or more Probes to collect Remote Monitoring (RMON) data from the network devices. In the distributed model illustrated in Figure 3, the management station uses SNMP polling to collect data from the probes rather than from all the network devices. Distributing the management over the network ensures that some data collection continues even if you have network problems.

Many management applications support data from MIBs other than the RMON MIBs. For this reason, even if you are using RMON probes, some SNMP polling to individual devices from a key management station is always useful for a complete picture of your network.
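A rough calculation can show why polling a few probes adds less in-band traffic than polling every device. The Python sketch below is a back-of-the-envelope estimate only. The function name is hypothetical, and the device counts, polls per cycle, bytes per request-and-response exchange, and polling interval are all assumed figures. In practice a probe may return more data per poll than a single device does, so measure your own network before you draw conclusions.

```python
def polling_load_bps(targets: int, polls_per_target: int,
                     bytes_per_exchange: int, interval_s: float) -> float:
    """Average bits per second that periodic SNMP polling adds to the network."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return targets * polls_per_target * bytes_per_exchange * 8 / interval_s


# Assumed figures: 10 polls per target per cycle, 200 bytes per exchange,
# one polling cycle every 60 seconds.
direct = polling_load_bps(targets=150, polls_per_target=10,
                          bytes_per_exchange=200, interval_s=60)
via_probes = polling_load_bps(targets=3, polls_per_target=10,
                              bytes_per_exchange=200, interval_s=60)
print(f"polling 150 devices directly: {direct:.0f} bps")
print(f"polling 3 probes instead:     {via_probes:.0f} bps")
```

The estimate scales linearly with the number of targets and inversely with the polling interval, so those are the two values to adjust first if polling traffic becomes a concern.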
  45. 45.
[Figure 3: Management at the Backbone with an Attached Probe. The management workstation and a probe attach to the backbone through NICs or network devices, and a second probe is placed among the network devices; x = network devices that you want to poll.]

To extend your remote monitoring capabilities, use embedded RMON probes or roving analysis (monitoring one port for a period of time, moving on to another port for a while, and so on). However, with roving analysis, you cannot see a historical analysis of the ports because the probe is moving from one port to another.

Some probes, like 3Com’s Enterprise Monitor, are designed to support the large number of interfaces that are found in switched environments. The probe’s high port density supports this multi-segmented switched environment. You can also use the probe’s interfaces to monitor mirror (or copy) ports on the switch, which means that all data received and transmitted on a port is also sent to the probe.

Probes do not indicate which port has caused an error. Only a managed hub (a hub or switch with an onboard management module) can provide that level of detail. Probes and a hub’s own management module complement each other.
  46. 46.
Monitoring Business-critical Networks

On business-critical networks, you need to increase your level of management by dedicating probes to the essential areas of your network. For detailed network management, it is not enough to gather raw performance figures. You need to know, at the network and conversation level, what is generating the traffic and when it is being generated. For this type of analysis, use reporting tools, such as Traffix Manager, and low-level, fault diagnostic tools, such as LANsentry® Manager.

The three critical areas to monitor on this type of network are discussed in these sections and shown in Figure 4:
• FDDI Backbone Monitoring
• Internet WAN Link Monitoring
• Switch Management Monitoring

[Figure 4: Probes Monitoring a Business-critical Network. Labels: management workstation with a direct connection to a SuperStack® II Enterprise Monitor with FDDI module on the FDDI backbone; a SuperStack II Enterprise Monitor performing inline monitoring on Fast Ethernet; WAN link; x = network devices that you want to poll; a second symbol marks possible probe attachment to a switch’s roving analysis port.]
  47. 47.
FDDI Backbone Monitoring

You need to continually monitor whether the FDDI backbone is being overutilized and, if so, by what type of traffic. By placing the SuperStack® II Enterprise Monitor with an FDDI media module directly at the backbone, you can gather utilization and host matrix information. Traffix Manager uses these data to provide regular segment utilization reports and Top-N host reports. In addition, the probe provides a full range of FDDI performance statistics that LANsentry Manager can record or that SNMP traps can report to the management station.

To ensure management access to the probe, provide a direct connection to the probe from your management station. This connection lets you access probe data even if the ring is unusable, and it keeps management traffic off the main ring.

Internet WAN Link Monitoring

The Internet link is a concern for dedicated network management because it:
• Represents an external cost to the company
• Requires budgeting
• Is a possible security problem

In a way that is similar to monitoring the FDDI backbone, Traffix Manager reports can indicate whether you are paying for too much bandwidth or whether you need to purchase more. Traffix Manager can also indicate the level of use on a workgroup basis for internal billing and highlight the top sites that users visit. Similarly, you can monitor for unexpected conversations and protocols.

You also need to know the error rates on this link and whether you are experiencing congestion because of circumstances on the Internet provider’s network. LANsentry Manager can record and display these statistics and provide a detailed real-time view.

Switch Management Monitoring

The third area of interest in this network is the large number of switch-to-end station links.
When detailed analysis of these devices is required (for example, if one of the ports on the network suddenly reports much higher traffic than normal), you need to track the source of the problem and decide whether you can optimize the traffic path. In this case, you need a way to view the traffic on the switch port at a conversation level.
  48. 48.
By placing a SuperStack II Enterprise Monitor in a central location, you can easily attach it to the switches that have the most Ethernet ports as the need arises. By using the roving analysis feature of many 3Com devices, you can copy data from a monitored port to the port on the switch that is connected to the SuperStack II Enterprise Monitor. When a problem arises, you activate roving analysis for a particular switch, and LANsentry Manager or Traffix Manager collects the data from the SuperStack II Enterprise Monitor. These applications can then monitor the network data for the devices that are connected to that switch.

Using Telnet, Serial Line, and Modem Connections

To minimize your dependency on SNMP management, set up a way to reach the console of your key networking devices. Through the console, you can often view Ethernet, FDDI, Asynchronous Transfer Mode (ATM), and token ring statistics, view routing and bridging tables, and determine and modify device configurations.

Out-of-band (that is, management using a dedicated line to a device) console connections are also key to network troubleshooting. If the network goes down, your console connections are still available. The types of console connections include:
• Telnet: Out-of-band and in-band access using a network connection. For example, on 3Com’s CoreBuilder™ 6000 switch, you can use Telnet to access the management console through a dedicated Ethernet connection to the management module (out-of-band) and from any network attached to the device (in-band).
• Serial line: Direct, out-of-band access using a terminal connection. This type of connection allows you to maintain your connections to a device if it reboots.
• Modem: Remote, out-of-band access using a modem connection.

Figure 5 shows management of a device through the serial line and modem ports.