
Network Troubleshooting Guide



  1. 1. Transcend Management Software® Network Troubleshooting Guide '97 for Windows NT®
  2. 2. Transcend® Management Software Network Troubleshooting Guide Part No. 09-1293-000 Published September 1997
  3. 3. 3Com Corporation, 5400 Bayfront Plaza, Santa Clara, California 95052-8145. Copyright © 1997, 3Com Corporation. All rights reserved. No part of this documentation may be reproduced in any form or by any means or used to make any derivative work (such as translation, transformation, or adaptation) without permission from 3Com Corporation. 3Com Corporation reserves the right to revise this documentation and to make changes in content from time to time without obligation on the part of 3Com Corporation to provide notification of such revision or change. 3Com Corporation provides this documentation without warranty of any kind, either implied or expressed, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. 3Com may make improvements or changes in the product(s) and/or the program(s) described in this documentation at any time. UNITED STATES GOVERNMENT LEGENDS: If you are a United States government agency, then this documentation and the software described herein are provided to you subject to the following restricted rights: For units of the Department of Defense: Restricted Rights Legend: Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subparagraph (c) (1) (ii) for Restricted Rights in Technical Data and Computer Software Clause at 48 C.F.R. 52.227-7013. 3Com Corporation, 5400 Bayfront Plaza, Santa Clara, California 95052-8145. For civilian agencies: Restricted Rights Legend: Use, reproduction, or disclosure is subject to restrictions set forth in subparagraph (a) through (d) of the Commercial Computer Software – Restricted Rights Clause at 48 C.F.R. 52.227-19 and the limitations set forth in 3Com Corporation’s standard commercial agreement for the software. Unpublished rights reserved under the copyright laws of the United States.
If there is any software on removable media described in this documentation, it is furnished under a license agreement included with the product as a separate document, in the hard copy documentation, or on the removable media in a directory file named LICENSE.TXT. If you are unable to locate a copy, please contact 3Com and a copy will be provided to you. Unless otherwise indicated, 3Com registered trademarks are registered in the United States and may or may not be registered in other countries. 3Com, the 3Com logo, Boundary Routing, EtherDisk, EtherLink, EtherLink II, LANplex, LANsentry, LinkBuilder, LinkSwitch, NetAge, NETBuilder, NETBuilder II, Parallel Tasking, SmartAgent, SuperStack, TokenDisk, TokenLink, Transcend, and ViewBuilder are registered trademarks of 3Com Corporation. CoreBuilder, FDDILink, NetProbe, and Traffix are trademarks of 3Com Corporation. 3ComFacts is a service mark of 3Com Corporation. AppleTalk and Macintosh are registered trademarks of Apple Computer Company. VINES is a registered trademark of Banyan Systems, Inc. CompuServe is a registered trademark of CompuServe, Inc. DECnet is a trademark of Digital Equipment Corporation. HP and OpenView are registered trademarks of Hewlett-Packard Co. AIX, IBM, and NetView are registered trademarks of International Business Machines Corporation. Zip is a trademark of Iomega. Windows and Windows NT are registered trademarks of Microsoft Corporation. Sniffer is a registered trademark of Network General Holding Corporation. Novell is a registered trademark of Novell, Inc. OpenWindows, SunNet Manager, and SunOS are trademarks of Sun Microsystems Inc. SPARCstation is a trademark and is licensed exclusively to Sun Microsystems Inc. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Ltd. Other brand and product names may be registered trademarks or trademarks of their respective holders.
Guide written by Patricia Johnson, Sarah Newman, Iain Young, and Adam Bell. Technical information provided by Dan Bailey, Bob McTague, Graeme Robertson, and Andrew Ward. Edited by Beth Britt and Bonnie Jo Collins. Production by Christine Zak.
  4. 4. CONTENTS ABOUT THIS GUIDE Finding Specific Information in This Guide 12 What to Expect from This Guide 12 Conventions 13 Related Documentation 15 Documents 15 Help Systems 15 PART I BEFORE TROUBLESHOOTING NETWORK TROUBLESHOOTING OVERVIEW Introduction to Network Troubleshooting 19 About Connectivity Problems 19 About Performance Problems 20 Solving Connectivity and Performance Problems 20 Network Troubleshooting Framework 21 Troubleshooting Strategy 23 Recognizing Symptoms 24 User Comments 24 Network Management Software Alerts 24 Analyzing Symptoms 25 Understanding the Problem 25 Identifying and Testing the Cause of the Problem 26 Sample Problem Analysis 27 Equipment for Testing 28 Solving the Problem 29 iii
  5. 5. YOUR NETWORK TROUBLESHOOTING TOOLBOX Transcend Applications 31 Transcend Central 32 Status View 32 Status Watch 32 MAC Watch 33 Web Reporter 33 LANsentry Manager 33 Traffix Manager 34 Device View 35 Network Management Platforms 35 3Com SmartAgent Embedded Software 36 Other Commonly Used Tools 38 Ping 38 Strategies for Using Ping 39 Tips on Interpreting Ping Messages 40 Telnet 40 FTP and TFTP 40 Analyzers 41 Probes 41 Cable Testers 42 STEPS TO ACTIVELY MANAGING YOUR NETWORK Designing Your Network for Troubleshooting 43 Positioning Your SNMP Management Station 44 Using Probes 45 Monitoring Business-critical Networks 47 FDDI Backbone Monitoring 48 Internet WAN Link Monitoring 48 Switch Management Monitoring 48 Using Telnet, Serial Line, and Modem Connections 49 Using Communications Servers 50 Setting Up Redundant Management 51 Other Tips on Network Design 52 Management Station Configuration 52 More Tips 52 iv
  6. 6. Preparing Devices for Management 53 Configuring Management Parameters 53 Configuring Traps 53 Configuring Transcend Software 54 Monitoring Devices 54 Setting Thresholds and Alarms 54 Setting Thresholds in Status Watch 55 Setting Thresholds and Alarms in LANsentry Manager 55 Refining Alarm Settings 56 Setting Alarms Based on a Baseline 57 Other Tips for Setting Thresholds and Alarms 57 Knowing Your Network 58 Knowing Your Network’s Configuration 58 Site Network Map 58 Logical Connections 60 Device Configuration Information 60 Other Important Data About Your Network 61 Identifying Your Network’s Normal Behavior 62 Baselining Your Network 62 Identifying Background Noise 63 PART II NETWORK CONNECTIVITY PROBLEMS AND SOLUTIONS MANAGER-TO-AGENT COMMUNICATION Manager-to-Agent Communication Overview 67 Understanding the Problem 67 Identifying the Problem 67 Solving the Problem 68 Checking Management Configurations 68 Manager-to-Agent Communication Reference 69 IP Address 69 Gateway Address 69 Subnetwork Mask 69 SNMP Community Strings 69 SNMP Traps 72 v
  7. 7. FDDI CONNECTIVITY FDDI Connectivity Overview 73 Understanding the Problem 73 Identifying the Problem 75 Solving the Problem 76 Monitoring FDDI Connections 77 Status Watch 77 Making Your FDDI Connections More Resilient 77 Implementing Dual Homing 77 Installing an Optical Bypass Unit 79 FDDI Connectivity Reference 79 Peer Wrap Condition 79 Twisted Ring Condition 80 Undesired Connection Attempt Event 80 PART III NETWORK PERFORMANCE PROBLEMS AND SOLUTIONS BANDWIDTH UTILIZATION Bandwidth Utilization Overview 85 Understanding the Problem 85 Identifying the Problem 85 Solving the Problem 86 Identifying Utilization Problems 86 Status Watch 86 Generating Historical Utilization Reports 88 Web Reporter 88 Bandwidth Utilization Reference 89 ATM Utilization 89 Ethernet Utilization 89 FDDI Utilization 90 Token Ring Utilization 90 vi
  8. 8. BROADCAST STORMS Broadcast Storms Overview 93 Understanding the Problem 93 Identifying the Problem 93 Solving the Problem 94 Identifying a Broadcast Storm 94 Status Watch 94 Traffix Manager 96 Disabling the Offending Interface 97 MAC Watch 97 Correcting Spanning Tree Misconfigurations 98 Device View 98 Broadcast Storms Reference 99 Broadcast Packets 99 Multicast Packets 99 DUPLICATE ADDRESSES Duplicate Addresses Overview 101 Understanding the Problem 101 Identifying the Problem 101 Solving the Problem 101 Finding Duplicate MAC Addresses 102 MAC Watch 102 Status Watch 103 Finding Duplicate IP Addresses 103 MAC Watch 104 LANsentry Manager 104 Duplicate Addresses Reference 105 Duplicate MAC Addresses 105 Duplicate IP Addresses 106 vii
  9. 9. ETHERNET PACKET LOSS Ethernet Packet Loss Overview 107 Understanding the Problem 107 Identifying the Problem 108 Solving the Problem 108 Checking for Packet Loss 109 Status Watch 109 LANsentry Manager Network Statistics Graph 111 Device View 114 Ethernet Packet Loss Reference 115 Alignment Errors 115 Collisions 115 CRC Errors 116 Excessive Collisions 116 FCS Errors 116 Late Collisions 116 Nonstandard Ethernet Problems 117 Receive Discards 118 Too Long Errors 118 Too Short Errors 118 Transmit Discards 118 FDDI RING ERRORS FDDI Ring Errors Overview 119 Understanding the Problem 119 Identifying the Problem 119 Solving the Problem 120 Identifying Ring Errors 121 Status Watch 121 FDDI Ring Errors Reference 121 Elasticity Buffer Error Condition 121 Frame Error Condition 121 Frames Not Copied Condition 122 Link Error Condition 122 MAC Neighbor Change Event 122 viii
  10. 10. NETWORK FILE SERVER TIMEOUTS Network File Server Timeout Overview 123 Understanding the Problem 123 Identifying the Problem 124 Solving the Problem 124 Checking for Obvious Errors 124 Ping and Telnet 124 LANsentry Manager Alarms View 124 LANsentry Manager Statistics View 125 LANsentry Manager History View 125 Reproducing the Fault While Monitoring the Network 126 LANsentry Manager Top-N Graph 126 LANsentry Manager Packet Capture 126 LANsentry Manager Packet Decode 127 MAC Watch 128 LANsentry Manager Packet Decode 128 Correcting the Fault 129 Network File Server Timeouts Reference 129 Jabbering 129 Network File System (NFS) Protocol 129 ix
  11. 11. PART IV REFERENCE SNMP IN NETWORK TROUBLESHOOTING SNMP Operation 133 Manager/Agent Operation 133 SNMP Messages 134 Trap Reporting 134 Security 135 SNMP MIBs 136 MIB Tree 136 MIB-II 138 RMON MIB 139 RMON2 MIB 140 3Com Enterprise MIBs 141 INFORMATION RESOURCES Books 143 URLs 144 INDEX x
  12. 12. ABOUT THIS GUIDE
      About This Guide provides an overview of this guide, describes guide conventions, tells you where to look for specific information, and lists other publications that may be useful.
      This guide helps you to troubleshoot connectivity and performance problems on your network using Transcend® software and other tools. This guide is intended for network administrators who understand networking technologies and how to integrate networking devices. You should have a working knowledge of:
      • Transmission Control Protocol/Internet Protocol (TCP/IP)
      • Simple Network Management Protocol (SNMP)
      • Network management platforms (especially HP OpenView Network Node Manager from Hewlett-Packard)
      • 3Com devices on your network
      You should also be familiar with the interface and features of the Transcend management software you have installed. With subsequent releases of Transcend management software, this guide will be updated with new troubleshooting information and additional Transcend troubleshooting tools. The most current version of this guide is on the 3Com Web site:
  13. 13. Finding Specific Information in This Guide
      This guide, which is available online (in PDF and HTML formats) and on paper, is designed to be used online. For the online version, cross-references to other sections are indicated with links in blue, underlined text, which you can click. You can print any pages as needed. Table 1 provides guidelines for navigating through this document.

      Table 1 Guidelines for Finding Specific Information in This Guide
      • If you are looking for an introduction to network troubleshooting, information about troubleshooting tools, and guidelines for getting ready for management, see Part I: Before Troubleshooting (page 17). Note: This part is recommended reading for users who are new to network management.
      • If you are looking for specific troubleshooting scenarios that will help you solve real network problems, see Part II: Network Connectivity Problems and Solutions (page 65) and Part III: Network Performance Problems and Solutions (page 83).
      • If you are looking for useful background information to help you with troubleshooting tasks, see Part IV: Reference (page 131).

      What to Expect from This Guide
      This guide demonstrates how to troubleshoot problems on your network with the help of Transcend management software and other tools. It also shows you how to use Transcend software to move beyond day-to-day troubleshooting to proactive network management. This guide does not help you identify and correct problems with installation and use of Transcend software. For that type of troubleshooting, see:
      • The Transcend Management Software Installation Guide (for help with installation and startup problems)
      • The help or user guide for a specific application (for information about troubleshooting application problems)
      This guide focuses on technologies that are important for troubleshooting your network and shows how these technologies are
  14. 14. applied using Transcend management software. For additional information, see the resources listed in Information Resources (page 143).

      Conventions
      Table 2, Table 3, and Table 4 list conventions that are used throughout this guide.

      Table 2 Notice Icons (Icon, Notice Type, Description)
      • Information note: Important features or instructions
      • Caution: Information to alert the user to potential damage to a program, system, or device
      • Warning: Information to alert the user to potential personal injury

      Table 3 Troubleshooting Icons (Icon, Type, Points out)
      • Troubleshooting procedure: Where a troubleshooting procedure begins
      • Troubleshooting tip: Tips and other useful information for performing a troubleshooting task or working with a Transcend management software tool
  15. 15. Table 4 Text Conventions (Convention, Description)
      • Syntax: The word “syntax” means you must evaluate the syntax provided and supply the appropriate values. Placeholders for values you must supply appear in angle brackets. Example: Enable RIPIP by using the following syntax: SETDefault !<port> -RIPIP CONTrol = Listen In this example, you must supply a port number for <port>.
      • Commands: The word “command” means you must enter the command exactly as shown in text and press the Return or Enter key. Example: To remove the IP address, enter the following command: SETDefault !0 -IP NETaddr =
      • Screen displays: This typeface represents information as it appears on the screen.
      • The words “enter” and “type”: When you see the word “enter” in this guide, you must type something, and then press the Return or Enter key. Do not press the Return or Enter key when an instruction simply says “type.”
      • [Key] names: Key names appear in text in one of two ways: referred to by their labels, such as “the Return key” or “the Escape key,” or written with brackets, such as [Return] or [Esc]. If you must press two or more keys simultaneously, the key names are linked with a plus sign (+). Example: Press [Ctrl]+[Alt]+[Del].
      • Menu commands and buttons: Menu commands or button names appear in italics. Example: From the Help menu, select Contents.
      • Words in italicized type: Italics emphasize a point or denote new terms at the place where they are defined in the text.
      • Words in boldface type: Bold text denotes key features.
  16. 16. Related Documentation
      This guide is complemented by other 3Com documents and comprehensive help systems.

      Documents
      The following documents are shipped with your Transcend software on the compact disc entitled Transcend Enterprise Manager Online Documentation Set for Windows NT v1.0 and Windows v.6.1:
      • Transcend Management Software Installation Guide (A paper version is also shipped with the product.)
      • Transcend Management Software Getting Started Guide (A paper version is also shipped with the product.)
      • Transcend Management Software Transcend Central User Guide
      • Transcend Management Software Status View User Guide
      • Transcend Management Software LANsentry Manager User Guide
      • Transcend Management Software ATMvLAN Manager User Guide
      • Transcend Management Software Device View User Guide
      Also, see the Transcend Traffix Manager User Guide, shipped with the Traffix Manager software.

      Help Systems
      Each Transcend application contains a help system that describes how to use all the features of the application. Help includes window descriptions, instructions, conceptual information, and troubleshooting tips for that application. You can access help from:
      • The Help menu in any application by selecting Help Topics (in the Help Topics window, you can view the Contents and Index)
      • A Help button in windows and dialog boxes
      • Your 3Com/Transcnd/Help directory (or the directory that you have set for your Transcend software installation)
  17. 17. 16 ABOUT THIS GUIDE
  18. 18. PART I: BEFORE TROUBLESHOOTING
      • Network Troubleshooting Overview (page 19)
      • Your Network Troubleshooting Toolbox (page 31)
      • Steps to Actively Managing Your Network (page 43)
  19. 19. NETWORK TROUBLESHOOTING OVERVIEW
      These sections introduce you to the concepts and practice of network troubleshooting:
      • Introduction to Network Troubleshooting (page 19)
      • Network Troubleshooting Framework (page 21)
      • Troubleshooting Strategy (page 23)

      Introduction to Network Troubleshooting
      Network troubleshooting means recognizing and diagnosing networking problems with the goal of keeping your network running optimally. As a network administrator, your primary concern is maintaining connectivity of all devices (a process often called fault management). You also continually evaluate and improve your network’s performance. Because serious networking problems can sometimes begin as performance problems, paying attention to performance can help you address issues before they become serious.

      About Connectivity Problems
      Connectivity problems occur when end stations cannot communicate with other areas of your local or wide-area network. Using management tools, you can often fix a connectivity problem before the user even notices it. Connectivity problems include:
      • Loss of connectivity: Immediately correct any connectivity breaks. When users cannot access areas of your network, your organization’s effectiveness is impaired.
      • Intermittent connectivity: If connectivity is erratic, investigate the problem immediately. Although users have access to network resources some of the time, they are still facing periods of downtime. Intermittent connectivity problems could indicate that your network is on the verge of a major break.
      • Timeout problems: Timeouts cause loss of connectivity, but are often associated with poor network performance.
  20. 20. About Performance Problems
      Your network has performance problems when it is not operating as effectively as it should. For example, response times may be slow, the network may not be as reliable as usual, and users may be complaining that it takes them longer to do their work. Some performance problems are intermittent, like instances of duplicate addresses. Other problems can indicate a growing strain on your network, such as consistently high utilization rates. If you regularly check your network for performance problems, you can extend the usefulness of your existing network configuration and plan network enhancements, instead of waiting for a performance problem to adversely affect the users’ productivity.

      Solving Connectivity and Performance Problems
      When troubleshooting your network, you employ tools and knowledge already at your disposal. With an in-depth understanding of your network, you can use network software tools, such as Ping (page 38), and network devices, such as Analyzers (page 41), to locate problems, and then make corrections, such as swapping equipment or reconfiguring segments, based on your analysis. Transcend® management software provides another set of tools for network troubleshooting. These tools have graphical user interfaces that make managing and troubleshooting your network easier. With Transcend Applications (page 31), you can:
      • Baseline your network’s normal status so that you can use it as a basis for comparison when troubleshooting
      • Precisely monitor network events
      • Be immediately notified of critical problems on your network, such as a device losing connectivity
      • Establish alert thresholds that warn you of potential problems so that you can correct problems before they affect your network
      • Resolve problems by disabling ports or reconfiguring devices
      See Your Network Troubleshooting Toolbox (page 31) for details about each troubleshooting tool.
  21. 21. Network Troubleshooting Framework
      The International Organization for Standardization (ISO) Open Systems Interconnection (OSI) reference model is the foundation of all network communications. This seven-layer structure provides a clear picture of how network communications work. Protocols (rules) govern communications between the layers of a single system and among several systems. In this way, devices made by different manufacturers or using different designs can use different protocols and still be able to communicate.
      Understanding how network troubleshooting fits into the framework of the OSI model will help you to identify at what layer problems are located and which type of troubleshooting tools you might want to use. For example, unreliable packet delivery could be caused by a problem with the transmission media or with a router configuration. If you are receiving high rates of FCS Errors (page 116) and Alignment Errors (page 115), which you can monitor with Status Watch, then the problem is probably located at the physical layer and not the network layer.
      Figure 1 shows how to troubleshoot the layers of the OSI model. The data that network management tools can collect as it relates to the OSI model layers is described in Table 5.

      Table 5 Network Data and the OSI Model Layers (Layer, Data Collected, Transcend Tool Used)
      • Application, Presentation, Session, and Transport: Protocol information and other Remote Monitoring (RMON) and RMON2 data. Tools: LANsentry Manager (page 33); Traffix Manager (page 34) (for more detail)
      • Network: Routing information. Tools: Status Watch (page 32); LANsentry® Manager (for more detail); Traffix Manager (for more detail)
      • Data Link: Traffic counts and other packet breakdowns. Tools: Status Watch; LANsentry Manager (for more detail)
      • Physical: Error counts. Tools: Status Watch
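      Table 5 can also be kept as a small lookup when you script your own checks. The following Python sketch is an illustration only: the names OSI_TABLE, COUNTER_LAYER, and tools_for_counter are hypothetical and not part of any Transcend product. It encodes the layer, data, and tool columns from Table 5 and the FCS and alignment error example from the paragraph above.

```python
# Hypothetical encoding of Table 5: data collected per OSI layer and the
# Transcend tools the guide lists for that layer.
UPPER_LAYERS = (
    "Protocol information and other RMON and RMON2 data",
    ["LANsentry Manager", "Traffix Manager"],
)
OSI_TABLE = {
    "application": UPPER_LAYERS,
    "presentation": UPPER_LAYERS,
    "session": UPPER_LAYERS,
    "transport": UPPER_LAYERS,
    "network": (
        "Routing information",
        ["Status Watch", "LANsentry Manager", "Traffix Manager"],
    ),
    "data link": (
        "Traffic counts and other packet breakdowns",
        ["Status Watch", "LANsentry Manager"],
    ),
    "physical": ("Error counts", ["Status Watch"]),
}

# Assumption for illustration: counter names are made up; the guide only
# says that high FCS and alignment error rates point to the physical layer.
COUNTER_LAYER = {
    "fcs_errors": "physical",
    "alignment_errors": "physical",
}


def tools_for_counter(counter_name):
    """Return (layer, tools) for a counter that is climbing."""
    layer = COUNTER_LAYER[counter_name]
    _data, tools = OSI_TABLE[layer]
    return layer, tools


if __name__ == "__main__":
    print(tools_for_counter("fcs_errors"))
```

      A lookup like this only narrows where to start; the tools themselves still supply the detail.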
  22. 22. Figure 1 OSI Reference Model and Network Troubleshooting
      [Figure: the seven OSI layers (7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data link, 1 Physical) shown with examples at each level (SNMP manager, agent, and proxy agent; Telnet, rlogin, and FTP; TCP and UDP; IP and IPX; LLC and MAC; PHY and PMD for Ethernet, Token Ring, and FDDI) and with the troubleshooting tools that apply (SNMP console managers, analyzers, probes, Traffix Manager, LANsentry Manager, Status Watch, and cable testing tools).]
      For information about network troubleshooting tools, see Your Network Troubleshooting Toolbox (page 31).
  23. 23. Troubleshooting Strategy
      How do you know when you are having a network problem? The answer to this question depends on your site’s network configuration and on your network’s normal behavior. See Knowing Your Network (page 58) for more information. If you notice changes on your network, ask the following questions:
      • Is the change expected or unusual?
      • Has this event ever occurred before?
      • Does the change involve a device or network path for which you already have a backup solution in place?
      • Does the change interfere with vital network operations?
      • Does the change affect one or many devices or network paths?
      Once you have an idea of how the change is affecting your network, you can categorize it as critical or noncritical. Both of these categories need resolution (except for changes that are one-time occurrences); the difference between the categories is the time you have to fix the problem.
      Using a strategy for network troubleshooting helps you to approach a problem methodically and resolve it with minimal disruption to the network users. A good approach to problem resolution is:
      • Recognizing Symptoms (page 24)
      • Understanding the Problem (page 25)
      • Identifying and Testing the Cause of the Problem (page 26)
      • Solving the Problem (page 29)
  24. 24. Recognizing Symptoms
      The first step to resolving any problem is to identify and interpret the symptoms. You may discover network problems in several ways. You may have users complaining that the network seems slow or that they cannot connect to a server. You may pass your network management station and notice that a node icon is red. Your beeper may go off and display the message: WAN connection down.

      User Comments
      While you can often solve networking problems before users notice a change in their environment, you invariably get feedback from your users about how the network is running, such as:
      • “I can’t print.”
      • “I can’t access the application server.”
      • “It’s taking me much longer to copy files across the network than it usually does.”
      • “I can’t log on to a remote server.”
      • “When I send e-mail to our other site, I get a routing error message.”
      • “My system freezes whenever I try to Telnet.”

      Network Management Software Alerts
      Network management software, as described in Your Network Troubleshooting Toolbox (page 31), can alert you to areas of your network that need attention. For example:
      • The application displays red (Warning) icons.
      • Your weekly Top-N utilization report (which provides you with a table of the top ten ports showing the highest utilization rates) shows that one port is experiencing much higher utilization levels than normal.
      • You receive an e-mail message from your network management station that the threshold for broadcast and multicast packets has been exceeded.
      These signs usually provide additional information about the problem, allowing you to focus on the right area.
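      The Top-N report and the threshold alert described above reduce to a sort and a filter. The following Python sketch shows the idea only; it is not how the Transcend applications compute their reports. The function names, the port labels, and the assumption that utilization arrives as a percentage per port are all hypothetical.

```python
def top_n_ports(utilization, n=10):
    """Return the n (port, utilization %) pairs with the highest utilization.

    `utilization` maps a port label to its utilization percentage.
    """
    ranked = sorted(utilization.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def ports_over_threshold(utilization, threshold_pct):
    """Return the port labels whose utilization exceeds the threshold."""
    return sorted(port for port, pct in utilization.items() if pct > threshold_pct)


if __name__ == "__main__":
    # Made-up sample readings for illustration.
    sample = {"1/1": 12.0, "1/2": 85.5, "1/3": 40.0, "2/1": 3.2}
    print(top_n_ports(sample, n=2))
    print(ports_over_threshold(sample, threshold_pct=50.0))
```

      Whether a reading is "much higher than normal" depends on your own baseline, so the threshold is left as a parameter.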
  25. 25. Analyzing Symptoms
      When confronted with a symptom, ask yourself these types of questions to narrow the location of the problem and to get more data for analysis:
      • To what degree is the network not acting normally (for example, does it now take one minute to perform a task that normally takes five seconds)?
      • On what subnetwork is the user located?
      • Is the user trying to reach a server, end station, or printer on the same subnetwork or on a different subnetwork?
      • Are many users complaining that the network is operating slowly or that a specific network application is operating slowly?
      • Are many users reporting network logon failures?
      • Are the problems intermittent? For example, some files may print with no problems, while other printing attempts generate error messages, make users lose their connections, and cause systems to freeze.

      Understanding the Problem
      Networks are designed to move packets of data from a transmitting device to a receiving device. When communication becomes problematic, you must determine why packets are not traveling as expected and then find a solution. The two most common causes for packets not moving reliably from source to destination are:
      • The physical connection breaks (that is, a cable is unplugged or broken).
      • A network device is not working properly and cannot send or receive some or all packets.
      Network management software can easily locate and report a physical connection break (layer 1 problem). You will find it harder to determine why a network device is not working as expected, which is often related to a layer 2 or a layer 3 problem.
  26. 26. When trying to determine why a network device is not working properly, check first for:
      • Valid service: Is the device configured properly for the type of service it is supposed to provide? For example, has Quality of Service (QoS), the definition of the transmission parameters, been established?
      • Restricted access: Is an end station supposed to be able to connect with a specific device or is that connection restricted? For example, is a firewall set up preventing that device from accessing certain network resources?
      • Correct configuration: Is there a misconfiguration of IP address, network mask, gateway, or broadcast address? Network problems are commonly caused by misconfiguration of newly connected or configured devices. See Manager-to-Agent Communication (page 67) for more information.

      Identifying and Testing the Cause of the Problem
      After you develop a possible theory about what is causing the problem, you must test your theory. The test must conclusively prove or disprove your theory. A general rule of troubleshooting is that, if you cannot reproduce a problem, then no problem exists unless it happens again on its own. However, if the problem is intermittent and you cannot replicate it, you can configure your network management software to catch the event in progress. For example, with LANsentry Manager (page 33), you can set alarms and automatic packet capture filters to monitor your network and inform you when the problem occurs again. See Configuring Transcend Software (page 54) for more information.
      Although network management tools can provide a great deal of information about problems and their general location, you may still need to swap equipment or replace components of your network setup until you locate the exact trouble spot. After testing your theory, you should either fix the problem as described in Solving the Problem (page 29) or develop another theory to check.
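      The "Correct configuration" check above (IP address, network mask, gateway, broadcast address) can be partly automated. The following Python sketch uses only the standard ipaddress module; the function name check_ip_config and the specific rules are assumptions chosen for illustration, not a procedure from this guide. It assumes IPv4 and an ordinary subnet (prefix length of 30 or shorter) when testing for network and broadcast addresses.

```python
import ipaddress


def check_ip_config(address, netmask, gateway):
    """Return a list of human-readable problems found in an IPv4 configuration."""
    try:
        iface = ipaddress.ip_interface(f"{address}/{netmask}")
        gw = ipaddress.ip_address(gateway)
    except ValueError as err:
        # Covers malformed addresses and non-contiguous network masks.
        return [f"invalid address or mask: {err}"]

    problems = []
    net = iface.network
    if gw not in net:
        problems.append(f"gateway {gw} is not on the device's subnetwork {net}")
    if gw == iface.ip:
        problems.append("gateway address is the same as the device address")
    # On ordinary subnets the first and last addresses are reserved.
    if net.prefixlen <= 30 and iface.ip in (net.network_address, net.broadcast_address):
        problems.append(f"{iface.ip} is the network or broadcast address of {net}")
    return problems


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(check_ip_config("192.168.1.10", "255.255.255.0", "192.168.2.1"))
```

      An empty list means only that these particular checks passed; it does not show that the device can actually reach its gateway.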
  27. 27. Sample Problem Analysis
      This section illustrates the analysis phase of a typical troubleshooting incident. On your network, a user reports that she cannot access her mail server. You need to establish two areas of information:
      • What you know: In this case, the workstation cannot communicate with the server.
      • What you do not know and need to test:
        • Can the workstation communicate with the network at all, or is the problem limited to communication with the server? Test by sending a Ping (page 38) or by connecting to other devices.
        • Is the workstation the only device that is unable to communicate with the server, or do other workstations have the same problem? Test connectivity at other workstations.
        • If other workstations cannot communicate with the server, can they communicate with other network devices? Again, test the connectivity.
      The analysis process follows these steps:
      1 Can the workstation communicate with any other device on the subnetwork?
        • If no, then go to test 2.
        • If yes, determine if it is only the server that is unreachable.
          • If only the server cannot be reached, this suggests a server problem. Confirm by doing test 2.
          • If other devices cannot be reached, this suggests a connectivity problem in the network. Confirm by doing test 3.
      2 Can other workstations communicate with the server?
        • If no, then most likely it is a server problem. Go to test 3.
        • If yes, then the problem is that the workstation is not communicating with the subnetwork. (This situation can be caused by workstation issues or a network issue with that specific station.)
      3 Can other workstations communicate with other network devices?
        • If no, then the problem is likely a network problem.
        • If yes, the problem is likely a server problem.
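      The three tests above can be read as a small decision procedure. The following Python sketch is one possible reading of those steps, not a definitive rendering: the function name diagnose, its parameter names, and the wording of the suggested next steps are hypothetical. It takes the outcome of each test as a boolean and returns the likely problem area with a suggested follow-up.

```python
def diagnose(workstation_reaches_other_devices,
             others_reach_server,
             others_reach_other_devices):
    """Classify a 'cannot reach the server' report as workstation, server, or network.

    Each argument is the yes/no outcome of test 1, test 2, and test 3.
    """
    if others_reach_server:
        # Test 2 is "yes": the server works for everyone else, so the fault
        # lies with this workstation or its own link to the subnetwork.
        if workstation_reaches_other_devices:
            return ("workstation",
                    "check that it is configured to communicate with that server")
        return ("workstation", "check its connection to the subnetwork")

    # Test 2 is "no": nobody reaches the server, so test 3 decides.
    if others_reach_other_devices:
        return ("server",
                "check that the server is running, connected, and configured")
    return ("network", "examine devices on the path between users and server")


if __name__ == "__main__":
    # The user's workstation reaches other devices, no workstation reaches
    # the server, and other workstations reach other devices.
    print(diagnose(True, False, True))
```

      In this reading, test 1 only refines the follow-up when the workstation is at fault; tests 2 and 3 determine the problem area.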
  28. 28. When you determine whether the problem is with the server, subnetwork, or workstation, you can further analyze the problem, as follows:

  • For a problem with the server, examine whether the server is running, whether it is properly connected to the network, and whether it is configured appropriately.
  • For a problem with the subnetwork, examine any device on the path between the users and the server.
  • For a problem with the workstation, examine whether the workstation can access other network resources and whether it is configured to communicate with that particular server.

Equipment for Testing

To help identify and test the cause of problems, have available:

  • A laptop computer loaded with a terminal emulator, IP stack, TFTP server, CD-ROM drive (with which you can read the online documentation), and some key network management applications, such as LANsentry Manager. With the laptop computer, you can plug into any subnetwork to gather and analyze data about the segment.
  • A spare managed hub to swap for any hub that does not have management. Swapping in a managed hub allows you to quickly spot which port is generating the errors.
  • A single port probe to insert in the network if you are having a problem where you do not have management capability.
  • Console cables for each type of connector, labeled and stored in a secure place.
  29. 29. Solving the Problem

Many device or network problems are straightforward to resolve, but others yield misleading symptoms. If one solution does not work, continue with another.

A solution often involves:

  • Upgrading software or hardware (for example, upgrading to a new version of agent software or installing Gigabit Ethernet devices)
  • Balancing your network load by analyzing:
      - What users communicate with which servers
      - What the user traffic levels are in different segments of your network
    Based on these findings, you can decide how to redistribute network traffic.
  • Adding segments to your LAN (for example, adding a new switch where utilization is continually high)
  • Replacing faulty equipment (for example, replacing a module that has port problems or replacing a network card that has a faulty jabber protection mechanism)

To help solve problems, have available:

  • Spare hardware equipment (such as modules and power supplies), especially for your critical devices
  • A recent backup of your device configurations to reload if flash memory gets corrupted (which can sometimes happen when there is a power outage)

The Transcend application suite Network Admin Tools allows you to save and reload your software configurations to devices.
  31. 31. YOUR NETWORK TROUBLESHOOTING TOOLBOX

A robust network troubleshooting toolbox consists of items (such as network management applications, hardware devices, and other software) essential for recognizing, diagnosing, and solving networking problems. It contains:

  • Transcend Applications (page 31)
  • Network Management Platforms (page 35)
  • 3Com SmartAgent Embedded Software (page 36)
  • Other Commonly Used Tools (page 38)

Transcend Applications

Transcend® management software is optimized for managing 3Com devices and their attached networks. However, some applications, such as LANsentry® Manager, can manage any vendor’s networking equipment that complies with the Remote Monitoring (RMON) MIB.

This section describes these Transcend applications, which you can use to troubleshoot your network:

  • Transcend Central (page 32)
  • Status View (page 32)
  • LANsentry Manager (page 33)
  • Traffix Manager (page 34)
  • Device View (page 35)

This guide primarily focuses on using these applications to troubleshoot your network.
  32. 32. Transcend Central

Transcend Central, an asset management and device grouping application, is your starting point for understanding what your network consists of and for controlling the Transcend network management troubleshooting tools. Transcend Central is available as both a native Windows application and a Java application that you can access using a browser.

Using Transcend Central for troubleshooting, you can:

  • Display an inventory of device, module, and port information.
  • Group devices to make your troubleshooting tasks easier. Managing a collection of devices allows you to simultaneously perform the same tasks on each device in a group and to locate physical or logical problems on your network.
  • Launch Transcend applications, including some of your primary Transcend troubleshooting tools:
      - Status View (page 32), which includes Status Watch and MAC Watch (from the native version) and Web Reporter (from the Java version)
      - LANsentry Manager (page 33)
      - Device View (page 35)

Status View

The Status View applications manage 3Com devices and their attached networks. Status View applications primarily poll for MIB-II (page 138) data. Check the Status View help to see which 3Com devices are supported by each Status View application.

Status Watch

Status Watch is a performance monitoring application that allows you to monitor the operational status of your network devices and quickly identify any problems that require your attention.
  33. 33. MAC Watch

MAC Watch is an address collection and discovery application that:

  • Polls managed devices for all MAC addresses
  • Polls managed devices and routers for IP addresses to perform MAC-to-IP address translation
  • Allows you to disable troublesome ports

Web Reporter

Web Reporter is a data-reporting application that runs in a World Wide Web (WWW) browser. It generates reports from data collected by the Status Watch and MAC Watch applications, allowing you to compare network statistics against a baseline.

LANsentry Manager

LANsentry Manager is a set of integrated applications that displays and explores the real-time and historical data captured by RMON-compliant devices (probes) on the network. LANsentry Manager uses SNMP polling to gather RMON and RMON2 data from the probes.

Use LANsentry Manager to:

  • Monitor current performance of network segments
  • See trends over time
  • Spot signs of current problems
  • Configure alarms to monitor for specific events
  • Capture packets and display their contents

LANsentry Manager works with any device (from 3Com or other vendors) that supports the RMON MIB (page 139) or the RMON2 MIB (page 140).
  34. 34. Traffix Manager

Traffix™ Manager is a performance-monitoring application that provides information about layer 3 conversations between nodes. It helps you to assess traffic patterns on your network.

Traffix Manager:

  • Monitors all the stations seen by the RMON2-compliant probes deployed on your network
  • Captures and stores RMON and RMON2 data for your network’s protocols and applications
  • Displays traffic between stations in user-defined views of the network
  • Graphs current or historical data on the devices selected
  • Delivers reports for user-specified stations and time periods as PostScript to your printer or as HTML to your web server
  • Launches LANsentry Manager tools for in-depth analysis of a station or a conversation between stations

You can use Traffix Manager to:

  • Know your network: Understand overall flow patterns and interactions between systems and see how your network is really being used at the application level
  • Optimize your network: Gain an insight into traffic and application usage trends to help you optimize the use and placement of current network resources and make wise decisions about capacity planning and network growth

Traffix Manager works with any device (from 3Com or other vendors) that supports the RMON2 MIB (page 140).
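To make the idea of conversation-level analysis concrete, the sketch below shows one way to summarize conversation records into a list of the busiest hosts and the busiest host pairs. It is a generic illustration in Python, not the Traffix Manager implementation or its data format. The function `top_talkers`, the record layout (source, destination, byte count), and the sample addresses are all assumptions.

```python
from collections import Counter


def top_talkers(conversations, n=3):
    """conversations: iterable of (source, destination, byte_count) tuples."""
    per_host = Counter()
    per_pair = Counter()
    for src, dst, nbytes in conversations:
        # Count the bytes against both ends of the conversation.
        per_host[src] += nbytes
        per_host[dst] += nbytes
        # Treat A->B and B->A as the same conversation.
        per_pair[tuple(sorted((src, dst)))] += nbytes
    return per_host.most_common(n), per_pair.most_common(n)


# Hypothetical records, as a probe might report them for one time period.
sample = [
    ("10.0.0.5", "10.0.1.2", 4000),
    ("10.0.1.2", "10.0.0.5", 12000),
    ("10.0.0.7", "10.0.1.2", 1500),
]
hosts, pairs = top_talkers(sample, n=2)
print(hosts)
print(pairs)
```

A summary like this shows which stations and which station pairs dominate a segment, which is the kind of information you need before deciding how to redistribute traffic.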
  35. 35. Device View

The Device View application is a device configuration tool. When troubleshooting your network, you can use Device View to check or change a device’s configuration and upgrade a device’s agent software. You can also use Device View to look at a device’s statistics and to set alarms.

Device View manages only 3Com devices. See the Device View help for which 3Com devices are supported by Device View.

You can also use Transcend Upgrade Manager, which is one of the Network Admin Tools applications, to perform bulk software upgrades on devices.

Network Management Platforms

As part of your troubleshooting toolbox, your network management platform is the first place that you go to view the overall health of your network. With the platform, you can understand the logical configuration of your network and configure views of your network to understand how devices work together and the role they play in the users’ work.

The network management platform that supports your Transcend software installation can provide valuable troubleshooting tools. For example, Transcend Enterprise Manager ‘97 for Windows NT software is integrated with HP OpenView Network Node Manager Version 5.01, which runs on Windows NT Version 4.0.

Network Node Manager (NNM) provides a number of functions useful in troubleshooting. It automatically discovers all the devices on your network and creates a database that contains information about each device. NNM updates the database when new devices are added or when existing devices are modified or deleted.

Using this device database, NNM creates a default map that displays a graphical representation of your network. Each device on your network appears as a symbol (icon) on the map. You can configure views of your network to show devices on the same subnetworks or floors.
  36. 36. You can use NNM to monitor network performance and to diagnose network performance and connectivity problems. You can:

  • Take a snapshot of your network in its normal state. The snapshot records the state of your network at a particular instant. If you later have network performance problems, you can compare the current state of your network to the snapshot.
  • Quickly determine the connectivity status of a device by noting the color of its map symbol. Red usually means a device disconnection.
  • Diagnose connectivity problems by determining whether two devices can communicate. If they can communicate, then examine the route between the devices, the number of packets sent and lost, and the roundtrip time between the two devices.
  • Manage MIB information (for example, collecting and storing MIB data for trend analysis and graphing) using MIB queries. NNM compiles MIBs and lets you navigate up and down the MIB Tree (page 136) to retrieve MIB objects from devices. You can set thresholds for MIB data and generate events when a threshold is exceeded.
  • Configure the software to act on certain events. The Event Categories window informs you of any unexpected events (which arrive in the form of traps).

For more information, see the HP documentation shipped with your software.

3Com SmartAgent Embedded Software

Traditional SNMP management places the burden of collecting network management information on the management station. In this traditional model, software agents collect information about throughput, record errors or packet overflows, and measure performance based on established thresholds. Through a polling process, agents pass this information to a centralized network management station whenever they receive an SNMP query. Management applications then make the data useful and alert the user if there are problems on the device.

For more information about traditional SNMP management, see SNMP Operation (page 133).
  37. 37. As a useful companion to traditional network management methods, 3Com’s SmartAgent® technology places management intelligence into the software agent that runs within a 3Com device. This scalable solution reduces the amount of computational load on the management station and helps minimize management-related network traffic.

SmartAgent software, which uses the RMON MIB (page 139), is self-monitoring, collecting and analyzing its own statistical, analytical, and diagnostic data. In this way, you can conduct network management by exception; that is, you are only notified if a problem occurs. Management by exception is unlike traditional SNMP management, in which the management software collects all data from the device through polling.

SmartAgent software works autonomously and reports to the network management station whenever an exceptional network event occurs. The software can also take direct action without involving the management station.

Devices that contain SmartAgent software may be able to:

  • Perform broadcast throttling to minimize the flow of broadcast traffic on your network
  • Monitor the ratio of good to bad frames
  • Switch a resilient link pair to the standby path if the primary path corrupts frames
  • Report if traffic on vital segments drops below minimum usage levels
  • Disable a port for five seconds to clear problems, and then automatically reconnect it

To configure these advanced SmartAgent software features, see your device documentation.

The Transcend applications LANsentry Manager (page 33) and Traffix Manager (page 34) make RMON data collected by the SmartAgent software more usable by summarizing and correlating important information.
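The difference between polling and management by exception can be shown with a short sketch. The code below is a generic Python illustration, not SmartAgent code: a hypothetical agent-side function, `agent_check`, evaluates its own frame counters each interval and produces an event only when the share of bad frames exceeds a threshold. The 1 percent threshold and the sample counters are assumptions chosen for the example.

```python
def agent_check(good_frames, bad_frames, max_bad_ratio=0.01):
    """Run on the device: return an event dict only when the threshold is exceeded."""
    total = good_frames + bad_frames
    if total == 0:
        return None  # no traffic in this interval, nothing to report
    ratio = bad_frames / total
    if ratio > max_bad_ratio:
        return {"event": "bad-frame-ratio", "ratio": ratio}
    return None  # normal interval: the agent stays silent


def run_intervals(samples, max_bad_ratio=0.01):
    """Return only the events the management station would receive."""
    events = []
    for good, bad in samples:
        event = agent_check(good, bad, max_bad_ratio)
        if event is not None:
            events.append(event)
    return events


# Three hypothetical intervals of (good, bad) frame counts; only the second is reported.
print(run_intervals([(1000, 1), (900, 100), (0, 0)]))
```

With polling, the station would have fetched counters for all three intervals. Here it receives a single message, which is why this approach reduces management traffic and station load.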
  38. 38. Other Commonly Used Tools

These commonly used tools can also help you troubleshoot your network:

  • Network software, such as Ping (page 38), Telnet (page 40), and FTP and TFTP (page 40). You can use these applications to troubleshoot, configure, and upgrade your system.
  • Network monitoring devices, such as Analyzers (page 41) and Probes (page 41).
  • Tools, such as Cable Testers (page 42), for working on physical problems.

Many of the tools discussed in this section are only useful in TCP/IP networks.

Ping

Packet Internet Groper (Ping) allows you to quickly verify the connectivity of your network devices. Ping sends a packet from one device, attempts to transmit it to a station on the network, and listens for the response to ensure that it was correctly received. You can validate connections on the parts of your network by pinging different devices:

  • A successful response tells you that a valid network path exists between your station and the remote host and that the remote host is active.
  • Slower response times than normal can tell you that the path is congested or obstructed.
  • A failed response indicates that a connection is broken somewhere; use the message to help locate the problem. See Tips on Interpreting Ping Messages (page 40).

Some network devices, like the CoreBuilder® 5000, must be configured to be able to respond to Ping messages. If you are not receiving responses from a device, first check that it is set up to be a Ping responder.
  39. 39. Strategies for Using Ping

Follow these strategies for using Ping:

  • Ping devices when your network is operating normally so that you have a performance baseline for comparison. See Identifying Your Network’s Normal Behavior (page 62) for more information.
  • Ping by IP address when:
      - You want to test devices on different subnetworks. This method allows you to Ping your network segments in an organized way, rather than having to remember all the hostnames and locations.
      - Your DNS server is down and your system cannot look up host names properly. You can Ping with IP addresses even if you cannot access hostname information.
  • Ping by hostname when you want to identify DNS server problems.
  • To troubleshoot problems involving large packet sizes, Ping the remote host repeatedly, increasing the packet size each time.
  • To determine if a link is erratic, perform a continuous Ping (using PING -t on Windows NT or ping -s on UNIX), which provides you with the time that it took the device to respond to each Ping.
  • To determine a route taken to a destination, use the trace route function (tracert) on Windows 95 and Windows NT.
  • Consider creating a Ping script that periodically sends a Ping to all necessary networking devices. If a Ping failure message is received, the script can perform some action to notify you of the problem, such as paging you.
  • Use the Ping functions of your network management platform. For example, in your HP OpenView map, selecting a device and right-clicking provides access to Ping functions.
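The Ping script suggested in the list above could look something like the following. This is a minimal Python sketch under stated assumptions, not a tool supplied with the Transcend software. It calls the operating system's `ping` command once per host, using `-n 1` on Windows and `-c 1` elsewhere, and hands each failure to a notification function. The names `ping_once` and `sweep` are hypothetical, and `notify` defaults to `print` as a stand-in for whatever paging or mail mechanism you use.

```python
import platform
import subprocess


def ping_once(host, timeout_s=5):
    """Send one echo request with the system ping command; True on success."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # The command hung or could not be started: treat as a failure.
        return False
    return result.returncode == 0


def sweep(hosts, ping=ping_once, notify=print):
    """Ping every host once and call notify for each one that fails."""
    failed = [h for h in hosts if not ping(h)]
    for host in failed:
        notify(f"Ping failed: {host}")
    return failed
```

To run the check periodically, call `sweep` with your device list from the operating system's job scheduler. Treat the exit code as a rough signal only: its exact meaning varies between ping implementations, so confirm a reported failure by hand before acting on it.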
  40. 40. Tips on Interpreting Ping Messages

Use the following Ping failure messages to troubleshoot problems:

  • No reply from <destination>: Shows that the destination routes are available but that there is a problem with the destination itself.
  • <destination> is unreachable: Shows that your system does not know how to get to the destination. This message means either that routing information to a different subnetwork is unavailable or that a device on the same subnetwork is down.
  • ICMP host unreachable from gateway: Indicates that your system can transmit to the target address using a gateway, but the gateway cannot forward the packet properly because either a device is misconfigured or the gateway is down.

Telnet

Telnet, which is a login and terminal emulation program for Transmission Control Protocol/Internet Protocol (TCP/IP) networks, is a common way to communicate with an individual device. You log into the device (a remote host) and use that remote device as if it were a local terminal.

If you have an out-of-band Telnet connection established with a device, you can use Telnet to communicate with that device even if the network goes down. This feature makes Telnet one of the most frequently used network troubleshooting tools. Usually, all device statistics and configuration capabilities are accessible by using Telnet to connect to the device’s console. For more information about setting up an out-of-band connection, see Using Telnet, Serial Line, and Modem Connections (page 49).

You can invoke the Telnet application on your local system and set up a link to a Telnet process running on a remote host. You can then run a program located on a remote host as if you were working on the remote system.

FTP and TFTP

Most network devices support either the File Transfer Protocol (FTP) or the Trivial File Transfer Protocol (TFTP) for downloading updates of system software.
Updating system software is often the solution to networking problems that are related to agent problems. Also, new software features may help correct a networking problem.
  41. 41. FTP provides flexibility and security for file transfer by:

  • Accepting many file formats, such as ASCII and binary
  • Using data compression
  • Providing Read and Write access so that you can display, create, and delete files and directories
  • Providing password protection

TFTP is a simple version of FTP that does not list directories or require passwords. TFTP only transfers files to and from a remote server.

Analyzers

An analyzer, often called a Sniffer, is a network device that collects network data on the segment to which it is attached, a process called packet capturing. Software on the device analyzes this data, a process referred to as protocol analysis. Most analyzers can interpret different types of protocol traffic, such as TCP/IP, AppleTalk, and Banyan VINES traffic.

You usually use analyzers for reactive troubleshooting: you see a problem somewhere on your network and you attach an analyzer to capture and interpret the data from that area. Analyzers are particularly helpful in identifying intermittent problems. For example, if your network backbone has experienced moments of instability that prevent users from logging onto the network, you can attach an analyzer to the backbone to capture the intermittent problems when they happen again.

Probes

Like Analyzers (page 41), a probe is a network device that collects network data. Depending on its type, a probe can collect data from multiple segments simultaneously. It stores the collected data and transfers the data to an analysis site when requested. Unlike an analyzer, a probe does not interpret data.

A probe can be either a stand-alone device or an agent in a network device. The Transcend Enterprise Monitor 500 series and the SuperStack® II Monitor series are stand-alone RMON probes. LANsentry Manager and Traffix Manager use data from probes that are compliant with the RMON MIB (page 139) or the RMON2 MIB (page 140).
  42. 42. You can use a probe daily to check the health of your network. The Transcend applications can interpret and report this data, alerting you to possible problems so that you can proactively manage your network. For example, an RMON2 probe can help you to analyze traffic patterns on your network. Use this data to make decisions about reconfiguring devices and end stations as needed.

Cable Testers

Cable testers check the electrical characteristics of the wiring. They are most commonly used to ensure that building wiring and cables meet Category 5, 4, and 3 standards. For example, network technologies such as Fast Ethernet require the cabling to meet Category 5 requirements. Testers are also used to find defective and broken wiring in a building.
  43. 43. STEPS TO ACTIVELY MANAGING YOUR NETWORK

These sections describe the steps you can take to effectively troubleshoot your network when the need arises:

  • Designing Your Network for Troubleshooting (page 43)
  • Preparing Devices for Management (page 53)
  • Configuring Transcend Software (page 54)
  • Knowing Your Network (page 58)

Designing Your Network for Troubleshooting

Designing your network for troubleshooting facilitates your access to key devices on your network when your network is experiencing connectivity or performance problems. Having adequate management access depends on these design criteria:

  • Position of the management station so that it can gather the greatest amount of network data through SNMP polling
  • Position of probes for distributed management of critical networks
  • Ability to communicate with each device even when your management station cannot access the network

The following sections discuss how to design your network with the above criteria in mind:

  • Positioning Your SNMP Management Station (page 44)
  • Using Probes (page 45)
  • Monitoring Business-critical Networks (page 47)
  • Using Telnet, Serial Line, and Modem Connections (page 49)
  • Using Communications Servers (page 50)
  • Setting Up Redundant Management (page 51)
  • Other Tips on Network Design (page 52)
  44. 44. Positioning Your SNMP Management Station

In a typical LAN, it is best to locate your Windows NT or UNIX management station directly off the backbone where it can conduct SNMP polling and manage network devices.

The backbone is usually the optimum location for the management station because:

  • The backbone is not subject to the failures of individual subnetworked routers or switches.
  • In a partial network outage, the information collected by a backbone management station is probably more accurate than that collected by a station in a routed subnet.
  • The backbone is usually protected with redundant power and technologies, like FDDI, that correct their own problems. This redundancy helps the backbone remain operational, even when other areas of the network are having problems.
  • The backbone is typically faster and has a higher bandwidth than other areas of your network, making it a more efficient location for a management station.

Make sure that the capacity of your backbone can accommodate the SNMP traffic that is generated by the management applications.

Figure 2 shows a management station that is set up at the network backbone and polling network devices.

[Figure 2: SNMP Management at the Backbone. A management workstation attaches to the FDDI backbone through an FDDI card or network device; x marks the network devices that you want to poll.]
  45. 45. Although SNMP management from the backbone is a good way to keep track of what is happening on your network, do not rely on it exclusively. Because SNMP management occurs in-band (that is, SNMP traffic shares network bandwidth with data traffic), network troubleshooting using SNMP can become a problem in these ways:

  • Very heavy data traffic or a break in the network can make it difficult or impossible for the management station to poll a device.
  • Traffic added to the network by SNMP polling may contribute to networking problems.

Using Probes

To reduce the amount of SNMP polling traffic on your network, set up one or more Probes (page 41) to collect Remote Monitoring (RMON) data from the network devices. In the distributed model illustrated in Figure 3, the management station using SNMP polling collects data from the probes rather than from all the network devices. Distributing the management over the network ensures some continued data collection even if you have network problems.

Many management applications support data from MIBs other than the RMON MIBs. For this reason, even if you are using RMON probes, some SNMP polling to individual devices from a key management station is always useful for a complete picture of your network.
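To judge whether polling traffic is likely to matter on a given link, a rough average can be worked out from the number of devices and the polling interval. The Python sketch below shows the arithmetic. All input figures (200 devices, 10 request and response exchanges per device, 200 bytes per exchange, a 60-second interval) are assumptions chosen for illustration, not measurements; the only fixed fact is that FDDI runs at 100 Mbps.

```python
def polling_load_bps(devices, requests_per_device, bytes_per_exchange, interval_s):
    """Average bits per second added by one polling cycle every interval_s seconds."""
    bytes_per_cycle = devices * requests_per_device * bytes_per_exchange
    return bytes_per_cycle * 8 / interval_s


def load_fraction(load_bps, link_bps):
    """Share of the link consumed by polling, as a fraction of link capacity."""
    return load_bps / link_bps


# Assumed figures: 200 devices, 10 exchanges each, 200 bytes per exchange,
# polled once a minute over a 100 Mbps FDDI backbone.
load = polling_load_bps(devices=200, requests_per_device=10,
                        bytes_per_exchange=200, interval_s=60)
share = load_fraction(load, link_bps=100_000_000)
print(f"{load:.0f} bps, {share:.4%} of the backbone")
```

This is an average only. Polling tends to arrive in bursts, and the same load is a much larger share of a slow WAN link than of the backbone, so repeat the calculation for the slowest link the polls must cross.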
  46. 46. [Figure 3: Management at the Backbone with an Attached Probe. A management workstation and probes attach to the FDDI backbone through FDDI cards or network devices; x marks the network devices that you want to poll.]

To extend your remote monitoring capabilities, use embedded RMON probes or roving analysis (monitoring one port for a period of time, moving on to another port for a while, and so on). However, with roving analysis, you cannot see a historical analysis of the ports because the probe is moving from one port to another.

Some probes, like 3Com’s Enterprise Monitor, are designed to support the large number of interfaces found in switched environments. The probe’s high port density supports this multi-segmented switched environment. The probe’s interfaces can also be used to monitor mirror (or copy) ports on the switch, which means that all data received and transmitted on a port is also sent to the probe.

Probes will not indicate which port has caused an error. Only a managed hub (a hub or switch with an onboard management module) can provide that level of detail. Probes and a hub’s own management module complement each other.
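The limitation of roving analysis mentioned above, that no port has a continuous history, follows directly from how the probe's time is shared. The Python sketch below illustrates this with a simple round-robin schedule. It is a conceptual model only: the functions `roving_schedule` and `coverage`, the port names, and the 60-second dwell time are hypothetical and do not describe how any particular 3Com device schedules its roving analysis port.

```python
from itertools import cycle, islice


def roving_schedule(ports, dwell_s, total_s):
    """Return (start_time, port) slots for a probe that watches one port at a time."""
    slots = total_s // dwell_s
    return [(i * dwell_s, port)
            for i, port in enumerate(islice(cycle(ports), slots))]


def coverage(ports, dwell_s, total_s):
    """Fraction of total_s during which each port was actually observed."""
    observed = {p: 0 for p in ports}
    for _, port in roving_schedule(ports, dwell_s, total_s):
        observed[port] += dwell_s
    return {p: t / total_s for p, t in observed.items()}


# Four hypothetical ports, 60 seconds on each, over four minutes.
print(roving_schedule(["p1", "p2", "p3", "p4"], 60, 240))
print(coverage(["p1", "p2", "p3", "p4"], 60, 240))
```

With four ports, each one is observed only a quarter of the time, so an event on a port during the other three quarters is missed. This is the reason to dedicate a probe to any segment where you need a full history.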
  47. 47. Monitoring Business-critical Networks

On business-critical networks, you need to increase your level of management by dedicating probes to the essential areas of your network. For detailed network management, it is not enough to gather raw performance figures; you need to know, at the network and conversation level, who is generating the traffic and when it is being generated. For this type of analysis, use reporting tools, such as Traffix Manager (page 34), and low-level, fault diagnostic tools, such as LANsentry Manager (page 33).

The three critical areas on this type of network that you should monitor are discussed in these sections and shown in Figure 4:

  • FDDI Backbone Monitoring (page 48)
  • Internet WAN Link Monitoring (page 48)
  • Switch Management Monitoring (page 48)

[Figure 4: Probes Monitoring a Business-critical Network. Labels: management workstation with a direct connection to a SuperStack II Enterprise Monitor with FDDI module; FDDI backbone; SuperStack II Enterprise Monitor performing inline monitoring on Fast Ethernet; SuperStack® II WAN Monitor 700 on the WAN link. Legend: x marks network devices that you want to poll; a second symbol marks a possible probe attachment to a switch’s roving analysis port.]
  48. 48. FDDI Backbone Monitoring

On the FDDI backbone, you need to continually monitor whether it is being overutilized, and, if so, by what type of traffic. By placing the SuperStack® II Enterprise Monitor with an FDDI media module directly at the backbone, you can gather utilization and host matrix information. This data is used by Traffix™ Manager to provide regular segment utilization reports and Top-N host reports. In addition, the probe provides a full range of FDDI performance statistics that can be recorded with LANsentry® Manager or reported to the management station by way of SNMP traps.

To ensure management access to the probe, provide a direct connection to the probe from your management station. This connection allows you to access probe data even if the ring is unusable and keeps management traffic off the main ring.

Internet WAN Link Monitoring

The Internet link is a concern for dedicated network management because it represents an external cost to the company that requires budgeting and because it is a possible security problem.

In a way similar to monitoring the FDDI backbone, Traffix Manager reports can indicate whether you are paying for too much bandwidth or whether you need to purchase more. It can also indicate the level of use on a workgroup basis for internal billing and highlight the top sites visited by users. Similarly, you can monitor for unexpected conversations and protocols.

You also need to know the error rates on this link and whether you are experiencing congestion because of circumstances on the Internet provider’s network. LANsentry Manager can record and display these statistics and provide a detailed real-time view.

Switch Management Monitoring

The third area of interest in this network is the large number of switch-to-end station links.
When detailed analysis of these devices is required (for example, if one of the ports on the network suddenly reports much higher traffic than normal), you need to track the source of the problem and decide whether you can optimize the traffic path. In this case, you need a way to view the traffic on the switch port at a conversation level.
  49. 49. By placing a SuperStack II Enterprise Monitor in a central location, you can easily attach it to the switches that have the most Ethernet ports as the need arises. Using the roving analysis feature of many 3Com devices, data from a monitored port can be copied to the port on the switch to which the SuperStack II is connected.

When a problem arises, roving analysis is activated for a particular switch and LANsentry Manager or Traffix Manager collects the data from the SuperStack II Enterprise Monitor. These applications can then monitor the network data for the devices connected to that switch.

Using Telnet, Serial Line, and Modem Connections

To minimize your dependency on SNMP management, set up a way to reach the console of your key networking devices. Through the console, you can often view Ethernet, FDDI, ATM, and token ring statistics, view routing and bridging tables, and check and modify device configurations.

These console connections are also key to network troubleshooting because they can be out-of-band (that is, management using a dedicated line to a device). If the network goes down, your console connections are still available.

The types of console connections include:

  • Telnet (page 40): Out-of-band and in-band access using a network connection. For example, on 3Com’s CoreBuilder 6000 switch, using Telnet you can access the management console by using a dedicated Ethernet connection to the management module (out-of-band) and from any network attached to the device (in-band).
  • Serial line: Direct, out-of-band access using a terminal connection. This type of connection allows you to maintain your connections to a device if it reboots.
  • Modem: Remote, out-of-band access using a modem connection.

Figure 5 shows management of a device through the serial line and modem ports.
  50. 50. [Figure 5: Out-of-band Management Using the Serial and Modem Ports. One management workstation reaches a network switch in the wiring closet through a pair of modems and the switch’s modem port; another management workstation connects to the switch’s serial line port. The switch serves the attached LAN.]

Sometimes, direct access to network devices through out-of-band management is the only way to examine a network problem. For example, if your network connections are down, you can Telnet (page 40) to one of your key routers and examine its routing table. The routing table shows the devices that the router can reach, allowing you to narrow the area of the problem. You can also Ping (page 38) from this device to further investigate which areas of the network are down.

Using Communications Servers

While out-of-band management keeps you in contact with a particular device during a network problem, it does not inform you about all the areas of your network from a central point. You must access each device separately.

To make device management more central, you can set up a communications server (often called a comm server), through which you can easily manage all devices configured to that server from one management station. See Figure 6.

3Com communication servers include the C/S 2500 and C/S 3500.
  51. 51. Figure 6 Out-of-band Management with a Communications Server (figure labels: management workstation, two wiring closets each with a network switch and serial line port, communications server (“Comm” server), attached LAN)

      For optimal benefit, provide two management connections to the comm server:

      • Connect the comm server to the network (an in-band connection) so that you can access the devices from anywhere on the network using reverse Telnet.
      • Connect your management workstation directly to one of the serial ports of the comm server (an out-of-band connection) so that you can access the devices when the network is down.

      Setting Up Redundant Management

      To add redundancy to your management strategy so that a management station can always access the backbone, set up a “buddy system” of management. In this setup, management applications (often different ones) run on separate management workstations, which are connected to the backbone through separate network devices or by using a network card. This setup allows the management workstations to check on each other and report any problems with their attached network devices. The buddy system also provides a backup management connection to your network if one management station loses connectivity.
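One way to picture how buddy workstations check on each other is a simple heartbeat timer. This is a minimal sketch under stated assumptions, not a description of how any 3Com application implements it: each station is assumed to receive a periodic message from its buddy by some transport of your choosing, and the class name, the `max_silence` threshold, and the injectable clock are all hypothetical.

```python
import time


class BuddyMonitor:
    """Track how long it has been since the buddy station was heard from."""

    def __init__(self, buddy_name, max_silence=60.0, clock=time.monotonic):
        self.buddy_name = buddy_name
        self.max_silence = max_silence  # seconds of silence tolerated
        self._clock = clock             # injectable for testing
        self._last_seen = clock()

    def heartbeat_received(self):
        """Call this whenever a message from the buddy arrives."""
        self._last_seen = self._clock()

    def buddy_is_silent(self):
        """Return True if the buddy has been quiet for too long."""
        return self._clock() - self._last_seen > self.max_silence
```

When `buddy_is_silent()` returns True, the surviving station can raise an alert about the other station's attached network device and continue managing the network over its own connection.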
  52. 52. Other Tips on Network Design

      This section provides some additional tips for designing your network for troubleshooting.

      Management Station Configuration

      • Configure the management station to run without any network connection, including NIS, NFS, and DNS lookups. Because your management station should run with all network cables pulled out, do not install Transcend® Enterprise Manager on a network drive.
      • Have more than one interface available on the management station, an arrangement called dual hosting. Connect vital probes to the second interface to create a private monitoring LAN (one without regular network traffic) on which network problems will not impair communication.
      • Do not give the management station privileges on the network, such as the ability to log in with no passwords (rsh). Hackers can easily spot management stations.
      • Connect the management station to an uninterruptible power supply (UPS) to protect the station from events that interrupt power, such as blackouts, power surges, and brownouts.
      • Regularly back up the management station.

      More Tips

      • Provide remote access through a modem to the management station so that you can keep track of your network’s activity remotely.
      • Use managed hubs to narrow which link is causing an error. Even if your budget does not allow you to manage all hubs, strategically install one managed hub for error tracking.
      • Keep copies of all configurations on a file server and on the management station. See Knowing Your Network’s Configuration (page 58) for more information.
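Keeping configuration copies in two places is easy to script. The sketch below assumes that a device configuration has already been saved to a local file and that the file server is reachable as a mounted directory; the function names and the example paths are hypothetical. Each copy is verified against the source with a SHA-256 checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def save_config_copies(config_file, destinations):
    """Copy config_file into each destination directory and verify it.

    Returns the list of copied file paths. Raises OSError if any copy
    does not match the source.
    """
    source = Path(config_file)
    expected = sha256_of(source)
    copies = []
    for dest_dir in destinations:
        dest_dir = Path(dest_dir)
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"copy to {target} does not match the source")
        copies.append(target)
    return copies
```

For example, `save_config_copies("switch1.cfg", ["/mnt/fileserver/configs", "/var/local/configs"])` would place one copy on the file server mount and one on the management station's local disk, which stays readable when the network is down.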