Troubleshooting guide

1,290 views
1,150 views

Published on

Brocade

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,290
On SlideShare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
60
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Troubleshooting guide

  1. 1. 53-1001187-01 24 November 2008 Fabric OS Troubleshooting and Diagnostics Guide Supporting Fabric OS v6.2.0
  2. 2. Copyright © 2008 Brocade Communications Systems, Inc. All Rights Reserved. Brocade, Fabric OS, File Lifecycle Manager, MyView, and StorageX are registered trademarks and the Brocade B-wing symbol, DCX, and SAN Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. All other brands, products, or service names are or may be trademarks or service marks of, and are used to identify, products or services of their respective owners. Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this document may require an export license from the United States government. The authors and Brocade Communications Systems, Inc. shall have no liability or responsibility to any person or entity with respect to any loss, cost, liability, or damages arising from the information contained in this book or the computer programs that accompany it. The product described by this document may contain “open source” software covered by the GNU General Public License or other open source license agreements. To find-out which open source software is included in Brocade products, view the licensing terms applicable to the open source software, and obtain a copy of the programming source code, please visit http://www.brocade.com/support/oscd. Brocade Communications Systems, Incorporated Document History Corporate and Latin American Headquarters Brocade Communications Systems, Inc. 1745 Technology Drive San Jose, CA 95110 Tel: 1-408-333-8000 Fax: 1-408-333-8101 Email: info@brocade.com Asia-Pacific Headquarters Brocade Communications Singapore Pte. Ltd. 30 Cecil Street #19-01 Prudential Tower Singapore 049712 Singapore Tel: +65-6538-4700 Fax: +65-6538-0302 Email: apac-info@brocade.com European Headquarters Brocade Communications Switzerland Sàrl Centre Swissair Tour B - 4ème étage 29, Route de l'Aéroport Case Postale 105 CH-1215 Genève 15 Switzerland Tel: +41 22 799 5640 Fax: +41 22 799 5641 Email: emea-info@brocade.com Title Publication number Summary of changes Date Fabric OS Troubleshooting and Diagnostics Guide 53-0000853-01 First released edition. 12 March 2008 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Added support for Virtual Fabrics, fcPing, pathInfo, and additional troubleshooting tips. 24 November 2008
  3. 3. Fabric OS Troubleshooting and Diagnostics Guide iii 53-1001187-01 Contents About This Document In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix How this document is organized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix Supported hardware and software . . . . . . . . . . . . . . . . . . . . . . . . . . . x What’s new in this document. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x Document conventions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi Text formatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .xi Command syntax conventions . . . . . . . . . . . . . . . . . . . . . . . . . . .xi Notes, cautions, and warnings . . . . . . . . . . . . . . . . . . . . . . . . . . xii Key terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii Additional information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Brocade resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Other industry resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Getting technical help. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Document feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv Chapter 1 Introduction to Troubleshooting In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Troubleshooting overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Network time protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Most common problem areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Questions for common symptoms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Gathering information for your switch support provider. . . . . . . . . . . 5 Setting up your switch for FTP. . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Capturing a supportSave. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Capturing a supportShow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Capturing output from a console . . . . . . . . . . . . . . . . . . . . . . . . . 6 Capturing command output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Building a case for your switch support provider . . . . . . . . . . . . . . . . 7 Basic information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Detailed problem information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . . . 9 Chapter 2 General Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11 Licensing issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11
  4. 4. iv Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Time issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11 Switch message logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12 Checking fan components. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .13 Checking the switch temperature. . . . . . . . . . . . . . . . . . . . . . . .13 Checking the power supply . . . . . . . . . . . . . . . . . . . . . . . . . . . . .14 Checking the temperature, fan, and power supply . . . . . . . . . .14 Switch boot issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .14 Fibre Channel Router connectivity. . . . . . . . . . . . . . . . . . . . . . . . . . .15 Generate and route an ECHO . . . . . . . . . . . . . . . . . . . . . . . . . . .15 Route and statistical information . . . . . . . . . . . . . . . . . . . . . . . . 17 Performance issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .19 Third party applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .20 Chapter 3 Connections Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .21 Port initialization and FCP auto discovery process. . . . . . . . . . . . . .21 Link issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .23 Connection problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .23 Checking the logical connection. . . . . . . . . . . . . . . . . . . . . . . . .23 Checking the name server (NS) . . . . . . . . . . . . . . . . . . . . . . . . . 24 Link failures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .25 Determining a successful negotiation . . . . . . . . . . . . . . . . . . . .26 Checking for a loop initialization failure . . . . . . . . . . . . . . . . . . .26 Checking for a point-to-point initialization failure . . . . . . . . . . . 27 Correcting a port that has come up in the wrong mode . . . . . . 27 Marginal links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .28 Troubleshooting a marginal link . . . . . . . . . . . . . . . . . . . . . . . . .28 Device login issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .29 Pinpointing problems with device logins . . . . . . . . . . . . . . . . . . 31 Media-related issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .34 Testing a port’s external transmit and receive path . . . . . . . . .34 Testing a switch’s internal components . . . . . . . . . . . . . . . . . . .34 Testing components to and from the HBA . . . . . . . . . . . . . . . . .35 Segmented fabrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35 Reconciling fabric parameters individually . . . . . . . . . . . . . . . .36 Downloading a correct configuration . . . . . . . . . . . . . . . . . . . . .36 Reconciling a domain ID conflict . . . . . . . . . . . . . . . . . . . . . . . . 37 Chapter 4 Configuration Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39 Configupload and download issues. . . . . . . . . . . . . . . . . . . . . . . . . .39 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . . 41 Brocade configuration form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
  5. 5. Fabric OS Troubleshooting and Diagnostics Guide v 53-1001187-01 Chapter 5 FirmwareDownload Errors In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .43 Blade troubleshooting tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .43 Firmware download issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .44 Troubleshooting firmwareDownload . . . . . . . . . . . . . . . . . . . . . . . . .46 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . .46 USB error handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .46 Considerations for downgrading firmware . . . . . . . . . . . . . . . . . . . . 47 Preinstallation messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Blade types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .48 Firmware versions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .48 IP settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .49 Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .49 Port settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52 Zoning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52 Chapter 6 Security Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .53 Password issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .53 Password recovery options . . . . . . . . . . . . . . . . . . . . . . . . . . . . .54 Device authentication issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .54 Protocol and certificate management issues . . . . . . . . . . . . . . . . . .55 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . .55 SNMP issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . .56 FIPS issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56 Chapter 7 Virtual Fabrics In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .59 General Virtual Fabric troubleshooting . . . . . . . . . . . . . . . . . . . . . . .59 Fabric identification issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .60 Logical Fabric issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .60 Base switch issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .61 Logical switch issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .62 Switch configuration blade compatibility. . . . . . . . . . . . . . . . . . . . . .63 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . .64 Chapter 8 ISL Trunking Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .65 Link issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .65
  6. 6. vi Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Buffer credit issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .66 Chapter 9 Zone Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .67 Overview of corrective action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .67 Verifying a fabric merge problem . . . . . . . . . . . . . . . . . . . . . . . .67 Verifying a TI zone problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . .67 Segmented fabrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .68 Zone conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .69 Correcting a fabric merge problem quickly . . . . . . . . . . . . . . . .70 Changing the default zone access . . . . . . . . . . . . . . . . . . . . . . . 71 Editing zone configuration members . . . . . . . . . . . . . . . . . . . . . 71 Reordering the zone member list . . . . . . . . . . . . . . . . . . . . . . . .72 Checking for Fibre Channel connectivity problems . . . . . . . . . .72 Checking for zoning problems. . . . . . . . . . . . . . . . . . . . . . . . . . .73 Gathering additional information. . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 Chapter 10 FCIP Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75 FCIP tunnel issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75 FCIP links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .77 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . .78 Port mirroring. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .79 Supported hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .79 Port mirroring considerations . . . . . . . . . . . . . . . . . . . . . . . . . . .80 Port mirroring management . . . . . . . . . . . . . . . . . . . . . . . . . . . .81 FTRACE concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .82 Fibre Channel trace information. . . . . . . . . . . . . . . . . . . . . . . . .82 Chapter 11 FICON Fabric Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89 FICON issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89 Troubleshooting FICON . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .90 General information to gather for all cases . . . . . . . . . . . . . . . .90 Identifying ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 Single-switch topology checklist . . . . . . . . . . . . . . . . . . . . . . . . .92 Cascade mode topology checklist . . . . . . . . . . . . . . . . . . . . . . .92 Gathering additional information . . . . . . . . . . . . . . . . . . . . . . . .92 Troubleshooting FICON CUP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .93 Troubleshooting FICON NPIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .94 Chapter 12 iSCSI Issues In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .95
  7. 7. Fabric OS Troubleshooting and Diagnostics Guide vii 53-1001187-01 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .95 Zoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .98 Chapter 13 Working With Diagnostic Features In this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .99 About Fabric OS diagnostics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .99 Diagnostic information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .99 Power-on self test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .100 Disabling POST. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .102 Enabling POST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .102 Switch status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .102 Viewing the overall status of the switch . . . . . . . . . . . . . . . . . .102 Displaying switch information . . . . . . . . . . . . . . . . . . . . . . . . . .103 Displaying the uptime for a switch . . . . . . . . . . . . . . . . . . . . . .104 Chassis-level diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .104 Spinfab and porttest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .105 Port information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .105 Viewing the status of a port . . . . . . . . . . . . . . . . . . . . . . . . . . .105 Displaying the port statistics. . . . . . . . . . . . . . . . . . . . . . . . . . .106 Displaying a summary of port errors for a switch . . . . . . . . . .107 Equipment status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .108 Displaying the status of the fans . . . . . . . . . . . . . . . . . . . . . . .108 Displaying the status of a power supply. . . . . . . . . . . . . . . . . .109 Displaying temperature status . . . . . . . . . . . . . . . . . . . . . . . . .109 System message log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .110 Displaying the system message log, with no page breaks . . .110 Displaying the system message log one message at a time .110 Clearing the system message log . . . . . . . . . . . . . . . . . . . . . . .110 Port log. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .111 Viewing the port log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .111 Syslogd configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .112 Configuring the host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .113 Configuring the switch. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .113 Automatic trace dump transfers . . . . . . . . . . . . . . . . . . . . . . . . . . .114 Specifying a remote server . . . . . . . . . . . . . . . . . . . . . . . . . . . .115 Enabling the automatic transfer of trace dumps. . . . . . . . . . .115 Setting up periodic checking of the remote server . . . . . . . . .115 Saving comprehensive diagnostic files to the server . . . . . . .115 Diagnostic tests not supported by M-EOS 9.6.2 and FOS 6.0 . . . .116
  8. 8. viii Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Appendix A Switch Type Appendix B Hexadecimal Index
  9. 9. Fabric OS Troubleshooting and Diagnostics Guide ix 53-1001187-01 About This Document In this chapter •How this document is organized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix •Supported hardware and software. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x •What’s new in this document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x •Document conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi •Additional information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii •Getting technical help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii •Document feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv How this document is organized The document contains the following chapters: • Chapter 1, “Introduction to Troubleshooting,” gives a brief overview of troubleshooting the Fabric OS, and provides procedures for gathering basic information from your switch and fabric to aid in troubleshooting. • Chapter 2, “General Issues,” provides information on licensing, hardware, and syslog issues. • Chapter 3, “Connections Issues,” provides information and procedures on troubleshooting various link issues. • Chapter 4, “Configuration Issues,” provides troubleshooting information and procedures for configuration file issues. • Chapter 5, “FirmwareDownload Errors,” provides procedures for troubleshooting firmware download issues. • Chapter 6, “Security Issues,” provides procedures for user account and security issues. • Chapter 7, “Virtual Fabrics”, provides procedures to troubleshooting Virtual Fabrics. • Chapter 8, “ISL Trunking Issues,” provides procedures for resolving trunking issues. • Chapter 9, “Zone Issues,” provides preparations and procedures for performing firmware downloads, as well troubleshooting information. • Chapter 10, “FCIP Issues,” provides information and procedures to troubleshoot FCIP tunnel issues. • Chapter 11, “FICON Fabric Issues,” provides information and procedures to troubleshooting FICON related problems. • Chapter 12, “iSCSI Issues,” provides information and procedures on iSCSI problems.
  10. 10. x Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 • Chapter 13, “Working With Diagnostic Features,” provides procedures for use of the Brocade Adaptive Networking suite of tools, including Traffic Isolation, QoS Ingress Rate Limiting, and QoS SID/DID Traffic Prioritization. • The appendices provide special information to guide you in understanding switch output. Supported hardware and software In those instances in which procedures or parts of procedures documented here apply to some switches but not to others, this guide identifies exactly which switches are supported and which are not. Although many different software and hardware configurations are tested and supported by Brocade Communications Systems, Inc. for 6.2.0, documenting all possible configurations and scenarios is beyond the scope of this document. The following hardware platforms are supported by this release of Fabric OS: • Brocade 200E switch • Brocade 300 switch • Brocade 4016 switch • Brocade 4018 switch • Brocade 4020 switch • Brocade 4024 switch • Brocade 4100 switch • Brocade 4900 switch • Brocade 5000 switch • Brocade 5100 switch • Brocade 5300 switch • Brocade 5424 switch • Brocade 7500 switch • Brocade 7600 switch • Brocade 48000 director • Brocade DCX Backbone • Brocade DCX-4S Backbone What’s new in this document • Information that was added: - Support for new hardware platforms: • Brocade DCX-4S enterprise-class platform (8 Gbps) - Virtual Fabrics, including logical switch support • Supported on the DCX, DCX-4S, 5300, and 5100
  11. 11. Fabric OS Troubleshooting and Diagnostics Guide xi 53-1001187-01 • Support for gathering additional information - FCIP troubleshooting support • Provided additional information on FTRACE - Brocade HBA feature support • FC Ping between devices (GUI and CLI support) - Miscellaneous • FC Ping to switch WWN • Path information similar to traceroute CLI output • RAS enhancements – system-wide RAS LOG support • TI zone troubleshooting information • Information that was changed: - All commands have been updated. • Information that was deleted: - All obsolete information. This information was obsoleted because it was no longer supported in the current version of firmware. For further information about documentation updates for this release, refer to the release notes. Document conventions This section describes text formatting conventions and important notice formats used in this document. Text formatting The narrative-text formatting conventions that are used are as follows: bold text Identifies command names Identifies the names of user-manipulated GUI elements Identifies keywords and operands Identifies text to enter at the GUI or CLI italic text Provides emphasis Identifies variables Identifies paths and Internet addresses Identifies document titles code text Identifies CLI output Identifies command syntax examples Command syntax conventions For readability, command names in the narrative portions of this guide are presented in mixed lettercase: for example, switchShow. In actual examples, command lettercase is often all lowercase. Otherwise, this manual specifically notes those cases in which a command is case sensitive. Command syntax in this manual follows these conventions:
  12. 12. xii Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Notes, cautions, and warnings The following notices and statements are used in this manual. They are listed below in order of increasing severity of potential hazards. NOTE A note provides a tip, guidance or advice, emphasizes important information, or provides a reference to related information. ATTENTION An Attention statement indicates potential damage to hardware or data. CAUTION A Caution statement alerts you to situations that can be potentially hazardous to you or cause damage to hardware, firmware, software, or data. DANGER A Danger statement indicates conditions or situations that can be potentially lethal or extremely hazardous to you. Safety labels are also attached directly to products to warn of these conditions or situations. Key terms For definitions specific to Brocade and Fibre Channel, see the Brocade Glossary. For definitions of SAN-specific terms, visit the Storage Networking Industry Association online dictionary at: http://www.snia.org/education/dictionary command Commands are printed in bold. --option, option Command options are printed in bold. -argument, arg Arguments. [ ] Optional element. variable Variables are printed in italics. In the help pages, values are underlined or enclosed in angled brackets < >. ... Repeat the previous element, for example “member[;member...]” value Fixed values following arguments are printed in plain font. For example, --show WWN | Boolean. Elements are exclusive. Example: --show -mode egress | ingress
  13. 13. Fabric OS Troubleshooting and Diagnostics Guide xiii 53-1001187-01 Additional information This section lists additional Brocade and industry-specific documentation that you might find helpful. Brocade resources To get up-to-the-minute information, join Brocade Connect. It’s free! Go to http://www.brocade.com and click Brocade Connect to register at no cost for a user ID and password. For practical discussions about SAN design, implementation, and maintenance, you can obtain Building SANs with Brocade Fabric Switches through: http://www.amazon.com For additional Brocade documentation, visit the Brocade SAN Info Center and click the Resource Library location: http://www.brocade.com Release notes are available on the Brocade Connect Web site and are also bundled with the Fabric OS firmware. Other industry resources • White papers, online demos, and data sheets are available through the Brocade Web site at http://www.brocade.com/products/software.jhtml. • Best practice guides, white papers, data sheets, and other documentation is available through the Brocade Partner Web site. For additional resource information, visit the Technical Committee T11 Web site. This Web site provides interface standards for high-performance and mass storage applications for Fibre Channel, storage management, and other applications: http://www.t11.org For information about the Fibre Channel industry, visit the Fibre Channel Industry Association Web site: http://www.fibrechannel.org Getting technical help Contact your switch support supplier for hardware, firmware, and software support, including product repairs and part ordering. To expedite your call, have the following information available: 1. General Information • Switch model • Switch operating system version • Error numbers and messages received • supportSave command output
  14. 14. xiv Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 • Detailed description of the problem, including the switch or fabric behavior immediately following the problem, and specific questions • Description of any troubleshooting steps already performed and the results • Serial console and Telnet session logs • syslog message logs 2. Switch Serial Number The switch serial number and corresponding bar code are provided on the serial number label, as illustrated below.: The serial number label is located as follows: • Brocade 200E—On the nonport side of the chassis. • Brocade 4016—On the top of the switch module. • Brocade 4018—On the top of the blade. • Brocade 4020 and 4024—On the bottom of the switch module. • Brocade 4100, 4900, and 7500—On the switch ID pull-out tab located inside the chassis on the port side on the left. • Brocade 5000—On the switch ID pull-out tab located on the bottom of the port side of the switch • Brocade 300, 5100, and 5300—On the switch ID pull-out tab located on the bottom of the port side of the switch. • Brocade 7600—On the bottom of the chassis. • Brocade 48000—Inside the chassis next to the power supply bays. • Brocade DCX and DCX-4S Backbone—On the bottom right on the port side of the chassis. 3. World Wide Name (WWN) Use the wwn command to display the switch WWN. If you cannot use the wwn command because the switch is inoperable, you can get the WWN from the same place as the serial number, except for the Brocade DCX. For the Brocade DCX, access the numbers on the WWN cards by removing the Brocade logo plate at the top of the nonport side of the chassis. For the Brocade 4016, 4018, 4020, 4024, and 5424 embedded switches: Provide the license ID. Use the licenseIdShow command to display the WWN. Document feedback Quality is our first concern at Brocade and we have made every effort to ensure the accuracy and completeness of this document. However, if you find an error or an omission, or you think that a topic needs further development, we want to hear from you. Forward your feedback to: documentation@brocade.com *FT00X0054E9* FT00X0054E9
  15. 15. Fabric OS Troubleshooting and Diagnostics Guide xv 53-1001187-01 Provide the title and version number of the document and as much detail as possible about your comment, including the topic heading and page number and your suggestions for improvement.
  16. 16. xvi Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01
  17. 17. Fabric OS Troubleshooting and Diagnostics Guide 1 53-1001187-01 Chapter 1Introduction to Troubleshooting In this chapter •Troubleshooting overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 •Most common problem areas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 •Questions for common symptoms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 •Gathering information for your switch support provider. . . . . . . . . . . . . . . . . 5 •Building a case for your switch support provider . . . . . . . . . . . . . . . . . . . . . . 7 Troubleshooting overview This book is a companion guide to be used in conjunction with the Fabric OS Administrator’s Guide. Although it provides a lot of common troubleshooting tips and techniques it does not teach troubleshooting methodology. Troubleshooting should begin at the center of the SAN — the fabric. Because switches are located between the hosts and storage devices and have visibility into both sides of the storage network, starting with them can help narrow the search path. After eliminating the possibility of a fault within the fabric, see if the problem is on the storage side or the host side, and continue a more detailed diagnosis from there. Using this approach can quickly pinpoint and isolate problems. For example, if a host cannot detect a storage device, run a switch command, for example switchShow to determine if the storage device is logically connected to the switch. If not, focus first on the switch directly connecting to storage. Use your vendor-supplied storage diagnostic tools to better understand why it is not visible to the switch. If the storage can be detected by the switch, and the host still cannot detect the storage device, then there is still a problem between the host and switch. Network time protocol One of the most frustrating parts of troubleshooting is trying to synchronize switch’s message logs and portlogs with other switches in the fabric. If you do not have NTP set up on your switches, then trying to synchronize log files to track a problem is more difficult.
  18. 18. 2 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Most common problem areas1 Most common problem areas Table 1 identifies the most common problem areas that arise within SANs and identifies tools to use to resolve them. Questions for common symptoms You first need to determine what the problem is. Some symptoms are obvious, such as the switch rebooted without any user intervention, or more obscure, such as your storage is having intermittent connectivity to a particular host. Whatever the symptom is, you will need to gather information from the devices that are directly involved in the symptom. Table 2 lists common symptoms and possible areas to check. You may notice that an intermittent connectivity problem has lots of variables to look into, such as the type of connection between the two devices, how the connection is behaving, and the port type involved. TABLE 1 Common troubleshooting problems and tools Problem area Investigate Tools Fabric • Missing devices • Marginal links (unstable connections) • Incorrect zoning configurations • Incorrect switch configurations • Switch LEDs • Switch commands (for example, switchShow or nsAllShow) for diagnostics • Web or GUI-based monitoring and management software tools Storage Devices • Physical issues between switch and devices • Incorrect storage software configurations • Device LEDs • Storage diagnostic tools • Switch commands (for example, switchShow or nsAllShow) for diagnostics Hosts • Downlevel HBA firmware • Incorrect device driver installation • Incorrect device driver configuration • Host operating system diagnostic tools • Device driver diagnostic tools • Switch commands (for example, switchShow or nsAllShow) for diagnostics Also, make sure you use the latest HBA firmware recommended by the switch supplier or on the HBA supplier's web site Storage Management Applications • Incorrect installation and configuration of the storage devices that the software references. For example, if using a volume-management application, check for: •Incorrect volume installation •Incorrect volume configuration • Application-specific tools and resources
  19. 19. Fabric OS Troubleshooting and Diagnostics Guide 3 53-1001187-01 Questions for common symptoms 1 TABLE 2 Common symptoms Symptom Areas to check Chapter Blade is faulty Firmware or application download Hardware connections Chapter 2, “General Issues” Chapter 5, “FirmwareDownload Errors” Chapter 7, “Virtual Fabrics” Blade is stuck in the “LOADING” state Firmware or application download Chapter 5, “FirmwareDownload Errors” Configupload or download fails FTP or SCP server or USB availability Chapter 4, “Configuration Issues” E_Port failed to come online Correct licensing Fabric parameters Zoning Chapter 2, “General Issues” Chapter 3, “Connections Issues” Chapter 7, “Virtual Fabrics” Chapter 9, “Zone Issues” EX_Port does not form Links Chapter 3, “Connections Issues” Chapter 7, “Virtual Fabrics” Fabric merge fails Fabric segmentation Chapter 2, “General Issues” Chapter 3, “Connections Issues” Chapter 7, “Virtual Fabrics” Chapter 9, “Zone Issues” Fabric segments Licensing Zoning Virtual Fabrics Fabric parameters Chapter 2, “General Issues” Chapter 3, “Connections Issues” Chapter 7, “Virtual Fabrics” Chapter 9, “Zone Issues” FCIP tunnel bounces FCIP tunnel Chapter 10, “FCIP Issues” FCIP tunnel does not come online FCIP tunnel Chapter 10, “FCIP Issues” FCIP tunnel does not form Licensing Fabric parameters Chapter 2, “General Issues” Chapter 10, “FCIP Issues” FCIP tunnel is sluggish FCIP tunnel Chapter 10, “FCIP Issues” Feature is not working Licensing Chapter 2, “General Issues” FCR is slowing down FCR LSAN tags Chapter 2, “General Issues” FICON switch does not talk to hosts FICON settings Chapter 11, “FICON Fabric Issues” FirmwareDownload fails FTP or SCP server or USB availability Firmware version compatibility Unsupported features enabled Firmware versions on switch Chapter 5, “FirmwareDownload Errors” Chapter 7, “Virtual Fabrics” Host application times out FCR LSAN tags Marginal links Chapter 2, “General Issues” Chapter 3, “Connections Issues” Intermittent connectivity Links Trunking Buffer credits FCIP tunnel Chapter 3, “Connections Issues” Chapter 8, “ISL Trunking Issues” Chapter 10, “FCIP Issues” LEDs are flashing Links Chapter 3, “Connections Issues” LEDs are steady Links Chapter 3, “Connections Issues” License issues Licensing Chapter 2, “General Issues” LSAN is slow or times-out LSAN tagging Chapter 2, “General Issues” Marginal link Links Chapter 3, “Connections Issues”
  20. 20. 4 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Questions for common symptoms1 No connectivity between host and storage Cables SCSI timeout errors SCSI retry errors Zoning Chapter 3, “Connections Issues” Chapter 8, “ISL Trunking Issues” Chapter 9, “Zone Issues” Chapter 10, “FCIP Issues” No connectivity between switches Licensing Fabric parameters Segmentation Virtual Fabrics Zoning, if applicable Chapter 2, “General Issues” Chapter 3, “Connections Issues” Chapter 7, “Virtual Fabrics” Chapter 9, “Zone Issues” No light on LEDs Links Chapter 3, “Connections Issues” Performance problems Links FCR LSAN tags FCIP tunnels Chapter 3, “Connections Issues” Chapter 2, “General Issues” Chapter 10, “FCIP Issues” Port cannot be moved Virtual Fabrics Chapter 7, “Virtual Fabrics” SCSI retry errors Buffer credits FCIP tunnel bandwidth Chapter 10, “FCIP Issues” SCSI timeout errors Links HBA Buffer credits FCIP tunnel bandwidth Chapter 3, “Connections Issues” Chapter 8, “ISL Trunking Issues” Chapter 10, “FCIP Issues” Switch constantly reboots FIPS Chapter 6, “Security Issues” Switch is unable to join fabric Security policies Zoning Fabric parameters Chapter 3, “Connections Issues” Chapter 7, “Virtual Fabrics” Chapter 9, “Zone Issues” Switch reboots during configup/download Configuration file discrepancy Chapter 4, “Configuration Issues” Syslog messages Hardware SNMP management station Chapter 2, “General Issues” Chapter 6, “Security Issues” Trunk bounces Cables are on same port group SFPs Trunked ports Chapter 8, “ISL Trunking Issues” Trunk failed to form Licensing Cables are on same port group SFPs Trunked ports Zoning Chapter 2, “General Issues” Chapter 3, “Connections Issues” Chapter 8, “ISL Trunking Issues” Chapter 9, “Zone Issues” User forgot password Password recovery Chapter 6, “Security Issues” User is unable to change switch settings RBAC settings Account settings Chapter 6, “Security Issues” Virtual Fabric does not form FIDs Chapter 7, “Virtual Fabrics” Zone configuration mismatch Effective configuration Chapter 9, “Zone Issues” Zone content mismatch Effective configuration Chapter 9, “Zone Issues” Zone type mismatch Effective configuration Chapter 9, “Zone Issues” TABLE 2 Common symptoms Symptom Areas to check Chapter
  21. 21. Fabric OS Troubleshooting and Diagnostics Guide 5 53-1001187-01 Gathering information for your switch support provider 1 Gathering information for your switch support provider If you are troubleshooting a production system, you must gather data quickly. As soon as a problem is observed, perform the following tasks (if using a dual CP system, run the commands on both CPs). For more information about these commands and their operands, refer to the Fabric OS Command Reference. 1. Enter the supportSave command to save RASLOG, TRACE, supportShow, core file, FFDC data, and other support information from the switch, chassis, blades, and logical switches. NOTE It is recommended that you use the supportFtp command to set up the supportSave environment for automatic dump transfers using the -n and -c options; this will save you from having to enter or know all the required FTP parameters needed to successfully execute a supportSave operation. • Enter the supportShow command to collect information for the local CP to a remote FTP location or using the USB memory device on supporting products. This command does not collect RASLOG, TRACE, core files or FFDC data. To capture the data from the supportShow command, you will need to run the command through a Telnet or SSH utility or serial console connection. 2. Gather console output and logs. NOTE To execute the supportSave or supportShow command on the chassis, you will need to log in to the switch on an account with the admin role that has the chassis role permission. For more details about these commands, see the Fabric OS Command Reference. Setting up your switch for FTP 1. Connect to the switch and log in using an account assigned to the admin role. 2. Type the following command: supportFtp -s [-h hostip][-u username][-p password][-d remotedirectory] 3. Respond to the prompts as follows: Example of supportFTP command switch:admin> supportftp -s Host IP Addr[1080::8:800:200C:417A]: User Name[njoe]: Password[********]: Remote Dir[support]: Auto file transfer parameters changed -h hostip Specifies FTP host IP address. It must be an IP address. hostip should be less than 48 characters. -u username Enter the user name of your account on the server; for example, “JaneDoe”. -d remotedirectory Specifies remote directory to store trace dump files. The supportFtp command cannot take a slash (/) as a directory name. The remote directory should be less than 48 characters. -p password Specifies FTP user password. If the user name is anonymous, the password is not needed. password should be less than 48 characters.
  22. 22. 6 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Gathering information for your switch support provider1 Capturing a supportSave 1. Connect to the switch and log in using an account assigned to the admin role. 2. Type the appropriate supportSave command based on your needs: • If you are saving to an FTP or SCP server, use the following syntax: supportSave When invoked without operands, this command goes into interactive mode. The following operands are optional: -n Does not prompt for confirmation. This operand is optional; if omitted, you are prompted for confirmation. -c Uses the FTP parameters saved by the supportFtp command. This operand is optional; if omitted, specify the FTP parameters through command line options or interactively. To display the current FTP parameters, run supportFtp (on a dual-CP system, run supportFtp on the active CP). • On platforms that support USB devices, you can use your Brocade USB device to save the support files. To use your USB device, use the following syntax: supportsave [-U -d remote_dir] -d Specifies the remote directory to which the file is to be transferred. When saving to a USB device, the predefined /support directory must be used. Capturing a supportShow 1. Connect to the switch through a Telnet or SSH utility or a serial console connection. 2. Log in using an account assigned to the admin role. 3. Set the Telnet or SSH utility to capture output from the screen. Some Telnet or SSH utilities require this step to be performed prior to opening up a session. Check with your Telnet or SSH utility vendor for instructions. 4. Type the supportShow command. Capturing output from a console Some information, such as boot information is only outputted directly to the console. In order to capture this information you have to connect directly to the switch through its management interface, either a serial cable or an RJ-45 connection. 1. Connect directly to the switch using hyperterminal. 2. Log in to the switch using an account assigned to the admin role. 3. Set the utility to capture output from the screen. Some utilities require this step to be performed prior to opening up a session. Check with your utility vendor for instructions. 4. Type the command or start the process to capture the required data on the console.
  23. 23. Fabric OS Troubleshooting and Diagnostics Guide 7 53-1001187-01 Building a case for your switch support provider 1 Capturing command output 1. Connect to the switch through a Telnet or SSH utility. 2. Log in using an account assigned to the admin role. 3. Set the Telnet or SSH utility to capture output from the screen. Some Telnet or SSH utilities require this step to be performed prior to opening up a session. Check with your Telnet or SSH utility vendor for instructions. 4. Type the command or start the process to capture the required data on the console. Building a case for your switch support provider The following form should be filled out in its entirety and presented to your switch support provider when you are ready to contact them. Having this information immediately available will expedite the information gathering process that is necessary to begin determining the problem and finding a solution. Basic information 1. What is the switch’s current Fabric OS level? To determine the switch’s Fabric OS level, type the firmwareShow command and write the information. 2. What is the switch model? To determine the switch model, type the switchshow command and write down the value in the switchType field. Cross-reference this value with the chart located in Appendix A, “Switch Type”. 3. Is the switch operational? Yes or No 4. Impact assessment and urgency: • Is the switch down? Yes or no. • Is it a standalone switch? Yes or no. • Are there VE, VEX or EX ports connected to the chassis? Yes or no. Use the switchShow command to determine the answer. • How large is the fabric? Use the nsAllShow command to determine the answer. • Do you have encryption blades or switches installed in the fabric? Yes or no. • Do you have Virtual Fabrics enabled in the fabric? Yes or no. Use the switchShow command to determine the answer. • Do you have IPsec installed on the switch’s Ethernet interface? Yes or no. Use the ipsecConfig --show command to determine the answer. • Do you have Inband Management installed on the switches GigE ports? Yes or no. User the portShow iproute geX command to determine the answer. • Are you using NPIV? Yes or no.
  24. 24. 8 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Building a case for your switch support provider1 Use the switchShow command to determine the answer. • Are there security policies turned on in the fabric? If so, what are they? (Gather the output from the following commands: secPolicyShow fddCfg --showall ipFilter --show authUtil --show secAuthSecret --show fipsCfg --showall • Is the fabric redundant? If yes, what is the MPIO software? (List vendor and version.) 5. If you have a redundant fabric, did a failover occur? 6. Was POST enabled on the switch? 7. Which CP blade was active? (Only applicable to the Brocade 24000 and 48000 directors, and the Brocade DCX and DCX-4S Backbones.) Detailed problem information Obtain as much of the following informational items as possible prior to contacting the SAN technical support vendor. Document the sequence of events by answering the following questions: • What happened prior to the problem? • Is the problem reproducible? • If so, what are the steps to produce the problem? • What configuration was in place when the problem occurred? • A description of the problem with the switch or the fault with the fabric. • The last actions or changes made to the system environment: settings supportSave output; you can save this information on a qualified and installed Brocade USB storage device only on the Brocade 300, 5100, 5300 and the Brocade DCX enterprise-class platform. supportShow output • Host information: OS version and patch level HBA type HBA firmware version HBA driver version Configuration settings
  25. 25. Fabric OS Troubleshooting and Diagnostics Guide 9 53-1001187-01 Building a case for your switch support provider 1 • Storage information: Disk/tape type Disk/tape firmware level Controller type Controller firmware level Configuration settings Storage software (such as EMC Control Center, Veritas SPC, etc.) 8. If this is a Brocade 48000, Brocade DCX or DCX-4S enterprise-class platform, are the CPs in-sync? Yes or no. Use the haShow command to determine the answer. 9. List out what and when were the last actions or changes made to the switch, the fabric, and the SAN or metaSAN? Gathering additional information Below are features that require you to gather additional information. The additional information is necessary in order for your switch support provider to effectively and efficiently troubleshoot your issue. Refer to the chapter specified for the commands whose data you need to capture. • Configurations, see Chapter 3, “Connections Issues”. • Firmwaredownload, see Chapter 5, “FirmwareDownload Errors”. • Trunking, see Chapter 8, “ISL Trunking Issues”. • Zoning, see Chapter 9, “Zone Issues”. • FCIP tunnels, see Chapter 10, “FCIP Issues”. • FICON, see Chapter 11, “FICON Fabric Issues”. TABLE 3 Environmental changes Type of Change Date when change occurred
  26. 26. 10 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Building a case for your switch support provider1
  27. 27. Fabric OS Troubleshooting and Diagnostics Guide 11 53-1001187-01 Chapter 2General Issues In this chapter •Licensing issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 •Time issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 •Switch message logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 •Switch boot issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 •Fibre Channel Router connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 •Third party applications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 Licensing issues Some features need licenses in order to work properly. To view a list of features and their associated licenses, refer to the Fabric OS Administrator’s Guide. Licenses are created using a switch’s License Identifier so you cannot apply one license to different switches. Before calling your switch support provider, verify that you have the correct licenses installed by using the licenseIdShow command. Symptom A feature is not working. Probable cause and recommended action Refer to the Fabric OS Administrator’s Guide to determine if the appropriate licenses are installed on the local switch and any connecting switches. Determining installed licenses 1. Connect to the switch and log in using an account assigned to the admin role. 2. Type the licenseShow command. A list of the currently installed licenses on the switch will be displayed. Time issues Symptom Time is not in-sync. Probable cause and recommended action NTP is not set up on the switches in your fabric. Set up NTP on your switches in all fabrics in your SAN and metaSAN. For more information on setting up NTP, refer to the Fabric OS Administrator’s Guide.
  28. 28. 12 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Switch message logs2 Switch message logs Switch message logs (RAS logs) contain information on events that happen on the switch or in the fabric. This is an effective tool in understanding what is going on in your fabric or on your switch. Weekly review of the RAS logs is necessary to prevent minor problems from becoming larger issues, or in catching problems at an early stage. Below are some common problems that can occur with or in your system message log. Symptom Inaccurate information in the system message log Probable cause and recommended action In rare instances, events gathered by the track change feature can report inaccurate information to the system message log. For example, a user enters a correct user name and password, but the login was rejected because the maximum number of users had been reached. However, when looking at the system message log, the login was reported as successful. If the maximum number of switch users has been reached, the switch will still perform correctly in that it will reject the login of additional users, even if they enter the correct user name and password information. However, in this limited example, the Track Change feature will report this event inaccurately to the system message log; it will appear that the login was successful. This scenario only occurs when the maximum number of users has been reached; otherwise, the login information displayed in the system message log should reflect reality. See the Fabric OS Administrator’s Guide for information regarding enabling and disabling track changes (TC). Symptom MQ errors are appearing in the switch log. Probable cause and recommended action An MQ error is a message queue error. Identify an MQ error message by looking for the two letters MQ followed by a number in the error message: 2004/08/24-10:04:42, [MQ-1004], 218,, ERROR, ras007, mqRead, queue = raslog-test- string0123456-raslog, queue I D = 1, type = 2 MQ errors can result in devices dropping from the switch’s Name Server or can prevent a switch from joining the fabric. MQ errors are rare and difficult to troubleshoot; resolve them by working with the switch supplier. When encountering an MQ error, issue the supportSave command to capture debug information about the switch; then, forward the supportSave data to the switch supplier for further investigation.
  29. 29. Fabric OS Troubleshooting and Diagnostics Guide 13 53-1001187-01 Switch message logs 2 Symptom I2C bus errors are appearing in the switch log. Probable cause and recommended action I2 C bus errors generally indicate defective hardware or poorly seated devices or blades; the specific item is listed in the error message. See the Fabric OS Message Reference for information specific to the error that was received. Some Chip-Port (CPT) and Environmental Monitor (EM) messages contain I2 C-related information. If the I2 C message does not indicate the specific hardware that may be failing, begin debugging the hardware, as this is the most likely cause. The next sections provide procedures for debugging the hardware. Symptom Core file or FFDC warning messages appear on the serial console or in the system log. Probable cause and recommended action Issue the supportSave command. The messages can be dismissed by issuing the supportSave -R command after all data is confirmed to be collected properly. Error example: *** CORE FILES WARNING (10/22/08 - 05:00:01 ) *** 3416 KBytes in 1 file(s) use "supportsave" command to upload Checking fan components 1. Log in to the switch as user. 2. Enter the fanShow command. 3. Check the fan status and speed output. The output from this command varies depending on switch type and number of fans present. Refer to the appropriate hardware reference manual for details regarding the fan status. You may first consider re-seating the fan (unplug it and plug it back in). Checking the switch temperature 1. Log in to the switch as user. 2. Enter the tempShow command. 3. Check the temperature output. Refer to the hardware reference manual for your switch to determine the normal temperature range. OK Fan is functioning correctly. absent Fan is not present. below minimum Fan is present but rotating too slowly or stopped. above minimum Fan is rotating too quickly. unknown Unknown fan unit installed. faulty Fan has exceeded hardware tolerance and has stopped. In this case, the last known fan speed is displayed.
  30. 30. 14 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Switch boot issues2 Checking the power supply 1. Log in to the switch as user. 2. Enter the psShow command. 3. Check the power supply status. Refer to the appropriate hardware reference manual for details regarding the power supply status. If any of the power supplies show a status other than OK, consider replacing the power supply as soon as possible. For certain switch models, the OEM serial ID data displays after each power supply status line. Checking the temperature, fan, and power supply 1. Log in to the switch as user. 2. Enter the sensorShow command. See the Fabric OS Command Reference for details regarding the sensor numbers. 3. Check the temperature output. Look for indications of high or low temperatures. 4. Check the fan speed output. If any of the fan speeds display abnormal RPMs, replace the fan FRU. 5. Check the power supply status. If any power supplies show a status other than OK, consider replacing the power supply as soon as possible. Switch boot issues Symptom The enterprise-class platform model rebooted again after an initial bootup. Probable cause and recommended action This issue can occur during an enterprise-class platform boot up with two CPs. If any failure occurs on active CP, before the standby CP is fully functional and has obtained HA sync, the Standby CP may not be able to take on the active role to perform failover successfully. In this case, both CPs will reboot to recover from the failure. OK Power supply functioning correctly. absent Power supply not present. unknown Power supply unit installed. predicting failure Power supply is present but predicting failure. faulty Power supply present but faulty (no power cable, power switch turned off, fuse blown, or other internal error).
  31. 31. Fabric OS Troubleshooting and Diagnostics Guide 15 53-1001187-01 Fibre Channel Router connectivity 2 Fibre Channel Router connectivity This section describes tools you can use to troubleshoot Fibre Channel routing connectivity and performance. Generate and route an ECHO The FC-FC Routing Service enables you to route the ECHO generated when an fcPing command is issued on a switch, providing fcPing capability between two devices in different fabrics across the FC router. The fcPing command sends a Fibre Channel ELS ECHO request to a pair of ports. It performs a zone check between the source and destination. In addition, two Fibre Channel Extended Link Service (ELS) requests will be generated. The first ELS request is from the domain controller to the source port identifier. The second ELS request is from the domain controller to the destination port identifiers. The ELS ECHO request will elicit an ELS ECHO response from a port identifier in the fabric and validates link connectivity. To validate link connectivity to a single device or between a pair of devices, use the fcPing command in the following syntax: fcping [--number frames][--length size][--interval wait][--pattern pattern] [--bypasszone] [--quiet] [source] destination [--help] Where: --number frames Specifies the number of ELS Echo requests to send. The default value is 5. --length size Specifies the frame size of the requests in bytes. The default value is 0. Without data, the Fibre Channel Echo request frame size is 12 bytes. The total byte count includes four bytes from the Echo request header and eight bytes from the timestamp. The maximum allowed value is 2,036 bytes. The length must be word-aligned. --interval wait Specifies the interval, in seconds, between successive ELS Echo requests. The default value is 0 seconds. --pattern pattern Specifies up to 16 "pad" bytes, which are used to fill out the request frame payload sent. This is useful for diagnosing data-dependent problems in the fabric link. The pattern bytes are specified as hexadecimal characters. For example, the --pattern ff fills the request frame with instances of the number 1. If a non-byte aligned pattern is specified, the upper nibble of the last pattern byte is filled with zeros. For example, --pattern 123 fills the payload with a pattern of 0x1203. --bypasszone Bypasses the zone check --quiet Suppresses the diagnostic output. Only zoning information, if applicable, and the summary line are displayed. source Specifies the source port ID, port WWN, or node WWN. This operand is optional. destination Specifies the destination. When using fcPing between a source and a destination, specify the destination as a port WWN or a node WWN. When using fcPing to ping a single device, specify the destination as a switch WWN, a domain ID, or a switch domain controller ID. --help Displays the command usage.
  32. 32. 16 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Fibre Channel Router connectivity2 On the edge Fabric OS switch, make sure that the source and destination devices are properly configured in the LSAN zone before entering the fcPing command. This command performs the following functions: • Checks the zoning configuration for the two ports specified. • Generates an ELS (extended link service) ECHO request to the source port specified and validates the response. • Generates an ELS ECHO request to the destination port specified and validates the response. switch:admin> fcping 0x060f00 0x05f001 Source: 0x60f00 Destination: 0x5f001 Zone Check: Zoned Pinging 0x60f00 with 12 bytes of data: received reply from 0x60f00: 12 bytes time:501 usec received reply from 0x60f00: 12 bytes time:437 usec received reply from 0x60f00: 12 bytes time:506 usec received reply from 0x60f00: 12 bytes time:430 usec received reply from 0x60f00: 12 bytes time:462 usec 5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout Round-trip min/avg/max = 430/467/506 usec Pinging 0x5f001 with 12 bytes of data: received reply from 0x5f001: 12 bytes time:2803 usec received reply from 0x5f001: 12 bytes time:2701 usec received reply from 0x5f001: 12 bytes time:3193 usec received reply from 0x5f001: 12 bytes time:2738 usec received reply from 0x5f001: 12 bytes time:2746 usec 5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout Round-trip min/avg/max = 2701/2836/3193 usec Regardless of the device’s zoning configuration, the fcPing command sends the ELS frame to the destination port. A destination device can take any one of the following actions: • Send an ELS Accept to the ELS request. • Send an ELS Reject to the ELS request. • Ignore the ELS request. There are some devices that do not support the ELS ECHO request. In these cases, the device will either not respond to the request or send an ELS reject. When a device does not respond to the ELS request, further debugging is required; however, do not assume that the device is not connected. For details about the fcPing command, see the Fabric OS Command Reference. Example of one device that accepts the request and another device that rejects the request: switch:admin> fcping 10:00:00:00:c9:29:0e:c4 21:00:00:20:37:25:ad:05 Source: 10:00:00:00:c9:29:0e:c4 Destination: 21:00:00:20:37:25:ad:05 Zone Check: Not Zoned Pinging 10:00:00:00:c9:29:0e:c4 [0x20800] with 12 bytes of data: received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1162 usec received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1013 usec received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1442 usec received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1052 usec received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1012 usec 5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
  33. 33. Fabric OS Troubleshooting and Diagnostics Guide 17 53-1001187-01 Fibre Channel Router connectivity 2 Round-trip min/avg/max = 1012/1136/1442 usec Pinging 21:00:00:20:37:25:ad:05 [0x211e8] with 12 bytes of data: Request rejected Request rejected Request rejected Request rejected Request rejected 5 frames sent, 0 frames received, 5 frames rejected, 0 frames timeout Round-trip min/avg/max = 0/0/0 usec Example To use fcPing with a single destination (in this example, the destination is a device node WWN): switch:admin> fcping 20:00:00:00:c9:3f:7c:b8 Destination: 20:00:00:00:c9:3f:7c:b8 Pinging 20:00:00:00:c9:3f:7c:b8 [0x370501] with 12 bytes of data: received reply from 20:00:00:00:c9:3f:7c:b8: 12 bytes time:825 usec received reply from 20:00:00:00:c9:3f:7c:b8: 12 bytes time:713 usec received reply from 20:00:00:00:c9:3f:7c:b8: 12 bytes time:714 usec received reply from 20:00:00:00:c9:3f:7c:b8: 12 bytes time:741 usec received reply from 20:00:00:00:c9:3f:7c:b8: 12 bytes time:880 usec 5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout Round-trip min/avg/max = 713/774/880 usec Route and statistical information The pathInfo command displays routing and statistical information from a source port index on the local switch to a destination port index on another switch. This routing information describes the full path that a data stream travels between these ports, including all intermediate switches. The routing and statistics information are provided by every switch along the path, based on the current routing-table information and statistics calculated continuously in real time. Each switch represents one hop. Use the pathInfo command to display routing information from a source port on the local switch to a destination port on another switch. The command output describes the exact data path between these ports, including all intermediate switches. When using this command in Fabric OS v6.2.0 across fabrics connected through an FC router, the command represents backbone information as a single hop. The command captures details about the FC router to which ingress and egress EX_Ports are connected, but it hides the details about the path the frame traverses from the ingress EX_Ports to the egress EX_Ports in the backbone. To use pathInfo across remote fabrics, you must specify both the fabric ID (FID) and the domain ID of the remote switch. You cannot use the command to obtain source port information across remote FCR fabrics. When obtaining path info across remote fabrics, the destination switch must be identified by its domain ID. Identifying the switch by name or WWN is not accepted. Use the pathInfo command in the following syntax: pathinfo destination_switch [source_port[destination_port]] [-r] [-t]
  34. 34. 18 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Fibre Channel Router connectivity2 Where: To display basic path information to a specific domain in command line mode: switch:admin> pathinfo 91 Target port is Embedded Hop In Port Domain ID (Name) Out Port BW Cost --------------------------------------------------------- 0 E 9 (web226) 2 1G 1000 1 3 10 (web229) 8 1G 1000 2 8 8 (web228) 9 1G 1000 3 6 91 (web225) E - - To display basic and extended statistics in interactive mode: switch:admin> pathinfo Max hops: (1..127) [25] Fabric Id: (1..128) [-1] Domain|Wwn|Name: [] 8 Source port: (0..15) [-1] Destination port: (0..255) [-1] Basic stats (yes, y, no, n): [no] y Extended stats (yes, y, no, n): [no] y Trace reverse path (yes, y, no, n): [no] Source route (yes, y, no, n): [no] Timeout: (1..30) [5] Target port is Embedded Hop In Port Domain ID (Name) Out Port BW Cost --------------------------------------------------------- 0 E 9 (web226) 2 1G 1000 Port E 2 Tx Rx Tx Rx ----------------------------------------------- destination_switch Specifies the destination switch. The destination switch can be identified by its Domain ID, by the switch WWN, or by the switch name. This operand is optional; if omitted, the command runs interactively. source_port Specifies the port whose path to the destination domain is traced. For bladed systems and ports above 256, the destination is specified as the port index; otherwise, it is the port area. The embedded port (-1) is the default. The embedded port can be selected manually by entering the value of MAX_PORT. MAX_PORT stands for the maximum number of ports supported by the local switch. destination_port Specifies the port on the destination switch for the path being traced. This operand returns the state of this port. The embedded port (-1) is used by default, or if you specify a destination port that is not active. For bladed systems and ports above 256, the destination is specified as the port index; otherwise, it is the port area. -r Displays the reverse path in addition to the forward path. This operand is optional. -t Displays the command output in traceroute format. When this operand is used, only routing information is displayed. The output includes the time it takes, in microseconds, to reach each hop. Basic and extended statistics are not available in the traceroute format.
  35. 35. Fabric OS Troubleshooting and Diagnostics Guide 19 53-1001187-01 Fibre Channel Router connectivity 2 B/s (1s) - - 0 0 B/s (64s) - - 1 1 Txcrdz (1s) - - 0 - Txcrdz (64s) - - 0 - F/s (1s) - - 0 0 F/s (64s) - - 2743 0 Words - - 2752748 2822763 Frames - - 219849 50881 Errors - - - 0 Hop In Port Domain ID (Name) Out Port BW Cost --------------------------------------------------------- 1 3 10 (web229) 12 1G 1000 Port 3 12 Tx Rx Tx Rx ----------------------------------------------- B/s (1s) 36 76 0 0 B/s (64s) 5 5 5 5 Txcrdz (1s) 0 - 0 - Txcrdz (64s) 0 - 0 - F/s (1s) 1 1 0 0 F/s (64s) 0 0 0 0 Words 240434036 2294316 2119951 2121767 Frames 20025929 54999 162338 56710 Errors - 4 - 0 Hop In Port Domain ID (Name) Out Port BW Cost --------------------------------------------------------- 2 14 8 (web228) E - - (output truncated) For details about the pathInfo command, see the Fabric OS Command Reference. Performance issues Symptom General slow-down in FCR performance and scalability. Probable cause and recommended action As LSAN zone databases get bigger, it takes more switch resources to process them. Use the enforce tag feature to prevent a backbone switch from accepting unwanted LSAN zone databases into its local database. Symptom Host application times out. Probable cause and recommended action The FCR tends to take a long time, more than 5 seconds, to present and setup paths for the proxy devices. Certain hosts are able to do discovery much faster as a result they end up timing out. Use the speed tag feature to always present target proxy to the host and import them faster. This helps sensitive hosts to do a quick discovery without timing out or cause an application failure.
  36. 36. 20 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Third party applications2 Third party applications Symptom Replication application works for a while and then an error or malfunction is reported. Probable cause and recommended action Some third party applications will work when they are first set up and then cease to work because of an incorrect parameter setting. Check each of the following parameters and your application vendor documentation to determine if these are set correctly: • Port-base routing Use the aptPolicy command to set this feature. • In-order delivery Use the iodSet command to turn this feature on and the iodReset command to turn this feature off. • Load balancing In most cases this should be set to off. Use the dlsReset command to turn off the function.
  37. 37. Fabric OS Troubleshooting and Diagnostics Guide 21 53-1001187-01 Chapter 3Connections Issues In this chapter •Port initialization and FCP auto discovery process . . . . . . . . . . . . . . . . . . . . 21 •Link issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 •Connection problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 •Link failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 •Marginal links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 •Device login issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 •Media-related issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 •Segmented fabrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 Port initialization and FCP auto discovery process The steps in the port initialization process represent a protocol used to discover the type of connected device and establish the port type and port speed. The possible port types are as follows: • U_Port—Universal FC port. The base Fibre Channel port type and all unidentified, or uninitiated ports are listed as U_Ports. • L_/FL_Port—Fabric Loop port. Connects public loop devices. • G_Port—Generic port. Acts as a transition port for non-loop fabric-capable devices. • E_Port—Expansion port. Assigned to ISL links. • F_Port—Fabric port. Assigned to fabric-capable devices. • EX_Port—A type of E_Port. It connects a Fibre Channel router to an edge fabric. From the point of view of a switch in an edge fabric, an EX_Port appears as a normal E_Port. It follows applicable Fibre Channel standards as other E_Ports. However, the router terminates EX_Ports rather than allowing different fabrics to merge as would happen on a switch with regular E_Ports. • M_Port—A mirror port. A mirror port lets you configure a switch port to connect to a port to mirror a specific source port and destination port traffic passing though any switch port. This is only supported between F_Ports. • VE_Port—A virtual E_Port. A Gigabit Ethernet switch port configured for an FCIP tunnel is called a VE port (virtual E-port). However, with a VEX_Port at the other end it does not propagate fabric services or routing topology information from one edge fabric to another.
  38. 38. 22 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Port initialization and FCP auto discovery process3 • VEX_Port—A virtual EX_Port. It connects a Fibre Channel router to an edge fabric. From the point of view of a switch in an edge fabric, a VEX_Port appears as a normal VE_Port. It follows the same Fibre Channel protocol as other VE_Ports. However, the router terminates VEX_Ports rather than allowing different fabrics to merge as would happen on a switch with regular VE_Ports. Figure 1 shows the process behind port initialization. Understanding this process can help you determine where a problem resides. For example, if your switch cannot form an E_Port, you understand that the process never got to that point or does not recognize the switch as an E_Port. Possible solutions would be to look at licensing and port configuration. Verify that the correct licensing is installed or that the port is not configured as a loop port, a G_Port, or the port speed is not set. FIGURE 1 Simple port initialization process The FCP auto discovery process enables private storage devices that accept the process login (PRLI) to communicate in a fabric. If device probing is enabled, the embedded port logs in (PLOGI) and attempts a PRLI into the device to retrieve information to enter into the name server. This enables private devices that do not perform a fabric login (FLOGI), but accept PRLI, to be entered in the name server and receive full fabric citizenship. Private hosts require the QuickLoop feature which is not available in Fabric OS v4.0.0 and later. A fabric-capable device will register information with the Name Server during a FLOGI. These devices will typically register information with the name server before querying for a device list. The embedded port will still PLOGI and attempt PRLI with these devices. To display the contents of a switch’s Name Server, use the nsShow or nsAllShow command. For more information about these name server commands, refer to Fabric OS Command Reference.
  39. 39. Fabric OS Troubleshooting and Diagnostics Guide 23 53-1001187-01 Link issues 3 Link issues Symptom Port LEDs are flashing. Probable cause and recommended action Depending on the rate of the flash and the color of the port LED this could mean several things. To determine what is happening on either your port status LED or power status LED, refer to that switch’s model hardware reference manual. There is a table that describes the LEDs purpose and explains the current behavior as well as provides suggested resolutions. Symptom Port LEDs are steady. Probable cause and recommended action The color of the port LED is important in this instance. To determine what is happening on either your port status LED or power status LED, refer to that switch’s model hardware reference manual. There is a table that describes the LEDs purpose and explains the current behavior as well as provides suggested resolutions. Symptom No light from the port LEDs. Probable cause and recommended action If there is no light coming from the port LED, then no signal is being detected. Check your cable and SFP to determine the physical fault. Connection problems If a host is unable to detect its target, for example, a storage or tape device, you should begin troubleshooting the problem at the switch. Determine if the problem is the target or the host, then continue to divide the suspected problem-path in half until you can pinpoint the problem. One of the most common solutions is zoning. Verify that the host and target are in the same zone. For more information on zoning, refer to Chapter 9, “Zone Issues”. Checking the logical connection 1. Enter the switchShow command. 2. Review the output from the command and determine if the device successfully logged into the switch. • A device that is logically connected to the switch is registered as an F_, L_, E_, EX_, VE_, VEX_, or N_Port. • A device that is not logically connected to the switch will be registered as a G_ or U_Port. If NPIV is not on the switch. 3. Enter the slotShow -m command to verify that all blades are ENABLED and not faulty, disabled or in some other non-available state. 4. Perform the appropriate actions based on how your missing device is connected: • If the missing device is logically connected, proceed to the next troubleshooting procedure (“Checking the name server (NS)” on page 24).
  40. 40. 24 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Connection problems3 • If the missing device is not logically connected, check the device and everything on that side of the data path. Also see “Link failures” on page 25 for additional information. Checking the path includes the following for the Host. Verify the following: • All aspects of the Host OS. • The third-party vendor multi-pathing input/output (MPIO) software if it is being used. • The driver settings and binaries are up to date. • The device Basic Input Output System (BIOS) settings are correct. • The HBA configuration is correct according to manufacturers specifications. • The SFPs in the HBA are compatible with the Hosts HBA. • The cable going from the switch to the Host HBA is not damaged. • The SFP on the switch is compatible with the switch. • All switch settings related to the Host. Checking the path includes the following for the Target: • The driver settings and binaries are up to date. • The device Basic Input Output System (BIOS) settings are correct. • The HBA configuration is correct according to the manufacturers specifications. • The SFPs in the HBA are compatible with the Target HBA. • The cable going from the switch to the Target HBA is not damaged. • All switch settings related to the Target. See “Checking for a loop initialization failure” on page 26 as the next potential trouble spot. Checking the name server (NS) 1. Enter the nsShow command on the switch to determine if the device is attached: The Local Name Server has 9 entries { Type Pid COS PortName NodeName TTL(sec) *N 021a00; 2,3;20:00:00:e0:69:f0:07:c6;10:00:00:e0:69:f0:07:c6; 895 Fabric Port Name: 20:0a:00:60:69:10:8d:fd NL 051edc; 3;21:00:00:20:37:d9:77:96;20:00:00:20:37:d9:77:96; na FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b NL 051ee0; 3;21:00:00:20:37:d9:73:0f;20:00:00:20:37:d9:73:0f; na FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b NL 051ee1; 3;21:00:00:20:37:d9:76:b3;20:00:00:20:37:d9:76:b3; na FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b NL 051ee2; 3;21:00:00:20:37:d9:77:5a;20:00:00:20:37:d9:77:5a; na FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b NL 051ee4; 3;21:00:00:20:37:d9:74:d7;20:00:00:20:37:d9:74:d7; na
  41. 41. Fabric OS Troubleshooting and Diagnostics Guide 25 53-1001187-01 Link failures 3 FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b NL 051ee8; 3;21:00:00:20:37:d9:6f:eb;20:00:00:20:37:d9:6f:eb; na FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b NL 051eef; 3;21:00:00:20:37:d9:77:45;20:00:00:20:37:d9:77:45; na FC4s: FCP [SEAGATE ST318304FC 0005] Fabric Port Name: 20:0e:00:60:69:10:9b:5b N 051f00; 2,3;50:06:04:82:bc:01:9a:0c;50:06:04:82:bc:01:9a:0c; na FC4s: FCP [EMC SYMMETRIX 5267] Fabric Port Name: 20:0f:00:60:69:10:9b:5b 2. Look for the device in the NS list, which lists the nodes connected to that switch. This allows you to determine if a particular node is accessible on the network. • If the device is not present in the NS list, the problem is between the device and the switch. There may be a time-out communication problem between edge devices and the name server, or there may be a login issue. First check the edge device documentation to determine if there is a time-out setting or parameter that can be reconfigured. Also, check the port log for NS registration information and FCP probing failures (using the fcpProbeShow command). If these queries do not help solve the problem, contact the support organization for the product that appears to be inaccessible. • If the device is listed in the NS, the problem is between the storage device and the host. There may be a zoning mismatch or a host/storage issue. Proceed to Chapter 9, “Zone Issues”. 3. Enter the portLoginShow command to check the port login status. 4. Enter the fcpProbeShow command to display the FCP probing information for the devices attached to the specified F_Port or L_Port. This information includes the number of successful logins and SCSI INQUIRY commands sent over this port and a list of the attached devices. 5. Check the port log to determine whether or not the device sent the FLOGI frame to the switch, and the switch probed the device. Link failures A link failure occurs when a server, storage, or switch device is connected to a switch, but the link between the devices does not come up. This prevents the devices from communicating to or through the switch. If the switchShow command or LEDs indicate that the link has not come up properly, use one or more of the following procedures. The port negotiates the link speed with the opposite side. The negotiation usually completes in one or two seconds; however, sometimes the speed negotiation fails. NOTE Skip this procedure if the port speed is set to a static speed through the portCfgSpeed command.
  42. 42. 26 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Link failures3 Determining a successful negotiation 1. Enter the portCfgShow command to display the port speed settings of all the ports. 2. Enter the switchShow command to determine if the port has module light. 3. Determine whether or not the port completes speed negotiation by entering the portCfgSpeed command. Then change the port speed to 1, 2, 4 or 8Gbps, depending on what speed can be used by both devices. This should correct the negotiation by setting to one speed. 4. Enter the portLogShow or portLogDump command. 5. Check the events area of the output: 14:38:51.976 SPEE sn <Port#> NC 00000001,00000000,00000001 14:39:39.227 SPEE sn <Port#> NC 00000002,00000000,00000001 • sn indicates a speed negotiation. • NC indicates negotiation complete. If these fields do not appear, proceed to the step 6. 6. Correct the negotiation by entering the portCfgSpeed [slotnumber/]portnumber, speed_level command if the fields in step 5 do not appear. switch:admin> portcfgspeed Usage: portCfgSpeed PortNumber Speed_Level Speed_Level: 0 - Auto Negotiate 1 - 1Gbps 2 - 2Gbps 4 - 4Gbps 8 - 8Gbps ax - Auto Negotiate + enhanced retries Checking for a loop initialization failure 1. Verify the port is an L_Port. a. Enter the switchShow command. b. Check the comment field of the output to verify that the switch port indicates an L_Port. If a loop device is connected to the switch, the switch port must be initialized as an L_Port. c. Check to ensure that the port state is online; otherwise, check for link failures. 2. Verify that loop initialization occurred if the port to which the loop device is attached does not negotiate as an L_Port. a. Enter the portLogShow or portLogDump command. b. Check argument number four for the loop initialization soft assigned (LISA) frame (0x11050100). switch:admin> portlogdumpport 4 time task event port cmd args ------------------------------------------------- 11:40:02.078 PORT Rx3 23 20 22000000,00000000,ffffffff,11050100 Received LISA frame The LISA frame indicates that the loop initialization is complete.
  43. 43. Fabric OS Troubleshooting and Diagnostics Guide 27 53-1001187-01 Link failures 3 3. Skip point-to-point initialization by using the portCfgLport Command. The switch changes to point-to-point initialization after the LISA phase of the loop initialization. This behavior sometimes causes trouble with old HBAs. Checking for a point-to-point initialization failure 1. Enter the switchShow command to confirm that the port is active and has a module that is synchronized. If a fabric device or another switch is connected to the switch, the switch port must be online. 2. Enter the portLogShow or portLogDump commands. 3. Verify the event area for the port state entry is pstate. The command entry AC indicates that the port has completed point-to-point initialization. switch:admin> portlogdumpport 4 time task event port cmd args ------------------------------------------------- 11:38:21.726 INTR pstate 4 AC 4. Skip over the loop initialization phase. After becoming an active port, the port becomes an F_Port or an E_Port depending on the device on the opposite side. If the opposite device is a fabric device, the port becomes an F_Port. If the opposite device is another switch, the port becomes an E_Port. If there is a problem with the fabric device, enter the portCfgGPort to force the port to try to come up as point-to-point only. Correcting a port that has come up in the wrong mode 1. Enter the switchShow command. 2. Check the output from the switchShow command and follow the suggested actions in Table 4. TABLE 4 SwitchShow output and suggested action Output Suggested action Disabled If the port is disabled (for example, because persistent disable or security reasons), attempt to resolve the issue and then enter the portEnable or portCfgPersistentEnable command. Bypassed The port may be testing. Loopback The port may be testing. E_Port If the opposite side is not another switch, the link has come up in a wrong mode. Check the output from the portLogShow or PortLogDump commands and identify the link initialization stage where the initialization procedure went wrong. F_Port If the opposite side of the link is a private loop device or a switch, the link has come up in a wrong mode. Check the output from portLogShow or PortLogDump commands.
  44. 44. 28 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Marginal links3 NOTE If you are unable to read a portlog dump, contact your switch support provider for assistance. Marginal links A marginal link involves the connection between the switch and the edge device. Isolating the exact cause of a marginal link involves analyzing and testing many of the components that make up the link (including the switch port, switch SFP, cable, edge device, and edge device SFP). Table 5 shows the different loopback modes you can use when using the portLoopbackTest to test a marginal link. Troubleshooting a marginal link 1. Enter the portErrShow command. 2. Determine whether there is a relatively high number of errors (such as CRC errors or ENC_OUT errors), or if there are a steadily increasing number of errors to confirm a marginal link. 3. If you suspect a marginal link, isolate the areas by moving the suspected marginal port cable to a different port on the switch. Reseating of SFPs may also cure marginal port problems. If the problem stops or goes away, the switch port or the SFP is marginal (proceed to step 4) If the problem does not stop or go away, see step 7. 4. Replace the SFP on the marginal port. G_Port The port has not come up as an E_Port or F_Port. Check the output from portLogShow or PortLogDump commands and identify the link initialization stage where the initialization procedure went wrong. L_Port If the opposite side is not a loop device, the link has come up in a wrong mode. Check the output from portLogShow or PortLogDump commands and identify the link initialization stage where the initialization procedure went wrong. TABLE 4 SwitchShow output and suggested action (Continued) Output Suggested action TABLE 5 Loopback modes Loopback mode Description 1 Port Loopback (loopback plugs) 2 External Serializer/Deserializer (SerDes) loopback 5 Internal (parallel) loopback (indicates no external equipment) 7 Back-end bypass and port loopback 8 Back-end bypass and SerDes loopback 9 Back-end bypass and internal loopback
  45. 45. Fabric OS Troubleshooting and Diagnostics Guide 29 53-1001187-01 Device login issues 3 5. Run the portLoopbackTest on the marginal port. You will need an adapter to run the loopback test for the SFP. Otherwise, run the test on the marginal port using the loopback mode lb=5. Use the different modes shown in Table 5 to test the port. See the Fabric OS Command Reference for additional information on this command. 6. Check the results of the loopback test and proceed as follows: • If the loopback test failed, the port is bad. Replace the port blade or switch. • If the loopback test did not fail, the SFP was bad. 7. Perform the following steps to rule out cabling issues: a. Insert a new cable in the suspected marginal port. b. Enter the portErrShow command to determine if a problem still exists. • If the portErrShow output displays a normal number of generated errors, the issue is solved. • If the portErrShow output still displays a high number of generated errors, follow the troubleshooting procedures for the Host or Storage device in the following section, “Device login issues”. Device login issues A correct login is when the port type matches the device type that is plugged in. In the example below, it shows that the device connected to Port 1 is a fabric point-to-point device and it is correctly logged in an F_Port. brcd5300:admin> switchshow switchName:brcd5300 switchType:64.3 switchState:Online switchMode:Native switchRole:Subordinate switchDomain:1 switchId:fffc01 switchWwn:10:00:00:05:1e:40:ff:c4 zoning:OFF switchBeacon:OFF FC Router:OFF FC Router BB Fabric ID:1 Area Port Media Speed State Proto ===================================== 0 0 -- N8 No_Module 1 1 -- N8 No_Module 2 2 -- N8 No_Module 3 3 -- N8 No_Module 4 4 -- N8 No_Module 5 5 -- N8 No_Module 6 6 -- N8 No_Module 7 7 -- N8 No_Module 8 8 -- N8 No_Module 9 9 -- N8 No_Module 10 10 -- N8 No_Module 11 11 -- N8 No_Module 12 12 -- N8 No_Module
  46. 46. 30 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Device login issues3 13 13 -- N8 No_Module 14 14 -- N8 No_Module 15 15 -- N8 No_Module 16 16 -- N8 No_Module 17 17 -- N8 No_Module 18 18 -- N8 No_Module 19 19 -- N8 No_Module 20 20 -- N8 No_Module 21 21 -- N8 No_Module 22 22 -- N8 No_Module 23 23 -- N8 No_Module 24 24 -- N8 No_Module 25 25 -- N8 No_Module 26 26 -- N8 No_Module 27 27 -- N8 No_Module 28 28 -- N8 No_Module 29 29 -- N8 No_Module 30 30 -- N8 No_Module 31 31 -- N8 No_Module 32 32 -- N8 No_Module 33 33 -- N8 No_Module 34 34 -- N8 No_Module 35 35 -- N8 No_Module 36 36 -- N8 No_Module 37 37 -- N8 No_Module 38 38 -- N8 No_Module 39 39 -- N8 No_Module 40 40 -- N8 No_Module 41 41 -- N8 No_Module 42 42 -- N8 No_Module 43 43 -- N8 No_Module 44 44 -- N8 No_Module 45 45 -- N8 No_Module 46 46 -- N8 No_Module 47 47 -- N8 No_Module 48 48 -- N8 No_Module 49 49 -- N8 No_Module 50 50 -- N8 No_Module 51 51 -- N8 No_Module 52 52 -- N8 No_Module 53 53 -- N8 No_Module 54 54 -- N8 No_Module 55 55 -- N8 No_Module 56 56 -- N8 No_Module 57 57 -- N8 No_Module 58 58 -- N8 No_Module 59 59 -- N8 No_Module 60 60 -- N8 No_Module 61 61 -- N8 No_Module 62 62 -- N8 No_Module 63 63 -- N8 No_Module 64 64 id N2 Online E-Port 10:00:00:05:1e:34:d0:05 "1_d1" (Trunk master) 65 65 -- N8 No_Module 66 66 -- N8 No_Module 67 67 id AN No_Sync 68 68 id N2 Online L-Port 13 public 69 69 -- N8 No_Module 70 70 -- N8 No_Module 71 71 id N2 Online L-Port 13 public
  47. 47. Fabric OS Troubleshooting and Diagnostics Guide 31 53-1001187-01 Device login issues 3 72 72 -- N8 No_Module 73 73 -- N8 No_Module 74 74 -- N8 No_Module 75 75 -- N8 No_Module 76 76 id N2 Online E-Port 10:00:00:05:1e:34:d0:05 "1_d1" (upstream)(Trunk master) 77 77 id N4 Online F-Port 10:00:00:06:2b:0f:6c:1f 78 78 -- N8 No_Module 79 79 id N2 Online E-Port 10:00:00:05:1e:34:d0:05 "1_d1" (Trunk master) Pinpointing problems with device logins 1. Log in to the switch as admin. 2. Enter the switchShow command; then, check for correct logins. 3. Enter the portCfgShow command to see if the port is configured correctly. In some cases, you may find that the port has been locked as an L_Port and the device attached is a fabric point-to-point device such as a host or switch. This would be an incorrect configuration for the device and therefore the device cannot log into the switch. To correct this type of problem, remove the Lock L_Port configuration using the portCfgDefault command. switch:admin> portcfgshow Ports of Slot 0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 -----------------+--+--+--+--+----+--+--+--+----+--+--+--+----+--+--+-- Speed AN AN AN AN AN AN AN AN AN AN AN AN AN AN AN AN Trunk Port ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON Long Distance .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. VC Link Init .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Locked L_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Locked G_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Disabled E_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ISL R_RDY Mode .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. RSCN Suppressed .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Persistent Disable.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. NPIV capability ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON where AN:AutoNegotiate, ..:OFF, ??:INVALID, SN:Software controlled AutoNegotiation. 4. Enter the portErrShow command; then, check for errors that can cause login problems. A steadily increasing number of errors can indicate a problem. Track errors by sampling the port errors every five or ten minutes. • The frames tx and rx are the number of frames being transmitted and received. • The crc_err counter are frames with CRC errors. If this counter goes up then the physical path should be inspected. Check the cables to and from the switch, patch panel, and other devices. Check the SFP by swapping it with a well-known-working SFP. • The crc_g_eof counter are frames with CRC errors and a good EOF. Once a frame is detected to have a CRC error, the EOF is modified from that point on. So the first place that the CRC is detected is the only place where the CRC with a good EOF is seen. This good EOF error helps identify the source of the CRC error.
  48. 48. 32 Fabric OS Troubleshooting and Diagnostics Guide 53-1001187-01 Device login issues3 • The enc_out are errors that occur outside the frame and usually indicating a bad primitive. To determine if you are having a cable problem, take snapshots of the port errors by using the portErrShow command in increments of 5 to 10 minutes. If you notice the crc_err counter go up, you have a bad or damaged cable, or a bad or damaged device in the path. • The disc_c3 errors are discarded class 3 errors, which means that the switch is holding onto the frame longer than the hold time allows. One problem this could be related to is ISL oversubscription. switch:admin> porterrshow frames enc crc crc too too bad enc disc link loss loss frjt fbsy tx rx in err g_eof shrt long eof out c3 fail sync sig ============================================================================ 0: 665k 7.0k 0 0 0 0 0 0 6 0 0 1 2 0 0 1: 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 2: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 3: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 4: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 5: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 6: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 7: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 8: 78 60 0 0 0 0 0 0 7 0 0 3 6 0 0 9: 12 4 0 0 0 0 0 0 3 0 0 1 2 0 0 10: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 11: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 12: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 13: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 14: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 15: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 16: 665k 7.4k 0 0 0 0 0 0 6 0 0 1 2 0 0 17: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 18: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 19: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 20: 6.3k 6.6k 0 0 0 0 0 0 7 0 0 1 2 0 0 21: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 22: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 23: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 24: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 25: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 26: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 27: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 28: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 29: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 30: 664k 6.7k 0 0 0 0 0 0 6 0 0 1 2 0 0 31: 12 4 0 0 0 0 0 0 3 0 0 1 2 0 0 (output truncated)

×