© 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante
Cisco Public
Advanced Troubleshooting Cisco
Nexus 7000 Series Switches
BRKDCT-3144
Dipl.-Ing. Andreas la Quiante
alaquian@cisco.com
Nexus Product Management, Cisco Data Center Group
Level 3 (:= Advanced)
Version 019
2014 San Francisco
18-MAY-14
Chapter 0
Housekeeping
ASICs are counting
starting with zero.
So do we today
08:03
Teamwork
thank you
Matt
Martin
Ron
Roland Dmitry
Ronald
Adam
Need help like me?
Terri
N7K
Switch
Router
PC
Layer 3
Layer 2
Focus
areas
N7004-Berlin#
sh int e 3/12
CLI
Geek
content
Error/Failure/Challenge
Cisco TAC
Interface
Housekeeping
Icons
VLAN
08:05 5
Agenda, Timing and Theme
…it’s like going on vacation…
6
Strategy
Tools & System
Data-Plane
Layer 2
TCAM
Data-Plane
Layer 3
Control-Plane
Inband
Control-Plane
ARP
Cisco Live 2014
San Francisco: 120 min
1
2
3
Summary, Wrap Up
Layout Item
4
5
6
Chapter 1-3: Chapter 4-6:
Chapter 1
Strategy, Tools and System
ELAME
System
Strategy Scripts
CLI Ethanalyzer
Guidance
System Troubleshooting
- Core, CPU, Memory, Interface/Vlan behaving odd,
hardware challenges
Data Plane Troubleshooting
- Packets are lost
- your primary question is “where”
- 100% loss or partial loss
- consistent or periodic
Control Plane Troubleshooting
- Something is flapping
- Convergence challenges
- start at the process (log)
“Anything better
than checking
everything is an
improvement”
Strategy
Three Areas
Dmitry
I/O Module
(Forwarding
Engine)
I/O Module
(Forwarding
Engine)
System
Control-
Plane
Data-
Plane
Reference Point 1
Supervisor
(Control-Plane)
Strategy
System, Data-Plane, and Control-Plane
RL CoPP
Reference Point 2
Tools
Content Suggestion (via Cisco Live Content on-demand library, e.g. 2013 Orlando)
BRKARC-2011 Overview of Troubleshooting Tools in Cisco Switches and Routers
Yogesh Ramdoss - Technical Leader, Cisco Services, Cisco
Andy Gossett - Customer Support Engineer, Cisco
NX-OS Value
NX-OS is built with
the most extensive,
fine-grained logging
capabilities
NX-OS High
Performance
Feature Rich
Switching
Logging
Switching
Logging
NX-OS:
Built-in
Flight Recorder
Tools
Logging built in
PI := Platform Independent
PD := Platform Dependent
Config
Python, NxAPI
GUI, OF, SNMP
XML, OnePK
Chef, Puppet
Standard CLI
Python/TCL
Engineering CLI
Internal keyword
output is not
documented
Action
11
PD
PI
Show tech ABC
Always try to use the detailed version show tech detail
Feature
Event history
States (PSS,...)
HW states
Always redirect to a file
Always use a separate file per show tech
Global Service
VDC-1 Default
Feature
“project binary logger”
Significant time saver
Show tech all-binary
Avoiding also
“we need show tech A”
after a while doing RCA
“we need show tech B”
For use by
TAC/BU/ENG
t0 t2 t3
t1
t0 to t2
trigger failure
Immediately
collect data!
Then start
troubleshooting
Tools
show tech
12
If not enough time: try a specific show tech
ASICs
Some „error“ counters are part of a normal operation
(e.g. dropping packets at ingress trunk if the marked VLAN is not
known (CBL drops), diag packets, extra flooded packets)
One of TAC‘s favourite commands. Use „all“ to look for all
modules / ASICs
N7004-Berlin# show hardware internal errors module 3
|------------------------------------------------------------------------|
| Device:Clipper MAC Role:MAC Mod: 3 |
| Last cleared @ Mon Nov 25 21:41:37 2013
| Device Statistics Category :: ERROR
|------------------------------------------------------------------------|
Instance:2
Cntr Name Value Ports
----- ---- ----- -----
0 GD GMAC bad character interrupt 0000000000000002 12 -
1 GD GMAC sequence error interrupt 0000000000000002 12 -
2 GD GMAC transition from nosync to sync int 0000000000000002 12 -
3 GD GMAC transition from sync to nosync int 0000000000000001 12 -
4 PL ingress_cbl_drop 0000000000003426 12 -
GD GMAC := Built-in MAC Controller
Our innovative ASICs
provide many counters
1) Suspicion for abc
2) Show hardw int err
3) Send test packets
4) Show hardw int err
Non-Zero Counter
Tools
Custom ASICs
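The four-step check on this slide (look at the counters, send test packets, look again) boils down to a counter delta. A minimal Python sketch for doing that offline on two saved outputs of show hardware internal errors follows. It assumes the 16-digit value column is a zero-padded decimal number (the slides do not say), and the function names parse_counters and diff_counters are hypothetical helpers, not part of NX-OS. The second snapshot is invented for illustration.

```python
import re

# Matches counter rows such as:
#   "4 PL ingress_cbl_drop 0000000000003426 12 -"
# Assumption: the 16-digit value column is zero-padded decimal.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d{16})\b")

def parse_counters(text):
    """Return {counter name: value} for every counter row in the text."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            counters[match.group(2)] = int(match.group(3))
    return counters

def diff_counters(before, after):
    """Return {name: delta} for counters that increased between snapshots."""
    old, new = parse_counters(before), parse_counters(after)
    return {name: value - old.get(name, 0)
            for name, value in new.items()
            if value > old.get(name, 0)}

before = """\
3 GD GMAC transition from sync to nosync int 0000000000000001 12 -
4 PL ingress_cbl_drop 0000000000003426 12 -
"""
# Hypothetical second snapshot, taken after sending test packets.
after = """\
3 GD GMAC transition from sync to nosync int 0000000000000001 12 -
4 PL ingress_cbl_drop 0000000000003430 12 -
"""
print(diff_counters(before, after))
```

Only counters that moved are reported, which is the point of step 4: a non-zero counter that does not increase during the test is usually not your problem. Counter names are used as keys, so feed the sketch one module/instance at a time.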
Tips & Tricks
N7004-Berlin# show system internal pktmgr interface
<SNIP>
Vlan1, ordinal: 38 Hash_type: 1
SUP-traffic statistics: (sent/received)
Packets: 2769 / 1896
Bytes: 1619370 / 241310
Instant packet rate: 1 pps / 0 pps
Packet rate limiter (Out/In): 0 pps / 0 pps
Average packet rates(1min/5min/15min/EWMA):
Packet statistics:
Tx: Unicast 1123, Multicast 1641
Broadcast 5
N7004-Berlin# show system internal pktmgr interface |in or|I
<SNIP>
Vlan1, ordinal: 38 Hash_type: 1
Instant packet rate: 0 pps / 0 pps
Packet rate limiter (Out/In): 0 pps / 0 pps
port-channel100, ordinal: 72 Hash_type: 1
Instant packet rate: 1 pps / 1 pps
Packet rate limiter (Out/In): 0 pps / 0 pps
If I am only interested in
parts of the output I can ask
for just those items
You save time by having to
read less
Nexus# sh ver | ?
egrep Egrep -
grep Grep -
head Displ 1st ln
last Displ last
less Filter
no-more
sed
wc Count
begin Begin with
count Count
exclude Exclude ln
include Include ln
Tools
customizing CLI
14
N7004-Berlin# sh processes cpu sort | ex 0.0
„real-time filter“
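The same include/exclude idea can be applied offline to output that was redirected to a file. A minimal Python sketch, assuming the filters behave like case-sensitive regular-expression matches (which is what the slide output suggests); the helper names include and exclude are hypothetical and simply mirror the CLI keywords.

```python
import re

def include(lines, pattern):
    """Keep lines matching the regex, like the CLI '| include' filter."""
    return [line for line in lines if re.search(pattern, line)]

def exclude(lines, pattern):
    """Drop lines matching the regex, like the CLI '| exclude' filter."""
    return [line for line in lines if not re.search(pattern, line)]

# A few lines taken from the pktmgr output on this slide.
saved_output = """\
Vlan1, ordinal: 38 Hash_type: 1
Packets: 2769 / 1896
Instant packet rate: 0 pps / 0 pps
port-channel100, ordinal: 72 Hash_type: 1
Instant packet rate: 1 pps / 1 pps
""".splitlines()

# Rough equivalent of "| in or|I": interface header lines (they contain
# "or" in "ordinal") plus every line containing a capital "I".
for line in include(saved_output, r"or|I"):
    print(line)
```

Chaining works the same way as on the CLI: exclude(include(lines, "pps"), " 0 pps / 0 pps") keeps only the interfaces that currently see traffic.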
6.2(2)
Tools
Tools
Tools (Scripts)
System Check (systemcheck)
Packet Capture (elame)
Event Time Analysis (logw)
6.2(6) 6.2(8)
NX-OS
(Thank you : Adam, Francesco, Dmitry, …)
Information Source
Tools
System Check
Show
tech
Live Device : Nexus
7000-Series
Offline:
Show Tech-Support
Goal: Identifying top x platform issues in one pass
Time saving vs. traditional approach: 30-40 minutes
Hardware health, failing diagnostics,
error interrupts
Control plane overload (inband, CPU,
IPC, process, network stability)
Resource issues (CPU, memory,
forwarding resources)
Data plane issues (drops, errors)
statistical analysis
Option “-v” shows the CLI used by
systemcheck
S
C
R
I
P
T
Example
Tools
System Check
*** hw internal counters ***
*** CPU, process crashes,
service restarts ***
*** Memory ***
*** IPC/MTS ***
*** HW Limit Checks ***
N7K# source sys/systemcheck.py
*** modules, diagnostics, HW exceptions ***
module: 8 (N7K-M132XP-12L) state: ok, FSM state:
LCM_MOD_ST_LC_POWERED_UP/LCM_LC_ST_ONLINE
recent HW exceptions:
2013-07-01 15:28:17 System Manager:0x401e008a Service on linecard
had a hap-reset
2013-06-28 10:33:08 System Manager:0x401e008a Service on linecard
had a hap-reset
59 HW exceptions before last reload
*** HW internal counters ***
active slots ['1', '2', '4', '7', '8', '9', 'sup']
processing data for slot 1
unique error types: 8
freq / cumulative amount / error
0 12 PL ingress_rx_diag_0_drops
0 15 IB ingress_ib_de_and_pl_drop (small cnt)
0 1 IB INT DE packet drop (cr_type = 0, all fpoe = 0)
10 675048 EB egress credited pkt drops
10 1 IB INT DE packet drop
27 1 PL ingress_rx_err
40 3326668571 PL egress_cbl_drop
40 21 PL ingress_cbl_drop
18
08:10
Location
Rotated once reaching
10MB
logfile nvram onboard
Logfile:
Syslog Messages
NVRAM:
High Severity Messages (SEV 1 or 2)
On-Board:
Major state changes, MTS transactions
Useful for module troubleshooting
N7004-Berlin# show logging nvram
2013 Nov 9 23:03:25 N7004-Berlin %$ VDC-2 %$
%L2FM-2-L2FM_CFS_SEND_FAILED: cfs send failed, num 1
Wraps quickly
-2- := severity 2
:= Critical
N7004-Berlin# show file logflash:log/messages
2008 Jan 2 19:24:21 %MODULE-5-ACTIVE_SUP_OK: Supervisor 6 is
active (serial: JAFxxxxxxxxx)
2008 Jan 2 19:24:21 %PLATFORM-5-MOD_STATUS: Module 6 current-
status is MOD_STATUS_ONLINE/OK
It is a good idea to synchronize
all devices in your network to
one time source
19
Tools
Logging
Possible next step after syslog:
Look for more information
in the event-history of the
notifying feature
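To find the "notifying feature" and the severity quickly, the %FACILITY-SEVERITY-MNEMONIC part of a syslog line can be split mechanically. A minimal Python sketch, assuming the message layout seen on these slides (optional -SLOTn- tag between facility and severity); parse_syslog is a hypothetical helper name, and the severity names are the standard syslog levels.

```python
import re

# Standard syslog severity names (0 = most severe).
SEVERITY = ["Emergency", "Alert", "Critical", "Error",
            "Warning", "Notification", "Informational", "Debugging"]

# %FACILITY[-SLOTn]-SEVERITY-MNEMONIC: text
MESSAGE = re.compile(
    r"%([A-Z0-9_]+)(?:-(SLOT\d+))?-([0-7])-([A-Z0-9_]+):\s*(.*)")

def parse_syslog(line):
    """Split one syslog line into facility, severity and mnemonic."""
    match = MESSAGE.search(line)
    if not match:
        return None
    facility, slot, severity, mnemonic, text = match.groups()
    return {
        "facility": facility,          # the notifying feature, e.g. L2FM
        "slot": slot,                  # e.g. "SLOT8", or None if absent
        "severity": int(severity),
        "severity_name": SEVERITY[int(severity)],
        "mnemonic": mnemonic,
        "text": text,
    }

line = ("2013 Nov 9 23:03:25 N7004-Berlin %$ VDC-2 %$ "
        "%L2FM-2-L2FM_CFS_SEND_FAILED: cfs send failed, num 1")
print(parse_syslog(line))
```

For the example line the facility is L2FM, so the l2fm event-history would be the next place to look.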
Trigger
logw.py [-h] [-v] [-f FILTERS] [-t TRUNCATE] [-n
MAX_EVENTS] [-s] start_date start_time duration
Logfile
10MB
logfile
NVRAM
On-board
Event
History
A new tool:
logwindow
Tools
Logwindow
N7K# source logw.py 15/01/2014 12:24:55 100
starting with empty stats stats init done Logw system check port version
0.060813
Time range 2014-01-15 12:24:55 ... 2014-01-15 12:26:35 Got 343 show
... event-history clis
244 clis left after pre-filtering
collecting outputs...done, collected 2602 events in 96.197735 seconds sorted
<snip>
20
Tip:
show log log immediately displays the logfile output, and is
faster than show log which has to read the logging severity settings
Specify a start-time to limit output
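How the three positional arguments turn into the reported time range can be checked with a few lines of Python. This is a sketch of the arithmetic only, not the logw.py source: it assumes the date is given as dd/mm/yyyy and the duration in seconds, which matches the example above (15/01/2014 12:24:55 100 gives 12:24:55 ... 12:26:35). The names time_window and in_window are hypothetical.

```python
from datetime import datetime, timedelta

def time_window(start_date, start_time, duration_s):
    """Return (start, end) for a dd/mm/yyyy date, hh:mm:ss time, seconds."""
    start = datetime.strptime(f"{start_date} {start_time}",
                              "%d/%m/%Y %H:%M:%S")
    return start, start + timedelta(seconds=duration_s)

def in_window(timestamp, start, end):
    """True if an event timestamp falls inside the window (inclusive)."""
    return start <= timestamp <= end

start, end = time_window("15/01/2014", "12:24:55", 100)
print(f"Time range {start} ... {end}")
print(in_window(datetime(2014, 1, 15, 12, 25, 30), start, end))
```

Keeping the window short matters: in the example run, 100 seconds already produced 2602 events and took about 96 seconds to collect.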
Audit Recording
Only configuration commands are captured by default.
Enable all commands to be captured with terminal log-all
(feature requires 5.x NX-OS or higher)
Trigger
logw.py
Tools
Accounting
N7004-Berlin# show accounting log | last 3
Mon Dec 2 03:33:05 2013:type=update:id=console0:user=admin:cmd=switchto ;
configure terminal ; interface port-channel110 ; shutdown (SUCCESS)
Mon Dec 2 03:33:08 2013:type=update:id=console0:user=admin:cmd=switchto ;
configure terminal ; interface port-channel110 ; no shutdown (REDIRECT)
Mon Dec 2 03:33:08 2013:type=update:id=console0:user=admin:cmd=switchto ;
configure terminal ; interface port-channel110 ; no shutdown (SUCCESS)
N7004-Berlin(config)# terminal log-all
N7004-Berlin(config)# show accounting log all | last 2
Mon Dec 2 03:53:28 2013:type=update:id=console0:user=admin:cmd=switchto ;
show accounting log all | last 2 (SUCCESS)
Mon Dec 2 03:52:11 2013:type=update:id=console0:user=admin:cmd=switchto ;
show hardware internal errors all (SUCCESS)
21
N7004# dir logflash://sup-active/vdc_1
20023 Apr 18 11:19:40 2014 accounting_log
1291 Sep 21 19:26:05 2012 forwarding_debug_data
persistent
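When the question is "who changed what, and when", the accounting log can be split into fields offline. A minimal Python sketch, assuming one entry per line in the layout shown above (the slide wraps long entries; they are joined here) and commands separated by " ; ". The helper name parse_accounting is hypothetical.

```python
import re

ENTRY = re.compile(
    r"^(?P<time>\w{3} \w{3}\s+\d+ [\d:]{8} \d{4})"
    r":type=(?P<type>[^:]*):id=(?P<id>[^:]*):user=(?P<user>[^:]*)"
    r":cmd=(?P<cmd>.*) \((?P<status>[A-Z]+)\)$")

def parse_accounting(line):
    """Split one accounting log entry into a dict, or return None."""
    match = ENTRY.match(line.strip())
    if not match:
        return None
    entry = match.groupdict()
    # The cmd field holds the whole command path, separated by " ; ".
    entry["cmd"] = [part.strip() for part in entry["cmd"].split(";")]
    return entry

log = [
    "Mon Dec 2 03:33:05 2013:type=update:id=console0:user=admin:"
    "cmd=switchto ; configure terminal ; interface port-channel110 ; "
    "shutdown (SUCCESS)",
    "Mon Dec 2 03:33:08 2013:type=update:id=console0:user=admin:"
    "cmd=switchto ; configure terminal ; interface port-channel110 ; "
    "no shutdown (SUCCESS)",
]
for entry in map(parse_accounting, log):
    print(entry["time"], entry["user"], entry["cmd"][-1], entry["status"])
```

The last element of cmd is the command that was actually executed; the elements before it give the context (here: interface port-channel110).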
ELAM & ELAME
It is widely used by engineering, QA, TAC and escalation teams
ELAM is an unsupported and internal tool
ELAM requires a great deal of platform architecture and ASIC
knowledge to use. This limits the audience of the raw tool.
Identifying the appropriate FE, creating triggers, and interpreting
ELAM data for complex flows requires full architectural and
forwarding knowledge
Good news: ELAME makes ELAM easy to use
skill
ELAME
F-Series
M-Series
Tools
ELAM & ELAME
22
08:15
(ELAM := Embedded Logic Analyzer Module)
workflow
Determine the FE
Configure Trigger
Start ELAM
Analyze
ELAM allows you to verify if a packet is present and/or to analyze
ELAME allows you to verify quickly if a packet is present,
especially in a complicated setup it saves you TIME!
Use cases:
1) Determining the failure
domain
2) Analyze the System
behavior
IP 42.42.42.1
MAC aaaa.bbbb.cccc
IP 42.42.42.12
MAC aaaa.bbbb.dddd
You MUST know the source and destination
MAC/IP pairs involved for troubleshooting.
Is the source and/or destination dual-homed?
Is the source and/or destination real or virtual?
23
Tools
ELAM & ELAME
FE: Eureka(M), Lamira(M), Orion(F1), Clipper(F2), Flanker(F3)
ELAME
N7004-Paris# source sys/elame 10.0.2.2 224.0.0.5
elam helper, version 1.015
... source 10.0.2.2, destination 224.0.0.5
... getting current vdc ... 4
... ingress interface derived from source address
... ingress interface list is Ethernet4/1
... expanded ingress interface list is Ethernet4/1
... FE instance list is 4/1/1
... setting trigger...
... elam trigger set
... starting capture...
... elam capture started
... no packet captured so far
press [enter] when packets in question are known to have been sent…
... packet captured at FE: 4/1/1
... capture instance 4/1/1 (slot/type/instance)
Since NX-OS 6.2(2) we include „elame.tcl“ in the
distribution:
Berlin
10.0.2.2/24
Paris
10.0.2.4/24
Do we receive OSPF
packets from our
neighbor on E 4/1?
E 4/1
M-Series line card
Because ELAM, especially on M-Series, is complicated,
this example shows how easy it is to use ELAME
ELAME works on F2 and M-Series line cards with IPv4
You just specify source and destination address;
the tool determines the correct FE to program,
even on M-Series modules
25
Tools
ELAME, Part 1
ELAME
N7004-Paris# source sys/elame 10.0.2.2 224.0.0.5
<SNIP>
... packet captured at FE: 4/1/1
... capture instance 4/1/1 (slot/type/instance)
+++ IPv4 packet: 86 bytes from MAC 4055.390f.5642 / IP 10.0.2.2 to MAC
0100.5e00.0005 / IP 224.0.0.5 TTL 1
+++ protocol OSPF
+++ packet received on interface Eth4/1 vlan 0 (source index 0x00030)
... rbus: ccc 0x0 cap1 0x1 cap2 0x1 flood 0x1 dest_vlan 0 dest_index
0x00032 l2_fwd 0x0
+++ packet is flooded to BD 50 / vlan 0
... destination index is NOT from L2 table lookup
+++ copy of the packet is sent to CPU
... lamira OFE: rdt 0x0 dest_index 0x010c7 flood 0x0 l2fwd 0x0 ofe_drop 0x0
+++ lamira OFE exception(s): CPP_LIF (0x200000000)
... FE instance 4/1/1 context after analysis: pb2 retried
... done
DBUS and RBUS captured,
easy tool even on M-Series line cards (here N7K-M224)
E 4/1
LTL 0x30
SUP
LTL 0x10C7
Paris
Berlin
Lamira
Eureka
The lines beginning with +++ are the important ones
ELAM(E)
Ethanalyzer
27
skill
ELAME
F-Series
M-Series
Tools
ELAME, Part 3
08:20
ELAM F2
Embedded Logic
Analyzer Module
F2: no PB for ELAM (:= simpler, but the recommendation is
still to use ELAME like the pros)
Clipper: Layer 2 ELAM and/or Layer 3 ELAM
module-3# elam asic clipper instance 2
module-3(clipper-elam)# layer 3
module-3(clipper-l3-elam)# trigger dbus ipv4 if source-ipv4-address
42.42.42.142
module-3(clipper-l3-elam)# trigger rbus ofe if trig
module-3(clipper-l3-elam)# start
module-3(clipper-l3-elam)# status
<SNIP>
L2
L3
Clipper FE2
E3/12
OFE
IFE
OFE := Outgoing „Pipeline“
IFE := Incoming „Pipeline“
Status: Armed := waiting for the packet
Status: Triggered := we have captured
28
Tools
ELAM
ELAM F2
Embedded Logic
Analyzer Module
42.42.42.142
E 3/12
F-Series line card
module-3(clipper-l3-elam)# show dbus
--------------------------------------------------------------------
Clipper Instance 02 - Capture Buffer On L3 DBUS:
<SNIP>
--------------------------------------------------------------------
L3 DBUS CONTENT - IPV4 PACKET
--------------------------------------------------------------------
<SNIP>
l2-packet-length : 0x52 ingress-lif : 0xfca
vlan-id : 0x2a ilm-addr : 0x32
source-index : 0x402 destination-index : 0x0
frame-type : 0x5 sequence-number : 0x94
l2-frame-type : 0x0 l4-protocol : 0x59
recirc-preserve-acos: 0x0
recirc-multicast-bridge-disable: 0x0
ipv4_l4_info_elsewhere_1: 0x0
ipv4_l4_info_elsewhere_2: 0x0
destination-mac-address: 0100.5e00.0005
source-mac-address: 0010.7be8.53b0
source-ipv4-address: 42.42.42.142
Destination-ipv4-address: 224.0.0.5
Berlin
30
Tools
ELAM, DBUS
ELAM F2
Embedded Logic
Analyzer Module
42.42.42.142
E 3/12
F-Series line card
module-3(clipper-l3-elam)# show rbus
--------------------------------------------------------------------
Clipper Instance 02 - Capture Buffer On L3 RBUS:
<SNIP>
--------------------------------------------------------------------
L3 RBUS OFE CONTENT
--------------------------------------------------------------------
OFE valid: 0x1
trig : 0x1 l2-l3-acos : 0x0
<SNIP>
dvif : 0x0 vlan : 0x2a
md-di-valid : 0x0 redirect : 0x0
ccc : 0x4 l2-forward : 0x1
routed : 0x0 eid-select : 0x0
lif-status-enable : 0x1 bcn-compatible : 0x0
VID 42:= 0x2a
Berlin
31
Tools
ELAM, RBUS
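DBUS and RBUS fields are printed in hex, two per line, so a small parser saves the mental conversion (0x2a is VLAN 42). A minimal Python sketch, assuming the "name : 0xvalue" layout shown on these slides; parse_bus_fields is a hypothetical helper name, and non-hex fields such as MAC and IP addresses are deliberately ignored.

```python
import re

# One "name : 0x..." pair; a line may contain several of them.
FIELD = re.compile(r"([A-Za-z0-9_-]+)\s*:\s*(0x[0-9a-fA-F]+)")

def parse_bus_fields(text):
    """Return {field name: integer value} for all hex fields in the text."""
    return {name: int(value, 16) for name, value in FIELD.findall(text)}

# Lines taken from the show rbus output on this slide.
rbus = """\
dvif : 0x0 vlan : 0x2a
md-di-valid : 0x0 redirect : 0x0
ccc : 0x4 l2-forward : 0x1
routed : 0x0 eid-select : 0x0
"""
fields = parse_bus_fields(rbus)
print("vlan", fields["vlan"])
print("l2-forward", fields["l2-forward"], "routed", fields["routed"])
```

For this capture the result is VLAN 42 with l2-forward set and routed clear, so the packet is handled at Layer 2.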
ELAM F2
Embedded Logic
Analyzer Module
module-3# elam asic clipper instance 2
module-3(clipper-elam)# layer 2
module-3(clipper-l2-elam)# trigger dbus ipv4 if destination-ipv4-address
42.42.42.142
module-3(clipper-l2-elam)# trigger rbus ingress if trig
L2
L3
Clipper FE2
E1/12
egr
ingr
Since the former example indicated no Layer 3 rewrite we look
now into Layer 2 ELAM (still looking for Layer 3 information)
module-3(clipper-l2-elam)# show rbus
<SNIP>
inner-cos : 0x0 acos : 0x0
di-ltl-index : 0x8015 l3-multicast-di : 0x0
source-index : 0x402 vlan-id : 0x2a
index-direct : 0x0 eid-sel : 0x0
vqi : 0xfa v5-fpoe-idx : 0xf9
l3-fpoe-idx : 0x0 l3-multicast-v5 : 0x0
dft : 0x0 dfst : 0x0
32
Tools
ELAM, RBUS
08:25
Reference Point
Similar to NetDR on C6500/7600
but separate / parallel to internal processing and path
33
Tools
Ethanalyzer
Multiple
CPU
Cores
Kernel
Ethanalyzer OSPF
Display Filter
Capture Filter NetStack
http://wiki.wireshark.org/CaptureFilters
http://wiki.wireshark.org/DisplayFilters
N7004# ethanalyzer local interface inband decode-internal limit-frame-size
150 display detail
2013-12-07 15:52:47.446886
Cisco_8b:a0:5a -> PVST+ STP 96 RST. Root = 32768/42/ 00:0c:30:8b:a0:40
Cost=0 Port=0x8041
NXOS Protocol
NXOS VLAN: 42
NXOS SOURCE INDEX: 1030
NXOS DEST INDEX: 4295
Frame 5: 64 bytes on wire (512 bits), 64 bytes captured (512 bits) on if 0
Arrival Time: Dec 7, 2013 15:52:47.446886000 UTC
[Protocols in frame: eth:llc:stp]
IEEE 802.3 Ethernet
Destination: PVST+ (01:00:0c:cc:cc:cd)
Spanning Tree Protocol
Protocol Identifier: Spanning Tree Protocol (0x0000)
Protocol Version Identifier: Rapid Spanning Tree (2)
BPDU Type: Rapid/Multiple Spanning Tree (0x02)
BPDU flags: 0x3c (Forwarding, Learning, Port Role: Designated)
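The NXOS header printed by decode-internal shows the source and destination index as decimal numbers, while ELAM/ELAME print LTL indexes in hex. A minimal Python sketch for converting between the two views, assuming the values in the header are decimal (they appear to be); nxos_indexes is a hypothetical helper name.

```python
import re

INDEX = re.compile(r"NXOS (SOURCE|DEST) INDEX:\s*(\d+)")

def nxos_indexes(text):
    """Return {'source': n, 'dest': n} from a decoded NXOS header."""
    return {kind.lower(): int(value) for kind, value in INDEX.findall(text)}

# Header lines taken from the capture on this slide.
header = """\
NXOS VLAN: 42
NXOS SOURCE INDEX: 1030
NXOS DEST INDEX: 4295
"""
for kind, value in nxos_indexes(header).items():
    print(f"{kind} index: {value} = {value:#x}")
```

The destination index 4295 is 0x10c7, the same value that the ELAME slide labels as the SUP LTL, which fits a packet captured on the inband interface.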
Tips & Tricks
Event-Histories typically will be enough to diagnose most
issues; however, sometimes debugging may be required. For
verbose debugs that can drive up CPU when printed on the
terminal, NX-OS provides the capability to send debug output
directly to a file saved in a log directory.
After a reload
the information is gone!
N7004-Berlin# debug logfile ALQ-OSPF size 8192
N7004-Berlin# debug ip ospf all detail
N7004-Berlin# dir log:
8192 Jan 04 12:00:03 2014 ALQ-OSPF
11114 Jan 04 11:51:16 2014 messages
196 Jan 04 11:47:53 2014 snmp_log
149595 Jan 04 11:58:07 2014 startupdebug
N7004-Berlin# show debug logfile ALQ-OSPF
2014 Jan 4 12:00:16.332218 ospf: 1 [6941] (default) Nbr 10.0.3.5
FSM start: old state FULL, event HELLORCVD
2014 Jan 4 12:00:16.332240 ospf: 1 [6941] (default) Nbr 10.0.3.5:
FULL --> FULL, event HELLORCVD
Tools
Debugging
System Troubleshooting
Ethernet IF
E 3/12
N7004-Berlin# show int eth 3/13
Ethernet3/13 is down (SFP not inserted)
N7004-Berlin# show int eth 3/12
Ethernet3/12 is up
The Interface could be described as the Port-ASIC including the
MAC Controller
Another view would be the Software Process in
the Control Plane Ethpm (:= Ethernet Port Manager)
An up-to-date
network drawing helps
Ethpm
VID 1
VID 42
STP
Vlan Mgr
System
…my interface
36
08:30
Ethernet IF
E 1/27
Ethpm
Phy_off
802.1X
PIXM
ACL
QOS
L2FM
STP
N7K(config)#
interface e1/27
N7K(config-if)# shut
N7K# show inter e1/27
Ethernet1/27 is down
(Internal-Fail
errDisable,
libeventseq:
sequence timeout)
Processes and services
depend on each other
Collect information about
the whole environment:
(e.g. Show tech )
As you likely don‘t know all
dependent processes
Ethpm is interacting with each
service sequentially
(Request and Response)
OK, how about shutting
down a port (e.g. e1/27)?
N7K(config-if)# shut
System
…my interface behaves oddly…
Ethernet IF
E 1/27
Ethpm
N7K# sh system internal ethpm event-history errors | grep -B 4 -A 4 net1/27
<snip>
23) Event:E_DEBUG, length:141, at 908071 usecs after Thu Feb 7 09:29:35 2013
[102] ethpm_def_port_seq_step_failure_hdlr(9406): Port: Ethernet1/27 ,
Sequence No: 4, Sequence Step : 13 ,Error: 0xsequence timeout(408c0008)
<snip>
We start today here with Ethpm and look into the event log
Most features use a private event log
N7K# sh system internal ethpm event-history msgs | grep -B 4 -A 5 0x25933EC0
1407) Event:E_MTS_RX, length:60, at 94113 usecs after Thu Feb 7 09:30:08 2013
[RSP] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X259E2DF3, Ret:SUCCESS
Src:0x00000505/221, Dst:0x00000505/175, Flags:None
HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
--
1440) Event:E_MTS_TX, length:60, at 974110 usecs after Thu Feb 7 09:28:55 2013
[REQ] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X25933EC0, Ret:SUCCESS
Src:0x00000505/175, Dst:0x00000505/221, Flags:None
HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
N7K# sh system internal ethpm event-history errors
MTS_OPC_ETHPM_PORT_PHY_CLEANUP (rr token - 0x25933ec0 sap:221) received
Reference Slide
System
sequence timeout
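The request and its response are tied together by the RRtoken, so the delay between them can be computed from the two event-history entries. A minimal Python sketch, assuming the entry layout shown above ("at N usecs after <timestamp>", then [REQ] or [RSP], then RRtoken); request_response_delay is a hypothetical helper name and the Src/Dst lines are omitted from the sample for brevity.

```python
import re
from datetime import datetime, timedelta

STAMP = re.compile(
    r"at (\d+) usecs after (\w{3} \w{3}\s+\d+ [\d:]{8} \d{4})")
KIND = re.compile(r"\[(REQ|RSP)\]")
TOKEN = re.compile(r"RRtoken:(0x[0-9A-Fa-f]+)")

def request_response_delay(text, token):
    """Seconds between the [REQ] and [RSP] entries carrying the RRtoken."""
    times = {}
    # Each entry starts with "<number>) Event:".
    for entry in re.split(r"\n(?=\d+\) Event:)", text):
        stamp = STAMP.search(entry)
        kind = KIND.search(entry)
        found = TOKEN.search(entry)
        if not (stamp and kind and found):
            continue
        if found.group(1).lower() != token.lower():
            continue
        when = datetime.strptime(stamp.group(2), "%a %b %d %H:%M:%S %Y")
        when += timedelta(microseconds=int(stamp.group(1)))
        times[kind.group(1)] = when
    return (times["RSP"] - times["REQ"]).total_seconds()

sample = """\
1407) Event:E_MTS_RX, length:60, at 94113 usecs after Thu Feb 7 09:30:08 2013
[RSP] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X259E2DF3, Ret:SUCCESS
HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
--
1440) Event:E_MTS_TX, length:60, at 974110 usecs after Thu Feb 7 09:28:55 2013
[REQ] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X25933EC0, Ret:SUCCESS
HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
"""
delay = request_response_delay(sample, "0x25933EC0")
print(f"{delay:.6f} s between request and response")
```

For the two entries on this slide the response arrives roughly 72 seconds after the request, which is a very long time for a single sequence step.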
Take Away
What we know so far:
The sequence timeout happens during/around PHY port CLEANUP
Some service introduced a delay, or received its own request with a delay
Next steps:
Check MTS (:= Message Transmission System)
Check Log (e.g. look for “SAP 221”)
N7K# show log log
2013 Feb 7 07:12:33 N7K Feb 7 07:08:44 %KERN-2-SYSTEM_MSG:
mts_is_q_space_available_old():1641: regular+fast mesg total = 135287, soft
limit = 32768 - kernel
2013 Feb 7 07:12:33 N7K Feb 7 07:08:44 %KERN-2-SYSTEM_MSG:
mts_is_q_space_available_old(): NO SPACE - node=5, sap=221, uuid=410,
pid=30121, sap_opt = 0x1, hdr_opt
= 0x0, rq=134970(13264613), lq=0(0), pq=317(655986), nq=0(0), sq=0(0), fast:
rq=0, lq=0, pq=0, nq=0, sq=0 - kernel
It was written in the
log file. If we had
looked into the log file
first we would have
saved a lot of time!
Recover
System
sequence timeout
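The two kernel messages already contain everything needed: the SAP, the total number of queued messages, and the soft limit. A minimal Python sketch that extracts them, assuming the message wording shown above; parse_mts_no_space is a hypothetical helper name, and the queue names (rq, lq, pq, nq, sq) are reported as printed without interpreting them.

```python
import re

def parse_mts_no_space(log_text):
    """Extract SAP, message total, soft limit and queue depths."""
    total, limit = map(int, re.search(
        r"mesg total = (\d+), soft\s+limit = (\d+)", log_text).groups())
    sap = int(re.search(r"\bsap=(\d+)", log_text).group(1))
    # re.search returns the first hit, i.e. the regular queues;
    # the "fast:" counters follow later in the same message.
    queues = {name: int(re.search(rf"\b{name}=(\d+)", log_text).group(1))
              for name in ("rq", "lq", "pq", "nq", "sq")}
    return {"sap": sap, "total": total, "soft_limit": limit,
            "queues": queues, "over_limit": total > limit}

log_text = (
    "mts_is_q_space_available_old():1641: regular+fast mesg total = 135287, "
    "soft limit = 32768 - kernel\n"
    "mts_is_q_space_available_old(): NO SPACE - node=5, sap=221, uuid=410, "
    "pid=30121, sap_opt = 0x1, hdr_opt = 0x0, rq=134970(13264613), lq=0(0), "
    "pq=317(655986), nq=0(0), sq=0(0), fast: rq=0, lq=0, pq=0, nq=0, sq=0 "
    "- kernel\n"
)
report = parse_mts_no_space(log_text)
print(report)
```

In this example the queue depths add up to the reported total (134970 + 317 = 135287), about four times the soft limit of 32768, and they all belong to SAP 221, the same SAP seen in the ethpm event-history.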
Core Files
Collect cores from „all“ locations on the active (don‘t forget
your standby SUP) and attach them to a TAC case right away
N7004# show cores vdc-all
VDC Module Instance Process-name PID Date(Year-Month-Day Time)
--- ------ -------- --------------- -------- -------------------------
VDC Module Instance Process-name PID Date(Year-Month-Day Time)
--- ------ -------- --------------- -------- -------------------------
1 17 1 pixmc 2134 2013-10-28 16:52:48
1 8 1 pixmc 2134 2013-10-28 16:52:50
SR 123
2010 Jul 17 00:30:18 vrt001 %$ VDC-1 %$ %SYSMGR-SLOT8-2-SERVICE_CRASHED:
Service "mtm" (PID 1600) hasn't caught signal 6 (core will be saved).
Here you see „slot 8“ := you know the line card and MTM is a line
card process
System
Reducing MTTR
%SYSMGR-2-SERVICE_CRASHED: Service "vpc" (PID 5883) hasn't caught signal 11
(core will be saved)
%SYSMGR-2-SERVICE_CRASHED: Service "stp" (PID 4668) hasn't caught signal 9 (no
core).
show cores vdc-all
dir logflash:core
dir logflash://sup-1
dir logflash://sup-2
show process log vdc-all
show process log
details
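Whether a core file is worth looking for can be read directly from the SERVICE_CRASHED message. A minimal Python sketch that pulls out slot, service, PID, signal and the core hint, assuming the message wording shown above (wrapped lines joined). parse_crash is a hypothetical helper name, and the signal names use the usual Linux numbering.

```python
import re

CRASH = re.compile(
    r'%SYSMGR(?:-SLOT(?P<slot>\d+))?-2-SERVICE_CRASHED: '
    r'Service "(?P<service>[^"]+)" \(PID (?P<pid>\d+)\) '
    r"hasn't caught signal (?P<signal>\d+) \((?P<core>[^)]*)\)")

# Usual Linux signal numbers for the three signals seen on this slide.
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV"}

def parse_crash(line):
    """Return crash details from a SERVICE_CRASHED message, or None."""
    match = CRASH.search(line)
    if not match:
        return None
    signal = int(match.group("signal"))
    return {
        "slot": int(match.group("slot")) if match.group("slot") else None,
        "service": match.group("service"),
        "pid": int(match.group("pid")),
        "signal": SIGNAL_NAMES.get(signal, str(signal)),
        "core_saved": match.group("core") == "core will be saved",
    }

messages = [
    '%SYSMGR-SLOT8-2-SERVICE_CRASHED: Service "mtm" (PID 1600) '
    "hasn't caught signal 6 (core will be saved).",
    '%SYSMGR-2-SERVICE_CRASHED: Service "stp" (PID 4668) '
    "hasn't caught signal 9 (no core).",
]
for message in messages:
    print(parse_crash(message))
```

Entries with core_saved set are the ones to collect from the locations listed above; for "no core" messages only the process log is available.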
40
08:35
Ethernet IF
Kuddewörde
E 3/25
10G
Fiber
My connected device is not working
I suspect a Layer 1 challenge
Working fine when
connected to 2nd switch
BAD
Those counters usually indicate bad transceivers or fibers
In this case SDP timed out on uplinks
System
Layer 1
[1671]11/05/2013 06:30:01.337185: sdp_rx_timeout: Sdp instance timed out.
ifindex=1a018000. last pkt received at 11/05/2013 06:28:10.512373.
[1672]11/05/2013 06:30:01.337198: fport [0x1a018000]:satmgr_fport_fsm: event:
Timeout. curr state: Active
[1673]11/05/2013 06:30:01.337216: fport [0x1a018000]:Log - SDP timed out
N7004-Berlin# show hardware internal errors module 3
Instance:6
ID Name Value Ports
-- ---- ----- -----
2189 GD XGMAC rx code violation interrupt 0000000000000001 25 -
2190 GD XGMAC rx code error interrupt 0000000000327284 25 -
2194 GD XGMAC bad to good link change interrupt 0000000000879210 25 -
2195 GD XGMAC good to bad link change interrupt 0000000000007181 25 -
12327 PL ingress_cbl_drop 0000000000032388 25 -
12328 PL egress_cbl_drop 0000000000003889 25 -
Statistics
Suspicious counters for bad transceiver / fibers in yellow
clear statistics module-all device all
and run several times to identify increasing counters
2050 GD Received short frames with bad CRC (RUNT) 0000000000000000 1 -
2051 GD Rx bad CRC frames, excluding RUNT/JABBER 0000000000000000 1 -
2052 GD Rx protocol error count 0000000000000000 1 -
2054 GD Rx frame drop count 0000000000000000 1 -
2096 GD Received oversized frames w/ bad CRC 0000000000000000 1 -
2188 GD XGMAC rx CRC error interrupt 0000000000000000 1 -
2189 GD XGMAC rx code violation interrupt 0000000000000000 1 -
2190 GD XGMAC rx code error interrupt 0000000000000000 1 -
2191 GD XGMAC rx IPG violation interrupt 0000000000000000 1 -
2196 GD GMAC rx_config_word change interrupt 0000000000000000 1 -
2197 GD GMAC loss of sync interrupt 0000000000000000 1 -
2200 GD GMAC rx CRC error interrupt 0000000000000000 1 -
2228 GD Received frame with CRC error interrupt 0000000000000000 1
Reference Slide
System
Layer 1
Chapter 2:
Data-Plane Layer 2
MAC Table L2FM
PIXM STP
Failure Domain
I am losing packets between A and B!
How can I quickly determine „where“?
100% traffic loss:
• Table not
programmed
• Wrongly
programmed
• Inconsistency
ELAME
X % traffic loss:
• Congestion?
• Periodically?
Timer/Aging event
(e.g. MAC Table)
A B
Data-Plane
Failure Domain
Determine Failure Domain Quickly
ELAME A B ELAME A B
ELAME A B ELAME A B
Failure Domain
Troubleshooting
At the ingress
forwarding engine
for unicast
multicast replication
occurs at the
egress line card
Congestion
F-Series (Ingress)
M-Series (Egress)
Ingress Module
First Stage
Egress Module
Third/Last Stage
EARL 8
SoC Xbar Xbar
Xbar
Fabric Module
EARL 8
SoC
Data-Plane
Architecture
45
Suggestion:
BRKARC-3470
Cisco Nexus 7000
Hardware Architecture
08:40
Troubleshooting
N7009-Lagos# show hardware internal errors all
|------------------------------------------------------------------------|
| Device:Sacramento Xbar ASIC Role:FABRIC Mod: 9 |
| Last cleared @ Fri Nov 15 02:19:12 2013
| Device Statistics Category :: ERROR
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
-- ---- ----- -----
2129 FB09-P21 LOW_BP_CNT_IN 0000000000000099 1-48 I1-2
|------------------------------------------------------------------------|
| Device:Clipper XBAR Role:QUE Mod: 9 |
| Last cleared @ Fri Nov 15 05:18:38 2013
| Device Statistics Category :: CONGESTION
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
-- ---- ----- -----
132 VQ credited pkt replica VOQ tail drops 0000000000000189 1-4 -
137 VQ credited pkt replica drop count 0000000000000189 1-4 -
9602 VQ VQI 204 CCOS 3 drop count 0000000000000189 1-4 -
Clipper
Sacramento
BP :=
Backpressure
System
FPGA Version on
FAB2 needs to be
PM 0.007 for SUP-2/2E
Q
Verify our System status
before troubleshooting
Data-Plane
Congestion
LC Families
EARL based Line
Cards
M-Series
(:= M1, M2)
SoC based Line
Cards
F-Series
(:= F1, F2, F3)
M2
2 x per LC
SoC
e.g. F2E
Clipper
up to 60 mpps
per SoC
Fabric
ASIC
Fabric
ASIC
EARL 8
Up to
60mpps
L2
L3
P
R
Q
Q:= Queuing Engine
R:= Replication Engine
P:= Port ASIC
FE := Forwarding Engine
F1
16 x SoC
F2/F2E
12 x SoC
F3 N7K
(and all
1G/10G)
6 x SoC
F3 N77
12 x SoC
Q R P
FE
M-Series F-Series
Data-Plane
LC Architecture
Forwarding
Similar: show
platform hardware
capacity
forwarding on C6K
N7004-Berlin# show hardware internal forwarding engine usage
slot 4
Forwarding Engine Usage
-----------------------
Module inst pps peak pps
4 1 0 4 @Tue Nov 26 20:17:33 2013
N7004-Berlin# show hardware internal statistics module 3 rates
Hardware statistics on module 03:
+ =============================
+ Clipper MAC Instance 0
+ =============================
|-- Ingress IN
| |--- Packets/sec
| | |--- 2: 0
| | |--- 1: 0
| | |--- 3: 0
| | |--- 4: 0
| | |--- sum: 0
| |--- Bytes/sec
| | |--- 2: 3
<SNIP>
|-- Egress OUT
| |--- Packets/sec
| | |--- 2: 0
| | |--- 1: 0
| | |--- 3: 0
| | |--- 4: 0
| | |--- sum: 0
| |--- Bytes/sec
| | |--- 2: 3
This command works for
M-Series line cards
This command works for
F-Series line cards
FE 0
E 3/1
vPC PKA
E 3/2 & 3/3
vPC PL
Module 3: F2
Data-Plane
FWD Engine Performance
48
[N7004-Berlin# show forwarding internal errors]
LC Internals
module-1# show hardware internal dev-port-map
--------------------------------------------------------------
CARD_TYPE: 12 port 100G
>Front Panel ports:12
--------------------------------------------------------------
Device name Dev role Abbr num_inst:
--------------------------------------------------------------
> Flanker Eth Mac Driver DEV_ETHERNET_MAC MAC_0 12
> Flanker Fwd Driver DEV_LAYER_2_LOOKUP L2LKP 12
> Flanker Xbar Driver DEV_XBAR_INTF XBAR_INTF 12
> Flanker Queue Driver DEV_QUEUEING QUEUE 12
> Sacramento Xbar ASIC DEV_SWITCH_FABRIC SWICHF 2
> Flanker L3 Driver DEV_LAYER_3_LOOKUP L3LKP 12
> EDC DEV_UNDEFINED PHYS 12
+-----------------------------------------------------------------------+
+----------------+++FRONT PANEL PORT TO ASIC INSTANCE MAP+++------------+
+-----------------------------------------------------------------------+
FP port | PHYS | MAC_0 | L2LKP | L3LKP | QUEUE |SWICHF
1 0 0 0 0 0,1
2 1 1 1 1 0,1
3 2 2 2 2 0,1
4 3 3 3 3 0,1
5 4 4 4 4 0,1
<SNIP>
EDC0 EDC1
Flanker
0
Flanker
1
SAC0 SAC1
000c.308b.a040
Data-Plane
Line Card Components
49
Layer 2
Berlin
PO 110
MAC Address Table (16K, 64K, or 128K)
MAC Address Table
000c.308b.a040
Sync via CFS
Data-Plane
L2 HW learning
N7004-Berlin# show mac address-table vlan 1
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
G 1 0000.0c9f.f001 static - F F sup-eth1(R)
G 1 4055.390f.5642 static - F F sup-eth1(R)
* 1 4055.390f.5643 static - F F vPC Peer-Link
* 1 000c.308b.a040 dynamic 0 F F Po110
N7004-Berlin# show hardware internal forwarding f2 l2 table utilization
L2 entries: Module inst total used mcast ucast lines lines_full
3 0 16384 15 0 15 512 0
N7004-Berlin# show hardware internal forwarding l2 table utilization
L2 entries: Module inst total used mcast ucast lines lines_full
4 1 131072 22 8 14 8192 0
50
08:45
Layer 2
MAC A
MAC Index Flag
A PO1 PI_E
C 3/3
MAC Index Flag
A PO1 PI_E
C 3/3
MAC Index Flag
A PO1
C 3/3 PI_E
MAC C
E 1/1 E 2/2 E 3/3
Line Card 1 Line Card 2 Line Card 3
PO1
L2FM show mac address-table …
show hardware mac address-table …
Learning and Aging
optimized for physical and
logical ports
(:= PC Port Channel) with
additional signaling via L2FM
L2FM
Data-Plane
Learning & Moves
N7004-Berlin(config)# logging level l2fm 6
2013 Dec 17 02:52:46 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: Mac
f0de.f1f2.c804 in vlan 42 has moved from Eth3/37 to Eth3/41
2013 Dec 17 02:53:00 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: Mac
f0de.f1f2.c804 in vlan 42 has moved from Eth3/41 to Eth3/37
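The same MAC moving back and forth between two ports, as in the two messages above, is a typical hint for a loop or a misbehaving NIC team. A minimal sketch for counting such moves offline from collected syslog text follows. It is not an NX-OS feature; the helper name `count_mac_moves` and the `log` list are illustrative assumptions, and the pattern only covers the message wording shown on this slide.

```python
import re
from collections import Counter

# Matches the body of the L2FM_MAC_MOVE message as printed on the slide.
MOVE_RE = re.compile(
    r"%L2FM-4-L2FM_MAC_MOVE: Mac (?P<mac>[0-9a-f.]+) in vlan (?P<vlan>\d+) "
    r"has moved from (?P<src>\S+) to (?P<dst>\S+)"
)


def count_mac_moves(lines):
    """Return a Counter keyed by (vlan, mac) with the number of moves seen."""
    moves = Counter()
    for line in lines:
        match = MOVE_RE.search(line)
        if match:
            moves[(int(match.group("vlan")), match.group("mac"))] += 1
    return moves


# The two syslog lines from the slide, each joined back into a single line.
log = [
    "2013 Dec 17 02:52:46 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: Mac "
    "f0de.f1f2.c804 in vlan 42 has moved from Eth3/37 to Eth3/41",
    "2013 Dec 17 02:53:00 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: Mac "
    "f0de.f1f2.c804 in vlan 42 has moved from Eth3/41 to Eth3/37",
]

for (vlan, mac), count in count_mac_moves(log).items():
    print(f"vlan {vlan} mac {mac}: {count} moves")
```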
51
Layer 2
L2FM
Looking back in time for a specific MAC Address
12
3
6
N7004-London#
show interface
snmp-ifindex
|i 1a124000
Eth3/37 !Port
437403648 !IFMIB
(0x1a124000) !IFINDEX
N7004-London(config)# show system int l2fm l2dbg macdb address f0de.f1f2.c804
Legend
Db: 0-MACDB, 1-GWMACDB, 2-SMACDB, 3-RMDB, 4-SECMACDB
Src: 0-UNKNOWN, 1-L2FM, 2-PEER, 3-LC, 4-HSRP
5-GLBP, 6-VRRP, 7-STP, 8-DOTX, 9-PSEC 10-CLI 11-PVLAN
12-ETHPM, 13-ALW_LRN, 14-Non_PI_MOD, 15-MCT_DOWN, 16 - SDB
17-OTV, 18-Deounce Timer, 19-AM, 20-PCM_DOWN, 21-MCT_UP, 22-L2VPN
Slot:0 based for LCS 19-MCEC 20-OTV/ORIB
VLAN: 42 MAC: f0de.f1f2.c804
Time If/swid Db Op Src Slot FE
Sat Dec 14 22:18:20 2013 0x1a124000 0 INSERT 3 2 9
Sat Dec 14 22:18:20 2013 0x1a124000 0 RESET_LL_UNDERWAY 2 0 15
Sat Dec 14 22:18:51 2013 0x1a124000 0 NON_PI_MOD 3 2 15
Sat Dec 14 22:18:51 2013 0x1a124000 0 NON_PI_MOD 3 2 15
Sat Dec 14 22:18:51 2013 0x1a124000 0 NON_PI_MOD 3 2 15
Sat Dec 14 22:19:31 2013 0x1a124000 0 FLUSH 12 0 15
Sat Dec 14 22:19:31 2013 0x1a124000 0 DELETE 0 0 15
Sat Dec 14 22:19:36 2013 0x1a128000 0 INSERT 3 2 10
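The MAC history prints the interface as a hex ifindex (0x1a124000), while `show interface snmp-ifindex` lists the decimal value next to it (437403648). A small sketch of the conversion in both directions; the helper names are hypothetical and only standard Python is used.

```python
def ifindex_hex_to_dec(hex_str):
    """Convert an ifindex such as '0x1a124000' to its decimal IF-MIB value."""
    return int(hex_str, 16)


def ifindex_dec_to_hex(value):
    """Convert a decimal ifindex to the 8-digit hex form used in the L2FM output."""
    return format(value, "#010x")


print(ifindex_hex_to_dec("0x1a124000"))
print(ifindex_dec_to_hex(437403648))
```

The result can then be used as the filter string for `show interface snmp-ifindex`, as on the slide.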
Data-Plane
MAC History
52
Layer 2
LTL := Local Target Logic (e.g. Source Index (SI) and
Destination Index (DI) e.g. 0x00402)
BD := Bridge Domain
E 3/1
Internal Header added by
PORT ASIC or SoC
(FE)
Ingress L2 Logic learns
MAC Address in HW
(M & F-Series)
Header
Packet
DI = 402h VLAN, ...
Internal Header contains
SI, DI, VLAN
SI = BAh
402h
Org Packet
We add an internal header to carry
needed information (e.g. Index, VLAN)
+
removed
Packet
N7004-Berlin# show hardware mac address-table 3 address 000c.308b.a040
!reformatted!
FE | Valid| PI| BD | MAC | Index| Stat| SW | Modi| Age| Tmr| GM|
---+------+---+------+---------------+-------+-----+-----+-----+----+----+---
0 1 0 17 000c.308b.a040 0x00402 0 0x009 0 121 1 0
2 1 1 17 000c.308b.a040 0x00402 0 0x009 0 121 1 0
Data-Plane
Internal Header
53
Layer 2
0402h
8011h
BD – VLAN
VDC 2:17 := 1
DB
PO110
An interface is assigned
one or more indices
Each port is assigned one or more index values; internally we
use the concept of bridge domains (which map to VLAN IDs)
54
Data-Plane
Index
N7004-Berlin# show system internal pixm info ltl 0x00402
PC_TYPE PORT LTL RES_ID LTL_FLAG CB_FLAG MEMB_CNT
------------------------------------------------------------------------------
Normal Po110 0x0402 0x1600006d 0x00000000 0x00000002 1
Member rbh rbh_cnt
Eth3/12 0x000000ff 0x08
CBL Check States: Ingress: Enabled; Egress: Enabled
VLAN| BD| BD-St | CBL St & Direction:
--------------------------------------------------
1 | 0x11 | INCLUDE_IF_IN_BD | FORWARDING (Both)
Member info
------------------
Type LTL
----------------------
PORT_CHANNEL Po110
FLOOD_W_FPOE 0x8011
How to convert a
BD (in dec)
to a VLAN ID
11h = 17
STP ingress/egress
N7004-Berlin# show vlan internal bd-info
bd-to-vlan 17
VDC Id BD Id Vlan Id
------ ------- -------
2 17 1
54
PIXM
000bh
E 3/12
0402h
8011h
PO110
10C7h
10C8h
SUP
LTL setup (here) for SUP-2 and NX-OS 6.2(5.41)
N7004-London# show system internal pixm info ltl-region
===========================================================
PIXM VDC 1 LTL MAP Version: 2
Description: LTL Map for N7K SUP2 Silverstone (all flavors)
===========================================================
LTL_TYPE SIZE START END
========================================================================
LIBLTLMAP_LTL_TYPE_PHY_PORT 1024 0x0 0x3ff
LIBLTLMAP_LTL_TYPE_PC 3204 0x400 0x1083
LIBLTLMAP_LTL_TYPE_SUP_FUTURE 67 0x1084 0x10c6
LIBLTLMAP_LTL_TYPE_SUP_ETH_INBAND 64 0x10c7 0x1106
-------------------------------------------------------------------
SUB-TYPE LTL
-------------------------------------------------------------------
LIBLTLMAP_LTL_TYPE_SUP_INBAND_HQ 0x10c7
LIBLTLMAP_LTL_TYPE_SUP_INBAND_LQ 0x10c8
<SNIP>
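The region table makes it possible to tell from the LTL value alone what kind of target an index points to. A minimal sketch, assuming only the four regions visible before the <SNIP> (SUP-2, NX-OS 6.2(5.41) as stated on the slide); other supervisors or releases may use a different map, and the function name `ltl_region` is hypothetical.

```python
# (type, start, end) copied from the "show system internal pixm info ltl-region"
# output above; the table on the slide is truncated, so this list is incomplete.
LTL_REGIONS = [
    ("LIBLTLMAP_LTL_TYPE_PHY_PORT", 0x0, 0x3FF),
    ("LIBLTLMAP_LTL_TYPE_PC", 0x400, 0x1083),
    ("LIBLTLMAP_LTL_TYPE_SUP_FUTURE", 0x1084, 0x10C6),
    ("LIBLTLMAP_LTL_TYPE_SUP_ETH_INBAND", 0x10C7, 0x1106),
]


def ltl_region(ltl):
    """Return the region name for an LTL index, or None if it is not in the list."""
    for name, start, end in LTL_REGIONS:
        if start <= ltl <= end:
            return name
    return None


for index in (0x00B, 0x402, 0x10C7, 0x8011):
    print(hex(index), ltl_region(index))
```

With this map, 0x402 (Po110) falls into the port-channel region and 0x10c7 into the SUP inband region; 0x8011 lies beyond the regions shown here.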
Data-Plane
Port Index Manager
55
08:50
STP
STP
STP
root
Config
BPDU
DP
DP := Designated Port
RP := Root Port
BPDU := Bridge Protocol
Data Unit
RP
TCN
BPDU
Know your port states in a stable condition
(:= before the troubleshooting, prepare yourself)
Two BPDU types: Configuration BPDUs and TCN BPDUs
Tracking Port Role Changes, Root Changes via SYSLOG
For vPC with peer switch
configuration both
devices are sending
BPDUs as root.
NX-OS 4.2(6), 5.0(2a)
Data-Plane
STP
logging level spanning-tree 6
%STP-6-PORT_ROLE: Port Ethernet2/1 instance VLAN0001 role changed to designate
56
STP
Symptoms for a Data Loop
High link utilization (100%)
High CPU and fabric traffic utilization
Constant MAC Address re-learning and flapping
Excessive output drops on an interface
Verify each switch on the redundant path
Someone who is supposed to block is forwarding...
No loop in my lab
today…
In the real world we see
loops
created by blade
servers, teaming NICs
and hypervisors
(:= virtual switches)
Data-Plane
STP
N7004-Berlin# show interface e 3/7 | i rate
30 seconds input rate 24 bits/sec, 0 packets/sec
30 seconds output rate 304 bits/sec, 0 packets/sec
300 seconds input rate 104 bits/sec, 0 packets/sec
300 seconds output rate 424 bits/sec, 0 packets/sec
57
STP
Systematically verifying the path
Paris
Berlin
VID 42
Moscow
London
E4/17
STP
pktmgr
Ethanalyzer
Data-Plane
STP
N7004-Berlin# show spanning-tree interface ethernet 3/7 detail
Port 391 (Ethernet3/7) of VLAN0042 is designated forwarding
<SNIP>
BPDU: sent 1972, received 5
N7004-Paris# show spanning-tree interface ethernet 4/2 detail
Port 514 (Ethernet4/2) of VLAN0042 is root forwarding
<SNIP>
BPDU: sent 5, received 2007
N7004-Berlin# show system internal pktmgr interface ethernet 3/7
Ethernet3/7, ordinal: 80 Hash_type: 2
SUP-traffic statistics: (sent/received)
Packets: 2217 / 82
Bytes: 139163 / 17376
Instant packet rate: 0 pps / 0 pps
Packet rate limiter (Out/In): 0 pps / 0 pps
Average packet rates(1min/5min/15min/EWMA):
Packet statistics:
Tx: Unicast 0, Multicast 2217 <SNIP> STP
pktmgr
Ethanalyzer
ELAME
58
STP
STP
What is our STP role? Are we stable? Were TCNs sent or received?
If yes, through which interface did we receive the last TCN?
In case of an access port, enable port-fast
STP
DP
RP
TCN
BPDU
Data-Plane
STP
N7004-London(config-if)# show spanning-tree vlan 1 detail
VLAN0001 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 1, address 4055.390f.5643
Configured hello time 2, max age 20, forward delay 15
Current root has priority 32769, address 000c.308b.a040
Root port is 4195 (port-channel100), cost of root path is 2
Topology change flag not set, detected flag not set
Number of topology changes 2 last change occurred 0:15:50 ago
from port-channel100
<SNIP>
N7004-London(config-if)# spanning-tree port type edge
Warning: Edge port type (portfast) should only be enabled on ports
connected to a single
host. Connecting hubs, concentrators, switches, bridges, etc... to this
interface when edge port type (portfast) is enabled, can cause temporary
bridging loops. Use with CAUTION
59
STP
STP
Looking back in time for STP: Event-History
800886 us – 795697 us
= 5189 us ~ 5.2 ms
12
3
6
Data-Plane
STP event history
N7004-London(config-if)# sh spanning-tree internal event-history tree 1 interface
port-channel 110
VDC03 VLAN0001 <port-channel110>
0) Transition at 795697 usecs after Sat Dec 14 21:20:53 2013
State: DIS Role: Unkw Age: 0 Inc: no [STP_PORT_EV_UP]
<SNIP>
5) Transition at 800886 usecs after Sat Dec 14 21:20:53 2013
State: FWD Role: Root Age: 0 Inc: no [STP_PORT_ROLE_CHANGE]
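The 5.2 ms on this slide is the difference between the first and last transition timestamps. A minimal sketch of the same calculation that also works when the two transitions fall into different seconds; the regular expression only covers the "Transition at N usecs after <date>" wording shown above, and the helper name `parse_transition` is hypothetical.

```python
import re
from datetime import datetime, timedelta

TRANSITION_RE = re.compile(r"Transition at (\d+) usecs after (.+)$")


def parse_transition(line):
    """Turn one event-history transition line into a datetime with microseconds."""
    match = TRANSITION_RE.search(line.strip())
    if match is None:
        raise ValueError(f"not a transition line: {line!r}")
    usecs, stamp = match.groups()
    base = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return base + timedelta(microseconds=int(usecs))


first = parse_transition("0) Transition at 795697 usecs after Sat Dec 14 21:20:53 2013")
last = parse_transition("5) Transition at 800886 usecs after Sat Dec 14 21:20:53 2013")

# Integer number of microseconds between port up and root forwarding.
delta_us = (last - first) // timedelta(microseconds=1)
print(f"{delta_us} us = {delta_us / 1000:.1f} ms")
```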
60
08:55
Chapter 3:
Data-Plane Layer 3
uRIB
LC
SPAN
3 Areas to verify
FIB
Manager
uFDM
uRIB
OSPF
route adj
IS-IS RIP IP
BGP
mRIB
• RIB fully resolved and used for
packets originated by the control
plane
Is control plane state
as expected
(route exists, points to
expected next hop)?
Is control plane stable?
Is control plane
consistent with data
plane
(route programmed in
forwarding plane,
consistent with control
plane)?
Data-Plane
Control-Plane
Forwarding Hardware
• Neighbor management
• Protocol database
• Add/Delete prefixes
• Translate routes to
hardware format
• Program hardware
forwarding engine
• Push routes to platform
• Route download
Control-Plane
Data-Plane
Data-Plane
Unicast Routing Architecture
62
L3
Paris
42.42.42.4
Ip ospf-42
42.42.42.142
11.0.0.1/32
VID = 42
N7004-Paris# show ip ospf 42 internal txlist urib
ospf 42
ospf process tag 42
ospf process instance number 1
ospf process uuid 1090519321
ospf process linux pid 7746
<SNIP>
OSPFv2->URIB transmit list: version 0x10
N7004-Paris# show processes cpu sort |i PID|7746
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
7746 10450 502752 0 0.00% 0.01% 0.01% - ospf
uRIB route adj
13: 42.42.42.0/24
14: 11.0.0.1/32
15: 10.0.2.0/24
16: 10.0.4.0/24
16: RIB marker
OSPF-42
SAP 320
Assumption: Control-Plane is stable and OSPF receives LSAs;
we look at the flow of information from OSPF to HW.
Data-Plane
Unicast Control
L3
uRIB
OSPF
route adj
OSPF Routes in URIB
Administrative distance
assigned
(D) route is directly attached
(R) route is in RIB
N7004-Paris# sh ip ospf 42 route
<SNIP>
11.0.0.1/32 (inter)(R) area 0.0.0.0
via 42.42.42.142/Vlan42 , cost 41 distance 110
N7004-Paris# show ip route ospf-42 detail
<SNIP>
255.255.255.255/32, ubest/mbest: 1/0
*via sup-eth1, [0/0], 01:59:22, broadcast
11.0.0.1/32, ubest/mbest: 1/0
*via 42.42.42.142, Vlan42, [110/41], 01:57:18, ospf-42, inter
N7004-Paris# sh ip arp 42.42.42.142
<SNIP>
IP ARP Table
Total number of entries: 1
Address Age MAC Address Interface
42.42.42.142 00:03:39 0010.7be8.53b0 Vlan42
Is there a route to the destination ?
Do we have a resolved
Layer 2 address?
Data-Plane
Unicast Control
69
uFDM
uRIB
client
route adj
L3
Forwarding Hardware
FIB
Manager
Verifying on the ingress line card
N7004-Paris# show forwarding ipv4 route 11.0.0.1 module 4
IPv4 routes for table default/base
------------------+------------------+----------------------+-----------------
Prefix | Next-hop | Interface | Labels
------------------+------------------+----------------------+-----------------
11.0.0.1/32 42.42.42.142 Vlan42
N7004-Paris# show forwarding adjacency 42.42.42.142 module 4
IPv4 adjacency information
next-hop rewrite info interface
-------------- --------------- -------------
42.42.42.142 0010.7be8.53b0 Vlan42
N7004-Paris# show ip arp 42.42.42.142
Address Age MAC Address Interface
42.42.42.142 00:08:56 0010.7be8.53b0 Vlan42
Is the adjacency
consistent with ARP
in the control plane?
Hardware forwarding (FIB)
information on per-module basis
Displays hardware adjacency
table information
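The check on this slide boils down to comparing the rewrite MAC of the hardware adjacency with the MAC in the ARP table for the same next hop. A rough sketch of that comparison for values copied out of the two outputs; the helper names are hypothetical, and normalising the notation is only needed if the values come from tools that print MACs differently.

```python
def normalize_mac(mac):
    """Reduce a MAC in dotted, colon or dash notation to 12 lowercase hex digits."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def adjacency_matches_arp(adjacency_mac, arp_mac):
    """True if the FIB adjacency rewrite MAC equals the ARP entry MAC."""
    return normalize_mac(adjacency_mac) == normalize_mac(arp_mac)


# Values for next hop 42.42.42.142 taken from the outputs above.
print(adjacency_matches_arp("0010.7be8.53b0", "0010.7be8.53b0"))
```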
Data-Plane
Layer 3 Unicast
09:00 70
uFDM
uRIB
client
route adj
L3
Forwarding Hardware
FIB
Manager
Verifying on the ingress line card
N7004-Paris# show system internal forwarding route 11.0.0.1 module 4 detail
RPF Flags legend:
S - Directly attached route (S_Star)
V - RPF valid
M - SMAC IP check enabled
G - SGT valid
E - RPF External table valid
11.0.0.1/32 , Vlan42 , No of paths: 1
Dev: 1 , Idx: 0x2603 , RPF Flags: V , DGT: 0 , VPN: 7
RPF_Intf_5: Vlan42 (0x35 )
AdjIdx: 0xa038 , LIFB: 0 , LIF: Vlan42 (0x35 ), DI: 0x0
DMAC: 0010.7be8.53b0 SMAC: 4055.390f.5644
N7004-Paris# show system internal forwarding adjacency entry 0xa038 module 4
Device: 1 Index: 0xa038 dmac: 0010.7be8.53b0 smac: 0055.390f.5644 e-vpn: 7
e-lif: 0x35 packets: 0 bytes: 0
Data-Plane
Layer 3 Unicast
71
Verification Location
L2/L3 reachability for multicast and max. unicast
N7004-Paris# ping multicast 224.0.0.5 interface vlan 42
PING 224.0.0.5 (224.0.0.5): 56 data bytes
64 bytes from 42.42.42.5: icmp_seq=0 ttl=254 time=0.836 ms
64 bytes from 42.42.42.5: icmp_seq=1 ttl=254 time=0.685 ms
64 bytes from 42.42.42.5: icmp_seq=2 ttl=254 time=0.613 ms
<SNIP>
64 bytes from 42.42.42.142: icmp_seq=0 ttl=254 time=4.461 ms
64 bytes from 42.42.42.142: icmp_seq=1 ttl=254 time=5.007 ms
64 bytes from 42.42.42.142: icmp_seq=2 ttl=254 time=5.771 ms
<SNIP>
N7004-Paris# ping 42.42.42.142 packet-size 1472
PING 42.42.42.142 (42.42.42.142): 1472 data bytes
1480 bytes from 42.42.42.142: icmp_seq=0 ttl=254 time=5.493 ms
1480 bytes from 42.42.42.142: icmp_seq=1 ttl=254 time=5.37 ms
1480 bytes from 42.42.42.142: icmp_seq=2 ttl=254 time=5.337 ms
<SNIP>
Why not 1500?
1500 – 20 (IP) -8 (ICMP) = 1472
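The same arithmetic as a small sketch, assuming an IPv4 header without options (20 bytes) and the 8-byte ICMP echo header; the function name is hypothetical. The "1480 bytes from ..." in the ping output is the 1472-byte payload plus the 8-byte ICMP header.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request/reply header


def max_icmp_payload(mtu=1500):
    """Largest ping payload that fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(max_icmp_payload())      # standard Ethernet MTU
print(max_icmp_payload(9216))  # example jumbo MTU
```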
Ethanalyzer ELAME
Debug
Better
alternatives
OSPF
Debug
Ethanalyzer
ELAME
CoPP
RL
ICMP
Q
Data-Plane
Layer 3 Unicast
72
73
Example
N7K# test forwarding inconsistency
N7K# show forwarding inconsistency
IPV4 Consistency check : table_id(0x13) Execution time : 14327 ms ()
No inconsistent adjacencies.
Inconsistent routes:
1. slot(1), vrf(default), prefix (172.31.38.6/32), Route extra in FIB Software
2. slot(1), vrf(default), prefix (172.31.38.2/32), Route extra in FIB Software
Test for inconsistency
N7K# show ip route 172.18.144.2
IP Route Table for VRF "default"
<SNIP>
172.18.144.0/24, ubest/mbest: 1/0
*via 172.31.38.2, [200/0], 1d22h, bgp-65000, internal, tag 64949
N7K# show ip fib route 172.18.144.2
<SNIP>
------------------+------------------+----------------------+--------
Prefix | Next-hop | Interface | Labels
------------------+------------------+----------------------+---------
*172.18.144.0/24 0.0.0.0 Null0
How can we recover?
(show forwarding ipv4
route 172.18.144.2
module 1)
FIB
Manager
uRIB
route
Data-Plane
Layer 3 Unicast
73
IDS
Security Check
This checking drops
various ‘illegal’ packets
These drops can be
also seen in show
hardware internal
errors but there they
might look a bit more
cryptic
The checks can be
disabled via ‘hardware
ip verify …’ – in
default VDC (for all
VDCs)
IDS and how do we identify the source or sender?
Data-Plane
Layer 3 Unicast
N7004-Paris# show hardware forwarding ip verify module 4
IPv4 IDS Checks Status Packets Failed
-----------------------------+---------+------------------
address source broadcast Enabled 0
address source multicast Enabled 0
address destination zero Enabled 0
address identical Disabled --
address reserved Disabled --
address class-e Disabled --
checksum Enabled 0
protocol Enabled 0
fragment Disabled --
length minimum Enabled 0
length consistent Enabled 0
length maximum max-frag Enabled 0
length maximum udp Disabled --
length maximum max-tcp Enabled 0
tcp flags Disabled --
tcp tiny-frag Enabled 0
version Enabled 0
<SNIP>
09:05 75
Non-zero Counter
N7K# show hardware internal errors mod 4
<SNIP>
|------------------------------------------------------------------------|
| Device:Lamira Role:L3 |
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
-- ---- ----- -----
2 IF IDS check TCP flags verification 0000000002ebfd14 1-48 I1
8 IF IDS check Src or Dest IP is Class E 0000000002ebfd14 1-48 I1
17 CL2 Invalid Pkt count 00000001bf4bbac4 1-48 I1
57 L3 Fib Miss Pkt ctr 0000000079978df2 1-48 I1
How do we verify if IDS dropped packets?
How do we identify the source of those packets?
Data-Plane
Layer 3 Unicast
76
Examples
Forwarding Engine
Line Card
DI := SUP
DI := drop
Exception
Redirect
Table
SPAN Engine
ERSPAN
SPAN
E 3/37
DI := SUP
DI := drop
Use inband SPAN
- MTU failures
- TTL errors
- ICMP redirect
Use exception SPAN
- IP Option fail
- IP check
- RPF
- Unsupported RW
N7004-Berlin(config)# monitor session 1
N7004-Berlin(config-monitor)# source interface sup-eth 0 both
or
N7004-Berlin(config-monitor)# source exception {layer3 | fabricpath | other | all}
Destination Index
:= Drop can be
changed to SPAN
Engine
Data-Plane
Tools: SPAN
77
2+6 (5) sessions:
M2, F1-3
and NX-OS 6.2
2+12 Session Model
Options
Tools
SPAN
78
SPAN/ERSPAN is hardware based and distributed
(not using resources on the SUP)
SPAN (Port or VLAN)
RSPAN (Destination)
ERSPAN
ACL Capture1
Rule Based SPAN
(VLAN Filtering)
MTU Truncated SPAN
Sampling
Rate Limit
I/O Module
Replication
Engine
1 ACL Capture requires NX-OS 5.2(1)
or higher and M-Series line cards
N7004-Berlin(config)# monitor session 1
N7004-Berlin(config-monitor)# source interface sup-eth 0 … |
eth a/b … |
port-channel c …
N7004-Berlin(config-monitor)# source vlan d
Monitoring
Appliance
Fabric
Egress
Line Card(s)
switch(config-monitor)#
filter frame-type ipv4 src-ip 10.1.1.3/32 tos 3 l4-protocol …
10.1.1.3
cos 3
“Could be called an Application Copy function”
Regular
Destination
Egress
Line Card(s)
Chapter 4:
Control-Plane Inband
Inband Concept
Trigger
CoPP
Netstack
RL
Inband
Two Tasks
Looking for dropped
packets which are
targeted for the
Control Plane
Management
Port
1G
10G
Multiple
CPU
Cores
Inband
CoPP
RL
OSPF…
SUP
Line Card
System
Controller
High CPU due to:
Punted traffic
ACL processing
Control Plane tasks
Identifying from
where and what is being
sent from/to the
CPU
Kernel
ELAME
Reference Point 2 Reference Point 1
Architecture
Inband Path
Forwarding
Engine
Ethanalyzer PID
X
09:10 80
L3 Resources
How long since the
route was added?
How long since ARP
has been updated?
How long has the
adjacency stayed up?
Can we find previous
incarnations of
adjacency here?
Log of recent routing
changes (can filter out
prefix in question)?
N7004-Paris# show ip route 11.0.0.1
<SNIP>
11.0.0.1/32, ubest/mbest: 1/0
*via 42.42.42.142, Vlan42, [110/41], 02:50:20, ospf-42, inter
N7004-Paris# show ip arp 42.42.42.142
Address Age MAC Address Interface
42.42.42.142 00:01:29 0010.7be8.53b0 Vlan42
N7004-Paris# show ip ospf neighbors
OSPF Process ID 42 VRF default
Total number of neighbors: 2
Neighbor ID Pri State Up Time Address Interface
42.0.0.5 1 FULL/BDR 02:52:48 42.42.42.5 Vlan42
200.0.0.10 1 FULL/DR 02:51:44 42.42.42.142 Vlan42
Are we stable?
Control-
Plane
81
OSPF
192.251.19.22
Syslog messages report
OSPF neighbor failures
40.9.0.0
2011 Mar 26 15:38:56.395 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981]
Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME
2011 Mar 26 15:38:56.584 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981]
Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD
2011 Mar 26 15:39:33.865 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981]
Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME
2011 Mar 26 15:39:35.754 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981]
Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD
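To see at a glance which neighbor is flapping and how often, the NBRSTATE messages can be counted per neighbor and interface. A minimal sketch for offline log review; the helper name `count_down_events` and the `syslog` list are illustrative assumptions, and the pattern is only fitted to the message format shown above.

```python
import re
from collections import Counter

NBR_RE = re.compile(
    r"%OSPF-5-NBRSTATE:.*?Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\w+) to (?P<new>\w+), (?P<reason>\w+)"
)


def count_down_events(lines):
    """Count transitions to DOWN per (neighbor, interface)."""
    downs = Counter()
    for line in lines:
        match = NBR_RE.search(line)
        if match and match.group("new") == "DOWN":
            downs[(match.group("nbr"), match.group("intf"))] += 1
    return downs


# The four messages from the slide, each joined back into a single line.
syslog = [
    "2011 Mar 26 15:38:56.395 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME",
    "2011 Mar 26 15:38:56.584 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD",
    "2011 Mar 26 15:39:33.865 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME",
    "2011 Mar 26 15:39:35.754 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD",
]

for (nbr, intf), count in count_down_events(syslog).items():
    print(f"{nbr} on {intf}: {count} x DOWN")
```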
An example of a trigger, or why you start looking:
Control-
Plane
A Syslog Message 1
%COPP-5-COPP_DROPS5: CoPP drops exceed threshold in class:
copp-system-class-critical,
check show policy-map interface control-plane for more info.
Active CoPP Monitoring showing drops 2
SITE1-AGG1# show policy-map int control-plane
SITE1-AGG1# show policy-map int control-plane | i "class|conform|violated"
<SNIP>
violated 1799505072 bytes; action: drop
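For regular monitoring, the violated counters can be pulled out of a saved `show policy-map interface control-plane` capture and grouped per class. A rough sketch under stated assumptions: the `sample` text is assembled from fragments shown in this deck and is not a complete command output, and the function name is hypothetical.

```python
import re

CLASS_RE = re.compile(r"class-map (\S+) \(match-")
VIOLATED_RE = re.compile(r"violated (\d+) bytes")


def violated_by_class(output):
    """Sum the 'violated N bytes' counters under each class-map heading."""
    totals = {}
    current = None
    for line in output.splitlines():
        match = CLASS_RE.search(line)
        if match:
            current = match.group(1)
            continue
        match = VIOLATED_RE.search(line)
        if match and current is not None:
            totals[current] = totals.get(current, 0) + int(match.group(1))
    return totals


# Illustrative input only, stitched together from lines shown on the slides.
sample = """\
class-map copp-system-class-critical (match-any)
  violated 1799505072 bytes; action: drop
class-map copp-system-class-ospf-test (match-any)
  conformed 0 bytes; action: drop
  violated 0 bytes; action: drop
"""

for name, dropped in violated_by_class(sample).items():
    if dropped:
        print(f"{name}: {dropped} bytes violated")
```

Classes with a non-zero and growing counter are the ones to compare against the baseline.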
1 Needs NX-OS 5.1 or higher
Logging drop threshold #
level #
2 No "statistics per-entry" available,
but
show system internal
access-list input entries
detail
82
Platform Independent
Berlin
42.42.42.4
London
42.42.42.5
42.42.42.142
Verifying the neighbors and if needed the adjacency history
N7004-Paris# show ip ospf 42 event-history adjacency |i EXCHDONE
2013 Dec 29 17:21:46.927613 ospf 42 [7746]: :
Nbr 42.42.42.142: EXCHANGE --> FULL, event EXCHDONE
N7004-Paris# show ip ospf 42 neighbors
OSPF Process ID 42 VRF default
Total number of neighbors: 2
Neighbor ID Pri State Up Time Address Interface
42.0.0.5 1 FULL/BDR 02:16:27 42.42.42.5 Vlan42
200.0.0.10 1 FULL/DR 02:15:24 42.42.42.142 Vlan42
Paris Moscow
10.x.x.x
The latest messages appear at the top
Control-
Plane
83
Failure Domain
Determine the
failure domain with
Ethanalyzer
From the process point
of view:
Do I get enough?
Do I get too much?
Ingress MAC Drops?
Ethanalyzer
HWRL Drops?
CoPP Drops?
Inband Drops or FC?
Packet Manager?
IPv4/IPv6
ARP/AM
uRIB
Line Card
ELAME
OSPF
Do we receive the
packet?
Do we receive the
packet (e.g. BPDU
or LSA) at the CPU?
CPU?
MEM?
We verified on the other
side we are sending
LDP, BGP, OSPF, …
One real world example
in chapter 5 (ARP)
Control-
Plane
84
OSPF
192.251.19.22
Syslog messages report
OSPF neighbor failures
CPU shows high utilization
caused by the OSPF and
Netstack processes
40.9.0.0
Here two processes, OSPF and NETSTACK,
are using the most resources.
How much do they usually use?
What does my baseline look like?
N7K-1-VDC2# show system resources
Load average: 1 minute: 2.92 5 minutes: 2.38 15 minutes: 2.27
Processes : 1267 total, 4 running
CPU states : 34.0% user, 42.5% kernel, 23.5% idle
Memory usage: 4115232K total, 3638780K used, 476452K free
N7K-1-VDC2# show processes cpu sort
PID Runtime(ms) Invoked uSecs 1Sec Process
----- ----------- -------- ----- ------ -----------
3981 127 276 462 43.2% ospf
3841 267 78 3427 16.4% netstack
2941 34146488 7377876 4628 0.9% platform
3982 118 245 485 0.9% ospfv3
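A "high CPU" reading only means something relative to the baseline. A minimal sketch that extracts the busy share from the `CPU states` line of `show system resources` and compares it with a recorded baseline; the baseline value and the helper name are assumptions for illustration.

```python
import re

CPU_RE = re.compile(
    r"CPU states\s*:\s*([\d.]+)% user,\s*([\d.]+)% kernel,\s*([\d.]+)% idle"
)


def cpu_busy_percent(line):
    """Return user + kernel utilization from a 'CPU states' line."""
    match = CPU_RE.search(line)
    if match is None:
        raise ValueError(f"no CPU states found in: {line!r}")
    user, kernel, _idle = (float(value) for value in match.groups())
    return user + kernel


current = cpu_busy_percent("CPU states : 34.0% user, 42.5% kernel, 23.5% idle")
baseline = 20.0  # hypothetical value recorded while the system was healthy
print(f"busy {current:.1f}% vs. baseline {baseline:.1f}%")
```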
+ statistics per core for SUP-2/SUP-2E
and with newer NX-OS for SUP-1
Control-
Plane
Module-3# show system internal processes cpu
09:15 85
Platform Independent
Having problems with one neighbor or link?
N7004-Paris# show ip ospf retransmission-list 200.0.0.10 vlan 42
OSPF Process ID 42 VRF default
Neighbor 200.0.0.10, interface Vlan42, address 42.42.42.142
Link state retransmission timer not running
Type LSID Adv Rtr Seq No Checksum Age
Checklist for neighbor
issues:
L2/L3 reachability
Configuration challenges like
OSPF not enabled on the
interface
Interface is defined as
passive
Mismatched subnet mask,
timer, area ID, …
Control-
Plane
86
L3 Resources
Log of recent routing
changes (can filter out
prefix in question)
Verifying ADJMGR
history
N7004-Paris# show routing event-history general
Dumping: general
2013 Dec 29 17:31:01.280070 urib: Received state change for unknown uuid
0x2c9
2013 Dec 29 17:31:01.101498 urib: Received state change for unknown uuid 0x0
2013 Dec 29 17:31:00.069988 urib: Received state change for unknown uuid
0x2c2
2013 Dec 29 17:30:59.751551 urib: Received state change for unknown uuid 0x0
2013 Dec 29 17:21:52.543826 urib: "ospf-42": 11.0.0.1/32 C: SN=F EC=T NF=T
VN=T WM=F BH=F NW=T UP=T
2013 Dec 29 17:21:52.543811 urib: "ospf-42": 11.0.0.1/32, new best path nh
42.42.42.142%Vlan42, metric [110/41] route-type inter tag 0x00000000
2013 Dec 29 17:21:52.543810 urib: "ospf-42": 11.0.0.1/32 B: SN=F EC=F NF=T
VN=T WM=F BH=F NW=T UP=T
What happened?
N7004-Paris# show sys inte adjmgr internal event-history ipc |grep prev 1
42.42.42.142
4) Event:E_DEBUG, length:160, at 503661 usecs after Sun Dec 29 17:21:41 2013
[116] [7586]: Added adjacency entry for 42.42.42.142 (0010.7be8.53b0) on
interface Vlan42 (Ethernet4/2)with preference 50 afi 1 mct 0 uuid 268 Mac
changed:TRUE
<SNIP>
Control-
Plane
87
CPU
Output show
processes from all
VDC’s
Multiple
CPUs &
Cores
How much is my CPU used?
How much are my CPU cores used?
How did one specific PID (e.g. OSPF) behave in the past?
N7K-3-VDC3# show processes cpu | egrep "PID|--|ospf"
PID Runtime(ms) Invoked uSecs 1Sec Process
----- ----------- -------- ----- ------ -----------
9337 102 72 1418 0.0% ospfv3
22916 118 62 1905 13.1% ospf
N7K-3-VDC3# show system internal sysmgr service pid 22916
Service "__inst_001__ospf" ("ospf", 58):
UUID = 0x41000119, PID = 22916, SAP = 320
State: SRV_STATE_HANDSHAKED (entered at time Thu Mar 3 21:53:59 2011).
Restart count: 1
Time of last restart: Thu Mar 3 21:53:58 2011.
The service never crashed since the last reboot.
Tag = 6467
Plugin ID: 1
Wait I remember now
for the complaining
customer we used:
VDC2…
Verify “high CPU”
against the base line.
You need base line
information.
Control-
Plane
88
Case Study
Flapping OSPF neighbors, unwanted “traffic sources” for your
control plane?
40.9.0.0/16
N7K-1# show policy-map interface control-plane module 2
| egrep "service-policy|critical|ospf|police cir 39600"
service-policy input: copp-system-policy
class-map copp-system-class-critical (match-any)
match access-grp name copp-system-acl-ospf
match access-grp name copp-system-acl-ospf6
police cir 39600 kbps , bc 250 ms
N7K-1# show class-map type control-plane copp-system-class-critical
| egrep "class|ospf"
class-map type control-plane match-any copp-system-class-critical
match access-grp name copp-system-acl-ospf
match access-grp name copp-system-acl-ospf6
N7K-1# show ip access-lists copp-system-acl-ospf
IP access list copp-system-acl-ospf
10 permit ospf any any
Customize CoPP, don’t turn it off!
Legitimate
neighbor
Control-
Plane
90
Environment
N7K-1# show ip access-lists copp-system-acl-ospf
IP access list copp-system-acl-ospf
10 permit ospf any any
20 permit ip 40.9.0.0/16 224.0.0.5/32
30 permit ip 40.9.0.0/16 224.0.0.6/32
40.9.0.0/16
N7K-1# show ip access-lists copp-system-acl-ospf-test
IP access list copp-system-acl-ospf-test
10 permit ip any 224.0.0.0/24
N7K-1# show policy-map interface control-plane module 2 | egrep "service-policy|critical|ospf|police cir 39600|ospf-test|police cir 100 "
service-policy input: copp-system-policy
class-map copp-system-class-critical (match-any)
match access-grp name copp-system-acl-ospf
match access-grp name copp-system-acl-ospf6
police cir 39600 kbps , bc 250 ms
class-map copp-system-class-ospf-test (match-any)
match access-grp name copp-system-acl-ospf-test
police cir 100 bps , bc 200 ms
OSPFv2
224.0.0.5
224.0.0.6
CoPP
Now we specifically
identify the legitimate
neighbors
Control-
Plane
09:20 91
Environment
40.9.0.0/16
OSPFv2
224.0.0.5
224.0.0.6
Module 1
CoPP
N7K-1# show policy-map interface control-plane module 1 class copp-system-class-ospf-test
control Plane
service-policy input: copp-system-policy
class-map copp-system-class-ospf-test (match-any)
match access-grp name copp-system-acl-malicious
police cir 100 bps , bc 200 ms
module 1 :
conformed 0 bytes; action: drop
violated 0 bytes; action: drop
N7K-1# show policy-map interface control-plane module 2 class copp-system-class-ospf-test
control Plane
service-policy input: copp-system-policy
class-map copp-system-class-ospf-test (match-any)
match access-grp name copp-system-acl-ospf-test
police cir 100 bps , bc 200 ms
module 2 :
conformed 0 bytes; action: drop
violated 1799505072 bytes; action: drop
Module 2
CoPP
Generic approach: with show policy-map interface control-plane you
determine the affected class, and with show class-map type control-plane
you determine what is classified into that class.
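The per-module comparison above can be sketched as a small parser that lists every class and module with a non-zero violate counter. This is only an illustration: the function name is an assumption, and the parser expects the counter layout shown on this slide (other NX-OS releases print the counters differently).

```python
import re

def violated_by_class(text):
    """Return {(class_name, module): violated_bytes} for non-zero violate counters."""
    result = {}
    current_class = None
    current_module = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"class-map (\S+)", line)
        if m:
            current_class = m.group(1)
            continue
        m = re.match(r"module (\d+)\s*:", line)
        if m:
            current_module = int(m.group(1))
            continue
        m = re.match(r"violated (\d+) bytes", line)
        if m and int(m.group(1)) > 0:
            result[(current_class, current_module)] = int(m.group(1))
    return result

sample = """\
class-map copp-system-class-ospf-test (match-any)
police cir 100 bps , bc 200 ms
module 1 :
conformed 0 bytes; action: drop
violated 0 bytes; action: drop
class-map copp-system-class-ospf-test (match-any)
police cir 100 bps , bc 200 ms
module 2 :
conformed 0 bytes; action: drop
violated 1799505072 bytes; action: drop
"""

hits = violated_by_class(sample)
print(hits)
```

Only module 2 is reported, which points at the line card facing the unwanted traffic source.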
Control-
Plane
92
RL
As with CoPP policers,
modifying the default rates
should be carefully planned
before any configuration
changes.
Rate-limiters can prevent overwhelming the control-plane
CoPP
RL
Multiple
CPU
Cores
N7004-Berlin# show hardware rate-limiter
Units for Config: packets per second
Allowed, Dropped & Total: aggregated since last clear counters
Module: 3
R-L Class Config Allowed Dropped Total
+----------------+--------+-------------+-------------+----------------+
L3 mtu 500 0 0 0
L3 ttl 500 0 0 0
L3 control 10000 0 0 0
L3 glean 100 0 0 0
<SNIP>
L2 storm-ctrl Disable
access-list-log 100 0 0 0
copy 30000 1423 0 1423
receive 30000 8540 0 8540
L2 port-sec 500 0 0 0
L2 mcast-snoop 10000 2 0 2
<SNIP>
Control-
Plane
93
Inband
SUP-2 / NX-OS 6.2 (5.41)
BPDU Q0 Q1
Clipper
R2D2
CPU
BDR-529-Berlin# show system inband queuing status
Weighted Round Robin Algorithm
Weights BPDU - 64, Q0 - 16, Q1 - 4
BDR-529-Berlin# show system inband queuing statistics
Inband packets unmapped to a queue: 0
Inband packets mapped to bpdu queue: 2078
Inband packets mapped to q0: 1339
Inband packets mapped to q1: 4
In KLM packets mapped to bpdu: 0
In KLM packets mapped to arp : 0
In KLM packets mapped to q0 : 0
In KLM packets mapped to q1 : 0
In KLM packets mapped to veobc : 0
Inband Queues:
bpdu: recv 2078, drop 0, congested 0 rcvbuf 2097152, sndbuf 4194304 no drop 1
(q0): recv 1339, drop 0, congested 0 rcvbuf 2097152, sndbuf 4194304 no drop 0
(q1): recv 4, drop 0, congested 0 rcvbuf 2097152, sndbuf 4194304 no drop 0
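To get a feeling for the weights 64/16/4, here is a small sketch that turns them into service shares. It assumes textbook weighted round robin with all three queues permanently backlogged; the actual inband scheduler on the supervisor may behave differently, so treat the numbers as an approximation only.

```python
from fractions import Fraction

def wrr_shares(weights):
    """Share of service per queue under textbook WRR with every queue backlogged."""
    total = sum(weights.values())
    return {queue: Fraction(weight, total) for queue, weight in weights.items()}

# Weights as printed by 'show system inband queuing status' on the slide.
weights = {"bpdu": 64, "q0": 16, "q1": 4}
shares = wrr_shares(weights)

for queue, share in shares.items():
    print(f"{queue}: {float(share) * 100:.1f}% of the serviced packets")
```

Under this assumption the BPDU queue gets roughly three quarters of the service under congestion, which is why STP stays stable even when q0 and q1 are busy.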
Control-
Plane
94
Inband
CPU
N7004# show hardware internal cpu-mac inband events
1) Event:TX_PPS_MAX, length:4, at 382147 usecs after Fri Jan 10 20:04:37
2014 new maximum = 191
2) Event:RX_PPS_MAX, length:4, at 382147 usecs after Fri Jan 10 20:04:37
2014 new maximum = 195
How to determine the max pps rate to/from the CPU, and whether
(and how often) we ran out of buffers?
How to determine the time of the max pps rate so it can be correlated
against your logs?
N7004-Berlin# show hardware internal cpu-mac inband stats | in rate|buffer
Rx no buffers .................. 0
Packet rate limit ........... 64000 pps
Rx packet rate (current/max) 85 / 195 pps
Tx packet rate (current/max) 85 / 191 pps
Goal: Compare against logs
Possible next step: logw.py
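A sketch of the "compare against logs" step, assuming the event output and the log lines are available as text. It is not the logw.py script mentioned above; the function names, the 5-second window, and the two log lines are made up for illustration.

```python
import re
from datetime import datetime, timedelta

EVENT = re.compile(
    r"Event:(\w+), length:\d+, at (\d+) usecs after "
    r"(\w{3} \w{3}\s+\d+ \d{2}:\d{2}:\d{2})\s+(\d{4}) new maximum = (\d+)"
)

def parse_inband_events(text):
    """Return [(event_name, timestamp, new_maximum)] from the cpu-mac inband events."""
    flat = " ".join(text.split())  # re-join records wrapped across lines
    events = []
    for name, usecs, stamp, year, value in EVENT.findall(flat):
        when = datetime.strptime(f"{stamp} {year}", "%a %b %d %H:%M:%S %Y")
        events.append((name, when + timedelta(microseconds=int(usecs)), int(value)))
    return events

def logs_near(when, log_lines, window_seconds=5):
    """Log lines whose 'YYYY Mon DD HH:MM:SS' prefix lies within the window."""
    hits = []
    for line in log_lines:
        m = re.match(r"(\d{4} \w{3}\s+\d+ \d{2}:\d{2}:\d{2})", line)
        if not m:
            continue
        ts = datetime.strptime(" ".join(m.group(1).split()), "%Y %b %d %H:%M:%S")
        if abs((ts - when).total_seconds()) <= window_seconds:
            hits.append(line)
    return hits

events_text = """\
1) Event:TX_PPS_MAX, length:4, at 382147 usecs after Fri Jan 10 20:04:37
2014 new maximum = 191
2) Event:RX_PPS_MAX, length:4, at 382147 usecs after Fri Jan 10 20:04:37
2014 new maximum = 195
"""
# Hypothetical log lines, only to show the correlation.
logs = [
    "2014 Jan 10 20:04:36 N7004 hypothetical log message close to the peak",
    "2014 Jan 10 20:14:40 N7004 hypothetical log message ten minutes later",
]

events = parse_inband_events(events_text)
print(logs_near(events[1][1], logs))
```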
Control-
Plane
95
NX-OS
Packet
Manager
NetStack
IP
Clients
NetStack VDC -1
L3
L2
„ip input“
ARP
OSPF
System
manager
OSPF ARP
System Manager starts
and controls / monitors
the services.
If the heartbeat fails:
core (sig6) -> system
troubleshooting
N7004-Berlin# debug pktmgr frame
2014 Jan 10 20:14:40.061027 pktmgr: In 0x0800 82 7 4055.390f.5645 -> 0100.5e00.0005 Eth3/6
STP
BGP
Clients
Ethanalyzer ELAME
Debug
Packet
Manager
NetStack
IP
NetStack VDC-2
L3
L2
„ip input“
Control-
Plane
09:25 96
Chapter 5:
Control-Plane ARP
ARP
glean
HSRP
Layer 2/3
ARP Incomplete...
E 3/13 E 3/14
20.0.0.0/24
.13 .14
VRF
Control-P.
ARP & AM
N7004-Berlin# show ip arp
Flags: * - Adjacencies learnt on non-active FHRP router
+ - Adjacencies synced via CFSoE
# - Adjacencies Throttled for Glean
D - Static Adjacencies attached to down interface
IP ARP Table for context default
Total number of entries: 5
Address Age MAC Address Interface
192.168.0.3 00:04:41 4055.390f.5643 Vlan1
10.0.3.5 00:06:35 4055.390f.5645 Ethernet3/6
10.0.2.4 00:07:14 4055.390f.5644 Ethernet3/8
20.0.0.13 00:00:14 INCOMPLETE Ethernet3/14
192.168.0.254 - 0000.0c9f.f001 Vlan1
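A small sketch for pulling the INCOMPLETE adjacencies out of saved `show ip arp` output, including the `#` (throttled for glean) flag from the legend. The function name is an assumption; the last sample row is taken from a later slide so that the flag case is covered.

```python
import re

# address, age, MAC (or INCOMPLETE), interface, optional flag
ROW = re.compile(
    r"^(\d+\.\d+\.\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+(\S+))?\s*$"
)

def incomplete_adjacencies(text):
    """Return [(ip, interface, glean_throttled)] for INCOMPLETE ARP entries."""
    found = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m and m.group(3) == "INCOMPLETE":
            ip, _age, _mac, interface, flag = m.groups()
            found.append((ip, interface, flag == "#"))
    return found

sample = """\
Address Age MAC Address Interface
192.168.0.3 00:04:41 4055.390f.5643 Vlan1
10.0.3.5 00:06:35 4055.390f.5645 Ethernet3/6
10.0.2.4 00:07:14 4055.390f.5644 Ethernet3/8
20.0.0.13 00:00:14 INCOMPLETE Ethernet3/14
192.168.0.254 - 0000.0c9f.f001 Vlan1
192.1.49.2 00:00:13 INCOMPLETE Ethernet4/9 #
"""

incomplete = incomplete_adjacencies(sample)
print(incomplete)
```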
Simple example
uRIB
(253)
route adj AM ARP
98
E 3/13
20.0.0.0/24
.13 .14
VRF
ARP
E 3/14
Consider using a debug filter and sending the debug output to a file
Control-P.
ARP & AM
N7004-Berlin# debug ip arp packet
2014 Jan 5 21:51:40.477507 arp: (context 1) Sending packet on interface Ethernet3/14, (prty 0) Hrd type 1 Prot type 800 Hrd len 6 Prot len 4 OP 1, Pkt size 28
2014 Jan 5 21:51:40.477629 arp: Src 4055.390f.5642/20.0.0.14 Dst ffff.ffff.ffff/20.0.0.13
2014 Jan 5 21:51:40.481061 arp: (context 4) Receiving packet from interface Ethernet3/13, (prty 6) Hrd type 1 Prot type 800 Hrd len 6 Prot len 4 OP 1, Pkt size 46
2014 Jan 5 21:51:40.481131 arp: Src 4055.390f.5642/20.0.0.14 Dst ffff.ffff.ffff/20.0.0.13
N7004-Berlin# show ip arp statistics ethernet 3/14
ARP packet statistics for interface: Ethernet3/14
Sent:
Total 10, Requests 9, Replies 0, Requests on L2 0, Replies on L2 0,
Gratuitous 1, Tunneled 0, Dropped 0
Send packet drops details:
MBUF operation failed : 0
Context not yet created : 0
Invalid context : 0
Invalid ifindex : 0
Invalid SRC IP : 0
Invalid DEST IP : 0
Destination is our own IP : 0
Unattached IP : 0
<SNIP>
E 3/13
20.0.0.0/24
.13 .14
VRF
ARP
N7004-Berlin# show ip arp statistics ethernet 3/14
Control-P.
ARP & AM
Received:
Total 1, … , Dropped 1
Received packet drops details:
Appeared on a wrong interface : 0
Incorrect length : 0
Invalid protocol packet : 0
Invalid context : 0
Context not yet created : 0
Invalid layer 2 address length : 0
Invalid layer 3 address length : 0
Invalid source IP address : 0
Source IP address is our own : 0
No mem to create per intf structure : 0
Source address mismatch with subnet : 0
Directed broadcast source : 0
<SNIP>
N7004-Berlin# show ip arp statistics vrf ALQ
<SNIP>
Received:
Total 13, … , Dropped 13
<SNIP>
Invalid source MAC address : 0
Source MAC address is our own : 13
<SNIP>
100
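The drop-detail lists are long and mostly zero. A minimal sketch, assuming the `show ip arp statistics` output is saved as text, that keeps only the counters that actually incremented (the function name is an assumption):

```python
import re

COUNTER = re.compile(r"^\s*(.+?)\s*:\s*(\d+)\s*$")

def nonzero_drop_reasons(text):
    """Return {reason: count} for every 'reason : N' line with N > 0."""
    reasons = {}
    for line in text.splitlines():
        m = COUNTER.match(line)
        if m and int(m.group(2)) > 0:
            reasons[m.group(1)] = int(m.group(2))
    return reasons

sample = """\
Received:
Total 13, ... , Dropped 13
Received packet drops details:
Appeared on a wrong interface : 0
Incorrect length : 0
Invalid source MAC address : 0
Source MAC address is our own : 13
"""

reasons = nonzero_drop_reasons(sample)
print(reasons)
```

For the VRF example above this leaves a single reason, matching the 13 dropped packets.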
Check CoPP and/or HWRL:
SWT-1 SWT-2
ARP INCOMPLETE
It worked before;
no new deployment
Ethanalyzer verifies ARP
packets are being sent
by SWT-1 but not
received
On SWT-2 ARP is being
received and sent
customer# show vpc brief
vPC keep-alive status : peer is not reachable through peer-keepalive
Control-P.
ARP & AM
Customer# show class-map type control-plane copp-system-p-class-normal
class-map type control-plane match-any copp-system-p-class-normal
match access-group name copp-system-p-acl-mac-dot1x
match exception ip multicast directly-connected-sources
match exception ipv6 multicast directly-connected-sources
match protocol arp
class-map copp-system-p-class-normal (match-any)
violate action: drop
module 5: violated 20557632224 bytes,
5-min violate rate 4154397 bytes/sec
module 9: violated 0 bytes,
5-min violate rate 0 bytes/sec
Customer# show hardware rate-limiter | i Module|R-L|glean
Module: 5
R-L Class Config Allowed Dropped Total
+------------------+--------+---------------+-------------+-----------------+
L3 glean 100 4904 2935 7839
L3 glean-fast 100 863401 1539316 2402717
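To put the Dropped column into perspective, here is a sketch that computes the drop percentage per rate-limiter class from saved `show hardware rate-limiter` output. The function name is an assumption; the sample rows are the ones from this slide.

```python
import re

# class name, config (pps), allowed, dropped, total
ROW = re.compile(r"^\s*(.+?)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")

def rate_limiter_drops(text):
    """Return {class_name: drop_percent} for classes with a non-zero Dropped counter."""
    drops = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        dropped, total = int(m.group(4)), int(m.group(5))
        if dropped and total:
            drops[m.group(1)] = round(100.0 * dropped / total, 1)
    return drops

sample = """\
Module: 5
R-L Class Config Allowed Dropped Total
L3 glean 100 4904 2935 7839
L3 glean-fast 100 863401 1539316 2402717
"""

drops = rate_limiter_drops(sample)
print(drops)
```

On module 5 more than a third of the glean packets and nearly two thirds of the glean-fast packets were dropped, which fits the INCOMPLETE ARP entries.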
09:30 101
Layer 2/3
CPU Utilization
Glean throttling is not enabled by default (be careful with probing
hosts). Default timeout: 300 seconds (range 300s-1800s).
Use show ip arp to verify whether an INCOMPLETE adjacency is in the
glean-throttle state (flag #).
Control-P.
ARP & AM
N7K# show ip arp 192.1.49.2
Flags: * - Adjacencies learnt on non-active FHRP router
+ - Adjacencies synced via CFSoE
# - Adjacencies Throttled for Glean
D - Static Adjacencies attached to down interface
IP ARP Table
Total number of entries: 1
Address Age MAC Address Interface
192.1.49.2 00:00:13 INCOMPLETE Ethernet4/9 #
192.1.49.2/32
Has been throttled
102
glean fast path
NX-OS 6.2(2)
Control-P.
ARP & AM
N7004-London# sh run all | grep "fast-path"
ip arp fast-path
N7004-London# show system internal pixm info ltl-region | i FAST
LIBLTLMAP_LTL_TYPE_SUP_INBAND_GLEAN_FAST_PATH 0x10d1
N7004-London# show system internal pktmgr client 0x10c
Client uuid: 268, 4 filters, pid 7209
Filter 1: EthType 0x0806,
Rx: 28, Drop: 0
Filter 2: EthType 0xfff0, Exc 8,
Rx: 0, Drop: 0
Filter 3: EthType 0x8841, Snap 34881,
Rx: 0, Drop: 0
Filter 4: EthType 0x0800, DstIf 0x150b0000, Excl. Any
Rx: 0, Drop: 0
<SNIP>
SUP-ETH Interface
103
Chapter 6:
ACLs
TCAM PBR
TCAM
What is TCAM? Hardware to identify packets
T0B0
T0B1
T1B0
T1B1
T1B0
T1B1
Forwarding Engine
on Ingress Line Card
contains TCAM
RACL
QoS
T := TCAM 0 or 1
B := Bank 0 or 1
VID 42
Configuration
Interface or VLAN
(ingress/egress)
TCAM
Ternary Content Addressable Memory
Packet
FE
108
TCAM / ACLs
N7004-London# show sys int acc feature bank map interface ingress
<SNIP>
slot 3
=======
_____________________________________________________________________
Feature Rslt Type T0B0 T0B1 T1B0 T1B1
_____________________________________________________________________
PACL Acl X
RACL Acl X
DHCP Acl X
QoS Qos X
PBR Acl X
Netflow Sampler Acc X
SPM WCCP Acl X X X
BFD Acl X
FEX Acl X
<SNIP>
Specific Features map to a specific “location” := TCAM/BANK
Three result types:
QoS
ACL
ACC
T0B0
T0B1
T1B0
T1B1
TCAM
Ternary Content Addressable Memory
109
Result Types
N7004-London# show system internal access-list feature bank map vlan ingress
<SNIP>
slot 3
=======
_________________________________________________________________________
Feature Rslt Type T0B0 T0B1 T1B0 T1B1
_________________________________________________________________________
QoS Qos X
RACL Acl X
PBR Acl X
VACL Acl X
DHCP Acl X
ARP Acl X
Netflow Acl X X
Netflow (SVI) Acl X X
Netflow Sampler Acc X
Netflow Sampler (SVI) Acc X
<SNIP>
Per bank only one result type can be used:
for “VLAN” & “Ingress” e.g. either QoS or NF
Sampler
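The rule "one result type per bank" can be sketched as a simple check. The result types below are taken from the table; the bank assignments are placeholders, because the column positions of the X marks are not recoverable here. On a real system take them from the bank map output.

```python
from collections import defaultdict

def bank_conflicts(placements):
    """placements: {feature: (result_type, bank)}.

    Returns {bank: sorted result types} for every bank that would have to
    hold more than one result type.
    """
    types_per_bank = defaultdict(set)
    for _feature, (result_type, bank) in placements.items():
        types_per_bank[bank].add(result_type)
    return {bank: sorted(types)
            for bank, types in types_per_bank.items() if len(types) > 1}

placements = {
    "RACL": ("Acl", "T0B1"),                   # bank is a placeholder
    "Netflow Sampler (SVI)": ("Acc", "T0B1"),  # bank is a placeholder
    "QoS": ("Qos", "T1B0"),                    # bank is a placeholder
}

conflicts = bank_conflicts(placements)
print(conflicts)  # {'T0B1': ['Acc', 'Acl']}
```

A non-empty result corresponds to the situation where the system rejects the configuration.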
“I can’t configure xyz…the system
rejects my configuration…”
TCAM
Ternary Content Addressable Memory
2014 May 19 11:27:12.673 backuprot3 %ACLQOS-SLOT4-2-ACLQOS_FAILED: ACLQOS failure: feature combination not supported on VDC-2 VLAN 2156 for : RACL, Netflow Sampler (SVI)
2014 May 19 11:27:13.214 backuprot3 %IM-3-IM_RESP_ERROR: Component MTS_SAP_VMM opcode:MTS_OPC_IM_IF_VDC_BIND in vdc:2 returned error:Tcam Allocation Failure
110
ACLs
N7004-London# show system internal access-list globals
slot 1
=======
NOT Supported in SUP ACLQOS
slot 3
=======
Atomic Update : ENABLED
Default ACL : DENY
Bank Chaining : DISABLED
Seq Feat Model : NO_DENY_ACE_SUPPORT
This pltfm supports seq feat model
Bank Class Model : DISABLED
This pltfm supports bank class model
Fabric path DNL : DISABLED
Seq Feat Model : NO_DENY_ACE_SUPPORT
This pltfm supports seq feat model
LOU Threshold Value : 5
Overview
Atomic Update
Resource Pooling
Statistics Per Entry
ACL Threshold Exp
Fragment Handling
Bank Management
T0B0
T0B1
T1B0
T1B1
TCAM
Ternary Content Addressable Memory
09:40 111
ACL
TCAM
N7K# show ip access example
IP access list example
statistics per-entry
10 permit ip any 10.1.2.100/32 [match=3452]
20 deny ip any 10.1.68.101/32 [match=49920]
30 deny ip any 10.33.2.25/32 [match=232324]
40 permit tcp any any eq 22 [match=9881]
50 deny tcp any any eq telnet [match=442]
60 deny udp any any eq syslog [match=87112]
70 permit tcp any any eq www [match=4345667]
80 permit udp any any eq snmp [match=234222]
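With statistics per-entry enabled, the match counters show which ACEs carry the traffic. A minimal sketch, assuming the ACL output is saved as text, that ranks the entries by hit count (the function name and the default of three entries are assumptions):

```python
import re

ACE = re.compile(r"^\s*(\d+)\s+(permit|deny)\s+(.+?)\s+\[match=(\d+)\]\s*$")

def top_aces(text, n=3):
    """Return the n ACEs with the highest match counters as (seq, action, rule, hits)."""
    rows = []
    for line in text.splitlines():
        m = ACE.match(line)
        if m:
            rows.append((int(m.group(4)), int(m.group(1)), m.group(2), m.group(3)))
    rows.sort(reverse=True)
    return [(seq, action, rule, hits) for hits, seq, action, rule in rows[:n]]

sample = """\
IP access list example
statistics per-entry
10 permit ip any 10.1.2.100/32 [match=3452]
20 deny ip any 10.1.68.101/32 [match=49920]
30 deny ip any 10.33.2.25/32 [match=232324]
40 permit tcp any any eq 22 [match=9881]
50 deny tcp any any eq telnet [match=442]
60 deny udp any any eq syslog [match=87112]
70 permit tcp any any eq www [match=4345667]
80 permit udp any any eq snmp [match=234222]
"""

top = top_aces(sample)
for seq, action, rule, hits in top:
    print(seq, action, rule, hits)
```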
ACL logging is enabled by including the log keyword in an
ACL rule (check the entries with show log log).
The Sup receives a copy of the packet. The original packet is
forwarded/dropped in hardware with no performance
penalty.
Statistics per Entry
The CPU is protected by
one of the available
rate limiters. The forwarding
engine hardware enforces the
rate to avoid saturating the
inband interface to the CPU.
The hardware rate-limiter
access-list-log command
adjusts the rate (default 100 pps)
ACL Logging can be a
useful tool during
troubleshooting. Use ACL
logging to sample specific
packets from data plane.
Use onboard ethanalyzer
(wireshark) to analyze
sampled packets
TCAM
Utilization
09:45 118
Statistics per entry disables ACE optimization and merging.
Instead, each configured ACE maps 1:1 to an entry in the CL
TCAM.
TCAM Space
„...when using ACL stats per entry on the 7K the TCAM
utilization goes up to 47%, when removed, it dropped to 7%...“
object groups do NOT offer ANY optimization in terms of
CL (:= Classification) TCAM utilization
ACLs Statistics are
NOT enabled by default
(fundamental difference
vs. IOS) because they
require the ACEs NOT
to be merged and this
affects the TCAM
utilization.
TCAM
Utilization
119
TCAM
N7004(config)# hardware access-list resource feature bank-mapping
N7004(config)# show system internal access-list feature bank-class map ingress
slot 3
=======
Feature Class Definition:
0. CLASS_QOS :
QoS,
1. CLASS_INBAND :
Tunnel Decap, SPM LISP,
2. CLASS_PACL :
PACL, Netflow,
3. CLASS_DHCP :
DHCP, Netflow, Netflow (vlan), ARP,
4. CLASS_RACL :
RACL, RACL_STAT, Netflow (SVI), ARP,
<SNIP>
Feature Class Combination (Ingress)
0. CLASS_PACL, CLASS_QOS_INTF, CLASS_EMPTY, CLASS_EMPTY
1. CLASS_PACL, CLASS_NF_SMPL_INTF, CLASS_EMPTY, CLASS_EMPTY
<SNIP>
33. CLASS_EMPTY, CLASS_EMPTY, CLASS_NF_SMPL, CLASS_QOS
“now I can configure
QoS and NF Sampler”
TCAM
Bank Management
120
Summary
Strategy, Tools
and System
Data-Plane
Layer 2
Data-Plane
Layer 3
Control-Plane
Inband
Control-
Plane ARP
TCAM
10:00
08:00
Time vs. RCA
Be prepared both for
troubleshooting itself
but also for the
strategy
Have an up-to-date
network diagram at
hand
Know your network in
good state
Summary
N7K
N7K with NX-OS
provides visibility and
tools to efficiently
troubleshoot
For most challenges the
“normal” CLI, Ethanalyzer and
the log entries are sufficient
A complete and mature
feature set helps to find
workarounds
Reducing risk by using
mainstream designs
and proven
deployments
122
Summary
two examples of high uptime from EMEAR
ROME, ITALY
Kernel uptime is 1813 day(s),
Nexus 7010
Ireland, UKI
System uptime: 2612 days
MDS9509
ALQ/ 06-APR-14
> 4.5Y > 7.0Y
Shape and secure your future
with Nexus 7000 Series
09:50
“In order to consolidate a business-critical server farm,
enhancing network speed and availability,
in 2009 Fastweb adopted Nexus 7000.
Today, May 2014, Nexus kernel uptime
is more than 1870 days (last reset on Tue Mar 31 16:05:10 2009).”
Luca Chiappetti –
Network Operations Control Coordinator @ Fastweb
123
Complete Your Online Session Evaluation
• Give us your feedback and you
could win fabulous prizes. Winners
announced daily.
• Complete your session evaluation
through the Cisco Live mobile app
or visit one of the interactive kiosks
located throughout the convention
center.
Don’t forget: Cisco Live sessions will be available
for viewing on-demand after the event at
CiscoLive.com/Online
124
Continue Your Education
• Demos in the Cisco Campus
• Walk-in Self-Paced Labs
• Table Topics
• Meet the Engineer 1:1 meetings
125
…and have fun…
09:55

 
Low Rate Call Girls Kolkata Avani 🤌 8250192130 🚀 Vip Call Girls Kolkata
Low Rate Call Girls Kolkata Avani 🤌  8250192130 🚀 Vip Call Girls KolkataLow Rate Call Girls Kolkata Avani 🤌  8250192130 🚀 Vip Call Girls Kolkata
Low Rate Call Girls Kolkata Avani 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of india
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With RoomVIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
 
Russian Call Girls Thane Swara 8617697112 Independent Escort Service Thane
Russian Call Girls Thane Swara 8617697112 Independent Escort Service ThaneRussian Call Girls Thane Swara 8617697112 Independent Escort Service Thane
Russian Call Girls Thane Swara 8617697112 Independent Escort Service Thane
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 

BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (2014 San Francisco) - 2 Hours.pdf

  • 1. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public
  • 2. Advanced Troubleshooting Cisco Nexus 7000 Series Switches (BRKDCT-3144). Dipl.-Ing. Andreas la Quiante, alaquian@cisco.com, Nexus Product Management, Cisco Data Center Group. Level 3 (:= Advanced). Version 019. 2014 San Francisco, 18-MAY-14
  • 3. Chapter 0: Housekeeping. ASICs count starting with zero. So do we today. 08:03
  • 4. Teamwork, thank you: Matt, Martin, Ron, Roland, Dmitry, Ronald, Adam. Need help like me? Terri
  • 5. Housekeeping: Icons. N7K, Switch, Router, PC; Layer 3, Layer 2; Focus areas; CLI (N7004-Berlin# sh int e 3/12); Geek content; Error/Failure/Challenge; Cisco TAC; Interface; VLAN. 08:05
  • 6. Agenda, Timing and Theme (…it’s like going on vacation…). Cisco Live 2014 San Francisco: 120 min.
     Chapter 1-3: 1 Strategy, Tools & System; 2 Data-Plane Layer 2; 3 TCAM
     Chapter 4-6: 4 Data-Plane Layer 3; 5 Control-Plane Inband; 6 Control-Plane ARP
     Summary, Wrap Up
  • 7. Chapter 1: Strategy, Tools and System. Topics: Strategy, CLI, Scripts, ELAME, Ethanalyzer, System
  • 8. Strategy: Three Areas (Guidance)
     System Troubleshooting: core, CPU, memory, interface/VLAN behaving oddly, hardware challenges
     Data Plane Troubleshooting: packets are lost; your primary question is “where”; 100% loss or partial loss; consistent or periodic
     Control Plane Troubleshooting: something is flapping; convergence challenges; start at the process (log)
     “Anything better than checking everything is an improvement” (Dmitry)
  • 9. Strategy: System, Data-Plane, and Control-Plane. Diagram labels: Supervisor (Control-Plane); two I/O Modules (Forwarding Engine); System, Control-Plane, Data-Plane; RL; CoPP; Reference Point 1; Reference Point 2
  • 10. Tools: Content Suggestion (via Cisco Live Content on-demand library, e.g. 2013 Orlando). BRKARC-2011: Overview of Troubleshooting Tools in Cisco Switches and Routers. Yogesh Ramdoss, Technical Leader, Cisco Services, Cisco; Andy Gossett, Customer Support Engineer, Cisco
  • 11. Tools: Logging built in (NX-OS Value)
     NX-OS is built with extensive, fine-grained logging capabilities.
     NX-OS: high-performance, feature-rich switching plus logging. NX-OS has a built-in flight recorder.
     PI := Platform Independent; PD := Platform Dependent
     Config and action interfaces: Standard CLI, Python/TCL, Python, NxAPI, GUI, OF, SNMP, XML, OnePK, Chef, Puppet
     Engineering CLI: internal keyword output is not documented
  • 12. Tools: show tech (Show tech ABC)
     Always try to use the detailed version, show tech detail: feature event history, states (PSS, ...), HW states
     Always redirect to a file. Always use a separate file per show tech.
     If there is not enough time, try a specific show tech.
     Timeline labels: t0, t1, t2, t3; t0 to t2; trigger; failure. Immediately collect data! Then start troubleshooting.
     show tech all-binary (“project binary logger”): a significant time saver, for use by TAC/BU/ENG. It also avoids “we need show tech A” followed, after a while doing RCA, by “we need show tech B”.
     Labels: Global, Service, VDC-1, Default, Feature
  • 13. Tools: Custom ASICs. Our innovative ASICs provide many counters.
     show hardware internal errors is one of TAC’s favourite commands. Use “all” to look at all modules / ASICs.
     Some “error” counters are part of normal operation (e.g. dropping packets at an ingress trunk if the marked VLAN is not known (CBL drops), diag packets, extra flooded packets).
     Chasing a non-zero counter: 1) Suspicion for abc 2) show hardw int err 3) Send test packets 4) show hardw int err
     GD GMAC := built-in MAC controller
       N7004-Berlin# show hardware internal errors module 3
       |------------------------------------------------------------------------|
       | Device:Clipper MAC Role:MAC Mod: 3 |
       | Last cleared @ Mon Nov 25 21:41:37 2013
       | Device Statistics Category :: ERROR
       |------------------------------------------------------------------------|
       Instance:2
       Cntr  Name                                         Value             Ports
       ----- ----                                         -----             -----
       0     GD GMAC bad character interrupt              0000000000000002  12 -
       1     GD GMAC sequence error interrupt             0000000000000002  12 -
       2     GD GMAC transition from nosync to sync int   0000000000000002  12 -
       3     GD GMAC transition from sync to nosync int   0000000000000001  12 -
       4     PL ingress_cbl_drop                          0000000000003426  12 -
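The four-step routine on this slide (read the counters, send test packets, read them again) amounts to diffing two snapshots of the same output. A minimal Python sketch of that idea follows. It assumes the Value column is a zero-padded decimal number and that rows look like the ones on the slide; the `AFTER` snapshot is invented for illustration and is not from the talk.

```python
import re

# One counter row, e.g. "4 PL ingress_cbl_drop 0000000000003426 12 -".
# Assumption: Value is a 16-digit, zero-padded decimal number.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d{16})\s+(.*?)\s*$")


def parse_counters(text):
    """Map (counter name, ports) -> integer value for each counter row."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            _, name, value, ports = match.groups()
            counters[(name, ports)] = int(value)
    return counters


def increased(before, after):
    """Return {key: delta} for counters that grew between two snapshots."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] > before.get(key, 0)
    }


# Rows copied from the slide.
BEFORE = """\
0 GD GMAC bad character interrupt 0000000000000002 12 -
3 GD GMAC transition from sync to nosync int 0000000000000001 12 -
4 PL ingress_cbl_drop 0000000000003426 12 -
"""

# Hypothetical second reading, taken after sending test packets.
AFTER = """\
0 GD GMAC bad character interrupt 0000000000000002 12 -
3 GD GMAC transition from sync to nosync int 0000000000000001 12 -
4 PL ingress_cbl_drop 0000000000003436 12 -
"""

deltas = increased(parse_counters(BEFORE), parse_counters(AFTER))
for (name, ports), delta in deltas.items():
    print(f"{name} (ports {ports}): +{delta}")
```

Only counters that moved between the two readings are reported, which keeps counters that are non-zero for harmless reasons out of the way.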
  • 14. Tools: customizing CLI (Tips & Tricks). If I am only interested in parts of the output, I can ask for just those items. You save time by having less to read.
       N7004-Berlin# show system internal pktmgr interface
       <SNIP>
       Vlan1, ordinal: 38 Hash_type: 1
       SUP-traffic statistics: (sent/received)
       Packets: 2769 / 1896
       Bytes: 1619370 / 241310
       Instant packet rate: 1 pps / 0 pps
       Packet rate limiter (Out/In): 0 pps / 0 pps
       Average packet rates(1min/5min/15min/EWMA):
       Packet statistics:
       Tx: Unicast 1123, Multicast 1641 Broadcast 5

       N7004-Berlin# show system internal pktmgr interface |in or|I
       <SNIP>
       Vlan1, ordinal: 38 Hash_type: 1
       Instant packet rate: 0 pps / 0 pps
       Packet rate limiter (Out/In): 0 pps / 0 pps
       port-channel100, ordinal: 72 Hash_type: 1
       Instant packet rate: 1 pps / 1 pps
       Packet rate limiter (Out/In): 0 pps / 0 pps
     Output filters listed by Nexus# sh ver | ? : egrep (Egrep), grep (Grep), head (Displ 1st ln), last (Displ last), less (Filter), no-more, sed, wc (Count), begin (Begin with), count (Count), exclude (Exclude ln), include (Include ln)
     “Real-time filter”: N7004-Berlin# sh processes cpu sort | ex 0.0
  • 15. Tools (Scripts) in NX-OS: System Check (systemcheck), Packet Capture (elame), Event Time Analysis (logw). Releases shown: 6.2(2), 6.2(6), 6.2(8). (Thank you: Adam, Francesco, Dmitry, …)
  • 16. Tools: System Check (script)
     Information source: live device (Nexus 7000 Series) or offline (show tech-support)
     Goal: identify the top x platform issues in one pass
     Time saving vs. traditional approach: 30-40 minutes
     Checks: hardware health, failing diagnostics, error interrupts; control plane overload (inband, CPU, IPC, process, network stability); resource issues (CPU, memory, forwarding resources); data plane issues (drops, errors), statistical analysis
     Option “-v” shows the CLI commands used by systemcheck
  • 17. Tools: System Check (Example). Output sections: modules, diagnostics, HW exceptions; HW internal counters; CPU, process crashes, service restarts; Memory; IPC/MTS; HW Limit Checks.
       N7K# source sys/systemcheck.py
       *** modules, diagnostics, HW exceptions ***
       module: 8 (N7K-M132XP-12L) state: ok, FSM state: LCM_MOD_ST_LC_POWERED_UP/LCM_LC_ST_ONLINE
       recent HW exceptions:
       2013-07-01 15:28:17 System Manager:0x401e008a Service on linecard had a hap-reset
       2013-06-28 10:33:08 System Manager:0x401e008a Service on linecard had a hap-reset
       59 HW exceptions before last reload
       *** HW internal counters ***
       active slots ['1', '2', '4', '7', '8', '9', 'sup']
       processing data for slot 1
       unique error types: 8
       freq / cumulative amount / error
       0   12          PL ingress_rx_diag_0_drops
       0   15          IB ingress_ib_de_and_pl_drop (small cnt)
       0   1           IB INT DE packet drop (cr_type = 0, all fpoe = 0)
       10  675048      EB egress credited pkt drops
       10  1           IB INT DE packet drop
       27  1           PL ingress_rx_err
       40  3326668571  PL egress_cbl_drop
       40  21          PL ingress_cbl_drop
     08:10
  • 18. Tools: Logging (Location: logfile, nvram, onboard)
     Logfile: syslog messages; rotated once it reaches 10 MB
     NVRAM: high-severity messages (SEV 1 or 2)
     On-board: major state changes, MTS transactions; useful for module troubleshooting
     Wraps quickly
       N7004-Berlin# show logging nvram
       2013 Nov 9 23:03:25 N7004-Berlin %$ VDC-2 %$ %L2FM-2-L2FM_CFS_SEND_FAILED: cfs send failed, num 1
     -2- := severity 2 := Critical
       N7004-Berlin# show file logflash:log/messages
       2008 Jan 2 19:24:21 %MODULE-5-ACTIVE_SUP_OK: Supervisor 6 is active (serial: JAFxxxxxxxxx)
       2008 Jan 2 19:24:21 %PLATFORM-5-MOD_STATUS: Module 6 current-status is MOD_STATUS_ONLINE/OK
     It is a good idea to synchronize all devices in your network to one time source.
     Possible next step after syslog: look for more information in the event-history of the notifying feature.
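The slide reads the `-2-` in a syslog message as severity 2 (Critical). As a rough illustration, the `%FACILITY-SEVERITY-MNEMONIC:` tag can be split mechanically. This is only a sketch: the regular expression covers the three-part tag seen in these examples, and the severity names are the standard syslog ones rather than anything taken from the talk.

```python
import re

# Matches the "%FACILITY-SEVERITY-MNEMONIC: text" part of a syslog line.
TAG = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+):\s*(.*)")

# Standard syslog severity names (0 is the most severe).
SEVERITY = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notification", 6: "informational", 7: "debugging",
}


def parse_syslog(line):
    """Return facility, severity, mnemonic and text, or None if no tag is found."""
    match = TAG.search(line)
    if match is None:
        return None
    facility, severity, mnemonic, text = match.groups()
    return {
        "facility": facility,
        "severity": int(severity),
        "severity_name": SEVERITY[int(severity)],
        "mnemonic": mnemonic,
        "text": text,
    }


nvram_line = (
    "2013 Nov 9 23:03:25 N7004-Berlin %$ VDC-2 %$ "
    "%L2FM-2-L2FM_CFS_SEND_FAILED: cfs send failed, num 1"
)
parsed = parse_syslog(nvram_line)
print(parsed["facility"], parsed["severity_name"], parsed["mnemonic"])
```

The facility name (here `L2FM`) is also the hint for which feature's event-history to look at next.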
  • 19. Tools: Logwindow (Trigger). A new tool: logwindow. Sources: logfile (10 MB), NVRAM, on-board, event history.
       logw.py [-h] [-v] [-f FILTERS] [-t TRUNCATE] [-n MAX_EVENTS] [-s] start_date start_time duration
       N7K# source logw.py 15/01/2014 12:24:55 100
       starting with empty stats
       stats init done
       Logw system check port version 0.060813
       Time range 2014-01-15 12:24:55 ... 2014-01-15 12:26:35
       Got 343 show ... event-history clis
       244 clis left after pre-filtering
       collecting outputs...done, collected 2602 events in 96.197735 seconds
       sorted
       <snip>
     Tip: show log log immediately displays the logfile output and is faster than show log, which has to read the logging severity settings. Specify a start-time to limit output.
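The logw.py arguments `15/01/2014 12:24:55 100` produce the reported time range 12:24:55 ... 12:26:35, which is consistent with a day/month/year start date and a duration in seconds. The sketch below reproduces that window arithmetic under exactly those two assumptions. It is not the logw.py source, and the function names are made up.

```python
from datetime import datetime, timedelta


def log_window(start_date, start_time, duration):
    """Build (start, end) from logw-style arguments.

    Assumptions: date is DD/MM/YYYY, duration is in seconds.
    """
    start = datetime.strptime(f"{start_date} {start_time}", "%d/%m/%Y %H:%M:%S")
    return start, start + timedelta(seconds=int(duration))


def in_window(timestamp, window):
    """True if an event timestamp falls inside the window (inclusive)."""
    start, end = window
    return start <= timestamp <= end


window = log_window("15/01/2014", "12:24:55", "100")
print(window[0], "...", window[1])
# prints: 2014-01-15 12:24:55 ... 2014-01-15 12:26:35
```

Events collected from the different sources can then be kept or dropped with `in_window` before they are sorted into one timeline.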
  • 20. Tools: Accounting (Audit Recording). (Trigger: logw.py)
     Only configuration commands are captured by default. Enable capture of all commands with terminal log-all (the feature requires NX-OS 5.x or higher).
       N7004-Berlin# show accounting log | last 3
       Mon Dec 2 03:33:05 2013:type=update:id=console0:user=admin:cmd=switchto ; configure terminal ; interface port-channel110 ; shutdown (SUCCESS)
       Mon Dec 2 03:33:08 2013:type=update:id=console0:user=admin:cmd=switchto ; configure terminal ; interface port-channel110 ; no shutdown (REDIRECT)
       Mon Dec 2 03:33:08 2013:type=update:id=console0:user=admin:cmd=switchto ; configure terminal ; interface port-channel110 ; no shutdown (SUCCESS)
       N7004-Berlin(config)# terminal log-all
       N7004-Berlin(config)# show accounting log all | last 2
       Mon Dec 2 03:53:28 2013:type=update:id=console0:user=admin:cmd=switchto ; show accounting log all | last 2 (SUCCESS)
       Mon Dec 2 03:52:11 2013:type=update:id=console0:user=admin:cmd=switchto ; show hardware internal errors all (SUCCESS)
     The accounting log is persistent:
       N7004# dir logflash://sup-active/vdc_1
       20023 Apr 18 11:19:40 2014 accounting_log
       1291 Sep 21 19:26:05 2012 forwarding_debug_data
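For a quick “who changed what, and when” review, the accounting log lines on this slide can be split into fields. The following is an illustrative sketch only: the field layout is inferred from the sample lines above, not from a documented format, and the function name is hypothetical.

```python
import re

# Layout inferred from the sample lines:
# "<timestamp>:type=..:id=..:user=..:cmd=<step ; step ; ...> (<STATUS>)"
ENTRY = re.compile(
    r"^(.+? \d{4}):type=(\w+):id=(\w+):user=(\w+):cmd=(.*) \((\w+)\)$"
)


def parse_accounting(line):
    """Split one accounting log line into a dict, or return None."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    when, kind, session, user, cmd, status = match.groups()
    return {
        "time": when,
        "type": kind,
        "id": session,
        "user": user,
        "steps": [step.strip() for step in cmd.split(" ; ")],
        "status": status,
    }


line = (
    "Mon Dec 2 03:33:05 2013:type=update:id=console0:user=admin:"
    "cmd=switchto ; configure terminal ; interface port-channel110 ; "
    "shutdown (SUCCESS)"
)
entry = parse_accounting(line)
print(entry["time"], entry["user"], entry["steps"][-1], entry["status"])
```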
  • 21. Tools: ELAM & ELAME (ELAM := Embedded Logic Analyzer Module)
     ELAM is an unsupported, internal tool. It is widely used by engineering, QA, TAC and escalation teams.
     ELAM requires a great deal of platform architecture and ASIC knowledge to use, which limits the audience of the raw tool. Identifying the appropriate FE, creating triggers, and interpreting ELAM data for complex flows requires full architectural and forwarding knowledge.
     Good news: ELAME makes ELAM easy to use.
     Chart labels: skill; ELAME; F-Series; M-Series
     08:15
  • 22. Tools: ELAM & ELAME (workflow): 1) Determine the FE 2) Configure the trigger 3) Start ELAM 4) Analyze
     FE: Eureka(M), Lamira(M), Orion(F1), Clipper(F2), Flanker(F3)
     ELAM allows you to verify whether a packet is present and/or to analyze it. ELAME allows you to verify quickly whether a packet is present; especially in a complicated setup it saves you TIME!
     Use cases: 1) Determining the failure domain 2) Analyzing the system behavior
     Example hosts: IP 42.42.42.1, MAC aaaa.bbbb.cccc; IP 42.42.42.12, MAC aaaa.bbbb.dddd
     You MUST know the source and destination MAC/IP pairs involved for troubleshooting. Is the source and/or destination dual-homed? Is the source and/or destination real or virtual?
  • 23. Tools: ELAME, Part 1. Since NX-OS 6.2(2) we include “elame.tcl” in the distribution.
     Topology: Berlin 10.0.2.2/24, Paris 10.0.2.4/24, connected on E 4/1 (M-Series line card). Question: do we receive OSPF packets from our neighbor on E 4/1?
     Because ELAM is complicated, especially on M-Series, this example shows how easy it is to use ELAME. ELAME works on F2 and M-Series line cards with IPv4. You just specify the source and destination address; the tool determines the correct FE to program, even on M-Series modules.
       N7004-Paris# source sys/elame 10.0.2.2 224.0.0.5
       elam helper, version 1.015
       ... source 10.0.2.2, destination 224.0.0.5
       ... getting current vdc ... 4
       ... ingress interface derived from source address
       ... ingress interface list is Ethernet4/1
       ... expanded ingress interface list is Ethernet4/1
       ... FE instance list is 4/1/1
       ... setting trigger...
       ... elam trigger set
       ... starting capture...
       ... elam capture started
       ... no packet captured so far
       press [enter] when packets in question are known to have been sent…
       ... packet captured at FE: 4/1/1
       ... capture instance 4/1/1 (slot/type/instance)
  • 24. Tools: ELAME, Part 3. DBUS and RBUS captured; an easy tool even on M-Series line cards (here N7K-M224). The lines beginning with +++ are the important ones.
       N7004-Paris# source sys/elame 10.0.2.2 224.0.0.5
       <SNIP>
       ... packet captured at FE: 4/1/1
       ... capture instance 4/1/1 (slot/type/instance)
       +++ IPv4 packet: 86 bytes from MAC 4055.390f.5642 / IP 10.0.2.2 to MAC 0100.5e00.0005 / IP 224.0.0.5 TTL 1
       +++ protocol OSPF
       +++ packet received on interface Eth4/1 vlan 0 (source index 0x00030)
       ... rbus: ccc 0x0 cap1 0x1 cap2 0x1 flood 0x1 dest_vlan 0 dest_index 0x00032 l2_fwd 0x0
       +++ packet is flooded to BD 50 / vlan 0
       ... destination index is NOT from L2 table lookup
       +++ copy of the packet is sent to CPU
       ... lamira OFE: rdt 0x0 dest_index 0x010c7 flood 0x0 l2fwd 0x0 ofe_drop 0x0
       +++ lamira OFE exception(s): CPP_LIF (0x200000000)
       ... FE instance 4/1/1 context after analysis: pb2 retried
       ... done
     Diagram labels: Berlin, Paris; E 4/1 LTL 0x30; SUP LTL 0x10C7; Lamira, Eureka; ELAM(E), Ethanalyzer
     Chart labels: skill; ELAME; F-Series; M-Series
     08:20
  • 25. Tools: ELAM. ELAM F2 (Embedded Logic Analyzer Module)
     F2: no PB for ELAM (:= simpler, but the recommendation is still to use ELAME like the pros)
     Clipper: Layer 2 ELAM and/or Layer 3 ELAM. Diagram labels: L2, L3, Clipper FE2, E3/12
     OFE := outgoing “pipeline”; IFE := incoming “pipeline”
       module-3# elam asic clipper instance 2
       module-3(clipper-elam)# layer 3
       module-3(clipper-l3-elam)# trigger dbus ipv4 if source-ipv4-address 42.42.42.142
       module-3(clipper-l3-elam)# trigger rbus ofe if trig
       module-3(clipper-l3-elam)# start
       module-3(clipper-l3-elam)# status
       <SNIP>
     Status: Armed := waiting for the packet
     Status: Triggered := we have captured
  • 26. Tools: ELAM, DBUS. ELAM F2 (Embedded Logic Analyzer Module). Setup: Berlin, source 42.42.42.142 on E 3/12, F-Series line card.
       module-3(clipper-l3-elam)# show dbus
       --------------------------------------------------------------------
       Clipper Instance 02 - Capture Buffer On L3 DBUS:
       <SNIP>
       --------------------------------------------------------------------
       L3 DBUS CONTENT - IPV4 PACKET
       --------------------------------------------------------------------
       <SNIP>
       l2-packet-length : 0x52
       ingress-lif : 0xfca
       vlan-id : 0x2a
       ilm-addr : 0x32
       source-index : 0x402
       destination-index : 0x0
       frame-type : 0x5
       sequence-number : 0x94
       l2-frame-type : 0x0
       l4-protocol : 0x59
       recirc-preserve-acos: 0x0
       recirc-multicast-bridge-disable: 0x0
       ipv4_l4_info_elsewhere_1: 0x0
       ipv4_l4_info_elsewhere_2: 0x0
       destination-mac-address: 0100.5e00.0005
       source-mac-address: 0010.7be8.53b0
       source-ipv4-address: 42.42.42.142
       destination-ipv4-address: 224.0.0.5
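The DBUS fields are printed in hexadecimal, so a small helper that converts them to decimal makes the capture easier to read: `vlan-id 0x2a` is VLAN 42 and `l4-protocol 0x59` is 89, the IP protocol number of OSPF. The sketch below is a hypothetical helper, not part of ELAM, and it only handles `name : 0x...` lines such as those shown above.

```python
import re

# A "name : 0x..." line as printed by show dbus / show rbus.
FIELD = re.compile(r"^\s*([\w-]+)\s*:\s*(0x[0-9a-fA-F]+)\s*$")


def parse_hex_fields(text):
    """Map field name -> integer for every 'name : 0x...' line."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


# A few fields copied from the capture on the slide.
DBUS = """\
l2-packet-length : 0x52
vlan-id : 0x2a
source-index : 0x402
l4-protocol : 0x59
source-ipv4-address: 42.42.42.142
"""

fields = parse_hex_fields(DBUS)
for name, value in fields.items():
    print(f"{name}: {value} ({hex(value)})")
```

Lines that do not carry a hex value, such as the MAC and IP address fields, are simply skipped.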
  • 27. Tools: ELAM, RBUS. ELAM F2 (Embedded Logic Analyzer Module). Setup: Berlin, source 42.42.42.142 on E 3/12, F-Series line card. VID 42 := 0x2a
       module-3(clipper-l3-elam)# show rbus
       --------------------------------------------------------------------
       Clipper Instance 02 - Capture Buffer On L3 RBUS:
       <SNIP>
       --------------------------------------------------------------------
       L3 RBUS OFE CONTENT
       --------------------------------------------------------------------
       OFE valid: 0x1
       trig : 0x1
       l2-l3-acos : 0x0
       <SNIP>
       dvif : 0x0
       vlan : 0x2a
       md-di-valid : 0x0
       redirect : 0x0
       ccc : 0x4
       l2-forward : 0x1
       routed : 0x0
       eid-select : 0x0
       lif-status-enable : 0x1
       bcn-compatible : 0x0
  • 28. Tools: ELAM, RBUS. ELAM F2 (Embedded Logic Analyzer Module). Diagram labels: L2, L3, Clipper FE2, E1/12, egr, ingr
     Since the former example indicated no Layer 3 rewrite, we now look into the Layer 2 ELAM (still looking for Layer 3 information).
       module-3# elam asic clipper instance 2
       module-3(clipper-elam)# layer 2
       module-3(clipper-l2-elam)# trigger dbus ipv4 if destination-ipv4-address 42.42.42.142
       module-3(clipper-l2-elam)# trigger rbus ingress if trig
       module-3(clipper-l2-elam)# show rbus
       <SNIP>
       inner-cos : 0x0
       acos : 0x0
       di-ltl-index : 0x8015
       l3-multicast-di : 0x0
       source-index : 0x402
       vlan-id : 0x2a
       index-direct : 0x0
       eid-sel : 0x0
       vqi : 0xfa
       v5-fpoe-idx : 0xf9
       l3-fpoe-idx : 0x0
       l3-multicast-v5 : 0x0
       dft : 0x0
       dfst : 0x0
     08:25
  • 29. Tools: Ethanalyzer (Reference Point). Similar to NetDR on C6500/7600, but separate from / parallel to the internal processing and path.
     Diagram labels: Multiple CPU Cores, Kernel, NetStack, OSPF, Ethanalyzer, Capture Filter, Display Filter
     http://wiki.wireshark.org/CaptureFilters
     http://wiki.wireshark.org/DisplayFilters
       N7004# ethanalyzer local interface inband decode-internal limit-frame-size 150 display detail
       2013-12-07 15:52:47.446886 Cisco_8b:a0:5a -> PVST+ STP 96 RST. Root = 32768/42/00:0c:30:8b:a0:40 Cost=0 Port=0x8041
       NXOS Protocol
       NXOS VLAN: 42
       NXOS SOURCE INDEX: 1030
       NXOS DEST INDEX: 4295
       Frame 5: 64 bytes on wire (512 bits), 64 bytes captured (512 bits) on if 0
       Arrival Time: Dec 7, 2013 15:52:47.446886000 UTC
       [Protocols in frame: eth:llc:stp]
       IEEE 802.3 Ethernet
       Destination: PVST+ (01:00:0c:cc:cc:cd)
       Spanning Tree Protocol
       Protocol Identifier: Spanning Tree Protocol (0x0000)
       Protocol Version Identifier: Rapid Spanning Tree (2)
       BPDU Type: Rapid/Multiple Spanning Tree (0x02)
       BPDU flags: 0x3c (Forwarding, Learning, Port Role: Designated)
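With `decode-internal`, Ethanalyzer prints the NXOS source and destination index in decimal, while the ELAM and ELAME outputs earlier in the deck use hexadecimal. Converting one to the other makes the values comparable: 4295 is 0x10c7, the same number the ELAME slide labels “SUP LTL 0x10C7”. Treating the two as the same index is an inference from the matching values, not something the slides state. A minimal sketch with a made-up function name:

```python
import re

# "NXOS SOURCE INDEX: 1030" / "NXOS DEST INDEX: 4295" lines from decode-internal.
INDEX = re.compile(r"NXOS (SOURCE|DEST) INDEX:\s*(\d+)")


def indexes_as_hex(text):
    """Return {'source': '0x...', 'dest': '0x...'} for the indexes found."""
    return {kind.lower(): hex(int(value)) for kind, value in INDEX.findall(text)}


capture = """\
NXOS Protocol
NXOS VLAN: 42
NXOS SOURCE INDEX: 1030
NXOS DEST INDEX: 4295
"""

print(indexes_as_hex(capture))
# prints: {'source': '0x406', 'dest': '0x10c7'}
```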
  • 30. Tools: Debugging (Tips & Tricks)
     Event histories are typically enough to diagnose most issues; sometimes, however, debugging may be required. For verbose debugs that can drive up CPU when printed on the terminal, NX-OS can send debug output directly to a file saved in a log directory. After a reload the information is gone!
       N7004-Berlin# debug logfile ALQ-OSPF size 8192
       N7004-Berlin# debug ip ospf all detail
       N7004-Berlin# dir log:
       8192 Jan 04 12:00:03 2014 ALQ-OSPF
       11114 Jan 04 11:51:16 2014 messages
       196 Jan 04 11:47:53 2014 snmp_log
       149595 Jan 04 11:58:07 2014 startupdebug
       N7004-Berlin# show debug logfile ALQ-OSPF
       2014 Jan 4 12:00:16.332218 ospf: 1 [6941] (default) Nbr 10.0.3.5 FSM start: old state FULL, event HELLORCVD
       2014 Jan 4 12:00:16.332240 ospf: 1 [6941] (default) Nbr 10.0.3.5: FULL --> FULL, event HELLORCVD
  • 31. System Troubleshooting
  • 32. System: …my interface (Ethernet IF, E 3/12)
       N7004-Berlin# show int eth 3/13
       Ethernet3/13 is down (SFP not inserted)
       N7004-Berlin# show int eth 3/12
       Ethernet3/12 is up
     The interface could be described as the port ASIC including the MAC controller. Another view would be the software process in the control plane: Ethpm (:= Ethernet Port Manager).
     An up-to-date network drawing helps.
     Diagram labels: Ethpm, VID 1, VID 42, STP, Vlan Mgr
     08:30
  • 33. System: …my interface behaves oddly… (Ethernet IF, E 1/27)
     OK, how about shutting down a port (e.g. e1/27)?
       N7K(config)# interface e1/27
       N7K(config-if)# shut
       N7K# show inter e1/27
       Ethernet1/27 is down (Internal-Fail errDisable, libeventseq: sequence timeout)
     Processes and services depend on each other. Ethpm interacts with each service sequentially (request and response). Diagram labels: Ethpm, Phy_off, 802.1X, PIXM, ACL, QOS, L2FM, STP
     Collect information about the whole environment (e.g. show tech), as you likely don’t know all dependent processes.
  • 34. System: sequence timeout (Reference Slide; Ethernet IF, E 1/27, Ethpm)
     We start here today with Ethpm and look into its event log. Most features use a private event log.
       N7K# sh system internal ethpm event-history errors | grep -B 4 -A 4 net1/27
       <snip>
       23) Event:E_DEBUG, length:141, at 908071 usecs after Thu Feb 7 09:29:35 2013
       [102] ethpm_def_port_seq_step_failure_hdlr(9406): Port: Ethernet1/27 , Sequence No: 4, Sequence Step : 13 ,Error: 0xsequence timeout(408c0008)
       <snip>
       N7K# sh system internal ethpm event-history errors
       MTS_OPC_ETHPM_PORT_PHY_CLEANUP (rr token - 0x25933ec0 sap:221) received
       N7K# sh system internal ethpm event-history msgs | grep -B 4 -A 5 0x25933EC0
       1407) Event:E_MTS_RX, length:60, at 94113 usecs after Thu Feb 7 09:30:08 2013
       [RSP] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X259E2DF3, Ret:SUCCESS
       Src:0x00000505/221, Dst:0x00000505/175, Flags:None
       HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
       --
       1440) Event:E_MTS_TX, length:60, at 974110 usecs after Thu Feb 7 09:28:55 2013
       [REQ] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X25933EC0, Ret:SUCCESS
       Src:0x00000505/175, Dst:0x00000505/221, Flags:None
       HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
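The request ([REQ], E_MTS_TX) and its response ([RSP], E_MTS_RX) share the same RRtoken, so the delay between them can be computed directly from the two event timestamps. A minimal sketch, assuming the event layout shown on this slide; the function name is hypothetical, and the month name is parsed with the default (English) locale:

```python
import re
from datetime import datetime, timedelta

# One MTS event: "... at <usecs> usecs after <Day Mon D HH:MM:SS YYYY>",
# then "[REQ]" or "[RSP]" and, further down, "RRtoken:0x...".
EVENT = re.compile(
    r"at (\d+) usecs after \w{3} (\w{3}\s+\d+ \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"\[(REQ|RSP)\].*?RRtoken:(0x[0-9A-Fa-f]+)",
    re.S,
)


def mts_latency(text):
    """Return {rr_token: response time minus request time} for matched pairs."""
    seen = {}
    for usecs, stamp, kind, token in EVENT.findall(text):
        base = datetime.strptime(" ".join(stamp.split()), "%b %d %H:%M:%S %Y")
        when = base + timedelta(microseconds=int(usecs))
        seen.setdefault(token.lower(), {})[kind] = when
    return {
        token: events["RSP"] - events["REQ"]
        for token, events in seen.items()
        if "REQ" in events and "RSP" in events
    }


# Event-history excerpt copied from the slide.
HISTORY = """\
1407) Event:E_MTS_RX, length:60, at 94113 usecs after Thu Feb 7 09:30:08 2013
[RSP] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X259E2DF3, Ret:SUCCESS
Src:0x00000505/221, Dst:0x00000505/175, Flags:None
HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
--
1440) Event:E_MTS_TX, length:60, at 974110 usecs after Thu Feb 7 09:28:55 2013
[REQ] Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446), Id:0X25933EC0, Ret:SUCCESS
Src:0x00000505/175, Dst:0x00000505/221, Flags:None
HA_SEQNO:0X00000000, RRtoken:0x25933EC0, Sync:UNKNOWN, Payloadsize:34
"""

latency = mts_latency(HISTORY)
for token, delay in latency.items():
    print(token, delay.total_seconds(), "seconds")
```

For this excerpt the response arrives a little over 72 seconds after the request, which fits the sequence timeout reported for Ethernet1/27.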
  • 35. System: sequence timeout (Take Away)
     What we know so far: the SEQ time-out happens during/around PHY port CLEANUP. Someone introduced a delay or received its own request with delay.
     Next steps: check MTS (:= Message Transmission System); check the log (e.g. look for “SAP 221”).
       N7K# show log log
       2013 Feb 7 07:12:33 N7K Feb 7 07:08:44 %KERN-2-SYSTEM_MSG: mts_is_q_space_available_old():1641: regular+fast mesg total = 135287, soft limit = 32768 - kernel
       2013 Feb 7 07:12:33 N7K Feb 7 07:08:44 %KERN-2-SYSTEM_MSG: mts_is_q_space_available_old(): NO SPACE - node=5, sap=221, uuid=410, pid=30121, sap_opt = 0x1, hdr_opt = 0x0, rq=134970(13264613), lq=0(0), pq=317(655986), nq=0(0), sq=0(0), fast: rq=0, lq=0, pq=0, nq=0, sq=0 - kernel
     It was written in the log file. If we had looked into the log file first, we would have saved a lot of time!
     Recover
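The two kernel messages already contain the diagnosis: the queued message total (135287) is far above the soft limit (32768), and the SAP without queue space is 221, the same SAP seen in the Ethpm event history. As a rough sketch, those numbers can be pulled out of the log text as shown below. The patterns follow the two sample messages only, and the function name is invented.

```python
import re


def mts_queue_pressure(log_text):
    """Extract queue totals and the SAP that ran out of space, if present."""
    result = {}
    totals = re.search(r"mesg total = (\d+), soft limit = (\d+)", log_text)
    if totals:
        total, limit = int(totals.group(1)), int(totals.group(2))
        result.update(total=total, soft_limit=limit, over_limit=total > limit)
    no_space = re.search(r"NO SPACE - node=(\d+), sap=(\d+)", log_text)
    if no_space:
        result.update(node=int(no_space.group(1)), sap=int(no_space.group(2)))
    return result


# Shortened copies of the two kernel messages on the slide.
LOG = """\
%KERN-2-SYSTEM_MSG: mts_is_q_space_available_old():1641: regular+fast mesg total = 135287, soft limit = 32768 - kernel
%KERN-2-SYSTEM_MSG: mts_is_q_space_available_old(): NO SPACE - node=5, sap=221, uuid=410, pid=30121
"""

pressure = mts_queue_pressure(LOG)
print(pressure)
```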
  • 36. System: Reducing MTTR (Core Files)
     Collect cores from “all” locations on the active supervisor (don’t forget your standby SUP) and attach them to a TAC case right away. SR 123
       N7004# show cores vdc-all
       VDC Module Instance Process-name PID Date(Year-Month-Day Time)
       --- ------ -------- --------------- -------- -------------------------
       VDC Module Instance Process-name PID Date(Year-Month-Day Time)
       --- ------ -------- --------------- -------- -------------------------
       1 17 1 pixmc 2134 2013-10-28 16:52:48
       1 8 1 pixmc 2134 2013-10-28 16:52:50
       2010 Jul 17 00:30:18 vrt001 %$ VDC-1 %$ %SYSMGR-SLOT8-2-SERVICE_CRASHED: Service "mtm" (PID 1600) hasn't caught signal 6 (core will be saved).
     Here you see “slot 8”: you know the line card, and MTM is a line card process.
       %SYSMGR-2-SERVICE_CRASHED: Service "vpc" (PID 5883) hasn't caught signal 11 (core will be saved)
       %SYSMGR-2-SERVICE_CRASHED: Service "stp" (PID 4668) hasn't caught signal 9 (no core).
     Commands: show cores vdc-all; dir logflash:core; dir logflash://sup-1; dir logflash://sup-2; show process log vdc-all; show process log details
     08:35
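The SERVICE_CRASHED messages carry everything needed for a first triage: the slot (if the process runs on a line card), the service name, the PID, the signal, and whether a core was saved. A small sketch that extracts these fields from the three messages shown on the slide follows. The pattern is derived from those samples only, and the function name is hypothetical.

```python
import re

# Derived from the sample messages; the "-SLOT<n>" part is optional.
CRASH = re.compile(
    r'%SYSMGR(?:-SLOT(\d+))?-2-SERVICE_CRASHED: Service "([^"]+)" '
    r"\(PID (\d+)\) hasn't caught signal (\d+) \((core will be saved|no core)\)"
)


def parse_crashes(log_text):
    """Return one dict per SERVICE_CRASHED message found in the text."""
    crashes = []
    for slot, service, pid, signal, core in CRASH.findall(log_text):
        crashes.append({
            "slot": int(slot) if slot else None,  # None: no slot in the message
            "service": service,
            "pid": int(pid),
            "signal": int(signal),
            "core_saved": core == "core will be saved",
        })
    return crashes


LOG = """\
2010 Jul 17 00:30:18 vrt001 %$ VDC-1 %$ %SYSMGR-SLOT8-2-SERVICE_CRASHED: Service "mtm" (PID 1600) hasn't caught signal 6 (core will be saved).
%SYSMGR-2-SERVICE_CRASHED: Service "vpc" (PID 5883) hasn't caught signal 11 (core will be saved)
%SYSMGR-2-SERVICE_CRASHED: Service "stp" (PID 4668) hasn't caught signal 9 (no core).
"""

crashes = parse_crashes(LOG)
for crash in crashes:
    print(crash)
```

Entries with `core_saved` set tell you which core files to go and collect before opening the TAC case.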
  • 37. Ethernet IF (System: Layer 1). Kuddewörde, E 3/25, 10G fiber. "My connected device is not working; I suspect a Layer 1 challenge. It works fine when connected to the 2nd switch." BAD: those counters usually indicate bad transceivers or fibers. In this case SDP timed out on the uplinks:
    [1671]11/05/2013 06:30:01.337185: sdp_rx_timeout: Sdp instance timed out. ifindex=1a018000. last pkt received at 11/05/2013 06:28:10.512373.
    [1672]11/05/2013 06:30:01.337198: fport [0x1a018000]:satmgr_fport_fsm: even:t Timeout. curr state: Active
    [1673]11/05/2013 06:30:01.337216: fport [0x1a018000]:Log - SDP timed out
    N7004-Berlin# show hardware internal errors module 3
    Instance:6
    ID    Name                                        Value            Ports
    2189  GD XGMAC rx code violation interrupt        0000000000000001 25 -
    2190  GD XGMAC rx code error interrupt            0000000000327284 25 -
    2194  GD XGMAC bad to good link change interrupt  0000000000879210 25 -
    2195  GD XGMAC good to bad link change interrupt  0000000000007181 25 -
    12327 PL ingress_cbl_drop                         0000000000032388 25 -
    12328 PL egress_cbl_drop                          0000000000003889 25 -
  • 38. Statistics (System: Layer 1; reference slide). Suspicious counters for bad transceivers/fibers (highlighted in yellow on the slide). Use "clear statistics module-all device all" and run several times to identify increasing counters.
    2050 GD Received short frames with bad CRC (RUNT)   0000000000000000 1 -
    2051 GD Rx bad CRC frames, excluding RUNT/JABBER    0000000000000000 1 -
    2052 GD Rx protocol error count                     0000000000000000 1 -
    2054 GD Rx frame drop count                         0000000000000000 1 -
    2096 GD Received oversized frames w/ bad CRC        0000000000000000 1 -
    2188 GD XGMAC rx CRC error interrupt                0000000000000000 1 -
    2189 GD XGMAC rx code violation interrupt           0000000000000000 1 -
    2190 GD XGMAC rx code error interrupt               0000000000000000 1 -
    2191 GD XGMAC rx IPG violation interrupt            0000000000000000 1 -
    2196 GD GMAC rx_config_word change interrupt        0000000000000000 1 -
    2197 GD GMAC loss of sync interrupt                 0000000000000000 1 -
    2200 GD GMAC rx CRC error interrupt                 0000000000000000 1 -
    2228 GD Received frame with CRC error interrupt     0000000000000000 1
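Running the counter display several times and looking for increasing values can be scripted. The sketch below, in Python, compares two snapshots of "ID Name Value Ports" rows. It assumes the 16-digit Value column is hexadecimal (other slides in this deck show hex digits in that column); the first snapshot reuses two rows from slide 37, and the second snapshot is invented purely for illustration. The helper names `parse` and `increasing` are hypothetical.

```python
import re

# One counter row: ID, name, 16-digit value (assumed hex), two port columns.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+([0-9a-fA-F]{16})\s+(\S+)\s+(\S+)\s*$")

def parse(text):
    """Map counter ID -> (name, integer value) for every row that matches."""
    counters = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            counters[int(m.group(1))] = (m.group(2), int(m.group(3), 16))
    return counters

def increasing(before, after):
    """Return counter ID -> (name, delta) for counters that grew between snapshots."""
    grown = {}
    for cid, (name, value) in after.items():
        old = before.get(cid, (name, 0))[1]
        if value > old:
            grown[cid] = (name, value - old)
    return grown

BEFORE = """
2189 GD XGMAC rx code violation interrupt 0000000000000001 25 -
2190 GD XGMAC rx code error interrupt 0000000000327284 25 -
"""

# Hypothetical second snapshot, not taken from the slides.
AFTER = """
2189 GD XGMAC rx code violation interrupt 0000000000000001 25 -
2190 GD XGMAC rx code error interrupt 0000000000327290 25 -
"""

print(increasing(parse(BEFORE), parse(AFTER)))
```

Only counters that changed between the two runs are reported, which is the signal the slide asks for.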
  • 39. Chapter 2: Data-Plane Layer 2 MAC Table L2FM PIXM STP
  • 40. Failure Domain (Data-Plane: Failure Domain). "I am losing packets between A and B! How can I quickly determine where?" 100% traffic loss: table not programmed; wrongly programmed; inconsistency. X% traffic loss: congestion? Periodic? Timer/aging event (e.g. MAC table). Determine the failure domain quickly. [Diagram: ELAME shown at several points on the path between A and B]
  • 41. Troubleshooting congestion (Data-Plane: Architecture). For unicast: at the ingress forwarding engine. Multicast replication occurs at the egress line card. [Diagram labels: F-Series (ingress), M-Series (egress); ingress module = first stage; fabric module (Xbar); egress module = third/last stage; EARL 8, SoC] Suggestion: BRKARC-3470, Cisco Nexus 7000 Hardware Architecture.
  • 42. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Troubleshooting N7009-Lagos# show hardware internal errors all |------------------------------------------------------------------------| | Device:Sacramento Xbar ASIC Role:FABRIC Mod: 9 | | Last cleared @ Fri Nov 15 02:19:12 2013 | Device Statistics Category :: ERROR |------------------------------------------------------------------------| Instance:0 ID Name Value Ports -- ---- ----- ----- 2129 FB09-P21 LOW_BP_CNT_IN 0000000000000099 1-48 I1-2 |------------------------------------------------------------------------| | Device:Clipper XBAR Role:QUE Mod: 9 | | Last cleared @ Fri Nov 15 05:18:38 2013 | Device Statistics Category :: CONGESTION |------------------------------------------------------------------------| Instance:0 ID Name Value Ports -- ---- ----- ----- 132 VQ credited pkt replica VOQ tail drops 0000000000000189 1-4 - 137 VQ credited pkt replica drop count 0000000000000189 1-4 - 9602 VQ VQI 204 CCOS 3 drop count 0000000000000189 1-4 - Clipper Sacramento BP := Backpressure System FPGA Version on FAB2 needs to be PM 0.007 for SUP-2/2E Q Verify our System status before troubleshooting Data-Plane Congestion 46
  • 43. LC Families (Data-Plane: LC Architecture). EARL-based line cards: M-Series (:= M1, M2); M2: 2 x per LC; EARL 8: up to 60 mpps (L2, L3). SoC-based line cards: F-Series (:= F1, F2, F3); SoC e.g. F2E Clipper: up to 60 mpps per SoC. SoCs per line card: F1: 16 x SoC; F2/F2E: 12 x SoC; F3 N7K (and all 1G/10G): 6 x SoC; F3 N77: 12 x SoC. Legend: Q := Queuing Engine, R := Replication Engine, P := Port ASIC, FE := Forwarding Engine. [Diagram labels: Fabric ASIC, EARL 8, P, R, Q, FE]
  • 44. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Forwarding Similar: show platform hardware capacity forwarding on C6K N7004-Berlin# show hardware internal forwarding engine usage slot 4 Forwarding Engine Usage ----------------------- Module inst pps peak pps 4 1 0 4 @Tue Nov 26 20:17:33 2013 N7004-Berlin# show hardware internal statistics module 3 rates Hardware statistics on module 03: + ============================= + Clipper MAC Instance 0 + ============================= |-- Ingress IN | |--- Packets/sec | | |--- 2: 0 | | |--- 1: 0 | | |--- 3: 0 | | |--- 4: 0 | | |--- sum: 0 | |--- Bytes/sec | | |--- 2: 3 <SNIP> |-- Egress OUT | |--- Packets/sec | | |--- 2: 0 | | |--- 1: 0 | | |--- 3: 0 | | |--- 4: 0 | | |--- sum: 0 | |--- Bytes/sec | | |--- 2: 3 This command works for M-Series line cards This command works for F-Series line cards FE 0 E 3/1 vPC PKA E 3/2 & 3/3 vPC PL Module 3: F2 Data-Plane FWD Engine Performance 48 [N7004-Berlin# show forwarding internal errors]
  • 45. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public LC Internals module-1# show hardware internal dev-port-map -------------------------------------------------------------- CARD_TYPE: 12 port 100G >Front Panel ports:12 -------------------------------------------------------------- Device name Dev role Abbr num_inst: -------------------------------------------------------------- > Flanker Eth Mac Driver DEV_ETHERNET_MAC MAC_0 12 > Flanker Fwd Driver DEV_LAYER_2_LOOKUP L2LKP 12 > Flanker Xbar Driver DEV_XBAR_INTF XBAR_INTF 12 > Flanker Queue Driver DEV_QUEUEING QUEUE 12 > Sacramento Xbar ASIC DEV_SWITCH_FABRIC SWICHF 2 > Flanker L3 Driver DEV_LAYER_3_LOOKUP L3LKP 12 > EDC DEV_UNDEFINED PHYS 12 +-----------------------------------------------------------------------+ +----------------+++FRONT PANEL PORT TO ASIC INSTANCE MAP+++------------+ +-----------------------------------------------------------------------+ FP port | PHYS | MAC_0 | L2LKP | L3LKP | QUEUE |SWICHF 1 0 0 0 0 0,1 2 1 1 1 1 0,1 3 2 2 2 2 0,1 4 3 3 3 3 0,1 5 4 4 4 4 0,1 <SNIP> EDC0 EDC1 Flanker 0 Flanker 1 SAC0 SAC1 000c.308b.a040 Data-Plane Line Card Components 49
  • 46. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Layer 2 Berlin PO 110 MAC Address Table (16K, 64K, or 128K) MAC Address Table 000c.308b.a040 Sync via CFS Data-Plane L2 HW learning N7004-Berlin# show mac address-table vlan 1 Legend: * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC age - seconds since last seen,+ - primary entry using vPC Peer-Link, (T) - True, (F) - False VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID ---------+-----------------+--------+---------+------+----+------------------ G 1 0000.0c9f.f001 static - F F sup-eth1(R) G 1 4055.390f.5642 static - F F sup-eth1(R) * 1 4055.390f.5643 static - F F vPC Peer-Link * 1 000c.308b.a040 dynamic 0 F F Po110 N7004-Berlin# show hardware internal forwarding f2 l2 table utilization L2 entries: Module inst total used mcast ucast lines lines_full 3 0 16384 15 0 15 512 0 N7004-Berlin# show hardware internal forwarding l2 table utilization L2 entries: Module inst total used mcast ucast lines lines_full 4 1 131072 22 8 14 8192 0 50 08:45
  • 47. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Layer 2 MAC A MAC Index Flag A PO1 PI_E C 3/3 MAC Index Flag A PO1 PI_E C 3/3 MAC Index Flag A PO1 C 3/3 PI_E MAC C E 1/1 E 2/2 E 3/3 Line Card 1 Line Card 2 Line Card 3 PO1 L2FM show mac address-table … show hardware mac address-table … Learning and Aging optimized for physical and logical ports (:= PC Port Channel) with additional signaling via L2FM L2FM Data-Plane Learning & Moves N7004-Berlin(config)# logging level l2fm 6 2013 Dec 17 02:52:46 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: Mac f0de.f1f2.c804 in vlan 42 has moved from Eth3/37 to Eth3/41 2013 Dec 17 02:53:00 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: Mac f0de.f1f2.c804 in vlan 42 has moved from Eth3/41 to Eth3/37 51
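The L2FM_MAC_MOVE messages on this slide lend themselves to simple counting, which helps to spot a MAC address flapping between two ports. A minimal Python sketch follows, assuming syslog lines in exactly the format shown above; `count_mac_moves` is a hypothetical helper name.

```python
import re
from collections import Counter

MOVE_RE = re.compile(
    r"%L2FM-4-L2FM_MAC_MOVE: Mac (?P<mac>[0-9a-f.]+) in vlan (?P<vlan>\d+) "
    r"has moved from (?P<src>\S+) to (?P<dst>\S+)"
)

def count_mac_moves(lines):
    """Count MAC move messages per (MAC address, VLAN)."""
    moves = Counter()
    for line in lines:
        m = MOVE_RE.search(line)
        if m:
            moves[(m.group("mac"), int(m.group("vlan")))] += 1
    return moves

# The two messages from the slide.
SAMPLE = [
    "2013 Dec 17 02:52:46 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: "
    "Mac f0de.f1f2.c804 in vlan 42 has moved from Eth3/37 to Eth3/41",
    "2013 Dec 17 02:53:00 N7004-London %$ VDC-3 %$ %L2FM-4-L2FM_MAC_MOVE: "
    "Mac f0de.f1f2.c804 in vlan 42 has moved from Eth3/41 to Eth3/37",
]

for (mac, vlan), n in count_mac_moves(SAMPLE).items():
    print(f"{mac} vlan {vlan}: {n} moves")
```

A MAC that moves back and forth between the same two ports within seconds, as here, is a typical hint to check for a loop or a misbehaving teaming configuration.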
  • 48. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Layer 2 L2FM Looking back in time for a specific MAC Address 12 3 6 N7004-London# show interface snmp-ifindex |i 1a124000 Eth3/37 !Port 437403648 !IFMIB (0x1a124000) !IFINDEX N7004-London(config)# show system int l2fm l2dbg macdb address f0de.f1f2.c804 Legend Db: 0-MACDB, 1-GWMACDB, 2-SMACDB, 3-RMDB, 4-SECMACDB Src: 0-UNKNOWN, 1-L2FM, 2-PEER, 3-LC, 4-HSRP 5-GLBP, 6-VRRP, 7-STP, 8-DOTX, 9-PSEC 10-CLI 11-PVLAN 12-ETHPM, 13-ALW_LRN, 14-Non_PI_MOD, 15-MCT_DOWN, 16 - SDB 17-OTV, 18-Deounce Timer, 19-AM, 20-PCM_DOWN, 21-MCT_UP, 22-L2VPN Slot:0 based for LCS 19-MCEC 20-OTV/ORIB VLAN: 42 MAC: f0de.f1f2.c804 Time If/swid Db Op Src Slot FE Sat Dec 14 22:18:20 2013 0x1a124000 0 INSERT 3 2 9 Sat Dec 14 22:18:20 2013 0x1a124000 0 RESET_LL_UNDERWAY 2 0 15 Sat Dec 14 22:18:51 2013 0x1a124000 0 NON_PI_MOD 3 2 15 Sat Dec 14 22:18:51 2013 0x1a124000 0 NON_PI_MOD 3 2 15 Sat Dec 14 22:18:51 2013 0x1a124000 0 NON_PI_MOD 3 2 15 Sat Dec 14 22:19:31 2013 0x1a124000 0 FLUSH 12 0 15 Sat Dec 14 22:19:31 2013 0x1a124000 0 DELETE 0 0 15 Sat Dec 14 22:19:36 2013 0x1a128000 0 INSERT 3 2 10 Data-Plane MAC History 52
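The MAC history shows the interface as a hexadecimal ifindex (0x1a124000), while "show interface snmp-ifindex" also lists the decimal IFMIB value (437403648). The conversion is plain base-16 arithmetic; here is a small Python sketch with hypothetical helper names:

```python
def ifindex_hex_to_dec(hex_str):
    """Convert a hex ifindex such as '0x1a124000' to its decimal value."""
    return int(hex_str, 16)

def ifindex_dec_to_hex(value):
    """Convert a decimal ifindex to the 8-digit hex form used in the l2fm output."""
    return f"0x{value:08x}"

print(ifindex_hex_to_dec("0x1a124000"))   # decimal IFMIB value of Eth3/37 on the slide
print(ifindex_dec_to_hex(437403648))      # hex ifindex of Eth3/37 on the slide
```

With this, an If/swid value from the history table can be matched against the snmp-ifindex listing without filtering on the device.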
  • 49. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Layer 2 LTL := Local Target Logic (e.g. Source Index (SI) and Destination Index (DI) e.g. 0x00402) BD := Bridge Domain E 3/1 Internal Header added by PORT ASIC or SoC (FE) Ingress L2 Logic learns MAC Address in HW (M & F-Series) Header Packet DI = 402h VLAN, ... Internal Header contains SI, DI, VLAN SI = BAh 402h Org Packet We add an internal header to carry needed information (e.g. Index, VLAN) + removed Packet N7004-Berlin# show hardware mac address-table 3 address 000c.308b.a040 !reformatted! FE | Valid| PI| BD | MAC | Index| Stat| SW | Modi| Age| Tmr| GM| ---+------+---+------+---------------+-------+-----+-----+-----+----+----+--- 0 1 0 17 000c.308b.a040 0x00402 0 0x009 0 121 1 0 2 1 1 17 000c.308b.a040 0x00402 0 0x009 0 121 1 0 Data-Plane Internal Header 53
  • 50. Layer 2 (Data-Plane: Index). An interface is assigned one or more index values (here PO110: 0402h and 8011h). Internally we use the concept of bridge domains (BD), which map to VLAN IDs (BD - VLAN, VDC 2: 17 := 1).
    N7004-Berlin# show system internal pixm info ltl 0x00402
    PC_TYPE PORT  LTL    RES_ID     LTL_FLAG   CB_FLAG    MEMB_CNT
    Normal  Po110 0x0402 0x1600006d 0x00000000 0x00000002 1
    Member  rbh        rbh_cnt
    Eth3/12 0x000000ff 0x08
    CBL Check States: Ingress: Enabled; Egress: Enabled
    VLAN | BD   | BD-St            | CBL St & Direction
    1    | 0x11 | INCLUDE_IF_IN_BD | FORWARDING (Both)
    [Annotation: STP ingress/egress]
    Member info:
    Type         LTL
    PORT_CHANNEL Po110
    FLOOD_W_FPOE 0x8011
    How to convert a BD (in decimal) to a VLAN ID (11h = 17):
    N7004-Berlin# show vlan internal bd-info bd-to-vlan 17
    VDC Id  BD Id  Vlan Id
    2       17     1
  • 51. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public PIXM 000bh E 3/12 0402h 8011h PO110 10C7h 10C8h SUP LTL setup (here) for SUP-2 and NX-OS 6.2(5.41) N7004-London# show system internal pixm info ltl-region =========================================================== PIXM VDC 1 LTL MAP Version: 2 Description: LTL Map for N7K SUP2 Silverstone (all flavors) =========================================================== LTL_TYPE SIZE START END ======================================================================== LIBLTLMAP_LTL_TYPE_PHY_PORT 1024 0x0 0x3ff LIBLTLMAP_LTL_TYPE_PC 3204 0x400 0x1083 LIBLTLMAP_LTL_TYPE_SUP_FUTURE 67 0x1084 0x10c6 LIBLTLMAP_LTL_TYPE_SUP_ETH_INBAND 64 0x10c7 0x1106 ------------------------------------------------------------------- SUB-TYPE LTL ------------------------------------------------------------------- LIBLTLMAP_LTL_TYPE_SUP_INBAND_HQ 0x10c7 LIBLTLMAP_LTL_TYPE_SUP_INBAND_LQ 0x10c8 <SNIP> Data-Plane Port Index Manager 55 08:50
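The LTL map above can be used to classify any index seen in the MAC table or in PIXM output. A minimal Python sketch follows, using only the four regions visible before the <SNIP> (so anything outside them, such as the flood LTL 0x8011 from slide 50, is reported as unknown). The values apply to the SUP-2 / NX-OS 6.2(5.41) setup shown and may differ elsewhere; `ltl_region` is a hypothetical helper name.

```python
# (type, start, end) copied from "show system internal pixm info ltl-region".
LTL_REGIONS = [
    ("LIBLTLMAP_LTL_TYPE_PHY_PORT", 0x0, 0x3FF),
    ("LIBLTLMAP_LTL_TYPE_PC", 0x400, 0x1083),
    ("LIBLTLMAP_LTL_TYPE_SUP_FUTURE", 0x1084, 0x10C6),
    ("LIBLTLMAP_LTL_TYPE_SUP_ETH_INBAND", 0x10C7, 0x1106),
]

def ltl_region(ltl):
    """Return the LTL type whose range contains ltl, or None if it is not listed."""
    for name, start, end in LTL_REGIONS:
        if start <= ltl <= end:
            return name
    return None

for index in (0x000B, 0x0402, 0x10C7, 0x8011):
    print(f"0x{index:04x} -> {ltl_region(index)}")

# Bridge domains are shown in hex (0x11) but queried in decimal (17).
print(int("11", 16))
```

The sizes in the table are consistent with the ranges (for example 0x1083 - 0x400 + 1 = 3204 port-channel indices), which is a quick sanity check when reading such a map.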
  • 52. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public STP STP STP root Config BPDU DP DP := Designated Port RP := Root Port BPDU := Bridge Protocol Data Unit RP TCN BPDU Know your port states in a stable condition (:= before the troubleshooting, prepare yourself) Two BPDU types: Configuration BPDU’s and TCN BPDU’s Tracking Port Role Changes, Root Changes via SYSLOG For vPC with peer switch configuration both devices are sending BPDUs as root. NX-OS 4.2(6), 5.0(2a) Data-Plane STP logging level spanning-tree 6 %STP-6-PORT_ROLE: Port Ethernet2/1 instance VLAN0001 role changed to designate 56
  • 53. STP (Data-Plane: STP). Symptoms of a data loop: high link utilization (100%); high CPU and fabric traffic utilization; constant MAC address re-learning and flapping; excessive output drops on an interface. Verify each switch on the redundant path: someone who is supposed to block is forwarding... No loop in my lab today. In the real world we see loops created by blade servers, teaming NICs and hypervisors (:= virtual switches).
    N7004-Berlin# show interface e 3/7 | i rate
    30 seconds input rate 24 bits/sec, 0 packets/sec
    30 seconds output rate 304 bits/sec, 0 packets/sec
    300 seconds input rate 104 bits/sec, 0 packets/sec
    300 seconds output rate 424 bits/sec, 0 packets/sec
  • 54. STP (Data-Plane: STP). Systematically verifying the path (Paris, Berlin, Moscow, London; VID 42; E4/17). Tools: STP, pktmgr, Ethanalyzer, ELAME.
    N7004-Berlin# show spanning-tree interface ethernet 3/7 detail
    Port 391 (Ethernet3/7) of VLAN0042 is designated forwarding
    <SNIP>
    BPDU: sent 1972, received 5
    N7004-Paris# show spanning-tree interface ethernet 4/2 detail
    Port 514 (Ethernet4/2) of VLAN0042 is root forwarding
    <SNIP>
    BPDU: sent 5, received 2007
    N7004-Berlin# show system internal pktmgr interface ethernet 3/7
    Ethernet3/7, ordinal: 80 Hash_type: 2
    SUP-traffic statistics: (sent/received)
    Packets: 2217 / 82
    Bytes: 139163 / 17376
    Instant packet rate: 0 pps / 0 pps
    Packet rate limiter (Out/In): 0 pps / 0 pps
    Average packet rates(1min/5min/15min/EWMA):
    Packet statistics:
    Tx: Unicast 0, Multicast 2217
    <SNIP>
  • 55. STP (Data-Plane: STP). What is our STP role? Are we stable? Were TCNs sent or received? If yes, through which interface did we receive the last TCN? In case of an access port, enable port-fast.
    N7004-London(config-if)# show spanning-tree vlan 1 detail
    VLAN0001 is executing the rstp compatible Spanning Tree protocol
    Bridge Identifier has priority 32768, sysid 1, address 4055.390f.5643
    Configured hello time 2, max age 20, forward delay 15
    Current root has priority 32769, address 000c.308b.a040
    Root port is 4195 (port-channel100), cost of root path is 2
    Topology change flag not set, detected flag not set
    Number of topology changes 2 last change occurred 0:15:50 ago from port-channel100
    <SNIP>
    N7004-London(config-if)# spanning-tree port type edge
    Warning: Edge port type (portfast) should only be enabled on ports connected to a single host. Connecting hubs, concentrators, switches, bridges, etc... to this interface when edge port type (portfast) is enabled, can cause temporary bridging loops. Use with CAUTION
  • 56. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public STP STP Looking back in time for STP: Event-History 800886 us – 795697 us = 5189 us ~ 5.2 ms 12 3 6 Data-Plane STP event history N7004-London(config-if)# sh spanning-tree internal event-history tree 1 interface port-channel 110 VDC03 VLAN0001 <port-channel110> 0) Transition at 795697 usecs after Sat Dec 14 21:20:53 2013 State: DIS Role: Unkw Age: 0 Inc: no [STP_PORT_EV_UP] <SNIP> 5) Transition at 800886 usecs after Sat Dec 14 21:20:53 2013 State: FWD Role: Root Age: 0 Inc: no [STP_PORT_ROLE_CHANGE] 60 08:55
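The slide computes the time between two event-history transitions by hand (800886 us - 795697 us = 5189 us, about 5.2 ms). The same arithmetic in Python, as a sketch that also handles transitions falling into different seconds; it assumes the "N usecs after <date>" format shown above and an English locale for the day and month names, and `transition_time` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

def transition_time(usecs, stamp):
    """Combine 'N usecs after <stamp>' into one absolute timestamp."""
    base = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return base + timedelta(microseconds=usecs)

# Transitions 0) and 5) from the event history on the slide.
port_up = transition_time(795697, "Sat Dec 14 21:20:53 2013")
forwarding = transition_time(800886, "Sat Dec 14 21:20:53 2013")

delta_us = (forwarding - port_up) // timedelta(microseconds=1)
print(f"{delta_us} us = {delta_us / 1000:.1f} ms")
```

Working with absolute timestamps avoids mistakes when the microsecond counter wraps into the next second between two transitions.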
  • 57. Chapter 3: Data-Plane Layer 3 uRIB LC SPAN
  • 58. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public 3 Areas to verify FIB Manager uFDM uRIB OSPF route adj IS-IS RIP IP BGP mRIB • RIB fully resolved and used for packets originated by the control plane Is control plane state as expected (route exists, points to expected next hop)? Is control plane stable? Is control plane consistent with data plane (route programmed in forwarding plane, consistent with control plane)? Data-Plane Control-Plane Forwarding Hardware • Neighbor management • Protocol database • Add/Delete prefixes • Translate routes to hardware format • Program hardware forwarding engine • Push routes to platform • Route download Control-Plane Data-Plane Data-Plane Unicast Routing Architecture 62
  • 59. L3 (Data-Plane: Unicast Control). Paris 42.42.42.4; ip ospf-42; 42.42.42.142; 11.0.0.1/32; VID = 42. Assumption: the control plane is stable and OSPF receives LSAs; we look at the flow of information from OSPF to hardware (OSPF-42, SAP 320, uRIB: route, adj).
    N7004-Paris# show ip ospf 42 internal txlist urib
    ospf 42
    ospf process tag 42
    ospf process instance number 1
    ospf process uuid 1090519321
    ospf process linux pid 7746
    <SNIP>
    OSPFv2->URIB transmit list: version 0x10
    N7004-Paris# show processes cpu sort |i PID|7746
    PID  Runtime(ms) Invoked uSecs 5Sec  1Min  5Min  TTY Process
    7746 10450       502752  0     0.00% 0.01% 0.01% -   ospf
    13: 42.42.42.0/24; 14: 11.0.0.1/32; 15: 10.0.2.0/24; 16: 10.0.4.0/24; 16: RIB marker
  • 60. L3 (Data-Plane: Unicast Control). OSPF routes in uRIB: administrative distance assigned; (D) route is directly attached; (R) route is in RIB. Is there a route to the destination? Do we have a resolved Layer 2 address?
    N7004-Paris# sh ip ospf 42 route
    <SNIP>
    11.0.0.1/32 (inter)(R) area 0.0.0.0
      via 42.42.42.142/Vlan42 , cost 41 distance 110
    N7004-Paris# show ip route ospf-42 detail
    <SNIP>
    255.255.255.255/32, ubest/mbest: 1/0
      *via sup-eth1, [0/0], 01:59:22, broadcast
    11.0.0.1/32, ubest/mbest: 1/0
      *via 42.42.42.142, Vlan42, [110/41], 01:57:18, ospf-42, inter
    N7004-Paris# sh ip arp 42.42.42.142
    <SNIP>
    IP ARP Table
    Total number of entries: 1
    Address      Age      MAC Address    Interface
    42.42.42.142 00:03:39 0010.7be8.53b0 Vlan42
  • 61. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public uFDM uRIB client route adj L3 Forwarding Hardware FIB Manager Verifying on the ingress line card N7004-Paris# show forwarding ipv4 route 11.0.0.1 module 4 IPv4 routes for table default/base ------------------+------------------+----------------------+----------------- Prefix | Next-hop | Interface | Labels ------------------+------------------+----------------------+----------------- 11.0.0.1/32 42.42.42.142 Vlan42 N7004-Paris# show forwarding adjacency 42.42.42.142 module 4 IPv4 adjacency information next-hop rewrite info interface -------------- --------------- ------------- 42.42.42.142 0010.7be8.53b0 Vlan42 N7004-Paris# show ip arp 42.42.42.142 Address Age MAC Address Interface 42.42.42.142 00:08:56 0010.7be8.53b0 Vlan42 Is adjacency consistent with ARP In the control plane? Hardware forwarding (FIB) information on per-module basis Displays hardware adjacency table information Data-Plane Layer 3 Unicast 09:00 70
  • 62. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public uFDM uRIB client route adj L3 Forwarding Hardware FIB Manager Verifying on the ingress line card N7004-Paris# show system internal forwarding route 11.0.0.1 module 4 detail RPF Flags legend: S - Directly attached route (S_Star) V - RPF valid M - SMAC IP check enabled G - SGT valid E - RPF External table valid 11.0.0.1/32 , Vlan42 , No of paths: 1 Dev: 1 , Idx: 0x2603 , RPF Flags: V , DGT: 0 , VPN: 7 RPF_Intf_5: Vlan42 (0x35 ) AdjIdx: 0xa038 , LIFB: 0 , LIF: Vlan42 (0x35 ), DI: 0x0 DMAC: 0010.7be8.53b0 SMAC: 4055.390f.5644 N7004-Paris# show system internal forwarding adjacency entry 0xa038 module 4 Device: 1 Index: 0xa038 dmac: 0010.7be8.53b0 smac: 0055.390f.5644 e-vpn: 7 e-lif: 0x35 packets: 0 bytes: 0 Data-Plane Layer 3 Unicast 71
  • 63. Verification Location (Data-Plane: Layer 3 Unicast). L2/L3 reachability for multicast and for maximum-size unicast.
    N7004-Paris# ping multicast 224.0.0.5 interface vlan 42
    PING 224.0.0.5 (224.0.0.5): 56 data bytes
    64 bytes from 42.42.42.5: icmp_seq=0 ttl=254 time=0.836 ms
    64 bytes from 42.42.42.5: icmp_seq=1 ttl=254 time=0.685 ms
    64 bytes from 42.42.42.5: icmp_seq=2 ttl=254 time=0.613 ms
    <SNIP>
    64 bytes from 42.42.42.142: icmp_seq=0 ttl=254 time=4.461 ms
    64 bytes from 42.42.42.142: icmp_seq=1 ttl=254 time=5.007 ms
    64 bytes from 42.42.42.142: icmp_seq=2 ttl=254 time=5.771 ms
    <SNIP>
    N7004-Paris# ping 42.42.42.142 packet-size 1472
    PING 42.42.42.142 (42.42.42.142): 1472 data bytes
    1480 bytes from 42.42.42.142: icmp_seq=0 ttl=254 time=5.493 ms
    1480 bytes from 42.42.42.142: icmp_seq=1 ttl=254 time=5.37 ms
    1480 bytes from 42.42.42.142: icmp_seq=2 ttl=254 time=5.337 ms
    <SNIP>
    Why not 1500? 1500 - 20 (IP) - 8 (ICMP) = 1472. Better alternatives: Ethanalyzer, ELAME, Debug. [Diagram labels: OSPF, Debug, Ethanalyzer, ELAME, CoPP, RL, ICMP, Q]
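The packet-size arithmetic on this slide (1500 - 20 - 8 = 1472) generalizes to other MTUs. A minimal Python sketch, assuming an IPv4 header without options (20 bytes) and the 8-byte ICMP echo header; the function names are hypothetical.

```python
IPV4_HEADER = 20   # bytes, IPv4 header without options
ICMP_HEADER = 8    # bytes, ICMP echo header

def max_icmp_payload(mtu=1500, ip_header=IPV4_HEADER):
    """Largest ping payload that fits into one unfragmented IP packet."""
    return mtu - ip_header - ICMP_HEADER

def reported_bytes(payload):
    """Byte count printed in the ping reply lines: payload plus ICMP header."""
    return payload + ICMP_HEADER

print(max_icmp_payload())        # payload for a 1500-byte MTU
print(reported_bytes(1472))      # matches the "1480 bytes from ..." lines
print(reported_bytes(56))        # matches the "64 bytes from ..." lines
```

The same function gives the payload for a jumbo MTU, for example `max_icmp_payload(9216)`, if the path is expected to carry larger frames.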
  • 64. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public 73 Example N7K# test forwarding inconsistency N7K# show forwarding inconsistency IPV4 Consistency check : table_id(0x13) Execution time : 14327 ms () No inconsistent adjacencies. Inconsistent routes: 1. slot(1), vrf(default), prefix (172.31.38.6/32), Route extra in FIB Software 2. slot(1), vrf(default), prefix (172.31.38.2/32), Route extra in FIB Software Test for inconsistency N7K# show ip route 172.18.144.2 IP Route Table for VRF "default" <SNIP> 172.18.144.0/24, ubest/mbest: 1/0 *via 172.31.38.2, [200/0], 1d22h, bgp-65000, internal, tag 64949 N7K# show ip fib route 172.18.144.2 <SNIP> ------------------+------------------+----------------------+-------- Prefix | Next-hop | Interface | Labels ------------------+------------------+----------------------+--------- *172.18.144.0/24 0.0.0.0 Null0 How can we recover? (show forwarding ipv4 route 172.18.144.2 module 1) FIB Manager uRIB route Data-Plane Layer 3 Unicast 73
  • 65. IDS
  • 66. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Security Check This checking drops various ‘illegal’ packets These drops can be also seen in show hardware internal errors but there they might look a bit more cryptic The checks can be disabled via ‘hardware ip verify …’ – in default VDC (for all VDCs) IDS and how do we identify the source or sender? Data-Plane Layer 3 Unicast N7004-Paris# show hardware forwarding ip verify module 4 IPv4 IDS Checks Status Packets Failed -----------------------------+---------+------------------ address source broadcast Enabled 0 address source multicast Enabled 0 address destination zero Enabled 0 address identical Disabled -- address reserved Disabled -- address class-e Disabled -- checksum Enabled 0 protocol Enabled 0 fragment Disabled -- length minimum Enabled 0 length consistent Enabled 0 length maximum max-frag Enabled 0 length maximum udp Disabled -- length maximum max-tcp Enabled 0 tcp flags Disabled -- tcp tiny-frag Enabled 0 version Enabled 0 <SNIP> 09:05 75
  • 67. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Non-zero Counter N7K# show hardware internal errors mod 4 <SNIP> |------------------------------------------------------------------------| | Device:Lamira Role:L3 | |------------------------------------------------------------------------| Instance:0 ID Name Value Ports -- ---- ----- ----- 2 IF IDS check TCP flags verification 0000000002ebfd14 1-48 I1 8 IF IDS check Src or Dest IP is Class E 0000000002ebfd14 1-48 I1 17 CL2 Invalid Pkt count 00000001bf4bbac4 1-48 I1 57 L3 Fib Miss Pkt ctr 0000000079978df2 1-48 I1 How do we verify if IDS dropped packets? How do we identify the source of those packets? Data-Plane Layer 3 Unicast 76
  • 68. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Examples Forwarding Engine Line Card DI := SUP DI := drop Exception Redirect Table SPAN Engine ERSPAN SPAN E 3/37 DI := SUP DI := drop Use inband SPAN - MTU failures - TTL errors - ICMP redirect Use exception SPAN - IP Option fail - IP check - RPF - Unsupported RW N7004-Berlin(config)# monitor session 1 N7004-Berlin(config-monitor)# source interface sup-eth 0 both or N7004-Berlin(config-monitor)# source exception [layer 3|fabricp | other | all] Destination Index := Drop can be changed to SPAN Engine Data-Plane Tools: SPAN 77
  • 69. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public 2+6 (5) sessions: M2. F1-3 and NX-OS 6.2 2+12 Session Model Options Tools SPAN 78 SPAN/ERSPAN is hardware based and distributed (not using resources on the SUP) SPAN (Port or VLAN) RSPAN (Destination) ERSPAN ACL Capture1 Rule Based SPAN (VLAN Filtering) MTU Truncated SPAN Sampling Rate Limit I/O Module Replication Engine 1ACL Capture requires NX-OS 5.2(1) or higher and M-Series line cards N7004-Berlin(config)# monitor session 1 N7004-Berlin(config-monitor)# source interface sup-eth 0 … | eth a/b … | port-channel c … N7004-Berlin(config-monitor)# source vlan d Monitoring Appliance Fabric Egress Line Card(s) switch(config-monitor)# filter frame-type ipv4 src-ip 10.1.1.3/32 tos 3 l4-protocol … 10.1.1.3 cos 3 “Could be called an Application Copy function” Regular Destination Egress Line Card(s)
  • 70. Chapter 4: Control-Plane Inband Inband Concept Trigger CoPP Netstack RL Inband
  • 71. Two Tasks (Architecture: Inband Path). (a) Looking for dropped packets which are targeted at the control plane. (b) Identifying from where/what is being sent from/to the CPU. High CPU due to: punted traffic; ACL processing; control-plane tasks. [Diagram labels: Management Port, 1G, 10G, Multiple CPU Cores, Inband, CoPP, RL, OSPF…, SUP, Line Card, System Controller, Kernel, ELAME, Reference Point 1, Reference Point 2, Forwarding Engine, Ethanalyzer, PID X]
  • 72. L3 Resources (Control-Plane). Are we stable? How long since the route was added? How long since ARP was updated? How long has the adjacency stayed up? Can we find previous incarnations of the adjacency here? Is there a log of recent routing changes (it can be filtered for the prefix in question)?
    N7004-Paris# show ip route 11.0.0.1
    <SNIP>
    11.0.0.1/32, ubest/mbest: 1/0
      *via 42.42.42.142, Vlan42, [110/41], 02:50:20, ospf-42, inter
    N7004-Paris# show ip arp 42.42.42.142
    Address      Age      MAC Address    Interface
    42.42.42.142 00:01:29 0010.7be8.53b0 Vlan42
    N7004-Paris# show ip ospf neighbors
    OSPF Process ID 42 VRF default
    Total number of neighbors: 2
    Neighbor ID Pri State    Up Time  Address      Interface
    42.0.0.5    1   FULL/BDR 02:52:48 42.42.42.5   Vlan42
    200.0.0.10  1   FULL/DR  02:51:44 42.42.42.142 Vlan42
  • 73. OSPF (Control-Plane). An example of a trigger, or why you start looking. Syslog messages report OSPF neighbor failures (192.251.19.22, 40.9.0.0):
    2011 Mar 26 15:38:56.395 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME
    2011 Mar 26 15:38:56.584 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD
    2011 Mar 26 15:39:33.865 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME
    2011 Mar 26 15:39:35.754 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD
    A syslog message [1]:
    %COPP-5-COPP_DROPS5: CoPP drops exceed threshold in class: copp-system-class-critical, check show policy-map interface control-plane for more info.
    Active CoPP monitoring showing drops [2]:
    SITE1-AGG1# show policy-map int control-plane
    SITE1-AGG1# show policy-map int control-plane | i "class|conform|violated"
    <SNIP>
    violated 1799505072 bytes; action: drop
    [1] Needs NX-OS 5.1 or higher; logging drop threshold # level #.
    [2] No "statistics per-entry" available, but: show system internal access-list input entries detail
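Neighbor flaps like the ones in the OSPF syslog messages above can be counted per neighbor and interface to decide where to start looking. A minimal Python sketch follows, assuming %OSPF-5-NBRSTATE lines in the format shown on this slide; `count_down_events` is a hypothetical helper name.

```python
import re
from collections import Counter

NBR_RE = re.compile(
    r"%OSPF-5-NBRSTATE: .*?Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\w+) to (?P<new>\w+), (?P<event>\w+)"
)

def count_down_events(lines):
    """Count transitions to DOWN per (neighbor, interface, triggering event)."""
    downs = Counter()
    for line in lines:
        m = NBR_RE.search(line)
        if m and m.group("new") == "DOWN":
            downs[(m.group("nbr"), m.group("intf"), m.group("event"))] += 1
    return downs

# The four messages from the slide.
SYSLOG = [
    "2011 Mar 26 15:38:56.395 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME",
    "2011 Mar 26 15:38:56.584 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD",
    "2011 Mar 26 15:39:33.865 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from INIT to DOWN, DEADTIME",
    "2011 Mar 26 15:39:35.754 N7K-1-VDC2 %OSPF-5-NBRSTATE: ospf-6467 [3981] "
    "Process 6467, Nbr 192.251.19.22 on Vlan19 from DOWN to INIT, HELLORCVD",
]

for key, n in count_down_events(SYSLOG).items():
    print(key, n)
```

In this sample the neighbor never gets past INIT before the dead timer expires again, which points at hellos not being exchanged in both directions rather than at a database problem.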
  • 74. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Platform Independent Berlin 42.42.42.4 London 42.42.42.5 42.42.42.142 Verifying the neighbors and if needed the adjacency history N7004-Paris# show ip ospf 42 event-history adjacency |i EXCHDONE 2013 Dec 29 17:21:46.927613 ospf 42 [7746]: : Nbr 42.42.42.142: EXCHANGE --> FULL, event EXCHDONE N7004-Paris# show ip ospf 42 neighbors OSPF Process ID 42 VRF default Total number of neighbors: 2 Neighbor ID Pri State Up Time Address Interface 42.0.0.5 1 FULL/BDR 02:16:27 42.42.42.5 Vlan42 200.0.0.10 1 FULL/DR 02:15:24 42.42.42.142 Vlan42 Paris Moscow 10.x.x.x The latest messages appear at the top Control- Plane 83
  • 75. Failure Domain (Control-Plane)
    Determine the failure domain with Ethanalyzer.
    From the process point of view: Do I get enough? Do I get too much?
    Questions along the path: Ingress MAC drops? HWRL drops? CoPP drops? Inband drops or FC? Packet Manager? CPU? MEM?
    Components: Line Card, ELAME, Ethanalyzer, Packet Manager, IPv4/IPv6, ARP/AM, uRIB, OSPF
    Do we receive the packet? Do we receive the packet (e.g. BPDU or LSA) at the CPU?
    We verified on the other side that we are sending LDP, BGP, OSPF, …
    One real-world example follows in chapter 5 (ARP).
  • 76. OSPF (Control-Plane): CPU shows high utilization caused by the OSPF and Netstack processes
    Neighbor 192.251.19.22, network 40.9.0.0; syslog messages report OSPF neighbor failures.
    N7K-1-VDC2# show system resources
      Load average: 1 minute: 2.92 5 minutes: 2.38 15 minutes: 2.27
      Processes : 1267 total, 4 running
      CPU states : 34.0% user, 42.5% kernel, 23.5% idle
      Memory usage: 4115232K total, 3638780K used, 476452K free
    N7K-1-VDC2# show processes cpu sort
      PID   Runtime(ms)  Invoked  uSecs  1Sec   Process
      ----- ----------- -------- ----- ------ -----------
      3981  127          276      462    43.2%  ospf
      3841  267          78       3427   16.4%  netstack
      2941  34146488     7377876  4628   0.9%   platform
      3982  118          245      485    0.9%   ospfv3
    Here two processes, OSPF and NETSTACK, are using most of the resources. How much do they usually use? What does my baseline look like?
    Module-3# show system internal processes cpu
    + statistics per core for SUP-2/SUP-2E, and with newer NX-OS for SUP-1
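The baseline question can be made concrete with a small offline comparison. This is a sketch under stated assumptions: the text layout is the `show system resources` output from the slide, the baseline values and the 20-point margin are invented for illustration, and the function names are hypothetical.

```python
import re


def parse_system_resources(text):
    """Return CPU busy % and memory used % from 'show system resources' text."""
    cpu = re.search(
        r"CPU states\s*:\s*([\d.]+)% user,\s*([\d.]+)% kernel,\s*([\d.]+)% idle",
        text,
    )
    mem = re.search(r"Memory usage:\s*(\d+)K total,\s*(\d+)K used", text)
    if not cpu or not mem:
        raise ValueError("unexpected 'show system resources' format")
    user, kernel, _idle = (float(x) for x in cpu.groups())
    total_k, used_k = (int(x) for x in mem.groups())
    return {
        "cpu_busy_pct": user + kernel,
        "mem_used_pct": 100.0 * used_k / total_k,
    }


def exceeds_baseline(current, baseline, margin_pct=20.0):
    """Flag each metric that is more than margin_pct points above the baseline."""
    return {key: current[key] > baseline[key] + margin_pct for key in current}


# Output copied from the slide.
SAMPLE = """\
Load average: 1 minute: 2.92 5 minutes: 2.38 15 minutes: 2.27
Processes : 1267 total, 4 running
CPU states : 34.0% user, 42.5% kernel, 23.5% idle
Memory usage: 4115232K total, 3638780K used, 476452K free
"""

# Hypothetical baseline, recorded while the network was in a good state.
BASELINE = {"cpu_busy_pct": 25.0, "mem_used_pct": 80.0}

if __name__ == "__main__":
    now = parse_system_resources(SAMPLE)
    print(now, exceeds_baseline(now, BASELINE))
```

Without a recorded baseline the comparison has nothing to compare against, which is the point the slide makes.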
  • 77. Platform Independent (Control-Plane): Having problems with one neighbor or link?
    N7004-Paris# show ip ospf retransmission-list 200.0.0.10 vlan 42
      OSPF Process ID 42 VRF default
      Neighbor 200.0.0.10, interface Vlan42, address 42.42.42.142
      Link state retransmission timer not running
      Type LSID Adv Rtr Seq No Checksum Age
    Checklist for neighbor issues:
      L2/L3 reachability
      Configuration challenges, like OSPF not enabled on the interface
      Interface is defined as passive
      Mismatched subnet mask, timer, area ID, …
  • 78. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public L3 Resources Log of recent routing changes (can filter out prefix in question) Verifying ADJMGR history N7004-Paris# show routing event-history general Dumping: general 2013 Dec 29 17:31:01.280070 urib: Received state change for unknown uuid 0x2c9 2013 Dec 29 17:31:01.101498 urib: Received state change for unknown uuid 0x0 2013 Dec 29 17:31:00.069988 urib: Received state change for unknown uuid 0x2c2 2013 Dec 29 17:30:59.751551 urib: Received state change for unknown uuid 0x0 2013 Dec 29 17:21:52.543826 urib: "ospf-42": 11.0.0.1/32 C: SN=F EC=T NF=T VN=T WM=F BH=F NW=T UP=T 2013 Dec 29 17:21:52.543811 urib: "ospf-42": 11.0.0.1/32, new best path nh 42.42.42.142%Vlan42, metric [110/41] route-type inter tag 0x00000000 2013 Dec 29 17:21:52.543810 urib: "ospf-42": 11.0.0.1/32 B: SN=F EC=F NF=T VN=T WM=F BH=F NW=T UP=T What happened? N7004-Paris# show sys inte adjmgr internal event-history ipc |grep prev 1 42.42.42.142 4) Event:E_DEBUG, length:160, at 503661 usecs after Sun Dec 29 17:21:41 2013 [116] [7586]: Added adjacency entry for 42.42.42.142 (0010.7be8.53b0) on interface Vlan42 (Ethernet4/2)with preference 50 afi 1 mct 0 uuid 268 Mac changed:TRUE <SNIP> Control- Plane 87
  • 79. CPU (Control-Plane): output of show processes from all VDCs; multiple CPUs and cores
    How much is my CPU used? How much are my CPU cores used? How did one specific PID (e.g. OSPF) behave in the past?
    N7K-3-VDC3# show processes cpu | egrep "PID|--|ospf"
      PID   Runtime(ms)  Invoked  uSecs  1Sec   Process
      ----- ----------- -------- ----- ------ -----------
      9337  102          72       1418   0.0%   ospfv3
      22916 118          62       1905   13.1%  ospf
    N7K-3-VDC3# show system internal sysmgr service pid 22916
      Service "__inst_001__ospf" ("ospf", 58):
      UUID = 0x41000119, PID = 22916, SAP = 320
      State: SRV_STATE_HANDSHAKED (entered at time Thu Mar 3 21:53:59 2012).
      Restart count: 1
      Time of last restart: Thu Mar 3 21:53:58 2011.
      The service never crashed since the last reboot.
      Tag = 6467
      Plugin ID: 1
    Wait, I remember now: for the complaining customer we used VDC2…
    Verify "high CPU" against the baseline. You need baseline information.
  • 80. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Case Study Flapping OSPF neighbors, unwanted “traffic sources” for your control plane? 40.9.0.0/16 N7K-1# show policy-map interface control-plane module 2 | egrep "service-policy|critical|ospf|police cir 39600" service-policy input: copp-system-policy class-map copp-system-class-critical (match-any) match access-grp name copp-system-acl-ospf match access-grp name copp-system-acl-ospf6 police cir 39600 kbps , bc 250 ms N7K-1# show class-map type control-plane copp-system-class-critical | egrep class|ospf class-map type control-plane match-any copp-system-class-critical match access-grp name copp-system-acl-ospf match access-grp name copp-system-acl-ospf6 N7K-1# show ip access-lists copp-system-acl-ospf IP access list copp-system-acl-ospf 10 permit ospf any any Customize CoPP, don’t turn it off! Legitimate neighbor Control- Plane 90
  • 81. Environment (Control-Plane): CoPP for OSPFv2 (224.0.0.5, 224.0.0.6), 40.9.0.0/16
    Now we specifically identify the legitimate neighbors:
    N7K-1# show ip access-lists copp-system-acl-ospf
      IP access list copp-system-acl-ospf
      10 permit ospf any any
      20 permit ip 40.9.0.0/16 224.0.0.5/32
      30 permit ip 40.9.0.0/16 224.0.0.6/32
    N7K-1# show ip access-lists copp-system-acl-ospf-test
      IP access list copp-system-acl-ospf-test
      10 permit ip any 224.0.0.0/24
    N7K-1# show policy-map interface control-plane module 2 | egrep "service-policy|critical|ospf|police cir 39600|ospf-test|police cir 100 "
      service-policy input: copp-system-policy
      class-map copp-system-class-critical (match-any)
      match access-grp name copp-system-acl-ospf
      match access-grp name copp-system-acl-ospf6
      police cir 39600 kbps , bc 250 ms
      class-map copp-system-class-ospf-test (match-any)
      match access-grp name copp-system-acl-ospf-test
      police cir 100 bps , bc 200 ms
  • 82. Environment (Control-Plane): CoPP counters per module, OSPFv2 (224.0.0.5, 224.0.0.6), 40.9.0.0/16
    Module 1 CoPP:
    N7K-1# show policy-map interface control-plane module 1 class copp-system-class-ospf-test
      control Plane
      service-policy input: copp-system-policy
      class-map copp-system-class-ospf-test (match-any)
      match access-grp name copp-system-acl-malicious
      police cir 100 bps , bc 200 ms
      module 1 :
      conformed 0 bytes; action: drop
      violated 0 bytes; action: drop
    Module 2 CoPP:
    N7K-1# show policy-map interface control-plane module 2 class copp-system-class-ospf-test
      control Plane
      service-policy input: copp-system-policy
      class-map copp-system-class-ospf-test (match-any)
      match access-grp name copp-system-acl-ospf-test
      police cir 100 bps , bc 200 ms
      module 2 :
      conformed 0 bytes; action: drop
      violated 1799505072 bytes; action: drop
    Generic: with N7K# show policy-map interface control-plane you determine the affected class, and with N7K# show class-map type control-plane you determine what is classified for those classes.
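Comparing the per-module counters by eye gets tedious with many line cards. The sketch below extracts the violated byte count per module from captured output. It assumes the "module N : conformed ... violated ..." layout shown on the slide (the real output may differ between NX-OS releases), and the names `MODULE_COUNTERS` and `violated_by_module` are hypothetical.

```python
import re

# "module N :" followed by its conformed and violated byte counters.
# Whitespace and line breaks between the fields are tolerated.
MODULE_COUNTERS = re.compile(
    r"module\s+(\d+)\s*:\s*conformed\s+(\d+)\s+bytes.*?violated\s+(\d+)\s+bytes",
    re.DOTALL,
)


def violated_by_module(text):
    """Map module number -> violated bytes for one CoPP class."""
    return {
        int(module): int(violated)
        for module, _conformed, violated in MODULE_COUNTERS.findall(text)
    }


# Counters as shown on the slide for modules 1 and 2.
SAMPLE = """\
class-map copp-system-class-ospf-test (match-any)
  police cir 100 bps , bc 200 ms
  module 1 :
    conformed 0 bytes; action: drop
    violated 0 bytes; action: drop
  module 2 :
    conformed 0 bytes; action: drop
    violated 1799505072 bytes; action: drop
"""

if __name__ == "__main__":
    counters = violated_by_module(SAMPLE)
    affected = [module for module, violated in counters.items() if violated > 0]
    print(counters, "affected modules:", affected)
```

Here only module 2 shows violations, which narrows the unwanted traffic source down to ports on that line card.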
  • 83. RL (Control-Plane): rate limiters can prevent overwhelming the control plane (CoPP, RL, multiple CPU cores)
    As with CoPP policers, modifying the default rates should be carefully planned before any configuration changes.
    N7004-Berlin# show hardware rate-limiter
      Units for Config: packets per second
      Allowed, Dropped & Total: aggregated since last clear counters
      Module: 3
      R-L Class          Config   Allowed   Dropped   Total
      +----------------+--------+-------------+-------------+----------------+
      L3 mtu             500      0         0         0
      L3 ttl             500      0         0         0
      L3 control         10000    0         0         0
      L3 glean           100      0         0         0
      <SNIP>
      L2 storm-ctrl      Disable
      access-list-log    100      0         0         0
      copy               30000    1423      0         1423
      receive            30000    8540      0         8540
      L2 port-sec        500      0         0         0
      L2 mcast-snoop     10000    2         0         2
      <SNIP>
  • 84. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Inband SUP-2 / NX-OS 6.2 (5.41) B P D U Q0 Q1 Clipper R2D2 CPU BDR-529-Berlin# show system inband queuing status Weighted Round Robin Algorithm Weights BPDU - 64, Q0 - 16, Q1 – 4 BDR-529-Berlin# show system inband queuing statistics Inband packets unmapped to a queue: 0 Inband packets mapped to bpdu queue: 2078 Inband packets mapped to q0: 1339 Inband packets mapped to q1: 4 In KLM packets mapped to bpdu: 0 In KLM packets mapped to arp : 0 In KLM packets mapped to q0 : 0 In KLM packets mapped to q1 : 0 In KLM packets mapped to veobc : 0 Inband Queues: bpdu: recv 2078, drop 0, congested 0 rcvbuf 2097152, sndbuf 4194304 no drop 1 (q0): recv 1339, drop 0, congested 0 rcvbuf 2097152, sndbuf 4194304 no drop 0 (q1): recv 4, drop 0, congested 0 rcvbuf 2097152, sndbuf 4194304 no drop 0 Control- Plane 94
  • 85. Inband (Control-Plane, CPU)
    How to determine the max pps rate to/from the CPU, whether we ran out of buffers, and when it occurred?
    N7004-Berlin# show hardware internal cpu-mac inband stats | in rate|buffer
      Rx no buffers .................. 0
      Packet rate limit ........... 64000 pps
      Rx packet rate (current/max) 85 / 195 pps
      Tx packet rate (current/max) 85 / 191 pps
    How to determine the time of the max pps rate to correlate against your logs?
    N7004# show hardware internal cpu-mac inband events
      1) Event:TX_PPS_MAX, length:4, at 382147 usecs after Fri Jan 10 20:04:37 2014 new maximum = 191
      2) Event:RX_PPS_MAX, length:4, at 382147 usecs after Fri Jan 10 20:04:37 2014 new maximum = 195
    Goal: compare against logs. Possible next step: logw.py
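As a rough illustration of how these counters might be evaluated offline, the sketch below relates the observed peak rates to the reported packet rate limit. It assumes the text layout from the slide; treating the peak as a percentage of the limit is an interpretation added here, not something the session states, and `inband_rates` is a hypothetical helper.

```python
import re


def inband_rates(text):
    """Return (limit_pps, per-direction rates) from filtered inband stats."""
    limit_match = re.search(r"Packet rate limit\s*\.*\s*(\d+)\s*pps", text)
    if not limit_match:
        raise ValueError("packet rate limit not found")
    limit = int(limit_match.group(1))
    rates = {}
    pattern = r"(Rx|Tx) packet rate \(current/max\)\s*(\d+)\s*/\s*(\d+)\s*pps"
    for direction, current, peak in re.findall(pattern, text):
        rates[direction] = {
            "current": int(current),
            "max": int(peak),
            # Peak rate relative to the limit (assumed to be comparable).
            "max_pct_of_limit": 100.0 * int(peak) / limit,
        }
    return limit, rates


# Output copied from the slide.
SAMPLE = """\
Rx no buffers .................. 0
Packet rate limit ........... 64000 pps
Rx packet rate (current/max) 85 / 195 pps
Tx packet rate (current/max) 85 / 191 pps
"""

if __name__ == "__main__":
    limit, rates = inband_rates(SAMPLE)
    print(limit, rates)
```

In the sample the peaks are far below the limit, so inband saturation is an unlikely cause in that capture; the event timestamps remain the piece to correlate with the logs.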
  • 86. NX-OS (Control-Plane): Packet Manager, NetStack and clients
    Per VDC (VDC-1, VDC-2): Packet Manager, NetStack (L2, L3, IP, "ip input"), clients such as ARP, OSPF, STP, BGP
    The System Manager starts, controls and monitors the processes. If the heartbeat fails: core sig6 -> system troubleshooting
    Tools: Ethanalyzer, ELAME, Debug
    N7004-Berlin# debug pktmgr frame
      2014 Jan 10 20:14:40.061027 pktmgr: In 0x0800 82 7 4055.390f.5645 -> 0100.5e00.0005 Eth3/6
  • 87. Chapter 5: Control-Plane ARP. Topics: ARP, glean, HSRP
  • 88. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Layer 2/3 ARP Incomplete... E 3/13 E 3/14 20.0.0.0/24 .13 .14 VRF Control-P. ARP & AM N7004-Berlin# show ip arp Flags: * - Adjacencies learnt on non-active FHRP router + - Adjacencies synced via CFSoE # - Adjacencies Throttled for Glean D - Static Adjacencies attached to down interface IP ARP Table for context default Total number of entries: 3 Address Age MAC Address Interface IP ARP Table for context default Total number of entries: 5 Address Age MAC Address Interface 192.168.0.3 00:04:41 4055.390f.5643 Vlan1 10.0.3.5 00:06:35 4055.390f.5645 Ethernet3/6 10.0.2.4 00:07:14 4055.390f.5644 Ethernet3/8 20.0.0.13 00:00:14 INCOMPLETE Ethernet3/14 192.168.0.254 - 0000.0c9f.f001 Vlan1 Simple example uRIB (253) route adj AM ARP 98
  • 89. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public E 3/13 20.0.0.0/24 .13 .14 VRF ARP E 3/14 Consider the use of Debug-Filter and send to a file Control-P. ARP & AM N7004-Berlin# debug ip arp packet 2014 Jan 5 21:51:40.477507 arp: (context 1) Sending packet on interface Ethernet3/14, (prty 0) Hrd type 1 Prot type 800 Hrd len 6 Prot len 4 OP 1, Pkt size 28 2014 Jan 5 21:51:40.477629 arp: Src 4055.390f.5642/20.0.0.14 Dst ffff.ffff.ffff/20.0.0.13 2014 Jan 5 21:51:40.481061 arp: (context 4) Receiving packet from interface Ethernet3/13, (prty 6) Hrd type 1 Prot type 800 Hrd len 6 Prot len 4 OP 1, Pkt size 46 2014 Jan 5 21:51:40.481131 arp: Src 4055.390f.5642/20.0.0.14 Dst ffff.ffff.ffff/20.0.0.13 N7004-Berlin# show ip arp statistics ethernet 3/14 ARP packet statistics for interface: Ethernet3/14 Sent: Total 10, Requests 9, Replies 0, Requests on L2 0, Replies on L2 0, Gratuitous 1, Tunneled 0, Dropped 0 Send packet drops details: MBUF operation failed : 0 Context not yet created : 0 Invalid context : 0 Invalid ifindex : 0 Invalid SRC IP : 0 Invalid DEST IP : 0 Destination is our own IP : 0 Unattached IP : 0 <SNIP>
  • 90. Control-P. ARP & AM (topology: E 3/13, 20.0.0.0/24, .13, .14, VRF)
    N7004-Berlin# show ip arp statistics ethernet 3/14
      Received: Total 1, … , Dropped 1
      Received packet drops details:
      Appeared on a wrong interface : 0
      Incorrect length : 0
      Invalid protocol packet : 0
      Invalid context : 0
      Context not yet created : 0
      Invalid layer 2 address length : 0
      Invalid layer 3 address length : 0
      Invalid source IP address : 0
      Source IP address is our own : 0
      No mem to create per intf structure : 0
      Source address mismatch with subnet : 0
      Directed broadcast source : 0
      <SNIP>
    N7004-Berlin# show ip arp statistics vrf ALQ
      <SNIP>
      Received: Total 13, … , Dropped 13
      <SNIP>
      Invalid source MAC address : 0
      Source MAC address is our own : 13
      <SNIP>
  • 91. Control-P. ARP & AM: ARP INCOMPLETE between SWT-1 and SWT-2
    It worked before; no new deployment.
    Ethanalyzer verifies that ARP packets are being sent by SWT-1 but not received. On SWT-2, ARP is being received and sent.
    customer# show vpc brief
      vPC keep-alive status : peer is not reachable through peer-keepalive
    Check CoPP and/or HWRL:
    Customer# show class-map type control-plane copp-system-p-class-normal
      class-map type control-plane match-any copp-system-p-class-normal
      match access-group name copp-system-p-acl-mac-dot1x
      match exception ip multicast directly-connected-sources
      match exception ipv6 multicast directly-connected-sources
      match protocol arp
      class-map copp-system-p-class-normal (match-any)
      violate action: drop
      module 5: violated 20557632224 bytes, 5-min violate rate 4154397 bytes/sec
      module 9: violated 0 bytes, 5-min violate rate 0 bytes/sec
    Customer# show hardware rate-limiter | i Module|R-L|glean
      Module: 5
      R-L Class          Config   Allowed   Dropped   Total
      +------------------+--------+---------------+-------------+-----------------+
      L3 glean           100      4904      2935      7839
      L3 glean-fast      100      863401    1539316   2402717
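To see how severe the glean drops are, the rate-limiter rows can be turned into drop ratios. This is a minimal sketch assuming the column order Config, Allowed, Dropped, Total from the slide; `parse_rate_limiter` and `drop_ratio` are hypothetical helper names.

```python
def parse_rate_limiter(text):
    """Parse 'show hardware rate-limiter' rows into a dict keyed by class name."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows end with four integers: Config, Allowed, Dropped, Total.
        # Headers, separators and rows such as "Disable" are skipped.
        if len(fields) < 5 or not all(f.isdigit() for f in fields[-4:]):
            continue
        name = " ".join(fields[:-4])
        config, allowed, dropped, total = (int(f) for f in fields[-4:])
        rows[name] = {
            "config": config,
            "allowed": allowed,
            "dropped": dropped,
            "total": total,
        }
    return rows


def drop_ratio(row):
    """Fraction of packets dropped by the rate limiter (0.0 if no traffic)."""
    return row["dropped"] / row["total"] if row["total"] else 0.0


# Rows copied from the slide.
SAMPLE = """\
Module: 5
R-L Class Config Allowed Dropped Total
+------------------+--------+---------------+-------------+-----------------+
L3 glean 100 4904 2935 7839
L3 glean-fast 100 863401 1539316 2402717
"""

if __name__ == "__main__":
    for name, row in parse_rate_limiter(SAMPLE).items():
        print(f"{name}: {drop_ratio(row):.0%} dropped")
```

In the sample, roughly a third of the glean packets and close to two thirds of the glean-fast packets were dropped, which fits the INCOMPLETE ARP entries reported in the case.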
  • 92. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Layer 2/3 CPU Utilization Glean Throttle is not enabled by default (be careful with probing hosts). Default timeout 300 seconds (300s-1800s) Use show ip arp to verify if INCOMPLETE adjacency is in Glean Throttle state Control-P. ARP & AM N7K# show ip arp 192.1.49.2 Flags: * - Adjacencies learnt on non-active FHRP router + - Adjacencies synced via CFSoE # - Adjacencies Throttled for Glean D - Static Adjacencies attached to down interface IP ARP Table Total number of entries: 1 Address Age MAC Address Interface 192.1.49.2 00:00:13 INCOMPLETE Ethernet4/9 # 192.1.49.2/32 Has been throttled 102
  • 93. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public glean fast path NX-OS 6.2(2) Control-P. ARP & AM N7004-London# sh run all | grep "fast-path" ip arp fast-path N7004-London# show system internal pixm info ltl-region | i FAST LIBLTLMAP_LTL_TYPE_SUP_INBAND_GLEAN_FAST_PATH 0x10d1 N7004-London# show system internal pktmgr client 0x10c Client uuid: 268, 4 filters, pid 7209 Filter 1: EthType 0x0806, Rx: 28, Drop: 0 Filter 2: EthType 0xfff0, Exc 8, Rx: 0, Drop: 0 Filter 3: EthType 0x8841, Snap 34881, Rx: 0, Drop: 0 Filter 4: EthType 0x0800, DstIf 0x150b0000, Excl. Any Rx: 0, Drop: 0 <SNIP> SUP-ETH Interface 103
  • 94. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Chapter 6: ACL’s TCAM PBR
  • 95. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public TCAM What is TCAM? Hardware to identify packets T0B0 T0B1 T1B0 T1B1 T1B0 T1B1 Forwarding Engine on Ingress Line Card contains TCAM RACL QoS T := TCAM 0 or 1 B := Bank 0 or 1 VID 42 Configuration Interface or VLAN (ingress/egress) TCAM Ternary Content Addressable Memory Packet FE 108
  • 96. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public TCAM/ ACL’s N7004-London# show sys int acc feature bank map interface ingress <SNIP> slot 3 ======= _____________________________________________________________________ Feature Rslt Type T0B0 T0B1 T1B0 T1B1 _____________________________________________________________________ PACL Acl X RACL Acl X DHCP Acl X QoS Qos X PBR Acl X Netflow Sampler Acc X SPM WCCP Acl X X X BFD Acl X FEX Acl X <SNIP> Specific Features map to a specific “location” := TCAM/BANK Three result types: QoS ACL ACC T0B0 T0B1 T1B0 T1B1 TCAM Ternary Content Addressable Memory 109
  • 97. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Result Types N7004-London# show system internal access-list feature bank map vlan ingress <SNIP> slot 3 ======= _________________________________________________________________________ Feature Rslt Type T0B0 T0B1 T1B0 T1B1 _________________________________________________________________________ QoS Qos X RACL Acl X PBR Acl X VACL Acl X DHCP Acl X ARP Acl X Netflow Acl X X Netflow (SVI) Acl X X Netflow Sampler Acc X Netflow Sampler (SVI) Acc X <SNIP> Per bank only one result type can be used: for “VLAN” & “Ingress” e.g. either QoS or NF Sampler “I can’t configure xyz…the system rejects my configuration…” TCAM Ternary Content Addressable Memory 2014 May 19 11:27:12.673 backuprot3 %ACLQOS- SLOT4-2-ACLQOS_FAILED: ACLQOS failure: feature combination not supported on VDC-2 VLAN 2156 for : RACL, Netflow Sampler (SVI) 2014 May 19 11:27:13.214 backuprot3 %IM-3- IM_RESP_ERROR: Component MTS_SAP_VMM opcode:MTS_OPC_IM_IF_V DC_BIND in vdc:2 returned error:Tcam Allocation Failure 110
  • 98. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public ACL’s N7004-London# show system internal access-list globals slot 1 ======= NOT Supported in SUP ACLQOS slot 3 ======= Atomic Update : ENABLED Default ACL : DENY Bank Chaining : DISABLED Seq Feat Model : NO_DENY_ACE_SUPPORT This pltfm supports seq feat model Bank Class Model : DISABLED This pltfm supports bank class model Fabric path DNL : DISABLED Seq Feat Model : NO_DENY_ACE_SUPPORT This pltfm supports seq feat model LOU Threshold Value : 5 Overview Atomic Update Resource Pooling Statistics Per Entry ACL Threshold Exp Fragment Handling Bank Management T0B0 T0B1 T1B0 T1B1 TCAM Ternary Content Addressable Memory 09:40 111
  • 99. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public ACL
  • 100. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public TCAM N7K# show ip access example IP access list example statistics per-entry 10 permit ip any 10.1.2.100/32 [match=3452] 20 deny ip any 10.1.68.101/32 [match=49920] 30 deny ip any 10.33.2.25/32 [match=232324] 40 permit tcp any any eq 22 [match=9881] 50 deny tcp any any eq telnet [match=442] 60 deny udp any any eq syslog [match=87112] 70 permit tcp any any eq www [match=4345667] 80 permit udp any any eq snmp [match=234222] ACL logging is enabled by including the log keyword in an ACL rule (show log log). The Sup receives a copy of the packet. The original packet is forwarded/dropped in hardware with no performance penalty. Statistics per Entry The CPU is protected by using one of the available rate limiters. Forwarding engine hardware enforces rate to avoid saturating inband interface CPU. hardware rate-limit access-list-log command adjusts rate (def 100 pps) ACL Logging can be a useful tool during troubleshooting. Use ACL logging to sample specific packets from data plane. Use onboard ethanalyzer (wireshark) to analyze sampled packets TCAM Utilization 09:45 118
  • 101. TCAM Utilization: TCAM space
    "Statistics per entry" results in no optimization and no merge activity. Instead, a 1:1 mapping of configured ACEs to CL (:= Classification) TCAM entries will be seen.
    "...when using ACL stats per entry on the 7K the TCAM utilization goes up to 47%, when removed, it dropped to 7%..."
    Object groups do NOT offer ANY optimization in terms of CL TCAM utilization.
    ACL statistics are NOT enabled by default (fundamental difference vs. IOS) because they require the ACEs NOT to be merged, and this affects the TCAM utilization.
  • 102. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public TCAM N7004(config)# hardware access-list resource feature bank-mapping N7004(config)# show system internal access-list feature bank-class map ingress slot 3 ======= Feature Class Definition: 0. CLASS_QOS : QoS, 1. CLASS_INBAND : Tunnel Decap, SPM LISP, 2. CLASS_PACL : PACL, Netflow, 3. CLASS_DHCP : DHCP, Netflow, Netflow (vlan), ARP, 4. CLASS_RACL : RACL, RACL_STAT, Netflow (SVI), ARP, <SNIP> Feature Class Combination (Ingress) 0. CLASS_PACL, CLASS_QOS_INTF, CLASS_EMPTY, CLASS_EMPTY 1. CLASS_PACL, CLASS_NF_SMPL_INTF, CLASS_EMPTY, CLASS_EMPTY <SNIP> 33. CLASS_EMPTY, CLASS_EMPTY, CLASS_NF_SMPL, CLASS_QOS “now I can configure QoS and NF Sampler” TCAM Bank Management 120
  • 103. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Summary Strategy, Tools and System Data-Plane Layer 2 Data-Plane Layer 3 Control-Plane Inband Control- Plane ARP TCAM 10:00 08:00
  • 104. Summary: Time vs. RCA
    Be prepared, both for the troubleshooting itself and for the strategy.
    Have an up-to-date network diagram at hand.
    Know your network in its good state.
    Summary N7K:
    N7K with NX-OS provides the visibility and tools to troubleshoot efficiently.
    For most challenges the "normal" CLI, Ethanalyzer and the log entries are sufficient.
    A complete and mature feature set helps to find workarounds.
    Reduce risk by using mainstream designs and proven deployments.
  • 105. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Summary two examples of high uptime from EMEAR ROME, ITALY Kernel uptime is 1813 day(s), Nexus 7010 Ireland, UKI System uptime: 2612 days MDS9509 ALQ/ 06-ARP-14 > 4.5Y > 7.0Y Shape and secure your future with Nexus 7000 Series 09:50 „In order to consolidate a Business critical server Farm enhancing network speed and availability in 2009 Fastweb adopted Nexus 7000. Today, May 2014, Nexus Kernel uptime is more than 1870 day (Last reset on Tue Mar 31 16:05:10 2009).” Luca Chiappetti – Network Operations Control Coordinator @ Fastweb 123
  • 106. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Complete Your Online Session Evaluation • Give us your feedback and you could win fabulous prizes. Winners announced daily. • Complete your session evaluation through the Cisco Live mobile app or visit one of the interactive kiosks located throughout the convention center. Don’t forget: Cisco Live sessions will be available for viewing on-demand after the event at CiscoLive.com/Online 124
  • 107. © 2014 Cisco and/or its affiliates. All rights reserved. BRKDCT-3144, 2014 San Francisco; Dipl.-Ing. Andreas la Quiante Cisco Public Continue Your Education • Demos in the Cisco Campus • Walk-in Self-Paced Labs • Table Topics • Meet the Engineer 1:1 meetings 125 …and have fun…