
Catalyst 6500

Bootcamp

Chapter 13
Catalyst 6500 High Availability

© 2006 Cisco Systems, Inc. All rights reserved. CISCO PARTNER CONFIDENTIAL 1
Agenda

§ High Availability General Concepts and Features
§ Supervisor Redundancy: NSF/SSO
§ On Line Insertion and Removal Details
§ Generic On Line Diagnostics (GOLD)

Physical Redundancy
Catalyst 6500 integrated hardware resiliency
§ Separate control and forwarding plane
§ Redundant Supervisors (1:1)
  NSF/SSO switchover results in sub-second recovery
§ Redundant Fans (1:N)
  Secondary fans provide sufficient cooling to keep the system running at full capacity
  The Catalyst 6509-NEB-A chassis supports redundant fan trays as well
§ Redundant Power Supplies (1+1)
  Secondary power supply kicks in instantly to provide full uninterrupted power to the system
§ Hot Swap Capability (Online Insertion and Removal - OIR)
§ Redundant Clocks
§ Spares and Maintenance

HA goes beyond redundancy ...
§ Catalyst 6500 provides a broad feature set to prevent and/or mitigate attacks against the switch:
  Hardware-accelerated ACLs
  Control Plane Policing and PFC3 rate limiters
  DHCP Snooping / DAI
  Port security
  Strong QoS support: Scavenger QoS
  … plus all the service modules (FWSM, IDSM, NAM …)
§ Features like IOS Software Modularity or the Embedded Event Manager also add to the overall resiliency of the box
§ Fully integrated NetFlow support can also help in detecting network attacks

Detecting Routing Peer Failures
Bidirectional Forwarding Detection (BFD)

§ What's the issue?
  Link-layer failure detection sometimes takes too long
§ BFD is a protocol-independent method of detecting control/data-plane "liveness" between two peer systems
§ Uses Hello-like mechanism
§ Lightweight, Fast
§ Can be distributed to the data plane
§ Faster failure detection so faster reconvergence to an alternate path
§ Supported since 12.2(18)SXE for all Ethernet Interfaces

BFD Async Mode
[Figure: two peer systems exchanging BFD Control packets, one direction labeled "Orange is OK" and the other "Green is OK"]
  Systems periodically send BFD Control packets to one another
  If no packet is received from the peer during the negotiated Detect Time (negotiated interval * multiplier), the session is declared to be down.

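The Detect Time rule for BFD asynchronous mode can be illustrated with a few lines of Python. This is only a sketch: the class name `BfdSession`, its field names, and the 50 ms / multiplier 3 values are assumptions chosen for illustration, not values taken from the slides or from any Cisco API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BfdSession:
    """Hypothetical model of one BFD session in asynchronous mode."""

    negotiated_interval_ms: int  # negotiated interval between BFD Control packets
    multiplier: int              # detection multiplier

    @property
    def detect_time_ms(self) -> int:
        # Detect Time = negotiated interval * multiplier
        return self.negotiated_interval_ms * self.multiplier

    def is_down(self, ms_since_last_packet: int) -> bool:
        # The session is declared down once no packet has arrived
        # for the whole Detect Time.
        return ms_since_last_packet >= self.detect_time_ms


# Illustrative values only.
session = BfdSession(negotiated_interval_ms=50, multiplier=3)
print(session.detect_time_ms)
print(session.is_down(120))
print(session.is_down(150))
```

With these example numbers a peer that stays silent for 150 ms is considered down, which is why BFD can react much faster than routing protocol hello timers measured in seconds.
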
NSF/SSO
Introduction
Redundant Supervisors
If the active Supervisor fails due to a hardware or software fault… …Nonstop Forwarding and Stateful Switchover (NSF and SSO) result in sub-second recovery on the standby Supervisor

RPR: Route Processor Redundancy, where the redundant Sup is not initialized. 90 sec failover
RPR+: Route Processor Redundancy Plus - redundant Sup is not stateful - L2 protocols restart and state table is purged. 30+ sec failover
SSO: Stateful Switchover - on switchover, physical links kept up - Sup redundancy is stateful for L2 protocols and hw tables. 0-3 sec failover
NSF/SSO: Non Stop Forwarding with Stateful Switchover - on switchover, allow packet routing to continue until L3 protocol converges. 0-3 sec failover

NSF/SSO
SSO Operation
[Figure: Active and Standby Supervisors]
Layer 2 Control Plane information and state synchronized
  - Spanning Tree State
  - Trunking/channeling
  - Port state (link up/down)
  - Security/IP phone state
Hardware tables synchronized
  - FIB/ADJ Tables
  - QoS and Security ACLs
  - MAC address tables replicated
What it Does
Stateful Switch Over synchronizes Layer 2, ACL, and state information. Beneficial for wiring closet deployments with dual supervisor engines. Works in conjunction with Non-stop Forwarding (NSF) to ensure total Supervisor resiliency in Layer 3 environments.
Benefit
Seamless Supervisor Engine sub-second switchover with NO interruption to packet forwarding and Layer 2 sessions. IP Telephone calls do not drop. Wireless access points do not need to re-authenticate with the network.

NSF/SSO
SSO Operation
[Figure: Active and Standby Supervisors, each with RP, SP and PFCx. STP, port and VTP states and the L2, L3 FIB, Netflow and ACL tables are synchronized between them. A DFCx line card holds its own L2, L3 FIB, Netflow and ACL tables.]
  New RP builds table; FIB update after RP convergence
  L3 traffic forwards on last known FIB in HW
  DFCs not affected by Supervisor failover
  Sup1a: no support for SSO

NSF/SSO
NSF Operation
[Figure: Active and Standby Supervisors. Graceful restart for Layer 3 Routing Protocols between Supervisors and other Layer 3 devices]

§ What it Does
  NSF maintains Layer 3 route and protocol state information. Works in conjunction with Stateful Switch Over (SSO) to ensure total Supervisor resiliency
§ Benefit
  Routing protocols (OSPF, BGP, EIGRP, IS-IS) do not have to re-converge, ensuring better network availability. Frame Relay, PPP and ATM sessions on router modules do not reset

NSF/SSO
NSF Terminology
[Figure: an NSF Capable router between two NSF Aware peers]
§ NSF Capable Router (restarting router)
  A router that preserves its forwarding table and rebuilds its routing topology after an RP switchover; currently a dual RP router
§ NSF Aware Router (peer)
  A router that assists an NSF Capable router during restart and can preserve routes reachable via the restarting router
§ NSF Unaware Router
  A router that is not capable of assisting an NSF Capable router during an RP switchover
§ NSF Capable Router is NSF Aware, too!
§ SSO Aware or HA aware
  Cisco IOS subsystem - an HA client
§ NSF - Nonstop Forwarding
  Cisco terminology and marketing name for feature set
§ Graceful Restart (GR)
  Term used in some protocol standards and drafts

NSF/SSO
Building Relationship
[Figure: agreement between a GR (NSF/SSO) Capable Router and a GR (NSF) Aware Peer]
GR (NSF/SSO) Capable Router:
  "I Can Preserve My Forwarding Table During Restart."
GR (NSF) Aware Peer, during restart:
  - I Will Preserve My Forwarding Table
  - I Will Not Declare You Dead
  - I Will Not Inform My Neighbors
Packets can be forwarded in hardware based on pre-switchover FIB/ADJ information while my routing protocol database is being rebuilt

NSF/SSO
NSF
[Figure: message exchange between a GR (NSF/SSO) Capable Router and a GR (NSF) Aware Peer]
Restart Notification and Acknowledgement
  Capable Router: I Have Restarted
  Aware Peer: OK. I Acknowledge. I Will Stick to My Agreement
Knowledge Transfer
  Aware Peer: This Is My Knowledge of the Network
  Capable Router: I Will Use Your Knowledge to Build My Database
Database Updates

NSF/SSO
NSF/SSO synchronization process
[Figure: synchronization between the Active Supervisor and the Standby Supervisor]
Control Path: RP CPU, routing protocol process, Routing Information Base and ARP table on the active; Cisco IOS and ARP table on the standby
  Configuration synchronization
CEF Tables: IOS CEF tables (FIB table, adjacency table) on both Supervisors
  CEF tables synchronization
Forwarding Path: hardware FIB table and adjacency table on both Supervisors
  Hardware tables synchronization

NSF/SSO
NSF Switchover Details
[Figure: the Active Supervisor fails and the Newly Active Supervisor takes over. Control plane: RP CPU running the OSPF, EIGRP, IS-IS and BGP processes, the Routing Information Base and the ARP table. Cisco IOS CEF tables (Global Epoch = 1): FIB table (Prefix, Next Hop, Interface, Epoch) and adjacency table (Next Hop, MAC, Epoch). Data plane: hardware FIB and adjacency tables on the forwarding path. An NSF Aware Router exchanges "Restart Notification", "Hello. I Am NSF Aware" and "Database Synchronization" messages with the restarting Supervisor. Callouts are numbered 1 to 12.]

NSF/SSO
NSF Switchover Details
1. Switchover is triggered. Standby Supervisor (shown) becomes active.
2. Control plane and data plane separation: the FIB is detached from the RIB.
3. Packet forwarding continues based on last-known FIB and adjacency entries while the
standby takes over.
4. The global epoch number is incremented.
5. The Supervisor brings its interfaces and control plane online.
6. The software adjacency table is populated with the pre-switchover ARP table contents.
Updated CEF entries receive the new global epoch number. New adjacency entries are
downloaded in hardware.
7. The routing protocol specific neighbor and adjacency reacquisition occurs.
8. The routing protocol specific database synchronization occurs.
9. The RIB is repopulated with new routing entries. The corresponding CEF entries are
updated.
10. Updated entries receive the global epoch number to indicate that they have been
refreshed. Corresponding FIB entries and hardware entries are updated.
11. Each routing protocol notifies CEF that it has converged. Once all of them have
converged, the last one flushes the stale route and adjacency information.
12. The IOS CEF tables on the RP and the forwarding tables on the SP and PFC are now
synchronized. Generic non-NSF specific operations can take place.
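
The epoch mechanism in steps 4, 9, 10 and 11 can be sketched in Python. This is a simplified model under stated assumptions, not the IOS CEF implementation: the function name `refresh_after_switchover`, the dictionary layout, and the example prefixes and next hops (documentation address ranges) are all hypothetical.

```python
from typing import Dict, Tuple

Fib = Dict[str, Dict[str, object]]


def refresh_after_switchover(
    fib: Fib, relearned: Dict[str, str], global_epoch: int
) -> Tuple[Fib, int]:
    """Return the post-convergence FIB and the new global epoch.

    fib:        pre-switchover entries, prefix -> {"next_hop", "epoch"}
    relearned:  prefix -> next hop, as reported by the routing protocols
    """
    # Step 4: the global epoch number is incremented.
    new_epoch = global_epoch + 1

    # Step 3: forwarding continues on the last-known entries meanwhile.
    working = {prefix: dict(entry) for prefix, entry in fib.items()}

    # Steps 9-10: refreshed entries are stamped with the new global epoch.
    for prefix, next_hop in relearned.items():
        working[prefix] = {"next_hop": next_hop, "epoch": new_epoch}

    # Step 11: once converged, entries still carrying an old epoch are stale
    # and are flushed.
    converged = {p: e for p, e in working.items() if e["epoch"] == new_epoch}
    return converged, new_epoch


# Hypothetical pre-switchover table (epoch 0).
old_fib = {
    "198.51.100.0/24": {"next_hop": "192.0.2.1", "epoch": 0},
    "203.0.113.0/24": {"next_hop": "192.0.2.2", "epoch": 0},
}
new_fib, epoch = refresh_after_switchover(
    old_fib, {"198.51.100.0/24": "192.0.2.1"}, global_epoch=0
)
print(epoch, sorted(new_fib))
```

In this example only the prefix that was relearned survives the flush; the other entry keeps its old epoch and is removed as stale.
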

Design Considerations for NSF/SSO
NSF and Hello Timer Tuning?
§ NSF is intended to provide availability through route convergence avoidance
§ Fast IGP timers are intended to provide availability through fast route convergence
§ In an NSF environment the dead timer must be greater than SSO Recovery + RP restart + time to send first hello
§ Switches running Native IOS
  OSPF 2/8 seconds for hello/dead
  EIGRP 1/4 seconds for hello/hold
§ Switches running Hybrid
  OSPF 3/12 seconds for hello/dead
  EIGRP 2/8 seconds for hello/hold
[Figure: two switches; if the timer constraint is violated the result is "Neighbor Loss, No Graceful Restart"]

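The dead timer constraint can be checked with simple arithmetic. The sketch below is illustrative: the function name `dead_timer_is_safe` and the component durations in the usage lines are assumptions, since the slides give only the recommended hello/dead values and not the individual recovery times.

```python
def dead_timer_is_safe(
    dead_timer_s: float,
    sso_recovery_s: float,
    rp_restart_s: float,
    first_hello_s: float,
) -> bool:
    """True if the neighbor will not time out before the first post-switchover hello.

    Implements: dead timer > SSO recovery + RP restart + time to send first hello.
    """
    return dead_timer_s > sso_recovery_s + rp_restart_s + first_hello_s


# Hypothetical recovery figures, compared against an 8 second dead timer
# (the OSPF value listed for Native IOS) and against a more aggressive 4 seconds.
print(dead_timer_is_safe(8, sso_recovery_s=3, rp_restart_s=2, first_hello_s=2))
print(dead_timer_is_safe(4, sso_recovery_s=3, rp_restart_s=2, first_hello_s=2))
```

If the check fails, the peer declares the neighbor lost before the restarting Supervisor can signal a graceful restart, and the benefit of NSF is lost.
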
NSF/SSO
Failover Results
§ Time to recover the data plane depends on how fast the forwarding engine, switch fabric and bus can be recovered
[Figure: Active Supervisor Engine and Standby Supervisor Engine, each with a forwarding engine; active and standby switch fabric; system bus; line cards with fabric and bus interfaces, one with its own forwarding engine, and their ports]

NSF/SSO
NSF Comparison

                          RPR                              RPR+                       SSO                      NSF/SSO
Before Switchover
Active & standby images   Can be different                 Must be same               Must be same             Must be same
Standby Sup               Cold                             Hot                        Hot                      Hot
Synchronized Information  Start-up config; Boot registers  Running config; OIR state  PFC control-driven data  PFC control-driven data
After Switchover
Standby Supervisor Image  Initialized with startup config  Becomes active             Becomes active           Becomes active
Line Card Images          Reloaded                         No change                  No change                No change
Interface State           -                                Reset                      Stateful                 Stateful
L2 MAC Table              -                                Purged                     Stateful                 Stateful
L2 Protocol States        -                                Reset                      Stateful                 Stateful
L3 FIB, Adj Tables        -                                Purged                     Purged                   Graceful update
L3 Protocol States        -                                Reset                      Reset                    Graceful restart
Supervisor Active         90+ sec                          30+ sec                    < 3 sec                  < 3 sec

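For quick lookups, a few rows of the comparison can be held in a small Python structure. This is only a convenience sketch: the names `COMPARISON` and `modes_with` are hypothetical, and the values are copied from the "After Switchover" rows of the table.

```python
# Rows copied from the comparison table; keys are the redundancy modes.
COMPARISON = {
    "L2 Protocol States": {
        "RPR": "-", "RPR+": "Reset", "SSO": "Stateful", "NSF/SSO": "Stateful",
    },
    "L3 Protocol States": {
        "RPR": "-", "RPR+": "Reset", "SSO": "Reset", "NSF/SSO": "Graceful restart",
    },
    "Supervisor Active": {
        "RPR": "90+ sec", "RPR+": "30+ sec", "SSO": "< 3 sec", "NSF/SSO": "< 3 sec",
    },
}


def modes_with(row: str, value: str) -> list:
    """Return the redundancy modes whose entry in the given row equals value."""
    return [mode for mode, entry in COMPARISON[row].items() if entry == value]


print(modes_with("L2 Protocol States", "Stateful"))
print(modes_with("L3 Protocol States", "Graceful restart"))
```

The lookup makes the key distinction visible: SSO alone keeps Layer 2 state, while only NSF/SSO also restarts Layer 3 protocols gracefully.
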
NSF/SSO
Supervisor Uplinks
§ Cisco Catalyst 6500: both the active supervisor and the standby supervisor uplink ports are active as long as the supervisors are up and running
  Uplink ports go down when the supervisor is reset
• Catalyst 6500 Supervisors: all ports are active

NSF/SSO
Supervisor Uplinks and Pre-IOS 12.2(18)SXF5 issue overview

§ After NSF/SSO switchover, L3 forwarding tables are frozen until CEF re-convergence
§ This results in traffic blackholing towards the stale "old sup uplink" adjacency
  1 Switchover notification
  2 Traffic blackholed (20s +)
  3 Stale entries (ADJ1) removed after CEF reconvergence
[Figure: prefix 10.10.0.0/16 reachable through supervisor uplinks G5/1 and G6/1. Hardware tables before switchover (active/standby entries) and after switchover ("newly active sup" entries): the FIB entry 10.10.0.0/16 points to Adj Ptr1, and the adjacency table holds rewrite information for ADJ1 (G5/1) and ADJ2 (G6/1).]

NSF/SSO
Supervisor Uplinks

§ The use of Catalyst 6500 Supervisor uplinks with NSF/SSO results in a more complex network recovery scenario
§ Dual failure scenario
  Supervisor Failure
  Port Failure
§ During recovery the FIB is frozen but the uplink port is gone
§ The PFC tries to forward traffic out a non-existent link: this leads to a worst-case convergence time of 24 seconds
§ Bundling Supervisor uplinks into EtherChannel links improves convergence
§ The problem is solved in software release 12.2(18)SXF5

ONLINE INSERTION AND REMOVAL
OIR Improvements

§ OIR stands for Online Insertion and Removal
§ All Catalyst 6500 linecards except services modules are hot swappable
§ Under normal operation, the traffic stall is subsecond. Partially inserted linecards can result in prolonged stalls
§ Architecture improvements allow newer linecards to avoid impact by bus stall: OIR of new linecards (WS-X6148A-GE-TX, WS-X6148A-RJ-45, WS-X6148-FE-SFP, Enhanced FlexWAN, SIP) will cause zero packet loss

When an OIR is performed, a stall signal is generated on the backplane bus to prevent backplane data corruption
Bus stall prevents packets from being transmitted to the backplane: this results in traffic interruption for the duration of the stall

ONLINE INSERTION AND REMOVAL
Online Insertion Operation
[Figure: a line card at three stages of insertion into the backplane]

Prior to card insertion, data flows freely over the backplane

When the line card hits the longest pin first (shown in green), a stall signal is placed on the backplane to protect the system from data corruption.

The bus stall is removed when the line card touches the shortest pin (shown as blue pin), and data flows freely again

Time of bus stall is from when the line card touches the longest pin to when it touches the shortest pin

ONLINE INSERTION AND REMOVAL
Online Removal Operation
[Figure: a line card at three stages of removal from the backplane]

Prior to card removal, data flows freely over the backplane

When the shortest pin (shown in blue) loses connectivity with the linecard, a stall signal is placed on the backplane to protect the system from data corruption.

The bus stall is removed when the longest pin (shown as green pin) loses connectivity with the linecard.

Time of bus stall is from when the line card loses contact with the shortest pin to when it loses contact with the longest pin

GOLD
Introduction

§ GOLD defines a common framework for diagnostics operations across Cisco platforms running Cisco IOS Software.
§ Goal: check the health of hardware components and verify proper operation of the system data plane and control plane at run-time and boot-time.
§ Provides a common CLI and scheduling for field diagnostics

GOLD Tests
  Bootup Tests (includes online insertion)
  Health Monitoring Tests (background non-disruptive)
  On-Demand Tests (disruptive and non-disruptive)
  User Scheduled Tests (disruptive and non-disruptive)
  CLI access to data via Management Interface

GOLD
How does it work?
[Figure: chassis annotated with the questions GOLD answers]
  Ports working properly?
  Linecards working properly?
  Standby Sup ready to take over?
  Is the supervisor control plane and forwarding plane functioning properly?
  Backplane connection working?

GOLD can catch the following:
  Port Failure
  Bent backplane pin
  Bad fabric connection
  Malfunctioning PFC/DFC
  Bad memory

GOLD
Diagnostic Integration
Configuration/reporting
  Configure online diagnostics and check diagnostics results
Verify hardware functionalities
  Boot-up Diagnostics
  Runtime Diagnostics: Scheduled, On-Demand, Health Monitoring
Automated action based on diagnostics results
  •Default corrective action
     Supervisor reset
     Supervisor switch-over
     Fabric switch-over
     Port shut down
     Line card reset
     Line card power down
     Generate a call-home message
  •Trigger Syslog
  •Trigger EEM policies
  •Generate SNMP Trap

Detect and identify problems before they result in network downtime!

GOLD
Diagnostic Integration
Boot-Up diagnostics
  Switch(config)#diagnostic bootup level complete
  Run during system bootup, line card OIR or Supervisor switchover
  Makes sure faulty hardware is taken out of service

Runtime diagnostics
Health-Monitoring
  Switch(config)#diagnostic monitor module 5 test 2
  Switch(config)#diagnostic monitor interval module 5 test 2 00:00:15
  Non-disruptive tests run in the background
  Serves as HA trigger

On-Demand
  Switch#diagnostic start module 4 test 8
  Module 4: Running test(s) 8 may disrupt normal system operation
  Do you want to continue? [no]: y
  Switch#diagnostic stop module 4
  All diagnostics tests can be run on demand, for troubleshooting purposes. It can also be used as a pre-deployment tool.

Scheduled
  Switch(config)#diagnostic schedule module 4 test 1 port 3 on Jan 3 2005 23:32
  Switch(config)#diagnostic schedule module 4 test 2 daily 14:45
  Schedule diagnostics tests, for verification and troubleshooting purposes

GOLD
High Level Architecture
[Figure: layered architecture, top to bottom]
NMS Layer
  Fault Policy Manager and other NMS Applications
IOS Layer
  MIB/SNMP, Embedded Syslog Manager, Embedded Event Manager, Call-Home
  GOLD Subsystems
  SEA & OBFL, Platform Specific Diagnostics
  Runtime Software Drivers
HARDWARE

GOLD
GOLD Test Suite
Boot-up Diagnostics
§ Forwarding Engine Learning Tests (Sup & DFC)
§ L2 Tests (Channel, BPDU, Capture)
§ L3 Tests (IPv4, IPv6, MPLS)
§ Span and Multicast Tests
§ CAM Lookup Tests (FIB, NetFlow, QoS CAM)
§ Port Loopback Test (all cards)
§ Fabric Snake Tests

Health Monitoring Diagnostics
§ SP-RP Inband Ping Test (Sup's SP/RP, EARL (L2 & L3), RW engines)
§ Fabric Channel Health Test (Fabric enabled line cards)
§ MacNotification Test (DFC line cards)
§ Non Disruptive Loopback Test
§ Scratch Registers Test (PLD & ASICs)

On-Demand Diagnostics
§ Exhaustive Memory Test
§ Exhaustive TCAM Search Test
§ Stress Testing
§ All bootup and health monitoring tests can be run on-demand

Scheduled Diagnostics
§ All boot-up and health monitoring tests can be scheduled
§ Scheduled Switch-over

GOLD
Example - Supervisor Data Path
[Figure: Supervisor data path through the MSFC (RP CPU), PFC3 (L3/4 Engine, L2 Engine), SP CPU, Port ASIC, Fabric Interface/Replication Engine and Switch Fabric, with the DBUS, RBUS, EOBC and 16 Gbps bus]

Monitors forwarding path between the Switch Processor, Route Processor and Forwarding Engine
Runs periodically every 15 seconds after system is online (configurable)
10 consecutive failures is treated as FATAL and will result in supervisor switchover or supervisor reset

Switch(config)#diagnostic monitor module 5 test 2
Switch(config)#diagnostic monitor interval module 5 test 2 00:00:15

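The "10 consecutive failures" rule can be modeled as a counter that resets on every passing run. The sketch below is a hypothetical illustration of that logic, not GOLD's actual implementation; the class name `HealthMonitor` and its method are assumptions.

```python
class HealthMonitor:
    """Hypothetical tracker for a periodic health-monitoring test."""

    def __init__(self, failure_threshold: int = 10) -> None:
        # 10 consecutive failures is treated as fatal, per the slide.
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record(self, passed: bool) -> bool:
        """Record one test run; return True when the failure is fatal."""
        if passed:
            # Any successful run breaks the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold


monitor = HealthMonitor()
# Nine failures followed by a pass: never fatal.
early = [monitor.record(False) for _ in range(9)] + [monitor.record(True)]
# Ten failures in a row: the tenth one is fatal.
streak = [monitor.record(False) for _ in range(10)]
print(any(early), streak[-1])
```

With a 15 second interval, ten consecutive failures correspond to roughly two and a half minutes of continuous test failure before a switchover or reset is triggered.
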
GOLD
Using it for Pre-Deployment
§ GOLD can be used for pre-stage testing. The order in which tests are run matters!
  Run diagnostics first on linecards, then on supervisors
  Run packet switching tests first, run memory tests after
Switch#diagnostic start module 6 test all
Module 6: Running test(s) 8 will require resetting the line card after the test has completed
Module 6: Running test(s) 1-2,5-9 may disrupt normal system operation
Do you want to continue? [no]: yes
*Mar 25 22:43:16: %DIAG-SP-6-TEST_RUNNING: Module 6: Running TestTransceiverIntegrity{ID=1} ...
*Mar 25 22:43:16: %DIAG-SP-3-TEST_SKIPPED: Module 6: TestTransceiverIntegrity{ID=1} is skipped
*Mar 25 22:43:16: %LINK-5-CHANGED: Interface GigabitEthernet6/1, changed state to administratively down
*Mar 25 22:43:16: %DIAG-SP-6-TEST_RUNNING: Module 6: Running TestLoopback{ID=2} ...
*Mar 25 22:43:16: %DIAG-SP-6-TEST_RUNNING: Module 6: Running TestAsicMemory{ID=8} ...
*Mar 25 22:43:16: SP: ******************************************************************
*Mar 25 22:43:16: SP: * WARNING:
*Mar 25 22:43:16: SP: * ASIC Memory test on module 6 may take up to 2hr 30min.
*Mar 25 22:43:16: SP: * During this time, please DO NOT perform any packet switching.
*Mar 25 22:43:16: SP: ******************************************************************
<snip>
Switch#diagnostic start module 5 test all
Module 5: Running test(s) 27-30 will power-down line cards and standby supervisor should be power-down manually and supervisor should be reset after the test
Module 5: Running test(s) 26 will shut down the ports of all linecards and supervisor should be reset after the test
Module 5: Running test(s) 3,5,8-10,19,22-23,26-31 may disrupt normal system operation
Do you want to continue? [no]: yes
<snip>

GOLD
Operation Example
Switch#show diagnostic content mod 5
Module 5: Supervisor Engine 720 (Active)
<snip>
                                                   Testing Interval
 ID   Test Name                          Attributes   (day hh:mm:ss.ms)
 ==== ================================== ============ =================
 1) TestScratchRegister -------------> ***N****A*** 000 00:00:30.00
 2) TestSPRPInbandPing --------------> ***N****A*** 000 00:00:15.00
 3) TestTransceiverIntegrity --------> **PD****I*** not configured
 4) TestActiveToStandbyLoopback -----> M*PDS***I*** not configured
 5) TestLoopback --------------------> M*PD****I*** not configured
 6) TestNewIndexLearn ---------------> M**N****I*** not configured
 7) TestDontConditionalLearn --------> M**N****I*** not configured
 8) TestBadBpduTrap -----------------> M**D****I*** not configured
 9) TestMatchCapture ----------------> M**D****I*** not configured
10) TestProtocolMatchChannel --------> M**D****I*** not configured
11) TestFibDevices ------------------> M**N****I*** not configured
12) TestIPv4FibShortcut -------------> M**N****I*** not configured
13) TestL3Capture2 ------------------> M**N****I*** not configured
14) TestIPv6FibShortcut -------------> M**N****I*** not configured
15) TestMPLSFibShortcut -------------> M**N****I*** not configured
16) TestNATFibShortcut --------------> M**N****I*** not configured
17) TestAclPermit -------------------> M**N****I*** not configured
18) TestAclDeny ---------------------> M**N****A*** 000 00:00:05.00
19) TestQoSTcam ---------------------> M**D****I*** not configured
<snip>

Diagnostics test suite attributes:
M/C/* - Minimal bootup level test / Complete bootup level test / NA
B/*   - Basic ondemand test / NA
P/V/* - Per port test / Per device test / NA
D/N/* - Disruptive test / Non-disruptive test / NA
S/*   - Only applicable to standby unit / NA
X/*   - Not a health monitoring test / NA
F/*   - Fixed monitoring interval test / NA
E/*   - Always enabled monitoring test / NA
A/I   - Monitoring is active / Monitoring is inactive
R/*   - Power-down line cards and need reset supervisor / NA
K/*   - Require resetting the line card after the test has completed / NA
T/*   - Shut down all ports and need reset supervisor / NA

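The 12-character attribute strings in the listing can be decoded against the legend. The Python sketch below assumes that each character position corresponds to one legend line, in the order the legend is printed; that mapping is inferred from the sample output (for example `***D*X**IR**`), and the names `ATTRIBUTE_LEGEND` and `decode_attributes` are hypothetical.

```python
# One dictionary per character position, in the order of the printed legend.
# Assumption: position N of the attribute string maps to legend line N.
ATTRIBUTE_LEGEND = [
    {"M": "Minimal bootup level test", "C": "Complete bootup level test"},
    {"B": "Basic ondemand test"},
    {"P": "Per port test", "V": "Per device test"},
    {"D": "Disruptive test", "N": "Non-disruptive test"},
    {"S": "Only applicable to standby unit"},
    {"X": "Not a health monitoring test"},
    {"F": "Fixed monitoring interval test"},
    {"E": "Always enabled monitoring test"},
    {"A": "Monitoring is active", "I": "Monitoring is inactive"},
    {"R": "Power-down line cards and need reset supervisor"},
    {"K": "Require resetting the line card after the test has completed"},
    {"T": "Shut down all ports and need reset supervisor"},
]


def decode_attributes(flags: str) -> list:
    """Translate an attribute string such as 'M**N****I***' into descriptions."""
    if len(flags) != len(ATTRIBUTE_LEGEND):
        raise ValueError(f"expected {len(ATTRIBUTE_LEGEND)} characters, got {len(flags)}")
    meanings = []
    for position, (flag, legend) in enumerate(zip(flags, ATTRIBUTE_LEGEND), start=1):
        if flag == "*":
            continue  # '*' means the attribute does not apply (NA)
        if flag not in legend:
            raise ValueError(f"unexpected flag {flag!r} at position {position}")
        meanings.append(legend[flag])
    return meanings


for description in decode_attributes("***D*X**IR**"):
    print(description)
```

Decoding the attributes before starting a test is a cheap way to spot disruptive tests, or tests that require a reset, ahead of time.
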
GOLD
Operation Example
20) TestL3VlanMet -------------------> M**N****I***  not configured   n/a
21) TestIngressSpan -----------------> M**N****I***  not configured   n/a
22) TestEgressSpan ------------------> M**D****I***  not configured   n/a
23) TestNetflowInlineRewrite --------> C*PD****I***  not configured   n/a
24) TestFabricSnakeForward ----------> M**N****I***  not configured   n/a
25) TestFabricSnakeBackward ---------> M**N****I***  not configured   n/a
26) TestTrafficStress ---------------> ***D****I**T  not configured   n/a
27) TestFibTcamSSRAM ----------------> ***D*X**IR**  not configured   n/a
28) TestAsicMemory ------------------> ***D*X**IR**  not configured   n/a
29) TestNetflowTcam -----------------> ***D*X**IR**  not configured   n/a
30) ScheduleSwitchover --------------> ***D****I***  not configured   n/a
31) TestFirmwareDiagStatus ----------> M**N****I***  not configured   n/a
32) TestAsicSync --------------------> ***N****A***  000 00:00:15.00  10

Pay extra attention to Memory tests:
Memory tests can take hours to complete and a reset is required after running these tests!

GOLD
Operation Example
Diagnostics test suite attributes:
M/C/* - Minimal bootup level test / Complete bootup level test / NA
B/*   - Basic ondemand test / NA
P/V/* - Per port test / Per device test / NA
D/N/* - Disruptive test / Non-disruptive test / NA
S/*   - Only applicable to standby unit / NA
X/*   - Not a health monitoring test / NA
F/*   - Fixed monitoring interval test / NA
E/*   - Always enabled monitoring test / NA
A/I   - Monitoring is active / Monitoring is inactive
R/*   - Power-down line cards and need reset supervisor / NA
K/*   - Require resetting the line card after the test has completed / NA
T/*   - Shut down all ports and need reset supervisor / NA
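
The attribute strings in the test list (for example ***D*X**IR**) can be read against this legend. The following is a minimal Python sketch, assuming that each of the 12 character positions corresponds to one legend line in the order shown and that '*' means NA. The names LEGEND and decode_attributes are illustrative and are not part of any Cisco tool.

```python
# Legend lines in the order they appear on the slide. ASSUMPTION: position i
# of the attribute string corresponds to legend line i, and '*' means NA.
LEGEND = [
    {"M": "Minimal bootup level test", "C": "Complete bootup level test"},
    {"B": "Basic ondemand test"},
    {"P": "Per port test", "V": "Per device test"},
    {"D": "Disruptive test", "N": "Non-disruptive test"},
    {"S": "Only applicable to standby unit"},
    {"X": "Not a health monitoring test"},
    {"F": "Fixed monitoring interval test"},
    {"E": "Always enabled monitoring test"},
    {"A": "Monitoring is active", "I": "Monitoring is inactive"},
    {"R": "Power-down line cards and need reset supervisor"},
    {"K": "Require resetting the line card after the test has completed"},
    {"T": "Shut down all ports and need reset supervisor"},
]


def decode_attributes(attrs):
    """Return the legend descriptions for every non-'*' flag in attrs."""
    if len(attrs) != len(LEGEND):
        raise ValueError(f"expected {len(LEGEND)} characters, got {len(attrs)}")
    meanings = []
    for position, (flag, choices) in enumerate(zip(attrs, LEGEND), start=1):
        if flag == "*":
            continue  # attribute does not apply to this test
        if flag not in choices:
            raise ValueError(f"unexpected flag {flag!r} at position {position}")
        meanings.append(choices[flag])
    return meanings


if __name__ == "__main__":
    for name, attrs in [("TestAsicMemory", "***D*X**IR**"),
                        ("TestAsicSync", "***N****A***")]:
        print(name, "->", "; ".join(decode_attributes(attrs)))
```

Read this way, the memory tests (27 to 29) come out as disruptive, not health-monitoring tests, and requiring a reset, which matches the warning about memory tests.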

GOLD
Operation Example

GOLD generic syslog messages start with the string “DIAG”; “CONST_DIAG”
messages are platform specific.

Bootup Test Failure:
%CONST_DIAG-SP-3-BOOTUP_TEST_FAIL: Module 2: TestL3VlanMet failed

Health Monitoring Test Failure:
%CONST_DIAG-SP-3-HM_TEST_FAIL: Module 5 TestSPRPInbandPing consecutive failure count:10
%CONST_DIAG-SP-6-HM_TEST_INFO: CPU util(5sec): SP=3% RP=12% Traffic=0%
%CONST_DIAG-SP-4-HM_TEST_WARNING: Sup switchover will occur after 10 consecutive failures

On Demand Diagnostics Test Failure:
%DIAG-SP-3-TEST_FAIL: Module 5: TestTrafficStress{ID=24} has failed. Error code = 0x1

Scheduled Diagnostics Test Failure:
%DIAG-SP-3-TEST_FAIL: Module 3: TestLoopback{ID=1} has failed. Error code = 0x1

Generic Minor and Major Failure:
%DIAG-SP-3-MINOR: Module 3: Online Diagnostics detected a Minor Error. Please use 'show diagnostic result <target>' to see test results.
%DIAG-SP-3-MAJOR: Module 6: Online Diagnostics detected a Major Error. Please use 'show diagnostic Module 6' to see test results.
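
When these messages are collected on a syslog server, the facility prefix is enough to tell generic GOLD messages from platform-specific ones. The following Python sketch shows one way to split a message into its parts. It assumes the layout seen in the examples above (%FACILITY-SP-SEVERITY-MNEMONIC: text); the function name parse_gold_message and the returned field names are illustrative only.

```python
import re

# %FACILITY[-SOURCE]-SEVERITY-MNEMONIC: text
# ASSUMPTION: the layout is inferred from the sample messages on the slide;
# the "-SP" part is treated as optional.
GOLD_MSG = re.compile(
    r"%(?P<facility>CONST_DIAG|DIAG)(?:-(?P<source>[A-Z]+))?"
    r"-(?P<severity>[0-7])-(?P<mnemonic>[A-Z_]+):\s*(?P<text>.*)"
)
MODULE = re.compile(r"Module (\d+)")


def parse_gold_message(line):
    """Return a dict of message fields, or None if line is not a GOLD message."""
    match = GOLD_MSG.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    # "DIAG" is the generic facility; "CONST_DIAG" is platform specific.
    fields["scope"] = ("generic" if fields["facility"] == "DIAG"
                       else "platform-specific")
    module = MODULE.search(fields["text"])
    fields["module"] = int(module.group(1)) if module else None
    return fields


if __name__ == "__main__":
    sample = ("%CONST_DIAG-SP-3-HM_TEST_FAIL: Module 5 TestSPRPInbandPing "
              "consecutive failure count:10")
    print(parse_gold_message(sample))
```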

GOLD
Recommendations

Boot-up diagnostics:
- Set level to complete
On demand diagnostics:
- Use as a pre-deployment tool: run complete diagnostics
before putting hardware into production environment
- Use as a troubleshooting tool when suspecting hardware
failure
Scheduled diagnostics:
- Schedule key diagnostics tests periodically
- Schedule all non-disruptive tests periodically
Health-monitoring diagnostics:
- Key tests running by default
- Enable additional non-disruptive tests for specific
functionalities enabled in your network: IPv6, MPLS, NAT…
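
These recommendations can be applied mechanically to the test list from the operation example. The Python sketch below is one possible reading, not a Cisco procedure: it assumes the 4th character of the attribute string is the D/N (disruptive / non-disruptive) flag, the 6th is X (not a health monitoring test) and the 9th is A/I (monitoring active / inactive). The function name plan_tests and the bucket names are hypothetical.

```python
def plan_tests(catalog):
    """Sort (name, attrs) pairs into buckets that mirror the recommendations."""
    plan = {
        "schedule_periodically": [],             # all non-disruptive tests
        "enable_health_monitoring": [],          # non-disruptive, HM-capable, inactive
        "pre_deployment_or_troubleshooting": [], # disruptive tests, run on demand
    }
    for name, attrs in catalog:
        disruptive = attrs[3] == "D"       # ASSUMPTION: 4th flag is D/N
        hm_capable = attrs[5] != "X"       # ASSUMPTION: 6th flag is X
        hm_inactive = attrs[8] == "I"      # ASSUMPTION: 9th flag is A/I
        if disruptive:
            plan["pre_deployment_or_troubleshooting"].append(name)
            continue
        plan["schedule_periodically"].append(name)
        if hm_capable and hm_inactive:
            plan["enable_health_monitoring"].append(name)
    return plan


if __name__ == "__main__":
    catalog = [
        ("TestIPv6FibShortcut", "M**N****I***"),
        ("TestAclDeny", "M**N****A***"),
        ("TestAsicMemory", "***D*X**IR**"),
    ]
    for bucket, names in plan_tests(catalog).items():
        print(f"{bucket}: {', '.join(names) or '-'}")
```

Whether a given inactive test is worth enabling still depends on the features in use (IPv6, MPLS, NAT), so the second bucket is a list of candidates rather than a decision.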

GOLD
Case Study

6500_cust# show module
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
 12   48  CEF720 48 port 1000mb SFP              WS-X6748-SFP       SALxxxxxxx

•Situation:
Customer was running into a problem: packets ingress on a particular line card
were getting dropped intermittently. All software/hardware entries etc. were
checked.
•Action:
TAC engineer requested customer to run line card memory test
•Results:
Diagnostics results revealed that memory was failing. Line card was replaced
and the switch functionality was restored in a very short time

6500_cust# show diagnostic result module 12 test 13 de

Current boot up diagnostic level: complete

Test results: (. = Pass, F = Fail, U = Untested)
______________________________________________________
13) TestLinecardMemory --------------> F

Error code ------------------> 1 (DIAG_FAILURE)
Total run count -------------> 1
Last test execution time ----> Mar 01 2006 00:23:23
First test failure time -----> Mar 01 2006 00:23:23
Last test failure time ------> Mar 01 2006 00:23:23
Last test pass time ---------> n/a
Total failure count ---------> 1
Consecutive failure count ---> 1

Thanks to GOLD !!!!

GOLD
Case Study

6500_cust# diagnostic start module 1 test 28
Module 1: Running test(s) 28 may disrupt normal operation
Do you want to run disruptive tests? [no] yes

Mar 17 15:58:34: SP: ******************************************************************
Mar 17 15:58:34: SP: * WARNING:
Mar 17 15:58:34: SP: * ASIC Memory test on module 1 may take up to 1hr 30min.
Mar 17 15:58:34: SP: * During this time, please DO NOT perform any packet switching.
Mar 17 15:58:34: SP: ******************************************************************
Mar 17 16:10:27: SP: diag_scp_asic_mem_test [1/1/RN_PBIF]: LCP TEST FAILED. fail_addr
= 0xE923, test_data | result_data: 55 | 55, 55 | 55, 55 | 55, <snip> 55 | 53, <snip>
55 | 55, 55 | 55,
Mar 17 16:10:27: SP: do_mem_test [1/1]: test RN_PBIF memory failed
Mar 17 16:10:27: SP: ******************************************************************
Mar 17 16:10:27: SP: * WARNING: Please RESET module 1 prior to normal use. Also, packet
Mar 17 16:10:27: SP: * switching tests will no longer work (i.e. test failure) because
Mar 17 16:10:27: SP: * its memories are filled with test patterns.
Mar 17 16:10:27: SP: ******************************************************************
MarchCMem: got data mismatch at addr: 0xE923, dev#: 1
rc = 0x12 comparison data|rslt: 55|55 55|55 55|55 <snip> 55|53 <snip> 55|55 55|55

Mar 17 16:10:27: %DIAG-SP-3-TEST_FAIL: Module 1: TestLinecardMemory{ID=28} has
failed. Error code = 0x1

GOLD
GOLD Paper

http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper0900aecd801e659f.shtml
