Troubleshooting - IP Routing
Issue: 01
Date: 2011-10-15
Copyright Huawei Technologies Co., Ltd. 2011. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied. The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute the warranty of any kind, express or implied.
Symbol Conventions
The symbols that may be found in this document are defined as follows.
• Alerts you to a high risk hazard that could, if not avoided, result in serious injury or death.
• Alerts you to a medium or low risk hazard that could, if not avoided, result in moderate or minor injury.
• Alerts you to a potentially hazardous situation that could, if not avoided, result in equipment damage, data loss, performance deterioration, or unanticipated results.
• Provides a tip that may help you solve a problem or save time.
• Provides additional information to emphasize or supplement important points in the main text.
Change History
Updates between document issues are cumulative. Therefore, the latest document issue contains all updates made in previous issues.
Contents
About This Document
1 ARP Troubleshooting
  1.1 The ARP Entries on the Local Device Cannot Be Learnt By the Peer
    1.1.1 Common Causes
    1.1.2 Troubleshooting Flowchart
    1.1.3 Troubleshooting Procedure
    1.1.4 Relevant Alarm and Log Messages
  1.2 Trouble Cases
    1.2.1 PCs on the Same Network Segment Cannot Access Each Other Because ARP Proxy Is Not Enabled
2 IP Forwarding Troubleshooting
  2.1 The Ping Operation Fails
    2.1.1 Common Causes
    2.1.2 Troubleshooting Flowchart
    2.1.3 Troubleshooting Procedure
    2.1.4 Relevant Alarms and Logs
  2.2 The Tracert Operation Fails
    2.2.1 Common Causes
    2.2.2 Troubleshooting Procedure
    2.2.3 Relevant Alarms and Logs
    3.3.1 Common Causes
    3.3.2 Troubleshooting Flowchart
    3.3.3 Troubleshooting Procedure
    3.3.4 Relevant Alarms and Logs
6 OSPF Troubleshooting
  6.1 The OSPF Neighbor Relationship Is Down
    6.1.1 Common Causes
    6.1.2 Troubleshooting Flowchart
    6.1.3 Troubleshooting Procedure
    6.1.4 Relevant Alarms and Logs
  6.2 OSPF Neighbor Relationship Cannot Enter the Full State
    6.2.1 Common Causes
    6.2.2 Troubleshooting Flowchart
    6.2.3 Troubleshooting Procedure
    6.2.4 Relevant Alarms and Logs
  6.3 Trouble Cases
    6.3.1 The router Receives Two LSAs with the Same LS ID but Fails to Calculate a Route Based on One of the LSAs
    6.3.2 The OSPF Neighbor Relationship Cannot Be Established Between Two Devices Because the Link Between the Devices Is Faulty
    6.3.3 An OSPF Routing Loop Occurs Because Router IDs of Devices Conflict
    6.3.4 Services on the Master Plane Are Interrupted Because a Device on the Slave Plane Performs Master/Slave Switchover on a Bearer Network
7 IS-IS Troubleshooting
  7.1 The IS-IS Neighbor Relationship Cannot Be Established
    7.1.1 Common Causes
    7.1.2 Troubleshooting Flowchart
    7.1.3 Troubleshooting Procedure
    7.1.4 Relevant Alarms and Logs
  7.2 A Device Fails to Learn Specified IS-IS Routes from Its Neighbor
    7.2.1 Common Causes
    7.2.2 Troubleshooting Flowchart
    7.2.3 Troubleshooting Procedure
    7.2.4 Relevant Alarms and Logs
  7.3 IS-IS Routes Flap
    7.3.1 Common Causes
    7.3.2 Troubleshooting Flowchart
    7.3.3 Troubleshooting Procedure
    7.3.4 Relevant Alarms and Logs
  7.4 Trouble Cases
    7.4.1 An Upper-layer Device Cannot Learn IS-IS Routes Due to Differences in the Types of Routes Imported by IS-IS on a Huawei Device and a Non-Huawei Device
8 BGP Troubleshooting
  8.1 The BGP Peer Relationship Fails to Be Established
    8.1.1 Common Causes
    8.1.2 Troubleshooting Flowchart
    8.1.3 Troubleshooting Procedure
    8.1.4 Relevant Alarms and Logs
  8.2 BGP Public Network Traffic Is Interrupted
    8.2.1 Common Causes
    8.2.2 Troubleshooting Flowchart
    8.2.3 Troubleshooting Procedure
    8.2.4 Relevant Alarms and Logs
  8.3 BGP Private Network Traffic Is Interrupted
    8.3.1 Common Causes
    8.3.2 Troubleshooting Flowchart
    8.3.3 Troubleshooting Procedure
    8.3.4 Relevant Alarms and Logs
  8.4 Troubleshooting of the Fault that a Local BGP Peer (Route Sender) Cannot Receive ORFs from a Remote Peer (Route Receiver)
    8.4.1 Common Causes
    8.4.2 Troubleshooting Flowchart
    8.4.3 Troubleshooting Procedure
    8.4.4 Relevant Alarms and Logs
  8.5 Trouble Cases
    8.5.1 Traffic Traverses the Egress Device of an AS Because BGP Delivers Default Routes with Different MEDs
    8.5.2 Routing Policies Delivered by a PE Do Not Take Effect Because There Are Multiple Routing Policies with the Same Name
    8.5.3 A PE Fails to Establish the Public Network LSP Because the Path of IGP Routes Is Incorrect
    8.5.4 The BGP Peer Relationship Goes Down Because of Route Iteration
    8.5.5 Static Routes Do Not Take Effect Because of the Relay Depth
    8.5.6 The Outgoing Traffic Is Not Balanced Because BGP Load Balancing Is Not Enabled
    8.5.7 Summary Routes Advertised by EBGP Flap Frequently Because Routing Protocols Are Configured with Improper Preferences
    8.5.8 Traffic Is Not Load Balanced Between Two Links Because Load Balancing Is Not Configured on the Peer End
1 ARP Troubleshooting
About This Chapter
1.1 The ARP Entries on the Local Device Cannot Be Learnt By the Peer
1.2 Trouble Cases
1.1 The ARP Entries on the Local Device Cannot Be Learnt By the Peer
1.1.1 Common Causes
This fault is commonly caused by one of the following:
• The interface connecting the local device to the peer is not physically Up.
• The IP addresses of the interfaces connecting the local device and the peer are on different network segments.
• The local device is under ARP attacks.
• The link transmission is unstable or the optical power is insufficient.
• The device software is faulty.
Figure 1-1 Troubleshooting flowchart for the fault that the ARP entries on the peer device cannot be learnt
[Flowchart. Decision points: Is there any error packet on the interface? Are the IP addresses of the interface and the peer device on the same network segment? Does the number of sent broadcast packets and received unicast packets increase?]
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Run the display interface interface-type interface-number command in the user view to view the physical status of the interface connecting the local device to the peer.
• If the interface is Down, refer to the relevant troubleshooting cases to bring the interface Up.
• If the interface is Up, go to Step 2.
Step 2 View the Internet Address field in the command output. Check that the IP addresses of the interfaces connecting the two devices are on the same network segment.
• If the two IP addresses are on different network segments, reconfigure either one of them so that both are on the same network segment.
• If the two IP addresses are on the same network segment, go to Step 3.
Step 3 Run the display arp statistics slot slot-id command in the user view to check whether the number of ARP entries on the specified LPU of the local device exceeds the upper limit.
• If the upper limit is reached, refer to the chapter "Configuring ARP Security" in the NE5000E Configuration Guide to solve the problem.
• If the number of ARP entries does not exceed the upper limit but the fault persists, go to Step 4.
Step 4 View status information of the interface connecting the local device to the peer. Check whether the count displayed in the CRC field increases.
• If the count increases, a large number of error packets exist on the interface. In this case, check the link quality to verify whether the link transmission is unstable or the optical power is insufficient.
• If the count does not increase but the fault persists, go to Step 5.
Step 5 Ping the IP address of the peer device from the local device to trigger the local device to send ARP request packets. Then view status information of the interface connecting the local device to the peer and check whether the count in the Broadcast sub-field of the Output field increases.
• If the number of broadcast packets remains unchanged, no ARP request packet is sent, which means that the local device is faulty. In this case, go to Step 7.
• If the number of broadcast packets increases, check whether the peer device and the link are functioning properly. If either is not functioning properly, locate the fault and rectify it. If both are functioning properly, go to Step 6.
Step 6 View status information of the interface connecting the local device to the peer. Check whether the count in the Unicast sub-field of the Input field increases.
• If the number of unicast packets remains unchanged, the local device has not received any ARP response packets. The ARP request packets sent by the local device may have been discarded on the peer device because their format is not RFC-compliant.
• If the number of unicast packets increases, the local device has received ARP response packets, which means that some software modules on the local device are not functioning properly. In this case, go to Step 7.
Step 7 Collect the following information and contact Huawei technical support personnel.
• Results of the preceding operation procedure
• Configuration files, log files, and alarm files of the devices
----End
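The comparisons in Step 2 and Steps 4 to 6 can be sketched in Python. This is only an illustration of the decision logic, not a Huawei tool: the Counters class, the same_segment and next_step functions, and the sample addresses and values are all hypothetical, and the counters are assumed to be read by hand from two runs of display interface taken before and after pinging the peer.

```python
from dataclasses import dataclass
from ipaddress import ip_interface


@dataclass
class Counters:
    """Hypothetical snapshot of three counters read from display interface."""
    crc: int            # Input: CRC
    out_broadcast: int  # Output: Broadcast
    in_unicast: int     # Input: Unicast


def same_segment(local: str, peer: str) -> bool:
    """Step 2: True if both interface addresses (e.g. '10.1.1.1/24') share a network."""
    return ip_interface(local).network == ip_interface(peer).network


def next_step(before: Counters, after: Counters) -> str:
    """Steps 4-6: interpret counter changes observed across a ping to the peer."""
    if after.crc > before.crc:
        return "Step 4: error packets are increasing; check link quality and optical power"
    if after.out_broadcast == before.out_broadcast:
        return "Step 5: no ARP request was sent; local device is suspect, go to Step 7"
    if after.in_unicast == before.in_unicast:
        return "Step 6: no ARP response was received; check the peer device"
    return "Step 6: ARP response was received; local software is suspect, go to Step 7"


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(same_segment("10.1.1.1/24", "10.1.2.1/24"))
    print(next_step(Counters(0, 10, 5), Counters(0, 12, 5)))
```

Taking the two snapshots close together, with no other traffic on the link if possible, keeps unrelated broadcast or unicast packets from masking the ARP exchange.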
Log Messages
None.
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
• In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
• In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
On the network shown in Figure 1-2, PC1 and PC2 reside on the same network segment 192.168.0.0/16. The two routers have static routes to the network segment of each other. After the configuration is complete, PC1 fails to ping PC2.
Figure 1-2 Networking diagram for the scenario that the PCs on the same network segment cannot access each other
[Networking diagram. Surviving label: 192.168.1.2/16]
Fault Analysis
1. Run the arp -a command on PC1 to check all ARP entries. No entry maps the IP address of PC2 to its MAC address, which indicates that the ARP entry was not learned when the ping command was run. When RouterA receives the ARP Request packet from PC1, it finds that the destination IP address in the packet is not the IP address of its local interface and therefore discards the packet.
NOTE
Usually, when the router receives an ARP Request packet, it checks whether the packet is destined for itself. If so, the router sends an ARP Reply packet; if not, the router discards the ARP Request packet. If routed proxy ARP is enabled, the router does not directly discard a received ARP Request packet that is not destined for itself. Instead, it checks its routing table for a route to the intended destination of the ARP Request packet. If such a route exists and the IP address of the router interface connected to the PC is on the same network segment as the IP address of the PC, the router acts as a proxy and replies to the sender of the ARP Request packet with its own MAC address. The sender then sends its packets to the router, which forwards them to the intended destination.
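The decision described in this note can be sketched in Python as follows. This is a simplified model rather than the router's actual implementation: the arp_request_action function is hypothetical, and the RouterA interface address 192.168.1.1/16 and the static route prefix 192.168.2.0/24 are assumed values that do not appear in this case. Only the PC addresses are taken from the case.

```python
from ipaddress import ip_address, ip_interface, ip_network


def arp_request_action(sender_ip, target_ip, iface_addr, routes, proxy_arp_enabled):
    """Return 'reply', 'proxy-reply', or 'discard' for a received ARP Request.

    sender_ip         -- IP address of the host that sent the request
    target_ip         -- IP address being resolved
    iface_addr        -- receiving interface address with mask, e.g. '192.168.1.1/16'
    routes            -- destination prefixes present in the routing table
    proxy_arp_enabled -- whether routed proxy ARP is enabled on the interface
    """
    sender = ip_address(sender_ip)
    target = ip_address(target_ip)
    iface = ip_interface(iface_addr)

    if target == iface.ip:
        return "reply"      # the request is for the router itself
    if not proxy_arp_enabled:
        return "discard"    # default behavior: not for me, drop it
    has_route = any(target in ip_network(prefix) for prefix in routes)
    sender_on_segment = sender in iface.network
    return "proxy-reply" if has_route and sender_on_segment else "discard"


if __name__ == "__main__":
    # With a /16 mask PC1 treats PC2 as on-link, so it sends an ARP Request for PC2.
    print(ip_address("192.168.2.2") in ip_interface("192.168.1.2/16").network)
    for enabled in (False, True):
        action = arp_request_action(
            "192.168.1.2", "192.168.2.2",
            "192.168.1.1/16",       # assumed RouterA interface address
            ["192.168.2.0/24"],     # assumed static route toward PC2
            enabled,
        )
        print(enabled, action)
```

The sketch leaves out checks that a real device may apply, such as the outbound interface of the matching route.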
Procedure
Step 1 Run the system-view command to enter the system view on RouterA (or RouterB).
Step 2 Run the interface interface-type interface-number command to enter the view of the customer-facing interface.
Step 3 Run the arp-proxy enable command to enable routed proxy ARP on the interface.
Step 4 Run the ping 192.168.2.2 command on PC1 to ping the IP address of PC2. Then run the arp -a command on PC1. The MAC address corresponding to the IP address of PC2 is the MAC address of the RouterA interface connected to PC1.
C:\Documents and Settings\Administrator>arp -a
Interface: 192.168.1.2 --- 0x2
  Internet Address      Physical Address      Type
  192.168.2.2           00e0-fc39-80aa        dynamic
When the configuration is complete, PC1 can successfully ping PC2, and the fault is rectified.
----End
Summary
Proxy ARP is a technique by which a device serving as a proxy on a given network answers ARP requests for a network address that is not on that network. Routed proxy ARP enables communication between computers that are on the same network segment but on different physical networks.
2 IP Forwarding Troubleshooting
[Troubleshooting flowchart for a failed ping operation. Decision points: Is the link transmission delay too long? Is the ping operation correct? Is a CPU attack defense policy configured on the device? Are FIB and ARP entries on the device? Do error packets exist on interfaces? Does the network layer of the device work properly? If the fault persists after all checks, seek technical support.]
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
• In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
• In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check whether the ping failure is caused by an excessively long link transmission delay. To do so, run the ping -t time-value -v destination-address command.
NOTE
In this command, the parameter -t is used to set the timeout period for waiting for a Response packet from the destination end. By default, the timeout period is 2000 ms. The parameter -v is used to display unexpected Response packets; by default, such packets are not displayed.
The ping operation succeeds if a Response packet is received within the specified period and fails if no Response packet is received within that period. Therefore, you can specify proper values for -t and -v to check whether the ping failure is caused by a long link transmission delay. If ping packet loss occurs because the configured timeout period is shorter than the actual link transmission delay, the following information is displayed:
<HUAWEI> ping -v -t 1 10.1.1.1
PING 10.1.1.1: 56 data bytes, press CTRL_C to break
Request time out
Error: Sequence number = 1 is less than the correct = 2!
If the preceding information is displayed, the ping failure occurs because the configured timeout period is shorter than the actual link transmission delay. To solve this problem, increase the value of -t. If the fault persists, go to Step 2.
NOTE
If the ping operation succeeds only after -t is increased to a very large value, a fault may exist on the device or link. Check the device and link status and clear the fault.
To ping a private network address from a PE, you need to run the ping -vpn-instance vpn-name destination-address command. The parameter -vpn-instance vpn-name specifies the VPN instance to which the pinged destination address belongs.
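The relationship between the -t value and the actual delay in Step 1 can be modeled with a short Python sketch. It is a toy model under stated assumptions, not the device's ping implementation: the classify_replies and smallest_working_timeout functions and the sample round-trip times are hypothetical.

```python
def classify_replies(rtts_ms, timeout_ms):
    """Label each probe 'reply' or 'timeout' given its round-trip time.

    rtts_ms    -- round-trip time per probe in ms; None means the reply never arrives
    timeout_ms -- the value that would be passed to -t
    """
    results = []
    for seq, rtt in enumerate(rtts_ms, start=1):
        ok = rtt is not None and rtt <= timeout_ms
        results.append((seq, "reply" if ok else "timeout"))
    return results


def smallest_working_timeout(rtts_ms):
    """Smallest -t value (ms) that accepts every reply that does arrive."""
    arrived = [rtt for rtt in rtts_ms if rtt is not None]
    return max(arrived) if arrived else None


if __name__ == "__main__":
    sample = [5, 8, None]   # made-up delays; the third probe is genuinely lost
    print(classify_replies(sample, timeout_ms=1))
    print(classify_replies(sample, timeout_ms=2000))
    print(smallest_working_timeout(sample))
```

A probe that times out in this model but whose reply still arrives later corresponds to the unexpected Response packets that -v makes visible. A probe with no reply at all stays a timeout however large -t is, which is the case where the device or link should be checked instead.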
Step 2 Check that there are no incorrect operations.
1. Check whether the ping -f command is run. If -f is specified, Ping packets are not fragmented. In this case, check whether the MTU of any outbound interface along the path is smaller than the size of the Ping packet. If the MTU is smaller than the size of the Ping packet, packet loss occurs, and you need to change the size of the Ping packet to a value smaller than the MTU. Otherwise, go to sub-step 2. You can run the following command to view the MTU of an interface:
<HUAWEI> display interface gigabitethernet 1/0/1
GigabitEthernet1/0/1 current state : UP
Line protocol current state : UP
Last line protocol up time : 2010-10-25 17:34:51
Description: HUAWEI, GigabitEthernet1/0/1 Interface (ifindex: 6, vr: 0)
Route Port,The Maximum Transmit Unit is 1500
Internet Address is 1.0.0.1/24
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0002-0002-0002
The Vendor PN is PD100-TXLEB
The Vendor Name is Santur Corp.
Transceiver max BW: 43000Mbps, Transceiver Mode: single mode
WaveLength: 0nm, Transmission Distance: 10000m
Rx Optical Power: -40.00dBm  Tx Optical Power: -40.00dBm
Loopback: none, LAN full-duplex mode, Pause Flowcontrol: Receive Enable and Send Enable
Last physical up time : 2010-10-25 17:27:25
Last physical down time : 2010-10-25 17:17:24
Current system time: 2010-10-25 18:11:44
Statistics last cleared:never
Last 300 seconds input rate 25600 bits/sec, 0 packets/sec
Last 300 seconds output rate 25600 bits/sec, 0 packets/sec
Input: 960300 bytes, 100 packets
Output: 960200 bytes, 100 packets
Input:
  Unicast: 100 packets, Multicast: 0 packets
  Broadcast: 0 packets, JumboOctets: 0 packets
  CRC: 0 packets, Symbol: 0 packets
  Overrun: 0 packets, InRangeLength: 0 packets
  LongPacket: 100 packets, Jabber: 0 packets, Alignment: 0 packets
  Fragment: 0 packets, Undersized Frame: 0 packets
  RxPause: 0 packets
Output:
  Unicast: 100 packets, Multicast: 0 packets
  Broadcast: 0 packets, JumboOctets: 100 packets
  Lost: 0 packets, Overflow: 0 packets, Underrun: 0 packets
  System: 0 packets, Overruns: 0 packets
  TxPause: 0 packets
Last 300 seconds input utility rate: 0.01%
Last 300 seconds output utility rate: 0.01%
2. Check whether the ping -i command is run, that is, whether an outbound interface is specified. If a broadcast interface such as an Ethernet interface is specified as the outbound interface, the destination address to be pinged must be the address of the directly connected interface. If this condition is not met, you need to specify the directly connected interface as the outbound interface. If the fault persists, go to Step 3.
NOTE
If -f is specified in a ping command, Ping packets are not fragmented. If -i interface-name is specified in a ping command, interface-name is used as the outbound interface of Ping packets and the destination address is used as the next-hop address.
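The MTU comparison in sub-step 1 can be sketched in Python. The sketch rests on two assumptions that should be verified for the device in use: that the Ping packet size counts only the ICMP data bytes (as in the "56 data bytes" line of the ping output), and that the IPv4 header carries no options, giving 20 bytes of IP header plus 8 bytes of ICMP header. The function names are hypothetical.

```python
import re

IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def parse_mtu(display_output: str):
    """Extract the MTU from display interface output, or None if it is absent."""
    match = re.search(r"Maximum Transmit Unit is (\d+)", display_output)
    return int(match.group(1)) if match else None


def fits_without_fragmentation(payload_bytes: int, mtu: int) -> bool:
    """True if an echo request with this payload fits in one IPv4 packet."""
    return payload_bytes + ICMP_HEADER + IPV4_HEADER <= mtu


def max_payload(mtu: int) -> int:
    """Largest payload that can be sent unfragmented over a link with this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


if __name__ == "__main__":
    line = "Route Port,The Maximum Transmit Unit is 1500"
    mtu = parse_mtu(line)
    print(mtu, max_payload(mtu), fits_without_fragmentation(1473, mtu))
```

The check has to hold for the smallest MTU among all outbound interfaces along the path, so the comparison would be repeated for each hop.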
Step 3 Locate the direction in which the fault occurs. A ping operation involves three roles: the sending device (source end) of Ping packets, intermediate device, and receiving device (destination end) of the Ping packets. If the ping operation fails, the fault may occur in the sending or receiving direction of any of the three devices and therefore you need to locate the direction and node where the fault occurs. Check whether or not the fault occurs in the direction from the source end to the destination end or in the reverse direction. Stop the ping operation on the source end and destination end, and run the display icmp statistics command to check ICMP packet transmission. The following information is displayed:
<HUAWEI> display icmp statistics
  Input: bad formats         0   bad checksum                 0
         echo               36   destination unreachable      9
         ...                     redirects                   43
                                 parameter problem            0
                                 information request          0
                                 mask replies                 0
                                 Mping reply                  0
  Output:...                     destination unreachable  71438
                                 redirects                    0
                                 parameter problem            0
                                 information reply            0
                                 mask replies                 0
                                 Mping reply                  0
l If the number of ICMP packets does not increase, it indicates that the device does not receive other ICMP packets such as ICMP packets sent from the NMS. Do as follows to locate the fault. Perform a ping operation, and run the display icmp statistics command again to view statistics about ICMP packets. According to the numbers of sent and received ICMP packets, you can locate the direction in which the fault occurs: If the following conditions are all met, it indicates that the source end sends Request packets but does not receive any Response packet, and the destination end does not receive the Request packets. On the source end, the value of the Output:echo field increases normally but the value of the Input:echo field does not increase. On the destination end, the numbers of sent and received packets remain unchanged. In this case, you can determine that the fault occurs in the direction from the source end to the destination end. If the following conditions are all met, it indicates that the source end sends Request packets but does not receive any Response packet, and the destination end receives the Request packets and sends Response packets. On the source end, the value of the Output:echo field increases normally but the value of the Input:echo field does not increase. On the destination end, the numbers of sent and received packets increase normally. In this case, you can determine that the fault occurs in the direction from the destination end to the source end. After determining the direction in which the fault occurs, go to Step 4. l If the number of ICMP packets still increases, it indicates that the board or the device receives other ICMP packets. Do as follows to locate the fault.
NOTE
Before performing subsequent operations, ensure that: l Services on the current network will not be affected. l No traffic policies are applied to interfaces.
1. Configure an ACL on each device to match Ping packets by using source and destination addresses. Take the following configurations as an example:
statistics enable
#
acl number 3000
 rule 5 permit ip source 1.1.1.1 0 destination 1.1.1.2 0
#
traffic classifier 3000 operator or
2. Run the traffic-policy command in the interface view to configure a traffic policy and apply the ACL to interfaces. On the source end and destination end, apply the traffic policy in the inbound direction of interfaces on both ends. On the intermediate device, apply the traffic policy in both the inbound and outbound directions of the associated interface. Take the following configurations as an example:
#
interface GigabitEthernet2/0/0
 ip address 1.1.1.2 255.255.255.252
 traffic-policy 3000 inbound
#
interface GigabitEthernet3/0/0
 traffic-policy 3001 outbound
#
NOTE
If the ACL is applied to a trunk interface or VLANIF interface, you need to configure the traffic policy on a physical member interface.
3. Run the display traffic policy statistics interface command to view statistics about the Ping packets that match the ACL on each interface.
<HUAWEI> display traffic policy statistics interface gigabitethernet 1/0/0 inbound
Interface: GigabitEthernet1/0/0
inbound: test
Traffic policy applied at 2007-08-30 18:30:20
Traffic policy Statistics enabled at 2007-08-30 18:30:20
Statistics last cleared: Never
Rule number: 7 IPv4, 1 IPv6
Current status: OK!
Item                 Packets         Bytes
-------------------------------------------------------------------
Matched                1,000       100,000
 +--Passed               500        50,000
 +--Dropped              500        50,000
   +--Filter             100        10,000
   +--URPF               100        10,000
   +--CAR                300        30,000
Missed                   500        50,000
Last 30 seconds rate
If all Ping packets match the ACL, it indicates that the Ping packets are sent or received normally. If the ping operation still fails, collect the preceding information and contact Huawei technical support personnel. If both incoming and outgoing Ping packets of the intermediate device match the ACL, it indicates that the intermediate device works properly. Then, you need to check whether or not a fault occurs on the source end or destination end. If incoming Ping packets of one of the three devices do not match the ACL, it indicates that the upstream device of this device becomes faulty. Then, go to Step 5. Step 4 Locate the node where the fault occurs. Locate the node according to the direction in which the fault occurs.
l If the fault occurs in the direction from the source end to the destination end, do as follows to locate the node where the fault occurs, starting with the source end. l If the fault occurs in the direction from the destination end to the source end, do as follows to locate the node where the fault occurs, starting with the destination end. Run the tracert dest-ip-address command to find the location where packet loss occurs.
<HUAWEI> tracert 1.1.1.1
traceroute to 1.1.1.1(1.1.1.1), max hops: 30 ,packet length: 40
 1 30.1.1.1 5 ms 4 ms 3 ms
 2 89.0.0.2 10 ms 11 ms 8 ms
 3 * * *
 ...
The preceding command output shows that the next hop of the route to 89.0.0.2 (namely, the node displayed as "3 * * *") becomes faulty. After locating the node where the fault occurs, go to Step 5.
NOTE
For the tracert to a VPN, run the tracert -vpn-instance vpn-name destination-address command for fault location. -vpn-instance vpn-name specifies the VPN instance to which the tracerted destination address belongs.
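The reading of the tracert output in Step 4 can be sketched in Python. This is only an illustration: the function name first_silent_hop is hypothetical, and the sketch assumes that the output has been captured as text with one hop per line, as in the example above.

```python
import re


def first_silent_hop(tracert_output):
    """Return (hop_number, last_responding_address) for the first hop
    whose probes all timed out, or None if every hop responded."""
    last_address = None
    for line in tracert_output.splitlines():
        # Hop lines start with the hop number; header lines do not.
        match = re.match(r"\s*(\d+)\s+(.*)$", line)
        if not match:
            continue
        hop = int(match.group(1))
        fields = match.group(2).split()
        if fields and all(field == "*" for field in fields):
            return hop, last_address
        if fields:
            last_address = fields[0]
    return None


sample = """\
traceroute to 1.1.1.1(1.1.1.1), max hops: 30 ,packet length: 40
 1 30.1.1.1 5 ms 4 ms 3 ms
 2 89.0.0.2 10 ms 11 ms 8 ms
 3 * * *
"""
print(first_silent_hop(sample))
```

For the sample output, the sketch reports hop 3 as the first silent hop and 89.0.0.2 as the last node that responded, which matches the interpretation given in Step 4.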
Step 5 Check whether or not a local attack defense policy is configured on the node where the fault occurs. If some devices have been attacked by ICMP packets, the rate at which ICMP packets are sent to the CPU is decreased and excess ICMP packets are dropped to protect against attacks. As a result, the ping operation fails. Run the display current-configuration | include cpu-defend command to check whether a CPU attack defense policy exists in the configuration file of the device.
l If a CPU attack defense policy is configured, run the display cpu-defend policy and display cpu-defend car commands to check the following: Check whether or not a blacklist that contains the source or destination IP address of Ping packets is configured. Check whether or not the CAR is configured. If the CAR is configured, check whether or not Ping packets fail to be processed because the CAR value is too small. If the blacklist is configured or the CAR value is too small, a ping failure or packet loss occurs. If the ping operation is still required, delete the blacklist or the CAR and then run the ping command again. If the ping operation still fails, go to Step 6.
l If no CPU attack defense policy is configured, go to Step 6.
Step 6 Check that routing entries and ARP entries on the node where the fault occurs are correct. Run the display ip routing-table destination-address command on the node where the fault occurs to check whether or not there is a route to the destination address. If there is no such route, troubleshoot the routing protocol in use. If there is a route to the destination address and Ping packets are transmitted over an Ethernet link, run the display arp slot slot-number command to check whether or not the required ARP entry exists. If the required ARP entry does not exist, see the HUAWEI NetEngine5000E Troubleshooting - LAN Access and MAN Access. If the fault persists, go to Step 7.
For the ping to a VPN, run the display ip routing-table vpn-instance vpn-name destination-address command to check FIB entries. vpn-instance vpn-name specifies the VPN instance to which the pinged destination address belongs.
Step 7 Check that there are no error packets on interfaces on the node where the fault occurs. Run the display interface interface-type interface-number command to check packet statistics on the specified interface. l Check whether or not the value of the CRC field on an Ethernet interface increases after this display command is run again. l Check whether or not the value of the SDH alarm field or SDH error field on a POS interface increases after this display command is run again. If the number of error packets or alarms on the specified interface increases, it indicates that the link or optical module becomes faulty. Clear faults on the link or optical module. If the number of error packets or alarms on the specified interface does not increase, go to Step 8. Step 8 Locate the layer where the fault occurs. After finding the node where the fault occurs, do as follows to locate the layer where the fault occurs. 1. Check whether or not ICMP packets are sent and received normally.
<HUAWEI> display icmp statistics
  Input: bad formats         0   bad checksum                 0
         echo                0   destination unreachable      0
         source quench       0   redirects                    0
         echo reply          0   parameter problem            0
         timestamp           0   information request          0
         mask requests       0   mask replies                 0
         time exceeded       0
         Mping request       0   Mping reply                  0
  Output:echo                0   destination unreachable 476236
         source quench       0   redirects                    0
         echo reply          0   parameter problem            0
         timestamp           0   information reply            0
         mask requests       0   mask replies                 0
         time exceeded       0
         Mping request       0   Mping reply                  0
If no ICMP packets are received or error packets are received, collect the preceding information and contact Huawei technical support personnel. If ICMP packets are received normally, go to Sub-step 2.
2. Check whether the network layer is normal. Run the display ip statistics command to check whether the network layer is normal.
<HUAWEI> display ip statistics
  Input:       sum            123174   local                   0
               bad protocol        0   bad format              0
               bad checksum        0   bad options             0
               discard srr         0   TTL exceeded            0
  Output:      forwarding          0   local              268816
               dropped             0   no route                0
  Fragment:    input               0   output                  0
               dropped             0
               fragmented          0   couldn't fragment       0
  Reassembling:sum                 0   timeouts                0
If error packet statistics (such as the values of the bad protocol, bad format, bad checksum, bad options, discard srr, TTL exceeded, dropped, no route, and couldn't fragment fields) displayed in the command output increase, it indicates that some error packets reach the network layer and are dropped after validity check.
l If error packet statistics increase, it indicates that the board on the device may be faulty. Then, collect the preceding information and contact Huawei technical support personnel.
l If error packet statistics do not increase, go to Step 9.
Step 9 Collect the following information and contact Huawei technical support personnel.
l Results of the preceding troubleshooting procedure
l Configuration files, log files, and alarm files of the devices
----End
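The direction check described in Step 3 of this procedure can be sketched as a comparison of counter increases on the two ends. The sketch below is a hedged illustration only: the dictionary keys (requests_sent, replies_received, requests_received, replies_sent) are hypothetical names for the echo counters read from display icmp statistics before and after a ping operation, not fields of any real API.

```python
def counter_delta(before, after):
    """Increase of each counter between two readings."""
    return {name: after[name] - before[name] for name in before}


def locate_direction(src_delta, dst_delta):
    """Classify the fault direction from counter increases.

    src_delta: increases seen on the source end.
    dst_delta: increases seen on the destination end.
    """
    sent = src_delta["requests_sent"] > 0
    answered = src_delta["replies_received"] > 0
    dst_idle = (dst_delta["requests_received"] == 0
                and dst_delta["replies_sent"] == 0)
    dst_active = (dst_delta["requests_received"] > 0
                  and dst_delta["replies_sent"] > 0)
    if sent and not answered and dst_idle:
        # Requests leave the source but never reach the destination.
        return "source-to-destination"
    if sent and not answered and dst_active:
        # The destination answers, but the replies are lost on the way back.
        return "destination-to-source"
    return "undetermined"


# Hypothetical readings taken before and after five Ping packets.
src = counter_delta({"requests_sent": 10, "replies_received": 4},
                    {"requests_sent": 15, "replies_received": 4})
print(locate_direction(src, {"requests_received": 0, "replies_sent": 0}))
```

Any combination that matches neither of the two patterns in Step 3 is reported as undetermined rather than guessed.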
Relevant Logs
None.
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode. l In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed. l In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations. Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that routing entries and ARP entries are correct.
Run the display ip routing-table destination-address command on each device to check whether there is a route to the destination address. l If there is no route to the destination address, see the 6 OSPF Troubleshooting or 7 IS-IS Troubleshooting. l If there is a route to the destination address and Tracert packets are transmitted over an Ethernet link, run the display arp slot slot-number command to check whether the required ARP entry exists. If the required ARP entry does not exist, go to step 3. If the fault persists, go to Step 2. Step 2 Check that the device sending Tracert packets (the source end) does not receive ICMP error packets. Run the display icmp statistics command on the source end to check whether or not it receives ICMP error packets.
<HUAWEI> display icmp statistics
  Input: bad formats         0   bad checksum                 0
         echo               13   destination unreachable     18
         source quench       0   redirects                   43
         echo reply        697   parameter problem            0
         timestamp           0   information request          0
         mask requests       0   mask replies                 0
         time exceeded      12
         Mping request       0   Mping reply                  0
  Output:echo              704   destination unreachable  93326
         source quench       0   redirects                    0
         echo reply         13   parameter problem            0
         timestamp           0   information reply            0
         mask requests       0   mask replies                 0
         time exceeded       0
         Mping request       0   Mping reply                  0
During the tracert operation, run the display icmp statistics command several times to check the tracert result. If the increased value of the total number of Destination Unreachable packets and Time Exceeded packets in the Input field equals the number of sent Tracert packets, it indicates that the source end receives ICMP error packets. Contact Huawei technical support personnel. Otherwise, go to Step 3. Step 3 Collect the following information and contact Huawei technical support personnel. l Results of the preceding troubleshooting procedure l Configuration files, log files, and alarm files of the devices ----End
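The comparison in Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the counter values are made up, the function names are hypothetical, and the two dictionaries stand for the Input counters read from display icmp statistics before and after the tracert operation.

```python
ERROR_FIELDS = ("destination unreachable", "time exceeded")


def icmp_errors_received(before, after):
    """Total increase of the Input error counters used by tracert."""
    return sum(after[field] - before[field] for field in ERROR_FIELDS)


def source_received_all_errors(before, after, probes_sent):
    """True if the increase equals the number of Tracert packets sent."""
    return icmp_errors_received(before, after) == probes_sent


# Made-up readings: 9 Tracert packets were sent between the two readings.
before = {"destination unreachable": 18, "time exceeded": 12}
after = {"destination unreachable": 19, "time exceeded": 20}
print(source_received_all_errors(before, after, probes_sent=9))
```

If the function returns True, the source end has received the ICMP error packets, which is the case in which Step 2 directs you to contact technical support.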
Troubleshooting flowchart for IPv6 address conflicts on interfaces. Decision: Is the Ethernet interface configured with the local loopback function? Actions: delete the local loopback function and disable the IPv6 function; delete the conflicting address and assign a new address in the same network segment to the interface; enable the IPv6 function, configure an IPv6 address, and configure the local loopback function.
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode. l In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed. l In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations. Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check whether the interface where the fault occurs is an Ethernet interface.
l If the interface is an Ethernet interface, check whether the interface is configured with the local loopback function before being configured with the IPv6 address. If the interface is configured with the local loopback function before being configured with the IPv6 address, run the following commands in sequence and then go to Step 2.
1. Run the undo loopback command to disable the loopback function on the Ethernet interface.
2. Run the undo ipv6 enable command to disable the IPv6 function on the Ethernet interface.
3. Run the commit command to commit the preceding configurations.
4. Run the ipv6 enable command to re-enable the IPv6 function on the Ethernet interface.
5. Run the ipv6 address { ipv6-address prefix-length | ipv6-address/prefix-length } command to reconfigure the IPv6 address on the Ethernet interface.
6. Run the commit command to commit the preceding configurations.
7. Run the loopback local command to reconfigure the local loopback function on the Ethernet interface.
8. Run the commit command to commit the preceding configurations.
If the interface is configured with the loopback function after being configured with the IPv6 address, go to Step 2. l If the interface is not an Ethernet interface, go to Step 2. Step 2 Run the undo ipv6 address command in the interface view to delete the IPv6 address configured for the interface, and run the ipv6 address command to configure a new IPv6 address in the same network segment for the interface. Go to Step 3. Step 3 Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state. l If the IPv6 address is followed by [DUPLICATED], the IPv6 address is not in the normal state. Go to Step 4. l If the IPv6 address is not followed by [DUPLICATED], go to Step 5. Step 4 Collect the following information and contact Huawei technical support personnel.
l Results of the preceding operation procedure l Configuration, log, and alarm files Step 5 End. ----End
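The state check in Step 3 of this procedure can be sketched as a search for the state marker that follows the address. The sketch is an assumption-laden illustration: the sample lines only imitate an address line from display ipv6 interface and are not copied from real output, and the function names are hypothetical.

```python
import re

# State markers that may follow an IPv6 address in the command output.
STATE_MARKER = re.compile(r"\[(TENTATIVE|DUPLICATED)\]")


def address_state(line):
    """Return 'DUPLICATED', 'TENTATIVE', or 'NORMAL' for one address line."""
    match = STATE_MARKER.search(line)
    return match.group(1) if match else "NORMAL"


def next_step(line):
    """Map the state to the next step of this procedure."""
    return "Step 4" if address_state(line) == "DUPLICATED" else "Step 5"


# Hypothetical address lines using documentation prefixes.
print(next_step("2001:DB8::1, subnet is 2001:DB8::/64 [DUPLICATED]"))
print(next_step("2001:DB8::1, subnet is 2001:DB8::/64"))
```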
Relevant Logs
ND/4/ADDR_DUPLICATE
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode. l In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed. l In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations. Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Run the display this command in the interface view to check whether the interface has been enabled with the IPv6 function.
l If ipv6 enable is displayed in the command output, the interface has been enabled with the IPv6 function. Go to Step 2.
l If ipv6 enable is not displayed, the interface is not enabled with the IPv6 function. Run the ipv6 enable command to enable the IPv6 function and go to Step 2.
Step 2 Run the display ipv6 interface interface-type interface-number command to check whether the physical status of the interface is Up.
l If the physical status of the interface is Administratively DOWN, run the undo shutdown command in the interface view and go to Step 3.
l If the physical status of the interface is DOWN, a fault occurs on the physical connection of the interface. Rectify the fault and go to Step 3.
l If the physical status of the interface is UP, go to Step 3.
Step 3 Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state.
l If the IPv6 address is in the normal state, that is, the IPv6 address is not followed by [TENTATIVE] or [DUPLICATED], go to Step 4.
l If the IPv6 address is followed by [DUPLICATED], go to 3.1 IPv6 Address Conflicts Occur on Interfaces.
l If the IPv6 address is followed by [TENTATIVE], the IPv6 address is being detected. If the IPv6 address remains in this state for a long time, perform the following steps:
1. Run the ipv6 nd ns retrans-timer interval command to shorten the interval at which the system sends ND packets. Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state. If the IPv6 address is in the normal state, go to Step 4. If the IPv6 address is followed by [DUPLICATED], go to 3.1 IPv6 Address Conflicts Occur on Interfaces. If the IPv6 address is followed by [TENTATIVE], go to sub-step 2.
2. Run the display ipv6 interface interface-type interface-number command to check the number of DAD attempts. If the number of DAD attempts is large, run the ipv6 nd dad attempts value command to reduce the number. Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state. If the IPv6 address is in the normal state, go to Step 4. If the IPv6 address is followed by [DUPLICATED], go to 3.1 IPv6 Address Conflicts Occur on Interfaces. If the IPv6 address is followed by [TENTATIVE], go to Step 7. If the number of DAD attempts is set properly, go to Step 4.
Step 4 Run the display ipv6 neighbors command to check whether the number of configured ND entries has exceeded the upper limit.
The upper limit of ND entries is as follows:
l The sum of the maximum numbers of dynamic and static ND entries is 16384.
l The maximum number of dynamic ND entries learned by an interface is 4096.
l The maximum number of static ND entries configured on an interface is 300.
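The limits listed above can be checked against a list of ND entries as sketched below. The three limit values come from this section; everything else (the function name, the representation of entries as (interface, kind) pairs, the interface name) is a hypothetical illustration and assumes that the entries have already been extracted from the display ipv6 neighbors output.

```python
from collections import Counter

MAX_TOTAL_ENTRIES = 16384          # dynamic plus static entries
MAX_DYNAMIC_PER_INTERFACE = 4096   # dynamic entries on one interface
MAX_STATIC_PER_INTERFACE = 300     # static entries on one interface


def nd_limit_violations(entries):
    """entries: iterable of (interface, kind), kind is 'dynamic' or 'static'.

    Returns a list naming every limit that is exceeded.
    """
    entries = list(entries)
    problems = []
    if len(entries) > MAX_TOTAL_ENTRIES:
        problems.append("total")
    per_interface = Counter(entries)
    for (interface, kind), count in sorted(per_interface.items()):
        if kind == "dynamic":
            limit = MAX_DYNAMIC_PER_INTERFACE
        else:
            limit = MAX_STATIC_PER_INTERFACE
        if count > limit:
            problems.append(f"{interface}:{kind}")
    return problems


# Hypothetical table: 301 static and 10 dynamic entries on one interface.
table = [("GE1/0/0", "static")] * 301 + [("GE1/0/0", "dynamic")] * 10
print(nd_limit_violations(table))
```

An empty result corresponds to the case in which the number of ND entries does not exceed the upper limit.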
l If the number of ND entries does not exceed the upper limit, go to Step 5.
l If the number of ND entries exceeds the upper limit, go to Step 7.
Step 5 Determine whether the packets are discarded in the receiving or sending direction and where the packets are discarded. Run the reset ipv6 statistics command on the source and destination ends to clear IPv6 statistics, then run the ping ipv6 command and the display icmpv6 statistics [ interface interface-type interface-number ] command to view the statistics on received and sent ICMPv6 packets on the interface.
l If the Neighbor solicit value in the Sent packets field does not increase on the source end, the source end has not sent Neighbor Solicitation (NS) packets. Go to Step 6.
l If the Neighbor solicit value in the Sent packets field increases but the Neighbor advert value in the Received packets field does not increase on the source end, and neither the Neighbor advert value in the Sent packets field nor the Neighbor solicit value in the Received packets field increases on the destination end, the destination end has not received the NS packets sent by the source end. Go to Step 6.
l If the Neighbor solicit value in the Sent packets field increases properly but the Neighbor advert value in the Received packets field does not increase on the source end, and the Neighbor solicit value in the Received packets field increases properly but the Neighbor advert value in the Sent packets field does not increase on the destination end, the source end has sent NS packets and the destination end has received them but has failed to reply with Neighbor Advertisement (NA) packets. Go to Step 6.
l If the Neighbor solicit value in the Sent packets field increases properly but the Neighbor advert value in the Received packets field does not increase on the source end, and both the Neighbor advert value in the Sent packets field and the Neighbor solicit value in the Received packets field increase properly on the destination end, the source end has sent NS packets and the destination end has received them and has replied with NA packets, but the source end has not received the NA packets. Go to Step 6.
l If both the Neighbor solicit value in the Sent packets field and the Neighbor advert value in the Received packets field increase properly on the source end, and both the Neighbor advert value in the Sent packets field and the Neighbor solicit value in the Received packets field increase properly on the destination end, the source end has sent NS packets and received NA packets from the destination end. Go to Step 7.
Step 6 Determine where the packets are discarded. Locate the position according to the direction in which the fault occurs. Perform the following operations to enable the ND packet debugging:
NOTE
Enabling debugging affects system performance. Exercise caution when enabling debugging.
<HUAWEI> debugging ipv6 nd packet
<HUAWEI> terminal debugging
Run the ping ipv6 -c echo-number destination-ipv6-address command to send a ping packet. Check whether the source end has sent an NS packet and received an NA packet, and whether the destination end has received an NS packet and replied with an NA packet. If no information about packet sending and receiving is displayed, go to Step 7.
Step 7 Collect the following information and contact Huawei technical support personnel.
l Results of the preceding operation procedure
l Configuration, log, and alarm files
Step 8 End.
----End
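The five cases in Step 5 of this procedure form a small decision table, which can be sketched as follows. The keys ns_sent, na_received, ns_received, and na_sent are hypothetical names for the increases of the Neighbor solicit and Neighbor advert counters on each end; the sketch simplifies the last case by treating any NA received on the source end as a completed exchange.

```python
def classify_nd_exchange(src, dst):
    """src and dst hold counter increases observed after one ping attempt."""
    if src["ns_sent"] == 0:
        return "source did not send NS"
    if src["na_received"] > 0:
        # Simplification: an NA arriving at the source completes the exchange.
        return "NS/NA exchange completed"
    if dst["ns_received"] == 0:
        return "destination did not receive NS"
    if dst["na_sent"] == 0:
        return "destination received NS but did not reply with NA"
    return "destination replied with NA but source did not receive it"


# Hypothetical increases: the NS arrives, but no NA is sent back.
source_end = {"ns_sent": 3, "na_received": 0}
destination_end = {"ns_received": 3, "na_sent": 0}
print(classify_nd_exchange(source_end, destination_end))
```

Only the completed case leads to Step 7 directly; the other four results correspond to the cases that continue with the debugging in Step 6.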
Relevant Logs
ND/4/ADDR_DUPLICATE
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode. l In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed. l In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations. Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Run the display this command in the interface view to check whether the interface has been enabled with the IPv6 function.
l If the ipv6 enable field is displayed, the interface has been enabled with the IPv6 function. Go to Step 2.
l If the ipv6 enable field is not displayed, the interface is not enabled with the IPv6 function. Run the ipv6 enable command to enable the IPv6 function and go to Step 2.
Step 2 Run the display ipv6 interface interface-type interface-number command to check whether the physical status of the interface is Up.
l If the physical status of the interface is Administratively DOWN, run the undo shutdown command in the interface view and go to Step 3.
l If the physical status of the interface is DOWN, a fault occurs on the physical connection of the interface. Rectify the fault and go to Step 3.
l If the physical status of the interface is UP, go to Step 3.
Step 3 Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state.
l If the IPv6 address is in the normal state, that is, the IPv6 address is not followed by [TENTATIVE] or [DUPLICATED] in the command output, go to Step 4.
l If the IPv6 address is followed by [DUPLICATED], go to 3.1 IPv6 Address Conflicts Occur on Interfaces.
l If the IPv6 address is followed by [TENTATIVE], the IPv6 address is being detected. If the IPv6 address remains in this state for a long time, perform the following steps:
1. Run the ipv6 nd ns retrans-timer interval command to shorten the interval at which the system sends ND packets. Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state. If the IPv6 address is in the normal state, go to Step 4. If the IPv6 address is followed by [DUPLICATED], go to 3.1 IPv6 Address Conflicts Occur on Interfaces. If the IPv6 address is still followed by [TENTATIVE], go to sub-step 2.
2. Run the display ipv6 interface interface-type interface-number command to check the number of DAD attempts. If the number of DAD attempts is large, run the ipv6 nd dad attempts value command to reduce the number. Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 address is in the normal state. If the IPv6 address is in the normal state, go to Step 4. If the IPv6 address is followed by [DUPLICATED], go to 3.1 IPv6 Address Conflicts Occur on Interfaces. If the IPv6 address is followed by [TENTATIVE], go to Step 8. If the number of DAD attempts is set properly, go to Step 4.
Step 4 Check IPv6 routing entries. Run the display ipv6 routing-table command on the source end to check whether the source end has a route to the destination end, and run the display ipv6 routing-table command on the destination end to check whether the destination end has a route to the source end.
l If the source end does not have a route to the destination end, or the destination end does not have a route to the source end, configure a routing protocol for or add a static route to the source end or the destination end. Go to Step 5.
l If the source and destination ends have routes to each other, go to Step 5.
Step 5 Run the display ipv6 neighbors command to check the ND entries on the interface.
l If there is no ND entry about the neighbor, the local end failed to learn ND entries of the neighbor. Go to 3.2 ND Entries Cannot Be Learned.
l If information about ND entries is correct, go to Step 6.
Step 6 Determine whether the packets are discarded in the receiving or sending direction and where the packets are discarded. Run the reset ipv6 statistics command on the source and destination ends to clear IPv6 statistics, then run the ping ipv6 command and the display icmpv6 statistics [ interface interface-type interface-number ] command to view the statistics on received and sent ICMPv6 packets on the interface.
l If the Echoed value in the Sent packets field does not increase on the source end, the source end has not sent IPv6 packets to the destination end. Go to Step 7.
l If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and neither the Echo replied value in the Sent packets field nor the Echoed value in the Received packets field increases on the destination end, the destination end has not received the IPv6 packets sent by the source end. Go to Step 7.
l If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and the Echoed value in the Received packets field increases properly but the Echo replied value in the Sent packets field does not increase on the destination end, the source end has sent IPv6 packets and the destination end has received the IPv6 packets but has failed to send response packets. Go to Step 7.
l If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and both the Echo replied value in the Sent packets field and the Echoed value in the Received packets field increase properly on the destination end, the source end has sent IPv6 packets but has failed to receive response packets from the destination end. Go to Step 7.
l If both the Echoed value in the Sent packets field and the Echo replied value in the Received packets field increase properly on the source end, and both the Echo replied value in the Sent packets field and the Echoed value in the Received packets field increase properly on the destination end, the source end has sent IPv6 packets and received response packets from the destination end. The ping failure may be caused by an excessively long link transmission delay. Run the ping ipv6 -t timeout command to increase the timeout period for waiting for an ICMPv6 response packet. If the ping succeeds, go to Step 9. If the ping still fails, go to Step 7.
Step 7 Determine where the packets are discarded. Perform the following operations to enable the IPv6 packet debugging.
NOTE
Enabling debugging affects system performance. Exercise caution when enabling debugging.
<HUAWEI> debugging rawip ipv6 packet
<HUAWEI> debugging ipv6 packet
<HUAWEI> terminal debugging
Run the ping ipv6 -c echo-number destination-ipv6-address command to send five ping packets. Check whether the source end has sent five packets and received five response packets, and whether the destination end has received five packets and sent five response packets. If no related information is displayed, go to Step 8.
Step 8 Collect the following information and contact Huawei technical support personnel.
• Results of the preceding operation procedure
• Configuration, log, and alarm files
Step 9 End.
----End
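The way the procedure reads the ICMPv6 counters on the two ends can be summarized as a small decision function. The following Python sketch is only an illustration of that reasoning: the IcmpCounters container and the locate_loss function are hypothetical names, and the counter increases are assumed to have been read manually from the display icmpv6 statistics output on each end after the statistics were reset.

```python
from dataclasses import dataclass


@dataclass
class IcmpCounters:
    """Counter increases on one device since the statistics were reset (hypothetical container)."""
    echo_sent: int = 0        # "Echoed" value in the Sent packets field
    reply_received: int = 0   # "Echo replied" value in the Received packets field
    echo_received: int = 0    # "Echoed" value in the Received packets field
    reply_sent: int = 0       # "Echo replied" value in the Sent packets field


def locate_loss(source: IcmpCounters, destination: IcmpCounters) -> str:
    """Return the conclusion that the procedure draws from the counter increases."""
    if source.echo_sent == 0:
        return "source end has not sent IPv6 packets"
    if source.reply_received > 0:
        return "responses received; suspect link delay and retry with a longer timeout"
    if destination.echo_received == 0:
        return "destination end has not received the packets"
    if destination.reply_sent == 0:
        return "destination end received the packets but sent no response"
    return "source end has not received the responses sent by the destination end"
```

For example, locate_loss(IcmpCounters(echo_sent=5), IcmpCounters(echo_received=5)) corresponds to the case in which the destination end receives the packets but does not respond.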
Relevant Logs
None.
Figure 4-1 Troubleshooting flowchart for the fault that two IPv6 networks cannot communicate over a 6to4 tunnel
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
• In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
• In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that the devices on the IPv4 network between the two IPv6 networks can ping each other.
• If the ping succeeds, go to Step 2.
• If the ping fails, go to 2.1 The Ping Operation Fails.
Step 2 Check that IPv6 routes are correctly configured on the source and destination ends.
Run the display ipv6 routing-table command on the source end to check whether the source end has a route destined for the destination end, and run the same command on the destination end to check whether the destination end has a route destined for the source end.
• If the source end does not have a route destined for the destination end, or the destination end does not have a route destined for the source end, run the ipv6 route-static dest-ipv6-address prefix-length interface-type interface-number command on the end that lacks the route to add a static route. Then, go to Step 3.
• If IPv6 routes are correctly configured on the source and destination ends, go to Step 3.
Step 3 Run the display ipv6 interface interface-type interface-number command on the source and destination ends to check whether the physical status of the 6to4 tunnel interface is Up.
• If the physical status of the tunnel interface is Administratively DOWN, run the undo shutdown command in the tunnel interface view and then go to Step 4.
• If the physical status of the tunnel interface is DOWN, perform the following steps:
a. Run the display this command in the tunnel interface view to check whether the tunnel source address and the tunnel mode have been configured.
If the tunnel source address and the tunnel mode have been configured, that is, the source field and the tunnel-protocol ipv6-ipv4 6to4 field are displayed, go to Step b.
If the tunnel source address or the tunnel mode is not configured, run the source { source-ip-address | interface-type interface-number } command or the tunnel-protocol ipv6-ipv4 6to4 command to configure the missing item. Then, go to Step b.
b. Run the display ip interface brief command to check whether the interface corresponding to the tunnel source address, or the IPv4 address of the tunnel source interface, exists.
If the interface or the IPv4 address exists, go to Step 4.
If the interface or the IPv4 address does not exist, configure a valid tunnel source address (or source interface), and then go to Step 4.
• If the physical status of the tunnel interface is UP, go to Step 4.
Step 4 Run the display ipv6 interface interface-type interface-number command on the source and destination ends to check whether the protocol status of the 6to4 tunnel is Up.
• If the protocol status of the 6to4 tunnel is DOWN, run the display this command to check whether the tunnel interface has been configured with an IPv6 address.
If the tunnel interface has been configured with an IPv6 address, go to Step 8.
If the tunnel interface is not configured with an IPv6 address, run the ipv6 address command to configure an IPv6 address for the tunnel interface, and then go to Step 5.
• If the protocol status of the 6to4 tunnel is UP, go to Step 5.
Step 5 Run the display ipv6 interface interface-type interface-number command on the source and destination ends to check whether the IPv6 address of the 6to4 tunnel is in the 6to4 address format and whether the IPv4 address embedded in the 6to4 address is the tunnel source address.
NOTE
6to4 tunnel addresses are in the format of 2002:IPv4 address:Subnet ID::Interface ID.
• If the 6to4 tunnel address is not in the 6to4 address format, run the ipv6 address ipv6-address prefix-length command to change the address to the 6to4 address format, and then go to Step 6.
• If the 6to4 tunnel address is in the 6to4 address format, go to Step 6.
Step 6 Determine whether the packets are discarded in the receiving or sending direction and where the packets are discarded.
Run the reset ipv6 statistics command on the source and destination ends to delete IPv6 statistics. Then run the ping ipv6 command and the display icmpv6 statistics [ interface interface-type interface-number ] command again to view the statistics on received and sent ICMPv6 packets on the interface.
• If the Echoed value in the Sent packets field does not increase, the source end has not sent IPv6 packets. Go to Step 7.
• If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and neither the Echo replied value in the Sent packets field nor the Echoed value in the Received packets field increases on the destination end, the destination end has not received the IPv6 packets that were sent by the source end. Go to Step 7.
• If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and the Echoed value in the Received packets field increases properly but the Echo replied value in the Sent packets field does not increase on the destination end, the source end has sent IPv6 packets and the destination end has received them but has not sent response packets. Go to Step 7.
• If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and both the Echo replied value in the Sent packets field and the Echoed value in the Received packets field increase properly on the destination end, the source end has sent IPv6 packets but has not received response packets, and the destination end has received the IPv6 packets and sent response packets. Go to Step 7.
• If the Echoed value in the Sent packets field and the Echo replied value in the Received packets field increase properly on the source end, and both the Echo replied value in the Sent packets field and the Echoed value in the Received packets field increase properly on the destination end, the source end has sent IPv6 packets and received response packets, and the destination end has received the IPv6 packets and sent response packets. In this case, the ping failure over the 6to4 tunnel may be caused by an excessively long link transmission delay. Run the ping ipv6 -t timeout command to increase the timeout period for waiting for an ICMPv6 response packet. If the ping succeeds, go to Step 9. If the ping still fails, go to Step 7.
Step 7 Determine where the fault occurs.
Locate the fault according to the direction in which it occurs. Perform the following operations to enable IPv6 packet debugging:
NOTE
Enabling debugging affects system performance. Therefore, confirm the action before you enable debugging.
<HUAWEI> debugging rawip ipv6 packet
<HUAWEI> debugging ipv6 packet
<HUAWEI> terminal debugging
Run the ping ipv6 -c echo-number destination-ipv6-address command to send five ping packets. Check whether the source end has sent five packets and received five response packets, and whether the destination end has received five packets and sent five response packets. If no related information is displayed, go to Step 8.
Step 8 If the fault persists, collect the following information and contact Huawei technical support personnel.
• Results of the preceding operation procedure
• Configuration files, log files, and alarm files of the device
Step 9 End.
----End
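The address check in Step 5 can also be done offline. The sketch below uses the sixtofour property of Python's standard ipaddress module, which returns the IPv4 address embedded in an address under the 2002::/16 prefix, or None for any other address. The function names and the sample addresses are illustrative assumptions and are not taken from any device configuration.

```python
import ipaddress


def embedded_ipv4(tunnel_ipv6: str):
    """Return the IPv4 address embedded in a 6to4 address, or None if the address is not 6to4."""
    return ipaddress.IPv6Address(tunnel_ipv6).sixtofour


def matches_tunnel_source(tunnel_ipv6: str, tunnel_source_ipv4: str) -> bool:
    """Check that the tunnel IPv6 address is a 6to4 address built from the tunnel source address."""
    return embedded_ipv4(tunnel_ipv6) == ipaddress.IPv4Address(tunnel_source_ipv4)


# Sample values only: 0xc000:0x0201 is 192.0.2.1 in hexadecimal.
print(matches_tunnel_source("2002:c000:201:1::1", "192.0.2.1"))
print(embedded_ipv4("2001:db8::1"))
```

An address that returns None is not in the 6to4 format, and an address whose embedded IPv4 address differs from the tunnel source address needs to be reconfigured as described in Step 5.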
Relevant Logs
None.
4.2 Two IPv6 Networks Cannot Communicate Over a Manual IPv6 over IPv4 Tunnel
This section describes common causes of the fault that two IPv6 networks cannot communicate over a manual IPv6 over IPv4 tunnel, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.
• Routing entries on the two IPv6 networks or routing entries on the IPv4 network between the two IPv6 networks are abnormal.
• The manual IPv6 over IPv4 tunnel goes Down.
• The IPv6 address configured for the manual IPv6 over IPv4 tunnel is incorrect.
• The link transmission delay is too long. As a result, the source end cannot receive a response packet from the destination end within the waiting time.
• A hardware fault occurs.
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
• In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
• In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that the devices on the IPv4 network between the two IPv6 networks can ping each other.
• If the ping succeeds, go to Step 2.
• If the ping fails, go to 2.1 The Ping Operation Fails.
Step 2 Check that IPv6 routes are correctly configured on the source and destination ends.
Run the display ipv6 routing-table command on the source end to check whether the source end has a route destined for the destination end, and run the same command on the destination end to check whether the destination end has a route destined for the source end.
• If the source end does not have a route destined for the destination end, or the destination end does not have a route destined for the source end, run the ipv6 route-static dest-ipv6-address prefix-length interface-type interface-number command on the end that lacks the route to add a static route. Then, go to Step 3.
• If IPv6 routes are correctly configured on the source and destination ends, go to Step 3.
Step 3 Run the display ipv6 interface interface-type interface-number command on the source and destination ends to check whether the physical status of the manual IPv6 over IPv4 tunnel interface is Up.
• If the physical status of the tunnel interface is Administratively DOWN, run the undo shutdown command in the tunnel interface view and then go to Step 4.
• If the physical status of the tunnel interface is DOWN, perform the following steps:
a. Run the display this command in the tunnel interface view to check whether the tunnel source address, tunnel destination address, and tunnel mode have been configured.
If the tunnel source address, tunnel destination address, and tunnel mode are configured (that is, the source, destination, and tunnel-protocol ipv6-ipv4 fields are displayed), go to Step b.
If the tunnel source address, tunnel destination address, or tunnel mode is not configured, run the source { source-ip-address | interface-type interface-number } command, the destination dest-ip-address command, or the tunnel-protocol ipv6-ipv4 command to configure a tunnel source address (or a source interface address), a tunnel destination address, or a tunnel mode. Then, go to Step b.
b. Run the display ip interface brief command to check whether the interface corresponding to the tunnel source address, or the IPv4 address of the tunnel source interface, exists.
If the interface corresponding to the tunnel source address, or the IPv4 address of the tunnel source interface, exists, go to Step c.
If it does not exist, configure a tunnel source address (or a source interface address) and a tunnel destination address, and then go to Step c.
c. Run the display ip routing-table ip-address command to check whether a route destined for the tunnel destination address exists.
If the route exists, go to Step 4.
If the route does not exist, configure a routing protocol or add a static route, and then go to Step 4.
• If the physical status of the tunnel interface is UP, go to Step 4.
Step 4 Run the display ipv6 interface interface-type interface-number command on the source and destination ends to check whether the protocol status of the manual IPv6 over IPv4 tunnel is Up.
• If the protocol status is DOWN, run the display this command to check whether the tunnel interface has been configured with an IPv6 address.
If the tunnel interface has been configured with an IPv6 address, go to Step 9.
If the tunnel interface is not configured with an IPv6 address, run the ipv6 address command to configure an IPv6 address for the tunnel interface, and then go to Step 5.
• If the protocol status of the tunnel interface is UP, go to Step 5.
Step 5 Run the display this command in the tunnel interface views on the source and destination ends to check whether the source address and the destination address of the tunnel are symmetrical.
• If the source address and the destination address are symmetrical, go to Step 6.
• If the source address and the destination address are not symmetrical, run the source { source-ip-address | interface-type interface-number } command or the destination dest-ip-address command to modify the source address or the destination address so that the two addresses become symmetrical. Then, go to Step 6.
Step 6 Run the display this command in the tunnel interface views on the source and destination ends to check whether the IPv6 addresses are correctly configured on the two ends.
• If the IPv6 addresses are correctly configured on the two ends, go to Step 7.
• If the IPv6 addresses are incorrectly configured on the two ends, run the ipv6 address { ipv6-address prefix-length | ipv6-address/prefix-length } command in the tunnel interface views to reconfigure the IPv6 addresses of the tunnel, and then go to Step 7.
Step 7 Determine whether the packets are discarded in the receiving or sending direction and where the packets are discarded.
Run the reset ipv6 statistics command on the source and destination ends to delete statistics on IPv6 traffic. Then run the ping ipv6 command and the display icmpv6 statistics [ interface interface-type interface-number ] command again to view the statistics on received and sent ICMPv6 packets on the interface.
• If the Echoed value in the Sent packets field does not increase on the source end, the source end has not sent IPv6 packets. Go to Step 8.
• If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and neither the Echo replied value in the Sent packets field nor the Echoed value in the Received packets field increases on the destination end, the destination end has not received the IPv6 packets that were sent by the source end. Go to Step 8.
• If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and the Echoed value in the Received packets field increases properly but the Echo replied value in the Sent packets field does not increase on the destination end, the source end has sent IPv6 packets and the destination end has received them but has not sent response packets. Go to Step 8.
• If the Echoed value in the Sent packets field increases properly but the Echo replied value in the Received packets field does not increase on the source end, and both the Echo replied value in the Sent packets field and the Echoed value in the Received packets field increase properly on the destination end, the source end has sent IPv6 packets but has not received response packets, and the destination end has received the IPv6 packets and sent response packets. Go to Step 8.
• If both the Echoed value in the Sent packets field and the Echo replied value in the Received packets field increase properly on the source end, and both the Echo replied value in the Sent packets field and the Echoed value in the Received packets field increase properly on the destination end, the source end has sent IPv6 packets and received response packets, and the destination end has received the IPv6 packets and sent response packets. In this case, the communication failure over the manual IPv6 over IPv4 tunnel may be caused by an excessively long link transmission delay. Run the ping ipv6 -t timeout destination-address command to increase the timeout period for waiting for an ICMPv6 response packet. If the ping succeeds, go to Step 10. If the ping still fails, go to Step 8.
Step 8 Determine where the packets are discarded.
Locate the fault according to the direction in which it occurs. Perform the following operations to enable IPv6 packet debugging:
NOTE
Enabling debugging affects the system performance. Therefore, confirm the action before you enable debugging.
<HUAWEI> debugging rawip ipv6 packet
<HUAWEI> debugging ipv6 packet
<HUAWEI> terminal debugging
Run the ping ipv6 -c echo-number destination-ipv6-address command to send five ping packets. Check whether the source end has sent five packets and received five response packets, and whether the destination end has received five packets and sent five response packets. If the related information is not displayed, go to Step 9.
Step 9 If the fault persists, collect the following information and contact Huawei technical support personnel.
• Results of the preceding operation procedure
• Configuration files, log files, and alarm files of the device
Step 10 End.
----End
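The symmetry requirement in Step 5 means that the source address on each end must equal the destination address on the other end. A minimal Python sketch of that comparison follows. TunnelEnd and is_symmetrical are hypothetical names, and the sketch assumes that a source interface has already been resolved to its IPv4 address.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelEnd:
    """Tunnel endpoint settings copied from the display this output (hypothetical container)."""
    source: str       # IPv4 address set by the source command
    destination: str  # IPv4 address set by the destination command


def is_symmetrical(a: TunnelEnd, b: TunnelEnd) -> bool:
    """Return True if each end's source address is the other end's destination address."""
    a_src, a_dst = ipaddress.IPv4Address(a.source), ipaddress.IPv4Address(a.destination)
    b_src, b_dst = ipaddress.IPv4Address(b.source), ipaddress.IPv4Address(b.destination)
    return a_src == b_dst and a_dst == b_src


# Sample addresses from documentation ranges, not from a real network.
print(is_symmetrical(TunnelEnd("192.0.2.1", "198.51.100.1"),
                     TunnelEnd("198.51.100.1", "192.0.2.1")))
```

Parsing the strings with ipaddress rather than comparing them directly avoids false mismatches caused by formatting differences and rejects malformed addresses early.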
Relevant Logs
None.
5 RIP Troubleshooting
About This Chapter
This chapter describes common causes of RIP faults and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.
5.1 Device Does not Receive Partial or All the Routes
This section describes common causes of the fault that a device does not receive some or all routes, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.
5.2 Device Does not Send Partial or All the Routes
This section describes the common causes and troubleshooting flowchart, and provides a step-by-step troubleshooting procedure, for the fault that a device does not send some or all routes.
5.3 Trouble Cases
RIP version 2 broadcast will receive all packets from the peer.
• The undo rip input command is configured on the interface, so the interface is disabled from receiving RIP packets.
• A policy that filters the received RIP routes is configured.
• The metric of the received routes is equal to or greater than 16.
• Other protocols have learned the same routes in the routing table.
• The number of received routes exceeds the upper limit.
[Troubleshooting flowchart. Decision points, in order: Is RIP enabled on the ingress? Does the ingress work properly? Do the versions of the sending and receiving interfaces match? (If not, configure the same version for the receiving and sending interfaces.) Does the filtering policy filter out the received routes? (Ensure that it does not.) Is the RIP metric equal to or greater than 16? Is load balancing configured? Is the verify-source command run? Do other better routes exist? If the fault is not rectified, seek technical support.]
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
• In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
• In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that RIP is enabled on the inbound interface.
The network command specifies the network segment on which RIP is enabled. Only an interface on which RIP is enabled can receive and send RIP routing information.
Run the display current-configuration configuration rip command to check whether the network segment of the inbound interface is included in the networks on which RIP is currently enabled. The network address specified in the network command must be that of a natural network segment.
Step 2 Check that the inbound interface works properly.
Run the display interface command to check the running status of the inbound interface. If the physical state of the interface is Down or Administratively Down, or the current protocol state is Down, RIP cannot work properly on the interface. Ensure that the interface is in the Up state.
Step 3 Check whether the authentication type and password used by the peer match those configured on the local interface.
If the authentication type or password carried in the received RIP packets differs from that configured on the local interface, the RIP routing information may not be received correctly.
Step 4 Check whether the RIP version sent by the peer matches the version accepted on the local interface.
If the version of the inbound interface differs from that of the received RIP packets, the RIP routing information may not be received correctly.
Step 5 Check whether the undo rip input command is configured on the inbound interface.
The rip input command enables the specified interface to receive RIP packets. The undo rip input command disables the specified interface from receiving RIP packets. If the undo rip input command is configured on the inbound interface, no RIP packets arriving on the interface are processed, and the routing information cannot be received.
Step 6 Check whether a policy that filters the received RIP routes is configured in RIP.
The filter-policy import command filters the received RIP routing information.
• If an ACL is used, run the display acl command to check whether the RIP routing information learned from the neighbor is filtered out.
• If an IP prefix list is used to filter routes, run the display ip ip-prefix command to check the configured policy.
• If the routing information is filtered out by the routing policy, configure the correct routing policy.
Step 7 Check whether the rip metricin command is configured on the inbound interface and whether it raises the metric of the received routes to 16 or more.
The rip metricin command sets the metric that is added to a route when the interface receives a RIP packet. If the resulting metric is equal to or greater than 16, the route is considered unreachable and is not added to the routing table.
Step 8 Check whether the maximum number of load-balancing routes has been reached.
Run the display this command in the RIP view to check the maximum number of load-balancing routes configured for RIP. If the maximum has been reached, further incoming routes to the same destination are not added to the database.
Step 9 Check whether source address verification is configured.
Source address verification checks the source address of incoming RIP packets and discards packets that come from a different network. By default, this feature is enabled. If you need to receive packets from a different network, run the undo verify-source command.
Step 10 Check whether other protocols have learned the same route.
Run the display rip route command to check whether routes are received from the neighbor.
• The RIP route may be received correctly while the local device learns the same route from another protocol such as OSPF or IS-IS. OSPF and IS-IS routes generally have a higher priority than RIP routes. Therefore, if the same route is learned through OSPF or IS-IS, the RM module selects that route over the RIP route. To make the RIP route preferred, run the preference command to set a smaller preference value for RIP (a smaller value indicates a higher priority).
• Run the display ip routing-table protocol rip verbose command to view the RIP route and check whether its state is Active.
If the fault still cannot be located, contact Huawei technical support personnel.
Step 11 Check the authentication keys (for RIPv2).
For RIPv2, if an authentication mode is configured on the sending and receiving interfaces, the authentication keys on both interfaces must be the same.
Step 12 If the fault persists, collect the following information and contact Huawei technical support personnel.
• Results of the preceding troubleshooting procedure
• Configuration files, log files, and alarm files of the devices
----End
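The metric check in Step 7 and the route selection described in Step 10 can be illustrated with a short Python sketch. It is a simplified model, not device behavior: the function names are hypothetical, and the preference values in the example (OSPF 10, IS-IS 15, RIP 100) are assumed defaults that should be confirmed on the device.

```python
from typing import Dict

RIP_INFINITY = 16  # a metric of 16 marks a route as unreachable in RIP


def received_metric(advertised_metric: int, metricin: int = 0) -> int:
    """Metric of a route after the inbound interface adds its rip metricin value."""
    return min(advertised_metric + metricin, RIP_INFINITY)


def is_reachable(advertised_metric: int, metricin: int = 0) -> bool:
    """A route is usable only if its metric stays below 16 after metricin is added."""
    return received_metric(advertised_metric, metricin) < RIP_INFINITY


def preferred_protocol(preferences: Dict[str, int]) -> str:
    """Return the protocol whose route is selected: the one with the smallest preference value."""
    return min(preferences, key=preferences.get)


# Assumed default preference values, for illustration only.
print(is_reachable(advertised_metric=3, metricin=13))                # False: 3 + 13 reaches 16
print(preferred_protocol({"ospf": 10, "isis": 15, "rip": 100}))      # ospf
```

In this model, a RIP route that is received correctly is still absent from the active routing table whenever another protocol offers the same destination with a smaller preference value.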
Relevant Logs
None.
[Troubleshooting flowchart. Decision points, in order: Is RIP enabled on the egress? Does the outbound interface work properly? Is the silent-interface command run? Is the undo rip output command run? (If so, delete the undo rip output command.) Is route summarization configured? (If so, disable route summarization.) Is the metric equal to or greater than 16? Does the route exist in the database? Is a filtering policy configured? (Ensure that the filtering policy does not filter out the routes imported by RIP.) Does the local interface work properly? Is the routing update interval too long? (If so, reduce the routing update interval.) Is the maximum packet length too small? (If so, increase the maximum packet length.) If the fault is not rectified, seek technical support.]
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
• In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
• In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check whether RIP is enabled on the outbound interface.
The network command specifies the network segment on which RIP is enabled. Only an interface on which RIP is enabled can receive and send RIP routes.
Run the display current-configuration configuration rip command to check the network segments on which RIP is enabled, and check whether the outbound interface is included. The network address specified in the network command must be that of a natural network segment other than loopback, broadcast, and multicast networks.
Step 2 Check whether the outbound interface works properly.
Run the display interface command to check the running status of the outbound interface. If the physical state of the interface is Down or Administratively Down, or the current protocol state is Down, RIP cannot work properly on the interface. Ensure that the interface is in the Up state.
Step 3 Check whether the silent-interface command is configured on the outbound interface.
The silent-interface command disables the interface from sending RIP packets.
• Run the display current-configuration configuration rip command to check whether the interface is disabled from sending RIP packets.
• If the interface is disabled from sending RIP packets, run the undo silent-interface command to enable it.
Step 4 Check whether the undo rip output command is configured on the outbound interface.
Run the display current-configuration command to check whether the undo rip output command is configured on the outbound interface. The rip output command enables the interface to send RIP packets. The undo rip output command disables the interface from sending RIP packets. If the undo rip output command is configured on the outbound interface, RIP packets cannot be sent on the interface.
Step 5 Check whether route summarization is configured.
The rip summary-address command configures the interface to advertise a summarized address instead of the individual subnet routes. Summarization is used to reduce the amount of RIP routing information on the network. When summarization is disabled, the subnet routes are advertised. If RIPv2 is configured on the interface, classful summarization is used by default. If the expected subnet routes are not sent because they are summarized, run the undo rip summary-address command.
Step 6 Check whether the sum of the route metric and the metricout value is equal to or greater than 16.
Run the display rip database command to check the metric of the route. A route whose metric is equal to or greater than 16 is considered unreachable. When RIP routes are sent, the rip metricout value is added to the metric carried in the RIP packets. To correct this fault, decrease the metricout value on the interface through which the route is sent.
Step 7 Check whether a policy that filters the routes advertised by RIP is configured.
The filter-policy export command configures a global filtering policy. Only the routes whose attributes match those specified in the filtering policy are added to the advertised routing table of RIP and advertised through Update packets.
Step 8 Check the status of the interface when the route to be sent is the route to a local interface address.
Run the display interface command to check the running status of the interface. If the physical state of the interface is Down or Administratively Down, or the current protocol state of the interface is Down, the IP address of the interface cannot be added to the advertised routing table of RIP. Therefore, the routing information is not sent to the neighbor.
Step 9 Check whether the periodic (update) timer value is too large and whether the age timer and garbage timer values are too small.
Run the display rip command to check the values of the RIP timers. If the update timer is set to a very large value, updates are sent at long intervals, which gives the impression that the router is not functioning properly.
If age timer and garbage timer is configured too less, then the route is removed from the database l l Configure periodic timer value to a lower value such as 30 seconds to get frequent updates. Configure age timer and garbage time to greater value such as 120, 180 seconds respectively.
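The metric check in Step 6 can be sketched as follows. This is a minimal illustration, assuming the advertised metric is simply the route metric plus the metricout value, capped at 16; the function names and sample values are hypothetical and are not part of any device software.

```python
RIP_INFINITY = 16  # a metric of 16 or more marks the route as unreachable


def advertised_metric(route_metric: int, metricout: int) -> int:
    """Return the metric carried in the outgoing update, capped at 16."""
    return min(route_metric + metricout, RIP_INFINITY)


def is_reachable(route_metric: int, metricout: int) -> bool:
    """A neighbor can use the route only if the advertised metric stays below 16."""
    return advertised_metric(route_metric, metricout) < RIP_INFINITY


if __name__ == "__main__":
    # Hypothetical (route metric, metricout) pairs for illustration only.
    for metric, out in [(3, 1), (10, 6), (14, 1)]:
        print(metric, out, advertised_metric(metric, out), is_reachable(metric, out))
```

A route with metric 10 sent through an interface whose metricout is 6 reaches 16 and is treated as unreachable, which is why lowering metricout rectifies the fault.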
Step 10 Check whether there are other issues.
If the outbound interface does not support the multicast or broadcast mode and the packet needs to be sent to a multicast or broadcast address, the fault occurs. To prevent the fault, configure the peer command in the RIP view so that the packet is sent to a unicast address.

Step 11 Check whether the maximum packet length on the interface is too small.
Run the display current-configuration configuration command to check the maximum packet length configured where RIP is enabled. A packet is transmitted only if it can carry at least one route entry (RTE). Packets are not transmitted if the maximum packet length is less than 32. Reconfigure the maximum packet length on the interface to at least 32.
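- The maximum packet length must be at least 52 on an interface that uses simple authentication.
- The maximum packet length must be at least 72 on an interface that uses MD5-keyid or keychain MD5 authentication.
- The maximum packet length must be at least 56 on an interface that uses Huawei MD5 authentication.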
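The thresholds in Step 11 can be collected into a small lookup. The sketch below assumes that the values 32, 52, 56, and 72 are the smallest usable maximum packet lengths for each authentication mode; the mode labels used as dictionary keys are hypothetical names chosen for this example, not CLI keywords.

```python
# Smallest usable maximum packet length per authentication mode (values from Step 11).
# The keys are illustrative labels, not command keywords.
MIN_MAX_PACKET_LENGTH = {
    "none": 32,
    "simple": 52,
    "huawei-md5": 56,
    "md5-keyid": 72,
    "keychain-md5": 72,
}


def length_is_sufficient(configured_length: int, auth_mode: str = "none") -> bool:
    """Return True if the configured maximum packet length can carry at least one RTE."""
    try:
        required = MIN_MAX_PACKET_LENGTH[auth_mode]
    except KeyError:
        raise ValueError(f"unknown authentication mode: {auth_mode}") from None
    return configured_length >= required


if __name__ == "__main__":
    for mode in MIN_MAX_PACKET_LENGTH:
        print(mode, length_is_sufficient(56, mode))
```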
Step 12 If the fault persists, collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
None.
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
As shown in Figure 5-3, Routers A, B, and C run RIP. RIP is run in the network segments 172.16.1.0/24 and 192.168.1.0/24 of Router A, network segments 192.168.1.0/24 and 192.168.2.0/24 of Router B, and network segments 172.16.2.0/24 and 192.168.2.0/24 of Router C.
Figure 5-3 Networking where a RIP route error occurs due to discontinuous subnets
[Figure not reproduced. Recoverable labels: RouterA (GE1/0/0 172.16.1.1/24); RouterB (POS1/0/0 192.168.1.2/24, POS2/0/0 192.168.2.2/24); RouterC (GE1/0/0 172.16.2.1/24).]
Run the display ip routing-table command on Router B to check whether the routing information is correct. If the routing information of Router B is correct, Router B has a route to each of the network segments 172.16.1.0 and 172.16.2.0. The command output, however, shows that Router B has only a route to the network segment 172.16.0.0.
Fault Analysis
The network configured with discontinuous subnets runs RIP-1, which does not support VLSM or CIDR. As a result, a RIP route error occurs. To solve the problem, configure RIP-2 on all the devices on the network and disable classful route summarization.
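The effect of classful summarization on the two subnets in this case can be reproduced with a short sketch. It assumes the traditional class A/B/C boundaries (/8, /16, /24) and uses Python's standard ipaddress module; the helper name classful_network is hypothetical.

```python
import ipaddress


def classful_network(prefix: str) -> ipaddress.IPv4Network:
    """Return the classful (natural) network that a subnet is summarized to."""
    net = ipaddress.ip_network(prefix)
    first_octet = int(net.network_address) >> 24
    if first_octet < 128:
        natural_len = 8    # class A
    elif first_octet < 192:
        natural_len = 16   # class B
    elif first_octet < 224:
        natural_len = 24   # class C
    else:
        raise ValueError("class D/E addresses have no classful network")
    if net.prefixlen <= natural_len:
        return net  # already at or above the classful boundary
    return net.supernet(new_prefix=natural_len)


if __name__ == "__main__":
    for subnet in ["172.16.1.0/24", "172.16.2.0/24", "192.168.1.0/24"]:
        print(subnet, "->", classful_network(subnet))
```

Both 172.16.1.0/24 and 172.16.2.0/24 collapse to 172.16.0.0/16, so Router B cannot tell the two subnets apart, which matches the single 172.16.0.0 route seen in the fault symptom.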
Procedure
Step 1 Run the system-view command on Router A to enter the system view.
Step 2 Run the rip [ process-id ] command on Router A to enter the RIP view.
Step 3 Run the version 2 command on Router A to configure RIPv2.
Step 4 Run the undo summary command on Router A to disable classful route summarization.
After configuring Router A, configure Router B and Router C. The configurations of Router B and Router C are similar to that of Router A, and are not provided here.
----End
Summary
RIPv1 does not support discontinuous subnets. When discontinuous subnets are deployed on a network, configuring RIPv2 on the network is recommended. When a network runs RIPv2, class-based route summarization is enabled by default. To ensure the correctness of routing information, disabling class-based route summarization is recommended.
6 OSPF Troubleshooting
About This Chapter
This chapter describes common OSPF faults and provides troubleshooting procedures together with troubleshooting cases. For details about OSPF, see the HUAWEI NetEngine5000E Feature Description - OSPF.
6.1 The OSPF Neighbor Relationship Is Down
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that the OSPF neighbor relationship is Down.
6.2 OSPF Neighbor Relationship Cannot Enter the Full State
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that the OSPF neighbor relationship cannot enter the Full state.
6.3 Trouble Cases
Figure 6-1 Troubleshooting flowchart for the fault that the OSPF neighbor relationship is Down
[Flowchart not reproduced. Recoverable labels: "The OSPF neighbor relationship is Down"; "Neighbor Down Due to 1-Wayhello Received"; "Neighbor Down Due to SequenceNum Mismatch"; "Is fault rectified?" after each action; "End".]
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check logs to find the cause of the fault.
Run the display logbuffer { event | oper } command to view the log information. If the log message is as follows:
VRPV8 %%01 ospfv2comm/6/NBR_CHANGE(l):VR=0-CID=[UINT];Neighbor changes event: neighbor status changed. (ProcessId=[UINT], NbrIpAddr=[IPADDR], NbrEvent=[UINT], NbrPreviousState=[UINT], NbrCurrentState=[UINT])
This message indicates that the neighbor status has changed. Check the NbrEvent field, which records the cause of the fault. The possible causes of the fault are as follows:
- Inactivity (NbrEvent=7)
  The InactivityTimer event of the neighbor state machine occurs. If a device does not receive any Hello packet from its neighbor within the dead time, the OSPF neighbor relationship becomes Down. In this case, go to Step 2.
- LLDown (NbrEvent=6)
  The LLDown event of the neighbor state machine occurs. It indicates that the lower-layer protocol notifies the upper layer that the neighbor is unreachable. In this case, go to Step 2.
- 1-Way Received (NbrEvent=4)
  The 1-Way Received event of the neighbor state machine occurs. A 1-Way Hello packet is sent from the remote end to the local end when the OSPF status on the remote end changes to Down. After receiving the packet, the OSPF status on the local end also changes to Down. In this case, check the remote end to rectify any possible fault.
- Kill Neighbor (NbrEvent=5)
  This event indicates that the interface or BFD session becomes Down. Run the display interface [ interface-type [ interface-number ] ] command to check the interface status and rectify any possible fault.
If the log message is as follows:
VRPV8 %%01 ospfv2comm/6/OSPF_RESET(l):VR=%u-CID=[UINT];OSPF process or area reset. (CompCID=[UINT], Parameter=[UINT], ResetReason=[UINT])
This message indicates that the reset ospf process command has been run. You can confirm whether this command has been run by checking the operation records or log information. In other cases, go to Step 9.

Step 2 Check that the link between the two devices is normal.
Run the ping command and the display this interface command in the interface view to check whether the link between the two devices is normal and whether the transmission devices are normal. If the link is normal, go to Step 3.

Step 3 Check that the CPU usage is within the normal range.
Run the display cpu-usage command to check whether the CPU usage on the MPU or LPU of the faulty device is too high. If the CPU usage is too high, OSPF fails to receive and send protocol packets normally, causing the neighbor relationship to flap. In this case, troubleshoot the high CPU usage and disable unnecessary functions. If the CPU usage is within the normal range, go to Step 4.

Step 4 Check that the interface status is Up.
Run the display interface [ interface-type [ interface-number ] ] command to check the physical status of the interface. If the physical status of the interface is Down, troubleshoot the interface fault. If the physical status of the interface is Up, run the display ospf interface command to check whether the OSPF status of the interface is a normal status such as DR, BDR, DROther, or P2P.
<HUAWEI> display ospf interface
OSPF Process 1 with Router ID 1.1.1.1
Interfaces
Area: 0.0.0.0 (MPLS TE not enabled)
Interface    IP Address    Type    State    Cost    Pri
Pos1/1/0     192.1.1.1     P2P     DR       1       1
If the OSPF status of the interface is Down, run the display ospf cumulative command to check whether the number of interfaces enabled with OSPF in the OSPF process exceeds the upper limit. If so, reduce the number of interfaces enabled with OSPF.
<HUAWEI> display ospf cumulative
OSPF Process 1 with Router ID 1.1.1.1
Cumulations
IO Statistics
Type                 Input    Output
Hello                0        86
DB Description       0        0
Link-State Req       0        0
Link-State Update    0        0
Link-State Ack       0        0
ASE: (Disabled)
LSAs originated by this router
Router: 1
Network: 0
Sum-Net: 0
Sum-Asbr: 0
External: 0
NSSA: 0
Opq-Link: 0
Opq-Area: 0
Opq-As: 0
LSAs Originated: 1    LSAs Received: 0
Routing Table:
Intra Area: 1    Inter Area: 0    ASE: 0
Up Interface Cumulate: 1
Neighbor Cumulate:
=======================================================
Neighbor cumulative data. (Process 1)
-------------------------------------------------------
Down: 0       Init: 0        Attempt: 0    2-Way: 0
Exstart: 0    Exchange: 0    Loading: 0    Full: 1
Retransmit Count:1
Neighbor cumulative data. (Total)
-------------------------------------------------------
Down: 0       Init: 0        Attempt: 0    2-Way: 0
Exstart: 0    Exchange: 0    Loading: 0    Full: 1
Retransmit Count:1
If the OSPF status of the interface is normal (such as DR, BDR, DROther, or P2P), go to Step 5.
Step 5 Check that the IP addresses of the two devices are on the same network segment.
Run the display interface interface-type [ interface-number ] command to check the IP addresses of the interfaces on the two devices.
- If the IP addresses of the two devices are on different network segments, run the ip address command to change the IP addresses so that the two devices are on the same network segment.
- If the IP addresses of the two devices are on the same network segment, go to Step 6.
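The comparison in Step 5 can be sketched with Python's standard ipaddress module. The sketch assumes each interface address is written in address/mask-length form; the function name same_segment and the sample addresses are hypothetical.

```python
import ipaddress


def same_segment(local: str, remote: str) -> bool:
    """Return True if both interface addresses fall in the same network segment.

    Addresses with different mask lengths yield different networks and
    therefore count as a mismatch.
    """
    return ipaddress.ip_interface(local).network == ipaddress.ip_interface(remote).network


if __name__ == "__main__":
    # Hypothetical interface addresses for illustration only.
    print(same_segment("192.168.1.1/24", "192.168.1.2/24"))
    print(same_segment("192.168.1.1/24", "192.168.2.2/24"))
```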
Step 6 Check that the MTUs of the interfaces on the two devices are consistent.
If the ospf mtu-enable command is run on the interfaces at both ends, the MTUs of the two interfaces must be consistent; otherwise, the OSPF neighbor relationship cannot be established.
- If the MTUs of the two interfaces are inconsistent, run the mtu mtu command in the interface view to make the MTUs of the two interfaces consistent.
- If the MTUs of the two interfaces are consistent, go to Step 7.

Step 7 Check whether there is an interface whose priority is not 0.
On broadcast and NBMA network segments, at least one interface must have a priority other than 0 so that the DR can be elected correctly. Otherwise, the OSPF neighbor relationship can only reach the 2-Way state. Run the display ospf interface command to view the interface priority.
<HUAWEI> display ospf interface
OSPF Process 1 with Router ID 1.1.1.1
Interfaces
Area: 0.0.0.0 (MPLS TE not enabled)
Interface    IP Address     Type    State    Cost    Pri
Pos1/1/0     192.168.1.1    P2P     P-2-P    1       1
Step 8 Check that the OSPF configurations on the two devices are correct.
1. Check whether the OSPF router IDs of the two devices conflict.
<HUAWEI> display ospf brief
OSPF Process 1 with Router ID 1.1.1.1
OSPF Protocol Information
If so, modify the OSPF router IDs of the two devices so that they are unique. If not, proceed with the check.
2. Check whether the OSPF area configurations on the two devices are consistent.
<HUAWEI> display ospf interface
OSPF Process 1 with Router ID 1.1.1.1
Interfaces
Area: 0.0.0.0 (MPLS TE not enabled)
Interface    IP Address     Type         State    Cost    Pri
Pos1/1/0     192.168.1.1    Broadcast    BDR      1       1
3. Check whether other OSPF configurations on the two devices are consistent. Run the display ospf error command every 10 seconds for 5 minutes.
<HUAWEI> display ospf error
OSPF Process 1 with Router ID 1.1.1.1
OSPF error statistics
General packet errors:
0 : IP: received my own packet      0 : Bad packet
0 : Bad version                     0 : Bad checksum
0 : Bad area id                     0 : Drop on unnumbered interface
0 : Bad virtual link                0 : Bad authentication type
0 : Bad authentication key          0 : Packet too small
0 : Packet size > ip length         0 : Transmit error
0 : Interface down                  0 : Unknown neighbor
HELLO packet errors:
0 : Netmask mismatch                0 : Hello timer mismatch
0 : Dead timer mismatch             0 : Extern option mismatch
0 : Router id confusion             0 : Virtual neighbor unknown
0 : NBMA neighbor unknown           0 : Invalid Source Address
- Check the Bad authentication type field. If the value of this field keeps increasing, the OSPF authentication types of the two devices that establish the neighbor relationship are different. In this case, set the same authentication type for the two devices.
- Check the Hello timer mismatch field. If the value of this field keeps increasing, the values of the Hello timers on the two devices that establish the neighbor relationship are different. In this case, check the interface configurations of the two devices and set the same value for the Hello timers.
- Check the Dead timer mismatch field. If the value of this field keeps increasing, the values of the dead timers on the two devices that establish the neighbor relationship are different. In this case, check the interface configurations of the two devices and set the same value for the dead timers.
- Check the Extern option mismatch field. If the value of this field keeps increasing, the area types of the two devices that establish the neighbor relationship are different (the area type of one device is common area, and the area type of the other device is stub area or NSSA). In this case, set the same area type for the two devices.
If the fault persists, go to Step 9.

Step 9 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
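The repeated sampling of display ospf error in Step 8 comes down to spotting counters that keep increasing. The following sketch assumes the counter values have already been copied into dictionaries, one per sample; it does not parse device output, and the names ADVICE, growing_counters, and diagnose are hypothetical.

```python
# Guidance for the four counters discussed in Step 8.
ADVICE = {
    "Bad authentication type": "set the same authentication type on both devices",
    "Hello timer mismatch": "set the same Hello timer value on both interfaces",
    "Dead timer mismatch": "set the same dead timer value on both interfaces",
    "Extern option mismatch": "set the same area type on both devices",
}


def growing_counters(samples):
    """Return the names of counters that are larger in the last sample than in the first."""
    if len(samples) < 2:
        raise ValueError("at least two samples are needed")
    first, last = samples[0], samples[-1]
    return sorted(name for name, value in last.items() if value > first.get(name, 0))


def diagnose(samples):
    """Map each growing counter to the guidance given in the procedure, if any."""
    return {
        name: ADVICE.get(name, "no specific guidance in this procedure")
        for name in growing_counters(samples)
    }


if __name__ == "__main__":
    # Hypothetical samples taken some time apart.
    samples = [
        {"Hello timer mismatch": 0, "Bad packet": 0},
        {"Hello timer mismatch": 3, "Bad packet": 0},
    ]
    print(diagnose(samples))
```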
Relevant Logs
OSPF/6/NBR_CHANGE
OSPF/6/OSPF_RESET
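When many NBR_CHANGE entries need to be triaged, the NbrEvent codes from Step 1 can be applied with a small lookup. The sketch below assumes the log line carries key=value fields in the layout shown in Step 1; the sample line, its field values, and the fallback action for unlisted codes are assumptions made for illustration.

```python
import re

# NbrEvent codes and follow-up actions as listed in Step 1.
NBR_EVENTS = {
    4: ("1-Way Received", "check the remote end"),
    5: ("Kill Neighbor", "check the interface and BFD session status"),
    6: ("LLDown", "go to Step 2"),
    7: ("Inactivity", "go to Step 2"),
}

FIELD_RE = re.compile(r"(\w+)=([^,)\s]+)")


def parse_nbr_change(line: str) -> dict:
    """Extract the neighbor address and NbrEvent from an NBR_CHANGE log line."""
    fields = dict(FIELD_RE.findall(line))
    if "NbrEvent" not in fields:
        raise ValueError("not an NBR_CHANGE log line")
    event = int(fields["NbrEvent"])
    # Assumption: codes not listed in Step 1 are escalated (Step 9).
    cause, action = NBR_EVENTS.get(event, ("unknown", "go to Step 9"))
    return {
        "neighbor": fields.get("NbrIpAddr"),
        "event": event,
        "cause": cause,
        "action": action,
    }


if __name__ == "__main__":
    # Hypothetical log line following the format shown in Step 1.
    sample = (
        "ospfv2comm/6/NBR_CHANGE(l):VR=0-CID=1;Neighbor changes event: "
        "neighbor status changed. (ProcessId=1, NbrIpAddr=192.168.1.2, "
        "NbrEvent=7, NbrPreviousState=8, NbrCurrentState=1)"
    )
    print(parse_nbr_change(sample))
```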
Figure 6-2 Troubleshooting flowchart for the fault that the OSPF neighbor relationship cannot reach the Full state
[Flowchart not reproduced. Recoverable labels: "The OSPF relationship cannot enter the Full state"; "Can the status of the neighbor relationship be displayed?"; "Is the neighbor relationship always in the Down state?"; "... in the Init state?"; "... in the 2-Way state?"; "... in the Exstart state?"; "... in the Exchange state?"; "Is fault rectified?" after each action; "End".]
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Troubleshoot the fault based on the status of the OSPF neighbor relationship.
- The status of the OSPF neighbor relationship cannot be displayed.
  If the status of the OSPF neighbor relationship cannot be displayed, see The OSPF Neighbor Relationship Is Down to rectify the fault.
- The neighbor relationship is always in the Init state.
  If the status of the neighbor relationship is always displayed as Init, the remote device cannot receive Hello packets from the local device. In this case, check whether the link or the remote device is faulty.
- The neighbor relationship is always in the 2-Way state.
  If the status of the neighbor relationship is always displayed as 2-Way, run the display ospf interface command to check whether the DR priorities of the interfaces enabled with OSPF are 0.
<HUAWEI> display ospf interface
OSPF Process 1 with Router ID 111.1.1.1
Interfaces
Area: 0.0.0.0 (MPLS TE not enabled)
Interface    IP Address    Type         State      Cost    Pri
Pos1/1/0     111.1.1.1     Broadcast    DROther    1       0
  If the DR priorities of the interfaces enabled with OSPF are 0, the status of the neighbor relationship is normal. If the DR priorities of the interfaces enabled with OSPF are not 0, go to Step 2.
- The neighbor relationship is always in the Exstart state.
  If the status of the neighbor relationship is always displayed as Exstart, the devices are exchanging DD packets but fail to synchronize LSDBs, which occurs in the following cases:
  - Oversized packets cannot be received and sent normally. Run the ping -s 1500 neighbor-address command to check whether oversized packets can be sent and received. If the two devices fail to ping each other, solve the link problem first.
  - The OSPF MTUs of the two devices are different. If the ospf mtu-enable command is run on the OSPF interfaces, check whether the OSPF MTUs on the two interfaces are the same. If not, change the MTUs of the interfaces so that they are the same.
  If the fault persists, go to Step 2.
- The neighbor relationship is always in the Exchange state.
  If the status of the neighbor relationship is always displayed as Exchange, the two devices are exchanging DD packets. In this case, perform the troubleshooting described for the case in which the neighbor relationship is in the Init state. If the fault persists, go to Step 2.
- The neighbor relationship is always in the Loading state.
CAUTION
Restarting OSPF causes the re-establishment of all neighbor relationships in the OSPF process and temporary interruption of services.
  If the neighbor relationship is always in the Loading state, run the reset ospf process-id process command to restart the OSPF process. If the fault persists, go to Step 2.

Step 2 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
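The decision logic of this procedure can be summarized as a lookup from the stuck neighbor state to the next action. This is only a sketch of the steps above; the function name next_action and the wording of the returned strings are hypothetical.

```python
def next_action(state, dr_priority=None):
    """Return the suggested action for a neighbor that stays in the given state.

    state is the displayed neighbor state, or None if no state is displayed.
    dr_priority is the DR priority of the local OSPF interface (used for 2-Way only).
    """
    if state is None:
        return "follow the procedure in The OSPF Neighbor Relationship Is Down"
    key = state.strip().lower()
    if key == "init":
        return "check whether the link or the remote device is faulty"
    if key == "2-way":
        if dr_priority == 0:
            return "normal: DR priority is 0, no action needed"
        return "go to Step 2"
    if key == "exstart":
        return "check oversized-packet reachability and OSPF MTU consistency"
    if key == "exchange":
        return "check whether the link or the remote device is faulty"
    if key == "loading":
        return "reset the OSPF process (interrupts services temporarily)"
    raise ValueError(f"state not covered by this procedure: {state}")


if __name__ == "__main__":
    for s, pri in [(None, None), ("Init", None), ("2-Way", 0), ("2-Way", 1), ("Loading", None)]:
        print(s, pri, "->", next_action(s, pri))
```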
Relevant Logs
None.
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
On the network shown in Figure 6-3, traffic is unevenly distributed between the path from Router A to the BAS and the path from Router B to the BAS. You need to configure load balancing between the path Router A -> BAS -> destination and the path Router A -> Router B -> BAS -> destination for the traffic transmitted from Router A to the network segment to which the BAS is connected.
Figure 6-3 Diagram of the networking where a device fails to calculate a route
[Figure not reproduced. Recoverable labels: RouterA, RouterB, BAS, 10.1.1.0.]
The following takes traffic to destination network segment 10.1.1.0 as an example. On Router B, a static route to 10.1.1.0 is configured and OSPF is configured to import static routes. Router A receives an ASE LSA with the LS ID being 10.1.1.0 from Router B and an ASE LSA with the LS ID also being 10.1.1.0 from the BAS. Router A can calculate a route based on the LSA received from the BAS, but fails to calculate a route based on the LSA received from Router B.
Fault Analysis
The possible causes are as follows:
1. The configurations of devices are incorrect.
2. The FA field in the LSA sent by Router B is 10.1.2.26. The LSA is not calculated because the FA field of the LSA is incorrect.
3. The conditions of generating routes for load balancing are not met.
Based on the analysis of the preceding possible causes, it can be concluded:
1. The configurations of the devices are normal.
2. The FA field of the LSA meets the condition of route calculation.
<RouterA> ping 10.1.3.1
PING 10.1.3.1: 56 data bytes, press CTRL_C to break
Reply from 10.1.3.1: bytes=56 Sequence=1 ttl=255 time=1 ms
Reply from 10.1.3.1: bytes=56 Sequence=2 ttl=255 time=1 ms
Reply from 10.1.3.1: bytes=56 Sequence=3 ttl=255 time=1 ms
Reply from 10.1.3.1: bytes=56 Sequence=4 ttl=255 time=1 ms
Reply from 10.1.3.1: bytes=56 Sequence=5 ttl=255 time=1 ms
--- 10.1.3.1 ping statistics ---
5 packet(s) transmitted
5 packet(s) received
0.00% packet loss
round-trip min/avg/max = 1/1/1 ms
<RouterA> display ip routing-table 10.1.3.1
Route Flags: R - relay, D - download for forwarding
------------------------------------------------------------------------------
Routing Table : Public
Summary Count : 2
Destination/Mask    Proto    Pre    Cost    Flags    NextHop      Interface
10.1.3.1/32                         1       D        10.1.2.45    GE1/0/5
                                    1       D        10.1.2.49    GE1/0/6
Reply from 10.1.2.26: bytes=56 Sequence=2 ttl=254 time=1 ms
0.00% packet loss
round-trip min/avg/max = 1/1/1 ms
<RouterA> display ip routing-table 10.1.2.26
10.1.2.24/30    OSPF    10    101    D    10.1.2.45    GE1/0/5
                OSPF    10    101    D    10.1.2.49    GE1/0/6
3. On this network, the costs of LSAs are 1. You need to compare the cost of the route to the ASBR and the cost of the route to the FA. For Type 2 ASE LSAs, OSPF equal-cost routes can be generated when the following conditions are met:
(1) The costs of LSAs are the same.
(2) The cost of the route to the ASBR is the same as the cost of the route to the FA.
On the network, the cost of the route to the FA is 101.
- For the LSA with the FA field being 0.0.0.0, the cost of the route to the ASBR at 10.1.3.21 is 1.
- For the LSA with the FA field not being 0.0.0.0, the cost of the route to the FA at 10.1.2.26 is 101.
The LSA with the FA field set is not used in route calculation because its priority is lower. As a result, equal-cost routes cannot be formed.
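The two conditions for equal-cost Type 2 ASE routes can be expressed as a comparison on a pair of values: the LSA cost first, then the cost of the route to the ASBR or FA. The sketch below is a simplified model of that comparison, not the device's SPF implementation; the class AseLsa and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AseLsa:
    advertiser: str       # device that originated the LSA
    lsa_cost: int         # Type 2 external cost carried in the LSA
    forwarding_cost: int  # cost of the route to the ASBR (FA = 0.0.0.0) or to the FA


def preferred(lsas):
    """Return the LSAs that tie for the best (lsa_cost, forwarding_cost) pair.

    More than one LSA in the result means equal-cost routes can be formed.
    """
    best = min((lsa.lsa_cost, lsa.forwarding_cost) for lsa in lsas)
    return [lsa for lsa in lsas if (lsa.lsa_cost, lsa.forwarding_cost) == best]


if __name__ == "__main__":
    # Before the fix: the BAS LSA has FA 0.0.0.0 (cost to ASBR 1),
    # and the Router B LSA has an FA set (cost to FA 101).
    before = [AseLsa("BAS", 1, 1), AseLsa("RouterB", 1, 101)]
    # After the fix: both LSAs carry an FA and both routes to the FA cost 101.
    after = [AseLsa("BAS", 1, 101), AseLsa("RouterB", 1, 101)]
    print([lsa.advertiser for lsa in preferred(before)])
    print([lsa.advertiser for lsa in preferred(after)])
```

With the costs from this case, only the BAS LSA is used before the change, and both LSAs tie once the cost of the route to each FA is 101.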
Procedure
Step 1 To form equal-cost routes on the network, do as follows: On the BAS, enable OSPF on the next-hop interface of the route to 10.1.1.0. Set the cost on the interface to 100 so that the interface advertises LSAs with the FA field being its address. Then, you can find two LSAs with FA fields on Router A. The cost of the route to one FA and the cost of the route to the other FA are both 101. Thus, equal-cost routes can be formed. ----End
Summary
None.
6.3.2 The OSPF Neighbor Relationship Cannot Be Established Between Two Devices Because the Link Between the Devices Is Faulty
NOTE
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
In the networking shown in Figure 6-4, the OSPF neighbor relationship cannot be established between Router A and its neighbor, and the neighbor is in the Exchange state.
Figure 6-4 Diagram of the networking where the neighbor relationship cannot be established between two devices
[Figure not reproduced. Recoverable label: RouterA.]
Fault Analysis
The possible causes are as follows:
1. The configurations of Router A are improper.
2. Related parameters of the two devices are incorrectly set.
3. Device hardware is faulty.
Procedure
Step 1 Check the configurations of the devices. You can find that the configurations of the devices are correct.
Step 2 Check the OSPF parameters on the corresponding interfaces. You can find that the OSPF parameters on the interfaces are correctly set.
Step 3 Check the MTUs on the Huawei device and non-Huawei device. You can find that the MTU is not successfully negotiated between the two devices. The MTUs on the two devices are 4470. However, the check result shows that the MTU contained in the packet received by the non-Huawei device is 0, which indicates that the MTU is not set on the peer device. It is concluded that the link does not work normally.
Run the following command on Router A to ping the peer device. You can find that packet loss occurs.
<RouterA> ping 10.1.1.0
PING 10.1.1.0: 56 data bytes, press CTRL_C to break
Request time out
Reply from 10.1.1.0: bytes=56 Sequence=2 ttl=255 time=5 ms
Reply from 10.1.1.0: bytes=56 Sequence=3 ttl=255 time=5 ms
Reply from 10.1.1.0: bytes=56 Sequence=4 ttl=255 time=5 ms
Request time out
40.00% packet loss
Check that the link between intermediate transmission devices is normal. Collect traffic statistics on Router A. You can find that packet loss does not occur on Router A. Thus, packet loss may occur on the board of the peer device or on the link. Collect traffic statistics on the peer device. You can find that packet loss occurs on the board of the non-Huawei device because the board is faulty. After the board is replaced, the fault is rectified. ----End
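The 40.00% figure in the ping output above follows from two timeouts out of five probes. As a rough sketch, assuming each probe produces either a "Reply from" line or a "Request time out" line as in that output, the loss rate can be recomputed from saved ping results; the function name packet_loss is hypothetical.

```python
def packet_loss(lines):
    """Return the packet loss percentage computed from per-probe ping result lines."""
    replies = sum(1 for line in lines if line.strip().startswith("Reply from"))
    timeouts = sum(1 for line in lines if line.strip().startswith("Request time out"))
    sent = replies + timeouts
    if sent == 0:
        raise ValueError("no probe results found")
    return 100.0 * timeouts / sent


if __name__ == "__main__":
    output = [
        "Request time out",
        "Reply from 10.1.1.0: bytes=56 Sequence=2 ttl=255 time=5 ms",
        "Reply from 10.1.1.0: bytes=56 Sequence=3 ttl=255 time=5 ms",
        "Reply from 10.1.1.0: bytes=56 Sequence=4 ttl=255 time=5 ms",
        "Request time out",
    ]
    print(f"{packet_loss(output):.2f}% packet loss")
```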
Summary
None
6.3.3 An OSPF Routing Loop Occurs Because Router IDs of Devices Conflict
NOTE
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
In the networking shown in Figure 6-5, OSPF multi-instance is run between PEs and CEs. The CEs are Layer 3 switches of other manufacturers. The PEs deliver OSPF default routes to interwork the networks of the two cities. PE1 and PE2 are connected to the same UMG. The same IP address 10.1.1.33 is set for the interface connecting PE1 to the UMG and the interface connecting PE2 to the UMG, and the two interfaces are bound to the VPN instance of the UMG. Normally, the link between the UMG and PE2 is Down. Therefore, the two interfaces with the IP address being 10.1.1.33 on the two PEs cannot be in the Up state at the same time. CE1 can successfully ping PE1, and CE2 can successfully ping PE2. When a CE pings its remote peer or a device on the remote network, packet loss, however, occurs irregularly.
[Figure 6-5 not reproduced. Recoverable labels: City A, City B, CE1, CE2.]
Fault Analysis
1. 10.1.1.33 is the largest IP address in the VPN instance to which the two PEs are bound, and the following command is run to configure OSPF multi-instance:
<PE1> ospf 4 vpn-instance www
Therefore, PE1 and PE2 select 10.1.1.33 as their router ID.
2. On CE1, you can find that the router ID of PE1 is 10.1.1.33; on CE2, you can find that the router ID of PE2 is also 10.1.1.33.
3. The debugging information on the CEs shows that a device with the router ID being 10.1.1.33 sends LSAs every five seconds and the sequence numbers of LSAs are incremental and unstable.
4. The CEs receive LSAs sent by two devices with the same router ID. Therefore, the OSPF default routes in the routing tables of the CEs constantly change. When the default route of CE1 is learned by CE2 and the default route of CE2 is learned by CE1, a routing loop occurs. As a result, routes are unreachable and packet loss occurs.
Procedure
Step 1 Run the ospf 4 router-id 10.2.2.9 vpn-instance www command on PE1 to specify the router ID of the OSPF multi-instance as the unique address of PE1, and run the ospf 4 router-id 10.2.2.10 vpn-instance www command on PE2 to specify the router ID of the OSPF multi-instance as the unique address of PE2.
[PE1] ospf 4 router-id 10.2.2.9 vpn-instance www
[PE2] ospf 4 router-id 10.2.2.10 vpn-instance www
Step 2 Restart the OSPF process associated with the VPN instance on PE1, and then perform the same operation on PE2. After that, services are restored. ----End
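The conflict in this case can be reproduced with a short sketch. It assumes, as the fault analysis describes, that a process without an explicit router ID picks the largest IP address in the VPN instance; apart from 10.1.1.33, 10.2.2.9, and 10.2.2.10, the addresses below are hypothetical, as are the function names.

```python
import ipaddress
from collections import defaultdict


def default_router_id(addresses):
    """Pick the numerically largest address, as described in the fault analysis."""
    return str(max(ipaddress.ip_address(a) for a in addresses))


def find_conflicts(router_ids):
    """Return {router_id: [devices]} for every router ID used by more than one device."""
    by_id = defaultdict(list)
    for device, rid in router_ids.items():
        by_id[rid].append(device)
    return {rid: sorted(devs) for rid, devs in by_id.items() if len(devs) > 1}


if __name__ == "__main__":
    # 10.1.1.33 is configured on both PEs; the other addresses are made up.
    automatic = {
        "PE1": default_router_id(["10.1.1.33", "10.1.1.5"]),
        "PE2": default_router_id(["10.1.1.33", "10.1.1.9"]),
    }
    print(find_conflicts(automatic))
    # Explicit router IDs from Step 1 remove the conflict.
    explicit = {"PE1": "10.2.2.9", "PE2": "10.2.2.10"}
    print(find_conflicts(explicit))
```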
Summary
None.
6.3.4 Services on the Master Plane Are Interrupted Because a Device on the Slave Plane Performs Master/Slave Switchover on a Bearer Network
NOTE
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
On the bearer network shown in Figure 6-6, there are two planes working in master and slave mode. The traffic model of the master plane is UMG -> CE1 -> AR1 -> BR1 -> AR1' -> CE1', and the return path is the same. During the master/slave switchover of AR2 on the slave plane, packet loss (which lasts 1 second) occurs on the master plane. Similarly, during the master/slave switchover of AR1 on the master plane, packet loss also occurs on the slave plane.
Figure 6-6 Diagram of the networking where services are interrupted on the master plane
[Figure not reproduced. Recoverable labels: CE1, AR1, BR1, AR1', CE1', UMG'.]
Fault Analysis
The analysis of the topology and routes of the current network shows that the master and slave planes are independent of each other. Therefore, the master/slave switchover of a device on one plane should not affect the services on the other plane. The configurations of the devices on the current network, however, show that the route aggregation command is run in the OSPF multi-instance on all ARs.
ospf 1 vpn-instance 123
 asbr-summary 10.0.0.0 255.0.0.0
The network command is run to configure BGP to advertise routes. The routes to the remote end are thus aggregated into a route 10.0.0.0/8. Based on the route aggregation rules on the ABR, the cost of the aggregated route is the largest among the costs of the routes that are aggregated (after route aggregation, the ASBR and the ABR send the aggregated route with the largest cost). For example, there are the following routes to be aggregated:
l l l
The aggregated route is 10.0.0.0/8 with a cost of 1000. After AR1 on the master plane performs the master/slave switchover, AR2 needs to reconverge private network routes. If AR2 receives a route 10.x.x.x with a cost smaller than 200, the cost of the aggregated route 10.0.0.0/8 advertised by AR2 to CE2 becomes smaller, and the aggregated route is advertised to CE1 by OSPF. The traffic model on the master plane becomes UMG -> CE1 -> CE2 -> AR2 -> BR2 -> AR2' -> CE2'. Because the network scale is large, AR2 has not completed route convergence yet, that is, AR2 has no specific route to the destination. As a result, packet loss occurs on the master plane.
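The effect of the aggregation rule described above can be illustrated with a minimal Python sketch. It is not the device implementation. The function name summary_cost and all route prefixes and costs in it are hypothetical values chosen for illustration, and the sketch assumes only the rule stated in this section: the aggregated route takes the largest cost among the routes it covers unless a cost is specified.

```python
from ipaddress import ip_network


def summary_cost(summary, routes, fixed_cost=None):
    """Return the cost advertised for `summary`, or None if no route falls inside it.

    routes maps a prefix string to its cost. fixed_cost models a cost that is
    specified explicitly when the aggregation is configured.
    """
    net = ip_network(summary)
    costs = [cost for prefix, cost in routes.items()
             if ip_network(prefix).subnet_of(net)]
    if not costs:
        return None
    return fixed_cost if fixed_cost is not None else max(costs)


# Hypothetical routing states: fully converged, and partway through reconvergence.
converged = {"10.1.0.0/16": 100, "10.2.0.0/16": 1000, "192.168.0.0/24": 5}
reconverging = {"10.1.0.0/16": 100}

print(summary_cost("10.0.0.0/8", converged))                     # 1000
print(summary_cost("10.0.0.0/8", reconverging))                  # 100
print(summary_cost("10.0.0.0/8", reconverging, fixed_cost=300))  # 300
```

With no cost specified, the advertised cost drops while only low-cost routes are present, which is what makes the route through the other plane temporarily more attractive. With a specified cost, the advertised value does not depend on which routes have been relearned.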
Procedure
Step 1 When configuring route aggregation, specify the cost of the aggregated route to prevent route re-selection when a route with a smaller cost is received.
ospf 1 vpn-instance 123
 asbr-summary 10.0.0.0 255.0.0.0 cost 300
After the preceding operations, no traffic passes through the link between CE1 and CE2, and traffic on the master plane is always forwarded on the master plane.
----End
Summary
When planning a network with dual planes, prevent the two planes from affecting each other because of factors such as IS-IS checksum errors, incorrect costs of OSPF routes, and incorrect costs of IS-IS routes.
7 IS-IS Troubleshooting
About This Chapter
This chapter describes common IS-IS faults and provides troubleshooting procedures together with troubleshooting cases. For details about IS-IS, refer to the HUAWEI NetEngine5000E Feature Description - IS-IS.
7.1 The IS-IS Neighbor Relationship Cannot Be Established
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that the IS-IS neighbor relationship cannot be established on an IS-IS network.
7.2 A Device Fails to Learn Specified IS-IS Routes from Its Neighbor
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that a device fails to learn specified IS-IS routes from its neighbor.
7.3 IS-IS Routes Flap
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that IS-IS routes flap on an IS-IS network.
7.4 Trouble Cases
Figure 7-1 Flowchart for troubleshooting the fault that the IS-IS neighbor relationship cannot be established
The flowchart covers the following checks in order: whether the IS-IS status of the interface is Up, whether Hello packets are normally sent and received, and whether the local IS-IS parameters match those of the neighbor. If the fault persists after all checks, seek technical support.
Procedure
Step 1 Check the status of IS-IS interfaces. Run the display isis interface command to check the state of interfaces enabled with IS-IS (the value of the IPv4.State or IPv6.State item).
- If the state is Mtu:Up/Lnk:Dn/IP:Dn, go to Step 2.
- If the state is Mtu:Dn/Lnk:Up/IP:Up, run the display current-configuration interface interface-type [ interface-number ] command to check the MTUs on the interfaces. Run the display current-configuration configuration isis command to check the lengths of LSPs in an IS-IS process.
NOTE
If the lengths of LSPs cannot be viewed by using the display current-configuration configuration isis command, the default LSP lengths are used. The default LSP lengths can be viewed by using the display default-parameter isis command. The value of the LSP-Originate-Length field is the maximum length of an originated LSP, and the value of the LSP-Receive-Length field is the maximum length of a received LSP.
If the MTU cannot be viewed by using the display current-configuration interface interface-type [ interface-number ] command, the default MTU is used. The default MTU is 4470 bytes on POS interfaces and 1500 bytes on other interfaces.
On a P2P interface, the LSP length must not be greater than the MTU on the interface. On a broadcast interface, the MTU on the interface minus the LSP length must be equal to or greater than 3. If the condition is not met, run the lsp-length command in the IS-IS view to change the LSP length, or run the mtu command in the interface view to change the MTU.
If two Huawei devices need to communicate with each other, check that the MTUs on the connected interfaces are the same. If a Huawei device needs to communicate with another vendor's device, adjust the MTUs on both ends as required.
If the fault is still not rectified, go to Step 4.
- If the state is Down, run the display current-configuration configuration isis command to check the configuration of the IS-IS process. Check whether the NET is configured in the IS-IS process. If not, configure the network-entity command in the IS-IS process. If the fault is still not rectified, go to Step 2.
- If the state is Up, go to Step 4.
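The MTU and LSP length conditions given in the NOTE above can be checked offline with a short Python sketch. The function name lsp_length_fits and the sample values are illustrative assumptions and do not correspond to a device command.

```python
def lsp_length_fits(interface_type, mtu, lsp_length):
    """Apply the MTU/LSP length rule for one interface.

    p2p:       the LSP length must not exceed the MTU.
    broadcast: the MTU minus the LSP length must be at least 3.
    """
    if interface_type == "p2p":
        return lsp_length <= mtu
    if interface_type == "broadcast":
        return mtu - lsp_length >= 3
    raise ValueError(f"unknown interface type: {interface_type!r}")


# Hypothetical values read from the configuration of one device.
print(lsp_length_fits("p2p", 1500, 1497))        # True
print(lsp_length_fits("broadcast", 1500, 1497))  # True
print(lsp_length_fits("broadcast", 1500, 1498))  # False
```

If the function returns False for an interface, the LSP length or the MTU needs to be changed as described in the NOTE.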
Step 2 Check that the interface status is Up. Run the display ip interface [ interface-type [ interface-number ] ] command to check the status of specified interfaces.
- If the interface link status (Line protocol current state field in the output information) is not Up, troubleshoot the interface fault. See the section "Physical Connection and Interfaces" or "L2 Network". If the fault is still not rectified, go to Step 3.
- If the interface status is Up, go to Step 3.
Step 3 Check that the IP addresses of the two interfaces at both ends of the link are on the same network segment.
- If the IP addresses of the two interfaces are on different network segments, change the IP addresses of the two interfaces to ensure that the two IP addresses are on the same network segment. If the fault is still not rectified, go to Step 4.
- If the IP addresses of the two interfaces are on the same network segment, go to Step 4.
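The network segment comparison in Step 3 can be done by hand or, as a rough aid, with the Python standard library. The helper name same_segment and the addresses below are illustrative assumptions.

```python
from ipaddress import ip_interface


def same_segment(local, remote):
    """Return True if two interface addresses (address/mask) share one network segment."""
    return ip_interface(local).network == ip_interface(remote).network


# Hypothetical addresses taken from the two ends of a link.
print(same_segment("10.1.1.1/30", "10.1.1.2/30"))  # True
print(same_segment("10.1.1.1/30", "10.1.1.5/30"))  # False
```

The comparison includes the mask, so two addresses configured with different mask lengths are reported as being on different segments.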
Step 4 Check that IS-IS can normally receive and send Hello packets. Run the display isis statistics packet [ interface interface-type interface-number ] command to check whether IS-IS can normally receive and send Hello packets.
The default interval at which IS-IS sends Hello packets is 10 seconds. Therefore, run this command every 10 seconds to check whether the packet statistics (L1 IIH or L2 IIH) increase. On a broadcast interface, Hello packets have IS-IS levels, and therefore you can view the statistics about Hello packets based on the levels of established neighbor relationships. On a P2P interface, Hello packets have no IS-IS levels and are recorded as L2 IIH packets.
If the number of received Hello packets does not increase for a certain period, check whether the IS-IS packets are lost. For a broadcast interface, run the debugging ethernet packet isis interface-type interface-number command. The following information indicates that the interface can normally receive and send IS-IS Hello packets.
*0.75124950 HUAWEI ETH/7/eth_rcv:Slot=3;Receive an Eth Packet, interface : Ethernet1/0/0, eth format: 3, length: 60, protoctype: 8000 isis, src_eth_addr: 00e0-fc37-08c1, dst_eth_addr: 0180-c200-0015
*0.75124950 HUAWEI ETH/7/eth_send:Slot=3;Send an Eth Packet, interface : Ethernet1/0/0, eth format: 3, length: 112, protoctype: 8000 isis, src_eth_addr: 00e0-fc26-f9d9, dst_eth_addr : 0180-c200-0015
For a broadcast interface, run the debugging ethernet packet isis interface-type interface-number command. The following information indicates that the interface can normally receive and send IS-IS Hello packets.
*0.75124950 HUAWEI ETH/7/eth_rcv:Receive an Eth Packet, interface : Vlanif10, eth format: 3, length: 60, protoctype: 8000 isis, src_eth_addr: 00e0-fc37-08c1, dst_eth_addr: 0180-c200-0015
*0.75124950 HUAWEI ETH/7/eth_send:Send an Eth Packet, interface : Vlanif10, eth format: 3, length: 112, protoctype: 8000 isis, src_eth_addr: 00e0-fc26-f9d9, dst_eth_addr : 0180-c200-0015
For a P2P interface, run the debugging ppp osi-npdu packet interface-type interface-number command. The following information indicates that the interface can normally receive and send IS-IS Hello packets.
*0.85102199 HUAWEI PPP7/debug2:Slot=2; PPP Packet: Pos2/0/0 Output OSI-NPDU(0023) Pkt, Len 1004
*0.85102199 HUAWEI PPP7/debug2:Slot=2; PPP Packet: Pos2/0/0 Input OSI-NPDU(0023) Pkt, Len 1501
NOTE
If the DIS field shown in the output of the display isis interface interface-type interface-number command is "--", the interface type is P2P. Otherwise, the interface type is Broadcast.
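When a long debugging capture has been saved to a file, a script can help confirm that IS-IS frames were seen in both directions on a broadcast interface. The following Python sketch assumes that each debugging message occupies one line and follows the Ethernet format of the samples in this step. The pattern and the function name hello_directions are illustrative and were written against those samples only. The sketch does not distinguish Hello packets from other IS-IS packets.

```python
import re

# Matches the eth_rcv / eth_send messages in the sample output and captures
# the direction and the interface name.
ISIS_FRAME = re.compile(r"ETH/\d+/eth_(rcv|send):.*?interface\s*:\s*(\S+?),.*?isis")

SAMPLE = """\
*0.75124950 HUAWEI ETH/7/eth_rcv:Slot=3;Receive an Eth Packet, interface : Ethernet1/0/0, eth format: 3, length: 60, protoctype: 8000 isis, src_eth_addr: 00e0-fc37-08c1, dst_eth_addr: 0180-c200-0015
*0.75124950 HUAWEI ETH/7/eth_send:Slot=3;Send an Eth Packet, interface : Ethernet1/0/0, eth format: 3, length: 112, protoctype: 8000 isis, src_eth_addr: 00e0-fc26-f9d9, dst_eth_addr : 0180-c200-0015
"""


def hello_directions(debug_text):
    """Map each interface name to the set of directions seen ('received', 'sent')."""
    seen = {}
    for line in debug_text.splitlines():
        match = ISIS_FRAME.search(line)
        if match:
            direction = "received" if match.group(1) == "rcv" else "sent"
            seen.setdefault(match.group(2), set()).add(direction)
    return seen


result = hello_directions(SAMPLE)
print(sorted(result["Ethernet1/0/0"]))  # ['received', 'sent']
```

An interface that shows only 'sent' suggests that packets from the neighbor are not arriving.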
- If the device cannot normally receive and send Hello packets, go to Step 9.
- If the device can normally receive Hello packets, go to Step 5.
- If the interfaces at both ends of the link are trunk interfaces, check whether the numbers of member interfaces in the Up state in the trunk interfaces are the same. If the numbers are different, add the required physical interfaces to the trunk interface correctly. Otherwise, go to Step 2.
- If the interfaces at both ends of the link are not trunk interfaces, go to Step 2.
Step 5 Check that the devices at both ends of the link are configured with different system IDs. Run the display current-configuration configuration isis command to check whether the system IDs of the two devices are the same.
- If the system IDs of the two devices are the same, set different system IDs for the two devices.
- If the system IDs of the two devices are different, go to Step 6.
Step 6 Check that the IS-IS levels of the two devices at both ends of the link are consistent. Run the display current-configuration configuration isis | include is-level command to check the levels of the IS-IS processes on the two devices. Then, run the display current-configuration interface interface-type interface-number | include isis circuit-level command to check whether the IS-IS levels of the interfaces at both ends of the link match. The IS-IS neighbor relationship can be established only when the IS-IS levels of the two interfaces match.
NOTE
If the IS-IS levels of the two interfaces cannot be viewed by using the display current-configuration interface interface-type interface-number | include isis circuit-level command, the two interfaces use the default IS-IS level. The default IS-IS level can be viewed by using the display default-parameter isis command. The value of the Circuit-Level field is the default IS-IS level.
The matching rules of interface levels are as follows:
- If the level of the local interface is Level-1, the level of the remote interface must be Level-1 or Level-1-2.
- If the level of the local interface is Level-2, the level of the remote interface must be Level-2 or Level-1-2.
- If the level of the local interface is Level-1-2, the level of the remote interface can be Level-1, Level-2, or Level-1-2.
- If the IS-IS levels of the two devices do not match, run the is-level command in the IS-IS view to set matching IS-IS levels for the two devices, or run the isis circuit-level command in the interface view to change the levels of related interfaces.
- If the IS-IS levels of the two devices are consistent, go to Step 7.
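The level matching rules listed in the NOTE of this step can be written as a lookup table. The following Python sketch is an illustration only. The lowercase level names and the function name levels_match are assumptions made for the example.

```python
# For each local interface level, the remote levels that allow a neighbor relationship.
COMPATIBLE_LEVELS = {
    "level-1": {"level-1", "level-1-2"},
    "level-2": {"level-2", "level-1-2"},
    "level-1-2": {"level-1", "level-2", "level-1-2"},
}


def levels_match(local_level, remote_level):
    """Return True if the two interface levels allow a neighbor relationship."""
    return remote_level in COMPATIBLE_LEVELS[local_level]


print(levels_match("level-1", "level-1-2"))  # True
print(levels_match("level-1", "level-2"))    # False
```

The table is symmetric, so the result does not depend on which end is treated as local.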
Step 7 Check that the area addresses of the two devices at both ends of the link are the same. When the area addresses of the two devices are different, the log ISIS_AREA_MISMATCH is generated.
NOTE
If two devices at both ends of a link establish a Level-1 neighbor relationship, ensure that the two devices are in the same area. An IS-IS process can be configured with a maximum of three area addresses. As long as one of the area addresses of the local IS-IS process is the same as one of the area addresses of the remote IS-IS process, the Level-1 neighbor relationship can be established. When the IS-IS Level-2 neighbor relationship is established between two devices, you do not need to determine whether the area addresses of the two devices match.
- If the area addresses of the two devices are different, run the network-entity command in the IS-IS view to set the same area address for the two devices.
- If the area addresses of the two devices at both ends of the link are the same, go to Step 8.
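The Level-1 area rule in the NOTE of this step (at most three area addresses per process, and at least one area address in common) can be sketched in Python as follows. The sketch assumes NETs written in dotted notation with the system ID as three groups of four hexadecimal digits followed by a two-digit selector. The function names area_of and level1_can_peer and the sample NETs are hypothetical.

```python
def area_of(net):
    """Return the area address of a NET such as 49.0001.0000.0000.0001.00.

    Assumes the last four dot-separated groups are the system ID (three
    groups) and the selector (one group).
    """
    groups = net.split(".")
    if len(groups) < 5:
        raise ValueError(f"NET too short: {net!r}")
    return ".".join(groups[:-4])


def level1_can_peer(local_nets, remote_nets):
    """Return True if the two processes share at least one area address."""
    for nets in (local_nets, remote_nets):
        if not 1 <= len(nets) <= 3:
            raise ValueError("a process has between one and three area addresses")
    local_areas = {area_of(net) for net in local_nets}
    remote_areas = {area_of(net) for net in remote_nets}
    return bool(local_areas & remote_areas)


local = ["49.0001.0000.0000.0001.00", "49.0002.0000.0000.0001.00"]
print(level1_can_peer(local, ["49.0002.0000.0000.0002.00"]))  # True
print(level1_can_peer(local, ["49.0003.0000.0000.0002.00"]))  # False
```

For a Level-2 neighbor relationship this check is not needed, as stated in the NOTE.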
Step 8 Check that the authentication configurations of the two devices at both ends of the link are the same. If the authentication types of the two devices are different, the log ISIS_AUTHENTICATION_TYPE_FAILURE or the log ISIS_AUTHENTICATION_FAILURE is generated. Run the display current-configuration interface interface-type interface-number | include isis authentication-mode command to check whether the IS-IS authentication configurations of the two interfaces at both ends of the link are the same.
- If the authentication types on the two interfaces are different, run the isis authentication-mode command in the view of each of the two interfaces to set the same authentication type for the two interfaces.
- If the authentication passwords on the two interfaces are different, run the isis authentication-mode command in the view of each of the two interfaces to set the same authentication password for the two interfaces.
- If the authentication configurations of the two devices are the same, go to Step 9.
Step 9 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
ISIS_AREA_MISMATCH
ISIS_AUTHENTICATION_TYPE_FAILURE
ISIS_AUTHENTICATION_FAILURE
7.2 A Device Fails to Learn Specified IS-IS Routes from Its Neighbor
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that a device fails to learn specified IS-IS routes from its neighbor.
The possible causes of this fault are as follows:
- Another routing protocol whose priority is higher than that of IS-IS advertises the same routes as those advertised by IS-IS.
- The preferences of the imported external routes are low, and therefore the imported external routes are not preferred.
- The IS-IS cost styles of the two devices are inconsistent.
- The IS-IS neighbor relationship is not normally established between the two devices.
- The two devices are configured with the same system ID.
- The authentication configurations of the two devices are inconsistent.
- LSP loss occurs due to a device fault or a link fault.
Figure 7-2 Flowchart for troubleshooting the fault that a device fails to learn IS-IS routes from its neighbor
The flowchart covers the following checks: check the IS-IS configuration of the device that advertises the routes, ensure that the cost styles of the interfaces on both ends of the link are consistent, and check whether the IS-IS neighbor relationship is normally established. If the neighbor relationship is not established, troubleshoot that fault first. If the fault persists after all checks, seek technical support.
Procedure
Step 1 Check that the IS-IS routing table of the device that fails to learn specified routes is correct. Run the display isis route command to view the IS-IS routing table.
- If the specified routes exist in the IS-IS routing table, run the display ip routing-table ip-address [ mask | mask-length ] verbose command to check whether routes advertised by a routing protocol whose priority is higher than that of IS-IS exist in the routing table.
NOTE
If the value of the State field of a route is Active Adv, it indicates that the route is an active route. If there are multiple routes that have the same prefix but are advertised by different routing protocols, the route advertised by the routing protocol with the highest priority is preferred as the active route.
If there are such routes in the routing table, adjust the configuration based on the network planning. If there are no such routes in the routing table, go to Step 6.
- If there is no specified route in the IS-IS routing table, go to Step 2.
Step 2 Check that the specified IS-IS routes are advertised. On the device that advertises specified routes, run the display isis lsdb verbose local command to check whether LSPs generated by the device carry the specified routes.
- If the LSPs do not carry the specified routes, check whether the configurations of the device are correct, for example, whether IS-IS is enabled on associated interfaces.
NOTE
If the specified routes are imported external routes, run the display ip routing-table protocol protocol verbose command to check whether the external routes are active routes.
Step 3 Check that IS-IS LSDBs are synchronized. On the device that fails to learn specified IS-IS routes, run the display isis lsdb command to check whether the device learns LSPs from the device that advertises specified routes.
NOTE
LSPID identifies an LSP, and Seq Num is the sequence number of an LSP. The greater the sequence number, the newer the LSP.
- If the LSDB of the device that fails to learn specified IS-IS routes does not have the specified LSPs, do as follows as required:
  - If the log ISIS_AUTHENTICATION_TYPE_FAILURE or the log ISIS_AUTHENTICATION_FAILURE is generated, the authentication types or authentication passwords of the device that fails to learn specified routes and the device that advertises the specified routes are inconsistent. In this case, set the same authentication type and authentication password for the two devices.
  - If neither the log ISIS_AUTHENTICATION_TYPE_FAILURE nor the log ISIS_AUTHENTICATION_FAILURE is generated, check whether devices or intermediate links are faulty.
- If the LSDB of the device that fails to learn specified IS-IS routes contains the specified LSPs, but the Seq Num fields of the LSPs differ from those in the output of the display isis lsdb local verbose command, and the values of the Seq Num fields keep increasing, another device on the network is configured with the same system ID as the device that advertises the specified routes. In this case, the log ISIS_SEQUENCE_NUMBER_SKIP is generated, and you need to check the IS-IS configurations on the devices on the network.
- If the LSDB of the device that fails to learn specified IS-IS routes contains the specified LSPs, but the Seq Num fields of the LSPs are inconsistent and the values of the Seq Num fields remain unchanged, the LSPs may be discarded during transmission. In this case, check whether devices or intermediate links are faulty.
- If the LSDB of the device that fails to learn specified IS-IS routes contains the specified LSPs and the Seq Num fields of the LSPs are consistent, go to Step 4.
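The Seq Num comparison in Step 3 can be summarized as a small decision function. The following Python sketch is illustrative only. The function name diagnose_lsp, its inputs (Seq Num samples collected over time on the device that fails to learn the routes, and the latest Seq Num on the advertising device), and the returned labels are assumptions made for this example.

```python
def diagnose_lsp(learner_samples, originator_seq):
    """Classify the LSDB state following the cases in Step 3.

    learner_samples: Seq Num values of the LSP sampled over time on the
                     learning device (an empty list means the LSP is missing).
    originator_seq:  latest Seq Num of the same LSP on the advertising device.
    """
    if not learner_samples:
        return "missing: check authentication logs, then devices and links"
    if learner_samples[-1] == originator_seq:
        return "synchronized: go to Step 4"
    if learner_samples[-1] > learner_samples[0]:
        return "increasing: check for a duplicate system ID"
    return "unchanged: LSPs may be discarded in transit"


print(diagnose_lsp([], 0x20))                  # missing: ...
print(diagnose_lsp([0x1f, 0x20], 0x20))        # synchronized: go to Step 4
print(diagnose_lsp([0x30, 0x35, 0x3a], 0x20))  # increasing: ...
print(diagnose_lsp([0x10, 0x10], 0x20))        # unchanged: ...
```

The labels only name the case. The corrective action for each case is the one given in the step.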
Step 4 Check whether the IS-IS cost styles of the two devices are consistent. Run the display current-configuration configuration isis command on the device that advertises specified routes and the device that fails to learn specified IS-IS routes respectively to check whether the IS-IS cost styles (the cost-style command) of the two devices are consistent.
NOTE
Two devices can learn routes from each other only when the IS-IS cost styles of the two devices match. The IS-IS cost styles are classified as follows:
- narrow: indicates that the packets with the cost style being narrow can be received and sent.
- narrow-compatible: indicates that the packets with the cost style being narrow or wide can be received but only the packets with the cost style being narrow can be sent.
- compatible: indicates that the packets with the cost style being narrow or wide can be received and sent.
- wide-compatible: indicates that the packets with the cost style being narrow or wide can be received but only the packets with the cost style being wide can be sent.
- wide: indicates that the packets with the cost style being wide can be received and sent.
If the cost style of one device is narrow and the cost style of the other device is wide or wide-compatible, or the cost style of one device is narrow-compatible and the cost style of the other device is wide, the two devices cannot interwork.
- If the IS-IS cost styles on the two devices are inconsistent, run the cost-style command to set the same IS-IS cost style for the two devices.
- If the IS-IS cost styles on the two devices are consistent, go to Step 5.
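The cost style rules in the NOTE of this step can be modeled as the sets of styles that each setting sends and receives. The following Python sketch is an interpretation of that NOTE and not the device logic. It treats two devices as able to interwork when each one sends at least one style that the other receives, which reproduces the incompatible combinations listed in the NOTE. The names RECEIVES, SENDS, can_learn, and can_interwork are hypothetical.

```python
# Packet cost styles that each configured cost style can receive and send.
RECEIVES = {
    "narrow": {"narrow"},
    "narrow-compatible": {"narrow", "wide"},
    "compatible": {"narrow", "wide"},
    "wide-compatible": {"narrow", "wide"},
    "wide": {"wide"},
}
SENDS = {
    "narrow": {"narrow"},
    "narrow-compatible": {"narrow"},
    "compatible": {"narrow", "wide"},
    "wide-compatible": {"wide"},
    "wide": {"wide"},
}


def can_learn(receiver, sender):
    """True if the receiver accepts at least one style that the sender sends."""
    return bool(SENDS[sender] & RECEIVES[receiver])


def can_interwork(style_a, style_b):
    """True if both devices can learn routes from each other."""
    return can_learn(style_a, style_b) and can_learn(style_b, style_a)


print(can_interwork("narrow", "wide-compatible"))             # False
print(can_interwork("narrow-compatible", "wide"))             # False
print(can_interwork("narrow-compatible", "wide-compatible"))  # True
```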
Step 5 Check that the IS-IS neighbor relationship is normally established. Run the display isis peer command on every device on the path to check whether the IS-IS neighbor relationships are normally established.
- If an IS-IS neighbor relationship cannot be normally established, see 7.1 The IS-IS Neighbor Relationship Cannot Be Established to rectify the fault.
- If the IS-IS neighbor relationships are normally established, go to Step 6.
Step 6 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
ISIS_AUTHENTICATION_TYPE_FAILURE
ISIS_AUTHENTICATION_FAILURE
ISIS_SEQUENCE_NUMBER_SKIP
Figure 7-3 Flowchart for troubleshooting the fault that IS-IS routes flap
The flowchart covers the following checks: check the routing table and identify the changed attributes of routes. If the outbound interface or cost of the route changes, ensure that the IS-IS neighbor relationship does not flap. In other cases, ensure that external routes do not flap and that the IS-IS configuration is correct.
Procedure
Step 1 Check the details about route flapping. Run the display ip routing-table ip-address verbose command to check the details about route flapping, such as the routing protocol from which active routes are learned and the attributes of routes that change during route flapping.
- If the value of the TunnelID field changes after route flapping, check whether the MPLS LSP flaps (run the display mpls lsp command to check whether the LDP LSP is Up for only a short time, which indicates that the LDP LSP flaps). If the MPLS LSP flaps, see LDP LSP Flapping or TE Tunnel Goes Down Suddenly to rectify the fault.
- If the Cost or Interface field of a route changes, check whether the IS-IS neighbor relationship established between devices on the path flaps. If so, see 0x08960007 isisAdjacencyChange to rectify the fault.
- If a route appears intermittently in the routing table (the value of the Age field changes), run the display isis lsdb verbose command to identify the LSP (Link State Protocol Data Unit) that carries the route. Then, run the display isis lsdb verbose lsp-id command to check the updates of the LSP.
  - If the LSP always carries the specified route, check whether the IS-IS neighbor relationship established between devices on the path flaps.
  - If the value of the Seq Num field of the LSP constantly increases and the contents of the LSP before the update are greatly different from the contents of the LSP after the update, check whether two devices are configured with the same system ID (run the display isis brief command to check the value of SystemId).
  - If the value of the Seq Num field of the LSP constantly increases and the route appears intermittently before and after the LSP is updated, go to Step 2 on the device that generates the LSP.
Step 2 Check the external routes imported by IS-IS. If the specified routes are external routes imported by IS-IS, run the display ip routing-table ip-address verbose command on the device where IS-IS imports the external routes to view details about route flapping.
- If the active routes in the routing table are IS-IS routes rather than the external routes to be imported by IS-IS, other IS-IS devices advertise the same routes. In this case, modify the priorities of routing protocols based on network planning, or configure a route filtering policy in the IS-IS view to control the routes to be added to the IP routing table.
- In other cases, go to Step 3.
Step 3 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
None.
Fault Symptom
On the network shown in Figure 7-4, Router B and Router C are located at the core layer and are connected to two SR devices, Router A and Router D. Router D is a non-Huawei device. To implement load balancing, Router A and Router D are connected to the same network, and direct routes and static routes are imported to IS-IS in the related IS-IS processes. After the configuration, Router B and Router C learn the routes only from Router D.
Figure 7-4 Diagram of the network where devices cannot learn IS-IS routes (devices shown: Router B and Router C at the core layer, Router A and Router D connected to the same network)
Fault Analysis
By default, the type of static routes imported by IS-IS on Router D is internal and the cost of the routes equals the original cost of the imported route, whereas the type of static routes imported by IS-IS on Router A is external and the cost of the routes equals the original cost of the imported route plus 64. Because the costs are different, Router B and Router C select routes only from Router D rather than Router A.
Procedure
Step 1 Run the system-view command on Router A to enter the system view.
Step 2 Run the isis [ process-id ] command to enter the IS-IS view.
Step 3 Run the import-route direct cost-type internal command to configure IS-IS to import direct routes and set cost-type to internal.
Step 4 Run the import-route static cost-type internal command to configure IS-IS to import static routes and set cost-type to internal.
NOTE
After the cost type is changed from external to internal, the cost of an imported route equals the original cost of the route rather than the original cost plus 64.
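The cost arithmetic behind this case can be illustrated with a minimal Python sketch. It assumes only what this case states: an internal imported route keeps its original cost, and an external imported route has 64 added. The original cost of 10, the function names, and the simplification that link costs toward Router A and Router D are equal are assumptions made for the example.

```python
EXTERNAL_COST_OFFSET = 64  # added to external imported routes, as stated in this case


def imported_route_cost(original_cost, cost_type):
    """Return the IS-IS cost of an imported route for the given cost type."""
    if cost_type == "internal":
        return original_cost
    if cost_type == "external":
        return original_cost + EXTERNAL_COST_OFFSET
    raise ValueError(f"unknown cost type: {cost_type!r}")


def preferred_advertisers(costs):
    """Return the advertisers that offer the lowest cost (ties mean load balancing)."""
    best = min(costs.values())
    return sorted(router for router, cost in costs.items() if cost == best)


before = {"RouterA": imported_route_cost(10, "external"),
          "RouterD": imported_route_cost(10, "internal")}
after = {"RouterA": imported_route_cost(10, "internal"),
         "RouterD": imported_route_cost(10, "internal")}

print(preferred_advertisers(before))  # ['RouterD']
print(preferred_advertisers(after))   # ['RouterA', 'RouterD']
```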
After the preceding operations, run the display isis route command on Router B and Router C to view routing information. You can find that there are two IS-IS routes to the same network segment and that load balancing is performed by Router A and Router D.
----End
Summary
In the networking with devices of different manufacturers, note the implementation differences between the devices.
8 BGP Troubleshooting
About This Chapter
This chapter describes common BGP faults and provides troubleshooting procedures, troubleshooting cases, and FAQs. For details about BGP, see the HUAWEI NetEngine5000E Feature Description - IP Routing.
8.1 The BGP Peer Relationship Fails to Be Established
This section describes the troubleshooting flow and provides a step-by-step troubleshooting procedure for the fault that the BGP peer relationship fails to be established.
8.2 BGP Public Network Traffic Is Interrupted
This section describes the troubleshooting flow and provides a step-by-step troubleshooting procedure for the fault that BGP public network traffic is interrupted.
8.3 BGP Private Network Traffic Is Interrupted
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that BGP private network traffic is interrupted.
8.4 Troubleshooting of the Fault that a Local BGP Peer (Route Sender) Cannot Receive ORFs from a Remote Peer (Route Receiver)
This section describes the troubleshooting roadmap for the fault that a local BGP peer (route sender) cannot receive ORFs from a remote peer (route receiver), and provides troubleshooting cases.
8.5 Trouble Cases
Figure 8-1 Troubleshooting flowchart for the failure to establish the BGP peer relationship
(Flowchart not reproduced. Decision points: whether an ACL is configured whose destination port is TCP port 179; whether the peer router ID conflicts with the local router ID; whether the displayed peer AS number is the same as the remote AS number; whether BGP configurations affect the establishment of the BGP peer relationship. If the fault persists, seek technical support.)
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Run the ping command to check whether BGP peers can ping each other successfully.
- If they can ping each other successfully, routes are available between the BGP peers and link transmission is normal. In this case, go to Step 2.
NOTE
Run the ping -a source-ip-address -s packetsize host or ping ipv6 -a source-ipv6-address -s bytenumber destination-ipv6-address command to detect the connectivity of devices on both ends. Because the source address is specified in this command, you can check whether the two devices have available routes to each other. By specifying the size of a Ping packet, you can check whether large Ping packets can be normally transmitted over the link.
- If the ping operation fails, see the section The Ping Operation Fails to rectify the link fault.
Step 2 Check that no ACL is configured to filter the packets whose destination port is TCP port 179. Run the display acl all command on the two devices to check whether an ACL is configured to filter the packets whose destination port is TCP port 179.
<HUAWEI> display acl all
Advanced ACL 3001, 2 rules
ACL's step is 5
ACL's match-order is config
 rule 5 deny tcp source-port eq bgp
 rule 10 deny tcp destination-port eq bgp
- If an ACL is configured to filter the packets whose destination port is TCP port 179, run the undo rule rule-id destination-port command and the undo rule rule-id source-port command to delete the ACL configuration.
- If no ACL is configured to filter the packets whose destination port is TCP port 179, go to Step 3.
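When the display acl all output is long, the check in Step 2 can be scripted against saved output. The following is a minimal sketch that assumes rules have the shape shown in the sample output above (deny tcp with source-port or destination-port eq bgp); matching the numeric form eq 179 is an additional assumption, and the function name is hypothetical. Rules with extra match fields would need a broader pattern.

```python
import re
from typing import List

# Assumed rule shape, taken from the sample output in Step 2:
#   rule <id> deny tcp (source-port|destination-port) eq (bgp|179)
_BGP_DENY_RULE = re.compile(
    r"rule\s+(\d+)\s+deny\s+tcp\s+(?:source|destination)-port\s+eq\s+(?:bgp|179)\b"
)


def rules_blocking_bgp(acl_output: str) -> List[int]:
    """Return the IDs of deny rules that match TCP port 179 (keyword "bgp")."""
    return [int(match.group(1)) for match in _BGP_DENY_RULE.finditer(acl_output)]


sample = """Advanced ACL 3001, 2 rules
ACL's step is 5
ACL's match-order is config
 rule 5 deny tcp source-port eq bgp
 rule 10 deny tcp destination-port eq bgp"""

print(rules_blocking_bgp(sample))  # [5, 10]
```

The returned rule IDs are the candidates for the undo rule commands described in Step 2.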
Step 3 Check that the peer router ID does not conflict with the local router ID. View information about BGP peers to check whether their router IDs conflict. For example, if the IPv4 unicast peer relationship fails to be established, you can run the display bgp peer command to check whether the peer router ID conflicts with the local router ID. Take the following command output as an example:
<HUAWEI> display bgp peer
 BGP local router ID : 1.1.1.1
 Local AS number : 65001
 Total number of peers : 12
  Peer       V   AS    MsgRcvd  MsgSent  OutQ  Up/Down   State        PrefRcv
  8.9.0.8    4   100   1601     ...      0     23:21:56  Established  ...
  ...        ...  200   1565     1799     0     23:15:30  Established  9999
Check information about peers in other address families.
- Run the display bgp vpnv4 all peer command to check information about all VPNv4 peers.
- Run the display bgp ipv6 peer command to check information about IPv6 peers.
- Run the display bgp vpnv6 all peer command to check information about all VPNv6 peers.
- If the peer router ID conflicts with the local router ID, run the router id command in the BGP view to change the two router IDs to different values. Generally, a loopback interface address is used as the local router ID.
- If the peer router ID does not conflict with the local router ID, go to Step 4.
Step 4 Check that the peer AS number is configured correctly. Run the display bgp peer command on each device to check whether the displayed peer AS number is the same as the remote AS number.
<HUAWEI> display bgp peer
 BGP local router ID : 223.5.0.109
 Local AS number : 41976
 Total number of peers : 12
  Peer        V   AS    ...  PrefRcv
  8.9.0.8     4   100   ...  10000
  9.10.0.10   4   200   ...  ...
NOTE
Check information about peers in other address families.
- Run the display bgp vpnv4 all peer command to check information about all VPNv4 peers.
- Run the display bgp ipv6 peer command to check information about IPv6 peers.
- Run the display bgp vpnv6 all peer command to check information about all VPNv6 peers.
- If the peer AS number is incorrectly configured, change it to be the same as the remote AS number.
- If the peer AS number is configured correctly, go to Step 5.
Step 5 Check whether BGP configurations affect the establishment of the BGP peer relationship. Run the display current-configuration configuration bgp command to check BGP configurations.
Item: peer connect-interface { interface-type interface-number | ipv4-source-address }
Description: If two devices use loopback interfaces to establish the BGP peer relationship, you need to run the peer connect-interface command to specify the source interface through which the BGP packets are sent, and the source address with which the BGP connection is established.
Item: peer ebgp-max-hop
Description: When two directly connected devices use loopback interfaces to establish the EBGP peer relationship or two indirectly connected devices establish the EBGP peer relationship, you need to run the peer ebgp-max-hop command and specify the maximum number of hops between the two devices.
- When two directly connected devices use loopback interfaces to establish the EBGP peer relationship, the hop count can be any number greater than 1.
- When two indirectly connected devices establish the EBGP peer relationship, you need to specify the number of hops according to the actual situation.
Item: peer valid-ttl-hops hops
Description: If the peer valid-ttl-hops hops command is configured, check whether the value of hops is correct. The valid TTL range of the detected packet is [255 - hops + 1, 255]. hops specifies the number of hops between devices on both ends, and the hop count between two directly connected devices is 1.
NOTE
The peer valid-ttl-hops command must be configured on devices on both ends.
Item: peer route-limit limit
Description: If the peer route-limit limit command is configured, check whether the number of routes sent by the peer exceeds the upper limit that is specified by limit. If the upper limit is exceeded, you need to reduce the number of routes to be sent by the peer, and run the reset bgp ip-address command to reset the BGP peer relationship to trigger the re-establishment of the BGP peer relationship.
Item: peer ignore
Description: If the peer ignore command is configured on the peer, it indicates that the peer is not required to establish the BGP peer relationship with the local device temporarily. To establish the BGP peer relationship between the peer and the local device, you can run the undo peer ignore command on the peer.
Description: Check whether the address family capabilities of devices on both ends match. For example, in order to establish the BGP VPNv4 peer relationship, the peer enable command must be configured in the BGP-VPNv4 address families of both devices. If the peer enable command is configured on one device only, the BGP peer relationship on the other device is displayed as No neg.
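The valid TTL range given above for peer valid-ttl-hops, [255 - hops + 1, 255], can be checked with a short calculation. This is a minimal sketch of that formula only; the function names are hypothetical, and the accepted bounds for hops (1 to 255) are an assumption made for the example.

```python
from typing import Tuple


def valid_ttl_range(hops: int) -> Tuple[int, int]:
    """Return the inclusive TTL range [255 - hops + 1, 255] for a hop count."""
    if not 1 <= hops <= 255:  # bounds assumed for this illustration
        raise ValueError("hops must be between 1 and 255")
    return (255 - hops + 1, 255)


def ttl_is_valid(ttl: int, hops: int) -> bool:
    """Check whether a received TTL falls inside the valid range."""
    low, high = valid_ttl_range(hops)
    return low <= ttl <= high


print(valid_ttl_range(1))   # (255, 255): directly connected devices
print(valid_ttl_range(3))   # (253, 255)
print(ttl_is_valid(254, 1)) # False: hops is set too small for this path
```

If hops is configured smaller than the real hop count between the two devices, packets from the peer arrive with a TTL below the range, which is consistent with the peer relationship failing to be established.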
Step 6 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
None
Figure 8-2 Troubleshooting flowchart for interruption of BGP public network traffic
(Flowchart not reproduced.)
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that next hops of routes are reachable.
Run the display bgp routing-table network { mask | mask-length } command on the device that sends routes (that is, the local device) to check whether the target route is active and whether it has been sent to the peer. network specifies the prefix of the target route. Assume that the target route is a route to 13.0.0.0/8. The command output shows that this route is valid and has been selected and sent to the peer at 3.3.3.3; the original next hop and iterated next hop of this route are 1.1.1.1 and 172.1.1.1 respectively.
<HUAWEI> display bgp routing-table 13.0.0.0 8
 BGP local router ID : 23.1.1.2
 Local AS number : 100
 Paths: 1 available, 1 best, 1 select
 BGP routing table entry information of 13.0.0.0/8:
 From: 1.1.1.1 (121.1.1.1)
 Route Duration: 4d21h29m39s
 Relay IP Nexthop: 172.1.1.1
 Relay IP Out-Interface: GigabitEthernet1/0/2
 Original nexthop: 1.1.1.1
 Qos information : 0x0
 AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, active, pre 255
 Aggregator: AS 100, Aggregator ID 121.1.1.1
 Advertised to such 1 peers:
    3.3.3.3
- If the target route is inactive, check whether there is a route to the original next hop in the IP routing table. If there is no route to the original next hop, it indicates that the BGP route is not advertised because its next hop is unreachable. Then, find out why there is no route to the original next hop (this fault is generally associated with IGP or static routes).
- If the target route is active and has been selected but there is no information indicating that this route is sent to the peer, perform Step 2 to check the outbound policy applied to the local device.
Run the display bgp routing-table network { mask | mask-length } command on the peer to check whether it has received the target route.
- If the peer has received the target route, perform Step 1 again to check whether the next hop of the route is reachable and whether this route has been selected.
- If the peer has not received the target route, perform Step 2 to check the inbound policy applied to the peer.
NOTE
In BGP4+ networking, the display bgp routing-table ipv6-address prefix-length command is used to check whether the target route is received.
Step 2 Check that routing policies are configured correctly. Run the display current-configuration configuration bgp command on the local device and the peer to check whether inbound and outbound policies are configured.
<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 peer 1.1.1.1 as-number 100
 #
 ipv4-family unicast
  undo synchronization
  filter-policy ip-prefix aaa import
  filter-policy ip-prefix aaa export
  peer 1.1.1.1 enable
  peer 1.1.1.1 filter-policy acl-name acl-name import
  peer 1.1.1.1 filter-policy acl-name acl-name export
  peer 1.1.1.1 as-path-filter 1 import
- If inbound and outbound policies are configured on the two devices, you need to check whether the target route is filtered by these policies. For detailed configurations of a routing policy, see the HUAWEI NetEngine5000E Core Router Configuration Guide - IP Routing.
- If inbound and outbound policies are not configured on the two devices, go to Step 3.
Step 3 Check that the number of routes is lower than the upper limit. Run the display current-configuration configuration bgp | include peer destination-address command or the display current-configuration configuration bgp | include peer group-name command on the peer to check whether an upper limit on the number of routes to be received is configured on the peer. For example, if the upper limit is set to 5, subsequent routes are dropped and a log is recorded after the peer receives five routes from the local device at 1.1.1.1.
<HUAWEI> display current-configuration configuration bgp | include peer 1.1.1.1
 peer 1.1.1.1 as-number 100
 peer 1.1.1.1 route-limit 5 alert-only
 peer 1.1.1.1 enable
If the peer is added to a peer group, there may be no configurations of the upper limit in the command output.
<HUAWEI> display current-configuration configuration bgp | include peer 1.1.1.1
 peer 1.1.1.1 as-number 100
 peer 1.1.1.1 group IBGP
 peer 1.1.1.1 enable
 peer 1.1.1.1 group IBGP
In this case, you need to run the display current-configuration configuration bgp | include peer group-name command to check configurations of this peer group.
<HUAWEI> display current-configuration configuration bgp | include peer IBGP
 peer IBGP route-limit 5 alert-only
 peer IBGP enable
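The two-stage lookup above (check the peer first, then the peer group it belongs to) can be sketched against a saved copy of the BGP configuration. This is a minimal illustration that assumes the configuration lines look like the samples above; the function names are hypothetical, and treating a peer-level limit as taking precedence over a group-level limit is an assumption of this sketch.

```python
import re
from typing import Optional


def _limit_for(name: str, config: str) -> Optional[int]:
    """Find 'peer <name> route-limit <n>' in the configuration text."""
    match = re.search(
        rf"^\s*peer\s+{re.escape(name)}\s+route-limit\s+(\d+)", config, re.MULTILINE
    )
    return int(match.group(1)) if match else None


def effective_route_limit(peer: str, config: str) -> Optional[int]:
    """Return the route limit that applies to a peer, or None if none is found."""
    limit = _limit_for(peer, config)
    if limit is not None:
        return limit
    # No limit on the peer itself: look for the group the peer belongs to.
    group = re.search(
        rf"^\s*peer\s+{re.escape(peer)}\s+group\s+(\S+)", config, re.MULTILINE
    )
    return _limit_for(group.group(1), config) if group else None


bgp_config = """
 peer IBGP route-limit 5 alert-only
 peer 1.1.1.1 as-number 100
 peer 1.1.1.1 group IBGP
 peer 1.1.1.1 enable
"""

print(effective_route_limit("1.1.1.1", bgp_config))  # 5
```

A result of None means that neither the peer nor its group carries a route-limit line in the text that was searched.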
If the alarm 0x08790002 hwBgpPeerRouteNumberExceed is generated when traffic is interrupted, it indicates that the target route is dropped because the upper limit is exceeded. Then, you need to increase the upper limit.
NOTE
Changing the upper limit on the number of routes to be received from a peer interrupts the BGP peer relationship. Therefore, it is recommended to reduce the number of sent routes by configuring route summarization on the local device.
Step 4 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
None
Figure 8-3 Troubleshooting flowchart for interruption of BGP private network traffic
(Flowchart not reproduced.)
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that next hops of routes are reachable. Run the display bgp vpnv4 vpn-instance vpn-instance-name routing-table ipv4-address [ mask | mask-length ] command on the PE that sends routes (that is, the local PE) to check whether the target route exists. ipv4-address specifies the prefix of the target route.
- If the target route does not exist, check whether the route of a CE is advertised to the local PE.
- If the target route exists, check whether it is active. The following is an example:
Assume that the target route is a route to 1.1.1.1/32. The following command output shows that this route is valid and best. The original next hop and iterated next hop of this route are 3.3.3.3 and 20.1.1.2 respectively.
<HUAWEI> display bgp vpnv4 vpn-instance vpna routing-table 1.1.1.1
 BGP local router ID : 20.1.1.2
 Local AS number : 100
 Paths: 1 available, 1 best, 1 select
 BGP routing table entry information of 1.1.1.1/32:
 From: 20.1.1.1 (1.1.1.1)
 Route Duration: 00h00m03s
 Relay IP Nexthop: 20.1.1.2
 Relay IP Out-Interface: Pos1/0/0
 Original nexthop: 3.3.3.3
 Qos information : 0x0
 AS-path Nil, origin incomplete, MED 0, localpref 100, pref-val 0, valid, internal, best, select, active, pre 255
 Not advertised to any peer yet
- If the target route is inactive, check whether there is a route to the original next hop in the IP routing table. If there is no route to the original next hop, it indicates that the BGP route is not advertised because its next hop is unreachable. Then, find out why there is no route to the original next hop (this fault is generally associated with IGP or static routes).
- If the target route is valid and best but there is no information indicating that this route is sent to the remote PE, perform Step 2 to check the outbound policy applied to the local PE.
Run the display bgp vpnv4 all routing-table ipv4-address { mask | mask-length } command on the remote PE to check whether it has received the target route.
- If the remote PE has received the target route, perform Step 1 again to check whether the next hop of the route is reachable and whether this route is selected.
- If the remote PE has not received the target route, perform Step 2 to check the inbound policy of the remote PE.
Step 2 Check that routing policies are configured correctly. Run the display current-configuration configuration bgp command on the local PE and remote PE to check whether inbound and outbound policies are configured.
NOTE
You only need to focus on peers of the BGP-VPNv4 address family or BGP-VPN instance address family in this troubleshooting case because the private network traffic is interrupted.
<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 peer 1.1.1.1 as-number 200
 #
 ipv4-family unicast
  undo synchronization
  peer 1.1.1.1 enable
 #
 ipv6-family unicast
  undo synchronization
 #
 ipv4-family vpnv4
  policy vpn-target
  peer 1.1.1.1 enable
  peer 1.1.1.1 filter-policy acl-name acl-name import
  peer 1.1.1.1 filter-policy acl-name acl-name export
  peer 1.1.1.1 as-path-filter 1 import
  peer 1.1.1.1 as-path-filter 1 export
  peer 1.1.1.1 ip-prefix prefix-name import
  peer 1.1.1.1 ip-prefix prefix-name export
  peer 1.1.1.1 route-policy policy-name import
  peer 1.1.1.1 route-policy policy-name export
 #
 ipv4-family vpn-instance vpna
  peer 10.1.1.1 as-number 300
  peer 10.1.1.1 filter-policy acl-name acl-name import
  peer 10.1.1.1 filter-policy acl-name acl-name export
  peer 10.1.1.1 as-path-filter 1 import
  peer 10.1.1.1 as-path-filter 1 export
  peer 10.1.1.1 ip-prefix prefix-name import
  peer 10.1.1.1 ip-prefix prefix-name export
  peer 10.1.1.1 route-policy policy-name import
  peer 10.1.1.1 route-policy policy-name export
#
return
- If inbound and outbound policies are configured on the two devices, check whether the target route fails to be transmitted because it is filtered by these policies. For detailed configurations of a routing policy, see the HUAWEI NetEngine5000E Configuration Guide - IP Routing.
- If inbound and outbound policies are not configured on the two devices, go to Step 3.
Step 3 Check that routes can be iterated to a tunnel. Run the display bgp vpnv4 all routing-table ipv4-address [ mask | mask-length ] command on the remote PE to check whether the target route can be iterated to a tunnel. Assume that the target route is a route to 50.1.1.2/32. If the Relay Tunnel Name field in the command output is not empty, it indicates that this route can be iterated to a tunnel.
<HUAWEI> dis bgp vpnv4 all routing-table 50.1.1.2
 BGP local router ID : 2.2.2.2
 Local AS number : 100

 Total routes of Route Distinguisher(1:2): 1
 BGP routing table entry information of 50.1.1.2/32:
 Label information (Received/Applied): 13316/NULL
 From: 1.1.1.1 (1.1.1.1)
 Route Duration: 00h00m08s
 Relay IP Nexthop: 20.1.1.1
 Relay IP Out-Interface: Pos1/0/0
 Relay Tunnel Name: ldp
 Original nexthop: 1.1.1.1
 Qos information : 0x0
 Ext-Community:RT <1 : 1>
 AS-path Nil, origin incomplete, MED 0, localpref 100, pref-val 0, valid, internal, best, select, pre 255
 Not advertised to any peer yet

 Total routes of vpn-instance vpna: 1
 BGP routing table entry information of 50.1.1.2/32:
 Label information (Received/Applied): 13316/NULL
 From: 1.1.1.1 (1.1.1.1)
 Route Duration: 00h00m07s
 Relay Tunnel Name: ldp
 Original nexthop: 1.1.1.1
 Qos information : 0x0
 Ext-Community:RT <1 : 1>
 AS-path Nil, origin incomplete, MED 0, localpref 100, pref-val 0, valid, internal, best, select, active, pre 255
 Not advertised to any peer yet
- If the target route fails to be iterated to a tunnel, check whether the associated tunnel exists or whether the tunnel configurations are correct. For details, see the HUAWEI NetEngine5000E Troubleshooting - MPLS.
- If the target route can be iterated to a tunnel, go to Step 4.
Step 4 Check whether routes fail to be added to the VPN routing table because the configured import RT and export RT do not match. Run the display current-configuration configuration vpn-instance command on the local PE and remote PE to check whether routes fail to be added to the VPN routing table of the remote PE after being sent to the remote PE because the export RT of the local VPN instance does not match the import RT of the remote VPN instance. export-extcommunity indicates an export RT, and import-extcommunity indicates an import RT.
<HUAWEI> display current-configuration configuration vpn-instance
#
ip vpn-instance vpna
 route-distinguisher 1:1
 apply-label per-instance
 vpn-target 1:1 export-extcommunity
 vpn-target 1:1 import-extcommunity
ip vpn-instance vpnb
 route-distinguisher 1:2
 vpn-target 1:1 export-extcommunity
 vpn-target 1:1 import-extcommunity
#
return
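The RT comparison in Step 4 can be sketched as a set intersection: a route sent by the local PE can enter the remote VPN routing table only if at least one export RT of the local VPN instance equals an import RT of the remote VPN instance. The following minimal sketch assumes vpn-target lines in the form shown in the sample output above; the function names are hypothetical, and other forms of the vpn-target command are not handled.

```python
from typing import Set, Tuple


def parse_vpn_targets(instance_config: str) -> Tuple[Set[str], Set[str]]:
    """Collect export and import RTs from the text of one VPN instance."""
    exports: Set[str] = set()
    imports: Set[str] = set()
    for line in instance_config.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "vpn-target":
            continue
        rts, direction = parts[1:-1], parts[-1]
        if direction == "export-extcommunity":
            exports.update(rts)
        elif direction == "import-extcommunity":
            imports.update(rts)
    return exports, imports


def remote_accepts(local_instance: str, remote_instance: str) -> bool:
    """True if a local export RT matches a remote import RT."""
    local_exports, _ = parse_vpn_targets(local_instance)
    _, remote_imports = parse_vpn_targets(remote_instance)
    return bool(local_exports & remote_imports)


local_vpna = """ip vpn-instance vpna
 route-distinguisher 1:1
 vpn-target 1:1 export-extcommunity
 vpn-target 1:1 import-extcommunity"""

remote_mismatch = """ip vpn-instance vpna
 route-distinguisher 1:2
 vpn-target 2:2 import-extcommunity"""

print(remote_accepts(local_vpna, local_vpna))       # True
print(remote_accepts(local_vpna, remote_mismatch))  # False
```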
- If the export RT of the local VPN instance does not match the import RT of the remote VPN instance, configure matching VPN-targets in the VPN instance.
- If the export RT of the local VPN instance matches the import RT of the remote VPN instance, go to Step 5.
Step 5 Check that the number of labels is lower than the upper limit.
Check whether MPLS is enabled on the local PE. Then, run the display bgp vpnv4 all routing-table ipv4-address [ mask | mask-length ] command to check whether the target route is assigned a VPN label. If there is no Label information field in the command output, it indicates that labels may be insufficient. As a result, the target route is not assigned a label and is not advertised to the peer.
<HUAWEI> display bgp vpnv4 all routing-table 100.1.1.1
 BGP local router ID : 10.1.1.2
 Local AS number : 100

 Total routes of Route Distinguisher(1:1): 1
 BGP routing table entry information of 100.1.1.0/24:
 Imported route.
 Label information (Received/Applied): NULL/13312
 From: 0.0.0.0 (0.0.0.0)
 Route Duration: 00h21m24s
 Direct Out-interface: NULL0
 Original nexthop: 0.0.0.0
 Qos information : 0x0
 Ext-Community:RT <1 : 1>
 AS-path Nil, origin incomplete, MED 0, pref-val 0, valid, local, best, select, pre 255
 Advertised to such 1 peers:
    1.1.1.1

 Total routes of vpn-instance vpna: 1
 BGP routing table entry information of 100.1.1.0/24:
 Imported route.
 From: 0.0.0.0 (0.0.0.0)
 Route Duration: 00h21m24s
 Direct Out-interface: NULL0
 Original nexthop: 0.0.0.0
 Qos information : 0x0
 AS-path Nil, origin incomplete, MED 0, pref-val 0, valid, local, best, select, pre 60
 Not advertised to any peer yet
- If labels are insufficient, run the apply-label per-instance command in the VPN instance view to configure the device to assign one label to each instance so as to save labels. You can also configure route summarization to reduce the number of routes.
- If labels are sufficient, go to Step 6.
Step 6 Check that the number of routes is lower than the upper limit. Run the display current-configuration configuration bgp | include peer destination-address command or the display current-configuration configuration bgp | include peer group-name command on the remote PE to check whether the upper limit on the number of routes to be received is configured on the remote PE. For example, if the upper limit is set to 5, subsequent routes are dropped and a log is recorded after the remote PE receives five routes from the local PE at 1.1.1.1.
<HUAWEI> display current-configuration configuration bgp | include peer 1.1.1.1
 peer 1.1.1.1 as-number 100
 peer 1.1.1.1 route-limit 5 alert-only
 peer 1.1.1.1 enable
If the peer is added to a peer group, there may be no configurations about the upper limit in the command output.
<HUAWEI> display current-configuration configuration bgp | include peer 1.1.1.1
 peer 1.1.1.1 as-number 100
 peer 1.1.1.1 group IBGP
In this case, you need to run the display current-configuration configuration bgp | include peer group-name command to check configurations of this peer group.
<HUAWEI> display current-configuration configuration bgp | include peer IBGP
 peer IBGP route-limit 5 alert-only
 peer IBGP enable
If the alarm BGPCOMM_0x08790002 hwBgpPeerRouteNumberExceed is generated when traffic is interrupted, it indicates that the target route is dropped because the number of routes received has exceeded the upper limit. Then, you need to increase the upper limit.
NOTE
Changing the upper limit on the number of routes to be received from a peer interrupts the BGP peer relationship. Therefore, it is recommended to reduce the number of sent routes by configuring route summarization on the local device.
Step 7 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedure
- Configuration files, log files, and alarm files of the devices
----End
Relevant Logs
None
8.4 Troubleshooting of the Fault that a Local BGP Peer (Route Sender) Cannot Receive ORFs from a Remote Peer (Route Receiver)
This section describes the troubleshooting roadmap for the fault that a local BGP peer (route sender) cannot receive ORFs from a remote peer (route receiver), and provides troubleshooting cases.
- The IPv4 BGP peer relationship cannot be established.
- Negotiating the BGP ORF capability fails.
- No import IP-prefix policy is configured on the remote peer (route receiver).
- No prefix list corresponding to the import IP-prefix policy is configured on the remote peer (route receiver).
Figure 8-4 Troubleshooting flowchart for the fault that a local BGP peer (route sender) cannot receive ORFs from a remote peer (route receiver)
(Flowchart not reproduced. Decision points: whether the BGP ORF function is enabled on the BGP peers and the peers succeed in negotiating the BGP ORF capability; whether the prefix list corresponding to the import IP-prefix policy is configured on the remote peer (route receiver). Actions: see "Troubleshooting of the Fault that a BGP Peer Relationship Cannot Be Set Up"; enable the BGP ORF function on the BGP peers and reestablish the BGP peer relationship; configure the import IP-prefix policy on the remote peer (route receiver); configure the prefix list corresponding to the import IP-prefix policy on the remote peer.)
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Procedure
Step 1 Check that a BGP peer relationship is set up. Run the display bgp peer command to check whether the BGP peer relationship is in the Established state.
- If the BGP peer relationship is not in the Established state, see detailed troubleshooting procedures in The BGP Peer Relationship Fails to Be Established.
- If the BGP peer relationship is in the Established state, go to Step 2.
Step 2 Check that the BGP ORF function is enabled on BGP peers, and the peers succeed in negotiating the BGP ORF capability. Run the display current-configuration configuration bgp command on BGP peers to check whether peer ipv4-address capability-advertise orf ip-prefix is configured in the IPv4 unicast address family view.
<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 peer 7.1.1.1 as-number 100
 #
 ipv4-family unicast
  undo synchronization
  peer 7.1.1.1 ip-prefix in import
  peer 7.1.1.1 capability-advertise orf ip-prefix both
#
NOTE
BGP ORF has three modes: send, receive, and both. In send mode, a device can send ORFs; in receive mode, a device can receive ORFs; in both mode, a device can both send and receive ORFs. To enable a device to receive ORF IP-prefix information, configure the both or receive mode on the device and the both or send mode on its peer.
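The mode pairing described in this note can be written as a small truth function. This is a minimal sketch of the rule in the note only; the function name is hypothetical and the mode strings simply mirror the send, receive, and both keywords.

```python
ORF_MODES = {"send", "receive", "both"}


def can_receive_orf(local_mode: str, peer_mode: str) -> bool:
    """True if the local device can receive ORFs from its peer.

    Per the note above, the local device needs receive or both,
    and its peer needs send or both.
    """
    if local_mode not in ORF_MODES or peer_mode not in ORF_MODES:
        raise ValueError("mode must be send, receive, or both")
    return local_mode in {"receive", "both"} and peer_mode in {"send", "both"}


print(can_receive_orf("both", "both"))     # True
print(can_receive_orf("receive", "send"))  # True
print(can_receive_orf("send", "both"))     # False: local device cannot receive
```

In this section the local device is the route sender, so it is the side that must be able to receive ORFs, and the remote peer (route receiver) is the side that must be able to send them.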
If one peer is not configured with the BGP ORF function, enter the BGP IPv4 unicast address family view and run the peer ipv4-address capability-advertise orf ip-prefix command to enable BGP ORF. If both or receive is specified when you configure the local peer, both or send must be specified when you configure the remote peer.
<HUAWEI> system-view
[~HUAWEI] bgp 100
[~HUAWEI-bgp] ipv4-family unicast
[~HUAWEI-bgp-af-ipv4] peer 7.1.1.1 capability-advertise orf ip-prefix both
If BGP ORF is enabled on both BGP peers, wait for the re-establishment of a BGP peer relationship, and run the display bgp peer ipv4-address verbose command to check whether the BGP ORF capability is successfully negotiated. The command output shows the ORF capabilities on both the local and remote peers.
<HUAWEI> display bgp peer 7.1.1.1 verbose | include Address-Prefix
 Support Address-Prefix: IPv4-UNC address-family, rfc-compatible, both
 Enable Address-Prefix: IPv4-UNC address-family, rfc-compatible, both
NOTE
In the preceding command output, the first part shows the ORF capability announced by the remote peer and the subsequent part shows the ORF capability configured on the local peer.
The ORF capability supported by non-Huawei devices is different from that defined in the RFC standard. Therefore, to enable Huawei devices to communicate with non-Huawei devices, new commands for compatibility are added. Ensure that both BGP peers are configured with the same compatibility mode (either Cisco-compatible or RFC-compatible).
If both BGP peers are configured with the BGP ORF function and succeed in negotiating the BGP ORF capability, go to Step 3.
Step 3 Check that an import IP-prefix policy is configured on the remote peer (route receiver). Run the display current-configuration configuration bgp command on the remote peer to check whether peer ipv4-address ip-prefix ip-prefix-name import is configured in the IPv4 unicast address family view.
<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 peer 7.1.1.1 as-number 100
 #
 ipv4-family unicast
  undo synchronization
  peer 7.1.1.1 ip-prefix in import
  peer 7.1.1.1 capability-advertise orf ip-prefix both
#
If no import IP-prefix policy is configured on the remote peer, enter the BGP IPv4 unicast address family view, and run the peer ipv4-address ip-prefix ip-prefix-name import command to configure an import IP-prefix policy.
<HUAWEI> system-view
[~HUAWEI] bgp 100
[~HUAWEI-bgp] ipv4-family unicast
[~HUAWEI-bgp-af-ipv4] peer 7.1.1.1 ip-prefix in import
If an import IP-prefix policy is configured on the remote peer but the local peer still cannot receive ORF IP-prefix information from the remote peer, go to Step 4.
Step 4 Check that the prefix list corresponding to the import IP-prefix policy is configured on the remote peer (route receiver). Run the display ip ip-prefix ip-prefix-name command on the remote peer to check whether the prefix list corresponding to the import IP-prefix policy is configured.
<HUAWEI> display ip ip-prefix in
Info: The specified filter list does not exist.
The preceding output shows that the prefix list in has not been successfully configured. Enter the system view, and run the ip ip-prefix ip-prefix-name index index-number permit ipv4-address mask-length command to configure a prefix list.
<HUAWEI> system-view
[~HUAWEI] ip ip-prefix in index 10 permit 10.1.1.0 24
After completing the preceding configuration, run the display ip ip-prefix ip-prefix-name command on the remote peer to check whether the prefix list corresponding to the import IP-prefix policy is configured.
<HUAWEI> display ip ip-prefix in
permit 10.1.1.0/24
The preceding output shows that the prefix list in has been successfully configured. After completing the preceding steps, if the local peer still cannot receive ORFs from the remote peer, go to Step 5.
Step 5 Collect the following information and contact Huawei technical support personnel.
- Results of the preceding troubleshooting procedures
- Configuration files, log files, and alarm files of the device
----End
Relevant Logs
BGP/6/BGP_PEER_STATE_CHG
NOTE
After commands are configured to troubleshoot faults, pay attention to the configuration validation mode to ensure that the configurations take effect. Unless otherwise specified, this manual defaults to the immediate validation mode.
- In immediate validation mode, configurations take effect after commands are input and the Enter key is pressed.
- In two-phase validation mode, after commands are configured, the commit command needs to be run to commit the configurations.
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.
Fault Symptom
In Figure 8-5, Router A and Router B reside on the backbone network. EBGP peer relationships are established between devices in AS 100 and AS 200. IBGP peer relationships are established between devices inside each AS. After Router A and Router B advertise BGP default routes, detailed information about BGP default routes on Router C shows that the outgoing traffic of AS 200 is directed to Router D. That is, the next hop of BGP default routes is Router D. Consequently, the outgoing traffic of AS 200 traverses Router C.
[Figure 8-5: networking diagram; recoverable labels: AS100, AS200, RouterC, RouterD]
Fault Analysis
Run the display bgp routing-table 0.0.0.0 command on Router C to check detailed information about BGP default routes. You can find that different MEDs are set for Router A and Router B. As a result, the outgoing traffic of AS 200 traverses Router C.
Procedure
Step 1 Run the system-view command on Router A or Router B to enter the system view.
Step 2 Run the bgp as-number command to enter the BGP view.
Step 3 Run the ipv4-family unicast command to enter the BGP-IPv4 unicast address family view.
Step 4 Run the default med med command to modify the default MED of BGP routes, and make sure the MED of BGP routes on Router A is the same as the MED of BGP routes on Router B.
Step 5 Run the return command to return to the user view and then run the save command to save the modification.
After the preceding operations, run the display bgp routing-table 0.0.0.0 command on Router C to check detailed information about BGP default routes. You can find that the outgoing traffic of AS 200 is sent out through Router C. This indicates that the fault is cleared.
----End
Summary
When there are multiple egress devices between two ASs, set the same MED for the default routes that are advertised. BGP then prefers the routes learned from EBGP peers, because the delivered default routes have the same route attributes, such as the local preference and MED.
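The effect of unequal MEDs on Router C can be illustrated with a simplified comparison. The sketch below models only three BGP tie-breakers (local preference, MED, and EBGP over IBGP) and assumes that all other attributes are equal. The MED values, and the assumption that Router C learns one default route from Router A over EBGP and one from Router D over IBGP, are illustrative and not taken from the figure.

```python
from dataclasses import dataclass


@dataclass
class DefaultRoute:
    next_hop: str
    local_pref: int
    med: int
    from_ebgp: bool  # True if learned from an EBGP peer, False if from IBGP


def best_route(routes):
    """Pick the preferred route using only three tie-breakers:
    higher local preference, then lower MED, then EBGP over IBGP."""
    return min(routes, key=lambda r: (-r.local_pref, r.med, not r.from_ebgp))


# Router C's view of 0.0.0.0/0; every value below is illustrative.
before = [
    DefaultRoute("RouterA", local_pref=100, med=10, from_ebgp=True),
    DefaultRoute("RouterD", local_pref=100, med=0, from_ebgp=False),
]
after = [
    DefaultRoute("RouterA", local_pref=100, med=0, from_ebgp=True),
    DefaultRoute("RouterD", local_pref=100, med=0, from_ebgp=False),
]

print(best_route(before).next_hop)  # RouterD: the lower MED wins
print(best_route(after).next_hop)   # RouterA: MEDs tie, so the EBGP route wins
```

With different MEDs, the IBGP-learned route through Router D wins on MED before the EBGP/IBGP rule is reached. Once the MEDs are equal, the comparison falls through to that rule and the directly learned EBGP route is selected.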
8.5.2 Routing Policies Delivered by a PE Do Not Take Effect Because There Are Multiple Routing Policies with the Same Name
Fault Symptom
As shown in Figure 8-6, a PE is connected to VPN 1 and is linked to GSR 1 in the upstream direction. VPN 1 routes of the PE need to be advertised to the backbone network. The PE controls the routes to be advertised to GSR 1 through routing policies. According to the routing policies, GSR 1 only needs to learn the VPN 1 summary route advertised by the PE instead of specific routes. After the routing policies are configured, GSR 1 learns both the VPN 1 summary route and specific routes.
Figure 8-6 Routing policies delivered by a PE failing to take effect
[Figure labels: PE, VPN1, GSR1, Backbone]
Fault Analysis
1. Run the display current-configuration command on the PE to check routing policy configurations. The command output shows that no abnormality occurs. According to the fault symptom, it is possible that GSR 1 learns the specific VPN 1 routes advertised by the PE because the delivered routing policies do not take effect.
2. Run the display bgp vpnv4 vpn-instance vpn-instance-name routing-table peer peer-address { advertised-routes | received-routes [ active ] } command on the PE to check routes of other VPNs to which the PE is connected. Routes learned on GSR 1 are summary routes of these VPNs. Therefore, it can be concluded that the routing policies for VPN 1 are incorrect.
3. Check the configuration file of the PE. The result shows that there are three routing policies with the same name for the PE to advertise routes of VPN 1. The ip-prefix NGN-A referenced by the first routing policy is defined, and this routing policy is valid. The ip-prefix NGN-A1 and ip-prefix NGN-A2 referenced by the other two routing policies respectively are not defined, and therefore the two routing policies are invalid. Detailed configurations are as follows:
ipv4-family vpn-instance CDMA-NGN
 peer 10.247.0.1 route-policy PE_NGN_OUT_MASTER export
route-policy PE_NGN_OUT_MASTER permit node 10
 if-match ip-prefix NGN-A
route-policy PE_NGN_OUT_MASTER permit node 20
 if-match ip-prefix NGN-A1
route-policy PE_NGN_OUT_MASTER permit node 30
 if-match ip-prefix NGN-A2
ip ip-prefix NGN-A index 10 permit 10.247.0.0 21
4. The rules for referencing routing policies dictate that the relationship between the three routing policies with the same name is OR. That is, VPN 1 routes can be correctly advertised as long as one of the routing policies with the same name is valid. However, the redundant invalid routing policies with the same name can still cause VPN 1 routes to be incorrectly advertised.
5. After deleting the other two invalid routing policies, you can find that GSR 1 learns only one summary route, namely, a route whose prefix belongs to the IP prefix list named NGN-A. The fault is cleared.
Procedure
Step 1 Run the system-view command on the PE to enter the system view.
Step 2 Run the undo route-policy route-policy-name [ node node ] command to delete the two redundant routing policies.
Step 3 Run the display bgp vpnv4 vpn-instance vpn-instance-name routing-table peer peer-address advertised-routes command to check routes of VPN 1 to which the PE is connected, and you can find only one summary route. The fault is cleared.
----End
Summary
According to the rules for referencing routing policies, the relationship between routing policies with the same name is OR. However, redundant routing policies still need to be deleted to prevent routes from being incorrectly advertised.
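One way the OR relationship can let specific routes through is sketched below. The sketch assumes that a node whose if-match clause references an undefined IP prefix list matches every route, and that an entry without a mask range matches only the exact prefix. This case does not state either behavior, so treat both as assumptions to verify against the device documentation. The specific route 10.247.1.0/24 is a hypothetical example.

```python
import ipaddress

# Prefix lists actually defined on the PE (from the configuration in this case).
# ASSUMPTION: an entry without a mask range matches only the exact prefix.
PREFIX_LISTS = {"NGN-A": [ipaddress.ip_network("10.247.0.0/21")]}


def matches(list_name, route):
    # ASSUMPTION: a node that references an undefined prefix list matches every route.
    if list_name not in PREFIX_LISTS:
        return True
    return route in PREFIX_LISTS[list_name]


def permitted(policy_nodes, route):
    """Nodes are ORed: the route is permitted as soon as one node matches.
    A route that matches no node is denied."""
    return any(matches(name, route) for name in policy_nodes)


summary = ipaddress.ip_network("10.247.0.0/21")
specific = ipaddress.ip_network("10.247.1.0/24")  # hypothetical specific route

three_nodes = ["NGN-A", "NGN-A1", "NGN-A2"]  # nodes 10, 20, and 30
one_node = ["NGN-A"]                         # after deleting nodes 20 and 30

print(permitted(three_nodes, specific))  # True: leaks through node 20
print(permitted(one_node, specific))     # False
print(permitted(one_node, summary))      # True
```

Under these assumptions, the specific route fails node 10 but is permitted by node 20, which is consistent with GSR 1 learning both the summary route and the specific routes until the redundant nodes are deleted.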
8.5.3 A PE Fails to Establish the Public Network LSP Because the Path of IGP Routes Is Incorrect
Fault Symptom
As shown in Figure 8-7, PE1 can receive the VPNv4 routes reflected from RR1, but these routes cannot be installed in the routing table of the VPN instance. That is, there is associated routing information in the BGP VPNv4 routing table but not in the routing table of the IPv4 VPN instance on PE1.
Figure 8-7 Failure in establishing a public network LSP on a PE
[Figure labels: PE1, PE2]
Fault Analysis
1. Run the display bgp vpnv4 all routing-table command on PE1 to check the BGP routing table, and you can find the routes learned from the peer PE. This indicates that the BGP neighbor relationship is normal and that VPN labels are assigned properly.
2. Run the display ip routing-table vpn-instance vpn-instance-name ip-address verbose command on PE1 to check detailed information about VPN routes, and you can find that the Interface field is displayed as NULL0. This indicates that VPN routes do not have a correct iterated outbound interface on the public network. That is, these VPN routes are invalid and therefore are not installed in the routing table of the VPN instance.
3. Run the display mpls ldp session command on PE1 to check the LDP session on the public network, and you can find that the LDP session works normally. This indicates that an LDP session can be established between the Ps.
4. Run the display mpls ldp lsp destination-address mask-length command on PE1 to check the assignment of public network labels, and you can find that the In/OutLabel field is displayed as Null and the Next-Hop field is empty. This indicates that LDP sessions can be established between the PEs and the Ps, but labels cannot be assigned.
5. Run the display ip routing-table ip-address command on PE1 to check IGP routes to the loopback interface of the P. The command output shows that the 32-bit IGP route to the loopback interface of the P is learned by the P from PE2. As a result, although an LDP session is established, public network label assignment is not triggered. LDP is not configured on the two PEs. If LDP is configured on the two PEs, public network labels can be assigned and routes can be generated in the VPN instance.
6. Run the display current-configuration and display ip routing-table ip-address commands to check IGP configurations and IGP routes of devices. You can find that OSPF is not correctly configured on the interface that connects RR1 to PE1.
Procedure
Step 1 Re-configure OSPF on the interface that connects RR1 to PE1 to ensure that the path of IGP routes is correct. Then, routing information can be learned from the interfaces that connect the PEs to the Ps.
Step 2 Run the display ip routing-table vpn-instance vpn-instance-name ip-address verbose command on PE1 to check the assignment of MPLS labels and iteration of the outbound interface of VPN routes. You can find that MPLS labels can be assigned properly and the outbound interface of VPN routes can be iterated properly. The Interface field shows a correct iterated outbound interface on the public network.
Step 3 Run the display mpls ldp lsp destination-address mask-length command on PE1 to check the assignment of public network labels. You can find that public network labels are assigned properly.
Step 4 Run the display ip routing-table vpn-instance vpn-instance-name command on PE1 to check the routing table of the VPN instance. You can find that the routing table of the VPN instance contains associated VPN routes. This indicates that the fault is cleared.
----End
Summary
Whether VPN routes can be installed in the routing table of a VPN instance depends on the establishment of a public network LSP. If public network labels fail to be assigned and LSPs cannot be established, check whether the path of IGP routes can trigger label assignment and whether IGP routes are correct.
8.5.4 The BGP Peer Relationship Goes Down Because of Route Iteration
Fault Symptom
There are two links between Router A (a Huawei device) and Router B (a non-Huawei device). One link is established through POS interfaces and the other link is established through GE interfaces. Devices on both ends of a link establish the BGP peer relationship through loopback interfaces. After the POS interface on Router A goes Down, the BGP peer relationship between Router A and Router B goes Down and remains in the OpenSent state. Router A, however, can successfully ping the address of the loopback interface on Router B.
Figure 8-8 Route iteration causing the BGP peer relationship to go Down
Router A: POS2/0/0 192.168.0.2/30, GE1/0/0 192.168.1.2/30, Loopback0 20.0.0.1
Router B: POS2/0/0 192.168.0.1/30, GE1/0/0 192.168.1.1/30, Loopback0 10.0.0.1
Fault Analysis
1. After POS 2/0/0 on Router A goes Down, you can run the display ip routing-table ip-address command on Router A to check equal-cost routes to the public network. The command output shows that there are two equal-cost routes with next-hop addresses both being 10.0.0.1 and outbound interfaces being GE 1/0/0 and Null0 respectively. Before POS 2/0/0 on Router A goes Down, you can find that the outbound interfaces of two equal-cost routes with next-hop addresses both being 10.0.0.1 are GE 1/0/0 and POS 2/0/0 respectively. Run the display bgp peer command on Router A to check the BGP peer relationship, and you can find that the BGP peer with the address 10.0.0.1 is in the OpenSent state.
2. Route iteration may cause outbound interfaces of equal-cost routes to change. If no route iteration occurs, after POS 2/0/0 goes Down, only one of the two equal-cost routes exists, that is, the route with the outbound interface being GE 1/0/0.
3. Check configurations of Router A and analyze why the outbound interface is iterated to Null0. Configurations show that the static routes with the 32-bit mask to the address (10.0.0.1) of the loopback interface on Router B are configured on Router A.
ip route-static 10.0.0.1 255.255.255.255 192.168.1.1
ip route-static 10.0.0.1 255.255.255.255 192.168.0.1
After POS 2/0/0 on Router A goes Down, the preceding static route configurations cause Router A to iterate routes. Then, check whether there is a route to 192.168.0.1 in the routing table. By checking the configuration file, you can find the following static route configurations:
ip route-static 192.168.0.0 255.255.255.0 NULL0 preference 255
Therefore, the outbound interface of one of the two upstream equal-cost routes becomes Null0.
4. Analyze why the BGP peer relationship goes Down after one outbound interface becomes Null0. After POS 2/0/0 goes Down, two upstream routes of Router A are as follows:
Destination/Mask    Proto  Pre  Cost  NextHop    Interface
10.0.0.1/32         BGP    100  0     10.0.0.1   GigabitEthernet1/0/0
                    BGP    100  0     10.0.0.1   NULL0
In this case, Router A can successfully ping the address (10.0.0.1) of the loopback interface on Router B. In normal situations, the BGP peer relationship keeps Up. Because there are two links between Router A and Router B, Hash calculation is triggered when packets are exchanged between the two devices. If you run the ping command without specifying the source address, the outbound interface calculated by the Hash algorithm is GE 1/0/0, in which case the ping succeeds. If you run the ping command with loopback interface address 20.0.0.1 being the source address on Router A, the outbound interface calculated by the Hash algorithm is POS 2/0/0, in which case the ping fails. Loopback interface addresses are used to establish the BGP peer relationship between Router A and Router B. POS 2/0/0 is now iterated to the outbound interface of Null0. Therefore, the BGP peer relationship between Router A and Router B goes Down. To clear the fault, you need to disable route iteration on Router A.
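The iteration to Null0 described above can be reproduced with a longest-prefix lookup of the static routes' next hops. This is a minimal sketch rather than the device's routing logic. The function name and table layout are hypothetical, and the direct /30 routes are derived from the interface addresses shown in Figure 8-8.

```python
import ipaddress


def resolve(next_hop, table):
    """Return the outbound interface of the longest-prefix route that
    covers next_hop, or None if no route covers it."""
    addr = ipaddress.ip_address(next_hop)
    covering = [(net, ifc) for net, ifc in table if addr in net]
    if not covering:
        return None
    return max(covering, key=lambda entry: entry[0].prefixlen)[1]


direct_pos = (ipaddress.ip_network("192.168.0.0/30"), "POS2/0/0")
direct_ge = (ipaddress.ip_network("192.168.1.0/30"), "GE1/0/0")
# ip route-static 192.168.0.0 255.255.255.0 NULL0 preference 255
blackhole = (ipaddress.ip_network("192.168.0.0/24"), "NULL0")

pos_up = [direct_pos, direct_ge, blackhole]
pos_down = [direct_ge, blackhole]  # the direct route disappears with the interface

print(resolve("192.168.0.1", pos_up))    # POS2/0/0
print(resolve("192.168.0.1", pos_down))  # NULL0
print(resolve("192.168.1.1", pos_down))  # GE1/0/0
```

While POS 2/0/0 is Up, the direct /30 route is the longest match for 192.168.0.1. Once it is withdrawn, the /24 blackhole route is the only covering route, so the static route through 192.168.0.1 stays in the table with Null0 as its outbound interface.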
Procedure
Step 1 Run the system-view command on Router A to enter the system view.
Step 2 Run the undo ip route-static 10.0.0.1 255.255.255.255 192.168.1.1 and undo ip route-static 10.0.0.1 255.255.255.255 192.168.0.1 commands to delete original static route configurations.
Step 3 Run the ip route-static 10.0.0.1 255.255.255.255 gigabitethernet 1/0/0 192.168.1.1 and ip route-static 10.0.0.1 255.255.255.255 pos 2/0/0 192.168.0.1 commands to configure the static routes with next hops and outbound interfaces. After POS 2/0/0 on Router A goes Down, the static route with the outbound interface being POS 2/0/0 becomes unreachable and thus is deleted from the routing table. Then, all packets destined for Router B are sent through GE 1/0/0 only.
Step 4 Run the display bgp peer command, and you can find that the BGP peer with the address 10.0.0.1 is in the Established state. This indicates that the BGP peer becomes normal. The fault is cleared.
----End
Summary
Route iteration is enabled by default. Ensure that route iteration will not cause exceptions on a network.
8.5.5 Static Routes Do Not Take Effect Because of the Relay Depth
Fault Symptom
As shown in Figure 8-9, Router A and Router B are connected through two POS links and establish the EBGP peer relationship. The following two static routes are configured on Router A:
ip route-static 2.2.2.2 255.255.255.255 pos1/0/0 10.1.1.2
ip route-static 2.2.2.2 255.255.255.255 10.1.2.2
The routing table shows that routes to Router B have only one outbound interface POS 1/0/0.
Figure 8-9 Static routes failing to take effect
Fault Analysis
Because the static route configured through the ip route-static 2.2.2.2 255.255.255.255 pos1/0/0 10.1.1.2 command is specified with an outbound interface, route relay is not required and the relay depth is 0. Because no outbound interface is specified for the other static route configured through the ip route-static 2.2.2.2 255.255.255.255 10.1.2.2 command, route relay needs to be performed one time and the relay depth is 1. BGP selects the static route with the smallest relay depth. Therefore, BGP selects the static route with the relay depth of 0, and the outbound interfaces of the BGP routes are POS 1/0/0.
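The selection rule in this analysis can be sketched as a filter that keeps only the routes with the smallest relay depth. The function name and the tuple layout are hypothetical. The depths follow the text: 0 when an outbound interface is configured, 1 when only a next hop is given.

```python
def select_by_relay_depth(routes):
    """Keep only the routes whose relay depth equals the smallest depth present.

    routes: list of (description, relay_depth) tuples.
    """
    smallest = min(depth for _, depth in routes)
    return [name for name, depth in routes if depth == smallest]


# Relay depth is 0 with an outbound interface and 1 with a next hop only.
before = [("pos1/0/0 10.1.1.2", 0), ("10.1.2.2", 1)]
after = [("pos1/0/0 10.1.1.2", 0), ("pos2/0/0 10.1.2.2", 0)]

print(select_by_relay_depth(before))  # only the POS 1/0/0 route survives
print(select_by_relay_depth(after))   # both routes survive
```

Once both static routes carry an outbound interface, they share relay depth 0 and both remain candidates, which is why two outbound interfaces then appear in the routing table.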
Procedure
Step 1 Run the system-view command on Router A to enter the system view.
Step 2 Run the undo ip route-static 2.2.2.2 255.255.255.255 10.1.2.2 command to delete the static route.
Step 3 Run the ip route-static 2.2.2.2 255.255.255.255 pos2/0/0 10.1.2.2 command to configure a static route with an outbound interface.
After the preceding operations, both static routes are selected when BGP selects static routes with the smallest relay depth. Therefore, you can find two outbound interfaces POS 1/0/0 and POS 2/0/0 when checking the routing table of Router A.
----End
Summary
When configuring static routes, specify outbound interfaces for them. In this way, route relay is avoided.
8.5.6 The Outgoing Traffic Is Not Balanced Because BGP Load Balancing Is Not Enabled
Fault Symptom
As shown in Figure 8-10, Router A, Router B, Router C, and Router D are four egress devices on the MAN; Router E, Router F, Router G, and Router H are four devices on the provincial backbone network. Before service cutover, the four egress devices on the MAN are connected to the four devices on the provincial backbone network by using EBGP as shown in Figure 8-10. The networking is adjusted to meet service requirements. After service cutover, the four egress devices on the MAN are connected to the four devices on the provincial backbone network through EBGP as shown in Figure 8-11. Traffic fails to be balanced between the two outbound interfaces of each egress device on the MAN, and most traffic is transmitted over one link.
[Figure 8-10 and Figure 8-11: networking before and after service cutover; recoverable labels: RouterD, RouterH]
Fault Analysis
Two equal-cost routes are generated on each egress device on the MAN. Because BGP load balancing is not enabled, most traffic is transmitted over one link. To solve this problem, you need to enable BGP load balancing on the four egress devices on the MAN.
Procedure
Step 1 Run the system-view command on Router A to enter the system view.
Step 2 Run the bgp as-number command to enter the BGP view.
Step 3 Run the maximum load-balancing number command to set the maximum number of equal-cost routes to 2 or a greater value.
After the preceding operations, perform the same operations on Router B, Router C, and Router D.
----End
Summary
With the rapid development of networks, MANs and backbone networks as shown in Figure 8-11 are widely deployed. When devices on two MANs access each other, equal-cost routes are generated. If BGP load balancing is not enabled on egress devices on the MAN, the route advertised by the device with the smallest router ID is preferred according to BGP route selection rules. Consequently, the fault described in this case occurs.
It is recommended to enable load balancing on egress devices on the MAN and set the maximum number of equal-cost routes to the maximum value to facilitate capacity expansion. If there are both Huawei devices and non-Huawei devices on a network, take the maximum number of equal-cost routes supported on non-Huawei devices into consideration.
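The difference between the two behaviors can be sketched as follows. The function name, router IDs, and interface names are hypothetical. The sketch also assumes that a limit of 1 is equivalent to load balancing being disabled, and it models only the router ID tie-breaker mentioned in this summary.

```python
import ipaddress


def forwarding_routes(equal_routes, max_load_balancing=1):
    """Return the outbound interfaces used for forwarding.

    equal_routes: list of (advertising_router_id, outbound_interface) tuples
    for routes that are equal in every other attribute.
    ASSUMPTION: a limit of 1 means load balancing is not enabled.
    """
    ordered = sorted(equal_routes, key=lambda r: ipaddress.ip_address(r[0]))
    return [ifc for _, ifc in ordered[:max_load_balancing]]


# Hypothetical router IDs and interfaces on one MAN egress device.
routes = [("10.0.0.6", "GE1/0/0"), ("10.0.0.5", "GE2/0/0")]

print(forwarding_routes(routes))                        # one link carries the traffic
print(forwarding_routes(routes, max_load_balancing=2))  # both links are used
```

With the limit at 1, only the route from the advertiser with the smallest router ID is used, so one link carries most of the traffic. Raising the limit to 2 or more lets both equal-cost routes forward.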
8.5.7 Summary Routes Advertised by EBGP Flap Frequently Because Routing Protocols Are Configured with Improper Preferences
Fault Symptom
As shown in Figure 8-12, Router C and Router D are two egress devices on a MAN and are connected to Router A and Router B on the backbone network through EBGP. Route suppression is configured on devices on the backbone network to suppress EBGP routes from egress devices on the MAN. On the MAN, Router C and Router D are connected to their attached devices through IS-IS and IBGP peer relationships are established between them. Some traffic traverses egress devices on the MAN. Therefore, to prevent link faults from causing routing loops, Router C and Router D establish the IBGP peer relationship by using interface addresses and advertise MAN routes to devices on the backbone network through static blackhole routes imported by the network command.
After a fault occurs on a board or a link, the IBGP peer relationship between Router C and Router D alternates between Up and Down states frequently and therefore all MAN services are interrupted.
Figure 8-12 Summary routes advertised by EBGP peer flapping frequently
[Figure labels: RouterC, RouterD, RouterE, RouterF, RouterG, RouterH, MAN]
Fault Analysis
Possible causes of route flapping are as follows:
- Associated routing policies, including local and remote routing policies, are changed manually.
- Routes (mainly the advertised summary routes) are added and then deleted two consecutive times.
- Static and dynamic routing protocols are configured with improper preferences. As a result, some BGP summary routes cannot be advertised through the blackhole route imported by the network command.
By checking logs on devices, you can find that no associated routing policies are changed manually and no routes are added and then deleted. Egress devices on the MAN advertise routes through blackhole routes imported by the network command. Therefore, route addition or deletion is not involved. By checking summary routes, you can find that the lifetime of these routes is very long. This indicates that service interruption is unlikely to occur.
On Router C and Router D, the preference of the BGP protocol is set to 20, and the preference of static blackhole routes defaults to 60. A smaller preference value indicates a higher priority. Therefore, if both IBGP routes and static routes exist, IBGP routes are advertised as summary routes first. As a result, when the IBGP peer relationship between Router C and Router D alternates between Up and Down states, the advertised summary routes flap, thus causing MAN services to be interrupted.
To solve this problem, you need to change the preferences of BGP routes and static routes so that IBGP routes have a lower priority (a larger preference value) than static routes.
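The flapping can be illustrated by tracking which route backs the advertised summary as the IBGP peer relationship goes Up and Down. This is a simplified sketch. The function name is hypothetical, the preference value 100 stands in for any value greater than 60, and a smaller preference value is treated as the preferred route.

```python
def active_source(bgp_preference, ibgp_up, static_preference=60):
    """Return which route backs the summary: the one with the smallest
    preference value among the routes that currently exist."""
    candidates = {"Static": static_preference}
    if ibgp_up:
        candidates["IBGP"] = bgp_preference
    return min(candidates, key=candidates.get)


# The IBGP peer relationship alternates between Up and Down.
peer_states = [True, False, True, False]

print([active_source(20, up) for up in peer_states])
# ['IBGP', 'Static', 'IBGP', 'Static']: the summary route flaps
print([active_source(100, up) for up in peer_states])
# ['Static', 'Static', 'Static', 'Static']: the summary route is stable
```

With the BGP preference at 20, the source of the summary route changes every time the IBGP peer relationship changes state. With a value greater than 60, the static blackhole route always wins and the advertised summary no longer depends on the IBGP session.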
Procedure
Step 1 Run the system-view command on Router C to enter the system view.
Step 2 Run the bgp as-number command to enter the BGP view.
Step 3 Run the preference preference command to set the preference for the BGP protocol to a value greater than 60.
After the preceding operations, perform the same operations on Router D to change the preference of the BGP protocol. In this manner, MAN service transmission recovers.
----End
Summary
Theoretically, MAN routes, which are advertised through blackhole routes imported by the network command, do not flap. In this case, however, devices on the MAN do not advertise routes as required because routing protocols are configured with improper preferences.
8.5.8 Traffic Is Not Load Balanced Between Two Links Because Load Balancing Is Not Configured on the Peer End
Fault Symptom
AS1, AS2, and AS3 belong to three different carriers. AS1 is connected to Router B through two links, and traffic is load balanced on the two links with the load balancing weight being 2:1. After the two links are cut over to Router A, among traffic from AS1 to AS3, the volume of traffic received by POS 1/0/2 reaches 120 Mbit/s whereas the volume of traffic received by POS 1/0/1 reaches only 1 to 3 Mbit/s. This indicates that traffic from AS1 to AS3 is not load balanced between the two links from AS1 to Router A.
[Figure: networking diagram; recoverable labels: AS2, AS3, Router B, 10.111.128.0 255.255.240.0, 10.111.144.0 255.255.240.0]
Fault Analysis
It is inferred that devices in AS1 do not load balance traffic among the routes learned from AS2. Devices in AS1 forward traffic only along the optimal path, thus causing this fault. You can change the path to be selected by the peer end by modifying the Multi-Exit-Discriminator (MED) of Router A.
Procedure
Step 1 Run the system-view command on Router A to enter the system view.
Step 2 Run the ip ip-prefix med-prefix index 10 permit 10.111.128.0 20 command to configure an IP prefix list named med-prefix to allow only the routes with the prefix being 10.111.128.0/20 to pass.
Step 3 Run the route-policy med-500 permit node 10 command to set a route-policy named med-500 with its node number being 10 in permit mode.
Step 4 Run the if-match ip-prefix med-prefix command to set a matching rule based on the IP prefix list named med-prefix.
Step 5 Run the apply cost 500 command to set the cost of the routes with the prefix being 10.111.128.0/20 to 500.
Step 6 Run the quit command to exit from the route-policy view.
Step 7 Run the route-policy med-500 permit node 15 command to set a route-policy named med-500 with its node number being 15 in permit mode.
Step 8 Run the apply cost 0 command to set the cost of the routes destined for other network segments to 0.
Step 9 Run the quit command to exit from the route-policy view.
Step 10 Run the bgp 3 command to enter the BGP view.
Step 11 Run the ipv4-family unicast command to enter the IPv4 unicast address family view.
Step 12 Run the peer 1.1.1.1 route-policy med-500 export command to apply a route-policy named med-500 to the routes to be advertised to the peer.
Step 13 Run the display interface pos 1/0/1 command. The command output shows that the volume of incoming traffic of POS 1/0/1 increases greatly. This indicates that the configured route-policy has taken effect. To split more traffic to POS 1/0/1, add network segments in the configured IP prefix list.
----End
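The effect of route-policy med-500 on advertised routes can be sketched as an ordered walk over its two nodes. The function name is hypothetical, and the sketch assumes that the prefix list entry, which has no mask range, matches only the exact prefix 10.111.128.0/20.

```python
import ipaddress

# ip ip-prefix med-prefix index 10 permit 10.111.128.0 20
# ASSUMPTION: without a mask range, only the exact /20 prefix matches.
MED_PREFIX = [ipaddress.ip_network("10.111.128.0/20")]


def exported_med(route):
    """Walk the nodes of route-policy med-500 in order and return the MED
    that is set on the advertised route."""
    net = ipaddress.ip_network(route)
    if net in MED_PREFIX:  # node 10: if-match ip-prefix med-prefix, apply cost 500
        return 500
    return 0               # node 15: no if-match clause, apply cost 0


print(exported_med("10.111.128.0/20"))  # 500
print(exported_med("10.111.144.0/20"))  # 0
```

Because the peer prefers the smaller MED when other attributes are equal, routes advertised to 1.1.1.1 with MED 500 become less attractive over that session. This is how the policy shifts part of the incoming traffic to the other link.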
Summary
The MED attribute is transmitted between two neighboring ASs only. That is, devices in an AS do not advertise the received MED attribute to any devices in any other ASs. The MED is similar to the metric used by an IGP and is used to determine the optimal route for the traffic that enters an AS. When a BGP router obtains multiple routes with the same destination address but different next hops through EBGP peers, the route with the smallest MED is selected as the optimal route if the other conditions of the routes are the same.