
Troubleshooting BGP

BRKRST-3320

© 2012 Cisco and/or its affiliates. All rights reserved.

Cisco Public

Introduction

Housekeeping

Cell Phones

Who am I?

Who are you?

Service Provider

Enterprise

Studying for CCIE

“Advanced” Class

Assume BGP Operational Experience

Basic configuration

Show commands

Understand BGP attributes

Introduction

Operating Systems

IOS vs. IOS-XR vs. NX-OS

Troubleshooting concepts are the same

Some variation in show command syntax and output

Will use all three in this presentation

Introduction

Agenda

Generic Troubleshooting Advice

Troubleshooting Peers

Bestpath Algorithm

Table Version

Initial Convergence

Periodic Convergence

High Utilization

Layer 3 VPNs

Looking Glasses

Generic Troubleshooting Advice

Generic Troubleshooting Advice

Narrow down the problem

Can you reproduce it?

Which device(s) are the cause of the problem?

Reduce your configs

Troubleshoot one thing at a time

100k routes flapping? Pick one route and focus on that one route

Have a co-worker take a look

Forces you to talk through the problem

Different set of eyes may spot something

Sniffer capture, sniffer capture, sniffer capture

Generic Troubleshooting Advice

Syslogs

Use NTP to sync timestamps on your routers

clock timezone EST -5 0

clock summer-time EDT recurring

ntp server x.x.x.x

Use a syslog server

logging monitor informational

logging host x.x.x.x

service timestamps log datetime msec localtime

Generic Troubleshooting Advice

Syslogs

Centralized/Timesynced syslogs are a great troubleshooting tool

Generic Troubleshooting Advice

log-neighbor-changes

bgp log-neighbor-changes

Generates a syslog message when a peer goes up or down

Always configure this

OSPF, ISIS, and EIGRP all have log-neighbor-changes too

Generic Troubleshooting Advice

Define “Normal”

“The CPU on this router is high”

High compared to what?

What is the CPU load normally at this time of day?

Things to keep track of

CPU load

Free Memory

Largest block of memory

Input/Output load for interfaces

Rate of BGP bestpath changes

Etc., etc.

Generic Troubleshooting Advice

Define “Normal”

Cacti is a handy tool for polling and graphing data from various network devices

http://www.cacti.net/

Generic Troubleshooting Advice

Sniffer Captures

Use SPAN to get traffic to your sniffer

monitor session 1 source interface Te2/4 rx

monitor session 1 destination interface Te2/2

IOS-XR

Only supported on ASR-9000

Use ACLs to control what packets to SPAN

RSPAN

“RSPAN has all the features of SPAN, plus support for source ports and destination ports that are distributed across multiple switches, allowing one to monitor any destination port located on the RSPAN VLAN. Hence, one can monitor the traffic on one switch using a device on another switch.”

Generic Troubleshooting Advice

Embedded Packet Capture

Ability to capture packets on the router

Primarily for control-plane traffic

Difficult to capture transit traffic on distributed platforms

Is supported on some platforms

Very handy if a dedicated sniffer is not available

Available on IOS and NX-OS

Generic Troubleshooting Advice

IOS Embedded Packet Capture

Create a buffer

monitor capture buffer buf1 size 512 max-size 512 circular

Define which interface and direction to capture

monitor capture point ip cef dwalton-cap gig 0/0 in

Associate the buffer with the capture

monitor capture point associate dwalton-cap buf1

Start/Stop the capture

monitor capture point start dwalton-cap

monitor capture point stop dwalton-cap

Export the capture to a .pcap file

monitor capture buffer buf1 export tftp://172.26.2.254/buf1.pcap

Generic Troubleshooting Advice

Wireshark

You probably know this already but…

Wireshark is your best friend

It is free

You can get it here

http://www.wireshark.org/

Generic Troubleshooting Advice

Wireshark

Can do complex filters

ANDs, ORs, ()s, etc.

If the filter is red, your syntax is busted

If the filter is green, your syntax is correct

Generic Troubleshooting Advice

Wireshark

Wireshark does a LOT

Enough for someone to write an 800 page book on how to use it

ISBN-13: 978-1893939998

Generic Troubleshooting Advice

Debugs

Send output to the logging buffer, not the console

logging buffered <size>

no logging console

Use millisecond timestamps

service timestamps debug datetime msec localtime

service timestamps log datetime msec localtime

Use ACLs to limit output

brain1(config)#access-list 100 permit ip host 1.1.1.1 host 2.2.2.2

brain1#debug ip packet 100

IP packet debugging is on for access list 100

brain1#

If you need to enable a very chatty debug

reload in 10

Run your debug

reload cancel

Generic Troubleshooting Advice

Event Tracing

Collects event information for various protocols

Runs in the background

Events are stored in memory

Debug output is not generated

Syslogs are not generated

Finite number of most recent events are stored

Use show commands later to

Display an event in a “debug like” format

Merge events from various protocols

Easier on the box than debugs

http://tinyurl.com/cisco-event-tracer

Generic Troubleshooting Advice

Event Tracing

brain1(config)#monitor event-trace ?
  adjacency   Adjacency Events
  all-traces  Configure merged event traces
  atom        AToM Event Trace
  cef         CEF traces
[snip]
brain1(config)#monitor event-trace adjacency enable
brain1(config)#end

Generic Troubleshooting Advice

Event Tracing

brain1#show monitor event-trace adjacency all

Feb 14 17:15:48.270: GLOBAL: adj mgr notified of fibidb state change int FastEthernet0/0 to down [OK]

Feb 14 17:15:50.958: GLOBAL: adj mgr notified of fibidb state change int FastEthernet0/0 to up [OK]

Feb 14 17:15:51.682: GLOBAL: adj ipv4 bundle changed to IPv4 no fixup adj oce [OK]

Feb 14 17:15:51.682: ADJ: IP 172.26.38.1 FastEthernet0/0/0: update oce bundle, IPv4 incomplete adj oce [OK]

Feb 14 17:15:51.682: ADJ: IP 172.26.38.1 FastEthernet0/0/0: allocate [OK]

Feb 14 17:15:51.686: ADJ: IP 172.26.38.1 FastEthernet0/0/0: request resolution [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: request to add ARP [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: allocate [Ignr]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: add source ARP [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: request to update [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: update oce bundle, IPv4 no fixup adj oce [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: update [OK]

brain1#

Generic Troubleshooting Advice

Out of Band Access

Don’t be the person who has to drive 3 hours to console into a box

If you don’t have out of band access for every router and/or switch in your network….get it….please

Troubleshooting Peers

Troubleshooting Peers

Failed Peering

Configurations

Check

AS Numbers

IP addresses for TCP

eBGP Multihop?

R1#sh tcp brief all
TCB       Local Address    Foreign Address    (state)
64328548  *.179            2.2.2.2.*          LISTEN
…
R1#

R1

interface Loop0
 ip address 1.1.1.1/32
!
router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source Loop0

R2

interface Loop0
 ip address 2.2.2.2/32
!
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loop0

Failed Peering

Connectivity

Check

Extended ping between BGP peering addresses

R1#ping 2.2.2.2 source Loop0
Sending 5, 100-byte ICMP Echos to 2.2.2.2
Packet sent with a source address of 1.1.1.1
Success rate is 0 percent (0/5)
R1#

R1

interface Loop0
 ip address 1.1.1.1/32
!
router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source Loop0

R2

interface Loop0
 ip address 2.2.2.2/32
!
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loop0

Failed Peering

Connectivity

BGP runs on top of IP and can be affected by many things

No connectivity?

IGP issues

Access Lists

TCP problems

Peers come up but flap, are slow, etc.

MTU Issues – extended ping and sweep address ranges, DF bit, etc.

Rate limiting

Traffic shaping

Debugs may be needed

Failed Peering

Notifications

BGP NOTIFICATIONs consist of an error code, subcode and data

All Error Codes and Subcodes can be found here

http://www.iana.org/assignments/bgp-parameters/bgp-parameters.xml

http://tinyurl.com/bgp-notification-codes

Data portion may contain what triggered the notification

Example: corrupt part of the UPDATE

Pay attention to who sent vs. received the NOTIFICATION

If Router X sent the NOTIFICATION, it means Router X detected the issue

Does not mean Router X is the cause of the issue

Failed Peering

Notifications

%BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 2/2 (peer in wrong AS) 2 bytes 00C8 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002D 0104 00C8 00B4 0202 0202 1002 0601 0400 0100 0102 0280 0002 0202 00

Value  Name                        Reference
1      Message Header Error        RFC 4271
2      OPEN Message Error          RFC 4271
3      UPDATE Message Error        RFC 4271
4      Hold Timer Expired          RFC 4271
5      Finite State Machine Error  RFC 4271
6      Cease                       RFC 4271

The first 2 in “2/2” is the Error Code…so “OPEN Message Error”

Failed Peering

Notifications

Subcode #  Subcode Name                    Subcode Description
1          Unsupported BGP version         The version of BGP the peer is running isn’t compatible with the local version of BGP
2          Bad Peer AS                     The AS this peer is locally configured for doesn’t match the AS the peer is advertising
3          Bad BGP Identifier              The BGP router ID is the same as the local BGP router ID
4          Unsupported Optional Parameter  There is an option in the packet which the local BGP speaker doesn’t recognize
6          Unacceptable Hold Time          The remote BGP peer has requested a BGP hold time which is not allowed (too low)
7          Unsupported Capability          The peer has asked for support for a feature which the local router does not support

OPEN Message Subcodes shown above

The second 2 in “2/2” is the Error Subcode…so “Bad Peer AS”

Failed Peering

Notifications

R2#show log | include NOTIFICATION
%BGP-3-NOTIFICATION: sent to neighbor 10.1.2.1 2/2 (peer in wrong AS) 2 bytes 0064 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002D 0104 0064 00B4 0101 0101 1002 0601 0400 0100 0102 0280 0002 0202 00

0x0064 = “data” of NOTIFICATION

0x0064 = decimal 100


Sniff of BGP Notification Sent from R2 to R1
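
The “2/2” and “0064” decoding walked through on these slides can be sketched in a few lines of Python. This is only an illustration: the lookup tables are copied from the slides, and `decode_notification`, `ERROR_CODES` and `OPEN_SUBCODES` are hypothetical names, not part of any Cisco tool.

```python
# Error codes and OPEN Message subcodes as listed on the slides (RFC 4271).
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}
OPEN_SUBCODES = {
    1: "Unsupported BGP version",
    2: "Bad Peer AS",
    3: "Bad BGP Identifier",
    4: "Unsupported Optional Parameter",
    6: "Unacceptable Hold Time",
    7: "Unsupported Capability",
}


def decode_notification(code_pair, data_hex=""):
    """Decode a 'code/subcode' pair such as '2/2' plus optional hex data."""
    code, subcode = (int(part) for part in code_pair.split("/"))
    error = ERROR_CODES.get(code, "unknown error code")
    # Only the OPEN Message subcodes are tabulated here; others stay numeric.
    if code == 2:
        sub = OPEN_SUBCODES.get(subcode, "unknown subcode")
    else:
        sub = str(subcode)
    # The data field is simply converted from hex to decimal.
    data = int(data_hex, 16) if data_hex else None
    return error, sub, data


print(decode_notification("2/2", "0064"))
```

For the log line above this yields the OPEN Message Error, the Bad Peer AS subcode, and decimal 100 for the data field.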

[Figure: R1 (AS 100, 10.1.2.1) peers with R2 (AS 200, 10.1.2.2)]

Failed Peering

Notifications

Question: What did R1 see?

R1#sh log | include NOTIFICATION
%BGP-3-NOTIFICATION: received from neighbor 10.1.2.2 2/2 (peer in wrong AS) 2 bytes 0064

R1 (AS 100, 10.1.2.1)

router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.1.2.2 remote-as 200
 no auto-summary

R2 (AS 200, 10.1.2.2)

router bgp 200
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.1.2.1 remote-as 10
 no auto-summary

Failed Peering

Decoding Hex

What if a peer sends you a message that causes you to send a NOTIFICATION?

Corrupt UPDATE

Bad OPEN message, etc.

View the message that triggered the NOTIFICATION

show ip bgp neighbor 1.1.1.1 | begin Last reset

Last reset 5d12h, due to BGP Notification sent, invalid or corrupt AS path
Message received that caused BGP to send a Notification:

FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF 005C0200 00004140 01010040 0206065D 1CFC059F 400304D5 8C20F480 04040000 05054005 04000000 55C0081C 329C4844 329C6E28 329C6E29 58F50082 58F5EACE 58F5FA02 58F5FA6E 18D14E70

Failed Peering

Decoding Hex

You don’t like reading hex?

Nice write-up here on converting hex output to wireshark .pcap file

http://ccie-in-3-months.blogspot.com/2010/08/decoding-ripe-experiment.html

http://tinyurl.com/bgp-hex-decode

In a nutshell, put the hex dump in this format

[Figure: the hex dump reformatted as text2pcap input]

Failed Peering

Decoding Hex

Now use Wireshark’s text2pcap.exe to add the needed headers

Open bgp_message.pcap with Wireshark
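
The reformatting step can be sketched in Python. This is a minimal sketch, assuming text2pcap's usual input layout of a hexadecimal offset followed by space-separated hex bytes; `to_text2pcap_lines` is a hypothetical helper, and the sample is only the first 24 bytes of the message shown in the “Last reset” output.

```python
def to_text2pcap_lines(hex_dump, bytes_per_line=16):
    """Turn an IOS-style hex dump into offset-prefixed lines of hex bytes."""
    # Drop all whitespace so "FFFFFFFF FFFFFFFF" becomes one long hex string.
    digits = "".join(hex_dump.split())
    raw = bytes.fromhex(digits)
    lines = []
    for offset in range(0, len(raw), bytes_per_line):
        chunk = raw[offset:offset + bytes_per_line]
        byte_text = " ".join(f"{b:02x}" for b in chunk)
        # Six-digit hex offset, then the bytes on that line.
        lines.append(f"{offset:06x} {byte_text}")
    return lines


# First 24 bytes of the message from the "Last reset" output.
sample = "FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF 005C0200 00004140"
for line in to_text2pcap_lines(sample):
    print(line)
```

Saved to a text file, output like this is what text2pcap reads. Its `-T <srcport>,<destport>` option prepends dummy Ethernet, IP and TCP headers, so something like `text2pcap -T 179,179 bgp_message.txt bgp_message.pcap` should give Wireshark enough context to decode the payload as BGP; the input file name here is an assumption.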

Troubleshooting Peers

eBGP TTL

BGP uses a TTL of 1 for eBGP peers

Also verifies if NEXTHOP is directly connected

For eBGP peers that are more than 1 hop away a larger TTL must be used

No longer verifies if NEXTHOP is directly connected

neighbor x.x.x.x ebgp-multihop [2-255]

[Figure: R1 (AS65000) and R2 (AS65001), contrasting the default TTL with the configured TTL]

Troubleshooting Peers

eBGP TTL

Loopback peering to directly connected eBGP peer

Typically used to load-balance over multiple links

Two options for configuring this…

Option #1 – The old way

Use ebgp-multihop

Change the TTL to 2

Disables the “is the NEXTHOP on a connected subnet” check

R1#
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 ebgp-multihop 2
 neighbor 2.2.2.2 update-source Loopback0
 no auto-summary

[Figure: R1 and R2, multihop eBGP session between loopbacks]

Troubleshooting Peers

eBGP TTL

Option #2 – The new way

Use disable-connected-check

Still uses a TTL of 1

Disables the “is the NEXTHOP on a connected subnet” check

R1#
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 disable-connected-check
 neighbor 2.2.2.2 update-source Loopback0
 no auto-summary

[Figure: R1 and R2, multihop eBGP session between loopbacks]

Failed Peering

Notifications – Hold Time Expired

[Figure: R1 sends a NOTIFICATION to R2]

%BGP-5-ADJCHANGE: neighbor 2.2.2.2 Down BGP Notification sent
%BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 4/0 (hold time expired)

R1#show ip bgp neighbor 2.2.2.2 | include Last reset
Last reset 00:01:02, due to BGP Notification sent, hold time expired

R1 sends hold time expired NOTIFICATION to R2

R1 did not receive a keepalive from R2 for holdtime seconds

One of two issues

R2 is not generating keepalives

R2 is generating keepalives but R1 is not receiving them

Failed Peering

Notifications – Hold Time Expired

First figure out if R2 is building keepalives

Is R2 out of memory or CPU?

Output drops on the outbound interface towards R1?

When did R2 last build a BGP message for R1? It should be within “keepalive interval” seconds.

R2#show ip bgp neighbors 1.1.1.1

Last read 00:00:15, last write 00:00:44, hold time is 180,

keepalive interval is 60 seconds

R2 is building messages for R1 but is R2 able to send them?

Watch OutQ and MsgSent counters in “show ip bgp summary”

OutQ is the number of packets waiting for TCP to TX to a peer

MsgSent is the number of packets TCP has removed from OutQ and transmitted for a peer
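
The OutQ/MsgSent check above can be sketched as a comparison of two samples taken at least one keepalive interval apart. This is only a sketch: `keepalives_stuck` is a hypothetical helper and the counter values are illustrative, not taken from a real router.

```python
def keepalives_stuck(first, second):
    """Compare two samples of MsgSent and OutQ for the same peer."""
    msg_sent_grew = second["MsgSent"] > first["MsgSent"]
    out_q_grew = second["OutQ"] > first["OutQ"]
    # Messages are being queued for TCP but none are being transmitted.
    return out_q_grew and not msg_sent_grew


# Illustrative counters, assumed to be sampled one keepalive interval apart.
sample_1 = {"MsgSent": 500, "OutQ": 3}
sample_2 = {"MsgSent": 500, "OutQ": 4}
print(keepalives_stuck(sample_1, sample_2))
```

A result of `True` matches the symptom described above: messages are being built for the peer, but TCP is not getting them out.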

Failed Peering

Notifications – Hold Time Expired

R2#show ip bgp sum | begin Neighbor
Neighbor  …  MsgRcvd MsgSent  TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1   …       53     284   10167    0   97 00:01:20        0

R2#show ip bgp sum | begin Neighbor
Neighbor  …  MsgRcvd MsgSent  TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1   …       53     284   10167    0   98 00:02:24        0

At least one BGP keepalive interval apart

The number of packets generated is increasing

The number of packets transmitted is not increasing

OutQ is incrementing due to keepalive generation

MsgSent is not incrementing

Something is “stuck” on the OutQ

The keepalives are not leaving R2!!

Failed Peering

Notifications – Hold Time Expired

Do R1 and R2 still have IP connectivity?

Ping using peering addresses (loopback to loopback)

Ping with mss (max-segment-size) with df-bit set

MSS – Max Segment Size

536 bytes by default

Path MTU Discovery finds smallest MTU between R1 and R2

Subtracts 40 bytes for TCP/IP overhead

Note the MSS and ping accordingly

R1#sh ip bgp neighbors
BGP neighbor is 2.2.2.2, remote AS 2, external link
Datagrams (max data segment is 1460 bytes):

R1# ping 2.2.2.2 source loop0 size 1500 df-bit
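
The arithmetic behind the ping size can be sketched as follows. It uses the 40-byte TCP/IP overhead quoted above and assumes the ping size counts the whole IP datagram; both function names are hypothetical.

```python
TCP_IP_OVERHEAD = 40  # bytes subtracted for TCP/IP headers, per the slide


def mss_from_path_mtu(path_mtu):
    """MSS that results from a given path MTU."""
    return path_mtu - TCP_IP_OVERHEAD


def ping_size_for_mss(mss):
    """Datagram size needed to exercise a full-sized TCP segment."""
    return mss + TCP_IP_OVERHEAD


print(mss_from_path_mtu(1500))
print(ping_size_for_mss(1460))
```

With the 1460-byte max data segment reported by the neighbor output, this gives the 1500-byte ping used on the slide.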

Failed Peering

Notifications – Hold Time Expired

MSS ping

BGP OPENs and Keepalives are small

UPDATEs can be much larger

Maybe small packets work but larger packets do not?

R1#ping 2.2.2.2 source loop0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/21/24 ms

R1#ping 2.2.2.2 source loop0 size 1500 df-bit
Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)

This is a layer 2 or 3 transport issue, etc.

Failed Peering

Notifications – Hold Time Expired

Some other possible causes could have been

Input drops on R1

R1 CPU at 100%

R1 out of memory

Bestpath Algorithm

Bestpath Algorithm

Best Path

Algorithm

Quick bestpath review

Remember

BGP only advertises one path per prefix…the bestpath

Cannot advertise path from one iBGP peer to another

Bestpath selection process is a little lengthy

First eliminate paths that are ineligible for bestpath

1  Not synchronized      Only happens if “sync” is configured AND the route isn’t in your IGP
2  Inaccessible NEXTHOP  IGP does not have a route to the BGP NEXTHOP
3  Received-only paths   Happens if “soft-reconfig inbound” is applied. A path will be received-only if it was denied/modified by inbound policy.

Best Path

Algorithm

1   Weight                 Highest wins   Scope is router only
2   LOCAL_PREFERENCE       Highest wins   Scope is AS only
3   Locally Originated                    Redistribution or network statement favored over aggregate-address
4   AS_PATH                Shortest wins  Skipped if “bgp bestpath as-path ignore” configured; AS_SET counts as 1; CONFED parts do not count
5   ORIGIN                 Lowest wins    IGP < EGP < Incomplete
6   MED                    Lowest wins    MEDs are compared only if the first AS in the AS_SEQUENCE is the same
7   eBGP over iBGP
8   Metric to Next Hop     Lowest wins    IGP cost to the BGP NEXTHOP
9   Multiple Paths in RIB                 Flag path as “multipath” if max-paths is configured
10  Oldest External Wins                  Unless “bgp bestpath compare-routerid” is configured
11  BGP Router ID          Lowest
12  CLUSTER_LIST           Smallest       Shorter CLUSTER_LIST wins
13  Neighbor Address       Lowest         Lowest neighbor address

Best Path

Algorithm

show ip bgp x.x.x.x bestpath

Will show you only the bestpath for x.x.x.x

Handy if you have lots of paths for a prefix

R2#sh ip bgp 7.4.4.0/24 bestpath
BGP routing table entry for 7.4.4.0/24, version 2
Paths: (20 available, best #13, table Default-IP-Routing-Table)
Flag: 0x820
  Not advertised to any peer
  100
    192.150.6.11 from 192.150.6.11 (192.150.6.11)
      Origin IGP, metric 0, localpref 100, valid, external, best
R2#

show ip bgp x.x.x.x multipath

Same concept but will show you all of the multipaths for x.x.x.x

Best Path

Algorithm

IOS-XR has

sh ip bgp x.x.x.x bestpath-compare

Explains why the bestpath is the best
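
To make the idea of “explaining why” concrete, here is a toy Python comparison that walks a subset of the bestpath steps in order and reports the step that decided the outcome. It is a simplified sketch, not how any router implements bestpath: the `Path` fields and `compare` function are hypothetical, and the multipath, oldest-external, CLUSTER_LIST and neighbor-address steps, as well as AS_SET and confederation handling, are left out.

```python
import ipaddress
from dataclasses import dataclass

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "Incomplete": 2}


@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    local_origin: bool = False
    as_path: tuple = ()
    origin: str = "IGP"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def compare(a, b):
    """Return (better_path, reason) using a subset of the bestpath steps."""
    # Each key is arranged so that the lower value is the preferred one.
    steps = [
        ("higher weight", lambda p: -p.weight),
        ("higher LOCAL_PREFERENCE", lambda p: -p.local_pref),
        ("locally originated", lambda p: 0 if p.local_origin else 1),
        ("shorter AS_PATH", lambda p: len(p.as_path)),
        ("lower ORIGIN", lambda p: ORIGIN_RANK[p.origin]),
        ("lower MED", lambda p: p.med),
        ("eBGP over iBGP", lambda p: 0 if p.ebgp else 1),
        ("lower IGP metric to NEXTHOP", lambda p: p.igp_metric),
        ("lower BGP router ID", lambda p: int(ipaddress.ip_address(p.router_id))),
    ]
    for reason, key in steps:
        # MED is only compared when the first AS in the path is the same.
        if reason == "lower MED" and a.as_path[:1] != b.as_path[:1]:
            continue
        if key(a) != key(b):
            return (a if key(a) < key(b) else b), reason
    return a, "tie on every step modelled here"


p1 = Path("via R3", as_path=(200, 300), med=50)
p2 = Path("via R4", as_path=(200,))
winner, reason = compare(p1, p2)
print(winner.name, "-", reason)
```

Here the two paths tie on weight, LOCAL_PREFERENCE and local origination, so the shorter AS_PATH decides in favor of the second path.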

BGP Table Version

BGP Table Version

Lots of things must happen when bestpaths change

RIB must be notified

Peers must be informed

Must have a way to track who has been informed of which bestpath changes

Prefix Table Version

Each prefix has a 32 bit number that is its table version

A prefix’s table version is bumped for every bestpath change

Bumped means the table version changes from the current version to the next available version #.

Assume 10.0.0.0/8 has a table version of #27 and the highest table version used by any prefix is #30. If 10.0.0.0/8 has a bestpath change, its table version will be bumped to #31.

BGP Table Version

“show ip bgp x.x.x.x” will show you a prefix’s table version

R1#sh ip bgp 10.0.0.0
BGP routing table entry for 10.0.0.0/8, version 31
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x820
  Not advertised to any peer
  200
    2.2.2.2 from 2.2.2.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, external, best
R1#

BGP Table Version

RIB & Peer Table Versions

We have a table version for the RIB

Also have a table version for each peer

Used to keep track of which bestpath changes have been propagated to whom

If peer 1.1.1.1 has a table version of #60 this tells us we have informed 1.1.1.1 of all bestpath changes for prefixes with a table version of <= #60

If any prefix has a table version > #60 then we need to inform 1.1.1.1 of that prefix’s bestpath

Once 1.1.1.1 has been updated, its table version will be updated accordingly

Same concept for the RIB and its table version

BGP Table Version

“show ip bgp summary” is best for viewing RIB and peer version #s

R2#show ip bgp summ
BGP router identifier 2.2.2.2, local AS number 200
BGP table version is 13, main routing table version 13
3 network entries using 351 bytes of memory
3 path entries using 156 bytes of memory

Neighbor  V   AS MsgRcvd MsgSent  TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1   4  100    4386    4388      13    0    0 01:20:24        1
R2#

Highest table version of any prefix = “main routing table version”

RIB is converged

1.1.1.1 is converged

BGP Table Version

Example

Assume the highest table version of any prefix is #10

The RIB has a table version of #10

The RIB is up to date for all prefixes

All peers have a table version of #10

Our peers are currently converged

5 prefixes experience a bestpath change

Highest table version is now #15

Inform the RIB of these 5 changes

Do RIB adds, deletes, and/or modifies

When complete, set the RIB table version to #15

Inform our peers of these 5 changes

Build updates and/or withdraws for each peer

When complete, set our peers’ table versions to #15
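The bookkeeping in this example can be sketched in a few lines of Python. This is a toy model, not how any router actually implements it; the class name `BgpTable` and its methods are hypothetical and exist only to illustrate the idea of one version number per prefix, one for the RIB, and one per peer.

```python
class BgpTable:
    """Toy model of BGP table-version bookkeeping (illustrative only)."""

    def __init__(self, peers):
        self.table_version = 0        # highest version of any prefix
        self.prefix_version = {}      # prefix -> version of its last bestpath change
        self.rib_version = 0          # last version pushed to the RIB
        self.peer_version = {peer: 0 for peer in peers}

    def bestpath_change(self, prefix):
        """Every bestpath change bumps the global table version by one."""
        self.table_version += 1
        self.prefix_version[prefix] = self.table_version

    def pending(self, consumer_version):
        """Prefixes whose version is newer than what the consumer has seen."""
        return sorted(p for p, v in self.prefix_version.items()
                      if v > consumer_version)

    def update_rib(self):
        changed = self.pending(self.rib_version)
        self.rib_version = self.table_version
        return changed

    def update_peer(self, peer):
        changed = self.pending(self.peer_version[peer])
        self.peer_version[peer] = self.table_version
        return changed


table = BgpTable(peers=["1.1.1.1"])
for i in range(10):
    table.bestpath_change(f"10.0.{i}.0/24")
table.update_rib()
table.update_peer("1.1.1.1")            # RIB and peer converged at version 10

for i in range(5):                      # 5 prefixes experience a bestpath change
    table.bestpath_change(f"10.0.{i}.0/24")

print(table.table_version)              # 15
print(table.update_peer("1.1.1.1"))     # the 5 prefixes that changed
print(table.peer_version["1.1.1.1"])    # 15
```

Only the prefixes with a version above the consumer's own version are sent, which is why a peer or RIB version that lags the global version tells you work is still pending.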


BGP Table Version

Why am I babbling About This?

Gives you a way to know who has been informed about what

Provides a way to tell how many bestpath changes your network is experiencing

You have 150k routes and see the table version increase by 150k every minute…something is wrong!!

You have 150k routes and see the table version increase by 300 every minute…sounds like normal network churn

You should monitor the table version in your network to determine what is normal for you

If the table version is increasing rapidly then that could explain why “BGP Router” and “BGP IO” are busy


Initial Convergence


BGP Convergence

Hey, who are you calling slow?

Two general convergence situations

Initial startup

Periodic route changes


Convergence

Initial Startup

Initial convergence happens when:

A router boots

RP failover

clear ip bgp *

How long initial convergence takes depends on the amount of work to be done and the router/network’s ability to do it quickly and efficiently


Convergence

Initial Startup


Initial convergence can be stressful…if you are approaching BGP scalability limits this is when you will see issues.


Convergence

Initial Startup

What work needs to be done?

1) Accept routes from all peers

Not too difficult

2) Calculate bestpaths

This is easy

3) Install bestpaths in the RIB

Also fairly easy

4) Advertise bestpaths to all peers

This can be difficult and may take several minutes depending on the following variables…


Convergence

Key Variables

BGP Variables

The number of routes

The number of peers

The number of update-groups

The ability to advertise routes to each update-group efficiently

Router Variables

CPU horsepower

Code version

Outbound Interface Bandwidth


Convergence

UPDATE Packing

An UPDATE contains a set of Attributes and a list of prefixes (NLRI)

BGP starts an UPDATE by building an attribute set

BGP then packs as many destinations (NLRIs) as it can into the UPDATE

NLRI = Network Layer Reachability Information

Only NLRI with a matching attribute set can be placed in the UPDATE

NLRI are added to the UPDATE until it is full (4096 bytes max)

“UPDATE Packing” refers to how efficiently an implementation packs NLRIs into UPDATEs

Least efficient: BGP only puts one NLRI per UPDATE

Most efficient: BGP puts all NLRI with a certain Attribute set in one UPDATE

Least Efficient (three UPDATEs, one NLRI each)

MED 50, Origin IGP: 10.1.1.0/24

MED 50, Origin IGP: 10.1.2.0/24

MED 50, Origin IGP: 10.1.3.0/24

Most Efficient (one UPDATE)

MED 50, Origin IGP: 10.1.1.0/24, 10.1.2.0/24, 10.1.3.0/24
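As a rough illustration of UPDATE packing, the sketch below groups prefixes by attribute set and fills each UPDATE until it reaches the 4096-byte limit. It is a simplification under stated assumptions: `pack_updates` and `nlri_bytes` are hypothetical helper names, only IPv4 prefixes are handled, and every attribute set is assumed to encode to the same made-up size (`attr_bytes`).

```python
from collections import defaultdict

MAX_UPDATE_BYTES = 4096  # maximum UPDATE size quoted on the slide
FIXED_BYTES = 23         # 19-byte BGP header + two 2-byte length fields


def nlri_bytes(prefix):
    """Encoded size of an IPv4 NLRI: one length byte plus the prefix bytes."""
    prefix_len = int(prefix.split("/")[1])
    return 1 + (prefix_len + 7) // 8


def pack_updates(routes, attr_bytes=32):
    """Group (prefix, attribute_set) pairs into UPDATEs.

    attr_bytes is a hypothetical fixed encoded size for every attribute set.
    Returns a list of (attribute_set, [prefixes]) tuples, one per UPDATE.
    """
    by_attrs = defaultdict(list)
    for prefix, attrs in routes:
        by_attrs[attrs].append(prefix)

    room = MAX_UPDATE_BYTES - FIXED_BYTES - attr_bytes
    updates = []
    for attrs, prefixes in by_attrs.items():
        current, used = [], 0
        for prefix in prefixes:
            size = nlri_bytes(prefix)
            if current and used + size > room:
                updates.append((attrs, current))   # UPDATE is full, start a new one
                current, used = [], 0
            current.append(prefix)
            used += size
        updates.append((attrs, current))
    return updates


attrs = ("MED 50", "Origin IGP")
routes = [("10.1.1.0/24", attrs), ("10.1.2.0/24", attrs), ("10.1.3.0/24", attrs)]
for attr_set, prefixes in pack_updates(routes):
    print(attr_set, prefixes)
```

With a shared attribute set all three prefixes land in one UPDATE; give each route a different attribute set and the same function produces three UPDATEs, which is the "least efficient" case.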


Convergence

UPDATE Packing

The fewer attribute sets you have the better

More NLRI will share an attribute set

Fewer UPDATEs to converge

Things you can do to reduce attribute sets

next-hop-self for all iBGP sessions

Don’t accept/send communities you don’t need

Use cluster-id to put RRs in the same POP in a cluster

To see how many attribute sets you have

show ip bgp summary

190844 network entries using 21565372 bytes of memory
302705 path entries using 15740660 bytes of memory
57469/31045 BGP path/bestpath attribute entries using 6206652 bytes of memory


Convergence

TCP MSS – Max Segment Size

TCP MSS (max segment size) is also a factor in convergence times. The larger the MSS the fewer TCP packets it takes to transport the BGP updates. Fewer packets means less overhead and faster convergence.

[Figure: a BGP UPDATE made up of an attribute set followed by several NLRIs]

Default MSS

BGP UPDATE is split into two TCP packets, each with its own IP header and TCP header

Increased MSS

The entire BGP update can fit in one TCP packet


Convergence

TCP MSS – Max Segment Size

MSS – Max Segment Size

Limit on packet size for a TCP socket

536 bytes by default

Path MTU Discovery

Finds smallest MTU between R1 and R2

Subtract 40 bytes for TCP/IP overhead

Enabled by default for BGP

neighbor 2.2.2.2 transport path-mtu-discovery disable

To find the MSS

R1#sh ip bgp neighbors
BGP neighbor is 2.2.2.2, remote AS 3, external link
Datagrams (max data segment is 1460 bytes):
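To get a feel for why the MSS matters, the arithmetic can be sketched as follows. The function names are hypothetical, and the model is deliberately naive: it treats one maximum-size UPDATE in isolation, whereas real TCP is a byte stream and can carry parts of several UPDATEs in one segment.

```python
import math

TCP_IP_OVERHEAD = 40  # 20-byte IP header + 20-byte TCP header, no options


def mss_from_mtu(path_mtu):
    """MSS that Path MTU Discovery would arrive at for a given path MTU."""
    return path_mtu - TCP_IP_OVERHEAD


def segments_needed(message_bytes, mss):
    """Number of TCP segments needed to carry a message of the given size."""
    return math.ceil(message_bytes / mss)


print(mss_from_mtu(1500))            # 1460
print(segments_needed(4096, 536))    # 8
print(segments_needed(4096, 1460))   # 3
```

A full 4096-byte UPDATE takes eight segments at the 536-byte default but only three at 1460 bytes, so the same table is transported with far fewer packets and ACKs.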


Convergence

Update Groups

BGP must create updates based on the policies towards each peer

Peers with a common outbound policy are members of the same update-group

iBGP vs. eBGP

Outbound route-map, prefix-lists, etc.

UPDATEs are generated for one member of an update-group and then replicated to the other members

[Figure: Less Efficient – Two peers in different update-groups, the UPDATE (Attribute, NLRI, NLRI) is built twice. More Efficient – Two peers in the same update-group, the UPDATE (Attribute, NLRI, NLRI) is built once.]
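The grouping idea can be sketched as follows. This is only an illustration: the function `build_update_groups` and the policy keys (`type`, `route_map_out`, `prefix_list_out`) are hypothetical, and a real implementation compares many more outbound settings than these three.

```python
from collections import defaultdict


def build_update_groups(peers):
    """Group neighbors that share the same outbound policy.

    peers maps a neighbor address to a dict of (hypothetical) outbound
    policy settings. Neighbors with identical settings share one group,
    so an UPDATE is formatted once per group and replicated to members.
    """
    groups = defaultdict(list)
    for neighbor, policy in peers.items():
        key = (policy["type"],
               policy.get("route_map_out"),
               policy.get("prefix_list_out"))
        groups[key].append(neighbor)
    return dict(groups)


peers = {
    "10.0.0.1": {"type": "ibgp"},
    "10.0.0.2": {"type": "ibgp"},
    "192.0.2.1": {"type": "ebgp", "route_map_out": "TO-ISP"},
}
for policy, members in build_update_groups(peers).items():
    print(policy, members)
```

Here the two iBGP peers end up in one group and the eBGP peer in another, so UPDATEs are formatted twice rather than three times.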

Convergence

Dropping TCP Acks

Primarily an issue on RRs (Route Reflectors) with

One or two interfaces connecting to the core

Hundreds of RRCs (Route Reflector Clients)

RR sends out tons of UPDATES to RRCs

RRCs send TCP ACKs

RR core facing interface(s) receive huge wave of TCP ACKs

[Figure: the RR sends BGP UPDATEs to the RRCs; the RRCs send TCP ACKs back to the RR]


Convergence

Dropping TCP Acks

Interface input queue fills up…TCP ACKs are dropped

Each time a TCP packet is dropped, the session goes into slow start

It takes a good deal of time for a TCP session to come out of slow start

Increase the input queue

hold-queue 1000 in

If you still see drops increase to 4096


Convergence

How do You Know if BGP has Converged?

Watch the global table version

Increases by 1 for every bestpath change

In the lab: Table version stabilizes

In the real world: Reaches your “normal” rate of change

Watch peer InQ and OutQs

Wait for all InQ and OutQs to be empty

To list peers with non-empty queues

show ip bgp summ | e 0    0

Watch peer table versions

show ip bgp summ

If peer table version == global table version and InQ/OutQ empty, BGP has converged that peer
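The per-peer check above can be automated. The sketch below parses IOS-style "show ip bgp summary" text and reports whether the RIB and each peer have caught up with the global table version. The function name `convergence_report` is hypothetical, the sample text is the output shown earlier in this deck, and other platforms or software versions may format the columns differently.

```python
import re

SUMMARY = """\
BGP router identifier 2.2.2.2, local AS number 200
BGP table version is 13, main routing table version 13
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1         4   100    4386    4388       13    0    0 01:20:24        1
"""


def convergence_report(summary_text):
    """Return {'rib': bool, '<neighbor>': bool, ...} for IOS-style summary text."""
    match = re.search(
        r"BGP table version is (\d+), main routing table version (\d+)",
        summary_text)
    if match is None:
        raise ValueError("table version line not found")
    table_version, rib_version = int(match.group(1)), int(match.group(2))

    report = {"rib": rib_version == table_version}
    for line in summary_text.splitlines():
        fields = line.split()
        # Neighbor rows start with an IPv4 address and have at least 10 columns.
        if len(fields) >= 10 and re.fullmatch(r"\d+\.\d+\.\d+\.\d+", fields[0]):
            tbl_ver, in_q, out_q = (int(x) for x in fields[5:8])
            report[fields[0]] = (tbl_ver == table_version
                                 and in_q == 0 and out_q == 0)
    return report


print(convergence_report(SUMMARY))
```

A peer is reported as converged only when its TblVer matches the global table version and both InQ and OutQ are empty.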


Convergence

Initial Convergence Summary

Initial convergence time depends on the amount of work that needs to be done and the router/network’s ability to do it quickly and efficiently

Reduce the number of attribute sets in BGP

Use next-hop-self, don’t send communities you don’t need, etc.

Reduce the number of unique outbound policies towards all peers

Try to find a small set of common policies, rather than individualizing policies per peer

The fewer update-groups the better

MSS/PMTU

Efficient packaging of BGP messages in TCP

Stop TCP ACK drops

Increase interface input queues on RRs


Periodic Convergence


Convergence

Route Changes

There are 2 elements to route change convergence for BGP

Failure Detection

How long does it take to see the failure? (t0 to t1)

Convergence

How long does it take to process and propagate information about the failure? (t1 to t2)

[Timeline: Failure at t0, Process and Propagate from t1, Recovery at t2]

Convergence

Route Changes

Time to Detect Failure

Address Tracking Filter

Nexthop Tracking

Peer Down Detection

Time to Respond to Failure

MRAI – Min Route Advertisement Interval

Advertising the new information


Convergence

Address Tracking Filter

Quick ATF review…

ATF = Address Tracking Filter

ATF is a middle man between the RIB and RIB clients

BGP, OSPF, EIGRP, etc. are clients of the RIB

A client tells ATF what prefixes he is interested in

ATF tracks each prefix

Notify the client when the route to a registered prefix changes

Client is responsible for taking action based on ATF notification

Provides a scalable event driven model for dealing with RIB changes


Convergence

Nexthop Tracking

BGP nexthop tracking

Relies on ATF

Event driven convergence model

Register NEXTHOPs with ATF

10.1.1.3

10.1.1.5

[Figure: BGP registers its NEXTHOPs (10.1.1.3 and 10.1.1.5) with ATF, which sits between BGP and the RIB]

ATF filters out changes for 10.1.1.1/32, 10.1.1.2/32, and 10.1.1.4/32

BGP has not registered for these

Changes to 10.1.1.3/32 and 10.1.1.5/32 are passed along to BGP

Recompute bestpath for prefixes that use these NEXTHOPs

No need to wait for BGP Scanner

 
 
 
 
 
 
 

[Figure: the RIB contains 10.1.1.1/32, 10.1.1.2/32, 10.1.1.3/32, 10.1.1.4/32, and 10.1.1.5/32]
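The register-and-notify model can be sketched in Python. This is a conceptual toy, not router code: the class `AddressTrackingFilter` and its methods are hypothetical, and it matches registered prefixes exactly, whereas a real implementation tracks whichever route covers each registered address.

```python
class AddressTrackingFilter:
    """Toy event filter between the RIB and its clients (illustrative only)."""

    def __init__(self):
        self.registrations = {}   # prefix -> list of client callbacks

    def register(self, prefix, callback):
        """A client tells ATF which prefix it is interested in."""
        self.registrations.setdefault(prefix, []).append(callback)

    def rib_change(self, prefix, event):
        """Called for every RIB change; returns True if a client was notified."""
        callbacks = self.registrations.get(prefix, [])
        for callback in callbacks:
            callback(prefix, event)
        return bool(callbacks)


notified = []                      # what BGP actually hears about
atf = AddressTrackingFilter()
for nexthop in ("10.1.1.3/32", "10.1.1.5/32"):
    atf.register(nexthop, lambda prefix, event: notified.append((prefix, event)))

for host in range(1, 6):           # all five RIB routes change
    atf.rib_change(f"10.1.1.{host}/32", "metric changed")

print(notified)
```

Only the two registered NEXTHOPs reach BGP; the changes for 10.1.1.1/32, 10.1.1.2/32, and 10.1.1.4/32 are filtered out, so BGP recomputes bestpaths only for prefixes that use an affected NEXTHOP.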


Convergence

Nexthop Tracking

Enabled by default

[no] bgp nexthop trigger enable

BGP registers all nexthops with ATF

show ip bgp attr next-hop ribfilter

Trigger delay is configurable

bgp nexthop trigger delay <0-100>

5 seconds by default

Debugs

debug ip bgp events nexthop

debug ip bgp rib-filter


Convergence

Peer Down Detection

BGP must learn that the peer is down

Default keepalive/holdtime values are 60 seconds and 180 seconds

My 2c….use 3 second KA with 9 second holdtime

Tune your IGP to converge in under 9 seconds

Use BFD (bidirectional forwarding detection) if you need to be more aggressive

eBGP directly connected

bgp fast-external-fallover

If the interface goes down so does the eBGP peer

Reduce carrier-delay settings

0 msec for down

100 msec for up

eBGP multihop

Relies on holdtime or BFD


Convergence

Peer Down Detection

iBGP peers

Relies on holdtime or BFD

BFD on iBGP peers

Know how fast your IGP converges!

Your BFD dead timer must be greater than that amount

iBGP peer down detection isn’t as critical as eBGP. Why?

IGP should be tuned to converge quickly

Fast IGP + BGP Nexthop Tracking = BGP reacts quickly to nexthop changes

BGP can route around a change in the core prior to bringing down iBGP peer(s)


Convergence

Fast Session Deactivation

Fast Session Deactivation

neighbor x.x.x.x fall-over

Register peer's address with ATF

ATF informs BGP of routing changes to the peer

When we lose our route to the peer, bring the peer down.

No need to wait for holdtime to expire

Primary use case is eBGP multihop

[Figure: Multihop eBGP. #1 – Link 1 fails, #2 – Link 2 fails, #3 – FSD takes down peer]

Convergence

Fast Session Deactivation

Very dangerous for iBGP peers

IGP may not have a route to a peer for a split second

FSD would tear down the BGP session

Imagine if you lose your IGP route to your RR (Route Reflector) for just 100ms

Every RR to RRC session would flap

Off by default

neighbor x.x.x.x fall-over


Convergence

FSD vs. BFD

Why do we have both?

FSD was developed first

Goal was fast BGP neighbor detection without expense of fast keepalives

BFD came later

Fast keepalives not as much of a concern

Goal was fast neighbor detection for multiple protocols

BFD KAs are generated by linecards

CPUs are also much faster today

FSD

Relies on control plane (absence of a route in the RIB) to tear down the peer

We could have a route but not have connectivity

BFD

Relies on forwarding plane to detect down peer

If we lose connectivity, the peer comes down


Convergence

MRAI (Minimum Route Advertisement Interval)

How is the timer enforced for peer X?

Timer starts when all routes have been advertised to X

For the next MRAI (seconds) we will not propagate any bestpath changes to peer X

Once X’s MRAI timer expires, send him updates and withdraws

Restart the timer and the process repeats…

User may see a wave of updates and withdraws to peer X every MRAI seconds

User will NOT see a delay of MRAI between each individual update and/or withdraw

BGP would never converge if this were the case


Convergence

MRAI

MRAI timeline for BGP peer w/ MRAI of 5 seconds

T0

The big bang

T7

Bestpath Change #1

UPDATE sent immediately

MRAI timer starts, will expire at T12

T10

Bestpath Change #2

Must wait until T12 for MRAI to expire

T12

MRAI expires

Bestpath Change #2 is Txed

MRAI timer starts, will expire at T17

T17

MRAI expires

No pending UPDATEs
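The timeline above can be reproduced with a small simulation. It is a simplified model of the behavior described on these slides for a single peer, not an implementation of any router's timer logic; the function name `mrai_schedule` is hypothetical.

```python
def mrai_schedule(change_times, mrai):
    """Return (change_time, send_time) for each bestpath change toward one peer.

    Model: a change is sent immediately when no MRAI timer is running,
    otherwise it waits for the running timer to expire. Sending (re)starts
    the timer. Changes that wait for the same expiry go out together.
    """
    sent = []
    timer_expires = float("-inf")   # no timer running yet
    batch_queued = False            # True if changes are waiting for timer_expires
    for t in sorted(change_times):
        if batch_queued and t >= timer_expires:
            # The queued batch went out at expiry, which restarted the timer.
            timer_expires += mrai
            batch_queued = False
        if t < timer_expires:
            sent.append((t, timer_expires))
            batch_queued = True
        else:
            sent.append((t, t))
            timer_expires = t + mrai
    return sent


# Bestpath changes at T7 and T10 with an MRAI of 5 seconds
print(mrai_schedule([7, 10], mrai=5))   # [(7, 7), (10, 12)]
```

Change #1 goes out at T7 and change #2 is held until T12, matching the timeline. With `mrai=0` every change is sent at the moment it happens.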


Convergence

MRAI

BGP is not a link state protocol, it is path vector

May take several rounds/cycles of exchanging updates and withdraws for the network to converge

MRAI must expire between each round!

The more fully meshed the network and the more tiers of ASes, the more rounds required for convergence

Think about

How many tiers of ASes there are in the Internet

How meshy peering can be in the Internet


Convergence

MRAI

Internet churn means we are constantly setting and waiting on MRAI timers

One flapping prefix slows convergence for all prefixes

Internet table sees roughly 6 bestpath changes per second

For iBGP and PE-CE eBGP peers

neighbor x.x.x.x advertisement-interval 0

Has been the default since 12.0(32)S

For regular eBGP peers

Default is 30 seconds

Lowering to 0 may get you dampened

OK to lower for eBGP peers if they are not using dampening


Convergence

MRAI

Will a MRAI of 0 eliminate batching?

Somewhat but not much happens anyway

TCP, the operating system, and BGP code provide some batching

Process all messages from peer InQs

Calculate bestpaths based on received messages

Format UPDATEs to advertise new bestpaths

What about CPU load from 0 second MRAI?

Internet table has ~6 bestpath changes per second

Remember the stress of initial convergence?

6 bestpath changes per second is easy


High CPU Utilization


“High Utilization”

Router#show process cpu
CPU utilization for five seconds: 100%/0%; one minute: 99%; five minutes: 81%
 139     6795740   1020252       6660 88.34% 91.63% 74.01%   0 BGP Router

Define “High”

Know what normal CPU utilization is for the router in question

Is the CPU spiking due to “BGP Scanner” or is it constant?

Look at the scenario

Is BGP going through “Initial Convergence”?

If not then route churn is the usual culprit

Illegal recursive lookup or some other factor causes bestpath changes for the entire table


“High Utilization”

How to identify route churn?

Do “sh ip bgp summary”, note the table version

Wait 60 seconds

Do “sh ip bgp summary”, compare the table version from 60 seconds ago

You have 150k routes and see the table version increase by 300

This is probably normal route churn

Know how many bestpath changes you normally see per minute

You have 150k routes and see the table version increase by 150k

This is bad and is the cause of your high CPU
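The comparison above is simple arithmetic, sketched here in Python. The function names are hypothetical, and the code deliberately applies no fixed threshold: what counts as "too much" depends on the baseline you have measured for your own network.

```python
def churn_per_minute(version_before, version_after, seconds=60):
    """Bestpath changes per minute from two table-version samples."""
    return (version_after - version_before) * 60 / seconds


def churn_ratio(changes_per_minute, total_routes):
    """Fraction of the table that changed bestpath in one minute."""
    return changes_per_minute / total_routes


total_routes = 150_000

normal = churn_per_minute(1_000_000, 1_000_300)
print(normal, churn_ratio(normal, total_routes))    # 300.0 0.002

bad = churn_per_minute(1_000_000, 1_150_000)
print(bad, churn_ratio(bad, total_routes))          # 150000.0 1.0
```

A ratio near 1.0 means the equivalent of the whole table changed bestpath within a minute, which is the "something is wrong" case; 0.002 is in the range the slides call normal churn.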


“High Utilization”

What causes massive table version changes?

Flapping peers

Hold-timer expiring?

Corrupt UPDATE?

Route churn

Don’t try to troubleshoot the entire BGP table at once

Identify one prefix that is churning and troubleshoot that one prefix

Will likely fix the problem with the rest of the BGP table churn


“High Utilization”

Table Version Changing Rapidly: A Little Lab Fun

RP/0/RP0/CPU0:XR#sh route | include 00:00:
Wed Apr 27 13:53:40.201 EDT
O    1.0.0.0/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1
O    1.0.0.4/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1
O    1.0.0.8/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1
O    1.0.0.12/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1

< 4 seconds later

RP/0/RP0/CPU0:XR#sh route | include 00:00:
Wed Apr 27 13:53:44.162 EDT
B    1.0.0.0/30 [20/2] via 1.1.1.1, 00:00:01
B    1.0.0.4/30 [20/2] via 1.1.1.1, 00:00:01
B    1.0.0.8/30 [20/2] via 1.1.1.1, 00:00:01
B    1.0.0.12/30 [20/2] via 1.1.1.1, 00:00:01

“High Utilization”

Table Version Changing Rapidly: A Little Lab Fun

RP/0/RP0/CPU0:aggies#sh ip bgp 1.0.0.4
Wed Apr 27 14:00:36.066 EDT
Last Modified: Apr 27 14:00:35.387 for 00:00:00
Paths: (1 available, no best path)
  100
    1.1.1.1 (inaccessible) from 1.1.1.1 (1.1.1.1)

3 seconds later

1.1.1.1 (NH) flapping

RP/0/RP0/CPU0:aggies#sh ip bgp 1.0.0.4
Wed Apr 27 14:00:38.710 EDT
Last Modified: Apr 27 14:00:38.387 for 00:00:00
Paths: (1 available, no best path)
    1.1.1.1 (metric 2) from 1.1.1.1 (1.1.1.1)


“High Utilization”

Something is wrong with NEXTHOP 1.1.1.1

Flip flops between inaccessible and “accessible with an IGP cost of 2”

Troubleshoot 1.1.1.1 and the churning will stop


Layer 3 VPNs


Layer 3 VPNs

Troubleshooting Checklist

#1 PE1 to PE2 core connectivity

Verify you can ping from loopback to loopback

Verify you can mpls ping from loopback to loopback

PE loopbacks must be /32

Check IGP

Check LDP

[Figure: CE1, PE1, PE2, CE2 in a row, with check #1 between PE1 and PE2 and check #2 on each PE to CE link]

#2 PE1 to CE1 and PE2 to CE2 connectivity

Can each PE ping their directly connected CE?

Remember to do “ping vrf FOO x.x.x.x”


Layer 3 VPNs

#3 PE to PE vrf connectivity

Can PEs ping the vrf interface of the other PE?

If not double check your import/export Route Targets

#4 PE to CE connectivity

Verify each PE can ping the CE connected to the other PE

#5 CE to CE connectivity

At this point you should be able to ping CE to CE

[Figure: CE1, PE1, PE2, CE2 in a row, showing check #3 (PE to PE), check #4 (each PE to the remote CE), and check #5 (CE to CE)]


Looking Glasses


The Internet

BGP Looking Glasses

You are advertising your address space to your ISPs

Q: How can you verify they are receiving it?

Q: How can you verify the rest of the Internet is receiving it?

A: BGP Looking Glasses


“BGP Looking Glass servers are computers on the Internet running one of a variety of publicly available Looking Glass software implementations. A Looking Glass server (or LG server) is accessed remotely for the purpose of viewing routing info. Essentially, the server acts as a limited, read-only portal to routers of whatever organization is running the Looking Glass server. Typically, publicly accessible looking glass servers are run by ISPs or NOCs”

http://www.bgp4.as/looking-glasses

The Internet

BGP Looking Glasses

https://www.sprint.net/lg/
Show bgp route 72.163.4.161
72.163.0.0/20
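When a looking glass returns a covering prefix rather than the exact address you queried, it is easy to confirm that the address really falls inside it. A minimal check using Python's standard `ipaddress` module, with the address and prefix taken from this slide:

```python
import ipaddress

address = ipaddress.ip_address("72.163.4.161")   # address that was looked up
prefix = ipaddress.ip_network("72.163.0.0/20")   # prefix the looking glass returned

print(address in prefix)          # True
print(prefix.num_addresses)       # 4096
```

The /20 spans 72.163.0.0 through 72.163.15.255, so 72.163.4.161 is covered by that advertisement.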


The Internet

BGP Looking Glasses

host$ nslookup www.cisco.com
Address: 72.163.4.161
host$

http://whois.arin.net/ui


The Internet

BGP Looking Glasses

Huge list of looking glasses here

http://www.bgp4.as/looking-glasses


The Internet

BGP Looking Glasses

The Level3 looking glass will translate AS #s to company names

AS-PATH: 3549 6327

AS-PATH Translation: GBLX SHAWFIBER


The Internet

Whose AS is That Anyway?

Long list here

http://bgp.potaroo.net/cidr/autnums.html

Or lookup a specific AS

http://whois.arin.net/rest/asn/AS1239/pft


“The University's Route Views project was originally conceived as a tool for Internet operators to obtain real-time information about the global routing system from the perspectives of several different backbones and locations around the Internet. Although other tools handle related tasks, such as the various Looking Glass Collections (see e.g. NANOG, or the DTI NSPIXP-2 Looking Glass), they typically either provide only a constrained view of the routing system (e.g., either a single provider, or the route server) or they do not provide real-time access to routing data.

While the Route Views project was originally motivated by interest on the part of operators in determining how the global routing system viewed their prefixes and/or AS space, there have been many other interesting uses of this Route Views data. For example, NLANR has used Route Views data for AS path visualization (see also NLANR), and to study IPv4 address space utilization (archive). Others have used Route Views data to map IP addresses to origin AS for various topological studies. CAIDA has used it in conjunction with the NetGeo database in generating geographic locations for hosts, functionality that both CoralReef and the Skitter project support.”

University of Oregon Route Views Project http://www.routeviews.org/
