
Introduction

Housekeeping

Cell Phones

Who am I?

Who are you?

Service Provider

Enterprise

Studying for CCIE

“Advanced” Class – Assume BGP Operational Experience

Basic configuration

Show commands

Understand BGP attributes

BRKRST-3320

© 2013 Cisco and/or its affiliates. All rights reserved.

Cisco Public


Introduction Operating Systems

IOS vs. IOS-XR vs. NX-OS

Troubleshooting concepts are the same

Some variation in show command syntax and output

Will use all three in this presentation


Introduction

Agenda

Generic Troubleshooting Advice

Troubleshooting Peers

Bestpath Algorithm

Table Version

Initial Convergence

Periodic Convergence

High Utilization

Layer 3 VPNs

Looking Glasses


Generic Troubleshooting Advice

Narrow down the problem

Can you reproduce it?

Which device(s) are the cause of the problem?

Reduce your configs

Troubleshoot one thing at a time

100k routes flapping? Pick one route and focus on that one route

Have a co-worker take a look

Forces you to talk through the problem

Different set of eyes may spot something

Sniffer capture, sniffer capture, sniffer capture


Generic Troubleshooting Advice Syslogs

Use NTP to sync timestamps on your routers

clock timezone EST -5 0

clock summer-time EDT recurring

ntp server x.x.x.x

Use a syslog server

logging monitor informational

logging host x.x.x.x

service timestamps log datetime msec localtime


Generic Troubleshooting Advice Syslogs


Centralized/Timesynced syslogs are a great troubleshooting tool


Generic Troubleshooting Advice log-neighbor-changes


bgp log-neighbor-changes

Generates a syslog message when a peer goes up or down

Always configure this

OSPF, ISIS, and EIGRP all have log-neighbor-changes too


Generic Troubleshooting Advice Define “Normal”

“The CPU on this router is high”

High compared to what?

What is the CPU load normally at this time of day?

Things to keep track of

CPU load

Free Memory

Largest block of memory

Input/Output load for interfaces

Rate of BGP bestpath changes

Etc, etc


Generic Troubleshooting Advice Define “Normal”

Cacti is a handy tool for polling and graphing data from various network devices

http://www.cacti.net/


Generic Troubleshooting Advice Sniffer Captures

Use SPAN to get traffic to your sniffer

monitor session 1 source interface Te2/4 rx

monitor session 1 destination interface Te2/2

IOS-XR

Only supported on ASR-9000

Use ACLs to control what packets to SPAN

RSPAN

“RSPAN has all the features of SPAN, plus support for source ports and destination ports that are distributed across multiple switches, allowing one to monitor any destination port located on the RSPAN VLAN. Hence, one can monitor the traffic on one switch using a device on another switch.”


Generic Troubleshooting Advice Embedded Packet Capture

Ability to capture packets on the router

Primarily for control-plane traffic

Difficult to capture transit traffic on distributed platforms

Is supported on some platforms

Very handy if a dedicated sniffer is not available

Available on IOS and NX-OS


Generic Troubleshooting Advice IOS Embedded Packet Capture

Create a buffer

monitor capture buffer buf1 size 512 max-size 512 circular

Define which interface and direction to capture

monitor capture point ip cef dwalton-cap gig 0/0 in

Associate the buffer with the capture

monitor capture point associate dwalton-cap buf1

Start/Stop the capture

monitor capture point start dwalton-cap

monitor capture point stop dwalton-cap

Export the capture to a .pcap file

monitor capture buffer buf1 export tftp://172.26.2.254/buf1.pcap


Generic Troubleshooting Advice Wireshark

You probably know this already but…

Wireshark is your best friend

It is free

You can get it here


http://www.wireshark.org/



Generic Troubleshooting Advice Wireshark

Can do complex filters

ANDs, ORs, ()s, etc

If the filter is red, your syntax is busted


If the filter is green, your syntax is correct


Generic Troubleshooting Advice Wireshark


Wireshark does a LOT

Enough for someone to write an 800 page book on how to use it

ISBN-13: 978-1893939998


Generic Troubleshooting Advice Debugs

Send output to the logging buffer, not the console

logging buffered <size>

no logging console

Use millisecond timestamps

service timestamps debug datetime msec localtime

service timestamps log datetime msec localtime

Use ACLs to limit output

brain1(config)#access-list 100 permit ip host 1.1.1.1 host 2.2.2.2

brain1#debug ip packet 100

IP packet debugging is on for access list 100

brain1#

If you need to enable a very chatty debug, schedule a reload first as a safety net

reload in 10

Run your debug

reload cancel


Generic Troubleshooting Advice Event Tracing

Collects event information for various protocols

Runs in the background

Events are stored in memory

Debug output is not generated

Syslogs are not generated

Finite number of most recent events are stored

Use show commands later to

Display an event in a “debug like” format

Merge events from various protocols

Easier on the box than debugs


Generic Troubleshooting Advice Event Tracing

brain1(config)#monitor event-trace ?
  adjacency   Adjacency Events
  all-traces  Configure merged event traces
  atom        AToM Event Trace
  cef         CEF traces
[snip]
brain1(config)#monitor event-trace adjacency enable
brain1(config)#end


Generic Troubleshooting Advice Event Tracing

brain1#show monitor event-trace adjacency all

Feb 14 17:15:48.270: GLOBAL: adj mgr notified of fibidb state change int FastEthernet0/0 to down [OK]

Feb 14 17:15:50.958: GLOBAL: adj mgr notified of fibidb state change int FastEthernet0/0 to up [OK]

Feb 14 17:15:51.682: GLOBAL: adj ipv4 bundle changed to IPv4 no fixup adj oce [OK]

Feb 14 17:15:51.682: ADJ: IP 172.26.38.1 FastEthernet0/0/0: update oce bundle, IPv4 incomplete adj oce [OK]

Feb 14 17:15:51.682: ADJ: IP 172.26.38.1 FastEthernet0/0/0: allocate [OK]

Feb 14 17:15:51.686: ADJ: IP 172.26.38.1 FastEthernet0/0/0: request resolution [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: request to add ARP [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: allocate [Ignr]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: add source ARP [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: request to update [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: update oce bundle, IPv4 no fixup adj oce [OK]

Feb 14 17:15:51.734: ADJ: IP 172.26.38.1 FastEthernet0/0/0: update [OK]

brain1#


Generic Troubleshooting Advice Out Of Band Access


Don’t be the person who has to drive 3 hours to console into a box

If you don’t have out of band access for every router and/or switch in your network….get it….please


Troubleshooting Peers

Failed Peering Configurations

Check

AS Numbers

IP addresses for TCP

eBGP Multihop?

R1

interface Loop0
 ip address 1.1.1.1/32
!
router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source Loop0

R2

interface Loop0
 ip address 2.2.2.2/32
!
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loop0

R1#sh tcp brief all
TCB       Local Address  Foreign Address  (state)
64328548  *.179          2.2.2.2.*        LISTEN
…
R1#

Failed Peering Connectivity

Check

Extended ping between BGP peering addresses

R1#ping 2.2.2.2 source Loop0
Sending 5, 100-byte ICMP Echos to 2.2.2.2
Packet sent with a source address of 1.1.1.1
Success rate is 0 percent (0/5)
R1#

R1

interface Loop0
 ip address 1.1.1.1/32
!
router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source Loop0

R2

interface Loop0
 ip address 2.2.2.2/32
!
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loop0

Failed Peering Connectivity


BGP runs on top of IP and can be affected by many things

No connectivity?

IGP issues

Access Lists

TCP problems

Peers come up but flap, are slow, etc

MTU issues – use extended ping to sweep a range of packet sizes with the DF bit set, etc

Rate limiting

Traffic shaping

Debugs may be needed


Failed Peering Notifications

BGP NOTIFICATIONs consist of an error code, subcode and data

All Error Codes and Subcodes can be found here

http://www.iana.org/assignments/bgp-parameters/bgp-parameters.xml

http://tinyurl.com/bgp-notification-codes

Data portion may contain what triggered the notification

Example: corrupt part of the UPDATE

Pay attention to who sent vs. received the NOTIFICATION

If Router X sent the NOTIFICATION, it means he noticed the issue

Does not mean Router X is the cause of the issue


Failed Peering Notifications

%BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 2/2 (peer in wrong AS) 2 bytes 00C8 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002D 0104 00C8 00B4 0202 0202 1002 0601 0400 0100 0102 0280 0002 0202 00

Value  Name                        Reference
1      Message Header Error        RFC 4271
2      OPEN Message Error          RFC 4271
3      UPDATE Message Error        RFC 4271
4      Hold Timer Expired          RFC 4271
5      Finite State Machine Error  RFC 4271
6      Cease                       RFC 4271

The first 2 in “2/2” is the Error Code….so “OPEN Message Error”


Failed Peering Notifications

Subcode #  Subcode Name                    Subcode Description
1          Unsupported BGP version         The version of BGP the peer is running isn’t compatible with the local version of BGP
2          Bad Peer AS                     The AS this peer is locally configured for doesn’t match the AS the peer is advertising
3          Bad BGP Identifier              The BGP router ID is the same as the local BGP router ID
4          Unsupported Optional Parameter  There is an option in the packet which the local BGP speaker doesn’t recognize
6          Unacceptable Hold Time          The remote BGP peer has requested a BGP hold time which is not allowed (too low)
7          Unsupported Capability          The peer has asked for support for a feature which the local router does not support

OPEN Message Subcodes shown above

The second 2 in “2/2” is the Error Subcode….so “Bad Peer AS”
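
To make the “code/subcode” reading concrete, here is a minimal Python sketch that pulls the direction, neighbor, and code/subcode pair out of a %BGP-3-NOTIFICATION syslog line and looks them up in the two tables above. The function name decode_notification and the dictionary layout are illustrative assumptions, and only OPEN Message subcodes are decoded; the IANA registry remains the authoritative list.

```python
import re

# Error codes and OPEN Message subcodes, copied from the tables above.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}
OPEN_SUBCODES = {
    1: "Unsupported BGP version",
    2: "Bad Peer AS",
    3: "Bad BGP Identifier",
    4: "Unsupported Optional Parameter",
    6: "Unacceptable Hold Time",
    7: "Unsupported Capability",
}

# Matches e.g. "%BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 2/2 (...)"
NOTIFICATION_RE = re.compile(
    r"%BGP-3-NOTIFICATION: (sent to|received from) neighbor (\S+) (\d+)/(\d+)"
)


def decode_notification(line):
    """Hypothetical helper: return a dict describing the NOTIFICATION, or None."""
    match = NOTIFICATION_RE.search(line)
    if match is None:
        return None
    code, subcode = int(match.group(3)), int(match.group(4))
    if code == 2:
        subcode_name = OPEN_SUBCODES.get(subcode, "unknown")
    else:
        subcode_name = "not decoded in this sketch"
    return {
        "direction": match.group(1),   # who noticed the problem
        "neighbor": match.group(2),
        "error": ERROR_CODES.get(code, "unknown"),
        "subcode": subcode_name,
    }


line = "%BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 2/2 (peer in wrong AS) 2 bytes 00C8"
print(decode_notification(line))
```

The “direction” field matters as much as the code: “sent to” means the local router noticed the problem, not that it caused it.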


Failed Peering Notifications

[Figure: R1 (10.1.2.1) connected to R2 (10.1.2.2)]

R2#show log | include NOTIFICATION
%BGP-3-NOTIFICATION: sent to neighbor 10.1.2.1 2/2 (peer in wrong AS) 2 bytes 0064 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002D 0104 0064 00B4 0101 0101 1002 0601 0400 0100 0102 0280 0002 0202 00

x0064 = “data” of NOTIFICATION

x0064 = decimal 100

[Figure: Sniff of BGP Notification Sent from R2 to R1]

Failed Peering Notifications

Question: What did R1 see?

R1#sh log | include NOTIFICATION
%BGP-3-NOTIFICATION: received from neighbor 10.1.2.2 2/2 (peer in wrong AS) 2 bytes 0064

R1 (10.1.2.1)

router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.1.2.2 remote-as 200
 no auto-summary

R2 (10.1.2.2)

router bgp 200
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.1.2.1 remote-as 10
 no auto-summary

Failed Peering Decoding Hex

What if a peer sends you a message that causes you to send a NOTIFICATION?

Corrupt UPDATE

Bad OPEN message, etc

View the message that triggered the NOTIFICATION

show ip bgp neighbor 1.1.1.1 | begin Last reset

Last reset 5d12h, due to BGP Notification sent, invalid or corrupt AS path

Message received that caused BGP to send a Notification:

FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF

005C0200 00004140 01010040 0206065D

1CFC059F 400304D5 8C20F480 04040000

05054005 04000000 55C0081C 329C4844

329C6E28 329C6E29 58F50082 58F5EACE

58F5FA02 58F5FA6E 18D14E70
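
Before reaching for Wireshark, the fixed 19-byte BGP header at the front of this dump can be read by hand or with a few lines of Python. The sketch below is only an illustration (parse_bgp_header is a made-up helper name); it relies on the RFC 4271 header layout of a 16-byte marker, a 2-byte length, and a 1-byte type.

```python
import struct

# Message type values from RFC 4271.
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

# Hex dump copied from the "Message received that caused BGP to send
# a Notification" output above.
HEX_DUMP = """
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
005C0200 00004140 01010040 0206065D
1CFC059F 400304D5 8C20F480 04040000
05054005 04000000 55C0081C 329C4844
329C6E28 329C6E29 58F50082 58F5EACE
58F5FA02 58F5FA6E 18D14E70
"""


def parse_bgp_header(hex_dump):
    """Hypothetical helper: decode marker, length and type from a hex dump."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    # 16-byte marker, 2-byte length, 1-byte type, network byte order.
    marker, length, msg_type = struct.unpack("!16sHB", raw[:19])
    return {
        "marker_ok": marker == b"\xff" * 16,
        "length": length,
        "type": MESSAGE_TYPES.get(msg_type, "unknown"),
        "bytes_captured": len(raw),
    }


print(parse_bgp_header(HEX_DUMP))
```

For this dump the header says the message is an UPDATE whose declared length matches the number of bytes shown, so the whole offending message was captured.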


Failed Peering Decoding Hex

You don’t like reading hex?

Nice write-up here on converting hex output to wireshark .pcap file

http://ccie-in-3-months.blogspot.com/2010/08/decoding-ripe-experiment.html

http://tinyurl.com/bgp-hex-decode

In a nutshell, put the hex dump in this format



Failed Peering Decoding Hex

Now use Wireshark’s text2pcap.exe to add the needed headers


Open bgp_message.pcap with Wireshark
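
The reformatting step can be scripted. The Python sketch below is an assumption-laden illustration, not the method from the linked write-up: it turns the router’s space-separated hex words into an offset-prefixed, one-byte-per-column dump (the “od -Ax -tx1” style that text2pcap is documented to read). The helper name to_text2pcap_dump and the file name in the comment are hypothetical, and the text2pcap options should be checked against “text2pcap -h” on your version.

```python
def to_text2pcap_dump(hex_dump, bytes_per_line=16):
    """Hypothetical helper: convert router hex words into an offset-prefixed dump."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    lines = []
    for offset in range(0, len(raw), bytes_per_line):
        chunk = raw[offset:offset + bytes_per_line]
        # Six-digit hex offset, then each byte as two hex digits.
        lines.append(f"{offset:06x} " + " ".join(f"{b:02x}" for b in chunk))
    return "\n".join(lines)


# Assumed follow-up on the command line (verify the flags locally):
#   text2pcap -T 179,179 bgp_message.txt bgp_message.pcap
print(to_text2pcap_dump("FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF 005C0200"))
```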


Troubleshooting Peers eBGP TTL

BGP uses a TTL of 1 for eBGP peers

Also verifies if NEXTHOP is directly connected

For eBGP peers that are more than 1 hop away a larger TTL must be used

No longer verifies if NEXTHOP is directly connected

neighbor x.x.x.x ebgp-multihop [2-255]

[Figure: R1 (AS65000) and R2 (AS65001), labeled “Default TTL” and “Configured TTL”]

Troubleshooting Peers eBGP TTL

Loopback peering to directly connected eBGP peer

Typically used to load-balance over multiple links

Two options for configuring this…

Option #1 – The old way

Use ebgp-multihop

Change the TTL to 2

Disables the “is the NEXTHOP on a connected subnet” check

R1#
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 ebgp-multihop 2
 neighbor 2.2.2.2 update-source Loopback0
 no auto-summary

[Figure: R1 and R2, multihop eBGP session between loopbacks]

Troubleshooting Peers eBGP TTL

Option #2 – The new way

Use disable-connected-check

Still uses a TTL of 1

Disables the “is the NEXTHOP on a connected subnet” check

R1#
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 disable-connected-check
 neighbor 2.2.2.2 update-source Loopback0
 no auto-summary

[Figure: R1 and R2, multihop eBGP session between loopbacks]

Failed Peering Notifications – Hold Time Expired

Assume R1 sends hold time expired NOTIFICATION to R2

R1 did not receive a keepalive (KA) from R2 for holdtime seconds

One of two issues

R2 is not generating keepalives

R2 is generating keepalives but R1 is not receiving them

First figure out if R2 is building keepalives

Is R2 out of memory or CPU?

Output drops on the outbound interface towards R1?

When did R2 last build a keepalive?

R2#show ip bgp neighbors 1.1.1.1

Last read 00:00:15, last write 00:00:44, hold time is 180,

keepalive interval is 60 seconds

Is the TCP window open?

show ip bgp summary

Watch R2’s MsgSent counter for R1….does it increment?


Failed Peering Notifications – Hold Time Expired

Assuming R2 is sending keepalives, why isn’t R1 receiving them?

Input drops on R1

Lost in transit?

Do R1 and R2 still have IP connectivity?

Ping using peering addresses (loopback to loopback)

Ping with mss (max-segment-size) with df-bit set

MSS – Max Segment Size

536 bytes by default

Path MTU Discovery finds smallest MTU between R1 and R2

Subtract 40 bytes for TCP/IP overhead

Note the MSS and ping accordingly

R1#sh ip bgp neighbors
BGP neighbor is 2.2.2.2, remote AS 2, external link
Datagrams (max data segment is 1460 bytes):

R1# ping 2.2.2.2 source loop0 size 1500 df-bit
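
The arithmetic behind “note the MSS and ping accordingly” is small enough to sketch in Python. This assumes plain IPv4 and TCP headers with no options (20 + 20 bytes) and assumes the ping size counts the whole IP datagram, which is consistent with the 1460-byte MSS and 1500-byte ping shown above; the function names are illustrative.

```python
TCP_IP_OVERHEAD = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def mss_from_mtu(path_mtu):
    """MSS that Path MTU Discovery should arrive at for a given path MTU."""
    return path_mtu - TCP_IP_OVERHEAD


def ping_size_for_mss(mss):
    """Datagram size to use with df-bit to exercise a full-sized BGP segment."""
    return mss + TCP_IP_OVERHEAD


print(mss_from_mtu(1500))        # MSS for a 1500-byte path MTU
print(ping_size_for_mss(1460))   # ping size that matches a 1460-byte MSS
print(ping_size_for_mss(536))    # ping size that matches the default MSS
```

If the ping at the computed size fails with the DF bit set while smaller pings succeed, something in the path has a smaller MTU than the session negotiated.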


Failed Peering Notifications – Hold Time Expired

show ip bgp summary

Watch R1’s MsgRcvd counter for R2…it should be incrementing

When did R1 last receive keepalive?

R1#show ip bgp neighbors 2.2.2.2
Last read 00:00:95, last write 00:00:44, hold time is 180,
keepalive interval is 60 seconds


Speaker Flap Case Study

[Figure: NOTIFICATION between R1 and R2]

%BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
%BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired)

R2#show ip bgp neighbor 1.1.1.1 | include Last reset
Last reset 00:01:02, due to BGP Notification sent, hold time expired

There are lots of possibilities here

Is R1 having a problem sending keepalives?

CPU at 100%?

Out of memory?

Are the keepalives being lost in the cloud?

Is R2 having a problem receiving the keepalive?


Speaker Flap Case Study

Did R1 build and transmit a keepalive for R2?

debug ip bgp keepalive

show ip bgp neighbor

When did we last send or receive data with the peer?

R2#show ip bgp neighbors 1.1.1.1
BGP neighbor is 1.1.1.1, remote AS 100, external link
  BGP version 4, remote router ID 1.1.1.1
  BGP state = Established, up for 00:12:49
  Last read 00:01:15, last write 00:00:44, hold time is 180,
  keepalive interval is 60 seconds

R2 hasn’t received a Keepalive in more than “keepalive interval” seconds

Time to check R1

How is R1 on memory?

What is R1’s CPU load?

Is R2’s TCP window open?
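
The “Last read” check above is easy to automate when you are watching many peers. The Python sketch below is a rough illustration only: the regular expression assumes the exact wording of the output shown on this slide, check_timers is a made-up name, and “seconds until hold expiry” is an approximation that treats “Last read” as the time since the hold timer was last reset.

```python
import re

TIMERS_RE = re.compile(
    r"Last read (\d+):(\d+):(\d+), last write (\d+):(\d+):(\d+), "
    r"hold time is (\d+),\s*keepalive interval is (\d+) seconds"
)


def to_seconds(hours, minutes, seconds):
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def check_timers(text):
    """Hypothetical helper: flag a peer whose last read is older than one keepalive."""
    match = TIMERS_RE.search(text)
    if match is None:
        return None
    last_read = to_seconds(*match.group(1, 2, 3))
    last_write = to_seconds(*match.group(4, 5, 6))
    hold_time, keepalive = int(match.group(7)), int(match.group(8))
    return {
        "last_read": last_read,
        "last_write": last_write,
        "overdue": last_read > keepalive,
        "seconds_until_hold_expiry": hold_time - last_read,
    }


output = ("Last read 00:01:15, last write 00:00:44, hold time is 180, "
          "keepalive interval is 60 seconds")
print(check_timers(output))
```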


Speaker Flap Case Study

R1#show ip bgp sum | begin Neighbor
Neighbor  …  MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
2.2.2.2   …  53       284      10167   0    97    00:01:20  0

The number of packets transmitted is not increasing

The number of packets generated is increasing

At least one BGP keepalive interval apart

R1#show ip bgp sum | begin Neighbor
Neighbor  …  MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
2.2.2.2   …  53       284      10167   0    98    00:02:24  0

OutQ is incrementing due to Keepalives

MsgSent is not incrementing

Something is “stuck” in the OutQ

The keepalives aren’t leaving R1!
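
The comparison of the two snapshots can be written down as a simple rule. This Python sketch is illustrative only: PeerCounters and keepalives_stuck are hypothetical names, and the counters are typed in by hand from the two “show ip bgp sum” outputs rather than parsed from a router.

```python
from dataclasses import dataclass


@dataclass
class PeerCounters:
    """Counters for one neighbor, read from one 'show ip bgp sum' snapshot."""
    msg_rcvd: int
    msg_sent: int
    out_q: int


def keepalives_stuck(earlier, later):
    """True if messages are being queued (OutQ grows) but none are sent.

    The two snapshots should be at least one keepalive interval apart.
    """
    return later.msg_sent == earlier.msg_sent and later.out_q > earlier.out_q


first = PeerCounters(msg_rcvd=53, msg_sent=284, out_q=97)
second = PeerCounters(msg_rcvd=53, msg_sent=284, out_q=98)
print(keepalives_stuck(first, second))
```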


Speaker Flap Case Study

This is a layer 2 or 3 transport issue, etc.

BGP OPENs and Keepalives are small

UPDATEs can be much larger

So maybe small packets work but larger packets do not?

R1#ping 2.2.2.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/21/24 ms

R1#ping 2.2.2.2 size 1500 df-bit
Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)


Bestpath Algorithm

Best Path Algorithm

Quick bestpath review

Remember
• BGP only advertises one path per prefix…the bestpath
• Cannot advertise path from one iBGP peer to another

Bestpath selection process is a little lengthy

First eliminate paths that are ineligible for bestpath

1  Not synchronized      Only happens if “sync” is configured AND the route isn’t in your IGP
2  Inaccessible NEXTHOP  IGP does not have a route to the BGP NEXTHOP
3  Received-only paths   Happens if “soft-reconfig inbound” is applied. A path will be received-only if it was denied/modified by inbound policy.


Best Path Algorithm

#   Attribute              Rule           Notes
1   Weight                 Highest wins   Scope is router only
2   LOCAL_PREFERENCE       Highest wins   Scope is AS only
3   Locally Originated                    Redistribution or network statement favored over aggregate-address
4   AS_PATH                Shortest wins  Skipped if “bgp bestpath as-path ignore” configured; AS_SET counts as 1; CONFED parts do not count
5   ORIGIN                 Lowest wins    IGP < EGP < Incomplete
6   MED                    Lowest wins    MEDs are compared only if the first AS in the AS_SEQUENCE is the same
7   eBGP over iBGP
8   Metric to Next Hop     Lowest wins    IGP cost to the BGP NEXTHOP
9   Multiple Paths in RIB                 Flag path as “multipath” if max-paths is configured
10  Oldest External Wins                  Unless “bgp bestpath compare-routerid” is configured
11  BGP Router ID          Lowest
12  CLUSTER_LIST           Smallest       Shorter CLUSTER_LIST wins
13  Neighbor Address       Lowest         Lowest neighbor address
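
One way to internalize the ordering is to write it as a pairwise comparison. The Python sketch below models only steps 1 to 8 and 11 of the list above; the Path class, its field names, and its defaults are assumptions made for illustration, and it ignores AS_SET/confederation counting, multipath, oldest-external, CLUSTER_LIST, and neighbor address. It is a teaching aid, not how any router implements bestpath.

```python
from dataclasses import dataclass
from functools import reduce
from ipaddress import IPv4Address

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}  # step 5: IGP < EGP < Incomplete


@dataclass
class Path:
    """Hypothetical container for the attributes this sketch compares."""
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()          # AS_SEQUENCE, neighbor AS first
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = False
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def better(a, b):
    """Return the preferred path; in every pair below the lower value wins."""
    steps = [
        (-a.weight, -b.weight),                                # 1: highest weight
        (-a.local_pref, -b.local_pref),                        # 2: highest LOCAL_PREFERENCE
        (not a.locally_originated, not b.locally_originated),  # 3: locally originated
        (len(a.as_path), len(b.as_path)),                      # 4: shortest AS_PATH
        (ORIGIN_RANK[a.origin], ORIGIN_RANK[b.origin]),        # 5: lowest ORIGIN
    ]
    if a.as_path[:1] == b.as_path[:1]:                         # 6: MED, same first AS only
        steps.append((a.med, b.med))
    steps += [
        (not a.is_ebgp, not b.is_ebgp),                        # 7: eBGP over iBGP
        (a.igp_metric, b.igp_metric),                          # 8: lowest metric to next hop
        (int(IPv4Address(a.router_id)),
         int(IPv4Address(b.router_id))),                       # 11: lowest router ID
    ]
    for key_a, key_b in steps:
        if key_a != key_b:
            return a if key_a < key_b else b
    return a  # tie on everything modelled here


def best_path(paths):
    """Fold the pairwise comparison over all candidate paths."""
    return reduce(better, paths)
```

Because MED is only compared between paths from the same neighbor AS, the result of a pairwise fold like this can depend on the order of the paths, which is one reason real bestpath decisions are sometimes surprising.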


Best Path Algorithm

show ip bgp x.x.x.x bestpath

Will show you only the bestpath for x.x.x.x

Handy if you have lots of paths for a prefix

R2#sh ip bgp 7.4.4.0/24 bestpath
BGP routing table entry for 7.4.4.0/24, version 2
Paths: (20 available, best #13, table Default-IP-Routing-Table)
Flag: 0x820
  Not advertised to any peer
  100
    192.150.6.11 from 192.150.6.11 (192.150.6.11)
      Origin IGP, metric 0, localpref 100, valid, external, best
R2#

show ip bgp x.x.x.x multipath

Same concept but will show you all of the multipaths for x.x.x.x


Best Path Algorithm

IOS-XR has

show bgp x.x.x.x bestpath-compare

Explains why the bestpath is the best


BGP Table Version

Lots of things must happen when bestpaths change

RIB must be notified

Peers must be informed

Must have a way to track who has been informed of which bestpath changes

Prefix Table Version

Each prefix has a 32 bit number that is its table version

A prefix’s table version is bumped for every bestpath change

Bumped means the table version changes from the current version to the next available version #.

Assume 10.0.0.0/8 has a table version of #27 and the highest table version used by any prefix is #30. If 10.0.0.0/8 has a bestpath change, its table version will be bumped to #31.


BGP Table Version

“show ip bgp x.x.x.x” will show you a prefix’s table version

R1#sh ip bgp 10.0.0.0
BGP routing table entry for 10.0.0.0/8, version 31
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x820
  Not advertised to any peer
  200
    2.2.2.2 from 2.2.2.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, external, best
R1#


BGP Table Version

RIB & Peer Table Versions

We have a table version for the RIB

Also have a table version for each peer

Used to keep track of which bestpath changes have been propagated to whom

If peer 1.1.1.1 has a table version of #60 this tells us we have informed 1.1.1.1 of all bestpath changes for prefixes with a table version of <= #60

If any prefix has a table version > #60 then we need to inform 1.1.1.1 of that prefix’s bestpath

Once 1.1.1.1 has been updated, its table version will be updated accordingly

Same concept for the RIB and its table version


BGP Table Version

“show ip bgp summary” is best for viewing RIB and peer version #s

R2#show ip bgp summ
BGP router identifier 2.2.2.2, local AS number 200
BGP table version is 13, main routing table version 13
3 network entries using 351 bytes of memory
3 path entries using 156 bytes of memory

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1         4   100    4386    4388       13    0    0 01:20:24        1
R2#

“BGP table version” is the highest table version of any prefix

“main routing table version” matches it, so the RIB is converged

The TblVer for 1.1.1.1 matches it, so 1.1.1.1 is converged


BGP Table Version

Example

Assume the highest table version of any prefix is #10

The RIB has a table version of #10

The RIB is up to date for all prefixes

All peers have a table version of #10

Our peers are currently converged

5 prefixes experience a bestpath change

Highest table version is now #15

Inform the RIB of these 5 changes

Do RIB adds, deletes, and/or modifies

When complete, set the RIB table version to #15

Inform our peers of these 5 changes

Build updates and/or withdraws for each peer

When complete, set our peers’ table versions to #15
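
The bookkeeping in this example can be modeled in a few lines. The sketch below is a toy Python model under stated assumptions: the `BgpTable` class and its method names are hypothetical, versions start at 0, and a "bestpath change" is simply a call that bumps the prefix to the next available version.

```python
class BgpTable:
    """Toy model of prefix, RIB, and peer table versions (hypothetical names)."""

    def __init__(self):
        self.prefix_version = {}  # prefix -> table version of its last bestpath change
        self.table_version = 0    # highest version handed out so far
        self.rib_version = 0      # RIB has seen every change <= this version
        self.peer_version = {}    # peer -> version the peer has been updated to

    def bestpath_change(self, prefix):
        """Bump the prefix to the next available version number."""
        self.table_version += 1
        self.prefix_version[prefix] = self.table_version
        return self.table_version

    def pending(self, version):
        """Prefixes whose bestpath changed after the given version."""
        return sorted(p for p, v in self.prefix_version.items() if v > version)

    def update_rib(self):
        """Push pending changes to the RIB, then mark the RIB as current."""
        changed = self.pending(self.rib_version)
        self.rib_version = self.table_version
        return changed

    def update_peer(self, peer):
        """Send pending changes to one peer, then mark the peer as current."""
        changed = self.pending(self.peer_version.get(peer, 0))
        self.peer_version[peer] = self.table_version
        return changed


table = BgpTable()
for i in range(10):                     # ten prefixes, versions #1..#10
    table.bestpath_change(f"10.{i}.0.0/16")
table.update_rib()
table.update_peer("1.1.1.1")            # RIB and peer are both at #10

for i in range(5):                      # 5 prefixes experience a bestpath change
    table.bestpath_change(f"10.{i}.0.0/16")
print(table.table_version)              # highest table version is now 15
print(table.update_peer("1.1.1.1"))     # only the 5 changed prefixes are sent
```

Comparing a single number per consumer against the per-prefix versions is what lets the router answer "who still needs to hear about what" without keeping a per-peer change list.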


BGP Table Version

Why am I babbling about this?

Gives you a way to know who has been informed about what

Provides a way to tell how many bestpath changes your network is experiencing

You have 150k routes and see the table version increase by 150k every minute…something is wrong!!

You have 150k routes and see the table version increase by 300 every minute…sounds like normal network churn

You should monitor the table version in your network to determine what is normal for you

If the table version is increasing rapidly then that could explain why “BGP Router” and “BGP IO” are busy


Initial Convergence

BGP Convergence

Hey—Who are you calling slow?

Two general convergence situations

Initial startup

Periodic route changes


Convergence Initial Startup

Initial convergence happens when:

A router boots

RP failover

clear ip bgp *

How long initial convergence takes is a factor of the amount of work to be done and the router/network’s ability to do this fast and efficiently


Convergence Initial Startup


Initial convergence can be stressful…if you are approaching BGP scalability limits this is when you will see issues.


Convergence Initial Startup

What work needs to be done?

1) Accept routes from all peers
   Not too difficult
2) Calculate bestpaths
   This is easy
3) Install bestpaths in the RIB
   Also fairly easy
4) Advertise bestpaths to all peers
   This can be difficult and may take several minutes depending on the following variables…


Convergence Key Variables

BGP Variables

The number of routes

The number of peers

The number of update-groups

The ability to advertise routes to each update-group efficiently

Router Variables

CPU horsepower

Code version

Outbound Interface Bandwidth


Convergence UPDATE Packing

An UPDATE contains a set of Attributes and a list of prefixes (NLRI)

BGP starts an UPDATE by building an attribute set

BGP then packs as many destinations (NLRIs) as it can into the UPDATE

NLRI = Network Layer Reachability Information

Only NLRI with a matching attribute set can be placed in the UPDATE

NLRI are added to the UPDATE until it is full (4096 bytes max)

“UPDATE Packing” refers to how efficiently an implementation packs NLRIs into UPDATEs

Least efficient: BGP only puts one NLRI per UPDATE

Most efficient: BGP puts all NLRI with a certain Attribute set in one UPDATE

Least Efficient: one NLRI per UPDATE
  UPDATE 1: MED 50, Origin IGP, 10.1.1.0/24
  UPDATE 2: MED 50, Origin IGP, 10.1.2.0/24
  UPDATE 3: MED 50, Origin IGP, 10.1.3.0/24

Most Efficient: all NLRI with the attribute set in one UPDATE
  UPDATE 1: MED 50, Origin IGP, 10.1.1.0/24, 10.1.2.0/24, 10.1.3.0/24
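
UPDATE packing amounts to grouping prefixes by attribute set. Here is a rough Python sketch of that grouping; `pack_updates` is a hypothetical helper, attribute sets are represented as plain tuples, and the 4096-byte message limit is approximated by an optional cap on NLRI per UPDATE rather than by real byte counting.

```python
from collections import defaultdict


def pack_updates(routes, max_nlri_per_update=None):
    """Group (prefix, attrs) pairs into UPDATEs that share one attribute set.

    attrs must be hashable (for example a tuple of (name, value) pairs).
    max_nlri_per_update stands in for the message size limit; None means
    no cap, and 1 reproduces the least efficient case.
    """
    by_attrs = defaultdict(list)
    for prefix, attrs in routes:
        by_attrs[attrs].append(prefix)

    updates = []
    for attrs, prefixes in by_attrs.items():
        step = max_nlri_per_update or len(prefixes)
        for i in range(0, len(prefixes), step):
            updates.append((attrs, prefixes[i:i + step]))
    return updates


attrs = (("med", 50), ("origin", "igp"))
routes = [("10.1.1.0/24", attrs), ("10.1.2.0/24", attrs), ("10.1.3.0/24", attrs)]
print(len(pack_updates(routes, max_nlri_per_update=1)))  # one NLRI per UPDATE
print(len(pack_updates(routes)))                         # all NLRI in one UPDATE
```

Every distinct attribute set forces at least one extra UPDATE, which is why reducing the number of attribute sets shortens convergence.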


Convergence UPDATE Packing

The fewer attribute sets you have the better

More NLRI will share an attribute set

Fewer UPDATEs to converge

Things you can do to reduce attribute sets

next-hop-self for all iBGP sessions

Don’t accept/send communities you don’t need

Use cluster-id to put RRs in the same POP in a cluster

To see how many attribute sets you have

show ip bgp summary

190844 network entries using 21565372 bytes of memory
302705 path entries using 15740660 bytes of memory
57469/31045 BGP path/bestpath attribute entries using 6206652 bytes of memory


Convergence TCP MSS – Max Segment Size

TCP MSS (max segment size) is also a factor in convergence times. The larger the MSS the fewer TCP packets it takes to transport the BGP updates. Fewer packets means less overhead and faster convergence.

BGP UPDATE
  [Attribute | NLRI | NLRI | NLRI | NLRI | NLRI]

Default MSS: the BGP UPDATE is split into two TCP packets
  [IP Header | TCP Header | Attribute | NLRI | NLRI]
  [IP Header | TCP Header | NLRI | NLRI | NLRI]

Increased MSS: the entire BGP UPDATE can fit in one TCP packet
  [IP Header | TCP Header | Attribute | NLRI | NLRI | NLRI | NLRI | NLRI]


Convergence TCP MSS – Max Segment Size

MSS – Max Segment Size

Limit on packet size for a TCP socket

536 bytes by default

Path MTU Discovery

Finds smallest MTU between R1 and R2

Subtract 40 bytes for TCP/IP overhead

Enabled by default for BGP

neighbor 2.2.2.2 transport path-mtu-discovery disable

To find the MSS

R1#sh ip bgp neighbors
BGP neighbor is 2.2.2.2, remote AS 3, external link
Datagrams (max data segment is 1460 bytes):
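
To put numbers on the MSS effect, the arithmetic can be sketched as below. The helper names are hypothetical, the 40 bytes assume IPv4 and TCP headers without options, and the 4096-byte message is the maximum UPDATE size mentioned earlier.

```python
import math

TCP_IP_OVERHEAD = 40  # 20-byte IPv4 header + 20-byte TCP header, no options assumed


def mss_from_path_mtu(path_mtu):
    """MSS that Path MTU Discovery would yield for a given smallest MTU."""
    return path_mtu - TCP_IP_OVERHEAD


def segments_needed(message_bytes, mss):
    """Number of TCP segments needed to carry one BGP message."""
    return math.ceil(message_bytes / mss)


print(mss_from_path_mtu(1500))      # 1460, as in the neighbor output above
print(segments_needed(4096, 536))   # 8 segments at the default MSS
print(segments_needed(4096, 1460))  # 3 segments with a 1500-byte path MTU
```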


Convergence Update Groups

BGP must create updates based on the policies towards each peer

Peers with a common outbound policy are members of the same update-group

iBGP vs. eBGP

Outbound route-map, prefix-lists, etc

UPDATEs are generated for one member of an update-group and then replicated to the other members

[Figure: Attribute and NLRI blocks built for two peers. Less Efficient – Two peers in different update-groups. More Efficient – Two peers in the same update-group.]
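
Update-group membership can be thought of as grouping peers by a key built from their outbound policy. The Python sketch below assumes that key is a tuple such as (session type, outbound route-map name); both function names and the policy tuple layout are illustrative, not how any router stores it.

```python
from collections import defaultdict


def build_update_groups(peers):
    """peers: dict of peer address -> hashable outbound-policy key."""
    groups = defaultdict(list)
    for peer, policy in peers.items():
        groups[policy].append(peer)
    return dict(groups)


def updates_to_format(groups, updates_per_group):
    """UPDATEs are built once per group and replicated to the other members."""
    return len(groups) * updates_per_group


peers = {
    "1.1.1.1": ("ibgp", "RM-CORE"),   # hypothetical route-map names
    "2.2.2.2": ("ibgp", "RM-CORE"),
    "3.3.3.3": ("ebgp", "RM-CUST"),
}
groups = build_update_groups(peers)
print(groups)
print(updates_to_format(groups, 100))  # 200 built, versus 300 if done per peer
```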


Convergence Dropping TCP Acks

Primarily an issue on RRs (Route Reflectors) with

One or two interfaces connecting to the core

Hundreds of RRCs (Route Reflector Clients)

RR sends out tons of UPDATES to RRCs

RRCs send TCP ACKs

RR core facing interface(s) receive huge wave of TCP ACKs

[Figure: the RR sends BGP UPDATEs to the RRCs; the RRCs return TCP ACKs to the RR]


Convergence Dropping TCP Acks

Interface input queue fills up…TCP ACKs are dropped

Each time a TCP packet is dropped, the session goes into slow start

It takes a good deal of time for a TCP session to come out of slow start

Increase the input queue

hold-queue 1000 in

If you still see drops, increase to 4096


Convergence

How do you know if BGP has converged?

Watch the global table version

Increases by 1 for every bestpath change

In the lab: Table version stabilizes

In the real world: Reaches your “normal” rate of change

Watch peer InQ and OutQs

Wait for all InQ and OutQs to be empty

To list peers with non-empty queues

show ip bgp summ | e 0    0

Watch peer table versions

show ip bgp summ

If peer table version == global table version and InQ/OutQ are empty, BGP has converged for that peer
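
The per-peer check can be automated against the neighbor rows of "show ip bgp summary". This Python sketch assumes the IOS column order shown earlier (Neighbor, V, AS, MsgRcvd, MsgSent, TblVer, InQ, OutQ, Up/Down, State/PfxRcd) and that the rows have already been collected as text; the function names and the second sample row are made up for illustration.

```python
def peer_converged(global_version, tbl_ver, in_q, out_q):
    """A peer is converged when it is at the global version with empty queues."""
    return tbl_ver == global_version and in_q == 0 and out_q == 0


def unconverged_peers(global_version, neighbor_rows):
    """Return the neighbors that still have work outstanding."""
    pending = []
    for row in neighbor_rows:
        fields = row.split()
        neighbor = fields[0]
        tbl_ver, in_q, out_q = int(fields[5]), int(fields[6]), int(fields[7])
        if not peer_converged(global_version, tbl_ver, in_q, out_q):
            pending.append(neighbor)
    return pending


rows = [
    "1.1.1.1         4   100    4386    4388       13    0    0 01:20:24        1",
    "3.3.3.3         4   300      10      12        9    0   25 00:01:02        1",
]
print(unconverged_peers(13, rows))  # only 3.3.3.3 is still converging
```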


Convergence Initial Convergence Summary

Initial convergence time is a factor of the amount of work that needs to be done and the router/network’s ability to do this fast and efficiently

Reduce the number of attribute sets in BGP

Use next-hop-self, don’t send communities you don’t need, etc.

Reduce the number of unique outbound policies towards all peers

Try to find a small set of common policies, rather than individualizing policies per peer

The fewer update-groups the better

MSS/PMTU

Efficient packaging of BGP messages in TCP

Stop TCP ACK drops

Increase interface input queues on RRs


Periodic Convergence

Convergence Route Changes

There are 2 elements to route change convergence for BGP

Failure Detection

How long does it take to see the failure? (t0 to t1)

Convergence

How long does it take to process and propagate information about the failure? (t1 to t2)

[Timeline figure: t0, t1, t2 with stages Failure, Process, Propagate, Recovery]

Convergence Route Changes

Time to Detect Failure

Address Tracking Filter

Nexthop Tracking

Peer Down Detection

Time to Respond to Failure

MRAI – Min Route Advertisement Interval

Advertising the new information


Convergence Address Tracking Filter

Quick ATF review…

ATF = Address Tracking Filter

ATF is a middle man between the RIB and RIB clients

BGP, OSPF, EIGRP, etc. are clients of the RIB

A client tells ATF which prefixes it is interested in

ATF tracks each prefix

ATF notifies the client when the route to a registered prefix changes

Client is responsible for taking action based on ATF notification

Provides a scalable event driven model for dealing with RIB changes


Convergence Nexthop Tracking

BGP nexthop tracking

[Figure: the RIB holds 10.1.1.1/32, 10.1.1.2/32, 10.1.1.3/32, 10.1.1.4/32 and 10.1.1.5/32; ATF sits between the RIB and BGP; the BGP NEXTHOPs are 10.1.1.3 and 10.1.1.5]

Relies on ATF

Event driven convergence model

Register NEXTHOPs with ATF

10.1.1.3

10.1.1.5

ATF filters out changes for 10.1.1.1/32, 10.1.1.2/32, and 10.1.1.4/32

BGP has not registered for these

Changes to 10.1.1.3/32 and 10.1.1.5/32 are passed along to BGP

Recompute bestpath for prefixes that use these NEXTHOPs

No need to wait for BGP Scanner
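
The register-and-filter behaviour described here can be sketched as a small publish/subscribe object. This is a conceptual Python model only: `AddressTrackingFilter` and its methods are hypothetical names, and it matches registered prefixes exactly instead of doing the route lookup a real router would perform.

```python
class AddressTrackingFilter:
    """Toy ATF: sits between the RIB and its clients (hypothetical API)."""

    def __init__(self):
        self._interest = {}  # prefix -> list of client callbacks

    def register(self, prefix, callback):
        """A client (for example BGP) registers a prefix it cares about."""
        self._interest.setdefault(prefix, []).append(callback)

    def rib_change(self, prefix, event):
        """Called for every RIB change; returns how many clients were notified."""
        callbacks = self._interest.get(prefix, [])
        for callback in callbacks:
            callback(prefix, event)
        return len(callbacks)


seen_by_bgp = []
atf = AddressTrackingFilter()
for nexthop in ("10.1.1.3/32", "10.1.1.5/32"):
    # BGP registers its NEXTHOPs; the callback stands in for a bestpath rerun.
    atf.register(nexthop, lambda p, e: seen_by_bgp.append((p, e)))

atf.rib_change("10.1.1.1/32", "metric change")  # filtered out, BGP never sees it
atf.rib_change("10.1.1.3/32", "route removed")  # passed along to BGP
print(seen_by_bgp)
```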


Convergence Nexthop Tracking

Enabled by default

[no] bgp nexthop trigger enable

BGP registers all nexthops with ATF

show ip bgp attr next-hop ribfilter

Trigger delay is configurable

bgp nexthop trigger delay <0-100>

5 seconds by default

Debugs

debug ip bgp events nexthop

debug ip bgp rib-filter


Convergence Peer Down Detection

BGP must learn that the peer is down

Default keepalive/holdtime values are 60 seconds and 180 seconds

My 2c….use 3 second KA with 9 second holdtime

Tune your IGP to converge in under 9 seconds

Use BFD (bidirectional forwarding detection) if you need to be more aggressive

eBGP directly connected

bgp fast-external-fallover

If the interface goes down so does the eBGP peer

Reduce carrier-delay settings

0 msec for down

100 msec for up

eBGP multihop

Relies on holdtime or BFD


Convergence Peer Down Detection

iBGP peers

Relies on holdtime or BFD

BFD on iBGP peers

Know how fast your IGP converges!

Your BFD dead timer must be greater than that amount

iBGP peer down detection isn’t as critical as eBGP. Why?

IGP should be tuned to converge quickly

Fast IGP + BGP Nexthop Tracking = BGP reacts quickly to nexthop changes

BGP can route around a change in the core prior to bringing down iBGP peer(s)


Convergence Fast Session Deactivation

Fast Session Deactivation

neighbor x.x.x.x fall-over

Register peer's address with ATF

ATF informs BGP of routing changes to the peer

When we lose our route to the peer, bring the peer down.

No need to wait for holdtime to expire

Primary use case is eBGP multihop

[Figure: Multihop eBGP. #1 – Link 1 fails. #2 – Link 2 fails. #3 – FSD takes down peer.]

Convergence Fast Session Deactivation

Very dangerous for iBGP peers

IGP may not have a route to a peer for a split second

FSD would tear down the BGP session

Imagine if you lose your IGP route to your RR (Route Reflector) for just 100ms

Every RR to RRC session would flap

Off by default

neighbor x.x.x.x fall-over


Convergence FSD vs. BFD

Why do we have both?

FSD was developed first

Goal was fast BGP neighbor detection without expense of fast keepalives

BFD came later

Goal was fast neighbor detection for multiple protocols

Fast keepalives not as much of a concern

BFD KAs are generated by linecards

CPUs are also much faster today

FSD

Relies on control plane (absence of a route in the RIB) to tear down the peer

We could have a route but not have connectivity

BFD

Relies on forwarding plane to detect down peer

If we lose connectivity, the peer comes down


Convergence MRAI (minimum route advertisement interval)

How is the timer enforced for peer X?

Timer starts when all routes have been advertised to X

For the next MRAI (seconds) we will not propagate any bestpath changes to peer X

Once X’s MRAI timer expires, send him updates and withdraws

Restart the timer and the process repeats…

User may see a wave of updates and withdraws to peer X every MRAI seconds

User will NOT see a delay of MRAI between each individual update and/or withdraw

BGP would never converge if this were the case


Convergence

MRAI

MRAI timeline for BGP peer w/ MRAI of 5 seconds

T0
  The big bang

T7
  Bestpath Change #1
  UPDATE sent immediately
  MRAI timer starts, will expire at T12

T10
  Bestpath Change #2
  Must wait until T12 for MRAI to expire

T12
  MRAI expires
  Bestpath Change #2 is Txed
  MRAI timer starts, will expire at T17

T17
  MRAI expires
  No pending UPDATEs

[Timeline figure, t0 to t20: Bestpath Change #1, TX update #1, start MRAI; Bestpath Change #2; MRAI expires, TX update #2, start MRAI; MRAI expires]
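
The timeline can be reproduced with a short simulation. The Python sketch below is a simplified model under the assumptions of this timeline: the timer starts when an UPDATE is transmitted, changes arriving while it runs are held until it expires, and a change arriving with no timer running is sent immediately. `mrai_schedule` is a hypothetical function name.

```python
def mrai_schedule(change_times, mrai):
    """Return (change_time, transmit_time) pairs for one peer."""
    schedule = []
    last_tx = None  # time of the most recent (or already scheduled) transmit
    for t in sorted(change_times):
        if last_tx is not None and t <= last_tx:
            tx = last_tx          # joins the wave already waiting for expiry
        elif last_tx is not None and t < last_tx + mrai:
            tx = last_tx + mrai   # timer running: hold until it expires
        else:
            tx = t                # no timer running: send immediately
        last_tx = tx
        schedule.append((t, tx))
    return schedule


# Bestpath changes at T7 and T10 with a 5 second MRAI.
print(mrai_schedule([7, 10], 5))  # [(7, 7), (10, 12)]
```

Changes that pile up during one MRAI window leave together as a single wave, which is why the user sees bursts every MRAI seconds rather than an MRAI-sized gap between individual updates.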


Convergence

MRAI

BGP is not a link state protocol, it is path vector

May take several “rounds/cycles” of exchanging updates and withdraws for the network to converge

MRAI must expire between each round!

The more fully meshed the network and the more tiers of ASes, the more rounds required for convergence

Think about

How many tiers of ASes there are in the Internet

How meshy peering can be in the Internet


Convergence

MRAI

Internet churn means we are constantly setting and waiting on MRAI timers

One flapping prefix slows convergence for all prefixes

Internet table sees roughly 6 bestpath changes per second

For iBGP and PE-CE eBGP peers

neighbor x.x.x.x advertisement-interval 0

Has been the default since 12.0(32)S

For regular eBGP peers

Lowering to 0 may get you dampened

OK to lower for eBGP peers if they are not using dampening


High CPU Utilization

“High Utilization”

Router#show process cpu
CPU utilization for five seconds: 100%/0%; one minute: 99%; five minutes: 81%
 139     6795740   1020252       6660 88.34% 91.63% 74.01%   0 BGP Router

Define “High”

Know what normal CPU utilization is for the router in question

Is the CPU spiking due to “BGP Scanner” or is it constant?

Look at the scenario

Is BGP going through “Initial Convergence”?

If not then route churn is the usual culprit

Illegal recursive lookup or some other factor causes bestpath changes for the entire table


“High Utilization”

How to identify route churn?

Do “sh ip bgp summary”, note the table version

Wait 60 seconds

Do “sh ip bgp summary”, compare the table version from 60 seconds ago

You have 150k routes and see the table version increase by 300

This is probably normal route churn

Know how many bestpath changes you normally see per minute

You have 150k routes and see the table version increase by 150k

This is bad and is the cause of your high CPU
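
The before/after comparison is simple arithmetic, sketched here in Python. The function names are hypothetical, and the modulo is an assumption added to cope with the 32-bit table version wrapping between the two samples.

```python
VERSION_SPACE = 2 ** 32  # the table version is a 32-bit number


def bestpath_changes(version_before, version_after):
    """Bestpath changes between two samples, tolerating one counter wrap."""
    return (version_after - version_before) % VERSION_SPACE


def churn_per_minute(version_before, version_after, interval_seconds=60):
    """Normalize the change count to a per-minute rate."""
    return bestpath_changes(version_before, version_after) * 60 / interval_seconds


total_routes = 150_000
for before, after in ((1_000, 1_300), (1_000, 151_000)):
    rate = churn_per_minute(before, after)
    # Ratio of changes per minute to table size: ~0.002 is the "normal churn"
    # example from the text, ~1.0 means the whole table is changing every minute.
    print(rate, rate / total_routes)
```

Recording this rate over time is what gives a baseline for what is normal on a particular router.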


“High Utilization”

What causes massive table version changes?

Flapping peers

Hold-timer expiring?

Corrupt UPDATE?

Route churn

Don’t try to troubleshoot the entire BGP table at once

Identify one prefix that is churning and troubleshoot that one prefix

Will likely fix the problem with the rest of the BGP table churn


“High Utilization”

Table Version Changing Rapidly: A Little Lab Fun

RP/0/RP0/CPU0:XR#sh route | include 00:00:
Wed Apr 27 13:53:40.201 EDT
O    1.0.0.0/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1
O    1.0.0.4/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1
O    1.0.0.8/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1
O    1.0.0.12/30 [110/3] via 10.1.2.1, 00:00:00, GigabitEthernet0/0/0/1

4 seconds later

RP/0/RP0/CPU0:XR#sh route | include 00:00:
Wed Apr 27 13:53:44.162 EDT
B    1.0.0.0/30 [20/2] via 1.1.1.1, 00:00:01
B    1.0.0.4/30 [20/2] via 1.1.1.1, 00:00:01
B    1.0.0.8/30 [20/2] via 1.1.1.1, 00:00:01
B    1.0.0.12/30 [20/2] via 1.1.1.1, 00:00:01


“High Utilization”

Table Version Changing Rapidly: A Little Lab Fun

RP/0/RP0/CPU0:aggies#sh ip bgp 1.0.0.4
Wed Apr 27 14:00:36.066 EDT
Last Modified: Apr 27 14:00:35.387 for 00:00:00
Paths: (1 available, no best path)
  100
    1.1.1.1 (inaccessible) from 1.1.1.1 (1.1.1.1)

3 seconds later

RP/0/RP0/CPU0:aggies#sh ip bgp 1.0.0.4
Wed Apr 27 14:00:38.710 EDT
Last Modified: Apr 27 14:00:38.387 for 00:00:00
Paths: (1 available, no best path)
    1.1.1.1 (metric 2) from 1.1.1.1 (1.1.1.1)

1.1.1.1 (NH) flapping


“High Utilization”

Something is wrong with NEXTHOP 1.1.1.1

Flip flops between inaccessible and “accessible with an IGP cost of 2”

Troubleshoot 1.1.1.1 and the churning will stop


Layer 3 VPNs

Troubleshooting Checklist

#1 PE1 to PE2 core connectivity

Verify you can ping from loopback to loopback

Verify you can mpls ping from loopback to loopback

PE loopbacks must be /32

Check IGP

Check LDP

#2 PE1 to CE1 and PE2 to CE2 connectivity

Can each PE ping their directly connected CE?

Remember to do “ping vrf FOO x.x.x.x”

[Figure: #1 is the path between PE1 and PE2; #2 is each PE to CE link (PE1 to CE1, PE2 to CE2)]

Layer 3 VPNs

#3 PE to PE vrf connectivity

Can PEs ping the vrf interface of the other PE?

If not, double-check your import/export Route Targets

#4 PE to remote CE connectivity

Verify each PE can ping the CE connected to the other PE

#5 CE to CE connectivity

At this point you should be able to ping CE to CE

[Figure: #3 runs between PE1 and PE2; #4 runs from each PE to the CE behind the other PE; #5 runs between CE1 and CE2]

Looking Glasses

The Internet BGP Looking Glasses

You are advertising your address space to your ISPs

Q: How can you verify they are receiving it?

Q: How can you verify the rest of the Internet is receiving it?

A: BGP Looking Glasses


“BGP Looking Glass servers are computers on the Internet running one of a variety of publicly available Looking Glass software implementations. A Looking Glass server (or LG server) is accessed remotely for the purpose of viewing routing info. Essentially, the server acts as a limited, read-only portal to routers of whatever organization is running the Looking Glass server. Typically, publicly accessible looking glass servers are run by ISPs or NOCs.”

http://www.bgp4.as/looking-glasses

The Internet BGP Looking Glasses

https://www.sprint.net/lg/
Show bgp route 72.163.4.161
72.163.0.0/20


The Internet BGP Looking Glasses

host$ nslookup www.cisco.com
Address: 72.163.4.161
host$

http://whois.arin.net/ui


The Internet BGP Looking Glasses

Huge list of looking glasses here

http://www.bgp4.as/looking-glasses


The Internet BGP Looking Glasses

The Level3 looking glass will translate AS #s to company names

AS-PATH: 3549 6327

AS-PATH Translation: GBLX SHAWFIBER


The Internet Whose AS is that anyway?

Long list here

http://bgp.potaroo.net/cidr/autnums.html

Or lookup a specific AS

http://whois.arin.net/rest/asn/AS1239/pft


“The University's Route Views project was originally conceived as a tool for Internet operators to obtain real-time information about the global routing system from the perspectives of several different backbones and locations around the Internet. Although other tools handle related tasks, such as the various Looking Glass Collections (see e.g. NANOG, or the DTI NSPIXP-2 Looking Glass), they typically either provide only a constrained view of the routing system (e.g., either a single provider, or the route server) or they do not provide real-time access to routing data.

While the Route Views project was originally motivated by interest on the part of operators in determining how the global routing system viewed their prefixes and/or AS space, there have been many other interesting uses of this Route Views data. For example, NLANR has used Route Views data for AS path visualization (see also NLANR), and to study IPv4 address space utilization (archive). Others have used Route Views data to map IP addresses to origin AS for various topological studies. CAIDA has used it in conjunction with the NetGeo database in generating geographic locations for hosts, functionality that both CoralReef and the Skitter project support.”

University of Oregon Route Views Project http://www.routeviews.org/
