
Chapter 4

Network Layer

A note on the use of these ppt slides:
We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
If you use these slides (e.g., in a class) in substantially unaltered form, that you mention their source (after all, we’d like people to use our book!)
If you post any slides in substantially unaltered form on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.
Thanks and enjoy! JFK/KWR

All material copyright 1996-2004
J.F Kurose and K.W. Ross, All Rights Reserved

Computer Networking: A Top Down Approach Featuring the Internet, 3rd edition. Jim Kurose, Keith Ross. Addison-Wesley, July 2004.
Network Layer 4-1
Chapter 4: Network Layer
4.1 Introduction
4.2 Virtual circuit and datagram networks
4.3 What’s inside a router
4.4 IP: Internet Protocol
    Datagram format
    IPv4 addressing
    ICMP
    IPv6
4.5 Routing algorithms
    Link state
    Distance Vector
    Hierarchical routing
4.6 Routing in the Internet
    RIP
    OSPF
    BGP
4.7 Broadcast and multicast routing



Network layer
delivers the segments from the sending to the receiving hosts
on the sending side, it encapsulates the segments into datagrams
on the receiving side, it delivers the segments to the transport layer
network layer protocols exist in every host and router
the router examines the header fields in all IP datagrams passing through it
[Figure: two end hosts running the full protocol stack (application, transport, network, data link, physical) connected through routers that implement only the network, data link, and physical layers]


Key Network-Layer Functions
forwarding: moving the packets from the router input to the appropriate router output locally
routing: determining the end-to-end route to be taken by the packets from source to destination using routing algorithms & updating the forwarding tables
analogy:
routing: the process of planning a trip from source to destination
forwarding: the process of getting through a single street interchange


Interplay between routing and forwarding
routing algorithm -> local forwarding table

header value    output link
0100            3
0101            2
0111            2
1001            1

[Figure: a packet arrives with the value 0111 in its header at a router with output links 1, 2, and 3; the table sends it to output link 2]


Packet Switching Devices
A packet-switching device can be either:
A link-layer switch (or layer-2 switch):
• The forwarding decision is based on a value (usually the
physical address or MAC address) in the data-link layer
header (chap 5)
A network layer switch (or layer-3 switch or router):
• The forwarding decision is based on a value (usually the
logical address or IP address) in the network layer header
• The routing algorithm, which can be centralized or
distributed, determines the entries of the router’s
forwarding table
• The router receives routing protocol messages, which are
used to configure the forwarding tables.



Connection setup
3rd important function (after routing and
forwarding) in some network architectures:
ATM, frame relay, X.25 (virtual-circuit switching)
before the datagrams flow, the two hosts and
the intervening routers establish a virtual
connection
i.e., the routers get involved in the connection setup
network & transport layer connection service:
Network layer: between two hosts
Transport layer: between two processes



Network service model
Q: What is the service model of the “channel” transporting datagrams from a sender to a receiver?
Example services for individual datagrams:
guaranteed delivery
guaranteed delay: e.g. delivery with less than 40 msec delay
Example services for a flow of datagrams:
in-order datagram delivery
guaranteed minimum bandwidth to flow
guaranteed maximum jitter: restrictions on changes in inter-packet spacing
Network layer service models:

Network       Service      Guarantees?                                   Congestion
Architecture  Model        Bandwidth           Loss  Order  Timing       feedback
Internet      best effort  none                no    no     no           no
ATM           CBR          constant rate       yes   yes    yes          no congestion
ATM           VBR          guaranteed rate     yes   yes    yes          no congestion
ATM           ABR          guaranteed minimum  no    yes    no           yes
ATM           UBR          none                no    yes    no           no

ATM: Asynchronous Transfer Mode
CBR: Constant Bit Rate, VBR: Variable Bit Rate
ABR: Available Bit Rate, UBR: Unspecified Bit Rate


Network-layer connection and
connection-less service
Datagram networks provide network-layer
connectionless service
VC network provides network-layer connection
service
Analogous to the transport-layer services, but:
Service: host-to-host (vs. process-to-process)
No choice: network provides one service or the other
but not both (vs. both are available at the same time)
Implementation: in both the core and the end systems
(vs. in the end systems only)



Virtual circuits
“source-to-dest path behaves much like telephone circuit”
performance-wise
network actions along source-to-dest path

call setup for each call before the data can flow
call teardown for each call after the data transfer is complete
each packet carries a VC identifier (not a destination host
address)
every router on the source-destination path maintains a “state”
for each passing connection
the link and router resources (e.g., bandwidth, buffers) may be
allocated to the VC



VC implementation
A VC consists of:
1. a path from the source to the destination
2. VC numbers: one number for each link along path
3. entries in the forwarding tables of the routers
along the path
packet belonging to VC carries a VC number.
the VC number must be changed on each link.
a new VC number comes from the forwarding table
• shorter VC field in the packet header
• simpler VC setup: local independent decision



Forwarding table
[Figure: a VC crossing three links with VC numbers 12, 22, and 32; the router interfaces are numbered 1, 2, and 3]

Forwarding table in the northwest router:

Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

Routers maintain connection state information!
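
The VC forwarding table above can be sketched as a dictionary lookup. This is a minimal illustration, not how a real router stores VC state; the names `VC_TABLE` and `forward_vc` are hypothetical, and only the four table rows shown on the slide are included.

```python
# Hypothetical model of the northwest router's VC forwarding table.
# Key: (incoming interface, incoming VC #)
# Value: (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_iface, in_vc):
    """Return (outgoing interface, new VC number) for an arriving packet.

    The packet carries only a VC number, not a destination address, so the
    router needs per-connection state to forward it.
    """
    try:
        return VC_TABLE[(in_iface, in_vc)]
    except KeyError:
        raise LookupError(f"no VC state for interface {in_iface}, VC {in_vc}")


out_iface, out_vc = forward_vc(1, 12)
print(f"forward on interface {out_iface} with VC number {out_vc}")
```

The packet leaves with a different VC number than it arrived with, which is why each router on the path can pick its numbers independently.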
Virtual circuits: signaling protocols
end systems and routers use signaling messages to set up, maintain, and tear down a VC
used in ATM, frame-relay, and X.25
[Figure: call setup between two end systems across the network]
1. Initiate call
2. Incoming call
3. Accept call
4. Call connected
5. Data flow begins
6. Receive data


Datagram networks
no call setup at the network layer
routers: no state about end-to-end connections
no network-level concept of a “connection”
packets are forwarded using destination host address
packets between the same source-destination pair may take
different paths depending on the network status and the
router decision making

[Figure: two end systems exchanging datagrams across the network with no call setup]
1. Send data
2. Receive data


Forwarding table
32-bit IP address: 4 billion possible entries

Destination Address Range                                  Link Interface
11001000 00010111 00010000 00000000 (200.23.16.0)
  through                                                  0
11001000 00010111 00010111 11111111 (200.23.23.255)

11001000 00010111 00011000 00000000 (200.23.24.0)
  through                                                  1
11001000 00010111 00011000 11111111 (200.23.24.255)

11001000 00010111 00011001 00000000 (200.23.25.0)
  through                                                  2
11001000 00010111 00011111 11111111 (200.23.31.255)

otherwise                                                  3
Longest prefix matching
Prefix Match Link Interface
11001000 00010111 00010 0
11001000 00010111 00011000 1
11001000 00010111 00011 2
otherwise 3

Examples

DA: 11001000 00010111 00010110 10100001 Which interface?

DA: 11001000 00010111 00011000 10101010 Which interface?

for the longest prefix matching to be effective, each output link interface should be responsible for forwarding a large number of contiguous addresses
this is the case with the Internet addresses as they are assigned in a hierarchical fashion
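
The two "Which interface?" examples can be checked with a short sketch of longest prefix matching. It uses a simple linear scan over the slide's prefix table; real routers use faster structures, and the names `PREFIXES` and `longest_prefix_match` are illustrative only.

```python
# Prefix table from the slide; spaces are kept for readability and
# stripped before comparison.
PREFIXES = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def longest_prefix_match(dest_bits):
    """Return the link interface whose prefix is the longest match."""
    bits = dest_bits.replace(" ", "")
    best_len, best_iface = -1, DEFAULT_INTERFACE
    for prefix, iface in PREFIXES:
        p = prefix.replace(" ", "")
        if bits.startswith(p) and len(p) > best_len:
            best_len, best_iface = len(p), iface
    return best_iface


# First example: only the 21-bit prefix ...00010 matches.
print(longest_prefix_match("11001000 00010111 00010110 10100001"))
# Second example: both ...00011000 and ...00011 match; the longer one wins.
print(longest_prefix_match("11001000 00010111 00011000 10101010"))
```

Under this table the first address goes to interface 0 and the second to interface 1.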
Datagram or VC network
Internet
data exchange among computers
“elastic” services, no strict timing requirements
“smart” end systems (e.g. PC’s)
can adapt and perform control and error recovery
simple network core
complexity at the “edge”
quick and easy to add and attach new services
many link types
different characteristics
uniform service is difficult
ATM
evolved from telephony
human conversation:
strict timing and reliability requirements
there is a need for guaranteed service
“dumb” end systems
telephones
complexity moved to inside the network


Router Architecture Overview
There are two key router functions:
running routing algorithms/protocol (RIP, OSPF, BGP)
forwarding datagrams from incoming to outgoing link



Input Port Functions
Physical layer:
bit-level reception
Data link layer:
e.g., Ethernet
see chapter 5
Decentralized switching:
given a packet dest., lookup the output port using the forwarding table in input port memory
an updated copy of the forwarding table is stored in each input port, avoiding a processing bottleneck at the routing processor
goal: complete input port processing at ‘line speed’
queuing: occurs if a new packet arrives faster than the forwarding rate into switch fabric


Speed of the input port processing
example:
an OC-48 link runs at 2.5 Gbps. Therefore, with 256-byte packets, the lookup speed should be about 1 million lookups/sec (2.5 Gbps / 2048 bits per packet ≈ 1.2 million packets/sec)
lookup techniques to speed up the datagram forwarding:
binary tree: N steps to look up N-bit addresses
• still not fast enough for backbone routing requirements
Content Addressable Memory (CAM): constant time
using cache memory
other recent techniques: log(N) steps
queuing reasons at the input port:
when a new packet is received and is ready to be forwarded before the current packet is forwarded
when the forwarding of the current packet is blocked by the switching fabric (i.e., the fabric is busy forwarding another packet from another input port)
Three types of switching fabrics



Switching Via Memory
First generation routers:
traditional computers with switching under direct control of the CPU
I/O ports functioned as traditional I/O devices
input ports used interrupts to signal the packet arrival
packets are then copied to the system memory
CPU reads the header and then forwards the packet to an output port
speed limited by memory bandwidth (2 bus crossings per datagram)
therefore, the input-to-output forwarding speed is one-half of the memory
access speed
modern routers use this method in the form of shared-memory multiprocessors
Examples: Cisco Catalyst 8500 Series & Bay Networks Accelar 1200 Series
[Figure: Input Port -> Memory -> Output Port, connected by the System Bus]
Switching Via a Bus
packet from input port memory to output port memory via a
shared bus
one packet at a time can be transferred
bus contention: switching speed is limited by the bus
bandwidth (at least as fast as all input port together)
example: 1 Gbps bus, Cisco 1900: sufficient speed for access
and enterprise routers (not regional or backbone)



Switching Via An Interconnection Network
n x n crossbar switch with 2n busses
overcomes the bus bandwidth limitations (to a certain extent)
Advanced design: fragment the packets into fixed length cells,
then switch the cells through the fabric to simplify and
speedup the switching
example: Cisco 12000: switches at up to 60 Gbps through the
interconnection network



Output Ports

Buffering (queuing) required when packets arrive from fabric faster than the transmission rate
Scheduling discipline chooses among queued packets for transmission


Where do queuing and loss occur?
the actual location of packet queuing and/or loss
depends on:
the traffic load (arrival rate, packet size, etc.)
the relative speed of the switching fabric
the line speed
queuing reasons at the input port (review):
when a new packet is received and is ready to be forwarded
before the current packet is forwarded
when the forwarding of the current packet is blocked by
the switching fabric
queuing reasons at the output port:
when the arrival rate via the switching fabric exceeds the
output line speed



Output port queueing
Example:
3 input ports
3 output ports
line speeds are identical = S
switch fabric speed = 3S
worst case: all packets at
input ports are destined to
the same output port

queuing delay and loss due to output port buffer overflow!


Packet scheduling & queue management
Which packet in the queue is selected next for transmission? Or what is the scheduling scheme?
Simply, First-Come-First-Served (FCFS)
Weighted Fair Queuing (WFQ)
• Fair share of outgoing link among end-to-end connections
What if the buffer is full?
drop arriving packets (drop-tail policy) when full. But, can we do better?
Active Queue Management (AQM) algorithms:
Drop or mark the header of the arriving packet before the buffer is filled. Why? It gives a congestion indication to the sender
Example: Random Early Detection (RED) algorithm:
• Weighted average of the queue length is maintained
• If the average queue length is less than a min. threshold, accept packets
• If the average queue length is greater than a max. threshold, mark or drop new packets
• If the average queue length is in between, drop or mark new packets based on a probability that is a function of the average queue length
Packet scheduling is very important for Quality-of-Service (QoS) guarantees
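
The RED rules above can be sketched as follows. The slide only says the drop probability is "a function of the queue length"; the linear ramp up to a ceiling `max_p` used here is the commonly described form and is an assumption, as are the function and parameter names.

```python
import random


def update_average(avg_qlen, current_qlen, weight):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_qlen + weight * current_qlen


def red_drop_probability(avg_qlen, min_th, max_th, max_p):
    """Probability of dropping (or marking) an arriving packet.

    Assumed shape: 0 below min_th, 1 above max_th, and a linear ramp
    from 0 to max_p in between.
    """
    if avg_qlen < min_th:
        return 0.0
    if avg_qlen > max_th:
        return 1.0
    return max_p * (avg_qlen - min_th) / (max_th - min_th)


def should_drop(avg_qlen, min_th=5, max_th=15, max_p=0.1):
    """Randomized decision for one arriving packet (illustrative defaults)."""
    return random.random() < red_drop_probability(avg_qlen, min_th, max_th, max_p)


print(red_drop_probability(10, 5, 15, 0.1))
```

Dropping a few packets early, before the buffer fills, is what lets the senders slow down before drop-tail losses hit every flow at once.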
Input Port Queuing
The switch fabric is slower than the input ports combined
queuing may occur at the input queues
Head-of-the-Line (HOL) blocking: queued packet at
the front of the queue prevents others in the queue
from moving forward





The Internet Network layer
Host, router network layer functions:





IP datagram format
The header is laid out in 32-bit rows:
ver: IP protocol version number
head. len: header length (carried in 32-bit words; 5 words = 20 bytes when there are no options)
type of service: “type” of data
length: total datagram length (bytes)
16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
time to live: max number remaining hops (decremented at each router)
upper layer: upper layer protocol to deliver payload to (glue between network and transport layers)
Internet checksum: header checksum
32 bit source IP address
32 bit destination IP address
Options (if any): e.g. timestamp, record route taken, specify list of routers to visit, etc.
data (variable length, typically a TCP or UDP segment)
how much overhead with TCP/IP typically?
20 bytes of TCP
20 bytes of IP
= 40 bytes + app layer overhead
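
To make the field layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of an IPv4 header with Python's `struct` module. The helper name `parse_ipv4_header` and the sample header are made up for illustration; options are ignored and the checksum in the sample is a placeholder of 0, not a valid checksum.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Unpack the first 20 bytes of an IPv4 datagram into named fields."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        # The header length field counts 32-bit words, so multiply by 4.
        "header_len_bytes": (ver_ihl & 0x0F) * 4,
        "type_of_service": tos,
        "total_length": total_length,
        "identifier": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,  # in units of 8 bytes
        "ttl": ttl,
        "upper_layer_protocol": proto,           # e.g. 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hypothetical header: version 4, 5-word header, 40-byte datagram, TTL 64, TCP.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("223.1.1.1"),
                     socket.inet_aton("223.1.2.2"))
fields = parse_ipv4_header(sample)
print(fields["version"], fields["header_len_bytes"], fields["source"])
```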
IP Fragmentation & Reassembly
network links have MTU (Max. Transmission Unit), which is the largest possible link-level frame
different link types and hence different MTU’s along the route
• Ethernet: 1500 bytes
• Some WANs: 576 bytes
large IP datagram is divided (“fragmented”) within the network
one datagram becomes several datagrams
“reassembled” only at the final destination (keep core simple)
IP header bits are used to identify and order the related fragments
[Figure: fragmentation (in: one large datagram, out: 3 smaller datagrams) and reassembly]
IP Fragmentation and Reassembly
Example
4000 byte datagram (3980 byte payload, 20 bytes IP header)
MTU = 1500 bytes
One large datagram becomes several smaller datagrams:

             length   ID   fragflag   offset
original     4000     x    0          0
fragment 1   1500     x    1          0
fragment 2   1500     x    1          185
fragment 3   1040     x    0          370

1480 bytes in data field of fragments 1 and 2 (must be multiple of 8 bytes except for last fragment)
offset of fragment 2 = 1480/8 = 185
offset of fragment 3 = (1480 + 1480)/8 = 370
3980 - 1480 - 1480 = 1020 bytes in the data field of fragment 3 (plus the 20-byte header = 1040)
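
The arithmetic in this example can be reproduced with a short sketch. The function name `fragment` and the tuple layout are illustrative choices; the sketch assumes a fixed 20-byte header with no options.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram into fragments that fit the MTU.

    Returns a list of (length, fragflag, offset) tuples, where offset is
    expressed in 8-byte units as in the IP header.
    """
    payload = total_length - header_len
    # Data per fragment must be a multiple of 8 bytes (except the last one).
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more_fragments = 1 if sent + data < payload else 0
        fragments.append((data + header_len, more_fragments, sent // 8))
        sent += data
    return fragments


for length, fragflag, offset in fragment(4000, 1500):
    print(f"length={length} fragflag={fragflag} offset={offset}")
```

With a 4000-byte datagram and a 1500-byte MTU this yields the three fragments listed in the example.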


IP Addressing: introduction
IP address: 32-bit ID for the interface of the host or router
interface: is the connection between the host or router and the physical link
routers typically have multiple interfaces
a host typically has one interface
one IP address is associated with each interface
a portion of the interface’s IP address is determined by the subnet it is connected to
IP addresses in the global Internet should be unique (except for interfaces behind the NATs)
Dotted-Decimal Notation
223.1.1.1 = 11011111 00000001 00000001 00000001
               223       1        1        1
[Figure: hosts and a router with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
Subnets
the IP address consists of:
subnet part: the x most significant bits
host part: 32-x least significant bits
what’s a subnet ?
a network of devices with their interfaces having the same subnet part of IP address
the devices on the same subnet can physically reach each other without an intervening router
• Connected by a data-link layer hub or switch
[Figure: network consisting of 3 subnets, with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4 / 223.1.2.1, 223.1.2.2, 223.1.2.9 / 223.1.3.1, 223.1.3.2, 223.1.3.27]


Subnets
Recipe
To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet.
The subnet mask indicates the number of most significant bits used to identify the subnet part of the IP address
Subnet mask: /24 (256 addresses)
[Figure: the three subnets 223.1.1.0/24, 223.1.2.0/24, and 223.1.3.0/24]
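
As a rough check of the /24 grouping, the interface addresses from the earlier figure can be grouped by subnet with Python's standard `ipaddress` module. The list `INTERFACES` and the helper `group_by_subnet` are illustrative names.

```python
import ipaddress
from collections import defaultdict

# Interface addresses read from the three-subnet figure.
INTERFACES = [
    "223.1.1.1", "223.1.1.2", "223.1.1.3", "223.1.1.4",
    "223.1.2.1", "223.1.2.2", "223.1.2.9",
    "223.1.3.1", "223.1.3.2", "223.1.3.27",
]


def group_by_subnet(addresses, prefix_len=24):
    """Map each subnet (as a string like '223.1.1.0/24') to its interfaces."""
    subnets = defaultdict(list)
    for addr in addresses:
        # ip_interface keeps the host bits; .network masks them off.
        network = ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        subnets[str(network)].append(addr)
    return dict(subnets)


for subnet, members in group_by_subnet(INTERFACES).items():
    print(subnet, members)
```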


Subnets
How many?
[Figure: network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]


IP addressing: Classful Addressing
the old method of IP addressing
the subnet portion of the address must be 1, 2, or 3 bytes long
subnet address classes:
Class A: a.b.c.d/8 (over 16 million addresses)
Class B: a.b.c.d/16 (over 65 thousand addresses)
Class C: a.b.c.d/24 (only 256 addresses)
what if an organization needs only 500 addresses?
Problems with classful addressing:
fast depletion of class B address space
poor utilization of the assigned number space



IP addressing: CIDR strategy
CIDR: Classless InterDomain Routing
subnet portion of address can be of an arbitrary length
address format: a.b.c.d/x, where x is # bits in subnet portion of
address (also called the prefix or the network prefix)
the prefix represents the network portion of the IP address
IP addresses are usually assigned to organizations in blocks of
contiguous addresses that share a common prefix
only the x bits are considered by routers outside the organization’s
network
the remaining 32-x bits are used to identify device interfaces within
the organization’s network (may have additional subnetting structure)

subnet part: 11001000 00010111 0001000
host part: 0 00000000
200.23.16.0/23
IP addresses: how to get one?
Q: How does a host get an IP address?

manually configured by system admin in a system file
Windows: control-panel->network->configuration->tcp/ip->properties
UNIX: /etc/rc.config
DHCP: Dynamic Host Configuration Protocol: dynamically get an
IP address from a server
“plug-and-play”
efficient IP address utilization
efficient for mobile hosts such as laptops
(more in next chapter)



IP addresses: how to get one?
Q: How does a network get the subnet part of IP
address?
A: gets allocated a portion of its provider ISP’s
address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...              …..                                   ….
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
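
The division of the ISP's /20 block into eight /23 blocks can be reproduced with the standard `ipaddress` module. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import ipaddress

# The ISP's block from the table above.
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Extending the prefix from 20 to 23 bits leaves 3 bits to number the
# organizations, so there are 2**3 = 8 equal-sized blocks.
org_blocks = list(isp_block.subnets(new_prefix=23))

for index, block in enumerate(org_blocks):
    print(f"Organization {index}: {block} ({block.num_addresses} addresses)")
```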


Hierarchical addressing: route aggregation
hierarchical addressing: addresses are assigned in contiguous blocks to ISPs
and then from ISPs to client organizations:
allows for efficient advertisement of routing info
route (or address) aggregation is the ability to use a single prefix to
advertise multiple networks:
works very well with hierarchical addressing

[Figure: route aggregation]
Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP
Fly-By-Night-ISP advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
ISPs-R-Us advertises to the Internet: “Send me anything with addresses beginning 199.31.0.0/16”
Hierarchical addressing: more specific routes
Fly-By-Night acquires ISPs-R-Us and connects Organization 1 through it. Two options:
Organization 1 renumbers all its routers and hosts (very costly solution)
Organization 1 keeps the same numbers, which are specifically advertised by ISPs-R-Us, benefiting from the longest-prefix-match routing feature
[Figure: more specific route]
Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP
Organization 1 (200.23.18.0/23) connects to ISPs-R-Us
Fly-By-Night-ISP advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
ISPs-R-Us advertises to the Internet: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
ISPs-R-Us has a more specific route to Organization 1
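
A sketch of how a router elsewhere in the Internet would treat these three advertisements, using the standard `ipaddress` module. The `ROUTES` list and `next_hop` helper are hypothetical; the point is only that the /23 advertisement wins over the /20 for Organization 1's addresses.

```python
import ipaddress

# Advertisements heard from the two ISPs in the slide.
ROUTES = [
    (ipaddress.ip_network("200.23.16.0/20"), "Fly-By-Night-ISP"),
    (ipaddress.ip_network("199.31.0.0/16"), "ISPs-R-Us"),
    (ipaddress.ip_network("200.23.18.0/23"), "ISPs-R-Us"),
]


def next_hop(address):
    """Pick the ISP whose advertised prefix is the longest match."""
    addr = ipaddress.ip_address(address)
    matches = [(net.prefixlen, isp) for net, isp in ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0])[1]


print(next_hop("200.23.18.5"))   # an address inside Organization 1's block
print(next_hop("200.23.20.1"))   # an address inside Organization 2's block
```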
IP addressing: the last word...
Q: How does an ISP get a block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers
allocates addresses
manages DNS
assigns domain names, resolves disputes


NAT: Network Address Translation
[Figure: a NAT router with WAN-side address 138.76.29.7 connects the rest of the Internet to a local network (e.g., home network) 10.0.0.0/24 with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, and 10.0.0.4]
All datagrams leaving local network have same single source NAT IP address: 138.76.29.7 but with different source port numbers
Datagrams with source or destination in this network have 10.0.0.0/24 address for source, destination (as usual)
Note: 10.0.0.0/8 range is reserved for private networks
NAT: Network Address Translation
Motivation: the local network uses just one IP address as far as outside world is concerned:
no need to be allocated a range of addresses from the ISP: just one IP address is used for all devices
can change addresses of the devices in the local network without notifying the outside world
can change the ISP without changing the addresses of the devices in the local network
devices inside the local network are not explicitly addressable or visible by the outside world (a security plus).


NAT: Network Address Translation
Implementation: NAT router must:
change the header of the outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
. . . remote clients/servers will respond using (NAT IP address, new port #) as destination address
remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
change the header of the incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table


NAT: Network Address Translation
NAT translation table

WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……

1: host 10.0.0.1 sends datagram to 128.119.40.186, 80 (S: 10.0.0.1, 3345; D: 128.119.40.186, 80)
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table (S: 138.76.29.7, 5001; D: 128.119.40.186, 80)
3: Reply arrives, dest. address: 138.76.29.7, 5001 (S: 128.119.40.186, 80; D: 138.76.29.7, 5001)
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 (S: 128.119.40.186, 80; D: 10.0.0.1, 3345)
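
The four steps above can be sketched with a pair of dictionaries acting as the NAT translation table. The class `NatRouter`, its method names, and the choice to hand out ports sequentially starting at 5001 are assumptions made for illustration; real NAT implementations also track the transport protocol and time out idle entries.

```python
class NatRouter:
    """Toy NAT: rewrites (IP, port) pairs and remembers the mapping."""

    def __init__(self, wan_ip, first_port=5001):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.lan_to_wan = {}   # (LAN IP, port) -> (NAT IP, new port)
        self.wan_to_lan = {}   # (NAT IP, new port) -> (LAN IP, port)

    def outgoing(self, src, dst):
        """Rewrite the source of a datagram leaving the local network."""
        if src not in self.lan_to_wan:
            wan = (self.wan_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[src] = wan
            self.wan_to_lan[wan] = src
        return self.lan_to_wan[src], dst

    def incoming(self, src, dst):
        """Rewrite the destination of a reply arriving from the Internet."""
        return src, self.wan_to_lan[dst]


nat = NatRouter("138.76.29.7")
# Steps 1-2: host 10.0.0.1 sends to the web server; the NAT rewrites the source.
print(nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80)))
# Steps 3-4: the reply arrives at the NAT address; the NAT restores the destination.
print(nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001)))
```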


NAT: Network Address Translation
16-bit port-number field:
over 60,000 simultaneous connections with a single LAN-side address!
NAT is controversial:
port addresses are meant to be used for addressing
processes not hosts
• causes trouble to servers running within the local network
routers should only process up to layer 3
violates end-to-end argument
• NAT possibility must be taken into account by network
application designers, e.g., P2P applications
address shortage should instead be solved by IPv6





ICMP: Internet Control Message Protocol
used by hosts & routers to communicate network-level information
error reporting: unreachable host, network, port, protocol
echo request/reply (used by ping)
network-layer “above” IP:
ICMP msgs carried in IP datagrams
ICMP datagrams are decapsulated and demuxed to the ICMP
ICMP message: type & code plus first 8 bytes of IP datagram causing error

Type  Code  description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     route advertisement
10    0     router discovery
11    0     TTL expired
12    0     bad IP header


Traceroute and ICMP
Source sends series of UDP segments to dest
First has TTL =1
Second has TTL=2, etc.
Unlikely port number
When nth datagram arrives to nth router:
Router discards datagram
And sends to source a “TTL Expired” ICMP message (type 11, code 0)
Message includes name & IP address of the router
When ICMP message arrives, source calculates RTT
Traceroute does this 3 times
Stopping criterion
UDP segment eventually arrives at destination host
Destination returns ICMP “port unreachable” packet (type 3, code 3)
When source gets this ICMP, it stops.




IPv6
Initial motivation: 32-bit address space is soon to be completely allocated.
Additional motivation:
header format helps speed processing/forwarding
header changes are to facilitate QoS
IPv6 datagram format:
fixed-length 40 byte header
no fragmentation allowed at intermediate routers


IPv6 Header (Cont)
Priority (or Traffic Class): identify priority among datagrams in flow
Flow Label: identify datagrams in same “flow”
(the concept of “flow” is not well defined yet)
Next header: identify upper layer protocol for data



Other Changes from IPv4
Fragmentation/reassembly: not allowed any more
Too large packets are dropped by the router
A “Packet Too Big” ICMP error message is sent to source
Checksum: removed entirely to reduce processing
time at each hop
Options: allowed, but outside of header, indicated
by “Next Header” field
ICMPv6: new version of ICMP
additional message types, e.g. “Packet Too Big”
multicast group management functions



Transition From IPv4 To IPv6
Not all routers can be upgraded simultaneously
no “flag day” (all machines are turned off & upgraded together)
How will the network operate with mixed IPv4 and IPv6 routers?
Dual-stack (IPv6/IPv4) nodes:
should be able to know if other nodes are dual-stack: DNS
two IPv6-capable nodes may end up communicating using IPv4 if an intermediate node in between is not IPv6-capable
Tunneling: IPv6 carried as payload in IPv4 datagram
among IPv4 routers



Tunneling
Logical view: A (IPv6) - B (IPv6) == tunnel == E (IPv6) - F (IPv6)
Physical view: A (IPv6) - B (IPv6) - C (IPv4) - D (IPv4) - E (IPv6) - F (IPv6)
A-to-B: IPv6 (Flow: X, Src: A, Dest: F, data)
B-to-C: IPv6 inside IPv4 (IPv4 header with Src: B, Dest: E; payload is the IPv6 datagram with Flow: X, Src: A, Dest: F, data)
D-to-E: IPv6 inside IPv4 (IPv4 header with Src: B, Dest: E; payload is the IPv6 datagram with Flow: X, Src: A, Dest: F, data)
E-to-F: IPv6 (Flow: X, Src: A, Dest: F, data)




Graph abstraction
[Figure: graph with nodes u, v, w, x, y, z; each link is labeled with its cost]
Graph: G = (N,E)
N = set of nodes (or routers) = { u, v, w, x, y, z }
E = set of edges (or links) = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
Remark: Graph abstraction is useful in other network contexts
Example: P2P, where N is set of peers and E is set of TCP connections


Graph abstraction: costs
[Figure: the same graph; u is the default, first-hop, or source router and z is the destination router]
• c(a,b) = cost of link (a,b)
- e.g., c(w,z) = 5
• cost could always be 1, or inversely related to bandwidth, or directly related to congestion

Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)

Question: What’s the least-cost path between u and z ?

Routing algorithm: algorithm that finds least-cost path
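
The path-cost formula can be tried out on a few candidate u-to-z paths. The link costs in `COST` are an assumption: they are read from the textbook's six-node example graph and agree with the Dijkstra table later in this section; the helper names `c` and `path_cost` are illustrative.

```python
# Assumed link costs for the example graph (undirected links).
COST = {
    ("u", "v"): 2, ("u", "x"): 1, ("u", "w"): 5,
    ("v", "x"): 2, ("v", "w"): 3,
    ("x", "w"): 3, ("x", "y"): 1,
    ("w", "y"): 1, ("w", "z"): 5,
    ("y", "z"): 2,
}


def c(a, b):
    """Cost of link (a,b); infinity if a and b are not direct neighbors."""
    return COST.get((a, b), COST.get((b, a), float("inf")))


def path_cost(path):
    """c(x1,x2) + c(x2,x3) + ... + c(xp-1,xp) for a list of nodes."""
    return sum(c(a, b) for a, b in zip(path, path[1:]))


for candidate in (["u", "w", "z"], ["u", "v", "w", "z"], ["u", "x", "y", "z"]):
    print("-".join(candidate), path_cost(candidate))
```

Comparing paths by hand like this does not scale, which is what a routing algorithm is for.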


Routing Algorithm classification
Global or decentralized?
Global:
• all routers have complete topology & link cost info
• “link state” or LS algorithms
Decentralized:
• router only knows its physically-connected neighbors & the link costs to them
• iterative process of computation, exchange of info with neighbors
• “distance vector” or DV algorithms

Static or dynamic?
Static:
• routes change very slowly over time
• e.g. manually updated tables
Dynamic:
• routes change more quickly
• periodic update, or update in response to topology or link cost changes
• adapts to network status ☺
• susceptible to routing loops and oscillation in routes



A Link-State (LS) Routing Algorithm
Dijkstra’s algorithm
• network topology and link costs are known to all nodes
  - accomplished via “link state broadcast”
  - all nodes have the same information
• computes least-cost paths from one node (“source”) to all other nodes
  - gives the forwarding table for that node
• iterative: after k iterations, it knows the least-cost paths to k destinations

Notation:
• c(x,y): the link cost from node x to y; c(x,y) = ∞ if x and y are not direct neighbors
• D(v): the current value of the cost of the path from the source to destination v
• p(v): the predecessor node along the path from the source to v
• N': the set of nodes whose least-cost paths are definitively known



Dijkstra’s Algorithm
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N':
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     least-cost path cost to w plus cost from w to v;
     when D(v) decreases, also set p(v) = w */
until all nodes in N'
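
The pseudocode above can be sketched in Python as follows. The dictionary-of-dictionaries graph format, the function name `dijkstra`, and the `EXAMPLE` link costs (taken from the deck's running example graph) are assumptions of this sketch rather than part of the slides.

```python
import math


def dijkstra(graph, source):
    """Link-state computation of least-cost paths from `source`.

    graph: dict mapping node -> {neighbor: link cost}.
    Returns (D, p): least costs and predecessor nodes.
    """
    n_prime = {source}
    D = {v: graph[source].get(v, math.inf) for v in graph}
    D[source] = 0
    p = {v: source for v in graph[source]}

    while len(n_prime) < len(graph):
        # find w not in N' such that D(w) is a minimum
        w = min((v for v in graph if v not in n_prime), key=lambda v: D[v])
        n_prime.add(w)
        # update D(v) for all v adjacent to w and not in N'
        for v, cost in graph[w].items():
            if v not in n_prime and D[w] + cost < D[v]:
                D[v] = D[w] + cost
                p[v] = w
    return D, p


EXAMPLE = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}

D, p = dijkstra(EXAMPLE, "u")
print(D)
print(p)
```

Ties between equal D(w) values may be broken in a different order than a hand trace would choose, but the final costs and predecessors do not depend on that order.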



Dijkstra’s algorithm: example
Step  N'      D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u       2,u        5,u        1,u        ∞          ∞
1     ux      2,u        4,x                   2,x        ∞
2     uxy     2,u        3,y                              4,y
3     uxyv               3,y                              4,y
4     uxyvw                                               4,y
5     uxyvwz

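[Figure: the example graph, with link costs c(u,v) = 2, c(u,x) = 1, c(u,w) = 5, c(v,x) = 2, c(v,w) = 3, c(x,w) = 3, c(x,y) = 1, c(w,y) = 1, c(w,z) = 5, c(y,z) = 2]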



Dijkstra’s algorithm: example (2)
Resulting shortest-path tree from u:

[Figure: shortest-path tree rooted at u, with edges (u,v), (u,x), (x,y), (y,w), (y,z)]

Resulting forwarding table in u:

destination    link
v              (u,v)
x              (u,x)
y              (u,x)
w              (u,x)
z              (u,x)
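
One way to see how the forwarding table follows from the predecessor values p(v) is to walk each destination back toward the source. The function name `first_hop_links` and the dictionary `P` are illustrative assumptions; the predecessor values themselves come from the example table.

```python
def first_hop_links(source, predecessor):
    """Map each destination to the link (source, first hop) on its least-cost path."""
    table = {}
    for dest in predecessor:
        hop = dest
        # Walk back toward the source until we reach the node adjacent to it.
        while predecessor[hop] != source:
            hop = predecessor[hop]
        table[dest] = (source, hop)
    return table


# Predecessors p(v) taken from the final entries of the example table.
P = {"v": "u", "x": "u", "y": "x", "w": "y", "z": "y"}
TABLE = first_hop_links("u", P)
print(TABLE)
```

Only the first hop is stored, because u forwards a packet on one outgoing link and leaves the rest of the path to the downstream routers.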

Dijkstra’s algorithm, discussion
Algorithm complexity: n nodes
• each iteration: need to check all nodes, w, not in N'
• n + (n-1) + (n-2) + … + 1 = n(n+1)/2 comparisons: O(n²)
• more efficient implementations possible: O(n log n)
Oscillations possible:
• e.g., link cost = amount of carried traffic (see example below)
• solutions:
  - make link costs independent of carried traffic: not a good solution, since routing should avoid congested links
  - ensure that routers do not all run the algorithm at the same time; note that routers can self-synchronize
  - have routers advertise link status at random times ☺

[Figure: four routers A, B, C, D connected in a ring, with link costs equal to the traffic carried (values such as 0, e, 1, 1+e, 2+e). Panels: initially; … recompute routing; … recompute; … recompute. The chosen routes change at every recomputation.]
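
The remark about more efficient implementations can be illustrated with a binary heap, which replaces the linear scan for the minimum D(w). This is a sketch under assumptions: the graph format and the names `dijkstra_heap` and `EXAMPLE` are not from the slides, and with a binary heap the running time is O((n + E) log n), which is not the same bound as the slide's summary figure.

```python
import heapq
import math


def dijkstra_heap(graph, source):
    """Least costs from `source`, using a binary heap to find the minimum D(w)."""
    D = {v: math.inf for v in graph}
    D[source] = 0
    done = set()                      # plays the role of N'
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue                  # stale heap entry for an already settled node
        done.add(w)
        for v, cost in graph[w].items():
            if v not in done and d + cost < D[v]:
                D[v] = d + cost
                heapq.heappush(heap, (D[v], v))
    return D


EXAMPLE = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}

COSTS_FROM_U = dijkstra_heap(EXAMPLE, "u")
print(COSTS_FROM_U)
```

Instead of decreasing a key in place, the sketch pushes a new entry and skips stale ones when they are popped, which keeps the code short at the price of a slightly larger heap.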


Distance Vector (DV) Algorithm
Distributed: each node receives info from its direct neighbors, runs the algorithm, and redistributes the results back to its direct neighbors
Iterative: the same process continues until no more changes are possible
Self-terminating: the algorithm eventually converges and then stops on its own
Asynchronous: does not require all nodes to operate simultaneously

Based on the Bellman-Ford equation (dynamic programming)

Define
dx(y) := the total cost of the least-cost path from x to y

Then
dx(y) = min_v { c(x,v) + dv(y) }

where the min is taken over all neighbors v of x


Bellman-Ford example
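[Figure: the example graph, with link costs c(u,v) = 2, c(u,x) = 1, c(u,w) = 5, c(v,x) = 2, c(v,w) = 3, c(x,w) = 3, c(x,y) = 1, c(w,y) = 1, c(w,z) = 5, c(y,z) = 2]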

Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3

Using B-F equation:


du(z) = min { c(u,v) + dv(z), c(u,x) + dx(z), c(u,w) + dw(z) }
      = min { 2 + 5, 1 + 3, 5 + 3 }
      = 4

The neighbor that achieves the minimum (here x) is the next hop on the least-cost path; that is the entry placed in the forwarding table
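
The same computation can be written as a short function that returns both the minimum and the neighbor achieving it. The name `bellman_ford_step` and its argument layout are assumptions for illustration; the numbers are those of the example.

```python
def bellman_ford_step(link_costs, neighbor_distances):
    """Return (least cost, next hop): the min over neighbors v of c(x,v) + dv(y)."""
    return min(
        (link_costs[v] + neighbor_distances[v], v) for v in link_costs
    )


# Values from the example: c(u,v)=2, c(u,x)=1, c(u,w)=5 and dv(z)=5, dx(z)=3, dw(z)=3.
cost, next_hop = bellman_ford_step(
    {"v": 2, "x": 1, "w": 5},
    {"v": 5, "x": 3, "w": 3},
)
print(cost, next_hop)
```

Returning the pair makes the link to forwarding explicit: the cost is the distance estimate, and the neighbor is the forwarding table entry.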

Distance Vector Algorithm
Dx(y) = estimate of least cost from x to y
Distance vector: Dx = [Dx(y): y ∈ N]
• least-cost estimate to every node in the network
Node x:
• knows the cost to each neighbor v: c(x,v)
• maintains its own distance vector Dx = [Dx(y): y ∈ N]
• maintains its neighbors’ distance vectors
  - for each neighbor v, x maintains Dv = [Dv(y): y ∈ N]



Distance vector algorithm (4)
Basic idea:
• Each node periodically sends its own distance vector estimate to its neighbors
• When a node x receives a new DV estimate from a neighbor, it updates its own DV using the B-F equation:

Dx(y) ← min_v { c(x,v) + Dv(y) } for each node y ∈ N

• Under normal conditions, the estimate Dx(y) converges to the actual least cost dx(y)
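
A minimal sketch of the update rule at a single node, assuming the node keeps the last vector received from each neighbor. The function name `update_distance_vector`, its parameters, and the three-node numbers used in the call are illustrative assumptions.

```python
import math


def update_distance_vector(link_costs, neighbor_vectors, nodes, self_name):
    """Recompute Dx(y) = min_v { c(x,v) + Dv(y) } for every y in nodes.

    link_costs: {neighbor v: c(x,v)}
    neighbor_vectors: {neighbor v: {y: Dv(y)}}, the vectors last received
    """
    new_dv = {}
    for y in nodes:
        if y == self_name:
            new_dv[y] = 0             # a node's cost to itself is zero
            continue
        new_dv[y] = min(
            (link_costs[v] + neighbor_vectors[v].get(y, math.inf)
             for v in link_costs),
            default=math.inf,
        )
    return new_dv


# Hypothetical node x with neighbors y (link cost 2) and z (link cost 7).
dv_x = update_distance_vector(
    {"y": 2, "z": 7},
    {"y": {"x": 2, "y": 0, "z": 1}, "z": {"x": 7, "y": 1, "z": 0}},
    ["x", "y", "z"],
    "x",
)
print(dv_x)
```

A node would send `dv_x` to its neighbors only if it differs from the vector it held before the update.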



Distance Vector Algorithm (5)
Iterative, asynchronous:
• each local iteration is caused by:
  - a local link cost change
  - a DV update message from a neighbor
Distributed:
• each node notifies its neighbors only when its DV changes
  - neighbors then notify their neighbors if necessary

Each node:
  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if DV to any dest has changed, notify neighbors



[Figure: three nodes x, y, z with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

Node x's first update:
Dx(y) = min { c(x,y) + Dy(y), c(x,z) + Dz(y) } = min { 2+0, 7+1 } = 2
Dx(z) = min { c(x,y) + Dy(z), c(x,z) + Dz(z) } = min { 2+1, 7+0 } = 3

Each row is a distance vector held by the node ("from"), with one column per destination ("cost to"). Time advances from left to right.

node x table
            cost to        cost to        cost to
            x   y   z      x   y   z      x   y   z
from x      0   2   7      0   2   3      0   2   3
from y      ∞   ∞   ∞      2   0   1      2   0   1
from z      ∞   ∞   ∞      7   1   0      3   1   0

node y table
            cost to        cost to        cost to
            x   y   z      x   y   z      x   y   z
from x      ∞   ∞   ∞      0   2   7      0   2   3
from y      2   0   1      2   0   1      2   0   1
from z      ∞   ∞   ∞      7   1   0      3   1   0

node z table
            cost to        cost to        cost to
            x   y   z      x   y   z      x   y   z
from x      ∞   ∞   ∞      0   2   7      0   2   3
from y      ∞   ∞   ∞      2   0   1      2   0   1
from z      7   1   0      3   1   0      3   1   0

time →
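
The three-node example can be replayed with a small simulation. This is a simplified sketch: it assumes synchronous rounds in which every node recomputes from its neighbors' latest vectors, whereas the algorithm itself is asynchronous, and the names `COSTS`, `initial_vector`, and `simulate` are made up for illustration.

```python
import math

# Link costs of the three-node example: c(x,y) = 2, c(y,z) = 1, c(x,z) = 7.
COSTS = {"x": {"y": 2, "z": 7}, "y": {"x": 2, "z": 1}, "z": {"x": 7, "y": 1}}
NODES = sorted(COSTS)


def initial_vector(node):
    """Before any exchange a node knows only its direct link costs."""
    return {y: 0 if y == node else COSTS[node].get(y, math.inf) for y in NODES}


def simulate():
    """Apply the B-F update in rounds until no distance vector changes."""
    vectors = {n: initial_vector(n) for n in NODES}
    history = [vectors]
    while True:
        new_vectors = {
            n: {
                y: 0 if y == n else min(
                    COSTS[n][v] + vectors[v][y] for v in COSTS[n]
                )
                for y in NODES
            }
            for n in NODES
        }
        if new_vectors == vectors:
            return history            # converged: nobody has anything new to send
        history.append(new_vectors)
        vectors = new_vectors


HISTORY = simulate()
for step, vectors in enumerate(HISTORY):
    print(step, vectors)
```

Each entry of `HISTORY` holds every node's own distance vector after one round, which corresponds to the "from" row for that node in its table.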

Distance Vector: link cost changes
Link cost changes:
• node detects local link cost change
• updates routing info, recalculates distance vector
• if DV changes, notify neighbors

[Figure: three nodes x, y, z with c(y,z) = 1 and c(x,z) = 50; the cost of link (x,y) changes from 4 to 1]

“Good news travels fast”:
• At time t0, y detects the link-cost change, updates its DV, and informs its neighbors.
• At time t1, z receives the update from y and updates its table. It computes a new least cost to x and sends its neighbors its DV.
• At time t2, y receives z’s update and updates its distance table. y’s least costs do not change and hence y does not send any message to z.



Distance Vector: link cost changes
Link cost changes:
• good news travels fast
• bad news travels slow: the “count to infinity” problem!
• 44 iterations before the algorithm stabilizes

[Figure: three nodes x, y, z with c(y,z) = 1 and c(x,z) = 50; the cost of link (x,y) changes from 4 to 60]

Before the change:
• y → x = 4
• z → y → x = 5
After the change, y doesn’t know that z routes to x via y:
• y → z → x = 1 + 5 = 6: a routing loop is created
• y informs z of the change
• z changes to z → y → x = 6 + 1 = 7
• z informs y of the change
• and so on …

Poisoned-reverse solution:
• If z routes through y to get to x, z tells y that its (z’s) distance to x is infinity (so y won’t route to x via z)
• Will this completely solve the count-to-infinity problem? No: loops involving three or more nodes are not detected.
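
The back-and-forth between y and z can be replayed in a few lines. This sketch assumes y and z strictly alternate and tracks only their cost to x; the function name `count_updates` and its bookkeeping are assumptions, and the exact number of steps reported depends on how one counts exchanges.

```python
INF = float("inf")


def count_updates(poisoned_reverse):
    """Return (updates, Dy(x), Dz(x)) after c(x,y) rises from 4 to 60."""
    c_yx, c_yz, c_zx = 60, 1, 50         # link costs after the change
    d = {"y": 4, "z": 5}                 # least-cost estimates to x before the change
    via_other = {"y": False, "z": True}  # z reaches x through y; y uses its direct link
    direct = {"y": c_yx, "z": c_zx}
    updates = 0
    node, other = "y", "z"               # y detects the change first
    while True:
        # Poisoned reverse: a node routing through its neighbor advertises infinity to it.
        advertised = INF if (poisoned_reverse and via_other[other]) else d[other]
        through_neighbor = c_yz + advertised
        new_cost = min(direct[node], through_neighbor)
        if new_cost == d[node]:
            return updates, d["y"], d["z"]   # nothing changed, so no message is sent
        d[node] = new_cost
        via_other[node] = through_neighbor < direct[node]
        updates += 1
        node, other = other, node        # the updated node notifies its neighbor


SLOW = count_updates(poisoned_reverse=False)
FAST = count_updates(poisoned_reverse=True)
print(SLOW, FAST)
```

Both runs end with the same costs, but the run without poisoned reverse creeps upward one unit at a time until z's estimate through y exceeds its direct cost of 50.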



Comparison of LS and DV algorithms
Message complexity
• LS: with n nodes, E links, O(nE) msgs sent
• DV: exchange between neighbors only
  - convergence time varies
Speed of convergence
• LS: O(n²) algorithm requires O(nE) msgs
  - may have oscillations
• DV: convergence time varies
  - may be routing loops
  - count-to-infinity problem
Robustness: what happens if a router malfunctions?
• LS (more robust):
  - node can advertise an incorrect link cost
  - each node computes only its own table
• DV (less robust):
  - DV node can advertise an incorrect path cost
  - each node’s table is used by others, so errors propagate through the network



Hierarchical Routing
Our routing study thus far is an idealization:
• all routers identical
• network “flat”
… not true in practice

There are two problems:

scale:
• there are over 200 million destinations
• can’t store all dest’s in routing tables!
• routing table exchange would swamp links!

administrative autonomy:
• internet = network of networks
• each network admin may want to control routing in its own network



Hierarchical Routing
Solution:
• organize routers into “autonomous systems” (ASes)
• each AS is a group of routers under the same administrative control
• routers in the same AS run the same routing protocol
  - the “intra-AS” routing protocol
  - routers in different ASes can run different intra-AS routing protocols

Gateway router
• has a direct link to a router in another AS



Interconnected ASes
[Figure: three interconnected autonomous systems: AS1 (routers 1a, 1b, 1c, 1d), AS2 (routers 2a, 2b, 2c), and AS3 (routers 3a, 3b, 3c). Inside a router, the intra-AS routing algorithm and the inter-AS routing algorithm both feed the forwarding table.]

Forwarding table is configured by both the intra- and inter-AS routing algorithms
• intra-AS sets entries for internal dests
• inter-AS & intra-AS set entries for external dests

same inter-AS routing protocol in all ASes



Inter-AS tasks
Suppose a router in AS1 receives a datagram whose dest is outside of AS1
• the router should forward the packet towards one of the gateway routers, but which one?

AS1 needs to:
1. learn which dests are reachable through AS2 and which through AS3
2. propagate this reachability info to all routers in AS1
A job of inter-AS routing!

[Figure: the interconnected ASes AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), and AS3 (3a, 3b, 3c)]

Example: Setting forwarding table in router 1d

• Suppose AS1 learns from the inter-AS protocol that subnet x is reachable via AS3 (gateway 1c) but not via AS2.
• The inter-AS protocol propagates this reachability info to all internal routers.
• Router 1d determines from intra-AS routing info that its interface I is on the least-cost path to 1c.
• It puts the entry (x,I) in its forwarding table.



Example: Choosing among multiple ASes
• Now suppose AS1 learns from the inter-AS protocol that subnet x is reachable from AS3 and from AS2.
• To configure its forwarding table, router 1d must determine towards which gateway it should forward packets for dest x.
• Hot potato routing: send the packet towards the closest (least-cost) of the two gateway routers.

1. Learn from the inter-AS protocol that subnet x is reachable via multiple gateways
2. Use routing info from the intra-AS protocol to determine the costs of the least-cost paths to each of the gateways
3. Hot potato routing: choose the gateway that has the smallest least cost
4. Determine from the forwarding table the interface I that leads to the least-cost gateway; enter (x,I) in the forwarding table
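
The four steps can be sketched as a small selection function. Everything concrete here is hypothetical: the slides give no costs or interface names, the choice of 1b as the gateway toward AS2 is an assumption, and `hot_potato` is an illustrative name.

```python
def hot_potato(gateway_costs, interface_to):
    """Pick the gateway with the smallest intra-AS least cost.

    gateway_costs: {gateway: least cost from this router to the gateway}
    interface_to: {gateway: outgoing interface on the least-cost path to it}
    Returns (gateway, interface).
    """
    gateway = min(gateway_costs, key=gateway_costs.get)
    return gateway, interface_to[gateway]


# Hypothetical values as seen by router 1d; the slide gives no numbers.
gateway, interface = hot_potato(
    {"1c": 2, "1b": 3},
    {"1c": "I1", "1b": "I2"},
)
forwarding_table = {"x": interface}   # step 4: enter (x, I)
print(gateway, forwarding_table)
```

The router gets the packet out of its own AS as cheaply as possible and ignores the cost of the rest of the path, which is what gives the scheme its name.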
