Transport Layer
you note that they are adapted from (or perhaps identical to) our slides, and
note our copyright of this material. 2007.
Thanks and enjoy! JFK/KWR
All material copyright 1996-2007
J.F Kurose and K.W. Ross, All Rights Reserved
Transport Layer 3-1
Chapter 3: Transport Layer
Our goals:
understand principles behind transport layer services:
  multiplexing/demultiplexing
  reliable data transfer
  flow control
  congestion control
learn about transport layer protocols in the Internet:
  UDP: connectionless transport
  TCP: connection-oriented transport
  TCP congestion control
Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control
Transport services and protocols
provide logical communication between app processes running on different hosts
transport protocols run in end systems
  send side: breaks app messages into segments, passes to network layer
  rcv side: reassembles segments into messages, passes to app layer
more than one transport protocol available to apps
  Internet: TCP and UDP
[Figure: protocol stacks (application, transport, network, data link, physical) at the two end systems]
Transport vs. network layer
network layer: logical communication between hosts
transport layer: logical communication between processes
  relies on, enhances, network layer services
Household analogy:
12 kids sending letters to 12 kids
  processes = kids
  app messages = letters in envelopes
  hosts = houses
  transport protocol = Ann and Bill
  network-layer protocol = postal service
Internet transport-layer protocols
reliable, in-order delivery (TCP)
  congestion control
  flow control
  connection setup
unreliable, unordered delivery: UDP
  no-frills extension of best-effort IP
services not available:
  delay guarantees
  bandwidth guarantees
[Figure: application and transport layers exist only at the end hosts; routers along the path implement network, data link, and physical layers]
Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
[Figure: host 1, host 2, and host 3 running processes P1 to P4, each process attached to a socket at the application layer]
How demultiplexing works
host receives IP datagrams
  each datagram has source IP address, destination IP address
[Figure: TCP/UDP segment format, 32 bits wide, beginning with source port # and dest port #]
Connectionless demultiplexing
Create sockets with port numbers:
DatagramSocket mySocket1 = new DatagramSocket(12534);
DatagramSocket mySocket2 = new DatagramSocket(12535);
UDP socket identified by two-tuple:
(dest IP address, dest port number)
When host receives UDP segment:
  checks destination port number in segment
  directs UDP segment to socket with that port number
IP datagrams with different source IP addresses and/or source port numbers directed to same socket
Connectionless demux (cont)
DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: processes P1, P2, P3 exchanging UDP segments with the server socket]
SP provides return address
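The two-tuple rule above can be sketched in a few lines of Python. This is a minimal illustration, not how any real stack is written: the class name `UdpDemux`, the host labels, and the client port numbers are assumptions chosen for the example; only port 6428 comes from the slide.

```python
class UdpDemux:
    """Toy UDP demultiplexer: a socket is keyed only by (dest IP, dest port)."""

    def __init__(self):
        # (dest_ip, dest_port) -> list of (src_ip, src_port, payload)
        self.sockets = {}

    def bind(self, ip, port):
        self.sockets[(ip, port)] = []

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        queue = self.sockets.get((dst_ip, dst_port))
        if queue is None:
            return False          # no socket bound to that port: segment dropped
        # The source address/port ride along as the "return address" only.
        queue.append((src_ip, src_port, payload))
        return True


demux = UdpDemux()
demux.bind("C", 6428)
demux.deliver("A", 9157, "C", 6428, b"hello from A")
demux.deliver("B", 5775, "C", 6428, b"hello from B")
```

Both segments land in the same queue even though they come from different sources, which is the point of the last bullet on the slide.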
Connection-oriented demux
TCP socket identified by 4-tuple:
  source IP address
  source port number
  dest IP address
  dest port number
recv host uses all four values to direct segment to appropriate socket
Server host may support many simultaneous TCP sockets:
  each socket identified by its own 4-tuple
Web servers have different sockets for each connecting client
  non-persistent HTTP will have different socket for each request
Connection-oriented demux (cont)
[Figure: clients P1, P2, P3 send segments to server C on port 80 (e.g., SP: 5775, DP: 80, S-IP: B, D-IP: C); processes P4, P5, P6 each hold one connection socket]
Connection-oriented demux: Threaded Web Server
[Figure: same scenario (e.g., SP: 5775, DP: 80, S-IP: B, D-IP: C) with a single server process P4 holding all connection sockets]
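For contrast with the UDP case, here is a small Python sketch of four-tuple demultiplexing. The class `TcpDemux` and its methods are hypothetical names for illustration; the values SP 5775, DP 80, S-IP B, D-IP C are taken from the figure, while the second client's port (9157) is an assumption.

```python
class TcpDemux:
    """Toy TCP demultiplexer: one socket per (src IP, src port, dst IP, dst port)."""

    def __init__(self):
        self.connections = {}     # 4-tuple -> list of payloads

    def open(self, src_ip, src_port, dst_ip, dst_port):
        self.connections[(src_ip, src_port, dst_ip, dst_port)] = []

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.connections:
            return None           # no established connection matches all four values
        self.connections[key].append(payload)
        return key


server = TcpDemux()
server.open("B", 5775, "C", 80)
server.open("A", 9157, "C", 80)
k1 = server.deliver("B", 5775, "C", 80, b"GET /b")
k2 = server.deliver("A", 9157, "C", 80, b"GET /a")
```

Both segments target port 80 on host C, yet they reach different sockets because the source half of the tuple differs.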
UDP: User Datagram Protocol [RFC 768]
no frills, bare bones Internet transport protocol
best effort service, UDP segments may be:
  lost
  delivered out of order to app
connectionless:
  no handshaking between UDP sender, receiver
  each UDP segment handled independently of others
Why is there a UDP?
  no connection establishment (which can add delay)
  simple: no connection state at sender, receiver
  small segment header
  no congestion control: UDP can blast away as fast as desired
UDP: more
often used for streaming multimedia apps
error recovery!
[Figure: UDP segment format, 32 bits wide]
UDP checksum
Goal: detect errors (e.g., flipped bits) in transmitted segment
Sender:
  treat segment contents as sequence of 16-bit integers
  checksum: addition (1's complement sum) of segment contents
  sender puts checksum value into UDP checksum field
Receiver:
  compute checksum of received segment
  check if computed checksum equals checksum field value:
    NO - error detected
    YES - no error detected. But maybe errors nonetheless? More later.
Internet Checksum Example
Note
When adding numbers, a carryout from the most significant bit needs to be added to the result
Example: add two 16-bit integers
             1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
             1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum          1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum     0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
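The worked example above can be checked mechanically. The following Python sketch computes the one's complement sum with carry wraparound; the function name `internet_checksum` is an assumption for this illustration, and real UDP also covers a pseudo-header that is left out here.

```python
def internet_checksum(words):
    """One's complement of the one's complement sum of 16-bit words."""
    total = 0
    for w in words:
        total += w
        # Wrap any carry out of bit 15 back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101
checksum = internet_checksum([a, b])
print(f"{checksum:016b}")

# Receiver side: summing the data words plus the checksum gives all ones,
# so running the same routine over everything yields 0 when nothing flipped.
receiver_check = internet_checksum([a, b, checksum])
```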
Principles of Reliable data transfer
important in app., transport, link layers
top-10 list of important networking topics!
characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)
Reliable data transfer: getting started
send side:
  rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
  udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
  rdt_rcv(): called when packet arrives on rcv-side of channel
  deliver_data(): called by rdt to deliver data to upper layer
Reliable data transfer: getting started
We'll:
  incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
  consider only unidirectional data transfer
    but control info will flow in both directions!
  use finite state machines (FSM) to specify sender, receiver
[Figure: FSM notation: state 1 and state 2 joined by an arrow labeled with the event causing the state transition, above the actions taken on the state transition]
state: when in this state, next state uniquely determined by next event
Rdt1.0: reliable transfer over a reliable channel
underlying channel perfectly reliable
  no bit errors
  no loss of packets
[Figure: rdt1.0 sender and receiver FSMs]
Rdt2.0: channel with bit errors
underlying channel may flip bits in packet
checksum to detect bit errors
rdt2.0: FSM specification
sender
  state "Wait for call from above":
    rdt_send(data)
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      -> "Wait for ACK or NAK"
  state "Wait for ACK or NAK":
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      (no action)
      -> "Wait for call from above"
receiver
  state "Wait for call from below":
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt,data)
      deliver_data(data)
      udt_send(ACK)
rdt2.0: operation with no errors
[Figure: the same rdt2.0 sender and receiver FSMs as on the FSM specification slide]
rdt2.0: error scenario
[Figure: the same rdt2.0 sender and receiver FSMs as on the FSM specification slide]
rdt2.0 has a fatal flaw!
What happens if ACK/NAK corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate
Handling duplicates:
  sender retransmits current pkt if ACK/NAK garbled
  sender adds sequence number to each pkt
  receiver discards (doesn't deliver up) duplicate pkt
response
rdt2.1: sender, handles garbled ACK/NAKs
state "Wait for call 0 from above":
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> "Wait for ACK or NAK 0"
state "Wait for ACK or NAK 0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (no action)
    -> "Wait for call 1 from above"
state "Wait for call 1 from above":
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> "Wait for ACK or NAK 1"
state "Wait for ACK or NAK 1":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (no action)
    -> "Wait for call 0 from above"
rdt2.1: receiver, handles garbled ACK/NAKs
state "Wait for 0 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> "Wait for 1 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state "Wait for 1 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> "Wait for 0 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
rdt2.1: discussion
Sender:
  seq # added to pkt
  two seq. #'s (0,1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must remember whether current pkt has 0 or 1 seq. #
Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver can not know if its last ACK/NAK received OK at sender
rdt2.2: a NAK-free protocol
rdt2.2: sender, receiver fragments
sender FSM fragment
  state "Wait for call 0 from above":
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> "Wait for ACK 0"
  state "Wait for ACK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
      Λ (no action)
receiver FSM fragment
  state "Wait for 0 from below":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)
  transition into "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt,data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)
rdt3.0: channels with errors and loss
# of pkt being ACKed
requires countdown timer
rdt3.0 sender
state "Wait for call 0 from above":
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> "Wait for ACK0"
  rdt_rcv(rcvpkt)
    Λ (no action)
state "Wait for ACK0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    -> "Wait for call 1 from above"
state "Wait for call 1 from above":
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> "Wait for ACK1"
  rdt_rcv(rcvpkt)
    Λ (no action)
state "Wait for ACK1":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    -> "Wait for call 0 from above"
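To see the alternating-bit idea at work, here is a small deterministic Python simulation of stop-and-wait over a channel that loses packets. It is a sketch under simplifying assumptions: corruption is not modeled, a timeout is represented simply by looping to retransmit, and `drop_data` / `drop_ack` are hypothetical test hooks naming which transmissions the channel loses.

```python
def run_rdt3(messages, drop_data=(), drop_ack=()):
    """Simulate stop-and-wait with 1-bit sequence numbers.

    drop_data / drop_ack: 0-based indices of data transmissions / ACK
    transmissions that the channel loses. Returns (delivered, data_sends).
    """
    delivered = []
    expected = 0                     # receiver: seq # it is waiting for
    seq = 0                          # sender: seq # of the current packet
    data_sends = ack_sends = 0
    for msg in messages:
        while True:
            # Sender: udt_send(sndpkt), start_timer.
            lost = data_sends in drop_data
            data_sends += 1
            if lost:
                continue             # timer expires, retransmit
            # Receiver: deliver only if this is the expected seq #.
            if seq == expected:
                delivered.append(msg)
                expected ^= 1
            # Receiver ACKs the seq # it saw, duplicate or not.
            ack_lost = ack_sends in drop_ack
            ack_sends += 1
            if ack_lost:
                continue             # timer expires, retransmit a duplicate
            break                    # matching ACK arrived: stop_timer
        seq ^= 1
    return delivered, data_sends


delivered, data_sends = run_rdt3(["a", "b", "c"], drop_data={1}, drop_ack={0})
```

In this run the first ACK and the first retransmission are both lost, so packet "a" crosses the channel twice; the sequence bit lets the receiver discard the duplicate.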
rdt3.0 in action
Performance of rdt3.0
$d_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^{9}\ \text{bps}} = 8\ \text{microseconds}$
$U_{sender}$: utilization, the fraction of time sender is busy sending
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$ (times in milliseconds: $L/R$ = 0.008 msec, RTT = 30 msec)
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
rdt3.0: stop-and-wait operation
[Figure: sender/receiver timing diagram]
first packet bit transmitted, t = 0
last packet bit transmitted, t = L / R
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$ (times in milliseconds)
Pipelined protocols
Pipelining: sender allows multiple, in-flight, yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver
Two generic forms of pipelined protocols: go-Back-N, selective repeat
Pipelining: increased utilization
[Figure: sender/receiver timing diagram]
first packet bit transmitted, t = 0
last bit transmitted, t = L / R
Increase utilization by a factor of 3!
$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$ (times in milliseconds)
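The utilization numbers on the stop-and-wait and pipelining slides above can be reproduced with a short calculation. This Python sketch assumes an RTT of 30 msec (implied by the 30.008 denominator) and uses a hypothetical helper name, `sender_utilization`.

```python
def sender_utilization(L_bits, R_bps, rtt_s, window=1):
    """Fraction of time the sender is busy when `window` packets are sent per RTT."""
    t_trans = L_bits / R_bps                 # time to push one packet onto the link
    return min(1.0, window * t_trans / (rtt_s + t_trans))


L_bits, R_bps, rtt_s = 8000, 1e9, 0.030     # 1 Gbps link, 8000-bit packet, 30 msec RTT

u_stop_and_wait = sender_utilization(L_bits, R_bps, rtt_s)
u_pipelined_3 = sender_utilization(L_bits, R_bps, rtt_s, window=3)

# Stop-and-wait throughput: one packet per (RTT + L/R), expressed in bytes/sec.
throughput_bytes = L_bits / (rtt_s + L_bits / R_bps) / 8

print(round(u_stop_and_wait, 5), round(u_pipelined_3, 4), round(throughput_bytes))
```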
Pipelining Protocols
Go-back-N: big picture:
  Sender can have up to N unacked packets in pipeline
  Rcvr only sends cumulative acks
    Doesn't ack packet if there's a gap
  Sender has timer for oldest unacked packet
    If timer expires, retransmit all unacked packets
Selective Repeat: big pic
  Sender can have up to N unacked packets in pipeline
  Rcvr acks individual packets
  Sender maintains timer for each unacked packet
    When timer expires, retransmit only unack packet
Selective repeat: big picture
Sender can have up to N unacked packets in pipeline
Rcvr acks individual packets
Sender maintains timer for each unacked packet
  When timer expires, retransmit only unack packet
Go-Back-N
Sender:
k-bit seq # in pkt header
window of up to N, consecutive unacked pkts allowed
timer for each in-flight pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
GBN: sender extended FSM
initially:
  base=1
  nextseqnum=1
state "Wait":
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
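The sender FSM above translates fairly directly into code. The Python sketch below mirrors its three events; the class `GBNSender`, the `sent` log (standing in for udt_send), and the boolean `timer_running` flag are assumptions made so the behavior can be observed without a real channel or clock.

```python
class GBNSender:
    """Go-Back-N sender following the extended FSM: base, nextseqnum, window N."""

    def __init__(self, N):
        self.N = N
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq # -> packet kept for retransmission
        self.timer_running = False
        self.sent = []                # log of seq #s handed to the channel

    def rdt_send(self, data):
        if self.nextseqnum < self.base + self.N:
            self.sndpkt[self.nextseqnum] = (self.nextseqnum, data)
            self.sent.append(self.nextseqnum)
            if self.base == self.nextseqnum:
                self.timer_running = True      # start_timer
            self.nextseqnum += 1
            return True
        return False                           # refuse_data: window is full

    def timeout(self):
        self.timer_running = True              # start_timer
        for n in range(self.base, self.nextseqnum):
            self.sent.append(n)                # resend every unacked packet

    def rdt_rcv_ack(self, acknum):
        self.base = acknum + 1                 # cumulative ACK
        self.timer_running = self.base != self.nextseqnum


s = GBNSender(N=4)
accepted = [s.rdt_send(x) for x in "abcde"]    # fifth call exceeds the window
s.rdt_rcv_ack(2)                               # packets 1 and 2 acknowledged
s.timeout()                                    # packets 3 and 4 go out again
```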
GBN: receiver extended FSM
initially:
  expectedseqnum=1
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state "Wait":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default
    udt_send(sndpkt)
out-of-order pkt:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
GBN in action
Selective Repeat
receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
Selective repeat: sender, receiver windows
Selective repeat
sender
data from above:
  if next available seq # in window, send pkt
timeout(n):
  resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver
pkt n in [rcvbase, rcvbase+N-1]
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
  ACK(n)
otherwise:
  ignore
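The receiver rules above can be sketched as a single method. This Python illustration uses unbounded integer sequence numbers rather than a finite, wrapping sequence space (a simplifying assumption), and `SRReceiver` with its `receive` method are hypothetical names.

```python
class SRReceiver:
    """Selective-repeat receiver: ACK individually, buffer out-of-order packets."""

    def __init__(self, N):
        self.N = N
        self.rcvbase = 0
        self.buffer = {}              # seq # -> data waiting for in-order delivery
        self.delivered = []

    def receive(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                  # already delivered: re-ACK so sender can advance
        return None                   # otherwise: ignore


r = SRReceiver(N=4)
acks = [r.receive(n, f"p{n}") for n in (0, 2, 3, 1, 0, 9)]
```

Packets 2 and 3 wait in the buffer until packet 1 fills the gap; the late duplicate of packet 0 is re-ACKed but not delivered again.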
Selective repeat in action
Selective repeat: dilemma
Example:
  seq #'s: 0, 1, 2, 3
  window size=3
  receiver sees no difference in two scenarios!
  incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?
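One way to explore the question above is to check when a retransmitted old packet can fall inside the receiver's new window. The Python sketch below does that for a wrapping sequence space; the helper names are assumptions, and the scenario modeled (receiver accepted a full window, all ACKs lost) is one reading of the slide's example. Under that reading the ambiguity disappears exactly when the window is at most half the sequence-number space.

```python
def window(base, N, seq_space):
    """Set of sequence numbers covered by a window of size N starting at base."""
    return {(base + i) % seq_space for i in range(N)}


def ambiguous(N, seq_space):
    """True if an old, retransmitted packet can be mistaken for a new one."""
    old = window(0, N, seq_space)                 # packets the sender may resend
    new = window(N % seq_space, N, seq_space)     # receiver window after accepting all N
    return bool(old & new)


print(ambiguous(3, 4))   # the slide's example: seq #s 0..3, window size 3
print(ambiguous(2, 4))
```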
TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the other application reads data through its socket door]
TCP segment structure
[Figure: TCP segment format, 32 bits wide]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags U A P R S F, Receive window
  checksum, Urg data pnter
  Options (variable length)
  application data (variable length)
Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Internet checksum (as in UDP)
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  Receive window: # bytes rcvr willing to accept
TCP seq. #s and ACKs
Seq. #'s:
  byte stream number of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how receiver handles out-of-order segments
  A: TCP spec doesn't say, - up to implementor
[Figure: simple telnet scenario between Host A and Host B over time: User types C; host ACKs receipt of C, echoes back C; host ACKs receipt of echoed C]
TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT smoother
    average several recent
TCP Round Trip Time and Timeout
$\mathrm{EstimatedRTT} = (1-\alpha) \cdot \mathrm{EstimatedRTT} + \alpha \cdot \mathrm{SampleRTT}$
Example RTT estimation:
[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), ticks from 100 to 350; series: SampleRTT, Estimated RTT]
TCP Round Trip Time and Timeout
Setting the timeout
EstimatedRTT plus safety margin
  large variation in EstimatedRTT -> larger safety margin
first estimate of how much SampleRTT deviates from EstimatedRTT:
$\mathrm{DevRTT} = (1-\beta) \cdot \mathrm{DevRTT} + \beta \cdot |\mathrm{SampleRTT} - \mathrm{EstimatedRTT}|$
(typically, $\beta = 0.25$)
$\mathrm{TimeoutInterval} = \mathrm{EstimatedRTT} + 4 \cdot \mathrm{DevRTT}$
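The three formulas above combine into a small estimator. In this Python sketch, the default α = 0.125, the initial DevRTT of half the first sample, and the choice to update DevRTT before EstimatedRTT are assumptions (they follow common practice, not anything stated on these slides); β = 0.25 and the factor 4 come from the slide.

```python
class RttEstimator:
    """Exponentially weighted RTT estimate plus deviation-based timeout."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                    # assumed typical weight for new samples
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2       # assumed starting deviation

    def update(self, sample_rtt):
        # Deviation is measured against the estimate from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)        # milliseconds
timeout = est.update(120.0)
```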
TCP reliable data transfer
TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer
Retransmissions are triggered by:
  timeout events
  duplicate acks
Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control
TCP sender events:
data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments
NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum
...
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
} /* end of loop forever */
TCP: retransmission scenarios
[Figure: two Host A / Host B timing diagrams. Lost ACK scenario: Seq=92 timeout, loss marked X, SendBase = 100. Premature timeout: Seq=92 timeout, SendBase = 100, then SendBase = 120.]
TCP retransmission scenarios (more)
[Figure: Host A / Host B timing diagram, cumulative ACK scenario: loss marked X within the timeout interval, SendBase = 120]
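The three sender events described earlier (data from app, timeout, ACK received) can be exercised against the sequence numbers used in the scenarios above. This Python sketch is a simplification under stated assumptions: the class `SimpleTcpSender`, the `wire` log standing in for passing segments to IP, and the boolean timer are illustrative, and ACKs are assumed to fall on segment boundaries.

```python
class SimpleTcpSender:
    """Simplified TCP sender: single timer, cumulative ACKs, no dup-ACK handling."""

    def __init__(self, initial_seq):
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.unacked = {}              # seq # -> data not yet acknowledged
        self.timer_running = False
        self.wire = []                 # log of (seq #, data) handed to IP

    def data_from_app(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        self.wire.append((seq, data))
        self.timer_running = True      # start timer if not already running
        self.next_seq_num += len(data)

    def timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)        # oldest not-yet-acknowledged segment
        self.wire.append((seq, self.unacked[seq]))
        self.timer_running = True      # restart timer

    def ack_received(self, y):
        if y > self.send_base:
            self.send_base = y
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


tcp = SimpleTcpSender(initial_seq=92)
tcp.data_from_app(b"x" * 8)            # segment with Seq=92, 8 bytes
tcp.data_from_app(b"y" * 20)           # segment with Seq=100, 20 bytes
tcp.timeout()                          # only the Seq=92 segment is resent
```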
TCP ACK generation [RFC 1122, RFC 2581]
Arrival of segment that partially or completely fills gap -> Immediate send ACK, provided that segment starts at lower end of gap
Fast Retransmit
Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs.
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs.
If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  fast retransmit: resend segment before timer expires
Figure 3.37 Resending a segment after triple duplicate ACK
[Figure: Host A / Host B timing diagram with a lost segment (X) and the timeout interval]
Fast retransmit algorithm:
a duplicate ACK for already ACKed segment
fast retransmit
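Since most of the algorithm text is missing here, the following Python sketch is only one plausible reading of it, based on the Fast Retransmit slide and the "triple duplicate ACK" figure title: count duplicates of the current cumulative ACK and resend on the third duplicate. The class name, the threshold parameter, and the return convention are assumptions.

```python
class FastRetransmit:
    """Track duplicate ACKs and signal a retransmission on the third duplicate."""

    def __init__(self, send_base, dup_threshold=3):
        self.send_base = send_base
        self.dup_threshold = dup_threshold
        self.dup_count = 0

    def on_ack(self, y):
        """Return the seq # to resend immediately, or None."""
        if y > self.send_base:
            self.send_base = y         # new data acknowledged
            self.dup_count = 0
            return None
        if y == self.send_base:
            self.dup_count += 1        # duplicate ACK for already ACKed data
            if self.dup_count == self.dup_threshold:
                return y               # resend segment starting at byte y
        return None


fr = FastRetransmit(send_base=92)
actions = [fr.on_ack(a) for a in (100, 100, 100, 100, 100, 120)]
```

The first ACK of 100 is new; the next three are duplicates, and the third of those triggers the resend before any timer would expire.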
TCP Flow Control
receive side of TCP connection has a receive buffer:
  app process may be slow at reading from buffer
flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
speed-matching service: matching the send rate to the receiving app's drain rate
TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
$\mathrm{RcvWindow} = \mathrm{RcvBuffer} - [\mathrm{LastByteRcvd} - \mathrm{LastByteRead}]$
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
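The window arithmetic above is simple enough to check by hand, but a short Python sketch makes the two sides explicit. Function names and the byte counts in the example are assumptions for illustration.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: buffer size minus unread bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(n_bytes, last_byte_sent, last_byte_acked, window):
    """Sender keeps unACKed data (including the new bytes) within RcvWindow."""
    return (last_byte_sent - last_byte_acked) + n_bytes <= window


advertised = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok = sender_may_send(1000, last_byte_sent=5000, last_byte_acked=4000, window=advertised)
blocked = sender_may_send(1000, last_byte_sent=5200, last_byte_acked=4000, window=advertised)
```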
TCP Connection Management
Recall: TCP sender, receiver establish connection before exchanging data segments
  initialize TCP variables:
    seq. #s
Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client host sends TCP SYN segment to server
  specifies initial seq #
  no data
replies with ACK segment, which may contain data
TCP Connection Management (cont.)
client closes socket:
clientSocket.close();
FIN.
[Figure: client/server timing diagram with labels close and closed]
TCP Connection Management (cont.)
Note: with small modification, can handle simultaneous FINs.
[Figure: client/server timing diagram with labels timed wait and closed]
TCP Connection Management (cont)
[Figure: TCP server lifecycle and TCP client lifecycle state diagrams]
Principles of Congestion Control
Congestion:
informally: too many sources sending too much data too fast for network to handle
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queueing in router buffers)
a top-10 problem!
Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
[Figure: Host A and Host B send original data at rate $\lambda_{in}$ through a router with unlimited shared output link buffers; $\lambda_{out}$ is the delivered rate]
large delays when congested
maximum achievable throughput
Causes/costs of congestion: scenario 2
Causes/costs of congestion: scenario 2
always: $\lambda_{in} = \lambda_{out}$ (goodput)
perfect retransmission only when loss: $\lambda'_{in} > \lambda_{out}$
retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$
[Figure: three plots (a., b., c.) of $\lambda_{out}$ against $\lambda_{in}$, axes up to R/2; tick marks R/3 and R/4 appear on the $\lambda_{out}$ axis]
costs of congestion:
  more work (retrans) for given goodput
  unneeded retransmissions: link carries multiple copies of pkt
Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit
Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?
[Figure: Host A and Host B; $\lambda_{in}$: original data; $\lambda'_{in}$: original data, plus retransmitted data; $\lambda_{out}$; finite shared output link buffers]
Causes/costs of congestion: scenario 3
[Figure: diagram labeled Host A, Host B, and $\lambda_{out}$]
capacity used for that packet was wasted!
Approaches towards congestion control
Two broad approaches towards congestion control:
should send at
Case study: ATM ABR congestion control
receiver, with bits intact
Case study: ATM ABR congestion control
if data cell preceding RM cell has EFCI set, sender sets CI bit in returned RM cell
TCP congestion control: additive increase,
multiplicative decrease
Approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs
  additive increase: increase CongWin by 1 MSS every RTT until loss detected
  multiplicative decrease: cut CongWin in half after loss
[Figure: congestion window size against time, y-axis ticks at 8 Kbytes, 16 Kbytes, 24 Kbytes. Saw tooth behavior: probing for bandwidth]
TCP Congestion Control: details
sender limits transmission:
  $\mathrm{LastByteSent} - \mathrm{LastByteAcked} \le \mathrm{CongWin}$
Roughly,
  $\mathrm{rate} = \frac{\mathrm{CongWin}}{RTT}$ Bytes/sec
CongWin is dynamic, function of perceived network congestion
How does sender perceive congestion?
  loss event = timeout or 3 duplicate acks
  TCP sender reduces rate (CongWin) after loss event
three mechanisms:
  AIMD
  slow start
  conservative after timeout events
TCP Slow Start
When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event
TCP Slow Start (more)
When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A / Host B timing diagram with one RTT marked along the time axis]
Refinement: inferring loss
After 3 dup ACKs:
  CongWin is cut in half
  window then grows linearly
But after timeout event:
  CongWin instead set to 1 MSS;
  window then grows exponentially
  to a threshold, then grows linearly
Philosophy:
  3 dup ACKs indicates network capable of delivering some segments
  timeout indicates a more alarming congestion scenario
Refinement
Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.
Implementation:
  Variable Threshold
  At loss event, Threshold is set to 1/2 of CongWin just before loss event
Summary: TCP Congestion Control
CongWin/2 and CongWin is set to 1 MSS.
TCP sender congestion control
State | Event | TCP Sender Action | Commentary
Slow Start (SS) | ACK receipt for previously unacked data | CongWin = CongWin + MSS, If (CongWin > Threshold) set state to Congestion Avoidance | Resulting in a doubling of CongWin every RTT
Congestion Avoidance (CA) | ACK receipt for previously unacked data | CongWin = CongWin+MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
SS or CA | Loss event detected by triple duplicate ACK | Threshold = CongWin/2, CongWin = Threshold, Set state to Congestion Avoidance | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
SS or CA | Timeout | Threshold = CongWin/2, CongWin = 1 MSS, Set state to Slow Start | Enter slow start
SS or CA | Duplicate ACK | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
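The table above maps onto a small state machine. The Python sketch below implements its first four rows; the class name, the use of bytes as the unit, and the starting Threshold are assumptions, and plain duplicate-ACK counting (the last row) is left out for brevity.

```python
class TcpCongestionControl:
    """CongWin/Threshold updates for slow start (SS) and congestion avoidance (CA)."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = float(mss)          # connection starts at 1 MSS
        self.threshold = float(threshold)
        self.state = "SS"

    def on_new_ack(self):
        if self.state == "SS":
            self.cong_win += self.mss       # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds up to roughly 1 MSS per RTT across a window's worth of ACKs.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_triple_dup_ack(self):
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, float(self.mss))   # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = "SS"


cc = TcpCongestionControl(mss=1000, threshold=4000)
for _ in range(4):
    cc.on_new_ack()                         # 2000, 3000, 4000, 5000 -> enters CA
```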
TCP throughput
What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
TCP Futures: TCP over long, fat pipes
$\mathrm{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$
$L = 2 \cdot 10^{-10}$ Wow
New versions of TCP for high-speed
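Both throughput results above can be checked numerically. In this Python sketch the sawtooth average is computed by sampling a window that ramps linearly from W/2 to W, and the loss-rate formula is inverted for a target rate. The inputs of 1500-byte segments, 100 msec RTT, and a 10 Gbps target are assumptions, picked because they reproduce a loss rate near the slide's $2 \cdot 10^{-10}$; the function names are hypothetical.

```python
import math


def average_sawtooth_throughput(W, rtt, steps=1000):
    """Mean of W(t)/RTT while the window climbs linearly from W/2 to W."""
    samples = [W / 2 + (W / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples) / rtt


def tcp_throughput(mss_bits, rtt_s, loss_rate):
    """Throughput in bits/sec as a function of loss rate L."""
    return 1.22 * mss_bits / (rtt_s * math.sqrt(loss_rate))


def required_loss_rate(mss_bits, rtt_s, target_bps):
    """Invert the throughput formula to find the loss rate a target rate tolerates."""
    return (1.22 * mss_bits / (rtt_s * target_bps)) ** 2


avg = average_sawtooth_throughput(W=100.0, rtt=1.0)
L = required_loss_rate(mss_bits=1500 * 8, rtt_s=0.100, target_bps=10e9)
print(avg, L)
```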
TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]
Why is TCP fair?
Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally
[Figure: plot with Connection 1 throughput on the x-axis, running to R]
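The two-session argument above can be tried out with a toy simulation. In this Python sketch both flows add one unit per round and both halve whenever their combined rate exceeds R; synchronized loss, equal RTTs, and the starting rates are all simplifying assumptions, and `aimd_two_flows` is a hypothetical name.

```python
def aimd_two_flows(x1, x2, R, rounds=200):
    """Run additive increase / multiplicative decrease for two flows sharing capacity R."""
    for _ in range(rounds):
        if x1 + x2 > R:
            x1, x2 = x1 / 2, x2 / 2     # both see loss: multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1     # additive increase keeps the gap unchanged
    return x1, x2


x1, x2 = aimd_two_flows(1, 80, R=100)
print(round(x1, 2), round(x2, 2))
```

Additive steps leave the difference between the flows untouched while every halving cuts it in two, so a very unequal start drifts toward an equal share.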
Fairness (more)
Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Research area: TCP friendly
Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts.
  Web browsers do this
  Example: link of rate R supporting 9 connections;
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2 !
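The parallel-connection example above is a one-line calculation if every connection is assumed to get an equal share of the link (an idealization). The Python helper below, with the hypothetical name `app_share`, returns the fraction of R that the new app obtains.

```python
from fractions import Fraction


def app_share(existing_connections, opened_by_app):
    """Fraction of link rate R the app gets when each connection shares equally."""
    return Fraction(opened_by_app, existing_connections + opened_by_app)


one_connection = app_share(9, 1)       # R/10
eleven_connections = app_share(9, 11)  # 11R/20, a little over R/2
print(one_connection, eleven_connections)
```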
Chapter 3: Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
instantiation and implementation in the Internet
  UDP
  TCP
Next:
  leaving the network edge (application, transport layers)
  into the network core