
CIS 553: Networked Systems

Vincent Liu
Spring 2018
Lecture 12

Slides from Mosharaf Chowdhury
and CN5E by Tanenbaum & Wetherall © Pearson Education-Prentice Hall and D. Wetherall
Flow control

[Figure: the receiver, on the far side of the network, tells the sender "Please slow down!"]

Congestion control

[Figure: the network itself tells the sender "Please slow down!"]
Sliding window at receiver

RWND = B - (LastByteReceived - LastByteRead)

[Figure: receive buffer of size B; the receiving process consumes bytes up to "last byte read", the network delivers bytes up to "last byte received", and the next byte needed is the first byte not yet received.]
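The advertised-window formula above can be sketched directly (a minimal illustration using the slide's variable names, not any real TCP stack's API):

```python
def advertised_window(buffer_size, last_byte_received, last_byte_read):
    # Bytes still sitting in the buffer, waiting for the application:
    buffered = last_byte_received - last_byte_read
    # RWND = B - (LastByteReceived - LastByteRead)
    return buffer_size - buffered

# A receiver with a 64 KiB buffer that has received through byte 10000,
# of which the application has read through byte 4000:
rwnd = advertised_window(65536, 10000, 4000)
print(rwnd)  # 59536
```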


Two basic questions
• How does the sender detect congestion?
• Packet loss (triple duplicate ACK, timeout)
• How does the sender adjust its sending rate?
• Finding available bottleneck bandwidth? (slow start)
• Adjusting to bandwidth variations?
• Sharing bandwidth?
AIMD
• Additive increase
• For each ACK, CWND = CWND + 1/CWND
• CWND is increased by one only if all segments in a CWND
have been acknowledged
• Multiplicative decrease
• On packet loss, CWND = CWND/2
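A toy sketch of these two rules, with CWND measured in segments (an assumed simplification; real stacks count bytes and handle many edge cases):

```python
def on_ack(cwnd):
    # Additive increase: +1/CWND per ACK, so CWND grows by about one
    # segment per RTT once a full window's worth has been acknowledged.
    return cwnd + 1.0 / cwnd

def on_loss(cwnd):
    # Multiplicative decrease: halve the window on a loss signal.
    return max(1.0, cwnd / 2)

cwnd = 10.0
for _ in range(10):       # one window's worth of ACKs
    cwnd = on_ack(cwnd)   # cwnd is now close to 11
cwnd = on_loss(cwnd)      # halved after a loss
```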
Leads to the TCP “Sawtooth”

[Figure: window size vs. time; an exponential “slow start” ramp at the beginning, then the sawtooth: linear growth punctuated by a halving at each loss.]
Why AIMD?
• Every RTT, we can do
• Multiplicative increase or decrease: CWND → a·CWND
• Additive increase or decrease: CWND → CWND + b
• Four alternatives:
• AIAD: gentle increase, gentle decrease
• AIMD: gentle increase, drastic decrease
• MIAD: drastic increase, gentle decrease
• MIMD: drastic increase and decrease
Simple model of congestion control
• Two users with rates x1 and x2
• Congestion when x1 + x2 > 1
• Unused capacity when x1 + x2 < 1
• Fair when x1 = x2

[Figure: x2 plotted against x1; the efficiency line is x1 + x2 = 1 and the fairness line is x1 = x2; points above the efficiency line are congested, points below it are inefficient.]


Example
• (0.5, 0.5): efficient (x1 + x2 = 1) and fair
• (0.7, 0.3): efficient (x1 + x2 = 1) but not fair
• (0.7, 0.5): congested (x1 + x2 = 1.2)
• (0.2, 0.5): inefficient (x1 + x2 = 0.7)

[Figure: the four points plotted against the efficiency line (x1 + x2 = 1) and the fairness line (x1 = x2).]
AIAD
• Increase: x + aI
• Decrease: x - aD
• Does not converge to fairness

[Figure: starting from (x1, x2), a decrease moves to (x1-aD, x2-aD) and an increase to (x1-aD+aI, x2-aD+aI); every step is parallel to the fairness line, so the gap between the two rates never closes.]
AIAD Sharing Dynamics

[Figure: two flows, A→D and B→E, share a link; a plot of x1 and x2 over roughly 500 time steps shows the rates oscillating between 0 and 60 while keeping a constant gap — they never equalize.]
MIMD
• Increase: x*bI
• Decrease: x*bD
• Does not converge to fairness

[Figure: from (x1, x2), a decrease moves to (bD·x1, bD·x2) and an increase to (bI·bD·x1, bI·bD·x2); every step stays on the line through the origin, so the ratio x1/x2 never changes.]
MIAD
• Increase: x*bI
• Decrease: x - aD
• Does not converge to fairness
• Does not converge to efficiency
• “Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks” -- Chiu and Jain

[Figure: from (x1, x2), a decrease moves to (x1-aD, x2-aD) and an increase to (bI(x1-aD), bI(x2-aD)); the trajectory drifts away from both the fairness and efficiency lines.]
AIMD
• Increase: x + aI
• Decrease: x*bD
• Converges to fairness

[Figure: from (x1, x2), a decrease moves to (bD·x1, bD·x2) toward the origin, and an increase to (bD·x1+aI, bD·x2+aI) parallel to the fairness line; each decrease shrinks the gap between the rates, so the operating point moves in toward the fair, efficient point.]
AIMD Sharing Dynamics

[Figure: two flows, A→D and B→E, share a 50 packets/sec bottleneck; a plot of x1 and x2 over roughly 500 time steps shows the rates equalize → fair share.]
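The convergence argument can be checked with a tiny synchronized-loss simulation (an idealized sketch: both flows share one bottleneck, see every loss together, and update once per RTT):

```python
def aimd_share(x1, x2, capacity=50.0, aI=1.0, bD=0.5, rounds=300):
    # Once per RTT: a shared loss when over capacity, additive growth otherwise.
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = bD * x1, bD * x2   # multiplicative decrease
        else:
            x1, x2 = x1 + aI, x2 + aI   # additive increase
    return x1, x2

x1, x2 = aimd_share(40.0, 5.0)
# Additive increase leaves the gap |x1 - x2| unchanged, while every
# multiplicative decrease halves it, so the rates converge to an equal share.
```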
Efficiency vs. Fairness
• Cannot always have both!
• Example network with traffic A→B, B→C and A→C
• How much traffic can we carry?

[Figure: linear network A – B – C; both links have capacity 1.]
Efficiency vs. Fairness (2)
• If we care about fairness:
• Give equal bandwidth to each flow
• A→B: ½ unit, B→C: ½, and A→C: ½
• Total traffic carried is 1 ½ units

[Figure: linear network A – B – C; both links have capacity 1.]
Efficiency vs. Fairness (3)
• If we care about efficiency:
• Maximize total traffic in network
• A→B: 1 unit, B→C: 1, and A→C: 0
• Total traffic rises to 2 units!

[Figure: linear network A – B – C; both links have capacity 1.]
Max-Min fairness
• Given set of bandwidth demands ri and total
bandwidth C, max-min bandwidth allocations are:
• ai = min(f, ri)
• where f is the unique value such that Sum(ai) = C
• If you don’t get full demand, no one gets more than
you
• This is what round-robin service gives if all packets
are the same size
Computing Max-Min Fairness
• To find it given a network, imagine “pouring water
into the network”
1. Start with all flows at rate 0
2. Increase the flows until there is a new bottleneck in
the network
3. Hold fixed the rate of the flows that are bottlenecked
4. Go to step 2 for any remaining flows
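For a single shared link, the water-filling procedure above reduces to the following sketch (illustrative code, not from the lecture):

```python
def max_min_allocation(demands, capacity):
    # Water-filling: flows whose demand is at most the current fair share
    # f are satisfied and frozen; the rest split the remaining capacity.
    alloc = {}
    remaining = dict(enumerate(demands))
    cap = capacity
    while remaining:
        f = cap / len(remaining)
        satisfied = {i: d for i, d in remaining.items() if d <= f}
        if not satisfied:
            for i in remaining:      # everyone still here is bottlenecked:
                alloc[i] = f         # each gets exactly the fair share f
            break
        for i, d in satisfied.items():
            alloc[i] = d             # a_i = min(f, r_i) = r_i for these flows
            cap -= d
            del remaining[i]
    return [alloc[i] for i in range(len(demands))]

# Demands 2, 2.6, 4, 5 on a link of capacity 10: the first two get their
# full demand; the last two split what is left, 2.7 each.
```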
ACK Clocking
ACK Clocking
• Consider what happens when sender injects a burst
of segments into the network

Queue

Fast link Slow (bottleneck) link Fast link


ACK Clocking (2)
• Segments are buffered and spread out on slow link

Segments
“spread out”

Fast link Slow (bottleneck) link Fast link


ACK Clocking (3)
• ACKs maintain the spread back to the original
sender

Slow link
Acks maintain spread
ACK Clocking (4)
• Sender clocks new segments with the spread
• Now sending at the bottleneck link without queuing!

Segments spread Queue no longer builds

Slow link
ACK clocking mitigates bursts
• Helps the network run with low levels of loss and
delay!

• The network has smoothed out the burst of data segments
• ACK clock transfers this smooth timing back to the sender
• Subsequent data segments are not sent in bursts so do not queue up in the network
Timeouts and Idle Periods
• After a timeout or idle period:
• We lose ACK clocking!
• Also, network conditions change
• Maybe many more flows are traversing the link

• Dangerous to start transmitting at the old rate
• Previously-idle TCP sender might blast network
• … causing excessive congestion and packet loss

• So, some TCP implementations repeat slow start
• Slow-start restart after an idle period
Repeating Slow Start After Idleness

[Figure: window vs. time; after a timeout, slow start until ssthresh = CWND/2, then linear growth.]

• Slow-start restart: go back to CWND of 1, but take advantage of knowing the previous value of CWND
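A sketch of slow-start restart with the halved ssthresh (a segment-based toy model; real implementations differ in details):

```python
def restart_after_idle(cwnd):
    # Keep a memory of where congestion last appeared: exit slow start
    # at half the old window, but probe again from CWND = 1.
    ssthresh = max(2.0, cwnd / 2)
    return 1.0, ssthresh

def next_window(cwnd, ssthresh):
    # Exponential growth (slow start) below ssthresh,
    # additive increase (congestion avoidance) at or above it.
    return min(cwnd * 2, ssthresh) if cwnd < ssthresh else cwnd + 1

cwnd, ssthresh = restart_after_idle(32.0)   # cwnd=1.0, ssthresh=16.0
```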
TCP flavors
• TCP-Tahoe
• CWND = 1 on 3 dupACKs or timeout
• TCP-Reno (our default assumption)
• CWND = 1 on timeout
• CWND = CWND/2 on 3 dupACKs
• TCP-newReno
• TCP-Reno + fast recovery
• TCP-SACK
• Incorporates selective acknowledgements
How can they coexist?
• All follow the same principle
• Increase CWND on good news
• Decrease CWND on bad news

• Notion of TCP-friendliness
This Lecture
• Router design
• Queueing/Scheduling
• Router-assisted congestion control
What an IP router does:
The normal case
1. Receive incoming packet from link input interface
2. Lookup packet destination in forwarding table
(destination, output port(s))
3. Validate checksum, decrement TTL, update checksum
4. Buffer packet in input queue
5. Send packet to output interface (interfaces?)
6. Buffer packet in output queue
7. Send packet to output interface link
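The normal-case steps above can be sketched as a toy datapath. All names here (Packet, ForwardingTable, recompute_checksum) are illustrative inventions, not a real router API; real routers do this in specialized hardware:

```python
from dataclasses import dataclass

@dataclass
class Packet:
    dst: str
    ttl: int
    checksum: int = 0

def recompute_checksum(pkt):
    # Stand-in for the incremental IP header checksum update.
    return hash((pkt.dst, pkt.ttl)) & 0xFFFF

class ForwardingTable:
    # Toy longest-prefix match over dotted-string prefixes.
    def __init__(self, routes):
        self.routes = routes                     # {prefix: output port}
    def lookup(self, dst):
        best = max((p for p in self.routes if dst.startswith(p)), key=len)
        return self.routes[best]

def forward(pkt, table, output_queues):
    port = table.lookup(pkt.dst)                 # step 2: destination lookup
    if pkt.ttl <= 1:
        return None                              # expired: drop (ICMP omitted)
    pkt.ttl -= 1                                 # step 3: decrement TTL...
    pkt.checksum = recompute_checksum(pkt)       # ...and update the checksum
    output_queues[port].append(pkt)              # step 6: buffer at the output
    return port
```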
Many types of routers
• Core
• R = 10/40/100 Gbps
• NR = O(100) Tbps (Aggregated)
• Edge
• R = 1/10/40
• NR = O(100) Gbps
• Small business
• R = 10/100/1000 Mbps
• NR < 10 Gbps
What’s inside a router?
Control Route/Control
Plane Processor

Data Linecards (input) Linecards (output)


Plane
1 1

2 2
Interconnect
(Switching)
Fabric

N N
What’s inside a router?
• Linecards
• Input linecards process packets on their way in
• Output linecards process packets on way out
• Input and output for the same port are on the same
physical linecard
• Interconnect/switching fabric
• Transfers packets from input to output ports
Primary responsibilities
1. Determining the appropriate output port
• IP Prefix Lookup

2. Scheduling traffic so that each flow’s packets are serviced. Two concerns:
• Efficiency: If there is traffic waiting for an output port, the
router should be “busy”
• Fairness: Competing flows should all be serviced

• Challenge: speed!
• 100B packets @ 40Gbps → a new packet every 20 ns!
• Typically implemented with specialized ASICs (network
processors)
This Lecture
• Router design
• Queueing/Scheduling
• Router-assisted congestion control
How do we handle multiple
inputs/outputs?
Control Route/Control
Plane Processor

Data Linecards (input) Linecards (output)


Plane
1 1

2 2
Interconnect
(Switching)
Fabric

N N
Interconnect Fabric

Input
ports Where do we
need queues?

Output ports

• If inputs and outputs are the same speed, there can be contention
• Location of queue depends on the speed of the linecards/fabric
Shared memory
• One giant block of memory
• N writes, N reads every ‘tick’

• Pros
• Simple to use
• Ability to dynamically carve up available space
• Cons
• Fast memory is expensive
• Size usually limited
Output queuing
• Output interfaces buffer packets
[Figure: packets queue at the output side of the switch.]
• Pros
• Faster/cheaper than shared memory
• Single congestion point
• Cons
• N inputs may send to the same
output
• Still requires speedup of N
• i.e., output ports must run N times
quicker than input ports to handle
worst case
Input queuing
Input queuing
• Input interfaces buffer packets

[Figure: packets queue at the input side of the switch.]

• Pros
• Single congestion point
• 1 input, 1 output per ‘tick’
• Cons
• Must implement flow control
• Low utilization due to Head-of-Line (HoL) Blocking
Head-of-Line Blocking
Problem: The packet at the front of the queue experiences contention for
the output queue, blocking all packets behind it.

Input 1 Output 1

Input 2 Output 2

Input 3 Output 3

Maximum throughput in such a switch: 2 - sqrt(2) ≈ 58.6% (uniform random traffic)
Solution: Virtual Output Queues
• Maintain N virtual queues at each input
• one per output

Input 1
Output 1

Output 2
Input 2
Output 3

Input 3
Typical queue management policy
• Access to the bandwidth: first-in first-out queue
• Packets only differentiated when they arrive

• Access to the buffer space: drop-tail queuing


• If the queue is full, drop the incoming packet

Many possible improvements to FIFO


We’ll just talk about one for now
Alternative: Random Early Detection (RED)

• Drop-tail leads to bursty loss
• Feedback only comes when buffer is completely full

• RED: router notices that queue is getting full
• … and randomly drops packets to signal congestion

[Figure: drop probability vs. average queue length; zero below a minimum threshold, rising linearly toward a maximum threshold, then jumping to 1.]


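RED's two pieces can be sketched as follows (a simplified model with hypothetical threshold values; the real algorithm also spaces drops out using a count of packets since the last drop):

```python
def update_average(avg, qlen, weight=0.002):
    # EWMA of the instantaneous queue length: smooths transient bursts
    # so RED reacts to persistent congestion, not momentary spikes.
    return (1 - weight) * avg + weight * qlen

def drop_probability(avg, min_th=5.0, max_th=15.0, max_p=0.1):
    # Piecewise-linear RED curve: no drops below min_th, linear ramp
    # up to max_p between the thresholds, certain drop above max_th.
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)
```

The tunable parameters mentioned on the next slide are exactly these: the thresholds (how early to drop), max_p (the slope), and weight (the averaging time scale).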
Problems With RED
• Hard to get tunable parameters just right
• How early to start dropping packets?
• What slope for increase in drop probability?
• What time scale for averaging queue length?

• RED has mixed adoption in practice


• If parameters aren’t set right, RED doesn’t help

• Many other variations in research community


• Names like “Blue” (self-tuning), “FRED”…
This Lecture
• Router design
• Queueing/Scheduling
• Router-assisted congestion control
TCP has lots of problems!
• Misled by non-congestion losses
• Fills up queues leading to high delays
• Tight coupling with reliability mechanisms
• Short flows complete before discovering available capacity
• AIMD impractical for high speed links
• Sawtooth discovery too choppy for some apps
• Unfair under heterogeneous RTTs
• End hosts can cheat

Could fix many of these with some help from routers!
• Routers tell endpoints if they’re congested
• Routers tell endpoints what rate to send at
• Routers enforce fair sharing
Router-assisted congestion control
• Three tasks for congestion control
• Isolation/fairness
• Adjustment
• Detecting congestion
Fairness: General approach
• Routers classify packets into “flows”
• Let’s assume flows are TCP connections
• Each flow has its own FIFO queue in router
• Router services flows in a fair fashion
• When line becomes free, take packet from next flow in a
fair order
• Provides max-min fairness
How do we deal with packets of
different sizes?
• Mental model: Bit-by-bit round robin (“fluid flow”)
• Can you do this in practice?
• No, packets cannot be preempted
• But we can approximate it
• This is what “fair queuing” routers do
Fair Queuing (FQ)
• For each packet, compute the time at which the
last bit of a packet would have left the router if
flows are served bit-by-bit
• Then serve packets in the increasing order of their
deadlines
Example

[Figure: packets from two flows arrive over time; in the fluid-flow (bit-by-bit) system the two flows are served simultaneously, giving each packet a finish time; the FQ packet system then transmits whole packets in increasing order of those finish times.]
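The finish-time rule can be sketched for a single link (simplified: virtual time is taken to be wall-clock arrival time, which is only exact while one flow keeps the link busy; real FQ tracks a round number that advances inversely with the number of active flows):

```python
def fq_service_order(arrivals):
    # arrivals: list of (arrival_time, flow_id, size), in arrival order.
    # A packet's bit-by-bit finish number starts when the flow's previous
    # packet would finish (or when it arrives, if the flow was idle).
    last_finish = {}
    tagged = []
    for seq, (t, flow, size) in enumerate(arrivals):
        start = max(t, last_finish.get(flow, 0.0))
        finish = start + size
        last_finish[flow] = finish
        tagged.append((finish, seq, flow))
    # Serve whole packets in increasing order of finish number,
    # breaking ties by arrival order.
    return [flow for _, _, flow in sorted(tagged)]

# Two flows: A sends two 1-unit packets, B one 2-unit packet.
order = fq_service_order([(0, "A", 1), (0, "B", 2), (1, "A", 1)])
print(order)  # ['A', 'B', 'A']
```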
Fair Queuing (FQ)
• Implementation of round-robin generalized to the
case where not all packets are equal sized
• Weighted fair queuing (WFQ): assign different flows
different shares
• Today, some form of WFQ implemented in almost
all routers
• Not the case in the 1980-90s, when CC was being
developed
• Mostly used to isolate traffic at larger granularities (e.g.,
per-prefix)
FQ vs. FIFO
• FQ advantages:
• Isolation: cheating flows don’t benefit
• Bandwidth share does not depend on RTT
• Flows can pick any rate adjustment scheme they want

• Disadvantages:
• More complex than FIFO: per flow queue/state,
additional per-packet book-keeping
FQ in the big picture
• FQ does not eliminate congestion → it just manages the congestion

[Figure: a blue flow and a green flow, each sending 1 Gbps, share a 1 Gbps link; with FQ, blue and green each get 0.5 Gbps and any excess is dropped. The green flow then crosses a 100 Mbps downstream link, which drops an additional 400 Mbps from the green flow. If the green flow doesn’t drop its sending rate to 100 Mbps, we’re wasting 400 Mbps that could be usefully given to the blue flow.]
FQ in the big picture
• FQ does not eliminate congestion → it just manages the congestion
• Robust to cheating, variations in RTT, details of delay,
reordering, retransmission, etc.
• But congestion (and packet drops) still occurs
• We still want end-hosts to discover/adapt to their
fair share!
• What would the end-to-end argument say w.r.t.
congestion control?
Fairness is a controversial goal
• What if you have 8 flows, and I have 4?
• Why should you get twice the bandwidth?
• What if your flow goes over 4 congested hops, and
mine only goes over 1?
• Why shouldn’t you be penalized for using more scarce
bandwidth?
• What is a flow anyway?
• TCP connection
• Source-Destination pair?
• Source?
Router-Assisted Congestion
Control
• CC has three different tasks:
• Isolation/fairness
• Rate adjustment
• Detecting congestion
Why not let routers tell end hosts what rate to use?
• Packets carry “rate field”
• Routers insert “fair share” f in packet header
• End-hosts set sending rate (or window size) to f
• Hopefully (still need some policing of end hosts!)
• This is the basic idea behind the “Rate Control
Protocol” (RCP) from Dukkipati et al. ’07
• Flows react faster
Router-Assisted Congestion Control
• CC has three different tasks:
• Isolation/fairness
• Rate adjustment
• Detecting congestion
Explicit Congestion Notification
(ECN)
• Single bit in packet header; set by congested
routers
• If data packet has bit set, then ACK has ECN bit set
• Many options for when routers set the bit
• Tradeoff between (link) utilization and (packet) delay
• Congestion semantics can be exactly like that of
drop
• i.e., end-host reacts as though it saw a drop
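The "exactly like a drop" semantics can be sketched as follows (a segment-based toy model; real TCP also sets the CWR flag and reacts at most once per window of data):

```python
def on_ack_received(cwnd, ece):
    # An ACK carrying the ECN-Echo flag is treated like a loss signal:
    # multiplicative decrease, but nothing needs to be retransmitted,
    # since the packet was marked rather than dropped.
    if ece:
        return max(1.0, cwnd / 2)
    return cwnd + 1.0 / cwnd   # otherwise, normal additive increase

cwnd = on_ack_received(20.0, ece=True)   # back to 10.0, no retransmission
```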
ECN
• Advantages:
• Don’t confuse corruption with congestion; recovery w/
rate adjustment
• Can serve as an early indicator of congestion to avoid
delays
• Easy (easier) to incrementally deploy
• Today: defined in RFC 3168 using ToS/DSCP bits in the IP header
• Common in datacenters
