Académique Documents
Professionnel Documents
Culture Documents
Abstract— This paper presents pathChirp, a new active times (RTTs), while imposing as light a load as possible
probing tool for estimating the available bandwidth on a on the network.
communication network path. Based on the concept of Current schemes for available bandwidth estimation
“self-induced congestion,” pathChirp features an exponen-
fall into two broad classes. The first class of schemes
tial flight pattern of probes we call a chirp. Packet chips
offer several significant advantages over current probing is based on statistical cross-traffic models, such as Del-
schemes based on packet pairs or packet trains. By rapidly phi [7] and the methods proposed in [8]. While these
increasing the probing rate within each chirp, pathChirp schemes can potentially provide accurate estimates of
obtains a rich set of information from which to dynami- cross-traffic, to date they have been designed for single
cally estimate the available bandwidth. Since it uses only hop (bottleneck) scenarios and as such may not be ro-
packet interarrival times for estimation, pathChirp does bust in multi-hop networks which are more common in
not require synchronous nor highly stable clocks at the
practice.
sender and receiver. We test pathChirp with simulations
and Internet experiments and find that it provides good The second class of schemes is based on the concept of
estimates of the available bandwidth while using only a self-induced congestion, which relies on a simple heuris-
fraction of the number of probe bytes that current state- tic: If the probing rate exceeds the available bandwidth
of-the-art techniques use. over the path, then the probe packets become queued at
Keywords: probing, networks, bandwidth, available some router, resulting in an increased transfer time. On
bandwidth, chirp, end-to-end, inference the other hand, if the probing rate is below the available
bandwidth, the packets face no queuing delay. The avail-
I. I NTRODUCTION able bandwidth can then be estimated as the probing rate
at the onset of congestion. Such schemes are equally
Inferring the unused capacity or available bandwidth
suited to single and multiple hop paths, since they rely
is of great importance for various network applications.
only on whether the probe packets make it across the path
Knowledge of the available bandwidth on an end-to-end
with an unusual delay or not.
path can improve rate-based streaming applications [1],
Two examples of the self-induced congestion ap-
end-to-end admission control [2], server selection [3],
proach are pathload [9] and TOPP [10]. Pathload em-
optimal route selection in overlay networks [4], conges-
ploys long constant bit-rate (CBR) packet trains and
tion control [5] as well as service level agreement veri-
adaptively varies the rates of successive packet trains in
fication [6]. Obtaining useful estimates of the available
an effort to converge to the available bandwidth rate. Be-
bandwidth from routers is often not possible due to var-
cause of its adaptive search, pathload can have long con-
ious technical and privacy issues or due to an insuffi-
vergence times (on the order of 100s of RTTs) [9] and
cient level of measurement resolution or accuracy. Thus,
use several MB of probe traffic per estimate. TOPP [10]
it becomes necessary to infer the required information
uses packet pairs of different spacings well-separated in
from the network edge via an active or passive probing
time and estimates available bandwidth from the time-
scheme. Ideally, a probing scheme should provide an ac-
averaged spacing of packets at the receiver. As a result
curate estimate of a path’s available bandwidth in as short
TOPP does not make use of delay correlation informa-
a time as possible, preferably over only a few round trip
tion obtainable from packet trains with closely spaced
This work was supported by NSF grant ANI–0099148, NSF packets.
grant ANI–9979465, DoE SciDAC grant DE-FC02-01ER25462, In this paper, we propose a new self-induced con-
DARPA/AFRL grant F30602–00–2–0557, and the Texas In-
struments Leadership University Program. Email: {vinay,
gestion available bandwidth estimation scheme we call
riedi, richb}@rice.edu, {jiri,cottrell}@slac.stanford.edu. Web: pathChirp. Unique to pathChirp is an exponentially
spin.rice.edu, www-iepm.slac.stanford.edu. spaced chirp probing train (see Fig. 1). Chirp trains
2
probe packets
A. Available bandwidth
1 2 N-4 N-3 N-2 N-1 N
Denote the capacity of the output queue of router node
i as Ci , and the total traffic (other than probes) entering
T γ N-2 T γ3 T γ2 Tγ T time
it between times a and b as Ai [a, b]. Define the path’s
available bandwidth in time interval [t − τ, t] as
Fig. 1. Chirp probe train, with exponential packet flight pat-
tern. Ai [t − τ + pi , t + pi ]
B[t − τ, t] = min Ci − , (1)
i τ
are highly efficient. First a chirp of N packets has where pi is the minimum time a packet sent from the
N − 1 packet spacings that would normally require sender could take to reach router i. The delay p i includes
2N − 2 packets using packet pairs. Second by expo- the speed-of-light propagation delay and packet service
nentially increasing the packet spacing, chirps probe the times1 at intermediate queues.
network over the range of rates [G1 , G2 ]Mbps using just In reality probe packets suffer queuing delays in addi-
log(G2 ) − log(G1 ) packets. One other advantage of tion to the minimum delay pi . Thus probes transmitted
chirps (or any other packet train) over packet pairs is that during [t−τ, t] can arrive at router i outside time interval
they capture critical delay correlation information that [t − τ + pi , t + pi ] and do not exactly measure B[t − τ, t].
packet pairs do not. PathChirp exploits these advanta- For large τ ( RTT), however, the effect of queuing de-
geous properties of chirps to rapidly estimate available lay becomes inconsequential.
bandwidth using few packets.
To avoid confusion, we emphasize that the only com- B. pathChirp overview
monality between pathChirp and our Delphi algorithm PathChirp estimates the available bandwidth along a
[7] is the chirping packet train. Delphi uses chirps to path by launching a number of packet chirps (numbered
estimate the available bandwidth at a range of different m = 1, 2, . . .) from sender to receiver and then conduct-
time scales based on a multifractal tree model for the ing a statistical analysis at the receiver.
bandwidth over time. It does not use the self-induced First some notation for chirps (see Fig. 1). Consider
congestion principle. chirp m consisting of N exponentially spaced packets,
We present the pathChirp concept and algorithm each of size P bytes. Define the ratio of successive
in Section II and analyze its performance as a func- packet inter-spacing times within a chirp as the spread
tion of its parameters in Section III. After testing (m)
factor γ, the queuing delay of packet k as q k ,2 the
pathChirp in multi-hop scenarios in Section IV we com- (m)
sender transmission time of packet k as t k , the inter-
pare it to TOPP and pathload in Sections V and VI (m)
respectively. Section VII overviews experiments on spacing time between packets k and k + 1 as ∆ k , and
the real Internet. We conclude in Section VIII with the instantaneous chirp rate at packet k as
a discussion and directions for future research. The (m) (m)
Rk = P/∆k . (2)
pathChirp tool is available as open-source freeware at
spin.rice.edu/Software/pathChirp. (m) (m)
Since ∆k and Rk are the same for all chirps, we drop
their superscripts in the subsequent discussion.
II. PATH C HIRP
In a CBR fluid cross-traffic scenario, we have
In this paper we are concerned with a single sender h i
(m) (m) (m)
qk = 0, if B t1 , tN ≥ Rk
– single receiver path of a communication network. We
explicitly permit multiple queues; to this end we model a (m) (m)
qk > qk−1 , otherwise (3)
path as a series of store-and-forward nodes each with its h i
own constant service rate, equipped with FIFO queues. b t ,t (m) (m)
which leads to a simple estimate: B 1 N = Rk ∗ ,
This is an accurate model for today’s Internet. We fo- ∗
where k is the packet at which the queuing delay begins
cus on estimating the available bandwidth over the path increasing.
based on queuing delays of probe packets transmitted
1
from the sender to the receiver. We use information only We assume a constant probe packet size which implies a constant
packet service time.
on the relative delays between probe packets; this al- 2
If the clocks at the sender and receiver are unsynchronized but
lows us to not require clock synchronization between the stable, then the difference between receiver and sender time stamps
sender and receiver. is the queuing delay plus a constant.
3
procedure excursion(q,i,F ,L){ ing time stamp corruption due to context switching. One
j = i + 1; of them is to use time stamps generated by NIC cards
rather than application layer ones.
max q= 0;
PathChirp discards all chirps with dropped packets.
while((j ≤ N ) and (q(j) − q(i) > max q/F ))
{ III. P ERFORMANCE AND PARAMETER C HOICE
max q=maximum(max q,q(j) − q(i)); In this section, we use simulations to better understand
j = j + 1; the role of the various pathChirp parameters. In the ex-
} periments, we use a single queue with capacity 10Mbps
fed with Poisson packet arrivals. The cross-traffic packet
if ((j ≥ N )) return j;
sizes were randomly chosen to be 1500 or 40 bytes with
if (j − i ≥ L) equal probability. Internet traffic has been shown to have
return j; weak correlations over time lags less than 100ms [11] in
else spite of stronger correlations (or long-range dependence
return i; (LRD)) at time lags greater than 1s. Since the duration
of a chirp is typically less than 100ms a Poisson cross-
} traffic model which does not possess LRD suffices.
We varied the packet size P , spread factor γ, decrease
Fig. 4. The excursion segmentation algorithm. factor F and busy period threshold L while keeping τ
and the total probing load constant at 500kbps. Recall
in time to achieve the specified average probing rate. that we can maintain any average low probing rate by
Each UDP packet carries a sender timestamp which the spacing the chirp trains far enough apart. Our choice for
receiver uses along with its own local timestamp in the the performance metric is the mean squared error (MSE)
delay estimation process. of the estimate ρ[0, τ ] normalized by the second moment
In pathChirp, probe packets travel one-way from of the true B[0, τ ]. All experiments report 90% confi-
sender to receiver, and the receiver performs the estima- dence intervals.
tion. We prefer to not merely echo back information to Probe packet size P : First we assess the impact of probe
the sender to avoid the problem of echo probe traffic in- packet size P on estimation performance. Obviously
terfering with the sender-to-receiver chirp probes. This the number of bytes transmitted per chirp decreases with
can occur on links that are not full-duplex, for example P . Thus by reducing P we can send more chirps for
those in shared LANs. the same average probing rate, giving us more estimates
PathChirp addresses the practical problem of con- D (m) per time interval τ . However from (2) we observe
text switching. When a context switch takes place at a that for the same set of probing rates R k , a small P re-
host receiving probe packets, the packets are temporarily sults in a proportionately small ∆k . Intuitively the cross-
buffered while the CPU handles other processes. This traffic arriving over a time interval ∆ k is more bursty for
introduces delays between packets reaching the applica- smaller ∆k . For instance when ∆k → 0 the cross-traffic
tion layer just before the context switch and after it. In process is far from smooth and to the contrary is a binary
addition, the buffered packets rapidly reach the applica- process: we either have one packet arriving or none at
tion layer after the context switch. These delays may be all. Thus shorter chirps will exhibit more erratic signa-
mistakenly construed as router queuing delays and thus tures and give less accurate estimates.
corrupt pathChirp’s network inference. When the differ- Fig. 5 demonstrates the effect of the probe packet size
ence between two consecutive receive time stamps is less P on estimation performance. We set γ = 1.2 and vary
than a threshold d, we detect a context switch and dis- the parameters F and L as well as the link utilization.
card the concerned chirp. We note that a d value of 30µs Observe that in most cases larger P values give better
is lower than the transmission time of a 1000 byte packet performance. In a few cases in Fig. 5(a) the MSE in-
on an OC-3 link (50µs). Thus one would expect packet creases slightly with P .
arrival times at the receiver to exceed 50µs if the last link The results show that pathChirp generally performs
is of OC-3 or lower speed. Currently d is hardcoded in better with larger packet sizes. In Internet experiments
the program. In future d will be adaptively chosen to suit we thus use P ≥ 1000 bytes.
the machine in question. Spread Factor γ: The spread factor γ controls the spec-
We are currently studying other ways of circumvent- trum of probing rates in a chirp. A smaller γ leads
6
0.1
normalized MSE
normalized MSE
0.15 F=1.5,L=3 F=1.5,γ=1.2
F=1.5,L=5 F=1.5,γ=1.4
F=3.5,L=3
F=3.5,L=5
0.08 F=3.5,γ=1.2
F=3.5,γ=1.4
0.1 F=5.5,L=3 F=5.5,γ=1.2
F=5.5,L=5
0.06 F=5.5,γ=1.4
0.05
0.04
normalized MSE
F=1.5,L=3 F=1.5,γ=1.2
normalized MSE
0.5 0.2
Fig. 5. Normalized mean squared error vs. probe packet size Fig. 7. Normalized MSE vs. busy period threshold L for two
P for two utilizations: (a) 30% and (b) 70%. In most cases the utilizations: (a) 30% and (b) 70%. The error improves with
MSE decreases with increasing packet size. The experiment decreasing L.
used γ = 1.2.
0.12
busy period threshold L and decrease factor F influence
F=1.5,L=3
normalized MSE
0.1
F=1.5,L=5
F=3.5,L=3
pathChirp’s excursion segmentation algorithm. Recall
F=3.5,L=5
(m)
0.08
F=5.5,L=3
F=5.5,L=5 that the Ek estimates corresponding to an excursion
0.06 region are always less than what they would be if the
0.04 region was not marked as belonging to one (compare
1 1.1 1.2 1.3 1.4 1.5 1.6 cases (a) and (c) in Section II-D). Increasing L or de-
(a) spread factor γ
creasing F makes it harder for bumps in signatures to
0.6 F=1.5,L=3
normalized MSE
F=1.5,L=5
F=3.5,L=3
qualify as valid excursions thus leading to over-estimates
0.4
F=3.5,L=5
F=5.5,L=3 of the available bandwidth. Conversely, decreasing L
F=5.5,L=5
or increasing F will lead to under-estimation of avail-
0.2 able bandwidth. The optimal choice for the busy period
threshold L and decrease factor F will depend on the
1 1.1 1.2 1.3 1.4 1.5 1.6
(b) spread factor γ cross-traffic statistics at queues on the path.
From Figs. 7 and 8 observe that for our single queue
Fig. 6. Normalized MSE vs. spread factor γ for two utiliza- Poisson cross-traffic scenario small values of L and large
tions: (a) 30% and (b) 70%. The MSE decreases with decreas- values of F give better performance.
ing γ .
Internet experiments indicate that the optimum values
of L = 3 and F = 6 obtained from the above exper-
to a dense spectrum of rates Rk , potentially increasing iments provide overly conservative estimates of avail-
the accuracy of estimates D (m) . It also leads to a finer able bandwidth. This could possibly be due to the noise
sampling of network delay, thus potentially improving present in real experiments that is absent in simulations.
pathChirp’s ability to identify excursions. However it The pathChirp tool instead uses L = 5 and F = 1.5 as
also increases the number of packets per chirp and hence default.
reduces the number of estimates D (m) per time interval
τ , possibly degrading the estimate ρ[t − τ, t]. IV. M ULTI -H OP SCENARIOS
Fig. 6 demonstrates the effect of the spread factor γ Real Internet paths almost always are multi-hop. Al-
on estimation performance. We observe that the MSE though we are unaware of any rigorous study of the num-
decreases (that is, improves) with decreasing γ. This ex- ber of congested queues on typical Internet paths, we hy-
periment uses P = 1300 byte packets. Since γ > 2 can pothesize that congestion largely occurs at the edge of
give errors as high as 100% even in CBR scenarios, we the network close to the source or receiver. Thus data
have excluded them in the experiments. packets might likely encounter two congested queues,
PathChirp uses γ = 1.2 by default. one on each end of their paths. One fact that supports
Busy period threshold L and decrease factor F : The the argument that congestion occurs at the edge is that
7
normalized MSE
L=3,γ=1.2
L=3,γ=1.4 avail−bw 1st queue = 10Mbps, 2nd queue = 20Mbps
L=5,γ=1.2
0.08 L=5,γ=1.4 0.06
0.06
normalized MSE
0.05
0.04
0.04
1 2 3 4 5 6
(a) decrease factor F 0.03
0.6
normalized MSE
L=3,γ=1.2
L=3,γ=1.4
0.02
L=5,γ=1.2
L=5,γ=1.4
0.4 0.01
1 2 3 4 5 6 7 8
0.2 (a) time interval τ (seconds)
normalized MSE
Fig. 8. Normalized MSE vs. decrease factor F for two uti- 0.07
lizations: (a) 30% and (b) 70%. The error improves with in- 0.06
creasing F . 0.05
0.04
queue 1 queue 2 0.03
40Mbps 20Mbps 0.02
0.01
1 2 3 4 5 6 7 8
cross−traffic cross−traffic (b) time interval τ (seconds)
Fig. 9. Multi-hop experiment. Fig. 10. Performance in multi-hop experiments. The MSE
in the case of both queues being loaded is comparable to that
when only one is loaded implying that pathChirp is robust to
backbone ISPs have reported very low packet loss and
multi-hop paths.
queuing delay on their networks [12]. While it is pos-
sible for paths to have no congested queues or possibly
one, it is important for tools like pathChirp to be robust where the slack queue has 5Mbps cross-traffic and no
to the presence of at least two congested queues along cross-traffic at all.
the end-to-end path. In the second experiment both queues are fed with
This section tests pathChirp in a two-hop scenario as 10Mbps cross-traffic which sets the available bandwidth
depicted in Fig. 9. As before, competing cross-traffic at the first queue to 30Mbps and that at the second to
packet arrivals are Poisson and the packet sizes are cho- 10Mbps. From Fig. 10(b) to our surprise we observe that
sen at random to be 1500 or 40 bytes with equal probabil- the MSE is marginally smaller when the slack queue is
ity. The parameters we use are γ = 1.2, L = 5, F = 2, loaded that when it is not.
P = 1500 and τ = 3s. The results show that pathChirp is robust in multi-hop
Each experiment consists of two scenarios. In the scenarios.
first, we load both queues with cross-traffic such that one
V. C OMPARISON WITH TOPP
queue has less available bandwidth (the tight queue) than
the other (the slack queue). The slack queue essentially This section compares pathChirp with TOPP [10] us-
adds noise to the chirp packet delays. In the second, we ing simulations when both use the same probing bit rate
set the cross-traffic rate at the slack queue to zero. An and probe packet spacings.
error of comparable magnitude in the two scenarios im-
plies that pathChirp is robust to the noise of the slack A. TOPP
queue in the first case. TOPP sends out several packet pairs well-separated in
In the first experiment the cross-traffic rate at the first time [10]. Denote the set of unique packet-pair spacings
queue is 30Mbps and that at the second queue 5Mbps. at the sender arranged in decreasing order as δ k , k =
This sets the available bandwidth at the first queue (the 1, . . . , N − 1 and the corresponding average spacings
tight one) to 10Mbps and that at the second (the slack at the receiver as ηk , k = 1, . . . , N − 1. Then un-
one) to 15Mbps. From Fig. 10(a) we observe that the der the assumption of proportional sharing (see [10]
MSE is practically indistinguishable between the cases for details) of bandwidth at all queues on the path, the
8
plot of ηk /δk vs. P/δk is piecewise linear with increas- 0.18 pathChirp
TOPP
ing slope. The very first linear segment equals 1 for 0.16
P/δk ∈ (0, B[−∞, ∞]), implying that the first break- 0.14
normalized MSE
point gives the available bandwidth B[−∞, ∞]. In prac- 0.12
normalized MSE
estimates ρ[nτ, (n + 1)τ ] are obtained as described in 0.4
Section II-B.
For pathChirp we fix the spread factor γ and separate 0.3
dom variables.
Fig. 11. Comparison of pathChirp and TOPP in a single-hop
B. Single-hop scenarios scenario for two utilizations: (a) 30% and (b) 70%. Observe
that pathChirp performs far better than TOPP.
This experiment uses a single queue with link speed
20Mbps fed with Poisson cross-traffic. The probe rate
is 1Mbps and the pathChirp parameters are set to P = VI. C OMPARISON WITH PATHLOAD
1500 bytes, γ = 1.2, F = 5, and L = 3. We now compare pathChirp with pathload (version
Fig. 11 displays the MSE for experiments with two pathload 1.0.2) [9] using a simple test bed at Rice Uni-
different utilizations. Observe that the pathChirp outper- versity depicted in Fig. 13. The goal is to compare their
forms TOPP by about an order of magnitude. efficiency in terms of number of bytes used to obtain
available bandwidth estimates of equal accuracy.
C. Multi-hop scenarios PathChirp and pathload differ in their measure-
We next compare pathChirp and TOPP in the multi- ment methodology as well as their output quantities.
hop scenario depicted in Fig. 9 with average probing rate PathChirp provides a single estimate of available band-
500kbps. In the first experiment we set the Poisson cross- width per specified time interval τ . Pathload instead
traffic rates so that the first queue has 10Mbps available provides minimum and maximum bounds on the avail-
while the second has 15Mbps available. The first queue able bandwidth while taking a variable amount of time
is thus the tight one. In the second experiment the rates to make the estimate.
are set so the the first queue has 20Mbps available and the We perform two sets of experiments to compare the
second has 10Mbps available, thus making the second tools. To measure the efficiency of the tools, in each ex-
queue the tight one. periment we compute the average number of bytes over
From Fig. 12 observe that again pathChirp outper- 25 runs that each tool takes to provide estimates accurate
forms TOPP like in the single-hop scenarios. to 10Mbps.
Since pathChirp uses queuing delay correlation infor- To obtain the bytes used by pathload, we set its band-
mation present in signatures and not just the average de- width resolution parameter to 10Mbps and take the aver-
lay increase between packet pairs, the above results are age number of bytes used to make 25 estimates.
not surprising. A theoretical analysis supporting these To count the bytes used by pathChirp, we employ the
empirical findings is part of our future research. following procedure. Denoting the start of the experi-
9
1.2
pathChirp bandwidth to a constant value using iperf CBR UDP traf-
TOPP
fic [14] while in the second set of experiments we employ
normalized MSE 1 Poisson UDP traffic [15]. The iperf packet size is 1470
bytes while that of the Poisson traffic is 1000 bytes. The
0.8
results in Tables I and II indicate that pathChirp needs
0.6 less than 10% of the bytes that pathload uses. In addi-
tion to the average number of bytes the two tools use
0.4
to achieve the desired accuracy, Tables I and II provide
0.2
the 10%-90% values of pathChirp estimates and the av-
erage of pathload’s minimum and maximum bounds of
1 2 3 4 5 6 7 8
(a) time interval τ (seconds) available bandwidth. Observe that pathChirp’s estimates
0.9 pathChirp
have a consistent negative bias, implying that its mea-
TOPP surements are conservative.
0.8
0.7
These results demonstrate pathChirp’s utility, espe-
normalized MSE
0.6
cially for applications requiring rapid estimates of the
available bandwidth using only a light probing load.
0.5
0.4
TABLE I
Efficiency comparison of pathChirp and pathload with iperf CBR cross-traffic.
TABLE II
Efficiency comparison of pathChirp and pathload with Poisson cross-traffic.
Caltech/StarLight(Chicago)
Rice
Internet Poisson traffic
Internet
Internet pathchirp
(a) SLAC
ρ[t−τ,t]
80 65Mbps minus Poisson rate
60
Mbps
40
20
0
0 50 100 150 200 250 300 350
(b) time "t" (s)
ρ[t−τ,t]
80 65Mbps minus Poisson rate
60
Mbps
40
20
0
0 50 100 150 200 250 300 350
(c) time "t" (s)
Fig. 14. (a) Setup for the Internet experiment. (b) Available
bandwidth estimates when Poisson traffic originates at Cal-
tech. (c) Available bandwidth estimates when Poisson traffic
originates at StarLight (Chicago). Observe that the pathChirp
estimates fall in proportion to the introduced Poisson traffic.