Reproducible Measurements of TCP BBR Congestion Control
Benedikt Jaeger, Dominik Scholz, Daniel Raumer, Fabien Geyer, Georg Carle
Chair of Network Architectures and Services, Technical University of Munich
{jaeger|scholz|raumer|fgeyer|carle}@net.in.tum.de
Abstract
The complexity of evaluating TCP congestion control has increased considerably since its initial development in the 1980s.
Several congestion control algorithms following different approaches have been proposed to match the requirements of
modern computer networks. We present a framework to analyze different congestion control algorithms using network
emulation. The framework is publicly available which provides easy repeatability of our experiments and reproducibility
of the results.
As a case study, we use our framework to analyze the bottleneck bandwidth and round-trip time (BBR) congestion
control algorithm, which was published by Google in 2016. Because of promising initial results, BBR has gained wide-
spread attention. As such, it has been the subject of analysis, which showed an increase in performance but also revealed
critical flaws. We verify our framework by reproducing experiments from related work which confirm weaknesses of
the current BBR implementation. We also contribute an analysis of BBR’s inter-flow synchronization behavior and its
interaction when competing with other congestion control algorithms. Our results show that BBR flows on their own
have difficulty reaching a fairness equilibrium and suppress other congestion control algorithms. BBR is still work in
progress; thus, the framework is useful to validate further updates of BBR by rerunning the given experiments.
Keywords: TCP, Congestion Control, BBR, Reproducible Measurements
… have been reproducible, demonstrating that BBR significantly improved the bandwidth and median RTT in their … weaknesses have been discovered since (e.g. [7, 8, 9]). As a consequence, BBR is actively improved [8]. Proposed changes usually aim to mitigate specific issues; however, they need to be carefully studied for unintended side effects.

[Figure 1: Kleinrock's optimal operating point; delivery rate and round-trip time as a function of the amount of inflight data, marked by RTprop and BtlBw.]
… congestion control algorithms. Popular loss-based TCP versions are Reno [11], BIC [20] and CUBIC [5]. They only use a congestion window as control parameter, which limits the amount of unacknowledged data in the network. Its size grows as long as all packets arrive and is reduced whenever packet loss occurs. The quantity of these increases and decreases usually varies for different loss-based algorithms. For example, Reno increases its congestion window by one for each RTT and reduces it by 50 % whenever packet loss is detected. As a result, Reno has both problems with RTT fairness and the utilization of long-delay links with large BDP.

TCP CUBIC increases the window according to a cubic slope as a function of the time since the last packet loss happened. It sets an inflection point as target, the area where it estimates to reach the optimum operating point. Close to the inflection point, CUBIC is conservative, trying to have a stable congestion window; however, the further away CUBIC gets from the inflection point, the more aggressively the congestion window is increased. In case of a loss event, CUBIC adjusts its estimated target conservatively, reducing it by 20 % [1].
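For reference, this shape corresponds to the window growth function defined in the CUBIC paper [5]; W_max denotes the window size at the last loss event, C is a scaling constant, and K is chosen such that W(0) equals the window immediately after the multiplicative decrease:

\[
W_{\text{cubic}}(t) = C \,(t - K)^3 + W_{\max}
\]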
Loss-based approaches suffer from two major problems. First, they are susceptible to random packet loss, since TCP interprets it as a signal for congestion and steadily reduces its congestion window, leading to under-utilization of the link. Secondly, they shift their operation point away from Kleinrock's optimal operation point (Figure 1). This permanently keeps the network buffers full, which is not a problem as long as the buffer size is small. However, with larger sizes the additional buffer delay grows, increasing the transmission delay of all connections using that link. This becomes a major problem for real-time applications like telephony or streaming via TCP.

2.3. Delay-based Congestion Control

Delay-based algorithms use the measured time between when a packet was sent and when the corresponding acknowledgment arrived to detect congestion. If this time increases, TCP assumes that a queue has formed somewhere on the path and reduces the sending rate. Thus, compared to the loss-based approach, congestion can be detected before any packet loss occurs.

TCP Vegas [12] is an example for a delay-based algorithm. It periodically measures the connection's RTT and stores the minimum measured as RTTmin. The congestion window is scaled linearly according to the difference between the measured RTT and RTTmin. Since Vegas keeps RTTmin for the whole connection, it cannot adapt to changes in the path's RTprop.

Delay-based congestion control algorithms are known to perform poorly when run in parallel with loss-based algorithms [21].

2.4. Hybrid Congestion Control

To combine the advantages of both approaches, hybrid algorithms were developed. TCP Illinois [16] uses packet loss as the primary signal for congestion, which decides if the congestion window should be increased or decreased. Additionally, the delay is taken into account to determine the quantity of the change. When the network is not congested, Illinois grows the congestion window fast and reduces it less drastically. When delay increases, the growth is slowed down, leading to a concave slope similar to CUBIC. Another example for a hybrid algorithm is Compound TCP [15], having both a delay-based and a loss-based component.

2.5. Bottleneck Bandwidth and Round-trip Propagation Time

Cardwell et al. proposed TCP BBR following a new approach named congestion-based congestion control, which is supposed to react only to actual congestion and not only to indicators as former algorithms do. In this section the basics of BBR that are important for our evaluation are described. Our deliberations are based on the version presented by Cardwell et al. [4] and we refer to their work for a detailed description of the congestion control algorithm, or to [7] for a formal analysis.

2.5.1. Overview

The main objective of BBR is to ensure that the bottleneck remains saturated but not congested, resulting in maximum throughput with minimal delay. Therefore, BBR estimates the bandwidth as the maximum observed delivery rate BtlBw and the propagation delay RTprop as the minimum observed RTT over certain intervals. Both values cannot be measured simultaneously, as probing for more bandwidth increases the delay through the creation of a queue at the bottleneck and vice versa. Consequently, they are measured separately.

To control the amount of data sent, BBR uses a pacing gain. This parameter, most of the time set to one, is multiplied with BtlBw to represent the actual sending rate.

2.5.2. Phases

The BBR algorithm has four different phases [22]: Startup, Drain, Probe Bandwidth, and Probe RTT.

The first phase adapts the exponential Startup behavior from CUBIC by doubling the sending rate with each round-trip. Once the measured bandwidth does not increase further, BBR assumes to have reached the bottleneck bandwidth. Since this observation is delayed by one RTT, a queue has already been created at the bottleneck. BBR tries to Drain it by temporarily reducing the pacing gain.

Afterwards, BBR enters the Probe Bandwidth phase in which it probes for more available bandwidth. This is performed in eight cycles, each lasting RTprop: First, the pacing gain is set to 1.25, probing for more bandwidth, followed by 0.75 to drain created queues. For the remaining six cycles BBR sets the pacing gain to 1. BBR continuously samples the bandwidth and uses the maximum as BtlBw estimator, whereby values are valid for the timespan of ten RTprop.
After not measuring a new RTprop value for ten seconds, BBR stops probing for bandwidth and enters the Probe RTT phase. During this phase the amount of data in flight is reduced to four packets to drain any possible queue and get a real estimation of the RTT. This phase is kept for 200 ms plus one RTT. If a new minimum value is measured, RTprop is updated and is valid for ten seconds.
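The interplay of the two windowed filters and the pacing-gain cycle can be summarized in a few lines of Python. This is an illustrative sketch based on the description above and on [4, 22], not the Linux implementation; the eight-phase gain cycle, the ten-RTprop BtlBw window and the ten-second RTprop window are the values named in the text, everything else is simplified.

```python
from collections import deque

# Probe Bandwidth gain cycle: probe (1.25), drain (0.75), then six cruise phases.
GAIN_CYCLE = [1.25, 0.75, 1, 1, 1, 1, 1, 1]

class BbrModel:
    """Toy model of BBR's BtlBw/RTprop estimation and pacing (not the kernel code)."""

    def __init__(self):
        self.btlbw_samples = deque()   # (time, delivery_rate), kept ~10 * RTprop
        self.rtprop = None             # minimum RTT sample, valid for 10 s
        self.rtprop_stamp = 0.0
        self.cycle_index = 0

    def on_ack(self, now, rtt_sample, delivery_rate):
        # RTprop: windowed minimum, refreshed when a lower sample arrives or it expires.
        if self.rtprop is None or rtt_sample <= self.rtprop or now - self.rtprop_stamp > 10.0:
            self.rtprop = rtt_sample
            self.rtprop_stamp = now
        # BtlBw: windowed maximum over roughly the last ten RTprop.
        self.btlbw_samples.append((now, delivery_rate))
        while self.btlbw_samples and now - self.btlbw_samples[0][0] > 10 * self.rtprop:
            self.btlbw_samples.popleft()

    def btlbw(self):
        return max((rate for _, rate in self.btlbw_samples), default=0.0)

    def pacing_rate(self, in_probe_bw=True):
        gain = GAIN_CYCLE[self.cycle_index] if in_probe_bw else 1.0
        return gain * self.btlbw()

    def advance_cycle(self):
        # One gain-cycle step lasts roughly one RTprop.
        self.cycle_index = (self.cycle_index + 1) % len(GAIN_CYCLE)
```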
3. TCP Measurement Framework

The development of our framework followed four requirements. Flexibility of the framework should allow analyzing aspects of TCP congestion control, focusing on but not limited to BBR. The Portability of our framework shall not be restricted to a specific hardware setup. Reproducibility of results obtained via the framework must be ensured. Given a configuration of an experiment, the experiment itself shall be repeatable. All important configuration parameters and the results should be gathered to allow replicability and reproducibility by others without the need for high-performance hardware and a testbed. The complete measurement process shall be simplified through Automation. Via configuration files and an experiment description, including post-processing of data and generation of plots, the experiment should be executed without further user interaction.

The full source code of our framework is available online [3].

3.1. Emulation Environment

Our framework uses emulation based on Linux network namespaces with Mininet. Linux network namespaces provide lightweight network emulation, including processes, to run hundreds of nodes on a single PC [23]. A drawback is that the whole system is limited by the hardware resources of a single computer. Thus we use low bandwidths of 10 Mbit/s for the different links in the studied topology. By showing in Section 4 that our measurements yield similar results as related work performing measurements beyond 10 Gbit/s, we argue that the difference in throughput does not affect the validity of the results.

3.2. Setup

Topology: As a TCP connection can be reduced to the bottleneck link (cf. Section 2.1), our setup uses a dumbbell topology depicted in Figure 2. For each TCP flow a new host pair, sender and receiver, is added for simplified collection of per-flow data. Both sides are connected via three switches. The middle switch acts as the bottleneck by performing traffic policing on its interface. The two additional switches allow capturing the traffic before and after the policing. Traffic from the receivers to the senders is not subject to rate limiting, since we only send data from the senders and the returning acknowledgment stream does not exceed the bottleneck bandwidth, assuming symmetric bottleneck bandwidth.

Figure 2: Mininet setup with sending and receiving hosts and bottleneck link.

Delay Emulation & Packet Loss: We use NetEm to add flow-specific delay at the links between the switch and the respective receivers to allow configurable RTTs. This approach introduces problems for higher data rates like 10 Gbit/s, where side effects (e.g. jitter) occur [7], but works well for the data rates we use. Additionally, a stochastic packet-loss rate can be specified.

Rate Limit & Buffer Size: We use Linux's Token Bucket Filter (TBF) for rate limiting and setting the buffer size. TBFs also allow a configurable amount of tokens to accumulate when they are not needed, and the configured rate can be exceeded until they are spent. We set this token bucket size to only hold a single packet, because exceeding the bottleneck bandwidth even for a short time interferes with BBR's ability to estimate the bottleneck bandwidth correctly [4].

BBR Pacing: To send data at the desired rate, BBR requires the TCP stack to pace outgoing packets. For Linux versions before 4.13, pacing support was not part of the TCP stack but implemented by the Fair Queue (FQ) queuing discipline. Since we performed our measurements on Linux 4.9, we explicitly configured the FQ queuing discipline on each BBR sender, but we verified that using FQ is no longer required in Linux 4.13.
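As a rough illustration of how such a setup can be assembled, the following sketch builds a minimal one-flow dumbbell in Mininet and attaches the TBF, NetEm and FQ queuing disciplines with tc. It is a simplified example under the assumptions described above, not the framework's actual code; the switch class, rates, delays and buffer parameters are example values.

```python
from mininet.net import Mininet
from mininet.node import OVSBridge
from mininet.cli import CLI

def build_dumbbell():
    """Minimal one-flow dumbbell: sender - s1 - s2 (bottleneck) - s3 - receiver."""
    net = Mininet(switch=OVSBridge, controller=None)
    sender = net.addHost('sender')
    receiver = net.addHost('receiver')
    s1 = net.addSwitch('s1')
    s2 = net.addSwitch('s2')
    s3 = net.addSwitch('s3')
    net.addLink(sender, s1)
    net.addLink(s1, s2)
    bottleneck = net.addLink(s2, s3)        # rate limiting on s2's side of this link
    delay_link = net.addLink(s3, receiver)  # per-flow delay towards the receiver
    net.start()

    # Token Bucket Filter on the bottleneck egress; burst of roughly one packet.
    s2.cmd(f'tc qdisc add dev {bottleneck.intf1} root tbf '
           'rate 10mbit burst 1600b latency 50ms')
    # NetEm delay on the link towards the receiver to emulate the flow's RTT.
    s3.cmd(f'tc qdisc add dev {delay_link.intf1} root netem delay 25ms')
    # BBR requires pacing; on kernels before 4.13 attach the FQ qdisc on the sender.
    sender.cmd(f'tc qdisc add dev {sender.defaultIntf()} root fq')
    sender.cmd('sysctl -w net.ipv4.tcp_congestion_control=bbr')
    return net

if __name__ == '__main__':
    net = build_dumbbell()
    CLI(net)
    net.stop()
```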
3.3. Workflow

Each experiment is controlled using a configuration file describing the flows. For each flow, the desired TCP congestion control algorithm, the start time in relation to the previous flow, the RTT, and the runtime have to be specified. The runtime of an experiment consists of a negligible period to set up Mininet, as well as the actual experiment defined by the length of the running flows. During the execution of the test only data is collected and nothing is analyzed yet. Thus, no collected data is altered due to computational processes running on the same system, and the collection and analysis can be split up onto different systems and moments in time. The analysis framework then automatically extracts data and computes the implemented metrics. This automation allows us to systematically evaluate parameter spaces. For example, Figure 12 shows the results of more than 800 individual experiments.

The analysis framework outputs CSV files for each computed metric to simplify further processing of the data. Additionally, a plot visualizing all gathered data is generated in the form of a PDF file.
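The configuration format itself is not reproduced here. Purely as a hypothetical illustration of the parameters listed above (congestion control algorithm, start offset relative to the previous flow, RTT, runtime), a flow description could look like the following Python structure; the keys and values are invented for illustration and do not reflect the framework's actual file format.

```python
# Hypothetical flow description; the real framework uses its own configuration format.
flows = [
    {"congestion_control": "bbr",   "start_after_previous_s": 0,  "rtt_ms": 50, "runtime_s": 60},
    {"congestion_control": "cubic", "start_after_previous_s": 10, "rtt_ms": 50, "runtime_s": 50},
]
```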
3.4. Metric Collection

For each TCP flow we gather the sending rate, throughput, current RTT, and the internal BBR values. We also sample the buffer backlog of the TBF in customizable intervals. We capture the packet headers up to the TCP layer of all packets before and after the bottleneck using tcpdump.

The raw data is processed afterwards to generate the metrics listed below. Existing tools like Wireshark (including the command line tool tshark) and tcptrace did not meet all our requirements for flexibility. Instead we wrote our own analysis program in Python. It uses the dpkt module (https://pypi.python.org/pypi/dpkt) to extract the information from the packet captures.

As a result of one experiment, a report containing 14 graphs visualizing the metrics over time is automatically generated. Sample configuration files can be found with our source code publication [3].

Sending Rate & Throughput: A useful metric for analyzing TCP fairness is the sending rate, which can be larger than the maximum transmission rate of the link, resulting in congestion at the bottleneck. We compute the per-flow and total aggregated sending rate as the average bit rate based on the IP packet size in modifiable intervals, using the capture before the bottleneck. The time interval Δt should be small enough to deliver precise values and large enough to avoid oscillations. For example, if Δt is so small that either 1 or 0 packets arrive per interval, then the computed sending rate alternates between 0 and packet size/Δt.

The throughput is computed in the same way as the sending rate, but is based on the capture after the bottleneck to observe the effect of the traffic policing.
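As a sketch of how such a per-interval rate can be derived from a capture with dpkt (the module named above), the following bins IP packet sizes into fixed Δt intervals. It is a simplified stand-in for our analysis code, not a copy of it; the interval length and file name are example values.

```python
import dpkt

def sending_rate(pcap_path, interval=0.1):
    """Return a list of (bin_start_time, bit_rate) computed from IP packet sizes."""
    bins = {}
    with open(pcap_path, 'rb') as f:
        for ts, buf in dpkt.pcap.Reader(f):
            eth = dpkt.ethernet.Ethernet(buf)
            if not isinstance(eth.data, dpkt.ip.IP):
                continue                       # skip non-IP frames (e.g. ARP)
            ip = eth.data
            bin_start = int(ts / interval) * interval
            bins[bin_start] = bins.get(bin_start, 0) + ip.len * 8
    return [(t, bits / interval) for t, bits in sorted(bins.items())]

# Example: average bit rate before the bottleneck in 100 ms intervals.
# for t, rate in sending_rate('sender_capture.pcap'):
#     print(f'{t:.1f}s {rate / 1e6:.2f} Mbit/s')
```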
Fairness: We follow the recommendation of RFC 5166 [24] and use Jain's Index [25] as fairness coefficient based on the sending rate to indicate how fairly the bandwidth is shared between all flows. For n flows, each of them allocating x_i ≥ 0 of a resource,

F = \frac{1}{n} \cdot \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{\sum_{i=1}^{n} x_i^2}

is 1 if all flows receive the same bandwidth and 1/n if one flow uses the entire bandwidth while the other flows receive nothing. The index allows quantifying the fairness in different network setups independent of the number of flows or the bottleneck bandwidth. Graphs displaying the fairness index in the remaining part of this paper are restricted to the interval [1/n, 1] unless mentioned otherwise.
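A direct translation of this definition into Python is all the metric needs; the helper below is illustrative and not necessarily the framework's exact implementation.

```python
def jains_index(rates):
    """Jain's fairness index for a list of per-flow sending rates (all >= 0)."""
    n = len(rates)
    total_sq = sum(rates) ** 2
    sq_total = sum(r * r for r in rates)
    return 1.0 if sq_total == 0 else total_sq / (n * sq_total)

assert jains_index([5, 5, 5, 5]) == 1.0                   # equal shares -> 1
assert abs(jains_index([20, 0, 0, 0]) - 0.25) < 1e-9      # one flow takes all -> 1/n
```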
Round-trip Time: We use the TCP Timestamp option [26] to measure the current RTT for each arriving acknowledgment. To reduce the amount of data points, the average RTT is computed in time intervals.

Retransmissions: We count retransmissions of TCP segments in the packet capture before the bottleneck. We use these as an indicator for packet loss in our evaluation.

Inflight Data: Refers to the number of bytes sent but not yet acknowledged. We obtain this value by computing the difference of the maximum observed sequence and acknowledgment numbers in the capture before the bottleneck. This metric is only useful when there are no retransmissions.

Bottleneck Buffer Backlog: We use Linux's traffic control tool to extract the current backlog of the bottleneck buffer. We count the backlog length in bits in a modifiable interval, e.g. 20 ms. This interval was chosen to see the impact of short-lived effects without adding too much computational and storage overhead. This allows getting a clearer view of the current congestion of the bottleneck.

Congestion Control Internal Values: Congestion control algorithms keep different internal metrics, like for instance the congestion window or the slow start threshold. In addition, BBR keeps track of the estimated bottleneck bandwidth and RTT as well as the pacing and window gain factors. We extract these values every 20 ms using the ss tool from the iproute2 tools collection.
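One way to poll these values, sketched below, is to invoke ss -ti in a loop and append the timestamped raw output to a log for later parsing. This is an illustration of the approach rather than the framework's actual collector; the output file name, interval and duration are example values.

```python
import subprocess
import time

def poll_ss(out_path='ss_samples.log', interval=0.02, duration=60.0):
    """Record `ss -ti` output (TCP internals incl. BBR state) every `interval` seconds."""
    end = time.monotonic() + duration
    with open(out_path, 'w') as out:
        while time.monotonic() < end:
            sample = subprocess.run(['ss', '-ti'], capture_output=True, text=True)
            out.write(f'# t={time.time():.3f}\n')
            out.write(sample.stdout)
            time.sleep(interval)

if __name__ == '__main__':
    poll_ss()
```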
3.5. Limitations

Due to resource restrictions on a single-host emulated network, the framework is limited both by the bottleneck bandwidth and the number of concurrent flows. We ran tests both on a normal notebook and on more powerful commercial off-the-shelf server hardware with different CPUs. In Figure 3 the CPU utilization of different host systems running the framework is depicted. The used bandwidth hardly impacts the performance during the emulation step, but increases the duration of the analysis step since the packet capture files grow in size. The number of flows is limited by the CPU, since we use individual sender and receiver hosts for each flow, each having its own network namespace. To provide valid results, the host system's resources should not be fully utilized. For example, we observed a drop in the polling rate of the internal TCP values at about 50 % CPU utilization for each test. However, the results show that when using a bandwidth below 10 Mbit/s and up to 20 concurrent flows, our methodology provides sufficient accuracy compared to measurements utilizing real hardware. This is because both approaches use the same network stack, i.e., the same implementation of the BBR algorithm.

4. Reproduction & Extension of Related Work

We validate the accuracy of our framework by using it to reproduce the results of related work that were based on measurements with hardware devices. The results show that the behavior of TCP BBR at bandwidths in the Mbit/s range is comparable to the behavior at higher ranges of Gbit/s. In the following, we present our reproduced results with a mention of the respective related work.

We focus on the results of two research groups. Cardwell et al., the original authors of BBR, have described …
Figure 3: CPU Utilization when running the framework using 10 Mbit/s (solid) and 100 Mbit/s (dashed) links, for an Intel Core i5-2520M (2 cores @ 2.5 GHz), an Intel Xeon E3-1230 (4 cores @ 3.2 GHz), and an Intel Xeon E5-2640 v2 (8 cores @ 2.0 GHz).

[Figure: sending rate with BtlBw, RTT with RTprop, inflight data with BDP, estimated BtlBw, and estimated RTprop over time.]

… increased RTT, and drained in the following five seconds. Instead of an additional bandwidth reduction, we tripled RTprop at the 56 s mark. The results are surprising at first. Similar to a decrease in BtlBw, BBR cannot adapt to an increase in RTprop immediately, since the minimum filter retains an old, lower value for another 10 s. When RTprop grows, the acknowledgments for the packets take longer to arrive, which increases the inflight data until the congestion window is reached, which BBR sets to 2 BDP. To adapt, BBR has to limit its sending rate, resulting in lower samples for BtlBw. As soon as the BtlBw estimate expires, the congestion window is reduced according to the new, lower BDP. This happens repeatedly until the old minimum value for RTprop is invalidated (at approx. …) … the sending rate again to match BtlBw with exponential growth.

… First, even though the sending rate drops, the inflight data does not decrease compared to before the RTT increase. However, larger increases of the RTT can lead to BBR utilizing less than 20 % of the available bandwidth for up to 10 s. Second, it is unlikely that such a drastic change in RTT happens in the Internet in the first place.
Figure 5: RTT Unfairness of BBR; (a) Two BBR flows with different RTTs, (b) Multiple BBR flows with two groups of RTTs.

Figure 7: Influence of different RTT on simultaneous Probe RTT phases; (a) RTprop = 10 ms, (b) RTprop = 100 ms, (c) Overlapping Probe RTT phase duration (200 ms + RTT vs. 200 ms + 2.5 RTT).

Figure 8: Retransmission rate after 25 s for 5 simultaneously started flows (CUBIC and BBR) with different bottleneck buffer sizes.

Figure 9: Competing BBR and CUBIC flow.
Figure 10: One CUBIC vs. one BBR flow for changing network conditions; (a) Increasing buffer with 50 ms RTT, (b) Increasing RTT with 2.3 BDP buffer, (c) One flow with 50 ms RTT for 1, 2.5 and 5 BDP buffer.
… larger share for BBR. Only two or fewer CUBIC flows manage to get more bandwidth than their fair share at all.

[Figure: total share of the bandwidth obtained by the CUBIC flows for 1, 2, 5, and 10 competing BBR flows.]

Figure 12: BBR's fairness towards other algorithms. The shading describes the difference between the actual gained share of all BBR flows and the fair share in percent. Positive values indicate that BBR is claiming more bandwidth than it should. Contour lines are given for 0 (dashed) and 20 % (dotted).

5.2. Competing with Delay-based Algorithm

Similar to BBR, Vegas also tries to keep the delay as low as possible. Therefore, not only the fairness between Vegas and BBR is evaluated, but also whether both algorithms can keep the bottleneck queue small when running simultaneously.

Regarding the throughput, Vegas performs poorly when running in parallel with BBR. Independently of the order in which the flows are started, Vegas receives only 5 to 10 % of the total bandwidth (see Figure 15). After the BBR flow starts, a queue is created at the bottleneck since Vegas already fully utilizes the link. This induces additional delay and Vegas reacts accordingly by decreasing its congestion window. Hence, BBR can measure higher values for BtlBw, increasing its sending rate further until Vegas' congestion window cannot get any lower.

In this experiment, BBR makes accurate RTprop and BtlBw estimates. Still, it does not reach its desired operation point and is capped by its congestion window, which adds one BDP of data to the queue. When the BBR flow is in Probe RTT, the Vegas flow measures a lower RTT and starts increasing its congestion window. After Probe RTT, BBR returns to its previous sending rate. The total sending rate of both flows exceeds the link's bandwidth and a queue is created. Since Vegas cannot decrease its congestion window further and BBR only adjusts its sending rate to its BtlBw estimation, this queue cannot be removed until the next Probe RTT phase. Then the whole process repeats. Thus, a parallel Vegas flow hardly impacts the throughput of BBR, but it manages to push BBR into a state in which BBR maintains a persistent queue without the capability of draining it.

When considering throughput, Vegas performs poorly against BBR independently of the number of Vegas flows (Figure 12c).

Figure 13: Relative frequency of RTprop estimations for different numbers of BBR flows competing with one CUBIC flow on a 50 ms link.

Figure 14: Size of the persistent queue at the bottleneck buffer with increasing number of BBR flows. More than ten BBR flows fail to drain the queue completely.

Figure 15: BBR competing with Vegas.

5.3. Competing with Loss-delay-based Algorithm

Lastly, we compare BBR to the loss-delay-based Illinois algorithm. Figure 12d is similar to the other loss-based algorithms, since Illinois still uses packet loss as the primary signal for congestion. However, Illinois can achieve higher throughput against BBR than Reno, because Illinois slows down when increasing delay is measured, which also increases
the time until the next congestion event occurs. Additionally, Illinois is affected less by the constant packet loss with increasing number of BBR flows due to the smaller multiplicative decrease.

5.4. Summary

Overall, there are several different factors, such as buffer size, RTT or number of flows, which influence the fairness of BBR towards other algorithms. For competing loss-based algorithms, the deciding factor is the used buffer size, while Vegas completely starves but manages to move BBR's operation point towards a persistent queue at the bottleneck. For most configurations, BBR received a too large share, mostly due to being too aggressive and not considering any implicit congestion signals. However, BBR did not starve against another algorithm in any of our tests.

6. Inter-flow Synchronization

Different BBR flows synchronize themselves to avoid faulty estimations, e.g., when one flow probes for bandwidth, causing a queue to form at the bottleneck, while another probes for RTT. In contrast to loss-based algorithms, this does not correlate with congestion, as the flows are impervious to loss.

6.1. Theory & Questions

Cardwell et al. demonstrate in [4, Fig. 6] how different BBR flows synchronize whenever a large flow enters the Probe RTT phase. We visualize the process in Figure 16 with one new flow joining four already synchronized flows. The new flow immediately overestimates the bottleneck link and claims a too large share of the bandwidth. 10 s later it enters Probe RTT. The flow with the bigger share drains a large portion of packets from the queue, which results in all other flows measuring a better RTprop estimate. Consequently, the flows are synchronized, as the RTprop samples of all flows expire at the same time, causing them to enter Probe RTT together at the 81 s mark. Considering the fairness, it takes approximately 35 s after the new flow joined until equilibrium is reached.

[Figure 16: One new BBR flow joining four existing, synchronized BBR flows; sending rates and fairness F over time, with Tjoin and TF95 marked.]

To maximize performance, BBR should only spend 2 % of its time in Probe RTT [4, 22]. Therefore, new flows have trouble measuring the correct RTprop, as active flows likely probe for more bandwidth and create queues. This causes the new flow to overestimate the BDP, inducing queuing delay or packet loss.

This raises two questions regarding the synchronization behavior of BBR flows: Is there an optimal and a worst moment, regarding the time until equilibrium is reached, for a single flow to join a bottleneck containing already synchronized BBR flows? And secondly, we want to determine if constantly adding new flows can result in extended or accumulated unfairness.

Figure 17: Single BBR flow joining synchronized BBR flows; (a) Join during different times of the Probe RTT cycle (red area marks Probe RTT phases), (b) Correlation between Startup/Drain and Probe RTT for joining 2 s and 1.7 s before the next Probe RTT phase.

6.2. Synchronization Metrics

To quantify the impact of a new flow joining, we use two metrics based on Jain's fairness index F. For better comparison we define Tjoin as the point in time when the flow of interest, i.e. the last flow, has joined the network (cf. Figure 16). As a first metric, we define TF95 as the point after Tjoin for which F remains stable above 0.95, i.e. no longer than 2 s below this threshold. Second, we compute the average fairness Favg in the interval [Tjoin, Tjoin + 30 s].

In the following we analyze the behavior of flows with equal RTTs. We assume that all effects described in the following will scale similarly as described in Section 4.2 with RTT unfairness between flows.
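A small helper illustrating how the two metrics can be computed from a sampled fairness time series is shown below; it is a sketch under the definitions above, with the 0.95 threshold, 2 s dip tolerance and 30 s window taken from the text and everything else (sampling, names) assumed.

```python
def t_f95(times, fairness, t_join, threshold=0.95, max_dip=2.0):
    """Seconds after t_join until F stays above `threshold` with no dip longer than `max_dip`."""
    candidate, dip_start = None, None
    for t, f in zip(times, fairness):
        if t < t_join:
            continue
        if f >= threshold:
            if candidate is None:
                candidate = t
            dip_start = None
        else:
            if dip_start is None:
                dip_start = t
            if candidate is not None and t - dip_start > max_dip:
                candidate = None          # dip too long, start over
    return None if candidate is None else candidate - t_join

def f_avg(times, fairness, t_join, window=30.0):
    """Average fairness in [t_join, t_join + window]."""
    values = [f for t, f in zip(times, fairness) if t_join <= t <= t_join + window]
    return sum(values) / len(values) if values else float('nan')
```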
6.3. Single Flow Synchronization Behavior

To analyze the basic synchronization behavior, we use the scenario of one new BBR flow joining a network with four other BBR flows that are already synchronized and have converged to a fair share. Figure 17a shows our experimental evaluation when joining a new flow in relation to the Probe RTT phase of the synchronized flows.
Figure 18: Different join intervals for subsequent flows.

As expected, a periodic behavior is revealed, with the best case for a new flow to join being during the Probe RTT phase. It synchronizes immediately, as the queues are drained and the new flow can measure the optimal RTT, leading to a low TF95 and a high Favg. The worst case is if the flow joins directly after the other flows left the Probe RTT phase. At this point, the queue is building again as the flows keep 2 BDP inflight, resulting in the new flow severely overestimating the BDP. It remains in this state until the old flows enter Probe RTT again (up to 10 s later), draining the queue and synchronizing with the new flow. This behavior of aggressively taking bandwidth from existing flows can be harmful when many short-lived BBR flows join, leading to starvation of long-lived flows.

In general, it takes 20 s until TF95 is reached, but the later the new flow joins during the cycle, the more TF95 varies (10 to 30 s). The local optimum when joining 2 s before the Probe RTT phase with TF95 = 10 s arises because the existing flows enter the Probe RTT phase while the new flow drains after the Startup, as shown in Figure 17b. Consequently, all flows drain the queue and measure a new optimal RTprop, synchronizing immediately, yet overestimating the bottleneck because the queue created during Startup is not entirely drained yet. In contrast, the worse case directly afterwards (1.7 s before the next Probe RTT) with TF95 = 22 s is caused by the existing flows entering Probe RTT, draining the queue, while the new flow is in Startup. This causes the new flow to drastically overestimate the bottleneck until leaving Startup, suppressing other flows.

Considering the prevalence of short-lived flows in the Internet [27, 5], this high TF95 value poses a significant disadvantage of TCP BBR. Initially, flows during this time suppress other flows through unfair bandwidth claims, which is only solved when reaching a fair share.

6.4. Accumulating Effects

To evaluate if negative effects of multiple flows joining can accumulate, i.e. whether the duration of unfairness can be prolonged, we change the scenario to have a new flow join every x seconds up to a total of five BBR flows (cf. Figure 18).

Optima are visible for intervals matching the duration of the Probe RTT phase of the already active flows at approximately 10 s and 20 s. When all flows join at the same … stabilize. As for a single flow, TF95 and Favg improve with increasing interval. For flows joining every 5 s an additional local optimum is visible as every second flow joins during the Probe RTT phase of the other flows. For intervals larger than one Probe RTT cycle (after flows leave Probe RTT, approximately 10.5 s), TF95 and Favg show the behavior for a single flow joining. This is because all prior flows have already synchronized, resulting in them already converging towards an equilibrium before the next flow joins.

Analyzing the effect of a new flow joining on individual existing flows, e.g. the longest running flow, is difficult for the lack of a good metric. We therefore select the best and worst case join intervals displayed in Figure 19 for a visual analysis. As the minimum value of F depends on the number of flows (1/n), it is normalized using percentages.

Figures 19a and 19b show the effects of subsequent flows joining during (best case) or immediately after (worst case) the Probe RTT phase. Similar to the effects on the last flow joining, existing flows are only influenced by the timing of the next flow joining. Within the group of synchronized flows, they converge to their fair share. The synchronization itself depends on the timing and happens at most after 10 s. The resulting unfairness is only caused by the new flow. The overall time until bandwidth equilibrium is approximately 55 s and 70 s, respectively. We attribute the 15 s difference to the longer synchronization phase in the latter case (10 s) and the bigger unfairness thereof.

6.5. Further Observations

Further measurements showed that many BBR flows also have problems synchronizing. When running 50 BBR flows, the required buffer size to run without packet loss is about 10 BDP on a 10 Mbit/s, 100 ms link. If the buffer is too small, BBR constantly overwhelms the network and completely fails to synchronize. In this case no more than 10 % of the flows are in Probe RTT at the same time. Otherwise, the flows can synchronize and reach a fair bandwidth share. However, two issues remain. First, if the RTT of the flows is too short, not all flows are in Probe RTT at the same time. This leads to undrained queues and RTprop overestimations (cf. Figure 7). Second, the whole synchronization process takes more than one minute when starting the flows in intervals of 0.2 s.

Summarizing, the fair sharing of bandwidth is intertwined with the timing of new flows joining the network. Except during the brief Probe RTT phase, equilibrium is only reached after 20 s and can extend up to 30 s. However, there are no effects accumulating beyond the interval of one Probe RTT phase. The timing only has a short-term effect on the amplitude of unfairness, not on TF95.
[Figure 19: Sending rates and fairness F over time for subsequent BBR flows joining; (a) 10.1 s join interval (during Probe RTT), (b) 10.5 s join interval (immediately after Probe RTT).]
… achieves similar throughput with decreased average delay compared to CUBIC for large files, but higher throughput with higher self-inflicted delay for small downloads [31]. Atxutegi et al. also tested BBR on cellular networks, resulting in BBR performing better than other algorithms such as CUBIC and Reno [30]. However, they also detected that BBR has problems when encountering long or 4G latencies. Crichigno et al. measured an improved performance of BBR flows with larger maximum segment sizes using parallel streams [45]. This especially affects long-lived flows transporting huge amounts of data, so-called elephant flows. Integration of BBR for QUIC is work in progress [32].

7.4. Further Development of BBR

Since its first publication, BBR has been under active development by the authors and the research community. At IETF 102, Cardwell et al. proposed ideas for improving on several of the discovered issues [46], calling the algorithm BBR 2.0. The slow synchronization is addressed by reducing the time span between the Probe RTT phases, for example to 2 s. This is supposed to improve the fairness towards other BBR and also CUBIC flows. Still, it remains a challenging task to reach fairness with loss-based flows. Other areas of research are how BBR can handle fluctuating RTTs, as in WiFi environments or in the presence of delayed acknowledgments, and the influence of ACK aggregation [47]. Furthermore, explicit congestion notifications (ECN) and also packet loss are taken into account to improve BBR's network model. This reduces the retransmission rate of BBR when running on shallow buffers. Lastly, the problem of BBR maintaining a persistent queue when running in parallel with other BBR flows is addressed by adding mechanisms to drain existing queues more frequently.

Once these improvements are included in the Linux kernel, our framework can easily be used for validation by rerunning all tests and seeing how the results have changed. For all tests in this paper, the scripts to generate the configurations and analyze the results are published at [3].

8. Conclusion

We presented a framework for TCP congestion control measurements focusing on flexibility, portability, reproducibility and automation. Using Mininet to emulate different user-configured flows, it allows performing experiments analyzing a concrete algorithm, or the interaction of multiple flows, even using different algorithms. We reproduced related work to validate the applicability of our approach.

We use the new TCP BBR algorithm as a case study for our framework, summarizing the current state of the algorithm and extending existing insights in several aspects. In particular, we have shown that the algorithm to determine the duration of the Probe RTT phase has problems and that in most cases BBR does not share bandwidth in a fair manner with any of the tested algorithms like Reno, CUBIC, Vegas and Illinois.

Our final contribution is an experimental analysis of the synchronization mechanism. We identified two primary problems. Depending on the timing of new flows joining existing flows in relation to their Probe RTT phase, bandwidth can be shared severely unfairly. This boils down to BBR's general problem of overestimating the BDP. The second problem is the time until a bandwidth equilibrium is regained. This can last up to 30 s, which is bad for short-lived flows, common in today's Internet. We identified that this is correlated with the trigger for synchronization, i.e. the Probe RTT phase, draining the queues.
Consequently, without reducing the time between Probe RTT phases, the worst case time until flows synchronize cannot be improved further.

As BBR's underlying model is flawed in several aspects, drastic changes are proposed for BBR 2.0. Our framework aids the active development and improvement, as all experiments can easily be repeated and reproduced. This allows us to easily verify the impact of changes to the algorithm, quantify improvements and avoid regressions. Our framework as well as the raw data for all figures presented is available online [3] for replicability of our results and to allow further investigations by the research community.

Acknowledgment

This work was supported by the High-Performance Center for Secure Networked Systems and the German BMBF project SENDATE-PLANETS (16KIS0472).

The paper at hand is an extended version of our paper presented at the IFIP 2018 conference [48]. This paper focuses more on the framework for reproducible measurements and expands the former explanations about congestion control and the framework. Furthermore, the evaluation of BBR's inter-protocol fairness has been extended significantly.

References

[1] A. Afanasyev, N. Tilley, P. Reiher, L. Kleinrock, Host-to-host congestion control for TCP, IEEE Communications Surveys & Tutorials 12 (3) (2010) 304–342.
[2] P. Yang, J. Shao, W. Luo, L. Xu, J. Deogun, Y. Lu, TCP congestion avoidance algorithm identification, IEEE/ACM Transactions on Networking 22 (4) (2014) 1311–1324.
[3] Framework and Data Publication, URL https://gitlab.lrz.de/tcp-bbr, 2018.
[4] N. Cardwell, Y. Cheng, C. S. Gunn, S. H. Yeganeh, V. Jacobson, BBR: Congestion-based Congestion Control, ACM Queue 14 (5).
[5] S. Ha, I. Rhee, L. Xu, CUBIC: a new TCP-friendly high-speed TCP variant, ACM SIGOPS Operating Systems Review 42 (5).
[6] L. Kleinrock, Power and Deterministic Rules of Thumb for Probabilistic Problems in Computer Communications, in: Proceedings of the International Conference on Communications, vol. 43, 1979.
[7] M. Hock, R. Bless, M. Zitterbart, Experimental Evaluation of BBR Congestion Control, in: 25th IEEE International Conference on Network Protocols (ICNP 2017), 2017.
[8] N. Cardwell, Y. Cheng, C. S. Gunn, S. H. Yeganeh, V. Jacobson, I. Swett, J. Iyengar, V. Vasiliev, BBR Congestion Control: IETF 100 Update: BBR in shallow buffers, IETF 100, Presentation Slides, URL https://datatracker.ietf.org/meeting/100/materials/slides-100-iccrg-a-quick-bbr-update-bbr-in-shallow-buffers/.
[9] S. Ma, J. Jiang, W. Wang, B. Li, Towards RTT Fairness of Congestion-Based Congestion Control, CoRR abs/1706.09115, URL http://arxiv.org/abs/1706.09115.
[10] J. Gettys, K. Nichols, Bufferbloat: Dark Buffers in the Internet, Commun. ACM 55 (1), ISSN 0001-0782, doi: 10.1145/2063176.2063196.
[11] M. Allman, V. Paxson, E. Blanton, TCP Congestion Control, Tech. Rep., 2009.
[12] L. S. Brakmo, L. L. Peterson, TCP Vegas: End to End Congestion Avoidance on a Global Internet, IEEE Journal on Selected Areas in Communications 13 (8).
[13] R. Mittal, N. Dukkipati, E. Blem, H. Wassel, M. Ghobadi, A. Vahdat, Y. Wang, D. Wetherall, D. Zats, et al., TIMELY: RTT-based Congestion Control for the Datacenter, in: ACM SIGCOMM Computer Communication Review, ACM, 2015.
[14] M. Hock, F. Neumeister, M. Zitterbart, R. Bless, TCP LoLa: Congestion Control for Low Latencies and High Throughput, in: 2017 IEEE 42nd Conference on Local Computer Networks, 2017.
[15] K. Tan, J. Song, Q. Zhang, M. Sridharan, A Compound TCP Approach for high-speed and long Distance Networks, in: Proceedings IEEE INFOCOM, 2006.
[16] S. Liu, T. Başar, R. Srikant, TCP-Illinois: A loss- and delay-based congestion control algorithm for high-speed networks, Performance Evaluation 65 (6-7) (2008) 417–440.
[17] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, M. Sridharan, Data Center TCP (DCTCP), in: Proceedings of the 2011 ACM SIGCOMM Conference, vol. 41, ACM, doi: 10.1145/1851275.1851192, 2011.
[18] K. Winstein, H. Balakrishnan, TCP Ex Machina: Computer-generated Congestion Control, in: Proceedings of the 2013 Conference on ACM SIGCOMM, vol. 43, ACM, doi: 10.1145/2534169.2486020, 2013.
[19] M. Dong, Q. Li, D. Zarchy, P. B. Godfrey, M. Schapira, PCC: Re-architecting congestion control for consistent high performance, in: 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), 2015.
[20] L. Xu, K. Harfoush, I. Rhee, Binary increase congestion control (BIC) for fast long-distance networks, in: INFOCOM 2004, Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 4, IEEE, 2004.
[21] J. Mo, R. J. La, V. Anantharam, J. Walrand, Analysis and comparison of TCP Reno and Vegas, in: INFOCOM '99, Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 3, IEEE, 1556–1563, 1999.
[22] N. Cardwell, Y. Cheng, S. Yeganeh, V. Jacobson, BBR Congestion Control, Internet-Draft draft-cardwell-iccrg-bbr-congestion-control-00, IETF Secretariat, URL http://www.ietf.org/internet-drafts/draft-cardwell-iccrg-bbr-congestion-control-00.txt, 2017.
[23] B. Lantz, B. Heller, N. McKeown, A Network in a Laptop: Rapid Prototyping for Software-Defined Networks, in: Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks, 2010.
[24] S. Floyd, Metrics for the Evaluation of Congestion Control Mechanisms, RFC 5166, RFC Editor, 2008.
[25] R. Jain, D.-M. Chiu, W. R. Hawe, A Quantitative Measure of Fairness and Discrimination for Resource Allocation in Shared Computer System, vol. 38, Eastern Research Laboratory, Digital Equipment Corporation, Hudson, MA, 1984.
[26] V. Jacobson, R. Braden, D. Borman, RFC 1323: TCP Extensions for High Performance.
[27] S. Ebrahimi-Taghizadeh, A. Helmy, S. Gupta, TCP vs. TCP: a systematic study of adverse impact of short-lived TCP flows on long-lived TCP flows, in: INFOCOM 2005, 24th Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 2, IEEE, 2005.
[28] N. Kuhn, MPTCP and BBR performance over Internet satellite paths, IETF 100, Presentation Slides, URL https://datatracker.ietf.org/meeting/100/materials/slides-100-iccrg-mptcp-and-bbr-performance-over-satcom-links/.
[29] W. K. Leong, Z. Wang, B. Leong, TCP Congestion Control Beyond Bandwidth-Delay Product for Mobile Cellular Networks, in: Proceedings of the 13th International Conference on Emerging Networking EXperiments and Technologies, ACM, 2017.
[30] E. Atxutegi, F. Liberal, H. K. Haile, K.-J. Grinnemo, A. Brunstrom, A. Arvidsson, On the Use of TCP BBR in Cellular Networks, IEEE Communications Magazine 56 (3) (2018) 172–179.
[31] F. Li, J. W. Chung, X. Jiang, M. Claypool, TCP CUBIC versus BBR on the Highway, in: International Conference on Passive and Active Network Measurement, Springer, 269–280, 2018.
[32] A. M. Kakhki, S. Jero, D. Choffnes, C. Nita-Rotaru, A. Mislove, Taking a Long Look at QUIC, in: Proceedings of the 2017 Internet Measurement Conference, 2017.
[33] D. G. Feitelson, From Repeatability to Reproducibility and Corroboration, SIGOPS Oper. Syst. Rev. 49 (1), ISSN 0163-5980, doi: 10.1145/2723872.2723875.
[34] O. Bonaventure, April 2016: Editor's Message, in: ACM SIGCOMM CCR, 2016.
[35] S. Gallenmüller, D. Scholz, F. Wohlfart, Q. Scheitle, P. Emmerich, G. Carle, High-Performance Packet Processing and Measurements (Invited Paper), in: 10th International Conference on Communication Systems & Networks (COMSNETS 2018), Bangalore, India, 2018.
[36] A. Brooks, J. Daly, J. Miller, M. Roper, M. Wood, Replication's role in experimental computer science, EfoCS-5-94 (RR/94/171).
[37] V. Bajpai, M. Kühlewind, J. Ott, J. Schönwälder, A. Sperotto, B. Trammell, Challenges with Reproducibility, in: ACM SIGCOMM'17 Workshop on Reproducibility, 2017.
[38] Q. Scheitle, M. Wählisch, O. Gasser, T. C. Schmidt, G. Carle, Towards an Ecosystem for Reproducible Research in Computer Networking, in: ACM SIGCOMM'17 Workshop on Reproducibility, 2017.
[39] N. Handigol, B. Heller, V. Jeyakumar, B. Lantz, N. McKeown, Reproducible Network Experiments Using Container-based Emulation, in: Proceedings of the 8th International Conference on Emerging Networking Experiments and Technologies, CoNEXT '12, ACM, ISBN 978-1-4503-1775-7, doi: 10.1145/2413176.2413206, 2012.
[40] L. Yan, N. McKeown, Learning Networking by Reproducing Research Results, SIGCOMM Comput. Commun. Rev. 47 (2), ISSN 0146-4833, doi: 10.1145/3089262.3089266.
[41] C. Paasch, R. Khalili, O. Bonaventure, On the Benefits of Applying Experimental Design to Improve Multipath TCP, in: Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies, CoNEXT '13, ACM, ISBN 978-1-4503-2101-3, doi: 10.1145/2535372.2535403, 2013.
[42] B. Girardeau, S. Steele, Reproducing Network Research (CS 244 '17): Congestion-based Congestion Control With BBR, URL https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs-244-17-congestion-based-congestion-control-with-bbr/, 2017.
[43] C. A. Grazia, N. Patriciello, M. Klapez, M. Casoni, A cross-comparison between TCP and AQM algorithms: Which is the best couple for congestion control?, in: Proceedings of the 14th International Conference on Telecommunications (ConTEL), IEEE, 2017.
[44] F. Y. Yan, J. Ma, G. D. Hill, D. Raghavan, R. S. Wahby, P. Levis, K. Winstein, Pantheon: the training ground for Internet congestion-control research, in: 2018 USENIX Annual Technical Conference (USENIX ATC 18), 731–743, URL http://pantheon.stanford.edu, 2018.
[45] J. Crichigno, Z. Csibi, E. Bou-Harb, N. Ghani, Impact of Segment Size and Parallel Streams on TCP BBR, in: 2018 41st International Conference on Telecommunications and Signal Processing (TSP), IEEE, 1–5, 2018.
[46] N. Cardwell, Y. Cheng, et al., BBR Congestion Control Work at Google: IETF 102 Update, IETF 102, Presentation Slides, URL https://datatracker.ietf.org/meeting/102/materials/slides-102-iccrg-an-update-on-bbr-work-at-google-00.
[47] N. Cardwell, Y. Cheng, et al., BBR Congestion Control Work at Google: IETF 101 Update, IETF 101, Presentation Slides, URL https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00.
[48] D. Scholz, B. Jaeger, L. Schwaighofer, D. Raumer, F. Geyer, G. Carle, Towards a Deeper Understanding of TCP BBR Congestion Control, in: IFIP Networking 2018, Zurich, Switzerland, 2018.