Chapter 3: roadmap
▪ Transport-layer services
▪ Multiplexing and demultiplexing
▪ Connectionless transport: UDP
▪ Principles of reliable data transfer
▪ Connection-oriented transport: TCP
▪ Principles of congestion control
▪ TCP congestion control
▪ Evolution of transport-layer functionality
Transport Layer: 3-63
TCP congestion control
The congestion window (cwnd) imposes a constraint on the rate at which a TCP sender
can send traffic into the network.
Specifically, the amount of unacknowledged data at a sender may not exceed the
minimum of cwnd and rwnd (the receive window).
The sending rate is adjusted by additive increase, multiplicative decrease (AIMD).
TCP congestion control: details
▪ TCP sender limits transmission: LastByteSent - LastByteAcked ≤ min(cwnd, rwnd)
▪ cwnd is dynamically adjusted in response to observed network congestion (implementing TCP congestion control)
TCP sending behavior:
▪ roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes
▪ TCP rate ≈ cwnd/RTT bytes/sec
[Figure: sender sequence number space, showing last byte ACKed; bytes sent but not-yet ACKed ("in-flight"), spanning cwnd; last byte sent; bytes available but not used]
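The sender-side limit and the rate estimate can be sketched in a few lines. This is a minimal illustration, not TCP's actual implementation; the function names usable_window and approx_rate are hypothetical, and all quantities are in bytes and seconds.

```python
def usable_window(last_byte_sent, last_byte_acked, cwnd, rwnd):
    """Bytes the sender may still transmit without exceeding min(cwnd, rwnd)."""
    in_flight = last_byte_sent - last_byte_acked   # sent but not-yet ACKed
    return max(0, min(cwnd, rwnd) - in_flight)


def approx_rate(cwnd, rtt):
    """Rough sending rate in bytes/sec: one window of cwnd bytes per RTT."""
    return cwnd / rtt


# 3000 bytes are in flight; rwnd (4000) is the tighter limit here.
print(usable_window(5000, 2000, cwnd=8000, rwnd=4000))   # 1000
print(approx_rate(cwnd=14600, rtt=0.25))                 # 58400.0
```

Whichever of cwnd and rwnd is smaller governs: congestion control limits the sender through cwnd, flow control through rwnd.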
TCP congestion control: AIMD
▪ approach: senders can increase sending rate until packet loss (congestion) occurs, then decrease sending rate on loss event
Additive Increase: increase sending rate by 1 maximum segment size every RTT until loss detected
Multiplicative Decrease: cut sending rate in half at each loss event
[Figure: TCP sender sending rate vs. time, showing the AIMD sawtooth behavior: probing for bandwidth]
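The sawtooth can be reproduced with a toy simulation. This is a sketch under simplifying assumptions: cwnd is counted in MSS, one step is one RTT, and a loss is assumed whenever cwnd exceeds a fixed capacity (a hypothetical stand-in for a congested path). The name aimd_trace is illustrative only.

```python
def aimd_trace(rounds, capacity, cwnd=1):
    """Return cwnd (in MSS) at the start of each RTT under AIMD."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:          # assumed loss signal: window exceeds capacity
            cwnd = max(1, cwnd // 2) # multiplicative decrease: cut in half
        else:
            cwnd += 1                # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(rounds=12, capacity=8, cwnd=5))
# [5, 6, 7, 8, 9, 4, 5, 6, 7, 8, 9, 4]
```

The window climbs linearly, overshoots the assumed capacity, is halved, and climbs again, which is the probing behavior described above.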
TCP AIMD: more
Multiplicative decrease detail: sending rate is
▪ Cut to 1 MSS (maximum segment size) when loss is detected by timeout or by triple duplicate ACK (TCP Tahoe)
▪ Cut in half when loss is detected by triple duplicate ACK, and cut to 1 MSS when loss is detected by timeout (TCP Reno)
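The difference between the two variants comes down to one branch. A minimal sketch, with cwnd in units of MSS; the function name cwnd_after_loss and the string labels for events and variants are hypothetical.

```python
MSS = 1  # cwnd is counted in segments in this sketch


def cwnd_after_loss(cwnd, event, variant):
    """New cwnd after a loss event, for 'tahoe' or 'reno'."""
    if event not in ("timeout", "triple_dup_ack"):
        raise ValueError(f"unknown loss event: {event}")
    if variant == "tahoe":
        return MSS                       # Tahoe: always back to 1 MSS
    if variant == "reno":
        if event == "timeout":
            return MSS                   # Reno: timeout also resets to 1 MSS
        return max(MSS, cwnd // 2)       # Reno: triple duplicate ACK halves cwnd
    raise ValueError(f"unknown variant: {variant}")


print(cwnd_after_loss(16, "triple_dup_ack", "tahoe"))  # 1
print(cwnd_after_loss(16, "triple_dup_ack", "reno"))   # 8
print(cwnd_after_loss(16, "timeout", "reno"))          # 1
```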
TCP slow start
▪ when connection begins, increase rate exponentially until first loss event:
• initially cwnd = 1 MSS
• double cwnd every RTT
• done by incrementing cwnd for every ACK received
▪ summary: initial rate is slow, but ramps up exponentially fast
[Figure: Host A sends one segment to Host B in the first RTT, two segments in the second, four segments in the third; time runs downward]
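Why a per-ACK increment doubles cwnd each RTT: every segment in the window produces one ACK, so a window of n segments earns n increments. A small sketch, assuming no loss and one ACK per segment, with cwnd in MSS; slow_start_rounds is a hypothetical name.

```python
def slow_start_rounds(rounds, cwnd=1):
    """Return the window size (in MSS) sent in each RTT during slow start."""
    sizes = []
    for _ in range(rounds):
        sizes.append(cwnd)
        acks = cwnd              # assumption: each segment is ACKed individually
        for _ in range(acks):
            cwnd += 1            # +1 MSS for every ACK received
    return sizes


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```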
TCP: from slow start to congestion avoidance
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before the previous timeout.
Implementation:
▪ variable ssthresh
▪ on loss event, ssthresh is set to
1/2 of cwnd just before loss event
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
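The switch from exponential to linear growth can be traced round by round. This is a simplified sketch: cwnd is in MSS, one step is one RTT, the rounds in which a timeout occurs are supplied by the caller (timeout_rounds is a hypothetical parameter), and the doubling is capped at ssthresh, which is a modeling assumption rather than something the slide specifies.

```python
def cwnd_trace(rounds, ssthresh, timeout_rounds=()):
    """Return cwnd (in MSS) at the start of each transmission round."""
    cwnd = 1
    trace = []
    for r in range(1, rounds + 1):
        trace.append(cwnd)
        if r in timeout_rounds:
            ssthresh = max(cwnd // 2, 1)      # half of cwnd just before the loss
            cwnd = 1                          # restart in slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: exponential growth
        else:
            cwnd += 1                         # congestion avoidance: linear growth
    return trace


print(cwnd_trace(15, ssthresh=8, timeout_rounds={8}))
# [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6, 7, 8, 9]
```

After the timeout at cwnd = 12, ssthresh becomes 6, so the second slow-start phase stops doubling at 6 and continues linearly from there.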
Summary: TCP congestion control
Initialization (Λ): cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0; enter slow start
Slow start
▪ new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
▪ duplicate ACK: dupACKcount++
▪ timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment
▪ cwnd > ssthresh (Λ): go to congestion avoidance
▪ dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; go to fast recovery
Congestion avoidance
▪ new ACK: cwnd = cwnd + MSS · (MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
▪ duplicate ACK: dupACKcount++
▪ timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
▪ dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; go to fast recovery
Fast recovery
▪ duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
▪ new ACK: cwnd = ssthresh; dupACKcount = 0; go to congestion avoidance
▪ timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
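The state machine summarized above can be written as a small event-driven class. This is a sketch that tracks only cwnd, ssthresh, dupACKcount and the state; retransmission and segment sending are left out. The class name RenoSender, the method names, and the MSS value of 1460 bytes are assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value for illustration


@dataclass
class RenoSender:
    cwnd: float = MSS
    ssthresh: float = 64 * 1024
    dup_acks: int = 0
    state: str = "slow_start"

    def on_new_ack(self):
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh                # deflate the window
            self.state = "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += MSS                         # exponential growth per RTT
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        else:
            self.cwnd += MSS * (MSS / self.cwnd)     # about +1 MSS per RTT
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += MSS                         # inflate per duplicate ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:                       # triple duplicate ACK
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast_recovery"             # (retransmit happens here)

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = MSS
        self.dup_acks = 0
        self.state = "slow_start"                    # (retransmit happens here)


s = RenoSender()
s.on_new_ack()                    # slow start: cwnd goes from 1 MSS to 2 MSS
for _ in range(3):
    s.on_dup_ack()                # third duplicate ACK triggers fast recovery
print(s.state, s.cwnd, s.ssthresh)
```

Keeping the three event handlers separate mirrors the diagram: each handler checks the current state and applies that state's transition.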
Explicit congestion notification (ECN)
TCP deployments often implement network-assisted congestion control:
▪ two bits in IP header (ToS field) marked by network router to indicate congestion
• policy to determine marking chosen by network operator
▪ congestion indication carried to destination
▪ destination sets ECE bit on ACK segment to notify sender of congestion
▪ involves both IP (IP header ECN bit marking) and TCP (TCP header C,E bit marking)
[Figure: source and destination protocol stacks (application, TCP, network, link, physical). The IP datagram leaves the source with ECN=10 and arrives marked ECN=11 after passing a congested router; the destination returns a TCP ACK segment with ECE=1.]
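The marking path can be sketched as three small functions, one per participant. The bit values follow the figure (10 for an ECN-capable datagram, 11 for congestion experienced); the function names are hypothetical, and the sender's reaction of halving cwnd is an assumption modeled on its response to loss, not something stated on the slide.

```python
NOT_ECT = 0b00   # sender does not support ECN
ECT_0 = 0b10     # ECN-capable transport, as sent by the source
CE = 0b11        # congestion experienced, set by a router


def router_forward(ecn_bits, congested):
    """Router marks ECN-capable datagrams instead of dropping them."""
    if congested and ecn_bits != NOT_ECT:
        return CE
    return ecn_bits


def receiver_ece(ecn_bits):
    """Destination sets ECE on the ACK if the datagram arrived marked."""
    return ecn_bits == CE


def sender_on_ack(cwnd, ece):
    """Assumed reaction: treat ECE like a loss signal and halve cwnd (in MSS)."""
    return max(1, cwnd // 2) if ece else cwnd


bits = router_forward(ECT_0, congested=True)
ece = receiver_ece(bits)
print(bin(bits), ece, sender_on_ack(10, ece))  # 0b11 True 5
```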
TCP connection management
before exchanging data, sender/receiver “handshake”:
▪ agree to establish connection (each knowing the other willing to establish connection)
▪ agree on connection parameters (e.g., starting seq #s)
Client and server applications, above the network, each hold:
▪ connection state: ESTAB
▪ connection variables:
• seq # client-to-server, server-to-client
• rcvBuffer size at server, client
Client: Socket clientSocket = new Socket("hostname", "port number");
Server: Socket connectionSocket = welcomeSocket.accept();
TCP 3-way handshake
Client code:
clientSocket = socket(AF_INET, SOCK_STREAM)
clientSocket.connect((serverName,serverPort))
Server code:
serverSocket = socket(AF_INET,SOCK_STREAM)
serverSocket.bind(('',serverPort))
serverSocket.listen(1)
connectionSocket, addr = serverSocket.accept()
Client state and server state both start at LISTEN.
▪ client (LISTEN to SYNSENT): choose init seq num, x; send TCP SYN msg: SYNbit=1, Seq=x
▪ server (LISTEN to SYN RCVD): choose init seq num, y; send TCP SYNACK msg, acking SYN: SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1
▪ client (SYNSENT to ESTAB): received SYNACK(x) indicates server is live; send ACK for SYNACK: ACKbit=1, ACKnum=y+1; this segment may contain client-to-server data
▪ server (SYN RCVD to ESTAB): received ACK(y) indicates client is live
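The three segments and their sequence and acknowledgment numbers can be modeled without any networking. This is a sketch: the Segment class and three_way_handshake function are hypothetical, sequence numbers wrap at 2^32, and the Seq value on the final ACK (x+1) is an assumption not shown on the slide.

```python
import random
from dataclasses import dataclass

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide


@dataclass
class Segment:
    syn: bool = False
    ack: bool = False
    seq: int = 0
    acknum: int = 0


def three_way_handshake(x=None, y=None):
    """Return the SYN, SYNACK and ACK segments for initial seq nums x and y."""
    x = random.randrange(SEQ_SPACE) if x is None else x   # client's choice
    y = random.randrange(SEQ_SPACE) if y is None else y   # server's choice

    syn = Segment(syn=True, seq=x)                        # client -> server
    synack = Segment(syn=True, ack=True, seq=y,
                     acknum=(syn.seq + 1) % SEQ_SPACE)    # server -> client
    final_ack = Segment(ack=True,
                        seq=(x + 1) % SEQ_SPACE,          # assumed, see lead-in
                        acknum=(synack.seq + 1) % SEQ_SPACE)
    return [syn, synack, final_ack]


for seg in three_way_handshake(x=100, y=300):
    print(seg)
```

With x=100 and y=300, the SYNACK carries ACKnum=101 and the final ACK carries ACKnum=301, matching x+1 and y+1 in the exchange above.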
Closing a TCP connection
▪ client, server each close their side of connection
• send TCP segment with FIN bit = 1
▪ respond to received FIN with ACK
• on receiving FIN, ACK can be combined with own FIN
▪ simultaneous FIN exchanges can be handled
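Each side closing independently can be observed with real sockets on the loopback interface. This is a sketch that assumes loopback TCP is available; half_close_demo is a hypothetical name. Calling shutdown(SHUT_WR) closes only the client-to-server direction, which the operating system is expected to signal with a FIN; the server sees this as recv() returning an empty byte string and can still send before closing its own side.

```python
import socket
import threading


def half_close_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def serve():
        conn, _ = server.accept()
        with conn:
            while True:
                chunk = conn.recv(1024)
                if not chunk:            # b"" means the client closed its side
                    break
                received.append(chunk)
            conn.sendall(b"bye")         # server-to-client direction still open
        # leaving the with-block closes the server's side

    worker = threading.Thread(target=serve)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)      # close only the client's sending side
    reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:                    # server has closed its side too
            break
        reply += chunk
    client.close()
    worker.join()
    server.close()
    return b"".join(received), reply


print(half_close_demo())  # (b'hello', b'bye')
```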
Chapter 3: summary
▪ principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
▪ instantiation, implementation in the Internet
• UDP
• TCP
Up next:
▪ leaving the network “edge” (application, transport layers)
▪ into the network “core”
▪ two network-layer chapters:
• data plane
• control plane