TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
point-to-point:
    one sender, one receiver
reliable, in-order byte stream:
    no "message boundaries"
pipelined:
    TCP congestion and flow control set window size
send & receive buffers
full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size
connection-oriented:
    handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled:
    sender will not overwhelm receiver
[Figure: on the sending side the application writes data through the socket door into the TCP send buffer; segments carry it to the receiving side, where the application reads data from the TCP receive buffer through its socket door]
3: Transport Layer 3b-1
TCP segment structure
Header layout (each row is 32 bits wide):
    source port #, dest port #
    sequence number
    acknowledgement number
    head len, not used, flag bits U A P R S F, rcvr window size
    checksum, ptr urgent data
    Options (variable length)
    application data (variable length)
Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments!)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now
    RST, SYN, FIN: connection estab (setup, teardown commands)
    rcvr window size: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
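To make the segment layout concrete, the following is a minimal sketch of reading the fixed 20-byte part of a TCP header in Java. The class name `TcpHeaderSketch` and its field names are hypothetical, chosen only for this illustration; options and application data are not parsed, and the checksum is read but not verified.

```java
import java.nio.ByteBuffer;

// Illustrative parser for the fixed 20-byte part of a TCP header.
// Class and field names are hypothetical, not part of any library.
class TcpHeaderSketch {
    int sourcePort, destPort;
    long seq, ack;              // 32-bit unsigned values held in a long
    int headerLenBytes;         // "head len" counts 32-bit words
    boolean urg, ackFlag, psh, rst, syn, fin;
    int rcvWindow, checksum, urgentPtr;

    static TcpHeaderSketch parse(byte[] segment) {
        if (segment.length < 20) {
            throw new IllegalArgumentException("a TCP header needs at least 20 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(segment);  // big-endian by default (network byte order)
        TcpHeaderSketch h = new TcpHeaderSketch();
        h.sourcePort = buf.getShort() & 0xFFFF;     // mask to treat the field as unsigned
        h.destPort = buf.getShort() & 0xFFFF;
        h.seq = buf.getInt() & 0xFFFFFFFFL;
        h.ack = buf.getInt() & 0xFFFFFFFFL;
        int lenAndFlags = buf.getShort() & 0xFFFF;  // head len (4 bits), not used (6), flags (6)
        h.headerLenBytes = (lenAndFlags >>> 12) * 4;
        h.urg = (lenAndFlags & 0x20) != 0;
        h.ackFlag = (lenAndFlags & 0x10) != 0;
        h.psh = (lenAndFlags & 0x08) != 0;
        h.rst = (lenAndFlags & 0x04) != 0;
        h.syn = (lenAndFlags & 0x02) != 0;
        h.fin = (lenAndFlags & 0x01) != 0;
        h.rcvWindow = buf.getShort() & 0xFFFF;
        h.checksum = buf.getShort() & 0xFFFF;
        h.urgentPtr = buf.getShort() & 0xFFFF;
        return h;
    }

    public static void main(String[] args) {
        // Build a sample header: ports 50000 -> 80, Seq=42, ACK=79, ACK and PSH flags set.
        ByteBuffer b = ByteBuffer.allocate(20);
        b.putShort((short) 50000).putShort((short) 80).putInt(42).putInt(79)
         .putShort((short) 0x5018)   // head len = 5 words, flags = ACK | PSH
         .putShort((short) 4096).putShort((short) 0).putShort((short) 0);
        TcpHeaderSketch h = parse(b.array());
        System.out.println(h.sourcePort + " -> " + h.destPort
            + " seq=" + h.seq + " ack=" + h.ack + " win=" + h.rcvWindow);
    }
}
```

Masking with `0xFFFF` and `0xFFFFFFFFL` is needed because Java has no unsigned 16-bit or 32-bit integer types; without it, a port such as 50000 would read back as a negative number.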
TCP seq. #s and ACKs
Seq. #s:
    byte stream "number" of first byte in segment's data
ACKs:
    seq # of next byte expected from other side
    cumulative ACK
Q: how does the receiver handle out-of-order segments?
    A: TCP spec doesn't say; up to implementor
simple telnet scenario (time runs downward):
    Host A -> Host B: Seq=42, ACK=79, data='C'    (user types 'C')
    Host B -> Host A: Seq=79, ACK=43, data='C'    (host ACKs receipt of 'C', echoes back 'C')
    Host A -> Host B: Seq=43, ACK=80              (host ACKs receipt of echoed 'C')
TCP: reliable data transfer
Simplified TCP sender
sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
    switch (event) {
        event: data received from application above
            create TCP segment with sequence number nextseqnum
            start timer for segment nextseqnum
            pass segment to IP
            nextseqnum = nextseqnum + length(data)

        event: timer timeout for segment with sequence number y
            retransmit segment with sequence number y
            compute new timeout interval for segment y
            restart timer for sequence number y

        event: ACK received, with ACK field value of y
            if (y > sendbase) { /* cumulative ACK of all data up to y */
                cancel all timers for segments with sequence numbers < y
                sendbase = y
            }
            else { /* a duplicate ACK for already ACKed segment */
                increment number of duplicate ACKs received for y
                if (number of duplicate ACKs received for y == 3) {
                    /* TCP fast retransmit */
                    resend segment with sequence number y
                    restart timer for segment y
                }
            }
    }
} /* end of loop forever */
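The simplified sender can be sketched as a small event-driven Java class. This is an illustration under stated assumptions, not a real TCP implementation: the class name `SimplifiedTcpSender` and its members are hypothetical, timers are represented only by the map of unACKed segments, "passing a segment to IP" is replaced by appending to a log, and sequence-number wraparound is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of the simplified TCP sender events.
class SimplifiedTcpSender {
    long sendBase;
    long nextSeqNum;
    int dupAcks = 0;
    // seq -> length of each unACKed segment; stands in for the per-segment timers
    final TreeMap<Long, Integer> unacked = new TreeMap<>();
    // log of what would have been passed to IP
    final List<String> wire = new ArrayList<>();

    SimplifiedTcpSender(long initialSequenceNumber) {
        sendBase = initialSequenceNumber;
        nextSeqNum = initialSequenceNumber;
    }

    // event: data received from application above
    void onAppData(int length) {
        unacked.put(nextSeqNum, length);            // "start timer" for this segment
        wire.add("send seq=" + nextSeqNum + " len=" + length);
        nextSeqNum += length;
    }

    // event: timer timeout for segment with sequence number y
    void onTimeout(long y) {
        if (unacked.containsKey(y)) {
            wire.add("retransmit seq=" + y);        // timer restart is implicit in this model
        }
    }

    // event: ACK received, with ACK field value of y
    void onAck(long y) {
        if (y > sendBase) {
            unacked.headMap(y).clear();             // cancel timers for segments with seq < y
            sendBase = y;
            dupAcks = 0;
        } else {
            dupAcks++;
            if (dupAcks == 3 && unacked.containsKey(y)) {
                wire.add("fast retransmit seq=" + y);
            }
        }
    }

    public static void main(String[] args) {
        SimplifiedTcpSender s = new SimplifiedTcpSender(92);
        s.onAppData(8);     // Seq=92, 8 bytes
        s.onAppData(20);    // Seq=100, 20 bytes
        s.onAck(100);       // cumulative ACK for the first segment
        s.onAck(100);       // three duplicate ACKs follow
        s.onAck(100);
        s.onAck(100);
        for (String line : s.wire) {
            System.out.println(line);
        }
    }
}
```

Resetting the duplicate-ACK counter whenever `sendBase` advances is a simplification of "number of duplicate ACKs received for y": only duplicates of the most recent cumulative ACK are counted.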
TCP ACK generation [RFC 1122, RFC 2581]
Event: in-order segment arrival, no gaps, everything else already ACKed
    TCP Receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
Event: in-order segment arrival, no gaps, one delayed ACK pending
    TCP Receiver action: immediately send single cumulative ACK
Event: out-of-order segment arrival, higher-than-expected seq. #, gap detected
    TCP Receiver action: send duplicate ACK, indicating seq. # of next expected byte
Event: arrival of segment that partially or completely fills gap
    TCP Receiver action: immediate ACK if segment starts at lower end of gap
TCP: retransmission scenarios
lost ACK scenario (time runs downward):
    Host A -> Host B: Seq=92, 8 bytes data
    Host B -> Host A: ACK=100 (lost)
    Host A: timeout
    Host A -> Host B: Seq=92, 8 bytes data (retransmission)
    Host B -> Host A: ACK=100
premature timeout, cumulative ACKs (time runs downward; timers shown for Seq=92 and Seq=100):
    Host A -> Host B: Seq=92, 8 bytes data
    Host A -> Host B: Seq=100, 20 bytes data
    Host B -> Host A: ACK=100
    Host B -> Host A: ACK=120
    Host A: Seq=92 timeout (before ACK=100 arrives)
    Host A -> Host B: Seq=92, 8 bytes data (retransmission)
    Host B -> Host A: ACK=120 (cumulative)
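The receiver side of these scenarios can be sketched as follows. The code returns the cumulative ACK number for each arriving segment and reproduces the ACK values of the premature timeout scenario. It rests on assumptions that are not in the slides: out-of-order segments are buffered (the TCP spec leaves this to the implementor), the 500ms delayed-ACK timer is left out so every segment is ACKed immediately, and the class name `CumulativeAckReceiver` is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical receiver model: tracks the next expected byte and
// answers every arriving segment with a cumulative ACK number.
class CumulativeAckReceiver {
    long expectedSeq;
    // out-of-order segments waiting for a gap to be filled: seq -> length
    final TreeMap<Long, Integer> buffered = new TreeMap<>();

    CumulativeAckReceiver(long initialExpectedSeq) {
        expectedSeq = initialExpectedSeq;
    }

    /** Processes one arriving segment and returns the ACK number to send back. */
    long onSegment(long seq, int length) {
        if (seq > expectedSeq) {
            // Gap detected: keep the segment; the ACK below is a duplicate ACK.
            buffered.put(seq, length);
        } else if (seq + length > expectedSeq) {
            // In-order data (or data that fills the lower end of a gap).
            expectedSeq = seq + length;
            // Drain any buffered segments that are now contiguous.
            while (!buffered.isEmpty() && buffered.firstKey() <= expectedSeq) {
                Map.Entry<Long, Integer> e = buffered.pollFirstEntry();
                expectedSeq = Math.max(expectedSeq, e.getKey() + e.getValue());
            }
        }
        // Otherwise the segment is an old duplicate and adds nothing new.
        return expectedSeq;   // seq # of next byte expected
    }

    public static void main(String[] args) {
        CumulativeAckReceiver b = new CumulativeAckReceiver(92);
        System.out.println("ACK=" + b.onSegment(92, 8));    // first segment
        System.out.println("ACK=" + b.onSegment(100, 20));  // second segment
        System.out.println("ACK=" + b.onSegment(92, 8));    // retransmitted first segment
    }
}
```

Because the ACK is cumulative, the retransmitted Seq=92 segment is answered with the highest in-order byte already received rather than with ACK=100.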
TCP Flow Control
flow control:
    sender won't overrun receiver's buffers by transmitting too much, too fast
RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
    RcvWindow field in TCP segment
sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
[Figure: receiver buffering]
TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
    longer than RTT
        note: RTT will vary
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss
Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions, cumulatively ACKed segments
    SampleRTT will vary, want estimated RTT "smoother"
        use several recent measurements, not just current SampleRTT
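One way to combine several recent SampleRTT measurements is the exponentially weighted moving average this deck uses for EstimatedRTT, with Timeout = EstimatedRTT + 4*Deviation. The following is a minimal sketch of that calculation. The class name `RttEstimator` is hypothetical, and two details are assumptions of this sketch rather than statements from the slides: the estimate is seeded with the first sample, and Deviation is updated after EstimatedRTT within each step.

```java
// Hypothetical RTT estimator using the EWMA and timeout formulas from the slides.
class RttEstimator {
    static final double X = 0.1;   // typical value of x

    double estimatedRtt;           // milliseconds
    double deviation = 0.0;        // milliseconds

    // Assumption: the first measurement seeds EstimatedRTT.
    RttEstimator(double firstSampleMs) {
        estimatedRtt = firstSampleMs;
    }

    void addSample(double sampleRttMs, boolean wasRetransmitted) {
        if (wasRetransmitted) {
            return;                // ignore samples taken from retransmitted segments
        }
        // EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
        estimatedRtt = (1 - X) * estimatedRtt + X * sampleRttMs;
        // Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|
        deviation = (1 - X) * deviation + X * Math.abs(sampleRttMs - estimatedRtt);
    }

    // Timeout = EstimatedRTT + 4*Deviation
    double timeoutMs() {
        return estimatedRtt + 4 * deviation;
    }

    public static void main(String[] args) {
        RttEstimator est = new RttEstimator(100.0);
        double[] samples = {200.0, 90.0, 110.0, 105.0};
        for (double s : samples) {
            est.addSample(s, false);
            System.out.printf("sample=%.0f estimated=%.2f timeout=%.2f%n",
                s, est.estimatedRtt, est.timeoutMs());
        }
    }
}
```

A single outlier such as the 200 ms sample moves EstimatedRTT by only a tenth of the difference, but it raises Deviation, so the timeout grows by more than the estimate does.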
TCP Round Trip Time and Timeout
EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
    exponential weighted moving average
    influence of given sample decreases exponentially fast
    typical value of x: 0.1
Setting the timeout
    EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
Timeout = EstimatedRTT + 4*Deviation
Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|
TCP Connection Management
Recall: TCP sender, receiver establish "connection" before exchanging data segments
    initialize TCP variables:
        seq. #s
        buffers, flow control info (e.g. RcvWindow)
    client: connection initiator
        Socket clientSocket = new Socket("hostname", portNumber);
    server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
    specifies initial seq #
Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    specifies server -> receiver initial seq. #
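The two socket calls on the connection management slide can be tried on a single machine with the sketch below. It opens a welcome socket on the loopback interface, connects a client to it, and accepts the connection. The handshake itself is carried out by the operating system's TCP implementation, so the SYN and SYNACK segments are not visible at this level. The class and method names `HandshakeSketch` and `connectOnLoopback` are hypothetical; using the loopback address and an OS-chosen port is an assumption made so the example needs no external server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo: client and server ends of one TCP connection in a single process.
class HandshakeSketch {

    /** Opens a loopback connection and reports whether both ends see it as connected. */
    static boolean connectOnLoopback() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; a backlog of 1 is enough here.
        try (ServerSocket welcomeSocket = new ServerSocket(0, 1, loopback);
             // client: connection initiator. The OS completes the handshake with the
             // listening socket, so this returns before accept() is called.
             Socket clientSocket = new Socket(loopback, welcomeSocket.getLocalPort());
             // server: contacted by client
             Socket connectionSocket = welcomeSocket.accept()) {
            return clientSocket.isConnected()
                && connectionSocket.isConnected()
                && connectionSocket.getPort() == clientSocket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + connectOnLoopback());
    }
}
```

The try-with-resources block closes all three sockets on exit, which in turn triggers the connection teardown.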
TCP Connection Management (cont.)
Closing a connection:
    client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
Step 3: client receives FIN, replies with ACK.
    Enters "timed wait": will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client closes and sends FIN; server replies with ACK, enters closing and sends FIN; client replies with ACK and enters timed wait, then closed; server closed on receiving the ACK]
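The four closing steps can be written as a small state machine. This sketch is illustrative only: the state names follow the conventional TCP state names from RFC 793 rather than the labels in the slide figure ("closing", "timed wait"), the class and enum names are hypothetical, and simultaneous FINs are deliberately not handled.

```java
// Hypothetical state machine for closing a TCP connection.
class TcpCloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSE_WAIT, LAST_ACK, CLOSED }
    enum Event { APP_CLOSE, RECV_FIN, RECV_ACK, TIMER_EXPIRED }

    /** Returns the state reached from s when event e occurs. */
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1;   // Step 1: send FIN
                if (e == Event.RECV_FIN) return State.CLOSE_WAIT;    // Step 2: reply with ACK
                break;
            case FIN_WAIT_1:
                if (e == Event.RECV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:
                if (e == Event.RECV_FIN) return State.TIME_WAIT;     // Step 3: reply with ACK
                break;
            case TIME_WAIT:
                if (e == Event.RECV_FIN) return State.TIME_WAIT;     // re-ACK a repeated FIN
                if (e == Event.TIMER_EXPIRED) return State.CLOSED;
                break;
            case CLOSE_WAIT:
                if (e == Event.APP_CLOSE) return State.LAST_ACK;     // Step 2: send FIN
                break;
            case LAST_ACK:
                if (e == Event.RECV_ACK) return State.CLOSED;        // Step 4: connection closed
                break;
            default:
                break;
        }
        throw new IllegalStateException(e + " is not handled in state " + s);
    }

    /** Applies a sequence of events starting from state s. */
    static State run(State s, Event... events) {
        for (Event e : events) {
            s = next(s, e);
        }
        return s;
    }

    public static void main(String[] args) {
        State client = run(State.ESTABLISHED,
            Event.APP_CLOSE, Event.RECV_ACK, Event.RECV_FIN, Event.TIMER_EXPIRED);
        State server = run(State.ESTABLISHED,
            Event.RECV_FIN, Event.APP_CLOSE, Event.RECV_ACK);
        System.out.println("client=" + client + " server=" + server);
    }
}
```

The client stays in the timed wait state until a timer expires so that it can still answer a retransmitted FIN if its final ACK was lost.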
TCP Connection Management (cont)
[Figures: TCP client lifecycle; TCP server lifecycle]