
CN Unit-3

Uploaded by

iamshashank008

Chapter 3

Transport Layer

A note on the use of these ppt slides:

We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you see the animations; and can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
• If you use these slides (e.g., in a class) that you mention their source (after all, we’d like people to use our book!)
• If you post any slides on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.

Thanks and enjoy! JFK/KWR

All material copyright 1996-2013
J.F Kurose and K.W. Ross, All Rights Reserved

Computer Networking: A Top Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012

Transport Layer 3-1


Chapter 3: Transport Layer
our goals:
• understand principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• learn about Internet transport layer protocols:
  • UDP: connectionless transport
  • TCP: connection-oriented reliable transport
  • TCP congestion control

Chapter 3 outline
1. transport-layer services
2. multiplexing and demultiplexing
3. connectionless transport: UDP
4. principles of reliable data transfer
5. connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
6. principles of congestion control
7. TCP congestion control

Transport services and protocols
• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
  • send side: breaks app messages into segments, passes to network layer
  • rcv side: reassembles segments into messages, passes to app layer
• more than one transport protocol available to apps
  • Internet: TCP and UDP
[Figure: logical end-end transport between two end systems, each with an application / transport / network / data link / physical stack]

Transport vs. network layer
• network layer: logical communication between hosts
• transport layer: logical communication between processes
  • relies on, enhances, network layer services
household analogy:
12 kids in Ann’s house sending letters to 12 kids in Bill’s house:
• hosts = houses
• processes = kids
• app messages = letters in envelopes
• transport protocol = Ann and Bill who demux to in-house siblings
• network-layer protocol = postal service

Internet transport-layer protocols
• reliable, in-order delivery (TCP)
  • congestion control
  • flow control
  • connection setup
• unreliable, unordered delivery: UDP
  • no-frills extension of “best-effort” IP
• services not available:
  • delay guarantees
[Figure: logical end-end transport across a path of routers; only the two end systems have application and transport layers]

Multiplexing/demultiplexing
multiplexing at sender: handle data from multiple sockets, add transport header (later used for demultiplexing)
demultiplexing at receiver: use header info to deliver received segments to correct socket
[Figure: three hosts, each with an application / transport / network / link / physical stack; processes P1 to P4 attach to the transport layer through sockets]


How demultiplexing works
• host receives IP datagrams
  • each datagram has source IP address, destination IP address
  • each datagram carries one transport-layer segment
  • each segment has source, destination port number
• host uses IP addresses & port numbers to direct segment to appropriate socket
TCP/UDP segment format (32 bits wide):
  source port # | dest port #
  other header fields
  application data (payload)

Connectionless demultiplexing
• recall: created socket has host-local port #:
  DatagramSocket mySocket1 = new DatagramSocket(12534);
• recall: when creating datagram to send into UDP socket, must specify
  • destination IP address
  • destination port #
• when host receives UDP segment:
  • checks destination port # in segment
  • directs UDP segment to socket with that port #
• IP datagrams with same dest. port #, but different source IP addresses and/or source port numbers will be directed to same socket at dest

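The lookup described on this slide can be sketched in a few lines of Python. This is only an illustrative model, not a real socket API: the `sockets` table, the segment dictionaries and the name `demux_udp` are assumptions made for the example.

```python
# Minimal model of connectionless (UDP-style) demultiplexing.
# The socket table and segment fields are hypothetical, not a real network API.

def demux_udp(sockets, segment):
    """Return the socket bound to the segment's destination port, or None."""
    return sockets.get(segment["dest_port"])

# One UDP socket bound to port 6428; the lookup key is the destination port only.
sockets = {6428: "serverSocket"}

# Two segments from different sources that carry the same destination port.
seg_a = {"src_ip": "A", "src_port": 9157, "dest_port": 6428}
seg_c = {"src_ip": "C", "src_port": 5775, "dest_port": 6428}

print(demux_udp(sockets, seg_a))
print(demux_udp(sockets, seg_c))
```

Both segments reach the same socket because the source address and source port play no part in the lookup.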
Connectionless demux: example
DatagramSocket serverSocket = new DatagramSocket(6428);
DatagramSocket mySocket2 = new DatagramSocket(9157);
DatagramSocket mySocket1 = new DatagramSocket(5775);
[Figure: three hosts with processes P3, P1 and P4 and the segments exchanged between them]
  source port: 9157, dest port: 6428
  source port: 6428, dest port: 9157
  source port: ?, dest port: ?
  source port: ?, dest port: ?

Connection-oriented demux
• TCP socket identified by 4-tuple:
  • source IP address
  • source port number
  • dest IP address
  • dest port number
• demux: receiver uses all four values to direct segment to appropriate socket
• server host may support many simultaneous TCP sockets:
  • each socket identified by its own 4-tuple
• web servers have different sockets for each connecting client
  • non-persistent HTTP will have different socket for each request

Connection-oriented demux: example
[Figure: host with IP address A (process P3), server with IP address B (processes P4, P5, P6), host with IP address C (processes P2, P3)]
  source IP,port: B,80    dest IP,port: A,9157
  source IP,port: A,9157  dest IP,port: B,80
  source IP,port: C,5775  dest IP,port: B,80
  source IP,port: C,9157  dest IP,port: B,80
three segments, all destined to IP address: B, dest port: 80 are demultiplexed to different sockets
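As a rough sketch of why those three segments end up in different sockets, the server's socket table can be modelled as a dictionary keyed by the full 4-tuple. The socket labels and the function name `demux_tcp` are hypothetical; the addresses and ports are the ones in the example above.

```python
# Minimal model of connection-oriented (TCP-style) demultiplexing.
# Socket labels are hypothetical; which server process owns which socket
# is not stated in the text.

def demux_tcp(sockets, src_ip, src_port, dest_ip, dest_port):
    """Look up a socket by the full 4-tuple; None if no connection matches."""
    return sockets.get((src_ip, src_port, dest_ip, dest_port))

# Server B holds one socket per established connection, all on port 80.
sockets = {
    ("A", 9157, "B", 80): "socket-1",
    ("C", 5775, "B", 80): "socket-2",
    ("C", 9157, "B", 80): "socket-3",
}

# Same destination (B, 80) in every case, yet three different sockets.
targets = [
    demux_tcp(sockets, "A", 9157, "B", 80),
    demux_tcp(sockets, "C", 5775, "B", 80),
    demux_tcp(sockets, "C", 9157, "B", 80),
]
print(targets)
```

Note that hosts A and C both use source port 9157; the source IP address is what keeps their connections apart.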
Connection-oriented demux: example
threaded server
[Figure: same hosts and segments as the previous example, but the server with IP address B runs a single process P4 that serves all connections]
  source IP,port: B,80    dest IP,port: A,9157
  source IP,port: A,9157  dest IP,port: B,80
  source IP,port: C,5775  dest IP,port: B,80
  source IP,port: C,9157  dest IP,port: B,80


UDP: User Datagram Protocol [RFC 768]
• “no frills,” “bare bones” Internet transport protocol
• “best effort” service, UDP segments may be:
  • lost
  • delivered out-of-order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others
• UDP use:
  • streaming multimedia apps (loss tolerant, rate sensitive)
  • DNS
  • SNMP
• reliable transfer over UDP:
  • add reliability at application layer
  • application-specific error recovery!

UDP: segment header
UDP segment format (32 bits wide):
  source port # | dest port #
  length | checksum
  application data (payload)
length: in bytes, of UDP segment, including header
why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small header size
• no congestion control: UDP can blast away as fast as desired

UDP segment Header (Cont..)
• The port numbers allow the destination host to
pass the application data to the correct process
running on the destination end system
• The length field specifies the number of bytes in
the UDP segment (header plus data).
• The checksum is used by the receiving host to
check whether errors have been introduced into
the segment.
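A minimal sketch of the four 16-bit header fields, using Python's standard `struct` module. The helper names `build_udp_segment` and `parse_udp_segment` are made up for this example, and the checksum is simply left at 0 rather than computed.

```python
import struct

# Four 16-bit fields in network byte order: source port, dest port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_segment(src_port, dest_port, payload, checksum=0):
    """Prepend an 8-byte UDP header; length counts header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dest_port, length, checksum) + payload

def parse_udp_segment(segment):
    """Split a segment back into its header fields and payload."""
    src_port, dest_port, length, checksum = UDP_HEADER.unpack(segment[:UDP_HEADER.size])
    return {
        "src_port": src_port,
        "dest_port": dest_port,
        "length": length,
        "checksum": checksum,
        "payload": segment[UDP_HEADER.size:length],
    }

segment = build_udp_segment(9157, 6428, b"hello")
fields = parse_udp_segment(segment)
print(fields)
```

With a 5-byte payload the length field is 13: 8 header bytes plus 5 data bytes.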
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
sender:
• treat segment contents, including header fields, as sequence of 16-bit integers
• checksum: addition (one’s complement sum) of segment contents
• sender puts checksum value into UDP checksum field
receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless? More later

Internet checksum: example
example: add two 16-bit integers

                 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound     1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

sum              1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum         0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result
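The same arithmetic can be checked with a short Python sketch. The function names are illustrative; the two operands are the 16-bit integers from the example (0xE666 and 0xD555).

```python
def ones_complement_sum16(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound carry
    return total

def internet_checksum(words):
    """Checksum is the one's complement of the one's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF

a = 0b1110011001100110
b = 0b1101010101010101

wrapped_sum = ones_complement_sum16([a, b])
checksum = internet_checksum([a, b])

print(format(wrapped_sum, "016b"))  # 1011101110111100
print(format(checksum, "016b"))     # 0100010001000011

# Receiver-side check: summing the data and the checksum gives all ones.
print(ones_complement_sum16([a, b, checksum]) == 0xFFFF)  # True
```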


Internet transport protocols services
TCP service:
• reliable transport between sending and receiving process
• flow control: sender won’t overwhelm receiver
• congestion control: throttle sender when network overloaded
• does not provide: timing, minimum throughput guarantee, security
• connection-oriented: setup required between client and server processes
UDP service:
• unreliable data transfer between sending and receiving process
• does not provide: reliability, flow control, congestion control, timing, throughput guarantee, security, or connection setup
Q: why bother? Why is there a UDP?

Principles of reliable data transfer
• important in application, transport, link layers
  • top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started
send side:
• rdt_send(): called from above, (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started
we’ll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  • but control info will flow on both directions!
• use finite state machines (FSM) to specify sender, receiver
  • state: when in this “state” next state uniquely determined by next event
  • each transition (state 1 to state 2) is labelled with the event causing state transition, over the actions taken on state transition


rdt1.0: reliable transfer over a reliable channel
• underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
• separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel
sender:
  state “Wait for call from above”
    rdt_send(data)
      packet = make_pkt(data)
      udt_send(packet)
receiver:
  state “Wait for call from below”
    rdt_rcv(packet)
      extract(packet,data)
      deliver_data(data)

rdt2.0: channel with bit errors
• underlying channel may flip bits in packet
  • checksum to detect bit errors
• the question: how to recover from errors:
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
• new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK,NAK) rcvr->sender
How do humans recover from “errors” during conversation?




rdt2.0: FSM specification
sender:
  state “Wait for call from above”
    rdt_send(data)
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      -> “Wait for ACK or NAK”
  state “Wait for ACK or NAK”
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
      -> “Wait for ACK or NAK”
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      Λ (no action)
      -> “Wait for call from above”
receiver:
  state “Wait for call from below”
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt,data)
      deliver_data(data)
      udt_send(ACK)


rdt2.0: operation with no errors
[Figure: the rdt2.0 sender and receiver FSMs with the error-free path highlighted]
  sender: rdt_send(data): sndpkt = make_pkt(data, checksum), udt_send(sndpkt)
  receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt,data), deliver_data(data), udt_send(ACK)
  sender: rdt_rcv(rcvpkt) && isACK(rcvpkt): back to “Wait for call from above”


rdt2.0: error scenario
[Figure: the rdt2.0 sender and receiver FSMs with the error path highlighted]
  sender: rdt_send(data): sndpkt = make_pkt(data, checksum), udt_send(sndpkt)
  receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
  sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt)
  receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt,data), deliver_data(data), udt_send(ACK)


rdt2.0 has a fatal flaw!
what happens if ACK/NAK corrupted?
• sender doesn’t know what happened at receiver!
• can’t just retransmit: possible duplicate
handling duplicates:
• sender retransmits current pkt if ACK/NAK corrupted
• sender adds sequence number to each pkt
• receiver discards (doesn’t deliver up) duplicate pkt
stop and wait
sender sends one packet, then waits for receiver response


rdt2.1: sender, handles garbled ACK/NAKs
state “Wait for call 0 from above”
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> “Wait for ACK or NAK 0”
state “Wait for ACK or NAK 0”
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (no action)
    -> “Wait for call 1 from above”
state “Wait for call 1 from above”
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> “Wait for ACK or NAK 1”
state “Wait for ACK or NAK 1”
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (no action)
    -> “Wait for call 0 from above”


rdt2.1: receiver, handles garbled ACK/NAKs
state “Wait for 0 from below”
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> “Wait for 1 from below”
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state “Wait for 1 from below”
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> “Wait for 0 from below”
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)


rdt2.1: discussion
sender:
• seq # added to pkt
• two seq. #’s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  • state must “remember” whether “expected” pkt should have seq # of 0 or 1
receiver:
• must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver can not know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol
• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  • receiver must explicitly include seq # of pkt being ACKed
• duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments
sender FSM fragment:
  state “Wait for call 0 from above”
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> “Wait for ACK 0”
  state “Wait for ACK 0”
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
      Λ (no action)
receiver FSM fragment:
  state “Wait for 0 from below”
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)
  transition into “Wait for 0 from below”
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt,data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, ...

rdt3.0: channels with errors and loss
new assumption: underlying channel can also lose packets (data, ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help … but not enough
approach: sender waits “reasonable” amount of time for ACK
• retransmits if no ACK received in this time
• if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but seq. #’s already handles this
  • receiver must specify seq # of pkt being ACKed
• requires countdown timer

rdt3.0 sender
state “Wait for call 0 from above”
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> “Wait for ACK0”
  rdt_rcv(rcvpkt)
    Λ (no action)
state “Wait for ACK0”
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    -> “Wait for call 1 from above”
state “Wait for call 1 from above”
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> “Wait for ACK1”
  rdt_rcv(rcvpkt)
    Λ (no action)
state “Wait for ACK1”
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    -> “Wait for call 0 from above”
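To see the alternating-bit behaviour end to end, here is a small, deliberately simplified simulation. It is a sketch under stated assumptions, not the FSM itself: there is no real timer or bit corruption, a lost packet or lost ACK is simply treated as an immediate timeout, and the function name and the `lose_data` / `lose_acks` parameters are invented for the example.

```python
def run_stop_and_wait(messages, lose_data=(), lose_acks=()):
    """Simulate alternating-bit stop-and-wait over a lossy channel.

    lose_data / lose_acks hold 0-based indices of the data / ACK
    transmissions that the (hypothetical) channel drops.
    """
    delivered, log = [], []
    expected = 0   # receiver: seq # it is waiting for
    seq = 0        # sender: seq # of the current packet
    data_tx = 0    # number of data transmissions so far
    ack_tx = 0     # number of ACK transmissions so far
    for msg in messages:
        while True:
            log.append(f"send pkt{seq}")
            data_lost = data_tx in lose_data
            data_tx += 1
            if data_lost:
                log.append("timeout")          # no ACK will come: retransmit
                continue
            if seq == expected:
                delivered.append(msg)          # new packet: deliver up
                expected ^= 1
            else:
                log.append(f"rcv pkt{seq} (detect duplicate)")
            log.append(f"send ack{seq}")       # ACK carries the seq # received
            ack_lost = ack_tx in lose_acks
            ack_tx += 1
            if ack_lost:
                log.append("timeout")          # ACK lost: sender retransmits
                continue
            seq ^= 1                           # ACK arrived: move to next seq #
            break
    return delivered, log

delivered, log = run_stop_and_wait(["a", "b", "c"], lose_data={1}, lose_acks={1})
print(delivered)
print(log)
```

In this run pkt1 is sent three times (once lost, once with its ACK lost, once as a detected duplicate), yet each message is delivered exactly once.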


rdt3.0 in action
[Figure: sender/receiver timelines]
(a) no loss:
  sender: send pkt0; receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1; receiver: rcv pkt1, send ack1
  sender: rcv ack1, send pkt0; receiver: rcv pkt0, send ack0
(b) packet loss:
  sender: send pkt0; receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1 (X loss)
  sender: timeout, resend pkt1; receiver: rcv pkt1, send ack1
  sender: rcv ack1, send pkt0; receiver: rcv pkt0, send ack0

rdt3.0 in action
[Figure: sender/receiver timelines]
(c) ACK loss:
  sender: send pkt0; receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1; receiver: rcv pkt1, send ack1 (X loss)
  sender: timeout, resend pkt1; receiver: rcv pkt1 (detect duplicate), send ack1
  sender: rcv ack1, send pkt0; receiver: rcv pkt0, send ack0
(d) premature timeout/ delayed ACK:
  sender: send pkt0; receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1; receiver: rcv pkt1, send ack1 (delayed)
  sender: timeout, resend pkt1; receiver: rcv pkt1 (detect duplicate), send ack1
  sender: rcv ack1, send pkt0; receiver: rcv pkt0, send ack0
  sender: rcv ack1 (the second copy)

Performance of rdt3.0
• rdt3.0 is correct, but performance stinks
• e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:
  $D_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^9\ \text{bits/sec}} = 8\ \text{microsecs}$
• U sender: utilization – fraction of time sender busy sending
  $U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$
• if RTT=30 msec, 1KB pkt every 30 msec: 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!


rdt3.0: stop-and-wait operation
[Figure: sender/receiver timeline]
  sender: first packet bit transmitted, t = 0
  sender: last packet bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  sender: ACK arrives, send next packet, t = RTT + L / R

  $U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$


Pipelined protocols
pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver
• two generic forms of pipelined protocols: go-Back-N, selective repeat

Pipelining: increased utilization
[Figure: sender/receiver timeline]
  sender: first packet bit transmitted, t = 0
  sender: last bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  receiver: last bit of 2nd packet arrives, send ACK
  receiver: last bit of 3rd packet arrives, send ACK
  sender: ACK arrives, send next packet, t = RTT + L / R
3-packet pipelining increases utilization by a factor of 3!

  $U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{.024}{30.008} \approx 0.0008$
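The utilization figures can be reproduced with a few lines of Python. The function name and parameter names are illustrative; the numbers are the ones used in the example (8000-bit packet, 1 Gbps link, RTT of 30 msec).

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy with `window` packets in flight.

    Capped at 1.0 because a sender cannot be busy more than all the time.
    """
    d_trans = packet_bits / rate_bps          # L / R, in seconds
    return min(1.0, window * d_trans / (rtt_s + d_trans))

u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)
u_pipelined_3 = sender_utilization(8000, 1e9, 0.030, window=3)

print(round(u_stop_and_wait, 5))
print(round(u_pipelined_3, 5))
print(u_pipelined_3 / u_stop_and_wait)
```

Stop-and-wait gives about 0.00027; three packets in flight give about 0.0008, three times as much.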


Pipelined protocols: overview
Go-back-N:
• sender can have up to N unacked packets in pipeline
• receiver only sends cumulative ack
  • doesn’t ack packet if there’s a gap
• sender has timer for oldest unacked packet
  • when timer expires, retransmit all unacked packets
Selective Repeat:
• sender can have up to N unack’ed packets in pipeline
• rcvr sends individual ack for each packet
• sender maintains timer for each unacked packet
  • when timer expires, retransmit only that unacked packet

Go-Back-N: sender
• k-bit seq # in pkt header
• “window” of up to N, consecutive unack’ed pkts allowed
• ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
  • may receive duplicate ACKs (see receiver)
• timer for oldest in-flight pkt
• timeout(n): retransmit packet n and all higher seq # pkts in window

GBN: sender extended FSM
initially:
  base=1
  nextseqnum=1
state “Wait”
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer

GBN: receiver extended FSM
initially:
  expectedseqnum=1
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state “Wait”
  default
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
• may generate duplicate ACKs
• need only remember expectedseqnum
out-of-order pkt:
• discard (don’t buffer): no receiver buffering!
• re-ACK pkt with highest in-order seq #

GBN in action
[Figure: sender/receiver timeline, sender window N=4 sliding over seq #s 0 1 2 3 4 5 6 7 8]
sender:
  send pkt0
  send pkt1
  send pkt2 (X loss)
  send pkt3
  (wait)
  rcv ack0, send pkt4
  rcv ack1, send pkt5
  ignore duplicate ACK
  pkt 2 timeout
  send pkt2
  send pkt3
  send pkt4
  send pkt5
receiver:
  receive pkt0, send ack0
  receive pkt1, send ack1
  receive pkt3, discard, (re)send ack1
  receive pkt4, discard, (re)send ack1
  receive pkt5, discard, (re)send ack1
  rcv pkt2, deliver, send ack2
  rcv pkt3, deliver, send ack3
  rcv pkt4, deliver, send ack4
  rcv pkt5, deliver, send ack5
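The receiver column of this trace can be replayed with a short sketch. It models only the Go-Back-N receiver rule (deliver in-order packets, discard everything else, always re-ACK the highest in-order seq #). The function name is made up, and sequence numbers start at 0 to match the trace rather than the FSM's initial value.

```python
def gbn_receiver(arrivals, first_seq=0):
    """Replay packet arrivals at a Go-Back-N receiver.

    Returns the seq #s delivered to the upper layer and the ACKs sent.
    """
    expected = first_seq
    delivered, acks = [], []
    for seq in arrivals:
        if seq == expected:
            delivered.append(seq)      # in-order: deliver
            expected += 1
        # out-of-order packets are discarded, never buffered
        if expected > first_seq:
            acks.append(expected - 1)  # (re)ACK highest in-order seq #
    return delivered, acks

# pkt2 is lost, pkt3..pkt5 arrive out of order, then pkt2..pkt5 are resent.
delivered, acks = gbn_receiver([0, 1, 3, 4, 5, 2, 3, 4, 5])
print(delivered)  # [0, 1, 2, 3, 4, 5]
print(acks)       # [0, 1, 1, 1, 1, 2, 3, 4, 5]
```

The three repeated ack1 values are the duplicate ACKs the sender ignores while it waits for the pkt 2 timeout.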


Selective repeat
• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
• sender window
  • N consecutive seq #’s
  • limits seq #s of sent, unACKed pkts

Selective repeat: sender, receiver windows


Selective repeat
sender:
data from above:
• if next available seq # in window, send pkt
timeout(n):
• resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N]:
• mark pkt n as received
• if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver:
pkt n in [rcvbase, rcvbase+N-1]
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
• ACK(n)
otherwise:
• ignore

Selective repeat in action
[Figure: sender/receiver timeline, sender window N=4 sliding over seq #s 0 1 2 3 4 5 6 7 8]
sender:
  send pkt0
  send pkt1
  send pkt2 (X loss)
  send pkt3
  (wait)
  rcv ack0, send pkt4
  rcv ack1, send pkt5
  record ack3 arrived
  pkt 2 timeout
  send pkt2
  record ack4 arrived
  record ack5 arrived
receiver:
  receive pkt0, send ack0
  receive pkt1, send ack1
  receive pkt3, buffer, send ack3
  receive pkt4, buffer, send ack4
  receive pkt5, buffer, send ack5
  rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2
Q: what happens when ack2 arrives?
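For comparison with Go-Back-N, the receiver side of this trace can be sketched as follows. The sketch assumes unbounded sequence numbers (no wraparound), and the function and variable names are illustrative.

```python
def sr_receiver(arrivals, window=4, rcv_base=0):
    """Replay packet arrivals at a selective-repeat receiver.

    Returns the seq #s delivered (in order) and the individual ACKs sent.
    """
    buffered = set()
    delivered, acks = [], []
    for seq in arrivals:
        if rcv_base <= seq < rcv_base + window:
            acks.append(seq)               # individual ACK
            buffered.add(seq)              # buffer (possibly out of order)
            while rcv_base in buffered:    # deliver any in-order run
                buffered.remove(rcv_base)
                delivered.append(rcv_base)
                rcv_base += 1              # advance window
        elif rcv_base - window <= seq < rcv_base:
            acks.append(seq)               # already delivered: re-ACK only
        # otherwise: ignore
    return delivered, acks

# pkt2 is lost at first and arrives last, after its timeout.
delivered, acks = sr_receiver([0, 1, 3, 4, 5, 2])
print(delivered)  # [0, 1, 2, 3, 4, 5]
print(acks)       # [0, 1, 3, 4, 5, 2]
```

Unlike the Go-Back-N receiver, pkt3 to pkt5 are buffered and ACKed individually, so only pkt2 has to be resent.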
Selective repeat: dilemma
example:
• seq #’s: 0, 1, 2, 3
• window size=3
• receiver sees no difference in two scenarios!
• duplicate data accepted as new in (b)
Q: what relationship between seq # size and window size to avoid problem in (b)?
[Figure: sender window (after receipt) and receiver window (after receipt), both sliding over seq #s 0 1 2 3 0 1 2]
(a) no problem: sender sends pkt0, pkt1, pkt2, then pkt3 (X lost) and a new pkt0; receiver will accept packet with seq number 0
(b) oops!: sender sends pkt0, pkt1, pkt2 and all ACKs are lost (X); timeout, retransmit pkt0; receiver will accept packet with seq number 0
receiver can’t see sender side. receiver behavior identical in both cases! something’s (very) wrong!
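One way to explore the question is to compute which sequence numbers are ambiguous. The sketch below assumes the worst case in scenario (b): the receiver has accepted a full window and advanced, while the sender (all ACKs lost) may still retransmit the old window. The function names are made up; the final check uses the standard answer that the window must be at most half the sequence-number space.

```python
def receiver_window_after_full_delivery(seq_space, window):
    """Seq #s the receiver accepts as new once it has received one full window."""
    return {(window + i) % seq_space for i in range(window)}

def ambiguous_seq_numbers(seq_space, window):
    """Seq #s that could be either an old retransmission or a new packet."""
    old_window = set(range(window))   # sender may resend these if ACKs were lost
    return sorted(old_window & receiver_window_after_full_delivery(seq_space, window))

def sr_window_is_safe(seq_space, window):
    """Standard rule: window size at most half the sequence-number space."""
    return window <= seq_space // 2

print(ambiguous_seq_numbers(4, 3))  # [0, 1]
print(ambiguous_seq_numbers(4, 2))  # []
print(sr_window_is_safe(4, 3), sr_window_is_safe(4, 2))  # False True
```

With seq #'s 0 to 3 and window size 3, a retransmitted pkt0 falls inside the receiver's new window, which is exactly the problem in (b); shrinking the window to 2 removes the overlap.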


TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no “message boundaries”
• pipelined:
  • TCP congestion and flow control set window size
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) inits sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver

TCP segment structure
segment format (32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | Urg data pointer
  options (variable length)
  application data (variable length)
field notes:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• checksum: Internet checksum (as in UDP)
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• receive window: # bytes rcvr willing to accept


TCP Segment Structure(Cont..)
• Source and destination port numbers, which are used for
multiplexing/demultiplexing data from/to upper-layer applications.
• The 32-bit sequence number field and the 32-bit acknowledgment
number field are used by the TCP sender and receiver in implementing
a reliable data transfer service
• The sequence number for a segment is therefore the byte-stream
number of the first byte in the segment
• The acknowledgment number that Host A puts in its segment is the
sequence number of the next byte Host A is expecting from Host B.
• The 16-bit receive window field is used for flow control. It is used to
indicate the number of bytes that a receiver is willing to accept.
• The 4-bit header length field specifies the length of the TCP header in
32-bit words.
• The optional and variable-length options field is used when a sender
and receiver negotiate the maximum segment size (MSS)
TCP Segment Structure (cont..)
• The flag field contains 6 bits.
• The ACK bit is used to indicate that the value carried in the
acknowledgment field is valid
• The RST, SYN, and FIN bits are used for connection setup and
teardown
• PSH bit indicates that the receiver should pass the data to the upper
layer immediately
• The URG bit is used to indicate that there is data in this segment that
the sending-side upper-layer entity has marked as “urgent.” The
location of the last byte of this urgent data is indicated by the 16-bit
urgent data pointer field.
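As a rough illustration of the layout described above, the 20-byte fixed part of the header can be unpacked with Python's standard `struct` module. The function name and the sample values are assumptions for the example; options and the checksum computation are ignored.

```python
import struct

# Fixed 20-byte header in network byte order: ports (2x16 bits), sequence and
# acknowledgment numbers (2x32 bits), header length + flags (16 bits),
# receive window, checksum, urgent data pointer (3x16 bits).
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")

# The six flag bits sit in the low-order bits of the header-length/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(data):
    """Decode the fixed part of a TCP header from raw bytes."""
    (src_port, dest_port, seq, ack, len_flags,
     window, checksum, urg_ptr) = TCP_FIXED_HEADER.unpack(data[:TCP_FIXED_HEADER.size])
    header_words = len_flags >> 12            # 4-bit header length, in 32-bit words
    return {
        "src_port": src_port,
        "dest_port": dest_port,
        "seq": seq,
        "ack": ack,
        "header_len_bytes": header_words * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if len_flags & bit},
        "receive_window": window,
        "checksum": checksum,
        "urg_data_pointer": urg_ptr,
    }

# Sample header: header length of 5 words, ACK and PSH set, 4096-byte window.
raw = TCP_FIXED_HEADER.pack(9157, 80, 42, 79, (5 << 12) | 0x10 | 0x08, 4096, 0, 0)
header = parse_tcp_header(raw)
print(header)
```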
TCP seq. numbers, ACKs
sequence numbers:
• byte stream “number” of first byte in segment’s data
acknowledgements:
• seq # of next byte expected from other side
• cumulative ACK
Q: how receiver handles out-of-order segments
• A: TCP spec doesn’t say, - up to implementor
[Figure: outgoing segment from sender and incoming segment to sender, each showing source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer]
[Figure: sender sequence number space with window size N: sent ACKed; sent, not-yet ACKed (“in-flight”); usable but not yet sent; not usable]

TCP seq. numbers, ACKs
simple telnet scenario
  Host A -> Host B: Seq=42, ACK=79, data = ‘C’   (User types ‘C’)
  Host B -> Host A: Seq=79, ACK=43, data = ‘C’   (host ACKs receipt of ‘C’, echoes back ‘C’)
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed ‘C’)


TCP round trip time, timeout
Q: how to set TCP timeout value?
• longer than RTT
  • but RTT varies
• too short: premature timeout, unnecessary retransmissions
• too long: slow reaction to segment loss
Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary, want estimated RTT “smoother”
  • average several recent measurements, not just current SampleRTT

TCP round trip time, timeout
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
• exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125
[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr; RTT (milliseconds) against time (seconds), showing SampleRTT and EstimatedRTT]

TCP round trip time, timeout
 timeout interval: EstimatedRTT plus “safety margin”
   large variation in EstimatedRTT -> larger safety margin
 estimate SampleRTT deviation from EstimatedRTT:
DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
(EstimatedRTT is the estimated RTT; 4*DevRTT is the “safety margin”)
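The EstimatedRTT, DevRTT and TimeoutInterval updates can be sketched in a few lines of Python. This is a minimal illustration, not a real TCP implementation: the class name `RttEstimator`, the initial values, and the choice to update EstimatedRTT before DevRTT (following the order on the slides) are all assumptions.

```python
from dataclasses import dataclass

ALPHA = 0.125  # typical weight given to the newest SampleRTT
BETA = 0.25    # typical weight given to the newest deviation sample


@dataclass
class RttEstimator:
    """Hypothetical helper that tracks EstimatedRTT and DevRTT (seconds)."""
    estimated_rtt: float
    dev_rtt: float = 0.0  # assumed starting deviation

    def update(self, sample_rtt: float) -> float:
        """Fold one SampleRTT into the estimates; return TimeoutInterval."""
        # Exponential weighted moving average of the RTT samples.
        self.estimated_rtt = (1 - ALPHA) * self.estimated_rtt + ALPHA * sample_rtt
        # Smoothed deviation of the sample from the (updated) estimate.
        self.dev_rtt = (1 - BETA) * self.dev_rtt + BETA * abs(sample_rtt - self.estimated_rtt)
        # Estimated RTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(estimated_rtt=0.100)
timeout = est.update(0.200)  # one slow sample pushes the timeout up
```

With these assumed starting values, a single 200 ms sample moves EstimatedRTT only slightly, while the safety margin grows to absorb the variation.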


TCP reliable data transfer
 TCP creates rdt service on top of IP’s unreliable service
   pipelined segments
   cumulative acks
   single retransmission timer
 retransmissions triggered by:
   timeout events
   duplicate acks

let’s initially consider simplified TCP sender:
   ignore duplicate acks
   ignore flow control, congestion control
TCP sender events:
data rcvd from app:
 create segment with seq #
 seq # is byte-stream number of first data byte in segment
 start timer if not already running
   think of timer as for oldest unacked segment
   expiration interval: TimeOutInterval

timeout:
 retransmit segment that caused timeout
 restart timer

ack rcvd:
 if ack acknowledges previously unacked segments
   update what is known to be ACKed
   start timer if there are still unacked segments
TCP sender (simplified)
initially:
  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum
wait for event:

event: data received from application above
  create segment, seq. #: NextSeqNum
  pass segment to IP (i.e., “send”)
  NextSeqNum = NextSeqNum + length(data)
  if (timer currently not running)
    start timer

event: timeout
  retransmit not-yet-acked segment with smallest seq. #
  start timer

event: ACK received, with ACK field value y
  if (y > SendBase) {
    SendBase = y
    /* SendBase–1: last cumulatively ACKed byte */
    if (there are currently not-yet-acked segments)
      start timer
    else stop timer
  }
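The simplified sender's three events map naturally onto three methods. The Python sketch below is only an illustration under stated assumptions: the class `SimplifiedTcpSender`, the `sent` log that stands in for "pass segment to IP", and the boolean `timer_running` flag (no real clock) are hypothetical, and sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Hypothetical model of the simplified TCP sender (no flow/congestion control)."""

    def __init__(self, initial_seq_num: int):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False
        self.unacked = {}  # seq # -> data of segments not yet ACKed
        self.sent = []     # log of (seq #, data) handed to IP

    def on_app_data(self, data: bytes) -> None:
        # Create a segment whose seq # is the byte-stream number of its first byte.
        seq = self.next_seq_num
        self.unacked[seq] = data
        self.sent.append((seq, data))
        self.next_seq_num += len(data)
        if not self.timer_running:
            self.timer_running = True  # timer covers the oldest unacked segment

    def on_timeout(self) -> None:
        if self.unacked:
            # Retransmit the not-yet-acked segment with the smallest seq #.
            seq = min(self.unacked)
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True  # restart timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y  # send_base - 1 is the last cumulatively ACKed byte
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Keep the timer only while segments remain unacked.
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(92)
sender.on_app_data(b"a" * 8)   # Seq=92, 8 bytes of data
sender.on_app_data(b"b" * 20)  # Seq=100, 20 bytes of data
sender.on_ack(120)             # one cumulative ACK covers both segments
```

Driving the object with Seq=92 and Seq=100 followed by a single ACK=120 mirrors the cumulative ACK case: both segments are released without any retransmission.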
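TCP: retransmission scenarios
lost ACK scenario (Host A, Host B):
 A sends Seq=92, 8 bytes of data (SendBase=92)
 B replies ACK=100; the ACK is lost (X)
 timeout at A: A resends Seq=92, 8 bytes of data
 B replies ACK=100 (SendBase=100)

premature timeout (Host A, Host B):
 A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data (SendBase=92)
 B replies ACK=100, then ACK=120
 timeout at A before the ACKs arrive: A resends Seq=92, 8 bytes of data
 the ACKs arrive at A (SendBase=100, then SendBase=120)
 B replies to the duplicate segment with ACK=120 (SendBase=120)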
TCP: retransmission scenarios
cumulative ACK (Host A, Host B):
 A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
 B replies ACK=100; the ACK is lost (X)
 B replies ACK=120, which arrives before the timeout
 A sends Seq=120, 15 bytes of data



TCP ACK generation [RFC 1122, RFC 2581]

event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq. # of next expected byte

event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap



TCP fast retransmit
 time-out period often relatively long:
   long delay before resending lost packet
 detect lost segments via duplicate ACKs.
   sender often sends many segments back-to-back
   if segment is lost, there will likely be many duplicate ACKs.

TCP fast retransmit
if sender receives 3 ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
 likely that unacked segment lost, so don’t wait for timeout



TCP fast retransmit
fast retransmit after sender receipt of triple duplicate ACK (Host A, Host B):
 A sends Seq=92, 8 bytes of data
 A sends Seq=100, 20 bytes of data; the segment is lost (X)
 B replies ACK=100 four times (the original ACK plus three duplicates)
 A resends Seq=100, 20 bytes of data before the timeout expires
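Counting duplicate ACKs is the core of fast retransmit. The following Python sketch shows one way to do it; the class `DupAckDetector` and its method names are hypothetical, and the retransmission itself is left to the caller.

```python
class DupAckDetector:
    """Hypothetical duplicate-ACK counter for a TCP sender."""

    def __init__(self, send_base: int):
        self.send_base = send_base  # oldest unacked byte
        self.dup_acks = 0

    def on_ack(self, y: int) -> bool:
        """Return True when the segment at send_base should be fast-retransmitted."""
        if y > self.send_base:
            # New data ACKed: advance the window and reset the counter.
            self.send_base = y
            self.dup_acks = 0
            return False
        if y == self.send_base:
            # Same ACK number again: a duplicate ACK.
            self.dup_acks += 1
            return self.dup_acks == 3  # triple duplicate ACK
        return False  # stale ACK for already-acknowledged data


detector = DupAckDetector(send_base=92)
decisions = [detector.on_ack(100) for _ in range(4)]  # four ACK=100 in a row
```

In the scenario with four ACK=100 arrivals, the first is a new ACK and the next three are duplicates, so only the fourth call signals a fast retransmit of the segment starting at byte 100.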
TCP flow control
application may remove data from TCP socket buffers …
… slower than TCP receiver is delivering (sender is sending)

flow control
receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

[Figure: receiver protocol stack: application process above the OS, which holds the TCP socket receiver buffers, TCP code and IP code; segments arrive from sender]
TCP flow control
 receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
   RcvBuffer size set via socket options (typical default is 4096 bytes)
   many operating systems autoadjust RcvBuffer
 sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
 guarantees receive buffer will not overflow
[Figure: receiver-side buffering: TCP segment payloads enter RcvBuffer; buffered data is passed to the application process; rwnd is the free buffer space]
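A small Python sketch of the two sides of flow control, under simple assumptions: `advertised_rwnd` and `can_send` are hypothetical helper names, and the receiver's buffer occupancy is passed in directly instead of being derived from byte counters.

```python
def advertised_rwnd(rcv_buffer: int, buffered: int) -> int:
    """Receiver side: free buffer space to advertise as rwnd (bytes)."""
    return max(rcv_buffer - buffered, 0)


def can_send(in_flight: int, segment_len: int, rwnd: int) -> bool:
    """Sender side: keep unacked ("in-flight") data within the receiver's rwnd."""
    return in_flight + segment_len <= rwnd


rwnd = advertised_rwnd(rcv_buffer=4096, buffered=1000)  # 3096 bytes free
ok = can_send(in_flight=3000, segment_len=96, rwnd=rwnd)
```

Because the sender never lets in-flight data exceed the advertised value, the data it has outstanding always fits in the receiver's free space.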
Connection Management
before exchanging data, sender/receiver “handshake”:
 agree to establish connection (each knowing the other willing to establish connection)
 agree on connection parameters

at each end (application above network):
connection state: ESTAB
connection variables:
  seq # client-to-server, server-to-client
  rcvBuffer size at server, client

client: Socket clientSocket = new Socket("hostname","port number");
server: Socket connectionSocket = welcomeSocket.accept();



Agreeing to establish a connection
2-way handshake:
 “Let’s talk” -> ESTAB; “OK” -> ESTAB
 choose x; req_conn(x) -> ESTAB; acc_conn(x) -> ESTAB

Q: will 2-way handshake always work in network?
 variable delays
 retransmitted messages (e.g. req_conn(x)) due to message loss
 message reordering
 can’t “see” other side



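Agreeing to establish a connection
2-way handshake failure scenarios:

scenario 1: half open connection
 client: choose x, send req_conn(x)
 server: ESTAB, send acc_conn(x)
 client: retransmit req_conn(x); client: ESTAB on receiving acc_conn(x)
 connection x completes: client terminates, server forgets x
 retransmitted req_conn(x) arrives at server: ESTAB
 half open connection! (no client!)

scenario 2: duplicate data accepted
 client: choose x, send req_conn(x)
 server: ESTAB, send acc_conn(x)
 client: retransmit req_conn(x); client: ESTAB; send data(x+1)
 server: accept data(x+1); client retransmits data(x+1)
 connection x completes: client terminates, server forgets x
 retransmitted req_conn(x) arrives at server: ESTAB
 retransmitted data(x+1) arrives at server: accept data(x+1)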
TCP 3-way handshake
client state: LISTEN; server state: LISTEN
 client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x); client state: SYNSENT
 server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1); server state: SYN RCVD
 client: received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data; client state: ESTAB
 server: received ACK(y) indicates client is live; server state: ESTAB
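The numbering in the three handshake segments can be made concrete with a short Python sketch. It models only the header fields named on the slide; the `Segment` dataclass and the `three_way_handshake` function are hypothetical, and fields the slide does not show (such as the sequence number of the final ACK) are left unset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """Hypothetical subset of TCP header fields used in the handshake."""
    syn: bool = False
    ack: bool = False
    seq: Optional[int] = None
    ack_num: Optional[int] = None


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the SYN, SYNACK and ACK segments for initial seq nums x and y."""
    syn = Segment(syn=True, seq=x)                                    # client -> server
    synack = Segment(syn=True, ack=True, seq=y, ack_num=syn.seq + 1)  # server -> client
    final_ack = Segment(ack=True, ack_num=synack.seq + 1)             # client -> server
    return [syn, synack, final_ack]


segments = three_way_handshake(x=1000, y=5000)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends learn that the peer is live.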


TCP 3-way handshake: FSM
server side:
 closed -> listen: Socket connectionSocket = welcomeSocket.accept();
 listen -> SYN rcvd: receive SYN(x); send SYNACK(seq=y,ACKnum=x+1); create new socket for communication back to client
 SYN rcvd -> ESTAB: receive ACK(ACKnum=y+1)
client side:
 closed -> SYN sent: Socket clientSocket = new Socket("hostname","port number"); send SYN(seq=x)
 SYN sent -> ESTAB: receive SYNACK(seq=y,ACKnum=x+1); send ACK(ACKnum=y+1)



TCP: closing a connection
 client, server each close their side of connection
   send TCP segment with FIN bit = 1
 respond to received FIN with ACK
   on receiving FIN, ACK can be combined with own FIN
 simultaneous FIN exchanges can be handled



TCP: closing a connection
client state: ESTAB; server state: ESTAB
 client: clientSocket.close(); send FINbit=1, seq=x; client state: FIN_WAIT_1 (can no longer send but can receive data)
 server: send ACKbit=1; ACKnum=x+1; server state: CLOSE_WAIT (can still send data)
 client: client state: FIN_WAIT_2 (wait for server close)
 server: send FINbit=1, seq=y; server state: LAST_ACK (can no longer send data)
 client: send ACKbit=1; ACKnum=y+1; client state: TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
 server: on receiving the ACK, server state: CLOSED



Principles of congestion control
congestion:
 informally: “too many sources sending too much data too fast for network to handle”
 different from flow control!
 manifestations:
   lost packets (buffer overflow at routers)
   long delays (queueing in router buffers)
 a top-10 problem!



Causes/costs of congestion: scenario 1
 two senders, two receivers
 one router, infinite buffers
 output link capacity: R
 no retransmission
[Figure: Host A and Host B send original data at rate λin through unlimited shared output link buffers; throughput: λout]
[Plots: λout vs. λin, levelling off at R/2; delay vs. λin, growing as λin approaches R/2]
 maximum per-connection throughput: R/2
 large delays as arrival rate, λin, approaches capacity
Causes/costs of congestion: scenario 2
 one router, finite buffers
 sender retransmission of timed-out packet
   application-layer input = application-layer output: λin = λout
   transport-layer input includes retransmissions: λ'in ≥ λin
[Figure: Host A sends λin (original data) and λ'in (original data, plus retransmitted data) into finite shared output link buffers; Host B receives λout]
Causes/costs of congestion: scenario 2
idealization: perfect knowledge
 sender sends only when router buffers available
[Plot: λout vs. λin, both axes up to R/2]
[Figure: Host A (holding a copy) sends λin (original data) and λ'in (original data, plus retransmitted data) when there is free buffer space in the finite shared output link buffers; Host B receives λout]
Causes/costs of congestion: scenario 2
Idealization: known loss
packets can be lost, dropped at router due to full buffers
 sender only resends if packet known to be lost
[Figure: Host A (holding a copy) sends λin (original data) and λ'in (original data, plus retransmitted data); no buffer space at the router; Host B receives λout]



Causes/costs of congestion: scenario 2
Idealization: known loss
packets can be lost, dropped at router due to full buffers
 sender only resends if packet known to be lost
when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?)
[Plot: λout vs. λin, both axes up to R/2]
[Figure: Host A sends λin (original data) and λ'in (original data, plus retransmitted data) when there is free buffer space; Host B receives λout]



Causes/costs of congestion: scenario 2
Realistic: duplicates
 packets can be lost, dropped at router due to full buffers
 sender times out prematurely, sending two copies, both of which are delivered
when sending at R/2, some packets are retransmissions, including duplicates that are delivered!
[Plot: λout vs. λin, both axes up to R/2]
[Figure: Host A (holding a copy) times out and resends: λin, λ'in; free buffer space at the router; Host B receives λout]



Causes/costs of congestion: scenario 2
Realistic: duplicates
 packets can be lost, dropped at router due to full buffers
 sender times out prematurely, sending two copies, both of which are delivered
when sending at R/2, some packets are retransmissions, including duplicates that are delivered!
[Plot: λout vs. λin, both axes up to R/2]

“costs” of congestion:
 more work (retrans) for given “goodput”
 unneeded retransmissions: link carries multiple copies of pkt
   decreasing goodput



Causes/costs of congestion: scenario 3
 four senders
 multihop paths
 timeout/retransmit
Q: what happens as λin and λin’ increase?
A: as red λin’ increases, all arriving blue pkts at upper queue are dropped, blue throughput → 0
[Figure: Hosts A, B, C and D connected through routers with finite shared output link buffers; λin: original data; λ'in: original data, plus retransmitted data; λout]



Causes/costs of congestion: scenario 3
[Plot: λout vs. λin’, both axes up to C/2]

another “cost” of congestion:
 when packet dropped, any “upstream” transmission capacity used for that packet was wasted!



Approaches towards congestion control
two broad approaches towards congestion control:

end-end congestion control:
 no explicit feedback from network
 congestion inferred from end-system observed loss, delay
 approach taken by TCP

network-assisted congestion control:
 routers provide feedback to end systems
   single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
   explicit rate for sender to send at
Case study: ATM ABR congestion control
ABR: available bit rate:
 “elastic service”
 if sender’s path “underloaded”:
   sender should use available bandwidth
 if sender’s path congested:
   sender throttled to minimum guaranteed rate

RM (resource management) cells:
 sent by sender, interspersed with data cells
 bits in RM cell set by switches (“network-assisted”)
   NI bit: no increase in rate (mild congestion)
   CI bit: congestion indication
 RM cells returned to sender by receiver, with bits intact
Case study: ATM ABR congestion control
[Figure: RM cells interspersed with data cells]
 two-byte ER (explicit rate) field in RM cell
   congested switch may lower ER value in cell
   senders’ send rate thus max supportable rate on path
 EFCI bit in data cells: set to 1 in congested switch
   if data cell preceding RM cell has EFCI set, receiver sets CI bit in returned RM cell
TCP congestion control: additive increase multiplicative decrease
 approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
   additive increase: increase cwnd by 1 MSS every RTT until loss detected
   multiplicative decrease: cut cwnd in half after loss
[Figure: AIMD saw tooth behavior: probing for bandwidth; cwnd (TCP sender congestion window size) vs. time; additively increase window size … until loss occurs (then cut window in half)]
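The saw tooth can be reproduced with a few lines of Python. This is a toy model measured in MSS units per RTT; the function name `aimd`, the `loss_at` set of RTT indices where loss is detected, and the floor of 1 MSS are assumptions made for illustration.

```python
from typing import List, Set


def aimd(cwnd_mss: float, rtts: int, loss_at: Set[int]) -> List[float]:
    """Return cwnd (in MSS) at the start of each RTT under AIMD."""
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd_mss)
        if rtt in loss_at:
            # Multiplicative decrease: cut cwnd in half (never below 1 MSS).
            cwnd_mss = max(cwnd_mss / 2, 1.0)
        else:
            # Additive increase: grow cwnd by 1 MSS per RTT.
            cwnd_mss += 1
    return trace


trace = aimd(cwnd_mss=4, rtts=6, loss_at={2})  # loss detected during RTT 2
```

Plotting `trace` against the RTT index for a longer run with periodic losses gives the familiar saw tooth shape.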
TCP Congestion Control: details
[Figure: sender sequence number space: last byte ACKed; sent, not-yet ACKed (“in-flight”) bytes spanning cwnd; last byte sent]
 sender limits transmission:
   LastByteSent - LastByteAcked ≤ cwnd
 cwnd is dynamic, function of perceived network congestion

TCP sending rate:
 roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes
   rate ~ cwnd/RTT bytes/sec
TCP Slow Start
 when connection begins, increase rate exponentially until first loss event:
   initially cwnd = 1 MSS
   double cwnd every RTT
   done by incrementing cwnd for every ACK received
 summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A sends one segment, then two segments, then four segments to Host B in successive RTTs]
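Incrementing cwnd by one MSS per ACK is what produces the doubling per RTT. The Python sketch below illustrates this, assuming every segment sent in an RTT is ACKed individually (no delayed ACKs) and no loss occurs; the function name `slow_start` is hypothetical.

```python
from typing import List


def slow_start(rtts: int, mss: int = 1) -> List[int]:
    """Return cwnd at the start of each RTT during loss-free slow start."""
    cwnd = mss  # initially cwnd = 1 MSS
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        acks = cwnd // mss  # one ACK per segment sent this RTT (assumption)
        for _ in range(acks):
            cwnd += mss     # increment cwnd for every ACK received
    return trace


trace = slow_start(4)  # cwnd in MSS units: one, two, four, eight segments
```

With `mss=1` the trace counts segments, matching the one, two, four segments pattern in the figure.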
TCP: detecting, reacting to loss
 loss indicated by timeout:
   cwnd set to 1 MSS;
   window then grows exponentially (as in slow start) to threshold, then grows linearly
 loss indicated by 3 duplicate ACKs: TCP RENO
   dup ACKs indicate network capable of delivering some segments
   cwnd is cut in half, window then grows linearly
 TCP Tahoe always sets cwnd to 1 MSS (timeout or 3 duplicate acks)



TCP: switching from slow start to CA
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.

Implementation:
 variable ssthresh
 on loss event, ssthresh is set to 1/2 of cwnd just before loss event



Summary: TCP Congestion Control
initially: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start

state: slow start
 duplicate ACK: dupACKcount++
 new ACK: cwnd = cwnd+MSS, dupACKcount = 0, transmit new segment(s), as allowed
 cwnd > ssthresh: go to congestion avoidance
 timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment
 dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment; go to fast recovery

state: congestion avoidance
 new ACK: cwnd = cwnd + MSS*(MSS/cwnd), dupACKcount = 0, transmit new segment(s), as allowed
 duplicate ACK: dupACKcount++
 timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start
 dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment; go to fast recovery

state: fast recovery
 duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s), as allowed
 new ACK: cwnd = ssthresh, dupACKcount = 0; go to congestion avoidance
 timeout: ssthresh = cwnd/2, cwnd = 1, dupACKcount = 0, retransmit missing segment; go to slow start
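The three-state machine can be sketched as a small Python class. This is an illustrative model, not a real TCP stack: the class `RenoSender`, the MSS value of 1460 bytes, and reading "ssthresh + 3" as ssthresh plus 3 MSS are assumptions, and retransmission and transmission of new segments are omitted.

```python
MSS = 1460  # bytes; an assumed value for illustration


class RenoSender:
    """Hypothetical model of the slow start / congestion avoidance / fast recovery FSM."""

    def __init__(self):
        self.state = "slow start"
        self.cwnd = 1 * MSS
        self.ssthresh = 64 * 1024
        self.dup_ack_count = 0

    def on_new_ack(self):
        if self.state == "fast recovery":
            self.cwnd = self.ssthresh             # deflate the window
            self.state = "congestion avoidance"
        elif self.state == "slow start":
            self.cwnd += MSS                      # exponential growth per RTT
            if self.cwnd > self.ssthresh:
                self.state = "congestion avoidance"
        else:
            self.cwnd += MSS * MSS / self.cwnd    # about 1 MSS per RTT
        self.dup_ack_count = 0

    def on_duplicate_ack(self):
        if self.state == "fast recovery":
            self.cwnd += MSS                      # each dup ACK means a segment left the network
            return
        self.dup_ack_count += 1
        if self.dup_ack_count == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS   # "+ 3" read as 3 MSS (assumption)
            self.state = "fast recovery"

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_ack_count = 0
        self.state = "slow start"


sender = RenoSender()
sender.on_new_ack()            # slow start: cwnd grows to 2 MSS
for _ in range(3):
    sender.on_duplicate_ack()  # third duplicate ACK enters fast recovery
```

Keeping each event in its own method makes every arrow of the state diagram correspond to one branch of the code.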


TCP throughput
 avg. TCP thruput as function of window size, RTT?
   ignore slow start, assume always data to send
 W: window size (measured in bytes) where loss occurs
   avg. window size (# in-flight bytes) is ¾ W
   avg. thruput is ¾ W per RTT

avg TCP thruput = (3/4) * W / RTT bytes/sec

[Figure: saw tooth window size oscillating between W/2 and W]
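The ¾ W figure follows from averaging a window that ramps linearly from W/2 up to W. A quick numerical check in Python, assuming an idealized linear ramp sampled at evenly spaced points (the function name `avg_window` is hypothetical):

```python
def avg_window(w: float, steps: int = 1000) -> float:
    """Average of a window that grows linearly from w/2 to w."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


average = avg_window(1000.0)  # close to 750.0, i.e. 3/4 of W
```

Dividing that average window by the RTT gives the average throughput in bytes/sec.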


TCP Futures: TCP over “long, fat pipes”
 example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
 requires W = 83,333 in-flight segments
 throughput in terms of segment loss probability, L [Mathis 1997]:

TCP throughput = (1.22 * MSS) / (RTT * sqrt(L))

➜ to achieve 10 Gbps throughput, need a loss rate of L = 2·10^-10, a very small loss rate!
 new versions of TCP for high-speed
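The two numbers in this example can be checked with a short Python calculation. It assumes MSS equals the 1500-byte segment size and that throughput is given in bits per second; the function names are hypothetical.

```python
def window_segments(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain the throughput over one RTT."""
    return throughput_bps * rtt_s / (8 * mss_bytes)


def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L (MSS in bits)."""
    return (1.22 * 8 * mss_bytes / (rtt_s * throughput_bps)) ** 2


w = window_segments(10e9, 0.100, 1500)         # about 83,333 segments
loss = required_loss_rate(10e9, 0.100, 1500)   # about 2e-10
```

Both results agree with the figures quoted above to the stated precision.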
TCP Fairness
fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]



Why is TCP fair?
two competing sessions:
 additive increase gives slope of 1, as throughput increases
 multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput vs. Connection 1 throughput, each axis up to R, with the equal bandwidth share line; the operating point alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”, moving toward the equal share line]



Fairness (more)
Fairness and UDP
 multimedia apps often do not use TCP
   do not want rate throttled by congestion control
 instead use UDP:
   send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections
 application can open multiple parallel connections between two hosts
 web browsers do this
 e.g., link of rate R with 9 existing connections:
   new app asks for 1 TCP, gets rate R/10
   new app asks for 11 TCPs, gets about R/2 (11 of 20 connections)
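The parallel-connection arithmetic can be checked directly, assuming every TCP connection on the link gets an equal share R/K as in the fairness goal. The function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing: int, new_conns: int) -> Fraction:
    """Fraction of link rate R obtained by an app opening new_conns connections."""
    total = existing + new_conns
    return Fraction(new_conns, total)  # each connection gets R/total


one = app_share(existing=9, new_conns=1)      # 1/10 of R
eleven = app_share(existing=9, new_conns=11)  # 11/20 of R, slightly more than R/2
```

Opening more connections raises the application's share even though each individual connection is treated fairly.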


Chapter 3: summary
 principles behind transport layer services:
   multiplexing, demultiplexing
   reliable data transfer
   flow control
   congestion control
 instantiation, implementation in the

next:
 leaving the network “edge” (application, transport layers)
 into the network “core”
