Transport Layer Essentials
2.2.2 Connectionless Multiplexing and Demultiplexing
2.2.3 Connection Oriented Multiplexing and Demultiplexing
2.2.4 Web Servers and TCP
2.3 Connectionless Transport: UDP
2.3.1 UDP Segment Structure
2.3.2 UDP Checksum
2.4 Principles of Reliable Data Transfer
2.4.1 Building a Reliable Data Transfer Protocol
2.4.1.1 Reliable Data Transfer over a Perfectly Reliable Channel: rdt1.0
2.4.1.2 Reliable Data Transfer over a Channel with Bit Errors: rdt2.0
2.4.1.2.1 Sender Handles Garbled ACK/NAKs: rdt2.1
2.4.1.2.2 Sender uses ACK/NAKs: rdt2.2
2.4.1.3 Reliable Data Transfer over a Lossy Channel with Bit Errors: rdt3.0
2.4.2 Pipelined Reliable Data Transfer Protocols
2.4.3 Go-Back-N (GBN)
2.4.3.1 GBN Sender
2.4.3.2 GBN Receiver
2.4.3.3 Operation of the GBN Protocol
2.4.4 Selective Repeat (SR)
2.4.4.1 SR Sender
2.4.4.2 SR Receiver
2.4.5 Summary of Reliable Data Transfer Mechanisms and their Use
COMPUTER NETWORKS
2.6.2 Approaches to Congestion Control
2.6.3 Network Assisted Congestion Control Example: ATM ABR Congestion Control
2.6.3.1 Three Methods to indicate Congestion
2.7 TCP Congestion Control
2.7.1 TCP Congestion Control
2.7.1.1 Slow Start
2.7.1.2 Congestion Avoidance
2.7.1.3 Fast Recovery
2.7.1.4 TCP Congestion Control: Retrospective
2.7.2 Fairness
2.7.2.1 Fairness and UDP
2.7.2.2 Fairness and Parallel TCP Connections
• On the sender, the transport-layer
→ receives messages from an application-process
→ converts the messages into segments and
→ passes the segments to the network-layer.
• On the receiver, the transport-layer
→ receives the segments from the network-layer
→ converts the segments into messages and
→ passes the messages to the application-process.
• The Internet has 2 transport-layer protocols: TCP and UDP
2.1.1 Relationship between Transport and Network Layers
• A transport-layer protocol provides logical-communication between processes running on different hosts,
whereas a network-layer protocol provides logical-communication between hosts.
• Transport-layer protocols are implemented in the end-systems but not in network-routers.
• Within an end-system, a transport protocol
→ moves messages from application-processes to the network-layer and vice versa.
→ but doesn't say anything about how the messages are moved within the network-core.
• The routers do not recognize any information which is appended to the messages by the transport-layer.
2.1.2 Overview of the Transport Layer in the Internet
• When designing a network-application, we must choose either TCP or UDP as transport protocol.
1) UDP (User Datagram Protocol)
UDP provides a connectionless service to the invoking application.
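2) TCP (Transmission Control Protocol)
TCP provides a connection-oriented service to the invoking application. Its services include:
1) Reliable data transfer, i.e. it guarantees that data will arrive at the destination-process correctly.
2) Congestion control and
3) Error checking.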
2.2 Multiplexing and Demultiplexing
• A process can have one or more sockets.
• The sockets are used to pass data from the network to the process and vice versa.
1) Multiplexing
At the sender, the transport-layer
→ gathers data-chunks at the source-host from different sockets
→ encapsulates each data-chunk with a header to create segments and
→ passes the segments to the network-layer.
The job of gathering the data-chunks from different sockets and encapsulating them into segments is called multiplexing.
2) Demultiplexing
At the receiver, the transport-layer
→ examines the fields in the segments to identify the receiving-socket and
→ directs the segment to the receiving-socket.
The job of delivering the data in a segment to the correct socket is called demultiplexing.
• In Figure 2.1,
In the middle host, the transport-layer must demultiplex segments arriving from the network-layer
to either process P1 or P2.
The arriving segment’s data is directed to the corresponding process’s socket.
2.2.1 Endpoint Identification
• Each socket must have a unique identifier.
• Each segment must include 2 header-fields to identify the socket (Figure 2.2):
1) Source-port-number field and
2) Destination-port-number field.
• Each port-number is a 16-bit number: 0 to 65535.
• The port-numbers ranging from 0 to 1023 are called well-known port-numbers and are reserved for well-known application protocols.
For example: HTTP uses port-no 80
FTP uses port-no 21
• When we develop a new application, we must assign the application a port-number.
• The port-numbers ranging from 49152 to 65535 are known as ephemeral ports and are typically assigned automatically to client sockets.
Figure 2.2: Source and destination-port-no fields in a transport-layer segment
• How does the transport-layer implement the demultiplexing service?
• Answer:
Each socket in the host will be assigned a port-number.
When a segment arrives at the host, the transport-layer
→ examines the destination-port-no in the segment
→ directs the segment to the corresponding socket and
→ then passes the segment's data to the attached process.
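The demultiplexing steps above can be sketched as a lookup table keyed by destination port. This is only an illustrative model: the `Host` class, its method names, and the dictionary-based segment are assumptions made for the example, not a real socket API.

```python
class Host:
    """Toy model of port-based demultiplexing (hypothetical, for illustration)."""

    def __init__(self):
        # destination port -> list of payloads delivered to that socket
        self.sockets = {}

    def bind(self, port):
        # A port-number is a 16-bit value: 0 to 65535.
        if not 0 <= port <= 65535:
            raise ValueError("port must fit in 16 bits")
        self.sockets[port] = []
        return self.sockets[port]

    def demultiplex(self, segment):
        # Examine the destination port and direct the data to that socket.
        queue = self.sockets.get(segment["dst_port"])
        if queue is None:
            return False  # no socket is bound to this port
        queue.append(segment["data"])
        return True


host = Host()
web = host.bind(80)
ftp = host.bind(21)
host.demultiplex({"src_port": 49152, "dst_port": 80, "data": b"GET /"})
host.demultiplex({"src_port": 49153, "dst_port": 21, "data": b"USER x"})
```

After these calls, `web` holds only the payload addressed to port 80 and `ftp` only the one addressed to port 21.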
2.2.2 Connectionless Multiplexing and Demultiplexing
• At the client side of the application, the transport-layer automatically assigns the port-number,
whereas at the server side, the application assigns a specific port-number.
• Suppose process on Host-A (port 19157) wants to send data to process on Host-B (port 46428).
Figure 2.3: The inversion of source and destination-port-nos
• At the sender A, the transport-layer
→ creates a segment containing source-port 19157, destination-port 46428 & data and
→ then passes the resulting segment to the network-layer.
• At the receiver B, the transport-layer
→ examines the destination-port field in the segment and
→ delivers the segment to the socket identified by port 46428.
• A UDP socket is identified by a two-tuple:
1) Destination IP address &
2) Destination-port-no.
• As shown in Figure 2.3,
Source-port-no from Host-A is used at Host-B as "return address" i.e. when B wants to send a
segment back to A.
2.2.3 Connection Oriented Multiplexing and Demultiplexing
• Each TCP connection has exactly 2 end-points. (Figure 2.4).
• Thus, 2 arriving TCP segments with different source IP addresses or source-port-nos will be directed to 2 different sockets,
even if they have the same destination-port-no.
• A TCP socket is identified by a four-tuple:
1) Source IP address
2) Source-port-no
3) Destination IP address &
4) Destination-port-no.
Figure 2.4: The inversion of source and destination-port-nos
• The server-host may support many simultaneous connection-sockets.
• Each socket will be
→ attached to a process.
→ identified by its own four tuple.
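A minimal sketch of four-tuple demultiplexing follows. The `TcpDemux` class and the example addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration; a real TCP stack also handles listening sockets and connection setup, which are omitted here.

```python
class TcpDemux:
    """Toy model: each connection-socket is keyed by its own four-tuple."""

    def __init__(self):
        # (src_ip, src_port, dst_ip, dst_port) -> list of delivered payloads
        self.connections = {}

    def open(self, src_ip, src_port, dst_ip, dst_port):
        key = (src_ip, src_port, dst_ip, dst_port)
        self.connections[key] = []
        return self.connections[key]

    def deliver(self, seg):
        # All four fields are used to pick the socket.
        key = (seg["src_ip"], seg["src_port"], seg["dst_ip"], seg["dst_port"])
        sock = self.connections.get(key)
        if sock is None:
            return False
        sock.append(seg["data"])
        return True


server = TcpDemux()
# Two clients talk to the same server port 80 but get separate sockets.
sock_a = server.open("192.0.2.10", 26145, "192.0.2.1", 80)
sock_b = server.open("192.0.2.20", 26145, "192.0.2.1", 80)
server.deliver({"src_ip": "192.0.2.10", "src_port": 26145,
                "dst_ip": "192.0.2.1", "dst_port": 80, "data": b"from A"})
server.deliver({"src_ip": "192.0.2.20", "src_port": 26145,
                "dst_ip": "192.0.2.1", "dst_port": 80, "data": b"from B"})
```

Both segments carry destination-port 80 and even the same source-port, yet they reach different sockets because the source IP addresses differ.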
• When a segment arrives at the host, all 4 fields are used to direct the segment to the appropriate
socket.
2.2.4 Web Servers and TCP
• Consider a host running a Web-server (ex: Apache) on port 80.
• When clients (ex: browsers) send segments to the server, all segments will have destination-port 80.
• The server distinguishes the segments from the different clients using two-tuple:
1) Source IP addresses &
2) Source-port-nos.
• Figure 2.5 shows a Web-server that creates a new process for each connection.
• The server can use either i) persistent HTTP or ii) non-persistent HTTP
i) Persistent HTTP
Throughout the duration of the persistent connection the client and server exchange HTTP
messages via the same server socket.
ii) Non-persistent HTTP
A new TCP connection is created and closed for every request/response.
Hence, a new socket is created and closed for every request/response.
This can severely impact the performance of a busy Web-server.
Figure 2.5: Two clients, using the same destination-port-no (80) to communicate with the same Web-server application
2.3 Connectionless Transport: UDP
• UDP is an unreliable, connectionless protocol.
Unreliable service means UDP doesn’t guarantee that data will arrive at the destination-process.
Connectionless means there is no handshaking between sender & receiver before sending data.
• It provides following 2 services:
i) Process-to-process data delivery and
ii) Error checking.
• It does not provide flow control, error recovery, or congestion control.
• At the sender, UDP
→ takes messages from the application-process
→ attaches source- & destination-port-nos and
→ passes the resulting segment to the network-layer.
• At the receiver, UDP
→ examines the destination-port-no in the segment and
→ delivers the segment to the correct application-process.
• It is suitable for application programs that
→ need to send short messages &
→ cannot afford retransmission.
• UDP is suitable for many applications for the following reasons:
1) Finer Application Level Control over what Data is Sent, and when.
When an application-process passes data to UDP, the UDP
→ packs the data inside a segment and
→ immediately passes the segment to the network-layer.
On the other hand, in TCP, a congestion-control mechanism throttles the sender when the network is congested.
2) No Connection Establishment.
TCP uses a three-way handshake before it starts to transfer data.
UDP just immediately passes the data without any formal preliminaries.
Thus, UDP does not introduce any delay to establish a connection.
That is why DNS runs over UDP rather than TCP.
3) No Connection State.
TCP maintains connection-state in the end-systems.
This connection-state includes
Table 2.1: Popular Internet applications and their underlying transport protocols
Application | Application-Layer Protocol | Underlying Transport Protocol
Electronic mail | SMTP | TCP
2.3.1 UDP Segment Structure
Figure 2.6: UDP segment structure
• UDP Segment contains following fields (Figure 2.6):
1) Application Data: This field occupies the data-field of the segment.
2) Destination Port No: This field is used to deliver the data to the correct process running on the
destination-host (i.e. the demultiplexing function).
3) Length: This field specifies the number of bytes in the segment (header plus data).
4) Checksum: This field is used for error-detection.
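As a rough sketch, the segment layout can be packed and unpacked with Python's `struct` module. It assumes the standard 8-byte UDP header of four 16-bit fields in the order source port, destination port, length, checksum (the source port is the field shown in Figure 2.2). The helper names are hypothetical, and the checksum is left at 0 rather than computed.

```python
import struct

UDP_HEADER = "!HHHH"   # network byte order, four unsigned 16-bit fields
UDP_HEADER_LEN = 8


def build_udp_segment(src_port, dst_port, data, checksum=0):
    # Length counts the header plus the application data.
    length = UDP_HEADER_LEN + len(data)
    return struct.pack(UDP_HEADER, src_port, dst_port, length, checksum) + data


def parse_udp_segment(segment):
    src_port, dst_port, length, checksum = struct.unpack(
        UDP_HEADER, segment[:UDP_HEADER_LEN])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": segment[UDP_HEADER_LEN:length],
    }


segment = build_udp_segment(19157, 46428, b"hello")
fields = parse_udp_segment(segment)
```

Here `fields["length"]` is 13: 8 header bytes plus the 5 bytes of `b"hello"`.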
• On the sender:
Suppose that we have the following three 16-bit words:
0110011001100000
0101010101010101
1000111100001100
The sum of the first two 16-bit words is:
0110011001100000
0101010101010101
1011101110110101
Adding the third word to the above sum gives:
1011101110110101 → sum of first two 16-bit words
1000111100001100 → third 16-bit word
0100101011000010 → sum of all three 16-bit words (the overflow bit is wrapped around and added back)
Taking the 1’s complement of the final sum gives the checksum:
1011010100111101
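The three-word checksum example can be checked mechanically. The sketch below assumes the usual Internet-checksum rule that any carry out of the 16th bit is wrapped around and added back; the function names are illustrative.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry back into the low 16 bits."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum16(words):
    """1's complement of the wrapped 16-bit sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


words = [0b0110011001100000, 0b0101010101010101, 0b1000111100001100]
total = ones_complement_sum16(words)
check = checksum16(words)
print(format(total, "016b"))  # 0100101011000010
print(format(check, "016b"))  # 1011010100111101

# Receiver side: summing all words plus the checksum yields all 1s
# when no bit errors were introduced.
receiver_sum = ones_complement_sum16(words + [check])
```

If `receiver_sum` is anything other than `0xFFFF`, the receiver knows that at least one bit was corrupted.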
2.4 Principles of Reliable Data Transfer
• Figure 2.7 illustrates the framework of reliable data transfer protocol.
Figure 2.7: Reliable data transfer: Service model and service implementation
• On the sender, rdt_send() will be called by the upper layer when data has to be sent over the channel.
• On the receiver,
i) rdt_rcv() will be called when a packet arrives from the channel.
ii) deliver_data() will be called when the data has to be delivered to the upper layer.
2.4.1 Building a Reliable Data Transfer Protocol
2.4.1.1 Reliable Data Transfer over a Perfectly Reliable Channel: rdt1.0
• Consider data transfer over a perfectly reliable channel.
• We call this protocol rdt1.0.
Figure 2.8: rdt1.0 – A protocol for a completely reliable channel
• The finite-state machine (FSM) definitions for the rdt1.0 sender and receiver are shown in Figure 2.8.
• The sender and receiver FSMs have only one state.
• In FSM, following notations are used:
i) The arrows indicate the transition of the protocol from one state to another.
ii) The event causing the transition is shown above the horizontal line labelling the transition.
iii) The action taken when the event occurs is shown below the horizontal line.
iv) The dashed arrow indicates the initial state.
→ passes the data up to the upper layer (via the action deliver_data(data)).
2.4.1.2 Reliable Data Transfer over a Channel with Bit Errors: rdt2.0
• Consider data transfer over an unreliable channel in which bits in a packet may be corrupted.
• We call this protocol rdt2.0.
• The protocol uses both
→ positive acknowledgements (ACK) and
→ negative acknowledgements (NAK).
• The receiver uses these control messages to inform the sender about
→ what has been received correctly and
→ what has been received in error and thus requires retransmission.
• Reliable data transfer protocols based on retransmission are known as ARQ (Automatic Repeat reQuest) protocols.
• Three additional protocol capabilities are required in ARQ protocols:
1) Error Detection
A mechanism is needed to allow the receiver to detect when bit-errors have occurred.
For example, UDP uses the checksum field for error-detection.
Error-correction techniques go further and allow the receiver to both detect and correct packet bit-errors.
2) Receiver Feedback
The sender and receiver are typically executing on different end-systems.
So, the only way for the sender to learn about the status of the receiver is by the receiver providing
explicit feedback to the sender.
For example: ACK & NAK
3) Retransmission
A packet that is received in error at the receiver will be retransmitted by the sender.
• Figure 2.9 shows the FSM representation of rdt2.0.
Sender FSM
• The sender of rdt2.0 has 2 states:
1) In one state, the protocol is waiting for data to be passed down from the upper layer.
2) In other state, the protocol is waiting for an ACK or a NAK from the receiver.
i) If an ACK is received, the protocol
→ knows that the most recently transmitted packet has been received correctly
→ returns to the state of waiting for data from the upper layer.
ii) If a NAK is received, the protocol
→ retransmits the last packet and
→ waits for an ACK or NAK to be returned by the receiver.
• The sender will not send new data until it is sure that the receiver has correctly received the
current packet.
• Because of this behaviour, protocol rdt2.0 is known as a stop-and-wait protocol.
Receiver FSM
• The receiver of rdt2.0 has a single state.
• On packet arrival, the receiver replies with either an ACK or a NAK, depending on whether the received packet
is corrupted or not.
2.4.1.2.1 Sender Handles Garbled ACK/NAKs: rdt2.1
• Problem with rdt2.0:
If an ACK or NAK is corrupted, the sender cannot know whether the receiver has correctly
received the data or not.
• Solution: The sender resends the current data packet when it receives a garbled ACK or NAK packet.
Problem: This approach introduces duplicate packets into the channel.
Solution: Add a sequence-number field to the data packet.
The receiver only has to check the sequence-number to determine whether the received
packet is a retransmission or not.
• For a stop-and-wait protocol, a 1-bit sequence-number will be sufficient.
• A 1-bit sequence-number allows the receiver to know whether the sender is sending
→ a previously transmitted packet (the sequence-number is the same as that of the most recently received packet) or
→ a new packet (the sequence-number has changed).
• We call this protocol rdt2.1.
• Figures 2.10 and 2.11 show the FSM description for rdt2.1.
Figure 2.11: rdt2.1 receiver
2.4.1.2.2 Sender uses ACK/NAKs: rdt2.2
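• Protocol rdt2.1 uses both positive and negative acknowledgments from the receiver to the sender:
i) When an out-of-order packet is received, the receiver sends a positive acknowledgment (ACK).
ii) When a corrupted packet is received, the receiver sends a negative acknowledgment (NAK).
• The same effect can be achieved without NAKs: instead of sending a NAK, the receiver sends an ACK for the last correctly received packet.
• A sender that receives 2 ACKs for the same packet (duplicate ACKs) knows that the receiver did not correctly receive the packet following the one that was ACKed twice.
• We call this NAK-free protocol rdt2.2 (Figures 2.12 and 2.13).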
Figure 2.12: rdt2.2 sender
2.4.1.3 Reliable Data Transfer over a Lossy Channel with Bit Errors: rdt3.0
• Consider data transfer over an unreliable channel in which packets may be lost as well as corrupted.
• We call this protocol rdt3.0.
• Two problems must be solved by the rdt3.0:
1) How to detect packet loss?
2) What to do when packet loss occurs?
• Solution:
The sender
→ sends one packet & starts a timer and
→ waits for ACK from the receiver (okay to go ahead).
If the timer expires before ACK arrives, the sender retransmits the packet and restarts the
timer.
• The sender must wait at least as long as
1) A round-trip delay between the sender and receiver plus
2) Amount of time needed to process a packet at the receiver.
• Implementing a time-based retransmission mechanism requires a countdown timer.
• The timer must interrupt the sender after a given amount of time has expired.
• Figure 2.14 shows the sender FSM for rdt3.0, a protocol that reliably transfers data over a channel
that can corrupt or lose packets.
• Figure 2.15 shows how the protocol operates with no lost or delayed packets and how it handles lost
data packets.
• Because sequence-numbers alternate between 0 & 1, protocol rdt3.0 is known as the alternating-bit protocol.
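The alternating-bit behaviour can be illustrated with a small deterministic simulation. This is only a sketch under simplifying assumptions: losses are scripted through the hypothetical `drop_data` and `drop_ack` sets (indices of transmissions to lose), a lost packet or ACK is treated as an immediate timeout, and bit corruption is not modelled.

```python
def run_alternating_bit(messages, drop_data=(), drop_ack=()):
    """Simulate stop-and-wait with a 1-bit sequence-number.

    drop_data / drop_ack hold 0-based indices of the data and ACK
    transmissions that the channel loses. Returns (delivered, log),
    where log lists every (seq, message) put on the channel.
    """
    delivered, log = [], []
    seq = 0        # sender: sequence bit of the current packet
    expected = 0   # receiver: sequence bit it is waiting for
    data_tx = ack_tx = 0
    for msg in messages:
        while True:
            log.append((seq, msg))            # send packet, start timer
            data_lost = data_tx in drop_data
            data_tx += 1
            if data_lost:
                continue                      # timer expires: retransmit
            if seq == expected:               # new packet: deliver it
                delivered.append(msg)
                expected ^= 1
            # Duplicates are not delivered again, but are still ACKed.
            ack_lost = ack_tx in drop_ack
            ack_tx += 1
            if ack_lost:
                continue                      # timer expires: retransmit
            break                             # ACK arrived in time
        seq ^= 1                              # alternate between 0 and 1
    return delivered, log


delivered, log = run_alternating_bit(
    ["a", "b", "c"], drop_data={1}, drop_ack={0})
```

In this run the first ACK and then one retransmission are lost, so packet 0 is sent three times, yet each message is delivered exactly once and in order.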
2.4.2 Pipelined Reliable Data Transfer Protocols
• The sender is allowed to send multiple packets without waiting for acknowledgments.
• This is illustrated in Figure 2.16 (b).
• Pipelining has the following consequences:
1) The range of sequence-numbers must be increased.
2) The sender and receiver may have to buffer more than one packet.
• Two basic approaches toward pipelined error recovery can be identified:
1) Go-Back-N and 2) Selective repeat.
2.4.3 Go-Back-N (GBN)
• The sender is allowed to transmit multiple packets without waiting for an acknowledgment.
• But, the sender is constrained to have at most N unacknowledged packets in the pipeline.
Where N = window-size, which refers to the maximum no. of unacknowledged packets in the pipeline.
• GBN protocol is called a sliding-window protocol.
• Figure 2.17 shows the sender’s view of the range of sequence-numbers.
Figure 2.17: Sender’s view of sequence-numbers in Go-Back-N
• Figures 2.18 and 2.19 give an FSM description of the sender and receiver of a GBN protocol.
2.4.3.1 GBN Sender
• The sender must respond to 3 types of events:
1) Invocation from above.
When rdt_send() is called from above, the sender first checks to see if the window is full
i.e. whether there are N outstanding, unacknowledged packets.
i) If the window is not full, the sender creates and sends a packet.
ii) If the window is full, the sender simply returns the data back to the upper layer. This
is an implicit indication that the window is full.
2) Receipt of an ACK.
An acknowledgment for a packet with sequence-number n will be taken to be a cumulative
acknowledgment,
i.e. all packets with a sequence-number up to and including n have been correctly received at the receiver.
3) A Timeout Event.
A timer will be used to recover from lost data or acknowledgment packets.
i) If a timeout occurs, the sender resends all packets that have been previously sent but
that have not yet been acknowledged.
ii) If an ACK is received but there are still additional transmitted but not yet
acknowledged packets, the timer is restarted.
iii) If there are no outstanding unacknowledged packets, the timer is stopped.
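The three sender events can be sketched as methods on a small class. The `GbnSender` name and its attributes are assumptions for illustration; sequence-numbers are left unbounded for simplicity (a real protocol uses a finite, wrapping range), and the timer is modelled only as a running/stopped flag.

```python
class GbnSender:
    """Toy Go-Back-N sender reacting to the three event types."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 0              # oldest unacknowledged sequence-number
        self.next_seq = 0          # next sequence-number to be used
        self.buffer = {}           # seq -> data kept for retransmission
        self.timer_running = False
        self.sent = []             # log of (seq, data) put on the channel

    def rdt_send(self, data):
        # 1) Invocation from above.
        if self.next_seq >= self.base + self.N:
            return False           # window full: data is refused
        self.buffer[self.next_seq] = data
        self.sent.append((self.next_seq, data))
        if self.base == self.next_seq:
            self.timer_running = True   # first outstanding packet
        self.next_seq += 1
        return True

    def on_ack(self, n):
        # 2) Receipt of a cumulative ACK for everything up to n.
        if n < self.base or n >= self.next_seq:
            return                 # stale or invalid ACK: ignore
        for seq in range(self.base, n + 1):
            self.buffer.pop(seq, None)
        self.base = n + 1
        # Restart if packets are still outstanding, otherwise stop.
        self.timer_running = self.base != self.next_seq

    def on_timeout(self):
        # 3) Timeout: resend every sent-but-unacknowledged packet.
        for seq in range(self.base, self.next_seq):
            self.sent.append((seq, self.buffer[seq]))
        self.timer_running = self.base != self.next_seq


sender = GbnSender(window_size=4)
accepted = [sender.rdt_send(x) for x in "abcde"]   # fifth call hits a full window
sender.on_ack(1)                                   # packets 0 and 1 acknowledged
sender.rdt_send("e")                               # window has slid, so this fits
sender.on_timeout()                                # resends packets 2, 3 and 4
```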
2.4.3.2 GBN Receiver
• If a packet with sequence-number n is received correctly and is in order, the receiver
→ sends an ACK for packet n and
→ delivers the packet to the upper layer.
• In all other cases, the receiver
→ discards the packet and
→ resends an ACK for the most recently received in-order packet.
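The receiver rule can be sketched in a few lines. The `GbnReceiver` class is hypothetical, sequence-numbers are unbounded for simplicity, and returning `None` when nothing has yet been received in order is a choice made for this example.

```python
class GbnReceiver:
    """Toy Go-Back-N receiver: accepts only the next in-order packet."""

    def __init__(self):
        self.expected = 0      # sequence-number of the next in-order packet
        self.delivered = []    # data passed to the upper layer

    def rdt_rcv(self, seq, data, corrupt=False):
        """Return the sequence-number to ACK (None if nothing is in order yet)."""
        if not corrupt and seq == self.expected:
            self.delivered.append(data)
            self.expected += 1
        # In every other case the packet is discarded; either way the ACK
        # names the most recently received in-order packet.
        last_in_order = self.expected - 1
        return last_in_order if last_in_order >= 0 else None


receiver = GbnReceiver()
acks = [receiver.rdt_rcv(0, "a"),
        receiver.rdt_rcv(2, "c"),   # out of order: discarded, ACK 0 resent
        receiver.rdt_rcv(1, "b"),
        receiver.rdt_rcv(2, "c")]
```

Because out-of-order packets are discarded rather than buffered, the receiver needs to remember only `expected`.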
2.4.3.3 Operation of the GBN Protocol
• Figure 2.20 shows the operation of the GBN protocol for the case of a window-size of four packets.
• The sender sends packets 0 through 3.
• The sender then must wait for one or more of these packets to be acknowledged before proceeding.
• As each successive ACK (for ex, ACK0 and ACK1) is received, the window slides forward and the
sender can transmit one new packet.
2.4.4 Selective Repeat (SR)
• Problem with GBN:
GBN suffers from performance problems.
When the window-size and bandwidth-delay product are both large, many packets can be in
the pipeline.
Thus, a single packet error results in retransmission of a large number of packets.
• Solution: Use Selective Repeat (SR).
Figure 2.21: Selective-repeat (SR) sender and receiver views of sequence-number space
• The sender retransmits only those packets that it suspects were received in error (i.e. lost or corrupted) at the receiver.
• Thus, SR avoids unnecessary retransmissions. Hence, the name “selective-repeat”.
• The receiver individually acknowledges correctly received packets.
• A window-size N is used to limit the no. of outstanding, unacknowledged packets in the pipeline.
2.4.4.1 SR Sender
• The various actions taken by the SR sender are as follows:
2.4.4.2 SR Receiver
• The various actions taken by the SR receiver are as follows:
1) Packet with sequence-number in [rcv_base, rcv_base+N-1] is correctly received.
In this case,
→ received packet falls within the receiver’s window and
→ selective ACK packet is returned to the sender.
If the packet was not previously received, it is buffered.
If this packet has a sequence-number equal to rcv_base, then this packet, and any previously
buffered and consecutively numbered packets are delivered to the upper layer.
The receive-window is then moved forward by the no. of packets delivered to the upper layer.
For example: consider Figure 2.22.
¤ When a packet with a sequence-number of rcv_base=2 is received, it and packets 3, 4,
and 5 can be delivered to the upper layer.
2) Packet with sequence-number in [rcv_base-N, rcv_base-1] is correctly received.
In this case, an ACK must be generated, even though this is a packet that the receiver has
previously acknowledged.
3) Otherwise.
Ignore the packet.
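The three receiver cases can be sketched as follows. The `SrReceiver` class is an illustrative assumption, and sequence-numbers are unbounded here, whereas a real SR protocol uses a finite range with wrap-around.

```python
class SrReceiver:
    """Toy Selective Repeat receiver with a buffer for out-of-order packets."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0
        self.buffer = {}       # seq -> data received but not yet delivered
        self.delivered = []

    def rdt_rcv(self, seq, data):
        """Return the sequence-number that is ACKed, or None if ignored."""
        if self.rcv_base <= seq <= self.rcv_base + self.N - 1:
            # Case 1: inside the window. Buffer it if it is new.
            self.buffer.setdefault(seq, data)
            # Deliver rcv_base and any consecutively numbered buffered packets.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return seq
        if self.rcv_base - self.N <= seq <= self.rcv_base - 1:
            # Case 2: already delivered, but the ACK must be sent again.
            return seq
        return None            # Case 3: ignore the packet


receiver = SrReceiver(window_size=4)
acks = [receiver.rdt_rcv(s, f"pkt{s}") for s in (0, 1, 3, 4, 5)]
before = list(receiver.delivered)      # only pkt0 and pkt1 so far
receiver.rdt_rcv(2, "pkt2")            # fills the gap at rcv_base=2
```

After packet 2 arrives, packets 2, 3, 4 and 5 are delivered together and `rcv_base` moves to 6.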
2.4.5 Summary of Reliable Data Transfer Mechanisms and their Use
Table 2.2: Summary of reliable data transfer mechanisms and their use
Mechanism | Use, Comments
Checksum | Used to detect bit errors in a transmitted packet.
Timer | Used to timeout/retransmit a packet because the packet (or its ACK) was lost. Because timeouts can occur when a packet is delayed but not lost, duplicate copies of a packet may be received by a receiver.
Sequence-number | Used for sequential numbering of packets of data flowing from sender to receiver. Gaps in the sequence-numbers of received packets allow the receiver to detect a lost packet. Packets with duplicate sequence-numbers allow the receiver to detect duplicate copies of a packet.
Acknowledgment | Used by the receiver to tell the sender that a packet or set of packets has been received correctly. Acknowledgments will typically carry the sequence-number of the packet or packets being acknowledged. Acknowledgments may be individual or cumulative, depending on the protocol.
Negative acknowledgment | Used by the receiver to tell the sender that a packet has not been received correctly. Negative acknowledgments will typically carry the sequence-number of the packet that was not received correctly.
Window, pipelining | The sender may be restricted to sending only packets with sequence-numbers that fall within a given range. By allowing multiple packets to be transmitted but not yet acknowledged, sender utilization can be increased over a stop-and-wait mode of operation.
2.5 Connection-Oriented Transport: TCP
• TCP is a reliable connection-oriented protocol.
Connection-oriented means a connection is established between sender & receiver before sending
the data.
Reliable service means TCP guarantees that the data will arrive at the destination-process
correctly.
• TCP provides flow-control, error-control and congestion-control.
1) Connection Oriented
TCP is said to be connection-oriented. This is because
The 2 application-processes must first establish a connection with each other before they
begin communication.
Both application-processes will initialize many state-variables associated with the connection.
2) Runs in the End Systems
TCP runs only in the end-systems but not in the intermediate routers.
The routers do not maintain any state-variables associated with the connection.
3) Full Duplex Service
TCP connection provides a full-duplex service.
Both application-processes can transmit and receive the data at the same time.
4) Point-to-Point
A TCP connection is point-to-point, i.e. it is always between a single sender and a single receiver.
So, multicasting is not possible.
5) Three-way Handshake
Connection-establishment process is referred to as a three-way handshake. This is because
3 segments are sent between the two hosts:
i) The client sends a first-segment.
ii) The server responds with a second-segment and
iii) Finally, the client responds again with a third segment, which may contain payload (or data).
6) Maximum Segment Size (MSS)
MSS limits the maximum amount of data that can be placed in a segment.
For example: Ethernet has an MTU of 1,500 bytes, so with 40 bytes of TCP/IP headers the MSS is typically 1,460 bytes.
At Receiver
i) The segment’s data is placed in the receive-buffer.
ii) The application reads the stream-of-data from the receive-buffer.
2.5.2 TCP Segment Structure
• The segment consists of header-fields and a data-field.
• The data-field contains a chunk-of-data.
• When TCP sends a large file, it breaks the file into chunks of size MSS.
• Figure 2.24 shows the structure of the TCP segment.
Figure 2.24: TCP segment structure
i) ACK
¤ This bit indicates that value of acknowledgment field is valid.
ii) RST, SYN & FIN
¤ These bits are used for connection setup and teardown.
iii) PSH
¤ This bit indicates the sender has invoked the push operation.
iv) URG
¤ This bit indicates the segment contains urgent-data.
5) Receive Window
This field defines receiver’s window size
2.5.2.1 Sequence Numbers and Acknowledgment Numbers
Sequence Numbers
• The sequence-number is used for sequential numbering of the data flowing from sender to
receiver. In TCP, the sequence-number of a segment is the byte-stream number of the first byte in the segment.
• Applications:
1) Gaps in the sequence-numbers of received packets allow the receiver to detect a lost packet.
2) Packets with duplicate sequence-numbers allow the receiver to detect duplicate copies of a
packet.
Acknowledgment Numbers
• The acknowledgment-number is used by the receiver to tell the sender that data has been
received correctly.
• In TCP, the acknowledgment-number carries the sequence-number of the next byte the receiver expects from the sender.
Figure 2.25: Sequence and acknowledgment-numbers for a simple Telnet application over TCP
2.5.2.2 Telnet: A Case Study for Sequence and Acknowledgment Numbers
• Telnet is a popular application-layer protocol used for remote-login.
• Telnet runs over TCP.
• Telnet is designed to work between any pair of hosts.
• As shown in Figure 2.27, suppose client initiates a Telnet session with server.
• Now suppose the user types a single letter, ‘C’.
• Three segments are sent between client & server:
1) First Segment
The first-segment is sent from the client to the server.
The segment contains
→ letter ‘C’
→ sequence-number 42
→ acknowledgment-number 79
2) Second Segment
The second-segment is sent from the server to the client.
The segment serves two purposes:
i) It provides an acknowledgment of the data the server has received.
ii) It is used to echo back the letter ‘C’.
The acknowledgment for client-to-server data is carried in a segment carrying server-to-client
data.
This acknowledgment is said to be piggybacked on the server-to-client data-segment.
3) Third Segment
The third segment is sent from the client to the server.
The segment serves one purpose:
i) It acknowledges the data the client has received from the server.
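The numbers carried by the three segments can be derived with a short sketch. Only the first segment's values (sequence-number 42, acknowledgment-number 79) come from the example; the values for the second and third segments are computed here under the assumption that each acknowledgment-number equals the peer's sequence-number plus the number of data bytes received. The `reply` helper is hypothetical.

```python
def reply(segment, data=b""):
    """Build the segment sent in response to `segment` (illustrative only)."""
    return {
        # Continue our own byte stream where the peer expects it.
        "seq": segment["ack"],
        # Acknowledge every byte received: the next byte we expect.
        "ack": segment["seq"] + len(segment["data"]),
        "data": data,
    }


first = {"seq": 42, "ack": 79, "data": b"C"}   # client -> server
second = reply(first, data=b"C")               # server echoes 'C', piggybacked ACK
third = reply(second)                          # client ACKs the echo, no data

for name, seg in (("first", first), ("second", second), ("third", third)):
    print(name, seg["seq"], seg["ack"], seg["data"])
```

Under that assumption the second segment carries sequence-number 79 and acknowledgment-number 43, and the third carries sequence-number 43 and acknowledgment-number 80.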
Figure 2.27: Sequence and acknowledgment-numbers for a simple Telnet application over TCP
2.5.3 Round Trip Time Estimation and Timeout
• TCP uses a timeout/retransmit mechanism to recover from lost segments.
• Clearly, the timeout should be larger than the round-trip-time (RTT) of the connection.
• DevRTT is defined as
“An estimate of how much SampleRTT typically deviates from EstimatedRTT.”
• If the SampleRTT values have little fluctuation, then DevRTT will be small.
If the SampleRTT values have huge fluctuation, then DevRTT will be large.
2.5.3.2 Setting and Managing the Retransmission Timeout Interval
• What value should be used for timeout interval?
• Clearly, the interval should be greater than or equal to EstimatedRTT.
• Timeout interval is given by:
TimeoutInterval = EstimatedRTT + 4 × DevRTT
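A possible way to compute the timeout interval is sketched below. It assumes the standard exponentially weighted moving averages with weights alpha = 0.125 and beta = 0.25 and the rule TimeoutInterval = EstimatedRTT + 4 × DevRTT. The initialisation from the first sample and the update order (DevRTT before EstimatedRTT) are borrowed from RFC 6298 and are assumptions here, as is the `RttEstimator` class itself.

```python
class RttEstimator:
    """Toy RTT estimator; all times are in milliseconds."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2   # assumed initial deviation

    def update(self, sample_rtt):
        # DevRTT tracks how far samples stray from the current estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        # EstimatedRTT is a weighted average favouring recent samples.
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)
t0 = est.timeout_interval()   # 300.0
t1 = est.update(100.0)        # 250.0: a steady sample shrinks DevRTT
t2 = est.update(120.0)        # 235.0
```

The safety margin of 4 × DevRTT grows when samples fluctuate and shrinks when they are steady, which keeps the timeout above EstimatedRTT without making it needlessly long.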
2.5.4 Reliable Data Transfer
• IP is unreliable i.e. IP does not guarantee data delivery.
IP does not guarantee in-order delivery of data.
IP does not guarantee the integrity of the data.
• TCP creates a reliable data-transfer-service on top of IP’s unreliable-service.
• At the receiver, reliable-service means
→ data-stream is uncorrupted
→ data-stream is without duplication and
→ data-stream is in sequence.
2.5.4.1.1 First Scenario
• As shown in Figure 2.28, Host-A sends one segment to Host-B.
• Assume the acknowledgment from B to A gets lost.
• In this case, the timeout event occurs, and Host-A retransmits the same segment.
• When Host-B receives the retransmission, it observes from the sequence-no that the data has already been received.
• Thus, Host-B will discard the retransmitted-segment.
2.5.4.1.2 Second Scenario
• As shown in Figure 2.29, Host-A sends two segments back-to-back.
• Host-B sends two separate acknowledgments.
• Suppose neither of the acknowledgments arrives at Host-A before the timeout.
• When the timeout event occurs, Host-A resends the first-segment and restarts the timer.
• The second-segment will not be retransmitted as long as the ACK for the second-segment arrives before the new timeout.
Figure 2.29: Segment 100 not retransmitted
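The timeout behaviour in the second scenario can be sketched with a toy sender that keeps one retransmission timer, resends only the oldest unacknowledged segment on a timeout, and treats ACKs as cumulative. The class `RetransmitSender` and the byte counts used in the example (a first segment of 8 bytes starting at sequence-no 92, followed by a 20-byte segment at 100) are illustrative assumptions; there is no real clock, so timeouts are triggered by calling `on_timeout()`.

```python
class RetransmitSender:
    """Toy sender with a single retransmission timer (no real clock)."""

    def __init__(self, initial_seq):
        self.send_base = initial_seq   # oldest unacknowledged byte
        self.next_seq = initial_seq    # next byte to be sent
        self.unacked = {}              # seq -> segment length in bytes
        self.retransmitted = []        # seqs resent because of a timeout
        self.timer_running = False

    def send(self, length):
        seq = self.next_seq
        self.unacked[seq] = length
        self.next_seq += length
        self.timer_running = True      # timer covers the oldest unacked segment
        return seq

    def on_timeout(self):
        if self.unacked:
            # Resend only the oldest unacknowledged segment, then restart the timer.
            self.retransmitted.append(min(self.unacked))
            self.timer_running = True

    def on_ack(self, ack_no):
        # Cumulative ACK: every byte below ack_no has been received.
        if ack_no > self.send_base:
            self.send_base = ack_no
            self.unacked = {s: n for s, n in self.unacked.items() if s + n > ack_no}
            self.timer_running = bool(self.unacked)
```

Sending two segments, firing one timeout, and then delivering the cumulative ACK 120 leaves `retransmitted == [92]`: segment 100 is never resent.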
2.5.4.2 Fast Retransmit
• The timeout period can be relatively long.
• The sender can often detect packet-loss well before the timeout occurs by noting duplicate ACKs.
• A duplicate ACK is an ACK that re-acknowledges a segment for which the sender has already received an acknowledgment (Figure 2.31).
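• The receiver generates ACKs as follows (event at the receiver, followed by the receiver's action):
1) Event: Arrival of in-order segment with expected sequence-number. All data up to expected sequence-number already acknowledged.
Action: Delayed ACK. Wait up to 500 msec for arrival of another in-order segment. If next in-order segment does not arrive in this interval, send an ACK.
2) Event: Arrival of in-order segment with expected sequence-number. One other in-order segment waiting for ACK transmission.
Action: Immediately send single cumulative ACK, ACKing both in-order segments.
3) Event: Arrival of out-of-order segment with higher-than-expected sequence-number. Gap detected.
Action: Immediately send duplicate ACK, indicating sequence-number of next expected-byte.
4) Event: Arrival of segment that partially or completely fills in gap in received-data.
Action: Immediately send ACK.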
Figure 2.31: Fast retransmit: retransmitting the missing segment before the segment’s timer expires
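A rough sketch of the sender-side bookkeeping behind fast retransmit is shown here. It assumes the common threshold of three duplicate ACKs and a hypothetical helper name `fast_retransmit_events`; ACK numbers are cumulative, so an ACK equal to the current `send_base` is counted as a duplicate.

```python
DUP_ACK_THRESHOLD = 3  # assumed: retransmit on the third duplicate ACK


def fast_retransmit_events(acks, send_base):
    """Return the sequence numbers that would be fast-retransmitted.

    acks      -- ACK numbers in arrival order
    send_base -- oldest unacknowledged byte before the first ACK arrives
    """
    dup_count = 0
    retransmitted = []
    for ack in acks:
        if ack > send_base:
            # New data acknowledged: slide the window, reset the counter.
            send_base = ack
            dup_count = 0
        elif ack == send_base:
            # Same ACK number again: the receiver is still missing send_base.
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                retransmitted.append(send_base)
    return retransmitted
```

For example, one new ACK for 100 followed by three more ACKs for 100 triggers a single retransmission of the segment starting at 100, without waiting for the timer.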
2.5.5 Flow Control
• TCP provides a flow-control service to its applications.
• A flow-control service eliminates the possibility of the sender overflowing the receiver-buffer.
Figure 2.32: a) Send buffer and b) Receive Buffer
1) MaxSendBuffer: Size of the send-buffer allocated to the sender.
2) MaxRcvBuffer: Size of the receive-buffer allocated to the receiver.
3) LastByteSent: The no. of the last byte sent from the send-buffer at the sender.
4) LastByteAcked: The no. of the last byte acknowledged in the send-buffer at the sender.
5) LastByteRead: The no. of the last byte read from the receive-buffer at the receiver.
6) LastByteRcvd: The no. of the last byte that has arrived & been placed in the receive-buffer at the receiver.
Send Buffer
• Sender maintains a send buffer, divided into 3 parts namely
1) Acknowledged data
2) Unacknowledged data and
3) Data to be transmitted
• Send buffer maintains 2 pointers: LastByteAcked and LastByteSent. The relation b/w these two is:
LastByteAcked ≤ LastByteSent
Receive Buffer
• Receiver throttles the sender by advertising a window that is no larger than the amount of free space that it can buffer:
AdvertisedWindow = MaxRcvBuffer − (LastByteRcvd − LastByteRead)
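The window arithmetic can be sketched as follows, assuming the advertised window is the free space in the receive-buffer, i.e. MaxRcvBuffer − (LastByteRcvd − LastByteRead), and that the sender keeps its unacknowledged data within that window. The function names and byte values are hypothetical and chosen only for illustration.

```python
def advertised_window(max_rcv_buffer, last_byte_rcvd, last_byte_read):
    """Free space left in the receive-buffer, in bytes."""
    buffered = last_byte_rcvd - last_byte_read   # data waiting for the application
    return max_rcv_buffer - buffered


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if nbytes more can be sent without exceeding the advertised window."""
    in_flight = last_byte_sent - last_byte_acked  # unacknowledged data
    return in_flight + nbytes <= window


# Receiver with a 4096-byte buffer holding 2000 unread bytes.
window = advertised_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
```

Here `window` is 2096 bytes, so a sender with 1000 bytes in flight may send another 1000 bytes but not another 1200.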
2.5.6 TCP Connection Management
2.5.6.1 Connection Setup & Data Transfer
• To set up the connection, three segments are sent between the two hosts. Therefore, this process is referred to as a three-way handshake.
• Suppose a client-process wants to initiate a connection with a server-process.
• Figure 2.33 illustrates the steps involved:
Step 1: Client sends a connection-request segment to the Server
The client first sends a connection-request segment to the server.
The connection-request segment contains:
1) SYN bit is set to 1.
2) Initial sequence-number (client_isn).
The SYN segment is encapsulated within an IP datagram and sent to the server.
Step 2: Server sends a connection-granted segment to the Client
Then, the server
→ extracts the SYN segment from the datagram
→ allocates the buffers and variables to the connection and
→ sends a connection-granted segment to the client.
The connection-granted segment contains:
1) SYN bit is set to 1.
2) Acknowledgment field is set to client_isn+1.
3) Initial sequence-number (server_isn).
Step 3: Client sends an ACK segment to the Server
Finally, the client
→ allocates buffers and variables to the connection and
→ sends an ACK segment to the server
The ACK segment acknowledges the server’s connection-granted segment (acknowledgment field is set to server_isn+1).
SYN bit is set to zero, since the connection is established.
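The three steps can be replayed with a small sketch that only tracks the SYN bit, the sequence-number and the acknowledgment field. The `Segment` type and the function `three_way_handshake` are hypothetical names; the third segment's values (sequence-number client_isn+1, acknowledgment server_isn+1) follow standard TCP behaviour and are stated here as an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    syn: int                    # SYN bit
    seq: int                    # sequence-number field
    ack: Optional[int] = None   # acknowledgment field (None if unused)


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged during connection setup."""
    # Step 1: client -> server, connection-request (SYN) segment.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, connection-granted (SYNACK) segment.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client -> server, plain ACK with the SYN bit cleared.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]
```

Calling `three_way_handshake(1000, 5000)` yields acknowledgment fields 1001 in the second segment and 5001 in the third.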
2.5.6.2 Connection Release
• Either of the two processes in a connection can end the connection.
• When a connection ends, the “resources” in the hosts are de-allocated.
• Suppose the client decides to close the connection.
• Figure 2.34 illustrates the steps involved:
1) The client-process issues a close command.
¤ Then, the client sends a shutdown-segment to the server.
¤ This segment has a FIN bit set to 1.
2) The server responds with an acknowledgment to the client.
3) The server then sends its own shutdown-segment.
¤ This segment has a FIN bit set to 1.
4) Finally, the client acknowledges the server’s shutdown-segment.
2.6 Principles of Congestion Control
2.6.1 The Causes and the Costs of Congestion
2.6.1.1 Scenario 1: Two Senders, a Router with Infinite Buffers
• Two hosts (A & B) each have a connection that shares a single hop b/w source & destination.
• This is illustrated in Figure 2.35.
Figure 2.35: Congestion scenario 1: Two connections sharing a single hop with infinite buffers
• Let
Sending-rate of Host-A = λin bytes/sec
Outgoing Link’s capacity = R
• Packets from Hosts A and B pass through a router and over a shared outgoing link.
• The router has buffers.
• The buffers store incoming packets when the packet-arrival rate exceeds the outgoing link’s capacity.
Figure 2.36: Congestion scenario 1: Throughput and delay as a function of host sending-rate
However, for a sending-rate above R/2, the throughput at the receiver is only R/2. (Figure 2.36a)
• Conclusion: The link cannot deliver packets to a receiver at a steady-state rate that exceeds R/2.
Right Hand Graph
• The right graph plots the average delay as a function of the connection-sending-rate (Figure 2.36b).
• As the sending-rate approaches R/2, the average delay becomes larger and larger.
However, for a sending-rate above R/2, the average delay becomes infinite.
• Conclusion: Large queuing delays are experienced as the packet arrival rate nears the link capacity.
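The two conclusions of scenario 1 can be illustrated with a small sketch. It assumes both hosts send at the same rate λin, so each connection gets at most R/2, and it uses a simple fluid view of the infinite buffer in which the backlog grows at the excess arrival rate. The function names and this fluid approximation are assumptions made for illustration, not a queueing model from the notes.

```python
def per_connection_throughput(sending_rate, link_capacity):
    """Throughput seen by one of the two connections sharing the link."""
    return min(sending_rate, link_capacity / 2)


def backlog_after(sending_rate, link_capacity, seconds):
    """Data queued at the router after `seconds`, in a fluid approximation.

    Both hosts send at `sending_rate`, so the total arrival rate is
    2 * sending_rate; anything above the link capacity piles up in the buffer.
    """
    excess = max(0, 2 * sending_rate - link_capacity)
    return excess * seconds
```

With R = 1.0, a sending-rate of 0.3 is delivered in full, while a sending-rate of 0.75 is capped at 0.5 and the backlog keeps growing with time, which is why the average delay is unbounded above R/2.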
2.6.1.2 Scenario 2: Two Senders and a Router with Finite Buffers
• Here, we have 2 assumptions (Figure 2.37):
1) The amount of router buffering is finite.
Packets will be dropped when arriving to an already full buffer.
2) Each connection is reliable.
If a packet is dropped at the router, the sender will eventually retransmit it.
• Let
Application’s sending-rate of Host-A = λin bytes/sec
Transport-layer’s sending-rate of Host-A = λin′ bytes/sec (also called offered-load to network)
Outgoing Link’s capacity = R
Figure 2.37: Scenario 2: Two hosts (with retransmissions) and a router with finite buffers
Case 1 (Figure 2.38(a)):
• Host-A sends a packet only when a buffer is free.
• In this case,
→ no loss occurs
→ λin will be equal to λin′, and
• The sender must perform retransmissions to compensate for lost packets due to buffer overflow.
2.6.1.3 Scenario 3: Four Senders, Routers with Finite Buffers, and Multihop Paths
• Four hosts transmit packets, each over overlapping two-hop paths.
• This is illustrated in Figure 2.39.
Figure 2.39: Four senders, routers with finite buffers, and multihop paths
• Consider the connection from Host-A to Host C, passing through routers R1 and R2.
• The A–C connection
If λin′ is extremely large for all connections, then the arrival rate of B–D traffic at R2 can be
much larger than that of the A–C traffic.
The A–C and B–D traffic must compete at router R2 for the limited amount of buffer-space.
Thus, the amount of A–C traffic that successfully gets through R2 becomes smaller and
smaller as the offered-load from B–D gets larger and larger.
In the limit, as the offered-load approaches infinity, an empty buffer at R2 is immediately
filled by a B–D packet, and the throughput of the A–C connection at R2 goes to zero.
When a packet is dropped along a path, the transmission capacity that was used at each of the upstream links to forward that packet ends up having been wasted.
2.6.2 Approaches to Congestion Control
• Congestion-control approaches can be classified based on whether the network-layer provides any
explicit assistance to the transport-layer:
1) End-to-end Congestion Control
The network-layer provides no explicit support to the transport-layer for congestion-control.
Even the presence of congestion must be inferred by the end-systems based only on
observed network-behavior.
Segment loss is taken as an indication of network-congestion and the window-size is
decreased accordingly.
2) Network Assisted Congestion Control
Network-layer components provide explicit feedback to the sender regarding congestion.
This feedback may be a single bit indicating congestion at a link.
Congestion information is fed back from the network to the sender in one of two ways:
i) Direct feedback may be sent from a network-router to the sender (Figure 2.40).
¤ This form of notification typically takes the form of a choke packet.
ii) A router marks a field in a packet flowing from sender to receiver to indicate congestion.
¤ Upon receipt of a marked packet, the receiver then notifies the sender of the congestion indication.
¤ This form of notification takes at least a full round-trip time.
2.6.3 Network Assisted Congestion Control Example: ATM ABR Congestion Control
• ATM (Asynchronous Transfer Mode) protocol uses network-assisted approach for congestion-control.
• ABR (Available Bit Rate) has been designed as an elastic data-transfer-service.
i) When the network is underloaded, ABR has to take advantage of the spare available bandwidth.
ii) When the network is congested, ABR should reduce its transmission-rate.
Figure 2.41: Congestion-control framework for ATM ABR service
• Figure 2.41 shows the framework for ATM ABR congestion-control.
• Data-cells are transmitted from a source to a destination through a series of intermediate switches.
• RM-cells are placed between the data-cells. (RM = Resource Management)
• The RM-cells are used to send congestion-related information to the hosts & switches.
• When an RM-cell arrives at a destination, the cell will be sent back to the sender.
• Thus, RM-cells can be used to provide both
→ direct network feedback and
→ network feedback via the receiver.
2.6.3.1 Three Methods to indicate Congestion
• ATM ABR congestion-control is a rate-based approach.
• ABR provides 3 mechanisms for indicating congestion-related information:
1) EFCI Bit
Each data-cell contains an EFCI bit. (EFCI = Explicit Forward Congestion Indication)
A congested-switch sets the EFCI bit to 1 to signal congestion to the destination.
The destination must check the EFCI bit in all received data-cells.
If the most recently received data-cell has the EFCI bit set to 1, then the destination sets the CI bit of the RM-cell to 1 and sends the RM-cell back to the sender.
2) CI and NI Bits
The rate of RM-cell interspersion is a tunable parameter.
The default value is one RM-cell every 32 data-cells.
The RM-cells have a CI bit and an NI bit that can be set by a congested-switch. (CI = Congestion Indication, NI = No Increase)
A switch
→ sets the NI bit to 1 in a RM-cell under mild congestion and
→ sets the CI bit to 1 under severe congestion conditions.
3) ER Setting
Each RM-cell also contains an ER field. (ER = Explicit Rate)
A congested-switch may lower the value contained in the ER field in a passing RM-cell.
In this manner, the ER field will be set to the minimum supportable rate of all switches on the path.
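The ER mechanism amounts to taking a running minimum along the path, as the sketch below illustrates. The function name `er_after_path` and the example rates are hypothetical; each switch is modelled only by the rate it can currently support.

```python
def er_after_path(source_er, switch_rates):
    """Value of the ER field once an RM-cell has passed every switch.

    source_er    -- rate written into the ER field by the source
    switch_rates -- supportable rate of each switch, in path order
    """
    er = source_er
    for rate in switch_rates:
        # A switch may lower the ER field, but never raises it.
        er = min(er, rate)
    return er
```

For a source requesting 100.0 over switches that can support 80.0, 40.0 and 60.0, the RM-cell returns with ER = 40.0, the rate of the tightest switch.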
2.7 TCP Congestion Control
2.7.1 TCP Congestion Control
• TCP has a congestion-control mechanism.
• TCP uses end-to-end congestion-control rather than network-assisted congestion-control.
• Here is how it works:
Each sender limits the rate at which it sends traffic into its connection as a function of
perceived congestion.
i) If sender perceives that there is little congestion, then sender increases its data-rate.
ii) If sender perceives that there is congestion, then sender reduces its data-rate.
• This approach raises three questions:
1) How does a sender limit the rate at which it sends traffic into its connection?
2) How does a sender perceive that there is congestion on the path?
3) What algorithm should the sender use to change its data-rate?
• The sender keeps track of an additional variable called the congestion-window (cwnd).
• The congestion-window imposes a constraint on the data-rate of a sender.
• The amount of unacknowledged-data at a sender will not exceed the minimum of cwnd and rwnd, that is:
LastByteSent − LastByteAcked ≤ min(cwnd, rwnd)
• The sender’s data-rate is roughly cwnd/RTT bytes/sec.
• Explanation of Loss event:
A “loss event” at a sender is defined as the occurrence of either
→ timeout or
→ receipt of 3 duplicate ACKs from the receiver.
Due to excessive congestion, the router-buffer along the path overflows. This causes a
datagram to be dropped.
The dropped datagram, in turn, results in a loss event at the sender.
The sender considers the loss event as an indication of congestion on the path.
• How is congestion detected?
Suppose the network is congestion-free.
Acknowledgments for previously unacknowledged segments will be received at the sender.
TCP
→ will take the arrival of these acknowledgments as an indication that all is well and
→ will use acknowledgments to increase the window-size (& hence data-rate).
TCP is said to be self-clocking because it uses the arriving acknowledgments to trigger (clock) each increase in the window-size.
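As a rough illustration of how cwnd constrains the sender, the sketch below assumes the unacknowledged data must stay within min(cwnd, rwnd) and that the resulting data-rate is approximately cwnd/RTT. The function names and byte values are hypothetical.

```python
def sendable_bytes(last_byte_sent, last_byte_acked, cwnd, rwnd):
    """How many more bytes the sender may put on the wire right now."""
    in_flight = last_byte_sent - last_byte_acked   # unacknowledged data
    return max(0, min(cwnd, rwnd) - in_flight)


def approx_rate(cwnd, rtt):
    """Rough sending rate in bytes/sec: one window per round-trip time."""
    return cwnd / rtt
```

With cwnd = 8000 and rwnd = 6000 the effective window is 6000 bytes, so a sender with 4000 bytes in flight may send 2000 more; a 14600-byte cwnd over a 0.5 s RTT gives roughly 29200 bytes/sec.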
2.7.1.1 Slow Start
• When a TCP connection begins, the value of cwnd is initialized to 1 MSS.
• TCP doubles the number of packets sent every RTT on successful transmission.
• Here is how it works:
As shown in Figure 2.42, the TCP
→ sends the first-segment into the network and
→ waits for an acknowledgment.
When an acknowledgment arrives, the sender
→ increases the congestion-window by one MSS and
→ sends out 2 segments.
When two acknowledgments arrive, the sender
→ increases the congestion-window by one MSS for each acknowledgment and
→ sends out 4 segments.
This process results in a doubling of the sending-rate every RTT.
• Thus, the TCP data-rate starts slow but grows exponentially during the slow start phase.
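The doubling can be checked with a short sketch that counts cwnd in units of MSS and assumes every segment sent in a round is acknowledged, with each acknowledgment adding one MSS. The function name `slow_start` is hypothetical.

```python
def slow_start(rounds, mss=1):
    """cwnd at the start of each RTT, assuming no loss.

    Every segment sent in a round is acknowledged, and each
    acknowledgment grows cwnd by one MSS.
    """
    cwnd = mss
    history = [cwnd]
    for _ in range(rounds):
        acks = cwnd // mss     # one ACK per segment sent this RTT
        cwnd += acks * mss     # +1 MSS per ACK, i.e. cwnd doubles
        history.append(cwnd)
    return history
```

Three loss-free rounds give the sequence 1, 2, 4, 8 MSS.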
2) When the value of cwnd equals ssthresh, TCP enters the congestion avoidance state.
3) When three duplicate ACKs are detected, TCP
→ performs a fast retransmit and
→ enters the fast recovery state.
• TCP’s behavior in slow start is summarized in the FSM description in Figure 2.43.
2.7.1.2 Congestion Avoidance
• On entry to congestion-avoidance state, the value of cwnd is approximately half its previous value.
• Thus, the value of cwnd is increased by a single MSS every RTT.
• The sender increases cwnd by MSS · (MSS/cwnd) bytes whenever a new acknowledgment arrives.
• When should linear increase (of 1 MSS per RTT) end?
1) When a timeout occurs.
When the timeout occurs,
→ value of ssthresh is set to half the value of cwnd at the time of the loss and
→ value of cwnd is then set to 1 MSS.
2) When triple duplicate ACK occurs.
When the triple duplicate ACKs are received,
→ value of ssthresh is set to half the value of cwnd at the time of the loss and
→ value of cwnd is halved.
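A minimal sketch of these congestion-avoidance rules follows, with cwnd and MSS in bytes. The function names are hypothetical, and the triple-duplicate-ACK handler simply halves cwnd; any extra window inflation that real implementations apply during fast recovery is deliberately left out.

```python
def on_new_ack(cwnd, mss):
    """Per-ACK increase: about one MSS per RTT in total."""
    return cwnd + mss * mss / cwnd


def on_timeout(cwnd, mss):
    """Return (new_cwnd, ssthresh) after a timeout."""
    ssthresh = cwnd / 2          # half of cwnd at the time of the loss
    return mss, ssthresh


def on_triple_dup_ack(cwnd):
    """Return (new_cwnd, ssthresh) after three duplicate ACKs."""
    ssthresh = cwnd / 2
    return cwnd / 2, ssthresh


# One RTT's worth of ACKs for a window of 10 segments of 1460 bytes.
mss, cwnd = 1460, 14600
for _ in range(10):
    cwnd = on_new_ack(cwnd, mss)
gain = cwnd - 14600
```

After the ten acknowledgments, `gain` is a little under one MSS, because each step divides by a slightly larger cwnd.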
2.7.1.3 Fast Recovery
• When an ACK arrives for the missing segment, TCP enters the congestion-avoidance state.
• If a timeout event occurs, fast recovery transitions to the slow-start state.
• When this loss event occurs,
→ value of ssthresh is set to half the value of cwnd at the time of the loss and
→ value of cwnd is then set to 1 MSS.
• There are 2 versions of TCP:
1) TCP Tahoe
An early version of TCP was known as TCP Tahoe.
TCP Tahoe
→ cut the congestion-window to 1 MSS and
→ entered the slow-start phase after either
i) timeout-indicated or
ii) triple-duplicate-ACK-indicated loss event.
2) TCP Reno
The newer version of TCP is known as TCP Reno.
TCP Reno incorporated fast recovery.
Figure 2.44 illustrates the evolution of TCP’s congestion-window for both Reno and Tahoe.
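The difference between the two versions can be sketched with a round-by-round trace of cwnd (in MSS). The function `cwnd_trace`, the initial ssthresh of 8 MSS and the loss in round 8 are illustrative assumptions; the Reno branch simply resumes from the halved window and omits the temporary window inflation that real fast recovery adds.

```python
def cwnd_trace(variant, rounds, loss_round, ssthresh):
    """cwnd (in MSS) at the start of each transmission round.

    variant    -- "tahoe" or "reno"
    loss_round -- round in which a triple-duplicate-ACK loss is detected
    """
    cwnd = 1
    trace = []
    for rnd in range(1, rounds + 1):
        trace.append(cwnd)
        if rnd == loss_round:
            ssthresh = max(cwnd // 2, 1)
            # Tahoe restarts slow start; Reno continues from the halved window.
            cwnd = 1 if variant == "tahoe" else ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double per RTT
        else:
            cwnd += 1                        # congestion avoidance: +1 MSS per RTT
    return trace


tahoe = cwnd_trace("tahoe", rounds=12, loss_round=8, ssthresh=8)
reno = cwnd_trace("reno", rounds=12, loss_round=8, ssthresh=8)
```

Both traces are identical up to the loss at cwnd = 12; afterwards Tahoe climbs again from 1 MSS while Reno continues linearly from 6 MSS.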
2.7.1.4 TCP Congestion Control: Retrospective
• TCP’s congestion-control consists of
→ increasing the value of cwnd linearly (additive increase) by 1 MSS per RTT and
→ halving the value of cwnd (multiplicative decrease) on a triple duplicate-ACK event.
• For this reason, TCP congestion-control is often referred to as AIMD (additive increase, multiplicative decrease).
• AIMD congestion-control gives rise to the “saw tooth” behavior shown in Figure 2.45.
• TCP
→ increases the congestion-window-size linearly until a triple duplicate-ACK event occurs and
→ then decreases the congestion-window-size by a factor of 2.
Figure 2.45: Additive-increase, multiplicative-decrease congestion-control
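The saw-tooth can be reproduced with a few lines, under the simplifying assumption that a triple duplicate-ACK event occurs whenever cwnd reaches a fixed value `w_max` (in MSS). The function name `aimd_trace` and that loss model are hypothetical.

```python
def aimd_trace(start, w_max, rounds):
    """cwnd (in MSS) per RTT under additive increase, multiplicative decrease."""
    cwnd = start
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd >= w_max:
            cwnd = max(1, cwnd // 2)   # multiplicative decrease on loss
        else:
            cwnd += 1                  # additive increase: +1 MSS per RTT
    return trace
```

Starting at 4 MSS with losses at 8 MSS, the window cycles 4, 5, 6, 7, 8, 4, 5, 6, 7, 8 and so on.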
2.7.2 Fairness
• A congestion-control mechanism is fair if each connection gets an equal share of the link-bandwidth.
• As shown in Figure 2.46, consider 2 TCP connections sharing a single link with transmission-rate R.
• Assume the two connections have the same MSS and RTT.
Figure 2.46: Two TCP connections sharing a single bottleneck link
• Figure 2.47 plots the throughput realized by the two TCP connections.
If TCP shares the link-bandwidth equally b/w the 2 connections, then the throughput falls along the 45-degree arrow starting from the origin.
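Why AIMD drifts toward the equal-share line can be seen in a small simulation. It assumes both connections have the same MSS and RTT, increase by one unit per round, and both halve their rate in any round where their combined throughput exceeds the link rate R. The function name `share_link`, the synchronized-loss assumption and the starting rates are illustrative.

```python
def share_link(x1, x2, capacity, rounds):
    """Throughputs of two AIMD connections after `rounds` RTTs."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Link overloaded: both connections see loss and halve.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase moves both along a 45-degree line.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = share_link(10.0, 70.0, capacity=100.0, rounds=200)
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in two, so the initial gap of 60 shrinks to well under 1 after 200 rounds.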
MODULE-WISE QUESTIONS
PART 1
1) With a diagram, explain multiplexing and demultiplexing. (6*)
2) Explain the significance of source and destination-port-no in a segment. (4*)
3) With a diagram, explain connectionless multiplexing and demultiplexing. (4)
4) With a diagram, explain connection oriented multiplexing and demultiplexing. (4)
5) Briefly explain UDP & its services. (6*)
6) With general format, explain various fields of UDP segment. Explain how checksum is calculated. (8*)
7) With a diagram, explain the working of rdt1.0. (6)
8) With a diagram, explain the working of rdt2.0. (6*)
9) With a diagram, explain the working of rdt2.1. (6)
10) With a diagram, explain the working of rdt3.0. (6*)
11) With a diagram, explain the working of Go-Back-N. (6*)
12) With a diagram, explain the working of selective repeat. (6*)
13) Explain the following terms: (8)
i) Sequence-number
ii) Acknowledgment
iii) Negative acknowledgment
iv) Window, pipelining
14) Briefly explain TCP & its services. (6*)
15)
PART 2