Unit-2
Communication in
Distributed Systems
Dr. Subhasis Dash
Associate Professor
School of Computer Engineering
KIIT Deemed to be University
Bhubaneswar
UNIT-2: Communications in Distributed Systems
Basics of Communication Systems
Layered Protocols
ATM Models
Client-Server Model
Blocking and Non-Blocking Primitives
Buffered and Un-Buffered Primitives
Reliable and Unreliable Primitives
Message passing
Remote Procedure Calls
Unit-2 : DOS 2
Basics of Communication Systems
Issues : In a uniprocessor, most interprocess communication implicitly
assumes the existence of shared memory. Eg: Producer-Consumer
In Distributed system there is no shared memory
Protocols: These are the rules adhere by the communicating processes
In Wide-Area Distributed System: These protocols have multiple
layers
Platforms for Interprocess Communication
• Open Systems Interconnection (OSI)
• Asynchronous Transfer Mode (ATM)
• Remote Procedure Calls (RPC)
• Group Communication
Unit-2 : DOS 3
Layered Protocols
Background : Due to the absence of shared memory, all communication in
distributed systems is based on message passing.
Communication Network:
• DATA COMMUNICATIONS
• NETWORKS
• NETWORK TYPES
• PROTOCOL LAYERING
• TCP/IP PROTOCOL SUITE
• THE OSI MODEL
Unit-2 : DOS 4
Types of Communications
• A network is two or more devices connected through links.
• A link is a communications pathway that transfers data from one
device to another.
• There are two possible types of connections: point-to-point and
multipoint
• Point-to-point communication is a method in which the channel of
communication is shared only between two devices or nodes.
• Multi-point communication is a form of communication in which the
channel is shared among multiple devices or nodes.
• Bus Topology is a common example of Multipoint Topology.
5
Physical Topology
• The term physical topology refers to the way in which a network is laid
out physically.
• Two or more devices connect to a link; two or more links form a
topology.
• The topology of a network is the geometric representation of the
relationship of all the links and linking devices (usually called nodes) to
one another.
• There are four basic topologies possible: Mesh, Star, Bus, & Ring.
6
A fully connected mesh topology
n = 5 10 links
• Suppose, N number of devices are connected with each other in a mesh topology, then
the total number of dedicated links required to connect them is N * (N-1) / 2.
• There are 5 devices connected to each other, hence the total number of links required
is 5*4/2 = 10.
7
A star topology
• A Star topology is a type of network topology in which all the devices
or nodes are physically connected to a central node such as a router,
switch, or hub.
• The central node (hub) acts as a server, and the connecting nodes act
as clients.
8
A bus topology
• Bus topology, alternatively known as line topology, is a type of network
topology where all devices on a network are connected to a single
cable, called a bus or backbone.
• This cable serves as a shared communication line, allowing all devices
(computers, printers, etc.) to receive the same signal simultaneously.
9
A ring topology
• Ring topology is a network configuration where devices are connected
in a circular structure, forming a closed loop.
• In this topology, each device is connected to exactly two other
devices, one on either side, creating a single continuous pathway for
data transmission.
• Data travels in only one direction around the ring, passing through
each device until it reaches its destination.
10
Local Area Network (LAN)
• A local area network (LAN) is a collection of devices connected
together in one physical location, such as a building, office, or home.
• A LAN can be small or large, ranging from a home network with one
user to an enterprise network with thousands of users and devices in
an office or school.
• A LAN comprises cables, access points, switches, routers, and other
components that enable devices to connect to internal servers, web
servers, and other LANs via wide area networks.
11
An isolated LAN in the past and today
Access the text alternative for slide images.
12
Wide Area Network (WAN)
• A wide area network (WAN) is also a connection of devices capable of
communication.
• In its simplest form, a wide-area network (WAN) is a collection of
local-area networks (LANs) or other networks that communicate with
one another. A WAN is essentially a network of networks, with the
Internet the world's largest WAN.
• Types of WAN technologies:Packet switching,TCP/IP protocol suite,
RouterOverlay network(network virtualization)..
13
Switched WAN
• A switched WAN is a network with more than two ends.
• A switched WAN is used in the backbone of global communication today.
14
An internetwork made of two LANs and one WAN
Access the text alternative for slide images.
15
A heterogeneous network made of WANs and LANs
16
The Internet
• An internet (note the lowercase i) is two or more networks that can
communicate with each other.
• The most notable internet is called the Internet (uppercase I) and is
composed of millions of interconnected networks.
• A conceptual (not geographical) view of the Internet given below
17
PROTOCOL LAYERING
We defined the term protocol before. In data communication and
networking, a protocol defines the rules that both the sender and
receiver and all intermediate devices need to be communicate directly.
18
First Scenario
• A large organization or a large corporation can itself become a local
ISP and be connected to the Internet.
• This can be done if the organization or the corporation leases a high-
speed WAN from a carrier provider and connects itself to a regional
ISP.
19
A single-layer protocol
In single layer protocol, the two persons named Maria on one side and Ann on another side
communicates (listen or talk) through air medium. The listen or talk actions on both sides are
represented as layer 1.
20
Second Scenario
In the second scenario, we assume that Ann is offered a higher-level
position in her company, but needs to move to another branch located in
a city very far from Maria.
They decide to continue their conversion using regular mail through the
post office. However, they do not want their ideas to be revealed by
other people if the letters are intercepted. They use an
encryption/decryption technique.
21
A three-layer protocol
In three layer protocol, the communication between Maria and Ann is shown in three
layers. The layer 1 is sending mail or receiving mail. The layer 2 is encrypting or
decrypting the mail to a Ciphertext. The layer 3 is listening or talking where the
ciphertext is converted to a plaintext. Each peer layers share identical objects that is
the mail, ciphertext, and plaintext in both sides (Maria and Ann) are identical objects.
22
Principles of Protocol Layering
#1. The first principle dictates that we need to make each layer to
perform two opposite task in each direction.
#2. The second principle dictates that two objects under each layer
should be identical.
23
Logical connection between peer layers
A three layer protocol communication between Maria and Ann is shown is demonstrated.
The layer 1 is sending mail or receiving mail. The layer 2 is encrypting or decrypting
the mail to a Ciphertext. The layer 3 is listening or talking where the ciphertext is
converted to a plaintext. Each peer layer establishes a logical connection between the
objects they share to the successive layers.
24
Layers in the TCP/IP protocol suite
25
Layered Architecture
To show how the layers in the TCP/IP protocol suite are involved in
communication between two hosts, we assume that we want to use the
suite in a small internet made up of three LANs (links), each with a
link-layer switch. We also assume that the links are connected by one
router.
26
Communication through an internet
27
Brief Description of Layers
After the above introduction, we briefly discuss the functions and
duties of layers in the TCP/IP protocol suite. To better understand the
duties of each layer, we need to think about the logical connections
between layers.
28
Logical connections between layers in TCP/IP
Access the text alternative for slide images.
29
Identical objects in the TCP/IP protocol suite
An illustration shows that the first laptop and the second laptop have physical data link,
networks, transport, and application. Reversible communication between the first and second
laptops is shown. The application of first and second laptops shares identical objects
(messages). The transport of first and second laptops shares identical objects (segments or
user datagrams). The network of first and second laptops shares identical objects (datagrams)
through the router. The data link of the first and second laptops shares identical objects
(frames) through the router. The physical of first and second laptops share identical objects
(bits) through the router.
30
Characteristics of Different Layers
1. We can say that the physical layer is responsible for carrying individual bits in a frame
across the link.
The physical layer is the lowest level in the TCP/IP protocol suite, the communication
between two devices at the physical layer is still a logical communication because
there is another, hidden layer, the transmission media, under the physical layer.
2. We have seen that an internet is made up of several links (LANs and WANs) connected
by routers. When the next link to travel is determined by the router, the data-link layer
is responsible for taking the datagram and moving it across the link.
3. The network layer is responsible for creating a connection between the source computer
and the destination computer.
The communication at the network layer is host-to-host.
However, since there can be several routers from the source to the destination, the
routers in the path are responsible for choosing the best route for each packet.
31
Characteristics of Different
Characteristics Layers
of Different Layers Cont.
4. The logical connection at the transport layer is Point-to-Point. The transport layer
at the source host gets the message from the application layer, encapsulates it in a
transport-layer packet.
In other words, the transport layer is responsible for giving services to the
application layer: to get a message from an application program running on the
source host and deliver it to the corresponding application program on the
destination host. transmits user datagrams without first creating a logical
connection.
5. The logical connection between the two application layers is end-to-end.
The two application layers exchange messages between each other as though
there were a bridge between the two layers. However, we should know that the
communication is done through all the layers. Communication at the application
layer is between two processes (two programs running at this layer).
To communicate, a process sends a request to the other process and receives a
response. Process-to-process communication is the duty of the application layer.
32
Open System interconnection (OSI) MODEL
• Although, when speaking of the Internet, everyone talks about the
TCP/IP protocol suite, this suite is not the only suite of protocols
defined.
• Established in 1947, the International Organization for
Standardization (ISO) is a multinational body dedicated to worldwide
agreement on international standards.
• Almost three-fourths of the countries in the world are represented in
the ISO. An ISO standard that covers all aspects of network
communications is the Open Systems Interconnection (OSI) model. It
was first introduced in the late 1970s.
33
The OSI model
34
OSI versus TCP/IP
• When we compare the two models, we find that two layers, session and
presentation, are missing from the TCP/IP protocol suite. These two
layers were not added to the TCP/IP protocol suite after the
publication of the OSI model.
• The application layer in the suite is usually considered to be the
combination of three layers in the OSI model.
35
TCP/IP and OSI model
36
Asynchronous Transfer Mode (ATM)
ATM is the cell relay protocol designed by ATM forum and adopted
by ITU-T (International Telecommunication Union – Telecommunication Standardization Sector)
It is a cell switching and multiplexing technology that combines
benefits of both circuit switching and packet switching
ATM working principles
• Sender first establishes a connection (virtual circuit) to the receiver(s)
• A route is determined from sender to receiver
• Routing information is stored in the switches along the way
• Packets can be sent through this connection by sender
• Packets are chopped into small fixed-sized units (cell) by hardware
• Routing information purged from switches when connection is not required
Unit-2 : DOS School of Computer Engineering 37
ATM Advantages
A single network is used to transport voice, data, broadcast
television, videotapes, radio.
It is used for video conferencing, video-on-demand,
teleconferencing, access to thousands of remote databases
Cost saving
ATM uses cell switching which handles both point-to-point
and multicasting efficiently
ATM allows rapid switching as its cell (or packet) size is
fixed
Unit-2 : DOS School of Computer Engineering 38
ATM Layers
ATM Reference Model ISO/OSI Model
Unit-2 : DOS School of Computer Engineering 39
ATM Physical Layer
In the Physical Layer, ATM is synchronous as it transmits empty cells while no
data to be send.
It uses SONET (Synchronous Optical NETwork) in physical Layer.
In SONET, frame size is 810 bytes (overhead: 36 bytes, payload: 744 bytes), gross
data rate 51.840 Mbps.
Basic 51.840 Mbps channel is called Optical Carrier (OC-1)
OC-12 (622.08 Mbps) and OC-48 (2.488 Gbps) are used for long-haul transmission.
Unit-2 : DOS School of Computer Engineering 40
ATM Layer
Generic Flow Control (GFC) is used for flow control.
Virtual Path Identifier (VPI) and Virtual Channel Identifier (VCI) together identify path
and circuit of a cell
Payload type distinguishes data cells from control cells
Cell Loss Priority (CLP) identifies the less important cells which drop if congestion occurs
Cyclic Redundancy Check (CRC) identifies redundancy and correct it
5 bit Header
Virtual Path Identifier Cell Loss Priority
GFC VPI VCI Payload Type (3 CLP CRC
(4 bits) (8 bits) (16 bits) bits) (1 bit) (8 bits)
Generic Flow Control Virtual Channel Identifier Cyclic Redundancy Check
User-to-Network Cell Header Layout
Unit-2 : DOS 41
ATM Adaptation Layer
Adaptation Layer has four classes
• Constant bit rate traffic (for audio and video) : CBR Traffic
• Variable bit rate traffic but with bounded delay: VBR Traffic
• Connection-oriented data traffic
• Connectionless data traffic
Simple and Efficient Adaptation Layer (SEAL)
• 1 bit of ATM header, 1 bit of Payload Type
• Payload Type field is set to 1 for last cell, otherwise 0
• Last cell contains 8 bytes tailer with four fields
• Tailer contains packet length (2 bytes), checksum (4 bytes)
• There are no use of first two fields (1 byte each field)
Unit-2 : DOS School of Computer Engineering 42
ATM Switching
Network built with 4 switches and 8 computers
Cells can be switched different computers by traversing
switches
Switching fabric connects input and output lines and ensures
parallel switching
Head-of-line blocking
problem
Solution: Keep copy of a
cell in a output buffer
queue
(a) ATM Switching Network (b) Inside of One Switch
Unit-2 : DOS School of Computer Engineering 43
ATM Implications for Distributed Systems
High-speed network but latency remains
Flow control
Transcontinental Delay
Cell drops during congression
Unit-2 : DOS School of Computer Engineering 44
ATM Advantages
High-speed, fast-switched integrated data, voice, and video communication.
A standards-based solution formalized by the International
Telecommunication Union (ITU)
Interoperability with standard LAN/WAN technologies
QoS technologies(The parameters are: End-to-end delay, Delay Jitter,
PLR, Bandwidth usage etc.) that enable a single network connection to
reliably carry voice, data, and video simultaneously.
Unit-2 : DOS School of Computer Engineering 45
Interprocess Communication
In distributed system, it is completely different from uniporcessor system
as there is no shared memory.
Certain rules need to be followed for interprocess communication called
Protocols.
For wide area distributed systems, these protocols take the form of
multiple layers such as OSI , ATM.
OSI model addresses only a small aspect of the communication - sending
bits from the sender to the receiver, with much overheads.
Unit-2 : DOS School of Computer Engineering 46
Client – Server Model
It is based on simple connectionless request / reply protocol.
Client sends a request message to the server and the server
returns the data requested or an error code indicating the
reason of failure.
It is simple. No connection to be established before use and no
connection to be closed after use.
Simplicity leads to efficiency. Only three levels of protocol are
needed.
Client Request Server
kernel Reply kernel
Unit-2 : DOS School of Computer Engineering 47
Client – Server Model Cont.
Physical and data link protocol take care of getting the packets from
client to server and back.
No routing and no connections - layers 3 & 4 not needed.
Layer 5 is the request/ reply protocol. No sessions required.
Communication provided by the micro-kernels using two system calls -
• send (dest, &mptr)
• receive(addr, &mptr)
mptr - message pointer
dest - destination process
addr - source address
Unit-2 : DOS School of Computer Engineering 48
Example : Client and Server Program
Unit-2 : DOS School of Computer Engineering 49
Example : Client and Server Program
Unit-2 : DOS School of Computer Engineering 50
Example : Client and Server Program
Unit-2 : DOS School of Computer Engineering 51
Addressing
One way of mentioning server address is to mention it in header.h as a constant.
Sending kernel can extract it (ex - 243) from message structure and use it for
sending packets to server.
Ambiguity arises if multiple processes are running on the same server.
Alternative 1 -
Send messages to processes , not machines.
Process identification - two part names - machine + process no.
Ex - 243.4 or 4@243
Each machine can number its processes starting from 0. So there is no confusion
between process ‘n’ of different machines.
No global coordination is required. 1
Client Server
kernel 2 kernel
1. Request to 243.0
2. Reply to 199.0
Unit-2 : DOS School of Computer Engineering
Addressing Cont.
Alternative 2 -
Use machine.local-id instead of machine.process
Each process is assigned a local-id and informs kernel that it listens to local-id
Problem - user is aware of the location of the machine (243). If the machine is down, compiled
programs with header.h will not work, although another machine (365) is available. No
transparency.
Alternative 3-
Each process has a unique address that doesn’t contain machine number.
A centralized process address allocator maintains a counter. Upon receiving a request, it
returns the current value of the counter.
Problem - Such centralized components do not scale to large systems.
Unit-2 : DOS School of Computer Engineering 53
Addressing Cont. 1. Broadcast
2. Here I am
3. Request
4. Reply 1
Alternative 4 -
Each process picks its own identifier from a large address space
(space of 64 bit binary integer).
It is scalable.
Identification of machine -Sender broadcasts a special Locate packet
containing the address of the destination process. It will be received
by all machines on the network. The matched kernel respondes with a
message “here i am” along with the machine number. So the sending
kernel uses this machine number for further communication.
Problem - Broadcasting is an overload to the system.
Unit-2 : DOS School of Computer Engineering 54
3 1
Server Client Name
Addressing 1.Lookup
2. NS reply kernel 4 kernel 2
Server
kernel
3. Request
4. Reply
Alternative 5 -
Overload can be avoided by providing an extra machine to map high-
level (ASCII) service names to machine address.
These names are embedded in the programs, not binary machine
numbers.
For the first time client sends a query to the Name server, asking the
machine number where the server is currently located. Then the
request can be sent directly to the machine address.
Problem - If the name server is replicated, consistency problem may
arise.
Unit-2 : DOS School of Computer Engineering 55
Blocking versus Nonblocking Primitives
Message Passing: A message-passing system gives a collection of message-
based IPC (Inter-Process Communication) protocols.
The send() and receive() communication primitives are used by processes for
interacting with each other.
For example, Process A wants to communicate with Process B then Process A
will send a message with send() primitive and Process B will receive the
message with receive() primitive.
Synchronization Semantics: The following are the two ways of message
passing between processes:
Blocking (Synchronous)
Non-blocking (Asynchronous)
Unit-2 : DOS School of Computer Engineering 56
Blocking (Synchronous)
In case of blocking primitive (also called as synchronous primitives), when a process calls send it
specifies a destination and a buffer to send to that destination.
While the message is being sent, the sending process is blocked (i.e., suspended).
The instruction following the call to send is not executed until the message has been completely
sent, as shown in figure below.
Similarly, a call to receive does not return control until the message has actually been received
and put in the message buffer.
A blocking send primitive
Unit-2 : DOS School of Computer Engineering 57
Non-Blocking
An alternative to blocking primitives are nonblocking primitives (also called as asynchronous
primitives).
If send is nonblocking, it returns control to the caller immediately, before the message is
sent.
The advantage of this scheme is that the sending process can continue computing in parallel
with the message transmission, instead of having the CPU go idle.
The disadvantage of this scheme is that the sender cannot modify the message buffer until
the message has been sent.
The sending process has no idea of when the transmission is done, so it never knows when it is
safe to reuse the buffer.
A nonblocking send primitive
Unit-2 : DOS School of Computer Engineering 58
Non-Blocking
There are two possible solutions to this problem:
The first solution is to have the kernel copy the message to an internal kernel buffer and
then allow the process to continue.
The disadvantage of this method is that every outgoing message has to copied from user
space to kernel space.
The second solution is to interrupt the sender when the message has been sent to inform it
that the buffer is once again available.
No copy is required here, which saves time, but programs based on user-level interrupts are
difficult to write and debug.
Unit-2 : DOS School of Computer Engineering 59
Unit-2 : DOS School of Computer Engineering 60
Unit-2 : DOS School of Computer Engineering 61
Blocking and nonblocking send primitives:
Blocking send() primitive: The blocking send() primitive refers to the
blocking of sending process.
The process remains blocking until it receives an acknowledgment from the
receiver side that the message has been received after the execution of
this primitive.
Non-blocking send() primitive: The non-blocking send() primitive refers to
the non-blocking state of the sending process that implies after the
execution of send() statement, the process is permitted to continue further
with its execution immediately when the message has been transferred to a
buffer.
Unit-2 : DOS School of Computer Engineering 62
Just as send can be blocking and nonblocking, so can receive.
Blocking receive() primitive: when the receive statement is executed, the receiving process
is halted until a message is received.
Nonblocking receive() primitive: The non-blocking receive() primitive implies that the
receiving process is not blocked after executing the receive() statement, control is returned
immediately after informing the kernel of the message buffer’s location.
The issue in a nonblocking receive() primitive is when a message arrives in the message
buffer, how does the receiving process know?
One of the following two procedures can be used for this purpose:
Polling: In the polling method, the receiver can check the status of the buffer when a test
primitive is passed in this method.
The receiver regularly polls the kernel to check whether the message is already in the
buffer.
Unit-2 : DOS School of Computer Engineering 63
Interrupt: A software interrupt is used in the software interrupt method to
inform the receiving process regarding the status of the message i.e. when
the message has been stored into the buffer and is ready for usage by the
receiver.
So, here in this method, receiving process keeps on running without having
to submit failed test requests.
Unit-2 : DOS School of Computer Engineering 64
Buffered versus Unbuffered Primitives
Unbuffered Primitives:
Unbuffered primitives involve direct communication without any intermediate
storage.
In these primitives, the sender and receiver need to be synchronized for the
communication to take place.
A call to the primitive receive(addr, &m) tells the kernel of the machine on
which it is running that the calling process is listening to the address addr
and is prepared to receive one message sent to that address.
A single message buffer, pointed to by m, is provided to hold the incoming
message.
When the message comes in, the receiving kernel copies it to the buffer and
unblocks the receiving process, as shown by Figure 3 in the next slide.
Unit-2 : DOS School of Computer Engineering 65
Cont..
Figure 3: Unbuffered message passing
Unit-2 : DOS School of Computer Engineering 66
Unit-2 : DOS School of Computer Engineering 67
Cont..
What are the problems that occurs when the client calls send primitive before the server calls receive
primitive in unbuffered message passing mechanism?
The problems are as follows:
How does the server’s kernel knows which of its process is using the address in the newly arrived message?
How does the server’s kernel knows where to copy the message?
To avoid such problems, it's crucial to ensure that the receive primitive is called in a timely manner. Some
strategies to handle this include:
Pre-emptive Design: Design the system so that the receive is always invoked before or concurrently with
send to avoid blocking.
Timeouts and Error Handling: Implement timeouts or error handling mechanisms to manage situations
where a send operation might block indefinitely.
Buffered Communication: Use buffered message passing where messages are stored in a buffer
temporarily, allowing the sender and receiver to operate asynchronously and reducing the risk of blocking.
Unit-2 : DOS School of Computer Engineering 68
Cont..
Buffered primitives:
In order to deal with the buffer management issues, a new data structure
called a “mailbox” is defined.
A process that is interested in receiving messages tells the kernel to create
a mailbox for it, and specifies an address to look for the network packets.
Henceforth, all incoming messages with that address are put in the mailbox.
The call to receive removes on message from the mailbox, or blocks if none is
present.
In this way, the kernel knows what to do with incoming messages and has a
place to put them.
This technique is referred as buffered primitive, as shown by Figure 4, in
the next slide.
Unit-2 : DOS School of Computer Engineering 69
Cont..
Figure 4: Buffered message passing
Unit-2 : DOS School of Computer Engineering 70
Unit-2 : DOS School of Computer Engineering 71
Reliable vs Unreliable Primitives
Using blocking primitives, the client process gets suspended after sending
a message. When it is restarted, there is no guarantee that message has
been delivered. The message might have been lost.
Alternative solution1 - Redefine the semantics of send to be unreliable.
The system gives no guarantee about message delivery.
Alternative solution 2 - Kernel on the receiving machine sends an
acknowledgement back to the kernel on sending machine.
Sending kernel free the process after receiving this acknowledgement.
Similarly, the reply from server back to client is acknowledged by client’s
kernel.
Acknowledgement goes from kernel to kernel. 1
Client Server
1. Request 3
2. ACK (kernel to kernel)
4
3. Reply kernel kernel
4. ACK (kernel to kernel) 2
Unit-2 : DOS School of Computer Engineering 72
Reliable vs Unreliable Cont.
Alternative solution3 - Client is blocked after sending message and
the server’s reply act as an acknowledgement.
If the reply takes too long, the sender can resend the mesage to
guard against lost message.
An acknowledgement from client’s kernel to the server’s kernel is
sometimes used. Until this packet is received, the server’s send does
not complete and the server remains blocked.
If the reply is lost and the request is retransmitted, then the
server kernel sends reply again without waking up the server.
1. Request (client to server)
2. Reply (server to client)
3. ACK (kernel to kernel) 1
Client Server
2
kernel kernel
3
Unit-2 : DOS School of Computer Engineering 73
Implementing client-server model
Design issues for the communication primitives
Item Option 1 Option 2 Option 3
Addressing Machne address Sparse process Names looked up via
addresses server
Blocking Blocking primitives Nonblocking with copy Nonblocking with
to kernel interrupt
Bufferring Unbuffered, discarding Unbuffered, temporarly Mailboxes
unexpected messages keeping unexpected
messages
Reliability Unreliable Request-Ack-Reply-Ack Request-Reply-Ack
How message passing is implemented depends on which choices are made.
Unit-2 : DOS School of Computer Engineering 74
Implementing client-server model
Issue of packet size - All packets have a limit of packet size.
Messages larger than this must be split up into multiple packets and
sent separately.
Problem - some of these packets may be lost or distorted. They may even arrive in
the wrong order.
Solution - Assign a number to each message and put it in each packet belonging to
that message, along with a sequence number indicating the order of the packet.
Issue of acknowledgement -Acknowledge each individual packet or Acknowledge
entire message.
case I - if a packet is lost, only one packet need to be retransmitted. But it will
cause more number of acknowledgements.
case II - Fewer packets but more complicated to recover if a packet is lost.
Unit-2 : DOS School of Computer Engineering 75
Implementing client-server model
Issue of underlying protocol -
AYA - to check whether request is complicated or the server is
crashed
TA - if request packet cannot be accepted
Code Packet type From To Description
REQ Request Client Server Client wants service
REP Reply Server Client Reply from server to the client
ACK Acknowledgement Either Other Previous packet arrived
AYA Are you alive ? Client Server Check if the server is crashed
IAA I am alive Server Client Server has not crashed.
TA Try again Server Client Server has no space
AU Address unknown Server Client No processis using this address
Unit-2 : DOS School of Computer Engineering 76
Implementing client-server model
1
Client 1 Server Client 3 Server
4
kernel 2 kernel kernel kernel
2
1
Client Server
3
1
kernel 2 kernel Client Server
3
AYA
1. Request IAA
2. ACK (kernel to kernel) 4
3. Reply kernel kernel
4. ACK (kernel to kernel) 2
Examples of packet exchanges
Unit-2 : DOS School of Computer Engineering 77
Remote Procedure Call
Remote procedure call:- Information can be transported from
the caller to the callee in the parameters and can come back in the
procedure result.
Calling and Called procedures run on different machines and they
execute in different address spaces.
RPC is the widely used approach for Distributed Operating
System to communicate between process on different machine or
between different processes on the same machine.
Unit-2 : DOS School of Computer Engineering 78
Remote Procedure Call Model
Remote procedure call Model:- When a process on m/c ”A” Calls a procedure
on m/c “B” the calling process on “A” is suspended and the execution of the
called procedure take place on “B”.
Information can be transported from the caller to the callee in the
parameters and can come back in the procedure result.
No message passing and I/O event at all visible to the parameter.
This is called RPC.
It is like procedure call.
Unit-2 : DOS School of Computer Engineering 79
Remote Procedure Call Model
During RPC the caller and callee process interact with each other.
Caller / Client Process Callee/ Server Process
The client sends a call message to server and
waits for the reply message.
Request Message
1. Receive Request and start procedure Execution
Call procedure
2. Procedure Execution
and wait for the
3. Send Reply and wait for the next request
Reply
The server process executes the procedure
Reply Message and then returns the results of the procedure in
the reply message to the client process.
Once the reply message is received, the result
of the procedure execution is extracted, and
the client process execution is resumed.
Unit-2 : DOS School of Computer Engineering 80
Transparency of Remote Procedure Call (RPC)
A transparent RPC mechanism is one which local procedure and remote procedure
are indistinguishable to programmers. This is of two types:-
1. Syntactic Transparency:- RPC should have exactly the same syntax as local procedures
call have.
2. Semantic Transparency :- Semantics of the RPC are identical to those of local
procedure call.
RPC vs LPC
In RPC called procedure is executed in an address space that is disjoint from the calling
programmer’s address space. (Client can not access to any data of server)
RPC are more vulnerable to failure than local procedure call as they involve two different
process, possibly a network & two different m/cs.
RPC consumes much more time than LPC due to involvement of communication network.
Unit-2 : DOS School of Computer Engineering 81
Remote Procedure Call (RPC) Implementation
RPC involves 5 different elements
1. The client Client M/C
Server M/C
2. The Client Stub
3. The RPC Runtime
Return Call Return Call
4. The Server Stub
5. The Server
Client Stub Server Stub
Unpack Pack Unpack Pack
RPC Run Time RPC Run Time
Receive Send Receive Send
Call Packet
Result Packet
Unit-2 : DOS School of Computer Engineering 82
Remote Procedure Call (RPC) Implementation
1. Client :-
It is a user process that initiates RPC.
Client makes a normal local call that invokes a corresponding procedure in the client stub.
2. Client Stub:-
On receipt of a call request from the client, it packs a specification of the target procedure and
arguments into a message and then asks the local RPC runtime to send it to the server stub.
On receipt of the of the result of procedure execution, it unpacks the result and passes it to the client.
3. RPC Runtime:-
It handles transmission of message across the network between client and server m/cs.
It is responsible for transmission acknowledgement, packet routing and encryption.
It receives the call request message from the stub and send it to the server m/c.
It also receives the result message of the procedure executed on server and pass it to the client stub.
4. Server Stub:-
It receives the call request message from local RPC runtime, the server stub unpacks it and makes a
normal call to invoke the appropriate procedure in the server.
The server packs the result into the message and then asks the local RPC runtime to send it to the client
stub.
5. Server:-
On receiving the call request from the server stub, the server executes the appropriates procedure and
returns result executed on server stub.
Unit-2 : DOS School of Computer Engineering 83
Design Issues in RPC
A. STUB STRUCTURE: Widely used structure for RPC
Procedure execution
Client Server
Return Call Call Return
reply request request reply
12 1 6 7
Client Stub Server Stub
Message to Parameter to Message to Parameter to
parameter message parameter message
11 2 5 8
Transport layer Transport layer
Receive Send Receive Send
10 3 4 9
Network
Note: 1 to 12 indicates flow of data /information in that order
Steps for RPC Packing and Unpacking
Unit-2 : DOS School of Computer Engineering 85
RPC Call Message:-
1. During RPC client and server interaction will be done by 2 types of message
Call Message:- Sent by the client to server for requesting execution of a remote
procedure.
Message Format:-
Remote Procedure Identifier
Message Identifier Message Type Client Identifier Program Version Producer Arguments
Number Number Number
1. Message Identifier:- Consist of sequence of number identifying lost message and
duplicate message in case of system failure and for a proper matching reply message to
several outstanding call messages arrives out of order.
2. Message Type:- Used to distinguish call message from reply message. (for call =0 &
Reply =1)
3. Client Identifier:- Allow the server of the RPC to identify the client to whom the reply
message has to be return and to allow the server to check authentication of the client
process for executing the concerned procedure.
4. Remote Procedure Identifier :- Identifier information of RPC to be executed
5. Arguments :- Necessary for the executing of the procedure.
Unit-2 : DOS School of Computer Engineering 86
RPC Reply Message:-
2. Reply Message:- Sent by the server to client as result of the executed remote
procedure.
Successful Reply Message Format:- Unsuccessful Reply Message Format:-
Reply Status
Message Message (0 – Successful) Result Message Message Reply Status Result
Identifier Type Identifier Type (1 – Unsuccessful)
1. Message Identifier & Message Type :- Function is same as call message
For the successful reply, Reply=0 & followed by field containing result of the procedure
For the Unsuccessful reply, Reply=1 or non zero value to indicate the failure
Later on the value of the reply status indicate the type of the error
Unit-2 : DOS School of Computer Engineering 87
Marshaling Arguments and Results in RPC
Marshaling :-
1. During transfer of messages between two computers needs encoding and
decoding of data.
2. Taking the arguments of client process or the result of the server process
that will form the message data need to be sent to remote process.
3. Encoding the message at client end, convert the program objects into a
stream from that is suitable for transmission and placing them into a
message buffer.
4. Decoding of the message data on the server, reconstruction of the
program objects from the message data that was received in stream form.
Unit-2 : DOS School of Computer Engineering 88
Parameter Passing
Parameter Marshaling:- Packing the parameters in the message.
The Client Stub takes the parameters and put them in a message. It also puts
the number or name of the procedure to be called in the message. The server
machine might support different calls.
Unit-2 : DOS School of Computer Engineering 89
Problem occurs when the system at client and server end is
different.
Each machine has its own representation for numbers,
characters and other data items.
IBM Mainframe machines :-EBCDIC character code.
IBM personal Computer :- ASCII character code.
Similar problem occurs with representation of integers and
floating numbers.
Unit-2 : DOS School of Computer Engineering 90
Client End:-The compiler reads the server specification and
generate a client stub that packs its parameters into the
officially approved message format.
Server End:- The compiler can also produce a server stub that
unpacks them and calls the server procedure.
The system is transparent with respect to the differences in
the internal representations of the data items.
Unit-2 : DOS School of Computer Engineering 91
Dynamic Binding.
The client locates server in distributed system using Dynamic Binding.
Registering the Server to Binder:-
The server send a message to a program called a binder, to make its existence
known.
The server specifies its name, version number ,a unique identifier (32 bit long), and a
handle used to locate it.
The handle is system dependent (Ethernet Address, IP Address and X.500 Address,
a sparse process identifier).
It can deregister with the binder when it is no longer prepared to offer service.
Unit-2 : DOS School of Computer Engineering 92
How client locates server?
The client stub send message to the binder asking it to import version of the
server interface.
The binder checks to see if one or more servers have already exported an
interface with the version and name . If no server is found the read call fails.
Otherwise , the binder gives its handle and unique identifier to the client
stub. The client stub uses the handle as the address to send the request
message
Unit-2 : DOS School of Computer Engineering 93
Design Issues in RPC ‐ Contd
CLIENT SIDE SERVER SIDE
1
Register services (Initialize)
Receive Query
Client side Client Stub
Server Stub Remote
Program Procedure
Return server address
Procedure Procedure
Query the BINDING SERVER
LPC(…)
2
binding 3 Unpack Execute
server proc
4
Wait for Pack &
the replay
Transmit to LPC(…)
the server
5 6
identified
Return From Wait
Local call
9 8 7
Unpack Pack
Result Result Return
CLIENT
Working of RPC
Server
Working of RPC
• STEPS (1 – 9):
1. Registering of service provider’s (server) services
2. Client stub receives the LPC request from client
3. Querying the Registry server for the service provider address
4. Binding server returns the address of service provider
5. Client stub will convert the parameter into message, pack it and transfer to the
identified service provider
6. Service provider will unpack it and invoke the procedure with the parameter at service
provider side
7. Server stub receives the result from service provider after the execution of the
procedure, convert parameter into message
8. Packed message will be sent to the client stub
9. Client stub will unpack the message, convert the message into parameters, send the
parameters to client
RPC Semantics in the Presence of Failure
Five different classes of failure that can occur in RPC systems:
1. The client is unable to locate the server.
2. The request message from the client to the server is lost.
3. The reply message from the server to the client is lost.
4. The server crashes after receiving a request.
5. The client crashes after sending a request.
Unit-2 : DOS School of Computer Engineering 96
Client cannot locate the server
The server might be down, for example:- Suppose that the client is compiled using a particular version of the client
stub, and the binary is not used for a considerable period of time. In the meantime, the server evolves and a new
version of the interface is installed and new stubs are generated and put into use. When the client is finally run, the
binder will be unable to match it up with a server and will report failure.
The problem remains of how this failure should be dealt with.
With the server each of the procedures returns a value, with the code –1 conventionally used to indicate failure. For
such procedures, just returning –1 will clearly tell the caller that something is missed.
In UNIX, a global variable, error no, is also assigned a value indicating the error type. In such a system, adding a new
error type "Cannot locate server" is simple.
One possible candidate is to have the error raise an exception. In some languages, programmers can write special
procedures that are invoked upon specific errors, such as division by zero.
This approach, too, has drawbacks. To start with, not every language has exceptions or signals.
Unit-2 : DOS School of Computer Engineering 97
Lost Request Messages
Kernel need to start a timer when sending the request
If timer expirers before a Reply or ACK comes back, the kernel sends the message
again.
I the message is truly lost, the server will not be able to differentiate between the
retransmission and the original.
When many request message are lost the kernel declares server is down.
Unit-2 : DOS School of Computer Engineering 98
Lost Reply Messages
Lost replies are considerably more difficult to deal with. The obvious solution is just to rely on the timer again. If no reply is
forthcoming within a reasonable period, just send the request once more.
The trouble with this solution is that the client's kernel is not really sure why there was no answer. Did the request or reply get lost, or
is the server merely slow? It may make a difference.
In particular, some operations can safely be repeated as often as necessary with no damage being done. A request such as asking for the
first 1024 bytes of a file has no side effects and can be executed as often as necessary without any harm being done. A request that
has this property is said to be idempotent.
Now consider a request to a banking server asking to transfer a million dollars from one account to another. If the request arrives and is
carried out, but the reply is lost, the client will not know this and will retransmit the message. The bank server will interpret this
request as a new one, and will carry it out too. Two million dollars will be transferred. Heaven forbid that the reply is lost 10 times.
Transferring money is not idempotent.
One way of solving this problem is to try to structure all requests in an idem-potent way. In practice, however, many requests (e.g.,
transferring money) are inherently not idempotent, so something else is needed.
Another method is to have the client's kernel assign each request a sequence number. By having each server's kernel keep track of the
most recently received sequence number from each client's kernel that is using it, the server's kernel can tell the difference between
an original request and a retransmission and can refuse to carry out any request a second time.
An additional safeguard is to have a bit in the message header that is used to distinguish initial requests from retransmissions (the idea
being that it is always safe to perform an original request; retransmissions may require more care).
Unit-2 : DOS School of Computer Engineering 99
Server Crashes
The next failure on the list is a server crash. It too relates to idempotency, but unfortunately it cannot be solved
using sequence numbers.
a) A request arrives, is carried out, and a reply is sent.
b). A request arrives and is carried out, just as before, but the server crashes before it can send the reply.
c). Again a request arrives, but this time the server crashes before it can even be carried out.
In (b) the system has to report failure back to the client (e.g., raise an exception), whereas in (c) it can just
retransmit the request. The problem is that the client's kernel cannot tell which one is missed. All it knows is that its
timer has expired.
1. One philosophy is to wait until the server reboots (or rebinds to a new server) and try the operation again. The idea
is to keep trying until a reply has been received, then give it to the client. This technique is called at least once
semantics and guarantees that the RPC has been carried out at least one time, but possibly more.
2. The second philosophy gives up immediately and reports back failure. This way is called at most once semantics and
guarantees that the rpc has been carried out at most one time, but possibly none at all.
3. In third philosophy, When a server crashes, the client gets no help and no promises. The RPC may have been carried
out anywhere from 0 to a large number of times. The main virtue of this scheme is that it is easy to implement.
Unit-2 : DOS School of Computer Engineering 100
Client Crashes
What happens if a client sends a request to a server to do some work and crashes before the server replies? At this point a computation
is active and no parent is waiting for the result. Such an unwanted computation is called an orphan.
Orphans can cause a variety of problems. As they waste CPU cycles. They can also lock files or otherwise tie up valuable resources.
Finally, if the client reboots and does the RPC again, but the reply from the orphan comes back immediately afterward, confusion can
result.
What can be done about orphans? In solution 1, before a client stub sends an RPC message, it makes a log entry telling what it is about
to do. The log is kept on disk or some other medium that survives crashes. After a reboot, the log is checked and the orphan is explicitly
killed off. This solution is called extermination.
The disadvantage of this scheme is the horrendous expense of writing a disk record for every RPC. Furthermore, it may not even work,
since orphans themselves may do RPCs, thus creating grand orphans or further descendants that are impossible to locate. Finally, the
network may be partitioned, due to a failed gateway, making it impossible to kill them, even if they can be located.
In solution 2, called reincarnation, When a client reboots, it broadcasts a message to all machines declaring the start of a new epoch.
When such a broadcast comes in, all remote computations are killed. Of course, if the network is partitioned, some orphans may survive.
However, when they report back, their replies will contain an obsolete epoch number, making them easy to detect.
Solution 3 called less Draconian. It is called gentle reincarnation. When an epoch broadcast comes in, each machine checks to see if it
has any remote computations, and if so, tries to locate their owner. Only if the owner cannot be found is the computation killed.
Solution 4, called expiration, in which each RPC is given a standard amount of time, T, to do the job. If it cannot finish, it must explicitly
ask for another quantum, which is a nuisance. On the other hand, if after a crash the server waits a time T before rebooting, all orphans
are sure to be gone. The problem to be solved here is choosing a reasonable value of T in the face of RPCs with wildly differing
requirements.
In practice, none of these methods are desirable. Worse yet, killing an orphan may have unforeseen consequences. For example,
suppose that an orphan has obtained locks on one or more files or data base records. If the orphan is suddenly killed, these locks may
remain forever. Also, an orphan may have already made entries in various remote queues to start up other processes at some future
time, so even killing the orphan may not remove all traces of it.
Unit-2 : DOS School of Computer Engineering 101
Remote Procedure Call - Performance
Unit-2 : DOS School of Computer Engineering 102
RPC Protocol Selection
Criteria of Selection: It should gets the information from the
client’s kernel to the server’s kernel
1. Connection-oriented protocol vs Connectionless protocol
2. Standard general-purpose protocol vs Specifically designed for
RPC
3. Length of Packet and Message
Unit-2 : DOS School of Computer Engineering 103
Connection-Oriented Vs Connectionless Protocol
Connection-Oriented Protocol Connectionless Protocol
After connection establishment, the client is bound No principle of connection establishment for long
to the server period. However session-wise pairing between two
neighboring entities is required.
Same connection is used by all the traffic, in both The path used by all the traffic might be different
directions.
Communication is easier Communication is easier only if in LAN, where most
of the connections are of one hop length
When a kernel sends a message, the possibility of Loss of message, loss of ACK need extra work
lost of the message and receiving of its ACK is not
worrisome for it.
This approach is very strong in WAN Reliable in LAN
Suitable in small building LAN
Conclusion: Connection-oriented in WAN Conclusion: Connectionless in LAN
Unit-2 : DOS School of Computer Engineering 104
Standard Protocol Vs Specialized RPC
Standard Protocol (IP or UDP) Specialized RPC
The protocol is already exists which saves substantial It is need to be invented, implemented, tested and
work. embedded in existing systems. Considerably more work.
Many implementations are available. Saves work and time. More work and time
Communication is easier Communication need to be tested in the networks.
Most of the UNIX systems accept the packets of these Needs integration into existing UNIX systems
protocol for communication purpose
Existing networks also support IP and UDP packets Need to be tested across all types of networks
Writing, executing and testing code using these protocols Several phases of software testing is required.
are straightforward.
IP is not an end-to-end protocol. It is executed on top of Specialized RPC would avoid bouncing back of the packets.
reliable TCP. So, it bounce back several times in the
network.
IP has 13 header fields. 3 are essential (Src_Addr, Number of header fields may differ, according to the
Dstn_addr, Pkt_len). Header checksum is time consuming requirement of the problem.
Unit-2 : DOS School of Computer Engineering 105
Packet and Message Length
RPC has a large fixed overhead, independent of the amount of the
data sent.
Reading a Size of file is 64K in a single 64K RPC would be more
efficient than Size of file is 64K in a 64 -1K RPCs
Large size file with Maxflow should be supported by both protocol
and network
• Sun Microsystem’s limit is 8K (System level constraints)
• Ethernet’s limit is 1536 bytes (Network level constraints)
So, a single RPC is required to be split over multiple packets, is an
overhead.
Unit-2 : DOS School of Computer Engineering 106
Acknowledgements
When large RPCs have to be broken up into many small packets,
then what should be the acknowledgement process?
• Should individual packets be acknowledged? (stop-and-wait- protocol)
• Acknowledge after receiving all the packets (Blast Protocol)
Unit-2 : DOS School of Computer Engineering 107
Critical Path
A critical path is defined as the sequence of instructions that is
executed on every RPC (Eg. A client to a remote server)
There are 14 steps in the RPC from Client-to-Server
1. Call stub 8. Move packet to controller over the QBus
2. Get message buffer 9. Ethernet transmission time
3. Marshal parameters 10. Get packet from controller
4. Fill in headers 11. Interrupt service routine
5. Compute UDP checksum 12. Computer UDP Checksum
6. Trap to Kernel 13. Context switch to user space
7. Queue packet for transmission 14. Server stub code
Unit-2 : DOS School of Computer Engineering 108
Critical Path : Schematic View
Client Machine Server Machine
Client C
Call stub procedure Perform service Server
Prepare message buffer Call server
Client Marshal parameters into buffer Set up parameters on stack
Fill in message header fields Unmarshal parameters Server
stub
Trap to kernel stub
Context switch to kernel Context switch to server stub
Copy message to kernel Copy message to server stub
Determine destination address See if stub is waiting
Kernel Kernel
Put address in message header Decide which stub to give it to
Set up network interface Check packet for validity
Start timer Process interrupt
Unit-2 : DOS School of Computer Engineering 109
Critical Path Cont.
Q: Where is most of the time spent on the critical path?
Ans:
• Marshaling parameters and moving messages around
• In case of null RPC, context switch to the server stub when packet arrives, the
interrupt service routine, and moving the packets to the network interface for
transmission
• Managing a pool of buffers – which client stubs use to avoid having to fill in the
entire UDP header every time.
• All the machines don’t share the same address space, so context switch and use
of page table takes time
• Entire RPC system has been carefully coded in assembly language and hand
optimized. So, it is faster and saves time
Unit-2 : DOS School of Computer Engineering 110
Copying
Unit-2 : DOS School of Computer Engineering 111
Copy The network chip can DMA the message directly out of the client
1 stub’s address space onto the network
M
A Copy If(kernel can’t map the page into the server’s address space) then
N
2 kernel copies the packet to the server stub
D Copy The hardware is started, causing the packet to be moved over the
A 3 network to the interface board on the destination machine
T Copy When the packet arrived, interrupt occurs, kernel copies it to its
O 4 buffer (before knowing its exact location)
R
Y
Copy
5 Finally, the message has to be copied to the server stub
C Copy If(the call has a large array passed as a value parameter) the array
O 6 has to be copied onto the client’s stack for the call stub,
P
I Copy Copy from the stack to the message buffer during marshaling within
E 7
the client stub
S
Copy Copy from the incoming message in the server stub to the server’s
8 stack preceding the call to the server.
Unit-2 : DOS School of Computer Engineering 112
Copying Cont.
How to eliminate unnecessary copying?
• Using the hardware scatter-gather
• At the Sender’s side: With cooperative hardware, a reusable packet
header inside the kernel and a data buffer in user space can be put out
onto the network with no internal copying on the sending side.
• At the Receiver’s side: Dump the message into a kernel buffer and let
the kernel figure out what to do with it.
• In Operating Systems: Using virtual memory
• Using mapping: If(memory map can be updated in less time)
1. Then, mapping is faster than copying
2. Else, Not
Unit-2 : DOS School of Computer Engineering 113
Timer Management
Timer: It is an automatic mechanism for activating an entity
at a preset time.
Setting a timer requires building a data structure specifying
when the timer is to expire and what is to be done when that
happens.
The list of messages are in sorted order
Timer starts just after message transmitted
If(ACK or Reply arrives before the timer expires)
• Then the timeout entry must be located and removed from the list
Timer value should be neither too high or too low
Most systems maintains a Process Table to implement Timer
Unit-2 : DOS School of Computer Engineering 114
Timer Management via Sorted List and Process Table
Current time Current time Explanation: In process
table in stead of storing
14200 14200 timeouts in a sorted
Timer linked list, each process
14205
is off table entry has a field
for holding its timeouts.
0 14216
Process 3 It is shown the left of
the process table in blue
color.
Sorted List 1 0
14212 Working Principle: The
w.r.t. kernel scans the entire
time out Process 2 process table, checks each
timer value against the
2 14212 current time. If
(Tread<=Tcurrent) then it is
processed and reset.
14216
Process 0 3 14205 Note: Sweep Algorithms
operates by periodically
0 making a sequential pass
(a) Timeouts in a sorted list (b) Timeouts in a process table
through a process table.
Unit-2 : DOS School of Computer Engineering 115