0% found this document useful (0 votes)
14 views56 pages

Slides 05

The document discusses coordination and clock synchronization in distributed systems, emphasizing the need for exclusive access to shared resources and the agreement on the order of events. It explains the importance of synchronized clocks, using examples like cloud gaming and medical systems, and introduces protocols like Network Time Protocol (NTP) for achieving accurate time synchronization. Additionally, it covers logical clocks, including Lamport's and vector clocks, which help maintain the order of events and causal relationships in distributed processes.

Uploaded by

sidhu.kandoth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views56 pages

Slides 05

The document discusses coordination and clock synchronization in distributed systems, emphasizing the need for exclusive access to shared resources and the agreement on the order of events. It explains the importance of synchronized clocks, using examples like cloud gaming and medical systems, and introduces protocols like Network Time Protocol (NTP) for achieving accurate time synchronization. Additionally, it covers logical clocks, including Lamport's and vector clocks, which help maintain the order of events and causal relationships in distributed processes.

Uploaded by

sidhu.kandoth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 56

Coordination Clock synchronization

Coordination

How can processes synchronize and coordinate their actions?

a) Exclusive Access to Shared Resources


• Multiple processes often require access to a common resource (e.g.,
printer, file, database record).
• Coordination is required to ensure only one process uses the resource at
a time

Example: Two users in different locations try to update the same online
bank account.

b) Agreeing on the Order of Events


• Sometimes, it's important to determine the order in which events occurred.
• Example: Process P sends message m1, and process Q sends message
m2.
• The system must agree: did m1 happen before or after m2?

Physical clocks
Coordination Clock synchronization

Clock Synchronization
Achieving agreement on time in a distributed system – Global time

Example for implication of lack of global time - Unix make program

When each machine have its own clock, an event that occurred after another
event may nevertheless be assigned an earlier time on another remote device.

In the above example, MAKE will not call the compiler for the newer version of
the “output.c” program, even though it is “newer”.

Observation: A lack of global (synchronized) time leads to incorrect


behavior in automated tools relying on timestamps.
Physical clocks
Coordination Clock synchronization

Clock Synchronization

Achieving agreement on time in a distributed system – Global time

More Examples:

• Cloud-Based Multiplayer Gaming: Lack of synchronization can result


in unfair gameplay

• Collaborative sensing: To synchronize using timestamps and to know


the exact time/location of an event

• Medical and Health Monitoring Systems: Misaligned timestamps can


lead to incorrect interpretation of a patient’s condition

Physical clocks
Coordination Clock synchronization

Physical clocks
Problem
• Sometimes we simply need the exact time, not just an ordering.
• Clock Skew: When a system has n computers, all n crystals will run at slightly
different rates, causing the clocks to gradually get out of sync.

Solution: Universal Coordinated Time (UTC)


• global time standard used for synchronizing clocks across systems
worldwide.

• maintained using highly accurate atomic clocks (e.g., cesium-133).


• serves as the absolute time baseline for all system clocks.

• Computed as the average of around 50 atomic clocks located at different


labs worldwide.

UTC Broadcast:
• Broadcast through: Short-wave radio signals (e.g., WWVB in the US)
• Global Positioning System (GPS) satellites
Physical clocks
Coordination Clock synchronization

Network Time Protocol


Getting the current time from a timeserver

• Standard protocol used to synchronize clocks of computers over a


variable-latency data network.

• NTP communicates with time servers on the internet to get the most
accurate time possible and then adjusts your system clock accordingly.

• NTP uses a hierarchical client–server


architecture that forms a tree structure.

• Each level of this hierarchy is called


a stratum and is assigned a number,
starting with zero, representing reference
hardware clocks.

Clock synchronization algorithms


Network Time Protocol

• Stratum 0(top level) comprises the most accurate


reference clocks(Atomic clocks)

• Stratum 0 is the high-precision timekeeping devices


such as atomic clocks, Global navigation satellite
system (including GPS) or other radio clocks

• Stratum 1 is directly connected to the stratum 0, and


receives the most accurate time

• NTP can handle a maximum of 16 Stratum levels


Coordination Clock synchronization

NTP - Detecting and adjusting incorrect times


Getting the current time from a timeserver

Round-Trip Delay & Clock Offset


•Goal: Compute the offset (θ) between client and server clocks
and the network delay (δ).

Assumption:
•δT_req = T₂ - T₁ ≈ T₄ - T₃ = δT_res (equal transmission time)

Clock synchronization algorithms


Clock Offset (θ): This estimates how much the client's clock differs
from the server’s.

Round-Trip Delay (δ): time taken by the messages to travel back and
forth.

•The client sends multiple requests to collect (θ, δ) pairs.

•Then selects the value of θ with the minimum δ, assuming the shortest
delay implies the most accurate measurement.
Theta = (2 - 0 )/2
= 1 min
Coordination Logical clocks

Lamport's Logical Clocks


What usually matters is not that all processes agree on exactly what time it is,
but that they agree on the order in which events occur. Requires a notion of
ordering.

Lamport’s logical clocks


Lamport’s logical clock
• It is a numerical software counter value maintained in each
process

• Lamport clocks allow processes to assign sequence


numbers (“timestamps”) to messages and other events

• Timestamp is incremented when a new event occur on a


process

• so that all cooperating processes can agree on the order of


related events.
Happened Before Relation
The technique behind logical clock is
happened before relation
Space Time diagram
Lamport’s logical clock

Events are causally related if one event may potentially


influence the outcome of another event.

For instance, events in one process are causally related.

If a process A sends a message to process B, all events that


occurred on process A before the message was sent causally
precede all the events that occur on B after B received the
message.

Lamport’s algorithm states that every event is timestamped


(assigned a sequence number) and each message carries a
timestamp of the sender’s clock (sequence number).
Coordination Logical clocks

The Happened-before relationship


Issue
What usually matters is not that all processes agree on exactly what time it is,
but that they agree on the order in which events occur. Requires a notion of
ordering.

The happened-before relation


• If a and b are two events in the same process, and a comes before
b, then a → b.
• If a is the sending of a message by one process, and b is the receipt
of that message by another process, then a → b is true.
• If a → b and b → c, then a → c

• If two events, x and y, happen in different processes that do not


exchange messages, then x → y is not true, but neither is y → x. These
events are said to be concurrent.
• No causal relationship between the two events

Lamport’s logical clocks


Coordination Logical clocks

Logical clocks
Problem
How do we maintain a global view of the system’s behavior that is consistent
with the happened-before relation?

Lamport’s logical clocks


Coordination Logical clocks

Logical clocks
Problem
How do we maintain a global view of the system’s behavior that is consistent
with the happened-before relation?

Attach a timestamp C(e) to each event e, satisfying the following


properties:
P1 If a and b are two events in the same process, and a → b, then we
demand that C(a) < C(b).
P2 If a corresponds to sending a message m, and b to the receipt of that
message, then also C(a) < C(b).

Lamport’s logical clocks


Coordination Logical clocks

Logical clocks
Problem
How do we maintain a global view of the system’s behavior that is consistent
with the happened-before relation?

Attach a timestamp C(e) to each event e, satisfying the following


properties:
P1 If a and b are two events in the same process, and a → b, then we
demand that C(a) < C(b).
P2 If a corresponds to sending a message m, and b to the receipt of that
message, then also C(a) < C(b).

Problem
How to attach a timestamp to an event when there’s no global clock ⇒
maintain a consistent set of logical clocks, one per process.

Lamport’s logical clocks


Coordination Logical clocks

Logical clocks: solution


Each process Pi maintains a local counter Ci and adjusts this counter
1. For each new event that takes place within Pi , Ci is incremented by 1.
2. Each time a message m is sent by process Pi , the message receives a
timestamp ts(m) = Ci .
3. Whenever a message m is received by a process Pj , Pj adjusts its local
counter Cj to max{Cj , ts(m)}; then executes step 1 before passing m to
the application.

Notes
• Property P1 is satisfied by (1); Property P2 by (2) and (3).
• It can still occur that two events happen at the same time. Avoid this by
breaking ties through process IDs.

Lamport’s logical clocks


Coordination Logical clocks

Logical clocks: example


Consider three processes with event counters operating at different
rates

Lamport’s logical clocks


Example of Lamport's logical clock
As per Lamport's scalar clock what will be the time for
event 𝑒26 (i.e. 2nd process 6th event)?
As per Lamport's scalar clock what will be the time for
event 𝑒26 (i.e. 2nd process 6th event)?

Answer:11
Vector Clocks
• With Lamport clocks, nothing can be said about the relationship
between two events a and b by merely comparing their time values

• Lamport clocks do not capture causality. Causality is captured using


vector clocks.

• Lamport clocks cannot tell us if a message was concurrent, and


cannot be used to infer causality between events.

• Vector clocks are a more sophisticated variant that gives us more


guarantees, including knowledge of concurrency & causal history.

• Vector Clock assigns timestamps for events in a distributed system.

• A vector clock also gives the ordering of the events.


Coordination Logical clocks

Vector clocks
Observation
Lamport’s clocks do not guarantee that if C(a) < C(b) that a causally
preceded b.

Concurrent message Observation


transmission using logical Event a: m1 is received at T = 16;
clocks Event b: m2 is sent at T = 20.

Vector clocks
Coordination Logical clocks

Vector clocks
Observation
Lamport’s clocks do not guarantee that if C(a) < C(b) that a causally
preceded b.

Concurrent message Observation


transmission using logical Event a: m1 is received at T = 16;
clocks Event b: m2 is sent at T = 20.

Note
We cannot conclude that a causally
precedes b.

Vector clocks
Vector Clock
• Represented by a set of n-dimensional non-negative integer vectors.

• Unlike Lamport clocks, each process in vector clock knows about the
counter values in other processes as well.

• A vector of integer values to represent the timestamp.

• If we have N processes in the group, then each process will have a vector with
N elements.

• Two events are causally related means one’s existence is caused by the other
one (cause-effect).

• An event b is causally dependent on another event a if b happens just because a


has already happened.

• If an event a happened before another event b (a → b), vector clock for b, Vb will
be greater than Va. Similarly, the reverse is also true

• A vector clock Vi is said to be less than another vector clock Vj if all the elements
of Vi are less than or equal to that of Vj
Coordination Logical clocks

Causal dependency
Definition
We say that b may causally depend on a if ts(a) < ts(b), with:
• for all k , ts(a)[k ] ≤ ts(b)[k ] and
• there exists at least one index k′ for which ts(a)[k′] < ts(b)[k′]

Precedence vs. dependency


• We say that a causally precedes b.
• b may causally depend on a, as there may be information from a that is
propagated into b.

Vector clocks
Causal dependency

We can use the vector timestamps to verify if two events are either
causally related or concurrent
Events b and l, d and j
vector clock algorithm

• Initially, all the clocks are set to zero.

• Every time, an Internal event occurs in a process, the value of the process’s
logical clock in the vector is incremented by 1

• Also, every time a process sends a message, the value of the process’s logical
clock in the vector is incremented by 1.

• Every time, a process receives a message, the value of the process’s logical
clock in the vector is incremented by 1

• Each element is updated by taking the maximum of the value in its own
vector clock and the value in the vector in the received message (for
every element)
Coordination Logical clocks

Capturing potential causality


Solution: each Pi maintains a vector VCi
• VCi [i ] is the local logical clock at process Pi .
• If VCi [j ] = k then Pi knows that k events have occurred at Pj .

Maintaining vector clocks


1. Before executing an event, Pi executes VCi [i ] ← VCi [i ] + 1.
2. When process Pi sends a message m to Pj , it sets m’s (vector)
timestamp ts(m) equal to VCi after having executed step 1.
3. Upon the receipt of a message m, process Pj sets
VCj [k ] ← max{VCj [k ], ts(m)[k ]}for each k , after which it executes step 1
and then delivers the message to the application.

Vector clocks
Coordination Logical clocks

Vector clocks: Example


Capturing potential causality when exchanging messages

(a) (b)

Analysis

Situation ts(m2 ) ts(m4 ) ts(m2 ) ts(m2 ) Conclusion


< >
ts(m4 ) ts(m4 )
(a) (2, 1, 0) (4, 3, 0) Yes No m2 may causally precede m4
(b) (4, 1, 0) (2, 3, 0) No No m2 and m4 may conflict

Vector clocks
Coordination Logical clocks

Totally ordered multicast is a communication protocol used in


distributed systems to ensure that all messages are delivered to all
processes in the same order, regardless of when or where the message
originated.
E.g., Concurrent updates on a replicated database are seen in the same order
everywhere
• P1 adds $100 to an account (initial value: $1000)
• P2 increments account by 1%
• There are two replicas

Result
In absence of proper synchronization:
replica #1 ← $1111, while replica #2 ← $1110.

Lamport’s logical clocks


Coordination Logical clocks

Causally ordered multicasting


Observation
We can now ensure that a message is delivered only if all causally preceding
messages have already been delivered.

Adjustment
Pi increments VCi [i ] only when sending a message, and Pj “adjusts” VCj
when receiving a message (i.e., effectively does not change VCj [j ]).

Pj postpones delivery of m until:


1. ts(m)[i ] = VCj [i ] + 1
2. ts(m)[k ] ≤ VCj [k ] for all k ̸= i

Vector clocks
Coordination Logical clocks

Causally ordered multicasting


Enforcing causal communication

Vector clocks
Coordination Logical clocks

Causally ordered multicasting


Enforcing causal communication

Example
Take VC3 = [0, 2, 2], ts(m) = [1, 3, 0] from P1. What information does P3 have,
and what will it do when receiving m (from P1)?

Vector clocks
Coordination Mutual exclusion

Mutual exclusion
Problem
Several processes in a distributed system want exclusive access to some
resource.

Basic solutions
Permission-based: A process wanting to enter its critical region, or access a
resource, needs permission from other processes.
Token-based: A token is passed between processes. The one who has the
token may proceed in its critical region, or pass it on when not
interested.

Overview
Coordination Mutual exclusion

Permission-based, centralized
Simply use a coordinator

(a) (b) (c)


(a) Process P1 asks the coordinator for permission to access a shared
resource. Permission is granted.
(b) Process P2 then asks permission to access the same resource. The
coordinator does not reply.
(c) When P1 releases the resource, it tells the coordinator, which then replies
to P2 .

A centralized algorithm
Coordination Mutual exclusion

Permission-based, centralized
Simply use a coordinator
Pros: a ) Fair – Request granted in order they are received and no process
waits forever.
b ) Simple – Three messages per resource.
Cons: Coordinator- single point of failure; (can this be a Pro?)
If no response, a process cannot distinguish a dead
coordinator from "permission denied"

A centralized algorithm
Coordination Mutual exclusion

Mutual exclusion: Ricart & Agrawala - A distributed algorithm


The same as Lamport except that acknowledgments are not sent
• When a process wants to access a shared resource, it builds a message
containing the name of the resource, its process number, and the current
(logical) time
• It then sends the message to all other processes, conceptually including
itself.
• If the receiver is not accessing the resource and does not want to access
it, it sends back an OK message to the sender.
• If the receiver already has access to the resource, it simply does not
reply. Instead, it queues the request.
• If the receiver wants to access the resource as well but has not yet done
so, it compares the timestamp of the incoming message with the one
contained in the message that it has sent to everyone. The lowest one
wins.
• If the incoming message has a lower timestamp, the receiver sends back
an OK message.
• If its own message has a lower timestamp, the receiver queues the
incoming request and sends nothing
A distributed algorithm
Coordination Mutual exclusion

Mutual exclusion: Ricart & Agrawala - A distributed algorithm


The same as Lamport except that acknowledgments are not sent
As soon as all the permissions are in, it may go ahead. When it is finished, it
sends OK messages to all processes in its queue and deletes them all from
the queue.

Return a response to a request only when:


• The receiving process has no interest in the shared resource; or
• The receiving process is waiting for the resource, but has lower priority
(known through comparison of timestamps).
In all other cases, reply is deferred, implying some more local administration.
• If the receiver already has access to the resource, it simply does not
reply. Instead, it queues the request.
• If its own message has a lower timestamp, the receiver queues the
incoming request and sends nothing.

A distributed algorithm
Coordination Mutual exclusion

Mutual exclusion: Ricart & Agrawala


Example with three processes

(a) (b) (c)

(a) Two processes want to access a shared resource at the same moment.
(b) P0 has the lowest timestamp, so it wins.
(c) When process P0 is done, it sends an OK also, so P2 can now go ahead.

A distributed algorithm
Coordination Mutual exclusion

Mutual exclusion: Ricart & Agrawala

Pros: No deadlock or starvation


Cons: a ) If N no. of processes, then N points of failure; If any process fails,
it blocks all subsequent attempts to access the resource.
b ) All processes involved in all decisions – burden on processes
running on resource-constrained machines

A distributed algorithm
Coordination Mutual exclusion

Mutual exclusion: Token ring algorithm


Essence
Organize processes in a logical ring, and let a token be passed between them.
The one that holds the token is allowed to enter the critical region (if it wants
to).

An overlay network constructed as a logical ring with a circulating token

A token-ring algorithm
Coordination Mutual exclusion

Mutual exclusion: Token ring algorithm

Pros: Token circulates in a defined order. So no starvation

Cons: Lost token difficult to identify since no time bound between successive
appearances of token on the network.

The algorithm also runs into trouble if a process crashes, but recovery is relatively easy. If we require a process
receiving the token to acknowledge receipt, a dead process will be detected when its neighbor tries to give it the
token and fails. At that point, the dead process can be removed from the group, and the token holder can throw the
token over the head of the dead process to the next member

A token-ring algorithm
Coordination Mutual exclusion

Decentralized mutual exclusion


Principle
Assume every resource is replicated N times, with each replica having its own
coordinator ⇒ access requires a majority vote from m > N/2 coordinators. A
coordinator always responds immediately to a request.

Assumption
When a coordinator crashes, it will recover quickly, but will have forgotten
about permissions it had granted.

A decentralized algorithm
Coordination Election algorithms

Election algorithms
Principle
An algorithm requires that some process acts as a coordinator. The question is
how to select this special process dynamically.

Note
In many systems, the coordinator is chosen manually (e.g., file servers). This
leads to centralized solutions ⇒ single point of failure.
Coordination Election algorithms

Election algorithms
Principle
An algorithm requires that some process acts as a coordinator. The question is
how to select this special process dynamically.

Note
In many systems, the coordinator is chosen manually (e.g., file servers). This
leads to centralized solutions ⇒ single point of failure.

Teasers
1. If a coordinator is chosen dynamically, to what extent can we speak about
a centralized or distributed solution?
2. Is a fully distributed solution, i.e. one without a coordinator, always more
robust than any centralized/coordinated solution?
Coordination Election algorithms

Basic assumptions

• All processes have unique id’s


• All processes know id’s of all processes in the system (but not if they are
up or down)
• Election means identifying the process with the highest id that is up
Coordination Election algorithms

Election by bullying
Principle
Consider N processes {P0 , . . . , PN−1} and let id (Pk ) = k . When a process Pk
notices that the coordinator is no longer responding to requests, it initiates an
election:
1. Pk sends an ELECTION message to all processes with higher identifiers:
Pk + 1 , Pk + 2 , . . . , PN−1.
2. If no one responds, Pk wins the election and becomes coordinator.
3. If one of the higher-ups answers, it takes over and Pk ’s job is done.

The bully algorithm


Coordination Election algorithms

Election by bullying
The bully election algorithm

The bully algorithm


Coordination Election algorithms

Election in a ring
Principle
Process priority is obtained by organizing processes into a (logical) ring. The
process with the highest priority should be elected as coordinator.
• Any process can start an election by sending an election message to its
successor. If a successor is down, the message is passed on to the next
successor.
• If a message is passed on, the sender adds itself to the list. When it gets
back to the initiator, everyone had a chance to make its presence known.
• The initiator sends a coordinator message around the ring containing a
list of all living processes. The one with the highest priority is elected as
coordinator.

A ring algorithm
Coordination Election algorithms

Election in a ring
Election algorithm using a ring

• The solid line shows the election messages initiated by P6


• The dashed one, the messages by P3

A ring algorithm

You might also like