0% found this document useful (0 votes)
51 views3 pages

Distributed Computing Techniques

Uploaded by

alainemario
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
51 views3 pages

Distributed Computing Techniques

Uploaded by

alainemario
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

EnggTree.

com
CS3551 DISTRIBUTED COMPUTING

 When a failure occurs, the recovering process initiates rollback by broadcasting a


dependency request message to collect all the dependency information maintained by each
process.

 When a process receives this message, it stops its execution and replies with the
dependency information saved on the stable storage as well as with the dependency
information, if any, which is associated with its current state.
www.EnggTree.com
 The initiator then calculates the recovery line based on the global dependency information
and broadcasts a rollback request message containing the recovery line.

 Upon receiving this message, a process whose current state belongs to the recovery line
simply resumes execution; otherwise, it rolls back to an earlier checkpoint as indicated by
the recovery line. 
2. Coordinated Checkpointing
In coordinated checkpointing, processes orchestrate their checkpointing activities so that all
local checkpoints form a consistent global state
Types
1. Blocking Checkpointing: After a process takes a local checkpoint, to prevent orphan
messages, it remains blocked until the entire checkpointing activity is complete
Disadvantages: The computation is blocked during the checkpointing
2. Non-blocking Checkpointing: The processes need not stop their execution while taking
checkpoints. A fundamental problem in coordinated checkpointing is to prevent a process
from receiving application messages that could make the checkpoint inconsistent.

M.A.M COLLEGE OF ENGINEERING

Downloaded from EnggTree.com


EnggTree.com
CS3551 DISTRIBUTED COMPUTING

Example (a) : Checkpoint inconsistency


 Message m is sent by 𝑃0 after receiving a checkpoint request from the checkpoint
coordinator
 Assume m reaches 𝑃1 before the checkpoint request
 This situation results in an inconsistent checkpoint since checkpoint 𝐶1,𝑥 shows the receipt
of message m from 𝑃0, while checkpoint 𝐶0,𝑥 does not show m being sent from
𝑃0
Example (b) : A solution with FIFO channels
 If channels are FIFO, this problem can be avoided by preceding the first post-checkpoint
message on each channel by a checkpoint request, forcing each process to take a checkpoint
before receiving the first post-checkpoint message

www.EnggTree.com

Impossibility of min-process non-blocking checkpointing


 A min-process, non-blocking checkpointing algorithm is one that forces only a minimum
number of processes to take a new checkpoint, and at the same time it does not force any
process to suspend its computation.

M.A.M COLLEGE OF ENGINEERING

Downloaded from EnggTree.com


EnggTree.com
CS3551 DISTRIBUTED COMPUTING

Algorithm
 The algorithm consists of two phases. During the first phase, the checkpoint initiator
identifies all processes with which it has communicated since the last checkpoint and sends
them a request.
 Upon receiving the request, each process in turn identifies all processes it has
communicated with since the last checkpoint and sends them a request, and so on, until
no more processes can be identified.
 During the second phase, all processes identified in the first phase take a checkpoint. The
result is a consistent checkpoint that involves only the participating processes.

 In this protocol, after a process takes a checkpoint, it cannot send any message until the
second phase terminates successfully, although receiving a message after the checkpoint
has been taken is allowable.
3. Communication-induced Checkpointing
Communication-induced checkpointing is another way to avoid the domino effect, while allowing
processes to take some of their checkpoints independently. Processes may be forced to take
additional checkpoints
www.EnggTree.com
Two types of checkpoints
1. Autonomous checkpoints
2. Forced checkpoints
The checkpoints that a process takes independently are called local checkpoints, while those that
a process is forced to take are called forced checkpoints.
 Communication-induced check pointing piggybacks protocol- related information on
each application message
 The receiver of each application message uses the piggybacked information to determine
if it has to take a forced checkpoint to advance the global recovery line
 The forced checkpoint must be taken before the application may process the contents of
the message
 In contrast with coordinated check pointing, no special coordination messages are
exchanged
Two types of communication-induced checkpointing
1. Model-based checkpointing
2. Index-based checkpointing.

M.A.M COLLEGE OF ENGINEERING

Downloaded from EnggTree.com

You might also like