0% found this document useful (0 votes)
217 views16 pages

Distributed Deadlock Detection: COP 4610 Notes On Deadlock

This document discusses distributed deadlock detection in operating systems. It describes different models and strategies for distributed deadlock handling, including prevention, avoidance and detection. It also categorizes and explains several algorithms for distributed deadlock detection, such as centralized control, edge-chasing and diffusion-based approaches.

Uploaded by

Chakraborty S
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
217 views16 pages

Distributed Deadlock Detection: COP 4610 Notes On Deadlock

This document discusses distributed deadlock detection in operating systems. It describes different models and strategies for distributed deadlock handling, including prevention, avoidance and detection. It also categorizes and explains several algorithms for distributed deadlock detection, such as centralized control, edge-chasing and diffusion-based approaches.

Uploaded by

Chakraborty S
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 16

COP 5611: OPERATING SYSTEMS -Spring 2003- Ted Baker

Distributed Deadlock Detection


These notes were used in a subsititute lecture, for Xiuwen Liu's class in Spring 2003.

The material is mainly from Chapter 7 (Distributed Deadlock Detection) in Advanced


Concepts in OS.

You should be already familiar with deadlock handling in single-processor systems, from the
undergraduate course that is prerequisite to this one. If not, you should review it now, as
background for distributed deadlock.

For reference, see COP 4610 Notes on Deadlock.

Beyond that, you should have read and be familiar with Chapter 3 of the text for this course,
especially the sections on graph models for deadlock detection.

Distributed Deadlock Models


Based on WFG (not GRG)

 Nodes are processes


 Resources are located at a site, but may be held by processes at other sites
 Edge (P,Q) indicates P is blocked and waiting for Q to release some resource
 Deadlock exists iff there is a directed cycle or knot (depends on request model -- see
Chapter 3)

The text fudges a bit by forgetting that in some models the existence of a cycle is necessary
but not sufficient, and the existence of a knot is sufficient but not necessary, leaving a gap.

Distributed Deadlock Handling Strategies


 prevention -- too restrictive?
 avoidance -- too complicated, slow?
 detection -- the usual approach

Distributed Deadlock Detection Issues & Requirements


 issues
o maintenance of WFG
o detection of cycles (or knots) in the WFG
 requirements
o progress = no undetected deadlocks
o safety = no false deadlocks

What do we do when a deadlock is detected?

Categorization of Methods
 centralized control
 distributed control
 hierarchical control

Simple Centralized Control


 one control site maintains the WFG and checks for deadlock
 other sites send request and release messages to control site

What are the advantages and disadvantages?

False Deadlock Example


An external observer can see deadlock where there is none.

4 sites:

 R1 stored at S1
 R2 stored at S2
 T1 runs at S3
 T2 runs at S4

Two transactions run concurrently:

T2: lock R1 T2: unlock R1 T2: lock R2


T1: lock R1 T : unlock R2
T1: unlock R1 T1: lock R2 T1: unlock R2 2

False Deadlock Example: Event Trace Diagram


False Deadlock Snapshots
What if each site maintains its local view of the system state, and periodically sends this to the
control site?

Ho-Ramamoorthy Algorithm: Two-Phase


 each site maintains table with status of all local processes
 control site periodically requests status from all sites, builds WFG, and checks for
deadlock
 if deadlock detected, control site repeats status requests, but throws out transactions
that have changed
(trying to avoid false deadlock)

May still report false deadlocks.

What lesson(s) does this teach?

Ho-Ramamoorthy Algorithm: One-Phase


 each site maintains two tables:
o resource status (for all local resources)
o process status (for all local processes)
 control site periodically requests copies of tables, builds WFG, and checks for
deadlock
 transactions are not used unless the process table info agrees with the resource table
info

Notice the similarity in principle here to the Chandy-Lamport global state recording
algorithm, i.e., we need to capture not just the states of the processes but also the states of the
messages in transit.

Classification of Distributed Detection Algorithms


 path-pushing
o path information transmitted, accumulated
 edge-chasing
o ``I'm waiting for you'' probes are sent along edges
o single returned probe indicates a cycle
 diffusion
o ``Are you blocked?'' probes are sent along all edges
o all queries returned indicates existence of a knot (necessary and sufficient
condition for deadlock in an expedient general resource graph with single unit
requests), {\it i.e.} there is no way out of the blockage
 global state detection
o take and use snapshot of system state

Obermarck's Path-Pushing Algorithm


 designed for distributed database systems
 processes are called ``transactions'' T1, T2, Tn
 there is special virtual node Ex
 transactions are totally ordered

How can we totally order the transactions?

Obermarck's Path-Pushing Algorithm


1. wait for info from previous iteration of Step 3
2. combine received info with local TWFG
detect all cycles
break local cycles
3. send each cycle including the node Ex
to the external nodes it is waiting for
4. time-saver: only send path to other sites if first
transaction is higher in lexical order than the last

Note: TWF = "transaction wait-for graph", just another name for a WFG, when applied to
transactions.

Note: In the figures below, the rule above is reversed, i.e., we send the path of the first
transaction is lower in lexical order than the first. This was an accident when I drew up the
figures, but it does not matter. Either rule will work. The essential idea is simply to have one
canonical representation of each path.
Note: The example above does not fully illustrate all the things that can go on in a more
complicated graph. In particular, there are no nodes with multi-node AND-dependences.
Problem and Questions on Obermarck's Path-Pushing
Algorithm
Detects false deadlocks, due to asynchronous snapshots at different sites. (lesson?)

Message complexity? Message size? Detection delay?

Exactly how are paths combined and checked?

Obermarck's Path-Pushing Algorithm: Performance


 O(n(n1)/2) messages
 O(n) message size
 O(n) delay to detect deadlock

The O(n(n1)/2) message bound is a little bit misleading, since the algorithm is described as
being run continually, in an iterative fashion. That is, the sites exchange paths continually,
even when no deadlock is detected.

Chandy-Misra-Haas Edge-Chasing Algorithm


 for AND request model
 probe= (i,j,k) is sent for
detection initiated by Pi,
by site of Pj to site of Pk
 deadlock is detected when a probe returns to its initiator

Terminology
 Pj is dependent on Pk if there is a sequence Pj,Pi1,Pi2,Pim, Pk such that each process
except Pk is blocked and each process except the first holds a resource for which the
previous process is waiting
 Pj is locally dependent on Pk if it is dependent and both processes are at the same site
 array dependenti(j) = true  Pi knows that Pj is dependent on Pi

Algorithm Initiation by Pi
if Pi is locally dependent on itself then declare a deadlock
else send probe (i, j, k) to home site of Pk for each j, k such that all of the following hold

 Pi is locally dependent on Pj
 Pj is waiting on Pk
 Pj and Pk are on different sites

Algorithm on receipt of probe (i,j,k) by node of Pk


check the following conditions

 Pk is blocked
 dependentk(i) = false
(Pk does not yet know that Pi is dependent on Pk)
 Pk has not replied to all requests of Pj

if these are all true, do the following

 set dependentk(i) = true


(Pk now knows that Pi is dependent on Pk)
 if k=i declare that Pi is deadlocked
 else send probe (i,m,n) to the home site of Pn for every m and n such that the following
all hold
o Pk is locally dependent on Pm
o Pm is waiting on Pn
o Pm and Pn are on different sites
What is the purpose of the dependent relation?

The above example does not include any nontrivial AND-requests. What is it about this
algorithm that allows it to handle AND-requests, but not OR-requests?

Chandy Misra Haas Complexity


 What is the message complexity?
 What is delay in detection?

Analysis
 m(n1)/2 messages for m processes at n sites
 3-word message length
 O(n) delay to detect deadlock

Other Edge-Chasing Algorithms


We will not have time to go into these in detail in class.
 Mitchell-Meritt
o each node has two counters, called public and private labels
o propagates largest public label backwards in TWFG along cycle
 Sinha-Natarajan
o uses probes (initiator, lowest_trans_traversed_so_far)
o Suffers from false deadlocks, and also misses deadlocks
o "Correction" by Choudary et al. has same defects
o What is the lesson here?

Diffusion Based Algorithms: Chandy et al.


 for OR request model
 processes are active or blocked
 whenever a process blocks it starts a diffusion (!!)
 if deadlock is not detected, the process will eventually unblock and terminate the
algorithm
 message = query (i,j,k)
o i = initiator of check
o j = immediate sender
o k = immediate recipient
 reply = reply (i,k,j)

On receipt of query(i,j,k) by m
 if not blocked then discard the query
 if blocked
o if this is an engaging query
propagate query(i,k,m) to dependent set of m
o else
 if not continously blocked since engagement then discard the query
 else send reply(i,k,j) to j
On receipt of reply(i,j,k) by k
 if this is not the last reply
then just decrement the awaited reply count
 if this is the last reply then
o if i=k report a deadlock
o else send reply(i,k,m)
to the engaging process m

The black dashed arrows indicate the engaging process for each engaged process. This
information is needed to route the reply when the number of other processes for which the
engaged process is waiting goes to zero.
Observe that the engaging process arrows form a spanning tree of the subgraph corresponding
to the set of process for which the initiating process is waiting. If every process in this
subgraph is blocked, we have a knot.

At this point, a knot has been detected.

Global State Detection Based Algorithms


 take snapshot of distributed WFG
 use graph reduction to check for deadlock

Details differ.

Recall: What is graph reduction?

Graph Reduction
General idea: simulate the result of execution, assuming all unblocked processes complete
without requesting any more resources
 while there is an unblocked process, remove the process and all (resource-holding)
edges to it
 there is deadlock if the remaining graph is non-null

If you have never done one, it would be a good idea to work through an example of graph
reduction. In particular, note that backtracking may be necessary with the more complex
resource models such as consumable resource graphs.

© 1999, 2001, 2003 T. P. Baker


Last updated by Author: baker on Date: 2003/02/18

You might also like