University of Pennsylvania
CIS 5050
Software Systems
Linh Thi Xuan Phan
Department of Computer and Information Science
University of Pennsylvania
Lecture 13: Fault tolerance
March 28, 2024
©2016-2024 Linh Thi Xuan Phan
Announcements
• Each team please email me the GitHub IDs of your
team members
• Needed for creating the team GitHub repository
2
©2016-2024 Linh Thi Xuan Phan
Recall: Replication
• How does replication help?
– Durability: Once data is stored, the system should not lose it
– Availability: Data should be accessible whenever we need it
– Speed: Accessing the data should be fast
– Scalability: Distribute load and functionality across machines
• What do we need for replication?
– A replication protocol (how to propagate updates to all replicas)
– A consistency model (what is the ‘right’ value to return to the client)
• Are there any new challenges with replication?
3
©2016-2024 Linh Thi Xuan Phan
Plan for today
• Distributed commit NEXT
– Two-phase commit (2PC)
– Three-phase commit (3PC)
• Logging and recovery
– Centralized checkpointing
– Chandy-Lamport algorithm
4
©2016-2024 Linh Thi Xuan Phan
Why distributed commit?
• Suppose a large bank is operating a ‘shared’
account database
– Example: Node #1 has account data for customers whose first
names start with A, node #2 has B, node #3 C, ...
• Now suppose Alice wants to send $100 to Bob
– This involves changes on two separate nodes: Adding $100 to Bob's
account (node #2), and taking $100 from Alice's account (node #1)
– What if node #2 already finished the former, and then node #1
notices that there isn't enough money in Alice's account?
– What if node #1 finishes its part, but node #2 crashes before it can
complete the operation?
– ...
5
©2016-2024 Linh Thi Xuan Phan
Atomicity
• Goal: We need to ensure atomicity
– Either all parts of the transaction are completed, or none of them is!
– This is one of the four classical ACID properties from databases
• Atomicity, Consistency, Isolation, Durability
• How can we ensure atomicity?
• Idea: Let's do one-phase commit
– We elect one node as the coordinator; the others are subordinates
– The coordinator tells all the subordinates whether to finalize their part of
the transaction (commit), or whether to undo it (abort)
– The subordinates do what they are told, and then acknowledge
• Is this a good solution?
6
©2016-2024 Linh Thi Xuan Phan
Why one-phase commit fails
[Figure: the coordinator sends COMMIT messages to subordinates A, B, and C, but crashes in the middle.
One subordinate answers "OK!", another protests "But I already aborted!", and a third never receives
the message ("???").]
• Problem #1: Subordinate cannot independently abort the
transaction
• Problem #2: Coordinator might crash before subordinate
receives the message; partial results are lost
7
©2016-2024 Linh Thi Xuan Phan
Two-Phase Commit (2PC)
• Idea: We need two rounds of communication!
• First round: Voting
– Coordinator sends prepare message to each subordinate
– Each subordinate responds with yes if it is able to commit its part of
the transaction, otherwise with no
• If the subordinate needs locks to commit, it needs to acquire them before
responding with yes (why?)
• Second round: Decision
– Coordinator sends commit or abort to each subordinate
– Each subordinate responds with an ack
• Result: Any site can decide to abort a transaction!
8
©2016-2024 Linh Thi Xuan Phan
2PC: What about crashes?
• We also need to handle the case where a node
crashes in the middle
– Nodes need to be able to 'remember' some information!
• Idea: Each node is given a local, persistent log
– This log could be stored on disk, or in NVRAM
• Is this enough to ensure that data remains persistent despite a possible crash?
– When a node recovers after a crash, it can look at its local log to
figure out what the next steps should be
9
©2016-2024 Linh Thi Xuan Phan
2PC: Steps in more detail
• When a transaction wants to commit:
– The coordinator sends a prepare message to each subordinate
– Subordinate force-writes an abort or prepare log record, then
sends a no (abort) or yes (prepare) message to coordinator
– The coordinator then considers the votes:
• If everyone has voted yes, it force-writes a commit log record and sends
commit message to all subordinates
• Else, it force-writes an abort log record and sends an abort message
– The subordinates force-write abort/commit log records based on
the message they get, and then send an ack message to
coordinator
– The coordinator writes an end log record after getting all the
acks
• Why is the 'end' record useful?
10
©2016-2024 Linh Thi Xuan Phan
2PC: Protocol (1/2)
void coordinator(Transaction t, Set nodes)
{
    log.write("BEGIN");
    foreach (n : nodes)
        send(n, "PREPARE");

    Set responses = new Set();
    bool allInFavor = true;
    if (!t.localPartCanCommit())
        allInFavor = false;

    while (!responses.equals(nodes) &&
           !timeout() && allInFavor) {
        Node sender;
        Message msg = recv(&sender);
        responses.add(sender);
        if (msg == "NO")
            allInFavor = false;
    }
    if (timeout())
        allInFavor = false;

    String result;
    if (allInFavor)
        result = "COMMIT";
    else
        result = "ABORT";

    log.write(result);    // commits the result
    foreach (n : nodes)
        send(n, result);
    if (result == "COMMIT")
        t.performLocalPart();

    Set finished = new Set();
    while (!finished.equals(nodes)) {
        Node sender;
        Message msg = recv(&sender);
        if (msg == "STATUS?")
            send(sender, result);
        if (msg == "ACK")
            finished.add(sender);
    }
    log.write("END");
}
11
©2016-2024 Linh Thi Xuan Phan
2PC: Protocol (2/2)
void subordinate(Transaction t, Node coordinator)
{
    log.write("BEGIN");
    while (true) {
        Message msg = recvFrom(coordinator);
        if (msg == "PREPARE") {
            if (t.localPartCanCommit()) {
                log.write("PREPARE");
                send(coordinator, "YES");
            } else {
                log.write("ABORT");
                send(coordinator, "NO");
            }
        } else if (msg == "COMMIT") {
            log.write("COMMIT");
            t.performLocalPart();
            log.write("END");
            send(coordinator, "ACK");
            break;
        } else if (msg == "ABORT") {
            log.write("ABORT");
            log.write("END");
            send(coordinator, "ACK");
            break;
        }
    }
}
12
©2016-2024 Linh Thi Xuan Phan
2PC: Illustration
• Message flow (coordinator on the left, Subordinates 1 and 2 on the right):
– Coordinator force-writes a begin log entry, then sends "prepare" to each subordinate
– Each subordinate force-writes a prepare log entry and sends back "yes"
– Coordinator force-writes a commit log entry (this is the commit point), then sends "commit" to each subordinate
– Each subordinate force-writes a commit log entry and sends back "ack"
– Coordinator writes an end log entry
©2016-2024 Linh Thi Xuan Phan 13
2PC: Some observations
• All log records for a transaction contain its ID and
the coordinator’s ID
– The coordinator’s abort/commit record also includes IDs of all
subordinates (why?)
• Every message reflects a decision by the sender
– To ensure that this decision survives failures, it is first recorded in
the local log
• There exists no distributed commit protocol that
can recover without communicating with other
processes, in the presence of multiple failures!
©2016-2024 Linh Thi Xuan Phan
What if a node fails in the middle?
• Suppose we find a commit or abort log record for
transaction T, but not an end record?
– Need to redo/undo T
– If this node is the coordinator for T, keep sending commit / abort
messages to subordinates until acks have been received
• Suppose we find a prepare log record for
transaction T, but not commit/abort?
– This node is a subordinate for T
– Repeatedly contact the coordinator to find status of T, then write
commit/abort log record; redo/undo T; and write end log record
• Suppose we don’t find even a prepare record for T?
– Unilaterally abort and undo T
– This site may be the coordinator! If so, subordinates may send messages asking about T;
their parts of T must be aborted and undone as well
©2016-2024 Linh Thi Xuan Phan
Coordinator failure
• What should a subordinate do when the
coordinator fails after it has already voted yes?
– Problem: Cannot decide whether to commit or abort T until
coordinator recovers - T is blocked!
– Consequences?
• Suppose all the subordinates know each other?
– Can be implemented (requires extra overhead in prepare msg)
– But: They are still blocked, unless one of them voted no, or one of
them has already received the coordinator's decision
©2016-2024 Linh Thi Xuan Phan
Link and remote site failures
• What should a node do when a remote node does
not respond during the commit protocol for
transaction T (either because the remote node
failed or the link failed)?
– If the current node is the coordinator for T, it should abort T
– If the current node is a subordinate, and has not yet voted yes, it
should abort T
– If the current node is a subordinate and has voted yes, it is blocked
until the coordinator responds!
• Can we do better?
©2016-2024 Linh Thi Xuan Phan
Plan for today
• Distributed commit
– Two-phase commit (2PC)
– Three-phase commit (3PC) NEXT
• Logging and recovery
– Centralized checkpointing
– Chandy-Lamport algorithm
18
©2016-2024 Linh Thi Xuan Phan
How can we improve 2PC?
• What is the real reason why 2PC can block?
– Suppose both the coordinator and a subordinate crash
– The decision could have been COMMIT, and the subordinate may
have already completed the operation
– But the other subordinates have no way to distinguish this from the
situation where the decision was ABORT
• Idea: Let's make sure that the subordinates know
the decision before they execute anything!
– When the coordinator has received yes votes from all the subordinates, it tells them
that there will be a COMMIT ("PRECOMMIT")
– Once everyone (or at least a large number) acknowledges the
receipt, the coordinator then sends the actual COMMIT message
– What happens now when the coordinator fails?
• ... during the PRECOMMIT phase? ... during the actual COMMIT phase? 19
©2016-2024 Linh Thi Xuan Phan
Three-phase commit (3PC)
• Phase 1: Voting
– Coordinator sends PREPARE to each subordinate
– Each subordinate votes either YES or NO
• Phase 2: Precommit
– If at least one vote is NO, the coordinator sends ABORT as before
– If all votes are YES, the coordinator sends PRECOMMIT to
at least k subordinates (where k is a tunable parameter)
• OK to send more PRECOMMITs if there are not enough responses
– Each subordinate replies with an ACK
• Phase 3: Commit
– Once the coordinator has received k ACKs, it sends COMMIT to
each subordinate
– Each subordinate responds with an ACK
20
©2016-2024 Linh Thi Xuan Phan
3PC: Handling coordinator failures
• What if some nodes fail, including the coordinator?
– Remaining nodes ask each other what the coordinator has told them
• Situation #1: Nobody has seen a PRECOMMIT
– 2PC would have blocked in this case!
– But with 3PC, the remaining nodes can safely ABORT, since the
failed nodes could not have made any changes yet
• ... at least unless more than k nodes have failed! (why?)
• Situation #2: At least one PRECOMMIT or COMMIT
– The remaining subordinates know that the decision was (or was
going to be) COMMIT
– They can all decide to go ahead and COMMIT; once the other nodes
come back up, they can learn about this decision and COMMIT too
• Situation #3: Network partition
– 3PC isn't safe in this case! (why?)
21
©2016-2024 Linh Thi Xuan Phan
Recap: Distributed commit
• Goal: Do something on all nodes, or none of them
– A very common requirement in distributed systems
– Naïve solution (one-phase commit) isn't safe
• We have seen two solutions: 2PC and 3PC
– The key idea is for the nodes to 'vote' whether to commit or abort
– A coordinator is elected to collect votes and broadcast the decision
– 2PC is much simpler, but can block when the coordinator fails
– 3PC doesn't have that problem, but it is more complicated
– Neither 2PC nor 3PC can tolerate network partitions
22
©2016-2024 Linh Thi Xuan Phan
Plan for today
• Distributed commit
– Two-phase commit (2PC)
– Three-phase commit (3PC)
• Logging and recovery NEXT
– Centralized checkpointing
– Chandy-Lamport algorithm
23
©2016-2024 Linh Thi Xuan Phan
Why logging and recovery?
• Suppose a distributed system fails in the middle of
a long and expensive computation
• What should we do?
– Restarting from scratch may be expensive, or even impossible!
• Idea: Periodically record a checkpoint of the system
– Each node writes down ("logs") its current state, e.g., on its local disk
– If something bad happens, the nodes can go back to the latest
checkpoint and then continue from there!
24
©2016-2024 Linh Thi Xuan Phan
Message logging
• What if the latest checkpoint was some time ago?
– Rolling back the entire system would destroy a lot of useful work
• Idea: Use message logging + deterministic replay
– Whenever a message is sent, it is saved somewhere – either by the
sender (sender-based logging) or by the recipient (receiver-based)
– Nodes also record all nondeterministic events
• What are these? Why do they need to be remembered?
– When a node crashes, we can roll back only that node to its latest
checkpoint, and then feed it all the recorded messages and
nondeterministic events
– This will bring the node back to the state it was before the crash
• What are we assuming about the software on the node?
• Does this assumption generally hold? What does it take to make it hold?
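• A tiny, self-contained sketch of the idea in Python (the "node" is just a
deterministic counter, and the log is an in-memory list standing in for sender- or
receiver-based message logging):

# Sketch: checkpoint + message log + deterministic replay on a single node.
class Node:
    def __init__(self):             self.state = 0
    def handle(self, msg):          self.state += msg      # deterministic processing
    def checkpoint(self):           return self.state
    def restore(self, snapshot):    self.state = snapshot

node, log_since_checkpoint = Node(), []

node.handle(5); node.handle(7)
snapshot = node.checkpoint()                # periodic checkpoint (state = 12)

for msg in [3, 11]:                         # messages after the checkpoint are logged
    node.handle(msg); log_since_checkpoint.append(msg)

state_before_crash = node.state             # 26
node = Node()                               # "crash": volatile state is lost
node.restore(snapshot)                      # roll back only this node...
for msg in log_since_checkpoint:            # ...and replay the logged messages
    node.handle(msg)
assert node.state == state_before_crash     # back to the pre-crash state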
25
©2016-2024 Linh Thi Xuan Phan
Checkpointing on a single node
• How would you record a checkpoint on one node?
• Idea #1: Just write its memory contents to disk!
– Problem: Write can take a long time!
– If the node keeps running during that time, the checkpoint will be
inconsistent: the state at the beginning is older than that at the end
– The node may never have been in the state that the checkpoint
describes (at least not at any given time)
• Idea #2: Stop the node during the checkpoint
– Not ideal either – system will be unresponsive during that time!
– Actual checkpointing systems use techniques like copy-on-write
(CoW) to take a checkpoint in memory very quickly (a few ms)
• This can then be written to disk asynchronously!
– But this trick won't work if we have a distributed system!
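• One concrete way to get a fast in-memory checkpoint on a single Unix node is fork():
the child gets a copy-on-write view of the parent's memory and can write it to disk
while the parent keeps running (this is how Redis takes its background snapshots, for
example). A minimal sketch, assuming a POSIX system:

# Sketch: fork-based copy-on-write checkpoint of in-memory state (POSIX only).
import json, os

state = {"counter": 42, "items": ["a", "b"]}    # the state we want to checkpoint

pid = os.fork()
if pid == 0:
    # Child: sees a CoW snapshot of `state` as of the fork.
    with open("checkpoint.json", "w") as f:
        json.dump(state, f)
    os._exit(0)
else:
    # Parent: keeps running and may mutate its own copy immediately.
    state["counter"] += 1
    os.waitpid(pid, 0)
    print("checkpoint written; live counter is", state["counter"])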
26
©2016-2024 Linh Thi Xuan Phan
Consistent vs. inconsistent cuts
[Figure: two process timelines, A with events A0..A3 and B with events B0..B3, connected by
Message 1 and Message 2. Two cuts are drawn: one labeled "Inconsistent cut", the other
"Consistent cut".]
• We can define a cut of the distributed execution
– Basically, one prefix of each node's local execution, taken together
• When can we call a cut consistent?
– If, for every event it contains, it also contains all events that
'happened before' that event.
– In particular, every received message must have been sent
• Which of the above cuts are consistent?
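• The definition can be checked mechanically: a cut is consistent iff every receive event
it contains also has its matching send inside the cut. A small illustrative check in
Python (the example message is hypothetical):

# Sketch: cut = {node: number of events included}; events on each node are 0,1,2,...
# messages = list of ((send_node, send_idx), (recv_node, recv_idx)) pairs.
def is_consistent(cut, messages):
    def inside(node, idx):
        return idx < cut[node]
    # consistent iff no message is received inside the cut but sent outside it
    return all(inside(*send) or not inside(*recv) for send, recv in messages)

msgs = [(("A", 1), ("B", 2))]                  # sent as A's event 1, received as B's event 2
print(is_consistent({"A": 1, "B": 3}, msgs))   # False: receive is in the cut, send is not
print(is_consistent({"A": 2, "B": 3}, msgs))   # True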
27
©2016-2024 Linh Thi Xuan Phan
Recovery lines
[Figure: node timelines with checkpoints marked; after a failure, one set of checkpoints
forms a consistent recovery line, while another set forms an inconsistent cut. The initial
state is shown at the far left.]
• Suppose we want to roll back the entire system
– Can we simply roll back each individual node to a recent checkpoint?
– Problem: Checkpoints could describe an inconsistent cut!
• A recovery line is a consistent set of checkpoints
– How can we find such a recovery line?
28
©2016-2024 Linh Thi Xuan Phan
The Domino Effect
[Figure: node timelines with uncoordinated checkpoints; after a failure, each rollback
exposes further inconsistencies, pushing the recovery line back toward the initial state.]
• What if a set of checkpoints is not consistent?
– Need to roll back some of the nodes to an even earlier checkpoint!
– But that could create further inconsistencies!
• Problem: Cascading rollbacks! Why?
– In this example, the nodes just took their checkpoints individually
– Thus, it is unlikely that they would (ever) form a consistent cut
– The nodes need to coordinate their checkpointing!
29
©2016-2024 Linh Thi Xuan Phan
Centralized checkpointing
• Similar to 2PC with a coordinator
• Coordinator first multicasts a CHECKPOINT
message to all processes
• When a process receives the message:
– Takes a local checkpoint
– Queues any subsequent outgoing messages (why?)
– Responds with an ACK
• When the coordinator receives all the ACKs, it
sends a DONE message
– At that point, the queued messages are sent, and the execution
proceeds normally again
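• A hedged sketch of this protocol as a single-process simulation in Python (a list
stands in for the network; all names are illustrative):

# Sketch: coordinator-driven checkpointing. Between CHECKPOINT and DONE,
# each process queues its outgoing application messages instead of sending them.
class Process:
    def __init__(self, name):
        self.name, self.state, self.queued, self.paused = name, 0, [], False
    def on_checkpoint(self):
        self.snapshot = self.state              # take the local checkpoint
        self.paused = True                      # start queueing outgoing messages
        return "ACK"
    def send(self, msg, network):
        (self.queued if self.paused else network).append((self.name, msg))
    def on_done(self, network):
        self.paused = False
        network.extend(self.queued)             # release everything that was held back
        self.queued.clear()

procs, network = [Process("P1"), Process("P2")], []
acks = [p.on_checkpoint() for p in procs]       # coordinator multicasts CHECKPOINT
procs[0].send("m1", network)                    # held back, so the cut stays consistent
if all(a == "ACK" for a in acks):               # coordinator has received all ACKs...
    for p in procs:
        p.on_done(network)                      # ...and multicasts DONE
print(network)                                  # [('P1', 'm1')]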
©2016-2024 Linh Thi Xuan Phan
Can we do better?
• Centralized checkpointing has several drawbacks
• Problem #1: Need for a central coordinator
– Ideally, we would like a fully distributed protocol
– Any node should be able to initiate a checkpoint at any time
• Problem #2: Queueing
– This is necessary to ensure that the cut is consistent
– But it also holds up the system for a while!
• Problem #3: Messages are not captured
– Some messages may still be 'in flight' at checkpoint time
– If we roll back every node to the checkpoint, these messages will
not be re-sent!
31
©2016-2024 Linh Thi Xuan Phan
Snapshots
• Can we get a consistent "snapshot"
of the system?
• This should include:
– A checkpoint for each node
– The set of messages that were "in flight"
at the time (i.e., sent but not yet received)
• Can we expect to capture a state of the entire
system as of some particular time?
– Not unless clocks are perfectly synchronized!
– All we can hope for is a state that the system could have been in,
and from which the actual state is reachable (why?)
32
©2016-2024 Linh Thi Xuan Phan
What are in flight messages from B to A?
[Figure: timelines for Node A and Node B with their checkpoints, and four messages m1..m4
sent from B to A]
– m1: sent and received before both A's and B's checkpoints
– m2, m3: sent before B's checkpoint, but received after A's checkpoint (these are the
in-flight messages)
– m4: sent and received after both A's and B's checkpoints
33
©2016-2024 Linh Thi Xuan Phan
Some simplifying assumptions
• To simplify our discussion,
we will assume that:
– Each node Ni has a direct "channel" cij
to every other node Nj
– Channels are unidirectional and
deliver messages in FIFO order
– Channels are reliable, i.e., messages
are never lost
• How can we make these assumptions true?
– Direct channel: This is just an abstraction; all we need is that every
node can send messages to every other node
– FIFO order: Can use sequence numbers
– Reliable: Can use retransmissions
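• For example, FIFO order over a channel that may reorder messages can be restored with
per-channel sequence numbers; a minimal receiver-side sketch in Python (retransmission
for reliability is omitted):

# Sketch: deliver messages in sequence-number order, buffering any that arrive early.
class FifoReceiver:
    def __init__(self):
        self.expected, self.buffer, self.delivered = 0, {}, []
    def on_receive(self, seq, payload):
        self.buffer[seq] = payload
        while self.expected in self.buffer:                 # deliver any ready prefix
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1

r = FifoReceiver()
for seq, payload in [(1, "b"), (0, "a"), (2, "c")]:         # arrives out of order
    r.on_receive(seq, payload)
print(r.delivered)                                          # ['a', 'b', 'c']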
34
©2016-2024 Linh Thi Xuan Phan
K. Mani Chandy and Leslie Lamport: "Distributed Snapshots: Determining Global States of Distributed Systems", ACM TOCS 3(1):63-75
Chandy-Lamport algorithm
• When a node Ni wants to initiate a snapshot:
– it takes a local checkpoint,
– it sends a special marker message on each of its outgoing channels, and
– it begins recording messages that arrive on its incoming channels
• When a node Nj receives a marker on channel cij:
– If Nj has not yet taken a checkpoint:
• it takes a local checkpoint and sends a special marker on each outgoing channel,
• it records the state of cij as the empty set, and
• it begins recording messages that arrive on any
other incoming channel ckj
– If Nj has already taken a checkpoint:
• it stops recording messages for cij
• Termination condition:
– The node that initiated the snapshot has received a marker from every other node
• What are the 'recorded' sets of messages?
– These are the messages that are "in flight" at the snapshot time
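• Below is a hedged, single-process simulation of the algorithm in Python with two nodes
and FIFO channels (deques stand in for the network; the scenario and all names are
illustrative):

from collections import deque

MARKER = "MARKER"

class Node:
    def __init__(self, name, state):
        self.name, self.state = name, state
        self.snapshot = None      # local checkpoint, once taken
        self.recording = {}       # incoming channel -> recorded in-flight messages
        self.marker_seen = {}     # incoming channel -> has its marker arrived yet?

    def start_snapshot(self, channels, others):
        self.snapshot = self.state
        for other in others:
            channels[(self.name, other)].append(MARKER)    # marker on every outgoing channel
            self.recording[(other, self.name)] = []        # record every incoming channel
            self.marker_seen[(other, self.name)] = False

    def receive(self, channel, msg, channels, others):
        if msg == MARKER:
            if self.snapshot is None:                      # first marker: checkpoint now
                self.start_snapshot(channels, others)
            self.marker_seen[channel] = True               # stop recording this channel
        else:
            self.state += msg                              # ordinary application message
            if self.snapshot is not None and not self.marker_seen.get(channel, True):
                self.recording[channel].append(msg)        # it was in flight at snapshot time

nodes = {"A": Node("A", 100), "B": Node("B", 100)}
channels = {("A", "B"): deque(), ("B", "A"): deque()}

channels[("B", "A")].append(30)                # B -> A: 30 is still in flight
nodes["A"].start_snapshot(channels, ["B"])     # A initiates the snapshot
channels[("A", "B")].append(5)                 # A -> B: 5, sent after A's marker

for (src, dst), q in channels.items():         # deliver everything in FIFO order
    while q:
        nodes[dst].receive((src, dst), q.popleft(), channels, [n for n in nodes if n != dst])

print(nodes["A"].snapshot, nodes["B"].snapshot)   # 100 100
print(nodes["A"].recording)                       # {('B', 'A'): [30]}  <- in-flight message
print(nodes["B"].recording)                       # {('A', 'B'): []}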
35
©2016-2024 Linh Thi Xuan Phan
Chandy-Lamport: Example
[Figure: timelines for nodes A (Alice), B (Bob), and C (Charlie). Alice's balance goes
$500 → $600 → $350 → $400, Bob's $300 → $200 → $150 → $50, and Charlie's $100 → $350 → $450
as the transfers Bob→Alice: $100, Alice→Charlie: $250, Bob→Alice: $50, and Bob→Charlie: $100
are delivered. Each node's checkpoint and the recorded channel contents are:]
Node C: Chk="Charlie: $100"; SAC = { Alice→Charlie: $250 }; SBC = {};
Node B: Chk="Bob: $150"; SAB = {}; SCB = {};
Node A: Chk="Alice: $350"; SBA = { Bob→Alice: $50 }; SCA = {};
• What happens if we roll back & replay messages?
– We arrive at a consistent intermediate state (Alice: $400, Bob: $150, Charlie: $350)
from which the rest of the execution can be re-run! 36
©2016-2024 Linh Thi Xuan Phan
Recap: Chandy-Lamport algorithm
• Goal: A consistent "snapshot" of the system
– One checkpoint per node, and all the messages that were "in flight"
– Not necessarily a state that the system ever was in at any given
(wallclock) time
• Properties of the algorithm:
– Fully distributed; no need for a central coordinator
– Any node can trigger a snapshot at any time
– Execution can continue; no need to "queue" messages
• How can we use this?
– Example: Stable property detection
• Some of the algorithms we have discussed (e.g., deadlock detection via circular
wait) assume that we have a consistent snapshot of the global system state
– Example: Failure recovery
37
©2016-2024 Linh Thi Xuan Phan