Concurrency Control Concepts: Two-Phase Locking Techniques for Concurrency Control.
Database Recovery Techniques: Recovery Concepts, NO-UNDO/REDO Recovery Based on Deferred Update.
Need for Concurrency Control Protocols:
Concurrency control is essential in a multi-user database environment to ensure that multiple
transactions can execute concurrently without leading to inconsistencies or data corruption.
Why is it needed?
1. To maintain data consistency
2. To enforce isolation (a key ACID property)
3. To prevent problems like lost updates, dirty reads (reading uncommitted data), and inconsistent
retrievals
4. To ensure serializability — the correctness criterion for concurrent execution.
Example Without Concurrency Control: Consider two transactions, T1 and T2, running concurrently
on a bank account balance.
Initial balance in Account A = ₹10,000
Transaction T1: Withdraw ₹1,000
    Read(A);          // A = 10,000
    A = A - 1000;     // A = 9,000
    Write(A);
Transaction T2: Deposit ₹2,000
    Read(A);          // A = 10,000
    A = A + 2000;     // A = 12,000
    Write(A);
Without Concurrency Control:
If both transactions read the value of A (₹10,000) before either writes it:
T1 reads ₹10,000, subtracts ₹1,000 → ₹9,000
T2 reads ₹10,000, adds ₹2,000 → ₹12,000
T2 writes ₹12,000
T1 writes ₹9,000
Final Balance Should Be: ₹11,000
But the Actual Final Balance is: ₹9,000 (T2's write is lost due to overwrite)
This is called a "Lost Update Problem"
Such issues are prevented using concurrency control techniques like locking, timestamp ordering, or
validation-based methods.
With Concurrency Control (e.g., Two-Phase Locking):
• T1 locks A, updates it to ₹9,000, and releases the lock.
• T2 waits, then reads the updated A (₹9,000), updates to ₹11,000.
• Correct final value is maintained.
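The two interleavings above can be replayed deterministically in a few lines. This is a minimal sketch, not real DBMS code; the function names are invented for illustration.

```python
# Sketch: replaying the lost-update interleaving vs. the serialized one.

def run_without_locking(balance):
    # Both transactions read A before either writes it.
    a_seen_by_t1 = balance          # T1: Read(A) -> 10,000
    a_seen_by_t2 = balance          # T2: Read(A) -> 10,000
    balance = a_seen_by_t2 + 2000   # T2: Write(A) -> 12,000
    balance = a_seen_by_t1 - 1000   # T1: Write(A) -> 9,000; T2's update is lost
    return balance

def run_with_locking(balance):
    # Under 2PL, T1 holds an exclusive lock on A until it is done,
    # so T2 reads the value T1 wrote.
    balance = balance - 1000        # T1: locked read-modify-write -> 9,000
    balance = balance + 2000        # T2: locked read-modify-write -> 11,000
    return balance

print(run_without_locking(10000))   # 9000  (lost update)
print(run_with_locking(10000))      # 11000 (correct)
```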
Purpose of Concurrency Control
• To manage simultaneous execution of transactions without conflicts.
• Ensures isolation, consistency, and serializability.
Lock-based Protocols:
A transaction must get a lock before operating on a data item. Locks are used to control concurrent access.
Two types of locks:
• Shared (S) locks (also called read locks): Obtained if transactions want to only read an item
• Exclusive (X) locks (also called write locks): Obtained for updating a data item
New LOCK instructions
o Lock-S: shared (or read) lock request
o Lock-X: exclusive (or write) lock request
o Unlock: release previously held lock
Example schedule:
T1                      T2
read(B)
B = B - 50
write(B)
                        read(A)
                        read(B)
                        display(A + B)
read(A)
A = A + 50
write(A)
After applying lock instructions:
T1                      T2
Lock-X(B)
read(B)
B = B - 50
write(B)
Unlock(B)
                        Lock-S(A)
                        read(A)
                        Unlock(A)
                        Lock-S(B)
                        read(B)
                        Unlock(B)
                        display(A + B)
Lock-X(A)
read(A)
A = A + 50
write(A)
Unlock(A)
Two-Phase Locking Protocol (2PL)
A transaction follows the two-phase locking protocol if all of its locking operations precede its first
unlock operation.
The transaction's execution can be divided into two phases:
Growing Phase:
• Transaction acquires locks.
• No lock is released in this phase.
Shrinking Phase:
• No new locks are acquired.
• Only release operations are allowed.
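The two-phase rule itself is easy to enforce per transaction: record whether the first unlock has occurred, and reject any later lock request. A small sketch, with hypothetical class and exception names:

```python
# Sketch: enforcing the growing/shrinking phases of 2PL per transaction.

class TwoPhaseError(Exception):
    pass

class Transaction:
    def __init__(self, name):
        self.name = name
        self.shrinking = False   # False = still in the growing phase
        self.held = set()

    def lock(self, item):
        if self.shrinking:
            raise TwoPhaseError(f"{self.name} cannot lock {item} after unlocking")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True    # first unlock starts the shrinking phase
        self.held.discard(item)

t1 = Transaction("T1")
t1.lock("B")
t1.lock("A")                     # growing phase
t1.unlock("B")                   # shrinking phase begins
try:
    t1.lock("C")                 # violates 2PL
except TwoPhaseError as e:
    print(e)                     # prints: T1 cannot lock C after unlocking
```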
Variations of 2PL:
1. Basic 2PL
❑ Two Phase: Growing and Shrinking Phase
2. Conservative (or static) 2PL
❑ Conservative 2PL requires a transaction to lock all the items it accesses before
the transaction begins execution, by predeclaring its read set and write set.
❑ If any of the predeclared items cannot be locked, the transaction does not lock any
item; instead, it waits until all the items are available for locking.
3. Strict 2PL
□ A transaction T does not release any of its exclusive (write) locks until after it commits or
aborts.
□ Hence no other transaction can read or write an item that is written by T unless T has committed.
4. Rigorous 2PL
A transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts
and so it is easier to implement than strict 2PL
Drawbacks:
• Deadlocks: Two or more transactions wait for each other’s locks.
• Reduced Concurrency: Especially with conservative 2PL.
Problems on 2PL:
Given the following transactions:
• T1: Read(A), Write(A), Read(B), Write(B)
• T2: Read(B), Write(B), Read(C), Write(C)
Assuming the locking operations follow the format Lx(Y) for locking item Y by transaction X and
Ux(Y) for unlocking, construct a schedule that follows the Two-Phase Locking (2PL) protocol.
Demonstrate how this schedule satisfies conflict-serializability.
Solution:
Given Transactions
T1: Read(A), Write(A), Read(B), Write(B)
T2: Read(B), Write(B), Read(C), Write(C)
Step 1: Add Locks According to 2PL (Two-Phase Locking)
Two-Phase Locking Protocol:
• Growing phase: Acquire all locks.
• Shrinking phase: Release all locks (no new locks can be acquired after the first unlock).
Schedule with Lock (L) and Unlock (U) Operations:
Let’s construct a conflict-serializable schedule that follows 2PL:
1. L1(A)
2. Read1(A)
3. Write1(A)
4. L1(B)
5. Read1(B)
6. Write1(B)
7. L2(B) ← T2 requests B and must wait (T1 holds B)
8. U1(A)
9. U1(B)
10. L2(B) ← granted once T1 releases B
11. Read2(B)
12. Write2(B)
13. L2(C)
14. Read2(C)
15. Write2(C)
16. U2(B)
17. U2(C)
Step 2: Construct the Final Schedule
This gives us the final interleaved schedule:
T1: L1(A), Read(A), Write(A), L1(B), Read(B), Write(B), U1(A), U1(B)
T2: L2(B), Read(B), Write(B), L2(C), Read(C), Write(C), U2(B), U2(C)
Since T1 releases locks only after acquiring both (A, B), it respects 2PL.
T2 waits for B until T1 unlocks it, then acquires both B and C before unlocking.
Step 3: Conflict-Serializability
Let's analyze the conflicts:
• T1 → T2 on B:
T1 writes B before T2 reads/writes B ⇒ T1 precedes T2.
• No other conflicting operations (T1 and T2 operate on disjoint sets otherwise: A, C).
Therefore, the precedence graph has:
T1 → T2
No cycles → Schedule is Conflict-Serializable
Equivalent to the serial schedule: T1 → T2
Therefore, the schedule follows 2PL: all locks are acquired before any unlock. The schedule is
conflict-serializable, equivalent to the serial order T1 before T2.
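The precedence-graph test above can be automated: build the graph from the conflict edges and check for a cycle with depth-first search. A sketch with invented function names:

```python
# Sketch: conflict-serializability test via cycle detection on the
# precedence graph (edges are (Ti, Tj) pairs meaning Ti -> Tj).

def is_conflict_serializable(edges, nodes):
    graph = {n: [] for n in nodes}
    for u, v in edges:
        graph[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def has_cycle(n):
        color[n] = GRAY                       # on the current DFS path
        for m in graph[n]:
            if color[m] == GRAY or (color[m] == WHITE and has_cycle(m)):
                return True
        color[n] = BLACK                      # fully explored
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in nodes)

# The schedule above has a single edge T1 -> T2 (conflict on B): no cycle.
print(is_conflict_serializable([("T1", "T2")], ["T1", "T2"]))                  # True
print(is_conflict_serializable([("T1", "T2"), ("T2", "T1")], ["T1", "T2"]))    # False
```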
Database Recovery Techniques:
Why is Recovery Needed?
• Recovery is needed to handle the following types of failure (i.e., what causes a transaction to fail):
• A computer failure (system crash): A hardware or software error occurs in the computer system
during transaction execution. If the hardware crashes, the contents of the computer's internal
memory may be lost.
• A transaction or system error:
Some operation in the transaction may cause it to fail, such as integer overflow or division by
zero. Transaction failure may also occur because of erroneous parameter values or because of
a logical programming error. In addition, the user may interrupt the transaction during its
execution.
• Local errors or exception conditions detected by the transaction:
Certain conditions necessitate cancellation of the transaction. For example, data for the
transaction may not be found. A condition, such as insufficient account balance in a banking
database, may cause a transaction, such as a fund withdrawal from that account, to be cancelled.
A programmed abort in the transaction causes it to fail.
• Concurrency control enforcement:
The concurrency control method may decide to abort the transaction, to be restarted later,
because it violates serializability or because several transactions are in a state of deadlock.
• Disk failure:
Some disk blocks may lose their data because of a read or write malfunction or because of a
disk read/write head crash. This may happen during a read or a write operation of the
transaction.
• Physical problems and catastrophes:
This refers to an endless list of problems that includes power or air-conditioning failure, fire,
theft, sabotage, overwriting disks by mistake.
Objective of Recovery
To restore the database to a consistent state after a failure. This typically means rolling back or
reapplying transactions based on a log file.
Failure Classification:
• Transaction failure
• System crash
• Disk failure
Transaction failure:
• Logical errors: transaction cannot complete due to some internal error condition
• System errors: the database system must terminate an active transaction due to an error
condition (e.g., deadlock)
System crash: a power failure or other hardware or software failure causes the system to crash.
• Fail-stop assumption: non-volatile storage contents are assumed to not be corrupted by
system crash.
• Database systems have numerous integrity checks to prevent corruption of disk data
Disk failure: a head crash or similar disk failure destroys all or part of disk storage.
• Destruction is assumed to be detectable: disk drives use checksums to detect failures
Types of Failures and Recovery Strategies:
1. Catastrophic Failures (e.g., disk crash)
• Entire database on disk is lost or corrupted.
• Recovery involves two steps:
o Restore the last full backup from offline storage (like tapes).
o Use the log file to REDO all committed transactions up to the time of failure.
2. Non-Catastrophic Failures (e.g., system crash, power failure)
• Database is intact, but transactions may be incomplete.
• Recovery involves:
o UNDO uncommitted transactions (rollback changes).
o REDO committed transactions (if their changes were not yet written to disk).
• No need for full database backup; recovery uses the online log on disk.
System Log and Its Role:
• Stores information about all write operations.
• Crucial for recovery protocols to determine what to redo or undo.
• Must be stored persistently (on disk).
Policies for Recovery from Transaction Failures:
1. Deferred Update (NO-UNDO/REDO)
• Updates are delayed until the transaction commits.
• Until then, changes are stored in:
o Memory buffers
o Or transaction workspace
• Once a transaction commits:
o Changes are persistently logged, then written to the database.
• If failure occurs before commit:
➤ No changes on disk → No UNDO needed
• If failure occurs after commit but before writing to DB:
➤ REDO needed using the log
2. Immediate Update (UNDO/REDO)
• Allows the database to be updated during transaction execution, before commit.
• Each update must be logged before it's applied to the database (force-write).
• If transaction fails before commit:
➤ UNDO required.
• If transaction committed but not all changes written to DB:
➤ REDO required.
3. Undo-Only Recovery (UNDO/NO-REDO)
• All changes are immediately written to DB before commit.
• No need for REDO (since all updates are guaranteed to be on disk).
• If failure occurs before commit → UNDO is sufficient.
Introduction to Recovery Concepts
The following are the concepts of recovery:
• Caching (Buffering) of Disk Blocks
• Write-Ahead Logging, Steal/No-Steal, and Force/No-Force
• Checkpoints in the System Log and Fuzzy Checkpointing
• Cascading Rollbacks and Cascadeless Schedules
• Transaction Actions That Do Not Affect the Database
1. Caching (Buffering) of Disk Blocks
Purpose: Improve performance by caching frequently accessed disk pages in main memory (DBMS
cache).
Components:
• Cache Directory: Tracks mappings like <Disk_Page_Address, Buffer_Location, Dirty_Bit,
Pin_Bit, Transaction_IDs>.
• Buffers: Hold disk pages temporarily for manipulation.
• Dirty Bit: Indicates whether a buffer page has been modified (1) or not (0).
• Pin/Unpin Bit: Prevents pages from being flushed before specific conditions are met (e.g.,
transaction commit).
Workflow:
a. When a page is requested:
   • Check whether it is already in the cache.
   • If not, bring it in from disk.
   • This may require replacing an existing page, using a policy such as LRU or MRU.
b. Modified buffers with dirty bit = 1 must be written to disk before replacement.
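The workflow above can be sketched as a tiny buffer pool with a dirty bit and LRU replacement. This is an illustrative toy, not a real DBMS cache; the class and helper names are invented, and the "disk" is just a dictionary.

```python
# Sketch: DBMS buffer pool with dirty bit and LRU replacement.
from collections import OrderedDict

class BufferPool:
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                  # page_id -> contents ("disk" stand-in)
        self.cache = OrderedDict()        # page_id -> [contents, dirty_bit]
        self.flushed = []                 # pages written back on eviction

    def get(self, page_id):
        if page_id in self.cache:
            self.cache.move_to_end(page_id)           # LRU: mark recently used
        else:
            if len(self.cache) >= self.capacity:
                victim, (data, dirty) = self.cache.popitem(last=False)
                if dirty:                             # dirty pages written first
                    self.disk[victim] = data
                    self.flushed.append(victim)
            self.cache[page_id] = [self.disk[page_id], 0]
        return self.cache[page_id][0]

    def update(self, page_id, value):
        self.get(page_id)
        self.cache[page_id] = [value, 1]              # set the dirty bit

disk = {"P1": "a", "P2": "b", "P3": "c"}
pool = BufferPool(capacity=2, disk=disk)
pool.update("P1", "a'")
pool.get("P2")
pool.get("P3")              # evicts P1 (least recently used); P1 is dirty
print(pool.flushed)         # ['P1']
print(disk["P1"])           # a'
```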
2. Write-Ahead Logging (WAL)
• WAL is a recovery technique used when in-place updating is done in a DBMS.
• It ensures that log records are written before the actual database is updated on disk.
• WAL helps in performing UNDO and REDO operations during recovery.
In-place update means the data item on disk is directly overwritten by its new value (AFIM – After
Image). This is risky during crashes; hence we need to log changes to allow recovery.
The log records the BFIM (Before Image) and AFIM of each updated item.
These logs ensure we can:
• UNDO (rollback) using BFIM.
• REDO (reapply) using AFIM.
WAL Protocol Rules:
Key Rule: Log first, update later.
The log entry (containing BFIM & AFIM) must be written to disk before the actual data block is
updated on disk.
This ensures that UNDO/REDO can be performed after a crash.
Two types of log info:
• UNDO-type: Has BFIM (used to undo updates).
• REDO-type: Has AFIM (used to redo updates).
• Combined Log Entry: Stores both BFIM and AFIM (UNDO/REDO).
Cascading Rollbacks and Read_Item Entries
• When cascading rollback is possible (i.e., one transaction rollback affects others), read_item
entries are treated as UNDO-type.
• This helps track dependencies between transactions.
DBMS Cache and Log Buffers
DBMS cache holds:
• Data blocks (pages)
• Index blocks
• Log blocks
• Logs are append-only disk files.
When an update occurs:
• A log record is added to the log buffer in memory.
• Before the data block is written, the log buffer must be flushed to disk (WAL rule).
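The WAL rule can be sketched in a few lines: the log record (carrying the BFIM and AFIM) must reach the stable log before the data page is overwritten in place. The structures and function names below are invented stand-ins, not a real logging API.

```python
# Sketch: write-ahead logging — flush the log record before the in-place
# update of the data item on "disk".

log_disk = []          # stable log (append-only)
log_buffer = []        # log records still in memory
data_disk = {"X": 5}   # database on disk

def write_item(txn, item, new_value):
    bfim, afim = data_disk[item], new_value
    log_buffer.append((txn, item, bfim, afim))   # 1. log record in memory first

def flush_page(item, new_value):
    # WAL rule: flush the log buffer BEFORE overwriting the data item.
    log_disk.extend(log_buffer)
    log_buffer.clear()
    data_disk[item] = new_value                  # 2. then the in-place update

write_item("T1", "X", 8)
flush_page("X", 8)
print(log_disk)    # [('T1', 'X', 5, 8)] -- BFIM=5 allows UNDO, AFIM=8 allows REDO
print(data_disk)   # {'X': 8}
```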
Steal/No-Steal and Force/No-Force Policies:
The Steal/No-Steal policy determines when the database system can write dirty (updated but
uncommitted) data to disk.
Steal:
• The DBMS can write updated data to disk before the transaction commits.
• Saves memory (buffer space).
• But if system crashes, we need UNDO (because uncommitted changes are already on disk).
→ Example: Transaction T1 updates a row and DBMS writes it to disk before T1 commits. If T1
fails, we must undo that update.
No-Steal:
• The DBMS does not write uncommitted data to disk.
• No need for UNDO.
• But needs more memory to keep all updated data in buffer.
Force and No-Force:
These policies determine whether the DBMS writes all of a transaction's updated pages to disk
immediately when the transaction commits.
Force:
• All updated pages of a transaction are immediately written to disk when it commits.
• No need for REDO during recovery.
No-Force:
• DBMS does not write all data to disk at commit time.
• Data may still be in memory.
• Needs REDO to reapply changes if crash happens after commit.
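The four Steal/Force combinations map directly onto the recovery work required; a tiny sketch of that mapping (the function name is invented):

```python
# Sketch: which recovery operations each buffer-policy combination needs.

def recovery_needs(steal, force):
    undo_needed = steal        # Steal: uncommitted data may already be on disk
    redo_needed = not force    # No-Force: committed data may not be on disk yet
    return undo_needed, redo_needed

print(recovery_needs(steal=True,  force=False))   # (True, True)   -> UNDO/REDO
print(recovery_needs(steal=False, force=False))   # (False, True)  -> NO-UNDO/REDO (deferred update)
print(recovery_needs(steal=True,  force=True))    # (True, False)  -> UNDO/NO-REDO
```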
3. Checkpoints in the System Log and Fuzzy Checkpointing
• A checkpoint is a special log entry used to reduce the amount of work needed during
recovery.
• It tells the system: "Up to this point, all changes have been safely written to disk."
• It marks a safe state in the database so that recovery does not have to start from the beginning
of the log.
Why is Checkpointing Important?
• Helps speed up recovery after a crash.
• No need to redo operations of transactions that committed before the checkpoint (they are
already written to disk).
• Only the active (uncommitted) transactions at the time of the checkpoint need to be
considered during recovery.
What is in a Checkpoint Log Record?
A checkpoint log record includes:
• List of active transactions at the time of the checkpoint.
• (Optional) Addresses of the first and last log records of each active transaction.
• This helps identify and undo uncommitted transactions quickly.
When is a Checkpoint Taken?
➢ Periodically, based on:
• Time (e.g., every 5 minutes).
• Number of committed transactions (e.g., every 10 transactions).
➢ These values are system-defined parameters.
Steps in Checkpointing
1. Suspend current transaction execution.
2. Force-write all modified pages in the DBMS buffer to disk.
3. Write a [checkpoint] log record and flush the log to disk.
4. Resume transaction execution.
Problem with Regular Checkpointing
• Step 2 (writing all buffers to disk) takes time.
• Suspending transactions can slow down performance.
Solution: Fuzzy Checkpointing
• Allows transactions to continue even while the checkpoint is being created.
• Steps:
1. Write [begin_checkpoint] log record.
2. Do not pause transaction execution.
3. System starts writing modified pages to disk in the background.
4. After writing, write [end_checkpoint] record with transaction info.
5. Update a pointer on disk to point to the new checkpoint (only after step 4 finishes).
• Until the process finishes, the system uses the previous checkpoint for recovery.
4. Transaction Rollback and Cascading Rollback
What is Transaction Rollback?
• If a transaction fails after updating the database but before commit, it must be rolled back.
• All changes made by the transaction are undone using the UNDO-type log entries (which
contain BFIMs).
• This restores the database to its previous consistent state
When is Rollback Needed?
• System crash
• Transaction failure (logic error, violation, deadlock, etc.)
• Manual abort
Cascading Rollback – What is it?
• Cascading rollback happens when other transactions read data written by a failed
transaction.
• These dependent transactions also need to be rolled back.
Example Flow:
Let’s say:
• T3 writes data B
• T2 reads that B from T3
• T3 fails and is rolled back
T2 must also be rolled back (because it used invalid data). Now:
• If T1 had read from T2 → T1 also must be rolled back
This chain is called a cascading rollback.
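The chain can be computed mechanically from "reads-from" dependencies: keep adding any transaction that read from an already-doomed one until nothing changes. A sketch with an invented function name:

```python
# Sketch: computing the cascading-rollback set from reads-from pairs
# (reader, writer-it-read-from).

def cascading_rollback(failed, reads_from):
    to_roll = {failed}
    changed = True
    while changed:                        # iterate until a fixpoint
        changed = False
        for reader, writer in reads_from:
            if writer in to_roll and reader not in to_roll:
                to_roll.add(reader)       # reader depends on doomed writer
                changed = True
    return to_roll

# T2 read B from T3, T1 read from T2; T3 fails.
print(sorted(cascading_rollback("T3", [("T2", "T3"), ("T1", "T2")])))
# ['T1', 'T2', 'T3']
```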
Why is Cascading Rollback a Problem?
• It is complex, time-consuming, and inefficient.
• Can affect many transactions due to a single failure.
Solution: Use Cascadeless or Strict Schedules
• Recoverable: a transaction commits only after all transactions it read from have committed →
cascading rollback may happen.
• Cascadeless: a transaction can read only committed values → cascading rollback is prevented.
• Strict: a transaction can read or write an item only after the transaction that last wrote it
commits → cascading rollback is strictly prevented.
5. Transaction Actions That Do Not Affect the Database
These actions do not alter the persistent database and may include:
• Read operations (unless part of a write).
• Computations in memory.
• Logging operations.
• Lock acquisition or release.
They are non-critical for recovery and don’t require undo/redo.
NO-UNDO/REDO Recovery Based on Deferred Update
What is Deferred Update?
• In Deferred Update, the database on disk is not updated during transaction execution.
• All changes are kept in main memory buffers and logs.
• Updates are applied to the actual database only after the transaction commits.
Key Characteristics
• Update time: after commit only
• Updates made to: log + cache only
• Type of log entries needed: only REDO (AFIM)
• UNDO required? No
• Cascading rollback? No
• Buffer policy: No-Steal (no uncommitted data is written to disk)
• Concurrency control: usually with Strict 2PL
Steps in Deferred Update Protocol
• No update is written to the database disk until the transaction commits.
• All REDO-type log entries (AFIM) are force-written to disk before committing.
• If the transaction fails before commit, it is simply aborted—no UNDO required.
Example: Commit vs Fail
• If a transaction commits: Changes are REDOed from the log.
• If it fails: Just discard changes, since nothing was written to disk.
Recovery Algorithm: RDU_M
(Recovery using Deferred Update in a Multiuser system)
• Maintain Two Lists:
1. Commit List – Transactions committed after the last checkpoint
2. Active List – Transactions that were active at the time of the crash
• Recovery Steps:
o REDO all write_item operations from committed transactions in the commit list.
o Ignore active (uncommitted) transactions → they are canceled and must be
resubmitted.
REDO Operation
• Uses log entry: [write_item, T, X, new_value]
• Sets value of X = new_value in the database (AFIM).
Optimization in REDO
• If an item X is updated multiple times, only the last update needs to be redone.
• REDO process starts from the end of the log, and maintains a list of already redone items to
avoid repeating.
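The RDU_M idea, including this optimization, can be sketched as a backward scan that redoes only the last committed write to each item. The tuple log format below is invented for illustration, not the textbook's exact record layout.

```python
# Sketch: RDU_M recovery — scan the log backwards, redo only the LAST
# committed write to each item, ignoring uncommitted transactions.

def rdu_m_recover(log, db):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    redone = set()
    for rec in reversed(log):                     # start from the end of the log
        if rec[0] == "write":
            _, txn, item, new_value = rec
            if txn in committed and item not in redone:
                db[item] = new_value              # REDO: set item to its AFIM
                redone.add(item)                  # skip earlier writes to item
    return db

log = [
    ("start", "T1"), ("write", "T1", "X", 1),
    ("write", "T1", "X", 2), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "Y", 9),     # T2 active at the crash
]
print(rdu_m_recover(log, {"X": 0, "Y": 0}))       # {'X': 2, 'Y': 0}
```

Note that only X's final committed value (2) is applied, and T2's write to Y is ignored entirely.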
Advantages of Deferred Update
1. No UNDO needed
2. No cascading rollback
3. Ensures strict schedule when used with Strict Two-Phase Locking (2PL)
4. Simplifies recovery
Disadvantages
1. Requires large buffer space to hold uncommitted updates.
2. All write locks are held until commit → may reduce concurrency.
3. Suitable only for short transactions that change few items.
Problem on the NO-UNDO/REDO Recovery Technique (Deferred Update Protocol):
Consider a system that crashed after the following log; the crash occurs after step 8.
1. START T1
2. WRITE T1, A, 10 → 20
3. START T2
4. WRITE T2, B, 30 → 40
5. COMMIT T1
6. START T3
7. WRITE T3, A, 20 → 25
8. COMMIT T2
Apply the NO-UNDO/REDO recovery technique based on deferred updates to a transaction log.
Show step-by-step how the database state is recovered after a system crash.
Solution:
Given Transaction Log (Before Crash)
Step Log Entry
1 START T1
2 WRITE T1, A, 10 → 20
3 START T2
4 WRITE T2, B, 30 → 40
5 COMMIT T1
6 START T3
7 WRITE T3, A, 20 → 25
8 COMMIT T2
9 System Crash (T3 not committed)
Assumptions
• Initial values: A = 10, B = 30
• NO-UNDO/REDO protocol is used (deferred update).
• Updates are written to disk only after commit.
Step 1: Identify Committed and Uncommitted Transactions
Transaction Status
T1 Committed (Step 5)
T2 Committed (Step 8)
T3 Not committed (Active at crash)
Step 2: Apply NO-UNDO/REDO Recovery
According to NO-UNDO/REDO:
• Only committed transactions are redone.
• Uncommitted transactions are ignored (T3 is discarded).
• Redo write operations of T1 and T2 in the order they appear in the log.
Step 3: REDO Operations
1. WRITE T1, A, 10 → 20 (Step 2)
→ Set A = 20 in the database
2. WRITE T2, B, 30 → 40 (Step 4)
→ Set B = 40 in the database
Note:
WRITE T3, A, 20 → 25 is not redone, since T3 was not committed.
Final Recovered Database State
Data Item Value
A 20
B 40
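The recovery above can be checked mechanically: replay only the committed writes, in log order. A short sketch (the tuple log format is invented for illustration):

```python
# Sketch: deferred-update recovery for the worked example — REDO committed
# writes in log order; T3's write is skipped because T3 never committed.

db = {"A": 10, "B": 30}                         # initial disk state
log = [("T1", "A", 20), ("T2", "B", 40), ("T3", "A", 25)]
committed = {"T1", "T2"}                        # T3 was active at the crash

for txn, item, afim in log:
    if txn in committed:                        # REDO committed writes only
        db[item] = afim

print(db)   # {'A': 20, 'B': 40}
```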
Similarly, solve the following problems:
1. Recover the final state of the database after the crash using NO-UNDO/REDO technique
Initial DB State: X = 50, Y = 100
Log Before Crash:
1. START T1
2. WRITE T1, X, 50 → 60
3. START T2
4. WRITE T2, Y, 100 → 120
5. COMMIT T2
6. WRITE T1, X, 60 → 65
7. System crash
2. What will be the final state of A, B, and C?
Initial DB State: A = 5, B = 15, C = 25
Log Before Crash:
1. START T1
2. WRITE T1, A, 5 → 10
3. START T2
4. WRITE T2, B, 15 → 18
5. COMMIT T1
6. START T3
7. WRITE T3, C, 25 → 30
8. COMMIT T3
9. Crash
3. Apply deferred update recovery and write the final values of M and N.
Initial DB State: M = 200, N = 300
Log Before Crash:
1. START T1
2. WRITE T1, M, 200 → 220
3. START T2
4. WRITE T2, N, 300 → 310
5. COMMIT T1
6. WRITE T2, N, 310 → 315
7. COMMIT T2
8. START T3
9. WRITE T3, M, 220 → 230
10. Crash