0% found this document useful (0 votes)
69 views17 pages

Concurrency Control in Databases

Concurrency control techniques DBMS

Uploaded by

false9782
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
69 views17 pages

Concurrency Control in Databases

Concurrency control techniques DBMS

Uploaded by

false9782
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

Concurrency Control

Concurrency Control is the management procedure that is required for controlling concurrent
execution of the operations that take place on a database.

Concurrent Execution in DBMS


o In a multi-user system, multiple users can access and use the same database at one
time, which is known as the concurrent execution of the database. It means that the
same database is executed simultaneously on a multi-user system by different users.
o While working on the database transactions, there occurs the requirement of using the
database by multiple users for performing different operations, and in that case,
concurrent execution of the database is performed.
o The thing is that the simultaneous execution that is performed should be done in an
interleaved manner, and no operation should affect the other executing operations,
thus maintaining the consistency of the database. Thus, on making the concurrent
execution of the transaction operations, there occur several challenging problems that
need to be solved.

Problems with Concurrent Execution


In a database transaction, the two main operations are READ and WRITE operations. So, there is
a need to manage these two operations in the concurrent execution of the transactions as if these
operations are not performed in an interleaved manner, and the data may become inconsistent. So,
the following problems occur with the Concurrent Execution of the operations:

Problem 1: Lost Update Problems (W - W Conflict)


The problem occurs when two different database transactions perform the read/write operations
on the same database items in an interleaved manner (i.e., concurrent execution) that makes the
values of the items incorrect hence making the database inconsistent.

For example: Consider the below diagram where two transactions TX and TY, are performed on the
same account A where the balance of account A is $300.
o At time t1, transaction TX reads the value of account A, i.e., $300 (only read).
o At time t2, transaction TX deducts $50 from account A that becomes $250 (only
deducted and not updated/write).
o Alternately, at time t3, transaction TY reads the value of account A that will be $300
only because TX didn't update the value yet.
o At time t4, transaction TY adds $100 to account A that becomes $400 (only added but
not updated/write).
o At time t6, transaction TX writes the value of account A that will be updated as $250
only, as TY didn't update the value yet.
o Similarly, at time t7, transaction TY writes the values of account A, so it will write as
done at time t4 that will be $400. It means the value written by TX is lost, i.e., $250 is
lost.
Hence data becomes incorrect, and database sets to inconsistent.

Dirty Read Problems (W-R Conflict)


The dirty read problem occurs when one transaction updates an item of the database, and
somehow the transaction fails, and before the data gets rollback, the updated database item is
accessed by another transaction. There comes the Read-Write Conflict between both transactions.

For example: Consider two transactions TX and TY in the below diagram performing read/write
operations on account A where the available balance in account A is $300:

o At time t1, transaction TX reads the value of account A, i.e., $300.


o At time t2, transaction TX adds $50 to account A that becomes $350.
o At time t3, transaction TX writes the updated value in account A, i.e., $350.
o Then at time t4, transaction TY reads account A that will be read as $350.
o Then at time t5, transaction TX rollbacks due to server problem, and the value changes
back to $300 (as initially).
o But the value for account A remains $350 for transaction TY as committed, which is
the dirty read and therefore known as the Dirty Read Problem.
Unrepeatable Read Problem (R-W Conflict)
Also known as Inconsistent Retrievals Problem that occurs when in a transaction, two different
values are read for the same database item.

For example: Consider two transactions, TX and TY, performing the read/write operations on
account A, having an available balance = $300. The diagram is shown below:

o At time t1, transaction TX reads the value from account A, i.e., $300.
o At time t2, transaction TY reads the value from account A, i.e., $300.
o At time t3, transaction TY updates the value of account A by adding $100 to the
available balance, and then it becomes $400.
o At time t4, transaction TY writes the updated value, i.e., $400.
o After that, at time t5, transaction TX reads the available value of account A, and that
will be read as $400.
o It means that within the same transaction TX, it reads two different values of account
A, i.e., $ 300 initially, and after updation made by transaction TY, it reads $400. It is
an unrepeatable read and is therefore known as the Unrepeatable read problem.
Thus, in order to maintain consistency in the database and avoid such problems that take place in
concurrent execution, management is needed, and that is where the concept of Concurrency
Control comes into role.

Deadlock in DBMS
In database management systems (DBMS) a deadlock occurs when two or more
transactions are unable to the proceed because each transaction is waiting for the
other to the release locks on resources.

This situation creates a cycle of the dependencies where no transaction can


continue leading to the standstill in the system.

The Deadlocks can severely impact the performance and reliability of a DBMS
making it crucial to the understand and manage them effectively.

What is Deadlock?

Deadlocks occur when two or more transactions wait indefinitely for resources
held by each other.

Characteristics of Deadlock

● Mutual Exclusion: Only one transaction can hold a particular resource at a time.

● Hold and Wait: The Transactions holding resources may request additional
resources held by others.

● No Preemption: The Resources cannot be forcibly taken from the transaction


holding them.

● Circular Wait: A cycle of transactions exists where each transaction is waiting


for the resource held by the next transaction in the cycle.

DBMSs often use various techniques to detect and resolve deadlocks


automatically. These techniques include :-

Timeout mechanisms - where a transaction is forced to release its locks after a


certain period of time
Deadlock detection algorithms - which periodically scan the transaction log for
deadlock cycles and then choose a transaction to abort to resolve the deadlock.

It is also possible to prevent deadlocks by careful design of transactions, such as


always acquiring locks in the same order or releasing locks as soon as possible.
Proper design of the database schema and application can also help to minimize the
likelihood of deadlocks.
In a database, a deadlock is an unwanted situation in which two or more
transactions are waiting indefinitely for one another to give up locks.
Deadlock is said to be one of the most feared complications in DBMS as it brings
the whole system to a Halt.

Example – let us understand the concept of deadlock suppose, Transaction T1


holds a lock on some rows in the Students table and needs to update some rows in
the Grades table.

Simultaneously, Transaction T2 holds locks on those very rows (Which T1 needs


to update) in the Grades table but needs to update the rows in the Student
table held by Transaction T1.

Now, the main problem arises. Transaction T1 will wait for transaction T2 to give
up the lock, and similarly, transaction T2 will wait for transaction T1 to give up the
lock. As a consequence, All activity comes to a halt and remains at a standstill
forever unless the DBMS detects the deadlock and aborts one of the transactions.

Deadlock in DBMS

What is Deadlock Avoidance?

When a database is stuck in a deadlock, It is always better to avoid the deadlock


rather than restarting or aborting the database.

The deadlock avoidance method is suitable for smaller databases


The deadlock prevention method is suitable for larger databases.
One method of avoiding deadlock is using application-consistent logic. In the
above-given example, Transactions that access Students and Grades should always
access the tables in the same order. In this way, in the scenario described above,
Transaction T1 simply waits for transaction T2 to release the lock on Grades
before it begins. When transaction T2 releases the lock, Transaction T1 can
proceed freely.

Another method for avoiding deadlock is to apply both the row-level locking
mechanism and the READ COMMITTED isolation level. However, It does not
guarantee to remove deadlocks completely.

What is Deadlock Detection?

When a transaction waits indefinitely to obtain a lock, The database management


system should detect whether the transaction is involved in a deadlock or not.

Wait-for-graph is one of the methods for detecting the deadlock situation. This
method is suitable for smaller databases. In this method, a graph is drawn based on
the transaction and its lock on the resource. If the graph created has a closed loop
or a cycle, then there is a deadlock.

For the above-mentioned scenario, the Wait-For graph is drawn below:

What is Deadlock Prevention - Based on Timestamp


Ordering

For a large database, the deadlock prevention method is suitable. A deadlock can
be prevented if the resources are allocated in such a way that a deadlock never
occurs.
The DBMS analyzes the operations whether they can create a deadlock situation or
not, If they do, that transaction is never allowed to be executed.

Deadlock prevention mechanism proposes two schemes:


● Wait-Die Scheme: In this scheme, If a transaction requests a resource that is
locked by another transaction, then the DBMS simply checks the timestamp of
both transactions and allows the older transaction to wait until the resource is
available for execution.
Suppose, there are two transactions T1 and T2, and Let the timestamp of any
transaction T be TS (T). Now, If there is a lock on T2 by some other transaction
and T1 is requesting resources held by T2, then DBMS performs the following
actions:
Checks if TS (T1) < TS (T2) – if T1 is the older transaction and T2 has held
some resource, then it allows T1 to wait until resource is available for
execution. That means if a younger transaction has locked some resource and
an older transaction is waiting for it, then an older transaction is allowed to wait
for it till it is available. If T1 is an older transaction and has held some resource
with it and if T2 is waiting for it, then T2 is killed and restarted later with
random delay but with the same timestamp. i.e. if the older transaction has held
some resource and the younger transaction waits for the resource, then the
younger transaction is killed and restarted with a very minute delay with the
same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
● Wound Wait Scheme: In this scheme, if an older transaction requests for a
resource held by a younger transaction, then an older transaction forces a
younger transaction to kill the transaction and release the resource. The younger
transaction is restarted with a minute delay but with the same timestamp. If the
younger transaction is requesting a resource that is held by an older one, then
the younger transaction is asked to wait till the older one releases it.
The following table lists the differences between Wait – Die and Wound -Wait
scheme prevention schemes:
Wait – Die Wound -Wait

It is based on a pre-emptive
It is based on a non-pre-emptive technique.
technique.

In this, older transactions must wait for the In this, older transactions never wait
younger one to release its data items. for younger transactions.
Wait – Die Wound -Wait

The number of aborts and rollbacks is higher In this, the number of aborts and
in these techniques. rollback is lesser.

Disadvantages
1. System downtime: Deadlock can cause system downtime, which can result in
loss of productivity and revenue for businesses that rely on the DBMS.
2. Resource waste: When transactions are waiting for resources, these resources
are not being used, leading to wasted resources and decreased system
efficiency.
3. Reduced concurrency: Deadlock can lead to a decrease in system concurrency,
which can result in slower transaction processing and reduced throughput.
4. Complex resolution: Resolving deadlock can be a complex and time-consuming
process, requiring system administrators to intervene and manually resolve the
deadlock.
5. Increased system overhead: The mechanisms used to detect and resolve
deadlock, such as timeouts and rollbacks, can increase system overhead,
leading to decreased performance.

Lock Based Concurrency Control Protocol in DBMS



In a database management system (DBMS), lock-based concurrency control (BCC)


is used to control the access of multiple transactions to the same data item.

This protocol helps to maintain data consistency and integrity across multiple
users.

In the protocol, transactions gain locks on data items to control their access and
prevent conflicts between concurrent transactions.

What is a Lock?
A Lock is a variable assigned to any data item to keep track of the status of that
data item so that isolation and non-interference are ensured during concurrent
transactions.

Lock-based concurrency control ensures that transactions in a database can


proceed safely without causing conflicts.

Lock Based Protocols

A lock is a variable associated with a data item that describes the status of the data
item to possible operations that can be applied to it. They synchronize the access
by concurrent transactions to the database items.

It is required in this protocol that all the data items must be accessed in a mutually
exclusive manner.

Types of Lock

1. Shared Lock (S): Shared Lock is also known as Read-only lock. As the name
suggests it can be shared between transactions because while holding this lock
the transaction does not have the permission to update data on the data item.
S-lock is requested using lock-S instruction.
2. Exclusive Lock (X): Data item can be both read as well as written.
This is Exclusive and cannot be held simultaneously on the same data item.
X-lock is requested using lock-X instruction.

Lock Compatibility Matrix


A transaction may be granted a lock on an item if the requested lock is compatible
with locks already held on the item by other transactions. Any number of
transactions can hold shared locks on an item, but if any transaction holds an
exclusive(X) on the item no other transaction may hold any lock on the item. If a
lock cannot be granted, the requesting transaction is made to wait till all
incompatible locks held by other transactions have been released. Then the lock is
granted.

Lock Compatibility Matrix


LOCK Based Concurrency Control Protocols

Lock Based Protocol


o Basic 2-PL
o Conservative 2-PL
o Strict 2-PL
o Rigorous 2-PL

Two-phase locking (2PL)


A transaction is said to follow the Two-Phase Locking protocol if Locking and
Unlocking can be done in two phases.
● Growing Phase: New locks on data items may be acquired but none can be
released.
● Shrinking Phase: Existing locks may be released but no new locks can be
acquired.
Note: If lock conversion is allowed, then upgrading of lock( from S(a) to X(a) ) is
allowed in the Growing Phase, and downgrading of lock (from X(a) to S(a)) must
be done in the shrinking phase.
Let’s see a transaction implementing 2-PL.

T1 T2

1 lock-S(A)

2 lock-S(A)

3 lock-X(B)

4 ………. ……….

5 Unlock(A)

6 Lock-X(C)

7 Unlock(B)
T1 T2

8 Unlock(A)

9 Unlock(C)

10 ………. ……….

This is just a skeleton transaction that shows how unlocking and locking work with
2-PL.
Transaction T1
● The growing Phase is from steps 1-3
● The shrinking Phase is from steps 5-7
● Lock Point at 3

Transaction T2
● The growing Phase is from steps 2-6
● The shrinking Phase is from steps 8-9
● Lock Point at 6

Lock Point

The Point at which the growing phase ends, i.e., when a transaction takes the final
lock it needs to carry on its work. Now look at the schedule, you’ll surely
understand. I have said that 2-PL ensures serializability, but there are still some
drawbacks of 2-PL. Let’s glance at the drawbacks.
● Cascading Rollback is possible under 2-PL.
● Deadlocks and Starvation are possible.

Cascading Rollbacks in 2-PL


Let’s see the following Schedule
Cascading Roll-Back

Take a moment to analyze the schedule. Yes, you’re correct, because of Dirty
Read in T2 and T3 in lines 8 and 12 respectively, when T1 failed we have to roll
back others also. Hence, Cascading Rollbacks are possible in 2-PL.

Strict Two-Phase Locking Protocol

Strict Two-Phase Locking requires that in addition to the 2-PL all Exclusive(X)
locks held by the transaction be released until after the Transaction Commits. For
more details refer the published article Strict Two-Phase Locking Protocol.

Upgrade / Downgrade locks

A transaction that holds a lock on an item Ais allowed under certain condition to
change the lock state from one state to another.

Upgrade: A S(A) can be upgraded to X(A) if Ti is the only transaction holding the
S-lock on element A.
Downgrade: We may downgrade X(A) to S(A) when we feel that we no longer
want to write on data-item A. As we were holding X-lock on A, we need not check
any conditions.

Categories of Two Phase Locking (Strict, Rigorous &


Conservative)

In Two-Phase Locking (2-PL) the basic rules which should be followed which
ensures serializability.

Moreover - problems with 2-PL are Cascading Aborts, and Deadlocks.

The enhancements made on 2-PL to make the protocol nearly error-free.


There are three categories:

1. Strict 2-PL
2. Rigorous 2-PL
3. Conservative 2-PL

Strict 2-PL –
This requires that in addition to the lock being 2-Phase all Exclusive(X) locks held
by the transaction be released until after the Transaction Commits.
Following Strict 2-PL ensures that our schedule is:
● Recoverable
● Cascadeless
Hence, it gives us freedom from Cascading Abort which was still there in Basic
2-PL and moreover guarantee Strict Schedules but still, Deadlocks are possible!

Rigorous 2-PL –
This requires that in addition to the lock being 2-Phase all Exclusive(X) and
Shared(S) locks held by the transaction be released until after the Transaction
Commits. Following Rigorous 2-PL ensures that our schedule is:
● Recoverable
● Cascadeless

Note: The difference between Strict 2-PL and Rigorous 2-PL is that Rigorous is
more restrictive, it requires both Exclusive and Shared locks to be held until after
the Transaction commits and this is what makes the implementation of Rigorous
2-PL easier.
Conservative 2-PL –

A.K.A Static 2-PL, this protocol requires the transaction to lock all the items it
access before the Transaction begins execution by predeclaring its read-set and
write-set.
If any of the predeclared items needed cannot be locked, the transaction does not
lock any of the items, instead, it waits until all the items are available for locking.

Conservative 2-PL is Deadlock free and but it does not ensure a Strict
schedule(More about this here!).

However, it is difficult to use in practice because of the need to predeclare the


read-set and the write-set which is not possible in many situations. In practice, the
most popular variation of 2-PL is Strict 2-PL.

Venn Diagram below shows the classification of schedules that are rigorous and
strict. The universe represents the schedules that can be serialized as 2-PL.

Now as the diagram suggests, and it can also be logically concluded, if a schedule
is Rigorous then it is Strict.

Image – Venn Diagram showing categories of languages under 2-PL


Now, let’s see the schedule below, tell me if this schedule can be locked using
2-PL, and if yes, show how and what class of 2-PL does your answer belongs to.
T1 T2 Yes, the schedule is conflict serializable, so we can
try implementing 2-PL. So, let’s try…
1 Read(A) Solution:

2 Read(A)

3 Read(B)

4 Write(B)

5 Commit

6 Read(B)

7 Write(B)

6 Commit

T1 T2

1 Lock-S(A)

2 Read(A)

3 Lock-S(A)

4 Read(A)

5 Lock-X(B)

6 Read(B)

7 Write(B)

8 Commit
9 Unlock(A)

10 Unlock(B)

11 Lock-X(B)

12 Read(B)

13 Write(B)

14 Commit

15 Unlock(A)

16 Unlock(B)

Timestamp Ordering Protocol


o The Timestamp Ordering Protocol is used to order the transactions based on
their Timestamps. The order of transaction is nothing but the ascending order
of the transaction creation.
o The priority of the older transaction is higher that's why it executes first. To
determine the timestamp of the transaction, this protocol uses system time or
logical counter.
o The lock-based protocol is used to manage the order between conflicting
pairs among transactions at the execution time. But Timestamp based
protocols start working as soon as a transaction is created.
o Let's assume there are two transactions T1 and T2. Suppose the transaction
T1 has entered the system at 007 times and transaction T2 has entered the
system at 009 times. T1 has the higher priority, so it executes first as it is
entered the system first.
o The timestamp ordering protocol also maintains the timestamp of last 'read'
and 'write' operation on a data.
Basic Timestamp ordering protocol works as follows:

1. Check the following condition whenever a transaction Ti issues a Read (X) operation:

o If W_TS(X) >TS(Ti) then the operation is rejected.


o If W_TS(X) <= TS(Ti) then the operation is executed.
o Timestamps of all the data items are updated.
2. Check the following condition whenever a transaction Ti issues a Write(X) operation:

o If TS(Ti) < R_TS(X) then the operation is rejected.


o If TS(Ti) < W_TS(X) then the operation is rejected and Ti is rolled back
otherwise the operation is executed.
Where,

TS(TI) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the Read time-stamp of data-item X.

W_TS(X) denotes the Write time-stamp of data-item X.

Advantages and Disadvantages of TO protocol:


o TO protocol ensures serializability since the precedence graph is as follows:

o TS protocol ensures freedom from deadlock that means no transaction ever


waits.
o But the schedule may not be recoverable and may not even be cascade- free.

You might also like