0% found this document useful (0 votes)
11 views48 pages

Lec 18

The document discusses transactions in database management systems (DBMS), focusing on their ACID properties: Atomicity, Consistency, Isolation, and Durability. It explains the importance of concurrency control and recovery in ensuring correct database states during operations, particularly in scenarios involving multiple transactions. The document also covers the mechanisms for ensuring these properties and the implications of various transaction execution schedules.

Uploaded by

p20232002567
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views48 pages

Lec 18

The document discusses transactions in database management systems (DBMS), focusing on their ACID properties: Atomicity, Consistency, Isolation, and Durability. It explains the importance of concurrency control and recovery in ensuring correct database states during operations, particularly in scenarios involving multiple transactions. The document also covers the mechanisms for ensuring these properties and the implications of various transaction execution schedules.

Uploaded by

p20232002567
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 48

Advanced Database Systems

Spring 2025

Lecture #18:
Transactions

R&G: Chapters 16 & 17


2

A RCHITECTURE OF A DBMS
Up until now we have assumed a SQ L C lie nt
single-user architecture and failure-
Q uer y Pl anning
free execution
O pera tor Exe cuti on

F ile s & Index Mana ge me nt


C onc urre ncy Control
Buf fe r Mana ge me nt
Re cove r y
Di sk S pace Ma nag em e nt

Da taba se
3

M OTIVATION
We both change the same record in
Concurrency Control
a table at the same time.
How to avoid race condition?

You transfer £100 between bank accounts


but there is a power failure. Recovery
What is the correct database state?

Both concurrency control and recovery are based


on a concept of transactions with ACID properties
4

T RANSACTIONS
A transaction is the execution of a
Transaction 1 Transaction 2 Transaction 3
sequence of operations (e.g., SQL BEGIN XACT BEGIN XACT BEGIN XACT

queries) on a shared database to SELECT INSERT DELETE


INSERT DELETE DELETE
perform some higher-level function DELETE ROLLBACK COMMIT
COMMIT
Basic unit of change in a DBMS
Partial transactions are not allowed!
5

U SER P ERSPECTIVE : T RANSACTIONS


Transaction (abbr. txn) = group of operations the user wants
the DBMS to treat “as one”

A new transaction starts with the BEGIN command

The transaction stops with either COMMIT or ABORT (ROLLBACK)


If commits, all changes are saved
If aborts, all changes are undone (as if the txn never executed at all)

Abort can be either self-inflicted or caused by DBMS


6

T RANSACTION E XAMPLE
Transfer £100 from Checking to Savings account of user 1904

BEGIN Consistent DB
// check if Checking balance > 100
UPDATE Accounts
SET balance = balance – 100
WHERE customer_id = 1904
AND account_type = ‘Checking’;
UPDATE Accounts
Temporary inconsistent DB
SET balance = balance + 100
WHERE customer_id = 1904
AND account_type = ‘Savings’;
COMMIT Consistent DB
7

T RANSACTION E XAMPLE
Transfer £100 from Checking to Savings account of user 1904

BEGIN How to check if balance > 100?


// check if Checking balance > 100
UPDATE Accounts Outside DBMS using another language
SET balance = balance – 100
WHERE customer_id = 1904 E.g., in Java or PHP code
AND account_type = ‘Checking’;
UPDATE Accounts Inside DBMS using stored procedures
SET balance = balance + 100 expressed in PL/SQL or T-SQL
WHERE customer_id = 1904
AND account_type = ‘Savings’; PL/SQL = SQL + procedural constructs such as

COMMIT if-then-else, loops, variables, functions…


8

D ATABASE P ERSPECTIVE
A transaction may carry out many operations on the data
retrieved from the database

However, the DBMS is only concerned about what data is


read/written from/to the database
Changes to the “outside world” are beyond scope of the DBMS
9

T RANSACTIONS : F ORMAL D EFINITION


Database = fixed set of named data objects (A, B, C, …)
Transactions access object A using read A and write A, for short R(A) and W(A)
In a relational DBMS, an object can be an attribute, record, page, or table

Transaction = sequence of read and write operations


T = ⟨ R(A), W(A), W(B), … ⟩
DBMS’s abstract view of a user program
10

S TRAWMAN E XECUTION
Execute each txn one-by-one (serial order) as they arrive in the DBMS
One and only one txn can be running at the same time in the DBMS

Before a txn starts, copy the entire database to a new file and make all
changes to that file
If the txn completes successfully, overwrite the original file with the new one
If the txn fails, just remove the dirty copy

SQLite executes transactions in serial order


11

C ONCURRENT E XECUTION
A better approach is to allow concurrent execution of independent transactions

Why do we want that? Txn 1 Txn 2 Txn 3


Better resource utilization and throughput (txns/sec) 3 3 3
2 2 2
Use the CPU while another txn is waiting for the disk
1 1 1
Multicore: Ideally, scale throughput in the # of CPUs
SCHEDULER
Decreased response times to users
One txn’s latency need not be dependent on another unrelated txn 1

Or that’s the hope 1


2
But we also would like correctness and fairness 1

EXECUTOR
12

T RANSACTION G UARANTEES : ACID


Atomicity: All actions in the txn happen, or none happen
“all or nothing”

Consistency: If each txn is consistent and the DB starts


consistent, then it ends up consistent
“it looks correct to me”

Isolation: Execution of one txn is isolated from that of other txns

“as if alone”
Durability: If a txn commits, its effects persist

“survive failures”
13

ACID P ROPERTIES : ATOMICITY


Two possible outcomes of executing a transaction:
Commit after completing all actions
Abort (or be aborted by the DBMS) after executing some actions

The DBMS guarantees that transactions are atomic


From user’s point of view:
A transaction always either executes all its actions or executes no actions at all

Example:
Take £100 from account A, but then a power failure happens before crediting account B
When the DBMS comes back online, what should be the correct state of the database?
14

M ECHANISMS FOR E NSURING ATOMICITY


Approach #1: Logging
DBMS logs all actions so that it can undo the actions of aborted transactions
Write-ahead logging is used by almost all modern database systems
Efficiency reasons: random writes turned into sequential writes through a log
Audit trail: everything done by the app is recorded

Approach #2: Shadow Paging (copy-on-write)


DBMS makes copies of pages and transactions make changes to those copies
Only when the transaction commits is the page made visible to others
Few database systems do this (CouchDB, LMDB)
15

ACID P ROPERTIES : C ONSISTENCY

Database consistency
The database accurately models the real world and follows integrity constraints
Transactions in the future see the effects of transactions committed in the past

Transaction consistency
If the database is consistent before the txn starts (running alone), it will be also consistent after
Transaction consistency is the application’s responsibility!
16

ACID P ROPERTIES : I SOLATION


Users submit transactions, and each transaction executes as
if it was running alone

The DBMS achieves concurrency by interleaving actions


(read/writes of database objects) of various transactions

How do we achieve this?


17

M ECHANISMS FOR E NSURING I SOLATION


A concurrency control protocol is how the DBMS decides the proper
interleaving of operations from multiple transactions

Two main categories:


Pessimistic: Don’t let problems arise in the first place

Optimistic: Assume conflicts are rare, deal with them after they happen
18

E XAMPLE
Assume at first accounts A and B each have £1000

T1 transfers £100 from A to B

T2 credits both accounts with 6% interest

T1 T2
BEGIN BEGIN
A = A - 100 A = A * 1.06
B = B + 100 B = B * 1.06
END END
20

E XAMPLE
Assume at first accounts A and B each have £1000

What are the possible outcomes of running T1 and T2?

Many! But A+B should be 2000 * 1.06 = 2120

There is no guarantee that T1 will execute before T2 or vice versa,


if both are submitted together

But the net effect must be equivalent to these two transactions running
serially in some order
22

E XAMPLE : S ERIAL E XECUTION


Schedule Schedule

T1 T2 T1 T2
BEGIN BEGIN
A = A - 100 A = A * 1.06
B = B + 100 B = B * 1.06
TIME

COMMIT
BEGIN
A = A * 1.06
≡ BEGIN
A = A - 100
COMMIT

B = B * 1.06 B = B + 100
COMMIT COMMIT

A = 954, B = 1166 A = 960, B = 1160

A+B = 2120
24

E XAMPLE : I NTERLEAVING (G OOD )


Schedule Schedule

T1 T2 T1 T2
BEGIN BEGIN
A = A - 100 A = A - 100
BEGIN B = B + 100
TIME

B = B + 100
COMMIT
A = A * 1.06
≡ COMMIT
BEGIN
A = A * 1.06
B = B * 1.06 B = B * 1.06
COMMIT COMMIT

A = 954, B = 1166 A = 954, B = 1166


25

E XAMPLE : I NTERLEAVING (B AD )
Schedule

T1 T2
BEGIN
A = A - 100
BEGIN A = 954, B = 1166
TIME

A = A * 1.06
B = B * 1.06
COMMIT
≢ or
B = B + 100 A = 960, B = 1160
COMMIT

A = 954, B = 1160

The bank is missing £6!


A+B = 2114
26

E XAMPLE : I NTERLEAVING (B AD )
Schedule DBMS View

T1 T2 T1 T2
BEGIN BEGIN
A = A - 100 R(A)
BEGIN W(A)
TIME

A = A * 1.06 BEGIN
B = B * 1.06 R(A)
COMMIT W(A)
B = B + 100 R(B)
COMMIT W(B)
COMMIT
A = 954, B = 1160 R(B)
W(B)
COMMIT

A+B = 2114
27

C ORRECTNESS
How do we judge whether a schedule is correct?

If the schedule is equivalent to some serial execution

Schedule S for a set of transactions { T1, … , Tn }


S contains all steps of all transactions and order among steps in each Ti is preserved
S = ⟨ (T1, read B), (T2, read A), (T2, write B), (T1, write A) ⟩
for short, S = ⟨ R1(B), R2(A), W2(B), W1(A) ⟩
28

F ORMAL P ROPERTIES OF S CHEDULES


Equivalent schedules
For any database state, the effect of executing the first schedule is identical
to the effect of executing the second schedule
Does not matter what the higher-level operations are!

Serial schedule (no concurrency)


A schedule that does not interleave the actions of different transactions

Txn 2 Txn 1 Txn 3


1 2 3 1 2 3 1 2 3

TIME
29

F ORMAL P ROPERTIES OF S CHEDULES


Serializable schedule
A schedule that is equivalent to some serial execution of the transactions
If each transaction preserves consistency, every serializable
schedule preserves consistency

Serializability
Less intuitive notion of correctness compared to transaction initiation time
or commit order
But it provides the DBMS with flexibility in scheduling operations
More flexibility means better parallelism
30

C ONFLICTING O PERATIONS
We need a formal notion of equivalence that can be implemented
efficiently based on the notion of “conflicting” operations

Two operations conflict if


They are by different transactions
They are on the same object and at least one of them is a write

Interleaved execution anomalies:


Read-Write conflicts (R-W)
Write-Read conflicts (W-R)
Write-Write conflicts (W-W)
31

R EAD -W RITE C ONFLICTS


Unrepeatable Reads

T1 T2
BEGIN
£10 R(A)
BEGIN
R(A) £10
W(A) £19
COMMIT
£19 R(A)
COMMIT
32

W RITE -R EAD C ONFLICTS


Reading Uncommitted Data (“Dirty Reads”)

T1 T2
BEGIN
£10 R(A)
£12 W(A)
BEGIN
R(A) £12
W(A) £14
COMMIT
ABORT

Not recoverable
33

W RITE -W RITE C ONFLICTS


Overwriting Uncommitted Data (”Lost Update”)

T1 T2
BEGIN
£10 W(A)
BEGIN
W(A) £19
W(B) Alice
COMMIT
Michael W(B)
COMMIT
34

F ORMAL P ROPERTIES OF S CHEDULES


Given these conflicts, we can now understand what it means
for a schedule to be serializable
This is to check whether schedules are correct

This is not how to generate a correct schedule

There are levels of serializability


Conflict Serializability
Most DBMS try to support this
View Serializability

No DBMS supports this


35

C ONFLICT S ERIALIZABLE S CHEDULES


Two schedules are conflict equivalent iff
They involve the same actions of the same transactions

Every pair of conflicting actions is ordered in the same way

Schedule S is conflict serializable if S is conflict equivalent to some serial


schedule
Intuition: Schedule S is conflict serializable if you can transform S into a serial schedule by
swapping consecutive non-conflicting operations of different txns
36

C ONFLICT S ERIALIZABILITY : I NTUITION


Schedule

T1 T2
BEGIN BEGIN
R(A)
W(A)
TIME

R(A)
W(A)
R(B)
W(B)
COMMIT
R(B)
W(B)
COMMIT
37

C ONFLICT S ERIALIZABILITY : I NTUITION


Schedule

T1 T2
BEGIN BEGIN
R(A)
W(A)
TIME

R(A)
R(B)
W(A)
W(B)
COMMIT
R(B)
W(B)
COMMIT
38

C ONFLICT S ERIALIZABILITY : I NTUITION


Schedule

T1 T2
BEGIN BEGIN
R(A)
W(A)
TIME

R(B)
R(A)
W(A)
W(B)
COMMIT
R(B)
W(B)
COMMIT
39

C ONFLICT S ERIALIZABILITY : I NTUITION


Schedule

T1 T2
BEGIN BEGIN
R(A)
W(A)
TIME

R(B)
R(A)
W(B)
W(A)
COMMIT
R(B)
W(B)
COMMIT
40

C ONFLICT S ERIALIZABILITY : I NTUITION


Schedule Serial schedule

T1 T2 T1 T2
BEGIN BEGIN BEGIN
R(A) R(A)
W(A) W(A)


TIME

R(B) R(B)
W(B) W(B)
R(A) COMMIT BEGIN
W(A) R(A)
COMMIT W(A)
R(B) R(B)
W(B) W(B)
COMMIT COMMIT

Serializable
41

C ONFLICT S ERIALIZABILITY : I NTUITION


Schedule Serial schedule

T1 T2 T1 T2
BEGIN BEGIN BEGIN
TIME

R(A) R(A)
R(A) W(A)

W(A)
W(A)
≢ COMMIT BEGIN
R(A)
COMMIT COMMIT W(A)
COMMIT

Not conflict-serializable
42

S ERIALIZABILITY
Swapping operations is easy when there are only two txns in the schedule

But it’s cumbersome when there are many txns

Are there any faster algorithms to figure this out other than transposing
operations?
43

D EPENDENCY G RAPHS
Dependency Graph

Dependency graph for a schedule


Ti Tj
One node per transaction
Edge from Ti to Tj if:
Operation Oi of Ti conflicts with an operation Oj of Tj and
Oi appears earlier in the schedule than Oj

Also known as a conflict graph or precedence graph

A schedule is conflict-serializable if and only if


its dependency graph is acyclic

Equivalent serial schedule can be obtained by sorting the graph topologically


44

E XAMPLE #1
Schedule Dependency Graph

T1 T2
BEGIN BEGIN
R(A) T1 T2
W(A)
TIME

R(A)
W(A)
R(B)
W(B)
COMMIT
R(B) The cycle in the graph reveals
W(B) the problem. The output of T1
COMMIT depends on T2, and vice versa.
45

E XAMPLE #2 - T HREESOME
Schedule Dependency Graph

T1 T2 T3
T1 T2
BEGIN
R(A)
W(A) BEGIN
T3
TIME

R(A)
W(A)
BEGIN COMMIT
R(B)
W(B)
R(B) COMMIT Is this equivalent to a serial schedule?
W(B)
COMMIT
Yes, (T2, T1, T3)
Notice that T3 should go after T2 although
T3 starts before T2!
48

V IEW S ERIALIZABILITY
Alternative (weaker) notion of serializability

Schedule S1 and S2 are view equivalent iff


If T1 reads initial value of A in S1, then T1 also reads initial value of A in S2
If T1 reads value of A written by T2 in S1, then T1 also reads value of A
written by T2 in S2
If T1 writes final value of A in S1, then T1 also writes final value of A in S2
49

V IEW S ERIALIZABILITY
Schedule Dependency Graph

T1 T2 T3
BEGIN
R(A) BEGIN T1 T2
W(A)
TIME

BEGIN
W(A)
W(A)
COMMIT COMMIT COMMIT
T3

Not conflict serializable. But is this


equivalent to a serial schedule?
50

V IEW S ERIALIZABILITY
Schedule Serial Schedule

T1 T2 T3 T1 T2 T3
BEGIN BEGIN
R(A) BEGIN R(A)
W(A) W(A)
TIME

BEGIN VIEW COMMIT


W(A)

COMMIT COMMIT
W(A)
COMMIT
≡ BEGIN
W(A)
COMMIT
BEGIN
W(A)
COMMIT

Allows all conflict serializable


schedules + “blind writes”
51

S ERIALIZABILITY
Conflict serializability All Schedules
Can enforced efficiently
View Serializable
All DBMSs support it
Conflict Serializable
View serializability
Admits (slightly) more schedules than CS Serial
But it is difficult to enforce efficiently
No DBMS supports it

Neither definition allows all “serializable” schedules


They do not understand the meaning of the operations or the data
52

ACID P ROPERTIES : D URABILITY


All of the changes of committed transactions must be persistent
No torn updates

No changes from failed transactions

The DBMS uses either logging or shadow paging to ensure


that all changes are durable

More about logging in next lectures


53

S UMMARY
ACID Transactions Serializability
Atomicity: All or nothing Serializable schedules
Consistency: Only valid data Conflict & view serializability
Isolation: No interference Checking for conflict serializability
Durability: Committed data persists

Concurrency control and recovery are among the most important functions
provided by a DBMS

You might also like