Module 5

Module 5 covers transaction processing concepts including concurrency control, transaction states, and recovery mechanisms. It discusses various transaction problems like lost updates and dirty reads, as well as the significance of ACID properties for maintaining database integrity. Additionally, it introduces NoSQL databases and their characteristics, alongside the importance of system logs for transaction recovery.

Uploaded by

Aann Mariya Sabu


Module 5

• Transaction Processing Concepts - overview of concurrency control,
Transaction Model, Significance of Concurrency Control & Recovery,
Transaction States, System Log, Desirable Properties of transactions.
• Serial schedules, Concurrent and Serializable Schedules, Conflict
equivalence and conflict serializability, Recoverable and cascade-less
schedules, Locking, Two-phase locking and its variations. Log-based
recovery, Deferred database modification, check-pointing.
• Introduction to NoSQL Databases, Main characteristics of Key-value DB
(examples from: Redis), Document DB (examples from: MongoDB)
• Main characteristics of Column - Family DB (examples from:
Cassandra) and Graph DB (examples from : ArangoDB)
Transaction Processing

• A transaction is an action, or series of actions, performed by a user or an
application program that reads or updates the contents of the database
• A transaction is an executing program that forms a logical unit of database
processing
• The operations performed in a transaction include one or more database
operations such as insert, delete, update or retrieve.
• Examples of transaction processing: airline reservation, banking
transactions, etc.
• One way of specifying the transaction boundaries is with explicit
begin transaction and end transaction statements
• The basic database access operations that a transaction can include are as
follows

• read_item(X) − reads data item X from storage into main memory

• modify_item(X) − changes the value of item X in main memory

• write_item(X) − writes the modified value of X from main memory back to storage
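
The three access operations can be sketched in Python as follows. This is only a minimal illustration, assuming that "storage" and "main memory" are plain dictionaries; the class name `TinyTransaction` and its fields are hypothetical and do not come from any DBMS.

```python
class TinyTransaction:
    """Toy model: `disk` stands in for storage, `buffer` for main memory."""

    def __init__(self, disk):
        self.disk = disk      # shared "storage" (assumption: a plain dict)
        self.buffer = {}      # private main-memory copies of items

    def read_item(self, name):
        # Copy the item from storage into main memory.
        self.buffer[name] = self.disk[name]
        return self.buffer[name]

    def modify_item(self, name, new_value):
        # Change only the in-memory copy; storage is untouched.
        self.buffer[name] = new_value

    def write_item(self, name):
        # Copy the modified value from main memory back to storage.
        self.disk[name] = self.buffer[name]


disk = {"X": 100}
t = TinyTransaction(disk)
t.read_item("X")
t.modify_item("X", 80)
value_before_write = disk["X"]   # storage still holds the old value
t.write_item("X")
value_after_write = disk["X"]    # storage now holds the new value
```

The point of the sketch is that a modification becomes visible in storage only after write_item runs, which is what makes interleaving between the read and the write risky.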

Single User and Multi User Database Systems
Interleaved Processing Vs Parallel Processing
► The figure shows two processes, A and B, executing concurrently in
an interleaved fashion.
► If the computer system has multiple hardware processors (CPUs),
parallel processing of multiple processes is possible, as illustrated
by processes C and D in the figure.
Source: https://www.geeksforgeeks.org/
Concurrent Transactions

• Several transactions may be executed concurrently

• Concurrency control and recovery mechanisms are mainly concerned with
the database commands in a transaction

• If this concurrent execution is uncontrolled, it may lead to problems

Why Concurrency Control Is Needed?
• When multiple transactions execute concurrently in an uncontrolled
or unrestricted manner, then it might lead to several problems.
• These problems are commonly referred to as concurrency problems
in database environment.
• The concurrency problems that can occur in database are:
– Temporary Update Problem (Dirty Read Problem)
– Incorrect Summary Problem
– Lost Update Problem
– Unrepeatable Read Problem
Lost update problem

• In the lost update problem, an update made to a data item by one
transaction is lost because it is overwritten by the update made by
another transaction.
• At time t1, transaction TX reads the value of account A, i.e., $300
(read only).
• At time t2, transaction TX deducts $50 from account A, giving $250
(only computed, not yet written).
• Meanwhile, at time t3, transaction TY reads the value of account A,
which is still $300 because TX has not written its update yet.
• At time t4, transaction TY adds $100 to account A, giving $400
(only computed, not yet written).
• At time t6, transaction TX writes the value of account A, which
becomes $250, as TY has not written its value yet.
• Similarly, at time t7, transaction TY writes the value it computed at
time t4, so account A becomes $400. The value written by TX is lost,
i.e., the $250 update is lost.
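
The timeline above can be replayed step by step in Python. This is a sketch that simulates the interleaving sequentially, with no real concurrency; the variable names `tx_local` and `ty_local` are illustrative stand-ins for each transaction's private workspace.

```python
account = {"A": 300}          # shared database item

# t1: TX reads A
tx_local = account["A"]
# t2: TX deducts 50 in its own workspace (not yet written)
tx_local -= 50
# t3: TY reads A; it still sees 300 because TX has not written
ty_local = account["A"]
# t4: TY adds 100 in its own workspace (not yet written)
ty_local += 100
# t6: TX writes its result
account["A"] = tx_local
# t7: TY writes its result, overwriting TX's update
account["A"] = ty_local

final_balance = account["A"]
serial_balance = 300 - 50 + 100   # what running TX and TY one after the other gives
```

The interleaved run ends with 400, while either serial order would end with 350: the deduction of 50 made by TX has disappeared.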
Temporary Update Problem

• The temporary update (dirty read) problem occurs when one transaction
updates an item and then fails, but the updated item is read by another
transaction before it is reverted to its original value.
► In this example, if transaction 1 fails for some reason, then X will
revert to its previous value.
► But transaction 2 has already read the incorrect value of X.
Incorrect Summary Problem

• Consider a situation where one transaction is applying an aggregate
function on some records while another transaction is updating these
records.
• The aggregate function may calculate some values before the values have
been updated and others after they are updated.

► In this example, transaction 2 is calculating the sum of some records
while transaction 1 is updating them.
► Therefore the aggregate function may calculate some values before they
have been updated and others after they have been updated.
Unrepeatable Read Problem

• A transaction T reads the same item twice and the item is changed
by another transaction T' between the two reads. So T receives different
values for its two reads of the same item.
• Example: a customer inquires about seat availability on several
flights. If another transaction changes the number of available seats on a
particular flight before the customer completes the reservation, the
customer's transaction may end up reading a different value for the same item.
Unrepeatable Read Problem:

• The unrepeatable read problem occurs when two or more read operations
of the same transaction read different values of the same variable.

► In this example, once transaction 2 reads the variable X, a write
operation in transaction 1 changes the value of the variable X.
► Thus, when another read operation is performed by transaction 2, it
reads the new value of X which was updated by transaction 1.
Why Recovery Is Needed
• Whenever a transaction is submitted to a DBMS for execution
– The system is responsible for making sure that either all the operations
in the transaction are completed successfully and their effect is
recorded permanently in the database (committed), or that the
transaction does not have any effect on the database or any other
transactions (aborted)
Types of Failures

• Computer failure (system crash)

• Transaction or system error
• Local errors or exception conditions detected by the transaction
• Disk failure
• Physical problems and catastrophes

Types of Failures

• A computer failure (system crash). A hardware, software, or network error
occurs in the computer system during transaction execution.
• A transaction or system error. Some operation in the transaction may cause it
to fail, for example:
– Integer overflow or division by zero.
– Erroneous parameter values
– Logical programming error.
– User may interrupt the transaction during its execution
Why Recovery Is Needed

• Types of Failures
• Local errors or exception conditions detected by the transaction
– During transaction execution, certain conditions may occur that
necessitate cancellation of the transaction.
– Eg: Insufficient account balance in a banking database, may cause a
transaction, such as a fund withdrawal, to be canceled.
• Disk failure
– Some disk blocks may lose their data because of a read or write
malfunction or because of a disk read/write head crash.
– This may happen during a read or a write operation of the transaction.
• Physical problems and catastrophes.
– This refers to an endless list of problems that includes power or air-
conditioning failure, fire, theft, sabotage, overwriting disks or tapes
by mistake, and mounting of a wrong tape by the operator.
What to be done if a transaction fails

• If a transaction fails after executing some of its operations but before
executing all of them, the operations already executed must be undone

• Whenever a failure occurs, the system must quickly recover from the
failure
States of Transactions

1. Once a transaction starts execution, it becomes active. It can issue READ or WRITE
operations.
2. Once the READ and WRITE operations complete, the transaction enters the partially
committed state.
3. Next, some recovery protocols need to ensure that a system failure will not result in an
inability to record changes in the transaction permanently. If this check is a success, the
transaction commits and enters the committed state.
4. If the check fails, the transaction goes to the failed state.
5. If the transaction is aborted while it is in the active state, it goes to the failed state. The
transaction should be rolled back to undo the effect of its write operations on the
database.
6. The terminated state refers to the transaction leaving the system.
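
The state transitions listed above can be written down as a small state machine. The following Python sketch is one possible encoding; the names `State`, `ALLOWED` and `move` are illustrative and not part of any DBMS.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    TERMINATED = "terminated"


# Allowed transitions, following points 1 to 6 above.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.TERMINATED},
    State.TERMINATED: set(),
}


def move(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


# A transaction that completes normally.
s = State.ACTIVE
s = move(s, State.PARTIALLY_COMMITTED)
s = move(s, State.COMMITTED)
s = move(s, State.TERMINATED)
```

Note that there is no edge from committed to failed: once a transaction has committed it cannot be rolled back.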
Transaction Operations
► The low level operations performed in a transaction are −
► begin_transaction − A marker that specifies start of transaction execution.
► read_item or write_item − Database operations that may be interleaved with
main memory operations as a part of transaction.
► end_transaction − A marker that specifies end of transaction.
► commit − A signal to specify that the transaction has been successfully
completed in its entirety and will not be undone.
► rollback − A signal to specify that the transaction has been unsuccessful and
so all temporary changes in the database are undone. A committed transaction
cannot be rolled back.
THE SYSTEM LOG

• To be able to recover from failures that affect transactions, the system
maintains a log to keep track of all transaction operations that affect the
values of database items, as well as other transaction information that may
be needed to permit recovery from failures.
• The log is a file that is kept on disk
• The following are the types of entries, called log records, that are written to
the log file
The System Log
1. [start_transaction, T]. Indicates that transaction T has started execution.
2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has
changed the value of database item X from old_value to new_value.
3. [read_item, T, X]. Indicates that transaction T has read the value of database
item X.
4. [commit, T]. Indicates that transaction T has completed successfully, and
affirms that its effect can be committed (recorded permanently) to the
database.
5. [abort, T]. Indicates that transaction T has been aborted.
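
To show how these log records support recovery, here is a small Python sketch. It assumes that writes are applied to the database immediately and that recovery simply undoes the writes of transactions with no commit record; the function names and the tuple layout are illustrative choices that mirror the record types above, not a real DBMS log format.

```python
db = {"X": 500, "Y": 200}
log = []


def write_item(t, item, new_value):
    # Record the old and new value BEFORE changing the database item.
    log.append(("write_item", t, item, db[item], new_value))
    db[item] = new_value


log.append(("start_transaction", "T1"))
write_item("T1", "X", 400)
log.append(("start_transaction", "T2"))
write_item("T2", "Y", 250)
log.append(("commit", "T2"))
# ... assume a crash here: T1 never wrote its commit record ...


def undo_uncommitted(db, log):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Scan the log backwards and restore old values of uncommitted writes.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value


undo_uncommitted(db, log)
```

After recovery, X is back to its old value because T1 did not commit, while the update of Y by the committed transaction T2 is kept.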
Properties of Transactions
► ACID Properties are used for maintaining the integrity of database during
transaction processing.
► ACID in DBMS stands for Atomicity, Consistency, Isolation, and Durability.
Atomicity
► By this, we mean that either the entire transaction takes place at once or
doesn’t happen at all.
► There is no midway i.e. transactions do not occur partially.
► Each transaction is considered as one unit and either runs to completion or is
not executed at all.
► It involves the following two operations.
► Abort: If a transaction aborts, changes made to database are not visible.

► Commit: If a transaction commits, changes made are visible.

► Atomicity is also known as the ‘All or nothing rule’.

Atomicity
► Consider the following transaction T consisting of T1 and T2: Transfer of 100
from account X to account Y.

► If the transaction fails after completion of T1 but before completion of T2
(say, after write(X) but before write(Y)), then the amount has been deducted from
X but not added to Y.
► This results in an inconsistent database state. Therefore, the transaction must
be executed in its entirety in order to ensure correctness of the database state.
Consistency
► This means that integrity constraints must be maintained so that the database
is consistent before and after the transaction.
► It refers to the correctness of a database.
► Referring to the example above,
► The total amount before and after the transaction must be maintained.
► Total before T occurs = 500 + 200 = 700.
► Total after T occurs = 400 + 300 = 700.
► Therefore, database is consistent. Inconsistency occurs in case T1 completes
but T2 fails. As a result T is incomplete.
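
The transfer example can be sketched in Python to show atomicity and consistency together. This is a simplified illustration, assuming rollback is done by restoring a snapshot taken at the start; the function `transfer` and the `fail_midway` flag are hypothetical names used only to simulate a failure between T1 and T2.

```python
def transfer(accounts, src, dst, amount, fail_midway=False):
    snapshot = dict(accounts)            # remembered state for rollback
    try:
        accounts[src] -= amount          # T1: write(X)
        if fail_midway:
            raise RuntimeError("simulated failure between T1 and T2")
        accounts[dst] += amount          # T2: write(Y)
    except RuntimeError:
        accounts.clear()
        accounts.update(snapshot)        # abort: undo the partial change
        return False
    return True                          # commit


accounts = {"X": 500, "Y": 200}
ok = transfer(accounts, "X", "Y", 100)
total_after_commit = accounts["X"] + accounts["Y"]

failed = transfer(accounts, "X", "Y", 100, fail_midway=True)
total_after_abort = accounts["X"] + accounts["Y"]
```

In both the committed and the aborted case the total stays at 700, which is the consistency condition from the example.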
Isolation

• In a database system, more than one transaction may be executed
simultaneously and in parallel
• The term 'isolation' means separation.
• In DBMS, isolation is the property that concurrently executing transactions
should not affect one another. In short, the outcome should be as if the
operations of one transaction began only after the operations of the other had
completed
• The property of isolation states that every transaction will be carried out and
executed as if it were the only transaction in the system.
• No transaction will affect the existence of any other transaction.
• Isolation can be ensured trivially by running transactions serially, that is, one
after the other.
• Although multiple transactions may execute concurrently, each transaction
must be unaware of other concurrently executing transactions.
• Intermediate transaction results must be hidden from other concurrently
executed transactions.
Isolation
► Let X = 500, Y = 500. Consider two transactions T and T''.

► Suppose T has been executed till Read(Y) and then T'' starts. As a result,
interleaving of operations takes place, due to which T'' reads the correct value of
X but an incorrect value of Y, and the sum computed by
► T'': (X + Y = 50,000 + 500 = 50,500) is thus not consistent with the sum at the end
of transaction
► T: (X + Y = 50,000 + 450 = 50,450).
► This results in database inconsistency, due to a loss of 50 units. Hence,
transactions must take place in isolation, and changes should be visible only
after they have been made to the main memory.
Durability
► The database should be durable enough to hold all its latest updates even if
the system fails or restarts.
► If a transaction updates a chunk of data in a database and commits, then the
database will hold the modified data.
► If a transaction commits but the system fails before the data could be written
on to the disk, then that data will be updated once the system springs back
into action.
Schedules

• A schedule (or history) S of n transactions T1, T2, ..., Tn is an ordering of
the operations of the transactions.
• Operations from different transactions can be interleaved in the schedule S.
• A schedule is defined as an execution sequence of transactions
• It is the arrangement of transaction operations
• A shorthand notation for describing a schedule uses the symbols b, r, w, e, c
and a
• b- begin_transaction
• r- read_item
• w- write_item
• e- end_transaction
• c- commit
• a- abort
Serial schedule

• Schedules in which the transactions are executed non-interleaved are called
serial schedules, i.e., a serial schedule is one in which no transaction starts
until a running transaction has ended.
• If some transaction Tj is reading a value updated or written by some other
transaction Ti, then the commit of Tj must occur after the commit of Ti.
• Example: Consider the schedule involving two transactions T1 and T2. This is a
serial schedule since the transactions perform serially in the order T1 → T2
[Table: schedule with columns T1 and T2; operations as extracted, top to bottom: R(A), W(A), W(A), R(A), commit, commit]
Non Serial Schedule

• This is a type of scheduling where the operations of multiple
transactions are interleaved (mixed).
• Unlike the serial schedule, where one transaction must wait for
another to complete all its operations, in the non-serial schedule the
other transaction proceeds without waiting for the previous
transaction to complete.
• This might give rise to concurrency problems.
• It can be of two types, namely Serializable and Non-Serializable Schedule
Conflict Operations in Schedule

Conflicting Operations
• Two operations conflict if all of the following conditions are satisfied:
– They belong to different transactions.
– They access the same data item.
– At least one of them is a write operation.
• In the schedule Sa:
• Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Conflicting Operations
• r1(X) and w2(X)
• r2(X) and w1(X)
• w1(X) and w2(X)
Non conflicting operations
• r1(X) and r2(X) (both are reads)
• w2(X) and w1(Y) (operate on different data items X and Y)
• r1(X) and w1(X) (they belong to the same transaction)
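
The three conditions can be checked mechanically. The Python sketch below parses the shorthand notation and lists every conflicting pair in Sa; the helper names `parse`, `conflicting_pairs` and `show` are illustrative.

```python
import re
from itertools import combinations


def parse(schedule):
    """Turn 'r1(X);w2(X)' into [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(op.lower(), int(t), item.upper())
            for op, t, item in re.findall(r"([rRwW])(\d+)\((\w+)\)", schedule)]


def conflicting_pairs(ops):
    pairs = []
    for a, b in combinations(ops, 2):      # a occurs before b in the schedule
        different_txn = a[1] != b[1]
        same_item = a[2] == b[2]
        has_write = "w" in (a[0], b[0])
        if different_txn and same_item and has_write:
            pairs.append((a, b))
    return pairs


def show(op):
    return f"{op[0]}{op[1]}({op[2]})"


sa = parse("r1(X);r2(X);w1(X);r1(Y);w2(X);w1(Y);")
found = [(show(a), show(b)) for a, b in conflicting_pairs(sa)]
```

For Sa this yields exactly the three conflicting pairs listed above; all other pairs fail at least one of the three conditions.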
Changing orders of conflicting operations

• Two operations are conflicting if changing their order can result in a
different outcome
• E.g. r1(x); w2(x) -> the value is read by transaction T1 before it is written by
transaction T2 (read-write conflict)
• This can be changed to w2(x); r1(x) -> the value of x is changed by w2(x)
before it is read by r1(x)
• w1(x); w2(x) -> w2(x); w1(x)
• This type is called a write-write conflict
• The last value of x will differ because in one case it is written by T2 and in the
other case by T1
• Two read operations are not conflicting because changing their order makes
no difference
Serializable Schedule

• This is used to maintain the consistency of the database.

• It is mainly used in non-serial scheduling to verify whether the
scheduling will lead to any inconsistency or not.
• A schedule S of n transactions is serializable if it is equivalent to some
serial schedule of the same n transactions
• These are of two types:
– Conflict Serializable
– View Serializable
Equivalent Schedule

• For two schedules to be equivalent, the operations applied to each data
item in both schedules must be in the same order
• Two definitions of equivalence of schedules are generally used:
– Conflict equivalence
– View equivalence
Conflict equivalence
Two schedules are said to be conflict equivalent if the order of
any two conflicting operations is the same in both schedules
Conflict Serializable Schedule

• A schedule is called conflict serializable if, after swapping
non-conflicting operations, it can be transformed into a serial schedule.
• Equivalently, a schedule is conflict serializable if it is conflict equivalent to a serial
schedule.
Conflict Serializable Schedule
[Table: schedule S1 with columns T1 and T2; operations as extracted: Read(A), Write(A), Read(B), Write(B), Read(A), Write(A), Read(B), Write(B)]

Schedule S1 can be transformed into a serial schedule
by swapping non-conflicting operations of S1.
Testing for Conflict Serializability of a Schedule –
precedence graph
[Figure: precedence graph]
If the schedule is conflict serializable, then apply a
topological ordering to the graph to find the
equivalent serial schedule
Example 1

No cycle, so the schedule is serializable.
2. Check whether the given schedule S is conflict serializable or not.

Commit = permanent
Testing for Conflict Serializability of a Schedule -
Example
• Draw an edge whenever there is a conflict, that is, when we have
• read_item(X) and write_item(X)
• write_item(X) and read_item(X)
• write_item(X) and write_item(X)

T1      T2      T3
R(x)
                R(y)
                R(x)
        R(y)
        R(z)
                W(y)
        W(z)
R(z)
W(x)
W(z)

► T1-R(x): no W(x) in T2, T3
► T3-R(y): no W(y) in T2, T1
► T3-R(x): W(x) in T1, draw the edge T3->T1
► T2-R(y): W(y) in T3, draw the edge T2->T3
► T2-R(z): W(z) in T1, draw the edge T2->T1
► T3-W(y): no R(y) or W(y) in T2, T1
► T2-W(z): R(z) and W(z) in T1, draw the edge T2->T1 (already there)

[Figure: precedence graph with nodes T1, T2, T3 and edges T3->T1, T2->T3, T2->T1]

► As there is no cycle/loop in the precedence graph, this schedule is
conflict serializable
Testing for Conflict Serializability of a Schedule
– Example 2
• Draw an edge when we have
• read_item(X) and write_item(X)
• write_item(X) and read_item(X)
• write_item(X) and write_item(X)
[Table: schedule with columns T1, T2, T3; operations as extracted, top to bottom: R(A), W(A), W(A), W(A)]
[Figure: precedence graph with nodes T1, T2, T3]

► As there is a cycle/loop in the precedence graph, this schedule is
not conflict serializable. Is this serializable?
University Previous Question Paper Question

Check if the following schedules are conflict-serializable using precedence graph. If so,
give the equivalent serial schedule(s). r3(X), r2(X), w3(X), r1(X), w1(X). (Note:
ri(X)/wi(X) means transaction Ti issues read/write on item X.)
• Draw an edge when we have
• read_item(X) and write_item(X)
• write_item(X) and read_item(X)
• write_item(X) and write_item(X)

T1      T2      T3
                R(x)
        R(x)
                W(x)
R(x)
W(x)

[Figure: precedence graph with nodes T1, T2, T3]

► As there is no cycle/loop in the precedence graph, this schedule is conflict serializable.

University Previous Question Paper Question (contd)

► Indegree of each node in the precedence graph: T1 = 2, T2 = 0, T3 = 1.
► T2 has indegree 0, so it comes first. Removing T2 leaves T1 = 1, T3 = 0.
► T3 now has indegree 0, so it comes next. Removing T3 leaves T1 = 0.
► T1 comes last, giving the order T2 T3 T1.
► All the possible topological orderings of the above precedence graph will
be the possible serialized schedules.
► No cycle, so the serialized schedule is T2 → T3 → T1.
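
The precedence-graph test and the indegree-based ordering can be sketched in Python for the schedule r3(X), r2(X), w3(X), r1(X), w1(X). This is an illustrative sketch using Kahn's algorithm for the topological sort; the function names are hypothetical.

```python
import re


def parse(schedule):
    return [(op.lower(), int(t), item.upper())
            for op, t, item in re.findall(r"([rRwW])(\d+)\((\w+)\)", schedule)]


def precedence_edges(ops):
    """Edge (Ti, Tj) whenever an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (op_a, t_a, item_a) in enumerate(ops):
        for op_b, t_b, item_b in ops[i + 1:]:
            if t_a != t_b and item_a == item_b and "w" in (op_a, op_b):
                edges.add((t_a, t_b))
    return edges


def kahn_order(nodes, edges):
    """Return one topological order, or None if the graph has a cycle."""
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    order = []
    ready = sorted(n for n in nodes if indegree[n] == 0)
    while ready:
        n = ready.pop(0)
        order.append(n)
        for src, dst in sorted(edges):
            if src == n:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return order if len(order) == len(nodes) else None


ops = parse("r3(X), r2(X), w3(X), r1(X), w1(X)")
nodes = sorted({t for _, t, _ in ops})
edges = precedence_edges(ops)
initial_indegree = {n: sum(1 for _, d in edges if d == n) for n in nodes}
serial_order = kahn_order(nodes, edges)
```

Here the edges are T2->T3, T2->T1 and T3->T1, the initial indegrees are T1 = 2, T2 = 0, T3 = 1, and the only topological order is T2, T3, T1. A return value of None would indicate a cycle, i.e. a schedule that is not conflict serializable.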
Testing for Conflict Serializability of a Schedule – Example 3

Check whether the given schedule S is conflict serializable or not. If yes,
then determine all the possible serialized schedules

Step-01:

List all the conflicting operations and determine the dependency between the
transactions-
• R4(A) , W2(A) (T4 → T2)
• R3(A) , W2(A) (T3 → T2)
• W1(B) , R3(B) (T1 → T3)
• W1(B) , W2(B) (T1 → T2)
• R3(B) , W2(B) (T3 → T2)
Testing for Conflict Serializability of a Schedule – Example 3 contd

Step-02:

Draw the precedence graph-

Clearly, there exists no cycle in the precedence
graph. Therefore, the given schedule S is conflict
serializable.
Finding the Serialized Schedules-

• All the possible topological orderings of the above
precedence graph will be the possible serialized schedules.
• The topological orderings can be found by performing a
topological sort of the precedence graph.
• After performing the topological sort, the possible serialized
schedules are-
• 1. T1 → T3 → T4 → T2
• 2. T1 → T4 → T3 → T2
• 3. T4 → T1 → T3 → T2
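
The list of serialized schedules can be double-checked by enumerating every ordering that respects the dependencies found in Step-01. The Python sketch below uses brute force over all permutations, which is fine for four transactions; the helper name `respects` is illustrative.

```python
from itertools import permutations

# Dependencies from Step-01, written as (before, after).
edges = {("T4", "T2"), ("T3", "T2"), ("T1", "T3"), ("T1", "T2")}
nodes = ["T1", "T2", "T3", "T4"]


def respects(order, edges):
    """True if every 'before' transaction appears ahead of its 'after' one."""
    position = {t: i for i, t in enumerate(order)}
    return all(position[a] < position[b] for a, b in edges)


# 4! = 24 candidate orders; keep those that satisfy every dependency.
serial_schedules = [order for order in permutations(nodes)
                    if respects(order, edges)]
```

Exactly three orderings survive, matching the three serialized schedules listed above. For larger graphs a proper topological-sort enumeration would be preferable to trying all permutations.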
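Testing for Conflict Serializability of a Schedule – univ question

Check whether the given schedules are conflict serializable or not

i) S1 : R1(X) , R2(X) , R1(Y) , R2(Y) , R3(Y) , W1(X) , W2(Y)
Ans : S1 is not conflict serializable. R2(X) before W1(X) gives the edge T2 → T1,
and R1(Y) before W2(Y) gives the edge T1 → T2, so the precedence graph has a cycle.

ii) S2 : R1(X) , R2(X) , R2(Y) , W2(Y) , R1(Y) , W1(X)

Ans: S2 is conflict serializable. The only edge is T2 → T1, so there is no cycle and
the equivalent serial schedule is T2 → T1.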
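
Both answers can be verified with a short Python sketch that builds the precedence graph and looks for a cycle using depth-first search. The function names are illustrative, and the schedule strings are copied from the question.

```python
import re


def precedence_graph(schedule):
    ops = [(op.lower(), int(t), item)
           for op, t, item in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]
    graph = {t: set() for _, t, _ in ops}
    for i, (op_a, t_a, item_a) in enumerate(ops):
        for op_b, t_b, item_b in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t_a != t_b and item_a == item_b and "w" in (op_a, op_b):
                graph[t_a].add(t_b)
    return graph


def has_cycle(graph):
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    colour = {n: WHITE for n in graph}

    def visit(n):
        colour[n] = GREY
        for m in graph[n]:
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


s1 = "R1(X) , R2(X) , R1(Y) , R2(Y) , R3(Y) , W1(X) , W2(Y)"
s2 = "R1(X) , R2(X) , R2(Y) , W2(Y) , R1(Y) , W1(X)"
g1, g2 = precedence_graph(s1), precedence_graph(s2)
s1_serializable = not has_cycle(g1)
s2_serializable = not has_cycle(g2)
```

For S1 the graph contains both T1 → T2 and T2 → T1, so a cycle is found; for S2 the graph contains only T2 → T1.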
Non-Serializability in DBMS

► A non-serial schedule that is not serializable is called a
non-serializable schedule. Non-serializable schedules may or may not be
consistent or recoverable. Non-serializable schedules are divided into
two types:
► Recoverable Schedule - A schedule is recoverable if each transaction commits
only after all the transactions from which it has read have committed.
► Non-recoverable Schedule - If a transaction reads a value written by
an uncommitted transaction and commits before the transaction from
which it has read the value, then such a schedule is called a
non-recoverable schedule.
► Recoverable schedules are further categorized into 3 types:
► Cascading Schedule
► Cascadeless Schedule
► Strict Schedule
Recoverable Schedules
Irrecoverable Schedule

• If a transaction reads a value written by an uncommitted transaction and
commits before the transaction from which it has read the value, then such a
schedule is called a non-recoverable (irrecoverable) schedule.

► Suppose that the system allows T9 to commit immediately after the
execution of the read(A) instruction. Thus T9 commits before T8 does.
► Now suppose that T8 fails before it commits. Since T9 has read the value
of data item A written by T8, we must abort T9 to ensure transaction atomicity.
► However, T9 has already committed and cannot be aborted. Thus we have a
situation where it is impossible to recover correctly from this failure.
Recoverable Schedules
• Consider the schedules
1. sa: r1(x); r2(x); w1(x); r1(y); w2(x); c2; w1(y); c1;
2. sc: r1(x); w1(x); r2(x); r1(y); w2(x); c2; a1;
3. sd: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); c1; c2;
4. se: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1; a2;
• sa is recoverable, because t2 reads x before t1 writes it and so does not
read anything written by t1
• sc is not recoverable because t2 reads item x from t1, but t2
commits before t1 commits
• The problem occurs if t1 aborts after the c2 operation in sc
• The value of x that t2 read is no longer valid and t2 would have to be
aborted after it has committed, so the schedule is not recoverable
• For the schedule to be recoverable, the c2 operation in sc must
be postponed until t1 commits, as shown in sd
• If t1 aborts instead of committing, then t2 should also abort, as
shown in se, because the value of x it read is no longer valid
• In se, aborting t2 is acceptable since it has not committed yet
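
The recoverability condition can be checked with a short Python sketch over the four schedules. It is a simplification: it tracks only the most recent writer of each item and does not restore an earlier writer after an abort, which is enough for these examples. The function names are illustrative.

```python
import re


def parse(schedule):
    """'r1(x);c2;a1' -> [('r', 1, 'x'), ('c', 2, None), ('a', 1, None)]"""
    return [(op, int(t), item or None)
            for op, t, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]


def is_recoverable(schedule):
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = {}      # reader -> set of uncommitted writers it read from
    finished = {}        # transaction -> 'c' (committed) or 'a' (aborted)
    for op, t, item in parse(schedule):
        if op == "w":
            last_writer[item] = t
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != t and writer not in finished:
                reads_from.setdefault(t, set()).add(writer)
        elif op == "c":
            # A commit is safe only if every transaction this one read from
            # has already committed.
            if any(finished.get(w) != "c" for w in reads_from.get(t, ())):
                return False
            finished[t] = "c"
        elif op == "a":
            finished[t] = "a"
    return True


sa = "r1(x);r2(x);w1(x);r1(y);w2(x);c2;w1(y);c1;"
sc = "r1(x);w1(x);r2(x);r1(y);w2(x);c2;a1;"
sd = "r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);c1;c2;"
se = "r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);a1;a2;"
results = {name: is_recoverable(s)
           for name, s in [("sa", sa), ("sc", sc), ("sd", sd), ("se", se)]}
```

Only sc is rejected: t2 commits while t1, from which it read x, has not yet committed.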
Recoverable Schedules
Recoverable Schedules with cascading Rollback
Recoverable with Cascading Rollback
Cascadeless Recoverable Rollback
Strict schedule

• A more restrictive type of schedule is called a strict schedule
• Transactions can neither read nor write an item X until the last
transaction that wrote X has committed (or aborted)
What is NoSQL?
► NoSQL database stands for “Not Only SQL” or “Not SQL.”
► It is a non-relational data management system that does not require a
fixed schema.
► It avoids joins, and is easy to scale.
► The major purpose of using a NoSQL database is for distributed data
stores with enormous data storage needs.
► NoSQL is used for Big data and real-time web apps.
► For example, companies like Twitter, Facebook and Google collect terabytes
of user data every single day.
What is NoSQL?
Online analytical
processing (OLAP)
Why NoSQL?

► The concept of NoSQL databases became popular with Internet giants like
Google, Facebook, Amazon, etc., who deal with huge volumes of data.
► The system response time becomes slow when you use an RDBMS for massive
volumes of data.
► To resolve this problem, we could “scale up” our systems by upgrading our
existing hardware. This process is expensive.
► The alternative is to distribute the database load across multiple hosts
whenever the load increases. This method is known as “scaling out.”
► NoSQL databases are non-relational and are designed with web applications
in mind, so they scale out better than relational databases.
► Key features: non-relational, schema-free, distributed, simple API.
Features of NoSQL

• Non-relational
– NoSQL databases never follow the relational model
– Never provide tables with flat fixed-column records
– Work with self-contained aggregates or BLOBs (Binary Large Objects).
BLOBs are complex files such as images, video, and audio.
– Do not require object-relational mapping or data normalization
– No complex features like query languages, query planners, referential
integrity, joins, ACID
• Schema-free
– NoSQL databases are either schema-free or have relaxed schemas
– Do not require any sort of definition of the schema of the data
– Offer heterogeneous structures of data in the same domain
Features of NoSQL

• Simple API
– Offers easy-to-use interfaces for storing and querying data
– APIs allow low-level data manipulation and selection methods
– Text-based protocols, mostly HTTP REST with JSON
– Mostly no standards-based NoSQL query language is used
– Web-enabled databases running as internet-facing services
Features of NoSQL
• Distributed
– Multiple NoSQL databases can be executed in a distributed fashion
– Offers auto-scaling and fail-over capabilities
– Often the ACID concept can be sacrificed for scalability and throughput
– Mostly no synchronous replication between distributed nodes; instead
asynchronous multi-master replication, peer-to-peer, HDFS replication
– Only providing eventual consistency
– Shared Nothing Architecture. This enables less coordination and
higher distribution.
Types of NoSQL Databases

► NoSQL Databases are mainly categorized into four types:

► Key-value Pair Based

► Column-oriented

► Graphs based

► Document-oriented
key-value database

• A key-value database (sometimes called a key-value store) uses a
simple key-value method to store data.
• These databases contain a simple string (the key) that is always
unique and an arbitrarily large data field (the value).
• They are easy to design and implement
What is a Key-Value Database?

► As the name suggests, this type of NoSQL database implements a hash
table to store unique keys along with pointers to the corresponding data
values.
► The values can be of scalar data types such as integers or complex
structures such as JSON, lists, BLOBs, and so on.
► A value can be stored as an integer, a string, JSON, or an array, with a
key used to reference that value.
► It typically offers excellent performance and can be optimized to fit an
organization’s needs.
► Key-value stores have no query language, but they do provide a way to
add and remove key-value pairs.
► Values cannot be queried or searched upon. Only the key can be queried.
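
These characteristics can be sketched with a toy in-memory store built on a Python dict, which is itself a hash table. The class `KeyValueStore` and its method names are hypothetical and are not the API of Redis or any other product.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Keys are unique: a second put with the same key overwrites the value.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup is by key only; there is no way to search by value.
        return self._data.get(key, default)

    def delete(self, key):
        existed = key in self._data
        self._data.pop(key, None)
        return existed


store = KeyValueStore()
store.put("user:42:name", "Asha")
store.put("user:42:cart", ["book", "pen"])     # values can be complex structures
store.put("user:42:name", "Asha K")            # overwrites the earlier value

name = store.get("user:42:name")
cart = store.get("user:42:cart")
removed = store.delete("user:42:cart")
missing = store.get("user:42:cart")
```

The interface deliberately offers only put, get and delete by key, which reflects the point that key-value stores provide no query language over the values.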
What is a Key-Value Database?
Characteristics

A simple example of key-value data store.

When to use a key-value database
► When your application needs to handle lots of small continuous reads
and writes, that may be volatile.
► Key-value databases offer fast in-memory access.
► When storing basic information, such as customer details; storing webpages
with the URL as the key and the webpage as the value; storing shopping-cart
contents, product categories, e-commerce product details
► For applications that don’t require frequent updates or need to support
complex queries.
Use cases for key-value databases
► Session management on a large scale.
► Using cache to accelerate application responses.
► Storing personal data on specific users.
► Product recommendations, storing personalized lists of items for individual
customers.
► Managing each player’s session in massive multiplayer online games.
► Redis, Dynamo, Riak are some NoSQL examples of key-value store
DataBases.

Prof. Sarju S, Department of Computer Science and Engineering, SJCET Palai
Key value database: - Redis

► Redis is an in-memory, key/value store.
► Redis allows you to set and retrieve pairs of keys and values.
► Redis supports the following data types and data manipulations:
► Lists, Sets, Hashes, Increments,
► Command repetition, Random Keys,
► Secondary indexes, Scripts
► Features of Redis:
► Enables low latency and high throughput data access
► Flexible data structures
► Simplicity and ease-of-use
► Example get and set commands: set title, set author "aann", get title
Column-oriented

• While a relational database stores data in rows and reads data
row by row, a column store is organized as a set of columns.
• When you want to run analytics on a small number of columns,
you can read those columns directly without consuming
memory with unwanted data
• Columns are often of the same type and benefit from more
efficient compression, making reads even faster.
• Columnar databases can quickly aggregate the values of a given
column (adding up the total sales for the year, for example). Use
cases include analytics.
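
The difference between the two layouts can be illustrated with plain Python data structures. This sketch only models the layout in memory; the sample records and field names are made up for illustration.

```python
# Row-oriented layout: one record per sale.
rows = [
    {"id": 1, "region": "north", "sales": 120},
    {"id": 2, "region": "south", "sales": 80},
    {"id": 3, "region": "north", "sales": 200},
]

# Column-oriented layout: one list per column, values kept in the same order.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row store: every whole record is visited just to total one field.
total_from_rows = sum(row["sales"] for row in rows)

# Column store: only the "sales" column is read.
total_from_columns = sum(columns["sales"])
```

Both totals are the same, but the column layout touches a single homogeneous list, which is also why such columns tend to compress well.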
Column-oriented
► Column databases use the concept of keyspace, which is sort of like a schema in
relational models.
► This keyspace contains all the column families, which then contain rows, which
then contain columns.
Column-oriented

► If we take a specific row as an example:
► The row key is the unique identifier of that row.
► Each column contains a name, a value, and a timestamp.
► The name/value pair holds the data itself, and the timestamp is the date and time the data was entered into the database.
► Some examples of column-store databases include Cassandra, Cosmos DB, Bigtable, and HBase.
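The nesting described above (keyspace, column family, row, column) can be sketched with nested dictionaries. This is an assumption-laden illustration, not how any particular column store lays out data internally; the helper `put` and the column family name `users` are hypothetical.

```python
from dataclasses import dataclass
import time


@dataclass
class Column:
    name: str
    value: str
    timestamp: float  # when the value was written


def put(keyspace, family, row_key, name, value, timestamp=None):
    """Write one column into keyspace -> column family -> row (hypothetical helper)."""
    ts = time.time() if timestamp is None else timestamp
    row = keyspace.setdefault(family, {}).setdefault(row_key, {})
    row[name] = Column(name, value, ts)


keyspace = {}
put(keyspace, "users", "aann", "fname", "aann", timestamp=1.0)
put(keyspace, "users", "aann", "last", "sabu", timestamp=2.0)
# Rows in the same column family do not need the same columns.
put(keyspace, "users", "other", "email", "other@example.com", timestamp=3.0)

first_name = keyspace["users"]["aann"]["fname"].value
```

Each row is found through its unique row key, and each column carries its own timestamp alongside the name/value pair.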
Column-oriented - Use cases

► Developers mainly use column databases in:
► Content management systems
► Blogging platforms
► Systems that maintain counters
► Services that have expiring usage
► Systems that require heavy write requests (like log aggregators)
Benefits of Column Databases
► There are several benefits that go along with columnar databases:
► Column stores are excellent at compression and are therefore efficient in terms of storage.
► You can reduce disk resources while holding massive amounts of information in a single column.
► Since most of the information a query needs is stored in a column, aggregation queries are quite fast, which is important for projects that run a large number of queries in a short amount of time.
► Scalability is excellent with column-store databases.
► They can be expanded nearly infinitely and are often spread across large clusters of machines, even numbering in the thousands.
► That also means that they are well suited to massively parallel processing (MPP).
Benefits of Column Databases
► Load times are similarly excellent, as you can easily load a billion-row table in a few seconds.
► You can load and query nearly instantly.
► Large amounts of flexibility, as columns do not necessarily have to look like each other.
► You can add new and different columns without disrupting the whole database.
Column database: Cassandra

► A columnar database is a database management system (DBMS) that stores data in columns instead of rows.
► Cassandra is an open-source, column-oriented database designed to handle large amounts of data across many commodity servers.
► Features of Cassandra:
► Efficiency and speed at scale
► Reduces data storage costs
► Improves query performance significantly
► CQL (Cassandra Query Language): Cassandra provides CQL, a SQL-like language, to interact with the database.
► CQL allows you to create tables, insert and retrieve data, and perform various operations on the database.
► Example row: the row key "aann" identifies a row whose columns are fname = aann and last = sabu.
Document-oriented databases
► A document-oriented database is a modern way of storing data as JSON documents rather than as basic columns and rows, i.e. storing data in its native form.
► This storage system lets you store, retrieve, and manage document-oriented information.
► It is a very popular category of modern NoSQL databases. Products that offer document storage include MongoDB, Cosmos DB, DocumentDB, SimpleDB, PostgreSQL, OrientDB, Elasticsearch, and RavenDB.

► This is an example of a document that might appear in a document database like MongoDB.
► This sample document represents a company contact card, describing an employee called Sammy:
What are document-oriented databases?

► Notice that the document is written as a JSON object.
► JSON is a human-readable data format that has become quite popular in recent years.
► While many different formats can be used to represent data within a document database, such as XML or YAML, JSON is one of the most common choices.
► For example, MongoDB adopted JSON as the primary data format to define and manage data.
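A contact-card document of the kind described above can be sketched as a Python dictionary and serialized to JSON. Only the employee name Sammy comes from the slides; every other field name and value below is a hypothetical placeholder chosen for illustration.

```python
import json

# Hypothetical contact card; only the name "Sammy" comes from the slides.
contact_card = {
    "_id": "sammy",
    "firstName": "Sammy",
    "company": "Example Corp",
    # Nested arrays and objects live inside the same document.
    "phones": [{"type": "work", "number": "555-0100"}],
    "address": {"city": "Example City"},
}

encoded = json.dumps(contact_card, indent=2)  # human-readable JSON text
decoded = json.loads(encoded)                 # back to a Python structure
```

The nested `phones` list and `address` object show why a document can hold related data together that a relational design would usually split across several tables.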
CHARACTERISTICS

Relational – Document Database

MongoDB
• MongoDB is an open-source NoSQL database that is available for all major operating systems.
• NoSQL stands for “not only SQL” or “non SQL.”
• NoSQL is used to perform operations on data in databases that are not structured by rows and columns.
• NoSQL databases come in four main types: document, key-value, column-oriented, and graph.
• MongoDB is a document database because it stores data in JSON-like documents (BSON, binary JSON) with a flexible schema.
• MongoDB supports all the essential CRUD operations
Atlas (Cloud) & Compass

• MongoDB Atlas is a multi-cloud database service by the same people that build
MongoDB.
• Atlas simplifies deploying and managing your databases on the cloud providers of your choice
(AWS, Azure, and Google Cloud).
• MongoDB Compass is a GUI client that can be used for querying, aggregating, and analyzing your MongoDB data in a visual environment.
MongoDB
• Create or insert operations add new documents to a collection. If the collection does not currently exist, insert operations will create the collection.
• Read operations retrieve documents from a collection, i.e. query a collection for documents. MongoDB provides methods such as find() to read documents from a collection.
• Update operations modify existing documents in a collection. MongoDB provides methods such as updateOne(), updateMany(), and replaceOne() to update documents of a collection.
• Delete operations remove documents from a collection. MongoDB provides methods such as deleteOne() and deleteMany() to delete documents of a collection.
MongoDB CRUD Operations
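The four CRUD operations can be sketched against a tiny in-memory collection. This is not the MongoDB driver: `TinyCollection`, `collection`, and the method names are hypothetical stand-ins written in plain Python so the example runs without a database server, and queries are limited to exact field matches.

```python
class TinyCollection:
    """Hypothetical in-memory document collection (not the MongoDB driver API)."""

    def __init__(self):
        self._docs = []
        self._next_id = 1

    @staticmethod
    def _matches(doc, query):
        # A document matches when every queried field has the given value.
        return all(doc.get(k) == v for k, v in query.items())

    def insert_one(self, doc):
        stored = dict(doc)
        stored["_id"] = self._next_id  # assign a unique identifier
        self._next_id += 1
        self._docs.append(stored)
        return stored["_id"]

    def find(self, query):
        return [dict(d) for d in self._docs if self._matches(d, query)]

    def update_many(self, query, changes):
        matched = [d for d in self._docs if self._matches(d, query)]
        for d in matched:
            d.update(changes)
        return len(matched)

    def delete_many(self, query):
        before = len(self._docs)
        self._docs = [d for d in self._docs if not self._matches(d, query)]
        return before - len(self._docs)


database = {}


def collection(name):
    # The collection is created the first time it is used.
    return database.setdefault(name, TinyCollection())


users = collection("users")
first_id = users.insert_one({"name": "Sammy", "role": "employee"})   # Create
users.insert_one({"name": "Alex", "role": "manager"})
found = users.find({"role": "employee"})                             # Read
updated = users.update_many({"name": "Sammy"}, {"role": "manager"})  # Update
deleted = users.delete_many({"role": "manager"})                     # Delete
remaining = users.find({})
```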
Benefits of Document Databases

► A few of the most important benefits are:

► Flexibility and adaptability: with a high level of control over the data
structure, document databases enable experimentation and adaptation to
new emerging requirements.
► New fields can be added right away and existing ones can be changed any
time.
► It’s up to the developer to decide whether old documents must be amended or the change can be implemented only going forward.
► Ability to manage structured and unstructured data: document databases can be used to handle structured data, but they’re also quite useful for storing unstructured data where necessary.
► Scalability by design: document databases are designed as distributed systems that allow you to scale horizontally (meaning that you split a single database up across multiple servers).
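Horizontal scaling can be sketched as hash-based sharding: each document key is mapped to one of several servers. This is a simplified illustration and not the partitioning scheme of any specific document database; the three in-memory dictionaries stand in for servers, and `shard_for`, `put`, and `get` are hypothetical helpers.

```python
import hashlib

# Three dictionaries stand in for three separate servers.
shards = [dict() for _ in range(3)]


def shard_for(key, shard_count):
    """Pick a shard deterministically from a hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def put(key, document):
    shards[shard_for(key, len(shards))][key] = document


def get(key):
    # The same hash sends the lookup to the shard that holds the document.
    return shards[shard_for(key, len(shards))].get(key)


for n in range(10):
    put(f"user:{n}", {"name": f"user {n}"})

stored_total = sum(len(shard) for shard in shards)
fetched = get("user:7")
```

Each document lives on exactly one shard, so adding servers spreads both storage and request load across them.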
Graph-Based NoSQL
► Graph databases are generally straightforward in how they’re structured. They are primarily composed of two components:
► The Node
► This is the actual piece of data itself.
► It can be the number of viewers of a YouTube video, the number of people who have read a tweet, or it could even be basic information such as people’s names, addresses, and so forth.
► The Edge
► This describes the actual relationship between two nodes.
► Edges can also carry their own pieces of information, such as the nature of the relationship between two nodes. Similarly, edges might also have directions describing the flow of that data.
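The node and edge structure above can be sketched as a small property graph. This is a minimal illustration under stated assumptions: `TinyGraph`, the node keys, and the edge labels `follows` and `watched` are hypothetical and do not correspond to the API of Neo4j or ArangoDB.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    key: str
    properties: dict = field(default_factory=dict)  # the data itself


@dataclass
class Edge:
    source: str  # edges are directed: source -> target
    target: str
    label: str   # nature of the relationship
    properties: dict = field(default_factory=dict)


class TinyGraph:
    """Hypothetical in-memory property graph."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, key, **properties):
        self.nodes[key] = Node(key, properties)

    def add_edge(self, source, target, label, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, label, properties))

    def neighbours(self, key, label=None):
        # Follow outgoing edges, optionally filtered by relationship type.
        return [e.target for e in self.edges
                if e.source == key and (label is None or e.label == label)]


graph = TinyGraph()
graph.add_node("alice", name="Alice")
graph.add_node("bob", name="Bob")
graph.add_node("video1", views=1000)
graph.add_edge("alice", "bob", "follows", since=2020)
graph.add_edge("alice", "video1", "watched")

followed = graph.neighbours("alice", label="follows")
```

Because the edges are directed, `alice` follows `bob` but `bob` has no outgoing edges of his own in this example.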
Translating NoSQL Knowledge to Graphs
► With the advent of the NoSQL movement, businesses of all sizes have a
variety of modern options from which to build solutions relevant to their
use cases.
► Calculating average income? Ask a relational database.
► Building a shopping cart? Use a key-value store.
► Storing structured product information? Store as a document.
► Describing how a user got from point A to point B? Follow a graph.
► Examples of Graph Databases
► Neo4j, ArangoDB
ArangoDB

► ArangoDB is a native multi-model, open-source database with flexible data models for documents, graphs, and key-values.
► Build high-performance applications using a convenient SQL-like query language or JavaScript extensions.
► Use ACID transactions if you require them. Scale horizontally and vertically with a few mouse clicks.
► Key features include:
► Installing ArangoDB on a cluster is as easy as installing an app on your mobile
► Powerful query language (AQL) to retrieve and modify data
► Use ArangoDB as an application server and fuse your application and database together for maximal throughput
ArangoDB

► Flexible data modeling: model your data as a combination of key-value pairs, documents, or graphs - perfect for social relations
► Transactions: run queries on multiple documents or collections with optional transactional consistency and isolation
► Configurable durability: let the application decide if it needs more durability or more performance
► No-nonsense storage: ArangoDB uses all of the power of modern storage hardware, like SSD and large caches
► JavaScript for all: no language zoo, you can use one language from your browser to your back-end
► ArangoDB can be easily deployed as a fault-tolerant distributed state machine, which can serve as the animal brain of distributed appliances
► It is open source (Apache License 2.0)
ArangoDB Use Cases
► ArangoDB is a database system with a large solution space
because it combines graphs, documents, key-value, search engine,
and machine learning all in one
► ArangoDB as a Graph Database
► ArangoDB as a graph database is a great fit for use cases like fraud
detection, knowledge graphs, recommendation engines, identity and
access management, network and IT operations, social media
management, traffic management, and many more.
► ArangoDB as a Document Database
► ArangoDB can be used as the backend for heterogeneous content
management, e-commerce systems, Internet of Things applications,
and more generally as a persistence layer for a broad range of
services that benefit from an agile and scalable data store.
ArangoDB Use Cases
► ArangoDB as a Key-Value Database
► Key-value stores are the simplest kind of database systems.
Each record is stored as a block of data under a key that
uniquely identifies the record.
► The data is opaque, which means the system doesn’t know
anything about the contained information, it simply stores it
and can retrieve it for you via the identifiers
► ArangoDB as a Search Engine
► ArangoDB has a natively integrated search engine for a
broad range of information retrieval needs.
► It is powered by inverted indexes and can index full-text,
GeoJSON, as well as arbitrary JSON data.
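The inverted-index idea behind full-text search can be sketched as follows. This is a generic illustration, not ArangoDB's implementation: the functions `build_inverted_index` and `search`, the whitespace tokenizer, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())  # keep documents matching all tokens
    return result


documents = {
    "d1": "graph databases store nodes and edges",
    "d2": "document databases store json documents",
    "d3": "key value stores are simple",
}
index = build_inverted_index(documents)
hits = search(index, "databases store")
```

A lookup touches only the entries for the query tokens rather than scanning every document, which is why inverted indexes suit information retrieval.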