Dbms 2nd Sem
Dbms 2nd Sem
Introduction to Databases:
A database is a collection of related data or information. Data means any raw facts such as its
combination of A to Z alphabetic, 0-9 combinational numbers, characters , images, videos, audios etc…
A database is an organized collection of structured information, or data, typically stored
electronically in a computer system.
A database may be generated and maintained manually or it may be computerized.
A database can be of any size and complexity. For example, the list of names and addresses referred to
earlier may consist of only a few hundred records, each with a simple structure.
An example of a large commercial database is Amazon.com. It contains data for over 20 million books,
CDs, videos, DVDs, games, electronics, apparel, and other items.
Database Management System (DBMS):
A database management system is a collection of programs that enables users to create and maintain a
database. The DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications.
Defining a database involves specifying the data types, structures, and constraints of the data to be
stored in the database.
Constructing the database is the process of storing the data on some storage medium that is controlled
by the DBMS.
Manipulating a database includes functions such as querying the database to retrieve specific data,
updating the database to reflect changes in the data.
Sharing a database allows multiple users and programs to access the database simultaneously.
An application program accesses the database by sending queries or requests for data to the DBMS.
Other important functions provided by the DBMS include protecting the database and maintaining it
over a long period of time.
Protection includes system protection against hardware or software malfunction (or crashes) and security
protection against unauthorized or malicious access.
A typical large database may have a life cycle of many years, so the DBMS must be able to maintain the
database system by allowing the system to evolve as requirements change over time.
**********************
An Example:
Consider the example of university system. UNIVERSITY database for maintaining information
concerning students, courses, and grades in a university environment. The STUDENT file stores data on each
student, the COURSE file stores data on each course, the SECTION file stores data on each section of a course,
the GRADE_REPORT file stores the grades that students receive in the various sections they have completed
each course.
Database users are categorized based up on their interaction with the database. These are seven types
of database users in DBMS.
1. Database administrator (DBA): The DBA is responsible for authorizing access to the database,
coordinating and monitoring its use and acquiring software and hardware resources as needed. DBA is
also responsible for providing security to the database.DBA also monitors the recovery and backup
and provides technical support. DBA repairs damage caused due to hardware and/or software failures.
DBA is the one having privileges to users.
2. Naïve users: who don’t have any DBMS knowledge but they frequently use the database applications in
their daily life. For examples, Railway’s ticket booking users are naive users.
3. Database designers: Database designers are responsible for identifying the data to be stored in the
database and for choosing appropriate structures to represent and store this data.
4. System analysts: System analysts determine the requirements of end users, especially naive and
parametric end users, and develop specifications for standard canned transactions that meet these
requirements.
5. Application programmers: Application programmers implement these specifications as programs; then
they test, debug, document, and maintain these canned transactions.
6. Sophisticated Users: Sophisticated users can be engineers, scientists, business analyst, who are
familiar with the database.
7. Casual Users / Temporary Users : Casual Users are the users who occasionally use/access the
database but each time when they access the database they require the new information ,
*********************
Workers behind the scene:
Who design, use, and administer a database, others are associated with the design, development, and operation
of the DBMS software and system environment. These persons are typically not interested in the database
content itself. We call them the workers behind the scene.
DBMS system designers and implementers design and implement the DBMS modules and interfaces
as a software package. A DBMS is a very complex software system that consists of many components,
or modules, including modules for implementing the catalog, query language processing, interface
processing, accessing and buffering data, controlling concurrency, and handling data recovery and
security. The DBMS must interface with other system software such as the operating system and
compilers for various programming languages.
Network Model in DBMS is a hierarchical model that is used to represent the many-to-many
relationship among the database constraints. It is a simple and easy-to-construct database model.
The Network Model in DBMS is based on the set of nodes and links.
Relational Databases:
The relational data model also introduced high-level query languages that provided an
alternative to programming language interfaces, making it much faster to write new queries.
In relational model, the data and relationships are represented by collection of inter-related
tables. Each table is a group of column and rows, where column represents attribute of an entity
and rows represents records.
Student-id Student-name branch Phone number
101 Rekha MCA 9999555522
102 Geetha MBA 9933446688
103 Kumar MCA 8844998877
104 Veeru MBA 1702271702
Object-Oriented Applications: The real world entities and situations are represented as objects in the
Object oriented database model.
Relational Databases:
The relational data model also introduced high-level query languages that provided an alternative to
programming language interfaces, making it much faster to write new queries. In relational model, the
data and relationships are represented by collection of inter-related tables. Each table is a group of
column and rows, where column represents attribute of an entity and rows represents records.
Student-id Student-name branch Phone number
101 Rekha MCA 9999555522
102 Geetha MBA 9933446688
103 Kumar MCA 8844998877
104 Veeru MBA 1702271702
Object-Oriented Applications:
The real world entities and situations are represented as objects in the Object oriented database model .
In this model, both the data and relationship are present in a single structure known as an object. In
this model, two are more objects are connected through links.
****************************
Schemas and Instances:
Schemas:
It refers to an overall description that we get for any given database. In simpler words, schema refers to
the basic structure of how one needs to store data in any database. There are basically two types of
Schema: Physical Schema and Logical Schema.
Physical Schema – This schema describes the database designed at a physical level.
Logical Schema – This schema describes the database designed at a logical level.
Instances:
In DBMS, the data is stored for a particular amount of time and is called an instance of the database. The
database schema defines the attributes of the database in the particular DBMS. The value of the
particular attribute at a particular moment in time is known as an instance of the DBMS.
The major differences between schema and instance are as follows:
Database Schema Database Instance
It is the definition of the database, or it is defined It is a snapshot of a database at a specific moment
as the description of the database.
This corresponds to the variable declaration of a The value of the variable in a program at a point in
programming language. time corresponds to an instance of the database
schema.
It rarely changes. It changes frequently.
This corresponds to the variable declaration of a The value of the variable in a program at a point in
programming language. time corresponds to an instance of the database
schema.
Defines the basic structure of the database, i.e., It is the set of Information stored at a particular
how the data will be stored in the database. time.
Schema is same for whole database. Data in instances can be changed using addition,
deletion, updating.
***********************
Three-Schema Architecture:
The goal of the three-schema architecture, is to separate the user applications from the physical
database. In this architecture, schemas can be defined at the following three levels:
1. Internal Level
2. Conceptual Level
3. External Level
**********************
Data Independence:
Data Independence can be defined as the capacity to change the schema at one level of a database
system without having to change the schema at the next higher level.
We can define two types of data independence:
1. Logical data independence.
2. Physical data independence.
Logical data independence:
Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the
database (by adding a record type or data item), to change constraints, or to reduce the database (by
removing a record type or data item).
Physical data independence:
Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema. Hence, the external schemas need not be changed as well. Changes to the internal
schema may be needed because some physical files were reorganized. for example, by creating
additional access structures to improve the performance of retrieval or update. If the same data as before
remains in the database, we should not have to change the conceptual schema.
*************************
Once the design of a database is completed and a DBMS is chosen to implement the database, the first
step is to specify conceptual and internal schemas for the database and any mappings between the two.
Database languages are
1. Data Definition Language.
2. Data Manipulation Language.
3. Data Control Language.
4. Transaction Control Language
Database Users:
Users are differentiated by the way they expect to interact with the system:
Application programmers:
Application programmers are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces.
Sophisticated users:
Sophisticated users interact with the system application programs and database Languages.
Sophisticated users are having knowledge on multiple tools.
Naïve users :
Naive users are unsophisticated users who interact with the system without any knowledge by
invoking one of the application programming interfaces that have been written previously.
DBMS M L REKHA BVCITS-AMALAPURAM
Database Administrator:
Coordinates all the activities of the database system. Database administrator's duties include:
Schema definition: The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
Storage structure and access method definition.
Schema and physical organization modification
Granting user authority to access the database: By granting different types of authorization, the
database administrator can regulate which parts of the database various users can access.
Specifying integrity constraints.
Monitoring performance and responding to changes in requirements.
Backup and recovery the data.
Providing security.
Query Processor:
The query processor will accept query from user and solves it by accessing the database.
Parts of Query processor:
DDL interpreter
This will interprets DDL statements and fetch the definitions in the data dictionary.
DML compiler
This will translates DML statements in a query language into low level instructions that the
query evaluation engine understands.
A query can usually be translated into any of a number of alternative evaluation plans for same
query result DML compiler will select best plan for query optimization.
Query evaluation engine
This engine will execute low-level instructions generated by the DML compiler on DBMS.
Storage Manager/Storage Management:
A storage manager is a program module which acts like interface between the data stored in a database
and the application programs and queries submitted to the system. Thus, the storage manager is responsible for
storing, retrieving and updating data in the database.
The storage manager components include:
Authorization and integrity manager: Checks for integrity constraints and authority of users to access
data.
Transaction manager: Ensures that the database remains in a consistent state although there are system
failures.
File manager: Manages the allocation of space on disk storage and the data structures used to represent
information stored on disk.
Buffer manager: It is responsible for retrieving data from disk storage into main memory. It enables the
database to handle data sizes that are much larger than the size of main memory.
Memory storage:
Data structures implemented by storage manager.
Data files: Stored in the database itself.
Data dictionary: Stores metadata about the structure of the database.
Indices: Provide fast access to data items.
Statistical data: Statistical analysis performed on memory.
*********************
Classification of Database Management Systems:
Classification of database management system is based on various parameters such as the kind of data
model used to construct the DBMS, the number of users that will be using the database system, the way in
which the database is distributed
1. Based on Data Model
2. Based on Number of Users
3. Based on Database Distribution
4. Based on Cost of Database
5. Based on Usage
DBMS M L REKHA BVCITS-AMALAPURAM
Based on Data Model
. Depending upon how the data is structured, data models are further classified into:
Relational Data Model:
In the relational data model, we use tables to represent data and the relationship among that data. Each
of the tables in the relational data model has a unique name. A table has multiple columns where each
column name is unique. A table holds records which has value for each column of the table. The
relational database model is the most currently used data model.
Entity-Relationship Model
The Entity-Relationship model (E-R data model) represents data using objects and the relationship
among these objects. These objects are referred to as entities that represent the real ‘thing’ or ‘object’ in
the real world.
Object-Based Data Model
The object-based data model is an extension of the E-R model which also include notion for
encapsulation, methods. There is also an object-relational data model which is a combination of the
object-oriented data model and relational data model.
Semi structured Data Model
The semi structured data model is different from what we have studied above. In the semi structured
data model, the data items or objects of the same kind might have a different set of attributes. The
Extensible Markup Language represents the semi structured data. The hierarchical data model stores the
data in the form of records and uses a tree structure to represent these records. Network data model was
introduced which allow the multiple parent record for a single child record.
Based on Number of Users:
The database management system can also be classified on the basis of its user. So, a DBMS can either
be used by a single user or it can be used by multiple users.
The database system that can be used by a single user at a time is referred to as a single-user system
The database system that can be used by multiple users at a time is referred to as a multiple usersystem.
Based on Database Distribution:
Depending on the distribution of the database over numerous sites we can classify the database as:
Centralized DBMS:
In the centralized DBMS, the entire database is stored in a single computer site. Though the centralized
database supports multiple users still the DBMS software and the data both are stores on a single
computer site.
Distributed DBMS
In the distributed DBMS (DDBMS) the database and the DBMS software are distributed over many
computer sites. These computer sites are connected via a computer network. The DDBMS is further
classified as homogeneous DDBMS and heterogeneous DDBMS.
Homogeneous DDBMS: The homogeneous DDBMS has the same DBMS software at all the
distributed sites.
Heterogeneous DDBMS: The heterogeneous DDBMS has different DBMS software for
different sites.
Based on Cost of Database:
Well, it is quite difficult to classify the database on the basis of its cost as nowadays you can have free
open source DBMS products such as MySQL and PostgreSQL. Although the personal version of RDBMS can
cost up to $100.You may also have to pay millions of dollars for the installation and maintenance of a large
database system.
Based on Usage:
On the basis of the access path that is used to store the files, the database can be classified as general-
purpose DBMS and special-purpose DBMS.
The special-purpose DBMS is the one that is designed for a specific application and it can not be used for
another application .
Online transaction processing (OLTP): The OLTP system supports a large number of transactions concurrently
without any delay.
The general-purpose DBMS is the one that is designed to meet the need of as many applications as possible
*****************************
DBMS M L REKHA BVCITS-AMALAPURAM
UNIT-2
Syllabus: Introduction to Database Design: Database Design and ER Diagrams, Entities, Attributes and Entity Sets,
Relationships and Relationship Sets, Additional Features of the ER Model, Conceptual Design with the ER Model,
Conceptual Design for Large Enterprises .
Relational Model: Introduction to the Relational Model, Integrity Constraints over Relations, Enforcing Integrity
Constraints, Querying Relational Data, Logical Database Design: ER to Relational, Introduction to Views,
Destroying/Altering Tables and View
Introduction to Database Design:
Database design can be generally defined as a collection of tasks or processes that enhance the
designing, development, implementation, and maintenance of enterprise data management system.
Database designs provide the blueprints of how the data is going to be stored in a system. A proper design of a
database highly affects the overall performance of any application.
The designing principles defined for a database give a clear idea of the behavior of any application and how the
requests are processed. There are two types database design:
Logical model – This stage is concerned with developing a database model based on requirements. The entire
design is on paper without any physical implementations or specific DBMS considerations.
Physical model – This stage implements the logical model of the database taking into account the DBMS and
physical implementation factors.
ER Diagrams:
An Entity Relationship Diagram is a diagram that represents relationships among entities in a database. It is
commonly known as an ER Diagram. An ER Diagram in DBMS plays a crucial role in designing the database.
An Entity–relationship model (ER model) describes the structure of a database with the help of a diagram, which is
known as Entity Relationship Diagram (ER Diagram). An ER model is a design or blueprint of a database that can
later be implemented as a database. The main components of E-R model are: entity set and relationship set.
Weak Entity:
An entity that cannot be uniquely identified by its own attributes and relies on the relationship with other entity
entit is
called weak entity. The weak entity is represented by a double rectangle.
Attribute:
An attribute describes the property of an entity. An attribute is represented as Oval in an ER diagram. There are four types
of attributes:
1. Key attribute
2. Composite attribute
3. Multi valued attribute
4. Derived attribute
1. Key attribute:
A key attribute can uniquely identify an entity from an entity set. For example, student roll number can uniquely
identify a student from a set of students. Key attribute iiss represented by oval same as other attributes however the text of
key attribute is underlined.
2. Composite attribute:
An attribute that is a combination of other attributes is known as composite attribute. For example, In student
entity, the student address
ddress is a composite attribute as an address is composed of other attributes such as pin code, state,
country.
***************************
Entity Set:
An entity set is a set of same type off entities.
An entity refers to any object having-
Either a physical existence such as a particular person, office, house or car.
Or a conceptual existence such as a school, a university, a company or a job.
In ER diagram,
Attributes are associated with an entity set.
Attributes describe the properties of entities in the entity set.
Based on the values of certain attributes, an entity can be identified uniquely.
Generalization – It is a process of extracting common properties from a set of entities and creating a generalized
entity from it. Generalization is a “Bottom-up approach”. In which two or more entities can be combined to form a
higher-level entity if they have some attributes in common.
In generalization, Subclasses are combined to make a super class
Example: There are three entities given, car, bus, and bike. They all have some common attributes like all cars,
buses, and bikes they all have no. of tires and have some colors. So they all can be grouped and make a superclass
named a vehicle.
******************
A Conceptual Data Model is an organized view of database concepts and their relationships. The purpose of
creating a conceptual data model is to establish entities, their attributes, and relationships. In this data modeling level, there
is hardly any detail available on the actual database structure. Business stakeholders and data architects typically create a
conceptual data model.
The 3 basic tenants of Conceptual Data Model are
Entity: A real-world thing
Attribute: Characteristics or properties of an entity
Relationship: Dependency or association between two entities
Data model example:
Customer and Product are two entities. Customer number and name are attributes of the Customer entity
Product name and price are attributes of product entity
Sale is the relationship between the customer and product
***********************
One course is enrolled by multiple students and one student for multiple courses. Hence, relationships are many to
many.
One department has multiple instructors and one instructor belongs to one and only one department, hence the
relationship is one to many.
Each department has one “HOD” and one instructor is “HOD” for only one department, hence the relationship is one
to one. Here, HOD refers to the head of the department.
One course is taught by only one instructor but one instructor teaches many courses hence the relationship between
course and instructor is many to one.
The relationship between instructor and student is not defined because of the following reasons −
There is no significance in the relationship.
We can always derive this relationship indirectly through course and instructors, and course and student.
Step 5 − Complete ER model
DBMS ML REKHA BVCITS-MCA
The complete ER Model is as follows −
*********************
Introduction to the Relational Model:
In the concept of a relational database management system, data is organized into tables.
Tables are similar to folders in a file system, where each table stores a collection of information. Tables are further
divided into columns and rows. Columns represent the attributes of an entity, while rows represent the entities
themselves.
**********************
Domain Constraint:
Domain integrity constraint contains a certain set of rules or conditions to restrict the kind of attributes or
values a column can hold in the database table. The data type of a domain can be string, integer, character,
DateTime, currency, etc.
Example:
Consider a Student's table having Roll No, Name, Age, Class of students.
Roll No Name Age Class
101 rekha 14 6
102 geetha 16 8
103 Durga 8 4
104 kumar 18 12
105 naidu 6 A
In the above student's table, the value A in the last row last column violates the domain integrity constraint because
the Class attribute contains only integer values while A is a character.
Entity Integrity Constraint:
Entity Integrity Constraint is used to ensure that the primary key cannot be null. A primary key is used to
identify individual records in a table and if the primary key has a null value, then we can't identify those records.
There can be null values anywhere in the table except the primary key column.
Example: Consider Employees table having Id, Name, and salary of employees
ID Name Salary
101 rekha 40000
102 geetha 60000
103 Durga 80000
104 kumar 1800000
105 naidu 36000
In the above employee's table, we can see that the ID column is the primary key and contains a null value in the
last row which violates the entity integrity constraint.
Referential Integrity Constraint
Referential Integrity Constraint ensures that there must always exist a valid relationship between two
relational database tables. This valid relationship between the two tables confirms that a foreign key exists in a
table. It should always reference a corresponding value or attribute in the other table or be null.
Example: Consider an Employee and a Department table where Dept_ID acts as a foreign key between the two
tables
Employees Table:
Key constraint:
Keys are the set of entities that are used to identify an entity within its entity set uniquely. There could be
multiple keys in a single entity set, but out of these multiple keys, only one key will be the primary key. A primary
key can only contain unique and not null values in the relational database table.
Example:
Roll No Name Age Class
101 rekha 14 6
102 geetha 16 8
103 Durga 8 4
104 kumar 18 12
102 naidu 6 7
The last row of the student's table violates the key integrity constraint since Roll No 102 is repeated twice in
the primary key column. A primary key must be unique and not null therefore duplicate values are not allowed in
the Roll No column of the above student's table.
Types of key constraints are:
Primary Key:
The PRIMARY KEY constraint uniquely identifies each record in a table. Primary keys must contain
UNIQUE values, and cannot contain NULL values.
Example:
create table emp
( id number
name varchar (20)
age number
course varchar(10)
primary key (id) );
Foreign Key:
The foreign key a constraint is a column or list of columns that points to the primary key column of
another table The main purpose of the foreign key is only those values are allowed in the present table that will
match the primary key column of another table.
Example: create a foreign key
1. Reference table/primary table:
create table customers1(
id int ,
name varchar (20) ,
course varchar(10) ,
primary key (id));
2. Child table:
create table customers2(
id int ,
marks int,
references customers1(i));
Not Null :
Null represents a record where data may be missing data or data for that record may be optional Once not
null is applied to a particular column, you cannot enter null values to that column and restricted to maintain
only some proper value other than null
Example:
create table orders (
orderid number not null,
personname varchar(10));
Transact-SQL provides several mechanisms for integrity enforcement in a database such as rules, defaults, indexes,
and triggers. These mechanisms allow you to maintain these types of data integrity:
Requirement – requires that a table column must contain a valid value in every row; it cannot allow null values.
The create table statement allows you to restrict null values for a column.
Check or validity – limits or restricts the data values inserted into a table column. You can use triggers or rules to
enforce this type of integrity.
Uniqueness – no two table rows can have the same non-null values for one or more table columns. You can use
indexes to enforce this integrity.
Referential – data inserted into a table column must already have matching data in another table column or another
column in the same table. A single table can have up to 192 references.
******************************
A relational database query (query, for short) is a question about the data, and the answer consists of a new relation
containing the result. For example ,we might want to find all students younger than 18 or all students enrolled
inReggae203. A query language is a specialized language for writing queries.SQL is the most popular commercial
query language for a relational DBMS.
We can retrieve rows corresponding to students who are younger than 18 with the following SQL query:
SELECT *
FROM Students S
WHERE S.age < 18
Here,..The symbol ,*, means that we retain all fields of selected tuples in the result.
In addition to selecting a subset of tuples, a query can extract a subset of the fields of each selected tuple. vVe can
compute the names and logins of students who are younger than 18 with the following query:
We can also combine information in the Students and Enrolled relations. If we want to obtain the names of all
students who obtained an A and the id of the course in which they got an A, we could write the following query:
****************************
Logical Database Design: ER to Relational:
The ER model is convenient for representing an initial, high-level database design. Given an ER diagram
describing a database.
Entity Sets to Tables:
An entity set is mapped to a relation in a straightforward way: Each attribute of the entity set becomes an attribute
of the table. Note that we know both the domain of each attribute and the (primary) key of an entity set.
Consider the Employees entity set with attributes id, name, and city shown below.
Example:
Entity relational model: id name
city
eeeee
e
Employee
A relationship set, like an entity set, is mapped to a relation in the relational model. we begin by considering
relationship sets without key and participation constraints,
id id city
salary
Employee Works
Department
in
Introduction to Views:
1. Views:
Views in SQL are considered as a virtual table. A view also contains rows and columns. To create the view, we can
select the fields from one or more tables present in the database. View is logical copy of table data.
Advantages of views:
Provide Security.
Increase performance.
Update dynamically
Types of view:
Simple view
Composite view
Inline view
Forced view
Materialized view
Simple view: A view is created on single table is called simple view
Example: create view empp
as
select id, name, salary from emp;
Composite view: A view is created on multiple tables are called composite view.
Example: create view empdept
as
select id, name, salary, dname, location from emp, dept where emp.did=dept.did;
Inline view: A view based on a subquery in FROM Clause, that subquery creates a temporary table and simplifies
the complex query.
Example: select id, name,salary from (select * from emp)t;
Forced view: The Force view is used to create a view when there is no table that exists in the database
Example: create or replace force view view_name
as
select id,name from product;
DBMS ML REKHA BVCITS-MCA
Materialized view: A materialized view in Oracle is a database object that contains the results of a query. They
are local copies of data located remotely, or are used to create summary tables based on aggregations of a table’s
data. Materialized views, which store data based on remote tables are also, know as snapshots.
Example:
create materialized view new
as
select *from empp;
**********************
Destroying/Altering Tables and Views:
Dropping Views:
Obviously, where you have a view, you need a way to drop the view if it is no longer needed. The syntax is very
simple and is given below −
Example: DROP VIEW view_name;
Alter View:
ALTER TABLE modifies the structure of an existing table. To add a column called maiden-name to
Students, for example, we would use the following command:
Example: alter table students
add column maiden-name char(10)
ALTER TABLE can also be used to delete columns and add or drop integrity constraints on a table; we do
notd iscuss these aspects of the command beyond remarking that dropping column sis treated very
similarly to dropping tables or views.
ALTER VIEW command modifies views created using the create view command or a view projected from
persistent class. The altered view replaces the existing view ,so you cannot modify specific column in a
view
Example:
Create or replace view [brazil customers] as
select customername, contactname, city
from customers
where country = "brazil";
Example:
Alter view newemployees as
select name,office,startdate
from sample.employees
where name=’rekha’;
***************************
Relational Algebra: Selection and Projection, Set Operations, Renaming, Joins, Division, More Examples of
Algebra Queries.
SQL: Queries, Constraints, Triggers: The Form of a Basic SQL Query, UNION, INTERSECT and
EXCEPT, Nested Queries, Aggregate Operators, Null Values, Complex Integrity Constraints in SQL, Triggers
and Active Databases, Designing Active Databases.
In DBMS there are two types of query languages. Query is a condition or statement it is used to retrieve
the data from the database. Query language is also called high level language. Query languages are two types
1. Procedural query language.
2. Non Procedural query language.
Procedural Query Language:
The user used to specific procedure to retrieve the data from database called procedural query language.
Ex: Relational algebra.
Non-Procedural Query Language:
The user not used to specific procedure to retrieve the data from database is called non procedural query
language.
Ex: Relational calculus.
Relational Algebra:
Relational algebra is a procedural query language. Each relational query describes a step-by-step
procedure for computing the desired answer, based on the order in which operators are applied in the query .It
consisting of set of operators that are used to take one or more relations as input and produce new relation as
output.
Basic operators are
1. Unary operator.
2. Binary operator.
Unary operator:
Unary operator applied operation on single relations.
Ex: selection, projection, renaming etc…
Binary operator:
Binary operator applied operation on a two relations.
Ex: set operators, joins, division etc…
**********************
Selection:
Selection Operator (σ) is a unary operator in relational algebra that performs a selection operation.
Selection operation retrieving the data from database using records or tuples.
Syntax: σ<selection condition>(R)
Example: σ<rating > 7(sailors).
Selection operator can perform using logical (and, or, not ) and comparison operators(<,>,<=,>=,==,!=).
Selection operator can retrieve the selected records.
Selection operator always selects the entire tuple. It can not select a section or part of a tuple.
Selection operator can also run with projection.
Degree of the relation from a selection operation is same as degree of the input relation.
Selection operator performs horizontal partitioning of the relation.
Projection:
The projection operator π is one of the unary operators in relational algebra (RA) and is used to
project columns from a relation. It can select specific columns from a given relation and hide all the other
columns.
Set Operations:
SQL set operators are used to combine the results obtained from two or more queries into a single result. The
queries which contain two or more sub queries are known as compounded queries.
Union
Union all
Intersect
Minus
1. Union:
The SQL Union operation is used to combine the result of two or more SQL SELECT queries.
In the union operation, all the number of data type and columns must be same in both the tables on
which UNION operation is being applied.
The union operation eliminates the duplicate rows from its result set.
Rename is used to change the object name /table name in database name. The rename operator ρ is
one of the unary operators in relational algebra and is used to rename relations in a DBMS.
Joins are used to display multiple data types of data from multiple tables. Joins are two types
1. Physical join
2. Logical join
Physical join: physical join is used to establish the connection physically on tables using referential
constraints/integrity constraints.
Logical join: Logical join is used to establish the connection logically on tables. Logical join two types
1. Combined data: It can be joined using set operators.(same data type of data and same number columns)
These are
Union
Union all
Intersect
Minus
2. Combinational data: It can be joined using join operation.(different data type of data and different
number columns).These are
Cross join
Equal join
Inner join
Self join
Outer join
Left outer.
Right outer.
Full outer.
Cross join:
It will display combination of data from multiple tables. In this join, each value in the first table is
mapped with all values in the second table. It will display all possible combinations of data from multiple
tables.
Example:
select eid, ename, dname from employee, depart;
Equal Join:
A cross join is known as equi join if we specify join condition using '=' operator. It will display only matched
data from all tables. A condition is known as join condition if it is specified between primary key of one table
and foriegn key of other table.
Example: select eid,ename,dname from employee,depart
where employee.deptid=depart.deptid;
Inner Join:
The INNER JOIN keyword selects records that have matching values in both tables. The INNER
JOIN keyword selects all rows from both tables as long as there is a match between the columns.
Example: select employee.eid, depart.deptid from employee
inner join
depart on (depart.deptid=employee.deptid);
Outer join:
These are used to display all data from one table and only matched data from other table.
Types of outer joins:
1) Left outer join : left join Display all the data from left table and only matched data from right table.
Example: select employee.eid, depart.deptid from employee
left outer join
depart on (depart.deptid=employee.deptid);
2) Right outer join : Right join Display complete data from right table and only matched data from left table.
Example: select employee.eid, depart.deptid from employee
right outer join
depart on (depart.deptid=employee.deptid)
3) Full outer join : Full join: Display --matched data from both the tables
--unmatched data from left table
--unmatched data from right table
The DIVISION operation is a binary relational operation that divides one set of rows into another set of
rows based on specified conditions.
It is similar to a JOIN operation, but the resulting table contains only the rows that belong to the first set
and satisfy the division condition.
The division operator is used for queries which involve the ‘all’.R1 ÷ R2 = tuples of R1 associated with
all tuples of R2.
1. Find the names and ages of sailors who are rating above 7
Π sname,age σ<rating > 7(sailors).
2. Find the names of sailors who have reserved 103 boat
Π snameσ <bid=103> (sailors X Reserves).
3. Find the names of sailors who ha've reserved a red boat
Π snameσ <color=’red’ (Boats) > (sailors X Reserves).
4. Find the colors of boats reserved by Lubber
Π colorsσ <sname=’Lubber’ (sailors) > (sailors X Reserves X Boats )
5. Find the names of sailors who have reserved at least one boat
Π sname (sailors X Reserves).
6. Find the names of sailors who have reserved a red and green boat
Π snameσ <color=’red’ (Boats) U color=’green’ (Boats) > (sailors X Reserves X Boats ).
7. Find the sids of sailors with age over 20 who have not Reserved a Red boat
Π sid <σ (age=20 (Sailors)) - σ (color=’red’ (Boats) ) > (sailors X Reserves X Boats ).
8. Find the names of sailors 'Who have reserved all boats
Π sname (sailors X Reserves).
**********************************
The Form of a Basic SQL Query:
Structured Query Language (SQL) is the most widely used relational database language. A conceptual
evaluation strategy is a way to evaluate the query that is intended to be easy to understand rather than efficient.
A DBMS would typically execute a query in a different and more efficient way.
The basic form of an SQL query is follows:
Example:
SELECT [DISTINCT]
select-list FROM from-list
WHERE qualification
Select statement can be used retrieve all the information table.
Syntax:
Select *from table where condition.
Example:
Select * from student where city =’Kakinada’;
Select statement can be use retrieve selected attributes from the table
Syntax:
Select attribute1, attribute2,….from table where condition.
Example:
Select id, name, course from student;
Select statement can be use retrieve selected records from the table
Syntax:
Select attribute1, attribute2,….from table where condition.
Example:
Select id, name, course from student where city =’Kakinada’;
**************************
UNION, INTERSECT:
Union
The SQL Union operation is used to combine the result of two or more SQL SELECT queries.
In the union operation, all the number of data type and columns must be same in both the tables on
which UNION operation is being applied.
The union operation eliminates the duplicate rows from its result set.
EXCEPT:
The EXCEPT operator in SQL is used to retrieve the unique records that exist in the first table, not the
common records of both tables. This operator acts as the opposite of the SQL UNION operator.
Example:
except
Example:
SELECT eid, ename FROM employee
WHERE eid IN (SELECT eid FROM awards);
Example:
SELECT eid, ename FROM employee
WHERE eid NOT IN (SELECT eid FROM awards);
Example:
SELECT * FROM employee
WHERE desg = 'Asst.prof'
AND salary > ALL (
SELECT salary FROM employee WHERE desg = 'HOD');
Example:
SELECT * FROM employee
WHERE desg = 'Asst.prof'
AND salary > ANY (
SELECT salary FROM employee WHERE desg = 'HOD');
Co-related Nested Queries:
In co-related nested queries, the inner query uses the values from the outer query to execute the inner
query for every row processed by the outer query. The co-related nested queries run slowly because the
inner query is executed for every row of the outer query's result.
Example:
SELECT * FROM employee emp1
WHERE salary > (
SELECT AVG(salary)
FROM employee emp2
WHERE emp1.desg = emp2.desg);
*********************************************
Aggregate Operators:
Aggregate Functions in DBMS: Aggregate functions are those functions in the DBMS which takes the
values of multiple rows of a single column and then form a single value by using a query. These functions allow
the user to summarizing the data.
In Database Management System, following are the five aggregate functions:
1. AVG
2. COUNT
3. SUM
4. MIN
5. MAX
AVG Function
This function takes the values from the given column and then returns the average of the values.
Example:
select avg(pcost) from product;
COUNT Function
This aggregate function returns the total number of values in the specified column. This function can
work on any type of data, i.e., numeric as well as non-numeric. This function does not count the NULL
values.
Example:
select count(name) from student;
SUM Function
This aggregate function sums all the non-NULL values of the given column. Like the AVG function,
this function also works only on the numeric data.
Example:
select sum(pcost) from product;
MAX Function
This function returns the value, which is maximum from the specified column.
Example:
select max(pcost) from product;
MIN Function
This function returns the value, which is minimum from the specified column.
Example:
select min(pcost) from product;
********************************************
Null Values:
Null values are special values in DBMS that represent values which are unknown and are always different from
zero value. For example age of a particular student is not available in the age column of student table then it is
represented as null but not as zero.
SQL provides special operators and functions to deal with data involving null values
IS NULL operator:
All operations upon null values present in the table must be done using this ‘is null’ operator .we cannot
compare null value using the assignment operator
Example:
select * from emp
where comm is null
IS NOT NULL operator:
All operations upon not null values present in the table must be done using this ‘is not null’ operator .
Example:
select * from emp
where comm is not null
NOT NULL Constraint :
Not all constraints prevents a column to contain null values Once not null is applied to a particular
column, you cannot enter null values to that column and restricted to maintain only some proper
value other than null.A not-null constraint cannot be applied at table level
Example:
create table student
(
id int not null,
name varchar (20) not null,
age int not null,
address char (25) ,
salary decimal (18, 2),
primary key (id));
NVL() Function:
Using NVL function you can substitute a value in the place of NULL values.
Example:
select nvl (comm, 500) from employees
where salary>1000;
********************************************
We can specify complex constraints over a single table using table constraints, which
have the form CHECK conditional-expression.
We use the Sailors, Reserves, and Boats schema for all our examples
1. Find the names and ages of sailors who are rating above 7
SELECT S.sid, S.sname, S.rating, S.age
FROM Sailors AS S
WHERE S.rating > 7
2. Find the names of sailors who have reserved 103 boat
SELECT S.sname FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid= 103;
3. Find the names of sailors who ha've reserved a red boat
SELECT DISTINCT s.sname FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ’red’
4. Find the colors of boats reserved by Lubber
SELECT B.color
FROM Sailors S, Reserves R, Boats B
WHERE S.sname='Lubber' AND S.sid=R.sid AND R.bid = B.bid
5. Find the names of sailors who have reserved at least one boat
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid
6. Find the names of sailors who have reserved a red and green boat
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
AND (B.color = ‘red’ OR B.color = ‘green’)
7. Find the sids of sailors with age over 20 who have not Reserved a Red boat
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE B.color != 'red' and B.bid = R.bid and S.sid = R.sid and S.sid > 20
8. Find the names of sailors 'Who have reserved all boats
SELECT S.sname FROM Sailors S, Reserves R
WHERE S.sid=R.sid .
Triggers:
A trigger is a procedure that is automatically invoked by the DBMS in response to specified changes to
the database, and is typically specified by the DBA. A database that has a set of associated triggers is called an
active database.
A trigger description contains three parts:
Event: A change to the database that activates the trigger. ..
Condition: A query or test that is run when the trigger is activated. ..
Action: A procedure that is executed when the trigger is activated and its condition is true.
Use of trigger:
Triggers may be used for any of the following reasons −
To implement any complex business rule, that cannot be implemented using integrity constraints.
Triggers will be used to audit the process. For example, to keep track of changes made to a table.
Trigger is used to perform automatic action when another concerned action takes place.
Types of triggers:
The different types of triggers are explained below −
Statement level trigger − It is fired only once for DML statement irrespective of number of rows
affected by statement. Statement-level triggers are the default type of trigger.
Before-triggers − At the time of defining a trigger we can specify whether the trigger is to be fired
before a command like INSERT, DELETE, or UPDATE is executed or after the command is executed.
Before triggers are automatically used to check the validity of data before the action is performed. For
instance, we can use before trigger to prevent deletion of rows if deletion should not be allowed in a
given case.
After-triggers − It is used after the triggering action is completed. For example, if the trigger is
associated with the INSERT command then it is fired after the row is inserted into the table.
Row-level triggers − It is fired for each row that is affected by DML command. For example, if an
UPDATE command updates 150 rows then a row-level trigger is fired 150 times whereas a statement-
level trigger is fired only for once.
Introduction to Normalization:
Normalization is a database design technique divides larger tables into smaller tables and links them
using relationships. The purpose of Normalization in SQL is to eliminate redundancy and provide security
also improve data integrity.
Why do we need Normalization?
The main reason for normalizing the relations is removing these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data integrity and other problems as the database grows.
Data modification anomalies can be categorized into three types:
1. Insertion anomaly: It occurs when we cannot insert data to the table without the presence of another
attribute
2. Update anomaly: It is a data inconsistency that results from data redundancy and a partial update of
data.
3. Deletion Anomaly: It occurs when certain attributes are lost because of the deletion of other
attributes.
***************************************
Informal Design Guidelines for Relation Schema
Informal guidelines that may be used as measures to determine the quality of relation schema design.
There are four methods of informal designs.
1. Making sure that the semantics of the attributes is clear in the schema
2. Reducing the redundant information in tuples
3. Reducing the NULL values in tuples
4. Disallowing the possibility of generating spurious tuples
Making sure that the semantics of the attributes is clear in the schema:
Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to
individual relations and their attributes).
Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in
the same relation
Only foreign keys should be used to refer to other entities
Entity and relationship attributes should be kept apart as much as possible.
************************************
Functional dependency:
The functional dependency is a relationship that exists between two attributes. It typically exists between the
primary key and non-key attribute within a table.
X → Y
The left side of FD is known as a determinant, the right side of the production is known as a dependent.
For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.
Here Emp_Id attribute can uniquely identify the Emp_Name attribute of employee table because if we know
the Emp_Id, we can tell that employee name associated with it.
Functional dependency can be written as:
Emp_Id → Emp_Name
Rules of functional dependency:
William Armstrong in 1974 suggested a few rules related to functional dependency. They are
called RAT rules.
Reflexivity: If A is a set of attributes and B is a subset of A, then the functional dependency A →
B holds true.
Example:{ Employee_Id, Name } → Name is valid.
Augmentation: If a functional dependency A → B holds true, then appending any number of the
attribute to both sides of dependency doesn't affect the dependency. It remains true.
Example: X → Y holds true then, ZX → ZY also holds true.
Example, if { Employee_Id, Name } → { Name } holds true then, { Employee_Id, Name, Age }
→ { Name, Age }
Transitivity: If two functional dependencies X → Y and Y → Z hold true, then X → Z also holds true
by the rule of Transitivity.
Example, if { Employee_Id } → { Name } holds true and { Name } → { Department } holds true,
then { Employee_Id } → { Department } also holds true.
*********************************
Normal Forms Based on Primary Keys:
Normal forms are used to eliminate or reduce redundancy in database tables. There are several
types of keys used in normalization in database management systems (DBMS). These are explained as
follows.
Super Key: A super key is a set of one or more attributes that uniquely identifies each record in a
table. A super key may contain more attributes than necessary to uniquely identify each record.
Normalization is the process of organizing data in a database to minimize redundancy and dependency. In
database design, there are different normal forms based on the primary keys of a table. These include
******************************
This table is not in first normal form because the [Color] column can contain multiple values. For example,
the first row includes values "red" and "green."
To bring this table to first normal form, we split the table into two tables and now we have the resulting
tables:
Now first normal form is satisfied, as the columns on each table all hold just one value.
**********************************
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
Now second normal form is satisfied,
*********************************
Third Normal Form:
A database is in third normal form if it satisfies the following conditions:
It is in second normal form
There is no transitive functional dependency
By transitive functional dependency, we mean we have the following relationships in the table: A is
functionally dependent on B, and B is functionally dependent on C. In this case, C is transitively dependent on
A via B.
Example:
Here, EMP_STATE & EMP_CITY dependent on EMP_ZIP and EMP_ZIP dependent on EMP_ID. The non-
prime attributes (EMP_STATE, EMP_CITY) transitively dependent on super key(EMP_ID). It violates the
rule of third normal form.That's why we need to move the EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary key
EMPLOYEE table:
EMP_ID EMP_NAME EMP_ZIP
222 Harry 201010
333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007
EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
****************************************
Example: Suppose there is a bike manufacturer company which produces two colors(white and black) of each
model every year.
BIKE_MODEL MANUF_YEAR COLOR
M2011 2008 White
M2001 2008 Black
M3001 2013 White
M3001 2013 Black
M4006 2017 White
M4006 2017 Black
In this case, these two columns can be called as multivalued dependent on BIKE_MODEL. The representation
of these dependencies is shown below:
BIKE_MODEL → → MANUF_YEAR
BIKE_MODEL → → COLOR
Example
STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF, but the COURSE and HOBBY are two independent entity. Hence,
there is no relationship between COURSE and HOBBY.
In the STUDENT relation, a student with STU_ID, 21 contains two courses, Computer and Math and two
hobbies, Dancing and Singing. So there is a Multi-valued dependency on STU_ID, which leads to
unnecessary repetition of data. So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
***************************
*******************************
Relation R
Dept Subject Name
CSE C Ammu
CSE C Amar
CSE Java Amar
IT C bhanu
Here,
dept ->-> subject
dept->-> name
The above relation is in 4NF. Anomalies can occur in relation in 4NF if the primary key has three or more
fields. The primary key is (dept, subject, name). Sometimes decomposition of a relation into two smaller
relations does not remove redundancy.The above relation says that dept offers many elective subjects which
are taken by a variety of students. Students have the opinion to choose subjects. Therefore all three fields are
needed to represent the information.
The above relation does not show non-trivial MVDs since the attributes subject and name are dependent; they
are related to each other (A FD subject->name exists). The relation cannot be decomposed in two relations
(dept, subject) and (dept,sname).
Therefore the relation can be decomposed into following three relations −
R1(dept, subject)
R2(dept, name) and R3(subject, name) and it can be shown that decomposition is lossless.
R1
Dept Subject
CSE C
CSE Java
IT C
R2
Dept Name
CSE Ammu
CSE Amar
IT bhanu
R3
Subject Name
C Ammu
C Amar
Java Amar
C bhanu
**************************************
DBMS M L REKHA BVCITS
UNIT-5
Transaction Management and Concurrency Control: Transaction Concept, A Simple Transaction Model,
Storage Structure, ACID Properties, Serializability, Transaction Isolation Levels, Concurrency Control, Lock-
Based Protocols, Validation-Based Protocols.
Transaction Concept:
The transaction is a set of logically related operation. It contains a group of tasks. A transaction is an action or
series of actions. It is performed by a single user to perform operations for accessing the contents of the
database.
Example: Suppose an employee of bank transfers Rs 800 from X's account to Y's account. This small
transaction contains several low-level tasks:
x's account:
open_account(x)
old_balance = x.balance
new_balance = old_balance - 800
x.balance = new_balance
close_account(x)
y's account:
open_account(y)
old_balance = y.balance
new_balance = old_balance + 800
y.balance = new_balance
close_account(y) .
*************************************
Simple Transaction Model:
Simple transaction model is a model of transaction how it must be. It has active, partially committed,
failed, aborted, and committed states. Transaction is a several operations that can change the content of the
database which is handled by a single program
Active: The initial state where the transaction enters is the active state… i.e, it start the execution of
tractions.
Partially Committed: The transaction enter in this state while it is executing read, write or other
operations.
Committed: The transaction enters this state after successful completion of the transaction and system
checks have issued commit signal. All operations will save per mentally on physical database.
Failed: The transaction goes from partially committed state or active state to failed state when it is
discovered that normal execution can no longer proceed or system checks fail.
Aborted: This is the state after the transaction has been rolled back after failure and the database has been
restored to its state that was before the transaction begin.
The following state transition diagram depicts the states in the transaction and the low level transaction
operations that causes change in states.
*********************
Storage Structure:
1. Storage devices in DBMS:
A database system provides an ultimate view of the stored data. However, data in the form of bits, bytes get
stored in different storage devices.
Types of Data Storage
For storing the data, there are different types of storage options available. These storage types differ from one
another as per the speed and accessibility. There are the following types of storage devices used for storing the
data:
Primary Storage
Secondary Storage
Primary Storage
It is the primary area that offers quick access to the stored data. We also know the primary storage as volatile
storage. It is because this type of memory does not permanently store the data. As soon as the system leads to a
power cut or a crash, the data also get lost. Main memory and cache are the types of primary storage.
Main Memory: It is the one that is responsible for operating the data that is available by the storage
medium. The main memory handles each instruction of a computer machine.
Cache: It is one of the costly storage media. On the other hand, it is the fastest one. A cache is a tiny
storage media which is maintained by the computer hardware usually. While designing the algorithms and
query processors for the data structures, the designers keep concern on the cache effects.
Secondary Storage
Secondary storage is also called as Online storage. It is the storage area that allows the user to save and store
data permanently. This type of memory does not lose the data due to any power failure or system crash. That's
why we also call it non-volatile storage.
There are some commonly described secondary storage media which are available in almost every type of
computer system:
Flash Memory: A flash memory stores data in USB (Universal Serial Bus) keys which are further
plugged into the USB slots of a computer system. These USB keys help transfer data to a computer system, but
it varies in size limits.
Magnetic Disk Storage: This type of storage media is also known as online storage media. A magnetic
disk is used for storing the data for a long time. It is capable of storing an entire database. It is the responsibility
of the computer system to make availability of the data from a disk to the main memory for further accessing.
2. File organization in DBMS:
There are several ways to organize files. Based on access or selection, these specific strategies have benefits and
drawbacks. In terms of file organization, the programmer selects the approach that best suits his needs.
The following are the types of file organization in DBMS:
1. Heap files organization.
2. Sequential file organization.
3. Indexed file organization.
Heap file organization
It is the simplest and most basic type of organization.
It works with data blocks. In heap file organization, the records are inserted at the file's end.
When the records are inserted, it doesn't require the sorting and ordering of records. When the data block
is full, the new record is stored in some other block.
This new data block need not to be the very next data block, but it can select any data block in the
memory to store new records.
The heap file is also known as an unordered file. In the file, every record has a unique id, and every page
in a file is of the same size.
It is the DBMS responsibility to store and manage the new records.
Sequential file organization:
Every file record contains a data field (attribute) to uniquely identify that record.
In sequential file organization, records are placed in the file in some sequential order based on the
unique key field or search key.
Practically, it is not possible to store all the records sequentially in physical form.
ISAM method is an advanced sequential file organization. In this method, records are stored in the file
using the primary key.
An index value is generated for each primary key and mapped with the record.
This index contains the address of the record in the file.
B+ Trees:
B+ tree file organization is the advanced method of an indexed sequential access method.
It uses a tree-like structure to store records in File.
It uses the same concept of key-index where the primary key is used to sort the records. For each
primary key, the value of the index is generated and mapped with the record.
The B+ tree is similar to a binary search tree (BST), but it can have more than two children.
In this method, all the records are stored only at the leaf node.
Intermediate nodes act as a pointer to the leaf nodes. They do not contain any records.
************************
ACID Properties:
A transaction is a logical unit of work that accesses and updates the contents of a database. Read and
write operations are used by transactions to access data. To maintain the integrity of the data, there are four
properties described in the database management system, which are known as the ACID properties. The ACID
properties are meant for the transaction that goes through a different group of tasks, and there we come to see
the role of the ACID properties.
1) Atomicity:
The term atomicity defines that the data remains atomic. It means if any operation is performed on the
data, either it should be performed or executed completely or should not be executed at all. It further means that
the operation should not break in between or execute partially. In the case of executing operations on the
transaction, the operation should be completely executed and not partially.
Tranction-T1
Read(A) Read(B)
1000 2000
Write(A) Write(B)
1000-200 2000+200
Read(A) Read(B)
800 2200
Successful of Transaction.
2) Consistency:
Data is in a consistent state when a transaction starts and when it ends. Before transaction and after
transaction data is correct and the structure is stable.
Tranction-T1
Before Transaction A:1000 ,B:2000
Total=1000+2000=3000
Read(A) Read(B)
1000 2000
Write(A) Write(B)
1000-200 2000+200
Read(A) Read(B)
800 2200
After Transaction A:800 ,B:2200
Total=800+2200=3000
3) Isolation:
The term 'isolation' means separation. In DBMS, Isolation is the property of a database where no data
should affect the other one and may occur concurrently. In short, the operation on one database should begin
when the operation on the first database gets complete. It means if two operations are being performed on two
different databases, they may not affect the value of one another. In the case of transactions, when two or more
transactions occur simultaneously, the consistency should remain maintained. Any changes that occur in any
particular transaction will not be seen by other transactions until the change is not committed in the memory.
Tranction-T1
Read(A) Read(B)
1000 2000
Write(A) Write(B)
1000-200 2000+200
Read(A) Read(B)
800 2200
Tranction-T2
Read(B) Read(C)
2200 1500
Write(B) Write(C)
2200+500 1500-500
Read(B) Read(C)
2700 1000
4) Durability
Durability ensures the permanency of something. In DBMS, the term durability ensures that the data after the
successful execution of the operation becomes permanent in the database. The durability of the data should be
so perfect that even if the system fails or leads to a crash, the database still survives.
****************************
Serializability:
Serializability is related to schedules and transactions. Schedule is a set of transactions, and a transaction is a set
of instructions used to perform any logical operations in terms of databases. i.e assign a sequential number to
each and every operation in a transaction.
Schedules in DBMS are of two types:
Serial Schedule - A schedule in which only one transaction is executed at a time, i.e., one transaction is
executed completely before starting another transaction.
Non-serial Schedule - A schedule in which the transactions are interleaving or interchanging. There are several
transactions executing simultaneously as they are being used in performing real-world database operations.
Types of serializability
There are two types of serializability −
View serializability:
A schedule is view-serializability if it is viewed equivalent to a serial schedule.
The rules it follows are as follows −
T1 is reading the initial value of A, then T2 also reads the initial value of A.
T1 is the reading value written by T2, then T2 also reads the value written by T1.
T1 is writing the final value, and then T2 also has the write operation as the final value.
Conflict serializability:
It orders any conflicting operations in the same way as some serial execution. A pair of operations is said to
conflict if they operate on the same data item and one of them is a write operation.
That means
Readi(x) readj(x) - non conflict read-read operation
Readi(x) writej(x) - conflict read-write operation.
Writei(x) readj(x) - conflict write-read operation.
Writei(x) writej(x) - conflict write-write operation.
*****************************
Transaction isolation levels specify what data is visible to statements within a transaction. These levels directly
impact the level of concurrent access by defining what interaction is possible between transactions against the
same target data source.
The different types of database anomalies are described as follows:
1. Read Uncommitted: It is the lowest in the isolation level. At this level, one transaction cannot read the
changes made by the other transactions, so it allows dirty reads. At this level, the transaction is not
isolated from each other.
2. Read Committed: It provides a guarantee to each data that these data gets committed when these are
read by any transaction. So that it does not allow dirty Read. The transaction holds the Read or writes
action so that it prevents the data from reading or written by any other transaction.
3. Repeatable Read: This is the most restrictive isolation level. The transaction holds read locks on all
rows it references and writes locks on referenced rows for updates and deletes actions. Since other
transactions cannot read, update or delete these rows, consequently, it avoids non-repeatable Read.
4. Serializable: This is the highest isolation level. A serializable execution is guaranteed to be serializable.
Serializable execution is defined to be an execution of operations in which concurrently executing
transactions appears to be serially executing.
*******************************
Concurrency Control:
Concurrency control concept comes under the Transaction in database management system (DBMS). It
is a procedure in DBMS which helps us for the management of two simultaneous processes to execute without
conflicts between each other, these conflicts occur in multi user systems.Concurrency can simply be said to be
executing multiple transactions at a time. It is required to increase time efficiency. If many transactions try to
access the same data, then inconsistency arises. Concurrency control required to maintain consistency data.
For example, if we take ATM machines and do not use concurrency, multiple persons cannot draw money at a
time in different places. This is where we need concurrency.
Concurrency control techniques:
The concurrency control techniques are as follows −
Locking
Lock guaranties exclusive use of data items to a current transaction. It first accesses the data items by
acquiring a lock, after completion of the transaction it releases the lock.
Types of Locks
The types of locks are as follows −
Shared Lock [Transaction can read only the data item values]
Exclusive Lock [Used for both read and write data item values]
Time Stamping
Time stamp is a unique identifier created by DBMS that indicates relative starting time of a transaction.
Whatever transaction we are doing it stores the starting time of the transaction and denotes a specific
time. This can be generated using a system clock or logical counter. This can be started whenever a
transaction is started. Here, the logical counter is incremented after a new timestamp has been assigned.
Optimistic
It is based on the assumption that conflict is rare and it is more efficient to allow transactions to proceed
without imposing delays to ensure serializability.
*******************************
Lock-Based Protocols:
A Lock-based protocol in DBMS as a mechanism that is responsible for preventing a transaction from reading
or writing data until the necessary lock is obtained.
It is the simplest way of locking the data while transaction. Simplistic lock-based protocols allow all the
transactions to get the lock on the data before insert or delete or update on it. It will unlock the data item
after completing the transaction.
Pre-claiming Lock Protocols evaluate the transaction to list all the data items on which they need locks.
Before initiating an execution of the transaction, it requests DBMS for all the lock on all those data
items. If all the locks are granted then this protocol allows the transaction to begin. When the transaction
is completed then it releases all the lock.
The two-phase locking protocol divides the execution phase of the transaction into three parts. In the
first part, when the execution of the transaction starts, it seeks permission for the lock it requires. In the
second part, the transaction acquires all the locks. The third phase is started as soon as the transaction
releases its first lock.
Growing phase: In the growing phase, a new lock on the data item may be acquired by the transaction,
but none can be released.
Shrinking phase: In the shrinking phase, existing lock held by the transaction may be released, but no
new locks can be acquired.
Strict Two-phase locking (Strict-2PL):
The first phase of Strict-2PL is similar to 2PL. In the first phase, after acquiring all the locks, the
transaction continues to execute normally. The only difference between 2PL and strict 2PL is that Strict-
2PL does not release a lock after using it.Strict-2PL waits until the whole transaction to commit and save
all the transactions in physical database and then it releases all the locks at a time.
**********************************
Validation-Based Protocols:
Validation phase is also known as optimistic concurrency control technique. In the validation based
protocol, the transaction is executed in the following three phases:
1. Read phase: In this phase, a transaction reads the value of data items from database and store their values
into the temporary local variables. Transaction then starts executing but it doesn’t update the data items in
the database, instead it performs all the operations on temporary local variables.
2. Validation phase: In this phase, a validation check is done on the temporary variables to see if it violates
the rules of serializability.
3. Write phase: This is the final phase of validation based protocol. In this phase, if the validation of the
transaction is successful then the values of temporary local variables are written to the database and the
transaction is committed. If the validation is failed in second phase then the updates are discarded and
transaction is slowed down to be restarted later.
******************************