Q-A Combined
PART – A
12.List the reasons why null value might be introduced into the database.
NULL is a special value provided by the database in two cases -
i) When field values of some tuples are unknown (e.g. a city name is not yet
assigned), and
ii) When an attribute is inapplicable (e.g. a middle name is not present).
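For example, the following hypothetical person table (the table and column names are assumed for illustration) shows both cases - the middle name is inapplicable and the city is not yet known:
CREATE TABLE person (
person_id INT PRIMARY KEY,
first_name VARCHAR(20) NOT NULL,
middle_name VARCHAR(20), -- NULL when the person has no middle name (inapplicable)
city VARCHAR(20) -- NULL when the city is not yet assigned (unknown)
);
INSERT INTO person (person_id, first_name, middle_name, city)
VALUES (1, 'Ram', NULL, NULL);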
20.What is the difference between logical data independence and physical data
independence?
1. Physical Independence: This is a kind of data independence which allows the
modification of physical schema without requiring any change to the conceptual
schema. For example - if there is any change in memory size of database server
then it will not affect the logical structure of any data object.
2. Logical Independence: This is a kind of data independence which allows the
modification of conceptual schema without requiring any change to the external
schema. For example - Any change in the table structure such as addition or
deletion of some column does not affect user views.
These forms of data independence reduce the time and cost incurred by changes at any
one level, and provide an abstract view of data to the user.
PART B
1. Sketch the typical component modules of DBMS. Indicate and explain
interactions between those modules of the system.
• Consider the top part of Fig. 1.5.1. It shows application interfaces used by naïve
users, application programs created by application programmers, query tools used
by sophisticated users and administration tools used by database administrator.
• The two important components of database architecture are - Query processor and
storage manager.
Query processor:
• The interactive query processor helps the database system to simplify and facilitate
access to data. It consists of DDL interpreter, DML compiler and query evaluation
engine.
i) DDL interpreter: This is basically a translator which interprets the DDL statements
and records the resulting definitions in the data dictionary.
ii) DML compiler: It translates DML statements of the query language into an evaluation
plan. This plan consists of the instructions which the query evaluation engine
understands.
iii) Query evaluation engine: It executes the low-level instructions generated by the
DML compiler.
• When a user issues a query, the parsed query is presented to a query optimizer,
which uses information about how the data is stored to produce an efficient execution
plan for evaluating the query. An execution plan is a blueprint for evaluating a query.
It is evaluated by query evaluation engine.
Storage manager:
• The storage manager is responsible for storing, retrieving, and updating data in the
database. The storage manager components include -
i) Authorization and integrity manager: Validates the users who want to access the
data and tests for integrity constraints.
ii) Transaction manager: Ensures that the database remains in a consistent state despite
system failures, and that concurrent transaction executions proceed without conflict.
iii) File manager: Manages allocation of space on disk storage and representation of
the information on disk.
iv) Buffer manager: Manages the fetching of data from disk storage into main
memory. The buffer manager also decides what data to cache in main memory.
Buffer manager is a crucial part of database system.
The storage manager also implements several data structures:
i) Data files: Used for storing the database itself.
ii) Data dictionary: Used for storing metadata, particularly the schema of the database.
iii) Indices: Indices are used to provide fast access to data items present in the
database.
2. Discuss the main categories of the data model. What is the basic difference
between the relational model, the object model and the XML model.
• Data model provides a way to describe the design of database at physical, logical
and view level.
• There are various data models used in database systems and these are as follows
-
• Relational model consists of a collection of tables which store data and also
represent the relationships among the data.
• The table contains one or more columns and each column has a unique name.
• Each table contains record of particular type, and each record type defines a fixed
number of fields or attributes.
• For example - Following figure shows the relational model by showing the
relationship between the Student and Result tables. For example - Student Ram
lives in city Chennai and his marks are 78. Thus the relationship between these two
tables is maintained by the SeatNo column.
Advantages:
(ii)Conceptual Simplicity: The relational model allows the designer to simply focus
on logical design and not on physical design. Hence relational models are
conceptually simple to understand.
(iii) Query Capability: Using a simple query language (such as SQL) the user can get
information from the database or the designer can manipulate the database structure.
(iv) Easy design, maintenance and usage: The relational models can be designed
logically, hence they are easy to maintain and use.
Disadvantages:
(i) Relational model requires powerful hardware and large data storage devices.
• As the name suggests the entity relationship model uses collection of basic objects
called entities and relationships.
Advantages:
ii) Easy to understand: The design of ER diagram is very logical and hence they are
easy to design and understand.
iv) Integrated: The ER model can be easily integrated with Relational model.
v) Easy conversion: ER model can be converted easily into other type of models.
Disadvantages:
i) Loss of information: While drawing ER model some information can be hidden or
lost.
• The object oriented languages like C++, Java, C# are becoming the dominant
languages in software development.
• The object based data model combines object oriented features with the
relational data model.
Advantages:
i) Enriched modelling: The object based data model has capability of modelling the
real world objects.
ii) Reusability: There are certain features of object oriented design such as
inheritance, polymorphism which help in reusability.
iii) Support for schema evolution: There is a tight coupling between data and
applications, hence there is strong support for schema evolution.
iv) Improved performance: Using the object based data model there can be significant
improvement in performance.
Disadvantages:
i) Lack of universal data model: There is no universally agreed data model for an
object based data model, and most models lack a theoretical foundation.
ii) Lack of experience: In comparison with relational database management the use
of the object based data model is limited. This model is more dependent on the skilled
programmer.
iii) Complex: More functionalities present in object based data model make the
design complex.
• The semi-structured data model permits the specification of data where individual
data items of same type may have different sets of attributes.
Advantages:
ii) It is flexible.
iii) It is portable.
Disadvantages:
4. Explain equi-join, left outer join, right outer join and full outer join operations in
relational algebra with an example.
The SQL Joins clause is used to combine records from two or more tables in a
database. A JOIN is a means for combining fields from two tables by using values
common to each.
Example: Consider two tables for using the joins in SQL. Note that cid is common
column in following tables.
1) Inner Join:
• The most important and frequently used of the joins is the INNER JOIN. It is
also known as an EQUIJOIN.
• The INNER JOIN creates a new result table by combining column values of two
tables (Table1 and Table2) based upon the join-predicate.
• The query compares each row of Table1 with each row of Table2 to find all pairs of
rows which satisfy the join-predicate.
• When the join-predicate is satisfied, column values for each matched pair of rows of
A and B are combined into a result row. It can be represented as:
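The original figure is not reproduced here; as a rough SQL sketch, assuming Table1 and Table2 share the common column cid mentioned above:
SELECT Table1.*, Table2.*
FROM Table1
INNER JOIN Table2
ON Table1.cid = Table2.cid;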
2) Left Join:
• The SQL LEFT JOIN returns all rows from the left table, even if there are no
matches in the right table. This means that if the ON clause matches 0 (zero) records
in the right table; the join will still return a row in the result, but with NULL in each
column from the right table.
• This means that a left join returns all the values from the left table, plus matched
values from the right table or NULL in case of no matching join predicate.
• It can be represented as –
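A rough SQL sketch of the same, again assuming Table1 and Table2 with the common column cid:
SELECT Table1.*, Table2.*
FROM Table1
LEFT JOIN Table2
ON Table1.cid = Table2.cid;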
3) Right Join:
• The SQL RIGHT JOIN returns all rows from the right table, even if there are no
matches in the left table.
• This means that if the ON clause matches 0 (zero) records in the left table; the join
will still return a row in the result, but with NULL in each column from the left table.
• This means that a right join returns all the values from the right table, plus matched
values from the left table or NULL in case of no matching join predicate.
• It can be represented as follows:
• Syntax: The basic syntax of a RIGHT JOIN is as follows -
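A rough sketch, assuming the same Table1 and Table2 with the common column cid:
SELECT Table1.*, Table2.*
FROM Table1
RIGHT JOIN Table2
ON Table1.cid = Table2.cid;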
4) Full Join:
• The SQL FULL JOIN combines the results of both left and right outer joins.
• The joined table will contain all records from both the tables and fill in NULLS for
missing matches on either side.
• It can be represented as
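A rough SQL sketch under the same assumptions (note that some systems, e.g. MySQL, do not support FULL OUTER JOIN directly):
SELECT Table1.*, Table2.*
FROM Table1
FULL OUTER JOIN Table2
ON Table1.cid = Table2.cid;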
1. Creating Tables
The CREATE command is used to define new tables and their structure. For
example, creating a Student table:
CREATE TABLE STUDENT (
STUDENT_ID INT PRIMARY KEY, -- Unique identifier for each student
NAME VARCHAR(50) NOT NULL, -- Student's name
DATE_OF_BIRTH DATE, -- Date of birth
GENDER CHAR(1), -- Gender (M/F)
COURSE_ID INT, -- Course identifier (foreign key)
ADMISSION_DATE DATE, -- Admission date
EMAIL VARCHAR(100) UNIQUE -- Unique email address
);
2. Altering Tables
The ALTER command is used to modify the structure of an existing table.
Adding a Column:
ALTER TABLE STUDENT
ADD PHONE_NUMBER VARCHAR(15);
Modifying a Column:
ALTER TABLE COURSE
MODIFY FEES DECIMAL(12, 2);
Dropping a Column:
ALTER TABLE STUDENT
DROP COLUMN PHONE_NUMBER;
3. Dropping Tables
The DROP command removes a table and its data from the database permanently.
For example:
DROP TABLE STUDENT;
4. Truncating Tables
The TRUNCATE command removes all rows from a table while retaining its
structure. For example:
TRUNCATE TABLE STUDENT;
5. Constraints in DDL
Constraints ensure data integrity in the database. Common constraints include
PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL and CHECK (discussed in detail
under key constraints below).
7. Discuss the entity Integrity and referential integrity constraints. Why are they
important? Explain them with suitable examples.
The integrity constraints are classified based on the concept of primary key and
foreign key. Let us discuss the classification of constraints based on primary key and
foreign key as follows-
1) Entity integrity constraint:
This rule states that "In a relation, the value of a primary key attribute can not be
null".
The NULL represents a value for an attribute that is currently unknown or is not
applicable for this tuple. Nulls are used to deal with incomplete or exceptional
data.
The primary key value helps in uniquely identifying every row in the table. Thus if the
users of the database want to retrieve any row from the table or perform any action
on that table, they must know the value of the key for that row. Hence it is necessary
that the primary key should not have the NULL value.
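For example, using the STUDENT table created earlier (whose primary key is STUDENT_ID), the following insert violates entity integrity and is rejected by the DBMS:
INSERT INTO STUDENT (STUDENT_ID, NAME)
VALUES (NULL, 'Ram'); -- fails: the primary key can not be NULL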
2) Referential integrity constraint:
• The referential integrity rule states that "whenever a foreign key value is used it
must reference a valid, existing primary key in the parent table".
• Example: Consider the situation where you have two tables Employees and
Managers. The Employees table has a foreign key attribute entitled Managed By,
which points to the record for each employee's manager in the Managers table.
i) You cannot add a record to the Employees table unless the Managed By attribute
points to a valid record in the Managers table. Referential integrity prevents the
insertion of incorrect details into a table. Any operation that doesn't satisfy referential
integrity rule fails.
ii) If the primary key for a record in the Managers table changes, all corresponding
records in the Employees table are modified.
iii) If a record in the Managers table is deleted, all corresponding records in the
Employees table are deleted.
ii) Prevents one table from pointing to a nonexistent field in another table.
iii) Prevents the deletion of a record that contains a value referred to by a foreign key
in another table.
iv) Prevents the addition of a record to a table that contains a foreign key unless
there is a primary key in the linked table.
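A minimal SQL sketch of the Employees/Managers example described above (the column names are assumptions; the cascade options implement behaviours ii and iii):
CREATE TABLE Managers (
ManagerID INT PRIMARY KEY,
Name VARCHAR(50)
);
CREATE TABLE Employees (
EmpID INT PRIMARY KEY,
Name VARCHAR(50),
ManagedBy INT,
FOREIGN KEY (ManagedBy) REFERENCES Managers(ManagerID)
ON UPDATE CASCADE -- behaviour ii: a key change propagates to Employees
ON DELETE CASCADE -- behaviour iii: deleting a manager deletes the corresponding rows
);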
8. Describe in detail about various types of key constraints with SQL query
example.
We can specify rules for data in a table.
• When the table is created at that time we can define the constraints.
• The constraint can be at column level i.e. we can impose a constraint on a column,
or at table level i.e. we can impose a constraint on the entire table.
The various types of constraints that can be defined are as follows -
1) Primary key: The primary key constraint is defined to uniquely identify the records
from the table.
The primary key must contain unique values. Hence database designer should
choose primary key very carefully.
For example
Consider that we have to create a person_details table with AdharNo, FirstName,
MiddleName, LastName, Address and City.
Now making AdharNo as a primary key is helpful here as using this field it becomes
easy to identify the records correctly.
The result will be
CREATE TABLE person_details (
AdharNo int,
FirstName VARCHAR(20),
MiddleName VARCHAR(20),
LastName VARCHAR(20),
Address VARCHAR(30),
City VARCHAR(10),
PRIMARY KEY(AdharNo)
);
We can create a composite key as a primary key using CONSTRAINT keyword. For
example
CREATE TABLE person_details (
AdharNo int NOT NULL,
FirstName VARCHAR(20),
MiddleName VARCHAR(20),
LastName VARCHAR(20) NOT NULL,
Address VARCHAR(30),
City VARCHAR(10),
CONSTRAINT PK_person_details PRIMARY KEY(AdharNo, LastName)
);
(2) Foreign Key
• Foreign key is used to link two tables.
• Foreign key for one table is actually a primary key of another table.
• The table containing the foreign key is called the child table and the table containing
the candidate (primary) key is called the parent table.
• Consider
Employee Table
Dept Table:
• Notice that the "EmpID" column in the "Dept" table points to the "EmpID" column in
the "Employee" table.
• The "EmpID" column in the "Employee" table is the PRIMARY KEY in the
"Employee" table.
• The "EmpID" column in the "Dept" table is a FOREIGN KEY in the "Dept" table.
• The FOREIGN KEY constraint is used to prevent actions that would destroy links
between tables.
• The FOREIGN KEY constraint also prevents invalid data from being inserted into
the foreign key column, because it has to be one of the values contained in the table
it points to.
• The purpose of the foreign key constraint is to enforce referential integrity but there
are also performance benefits to be had by including them in your database design.
The table Dept can be created as follows with foreign key constraint.
CREATE TABLE DEPT (
DeptID int,
DeptName VARCHAR(20),
EmpID int,
PRIMARY KEY(DeptID),
FOREIGN KEY (EmpID)
REFERENCES EMPLOYEE(EmpID)
);
(3) Unique
The unique constraint is used to prevent duplicate values in a column. In the EMPLOYEE
table, for example, you might want to prevent two or more employees from having an
identical designation. In that case we must use the unique constraint.
We can set the constraint as unique at the time of creation of table, or if the table is
already created and we want to add the unique constraint then we can use ALTER
command.
For example -
CREATE TABLE EMPLOYEE(
EmpID INT NOT NULL,
Name VARCHAR (20) NOT NULL,
Designation VARCHAR(20) NOT NULL UNIQUE,
Salary DECIMAL (12, 2),
PRIMARY KEY (EmpID)
);
If table is already created then also we can add the unique constraint as follows -
ALTER TABLE EMPLOYEE
MODIFY Designation VARCHAR(20) NOT NULL UNIQUE;
(4) NOT NULL
• By default the column can have NULL values.
• NULL means unknown values.
• We can set the column values as non NULL by using the constraint NOT NULL.
• For example
CREATE TABLE EMPLOYEE(
EmpID INT NOT NULL,
Name VARCHAR (20) NOT NULL,
Designation VARCHAR(20) NOT NULL,
Salary DECIMAL (12, 2) NOT NULL,
PRIMARY KEY (EmpID)
);
(5) CHECK
The CHECK constraint is used to limit the value range that can be placed in a
column.
For example
CREATE TABLE parts (
Part_no int PRIMARY KEY,
Description VARCHAR(40),
Price DECIMAL(10, 2) NOT NULL CHECK(Price > 0)
);
9. Elaborate about embedded SQL and Dynamic SQL with suitable examples.
Embedded SQL
Embedded SQL is a method that combines SQL with a high-level programming
language's features. It enables programmers to put SQL statements right into the
source code files used to set up an application.
Database operations may be carried out effortlessly by developers by adding
SQL statements to the application code. Source code files containing embedded
SQL statements must be pre-processed before compilation, because the high-level
programming language's compiler cannot interpret the SQL statements directly.
The terms EXEC SQL and END_EXEC must be used before and after each SQL
statement in the source code file. In embedded SQL, host variables play a key role.
These variables serve as an intermediary for data transfer between the application
and the database. There are two different kinds of host variables: input host
variables that provide data to the database and output host variables that receive
that data.
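A minimal sketch of embedded SQL in a C host program, assuming the STUDENT table used elsewhere in this unit (the host variable names are illustrative):
EXEC SQL BEGIN DECLARE SECTION;
int std_id; /* input host variable */
char s_name[51]; /* output host variable */
EXEC SQL END DECLARE SECTION;

std_id = 100;
EXEC SQL SELECT NAME INTO :s_name
FROM STUDENT
WHERE STD_ID = :std_id;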
DYNAMIC SQL
The main disadvantage of embedded SQL is that it supports only static SQL. If we
need to build up queries at run time, then we can use dynamic SQL. That means if the
query changes according to user input, then it is always better to use dynamic SQL.
PREPARE
Since dynamic SQL builds a query at run time, as a first step we need to capture all
the inputs from the user. It will be stored in a string variable.
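A minimal sketch, assuming a C host string variable sql_text holding the statement built at run time (the names sql_text and sql_query are illustrative; sql_query matches the EXECUTE example below):
char sql_text[100] = "DELETE FROM STUDENT WHERE STD_ID = 100";
EXEC SQL PREPARE sql_query FROM :sql_text;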
EXECUTE
This statement is used to compile and execute the SQL statements prepared in DB.
EXEC SQL EXECUTE sql_query;
EXECUTE IMMEDIATE
This statement is used to prepare SQL statement as well as execute the SQL
statements in DB. It performs the task of PREPARE and EXECUTE in a single line.
EXEC SQL EXECUTE IMMEDIATE :sql_stmt;
EXEC SQL EXECUTE IMMEDIATE 'GRANT SELECT ON STUDENT TO Faculty';
EXEC SQL EXECUTE IMMEDIATE 'DELETE FROM STUDENT WHERE STD_ID = 100';
EXEC SQL EXECUTE IMMEDIATE 'UPDATE STUDENT SET ADDRESS = ''Troy''
WHERE STD_ID = 100';
10.i) Outline select, project, cartesian product and join operations in relational
algebra with an example.
(ii) Solve the queries for the following database using relational algebra branch
(branch-name, branch-city, assets)
customer (customer-name, customer-street, customer-city)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
(a) Find all loans over $1200
(b) Find the loan number for each loan of an amount greater than $1200
(c) Find the names of all customers who have a loan, an account, or both, from
the bank
(d) Find the names of all customers who have a loan and an account at the bank.
(e) Find the names of all customers who have a loan at the Perryridge branch.
(f) Find the names of all customers who have a loan at the Perryridge branch but
do not have an account at any branch of the bank.
i) Relational Operations
Various types of relational operations are as follows-
(1) Selection:
• This operation is used to fetch the rows or tuples from the table(relation).
• Syntax: The syntax is
σpredicate (relation)
where σ represents the select operation. The predicate denotes some logic using
which the data from the relation (table) is selected.
• For example - Consider the relation student as follows-
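(The original figure is not reproduced here.) As an illustration, assuming the Student relation has a city attribute, the expression
σcity = "Chennai" (Student)
returns only those tuples of Student whose city value is Chennai.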
(2) Projection:
• Project operation is used to project only a certain set of attributes of a relation. That
means if you want to see only the names all of the students in the Student table, then
you can use Project operation.
• Thus to display particular column from the relation, the projection operator is used.
• It will only project or show the columns or attributes asked for, and will also
remove duplicate data from the columns.
• Syntax:
ΠC1, C2, ... (r)
where C1, C2 etc. are attribute names (column names).
• For example - Consider the Student table given in Fig. 1.13.2.
Query: Display the name and age of all the students.
This can be written in relational algebra as
Πsname, age (Student)
Above statement will show us only the Name and Age columns for all the rows of
data in Student table.
• Query: If we want to find out the names of the students who are working in a
company, then
Πname (Student) ∩ Πname (Worker)
(iii) Set-Difference: The result of set difference operation is tuples, which are present
in one relation but are not in the second relation.
Syntax: A - B
For Example: Consider two relations Full_Time_Employee and
Part_Time_Employee, if we want to find out all the employee working for Fulltime,
then the set difference operator is used -
ΠEmpName (Full_Time_Employee) – ΠEmpName (Part_Time_Employee)
(5) Join: The join operation is used to combine information from two or more relations.
Formally, join can be defined as a cross-product followed by selections and
projections; joins arise much more frequently in practice than plain cross-products.
A) Inner Join
There are three types of inner joins used in relational algebra:
i) Conditional join: This is an operation in which information from two tables is
combined using some condition and this condition is specified along with the join
operator.
A ⋈c B = σc (A × B)
Thus ⋈c is defined to be a cross-product followed by a selection. Note that the
condition c can refer to attributes of both A and B. The condition c can be specified
using <, ≤, >, ≥ or = operators.
For example, consider two tables Student and Reserve as follows -
If we want the names of students with sid(Student) = sid(Reserve) and isbn =
005, then we can write it using the Cartesian product as -
σ((Student.sid = Reserve.sid) ∧ (Reserve.isbn = 005)) (Student × Reserve)
Here there are two conditions:
i) (Student.sid = Reserve.sid) and ii) (Reserve.isbn = 005), which are joined
by the ∧ operator.
Now we can use ⋈c instead of the above statement and write it as -
ii) Equijoin: This is a kind of join in which there is an equality condition between two
attributes (columns) of the relations (tables). For example - If there are two tables Book
and Reserve and we want to find the book which is reserved by the student having
isbn 005 and the name of the book is 'DBMS', then:
iii) Natural Join: When there are common columns and we have to equate these
common columns then we use natural join. The symbol for natural join is simply ⋈
without any condition. For example, consider two tables -
Now if we want to list the books that are reserved, then that means we want to match
Books.isbn with Reserve.isbn. Hence it will be simply Books ⋈ Reserve.
ii) Solution:
(a) σamount > 1200 (loan)
(b) Πloan-number (σamount > 1200 (loan))
(c) Πcustomer-name (borrower) ∪ Πcustomer-name (depositor)
(d) Πcustomer-name (borrower) ∩ Πcustomer-name (depositor)
(e) Πcustomer-name (σbranch-name = "Perryridge" (σborrower.loan-number = loan.loan-number (borrower × loan)))
(f) Πcustomer-name (σbranch-name = "Perryridge" (σborrower.loan-number = loan.loan-number (borrower × loan))) – Πcustomer-name (depositor)
UNIT – II – DATABASE DESIGN
PART – A
4) Popular for high level design: The ER model is very popular for high level database design.
ii) Update Anomalies: If one copy of such repeated data is updated then inconsistency is created unless
all other copies are similarly updated.
iii) Insertion Anomalies: Due to insertion of new record repeated information get added to the relation
schema.
iv) Deletion Anomalies: Due to deletion of particular record some other important information associated
with the deleted record get deleted and thus we may lose some other important information from the
schema.
• For example - {A,B} → B is a trivial functional dependency, because B is a subset of
{A,B}. Since {A,B} already includes B, the dependency is satisfied automatically;
determining B from {A,B} adds no new constraint.
A table is said to have multi-valued dependency, if the following conditions are true,
1) For a dependency A → B, if for a single value of A, multiple values of B exist, then the table may have
multi-valued dependency.
2) Also, a table should have at-least 3 columns for it to have a multi-valued dependency.
3) And, for a relation R(A,B,C), if there is a multi-valued dependency between, A and B, then B and C
should be independent of each other.
• Boyce-Codd Normal Form is a higher version of the Third Normal Form.
• A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF.
When the table is in BCNF then it has neither partial functional dependency nor transitive
dependency.
10. Give an example of a relation schema R and set of dependencies such that R is in BCNF but not
in 4NF.
AB→C
ABC→D
AC→B
Here the only key is AB. Thus each functional dependency has a superkey on the left. But the MVD has a
non-superkey on its left, so it is not in 4NF.
There are two properties associated with decomposition and those are -
1) Loss-less Join or non Loss Decomposition: When all information found in the original database is
preserved after decomposition, we call it as loss less or nonloss decomposition.
2) Dependency Preservation: This is a property in which the constraints on the original table can be
maintained by simply enforcing some constraints on each of the smaller relations.
ii) Att(R1) ∩ Att(R2) ≠ Φ
iii) Common attribute must be a key for at least one relation (R1 or R2)
Entity set: The entity set is a set of entities of the same type. For example - All students studying in class X
of the school. Entity sets need not be disjoint. Each entity in an entity set has the same set of attributes,
and the set of attributes distinguishes it from other entity sets. No other entity set will have exactly the
same set of attributes.
Relationship Sets: Relationship is an association among two or more entities. The relationship set is a
collection of similar relationships. For example - Following Fig. shows the relationship works for the two
entities Employee and Departments.
The association between entity sets is called as participation. That is, the entity sets E1, E2,...,
En participate in relationship set R.
The function that an entity plays in a relationship is called that entity's role.
14. Boyce-Codd normal form is found to be stricter than third normal form Justify the statement.
(i) Every relation which is in BCNF is also in 3NF, but every relation which is in 3NF is not
necessarily in BCNF.
(ii) In BCNF every determinant must be a candidate key, whereas 3NF permits a dependency
X → Y in which X is not a superkey, provided Y is a prime attribute.
15. Mention the significance of "participation role name" in the description of relationship types.
Each entity type that participates in a relationship type plays a particular role in the
relationship. The role name signifies the role that a participating entity of an entity type plays in each relationship
instance. In a PREPARED_BY relationship type, EMPLOYEE plays the role of document creator and VOUCHER
plays the role of document created. Entity TEACHER and entity STUDENT are related by a relationship
TEACHER-teach-STUDENT; "teaches" is the participating role between the entity sets TEACHER and STUDENT.
Boyce and Codd Normal Form is a higher version of the Third Normal form. This form deals
with certain type of anomaly that is not handled by 3NF.
A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF.
ii) For each functional dependency (X → Y), X should be a super key. In simple words, even if Y is
a prime attribute, X must still be a superkey (which 3NF does not require).
A relation is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively
dependent on the primary key.
Mapping cardinality expresses the number of entities to which another entity can be associated via a
relationship set.
4NF is more desirable than BCNF because it reduces the repetition of information. If we
consider a BCNF schema not in 4NF, we observe that decomposition into 4NF does not lose information
provided that a lossless join decomposition is used, yet redundancy is reduced.
PART – B
1. Describe the notation used in E-R diagram. Explain the E-R model structure with Example.
2. Develop an Entity Relationship model for a library management system. Clearly state the
problem Definition, Description, Business Rule and any assumption you make.
1. Problem Definition
To design a database system for a library that manages books, borrowers, and their
transactions. The system should facilitate the management of book information, borrower records, issue
and return of books, and overdue penalties. It must ensure data consistency and provide efficient
access to information.
2. Description
The library management system consists of the following key components:
● Books: Information about books available in the library.
● Borrowers: Information about library members who borrow books.
● Transactions: Records of book issues and returns.
● Staff: Library employees managing operations.
3. Business Rules
3. A borrower can borrow multiple books but not more than a specified limit (e.g., 3 books at a
time).
4. Each transaction records the issue date, due date, and return date.
5. If a book is returned after the due date, a fine is imposed based on the number of overdue
days.
8. Library staff are responsible for issuing and returning books.
4. Assumptions
5. ER Model Components
1. Book
o Title
o Author
o Publisher
o ISBN
o CopiesAvailable
2. Borrower
o Name
o ContactInfo
o MembershipDate
3. Staff
o StaffID (Primary Key)
o Name
o Role
o ContactInfo
4. Transaction
o IssueDate
o DueDate
o ReturnDate
o Fine
Relationships:
6. ER Diagram
Example:
This model captures the functional requirements of the library management system and ensures
data integrity, efficient querying, and scalability.
3. Develop an Entity Relationship model for a restaurant system in which waiters send orders to the food
preparation staff (chefs) and finalize the customer's bill.
The Food preparation staffs (Chefs), with their touch-display interfaces to the system, are able
to view orders sent to the kitchen by waiters. During preparation, they are able to let the waiter
know the status of each item, and can send notifications when items are completed. The system
should have full accountability and logging facilities, and should support Supervisor actions to
account for exceptional circumstances, such as a meal being refunded or walked out on.
1. Problem Definition
To design a database system that manages food orders, preparation status, customer bills,
and accountability for exceptional cases in a restaurant. The system must:
3. Maintain full accountability for all actions, including refunds or walk-outs.
2. Description
● Waiters: Staff members responsible for taking orders and serving food.
3. Business Rules
2. Each order can consist of multiple items, and each item is prepared by chefs.
3. Chefs update the status of each item as "Preparing," "Ready," or "Cancelled."
5. Bills are generated based on the items in the order and their prices.
6. Supervisors can authorize refunds or cancel bills in exceptional situations (e.g., customer
dissatisfaction or walk-outs).
7. The system must log every action for accountability, including order updates, notifications,
and supervisor interventions.
4. Assumptions
1. Each chef, waiter, and supervisor has a unique Staff ID.
4. Exceptional circumstances are rare but must be logged and authorized.
5. ER Model Components:
1. Chef
o Name
o Specialization
2. Waiter
o Name
3. Supervisor
o Name
4. Order
o TableNumber
o Timestamp
5. OrderItem
o Timestamp
6. Item
o ItemID (Primary Key)
o Name
o Price
7. Bill
o TotalAmount
8. ActionLog
o Timestamp
o Description
Relationships
6. ER Diagram
1. Entities: Chef, Waiter, Supervisor, Order, OrderItem, Item, Bill, ActionLog.
2. Waiters are notified when items are ready (status updates logged in ActionLog).
3. Bills are accurately generated and linked to orders (via Bill entity).
4. Supervisors can handle refunds or cancellations and ensure all actions are logged for
accountability.
This ER model provides a scalable and efficient design for managing the restaurant's operations
and ensuring data consistency.
5. Explain the Database Design Process in ER model. Draw the ER diagram for banking systems.
(Home loan Application).
Entity Relational model is a model for identifying entities to be represented in the database
and representation of how those entities are related.
Design Phases
Following are the six steps of database design process. The ER model is most relevant to
first three steps.
Step 1: Requirement analysis:
• In this step, it is necessary to understand what data need to be stored in the database,
what applications must be built, what are all those operations that are frequently used by the system.
• The requirement analysis is an informal process and it requires proper communication with
user groups.
• There are several methods for organizing and presenting information gathered in this step.
Step 2: Conceptual database design:
• This is the step in which the E-R Model i.e. Entity Relationship model is built.
• The goal of this design is to create a simple description of data that matches with the
requirements of users.
Step 3: Logical database design:
• In this step, the conceptual schema (E-R design) is converted into a relational database schema.
Step 4: Schema refinement:
• In this step, the relational database schema is analyzed to identify potential problems
and to refine it.
• The schema refinement can be done with the help of normalizing and restructuring the
relations.
Step 5: Physical database design:
• The tasks that are performed in this step are building indexes on tables, clustering tables,
and redesigning some parts of the schema obtained from earlier design steps.
Step 6: Application and security design:
• Using design methodologies like UML (Unified Modeling Language) the design of the
database can be accomplished.
• The role of each entity in every process must be reflected in the application task.
• For each role, there must be a provision for accessing some part of the database and
prohibition of access to some other part of the database.
• Thus some access rules must be enforced on the application(which is accessing the database)
to protect the security features.
6. Consider the following tables:
Employee (Emp_no, Name, Emp_city)
Company (Emp_no, Company_name, Salary)
i) Write a SQL query to display Employee name and company name.
ii) Write a SQL query to display employee name, employee city ,company name and salary of all
the employees whose salary >10000
iii) Write a query to display all the employees working in "XYZ company"
iv) Write a query to display employees working in same city.
i) Query to display Employee name and Company name
We need to join the Employee and Company tables using the common column Emp_no.
SELECT Employee.Name, Company.Company_name
FROM Employee
INNER JOIN Company
ON Employee.Emp_no = Company.Emp_no;
ii) Query to display employee name, employee city, company name, and salary of employees
whose salary > 10000
Add a WHERE clause to filter employees with Salary > 10000.
SELECT Employee.Name, Employee.Emp_city, Company.Company_name, Company.Salary
FROM Employee
INNER JOIN Company
ON Employee.Emp_no = Company.Emp_no
WHERE Company.Salary > 10000;
iii) Query to display all employees working in "XYZ company"
Add a WHERE condition to filter rows based on Company_name = 'XYZ company'.
SELECT Employee.Name, Employee.Emp_city, Company.Company_name
FROM Employee
INNER JOIN Company
ON Employee.Emp_no = Company.Emp_no
WHERE Company.Company_name = 'XYZ company';
iv) Query to display employees working in the same city
For this, we need to join the Employee table with itself (self-join) to compare employees from
the same city.
SELECT E1.Name AS Employee1, E2.Name AS Employee2, E1.Emp_city
FROM Employee E1
INNER JOIN Employee E2
ON E1.Emp_city = E2.Emp_city AND E1.Emp_no <> E2.Emp_no;
7. Consider the following relations:
Employee (empno, name, office, age)
Books (isbn, title, authors, publisher)
Loan (empno, isbn, date)
a) Find the names of employees who have borrowed a book Published by McGraw-Hill.
b) Find the names of employees who have borrowed all books Published by McGraw-Hill.
c) Find the names of employees who have borrowed more than five different books
published by McGraw-Hill.
d) For each publisher, find the names of employees who have borrowed more than five
books of that publisher.
a) SQL:
select distinct employee.name
from employee, books, loan
where employee.empno = loan.empno
and books.isbn = loan.isbn and books.publisher = 'McGraw-Hill';
b) SQL:
select name from employee E
where not exists
((select isbn from books where publisher = 'McGraw-Hill')
except
(select books.isbn from loan, books
where loan.isbn = books.isbn and loan.empno = E.empno));
c) SQL:
select name from employee, loan where employee.empno = loan.empno and isbn in (select
distinct isbn from books where publisher = 'McGraw-Hill') group by employee.empno, name
having count(distinct isbn) > 5;
d) SQL:
select name from employee, loan, books where employee.empno = loan.empno and
books.isbn = loan.isbn group by employee.empno, name, books.publisher
having count(distinct loan.isbn) > 5;
8. Explain the following join algorithms with examples: i) Nested loop join ii) Block nested
loop join iii) Merge join iv) Hash join
i) Nested Loop Join
Definition:
The Nested Loop Join is a basic join algorithm where each tuple of one relation (outer relation) is
compared with every tuple of the other relation (inner relation) to find matching pairs based on the
join condition.
Steps:
1. For each tuple in the outer relation, scan all tuples in the inner relation.
Example:
Consider two relations:
● R1: {A: 1, 2}
● R2:{B:1,3}
Join condition: R1.A = R2.B.
Time Complexity:
● Without indexing: O(m × n), where m and n are the sizes of the two relations.
Advantages:
Disadvantages:
ii) Block Nested Loop Join
Definition:
A variation of the nested loop join that reduces the number of disk I/O operations by reading the
tuples of the outer relation in blocks (pages) rather than one tuple at a time.
Steps:
2. For each block of the outer relation, compare its tuples with all tuples of the inner relation.
Optimization:
● Load a block of tuples from R1 and compare with all tuples in R2.
● Repeat for the next block until all blocks of R1 are processed.
Advantages:
● More efficient than the basic nested loop join, especially when the outer relation is much
larger.
Disadvantages:
● Still slower than other optimized joins like hash join or merge join for large datasets.
iii) Merge Join
Definition:
The Merge Join (or Sort-Merge Join) works by first sorting both relations on the join key and then
merging the two sorted lists to find matching tuples.
Steps:
2. Use two pointers, one for each relation, and scan through both relations simultaneously.
4. If the keys do not match, move the pointer for the relation with the smaller value.
Example:
R1: {A: 1, 2, 4} (already sorted)
R2: {B: 1, 3, 4} (already sorted)
Time Complexity:
● Merging: O(m + n)
Advantages:
● Efficient for sorted relations.
Disadvantages:
iv) Hash Join
Definition:
The Hash Join is an efficient join algorithm that uses hashing to partition and match tuples.
It works best for equality joins.
Steps:
1. Build Phase: Create a hash table for the smaller relation (usually the inner relation) based
on the join key.
2. Probe Phase: For each tuple in the larger relation (outer relation), use the hash function to
check for matches in the hash table.
Example:
R1: {A: 1, 2}
R2: {B: 1, 3}
Join condition: R1.A = R2.B.
● Build phase: Create a hash table for R2 → Hash(B=1) → Slot 1, Hash(B=3) → Slot 3.
● Probe phase: For R1, hash A=1 → Match in Slot 1. Hash A=2 → No match.
Result: {(1, 1)}
Advantages:
● Very efficient for large relations when the smaller relation fits in memory.
● No sorting required.
Disadvantages:
● Not suitable for non-equality joins or large inner relations that do not fit in memory.
9. A software contract and consultancy firm maintains details of all the various projects in which its
employees are currently involved. These details comprise:
• Employee Number
• Employee Name
• Date of Birth
• Department Code
• Project Description
• Project Supervisor
• Project Code, Project Description and Project Supervisor are repeating fields.
Final Structure:
1. Employee Table contains employee-specific details (removing redundancy).
2. Department Table ensures department details are stored once and referenced via
Department Code.
3. Project Table stores project-specific details (removing redundancy of repeating fields).
4. Employee-Project Table establishes a many-to-many relationship between employees and
projects.
This normalized structure eliminates redundancy, ensures data integrity, and adheres to the rules of 3NF.
10. A car rental company maintains a database for all vehicles in its current fleet. For all vehicles, it
includes the vehicle identification number, license number, manufacturer, model, date of purchase,
and colour. Special data are included for certain types of vehicles.
UNIT – III – TRANSACTIONS
PART – A
• When we perform read or write operations on the database, those changes can be
undone by the rollback operation. To make these changes permanent, we should make use of commit.
● In a database, each transaction should maintain ACID property to meet the consistency and
integrity of the database. These are
(1) Atomicity
(2) Consistency
(3) Isolation
(4) Durability
Schedule S2 is a serial schedule because, in this, all operations of T1 are performed before
starting any operation of T2. Schedule S1 can be transformed into a serial schedule by swapping
non-conflicting operations of S1. Hence both of the above schedules are conflict equivalent.
10.What is the difference between shared lock and exclusive lock?
A shared lock (Lock-S) is used only for reading a data item; any number of transactions
can hold a shared lock on the same item simultaneously. An exclusive lock (Lock-X) is
used for both reading and writing; it can be held by only one transaction at a time, and
no other lock can be granted on that item until it is released.
11.What benefit does strict two-phase locking provide? Give its disadvantages.
Benefits:
1. This ensures that any data written by an uncommitted transaction is locked in exclusive
mode until the transaction commits, preventing other transactions from reading that data.
2. This protocol solves dirty read problem.
Disadvantage:
1. Concurrency is reduced.
12.What is rigorous two- phase locking protocol?
This is a stricter two-phase locking protocol. Here all locks are to be held until the transaction
commits.
13.Why is it necessary to have control of concurrent execution of transactions? How is it made
possible?
Concurrent execution of transactions is necessary to improve system performance and
resource utilization, but it can lead to issues like data inconsistency, deadlock, and race conditions.
Without proper control, problems such as the lost update, dirty read, and unrepeatable read can
occur, compromising data integrity.
Control of concurrent execution is achieved through concurrency control techniques like:
1. Lock-based protocols: Ensuring exclusive or shared access to data items.
2. Timestamp ordering: Ordering transactions based on their start time.
3. Multiversion concurrency control (MVCC): Maintaining multiple versions of data.
4. Optimistic concurrency control: Detecting conflicts at the transaction commit phase.
These techniques ensure that the database maintains ACID properties (Atomicity,
Consistency, Isolation, Durability).
14.Define deadlock.
Deadlock is a situation in which two or more transactions each hold a lock and wait
for locks currently held by one of the other transactions.
15.List four conditions for deadlock.
1. Mutual exclusion condition
2. Hold and wait condition
3. No pre-emption condition
4. Circular wait condition
16.List the responsibilities a DBMS has whenever a transaction is submitted to the system for
execution.
The system is responsible for making sure that -
(1) Either all the operations in the transaction are completed successfully and their effect is
recorded permanently in the database, or
(2) The transaction has no effect whatsoever on the database or on any
other transaction.
17.Give any two violations that may occur if a transaction executes at a lower isolation level than
serializable.
If a transaction executes at a lower isolation level than serializable, the following two
violations may occur:
1. Dirty Read:
A transaction may read uncommitted changes made by another transaction. If the other
transaction rolls back, the data read becomes invalid, leading to inconsistencies.
Example: Transaction T1 updates a record but hasn't committed, and Transaction T2 reads the
uncommitted value.
2. Non-Repeatable Read:
A transaction reads the same data twice but gets different results because another
transaction modifies the data in between.
Example: Transaction T1 reads a record, then Transaction T2 updates or deletes the record,
causing T1 to see inconsistent data when it reads the same record again.
18.What is meant by log-based recovery?
Log-based recovery in DBMS refers to a technique where all the changes made to the
database by transactions are recorded in a log file. This log contains details such as the transaction
ID, the operations performed, and the data affected, ensuring that the database can be recovered to
a consistent state in case of failure.
● Undo: If a transaction fails, the log is used to roll back the changes made by the transaction.
● Redo: If a committed transaction's changes are lost due to a crash, the log is used to reapply
those changes.
This ensures the database maintains atomicity and durability (ACID properties).
19.What is a transaction?
A transaction is a sequence of one or more operations performed on a database that
functions as a single logical unit of work. A transaction ensures the ACID properties:
● Atomicity: All operations in the transaction are executed fully or not at all.
● Consistency: The database remains in a valid state before and after the transaction.
● Isolation: Transactions do not interfere with each other.
● Durability: Once a transaction is committed, changes are permanent.
Example: Transferring money between two bank accounts involves debiting one account and
crediting another as a single transaction.
20. What do you mean by phantom problem?
The phantom problem in DBMS occurs when a transaction retrieves a set of rows based on
a condition, but another transaction inserts or deletes rows that match the same condition. This
leads to inconsistent results if the first transaction re-executes the query.
Example:
1. Transaction T1 reads rows from a table where salary > 5000.
2. Transaction T2 inserts a new row with salary = 6000 and commits.
3. When T1 re-executes the query, it sees the new row, causing inconsistent results.
The phantom problem occurs under lower isolation levels and can be prevented using the
serializable isolation level by applying range locks.
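A minimal SQL sketch of this prevention, assuming an employee table with a salary column:
-- Transaction T1
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM employee WHERE salary > 5000;
-- Under SERIALIZABLE, a concurrent insert of a row with salary = 6000 by T2
-- is either blocked by a range lock or forces one transaction to abort, so
-- re-executing the query can not see a phantom row.
COMMIT;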
1. Explain in detail about Lock based protocols and Timestamp based protocols.
a) Lock Based Protocol
• Concept of Protocol: The lock-based protocol is a mechanism in which a transaction
must acquire an appropriate lock on a data item before operating on it, giving it exclusive
or shared use of that item.
• Types of Locks: There are two types of locks used -
i) Shared Lock: The shared lock is used for reading data items only. It is denoted by
Lock-S. This is also called as read lock.
ii) Exclusive Lock: The exclusive lock is used for both read and write operations. It is
denoted as Lock-X. This is also called as write lock.
• The compatibility matrix is used while working on set of locks. The concurrency control
manager checks the compatibility matrix before granting the lock. If the two modes of
transactions are compatible to each other then only the lock will be granted.
• A set of locks may consist of shared or exclusive locks. Following matrix represents
the compatibility between modes of locks.
Here T stands for True and F stands for False. If the concurrency control manager gets the
compatibility mode as True then it grants the lock, otherwise the lock is denied.
• For example: If the transaction T1 is holding a shared lock on data item A, then the
concurrency control manager can grant the shared lock to transaction T2, as the compatibility is True.
But it cannot grant the exclusive lock, as the compatibility is False. In simple words, if
transaction T1 is reading a data item A then the same data item A can be read by another
transaction T2 but cannot be written by another transaction.
Similarly, if an exclusive lock (i.e. lock for read and write operations) is held on the data
item by some transaction then no other transaction can acquire a shared or exclusive lock, as
the compatibility function denotes F. That means if some transaction is writing a data item
A then another transaction cannot read or write that data item A.
Hence the rule of thumb is
i) Any number of transactions can hold a shared lock on an item.
ii) But an exclusive lock can be held by only one transaction.
• Example of a schedule denoting shared and exclusive locks: Consider the following schedule
in which initially A = 100. We deduct 50 from A in transaction T1 and read the data item A in
transaction T2. The scenario can be represented with the help of locks and the concurrency
control manager as follows:
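The original schedule figure is not shown; a rough SQL analogue of the same scenario, assuming a hypothetical account table and locking-read syntax as in e.g. PostgreSQL:
-- Transaction T1: deduct 50 from A under an exclusive lock
BEGIN;
SELECT balance FROM account WHERE name = 'A' FOR UPDATE; -- Lock-X(A)
UPDATE account SET balance = balance - 50 WHERE name = 'A';
COMMIT; -- Lock-X(A) released
-- Transaction T2: read A under a shared lock
BEGIN;
SELECT balance FROM account WHERE name = 'A' FOR SHARE; -- Lock-S(A); waits while T1 holds Lock-X(A)
COMMIT;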
b) Timestamp Based Protocol
• In a timestamp-based protocol, each transaction is assigned a unique timestamp when it
enters the system, and conflicting operations are executed in timestamp order.
• For example, suppose we have two transactions T1 and T2 with timestamps 10 sec and 20
sec respectively; T1 is the older transaction, so its conflicting operations are given priority.
• There are two types of serializability: conflict serializability and view serializability.
Conflict Serializability
• Definition: Suppose T1 and T2 are two transactions and I1 and I2 are instructions
in T1 and T2 respectively. Then the two instructions conflict if both access the same
data item d and at least one of them is a write operation; a schedule is conflict
serializable if it is conflict equivalent to some serial schedule.
• What is conflict? In the definition three conditions are specified for a conflict in
conflict serializability -
1) There should be different transactions
2) The operations must be performed on same data items
3) One of the operations must be the Write(W) operation
• We can test a given schedule for conflict serializability by constructing a precedence
graph for the schedule, and by searching for absence of cycles in the graph.
• Precedence graph is a directed graph G = (V,E) where V is the set of vertices
and E is the set of edges. The set of vertices consists of all the transactions participating
in the schedule. The set of edges consists of all edges Ti→Tj for which one of three
conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
• A serializability order of the transactions can be obtained by finding a linear order
consistent with the partial order of the precedence graph. This process is called
topological sorting.
Testing for serializability
Following method is used for testing the serializability: To test the conflict serializability
we can draw a graph G = (V,E) where V = vertices which represent the
transactions and E = edges for conflicting pairs.
Step 1: Create a node for each transaction.
Step 2: Find the conflicting pairs (RW, WR, WW) on the same variable (or data item)
by different transactions.
Step 3: Draw edges for the given schedule. Consider the following cases:
1. Ti executes write(Q) before Tj executes read(Q), then draw an edge from Ti to Tj.
2. Ti executes read(Q) before Tj executes write(Q), then draw an edge from Ti to Tj.
3. Ti executes write(Q) before Tj executes write(Q), then draw an edge from Ti to Tj.
Step 4: Now, if the precedence graph is cyclic then it is a non-conflict-serializable schedule,
and if the precedence graph is acyclic then it is a conflict serializable schedule.
Example 3.5.1 Consider the following two transactions and schedule (time goes from
top to bottom). Is this schedule conflict-serializable? Explain why or why not.
Solution :
Step 1: To check whether the schedule is conflict serializable or not we will check from
top to bottom. Thus we will start reading from top to bottom as
T1:R(A) -> T1:W(A) -> T2:R(A) -> T2:R(B) -> T1:R(B) -> T1:W(B)
Step 2: We will find conflicting operations. Two operations are called as conflicting
operations if all the following conditions hold true for them-
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation
From above given example in the top to bottom scanning we find the conflict as
T1:W(A)->T2:R(A).
i) Here note that there are two different transactions T1 and T2,
ii) Both work on same data item i.e. A and
iii) One of the operation is write operation.
Step 3: We will build a precedence graph by drawing one node from each transaction.
In above given scenario as there are two transactions, there will be two nodes namely
T1 and T2
Step 4: Draw the edge between conflicting transactions. For example in above given
scenario, the conflict occurs while moving from T1:W(A) to T2:R(A). Hence edge must
be from T1 to T2.
Step 5: Repeat the step 4 while reading from top to bottom. Finally the precedence
graph will be as follows
Step 6: Check if any cycle exists in the graph. A cycle is a path using which we can start
from one node and reach the same node. If a cycle is found then the schedule is not
conflict serializable. In step 5 we got a graph with a cycle; that means the given
schedule is not conflict serializable.
Example 3.5.2 Check whether following schedule is conflict serializable or not. If it is
not conflict serializable then find the serializability order.
Solution:
Step 1: We will read from top to bottom, and build a precedence graph for conflicting
entries. We will build a precedence graph by drawing one node for each transaction.
In the above given scenario, as there are three transactions, there will be three nodes,
namely T1, T2 and T3.
Step 4: As there is no cycle in the precedence graph, the given sequence is conflict
serializable. Hence, we can convert this non serial schedule to serial schedule. For
that purpose, we will follow these steps to find the serializable order.
Step 5: A serializability order of the transactions can be obtained by finding a linear
order consistent with the partial order of the precedence graph. This process is called
topological sorting.
Step 6: Find the vertex which has no incoming edge which is T1. If we delete T1 node
then T3 is a node that has no incoming edge. If we delete T3, then T2 is a node that
has no incoming edge.
Thus, the nodes can be deleted in the order T1, T3 and T2. Hence the order will be
T1-T3-T2.
Example 3.5.3 Check whether the below schedule is conflict serializable or not.
{b2, r2(X), b1, r1(X), w1(X), r1(Y), w1(Y), w2(X), e1, c1, e2, c2}
Solution: b2 and b1 represent begin of transaction 2 and begin of transaction 1. Similarly,
e1 and e2 represent end of transaction 1 and end of transaction 2.
We will rewrite the schedule as follows-
Step 1: We will find conflicting operations. Two operations are called as conflicting
operations if all the following conditions hold true for them -
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation.
The conflicting entries are as follows –
As there are two transactions only two nodes are present in the graph.
Step 3: We get a graph with cycle, that means given schedule is not conflict
serializable.
Example 3.5.4 Consider the three transactions T1, T2, and T3 and schedules S1 and
S2 given below. Determine whether each schedule is serializable or not? If a schedule
is serializable write down the equivalent serial schedule(S).
T1: R1(x); R1(z); W1(x);
T2: R2(x); R2(y); W2(z); W2(y);
T3: R3(x); R3(y); W3(y);
S1: R1(x); R2(z); R1(z); R3(x); R3(y); W1(x); W3(y); R2(y); W2(z); W2(y);
S2: R1(x); R2(z); R3(x); R1(z); R2(y); R3(y); W1(x); W2(z); W3(y); W2(y);
Solution:
Step 1: We will represent the schedule S1 as follows
Step (a): We will find conflicting operations. Two operations are called as conflicting
operations if all the following conditions hold true for them -
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation
The conflicting entries are as follows -
Step 2: We will represent the schedule S2 similarly.
Step (a): We will find conflicting operations. Two operations are called as conflicting
operations if all the following conditions hold true for them -
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation
The conflicting entries are as follows -
Solution:
Step 1: We will read from top to bottom, and build a precedence graph for conflicting
entries.
The conflicting entries are as follows-
Step 3: There is no cycle in the precedence graph. That means this schedule is
conflict serializable. Hence, we can convert this non serial schedule to serial schedule.
For that purpose, we will follow the following steps to find the serializable order.
1) Find the vertex which has no incoming edge which is T1.
2) Then find the vertex having no outgoing edge which is T2. In between them there is
no other transaction.
3) Hence the order will be T1-T2.
View Serializability
• If a given schedule is found to be view equivalent to some serial schedule, then it is
called a view serializable schedule.
• View Equivalent Schedule: Consider two schedules S1 and S2 consisting of
transactions T1 and T2 respectively, then schedules S1 and S2 are said to be view
equivalent schedule if it satisfies following three conditions:
• If transaction T1 reads a data item A from the database initially in schedule S1, then in
schedule S2 also, T1 must perform the initial read of the data item A from the database.
This is the same for all the data items. In other words, the initial reads must be the same for
all data items.
• If data item A has been updated at last by transaction T1 in schedule S1, then in
schedule S2 also, the data item A must be updated at last by transaction T1.
• If transaction T1 reads a data item that has been updated by the transaction T1 in
schedule S1 then in schedule S2 also, transaction T1 must read the same data item
that has been updated by transaction T1. In other words the Write-Read sequence
must be same.
When T1 writes the value of C, only then can T2 read it; and when T2 writes the value of C, only then can transaction T3 read it. But if transaction T1 fails, then transactions T2 and T3 must automatically be rolled back as well. This is the cascading rollback problem.
Simple two-phase locking does not solve the cascading rollback problem. To solve it, two stricter types of two-phase locking mechanisms can be used.
Types of Two-Phase Locking
(1) Strict two-phase locking: The strict 2PL protocol is the basic two-phase protocol with the additional requirement that all exclusive-mode locks be held until the transaction commits. In other words, all exclusive locks are released only after the transaction has committed. That means if T1 holds an exclusive lock, T1 will release it only after its commit operation, and only then is another transaction allowed to read or write the item. For example, consider two transactions
If we apply the locks then
Thus only after the commit operation in T1 can we release the exclusive lock. This ensures strict serializability.
Thus, compared to the basic two-phase locking protocol, the advantage of the strict 2PL protocol is that it ensures strict serializability.
(2) Rigorous two-phase locking: This is a stricter two-phase locking protocol. Here all locks (shared as well as exclusive) are held until the transaction commits. The transactions can be serialized in the order in which they commit.
Example - Consider transactions
This lock-unlock instruction sequence satisfies the requirements of strict two-phase locking for the given transactions.
The execution of these transactions can result in deadlock. Consider the following partial execution scenario which leads to deadlock.
Lock Conversion
Lock conversion is a mechanism within two-phase locking which allows the conversion of a shared lock into an exclusive lock (upgrade), or of an exclusive lock into a shared lock (downgrade).
Method of Conversion :
First Phase:
• can acquire a lock-S on item
• can acquire a lock-X on item
• can convert a lock-S to a lock-X (upgrade)
Second Phase:
• can release a lock-S
• can release a lock-X
• can convert a lock-X to a lock-S (downgrade)
This protocol assures serializability, but it still relies on the programmer to insert the various locking instructions.
For example - Consider following two transactions -
Here, if we start applying locks, then we must apply the exclusive lock on data item A, because we have to both read and write data item A. Another transaction T2 cannot get a shared lock on A until transaction T1 has performed its write operation on A. Since transaction T1 needs the exclusive lock only at the end, when it performs the write operation on A, it is better if T1 could initially lock A in shared mode and later change the lock to exclusive mode when it performs the write operation. In such a situation, the lock conversion mechanism becomes useful.
When we convert the shared mode lock to exclusive mode then it is called upgrading
and when we convert exclusive mode lock to shared mode then it is called
downgrading.
Also note that upgrading can take place only in the growing phase and downgrading only in the shrinking phase. Thus we can refine the above transactions using the lock conversion mechanism, as in the sketch below.
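Below is a toy Python sketch (not a real DBMS lock manager; the class and method names are made up for illustration) of how an upgrade request might be handled: T1 first takes a shared lock on A, upgrades it to exclusive only when it is ready to write, and a second transaction is refused while the exclusive lock is held.

class LockManager:
    def __init__(self):
        self.locks = {}                          # item -> (mode, set of holders)

    def lock_s(self, txn, item):
        mode, holders = self.locks.get(item, ("S", set()))
        if mode == "X":
            return txn in holders                # only the exclusive holder may proceed
        holders.add(txn)
        self.locks[item] = ("S", holders)
        return True

    def upgrade(self, txn, item):                # S -> X, allowed in growing phase
        mode, holders = self.locks[item]
        if holders == {txn}:                     # no other sharers: safe to upgrade
            self.locks[item] = ("X", holders)
            return True
        return False                             # must wait for the other sharers

    def downgrade(self, txn, item):              # X -> S, allowed in shrinking phase
        mode, holders = self.locks[item]
        if mode == "X" and holders == {txn}:
            self.locks[item] = ("S", holders)
            return True
        return False

lm = LockManager()
print(lm.lock_s("T1", "A"))    # True: T1 reads A under a shared lock
print(lm.upgrade("T1", "A"))   # True: upgraded to exclusive just before the write
print(lm.lock_s("T2", "A"))    # False: T2 must wait while T1 holds lock-X(A)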
4. Write in detail about the immediate update and deferred update recovery
techniques.
Deferred Database Modification
• In this technique, the database is not updated immediately.
• Only the log file is updated on each transaction operation.
• Only when the transaction reaches its commit point is the database physically updated from the log file.
• In this technique, if a transaction fails before reaching its commit point, it will not have changed the database in any way, so there is no need for an UNDO operation. A REDO operation is required to apply the operations recorded in the log file to the physical database. Hence the deferred database modification technique is also called the NO-UNDO/REDO algorithm.
• For example: Consider two transactions T1 and T2 as follows:
If T1 and T2 are executed serially with initial values of A = 100, B = 200 and C = 300, we show the state of the log and the database if a crash occurs
a) Just after write(B, b)
b) Just after write(C, c)
c) Just after <T2, commit>
The result of the above 3 scenarios is as follows:
Initially the log and database will be
Immediate Database Modification
• In this technique, the database may be updated immediately after a write operation, without waiting for the commit point. Both the old and the new values are recorded in the log, so uncommitted transactions must be UNDOne after a crash and committed ones REDOne; hence this technique is also called the UNDO/REDO algorithm.
Here T1 and T2 are executed serially. Initially A = 100, B = 200 and C = 300.
If the crash occurs
i) Just after write(B, b) ii) Just after write(C, c) iii) Just after <T2, commit>
then, using the immediate database modification approach, the result of the above three scenarios can be elaborated as follows:
The contents of the log and database are as follows:
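A minimal Python sketch of the deferred-modification idea is given below (the new values 90 and 210 are assumed for illustration, since the figure with the actual log contents is not reproduced here): every write goes only to the log, and the database is touched only at commit, so a crash before commit needs no UNDO.

database = {"A": 100, "B": 200, "C": 300}
log = []                                   # list of (txn, item, new_value) records

def write(txn, item, value):
    log.append((txn, item, value))         # defer the physical update

def commit(txn):
    log.append((txn, "<commit>", None))
    for t, item, value in log:             # REDO: copy the deferred writes to the DB
        if t == txn and item != "<commit>":
            database[item] = value

write("T1", "A", 90)
write("T1", "B", 210)
# A crash at this point loses nothing: the database was never touched (NO UNDO).
commit("T1")                               # REDO installs A = 90, B = 210
print(database)                            # {'A': 90, 'B': 210, 'C': 300}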
• When a process is created, it must declare its maximum claim, i.e. the maximum number of resource units it may ever request. The resource manager can grant a request if the resources are available.
• For example, if process 1 requests a resource held by process 2, the system makes sure that process 2 is not waiting for a resource held by process 1.
Banker's Algorithm
• Banker's algorithm is a deadlock avoidance algorithm. It is so named because the algorithm is modelled after a banker who makes loans from a pool of capital and receives payments that are returned to that pool.
• The algorithm checks whether granting a request leads to an unsafe state. If it does, the request is denied; if granting the request leaves the system in a safe state, it is carried out.
• Dijkstra proposed this algorithm to regulate resource allocation so as to avoid deadlocks. The banker's algorithm is the best known of the avoidance methods.
• By using the avoidance method, the system is always kept in a safe state.
• It is easy to check whether a deadlock would be created by granting a request; the deadlock analysis method is used for this.
• Deadlock avoidance uses a worst-case analysis to check for future deadlocks.
• Safe state: There is at least one sequence of resource allocations to processes that
does not result in a deadlock.
• The system is in a safe state only if there exists a safe sequence. A safe state is not a deadlocked state.
• A deadlocked state is an unsafe state, but an unsafe state does not necessarily mean the system is in a deadlock.
• As long as the state is safe, the resource manager can be guaranteed to avoid a
deadlock.
• Initially the system is in a safe state. When a process requests a resource and that resource is available, the system must decide whether the resource can be allocated immediately or whether the process must wait.
• If system remains in safe state after allocating resource, then only OS allocates
resources to process.
• The Banker's algorithm uses the following data structures:
1. Allocation: A table in which each row represents a process (P) and each column represents a resource (R).
alloc[i, j] = number of units of resource Rj held by process Pi.
2. Max: The maximum number of resources that a process may require during its execution.
3. Need (claim): The current claim of a process, equal to its maximum need minus its current allocation:
Need = Max - Allocation
4. Available: The number of resources still available for allocation, equal to the total number of resources minus the sum of the allocations to all processes in the system:
Available = Number of resources - Sum of the allocations to all processes
          = Number of resources - Σ(i=1..n) Allocation(Pi)
• No process may request more than the total number of resources in the system. Each process must also guarantee that, once it is allocated a resource, it will return that resource to the system within a finite time.
Weakness of Banker's algorithm
1. It requires that there be a fixed number of resources to allocate.
2. The algorithm requires that users state their maximum needs (request) in advance.
3. Number of users must remain fixed.
4. The algorithm requires that the banker grant all requests within a finite time.
5. The algorithm requires that processes return all resources within a finite time.
Examples on Banker's algorithm
1. System consists of five processes (P1, P2, P3, P4, P5) and three resources (R1, R2,
R3). Resource type R1 has 10 instances, resource type R2 has 5 instances and
R3 has 7 instances. The following snapshot of the system has been taken :
Currently the system is in safe state.
Safe sequence: The safe sequence is calculated as follows:
1) The need of each process is compared with available. If need ≤ available, the resources can be allocated to that process; the process runs to completion and releases all its resources.
2) If need is greater than available, the need of the next process is examined.
3) In the above example, the need of process P1 is (7, 4, 3) and available is (3, 3, 2).
need ≤ available → (7, 4, 3) ≤ (3, 3, 2) → False
So the system moves on to the next process.
4) The need of process P2 is (1, 2, 2) and available is (3, 3, 2):
need ≤ available (work) → (1, 2, 2) ≤ (3, 3, 2) → True, so Finish[P2] = True.
The request of P2 is granted; P2 runs to completion and releases its resources to the system.
Work = Work + Allocation = (3, 3, 2) + (2, 0, 0) = (5, 3, 2)
This procedure is continued for all processes.
5) The need of P3, (6, 0, 0), is compared with the new available (5, 3, 2):
(6, 0, 0) ≤ (5, 3, 2) → False, so P3 is skipped.
6) The need of P4, (0, 1, 1), is compared with available (5, 3, 2):
(0, 1, 1) ≤ (5, 3, 2) → True
Work = Work + Allocation = (5, 3, 2) + (2, 1, 1) = (7, 4, 3) (new available)
7) The need of P5, (4, 3, 1), is compared with available (7, 4, 3):
(4, 3, 1) ≤ (7, 4, 3) → True
Work = Work + Allocation = (7, 4, 3) + (0, 0, 2) = (7, 4, 5) (new available)
8) One pass is completed. The system again takes all remaining processes in sequence. The need of P1, (7, 4, 3), is compared with the new available (7, 4, 5):
(7, 4, 3) ≤ (7, 4, 5) → True
Work = Work + Allocation = (7, 4, 5) + (0, 1, 0) = (7, 5, 5) (new available)
9) The need of P3, (6, 0, 0), is compared with the new available (7, 5, 5):
(6, 0, 0) ≤ (7, 5, 5) → True
Work = Work + Allocation = (7, 5, 5) + (3, 0, 2) = (10, 5, 7)
Safe sequence: <P2, P4, P5, P1, P3>
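The safety check above can be written compactly in Python. The following sketch uses exactly the Allocation, Need and Available values of this example and prints the same safe sequence; it is an illustration of the algorithm, not production resource-manager code.

def is_safe(available, allocation, need):
    work = list(available)
    finish = [False] * len(allocation)
    sequence = []
    progress = True
    while progress:                        # keep passing over the processes
        progress = False
        for i, (alloc, nd) in enumerate(zip(allocation, need)):
            # A process can finish if its remaining need fits in work.
            if not finish[i] and all(n <= w for n, w in zip(nd, work)):
                # On finishing, it releases everything it holds.
                work = [w + a for w, a in zip(work, alloc)]
                finish[i] = True
                sequence.append("P%d" % (i + 1))
                progress = True
    return all(finish), sequence

allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need       = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]
available  = [3, 3, 2]

print(is_safe(available, allocation, need))
# (True, ['P2', 'P4', 'P5', 'P1', 'P3'])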
8. Discuss the violations caused by each of the following: dirty read, non-repeatable
read and phantoms with suitable example.
Dirty read, Non-repeatable read, and Phantom read
Dirty read
Uncommitted data is read. For example, transaction B updates a row's age to 18 but has not yet committed; transaction A reads age = 18. If transaction B is then rolled back, transaction A has read dirty data: the value 18 never existed in the committed database.
Phantom read
When a user reads records, another transaction inserts or deletes rows within the set of records being read. When the user reads the same range again, new "phantom" rows are found (or rows have disappeared).
For example, transaction B inserts a new row into the range transaction A is reading; when A repeats the read, it finds the total count of the result has changed to 2.
Non-repeatable read
Before transaction A is over, another transaction B accesses and modifies the same data. Due to the modification made by transaction B, two reads of the same data within transaction A may return different values.
The difference between phantom read and non-repeatable read:
The key to a non-repeatable read is modification: under the same conditions, data you have already read returns a different value when read again.
The key to a phantom read is insertion or deletion: under the same conditions, the number of records returned by the first and the second read differs.
The isolation levels are used to solve various problems in the database :
1. DEFAULT
Use the default isolation level of the database itself, e.g. Oracle (read committed) or MySQL/InnoDB (repeatable read).
2. Read uncommitted
Reading uncommitted, as the name implies, is that one transaction can read the data
of another uncommitted transaction.
3. Read committed
Read committed, as the name implies, means that a transaction can read another transaction's data only after that transaction has committed.
This solves dirty reads, but cannot solve non-repeatable reads or phantom reads.
4. Repeatable read
Repeatable read means that once a transaction has started reading data (the transaction is open), modifications made by other transactions are no longer visible to its repeated reads.
This solves non-repeatable reads, though phantom reads may still occur.
5. Serializable
Serializable is the highest transaction isolation level. Under this level, transactions are
serialized and executed sequentially, which can avoid dirty read, non-repeatable read,
and phantom read. However, this transaction isolation level is inefficient and
consumes database performance, so it is rarely used.
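The following toy Python model (purely illustrative; real DBMSs implement this with locks or multi-versioning, and the original committed age of 20 is assumed) shows the dirty-read anomaly from the age example above, and how the read-committed level hides uncommitted data:

class ToyDB:
    def __init__(self):
        self.committed = {"age": 20}
        self.uncommitted = {}                      # writes of in-flight transactions

    def write(self, key, value):
        self.uncommitted[key] = value

    def read(self, key, isolation="READ UNCOMMITTED"):
        if isolation == "READ UNCOMMITTED" and key in self.uncommitted:
            return self.uncommitted[key]           # dirty read
        return self.committed[key]                 # only committed data is visible

    def rollback(self):
        self.uncommitted.clear()

db = ToyDB()
db.write("age", 18)                        # transaction B updates age, no commit yet
print(db.read("age"))                      # 18: transaction A performs a dirty read
print(db.read("age", "READ COMMITTED"))    # 20: the dirty read is avoided
db.rollback()                              # B aborts; the value 18 never existed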
9. Consider the following schedules. The actions are listed in the order they are
scheduled, and prefixed with the transaction name.
S1:T1:R(X),T2:R(X),T1:W(Y),T2:W(Y),T1:R(Y),T2:R(Y)
S2:T3:W(X),T1:R(X),T1:W(Y),T2:R(Z),T2:W(Z),T3:R(Z)
For each of the schedule, answer the following questions:
(i) what is the precedence graph for the schedule?
(ii) Is the schedule conflict-serializable? If so, what are all the conflict
equivalent serial schedules?
(iii) Is the schedule view-serializable? If so, what are all the view equivalent serial
schedules?
Serializability
• When multiple transactions run concurrently, then it may lead to inconsistency of data (i.e.
change in the resultant value of data from different transaction).
• Serializability is a concept that helps to identify which non-serial schedules are correct, by finding whether they are equivalent to some serial schedule.
• For example:
• In the above schedule, transaction T1 initially reads the values A = 100, B = 100 from the database and modifies A and B; transaction T2 then reads the modified value, i.e. 90, modifies it to 80 and performs a write operation. Thus at the end of transaction T1 the value of A is 90, but at the end of transaction T2 the value of A is 80; a conflict, and hence an inconsistency, occurs here. This sequence can be converted to a sequence which gives a consistent result; checking whether this is possible is called serializability.
Difference between Serial schedule and Serializable schedule
• There are two types of serializabilities: conflict serializability and view serializability
Conflict Serializability
• Definition: Suppose T1 and T2 are two transactions and I1 and I2 are instructions in T1 and T2 respectively. Instructions I1 and I2 are said to conflict if both access the same data item d and at least one of them is a write operation. A schedule is conflict serializable if it can be transformed into a serial schedule by swapping non-conflicting instructions.
• What is conflict?: In the definition three conditions are specified for a conflict in conflict
serializability -
1) There should be different transactions
2) The operations must be performed on same data items
3) One of the operation must be the Write(W) operation
• We can test a given schedule for conflict serializability by constructing a precedence graph for the schedule and searching for the absence of cycles in the graph.
• A precedence graph is a directed graph G = (V, E), where V is a set of vertices and E is a set of edges. The set of vertices consists of all the transactions participating in the schedule. The set of edges consists of all edges Ti → Tj for which one of three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
• A serializability order of the transactions can be obtained by finding a linear order consistent
with the partial order of the precedence graph. This process is called topological sorting.
Testing for serializability
The following method is used for testing serializability. To test conflict serializability we draw a graph G = (V, E), where V is the set of vertices, one per transaction, and E is the set of edges for conflicting pairs.
Step 1: Create a node for each transaction.
Step 2: Find the conflicting pairs(RW, WR, WW) on the same variable(or data item) by
different transactions.
Step 3: Draw an edge for the given schedule. Consider the following cases:
1. Ti executes write(Q) before Tj executes read(Q): draw an edge from Ti to Tj.
2. Ti executes read(Q) before Tj executes write(Q): draw an edge from Ti to Tj.
3. Ti executes write(Q) before Tj executes write(Q): draw an edge from Ti to Tj.
Step 4: Now, if the precedence graph is cyclic, the schedule is not conflict serializable; if the precedence graph is acyclic, the schedule is conflict serializable.
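The four steps above are mechanical enough to code directly. The sketch below (an illustration, using the schedule of Example 3.5.1 that follows) builds the precedence graph from the conflicting pairs and reports whether a cycle exists:

from collections import defaultdict

# Schedule of Example 3.5.1, as (transaction, action, data item) triples.
schedule = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
            ("T2", "R", "B"), ("T1", "R", "B"), ("T1", "W", "B")]

edges = defaultdict(set)
for i, (ti, ai, di) in enumerate(schedule):
    for tj, aj, dj in schedule[i + 1:]:
        # Conflicting pair: different transactions, same item, at least one write.
        if ti != tj and di == dj and "W" in (ai, aj):
            edges[ti].add(tj)

def has_cycle(edges):
    WHITE, GREY, BLACK = 0, 1, 2               # DFS colouring
    colour = defaultdict(int)
    def visit(u):
        colour[u] = GREY
        for v in edges[u]:
            if colour[v] == GREY or (colour[v] == WHITE and visit(v)):
                return True
        colour[u] = BLACK
        return False
    return any(colour[u] == WHITE and visit(u) for u in list(edges))

print(has_cycle(edges))   # True, so the schedule is not conflict serializable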
Example 3.5.1 Consider the following two transactions and schedule (time goes from top to
bottom). Is this schedule conflict-serializable? Explain why or why not.
Solution :
Step 1: To check whether the schedule is conflict serializable or not we will check from top to
bottom. Thus we will start reading from top to bottom as
T1: R(A) -> T1:W(A) ->T2:R(A) -> T2:R(B) ->T1:R(B)->T1:W(B)
Step 2: We will find conflicting operations. Two operations are called as conflicting operations
if all the following conditions hold true for them-
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation
From above given example in the top to bottom scanning we find the conflict as
T1:W(A)->T2:R(A).
i) Here note that there are two different transactions T1 and T2,
ii) Both work on same data item i.e. A and
iii) One of the operation is write operation.
Step 3: We will build a precedence graph by drawing one node from each transaction. In
above given scenario as there are two transactions, there will be two nodes namely T1 and T2
Step 4: Draw the edge between conflicting transactions. For example in above given
scenario, the conflict occurs while moving from T1:W(A) to T2:R(A). Hence edge must be from
T1 to T2.
Step 5: Repeat the step 4 while reading from top to bottom. Finally the precedence graph will
be as follows
Step 6: Check if any cycle exists in the graph. A cycle is a path by which we can start from one node and reach the same node again. If a cycle is found, the schedule is not conflict serializable. In step 5 we get a graph with a cycle, which means the given schedule is not conflict serializable.
Example 3.5.2 Check whether following schedule is conflict serializable or not. If it is not
conflict serializable then find the serializability order.
Solution:
Step 1: We will read from top to bottom, and build a precedence graph for conflicting entries.
We will build a precedence graph by drawing one node for each transaction. In the above scenario there are three transactions, so there will be three nodes, namely T1, T2 and T3.
Step 4: As there is no cycle in the precedence graph, the given sequence is conflict
serializable. Hence we can convert this non serial schedule to serial schedule. For that
purpose we will follow these steps to find the serializable order.
Step 5: A serializability order of the transactions can be obtained by finding a linear order
consistent with the partial order of the precedence graph. This process is called topological
sorting.
Step 6: Find the vertex which has no incoming edge which is T1. If we delete T1 node then
T3 is a node that has no incoming edge. If we delete T3, then T2 is a node that has no
incoming edge.
Thus the nodes can be deleted in the order T1, T3 and T2. Hence the order will be T1-T3-T2.
Example 3.5.3 Check whether the below schedule is conflict serializable or not.
{b2, r2(X), b1, r1(X), w1(X), r1(Y), w1(Y), w2(X), e1, c1, e2, c2}
Solution: b2 and b1 represent the begin of transaction 2 and transaction 1 respectively. Similarly, e1 and e2 represent the end of transaction 1 and transaction 2.
We will rewrite the schedule as follows-
Step 1: We will find conflicting operations. Two operations are called as conflicting operations
if all the following conditions hold true for them -
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation.
The conflicting entries are as follows –
As there are only two transactions, only two nodes are present in the graph.
Step 3: We get a graph with a cycle, which means the given schedule is not conflict serializable.
Example 3.5.4 Consider the three transactions T1, T2, and T3 and the schedules S1 and S2 given below. Determine whether each schedule is serializable or not. If a schedule is serializable, write down the equivalent serial schedule(s).
T1: R1(x) R1(z);W1(x);
T2: R2(x);R2(y);W2(z);W2(y)
T3:R3(x);R3(y);W3(y);
S1: R1(x);R2(z);R1(z); R3(x);R3(y);W1(x);W3(y);R2(y); W2(z);W2(y);
S2: R1(x);R2(z);R3(x);R1(z);R2(y);R3(y);W1(x);W2(z);W3(y);W2(y);
Solution:
Step 1: We will represent the schedule S1 as follows
Step (a): We will find conflicting operations. Two operations are called as conflicting
operations if all the following conditions hold true for them -
i) Both the operations belong to different transactions.
ii) Both the operations are on same data item.
iii) At least one of the two operations is a write operation
The conflicting entries are as follows -
Step (b): Now we will draw precedence graph as follows-
As there is no cycle in the precedence graph, the given sequence is conflict serializable.
Hence we can convert this non serial schedule to serial schedule. For that purpose we will
follow these steps to find the serializable order.
Step (c): A serializability order of the transactions can be obtained by finding a linear order
consistent with the partial order of the precedence graph. This process is called topological
sorting.
Step (d): Find the vertex which has no incoming edge: it is T3. Then find the vertex having no outgoing edge: it is T2. In between them is T1. Hence the order will be T3-T1-T2.
Example 3.5.5 Explain the concept of conflict serializability. Decide whether following
schedule is conflict serializable or not. Justify your answer.
Solution:
Step 1: We will read from top to bottom, and build a precedence graph for conflicting entries.
The conflicting entries are as follows-
Step 3: There is no cycle in the precedence graph. That means this schedule is conflict
serializable. Hence we can convert this non serial schedule to serial schedule. For that
purpose we will follow the following steps to find the serializable order.
1) Find the vertex which has no incoming edge which is T1.
2) Then find the vertex having no outgoing edge which is T2. In between them there is no
other transaction.
3) Hence the order will be T1-T2.
View Serializability
• If a given schedule is found to be view equivalent to some serial schedule, then it is called
as a view serializable schedule.
• View Equivalent Schedule: Consider two schedules S1 and S2 consisting of transactions
T1 and T2 respectively, then schedules S1 and S2 are said to be view equivalent schedule if it
satisfies following three conditions:
• If transaction T1 reads a data item A from the database initially in schedule S1, then in schedule S2 also, T1 must perform the initial read of the same data item A from the database. This must hold for all data items. In other words, the initial reads must be the same for all data items.
• If data item A has been updated at last by transaction T1 in schedule S1, then in schedule
S2 also, the data item A must be updated at last by transaction T1.
• If transaction T2 reads a data item that has been updated by transaction T1 in schedule S1, then in schedule S2 also, transaction T2 must read the value of that data item written by transaction T1. In other words, the write-read sequence must be the same.
Solution:
i) The initial read operation is performed by T2 on data item A or by T1 on data item C. Hence we can begin with T2 or T1. We choose T2 at the beginning.
ii) The final write is performed by T1 on data item B. Hence T1 will be at the last position.
iii) The data item C is written by T3 and then read by T1. Hence T3 should come before T1.
Thus we get the view-serializability order of the schedule as T2 - T3 - T1.
Example 3.5.7 Consider following two transactions:
T1: read(A)
read(B)
if A=0 then B:=B+1;
write(B)
T2: read(B);
read(A);
if B=0 then A:=A+1;
write(A)
Let the consistency requirement be A = 0 ∨ B = 0, with A = B = 0 as the initial values.
1) Show that every serial execution involving these two transactions preserves the
consistency of the Database?
2) Show a concurrent execution of T1 and T2 that produces a non serializable
schedule?
3) Is there a concurrent execution of T1 and T2 that produces a serializable
schedule?
Solution: 1) There are two possible serial executions: T1 → T2 or T2 → T1.
Consider the case T1 → T2: T1 reads A = 0 and B = 0; since A = 0, it sets B = B + 1 = 1 and writes B. T2 then reads B = 1 and A = 0; since B ≠ 0, A is left unchanged. Finally A = 0 and B = 1, so A = 0 ∨ B = 0 evaluates to T ∨ F = T. This means consistency is met.
Consider the case T2 → T1: symmetrically, we end with A = 1 and B = 0, so A = 0 ∨ B = 0 evaluates to F ∨ T = T, and consistency is again met.
Step 1: We will use the precedence graph method to check serializability. As there are three transactions, three nodes are created, one for each transaction.
Step 2: We will read from top to bottom. Initially we read r1(x) and keep moving downwards in search of a conflicting write operation. Here all the transactions work on the same data item, i.e. x. We get a write operation in T3 as w3(x). Hence the dependency is from T1 to T3, therefore we draw an edge from T1 to T3.
Similarly, for r3(x) we get w1(x) pair. Hence there will be edge from T3 to T1. Continuing in this
fashion we get the precedence graph as
Step 3: As cycle exists in the above precedence graph, we conclude that it is not
serializable.
ii) r3(x);r2(x);w3(x);r1(x);w1(x)
From the given sequence the schedule can be represented as follows:
Step 1: Read the schedule from top to bottom for pairs of conflicting operations. For r3(x) we get the pair w1(x). Hence an edge exists from T3 to T1 in the precedence graph.
There is a pair r2(x) : w3(x). Hence an edge exists from T2 to T3.
There is a pair r2(x) : w1(x). Hence an edge exists from T2 to T1.
There is a pair w3(x) : r1(x). Hence an edge exists from T3 to T1.
Step 2: The precedence graph will then be as follows-
Step 3: As there is no cycle in the above graph, the given schedule is serializable.
Step 4: The serializability order for consistent schedule will be obtained by applying
topological sorting on above drawn precedence graph. This can be achieved as follows,
Sub-Step 1: Find the node having no incoming edge. We find that T2 is such a node. Hence T2 is at the beginning of the serializability sequence. Now delete T2. The graph will be
i)
• S1 is not conflict-serializable, since the dependency graph has a cycle.
• S2 is conflict-serializable, as the dependency graph is acyclic. The order T2-T3-T1 is the only equivalent serial order.
ii)
• S1 is not view serializable.
• S2 is trivially view-serializable as it is conflict serializable. The only serial order allowed is
T2-T3-T1.
Example 3.5.10 Check whether following schedule is view serializable or not. Justify your
answer. (Note: T1 and T2 are transactions). Also explain the concept of view equivalent
schedules and conflict equivalent schedule considering the example schedule given below :
Solution:
Step 1: We will first find if the given schedule is conflict serializable or not. For that purpose,
we will find the conflicting operations. These are as shown below –
The precedence graph is as follows -
As there exists no cycle, the schedule is conflict serializable. The possible serializability order
can be T1 - T2
Now we check it for view serializability. As we obtained the serializability order T1 - T2, we will check the view equivalence of the given schedule with this serial schedule.
Let S be the schedule given in the problem statement, and let the serial schedule be S' = {T1, T2}. These two schedules are represented as follows:
Now we will check the equivalence between them using following conditions -
(1) Initial Read
In schedule S initial read on A is in transaction T1. Similarly initial read on B is in transaction
T1.
Similarly in schedule S', initial read on A is in transaction T1. Similarly initial read on B is in
transaction T1.
(2) Final Write
In schedule S final write on A is in transaction T2. Similarly final write on B is in transaction
T2.
In schedule S' final write on A is in transaction T2. Similarly final write on B is in transaction T2
(3) Intermediate Read
Consider schedule S for finding intermediate read operation.
In both the schedules S and S', the intermediate read operation is performed by T2 only after
T1 performs write operation.
Thus all the above three conditions are satisfied. Hence the given schedule is view serializable.
10. Define deadlock. How does it occur? How transactions can be written to (i) Avoid
deadlock. (ii) Guarantee correct execution.
Illustrate with suitable example
Deadlock Handling
Deadlock is a specific concurrency problem in which two transactions depend on each other
for something.
For example- Consider that transaction T1 holds a lock on some rows of table A and needs to
update some rows in the B table. Simultaneously, transaction T2 holds locks on some rows in
the B table and needs to update the rows in the A table held by Transaction T1.
Now, the main problem arises. Now Transaction T1 is waiting for T2 to release its lock and
similarly, transaction T2 is waiting for T1 to release its lock. All activities come to a halt state
and remain at a standstill. This situation is called deadlock in DBMS.
Definition: Deadlock can be formally defined as - " A system is in deadlock state if there exists
a set of transactions such that every transaction in the set is waiting for another transaction in
the set. "
There are four conditions for a deadlock to occur.
A deadlock may occur if all the following conditions hold true.
1. Mutual exclusion condition: There must be at least one resource that cannot be used by
more than one process at a time.
2. Hold and wait condition: A process that is holding a resource can request additional resources that are being held by other processes in the system.
3. No pre-emption condition: A resource cannot be forcibly taken from a process. Only the
process can release a resource that is being held by it.
4. Circular wait condition: A condition where one process is waiting for a resource held by a second process, the second process is waiting for a third process, and so on, with the last process waiting for the first, thus forming a circular chain of waiting.
Deadlock can be handled using two techniques -
1. Deadlock Prevention
2. Deadlock Detection and deadlock recovery
1. Deadlock prevention:
For large databases, the deadlock prevention method is suitable. A deadlock can be prevented if resources are allocated in such a way that a deadlock never occurs. The DBMS analyzes whether the operations can create a deadlock situation; if they can, that transaction is never allowed to execute.
There are two techniques used for deadlock prevention -
(i) Wait-Die:
• In this scheme, if a transaction requests a resource which is already held with a conflicting lock by another transaction, the DBMS simply compares the timestamps of the two transactions and allows the older transaction to wait until the resource becomes available.
• Suppose there are two transactions T1 and T2, and let TS(T) denote the timestamp of a transaction T. If T2 holds a lock on some resource and T1 requests that resource, then the following actions are performed by the DBMS:
• If TS(T1) < TS(T2), i.e. T1 is the older transaction and T2 holds the resource, then T1 is allowed to wait until the data item is available. That means if the older transaction is waiting for a resource locked by the younger transaction, the older transaction is allowed to wait until the resource is available.
• If TS(T1) > TS(T2), i.e. T1 is the younger transaction while the older T2 holds the resource, then T1 is killed and restarted later with a random delay but with the same timestamp.
A timestamp is a way of assigning a priority to each transaction when it starts. A lower timestamp means a higher priority; that is, the oldest transaction has the highest priority.
For example-
Let T1 be a transaction that requests a data item held by transaction T2, and let T3 be another transaction that requests a data item held by T2.
Here TS(T1) < TS(T2) < TS(T3); in other words, T1 is older than T2 and T3 is younger than T2. Hence T1 is made to wait while T3 is rolled back.
(ii) Wound - wait:
• In the wound-wait scheme, if an older transaction requests a resource held by a younger transaction, the older transaction wounds the younger one, i.e. forces it to be rolled back and release the resource. The younger transaction is restarted later, after some delay, but with the same timestamp.
• If the older transaction holds a resource which is requested by the younger transaction, then the younger transaction is asked to wait until the older one releases it.
Suppose T1 needs a resource held by T2 and T3 also needs a resource held by T2, with TS(T1) = 5, TS(T2) = 8 and TS(T3) = 10. Then T1, being older than T2, wounds T2 (T2 is rolled back and restarted later with the same timestamp), while T3, being younger than T2, waits.
This ultimately prevents a deadlock from occurring.
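The two rules can be summarised in a few lines of Python. This sketch (illustrative only) decides, from the timestamps alone, what happens when a transaction requests a lock held by another; it reproduces the outcomes of the examples above.

def wait_die(ts_requester, ts_holder):
    # Older requester waits; younger requester dies and restarts with same timestamp.
    return "wait" if ts_requester < ts_holder else "die"

def wound_wait(ts_requester, ts_holder):
    # Older requester wounds (rolls back) the holder; younger requester waits.
    return "wound holder" if ts_requester < ts_holder else "wait"

# T1 (ts = 5) and T3 (ts = 10) both request a lock held by T2 (ts = 8):
print(wait_die(5, 8), "/", wait_die(10, 8))      # wait / die
print(wound_wait(5, 8), "/", wound_wait(10, 8))  # wound holder / wait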
To summarize
2. Deadlock detection:
• In the deadlock detection mechanism, an algorithm that examines the state of the system is invoked periodically to determine whether a deadlock has occurred. If a deadlock is detected, the system must try to recover from it.
• Deadlock detection is done using the wait-for graph method.
Wait for graph
• In this method, a graph is created based on the transaction and their lock. If the created
graph has a cycle or closed loop, then there is a deadlock.
• The wait-for graph is maintained by the system, recording every transaction that is waiting for some data item held by another. The system keeps checking whether there is any cycle in the graph.
• This graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges.
• The set of vertices consists of all the transactions in the system.
• When transaction Ti requests a data item currently held by transaction Tj, the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when transaction Tj is no longer holding a data item needed by transaction Ti.
• For example - Consider following transactions, We will draw a wait for graph for this scenario
and check for deadlock.
Step 2: We find the read-write pairs from two different transactions, reading from top to bottom. Whenever such a pair is found, we add an edge in the corresponding direction. For instance -
Step 3: As a cycle is detected in the wait-for graph, there is no need to process further. A deadlock is present in this transaction scenario.
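Detection itself is just a cycle search on the wait-for graph. A minimal sketch follows (with hypothetical wait-for data, since the transaction figure is not reproduced here):

def deadlocked(waits_for):
    visited, stack = set(), set()
    def dfs(t):
        visited.add(t); stack.add(t)
        for u in waits_for.get(t, []):
            if u in stack or (u not in visited and dfs(u)):
                return True           # back edge: a cycle, hence a deadlock
        stack.discard(t)
        return False
    return any(t not in visited and dfs(t) for t in waits_for)

# Edge Ti -> Tj means Ti waits for a data item held by Tj.
print(deadlocked({"T1": ["T2"], "T2": ["T1"], "T3": []}))   # True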
Example 3.16.1 Give an example of a scenario where two phase locking leads to
deadlock. AU: May-09, Marks 4
Solution: The following scenario of execution of transactions can result in deadlock.
In the above scenario, transaction T1 takes an exclusive lock on data item B and then transaction T2 takes an exclusive lock on data item A. Unless and until T1 gives up the lock (i.e. unlocks) on B, T2 cannot read or write it. Similarly, unless and until T2 gives up the lock on A, T1 cannot read or write A.
This is purely a deadlock situation in two-phase locking.
UNIT – IV
IMPLEMENTATION TECHNIQUES
PART – A
Q.1 What is the need for RAID?
• RAID is a technology that is used to increase performance.
• It is used for increased reliability of data storage.
• An array of multiple disks accessed in parallel will give greater throughput than a single
disk.
• With multiple disks and a suitable redundancy scheme, your system can stay up and
running when a disk fails, and even while the replacement disk is being installed and its data
restored.
Q.2 Define Software and hardware RAID systems.
Ans.: Hardware RAID: The hardware-based array manages the RAID subsystem
independently from the host. It presents a single disk per RAID array to the host.
Software RAID: Software RAID implements the various RAID levels in the kernel disk code. It offers the cheapest possible solution, as expensive disk controller cards are not required.
Q.3 What are ordered indices?
Ans.: This is type of indexing which is based on sorted ordering values. Various ordered
indices are primary indexing, secondary indexing.
Q.4 What are the two types of ordered indices?
Ans.: Two types of ordered indices are - Primary indexing and secondary indexing. The
primary indexing can be further classified into dense indexing and sparse indexing and
single level indexing and multilevel indexing.
Q.5 Give the comparison between ordered indices and hashing.
Ans.:
(1) If range queries are common, ordered indices are to be used.
(2) The buckets containing records can be chained in sorted order in case of ordered
indices.
(3) Hashing is generally better at retrieving records having a specified value of the key.
(4) Hash function assigns values randomly to buckets. Thus, there is no simple notion of
"next bucket in sorted order."
Q.6 What are the causes of bucket overflow in a hash file organization?
Ans.:Bucket overflow can occur for following reasons -
(1) Insufficient buckets: The number of buckets allocated is insufficient for the total number of records to be stored.
(2) Skew: Some buckets are assigned more records than are others, so a bucket might
overflow even while other buckets still have space. This situation is known as bucket skew.
Q.7 What can be done to reduce the occurrences of bucket overflows in a hash file
organization?
Ans.:
(1) A bucket is a unit of storage containing one or more records (a bucket is typically a disk
block).
(2) The file blocks are divided into M equal-sized buckets, numbered bucket 0, bucket 1, ..., bucket M-1. Typically, a bucket corresponds to one (or a fixed number of) disk block.
(3) In a hash file organization we obtain the bucket of a record directly from its search-key value, using a hash function h(K).
(4) To reduce overflow records, a hash file is typically kept 70-80% full.
(5) The hash function h should distribute the records uniformly among the buckets;
otherwise, search time will be increased because many overflow records will exist.
Q.8 Distinguish between dense and sparse indices.
1) Dense index:
• An index record appears for every search key value in file.
• This record contains search key value and a pointer to the actual record.
2) Sparse index:
• An index record appears for only some of the search-key values in the file. To locate a record, we find the index entry with the largest search-key value not exceeding the target and scan sequentially from there.
Level: RAID 2
• This level makes use of mirroring as well as stores Error Correcting Codes (ECC) for its
data striped on different disks.
• The data is stored on one set of disks and the ECC is stored on another set of disks.
• This level has a complex structure and high cost. Hence it is not used commercially.
Level: RAID 3
• This level consists of byte-level striping with dedicated parity. In this level, the parity information for each disk section is written to a dedicated parity drive.
• We can detect single errors with a parity bit. Parity is a technique that checks whether data
has been lost or written over when it is moved from one place in storage to another.
• In case of disk failure, the parity disk is accessed and data is reconstructed from the
remaining devices. Once the failed disk is replaced, the missing data can be restored on the
new disk.
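The reconstruction relies on the XOR property of parity, which a few lines of Python can demonstrate (the byte values below are arbitrary illustrations):

data = [0b1010, 0b0110, 0b1100]          # bytes striped across three data disks
parity = data[0] ^ data[1] ^ data[2]     # stored on the dedicated parity disk

lost = data[1]                           # suppose disk 2 fails
rebuilt = data[0] ^ data[2] ^ parity     # XOR of the surviving disks and parity
print(rebuilt == lost)                   # True: the missing byte is recovered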
Level: RAID 4
• RAID 4 consists of block-level striping with a parity disk.
• Note that level 3 uses byte-level striping, whereas level 4 uses block-level striping.
Level: RAID 5
• RAID 5 is a modification of RAID 4.
• RAID 5 writes whole data blocks onto different disks, but the parity bits generated for data
block stripe are distributed among all the data disks rather than storing them on a different
dedicated disk.
Level: RAID 6
• RAID 6 is an extension of level 5.
• RAID 6 writes whole data blocks onto different disks, but the two independent parity bits
generated for data block stripe are distributed among all the data disks rather than storing
them on a different dedicated disk.
• Two parities provide additional fault tolerance.
• This level requires at least four disks to implement RAID.
The factors to be taken into account in choosing a RAID level are:
1. Monetary cost of extra disk-storage requirements.
2. Performance requirements in terms of number of I/O operations.
3. Performance when a disk has failed.
4. Performance during rebuild.
2. Illustrate indexing and hashing techniques with examples. What is the use of an index
structure? Explain the concept of ordered indices.
An index is a data structure that organizes data records on the disk to make the retrieval of
data efficient.
• The search key for an index is a collection of one or more fields of the records, using which we can efficiently retrieve the data that satisfies the search conditions.
• The indexes are required to speed up the search operations on file of records.
• There are two types of indices -
• Ordered Indices: This type of indexing is based on sorted ordering values.
• Hash Indices: This type of indexing is based on uniform distribution of values across
range of buckets. The address of bucket is obtained using the hash function.
• There are several techniques for indexing and hashing. These techniques are evaluated based on the following factors:
• Access types: the types of access that are supported efficiently.
• Access time: the time it takes to find a particular data item or set of items.
• Insertion time: the time required to insert a new data item.
• Deletion time: the time required to delete a desired data item.
• Space overhead: the additional space occupied by the index structure. Allocating such extra space is worthwhile for the improved performance it brings.
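As a rough illustration of the access-time benefit (a toy in-memory model, not an actual disk-based index), compare a full scan of a file of records with a lookup through an index that maps search-key values to record positions:

records = [{"roll": r, "name": "student%d" % r} for r in range(1, 10001)]

index = {rec["roll"]: pos for pos, rec in enumerate(records)}   # built once

# Without an index: scan every record until the key matches.
scan_hit = next(rec for rec in records if rec["roll"] == 9431)
# With an index: one lookup gives the record's position directly.
index_hit = records[index[9431]]
print(scan_hit == index_hit)    # True, but the indexed access avoids the scan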
Indexing Techniques:
Indexing is used to speed up data retrieval by creating a structure that maps keys to data
records.
1. Primary Index:
o Built on primary keys.
o Example: A student database sorted by roll number with the first record of
each block indexed.
2. Secondary Index:
o Created on non-primary keys.
● Types:
Step 2: Now if we insert 22, the sequence will be 22, 23, 30, 31, 32. The middle key, 30, will go up.
Step 4: Insert 29. The sequence becomes 22, 23, 24, 28, 29. The middle key, 24, will go up. Thus we get the B+ tree.
Example 4.8.2 Construct B+ tree to insert the following (order of the tree is 3)
26,27,28,3,4,7,9,46,48,51,2,6
Solution:
Order means maximum number of children allowed by each node. Hence order 3 means at
the most 2 key values are allowed in each node.
Step 1: Insert 26, 27 in ascending order
Step 2: Now insert 28. The sequence becomes 26,27,28. As the capacity of the node is
full, 27 will go up. The B+ tree will be,
Step 4: Insert 4. The sequence becomes 3,4, 26. The 4 will go up. The partial B+ tree
will be –
Step 5: Insert 7. The sequence becomes 4, 7, 26. The 7 will go up. Again, from 4, 7, 27, the 7 will go up. The partial B+ tree will be,
Step 6: Insert 9. By inserting 7,9, 26 will be the sequence. The 9 will go up. The partial
B+ tree will be,
Step 7: Insert 46. The sequence becomes 27,28,46. The 28 will go up. Now the
sequence becomes 9, 27, 28. The 27 will go up and join 7. The B+ Tree will be,
Step 8: Insert 48. The sequence becomes 28,46,48. The 46 will go up. The B+ Tree will
become,
Step 9: Insert 51. The sequence becomes 46,48,51. The 48 will go up. Then the
sequence becomes 28, 46, 48. Again the 46 will go up. Now the sequence becomes
7,27, 46. Now the 27 will go up. Thus the B+ tree will be
Step 10: Insert 2. The insertion is simple. The B+ tree will be,
Step 11: Insert 6. The insertion can be made in a vacant node of 7(the leaf node). The
final B+ tree will be,
Deletion Operation
Algorithm for deletion:
Step 1: Start at root, find leaf L with entry, if it exists.
Step 2: Remove the entry.
i) If L is at least half-full, done!
ii) If L has only d-1 entries,
• Try to re-distribute, borrowing keys from sibling.
(adjacent node with same parent as L).
• If redistribution fails, merge L and sibling.
Step 3: If merge occurred, must delete entry (pointing to L or sibling) from parent of L.
Step 4: Merge could propagate to root, decreasing height.
Example 4.8.3 Construct B+ Tree for the following set of key values
(2,3,5,7,11,17,19,23,29,31) Assume that the tree is initially empty and values are added
in ascending order. Construct B+ tree for the cases where the number of pointers that
fit one node is four. After creation of B+ tree perform following series of operations:
(a) Insert 9. (b) Insert 10. (c) Insert 8. (d) Delete 23. (e) Delete 19.
Solution: The number of pointers fitting in one node is four. That means each node contains
at the most three key values.
Step 1: Insert 2, 3, 5.
Step 2: If we insert 7, the sequence becomes 2, 3, 5, 7. Since each node can accommodate at most three keys, the 5 will go up from the sequence 2, 3, 5, 7.
Step 4: Insert 17. The sequence becomes 5,7, 11,17. The element 11 will go up. Then the
partial B+ tree becomes,
Step 5: Insert 19.
Step 6: Insert 23. The sequence becomes 11,17,19,23. The 19 will go up.
Step 8: Insert 31. The sequence becomes 19, 23, 29, 31. The 29 will go up. Then at the upper level the sequence becomes 5, 11, 19, 29. Hence again 19 will go up to maintain the capacity of the node (at most four pointers, i.e. three key values). Hence the complete B+ tree will be,
(a) Insertion of 9: It is very simple operation as the node containing 5,7 has one space
vacant to accommodate. The B+ tree will be,
(b) Insert 10: If we try to insert 10 then the sequence becomes 5,7,9,10. The 9 will go up.
The B+ tree will then become –
(c) Insert 8: Again insertion of 8 is simple. We have a vacant space at node 5,7. So we just
insert the value over there. The B+ tree will be-
(d) Delete 23: Just remove the key entry of 23 from the node 19,23. Then merge the sibling
node to form a node 19,29,31. Get down the entry of 11 to the leaf node. Attach the node of
11,17 as a left child of 19.
(e) Delete 19: Just delete the entry of 19 from the node 19,29,31. Delete the internal node
key 19. Copy the 29 up as an internal node as it is an inorder successor node.
Search Operation
1. Perform a binary search on the records in the current node.
2. If a record with the search key is found, then return that record.
3. If the current node is a leaf node and the key is not found, then report an unsuccessful
search.
4. Otherwise, follow the proper branch and repeat the process.
For example -
• Any key of the entire tree can be accessed from the leaf level; there is no need to traverse the tree in inorder fashion.
• Thus the B+ tree gives faster access to any key.
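These search steps translate directly into code. The sketch below runs the search on a small hand-built tree (the structure is illustrative and is not the output of a real insertion routine):

import bisect

class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children       # internal node: child pointers
        self.records = records         # leaf node: the actual records

def search(node, key):
    if node.children is None:                      # leaf: binary search
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.records[i]
        return None                                # unsuccessful search
    # Internal node: follow the proper branch and repeat.
    return search(node.children[bisect.bisect_right(node.keys, key)], key)

leaf1 = Node([22, 23], records=["rec22", "rec23"])
leaf2 = Node([30, 31, 32], records=["rec30", "rec31", "rec32"])
root = Node([30], children=[leaf1, leaf2])
print(search(root, 31))   # rec31
print(search(root, 25))   # None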
Ex. 5.2.1 Construct a B+tree for F, S, Q, K, C, L, H, T, V, W, M, R.
Sol.: The method for constructing a B+ tree is similar to building a B tree; the only difference is that the parent keys also appear in the leaf nodes. We will build the B+ tree of order 5, i.e. at most 4 keys are allowed in each node.
4.Write detailed notes on ordered indices and B-tree index files.
B-tree indices are similar to B+-tree indices.
• The primary distinction between the two approaches is that a B-tree eliminates the
redundant storage of search-key values.
• B-tree is a specialized multiway tree used to store the records in a disk.
• Each node has a number of subtrees, so the height of the tree is relatively small and only a small number of nodes must be read from disk to retrieve an item. The goal of B-trees is to get fast access to the data.
• A B-tree allows search-key values to appear only once (if they are unique), unlike a
B+-tree, where a value may appear in a nonleaf node, in addition to appearing in a leaf
node.
Step 2: If we insert the value 30, the sequence becomes 10, 20, 30. As only two key values are allowed in each node (the order being 3), the 20 will go up.
Step 5: Insert 40
Step 6: Insert 50. The sequence becomes 30, 40, 50. The 40 will go up. But this again forms the sequence 12, 20, 40, and then 20 will go up. Thus the final B tree will be,
3) Bucket: The hash function H(key) is used to map several dictionary entries into the hash table. Each position of the hash table is called a bucket.
4) Collision: A collision is a situation in which the hash function returns the same address for more than one record.
For example:
8. Query Processing
• Query processing is a collection of activities that are involved in extracting data from a database.
• During query processing, high-level database language queries are translated into expressions that can be used at the physical level of the file system.
• There are three basic steps involved in query processing:
1. Parsing and Translation
• In this step the query is translated into its internal form and then into relational algebra.
• Parser checks syntax and verifies relations.
• For instance - if we submit the query
SELECT RollNo, name
FROM Student
HAVING RollNo=10
then it will issue an error message, since HAVING is used without GROUP BY; the correct query should be
SELECT RollNo, name
FROM Student
WHERE RollNo=10
Thus during this step the syntax of the query is checked, so that only a correct and verified query is submitted for further processing.
2. Optimization
• During this step, query evaluation plans are prepared from the equivalent relational algebra expressions.
• The query cost of every evaluation plan is calculated.
• Amongst all equivalent evaluation plans, the one with the lowest cost is chosen.
• Cost is estimated using statistical information from the database catalog, such as the number of tuples in each relation, the size of tuples, etc.
3. Evaluation
• The query-execution engine takes a query-evaluation plan, executes that plan, and returns
the answers to the query.
For example - If the SQL query is,
SELECT balance
FROM account
WHERE balance<1000
Step 1: This query is first verified by the parser and translator unit for correct syntax. If the syntax is correct, the relational algebra expressions can be obtained. For the above query there are two possible relational algebra expressions:
(1) σbalance<1000(Πbalance (account))
(2) Πbalance ( σbalance<1000 (account))
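To see that the two expressions are equivalent, they can be run as function pipelines over a toy in-memory relation (the account rows below are invented for illustration):

account = [{"acc_no": 1, "balance": 500}, {"acc_no": 2, "balance": 1500},
           {"acc_no": 3, "balance": 900}, {"acc_no": 4, "balance": 2000}]

def select(pred, rel):     # sigma
    return [t for t in rel if pred(t)]

def project(attrs, rel):   # pi
    return [{a: t[a] for a in attrs} for t in rel]

# Plan 1: sigma_balance<1000 ( pi_balance (account) )
plan1 = select(lambda t: t["balance"] < 1000, project(["balance"], account))
# Plan 2: pi_balance ( sigma_balance<1000 (account) )
plan2 = project(["balance"], select(lambda t: t["balance"] < 1000, account))
print(plan1 == plan2)      # True: same result, but possibly different costs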
Step 2: Query Evaluation Plan: To specify fully how to evaluate a query, we need not only to
provide the relational-algebra expression, but also to annotate it with instructions specifying
how to evaluate each operation. For that purpose, using the order of evaluation of queries,
two query evaluation plans are prepared. These are as follows
Associated with each query evaluation plan there is a query cost. The query optimization
selects the query evaluation plan having minimum query cost.
Once the query plan is chosen, the query is evaluated with that plan and the result of the
query is output.
To locate a data entry, we apply a hash function; to choose the bucket we use the last two bits of the binary representation of the number. For instance, the binary representation of 32* is 100000, and its last two bits are 00. Hence we store 32* accordingly.
Insertion operation:
• Suppose we want to insert 20* (binary 10100). Its last two bits are 00, but bucket A for 00 is full. So we must split the bucket by allocating a new bucket and redistributing the contents across the old bucket and its split image.
• For splitting, we consider the last three bits of h(r).
• The redistribution on insertion of 20* is as shown in the following Fig. 4.12.2.
The split image of bucket A, i.e. A2, and the old bucket A were both addressed by the same last two bits, 00. Here we need two separate data pages to accommodate the additional data record, so it is necessary to double the directory, using three bits instead of two. Hence:
• The binary addresses of buckets A and A2 become 000 and 100.
• In extendible hashing, the number of last bits d used by the directory is called the global depth, and the number used by a data page (bucket) is called its local depth. After insertion of 20*, the global depth becomes 3, as we consider the last three bits, and the local depth of buckets A and A2 also becomes 3, as we place data records by their last three bits. Refer Fig. 4.12.3.
• Suppose we want to insert 11*: it belongs to bucket B, which is already full. Hence we split bucket B into the old bucket B and the split image of B, B2.
• The local depths of B and B2 now become 3.
• Now for bucket B, we get 1 = 001 and 11 = 1011.
• For bucket B2, we get 5 = 101, 29 = 11101 and 21 = 10101.
After insertion of 11* we get the scenario as follows,
Example 4.12.1 The following key values are organized in an extendible hashing technique:
1, 3, 5, 8, 9, 12, 17, 28. Show the extendible hash structure for this file if the hash function is h(x) = x mod 8 and buckets can hold three records. Show how the extendible hash structure changes as the result of each of the following steps:
Insert 2
Insert 24
Delete 5
Delete 12
Solution:
Step 1: Initially we address buckets using the last two bits of the result of the hash function.
1 mod 8=1=001
3 mod 8=3 = 011
5 mod 8=5= 101
8 mod 8=0=000
9 mod 8=1=001
12 mod 8 = 4=100
17 mod 8=1=001
28 mod 8 = 4 = 100
The extendible hash table will be,
Hence we will extend the table by assuming the bit size as 3. The above indicated bucket A
will split based on 001 and 101.
a) Insert 2: 2 mod 8 = 2 = 010. If we consider the last two bits, i.e. 10, there is no bucket for this pattern yet. So we get,
b) Insert 24: 24 mod 8 = 0 = 000. The bucket in which 24 can be inserted is the one holding 8, 12, 28. But as this bucket is full, we split it into two buckets based on the bits 000 and 100.
c) Delete 5: On deleting 5, the bucket pointed to by 101 becomes empty. This also results in reducing the local depth of the bucket pointed to by 001. Hence we get,
d) Delete 12: We simply delete 12 from the corresponding bucket; there need not be any merging of buckets on deletion. The result of the deletion is as given below.
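The directory lookup used throughout this example is easy to reproduce: h(x) = x mod 8, and the bucket is chosen by the last global-depth bits of h(x). A small Python check:

def bucket_bits(x, global_depth):
    return (x % 8) & ((1 << global_depth) - 1)   # last global_depth bits of h(x)

for x in [1, 3, 5, 8, 9, 12, 17, 28]:
    print(x, format(x % 8, "03b"),
          "depth 2 ->", format(bucket_bits(x, 2), "02b"),
          "depth 3 ->", format(bucket_bits(x, 3), "03b"))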
Difference between Static and Dynamic Hashing
10. i) Reliability and File Organization
Improving Reliability Through Redundancy in DBMS
Reliability refers to a database system's ability to function correctly and recover from
failures, ensuring consistent availability and integrity of data. One key approach to improving
reliability is redundancy.
Redundancy Overview
Redundancy involves maintaining duplicate or additional resources (data, hardware, or
systems) to minimize the impact of failures. This ensures that even if a failure occurs, the
system can continue to function or recover quickly.
Challenges of Redundancy
1. Increased Storage Costs: Storing redundant data or components requires more
resources.
2. Synchronization Overhead: Ensuring that redundant copies remain consistent adds
complexity.
3. Potential for Data Anomalies: If redundancy is poorly managed, it can lead to data
inconsistencies.
Conclusion
Redundancy is a critical strategy in DBMS to improve reliability by mitigating the impact of
failures. Whether through data replication, hardware duplication, or network backups,
redundancy ensures that systems maintain high availability, fault tolerance, and data
integrity. However, redundancy should be implemented with careful consideration of costs,
performance trade-offs, and synchronization mechanisms.
Conclusion
The way records are represented and organized in files directly impacts the efficiency of
data retrieval, insertion, and maintenance. Fixed-length records are simple but waste space,
whereas variable-length records optimize storage but are more complex. File organization
methods like heap, sequential, clustered, and hashed further determine the performance and
usability of database systems based on the specific application requirements.