DBMS Unit1 Answers
DBMS Unit1 Answers
ANSWERS
PART A
Definition:
A DBMS is a collection of interrelated data and programs that allows users to define, create,
manipulate, and control access to databases efficiently.
Applications:
library systems, flight or hotel reservation systems, inventory management systems, student
information systems, and content management systems
2. Advantages of DBMS
3. Levels of Abstraction
Users Users
Logical Schema
Conceptual or Logical Level
Database
1
A data model is a collection of concepts that can be used to describe the structure of a
database. By structure of a database we mean the data types, relationships and constraints that
apply to the data.
Types of Data Models:
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (Object-oriented and Object-relational)
Semi-structured data model (XML)
Hierarchical
Network
Embedded SQL define how SQL statements can be embedded within general-purpose programming
languages such as C, C++, Java etc.,
Syntax:
The relational Algebra is a procedural query language. It consists of a set of operations that
take one or two relations as input and produce a new relation as their result.
Fundamental Operations:
select
project
Cartesian product
Set operations (union, Intersection, Set difference)
Join operations
rename
Here the select, project and rename operations are called unary operations, because they
operate on one relation. The other three operations operate on pairs of relations and are, therefore
called binary operations.
2
Referential integrity is a relational database concept in which multiple tables share a relationship based
on the data stored in the tables, and that relationship must remain con-sistent. This integrity is enforced
using foreign key constraint.
Rules ensuring that a foreign key value in one table must match a primary key value in another table,
thereby maintaining consistency between related tables.
childColumn DATA_TYPE,
);
Definition:
SQL (Structured Query Language) is a standardized language for defining, manipulating, and
controlling data in relational databases.
Parts of SQL include:
Include syntax
DELETE: Removes selected rows using WHERE, can be rolled back, slower.
TRUNCATE: Removes all rows, faster, cannot use WHERE, may reset storage.
3
Include example and represent in a tabular format
Include examples
Schema: The structural design or blueprint of the database (tables, fields, relationships).
o Physical schema represents database design at the physical level
o Logical schema represents database design at the logical level
Instance: The actual data stored in the database at a particular moment.
4
Definition: A trigger is a database object that automatically executes a predefined action (like a
PL/SQL block) in response to certain events (e.g., INSERT, UPDATE, DELETE).
Example: After a new order is inserted, update inventory quantity.
Assuming relations:
Employee(EmpID, Name, DeptID), Department(DeptID, DeptName)
Expression:
πName,DeptName(Employee⋈Department)
Domain types refer to the built-in data types that define the nature of data a column can store, such as:
PART B
The DBMS architecture is divided into two parts. The top part of the figure refers to the
various users of the database environment and their interfaces. The lower part shows the internals
5
of the DBMS responsible for storage of data and processing of transactions.
Upper Module:
Sophisticated (Power) Users
Application programmers:
a. Query Processor
6
o Execution Engine: Executes the optimized plan to retrieve or manipulate data.
b. Storage Manager
A program module that provides the interface between the low-level data stored in the database and the
application programs and queries submitted to the system
7
o Buffer Manager: Handles caching and data movement between disk and memory.
o Transaction Manager: Ensures ACID properties and manages concurrency and
recovery.
o Authorization & Integrity Managers: Enforce access controls and integrity
constraints.
c. Disk Storage
The lowest level where data is physically stored (tables, indexes, logs, and other database files)
a.Data Models
A data model is a collection of concepts that can be used to describe the structure of a
database. The structure means the data types, relationships and constraints that apply to the data.
Types of Data Models:
1. Relational model
2. Entity-Relationship data model (mainly for database design)
3. Object-based data models (Object-oriented and Object-relational)
4. Semi-structured data model (XML)
Traditional Models
Hierarchical Model
Network Model
Relational Model
The relational model uses a collection of tables to represent both data and relationships
among the data. Each table has multiple columns, and each column has a unique name. The
below table called customer table, shows, for example, that the customer identified by
customer-id 100 is named john and lives at 12 anna st. in Chennai and also shows his account
number.
9
called the root.
Network model
The entities are organized in a graph, in which some entities can be accessed through
Figure 1.4. An example of the hierarchical model representing a University Network
several paths
. Model
b. Embedded SQL
Programs written in the host language can use the embedded SQL syntax to access and update
data stored in a database. This embedded form of SQL extends the programmer’s ability to
manipulate the database even further. In embedded SQL, all query processing is performed by the
database system, which then makes the result of the query available to the programmer one tuple at
a time.
10
An embedded SQL program must be processed by a special preprocessor prior to compilation.
The preprocessor replaces embedded SQL requests with host-language declarations and procedure
calls that allow run-time execution of the database accesses. Then, the resulting program is
compiled by the host language compiler. To identify embedded SQL requests to the preprocessor,
we use the EXEC SQL statement. It has the form
EXEC SQL <embedded SQL statement> END-EXEC
The exact syntax for embedded SQL requests depends on the language in which SQL is
embedded. For instance, a semicolon is used instead of END-EXEC when SQL is embedded in C.
The java embedding of SQL called (SQLJ) uses the syntax
#SQL{<embedded SQL statement>}
The statement SQL INCLUDE is placed in the program to identify the place where the
preprocessor should insert the special variables used for communication between the program and
the database system. Variables of the host language can be used within embedded SQL statements,
but they must be preceded by a colon (:) to distinguish them from SQL variables.
Before executing any SQL statements, the program must first connect to the database. This
is done using
EXEC SQL connect to server user user-name END-EXEC
Here server identifies the server to which a connection is to be established. Database
implementations may require a password to be provided in addition to a user name.
To write a relational query, the declare cursor statement is used. The program must use the
open and fetch commands to obtain the result tuples.
Consider the banking schema. Assume that we have a host language variable amount, and
that we wish to find the names and cities of residence of customers who have more than amount
dollars in any account. The query can be written as follows:
EXEC SQL
Declare c cursor for
select customer-name, customer-city
from depositor, customer, account
where depositor.customer-name = customer.customer-
name and depositor account-number =
account.account-number
and account.balance > :amount
END-EXEC
The open statement causes the query to be evaluated and to save the result in the temporary
relation.
EXEC SQL open c END-EXEC
If the SQL statement results in an error the database system stores an error diagnostic in the SQL
communication area (SQLCA) variables, whose declarations are inserted by the SQL include
statement.
The fetch statement causes the values of one tuple in the query result to be placed on host
language variables.
EXEC SQL fetch c into :cn, :cc END-EXEC
The fetch statement requires one host-language variable for each attribute of the result
11
relation. In the example query cn holds the customer-name and cc holds the customer-city.
To obtain all tuples of the result, the program must contain a loop to iterate over tuples.
When the program executes an open statement on a cursor, the cursor is set to point to the first tuple
of the result. Each time it executes a fetch statement, the cursor is updated to point to the next tuple
of the result. When no further tuples remain to be processed, the variable called SQLSTATE in the
SQL communication area (SQLCA) gets set to ‘02000’ to indicate no more data is available.
The close statement causes the database system to delete the temporary relation that holds
the result of the query.
EXEC SQL close c END-EXEC
12
3. With relevant example to discuss about the various
operations of Relation Algebra. (16)
RELATIONAL ALGEBRA:
The relational Algebra is a procedural query language. It consists of a set of
operations that take one or two relations as input and produce a new relation as their
result.
Fundamental Operations:
The fundamental Operations in Relational Algebra are:
select
project
cartesian product
Set operations(union, Intersection, Set difference)
Join operations
rename
Here the select, project and rename operations are called unary operations
1.Selection (σ)
Syntax:
σ(condition)(Relation)
Example:
Let Employee(EmpID, Name, Salary, DeptID).
2.Projection (π)
Syntax:
Example:
π(Name, Salary)(Employee)
3.Union (∪)
13
Purpose: Combines rows from two compatible relations, eliminating duplicates.
Syntax:Relation1 ∪ Relation2
Purpose: Returns rows in the first relation that are not in the second, requiring compatibility.
Syntax:
Relation1 − Relation2
Purpose: Combines every row of one relation with every row of another.
Syntax:
Relation1 × Relation2
Example:
Let A have 3 rows and B have 2 rows → A × B results in 6 rows. 555
6.Rename (ρ)
Syntax:
ρ(newName)(Relation) or
ρ((newAttr1, newAttr2,...))(Relation)
7. Intersection (∩)
8. Join Operations
Purpose: Combine rows from two relations based on a condition. Several forms exist:
R1 ⋈_{condition} R2
R1 ⋈_{R1.key = R2.key} R2
14
Natural Join: Automatically joins on all matching attribute names, avoiding
duplicates.
R1 ⋈ R2
9.Division (÷)
Purpose: Retrieves values from a relation that relate to all values in another.
Syntax:
R1 ÷ R2
Example:
Find students who have taken all required courses:
Student_Course ÷ Course_Required
Assume relations:
Selection:
Projection:
π(Name)(EMPLOYEE)
Union:
π(Name)(STUDENT) ∪ π(Name)(EMPLOYEE)
Difference:
π(Name)(STUDENT) − π(Name)(EMPLOYEE)
15
Cartesian Product:
STUDENT × STUDENT_SPORTS
Rename:
ρ(Stud(Roll, Name))(STUDENT)
Joins:
o Equi-Join:
o Natural Join:
STUDENT ⋈ STUDENT_SPORTS
4 Interpret the cons of File Processing System. How they can be overcome by Database
Management System (16).
The file processing system has a number of disadvantages:
1) Data redundancy and inconsistency
Since different programmers create the files and application programs, the files
will have a different format and the programs may be written in several programming
languages. Also the same information may be duplicated in several files. For example
the address and telephone number of a particular customer may appear in a file that
consists of saving-account records and in a file that consist of checking account
records. This redundancy leads to higher storage and access cost.
Data inconsistency occurs, if the various copies of the same data may no longer
agree. For example, a changed customer address may be reflected in savings account
records but not elsewhere in the system.
2) Difficulty in accessing data
Suppose that a bank officer needs to find the names of all customers who live in a
particular postal-code area. The officer insists the data processing department to
generate such a list. The original system has no application program to meet the
request. However, the system has an application program to generate the list of all
customers. The officer has only two choices: either can obtain the list of all
customers and extract the needed information manually or ask the system programmer
to write the necessary application program. Both are unsatisfactory. Thus the
conventional file-processing environments do not allow needed data to be retrieved in
a convenient and efficient manner.
3) Data isolation
Because data are scattered in various files, and files may be in different formats,
writing new application programs to retrieve the appropriate data is difficult.
4) Integrity problems
16
The data values stored in a database must satisfy some consistency constraints.
For example, the balance of a bank account may never fall below a prescribed amount
(say, $50). Developers enforce these constraints in the system by adding appropriate
code in the various application programs. However, when new constraints are added it
is difficult to change the programs to enforce them.
5) Atomicity Problems
A computer system like any other mechanical or electrical device is subject to
failure. In many applications, if a failure occurs, it is difficult to restore the data to a
consistent state as it existed prior to the failure. For example, consider a program to
transfer $50 from account A to account B. If a system failure occurs during the
execution of the program, it is possible that $50 was removed from account A but
was not credited to account B, resulting in inconsistent database state. The funds
transfer must be atomic. It must happen in its entirety or not at all. Thus it is difficult
to ensure atomicity in conventional file processing system.
6) Concurrent access anomalies
To improve the overall performance of the system, many systems allow multiple
users to update the data simultaneously. In such an environment, interaction of
concurrent updates may result in inconsistent data. For example consider bank
account A, containing $500. If two customers withdraw funds (say $50 and $100
respectively) from account at about the same time, the result of the concurrent
executions may leave the account in an incorrect state. If the two programs run
concurrently, they may both read the value $500, and write the result back
$450 and $400 respectively. Depending on which one writes the value last, the
account may contain either $450 or $400, rather than the correct value of $350.
7) Security Problems
Every user of the database system should not be able to access all the data. For
example, in a banking system, payroll personnel need to see only that part of the
database that has information about the various bank employees. They do not need to
access information about customer accounts. The file processing systems do not
enforce security constraints.
Data redundancy & Centralized storage and normalization reduce duplication and ensure
inconsistency consistency
17
File Processing System
DBMS Solution and Advantages
Limitation
Integrity constraints
o Integrity constraints are a set of rules. It is used to maintain the quality of
information.
o Integrity constraints ensure that the data insertion, updating, and other processes
have to be performed in such a way that data integrity is not affected.
o Thus, integrity constraint is used to guard against accidental damage to the
database.
Example:
18
rows.
o A table can contain a null value other than the primary key field.
Example:
Example:
o An entity set can have multiple keys, but out of which one key will be the
primary key. A primary key can contain a unique and null value in the
relational table.
Example:
19
Also include notnull ,check and default constrains
b. KEYS
Keys:
A DBMS key is an attribute or set of an attribute which helps you to identify a
row(tuple) in a relation(table). They allow you to find the relation between two
tables. Keys help you uniquely identify a row in a table by a combination of one or
more columns in that table.
Example:
DBMS has following seven types of Keys each have their different functionality:
Super Key
Primary Key
Candidate Key
Alternate Key
Foreign Key
Compound Key
Composite Key
Surrogate Key
Super key
A superkey is a group of single or multiple keys which identifies rows in a table. A Super
key may have additional attributes that are not needed for unique identification.
Example:
20
In the above-given example, EmpSSN and EmpNum name are superkeys.
Primary Key
A column or group of columns in a table which helps us to uniquely identifies every row
in that table is called a primary key. This DBMS can't be a duplicate. The same value
can't appear more than once in the table.
Alternate key
All the keys which are not primary key are called an alternate key. It is a candidate
key which is currently not the primary key. However, A table may have single or
multiple choices for the primary key.
StudID, Roll No, Email are qualified to become a primary key. But since StudID is the
primary key, Roll No, Email becomes the alternative key.
Candidate Key
21
A super key with no repeated attribute is called candidate key.
The Primary key should be selected from the candidate keys. Every table must have at
least a single candidate key.
Example: In the given table Stud ID, Roll No, and email are candidate keys which
help us to uniquely identify the student record in the table.
Foreign key
A foreign key is a column which is added to create a relationship with another table.
Foreign keys help us to maintain data integrity and also allows navigation between two
different instances of an entity. Every relationship in the model needs to be supported
by a foreign key.
Example:
DeptCode DeptName
001 Science
22
002 English
005 Computer
In this example, we have two table, teach and department in a school. However, there is
no way to see which search work in which department.
In this table, adding the foreign key in Deptcode to the Teacher name, we can create a
relationship between the two tables.
Compound key
Compound key has many fields which allow you to uniquely recognize a specific record.
It is possible that each column may be not unique by itself within the database. However,
when combined with the other column or columns the combination of composite keys
become unique.
Example:
In this example, OrderNo and ProductID can't be a primary key as it does not uniquely
identify a record. However, a compound key of Order ID and Product ID could be used
23
as it uniquely identified each record.
Composite key
A key which has multiple attributes to uniquely identify rows in a table is called a
composite key. The difference between compound and the composite key is that any part
of the compound key can be a foreign key, but the composite key may or maybe not a
part of the foreign key.
This operation combines each tuple from one relation with each tuple of another.
24
Notation:
EMPLOYEE × DEPARTMENT
Result: A relation containing all combinations — each employee paired with every
department.
Includes all tuples from the left relation (EMPLOYEE), and matching tuples from right
(DEPARTMENT). Non-matching parts from the right are filled with NULLs.
EMPLOYEE ⟕ DEPARTMENT
Notation:
(join on DCODE)
Includes all tuples from the right relation (DEPARTMENT), and matching tuples from the
left (EMPLOYEE). Non-matching parts from the left are NULL.
EMPLOYEE ⟖ DEPARTMENT
Notation:
(join on DCODE)
Includes all tuples from both relations. Where there's no match, the missing side is NULL.
25
This returns HAMEN, VINI, and PETER.
π_{NAME, DCODE}(EMPLOYEE)
This yields a relation with NAME and DCODE values, duplicates removed.
ρ(EMP, EMPLOYEE)
Example: Create a temporary relation IT_EMPS containing EMPLOYEE entries where DCODE
= 201:
26
than once on a project he she worked on that day. The primary keys are
underlined.
(i) Develop DDL to implement the above schema. Identify the foreign keys.(6)
(ii) Develop an SQL query to list the department number and the number of
employees in each department. (5)
(iii) Develop a view that will keep track of the department number, the number
of employees in the department, and the total basis pay expenditure for
each department. (5)
(iii) View: Dept number, number of employees, and total basic pay per department
27
CREATE VIEW Dept_Stats AS
SELECT e.Deptno,
COUNT(*) AS Num_Employees,
SUM(e.Basic_Pay) AS Total_Basic_Pay
FROM Employee e
GROUP BY e.Deptno;
28