DBMS Unit - 4 Notes
DBMS Unit - 4 Notes
FACULTY
SURESH
suresh.mentor@gmail.com
lOMoARcPSD|45012165
INTRODUCTION
The Schema Refinement refers to refine the schema by using some technique.
The best technique of schema refinement is decomposition.
Normalization or Schema Refinement is a technique of organizing the data
in the database.
It is a systematic approach of decomposing tables to eliminate data
redundancy and undesirable characteristics like Insertion, Update and
Deletion Anomalies.
Redundancy refers to repetition of same data or duplicate copies of same data
stored in different locations.
Anomalies: Anomalies refers to the problems occurred after poorly planned
and normalized databases where all the data is stored in one table which is
sometimes called a flat file.
By using Decomposition you can eliminate the data redundancy.
suresh.mentor@gmail.com
lOMoARcPSD|45012165
1. Insert Anomaly
2. Update Anomaly
3. Delete Anomaly
1. Insert Anomaly
The term "insertion anomaly" is used to describe when a new row is added to a
table and it causes an inconsistency.
If a tuple is inserted in referencing relation and referencing attribute value is
not present in referenced attribute, it will not allow inserting in referencing
relation.
Page 2
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
Assume that a new employee is joining the company under training and not
assigned to any department. Then, we would not insert the data into the table if the
emp_dept field doesn't allow nulls.
2. Update anomaly:
If there are some changes in the database, we have to apply that change in all
the rows. And if we miss any row, we will have one more field, creating an
update anomaly in the database.
Example
In the given table, we have two rows for an employee named Rick, and he
belongs to two different departments of the company. If we need to update
Rick's address, we must update the same address in two rows. Otherwise, the
data will become inconsistent.
If, in some way, we can update the correct address in one department but not
the other, then according to the database, Rick will have two different
addresses, which is not correct and would lead to inconsistent data.
Page 3
suresh.mentor@gmail.com
lOMoARcPSD|45012165
3. Delete anomaly:
The term "deletion anomaly in the database" is used when we delete some
rows from a table and any necessary additional information or data is also lost
from the database.
Example
Assume that if the company closes the department D890, then deleting the
rows that have emp_dept as D890 would also delete the information of
employee Maggie since she is assigned only to this department.
FUNCTIONAL DEPENDENCY
X→Y
Y is functionally dependent on X
X is Determinant set
Y is Dependent Attribute.
Page 4
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
We have a <Department> table with two attributes − DeptId and DeptName.
DeptId = Department ID
DeptName = Department Name
DeptId DeptName
001 Finance
002 Marketing
003 HR
DeptId → DeptName
Key terms
S.no Key Terms Description
1 Decomposition The term decomposition refers to the process in
which we break down a table in a database into
various elements or parts.
2 Dependent It is displayed on the right side of the functional
dependency diagram.
3 Determinant It is displayed on the left side of the functional
dependency Diagram.
4 Union It suggests that if two tables are separate, and the
PK is the same, you should consider putting them
together
Page 5
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Page 6
suresh.mentor@gmail.com
lOMoARcPSD|45012165
For each functional dependency in the set, check if its right-hand side can be
determined using the left-hand side and the functional dependencies in the
set.
If so, add the functional dependency to the set.
Step 3: The final set of functional dependencies is the closure of the original set.
Example:
Let A, B, C and D be attributes and the functional dependencies are as follows:
A -> B
B -> CD
The closure of this set of functional dependencies is {A -> B, B -> CD, A -> CD}.
Step-1: Add the attributes contained in the attribute set for which closure is being
calculated to the result set.
Step 2: Recursively add the attributes to the result set which can be functionally
determined from the attributes already contained in the result set.
Page 7
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example-
A+ = { A }
={A,B,C} ( Using A → BC )
={A,B,C,D,E} ( Using BC → DE )
={A,B,C,D,E,F} ( Using D → F )
={A,B,C,D,E,F,G} ( Using CF → G )
Page 8
suresh.mentor@gmail.com
lOMoARcPSD|45012165
For example,
Emp_id Emp_name
E_001 Raju
E_002 Srinu
E_003 Vikas
Consider above table table having two attributes Emp_id and Emp_name.
Example
Page 9
suresh.mentor@gmail.com
lOMoARcPSD|45012165
3. Multi-valued dependency
Example:
In this example, maf_year and color are independent of each other but dependent on
car_model.
car_model-> colour
Page 10
suresh.mentor@gmail.com
lOMoARcPSD|45012165
4. Transitive dependency
In transitive functional dependency, dependent is indirectly dependent on
determinant. i.e. If a → b & b → c, then according to axiom of transitivity,
a → c. This is a transitive functional dependency.
Example:
Page 11
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Key’s in DBMS
A key is a Constraint
A key in DBMS is an attribute or a set of attributes that help to uniquely
identify a tuple (or row) in a relation (or table).
Key’s play an important role in the relational database.
Keys are also used to establish relationships between the different tables and
columns of a relational database.
Each key having its own functionality.
There are mainly eight different types of Keys in DBMS and each key has it’s
different functionality:
1. Primary Key
2. Foreign Key
3. Super Key
4. Candidate Key
5. Composite Key
6. Compound Key
7. Alternate Key
8. Surrogate Key
1. Primary Key
Page 12
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
2. Foreign Key
Example:
Page 13
suresh.mentor@gmail.com
lOMoARcPSD|45012165
3. Super Key
Table: Employee
Super keys: The above table has following super keys. All of the following sets of
super key are able to uniquely identify a row of the employee table.
{Emp_SSN}
{Emp_Number}
{Emp_SSN, Emp_Number}
{Emp_SSN, Emp_Name}
{Emp_SSN, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}
Page 14
suresh.mentor@gmail.com
lOMoARcPSD|45012165
4. Candidate Key
Example:
{Emp_SSN}
{Emp_Number}
The following are the important differences between a Super Key and a Candidate
Key –
Page 15
suresh.mentor@gmail.com
lOMoARcPSD|45012165
5. Composite Key
Example:
{OrderID, ProductID}
<Student>
Above, our composite keys are StudentID and StudentEnrollNo. The table has two
attributes as primary key.
Therefore, the Primary Key consisting of two or more attribute is called Composite
Key.
Page 16
suresh.mentor@gmail.com
lOMoARcPSD|45012165
6. Compound Key
Example:
In this example, OrderNo and ProductID can’t be a primary key as it does not
uniquely identify a record. However, a compound key of Order ID and Product ID
could be used as it uniquely identified each record.
7. Alternative Key
Page 17
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
DBA (Database administrator) can choose any of the above key as primary key. Lets
say Emp_Id is chosen as primary key.
8. Surrogate Key
Page 18
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
In this case, the "StudentID" is the surrogate key. It is assigned automatically by the
database system when a new record is added to the table. Here's an example of how
the table might look:
1 John 18 A
2 Mary 19 B
3 David 18 A
4 Sarah 19 B
The "StudentID" values are automatically generated, ensuring that each row has a
unique identifier.
Surrogate keys are particularly useful when dealing with large databases and
complex relationships, as they provide a simple and reliable way to identify
individual records.
Page 19
suresh.mentor@gmail.com
lOMoARcPSD|45012165
What is Normalization?
Normalization divides the larger table into smaller and links them using
relationships.
The normal form is used to reduce redundancy from the database table.
The main reason for normalizing the relations is removing these anomalies.
Failure to eliminate anomalies leads to data redundancy and can cause data
integrity and other problems as the database grows.
Page 20
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
Professor
Un Normalized
ID Name Salary Relation Since
salary is a
1 Rama {40000,10000,15000,10000}
Multi valued
attribute
The above table Violates the First Normal Form (1NF) Because Salary attribute is
multi valued So We can eliminate this multi valued attribute by splitting the salary
column to more specific columns like Basic, TA, DA, HRA.
Professor
Page 21
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example: 2
The decomposition of the EMPLOYEE table into 1NF has been shown below:
14 John 9064738238 UP
Page 22
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Note: Partial Functional Dependencies (PFD) concept occur’s when ever table
consist of more than one primary key. i.e composite key
Example:
Page 23
suresh.mentor@gmail.com
lOMoARcPSD|45012165
The ProjectName can be determined by ProjectID, which makes the relation Partial
Dependent.
<StudentInfo>
<ProjectInfo>
ProjectID ProjectName
P09 Geo Location
P07 Cluster Exploration
P03 IoT Devices
P05 Cloud Deployment
Page 24
suresh.mentor@gmail.com
lOMoARcPSD|45012165
In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and
A relation can be said to be in the third normal form if at least one of the following is
true for X → Y.
1. Y is a prime attribute
2. X is a superkey
Example: Let’s say a company wants to store the complete address of each
employee, they create a table named Employee_Details that looks like this:
Page 25
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Employee
12 Jai 33456
13 Navdeep 12312
14 Shivanshu 67868
15 Mahesh 34535
16 Vishal 67868
EMP_ZIP
33456 HR Gurgaon
12312 HR Ambala
67868 US Chicago
34535 UK Norwich
67868 UP Noida
Page 26
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Definition
Example: <<StudentCourses>>
Page 27
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Each student can take multiple courses, and each course has a professor.
The Above table is already in 1NF because each cell contains a single value.
{StudentID, CourseID}->Professor
{CourseID}->Professor
In this table, Professor depends only on CourseID, which is part of the candidate
key. So, it's already in 3NF.
Page 28
suresh.mentor@gmail.com
lOMoARcPSD|45012165
{CourseID} → {Professor}.
To meet BCNF, you can split the table into two separate tables: "Courses" and
"Professors."
"Courses" Table:
CourseID Professor
Math101 Dr. Smith
Chem101 Dr. Johnson
Bio101 Dr. Brown
"StudentCourses" Table:
StudentID CourseID
101 Math101
101 Chem101
102 Math101
103 Bio101
Page 29
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Multi-valued Dependency
Example:
Suppose there is a bike manufacturer company which produces two colors (white
and black) of each model every year.
Page 30
suresh.mentor@gmail.com
lOMoARcPSD|45012165
For a table to satisfy the Fourth Normal Form, it should satisfy the following two
conditions:
Example on 4NF:
STUDENT
The given STUDENT table is in 3NF, but the COURSE and HOBBY are two
independent entity. Hence, there is no relationship between COURSE and HOBBY.
So to make the above table into 4NF, we can decompose it into two tables:
Page 31
suresh.mentor@gmail.com
lOMoARcPSD|45012165
STUDENT_COURSE
STU_ID COURSE
21 C
21 C++
34 Java
74 Python
59 PHP
STUDENT_HOBBY
STU_ID Hobby
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
Page 32
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Join Dependency
Join Dependency
If the join of R1 and R2 over Q is equal to relation R then we can say that a join
dependency exists, where R1 and R2 are the decomposition R1 (P, Q) and R2 (Q, S)
of a given relation R (P, Q, S). R1 and R2 are a lossless decomposition of R.
Page 33
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example
The below relation violates the Fifth Normal Form (5NF) of Normalization −
<Employee>
The above relation can be decomposed into the following three tables; therefore, it is
not in 5NF –
<EmployeeSkills>
EmpName EmpSkills
David Java
John JavaScript
Jamie jQuery
Emma Java
The following is the <EmployeeJob> relation that displays the jobs assigned to each
employee −
<EmployeeJob>
EmpName EmpJob
David E145
John E146
Jamie E146
Emma E147
Page 34
suresh.mentor@gmail.com
lOMoARcPSD|45012165
<JobSkills>
EmpSkills EmpJob
Java E145
JavaScript E146
jQuery E146
Java E147
The above relations have join dependency, so they are not in 5NF. That would mean
that a join relation of the above three relations is equal to our original
relation <Employee>.
Page 35
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
In this case, the "StudentID" is the surrogate key. It is assigned automatically by the
database system when a new record is added to the table. Here's an example of how
the table might look:
1 John 18 A
2 Mary 19 B
3 David 18 A
4 Sarah 19 B
The "StudentID" values are automatically generated, ensuring that each row has a
unique identifier.
Surrogate keys are particularly useful when dealing with large databases and
complex relationships, as they provide a simple and reliable way to identify
individual records.
Page 36
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Surrogate keys can be generate in various ways for example if you are using
Page 37
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Decomposition in DBMS
Decomposition in Database Management System is to break a relation into
multiple relations to bring it into an appropriate normal form.
It helps to remove redundancy, inconsistencies, and anomalies from a
database.
If a relation is not properly decomposed, then it may lead to other problems
like information loss, etc.
Types of Decomposition
Page 38
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example:
R2 = (E, F)
R = (D, E, F)
D E F
78 19 16
39 76 91
78 29 44
It is decomposed as follows-
R1(D, E)
D E
78 19
39 76
78 29
Page 39
suresh.mentor@gmail.com
lOMoARcPSD|45012165
R2(E, F)
E F
19 16
76 91
29 44
Let us check the first condition. It was The union of the sub-relations R1 and R2
must contain all the attributes that are available in the original relation R before
decomposition.
So, R1 U R2= R
D E F
78 19 16
39 76 91
78 29 44
The relation obtained above is same as the original relation R. We can say that it is
an example of Lossless-join decomposition.
Page 40
suresh.mentor@gmail.com
lOMoARcPSD|45012165
Example 1
Let R (A, B, C, D) and F = {A -> B, A -> C, C -> D}
R is decomposed into
R1 = (A, B, C) with the FD’s
F1 = {A -> B, A -> C}
R2 = (C, D) with the FD’s
F2 = {C -> D}
F’ = F1 U F2 = {A -> B, A -> C, C -> D}
Hence, F’+ = F+
Hence, the decomposition is dependency preserving and also loss less.
Page 41
suresh.mentor@gmail.com