DBMS
UNIT-3
Relational database design
Functional Dependencies:
In a relational database, Functional Dependencies (FDs) are rules that express
relationships between columns (or attributes) in a table. A functional
dependency indicates that the value of one attribute (or group of attributes)
determines the value of another attribute.
Think of it like this:
• If you know the value of one thing, you can determine the value of another
thing.
For example, if we know a StudentID, we can always determine the
StudentName, because each StudentID is associated with exactly one name. So,
StudentID → StudentName is a functional dependency
Notations of Functional Dependencies
• X → Y means that if we know the value of attribute(s) X, we can determine
the value of attribute(s) Y.
• X is called the determinant (or the key).
• Y is the dependent attribute.
Examples of Functional Dependencies
1. Example 1:
Let's take an example where we have a table Student with the following
columns:
• StudentID
• StudentName
• Course
Here’s how we could have functional dependencies:
• StudentID → StudentName: Knowing the StudentID will allow us to
determine the StudentName because each student has a unique ID.
• StudentID → Course: This assumes a student is enrolled in only one
course. The StudentID determines the Course they are taking.
SQL Query for Table:
CREATE TABLE Student (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(100),
Course VARCHAR(100)
);
Sample Data:
StudentID StudentName Course
1 John Math
2 Alice History
3 Bob Science
Explanation:
• From the table, we can see that if we know the StudentID (e.g., 1), we can
always determine the StudentName (e.g., John).
• Similarly, if we know StudentID, we can determine the Course.
• Check if a Student is in a Specific Course
If you know the StudentID, you can determine the Course they are enrolled in (based
on the dependency StudentID → Course).
SELECT Course
FROM Student
WHERE StudentID = 2;
Result:
Course
History
This shows how knowing StudentID helps in determining the Course they are
enrolled in.
Types of Functional Dependencies
1. Trivial Functional Dependency:
o A functional dependency is trivial if the right side (dependent
attribute) is a subset of the left side (determinant).
o Example: StudentID, StudentName → StudentID is trivial because
StudentID is part of the left side.
2. Non-Trivial Functional Dependency:
o A functional dependency is non-trivial if the right side is not a
subset of the left side.
o Example: StudentID → StudentName is non-trivial because
StudentName is not part of the left side (which is just StudentID).
3. First Normal Form (1NF)
In database design, First Normal Form (1NF) is the first step toward
organizing data. It makes sure the data is structured properly in a table.
1NF requires the following:
1. Atomic Values: Every column must have atomic (indivisible) values. This
means no lists, sets, or multiple values in a single cell.
2. No Repeating Groups: There should be no repeating groups or multiple
entries in a single row for the same type of data.
Think of it this way:
• Imagine a table where one column has multiple values, like a list. In 1NF,
we want to separate these values into individual rows, so each cell holds
only one value.
Example of a Table Not in 1NF
Imagine we have the following table for Students:
StudentID StudentName Courses
1 John Math, English
2 Alice Science, History
3 Bob Math, Science
Here, the Courses column contains multiple values for each student,
which breaks 1NF rules. It violates the rule of atomic values.
Converting to First Normal Form (1NF)
To convert the above table into 1NF, we need to remove the repeating
groups and make sure each cell contains only one value. We separate the
courses into individual rows for each student.
Here’s how we can convert it to 1NF:
StudentID StudentName Course
1 John Math
1 John English
2 Alice Science
2 Alice History
3 Bob Math
3 Bob Science
Now, the Courses column only contains a single course for each student,
so the table follows 1NF rules.
SQL Query Example for 1NF
Let’s say we have the following table that violates 1NF (a table with
repeating groups):
CREATE TABLE Students (
StudentID INT,
StudentName VARCHAR(100),
Courses VARCHAR(100)
);
Inserting data (that violates 1NF):
INSERT INTO Students (StudentID, StudentName, Courses)
VALUES
(1, 'John', 'Math, English'),
(2, 'Alice', 'Science, History'),
(3, 'Bob', 'Math, Science');
Now, we need to convert this to 1NF. The Courses column needs to be
split into individual rows:
CREATE TABLE Students_1NF (
StudentID INT,
StudentName VARCHAR(100),
Course VARCHAR(100)
);
-- Insert data into the new 1NF-compliant table
INSERT INTO Students_1NF (StudentID, StudentName, Course)
VALUES
(1, 'John', 'Math'),
(1, 'John', 'English'),
(2, 'Alice', 'Science'),
(2, 'Alice', 'History'),
(3, 'Bob', 'Math'),
(3, 'Bob', 'Science');
Benefits of 1NF
• Avoids Redundancy: By breaking down repeating groups into individual
rows, we prevent redundant data storage. Each student's courses are now
listed separately, instead of repeating the student's name.
• Easier Queries: You can query and filter individual courses for each
student without confusion.
• Data Integrity: It reduces errors by ensuring that each column contains
only one value.
3. Second Normal Form (2NF)
Second Normal Form (2NF) is the second step in organizing a database to
eliminate redundancy and improve data integrity. It builds on First Normal
Form (1NF), which ensures that each column contains atomic (indivisible)
values and there are no repeating groups.
What Makes a Table in 2NF?
For a table to be in Second Normal Form (2NF), it must meet two
conditions:
1. It must be in 1NF: This means the table already has atomic values and no
repeating groups (we’ve already covered this in 1NF).
2. It must have no Partial Dependencies: This is the key point for 2NF. A
partial dependency occurs when a non-key column (attribute) depends
on only part of a composite primary key (a primary key made up of
multiple columns), instead of the whole primary key.
What is a Partial Dependency?
• Partial Dependency means that some attributes in the table depend on
only part of the composite primary key.
• In 2NF, we want to remove partial dependencies, ensuring that each non-
key attribute is fully dependent on the entire primary key, not just part of
it.
Example of a Table Not in 2NF
Let’s say we have a table Student_Course with the following attributes:
• StudentID (part of the primary key)
• CourseID (part of the primary key)
• StudentName
• Instructor
The table looks like this:
StudentID CourseID StudentName Instructor
1 101 John Dr. Smith
1 102 John Dr. Brown
2 101 Alice Dr. Smith
StudentID CourseID StudentName Instructor
2 103 Alice Dr. Green
Here:
• StudentID + CourseID is the composite primary key.
• StudentName depends only on StudentID (not the entire composite key).
• Instructor depends only on CourseID (not the entire composite key).
This creates partial dependencies, where the StudentName depends on
just StudentID and Instructor depends on just CourseID.
Converting to Second Normal Form (2NF)
To convert the Student_Course table to 2NF, we need to remove the
partial dependencies by breaking the table into smaller tables:
1. Create a new table for Student (containing StudentID and StudentName).
2. Create a new table for Course (containing CourseID and Instructor).
3. Keep the original table with just the composite primary key (StudentID
and CourseID), which records the relationship between the student and
the course.
After Conversion to 2NF:
1. Student Table (StudentID, StudentName):
StudentID StudentName
1 John
2 Alice
2. Course Table (CourseID, Instructor):
CourseID Instructor
101 Dr. Smith
102 Dr. Brown
103 Dr. Green
3. Student_Course Table (StudentID, CourseID):
StudentID CourseID
1 101
1 102
2 101
2 103
Now, there are no partial dependencies:
• StudentName depends only on StudentID (in the Student table).
• Instructor depends only on CourseID (in the Course table).
• The Student_Course table only stores the relationship between StudentID
and CourseID.
Step 1: Original Table (Not in 2NF)
Here’s how we create the original table with StudentID and CourseID as
the composite primary key:
CREATE TABLE Student_Course (
StudentID INT,
CourseID INT,
StudentName VARCHAR(100),
Instructor VARCHAR(100),
PRIMARY KEY (StudentID, CourseID)
);
-- Insert data into the table (not in 2NF)
INSERT INTO Student_Course (StudentID, CourseID, StudentName,
Instructor)
VALUES
(1, 101, 'John', 'Dr. Smith'),
(1, 102, 'John', 'Dr. Brown'),
(2, 101, 'Alice', 'Dr. Smith'),
(2, 103, 'Alice', 'Dr. Green');
Step 2: Create New Tables for 2NF
We now create three separate tables to eliminate partial dependencies:
1. Student Table (stores only student-related information):
CREATE TABLE Student (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(100)
);
-- Insert student data
INSERT INTO Student (StudentID, StudentName)
VALUES
(1, 'John'),
(2, 'Alice');
2. Course Table (stores only course-related information):
CREATE TABLE Course (
CourseID INT PRIMARY KEY,
Instructor VARCHAR(100)
);
-- Insert course data
INSERT INTO Course (CourseID, Instructor)
VALUES
(101, 'Dr. Smith'),
(102, 'Dr. Brown'),
(103, 'Dr. Green');
3.Student_Course Table (stores the relationship between students and
courses):
CREATE TABLE Student_Course (
StudentID INT,
CourseID INT,
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
-- Insert the relationship data
INSERT INTO Student_Course (StudentID, CourseID)
VALUES
(1, 101),
(1, 102),
(2, 101),
(2, 103);
Benefits of 2NF
• No Partial Dependencies: In 2NF, non-key attributes depend on the entire
composite primary key, eliminating redundancy.
• Improved Data Integrity: With separate tables for Students and Courses,
changes to student names or course instructors will not require repeating
data across many rows.
• Easier Updates and Queries: Updating data (like a student’s name or a
course’s instructor) is easier because the information is stored in only one
place
3. Third Normal Form (3NF)
Third Normal Form (3NF) is the third step in organizing a database to
reduce redundancy and improve data integrity. It's even stricter than
Second Normal Form (2NF), and it focuses on eliminating transitive
dependencies.
What Makes a Table in 3NF?
For a table to be in Third Normal Form (3NF), it must satisfy two
conditions:
1. It must be in 2NF: This means that the table is already free from partial
dependencies (as we learned in 2NF).
2. It must have no Transitive Dependencies: A transitive dependency
happens when one non-key column depends on another non-key column.
In 3NF, we remove this by making sure that non-key attributes depend
only on the primary key.
What is a Transitive Dependency?
• A transitive dependency occurs when one non-key attribute depends on
another non-key attribute through a third attribute.
• For example, if Attribute A → Attribute B and Attribute B → Attribute C,
then Attribute A → Attribute C is a transitive dependency.
• In 3NF, we remove this kind of indirect relationship between non-key
columns.
Example of a Table Not in 3NF
Let’s take a table Employee with the following attributes:
• EmployeeID (Primary Key)
• EmployeeName
• Department
• ManagerName
Here’s how the data might look:
EmployeeI EmployeeNam Departme ManagerNam
D e nt e
101 John HR Sarah
102 Alice Finance Tom
103 Bob HR Sarah
Here:
• EmployeeID is the primary key.
• EmployeeName depends on EmployeeID.
• Department depends on EmployeeID.
• ManagerName depends on Department (not on EmployeeID directly, but
through Department).
This table violates 3NF because there is a transitive dependency:
• Department → ManagerName (A department determines the manager).
• This means ManagerName is indirectly dependent on EmployeeID
through Department. This is a transitive dependency.
Converting to Third Normal Form (3NF)
To convert the above table into 3NF, we need to remove the transitive
dependency by creating a new table for Department and move
ManagerName there.
After Conversion to 3NF:
We can create two tables:
1. Employee Table: Stores employee information, but ManagerName will be
removed.
2. Department Table: Stores information about departments and their
managers.
3. Employee Table:
EmployeeID EmployeeName Department
101 John HR
102 Alice Finance
103 Bob HR
2. Department Table:
Department ManagerName
HR Sarah
Finance Tom
Now, ManagerName is no longer stored in the Employee table. Instead, it
is stored in the Department table, where it belongs. This eliminates the
transitive dependency.
Step 1: Original Table (Not in 3NF)
CREATE TABLE Employee (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
Department VARCHAR(100),
ManagerName VARCHAR(100)
);
-- Insert data into the table (not in 3NF)
INSERT INTO Employee (EmployeeID, EmployeeName, Department,
ManagerName)
VALUES
(101, 'John', 'HR', 'Sarah'),
(102, 'Alice', 'Finance', 'Tom'),
(103, 'Bob', 'HR', 'Sarah');
Step 2: Create New Tables for 3NF
We now need to remove the transitive dependency. We will create two
tables:
1. Employee Table (stores EmployeeID, EmployeeName, and Department).
2. Department Table (stores Department and ManagerName).
3. Employee Table:
CREATE TABLE Employee (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
Department VARCHAR(100)
);
-- Insert data into the Employee table
INSERT INTO Employee (EmployeeID, EmployeeName, Department)
VALUES
(101, 'John', 'HR'),
(102, 'Alice', 'Finance'),
(103, 'Bob', 'HR');
2. Department Table:
CREATE TABLE Department (
Department VARCHAR(100) PRIMARY KEY,
ManagerName VARCHAR(100)
);
-- Insert data into the Department table
INSERT INTO Department (Department, ManagerName)
VALUES
('HR', 'Sarah'),
('Finance', 'Tom');
Benefits of 3NF
• No Transitive Dependencies: By ensuring that all non-key attributes
depend only on the primary key, we remove any indirect dependencies
between non-key attributes.
• Improved Data Integrity: Changes in the manager's name only need to be
made in one place (in the Department table), avoiding redundancy and
ensuring consistency.
• Reduced Redundancy: The same information (like manager name) is no
longer repeated in multiple rows in the Employee table. This reduces data
storage and the risk of errors.
4. Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal
Form (3NF). It is designed to handle certain types of anomalies that 3NF
cannot address.
What Makes a Table in BCNF?
A table is in Boyce-Codd Normal Form (BCNF) if it satisfies the following
condition:
• For every non-trivial functional dependency (FD) in the table, the left-
hand side (determinant) of the dependency must be a superkey.
What is a Superkey?
• A superkey is a set of one or more columns that can uniquely identify each
row in a table. If you know the values in the superkey, you can uniquely
identify a row.
What is a Non-Trivial Functional Dependency?
• A non-trivial functional dependency is a functional dependency where
the right-hand side is not a subset of the left-hand side. In simple terms, it
means that one attribute uniquely determines another.
Example of a Table Not in BCNF
Let’s consider a table Course_Registration that stores information about
students, courses, and the instructors who teach those courses:
StudentID CourseID Instructor
1 101 Dr. Smith
2 101 Dr. Smith
3 102 Dr. Brown
4 103 Dr. Green
Here:
• StudentID, CourseID is the composite primary key (this uniquely
identifies each row).
• There is a functional dependency: CourseID → Instructor (A course
determines the instructor).
However, CourseID is not a superkey (it doesn't uniquely identify each
row), and yet it determines Instructor. This violates BCNF because we
have a functional dependency where the left-hand side (CourseID) is not
a superkey.
Converting to BCNF
To convert the Course_Registration table to BCNF, we need to remove the
dependency CourseID → Instructor, because CourseID is not a superkey.
We can do this by splitting the table into two smaller tables:
1. Course Table: Stores course information and the instructor.
2. Registration Table: Stores the relationship between students and courses.
After Conversion to BCNF:
1. Course Table:
CourseID Instructor
101 Dr. Smith
102 Dr. Brown
103 Dr. Green
2. Registration Table:
StudentID CourseID
1 101
StudentID CourseID
2 101
3 102
4 103
Now, the Instructor is determined by CourseID, and CourseID is a
superkey in the Course Table. This satisfies the conditions of BCNF.
Step 1: Original Table (Not in BCNF)
Here’s the table that does not satisfy BCNF because CourseID is not a
superkey but determines Instructor:
CREATE TABLE Course_Registration (
StudentID INT,
CourseID INT,
Instructor VARCHAR(100),
PRIMARY KEY (StudentID, CourseID)
);
-- Insert data into the Course_Registration table (not in BCNF)
INSERT INTO Course_Registration (StudentID, CourseID, Instructor)
VALUES
(1, 101, 'Dr. Smith'),
(2, 101, 'Dr. Smith'),
(3, 102, 'Dr. Brown'),
(4, 103, 'Dr. Green');
Step 2: Create New Tables for BCNF
Now, we split the original table into two tables to remove the functional
dependency where CourseID determines Instructor:
1. Course Table (stores course and instructor information):
CREATE TABLE Course (
CourseID INT PRIMARY KEY,
Instructor VARCHAR(100)
);
-- Insert data into the Course table
INSERT INTO Course (CourseID, Instructor)
VALUES
(101, 'Dr. Smith'),
(102, 'Dr. Brown'),
(103, 'Dr. Green');
2. Registration Table (stores the relationship between students and
courses):
CREATE TABLE Registration (
StudentID INT,
CourseID INT,
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);
-- Insert data into the Registration table
INSERT INTO Registration (StudentID, CourseID)
VALUES
(1, 101),
(2, 101),
(3, 102),
(4, 103);
Benefits of BCNF
1. No Anomalies: By ensuring that every functional dependency involves a
superkey, BCNF removes even subtle anomalies that might exist in 3NF.
2. Better Data Integrity: BCNF ensures that the table structure reflects real-
world relationships more accurately, reducing the chance of redundant or
inconsistent data.
3. Efficient Updates: Because data is divided into smaller tables that don’t
have unnecessary dependencies, updates to the database become easier
and more consistent.
5. Fourth Normal Form (4NF)
Fourth Normal Form (4NF) is a higher level of database normalization. It
addresses a situation where a table has multi-valued dependencies
(MVDs), which can cause redundancy and inconsistency in the database.
(A multivalued dependency occurs when one attribute in a table
determines multiple values of another attribute, but these values are
independent of each other. This can create redundancy).
Steps to Achieve 4NF
To achieve 4NF, a table should already satisfy:
1. Boyce-Codd Normal Form (BCNF): Every determinant must be a candidate
key.
2. There should be no multi-valued dependencies.
Example of a Table with Multi-Valued Dependency (MVD)
Let’s look at an example. Suppose we have a table storing information
about students, their enrolled courses, and their hobbies:
Student_ID Student_Name Course Hobby
1 Alice Math Reading
1 Alice Science Painting
2 Bob History Gardening
2 Bob Literature Cooking
In this table, we can see that for each student:
• The student can be enrolled in multiple courses.
• The student can have multiple hobbies.
Here, there’s a multivalued dependency because Student_ID determines
both Course and Hobby, but Course and Hobby are independent of each
other. This redundancy can cause data anomalies, such as unnecessary
repetition of the same Student_ID and Student_Name.
Breaking the Table to Achieve 4NF
To bring this table into 4NF, we need to separate the independent multi-
valued attributes (in this case, Course and Hobby) into separate tables.
This way, we remove the redundancy and make the data more consistent.
1. First Table: Student-Course
Student_ID Student_Name Course
1 Alice Math
Student_ID Student_Name Course
1 Alice Science
2 Bob History
2 Bob Literature
2. Second Table: Student-Hobby
Student_ID Hobby
1 Reading
1 Painting
2 Gardening
2 Cooking
Now, both the Course and Hobby attributes are no longer in the same
table. Each table contains only related facts about a student, eliminating
the multi-valued dependency.
1. Creating the Original Table (before 4NF):
CREATE TABLE Student_Info (
Student_ID INT,
Student_Name VARCHAR(100),
Course VARCHAR(100),
Hobby VARCHAR(100)
);
2. Inserting Data into the Original Table:
INSERT INTO Student_Info (Student_ID, Student_Name, Course, Hobby)
VALUES
(1, 'Alice', 'Math', 'Reading'),
(1, 'Alice', 'Science', 'Painting'),
(2, 'Bob', 'History', 'Gardening'),
(2, 'Bob', 'Literature', 'Cooking');
3. Creating Two Tables to Achieve 4NF:
CREATE TABLE Student_Course (
Student_ID INT,
Student_Name VARCHAR(100),
Course VARCHAR(100)
);
CREATE TABLE Student_Hobby (
Student_ID INT,
Hobby VARCHAR(100)
);
4. Inserting Data into the New Tables:
-- Inserting data into Student_Course table
INSERT INTO Student_Course (Student_ID, Student_Name, Course)
VALUES
(1, 'Alice', 'Math'),
(1, 'Alice', 'Science'),
(2, 'Bob', 'History'),
(2, 'Bob', 'Literature');
-- Inserting data into Student_Hobby table
INSERT INTO Student_Hobby (Student_ID, Hobby)
VALUES
(1, 'Reading'),
(1, 'Painting'),
(2, 'Gardening'),
(2, 'Cooking');
Advantages of 4NF
• Eliminates Redundancy: By splitting the table into smaller ones, we avoid
repeating the same information multiple times.
• Reduces Update Anomalies: Changes to the data are easier to manage
because the information is stored in one place.
• Improves Data Integrity: It ensures that multi-valued attributes do not
lead to inconsistency in the data.
6.Fifth Normal Form.
5NF aims to break down complex tables into smaller ones to remove
redundancy caused by join dependencies and ensure that the data can be
reassembled (reconstructed) in its original form without any loss of
information.
Understanding Join Dependencies
A join dependency occurs when a table can be split into multiple smaller
tables, but the data can only be correctly reconstructed by combining
(joining) these smaller tables together in a specific way.
Why Do We Need 5NF?
When a table contains join dependencies, it may lead to the following
issues:
1. Redundancy: Data gets repeated in ways that make the database
inefficient.
2. Complexity: Queries and updates become more complicated, and there is
a risk of inconsistent data.
3. Anomalies: The database structure can cause problems when inserting,
deleting, or updating data.
How to Achieve 5NF?
To achieve 5NF:
1. The table must first satisfy all the previous normal forms, i.e., it should be
in Boyce-Codd Normal Form (BCNF) and Fourth Normal Form (4NF).
2. There must be no join dependency in the table unless it is a trivial join
dependency (i.e., the join dependency is satisfied by combining the same
table with itself or a set of columns from the same table).
Example Scenario:
Imagine we have a table that tracks which student is enrolled in which
course and which teacher is teaching that course:
Student Course Teacher
Alice Math Mr. John
Alice Science Mr. David
Student Course Teacher
Bob Math Mr. John
Bob History Ms. Mary
Charlie Science Mr. David
Charlie History Ms. Mary
In this table:
• A student is enrolled in multiple courses.
• A course can have a teacher.
• A teacher can teach multiple courses.
Breaking the Table into 5NF
We can break the table into three smaller, simpler tables:
1. Student-Course: Which student is enrolled in which course?
2. Course-Teacher: Which teacher teaches which course?
3. Student-Teacher: Which student is indirectly related to which teacher
(through the course they are enrolled in)?
Step 1: Create Three Tables
1. Student-Course Table:
Student Course
Alice Math
Alice Science
Bob Math
Bob History
Charlie Science
Charlie History
Step 2: SQL Queries
Now, let’s go through the SQL queries to create and populate the tables.
Creating the Original Table (before 5NF):
sql
Copy code
CREATE TABLE Student_Course_Teacher (
Student VARCHAR(50),
Course VARCHAR(50),
Teacher VARCHAR(50)
);
Inserting Data into the Original Table:
sql
Copy code
INSERT INTO Student_Course_Teacher (Student, Course, Teacher)
VALUES
('Alice', 'Math', 'Mr. John'),
('Alice', 'Science', 'Mr. David'),
('Bob', 'Math', 'Mr. John'),
('Bob', 'History', 'Ms. Mary'),
('Charlie', 'Science', 'Mr. David'),
('Charlie', 'History', 'Ms. Mary');
2. Course-Teacher Table:
Course Teacher
Math Mr. John
Science Mr. David
History Ms. Mary
3. Student-Teacher Table:
Student Teacher
Alice Mr. John
Alice Mr. David
Student Teacher
Bob Mr. John
Bob Ms. Mary
Charlie Mr. David
Charlie Ms. Mary
Step 1: SQL Queries
Now, let’s go through the SQL queries to create and populate the tables.
Creating the Original Table (before 5NF):
CREATE TABLE Student_Course_Teacher (
Student VARCHAR(50),
Course VARCHAR(50),
Teacher VARCHAR(50)
);
Inserting Data into the Original Table:
INSERT INTO Student_Course_Teacher (Student, Course, Teacher)
VALUES
('Alice', 'Math', 'Mr. John'),
('Alice', 'Science', 'Mr. David'),
('Bob', 'Math', 'Mr. John'),
('Bob', 'History', 'Ms. Mary'),
('Charlie', 'Science', 'Mr. David'),
('Charlie', 'History', 'Ms. Mary');
Creating the 5NF Tables:
-- Student-Course Table
CREATE TABLE Student_Course (
Student VARCHAR(50),
Course VARCHAR(50)
);
-- Course-Teacher Table
CREATE TABLE Course_Teacher (
Course VARCHAR(50),
Teacher VARCHAR(50)
);
-- Student-Teacher Table
CREATE TABLE Student_Teacher (
Student VARCHAR(50),
Teacher VARCHAR(50)
);
Inserting Data into the 5NF Tables:
-- Inserting data into Student_Course Table
INSERT INTO Student_Course (Student, Course)
VALUES
('Alice', 'Math'),
('Alice', 'Science'),
('Bob', 'Math'),
('Bob', 'History'),
('Charlie', 'Science'),
('Charlie', 'History');
-- Inserting data into Course_Teacher Table
INSERT INTO Course_Teacher (Course, Teacher)
VALUES
('Math', 'Mr. John'),
('Science', 'Mr. David'),
('History', 'Ms. Mary');
-- Inserting data into Student_Teacher Table
INSERT INTO Student_Teacher (Student, Teacher)
VALUES
('Alice', 'Mr. John'),
('Alice', 'Mr. David'),
('Bob', 'Mr. John'),
('Bob', 'Ms. Mary'),
('Charlie', 'Mr. David'),
('Charlie', 'Ms. Mary');
Advantages of 5NF in this Case
• No Redundancy: Each fact is stored in its own table, so no data is
repeated.
• Improved Data Integrity: There is no risk of anomalies when adding,
updating, or deleting data.
• Simplified Structure: Each table focuses on a specific relationship, making
it easier to manage and query.
Decomposition
Decomposition helps in breaking down a table into smaller tables to
achieve normalization (ensuring data is stored efficiently). The main idea
is to eliminate redundancy, avoid anomalies, and ensure the data follows
the rules of normalization.
Why Decompose?
• To eliminate data redundancy (repeated data).
• To avoid update anomalies, where changing data in one place doesn't
update it everywhere.
• To achieve better data integrity and consistency.
Example of Decomposition:
Imagine you have a single table that stores information about students,
the courses they are enrolled in, and the teachers of those courses:
Student_ID Student_Name Course Teacher
1 Alice Math Mr. Smith
1 Alice Science Ms. Jones
2 Bob Math Mr. Smith
2 Bob History Dr. Brown
This table is redundant because the teacher's name is repeated for each
course that student is enrolled in.
We can decompose the table into two smaller tables:
1. Student-Course Table: Contains student-course relationships.
2. Course-Teacher Table: Contains course-teacher relationships.
Decomposed Tables:
Student-Course Table:
Student_ID Course
1 Math
1 Science
2 Math
2 History
Course-Teacher Table:
Course Teacher
Math Mr. Smith
Science Ms. Jones
History Dr. Brown
2. Properties of Decomposition
When decomposing a table, we need to ensure that the decomposition
has two important properties:
1. Lossless Join Property
• Definition: A decomposition is lossless if you can combine (join) the
smaller decomposed tables back together without losing any information.
In other words, after decomposing and then joining the tables back, you
should get the same data as the original table.
• Why is it important?: If the decomposition is not lossless, we might lose
information during the process of decomposition. This would lead to
inconsistencies in the database.
• Example of Lossless Join: Consider the Student-Course and Course-
Teacher tables. If we want to combine them back into a single table, we
can join them using the Course attribute (since it's common in both
tables):
SELECT
sc.Student_ID,
sc.Course,
ct.Teacher
FROM
Student_Course sc
JOIN
Course_Teacher ct
ON
sc.Course = ct.Course;
This SQL query will give us back the original table with no information lost:
Student_ID Course Teacher
1 Math Mr. Smith
1 Science Ms. Jones
2 Math Mr. Smith
2 History Dr. Brown
Thus, we’ve preserved all the data, and the decomposition is lossless.
2. Dependency Preservation
• Definition: A decomposition is dependency-preserving if all the
functional dependencies (rules that describe relationships between
attributes) in the original table are preserved in the decomposed tables.
This means we can still enforce all the original constraints after
decomposition.
• Why is it important?: If the decomposition does not preserve the
dependencies, it might be harder to enforce the integrity rules (like
ensuring no student takes the same course twice).
• Example of Dependency Preservation:
Suppose we have the following functional dependency in the original
table:
o Student_ID, Course → Teacher (The combination of Student_ID and
Course determines the Teacher).
After decomposing, we need to ensure this functional dependency is still
valid in the new tables. In our case:
o In the Student-Course table, the combination of Student_ID and
Course uniquely identifies a student’s enrollment in a course.
o In the Course-Teacher table, each Course determines the Teacher.
Closure, Armstrong’s Axioms, Canonical cover:
1. Closure of a Set of Attributes
The closure of a set of attributes, denoted by X+X^+X+, is the set of all
attributes that can be functionally determined by XXX, using the given
functional dependencies. In simpler terms, if you know the values of the
attributes in XXX, what other attributes can you automatically determine
from those?
Why is Closure Important?
• Closure helps us understand which attributes are determined by others in
a relation. It's used to check whether a set of attributes is a superkey
(which uniquely identifies each record in the table).
How to Calculate Closure:
To find the closure of a set of attributes, you start with the set XXX and
repeatedly apply the functional dependencies until you can no longer add
any new attributes.
Example:
Let’s consider a relation Student_Course and the following functional
dependencies:
• {StudentID}→{StudentName}\{ Student_ID \} \to \{ Student_Name
\}{StudentID}→{StudentName}
• {CourseID}→{CourseName}\{ Course_ID \} \to \{ Course_Name \}{CourseI
D}→{CourseName}
• {StudentID,CourseID}→{Teacher}\{ Student_ID, Course_ID \} \to \{
Teacher \}{StudentID,CourseID}→{Teacher}
Goal: Find the closure of {StudentID}\{ Student_ID \}{StudentID} —
{StudentID}+\{ Student_ID \}^+{StudentID}+.
1. Start with {StudentID}+={StudentID}\{ Student_ID \}^+ = \{ Student_ID
\}{StudentID}+={StudentID}.
2. Use the functional dependency {StudentID}→{StudentName}\{
Student_ID \} \to \{ Student_Name \}{StudentID}→{StudentName} to add
StudentNameStudent_NameStudentName to the closure. Now
{StudentID}+={StudentID,StudentName}\{ Student_ID \}^+ = \{
Student_ID, Student_Name \}{StudentID}+={StudentID,StudentName}.
3. Since {StudentID}\{ Student_ID \}{StudentID} doesn’t determine the other
attributes like Course_ID, we stop here.
The closure of {StudentID}\{ Student_ID \}{StudentID} is
{StudentID,StudentName}\{ Student_ID, Student_Name \}{StudentI
D,StudentName}.
2. Armstrong’s Axioms
Armstrong's Axioms are a set of rules or properties that help us logically
infer all the functional dependencies in a relation. These axioms are used
to derive the closure of a set of attributes and simplify functional
dependencies.
The axioms are:
1. Reflexivity: If Y⊆XY \subseteq XY⊆X, then X→YX \to YX→Y. (If YYY is a
subset of XXX, then XXX determines YYY).
2. Augmentation: If X→YX \to YX→Y, then XZ→YZXZ \to YZXZ→YZ for any set
of attributes ZZZ. (If XXX determines YYY, then adding more attributes to
both sides doesn’t change the dependency).
3. Transitivity: If X→YX \to YX→Y and Y→ZY \to ZY→Z, then X→ZX \to ZX→Z.
(If XXX determines YYY, and YYY determines ZZZ, then XXX determines
ZZZ).
4. Union: If X→YX \to YX→Y and X→ZX \to ZX→Z, then X→YZX \to YZX→YZ.
(If XXX determines YYY and ZZZ, then XXX determines both YYY and ZZZ).
5. Decomposition: If X→YZX \to YZX→YZ, then X→YX \to YX→Y and X→ZX
\to ZX→Z. (If XXX determines both YYY and ZZZ, then it determines each
of them individually).
6. Pseudo-Transitivity: If X→YX \to YX→Y and YZ→WYZ \to WYZ→W, then
XZ→WXZ \to WXZ→W. (If XXX determines YYY, and YZYZYZ determines
WWW, then XZXZXZ determines WWW).
Example Using Armstrong’s Axioms:
Suppose we have these functional dependencies:
• A→BA \to BA→B
• B→CB \to CB→C
Using Transitivity, we can deduce that:
• A→CA \to CA→C
3. Canonical Cover (Minimal Cover)
A canonical cover (or minimal cover) is a simplified set of functional
dependencies that is equivalent to the original set but with no redundant
dependencies. The purpose of a canonical cover is to eliminate redundant
or unnecessary functional dependencies and make the design of the
database simpler.
Steps to Find Canonical Cover:
1. Remove extraneous attributes from the left side of functional
dependencies (if any).
2. Remove redundant dependencies that are already implied by other
dependencies.
Example:
Let’s say we have the following functional dependencies:
• A→BA \to BA→B
• A→CA \to CA→C
• B→CB \to CB→C
To find the canonical cover, we follow these steps:
1. Remove Extraneous Attributes: If A→BA \to BA→B and A→CA \to CA→C,
then B→CB \to CB→C is redundant because it’s already implied by A→BA
\to BA→B and A→CA \to CA→C. So we can eliminate B→CB \to CB→C.
2. Final Canonical Cover: After removing the redundant dependency, the
canonical cover is:
o A→BA \to BA→B
o A→CA \to CA→C