0% found this document useful (0 votes)
28 views96 pages

1.6 Normalization

Uploaded by

kirtick.mm
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
28 views96 pages

1.6 Normalization

Uploaded by

kirtick.mm
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 96

Normalization

Normalization

Redundanc
Process of organizing y
To
the data Avoid
Anomal
y
Degrades
the overall
integrity
Normalization - of Data

Definition
• technique of organizing the data in the database
• systematic approach of decomposing tables to eliminate data
redundancy(repetition) and undesirable characteristics like Insertion,
Update and Deletion Anomalies
• multi-step process that puts data into tabular form, removing duplicated
data from the relation tables
• Normalization rules divides larger tables into smaller tables and links
them using relationships
Purpose of Normalization

Ensuring data
Eliminating dependencies
make sense i.e
redundant
data is logically
(useless) data stored
Redundancy
• Data redundancy occurs when
SID SNAME AGE
the same piece of data is stored
1 Ravi 18
in two or more separate places
Row level 2 Ram 19
Redundancy 3 Nithin 18 • having multiple copies of same
4 Rahul 20 data in the database
1 Ravi 18
• This increases the size of
2 Ram 19
the database
7 Sara 20
Redundancy
SID SNAME CID CNAME FID Faculty Name Dept
1 Ravi C1 DBMS F1 John CSE
2 Ram C2 JAVA F2 Joe IT
Column
level 3 Nithin C2 JAVA F2 Joe IT
Redundancy 4 Rahul C1 DBMS F1 John CSE
5 Priya C3 DAA F3 Maria ECE
6 Karthick C1 DBMS F1 John CSE
7 Sara C2 JAVA F2 Joe IT
Anomaly
Caused due to

too much Poor Construction


redundanc of Tables
y
Anomaly -
Types

Insertion Updation Deletion


Anomaly Anomaly Anomaly
Insertion
Anomaly
SID SNAME CID CNAME FID Faculty Name Dept
1 Ravi C1 DBMS F1 John CSE
2 Ram C2 JAVA F2 Joe IT
3 Nithin C2 JAVA F2 Joe IT Inserting a row in
4 Rahul C1 DBMS F1 John CSE this table makes
5 Priya C3 DAA F3 Maria ECE to insert
6 Karthick C1 DBMS F1 John CSE redundant data
7 Sara C2 JAVA F2 Joe IT for every row

8 Varun C1 DBMS F1 John CSE


Updation
Anomaly
SID SNAME CID CNAME FID Faculty Name Dept
1 Ravi C1 DBMS F1 John CSE
2 Ram C2 JAVA F2 Joe IT
3 Nithin C2 JAVA F2 Joe IT Updation of
4 Rahul C1 DBMS F1 John CSE Faculty name
5 Priya C3 DAA F3 Maria ECE information has to
6 Karthick C1 DBMS F1 John CSE be done in all
7 Sara C2 JAVA F2 Joe IT rows
C1 DBMS F4 Steve CSE
Deletion Anomaly

SID SNAME CID CNAME FID Faculty Name Dept


Details of
1 Ravi C1 DBMS F1 John CSE
students who
2 Ram C2 JAVA F2 Joe IT
completed the
3 Nithin C2 JAVA F2 Joe IT
course has to be
4 Rahul C1 DBMS F1 John CSE
deleted
5 Priya C3 DAA F3 Maria ECE
6 Karthick C1 DBMS F1 John CSE
7 Sara C2 JAVA F2 Joe IT
Loss of
related data
when some
other data is
4 Rahul C1 DBMS F1 John CSE
deleted
How Normalization
Solves this issues?
Student Course
Relation

SID SNAME CID CNAME FID Faculty Name Dept


1 Ravi C1 DBMS F1 John CSE Decompose the tables
into the 3 tables –
2 Ram C2 JAVA F2 Joe IT
Student, Course and
3 Nithin C2 JAVA F2 Joe IT Faculty
4 Rahul C1 DBMS F1 John CSE
5 Priya C3 DAA F3 Maria ECE
6 Karthick C1 DBMS F1 John CSE
7 Sara C2 JAVA F2 Joe IT
Student Course Faculty

SID SNAME CID CID CNAME FID FID Faculty Name Dept
1 Ravi C1 C1 DBMS F1 F1 John CSE
2 Ram C2 C2 JAVA F2 F2 Joe IT
3 Nithin C2 C3 DAA F3 F3 Maria ECE
4 Rahul C1
5 Priya C3
6 Karthick C1
7 Sara C2
Insertion anomaly
Solved
Student

SID SNAME CID


1 Ravi C1
2 Ram C2
3 Nithin C2
4 Rahul C1
Varun can be
5 Priya C3 added in the
6 Karthick C1 students table
7 Sara C2

8 Varun
Avoiding Updation
Faculty
anomaly
FID Faculty Name Dept
F1 John CSE
F2 Joe IT
F3 Maria ECE
Faculty
Name Steve
is updated
with John

F4 Steve CSE
Deletion anomaly Solved

Student

SID SNAME CID


1 Ravi C1
2 Ram C2
3 Nithin C2
4 Rahul C1 Rahul
5 Priya C3 completed the
course and it
6 Karthick C1
can be deleted
7 Sara C2
Say how all anomalies occur in this table
emp_id emp_name emp_address emp_dept
101 Rick Delhi D001 Insert anomaly
101 Rick Delhi D002 New employee Joins
123 Maggie Agra D890 Not Assigned to any
166 Glenn Chennai D900 dept emp_dept will be
166 Glenn Chennai D004 null

Delete anomaly
Update anomaly
company closes
update the address of
the department D890
Rick
Normalization Types
First Normal Form
1NF
First Normal Form (1NF)

2 NF
Not in
1NF
1NF

First
Poor DB design
Step
Every
Table
should be
1NF
First Normal Form (1NF)
4 Rules
1. It should only have single(atomic) valued
attributes/columns.
2. Values stored in a column should be of the same domain
3. All the columns in a table should have unique names.
4. order in which data is stored, does not matter.
1 NF – Rule 1

Each Column should have


atomic values Player_Inventory
Relation

Entries like 3 shields, 5


arrows, 30 copper
coins, 7 rings violate
this rule
1NF – Rule 2

Column should contain


values that are of the
Beatle_Height
same type

Do not inter – mix


different types of values in
any column
1NF – Rule 3

Each column should


have a unique name
Beatle_Height

Same name leads to


confusion at the time of
data retrieval
1NF – Rule 4

Order in which data


is stored doesn’t
matter
Using SQL query, data can
be retrieved from table in any
order
Example
:

How to
solve this ?
Example
:
Split the
values to
individua
l rows
Second Normal
Form 2NF
Second Normal
Form
2 Rules
1. The table should be in the First Normal
Form.
2. There should be no Partial
Dependency.
Rule 1
table should be in the First Normal
Form Already covered as 1NF
2NF – Rule 2 – Partial
Dependency

Dependency ?
This is called
Dependency
or Functional
Student
Dependency
Sid is unique which helps to fetch any data
SID SNAME SDEPT SADDR SPHONENO
1 Ravi CSE BANG 8900999890
2 Ram IT CBE 9345677788

every other column depends on


the primary key
Functional
Dependency
Functional Dependency
• relationship that exists between two attributes
• exists between the primary key and non-key attribute within a
table

Attribute 1 Uniquely Identifies Attribute 2

attributes of a table is said to be dependent on each other when an attribute of a


table uniquely identifies another attribute of the same table.
STUD_NO->STUD_NAME STUD_NO->STUD_PHONE

STUD_NO->STUD_NAME, STUD_NO->STUD_PHONE Holds

STUD_NAME->STUD_ADDR Do not Hold


Functional Dependency
Represented as
A B
A Determines B or B is determined by A

A - Determinant B - Dependent
Eg:
Emp_Id → Emp_Name
Functional Dependency - Types
Trivial functional dependency
• dependency of an attribute on a set of attributes

• A -> B is a trivial functional dependency if B is a subset of A

E.g.,

{Student_Id, Student_Name} ->Student_Id

• if we know the values of Student_Id and Student_Name then the


value of Student_Id can be uniquely determined
• Student_Id -> Student_Id & Student_Name -> Student_Name
are trivial dependencies too
Non trivial functional dependency
• dependency of an attribute on a set of attributes
• functional dependency X->Y holds true where Y is not a subset of X
E.g.,
emp_id -> emp_name (emp_name is not a subset of emp_id)

emp_id -> emp_address (emp_address is not a subset of emp_id)

{emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}] is trivial

Completely non-trivial FD:

If an FD X->Y holds true where X intersection Y is null then this dependency is said to be a
completely non trivial function dependency.
What is Partial
Dependency
Composite Key
– Player_ID and
Player_Inventory
Item_Type
Player_Inventory

Functional
Dependenc
y
Player_Inventory Player_ratin
g

Player_ID

attribute in a table
depends on only a
part of the primary
Partial key and not on the
whole key
Dependency
How to Remove Partial Dependency?
Decompose the table into Player and
Player_Inventory
teacher_id subject teacher_age
111 Maths 38
111 Physics 38 Its
222 Biology 38 in
333 Physics 40 1NF
333 Chemistry 40
Candidate Keys: {teacher_id,
teacher_id teacher_age subject}
teacher_id subject
111 38 111 Maths
Non prime attribute: teacher_age
111 38 111 Physics
222 38 222 Biology “no non-prime attribute is dependent on
333 40 333 Physics the proper subset of any candidate key of
333 40 333 Chemistry the table
Third Normal Form
3NF
3NF

2 Rules 1. It should be in the Second


Normal form.
2. It should not have
Transitive Dependency.
3NF - Example
Player_Inventory

Add the Skill Level to the


Player relation
3NF - Example Inconsistency occurs for the Beginner
rating marked as 4 in the Skill level
I’m What is
a Transitive
3NF Dependency
3NF - Example Functional dependency exists as
Transitive dependency
3NF - Example Functional dependency exists as
Transitive dependency

Some Attribute in a table depends on some non prime attribute and not
on prime attribute
How to remove
Transitive Dependency?
Remove the Player_Rating from the
Player table
Boyce-Codd Normal
Form BCNF
BCNF

1. It should be in the Third


2 Rules
Normal Form.
2. for any dependency A → B,
A should be a super key.
BCNF
for any dependency A → B, A should be a super
key.
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO

EMP_ID →
Dependencie EMP_COUNTRY
s
EMP_DEPT → {DEPT_TYPE,
EMP_DEPT_NO}
Candidate key: {EMP-ID, EMP-DEPT}

not in BCNF because neither EMP_DEPT nor EMP_ID alone


are keys
BCNF
A→ B
Non – prime →
Attribute

Not in
BCNF
Example1
Example1

Taking the Loophole definition of 2NF, this table is not in 2NF


Informal definition: 2NF says we can’t have a non-key attribute
that depends on the part of the key.
Formal definition: 2NF says we can’t have a non-prime attribute
that depends on part of a candidate key.
Example1

Not in
2 NF
Definitions
Example2

Every Attribute in the table is a prime attribute


Example2

It is in 3NF
BCNF
Formal definition: With the exception of trivial functional
dependencies, every functional dependency in a table must be a
dependency on a super key.

Release_Year and _Month is not a super key.

It is not in BCNF
How to satisfy
BCNF?
Change the Release_Month to contain the month
alone
Fourth Normal Form
4NF
4NF

1. It should be in the Boyce-Codd Normal


2
Rule Form.
s
2. the table should not have any Multi-
valued Dependency
What is Multi-
I’m Valued
4NF Dependency
4NF
A B is Multi-Valued Dependency
when
B1
A
B2

B and C Should be independent of Each other

To have Multi-Valued Dependency, there


should be atleast 3 columns
4NF

Multiple Rows will solve the


problem
4NF
For a Table with A,B, C columns
A B multi-valued dependency occurs in 3 conditions

A B, for single value of A , more than one value of B


exist
Table Should have atleast 3 columns

For this table A,B,C columns , B and C should be independent


4NF
Example
No
Relationship
4NF
A functional dependency exists as a
Multivalued dependency
4NF
Fifth Normal Form
5NF
5NF

1. It should be in the Fourth


2 Normal Form.
Rule
2. It should not have any
s
Join Dependency
I’m What is Join
5NF Dependency
A Relation with Join Dependency

If Join B C
Dependency If this break down or
decomposition of table
exists, then causes any deletion or
creation of
decompose entries……
the tables A
Join
Again
Primary
Key is a
Composite
Key
Supplier SPC Customer

Product
Customer 1
Supplier Product
Can
Supply
Customer 2

Product 1
Customer Supplier
Buying
from
Product 2

Supplier 1
Product Customer
Used
by
Supplier 2
Supplier SPC Customer

Product
Selling
to

Supplier SPC Customer

Product
Supplier SC Customer

Product
ACME Sells 72X SW
Ford Uses 72X SW
ACME supplies to
Ford

ACME Sells 72X SW to Ford


We can get this only from
this table so

Don’t decompose
the table
A

B C
Additional Information is
Created Or Original
Information is lost
Stick to Table
the
Thank you

You might also like