Module 26
Partha Pratim
Das
Week Recap
Objectives &
Database Management Systems
Outline
Module 26: Relational Database Design/6: Normal Forms
Normal Forms
1NF
2NF
3NF
Module Summary Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
ppd@cse.iitkgp.ac.in
Database Management Systems Partha Pratim Das 26.1
Week Recap PPD
Module 26
Partha Pratim • Identified the features of good relational design
Das
• Familiarized with the First Normal Form
Week Recap
Objectives &
• Introduced the notion and the theory of functional dependencies
Outline
Normal Forms
• Discussed issues in ”good” design in the context of functional dependencies
1NF
2NF
• Studied Algorithms for Properties of Functional Dependencies
3NF
• Understood the Characterization for and Determination of Lossless Join and
Module Summary
Determination of Dependency Preservation
Database Management Systems Partha Pratim Das 26.2
Module Objectives PPD
Module 26
Partha Pratim • To Understand the Normal Forms and their Importance in Relational Design
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Database Management Systems Partha Pratim Das 26.3
Module Outline PPD
Module 26
Partha Pratim • Normal Forms
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Database Management Systems Partha Pratim Das 26.4
Normal Forms PPD
Module 26
Partha Pratim
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Normal Forms
Database Management Systems Partha Pratim Das 26.5
Normalization or Schema Refinement PPD
Module 26
• Normalization or Schema Refinement is a technique of organizing the data in the
Partha Pratim
Das database
Week Recap • A systematic approach of decomposing tables to eliminate data redundancy and
Objectives & undesirable characteristics
Outline
Normal Forms
◦ Insertion Anomaly
1NF ◦ Update Anomaly
2NF
3NF ◦ Deletion Anomaly
Module Summary • Most common technique for the Schema Refinement is decomposition.
◦ Goal of Normalization: Eliminate Redundancy
• Redundancy refers to repetition of same data or duplicate copies of same data stored in
different locations
• Normalization is used for mainly two purpose:
◦ Eliminating redundant (useless) data
◦ Ensuring data dependencies make sense, that is, data is logically stored
Database Management Systems Partha Pratim Das 26.6
Anomalies PPD
Module 26
b) Insertion Anomaly: Until the new faculty
Partha Pratim
Das a) Update Anomaly: Employee 519 is shown as member, Dr. Newsome, is assigned to teach
having different addresses on different records at least one course, his details cannot be
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
recorded
Module Summary
c) Deletion Anomaly: All information about
Resolution: Decompose the Schema Dr. Giddens is lost if he temporarily ceases
to be assigned to any courses.
a) Update: (ID, Address), (ID, Skill)
b) Insert: (ID, Name, Hire Date), (ID, Code)
c) Delete: (ID, Name, Hire Date), (ID, Code)
Database Management Systems Partha Pratim Das 26.7
Desirable Properties of Decomposition PPD
Module 26
Partha Pratim • Lossless Join Decomposition Property
Das
◦ It should be possible to reconstruct the original table
Week Recap
Objectives &
• Dependency Preserving Property
Outline
◦ No functional dependency (or other constraints should get violated)
Normal Forms
1NF
2NF
3NF
Module Summary
Database Management Systems Partha Pratim Das 26.8
Normalization and Normal Forms PPD
Module 26
Partha Pratim • A normal form specifies a set of conditions that the relational schema must satisfy in
Das
terms of its constraints – they offer varied levels of guarantee for the design
Week Recap
• Normalization rules are divided into various normal forms. Most common normal forms
Objectives &
Outline are:
Normal Forms
1NF
◦ First Normal Form (1NF)
2NF ◦ Second Normal Form (2NF)
3NF
◦ Third Normal Form (3NF)
Module Summary
• Informally, a relational database relation is often described as ”normalized” if it meets
third normal form. Most 3NF relations are free of insertion, update, and deletion
anomalies
Database Management Systems Partha Pratim Das 26.9
Normalization and Normal Forms PPD
Module 26
Partha Pratim • Additional Normal Forms
Das
◦ Elementary Key Normal Form (EKNF)
Week Recap
◦ Boyce-codd Normal Form (BCNF)
Objectives &
Outline ◦ Multivalued Dependencies And Fourth Normal Form (4NF)
Normal Forms ◦ Essential Tuple Normal Form (ETNF)
1NF
2NF
◦ Join Dependencies and Fifth Normal Form (5NF)
3NF
◦ Sixth Normal Form (6NF)
Module Summary
◦ Domain/Key Normal Form (DKNF)
Database Management Systems Partha Pratim Das 26.10
1NF: First Normal Form PPD
Module 26 • A relation is in First Normal Form if and only if all underlying domains contain atomic
Partha Pratim values only (doesn’t have multivalued attributes (MVA))
Das
• STUDENT(Sid, Sname, Cname)
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Source: http://www.edugrabs.com/normal- forms/#fnf
Database Management Systems Partha Pratim Das 26.11
1NF (2): Possible Redundancy PPD
Module 26
• Example: Supplier(SID, Status, City, PID, Qty)
Partha Pratim
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Drawbacks:
• Deletion Anomaly: If we delete <S3,40,Rohtak,P1,245>, then we lose the information that S3 lives in Rohtak.
• Insertion Anomaly: We cannot insert a Supplier S5 located in Karnal, until S5 supplies at least one part.
• Update Anomaly: If Supplier S1 moves from Delhi to Kanpur, then it is difficult to update all the tuples having SID
as S1 and City as Delhi.
Normalization is a method to reduce redundancy. However, sometimes 1NF increases redundancy.
Database Management Systems Partha Pratim Das 26.12
1NF (3): Possible Redundancy PPD
Module 26
• When LHS is not a Superkey :
• When LHS is a Superkey :
Partha Pratim
◦ Let X → Y be a non trivial FD over R with X
Das
is not a superkey of R, then redundancy exist ◦ If X → Y is a non trivial FD over R with X is
Week Recap between X and Y attribute set. a superkey of R, then redundancy does not
Objectives & ◦ Hence in order to identify the redundancy, we exist between X and Y attribute set.
Outline
need not to look at the actual data, it can be ◦ Example : X → Y and X is a Candidate Key
Normal Forms identified by given functional dependency. ⇒ X cannot duplicate
1NF
◦ Example : X → Y and X is not a Candidate ⇒ Corresponding Y value may or may not
2NF
Key duplicate.
3NF
Module Summary ⇒ X can duplicate
⇒ Corresponding Y value would duplicate
also.
Database Management Systems Partha Pratim Das 26.13
2NF: Second Normal Form PPD
Module 26
Partha Pratim • Relation R is in Second Normal Form (2NF) only iff :
Das
◦ R is in 1NF and
Week Recap
◦ R contains no Partial Dependency
Objectives &
Outline
Normal Forms
1NF
Partial Dependency:
2NF
3NF Let R be a relational Schema and X , Y , A be the attribute sets over R where X : Any Candi-
Module Summary date Key, Y : Proper Subset of Candidate Key, and A : Non Prime Attribute
If Y → A exists in R, then R is not in 2NF.
(Y → A) is a Partial dependency only if
• Y : Proper subset of Candidate Key
• A: Non Prime Attribute
A prime attribute of a relation is an attribute that is a part of a candidate key of the relation
Database Management Systems Partha Pratim Das 26.14
2NF (2) PPD
Key Normalization
Module 26 • STUDENT(Sid, Sname, Cname) (already in 1NF)
Partha Pratim
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Functional Dependencies:
Module Summary {SID, Cname} → Sname
• Redundancy? SID → Sname
◦ Sname
Partial Dependencies: The above two relations R1 and R2 are
• Anomaly? SID → Sname (as SID is a 1.Lossless Join
◦ Yes Proper Subset of Candidate Key 2.2NF
{SID, Cname}) 3.Dependency Preserving
Source: http://www.edugrabs.com/2nf- second- normal- form/
Database Management Systems Partha Pratim Das 26.15
2NF (3): Possible Redundancy PPD
Module 26
Post Normalization
Partha Pratim
Das • Supplier(SID, Status, City, PID, Qty)
Week Recap
Objectives &
Outline
Normal Forms
Partial Dependencies:
1NF
2NF
SID → Status
3NF
SID → City
Drawbacks:
Module Summary
• Deletion Anomaly: If we delete a tuple in
Sup City , then we not only loose the infor-
mation about a supplier, but also loose the
status value of a particular city.
• Insertion Anomaly: We cannot insert a City
and its status until a supplier supplies at least
one part.
• Update Anomaly: If the status value for a
city is changed, then we will face the problem
of searching every tuple for that city.
Source: http://www.edugrabs.com/2nf- second- normal- form/
Database Management Systems Partha Pratim Das 26.16
3NF: Third Normal Form PPD
Module 26
Let R be the relational schema.
Partha Pratim
Das • [E. F. Codd,1971] R is in 3NF only if:
Week Recap ◦ R should be in 2NF
Objectives &
◦ R should not contain transitive dependencies (OR, Every non-prime attribute of R is
Outline non-transitively dependent on every key of R)
Normal Forms
1NF
• [Carlo Zaniolo, 1982] Alternately, R is in 3NF iff for each of its functional dependencies X → A, at least
2NF
one of the following conditions holds:
3NF
◦ X contains A (that is, A is a subset of X , meaning X → A is trivial functional dependency), or
Module Summary
◦ X is a superkey, or
◦ Every element of A − X , the set difference between A and X , is a prime attribute (i.e., each
attribute in A − X is contained in some candidate key)
• [Simple Statement] A relational schema R is in 3NF if for every FD X → A associated with R either
◦ A ⊆ X (that is, the FD is trivial) or
◦ X is a superkey of R or
◦ A is part of some candidate key (not just superkey!)
• A relation in 3NF is naturally in 2NF
Database Management Systems Partha Pratim Das 26.17
3NF (2): Transitive Dependency
Module 26
Partha Pratim • A transitive dependency is a functional dependency which holds by virtue of
Das
transitivity. A transitive dependency can occur only in a relation that has three or more
Week Recap attributes.
Objectives &
Outline • Let A, B, and C designate three distinct attributes (or distinct collections of attributes)
Normal Forms in the relation. Suppose all three of the following conditions hold:
1NF
2NF ◦ A→B
3NF
◦ It is not the case that B → A
Module Summary
◦ B→C
• Then the functional dependency A → C (which follows from 1 and 3 by the axiom of
transitivity) is a transitive dependency
Database Management Systems Partha Pratim Das 26.18
3NF (3): Transitive Dependency
Module 26
• Example of transitive dependency
Partha Pratim
Das • The functional dependency {Book} → {Author Nationality} applies; that is, if we know
Week Recap the book, we know the author’s nationality. Furthermore:
Objectives &
Outline
◦ {Book} → {Author}
Normal Forms
◦ {Author} does not → {Book}
1NF ◦ {Author} → {Author Nationality}
2NF
3NF • Therefore {Book} → {Author Nationality} is a transitive dependency.
Module Summary
• Transitive dependency occurred because a non-key attribute (Author) was determining
another non-key attribute (Author Nationality).
Database Management Systems Partha Pratim Das 26.19
3NF (4): Example PPD
Module 26
• Example:
Partha Pratim Sup City(SID, Status, City) (already in 2NF) Post Normalization
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
The above two relations SC
Functional Dependencies: and CS are
SID → Status, • Lossless Join
SID → City,
• Redundancy? City→ Status • 3NF
◦ Status Transitive Dependency : • Dependency Preserving
• Anomaly? SID → Status
{As SID → City and City →
◦ Yes Status}
Database Management Systems Partha Pratim Das 26.20
3NF (5): Example
Module 26 • Relation dept advisor (s ID, i ID, dept name)
Partha Pratim
Das
• F = {s ID, dept name → i ID, i ID → dept name}
Week Recap
• Two candidate keys: s ID, dept name, and i ID, s ID
Objectives & • R is in 3NF
Outline
Normal Forms
◦ s ID, dept name → i ID
1NF
2NF
. s ID, dept name is a superkey
3NF
◦ i ID → dept name
Module Summary
. dept name is contained in a candidate key
A relational schema R is in 3NF if for every FD X → A associated with R either
• A ⊆ X (i.e., the FD is trivial) or
• X is a superkey of R or
• A is part of some key (not just superkey!)
Database Management Systems Partha Pratim Das 26.21
3NF (6): Redundancy
Module 26
• There is some redundancy in this schema
Partha Pratim
Das • Example of problems due to redundancy in 3NF (J : s ID, L : i ID, K : dept name)
Week Recap ◦ R = (J, L, K ). F = {JK → L, L → K }
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
• Repetition of information (for example, the relationship l1 , k1 )
◦ (i ID, dept name)
• Need to use null values (for example, to represent the relationship l2 , k2 where there is
no corresponding value for J).
◦ (i ID, dept name) if there is no separate relation mapping instructors to
departments
Database Management Systems Partha Pratim Das 26.22
Module Summary
Module 26
Partha Pratim • Studied the Normal Forms and their Importance in Relational Design – how progressive
Das
increase of constraints can minimize redundancy in a schema
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems Partha Pratim Das 26.23