Lecture 6
Lecture 6
Partha Pratim
Das
Week Recap
Objectives &
Database Management Systems
Outline
Module 26: Relational Database Design/6: Normal Forms
Normal Forms
1NF
2NF
3NF
ppd@cse.iitkgp.ac.in
Module 26
Objectives &
• Introduced the notion and the theory of functional dependencies
Outline
Normal Forms
• Discussed issues in ”good” design in the context of functional dependencies
1NF
2NF
• Studied Algorithms for Properties of Functional Dependencies
3NF
• Understood the Characterization for and Determination of Lossless Join and
Module Summary
Determination of Dependency Preservation
Module 26
Partha Pratim • To Understand the Normal Forms and their Importance in Relational Design
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Module 26
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Module 26
Partha Pratim
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Normal Forms
Module 26
• Normalization or Schema Refinement is a technique of organizing the data in the
Partha Pratim
Das database
Week Recap • A systematic approach of decomposing tables to eliminate data redundancy and
Objectives & undesirable characteristics
Outline
Normal Forms
◦ Insertion Anomaly
1NF ◦ Update Anomaly
2NF
3NF ◦ Deletion Anomaly
Module Summary • Most common technique for the Schema Refinement is decomposition.
◦ Goal of Normalization: Eliminate Redundancy
• Redundancy refers to repetition of same data or duplicate copies of same data stored in
different locations
• Normalization is used for mainly two purpose:
◦ Eliminating redundant (useless) data
◦ Ensuring data dependencies make sense, that is, data is logically stored
Database Management Systems Partha Pratim Das 26.6
Anomalies PPD
Module 26
b) Insertion Anomaly: Until the new faculty
Partha Pratim
Das a) Update Anomaly: Employee 519 is shown as member, Dr. Newsome, is assigned to teach
having different addresses on different records at least one course, his details cannot be
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
recorded
Module Summary
c) Deletion Anomaly: All information about
Resolution: Decompose the Schema Dr. Giddens is lost if he temporarily ceases
to be assigned to any courses.
a) Update: (ID, Address), (ID, Skill)
b) Insert: (ID, Name, Hire Date), (ID, Code)
c) Delete: (ID, Name, Hire Date), (ID, Code)
Module 26
Objectives &
• Dependency Preserving Property
Outline
◦ No functional dependency (or other constraints should get violated)
Normal Forms
1NF
2NF
3NF
Module Summary
Module 26
Partha Pratim • A normal form specifies a set of conditions that the relational schema must satisfy in
Das
terms of its constraints – they offer varied levels of guarantee for the design
Week Recap
• Normalization rules are divided into various normal forms. Most common normal forms
Objectives &
Outline are:
Normal Forms
1NF
◦ First Normal Form (1NF)
2NF ◦ Second Normal Form (2NF)
3NF
◦ Third Normal Form (3NF)
Module Summary
• Informally, a relational database relation is often described as ”normalized” if it meets
third normal form. Most 3NF relations are free of insertion, update, and deletion
anomalies
Module 26
Module 26 • A relation is in First Normal Form if and only if all underlying domains contain atomic
Partha Pratim values only (doesn’t have multivalued attributes (MVA))
Das
• STUDENT(Sid, Sname, Cname)
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Module 26
• Example: Supplier(SID, Status, City, PID, Qty)
Partha Pratim
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Drawbacks:
• Deletion Anomaly: If we delete <S3,40,Rohtak,P1,245>, then we lose the information that S3 lives in Rohtak.
• Insertion Anomaly: We cannot insert a Supplier S5 located in Karnal, until S5 supplies at least one part.
• Update Anomaly: If Supplier S1 moves from Delhi to Kanpur, then it is difficult to update all the tuples having SID
as S1 and City as Delhi.
Module 26
• When LHS is not a Superkey :
• When LHS is a Superkey :
Partha Pratim
◦ Let X → Y be a non trivial FD over R with X
Das
is not a superkey of R, then redundancy exist ◦ If X → Y is a non trivial FD over R with X is
Week Recap between X and Y attribute set. a superkey of R, then redundancy does not
Objectives & ◦ Hence in order to identify the redundancy, we exist between X and Y attribute set.
Outline
need not to look at the actual data, it can be ◦ Example : X → Y and X is a Candidate Key
Normal Forms identified by given functional dependency. ⇒ X cannot duplicate
1NF
◦ Example : X → Y and X is not a Candidate ⇒ Corresponding Y value may or may not
2NF
Key duplicate.
3NF
Module 26
Normal Forms
1NF
Partial Dependency:
2NF
3NF Let R be a relational Schema and X , Y , A be the attribute sets over R where X : Any Candi-
Module Summary date Key, Y : Proper Subset of Candidate Key, and A : Non Prime Attribute
Key Normalization
Module 26 • STUDENT(Sid, Sname, Cname) (already in 1NF)
Partha Pratim
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Functional Dependencies:
Module Summary {SID, Cname} → Sname
• Redundancy? SID → Sname
◦ Sname
Partial Dependencies: The above two relations R1 and R2 are
• Anomaly? SID → Sname (as SID is a 1.Lossless Join
◦ Yes Proper Subset of Candidate Key 2.2NF
{SID, Cname}) 3.Dependency Preserving
Module 26
Post Normalization
Partha Pratim
Das • Supplier(SID, Status, City, PID, Qty)
Week Recap
Objectives &
Outline
Normal Forms
Partial Dependencies:
1NF
2NF
SID → Status
3NF
SID → City
Drawbacks:
Module Summary
• Deletion Anomaly: If we delete a tuple in
Sup City , then we not only loose the infor-
mation about a supplier, but also loose the
status value of a particular city.
• Insertion Anomaly: We cannot insert a City
and its status until a supplier supplies at least
one part.
• Update Anomaly: If the status value for a
city is changed, then we will face the problem
of searching every tuple for that city.
Source: http://www.edugrabs.com/2nf- second- normal- form/
Database Management Systems Partha Pratim Das 26.16
3NF: Third Normal Form PPD
Module 26
Let R be the relational schema.
Partha Pratim
Das • [E. F. Codd,1971] R is in 3NF only if:
Week Recap ◦ R should be in 2NF
Objectives &
◦ R should not contain transitive dependencies (OR, Every non-prime attribute of R is
Outline non-transitively dependent on every key of R)
Normal Forms
1NF
• [Carlo Zaniolo, 1982] Alternately, R is in 3NF iff for each of its functional dependencies X → A, at least
2NF
one of the following conditions holds:
3NF
◦ X contains A (that is, A is a subset of X , meaning X → A is trivial functional dependency), or
Module Summary
◦ X is a superkey, or
◦ Every element of A − X , the set difference between A and X , is a prime attribute (i.e., each
attribute in A − X is contained in some candidate key)
• [Simple Statement] A relational schema R is in 3NF if for every FD X → A associated with R either
◦ A ⊆ X (that is, the FD is trivial) or
◦ X is a superkey of R or
◦ A is part of some candidate key (not just superkey!)
• A relation in 3NF is naturally in 2NF
Database Management Systems Partha Pratim Das 26.17
3NF (2): Transitive Dependency
Module 26
Module 26
• Example of transitive dependency
Partha Pratim
Das • The functional dependency {Book} → {Author Nationality} applies; that is, if we know
Week Recap the book, we know the author’s nationality. Furthermore:
Objectives &
Outline
◦ {Book} → {Author}
Normal Forms
◦ {Author} does not → {Book}
1NF ◦ {Author} → {Author Nationality}
2NF
3NF • Therefore {Book} → {Author Nationality} is a transitive dependency.
Module Summary
• Transitive dependency occurred because a non-key attribute (Author) was determining
another non-key attribute (Author Nationality).
Module 26
• Example:
Partha Pratim Sup City(SID, Status, City) (already in 2NF) Post Normalization
Das
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Normal Forms
◦ s ID, dept name → i ID
1NF
2NF
. s ID, dept name is a superkey
3NF
◦ i ID → dept name
Module Summary
. dept name is contained in a candidate key
Module 26
• There is some redundancy in this schema
Partha Pratim
Das • Example of problems due to redundancy in 3NF (J : s ID, L : i ID, K : dept name)
Week Recap ◦ R = (J, L, K ). F = {JK → L, L → K }
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary
Module 26
Partha Pratim • Studied the Normal Forms and their Importance in Relational Design – how progressive
Das
increase of constraints can minimize redundancy in a schema
Week Recap
Objectives &
Outline
Normal Forms
1NF
2NF
3NF
Module Summary Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.
Partha Pratim
Das
Objectives &
Outline Database Management Systems
Library
Information Module 28: Relational Database Design/8: Case Study
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Partha Pratim Das
Final Schema
ppd@cse.iitkgp.ac.in
Module 28
Partha Pratim • Learnt how to decompose a schema into 3NF while preserving dependency and lossless
Das
join
Objectives &
Outline • Learnt how to decompose a schema into BCNF with lossless join
Library
Information
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Final Schema
Module Summary
Module 28
Objectives &
Outline
Library
Information
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Final Schema
Module Summary
Module 28
Objectives &
Outline
Library
Information
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Final Schema
Module Summary
Module 28
Partha Pratim
We are given to design a relational database schema for a Library Information System (LIS)
Das of an Institute
Objectives &
Outline
• The specification document of the LIS has already been shared with you
Library • In this presentation, we include the key points from the Specs; but the actual document
Information
System must be referred to
Specification
Entity Sets • We carry out the following tasks in the module:
Relationships
Relational Schema ◦ Identify the Entity Sets with attributes
Schema Refinement
Final Schema
◦ Identify the Relationships
Module Summary ◦ Build the initial set of relational schema
◦ Refine the set of schema with FDs that hold on them
◦ Finalize the design of the schema
• The coding of various queries in SQL, based on these schema are left as exercises
Module 28
• An institute library has 200000+ books and 10000+ members
Partha Pratim
Das • Books are regularly issued by members on loan and returned after a period.
Objectives &
Outline
• The library needs an LIS to manage the books, the members and the issue-return
Library process
Information
System • Every book has
Specification
Entity Sets ◦ title
Relationships
Relational Schema
◦ author (in case of multiple authors, only the first author is maintained)
Schema Refinement ◦ publisher
Final Schema
Module Summary
◦ year of publication
◦ ISBN number (which is unique for the publication), and
◦ accession number (which is the unique number of the copy of the book in the
library)
◦ There may be multiple copies of the same book in the library
Module 28
• Every faculty has
Partha Pratim
Das ◦ name
Objectives & ◦ employee id
Outline
◦ department
Library
Information ◦ gender
System
Specification
◦ mobile number, and
Entity Sets
Relationships
◦ date of joining
Relational Schema
Schema Refinement
• Library also issues a unique membership number to every member. Every member
Final Schema has a maximum quota for the number of books she / he can issue for the maximum
Module Summary duration allowed to her / him. Currently these are set as:
◦ Each undergraduate student can issue up to 2 books for 1 month duration
◦ Each postgraduate student can issue up to 4 books for 1 month duration
◦ Each research scholar can issue up to 6 books for 3 months duration
◦ Each faculty member can issue up to 10 books for six months duration
Module 28
• The library has the following rules for issue:
Partha Pratim
Das ◦ A book may be issued to a member if it is not already issued to someone else
Objectives & (trivial)
Outline
◦ A book may not be issued to a member if another copy of the same book is already
Library
Information issued to the same member
System
Specification
◦ No issue will be done to a member if at the time of issue one or more of the books
Entity Sets
Relationships
issued by the member has already exceeded its duration of issue
Relational Schema ◦ No issue will be allowed also if the quota is exceeded for the member
Schema Refinement
Final Schema
◦ It is assumed that the name of every author or member has two parts
Module Summary . first name
. last name
Module 28
Partha Pratim • Every book has title, author (in case of multiple authors, only the first author is
Das
maintained), publisher, year of publication, ISBN number (which is unique for the
Objectives &
Outline
publication), and accession number (which is the unique number of the copy of the
Library
book in the library). There may be multiple copies of the same book in the library
Information
System • Entity Set:
Specification
Entity Sets ◦ books
Relationships
Relational Schema • Attributes:
Schema Refinement
Final Schema ◦ title
Module Summary ◦ author name (composite)
◦ publisher
◦ year
◦ ISBN no
◦ accession no
Module 28
Partha Pratim • Every student has name, roll number, department, gender, mobile number, date of
Das
birth, and degree (undergrad, grad, doctoral)
Objectives &
Outline • Entity Set:
Library
Information
◦ students
System
Specification
• Attributes:
Entity Sets
Relationships
◦ member no – is unique
Relational Schema ◦ name (composite)
Schema Refinement
Final Schema ◦ roll no – is unique
Module Summary ◦ department
◦ gender
◦ mobile no – may be null
◦ dob
◦ degree
Module 28
Partha Pratim • Every faculty has name, employee id, department, gender, mobile number, and date of
Das
joining
Objectives &
Outline • Entity Set:
Library
Information
◦ faculty
System
Specification
• Attributes:
Entity Sets
Relationships
◦ member no – is unique
Relational Schema ◦ name (composite)
Schema Refinement
Final Schema ◦ id – is unique
Module Summary ◦ department
◦ gender
◦ mobile no – may be null
◦ doj
Module 28
Partha Pratim • Library also issues a unique membership number to every member. There are four
Das
categories of members of the library: undergraduate students, post graduate students,
Objectives &
Outline
research scholars, and faculty members
Library • Entity Set:
Information
System ◦ members
Specification
Entity Sets
Relationships
• Attributes:
Relational Schema ◦ member no
Schema Refinement
Final Schema ◦ member type (takes a value in ug, pg, rs or fc)
Module Summary
Module 28
Partha Pratim • Every member has a maximum quota for the number of books she / he can issue for
Das
the maximum duration allowed to her / him. Currently these are set as:
Objectives &
Outline ◦ Each undergraduate student can issue up to 2 books for 1 month duration
Library ◦ Each postgraduate student can issue up to 4 books for 1 month duration
Information
System ◦ Each research scholar can issue up to 6 books for 3 months duration
Specification
Entity Sets
◦ Each faculty member can issue up to 10 books for six months duration
Relationships
Relational Schema
• Entity Set:
Schema Refinement
Final Schema
◦ quota
Module Summary • Attributes:
◦ member type
◦ max books
◦ max duration
Module 28
Partha Pratim • Though not explicitly stated, library would have staffs to manage the LIS
Das
• Entity Set:
Objectives &
Outline ◦ staff
Library
Information • Attributes: (speculated – to ratify from customer)
System
Specification ◦ name (composite)
Entity Sets
Relationships
◦ id – is unique
Relational Schema ◦ gender
Schema Refinement
Final Schema ◦ mobile no
Module Summary ◦ doj
Module 28
Partha Pratim • Books are regularly issued by members on loan and returned after a period. The library
Das
needs an LIS to manage the books, the members and the issue-return process
Objectives &
Outline • Relationship
Library
Information
◦ book issue
System
Specification
• Involved Entity Sets
Entity Sets
Relationships
◦ students / faculty / members
Relational Schema
Schema Refinement
. member no
Final Schema
◦ books
Module Summary
. accession no
• Relationship Attribute
◦ doi – date of issue
• Type of relationship
◦ Many-to-one from books
Database Management Systems Partha Pratim Das 28.17
LIS Relational Schema PPD
Module 28
Partha Pratim • books(title, author fname, author lname, publisher, year, ISBN no, accession no)
Das
• book issue(members, accession no, doi)
Objectives &
Outline • members(member no, member type)
Library
Information • quota(member type, max books, max duration)
System
Specification • students(member no, student fname, student lname, roll no, department, gender,
Entity Sets
Relationships mobile no, dob, degree)
Relational Schema
Schema Refinement • faculty(member no, faculty fname, faculty lname, id, department, gender, mobile no,
Final Schema
doj)
Module Summary
• staff(staff fname, staff lname, id, gender, mobile no, doj)
Module 28
Partha Pratim • books(title, author fname, author lname, publisher, year, ISBN no, accession no)
Das
◦ ISBN no → title, author fname, author lname, publisher, year
Objectives &
Outline ◦ accession no → ISBN no
Library ◦ Key: accession no
Information
System • Redundancy of book information across copies
Specification
Entity Sets
Relationships
• Good to normalize:
Relational Schema ◦ book catalogue(title, author fname, author lname, publisher, year, ISBN no)
Schema Refinement
Final Schema . ISBN no → title, author fname, author lname, publisher, year
Module Summary . Key: ISBN no
◦ book copies(ISBN no, accession no)
. accession no → ISBN no
. Key: accession no
• Both in BCNF. Decomposition is lossless join and dependency preserving
Module 28
Module Summary
Module 28
Module Summary
Module 28
Module 28
Partha Pratim • students(member no, student fname, student lname, roll no, department, gender,
Das
mobile no, dob, degree)
Objectives &
Outline ◦ roll no → student fname, student lname, department, gender, mobile no, dob,
Library degree
Information
System ◦ member no → roll no
Specification
Entity Sets
◦ roll no → member no
Relationships ◦ 2 Keys: roll no | member no
Relational Schema
Schema Refinement • In BCNF
Final Schema
Module 28
Partha Pratim • faculty(member no, faculty fname, faculty lname, id, department, gender, mobile no,
Das
doj)
Objectives &
Outline ◦ id → faculty fname, faculty lname, department, gender, mobile no, doj
Library ◦ id → member no
Information
System ◦ member no →id
Specification
Entity Sets
◦ 2 Keys: id | member no
Relationships
Relational Schema
• In BCNF
Schema Refinement
Final Schema
• Issues:
Module Summary ◦ member no is needed for issue / return queries. It is unnecessary to have faculty
details with that.
◦ member no may also come from students relation.
◦ member type is needed for issue / return queries. This is implicit by the fact that
we are in faculty relation.
Module 28
Module 28 There are 4 categories of members: ug students, grad students, research scholars, and
Partha Pratim faculty members. This leads to the following specialization relationships:
Das
• Consider the entity set members of a library and refine:
Objectives &
Outline ◦ Attributes:
Library
Information . member no
System
Specification
. member class – ‘student’ or ‘faculty’, used to choose table
Entity Sets . member type – ug,pg, rs, fc, ...
Relationships
Relational Schema . roll no (if member class – ‘student’. Else null)
Schema Refinement
Final Schema
. id (if member class – ‘faculty’. Else null)
Module Summary • We can then exploit some hidden relationship:
◦ students IS A members
◦ faculty IS A members
• Type of relationship
◦ One-to-one
Module 28
Module 28
Partha Pratim • members(member no, member class, member type, roll no, id)
Das
◦ member no → member type, member class, roll no, id
Objectives &
Outline ◦ member type → member class
Library ◦ Key: member no
Information
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Final Schema
Module Summary
Module 28
• students(student fname, student lname, roll no, department, gender, mobile no, dob,
Partha Pratim
Das degree)
Objectives &
◦ roll no → student fname, student lname, department, gender, mobile no, dob,
Outline
degree
Library
Information ◦ Keys: roll no
System
Specification
◦ Note:
Entity Sets
Relationships
. member no is no longer used
Relational Schema . member type and member class are set in members from degree at the time of
Schema Refinement
Final Schema
creation of a new record.
Module Summary
Module 28
• faculty(faculty fname, faculty lname, id, department, gender, mobile no, doj)
Partha Pratim
Das ◦ id → faculty fname, faculty lname, department, gender, mobile no, doj
Objectives &
◦ Keys: id
Outline
◦ Note:
Library
Information . member no is no longer used
System
Specification . member type and member class are set in members at the time of creation of a
Entity Sets
Relationships
new record
Relational Schema
Schema Refinement
Final Schema
Module Summary
Module 28
Partha Pratim • book catalogue(title, author fname, author lname, publisher, year, ISBN no)
Das
• book copies(ISBN no, accession no)
Objectives &
Outline • book issue(member no, accession no, doi)
Library
Information • quota(member type, max books, max duration)
System
Specification • members(member no, member class, member type, roll no, id)
Entity Sets
Relationships
• students(student fname, student lname, roll no, department, gender, mobile no, dob,
Relational Schema
Schema Refinement degree)
Final Schema
Module Summary
• faculty(faculty fname, faculty lname, id, department, gender, mobile no, doj)
• staff(staff fname, staff lname, id, gender, mobile no, doj)
Module 28
Partha Pratim • Using the specification for a Library Information System, we have illustrated how a
Das
schema can be designed and then refined for finalization
Objectives &
Outline • Coding of various queries based on these schema are left as exercises
Library
Information
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Final Schema
Module Summary
Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.
Partha Pratim
Das
Objectives &
Outline Database Management Systems
Multivalued
Dependency Module 29: Relational Database Design/9: MVD and 4NF
Definition
Example
Use
Theory
Module Summary
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
ppd@cse.iitkgp.ac.in
Module 29
Partha Pratim • Using the specification for a Library Information System, we have illustrated how a
Das
schema can be designed and then refined for finalization
Objectives &
Outline
Multivalued
Dependency
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Module 29
Partha Pratim • To understand multi-valued dependencies arising out of attributes that can have
Das
multiple values
Objectives &
Outline • To define Fourth Normal Form and learn the decomposition algorithm to 4NF
Multivalued
Dependency
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Module 29
Multivalued
Dependency
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Module 29
Partha Pratim
Das
Objectives &
Outline
Multivalued
Dependency
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Multivalued Dependency
Objectives &
Outline
Multivalued
Dependency
Definition
Example
There are no non trivial FDs because all attributes are
Use combined forming Candidate Key, that is, MDP. In the
Theory
Decomposition to
above relation, two multivalued dependencies exists:
4NF
• Man Phones
Module Summary
• Man Dogs Like
A man’s phone are independent of the dogs they like.
But after converting the above relation in Single Valued
Attribute, each of a man’s phones appears with each of
the dogs they like in all combinations.
Source: http://www.edugrabs.com/multivalued-dependency-mvd/
Module 29
• If two or more independent relations are kept in a single relation, then Multivalued
Partha Pratim
Das Dependency is possible. For example, Let there are two relations :
Objectives &
◦ Student(SID, Sname) where (SID → Sname)
Outline
◦ Course(CID, Cname) where (CID → Cname)
Multivalued
Dependency • There is no relation defined between Student and Course. If we kept them in a single
Definition
Example relation named Student Course, then MVD will exists because of m:n Cardinality
Use
Theory • If two or more MVDs exist in a relation, then while converting into SVAs, MVD exists.
Decomposition to
4NF
Module Summary
Source: http://www.edugrabs.com/multivalued-dependency-mvd/
Module 29
Partha Pratim
• Suppose we record names of children, and phone numbers for instructors:
Das
◦ inst child(ID, child name)
Objectives & ◦ inst phone(ID, phone number)
Outline
Multivalued
• If we were to combine these schema to get
Dependency
◦ inst info(ID, child name, phone number)
Definition
Example
◦ Example data:
Use (99999, David, 512-555-1234)
Theory (99999, David, 512-555-4321)
Decomposition to (99999, William, 512-555-1234)
4NF
(99999, William, 512-555-4321)
Module Summary
• This relation is in BCNF
◦ Why?
Module 29
• Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
Partha Pratim
Das αβ
Objectives &
holds on R if in any legal relation r(R), for all pairs for tuples t1 and t2 in r such that
Outline t1 [α] = t2 [α], there exist tuples t3 and t4 in r such that:
Multivalued
Dependency
Definition
t1 [α] = t2 [α] = t3 [α] = t4 [α]
Example t3 [β] = t1 [β]
Use t3 [R – β] = t2 [R – β] Test: course book
Theory
t4 [β] = t2 [β]
Decomposition to t4 [R – β] = t1 [R – β]
4NF
Module Summary
Module 29
Partha Pratim
• Let R be a relation schema with a set of attributes that are partitioned into 3 nonempty subsets.
Das Y, Z, W
Objectives & • We say that Y Z (Y multidetermines Z ) if and only if for all possible relations r (R )
Outline < y1 , z1 , w1 >∈ r and < y1 , z2 , w2 >∈ r
Multivalued then
Dependency
Definition
< y1 , z1 , w2 >∈ r and < y1 , z2 , w1 >∈ r
Example
Use
• Note that since the behavior of Z and W are identical it follows that
Theory
Y Z if Y W
Decomposition to
4NF
Module Summary
Module 29
Module 29
Module 29
Module Summary
◦ Y is a subset of X (X ⊇ Y) or
◦ X ∪ Y = R. Otherwise, it is a non trivial MVD and we have to repeat values
redundantly in the tuples.
Module 29
• From the definition of multivalued dependency, we can derive the following rule:
Partha Pratim
Das ◦ If α → β, then α β
Objectives & That is, every functional dependency is also a multivalued dependency
Outline
Multivalued
• The closure D + of D is the set of all functional and multivalued dependencies logically
Dependency
Definition
implied by D.
Example
◦ We can compute D + from D, using the formal definitions of functional
Use
Theory dependencies and multivalued dependencies.
Decomposition to
4NF
◦ We can manage with such reasoning for very simple multivalued dependencies,
Module Summary
which seem to be most common in practice
◦ For complex dependencies, it is better to reason about sets of dependencies using a
system of inference rules
Module 29
Partha Pratim
Das
Objectives &
Outline
Multivalued
Dependency
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Decomposition to 4NF
Module 29
• A relation schema R is in 4NF with respect to a set D of functional and multivalued
Partha Pratim
Das dependencies if for all multivalued dependencies in D + of the form α β, where α ⊆
Objectives &
R and β ⊆ R, at least one of the following hold:
Outline
◦ α β is trivial (that is, β ⊆ α or α ∪ β = R)
Multivalued
Dependency ◦ α is a superkey for schema R
Definition
Example • If a relation is in 4NF it is in BCNF
Use
Theory
Decomposition to
4NF
Module Summary
Module 29
• The restriction of D to Ri is the set Di consisting of
Partha Pratim
Das ◦ All functional dependencies in D + that include only attributes of Ri
Objectives &
◦ All multivalued dependencies of the form
Outline
α (β ∩ Ri )
Multivalued
Dependency where α ⊆ Ri and α β is in D +
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Module 29
Module 29
result: = {R};
Partha Pratim
Das
done := false;
compute D + ;
Objectives & Let Di denote the restriction of D + to Ri
Outline
while ( not done)
Multivalued
Dependency
if (there is a schema Ri in result that is not in 4NF) then
Definition begin
Example let α β be a nontrivial multivalued dependency that holds
Use
Theory
on Ri such that α → Ri is not in Di , and α ∩ β = φ ;
result := (result − Ri ) ∪ (Ri − β) ∪ (α, β);
Decomposition to
4NF end
Module Summary
else done:= true;
Note: each Ri is in 4NF, and decomposition is lossless-join
Module 29
• Example:
Partha Pratim
Das ◦ Person Modify(Man(M), Phones(P), Dog Likes(D),
Address(A)) Post Normalization
Objectives &
Outline ◦ FDs:
Multivalued . FD1 : Man Phones
Dependency
Definition
Example
. FD2 : Man Dogs Like
Use
Theory . FD3 : Man → Address
Decomposition to
4NF
◦ Key = MPD
Module Summary
◦ All dependencies violate 4NF
Module 29
• R =(A, B, C, G, H, I)
Partha Pratim
Das F=AB
Objectives &
B HI
Outline CG H
Multivalued
Dependency • R is not in 4NF since A B and A is not a superkey for R
Definition
Example • Decomposition
Use
Theory
a) R1 = (A, B) (R1 is in 4NF)
Decomposition to b) R2 = (A, C, G, H, I) (R2 is not in 4NF, decompose into R3 and R4 )
4NF
c) R3 = (C, G, H) (R3 is in 4NF)
Module Summary
d) R4 = (A, C, G, I) (R4 is not in 4NF, decompose into R5 and R6 )
◦ A B and B HI → A HI, (MVD transitivity), and
◦ and hence A I (MVD restriction to R4 )
e) R5 = (A, I) (R5 is in 4NF)
f) R6 = (A, C, G) (R6 is in 4NF)
Module 29
Partha Pratim • Understood multi-valued dependencies to handle attributes that can have multiple
Das
values
Objectives &
Outline • Learnt Fourth Normal Form and decomposition to 4NF
Multivalued
Dependency
Definition
Example
Use
Theory
Decomposition to
4NF
Module Summary
Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.
Partha Pratim
Das
Objectives &
Outline Database Management Systems
Database Design
Process Module 30: Relational Database Design/10: Design Summary and Temporal Data
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Partha Pratim Das
Temporal
Databases
Temporal Data
Department of Computer Science and Engineering
Uni / Bi Temporal
Indian Institute of Technology, Kharagpur
Example
Module 30
Partha Pratim • Understood multi-valued dependencies to handle attributes that can have multiple
Das
values
Objectives &
Outline • Learnt Fourth Normal Form and decomposition to 4NF
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Module 30
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Module 30
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Module 30
Partha Pratim
Das
Objectives &
Outline
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Database Design Process
Module Summary
Module 30
• Goal for a relational database design is:
Partha Pratim
Das ◦ BCNF / 4NF
Objectives &
◦ Lossless join
Outline
◦ Dependency preservation
Database Design
Process • If we cannot achieve this, we accept one of
Normal Forms
Normalization &
De-Normalization
◦ Lack of dependency preservation
Bad Design ◦ Redundancy due to use of 3NF
LIS Example
Temporal
• Interestingly, SQL does not provide a direct way of specifying functional dependencies
Databases
Temporal Data
other than superkeys.
Uni / Bi Temporal
Example
• Can specify FDs using assertions, but they are expensive to test, (and currently not
Module Summary
supported by any of the widely used databases!)
• Even if we had a dependency preserving decomposition, using SQL we would not be
able to efficiently test a functional dependency whose left hand side is not a key
Module 30
• Further NFs
Partha Pratim
Das ◦ Elementary Key Normal Form (EKNF)
Objectives &
◦ Essential Tuple Normal Form (ETNF)
Outline
◦ Join Dependencies And Fifth Normal Form (5 NF)
Database Design
Process ◦ Sixth Normal Form (6NF)
Normal Forms
Normalization &
◦ Domain/Key Normal Form (DKNF)
De-Normalization
Bad Design
• Join dependencies generalize multivalued dependencies
LIS Example
◦ lead to project-join normal form (PJNF) (also called fifth normal form)
Temporal
Databases
Temporal Data
• A class of even more general constraints, leads to a normal form called domain-key
Uni / Bi Temporal normal form.
Example
Module Summary
• Problem with these generalized constraints: are hard to reason with, and no set of
sound and complete set of inference rules exists.
• Hence rarely used
Module 30
• We have assumed schema R is given
Partha Pratim
Das ◦ R could have been generated when converting E-R diagram to a set of tables
Objectives &
◦ R could have been a single relation containing all attributes that are of interest
Outline
(universal relation)
Database Design
Process ◦ Normalization breaks R into smaller relations
Normal Forms
Normalization &
◦ R could have been the result of some ad hoc design of relations, which we then
De-Normalization
test/convert to normal form
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Module 30
Partha Pratim • When an E-R diagram is carefully designed, identifying all entities correctly, the tables
Das
generated from the E-R diagram should not need further normalization
Objectives &
Outline • However, in a real (imperfect) design, there can be functional dependencies from
Database Design non-key attributes of an entity to other attributes of the entity
Process
Normal Forms ◦ Example: an employee entity with attributes
Normalization &
De-Normalization department name and building,
Bad Design
LIS Example
and a functional dependency
Temporal department name → building
Databases
Temporal Data
◦ Good design would have made department an entity
Uni / Bi Temporal
Example
• Functional dependencies from non-key attributes of a relationship set possible, but rare
Module Summary — most relationships are binary
Module 30
• May want to use non-normalized schema for performance
Partha Pratim
Das • For example, displaying prereqs along with course id, and title requires join of course
Objectives & with prereq
Outline
Database Design
◦ Course(course id, title,. . . )
Process ◦ Prerequisite(course id, prereq)
Normal Forms
Normalization &
De-Normalization
• Alternative 1: Use denormalized relation containing attributes of course as well as
Bad Design prereq with all above attributes: Course(course id, title, prereq,. . . )
LIS Example
Temporal
◦ faster lookup
Databases
Temporal Data
◦ extra space and extra execution time for updates
Uni / Bi Temporal ◦ extra coding work for programmer and possibility of error in extra code
Example
Module Summary
• Alternative 2: Use a materialized view defined as Course ./ Prerequisite
◦ Benefits and drawbacks same as above, except no extra coding work for
programmer and avoids possible errors
Module 30
• Some aspects of database design are not caught by normalization
Partha Pratim
Das • Examples of bad database design, to be avoided:
Objectives & Instead of earnings (company id, year, amount ), use
Outline
Database Design
◦ earnings 2004, earnings 2005, earnings 2006, etc., all on the schema (company id,
Process earnings).
Normal Forms
Normalization &
De-Normalization
. Above are in BCNF, but make querying across years difficult and needs new
Bad Design table each year
LIS Example
Temporal
◦ company year (company id, earnings 2004, earnings 2005, earnings 2006 )
Databases
Temporal Data
. Also in BCNF, but also makes querying across years difficult and requires new
Uni / Bi Temporal attribute each year.
Example
Module Summary
. Is an example of a crosstab, where values for one attribute become column
names
. Used in spreadsheets, and in data analysis tools
Module 30
• Consider a different version of relation book catalogue having the following attributes:
Partha Pratim
Das
◦ book title
◦ book catalogue, author lname: A book title may be associated with more than one
Objectives &
Outline author.
Database Design
Process
• book title {book title, author fname, author lname, edition}
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Module 30
Partha Pratim
Das
Objectives &
Outline
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Figure: book catalogue
Temporal Data
Uni / Bi Temporal
Example • Since the relation has no FDs, it is already in BCNF.
Module Summary
• However, the relation has two nontrivial MVDs
book title {author fname, author lname} and book title edition.
Thus, it is not in 4NF.
• Nontrivial MVDs must be decomposed to convert it into a set of relations in 4NF.
Database Management Systems Partha Pratim Das 30.13
LIS Example 4NF (3)
Module 30
Partha Pratim
Das
• We decompose book catalogue into book author
Objectives & and book edition because:
Outline
Database Design
◦ book author has trivial MVD
Process
Normal Forms
book title {author fname, author lname}
Normalization &
De-Normalization
◦ book edition has trivial MVD
Bad Design
Figure: book author book title edition.
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Module 30
Partha Pratim
Das
Objectives &
Outline
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Temporal Databases
Module Summary
Module 30
• Some data may be inherently historical because they include time-dependent /
Partha Pratim
Das
time-varying data, such as:
◦ Medical Records
Objectives &
Outline ◦ Judicial records
Database Design
Process
◦ Share prices
Normal Forms ◦ Exchange rates
Normalization &
De-Normalization ◦ Interest rates
Bad Design
LIS Example
◦ Company profits
Temporal ◦ etc.
Databases
Temporal Data • The desire to model such data means that we need to store not only the respective
Uni / Bi Temporal
Example
value but also an associated date or a time period for which the value is valid. Typical
Module Summary
queries expressed informally might include:
◦ Give me last month’s history of the Dollar-Pound Sterling exchange rate.
◦ Give me the share prices of the NYSE on October 17, 1996.
• Temporal databases provide a uniform and systematic way of dealing with historical data
Source: https://www.cs.uct.ac.za/mit notes/database/htmls/chp18.html
Database Management Systems Partha Pratim Das 30.16
Temporal Data
Module 30
• Temporal data have an association time interval during which the data are valid.
Partha Pratim
Das • A snapshot is the value of the data at a particular point in time
Objectives &
Outline
• In practice, database designers may add start and end time attributes to relations
Database Design • For example, course(course id, course title) is replaced by
Process
Normal Forms
course(course id, course title, start, end)
Normalization &
De-Normalization ◦ Constraint: no two tuples can have overlapping valid times and are Hard to enforce
Bad Design
LIS Example
efficiently
Temporal ◦ Foreign key references may be to current version of data, or to data at a point in
Databases
Temporal Data
time
Uni / Bi Temporal
Example
. For example, student transcript should refer to course information at the time
Module Summary
the course was taken
Module 30
• There are two different aspects of time in temporal databases.
Partha Pratim
Das
◦ Valid Time: Time period during which a fact is true in real world, provided to the
system.
Objectives &
Outline ◦ Transaction Time: Time period during which a fact is stored in the database, based
Database Design on transaction serialization order and is the timestamp generated automatically by
Process
Normal Forms the system.
Normalization &
De-Normalization • Temporal Relation is one where each tuple has associated time; either valid time or
Bad Design
LIS Example
transaction time or both associated with it.
Temporal ◦ Uni-Temporal Relations: Has one axis of time, either Valid Time or Transaction
Databases
Temporal Data
Time.
Uni / Bi Temporal
◦ Bi-Temporal Relations: Has both axis of time – Valid time and Transaction time.
Example
Module Summary
It includes Valid Start Time, Valid End Time, Transaction Start Time, Transaction
End Time.
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Database Management Systems Partha Pratim Das 30.19
Modeling Temporal Data: Example (1)
Module 30
• Example.
Partha Pratim
Das ◦ Let’s see an example of a person, John:
Objectives & . John was born on April 3, 1992 in Chennai.
Outline
. His father registered his birth after three days on April 6, 1992.
Database Design
Process . John did his entire schooling and college in Chennai.
Normal Forms
Normalization &
. He got a job in Mumbai and shifted to Mumbai on June 21, 2015.
De-Normalization
Bad Design
. He registered his change of address only on Jan 10, 2016.
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Module Summary
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Module 30
Partha Pratim
• John’s Data In Non-Temporal Database
Das
Objectives &
• John was born on April 3, 1992
Outline in Chennai.
Database Design • His father registered his birth af-
Process ter three days on April 6, 1992.
Normal Forms
Normalization &
• John did his entire schooling and
De-Normalization college in Chennai.
Bad Design
LIS Example
• He got a job in Mumbai and
shifted to Mumbai on June 21,
Temporal In a non-temporal database, John’s address is entered as Chennai from 1992. When he 2015.
Databases
Temporal Data
registers his new address in 2016, the database gets updated and the address field now • He registered his change of ad-
shows his Mumbai address. The previous Chennai address details will not be available. dress only on Jan 10, 2016.
Uni / Bi Temporal
Example
So, it will be difficult to find out exactly when he was living in Chennai and when he
moved to Mumbai.
Module Summary
Module 30
• Uni-Temporal Relation (Adding Valid Time To John’s Data)
Partha Pratim
Das
Objectives &
Outline
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Database Management Systems Partha Pratim Das 30.22
Modeling Temporal Data: Example (4)
Module 30
• Bi-Temporal Relation (John’s Data Using Both Valid And Transaction Time)
Partha Pratim
Das
Objectives &
Outline • John was born on April 3, 1992
Database Design in Chennai.
Process
Normal Forms
• His father registered his birth af-
ter three days on April 6, 1992.
Normalization &
De-Normalization
• John did his entire schooling and
Bad Design • The database contents look like this: college in Chennai.
LIS Example Name, City, Valid From, Valid Till, Entered, Superseded
• He got a job in Mumbai and
Temporal
Databases
• Johns father registers his birth on 6th April 1992: shifted to Mumbai on June 21,
Person(John, Chennai, 3-Apr-1992, ∞, 6-Apr-1992, ∞). 2015.
Temporal Data
Uni / Bi Temporal • On January 10, 2016 John reports his new address in Mumbai: • He registered his change of ad-
Example Person(John, Mumbai, 21-June-2015, ∞, 10-Jan-2016, ∞). dress only on Jan 10, 2016.
Module Summary ◦ The original entry is updated as:
Person(John, Chennai, 3-Apr-1992, 20-June-2015, 6-Apr-1992 ,
10-Jan-2016).
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Module 30
• Advantages
Partha Pratim
Das
◦ The main advantages of this bi-temporal relations is that it provides historical and
roll back information.
Objectives &
Outline . Historical Information – Valid Time.
Database Design
Process
. Rollback Information – Transaction Time.
Normal Forms ◦ For example, you can get the result for a query on John’s history, like: Where did
Normalization &
De-Normalization John live in the year 2001?. The result for this query can be got with the valid time
Bad Design
LIS Example
entry. The transaction time entry is important to get the rollback information.
Temporal • Disadvantages
Databases
Temporal Data
◦ More storage
Uni / Bi Temporal ◦ Complex query processing
Example
Module Summary
◦ Complex maintenance including backup and recovery
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Database Management Systems Partha Pratim Das 30.24
Module Summary
Module 30
Database Design
Process
Normal Forms
Normalization &
De-Normalization
Bad Design
LIS Example
Temporal
Databases
Temporal Data
Uni / Bi Temporal
Example
Slides used in this presentation are borrowed from http://db-book.com/ with kind
Module Summary permission of the authors.
Edited and new slides are marked with “PPD”.
Partha Pratim
Das
Objectives &
Outline Database Management Systems
Decomposition to
3NF Module 27: Relational Database Design/7: Normal Forms
Test
Algorithm
Practice Problem
Decomposition to
BCNF Partha Pratim Das
Test
Algorithm
Practice Problem Department of Computer Science and Engineering
Comparison Indian Institute of Technology, Kharagpur
Module Summary
ppd@cse.iitkgp.ac.in
Module 27
Partha Pratim • Studied the Normal Forms and their Importance in Relational Design – how progressive
Das
increase of constraints can minimize redundancy in a schema
Objectives &
Outline
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
Partha Pratim
Das
Objectives &
Outline
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module 27
Module Summary
Module 27
Partha Pratim • A relational schema R is in 3NF if for every FD X → A associated with R either
Das
◦ A ⊆ X (that is, the FD is trivial) or
Objectives &
Outline ◦ X is a superkey of R or
Decomposition to ◦ A is part of some candidate key (not just superkey!)
3NF
Test • A relation in 3NF is naturally in 2NF
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
Partha Pratim • Optimization: Need to check only FDs in F , need not check all FDs in F + .
Das
• Use attribute closure to check for each dependency α → β, if α is a superkey.
Objectives &
Outline • If α is not a superkey, we have to verify if each attribute in β is contained in a
Decomposition to
3NF
candidate key of R
Test
Algorithm
◦ This test is rather more expensive, since it involve finding candidate keys
Practice Problem ◦ Testing for 3NF has been shown to be NP-hard
Decomposition to
BCNF
◦ Decomposition into 3NF can be done in polynomial time
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
Module Summary
Module 27
Module Summary
Module 27
Module 27
Module 27
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Solution is given in the next slide (hidden from presentation – check after you have solved
Comparison
Module Summary
Module 27
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Solution is given in the next slide (hidden from presentation – check after you have solved)
Comparison
Module Summary
Module 27
Decompose the following schema to 3NF in the following steps
Partha Pratim
Das • Compute all keys for R
Objectives & • Compute a Canonical Cover Fc for F Put the FDs into alphabetical order.
Outline
Decomposition to
• Using Fc , employ the 3NFdecom algorithm to obtain a lossless and dependency preserving
3NF decomposition of relation R into a collection of relations that are in 3NF
Test
Algorithm
• Does your schema allow redundancy?
Practice Problem
Decomposition to • R(ABCDEFGH):
BCNF
Test
F = {A → CD, ACF → G , AD → BEF , BCG → D, CF → AH, CH → G , D → B, H → DEG }
Algorithm
Practice Problem
• R(ABCDE ):
Comparison
F = {A → B, A → C , C → D, A → E }
Module Summary • R(ABCDE ):
F = {A → BC , CD → E , B → D, E → A}
• R(ABCD):
F = {A → D, AB → C , AD → C , B → C , D → AB}
Module 27
Partha Pratim
Das
Objectives &
Outline
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module 27
Partha Pratim • A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F + of
Das
the form
Objectives &
Outline
α → β, where α ⊆ R and β ⊆ R at least one of the following holds:
Decomposition to ◦ α → β is trivial (that is, β ⊆ α)
3NF
Test
◦ α is a superkey for R
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
• To check if a non-trivial dependency α → β causes a violation of BCNF
Partha Pratim
Das a) Compute α+ (the attribute closure of α), and
Objectives &
b) Verify that it includes all attributes of R, that is, it is a superkey of R.
Outline
• Simplified test: To check if a relation schema R is in BCNF, it suffices to check only
Decomposition to
3NF the dependencies in the given set F for violation of BCNF, rather than checking all
Test
Algorithm
dependencies in F + .
Practice Problem
◦ If none of the dependencies in F causes a violation of BCNF, then none of the
Decomposition to
BCNF dependencies in F + will cause a violation of BCNF either.
Test
Algorithm • However, simplified test using only F is incorrect when testing a relation in a
Practice Problem
Comparison
decomposition of R
Module Summary ◦ Consider R = (A, B, C , D, E ), with F = {A → B, BC → D}
. Decompose R into R1 = (A, B) and R2 = (A, C , D, E )
. Neither of the dependencies in F contain only attributes from (A, C , D, E ) so we
might be mislead into thinking R2 satisfies BCNF.
. In fact, dependency AC → D in F + shows R2 is not in BCNF.
Database Management Systems Partha Pratim Das 27.19
BCNF Decomposition (3): Testing for BCNF Decomposition
Module 27
◦ F 0 = F1 ∪ F2 ∪ F3 .
◦ Checking for: D → A in F 0+
. D → C (from R3), C → B (from R2), B → A (from R1) : D→ A (By Transitivity)
Hence all dependencies are preserved.
Database Management Systems Partha Pratim Das 27.21
BCNF Decomposition (4): Testing Dependency Preservation:
Using Closure of Attributes (Poly. Algo.): Module 25
Module 27
• R(ABCD) :. F = {A → B, B → C , C → D, D → A}
Partha Pratim
Das • Decomp = {AB, BC , CD}
Objectives & • On projections:
Outline
Decomposition to
3NF
Test
Algorithm
Practice Problem
In this algo F1, F2, F3 are not the closure sets, rather the set of dependencies directly applicable on R1, R2, R3
Decomposition to respectively.
BCNF
Test • Need to check for: A → B, B → C , C → D, D → A
Algorithm
Practice Problem • (D) + /F 1 = D. (D) + /F 2 = D. (D) + /F 3 = D. So, D → A could not be preserved.
Comparison
• In the previous method we saw the dependency was preserved. In reality also it is preserved.
Module Summary
Therefore the polynomial time algorithm may not work in case of all examples. To prove preservation
Algo 2 is sufficient but not necessary whereas Algo 1 is both sufficient as well as necessary.
Note: This difference in result can occur in any example where a functional dependency of one decomposed table
uses another functional dependency in its closure which is not applicable on any of the decomposed table because of
absence of all attributes in the table.
Database Management Systems Partha Pratim Das 27.22
BCNF Decomposition (4): Algorithm PPD
Module 27
Module Summary
• Similarly F 2+
Module 27
Partha Pratim
result := {R};
Das done := false;
Objectives & compute F + ;
Outline
while (not done) do
Decomposition to
3NF if (there is a schema Ri in result that is not in BCNF)
Test
Algorithm
then begin
Practice Problem
let α → β be a nontrivial functional dependency that
Decomposition to
BCNF
holds on Ri such that α → β is not in F + ,
Test and α ∩ β = φ;
Algorithm
Practice Problem result := (result − Ri ) ∪ (Ri − β) ∪ (α, β);
Comparison
end
Module Summary
else done := true;
Module 27
Module Summary
Module 27
• class (course id, title, dept name, credits, sec id, semester, year, building,
Partha Pratim
Das
room number, capacity, time slot id)
Objectives &
• Functional dependencies:
Outline
◦ course id → title, dept name, credits
Decomposition to
3NF ◦ building, room number → capacity
Test
Algorithm
◦ course id, sec id, semester, year → building, room number, time slot id
Practice Problem
• A candidate key course id, sec id, semester, year.
Decomposition to
BCNF
Test
• BCNF Decomposition:
Algorithm ◦ course id → title, dept name, credits holds
Practice Problem
Comparison . but course id is not a superkey
Module Summary
◦ We replace class by:
. course(course id, title, dept name, credits)
. class-1 (course id, sec id, semester, year, building,
room number, capacity, time slot id)
Database Management Systems Partha Pratim Das 27.26
BCNF Decomposition (8): Example
Module 27
Module Summary
Module 27
Partha Pratim • It is not always possible to get a BCNF decomposition that is dependency preserving
Das
• R = (J, K , L)
Objectives &
Outline F = {JK → L
Decomposition to L → K}
3NF
Test Two candidate keys = JK and JL
Algorithm
Practice Problem • R is not in BCNF
Decomposition to
BCNF
• Any decomposition of R will fail to preserve
Test JK → L
Algorithm
Practice Problem This implies that testing for JK → L requires a join
Comparison
Module Summary
Module 27
Partha Pratim
Decompose the following schema to BCNF
Das
• R = ABCDE . F = {A → B, BC → D}
Objectives &
Outline • R = ABCDEH. F = {A → BC , E → HA}
Decomposition to
3NF
• R = CSJDPQV . F = {C → CSJDPQV , SD → P, JP → C , J → S}
Test
Algorithm
• R = ABCD. F = {C → D, C → A, B → C }
Practice Problem
Decomposition to
BCNF
Test
Algorithm
Practice Problem
Comparison
Module Summary
Module 27
• It is always possible to decompose a relation into a set of relations that are in 3NF such
Partha Pratim
Das
that:
Objectives &
◦ the decomposition is lossless
Outline ◦ the dependencies are preserved
Decomposition to
3NF • It is always possible to decompose a relation into a set of relations that are in BCNF
Test
Algorithm
such that:
Practice Problem
◦ the decomposition is lossless
Decomposition to
BCNF ◦ it may not be possible to preserve dependencies.
Test
Algorithm
Practice Problem
S# 3NF BCNF
Comparison 1. It concentrates on Primary Key It concentrates on Candidate Key
Module Summary 2. Redundancy is high as compared to BCNF 0% redundancy
3. It preserves all the dependencies It may not preserve the dependencies
4. A dependency X → Y is allowed in 3NF if A dependency X → Y is allowed if X is a
X is a super key or Y is a part of some key super key
Module 27
Partha Pratim • Learnt how to decompose a schema into 3NF while preserving dependency and lossless
Das
join
Objectives &
Outline • Learnt how to decompose a schema into BCNF with lossless join
Decomposition to
3NF
Test
Algorithm
Practice Problem
Decomposition to
BCNF
Test
Algorithm Slides used in this presentation are borrowed from http://db-book.com/ with kind
Practice Problem
Comparison
permission of the authors.
Module Summary Edited and new slides are marked with “PPD”.