0% found this document useful (0 votes)
30 views17 pages

I-Bsc (DBMS) (Unit - 3 Full)

The document explains the differences between primary keys, foreign keys, candidate keys, super keys, alternate keys, and composite keys in relational databases, emphasizing their roles in maintaining data integrity. It also addresses data redundancy and its associated anomalies (insertion, deletion, updating) and introduces normalization as a method to organize data and reduce redundancy. Various normal forms are described, along with their advantages and disadvantages in database design.

Uploaded by

Prabhahar S
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
30 views17 pages

I-Bsc (DBMS) (Unit - 3 Full)

The document explains the differences between primary keys, foreign keys, candidate keys, super keys, alternate keys, and composite keys in relational databases, emphasizing their roles in maintaining data integrity. It also addresses data redundancy and its associated anomalies (insertion, deletion, updating) and introduces normalization as a method to organize data and reduce redundancy. Various normal forms are described, along with their advantages and disadvantages in database design.

Uploaded by

Prabhahar S
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 17

Difference between Primary Key and Foreign Key

Keys are one of the most important elements in a relational


database to maintain the relationship between the tables and it also
helps in uniquely identifying the data from a table. The primary key is
a key that helps uniquely identify the tuple of the database. In
contrast, the Foreign Key is a key used to determine the relationship
between the tables through the primary key of one table that is the
primary key of one table acts as a foreign key to another table. Now,
let’s discuss both of them in some detail.
What is Primary Key?
A primary key is used to ensure that data in the specific column
is unique. A column cannot have NULL values. It is either an existing
table column or a column that is specifically generated by the
database according to a defined sequence.
Example: STUD_NO, as well as STUD_PHONE, are candidate keys
for relation STUDENT but STUD_NO can be chosen as the primary
key (only one out of many candidate keys).
Table STUDENT:

STUD_ STUD_N STUD_PH STUD_ST STUD_CO STUD_A


NO AME ONE ATE UNT GE

98652782
1 RAM Haryana India 20
51

96554702
2 RAM Punjab India 19
31

75142903 Rajastha
3 SUJIT India 18
59 n
STUD_ STUD_N STUD_PH STUD_ST STUD_CO STUD_A
NO AME ONE ATE UNT GE

85641032
4 SURESH Punjab India 21
58

Table STUDENT_COURSE:

STUD_NO COURSE_NO COURSE_NAME

1 C1 DBMS

2 C2 Computer Networks

1 C2 Computer Networks

What is Foreign Key?


A foreign key is a column or group of columns in a relational
database table that provides a link between data in two tables. It is a
column (or columns) that references a column (most often the primary
key) of another table.
Example: STUD_NO in STUDENT_COURSE is a foreign key to
STUD_NO in STUDENT relation.
Difference Between Primary Key and Foreign Key

PRIMARY KEY FOREIGN KEY

A foreign key is a column or group of


A primary key is used to
columns in a relational database table
ensure data in the specific
that provides a link between data in two
column is unique.
tables.
PRIMARY KEY FOREIGN KEY

It uniquely identifies a
It refers to the field in a table which is
record in the relational
the primary key of another table.
database table.

Only one primary key is Whereas more than one foreign key is
allowed in a table. allowed in a table.

It is a combination of
It can contain duplicate values and a
UNIQUE and Not Null
table in a relational database.
constraints.

It does not allow NULL


It can also contain NULL values.
values .

Its value cannot be deleted Its value can be deleted from the child
from the parent table. table.

It constraint can be
It constraint cannot be defined on the
implicitly defined on the
local or global temporary tables.
temporary tables.

Primary and foreign keys are essential in maintaining database


integrity. For more insights into relational databases, the GATE CS
Self-Paced Course is a great resource.Candidate Key in DBMS
CANDIDATE KEY
A candidate key is a part of a key known as Super
Key (discussed in the previous section), where the super key is the
super set of all those attributes that can uniquely identify a table.
What is a Candidate Key
A candidate key is a subset of a super key set where the key which
contains no redundant attribute is none other than a Candidate Key.
In order to select the candidate keys from the set of super key.
Role of a Candidate Key
The role of a candidate key is to identify a table row or column
uniquely. Also, the value of a candidate key cannot be Null. The
description of a candidate key is "no redundant attributes" and being a
"minimal representation of a tuple," according to the Experts.
How a Candidate key is different from a Primary Key
Although the purpose of both candidate and the primary key is the
same, that is to uniquely identify the tuples, and then also they are
different from each other. It is because, in a table, we can have one or
more than one candidate key, but we can create only one primary key
for a table. Thus, from the number of obtained candidate keys, we can
identify the appropriate primary key. However, if there is only one
candidate key in a table, then it can be considered for both key
constraints.
Example of Candidate Key
Let's look at the same example took while discussing Super Key to
understand the working of a candidate key.
We have an EMPLOYEE_DETAIL table where we have the
following attributes:
Emp_SSN: The SSN number is stored in this field.
Emp_Id: An attribute that stores the value of the employee
identification number.
Emp_name: An attribute that stores the name of the employee
holding the specified employee id.
Emp_email: An attribute that stores the email id of the specified
employees.
The EMPLOYEE_DETAIL table is given below that will help you
understand better:

So, from the above table, we obtained the below given super keys
(discussed in the previous section):

Now, from these sets of super keys, we can conclude the candidate
keys. In order to pick up the candidate keys, the best way is to analyze
and form the primary keys as much as we can. So, we need to identify
those sets from the super key sets that alone can identify the whole
table, or we can say the other attributes of the table. Thus, the result
is:

So, these are the three attributes obtained that can identify the other
non-prime attributes of the table. All these are the candidate keys and
from which we can pick the most appropriate attribute that can easily
identify all records of the table, which will be described as the
Primary key.
Difference between Candidate Key and Super Key
From the above discussions, we can have the following difference
points:

Thus, the super key is the super set, the candidate key is the subset,
and the primary key is the sub-subset of the super key.
Super Key in DBMS
The word super denotes the superiority of a key. Thus, a super
key is the superset of a key known as a Candidate key. It means
a candidate key is obtained from a super key only.

Role of Super Key


The role of the super key is simply to identify the tuples of the
specified table in the database. It is the superset where the candidate
key is a part of the super key only. So, all those attributes in a table
that is capable of identifying the other attributes of the table in a
unique manner are all super keys.

Examples of Super Key


Let's consider an EMPLOYEE_DETAIL table example where
we have the following attribute:

Emp_SSN: The SSN number is stored in this field.

Emp_Id: An attribute that stores the value of the employee


identification number.

Emp_name: An attribute that stores the name of the employee


holding the specified employee id.

Emp_email: An attribute that stores the email id of the specified


employees.
These all are the set of super keys which, together or combining
with other prime attributes, can identify a table uniquely.

Alternate Key in DBMS


An alternate key is the secondary candidate key that contains all
the property of a candidate key but is an alternate option.

What is an Alternate Key


An alternate is a secondary candidate key that is capable of
identifying a row uniquely.

Use of Alternate Key


An alternate key is none other than a candidate key, so the use/role of
an alternate key is the same. It means an alternate key is also used to
identify those columns in a table that can uniquely identify all the
records of the table.
Example of Alternate Key
Example 1:
Consider an EMPLOYEE_DETAIL table where we have the
following attributes:

Emp_SSN: The SSN number of employees is stored in this field.

Emp_Id: An attribute that stores the value of the employee


identification number.

Emp_name: An attribute that stores the name of the employee


holding the specified employee id.

Emp_email: An attribute that stores the email id of the specified


employees.
Example 2:
Consider another example of Student_Detail, where we have
the following attributes:

Stud_roll_no: An attribute for providing an identification number to


the students.

Stud_name: An attribute for storing the names of the students on the


specified roll numbers.

Stud_email: An attribute that will hold the email id of the particular


student on a specified roll number.

Stud_Course: An attribute that will hold the course taken by the


particular student having a specified roll number.
Composite Key in DBMS
A 'combination of two or more' better describes the word
'composite'. Thus, a composite key in DBMS is a candidate key that is
composed of two or more attributes and is capable of uniquely
identifying a table or a relation.

What is a Composite Key


A composite key is the DBMS key having two or more
attributes that together can uniquely identify a tuple in a table. Such a
key is also known as Compound Key, where each attribute creating a
key is a foreign key in its own right.

Use of Composite Key


Two or more attributes together form a composite key that can
uniquely identify a tuple in a table. We need to find out such table
columns combination that can form a candidate key and hence a
composite key.

Example of Composite Key


Cust_Id: A customer id is provided to each customer who visits and
is stored in this field.

Order_Id: Each order placed by the customer is given an order id,


which is stored in this field.

Prod_code: It holds the code value for the products available.


Prod_name: An attribute holding the name of the product on the
specified product code.

Redundancy in DBMS
Data redundancy means the occurrence of duplicate copies of
similar data. It is done intentionally to keep the same piece of data at
different places.

What is Data redundancy in the database management system?


Sometimes, it is done on purpose for recovery or backup of data,
faster access of data, or updating data easily. Redundant data costs
extra money, demands higher storage capacity, and requires extra
effort to keep all the files up to date.

Sometimes, unintentional duplicity of data causes a problem for the


database to work properly, or it may become harder for the end user
to access data. Redundant data unnecessarily occupy space in the
database to save identical copies, which leads to space constraints,
which is one of the major problems.

Student_id Name Course Session Fee Department

101 Devi B. Tech 2022 90,000 CS

102 Sona B. Tech 2022 90,000 CS


103 Varun B. Tech 2022 90,000 CS

104 Satish B. Tech 2022 90,000 CS

105 Amisha B. Tech 2022 90,000 CS

In the above example, there is a "Student" table that contains data


such as "Student_id", "Name", "Course", "Session", "Fee", and
"Department". As you can see, some data is repeated in the table,
which causes redundancy.

Problems that are caused due to redundancy in the database


Redundancy in DBMS gives rise to anomalies, and we will
study it further. In a database management system, the problems that
occur while working on data include inserting, deleting, and updating
data in the database.

student_ student_na student_a dept_i dept_na dept_he


id me ge d me ad

Informati
on Jaspreet
1 Shiva 19 104
Technolo Kaur
gy

Electronic Avni
2 Khushi 18 102
s Singh

Informati
on Jaspreet
3 Harsh 19 104
Technolo Kaur
gy
1. Insertion Anomaly:
Insertion anomaly arises when you are trying to insert some data
into the database, but you are not able to insert it.

Example:

If you want to add the details of the student in the above table,
then you must know the details of the department; otherwise, you will
not be able to add the details because student details are dependent on
department details.

2. Deletion Anomaly:
Deletion anomaly arises when you delete some data from the
database, but some unrelated data is also deleted; that is, there will be
a loss of data due to deletion anomaly.

Example:

If we want to delete the student detail, which has student_id 2,


we will also lose the unrelated data, i.e., department_id 102, from the
above table.

3. Updating Anomaly:
An update anomaly arises when you update some data in the
database, but the data is partially updated, which causes data
inconsistency.

Example:

If we want to update the details of dept_head from Jaspreet Kaur


to Ankit Goyal for Dept_id 104, then we have to update it everywhere
else; otherwise, the data will get partially updated, which causes data
inconsistency.

Normalization
A large database defined as a single relation may result in data
duplication. This repetition of data may result in:
o Making relations very large.
o It isn't easy to maintain and update data as it would involve
searching many records in relation.
o Wastage and poor utilization of disk space and resources.
o The likelihood of errors and inconsistencies increases.

So to handle these problems, we should analyze and decompose the


relations with redundant data into smaller, simpler, and well-
structured relations that are satisfy desirable properties. Normalization
is a process of decomposing the relations into relations with fewer
attributes.

What is Normalization?

o Normalization is the process of organizing the data in the


database.
o Normalization is used to minimize the redundancy from a
relation or set of relations. It is also used to eliminate
undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
o Normalization divides the larger table into smaller and links
them using relationships.
o The normal form is used to reduce redundancy from the
database table.
Why do we need Normalization?

The main reason for normalizing the relations is removing these


anomalies. Failure to eliminate anomalies leads to data redundancy
and can cause data integrity and other problems as the database
grows. Normalization consists of a series of guidelines that helps to
guide you in creating a good database structure.

Data modification anomalies can be categorized into three types:

o Insertion Anomaly: Insertion Anomaly refers to when one


cannot insert a new tuple into a relationship due to lack of data.
o Deletion Anomaly: The delete anomaly refers to the situation
where the deletion of data results in the unintended loss of some
other important data.
o Updatation Anomaly: The update anomaly is when an update
of a single data value requires multiple rows of data to be
updated.
Types of Normal Forms:
Normalization works through a series of stages called Normal
forms. The normal forms apply to individual relations. The relation is
said to be in particular normal form if it satisfies constraints.

Following are the various types of Normal forms:

Normal Form Description

A relation is in 1NF if it contains


1NF
an atomic value.

A relation will be in 2NF if it is


in 1NF and all non-key attributes
2NF
are fully functional dependent on
the primary key.

3NF A relation will be in 3NF if it is


in 2NF and no transition
dependency exists.

A stronger definition of 3NF is


BCNF known as Boyce Codd's normal
form.

A relation will be in 4NF if it is


4NF in Boyce Codd's normal form and
has no multi-valued dependency.

A relation is in 5NF. If it is in
4NF and does not contain any
5NF
join dependency, joining should
be lossless.

Advantages of Normalization

o Normalization helps to minimize data redundancy.


o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.
Disadvantages of Normalization

o You cannot start building the database before knowing what the
user needs.
o The performance degrades when normalizing the relations to
higher normal forms, i.e., 4NF, 5NF.
o It is very time-consuming and difficult to normalize relations of
a higher degree.
o Careless decomposition may lead to a bad database design,
leading to serious problems.

You might also like