UNIT-2 (Database Concept)
UNIT-2 (Database Concept)
(Unit-2 )
Prepared by:
Neeraj Sharma
(Assistant Professor, JLU-SOET)
1
UNIT - 2:
E-R Modeling
2
Entity in DBMS
An entity has some attributes which depict the entity's characteristics. For
example, an entity "Student" has attributes such as "Student_roll_no",
"Student_name", "Student_subject", and "Student_marks".
Example of Entity in DBMS in tabular form:
1 Robert English 85
2 Parker Mathematics 75
3 Harry Science 80
4 George Geography 70
3
Entity in DBMS
An entity is referred to as an object or thing that exists in the real world. For
example, customer, car, pen, etc.
Entities are stored in the database, and they should be distinguishable, i.e., they
should be easily identifiable from the group.
For example, a group of pens that are from the same company cannot be
identified, so they are only objects, but pens with different colors become unique
and will be called an entity like a red pen, green pen, blue pen, black pen, etc.
In a group of pens, we can easily identify any pen because of its different colors,
so a pen of different colors is an entity.
For extracting data from the database, each data must be unique in its own way
so that it becomes easier to differentiate between them. Distinct and unique data
is known as an entity.
4
Kind of Entity
There are two kinds of entities, which are as follows:
Tangible Entity:
It is an entity in DBMS, which is a physical object that we can touch
or see. In simple words, an entity that has a physical existence in the
real world is called a tangible entity.
For example, in a database, a table represents a tangible entity because
it contains a physical object that we can see and touch in the real
world. It includes colleges, bank lockers, mobiles, cars, watches, pens,
paintings, etc.
5
Types of Entity
Intangible Entity:
It is an entity in DBMS, which is a non-physical object that we cannot
see or touch. In simple words, an entity that does not have any
physical existence in the real world is known as an intangible entity.
For example, a bank account logically exists, but we cannot see or
touch it.
6
Entity Type
A collection of entities with general characteristics is known as an entity
type.
For example, a database of a corporate company has entity types such as
employees, departments, etc.
In DBMS, every entity type contains a set of attributes that explain the
entity.
The Employee entity type can have attributes such as name, age, address,
phone number, and salary.
The Department entity type can have attributes such as name, number,
and location in the department.
7
Kind of Entity Type
There are two kinds of entity type, which are as follows:
Strong Entity Type: It is an entity that has its own existence and is
independent.
The entity relationship diagram represents a strong entity type with the help of a
single rectangle. Below is the ERD of the strong entity type:
8
Kind of Entity Type
There are two kinds of entity type, which are as follows:
Strong Entity Type: It is an entity that has its own existence and is
independent.
The entity relationship diagram represents a strong entity type with the help of a
single rectangle. Below is the ERD of the strong entity type:
10
Kind of Entity Type
In the above example, "Address" is a weak entity type with attributes such
as House No., City, Location, and State.
The relationship between a strong and a weak entity type is known as an
identifying relationship.
Using a double diamond, the Entity-Relationship Diagram represents a
relationship between the strong and the weak entity type.
11
Kind of Entity Type
Let us see an example of the relationship between the Strong entity type
and weak entity type with the help of ER Diagram:
13
Strong Entity Set
In a DBMS, a strong entity set consists of a primary key.
For example, an entity of motorbikes with the attributes, motorbike's
registration number, motorbike's name, motorbike's model, and
motorbike's color.
Below is the representation of a strong entity set in tabular form:
14
Strong Entity Set
• Example of Entity Relationship Diagram representation of the above
strong entity set:
15
Weak Entity Set
In a DBMS, a weak entity set does not contain a primary key.
For example, An entity of smartphones with its attributes, phone's name,
phone's color, and phone's RAM.
Below is the representation of a weak entity set in tabular form:
16
Weak Entity Set
Example of Entity Relationship Diagram representation of the above
weak entity set:
17
Attributes in DBMS
In DBMS, we have entities, and each entity contains some property about
their behavior which is also called the attribute.
In relational databases, we have tables, and each column contains some
entity that has some attributes, so all the entries for that column should
strictly follow the attribute of the entity.
Entities define the characteristic property of the attributes.
18
Types of Attributes
Following is the Attribute of an Entity
Simple Attribute
Composite Attribute
Single-valued Attribute
Multi-valued Attribute
Derived Attribute
Complex Attribute
Key Attribute
19
Simple Attributes
It is also known as atomic attributes. When an attribute cannot be divided
further, then it is called a simple attribute.
20
Composite Attributes
Composite attributes are those that are made up of the composition of
more than one attribute. When any attribute can be divided further into
more sub-attributes, then that attribute is called a composite attribute.
21
Composite Attributes
Composite attributes are those that are made up of the composition of
more than one attribute. When any attribute can be divided further into
more sub-attributes, then that attribute is called a composite attribute.
22
Single-valued Attributes
Those attributes which can have exactly one value are known as single
valued attributes. They contain singular values, so more than one value is
not allowed.
23
Multi-valued Attributes
Those attributes which can have more than one entry or which contain
more than one value are called multi valued attributes.
In the Entity Relationship (ER) diagram, we represent the multi valued
attribute by double oval representation.
For example, one person can have more than one phone number, so that it
would be a multi valued attribute. Another example is the hobbies of a
person because one can have more than one hobby.
24
Derived Attributes
Derived attributes are also called stored attributes. When one attribute can
be derived from the other attribute, then it is called a derived attribute. We
can do some calculations on normal attributes and create derived
attributes.
For example, the age of a student can be a derived attribute because we
can get it by the DOB of the student.
Another example can be of working experience, which can be obtained by
the date of joining of an employee.
In the ER diagram, we represent the derived attributes by a dotted oval
shape.
25
Complex Attributes
If any attribute has the combining property of multi values and composite
attributes, then it is called a complex attribute. It means if one attribute is
made up of more than one attribute and each attribute can have more than
one value, then it is called a complex attribute.
For example, if a person has more than one office and each office has an
address made from a street number and city. So the address is a composite
attribute, and offices are multi valued attributes, So combing them is
called complex attributes.
26
Key Attributes
Those attributes which can be identified uniquely in the relational table
are called key attributes.
For example, a student is a unique attribute.
27
Example
We can understand the attributes by the following example:
28
Example
In the above example, we have an ER diagram of a table named
Employee. We have a lot of attributes from the above table.
Department is a single valued attribute that can have only one value.
Name is a composite attribute because it is made up of a first name
and the last name as the middle name attribute.
Work Experience attribute is a derived attribute, and it is represented
by a dotted oval. We can get the work experience by the other attribute
date of joining.
Phone number is a multi-valued attribute because one employee can
have more than one phone number, which is represented by a double
oval representation.
29
Keys in DBMS
Keys play an important role in the relational database.
It is used to uniquely identify any record or row of data from the table. It
is also used to establish and identify relationships between tables.
For example, ID is used as a key in the Student table because it is unique
for each student. In the PERSON table, passport_number,
license_number, SSN are keys since they are unique for each person.
30
Types of Keys
31
Primary Key
It is the first key used to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON
table. The key which is most suitable from those lists becomes a primary
key.
In the EMPLOYEE table, ID can be the primary key since it is unique for
each employee. In the EMPLOYEE table, we can even select
License_Number and Passport_Number as primary keys since they are
also unique.
For each entity, the primary key selection is based on requirements and
developers.
32
Primary Key
33
Candidate Key
A candidate key is an attribute or set of attributes that can uniquely
identify a tuple.
Except for the primary key, the remaining attributes are considered a
candidate key. The candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary
key. The rest of the attributes, like SSN, Passport_Number,
License_Number, etc., are considered a candidate key.
34
Candidate Key
35
Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key
is a superset of a candidate key.
Foreign keys are the column of the table used to point to the primary key
of another table.
Every employee works in a specific department in a company, and
employee and department are two different entities. So we can't store the
department's information in the employee table. That's why we link these
two tables through the primary key of one table.
We add the primary key of the DEPARTMENT table, Department_Id, as a
new attribute in the EMPLOYEE table.
In the EMPLOYEE table, Department_Id is the foreign key, and both the
tables are related.
37
Foreign Key
38
Alternate Key
There may be one or more attributes or a combination of attributes that
uniquely identify each tuple in a relation. These attributes or combinations
of the attributes are called the candidate keys. One key is chosen as the
primary key from these candidate keys, and the remaining candidate key,
if it exists, is termed the alternate key.
In other words, the total number of the alternate keys is the total number
of candidate keys minus the primary key. The alternate key may or may
not exist. If there is only one candidate key in a relation, it does not have
an alternate key.
For example, employee relation has two attributes, Employee_Id and
PAN_No, that act as candidate keys. In this relation, Employee_Id is
chosen as the primary key, so the other candidate key, PAN_No, acts as
the Alternate key. 39
Alternate Key
40
Composite Key
Whenever a primary key consists of more than one attribute, it is known
as a composite key. This key is also known as Concatenated Key.
For example, in employee relations, we assume that an employee may be
assigned multiple roles, and an employee may work on multiple projects
simultaneously. So the primary key will be composed of all three
attributes, namely Emp_ID, Emp_role, and Proj_ID in combination. So
these attributes act as a composite key since the primary key comprises
more than one attribute.
41
Artificial Key
The key created using arbitrarily assigned data are known as artificial
keys. These keys are created when a primary key is large and complex
and has no relationship with many other relations. The data values of the
artificial keys are usually numbered in a serial order.
For example, the primary key, which is composed of Emp_ID, Emp_role,
and Proj_ID, is large in employee relations. So it would be better to add a
new virtual attribute to identify each tuple in the relation uniquely.
42
Types of Relationship in DBMS
A relational database collects different types of data sets that use tables,
records, and columns. It is used to create a well-defined relationship
between database tables so that relational databases can be easily stored.
For example of relational databases such as Microsoft SQL Server, Oracle
Database, MYSQL, etc.
Following are the different types of relational database tables.
43
One to One Relationship (1:1)
It is used to create a relationship between two tables in which a single row
of the first table can only be related to one and only one records of a
second table. Similarly, the row of a second table can also be related to
anyone row of the first table.
Following is the example to show a relational database, as shown below.
44
One to Many Relationship
It is used to create a relationship between two tables. Any single rows of
the first table can be related to one or more rows of the second tables, but
the rows of second tables can only relate to the only row in the first table.
It is also known as a many to one relationship.
Representation of One to Many relational databases:
45
Many to One Relationship
Representation of Many to one relational databases:
46
Many to Many Relationship
Many to Many Relationship: It is many to many relationships that
create a relationship between two tables. Each record of the first table can
relate to any records (or no records) in the second table. Similarly, each
record of the second table can also relate to more than one record of the
first table. It is also represented an N:N relationship.
For example, there are many people involved in each project, and every
person can involve more than one project.
47
Many to Many Relationship
Many to Many Relationship: It is many to many relationships that
create a relationship between two tables. Each record of the first table can
relate to any records (or no records) in the second table. Similarly, each
record of the second table can also relate to more than one record of the
first table. It is also represented an N:N relationship.
For example, there are many people involved in each project, and every
person can involve more than one project.
48
Structural Constraints
To understand Structural Constraints, we must take a look at Cardinality
Ratios and Participation Constraints.
Relationships : The entities are denoted by rectangle and relationships by
diamond.
50
Structural Constraints
Participation Constraints : Participation Constraints tell us that the
participation in a relationship can either be total or partial.
52
Properties Relational
Name of the relation is distinct from all other relations.
Each relation cell contains exactly one atomic (single) value
Each attribute contains a distinct name
Attribute domain has no significance
tuple has no duplicate value
Order of tuple can have a different sequence
53
Relational Constraints
Relational constraints are the restrictions imposed on the database
contents and operations.
They ensure the correctness of data in the database.
54
Types of Relational Constraints
In DBMS, there are following 5 different types of relational constraints-
55
Domain Constraints
Domain constraint defines the domain or set of values for an attribute.
It specifies that the value taken by the attribute must be the atomic value
from its domain.
Example-
STU_ID Name Age
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul A
Here, value ‘A’ is not allowed since only integer values can be taken by
the age attribute.
56
Tuple Uniqueness Constraints
Tuple Uniqueness constraint specifies that all the tuples must be
necessarily unique in any relation.
Example- STU_ID Name Age
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation satisfies the tuple uniqueness constraint since here all the
tuples are unique.
57
Key Constraints
Key constraint specifies that in any relation-
All the values of primary key must be unique.
The value of primary key must not be null.
Example-
STU_ID Name Age
S001 Akshay 20
S001 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the key constraint as here all the values of
primary key are not unique.
58
Entity Integrity Constraints
Entity integrity constraint specifies that no attribute of primary key must
contain a null value in any relation.
This is because the presence of null value in the primary key violates the
uniqueness property.
Example- STU_ID Name Age
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
Rahul 20
This relation does not satisfy the entity integrity constraint as here the
primary key contains a NULL value.
59
Referential Integrity Constraints
This constraint is enforced when a foreign key references the primary key
of a relation.
It specifies that all the values taken by the foreign key must either be
available in the relation of the primary key or be null.
Important Results-
The following two important results emerges out due to referential
integrity constraint-
We can not insert a record into a referencing relation if the
corresponding record does not exist in the referenced relation.
We can not delete or update a record of the referenced relation if the
corresponding record exists in the referencing relation.
60
Referential Integrity Constraints
Example:
Dept_no Dept_name
STU_ID Name Dept_no
D10 ASET
S001 Akshay D10
D11 ALS
S002 Abhishek D10
D12 ASFL
S003 Shashank D11
D13 ASHS
S004 Rahul D14
61
Referential Integrity Constraints
The relation ‘Student’ does not satisfy the referential integrity constraint.
This is because in relation ‘Department’, no value of primary key
specifies department no. 14.
Thus, referential integrity constraint is violated.
62
Relation Schema in DBMS
The relations between them is represented through arrows in the
following
63
Enhanced ER Model
It is getting harder and harder to apply the conventional ER paradigm for
database modeling as data complexity rises today. The existing ER model
needs to be enhanced or improved in order for it to better handle the
complicated application in order to reduce the modeling complexity.
The requirements and complexity of complicated databases are
represented using enhanced entity-relationship diagrams, which are
sophisticated database diagrams very similar to standard ER diagrams.
The SubClass and SuperClass, Specialization and Generalization, Union
or Category, Aggregation, etc., are displayed using this diagrammatic
style.
64
Generalization
It works on the principle of bottom up approach. In Generalization lower
level functions are combined to form higher level function which is called
as entities. This process is repeated further to make advanced level
entities.
In the Generalization process properties are drawn from particular entities
and thus we can create generalized entity. We can summarize
Generalization process as it combines subclasses to form superclass.
Example of Generalization –
Consider two entities Student and Patient. These two entities will have
some characteristics of their own. For example Student entity will have
Roll_No, Name and Mob_No while patient will have PId, Name and
Mob_No characteristics. Now in this example Name and Mob_No of both
Student and Patient can be combined as a Person to form one higher level
entity and this process is called as Generalization Process.
65
Generalization
66
Specialization
We can say that Specialization is opposite of Generalization. In
Specialization things are broken down into smaller things to simplify it
further. We can also say that in Specialization a particular entity gets
divided into sub entities and it’s done on the basis of it’s characteristics.
Also in Specialization Inheritance takes place.
Example of Specialization –
Consider an entity Account. This will have some attributes consider them
Acc_No and Balance. Account entity may have some other attributes like
Current_Acc and Savings_Acc. Now Current_Acc may have Acc_No,
Balance and Transactions while Savings_Acc may have Acc_No, Balance
and Interest_Rate henceforth we can say that specialized entities inherits
characteristics of higher level entity.
67
Specialization
68
Difference between Generalization and
Specialization :
GENERALIZATION SPECIALIZATION
Generalization process starts with the number Specialization process starts from a single
of entity sets and it creates high-level entity entity set and it creates a different entity set
with the help of some common features. by using some different features.
71
Selection Operation
The select operation selects tuples that satisfy a given predicate.
It is denoted by sigma (σ).
Notation: σ p(r)
Where:
σ is used for selection prediction
r is used for relation
p is used as a propositional logic formula which may use connectors like:
AND OR and NOT. These relational can use as relational operators like
=, ≠, ≥, <, >, ≤.
72
Selection Operation
For example: LOAN Relation
73
Selection Operation
Input
σ BRANCH_NAME="perryride" (LOAN)
Output
74
Project Operation
This operation shows the list of those attributes that we wish to appear in
the result. Rest of the attributes are eliminated from the table.
It is denoted by ∏.
Notation: ∏ A1, A2, An (r)
Where
A1, A2, A3 is used as an attribute name of relation r.
75
Project Operation
Example: CUSTOMER RELATION
76
Project Operation
Input:
∏ NAME, CITY (CUSTOMER)
Output:
NAME CITY
Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
77
Union Operation
Suppose there are two tuples R and S. The union operation contains all
the tuples that are either in R or S or both in R & S.
It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must hold the following condition:
78
Union Operation
Example:
DEPOSITOR RELATION BORROW RELATION
CUSTOMER_NA ACCOUNT_NO CUSTOMER_N LOAN_NO
ME AME
Johnson A-101 Jones L-17
Smith A-121 Smith L-23
Mayes A-321 Hayes L-15
Turner A-176 Jackson L-14
Johnson A-273 Curry L-93
Jones A-472 Smith L-11
Lindsay A-284 Williams L-17
79
Union Operation
Input:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
CUSTOMER_NAME
Output:
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes 80
Set Intersection
Suppose there are two tuples R and S. The set intersection operation
contains all tuples that are in both R & S.
It is denoted by intersection ∩.
Notation: R ∩ S
Example: Using the above DEPOSITOR table and BORROW table.
Input:∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (D
EPOSITOR)
Output:
CUSTOMER_NAME
Smith
Jones
81
Set Difference
Suppose there are two tuples R and S. The set intersection operation
contains all tuples that are in R but not in S.
It is denoted by intersection minus (-).
Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) -∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Jackson
Hayes
Willians
Curry 82
Cartesian product
The Cartesian product is used to combine each row in one table with each
row in the other table. It is also known as a cross product.
It is denoted by X.
Notation: E X D
Example:
EMPLOYEE DEPARTMENT
EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME
1 Smith A A Marketing
2 Harry C B Sales
3 John B C Legal
83
Cartesian product
Input:
EMPLOYEE X DEPARTMENT
Output: EMP_ID EMP_NA EMP_DEP DEPT_NO DEPT_NA
ME T ME
1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal 84
Rename Operation
The rename operation is used to rename the output relation. It is denoted
by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation
to STUDENT1.
ρ(STUDENT1, STUDENT)
85
Join Operation
A Join operation combines related tuples from different relations, if and
only if a given join condition is satisfied. It is denoted by ⋈.
Example: EMPLOYEE SALARY
86
Operation: (EMPLOYEE ⋈ SALARY)
87
Theta Join:
• The general case of JOIN operation is called a Theta join. It is denoted by symbol θ
• A ⋈θ B
• Theta join can use any conditions in the selection criteria
• Example: A ⋈ A.column 2 > B.column 2 (B)
88
Natural Join:
• A natural join is the set of tuples of all combinations in R and S that are equal on their
common attribute names.
• Natural join can only be performed if there is a common attribute (column) between
the relations. The name and type of the attribute must be same.
• It is denoted by ⋈.
Example: Let's use the above EMPLOYEE table and SALARY table:
∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)
Output:
EMP_NAME SALARY
Stephan 50000
Jack 30000
Harry 25000
89
Equi Join:
It is also known as an inner join. It is the most common join. It is
based on matched data as per the equality condition. The equi join
uses the comparison operator(=)
Example:
1 John 1 Delhi
2 Harry 2 Mumbai
3 Jackson 3 Noida
90
Equi Join:
Input: CUSTOMER ⋈CLASS_ID=PRODUCT_ID PRODUCT
Output:
1 John 1 Delhi
2 Harry 2 Mumbai
3 Jackson 3 Noida
91
Outer Join:
• The outer join operation is an extension of the join operation. It is used
to deal with missing information.
• Example:
EMPLOYEE FACT_WORKERS
92
Outer Join:
Input: (EMPLOYEE ⋈ FACT_WORKERS)
Output:
93
Left Outer Join:
Input: EMPLOYEE ⟕ FACT_WORKERS
Output:
94
Right Outer Join:
Input: EMPLOYEE ⟖ FACT_WORKERS
Output:
95
Full Outer Join:
Input: EMPLOYEE ⟗ FACT_WORKERS
Output:
96
Examples:
97
98
99
100
101
102
103