0% found this document useful (0 votes)
24 views25 pages

Unit I

Uploaded by

maskon.alien
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
24 views25 pages

Unit I

Uploaded by

maskon.alien
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 25

UNIT I DBMS

DATABASE MANAGEMENT SYSTEMS

Unit I

Introduction: Introduction to Database Systems: Overview – Data Models – Database


System Architecture – History of Database Systems. Entity-Relationship Model: Basic
Concepts – Constraints – Keys – Design Issues – Entity Relationship Diagram – Weak
Entity Sets – Extended E-R Features – Design of an E-R Database Schema

Database Management Systems


1. A database management system (DBMS), or simply a database system (DBS),
consists of
o A collection of interrelated and persistent data (usually referred to as the
database (DB)).
o A set of application programs used to access, update and manage that data
(which form the data management system (MS)).
2. The goal of a DBMS is to provide an environment that is both convenient and
efficient to use in
o Retrieving information from the database.
o Storing information into the database.
3. Databases are usually designed to manage large bodies of information. This
involves
o Definition of structures for information storage (data modeling).
o Provision of mechanisms for the manipulation of information (file and
systems structure, query processing).
o Providing for the safety of information in the database (crash recovery and
security).
o Concurrency control if the system is shared by users.

Purpose of Database Systems

1. To see why database management systems are necessary, let's look at a typical
``file-processing system'' supported by a conventional operating system.

The application is a savings bank:

o Savings account and customer records are kept in permanent system files.
o Application programs are written to manipulate files to perform the
following tasks:
 Debit or credit an account.
 Add a new account.
 Find an account balance.
 Generate monthly statements.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 1


UNIT I DBMS

2. Development of the system proceeds as follows:


o New application programs must be written as the need arises.
o New permanent files are created as required.
o but over a long period of time files may be in different formats, and
o Application programs may be in different languages.
3. So we can see there are problems with the straight file-processing approach:
o Data redundancy and inconsistency
 Same information may be duplicated in several places.
 All copies may not be updated properly.
o Difficulty in accessing data
 May have to write a new application program to satisfy an unusual
request.
 E.g. find all customers with the same postal code.
 Could generate this data manually, but a long job...
o Data isolation
 Data in different files.
 Data in different formats.
 Difficult to write new application programs.
o Multiple users
 Want concurrency for faster response time.
 Need protection for concurrent updates.
 E.g. two customers withdrawing funds from the same account at
the same time - account has $500 in it, and they withdraw $100
and $50. The result could be $350, $400 or $450 if no protection.
o Security problems
 Every user of the system should be able to access only the data
they are permitted to see.
 E.g. payroll people only handle employee records, and cannot see
customer accounts; tellers only access account data and cannot see
payroll data.
 Difficult to enforce this with application programs.
o Integrity problems
 Data may be required to satisfy constraints.
 E.g. no account balance below $25.00.
 Again, difficult to enforce or to change constraints with the file-
processing approach.

These problems and others led to the development of database management


systems.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 2


UNIT I DBMS

Data Abstraction

1. The major purpose of a database system is to provide users with an abstract view
of the system.

The system hides certain details of how data is stored and created and maintained

Complexity should be hidden from database users.

2. There are several levels of abstraction:


1. Physical Level:
 How the data are stored.
 E.g. index, B-tree, hashing.
 Lowest level of abstraction.
 Complex low-level structures described in detail.
2. Conceptual Level:
 Next highest level of abstraction.
 Describes what data are stored.
 Describes the relationships among data.
 Database administrator level.
3. View Level:
 Highest level.
 Describes part of the database for a particular group of users.
 Can be many different views of a database.
 E.g. tellers in a bank get a view of customer accounts, but not of
payroll data.

Fig. 1.1 (figure 1.1 in the text) illustrates the three levels.

Figure 1.1: The three levels of data abstraction

DATA MODELS
Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 3
UNIT I DBMS

1. Data models are a collection of conceptual tools for describing data, data
relationships, data semantics and data constraints. There are three different
groups:
1. Object-based Logical Models.
2. Record-based Logical Models.
3. Physical Data Models.

We'll look at them in more detail now.

Object-based Logical Models

1. Object-based logical models:


o Describe data at the conceptual and view levels.
o Provide fairly flexible structuring capabilities.
o Allow one to specify data constraints explicitly.
o Over 30 such models, including
 Entity-relationship model.
 Object-oriented model.
 Binary model.
 Semantic data model.
 Infological model.
 Functional data model.
2. At this point, we'll take a closer look at the entity-relationship (E-R) and object-
oriented models.

The E-R Model

1. The entity-relationship model is based on a perception of the world as consisting


of a collection of basic objects (entities) and relationships among these objects.
o An entity is a distinguishable object that exists.
o Each entity has associated with it a set of attributes describing it.
o E.g. number and balance for an account entity.
o A relationship is an association among several entities.
o e.g. A cust_acct relationship associates a customer with each account he
or she has.
o The set of all entities or relationships of the same type is called the entity
set or relationship set.
o Another essential element of the E-R diagram is the mapping
cardinalities, which express the number of entities to which another entity
can be associated via a relationship set.

We'll see later how well this model works to describe real world situations.

2. The overall logical structure of a database can be expressed graphically by an E-R


diagram:
o rectangles: represent entity sets.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 4


UNIT I DBMS

o ellipses: represent attributes.


o diamonds: represent relationships among entity sets.
o lines: link attributes to entity sets and entity sets to relationships.

See figure 1.2 for an example.

Figure 1.2: A sample E-R diagram.

The Object-Oriented Model

1. The object-oriented model is based on a collection of objects, like the E-R model.
o An object contains values stored in instance variables within the object.
o Unlike the record-oriented models, these values are themselves objects.
o Thus objects contain objects to an arbitrarily deep level of nesting.
o An object also contains bodies of code that operate on the the object.
o These bodies of code are called methods.
o Objects that contain the same types of values and the same methods are
grouped into classes.
o A class may be viewed as a type definition for objects.
o Analogy: the programming language concept of an abstract data type.
o The only way in which one object can access the data of another object is
by invoking the method of that other object.
o This is called sending a message to the object.
o Internal parts of the object, the instance variables and method code, are not
visible externally.
o Result is two levels of data abstraction.

For example, consider an object representing a bank account.


o The object contains instance variables number and balance.
o The object contains a method pay-interest which adds interest to the
balance.
o Under most data models, changing the interest rate entails changing code
in application programs.
o In the object-oriented model, this only entails a change within the pay-
interest method.
2. Unlike entities in the E-R model, each object has its own unique identity,
independent of the values it contains:
o Two objects containing the same values are distinct.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 5


UNIT I DBMS

o Distinction is created and maintained in physical level by assigning


distinct object identifiers.

Record-based Logical Models

1. Record-based logical models:


o Also describe data at the conceptual and view levels.
o Unlike object-oriented models, are used to
 Specify overall logical structure of the database, and
 Provide a higher-level description of the implementation.
o Named so because the database is structured in fixed-format records of
several types.
o Each record type defines a fixed number of fields, or attributes.
o Each field is usually of a fixed length (this simplifies the implementation).
o Record-based models do not include a mechanism for direct representation
of code in the database.
o Separate languages associated with the model are used to express database
queries and updates.
o The three most widely-accepted models are the relational, network, and
hierarchical.
o This course will concentrate on the relational model.
o The network and hierarchical models are covered in appendices in the
text.

The Relational Model

 Data and relationships are represented by a collection of tables.


 Each table has a number of columns with unique names, e.g. customer, account.
 Figure 1.3 shows a sample relational database.

Figure 1.3: A sample relational database.

The Network Model

 Data are represented by collections of records.


 Relationships among data are represented by links.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 6


UNIT I DBMS

 Organization is that of an arbitrary graph.


 Figure 1.4 shows a sample network database that is the equivalent of the relational
database of Figure 1.3.

Figure 1.4: A sample network database

The Hierarchical Model

 Similar to the network model.


 Organization of the records is as a collection of trees, rather than arbitrary graphs.
 Figure 1.5 shows a sample hierarchical database that is the equivalent of the
relational database of Figure 1.3.

Figure 1.5: A sample hierarchical database

The relational model does not use pointers or links, but relates records by the values they
contain. This allows a formal mathematical foundation to be defined.

Physical Data Models

1. Are used to describe data at the lowest level.


2. Very few models, e.g.
o Unifying model.
o Frame memory.
3. We will not cover physical models.

Instances and Schemes

1. Databases change over time.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 7


UNIT I DBMS

2. The information in a database at a particular point in time is called an instance of


the database.
3. The overall design of the database is called the database scheme.
4. Analogy with programming languages:
o Data type definition - scheme
o Value of a variable - instance
5. There are several schemes, corresponding to levels of abstraction:
o Physical scheme
o Conceptual scheme
o Subscheme (can be many)

Data Definition Language (DDL)

1. Used to specify a database scheme as a set of definitions expressed in a DDL


2. DDL statements are compiled, resulting in a set of tables stored in a special file
called a data dictionary or data directory.
3. The data directory contains metadata (data about data)
4. The storage structure and access methods used by the database system are
specified by a set of definitions in a special type of DDL called a data storage
and definition language
5. basic idea: hide implementation details of the database schemes from the users

Data Manipulation Language (DML)

1. Data Manipulation is:


o retrieval of information from the database
o insertion of new information into the database
o deletion of information in the database
o modification of information in the database
2. A DML is a language which enables users to access and manipulate data.

The goal is to provide efficient human interaction with the system.

3. There are two types of DML:


o procedural: the user specifies what data is needed and how to get it
o nonprocedural: the user only specifies what data is needed
 Easier for user
 May not generate code as efficient as that produced by procedural
languages
4. A query language is a portion of a DML involving information retrieval only.
The terms DML and query language are often used synonymously.

Database Manager

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 8


UNIT I DBMS

1. The database manager is a program module which provides the interface


between the low-level data stored in the database and the application programs
and queries submitted to the system.
2. Databases typically require lots of storage space (gigabytes). This must be stored
on disks. Data is moved between disk and main memory (MM) as needed.
3. The goal of the database system is to simplify and facilitate access to data.
Performance is important. Views provide simplification.
4. So the database manager module is responsible for
o Interaction with the file manager: Storing raw data on disk using the file
system usually provided by a conventional operating system. The database
manager must translate DML statements into low-level file system
commands (for storing, retrieving and updating data in the database).
o Integrity enforcement: Checking that updates in the database do not
violate consistency constraints (e.g. no bank account balance below $25)
o Security enforcement: Ensuring that users only have access to
information they are permitted to see
o Backup and recovery: Detecting failures due to power failure, disk crash,
software errors, etc., and restoring the database to its state before the
failure
o Concurrency control: Preserving data consistency when there are
concurrent users.
5. Some small database systems may miss some of these features, resulting in
simpler database managers. (For example, no concurrency is required on a PC
running MS-DOS.) These features are necessary on larger systems.

Database Administrator

1. The database administrator is a person having central control over data and
programs accessing that data. Duties of the database administrator include:
o Scheme definition: the creation of the original database scheme. This
involves writing a set of definitions in a DDL (data storage and definition
language), compiled by the DDL compiler into a set of tables stored in the
data dictionary.
o Storage structure and access method definition: writing a set of
definitions translated by the data storage and definition language compiler
o Scheme and physical organization modification: writing a set of
definitions used by the DDL compiler to generate modifications to
appropriate internal system tables (e.g. data dictionary). This is done
rarely, but sometimes the database scheme or physical organization must
be modified.
o Granting of authorization for data access: granting different types of
authorization for data access to various users
o Integrity constraint specification: generating integrity constraints. These
are consulted by the database manager module whenever updates occur.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 9


UNIT I DBMS

Database Users

1. The database users fall into several categories:


o Application programmers are computer professionals interacting with
the system through DML calls embedded in a program written in a host
language (e.g. C, PL/1, Pascal).
 These programs are called application programs.
 The DML precompiler converts DML calls (prefaced by a special
character like $, #, etc.) to normal procedure calls in a host
language.
 The host language compiler then generates the object code.
 Some special types of programming languages combine Pascal-
like control structures with control structures for the manipulation
of a database.
 These are sometimes called fourth-generation languages.
 They often include features to help generate forms and display
data.
o Sophisticated users interact with the system without writing programs.
 They form requests by writing queries in a database query
language.
 These are submitted to a query processor that breaks a DML
statement down into instructions for the database manager module.
o Specialized users are sophisticated users writing special database
application programs. These may be CADD systems, knowledge-based
and expert systems, complex data systems (audio/video), etc.
o Naive users are unsophisticated users who interact with the system by
using permanent application programs (e.g. automated teller machine).

OVERALL SYSTEM STRUCTURE

1. Database systems are partitioned into modules for different functions. Some
functions (e.g. file systems) may be provided by the operating system.
2. Components include:
o File manager manages allocation of disk space and data structures used to
represent information on disk.
o Database manager: The interface between low-level data and application
programs and queries.
o Query processor translates statements in a query language into low-level
instructions the database manager understands. (May also attempt to find
an equivalent but more efficient form.)
o DML precompiler converts DML statements embedded in an application
program to normal procedure calls in a host language. The precompiler
interacts with the query processor.
o DDL compiler converts DDL statements to a set of tables containing
metadata stored in a data dictionary.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 10


UNIT I DBMS

In addition, several data structures are required for physical system


implementation:

oData files: store the database itself.


oData dictionary: stores information about the structure of the database. It
is used heavily. Great emphasis should be placed on developing a good
design and efficient implementation of the dictionary.
o Indices: provide fast access to data items holding particular values.
3. Figure 1.6 shows these components.

Figure 1.6: Database system structure.

HISTORY OF DATABASE SYSTEMS


Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 11
UNIT I DBMS

The term Database, in these days are generally used with respect to Computer Science,
but they did exist earlier, in physical forms like files (real ones) and tabbed folders stored
in cabinets of a warehouse like structure. This article though would be looking at the
evolution of Database in terms of Computer Science.
There are various sources to this article and mention of all of them would take up a lot of
space. Moreover, most of the information is known publicly with in the circles of the
technically adept.
Database, in simple terms, can be stated to be storage of information in a systematic
manner – usually records, from which data can be easily (relatively speaking) retrieved or
to which modifications can be made or new data can be added. As stated earlier, the basic
concept of database did exist earlier to the advent of computers as files and folders. In
that view, even the small address/telephone book we keep with ourselves can be
categorized as a database.

Evolution
With the introduction of Computers, database picked up a new meaning and being the
data processors computers are, it was a natural process to create database systems in
them. As the computer technology advanced, so did the databases that relied on them
advanced. It went from data being internally structured in programs to the RDBMS
(Relational Database Management Systems) and more complex database systems.

FileBasedSystems
It saw its implementation at the end of the 60s, when the computer technology had taken
its first few steps. The traditional or rather archaic database system was the use of files to
store the data in a rather two dimensional, tabular structure. The programs that would
process these data had to know the actual relation between the different parts of data
before hand. The relation had to be coded into the programs. It required tight integration
with the programs, like the use of ‘Data Divisions’ in COBOL or ‘struct’ in C.
The limitation to this system was that it lacked the scalability – once implemented, to add
new features or schema of data, coding had to be done again. Also, if there required a
new exchange of data between programs, one had to again get down to coding the
migration of the data. There were no standards.

HierarchicalDatabaseSystems
This model of database evolved some what in parallel to the File Based Systems, and
could be found well into the 80s. The pattern of data stored was based on the binary trees
– representing the one-to-many relationship of data. It did surpass the limitation of the
File Based System in terms of data independence, but was found to be real hard to
implement.

NetworkDatabaseSystem
This system, evolved a bit later the previous two, according to CODASYL (Conference
on Data Systems Languages) was composed of records and sets. The records held the
actual data, and the sets defined the relationship between the data separately. This system

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 12


UNIT I DBMS

gave good performance in terms of retrieval of data, but was hard to implement and
reorganize.

RelationalDatabaseManagementSystem(RelationalDBMS)
This is the modern day database system in wide use presently that seems to have
overcome most of the limitations of its predecessors. Very basically, RDBMS, data is
stored as tables, and the tables may have relation between them. The data is organized in
any way the software design of the particular DBMS wants, but it has to provide access
to the database through a standardized instruction set called Structured Query Language
(SQL).
When a particular program wants to interact with a database, it sends a request to the
DBMS, using the SQL syntax. The DBMS then responds by providing a result set back to
the calling program.
The DBMS allows the design of a particular system to keep the program (logic) and the
database separate. This allows for exchange of data between programs with out the need
for programming/designing each and every interaction.
There are many software vendors that provide Relational DBMS – Oracle, MySQL,
Sybase, PostgreSQL, MSSQL, SQL-Server. Note that most of these systems have
evolved beyond the basic feature set of a RDBMS.

THE ENTITY-RELATIONSHIP MODEL

The E-R (entity-relationship) data model views the real world as a set of basic
objects (entities) and relationships among these objects.
It is intended primarily for the DB design process by allowing the specification of
an enterprise scheme. This represents the overall logical structure of the DB.

Entities and Entity Sets

 An entity is an object that exists and is distinguishable from other objects. For
instance, John Harris with S.I.N. 890-12-3456 is an entity, as he can be uniquely
identified as one particular person in the universe.
 An entity may be concrete (a person or a book, for example) or abstract (like a
holiday or a concept).
 An entity set is a set of entities of the same type (e.g., all persons having an
account at a bank).
 Entity sets need not be disjoint. For example, the entity set employee (all
employees of a bank) and the entity set customer (all customers of the bank) may
have members in common.
 An entity is represented by a set of attributes.
o E.g. name, S.I.N., street, city for ``customer'' entity.
o The domain of the attribute is the set of permitted values (e.g. the
telephone number must be seven positive integers).
 Formally, an attribute is a function which maps an entity set into a domain.
o Every entity is described by a set of (attribute, data value) pairs.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 13


UNIT I DBMS

o There is one pair for each attribute of the entity set.


o E.g. a particular customer entity is described by the set {(name, Harris),
(S.I.N., 890-123-456), (street, North), (city, Georgetown)}.

An analogy can be made with the programming language notion of type definition.
 The concept of an entity set corresponds to the programming language type
definition.
 A variable of a given type has a particular value at a point in time.
 Thus, a programming language variable corresponds to an entity in the E-R
model.

Figure 2-1 shows two entity sets.


We will be dealing with five entity sets in this section:
 branch, the set of all branches of a particular bank. Each branch is described by
the attributes branch-name, branch-city and assets.
 customer, the set of all people having an account at the bank. Attributes are
customer-name, S.I.N., street and customer-city.
 employee, with attributes employee-name and phone-number.
 account, the set of all accounts created and maintained in the bank. Attributes are
account-number and balance.
 transaction, the set of all account transactions executed in the bank. Attributes are
transaction-number, date and amount.

Relationships & Relationship Sets

A relationship is an association between several entities.


A relationship set is a set of relationships of the same type.
Formally it is a mathematical relation on (possibly non-distinct) sets.
If are entity sets, then a relationship set R is a subset of

where is a relationship.
For example, consider the two entity sets customer and account. (Fig. 2.1 in the text). We
define the relationship CustAcct to denote the association between customers and their
accounts. This is a binary relationship set (see Figure 2.2 in the text).
Going back to our formal definition, the relationship set CustAcct is a subset of all the
possible customer and account pairings.
This is a binary relationship. Occasionally there are relationships involving more than
two entity sets.
The role of an entity is the function it plays in a relationship. For example, the
relationship works-for could be ordered pairs of employee entities. The first employee
takes the role of manager, and the second one will take the role of worker.
A relationship may also have descriptive attributes. For example, date (last date of
account access) could be an attribute of the CustAcct relationship set.

Attributes

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 14


UNIT I DBMS

It is possible to define a set of entities and the relationships among them in a number of
different ways. The main difference is in how we deal with attributes.
 Consider the entity set employee with attributes employee-name and phone-
number.
 We could argue that the phone be treated as an entity itself, with attributes phone-
number and location.
 Then we have two entity sets, and the relationship set EmpPhn defining the
association between employees and their phones.
 This new definition allows employees to have several (or zero) phones.
 New definition may more accurately reflect the real world.
 We cannot extend this argument easily to making employee-name an entity.

The question of what constitutes an entity and what constitutes an attribute depends
mainly on the structure of the real world situation being modeled, and the semantics
associated with the attribute in question.

CONSTRAINTS

Mapping Constraints

An E-R scheme may define certain constraints to which the contents of a database must
conform.
 Mapping Cardinalities: express the number of entities to which another entity
can be associated via a relationship. For binary relationship sets between entity
sets A and B, the mapping cardinality must be one of:
1. One-to-one: An entity in A is associated with at most one entity in B, and
an entity in B is associated with at most one entity in A. (Figure 2.3)
2. One-to-many: An entity in A is associated with any number in B. An
entity in B is associated with at most one entity in A. (Figure 2.4)
3. Many-to-one: An entity in A is associated with at most one entity in B.
An entity in B is associated with any number in A. (Figure 2.5)
4. Many-to-many: Entities in A and B are associated with any number from
each other. (Figure 2.6)

The appropriate mapping cardinality for a particular relationship set depends on


the real world being modeled. (Think about the CustAcct relationship...)

 Existence Dependencies: if the existence of entity X depends on the existence of


entity Y, then X is said to be existence dependent on Y. (Or we say that Y is the
dominant entity and X is the subordinate entity.)

For example,

o Consider account and transaction entity sets, and a relationship log


between them.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 15


UNIT I DBMS

o This is one-to-many from account to transaction.


o If an account entity is deleted, its associated transaction entities must also
be deleted.
o Thus account is dominant and transaction is subordinate.

KEYS

Differences between entities must be expressed in terms of attributes.


 A superkey is a set of one or more attributes which, taken collectively, allow us
to identify uniquely an entity in the entity set.
 For example, in the entity set customer, customer-name and S.I.N. is a superkey.
 Note that customer-name alone is not, as two customers could have the same
name.
 A superkey may contain extraneous attributes, and we are often interested in the
smallest superkey. A superkey for which no subset is a superkey is called a
candidate key.
 In the example above, S.I.N. is a candidate key, as it is minimal, and uniquely
identifies a customer entity.
 A primary key is a candidate key (there may be more than one) chosen by the
DB designer to identify entities in an entity set.

An entity set that does not possess sufficient attributes to form a primary key is called a
weak entity set. One that does have a primary key is called a strong entity set.
For example,
 The entity set transaction has attributes transaction-number, date and amount.
 Different transactions on different accounts could share the same number.
 These are not sufficient to form a primary key (uniquely identify a transaction).
 Thus transaction is a weak entity set.

For a weak entity set to be meaningful, it must be part of a one-to-many relationship set.
This relationship set should have no descriptive attributes. (Why?)
The idea of strong and weak entity sets is related to the existence dependencies seen
earlier.
 Member of a strong entity set is a dominant entity.
 Member of a weak entity set is a subordinate entity.

A weak entity set does not have a primary key, but we need a means of distinguishing
among the entities.
The discriminator of a weak entity set is a set of attributes that allows this distinction to
be made.
The primary key of a weak entity set is formed by taking the primary key of the strong
entity set on which its existence depends (see Mapping Constraints) plus its
discriminator.
To illustrate:
 transaction is a weak entity. It is existence-dependent on account.
 The primary key of account is account-number.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 16


UNIT I DBMS

 transaction-number distinguishes transaction entities within the same account


(and is thus the discriminator).
 So the primary key for transaction would be (account-number, transaction-
number).

Just Remember: The primary key of a weak entity is found by taking the primary key of
the strong entity on which it is existence-dependent, plus the discriminator of the weak
entity set.

Primary Keys for Relationship Sets

The attributes of a relationship set are the attributes that comprise the primary keys of the
entity sets involved in the relationship set.
For example:
 S.I.N. is the primary key of customer, and
 account-number is the primary key of account.
 The attributes of the relationship set custacct are then (account-number, S.I.N.).

This is enough information to enable us to relate an account to a person.


If the relationship has descriptive attributes, those are also included in its attribute set. For
example, we might add the attribute date to the above relationship set, signifying the date
of last access to an account by a particular customer.
Note that this attribute cannot instead be placed in either entity set as it relates to both a
customer and an account, and the relationship is many-to-many.
The primary key of a relationship set depends on the mapping cardinality and the
presence of descriptive attributes.
With no descriptive attributes:
 many-to-many: all attributes in .
 one-to-many: primary key for the ``many'' entity.

Descriptive attributes may be added, depending on the mapping cardinality and the
semantics involved (see text).

THE ENTITY RELATIONSHIP DIAGRAM

We can express the overall logical structure of a database graphically with an E-R
diagram.
Its components are:
 rectangles representing entity sets.
 ellipses representing attributes.
 diamonds representing relationship sets.
 lines linking attributes to entity sets and entity sets to relationship sets.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 17


UNIT I DBMS

In the text, lines may be directed (have an arrow on the end) to signify mapping
cardinalities for relationship sets. Figures 2.8 to 2.10 show some examples.

Figure 2.7: An E-R diagram

Figure 2.8: One-to-many from customer to account

Figure 2.9: Many-to-one from customer to account

Figure 2.10: One-to-one from customer to account


Go back and review mapping cardinalities. They express the number of entities to which
an entity can be associated via a relationship.
The arrow positioning is simple once you get it straight in your mind, so do some
examples. Think of the arrow head as pointing to the entity that ``one'' refers to.

OTHER STYLES OF E-R DIAGRAM

The text uses one particular style of diagram. Many variations exist.
Some of the variations you will see are:
 Diamonds being omitted - a link between entities indicates a relationship.
o Less symbols, clearer picture.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 18


UNIT I DBMS

o What happens with descriptive attributes?


o In this case, we have to create an intersection entity to possess the
attributes.
 Numbers instead of arrowheads indicating cardinality.
o Symbols, 1, n and m used.
o E.g. 1 to 1, 1 to n, n to m.
o Easier to understand than arrowheads.
 A range of numbers indicating optionality of relationship. (See Elmasri &
Navathe, p 58.)
o E.g (0,1) indicates minimum zero (optional), maximum 1.
o Can also use (0,n), (1,1) or (1,n).
o Typically used on near end of link - confusing at first, but gives more
information.
o E.g. entity 1 (0,1) -- (1,n) entity 2 indicates that entity 1 is related to
between 0 and 1 occurrences of entity 2 (optional).
o Entity 2 is related to at least 1 and possibly many occurrences of entity 1
(mandatory).
 Multivalued attributes may be indicated in some manner.
o Means attribute can have more than one value.
o E.g. hobbies.
o Has to be normalized later on.
 Extended E-R diagrams allowing more details/constraints in the real world to be
recorded. (See Elmasri & Navathe, chapter 21.)
o Composite attributes.
o Derived attributes.
o Subclasses and superclasses.
o Generalization and specialization.

Roles in E-R Diagrams


The function that an entity plays in a relationship is called its role. Roles are normally
explicit and not specified.
They are useful when the meaning of a relationship set needs clarification.
For example, the entity sets of a relationship may not be distinct. The relationship works-
for might be ordered pairs of employees (first is manager, second is worker).
In the E-R diagram, this can be shown by labelling the lines connecting entities
(rectangles) to relationships (diamonds). (See figure 2.11).

Figure 2.11: E-R diagram with role indicators


Weak Entity Sets in E-R Diagrams
A weak entity set is indicated by a doubly-outlined box. For example, the previously-
mentioned weak entity set transaction is dependent on the strong entity set account via
the relationship set log.
Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 19
UNIT I DBMS

Figure 2.12) shows this example.

Figure 2.12: E-R diagram with a weak entity set


Nonbinary Relationships
Non-binary relationships can easily be represented. Figure 2.13) shows an example.

REDUCING E-R DIAGRAMS TO TABLES

A database conforming to an E-R diagram can be represented by a collection of tables.


We'll use the E-R diagram of Figure 2.14) as our example.

Figure 2.14: E-R diagram with strong and weak entity sets
For each entity set and relationship set, there is a unique table which is assigned the
name of the corresponding set. Each table has a number of columns with unique names.
(E.g. Figs. 2.14 - 2.18 in the text).

Representation of Strong Entity Sets

We use a table with one column for each attribute of the set. Each row in the table
corresponds to one entity of the entity set. For the entity set account, see the table of
figure 2.14.
We can add, delete and modify rows (to reflect changes in the real world).
A row of a table will consist of an n-tuple where n is the number of attributes.
Actually, the table contains a subset of the set of all possible rows. We refer to the set of
all possible rows as the cartesian product of the sets of all attribute values.
We may denote this as

for the account table, where and denote the set of all account numbers and all
account balances, respectively.
In general, for a table of n columns, we may denote the cartesian product of
by
Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 20
UNIT I DBMS

Representation of Weak Entity Sets

For a weak entity set, we add columns to the table corresponding to the primary key of
the strong entity set on which the weak set is dependent.
For example, the weak entity set transaction has three attributes: transaction-number,
date and amount. The primary key of account (on which transaction depends) is account-
number. This gives us the table of figure 2.16.

Representation of Relationship Sets

Let R be a relationship set involving entity sets .


The table corresponding to the relationship set R has the following attributes:

If the relationship has k descriptive attributes, we add them too:

An example:
 The relationship set CustAcct involves the entity sets customer and account.
 Their respective primary keys are S.I.N. and account-number.
 CustAcct also has a descriptive attribute, date.
 This gives us the table of figure 2.17.

Non-binary Relationship Sets


The ternary relationship of Figure 2.13 gives us the table of figure 2.18. As required, we
take the primary keys of each entity set. There are no descriptive attributes in this
example.
Linking a Weak to a Strong Entity
These relationship sets are many-to-one, and have no descriptive attributes. The primary
key of the weak entity set is the primary key of the strong entity set it is existence-
dependent on, plus its discriminator.
The table for the relationship set would have the same attributes, and is thus redundant.

Generalization

Consider extending the entity set account by classifying accounts as being either savings-
account or chequing-account.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 21


UNIT I DBMS

Each of these is described by the attributes of account plus additional attributes. (savings
has interest-rate and chequing has overdraft-amount.)
We can express the similarities between the entity sets by generalization. This is the
process of forming containment relationships between a higher-level entity set and one
or more lower-level entity sets.
In E-R diagrams, generalization is shown by a triangle, as shown in Figure 2.19.

Figure 2.19: Generalization


 Generalization hides differences and emphasizes similarities.
 Distinction made through attribute inheritance.
 Attributes of higher-level entity are inherited by lower-level entities.
 Two methods for conversion to a table form:
o Create a table for the high-level entity, plus tables for the lower-level
entities containing also their specific attributes.
o Create only tables for the lower-level entities.

Aggregation

The E-R model cannot express relationships among relationships.


When would we need such a thing?
Consider a DB with information about employees who work on a particular project and
use a number of machines doing that work. We get the E-R diagram shown in Figure
2.20.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 22


UNIT I DBMS

Figure 2.20: E-R diagram with redundant relationships


Relationship sets work and uses could be combined into a single set. However, they
shouldn't be, as this would obscure the logical structure of this scheme.
The solution is to use aggregation.
 An abstraction through which relationships are treated as higher-level entities.
 For our example, we treat the relationship set work and the entity sets employee
and project as a higher-level entity set called work.
 Figure 2.21 shows the E-R diagram with aggregation.

Figure 2.21: E-R diagram with aggregation

Transforming an E-R diagram with aggregation into tabular form is easy. We create a
table for each entity and relationship set as before.
The table for relationship set uses contains a column for each attribute in the primary key
of machinery and work.

DESIGN OF AN E-R DATABASE SCHEME

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 23


UNIT I DBMS

The E-R data model provides a wide range of choice in designing a database scheme to
accurately model some real-world situation.
Some of the decisions to be made are
 Using a ternary relationship versus two binary relationships.
 Whether an entity set or a relationship set best fit a real-world concept.
 Whether to use an attribute or an entity set.
 Use of a strong or weak entity set.
 Appropriateness of generalization.
 Appropriateness of aggregation.

Mapping Cardinalities

The ternary relationship of Figure 2.13 could be replaced by a pair of binary


relationships, as shown in Figure 2.22.

Figure 2.22: Representation of Figure 2.13 using binary relationships


However, there is a distinction between the two representations:
 In Figure 2.13, relationship between a customer and account can be made only if
there is a corresponding branch.
 In Figure 2.22, an account can be related to either a customer or a branch alone.
 The design of figure 2.13 is more appropriate, as in the banking world we expect
to have an account relate to both a customer and a branch.

Use of Entity or Relationship Sets

It is not always clear whether an object is best represented by an entity set or a


relationship set.
 Both Figure 2.13 and Figure 2.22 show account as an entity.
 Figure 2.23 shows how we might model an account as a relationship between a
customer and a branch.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 24


UNIT I DBMS

Figure 2.23: E-R diagram with account as a relationship set

 This new representation cannot model adequately the situation where customers
may have joint accounts. (Why not?)
 If every account is held by only one customer, this method works.

Use of Extended E-R Features

We have seen weak entity sets, generalization and aggregation. Designers must decide
when these features are appropriate.
 Strong entity sets and their dependent weak entity sets may be regarded as a
single ``object'' in the database, as weak entities are existence-dependent on a
strong entity.
 It is possible to treat an aggregated entity set as a single unit without concern for
its inner structure details.
 Generalization contributes to modularity by allowing common attributes of
similar entity sets to be represented in one place in an E-R diagram.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 25

You might also like