0% found this document useful (0 votes)
66 views55 pages

Unit 1 DBMS

The document discusses the database environment and three-level architecture of databases. It describes the external, conceptual, and internal levels and how they relate to each other. The external level represents different user views, the conceptual level provides a complete and independent view, and the internal level describes the physical storage and implementation.

Uploaded by

Alwin star
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
66 views55 pages

Unit 1 DBMS

The document discusses the database environment and three-level architecture of databases. It describes the external, conceptual, and internal levels and how they relate to each other. The external level represents different user views, the conceptual level provides a complete and independent view, and the internal level describes the physical storage and implementation.

Uploaded by

Alwin star
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 55

Unit - I

CONCEPTUAL DATA MODELING

Database Environment – Database System Development Lifecycle–Requirements collection– Database Design

1.1. INTRODUCTION

Data
Data is a raw material. It can be any character, word, number, date etc.
Example: 5, 27-11-2018, Nagercoil, Sherill

Database
Database is a shared collection of logically related data. For example, consider
employee database consists of following columns- eno, ename and salary.

Fig. 1.1 Example of Database


DBMS
A Data Base Management System (DBMS) is a software system that enables users to
define, create, maintain and control access to the database.
Example:
MySQL, Oracle, SQL Server, IBM DB2, PostgreSQL

Database Application Program


A computer program that interacts with the database by issuing an appropriate request
to the database management.

1.2 PURPOSE OF DATABASE SYSTEMS


The typical file-processing system is supported by a conventional operating
system. The system stores permanent records in various files, and it needs different
application programs to extract records from, and add records to, the appropriate files. Before
database management systems (DBMS) came along, organizations usually stored information
in such systems.

Disadvantages of File Systems over Database Systems

1.1
1.2 Database Design and

1. Data Redundancy:
Data redundancy and inconsistency. Since different programmers create the files and
application programs over a long period, the same information may be duplicated in different
files. This leads to data redundancy.
Example: A student is having record in college office and in computer department. Then, his
telephone number, address will be same in both the departments. This redundancy leads to
higher storage and access cost.
In addition, it may lead to data inconsistency, that is, the various copies of the same
data may no longer agree.
Example: A changed student address may be reflected in the computer department records
but not in the college office.

2. Difficulty in Accessing Data: Accessing data is not convenient and efficient in file
conventional processing system. More responsive data-retrieval systems are required for
general use.
Example: one of the bank officers needs to find out the names of all customers who live
within a particular postal- code area. If there is no application program for this means, the
officer has 2 alternatives: 1. Preparing the list manually from the list of all customers. 2. Ask
system programmer to write the necessary application programs.

3. Data isolation. Because data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve the appropriate data is
difficult.

4. Integrity Problems: The data values stored in the database must satisfy certain types
of consistency constraints.

5. Example: Suppose the university maintains an account for each department, and records the
balance amount in each ac-count. Suppose also that the university requires that the account balance of
a department may never fall below zero. However, when new constraints are added, it is difficult to
change the programs to enforce them.

6. Atomicity Problems: Any operation on database must be atomic, (i.e.) it must happen in
its entirely or not at all.
Example: If you are buying a ticket from railway and you are in the process of money
transaction. Suddenly, your internet got disconnected then you may or may not have paid for
the ticket.
If you have paid, then your ticket will be booked and if not then you will not be charged
anything. That is called consistent state, means you have paid or not.

7. Concurrent Access Anomalies: Multiple users are allowed to access data


simultaneously, this is for the sake of better performance and faster response.

Example: Consider an operation to debit (withdrawal) an account. The program reads the old
Conceptual Data 1.3

balance, calculates the new balance, and writes new balance back to database. Suppose an
account has a balance of Rs. 5000. Now, a concurrent withdrawal of Rs. 1000 and Rs. 2000
may leave the balance Rs. 4000 or Rs. 3000 depending upon their completion time rather
than the correct value of Rs. 2000.

8. Security Problems: Database should be accessible to users in limited way. Not every
user of the database system should be able to access all the data.

Example: In a university, payroll personnel need to see only that part of the database that has
financial information. They do not need access to information about academic records. To
overcome these difficulties, the database management system (DBMS) was developed.

1.3. DATABASE ENVIRONMENT

1.3.1 The Three-Level ANSI-SPARC Architecture

• An early proposal for a standard terminology and general architecture for database
systems was produced in 1971 by the DBTG (Database Task Group) appointed by the
Conference on Data Systems and Languages (CODASYL, 1971).
• The DBTG recognized the need for a two-level approach with a system view called
schema and user views called subschemas.
• The American National Standards Institute (ANSI), Standards Planning and
Requirements Committee (SPARC), ANSI/X3/SPARC, produced a similar terminology and
architecture in 1975 (ANSI, 1975).
• For our purpose the fundamental point of these and later reports is the identification
of three levels of abstraction, i.e., three distinct levels at which data items can be described.
• These levels form a Three Level Architecture comprising an External, a Conceptual
and an Internal level as shown in the figure on the next page.
• The way users perceive the data is called the external level. The way the DBMS and
OS perceive the data is the internal level, where the data is actually stored using the
appropriate data structures. The conceptual level provides both the mapping and the desired
independence between external and internal levels.

Objective of Three – Level Architecture

The objective of three level architecture is to separate each user’s view of the database from
the way the database is physically represented.
1. Database Design and

Fig 1: 2 The Three-Level ANSI-SPARC Architecture

1.3.1.1 External Level/ View


• The external view consists of a number of different external views of the database.
• Describes that part of the database that is relevant to each user .
• The external level includes only those entities, attributes and relationships in real
world that the user is interested in.
• Other entities, attributes and relationships that are not of interest may be presented in
the database but the user will be unaware of them.
• Different views may have different representations of the same data.
• For example one user may view date in the format dd-mm-yyyy while another may
view dates as yyyy-dd-mm.
• Some views might include calculated or derived data i.e. data not actually stored as
such in the database but created when needed.

1.3.1.2 Conceptual Level/ View


• This level contains the logical structure of the entire database as seen by the DBA.
• It is a complete view of the data requirements of the organization that is independent
of any storage considerations.

• This level represents:


 All entities their attributes and their relationships.
 The constraints on the data.
 Semantic information of data.
 Security and Integrity information.
Conceptual Data 1.5

1.3.1.3 Internal Level/ View


• Physical representation of the database on the computer
• Describes how the data is stored on the computer
• It covers the data structures and file organizations used to store data on the storage
devices.
• The Internal Level is concerned with such things as:
 Storage space allocation for data and indexes.
 Record descriptions for storage (with stored sizes of data items).
 Record Placement.
 Data compression & encryption techniques.
1. Database Design and

1.3.1.4 Schemas, Mappings and Instances

Database Schema
• The overall description of database is called Database Schema.
• There are three different types of schema in the database and these are defined
according to the levels of abstraction of the three level architecture.
• At the highest level we have multiple external schemas (also called subschemas) that
correspond to different views of data.
• At the conceptual level we have the conceptual schema, which describes all the
entities, attributes and relationships together with integrity constraints.
• At the lowest level of abstraction we have the internal schema which is complete
description of the internal model, containing the definition of the stored records, the methods
of representation, the data fields, and the indexing and hashing techniques used, if any. There
is only one conceptual schema and one internal schema per database.
Mapping
• The DBMS is responsible for mapping between the above three types of schemas. Not
only this the DBMS must check schemas for consistency.
• The conceptual schema is related to the internal schema through a conceptual/
internal mapping. This mapping enables the DBMS to find the actual record or combination
of records in the physical storage that constitute a logical record in the conceptual schema
together with any constraints to be enforced on the operations for that logical record.
• It also allows any differences in entity names, attribute names, attribute order, data
types… to be resolved.
Conceptual Data 1.7

• Finally, each external schema is related to the conceptual schema by the external/
conceptual mapping. This enables the DBMS to map names in the user’s view on to the
relevant part of the conceptual schema.
• The Schema is specified during the design process and is not expected to change
frequently. However, the actual data in the database may change frequently.
Instance:
• The data in a database at any particular point in time is called a database instance.
Thus many database instances can correspond to the same database schema. The schema is
sometimes called the intension of database, while an instance is called extension (or state) of
the database.

1.3.1.5 Data Independence


• Another major objective of the three - schema ANSI/ SPARC architecture is to
provide data independence.
• Two kinds of data independence are:
a) Logical Data Independence
b) Physical Data Independence

a) Logical Data Independence refers to the immunity of the external schemas to changes in
the conceptual schema.
1. Database Design and

• Changes to the Internal Schema, such as using different file organizations, using
different storage devices, modifying indexes or hashing algorithms should be possible
without having to change the conceptual schema or external schemas. The conceptual/
internal mapping must be able to absorb them.

b) Physical Data Independence refers to the immunity of the conceptual schema to changes
in the internal schema.
• Following figure illustrates where each type of data independence occurs in relation
to the three – schema architecture.

1.3.2 Database Languages


• A data sublanguage (DSL) consists of two parts – a Data Definition Language (DDL)
and a Data Manipulation Language (DML).
• The DDL is used to specify the database schema and the DML is used to both read
and update the database.
• Many DBMSs have the facility for embedding the data sublanguage in a high level
programming language (User Exits) such as COBOL, C, C++, Java etc. In this case the high
level language is sometimes referred to as the host language.

1.3.2.1 The Data Definition Language (DDL)


• The DDL is used to define a schema or to modify an existing one. It can’t be used to
manipulate the data.
• The result of the compilation of the DDL statements is a set of tables stored in a
special file collectively called the system catalog.
Conceptual Data 1.9

• The System Catalog integrates the meta data that is data that describe objects in the
database and makes it easier for those objects to be accessed or manipulated.
• The meta data contains definitions of records, data items and other objects that are of
interest to the user or are required by the DBMS.
• The term data dictionary and data directory are also used to describe the system
catalog. At a theoretical level we could identify different DDLs for each schema in the three
– schema architecture, namely a DDL for external schemas, a DDL for conceptual schema,
and a DDL for the internal schema.
• However, in practice there is one comprehensive DDL that allows specification of at
least the external and conceptual schemas.

1.3.2.2 The Data Manipulation Language (DML)

• Provides a set of operations to support the basic data manipulation operations on the
data
• Data manipulation operations usually include the
following: Insertion of the new data in database.
Modification of data stored in database.
Retrieval (Query) of data contained in the database.
Deletion of data from the database.

• The part of DML that involves data retrieval is called a query language.
• DMLs are distinguished by their underlying retrieval constructs.
• We can identify two types of DML – procedural (Low Level DML) and non –
procedural (High Level DML).
• Typically, procedural languages treat records individually, while non – procedural
languages operate on sets of records.

Procedural or Low Level DML

• With a procedural DML the user or more precisely the programmer specifies what
data is needed and how to obtain it.
• This type of DML typically retrieves individual records or objects from the database
and processes each separately.
• They are also called record-at-a-time DMLs because of their nature of processing a
record at a time.
• Network and Hierarchical DMLs are normally procedural

Non-procedural Data Manipulation Languages

• Non-procedural DMLs allow the required data to be specified in a single retrieval or


update step.
• With these sorts of DMLs the user specifies what data is required without specifying
how it is to be obtained.
1. Database Design and

• They are also known as declarative languages or set-at-a-time or set-oriented DMLs.


• Relational Database Management System (RDBMS) usually include some form of
non- procedural DMLs typically SQL or QBE (Query by Example). They are easy to learn
and use as compared to Procedural DMLs.

1.3.2.3 Fourth Generation Languages (4GLs)

• There is no consensus about what constitutes a fourth generation language; it is in


essence a shorthand programming language.
• An operation that requires hundreds of lines in a conventional programming language
(3GL) like C, generally requires significantly fewer lines in a 4GL.
• Compared with a 3GL which is procedural, a 4GL is non-procedural. The user only
defines what is to be done and not how it is to be done.
• A 4GL is expected to rely largely on much higher-level components known as fourth
generation tools.
• The user doesn’t define the steps that a program needs to perform a task, but instead
defines parameters for the tools that use them to generate an application program.
• It is claimed that 4GLs can improve the productivity by a factor of ten.

Following languages may be kept in the category of Fourth generation languages:

 Presentation Languages, such as query languages and report generators.


 Specialist Languages such as spreadsheets and database languages.
 Application Generators that define, insert, update and retrieve data from the database
to build applications (The data control object of Visual Basic).
 Very high level languages that are used to generate application code (Visual Basic
itself)
 SQL (Structured Query Language), QBE (Query by Example), Data Control Object
of VB, VB itself, Oracle’s Form, Oracle’s Menu and Oracle’s Report writer, JDBC
etc. all are examples of 4GLs.
 Some of them are being discussed below:

i) Form Generators

• A forms generator is an interactive facility for rapidly creating data inputs and display
layouts for screen forms.
• The forms generator allows the user to define what the screen is to look like, what
information is to be displayed, and where on the screen it is to be displayed.
• It may also allow the definition of colors for screen elements and other characteristics,
such as bold, underline and blinking etc.
• A very simple form is shown below
Conceptual Data 1.1

ii) Report Generators


• A Report Generator is a facility for creating reports from data stored in the database.
• We can let the report generator automatically determine how the output should look
like or we can create our own customized output reports using special report generator
command instructions.
• There are two main types of Report Generators:
 Language Oriented
 Visually Oriented

Language Oriented:
• In this case we enter a command in a sublanguage to define what data is to be
included in the report and how the report is to be laid out. The best example of this method is
the REPORT FORM command of FoxPro.

Visually Oriented
• In this case we use a facility similar to a forms generator to define the same information.
• The process of building a report using MS Access from Employee Table and how
results look like after report execution is depicted in the next two figures:
1. Database Design and

iii) Graphics Generators


Conceptual Data 1.1

• A graphics generator is a facility to retrieve data from the database and display the
data as a graph showing trends and relationships in the data (OLAP: Online Analytical
Processing).
• Typically, it allows the user to create bar charts, pie charts, scatter charts etc. from the
data stored in the database.
iv) Application Generators
• An application generator is a facility for producing a program that interfaces with the
database. The use of an application generator can reduce the time it takes to design an entire
software application.
• Application generators typically consist of pre-written modules that comprise
fundamental functions that most database oriented programs use.
• An example of an application generator is VB data control object.

ROLES IN DATABASE ENVIRONMENT (DATABASE PEOPLE)

• We have already discussed that following are the components of a database system or
database environment:
1. Hardware
2. Software (DBMS)
3. Data
4. Procedures
5. People
• Here we will examine in detail the fifth component of the database environment i.e.
people.
• For small databases e.g. the list of students discussed time to time, one person
typically defines, constructs and manipulate the database. However many persons are
involved in the design, use and maintenance of a large database with few hundred users.
1. Database Design and

Database Management System

Database
End User

End User

End User

End User
Application Programs

• Firstly, we will identify the people whose jobs involve the day-to-day use of a large
database system. We will call them the “Actors on the Scene” or “Database Users”.
• Secondly, we will identify the people who are associated with the design,
development and operation of the DBMS software and system environment. We will call
them the “Workers behind the Scene”. These persons are typically not interested in the
database itself.
Actors on the Scene (Database Users)
• We can identify four distinct types of database users:
Data and Database Administrators
Database Designers
Application Programmers or Application Developers
End Users
Data and Database Administrators
• The database and the DBMS are corporate resources that must be managed like any
other resource.
• Data Administration (DA) and Database Administration (DBA) are the roles
generally associated with the management and control of a DBMS and its data.
• Data Administrator is a person in the enterprise who has the central responsibility for
the data.
• Thus it is the Data Administrator’s job to decide what data should be stored in the
database in the first place and to establish policies for maintaining and dealing with that data
once it has been stored.
• An example of such a policy might be one that dictates the enterprise which data to be
made public and which data to be kept as confidential.
Conceptual Data 1.1

• Note carefully that the Data Administrator is a manager (decision maker) and not a
technician (although he or she certainly does need to have some appreciation of the
capabilities of database system at a technical level).
• The technical person responsible for implementing the data administrator’s decisions
is the Database Administrator (DBA). The DBA unlike the Data Administrator is an IT
professional.
• The job of DBA is to create the actual database and to implement the technical
controls needed to enforce various policy decisions made by the DA (Data Administrator).
• The DBA is also responsible for ensuring that the DBMS operates with adequate
performance and for providing a variety of technical services.
• The DBA is typically assisted by a team of Application Programmers and other
technical persons.
• In small organizations sometimes there is no distinction between the role of DA and
DBA, in others they perform their duties as stated above.
• We shall later on discuss the specific responsibilities of DBA in detail.
Database Designers
• Database designers are responsible for identifying the data to be stored in the database
and for choosing appropriate structures to represent and store this data.
• In large design projects we can distinguish between two types of designer: logical
database designers and physical database designers.
• Logical Database Designer is concerned with identifying the data (i.e. the entities and
attributes), the relationships between the data, and the constraints on the data that is to be
stored in the database. In essence it can be said that Logical Database Designer derives the
Conceptual View (note that not the Conceptual Schema).
• The Physical Database Designer decides how the logical database design created by
logical database designer is to be physically realized i.e. he decides specific storage structures
and access methods for the data to achieve good performance. In essence it can be said that
the physical database designer derives the Internal View (note that not the Internal Schema).
• The tasks related to database designers are undertaken before the database is actually
implemented and populated with data.
• It is the responsibility of the database designers to communicate with all prospective
database users, in order to understand their requirements and to come up with a design
(certainly in consultation with the DA) that meets these requirements.
• In many cases, the designers are on the staff of the DBA.
• Database Designers typically interact with each potential group of users and develop a
view of the database that meets the data and processing requirements of this group. These
views
1. Database Design and

are then analyzed and integrated with the views of other user groups. The final database
design must be capable of supporting the requirements of all user groups.
Application Programmers/ Developers
• Once the database has been implemented, the application programs that provide the
required functionality to the end users or they satisfy the personal needs of the application
programmers are developed.
• This is the responsibility of Application Programmers. To access the database they
use some (3GLs) programming language such as C, C++, COBOL, Java etc or some higher-
level “fourth generation” language (4GL).
• Such programs access the database by issuing the appropriate request – typically an
embedded SQL statement – to the DBMS.
End Users
• The end-users are the clients for the database which has been designed and
implemented, and is being maintained to serve their information needs.
• End users may be classified according to the way they use the system:
Naive Users:
• They are typically unaware of the DBMS. They access database through specially
written application programs (by application programmers) which attempt to make the
operations as simple as possible.
• They invoke database operations by entering simple commands or choosing options
from a menu. This means that they do not need to know anything about the database or
DBMS.
• The best example is the person booking seats on a Railway Reservation Counter.

1.3.3 Data Models and Conceptual


Modeling Data Models:
Data Model is a collection of conceptual tools for describing data, data relationships,data
semantics and consistency constraints.
 It consists of three components
i) Structural part
Consisting of set of rules
ii) Manipulative part
Defining the types of operations
iii) Set of Integrity Constraints
This ensures that the data is secure

Data models can be classified into three major groups. They are:
 Object -Based Data models
 Record -Based Data models
 Physical Models
Conceptual Data 1.1

2.3.1 Object-based logical models


These models are used to describe data at the logical and view levels. The following are the
well-known models in this group.
a) Entity-relationship model

b) Object-oriented model
c) Semantic model
d) Functional model
a) Entity-Relationship Model
Entity: An Entity is an object or a thing such as person, place about which an organization
keeps information. Any two objects or things are distinguishable.
E.g.: Each student is an entity.
Attribute: The describing properties of an entity are called Attributes.
e.g.: For a student entity, name, sex, date of birth are attributes.
Relationship: An association among entities is called a relationship.
The data model that consists of a set of entities and a set of relationships among those entities
is called ER Model.
The set of all entities of the same type is called an entity set and the set of all relationship of
the same type are called a relationship set.

Example of entity-relationship model

b) The Object-Oriented Model:


The object oriented model is a data model based on a collection of objects.
Each object has a unique identity. The group of objects containing the same type of values
and the same methods are called classes.
1. Database Design and

c) The Semantic data model:


These models were based on semantic networks. Inter dependencies among the entities can be
expressed in this data model.

d) Functional Data Model:


In this model objects, properties of objects, their relationships are viewed uniformly and are
defined as functions.

1.3.3.2 Record Based Logical models:


This model is used to describe data at the logical and view levels. The database is structured
in fixed format records of different types. Each record type has a fixed number of fields.
And each field is of fixed length.
The following are the three important record based logical models.
a) Relational Model
b) Network Model
c) Hierarchical Model.
Conceptual Data 1.1

a) Relational Model:
A data model in which both data and their relationships are represented by means of tables is
called Relational Model.
The relation is the only data structure used in this model to represent both entities and their
interrelationships. A relation is a two dimensional table with a unique name.
Each row of a table is called a tuple and each column of a table is called an attribute. The set
of all possible values in an attribute is called the domain of the attribute.

Example of tabular data in the relational model:

account-number balance
A-101 500
A-201 900
A-215 700
A-217 750

b) Network Model:
The network model uses two different structures. The data are represented by a collection of
records and the relationships among data are represented by links.

c) Hierarchical Model:
In Hierarchical Model, data are represented by records and relationships among data are
represented by links. But unlike in Network model, data are organized in an ordered tree
structure, which is called Hierarchical structure.
1. Database Design and

2.3.3 Physical Data Models:


These models are used to represent data at the lowest level. Two important physical Data
Models are:
 Unifying Model
 Frame Memory Model.

1.3.3.4 Conceptual Modeling


Conceptual modeling or conceptual database design is the process of constructing a model
of the information use in an enterprise that is independent of implementation details, such as
the target DBMS, application programs, programming languages, or any other physical
considerations. This model is called a conceptual data model.

1.3.4 Functions of a database

(1) Data storage, retrieval, and update


A DBMS must furnish users with the ability to store, retrieve, and update data in the database.
(2) A user-accessible catalog
A DBMS must furnish a catalog in which descriptions of data items are stored and which is
accessible to users.
(3) Transaction support
A DBMS must furnish a mechanism that will ensure either that all the updates corresponding
to a given transaction are made or that none of them is made.
(4) Concurrency control services
A DBMS must furnish a mechanism to ensure that the database is updated correctly when
multiple users are updating the database concurrently.
(5) Recovery services
Conceptual Data 1.2

A DBMS must furnish a mechanism for recovering the database in the event that the database
is damaged in any way.
(6) Authorization services
A DBMS must furnish a mechanism to ensure that only authorized users can access the
database.
(7)Support for data communication
A DBMS must be capable of integrating with communication software.
(8) Integrity services
A DBMS must furnish a means to ensure that both the data in the database and
changes to the data follow certain rules.
(9) Services to promote data independence
A DBMS must include facilities to support the independence of programs from the
actual structure of the database.
(10) Utility services
A DBMS should provide a set of utility services.

1.4. DATABASE SYSTEM DEVELOPMENT LIFECYCLE


The stages of the database system development lifecycle are shown in Figure 10.1. In
the following sections we describe the main activities associated with each stage of the
database system development lifecycle in more detail.
1.4.1. Database planning
Database planning must be integrated with the overall IS strategy of the organization. There
are three main issues involved in formulating an IS strategy, which are:
 identification of enterprise plans and goals with subsequent determination of information
system’s needs;
 evaluation of current information systems to determine existing strengths and weaknesses;
 appraisal of IT opportunities that might yield competitive advantage.
An important first step in database planning is to clearly define the mission statement
for the database system.
The mission statement defines the major aims of the database system. Those driving
the database project within the organization (such as the Director and/or owner) normally
define the mission statement.
A mission statement helps to clarify the purpose of the database system and provide a
clearer path towards the efficient and effective creation of the required database system. Once
the mission statement is defined, the next activity involves identifying the mission
objectives. Each mission objective should identify a particular task that the database system
must support. The assumption is that if the database system sup-ports the mission
objectives, then
the mission statement should be met.
The mission statement and objectives may be accompanied by some additional
information that specifies in general terms the work to be done, the resources with which to
do it, and the money to pay for it all.
1. Database Design and

Fig 10.1 The stages of the database system development lifecycle

1.4.2. System Definition


Describes the scope and boundaries of the database system and the major user views.
Before attempting to design a database system, it is essential that we first identify the
boundaries of the system that we are investigating and how it interfaces with other parts of
the organization’s information system. It is important that we include within our system
boundaries not only the current users and application areas, but also future users and
applications.

1.4.2.1 User Views


Defines what is required of a database system from the perspective of a particular job
role (such as Manager or Supervisor) or enterprise application area (such as marketing,
personnel, or stock control).
A database system may have one or more user views. Identifying user views is an
important aspect of developing a database system because it helps to ensure that no major
users of the database are forgotten when developing the requirements for the new database
system. User views are also particularly helpful in the development of a relatively complex
database system by allowing the requirements to be broken down into manageable pieces.
Conceptual Data 1.2

A user view defines what is required of a database system in terms of the data to be
held and the transactions to be performed on the data (in other words, what the users will do
with the data). The requirements of a user view may be distinct to that view or overlap with
other views. Figure 10.2 is a diagrammatic representation of a data-base system with multiple
user views (denoted user view 1 to 6). Note that whereas user views (1, 2, and 3) and (5 and
6) have overlapping requirements (shown as hatched areas), user view 4 has distinct
requirements.

Figure 10.2 Representation of a database system with multiple user views: user views (1, 2, and 3) and (5 and
6) have overlapping requirements (shown as hatched areas), whereas user view 4 has distinct requirements.

1.4.3. Requirements Collection and Analysis


The process of collecting and analyzing information about the part of the organization
that is to be supported by the database system, and using this information to identify the
requirements for the new system.
This stage involves the collection and analysis of information about the part of the
enterprise to be served by the database. There are many techniques for gather-ing this
information, called fact-finding techniques. Information is gathered for each major user
view (that is, job role or enterprise application area), including:
 a description of the data used or generated;
 the details of how data is to be used or generated;
 any additional requirements for the new database system.

This information is then analyzed to identify the requirements (or features) to be


included in the new database system. These requirements are described in documents
collectively referred to as requirements specifications for the new database system.
Requirements collection and analysis is a preliminary stage to database design. The
amount of data gathered depends on the nature of the problem and the policies of the
enterprise. Too much study too soon leads to paralysis by analysis. Too little thought can
result in an unnecessary waste of both time and money due to working on the wrong solution
to the wrong problem.
The information collected at this stage may be poorly structured and include some
informal requests, which must be converted into a more structured statement of requirements.
This is achieved using requirements specification techniques, which include, for example,
Structured Analysis and Design (SAD) techniques, Data Flow Diagrams (DFD), and
Hierarchical Input Process Output (HIPO) charts supported by documentation. As you will
see shortly, Computer-Aided Software Engineering (CASE) tools may provide automated
assistance to ensure that the requirements are complete and consistent
There are three main approaches to managing the requirements of a database sys-tem with
1. Database Design and

multiple user views:

 the centralized approach;


 the view integration approach;
 a combination of both approaches.

1.4.3.1 Centralized Approach


Requirements for each user view are merged into a single set of requirements for the
new database system. A data model representing all user views is created during the database
design stage.
The centralized (or one-shot) approach involves collating the requirements for
different user views into a single list of requirements. The collection of user views is given a
name that provides some indication of the application area covered by all the merged user
views. In the database design stage (see Section 10.6), a global data model is created, which
represents all user views. The global data model is composed of diagrams and documentation
that formally describe the data requirements of the users. A diagram representing the
management of user views 1 to 3 using the centralized approach is shown in Figure 10.3.

Figure 10.3 The centralized approach to managing multiple user views 1 to 3.

1.4.3.2 View Integration Approach


Requirements for each user view remain as separate lists. Data models representing
each user view are created and then merged later during the database design stage.
The view integration approach involves leaving the requirements for each user view
as separate lists of requirements. In the database design stage (see Section 10.6), we first
create a data model for each user view. A data model that represents a single user view (or a
subset of all user views) is called a local data model.
Each model is composed of diagrams and documentation that formally describes the
requirements of one or more—but not all—user views of the database.
The local data models are then merged at a later stage of database design to produce a
global data model, which represents all user requirements for the database.
A diagram representing the management of user views 1 to 3 using the view
integration approach is shown in Figure 10.4.
Generally, this approach is preferred when there are significant differences between
user views and the database system is sufficiently complex to justify dividing the work into
more manageable parts.
Conceptual Data 1.2

Figure 10.4 The view integration approach to managing multiple user views 1 to 3.

1.4.4. Database Design


The process of creating a design that will support the enterprise’s mission statement
and mission objectives for the required database system.
In this section we present an overview of the main approaches to database design. We also
discuss the purpose and use of data modeling in database design. We then describe the three
phases of database design: conceptual, logical, and physical design.

1.4.4.1 Approaches to Database Design


The two main approaches to the design of a database are referred to as “bottom-up”
and “top-down.” The bottom-up approach begins at the fundamental level of attributes (that
is, properties of entities and relationships), which through analysis of the associations
between attributes are grouped into relations that represent types of entities and relationships
between entities. In Chapters 14 and 15 we dis-cuss the process of normalization, which
represents a bottom-up approach to data-base design. Normalization involves the
identification of the required attributes and their subsequent aggregation into normalized
relations based on functional dependencies between the attributes.

The bottom-up approach is appropriate for the design of simple databases with a
relatively small number of attributes. However, this approach becomes difficult when applied
to the design of more complex databases with a larger number of attributes, where it is
difficult to establish all the functional dependencies between the attributes. As the conceptual
and logical data models for complex databases may contain hundreds to thousands of
attributes, it is essential to establish an approach that will simplify the design process. Also,
in the initial stages of establishing the data requirements for a complex database, it may be
difficult to establish all the attributes to be included in the data models.
1. Database Design and

A more appropriate strategy for the design of complex databases is to use the top-down
approach. This approach starts with the development of data models that contain a few high-
level entities and relationships and then applies successive top-down refinements to identify
lower-level entities, relationships, and the associated attributes. The top-down approach is
illustrated using the concepts of the Entity-Relationship (ER) model, beginning with the
identification of entities and relationships between the entities, which are of interest to the
organization.

1.4.4.2 Data Modeling

The two main purposes of data modeling are to assist in the understanding of the meaning
(semantics) of the data and to facilitate communication about the information requirements.
Building a data model requires answering questions about entities, relationships, and
attributes. In doing so, the designers discover the semantics of the enterprise’s data, which
exist whether or not they happen to be recorded in a formal data all enterprises. However,
their meaning may remain poorly understood until they have been correctly documented. A
data model makes it easier to understand the meaning of the data, and thus we model data to
ensure that we understand:

 each user’s perspective of the data;


 the nature of the data itself, independent of its physical representations;
 the use of data across user views.

Data models can be used to convey the designer’s understanding of the information
requirements of the enterprise.

Criteria for data models

An optimal data model should satisfy the criteria listed in Table 10.2 (Fleming and Von
Halle, 1989). However, sometimes these criteria are incompatible with each other and trade-
offs are sometimes necessary. For example, in attempting to achieve greater expressibility in
a data model, we may lose simplicity.

1.4.4.3 Phases of Database Design


Database design is made up of three main phases: conceptual, logical, and physical
design.

Conceptual database design


The process of constructing a model of the data used in an enterprise, independent of
all physical considerations.
The first phase of database design is called conceptual database design and involves
the creation of a conceptual data model of the part of the enterprise that we are interested in
modeling. The data model is built using the information documented in the users’
requirements specification.
Table 10.2 The criteria to produce an optimal data model.
Structural validity Consistency with the way the enterprise defines and organizes
information.
Simplicity Ease of understanding by IS professionals and nontechnical users.
Conceptual Data 1.2

Expressibility Ability to distinguish between different data, relationships between


data, and constraints.
Nonredundancy Exclusion of extraneous information; in particular, the representation of
any one piece of
information exactly once.
Shareability Not specific to any particular application or technology and thereby
usable by many.
Extensibility Ability to evolve to support new requirements with minimal effect on
existing users.
Integrity Consistency with the way the enterprise uses and manages information.
Diagrammatic Ability to represent a model using an easily understood diagrammatic
notation.
representation

Logical database design

The process of constructing a model of the data used in an enterprise based on a specific data
model, but independent of a particular DBMS and other physical considerations.
The second phase of database design is called logical database design, which results in the
creation of a logical data model of the part of the enterprise that we interested in modeling.
The conceptual data model created in the previous phase is refined and mapped onto a logical
data model. The logical data model is based on the target data model for the database (for
example, the relational data model).

Whereas a conceptual data model is independent of all physical considerations, a logical


model is derived knowing the underlying data model of the target DBMS. In other words, we
know that the DBMS is, for example, relational, network, hierarchical, or object-oriented.
However, we ignore any other aspects of the chosen DBMS and, in particular, any physical
details, such as storage structures or indexes.

Throughout the process of developing a logical data model, the model is tested and
validated against the users’ requirements. The technique of normalization is used to test the
correctness of a logical data model.

Physical database design


The process of producing a description of the implementation of the database on
secondary storage; it describes the base relations, file organizations, and indexes used to
achieve efficient access to the data, and any associated integrity constraints and security
measures.
Physical database design is the third and final phase of the database design process,
during which the designer decides how the database is to be implemented. The previous
phase of database design involved the development of a logical structure for the database,
which describes relations and enterprise constraints. Although this structure is DBMS-
independent, it is developed in accordance with a particular data model, such as the
relational, network, or hierarchic. However, in developing the physical database design, we
must first identify the target DBMS. Therefore, physical design is tailored to a specific
DBMS system. There is
1. Database Design and

feedback between physical and logical design, because decisions are taken during physical
design for improving performance that may affect the structure of the logical data model.

In general, the main aim of physical database design is to describe how we intend to
physically implement the logical database design. For the relational model, this involves:

 Creating a set of relational tables and the constraints on these tables from the
information presented in the logical data model;
 Identifying the specific storage structures and access methods for the data to achieve
an optimum performance for the database system;
 Designing security protection for the system.

Ideally, conceptual and logical database design for larger systems should be separated from
physical design for three main reasons:
 It deals with a different subject matter—the what, not the how;
 It is performed at a different time—the what must be understood before the how can
be determined;
 It requires different skills, which are often found in different people.

Fig: Data modeling and the ANSI-SPARC architecture.

1.4.5. DBMS Selection


If no DBMS exists, an appropriate part of the lifecycle in which to make a selection is
between the conceptual and logical database design phases (see Figure 10.1). However,
selection can be done at any time prior to logical design provided sufficient information is
available regarding system requirements such as performance, ease of restructuring, security,
and integrity constraints.
Although DBMS selection may be infrequent, as enterprise needs expand or existing
systems are replaced, it may become necessary at times to evaluate new DBMS products. In
such cases, the aim is to select a system that meets the current and future requirements of the
enterprise, balanced against costs that include the purchase of the DBMS product, any
Conceptual Data 1.2

additional software/hardware required to support the database system, and the costs
associated with changeover and staff training.
A simple approach to selection is to check off DBMS features against requirements.
In selecting a new DBMS product, there is an opportunity to ensure that the selection process
is well planned, and the system delivers real benefits to the enterprise. In the following
section we describe a typical approach to selecting the “best” DBMS.

1.4.5.1 Selecting the DBMS


Define Terms of Reference of study
The Terms of Reference for the DBMS selection is established, stating the objectives
and scope of the study and the tasks that need to be undertaken. This document may also
include a description of the criteria (based on the users’ requirements specification) to be used
to evaluate the DBMS products, a preliminary list of possible products, and all necessary
constraints and timescales for the study.

Shortlist two or three products


Criteria considered to be “critical” to a successful implementation can be used to
produce a preliminary list of DBMS products for evaluation. For example, the decision to
include a DBMS product may depend on the budget available, level of vendor support,
compatibility with other software, and whether the product runs on particular hardware.
Additional useful information on a product can be gathered by contacting existing users, who
may provide specific details on how good the vendor support actually is, on how the product
supports particular applications, and whether certain hardware platforms are more problematic
than others. There may also be benchmarks available that compare the performance of DBMS
products. Following an initial study of the functionality and features of DBMS products, a
shortlist of two or three products is identified.

The World Wide Web is an excellent source of information and can be used to identify
potential candidate DBMSs. For example, InfoWorld’s online technology test center
(available at www.infoworld/test-center.com) provides a comprehensive review of DBMS
products. Vendors’ Web sites can also provide valuable information on DBMS products.

Evaluate products
There are various features that can be used to evaluate a DBMS product. For the
purposes of the evaluation, these features can be assessed as groups (for example, data
definition) or individually (for example, data types available). Table 10.4 lists possible
features for DBMS product evaluation grouped by data definition, physical definition,
accessibility, transaction handling, utilities, development, and other features.

If features are checked off simply with an indication of how good or bad each is, it
may be difficult to make comparisons between DBMS products. A more useful approach is to
weight features and/or groups of features with respect to their -importance to the organization,
and to obtain an overall weighted value that can requirements” are given a weighting of 0.00
and are therefore not included in this evaluation.
1. Database Design and

We next sum all the scores for each evaluated feature to produce a total score for the
group. The score for the group is then itself subject to a weighting, to indicate its importance
relative to other groups of features included in the evaluation. For example, in Table 10.5, the
total score for the “Physical definition” group is 5.75; however, this score has a weighting of
0.25.
Finally, all the weighted scores for each assessed group of features are summed to
produce a single score for the DBMS product, which is compared with the scores for the other
products. The product with the highest score is the “winner".

In addition to this type of analysis, we can also evaluate products by allowing vendors
to demonstrate their product or by testing the products in-house. In-house evaluation involves
creating a pilot testbed using the candidate products. Each product is tested against its ability
to meet the users’ requirements for the database system. Benchmarking reports published by
the Transaction Processing Council can be found at www.tpc.org.

Recommend selection and produce report


The final step of the DBMS selection is to document the process and to provide a
statement of the findings and recommendations for a particular DBMS product.

1.4.6. Application Design


The design of the user interface and the application programs that use and process
the database

1.4.6.1 Transaction Design


An action, or series of actions, carried out by a single user or application program, that
accesses or changes the content of the database.
The purpose of transaction design is to define and document the high-level characteristics of
the transactions required on the database, including:
 data to be used by the transaction;
 functional characteristics of the transaction;
 output of the transaction;
 importance to the users;
 expected rate of usage.
This activity should be carried out early in the design process to ensure that the implemented
database is capable of supporting all the required transactions. There are three main types of
transactions: retrieval transactions, update transactions, and mixed transactions:
Retrieval transactions are used to retrieve data for display on the screen or in the production
of a report. For example, the operation to search for and display the details of a property
(given the property number) is an example of a retrieval transaction.

Update transactions are used to insert new records, delete old records, or modify existing
records in the database. For example, the operation to insert the details of a new property into
the database is an example of an update transaction.
Conceptual Data 1.3

Mixed transactions involve both the retrieval and updating of data. For example, the
operation to search for and display the details of a property (given the property number) and
then update the value of the monthly rent is an example of a mixed transaction.

1.4.6.2 User Interface Design Guidelines


Before implementing a form or report, it is essential that we first design the layout.
Useful guidelines to follow when designing forms or reports are listed in Table 10.6
(Shneiderman et al., 2009).

Meaningful title
The information conveyed by the title should clearly and unambiguously identify the purpose
of the form/report.

Comprehensible instructions
Familiar terminology should be used to convey instructions to the user. The instruc-tions
should be brief, and, when more information is required, help screens should be made
available. Instructions should be written in a consistent grammatical style, using a standard
format.

Logical grouping and sequencing of fields


Related fields should be positioned together on the form/report. The sequencing of fields
should be logical and consistent.

Visually appealing layout of the form/report


The form/report should present an attractive interface to the user. The form/report should
appear balanced with fields or groups of fields evenly positioned throughout the form/report.
There should not be areas of the form/report that have too few or too many fields. Fields or
groups of fields should be separated by a regular amount of space. Where appropriate, fields
should be vertically or horizontally aligned. In cases where a form on screen has a hardcopy
equivalent, the appearance of both should be consistent.

Familiar field labels


Field labels should be familiar. For example, if Sex were replaced by Gender, it is possible
that some users would be confused.

Consistent terminology and abbreviations


An agreed list of familiar terms and abbreviations should be used consistently.

Consistent use of color


Color should be used to improve the appearance of a form/report and to highlight important
fields or important messages. To achieve this, color should be used in a consistent and
meaningful way. For example, fields on a form with a white background may indicate data
entry fields and those with a blue background may indicate display-only fields.

Visible space and boundaries for data-entry fields


A user should be visually aware of the total amount of space available for each field. This
1. Database Design and

allows a user to consider the appropriate format for the data before entering the values into a
field.

Convenient cursor movement


A user should easily identify the operation required to move a cursor throughout the
form/report. Simple mechanisms such as using the Tab key, arrows, or the mouse pointer
should be used.

Error correction for individual characters and entire fields A user should easily identify
the operation required to make alterations to field values. Simple mechanisms should be
available, such as using the Backspace key or overtyping.
Error messages for unacceptable values
If a user attempts to enter incorrect data into a field, an error message should be
displayed. The message should inform the user of the error and indicate permissible
values.

Optional fields marked clearly


Optional fields should be clearly identified for the user. This can be achieved using an
appropriate field label or by displaying the field using a color that indicates the type of
the field. Optional fields should be placed after required fields.

Explanatory messages for fields


When a user places a cursor on a field, information about the field should appear in a
regular position on the screen, such as a window status bar.

Completion signal
It should be clear to a user when the process of filling in fields on a form is complete.
However, the option to complete the process should not be automatic, as the user may
wish to review the data entered.

1.4.7. Prototyping
Building a working model of a database system. A prototype is a working model
that does not normally have all the required features or provide all the functionality of the
final system. The main purpose of developing a prototype database system is to allow
users to use the prototype to identify the features of the system that work well or are
inadequate, and—if possible—to suggest improvements or even new features to the
database system. There are two prototyping strategies in common use today:
requirements proto-typing and evolutionary prototyping.
Requirements prototyping uses a prototype to determine the requirements of a
proposed database system, and once the requirements are complete, the prototype is
discarded. Although evolutionary prototyping is used for the same purposes, the
important difference is that the prototype is not discarded, but with further development
becomes the working database system.

1.4.8. Implementation
The physical realization of the database and application designs.
Conceptual Data 1.3

The database implementation is achieved using the DDL of the selected DBMS or a GUI,
which provides the same functionality while hiding the low-level DDL statements. The
DDL statements are used to create the database structures and empty database files. Any
specified user views are also implemented at this stage.
The application programs are implemented using the preferred third- or fourth-
generation language (3GL or 4GL). Parts of these application programs are the database
transactions, which are implemented using the DML of the target DBMS, possibly
embedded within a host programming language, such as Visual Basic (VB), VB.net,
Python, Delphi, C, C++, C#, Java, COBOL, Fortran, Ada, or Pascal. We also implement
the other components of the application design such as menu screens, data entry forms,
and reports.
Again, the target DBMS may have its own fourth-generation tools that allow
rapid development of applications through the provision of nonprocedural query
languages, reports generators, forms generators, and application generators.
Security and integrity controls for the system are also implemented. Some of these
controls are implemented using the DDL,

1.4.9. Data Conversion and Loading


Transferring any existing data into the new database and converting any existing
applications to run on the new database.
This stage is required only when a new database system is replacing an old
system. Nowadays, it is common for a DBMS to have a utility that loads existing files
into the new database. The utility usually requires the specification of the source file and
the target database, and then automatically converts the data to the required format of the
new database files. Where applicable, it may be possible for the developer to convert and
use application programs from the old system for use by the new system.

1.4.10. Testing

The process of running the database system with the intent of finding errors.
Testing should also cover usability of the database system. Ideally, an evaluation
should be conducted against a usability specification. Examples of criteria that can be
used to conduct the evaluation include the following (Sommerville, 2010):
Learnability: How long does it take a new user to become productive with the
system?

Performance: How well does the system response match the user’s work practice?
Robustness: How tolerant is the system of user error?
Recoverability: How good is the system at recovering from user errors?
Adaptability: How closely is the system tied to a single model of work?
Some of these criteria may be evaluated in other stages of the lifecycle. After testing is
complete, the database system is ready to be “signed off” and handed over to the users.

1.4. 11. Operational Maintenance


The process of monitoring and maintaining the database system following
1. Database Design and

installation. If the performance falls below an acceptable level, tuning or reorganization


of the database may be required.
Once the database system is fully operational, close monitoring takes place to
ensure that performance remains within acceptable levels.
A DBMS normally provides various utilities to aid database administration,
including utilities to load data into a database and to monitor the system.
The utilities that allow system monitoring give information on, for example,
database usage, locking efficiency (including number of deadlocks that have occurred,
and so on), and query execution strategy.

1.5. ENTITY–RELATIONSHIP MODELING

Concepts of the ER Model

 Entity types
 Relationship types
 Attributes

1.5.1 Entity Type


 Entity type
o Group of objects with same properties, identified by enterprise as having an
independent existence.
 Entity occurrence
Uniquely identifiable object of an entity type.
Examples of Entity Types
Conceptual Data 1.3

ER diagram of Staff and Branch entity types

1.5.2 Relationship Types

 Relationship type
– Set of meaningful associations among entity types.
 Relationship occurrence
– Uniquely identifiable association, which includes one occurrence from
each participating entity type.
Semantic net of Has relationship type
1. Database Design and

ER diagram of Branch Has Staff relationship

1.5.2.1 Degree of a Relationship


– Number of participating entities in relationship.
– Relationship of degree :
 two is binary
 three is ternary
 four is quaternary.

Binary relationship called POwns


Conceptual Data 1.3

Ternary relationship called Registers

Quaternary relationship called Arranges

1.5.2.2.Recursive Relationship
– Relationship type where same entity type participates more than once in
different roles.
– Relationships may be given role names to indicate purpose that each
participating entity type plays in a relationship.

Recursive relationship called Supervises with role names

1.5.3 Attributes
1. Database Design and

 Attribute
o Property of an entity or a relationship type.
 Attribute Domain
o Set of allowable values for one or more attributes.
1.5.3.1. Attribute types

An attribute can be classified into various types.


(i) Simple attribute: An attribute that cannot be divided into further
subparts.
Example: Customer-id of customer entity.
(ii) Composite attribute: An attribute that can be divided into a set of
subparts.
Example: In a customer entity, the attribute name can further be divided into first-name,
middle-name, last-name.
(iii) Single value attribute: An attribute having only one value in a
particular entity.
Example: In a customer entity, name, id, street are single valued attributes.
(iv) Multi-valued attribute: An attribute having more than one value for a
particular entity.
Example: Consider the customer entity set with the attribute phone no. A customer may
have zero, one or several phone nos and different customers may have different numbers
of phone.
(v) Derived attribute: An attribute that is derived from other related
attributes or entities.
For example, the age of a customer entity set is derived from the attribute date- of-birth of
a customer.

ER diagram of Staff and Branch entities and their attributes

1.5.3.2. Keys
Conceptual Data 1.3

A key allows us to identify a set of attributes and thus distinguishes entities from each
other. Keys also help to uniquely identify relationships, and thus distinguish relationships
from each other.

Different types of keys are:


1. Super key

2. Candidate key

3. Primary key

4. Foreign key

1. Super key: A super key is a set of one or more attributes that, taken
collectively, allow us to identify uniquely a tuple in the relation. For example, consider
the student relation.
Student (Rollno, Name, Age) is a Super key.

2. Candidate key: A table which have more than one attribute that uniquely
identify an instance of an entity set. These attributes are called Candidate keys.
For example, Consider the car relation,
Car (license_no, engine_serial_no, make, model, year) In this relation, we can find the two
Candidate keys:
license_no and engine_serial_no.
3. Primary key: An attribute which is unique and not null, can identify an
instance of the entity set is termed as Primary key.
For example, Consider the employee relation,
Employee (eno, ename, sex, doj, dob, sal, job, dno) in this eno is the Primary key.

4. Foreign key: An attribute in one relation whose value matches the primary
key in some other relation is called a Foreign key.
For example, Consider the two relations dept and employee,

Dept (dno, dname, dloc)


Employee (eno, ename, sex, doj, dob, sal, job, dno)
In the above relations, for dept relation dno is the primary key, for employee relation eno
is the
primary key and here we can find that the employee relation - dno matches with the dept
relation
- dno, so that employee relation dno is known as Foreign key.

1.5.4 Strong and Weak Entity Types


 Strong Entity Type
Entity types that have key attributes of their own are called strong entity types.
1. Database Design and

– Entity type that is not existence-dependent on some other entity type.


 Weak Entity Type
Entity types that do not have key attributes of their own are called weak entity types.

– Entity type that is existence-dependent on some other entity


type. Strong entity type called Client and weak entity type called Preference

Structural Constraints
 Main type of constraint on relationships is called multiplicity.
 Multiplicity - number (or range) of possible occurrences of an entity type that
may relate to a single occurrence of an associated entity type through a particular
relationship.
 Represents policies (called business rules) established by user or company.
Mapping Cardinalities
Mapping cardinalities or Cardinality ratio is defined as the number entities to which
another entity can be associated via a relationship set.

For a binary relationship set R between entity sets A and B, the mapping cardinality
must be one of the following

(i) One-to-one (1 : 1)

An entity in A is associated with at most one entity in B, and an entity in B is associated


with at most one entity in A.
Conceptual Data 1.4

(a) (b)
Fig. 2.4 One-to-One & One-to-Many
(ii) One-to-many (1 : M)

An entity in A is associated with any number of entities in B. An entity in B can be


associated with at most one entity in A.

(iii) Many-to-Many (M : N)

An entity in A is associated with any number of entities in B, and an entity in B is


associated with any number of entities in A.

(iv) Many to one (M : 1)

An entity in A is associated with at most one entity in B. An entity in B can be associated


with any number of entities in A.

(c) (d)
Fig. 2.5 Many-to-Many & Many-to-One
1. Database Design and

Problems with ER Models


 Problems may arise when designing a conceptual data model called connection
traps.
 Often due to a misinterpretation of the meaning of certain relationships.
 Two main types of connection traps are called fan traps and chasm traps.
 Fan Trap
o Where a model represents a relationship between entity types, but pathway
between certain entity occurrences is ambiguous.
o Chasm Trap
o Where a model suggests the existence of a relationship between entity types, but
pathway does not exist between certain entity occurrences.

1.6. ENHANCED ENTITY–RELATIONSHIP MODELING


 Semantic concepts are incorporated into the original ER model and called the
Enhanced Entity-Relationship (EER) model.
 Examples of additional concepts of EER model are:
– specialization / generalization
– aggregation
– composition.

1.6.1 Specialization / Generalization


1.6.1.1 Superclass & Subclass
 Superclass
o An entity type that includes one or more distinct subgroupings of its occurrences.
 Subclass
o A distinct subgrouping of occurrences of an entity type.
1.6.1.2 Superclass/subclass relationship
 Superclass/subclass relationship is one-to-one (1:1).
 Superclass may contain overlapping or distinct subclasses.
 Not all members of a superclass need be a member of a subclass.
1.6.1.3 Attribute Inheritance
o An entity in a subclass represents same ‘real world’ object as in superclass, and
may possess subclass-specific attributes, as well as those associated with the superclass.
Conceptual Data 1.4

1.6.1.4 Specialization
– Process of maximizing differences between members of an entity by
identifying their distinguishing characteristics.
1.6.1.5 Generalization
Process of minimizing differences between entities by identifying their common
characteristics.
All Staff relation holding details of all staff

Specialization/generalization of Staff entity into subclasses representing job roles


1. Database Design and

1.6.1.6 Constraints on Specialization / Generalization


 Two constraints that may apply to a specialization/generalization:
a) participation constraints
b) disjoint constraints.
a) Participation constraint

o Determines whether every member in superclass must participate as a member


of a subclass.
o May be mandatory or optional.
b) Disjoint constraint

o Describes relationship between members of the subclasses and indicates


whether member of a superclass can be a member of one, or more than one, subclass.
o May be disjoint or nondisjoint.
 There are four categories of constraints of specialization and generalization:
a. mandatory and disjoint
b. optional and disjoint
c. mandatory and nondisjoint
d. optional and nondisjoint.
1.7 AGGREGATION
 Represents a ‘has-a’ or ‘is-part-of’ relationship between entity types, where one
represents the ‘whole’ and the other ‘the part’.
Examples of Aggregation
Conceptual Data 1.4

1.8 COMPOSITION
 Specific form of aggregation that represents an association between entities,
where there is a strong ownership and coincidental lifetime between the ‘whole’ and the
‘part’.
Example of Composition

1.9.UML DIAGRAMS

UML defines a number of diagrams, of which the main ones can be divided into
the following two categories:

1) Structural diagrams, which describe the static relationships between


components. These include:
a. class diagrams,
b. object diagrams,
c. component diagrams,
d. deployment diagrams.

2)Behavioral diagrams, which describe the dynamic relationships between


components. These include:
a. use case diagrams,
b. sequence diagrams,
c. collaboration diagrams,
d. statechart diagrams,
e. activity diagrams.

a) Object diagrams
Object diagrams model instances of classes and are used to describe the system at
a particular point in time.
1. Database Design and

b) Component diagrams
Component diagrams describe the organization and dependencies among physical
software components, such as source code, runtime (binary) code, and executables.
A component is represented by a rectangle with two tabs overlapping the left edge. A
dependency is denoted by a dotted arrow going from a component to the component it
depends on.
c) Deployment diagrams
Deployment diagrams depict the configuration of the runtime system, showing the
hardware nodes, the components that run on these nodes, and the connections between
nodes. A node is represented by a three-dimensional cube.

d) Use case diagrams


Use case diagrams model the functionality provided by the system (use cases), the users
who interact with the system (actors), and the association between the users and the
functionality.
Use cases are used in the requirements collection and analysis phase of the
software development lifecycle to represent the high-level requirements of the system.
More specifically, a use case specifies a sequence of actions, including variants, that the
system can perform and that yields an observable result of value to a particular actor
(Jacobson et al., 1999).
Conceptual Data 1.4

An individual use case is represented by an ellipse, an actor by a stick figure, and


an association by a line between the actor and the use case. The role of the actor is
written beneath the icon.

e) Sequence diagrams
A sequence diagram models the interactions between objects over time, capturing
the behavior of an individual use case. It shows the objects and the messages that are
passed between these objects in the use case. In a sequence diagram, objects and actors
are shown as columns, with vertical lifelines indicating the lifetime of the object over
time. An activation/focus of control, which indicates when the object is performing an
action, is modeled as a rectangular box on the lifeline; a lifeline is represented by a
vertical dotted line extending from the object. The destruction of an object is indicated by
an X at the appropriate point on its lifeline.
1. Database Design and

f) Collaboration diagrams
A collaboration diagram is another type of interaction diagram, in this case showing the
interactions between objects as a series of sequenced messages. This type of diagram is a
cross between an object diagram and a sequence diagram.

g) Statechart diagrams
Statechart diagrams, sometimes referred to as state diagrams, show how objects
can change in response to external events. Other behavioral diagrams typically model the
interaction between multiple objects; statechart diagrams usually model the transitions of
a specific object.
the notation is simple, consisting of a few symbols:
• States are represented by boxes with rounded corners.
• Transitions are represented by solid arrows between states labeled with the “eventname/
action” (the event triggers the transition and action is the result of the transition).
Conceptual Data 1.4

For example, in Figure 27.23, the transition from state Pending to Available is triggered
by an approveProperty event and gives rise to the action called makeAvailable().
• Initial state (the state of the object before any transitions) is represented by a solid circle
with an arrow to the initial state.
• Final state (the state that marks the destruction of the object) is represented by a solid
circle with a surrounding circle and an arrow coming from a preceding state.

h) Activity diagrams

Activity diagrams model the flow of control from one activity to another. An
activity diagram typically represents the invocation of an operation, a step in a business
process, or an entire business process. It consists of activity states and transitions between
them.
1.5 Database Design and
Conceptual Data 1.5

IMPORTANT QUESTIONS AND ANSWERS


PART - A
1. Who is a DBA? What are the responsibilities of a DBA?
A database administrator (short form DBA) is a person responsible for the
design, implementation, maintenance and repair of an organization's database. They are
also known by the titles Database Coordinator or Database Programmer, and is closely
related to the Database Analyst,Database Modeller, Programmer Analyst, and Systems
Manager.
The role includes the development and design of database strategies, monitoring
and improving database performance and capacity, and planning for future expansion
requirements. They may also plan, co-ordinate and implement security measures to
safeguard the database
2. What is a data model? List the types of data model used.
A database model is the theoretical foundation of a database and fundamentally
determines in which manner data can be stored, organized, and manipulated in a database
system. It thereby defines the infrastructure offered by a particular database system. The
most popular example of a database model is the relational model.
types of data model used
 Hierarchical model
 Network model
 Relational model
 Entity-relationship
 Object-relational model
 Object model

3.Define database management system.


Database management system (DBMS) is a collection of interrelated data and a set
ofprograms to access those data.

4.What is data base management system?


 A database management system (DBMS) is a software package with
computer programs thatcontrol the creation, maintenance, and the use of a
database.
 It allows organizations to conveniently develop databases for various applications
bydatabase administrators (DBAs) and other specialists.
 A database is an integrated collection of data records, files, and other
database objects.
 A DBMS allows different user application programs to concurrently access the
same database. DBMSs may use a variety of database models, such as the relational
model or object model, to conveniently describe and support applications.
 It typically supports query languages, which are in fact high-level programming
languages, dedicated database languages that considerably simplify writing
database application programs.
 Database languages also simplify the database organization as well as retrieving
and presenting information from it.
 A DBMS provides facilities for controlling data access, enforcing data integrity,
managing concurrency control, recovering the database after failures and restoring
it from backup files,as well as maintaining database security.

5.List any eight applications of DBMS.


1.5 Database Design and

a) Banking
b) Airlines
c) Universities
1. What is an entity relationship model?
The entity relationship model is a collection of basic objects called entities and
relationship
among those objects. An entity is a thing or object in the real world that is distinguishable
from other objects.
2. What are attributes? Give examples.
An entity is represented by a set of attributes. Attributes are descriptive properties
possessed by each member of an entity set.
Example: possible attributes of customer entity are customer name, customer id,
Customer Street, customer city.
3. What is an entity?
An entity is an object that exists and is distinguishable from other objects. Example:
specific person, company, event, plant
4. Define the terms: i) Entity set ii) Relationship set
Entity set: The set of all entities of the same type is termed as an entity set. Relationship
set : The set of all relationships of the same type is termed as a relationship set.

5. Define single valued and multivalued attributes.


Single valued attributes: attributes with a single value for a particular entity are called
single valued attributes.
Multivalued attributes : Attributes with a set of value for a particular entity are called
multivalued attributes.
6. What are stored and derived attributes?
Stored attributes: The attributes stored in a data base are called stored attributes.
Derived attributes: The attributes that are derived from the stored attributes are called
derived attributes.
7 Define null values.
In some cases a particular entity may not have an applicable value for an attribute or if we
do not know the value of an attribute for a particular entity. In these cases null value is
used.

8 What does the cardinality ratio specify?


Mapping cardinalities or cardinality ratios express the number of entities to which another
entity can be associated. Mapping cardinalities must be one of the following:
• One to one
• One to many
• Many to one
• Many to many
Conceptual Data 1.

10. Define the terms: i) Key attribute ii) Value set


Key attribute : An entity type usually has an attribute whose values are distinct from
each individual entity in the collection. Such an attribute is called a key attribute.
Value set: Each simple attribute of an entity type is associated with a value set that
specifies the set of values that may be assigned to that attribute for each individual entity.

11. Define weak and strong entity sets?


Weak entity set: entity set that do not have key attribute of their own are called weak
entity sets.
Strong entity set: Entity set that has a primary key is termed a strong entity set.
12. Explain the two types of participation constraint.
Total: The participation of an entity set E in a relationship set R is said to be total if
every entity in E participates in at least one relationship in R.
Partial: if only some entities in E participate in relationships in R, the participation of
entity set E in relationship R is said to be partial.
13. Define the terms Generalization and Aggregation?
Generalization is a containment relationship that exists between a high-level Entity set
and one or more low-level entity set.
Aggregation is an abstraction through which relationships are treated as higher-level
entities.

14. What is ER diagram?


An entity-relationship diagram is a data modeling technique that creates a graphical
representation of the entities, and the relationships between entities, within an
information system.

15. What are the steps involved in creating in ERD?


a) Identify the entities.
b) Find the relationships
c) Identify key attributes for every entity
d) Draw the ERD.
16. Define tuple and attribute
Attributes: Name of Columns or
fields Tuple : Name of Rows or
records
17. Define the term Domain.
For each attribute there is a set of permitted values called the domain of that attribute.

18. What are referential integrity constraints?


A value that appears in one relation for a given set of attributes also appears for a certain
set of attributes in another relation.
1.54 Database Design and Management

19. What is the use of integrity constraints?


Integrity constraints ensure that changes made to the database by authorized users do not
result in a loss of data consistency. Thus integrity constraints guard against accidental
damage to the database.

PART - B

1. Compare File systems with database systems .


2. Explain all types of data models
3. Explain E-R Model concept and extended E-R model.
4. Explain the 3 schema architecture of DBS.Why do we need mappings between different
schema levels
5. Define generalization and aggregation. Demonstrate generalization and aggregation
using E-R diagram
6. Explain in detail Database Environment
7. Explain in detail Database System Development Lifecycle
8. Explain in detail Requirements collection
9. Explain in detail Database Design
10. Explain in detail UML Diagrams.

PART - C

1. Draw an E-R diagram for

a. Banking system

b. Airline Reservation system

c. Payroll system
2. Construct an ER diagram for a car-insurance company whose customers own one
or more cars each. Each car has associated with its zero to any number of record
accidents. State any assumptions you make.
3. Construct an ER diagram for university registrar’s office. The office maintains data
about each class, including the instructor, the enrollment and the time and place of the
class meetings. For each student class pair a grade is recorded.Determine the entities and
relationships.
4. Construct an ER diagram for a car insurance company that has a set of customers, each
of whom owns one/more cars. Each car has associated with it zero to any number of recorded
accidents
5. Draw an E-R diagram for a small marketing company database, assuming your own
data requirements
Database Design and Management
55.1

55

You might also like