0% found this document useful (0 votes)
127 views104 pages

Introduction To DBMS

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
127 views104 pages

Introduction To DBMS

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 104

LECTURER PROFILE

 Lecturer: Nguyen Thon Da Ph.D.


 Contact:
 Email: dant@uel.edu.vn
 Hotline: 0906230232
 Facebook: https://facebook.com/daphd.net
 Studying and Teaching:
 Received Bachelor’s Degree in Information Systems (2004)
 Completed Military Service: A soldier in Information and
Communication at Air Force Air Defense (From Feb-2005 to Sep-2006)
 Work at University of Economics and Law: Since 2007
 Received Master’s Degree in Computer Science (2013)
 Received Doctor of Philosophy in Information Systems (2021)
 Research papers:
https://scholar.google.com/citations?user=8nIkBCkAAAAJ
 Further information:
http://teacher.uelstore.com/Home/Detail?email=dant@uel.edu.vn

1
Chapter 1

Introduction to Databases

Pearson Education 2
Chapter 1 - Objectives

Some common uses of database systems.


Characteristics of file-based systems.
Problems with file-based approach.
Meaning of the term database.
Meaning of the term Database
Management System (DBMS).

Pearson Education 3
Chapter 1 - Objectives
Typical functions of a DBMS.
Major components of the DBMS
environment.
Personnel involved in the DBMS
environment.
History of the development of DBMSs.
Advantages and disadvantages of DBMSs.

Pearson Education 4
Examples of Database Applications

Purchases from the supermarket


Purchases using your credit card
Booking a holiday at the travel agents
Using the local library
Taking out insurance
Renting a video
Using the Internet
Studying at university

Pearson Education 5
File-Based Systems

Collection of application programs that


perform services for the end users (e.g.
reports).
Each program defines and manages its
own data.

Pearson Education 6
File-Based Processing

Pearson Education 7
Limitations of File-Based Approach

Separation and isolation of data


Each program maintains its own set of data.
Users of one program may be unaware of
potentially useful data held by other programs.

Duplication of data
Same data is held by different programs.
Wasted space and potentially different values
and/or different formats for the same item.

Pearson Education 8
Limitations of File-Based Approach

Data dependence
File structure is defined in the program code.
Incompatible file formats
Programs are written in different languages,
and so cannot easily access each other’s files.
Fixed Queries/Proliferation of application
programs
Programs are written to satisfy particular
functions.
Any new requirement needs a new program.

Pearson Education 9
Limitations of File-Based Approach
Arose because:
Definition of data was embedded in
application programs, rather than being
stored separately and independently.
No control over access and manipulation of
data beyond that imposed by application
programs.

Result:
the database and Database Management
System (DBMS).

Pearson Education 10
Database
Shared collection of logically related data
(and a description of this data), designed to
meet the information needs of an
organization.
System catalog (metadata) provides
description of data to enable program–data
independence.
Logically related data comprises entities,
attributes, and relationships of an
organization’s information.
Pearson Education 11
Database Management System (DBMS)

A software system that enables users to


define, create, maintain, and control
access to the database.

(Database) application program: a


computer program that interacts with
database by issuing an appropriate
request (SQL statement) to the DBMS.

Pearson Education 12
Database Management System (DBMS)

Pearson Education 13
Database Approach

Data definition language (DDL).


Permits specification of data types,
structures and any data constraints.
All specifications are stored in the database.

Data manipulation language (DML).


General enquiry facility (query language) of
the data.

Pearson Education 14
Database Approach

Controlled access to database may


include:
a security system
an integrity system
a concurrency control system
a recovery control system
a user-accessible catalog.

Pearson Education 15
Views

Allows each user to have his or her own


view of the database.

A view is essentially some subset of the


database.

Pearson Education 16
Views - Benefits
Reduce complexity
Provide a level of security
Provide a mechanism to customize the
appearance of the database
Present a consistent, unchanging picture
of the structure of the database, even if
the underlying database is changed

Pearson Education 17
Components of DBMS Environment

Pearson Education 18
Components of DBMS Environment

Hardware
Can range from a PC to a network of
computers.
Software
DBMS, operating system, network software (if
necessary) and also the application programs.
Data
Used by the organization and a description
of this data called the schema.

Pearson Education 19
Components of DBMS Environment

Procedures
Instructions and rules that should be applied to
the design and use of the database and DBMS.
People

Pearson Education 20
Roles in the Database Environment
Data Administrator (DA)
Database Administrator (DBA)
Database Designers (Logical and Physical)
Application Programmers
End Users (naive and sophisticated)

Pearson Education 21
History of Database Systems

First-generation
Hierarchical and Network

Second generation
Relational

Third generation
Object-Relational
Object-Oriented

Pearson Education 22
Advantages of DBMSs

Control of data redundancy


Data consistency
More information from the same amount
of data
Sharing of data
Improved data integrity
Improved security
Enforcement of standards
Economy of scale

Pearson Education 23
Advantages of DBMSs

Balance conflicting requirements


Improved data accessibility and
responsiveness
Increased productivity
Improved maintenance through data
independence
Increased concurrency
Improved backup and recovery services

Pearson Education 24
Disadvantages of DBMSs

Complexity
Size
Cost of DBMS
Additional hardware costs
Cost of conversion
Performance
Higher impact of a failure

Pearson Education 25
Chapter 2

Database Environment

Pearson Education 26
Chapter 2 - Objectives

Purpose of three-level database architecture.


Contents of external, conceptual, and internal levels.
Purpose of external/conceptual and
conceptual/internal mappings.
Meaning of logical and physical data independence.
Distinction between DDL and DML.
A classification of data models.

Pearson Education 27
Chapter 2 - Objectives

Purpose of three-level database architecture.


Contents of external, conceptual, and internal levels.
Purpose of external/conceptual and
conceptual/internal mappings.
Meaning of logical and physical data independence.
Distinction between DDL and DML.
A classification of data models.

Pearson Education 28
Chapter 2 - Objectives
Purpose/importance of conceptual modeling.
Typical functions and services a DBMS
should provide.
Function and importance of system catalog.
Software components of a DBMS.
Meaning of client–server architecture and
advantages of this type of architecture for a
DBMS.
Function and uses of Transaction Processing
Monitors.
Pearson Education 29
Chapter 2 - Objectives

Purpose of three-level database architecture.


Contents of external, conceptual, and internal levels.
Purpose of external/conceptual and
conceptual/internal mappings.
Meaning of logical and physical data independence.
Distinction between DDL and DML.
A classification of data models.

Pearson Education 30
Objectives of Three-Level Architecture

All users should be able to access same


data.

A user’s view is immune to changes made


in other views.

Users should not need to know physical


database storage details.

Pearson Education 31
Objectives of Three-Level Architecture
DBA should be able to change database
storage structures without affecting the
users’ views.

Internal structure of database should be


unaffected by changes to physical aspects
of storage.

DBA should be able to change conceptual


structure of database without affecting all
users.
Pearson Education 32
ANSI-SPARC Three-Level Architecture

Pearson Education 33
ANSI-SPARC Three-Level Architecture
External Level
Users’ view of the database.
Describes that part of database that is
relevant to a particular user.

Conceptual Level
Community view of the database.
Describes what data is stored in database
and relationships among the data.

Pearson Education 34
ANSI-SPARC Three-Level Architecture

Internal Level
Physical representation of the database on
the computer.
Describes how the data is stored in the
database.

Pearson Education 35
Differences between Three Levels
of ANSI-SPARC Architecture

Pearson Education 36
Data Independence
Logical Data Independence
Refers to immunity of external schemas to
changes in conceptual schema.
Conceptual schema changes (e.g.
addition/removal of entities).
Should not require changes to external
schema or rewrites of application programs.

Pearson Education 37
Data Independence
Physical Data Independence
Refers to immunity of conceptual schema to
changes in the internal schema.
Internal schema changes (e.g. using different
file organizations, storage structures/devices).
Should not require change to conceptual or
external schemas.

Pearson Education 38
Data Independence and the ANSI-
SPARC Three-Level Architecture

Pearson Education 39
Database Languages
Data Definition Language (DDL)
Allows the DBA or user to describe and
name entities, attributes, and relationships
required for the application
plus any associated integrity and security
constraints.

Pearson Education 40
Database Languages
Data Manipulation Language (DML)
Provides basic data manipulation operations
on data held in the database.
Procedural DML
allows user to tell system exactly how to
manipulate data.
Non-Procedural DML
allows user to state what data is needed
rather than how it is to be retrieved.
Fourth Generation Languages (4GLs)
Pearson Education 41
Data Model
Integrated collection of concepts for
describing data, relationships between
data, and constraints on the data in an
organization.

Data Model comprises:


a structural part, consisting of a set of rules
according to which databases can be constructed ;
a manipulative part, defining the types of
operation that are allowed on the data;
a set of integrity rules, ensures that the data is
accurate.
Pearson Education 42
Data Model
Purpose
To represent data in an understandable way.

Categories of data models include:


Object-based (use concepts such as entities, attributes, and
relationships)
Record-based (database consists of a number of fixed-
format records)
Physical (describe how data is stored in the computer,
representing information such as record structures, record
orderings, and access paths.).

Pearson Education 43
Data Models
Object-Based Data Models
Entity-Relationship
Semantic
Functional
Object-Oriented.

Record-Based Data Models


Relational Data Model
Network Data Model
Hierarchical Data Model.

Physical Data Models


Pearson Education 44
Relational Data Model

Pearson Education 45
Network Data Model

Pearson Education 46
Hierarchical Data Model

Pearson Education 47
Conceptual Modeling
Conceptual schema is the core of a system
supporting all user views.
Should be complete and accurate
representation of an organization’s data
requirements.
Conceptual modeling is process of
developing a model of information use that
is independent of implementation details.
Result is a conceptual data model.
Pearson Education 48
Functions of a DBMS
Data Storage, Retrieval, and Update.

A User-Accessible Catalog.

Transaction Support.

Concurrency Control Services.

Recovery Services.
Pearson Education 49
Functions of a DBMS
Authorization Services.

Support for Data Communication.

Integrity Services.

Services to Promote Data Independence.

Utility Services.
Pearson Education 50
Chapter 3

Database Architectures and the Web


Transparencies

51
Database Environment - Objectives
The meaning of the client–server
architecture and the advantages of this
type of architecture for a DBMS.
The difference between two-tier, three-
tier and n-tier client–server architectures.
About cloud computing and data as a
service (DaaS) and database as a service
(DBaaS).
Software components of a DBMS.

52
Database Environment - Objectives
The purpose of a Web service and the
technological standards used to develop a
Web service.
The meaning of service-oriented architecture
(SOA).
The difference between distributed DBMSs,
and distributed processing.
The architecture of a data warehouse.
About cloud computing and cloud databases.
The software components of a DBMS.
53
Multi-user DBMS Architectures

The common architectures that are used to


implement multi-user database
management systems:
Teleprocessing
File-Server
Client-Server

54
File-Server
File-server is connected to several
workstations across a network.
Database resides on file-server.
DBMS and applications run on each
workstation.
Disadvantages include:
Significant network traffic.
Copy of DBMS on each workstation.
Concurrency, recovery and integrity control more
complex.

55
Teleprocessing
One computer with a single CPU and a number
of terminals.
Processing performed within the same physical
computer. User terminals are typically “dumb”,
incapable of functioning on their own, and
cabled to the central computer.

56
File-Server Architecture
In a file-server environment, the processing is
distributed about the network, typically a
local area network (LAN).

57
Traditional Two-Tier Client-Server
Client (tier 1) manages user interface and
runs applications.
Server (tier 2) holds database and DBMS.
Advantages include:
wider access to existing databases;
increased performance;
possible reduction in hardware costs;
reduction in communication costs;
increased consistency.

58
Traditional Two-Tier Client-Server

59
Alternative Client-Server Topologies

60
Traditional Two-Tier Client-Server
Summary of Client-Server
Functions

62
Three-Tier Client-Server
The need for enterprise scalability
challenged the traditional two-tier client–
server model.

Client side presented two problems


preventing true scalability:
‘Fat’ client, requiring considerable resources on
client’s computer to run effectively.
Significant client side administration overhead.

By 1995, three layers proposed, each


potentially running on a different platform.
63
Three-Tier Client-Server
Advantages:
‘Thin’ client, requiring less expensive hardware.
Application maintenance centralized.
Easier to modify or replace one tier without affecting
others.
Separating business logic from database functions
makes it easier to implement load balancing.
Maps quite naturally to Web environment.

64
Three-Tier Client-Server

65
n-Tier Client-Server (e.g. 4-Tier)
The three-tier
architecture can be
expanded to n tiers, with
additional tiers providing
more flexibility and
scalability.
Applications servers
host API to expose
business logic and
business processes for
use by other
applications.
Middleware
Middleware is a generic term used to
describe software that mediates with
other software and allows for
communication between disparate
applications in a heterogeneous system.

The need for middleware arises when


distributed systems become too complex
to manage efficiently without a common
interface.

67
Cloud Computing
The National Institute of Standards and
Technology (NIST) provided a definition.

Defined as “A model for enabling


ubiquitous, convenient, on-demand
network access to a shared pool of
configurable computing resources (e.g.
networks, servers, storage, applications,
and services) that can be rapidly
provisioned and released with minimal
management effort or service provider
interaction.” 68
Transaction Processing Monitors
TP monitor is a program that controls
data transfer between clients and servers
in order to provide a consistent
environment, particularly for online
transaction processing (OLTP).

69
Transaction Processing Monitor as
middle tier of 3-tier client-server

70
Web Services and Service-
Oriented Architectures
Web service is a software system designed
to support interoperable machine-to-web
service machine interaction over a
network.
Web services share business logic, data,
and processes through a programmatic
interface across a network.
Developers can add the Web service to a
Web page (or an executable program) to
offer specific functionality to users.
71
Web Services and Service-
Oriented Architectures
Web services approach uses accepted
technologies and standards, such as:
XML (extensible Markup Language).
SOAP (Simple Object Access Protocol) is a
communication protocol for exchanging
structured information over the Internet and
uses a message format based on XML. It is
both platform- and language-independent.
WSDL (Web Services Description Language)
protocol, again based on XML, is used to
describe and locate a Web service.
72
Web Services and Service-
Oriented Architectures
UDDI (Universal Discovery, Description, and
Integration) protocol is a platform
independent, XML-based registry for
businesses to list themselves on the Internet.

73
Service-Oriented Architectures
(SOA)
A business-centric software architecture
for building applications that implement
business processes as sets of services
published at a granularity relevant to the
service consumer. Services can be invoked,
published, and discovered, and are
abstracted away from the implementation
using a single standards-based form of
interface.

74
Distributed DBMSs
A distributed database is a logically
interrelated collection of shared data
(and a description of this data), physically
distributed over a computer network.

A distributed DBMS is the software


system that permits the management of
the distributed database and makes the
distribution transparent to users.

75
Distributed DBMSs
A DDBMS consists of a single logical
database split into a number of fragments.
Each fragment is stored on one or more
computers (replicas) under the control of a
separate DBMS, with the computers
connected by a network.
Each site is capable of independently
processing user requests that require
access to local data (that is, each site has
some degree of local autonomy) and is also
capable of processing data stored on other
computers in the network. 76
Data Warehousing
A data warehouse was deemed the solution
to meet the requirements of a system
capable of supporting decision making,
receiving data from multiple operational
data sources.

77
Cloud Computing
The National Institute of Standards and
Technology (NIST) provided a definition.

Defined as “A model for enabling


ubiquitous, convenient, on-demand
network access to a shared pool of
configurable computing resources (e.g.
networks, servers, storage, applications,
and services) that can be rapidly
provisioned and released with minimal
management effort or service provider
interaction.” 78
Cloud Computing – Key
Characteristics
On-demand self-service
Consumers can obtain, configure and deploy
cloud services without help from provider.

Broad network access


Accessible from anywhere, from any
standardized platform (e.g. desktop computers,
laptops, mobile devices).

79
Cloud Computing – Key
Characteristics
Resource pooling
Provider’s computing resources are pooled
to serve multiple consumers, with different
physical and virtual resources dynamically
assigned and reassigned according to
consumer demand. Examples of resources
include storage, processing, memory, and
network bandwidth.

80
Cloud Computing – Key
Characteristics
Rapid elasticity
Provider’s capacity caters for customer’s
spikes in demand and reduces risk of
outages and service interruptions. Capacity
can be automated to scale rapidly based on
demand.
Measured service
Provider uses a metering capability to
measure usage of service (e.g. storage,
processing, bandwidth, and active user
accounts).
81
Cloud Computing – Service
Models
Software as a Service (SaaS):
Software and data hosted on cloud. Accessed
through using thin client interface (e.g. web
browser). Consumer may be offered limited
user specific application configuration
settings.
Examples include Salesforce.com sales
management applications, NetSuite’s
integrated business management software,
Google’s Gmail and Cornerstone OnDemand.

82
Cloud Computing – Service
Models
Platform as a Service (PaaS)
Allows creation of web applications without
buying/maintaining the software and
underlying infrastructure. Provider manages
the infrastructure including network, servers,
OS and storage, while customer controls
deployment of applications and possibly
configuration.
Examples include Salesforce.com’s Force.com,
Google’s App Engine, and Microsoft’s Azure.

83
Cloud Computing – Service
Models
Infrastructure as a Service (IaaS)
Provider’s offer servers, storage, network
and operating systems – typically a platform
virtualization environment – to consumers as
an on-demand service, in a single bundle and
billed according to usage.
A popular use of IaaS is in hosting websites.
Examples Amazon’s Elastic Compute Cloud
(EC2), Rackspace and GoGrid.

84
Cloud Computing – Comparison
of Services Models

85
Benefits of Cloud Computing
Cost-Reduction: Avoid up-front capital expenditure.
Scalability/Agility: Organisations set up resources
on an as-needs basis.
Improved Security: Providers can devote expertise
& resources to security; not affordable by
customer.
Improved Reliability: Providers can devote
expertise & resources on reliability of systems; not
affordable by customer.
Access to new technologies: Through use of
provider’s systems, customers may access latest
technology.
86
Benefits of Cloud Computing
Faster development: Provider’s platforms can
provide many of the core services to accelerate
development cycle.
Large scale prototyping/load testing: Providers
have the resources to enable this.
More flexible working practices: Staff can access
files using mobile devices.
Increased competitiveness: Allows organizations to
focus on their core competencies rather than their
IT infrastructures.

87
Risks of Cloud Computing
Network Dependency: Power outages, bandwidth
issues and service interruptions.
System Dependency: Customer’s dependency on
availability and reliability of provider’s systems.
Cloud Provider Dependency: Provider could
became insolvent or acquired by competitor,
resulting in the service suddenly terminating.
Lack of control: Customers unable to deploy
technical or organisational measures to safeguard
the data. May result in reduced availability,
integrity, confidentiality, intervenability and
isolation.
Lack of information on processing transparency
88
Cloud-based database solutions
As a type of Software as a Service (SaaS),
cloud-based database solutions fall into
two basic categories:
Data as a Service (DaaS) and
Database as a Service (DBaaS).

Key difference between the two options


is mainly how the data is managed.

89
Cloud-based database solutions
DBaaS
Offers full database functionality to application
developers.
Provides a management layer that provides
continuous monitoring and configuring of the
database to optimized scaling, high availability,
multi-tenancy (that is, serving multiple client
organizations), and effective resource allocation in
the cloud, thereby sparing the developer from
ongoing database administration tasks.

90
Cloud-based database solutions
DaaS:
Services enables data definition in the
cloud and subsequently querying.
Does not implement typical DBMS
interfaces (e.g. SQL) but instead data is
accessed via common APIs.
Enables organization with valuable data to
offer access to others. Examples Urban
Mapping (geography data service), Xignite
(financial data service) and Hoovers
(business data service.)

91
Cloud-based database solutions
Multi-tenant cloud database-shared server,
separate database server process
architecture.

92
Cloud-based database solutions
Multi-tenant cloud database-shared
DBMS server, separate databases.

93
Cloud-based database solutions
Multi-tenant cloud database-shared
DBMS server, separate databases.

94
Cloud-based database solutions
Multi-tenant cloud database–shared
database, separate schema architecture.

95
Components of a DBMS
A DBMS is partitioned into several
software components (or modules), each
of which is assigned a specific operation.
As stated previously, some of the
functions of the DBMS are supported by
the underlying operating system.
The DBMS interfaces with other software
components, such as user queries and
access methods (file management
techniques for storing and retrieving data
records).
96
Components of a DBMS

97
Components of a DBMS
(Continued)
Query processor is a major DBMS
component that transforms queries into
a series of low-level instructions directed
to the database manager.
Database manager (DM) interfaces with
user-submitted application programs and
queries. The DM examines the external
and conceptual schemas to determine
what conceptual records are required to
satisfy the request. The DM then places a
call to the file manager to perform the
request. 98
Components of a DBMS
(Continued)
File manager manipulates the underlying
storage files and manages the allocation
of storage space on disk. It establishes
and maintains the list of structures and
indexes defined in the internal schema.
DML preprocessor converts DML
statements embedded in an application
program into standard function calls in
the host language. The DML preprocessor
must interact with the query processor to
generate the appropriate code. 99
Components of a DBMS
(Continued)
DDL compiler converts DDL statements
into a set of tables containing metadata.
These tables are then stored in the
system catalog while control information
is stored in data file headers.
Catalog manager manages access to and
maintains the system catalog. The system
catalog is accessed by most DBMS
components.

100
Components of Database Manager
(DM)

101
Components of the Database
Manager
Authorization control to confirm whether
the user has the necessary permission to
carry out the required operation.
Command processor on confirmation of
user authority, control is passed to the
command processor.
Integrity checker ensures that requested
operation satisfies all necessary integrity
constraints (e.g. key constraints) for an
operation that changes the database.
102
Components of the Database
Manager (Continued)
Query optimizer determines an optimal
strategy for the query execution.
Transaction manager performs the
required processing of operations that it
receives from transactions.
Scheduler ensures that concurrent
operations on the database proceed
without conflicting with one another. It
controls the relative order in which
transaction operations are executed.
103
Components of the Database
Manager (Continued)
Recovery manager ensures that the
database remains in a consistent state in
the presence of failures. It is responsible for
transaction commit and abort.
Buffer manager responsible for the transfer
of data between main memory and
secondary storage, such as disk and tape.
The recovery manager and the buffer
manager also known as (aka) the data
manager. The buffer manager is known the
cache manager. 104

You might also like