Introduction To DBMS
Introduction To DBMS
1
Chapter 1
Introduction to Databases
Pearson Education 2
Chapter 1 - Objectives
Pearson Education 3
Chapter 1 - Objectives
Typical functions of a DBMS.
Major components of the DBMS
environment.
Personnel involved in the DBMS
environment.
History of the development of DBMSs.
Advantages and disadvantages of DBMSs.
Pearson Education 4
Examples of Database Applications
Pearson Education 5
File-Based Systems
Pearson Education 6
File-Based Processing
Pearson Education 7
Limitations of File-Based Approach
Duplication of data
Same data is held by different programs.
Wasted space and potentially different values
and/or different formats for the same item.
Pearson Education 8
Limitations of File-Based Approach
Data dependence
File structure is defined in the program code.
Incompatible file formats
Programs are written in different languages,
and so cannot easily access each other’s files.
Fixed Queries/Proliferation of application
programs
Programs are written to satisfy particular
functions.
Any new requirement needs a new program.
Pearson Education 9
Limitations of File-Based Approach
Arose because:
Definition of data was embedded in
application programs, rather than being
stored separately and independently.
No control over access and manipulation of
data beyond that imposed by application
programs.
Result:
the database and Database Management
System (DBMS).
Pearson Education 10
Database
Shared collection of logically related data
(and a description of this data), designed to
meet the information needs of an
organization.
System catalog (metadata) provides
description of data to enable program–data
independence.
Logically related data comprises entities,
attributes, and relationships of an
organization’s information.
Pearson Education 11
Database Management System (DBMS)
Pearson Education 12
Database Management System (DBMS)
Pearson Education 13
Database Approach
Pearson Education 14
Database Approach
Pearson Education 15
Views
Pearson Education 16
Views - Benefits
Reduce complexity
Provide a level of security
Provide a mechanism to customize the
appearance of the database
Present a consistent, unchanging picture
of the structure of the database, even if
the underlying database is changed
Pearson Education 17
Components of DBMS Environment
Pearson Education 18
Components of DBMS Environment
Hardware
Can range from a PC to a network of
computers.
Software
DBMS, operating system, network software (if
necessary) and also the application programs.
Data
Used by the organization and a description
of this data called the schema.
Pearson Education 19
Components of DBMS Environment
Procedures
Instructions and rules that should be applied to
the design and use of the database and DBMS.
People
Pearson Education 20
Roles in the Database Environment
Data Administrator (DA)
Database Administrator (DBA)
Database Designers (Logical and Physical)
Application Programmers
End Users (naive and sophisticated)
Pearson Education 21
History of Database Systems
First-generation
Hierarchical and Network
Second generation
Relational
Third generation
Object-Relational
Object-Oriented
Pearson Education 22
Advantages of DBMSs
Pearson Education 23
Advantages of DBMSs
Pearson Education 24
Disadvantages of DBMSs
Complexity
Size
Cost of DBMS
Additional hardware costs
Cost of conversion
Performance
Higher impact of a failure
Pearson Education 25
Chapter 2
Database Environment
Pearson Education 26
Chapter 2 - Objectives
Pearson Education 27
Chapter 2 - Objectives
Pearson Education 28
Chapter 2 - Objectives
Purpose/importance of conceptual modeling.
Typical functions and services a DBMS
should provide.
Function and importance of system catalog.
Software components of a DBMS.
Meaning of client–server architecture and
advantages of this type of architecture for a
DBMS.
Function and uses of Transaction Processing
Monitors.
Pearson Education 29
Chapter 2 - Objectives
Pearson Education 30
Objectives of Three-Level Architecture
Pearson Education 31
Objectives of Three-Level Architecture
DBA should be able to change database
storage structures without affecting the
users’ views.
Pearson Education 33
ANSI-SPARC Three-Level Architecture
External Level
Users’ view of the database.
Describes that part of database that is
relevant to a particular user.
Conceptual Level
Community view of the database.
Describes what data is stored in database
and relationships among the data.
Pearson Education 34
ANSI-SPARC Three-Level Architecture
Internal Level
Physical representation of the database on
the computer.
Describes how the data is stored in the
database.
Pearson Education 35
Differences between Three Levels
of ANSI-SPARC Architecture
Pearson Education 36
Data Independence
Logical Data Independence
Refers to immunity of external schemas to
changes in conceptual schema.
Conceptual schema changes (e.g.
addition/removal of entities).
Should not require changes to external
schema or rewrites of application programs.
Pearson Education 37
Data Independence
Physical Data Independence
Refers to immunity of conceptual schema to
changes in the internal schema.
Internal schema changes (e.g. using different
file organizations, storage structures/devices).
Should not require change to conceptual or
external schemas.
Pearson Education 38
Data Independence and the ANSI-
SPARC Three-Level Architecture
Pearson Education 39
Database Languages
Data Definition Language (DDL)
Allows the DBA or user to describe and
name entities, attributes, and relationships
required for the application
plus any associated integrity and security
constraints.
Pearson Education 40
Database Languages
Data Manipulation Language (DML)
Provides basic data manipulation operations
on data held in the database.
Procedural DML
allows user to tell system exactly how to
manipulate data.
Non-Procedural DML
allows user to state what data is needed
rather than how it is to be retrieved.
Fourth Generation Languages (4GLs)
Pearson Education 41
Data Model
Integrated collection of concepts for
describing data, relationships between
data, and constraints on the data in an
organization.
Pearson Education 43
Data Models
Object-Based Data Models
Entity-Relationship
Semantic
Functional
Object-Oriented.
Pearson Education 45
Network Data Model
Pearson Education 46
Hierarchical Data Model
Pearson Education 47
Conceptual Modeling
Conceptual schema is the core of a system
supporting all user views.
Should be complete and accurate
representation of an organization’s data
requirements.
Conceptual modeling is process of
developing a model of information use that
is independent of implementation details.
Result is a conceptual data model.
Pearson Education 48
Functions of a DBMS
Data Storage, Retrieval, and Update.
A User-Accessible Catalog.
Transaction Support.
Recovery Services.
Pearson Education 49
Functions of a DBMS
Authorization Services.
Integrity Services.
Utility Services.
Pearson Education 50
Chapter 3
51
Database Environment - Objectives
The meaning of the client–server
architecture and the advantages of this
type of architecture for a DBMS.
The difference between two-tier, three-
tier and n-tier client–server architectures.
About cloud computing and data as a
service (DaaS) and database as a service
(DBaaS).
Software components of a DBMS.
52
Database Environment - Objectives
The purpose of a Web service and the
technological standards used to develop a
Web service.
The meaning of service-oriented architecture
(SOA).
The difference between distributed DBMSs,
and distributed processing.
The architecture of a data warehouse.
About cloud computing and cloud databases.
The software components of a DBMS.
53
Multi-user DBMS Architectures
54
File-Server
File-server is connected to several
workstations across a network.
Database resides on file-server.
DBMS and applications run on each
workstation.
Disadvantages include:
Significant network traffic.
Copy of DBMS on each workstation.
Concurrency, recovery and integrity control more
complex.
55
Teleprocessing
One computer with a single CPU and a number
of terminals.
Processing performed within the same physical
computer. User terminals are typically “dumb”,
incapable of functioning on their own, and
cabled to the central computer.
56
File-Server Architecture
In a file-server environment, the processing is
distributed about the network, typically a
local area network (LAN).
57
Traditional Two-Tier Client-Server
Client (tier 1) manages user interface and
runs applications.
Server (tier 2) holds database and DBMS.
Advantages include:
wider access to existing databases;
increased performance;
possible reduction in hardware costs;
reduction in communication costs;
increased consistency.
58
Traditional Two-Tier Client-Server
59
Alternative Client-Server Topologies
60
Traditional Two-Tier Client-Server
Summary of Client-Server
Functions
62
Three-Tier Client-Server
The need for enterprise scalability
challenged the traditional two-tier client–
server model.
64
Three-Tier Client-Server
65
n-Tier Client-Server (e.g. 4-Tier)
The three-tier
architecture can be
expanded to n tiers, with
additional tiers providing
more flexibility and
scalability.
Applications servers
host API to expose
business logic and
business processes for
use by other
applications.
Middleware
Middleware is a generic term used to
describe software that mediates with
other software and allows for
communication between disparate
applications in a heterogeneous system.
67
Cloud Computing
The National Institute of Standards and
Technology (NIST) provided a definition.
69
Transaction Processing Monitor as
middle tier of 3-tier client-server
70
Web Services and Service-
Oriented Architectures
Web service is a software system designed
to support interoperable machine-to-web
service machine interaction over a
network.
Web services share business logic, data,
and processes through a programmatic
interface across a network.
Developers can add the Web service to a
Web page (or an executable program) to
offer specific functionality to users.
71
Web Services and Service-
Oriented Architectures
Web services approach uses accepted
technologies and standards, such as:
XML (extensible Markup Language).
SOAP (Simple Object Access Protocol) is a
communication protocol for exchanging
structured information over the Internet and
uses a message format based on XML. It is
both platform- and language-independent.
WSDL (Web Services Description Language)
protocol, again based on XML, is used to
describe and locate a Web service.
72
Web Services and Service-
Oriented Architectures
UDDI (Universal Discovery, Description, and
Integration) protocol is a platform
independent, XML-based registry for
businesses to list themselves on the Internet.
73
Service-Oriented Architectures
(SOA)
A business-centric software architecture
for building applications that implement
business processes as sets of services
published at a granularity relevant to the
service consumer. Services can be invoked,
published, and discovered, and are
abstracted away from the implementation
using a single standards-based form of
interface.
74
Distributed DBMSs
A distributed database is a logically
interrelated collection of shared data
(and a description of this data), physically
distributed over a computer network.
75
Distributed DBMSs
A DDBMS consists of a single logical
database split into a number of fragments.
Each fragment is stored on one or more
computers (replicas) under the control of a
separate DBMS, with the computers
connected by a network.
Each site is capable of independently
processing user requests that require
access to local data (that is, each site has
some degree of local autonomy) and is also
capable of processing data stored on other
computers in the network. 76
Data Warehousing
A data warehouse was deemed the solution
to meet the requirements of a system
capable of supporting decision making,
receiving data from multiple operational
data sources.
77
Cloud Computing
The National Institute of Standards and
Technology (NIST) provided a definition.
79
Cloud Computing – Key
Characteristics
Resource pooling
Provider’s computing resources are pooled
to serve multiple consumers, with different
physical and virtual resources dynamically
assigned and reassigned according to
consumer demand. Examples of resources
include storage, processing, memory, and
network bandwidth.
80
Cloud Computing – Key
Characteristics
Rapid elasticity
Provider’s capacity caters for customer’s
spikes in demand and reduces risk of
outages and service interruptions. Capacity
can be automated to scale rapidly based on
demand.
Measured service
Provider uses a metering capability to
measure usage of service (e.g. storage,
processing, bandwidth, and active user
accounts).
81
Cloud Computing – Service
Models
Software as a Service (SaaS):
Software and data hosted on cloud. Accessed
through using thin client interface (e.g. web
browser). Consumer may be offered limited
user specific application configuration
settings.
Examples include Salesforce.com sales
management applications, NetSuite’s
integrated business management software,
Google’s Gmail and Cornerstone OnDemand.
82
Cloud Computing – Service
Models
Platform as a Service (PaaS)
Allows creation of web applications without
buying/maintaining the software and
underlying infrastructure. Provider manages
the infrastructure including network, servers,
OS and storage, while customer controls
deployment of applications and possibly
configuration.
Examples include Salesforce.com’s Force.com,
Google’s App Engine, and Microsoft’s Azure.
83
Cloud Computing – Service
Models
Infrastructure as a Service (IaaS)
Provider’s offer servers, storage, network
and operating systems – typically a platform
virtualization environment – to consumers as
an on-demand service, in a single bundle and
billed according to usage.
A popular use of IaaS is in hosting websites.
Examples Amazon’s Elastic Compute Cloud
(EC2), Rackspace and GoGrid.
84
Cloud Computing – Comparison
of Services Models
85
Benefits of Cloud Computing
Cost-Reduction: Avoid up-front capital expenditure.
Scalability/Agility: Organisations set up resources
on an as-needs basis.
Improved Security: Providers can devote expertise
& resources to security; not affordable by
customer.
Improved Reliability: Providers can devote
expertise & resources on reliability of systems; not
affordable by customer.
Access to new technologies: Through use of
provider’s systems, customers may access latest
technology.
86
Benefits of Cloud Computing
Faster development: Provider’s platforms can
provide many of the core services to accelerate
development cycle.
Large scale prototyping/load testing: Providers
have the resources to enable this.
More flexible working practices: Staff can access
files using mobile devices.
Increased competitiveness: Allows organizations to
focus on their core competencies rather than their
IT infrastructures.
87
Risks of Cloud Computing
Network Dependency: Power outages, bandwidth
issues and service interruptions.
System Dependency: Customer’s dependency on
availability and reliability of provider’s systems.
Cloud Provider Dependency: Provider could
became insolvent or acquired by competitor,
resulting in the service suddenly terminating.
Lack of control: Customers unable to deploy
technical or organisational measures to safeguard
the data. May result in reduced availability,
integrity, confidentiality, intervenability and
isolation.
Lack of information on processing transparency
88
Cloud-based database solutions
As a type of Software as a Service (SaaS),
cloud-based database solutions fall into
two basic categories:
Data as a Service (DaaS) and
Database as a Service (DBaaS).
89
Cloud-based database solutions
DBaaS
Offers full database functionality to application
developers.
Provides a management layer that provides
continuous monitoring and configuring of the
database to optimized scaling, high availability,
multi-tenancy (that is, serving multiple client
organizations), and effective resource allocation in
the cloud, thereby sparing the developer from
ongoing database administration tasks.
90
Cloud-based database solutions
DaaS:
Services enables data definition in the
cloud and subsequently querying.
Does not implement typical DBMS
interfaces (e.g. SQL) but instead data is
accessed via common APIs.
Enables organization with valuable data to
offer access to others. Examples Urban
Mapping (geography data service), Xignite
(financial data service) and Hoovers
(business data service.)
91
Cloud-based database solutions
Multi-tenant cloud database-shared server,
separate database server process
architecture.
92
Cloud-based database solutions
Multi-tenant cloud database-shared
DBMS server, separate databases.
93
Cloud-based database solutions
Multi-tenant cloud database-shared
DBMS server, separate databases.
94
Cloud-based database solutions
Multi-tenant cloud database–shared
database, separate schema architecture.
95
Components of a DBMS
A DBMS is partitioned into several
software components (or modules), each
of which is assigned a specific operation.
As stated previously, some of the
functions of the DBMS are supported by
the underlying operating system.
The DBMS interfaces with other software
components, such as user queries and
access methods (file management
techniques for storing and retrieving data
records).
96
Components of a DBMS
97
Components of a DBMS
(Continued)
Query processor is a major DBMS
component that transforms queries into
a series of low-level instructions directed
to the database manager.
Database manager (DM) interfaces with
user-submitted application programs and
queries. The DM examines the external
and conceptual schemas to determine
what conceptual records are required to
satisfy the request. The DM then places a
call to the file manager to perform the
request. 98
Components of a DBMS
(Continued)
File manager manipulates the underlying
storage files and manages the allocation
of storage space on disk. It establishes
and maintains the list of structures and
indexes defined in the internal schema.
DML preprocessor converts DML
statements embedded in an application
program into standard function calls in
the host language. The DML preprocessor
must interact with the query processor to
generate the appropriate code. 99
Components of a DBMS
(Continued)
DDL compiler converts DDL statements
into a set of tables containing metadata.
These tables are then stored in the
system catalog while control information
is stored in data file headers.
Catalog manager manages access to and
maintains the system catalog. The system
catalog is accessed by most DBMS
components.
100
Components of Database Manager
(DM)
101
Components of the Database
Manager
Authorization control to confirm whether
the user has the necessary permission to
carry out the required operation.
Command processor on confirmation of
user authority, control is passed to the
command processor.
Integrity checker ensures that requested
operation satisfies all necessary integrity
constraints (e.g. key constraints) for an
operation that changes the database.
102
Components of the Database
Manager (Continued)
Query optimizer determines an optimal
strategy for the query execution.
Transaction manager performs the
required processing of operations that it
receives from transactions.
Scheduler ensures that concurrent
operations on the database proceed
without conflicting with one another. It
controls the relative order in which
transaction operations are executed.
103
Components of the Database
Manager (Continued)
Recovery manager ensures that the
database remains in a consistent state in
the presence of failures. It is responsible for
transaction commit and abort.
Buffer manager responsible for the transfer
of data between main memory and
secondary storage, such as disk and tape.
The recovery manager and the buffer
manager also known as (aka) the data
manager. The buffer manager is known the
cache manager. 104