Introduction to Database
Systems
What Is a DBMS?
•A database is a col lection of interrelated
data.
• A database management system (DBMS) is a
col lection of programs that manages the
database structure and controls access to
the data stored in the database.
• A database system(DBS) : It is a combination
of database and database management
systems.
What Is a Database?
A Database is a col lection of data , typical ly
describing the activities of one or more
related organizations.
For example, a university database might
contain
Information about the fol lowing:
Entities such as students, faculty, courses, and classrooms.
Relationships between entities, such as students' enrol lment in
courses,
faculty teaching courses, and the use of rooms for courses.
A database management system, or DBMS, is
software designed to assist in maintaining
and uti lizing large col lections of data.
A database system is a high-level definition of
the structure and relationship between
stored data, a database or databases, users,
and the hardware or operating system used
for the storage.
Applications of
DBMS • Banking
• Education
• Business
• Airlines
• Universities
• Manufacturing and sel ling
• Human resources
Why should we use database ?
. It manages data efficiently and al lows users to
perform multiple tasks with ease.
A database management system stores,
organizes and manages a large amount of
information within a single software application.
Effective and efficient management of data
Query processing and management
Use of this system increases the efficiency of
business operations and reduces overal l costs.
Historical Perspective
Early 1960s
◦ Integrated data store, first general purpose
DBMS designed by Charles Bachman at
General Electric
◦ Formed basis for network data model
◦ Bachman received Turing Award in 1973 for
his work in database area
network data model
1. Record Types:
1. Book
2. Author
3. Publisher
2. Sets:
1. Consider a set called "BooksByAuthor," which contains
information about books written by an author.
2. Another set could be "PublishedByPublisher," which contains
books published by a specific publisher.
3. Record Instances:
1. We have records for individual books, authors, and
publishers. 4. Pointers:
1. Each record in the "BooksByAuthor" set has pointers to both
the author record and the book record.
2. Each record in the "PublishedByPublisher" set has pointers to
both the publisher record and the book record.
Network model
Historical Perspective
Late 1960s
◦ I BM developed Information Management System
(IMS), used even today in many major instal
lations
◦ IMS formed the basis for hierarchical data model ◦
American Airlines and I BM jointly developed SABRE
for making airline reservations.
◦ SABRE system is developed by American airlines
and I BM for making airlines reservations. ◦ SABRE
is used today to populate Web-based travel
services such as Travelocity
hierarchical data model
1. Record Types:
1. Department
2. Employee
2. Fields:
1. Department: DepartmentID, DepartmentName, Location
2. Employee: EmployeeID, EmployeeName, Position 3.
Hierarchical Structure:
1. Each department can have multiple employees, but
each employee reports to only one department
hierarchical data model
Historical Perspective
1970
◦ Edgar Codd, at I BM’s San Jose Research
Laboratory, proposed relational data model. ◦ It
sparked the rapid development of several DBMSs
based on relational model, along with a rich body of
theoretical results that placed the field on a firm
foundation.
◦ Codd won 1981 Turing Award.
◦ Database systems matured as an academic
discipline
◦ The benefits of DBMS were widely recognized, and
the use of DBMSs for managing corporate data
became standard practice.
Historical Perspective
1980s
◦ Relational data model consolidated its
position as dominant DBMS paradigm, and
database systems continued to gain
widespread use
◦ SQL query language, developed as part of I
BM’s System R project, is now the
standard query language
◦ SQL was standardized in late 1980s, and
current standard SQL:1999 was adopted
by ANSI and ISO
. Tables:
1. Student
2. Course
3. Enrollment
. Attributes (Columns):
1. Student: StudentID, FirstName, LastName,
DateOfBirth
2. Course: CourseID, CourseName, Instructor 3.
Enrollment: EnrollmentID, StudentID (Foreign Key),
CourseID (Foreign Key), EnrollmentDate
. Primary Keys and Foreign Keys:
1. StudentID is the primary key of the Student table.
2. CourseID is the primary key of the Course table.
3. EnrollmentID is the primary key of the Enrollment
table.
4. StudentID in the Enrollment table is a foreign key
referencing the Student table.
5. CourseID in the Enrollment table is a foreign key
Historical Perspective
Late 1980s ti l l 1990s
◦ Considerable research into more powerful query
language and richer data model, with emphasis
on supporting complex analysis of data from al
l parts of an enterprise
◦ Several vendors, e.g., I BM’s DB2, Oracle 8,
Informix UDS, extended their systems with the
abi lity to store new data types such as images
and text, and to ask more complex queries
◦ Data warehouses have been developed by many
vendors to consolidate data from several
databases, and for carrying out specialized
analysis
Softwares available
1. MySQL 8.0 latest version
2. PostgreSQL
3. SQLite
4. Microsoft SQL Server Express 5.
Oracle Database Express Edition (XE) 6.
MongoDB
Program-data dependence --- Three file
processing systems at Some Company
Fi le descriptions are stored within each application
program that
accesses a given fi le. Any change to a fi le
structure reuires chanes
File Systems vs DBMS
If data is large, we must store data in a storage
device such as a disk or tape and bring relevant
parts into main memory for processing as needed .
Must write special programs to answer each
question a user may want to ask about data.
Must protect data from inconsistent changes made
by different users accessing data concurrently
Must cope with system crashes to ensure
data consistency
Need to enforce security policies in which different
users have permission to access different subsets
of the data
Advantages of DBMS
Data Independence: Application programs should
not, ideal ly, be exposed to detai ls of data
representation and storage. The DBMS provides an
abstract view of the data that hides such detai ls.
Efficient data access. DBMS uti lizes a variety of
sophisticated techniques to store and retrieve data
efficiently.
Data integrity and security. If data is always
accessed through the DBMS, the DBMS can enforce
integrity constraints. Also, it can enforce access
controls that govern what data is visible to different
classes of users
Data administration.
When several users share the data, centralizing the
administration of data can offer significant
improvements. Experienced professionals who
understand the nature of the data being managed, and
how different groups of users use it, can be responsible
for organizing the data representation to minimize
redundancy and for fine-tuning the storage of the data
to make retrieval efficient.
Concurrent access, recovery from crashes. A DBMS
schedules concurrent accesses to the data in such a
manner that users can think of the data as being
accessed by only one user at a time. Further, the DBMS
protects users from the effects of system fai lures.
Reduced application development time. Clearly,
the DBMS supports important functions that are
common to many applications accessing data in the
DBMS. DBMS applications are also likely to be more
robust than simi lar stand-alone applications
because many important tasks are hand led by the
DBMS
Describing and storing data in DBMS
Data model is a col lection of high-level data
description constructs that hide many low-level
storage detai ls.
DBMS al lows to store the data in the form of a
data model.
Ex: Relational data model.
Semantic data model: is a more abstract high-level data
model that makes it easier for the user to come up with
an initial description of data in an enterprise.
Ex: Entity Relationship(ER) model al lows us to
pictorial ly denote entities and the relationship among
them.
The relational data model
• A description of data in terms of data model is cal led a
schema.
• The schema for a relation specifies its name, the name of
each
field (or attribute or column), and the type of each
field.
Ex: student information in the university database
is stored in relation as fol lows
Schema :
Students( sid: string, name: string, login: string, age: integer,
gpa: real)
An Instance of the Students Relation
sid name login age gpa 53666 Jones Jones@cs 18 3.4 53688 Smith
Smith@ee 18 4.2 53650 Smith Smith@math 19 3.8 53831
Madayan Madayan@music 11 1.8 53832 guldu guldu@music 12
2.0
Levels of Abstraction
Levels of abstraction in DBMS is important because it allows
users to work with databases without worrying about the
underlying implementation details
The data in a DBMS is described at three levels of
abstraction Conceptual level/Logical level
Physical Level/ Internal Level
External Level/View Level
Conceptual level(high)
• Defines the overal l structure of the database, the
relationships
between the data elements, the constraints on the data,
and the operations that can be performed on the data.
Physical Level(low)
• how the data is physical ly stored and organized in
the
database, how the data is stored on disk, the
access metds used to retrieve the data, and the
algorithms used to perform operations on the
data.
External Level:
It defines how the data is viewed by the users or
Data definition language (DDL) is used to define external
and conceptual schemas.
Conceptual schemas /logical schemas:
• It describes the stored data in terms of the data model of
the DBMS.
• In the relational database the conceptual schemas describe
al l relations that are stored in the database.
For ex: In the university database, this relation
contains information
about entities such as students and faculty,
about relationships such as students enrol led in the
courses.
Example: University
Database
Conceptual schema:
◦ Students(sid: string, name: string, login:
string,
age: integer, gpa: real)
◦ Courses(cid: string, cname: string, credits:
integer)
◦ Enrol led(sid: string, cid: string, grade: string) ◦
faculty(fid: string, fname: string, sal: real) ◦
Rooms(rno:integer,address:string,capacity:
integer)
◦ Teaches (fid:string,cid:string)
◦ Meets_In( cid: string, rno: integer, time: string)
Physical schema:
Physical schema summarizes how the relations
described in conceptual schemas are stored on
secondary storage devices such as tapes and
disks.
A sample physical schema for the university
database fol lows:
• Store al l relations as unsorted fi les of records.
• Create indexes on the first column of the Students, Faculty, and
Courses
relations, the sal column of Faculty, and the capacity column of •
Decisions about the physical schema are based on an
Rooms.
understanding
of how the data is typical ly accessed.
Note : A fi le in a DBMS is either a col lection of records or a col lection
of pages, rather than a string of characters as in an operating system.
External Schema (View):
• External Schema al lows data access to be customized at
the level of individual users or groups of users.
• The external Schema design is guided by end-user
requirements.
Ex : Course_info( rid: string, fname:string, enrol lment:
integer)
• Each external schema consists of a col lection of one or
more views and relations from the conceptual schema.
Note: Any given database has exactly one conceptual
schema and one physical schema because it has just
one set of stored relations, but it may have several
external schemas, each tai lored to a
a particular group of users.
Real world Analogy
. Think of a Restaurant:External Level (Menu): 1.
What customers see—the dishes avai lable. .
Conceptual Level (Recipes):
1. The logical structure—how the dishes are
prepared.
. Internal Level (Kitchen Storage):
1. Physical storage of ingredients and cooking
processes.
External Level (User View):
Example:
• A student searches for the book "Data Science for Beginners"
using the library’s portal.
• The interface displays whether the book is avai lable and the shelf
location.
Conceptual Level (Logical View):
Example:
• Tables and relationships:
• Books Table: BookID, Title, AuthorID, Genre, ShelfNumber. •
Authors Table: AuthorID, Name, Nationality.
• Members Table: MemberID, Name, MembershipType,
ContactDetai ls.
• Borrowing Table: BorrowID, MemberID, BookID, BorrowDate,
ReturnDate.
Internal Level (Physical Storage):
Example:
The database stores the information physical ly on disk drives with
indexing for quick searches, such as finding al l books by a
specific author.
Data Independence
Application programs are insulated from
changes in the way how data is structured
and stored.
Data independence is a property of a
database that allows the User or Database
Administrator to change the schema at one
level without affecting the data or schema
at another level.
It is of two types:
Logical data independence
Physical data independence
Logical data independence: Protection from changes in
logical structure of data(the capacity to change the
conceptual schema without having to change
external schemas or application programs).
Logical data independence is a kind of mechanism,
which liberalizes itself from actual data stored on the
disk. If we do some changes on table format, it
should not change the data residing on the disk.
For ex: faculty relation in our university is replace by
two relation: faculty(fid: string, fname: string, sal:
real)
Faculty_public (fid: string, fname: string, office: integer)
Faculty_private(fid: string, sal: real)
Logical data independence is usually required
for changing the conceptual schema without
having to change the external schema or
application programs. It allows us to make
changes in a conceptual structure like adding,
modifying, or deleting an attribute in the
database.
Ex: If there is a database of a banking system
and we want to add the details of a new
customer or we want to update or delete the
data of a customer at the logical level data will
be changed but it will not affect the Physical
level or structure of the database.
These changes can be done at a logical level
without affecting the application program or
external layer.
• Adding, deleting, or modifying the entity or
relationship.
• Merging
or breaking the record present in the
database.
Data Independence (cont.)
Physical data independence: Protection from
changes in physical structure of data (the
capacity to change the internal schema
without having to change the conceptual (or
external) schemas).
Physical Data Independence can be defined
as the ability to change the physical level
without affecting the logical or Conceptual
level.
Physical data independence gives us the
freedom to modify the - Storage device, File
structure, location of the database, etc. without
changing the definition of conceptual or view
level.
Example: For example, if we take the database
of the banking system and we want to scale up
the database by changing the storage size and
also want to change the file structure, we can
do it without affecting any functionality of
logical schema
Changes can be done at the physical layer
without affecting the conceptual layer - • Changing
the storage devices like SSD, hard disk and
magnetic tapes, etc.
• Changing the access technique and modifying
indexes.
• Changing the compression techniques or hashing
algorithms.
Structure of a DBMS
The DBMS accepts SQL commands generated
from a variety of user interfaces, produces
query evaluation plans, executes these plans
against the database and returns the
answers.
When a user issues a query, the parsed query
is presented to a query optimizer, which uses
information about how the data is stored to
produce an efficient execution plan for
evaluating the query.
An execution plan is a blueprint for
Structure of a DBMS
The code that implements relational operators sits
on top of the fi le and access methods layer. This
layer supports the concept of a fi le, which,
in a DBMS, is a col lection of pages or a col
lection of records.
In addition to keeping track of the pages in a fi le,
this layer organizes the information within a
page.
The fi les and access methods layer code sits on
top of the buffer manager, which brings pages in
from disk to main memory as needed in response
to read requests.
The lowest layer of the DBMS software deals
with management of space on disk, where the
data is stored.
Higher layers al locate, deal locate, read, and
write pages through (routines provided by) this
layer, cal led the disk space manager.
The DBMS supports concurrency and crash
recovery by careful ly scheduling user requests
and maintaining a log of al l changes to the
database.
DBMS components associated with
concurrency control and recovery include the
transaction manager, which ensures that
transactions request and release locks
according to a suitable locking protocol and
schedules the execution of transactions;
the lock manager, which keeps track of
requests for locks and grants locks on
database objects when they become avai lable;
and the recovery manager, which is responsible
for maintaining a log and restoring the system
to a consistent state after a crash.
The disk space manager, buffer manager, and fi le
and access method layers must interact with these
components.
People who work with
Databases
Variety of people are associated with
creation and use of databases.
End users: store and use the data
DBMS implementers: work with vendors
DB application programmers
E.g. smart webmasters
Database administrator (DBA) Responsibi lities
Design of the conceptual and physical schemas:
Interact with users to understand what data is to
be stored and how it is likely to be used.
Security and authorization: Responsible for
ensuring that unauthorized data access is not
permitted.
Data avai labi lity and crash recovery: If the system
fai ls, users can continue to access as much of
the uncorrupted data as possible.
Database tuning modifying the database as