Introduction to Database
Systems
What Is a DBMS?
• A data base is a col lection of interrelated
data.
• A data base ma nagement system (DBMS) is
a col lection of prog ra ms that ma nages the
data base structu re a nd controls access to
the data stored in the data base.
• A data base system(DBS) : It is a
com bination of data base a nd data base
ma nagement systems.
What Is a Database?
A Data base is a col lection of data , typica l ly
describing the activities of one or more
related orga nizations.
For exa m ple, a u niversity data base mig ht
contain
I nformation a bout the fol lowing:
Entities such as students, facu lty, cou rses, a nd classrooms.
Relationships between entities, such as students' enrol l ment in
cou rses,
facu lty teaching cou rses, a nd the use of rooms for cou rses.
A data base ma nagement system, or DBMS,
is softwa re desig ned to assist in maintaining
a nd utilizing la rge col lections of data.
A data base system is a hig h-level definition
of the structu re a nd relationshi p between
stored data, a data base or data bases,
users, a nd the ha rdwa re or operating system
used for the storage.
Applications of DBMS
• Ba n king
• Ed ucation
• Business
• Airlines
• Universities
• Ma nufactu ring a nd sel ling
• H u ma n resou rces
Why shou ld we use data base ?
. It ma nages data efficiently a nd a l lows users to
perform mu lti ple tasks with ease.
. A data base ma nagement system stores,
orga nizes a nd ma nages a la rge a mou nt of
information within a sing le softwa re a pplication.
. Effective a nd efficient ma nagement of data
. Query processing a nd ma nagement
. Use of this system increases the efficiency of
business operations a nd red uces overa l l costs.
Historical Perspective
Ea rly 1960s
◦ I nteg rated data store, first genera l-
pu rpose DBMS desig ned by Cha rles
Bach ma n at Genera l Electric
◦ Formed basis for network data model
◦ Bach ma n received Tu ring Awa rd in 1973 for
his work in data base a rea
network data model
1. Record Types:
1. Book
2. Author
3. Publisher
2. Sets:
1. Consider a set called "BooksByAuthor," which contains
information about books written by an author.
2. Another set could be "PublishedByPublisher," which contains
books published by a specific publisher.
3. Record Instances:
1. We have records for individual books, authors, and publishers.
4. Pointers:
1. Each record in the "BooksByAuthor" set has pointers to both the
author record and the book record.
2. Each record in the "PublishedByPublisher" set has pointers to
both the publisher record and the book record.
Network model
Historical Perspective
Late 1960s
◦ I BM developed I nformation Ma nagement System
(I MS), used even today in ma ny major
insta l lations
◦ I MS formed the basis for hiera rchica l data model
◦ America n Airlines a nd I BM jointly developed
SABRE for ma king airline reservations.
◦ SABRE system is developed by America n airlines
a nd I BM for ma king airlines reservations.
◦ SABRE is used today to popu late Web-based
travel services such as Travelocity
hiera rchica l data model
1. Record Types:
1. Department
2. Employee
2. Fields:
1. Department: DepartmentID, DepartmentName, Location
2. Employee: EmployeeID, EmployeeName, Position
3. Hierarchical Structure:
1. Each department can have multiple employees, but
each employee reports to only one department
hiera rchica l data model
Historical Perspective
1970
◦ Edga r Codd, at I BM’s Sa n Jose Resea rch
La boratory, proposed relationa l data model.
◦ It spa rked the ra pid development of severa l
DBMSs based on relationa l model, a long with a
rich body of theoretica l resu lts that placed the
field on a firm fou ndation.
◦ Codd won 1981 Tu ring Awa rd.
◦ Data base systems matu red as a n academic
disci pline
◦ The benefits of DBMS were widely recog nized,
a nd the use of DBMSs for ma naging corporate
data beca me sta nda rd practice.
Historical Perspective
1980s
◦ Relationa l data model consolidated its
position as domina nt DBMS pa radig m, a nd
data base systems continued to gain
widespread use
◦ SQL query la nguage, developed as pa rt of
I BM’s System R project, is now the
sta nda rd query la nguage
◦ SQL was sta nda rdized in late 1980s, a nd
cu rrent sta nda rd SQL:1999 was adopted by
ANSI a nd ISO
1. Tables:
1. Student
2. Course
3. Enrollment
2. Attributes (Columns):
1. Student: StudentID, FirstName, LastName,
DateOfBirth
2. Course: CourseID, CourseName, Instructor
3. Enrollment: EnrollmentID, StudentID (Foreign Key),
CourseID (Foreign Key), EnrollmentDate
3. Primary Keys and Foreign Keys:
1. StudentID is the primary key of the Student table.
2. CourseID is the primary key of the Course table.
3. EnrollmentID is the primary key of the Enrollment
table.
4. StudentID in the Enrollment table is a foreign key
referencing the Student table.
5. CourseID in the Enrollment table is a foreign key
Historical Perspective
Late 1980s til l 1990s
◦ Considera ble resea rch into more powerfu l query
la nguage a nd richer data model, with em phasis
on su pporting com plex a na lysis of data from a l l
pa rts of a n enterprise
◦ Severa l vendors, e.g., I BM’s DB2, Oracle 8,
I nformix U DS, extended their systems with the
a bility to store new data types such as images
a nd text, a nd to ask more com plex queries
◦ Data wa rehouses have been developed by ma ny
vendors to consolidate data from severa l
data bases, a nd for ca rrying out specia lized
a na lysis
Softwares available
1. MySQL 8.0 latest version
2. PostgreSQL
3. SQLite
4. Microsoft SQL Server Express
5. Oracle Database Express Edition (XE)
6. MongoDB
Program-data dependence --- Three
file processing systems at Some
Company
File descri ptions a re stored within each a pplication
prog ra m that
accesses a given file. Any cha nge to a file structu re
requires cha nges
File Systems vs DBMS
If data is la rge, we must store data in a storage
device such as a disk or ta pe a nd bring releva nt
pa rts into main memory for processing as needed .
M ust write specia l prog ra ms to a nswer each
question a user may wa nt to ask a bout data.
M ust protect data from inconsistent cha nges made
by different users accessing data concu rrently
M ust cope with system crashes to ensu re data
consistency
Need to enforce secu rity policies in which different
users have permission to access different su bsets of
the data
Advantages of DBMS
Data I ndependence: Application prog ra ms shou ld
not, idea l ly, be exposed to details of data
representation a nd storage. The DBMS provides a n
a bstract view of the data that hides such details.
Efficient data access. DBMS utilizes a va riety of
sophisticated tech niques to store a nd retrieve data
efficiently.
Data integ rity a nd secu rity. If data is a lways
accessed th roug h the DBMS, the DBMS ca n enforce
integ rity constraints. Also, it ca n enforce access
controls that govern what data is visible to different
classes of users
Data ad ministration.
When severa l users sha re the data, centra lizing the
ad ministration of data ca n offer sig nifica nt
im provements. Experienced professiona ls who
u ndersta nd the natu re of the data being ma naged, a nd
how different g rou ps of users use it, ca n be
responsible for orga nizing the data representation to
minimize red u nda ncy a nd for fine-tu ning the storage
of the data to ma ke retrieva l efficient.
Concu rrent access, recovery from crashes.
A DBMS sched u les concu rrent accesses to the data in
such a ma n ner that users ca n thin k of the data as
being accessed by on ly one user at a time. Fu rther,
the DBMS protects users from the effects of system
fail u res.
Red uced a pplication development time.
Clea rly, the DBMS su pports im porta nt fu nctions
that a re com mon to ma ny a pplications accessing
data in the DBMS. DBMS a pplications a re a lso likely
to be more robust tha n simila r sta nd-a lone
a pplications because ma ny im porta nt tasks a re
ha nd led by the DBMS
Describing a nd storing data in DBMS
Data model is a col lection of hig h-level data descri ption
constructs that hide ma ny low-level storage details.
DBMS a l lows to store the data in the form of a data
model.
Ex: Relationa l data model.
Sema ntic data model: is a more a bstract hig h-level data
model that ma kes it easier for the user to come u p with
a n initia l descri ption of data in a n enterprise.
Ex: Entity Relationship(ER) model a l lows us to pictoria l ly
denote entities a nd the relationship a mong them.
The relationa l data model
• A descri ption of data in terms of data model is ca l led a
schema.
• The schema for a relation specifies its na me, the na me of
each
field (or attribute or col u m n), a nd the type of each
field.
Ex: student information in the u niversity data base is
stored in relation as fol lows
Schema :
Students( sid: string, na me: string, login: string, age: integer,
g pa: rea l)
An I nsta nce of the Students Relation
sid name login age gpa
53666 Jones Jones@cs 18 3.4
53688 Smith Smith@ee 18 4.2
53650 Smith Smith@math 19 3.8
53831 Madayan Madayan@music 11 1.8
53832 guldu guldu@music 12 2.0
Levels of Abstraction
Levels of abstraction in DBMS is important because it allows
users to work with databases without worrying about the
underlying implementation details
The data in a DBMS is described at three levels of abstraction
Conceptual level/Logical level
Physical Level/ Internal Level
External Level/View Level
Conceptua l level(hig h)
• Defines the overa l l structu re of the data base, the
relationshi ps
between the data elements, the constraints on the data,
a nd the operations that ca n be performed on the data.
Physica l Level(low)
• how the data is physica l ly stored a nd orga nized
in the
data base, how the data is stored on disk, the
access metds used to retrieve the data, a nd the
a lgorith ms used to perform operations on the
data.
Externa l Level:
It defines how the data is viewed by the users or
Data definition la nguage (DDL) is used to define externa l
a nd conceptua l schemas.
. Conceptua l schemas /logica l schemas:
• It describes the stored data in terms of the data model
of the DBMS.
• I n the relationa l data base the conceptua l schemas
describe a l l relations that a re stored in the data base.
For ex: I n the u niversity data base, this relation
contains information
a bout entities such as students a nd facu lty,
a bout relationships such as students en rol led in the
cou rses.
Example: University
Database
Conceptua l schema:
◦ Students(sid: string, na me: string, login:
string,
age: integer, g pa: rea l)
◦ Cou rses(cid: string, cna me: string, credits:
integer)
◦ Enrol led(sid: string, cid: string, g rade: string)
◦ facu lty(fid: string, fna me: string, sa l: rea l)
◦ Rooms(rno:integer,add ress:string,ca pacity:
integer)
◦ Teaches (fid:string,cid:string)
◦ Meets_I n( cid: string, rno: integer, time: string)
Physica l schema:
Physica l schema su m ma rizes how the relations
described in conceptua l schemas a re stored on
seconda ry storage devices such as ta pes a nd disks.
A sa m ple physica l schema for the u niversity
data base fol lows:
• Store a l l relations as u nsorted files of records.
• Create indexes on the first col u m n of the Students, Facu lty, a nd
Cou rses
relations, the sa l col u m n of Facu lty, a nd the ca pacity col u m n of
• Decisions
Rooms. a bout the physica l schema a re based on a n
u ndersta nding
of how the data is typica l ly accessed.
Note : A file in a DBMS is either a col lection of records or a col lection of
pages, rather tha n a string of cha racters as in a n operating system.
Externa l Schema (View):
• Externa l Schema a l lows data access to be customized
at the level of individ ua l users or g rou ps of users.
• The externa l Schema desig n is guided by end-user
requirements.
Ex : Cou rse_info( rid: string, fna me:string, enrol l ment:
integer)
• Each externa l schema consists of a col lection of one or
more views a nd relations from the conceptua l schema.
Note: Any given data base has exactly one conceptua l
schema a nd one physica l schema because it has just one
set of stored relations, but it may have severa l externa l
schemas, each tailored to a
a pa rticu la r g rou p of users.
Rea l world Ana logy
1. Thin k of a Restau ra nt:Externa l Level (Menu):
1. What customers see—the dishes availa ble.
2. Conceptua l Level (Reci pes):
1. The logica l structu re—how the dishes a re
prepa red.
3. I nterna l Level (Kitchen Storage):
1. Physica l storage of ing redients a nd cooking
processes.
Externa l Level (User View):
Exa m ple:
• A student sea rches for the book "Data Science for Begin ners"
using the libra ry’s porta l.
• The interface displays whether the book is availa ble a nd the shelf
location.
Conceptua l Level (Logica l View):
Exa m ple:
• Ta bles a nd relationshi ps:
• Books Ta ble: BookI D, Title, AuthorI D, Gen re, ShelfN u m ber.
• Authors Ta ble: AuthorI D, Na me, Nationa lity.
• Mem bers Ta ble: Mem berI D, Na me, Mem bershi pType,
ContactDetails.
• Borrowing Ta ble: BorrowI D, Mem berI D, BookI D, BorrowDate,
Retu rn Date.
I nterna l Level (Physica l Storage):
Exa m ple:
The data base stores the information physica l ly on disk d rives with
indexing for quick sea rches, such as finding a l l books by a specific
author.
Data Independence
Application prog ra ms a re insu lated from
cha nges in the way how data is structu red a nd
stored.
Data independence is a property of a
database that allows the User or Database
Administrator to change the schema at one
level without affecting the data or schema at
another level.
It is of two types:
Logica l data independence
Physica l data independence
Logica l data independence: Protection from
cha nges in logica l structu re of data(the ca pacity to
cha nge the conceptua l schema without having to
cha nge externa l schemas or a pplication prog ra ms).
Logica l data independence is a kind of mecha nism,
which libera lizes itself from actua l data stored on
the disk. If we do some cha nges on ta ble format, it
shou ld not cha nge the data residing on the disk.
For ex: facu lty relation in ou r u niversity is replace by
two relation: facu lty(fid: string, fna me: string, sa l:
rea l)
Facu lty_pu blic (fid: string, fna me: string, office: integer)
Facu lty_private(fid: string, sa l: rea l)
Logical data independence is usually required
for changing the conceptual schema without
having to change the external schema or
application programs. It allows us to make
changes in a conceptual structure like adding,
modifying, or deleting an attribute in the
database.
Ex: If there is a database of a banking system
and we want to add the details of a new
customer or we want to update or delete the
data of a customer at the logical level data will
be changed but it will not affect the Physical
level or structure of the database.
These changes can be done at a logical level
without affecting the application program or
external layer.
• Adding, deleting, or modifying the entity or
relationship.
• Merging or breaking the record present in the
database.
Data Independence (cont.)
Physica l data independence: Protection
from cha nges in physica l structu re of data
(the ca pacity to cha nge the interna l schema
without having to cha nge the conceptua l (or
externa l) schemas).
Physical Data Independence can be defined
as the ability to change the physical level
without affecting the logical or Conceptual
level.
Physical data independence gives us the
freedom to modify the - Storage device, File
structure, location of the database, etc.
without changing the definition of conceptual
or view level.
Example: For example, if we take the database
of the banking system and we want to scale
up the database by changing the storage size
and also want to change the file structure, we
can do it without affecting any functionality of
logical schema
Changes can be done at the physical layer
without affecting the conceptual layer -
• Changing the storage devices like SSD, hard
disk and magnetic tapes, etc.
• Changing the access technique and modifying
indexes.
• Changing the compression techniques or
hashing algorithms.
Structure of a DBMS
The DBMS accepts SQL com ma nds
generated from a va riety of user interfaces,
prod uces query eva l uation pla ns, executes
these pla ns against the data base a nd
retu rns the a nswers.
When a user issues a query, the pa rsed
query is presented to a query optimizer,
which uses information a bout how the data
is stored to prod uce a n efficient execution
pla n for eva l uating the query.
An execution pla n is a bl ueprint for
Structu re of a DBMS
The code that im plements relationa l operators
sits on top of the file a nd access methods layer.
This layer su pports the concept of a file, which,
in a DBMS, is a col lection of pages or a
col lection of records.
I n addition to keeping track of the pages in a
file, this layer orga nizes the information within a
page.
The files a nd access methods layer code sits on
top of the buffer ma nager, which brings pages
in from disk to main memory as needed in
response to read requests.
The lowest layer of the DBMS softwa re dea ls
with ma nagement of space on disk, where the
data is stored.
Hig her layers a l locate, dea l locate, read, a nd
write pages th roug h (routines provided by) this
layer, ca l led the disk space ma nager.
The DBMS su pports concu rrency a nd crash
recovery by ca refu l ly sched u ling user requests
a nd maintaining a log of a l l cha nges to the
data base.
DBMS com ponents associated with
concu rrency control a nd recovery incl ude the
tra nsaction ma nager, which ensu res that
tra nsactions request a nd release locks
according to a suita ble locking protocol a nd
sched u les the execution of tra nsactions;
the lock ma nager, which keeps track of
requests for locks a nd g ra nts locks on
data base objects when they become availa ble;
a nd the recovery ma nager, which is
responsible for maintaining a log a nd restoring
the system to a consistent state after a crash.
The disk space ma nager, buffer ma nager, a nd file
a nd access method layers must interact with these
com ponents.
People who work with
Databases
Va riety of people a re associated with creation
a nd use of data bases.
End users: store a nd use the data
DBMS im plementers: work with vendors
DB a pplication prog ra m mers
E.g. sma rt webmasters
Data base ad ministrator (DBA) Responsibilities
Desig n of the conceptua l a nd physica l schemas:
I nteract with users to u ndersta nd what data is
to be stored a nd how it is likely to be used.
Secu rity a nd authorization: Responsible for
ensu ring that u nauthorized data access is not
permitted.
Data availa bility a nd crash recovery: If the
system fails, users ca n continue to access as
much of the u ncorru pted data as possible.
Data base tu ning modifying the data base as