Ganpat University
Department of Computer Science
B.SC(CA&IT)
U13A3DMS-DATABASE MANAGENT SYSTEM
DBMS UNIT - I
• DATA :-
Data is raw facts which are manipulated. Data consist of feats, which
become information when they are processed and convey meaning to
people.
To refer to what is actually stored in the database. Ex: 13-08-2009
• INFORMATION :-
Information is the summarization of data in a presentable form.
To refer to the meaning of that data as understood by some user. Ex: 13-
08-2009 –
It represent date of birth of student
• DATABASE :-
“The collection of data, usually referred to as the database, contains
information about one particular enterprise.”
A database is a collection of information that’s related to a particular
subject or purpose, such as tracking customer order or maintaining a
music collection.”
• DATABASE SYSTEM:-
“A database system is basically just a computerized record keeping
system.”
“The database and DBMS software together is called as Database
System.”
• DBMS :-
“A Database Management System (DBMS) consist of a collection of
interrelated data and a set of programs to access those data.”
“A Database Management System is a software system that allows
access to data contained in a database.”
Database management system is a software which is used to manage the database. For example:
MySQL, Oracle, etc are a very popular commercial database which is used in different applications.
• DA :-
The Data Administrator (DA) is the person who makes the strategic and
policy decision regarding the data of the enterprise.
• DBA :-
The Database Administrator (DBA) is the person who provides the
necessary technical support for implementing those decisions.
• Users of system can perform a variety of operations on files
such as
Adding new, empty files to the database.
Inserting data into existing files.
Retrieving data from existing files.
Changing data in existing files.
Deleting data from existing files.
Removing existing files from the database.
• Components of Database System.
Or
• Component of Database System Environment.
Database System involves four major components
1. DATA
2. HARDWARE
3. SOFTWARE
4. USERS
[1] . DATA
A single-user system in which at most user can access the database at
any given time.
A multi-user system is a system in which many users can access the
database at the same time.
In large system, there are two types of data:
1. Integrated
2. Shared
1. Integrated
By integrated, we mean that the database can be thought of as a
unification of several distinct files.
For example , a given database might contain both an EMPLOYEE file
, giving employee names , addresses , departments , salaries , etc. , and
an ENROLLMENT file , representing the enrollment of employees in
training course administration , it is Necessary to know the department
for each enrolled student. Then there is clearly no need to include that
information in the ENROLLMENT file, because it can always be
discovered by referring to the EMPLOYEE file instead.
2. Shared
By shared, we mean that individual pieces of data in the database can be
shared among different users , each of those users can have access to the
same piece of data , for different purposes.
Different users can effectively access the same piece of data at the same
time.
For instance, the department information in the EMPLOYEE file would
typically be users in the Personal Department – and , as already
suggested , those two classes of users would use that information for
different purposes.
[2] . HARDWARE
The hardware components of the system consist of :
The secondary storage volumes – mostly magnetic disks – that are used
to hold the stored data , together with the associated I/O devices , device
controllers , I/O channels.
The hardware processor and associated main memory that are used to
support the execution of the database system software.
[3] . SOFTWARE
The data are physically stored and the users of the system is a layer of
software ,known as the database manager or database server or he
database management system (DBMS).
All requests for access to the database are handled by the DBMS; for
adding and removing files (tables) , retrieving data from and updating
data in such files or tables, all facilities provided by the DBMS.
The DBMS is the most important software component in the overall
system.
[4] . USERS
There are three types of users.
1. Application Programmer
2. End Users
3. Database Administrator (DBA)
1 . Application Programmer
Application programmers are responsible for writing database
application programs in some programming language such as COBOL ,
PL/I , c++ , JAVA or some higher level “fourth generation” language.
Such programs access the database by issuing the appropriate request.
The purpose of those programs is to allow an end user to access the
database from an online workstation or terminal.
2 . End Users
The end users interact with the system from online workstation or
terminals.
A end user can access the database via one of the online applications,
user can use an interface provided as an integral part of the database
system software. Such vendor-provided interfaces are also supported by
means of online application, but those applications are built-in, not user-
written.
3 . Database Administrator (DBA)
The technical person responsible for implementing the data
administrator’s decisions is the database administrator (DBA).
The job of is to create the actual database and to implement the
technical controls needed to enforce the various policy decisions made
by the data administrator.
The DBA is also responsible for ensuring that the system operates with
adequate performance and providing variety of other technical services.
Advantages or Benefits Of DBMS
1 . The data can be shared
Sharing means not only that existing application can share the data in
the database,but also that new applications can be developed to operate
against that same data.
It might be possible to satisfy the data requirements of new applications
without having to add any new data to the database.
2 . Redundancy(duplication of data) can be reduced
In nondatabase systems each application has its own private files.
This fact can often lead to considerable redundancy in stored data.
For example, a personal application and an education records
application might both own a file that includes department information
for employees. Those two files can be integrated, and redundancy can be
avoided. The data administrator is aware of the data requirements for
both applications. We do not mean to suggest that all redundancy can or
necessarily should be eliminated.
3 . Inconsistency can be avoided
Suppose a given fact about real world – says that employee E3 works in
department D8 – is represented by two distinct entries in the database.
Suppose also that the DBMS is not aware of this duplication. Then there
will necessarily be occasion on which the two entries will not agree :
when one of the two has been updated and the other not. At such times
the database is said to be inconsistent.
A database that is in inconsistent state is capable of supplying incorrect
information to its users.
Any change made to either of the two entries is automatically applied to
the other one as well. This process is known as propagating updates.
4 . Transaction support can be performed
A transaction is a logical unit of work, involving several database
operations.
For example, the transfers of cash amount from one account A to
another account B. Here two updates are required, one two withdraw the
cash from account A and the other to deposit it to account B.
5 . Integrity Can Be Maintained
Data integrity means data in the database are both accurate and
consistent.
The problem of integrity is the problem of ensuring that the data in the
database is correct.
Data integrity is even more important in a database system than it is in a
“private files” environment, because the data is shared. without
appropriate controls it would be possible for one user to update the
database incorrectly, generating bad data and so “infecting” other
innocent users of that data. Suppose some employee load is 16.In place
of 16 we enter 60 then it create inaccurate data.In Dbms ,we can apply
different constraints to eliminate such incorrect data.
6 . Security can be enforced
The security means to protect data against unauthorized person.
The DBA can ensure that the only means of access to the database is
through the proper channels , and hence can defines security constraints
or rules to be checked whenever access is attempted to sensitive data.
Different constraints can be established for each types of access to each
piece of information in the database.
7 . Conflicting requirements can be balanced
Knowing the overall requirements of the enterprise, the DBA – again
under the data administrator’s direction – can so structure the system as
to provide overall services that is “best for the enterprise.
8 . Standard can be enforced
The DBA can ensure that all applicable standards are observed in the
representation of the data.
Applicable standards might include any or all of the following:
departmental,installation, corporate, industry, national, and international
standards.
Standardizing data are easy to interchange or movement between
systems. Data naming and documentation standards are also desirable to
sharing and understandability.
Disadvantages Of DBMS
1 . Cost of Software and Hardware
A processor with high speed of data processing and memory of large
size is required to run the DBMS S/W. It means that you have to up
grade the H/W used for file-based system. DBMS S/W is also very
costly.
2 . Cost of Data Conversion
When a computer file-based system is replaced with a database system
the data stored into file must be converted to database file.
It is very difficult and costly method to convert data of data files into
database. You have to hire database and system designers along with
application programmer. So a lot of money has to be paid for developing
S/W.
3 . Cost of Staff Training
Most DBMS are often complex system, so the training for users to use
the DBMS is required.
Training is required at all levels, including programming, application
development and database administration. The organization has to be
paid a lot of amount for the training of staff to run theDBMS.
4 . Appointing Technical staff
The trained technical persons such as database administrator,
application programmers, data entry operators etc, are required to handle
the DBMS. You have to pay some salaries to these persons. Therefore
the “system cost increases”.
5 . Database Damage
In most of the organizations, all data is integrated into a single database.
If database is damaged due to electric failure or database is corrupted on
the storage media, than your valuable data may be lost forever.
• Difference between Data Administration V/S Database
Administration
1 “The DA is the person who makes the “The DBA is the person who
strategic and policy decisions regarding the provides the necessary technical
data of the enterprise.” support for implementing those
decisions (DA’s Decision).”
2 The DA’s job to decide what data should be “The job of the DBA is to create the actual
stored in the database. database and to implement the technical
control needed to enforce the various policies.
3 The DA’s comes at a senior management The DBA’s comes at the top level
level, manager not technician. like General manager
What is RDBMS?
RDBMS stands for Relational Database Management System.
RDBMS is a program used to maintain a relational database.
RDBMS uses SQL queries to access the data in the database. A relational database defines
database relationships in the form of tables. The tables are related to each other - based on data
common to each.
Look at the following three tables "Customers", "Orders", and "Shippers" from the Northwind
database:
Customers Table
CustomerID CustomerName ContactName Address City PostalCode Country
1 Alfreds Futterkiste Maria Anders Obere Str. 57 Berlin 12209 Germany
Ana Trujillo
Avda. de la México
2 Emparedados y Ana Trujillo 05021 Mexico
Constitución 2222 D.F.
helados
Antonio Moreno Antonio México
3 Mataderos 2312 05023 Mexico
Taquería Moreno D.F.
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1 1DP UK
Christina
5 Berglunds snabbköp Berguvsvägen 8 Luleå S-958 22 Sweden
Berglund
The relationship between the "Customers" table and the "Orders" table is the CustomerID
column:
Orders Table
OrderID CustomerID EmployeeID OrderDate ShipperID
10278 5 8 1996-08-12 2
10280 5 2 1996-08-14 1
10308 2 7 1996-09-18 3
10355 4 6 1996-11-15 1
10365 3 3 1996-11-27 2
10383 4 8 1996-12-16 3
10384 5 3 1996-12-16 3
The relationship between the "Orders" table and the "Shippers" table is the ShipperID column:
Shippers Table
ShipperID ShipperName Phone
1 Speedy Express (503) 555-9831
2 United Package (503) 555-3199
3 Federal Shipping (503) 555-9931
Dr. E.F.Codd. Rules.
Dr Edgar F. Codd, after his extensive research on the Relational Model of database systems,
came up with twelve rules of his own, which according to him, a database must obey in order to
be regarded as a true relational database.
These rules can be applied on any database system that manages stored data using only its
relational capabilities. This is a foundation rule, which acts as a base for all the other rules.
Rule 1: Information Rule
The data stored in a database, may it be user data or metadata(data about data), must be a value
of some table cell. Everything in a database must be stored in a table format.
Rule 2: Guaranteed Access Rule
Every single data element (value) is guaranteed to be accessible logically with a combination of
table-name, primary-key (row value), and attribute-name (column value). No other means, such
as pointers, can be used to access data.
Rule 3: Systematic Treatment of NULL Values
The NULL values in a database must be given a systematic and uniform treatment. This is a very
important rule because a NULL can be interpreted as one the following − data is missing, data is
not known, or data is not applicable.
Rule 4: Active Online Catalog
The structure description of the entire database must be stored in an online catalog, known as
data dictionary, which can be accessed by authorized users. Users can use the same query
language to access the catalog which they use to access the database itself.
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax that supports data
definition, data manipulation, and transaction management operations. This language can be used
directly or by means of some application. If the database allows access to data without any help
of this language, then it is considered as a violation.
Rule 6: View Updating Rule
All the views of a database, which can theoretically be updated, must also be updatable by the
system.
Rule 7: High-Level Insert, Update, and Delete Rule
A database must support high-level insertion, updation, and deletion. This must not be limited to
a single row, that is, it must also support union, intersection and minus operations to yield sets of
data records.
Rule 8: Physical Data Independence
The data stored in a database must be independent of the applications that access the database.
Any change in the physical structure of a database must not have any impact on how the data is
being accessed by external applications.
Rule 9: Logical Data Independence
The logical data in a database must be independent of its user’s view (application). Any change
in logical data must not affect the applications using it. For example, if two tables are merged or
one is split into two different tables, there should be no impact or change on the user application.
This is one of the most difficult rule to apply.
Rule 10: Integrity Independence
A database must be independent of the application that uses it. All its integrity constraints can be
independently modified without the need of any change in the application. This rule makes a
database independent of the front-end application and its interface.
Rule 11: Distribution Independence
The end-user must not be able to see that the data is distributed over various locations. Users
should always get the impression that the data is located at one site only. This rule has been
regarded as the foundation of distributed database systems.
Rule 12: Non-Subversion Rule
If a system has an interface that provides access to low-level records, then the interface must not
be able to subvert the system and bypass security and integrity constraints.
Difference between RDBMS and DBMS.
RDBMS DBMS
Data stored is in table format Data stored is in the file format
Multiple data elements are accessible together Individual access of data elements
Data in the form of a table are linked together No connection between data
Normalisation is not achievable There is normalisation
Support distributed database No support for distributed database
Data is stored in a large amount Data stored is a small quantity
Here, redundancy of data is reduced with the help of key and
Data redundancy is common
indexes in RDBMS
RDBMS supports multiple users DBMS supports a single user
There is only low security while
It features multiple layers of security while handling data
handling data
The software and hardware
The software and hardware requirements are higher
requirements are low
Oracle, SQL Server. XML, Microsoft Access.
Data Item:
The data item is also called as field in data processing and is the smallest unit of data
that has meaning to its users.
Eg: “e101”, ”sumit”
Entities and attributes:
An entity is a thing or object in the real world that is distinguishable from all other
objects
Eg: Bank, employee, student
Attributes are properties are properties of an entity.
Eg: Empcode, ename, rolno, name.
DBMS Three Level Architect
1. Internal Level
• The internal level has an internal schema which describes the physical storage structure of the
database.
• The internal schema is also known as a physical schema.
• It uses the physical data model. It is used to define that how the data will be stored in a block.
• The physical level is used to describe complex low-level data structures in detail.
2. Conceptual Level
• The conceptual schema describes the design of a database at the conceptual level. Conceptual
level is also known as logical level.
• The conceptual schema describes the structure of the whole database.
• The conceptual level describes what data are to be stored in the database and also describes
what relationship exists among those data.
• In the conceptual level, internal details such as an implementation of the data structure are
hidden.
• Programmers and database administrators work at this level.
3. External Level
• At the external level, a database contains several schemas that sometimes called as subschema.
The subschema is used to describe the different view of the database.
• An external schema is also known as view schema.
• Each view schema describes the database part that a particular user group is interested and
hides the remaining database from that user group.
• The view schema describes the end user interaction with database systems.
Mapping between Views
The three levels of DBMS architecture don't exist independently of each other. There must be
correspondence between the three levels i.e. how they actually correspond with each other.
DBMS is responsible for correspondence between the three types of schema. This
correspondence is called Mapping.
There are basically two types of mapping in the database architecture:
• Conceptual/ Internal Mapping
• External / Conceptual Mapping
Conceptual/ Internal Mapping
The Conceptual/ Internal Mapping lies between the conceptual level and the internal level. Its
role is to define the correspondence between the records and fields of the conceptual level and
files and data structures of the internal level.
External/ Conceptual Mapping
The external/Conceptual Mapping lies between the external level and the Conceptual level. Its
role is to define the correspondence between a particular external and the conceptual view.
Database Administrator :-
• What are the Role / Tasks of Database Administrator (DBA)?
The DBA is a person or a group of persons who provides the necessary
technical supports for implementing those decisions which are decided
by (Data dministrator) DA.
The DBA is responsible for the overall control of the system at a
technical level.
1). Defining Conceptual Schema :-
It is the data administrator’s job to decide exactly what information is
to be held in the database, in other words, to identify the entities of
interest to the enterprise and to identify the information to be record
about those entities. This process is usually referred to as logical level
sometimes conceptual database design.
Once the data administrator (DA) has thus decided the content of the
database at an abstract level , the DBA will then create the
corresponding conceptual schema.
2). Defining the Internal Schema :-
The DBA must also decide how the data is to be represented in the
stored database. This process is usually referred to as physical database
design.
Having done the physical Design , the DBA must then create the
corresponding storage structure defining the internal DDL.
In addition , he or she must also define the Conceptual / Internal
mapping.
3). Liaising with users :-
It is the business of DBA to liaise with user to ensure that the data they
need is available and to write the necessary external schemas, using the
applicable external DDL.
In addition, the corresponding External / Conceptual mapping must
also be defined.
Other aspects of the user liaise on function include consulting on
application design, providing technical education, assisting with and
similar professional services.
4). Defining Security and Integrity constraints :-
Security and integrity constraints can be regarded as part of the
conceptual schema. The conceptual DDL must include facilities for
specifying such constraints.
5). Defining dump and reload policies :-
Once an enterprise is committed to a database system, it becomes
critically dependent on the successful operation of that system.
In the event of damage to any portion of the database caused by human
error or failure in the hardware or operating system – it is essential to be
able to repair the data concerned with the minimum of delay and with as
little effect as possible on the rest of the system.
For example, the availability of data that has not been damage should
not be affected.
The DBA must define and implementing an appropriate damage control
scheme ,typically involving
a). Periodic uploading or “dumping” of the database to backup storage.
b). Reloading the database when necessary from the most recent dump.
6). Monitoring Performance & responding to changing
requirements :-
The DBA is responsible for organizing the system in such way to get
the performance that is “best for the enterprise” , and making the
appropriate adjustment.
For example, it might be necessary to reorganize the stored database
from time to time to ensure that performance levels remain acceptable.
Any change to the physical storage level of the system must be
accompanied by a corresponding change to the definition of the
conceptual schema can remain constant.
DATA MODELS
What is Model?
Model is collection of conceptual tools for describing data, data relationships,
data semantics and constraints.
Hierarchical Model
Hierarchical Model was the first DBMS model. This model organises the data in the hierarchical
tree structure. The hierarchy starts from the root which has root data and then it expands in the
form of a tree adding child node to the parent node. This model easily represents some of the
real-world relationships like food recipes, sitemap of a website etc. Example: We can represent
the relationship between the shoes present on a shopping website in the following way:
Features of a Hierarchical Model
1. One-to-many relationship: The data here is organised in a tree-like structure where the one-to-
many relationship is between the datatypes. Also, there can be only one path from parent to
any node. Example: In the above example, if we want to go to the node sneakers we only have
one path to reach there i.e through men's shoes node.
2. Parent-Child Relationship: Each child node has a parent node but a parent node can have more
than one child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted then the child node is automatically deleted.
4. Pointers: Pointers are used to link the parent node with the child node and are used to navigate
between the stored data. Example: In the above example the 'shoes' node points to the two
other nodes 'women shoes' node and 'men's shoes' node.
Advantages of Hierarchical Model
• It is very simple and fast to traverse through a tree-like structure.
• Any change in the parent node is automatically reflected in the child node so, the integrity of
data is maintained.
Disadvantages of Hierarchical Model
• Complex relationships are not supported.
• As it does not support more than one parent of the child node so if we have some complex
relationship where a child node needs to have two parent node then that can't be represented
using this model.
• If a parent node is deleted then the child node is automatically deleted.
Network Model
This model is an extension of the hierarchical model. It was the most popular model before the
relational model. This model is the same as the hierarchical model, the only difference is that a
record can have more than one parent. It replaces the hierarchical tree with a graph. Example: In
the example below we can see that node student has two parents i.e. CSE Department and
Library. This was earlier not possible in the hierarchical model.
Features of a Network Model
1. Ability to Merge more Relationships: In this model, as there are more relationships so data is
more related. This model has the ability to manage one-to-one relationships as well as many-to-
many relationships.
2. Many paths: As there are more relationships so there can be more than one path to the same
record. This makes data access fast and simple.
3. Circular Linked List: The operations on the network model are done with the help of the circular
linked list. The current position is maintained with the help of a program and this position
navigates through the records according to the relationship.
Advantages of Network Model
• The data can be accessed faster as compared to the hierarchical model. This is because the data
is more related in the network model and there can be more than one path to reach a particular
node. So the data can be accessed in many ways.
• As there is a parent-child relationship so data integrity is present. Any change in parent record is
reflected in the child record.
Disadvantages of Network Model
• As more and more relationships need to be handled the system might get complex. So, a user
must be having detailed knowledge of the model to work with the model.
• Any change like updation, deletion, insertion is very complex.
Relational Model
Relational Model is the most widely used model. In this model, the data is maintained in the
form of a two-dimensional table. All the information is stored in the form of row and columns.
The basic structure of a relational model is tables. So, the tables are also called relations in the
relational model. Example: In this example, we have an Employee table.
Features of Relational Model
• Tuples: Each row in the table is called tuple. A row contains all the information about any
instance of the object. In the above example, each row has all the information about any specific
individual like the first row has information about John.
• Attribute or field: Attributes are the property which defines the table or relation. The values of
the attribute should be from the same domain. In the above example, we have different
attributes of the employee like Salary, Mobile_no, etc.
Advnatages of Relational Model
• Simple: This model is more simple as compared to the network and hierarchical model.
• Scalable: This model can be easily scaled as we can add as many rows and columns we want.
• Structural Independence: We can make changes in database structure without changing the
way to access the data. When we can make changes to the database structure without affecting
the capability to DBMS to access the data we can say that structural independence has been
achieved.
Disadvantages of Relatinal Model
Hardware Overheads: For hiding the complexities and making things easier for the user this model
requires more powerful hardware computers and data storage devices.
• Bad Design: As the relational model is very easy to design and use. So the users don't need to
know how the data is stored in order to access it. This ease of design can lead to the
development of a poor database which would slow down if the database grows.