Unit-1 DBMS Notes
Data
Data is the raw material that can be processed by any computing machine.
For example − Employee name, Product name, Name of the student, Marks of the
student, Mobile number, Image etc.
Information
Information is data that has been converted into a more useful or intelligible form.
For example: Report card sheet.
The information is needed for the following reasons −
To gain knowledge about the surroundings.
To keep the system up to date.
To know about the rules and regulations of the society.
Knowledge
The human mind purposefully organizes the information and evaluates it to produce
knowledge.
Example of data, information and knowledge
A student secures 450 marks. Here 450 is the data, "marks of the student" is
the information, and the hard work required to get those marks is the knowledge.
Differences
The major differences between Data and Information are as follows −
Data Information
1. Organizational dimension
2. Management dimension
3. Technology dimension
1. Organizational Dimension
The information system is part of the organization. The standard operating procedures
and culture of an organization are embedded in the information system. This
includes the following:
o Business processes
o Political interest groups
o Functional specialties
o Culture
2. Management Dimension
In today's world, managers face business challenges. Information systems provide
managers with the tools and information they need to plan, manage, and monitor their
work, make decisions, develop new goods and services, and make long-term strategic
decisions.
3. Technology Dimension
Management makes use of technology to fulfill its duties. This includes computer
hardware and software, networking/telecom technology, and data management. It is
one of the many tools a manager can use to deal with change. Information systems
are classified by organizational level, processing, system goals, mode of data, and
type of support provided.
4. Expert System
1. Hardware
2. Software
3. Data
4. Procedures
5. People
6. Feedback
1. Hardware
Hardware means equipment and machinery. In modern information systems, this category
encompasses the computer and all of its supporting equipment. The supporting
equipment includes input and output devices, communication devices and storage
devices. Hardware in pre-computer information systems may have included ledger
books and ink.
2. Software
In an information system, software means computer programs as well as the manuals
that support them. Computer programs are the machine-readable instructions that
direct the circuitry in the system's hardware to produce useful information from
the data. In most cases, programs are stored on an input/output medium, such as a
tape or disk. The "software" of pre-computer information systems comprised the
instructions for using them, such as the guidebook for a card catalog, and the
information about how the hardware was set up for use, such as the column
headings in a ledger book.
3. Data
Data means the facts that systems use to generate useful information. Data is usually
stored in machine-readable form on tape or disk until the computer requires it.
The data in pre-computer information systems is usually stored in a human-readable
format.
4. Procedures
Procedures are the rules that govern how an operation is performed in an information
system. "Procedures are for people what software is for hardware" is a common
analogy used to clarify the importance of procedures in a system.
5. People
Every system requires people if it is to be useful. People are often the most
neglected part of the system, and they are possibly the factor that has the
greatest impact on the success or failure of information systems.
This includes the users, and also the people who operate and service the computers,
those who support the network of computers, and those who maintain the
information.
6. Feedback
Another component of an information system is feedback: an information system may be
provided with feedback. However, this component is not required for the system to
function.
o Communication
o Availability
o Creation of new types of jobs
o Globalization and cultural gap
Communication
With information technology such as instant messaging, email, and voice and video
calls, communication becomes cheaper, faster, and more effective.
Availability
With the help of information systems, businesses around the world can be open
around the clock. This means that a business can be open anytime and anywhere,
making purchases from different countries easier and more convenient. It also
means that you can have your products delivered right to your doorstep with
little effort.
Hardware
The hardware component of an information system comprises the physical elements
of the system. People can touch and feel pieces of hardware. These mechanisms,
equipment and wiring allow systems like computers, smartphones and tablets to
function.
Input and output devices are essential pieces of technology that allow humans to
interact with computers and other information systems. Keyboards, mice,
microphones and scanners are all examples of input devices. And output devices
might include printers, monitors, speakers and sound and video cards.
Pieces of hardware including microprocessors, hard drives, electric power supply
units, and removable storage also allow computers to store and process data.
Software
Software consists of the intangible programs that manage information system
functions, including input, output, processing and storage.
Telecommunications
Telecommunications systems connect computer networks and allow information to
be transmitted through them. Telecommunications networks also allow computers
and storage services to access information from the cloud.
Data
Data are intangible, raw facts that are stored, transmitted, analyzed and processed
by other components of information systems. Data are often stored as numerical
facts, and they represent quantitative or qualitative information.
Data can be stored in a database or data warehouse, in a form that best suits the
organization using it.
Databases house collections of data that can be queried or retrieved for specific
purposes. Databases allow users to perform fundamental operations, such as
storage and retrieval. Data warehouses, on the other hand, store data from multiple
sources for analytical purposes. They allow users to assess an organization or its
operations.
Human Resources
Human resources are a crucial part of information systems. The human component
of information systems encompasses the qualified people who influence and
manipulate the data, software and processes in information systems. Humans
involved in information systems may include business analysts, information security
analysts or system analysts.
The main aim of a DBMS is to provide a way to store and retrieve database information
that is both convenient and efficient. By data, we mean known facts that can be recorded and
that have embedded meaning. People commonly use software such as dBASE IV or V,
Microsoft Access, or Excel to store data in the form of a database. A datum is a single unit of
data. Meaningful data combine to form information; hence, information is interpreted data,
that is, data provided with semantics. MS Access is one of the most common examples of
database management software.
Advantages of DBMS
A DBMS manages data and has many benefits. These are:
Components of DBMS
Users: Users may be of any kind, such as database administrators, system developers, or
database end users.
Database application: A database application may be departmental, personal, organizational,
and/or internal.
DBMS: Software that allows users to create, access, and manipulate a database.
Database: A collection of logically related data treated as a single unit.
Consider an example of a student's file system. The student file will contain
information regarding the student (i.e. roll no, student name, course etc.). Similarly,
we have a subject file that contains information about the subject and the result file
which contains the information regarding the result.
Some fields are duplicated in more than one file, which leads to data redundancy. To
overcome this problem, we need a centralized system, i.e. the DBMS approach.
DBMS:
In the database approach, a well-organized collection of data that are related in a
meaningful way is stored only once in a system but can be accessed by different
users. The various operations performed by the DBMS are insertion, deletion,
selection, sorting, etc.
The DBMS approach and the file system approach compare as follows:

Sharing of data
  DBMS: Due to the centralized approach, data sharing is easy.
  File system: Data is distributed in many files, and it may be of different formats,
  so it isn't easy to share data.

Data Abstraction
  DBMS: DBMS gives an abstract view of data that hides the details.
  File system: The file system exposes the details of data representation and storage.

Security and Protection
  DBMS: DBMS provides a good protection mechanism.
  File system: It isn't easy to protect a file under the file system.

Recovery Mechanism
  DBMS: DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from
  system failure.
  File system: The file system doesn't have a crash recovery mechanism, i.e., if the
  system crashes while entering some data, the content of the file will be lost.

Manipulation Techniques
  DBMS: DBMS contains a wide variety of sophisticated techniques to store and retrieve
  the data.
  File system: The file system can't efficiently store and retrieve the data.

Concurrency Problems
  DBMS: DBMS takes care of concurrent access to data using some form of locking.
  File system: Concurrent access has many problems, like redirecting the file while
  deleting or updating some information.

Where to use
  DBMS: The database approach is used in large systems which interrelate many files.
  File system: The file system approach is generally suited to small systems with few
  interrelated files.

Cost
  DBMS: The database system is expensive to design.
  File system: The file system approach is cheaper to design.

Data Redundancy and Inconsistency
  DBMS: Due to the centralization of the database, the problems of data redundancy and
  inconsistency are controlled.
  File system: The files and application programs are created by different programmers,
  so there is a lot of duplication of data, which may lead to inconsistency.

Structure
  DBMS: The database structure is complex to design.
  File system: The file system approach has a simple structure.

Data Independence
  DBMS: Data independence exists, and it can be of two types.
  File system: There is no data independence.

Integrity Constraints
  DBMS: Integrity constraints are easy to apply.
  File system: Integrity constraints are difficult to implement.

Data Models
  DBMS: In the database approach, 3 types of data models exist.
  File system: There is no concept of data models.

Flexibility
  DBMS: Changes to the content of the stored data are often necessary, and these changes
  are made more easily with a database approach.
  File system: The system is less flexible than the DBMS approach.

Examples
  DBMS: Oracle, SQL Server, Sybase etc.
  File system: Cobol, C++ etc.
1. Real-World Entity
A modern DBMS is realistic: real-world entities are used to design its architecture, and it
also models their behavior and attributes. For example, in an organization database,
employee is an entity and the employee id is an attribute.
2. Self-Describing Nature
Before DBMS, a traditional file management system was used for storing information and data.
A traditional file management system had no concept of data definition like the one in a DBMS.
A DBMS is self-describing because it contains not only the database itself but also
the metadata. Metadata (data about data) defines and describes the extent, type,
structure, and format of all data as well as the relationships between data. In this way, the
data itself describes what actions can be taken on it.
3. ACID Properties
A DBMS is expected to support the ACID (Atomicity, Consistency, Isolation, and Durability)
properties. The DBMS ensures that the real purpose of the data is not lost while
performing transactions such as delete, insert, and update. For example, if an employee's
name is updated, the DBMS should make sure that there is no duplicate data and no mismatch
of employee information.
4. Concurrent Use of Database
Many users are likely to access the data at the same time, and they may need to alter the
database concurrently. A DBMS allows them to use the database concurrently without
problems. Concurrency also increases the efficiency of the system. For example, employees
of a railway reservation system can book and access tickets for passengers concurrently.
Every employee can see on his own interface how many seats are available or whether a
coach (bogie) is fully booked.
5. Program-Data Independence
Program-data independence is a big relief to database users. In a traditional file management
system, the structure of data files was defined in the application programs, so a change to a
data file meant changing all the programs that used it.
In a DBMS, the structure of data files is not stored in the programs but in the system
catalogue. As a result, internal improvements to storage efficiency or other changes in the
data structure do not affect the application software.
6. Transactions
Transactions are groups of actions that bring the database from one consistent state to a new
consistent state. Traditional file-based systems did not have this feature. A transaction is always
atomic, which means it cannot be divided further: it either completes entirely or does not happen
at all. For example, a person wants to transfer money from his account to another person's
account. The transaction is complete only if the money is debited from the sender and credited to
the receiver. Anything other than this leaves the database in an inconsistent state.
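The all-or-nothing behaviour of the money transfer can be sketched in C++ as follows. This is only an illustration under stated assumptions: the in-memory Accounts map, the transfer function, and the snapshot-based rollback are hypothetical names and choices made for this example, not how any particular DBMS implements transactions.

```cpp
#include <exception>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical in-memory "database": account name -> balance.
using Accounts = std::map<std::string, long>;

// Moves `amount` from one account to another as a single unit of work.
// Returns true if the transfer committed, false if it was rolled back.
bool transfer(Accounts& db, const std::string& from,
              const std::string& to, long amount) {
    Accounts snapshot = db;  // remember the last consistent state
    try {
        if (amount <= 0) throw std::invalid_argument("amount must be positive");
        if (db.at(from) < amount) throw std::runtime_error("insufficient funds");
        db.at(from) -= amount;  // step 1: debit the sender
        db.at(to) += amount;    // step 2: credit the receiver (throws if `to` is unknown)
        return true;            // commit: both steps succeeded
    } catch (const std::exception&) {
        db = snapshot;          // rollback: undo any partial change
        return false;
    }
}
```

If the second step fails after the first has already debited the sender, the snapshot is restored, so the sender's balance is unchanged and the database never shows a half-finished transfer.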
7. Data Persistence
Persistence means that data is maintained in the DBMS until it is removed explicitly. The life
span of data stored in the DBMS is decided by the users, directly or indirectly, and not by
system failures. Data stored in the DBMS should not be lost. If a system failure happens in the
middle of a transaction, the transaction is either rolled back or fully completed, so the data
is not put at risk.
8. Backup and Recovery
The whole database can fail in many ways. If that happens and no copy exists, no one will be
able to get the database back and the company will suffer a big loss. The only solution is to
take backups of the database so that it can be restored whenever needed. A database must have
this characteristic to be effective.
9. Data Integrity
This is one of the most important characteristics of a database management system. Integrity
ensures the quality and reliability of the database system. It protects the database against
unauthorized access and makes it more secure. It allows only consistent and accurate data into
the database.
10. Multiple Views
A DBMS supports multiple views of the database, so users can have different views depending on
their department and interest. For example, a user in the teaching department will have one
view and a user in the hostel department will have another. This feature also gives users some
security, because users of other departments cannot access their files.
11. Stores Any Kind of Data
A database management system should be able to store any kind of data. It should not be
restricted to employee name, salary, and address. Any kind of data that exists in the real world
can be stored in a DBMS because we need to work with all kinds of data that are present around us.
12. Security
A DBMS provides security for the data stored in it because users have different rights to access
the database. Some users can access the whole database while others can access only a small part
of it. For example, a computer network lecturer can only access files that are related to
computer subjects, but the HOD of the department can access the files of all subjects related to
the department.
13. Represents Complex Relationships Between Data
Data stored in a database is interconnected, and relationships exist between data items.
A DBMS should be able to represent complex relationships between data to make efficient and
accurate use of the data.
14. Query Language
Queries are used to retrieve and manipulate data, and a DBMS is equipped with a strong query
language that makes this effective and efficient. Users can retrieve any kind of data they
want from the database by applying different sets of queries. A file-based system does not
have the luxury of a query language.
15. Cost
The cost of a DBMS is high compared to other software and technology available in the
market. Over the long run, however, a DBMS can be the better choice because its maintenance
cost is comparatively low.
Data Model
A data model gives us an idea of how the final system will look after its
complete implementation. It defines the data elements and the relationships
between them. Data models are used to show how data is stored,
connected, accessed and updated in the database management system. Here, we
use a set of symbols and text to represent the information so that members of the
organisation can communicate and understand it. Although many data
models are in use nowadays, the relational model is the most widely used.
Apart from the relational model, there are many other types of data models,
which are discussed below. Some of the data models in DBMS
are:
1. Hierarchical Model
2. Network Model
3. Relational Model
4. Object-Oriented Data Model
5. Object-Relational Data Model
6. Flat Data Model
7. Semi-Structured Data Model
8. Associative Data Model
9. Context Data Model
1. Hierarchical Model
The hierarchical model was the first DBMS model. This model organises
the data in a hierarchical tree structure. The hierarchy starts from the
root, which holds the root data, and then expands in the form of a tree
by adding child nodes to parent nodes. This model easily represents
some real-world relationships, such as food recipes or the sitemap of a
website. Example: the categories of shoes on a shopping website can be
arranged in such a tree.
2. Network Model
This model is an extension of the hierarchical model. It was the most
popular model before the relational model. This model is the same as
the hierarchical model; the only difference is that a record can have
more than one parent. It replaces the hierarchical tree with a
graph. Example: a Student node can have two parents, the CSE
Department and the Library. This was not possible in the
hierarchical model.
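The Student example can be sketched in C++ as a small graph. The Record struct and the link helper below are hypothetical names chosen for this illustration; the point is only that a record keeps a list of parents rather than a single parent.

```cpp
#include <string>
#include <vector>

// A record in a network-model sketch: unlike a tree node,
// it may have several parents as well as several children.
struct Record {
    std::string name;
    std::vector<Record*> parents;
    std::vector<Record*> children;
};

// Links `child` under `parent`, recording the relationship in both directions.
void link(Record& parent, Record& child) {
    parent.children.push_back(&child);
    child.parents.push_back(&parent);
}
```

Linking a Student record under both a "CSE Department" record and a "Library" record leaves the student with two entries in parents, which a strict tree could not express.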
Features of a Network Model
3. Relational Model
The relational model is the most widely used model. In this model, the
data is maintained in the form of two-dimensional tables. All the
information is stored in rows and columns. The basic
structure of the relational model is the table, so tables are also
called relations in the relational model. Example: an Employee table.
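As a rough illustration of rows and columns, an Employee relation can be modelled in C++ as a list of rows that all share the same columns. The column names (id, name, department) and the selectByDepartment function are assumptions made for this sketch; the original example table is not reproduced here.

```cpp
#include <string>
#include <vector>

// One row (tuple) of a hypothetical Employee relation;
// each member corresponds to a column (attribute).
struct Employee {
    int id;
    std::string name;
    std::string department;
};

// The relation itself: a collection of rows sharing the same columns.
using EmployeeTable = std::vector<Employee>;

// Selection: returns the rows whose department column equals `dept`.
EmployeeTable selectByDepartment(const EmployeeTable& table,
                                 const std::string& dept) {
    EmployeeTable result;
    for (const Employee& row : table) {
        if (row.department == dept) result.push_back(row);
    }
    return result;
}
```

A real relational DBMS would express the same selection declaratively in its query language instead of with a hand-written loop.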
Features of Relational Model
2. Methods –
When a message is passed, the body of code that is executed is
known as a method. Whenever a method is executed, it returns a value
as output. A method can be of two types:
Read-only method: When the value of a variable is not affected by a
method, it is known as a read-only method.
Update method: When the value of a variable is changed by a method,
it is known as an update method.
3. Variables –
Variables store the data of an object. The data stored in the variables makes
objects distinguishable from one another.
2. Object Classes:
An object, which represents a real-world entity, is an instance of a class. Hence we first
need to define a class, and then objects are created that differ in the
values they store but share the same class definition. The objects in turn
respond to the messages and hold the variables defined in their class.
Coding example:
// Create a Car class with some attributes
#include <string>
using namespace std;

class Car {
public:
    string brand;
    string model;
    int year;
};

int main() {
    // Create an object of Car and set its attributes
    Car carObj1;
    carObj1.brand = "BMW";
    carObj1.model = "X5";
    carObj1.year = 1999;
    return 0;
}
5. Object-Relational Model
An object-relational model is a combination of an object-oriented database model
and a relational database model. So, it supports objects, classes, inheritance etc. just
like object-oriented models and has support for data types, tabular structures etc.
like the relational data model.
One of the major goals of the object-relational data model is to close the gap between
relational databases and the object-oriented practices frequently used in many
programming languages such as C++, C#, Java etc.
Example
A security system, for example, might use a table with two columns named
name and password. Each row records one account and its password.
In the flat model, no two entries are the same. The flat model stores the
database in table format. Because such a vast collection of entries is
hard to manage in a single table, the flat database model is poorly suited
to storing large amounts of data in one 2D array.
7. Semi-Structured Model
The semi-structured model is an evolved form of the relational model. In
this model, we cannot differentiate between data and schema. It is
primarily used to represent data from sources that are not bound or
constrained by a schema. Example: web-based data sources, where we
cannot differentiate between the schema and the data of the
website. In this model, some entities may have missing attributes while
others may have extra attributes. This gives flexibility in storing
the data and also in the attributes themselves. Example: a value stored
in an attribute can be either an atomic value or a collection of
values.
1. The world cup is being hosted by London. The source here is 'the
world cup', the verb is 'is being hosted by' and the target is 'London'.
2. ...from 30 May 2020. The source here is the previous link, the verb
is 'from' and the target is '30 May 2020'.
This is represented using the table as follows:
9. Context Data Model
The context data model is a collection of several models, such as the
network model and the relational model. Using this model, we
can do various types of tasks which are not possible using any single model
alone.
Various methods have been introduced to organize files. Each method has
advantages and disadvantages depending on the kind of access or selection required. Thus
it is up to the programmer to decide the best-suited file organization method according
to the requirements.
Some types of file organization are:
Sequential File Organization
Heap File Organization
Hash File Organization
B+ Tree File Organization
Cluster File Organization
Each of these is discussed below, along with its differences, advantages and
disadvantages.
The easiest method of file organization is the sequential method. In this method, the
records are stored one after another in a sequential manner. There are two ways to
implement this method:
Pile File Method – This method is quite simple: we store the records in a
sequence, i.e. one after another in the order in which they are inserted into the tables.
Heap file organization works with data blocks. In this method, records are inserted at
the end of the file, into the data blocks. No sorting or ordering is required in this
method. If a data block is full, the new record is stored in some other block. This
other data block need not be the very next data block; it can be any block in
memory. It is the responsibility of the DBMS to store and manage the new records.
If we want to search, delete or update data in heap file organization, we have to
traverse the data from the beginning of the file until we reach the requested record. Thus
if the database is very large, searching, deleting or updating a record will take a lot of
time.
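The cost of searching a heap file can be seen in a small C++ sketch. Everything here is a simplification made for illustration: the Record layout, the HeapFile struct, and the rule that a full last block is followed by a new block are assumptions, whereas a real DBMS may place the record in any block with free space.

```cpp
#include <cstddef>
#include <vector>

struct Record { int key; int value; };  // hypothetical record layout

// A heap file sketch: fixed-capacity blocks, records appended without ordering.
struct HeapFile {
    std::size_t blockCapacity;
    std::vector<std::vector<Record>> blocks;

    // Insert at the end: use the last block if it has room, otherwise start a new one.
    void insert(const Record& r) {
        if (blocks.empty() || blocks.back().size() == blockCapacity)
            blocks.emplace_back();
        blocks.back().push_back(r);
    }

    // Search scans from the first block; returns nullptr when the key is absent.
    // `scanned` reports how many records were examined.
    const Record* search(int key, std::size_t& scanned) const {
        scanned = 0;
        for (const auto& block : blocks) {
            for (const Record& r : block) {
                ++scanned;
                if (r.key == key) return &r;
            }
        }
        return nullptr;
    }
};
```

Insertion touches only the last block, while a search for a missing key examines every record, which is why this organization becomes slow for large databases.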
Pros and Cons of Heap File Organization –
Pros –
Fetching and retrieving records is faster than in sequential file organization, but only
for small databases.
When a huge amount of data needs to be loaded into the database at one time,
this method of file organization is best suited.
Cons –
Problem of unused memory blocks.
Inefficient for larger databases.
Static Hashing:
2. Closed hashing – In the closed hashing method, a new data bucket is
allocated with the same address and is linked after the full data bucket. This
method is also known as overflow chaining. For example, we have to
insert a new record D3 into the tables. The static hash function generates
the data bucket address 105, but this bucket is too full to store the new
data. In this case, a new data bucket is added at the end of data bucket 105
and is linked to it. The new record D3 is then inserted into the new
bucket.
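Overflow chaining as described above can be sketched in C++ as follows. The names StaticHashFile, Bucket and chainLength are hypothetical, and the hash function (key modulo the number of buckets, for non-negative integer keys) is an assumption chosen to keep the example short.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// A bucket holds a limited number of records and may link to an overflow bucket.
struct Bucket {
    std::vector<int> records;
    std::unique_ptr<Bucket> overflow;
};

struct StaticHashFile {
    std::size_t bucketCapacity;
    std::vector<Bucket> buckets;  // the number of primary buckets never changes

    StaticHashFile(std::size_t numBuckets, std::size_t capacity)
        : bucketCapacity(capacity), buckets(numBuckets) {}

    // Assumed hash function: key modulo the number of primary buckets.
    std::size_t address(int key) const {
        return static_cast<std::size_t>(key) % buckets.size();
    }

    // Walk the chain for the key's address; add an overflow bucket if all are full.
    void insert(int key) {
        Bucket* b = &buckets[address(key)];
        while (b->records.size() == bucketCapacity) {
            if (!b->overflow) b->overflow.reset(new Bucket());
            b = b->overflow.get();
        }
        b->records.push_back(key);
    }

    // Number of buckets (primary plus overflow) chained at the key's address.
    std::size_t chainLength(int key) const {
        std::size_t len = 0;
        for (const Bucket* b = &buckets[address(key)]; b; b = b->overflow.get())
            ++len;
        return len;
    }
};
```

With 5 primary buckets of capacity 2, the keys 105, 110 and 115 all hash to the same address; the third insert finds the primary bucket full and lands in a newly linked overflow bucket.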
Dynamic Hashing –
The drawback of static hashing is that it does not expand or shrink
dynamically as the size of the database grows or shrinks. In dynamic
hashing, data buckets are added or removed dynamically as
the number of records increases or decreases. Dynamic hashing is also known
as extended hashing. In dynamic hashing, the hash function is made to
produce a large number of values. For example, there are three data records
D1, D2 and D3, for which the hash function generates the addresses 1001, 0101
and 1010 respectively. This method considers only part of the
address, initially only the first bit, to store the data. So it tries to load
the three records at addresses 0 and 1.
The problem is that no bucket address remains for D3. The buckets
have to grow dynamically to accommodate D3. So the method changes the address
to use 2 bits rather than 1 bit, updates the existing data to use 2-bit
addresses, and then tries to accommodate D3.
B+ Tree File Organization –
B+ Tree file organization, as the name suggests, uses a tree-like structure to store
records in a file. It uses the concept of key indexing, where the primary key is used to
sort the records. For each primary key, an index value is generated and mapped to the
record. The index of a record is the address of the record in the file.
A B+ Tree is very similar to a binary search tree, with the difference that a node can
have more than two children. All the information is stored in the
leaf nodes, and the intermediate nodes act as pointers to the leaf nodes. The information
in the leaf nodes always remains a sorted sequential linked list.
In the diagram, 56 is the root node, which is also called the main node of the
tree.
The intermediate nodes contain only the addresses of leaf nodes. They do not
contain any actual records. Leaf nodes contain the actual records. All leaf nodes are
at the same level, which keeps the tree balanced.
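Because the leaf nodes form a sorted linked list, a range of keys can be read by walking from leaf to leaf without going back up the tree. The C++ sketch below shows only that leaf-level walk; the Leaf struct and rangeScan function are hypothetical names, and locating the first leaf through the intermediate nodes is left out.

```cpp
#include <vector>

// A B+ tree leaf in this sketch: sorted keys plus a link to the next leaf.
struct Leaf {
    std::vector<int> keys;
    const Leaf* next = nullptr;
};

// Collects every key in [low, high], walking the leaf chain from `first`.
std::vector<int> rangeScan(const Leaf* first, int low, int high) {
    std::vector<int> found;
    for (const Leaf* leaf = first; leaf != nullptr; leaf = leaf->next) {
        for (int key : leaf->keys) {
            if (key > high) return found;  // keys are sorted, nothing further can match
            if (key >= low) found.push_back(key);
        }
    }
    return found;
}
```

The scan stops as soon as it sees a key above the upper bound, which is safe only because keys are sorted both within each leaf and across the chain.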
Pros and Cons of B+ Tree File Organization –
Pros –
In cluster file organization, two or more related tables/records are stored within the same
file, known as a cluster. These files have two or more tables in the same data block,
and the key attributes which are used to map these tables together are stored only once.
This lowers the cost of searching and retrieving related records from different files, as
they are now combined and kept in a single cluster.
For example, we have two tables or relations, Employee and Department. These tables
are related to each other.
Therefore these tables can be combined using a join operation and stored together in
a cluster file.
If we have to insert, update or delete any record, we can do so directly. Data is sorted
based on the primary key or the key with which searching is done. The cluster key is the
key with which the joining of the tables is performed.
Types of Cluster File Organization – There are two ways to implement this
method:
1. Indexed Clusters –
In indexed clustering, the records are grouped based on the cluster key and stored
together. The above-mentioned example of the Employee and Department
relationship is an example of an indexed cluster, where the records are grouped by
Department ID.
2. Hash Clusters –
This is very similar to the indexed cluster, with the only difference that instead of
storing the records based on the cluster key, we generate a hash key value and store
together the records with the same hash key value.
1. Search O(log n)
2. Insert O(log n)
3. Delete O(log n)
Traversal in B-Tree:
Traversal is similar to inorder traversal of a binary tree. We start from the
leftmost child, recursively print the leftmost child, then repeat the same
process for the remaining children and keys. In the end, we recursively print the
rightmost child.
Search Operation in B-Tree:
Search is similar to search in a binary search tree. Let the key to be
searched be k. We start from the root and recursively traverse down. For
every visited non-leaf node, if the node has the key, we simply return the
node. Otherwise, we recur down to the appropriate child (the child which is
just before the first greater key) of the node. If we reach a leaf node and
don't find k in the leaf node, we return NULL.
Logic:
Searching a B-Tree is similar to searching a binary search tree, and the
algorithm is recursive. At each level, the search is narrowed: if
the key is not within the range covered by one branch, it can only be in
another branch. Because the keys in a node limit the search in this way, they are
also known as limiting values or separation values. If we reach a leaf node and
don't find the desired key, the search returns NULL.
Algorithm for Searching an Element:-
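BtreeSearch(x, k)
    i = 1
    // n[x] is the number of keys in node x
    while i ≤ n[x] and k > keyi[x]
        do i = i + 1
    if i ≤ n[x] and k = keyi[x]
        then return (x, i)
    if leaf[x]
        then return NIL
    else
        return BtreeSearch(ci[x], k)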
Example: Searching 120 in the given B-Tree.
Solution:
In this example, the search is shortened by limiting it to the part of the tree
where the key could be present. Similarly, if we had to look for 180 in
the above example, control would stop at step 2 because the program would
find that key 180 is present in the current node. And if we had to look
for 90, then since 90 < 100 the search would go to the left subtree, and the
control flow would continue as shown in the above example.
C++
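#include <iostream>
using namespace std;

// A BTree node
class BTreeNode
{
    int *keys;      // Array of keys
    int t;          // Minimum degree (defines the range for the number of keys)
    BTreeNode **C;  // Array of child pointers
    int n;          // Current number of keys
    bool leaf;      // True when the node is a leaf
public:
    BTreeNode(int _t, bool _leaf);
    void traverse();           // Print all keys in the subtree rooted at this node
    BTreeNode *search(int k);  // Returns NULL if k is not present
    // Make the BTree friend of this so that we can access private members
    // of this class in BTree functions
    friend class BTree;
};

// A BTree
class BTree
{
    BTreeNode *root; // Pointer to root node
    int t;           // Minimum degree
public:
    BTree(int _t) { root = NULL; t = _t; }  // Initializes the tree as empty
    void traverse() { if (root != NULL) root->traverse(); }
    BTreeNode* search(int k) { return (root == NULL) ? NULL : root->search(k); }
};

BTreeNode::BTreeNode(int _t, bool _leaf)
{
    t = _t;
    leaf = _leaf;
    // Allocate memory for maximum number of possible keys and child pointers
    keys = new int[2*t - 1];
    C = new BTreeNode *[2*t];
    n = 0;
}

void BTreeNode::traverse()
{
    // There are n keys and n+1 children: visit child i before printing key i
    int i;
    for (i = 0; i < n; i++)
    {
        if (leaf == false)
            C[i]->traverse();
        cout << " " << keys[i];
    }
    // Visit the subtree rooted at the last child
    if (leaf == false)
        C[i]->traverse();
}

BTreeNode *BTreeNode::search(int k)
{
    // Find the first key greater than or equal to k
    int i = 0;
    while (i < n && k > keys[i])
        i++;
    // Check i < n first so that keys[i] is never read past the last key
    if (i < n && keys[i] == k)
        return this;
    if (leaf == true)
        return NULL;  // A leaf has no children left to search
    return C[i]->search(k);
}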
The above code does not contain a driver program.
There are two conventions for defining a B-Tree: one is to define it by minimum
degree (followed in the Cormen book), the other is to define it by order. The code
above follows the minimum degree convention. The variable names used in the program
are also kept the same as in the Cormen book for better readability.
Applications of B-Trees:
It is used in large databases to access data stored on disk.
Searching a data set can be achieved in significantly less time
using a B-Tree.
With the indexing feature, multilevel indexing can be achieved.
Many servers also use the B-Tree approach.
References:
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein,
Introduction to Algorithms, 3rd Edition.
Pros of ISAM:
o In this method, each record has the address of its data block, so searching for a
record in a huge database is quick and easy.
o This method supports range retrieval and partial retrieval of records. Since the index
is based on the primary key values, we can retrieve the data for a given range of
values. In the same way, a partial value can also be searched easily, e.g., the student
names starting with 'JA'.
Cons of ISAM:
o This method requires extra space on the disk to store the index values.
o When new records are inserted, the files have to be reconstructed to maintain the
sequence.
o When a record is deleted, the space used by it needs to be released.
Otherwise, the performance of the database will slow down.
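Range and partial retrieval both follow from the index being kept in key order. The
sketch below illustrates this with an in-memory sorted map standing in for the ISAM
index; the Index type, the function names and the block addresses are assumptions
made for illustration, not part of any real ISAM implementation.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical index: primary key (student name) -> address of its data block.
using Index = std::map<std::string, int>;

// Range retrieval: block addresses of all keys in [low, high], in key order.
std::vector<int> rangeLookup(const Index& index, const std::string& low,
                             const std::string& high) {
    std::vector<int> blocks;
    for (auto it = index.lower_bound(low); it != index.end() && it->first <= high; ++it)
        blocks.push_back(it->second);
    return blocks;
}

// Partial retrieval: all keys that start with the given prefix.
std::vector<std::string> prefixLookup(const Index& index, const std::string& prefix) {
    std::vector<std::string> names;
    // Keys sharing a prefix are adjacent in a sorted index, so the scan can
    // start at the first key >= prefix and stop at the first mismatch.
    for (auto it = index.lower_bound(prefix);
         it != index.end() && it->first.compare(0, prefix.size(), prefix) == 0; ++it)
        names.push_back(it->first);
    return names;
}
```

With keys ALICE, JACK, JANE, JOHN and MARY in the index, prefixLookup for "JA" visits
only JACK and JANE, and rangeLookup from "JACK" to "JOHN" visits three entries.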
File Access Methods in Operating System
When a file is used, its information is read into computer memory, and there are
several ways to access this information. Some systems provide only one access method
for files. Other systems, such as those of IBM, support many access methods, and
choosing the right one for a particular application is a major design problem.
There are three ways to access a file in a computer system: sequential access,
direct access, and the index sequential method.
1. Sequential Access –
It is the simplest access method. Information in the file is processed in order, one
record after the other. This mode of access is by far the most common; for
example, editors and compilers usually access files in this fashion.
Reads and writes make up the bulk of the operations on a file. A read operation (read
next) reads the next portion of the file and automatically advances a file pointer,
which keeps track of the I/O location. Similarly, a write operation (write next)
appends to the end of the file and advances the pointer to the end of the newly
written material.
Key points:
Data is accessed one record after another, in order.
A read command moves the pointer ahead by one record.
A write command allocates space and moves the pointer to the end of the file.
Such a method is reasonable for tape.
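The read next and write next operations can be modelled with a single file pointer. The
class below is a minimal in-memory sketch; SequentialFile and its member names are
hypothetical and do not correspond to any operating system API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a sequentially accessed file.
class SequentialFile {
    std::vector<std::string> records_;
    std::size_t pos_ = 0;  // file pointer: index of the next record to read
public:
    // "read next": copies the record at the pointer into out and advances by one.
    // Returns false at end of file, leaving out untouched.
    bool readNext(std::string& out) {
        if (pos_ >= records_.size()) return false;
        out = records_[pos_++];
        return true;
    }
    // "write next": appends at the end and moves the pointer past the new record.
    void writeNext(const std::string& record) {
        records_.push_back(record);
        pos_ = records_.size();
    }
    // Moves the pointer back to the first record, like rewinding a tape.
    void reset() { pos_ = 0; }
    std::size_t position() const { return pos_; }
};
```

There is deliberately no operation that jumps to record k: the only way to reach it is
to reset and call readNext k + 1 times, which is why the method suits tape.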
2. Direct Access –
Another method is the direct access method, also known as the relative access method.
The file is made up of fixed-length logical records that allow programs to read and
write records rapidly, in no particular order. Direct access is based on the disk model
of a file, since a disk allows random access to any file block. For direct access, the
file is viewed as a numbered sequence of blocks or records. Thus, we may read block 14,
then block 59, and then write block 17. There is no restriction on the order of reading
and writing for a direct access file.
A block number provided by the user to the operating system is normally a relative
block number: the first relative block of the file is 0, the next is 1, and so on.
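Because the records have a fixed length, the position of relative block n can be
computed directly instead of being reached by reading the blocks before it. The
following is a minimal sketch under that assumption; DirectFile, its member names and
the block size used in the example are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical in-memory file made of fixed-length blocks.
class DirectFile {
    std::size_t blockSize_;
    std::vector<char> bytes_;
public:
    DirectFile(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(blockSize), bytes_(blockSize * blockCount, 0) {}

    // Byte offset of relative block n (the first block is number 0).
    std::size_t offsetOf(std::size_t n) const {
        if ((n + 1) * blockSize_ > bytes_.size())
            throw std::out_of_range("no such block");
        return n * blockSize_;
    }
    // Fills block n with one byte value; no other block is touched or visited.
    void writeBlock(std::size_t n, char fill) {
        std::size_t start = offsetOf(n);
        for (std::size_t i = 0; i < blockSize_; ++i) bytes_[start + i] = fill;
    }
    // Returns the first byte of block n (enough to show the block was reached).
    char readBlock(std::size_t n) const { return bytes_[offsetOf(n)]; }
};
```

With 512-byte blocks, block 14 starts at byte 7168, and reading block 14, then block 59,
then writing block 17 costs one offset calculation each, in any order.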
Advantages of linked allocation:
Linked allocation is very flexible in terms of file size. File size can be increased
easily since the system does not have to look for a contiguous chunk of space.
This method does not suffer from external fragmentation. This makes it relatively
better in terms of space utilization.
Disadvantages of linked allocation:
Because the file blocks are distributed randomly on the disk, a large number of
seeks are needed to access every block individually. This makes linked allocation
slower.
It does not support random or direct access. We cannot directly access the blocks
of a file. Block k of a file can only be reached by traversing k blocks sequentially
(sequential access) from the starting block of the file via the block pointers.
The pointers required in linked allocation incur some extra overhead.
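The cost of reaching block k under linked allocation can be made concrete with a small
sketch. Here a table named next stands in for the pointer stored inside each disk block;
the table, the END_OF_FILE marker and the blockAt function are assumptions made for
illustration.

```cpp
#include <vector>

const int END_OF_FILE = -1;

// next[b] is the disk block that follows disk block b in its file, or END_OF_FILE.
// Returns the disk block holding logical block k of the file that begins at start.
// If hops is given, it receives the number of pointers that had to be followed.
int blockAt(const std::vector<int>& next, int start, int k, int* hops = nullptr) {
    int current = start;
    int followed = 0;
    while (current != END_OF_FILE && followed < k) {
        current = next[current];  // one pointer traversal, i.e. one more disk read
        ++followed;
    }
    if (hops != nullptr) *hops = followed;
    return current;
}
```

For a file stored in disk blocks 9, 16, 1, 10 and 25, in that order, reaching logical
block 3 means following three pointers before disk block 10 is found.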
3. Indexed Allocation
In this scheme, a special block known as the index block contains the pointers to all
the blocks occupied by a file. Each file has its own index block. The ith entry in the
index block contains the disk address of the ith file block. The directory entry contains
the address of the index block.
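The two-step lookup, from directory entry to index block to file block, can be sketched
as follows. IndexBlock, DirectoryEntry and lookup are hypothetical names, and a vector
indexed by disk address stands in for the disk.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structures: the index block lists the disk address of every
// block of one file; the directory entry stores only the index block's address.
struct IndexBlock { std::vector<int> diskAddress; };
struct DirectoryEntry { int indexBlockAddress; };

// disk[a] plays the role of "the index block stored at disk address a".
// Returns the disk address of the ith block of the file, or -1 past its end.
int lookup(const std::vector<IndexBlock>& disk, const DirectoryEntry& entry,
           std::size_t i) {
    const IndexBlock& index = disk[entry.indexBlockAddress];
    if (i >= index.diskAddress.size()) return -1;
    return index.diskAddress[i];  // a single table lookup, no chain to follow
}
```

Unlike linked allocation, the cost of finding block i does not depend on i.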
Advantages:
This supports direct access to the blocks occupied by the file and therefore
provides fast access to the file blocks.
It overcomes the problem of external fragmentation.
Disadvantages:
The pointer overhead for indexed allocation is greater than for linked allocation.
For very small files, say files that span only 2-3 blocks, indexed allocation
keeps one entire block (the index block) for the pointers, which is inefficient in
terms of space utilization. In linked allocation, by contrast, we lose the space of
only 1 pointer per block.
For files that are very large, a single index block may not be able to hold all the
pointers.
The following mechanisms can be used to resolve this:
1. Linked scheme: This scheme links two or more index blocks together for holding
the pointers. Every index block then contains a pointer to, or the address of, the
next index block.
2. Multilevel index: In this policy, a first-level index block is used to point to the
second-level index blocks, which in turn point to the disk blocks occupied by the
file. This can be extended to 3 or more levels depending on the maximum file size.
3. Combined scheme: In this scheme, a special block called the inode (index
node) contains all the information about the file, such as the name, size, authority,
etc., and the remaining space of the inode is used to store the addresses of the disk
blocks that contain the actual file. The first few of these pointers in the inode
point to the direct blocks, i.e., the pointers contain the addresses of the disk
blocks that contain the data of the file. The next few pointers point to indirect
blocks. Indirect blocks may be single indirect, double indirect or triple
indirect. A single indirect block is a disk block that does not contain the file data
but the disk addresses of the blocks that contain the file data. Similarly, double
indirect blocks do not contain the file data but the disk addresses of the blocks that
contain the addresses of the blocks containing the file data.
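Which kind of pointer serves a given file block under the combined scheme is a matter of
simple arithmetic. The sketch below assumes one particular layout, chosen only for
illustration: a number of direct pointers followed by exactly one single, one double and
one triple indirect pointer, with every indirect block holding the same number of
addresses. The names Level and levelOf are hypothetical.

```cpp
#include <cstdint>

enum class Level { Direct, SingleIndirect, DoubleIndirect, TripleIndirect, TooLarge };

// n        : logical block number within the file, starting at 0
// direct   : number of direct pointers in the inode (assumed layout)
// perBlock : number of disk addresses that fit in one indirect block
Level levelOf(std::uint64_t n, std::uint64_t direct, std::uint64_t perBlock) {
    if (n < direct) return Level::Direct;
    n -= direct;
    if (n < perBlock) return Level::SingleIndirect;            // one extra block read
    n -= perBlock;
    if (n < perBlock * perBlock) return Level::DoubleIndirect;  // two extra reads
    n -= perBlock * perBlock;
    if (n < perBlock * perBlock * perBlock) return Level::TripleIndirect;
    return Level::TooLarge;  // beyond the maximum file size of this layout
}
```

With 12 direct pointers and 1024 addresses per indirect block, logical blocks 0 to 11 are
direct, blocks 12 to 1035 go through the single indirect block, and block 1036 is the
first to need the double indirect block.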
This article is contributed by Saloni Baweja.
DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture
is used to deal with a large number of PCs, web servers, database servers and other
components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get
their request done.
1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user
can sit directly on the DBMS and use it.
o Any changes done here are made directly on the database itself. It doesn't provide
a handy tool for end users.
o The 1-Tier architecture is used for the development of local applications, where
programmers can communicate directly with the database for a quick response.
2-Tier Architecture
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In
this architecture, the client can't communicate directly with the server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application
server. The database also has no idea about any user beyond the application.
o The 3-Tier architecture is used in the case of large web applications.
Fig: 3-tier Architecture
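The separation between the three tiers can be sketched in code by letting each tier hold
a reference only to the tier directly below it. This is a toy in-process model, not a
networked system; the class names, the GET_NAME request and the reply strings are all
assumptions made for illustration.

```cpp
#include <map>
#include <string>

// Data tier: only the application server talks to it.
class Database {
    std::map<int, std::string> rows_;
public:
    void put(int id, const std::string& name) { rows_[id] = name; }
    bool get(int id, std::string& out) const {
        auto it = rows_.find(id);
        if (it == rows_.end()) return false;
        out = it->second;
        return true;
    }
};

// Application tier: holds the logic and hides the database from clients.
class ApplicationServer {
    Database& db_;
public:
    explicit ApplicationServer(Database& db) : db_(db) {}
    std::string handle(const std::string& request, int id) {
        std::string name;
        if (request == "GET_NAME" && db_.get(id, name)) return "OK " + name;
        return "ERROR";
    }
};

// Client tier: knows the application server, never the database.
class Client {
    ApplicationServer& server_;
public:
    explicit Client(ApplicationServer& server) : server_(server) {}
    std::string showName(int id) { return server_.handle("GET_NAME", id); }
};
```

The Client class has no member or parameter of type Database, so it cannot reach the data
except through the application server, which mirrors the rule stated above.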
1. Internal Level
o The internal level has an internal schema which describes the physical storage
structure of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It defines how the data will be stored in
a block.
o The physical level is used to describe complex low-level data structures in detail.
2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual level.
The conceptual level is also known as the logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also
describes what relationships exist among those data.
o At the conceptual level, internal details such as the implementation of the data
structures are hidden.
o Programmers and database administrators work at this level.
3. External Level
o At the external level, a database contains several schemas, sometimes called
subschemas. A subschema describes a different view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is
interested in and hides the remaining database from that user group.
o The view schema describes the end user's interaction with the database system.
The conceptual/internal mapping lies between the conceptual level and the internal
level. Its role is to define the correspondence between the records and fields of the
conceptual level and the files and data structures of the internal level.
The external/conceptual mapping lies between the external level and the conceptual
level. Its role is to define the correspondence between a particular external view and
the conceptual view.
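The two mappings can be pictured as two small conversion functions. The sketch below
assumes a made-up storage format, one text line per student with fields separated by a
vertical bar, and a made-up user group that may not see marks; Student,
StudentNameView, fromStorage and toView are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Internal level (assumed storage format): one line such as "17|Asha|91".

// Conceptual level: the logical record shared by the whole database.
struct Student {
    int id;
    std::string name;
    int marks;
};

// External level: the view for a user group that must not see marks.
struct StudentNameView {
    int id;
    std::string name;
};

// Conceptual/internal mapping: stored line -> conceptual record.
Student fromStorage(const std::string& line) {
    std::size_t a = line.find('|');         // end of the id field
    std::size_t b = line.find('|', a + 1);  // end of the name field
    return Student{std::stoi(line.substr(0, a)),
                   line.substr(a + 1, b - a - 1),
                   std::stoi(line.substr(b + 1))};
}

// External/conceptual mapping: conceptual record -> external view.
StudentNameView toView(const Student& s) { return StudentNameView{s.id, s.name}; }
```

If the storage format changes, only fromStorage has to be rewritten; Student, toView and
every program written against StudentNameView stay as they are.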
Components of DBMS
DBMS stands for Database Management System. A DBMS is software with which we can
store and retrieve users' data securely. It manipulates the database with the help of
a group of programs. It accepts requests for data from the operating system, and it
also accepts requests to retrieve large amounts of data from users and third-party
software.
The DBMS also lets users use the data according to their needs, subject to the
permissions they have been given. The term "DBMS" covers both the database programs
and their users. It also provides an interface between the user and the database. In
this topic, we are going to discuss the various components of a DBMS.
There are many components in a DBMS, and each component has a significant task. A
database environment is a collection of components that regulates the use,
management, and grouping of data. These components consist of people, the techniques
for handling the database, data, hardware, software, etc. We are going to explain
the five main components of the database below.
1. Hardware
o Hardware means the physical part of the DBMS. It includes output devices like a
printer and monitor, and storage devices like a hard disk.
o Hardware is the most visible part of the DBMS. Equipment such as the printer,
computer, and scanner is used to capture the data and present the output to the user.
o With the help of hardware, the DBMS can access and update the database.
o The server can store a large amount of data, which can be shared with the help of the
user's own system.
o The database can run on any system, ranging from microcomputers to mainframe
computers. The hardware also provides an interface between the real world and the
database.
o When we run database software like MySQL, we type commands using the keyboard, and
the RAM, ROM, and processor of the computer system carry them out.
2. Software
3. Data
o The term data means a collection of raw facts stored in the database. Data are any
type of raw material from which meaningful information is generated.
o The database can store any form of data, such as structured data, unstructured data,
and logical data.
o Structured data are highly specific and have a structured format. Unstructured
data, in contrast, are a collection of different types of data that are stored in
their native format.
o We also call the database the structure of the DBMS. With the help of the database,
we can create and construct the DBMS. After the creation of the database, we can
create, access, and update that database.
o The main reason for creating the database is to create and manage the data
within it.
o Data is the most important part of the DBMS. The database contains the actual
data and metadata. Metadata means data about data.
o For example, when the user stores data in a database, some details, such as the size
of the data, the name of the data, and some data related to the user, are stored
within the database. These data are called metadata.
4. Procedures
o A procedure is a set of general instructions or guidelines for the use of the DBMS.
These instructions include how to set up the database, how to install the database,
how to log in and log out of the database, how to manage the database, how to take a
backup of the database, and how to generate reports from the database.
o In a DBMS, with the help of procedures, we can validate the data, control access, and
reduce the traffic between the server and the clients. The DBMS can offer better
performance for extensive or complex business logic when the user follows all the
procedures correctly.
o The main purpose of a procedure is to guide the user during the management and
operation of the database.
o A database procedure is similar to a database function. The major difference is
that a database function is used inside an SQL statement like any other expression,
whereas a database procedure is invoked using the CALL statement of the DBMS.
o Database procedures can be created in two ways in enterprise architecture. These
two ways are as below.
o The individual object or the default object.
o The operations in a container.
5. Database Access Language
o Database Access Language is a simple language that allows users to write commands
to perform the desired operations on the data that is stored in the database.
o Database Access Language is a language used to write commands to access, update,
and delete data stored in a database.
o Users can write commands or queries in the Database Access Language and then
submit them to the database for execution.
o Using the language, users can create new databases and tables, insert
data and delete data.
o The best-known database language is SQL (Structured Query Language), which is
supported by DBMS products such as MS Access and Oracle. A database language
comprises two languages.
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at
one level of the database system without altering the schema at the next higher level.
For example, suppose we design a school database. In this database, the student
will be an entity with attributes like address, name, id, age, etc. The address can be
another entity with attributes like city, street name, pin code, etc., and there will
be a relationship between them.
Components of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity
doesn't contain any key attribute of its own. The weak entity is represented by a
double rectangle.
2. Attribute
An attribute is used to describe a property of an entity. An ellipse is used to
represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
c. Multivalued Attribute
An attribute can have more than one value. Such an attribute is known as a
multivalued attribute. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from another attribute is known as a derived
attribute. It is represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another
attribute like date of birth.
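A derived attribute is computed when it is needed rather than stored. The sketch below
derives age from a stored date of birth; the Date and Person structs and the age
function are hypothetical, and the current date is passed in as a parameter so that the
calculation does not depend on the system clock.

```cpp
struct Date {
    int year;
    int month;
    int day;
};

struct Person {
    Date dateOfBirth;  // stored attribute
    // age is not stored; it is derived on demand by age() below
};

// Derived attribute: completed years between the date of birth and today.
int age(const Person& p, const Date& today) {
    int years = today.year - p.dateOfBirth.year;
    bool birthdayPassed =
        today.month > p.dateOfBirth.month ||
        (today.month == p.dateOfBirth.month && today.day >= p.dateOfBirth.day);
    return birthdayPassed ? years : years - 1;
}
```

For a person born on 15 June 2000, the function gives 23 on 14 June 2024 and 24 one day
later, without any stored value having to be updated.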
3. Relationship
A relationship is used to describe the relation between entities. A diamond or rhombus
is used to represent the relationship.
Types of relationship are as follows:
a. One-to-One Relationship
When only one instance of each entity is associated with the relationship, it is
known as a one-to-one relationship.
For example, a female can marry one male, and a male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left and more than one instance of the
entity on the right are associated with the relationship, it is known as a
one-to-many relationship.
For example, a scientist can invent many inventions, but each invention is made by
one specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the
entity on the right are associated with the relationship, it is known as a many-to-one
relationship.
For example, a student enrolls in only one course, but a course can have many
students.
d. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance
of the entity on the right are associated with the relationship, it is known as a
many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have
many employees.
Cardinality
Cardinality describes how the entities are related to each other, that is, the
relationship structure between the entities in a relationship set. In a Database
Management System, cardinality is a number that denotes how many times
an entity participates with another entity in a relationship set. Cardinality is a
very important attribute in representing the structure of a database. In a
table, the number of rows or tuples represents the cardinality.
Cardinality Ratio
Cardinality ratio is also called cardinality mapping, and it represents the mapping
of one entity set to another entity set in a relationship set. We generally take the
example of a binary relationship set, where two entity sets are mapped to each other.
There are four types of cardinality ratio:
1. One to one
2. Many to one
3. One to many
4. Many to many
One to One
One-to-one cardinality is represented by the 1:1 symbol. Here, each entity is related
to at most one entity of the other set. There are a lot of examples of one-to-one
cardinality in real-life databases.
For example, one student can have only one student id, and one student id can
belong to only one student. So, the relationship mapping between student and
student id is a one-to-one cardinality mapping.
Another example is the relationship between the director of the school and the
school, because one school can have a maximum of one director, and one director
can belong to only one school.
Note: in one-to-one cardinality, it is not necessary for every entity in an entity set
to be mapped. Some entities may not participate in the mapping.
Many to One
One-to-one cardinality is a subset of many-to-one cardinality. Many-to-one
cardinality is represented by M:1.
For example, there are multiple patients in a hospital who are served by a single
doctor, so the relationship between patients and doctors can be represented by
many-to-one cardinality.
One to Many
In one-to-many cardinality mapping, one entity of set 1 can be related to one or more
entities of set 2, while each entity of set 2 can be related to at most one entity of
set 1.
Many to Many
Many-to-many cardinality is represented by M:N or N:M.
One-to-one, one-to-many, and many-to-one cardinalities are subsets of many-to-many
cardinality.
For example, in a college, multiple students can work on a single project, and a
single student can also work on multiple projects. So, the relationship between the
project and the student can be represented by many-to-many cardinality.
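The four cardinality ratios can be told apart by counting partners on each side of a
binary relationship set. The sketch below classifies a list of (entity of set 1, entity
of set 2) pairs; the Ratio enum, the Pair alias and the cardinalityRatio function are
hypothetical names. It reports the tightest ratio that the given pairs exhibit, which
may be narrower than what the real-world rules allow.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Ratio { OneToOne, OneToMany, ManyToOne, ManyToMany };

using Pair = std::pair<std::string, std::string>;  // (entity of set 1, entity of set 2)

Ratio cardinalityRatio(const std::vector<Pair>& relationship) {
    // Collect the distinct partners of every entity on each side.
    std::map<std::string, std::set<std::string>> rightsOf, leftsOf;
    for (const Pair& p : relationship) {
        rightsOf[p.first].insert(p.second);
        leftsOf[p.second].insert(p.first);
    }
    bool leftHasMany = false;   // some set 1 entity has several set 2 partners
    bool rightHasMany = false;  // some set 2 entity has several set 1 partners
    for (const auto& kv : rightsOf)
        if (kv.second.size() > 1) leftHasMany = true;
    for (const auto& kv : leftsOf)
        if (kv.second.size() > 1) rightHasMany = true;

    if (leftHasMany && rightHasMany) return Ratio::ManyToMany;
    if (leftHasMany) return Ratio::OneToMany;
    if (rightHasMany) return Ratio::ManyToOne;
    return Ratio::OneToOne;
}
```

Two patients served by the same doctor classify as many-to-one, and a student on two
projects together with a project that has two students classifies as many-to-many.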
Appropriate Mapping Cardinality
The real-world context in which the relationship set is modeled determines the
appropriate mapping cardinality for a specific relationship set.
o If the cardinality is one-to-many or many-to-one, the relationship table can be
combined with the table of the entity on the "many" side.
o If the cardinality is one-to-one, the relationship table can be combined with one
entity table when that entity has total participation, and both entity tables can be
combined with the relationship to form a single table when both have total
participation.
o If the cardinality is many-to-many, no two tables can be combined.