Introduction
Data and Information
1. Data:
● Data refers to raw, unprocessed facts and figures that have no specific meaning
on their own.
● Data is a collection of a distinct small unit of information.
● It can be in the form of numbers, text, images, audio, or video.
● Example: "John, 25, Male, 9876543210" is just data without context.
2. Information:
● Information is processed, organized, or structured data that has meaning and is
useful for decision-making.
● When data is analyzed and presented in a meaningful way, it becomes
information.
● Example: "John is a 25-year-old male with a phone number 9876543210" is
meaningful information.
Concept of Persistent Data
Persistent data refers to data that is stored in a non-volatile storage medium (such as a
database, hard drive, or cloud storage) and remains available even after the system is
powered off or restarted. Unlike temporary data, which exists only during the execution
of a program, persistent data is retained permanently until it is modified or deleted.
Characteristics of Persistent Data:
1. Long-term Storage: It remains stored even after the program or system stops
running.
2. Non-Volatile: Stored in a durable medium like a database, file system, or cloud
storage.
3. Accessible and Retrievable: Can be retrieved and modified when needed.
4. Managed by DBMS: A database management system (DBMS) ensures the
efficient storage, retrieval, and modification of persistent data.
Examples of Persistent Data:
● User account information stored in a database.
● Banking transaction records.
● Website user preferences stored in cookies or databases.
● Documents and files saved on a computer or cloud storage.
File and File Management System
File:
A file is a collection of data or information stored on a computer system. It can contain
text, images, audio, video, or any other type of digital content. Files are identified by a
unique name and an extension (e.g., .txt, .jpg, .mp3).
Types of Files:
● Text Files: .txt, .docx, .pdf (contain readable text)
● Image Files: .jpg, .png, .gif (store pictures)
● Audio Files: .mp3, .wav, .aac (contain sound)
● Video Files: .mp4, .avi, .mkv (store video content)
● Program Files: .exe, .bat, .sh (executable files)
File Management System (FMS):
A File Management System (FMS) is software that manages the creation,
organization, retrieval, and security of files in a computer system. It helps users and
applications store and access files efficiently.
Functions of a File Management System:
1. File Organization: Stores files in directories/folders for easy access.
2. File Access Control: Manages permissions (read, write, execute).
3. File Retrieval: Allows users to search and open files quickly.
4. File Modification: Supports editing and updating files.
5. File Security: Provides encryption and user authentication to protect data.
6. File Sharing: Enables file access over a network or cloud storage.
Examples of File Management Systems:
● Windows File Explorer
● macOS Finder
● Linux File System (ext4, NTFS, FAT32)
● Google Drive, Dropbox (Cloud-based FMS)
Basic File Operations
A file is a collection of data stored on a computer. To manage files effectively, an
operating system provides various file operations. These operations allow users to
create, access, modify, and delete files.
1. Create a File
● This operation allows users to create a new file in a storage system.
● Example:
○ Creating a new text document in Windows (Right-click → New →
Text Document).
○ Using the touch command in Linux (touch filename.txt).
2. Open a File
● A file must be opened before performing any read or write operations.
● Example:
○ Double-clicking a file to open it in an appropriate application.
○ Using the open() function in programming (e.g., file =
open("data.txt", "r") in Python).
3. Read a File
● This operation retrieves the content of a file without modifying it.
● Example:
○ Opening a .txt file to read its content.
○ Using file.read() in Python to read a file’s content.
4. Write to a File
● This operation allows users to add or modify content in a file.
● Example:
○ Editing a document in Notepad or MS Word.
○ Using file.write("Hello, World!") in Python to add content to a
file.
5. Append to a File
● Appending means adding new data to the end of an existing file without deleting
the previous content.
● Example:
○ Adding a new entry to a log file.
○ Using file.write("New data") in append mode ("a") in Python.
6. Close a File
● After performing operations, closing a file ensures that changes are saved and
resources are freed.
● Example:
○ Clicking “Save and Close” in an editor.
○ Using file.close() in Python.
7. Delete a File
● Removes the file permanently from the storage.
● Example:
○ Deleting a file from Windows (Right-click → Delete).
○ Using the rm filename.txt command in Linux.
8. Rename a File
● Changing the name of an existing file.
● Example:
○ Right-clicking a file and selecting "Rename" in Windows.
○ Using the mv oldname.txt newname.txt command in Linux.
9. Copy a File
● Creates a duplicate of a file in another location.
● Example:
○ Copying a file using Ctrl + C and Ctrl + V in Windows.
○ Using cp file1.txt file2.txt in Linux.
10. Move a File
● Transfers a file from one location to another.
● Example:
○ Dragging and dropping a file to a new folder in Windows.
○ Using mv file1.txt /new/location/ in Linux.
File Structure and Organization
A file structure refers to how data is logically organized and stored within a file. A file
organization refers to the way files are physically stored on a storage device to
optimize retrieval and performance.
File Structure
A file structure defines the format and organization of data within a file. It determines
how data is read, written, and manipulated.
Types of File Structures:
1. Sequential File Structure
○ Data is stored sequentially, one after another.
○ Retrieval is slow if searching for a specific record.
○ Example: Log files, transaction records.
2. Indexed File Structure
○ Uses an index to speed up data access.
○ The index contains pointers to different parts of the file.
○ Example: Database index files.
3. Hashed File Structure
○ Uses a hash function to determine the location of data.
○ Fast access but can have collisions.
○ Example: Hash tables used in databases.
File Organization
File organization defines how files are stored and retrieved efficiently.
Types of File Organization:
1. Sequential File Organization
○ Data is stored in a specific order (e.g., by date or name).
○ Retrieval is slow unless read sequentially.
○ Example: Payroll records.
2. Direct (Random) File Organization
○ Data can be accessed directly using a key.
○ Faster than sequential access.
○ Example: Databases using primary keys.
3. Indexed File Organization
○ Uses an index table to map keys to file locations.
○ Faster search and retrieval.
○ Example: Library catalog systems.
4. Clustered File Organization
○ Groups related records together to reduce access time.
○ Often used in databases to improve performance.
○ Example: Storing customer orders near customer details.
Types of File Organization
1. Sequential File Organization:
● Data is stored in a sequence, typically in the order in which records are added.
● Records are stored one after the other, and to access a record, the system must
search from the beginning to the desired record.
● It is efficient for applications that require reading all records in a specific order,
but searching for a specific record can be slow.
● Example: Storing transaction logs where processing is done in the order of
record arrival.
2. Heap File Organization:
● Records are stored in no particular order, and new records are simply added at
the end of the file.
● It's simple and efficient for storing small amounts of data, but searching for
records requires scanning through the entire file.
● Suitable for situations where data insertion is frequent, and the order of records
doesn't matter.
● Example: Temporary storage or logging where the sequence of records is not
important.
3. Hash File Organization:
● Uses a hash function to compute an address (hash key) for each record, and
records are stored at these computed addresses.
● This allows for faster direct access to records based on the key, but it is not
suitable for range queries.
● It is efficient for applications where fast lookups based on a key are required.
● Example: Implementing a database index where quick access to records by key
is important.
4. B+ File Organization:
● A type of balanced tree structure where all values are stored at the leaf nodes,
and non-leaf nodes store keys to direct the search.
● B+ Trees maintain an ordered structure, which supports efficient searching,
insertion, and deletion.
● It is widely used in databases because it supports both random access and
range queries efficiently.
● Example: Database indexing and file systems that require efficient range queries.
5. Indexed Sequential Access Method (ISAM):
● Combines the advantages of both sequential and direct access.
● Data is stored sequentially, but an index is maintained to provide faster direct
access to records based on a key.
● When accessing data, the system first searches the index and then reads the
records sequentially or directly.
● ISAM is suitable when frequent updates to the file are not needed, as maintaining
the index can be costly.
● Example: Older database management systems (DBMS) used ISAM for efficient
access.
6. Cluster File Organization:
● Records that are frequently accessed together are grouped and stored together
in the same physical location (cluster).
● This reduces the number of disk accesses needed to retrieve related records,
improving performance for certain types of queries.
● Suitable for applications with a lot of related data that is often queried together.
● Example: A file storing customer records along with their recent orders, where
clustering might group records from the same customer together.
What is Database
The database is a collection of inter-related data which is used to retrieve, insert, and
delete the data efficiently. It is also used to organize the data in the form of a table,
schema, views, and reports, etc.
For example: The college Database organizes the data about the admin, staff,
students, and faculty, etc.
Using the database, you can easily retrieve, insert, and delete the information.
DBMS (Database Management System)
● A Database Management System is software that is used to store and retrieve
the database. For example, Oracle, MySQL, etc.; these are some popular DBMS
tools.
● DBMS provides the interface to perform various operations like creation, deletion,
modification, etc.
● DBMS allows the user to create their databases as per their requirement.
● DBMS accepts the request from the application and provides specific data
through the operating system.
● DBMS contains a group of programs that act according to user instructions.
● It provides security to the database.
DBMS allows users the following tasks:
● Data Definition: It is used for creation, modification, and removal of definitions
that define the organization of data in the database.
● Data Updation: It is used for the insertion, modification, and deletion of actual
data in the database.
● Data Retrieval: It is used to retrieve the data from the database, which can be
used by applications for various purposes.
● User Administration:It is used for registering and monitoring users,
maintaining data integrity, enforcing data security, dealing with concurrency
control, monitoring performance, and recovering information corrupted by
unexpected failure.
Evolution of Database & DBMS
The evolution of databases and Database Management Systems (DBMS) has gone
through several stages, adapting to technological advancements and increasing data
storage and retrieval needs.
1. Flat File System (Before 1960s)
● Data was stored in simple text files.
● No structured organization—searching and retrieving data was slow and
inefficient.
● Example: Payroll and customer records stored in separate files.
2. Hierarchical Database Model (1960s-1970s)
● IBM introduced the Information Management System (IMS) in the 1960s.
● Data was stored in a tree-like structure (parent-child relationships).
● Fast retrieval but lacked flexibility—changes in structure required modifying the
entire database.
● Example: IMS by IBM.
3. Network Database Model (1970s-1980s)
● Introduced many-to-many relationships using a graph-like structure.
● More flexible than the hierarchical model but complex to manage.
● Used in large organizations for transaction processing.
● Example: CODASYL DBMS.
4. Relational Database Model (1980s-Present)
● Introduced by Edgar F. Codd (IBM) in 1970.
● Data stored in tables (relations) with rows and columns.
● Use of SQL (Structured Query Language) for data management.
● Improved data integrity, security, and ease of use.
● Examples: Oracle, MySQL, PostgreSQL, Microsoft SQL Server.
5. Object-Oriented Database (1990s-Present)
● Integrates object-oriented programming with databases.
● Supports complex data types, inheritance, and encapsulation.
● Examples: ObjectDB, db4o.
6. NoSQL & NewSQL Databases (2000s-Present)
● NoSQL: Developed for handling big data, real-time processing, and
unstructured data.
○ Examples: MongoDB (Document-based), Cassandra (Column-based),
Redis (Key-Value Store).
● NewSQL: Combines features of SQL and NoSQL, providing scalability while
maintaining relational integrity.
○ Examples: Google Spanner, VoltDB.
7. Cloud & Distributed Databases (2010s-Present)
● Data stored on cloud platforms for scalability and accessibility.
● Distributed databases allow parallel processing across multiple locations.
● Examples: Google BigQuery, Amazon RDS, Microsoft Azure SQL.
Characteristics of the Database Approach:
● Data Independence: The DBMS allows data to be independent of application
programs, meaning changes to the database schema do not require changes to
application programs.
● Data Redundancy Control: Redundant data is minimized by storing data in a
centralized location, reducing storage costs and improving consistency.
● Data Integrity: Ensures data accuracy and reliability by enforcing constraints like
unique keys, data types, and foreign key relationships.
● Data Security: Ensures that sensitive data is protected and accessible only to
authorized users.
● Concurrency Control: Multiple users can access and modify the database
simultaneously without interference, ensuring data consistency.
● Backup and Recovery: Provides mechanisms for regular backups and recovery
procedures in case of system failures.
Components of a Database System:
● Database: The collection of data organized in a structured way.
● DBMS: Software used to manage the database and facilitate interactions with the
data.
● Database Schema: Defines the structure of the database, including tables,
fields, and relationships between them.
● Query Processor: Responsible for interpreting and executing SQL queries to
retrieve, update, or delete data.
● Storage Manager: Handles the physical storage of data on disk, including
indexing and caching.
● Transaction Manager: Ensures that transactions (sets of database operations)
are processed reliably and securely.
DBMS vs. File System
File System Approach:
File-based systems were an early attempt to computerize the manual system. It is also
called a traditional-based approach in which a decentralized approach was taken where
each department stored and controlled its own data with the help of a data processing
specialist. The main role of a data processing specialist was to create the necessary
computer file structures, manage the data within structures, and design some
application programs that create reports based on file data.
In the above figure:
Consider an example of a student's file system. The student file will contain information
regarding the student (i.e., roll no, student name, course, etc.). Similarly, we have a
subject file that contains information about the subject and the result file which contains
the information regarding the result.
Some fields are duplicated in more than one file, which leads to data redundancy. So, to
overcome this problem, we need to create a centralized system, i.e., a DBMS approach.
DBMS:
A database approach is a well-organized collection of data that are related in a
meaningful way and can be accessed by different users but stored only once in a
system. The various operations performed by the DBMS system are: Insertion, deletion,
selection, sorting, etc.
In the above figure:
In the above figure, duplication of data is reduced due to the centralization of data.
There are the following differences between DBMS and File System:
Basis DBMS Approach File System Approach
Meaning DBMS is a collection of data. The file system is a collection of
In DBMS, the user is required data. In this system, the user has
to write the procedures. to write the procedures for
managing the database.
Sharing of data Due to the centralized Data is distributed in many files,
approach, data sharing is and it may be in different formats,
easy. so it isn't easy to share.
Data DBMS gives an abstract view The file system provides the detail
Abstraction of data that hides the details. of the data representation and
storage of data.
Security and DBMS provides a good It isn't easy to protect a file under
Protection protection mechanism. the file system.
Recovery DBMS provides a crash The file system doesn't have a
Mechanism recovery mechanism, i.e., crash mechanism, i.e., if the
DBMS protects the user from system crashes while entering
system failure. data, then the content of the file
will be lost.
Manipulation DBMS contains a wide variety The file system can't efficiently
Techniques of sophisticated techniques to retrieve the data.
store and retrieve the data.
Concurrency DBMS takes care of In the file system, concurrent
Problems concurrent access to data with access has many problems,
some form of locking. redirecting the file while deleting
some information or updating
some information.
Where to use Database approach is used in File system approach is used in
systems that interrelate files. large systems that interrelate
many files.
Cost The database system is The file system approach is
expensive to design. cheaper to design.
Data Due to the centralization of the In this, the files and application
Redundancy & database, the problems of programs are created by different
Inconsistency data redundancy and programmers, so there exists a lot
inconsistency are controlled. of duplication of data, which may
lead to inconsistency.
Structure The database structure is The file system approach has a
complex to design. simple structure.
Data In this system, data In the file system approach, there
Independence independence exists, and it is no data independence.
can be of two types: Logical
Data Independence and
Physical Data Independence.
Integrity Integrity constraints are easy Integrity constraints are difficult to
Constraints to apply. implement in the file system.
Data Models In the database approach, In the file system approach, there
three types of data models is no concept of data models.
exist: Hierarchical data
models, Network data models,
Relational data models.
Flexibility Changes to data are often The flexibility of the system is less
necessary, and these changes compared to the DBMS approach.
are more easily managed with
a database approach.
Examples Oracle, SQL Server, Sybase. Cobol, C++ etc.
Advantages and Disadvantages of Using DBMS
Advantages of DBMS:
1. Data Integrity: Databases enforce data integrity constraints, such as primary
keys, foreign keys, and unique constraints, to maintain accurate and consistent
data. This ensures that data is reliable and free from inconsistencies.
2. Data Security: Databases offer various security features, such as access
control, authentication, and encryption, to protect sensitive information from
unauthorized access and data breaches.
3. Data Sharing: Databases enable multiple users and applications to access and
modify data concurrently. This facilitates collaboration and follows for centralized
data management, leading to better data sharing across an organization.
4. Data Independence: Databases provide data independence, separating the
physical storage of data from the logical view. This abstraction allows changes to
be made to the database structure without affecting the applications that use the
data, making maintenance and scalability easier.
5. Data Backup and Recovery: Databases offer mechanisms for regular data
backups and restore points, ensuring that data can be recovered in case of
hardware failures, human errors.
Disadvantages of DBMS:
1. Cost: Setting up and maintaining a database system can be expensive,
especially for small-scale applications. Costs include hardware, software
licenses, training, and ongoing maintenance.
2. Complexity: Designing and managing databases require specialized knowledge
and skills. The complexity of database systems may be challenging for
non-experts to understand & maintain.
3. Single Point of Failure: A database failure can affect the entire system. If proper
backup and recovery mechanisms are not in place, data loss can occur, leading
to significant disruptions in operations.
4. Scalability Challenges: As data volumes grow, scaling databases to handle
increased loads may become complex and costly. Ensuring high performance
under heavy loads can be a challenge.
DBMS Users:
● Database Administrators (DBAs): Individuals responsible for managing the
DBMS, ensuring security, backup, performance, and recovery.
● End Users: Users who interact with the database to perform queries and retrieve
information. They may be casual users, power users, or data analysts.
● Application Programmers: Developers who create software applications that
interact with the database, typically using SQL or database APIs.
● System Analysts: Professionals who design database systems, defining
requirements and system architecture.
● Data Analysts/Scientists: Individuals who analyze and interpret data stored in
databases, often using advanced statistical and machine learning techniques.
Database Management System Architecture:
● The DBMS design depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers, and other components that are connected with networks.
● The client/server architecture consists of many PCs and a workstation which are
connected via the network.
● DBMS architecture depends upon how users are connected to the database to
get their requests done.
Types of DBMS Architecture
● 1-tier Architecture
● 2-tier Architecture
● 3-tier Architecture
Database architecture can be seen as a single-tier or multi-tier. But logically, database
architecture is of two types: 2-tier architecture and 3-tier architecture.
1. 1-Tier Architecture (Single-tier Architecture):
● Overview: In a 1-tier architecture, the entire DBMS system, including the user
interface, database, and database management system components, resides on
a single machine.
● Components:
○ The database and the DBMS are installed on the same system.
○ Users directly interact with the database through a local interface or
application running on the same machine.
● Use Case: Typically used in standalone systems or small-scale applications
where simplicity is important, such as desktop database applications.
● Advantages:
○ Simple architecture, easy to set up and manage.
○ No need for network communication, making it faster for local operations.
● Disadvantages:
○ Not scalable for large applications or multiple users.
○ Difficult to manage as the system grows (in terms of user access and data
handling).
2. 2-Tier Architecture (Client-Server Architecture):
● Overview: In a 2-tier architecture, the DBMS is divided into two components:
the client and the server.
○ The client is the application or user interface, which communicates with
the database server.
○ The server hosts the database and DBMS software.
● Components:
○ Client Layer: The user interface or application that sends requests to the
server.
○ Server Layer: The database server that processes the client's requests,
executes SQL queries, and returns results.
● Use Case: Common in enterprise applications, where multiple clients interact
with a centralized database server over a network.
● Advantages:
○ Better scalability than 1-tier, as multiple clients can interact with the server
simultaneously.
○ Centralized management of data, which improves security and data
integrity.
● Disadvantages:
○ The database server may become a bottleneck if the number of clients or
the volume of data grows.
○ Network issues can impact performance.
3. 3-Tier Architecture:
● Overview: A 3-tier architecture introduces an additional layer between the client
and the database server. This middle layer, called the application server,
handles business logic, user authentication, data validation, etc.
● Components:
○ Presentation Layer (Client): The front-end user interface that allows
users to interact with the system (e.g., a web browser or desktop
application).
○ Application Layer (Middle-tier): The business logic layer that processes
user requests, applies rules, and communicates with the database. This
layer often includes application servers and APIs that manage interactions
between the client and the database.
○ Data Layer (Server): The database server, where the actual database
resides, which is responsible for storing, retrieving, and managing the
data.
● Use Case: Common in large-scale enterprise applications or web applications
where the separation of concerns is crucial, such as online banking systems or
e-commerce platforms.
● Advantages:
○ Scalability: The middle-tier application server can be scaled independently
to handle increasing user load without affecting the database.
○ Separation of concerns: Clear division between the user interface,
business logic, and data, making it easier to maintain and update each
layer independently.
○ Flexibility: The system can support various clients (e.g., web, mobile,
desktop) with a single back-end database and logic layer.
● Disadvantages:
○ Increased complexity due to the additional middle tier.
○ Requires more hardware and network infrastructure to manage the
communication between layers.
○ Performance overhead from the extra layer, especially in cases with high
network traffic.
Capabilities of a Good DBMS:
A good DBMS provides several key capabilities to effectively manage data and support
users in performing various database-related tasks:
● Data Storage, Retrieval, and Update: A DBMS allows data to be stored in a
structured manner and provides efficient mechanisms for retrieving and updating
that data.
● Data Security: A DBMS ensures that only authorized users can access or
modify the data through user authentication and access control mechanisms.
● Data Integrity: The DBMS enforces rules and constraints to ensure that data is
accurate and consistent (e.g., unique keys, foreign keys, data types).
● Concurrency Control: The DBMS manages concurrent access to the database,
ensuring that multiple users can access and modify data without conflicts.
● Backup and Recovery: It provides tools to back up data and recover it in case
of system failure or data corruption.
● Data Independence: A DBMS abstracts the data from application programs,
making it possible to change the underlying database structure without affecting
applications.
● Transaction Management: Ensures that all database transactions are
processed reliably (ACID properties—Atomicity, Consistency, Isolation,
Durability).
● Scalability and Performance: Optimizes performance for large datasets and
concurrent user access, supporting scalability as the system grows.
Database Schemas and Instances:
● Database Schema:
○ The schema of a database defines its structure and organization. It
describes the tables, relationships, constraints, and other elements that
make up the database.
○ The schema is a blueprint or framework for the database and does not
contain actual data; it defines how data is logically organized.
○ Types of schema include:
■ Logical Schema: Describes the logical structure of the database
(tables, views, etc.).
■ Physical Schema: Describes how the data is stored physically on
the storage medium.
■ External Schema (View Schema): Describes different views of the
data tailored to different users or applications.
● Database Instance:
○ An instance refers to the actual data stored in the database at a particular
point in time. The database instance is the current state of the database,
populated with real data that conforms to the schema.
○ While the schema defines the structure, the instance represents the data
within that structure (e.g., records in a table).
○ The instance can change as data is inserted, updated, or deleted.
Classification of Database Management Systems:
DBMS can be classified based on several criteria, including their data model,
architecture, and intended use:
● Based on Data Models:
○ Hierarchical DBMS: Data is organized in a tree-like structure with
parent-child relationships (e.g., IBM's IMS).
○ Network DBMS: Data is represented as a graph, allowing more complex
relationships between entities (e.g., Integrated Data Store).
○ Relational DBMS (RDBMS): Data is organized in tables with rows and
columns, and relationships are maintained through keys (e.g., MySQL,
Oracle, SQL Server).
○ Object-Oriented DBMS: Data is stored as objects, similar to
object-oriented programming concepts (e.g., ObjectDB, db4o).
○ NoSQL DBMS: A broad category that includes databases designed to
handle unstructured, semi-structured, or large volumes of data (e.g.,
MongoDB, Cassandra, Redis).
● Based on Architecture:
○ Centralized DBMS: The database is stored and managed on a single
server.
○ Distributed DBMS: The database is distributed across multiple locations
or servers, and data may be replicated or partitioned.
○ Cloud DBMS: The database is hosted and managed on cloud platforms,
offering flexibility and scalability (e.g., Amazon RDS, Google Cloud SQL).
● Based on Usage:
○ Operational DBMS: Primarily used for daily operations, such as
transaction processing (e.g., banking systems).
○ Analytical DBMS: Used for data analysis and decision support, often
optimized for complex queries (e.g., Data Warehouses).
○ Embedded DBMS: Designed for use in embedded systems, such as
mobile devices or IoT applications (e.g., SQLite).
Database Languages:
A DBMS has appropriate languages and interfaces to express database queries and
updates.
Database languages can be used to read, store, and update the data in the database.
Types of Database Language:
The DBMS Language is divided into four types:
● DDL (Data Definition Language)
● DCL (Data Control Language)
● DML (Data Manipulation Language)
● TCL (Transaction Control Language)
1. Data Definition Language (DDL)
DDL stands for Data Definition Language. It is used to define the database structure
or pattern.
● It is used to create schema, tables, indexes, constraints, etc., in the database.
● Using DDL statements, you can create the skeleton of the database.
● DDL is used to store metadata information, such as the number of tables,
schemas, their names, indexes, columns in each table, constraints, etc.
Here are some Tasks that come under DDL:
● Create: Used to create objects in the database.
● Alter: Used to alter the structure of the database.
● Drop: Used to delete objects from the database.
● Truncate: Used to remove all records from a table.
● Rename: Used to rename an object.
● Comment: Used to comment on the data dictionary.
(These commands update the database schema, which is why they fall under Data
Definition Language.)
2. Data Manipulation Language(DML):
DML stands for Data Manipulation Language. It is used for accessing and
manipulating data in a database. It handles user requests.
Here are some Tasks that come under DML:
● Select: It is used to retrieve data from a database.
● Insert: It is used to insert data into a table.
● Update: It is used to update existing data within a table.
● Delete: It is used to delete all records from a table.
● Merge: It performs the UPSERT operation, i.e., insert or update operations.
● Call: It is used to call a structured query language or a Java subprogram.
● Explain Plan: It has the parameter of explaining data.
● Lock Table: It controls concurrency.
3. Data Control Language(DCL):
DCL stands for Data Control Language. It is used to retrieve the stored or saved data.
● The DCL execution is transactional. It also has rollback parameters.
(But in Oracle databases, the execution of Data Control Language does not
have the feature of rolling back.)
Here are some Tasks that come under DCL:
● Grant: It is used to give user access privileges to a database.
● Revoke: It is used to take back permissions from the user.
There are the following operations which have the authorization of Revoke:
CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE, and SELECT.
4. Transaction Control Language(TCL):
TCL is used to run the changes made by DML statements. TCL can be grouped into a
logical transaction.
Here are some Tasks that come under TCL:
● Commit: It is used to save the transaction on the database.
● Rollback: It is used to restore the database to its original state since the last
Commit.
● Savepoint:It is a point within a transaction that allows partial rollback without
affecting the entire transaction.
Data Models:
A data model is a conceptual framework for organizing, defining, and structuring data.
It provides a way to represent data and the relationships between different data
elements, which is essential for designing and managing databases. Data models help
database designers define the data structures and facilitate data storage and retrieval in
a DBMS.
Here’s an overview of the key types of data models used in DBMS:
1. Network Data Model:
● Overview: The Network Data Model is a more flexible version of the
Hierarchical Data Model. It allows entities to have multiple parent nodes
(many-to-many relationships) and organizes data in a graph structure using
nodes and links.
● Structure: Data is represented using a collection of records (nodes) and sets
(links). A set connects a parent record to its child records, and a record can have
multiple parent records, making it more versatile than the hierarchical model.
● Key Feature: It supports many-to-many relationships, unlike the hierarchical
model, which only supports one-to-many relationships.
● Use Case: Suitable for applications requiring complex relationships like
telecommunications or transportation systems.
● Example: CODASYL (Conference on Data Systems Languages) is an
example of a network DBMS.
2. Hierarchical Data Model:
● Overview: The Hierarchical Data Model organizes data in a tree-like structure
where each record (node) has a single parent, and each parent can have
multiple children. This model is based on the one-to-many relationship between
data elements.
● Structure: The model is represented as a hierarchy, with the root node at the
top, followed by child nodes. Each node represents a data entity, and links
between the nodes represent relationships.
● Key Feature: A strict parent-child relationship where each child node can have
only one parent node.
● Use Case: Ideal for applications that have a natural hierarchical structure, such
as organizational charts or file systems.
● Example: IMS (Information Management System) by IBM is an example of a
hierarchical DBMS.
3. Relational Data Model:
● Overview: The Relational Data Model is the most widely used data model
today. It organizes data into tables (relations), where each table consists of rows
(records) and columns (attributes). The model is based on set theory and
predicate logic.
● Structure: Data is organized in tables, and relationships between tables are
established using keys (primary keys, foreign keys). Tables are independent of
each other, and data is accessed using SQL (Structured Query Language).
● Key Feature: The use of tables to store data and keys to define relationships,
enabling flexibility, scalability, and easy querying.
● Use Case: Suitable for general-purpose applications like banking, e-commerce,
and customer relationship management.
● Example: MySQL, Oracle, SQL Server, and PostgreSQL are examples of
relational DBMSs.
4. Entity-Relationship (ER) Data Model:
● Overview: The Entity-Relationship (ER) model is a conceptual framework for
designing databases. It represents the entities in the system and the
relationships between those entities. The ER model is primarily used in the
database design phase.
● Structure: The model uses entities (objects or things) and relationships
(associations between entities). An entity set is a collection of similar entities,
and a relationship set is a collection of relationships. Each entity has attributes
that describe it.
● Key Feature: It uses ER diagrams to visually represent the database structure,
which makes it easier to understand the relationships.
● Use Case: Commonly used for database design, where a high-level conceptual
representation of the database is needed.
● Example: When designing a school database, an entity might be Student, and
relationships could include Enrolls in Courses.
5. Object-Based Data Model:
● Overview: The Object-Based Data Model integrates object-oriented
programming principles with databases. Data is stored as objects, similar to how
objects are represented in object-oriented programming (OOP).
● Structure: In this model, data items are represented as objects that include both
data (attributes) and methods (functions). Objects are organized in classes, and
relationships between objects are modeled using inheritance, encapsulation, and
polymorphism.
● Key Feature: It supports complex data types (objects) and allows for the
modeling of real-world entities more effectively than the relational model.
● Use Case: Suitable for applications requiring complex data structures, such as
multimedia databases, CAD systems, or any system using object-oriented
programming.
● Example: ObjectDB, db4o, and Versant are examples of object-based DBMSs.
6. Semi-Structured Data Model:
● Overview: The Semi-Structured Data Model is used to describe data that does
not conform to a strict schema or relational structure. This model allows for
flexibility in data representation, where each data element can have varying
attributes or values.
● Structure: Data is stored in a graph or tree structure, often using markup
languages like XML (eXtensible Markup Language) or JSON (JavaScript
Object Notation). This model does not require a predefined schema and allows
for evolving data structures.
● Key Feature: It supports flexible and semi-structured formats, making it ideal for
data with varying formats, such as documents, emails, and web data.
● Use Case: Often used in scenarios where the data structure may change over
time, such as with XML-based documents, web scraping, and NoSQL databases.
● Example: MongoDB, CouchDB, and XML databases are examples of
databases using semi-structured data.
The challenges for data management:
Traditional file-based data management approaches suffer from several challenges,
such as data redundancy, inconsistency, and isolation.
Data Redundancy: In file-based systems, data is often duplicated across multiple files
or applications. This duplication can lead to data inconsistency, as changes made to
one copy of the data might not be reflected in other copies.
Data Inconsistency: When the same data is stored in different files or formats, it can
lead to inconsistencies in the values and meanings of data.
Data Isolation: File-based systems are often isolated from each other, making it difficult
to share data across different applications or users.
Data Independence:
● Data independence can be explained using the three-schema architecture.
● Data independence refers to the characteristic of being able to modify the
schema at one level of the database system without altering the schema at the
next higher level.
There are two types of data independence:
1. Logical Data Independence:
● Logical data independence refers to the characteristic of being able to change
the conceptual schema without having to change the external schema.
● Logical data independence is used to separate the external level from the
conceptual view.
● If we do any changes in the conceptual view of the data, then the user view of
the data would not be affected.
● Logical data independence occurs at the user interface level.
2. Physical Data Independence:
● Physical data independence can be defined as the capacity to change the
internal schema without having to change the conceptual schema.
● If we do any changes in the storage size of the database system server, then the
conceptual structure of the database will not be affected.
● Physical data independence is used to separate conceptual levels from the
internal levels.
● Physical data independence occurs at the logical interface level.
Fig: Data Independence
(A diagram illustrating data independence)
RDBMS (Relational Database Management System)
The word RDBMS is termed as 'Relational Database Management System.' It is
represented as a table that contains rows and columns.
RDBMS is based on the Relational model, It was introduced by E. F. Codd.
A relational database contains the following components:
● Table
● Record / Tuple
● Field / Column name / Attribute
● Instance
● Schema
● Keys
An RDBMS is a tabular DBMS that maintains security, integrity, accuracy, and
consistency of the data.
How it works:
Data is represented in terms of tuples (rows) in RDBMS.
Relational database is the most commonly used database. It contains a number of
tables, and each table has its own primary key.
Due to a collection of organized sets of tables, data can be accessed easily in RDBMS.
Brief History of RDBMS:
During 1970 to 1972, E.F. Codd published a paper to propose the use of a relational
database model.
RDBMS is originally based on E.F. Codd's relational model invention.
What is a table:
The RDBMS database uses tables to store data. A table is a collection of related data
entries and contains rows and columns to store data.
A table is the simplest example of data storage in RDBMS.
Let's see an example of a student table:
ID Name AGE COURSE
1 Ajeet 24 B.Tech
2 Aryan 20 C.A
3 Mahesh 21 BCA
4 Ratan 22 MCA
5 Vimal 26 BSC
What is field:
A field is a smaller entity of the table that contains specific information about every
record in the table. In the above example, the field in the student table consists of ID,
Name, Age, and Course.
What is a row or record:
A row of a table is also called a record. It contains specific information about each
individual entry in the table. It is a horizontal entity in the table.
For example, the above table contains 5 records.
Let's see one record/row in the table:
ID Name Age Course
1 Ajeet 24 B.Tech
Comparison between DBMS and RDBMS:
No. DBMS RDBMS
1 DBMS applications store data as a RDBMS applications store data in a
file. tabular form.
2 In DBMS, data is generally stored in In RDBMS, tables have an identifier
either a hierarchical or navigational (primary key), and data values are stored
form. in the form of tables.
3 Normalization is not present in Normalization is present in RDBMS.
DBMS.
4 DBMS does not apply security with RDBMS defines integrity constraints for
regard to data manipulation. ACID properties (Atomicity, Consistency,
Isolation, Durability).
5 DBMS uses the file system to store In RDBMS, data is stored in table format,
data, so no relationships exist so relationships exist between tables.
between tables.
6 DBMS requires some methods to RDBMS supports a tabular structure and
access data. relationships between tables.
7 DBMS is not a distributed RDBMS supports distributed databases.
database.
8 DBMS is meant for small RDBMS is designed to handle large
organizations and supports a single amounts of data and supports multiple
user. users.
9 Examples of DBMS: File system, Examples of RDBMS: MySQL,
XML, etc. PostgreSQL, SQL Server, Oracle, etc.
Conclusion:
After observing the differences between DBMS and RDBMS, we can say that RDBMS is
an advanced version of DBMS. Many modern software products are built on RDBMS
due to better data organization, security, and relational support.