A TECHNICAL REPORT WRITING ON
DEFINE THE HORIZONTAL, VERTICAL AND MIXED
FRAGMENTATION OF DDBMS
Submitted By
SHREYA GUPTA
Dept. – CSE (M. TECH)
Semester – 2nd
Roll No. – 10211224004
Paper Name – Advanced DBMS
Paper Code – PGCSE201
Department of Computer Science and Engineering
KALYANI GOVERNMENT ENGINEERING COLLEGE
(Affiliated to Maulana Abul Kalam Azad University of Technology, West Bengal)
Kalyani - 741235, Nadia, WB
ABSTRACT
A Distributed Database Management System (DDBMS) ensures efficient data
management across multiple locations by utilizing data fragmentation. Fragmentation
is essential for optimizing performance, minimizing redundancy, and enhancing data
security. It is classified into horizontal, vertical, and mixed fragmentation.
Horizontal fragmentation divides a table into subsets of rows based on a selection
condition, ensuring that each fragment contains complete records. This technique
improves query performance and security by storing relevant data closer to the users
who need it. However, complex query processing across multiple fragments can be a
challenge.
Vertical fragmentation partitions a table into subsets of columns, ensuring that each
fragment retains the primary key. This method enhances security and reduces the
amount of data retrieved in queries. However, maintaining consistency across
fragments can be complex.
Mixed fragmentation is a combination of both horizontal and vertical fragmentation,
offering flexibility and efficiency in distributed databases. It is useful in scenarios
requiring both row-based and column-based optimization. Despite its advantages,
mixed fragmentation increases query processing complexity.
Overall, data fragmentation in DDBMS plays a crucial role in improving accessibility,
security, and performance, making it a fundamental concept in distributed database
design.
CONTENTS
1. INTRODUCTION
2. THEORY OF DATA FRAGMENTATION IN DDBMS
2.1. Overview of Data Fragmentation
2.2. Need for Data Fragmentation
2.3. Types of Fragmentation
3. METHODOLOGY OF DATA FRAGMENTATION
3.1. Fragmentation Process
3.2. Advantages of Data Fragmentation
3.3. Disadvantages of Data Fragmentation
4. TYPES OF FRAGMENTATION IN DETAIL
4.1. Horizontal Fragmentation
4.2. Vertical Fragmentation
4.3. Mixed Fragmentation
4.4. Comparisons of Fragmentation Techniques
5. DISCUSSION
6. CONCLUSION
7. REFERENCES
1. INTRODUCTION
In a Distributed Database Management System (DDBMS), data is stored across
multiple sites to enhance performance, availability, and reliability. To efficiently
manage distributed data, fragmentation is used to divide a database into smaller,
logical pieces called fragments. These fragments are stored across different
locations to optimize query execution, reduce redundancy, and improve access
speed.
Fragmentation in DDBMS is classified into three types: horizontal, vertical, and
mixed fragmentation.
• Horizontal fragmentation partitions a table into multiple fragments containing
different subsets of rows while keeping all columns intact. This method
ensures that queries accessing only specific records can be processed faster
without scanning the entire table.
• Vertical fragmentation divides a table into smaller fragments based on
columns, grouping attributes that are frequently accessed together. It reduces
unnecessary data retrieval and improves security by limiting access to
sensitive information.
• Mixed fragmentation is a combination of horizontal and vertical
fragmentation, allowing both row-based and column-based optimization. This
approach provides greater flexibility in database distribution but introduces
additional complexity in query processing and data consistency management.
By implementing effective fragmentation strategies, DDBMS enhances efficiency,
minimizes communication overhead, and ensures data security, making it a critical
aspect of modern distributed database design.
2. THEORY OF DATA FRAGMENTATION IN DDBMS
2.1. Overview of Data Fragmentation:
Fragmentation is the process of dividing a database into smaller subtables (sub-relations)
so that the data can be stored on different systems. These subtables are called fragments.
Fragments are logical data units and are stored at various sites. The fragments must be
defined so that the original relation can be reconstructed from them without any loss of
data.
Suppose a table T is divided into a number of fragments T1, T2, T3, …, TN. Together, the
fragments contain sufficient information to restore the original table T. The restoration is
done by applying the UNION operation (for row-wise fragments) or the JOIN operation
(for column-wise fragments) to the fragments.
The fragments are independent, which means that no fragment can be derived from the
others. Users need not be aware that the data is fragmented; they work with the table as if
it were stored in one place. This property is called fragmentation independence, or
fragmentation transparency.
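The reconstruction requirement above can be written compactly. This is only a sketch in the notation of the text (a table T with fragments T1, …, TN); which operator applies depends on whether the table was split by rows or by columns:

```latex
% Row-wise (horizontal) fragments are recombined with UNION:
T = T_1 \cup T_2 \cup \dots \cup T_N

% Column-wise (vertical) fragments, each retaining the key, are recombined with JOIN:
T = T_1 \bowtie T_2 \bowtie \dots \bowtie T_N
```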
2.2. Need for Data Fragmentation:
1. Performance Optimization – Reduces query execution time by storing frequently
accessed data closer to users.
2. Improved Security – Ensures that sensitive information is stored in secure locations.
3. Reduced Data Redundancy – Prevents duplication of records.
4. Enhanced Availability – Data at the remaining sites stays accessible even if the site holding one fragment fails.
5. Lower Communication Costs – Reduces data transfer costs across distributed sites.
2.3. Types of Fragmentation:
There are three main types of fragmentation in DDBMS:
1. Horizontal Fragmentation – Splitting data row-wise based on conditions.
2. Vertical Fragmentation – Splitting data column-wise by attributes.
3. Mixed Fragmentation – A combination of both horizontal and vertical fragmentation.
3. METHODOLOGY OF DATA FRAGMENTATION
To implement fragmentation in DDBMS, the following steps are followed:
3.1. Fragmentation Process:
Step 1: Data Analysis -
• Identify which parts of the database are accessed frequently.
• Determine which attributes and records different users require.
Step 2: Selecting Fragmentation Type -
• Use horizontal fragmentation if row-wise partitioning optimizes performance.
• Use vertical fragmentation if column-based partitioning enhances security and
efficiency.
• Use mixed fragmentation for a hybrid approach combining both.
Step 3: Defining Fragmentation Rules -
• Define selection conditions for horizontal fragmentation (e.g., Location = 'New
York').
• Identify column groups for vertical fragmentation (e.g., separate personal and
academic details).
• Implement both strategies for mixed fragmentation.
Step 4: Storing Fragments Across Sites -
• Allocate fragments to appropriate distributed locations.
• Set up data replication where needed for availability and fault tolerance.
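Steps 3 and 4 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a DDBMS implementation: the rule format, the fragment names, the site names, and both helper functions are hypothetical and chosen only for this example.

```python
# Hypothetical names throughout: the rule format, site names, and helpers
# are illustrative assumptions, not the API of any DDBMS product.

# Step 3: fragmentation rules (selection conditions for horizontal fragments).
horizontal_rules = {
    "emp_ny": lambda row: row["Location"] == "New York",
    "emp_london": lambda row: row["Location"] == "London",
}

# Step 4: allocation of fragments to sites (a fragment may be replicated).
allocation = {
    "emp_ny": ["site_new_york"],
    "emp_london": ["site_london", "site_backup"],
}


def fragment_rows(rows, rules):
    """Apply each selection condition and return {fragment_name: rows}."""
    return {name: [r for r in rows if cond(r)] for name, cond in rules.items()}


def place_fragments(fragments, allocation):
    """Return {site: {fragment_name: rows}} according to the allocation map."""
    sites = {}
    for name, rows in fragments.items():
        for site in allocation[name]:
            sites.setdefault(site, {})[name] = list(rows)
    return sites


employees = [
    {"Employee_ID": 101, "Location": "New York"},
    {"Employee_ID": 103, "Location": "London"},
]
sites = place_fragments(fragment_rows(employees, horizontal_rules), allocation)
```

Keeping the rules and the allocation in separate maps mirrors the separation between Step 3 and Step 4: the same fragment definition can be moved or replicated without redefining it.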
3.2. Advantages Of Data Fragmentation:
1. Since data is stored close to the site of usage, efficiency of the database system is
increased.
2. Local query optimization techniques are sufficient for most queries since data is
locally available.
3. Since irrelevant data is not available at the sites, security and privacy of the database
system can be maintained.
3.3. Disadvantages of Data Fragmentation:
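1. Access may be very slow when a query needs data from several fragments, because the
fragments must be fetched from different sites and combined.
2. With recursive fragmentation (fragments that are themselves fragmented further),
reconstructing the original table becomes very expensive.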
4. TYPES OF FRAGMENTATION IN DETAIL
4.1. Horizontal Fragmentation:
Horizontal fragmentation refers to the process of dividing a table horizontally by assigning
each row (or a group of rows) of a relation to one or more fragments. These fragments can
then be assigned to different sites in the distributed system. Some of the rows or tuples of
the table are placed in one system and the rest are placed in other systems. The rows that
belong to a horizontal fragment are specified by a condition on one or more attributes of
the relation. In relational algebra, a horizontal fragment of table T can be represented
as follows: $\sigma_p(T)$
where, $\sigma$ is the relational algebra operator for selection
p is the condition satisfied by the rows of the horizontal fragment
T is the table (relation)
Example of Horizontal Fragmentation –
Consider an Employee Database that stores information about employees in different
cities:
Original Employee Table:
Employee_ID | Name  | Department | Location
------------|-------|------------|---------
101         | John  | HR         | New York
102         | Alice | IT         | New York
103         | Bob   | HR         | London
104         | Emma  | IT         | London
Using horizontal fragmentation, we can divide data based on Location:
Fragment 1 (Employees in New York):
Employee_ID | Name  | Department | Location
------------|-------|------------|---------
101         | John  | HR         | New York
102         | Alice | IT         | New York
Fragment 2 (Employees in London):
Employee_ID | Name  | Department | Location
------------|-------|------------|---------
103         | Bob   | HR         | London
104         | Emma  | IT         | London
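The example above can be reproduced with a short Python sketch. It assumes the table is held as a list of dictionaries; the `select` helper and the variable names are illustrative and not part of any database system.

```python
# Sample data copied from the Employee table above.
EMPLOYEE = [
    {"Employee_ID": 101, "Name": "John", "Department": "HR", "Location": "New York"},
    {"Employee_ID": 102, "Name": "Alice", "Department": "IT", "Location": "New York"},
    {"Employee_ID": 103, "Name": "Bob", "Department": "HR", "Location": "London"},
    {"Employee_ID": 104, "Name": "Emma", "Department": "IT", "Location": "London"},
]


def select(rows, predicate):
    """Relational selection: keep whole rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


fragment_1 = select(EMPLOYEE, lambda r: r["Location"] == "New York")
fragment_2 = select(EMPLOYEE, lambda r: r["Location"] == "London")

# Reconstruction by UNION. Plain concatenation is enough here because the two
# fragments are disjoint; sorting by key restores the original row order.
reconstructed = sorted(fragment_1 + fragment_2, key=lambda r: r["Employee_ID"])
```

Each fragment keeps every column of its rows, which is what distinguishes this from vertical fragmentation.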
Advantages of Horizontal Fragmentation -
1. Faster query execution by accessing only relevant records.
2. Reduces network congestion.
3. Ensures better availability in distributed databases.
Disadvantages of Horizontal Fragmentation –
1. Complexity in executing cross-fragment queries.
2. Requires additional indexing for performance.
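The first disadvantage can be made concrete with a small sketch. Assuming the New York and London fragments from the example are stored at different sites, a query such as "employees per department" has to run at every site and its partial results must then be merged. The function names are hypothetical, and the fragments are trimmed to the columns the query needs.

```python
from collections import Counter

# Hypothetical fragments, matching the New York and London example above.
fragment_ny = [
    {"Employee_ID": 101, "Department": "HR"},
    {"Employee_ID": 102, "Department": "IT"},
]
fragment_london = [
    {"Employee_ID": 103, "Department": "HR"},
    {"Employee_ID": 104, "Department": "IT"},
]


def local_headcount(fragment):
    """Runs at one site: count employees per department in that fragment."""
    return Counter(row["Department"] for row in fragment)


def global_headcount(fragments):
    """Runs at the coordinating site: merge the partial counts from every site."""
    total = Counter()
    for fragment in fragments:
        total += local_headcount(fragment)
    return dict(total)


headcount = global_headcount([fragment_ny, fragment_london])
```

A query that filters on Location alone would touch only one fragment; the extra merge step appears only when the query cuts across the fragmentation condition.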
4.2. Vertical Fragmentation:
Vertical fragmentation refers to the process of decomposing a table vertically by attributes
or columns. In this fragmentation, some of the attributes are stored in one system and the
rest are stored in other systems, because each site may not need all columns of a table.
To make restoration possible, each fragment must contain the primary key field(s) of the
table, so that the table can be rebuilt by taking the natural JOIN of the fragments. If
needed, a special attribute called Tuple_id can be added to the schema for this purpose;
any super key can serve the same role. The key links the tuples (rows) of the different
fragments together. A vertical fragment is defined by a projection:
$\pi_{a_1, a_2, \dots, a_n}(T)$
where, $\pi$ is the relational algebra operator for projection
$a_1, \dots, a_n$ are the attributes of T kept in the fragment
T is the table (relation)
For example, for the EMPLOYEE table we have T1 as:
Eno | Ename | Design | Tuple_id
----|-------|--------|---------
101 | A     | abc    | 1
102 | B     | abc    | 2
103 | C     | abc    | 3
104 | D     | abc    | 4
105 | E     | abc    | 5
The second subtable of the relation after vertical fragmentation is given as follows:
Salary | Dep | Tuple_id
-------|-----|---------
3000   | 1   | 1
4000   | 2   | 2
5500   | 3   | 3
5000   | 1   | 4
2000   | 4   | 5
This is T2. To get back the original table T, we join the two fragments T1 and T2 on
Tuple_id and project onto the attributes of EMPLOYEE:
$\pi_{\text{EMPLOYEE}}(T_1 \bowtie T_2)$
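The split into T1 and T2 and the join that rebuilds the table can be sketched as follows. This is a minimal illustration, assuming the table is a list of dictionaries with the Tuple_id column already added; `project` and `join_on` are hypothetical helpers written for this example.

```python
# Sample data copied from the two EMPLOYEE fragments above.
EMPLOYEE = [
    {"Tuple_id": 1, "Eno": 101, "Ename": "A", "Design": "abc", "Salary": 3000, "Dep": 1},
    {"Tuple_id": 2, "Eno": 102, "Ename": "B", "Design": "abc", "Salary": 4000, "Dep": 2},
    {"Tuple_id": 3, "Eno": 103, "Ename": "C", "Design": "abc", "Salary": 5500, "Dep": 3},
    {"Tuple_id": 4, "Eno": 104, "Ename": "D", "Design": "abc", "Salary": 5000, "Dep": 1},
    {"Tuple_id": 5, "Eno": 105, "Ename": "E", "Design": "abc", "Salary": 2000, "Dep": 4},
]


def project(rows, attributes):
    """Relational projection: keep only the named columns of every row."""
    return [{a: row[a] for a in attributes} for row in rows]


def join_on(left, right, key):
    """Join two fragments on a shared key column (assumed unique in each)."""
    right_by_key = {row[key]: row for row in right}
    return [{**l, **right_by_key[l[key]]} for l in left if l[key] in right_by_key]


T1 = project(EMPLOYEE, ["Eno", "Ename", "Design", "Tuple_id"])
T2 = project(EMPLOYEE, ["Salary", "Dep", "Tuple_id"])

# Natural join on the shared Tuple_id column rebuilds every original row.
rebuilt = join_on(T1, T2, "Tuple_id")
```

Without the shared Tuple_id column there would be no way to tell which salary belongs to which employee, which is why every vertical fragment must carry the key.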
Advantages of Vertical Fragmentation –
1. Reduces storage overhead at each site, since a site stores only the columns it needs.
2. Improves query response time for attribute-based searches.
3. Enhances security by storing sensitive attributes separately.
Disadvantages of Vertical Fragmentation –
1. Complexity in query processing and joins.
2. Requires additional storage for primary key references.
4.3. Mixed Fragmentation:
Mixed (or hybrid) fragmentation combines the two techniques: a table is fragmented
vertically and some of the resulting fragments are then fragmented horizontally, or the
other way round. This type of fragmentation is defined with the SELECT and PROJECT
operations of relational algebra. It is needed when horizontal or vertical fragmentation
alone is not enough to distribute the data the way an application requires.
Mixed fragmentation can be done in two different ways:
1. The first method is to first create a set or group of horizontal fragments and then
create vertical fragments from one or more of the horizontal fragments.
2. The second method is to first create a set or group of vertical fragments and then
create horizontal fragments from one or more of the vertical fragments.
A mixed fragment is therefore defined by composing selection and projection, in
either order:
$\sigma_p(\pi_{a_1, a_2, \dots, a_n}(T))$
$\pi_{a_1, a_2, \dots, a_n}(\sigma_p(T))$
The original relation can be obtained from the fragments by a combination of JOIN
and UNION operations.
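For example, for our EMPLOYEE table, one mixed fragment is
$\pi_{\text{Ename}, \text{Design}}(\sigma_{\text{Eno} < 104}(\text{EMPLOYEE}))$
The result of this fragmentation is:
Ename | Design
------|-------
A     | abc
B     | abc
C     | abc
This fragment is shown without a key. To take part in a JOIN-based reconstruction it
would also have to retain Eno or Tuple_id.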
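The same mixed fragment can be computed with a short Python sketch that applies the selection first and the projection second. The `mixed_fragment` helper is hypothetical, and the table is assumed to be a list of dictionaries holding the EMPLOYEE rows used earlier.

```python
# EMPLOYEE rows from the vertical fragmentation example (Tuple_id omitted).
EMPLOYEE = [
    {"Eno": 101, "Ename": "A", "Design": "abc", "Salary": 3000, "Dep": 1},
    {"Eno": 102, "Ename": "B", "Design": "abc", "Salary": 4000, "Dep": 2},
    {"Eno": 103, "Ename": "C", "Design": "abc", "Salary": 5500, "Dep": 3},
    {"Eno": 104, "Ename": "D", "Design": "abc", "Salary": 5000, "Dep": 1},
    {"Eno": 105, "Ename": "E", "Design": "abc", "Salary": 2000, "Dep": 4},
]


def mixed_fragment(rows, predicate, attributes):
    """Selection followed by projection: filter the rows, then keep some columns."""
    selected = [row for row in rows if predicate(row)]
    return [{a: row[a] for a in attributes} for row in selected]


# Rows with Eno < 104, restricted to the Ename and Design columns.
result = mixed_fragment(EMPLOYEE, lambda r: r["Eno"] < 104, ["Ename", "Design"])
```

The selection has to run before the projection here, because the condition uses Eno, which the projection removes.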
Advantages of Mixed Fragmentation –
1. Provides flexibility by optimizing both row and column storage.
2. Reduces query time and enhances security.
Disadvantages of Mixed Fragmentation –
1. Increases complexity in query execution.
2. Requires more computing resources.
4.4. Comparisons of Fragmentation Techniques:
Feature | Horizontal Fragmentation | Vertical Fragmentation | Mixed Fragmentation
--------|--------------------------|------------------------|--------------------
Definition | Divides data row-wise, keeping all attributes. | Divides data column-wise, keeping only required attributes. | Combines both horizontal and vertical fragmentation.
Basis of Fragmentation | Uses selection conditions (e.g., employees in ‘New York’). | Uses projection conditions (e.g., splitting personal and job details). | Uses both selection and projection conditions.
Query Performance | Fast for location-based queries. | Fast for attribute-specific queries. | Optimized for both but requires more joins.
Security & Privacy | Moderate – all attributes are together. | High – sensitive attributes can be separated. | Very High – allows both location-based and attribute-based security.
Data Redundancy | Low – no duplicate records. | Moderate – primary keys are repeated. | High – both rows and columns are split, increasing redundancy.
Complexity | Simple – requires only row-based division. | Moderate – needs primary keys for merging. | High – requires both row-wise and column-wise distribution.
Best Use Case | When queries filter based on specific locations. | When queries focus on specific attributes. | When both location-based and attribute-based optimization is required.
5. DISCUSSION
Data fragmentation in Distributed Database Management Systems (DDBMS) is a
crucial technique that enhances performance, security, and efficiency by dividing
large datasets into smaller, manageable fragments. The three primary
fragmentation types are horizontal, vertical, and mixed fragmentation.
Horizontal fragmentation divides a table row-wise, keeping all attributes intact.
This is beneficial when queries frequently filter data based on specific
conditions, such as location-based searches in a customer database. It improves
query response time and data locality but may increase complexity in distributed
joins.
Vertical fragmentation splits a table column-wise, ensuring that each fragment
contains only relevant attributes along with the primary key. This technique is
useful for security and minimizing query processing time by retrieving only the
necessary columns. However, frequent joins are required to reconstruct the full
dataset.
Mixed fragmentation combines horizontal and vertical fragmentation, providing
optimized data retrieval, better security, and efficient storage. It allows
distributed databases to handle both row-wise and column-wise filtering,
reducing data transfer costs. However, it is the most complex method to
implement, because reconstruction needs both unions across row-wise fragments
and joins across column-wise fragments.
Each fragmentation type serves different use cases, balancing performance,
security, and complexity in a DDBMS environment.
6. CONCLUSION
Data fragmentation is a fundamental concept in Distributed Database
Management Systems (DDBMS) that enhances performance, data security,
and query efficiency. By dividing a database into smaller, logical fragments,
organizations can optimize storage, reduce query execution time, and ensure
data availability across multiple locations.
The three primary fragmentation techniques—horizontal, vertical, and mixed
fragmentation—each serve distinct purposes. Horizontal fragmentation
divides data based on rows, improving query performance for location-based
searches. Vertical fragmentation splits data by columns, which enhances
security and retrieval efficiency by storing only relevant attributes in separate
fragments. Mixed fragmentation, combining both horizontal and vertical
methods, offers the highest level of optimization but is more complex to
implement and maintain.
Choosing the appropriate fragmentation technique depends on the specific
needs of the system, such as data access patterns, security requirements, and
network efficiency. While horizontal fragmentation is ideal for geographically
distributed applications, vertical fragmentation is best suited for scenarios
requiring sensitive data isolation. Mixed fragmentation provides a balanced
approach but requires careful planning and management.
In conclusion, effective fragmentation strategies help reduce data redundancy,
improve query speed, and ensure better resource utilization, making them
essential for the success of distributed databases.
7. REFERENCES
• Navathe, S., Karlapalem, K., & Ra, M. (1999). A Mixed Fragmentation
Methodology for Initial Distributed Database Design.
• Elmasri, R., & Navathe, S. B. (2016). Fundamentals of Database Systems
(7th ed.). Pearson Education.
• Özsu, M. T., & Valduriez, P. (2020). Principles of Distributed Database
Systems (4th ed.). Springer.
• Mullins, C. S. (2002). Database Administration: The Complete Guide to
Practices and Procedures. Addison-Wesley.
• https://link.springer.com/
• https://www.geeksforgeeks.org/types-of-distributed-dbms/
• https://www.tutorialspoint.com/distributed_dbms/distributed_dbms_design_strategies.htm