0% found this document useful (0 votes)

22 views9 pages

Files Structrews

This document covers file structures and organization, detailing various types such as sequential, direct, and indexed sequential files. It explains key operations on files, advantages and disadvantages of each organization type, and their use cases. The unit aims to provide a comprehensive understanding of file management and data retrieval methods in software applications and operating systems.

Uploaded by

liquidation.xalta

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

22 views9 pages

Files Structrews

Uploaded by

liquidation.xalta

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

File Structures and

UNIT 12 FILE STRUCTURES Advanced Data Structures

Structure
12.0 Introduction
12.1 Objectives
12.2 Basic concepts of file
12.3 File Organization
12.3.1 Key drivers for file organization
12.4 Sequential files
12.4.1 Operations on sequential files
12.4.2 Types of sequential files
12.4.3 Advantages and disadvantages of sequential files
12.4.4 Use cases for sequential files
12.5 Direct file organization
12.5.1 Hash function and collision resolution
12.5.2 Advantages and disadvantages of direct file
organization
12.6 Indexed Sequential File Organization
12.6.1 Advantages and disadvantages of indexed sequential
files
12.7 Summary
12.8 Solutions/Answers
12.9 Further Readings

12.0 INTRODUCTION

The data is stored and organized through file structure. Software applications and
platforms store the data in files. Operating systems manage critical configuration
through file and even databases also store the data in files such as transaction logs.
Files essentially serves as the key building blocks for data organization. We can
enforce governance such as data integrity checks, security rules, access controls,
sharing controls for files.
In this unit we discuss various aspects of file such as file organization, file types and
related concepts.

12.1 OBJECTIVES

After going through this unit, you should be able to

 understand key concepts of files,
 understand the file organization concepts,
 understand sequential files and direct file organization,
 Understand indexed file organization and indexed sequential files

12.2 BASIC CONCEPTS OF FILE

A file is a collection of data records. The key operations of the file are as follows:
 File creation when the file gets created initially.
 File update when we update the file with new data records or modify
existing data records.
1
File Structures  File deletion when we delete the file
 File merging that combines the content of multiple files into a single file.

 File searching that involves searching for a specific data record in a file.
A data record in a file encapsulates the data of a single entity. For instance, an
employee data record captures details of a single employee such as employee id,
employee name, employee address, employee phone number and employee date of
join.
Each data record is uniquely identified by a key. For instance, in the employee data
record, employee id is the key.

The table 1 provides the employee data records in an organization:

Employee Employee Employee Department Date of Join
Number Name Address
45278 Employee 1 Kanpur Sales 01-Jan-2020
167352 Employee 2 Mysore Development 01-Mar-2020

12.3 FILE ORGANIZATION

File organization defines the way the data record is stored and retrieved from the files.
File organization provides the data record relationship specifying how the data records
are mapped to the disk blocks. We can either store fixed length record in the files or
use flexible file structure to support data records of varying length.

There are multiple ways of file organization such as sequential, direct, indexed
sequential and others. We can select the file organization type based on the use case,
information retrieval performance, ease of information management and others.

Given below are the most commonly used file organizations:

 Sequential files – The data records are stored in a pre-defined order. For instance,
the data records are sequenced based on the key values. In this file organization
type we need to sequentially access the record.

 Relative files – The data record is stored at a fixed location that can be accessed
using a static key value. As we can access the data record using a fixed value, we
can retrieve or insert the data record randomly in this file organization type.

 Direct files – Similar to the relative file that supports non-integer key value.

 Indexed sequential files – In this file organization we add an index to provide the
random access to the sequential files.

 Indexed files – The data records are indexed to improve the performance of the
data retrieval.

Figure 1 provides various file organizations that are most commonly used.

2
File Structures and
Advanced Data Structures

Figure 1 Types of File organizations

12.3.1 Key drivers for file organization

Given below are the main drivers for the file organization:

 File organization helps in optimal design of file records so that they can be
retrieved as quickly as possible.
 Improve the performance of file operations such as insert, update and delete.
 Ensure the data consistency and avoid duplicates during file operations.
 Provide optimal storage for managing the files.

12.4 SEQUENTIAL FILES

We write the data records in a specific sequence in sequential files. The data is also
accessed in the order it is written to the physical device. We usually order the data
records based on the key value. In Sequential files the reading happens from the
beginning of the file and write happens at the end of the file.
The legacy storage systems such as tape storage use sequential files.
12.4.1 Operations on sequential files
We discuss the main operations on the sequential files:
Insert operation
As the sequential files are organized based on their keys, when we insert we should
find the exact position of the new key value and insert the key over there.
If there are multiple insertions, we can batch them to optimize the performance
optimization. We can collect all the new inserts into transaction log and then sort the
new keys in order in the transaction log. We can then merge the entries in the
transaction log into the main file in batch mode.
Delete Operation
In the delete operation, we need to reclaim the space from the deleted record. Here as
well we can use the concept of transaction log for performance optimization. We can
add the delete marker for the key position in the transaction log and later we can
merge the changes to the main file in batch mode. During merge operation, we drop
the data records in the main file based on the marker.
Update Operation

3
File Structures For performing the update operation, we need to add the delete marker for the
corresponding key and then insert the new data record for the corresponding key in
the transaction log. Like always we merge the transaction log to the main file
Retrieval Operation
We read the data record based on its key. For stale data reads, we can directly read the
data record for the key from the main file. To maintain read consistency, we need to
check If there are any pending updates in the transaction log and if so we need to
merge the changes (deletion/insertion).
12.4.2 Types of sequential files
There are mainly two types of sequential files – Pile files and sorted files.
Pile Files
In Pile files, the data records are inserted in the order of their arrival. For instance,
let’s consider the below given incoming Data records –
DR1, DR3, DR5, DR2
The data records are inserted as given in Figure 2:

Figure 2 Pile File

For searching the records, the file is searched from the beginning till the desired
record is found. For deleting a record, we search for the record and add a deletion
marker for the found record.
For inserting a new record, it will be done at the end of the file. The new record DR4
is to be inserted as shown in Figure 3

Figure 3 Pile File Insertion

Sort Files
In case of sort files, the data records are sorted based on the key and are inserted into
the file.
For instance, let’s consider the below given incoming Data records –
DR1, DR3, DR5, DR2
The data records are inserted as given in Figure 4:

Figure 4 Sort File

4
While inserting a new record the key of the new record is sorted along with the File Structures and
Advanced Data Structures
existing records and then inserted. For instance when we try to insert the data record
DR4, the sorted insert is depicted in Figure 5

Figure 5 Sort file insert

12.4.3 Advantages and disadvantages of sequential files

In this section we discuss the main advantages and disadvantages of the sequential
files.
Advantages
The main advantages of sequential files are as follows:
1. Simple and easy to implement
2. Provides best usage of the storage space.
3. Can be used for data archival needs.
4. The underlying storage systems provide low-cost storage option.
5. Can be stored in inexpensive device like tape device.
Disadvantages
The main disadvantages of the sequential files are as follows:
1. The update operation is costlier and time consuming.
2. The access speed is slow due to the sequential access.
3. It is not possible to directly access a given data record key
4. All records should have uniform structure.
5. The insert time of sorted file is high.
6. Overall processing is slow.

12.4.4 Use cases for sequential files

We generally use sequential files for long-term archival of data and for batch
processing operations. We also use sequential access for storing large amount of data
that tolerates slow retrieval time.

☞ Check Your Progress – 1

1. A ______ in a file encapsulates the data of a single entity
2. Each data record is uniquely identified by a ____.
3. In _______ files, the data records are inserted in the order of their arrival.
4. For optimizing the performance, we get all the updates into a ________ and
later merge it into main file.
5. The two types of sequential files are ______ and __________.
6. Define a data record for a typical university’s student enrolment system.
5
File Structures
12.5 DIRECT FILE ORGANIZATION

In the direct file organization, we directly access the file by its key. To enable this
direct access, we need to map the key to the address where the data record is stored.
Direct access is also known as random access. The file records are stored in direct
access storage device (DASD) like hard disk, CD, magnetic disk. We randomly place
the record throughout the hard disk.

We have depicted the direct file organization in Figure 6

Figure 6 Direct File Organization

We use hash function to convert the key into the address. Naturally direct file access
is faster compared to sequential files.

In direct file organization, the records are stored at a known addressed as depicted in
Figure 6.

12.5.1 Hash Function and Collision Resolution

The most popular hash function is the modulo based function that computes the
remainder value as the hash value based on the key. The hash value is then used to
store the data record.
When there are huge set of keys, multiple keys will end up with same hash value
leading to collision. In such cases, we use the collision resolution technique. The
separate chaining technique chains the values to the same slot and open addressing
technique probes for next available slot for placing the value.

12.5.2 Advantages and disadvantages of Direct file organization

We shall look at the main advantages and disadvantages of the direct file organization.

Advantages
Given below are the main advantages of direct file organization:
1. As we can directly access the data, the retrieval is fast that helps in transaction
management systems such as relational database systems.
2. The insert, update and delete operations are faster.
3. The direct file organization is efficient in storage and retrieval of large data.
4. Key-based search is faster.
5. Can be used for real-time transactions that needs optimal performance.
6. Concurrent processing is possible

6
Disadvantages File Structures and
Advanced Data Structures
Given below are the main disadvantages of direct file organization:
1. The required storage technology is costlier.
2. The usage of storage space is sub optimal.
3. Insert, delete and update needs update of the index table.

12.6 INDEXED SEQUENTIAL FILE

ORGANIZATION

Indexed sequential file organization supports direct access of keys and also sequential
access by keys.
We use the indexed sequential file organization for hierarchical data organization. For
instance, lets consider a use case for retrieving the data record for the states. We use
the indexed/direct file for getting the country. The country data record points to the
sequential file storing the records of its constituent states. Indexed sequential files can
be stored only on random access device such as magnetic disk.
To implement the indexed sequential file, we decouple the index and data files. The
index file is structured hierarchically in tree structure whereas the data file stores the
information sequentially.
We use two types of indexes static index and dynamic index.
We have depicted the depicted the indexed sequential file in Figure 7.

Figure 7 Indexed sequential file

As depicted in Figure 7, the indexed sequential file is depicted as a two-level

hierarchy. The first level holds the index to each of the three records in the data file.
The second level holds the sequential set of records.

The first level is accessed directly where we get the keys from the index file. In the
second level we store the data record sequentially. As the data records are stored
sequentially, the key in the first level just points to the first record in the second level
from where it can be accessed sequentially.
7
File Structures When we have to insert the record, we update the data file based on the sequence and
update the index accordingly. Static index and dynamic index are two main types of
indexes. In static indexes the update to the records in the data file does not change the
index structure whereas in the dynamic index, the updates to the data file changes the
index structure.

12.6.1 Advantages and disadvantages of Indexed sequential file organization

We shall look at the main advantages and disadvantages of the indexed sequential file
organization
Advantages
Given below are the main advantages of indexed sequential file organization:
 As the records are directly accessed, the access speed is high
 Record insert is very fast.

Disadvantages
Given below are the main disadvantages of indexed sequential file organization:
 Usage of storage space is sub optimal
 The implementation is costly due to the required of costly hardware
 Extra storage for index file is required.

☞ Check Your Progress – 2

1. In the direct file organization, we directly access the file by its ___
2. The function that converts the key to address is called _____
3. The first level in the indexed sequential acess is done _______ and second
level is accessed _________
4. The data records in direct file organization are stored in ______storage device
5. _____ file organization for managing hierarchical data

12.7 SUMMARY

In this unit we mainly discussed about file organization and various types of file
organization. A file is a collection of data records. The key operations of the file are
File creation, file update, file deletion, file merging, file searching. File organization
defines the way the data record is stored and retrieved from the files. We write the
data records in a specific sequence in sequential files. There are mainly two types of
sequential files - Pile Files and sort files. In Pile files, the data records are inserted in
the order of their arrival. In case of sort files, the data records are sorted based on the
key and are inserted into the file. In the direct file organization, we directly access the
file by its key. Indexed sequential file organization supports direct access of keys and
also sequential access by keys.

12.8 SOLUTIONS/ANSWERS

☞ Check Your Progress – 1

1. Data record.
8
File Structures and
2. Key
Advanced Data Structures
3. Pile files
4. Transaction log
5. Pile files and sort files
6. A sample student enrollment structure is as follows:
Student Number, Student Name, Date of Enrollment, Address, Programme

☞ Check Your Progress – 2

1. Key
2. Hash function
3. Directly and sequentially
4. Direct access
5. Indexed sequential

12.9 FURTHER READINGS

References

https://en.wikipedia.org/wiki/File_system
https://en.wikipedia.org/wiki/Sequential_access
https://en.wikipedia.org/wiki/Indexed_file
https://en.wikipedia.org/wiki/ISAM

Unit 12 File Structures: Structure Page Nos
No ratings yet
Unit 12 File Structures: Structure Page Nos
7 pages
Unit 15
No ratings yet
Unit 15
6 pages
Ds Mod 5
No ratings yet
Ds Mod 5
17 pages
File Organization in RDBMS
No ratings yet
File Organization in RDBMS
9 pages
Files and Their Organization: Data Hierarchy
No ratings yet
Files and Their Organization: Data Hierarchy
17 pages
File Organization
No ratings yet
File Organization
17 pages
DSA Unit6 Theory
No ratings yet
DSA Unit6 Theory
23 pages
File Organization Midterm
No ratings yet
File Organization Midterm
43 pages
Unit 6 File Management
No ratings yet
Unit 6 File Management
70 pages
Keystone School of Engineering: Group F Assignment - 12
No ratings yet
Keystone School of Engineering: Group F Assignment - 12
4 pages
File and Database Design
No ratings yet
File and Database Design
28 pages
Lec 5DB
No ratings yet
Lec 5DB
40 pages
WINSEM2024-25 CBS1003 ETH VL2024250505129 2025-04-08 Reference-Material-I
No ratings yet
WINSEM2024-25 CBS1003 ETH VL2024250505129 2025-04-08 Reference-Material-I
12 pages
File Management in Operating Systems
No ratings yet
File Management in Operating Systems
40 pages
File Structure and Organization Concepts
No ratings yet
File Structure and Organization Concepts
17 pages
DBMS Book Special Notes PDF
No ratings yet
DBMS Book Special Notes PDF
68 pages
1-File Structure
No ratings yet
1-File Structure
17 pages
Unit 5 2nd Half
No ratings yet
Unit 5 2nd Half
37 pages
MODULE-5 FILE & Their Organization
No ratings yet
MODULE-5 FILE & Their Organization
13 pages
Fds Notes
No ratings yet
Fds Notes
15 pages
Explain File Management in An Operating System
No ratings yet
Explain File Management in An Operating System
57 pages
Group F - 11
No ratings yet
Group F - 11
4 pages
Week 14 Persistent Data Storage
No ratings yet
Week 14 Persistent Data Storage
7 pages
FDSUNIT4
No ratings yet
FDSUNIT4
6 pages
OSY Chapter 6 SSP
No ratings yet
OSY Chapter 6 SSP
24 pages
CAIE-A2 Level-Computer Science - Theory
No ratings yet
CAIE-A2 Level-Computer Science - Theory
27 pages
Unit 1 Lecture 9
No ratings yet
Unit 1 Lecture 9
22 pages
File Organization for SE Computer Students
No ratings yet
File Organization for SE Computer Students
66 pages
Data Structure Unit 5
50% (4)
Data Structure Unit 5
14 pages
Unit 1 Introduction To Dbms
No ratings yet
Unit 1 Introduction To Dbms
27 pages
File Organization Methods Explained
No ratings yet
File Organization Methods Explained
67 pages
DSA Practicle 11
No ratings yet
DSA Practicle 11
5 pages
Dbms 5
No ratings yet
Dbms 5
38 pages
Ss2 Data Processing 2nd Term
0% (1)
Ss2 Data Processing 2nd Term
33 pages
File Organization
No ratings yet
File Organization
4 pages
1.file Organization
No ratings yet
1.file Organization
90 pages
Unit 6
No ratings yet
Unit 6
20 pages
Data Structure Lecture 1
No ratings yet
Data Structure Lecture 1
36 pages
OS-Chapter 5 - File Management
100% (1)
OS-Chapter 5 - File Management
10 pages
Lesson Note: Concept of Computer Files
No ratings yet
Lesson Note: Concept of Computer Files
4 pages
DSA Unit VI
No ratings yet
DSA Unit VI
14 pages
Lecturer Notes BCA Pt-II - OOPs Unit-V File Management-I
No ratings yet
Lecturer Notes BCA Pt-II - OOPs Unit-V File Management-I
11 pages
File Organization for CS Students
No ratings yet
File Organization for CS Students
13 pages
2 - Week8 Lecture Note File Systems
No ratings yet
2 - Week8 Lecture Note File Systems
38 pages
File Organization
No ratings yet
File Organization
4 pages
Chapter 5
No ratings yet
Chapter 5
20 pages
Operating Systems: Internals and Design Principles: File Management
No ratings yet
Operating Systems: Internals and Design Principles: File Management
73 pages
Unit 6
No ratings yet
Unit 6
56 pages
FP-Lecture-6 01
No ratings yet
FP-Lecture-6 01
33 pages
File Organization Techniques
No ratings yet
File Organization Techniques
31 pages
5843 - Computer Note For SS2 On Concept of Computer File 2024
No ratings yet
5843 - Computer Note For SS2 On Concept of Computer File 2024
7 pages
2022 - CMP 262 - File Organisation - Slides
No ratings yet
2022 - CMP 262 - File Organisation - Slides
19 pages
CSC 216 - File Organization and Data Processing
No ratings yet
CSC 216 - File Organization and Data Processing
24 pages
7.3 Section 3 File Organisation
No ratings yet
7.3 Section 3 File Organisation
7 pages
File Organisation
No ratings yet
File Organisation
45 pages
Chapter 11 File Management
No ratings yet
Chapter 11 File Management
13 pages
Database Design Process
No ratings yet
Database Design Process
29 pages
Syllabus of BDA
No ratings yet
Syllabus of BDA
2 pages
DW Unit I Notes
No ratings yet
DW Unit I Notes
28 pages
Notes Chapter 2.2 Lecture 2.2.3 (Database Security Counter Measures)
No ratings yet
Notes Chapter 2.2 Lecture 2.2.3 (Database Security Counter Measures)
9 pages
CLF C02
No ratings yet
CLF C02
4 pages
F20 I2c Interview Questions
No ratings yet
F20 I2c Interview Questions
11 pages
Database Homework for Students
No ratings yet
Database Homework for Students
5 pages
World Setup En.a4
No ratings yet
World Setup En.a4
4 pages
Dbms DPP
No ratings yet
Dbms DPP
32 pages
Chapter-6 Consistency and Replication
No ratings yet
Chapter-6 Consistency and Replication
67 pages
Big Data: 13. Querying Trees
No ratings yet
Big Data: 13. Querying Trees
219 pages
Microsoft SQL Server Editions Guide
No ratings yet
Microsoft SQL Server Editions Guide
5 pages
Power Query Recipe Cards TOC v2501
No ratings yet
Power Query Recipe Cards TOC v2501
10 pages
What Is OData With Example
No ratings yet
What Is OData With Example
3 pages
Connect To SQL Server When System Administrators Are Locked Out - SQL Server - Microsoft Docs
No ratings yet
Connect To SQL Server When System Administrators Are Locked Out - SQL Server - Microsoft Docs
7 pages
Chapter 2: Data Model: I-The Entity-Relationship Model
100% (1)
Chapter 2: Data Model: I-The Entity-Relationship Model
48 pages
GL Tables
No ratings yet
GL Tables
1,225 pages
Concepts of Database Management Ninth Edition Joy Starks Full Chapters Included
No ratings yet
Concepts of Database Management Ninth Edition Joy Starks Full Chapters Included
118 pages
CST121 Database Management With Access-Week4, 5 6
No ratings yet
CST121 Database Management With Access-Week4, 5 6
61 pages
Database Management System Exams
No ratings yet
Database Management System Exams
8 pages
Dbms Mini Project Sample Report Car Rental Management System
No ratings yet
Dbms Mini Project Sample Report Car Rental Management System
44 pages
4pplyi1: Duplcomge Te Me Eniohg
No ratings yet
4pplyi1: Duplcomge Te Me Eniohg
17 pages
Database & SDLC Fundamentals Quiz
No ratings yet
Database & SDLC Fundamentals Quiz
8 pages
Tripwire Enterprise 9.1 User Guide
No ratings yet
Tripwire Enterprise 9.1 User Guide
6 pages
Data Warehouse
No ratings yet
Data Warehouse
5 pages
Syllabus NEU DAMG 6210 Fall Seattle 2023
No ratings yet
Syllabus NEU DAMG 6210 Fall Seattle 2023
10 pages
Efficient Dispatcher Mechanismfor SIPCluster Basedon Memory Utilization
No ratings yet
Efficient Dispatcher Mechanismfor SIPCluster Basedon Memory Utilization
7 pages
Big-O Cheat Sheet-Letter
No ratings yet
Big-O Cheat Sheet-Letter
4 pages
Lab 8 - DML Selection Projection
No ratings yet
Lab 8 - DML Selection Projection
7 pages
An Architect's View of The Hazelcast Platform
No ratings yet
An Architect's View of The Hazelcast Platform
14 pages

Files Structrews

Uploaded by

Files Structrews

Uploaded by

File Structures and

UNIT 12 FILE STRUCTURES Advanced Data Structures

After going through this unit, you should be able to

12.2 BASIC CONCEPTS OF FILE

The table 1 provides the employee data records in an organization:

12.3 FILE ORGANIZATION

Given below are the most commonly used file organizations:

Figure 1 Types of File organizations

12.3.1 Key drivers for file organization

12.4 SEQUENTIAL FILES

Figure 2 Pile File

Figure 3 Pile File Insertion

Figure 4 Sort File

Figure 5 Sort file insert

12.4.3 Advantages and disadvantages of sequential files

12.4.4 Use cases for sequential files

☞ Check Your Progress – 1

We have depicted the direct file organization in Figure 6

Figure 6 Direct File Organization

12.5.1 Hash Function and Collision Resolution

12.5.2 Advantages and disadvantages of Direct file organization

12.6 INDEXED SEQUENTIAL FILE

Figure 7 Indexed sequential file

As depicted in Figure 7, the indexed sequential file is depicted as a two-level

12.6.1 Advantages and disadvantages of Indexed sequential file organization

☞ Check Your Progress – 2

☞ Check Your Progress – 1

☞ Check Your Progress – 2

12.9 FURTHER READINGS

You might also like