Unit 2
File system
The part of an operating system that deals with files is known as the file system.
A file system is a collection of files together with a collection of directory structures.
SUTEX BANK COLLEGE OF COMPUTER APPLICATION & SCIENCE AMROLI
203 OPERATING SYSTEM
The file system consists of two distinct parts: a collection of files, each storing
related data, and a directory structure, which organizes and provides information
about all the files in the system.
It is the responsibility of a file system to manage all the files.
It provides facilities to create & delete files, read & write contents of files, etc.
It specifies conventions for naming files. These conventions include the maximum
number of characters in a name, which characters can be used, and, in some systems,
how long the file name suffix can be.
A file system also includes a format for specifying the path to a file through the
structure of directories.
File Naming
Files store information on disk. For the user's convenience, a file is given a name
when it is created and is referred to by that name afterwards, so the user need not be
concerned with where on the disk the information is stored or how the disk works.
Current operating systems allow strings of one to eight letters as legal file names.
Digits and special characters are often permitted as well, and many systems support
names of up to 255 characters.
Some file systems distinguish between uppercase and lowercase letters, and others do
not. UNIX is case sensitive, so test1, Test1, and TEST1 are three distinct files, while
in MS-DOS all three names refer to the same file.
Operating systems often use two-part file names separated by a period (.), with the
extension following it indicating the file's content.
MS-DOS file names are 1 to 8 characters long, with an optional extension of 1 to 3
characters. In UNIX, the length of the extension, if any, is up to the user, and a file
may have two or more extensions.
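The naming rules above can be checked mechanically. The following Python sketch is only an
illustration: it assumes the MS-DOS rule exactly as stated here (1 to 8 characters, optional
extension of 1 to 3 characters) and, as a simplification, accepts only letters and digits.
The helper names is_dos_name, same_file, and extensions are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Simplified MS-DOS rule: 1-8 characters, then an optional period and a
# 1-3 character extension. Only letters and digits are accepted here.
DOS_NAME = re.compile(r"^[A-Za-z0-9]{1,8}(\.[A-Za-z0-9]{1,3})?$")

def is_dos_name(name):
    """Return True if name fits the simplified MS-DOS pattern."""
    return DOS_NAME.match(name) is not None

def same_file(a, b, case_sensitive):
    """Compare two names as a case-sensitive (UNIX-style) or
    case-insensitive (MS-DOS-style) file system would."""
    return a == b if case_sensitive else a.upper() == b.upper()

def extensions(name):
    """List every extension in a name that has more than one."""
    return PurePosixPath(name).suffixes

print(is_dos_name("REPORT.TXT"))                        # True
print(is_dos_name("longfilename.txt"))                  # False
print(same_file("test1", "TEST1", case_sensitive=True))   # False
print(same_file("test1", "TEST1", case_sensitive=False))  # True
print(extensions("prog.c.Z"))                           # ['.c', '.Z']
```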
Some of the more common file extensions are given in the following figure.
File attribute
● When a file is named, it becomes independent of the process, the user, and even the
system that created it. For instance, one user might create the file, and another user
might edit that file by specifying its name.
● A file’s attributes vary from one OS to another but typically consist of these:
o Name: A file is named for the convenience of the user and is referred to by its
name. The symbolic file name is the only information kept in human-readable
form.
o Identifier: This unique tag, usually a number, identifies the file within the file
system; it is the non–human-readable name for the file.
o Type: This information is needed for systems that support different file
types. The type is often indicated by the file's extension, for example .exe for
an executable file and .obj for an object file.
o Location: This information is a pointer to a device and to the location of the
file on that device.
o Size: The current size of the file (in bytes, words, or blocks), and possibly the
maximum allowed size are included in this attribute.
o Protection: Access-control information determines who can do reading,
writing, executing, and so on.
o Usage Count: This value indicates the number of processes that are currently
using this file.
o Time, date, and user identification: This information may be kept for
creation, last modification, and last use. These data can be useful for
protection, security, and usage monitoring.
● The information about all files is kept in the directory structure, which also resides on
secondary storage; typically, a directory entry consists of the file’s name and its
unique identifier.
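Most of these attributes can be inspected from a program. The sketch below uses Python's
standard os.stat call on a scratch file; the function name describe and the file name
notes.txt are illustrative only, and the identifier shown is the inode number that
UNIX-like systems use as the file's unique tag.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Collect the attributes the operating system keeps for path."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                   # unique tag within the file system
        "is_regular": stat.S_ISREG(info.st_mode),    # type
        "size_bytes": info.st_size,                  # current size
        "protection": stat.filemode(info.st_mode),   # e.g. '-rw-r--r--'
        "owner_uid": info.st_uid,                    # user identification
        "last_modified": time.ctime(info.st_mtime),
        "last_accessed": time.ctime(info.st_atime),
    }

# Demonstration on a temporary file so nothing permanent is created.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "notes.txt")
    with open(sample, "w") as f:
        f.write("hello")
    attrs = describe(sample)

for key, value in attrs.items():
    print(f"{key}: {value}")
```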
File types
● The types of files recognized by the system are regular, directory, or special.
However, the operating system uses many variations of these basic types.
Regular files
● Regular files are the most common files and are used to contain data. Regular
files are in the form of text files or binary files:
✔ Text files
● Text files are regular files that contain information stored in ASCII format text
and are readable by the user. You can display and print these files.
● The lines of a text file must not contain NUL characters, and no line can exceed the
system's maximum line length in bytes, including the newline character.
✔ Binary files
● Binary files are regular files that contain information readable by the computer.
● Binary files might be executable files that instruct the system to accomplish a
job. Commands and programs are stored in executable, binary files.
● Multimedia files (mpeg, mov, rmm, mp3, avi) are binary files containing audio
or A/V information.
Creating file:
A new file can be created by a system call embedded in a program or by an OS
command issued by an interactive user.
Two steps are necessary to create a file.
1. Space in the file system must be found for the file.
2. An entry for the new file must be made in the directory.
Writing a file:
To write a file, we make a system call specifying both the name of the file and the
information to be written to the file. The system must keep a write pointer to the
location in the file where the next write is to take place. The write pointer must be
updated whenever a write occurs.
Reading a file:
To read from a file, we use a system call that specifies the file name and where (in
memory) the next block of the file should be put.
The system needs to keep a read pointer to the location in the file where the next read
is to take place.
Because a process is usually either reading from or writing to a file, the current
operation location can be kept as a per-process current-file-position pointer.
Both the read and write operations use this same pointer, saving space and reducing
system complexity.
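The shared pointer can be observed with the low-level calls in Python's os module. This is
a minimal sketch; the function name shared_pointer_demo and the file name demo.bin are
made up for the example. Calling os.lseek with an offset of 0 from the current position
simply reports where the pointer is.

```python
import os
import tempfile

def shared_pointer_demo(path):
    """Show that reads and writes advance one current-file-position pointer."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"HELLO")
        after_write = os.lseek(fd, 0, os.SEEK_CUR)   # the write moved the pointer to 5
        os.lseek(fd, 0, os.SEEK_SET)                 # back to the start of the file
        os.read(fd, 2)
        after_read = os.lseek(fd, 0, os.SEEK_CUR)    # the read moved the same pointer to 2
        os.write(fd, b"xy")                          # so this write lands at offset 2
        os.lseek(fd, 0, os.SEEK_SET)
        contents = os.read(fd, 100)
    finally:
        os.close(fd)
    return after_write, after_read, contents

with tempfile.TemporaryDirectory() as folder:
    result = shared_pointer_demo(os.path.join(folder, "demo.bin"))

print(result)   # (5, 2, b'HExyO')
```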
Repositioning within a file
The directory is searched for the appropriate entry, and the current-file-position pointer
is repositioned to a given value. Repositioning within a file need not involve any
actual I/O. This file operation is also known as a file seek.
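As an illustration of a file seek, the sketch below reads only the last few bytes of a file
by repositioning the pointer instead of reading everything before them. The function name
read_tail and the file name digits.bin are hypothetical.

```python
import os
import tempfile

def read_tail(path, count):
    """Return the last count bytes of a file by seeking, not by reading through it."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)    # jump to the end; seek returns the new position
        f.seek(max(size - count, 0))     # reposition; no file data has been read yet
        return f.read()

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "digits.bin")
    with open(path, "wb") as f:
        f.write(b"0123456789")
    tail = read_tail(path, 3)
    whole = read_tail(path, 50)          # asking for more than exists returns the whole file

print(tail)    # b'789'
```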
Deleting a file
To delete a file, we search the directory for the named file.
Having found the associated directory entry, we release all file space, so that it can be
reused by other files, and erase the directory entry.
Truncating a file
The user may want to erase the contents of a file but keep its attributes.
Rather than forcing the user to delete the file and then recreate it, this function allows
all attributes to remain unchanged (except for file length) but lets the file be reset to
length zero and its file space released.
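A small sketch of truncation using Python's os.truncate follows. It compares the file's
attributes before and after to show that only the length changes; the function name
truncate_keep_attributes and the file name log.txt are illustrative.

```python
import os
import stat
import tempfile

def truncate_keep_attributes(path):
    """Reset a file to length zero and return its attributes before and after."""
    before = os.stat(path)
    os.truncate(path, 0)      # contents released; the directory entry stays
    after = os.stat(path)
    return before, after

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "log.txt")
    with open(path, "w") as f:
        f.write("old contents")
    before, after = truncate_keep_attributes(path)

print(before.st_size, after.st_size)                              # 12 0
print(before.st_ino == after.st_ino)                              # same file identifier
print(stat.S_IMODE(before.st_mode) == stat.S_IMODE(after.st_mode))  # same protection bits
```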
Open a file
Before a file can be used, it must be opened. File attributes and data contents are
fetched into main memory for rapid access on later calls.
Memory space in the main memory is allocated to store fetched information.
Close a file
When the use of the file is finished, it should be closed to free up main memory space.
File attributes and data contents are stored back on disk. This may contain modified
information if the file is updated.
Append
This is a restricted form of write operation. Here, data are only added to the end of a
file.
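In Python, append corresponds to opening a file in "a" mode, where every write goes to the
end of the file. The helper name append_record and the file name log.txt are assumptions
made for this sketch.

```python
import os
import tempfile

def append_record(path, text):
    """Add text as a new line at the end of the file; earlier data is untouched."""
    with open(path, "a") as f:    # "a" = append mode
        f.write(text + "\n")

with tempfile.TemporaryDirectory() as folder:
    log = os.path.join(folder, "log.txt")
    append_record(log, "first")   # append mode creates the file if it does not exist
    append_record(log, "second")
    with open(log) as f:
        lines = f.read().splitlines()

print(lines)   # ['first', 'second']
```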
Get Attributes
This operation is used to retrieve file attributes.
Set Attributes
This operation is used to write file attributes. Generally, it is used to change file
attributes such as file protection information.
Rename
This operation is used to change the name of an existing file. This is not strictly
necessary because a file can be copied to a new file with a new name and then the old
file can be deleted.
2.3 File Access Methods
Files store information. When it is used, this information must be accessed and read into
computer memory. The information in the file can be accessed in several ways.
We will learn about the three types of file access methods. They are:
1. Sequential access
2. Direct access
3. Indexed sequential access
1. Sequential access
● It is one of the simplest access methods and is widely used by editors and compilers.
● Most operating systems access files sequentially.
● Audio cassettes use the same access method: the tape must be wound past the earlier
tracks to reach a later one.
● The files are a collection of records. Accessing the file is equivalent to accessing the
records. In the sequential access method, each record is accessed sequentially, one
after the other. Consider the below image for more clarity.
The figure represents a file. The current pointer is pointing to the record currently being
accessed. In the sequential access method, the current pointer cannot directly jump to any
record. It has to "cross" every record that comes in its path. Suppose there are nine records in
the file from R1 to R9. The current pointer is at record R6. If we want to access record R8,
we have to first access record R6 and record R7. This is one of the major disadvantages of the
sequential access method.
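The R6 to R8 example can be simulated in a few lines of Python. This is a toy model, not
how an operating system stores files: the class SequentialFile and its methods are
hypothetical, and the records are plain strings held in a list.

```python
class SequentialFile:
    """A file whose records can only be read one after the other."""

    def __init__(self, records):
        self.records = list(records)
        self.position = 0                  # the current pointer

    def read_next(self):
        """Return the record under the pointer and advance by one."""
        record = self.records[self.position]
        self.position += 1
        return record

    def read_until(self, wanted):
        """Read forward until wanted is reached; return every record crossed."""
        crossed = []
        while self.position < len(self.records):
            record = self.read_next()
            crossed.append(record)
            if record == wanted:
                return crossed
        raise LookupError(f"{wanted} not found after the current pointer")

f = SequentialFile(f"R{i}" for i in range(1, 10))   # R1 .. R9
f.position = 5                                      # the pointer is at R6
crossed = f.read_until("R8")
print(crossed)   # ['R6', 'R7', 'R8']
```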
Advantages of Sequential Access:
● The sequential access mechanism is very easy to implement.
● It uses lexicographic order to enable quick access to the next entry.
Disadvantages of Sequential Access:
● Sequential access will become slow if the next file record to be retrieved is not
present next to the currently pointed record.
● Adding a new record may require relocating a significant number of the file's records.
2. Direct/Random/Relative Access
● Direct-access files are of great use for immediate access to large amounts of
information, as in database systems.
● Sequential access can be very slow and inefficient in such cases.
● Suppose every block of storage holds 4 records and we know that the record we
need is stored in the 10th block. Sequential access is a poor fit here because it
would have to cross all nine preceding blocks to reach the needed record.
● The direct-access method requires file operations to include the block number as a
parameter, which is typically a relative block number.
● This allows the OS to decide where the file should be placed and prevents users from
accessing unrelated parts of the file system.
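The block arithmetic can be sketched as follows, assuming fixed-size records of 16 bytes
and 4 records per block as in the example above. The record size, the function names, and
the in-memory file built with io.BytesIO are all assumptions made for illustration.
Relative block numbers start at 0, so the 10th block is block number 9.

```python
import io

RECORD_SIZE = 16                      # bytes per record (assumed for this sketch)
RECORDS_PER_BLOCK = 4
BLOCK_SIZE = RECORD_SIZE * RECORDS_PER_BLOCK

def read_block(f, block_number):
    """Read one block given its relative block number (0 = first block of the file)."""
    f.seek(block_number * BLOCK_SIZE)  # jump straight to the block
    return f.read(BLOCK_SIZE)

def read_record(f, record_number):
    """Fetch a record directly, without touching the blocks before it."""
    block_number, slot = divmod(record_number, RECORDS_PER_BLOCK)
    block = read_block(f, block_number)
    start = slot * RECORD_SIZE
    return block[start:start + RECORD_SIZE].rstrip(b" ")

# Build a 40-record file in memory; each record is padded to RECORD_SIZE bytes.
data = b"".join(f"record-{i:02d}".encode().ljust(RECORD_SIZE) for i in range(40))
f = io.BytesIO(data)

record = read_record(f, 37)           # record 37 lives in block 9, the 10th block
print(record)    # b'record-37'
```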
3. Indexed sequential access
To find a particular item, we first make a binary search of the master index, which
provides the block number of the secondary index block.
The secondary index blocks point to the actual file blocks.
The secondary index block is read in, and again a binary search is used to find the
block containing the desired record. Finally, this block is searched sequentially.
In this way any record can be located from its key by a small, fixed number of
direct-access reads: one for the secondary index block and one for the file block.
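The lookup just described can be sketched with Python's bisect module. Everything here is
a toy model with made-up data: the lists stand in for disk blocks, the master index is
assumed to be held in memory, and the reads counter only simulates how many blocks would
be fetched.

```python
from bisect import bisect_right

# File blocks: each holds (key, value) records sorted by key.
file_blocks = [
    [(10, "A"), (20, "B")],
    [(30, "C"), (40, "D")],
    [(50, "E"), (60, "F")],
    [(70, "G"), (80, "H")],
]
# Secondary index blocks: (first key in a file block, file block number).
secondary_index = [
    [(10, 0), (30, 1)],
    [(50, 2), (70, 3)],
]
# Master index: the first key covered by each secondary index block.
master_index = [10, 50]

def lookup(key):
    """Return (value, simulated block reads) for key, or (None, reads) if absent."""
    reads = 0
    s = bisect_right(master_index, key) - 1       # binary search of the master index
    if s < 0:
        return None, reads                        # key is smaller than every stored key
    index_block = secondary_index[s]              # read the secondary index block
    reads += 1
    first_keys = [first for first, _ in index_block]
    b = bisect_right(first_keys, key) - 1         # binary search inside that block
    block = file_blocks[index_block[b][1]]        # read the file block
    reads += 1
    for record_key, value in block:               # final sequential search
        if record_key == key:
            return value, reads
    return None, reads

print(lookup(60))   # ('F', 2)
print(lookup(35))   # (None, 2)
```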
Advantages of Indexed Sequential Access:
● If the index table is appropriately arranged, records can be accessed very quickly.
● Records can be added at any position in the file quickly.
Disadvantages of Indexed Sequential Access:
● Compared to the other file access methods, it is costly and less efficient.
● It needs additional storage space.
Directory systems in file management provide a structure for organizing, storing, and
retrieving files on a storage device. Key functions of directory systems include:
1. File Organization and Storage:
Directories group related files and subdirectories, creating an organized hierarchy that
enables users to quickly locate and manage files.
2. File Naming and Identification:
Assigns unique names to files within each directory, preventing conflicts and making
files easily identifiable.
3. Path Management:
Supports path structures (absolute and relative paths) that allow users and programs to
navigate the file system and access files across various directories.
4. Access Control and Security:
Enables the setting of permissions on files and directories, allowing controlled access
for different users and preventing unauthorized modifications.
5. File Retrieval and Search:
Provides tools to search and retrieve files based on names, extensions, or other
attributes, improving accessibility.
6. File Metadata Storage:
Maintains metadata (such as file size, type, creation/modification dates) to manage
file information, aiding in file sorting and tracking.
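Path management (function 3 above) can be illustrated with Python's posixpath module, which
handles UNIX-style paths on any platform. The function name resolve and the directory
names in the example are hypothetical.

```python
import posixpath

def resolve(current_dir, path):
    """Turn an absolute or relative path into a normalized absolute path."""
    if posixpath.isabs(path):
        # An absolute path starts at the root and ignores the current directory.
        return posixpath.normpath(path)
    # A relative path is interpreted starting from the current directory.
    return posixpath.normpath(posixpath.join(current_dir, path))

print(resolve("/home/user1", "docs/report.txt"))     # /home/user1/docs/report.txt
print(resolve("/home/user1", "../user2/notes.txt"))  # /home/user2/notes.txt
print(resolve("/home/user1", "/etc/hosts"))          # /etc/hosts
```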
Single Level Directory
Advantages
1. Implementation is very simple.
2. If the number of files is small, searching is fast.
3. File creation, searching, and deletion are very simple since we have only one directory.
Disadvantages
1. It is not suitable for multi-user systems. Different users may give the same name
to different files, possibly overwriting each other's files. This is called a file name
collision.
2. It is not suitable even for a single-user system when the number of files becomes too large.
3. The directory may be very big, so searching for a file may take a long time.
4. Different files can’t be grouped.
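The file name collision in point 1 is easy to reproduce in a toy model where the single
directory is one Python dictionary shared by all users. The class name, user names, and
file names below are invented for the sketch; this version refuses the second create
rather than overwriting.

```python
class SingleLevelDirectory:
    """One flat directory shared by every user of the system."""

    def __init__(self):
        self.entries = {}                 # file name -> (owner, contents)

    def create(self, user, name, contents):
        if name in self.entries:
            owner = self.entries[name][0]
            raise FileExistsError(f"{name!r} already exists (created by {owner})")
        self.entries[name] = (user, contents)

    def search(self, name):
        return self.entries.get(name)

directory = SingleLevelDirectory()
directory.create("user_a", "test.txt", "data of user A")
try:
    directory.create("user_b", "test.txt", "data of user B")
    collision = False
except FileExistsError as error:
    collision = True
    print("collision:", error)
```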
Two Level Directory
The standard solution to the limitations of a single-level directory is to create a separate
directory for each user. There is one master file directory (MFD), which contains a separate
user file directory (UFD) dedicated to each user. The UFDs sit at the second level, and each
one contains that user's group of files.
In a two-level directory structure, each user's directory is created directly inside the root
(master) directory. The system doesn't let a user enter another user's directory without
permission.
In a two-level directory structure, each user has their own UFD. The structure forms a tree
with the MFD as the root, UFDs as its direct descendants, and files as leaves.
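A two-level structure can be modeled as a dictionary of dictionaries: the outer one plays
the role of the master directory and each inner one is a user's own directory. The class
and user names are hypothetical, and as a simplification this sketch never grants one user
permission to enter another user's directory.

```python
class TwoLevelDirectory:
    """Master directory at the root with one directory per user below it."""

    def __init__(self):
        self.master = {}                       # user name -> that user's directory

    def add_user(self, user):
        self.master.setdefault(user, {})       # user's directory: file name -> contents

    def create(self, user, name, contents):
        user_dir = self.master[user]
        if name in user_dir:
            raise FileExistsError(name)
        user_dir[name] = contents

    def read(self, requester, owner, name):
        if requester != owner:                 # simplification: no sharing at all
            raise PermissionError(f"{requester} may not enter {owner}'s directory")
        return self.master[owner][name]

system = TwoLevelDirectory()
system.add_user("user_a")
system.add_user("user_b")
system.create("user_a", "test.txt", "data of user A")
system.create("user_b", "test.txt", "data of user B")   # same name, no collision

own = system.read("user_b", "user_b", "test.txt")
try:
    system.read("user_b", "user_a", "test.txt")
    blocked = False
except PermissionError:
    blocked = True

print(own, blocked)   # data of user B True
```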
Advantages
Disadvantages
1. It is not suitable for users having large number of files.
2. Different files cannot be grouped.
Tree Structured Directory
In a tree-structured directory system, users are given the privilege to create files as
well as directories.
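A tree of directories can be sketched with nested dictionaries, where a directory is a
dictionary and a file is a string. The class DirectoryTree, the helper split_path, and all
paths below are made up for this illustration. Error handling is minimal: a missing path
component simply raises KeyError.

```python
def split_path(path):
    """'/users/a/notes.txt' -> ['users', 'a', 'notes.txt']"""
    return [part for part in path.split("/") if part]

class DirectoryTree:
    def __init__(self):
        self.root = {}                 # a directory is a dict, a file is a string

    def _directory(self, parts):
        """Walk down from the root and return the directory at the given path."""
        node = self.root
        for part in parts:
            node = node[part]
            if not isinstance(node, dict):
                raise NotADirectoryError(part)
        return node

    def mkdir(self, path):
        *parents, name = split_path(path)
        self._directory(parents).setdefault(name, {})

    def write(self, path, contents):
        *parents, name = split_path(path)
        self._directory(parents)[name] = contents

    def read(self, path):
        *parents, name = split_path(path)
        return self._directory(parents)[name]

tree = DirectoryTree()
tree.mkdir("/users")
tree.mkdir("/users/a")
tree.mkdir("/users/a/docs")            # users can create directories ...
tree.write("/users/a/docs/notes.txt", "hello")   # ... as well as files
tree.write("/users/a/todo.txt", "revise unit 2")

print(tree.read("/users/a/docs/notes.txt"))      # hello
print(sorted(tree._directory(["users", "a"])))   # ['docs', 'todo.txt']
```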
Advantages
Organizes files hierarchically, making management easier.
Allows logical grouping and categorization of files.
Facilitates easy navigation and access control.
Supports efficient disk space utilization by minimizing duplicates.
Scalable for adding new files or directories as needed.
Disadvantages
Can become complex and hard to navigate with deep structures.
Long path names may be inconvenient and error-prone.
Risk of unintentional file duplication across directories.
Requires ongoing maintenance and organization.
o In the case of a hard link, the actual file is deleted only when all the references
to it are deleted.
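This behaviour can be observed with Python's os.link on a file system that supports hard
links (most UNIX-like systems do). The sketch is illustrative only; the function name
hard_link_demo and the file names are hypothetical. The st_nlink field reports how many
directory entries currently refer to the file.

```python
import os
import tempfile

def hard_link_demo(folder):
    """Create a file with two names, delete one, and show the data survives."""
    original = os.path.join(folder, "report.txt")
    alias = os.path.join(folder, "shared.txt")
    with open(original, "w") as f:
        f.write("shared data")
    os.link(original, alias)                  # second directory entry, same file
    links_before = os.stat(alias).st_nlink
    os.remove(original)                       # removes one reference only
    links_after = os.stat(alias).st_nlink
    with open(alias) as f:
        survived = f.read()                   # the data is still reachable
    os.remove(alias)                          # last reference gone; the file is deleted
    return links_before, links_after, survived

with tempfile.TemporaryDirectory() as folder:
    outcome = hard_link_demo(folder)

print(outcome)   # (2, 1, 'shared data')
```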
Advantages
It allows sharing of files and directories.
Disadvantages
Deletion of a shared file or directory is complex. When such a file or directory is deleted,
its directory entry must be removed from all related directories.
A file may have multiple file paths. This may create problems in some operations. For
example, if a user copies all files to backup storage, such files will be copied
multiple times.
It is difficult to ensure that there are no cycles in the directory structure.
o The first problem with this method is that constructing an ACL is a tedious task.
Also, all users of the system are not known in advance.
o The second problem is that the ACL is kept in the directory entry, so the
directory entry must be of varying length instead of fixed length. This results
in more complicated space management.
o The solution to these problems is to divide all users into categories and then to
grant access to those categories instead of to individual users.
o UNIX uses this type of scheme.
o It divides users into three categories:
Owner,
Group and
Others
o Access is granted per category. All the users in a category are allowed
that access.
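The three-category check can be sketched with the permission-bit constants from Python's
stat module. This is a simplified model: the function name may_access and the numeric user
and group IDs are assumptions, and special cases such as the superuser are ignored. Only
the first matching category (owner, then group, then others) is consulted.

```python
import stat

# Permission bits for each kind of access, in the order owner, group, others.
BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}

def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Decide whether user uid (member of groups gids) gets access want
    ('r', 'w' or 'x') to a file with the given mode, owner, and group."""
    owner_bit, group_bit, other_bit = BITS[want]
    if uid == file_uid:                       # category: Owner
        return bool(mode & owner_bit)
    if file_gid in gids:                      # category: Group
        return bool(mode & group_bit)
    return bool(mode & other_bit)             # category: Others

mode = 0o640   # owner: read and write, group: read, others: nothing
print(may_access(mode, 1000, 50, uid=1000, gids={50}, want="w"))   # True  (owner)
print(may_access(mode, 1000, 50, uid=1001, gids={50}, want="r"))   # True  (group)
print(may_access(mode, 1000, 50, uid=1001, gids={50}, want="w"))   # False (group)
print(may_access(mode, 1000, 50, uid=1002, gids={60}, want="r"))   # False (others)
```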