Prepared By: Jerusalem Y.
File System   5/9/2019   1
   Fundamental concepts (data, metadata, operations,
    organization, buffering, sequential vs. non-sequential files)
   Content and structure of directories
   File system techniques (partitioning, mounting and
    unmounting, virtual file systems)
   Memory-mapped files
   Special-purpose file systems
   Naming, searching, and access
   Backup strategies
   In computing, a file system is used to control how data is
    stored and retrieved.
   Without a file system, information placed in a storage medium
    would be one large body of data with no way to tell where one
    piece of information stops and the next begins.
   By separating the data into pieces and giving each piece a
    name, the information is easily isolated and identified.
   Taking its name from the way paper-based information
    systems are named, each group of data is called a "file".
   The structure and logic rules used to manage the groups of
    information and their names are called a "file system".
   A file is a named collection of related information that is
    recorded on secondary storage such as magnetic disks,
    magnetic tapes and optical disks.
   In general, a file is a sequence of bits, bytes, lines or records
    whose meaning is defined by the file's creator and user.
   Data is a set of values of qualitative or quantitative variables.
   Metadata is data (information) that provides information
    about other data.
   In the 2010s, metadata typically refers to digital forms, but
    traditional card catalogues contain metadata, with cards
    holding information about books in a library (author, title,
    subject, etc.)
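
The difference between a file's data and its metadata can be seen on any ordinary file. The following is a minimal Python sketch, not a definitive implementation: the helper name `describe`, the file name `note.txt`, and its contents are illustrative assumptions. It reads the contents (data) with `open` and the information the file system keeps about the file (metadata) with `os.stat`.

```python
import os
import tempfile
import time

def describe(path):
    """Return (data, metadata) for the file at path."""
    with open(path, "rb") as f:
        data = f.read()                  # the file's contents: the data
    info = os.stat(path)                 # what the file system records about it: metadata
    metadata = {
        "size_bytes": info.st_size,
        "last_modified": time.ctime(info.st_mtime),
        "permission_bits": oct(info.st_mode & 0o777),
    }
    return data, metadata

# Demonstration on a throwaway file (name and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "wb") as f:
        f.write(b"hello, file system")
    data, metadata = describe(path)
    print(data)
    print(metadata)
```

Note that the metadata can be read without opening the file at all, which is how a directory listing shows sizes and dates without reading every file.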
Typical operations include the following:
 Create: A new file is defined and positioned within the
  structure of files.
 Delete: A file is removed from the file structure and destroyed.
 Open: An existing file is declared to be “opened” by a
  process, allowing the process to perform functions on the file.
 Close: The file is closed with respect to a process, so that the
  process no longer may perform functions on the file, until the
  process opens the file again.
 Read: A process reads all or a portion of the data in a file.
 Write: A process updates a file, either by adding new data that
  expands the size of the file or by changing the values of
  existing data items in the file.
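
The operations above map closely onto the file interface of most programming languages. As a rough illustration in Python (the function name `file_lifecycle` and the file name `demo.txt` are assumptions made for this sketch), each call below is labeled with the operation it corresponds to.

```python
import os
import tempfile

def file_lifecycle(path):
    """Exercise create, write, close, open, read, and delete on one file."""
    f = open(path, "w")          # Create: mode "w" defines a new file and opens it
    f.write("first line\n")      # Write: add new data, expanding the file
    f.close()                    # Close: this handle can no longer be used

    f = open(path, "a")          # Open: the existing file is opened again, for appending
    f.write("second line\n")     # Write: add more data at the end
    f.close()

    f = open(path, "r")          # Open: this time for reading
    contents = f.read()          # Read: here, all of the data in the file
    f.close()

    os.remove(path)              # Delete: remove the file from the file structure
    return contents

with tempfile.TemporaryDirectory() as tmp:
    print(file_lifecycle(os.path.join(tmp, "demo.txt")))
```

In everyday Python code a `with open(...)` block is preferred because it closes the file automatically; the explicit `close()` calls are kept here only to make the Close operation visible.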
   File organization refers to the logical structuring of the records
    as determined by the way in which they are accessed.
   The basic operations that a user or application may perform on
    a file are performed at the record level
    ◦ The file is viewed as having some structure that organizes
      the records
   The physical organization of the file on secondary storage
    depends on the blocking strategy.
 Important criteria in choosing a file organization include:
◦ Short access time
◦   Ease of update
◦   Economy of storage
◦   Simple maintenance
◦   Reliability
   For example, if a file is only to be processed in batch mode,
    with all of the records accessed every time, then rapid access
    for retrieval of a single record is of minimal concern.
   A file stored on CD-ROM will never be updated, and so ease
    of update is not an issue.
   A file structure should conform to a format that the
    operating system can understand.
   A file has a certain defined structure according to its type.
   A text file is a sequence of characters organized into lines.
   An object file is a sequence of bytes organized into blocks that
    are understandable by the machine.
   When an operating system defines different file structures, it
    must also contain the code to support those file structures.
   UNIX and MS-DOS support a minimal number of file structures.
 File type refers to the ability of the operating system to
  distinguish different types of files, such as text files, source
  files, and binary files.
 Many operating systems support many types of files.
 Operating systems such as MS-DOS and UNIX have the following
  types of files:
Ordinary files
 These are the files that contain user information.
 These may hold text, databases, or executable programs.
 The user can apply various operations to such files, such as
  adding, modifying, or deleting content, or even removing the
  entire file.
Directory files
 These files contain lists of file names and other information
  related to these files.
Special files
 These files are also known as device files.
 These files represent physical devices such as disks, terminals,
  printers, networks, and tape drives.
 These files are of two types:
1. Character special files − data is handled character by
    character as in case of terminals or printers.
2. Block special files − data is handled in blocks as in the case
    of disks and tapes.
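
On UNIX-like systems the file type is recorded in the file's mode bits, and a program can ask which of the types above a given path is. The sketch below uses Python's standard `os.stat` and `stat` modules; the function name `file_kind` and the labels it returns are assumptions chosen to match the terms used in these notes.

```python
import os
import stat
import tempfile

def file_kind(path):
    """Classify a path using the type bits the file system stores for it."""
    mode = os.stat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory file"
    if stat.S_ISCHR(mode):
        return "character special file"
    if stat.S_ISBLK(mode):
        return "block special file"
    if stat.S_ISREG(mode):
        return "ordinary file"
    return "other"

with tempfile.TemporaryDirectory() as tmp:
    ordinary = os.path.join(tmp, "data.txt")
    with open(ordinary, "w") as f:
        f.write("user information\n")
    print(file_kind(ordinary))
    print(file_kind(tmp))

# On UNIX-like systems /dev/null is typically a character special file.
if os.path.exists("/dev/null"):
    print(file_kind("/dev/null"))
```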
    Four terms are in common use when discussing file structure:
           • Field
           • Record
           • File
           • Database
1.      Field
    A field is the basic element of data.
    An individual field contains a single value, such as an
     employee’s last name or a date.
    It is characterized by its length and data type (e.g., ASCII
     string, decimal).
    Depending on the file design, fields may be fixed length or
     variable length.
    In the latter case, the field often consists of two or three
     subfields: the actual value to be stored, the name of the field,
     and, in some cases, the length of the field.
    In other cases of variable-length fields, the length of the field
     is indicated by special demarcation symbols between fields.
2.     Record
    A record is a collection of related fields that can be treated as
     a unit by some application program.
    For example, an employee record would contain such fields as
     name, social security number, job classification, date of hire,
     and so on.
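
To make fields and records concrete, here is a small Python sketch of a fixed-length employee record built with the standard `struct` module. The field widths, the format string, the helper names `pack_record` and `unpack_record`, and the sample values are all assumptions for illustration; a real file design would choose its own layout.

```python
import struct

# Hypothetical layout: 20-byte name, 11-byte ID number, 4-byte unsigned
# job classification code, 10-byte date of hire. "<" disables padding.
RECORD_FORMAT = "<20s11sI10s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)   # 20 + 11 + 4 + 10 = 45 bytes

def pack_record(name, id_number, job_code, hire_date):
    """Turn four field values into one fixed-length record."""
    return struct.pack(
        RECORD_FORMAT,
        name.encode("ascii"),        # short strings are padded with zero bytes
        id_number.encode("ascii"),
        job_code,
        hire_date.encode("ascii"),
    )

def unpack_record(raw):
    """Split a record back into its fields, dropping the padding."""
    name, id_number, job_code, hire_date = struct.unpack(RECORD_FORMAT, raw)
    return (
        name.rstrip(b"\x00").decode("ascii"),
        id_number.rstrip(b"\x00").decode("ascii"),
        job_code,
        hire_date.rstrip(b"\x00").decode("ascii"),
    )

record = pack_record("Smith", "000-00-0000", 7, "2019-05-09")
print(len(record), unpack_record(record))
```

Because every record has the same length, record number n starts at byte n * RECORD_SIZE, which is what makes direct access to a record simple.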
3.     File
    A file is a collection of similar records.
    The file is treated as a single entity by users and applications
     and may be referenced by name.
    Files have file names and may be created and deleted.
    Access control restrictions usually apply at the file level.
    That is, in a shared system, users and programs are granted or
     denied access to entire files.
    In some more sophisticated systems, such controls are
     enforced at the record or even the field level.
4.     Database
    A database is a collection of related data.
    The essential aspects of a database are that the relationships
     that exist among elements of data are explicit and that the
     database is designed for use by a number of different
     applications.
    A database may contain all of the information related to an
     organization or project, such as a business or a scientific study.
   A file directory is a collection of files.
   The directory contains information about the files, including
    attributes, location and ownership.
   Much of this information, especially that concerned with
    storage, is managed by the operating system.
   The directory is itself a file, accessible by various file
    management routines.
The information contained in a device directory includes:
 Name
 Type
 Address
 Current length
 Maximum length
 Date last accessed
 Date last updated
 Owner id
 Protection information
 Operations performed on a directory are:
  Search for a file
  Create a file
  Delete a file
  List a directory
  Rename/update a file
1.  Search: When a user or application references a file, the
    directory must be searched to find the entry corresponding to
    that file.
2.  Create file: When a new file is created, an entry must be
    added to the directory.
3.   Delete file: When a file is deleted, an entry must be removed
     from the directory.
4.   List directory: All or a portion of the directory may be
     requested. Generally, this request is made by a user and
     results in a listing of all files owned by that user, plus some
     of the attributes of each file (e.g., type, access control
     information, usage information).
5.   Update directory: Because some file attributes are stored in
     the directory, a change in one of these attributes requires a
     change in the corresponding directory entry.
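
The five directory operations can be tried out from a program. The following Python sketch is an illustration only: the function name `directory_demo` and the file names `report.txt` and `final.txt` are made up, and it runs in a temporary directory so nothing permanent is touched. It returns what the directory looked like after each step.

```python
import os
import tempfile

def directory_demo(directory):
    """Run create, list, search, update, and delete; return the observations."""
    log = []
    path = os.path.join(directory, "report.txt")

    with open(path, "w") as f:                   # Create file: an entry is added
        f.write("draft\n")
    log.append(sorted(os.listdir(directory)))    # List directory

    log.append("report.txt" in os.listdir(directory))   # Search for an entry by name

    new_path = os.path.join(directory, "final.txt")
    os.rename(path, new_path)                    # Update directory: the name attribute changes
    log.append(sorted(os.listdir(directory)))

    os.remove(new_path)                          # Delete file: the entry is removed
    log.append(sorted(os.listdir(directory)))
    return log

with tempfile.TemporaryDirectory() as tmp:
    for step in directory_demo(tmp):
        print(step)
```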
   The simple list is not suited to support these operations.
   Consider the needs of a single user. The user may have many
    types of files, including word-processing text files, graphic
    files, spreadsheets, and so on.
   The user may like to have these organized by project, by type,
    or in some other convenient way.
   If the directory is a simple sequential list, it provides no help
    in organizing the files and forces the user to be careful not to
    use the same name for two different types of files.
   The problem is much worse in a shared system. Unique
    naming becomes a serious problem. Furthermore, it is difficult
    to hide portions of the overall directory from users when there
    is no inherent structure in the directory.
   A start in solving these problems would be to go to a two-level
    scheme.
   In this case, there is one directory for each user, and a master
    directory.
   The master directory has an entry for each user directory,
    providing address and access control information.
   Each user directory is a simple list of the files of that user. This
    arrangement means that names must be unique only within the
    collection of files of a single user, and that the file system can
    easily enforce access restrictions on directories.
   However, it still provides users with no help in structuring
    collections of files.
   A more powerful and flexible approach, and one that is almost
    universally adopted, is the hierarchical, or tree-structure, approach.
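
The two-level scheme described above can be modeled in a few lines. This Python sketch is a toy model under stated assumptions: the class name `TwoLevelDirectory`, its method names, the user names, and the block addresses are all hypothetical. It shows that a name only has to be unique within one user's directory.

```python
class TwoLevelDirectory:
    """A master directory with one simple-list directory per user."""

    def __init__(self):
        self.master = {}                  # user name -> that user's directory

    def add_user(self, user):
        self.master[user] = {}            # user directory: file name -> address

    def create(self, user, name, address):
        user_directory = self.master[user]
        if name in user_directory:        # uniqueness is checked per user only
            raise FileExistsError(f"{user} already has a file named {name}")
        user_directory[name] = address

    def lookup(self, user, name):
        return self.master[user][name]

directory = TwoLevelDirectory()
directory.add_user("user_a")
directory.add_user("user_b")
directory.create("user_a", "notes.txt", 120)   # addresses are made-up block numbers
directory.create("user_b", "notes.txt", 348)   # same name, different user: allowed
print(directory.lookup("user_a", "notes.txt"), directory.lookup("user_b", "notes.txt"))
```

Each user directory is still a flat list, so a user with many files gets no help grouping them; that is the gap the tree-structured approach fills.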
 Efficiency: A file can be located more quickly.
 Naming: It becomes convenient for users, as two users can
  use the same name for different files or different names
  for the same file.
 Grouping: Files can be grouped logically by their
  properties, e.g. all Java programs, all games, etc.
Problems (when all files share a single directory)
 Naming problem: Users cannot use the same name for two
  files.
 Grouping problem: Users cannot group files according to
  their needs.
    File access mechanism refers to the manner in which the
     records of a file may be accessed. There are several ways to
     access files:
    Sequential access
    Direct/Random access
    Indexed sequential access
1.     Sequential access
    In sequential access, the records are accessed in some
     sequence, i.e., the information in the file is processed in
     order, one record after the other.
    This access method is the most primitive one.
    Example: Compilers usually access files in this fashion.
2.     Direct/Random access
    Random access file organization provides direct access to the
     records.
    Each record has its own address in the file, with the help of
     which it can be directly accessed for reading or writing.
    The records need not be in any sequence within the file and
     they need not be in adjacent locations on the storage medium.
3.     Indexed sequential access
    This mechanism is built on top of sequential access.
    An index is created for each file, which contains pointers to
     various blocks.
    The index is searched sequentially, and its pointer is used to
     access the file directly.
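
The three access methods can be compared on one small file of fixed-length records. The Python sketch below is a simplified model, not how any particular operating system implements them: the 8-byte record size, the function names, the sample keys, and the choice of one index entry per block of two records are all assumptions, and the records are assumed to be stored in sorted key order.

```python
import io

RECORD_SIZE = 8   # assumed fixed record length in bytes

def make_file(records):
    """Build an in-memory file of fixed-length, space-padded records."""
    return io.BytesIO(b"".join(r.ljust(RECORD_SIZE).encode("ascii") for r in records))

def read_sequential(f):
    """Sequential access: process every record in order from the start."""
    f.seek(0)
    records = []
    while True:
        raw = f.read(RECORD_SIZE)
        if not raw:
            break
        records.append(raw.decode("ascii").rstrip())
    return records

def read_direct(f, n):
    """Direct access: compute record n's address and jump straight to it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").rstrip()

def build_index(f, every):
    """One index entry (first key, record number) per block of `every` records."""
    keys = read_sequential(f)
    return [(keys[i], i) for i in range(0, len(keys), every)]

def find_indexed(f, index, key, every):
    """Indexed sequential: search the index in order, then scan one block."""
    start = None
    for first_key, position in index:
        if first_key <= key:
            start = position
        else:
            break
    if start is None:
        return None
    for n in range(start, start + every):
        if read_direct(f, n) == key:
            return n
    return None

f = make_file(["adams", "baker", "clark", "davis", "evans", "frank"])
index = build_index(f, every=2)
print(read_direct(f, 4), find_indexed(f, index, "davis", every=2))
```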
    Files are allocated disk space by the operating system.
     Operating systems use the following three main ways to
     allocate disk space to files:
    Contiguous Allocation
    Linked Allocation
    Indexed Allocation
1.     Contiguous Allocation
    Each file occupies a contiguous address space on disk.
    Assigned disk address is in linear order.
    Easy to implement.
    External fragmentation is a major issue with this type of
     allocation technique.
2.   Linked Allocation
 Each file carries a list of links to disk blocks.
 The directory contains a link (pointer) to the first block of a file.
 No external fragmentation.
 Works well for sequential-access files.
 Inefficient for direct-access files.
3.   Indexed Allocation
 Provides solutions to the problems of contiguous and linked
   allocation.
 An index block is created that holds all the pointers to a
   file's blocks.
 Each file has its own index block, which stores the addresses
   of the disk blocks occupied by the file.
 The directory contains the addresses of the index blocks of files.
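
The three allocation methods differ in what has to be stored to find a file's blocks. The Python sketch below models a tiny disk as a dictionary; every block number, the file contents, and the function names are made-up assumptions, and real file systems store these structures on disk rather than in program variables.

```python
# A toy disk: block number -> contents. Files A, B, and C have three blocks each.
disk = {0: "A0", 1: "A1", 2: "A2",      # file A: contiguous blocks 0..2
        7: "B0", 4: "B1", 9: "B2",      # file B: linked, scattered blocks
        5: "C0", 3: "C1", 8: "C2"}      # file C: indexed, scattered blocks

def read_contiguous(start, length):
    """Contiguous: the directory stores a start block and a length."""
    return [disk[b] for b in range(start, start + length)]

# Linked: each block records the number of the next block (None marks the end).
next_block = {7: 4, 4: 9, 9: None}

def read_linked(first):
    """Linked: the directory stores only the first block; follow the links."""
    data, block = [], first
    while block is not None:
        data.append(disk[block])
        block = next_block[block]
    return data

def nth_block_linked(first, n):
    """Direct access under linked allocation needs n pointer hops."""
    block = first
    for _ in range(n):
        block = next_block[block]
    return disk[block]

# Indexed: the file's index block lists all of its block numbers in order.
index_block_c = [5, 3, 8]

def read_indexed(index_block):
    """Indexed: the directory stores the address of the index block."""
    return [disk[b] for b in index_block]

def nth_block_indexed(index_block, n):
    """Direct access under indexed allocation is a single lookup."""
    return disk[index_block[n]]

print(read_contiguous(0, 3), read_linked(7), read_indexed(index_block_c))
```

Comparing `nth_block_linked` with `nth_block_indexed` shows why linked allocation suits sequential access but is inefficient for direct access.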
◦ Name
     symbolic file-name, only information in human-
      readable form
◦ Identifier
     Unique tag that identifies file within file system; non-
      human readable name
◦ Type -
     for systems that support multiple types
◦ Location -
     pointer to a device and to file location on device
◦ Size -
     current file size, maximal possible size
◦ Protection -
     controls who can read, write, execute
◦ Time, Date and user identification
     data for protection, security and usage monitoring
◦ Information about files is kept in the directory structure,
  which is maintained on disk
   Users need to be able to refer to a file by a symbolic name.
   Clearly, each file in the system must have a unique name in
    order that file references be unambiguous.
   On the other hand, it is an unacceptable burden on users to
    require that they provide unique names, especially in a shared
    system.
   The use of a tree-structured directory minimizes the difficulty
    in assigning unique names.
   Any file in the system can be located by following a path from
    the root or master directory down various branches until the
    file is reached.
   The series of directory names, terminating in the file name
    itself, constitutes a pathname for the file.
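
Following a pathname through a tree-structured directory can be sketched as a walk down nested dictionaries. In this Python illustration the directory tree, its names, and the function name `resolve` are hypothetical; a dictionary stands for a directory and a string stands for a file's contents.

```python
# A made-up directory tree: dictionaries are directories, strings are files.
root = {
    "User_A": {"notes.txt": "A's notes"},
    "User_B": {
        "notes.txt": "B's notes",
        "Word": {"Unit_A": {"ABC": "contents of ABC"}},
    },
}

def resolve(tree, pathname):
    """Follow each directory name in turn, starting from the root."""
    node = tree
    for name in pathname.strip("/").split("/"):
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(pathname)
        node = node[name]                 # move one level down the tree
    return node

print(resolve(root, "/User_B/Word/Unit_A/ABC"))
print(resolve(root, "/User_A/notes.txt"))
print(resolve(root, "/User_B/notes.txt"))
```

Both users have a file called `notes.txt`, yet the two pathnames are different, so every reference stays unambiguous.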
   Although the pathname facilitates the selection of file names,
    it would be awkward for a user to have to spell out the entire
    pathname every time a reference is made to a file.
   Typically, an interactive user or a process has associated with
    it a current directory, often referred to as the working
    directory.
   Files are then referenced relative to the working directory.
   For example, if the working directory for user B is “Word,”
    then the pathname Unit_A/ABC is sufficient to identify the
    file in the lower left-hand corner of the figure above.
   When an interactive user logs on, or when a process is created,
    the default for the working directory is the user's home
    directory.
   During execution, the user can navigate up or down in the tree
    to change to a different working directory.
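
How a relative reference is combined with the working directory can be shown with Python's standard `posixpath` module, which handles UNIX-style pathnames on any platform. The function name `to_absolute` and the directory `/User_B/Word` are assumptions for this sketch (the text only says that user B's working directory is "Word"), as is the sibling directory name `Draw`.

```python
import posixpath

def to_absolute(working_directory, pathname):
    """Interpret pathname relative to the working directory unless it starts at the root."""
    if pathname.startswith("/"):
        return posixpath.normpath(pathname)        # already a full pathname
    return posixpath.normpath(posixpath.join(working_directory, pathname))

working_directory = "/User_B/Word"                 # assumed location of "Word"

print(to_absolute(working_directory, "Unit_A/ABC"))   # go down the tree
print(to_absolute(working_directory, "../Draw"))      # go up one level, then down
print(to_absolute(working_directory, "/User_A/ABC"))  # full pathname: unchanged
```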
 A file extension, or file name extension, is the ending of a file
  name that helps identify the type of file in operating systems
  such as Microsoft Windows.
 In Microsoft Windows, the file name extension follows a period
  and is often three characters long, but may also be one, two,
  or four characters long.
Examples
Picture files
 .bmp
 .gif
 .jpg
Music/Sound files
 .mp3
 .wav
Operating system files
 .dll
 .exe
Text and Word processing documents
 .doc
 .docx
 .rtf
 .txt
Web Page files
 .htm
 .html
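
A program can use the extension to guess a file's type. The Python sketch below uses the standard `os.path.splitext` function with the extensions listed in the examples; the table name `CATEGORIES`, the function name `file_category`, and the sample file names are assumptions. The extension is only a naming convention, so this is a guess about the contents, not a guarantee.

```python
import os

# Extension -> category, taken from the examples in these notes.
CATEGORIES = {
    ".bmp": "picture", ".gif": "picture", ".jpg": "picture",
    ".mp3": "music/sound", ".wav": "music/sound",
    ".dll": "operating system", ".exe": "operating system",
    ".doc": "text/word processing", ".docx": "text/word processing",
    ".rtf": "text/word processing", ".txt": "text/word processing",
    ".htm": "web page", ".html": "web page",
}

def file_category(filename):
    """Look up the category for a file name based on its extension."""
    _, extension = os.path.splitext(filename)      # e.g. "a.txt" -> ("a", ".txt")
    return CATEGORIES.get(extension.lower(), "unknown")

for name in ["photo.JPG", "index.html", "song.mp3", "README"]:
    print(name, "->", file_category(name))
```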
1.   Briefly discuss file system techniques (partitioning,
     mounting and unmounting, virtual file systems)
2.   Briefly discuss memory-mapped files
3.   Briefly discuss special-purpose file systems
4.   Briefly discuss backup strategies