File Organization and Processing
Lecture 3
          File System
       Mohamed Mead
                      Introduction
 A file is an organized collection of data.
The organization of the file depends on the use of the data
  and is determined by the program, operating system or
  user who created the file.
All data in the computer is stored and retrieved as
  files. Thus, files may take many different forms, for example:
A data file consisting of alphanumeric Unicode
 text that represents a program in source code
 form and serves as “data” input to a C++ compiler.
A data file configured in some special way to
 represent an image, sound, or other object.
 The file system permits users to create data collections,
 called files, with desirable properties, such as:
   Long-term existence: Files are stored on disk or other secondary
    storage and do not disappear when a user logs off.
   Sharable between processes: Files have names and can have associated
    access permissions that permit controlled sharing.
   Structure: Depending on the file system, a file can have an internal
    structure that is convenient for particular applications. In addition, files
    can be organized into hierarchical or more complex structure to reflect
    the relationships among files.
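
The three properties above can be seen in a short Python sketch. It is only an illustration: the directory names, the file name, and the permission value 0o640 are assumptions made for the example, and permission bits behave as described only on POSIX-style systems.

```python
import os
import stat
import tempfile
from pathlib import Path

def demo_file_properties(root: Path) -> dict:
    """Create a small hierarchy under `root` and report what was stored."""
    # Structure: directories organize files into a hierarchy.
    report_dir = root / "projects" / "reports"
    report_dir.mkdir(parents=True)
    report = report_dir / "summary.txt"  # hypothetical file name

    # Long-term existence: the data outlives the handle that wrote it.
    with open(report, "w", encoding="utf-8") as f:
        f.write("quarterly totals\n")
    with open(report, "r", encoding="utf-8") as f:
        content = f.read()

    # Sharable: the file has a name, and permission bits control access.
    os.chmod(report, 0o640)  # owner read/write, group read, others none
    mode = stat.S_IMODE(os.stat(report).st_mode)

    return {
        "relative_path": report.relative_to(root).as_posix(),
        "content": content,
        "mode": oct(mode),
    }

# A temporary directory keeps the example from touching real data.
with tempfile.TemporaryDirectory() as tmp:
    info = demo_file_properties(Path(tmp))
    print(info)
```

A temporary directory is used here only so the example cleans up after itself; a file written to an ordinary directory would remain after the program exits.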
                 File Structure
Four terms are in common use when discussing files:
   Field
   Record
   File
   Database
                         Field
 A field is the basic element of data.
An individual field contains a single value, such as an
 employee’s last name, a date, or the value of a
 sensor reading.
It is characterized by its length and data type (e.g.,
 ASCII string, decimal). Depending on the file
 design, fields may be fixed length or variable
 length.
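
The difference between fixed-length and variable-length fields can be sketched in Python. This is one possible encoding, not a standard one: the 12-byte width, the space padding, and the 2-byte length prefix are all assumptions chosen for the example.

```python
import struct

def encode_fixed(value: str, width: int = 12) -> bytes:
    """Fixed-length field: always `width` bytes, padded with spaces."""
    raw = value.encode("ascii")
    if len(raw) > width:
        raise ValueError("value does not fit in the fixed-length field")
    return raw.ljust(width, b" ")

def decode_fixed(data: bytes) -> str:
    # Trailing padding is stripped; a value that really ends in spaces
    # would lose them, which is a known cost of this simple scheme.
    return data.decode("ascii").rstrip(" ")

def encode_variable(value: str) -> bytes:
    """Variable-length field: a 2-byte length prefix, then the bytes."""
    raw = value.encode("ascii")
    return struct.pack("<H", len(raw)) + raw

def decode_variable(data: bytes) -> str:
    (length,) = struct.unpack_from("<H", data, 0)
    return data[2:2 + length].decode("ascii")

fixed = encode_fixed("Mead")
variable = encode_variable("Mead")
print(len(fixed), len(variable))
```

The fixed-length form wastes space on short values but makes every field the same size; the variable-length form is compact but must be read through its length prefix.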
                            Record
 A record is a collection of related fields that can be treated
  as a unit by some application program.
 For example, an employee record would contain such fields as
  name, social security number, job classification, date of hire,
  and so on.
 Again, depending on design, records may be of fixed length or
  variable length.
 A record will be of variable length if some of its fields are of
  variable length or if the number of fields may vary.
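
As a rough illustration, the employee record above could be laid out in either style. The field widths, the numeric job-classification code, and the helper names below are assumptions made for this sketch only.

```python
import struct

# Hypothetical fixed-length layout: name (20 bytes), social security
# number (11 bytes), job classification (unsigned short), hire date (10 bytes).
EMPLOYEE = struct.Struct("<20s11sH10s")

def pack_employee(name, ssn, job_class, hired):
    """Every packed record has the same size, EMPLOYEE.size bytes."""
    return EMPLOYEE.pack(name.encode("ascii"), ssn.encode("ascii"),
                         job_class, hired.encode("ascii"))

def unpack_employee(data):
    name, ssn, job_class, hired = EMPLOYEE.unpack(data)
    text = lambda b: b.rstrip(b"\x00").decode("ascii")  # drop null padding
    return text(name), text(ssn), job_class, text(hired)

def pack_variable(fields):
    """Variable-length record: a field count, then length-prefixed fields."""
    parts = [struct.pack("<B", len(fields))]
    for value in fields:
        raw = value.encode("utf-8")
        parts.append(struct.pack("<H", len(raw)) + raw)
    return b"".join(parts)

def unpack_variable(data):
    (count,) = struct.unpack_from("<B", data, 0)
    offset, fields = 1, []
    for _ in range(count):
        (length,) = struct.unpack_from("<H", data, offset)
        offset += 2
        fields.append(data[offset:offset + length].decode("utf-8"))
        offset += length
    return fields

record = pack_employee("Mead", "123-45-6789", 3, "2020-01-15")
print(len(record), unpack_employee(record))
print(unpack_variable(pack_variable(["Mead", "analyst"])))
```

The variable-length form covers both cases named above: individual fields can differ in length, and the count prefix lets the number of fields vary from record to record.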
                                       File
 A file is a collection of similar records.
 The file is treated as a single entity by users and applications.
 Files have file names and may be created and deleted.
 Access control restrictions usually apply at the file level. That is, in a
  shared system, users and programs are granted or denied access to entire
  files.
 In some more sophisticated systems, such controls are enforced at the
  record or even the field level.
 Some file systems are structured only in terms of fields, not records. In
  that case, a file is a collection of fields.
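
A file of similar records can be sketched as a sequence of equally sized records written one after another. The record layout (an integer id followed by a 16-byte name) and the function names are assumptions for this example.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<I16s")  # hypothetical layout: id, name

def write_records(path, rows):
    """Create the file and store one fixed-length record per row."""
    with open(path, "wb") as f:
        for emp_id, name in rows:
            f.write(RECORD.pack(emp_id, name.encode("ascii")))

def record_count(path):
    # With fixed-length records, the count follows from the file size.
    return os.path.getsize(path) // RECORD.size

def read_record(path, index):
    """Read the record at position `index` by seeking to its offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        emp_id, name = RECORD.unpack(f.read(RECORD.size))
    return emp_id, name.rstrip(b"\x00").decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "employees.dat")  # files have names
    write_records(path, [(1, "Mona"), (2, "Omar"), (3, "Sara")])
    print(record_count(path), read_record(path, 1))
    os.remove(path)                            # and can be deleted
    print(os.path.exists(path))
```

Because every record has the same size, the position of record `index` can be computed directly, which is why fixed-length records are convenient for direct access.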
                         Database
 A database is a collection of related data.
 A database may contain all of the information related to an
  organization or project, such as a business or a scientific
  study.
 The database itself consists of one or more types of files.
 Usually, there is a separate database management system
  that is independent of the operating system, although that
  system may make use of some file management programs.
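
As a small illustration, SQLite (available through Python's standard `sqlite3` module) is a database management system that stores a whole database in an ordinary file supplied by the operating system. In this sketch each table plays roughly the role of one type of file; the table names, column names, and sample rows are assumptions made for the example.

```python
import os
import sqlite3
import tempfile

def build_database(path):
    """Create two related tables and return employees with their department."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute(
            "CREATE TABLE employee ("
            "id INTEGER PRIMARY KEY, name TEXT, "
            "dept_id INTEGER REFERENCES department(id))")
        conn.execute("INSERT INTO department VALUES (1, 'Research')")
        conn.executemany(
            "INSERT INTO employee VALUES (?, ?, ?)",
            [(1, "Mona", 1), (2, "Omar", 1)])
        conn.commit()
        # The DBMS, not the operating system, relates the two tables.
        rows = conn.execute(
            "SELECT e.name, d.name FROM employee e "
            "JOIN department d ON d.id = e.dept_id ORDER BY e.id").fetchall()
    finally:
        conn.close()
    return rows

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "company.db")
    print(build_database(db_path))
    print(os.path.exists(db_path))  # the database lives in a regular file
```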
                  File Structure
Typical operations that must be supported include the
 following:
Retrieve_All : Retrieve all the records of a file. This
 will be required for an application that must process all
 of the information in the file at one time.
For example, an application that produces a summary of
 the information in the file would need to retrieve all
 records.
Retrieve_One : This requires the retrieval of just a
 single record.
  Transaction-oriented applications need this operation.
Retrieve_Next : This requires the retrieval of the
 record that is “next” in some logical sequence to the
 most recently retrieved record.
  A program that is performing a search may also use
   this operation.
 Retrieve_Previous : Similar to Retrieve_Next, but
 in this case the record that is “previous” to the
 currently accessed record is retrieved.
 Insert_One : Insert a new record into the file. It
 may be necessary that the new record fit into a
 particular position to preserve a sequencing of the
 file.
 Delete_One : Delete an existing record. Linkages or other data structures
  may need to be updated to preserve the sequencing of the file.
 Update_One : Retrieve a record, update one or more of its fields,
  and rewrite the updated record back into the file. Again, it may be
  necessary to preserve sequencing with this operation.
   If the length of the record has changed, the update operation is
    generally more difficult than if the length is preserved.
 Retrieve_Few : Retrieve a number of records. For example, an
  application or user may wish to retrieve all records that satisfy a
  certain set of criteria.
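
The operations listed above can be sketched with a small in-memory stand-in for a file whose records are kept in key order. The class name, the method names, and the choice of a sorted key list are assumptions for illustration; a real file organization would keep this sequencing on disk.

```python
import bisect

class SequencedFile:
    """In-memory stand-in for a file whose records are kept in key order."""

    def __init__(self):
        self._keys = []       # record keys, always sorted
        self._records = {}    # key -> record (a dict of fields)
        self._current = None  # key of the most recently retrieved record

    def retrieve_all(self):
        return [self._records[k] for k in self._keys]

    def retrieve_one(self, key):
        record = self._records[key]
        self._current = key
        return record

    def retrieve_next(self):
        if self._current is None:
            raise LookupError("no record has been retrieved yet")
        i = bisect.bisect_right(self._keys, self._current)
        return self.retrieve_one(self._keys[i]) if i < len(self._keys) else None

    def retrieve_previous(self):
        if self._current is None:
            raise LookupError("no record has been retrieved yet")
        i = bisect.bisect_left(self._keys, self._current) - 1
        return self.retrieve_one(self._keys[i]) if i >= 0 else None

    def insert_one(self, key, record):
        if key in self._records:
            raise KeyError(f"duplicate key: {key!r}")
        bisect.insort(self._keys, key)  # keeps the sequencing intact
        self._records[key] = dict(record)

    def delete_one(self, key):
        del self._records[key]          # raises KeyError if absent
        self._keys.pop(bisect.bisect_left(self._keys, key))

    def update_one(self, key, **changes):
        record = self.retrieve_one(key)
        record.update(changes)          # rewrite the changed fields
        return record

    def retrieve_few(self, predicate):
        return [r for r in self.retrieve_all() if predicate(r)]

f = SequencedFile()
f.insert_one(30, {"name": "Carol", "dept": "ops"})
f.insert_one(10, {"name": "Alice", "dept": "dev"})
f.insert_one(20, {"name": "Bob", "dept": "dev"})
print([r["name"] for r in f.retrieve_all()])
f.retrieve_one(20)
print(f.retrieve_next()["name"], f.retrieve_previous()["name"])
f.update_one(20, dept="ops")
print([r["name"] for r in f.retrieve_few(lambda r: r["dept"] == "ops")])
```

Records are inserted out of order on purpose: the sorted key list is the data structure that Insert_One and Delete_One must update to preserve the sequencing of the file.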
               File Structure
                    1. Linux
Linux supports various file systems, but
 common choices for the system disk on a block
 device include ext4, XFS, JFS, and Btrfs.
For raw flash there are UBIFS, JFFS2, and
 YAFFS, among others.
                  2. macOS
macOS uses the Apple File System (APFS),
 which in 2017 replaced HFS Plus (HFS+), a file
 system inherited from classic Mac OS.
            3. Microsoft Windows
Windows makes use of the FAT, NTFS, exFAT,
 and ReFS file systems (the last of these is
 supported only in Windows Server and in
 Windows 8, 8.1, and 10).