DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND
DATA SCIENCE
         AL3451 MACHINE LEARNING
             QUESTION BANK
  FOR IV SEMESTER B.TECH. DEGREE COURSE
            [REGULATION-2021]
SUBJECT CODE       : AL3451
SUBJECT NAME       : MACHINE LEARNING
REGULATION         : 2021
ACADEMIC YEAR      : 2023 – 2024
YEAR/SEMESTER      : II/IV
BATCH              : 2022 - 2026
Prepared by:     Verified by:                     Approved by:
Name:            Name:                            Name:
Date:            Date:                            Date:
                R2021 / AIDS / AL3451 –MACHINE LEARNING/ II YEAR / IV SEM / QB / 2
                         COLLEGE VISION
     Emerge as a premier institute producing globally competent
engineers.
                      COLLEGE MISSION
IM1: Achieve academic diligence through effective and innovative
teaching – learning processes using ICT tools
IM2: Make students employable through rigorous career guidance
and training programs.
IM3: Strengthen Industry Institute Interaction through MOUs and
collaborations.
IM4: Promote Research & Development by inculcating creative
thinking through innovative projects incubation.
      VISION AND MISSION OF THE DEPARTMENT
VISION
To foster industry cooperation and impart cognitive learning in order
to develop professionals who can adapt to the shifting demands of new
trends in artificial intelligence and data science.
MISSION
DM1: To provide an excellent infrastructure that keeps up with
modern trends and technologies for students and educators.
DM2: To impart knowledge in cutting edge technology for Artificial
Intelligence and Data Science with industrial standards.
DM3: To impart high-quality education embedded with moral and
ethical principles.
DM 4: To encourage lifelong learning and research that
benefit society as a whole.
                      PROGRAM OUTCOMES (POs)
Engineering Graduates will be able to:
PO 1    Engineering knowledge: Apply the knowledge of mathematics, science, engineering
        fundamentals, and an engineering specialization to the solution of complex engineering
        problems.
PO 2    Problem analysis: Identify, formulate, review research literature, and analyze complex
        engineering problems reaching substantiated conclusions using first principles of
        mathematics, natural sciences, and engineering sciences.
PO 3    Design/development of solutions: Design solutions for complex engineering problems
        and design system components or processes that meet the specified needs with
        appropriate consideration for the public health and safety, and the cultural, societal, and
        environmental considerations
PO 4    Conduct investigations of complex problems: Use research-based knowledge and
        research methods including design of experiments, analysis and interpretation of data,
        and synthesis of the information to provide valid conclusions.
PO 5    Modern tool usage: Create, select, and apply appropriate techniques, resources, and
        modern engineering and IT tools including prediction and modeling to complex
        engineering activities with an understanding of the limitations.
PO 6    The Engineer and Society: Apply reasoning informed by the contextual knowledge to
        assess societal, health, safety, legal and cultural issues and the consequent
        responsibilities relevant to the professional engineering practice.
PO 7    Environment and Sustainability: Understand the impact of the professional
        engineering solutions in societal and environmental contexts, and demonstrate the
        knowledge of, and need for sustainable development.
PO 8    Ethics: Apply ethical principles and commit to professional ethics and responsibilities
        and norms of the engineering practice.
PO 9    Individual and Team work: Function effectively as an individual, and as a member or
        leader in diverse teams, and in multidisciplinary settings.
PO 10   Communication: Communicate effectively on complex engineering activities with the
        engineering community and with society at large, such as, being able to comprehend and
        write effective reports and design documentation, make effective presentations, and give
        and receive clear instructions.
PO 11   Project management and Finance: Demonstrate knowledge and understanding of the
        engineering and management principles and apply these to one’s own work, as a
        member and leader in a team, to manage projects and in multidisciplinary environments.
PO 12   Life-long learning: Recognize the need for, and have the preparation and ability to
        engage in independent and life-long learning in the broadest context of technological
        change.
   PROGRAMME EDUCATIONAL OBJECTIVES (PEOs)
PEO 1: Prospective Career: To exhibit knowledge and skills in the various domain areas of
Computer Science and Engineering with an awareness in different disciplines for a prospective
career and to undertake novel research as per the needs of the society, government and industries.
PEO 2: Higher Education: To be strong in fundamentals of Computer Science and Engineering
for successfully pursuing higher education and research in reputed institutions.
PEO 3: Product Development: To apply their knowledge and innovative ideas to design and
develop products in interdisciplinary areas for real time problems and to emerge as entrepreneurs.
             PROGRAM SPECIFIC OUTCOMES (PSOs)
PSO 1: Able to analyze and develop solutions to complex engineering problems in IT such as
Operating Systems, Networks, Data Structure, Database Management Systems, Cloud computing
and Mobile computing, to meet the society and industrial needs.
PSO 2: Able to solve the industry problems with knowledge in recent technology and tools such as
Machine Learning, Internet of Things, Oracle, Java Programming etc. with good team skills and
ethical values.
                                        SYLLABUS
AL3451                               MACHINE LEARNING                              L T P C
                                                                                   3 0 0 3
UNIT I         INTRODUCTION TO MACHINE LEARNING                                              8
Review of Linear Algebra for machine learning; Introduction and motivation for machine learning;
Examples of machine learning applications, Vapnik-Chervonenkis (VC) dimension, Probably
Approximately Correct (PAC) learning, Hypothesis spaces, Inductive bias, Generalization, Bias
variance trade-off.
UNIT II        SUPERVISED LEARNING                                                           11
Linear Regression Models: Least squares, single & multiple variables, Bayesian linear
regression,gradient descent, Linear Classification Models: Discriminant function – Perceptron
algorithm, Probabilistic discriminative model - Logistic regression, Probabilistic generative model –
Naïve Bayes, Maximum margin classifier – Support vector machine, Decision Tree, Random
Forests.
UNIT III       ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING                                     9
Combining multiple learners: Model combination schemes, Voting, Ensemble Learning - bagging,
boosting, stacking, Unsupervised learning: K-means, Instance Based Learning: KNN, Gaussian
mixture models and Expectation maximization.
UNIT IV        NEURAL NETWORKS                                                                   9
Multilayer perceptron, activation functions, network training – gradient descent optimization –
stochastic gradient descent, error backpropagation, from shallow networks to deep networks –Unit
saturation (aka the vanishing gradient problem) – ReLU, hyperparameter tuning, batch
normalization, regularization, dropout.
UNIT V         DESIGN AND ANALYSIS OF MACHINE LEARNING EXPERIMENTS 8
Guidelines for machine learning experiments, Cross Validation (CV) and resampling – K-fold CV,
bootstrapping, measuring classifier performance, assessing a single classification algorithm and
comparing two classification algorithms – t test, McNemar’s test, K-fold CV paired t test
                                                                            TOTAL: 45 PERIODS
                                        COURSE OBJECTIVE
       1.  To understand the basic concepts of machine learning.
       2.  To understand and build supervised learning models.
       3.  To understand and build unsupervised learning models.
       4.  To evaluate the algorithms based on the corresponding metrics identified.
                                         COURSE OUTCOME
      At the end of this course, the students will be able to:
                 CO1 : Explain the basic concepts of machine learning.
                 CO2 : Construct supervised learning models.
                 CO3 : Construct unsupervised learning algorithms.
                 CO4 : Evaluate and compare different models.
                                        CO-PO&PSO MAPPING
SNO        PO1      PO2    PO3    PO4    PO5    PO6    PO7    PO8       PO9   PO10   PO11   PO12   PSO1   PSO2
CO1         2       1        2    1       -      -       -        -     3      3      2      2      2          2
CO2         2       3        3    1       2      -       -        -     2      2      2      1      3          1
CO3         2       1        3    3       2      -       -        -     1      1      1      1      1          2
CO4         2       3        3    2       1      -       -        -     3      2      3      2      1          2
Avg         2       2        2    2       1      -       -        -     2      2      2      1      1          2
                     UNIT I INTRODUCTION TO MACHINE LEARNING
Review of Linear Algebra for machine learning; Introduction and motivation for machine learning;
Examples of machine learning applications, Vapnik-Chervonenkis (VC) dimension, Probably
Approximately Correct (PAC) learning, Hypothesis spaces, Inductive bias, Generalization, Bias
variance trade-off.
                                              PART - A
   S.No.                              Question and Answer                        CO     Univ QP
                                                                                      (Month/Year)
      1.       What are the objectives of an operating system?                CO1 April/May
               An operating system is a program that manages the computer           2010
               hardware. It acts as an intermediary between the users of a        May/June
               computer and the computer hardware. It controls and                  2012
               coordinates the use of the hardware among the various            April/May
               application programs for the various users.                          2017
      2.       What is an interrupt?                                          CO1           Apr/May
                         Interrupts are provided primarily as a way to improve             2019
                            processor utilization.
                         An interrupt is a mechanism by which other
                            modules (I/O, memory) may interrupt the
                            normal sequencing of the processor.
                         Classes of interrupts:
                                Program
                                Timer
                                I/O
                                Hardware failure
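The interrupt classes listed above can be sketched as a dispatch table, much as a kernel routes each interrupt source through an interrupt vector to its handler. The handler names and messages here are purely illustrative:

```python
# Illustrative sketch: an interrupt "vector" mapping each interrupt
# class from the answer above to a handler routine.
def handle_program(info):     # e.g. divide-by-zero, illegal instruction
    return f"program fault: {info}"

def handle_timer(info):       # periodic clock interrupt
    return f"timer tick: {info}"

def handle_io(info):          # a device signals I/O completion
    return f"I/O complete: {info}"

def handle_hw_failure(info):  # e.g. power or memory parity failure
    return f"hardware failure: {info}"

INTERRUPT_VECTOR = {
    "program": handle_program,
    "timer": handle_timer,
    "io": handle_io,
    "hardware": handle_hw_failure,
}

def dispatch(kind, info):
    """Route an interrupt to its handler via the vector table."""
    return INTERRUPT_VECTOR[kind](info)
```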
      3.       What are the advantages of peer-to-peer systems over client- CO1                 May/June
               server systems?                                                                   2016
               The main advantage of a peer-to-peer network is that it is easier
               to set up. In peer-to-peer networks all nodes act as servers as
               well as clients, so no dedicated server is needed. A peer-to-peer
               network is also less expensive, and because it is easier to set up
               and use, less time is spent on its configuration and
               implementation.
4.   What is the purpose of system programs/system calls?                  CO1     May/June
     System programs can be thought of as bundles of useful system                  2016
     calls. They provide basic functionality to users so that users do not         Apr/May
     need to write their own programs to solve common problems.                     2018
5.   How does an interrupt differ from a trap?                               CO1   Nov/Dec
     An interrupt is a hardware-generated signal that changes the flow              2016
     within the system. A trap is a software-generated interrupt. An               Apr/May
     interrupt can be used to signal the completion of I/O so that the CPU          2018
      doesn't have to spend cycles polling the device. A trap can be used
      to catch arithmetic errors or to call system routines.
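The trap-for-arithmetic-errors idea above can be illustrated in user code: in Python, a divide-by-zero fault surfaces as an exception that the runtime lets the program catch, much as an OS trap handler regains control. A minimal analog, not an actual OS mechanism:

```python
# Illustrative analog of a trap: the divide-by-zero fault is caught
# and handled instead of crashing the program.
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # "trap handler": recover and return a sentinel instead
        return None
```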
6.   What are disadvantages of multi-processor systems?                      CO1   Nov/Dec
     Complex Operating System is required                                           2016
     Large main memory required
     Very expensive
7.    Does timesharing differ from multiprogramming? If so,               CO1 April/May
      how?                                                                      2015
      The main difference between multiprogramming and time sharing is
      that multiprogramming is the effective utilization of CPU time,
      achieved by allowing several programs to use the CPU at the same
      time, whereas time sharing is the sharing of a computing facility
      by several users who want to use the same facility at the same time.
8.   List the Services of operating system function.                         CO1   Apr/May
                 Program development                                               2019
                 Program execution
                 User Interface
                 I/O Operations
                 File system Manipulation
                 Communication
                 Error Detection
                 Resource allocation
                 Accounting
                 Security
9.    Why do APIs need to be used rather than system calls?                CO1 April/May
     There are four basic reasons:                                                 2015
     1) System calls differ from platform to platform. By using a stable
     API, it is easier to migrate your software to different platforms.
     2) The operating system may provide newer versions of a system
     call with enhanced features. The API implementation will typically
     also be upgraded to provide this support, so if you call the API,
     you'll get it.
       3) The API usually provides more useful functionality than the
       system call directly. If you make the system call directly, you'll
       typically have to replicate the pre-call and post-call code that's
       already implemented by the API. (For example, the 'fork' API
       includes tons of code beyond just making the 'fork' system call. So
       does 'select'.)
      4) The API can support multiple versions of the operating system
      and detect which version it needs to use at run time. If you call the
      system directly, you either need to replicate this code or you can
      only support limited versions.
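Reason (1) above, portability, is visible in everyday code: a high-level file API hides whether the underlying system call is POSIX open(2) or a Win32 call, so the same program runs unchanged on either platform. A small sketch (the file name is arbitrary):

```python
# Sketch of reason (1): the portable open() API hides the
# platform-specific system call underneath.
import os
import tempfile

def write_and_read(text):
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    with open(path, "w") as f:   # API call; the actual syscall differs per OS
        f.write(text)
    with open(path) as f:
        return f.read()
```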
10.    Compare and contrast DMA and cache memory.                             CO1    Nov/Dec
                                                                                      2015
      DMA(Direct Memory Access): Direct memory access (DMA) is a
      feature of computer systems that allows certain hardware
      subsystems to access main memory (Random-access memory),
      independent of the central processing unit (CPU).
       Cache Memory: A cache is a smaller, faster memory, closer to a
       processor core, which stores copies of the data from frequently
       used main memory locations.
      So, both DMA and cache are used for increasing the speed of
      memory access.
11.    Can traps be generated intentionally by a user program? If so, CO1           Nov/Dec
       for what purpose?                                                             2018
       Yes. A trap is a software-generated interrupt. An interrupt can be
       used to signal the completion of an I/O operation, obviating the
       need for device polling. A trap can be used to call operating
       system routines or to catch arithmetic errors.
12.   List and briefly define the four main elements of a computer?       CO1
      • Processor – Controls the operation of the computer & performs its
      data processing functions
      • Main memory – Stores data & programs. It is volatile.
      • I/O modules – Move data between the computer & its external
      environment such as disks, communication equipment & terminals.
      • System Bus – Provides for communication among processors,
      main memory & I/O modules.
13.   What is Multiprogramming?                                           CO1       Nov/Dec
      Multiprogramming increases CPU utilization by organizing                       2011
      jobs so that the CPU always has one to execute.
             Advantages:
                  It increases CPU utilization and makes efficient
                     use of the CPU by overlapping the demands of the
                     CPU and I/O devices.
                  Increased throughput and lower response time.
14.   What characteristics distinguish the various elements of a           CO1
      memory hierarchy?
           Characteristics are
            Cost Per bit
            Capacity
            Access Time
            Frequency of access to the memory by the processor.
15.   Does timesharing differ from multiprogramming? If so, how?          CO1     Apr/May
           Time sharing: the OS assigns time slots to each job, and                2015
           each job is executed according to its allotted time slots.
                       Job1: 0 to 5      Job2: 5 to 10      Job3: 10 to 15
           Multitasking: jobs are executed in parallel by the operating
           system, but this can be achieved only with multiple
           processors (or a multicore CPU).
                        CPU1: Job1        CPU2: Job2        CPU3: Job3
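The time-slot example above can be simulated with a toy round-robin scheduler (a sketch for illustration, not how a real scheduler is implemented):

```python
# Toy round-robin simulation of the time-slot example:
# three jobs of 5 units each with a quantum of 5 give
# Job1: 0-5, Job2: 5-10, Job3: 10-15.
from collections import deque

def round_robin(jobs, quantum):
    """jobs: dict of name -> remaining CPU time.
    Returns a list of (name, start, end) time slices."""
    queue = deque(jobs.items())
    clock, schedule = 0, []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        schedule.append((name, clock, clock + run))
        clock += run
        if remaining > run:                      # unfinished: requeue
            queue.append((name, remaining - run))
    return schedule
```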
16.   Why do APIs need to be used rather than system calls?               CO1     Apr/May
                 System calls are much slower than APIs (library                  2018
                    calls), since each system call requires a context             Apr/May
                    switch to load the OS (which then serves the                   2015
                    system call).
                 Most details of the OS interface are hidden from the
                    programmer by the API and managed by the run-time
                    support library (a set of functions built into
                    libraries included with the compiler).
17.   What is Cache Memory?                                                CO1
                  Cache memory is invisible to the OS.
                  It interacts with other memory management hardware
                  Cache contains a copy of a portion of main memory.
18.   What are the types of Operating System?                              CO1
           Batch Operating System
           Time-Sharing Operating System
           Embedded Operating System
           Multiprogramming Operating System
           Network Operating System
           Distributed Operating System
           Multiprocessing Operating System
19.   What are the types of network operating system?                      CO1
           Peer-to-peer network operating system
           Client-Server network operating system
20.   List the functions of an Operating System.                          CO1
      The operating system performs the following functions in a device.
         1. Instruction
         2. Input/output Management
         3. Memory Management
         4. File Management
         5. Processor Management
         6. Job Priority
         7. Special Control Program
         8. Scheduling of resources and jobs
         9. Security
         10. Monitoring activities
         11. Job accounting
21.   Write down the operating system structures.                          CO1
          1. Simple Structure
          2. Monolithic Structure
          3. Micro-Kernel Structure
          4. Layered Structure
22.   Define Operating System. List the Objectives of Operating            CO1     Apr/May
      Systems.                                                                      2019
      An Operating System (OS) is an interface between a computer user
      and the computer hardware. An operating system is software that
      performs all the basic tasks like file management, memory
      management, process management, handling input and output, and
      controlling peripheral devices such as disk drives and printers.
      The objectives of the operating system are −
          To make the computer system convenient to use in an
             efficient manner.
          To hide the details of the hardware resources from the users.
          To provide users a convenient interface to use the computer
             system.
          To act as an intermediary between the hardware and its users,
             making it easier for the users to access and use other
             resources.
          To manage the resources of a computer system.
          To keep track of who is using which resource, granting
             resource requests, and mediating conflicting requests from
             different programs and users.
          To provide efficient and fair sharing of resources among
             users and programs.
23.   List the Operating System Design Goals.                              CO1
           User Goals
           System Goals
24.   Define the two main categories of processor registers.              CO1
             The two categories are:
                 User-visible registers: enable the machine or assembly
                  language programmer to minimize main memory
                  references by optimizing register use.
                 Control and status registers: used by the
                  processor to control its own operation.
                                                 PART - B
S.No.                               Question and Answer                        CO          Univ QP
                                                                                         (Month/Year)
      1. Explain different operating system structures with neat sketch.                      (Nov/Dec
                                                                                                2015)
                                                                                              (Apr/May
        • Modern operating systems are complex in nature. Here we study the           2017)
        different structures of operating systems.                                  (Apr/May
        Simple Structure                                                              2018)
        • The Microsoft Disk Operating System (MS-DOS) is an example of a simple
        operating system structure. Most commercial systems do not have a
        well-defined structure. Initially DOS was small and simple in size; as new
        versions were introduced, its size kept increasing.
        • There is no CPU execution mode (user and kernel), so errors in
        applications can cause the whole system to crash.
        • MS-DOS is a layered-structure operating system. It consists of the
        following layers:
        1. Application program layer
        2. Resident system program layer
        3. Device driver layer
        4. ROM BIOS device driver layer
        • In DOS, application programs interact directly with the BIOS device
        drivers. If a user makes any changes in a BIOS device driver, it creates
        problems that affect the whole system. Memory size is also limited, so
        after use memory must be made free for other users.
        • Another example of this type is the UNIX operating system. Initially it
        provided limited hardware functionality. UNIX is divided into two parts:
        the kernel and the system programs.
        • The kernel provides system calls for CPU scheduling, I/O management, file
        management and memory management. System calls are exposed through the
        application program interface in UNIX. The API defines the user interface
        by using a set of system programs. The kernel supports an API and user
        interface.
        • To support more advanced hardware, newer UNIX operating systems were
        developed. They consist of different modules for controlling the hardware,
        and the modules communicate with each other for proper operation.
Designers are free to design operating systems which support all new
versions of hardware and user requirements.
Layered Approach
• The second type of operating system design is the layered approach. The
operating system is divided into a number of layers, each with a properly
defined boundary. The bottom layer is called layer 0 and the topmost layer
is called layer N. Layer N provides the user interface.
• The function of each layer is also fixed. Fig. 1.9.1 shows the operating
system layers. Each layer consists of data structures and a set of
routines, and provides services to the layers above and below it.
• Modularity is the advantage of the layered system. The number of layers
is chosen properly and each layer uses the services and functions of the
layer below it. This type of design simplifies debugging and system
verification.
• The first layer contains only the basic hardware functions, so only the
first layer needs to be debugged to check it. If no error is found, the
system works properly up to that layer. If an error is encountered while
debugging the second layer, the error is related to the second layer only.
In this way the operating system is designed and implemented layer by
layer, which simplifies the design task.
• Care should be taken while designing the layers: which functions are
added to which layer must be decided properly, and layer boundaries must
be defined properly. Some functions belong only in the lower layers;
device drivers, memory management and input/output functions must be
included in the lower layers.
• Secondary storage is required for all operations. When the CPU switches
processes as per the scheduling method, the currently executing process is
stored on secondary storage, so disk space is required for this purpose.
CPU scheduling is therefore placed in a layer above the secondary storage
layer.
• Each layer uses system calls to perform its operations, and each layer
adds overhead to a system call.
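The discipline described above, where each layer calls only the layer directly below it, can be sketched with three toy layers (the class and method names are invented for illustration):

```python
# Toy layered structure: the file system (layer 2) uses only the
# device driver (layer 1), which uses only the hardware (layer 0).
class Hardware:                    # layer 0
    def read_block(self, n):
        return f"block{n}"

class DeviceDriver:                # layer 1: uses only layer 0
    def __init__(self, hw):
        self.hw = hw
    def read(self, n):
        return self.hw.read_block(n)

class FileSystem:                  # layer 2: uses only layer 1
    def __init__(self, drv):
        self.drv = drv
    def read_file(self, n):
        return f"file<{self.drv.read(n)}>"
```

Because each layer touches only the one directly below it, an error observed at the file-system level can be localized by first testing layers 0 and 1 in isolation, which is the debugging benefit described above.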
Monolithic Structure
• The traditional UNIX operating system uses a monolithic kernel
architecture. The entire operating system runs as a single program in
kernel mode. The program contains the operating system's core functions
and device drivers.
• Most operations performed by the kernel are via system calls. Required
system calls are made within programs and a checked copy of the request is
passed through a system call. Fig. 1.9.2 shows a monolithic kernel.
• The Linux and FreeBSD operating systems use a modern monolithic kernel
architecture that loads modules at run time. This gives easy access to
kernel functions as required and minimizes the code running in kernel
space.
• Monolithic kernels are used in Windows 95, Windows 98, Linux, FreeBSD,
etc.
Advantages:
1. Simple to design and implement.
2. Simplicity provides speed on simple hardware.
3. It can be expanded using module system.
4. Time tested and design well known.
Disadvantages:
1. Runtime loading and unloading of modules is not possible in a pure
monolithic kernel.
2. As the code base grows, maintenance becomes difficult.
3. Fault tolerance is low.
Microkernel
• A microkernel provides minimal services, such as defining memory address
spaces, inter-process communication (IPC) and process management. It is a
small operating core; hardware resource management is implemented outside
the kernel, in the processes that need it.
• The function of the microkernel is to provide a communication facility
between client programs and the various services that run in user space.
Message passing is the method of communication between two processes.
• The microkernel runs in kernel mode and the rest runs as normal user
processes. A microkernel also provides more security and reliability,
since most services run as user rather than kernel processes.
• By running device drivers and the file system as separate user
processes, an error in one can crash only that single component.
Fig. 1.9.3 shows a microkernel.
• The Mach operating system uses a microkernel architecture. The
microkernel in the Windows NT operating system provides portability and
modularity. The kernel is surrounded by a number of compact subsystems, so
that the task of implementing Windows NT on a variety of platforms is
easy.
• A microkernel architecture assigns only a few essential functions to the
kernel, including address spaces, IPC and basic scheduling. QNX is a
real-time operating system that is also based on the microkernel design.
• The main disadvantage is poor performance due to the increased system
overhead from message passing.
Advantages of a microkernel:
1. A microkernel allows the addition of new services.
2. The microkernel architecture provides a uniform interface for requests
made by a process.
3. The microkernel architecture supports object-oriented operating
systems.
4. The modular design helps to enhance reliability.
5. A microkernel lends itself to distributed system support.
6. The microkernel architecture supports flexibility: users can add or
remove services according to their requirements.
Virtual Machine
• In a pure virtual machine architecture the operating system gives each
process the illusion that it is the only process on the machine. The user
writes an application as if only its code were running on the system.
• Each user interacts with the computer by typing commands to the virtual
machine on a virtual system console and receiving results back from the
machine as soon as they are computed.
• Each user directs the virtual machine to perform different commands.
These commands are then executed on the physical machine in a
multiprogramming environments.
• Virtualization is an abstraction layer that decouples the physical hardware
from the operating system to deliver greater IT resource utilization and
flexibility.
• It allows multiple virtual machines, with heterogeneous operating systems
to run in isolation, side-by-side on the same physical machine. Fig. 1.9.4
shows virtual machine.
• Each virtual machine has its own set of virtual hardware (e.g., RAM,
CPU, NIC, etc.) upon which an operating system and applications are loaded.
The operating system creates the illusion of multiple processes, each
executing on its own processor with its own (virtual) memory.
• The main components of a virtual machine system are the control program,
the conversational monitor system, the remote spooling communications
subsystem, and the interactive problem control system.
• The control program creates the environment in which virtual machines
can execute. It also manages the real machine underlying the virtual
machine environment.
• The Java virtual machine is one of the most widely used virtual machines.
• Virtual machines tend to be less efficient than real machines because they
access the hardware indirectly.
Benefits
1. There is no overlap amongst memory, as each virtual machine has its own
memory space.
2. Virtual machines are completely isolated from the host machine and other
virtual machines.
3. Data does not leak across virtual machines.
Virtualization
• Virtualization means running multiple machines on a single hardware
platform. The "real" hardware is invisible to the operating system; the OS
only sees an abstracted picture. Only the Virtual Machine Monitor (VMM)
talks to the hardware.
• It is "a technique for hiding the physical characteristics of computing
resources from the way in which other systems, applications, or end users
interact with those resources. This includes making a single physical
resource appear to function as multiple logical resources; or it can include
making multiple physical resources appear as a single logical resource."
• It is divided into two main categories :
1. Platform virtualization involves the simulation of virtual machines.
2. Resource virtualization involves the simulation of combined, fragmented,
or simplified resources.
Comparison between Monolithic Kernel and Microkernel
2. Explain the various types of system calls with examples.          (May/June 2015)
                                                                     (Nov/Dec 2015)
                                                                     (Apr/May 2017)
                                                                     (Nov/Dec 2018)
   A system call is a way for a user program to interface with the operating
   system. The program requests several services, and the OS responds by
   invoking a series of system calls to satisfy the request. A system call can be
   written in assembly language or a high-level language like C or Pascal.
   System calls are predefined functions that the operating system may directly
   invoke if a high-level language is used.
  A system call is a method for a computer program to request a service from
  the kernel of the operating system on which it is running. A system call is a
  method of interacting with the operating system via programs. A system call
  is a request from computer software to an operating system's kernel.
  The Application Program Interface (API) connects the operating system's
  functions to user programs. It acts as a link between the operating system
  and a process, allowing user-level programs to request operating system
  services. The kernel system can only be accessed using system calls.
System calls are required for any programs that use resources.
  When a computer software needs to access the operating system's kernel, it
  makes a system call. The system call uses an API to expose the operating
  system's services to user programs. It is the only method to access the
  kernel system. All programs or processes that require resources for
  execution must use system calls, as they serve as an interface between the
  operating system and user programs.
Below are some examples of how a system call differs from a user function.
       1. A system call function may create and use kernel processes to
          execute the asynchronous processing.
       2. A system call has greater authority than a standard subroutine. A
          system call with kernel-mode privilege executes in the kernel
          protection domain.
       3. System calls are not permitted to use shared libraries or any symbols
          that are not present in the kernel protection domain.
       4. The code and data for system calls are stored in global kernel
          memory.
There are various situations where system calls are required in the
operating system. Some of these situations are as follows:
    1. A system call is required when a file system wants to create or
        delete a file.
    2. Network connections require system calls for sending and
        receiving data packets.
    3. If you want to read or write a file, you need system calls.
    4. If you want to access hardware devices, such as a printer or
        scanner, you need a system call.
    5. System calls are used to create and manage new processes.
Applications run in an area of memory known as user space. A system
call connects to the operating system's kernel, which executes in kernel
space. When an application makes a system call, it must first obtain
permission from the kernel. It achieves this using an interrupt request,
which pauses the current process and transfers control to the kernel.
If the request is permitted, the kernel performs the requested action, like
creating or deleting a file. When the operation is finished, the kernel
returns the results to the application, moving data from kernel space to
user space in memory, and the application then resumes execution.
A simple system call may take a few nanoseconds to provide the result, like
retrieving the system date and time. A more complicated system call, such
as connecting to a network device, may take a few seconds. Most operating
systems launch a distinct kernel thread for each system call to avoid
bottlenecks. Modern operating systems are multi-threaded, which means
they can handle various system calls at the same time.
Types of System Calls
There are commonly five types of system calls. These are as follows:
    1. Process Control
    2. File Management
    3. Device Management
    4. Information Maintenance
    5. Communication
Now, you will learn about all the different types of system calls one-by-one.
Process Control
Process control is the system call type that is used to direct processes. Some
process control examples include create process, load, execute, abort, end,
and terminate process.
File Management
File management is a system call type that is used to handle files. Some file
management examples include create, delete, open, close, read, and write
files.
Device Management
Device management is a system call type that is used to deal with devices.
Some examples of device management include request device, release
device, read, write, and get device attributes.
Information Maintenance
Information maintenance is a system call type that is used to maintain
information. Some examples of information maintenance include get system
data, set system data, get time or date, and set time or date.
Communication
Communication is a system call type that is used for interprocess
communication. Some examples of communication include create and
delete communication connections, and send and receive messages.
Examples of Windows and Unix system calls
There are various examples of Windows and Unix system calls. These are
as listed below in the table:
 Process              Windows                     Unix
 Process Control      CreateProcess()             fork()
                      ExitProcess()               exit()
                      WaitForSingleObject()       wait()
 File Manipulation    CreateFile()                open()
                      ReadFile()                  read()
                      WriteFile()                 write()
                      CloseHandle()               close()
 Device               SetConsoleMode()            ioctl()
 Management           ReadConsole()               read()
                      WriteConsole()              write()
 Information          GetCurrentProcessID()       getpid()
 Maintenance          SetTimer()                  alarm()
                      Sleep()                     sleep()
 Communication        CreatePipe()                pipe()
                      CreateFileMapping()         shmget()
                      MapViewOfFile()             mmap()
 Protection           SetFileSecurity()           chmod()
                      InitializeSecurityDescriptor()   umask()
                      SetSecurityDescriptorGroup()     chown()
Here, you will learn about some methods briefly:
open()
The open() system call allows you to access a file on a file system. It
allocates resources to the file and provides a handle that the process may
refer to. A file may be opened by many processes at once or by a single
process only; it all depends on the file system and its structure.
read()
It is used to obtain data from a file on the file system. It accepts three
arguments in general:
     o A file descriptor.
     o A buffer to store read data.
     o The number of bytes to read from the file.
The file to be read must first be opened using open(); the file descriptor
returned then identifies it for reading.
wait()
In some systems, a process may have to wait for another process to
complete its execution before proceeding. When a parent process makes a
child process, the parent process execution is suspended until the child
process is finished. The wait() system call is used to suspend the parent
process. Once the child process has completed its execution, control is
returned to the parent process.
write()
It is used to write data from a user buffer to a device like a file. This system
call is one way for a program to generate data. It takes three arguments in
general:
     o A file descriptor.
     o A pointer to the buffer in which data is saved.
     o The number of bytes to be written from the buffer.
fork()
Processes generate clones of themselves using the fork() system call. It is
one of the most common ways to create processes in operating systems.
After fork(), the parent and child execute concurrently; the parent can use
the wait() system call to suspend itself until the child process completes,
after which control returns to the parent process.
close()
It is used to end file system access. When this system call is invoked, it
signifies that the program no longer requires the file, and the buffers are
flushed, the file information is altered, and the file resources are de-
allocated as a result.
exec()
When an executable file replaces an earlier executable file in an already
executing process, this system call is invoked. As a new process is not
built, the old process identification stays, but the new program replaces the
old one's code, stack, data, heap, etc.
  exit()
exit() is a system call that is used to end program execution. This call
   indicates that the thread execution is complete, which is especially useful
   in multi-threaded environments. The operating system reclaims resources
   spent by the process following the use of the exit() system function.
3. What are the basic functions of OS and DMA?                       (Nov/Dec 2015)
                                                                     (Apr/May 2017)
   For a device that does large transfers, such as a disk drive, it seems wasteful
   to use an expensive general-purpose processor to watch status bits and to
   feed data into a controller register one byte at a time—a process termed
   programmed I/O (PIO). Many computers avoid burdening the main CPU
   with PIO by offloading some of this work to a special-purpose processor
   called a direct-memory-access (DMA) controller. To initiate a DMA
   transfer, the host writes a DMA command block into memory. This block
   contains a pointer to the source of a transfer, a pointer to the destination of
   the transfer, and a count of the number of bytes to be transferred. The CPU
   writes the address of this command block to the DMA controller, then goes
   on with other work. The DMA controller proceeds to operate the memory
   bus directly, placing addresses on the bus to perform transfers without the
   help of the main CPU. A simple DMA controller is a standard component
   in all modern computers, from smartphones to mainframes. Handshaking
   between the DMA controller and the device controller is performed via a
   pair of wires called DMA-request and DMA-acknowledge.
   The device controller places a signal on the DMA-request wire when a
   word
   of data is available for transfer. This signal causes the DMA controller to
   seize the memory bus, place the desired address on the memory-address
   wires, and place a signal on the DMA-acknowledge wire. When the device
   controller receives the DMA-acknowledge signal, it transfers the word of
   data to memory and removes the DMA-request signal.
   When the entire transfer is finished, the DMA controller interrupts the
   CPU.
   This process is depicted in Figure 13.5. When the DMA controller seizes
   the
   memory bus, the CPU is momentarily prevented from accessing main
   memory, although it can still access data items in its primary and
   secondary caches. Although this cycle stealing can slow down the CPU
   computation, offloading the data-transfer work to a DMA controller
   generally improves the total system performance. Some computer
   architectures use physical memory addresses for DMA, but others
perform direct virtual memory access (DVMA), using virtual addresses that
   undergo translation to physical addresses. DVMA can perform a transfer
   between two memory-mapped devices without the intervention of the CPU
   or the use of main memory.
   On protected-mode kernels, the operating system generally prevents
   processes from issuing device commands directly. This discipline protects
   data from access-control violations and also protects the system from
   erroneous use of device controllers that could cause a system crash.
   Instead, the operating system exports functions that a sufficiently
   privileged process can use to access low-level operations on the underlying
   hardware. On kernels without memory protection, processes can access
   device controllers directly. This direct access can be used to achieve high
   performance, since it can avoid kernel communication, context switches,
   and layers of kernel software. Unfortunately, it interferes with system
   security and stability. The trend in general-purpose operating systems is to
   protect memory and devices so that the system can try to guard against
   erroneous or malicious applications.
4. Explain the concept of multiprocessor and multicore               (Apr/May 2017)
   organization.
   Multiprocessor System
• A multiprocessor system means more than one processor in close
communication. All the processors share a common bus, clock, memory
and peripheral devices.
• A multiprocessor system is also called a parallel system or tightly
coupled system; such systems are also known as multicore systems.
Multiprocessor operating systems are just regular operating systems :
they handle system calls, do memory management, provide a file
system, and manage I/O devices.
• Features of Multiprocessor Systems :
1. The processor should support efficient context switching.
2. It supports a large physical address space and a larger virtual address
space.
3. If one processor fails, the other processors should retrieve the
interrupted process state so that execution of the process can continue.
4. The inter-process communication mechanism should be provided
and implemented in hardware, as this makes it efficient and easy.
• Multiprocessors can be used in different ways :
1. As uniprocessors (single-instruction, single-data or SISD);
2. Within a single system to execute multiple, independent sequences
of instructions in multiple contexts (multiple-instruction, multiple-
data or MIMD);
3. A single sequence of instructions in multiple contexts (single-
instruction, multiple-data or SIMD, often used in vector processing);
4. Multiple sequences of instructions in a single context (multiple-
instruction, single-data or MISD, used for redundancy in fail-safe
systems and sometimes applied to describe pipelined processors or
hyperthreading).
Advantages and Disadvantages of Multiprocessor Systems
 Advantages of Multiprocessor Systems :
1. Throughput : Throughput increases because the number of processors
increases.
2. Cost : A multiprocessor system is cheaper than multiple single-
processor systems.
3. Reliability : If one processor fails, it has no effect on the whole
system's operation.
4. Response time : Response time is lower because of the increased
number of processors.
• Disadvantages of Multiprocessor Systems :
1. Multiprocessor systems are more complex in both hardware and
software.
2. Additional CPU cycles are required to manage the cooperation, so
per-CPU efficiency goes down.
• Multiprocessor systems are of two types : symmetric multiprocessing
and asymmetric multiprocessing.
Symmetric Multiprocessing
• The simplest multiprocessors are based on a single bus. In
symmetric multiprocessing, a number of homogeneous processors
run independently without affecting other programs.
• Systems that treat all CPUs equally are called symmetric
multiprocessing (SMP) systems.
• Each processor uses different data and programs but shares some
common resources like memory, I/O devices etc. Fig. 1.4.1 shows a
multiprocessing system.
• In SMP, all processors are at the same level; they work as peers.
There is no master-slave relationship between the processors. Each
processor shares a single copy of the operating system.
• Symmetric multiprocessing is easier to implement in operating
systems. Examples of symmetric multiprocessing operating systems
are Windows NT 4.0, Novell NetWare 2.1, etc.
Asymmetric Multiprocessing
• If all CPUs are not equal, system resources may be divided in a
number of ways, including asymmetric multiprocessing (ASMP),
non-uniform memory access (NUMA) multiprocessing, and clustered
multiprocessing.
• All CPUs are not equal : there is one master processor and the
remaining are slave processors. Each processor has its own memory
address space.
• The master processor controls the whole operation. It gives
instructions to the slave processors, or a slave processor executes some
predefined set of tasks.
• Asymmetric multiprocessing has one master CPU and the remaining
CPUs are slaves. The master distributes tasks among the slaves, and
I/O is usually done by the master only.
• Each processor has its own task, assigned by the master processor.
Asymmetric multiprocessing is difficult to implement.
5. What are the advantages and disadvantages of using the same                             (Nov/Dec
   system call interface for both files and devices.                                         2016)
   Advantages:
      1. Unified Interface: Having a single system call interface simplifies
           programming, as developers only need to learn and work with one
           set of functions for accessing both files and devices. This reduces
           complexity and makes code more readable and maintainable.
      2. Consistency: With a unified interface, the behavior and usage
           patterns are consistent across different types of resources (files and
           devices), which can lead to fewer errors and more predictable
           outcomes.
      3. Flexibility: A unified interface allows for more flexible use of
           resources. For example, it's easier to redirect output from a file to a
           device or vice versa without needing to change the code
           significantly.
      4. Portability: Programs written using a unified interface are more
          portable across different operating systems, as long as those
          systems support the same system call interface.
   Disadvantages:
      1. Loss of Specialization: Treating files and devices the same may
          lead to the loss of optimizations and special features that could be
          provided for each separately. For example, certain devices may
          have specific operations or characteristics that are not fully utilized
          with a generic file interface.
      2. Performance Overhead: A unified interface may introduce some
          performance overhead, as it needs to handle both file and device
          operations in a generic way, potentially leading to less efficient
          code execution compared to optimized, specialized interfaces for
          each type of resource.
      3. Complexity in Implementation: Implementing a unified interface
          that adequately handles both files and devices can be more complex
          than implementing separate interfaces for each. This complexity
          may introduce bugs or make maintenance more challenging.
      4. Security Concerns: Treating files and devices the same way might
          introduce security vulnerabilities, as certain operations that are safe
          for files could be dangerous if applied to devices, or vice versa. A
          unified interface might not provide sufficient safeguards to prevent
          such issues.
   While a unified system call interface offers simplicity, consistency, and
   flexibility, it may come at the cost of specialization, performance,
   complexity, and potentially security. The decision to use a unified
   interface should consider the specific requirements and trade-offs of the
   system being developed.
6. Describe the difference between symmetric and asymmetric          (May/June 2016)
   multiprocessing. Discuss the advantages and disadvantages of      (Nov/Dec 2016)
   multiprocessor systems.
   Difference Between Asymmetric and Symmetric Multiprocessing:
   S.No.  Asymmetric Multiprocessing            Symmetric Multiprocessing
   1.     The processors are not treated        All the processors are treated
          equally.                              equally.
   2.     Tasks of the operating system are     Tasks of the operating system are
          done by the master processor.         done by the individual processors.
   3.     No communication between              All processors communicate with
          processors, as they are controlled    one another through shared
          by the master processor.              memory.
   4.     The process scheduling approach       Processes are taken from the
          used is master-slave.                 ready queue.
   5.     Asymmetric multiprocessing            Symmetric multiprocessing
          systems are cheaper.                  systems are costlier.
   6.     Asymmetric multiprocessing            Symmetric multiprocessing
          systems are easier to design.         systems are complex to design.
   7.     All processors can exhibit            The architecture of each
          different architectures.              processor is the same.
   8.     It is simple, as only the master      It is complex, as synchronization
          processor has access to the data.     of the processors is required in
                                                order to maintain load balance.
   9.     If the master processor fails, a      In case of processor failure,
          slave processor continues the         there is a reduction in the
          execution and becomes the master.     system's computing capacity.
          If a slave processor fails, other
          processors take over its task.
   10.    It is suitable for homogeneous        It is suitable for homogeneous
          or heterogeneous cores.               cores.
Advantages and Disadvantages of Multiprocessor Systems
    Advantages of Multiprocessor Systems :
   1. Throughput : Throughput increases because the number of processors
   increases.
   2. Cost : A multiprocessor system is cheaper than multiple single-
   processor systems.
   3. Reliability : If one processor fails, it has no effect on the whole
   system's operation.
   4. Response time : Response time is lower because of the increased
   number of processors.
   • Disadvantages of Multiprocessor Systems :
   1. Multiprocessor systems are more complex in both hardware and
   software.
   2. Additional CPU cycles are required to manage the cooperation, so
   per-CPU efficiency goes down.
7. Discuss in detail about Distributed systems.                      (May/June 2016)
  Distributed System
  • Definition: A distributed system is a collection of autonomous hosts that
  are connected through a computer network.
  • A distributed system is a collection of independent computers that appears
  to its users as a single coherent system. Each host executes components and
  operates a distribution middleware.
  • Middleware enables the components to coordinate their activities. Users
  perceive the system as a single, integrated computing facility.
  • A distributed computer system consists of multiple software components
  that are on multiple computers, but run as a single system. The computers
  that are in a distributed system can be physically close together and
  connected by a local network, or they can be geographically distant and
  connected by a wide area network.
• A distributed system can consist of any number of possible configurations,
such as mainframes, personal computers, workstations, minicomputers and
so on.
• Distributed operating systems depend on networking for their operation. A
distributed OS runs on and controls the resources of multiple machines. It
provides resource sharing across the boundaries of a single computer
system. It looks to users like a single-machine OS.
• A distributed OS owns the whole network and makes it look like a virtual
uniprocessor, or perhaps a virtual multiprocessor.
• Definition : A distributed operating system is one that looks to its users like
an ordinary operating system but runs on multiple, independent CPUs.
• Distributed systems depend on networking for their functionality. Fig.
1.6.1 shows the distributed system.
• Examples of distributed operating systems are Amoeba, Chorus, Mach and
V-System.
• A Local Area Network (LAN) is a network that is confined to a relatively
small area. It is generally limited to a geographic area such as a college, lab
or building.
• WAN provides long distance transmission of data and voice. Computers
connected to a wide-area network are often connected through public
networks, such as the telephone system. They can also be connected
through leased lines or satellites.
• A MAN typically covers an area of between 5 and 50 km diameter. Many
MANs cover an area the size of a city, although in some cases MANs may
be as small as a group of buildings.
• A MAN often acts as a high speed network to allow sharing of regional
resources. MAN provides the transfer rates from 34 to 150 Mbps.
Advantages and Disadvantages of DS
a) Advantages of distributed OS:
1. Resource sharing : Sharing of software resources such as software
libraries and databases, and hardware resources such as hard disks,
printers and CD-ROMs, can be done in a very effective way among all the
computers and users.
2. Higher reliability: Reliability refers to the degree of tolerance against
  errors and component failures. Availability is one of the important aspects
  of reliability. Availability refers to the fraction of time for which a system is
  available for use.
3. Better price-performance ratio : The reduction in the price of
microprocessors and their increasing computing power give a good
price-performance ratio.
4. Shorter response times and higher throughput.
  5. Incremental growth: To extend power and functionality of a system by
  simply adding additional resources to the system.
  b) Disadvantages :
  1. There are no current commercially successful examples.
  2. Protocol overhead can dominate computation costs.
  3. Hard to build well.
  4. Probably impossible to build at the scale of the internet.
8. Demonstrate the three methods for passing            parameters to the OS             (May/June
   with examples.                                                                          2016)
   1) Pass parameters in registers (directly).
   section .text
   global _start
   _start:
      mov eax, 4       ; System call number for write
      mov ebx, 1       ; File descriptor for standard output
      mov ecx, msg ; Pointer to the message to be printed
      mov edx, msglen ; Length of the message
      int 0x80       ; Invoke the kernel
      mov eax, 1       ; System call number for exit
      xor ebx, ebx ; Return 0 status
      int 0x80       ; Invoke the kernel
   section .data
   msg db 'Hello, world!', 0xA ; The message to be printed
   msglen equ $ - msg            ; Length of the message
   2) Store the parameters in a table in memory and the table address is
   passed in a register to the OS.
   #include <unistd.h>
   #include <sys/types.h>
   int main() {
      struct syscall_params {
         int syscall_number;
         int fd;
           const char *buf;
           size_t count;
      };
      struct syscall_params params;
      params.syscall_number = 4; // System call number for write
      params.fd = 1;               // File descriptor for standard output
      params.buf = "Hello, world!\n"; // Pointer to the message to be printed
      params.count = 14;              // Length of the message
       syscall(0, &params);         // Invoke the (illustrative) system call,
                                    // passing the parameter table's address in a register
      return 0;
   }
   3) Push (store) the parameters onto a stack (by the program), to be
   "popped" off by the operating system.
   #include <unistd.h>
   #include <sys/types.h>
   int main() {
      char *msg = "Hello, world!\n";
      ssize_t len = 14;
      asm volatile(
         "movl $4, %%eax\n" // System call number for write
         "movl $1, %%ebx\n" // File descriptor for standard output
         "movl %0, %%ecx\n" // Pointer to the message to be printed
         "movl %1, %%edx\n" // Length of the message
         "int $0x80\n"        // Invoke the kernel
         :
         : "r"(msg), "r"(len)
         : "eax", "ebx", "ecx", "edx"
      );
      return 0;
   }
9. Explain how protection is provided for the hardware resources by                        (Nov/Dec
   the operating system.                                                                     2016)
   If a computer system has multiple users and allows the concurrent
   execution
   of multiple processes, then access to data must be regulated. For that
   purpose, mechanisms ensure that files, memory segments, CPU, and other
   resources can be operated on by only those processes that have gained
   proper authorization from the operating system. For example, memory-
   addressing hardware ensures that a process can execute only within its
   own address space. The timer ensures that no process can gain control of
   the CPU without eventually relinquishing control. Device-control
   registers are not accessible to users, so the integrity of the various
   peripheral devices is protected.
   Protection, then, is any mechanism for controlling the access of processes
or users to the resources defined by a computer system. This mechanism
must provide means to specify the controls to be imposed and to enforce
the controls.
Protection can improve reliability by detecting latent errors at the
interfaces
between component subsystems. Early detection of interface errors can
often prevent contamination of a healthy subsystem by another subsystem
that is malfunctioning. Furthermore, an unprotected resource cannot
defend against use (or misuse) by an unauthorized or incompetent user. A
    protection-oriented system provides a means to distinguish between
    authorized and unauthorized usage.
A system can have adequate protection but still be prone to failure and
allow inappropriate access. Consider a user whose authentication
information (her means of identifying herself to the system) is stolen. Her
data could be copied or deleted, even though file and memory protection
are working. It is the job of security to defend a system from external and
internal attacks. Such attacks spread across a huge range and include
viruses and worms, denial-of service attacks (which use all of a system’s
resources and so keep legitimate users out of the system), identity theft,
and theft of service (unauthorized use of a system). Prevention of some of
these attacks is considered an operating-system function on some
systems, while other systems leave it to policy or additional software.
Due to the alarming rise in security incidents, operating-system security
features represent a fast-growing area of research and implementation.
Protection and security require the system to be able to distinguish among
all its users. Most operating systems maintain a list of user names and
associated user identifiers (user IDs). In Windows parlance, this is a
security ID (SID). These numerical IDs are unique, one per user. When a
user logs in to the system, the authentication stage determines the
appropriate user ID for the user. That user ID is associated with all of the
user’s processes and threads.
When an ID needs to be readable by a user, it is translated back to the
user
name via the user name list.
In some circumstances, we wish to distinguish among sets of users rather
than individual users. For example, the owner of a file on a UNIX system
may be allowed to issue all operations on that file, whereas a selected set
of users may be allowed only to read the file. To accomplish this, we need
to define a group name and the set of users belonging to that group.
Group functionality can be implemented as a system-wide list of group
names and group identifiers.
A user can be in one or more groups, depending on operating-system
design
decisions. The user’s group IDs are also included in every associated
process and thread. In the course of normal system use, the user ID and
group ID for a user are sufficient. However, a user sometimes needs to
    escalate privileges to gain extra permissions for an activity. The user may
    need access to a device that is restricted, for example. Operating systems
    provide various methods to allow privilege escalation. On UNIX, for
    instance, the setuid attribute on a program causes that program to run with
    the user ID of the owner of the file, rather than the current user’s ID. The
    process runs with this effective UID until it turns off the extra privileges
    or terminates.
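    The effective-UID mechanism described above can be observed directly.
    The sketch below uses only standard POSIX calls (nothing specific to this
    text): it prints the real and effective user IDs of the process. Run
    normally the two match; if the binary carried the setuid bit and were
    owned by another user, geteuid() would report that owner's ID instead.

    ```c
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>

    int main(void) {
        uid_t real = getuid();       /* ID of the user who ran the program   */
        uid_t effective = geteuid(); /* ID used for permission checks; set by
                                        the setuid bit on a setuid binary    */
        printf("real UID: %d, effective UID: %d\n", (int)real, (int)effective);
        return 0;
    }
    ```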
10. List the various services provided by operating systems.                              (Nov/Dec
    Operating System Services                                                             2016)
    • An operating system provides different types of services to different                (Apr/May
    users, such as loading a program into memory, allocating disk storage,                   2018)
    and opening or reading files and directories.
    • The services offered vary from one operating system to another; two
    different operating systems may provide the same type of service under
    different names. By providing these services, the OS makes the
    programmer's job easier.
    Following are the list of services provided by operating systems :
    1. Program execution              2. Input-output operation
    3. Error detection                4. File and directory operation
    5. Communication                  6. Graphical user interface.
    1. Program execution: Before a program executes, the operating system
    loads it into memory. Once loaded, the program starts execution and
    finishes either with or without errors. The next operation is up to the
    user.
    2. Input-output operation: A program is a combination of input and
    output statements. While executing, it requires I/O devices, which the
    OS provides to the program.
    3. Error detection: Errors may be related to memory, the CPU, I/O
    devices, or the user program. Memory full, stack overflow, file not
    found, directory does not exist, printer not ready, and attempts to
    access illegal memory are examples of errors to be detected.
    4. File and directory operation: Users read and write files and
    directories, and also search, rename, modify, create and delete them.
    All these operations are performed with the help of the operating
    system.
    5. Communication: Communication may be inter-process communication
    or another kind; data transfer is one form of communication. It can
    take place between two processes on the same machine or on different
    machines. Pipes, shared memory, sockets and message passing are the
    different methods of communication.
    6. Graphical user interface: The user interacts with the operating
    system through a user interface, which every operating system
    provides. Command-line interfaces and batch interfaces are two types
    of user interface. In a command-line interface, the user
    enters text commands to perform operations. A batch interface uses
    files: a file contains commands for various operations, and when the
    file executes, the command output is displayed on the screen.
    • All the above services are provided for a single user. If there are
    multiple users in the system, the operating system provides some
    additional services: resource allocation, accounting, protection of
    data, and security.
    • Different types of resources are managed by the operating system.
    Multiple users require different resources for their execution, and one
    resource may be required by two different users, so resource allocation
    is necessary.
    • Accounting means keeping track of resources and users: which
    resources are allocated to which user, and whether a particular user
    has permission to use a given type of resource.
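    The error-detection and file-operation services can be illustrated with a
    small C sketch. It asks the OS to open a path that is assumed not to
    exist; the kernel detects the error and reports it back to the program
    through the return value and errno:

    ```c
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        /* Assumed-nonexistent path, used only to provoke an error. */
        int fd = open("/no/such/file", O_RDONLY);
        if (fd < 0) {
            /* The OS detected the error; errno identifies it. */
            printf("open failed: %s\n", strerror(errno));
        } else {
            close(fd);
        }
        return 0;
    }
    ```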
11. Discuss the DMA driven data transfer technique.                                      (May/June
    Introduction to DMA:                                                                 2015)
   DMA is a technique used in computer systems to enhance data transfer
   between peripherals and memory without involving the CPU. In traditional
   data transfer methods, the CPU is responsible for managing data
   movement between devices, which can be inefficient and time-consuming.
   DMA alleviates this burden by allowing peripherals to communicate
   directly with the memory, reducing CPU involvement and improving
   overall system performance.
   Components of DMA:
   DMA Controller:
    DMA operations are supervised by a specialized hardware component
   known as the DMA controller.
   The DMA controller coordinates data transfers between peripherals and
   memory, freeing up the CPU to focus on other tasks.
   DMA Channels:
    DMA controllers typically have multiple channels, each capable of
   handling a specific data transfer task. Channels can be independently
   programmed to manage different peripherals or memory regions.
   Steps in DMA-Driven Data Transfer:
   Initialization:
    The CPU initializes the DMA controller by setting up parameters such as
   source and destination addresses, transfer size, and transfer direction (read
   or write).
   Request and Acknowledge:
    When a peripheral device needs to transfer data, it sends a request to the
   DMA controller.
   The DMA controller acknowledges the request and grants control to the
    requesting device.
    Addressing:
     The DMA controller obtains control of the system bus and accesses the
    memory or peripheral directly.
    It uses the source and destination addresses specified during initialization
    for data movement.
    Data Transfer:
     Data is transferred directly between the source and destination without
    CPU intervention.
    This process is often faster than traditional CPU-controlled transfers.
    Interrupt or Polling:
      After the transfer is complete, the DMA controller signals the CPU
    through an interrupt or polling mechanism.
    The CPU can then perform any necessary post-transfer tasks.
    Advantages of DMA:
    Improved Performance:
     DMA reduces CPU overhead, allowing it to focus on other tasks, which
    leads to improved overall system performance.
    Efficiency:
       Data transfers occur directly between peripherals and memory,
    minimizing delays and enhancing efficiency.
    Parallelism:
     DMA channels can operate independently, enabling parallel data transfers
    between different peripherals and memory regions.
    Reduced Latency:
     DMA-driven transfers often result in lower latency compared to CPU-
    controlled transfers.
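    The steps above can be sketched as a simplified software model of one
    DMA channel. All the structure and function names here are invented for
    illustration; a real controller exposes these fields as memory-mapped
    hardware registers, and the completion flag stands in for the interrupt:

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* One DMA channel: the CPU programs source, destination and count,
     * then starts the transfer; the controller moves the whole block and
     * raises a completion flag (one "interrupt" per block, not per byte). */
    typedef struct {
        const uint8_t *src;   /* source address         */
        uint8_t       *dst;   /* destination address    */
        size_t         count; /* bytes to transfer      */
        int            done;  /* set when transfer ends */
    } dma_channel;

    /* Step 1: initialization -- CPU loads the channel registers. */
    void dma_init(dma_channel *ch, const void *src, void *dst, size_t count) {
        ch->src = src; ch->dst = dst; ch->count = count; ch->done = 0;
    }

    /* Steps 3-5: the controller moves the block without per-byte CPU
     * involvement, then signals completion. */
    void dma_start(dma_channel *ch) {
        memcpy(ch->dst, ch->src, ch->count);
        ch->done = 1;
    }

    int main(void) {
        uint8_t device_buf[] = "block of device data";
        uint8_t memory[32] = {0};
        dma_channel ch;
        dma_init(&ch, device_buf, memory, sizeof device_buf);
        dma_start(&ch);
        assert(ch.done && memcmp(memory, device_buf, sizeof device_buf) == 0);
        printf("transfer complete\n");
        return 0;
    }
    ```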
12. Discuss about the evolution of virtual machines. Also explain how                   (May/June
    virtualization could be implemented in operating systems.                             2015)
   Evolution of Virtual Machines:
   First Generation (1960s-1970s):
   The concept of virtualization originated in the 1960s with the
   development of mainframes.
   IBM's CP-40 and CP-67 were among the first systems to introduce
   virtualization features, allowing multiple instances of an operating system
   to run on a single mainframe.
   Second Generation (1980s-1990s):
   The advent of microprocessors and personal computers led to a decline in
   the use of virtualization.
   Server virtualization technologies, such as VMware, emerged in the late
   1990s, enabling multiple operating systems to run on a single physical
   server.
   Third Generation (2000s-2010s):
   Hardware-assisted virtualization became more prevalent with the
   introduction of technologies like Intel VT-x and AMD-V, making VMs
more efficient and easier to manage.
Hypervisors, both Type 1 (bare-metal) and Type 2 (hosted), gained
popularity. Examples include VMware ESXi (Type 1) and Oracle
VirtualBox (Type 2).
Fourth Generation (2010s-Present):
Cloud computing played a significant role in the proliferation of
virtualization.
Containerization technologies, like Docker, emerged as an alternative to
traditional VMs, offering lightweight and faster application deployment.
Implementation of Virtualization in Operating Systems:
Hypervisors:
Hypervisors, also known as Virtual Machine Monitors (VMM), are
essential for implementing virtualization.
Type 1 Hypervisors run directly on the hardware and manage VMs
independently of the host operating system. Examples include VMware
ESXi, Microsoft Hyper-V.
Type 2 Hypervisors run on a host operating system and create VMs
within it. Examples include VMware Workstation, Oracle VirtualBox.
Hardware-Assisted Virtualization:
Modern processors include hardware features that enhance virtualization
performance. Intel VT-x and AMD-V are examples of hardware-assisted
virtualization technologies.
These features allow VMs to execute code directly on the processor
without requiring software emulation, improving efficiency.
Memory Management:
Virtualization relies on efficient memory management to allocate and
control memory resources for each VM.
Technologies like Extended Page Tables (EPT) for Intel and Rapid
Virtualization Indexing (RVI) for AMD facilitate efficient memory
virtualization.
I/O Virtualization:
Virtual machines require access to I/O devices. I/O virtualization
techniques allow VMs to interact with physical devices while maintaining
isolation.
I/O MMU (Memory Management Unit) and technologies like Intel VT-d
and AMD-Vi enable direct device assignment to VMs.
Live Migration:
Live migration allows VMs to be moved between physical hosts without
disruption to services.
Technologies like VMware vMotion and Hyper-V Live Migration enable
seamless VM relocation, enhancing flexibility and resource utilization.
Containerization:
While not traditional virtualization, containerization technologies like
Docker and Kubernetes provide lightweight and portable environments
   for applications, sharing the host OS kernel. Containers offer advantages
   in terms of speed, resource efficiency, and scalability.
                                   PART C
1 With neat sketch, discuss about computer system overview.                             (Nov/Dec
. • A computer system consists of hardware devices and software that are
  combined to provide the user with a tool for solving problems.
  • Fig. 1.1.1 shows a modern computer system.
  • A modern computer consists of one or more CPUs with main memory and
  various I/O devices. A common bus is used for communication between these
  devices. Each device has its own device controller.
  • The CPU and device controllers use memory cycles for execution, but a
  memory cycle is available to only one device at a time.
  • The bootstrap program is loaded when the user starts the computer. It
  initializes all
the devices connected to the computer system and then loads the required
device drivers.
• After this, the operating system is loaded into the computer system. In
UNIX, 'init' is the first process executed by the OS.
• An interrupt may be software or hardware; it is used to send a signal to
the CPU. A software interrupt is sometimes called a system call.
• When an interrupt is triggered, the CPU stops executing the current
instruction and control is transferred to a fixed location, where the
starting address of the service routine is stored.
• Interrupts alter the normal control flow of the process executing on the
processor.
Storage structure
• The processor accesses data from main memory before executing any
instruction. Main memory is also called Random Access Memory (RAM).
• DRAM is used for main memory. Fig. 1.1.2 shows the hierarchy of storage
devices.
• At the top of the hierarchy is storage in the CPU registers, the fastest
form of storage for the CPU to access.
• Cache memory capacity is less than 1 MB.
• User programs and data are stored in main memory. Main memory is
volatile, so it cannot store data permanently.
• Storage systems are classified as temporary storage or permanent storage.
• Top-level storage devices have low capacity with faster CPU access, while
bottom-level storage devices have very large capacity with slower CPU
access speed.
I/O structure
• Every device uses a device controller to connect it to the computer's
address and data bus. Devices can be classified as block-oriented or
character-oriented, depending on the number of bytes transferred in an
individual operation.
• Storage devices are used to store data while the computer is off.
• All the I/O devices are connected to each other by a common bus. The
CPU and main memory are also connected to this bus.
• Various types of controllers are used in the computer system. A Small
Computer System Interface (SCSI) controller can handle up to seven
devices. Each device controller has its own buffer.
• The device controller manages the data transfer between the peripheral
device and its local buffer. The device controller is handled by the device
driver.
I/O operation steps
1. The device driver loads the registers within the device controller.
2. The device controller takes action according to the data loaded into the
registers.
3. Data is transferred from the device to its local buffer with the help of
the device controller.
4. After completion of the data transfer, the device controller sends an
interrupt signal to the device driver to report completion.
5. Control then returns to the operating system.
• The device driver is the operating system entity that controls CPU-I/O
parallelism; the software that communicates with the device controller is
called the device driver.
• A device can be started by the device driver, and the application program
can continue operation in parallel with the device operation.
Instruction Execution
• The program which is to be executed is a set of instructions which are
stored in memory. The Central Processing Unit (CPU) executes the
instructions of the program to complete a task.
• The major responsibility of the instruction execution is with the CPU. The
instruction execution takes place in the CPU registers.
• A special register contains the address of the instruction. The CPU
"fetches" the instruction from memory at that address.
• The CPU "decodes" the instruction to figure out what to do, then
"fetches" any data (operands) needed by the instruction from memory or
registers.
• Fig 1.1.3 shows the instruction execution cycle.
• The CPU "executes" the operation specified by the instruction on this
data, then "stores" any results into a register or memory.
• The registers used for instruction execution are as follows:
1. Memory Address Register (MAR): It specifies the address of the memory
location from which data or an instruction is to be accessed (for a read
operation) or to which the data is to be stored (for a write operation).
2. Program Counter (PC): It keeps track of the instruction which is to be
executed next, after the execution of an on-going instruction.
3. Instruction Register (IR): Here the instructions are loaded before their
execution.
Interrupts
• Definition: An interrupt is an event external to the currently executing
process that causes a change in the normal flow of instruction execution;
it is usually generated by hardware devices external to the CPU. Interrupts
tell the CPU to stop its current activities and execute the appropriate
part of the operating system.
• Fig. 1.1.4 shows interrupts.
• The CPU uses a table and the interrupt vector to find the OS code to
execute in response to interrupts. When an interrupt is signaled, the
processor executes a routine called an interrupt handler to deal with it.
• Operating system may specify a set of instructions, called an interrupt
handler, to be executed in response to each type of interrupt.
• The Interrupt Service Routine (ISR) is the software code that is executed
when the hardware requests an interrupt. The design of the interrupt
service routine requires careful consideration of many factors. Although
interrupt handlers can create and use local variables, parameter passing
between threads must be implemented using shared global memory variables.
• A private global variable can be used if an interrupt thread wishes to pass
information to itself, e.g., from one interrupt instance to another. The
execution of the main program is called the foreground thread, and the
executions of the various interrupt service routines are called background
threads.
• Interrupts are of three types: Hardware, software and trap.
• Hardware Interrupts are generated by hardware devices to signal that they
need some attention from the OS. Software Interrupts are generated by
programs when they want to request a system call to be performed by the
operating system.
• Traps are generated by the CPU itself to indicate that some error or
condition occurred for which assistance from the operating system is
needed.
• Interrupt and trap numbers are defined by the hardware which is also
responsible for calling the procedure in the kernel space. An interrupt
handler is called in response to a signal from another device while a trap
handler is called in response to an instruction executed within the CPU.
• Synchronous interrupts occur when a process attempts to perform an
illegal action, such as dividing by zero or referencing a protected memory
location. Synchronous interrupt occurs due to software execution.
Cache Memory and Hierarchy
• Cache is a small, high-speed memory derived from SRAM.
• Caches are useful when two or more components need to exchange data
and the components perform transfers at differing speeds. Caches solve the
transfer problem by providing a buffer of intermediate speed between the
components.
• Caches are useful because they can increase the speed of the average
memory access, and they do so without taking up as much physical space as
the lower elements of the memory hierarchy.
• Data is normally kept in a storage system; when it is required, it is
copied into a faster storage system, i.e. cache memory, on a temporary
basis.
• When the user requires particular data, the system checks whether it is
in the cache. If the data is found, it is used directly from the cache; if
not, it is used from the source.
• Internal programmable registers, such as index registers, provide a high-
speed cache for main memory. The programmer implements the register-
allocation and register-replacement algorithms to decide which information
to keep in registers and which to keep in main memory.
• If the fast device finds the data it needs in the cache, it need not wait
for the slower device. The data in the cache must be kept consistent with
the data in the components.
• If a component's data value changes, and the datum is also in the cache,
the cache must also be updated. This is especially a problem on
multiprocessor systems where more than one process may be accessing a
datum.
• A component may be eliminated by an equal-sized cache, but only if the
cache and the component have equivalent state-saving capacity and the
cache is affordable, because faster storage tends to be more expensive.
• Unfortunately, caches also introduce an additional level of complexity
(coherency and consistency assurance). We also incur an economic and
space penalty when we add a cache.
• Making a cache as large as a disk would be ineffective because it would
be too costly, the immense size would slow it down, and a cache is
generally volatile memory, while we want data on disks to be persistent.
• A cache holds data temporarily to reduce the time required to service
I/O requests from the host. Fig. 1.1.5 shows the memory hierarchy.
• Cache in the CPU unit is referred to as Level 1 cache (L1 cache); cache
stored in a chip next to the CPU is termed Level 2 cache (L2 cache) and
resides on the motherboard.
• The function of a cache is to act as a buffer between a relatively fast
device and a relatively slow one.
• The smallest unit of allocation in a cache is a page; the cache is
arranged into slots or pages. It uses two components: a data store and a
tag RAM. The actual data is stored in the data store, while the values
stored in the tag RAM determine whether a cache lookup results in a hit
or a miss.
• The size of the tag RAM determines what range of main memory can be
cached. Tag RAM is a small piece of SRAM that stores the addresses of the
data held in the data store.
• A cache address can be specified simply by index and offset. The tag is
kept to allow the cache to translate from a cache address to a unique CPU
address.
• A cache hit means that the CPU tried to access an address and a matching
cache block was available in the cache, so the cache did not need to
access RAM. In a cache miss, the CPU tries to access an address and there
is no matching cache block, so the cache is forced to access RAM.
• Cache includes tags to identify which block of main memory is in each
cache slot.
• Every cache block has associated with it at least the modify and valid bits,
and a tag address. The valid bit says if the cache block is used or is unused.
The modify bit makes sense only if the valid bit is set. The modify bit says
whether the data in the cache block is different from RAM or is the same as
RAM.
• The size of a page depends on the size of the cache and how the cache is
organized. A cache page is broken into smaller pieces, each called a cache
line. The size of a cache line is determined by both the processor and the
cache design.
• When the processor starts a read cycle, the cache checks to see if that
address is a cache hit.
• Cache Hit: If the cache contains the memory location, then the cache will
respond to the read cycle and terminate the bus cycle.
• Cache Miss: A reference to an item that is not resident in the cache but
is resident in main memory. For cache misses, the fast memory is the
cache and the slow memory is main memory.
Direct Memory Access Structure
• Direct Memory Access (DMA) is one of several methods for coordinating
the timing of data transfers between an input/output (I/O) device and the
core processing unit or memory in a computer. DMA is one of the faster
types of synchronization mechanisms.
• Without DMA, the processor is responsible for the physical movement of
  data between main memory and a device. A special control unit may be
  provided to allow transfer of a block of data directly between an external
  device and the main memory, without continuous intervention by the
  processor. This approach is called direct memory access.
  • Fig. 1.1.6 shows simplified diagram of DMA controller.
  • The DMA controller uses hold request (HOLD) and hold acknowledge
  (HLDA) signals to ask the CPU to stop driving the address, data and
  control buses so that the DMA controller can drive them to carry out a
  transfer.
  • A DMA controller implements direct memory access in a computer
  system. It connects directly to the I/O device at one end and to the system
  buses at the other end. It also interacts with the CPU, both via the system
  buses and two new direct connections.
  • The device controller transfers blocks of data from buffer storage
  directly to main memory without CPU intervention. Only one interrupt is
  generated per block, rather than one interrupt per byte.
  • The DMA controller may either stop the CPU and access the memory, or
  use the bus while the CPU is not using it.
2 Give reasons why caches are useful. What problems do they solve and                  (Apr/May
. cause? If a cache can be made as large as the device for which it is                   2018)
  caching why not make it that large and eliminate the device?
  Why Caches are Useful:
  Faster Access Times:
  Caches store frequently accessed data and instructions closer to the CPU,
reducing the time it takes to retrieve information compared to fetching it
from slower main memory.
Improved System Performance:
By minimizing the time spent waiting for data, caches significantly enhance
overall system performance. This is crucial for tasks that require quick
access to data, such as running applications or processing instructions.
Reduced Latency:
Caches help reduce memory access latency, allowing the CPU to operate
more efficiently by fetching data and instructions at a faster rate.
Bandwidth Conservation:
Caches help conserve memory bandwidth by reducing the frequency of
requests to the slower main memory. This is particularly important in
systems with limited memory bandwidth.
Energy Efficiency:
Accessing data from caches consumes less energy compared to fetching
data from main memory. This contributes to energy efficiency and can be
crucial in battery-powered devices.
Problems Caches Solve:
Memory Latency:
Caches address the issue of memory latency by providing a smaller, faster
storage space for frequently used data, reducing the need to access the
slower main memory.
Limited Bandwidth:
Caches help manage limited memory bandwidth by storing frequently
accessed data locally, minimizing the impact on the overall system
bandwidth.
CPU Efficiency:
Caches improve CPU efficiency by reducing idle time spent waiting for
data, allowing the processor to execute instructions more quickly.
Issues Caches May Cause:
Cache Coherency:
Maintaining consistency between the contents of the cache and the main
memory can be challenging. Cache coherency protocols are implemented to
address this issue.
Complexity and Cost:
  Implementing and managing caches can add complexity to the design of a
  system, and larger, more sophisticated caches can increase production costs.
  Cache Pollution:
  In certain scenarios, the cache may store unnecessary or outdated data,
  leading to inefficiencies. Cache replacement policies are employed to
  address this issue.
  Cache Size and Device Elimination:
  While it may seem logical to make caches as large as the device they are
  caching, there are practical limitations and trade-offs involved:
  Cost and Complexity:
  Larger caches are more expensive to manufacture and can add complexity
  to the system design. There's a trade-off between cache size, cost, and the
  benefits gained from increased cache capacity.
  Diminishing Returns:
  As the cache size increases, the improvement in performance diminishes.
  There is a point of diminishing returns where the added cost and complexity
  may not be justified by the marginal performance gains.
  Physical Space:
  Larger caches require more physical space on the chip, and in many cases,
  the available space is limited. Designers need to balance the size of the
  cache with other essential components on the chip.
  Power Consumption:
  Larger caches may consume more power, impacting energy efficiency. In
  mobile devices or battery-powered systems, this is a critical consideration.
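The hit/miss behaviour that makes caches worthwhile can be made concrete with a toy simulation. The sketch below is illustrative only: a tiny direct-mapped cache with hypothetical parameters (4 lines, integer addresses), not a model of any real hardware. After a first "warm-up" pass fills the lines, repeated accesses to the same working set hit every time:

```python
# Minimal direct-mapped cache simulation: counts hits and misses for a
# sequence of memory accesses (illustrative sketch, not real hardware).

class DirectMappedCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = [None] * num_lines   # the tag currently held by each line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        index = address % self.num_lines  # which cache line this address maps to
        tag = address // self.num_lines   # identifies which block is resident
        if self.lines[index] == tag:
            self.hits += 1
            return "hit"
        self.lines[index] = tag           # on a miss, evict and refill the line
        self.misses += 1
        return "miss"

cache = DirectMappedCache(num_lines=4)
# A loop that repeatedly touches the same few addresses hits after warm-up.
for _ in range(3):
    for addr in (0, 1, 2, 3):
        cache.access(addr)
print(cache.hits, cache.misses)  # 8 hits, 4 misses
```

The first pass produces 4 compulsory misses; the next two passes are all hits, which is exactly the locality-of-reference effect the answer above describes.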
3. Discuss the functionality of system boot with respect to an                           (Nov/Dec
   operating system.                                                                       2018)
  Booting in Operating System
  Booting is the process of starting a computer. It can be initiated by hardware
  such as a button press or by a software command. After it is switched on, a
  CPU has no software in its main memory, so some process must load
  software into memory before execution. This may be done by hardware or
  firmware in the CPU or by a separate processor in the computer system.
  Sequencing of Booting
  Booting is a start-up sequence that starts the operating system of a computer
  when it is turned on. A boot sequence is the initial set of operations that the
  computer performs when it is switched on. Every computer has a boot
sequence.
  Types of Booting
  There are two types of booting in an operating system: cold (hard) booting,
  performed from a powered-off state, and warm (soft) booting, performed by
  restarting a machine that is already on.
Booting Process in Operating System
  When a computer is switched on, whether by a hardware action such as a
  button press or by a software command, its central processing unit (CPU)
  has no software in main memory, so some process must load software into
  main memory before it can be executed. Below are the six steps that
  describe the boot process in the operating system:
Step 1: Once the computer system is turned on, BIOS (Basic Input /Output
System) performs a series of activities or functionality tests on programs
stored in ROM, called on POST (Power-on Self Test) that checks to see
whether peripherals in the system are in perfect order or not.
  Step 2: After the BIOS is done with pre-boot activities or functionality tests,
  it reads the boot sequence from CMOS (Complementary Metal-Oxide-
  Semiconductor) memory and looks for the master boot record in the first
  physical sector of the bootable disk, as per the boot device sequence
  specified in CMOS. For example, if the boot device sequence is:
      o   Floppy Disk
      o   Hard Disk
      o   CDROM
  Step 3: After this, the system searches for the master boot record, first on
  the floppy disk drive. If it is not found there, the hard disk drive is
  searched; if the master boot record is not present on the hard disk either,
  the CDROM drive is searched. If the system cannot read the master boot
  record from any of these sources, the ROM displays "No Boot device
  found" and halts the system. On finding the master boot record on a
  particular bootable disk drive, the operating system loader, also called the
  bootstrap loader, is loaded from the boot sector of that bootable drive into
  memory. A bootstrap loader is a special program that is present in the boot
  sector of a bootable drive.
  Step 4: The bootstrap loader first loads the IO.SYS file. After
  this, MSDOS.SYS file is loaded, which is the core file of the DOS operating
  system.
  Step 5: After this, MSDOS.SYS searches for a Command Interpreter entry
  in the CONFIG.SYS file and, when one is found, loads it into memory. If
  no Command Interpreter is specified in the CONFIG.SYS file,
  the COMMAND.COM file is loaded as the default Command Interpreter of
  the DOS operating system.
  Step 6: The last file to be loaded and executed is
  the AUTOEXEC.BAT file, which contains a sequence of DOS commands.
  After this, the prompt is displayed, showing the drive letter of the bootable
  drive, which indicates that the operating system has been successfully
  loaded from that drive.
4. Discuss the essential properties of the following types of systems:                   (Nov/Dec
     i) Time sharing systems ii) Multi-processor systems                                   2018)
   iii) Distributed systems
  i) Time-Sharing Systems:
  Time Division Multiplexing:
  Time-sharing systems enable multiple users to share the resources of a
  single computer simultaneously.
  Time division multiplexing is used to allocate time slices or "time slots" to
  each user, allowing them to interact with the system.
User Interactivity:
Time-sharing systems prioritize user interactivity, providing each user with
the illusion of dedicated access to the system.
Users can execute commands, run programs, and interact with the system in
real-time.
Fairness and Responsiveness:
Fairness is a crucial property, ensuring that each user receives a fair share of
system resources during their allocated time slice.
The system should be responsive to user input, minimizing latency and
providing a smooth user experience.
Concurrency and Resource Sharing:
Time-sharing systems support concurrent execution of multiple tasks,
allowing users to run their programs simultaneously.
Resource sharing mechanisms ensure that users can access and utilize
shared resources like the CPU, memory, and peripherals.
Security and Isolation:
Security mechanisms are essential to prevent unauthorized access to other
users' data and resources.
Isolation ensures that each user operates within their designated
environment without affecting others.
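The time-division multiplexing described above can be sketched as a round-robin simulation. The code below is a minimal illustration with hypothetical process names and burst times (P1, P2, P3 are invented for this example); each process repeatedly receives a fixed time slice until its work is done:

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin time slicing.
    bursts: {name: burst_time}; all processes are assumed to arrive at time 0.
    Returns the completion time of each process."""
    remaining = dict(bursts)
    queue = deque(bursts)          # ready queue, in arrival order
    t = 0
    finish = {}
    while queue:
        p = queue.popleft()
        run = min(quantum, remaining[p])   # run for one time slice at most
        t += run
        remaining[p] -= run
        if remaining[p] > 0:
            queue.append(p)        # not done: go to the back of the queue
        else:
            finish[p] = t          # done: record completion time
    return finish

finish = round_robin({"P1": 5, "P2": 3, "P3": 1}, quantum=2)
print(finish)  # {'P3': 5, 'P2': 8, 'P1': 9}
```

Because every process gets a slice in turn, the short job P3 finishes early even though it arrived last in the queue, which is the fairness/responsiveness property time-sharing systems aim for.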
ii) Multi-Processor Systems:
Parallel Processing:
Multi-processor systems involve multiple CPUs (Central Processing Units)
working together simultaneously to execute tasks.
Parallel processing enables faster computation and improved system
performance.
Scalability:
Multi-processor systems are scalable, allowing for the addition of more
processors to enhance system capabilities.
Scalability is crucial for adapting to changing computational requirements.
Load Balancing:
Load balancing ensures that tasks are distributed evenly among the
processors, preventing bottlenecks and maximizing resource utilization.
Efficient load balancing contributes to optimal system performance.
Inter-Processor Communication:
Inter-processor communication mechanisms enable collaboration and data
sharing between different processors.
Efficient communication is vital for coordinating tasks and maintaining
consistency in shared data.
Fault Tolerance:
Multi-processor systems may incorporate fault-tolerant mechanisms to
handle hardware failures.
 Redundancy and error recovery strategies help maintain system reliability.
 iii) Distributed Systems:
 Resource Sharing and Coordination:
 Distributed systems involve multiple interconnected computers that work
 together to achieve common goals.
 Resource sharing and coordination mechanisms enable seamless
 collaboration among distributed components.
 Transparency:
 Distributed systems aim to provide transparency to users, concealing the
 complexity of distributed architecture.
 Transparency includes location transparency, access transparency, and
 failure transparency.
 Scalability and Flexibility:
 Distributed systems are designed to be scalable, accommodating an
 increasing number of nodes or users.
 Flexibility allows for easy adaptation to changes in system size,
 configuration, or workload.
 Reliability and Fault Tolerance:
 Distributed systems often incorporate redundancy and fault-tolerant
 mechanisms to handle failures gracefully.
 Distributed data storage and processing ensure reliability in the face of
 component failures.
 Security:
 Security is a critical consideration in distributed systems to protect data and
 communication across networked components.
 Authentication, encryption, and access control mechanisms contribute to a
 secure distributed environment.
                              UNIT II PROCESS MANAGEMENT
Processes - Process Concept - Process Scheduling - Operations on Processes - Inter-process
Communication; CPU Scheduling - Scheduling criteria - Scheduling algorithms: Threads –
Multithread Models – Threading issues; Process Synchronization - The Critical-Section problem –
Synchronization hardware – Semaphores – Mutex - Classical problems of synchronization -
Monitors; Deadlock - Methods for handling deadlocks, Deadlock prevention, Deadlock avoidance,
Deadlock detection, Recovery from deadlock.
                                            PART - A
S.No.                         Question and Answer                         CO        Univ QP
                                                                                 (Month/Year)
    1. Compare and contrast Single-threaded and multi-threaded CO2                 (Apr/May
        process.                                                                      2017)
        Single-threading is the processing of one command/process at a
        time, whereas multithreading is a widespread programming and
        execution model that allows multiple threads to exist within the
        context of one process. These threads share the process's
        resources, but are able to execute independently.
2.    Priority inversion is a condition that occurs in real time         CO2      (Apr/May
      systems – Analyzing on this statement.                                        2017)
      Priority inversion is a problem that occurs in concurrent
      processes when low-priority threads hold shared resources
      required by some high-priority threads, causing the high
       priority-threads to block indefinitely. This problem is amplified
       when the concurrent processes are in a real-time system where
       high-priority threads must be served on time.
     Priority inversion occurs when task interdependency exists
     among tasks with different priorities.
3.   Distinguish between CPU bounded, I/O bounded processes.             CO2       (Nov/Dec
     CPU bound process, spends majority of its time simply using the                 2016)
     CPU (doing calculations).
     I/O bound process, spends majority of its time in input/output
     related operations.
4.   What resources are required to create a thread?                      CO2       (Nov/Dec
     When a thread is created, it does not require any new                           2016)
     resources to execute. The thread shares the resources of the
     process to which it belongs, and it requires only a small data
     structure to hold a register set, stack, and priority.
5.   Under what circumstances user level threads are better than         CO2      (May/June
     the kernel level threads?                                                       2016)
     User-Level threads are managed entirely by the run-time system                (Nov/Dec
     (user-level library).The kernel knows nothing about user-level                  2015)
     threads and manages them as if they were single-threaded
     processes. User-Level threads are small and fast, each thread is
     represented by a PC, register, stack, and small thread control
     block. Creating a new thread, switching between threads, and
     synchronizing threads are done via procedure call. i.e. no kernel
      involvement. User-level threads are a hundred times faster than
      kernel-level threads.
     User level threads are simple to represent, simple to manage and
     fast and efficient.
6.   What is the meaning of the term busy waiting?                       CO2      (May/June
     Busy-waiting, busy-looping or spinning is a technique in which a                2016)
     process repeatedly checks to see if a condition is true.                      (Nov/Dec
                                                                                     2018)
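The spinning described in the answer above can be sketched in a few lines. This is an illustrative toy (the flag and the 0.05 s delay are invented for the example); one thread burns CPU re-checking a condition while another thread eventually makes it true:

```python
import threading
import time

flag_ready = False

def spin():
    # Busy waiting: the thread burns CPU cycles re-checking the condition
    # in a tight loop instead of sleeping until it is signalled.
    while not flag_ready:
        pass

def setter():
    global flag_ready
    time.sleep(0.05)       # simulate the awaited event arriving later
    flag_ready = True

waiter = threading.Thread(target=spin)
signaller = threading.Thread(target=setter)
waiter.start(); signaller.start()
waiter.join(); signaller.join()
print("condition observed:", flag_ready)
```

A blocking alternative such as `threading.Event.wait()` yields the CPU instead of spinning, which is why busy waiting is usually reserved for very short expected waits (e.g. spinlocks on multiprocessors).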
7. List out the data fields associated with process control blocks.      CO2     (April/May
   Process ID, pointers, process state, priority, program counter,                 2015)
   CPU registers, I/O information, Memory management
   information, Accounting information, etc.
8. Define the term ‘Dispatch Latency”.                                   CO2 (April/May
      The term dispatch latency describes the amount of time it takes             2015)
      for a system to respond to a request for a process to begin
      operation.
9.    What is the concept behind strong semaphore and spinlock?             CO2     (Nov/Dec
      Strong semaphores specify the order in which processes are                      2015)
      removed from the queue (FIFO order), which guarantees avoiding
      starvation.
      Spinlock is a lock which causes a thread trying to acquire it to
      simply wait in a loop ("spin") while repeatedly checking if the
      lock is available.
10.   What are the benefits of synchronous and asynchronous                 CO2     (Apr/May
      communication?                                                                  2018)
      A benefit of synchronous communication is that it allows a
      rendezvous between the sender and receiver.
      A benefit of asynchronous communication is that it is non-blocking:
      the sender merely initiates the operation and continues without waiting.
11.   Can a multithreaded solution using multiple user-level                CO2     (Nov/Dec
      threads achieve better performance on a multiprocessor                          2018)
      system than on a single- processor system?
      A multithreaded system comprising of multiple user-level threads
      cannot make use of the different processors in a multiprocessor
      system simultaneously.
12.   What is process control block? List out the data fields            CO2     (Apr/May
      associated with PCB.                                                         2015)
      Each process is represented in the operating system by a process
      control block (PCB), also called a task control block.
                          Process state
                          Process number
                          Program counter
                          CPU registers
                          Memory limits
                          List of open files
                          CPUscheduling information
                          Memory management information
                          Accounting information
                          I/O status information
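The fields listed above can be pictured as a record. The sketch below is a hypothetical PCB written as a Python dataclass purely for illustration; a real kernel keeps this in a C structure (for example, `task_struct` in Linux), and the field names here are chosen to mirror the list above, not any actual kernel layout:

```python
from dataclasses import dataclass, field

# Hypothetical process control block mirroring the fields listed above.
@dataclass
class PCB:
    pid: int                                     # process number
    state: str = "new"                           # new/ready/running/waiting/terminated
    program_counter: int = 0                     # next instruction address
    registers: dict = field(default_factory=dict)
    priority: int = 0                            # CPU-scheduling information
    open_files: list = field(default_factory=list)
    memory_limits: tuple = (0, 0)                # memory-management information
    accounting: dict = field(default_factory=dict)

pcb = PCB(pid=42)
pcb.state = "ready"        # the dispatcher would update this on each transition
print(pcb.pid, pcb.state)  # 42 ready
```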
13.   Differentiate single threaded and multithreaded processes.            CO2     (Apr/May
                                                                                      2017)
    Single threaded processes contain the execution of instructions in
    a single sequence. In other words, one command is processed at a
    time. The opposite of single threaded processes are multithreaded
    processes. These processes allow the execution of multiple parts of
    a program at the same time.
14. What is a thread? Write its benefits.                                CO2      (Apr/May
    A thread otherwise called a lightweight process (LWP) is a                      2019)
    basic unit of CPU utilization, it comprises of a thread ID, a
    program counter, a register set and a stack. It shares with
    other threads belonging to the same process its code
    section, data section, and operating system resources such
    as open files and signals.
15. Define the term ‘Dispatch Latency’.                                  CO2      (Apr/May
    The term dispatch latency describes the amount of time it                       2015)
    takes for a system to respond to a request for a process to
    begin operation. With a scheduler written specifically to
    honor application priorities, real-time applications can be
    developed with a bounded dispatch latency.
16. What are the difference between user level threads and kernel CO2             (Apr/May
    level threads?                                                                  2019)
           User threads
        User threads are supported above the kernel and are
           implemented by a thread library at the user level.
        Thread creation & scheduling are done in the user
           space, without kernel intervention.
         Therefore they are fast to create and manage
         A blocking system call will cause the entire process to
            block
           Kernel threads
        Kernel threads are supported directly by the
           operating system.
        Thread creation, scheduling and management are
           done by the operating system.
        Therefore they are slower to create & manage
           compared to user threads.
        If the thread performs a blocking system call, the
           kernel can schedule another thread in the application
           for execution
17. What is semaphore? Mention its importance in operating CO2                   (Nov/Dec
    system.                                                                        2011)
    A semaphore 'S' is a synchronization tool which is an integer
    value that, apart from initialization, is accessed only through two
    standard atomic operations; wait and signal. Semaphores can be
    used to deal with the n-process critical section problem. It can be
    also used to solve various Synchronization problems.
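The wait and signal operations map directly onto `threading.Semaphore` in Python. The sketch below is illustrative (the worker function, thread count, and 0.01 s "work" delay are invented for the example): a counting semaphore initialized to 2 never allows more than two threads inside the guarded region at once:

```python
import threading
import time

sem = threading.Semaphore(2)   # counting semaphore: at most 2 concurrent holders
lock = threading.Lock()        # protects the bookkeeping counters below
in_section = 0
max_seen = 0

def worker():
    global in_section, max_seen
    sem.acquire()                  # the classical wait() / P() operation
    with lock:
        in_section += 1
        max_seen = max(max_seen, in_section)
    time.sleep(0.01)               # simulate work inside the guarded region
    with lock:
        in_section -= 1
    sem.release()                  # the classical signal() / V() operation

threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("max concurrent:", max_seen)   # never exceeds the initial value 2
```

With the semaphore initialized to 1 instead of 2, this becomes a binary semaphore and solves the n-process critical section problem mentioned above.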
18. Define deadlock.                                                    CO2      (Nov/Dec
    A process requests resources; if the resources are not                         2021)
    available at that time, the process enters a wait state.
    Waiting processes may never again change state, because
    the resources they have requested are held by other waiting
    processes. This situation is called a deadlock.
19. Draw & briefly explain the process states?                          CO2      (Nov/Dec
                                                                                   2017)
                                                                                 (Nov/Dec
                                                                                   2019)
           New- The process is being created.
           Running – Instructions are being executed
           Waiting – The process is waiting for some event to
              occur
           Ready – The process is waiting to be assigned a
              processor
           Terminated - the process has finished execution
20. List out the benefits and challenges of process scheduling         CO2 Apr/May 2019
    Benefits of Process Scheduling:
          Improved CPU Utilization:
          Enhanced Throughput:
          Fair Resource Allocation:
          Reduced Response Time:
          Efficient Multitasking:
          Resource Utilization:
          Priority Management:
    Challenges of Process Scheduling:
        Complexity:
        Resource Contention:
        Starvation:
        Real-Time Constraints:
        Context Switching Overhead:
        Algorithm Selection:
        Dynamic Workloads:
        I/O-Bound Processes:
21. Give the queuing diagram representation of process                   CO2 Apr/May 2019
    scheduling
22. What are the conditions must hold for a deadlock to occur?           CO2
    The four necessary conditions for a deadlock situation are
          mutual exclusion,
          no preemption,
          hold and wait and
          circular wait.
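Breaking any one of these four conditions prevents deadlock. A common technique is to break circular wait by imposing a fixed global order on lock acquisition; the sketch below (illustrative, with invented lock and task names) shows two threads that both need two locks but always take them in the same order, so no wait-for cycle can form:

```python
import threading

# Two locks that two threads both need. Acquiring them in one fixed global
# order (lock_a before lock_b) removes the "circular wait" condition, so
# the four deadlock conditions can never all hold simultaneously.
lock_a = threading.Lock()
lock_b = threading.Lock()
done = []

def task(name):
    with lock_a:          # every thread takes lock_a first...
        with lock_b:      # ...then lock_b: no cycle can form
            done.append(name)

t1 = threading.Thread(target=task, args=("t1",))
t2 = threading.Thread(target=task, args=("t2",))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(done))  # ['t1', 't2'] — both finish; no deadlock
```

If one thread instead took lock_b first, a schedule exists in which each thread holds one lock and waits forever for the other: exactly the circular wait this ordering rule forbids.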
23. Elucidate mutex locks with an example.                               CO2
    A mutex lock makes it possible to implement mutual exclusion
    by limiting to one the number of threads or processes that can
    hold the lock at a time. A thread must first acquire the mutex
    lock protecting a shared resource before it can access that
    resource, and must release the lock afterwards.
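As a concrete example (an illustrative sketch using Python's `threading.Lock`; the counter and thread counts are invented for the demonstration), four threads increment a shared counter, and the mutex makes each read-modify-write atomic:

```python
import threading

counter = 0
mutex = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with mutex:        # acquire; automatically released on block exit
            counter += 1   # the read-modify-write is now effectively atomic

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

Without the lock, two threads could read the same old value of the counter and both write back value + 1, losing updates.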
24. What is the meaning of the term busy waiting?                        CO2
    Busy waiting, also known as spinning, or busy looping is a
    process synchronization technique in which a process/task waits
    and constantly checks for a condition to be satisfied before
    proceeding with its execution
25. Under what circumstances do CPU scheduling decisions take            CO2
    place?
     When a process switches from running state to waiting state
     When a process switches from running state to ready state
     When a process switches from waiting state to ready state
     When a process terminates
                                           PART - B
S.No                        Question and Answer                             CO       Univ QP
  .                                                                                (Month/Year)
 1.  Suppose that the following processes arrive for execution at the       CO2     (Nov/Dec
     times indicated. Each process will run the listed amount of                      2018)
     time. In answering the questions, use non-preemptive                         (Apr/May2017)
     scheduling and base all decisions on the information you have                (Apr/May2018)
     at the time the decision must be made.
     Process Arrival Time Burst Time
     P1          0.0            8
     P2          0.4            4
     P3          1.0            1
      a. Find the average turnaround time for these processes with
      the FCFS scheduling algorithm?
      b. Find the average turnaround time for these processes with
      the SJF scheduling algorithm?
      c. The SJF algorithm is supposed to improve performance, but
      notice that we chose to run process P1 at time 0 because we did
      not know that two shorter processes would arrive soon. Find
      what is the average turnaround time will be if the CPU is left
      idle for the first 1 unit and then SJF scheduling is used.
      Remembering that processes P1 and P2 are waiting during this
      idle time, so their waiting time may increase. This algorithm
      could be known as future-knowledge scheduling.
      a. What is the average turnaround time for these processes with the
      FCFS scheduling algorithm?
       Solution: Average turnaround for these processes: ( 8 + (12 - 0.4)
      + (13 - 1)) / 3 = 10.53
      b. What is the average turnaround time for these processes with the
      SJF scheduling algorithm? Solution: Average turnaround for these
      processes: ( 8 + (9 - 1) + (13 – 0.4)) / 3 = 9.53
      c. The SJF algorithm is supposed to improve performance, but
      notice that we choose to run process P1 at time 0 because we did
     not know that two shorter processes would arrive soon. Compute
     what the average turnaround time will be if the CPU is left idle for
     the first 1 unit and then SJF scheduling is used.
     Remember that processes P1 and P2 are waiting during this idle
      time, so their waiting time may increase. The algorithm could be
      known as future-knowledge scheduling.
      Solution: The CPU is left idle for the first 1 unit. Average turnaround
      for these processes: ((2 - 1) + ( 6 – 0.4 ) + ( 14 - 0)) / 3 = 6.87
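The three averages above can be checked mechanically. In non-preemptive scheduling, once an execution order is fixed, every algorithm reduces to "run the processes in this order", so one helper suffices; parts (a), (b), and (c) differ only in the order passed in (a small sketch written for this exercise):

```python
def avg_turnaround(procs):
    """procs: list of (arrival, burst) in execution order, non-preemptive.
    Returns average turnaround time (completion - arrival)."""
    t, total = 0.0, 0.0
    for arrival, burst in procs:
        t = max(t, arrival) + burst   # wait for arrival if the CPU is idle
        total += t - arrival          # turnaround for this process
    return total / len(procs)

# (a) FCFS order: P1, P2, P3
print(round(avg_turnaround([(0.0, 8), (0.4, 4), (1.0, 1)]), 2))  # 10.53
# (b) SJF: P1 first (only arrival at t=0), then P3 (burst 1), then P2 (burst 4)
print(round(avg_turnaround([(0.0, 8), (1.0, 1), (0.4, 4)]), 2))  # 9.53
# (c) Future knowledge: idle until t=1, then P3, P2, P1
print(round(avg_turnaround([(1.0, 1), (0.4, 4), (0.0, 8)]), 2))  # 6.87
```

The `max(t, arrival)` term is what models the deliberate idle period in part (c): the CPU waits until P3 arrives at t = 1 before starting.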
2.    State the critical section problem. Discuss three conditions a             (Apr/May 2017)
      solution to the critical section problem must satisfy.
     The critical section is a code segment where the shared variables
     can be accessed. An atomic action is required in a critical section
     i.e. only one process can execute in its critical section at a time. All
     the other processes have to wait to execute in their critical sections.
      The general structure of a process with a critical section consists of
      an entry section, the critical section itself, an exit section, and a
      remainder section. The entry section handles the entry into the
      critical section: it acquires the resources needed for execution by
      the process. The exit section handles the exit from the critical
      section: it releases the resources and also informs the other
      processes that the critical section is free.
     Solution to the Critical Section Problem
     The critical section problem needs a solution to synchronize the
     different processes. The solution to the critical section problem
     must satisfy the following conditions −
           Mutual Exclusion
            Mutual exclusion implies that only one process can be
            inside the critical section at any time. If any other processes
            require the critical section, they must wait until it is free.
           Progress
            Progress means that if a process is not using the critical
            section, then it should not stop any other process from
            accessing it. In other words, any process can enter a critical
            section if it is free.
            Bounded Waiting
             Bounded waiting means that each process must have a
             limited waiting time. It should not wait endlessly to access
             the critical section.
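A classic software solution satisfying all three conditions for two processes is Peterson's algorithm. The sketch below is illustrative only: it relies on CPython's global interpreter lock to provide a sequentially consistent interleaving; on real hardware, Peterson's algorithm additionally needs memory barriers, and the iteration count and switch interval are tuning choices made for this demo:

```python
import sys
import threading

sys.setswitchinterval(1e-4)  # shorten GIL switch interval so busy waits stay brief

N = 2000
flag = [False, False]   # flag[i]: process i wants to enter its critical section
turn = 0                # whose turn it is to yield
counter = 0             # shared variable updated only inside the critical section

def process(i):
    global turn, counter
    other = 1 - i
    for _ in range(N):
        # entry section
        flag[i] = True
        turn = other                       # politely give way to the other process
        while flag[other] and turn == other:
            pass                           # busy wait (bounded: at most one turn)
        # critical section
        counter += 1
        # exit section
        flag[i] = False

t0 = threading.Thread(target=process, args=(0,))
t1 = threading.Thread(target=process, args=(1,))
t0.start(); t1.start()
t0.join(); t1.join()
print(counter)  # 4000: every increment survived, so mutual exclusion held
```

Mutual exclusion holds because a process enters only when the other's flag is down or the turn favours it; progress and bounded waiting follow from the `turn` variable alternating whenever both contend.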
3.    Illustrate an example situation in which ordinary pipes are                   (Nov/Dec
      more suitable than named pipes and an example situation in                      2016)
      which named pipes are more suitable than ordinary pipes.
     A pipe is a technique for passing information from one process to
     another.
     Ordinary pipes
     Ordinary pipes allow two processes to communicate in standard
     producer–consumer fashion where the producer writes to one end
     of the pipe and the consumer reads from the other end. Thus they
     allow only one-way communication meaning they are
     unidirectional. An ordinary pipe cannot be accessed from outside
     the process that created it. Generally we use ordinary pipes when a
      parent process wants to communicate with its children. Once the
      processes have finished communicating and have terminated, the
      ordinary pipe ceases to exist. On the other hand, named pipes
      provide a bidirectional communication tool and don't
      require a parent–child relationship. Once a named pipe is
      established, several processes can use it for communication.
     From the above we can conclude that ordinary pipes work better
     with simple communication. An example could be a process which
     counts characters in a file. In this case
     The producer would write the file to the pipe
     The consumer would read the file and counts the number of
     characters
     Next, for an example where named pipes are more suitable when
     several processes need to write in the same time. An example
     could be when several processes need to write messages to a log.
     A log is a simply a file that records events that occur in the OS or
     messages between users of a communication software. So in a log
      file several processes should write messages. In this case:
      Several processes can write messages to the named pipe
      The messages would be read by a server from the named pipe
      The server, after reading the messages, writes them to a log file.
     Result
     Ordinary pipes work better with simple communication. Named
     pipes work better when several processes need to write messages.
     Ordinary pipes allow communication between parent and child
     processes, while named pipes permit unrelated processes to
     communicate.
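The character-counting example above can be sketched with an ordinary (anonymous) pipe. For brevity this illustration writes and reads within one process; normally the read and write ends would be split between parent and child after fork(). Named pipes would instead be created on POSIX systems with `os.mkfifo()` and opened by unrelated processes via a filesystem path:

```python
import os

# Ordinary (anonymous) pipe: unidirectional byte stream.
r, w = os.pipe()                          # r = read end, w = write end
os.write(w, b"count these characters")   # the "producer" writes the data
os.close(w)                              # closing the write end signals EOF
data = os.read(r, 1024)                  # the "consumer" reads it back
os.close(r)
print(len(data))  # 22 — the character count the consumer would report
```

Note the one-way flow: data written to `w` can only be read from `r`, which is exactly the producer-consumer, parent-child pattern described above.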
4.    Explain: why interrupts are not appropriate for                               (Nov/Dec 2016)
      implementing synchronization primitives in multiprocessor                     (Nov/Dec 2018)
      systems.
      In multiprocessor systems, where multiple processors or cores
      operate concurrently, implementing synchronization primitives
      using interrupts may not be the most appropriate approach.
      Interrupts are typically used to handle asynchronous events and
      provide a mechanism for a processor to temporarily suspend its
      current execution to respond to an external event. However,
      there are several reasons why interrupts may not be suitable for
      synchronization in multiprocessor systems:
      Unpredictable Timing: Interrupts can be triggered at any time,
      making their timing unpredictable. In a multiprocessor
      environment, where multiple processors are executing
      instructions concurrently, relying on interrupts for
      synchronization can lead to non-deterministic behavior and race
      conditions. This unpredictability makes it challenging to ensure
      proper coordination among processors.
      Concurrency Issues: Multiprocessor systems involve multiple
      threads of execution running in parallel. If interrupts are used for
      synchronization, it can introduce concurrency issues, such as
      data races and inconsistent states, as multiple processors may be
      interrupted simultaneously and try to access shared resources
      concurrently. This can lead to difficult-to-debug synchronization
      problems.
      Interrupt Handling Overhead: Handling interrupts involves a
      context switch, where the processor saves its current state,
      switches to the interrupt service routine, and later restores the
      saved state. In a multiprocessor system, the overhead of handling
      interrupts on multiple processors simultaneously can be
      significant. This can degrade overall system performance and
      increase contention for shared resources.
      Difficulty in Ensuring Atomicity: Synchronization primitives
      often require atomic operations to ensure that critical sections
      are executed without interruption. Achieving atomicity using
      interrupts may be complex and error-prone, especially in a
      multiprocessor environment where multiple processors may
      attempt to access shared resources concurrently.
      Alternative Synchronization Mechanisms: Multiprocessor
      systems typically employ other synchronization mechanisms
      specifically designed for concurrent execution, such as locks,
      barriers, and atomic operations. These mechanisms are often
      more efficient and provide better control over synchronization in
      a multiprocessor environment compared to relying on interrupts.
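As a contrast with interrupt-based approaches, the lock-based alternative mentioned above can be sketched in a few lines. This hypothetical demo uses Python's threading.Lock: each increment of the shared counter is a critical section made atomic by the lock, not by disabling interrupts:

```python
import threading

counter = 0
lock = threading.Lock()   # a shared lock primitive, not interrupt disabling

def worker(n):
    global counter
    for _ in range(n):
        with lock:        # the read-modify-write of counter becomes atomic
            counter += 1

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

Without the lock, four concurrent writers could lose updates; with it, every increment is serialized regardless of which processor runs which thread.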
5.   Elaborate the actions taken by the kernel to context-switch                  (Nov/Dec 2016)
     between processes.
     Actions taken by a kernel to context-switch between processes:
     In response to a clock interrupt, the OS saves the PC and user stack
     pointer of the currently executing process and transfers control to
     the kernel clock interrupt handler.
     The clock interrupt handler saves the rest of the registers, as well
     as other machine state, such as the state of the floating-point
     registers, in the process's PCB.
     The OS invokes the scheduler to determine the next process to
     execute.
     The OS then retrieves the state of the next process from its PCB
     and restores the registers. This restore operation takes the processor
     back to the state in which the previous process was previously
     interrupted, executing in user code with user-mode privileges.
Many architecture-specific operations, including flushing data and
instruction caches, must also be performed during context switches.
      An interrupt causes the operating system to switch a CPU
        away from its current task and run a kernel routine.
      When an interrupt occurs, the system needs to save the
       current context of the process running on the CPU, so that
       it can restore that context later.
      Switching the CPU to another process requires performing
       a state save of the current process and a state restore of a
       different process; this task is known as context
       switching.
      When a context switch occurs, the kernel saves the context
       of the old process in its PCB and loads the saved context of
       the new process scheduled to run.
      Context-switch time is pure overhead, because the system
       does no useful work while switching.
      Context-switch time is highly dependent on hardware support;
       for example, a processor that provides multiple sets of
       registers can context-switch by simply changing the pointer
       to the current register set, rather than copying state to and
       from the PCB.
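The save-and-restore sequence described above can be sketched with a toy PCB structure; the field names and the dictionary standing in for the CPU's state are illustrative assumptions, not a real kernel interface:

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    # a toy process control block; real PCBs hold much more state
    pid: int
    pc: int = 0
    registers: dict = field(default_factory=dict)

def context_switch(old, new, cpu):
    # 1) save the running process's machine state into its PCB
    old.pc, old.registers = cpu["pc"], dict(cpu["regs"])
    # 2) restore the next process's saved state onto the CPU
    cpu["pc"], cpu["regs"] = new.pc, dict(new.registers)

cpu = {"pc": 120, "regs": {"r0": 7}}           # P0 currently running
p0 = PCB(pid=0)
p1 = PCB(pid=1, pc=300, registers={"r0": 42})  # P1's previously saved state
context_switch(p0, p1, cpu)
print(cpu["pc"], p0.pc)  # 300 120
```

After the switch, the CPU runs from P1's saved program counter while P0's interrupted state sits in its PCB, ready to be restored later.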
6.   Consider the following resource-allocation policy. Requests                   (Nov/Dec
     and releases for resources are allowed at any time. If a                        2015)
     request for resources cannot be satisfied because the
     resources are not available, then we check any processes that
     are blocked, waiting for resources. If they have the desired
     resources, then these resources are taken away from them
     and are given to the requesting process. The vector of
 resources for which the waiting process is waiting is
 increased to include the resources that were taken away.
 For example, consider a system with three resource types
 and the vector
 Available initialized to (4,2,2). If process P0 asks for (2,2,1),
 it gets them. If P1 asks for (1,0,1), it gets them. Then, if P0
 asks for (0,0,1), it is blocked (resource not available). If P2
 now asks for (2,0,0), it gets the available one (1,0,0) and one
 that was allocated to P0 (since P0 is blocked).
 P0‘s Allocation vector goes down to (1,2,1), and its Need
 vector goes up to (1,0,1).
 a. Predict whether deadlock occurs? If so, give an example.
 If not, which necessary condition cannot occur?
 b. Predict whether indefinite blocking occurs?
a. Can deadlock occur? If you answer "yes," give an example. If
you answer "no," specify which necessary condition cannot occur.
Deadlock, in computing, is a situation in which two programs
sharing the same resources prevent each other from gaining access
to a resource, so that neither can make progress. In the case
described, deadlock cannot occur, because the policy allows
preemption: the necessary condition of "no preemption" cannot
hold. Preemption is seen when P2 requests (2,0,0): since P0 is
blocked, one instance of the first resource that was previously
allocated to P0 is taken away from it and given to P2.
b. Can indefinite blocking occur? Explain your answer.
Yes, indefinite blocking can occur because of the preemption built
into this resource-allocation policy. Also referred to as starvation,
indefinite blocking means a process may wait forever while other
processes repeatedly take its resources away. For example, if P0
remains blocked and other processes keep requesting resources,
P0's resources may be preempted again and again, so P0 may never
complete.
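The vector arithmetic of the worked example can be replayed directly; every number below comes from the question itself:

```python
# three resource types, Available initialized to (4,2,2)
avail = [4, 2, 2]

def grant(avail, alloc, req):
    # ordinary grant: move the requested units from Available to the
    # process's Allocation vector
    new_avail = [a - r for a, r in zip(avail, req)]
    new_alloc = [x + r for x, r in zip(alloc, req)]
    return new_avail, new_alloc

avail, p0 = grant(avail, [0, 0, 0], [2, 2, 1])  # P0 asks (2,2,1): granted
avail, p1 = grant(avail, [0, 0, 0], [1, 0, 1])  # P1 asks (1,0,1): granted
# Available is now (1,0,0), so P0's request (0,0,1) blocks.
# P2 asks (2,0,0): it takes the available (1,0,0) and preempts (1,0,0)
# from the blocked process P0.
p0 = [x - y for x, y in zip(p0, [1, 0, 0])]
avail, p2 = grant(avail, [0, 0, 0], [1, 0, 0])
print(p0, p2, avail)  # [1, 2, 1] [1, 0, 0] [0, 0, 0]
```

P0's Allocation indeed drops to (1,2,1), matching the statement in the question.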
7.    Explain dining philosopher’s problem.                                         (Apr/May
                                                                                       2017)
                                                                                     (Nov/Dec
                                                                                       2018)
     Dining Philosophers Problem
     • The dining philosophers problem is one of the classical process
     synchronization problems. Five philosophers are seated around
     a circular table. They spend their lives thinking and eating. Five
     plates with five forks are kept on the circular table.
     • While philosophers think, they do not eat and do not
     require a fork. When a philosopher decides to eat, he/she must
     obtain two forks, one from the left side and another from the right
     side. A philosopher can pick up only a single fork at a time.
     After consuming the food, the philosopher replaces the forks and
     resumes thinking. A philosopher to the left or right of a
     dining philosopher cannot eat while that philosopher is
     eating, since the forks are a shared resource.
     Fig. 2.14.1 shows seating arrangement of five philosophers.
     Consider the following code :
     void dining_philosopher( )
     {
     while (true)
     {
     thinking( );
     eating( );
     }
     }
void eating( )
{
takeleftfork( );
takerightfork( );
eatfood(delay);
putrightfork( );
putleftfork( );
}
• Since the philosophers operate asynchronously and concurrently,
it is possible for every philosopher to enter eating mode at the same
time. The procedure takeleftfork( ) waits until the specified fork is
available and then seizes it. Eating is then not possible, for the
following reason.
• Each philosopher may end up holding exactly one fork, with no
forks remaining available on the table. The philosophers will all
deadlock.
• To solve this problem, write some extra code in the takerightfork( )
procedure: the philosopher puts down the left fork if he/she cannot
obtain the right fork.
Problem analysis :
• Shared resource: Each fork is shared by two philosophers, so it is
a shared resource.
• Race condition: We do not want a philosopher to pick up a fork
that has already been picked up by his neighbor. This is a race
condition.
• Solution: Consider each fork a shared item protected by a mutex
lock. Before eating, each philosopher first locks the left fork and
then the right fork. If the philosopher acquires both locks
successfully, he holds two forks and can eat. After finishing eating,
the philosopher releases both forks and thinks.
     Some other alternatives :
     1. Pick up the left fork; if the right fork is not available for a given
     time, put the left fork down, wait, and try again. Even if each
     philosopher waits a different random time, an unlucky philosopher
     may starve.
     2. Require all philosophers to acquire a binary semaphore before
     picking up any forks. This guarantees that no philosopher starves
     but limits parallelism dramatically.
     • Because we need to lock and unlock a chopstick, each chopstick
     is associated with a mutex lock. Since we have five philosophers
     who think and eat simultaneously, we need to create five threads,
     one for each philosopher. Since each philosopher must have access
     to the two mutex locks that are associated with its left and right
     chopsticks, these mutex locks are global variables.
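A runnable sketch of the lock-per-fork solution follows, with one adjustment to the scheme in the text: instead of adding retry logic to takerightfork( ), each philosopher acquires the lower-numbered fork first. That global lock ordering breaks the circular wait, so the deadlock described above cannot form (the thread count and meal counts are arbitrary demo values):

```python
import threading

N = 5
forks = [threading.Lock() for _ in range(N)]   # one mutex per fork
meals = [0] * N

def philosopher(i):
    left, right = i, (i + 1) % N
    # always grab the lower-numbered fork first: this global ordering
    # breaks the circular-wait condition, so deadlock cannot occur
    first, second = (left, right) if left < right else (right, left)
    for _ in range(10):
        with forks[first]:
            with forks[second]:
                meals[i] += 1              # "eating" with both forks held
        # "thinking" happens outside the critical section

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(meals)  # [10, 10, 10, 10, 10]
```

Every philosopher eats all ten times and the program always terminates, which would not be guaranteed if each thread simply took its left fork and then its right.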
8.    Distinguish among short-term, medium-term and long-                           (Apr
      term scheduling with suitable example.                                         /May 2018)
     Schedulers
     • Schedulers handle process scheduling. A scheduler is a type
     of system software that selects jobs from the system and decides
     which process to run. Schedulers are of three types -
     1. Long term scheduler 2. Short term scheduler 3. Medium term
     scheduler
     • Fig. 2.2.2 shows the queueing diagram for process scheduling.
     (See Fig. 2.2.2 on next page.)
     Long term scheduler
• The long term scheduler is also called the job scheduler. It
determines which processes are admitted to the system for
processing. Processes are selected from the queue and loaded into
main memory for execution.
• Long term scheduling controls the degree of multiprogramming
in multitasking systems. It provides a balanced mix of jobs, such
as I/O-bound and CPU-bound jobs.
• Long term scheduling is used in batch operating systems; a
time-sharing operating system has no long term scheduler.
Medium term scheduler
• The medium term scheduler is part of the swapping function. It
sometimes removes processes from memory, which also reduces
the degree of multiprogramming.
• If a process in memory makes an I/O request, the operating
system moves the process into a suspended state. Once the
process becomes suspended, it cannot make any progress towards
completion.
• In this situation, the process is removed from memory to free
space for other processes.
• The suspended process is stored on a secondary storage device,
i.e. the hard disk. This process is called swapping.
Short term scheduler
• The short term scheduler is also called the CPU scheduler. It
selects a process from the queue of processes that are ready to
execute and allocates the CPU to it.
• The short term scheduler is faster than the long term scheduler.
This scheduler makes scheduling decisions much more frequently
than the long-term or medium-term schedulers.
• A scheduling decision must at a minimum be made after
every time slice, and these are very short.
• It is also known as the dispatcher.
     Difference between Long Term, Short Term and Medium Term
     Scheduler
9.   Explain the differences in the degree to which the following                   (Apr/May
     scheduling algorithms discriminate in favour of short                            2018)
     processes: RR, Multilevel Feedback Queues
     1. First come first served scheduling (FCFS).
     In this algorithm, the process that requests the CPU first is allotted
     the CPU first, and its implementation can easily be maintained
     using a FIFO queue.
     When the CPU is free, it is allotted to a process, and that process
     continues holding the CPU till it terminates or requests I/O devices.
     So a process waiting for the CPU will have to wait for that
     execution to finish. Thus the waiting time is large if a longer
     process executes before a shorter process.
     Thus we can say that FCFS discriminates against short jobs, since
     any short job arriving after a long job will have a longer waiting
     time.
      2. Round robin scheduling.
      This algorithm is designed for time-sharing systems.
      A small unit of time called the time quantum or time slice is
      defined, and each process is switched in and out of the CPU
      depending on this time quantum value.
      The time quantum is generally from 10 ms to 100 ms in length.
      A process is not allotted the CPU for more than one time quantum
      in a row.
      The performance of the RR algorithm depends heavily on the size
      of the time quantum: if the time quantum is very long, the RR
      policy behaves the same as FCFS, and if the time quantum is very
      small (say 1 ms), the RR approach is called processor sharing and
      creates the appearance that each of n processes has its own
      processor running at 1/n the speed of the real processor.
      We can say that RR treats all jobs equally (giving them equal
      bursts of CPU time), so short jobs are able to leave the system
      faster, since they finish first.
      3. Multilevel feedback queues.
      This algorithm allows processes to move between queues; the idea
      is to separate processes according to the characteristics of their
      CPU bursts.
      If a process uses too much CPU time, it is moved to a lower-
      priority queue; under this scheme, all I/O-bound and interactive
      processes end up in the higher-priority queues.
      In addition, a process that waits too long in a lower-priority queue
      may be moved to a higher-priority queue, and this form of aging
      prevents starvation.
      Thus we can say that multilevel feedback queues work similarly to
      the RR algorithm, so they discriminate favorably towards short jobs.
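The claim that RR lets short jobs leave the system early can be checked with a small simulation (the process names and burst values are made up for illustration):

```python
from collections import deque

def round_robin(bursts, quantum):
    # bursts: {pid: CPU time needed}; returns the order of completion
    queue = deque(bursts)            # ready queue in arrival order
    remaining = dict(bursts)
    done = []
    while queue:
        pid = queue.popleft()
        remaining[pid] -= min(quantum, remaining[pid])  # run one time slice
        if remaining[pid] == 0:
            done.append(pid)         # finished: leaves the system
        else:
            queue.append(pid)        # unfinished: back to the tail of the queue
    return done

# a short job (burst 2) finishes long before the long jobs (burst 10 each)
print(round_robin({"long1": 10, "short": 2, "long2": 10}, quantum=1))
# ['short', 'long1', 'long2']
```

Even though the short job arrived second, it completes first, because every process gets one quantum per round.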
10.   Discuss how the following pairs of scheduling criteria conflict              (Nov/Dec 2018)
in certain settings.
i) CPU utilization and response time ii) Average turnaround
time and maximum waiting time iii) I/O device utilization and
CPU utilization.
CPU Utilization and Response Time:
Conflict: CPU utilization is the percentage of time the CPU is
actively executing processes, while response time is the time it
takes for a system to respond to a user request. These criteria can
conflict in scenarios where maximizing CPU utilization, by
keeping the CPU busy with processes, may lead to longer response
times for interactive tasks. For example, a system running
compute-bound processes with high CPU utilization may respond
slowly to user input because the CPU is occupied with lengthy
computations.
Example: In a server environment with a high CPU utilization
goal, background tasks or batch processing may dominate the
CPU, causing delays in responding to user requests and impacting
the overall user experience.
Average Turnaround Time and Maximum Waiting Time:
Conflict: Average turnaround time is the average time taken for a
process to complete from submission to termination, while
maximum waiting time is the longest time a process waits in the
ready queue before getting CPU time. These criteria conflict when
efforts to minimize average turnaround time (by quickly
completing most processes) might result in some processes
experiencing very long waiting times, leading to high maximum
waiting times.
Example: A scheduling algorithm prioritizing short processes to
minimize average turnaround time may lead to longer waiting
times for longer processes, causing their maximum waiting time to
be high.
I/O Device Utilization and CPU Utilization:
Conflict: I/O device utilization focuses on keeping I/O devices
busy, while CPU utilization emphasizes keeping the CPU busy. In
certain situations, optimizing for one may lead to suboptimal
performance in the other. For example, if the CPU is constantly
waiting for I/O operations to complete (low CPU utilization), the
I/O devices might be highly utilized. Conversely, if the CPU is
constantly processing and not waiting for I/O (high CPU
utilization), I/O devices may be underutilized.
Example: In a system with high I/O device utilization, the CPU
might spend significant time waiting for I/O operations to
      complete, leading to suboptimal CPU utilization. On the other
      hand, a system with high CPU utilization may experience I/O
      devices being idle while the CPU is processing.
      In each of these pairs, the conflict arises because optimizing one
      criterion can have a negative impact on the other. Balancing these
      conflicting criteria is a key challenge in designing effective
      scheduling algorithms that meet the specific requirements of a
      given system or application.
11.   Write about the various CPU scheduling algorithms.
      CPU scheduling algorithms play a crucial role in determining the
      order in which processes are executed by the CPU. These
      algorithms aim to efficiently utilize the CPU's resources, minimize
      waiting times, and improve overall system performance. Here are
      some of the various CPU scheduling algorithms:
      First-Come-First-Serve (FCFS):
      Description: The simplest scheduling algorithm, where processes
      are executed in the order they arrive in the ready queue.
      Advantages: Easy to understand and implement.
      Disadvantages: May lead to the "convoy effect," where short
      processes get stuck behind long processes.
      Shortest Job Next (SJN) or Shortest Job First (SJF):
      Description: Selects the process with the shortest burst time first.
      Advantages: Minimizes waiting time for short processes.
      Disadvantages: Requires knowledge of burst times, which may not
      be available in advance.
      Priority Scheduling:
      Description: Assigns priorities to processes, and the process with
      the highest priority is selected for execution first.
      Advantages: Allows for prioritization of important tasks.
      Disadvantages: Can lead to starvation of lower-priority processes.
      Round Robin (RR):
      Description: Allocates fixed time slices (time quantum) to each
      process in a circular order.
      Advantages: Fair distribution of CPU time; suitable for time-
      sharing systems.
      Disadvantages: May lead to high turnaround time for long
      processes; choice of time quantum affects performance.
      Multilevel Queue Scheduling:
      Description: Divides the ready queue into multiple priority levels
      and assigns a different scheduling algorithm to each queue.
      Advantages: Separates processes into classes (e.g., foreground and
      background), each of which can be given a scheduling algorithm
      suited to its needs; suitable for diverse workloads.
      Disadvantages: Complexity in managing multiple queues and in
      scheduling among them; processes are permanently assigned to a
      queue.
      Multilevel Feedback Queue (MLFQ):
      Description: Similar to multilevel queues, but processes can move
      between queues based on their behavior (e.g., aging or priority
      changes).
      Advantages: Dynamic adjustment to process behavior; adapts to
      different types of processes.
      Disadvantages: Requires careful tuning of parameters to achieve
      optimal performance.
      Priority Aging:
      Description: Increases the priority of waiting processes over time,
      preventing starvation of low-priority processes.
      Advantages: Mitigates priority inversion and starvation issues.
      Disadvantages: May introduce additional complexity in the
      scheduling algorithm.
      Lottery Scheduling:
      Description: Assigns lottery tickets to processes, and a scheduler
      randomly selects a ticket to determine the next process to execute.
      Advantages: Provides a probabilistic approach to fairness.
      Disadvantages: May be less predictable; the complexity of ticket
      distribution.
      Guaranteed Scheduling:
      Description: Ensures that each process is guaranteed a minimum
      amount of CPU time over a given interval.
      Advantages: Ensures fairness and prevents starvation.
      Disadvantages: May underutilize the CPU.
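The difference between the FCFS and SJF entries above can be made concrete with a small waiting-time calculation, assuming all processes arrive at time 0 (the burst values are the classic textbook illustration):

```python
def avg_waiting(order, bursts):
    # non-preemptive schedule; with all arrivals at t=0, a process's
    # waiting time equals the time at which it starts running
    t, total = 0, 0
    for pid in order:
        total += t
        t += bursts[pid]
    return total / len(order)

bursts = {"P1": 24, "P2": 3, "P3": 3}
print(avg_waiting(["P1", "P2", "P3"], bursts))  # FCFS arrival order: 17.0
print(avg_waiting(["P2", "P3", "P1"], bursts))  # SJF order: 3.0
```

The same three processes wait 17 time units on average under FCFS but only 3 under SJF, which is exactly the convoy effect the FCFS entry warns about.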
12.   Write about critical regions and monitors.
      Monitors
      • Monitor is an object that contains both data and procedures
      needed to perform allocation of a shared resource. It is a
      programming language construct that support both data access
      synchronization and control synchronization.
      • Fig. 2.15.1 shows structure of monitor. (See Fig. 2.15.1 on next
      page.)
      • Monitors are implemented in programming languages like Pascal,
      Java and C++. Java makes extensive use of monitors to implement
      mutual exclusion.
      • Monitor is an abstract data type for which only one process may
      be executing a procedure at any given time. Monitor is a collection
      of procedure, variables and data structure. Data inside the monitor
      may be either global to all routines within the monitor or local to a
      specific routine.
• Monitor data is accessible only within the monitor. A shared data
structure can be protected by placing it in a monitor. If the data in
a monitor represents some resource, then the monitor provides a
mutual exclusion facility for accessing the resource. A procedure
defined within a monitor can access only those variable declared
locally within the monitor and its formal parameters.
• When a process calls a monitor procedure, the first few
instructions of the procedure check whether any other process
is currently active within the monitor. If another process is active,
the calling process will be suspended until the other process has left
the monitor. If no other process is using the monitor, the calling
process may enter.
• A monitor supports synchronization by the use of condition
variables that are contained within the monitor and accessible only
within the monitor. Every condition variable has an associated
queue. Condition variables support two operations:
cwait(condition variable)
csignal(condition variable)
• cwait: suspends execution of the calling process on the condition.
• csignal: resumes execution of some process blocked after a cwait
on the same condition.
• A cwait must come before the corresponding csignal.
• Fig. 2.15.2 shows monitor structure with condition variables.
     • CPU is a resource that must be shared by all processes. The part
     of the kernel that apportions CPU time between processes is called
     the scheduler.
     • A condition variable is like a semaphore, with two differences:
     1. A semaphore counts the number of excess up operations, but a
     signal operation on a condition variable has no effect unless some
     process is waiting. A wait on a condition variable always blocks
     the calling process.
     2. A wait on a condition variable automatically does an up on the
     monitor mutex (releasing the monitor) and blocks the caller.
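The cwait/csignal behaviour can be sketched with Python's threading.Condition, which bundles a monitor-style mutex with a condition queue. The bounded-buffer class below is an illustrative monitor, not an example from the text:

```python
import collections
import threading

class BoundedBuffer:
    # a monitor-style object: one lock guards the shared data, and a
    # condition variable supplies cwait/csignal-like operations
    def __init__(self, capacity):
        self.items = collections.deque()
        self.capacity = capacity
        self.cond = threading.Condition()   # monitor mutex + condition queue

    def put(self, item):
        with self.cond:                     # enter the monitor
            while len(self.items) == self.capacity:
                self.cond.wait()            # cwait: release the mutex and block
            self.items.append(item)
            self.cond.notify_all()          # csignal: wake blocked processes

    def get(self):
        with self.cond:
            while not self.items:
                self.cond.wait()
            item = self.items.popleft()
            self.cond.notify_all()
            return item

buf = BoundedBuffer(2)
out = []
consumer = threading.Thread(target=lambda: out.extend(buf.get() for _ in range(5)))
consumer.start()
for i in range(5):
    buf.put(i)          # blocks whenever the buffer is full
consumer.join()
print(out)  # [0, 1, 2, 3, 4]
```

Note that wait() releases the monitor's lock while blocking, exactly the semaphore-vs-condition-variable difference listed above.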
                                               PART C
1.   Describe Operations on Processes and IPC elaborately.
     There are many operations that can be performed on processes.
     Some of these are process creation, process preemption, process
     blocking, and process termination. These are given in detail as
     follows −
Process Creation
Processes need to be created in the system for different
operations. This can be done by the following events −
     User request for process creation
     System initialization
     Execution of a process creation system call by a running
       process
     Batch job initialization
A process may be created by another process using fork(). The
creating process is called the parent process and the created
process is the child process. A child process can have only one
parent but a parent process may have many children. Both the
parent and child processes have the same memory image, open
files, and environment strings. However, they have distinct
address spaces.
A diagram that demonstrates process creation using fork() is as
follows −
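The fork() call described above can be demonstrated directly on a POSIX system (the exit status 7 is an arbitrary value chosen for this demo):

```python
import os

pid = os.fork()                  # duplicate the calling process (POSIX only)
if pid == 0:
    # child: starts as a copy of the parent but has its own address space
    os._exit(7)                  # terminate with a status the parent can collect
else:
    _, status = os.waitpid(pid, 0)            # parent waits for its child
    code = os.waitstatus_to_exitcode(status)  # Python 3.9+
    print(code)  # 7
```

fork() returns 0 in the child and the child's PID in the parent, which is how the two copies tell themselves apart.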
Process Preemption
An interrupt mechanism is used in preemption that suspends the
process executing currently and the next process to execute is
determined by the short-term scheduler. Preemption makes sure
that all processes get some CPU time for execution.
A diagram that demonstrates process preemption is as follows −
Process Blocking
The process is blocked if it is waiting for some event to occur.
This event may be I/O as the I/O events are executed in the main
memory and don't require the processor. After the event is
complete, the process again goes to the ready state.
A diagram that demonstrates process blocking is as follows −
Process Termination
After the process has completed the execution of its last
instruction, it is terminated. The resources held by a process are
released after it is terminated.
A child process can be terminated by its parent process if its task
is no longer relevant. The child process sends its status
information to the parent process before it terminates. Also, when
a parent process is terminated, its child processes are terminated
as well, since the child processes cannot run once their parent
has terminated.
INTER-PROCESS COMMUNICATION
In general, Inter Process Communication is a type of mechanism
usually provided by the operating system (or OS). The main aim or
goal of this mechanism is to provide communications in between
several processes. In short, the intercommunication allows a
process letting another process know that some event has occurred.
Let us now look at the general definition of inter-process
communication, which will explain the same thing that we have
discussed above.
Definition
"Inter-process communication is used for exchanging useful
information between numerous threads in one or more processes
(or programs)."
To understand inter process communication, you can consider the
following given diagram that illustrates the importance of inter-
process communication:
Role of Synchronization in Inter Process Communication
It is one of the essential parts of inter process communication.
Typically, this is provided by interprocess communication control
mechanisms, but sometimes it can also be controlled by
communication processes.
These are the following methods that used to provide the
synchronization:
     1. Mutual Exclusion
     2. Semaphore
     3. Barrier
     4. Spinlock
Mutual Exclusion:-
It is generally required that only one process thread can enter the
critical section at a time. This also helps in synchronization and
creates a stable state to avoid the race condition.
Semaphore:-
Semaphore is a type of variable that usually controls the access to
the shared resources by several processes. Semaphore is further
divided into two types which are as follows:
     1. Binary Semaphore
     2. Counting Semaphore
Barrier:-
A barrier does not allow an individual process to proceed until all
the processes reach it. It is used by many parallel languages, and
collective routines impose barriers.
Spinlock:-
Spinlock is a type of lock, as its name implies. A process trying to
acquire a spinlock waits, or stays in a loop, while checking whether
the lock is available. This is known as busy waiting because, even
though the process is active, it does not perform any useful
operation (or task).
Approaches to Interprocess Communication
We will now discuss a few different approaches to inter-process
communication:
   1. Pipes
   2. Shared Memory
   3. Message Queue
   4. Direct Communication
   5. Indirect communication
    6. Message Passing
    7. FIFO
To understand them in more detail, we will discuss each of them
individually.
Pipe:-
The pipe is a type of data channel that is unidirectional in nature:
data in this type of channel can move in only a single direction at a
time. Still, one can use two channels of this type so that data can be
sent and received between two processes. Typically, a pipe uses the
standard methods for input and output. Pipes are used in all types
of POSIX systems and in different versions of Windows operating
systems as well.
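A minimal demonstration of the unidirectional channel, using an anonymous pipe through Python's os module:

```python
import os

r, w = os.pipe()                     # one read end, one write end: data flows w -> r
os.write(w, b"hello through the pipe")
os.close(w)                          # closing the write end lets the reader see EOF
data = os.read(r, 1024).decode()
os.close(r)
print(data)  # hello through the pipe
```

Because the channel is one-way, bidirectional communication would need a second pipe with the ends reversed, as the paragraph above notes.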
Shared Memory:-
It can be referred to as a type of memory that can be used or
accessed by multiple processes simultaneously. It is primarily used
so that the processes can communicate with each other. Therefore
the shared memory is used by almost all POSIX and Windows
operating systems as well.
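A sketch of the same idea with Python's multiprocessing.shared_memory module (Python 3.8+); for brevity the "second process" is simulated by attaching to the block by name within the same program:

```python
from multiprocessing import shared_memory

# one process creates a named block of shared memory...
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# ...and a second process would attach to the same block by its name
peer = shared_memory.SharedMemory(name=shm.name)
seen = bytes(peer.buf[:5]).decode()
print(seen)  # hello

peer.close()
shm.close()
shm.unlink()   # release the block once every user has detached
```

Unlike pipes or message queues, no data is copied between the communicating parties: both map the very same region of memory.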
Message Queue:-
In general, several different processes are allowed to read and
write messages to the message queue. The messages are stored in
the queue until their recipients retrieve them. In short, the message
queue is very helpful in inter-process communication and is used
by all operating systems.
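The message-queue idea can be sketched with Python's standard `queue.Queue`, standing in for a kernel message queue (a simplification: real IPC queues cross process boundaries, while this one is shared between threads):

```python
import queue
import threading

mq = queue.Queue()   # messages stay in the queue until retrieved

def producer():
    for msg in ["job-1", "job-2", "job-3"]:
        mq.put(msg)                 # write messages into the queue

received = []

def consumer():
    for _ in range(3):
        received.append(mq.get())   # recipient retrieves messages (FIFO)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(received)
```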
Message Passing:-
It is a mechanism that allows processes to synchronize and
communicate with each other. By using message passing, the
processes can communicate with each other without resorting to
shared variables.
Usually, the inter-process communication mechanism provides
two operations that are as follows:
     o send (message)
     o receive (message)
Direct Communication:-
In this type of communication, a link is created or established
between two communicating processes. However, between every
pair of communicating processes, only one link can exist.
Indirect Communication:-
Indirect communication can only be established when processes
share a common mailbox, and each pair of processes may share
several communication links. These links can be unidirectional or
bidirectional.
     FIFO:-
     It is a type of general communication between two unrelated
     processes. It can also be considered as full-duplex, which means
     that one process can communicate with another process and vice
     versa.
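A sketch of a FIFO (named pipe), assuming a POSIX system with `os.mkfifo`; a thread stands in for the second, unrelated process:

```python
import os
import tempfile
import threading

# A FIFO (named pipe) lives in the filesystem, so even unrelated
# processes can open it by name.
path = os.path.join(tempfile.mkdtemp(), "chan")
os.mkfifo(path)

def writer():
    with open(path, "w") as f:   # blocks until a reader opens the FIFO
        f.write("hello via FIFO")

t = threading.Thread(target=writer)
t.start()
with open(path) as f:            # blocks until the writer opens it
    msg = f.read()               # read until the writer closes (EOF)
t.join()
os.unlink(path)
print(msg)
```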
     Some other different approaches
         o Socket:-
     It acts as an endpoint for sending or receiving data over a
     network. Sockets can be used for data exchanged between
     processes on the same computer or between different computers
     on the same network. Hence, they are used by several types of
     operating systems.
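A minimal sketch of socket-based IPC using `socket.socketpair()`, which gives two already-connected endpoints on the same machine (networked use would involve `bind`/`connect` instead):

```python
import socket

# socketpair() returns two connected endpoints on one machine.
a, b = socket.socketpair()

a.sendall(b"ping")        # one endpoint sends...
msg = b.recv(1024)        # ...the other receives
b.sendall(b"pong")        # sockets are bidirectional
reply = a.recv(1024)

a.close()
b.close()
print(msg.decode(), reply.decode())
```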
         o File:-
     A file is a data record or document stored on disk that can be
     acquired on demand by the file server. Importantly, several
     processes can access that file as required.
         o Signal:-
     As the name implies, signals are used in inter-process
     communication in a minimal way. Typically, they are system
     messages sent by one process to another. Therefore, they are
     not used for sending data but for sending remote commands
     between processes.
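A minimal sketch of signal-based IPC, assuming a POSIX system: here the process sends `SIGUSR1` to itself, where normally another process would send it via the target's pid:

```python
import os
import signal

commands = []

def handler(signum, frame):
    # A signal carries a command (its number), not data.
    commands.append(signum)

# Install a handler, then send the signal to our own process.
signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)

print(commands == [signal.SIGUSR1])
```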
2.   Discuss Semaphores and mutex in detail.                                     (Apr/May 2015)
A semaphore is simply a non-negative variable shared                         (Nov/Dec 2021)
between threads. A semaphore is a signaling mechanism: a
thread waiting on a semaphore can be signaled by another thread.
A semaphore uses two atomic operations:
1. Wait: The wait operation decrements the value of its argument S
if it is positive. If S is zero, the caller waits until S becomes
positive before decrementing it.
wait(S)
{
    while (S <= 0)
        ;        /* busy wait */
    S--;
}
2. Signal for the process synchronization: The signal operation
increments the value of its argument S.
signal(S)
{
    S++;
}
A semaphore either allows or denies access to the resource,
depending on how it is set up.
Use of Semaphore
In the case of a single 4 KB buffer, we can separate it into four
buffers of 1 KB. Semaphores can be associated with these four
buffers, allowing consumers and producers to work on different
buffers simultaneously.
Types of Semaphore
Semaphores are distinguished by the operating system into two
categories: counting semaphores and binary semaphores.
1. Counting Semaphore: The semaphore S value is initialized to
the number of resources present in the system. Whenever a process
wants to access the resource, it performs the wait()operation on the
semaphore and decrements the semaphore value by one. When it
releases the resource, it performs the signal() operation on the
semaphore and increments the semaphore value by one.
When the semaphore count goes to 0, it means all resources are
occupied. If a process needs to use a resource when the semaphore
count is 0, it executes the wait() operation and gets blocked until
the semaphore value becomes greater than 0.
2. Binary semaphore: The value of a binary semaphore ranges
only between 0 and 1. It is similar to a mutex lock, but a mutex is
a locking mechanism, whereas a semaphore is a signaling
mechanism. In a binary semaphore, if a process wants to access
the resource, it performs the wait() operation and decrements the
value of the semaphore from 1 to 0. When it releases the resource,
it performs a signal() operation and increments the value to 1. If
the value of the semaphore is 0 and a process wants to access the
resource, it performs the wait() operation and blocks itself until
the process currently using the resource releases it.
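The two semaphore types can be sketched with Python's `threading.Semaphore` (a sketch of the counting behaviour, not of any particular OS implementation):

```python
import threading

# Counting semaphore initialized to 3 resource instances.
sem = threading.Semaphore(3)
sem.acquire()                              # wait(): count 3 -> 2
sem.acquire()                              # 2 -> 1
sem.acquire()                              # 1 -> 0: all instances busy
got_fourth = sem.acquire(blocking=False)   # would block, so returns False
sem.release()                              # signal(): 0 -> 1

# Binary semaphore: the count only moves between 0 and 1.
binary = threading.Semaphore(1)
binary.acquire()                           # 1 -> 0
in_use = binary.acquire(blocking=False)    # False: already held
binary.release()                           # 0 -> 1

print(got_fourth, in_use)
```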
Advantages of Semaphore
Here are the advantages of semaphores:
   o A counting semaphore allows more than one thread to
       access the critical section, up to the count of available
       resources.
   o Semaphores are machine-independent, as they are
       implemented in the machine-independent code of the
       microkernel.
   o When used for mutual exclusion, they do not allow
       multiple processes to enter the critical section
       simultaneously.
   o Since blocking semaphores avoid busy waiting, process
       time and resources are not wasted.
   o They allow flexible management of resources.
Disadvantages of Semaphores
Semaphores also have some disadvantages:
    o One of the biggest limitations of semaphores is priority
       inversion.
    o The operating system has to keep track of all calls to wait
       and signal on the semaphore.
    o Their use is never enforced; it is by convention only.
    o The wait and signal operations must be executed in the
       correct order to avoid deadlocks.
    o Semaphore programming is complex, so there is a chance
       of not achieving mutual exclusion.
    o It is not a practical method for large-scale use, as its use
       leads to loss of modularity.
    o Semaphores are prone to programmer error, which may
       cause deadlock or violation of mutual exclusion.
MUTEX
Mutex is a mutual exclusion object that synchronizes access to a
resource. It is created with a unique name at the start of a program.
The mutex locking mechanism ensures that only one thread can
acquire the mutex and enter the critical section. The thread
releases the mutex only when it exits the critical section.
It is a special type of binary semaphore used for controlling access
to the shared resource. It includes a priority inheritance mechanism
to avoid extended priority inversion problems. It allows current
higher priority tasks to be kept in the blocked state for the shortest
time possible. However, priority inheritance does not correct
priority inversion but only minimizes its effect.
Example
This is shown with the help of the following example,
wait (mutex);
.....
Critical Section
.....
signal (mutex);
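The wait/signal pattern above can be sketched with Python's `threading.Lock` acting as the mutex (the deposit scenario is illustrative):

```python
import threading

mutex = threading.Lock()
balance = 0

def deposit():
    global balance
    for _ in range(100_000):
        with mutex:          # wait(mutex): only one thread may enter
            balance += 1     # critical section
        # signal(mutex) happens automatically on leaving the block

threads = [threading.Thread(target=deposit) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)
```

With the mutex held around each update, the two threads cannot interleave inside the critical section, so no increments are lost.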
Use of Mutex
A mutex provides mutual exclusion: either the producer or the
consumer can hold the key (mutex) and proceed with its work at
any one time. As long as the producer fills the buffer, the
consumer must wait, and vice versa. With a mutex lock, at any
time only a single thread can work with the entire buffer.
When a program starts, it requests the system to create a mutex
object for a given resource. The system creates the mutex object
with a unique name or ID. Whenever a program thread wants to
use the resource, it acquires a lock on the mutex object, uses the
resource, and after use releases the lock. Then the next process is
allowed to acquire the lock on the mutex object.
While a process holds the lock on the mutex object, no other
thread or process can access that resource. If the mutex object is
already locked, a process desiring to acquire the lock has to wait
and is queued up by the system until the mutex object is unlocked.
     Advantages of Mutex
Here are the advantages of a mutex:
    o A mutex is just a simple lock obtained before entering the
        critical section and released on leaving it.
    o Since only one thread is in its critical section at any given
        time, there are no race conditions, and data always remains
        consistent.
     Disadvantages of Mutex
A mutex also has some disadvantages:
    o If a thread obtains the lock and goes to sleep or is
        preempted, other threads may not be able to move forward.
        This may lead to starvation.
    o It can't be locked or unlocked from a context other than
        the one that acquired it.
    o Only one thread is allowed in the critical section at a time,
        which limits concurrency.
    o The normal implementation may lead to a busy waiting
        state, which wastes CPU time.
3.   Discuss deadlock and the various methods of handling                        (Nov/Dec 2011)
     deadlock in detail.                                                         (Nov/Dec 2021)
A deadlock happens in an operating system when two or more
processes each need a resource held by another process in order to
complete their execution.
For example, suppose process 1 holds resource 1 and needs to
acquire resource 2, while process 2 holds resource 2 and needs to
acquire resource 1. Process 1 and process 2 are in deadlock: each
needs the other's resource to complete its execution, but neither is
willing to relinquish its own resource.
Coffman Conditions
A deadlock occurs if all four Coffman conditions hold
simultaneously. These conditions are not mutually exclusive.
The Coffman conditions are given as follows:
    Mutual Exclusion
       There should be a resource that can only be held by one
       process at a time. For example, a single instance of
       Resource 1 can be held by Process 1 only.
      Hold and Wait
       A process can hold multiple resources and still request
       more resources held by other processes. For example,
       Process 2 may hold Resource 2 and Resource 3 while
       requesting Resource 1, which is held by Process 1.
      No Preemption
       A resource cannot be preempted from a process by force; a
       process can only release a resource voluntarily. For
       example, Process 2 cannot preempt Resource 1 from
       Process 1; it will only be released when Process 1
       relinquishes it voluntarily after its execution is complete.
      Circular Wait
       A process is waiting for the resource held by the second
       process, which is waiting for the resource held by the third
       process and so on, till the last process is waiting for a
       resource held by the first process. This forms a circular
       chain. For example: Process 1 is allocated Resource2 and it
       is requesting Resource 1. Similarly, Process 2 is allocated
       Resource 1 and it is requesting Resource 2. This forms a
       circular wait loop.
METHODS FOR HANDLING DEADLOCKS
There are four approaches to dealing with deadlocks.
1. Deadlock Prevention
2. Deadlock avoidance (Banker's Algorithm)
3. Deadlock detection & recovery
4. Deadlock Ignorance (Ostrich Method)
DEADLOCK PREVENTION
A deadlock can occur only when the following four conditions
hold:
1. Mutual Exclusion
2. Hold and Wait
3. No preemption
4. Circular wait
We can prevent deadlock by eliminating any of these four
conditions.
1. Mutual Exclusion
Mutual exclusion, from the resource point of view, means that a
resource can never be used by more than one process
simultaneously. That is fair enough, but it is the main reason
behind deadlock: if a resource could be used by more than one
process at the same time, no process would ever be left waiting for
a resource.
However, if we can make resources shareable (violating mutual
exclusion), then deadlock can be prevented.
Spooling
For a device like a printer, spooling can work. There is memory
associated with the printer that stores jobs from each process.
Later, the printer collects all the jobs and prints each one
according to FCFS. With this mechanism, a process doesn't have
to wait for the printer; it can continue whatever it was doing and
collect the output later, when it is produced.
Although spooling can be an effective approach to avoiding
mutual exclusion, it suffers from two kinds of problems:
     1. It cannot be applied to every resource.
     2. After some time, a race condition may arise between the
         processes competing for space in the spool.
We cannot force a resource to be used by more than one process at
the same time, since it would not be fair and serious performance
problems may arise. Therefore, we cannot practically violate
mutual exclusion.
2. Hold and Wait
The hold and wait condition arises when a process holds a
resource while waiting for some other resource to complete its
task. Deadlock occurs because more than one process may each
hold one resource while waiting for another in a cyclic order.
We therefore need a mechanism by which a process either doesn't
hold any resource or doesn't wait. That means a process must be
assigned all the necessary resources before its execution starts,
and must not wait for any resource once execution has begun.
!(Hold and wait) = !hold or !wait (negation of hold and wait is,
either you don't hold or you don't wait)
This could be implemented if a process declared all its resources
initially. However, although this sounds simple, it can't be done in
a real computer system, because a process can't determine its
necessary resources in advance.
A process is a set of instructions executed by the CPU. Each
instruction may demand multiple resources at multiple times; the
need cannot be fixed by the OS beforehand.
The problems with this approach are:
     1. It is practically not possible.
     2. The possibility of starvation increases, because some
         process may hold a resource for a very long time.
3. No Preemption
Deadlock arises partly because a resource, once allocated, can't be
taken away from a process. However, if we take the resource away
from the process that is causing the deadlock, then we can prevent
the deadlock.
This is not a good approach in general: if we take away a resource
that is being used by a process, all the work it has done so far can
become inconsistent.
Consider a printer being used by some process. If we take the
printer away from that process and assign it to another process, all
the data that has been printed can become inconsistent, and the
process can't resume printing from where it left off, which causes
performance inefficiency.
     4. Circular Wait
To violate circular wait, we can assign a number to each resource
and require that a process never request a resource with a lower
number than any resource it already holds. With such an ordering,
no circular chain of waiting processes can form, so no cycle
arises.
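The resource-numbering rule can be sketched as follows (the three numbered locks are hypothetical resources):

```python
import threading

# Hypothetical resources, each tagged with a fixed number.
resources = {1: threading.Lock(), 2: threading.Lock(), 3: threading.Lock()}

def acquire_in_order(*numbers):
    """Acquire locks strictly in increasing resource number, so two
    processes can never hold locks that form a circular wait."""
    order = sorted(numbers)
    for n in order:
        resources[n].acquire()
    return order

def release_all(numbers):
    for n in reversed(numbers):
        resources[n].release()

held = acquire_in_order(3, 1, 2)   # requested out of order...
release_all(held)
print(held)                        # ...but always taken lowest-first
```

Because every process takes locks lowest-number-first, no process can ever wait for a lower-numbered lock while holding a higher one, which rules out a cycle.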
4.   Discuss in detail about deadlock avoidance.
     DEADLOCK AVOIDANCE
The Banker's algorithm is used to avoid deadlock and allocate
resources safely to each process in the computer system. It
examines the safe state: all possible requests are considered before
deciding whether an allocation should be allowed for each
process. It also helps the operating system share resources
successfully among all the processes. The Banker's algorithm is so
named because it models how a bank decides whether a loan can
safely be sanctioned: the bank simulates the allocation of its cash
before committing to it. In this section, we will learn the Banker's
algorithm in detail and solve problems based on it. To understand
the Banker's algorithm, we will first look at a real-world example.
Suppose the number of account holders in a particular bank is 'n',
and the total money in the bank is 'T'. If an account holder applies
for a loan, the bank first subtracts the loan amount from the total
cash and then checks that the remaining cash is still sufficient to
satisfy the needs of the other account holders before approving the
loan. These steps are taken so that if another person applies for a
loan or withdraws some amount, the bank can manage and operate
everything without any restriction in the functionality of the
banking system.
Similarly, it works in an operating system. When a new process is
created in a computer system, the process must provide all
relevant information to the operating system: its upcoming
resource requests, their counts, and how long the resources will be
held. Based on these criteria, the operating system decides which
processes should execute and which should wait so that no
deadlock occurs in the system. Therefore, it is known as the
deadlock avoidance algorithm.
Advantages
Following are the essential characteristics of the Banker's
algorithm:
    1. It contains various resources that meet the requirements of
         each process.
    2. Each process should provide information to the operating
         system for upcoming resource requests, the number of
         resources, and how long the resources will be held.
    3. It helps the operating system manage and control process
         requests for each type of resource in the computer system.
    4. The algorithm has a Max resource attribute that indicates
         the maximum number of resources each process can hold
         in the system.
Disadvantages
    1. It requires a fixed number of processes, and no additional
         processes can be started in the system while executing the
         process.
    2. The algorithm does not allow a process to change its
         maximum needs while processing its tasks.
    3. Each process has to know and state its maximum
         resource requirement in advance.
    4. Resource requests must be granted within a finite time.
When working with the Banker's algorithm, it needs to know three
things:
    1. How much each process can request for each resource in
         the system. It is denoted by the [MAX] request.
    2. How much each process is currently holding each resource
         in a system. It is denoted by the [ALLOCATED] resource.
    3. It represents the number of each resource currently
         available in the system. It is denoted by the
         [AVAILABLE] resource.
The following data structures are used in the Banker's algorithm.
Suppose n is the number of processes and m is the number of
resource types in the computer system.
     1. Available: An array of length 'm' giving the number of
         available resources of each type. Available[j] = K means
         that 'K' instances of resource type R[j] are available in the
         system.
     2. Max: An [n x m] matrix defining the maximum demand:
         Max[i][j] is the maximum number of instances of resource
         type R[j] that process P[i] may request.
     3. Allocation: An [n x m] matrix indicating the resources
         currently allocated to each process. Allocation[i][j] = K
         means that process P[i] is currently allocated K instances
         of resource type R[j].
     4. Need: An [n x m] matrix representing the remaining
         resource need of each process. Need[i][j] = K means that
         process P[i] may require K more instances of resource
         type R[j] to complete its assigned work.
         Need[i][j] = Max[i][j] - Allocation[i][j]
     5. Finish: A vector of length n. It holds a Boolean value
         (true/false) for each process, indicating whether the
         process has been allocated its requested resources and has
         released them all after finishing its task.
The Banker's Algorithm is the combination of the safety algorithm
and the resource request algorithm to control the processes and
avoid deadlock in a system:
Safety Algorithm
The safety algorithm checks whether or not the system is in a safe
state, i.e. whether a safe sequence exists:
1. There are two vectors, Work of length m and Finish of length n.
Initialize: Work = Available
Finish[i] = false for i = 0, 1, 2, ..., n - 1.
2. Find an index i such that both:
Need[i] <= Work
Finish[i] == false
If no such i exists, go to step 4.
3. Work = Work + Allocation[i]   // P[i] finishes and releases its resources
Finish[i] = true
Go to step 2 to look for the next such process.
4. If Finish[i] == true for all i, the system is in a safe state.
Resource Request Algorithm
The resource request algorithm checks how the system will
behave when a process makes a resource request, expressed as a
request vector.
Let us create a resource request array Request[i] for each process
P[i]. If Request[i][j] = 'K', process P[i] requires 'K' instances of
resource type R[j].
1. If the number of requested instances of each type is at most the
process's remaining need, go to step 2; if the condition fails, the
process P[i] has exceeded its maximum claim for the resource:
If Request[i] <= Need[i], go to step 2.
2. If the number of requested instances of each type is at most the
available resources, go to step 3:
If Request[i] <= Available, go to step 3;
else process P[i] must wait, since the resources are not available.
3. The requested resources are provisionally allocated to the
process by changing the state:
Available = Available - Request[i]
Allocation[i] = Allocation[i] + Request[i]
Need[i] = Need[i] - Request[i]
If the resulting resource-allocation state is safe, the resources are
allocated to process P[i]. If the new state is unsafe, P[i] must wait
for Request[i], and the old resource-allocation state is restored.
Example: Consider a system that contains five processes P1, P2,
P3, P4, P5 and three resource types A, B and C. Resource type A
has 10 instances, B has 5 instances and C has 7 instances.
 Process   Allocation     Max         Available
           A   B   C      A   B   C   A   B   C
 P1        0   1   0      7   5   3   3   3   2
 P2        2   0   0      3   2   2
 P3        3   0   2      9   0   2
 P4        2   1   1      2   2   2
 P5        0   0   2      4   3   3
Answer the following questions using the banker's algorithm:
   1. What is the content of the need matrix?
   2. Determine if the system is safe or not.
   3. What will happen if process P1 makes the resource request
       (1, 0, 0)? Can the system accept this request immediately?
Ans. 1: The content of the need matrix is computed as follows:
Need[i] = Max[i] - Allocation[i]
Need for P1: (7, 5, 3) - (0, 1, 0) = 7, 4, 3
Need for P2: (3, 2, 2) - (2, 0, 0) = 1, 2, 2
Need for P3: (9, 0, 2) - (3, 0, 2) = 6, 0, 0
Need for P4: (2, 2, 2) - (2, 1, 1) = 0, 1, 1
Need for P5: (4, 3, 3) - (0, 0, 2) = 4, 3, 1
 Process   Need
           A   B   C
 P1        7   4   3
 P2        1   2   2
 P3        6   0   0
 P4        0   1   1
 P5        4   3   1
This gives the need matrix.
Ans. 2: Apply the Banker's Algorithm:
Available Resources of A, B and C are 3, 3, and 2.
Now we check if each type of resource request is available for
each process.
Step 1: For Process P1:
Need <= Available
7, 4, 3 <= 3, 3, 2 condition is false.
So, we examine another process, P2.
Step 2: For Process P2:
Need <= Available
1, 2, 2 <= 3, 3, 2 condition true
New available = available + Allocation
(3, 3, 2) + (2, 0, 0) => 5, 3, 2
Similarly, we examine another process P3.
Step 3: For Process P3:
P3 Need <= Available
6, 0, 0 < = 5, 3, 2 condition is false.
Similarly, we examine another process, P4.
Step 4: For Process P4:
P4 Need <= Available
0, 1, 1 <= 5, 3, 2 condition is true
New Available resource = Available + Allocation
5, 3, 2 + 2, 1, 1 => 7, 4, 3
Similarly, we examine another process P5.
Step 5: For Process P5:
P5 Need <= Available
4, 3, 1 <= 7, 4, 3 condition is true
New available resource = Available + Allocation
7, 4, 3 + 0, 0, 2 => 7, 4, 5
Now, we again examine each type of resource request for
processes P1 and P3.
Step 6: For Process P1:
P1 Need <= Available
7, 4, 3 <= 7, 4, 5 condition is true
New Available Resource = Available + Allocation
7, 4, 5 + 0, 1, 0 => 7, 5, 5
So, we examine the remaining process, P3.
Step 7: For Process P3:
P3 Need <= Available
6, 0, 0 <= 7, 5, 5 condition is true
New Available Resource = Available + Allocation
7, 5, 5 + 3, 0, 2 => 10, 5, 7
Hence, the Banker's algorithm shows the system is in a safe state,
with safe sequence P2, P4, P5, P1, P3.
Ans. 3: To grant the request (1, 0, 0) for process P1, we first check
Request <= Need, i.e. (1, 0, 0) <= (7, 4, 3), and Request <=
Available, i.e. (1, 0, 0) <= (3, 3, 2); both conditions hold.
Provisionally allocating the request (Available becomes (2, 3, 2))
and re-running the safety algorithm still yields the safe sequence
P2, P4, P5, P1, P3, so the system can accept this request
immediately.
                               UNIT III STORAGE MANAGEMENT
Main Memory - Swapping - Contiguous Memory Allocation – Paging - Structure of the Page Table -
Segmentation, Segmentation with paging; Virtual Memory - Demand Paging – Copy on Write – Page
Replacement - Allocation of Frames –Thrashing
                                              PART - A
S.No.                             Question and Answer                        CO      Univ QP
                                                                                   (Month/Year)
    1.   What is the difference between user-level instructions and              CO3   (Apr/May
         privileged instructions?                                                        2017)
         A non-privileged (i.e. user-level) instruction is an instruction that
         any application or user can execute. A privileged instruction, on the
         other hand, is an instruction that can only be executed in kernel
         mode. Instructions are divided in this manner because privileged
         instructions could harm the kernel.
    2.   Define Belady's anomaly.                                                CO3   (Apr/May
         In computer storage, Bélády's anomaly is the phenomenon in                      2017)
         which increasing the number of page frames results in an increase in
         the number of page faults for certain memory access patterns. This
         phenomenon is commonly experienced when using the first-in first-
         out (FIFO) page replacement algorithm.
    3.   What is the purpose of paging the page table?                           CO3   (Nov/Dec
         In certain situations the page tables could become large enough that            2016)
         by paging the page tables, one could simplify the memory
         allocation problem (by ensuring that everything is allocated as
         fixed-size pages as opposed to variable-sized chunks) and also
         enable the swapping of portions of the page table that are not
         currently used.
    4.   Why are page sizes always powers of 2?                                  CO3   (Nov/Dec
                                                                                         2016)
         Recall that paging is implemented by breaking up an address into a
         page and offset number. It is most efficient to break the address
         into X page bits and Y offset bits, rather than perform arithmetic on
         the address to calculate the page number and offset. Because each
         bit position represents a power of 2, splitting an address between
         bits results in a page size that is a power of 2.
    5.   Define demand paging in memory management.                              CO3   (Nov/Dec
         In virtual memory systems, demand paging is a type of swapping in               2015)
         which pages of data are not copied from disk to RAM until they are
         needed.
    6.   What are the steps required to handle a page fault in demand            CO3   (Nov/Dec
         paging?                                                                         2015)
         Steps in handling page fault:
          1. Operating system looks at another table to decide:
          • Invalid reference - abort
          • Just not in memory
          2. Find free frame
        3. Swap page into frame via scheduled disk operation
        4. Reset tables to indicate page now in memory Set validation bit
           =v
        5. Restart the instruction that caused the page fault
7.    Define Demand paging and write advantages.                                      CO3   (Nov/Dec
      Virtual memory is commonly implemented by demand paging. In demand                      2021)
      paging, the pager brings only those necessary pages into memory instead of
      swapping in a whole process. Thus it avoids reading into memory pages that
      will not be used anyway, decreasing the swap time and the amount of
      physical memory needed.
8.    Difference between internal and external fragmentation                          CO3   (Nov/Dec
       Internal fragmentation is space allocated to a process that the process           2011)
       cannot use. This space is unusable by the system until the process              (Apr/May
       releases it.                                                                      2018)
      External fragmentation exists when total free memory is enough for the
      new process but it's not contiguous and can't satisfy the request. Storage is
      fragmented into small holes.
9.    What do you mean by thrashing?                                                  CO3   (Apr/May
       Thrashing occurs when page faults and swapping happen so frequently that          2019)
       the operating system spends more time swapping pages than executing             (Apr/May
       them. Because of thrashing, CPU utilization becomes low or                        2015)
       negligible. Thrashing is the coincidence of high page traffic and low           (Nov/Dec
       CPU utilization.                                                                  2021)
 10.   What do you mean by a page fault? Under what circumstances do page          CO3   (Apr/May
      occur? State the actions taken by the operating system when a page                      2019)
      fault occurs?
       A page fault is the situation in which a page is not in main memory when
       the processor needs to access it.
      A page fault occurs when an access to a page that has not been brought into
      main memory takes place. The operating system verifies the memory
      access, aborting the program if it is invalid. If it is valid a free frame is
      located and I/O requested to read the needed page into the free frame.
11.   Define demand paging in memory management. What are the steps                   CO3   (Nov/Dec
      required to handle a page fault in demand paging.                                       2015)
       A demand paging system is quite similar to a paging system with
           swapping where processes reside in
           secondary memory and pages are loaded only on demand, not in
           advance.
        When a context switch occurs, the operating system does not copy any
            of the old program’s pages out to the disk or any of the new program’s
            pages into the main memory. Instead, it just begins executing the new
            program after loading the first page and fetches that program’s pages as
            they are referenced.
        While executing a program, if the program references a page which is
            not available in the main memory because it was swapped out a little
            earlier, the processor treats this invalid memory reference as a page fault
            and transfers control from the program to the operating system to
            demand the page back into memory.
12. Consider a logical address space of eight pages of 1024 words each,               CO3
    mapped onto a physical memory of 32 frames.
             a. How many bits are there in the logical address?
             b. How many bits are there in the physical address?
     Each page/frame holds 1K words; we need 10 bits to uniquely address
     each of those 1024 words. A logical address space of 8 pages requires
     3 bits to address each page uniquely, giving 3 + 10 = 13 bits in total.
     Physical memory has 32 = 2^5 frames, so we need 5 bits to address each
     frame, requiring 5 + 10 = 15 bits in total.
                      a. Logical address: 13 bits
                      b. Physical address: 15 bits
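     The bit counts can be checked with a short calculation (the numbers are
     taken directly from the question):

```python
import math

pages, words_per_page, frames = 8, 1024, 32

offset_bits = int(math.log2(words_per_page))  # 10 bits for 1024 words
page_bits = int(math.log2(pages))             # 3 bits for 8 pages
frame_bits = int(math.log2(frames))           # 5 bits for 32 frames

print(page_bits + offset_bits)    # logical address: 13 bits
print(frame_bits + offset_bits)   # physical address: 15 bits
```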
13. In the IBM/370, memory protection is provided through the use                     CO3
    of keys. A key is a 4-bit quantity. Each 2K block of memory has
    a key (the storage key) associated with it. The CPU also has a
    key (the protection key) associated with it. A store operation is
    allowed only if both keys are equal, or if either is zero. Which of
    the following memory-management schemes could be used
    successfully with this hardware?
                 Bare machine
                 Single-user system
                 Multiprogramming with a fixed number of processes
                 Multiprogramming with a variable number of processes
                 Paging
                 Segmentation
    Answer:
              a. Protection is not necessary; set the system key to 0.
              b. Set the system key to 0 when in supervisor mode.
              c. Region sizes must be fixed in increments of 2K bytes; allocate a
                 key with each memory block.
              d. Same as above.
              e. Frame sizes must be in increments of 2K bytes; allocate a key with
                 each page.
              f. Segment sizes must be in increments of 2K bytes; allocate a key
                 with each segment.
14. What is address binding?                                                          CO3
    The process of associating program instructions and data to physical
    memory addresses is called address binding, or relocation.
 15. Why are page sizes always powers of 2?                                          CO3
                   Recall that paging is implemented by breaking up an
                      address into a page and offset number.
                   It is most efficient to break the address into X page bits and
                      Y offset bits, rather than perform arithmetic on the address
                      to calculate the page number and offset.
                    Because each bit position represents a power of 2,
                       splitting an address between bits results in a page size that
                       is a power of 2.
16. Consider the following page reference string: 1, 2, 3, 4, 2, 1, 5, 6, 2, 1,       CO3
    2, 3, 7, 6, 3, 2, 1, 2, 3, 6. How many page faults would occur for the
    following replacement algorithms, assuming one, two, three, four,
    five, six, or seven frames? Remember all frames are initially empty,
      so your first unique pages will all cost one fault each. • LRU
      replacement • FIFO replacement • Optimal replacement
      Number of frames         LRU FIFO Optimal
      1                          20      20     20
      2                          18      18     15
      3                          15      16     11
      4                          10      14      8
      5                          8       10     7
      6                          7       10     7
      7                          7        7     7
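       The table can be reproduced with a short simulation of the three
       policies; the counts printed below were checked against the
       three-frame row of the table:

```python
def faults(ref, n, policy):
    """Count page faults for FIFO, LRU, or OPT with n frames."""
    frames, misses = [], 0
    for i, p in enumerate(ref):
        if p in frames:
            if policy == "LRU":              # refresh recency on a hit
                frames.remove(p)
                frames.append(p)
            continue
        misses += 1
        if len(frames) < n:                  # free frame still available
            frames.append(p)
            continue
        if policy in ("FIFO", "LRU"):        # evict oldest / least recent
            frames.pop(0)
        else:                                # OPT: evict page used farthest ahead
            future = ref[i + 1:]
            victim = max(frames, key=lambda q: future.index(q)
                         if q in future else len(future) + 1)
            frames.remove(victim)
        frames.append(p)
    return misses

r = [1, 2, 3, 4, 2, 1, 5, 6, 2, 1, 2, 3, 7, 6, 3, 2, 1, 2, 3, 6]
print(faults(r, 3, "LRU"), faults(r, 3, "FIFO"), faults(r, 3, "OPT"))
# 15 16 11
```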
17.   What is the purpose of paging the page tables?                                 CO3
      In certain situations the page tables could become large enough that by
      paging the page tables, one could simplify the memory allocation problem
      (by ensuring that everything is allocated as fixed-size pages as opposed to
      variable-sized chunks) and also enable the swapping of portions of page
      table that are not currently used.
18.   Define TLB.                                                                    CO3
          Translation Look-Aside Buffer: a table in the processor's memory that
           contains information about the pages in memory the processor has
           accessed recently.
          The TLB enables faster computing because it allows address
           translation to take place independently of the normal address-translation
           pipeline.
19.   Define logical address and physical address.                                   CO3
       An address generated by the CPU is referred to as a logical address. An
       address seen by the memory unit, that is, the one loaded into the memory
       address register of the memory, is commonly referred to as a physical address.
20.   Define Copy-on-write.                                                          CO3
       Copy-on-write finds its main use in virtual memory operating systems:
       when a process creates a copy of itself, the pages in memory that might be
       modified by either the process or its copy are marked copy-on-write.
21.   What is the main function of the memory-management unit?                       CO3
       The runtime mapping from virtual to physical addresses is done by a
       hardware device called the memory-management unit (MMU).
22.   What are the common strategies to select a free hole from a set of             CO3
      available holes?
      The most common strategies are
               A. First fit B. Best fit         C. Worst fit
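       The three strategies differ only in which hole they choose. A minimal
       sketch (the hole sizes and the 212 KB request are illustrative values,
       borrowed from a classic textbook exercise):

```python
def pick_hole(holes, request, strategy):
    """Return the index of the hole chosen for a request, or None if none fits."""
    fits = [(i, size) for i, size in enumerate(holes) if size >= request]
    if not fits:
        return None
    if strategy == "first":                      # first hole big enough
        return fits[0][0]
    if strategy == "best":                       # smallest adequate hole
        return min(fits, key=lambda t: t[1])[0]
    return max(fits, key=lambda t: t[1])[0]      # worst: largest hole

holes = [100, 500, 200, 300, 600]                # hole sizes in KB
print(pick_hole(holes, 212, "first"))  # 1 (the 500 KB hole)
print(pick_hole(holes, 212, "best"))   # 3 (the 300 KB hole)
print(pick_hole(holes, 212, "worst"))  # 4 (the 600 KB hole)
```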
23.   What are the various page replacement algorithms?                              CO3
               •        FIFO page replacement
               •        Optimal page replacement
               •        LRU page replacement
               •        LRU approximation page replacement
               •        Counting based page replacement
               •        Page buffering algorithm.
24.   Differentiate a page from a segment.                                           CO3
      In segmentation, the address space is typically divided into a preset number
      of segments like data segment (read/write), code segment (read-only), stack
      (read/write) etc. And the programs are divided into these segments
      accordingly. Logical addresses are represented as tuple <segment, offset>.
       While with paging, the address space is divided into a sequence of fixed size
       units called "pages". And logical addresses take the form of a tuple <page,
       offset>.
   25. What is virtual memory? Mention its advantages.                               CO3
       Virtual memory is a technique that allows the execution of processes that
       may not be completely in memory. It is the separation of user logical
       memory from physical memory. This separation provides an extremely
       large virtual memory, when only a smaller physical memory is available.
       The main visible advantage of this scheme is that programs can be larger
       than physical memory.
                                              PART - B
S.No                           Question and Answer                                 CO   Univ QP
  .                                                                                    (Month/Y
                                                                                           ear)
  1. Explain Main Memory in detail.                                                CO3 (Apr/May
     Computer memory is just like the human brain. It is used to store                    2019)
     data/information and instructions. It is a data storage unit or a data             (Nov/Dec
     storage device where data is to be processed and instructions required               2021)
      for processing are stored. Both input and output data can be stored
      here.
     Characteristics of Main Memory:
       It is a faster computer memory as compared to secondary memory.
       It is a semiconductor memory.
      It is usually a volatile memory.
      It is the main memory of the computer.
      A computer system cannot run without primary memory.
     In general, memory is of three types:
      Primary memory
      Secondary memory
      Cache memory
     Now we discuss each type of memory one by one in detail:
     1. Primary Memory: It is also known as the main memory of the
     computer system. It is used to store data and programs or instructions
     during computer operations. It uses semiconductor technology and
     hence is commonly called semiconductor memory. Primary memory
     is of two types:
     (i) RAM (Random Access Memory): It is a volatile memory.
      Volatile memory retains information only while power is supplied. If the
      power supply fails or is interrupted, all the data & information in
      this memory will be lost. RAM is used for booting up or starting the
      computer. It temporarily stores the programs/data that have to be
      executed by the processor. RAM is of two types:
       SRAM (Static RAM): It uses transistors, and the circuits of this
          memory are capable of retaining their state as long as power is
          applied. This memory consists of a number of flip-flops, with
          each flip-flop storing 1 bit. It has less access time and hence is
          faster.
     DRAM (Dynamic RAM): It uses capacitors and transistors and
      stores the data as a charge on the capacitors. They contain
      thousands of memory cells. It needs refreshing of the charge on the
      capacitors after a few milliseconds. This memory is slower than
      SRAM.
 (ii) ROM (Read Only Memory): It is a non-volatile memory. Non-
 volatile memory retains information even when the power supply
 fails or is interrupted. ROM is used to store information
 that is used to operate the system. As its name, read-only
 memory, suggests, we can only read the programs and data stored on it.
 It contains some electronic fuses that can be programmed for a piece
 of specific information. The information is stored in the ROM in binary
 format. It is also known as permanent memory. ROM is of four types:
 MROM(Masked ROM): Hard-wired devices with a pre-
     programmed collection of data or instructions were the first
     ROMs. Masked ROMs are a type of low-cost ROM that works in
     this way.
 PROM (Programmable Read Only Memory): This read-only
     memory is modifiable once by the user. The user purchases a
     blank PROM and uses a PROM program to put the required
     contents into the PROM. Its content can’t be erased once written.
 EPROM (Erasable Programmable Read Only Memory): It is
     an extension to PROM where you can erase the content of ROM
     by exposing it to Ultraviolet rays for nearly 40 minutes.
 EEPROM (Electrically Erasable Programmable Read Only
     Memory): Here the written contents can be erased electrically.
      You can erase and reprogram an EEPROM up to 10,000 times.
      Erasing and programming take very little time, i.e., nearly 4-10
      ms (milliseconds). Any area in an EEPROM can be wiped and
     programmed selectively.
2. Secondary Memory: It is also known as auxiliary memory and
backup memory. It is a non-volatile memory and used to store a large
amount of data or information. The data or information stored in
secondary memory is permanent, and it is slower than primary
memory. A CPU cannot access secondary memory directly. The
data/information from the auxiliary memory is first transferred to the
main memory, and then the CPU can access it.
Characteristics of Secondary Memory:
 It is a slow memory but reusable.
 It is a reliable and non-volatile memory.
 It is cheaper than primary memory.
 The storage capacity of secondary memory is large.
 A computer system can run without secondary memory.
 In secondary memory, data is stored permanently even when the
     power is off.
Types of secondary memory:
(i) Magnetic Tapes: Magnetic tape is a long, narrow strip of plastic
film with a thin, magnetic coating on it that is used for magnetic
recording. Bits are recorded on tape as magnetic patches called
RECORDS that run along many tracks. Typically, 7 or 9 bits are
recorded concurrently. Each track has one read/write head, which
allows data to be recorded and read as a sequence of characters. It can
be stopped, started moving forward or backward, or rewound.
(ii) Magnetic Disks: A magnetic disc is a circular metal or a plastic
plate and these plates are coated with magnetic material. The disc is
used on both sides. Bits are stored in magnetized surfaces in locations
called tracks that run in concentric rings. Sectors are typically used to
break tracks into pieces.
Hard discs are discs that are permanently attached and cannot be
removed by a single user.
(iii) Optical Disks: It’s a laser-based storage medium that can be
written to and read. It is reasonably priced and has a long lifespan.
The optical disc can be taken out of the computer by occasional
users. Types of Optical Disks :
(a) CD – ROM:
  It stands for Compact Disc Read-Only Memory.
 Information is written to the disc by using a controlled laser beam
     to burn pits on the disc surface.
 It has a highly reflecting surface, which is usually aluminum.
 The diameter of the disc is 5.25 inches.
 16000 tracks per inch is the track density.
 The capacity of a CD-ROM is 600 MB, with each sector storing
     2048 bytes of data.
  The data transfer rate is about 4800 KB/sec and the access
      time is around 80 milliseconds.
(b) WORM-(WRITE ONCE READ MANY):
 A user can only write data once.
 The information is written on the disc using a laser beam.
 It is possible to read the written data as many times as desired.
 They keep lasting records of information but access time is high.
 It is possible to rewrite updated or new data to another part of the
     disc.
 Data that has already been written cannot be changed.
 Usual size – 5.25 inch or 3.5 inch diameter.
 The usual capacity of 5.25 inch disk is 650 MB,5.2GB etc.
(c) DVDs:
 The term “DVD” stands for “Digital Versatile/Video Disc,” and
     there are two sorts of DVDs: (i)DVDR (writable) and (ii)
     DVDRW (Re-Writable)
 DVD-ROMS (Digital Versatile Discs): These are read-only
     memory (ROM) discs that can be used in a variety of ways. When
     compared to CD-ROMs, they can store a lot more data. It has a
     thick polycarbonate plastic layer that serves as a foundation for
      the other layers. It’s an optical memory from which data can be
      read.
 DVD-R: It is a writable optical disc that can be used just once.
     It’s a DVD that can be recorded. It’s a lot like WORM. DVD-
     ROMs have capacities ranging from 4.7 to 17 GB. The capacity
     of 3.5 inch disk is 1.3 GB.
3. Cache Memory: It is a type of high-speed semiconductor memory
that can help the CPU run faster. Between the CPU and the main
memory, it serves as a buffer. It is used to store the data and
programs that the CPU uses the most frequently.
Advantages of cache memory:
 It is faster than the main memory.
 When compared to the main memory, it takes less time to access
     it.
  It holds the programs that can be run within a short period of time.
  It stores data for temporary use.
Disadvantages of cache memory:
 Because of the semiconductors used, it is very expensive.
 The size of the cache (amount of data it can store) is usually
     small.
Sample Problem
Question 1. What are the types of memories?
Solution:
There are three types of memory:
 Primary memory
 Secondary memory
 Cache memory
Question 2. What is Volatile and Non Volatile memory?
Solution:
 Volatile memory retains information only while the power supply is on.
 If the power supply is off, all the data & information in this memory
 will be lost. For example, RAM (Random Access Memory). Whereas
 non-volatile memory retains information even when the power supply
 is off. For example, ROM (Read Only Memory).
Question 3. What is the full form of CD-ROM?
Solution:
 CD-ROM stands for Compact Disc Read-Only Memory.
Question 4. How many 128 * 8 memory chips are required for
  a memory capacity of 4096*16?
  Solution:
  Number of chips required = Required RAM size/ Available chip
  capacity
  = (4096 * 16)/(128 * 8) = 64
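   The chip-count arithmetic can be verified directly:

```python
required_words, required_bits = 4096, 16   # target memory: 4096 x 16
chip_words, chip_bits = 128, 8             # available chips: 128 x 8

# Number of chips = total required capacity / capacity of one chip
chips = (required_words * required_bits) // (chip_words * chip_bits)
print(chips)  # 64
```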
  Question 5. Explain any four differences between RAM and
  ROM?
  Solution:
     RAM                                   ROM
     It stands for Random Access           It stands for Read-Only
     Memory.                               Memory.
     It is the fastest memory.             It is slower memory as
                                           compared to RAM.
     It is volatile memory.                It is non-volatile memory.
     In this memory, data is erased        In this memory, data is not
     when the power is off.                erased even if the power is off.
2. Describe swapping           and    Contiguous     Memory         Allocation CO3     (Nov/Dec
   elaborately.                                                                          2011)
  Swapping is a memory management scheme in which any process can
  be temporarily swapped from main memory to secondary memory so
  that the main memory can be made available for other processes. It is
  used to improve main memory utilization. In secondary memory, the
  place where the swapped-out process is stored is called swap space.
  The purpose of the swapping in operating system is to access the data
  present in the hard disk and bring it to RAM so that the application
  programs can use it. The thing to remember is that swapping is used
  only when data is not present in RAM.
   Although the process of swapping affects the performance of the
   system, it helps to run larger processes and more than one process. This is
   the reason why swapping is also referred to as memory compaction.
   The concept of swapping is divided into two further concepts: swap-in
   and swap-out.
      o Swap-out is a method of removing a process from RAM and
          adding it to the hard disk.
      o Swap-in is a method of removing a program from a hard disk
          and putting it back into the main memory or RAM.
   Example: Suppose the user process's size is 2048 KB and the standard
   hard disk used for swapping has a data transfer rate of 1 Mbps. Now we
   will calculate how long it will take to transfer the process from main
   memory to secondary memory.
User process size is 2048Kb
Data transfer rate is 1Mbps = 1024 kbps
Time = process size / transfer rate
= 2048 / 1024
= 2 seconds
= 2000 milliseconds
 Taking swap-in and swap-out together, the process will take 4000
 milliseconds.
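 The same arithmetic, written out (keeping the example's convention of
 treating 1 Mbps as 1024 KB per second):

```python
process_kb = 2048      # user process size in KB
rate_kb_per_s = 1024   # 1 Mbps taken as 1024 KB per second, as in the example

one_way_ms = process_kb / rate_kb_per_s * 1000
print(one_way_ms)       # 2000.0 ms for a single transfer
print(one_way_ms * 2)   # 4000.0 ms for swap-out plus swap-in
```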
Advantages of Swapping
     1. It helps the CPU to manage multiple processes within a single
         main memory.
     2. It helps to create and use virtual memory.
     3. Swapping allows the CPU to perform multiple tasks
         simultaneously. Therefore, processes do not have to wait very
         long before they are executed.
     4. It improves the main memory utilization.
Disadvantages of Swapping
     1. If the computer system loses power, the user may lose all
         information related to the program in case of substantial
         swapping activity.
      2. If the swapping algorithm is not good, the composite method
          can increase the number of page faults and decrease the overall
         processing performance.
CONTIGUOUS MEMORY ALLOCATION
Memory Management Techniques are basic techniques that are used
in managing the memory in operating system. Memory Management
Techniques are basically classified into two categories:
(i) Contiguous
(ii) Non-contiguous
Contiguous Memory Management Techniques:
In this technique, memory is allotted in a continuous way to the
processes. It has two types:
Fixed Partition Scheme :
In the fixed partition scheme, memory is divided into fixed number of
partitions. Fixed means number of partitions are fixed in the memory.
   In the fixed partition, in every partition only one process will be
   accommodated. Degree of multi-programming is restricted by
   number of partitions in the memory. Maximum size of the process is
   restricted by maximum size of the partition. Every partition is
   associated with the limit registers.
     Limit Registers: It has two limits:
    Lower Limit: Starting address of the partition.
    Upper Limit: Ending address of the partition.
   Internal Fragmentation is found in fixed partition scheme.
   To overcome the problem of internal fragmentation, instead of fixed
   partition scheme, variable partition scheme is used.
   Variable Partition Scheme :
   In the variable partition scheme, initially memory will be single
   continuous free block. Whenever the request by the process arrives,
   accordingly partition will be made in the memory. If the smaller
   processes keep on coming then the larger partitions will be made into
   smaller partitions.
   External Fragmentation is found in variable partition scheme.
   To overcome the problem of external fragmentation, compaction
   technique is used or non-contiguous memory management techniques
   are used.
   Compaction:
   Moving all the processes toward the top or towards the bottom to
   make free available memory in a single continuous place is called as
   compaction. Compaction is undesirable to implement because it
   interrupts all the running processes in the memory.
3. Explain Paging concept in detail.                                       CO3    (Apr/May
   Paging is a memory management scheme that eliminates the need for                 2019)
   contiguous allocation of physical memory. The process of retrieving             (Nov/Dec
   processes in the form of pages from the secondary storage into the                2011)
 main memory is known as paging. The basic purpose of paging is to              (Apr/May
 divide each process into pages. Additionally, frames will be used                 2015)
to split the main memory. This scheme permits the physical address              (Nov/Dec
space of a process to be non – contiguous.                                        2021)
 Let us look at some important terminology:
 Logical Address or Virtual Address (represented in bits): An
    address generated by the CPU
 Logical Address Space or Virtual Address Space( represented in
    words or bytes): The set of all logical addresses generated by a
    program
 Physical Address (represented in bits): An address actually
    available on memory unit
 Physical Address Space (represented in words or bytes): The set
    of all physical addresses corresponding to the logical addresses
Example:
  If Logical Address = 31 bits, then Logical Address Space =
     2^31 words = 2 G words (1 G = 2^30)
  If Logical Address Space = 128 M words = 2^7 * 2^20 words, then
     Logical Address = log2 2^27 = 27 bits
  If Physical Address = 22 bits, then Physical Address Space =
     2^22 words = 4 M words (1 M = 2^20)
  If Physical Address Space = 16 M words = 2^4 * 2^20 words, then
     Physical Address = log2 2^24 = 24 bits
The mapping from virtual to physical address is done by the memory
management unit (MMU) which is a hardware device and this
mapping is known as paging technique.
 The Physical Address Space is conceptually divided into a
    number of fixed-size blocks, called frames.
  The Logical Address Space is also split into fixed-size blocks,
     called pages.
 Page Size = Frame Size
Let us consider an example:
 Physical Address = 12 bits, then Physical Address Space = 4 K
    words
 Logical Address = 13 bits, then Logical Address Space = 8 K
    words
 Page size = frame size = 1 K words (assumption)
Address generated by CPU is divided into
 Page number(p): Number of bits required to represent the pages
    in Logical Address Space or Page number
 Page offset(d): Number of bits required to represent particular
    word in a page or page size of Logical Address Space or word
    number of a page or page offset.
Physical Address is divided into
 Frame number(f): Number of bits required to represent the
    frame of Physical Address Space or Frame number.
      Frame offset(d): Number of bits required to represent particular
       word in a frame or frame size of Physical Address Space or word
       number of a frame or frame offset.
   The hardware implementation of page table can be done by using
    dedicated registers. But the usage of registers for the page table is
    satisfactory only if the page table is small. If the page table contains a
    large number of entries then we can use a TLB (Translation Look-aside
    Buffer), a special, small, fast look-up hardware cache.
    The TLB is associative, high speed memory.
    Each entry in TLB consists of two parts: a tag and a value.
    When this memory is used, then an item is compared with all tags
       simultaneously. If the item is found, then corresponding value is
       returned.
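    A TLB lookup can be modeled as a small cache consulted before the page
    table; the dictionary below stands in for the hardware's parallel tag
    comparison, so this is only an illustrative sketch:

```python
def translate(page, tlb, page_table):
    """Look up a frame number, trying the TLB before the page table."""
    if page in tlb:                  # TLB hit: fast path
        return tlb[page], "hit"
    frame = page_table[page]         # TLB miss: walk the page table
    tlb[page] = frame                # cache the translation for next time
    return frame, "miss"

tlb, page_table = {}, {0: 5, 1: 9, 2: 3}
print(translate(1, tlb, page_table))  # (9, 'miss')
print(translate(1, tlb, page_table))  # (9, 'hit')
```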
4. Draw the Structure of the Page Table and explain.                      CO3
   In Operating Systems, Paging is a storage mechanism used to retrieve
   processes from the secondary storage into the main memory in the
   form of pages.
   The main idea behind the paging is to divide each process in the form
   of pages. The main memory will also be divided in the form of frames.
   One page of the process is to be stored in one of the frames of the
   memory. The pages can be stored at the different locations of the
   memory but the priority is always to find the contiguous frames or
   holes.
   Pages of the process are brought into the main memory only when they
   are required otherwise they reside in the secondary storage.
    Different operating systems define different frame sizes. The sizes of
    all frames must be equal. Considering the fact that the pages are
    mapped to the frames in paging, the page size needs to be the same as
    the frame size.
                                Example
Let us consider the main memory size 16 Kb and Frame size is 1 KB
therefore the main memory will be divided into the collection of 16
frames of 1 KB each.
 There are 4 processes in the system, namely P1, P2, P3 and P4, of 4 KB
each. Each process is divided into pages of 1 KB each so that one page
can be stored in one frame.
Initially, all the frames are empty therefore pages of the processes will
get stored in the contiguous way.
Frames, pages and the mapping between the two is shown in the image
below.
 Let us consider that P2 and P4 are moved to the waiting state after some
 time. Now, 8 frames become empty and therefore other pages can be
 loaded in that empty place. The process P5 of size 8 KB (8 pages) is
 waiting inside the ready queue.
 Given that we have 8 non-contiguous frames available in the memory
 and paging provides the flexibility of storing the process at
 different places, we can load the pages of process P5 in
 the place of P2 and P4.
                      Memory Management Unit
The purpose of Memory Management Unit (MMU) is to convert the
logical address into the physical address. The logical address is the
address generated by the CPU for every page while the physical
address is the actual address of the frame where each page will be
stored.
When a page is to be accessed by the CPU by using the logical
address, the operating system needs to obtain the physical address to
access that page physically.
The logical address has two parts.
    1. Page Number
    2. Offset
Memory management unit of OS needs to convert the page number to
the frame number.
Page Table
Page Table is a data structure used by the virtual memory system to
store the mapping between logical addresses and physical addresses.
Logical addresses are generated by the CPU for the pages of a process, so
they are used by the processes. Physical addresses are the actual frame
addresses in memory; they are used by the hardware or, more specifically,
by the RAM subsystem.
The image given below considers:
Physical Address Space = M words
Logical Address Space = L words
Page Size = P words
Physical Address = log₂ M = m bits
Logical Address = log₂ L = l bits
Page offset = log₂ P = p bits
   The CPU always accesses processes through their logical addresses.
   However, main memory recognizes physical addresses only.
   In this situation, a unit called the Memory Management Unit comes into
   the picture. It converts the page number of the logical address into the
   frame number of the physical address; the offset remains the same in
   both addresses.
   To perform this task, the Memory Management Unit needs a special kind of
   mapping, which is provided by the page table. The page table stores the
   frame number corresponding to each page number; in other words, it maps
   a page number to the page's actual location (frame number) in memory.
   The image given below shows how the required word of the frame is
   accessed with the help of the offset.
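The page-number-to-frame-number conversion described above can be sketched in a few lines. The page size and the page-table contents below are assumed values for illustration:

```python
PAGE_SIZE = 1024  # 1 KB pages, as in the example above (assumed)

# Hypothetical page table for a 4-page process: page number -> frame number
page_table = {0: 5, 1: 6, 2: 9, 3: 10}

def translate(logical_address):
    # Split the logical address into page number and offset.
    page_number, offset = divmod(logical_address, PAGE_SIZE)
    # Replace the page number with its frame number; the offset is unchanged.
    frame_number = page_table[page_number]
    return frame_number * PAGE_SIZE + offset
```

For example, logical address 2055 is page 2, offset 7; page 2 maps to frame 9, giving physical address 9 * 1024 + 7 = 9223.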
5. Discuss Segmentation and Segmentation with paging with a neat CO3                (Nov/Dec
   sketch.                                                                            2021)
   A process is divided into segments. The chunks that a program is divided
   into, which are not necessarily all of the same size, are called
   segments. Segmentation gives the user's view of the process, which
   paging does not; here the user's view is mapped to physical memory.
   There are two types of segmentation:
   1. Virtual memory segmentation – Each process is divided into a number
      of segments, not all of which are resident at any one point in time.
   2. Simple segmentation – Each process is divided into a number of
      segments, all of which are loaded into memory at run time, though not
      necessarily contiguously.
There is no simple relationship between logical addresses and physical
addresses in segmentation. A table called the Segment Table stores the
information about all such segments. It maps the two-dimensional logical
address into a one-dimensional physical address. Each table entry has:
 Base Address: the starting physical address where the segment resides
    in memory.
 Limit: the length of the segment.
Translation of a two-dimensional logical address to a one-dimensional
physical address:
The address generated by the CPU is divided into:
 Segment number (s): the number of bits required to represent the
   segment.
 Segment offset (d): the number of bits required to represent the offset
   within the segment.
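The base-and-limit check performed during segment-table translation can be sketched as follows; the segment table contents here are hypothetical values chosen for illustration:

```python
# Hypothetical segment table: segment number -> (base address, limit)
segment_table = {0: (1400, 1000), 1: (6300, 400), 2: (4300, 1100)}

def translate(segment_number, offset):
    base, limit = segment_table[segment_number]
    # The offset must be smaller than the segment's limit,
    # otherwise the hardware raises an addressing trap.
    if offset >= limit:
        raise MemoryError("trap: offset beyond segment limit")
    return base + offset
```

For instance, translate(2, 53) returns 4300 + 53 = 4353, while an offset at or beyond the limit traps.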
Advantages of Segmentation –
 No internal fragmentation.
 The segment table consumes less space than a page table in paging.
 As a complete module is loaded all at once, segmentation improves CPU
    utilization.
 Segmentation is close to the user's view of memory: users can divide
    programs into modules, and these modules are simply the separate parts
    of a process's code.
 The user specifies the segment size, whereas in paging the hardware
    determines the page size.
 Segmentation can be used to separate data and apply protection on a
    per-segment basis.
Disadvantage of Segmentation –
 As processes are loaded and removed from the memory, the free
    memory space is broken into little pieces, causing External
    fragmentation.
 Overhead is associated with keeping a segment table for each
    activity.
 Due to the need for two memory accesses, one for the segment
    table and the other for main memory, access time to retrieve the
    instruction increases.
SEGMENTATION WITH PAGING
Pure segmentation is not very popular and is not used in many operating
systems. However, segmentation can be combined with paging to get the best
features of both techniques.
In Segmented Paging, the process's address space is divided into
variable-size segments, which are further divided into fixed-size pages.
    1. Pages are smaller than segments.
    2. Each Segment has a page table which means every program has
        multiple page tables.
    3. The logical address is represented as Segment Number (base
        address), Page number and page offset.
Segment Number → selects the appropriate segment.
Page Number → points to the exact page within the segment.
Page Offset → used as an offset within the page frame.
Each page table contains information about every page of its segment, and
the segment table contains information about every segment. Each segment
table entry points to a page table, and every page table entry maps to one
of the pages within the segment.
Translation of logical address to physical address
The CPU generates a logical address which is divided into two parts:
segment number and segment offset. The segment offset must be less than
the segment limit. The offset is further divided into page number and page
offset. To locate the exact entry in the page table, the page number is
added to the page table base address.
The frame number obtained, combined with the page offset, is mapped into
main memory to get the desired word in the page of the given segment.
Advantages of Segmented Paging
   1. It reduces memory usage.
   2. Page table size is limited by the segment size.
   3. Segment table has only one entry corresponding to one actual
      segment.
   4. External Fragmentation is not there.
       5. It simplifies memory allocation.
   Disadvantages of Segmented Paging
       1. Internal Fragmentation will be there.
       2. The complexity level will be much higher as compared to paging.
       3. Page Tables need to be contiguously stored in the memory.
6. Explain the concept of Virtual Memory in detail.                              CO3
   Virtual Memory is a storage scheme that gives the user the illusion of
   having a very big main memory. This is done by treating a part of
   secondary memory as if it were main memory.
   In this scheme, the user can load processes bigger than the available
   main memory, under the illusion that enough memory is available to load
   them.
   Instead of loading one big process into main memory, the operating
   system loads different parts of more than one process into main memory.
   By doing this, the degree of multiprogramming is increased and,
   therefore, CPU utilization also increases.
   How Virtual Memory Works?
   Virtual memory has become quite common in modern systems. In this
   scheme, whenever some pages need to be loaded into main memory for
   execution and memory is not available for that many pages, then instead
   of refusing to load the pages, the OS searches for areas of RAM that
   have been least used recently or are not referenced, and copies them to
   secondary memory to make space for the new pages in main memory.
   Since this entire procedure happens automatically, it makes the computer
   feel as if it has unlimited RAM.
   Demand Paging
   Demand Paging is a popular method of virtual memory management.
   In demand paging, the pages of a process that are least used are kept in
   secondary memory.
   A page is copied into main memory only when it is demanded, i.e., when a
   page fault occurs. There are various page replacement algorithms which
   are used to determine the pages that will be replaced; we will discuss
   each of them later in detail.
   Snapshot of a virtual memory management system
   Let us assume 2 processes, P1 and P2, containing 4 pages each. Each page
   is 1 KB in size. The main memory contains 8 frames of 1 KB each. The OS
   resides in the first two partitions. The 1st page of P1 is stored in the
   third partition, and the other frames are shown as filled with the
   different pages of the processes in main memory.
   The page tables of both processes are 1 KB in size, so each can fit in
   one frame. The page tables of both processes contain various information
   that is also shown in the image.
   The CPU contains a register which holds the base address of the page
   table: 5 in the case of P1 and 7 in the case of P2. This page table base
   address is added to the page number of the logical address when
   accessing the corresponding page table entry.
   Advantages of Virtual Memory
      1. The degree of multiprogramming is increased.
      2. Users can run large applications with less physical RAM.
      3. There is no need to buy more RAM.
   Disadvantages of Virtual Memory
      1. The system becomes slower since swapping takes time.
      2. Switching between applications takes more time.
      3. The user has less hard disk space for their own use.
7. Explain about given memory management techniques. (i)                        CO3    (Nov/Dec
    Partitioned allocation (ii) Paging and translation look-aside                        2015)
    buffer.                                                                           (Apr/May
                                                                                         2017)
     Partition Allocation Methods in Memory Management
     Partitioned allocation: Memory is divided into different blocks or
     partitions. Each process is allocated according to the requirement.
   In Partition Allocation, when there is more than one partition freely
   available to accommodate a process’s request, a partition must be
   selected. To choose a particular partition, a partition allocation
   method is needed. A partition allocation method is considered better
   if it avoids internal fragmentation.
   When it is time to load a process into the main memory and if there is
   more than one free block of memory of sufficient size then the OS
   decides which free block to allocate.
   There are different placement algorithms:
A. First Fit
B. Best Fit
C. Worst Fit
D. Next Fit
1. First Fit: The first sufficient block from the top of main memory is
allocated. It scans memory from the beginning and chooses the first
available block that is large enough, i.e., it allocates the first hole
that is large enough.
2. Best Fit: The process is allocated to the smallest sufficient partition
among the freely available partitions. It searches the entire list of
holes to find the smallest hole whose size is greater than or equal to the
size of the process.
3. Worst Fit: The process is allocated to the largest sufficient partition
among the freely available partitions in main memory. It is the opposite
of the best-fit algorithm: it searches the entire list of holes to find
the largest hole and allocates it to the process.
4. Next Fit: Next fit is similar to first fit, but it searches for the
first sufficient partition starting from the last allocation point.
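The four placement policies can be sketched as functions that pick the index of a free block for a request. The block list and sizes below are illustrative, not taken from the text:

```python
def first_fit(blocks, size):
    # Scan from the top of memory and take the first block that fits.
    for i, b in enumerate(blocks):
        if b >= size:
            return i
    return None

def best_fit(blocks, size):
    # Take the smallest block that is still large enough.
    fits = [(b, i) for i, b in enumerate(blocks) if b >= size]
    return min(fits)[1] if fits else None

def worst_fit(blocks, size):
    # Take the largest available block.
    fits = [(b, i) for i, b in enumerate(blocks) if b >= size]
    return max(fits)[1] if fits else None

def next_fit(blocks, size, last=0):
    # Like first fit, but start scanning from the last allocation point.
    n = len(blocks)
    for k in range(n):
        i = (last + k) % n
        if blocks[i] >= size:
            return i
    return None
```

For free blocks [100, 500, 200, 300, 600] and a 212 KB request: first fit picks index 1 (the 500 block), best fit index 3 (300), and worst fit index 4 (600).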
Translation Lookaside Buffer (TLB) in Paging
Paging requires an extra memory access to read the page table before the
actual data can be accessed. To overcome this problem, a high-speed cache
of page table entries called a Translation Lookaside Buffer (TLB) is set
up. The TLB is a special cache that contains the most recently used page
table entries. Given a virtual address, the processor examines the TLB; if
the page table entry is present (TLB hit), the frame number is retrieved
and the real address is formed. If the page table entry is not found in
the TLB (TLB miss), the page number is used as an index into the page
table. If the page is not in main memory, a page fault is issued; the TLB
is then updated to include the new page entry.
Steps in TLB hit
1. CPU generates a virtual (logical) address.
2. It is checked in TLB (present).
3. The corresponding frame number is retrieved, which now tells
   where the main memory page lies.
Steps in TLB miss
1. CPU generates a virtual (logical) address.
2. It is checked in TLB (not present).
3. Now the page number is matched to the page table residing in the
   main memory (assuming the page table contains all PTE).
4. The corresponding frame number is retrieved, which now tells
   where the main memory page lies.
5. The TLB is updated with the new PTE (if there is no space, one of the
   replacement techniques, e.g. FIFO, LRU or MFU, comes into the picture).
Effective memory access time(EMAT)
      The TLB is used to reduce effective memory access time, as it is a
      high-speed associative cache.
      EMAT = h*(c + m) + (1 - h)*(c + 2m)
      where
      h = hit ratio of the TLB
      m = memory access time
      c = TLB access time
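Plugging sample numbers into the EMAT formula makes it concrete; the hit ratio and access times below are assumed values:

```python
def emat(h, c, m):
    # TLB hit:  one TLB lookup + one memory access       -> c + m
    # TLB miss: TLB lookup + page-table access + memory  -> c + 2m
    return h * (c + m) + (1 - h) * (c + 2 * m)
```

With h = 0.9, c = 10 ns and m = 100 ns, EMAT = 0.9 × 110 + 0.1 × 210 = 120 ns.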
8.    Consider the following page reference string: 1, 2, 3, 4, 2, 1, 5, 6, CO3       (Apr/May
      2, 1, 2, 3, 7, 6, 3, 2, 1, 2, 3, 6.                                                2015)
      Identify the no.of page faults would occur for the following                     (Nov/Dec
      replacement algorithms, assuming one, two, three, four, five,                      2015)
      six, or seven frames? Remember all frames are initially empty,
      so your first unique pages will all cost one fault each.
      a.LRU replacement
      b. FIFO replacement
      c.Optimal replacement
      Answer:
      Number of frames      LRU FIFO          Optimal
               1            20       20         20
               2            18       18         15
               3            15       16         11
               4            10       14          8
               5             8       10          7
               6             7       10          7
               7             7        7          7
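The fault counts in the table can be verified with a short simulation; the three policies below are straightforward sketches of FIFO, LRU and Optimal replacement:

```python
from collections import OrderedDict

# The reference string from the question above.
REFS = [1, 2, 3, 4, 2, 1, 5, 6, 2, 1, 2, 3, 7, 6, 3, 2, 1, 2, 3, 6]

def fifo_faults(refs, frames):
    memory, queue, faults = set(), [], 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.discard(queue.pop(0))  # evict the oldest page
            memory.add(page)
            queue.append(page)
    return faults

def lru_faults(refs, frames):
    memory = OrderedDict()  # insertion order doubles as recency order
    faults = 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)  # evict least recently used
            memory[page] = True
    return faults

def optimal_faults(refs, frames):
    memory, faults = set(), 0
    for i, page in enumerate(refs):
        if page in memory:
            continue
        faults += 1
        if len(memory) == frames:
            future = refs[i + 1:]
            # Evict the page whose next use is farthest away (or never).
            victim = max(memory,
                         key=lambda p: future.index(p) if p in future
                         else len(future) + 1)
            memory.discard(victim)
        memory.add(page)
    return faults
```

With 3 frames this reproduces the table's counts: LRU 15, FIFO 16, Optimal 11.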
9.    Compare paging with segmentation in terms of the amount of CO3                   (Nov/Dec
      memory required by the address translation structures in order                     2018)
      to convert virtual addresses to physical addresses.
      Paging and segmentation are both memory management techniques
      used in operating systems to map virtual addresses to physical
      addresses. Let's compare them in terms of the amount of memory
      required by the address translation structures:
      Paging:
      In paging, the virtual address space and the physical address space are
      divided into fixed-size blocks called pages.
      The page table is used for address translation. It consists of page table
      entries (PTEs), where each entry maps a page of the virtual address
      space to a page in the physical address space.
      The size of the page table depends on the size of the virtual address
      space, as each page needs an entry in the page table.
      The memory required for the page table is proportional to the size of
      the virtual address space, not the actual amount of memory used by
    the process.
    In summary, the amount of memory required for the page table in
    paging is influenced by the size of the virtual address space.
    Segmentation:
    In segmentation, the virtual address space is divided into logical
    segments, such as code, data, and stack. Each segment is assigned a
    base address and a limit.
    The segment table is used for address translation. It contains segment
    table entries (STE), where each entry maps a segment to a base
    address and a limit.
    The memory required for the segment table is proportional to the
    number of segments in the virtual address space, regardless of the size
    of individual segments.
    In summary, the amount of memory required for the segment table in
    segmentation is influenced by the number of segments in the virtual
    address space.
    Comparison:
    Paging tends to have a more predictable and straightforward memory
    structure, as it depends on the size of the virtual address space.
    Segmentation's memory structure is influenced by the number of
    segments, making it more variable and potentially less predictable.
    In practice, modern systems often use a combination of paging and
    segmentation, known as segmented paging or hierarchical paging, to
    leverage the benefits of both approaches. This allows for efficient
    memory management while addressing the limitations of each
    technique individually.
10. Discuss situation under which the most frequently used page CO3                (Apr/May
    replacement algorithm generates fewer page faults than the least                 2018)
    frequently used page replacement algorithm. Also discuss under
    which circumstances the opposite holds.
    The Most Frequently Used (MFU) and Least Frequently Used (LFU)
    page replacement algorithms are based on different strategies for
    determining which pages to keep in memory and which to replace.
    The effectiveness of these algorithms can depend on the access
    patterns of the processes running on the system. Let's discuss
    situations where MFU might generate fewer page faults than LFU and
    vice versa:
    Most Frequently Used (MFU) Generating Fewer Page Faults:
    Temporal Locality:
    If there is a high degree of temporal locality in the access patterns,
    where certain pages are repeatedly accessed in a short period, MFU is
    likely to perform well.
    MFU replaces pages based on the frequency of access. So, if certain
     pages are accessed frequently, they will be kept in memory, leading
     to fewer page faults.
     Short-Term Bursty Access:
     In scenarios where the workload exhibits short bursts of intense
     activity on specific pages, MFU can adapt quickly by keeping the
     frequently accessed pages in memory.
     Stable Access Patterns:
     If the access patterns remain relatively stable over time, MFU can
     efficiently adapt to the most frequently used pages, minimizing page
     faults.
     Least Frequently Used (LFU) Generating Fewer Page Faults:
     Changing Access Patterns:
     If the workload has dynamic and changing access patterns, where
     previously infrequently accessed pages become suddenly popular,
     LFU might perform better.
     LFU keeps track of the frequency of accesses over the long term and
     may be more adaptive to changes in access patterns.
     Variable Page Usage:
     When there is a mix of pages with varying access frequencies, LFU
     might outperform MFU. LFU considers the historical frequency of
     page accesses and may be better suited for diverse workloads.
     Avoiding Thrashing:
     In scenarios where the MFU algorithm might be overly aggressive in
     keeping frequently used pages, leading to thrashing (excessive page
     swapping), LFU may help by considering the overall frequency of
     page usage.
     In summary, the choice between MFU and LFU can depend on the
     nature of the workload and the characteristics of the memory access
     patterns. Temporal locality, short-term bursty access, and stable
     access patterns might favor MFU, while changing access patterns,
     variable page usage, and the need to avoid thrashing might favor
     LFU. It's worth noting that real-world workloads often exhibit a
     combination of these characteristics, and the choice of the page
     replacement algorithm can impact system performance.
11. Explain about the difference between internal fragmentation CO3              (Nov/Dec
     and external fragmentation.                                                   2016)
                Internal Fragmentation vs. External Fragmentation
     Definition:
        Internal – Unutilized memory within a single allocated block.
        External – Unutilized memory scattered throughout the system due
        to memory being allocated in non-contiguous blocks.
     Location:
        Internal – Within the allocated memory block.
        External – Outside allocated memory blocks.
     Cause:
        Internal – Allocation of fixed-size memory blocks, leading to
        unused space within a block.
        External – Allocation and deallocation of variable-sized memory
        blocks, resulting in small free memory chunks scattered across the
        system.
     Visibility:
        Internal – Visible only to the process to which the memory is
        allocated.
        External – Visible system-wide; affects overall system
        performance.
     Resolution:
        Internal – Typically addressed by adjusting block sizes or using
        memory allocation techniques that minimize wasted space.
        External – Addressed through compaction or memory management
        techniques that coalesce free memory blocks into larger contiguous
        blocks.
     Impact on Utilization:
        Internal – Reduces the efficiency of memory usage within
        individual allocated blocks.
        External – Reduces the overall system's ability to use available
        memory efficiently due to scattered free spaces.
     Common in:
        Internal – Fixed-size memory allocation schemes like paging or
        fixed-size partitioning.
        External – Variable-size memory allocation schemes like dynamic
        partitioning.
     Example:
        Internal – Allocating 4 KB of memory for a data structure that
        only needs 3.5 KB, leaving 0.5 KB of wasted space within the
        block.
        External – Free memory blocks scattered throughout the system that
        cannot be used to satisfy larger memory requests.
     Understanding the distinctions between internal and external
     fragmentation is crucial for effective memory management in
     operating systems. Internal fragmentation occurs within individual
     allocated blocks, impacting the efficiency of memory usage for a
     specific process. External fragmentation, on the other hand, affects
     the overall system by scattering free memory throughout, making it
     challenging to allocate contiguous memory for larger processes.
12. Most systems allow programs to allocate more memory to its CO3                 (Nov/Dec
     address space during execution. Data allocated in the heap                      2018)
     segments of programs is an example of such allocated memory.
     What is required to support dynamic memory allocation in the
     following schemes?
     i) Contiguous memory allocation ii)Pure segmentation iii) Pure
     paging
    a. Contiguous Memory Allocation:
    In contiguous memory allocation, memory is divided into fixed-size
    partitions, and each partition is allocated to a process. To support
    dynamic memory allocation in this scheme, the following
    requirements are needed:
1. Memory Management Unit (MMU): An MMU is required to
translate logical addresses to physical addresses and perform memory
protection.
2. Memory Allocation/Deallocation Mechanism: A mechanism is
needed to allocate and deallocate memory dynamically. This
mechanism should keep track of free and allocated memory blocks
and efficiently manage the allocation and deallocation requests.
3. Memory Fragmentation Handling: Contiguous memory allocation
can lead to two types of fragmentation: external fragmentation and
internal fragmentation. Techniques such as compaction and relocation
may be employed to reduce fragmentation.
b. Pure Segmentation:
In pure segmentation, the logical address space of a process is divided
into variable-sized segments, each representing a different unit of the
program (e.g., code segment, data segment). To support dynamic
memory allocation in pure segmentation, the following requirements
are necessary:
1. Segmentation Table: A segmentation table is required to store the
base address and length of each segment for a process. This table is
used to translate logical addresses to physical addresses.
2. Segmentation Mechanism: A mechanism is needed to allocate and
deallocate segments dynamically. This mechanism should manage the
segmentation table and ensure that segments are appropriately
allocated and deallocated.
3. Memory Protection: As different segments may have different
access permissions, memory protection mechanisms are required to
enforce access control and prevent unauthorized access to segments.
c. Pure Paging:
In pure paging, the logical address space of a process is divided into
fixed-sized pages, and physical memory is divided into fixed-sized
frames. To support dynamic memory allocation in pure paging, the
following requirements are necessary:
1. Page Table: A page table is required to store the mapping between
logical pages and physical frames. Each entry in the page table
contains the frame number corresponding to a logical page.
2. Page Replacement Algorithm: In case the physical memory is full
and a new page needs to be allocated, a page replacement algorithm is
needed to select a page to be evicted from the memory. Common
page replacement algorithms include LRU (Least Recently Used),
        FIFO (First-In-First-Out), and Optimal.
        3. Memory Management Unit (MMU): An MMU is required to
        translate logical addresses to physical addresses using the page table.
        4. Memory Allocation/Deallocation Mechanism: A mechanism is
        needed to allocate and deallocate pages dynamically. This mechanism
        should manage the page table, allocate available pages to processes,
        and handle page faults during memory access.
                                            PART - C
S.No                           Question and Answer                                 CO   Univ QP
  .                                                                                     (Month /
                                                                                         Year)
  1. Discuss Demand Paging and Copy on Write with a neat sketch.            CO3
   Demand paging is a technique used in virtual memory systems where
     the pages are brought in the main memory only when required or
     demanded by the CPU. Hence, it is also called lazy swapper because
     the swapping of pages is done only when the CPU requires it. Virtual
     memory is commonly implemented in demand paging.
     In demand paging, the pager brings only those necessary pages into the
     memory instead of swapping in a whole process. Thus, demand paging
   avoids reading into memory pages that will not be used anyway,
     decreasing the swap time and the amount of physical memory needed.
       The demand paging system depends on the page table implementation
       because the page table helps map logical memory to physical memory.
      A valid-invalid bit in each page table entry indicates whether the
      page is in memory (valid) or not (invalid). All valid pages exist in
      primary memory, and invalid pages exist in secondary memory. When a
      process goes ahead and accesses its pages, the following things
      happen:
          1. The process attempts to access a page.
          2. If the page is valid, all processing instructions work as
             normal.
          3. If the page is found to be invalid, a page fault arises.
          4. The OS then determines whether a valid reference for the page
             exists in auxiliary memory. If it does not, the process is
             terminated; otherwise, the needed page is paged in.
          5. A disk operation is performed to fetch the required page into
             primary memory.
Examples of Demand Paging
Suppose we have to execute a process P having four pages P0, P1, P2,
and P3. Currently, in the page table, we have pages P1 and P3.
    1. If the CPU wants to access page P2 of a process P, it will first
        search the page in the page table.
    2. As the page table does not contain this page, a trap (page fault)
        occurs. Control goes to the operating system as soon as the trap
        is generated, and a context switch happens.
    3. The OS system will put the process in a waiting/ blocked state.
        The OS system will now search that page in the backing store
        or secondary memory.
    4. The OS will then read the page from the backing store and load
        it to the main memory.
    5. Next, the OS system will update the page table entry
        accordingly.
    6. Finally, the control is taken back from the OS, and the
        execution of the process is resumed.
Hence, the operating system follows these steps whenever a page fault
occurs and the required page is brought into memory.
Whenever a page fault occurs, steps 2 to 6 above are carried out. The time
taken to service the page fault is called the Page fault service time.
Effective Memory Access Time: When the page fault rate is 'p' while
executing any process, then the effective memory access time is
calculated as follows:
Effective Memory Access time = (p)*(s) + (1-p)*(m)
Where
    o p is the page fault rate.
    o s is the page fault service time.
    o m is the main memory access time.
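A quick numeric check of the formula; the fault rate and timings below are assumed values:

```python
def effective_access_time(p, s, m):
    # p: page fault rate, s: page fault service time, m: memory access time
    return p * s + (1 - p) * m
```

With p = 0.001, s = 8 ms (8,000,000 ns) and m = 200 ns, the effective access time is 0.001 × 8,000,000 + 0.999 × 200 = 8199.8 ns; even a tiny fault rate dominates the access time.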
Advantages of Demand Paging
Here are the following advantages of demand paging in the operating
system, such as:
    o It increases the degree of multiprogramming as many processes
        can be present in the main memory simultaneously.
    o There is a more efficient use of memory as processes having a
        size more than the size of the main memory can also be
        executed using this mechanism because we are not loading the
        whole page at a time.
    o It allows virtual memory to scale beyond the size of physical
        memory.
    o If any program is larger than physical memory, it helps run this
        program without compaction.
    o Partition management is simpler.
    o It is more useful in a time-sharing system.
    o It has no limitations on the level of multi-programming.
    o It eliminates external fragmentation.
    o Pages can be swapped in and out easily.
Disadvantages of Demand Paging
Below are some disadvantages of demand paging in an operating
system, such as:
    o The amount of processor overhead and the number of tables
        used for handling the page faults is greater than in simple page
        management techniques.
    o It has more probability of internal fragmentation.
    o Its memory access time is longer.
    o Page Table Length Register (PTLR) has a limit for virtual
        memory.
    o The page map table requires additional memory and registers.
COPY ON WRITE
Copy on Write, or simply COW, is a resource management technique.
One of its main uses is in the implementation of the fork system call,
where the parent and child process share virtual memory pages.
In UNIX-like operating systems, the fork() system call creates a
duplicate of the parent process, called the child process.
The idea behind copy-on-write is that when a parent process creates a
child process, both processes initially share the same pages in
memory, and these shared pages are marked as copy-on-write. If either
process tries to modify a shared page, only then is a copy of that
page created; the modification is made on the copy, so the other
process is unaffected.
Suppose, there is a process P that creates a new process Q and then
process P modifies page 3.
The figures below show what happens before and after process P
modifies page 3.
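The sharing-then-copying behaviour described above can be sketched with a toy model (illustrative only; the `Page` and `Process` classes are made up for this sketch, not a real kernel API):

```python
class Page:
    def __init__(self, data):
        self.data = data

class Process:
    def __init__(self, pages):
        self.pages = pages                # page table: index -> Page

    def fork(self):
        # Child shares the very same Page objects -- nothing is copied.
        return Process(list(self.pages))

    def write(self, i, data):
        # Copy-on-write: the page is duplicated only when modified
        # (a real kernel copies only while the page is still shared).
        self.pages[i] = Page(data)

p = Process([Page("a"), Page("b"), Page("c")])
q = p.fork()
print(p.pages[2] is q.pages[2])  # True  -> page 3 is shared after fork
p.write(2, "c'")                 # P modifies page 3
print(p.pages[2] is q.pages[2])  # False -> P now has a private copy
print(q.pages[2].data)           # c     -> Q still sees the original data
```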
2. Describe Page Replacement algorithms with examples.                      CO3 (Apr/May
   In an operating system that uses paging for memory management, a                2015)
   page replacement algorithm is needed to decide which page needs to           (Nov/Dec
   be replaced when a new page comes in.                                          2021)
   Page Fault – A page fault happens when a running program accesses a
   memory page that is mapped into the virtual address space, but not
   loaded in physical memory.
   Since actual physical memory is much smaller than virtual memory,
   page faults happen. In case of page fault, Operating System might have
   to replace one of the existing pages with the newly needed page.
   Different page replacement algorithms suggest different ways to
   decide which page to replace. The target for all algorithms is to reduce
   the number of page faults.
   Page Replacement Algorithms :
   First In First Out (FIFO) –
   This is the simplest page replacement algorithm. In this algorithm, the
   operating system keeps track of all pages in the memory in a queue,
   the oldest page is in the front of the queue. When a page needs to be
   replaced, the page at the front of the queue is selected for removal.
   Example 1: Consider the page reference string 1, 3, 0, 3, 5, 6, 3 with
   3 page frames. Find the number of page faults.
Initially all slots are empty, so when 1, 3, 0 come they are allocated
to the empty slots —> 3 Page Faults.
When 3 comes, it is already in memory —> 0 Page Faults.
Then 5 comes; it is not in memory, so it replaces the oldest page,
i.e., 1 —> 1 Page Fault.
6 comes; it is also not in memory, so it replaces the oldest page,
i.e., 3 —> 1 Page Fault.
Finally, when 3 comes again, it is not in memory, so it replaces
0 —> 1 Page Fault.
Belady’s anomaly – Belady’s anomaly proves that it is possible to have
more page faults when increasing the number of page frames while
using the First in First Out (FIFO) page replacement algorithm. For
example, if we consider reference string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4
and 3 slots, we get 9 total page faults, but if we increase slots to 4, we
get 10 page faults.
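Both the FIFO example and Belady's anomaly above can be verified with a small simulation:

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults for a reference string under FIFO replacement."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:          # evict the oldest page
                frames.discard(queue.popleft())
            frames.add(page)
            queue.append(page)
    return faults

print(fifo_faults([1, 3, 0, 3, 5, 6, 3], 3))           # 6
belady = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
print(fifo_faults(belady, 3), fifo_faults(belady, 4))  # 9 10
```

The second line reproduces the anomaly: the same reference string incurs 9 faults with 3 frames but 10 faults with 4 frames.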
Optimal Page replacement –
In this algorithm, pages are replaced which would not be used for the
longest duration of time in the future.
Example 2: Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4,
2, 3, 0, 3, 2 with 4 page frames. Find the number of page faults.
Initially all slots are empty, so 7, 0, 1, 2 are allocated to the
empty slots —> 4 Page Faults.
0 is already there —> 0 Page Fault.
When 3 comes, it takes the place of 7 because 7 is not used for the
longest duration of time in the future —> 1 Page Fault.
0 is already there —> 0 Page Fault.
   4 takes the place of 1 —> 1 Page Fault.
   Now for the further page reference string —> 0 Page fault because
   they are already available in the memory.
   Optimal page replacement is perfect, but not possible in practice as the
   operating system cannot know future requests. The use of Optimal
   Page replacement is to set up a benchmark so that other replacement
   algorithms can be analyzed against it.
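The optimal (MIN) walkthrough above can likewise be checked by simulation, since the full reference string is known in advance:

```python
def optimal_faults(refs, n_frames):
    """Count page faults under the optimal (Belady/MIN) policy:
    evict the page whose next use lies farthest in the future."""
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            future = refs[i + 1:]
            # A page never used again gets distance infinity.
            victim = max(frames, key=lambda p: future.index(p)
                         if p in future else float("inf"))
            frames.discard(victim)
        frames.add(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
print(optimal_faults(refs, 4))  # 6
```

The result, 6 faults, matches the step-by-step count in the example (4 initial faults plus faults for 3 and 4).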
   Least Recently Used –
   In this algorithm page will be replaced which is least recently used.
   Example 3: Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4,
   2, 3, 0, 3, 2 with 4 page frames. Find the number of page faults.
   Initially all slots are empty, so 7, 0, 1, 2 are allocated to the
   empty slots —> 4 Page Faults.
   0 is already there —> 0 Page Fault.
   When 3 comes, it takes the place of 7 because 7 is the least recently
   used —> 1 Page Fault.
   0 is already in memory —> 0 Page Fault.
   4 takes the place of 1 —> 1 Page Fault.
   Now for the further page reference string —> 0 Page fault because
   they are already available in the memory.
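The LRU example can be verified the same way; here the list `recent` plays the role of the recency ordering:

```python
def lru_faults(refs, n_frames):
    """Count page faults under LRU replacement.
    `recent` keeps resident pages ordered least- to most-recently used."""
    recent, faults = [], 0
    for page in refs:
        if page in recent:
            recent.remove(page)        # hit: move to most-recent position
        else:
            faults += 1
            if len(recent) == n_frames:
                recent.pop(0)          # evict the least recently used page
        recent.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
print(lru_faults(refs, 4))  # 6
```

For this particular string, LRU and the optimal policy happen to produce the same count (6 faults); in general LRU only approximates the optimal result.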
3. State how frames are allocated during paging with diagrams CO3
   wherever necessary.
   The main memory of the operating system is divided into frames. A
   process is stored in these frames, and once its pages are loaded into
   frames, the CPU can run it. The operating system must therefore set
   aside enough frames for each process, and it uses various algorithms
   to decide how frames are assigned.
   Demand paging is used to implement virtual memory, an essential
   operating system feature. It requires the development of a page
   replacement mechanism and a frame allocation system. If you have
   multiple processes, the frame allocation techniques are utilized to
   define how many frames to allot to each one. A number of factors
   constrain the strategies for allocating frames:
    1. You cannot assign more frames than the total number of frames
        available.
    2. A specific number of frames should be assigned to each
        process. This limitation is due to two factors. The first is that
        when the number of frames assigned drops, the page fault ratio
        grows, decreasing the process's execution performance.
        Second, there should be sufficient frames to hold all the
        multiple pages that any instruction may reference.
There are mainly five ways of frame allocation algorithms in the OS.
These are as follows:
    1. Equal Frame Allocation
    2. Proportional Frame Allocation
    3. Priority Frame Allocation
    4. Global Replacement Allocation
    5. Local Replacement Allocation
Equal Frame Allocation
In equal frame allocation, the available frames are divided equally
among the processes in the OS. For example, if the system has 30 frames
and 7 processes, each process will get 4 frames. The 2 frames that are not
assigned to any system process may be used as a free-frame buffer
pool in the system.
Disadvantage
In a system with processes of varying sizes, assigning equal frames to
each process makes little sense; frames assigned to a small process
beyond its needs are simply wasted.
Proportional Frame Allocation
The proportional frame allocation technique assigns frames based on
the size needed for execution and the total number of frames in
memory.
The allocated frames for a process pi of size si are ai = (si/S)*m, in
which S represents the total of all process sizes, and m represents the
number of frames in the system.
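The formula can be sketched in a few lines (the process sizes and frame count below are assumed example values):

```python
def proportional_allocation(sizes, m):
    """Frames per process: a_i = (s_i / S) * m, rounded down,
    where S is the sum of all process sizes and m the total frames."""
    S = sum(sizes)
    return [(s * m) // S for s in sizes]

# Two processes of 10 and 127 pages sharing 62 frames (assumed example).
print(proportional_allocation([10, 127], 62))  # [4, 57]
```

The larger process receives proportionally more frames; any remainder left by rounding down can go to a free-frame pool.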
Disadvantage
The only drawback of this algorithm is that it doesn't allocate frames
based on priority. Priority frame allocation solves this problem.
Priority Frame Allocation
Priority frame allocation assigns frames based on process priority. If
a process has a high priority and requires more frames, then many
frames will be allocated to it. After that, lower-priority processes
are allocated their frames.
Global Replacement Allocation
When a process requires a page that isn't currently in memory, it may
bring the page in and select a frame from the set of all frames, even
if that frame is currently allocated to another process. In other
words, one process may take a frame from another.
   Advantages
   Process performance is not hampered, resulting in higher system
   throughput.
   Disadvantages
   A process cannot solely control its own page fault ratio; the paging
   behavior of other processes also influences the number of pages a
   process has in memory.
   Local Replacement Allocation
   When a process requires a page that isn't already in memory, it can
   bring it in and assign it a frame from its set of allocated frames.
   Advantages
   The paging behavior of a specific process has an effect on the pages in
   memory and the page fault ratio.
   Disadvantages
   A low priority process may obstruct a high priority process by refusing
   to share its frames.
   Global Vs. Local Replacement Allocation
   The number of frames assigned to a process does not change under a
   local replacement strategy. Under global replacement, on the other
   hand, a process may also select frames granted to other processes and
   thus increase the number of frames allocated to it.
4. Explain the concept of thrashing elaborately.                             CO3
   Thrashing is a condition or a situation when the system is spending a
   major portion of its time servicing the page faults, but the actual
   processing done is very negligible.
   In computer science, thrashing is the poor performance of a virtual
   memory (or paging) system that arises when the same pages are loaded
   repeatedly because there is not enough main memory to keep them resident.
   Depending on the configuration and algorithm, the actual throughput
   of a system can degrade by multiple orders of magnitude.
   In computer science, thrashing occurs when a computer's virtual
   memory resources are overused, leading to a constant state of paging
   and page faults, inhibiting most application-level processing. It causes
   the performance of the computer to degrade or collapse. The situation
   can continue indefinitely until the user closes some running
   applications or the active processes free up additional virtual memory
   resources.
   To know more clearly about thrashing, first, we need to know about
   page fault and swapping.
       o Page fault: We know every program is divided into some
           pages. A page fault occurs when a program attempts to access
           data or code in its address space but is not currently located in
           the system RAM.
       o Swapping: Whenever a page fault happens, the operating
        system will try to fetch that page from secondary memory and
        try to swap it with one of the pages in RAM. This process is
        called swapping.
When page faults and swapping happen very frequently, the operating
system has to spend most of its time swapping pages rather than doing
useful work; this state is known as thrashing. Because of thrashing,
CPU utilization is reduced to a negligible level.
The basic concept involved is that if a process is allocated too few
frames, then there will be too many and too frequent page faults. As a
result, no valuable work would be done by the CPU, and the CPU
utilization would fall drastically.
The long-term scheduler would then try to improve the CPU utilization
by loading some more processes into the memory, thereby increasing
the degree of multiprogramming. Unfortunately, this would result in a
further decrease in the CPU utilization, triggering a chained reaction of
higher page faults followed by an increase in the degree of
multiprogramming, called thrashing.
Algorithms during Thrashing
Whenever thrashing starts, the operating system tries to apply either
the Global page replacement Algorithm or the Local page replacement
algorithm.
1. Global Page Replacement
Since global page replacement can bring any page, it tries to bring
more pages whenever thrashing is found. But what actually will
happen is that no process gets enough frames, and as a result, the
thrashing will increase more and more. Therefore, the global page
replacement algorithm is not suitable when thrashing happens.
2. Local Page Replacement
Unlike the global page replacement algorithm, local page replacement
will select pages which only belong to that process. So there is a
chance to reduce the thrashing. But it is proven that there are many
disadvantages if we use local page replacement. Therefore, local page
replacement is just an alternative to global page replacement in a
thrashing scenario.
Causes of Thrashing
Programs or workloads may cause thrashing, and it results in severe
performance problems, such as:
     o If CPU utilization is too low, the operating system increases
          the degree of multiprogramming by introducing a new process,
          and a global page replacement algorithm is used. The CPU
          scheduler sees the decreasing CPU utilization and increases
          the degree of multiprogramming further.
    o CPU utilization is plotted against the degree of
         multiprogramming.
    o As the degree of multiprogramming increases, CPU utilization
         also increases.
    o If the degree of multiprogramming is increased further,
         thrashing sets in, and CPU utilization drops sharply.
    o So, at this point, to increase CPU utilization and to stop
         thrashing, we must decrease the degree of multiprogramming.
How to Eliminate Thrashing
Thrashing has some negative impacts on hard drive health and system
performance. Therefore, it is necessary to take some actions to avoid it.
To resolve the problem of thrashing, here are the following methods,
such as:
    o Adjust the swap file size: If the system swap file is not
         configured correctly, disk thrashing can occur.
    o Increase the amount of RAM: As insufficient memory can
         cause disk thrashing, one solution is to add more RAM. With
         more memory, the computer can handle tasks easily without
         having to page excessively. Generally, this is the best
         long-term solution.
    o Decrease the number of applications running on the
         computer: Too many applications running in the background
         consume a large share of system resources, and the shortage
         that remains can result in thrashing. Closing some
         applications releases resources and helps avoid thrashing to
         some extent.
    o Replace programs: Replace memory-heavy programs with
         equivalents that use less memory.
Techniques to Prevent Thrashing
The Local Page replacement is better than the Global Page
replacement, but local page replacement has many disadvantages, so it
is sometimes not helpful. Therefore below are some other techniques
that are used to handle thrashing:
1. Locality Model
A locality is a set of pages that are actively used together. The locality
model states that as a process executes, it moves from one locality to
another. Thus, a program is generally composed of several different
localities which may overlap.
For example, when a function is called, it defines a new locality where
memory references are made to the function call instructions, local and
global variables, etc. Similarly, when the function is exited, the process
leaves this locality.
2. Working-Set Model
This model is based on the above-stated concept of the Locality
Model.
The basic principle states that if we allocate enough frames to a
process to accommodate its current locality, it will only fault whenever
it moves to some new locality. But if the allocated frames are lesser
than the size of the current locality, the process is bound to thrash.
According to this model, based on parameter A, the working set is
defined as the set of pages in the most recent 'A' page references.
Hence, all the actively used pages would always end up being a part of
the working set.
The accuracy of the working set is dependent on the value of
parameter A. If A is too large, then working sets may overlap. On the
other hand, for smaller values of A, the locality might not be covered
entirely.
If D is the total demand for frames and WSSi is the working set size for
process i,
D = Σ WSSi
Now, if 'm' is the number of frames available in the memory, there are
two possibilities:
     o D>m, i.e., total demand exceeds the number of frames, then
         thrashing will occur as some processes would not get enough
         frames.
     o D<=m, then there would be no thrashing.
If there are enough extra frames, then some more processes can be
loaded into the memory. On the other hand, if the summation of
working set sizes exceeds the frames' availability, some of the
processes have to be suspended (swapped out of memory).
This technique prevents thrashing along with ensuring the highest
degree of multiprogramming possible. Thus, it optimizes CPU
utilization.
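The working-set computation and the D versus m test above can be sketched as follows (the reference string, window A, second process's WSS, and frame count are assumed example values):

```python
def working_set(refs, t, A):
    """Pages referenced in the most recent A references ending at time t."""
    return set(refs[max(0, t - A + 1): t + 1])

refs = [1, 2, 1, 5, 7, 7, 7, 7, 5, 1]
wss = len(working_set(refs, len(refs) - 1, A=5))  # WSS of this process
D, m = wss + 4, 8      # assume a second process with WSS = 4; m frames
print(wss, D <= m)     # 3 True -> demand fits, no thrashing expected
```

If D exceeded m, the model says some process should be suspended and swapped out until the total demand fits in memory again.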
3. Page Fault Frequency
A more direct approach to handle thrashing is the one that uses the
Page-Fault Frequency concept.
The problem associated with thrashing is the high page fault rate, and
thus, the concept here is to control the page fault rate.
If the page fault rate is too high, it indicates that the process has too
few frames allocated to it. On the contrary, a low page fault rate
indicates that the process has too many frames.
Upper and lower limits can be established on the desired page fault
rate, as shown in the diagram.
If the page fault rate falls below the lower limit, frames can be
removed from the process. Similarly, if the page faults rate exceeds the
upper limit, more frames can be allocated to the process.
In other words, the graphical state of the system should be kept limited
to the rectangular region formed in the given diagram.
If the page fault rate is high and there are no free frames, some of
the processes can be suspended, and the frames allocated to them can
be reallocated to other processes. The suspended processes can be
restarted later.
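The control loop implied by the upper and lower limits can be sketched as follows (the threshold values and frame counts are assumptions for illustration, not figures from the text):

```python
def pff_adjust(frames, fault_rate, lower=0.01, upper=0.10):
    """Page-fault-frequency control: give the process a frame when it
    faults above the upper bound, reclaim one when it falls below the
    lower bound (threshold values are assumed examples)."""
    if fault_rate > upper:
        return frames + 1          # too many faults -> allocate a frame
    if fault_rate < lower:
        return max(1, frames - 1)  # too few faults -> remove a frame
    return frames                  # inside the desired band -> no change

print(pff_adjust(10, 0.20))   # 11 -> faulting too often, add a frame
print(pff_adjust(10, 0.005))  # 9  -> hardly faulting, reclaim a frame
print(pff_adjust(10, 0.05))   # 10 -> inside the band, leave as is
```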
                              UNIT IV STORAGE MANAGEMENT
Mass Storage system – Disk Structure - Disk Scheduling and Management; File-System
Interface - File concept - Access methods - Directory Structure - Directory organization - File
system mounting - File Sharing and Protection; File System Implementation - File System
Structure - Directory implementation - Allocation Methods - Free Space Management; I/O
Systems – I/O Hardware, Application I/O interface, Kernel I/O subsystem.
                                              PART - A
S.No.                             Question and Answer                                CO  Univ QP
                                                                                         (Month /
                                                                                          Year)
    1. Distinguish file from dictionary.                                             CO4  (Apr/
                                                                                         May2017)
        A file is any kind of computer document whereas a directory is a
        collection of files and folders.
        Feature-wise comparison of a file and a dictionary:
        o Purpose: a file stores data persistently on storage devices; a
          dictionary stores key-value pairs for efficient data retrieval.
        o Structure: a file is typically organized as a sequence of
          bytes, characters, or records; a dictionary is organized as a
          collection of key-value pairs.
        o Content retrieval: a file is accessed sequentially or
          randomly, depending on the file type; a dictionary is accessed
          by searching for a specific key to retrieve its associated
          value.
        o Data types: a file can store various data types, including
          text, binary, numeric, etc.; the values associated with
          dictionary keys can likewise be of various data types.
        o Search capability: file systems may support basic search
          functions, but the primary focus is on reading and writing
          data; a dictionary provides efficient key-based search
          operations for fast retrieval.
        o Modification: file data can be modified or appended, but
          modifications may require rewriting the entire file; a
          dictionary is efficient for adding, updating, or deleting
          key-value pairs without affecting other entries.
        o Size: a file can be of variable or fixed size, depending on
          the file system and file type; a dictionary has a dynamic size
          based on the number of key-value pairs.
        o Examples: text files, binary files, databases, etc. (file);
          hash tables, associative arrays, JSON objects, etc. (dictionary).
2. Why it is important to scale up system bus and device speed as             CO4   (Nov/Dec
   CPU speed increases?                                                               2016)
   Consider a system which performs 50% I/O and 50% computes.
   Doubling the CPU performance on this system would increase total
   system performance by only 50%. Doubling both system aspects
   would increase performance by 100%. Generally, it is important to
   remove the current system bottleneck, and to increase overall system
   performance, rather than blindly increasing the performance of
   individual system components.
3. Define C-SCAN scheduling.                                                  CO4
   In C-SCAN (Circular SCAN) scheduling, the disk arm moves from one end
   of the disk to the other, servicing requests along the way. On
   reaching the other end, it immediately returns to the beginning of
   the disk without servicing any requests on the return trip, treating
   the cylinders as a circular list. It is a variant of the SCAN
   (elevator) algorithm, which services requests in both directions of
   arm movement; C-SCAN provides a more uniform wait time than SCAN.
4. How does DMA increase system concurrency?                                  CO4     (May
   DMA increases system concurrency by allowing the CPU to perform                    /June
   tasks while the DMA system transfers data via the system and                       2016)
   memory buses.
5. Why rotational latency is not considered in disk scheduling?               CO4     (May
                                                                                      /June
     Most disks do not export their rotational position information to the            2016)
     host. Even if they did, the time for this information to reach the
     scheduler would be subject to imprecision and the time consumed
     by the scheduler is variable, so the rotational position information
     would become incorrect. Further, the disk requests are usually
     given in terms of logical block numbers, and the mapping between
     logical blocks and physical locations is very complex.
6.   List the various file attributes.                                        CO4 (Apr/May
                                                                                     2015)
     A file has certain other attributes, which vary from one operating            (Nov/Dec
     system to another, but typically consist of these: Name, identifier,            2018)
     type, location, size, protection, time, and date and user
     identification
7.   What is HSM? Where it is used?                                           CO4 (Apr/May
                                                                                    2015)
     Hierarchical storage management (HSM) is a data storage
     technique, which automatically moves data between high-cost and
     low-cost storage media. HSM systems exist because high- speed
     storage devices, such as solid state drive arrays, are more expensive
     (per byte stored) than slower devices, such as hard disk drives,
     optical discs and magnetic tape drives.
8.   What are the functions of Virtual File System (VFS) layer in file        CO4   (Nov/Dec
     system implementation?                                                           2015)
     A virtual file system (VFS) or virtual file system switch is an
     abstraction layer on top of a more concrete file system. The purpose
     of a VFS is to allow client applications to access different types of
     concrete file systems in a uniform way. A VFS can, for example, be
     used to access local and network storage devices transparently
     without the client application noticing the difference.
9. List the significance of LDT and GDT in segmentation.                     CO4      (Nov
     LDT contains memory segments which are private to a specific                     /Dec
     program, the GDT contains global segments. The x86 processors                    2018)
     have facilities for automatically switching the current LDT on
     specific machine events, but no facilities for automatically
     switching the GDT.
10. Suppose that the disk rotates at 7200 rpm.                               CO4
    a. what is the average rotational latency of the disk drive?
    b. Identify the seek distance that can be covered in the time?
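Part (a) follows directly from the rotation speed; part (b) cannot be answered without the drive's seek-time specification, which is not given here. A quick calculation for (a):

```python
rpm = 7200
rotation_ms = 60_000 / rpm        # one full rotation in milliseconds
avg_latency_ms = rotation_ms / 2  # on average the head waits half a rotation
print(round(rotation_ms, 2), round(avg_latency_ms, 2))  # 8.33 4.17
```

So a 7200 rpm disk completes a rotation in about 8.33 ms, giving an average rotational latency of about 4.17 ms.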
11. What are the various file operations?                                    CO4     (Nov/
                                                                                    Dec2018)
     The six basic file operations are
            Creating a file
            Writing a file
            Reading a file
            Repositioning within a file
            Deleting a file
            Truncating a file
12. Define typical bad sector transaction.                                    CO4   Apr/May
                                                                                     2018
     The operating system tries to read logical block 87.
         The controller calculates the ECC and finds that the sector
          is bad. It reports this finding to the OS.
         The next time the system is rebooted, a special command is
          run to tell the controller to replace the bad sector with a
          spare.
         After that, whenever the system requests logical block 87,
          the request is translated into the replacement sector's
          address by the controller.
13. What are the informations associated with an open file?                   CO4
    Several pieces of information are associated with an open file
    which may be:
            File pointer
            File open count
            Disk location of the file
            Access rights
14. What are the different accessing methods of a file?                       CO4
     The different types of accessing a file are:
     Sequential access:      Information in the file is accessed
     sequentially
    Direct access: Information in the file can be accessed without any
    particular order.
   Other access methods: Creating index for the file, indexed
   sequential access method (ISAM) etc.
15. Determine the most common schemes for defining the logical                CO4
    structure of a directory?
     The most common schemes for defining the logical structure of a
     directory
           Single-Level Directory
           Two-level Directory
           Tree-Structured Directories
           Acyclic-Graph Directories
           General Graph Directory
16. Define UFD and MFD.                                                       CO4
     In the two-level directory structure, each user has her own user file
     directory (UFD). Each UFD has a similar structure, but lists only
     the files of a single user. When a job starts the system's master file
     directory (MFD) is searched. The MFD is indexed by the user name
     or account number, and each entry points to the UFD for that user.
17. Examine how an index file is used to speed up the access in            CO4
    direct-access files?
    Have an index in memory; the index gives the key and the disk
    location of its corresponding record. Scan the index to find the
    record you want, and then access it directly.
18. What are the various disk-scheduling algorithms?                       CO4
     The various disk-scheduling algorithms are
       • First Come First Served Scheduling
       • Shortest Seek Time First Scheduling
       • SCAN Scheduling
       • C-SCAN Scheduling
       • LOOK scheduling
19. List three ways of allocating storage, and give advantages of          CO4
    each.
     a. Contiguous allocation. Fastest, if no changes are to be made.
     Also easiest for random access files.
     b. Linked allocation. No external fragmentation. File can grow
     without complications.
    c. Indexed allocation. Supports direct access without external
    fragmentation.
20. If the average page faults service time of 25 ms and a memory          CO4
    access time of 100ns. Calculate the effective access time.
     Effective access time = (1 - p) * ma + p * page fault time
      = (1 - p) * 100 + p * 25,000,000 ns     (25 ms = 25,000,000 ns)
      = 100 - 100p + 25,000,000p
      = (100 + 24,999,900p) ns
21. What is Belady’s Anomaly?                                              CO4
    For some page replacement algorithms, the page fault rate may
    increase as the number of allocated frames increases
22. What is meant by Locality of Reference?                                CO4
    The locality model states that, as a process executes, it moves
    from locality to locality.
    Locality is of two types.
        Spatial locality
        Temporal locality
23. What are the drawbacks of Contiguous Allocation of Disk                CO4
    Space?
    The disadvantages are,
      Suffers from external fragmentation
      Suffers from internal fragmentation
      Difficulty in finding space for a new file
      File cannot be extended
      Size of the file is to be declared in advance
24. What are the advantages of Linked Allocation?                           CO4
    The advantages are,
       No external fragmentation
       Size of the file does not need to be declared
25. What are the disadvantages of Linked Allocation?                        CO4
     The disadvantages are,
     Used only for sequential access of files.
     Direct access is not supported
     Memory space required for the pointers.
     Reliability is compromised if the pointers are lost or damaged
                                               PART B
1. Explain about directory structure.                                        CO4   (Apr/May
   • Users store files in the file system using directories. A file                2015)
   system stores the files of many users. Files are kept on secondary              (Apr/May
   storage devices such as hard disks, optical disks and memory disks.             2017)
   These devices support random access.
   • The directory structure organizes and provides information about all
   files in the system. Every partition has a file system, which consists
   of files and directories. Hard disks are split into one or more
   partitions, also called minidisks or volumes. Users can assign names
   to volumes. Each disk contains at least one partition.
   • Some operating systems support up to 24 partitions. A partition can
   also be larger than a single disk, i.e., formed by combining two hard
   disks; such partitions are called "virtual disks". Each partition
   maintains information about its files in a "device directory".
   • Files are stored in directories. A directory is created by the user,
   who is free to create as many directories as needed and assign suitable
   names. A file system contains many directories. A directory maintains
   information about its files, including type, size, date modified and
   name.
   • In some operating systems, the directory itself is considered a file.
   A set of logically associated files is called a directory. The
   directory structure of the file system connects a directory to other
   directories in the system.
• A directory is a data structure that lists the files and subdirectories
in a collection. The directory structure contains a list of entries, one
for each file.
• A directory does not store any user data. A single dot (.) indicates
the current directory and a double dot (..) indicates the parent
directory. The root is identified by the directory '/'. The root file
system has several subdirectories.
• Windows file systems use a letter followed by a colon to specify the
root directory, for example C:, D: or E:.
• UNIX and Linux based file systems use a slash (/) to specify the root
directory.
• A directory can contain any number of files. A directory can also
contain subdirectories. Because a directory is nothing more than a
special kind of file, directories also have names.
Operation on directory :
1. File searching: The directory structure is searched for a particular
file entry. Files use symbolic names, and similar names may indicate a
relationship between files.
2. Create a file: The user creates new files and adds them to the
directory.
3. Delete a file: The user removes a file after completing work on it.
4. Rename a file: The user changes the file name if the file content
changes.
5. Listing of directory: Listing the files in the directory for some use.
MS-DOS and Windows use the "dir" command and Linux/UNIX uses the
"ls" command for this purpose.
Single Level Directory
• Single level directory is a simple structure and there are no sub-
directories.
• Fig. 6.3.1 shows three files with single level directory. It is also
called flat directory structure.
• The single-level directory structure is suitable only for a single
user. As the number of users increases, it creates problems with
assigning names to files.
• In a single-level directory structure, no two files can have the same
name.
• It is simple to implement, and searching for a file is faster. This
type of directory system is used in cameras and phones.
• Because of the limitation on file names, this structure is rarely
implemented.
• File name size varies according to the operating system. The MS-DOS
operating system allows only 11-character file names, while UNIX allows
up to 255 characters.
Disadvantages of single level directory:
1. Not suitable for a large number of files.
2. Limitation of unique file names.
3. For a large number of files, recalling file names is difficult.
Hierarchical Directory System
• Here files are grouped together to form subdirectories. Each
directory may contain several subdirectories. In a hierarchical file
system, a root directory points to other directories and files.
• Hierarchical directory system contains various forms of directory
structures. They are as follows:
1. Two-level directory structure
2. Tree structured directories
3. Acyclic graph directories.
Two-level Directory Structure
• The two-level directory structure contains the master file directory
at the top level, user file directories at the second level, and the
actual user files at the third level.
• The file system maintains a master block with one entry for each
user.
• The user directory address is stored in the master block. Each user
has a private directory, so different users can assign the same names
to their files because the directories for each user are separate.
• Fig. 6.3.2 shows two-level directory structure.
• Here each user works only in his or her own directory, and need not
worry about files being deleted or modified by other users.
• When a user wants to create a file, the system searches for that file
name in the user's directory. If the name is not found, it creates the
file; otherwise it gives an error message.
• In this structure, different users may have files with the same name,
which solves the name-conflict problem. When a user wants to open a
file, the file name is searched for in that user's directory.
• A two-level directory can be viewed as a tree, or an inverted tree,
of height 2.
• There are still problems with the two-level directory structure: it
effectively isolates one user from another. This isolation is an
advantage when users are completely independent, and a disadvantage
when cooperation between two users is required.
Tree Structured Directory
• Fig. 6.3.3 shows the tree structured directory.
• In this structure, a directory itself is a file. A directory and its
subdirectories contain a set of files. The internal format is the same
for all directories.
• Commonly used directory structure is tree structure. Tree has a root
directory. All files in disk have a unique path name.
• A path name describes the path that the operating system must take
to get from some starting point in the file system to the destination
object. Each process has its own working directory. Path names are of
two types : relative and absolute.
• Relative path name: Its path starts from the current working
directory (it does not start with "/").
• Absolute path name: Its path starts from the root directory (it
starts with "/").
• Tree-structured directories differentiate a file from a subdirectory
by using a one-bit representation: zero (0) represents a file and one
(1) represents a subdirectory.
• When a process makes a reference to open a file, the file system
searches for the file in the current directory. If the needed file is
not found there, the user either changes the directory or gives the
full path of the file. System calls are used for performing any
operation on files or directories.
• An empty directory and its entry are easily deleted by the operating
system. If a directory contains files and subdirectories, i.e. is
non-empty, some operating systems will not delete the directory.
     • To delete a directory, the user must first delete all the files and
     subdirectories in that directory.
     Merits
     1. Users can create their own directories.
     2. A user can access another user's files by specifying the path name.
     3. Separate subdirectories can be kept for files on different topics.
     4. Searching is fast and efficient.
     Demerits
     1. Subdirectories cannot be shared between users.
     2. Paths are longer than in the two-level directory structure.
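     The relative vs. absolute path distinction described above can be
     illustrated with a short sketch (Python's posixpath module used purely
     for illustration; the paths are hypothetical):

     ```python
     import posixpath

     cwd = "/home/user/projects"                  # hypothetical working directory

     print(posixpath.isabs("/etc/passwd"))        # True  - starts from the root "/"
     print(posixpath.isabs("docs/report.txt"))    # False - resolved from the cwd

     # A relative path is resolved by joining it onto the working directory;
     # ".." climbs to the parent directory, as in the tree structure above.
     full = posixpath.normpath(posixpath.join(cwd, "../shared/data.txt"))
     print(full)                                  # /home/user/shared/data.txt
     ```
     
     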
     Acyclic Graph Directory
     • The acyclic graph directory solves the problems of the
     tree-structured directory.
     • It allows a directory to be shared between two users. A shared
     directory or file can exist in more than one place in the file system
     at the same time.
     • Fig. 6.3.4 shows an acyclic graph directory. Links can be used to
     construct the acyclic graph.
     • It is important to note that a shared directory or file is not the
     same as two copies of the file. When there are two copies, each user
     sees a copy rather than the original, and if one user changes the file
     content, the changes do not appear in the other's copy.
     • For a shared file, only one original exists. Any changes made by
     one user are immediately visible to the other user. When a user
     creates a file in a shared directory, it automatically appears in all
     the shared subdirectories.
2.    What are files and explain the access methods for files?                     CO4   Apr/May
                                                                                          2017
       Files are the building blocks of any operating system. A file
       provides permanent storage of information and data, and an easy
       means of information sharing. The operating system is not
       interested in the information stored in files; it maps files onto
       physical devices.
File Access Methods
• Information and data is kept in files. Files are stored on the
secondary storage device. When this information is to be used, it has
to be accessed and brought into primary main memory.
• Information in files can be accessed in many ways; the choice usually
depends on the application. The file organization determines how
efficiently the input-output storage medium is used.
• The file access method defines the way processes read and write
files. Different access methods are available: some operating systems
provide only one access method, while others provide many.
1. Sequential access 2. Direct/random access 3. Indexed sequential
access.
• Accesses to files that use a particular file organization are
implemented by an input-output control system.
Sequential Access Method
• The sequential access method is the simplest method. The information
in a file is accessed sequentially, one record after another.
• In this method, a process reads all the records in a file in order,
starting at the beginning. It cannot skip records and cannot read out
of order.
• Fig. 6.2.1 shows the sequential access method.
• Batch applications use sequential files. A byte-stream file uses the
sequential access method.
• Example: Compilers and editors usually access files in this method. A
transaction file is also an example of sequential access.
• Sequential organization is the only method easily stored on both tape
and hard disk. Sequential access is best suited where most of the
records in a file are to be processed.
Disadvantages:
• It provides poor performance.
• A more efficient search technique is required.
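A minimal sketch of sequential access (Python used purely for
illustration; the record data is made up): each read advances the file
pointer, so records come back strictly in order with no skipping.

```python
import io

data = io.BytesIO(b"rec1\nrec2\nrec3\n")   # stands in for a file on disk

records = []
for line in data:            # each read advances the pointer; no skipping
    records.append(line.decode().strip())

print(records)               # ['rec1', 'rec2', 'rec3']
```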
Direct Access Method
• It is also called random access method. This access allows a user to
position the read/write mark before reading or writing.
• Fig. 6.2.2 shows the direct access method.
• The direct access method allows records to be accessed directly. It
is based on the hard disk, which is a direct access device, and allows
random access to any file block.
• Each record has its own address in the file, by which it can be
directly accessed for reading or writing. This feature is used by
editors: an editor needs to randomly access the contents of a file.
• There is no restriction on the order of reading or writing for a
direct access file. Operating system support is not required for a
direct access file.
• The access method is defined at file creation time, and the file is
accessed accordingly. Sequential access of a direct access file is
possible, but direct access of a sequential file is not.
Disadvantages:
1. Poor utilization of input-output device.
2. Consumes CPU time for record address calculation.
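The record-address calculation mentioned above can be sketched as
follows (Python used purely for illustration; record format is made up):
with fixed-length records, record i is read by seeking to i * RECORD_SIZE
without touching any earlier records.

```python
import io

RECORD_SIZE = 8     # each record is exactly 8 bytes, e.g. b"rec00007"
f = io.BytesIO(b"".join(f"rec{i:05d}".encode() for i in range(10)))

def read_record(f, i):
    f.seek(i * RECORD_SIZE)          # jump straight to record i
    return f.read(RECORD_SIZE)

print(read_record(f, 7))             # b'rec00007'
print(read_record(f, 0))             # b'rec00000'
```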
Indexed Sequential Access
• It is a modified form of the direct access method, combining the
direct and sequential access methods.
• In the simple indexed sequential method, a single level of indexing
is used. A sequential file serves as the index.
• Fig. 6.2.3 shows an indexed sequential file. The index contains a key
field and a pointer into the main file.
• Indexed sequential files store records in the order that they are
written to the disk. Records may be retrieved in sequential order, or
in random order using the index to locate the record in the file.
• If the file is large, a larger index with a large number of entries
is required, and the CPU takes time to search it. A higher-level index
is used to reduce the search time: an entry in the higher-level index
points to a section of the lower index, which in turn points to the
records. The lower index is searched to find the section of the disk
that may contain the required record.
• The file system maintains an overflow file; new records are added to
it. The record in the main file that immediately precedes the new
record in logical sequence is updated to contain a pointer to the new
record in the overflow file.
• The overflow file is merged with the main file during a batch update.
Multiple indexes for the same key field can be set up to increase
efficiency.
    • The lowest-level index file is treated as a sequential file, and a
    higher-level index file is created for it. The higher index file
    contains pointers to lower index files, which in turn point to the
    actual data items.
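    The single-level indexed sequential lookup described above can be
    sketched as follows (Python used purely for illustration; keys,
    blocks and values are made up): a small sorted index of highest-key
    entries picks the block, then only that block is searched.

    ```python
    import bisect

    # Index: highest key stored in each block (keys sorted across blocks).
    index_keys   = [20, 45, 90]
    index_blocks = [0, 1, 2]

    blocks = [
        {5: "a", 12: "b", 20: "c"},      # block 0: keys <= 20
        {30: "d", 45: "e"},              # block 1: keys <= 45
        {60: "f", 90: "g"},              # block 2: keys <= 90
    ]

    def lookup(key):
        i = bisect.bisect_left(index_keys, key)   # first block whose max >= key
        if i == len(index_keys):
            return None                           # key beyond all blocks
        return blocks[index_blocks[i]].get(key)   # search inside one block

    print(lookup(30))    # 'd'
    print(lookup(31))    # None - falls in block 1 but is not stored
    ```
    
    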
3. Explain about kernel I/O subsystem and transforming I/O to                CO4   Apr/May
     hardware operations.                                                           2017
   • Kernel I/O subsystem provides various services like scheduling,
   buffering, caching, spooling, device reservation and error handling.
   1. I/O scheduling: Some I/O request ordering via per-device queue.
   Scheduling can improve overall system performance. Operating-
   system developers implement scheduling by maintaining a queue of
   requests for each device.
   2. Buffering: Buffer is a memory area that stores data while they are
   transferred between two devices or between a device and an
   application.
   3. Caching: It is fast memory region which holds copies of data.
   Access to the cached copy is more efficient than access to the
   original. Cache files that are read/written frequently. Caching
   increases performance.
   4. Spooling and device reservation: Spooling uses the disk as a
   large buffer for outputting data to printers and other devices. It can
   also be used for input, but is generally used for output. Its main use
   is to prevent two users from alternately printing lines to the line
   printer on the same page, getting their output completely mixed
   together. It also helps reduce idle time through overlapped I/O.
   5. Error handling: Most operating systems return an error number or
   code when an I/O request fails. The OS can recover from disk read
   errors, device unavailability, and transient write failures.
   6. I/O protection: User application may accidentally attempt to
   disrupt normal operation via illegal I/O instructions. To avoid this we
   can use all I/O instructions defined to be privileged and I/O must be
   performed via system calls.
   7. Kernel data structures: Kernel keeps state info for I/O
   components, including open file tables, network connections,
   character device state. The kernel must create a data structure
   representing the new process. The scheduler can then use it if it
   decides to schedule that process to run when it is its turn.
     Transferring I/O Requests to Hardware Operations
      • When reading a file from disk, the application refers to the data
      by a file name. The file system maps the file name, through the
      directory, to its location on the secondary storage device.
     • In UNIX operating system, the name maps to an inode number and
     that inode contains the space-allocation information.
      • UNIX represents device names in the regular file-system name
      space. To resolve a path name, UNIX looks up the name in the
      mount table to find the longest matching prefix; the corresponding
      entry in the mount table gives the device name.
      • UNIX uses the concept of major and minor numbers. Device files are
      found in the /dev directory. Each device is assigned a major and a
      minor device number.
     • The major device number identifies the type of device, i.e. all SCSI
     devices would have the same number as would all the keyboards. The
     minor device number identifies a specific device, i.e. the keyboard
     attached to this workstation.
     • Consider reading a file from disk for a process :
     1. Determine device holding file.
     2. Translate name to device representation.
     3. Physically read data from disk into buffer.
     4. Make data available to requesting process.
     5. Return control to process.
4.    Explain about RAID in detail.                                                 (Apr/May
                                                                                       2015)
      RAID is a technique that makes use of a combination of multiple                (Nov/Dec
      disks instead of using a single disk for increased performance,                  2016)
      data redundancy, or both.
1. RAID 0 (striped disks)
RAID 0 takes any number of disks and merges them into one large
volume. It increases speed, as you read and write from multiple disks
at a time. An individual file can then use the speed and capacity of
all the drives of the array. The downside to RAID 0 is that it is NOT
redundant: the loss of any single disk causes complete data loss, so
all data on all disks is lost if any one disk fails. This RAID type is
therefore less reliable than a single disk.
2. RAID 1 (mirrored disks)
It duplicates data across two disks in the array, providing full
redundancy. Both disks store exactly the same data, at the same time,
and at all times. Data is not lost as long as one disk survives. The
total capacity of the array equals the capacity of the smallest disk
in the array. At any given instant, the contents of both disks in the
array are identical.
RAID 1 is capable of more complex configurations, but its point is
primarily redundancy. If you completely lose a drive, you can still
stay up and running off the other drive, and you can replace the
broken drive with little to no downtime. RAID 1 also gives the
additional benefit of increased read performance, as data can be read
off either drive in the array. The downsides are slightly higher write
latency, since the data needs to be written to both drives, and you
get only a single drive's usable capacity while needing two drives.
3. RAID 5 (striped disks with single parity)
RAID 5 requires at least three drives. It combines these disks to
protect data against the loss of any one disk; the array's storage
capacity is reduced by one disk. It stripes data across multiple
drives to increase performance, and adds redundancy by distributing
parity information across the disks.
 4. RAID 6 (Striped disks with double parity)
RAID 6 is similar to RAID 5, but the parity data are written to two
drives. The use of additional parity enables the array to continue to
function even if two disks fail simultaneously. However, this extra
protection comes at a cost. RAID 6 has a slower write performance
than RAID 5.
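The parity idea behind RAID 5 and RAID 6 can be sketched in a few lines
(Python used purely for illustration; the data strips are made up):
parity is the XOR of the data strips, so any single lost strip can be
rebuilt by XOR-ing the survivors with the parity.

```python
from functools import reduce

def xor_blocks(a, b):
    """XOR two equal-length byte strips."""
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"       # three data strips
parity = reduce(xor_blocks, (d0, d1, d2))    # stored on a fourth disk

# Suppose the disk holding d1 fails; rebuild it from the survivors:
rebuilt = reduce(xor_blocks, (d0, d2, parity))
print(rebuilt == d1)   # True
```

RAID 6 extends this with a second, independent parity computation so
that the array survives two simultaneous disk failures.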
5.   Compare the functionalities of FCFS, SSTF, C-SCAN and C-                       (Apr/
     LOOK with example.                                                             May
                                                                                    2015)
                                                                                    (Apr/M
     First-Come-First-Serve (FCFS), Shortest Seek Time First (SSTF),                ay
     Circular SCAN (C-SCAN), and Circular LOOK (C-LOOK) are                         2018)
     disk scheduling algorithms used to determine the order in which
     disk I/O requests are serviced. Let's compare their functionalities
     with examples:
     1. First-Come-First-Serve (FCFS):
     Functionality:
     FCFS serves requests in the order they arrive.
     Example:
     Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
     124, 65, 67.
     The order of service for FCFS would be the same as the order of
     arrival: 98, 183, 37, 122, 14, 124, 65, 67.
     2. Shortest Seek Time First (SSTF):
     Functionality:
     SSTF selects the request with the shortest seek time from the
     current head position.
     Example:
     Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
     124, 65, 67.
     If the disk head is initially at track 53, SSTF would select the
     closest request, resulting in the order: 65, 67, 37, 14, 98, 122, 124,
     183.
     3. Circular SCAN (C-SCAN):
     Functionality:
     C-SCAN serves requests in a circular manner, moving the disk
     head in one direction until the end, then returning to the beginning.
     Example:
     Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
     124, 65, 67.
     If the disk head is initially at track 53, C-SCAN would move the
     head to the end, then jump to the beginning, resulting in the order:
     65, 67, 98, 122, 124, 183, 14, 37.
     4. Circular LOOK (C-LOOK):
     Functionality:
     C-LOOK serves requests in a circular manner, but it only scans as
     far as the last request in the current direction.
     Example:
     Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
     124, 65, 67.
     If the disk head is initially at track 53, C-LOOK would move the
     head up to the last request (183), then jump back to the lowest
     request, resulting in the order: 65, 67, 98, 122, 124, 183, 14, 37.
    Summary:
    FCFS: Simple, serves requests in the order of arrival.
    SSTF: Minimizes seek time by selecting the closest request.
    C-SCAN: Circularly scans the disk, moving in one direction only.
    C-LOOK: Circularly scans the disk, but only as far as the last
    request in the current direction.
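     The orders above can be reproduced with a short sketch (Python used
     purely for illustration; the request queue and head position are the
     ones from the example):

     ```python
     requests = [98, 183, 37, 122, 14, 124, 65, 67]
     HEAD = 53

     def fcfs(reqs, head):
         # Serve requests strictly in arrival order.
         return list(reqs)

     def sstf(reqs, head):
         # Repeatedly pick the pending request closest to the current head.
         pending, order = list(reqs), []
         while pending:
             nxt = min(pending, key=lambda t: abs(t - head))
             pending.remove(nxt)
             order.append(nxt)
             head = nxt
         return order

     def c_look(reqs, head):
         # Serve everything at or above the head going up, then jump to
         # the lowest request and continue upward.
         up   = sorted(t for t in reqs if t >= head)
         down = sorted(t for t in reqs if t < head)
         return up + down

     print(fcfs(requests, HEAD))    # [98, 183, 37, 122, 14, 124, 65, 67]
     print(sstf(requests, HEAD))    # [65, 67, 37, 14, 98, 122, 124, 183]
     print(c_look(requests, HEAD))  # [65, 67, 98, 122, 124, 183, 14, 37]
     ```

     C-SCAN visits requests in the same order as C-LOOK here; the
     difference is only that C-SCAN's head travels to the physical end of
     the disk and to track 0 before resuming, which changes total head
     movement but not the service order of these requests.
     
     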
6. Explain about file system mounting in detail.                           CO4      (May/
   Mounting is a process in which the operating system adds the                     June
   directories and files from a storage device to the user's computer file         2016)
   system. The file system is attached to an empty directory; by doing
   so, the user can access the data on the storage device through the
   system file manager. Storage devices can be internal hard disks,
   external hard disks, USB flash drives, SD cards, memory cards,
   network-attached storage devices, CDs and DVDs, remote file
   systems, or anything else.
   Terminologies used in File System Mounting
    File System: It is the method used by the operating system to
       manage data storage in a storage device. So, a user can access
       and organize the directories and files in an efficient manner.
    Device name: It is a name/identifier given to a storage partition,
       for example "D:" in Windows.
    Mount point: It is an empty directory in which we are adding
       the file system during the process of mounting.
   Mounting in Different Operating Systems
   1. Linux/UNIX-based OS
   Suppose we want to mount /dev/sdb1 at an existing directory /mnt/mydisk:
   sudo mount /dev/sdb1 /mnt/mydisk
   After use, the file system must be unmounted:
   sudo umount /mnt/mydisk
    2. Windows OS
    In Windows, mounting is very easy for the user. When an external
    storage device is connected, Windows automatically detects the file
    system and mounts it to a drive letter, such as D: or E:.
   Steps:
    Connect an external storage device to your PC.
    Windows detects the file system on the drive (e.g., FAT32 or
       NTFS) and assigns it a drive letter, such as "E:".
    You can access the drive through This PC -> File Explorer -> the
       "E:" drive.
    Access the data.
   3. Mac OS
   In Mac OS, when we connect an external storage device it is
   automatically mounted and accessible via Finder. As an advanced
   method, the user can also use the diskutil command in Terminal.
   Method 1:
   Steps:
    Connect an external storage device to your MAC.
    Mac OS detects the file system and automatically mounts it.
    You can access the drive by opening Finder, and it will appear in
       the sidebar.
   Method 2 (using diskutil):
   To mount a drive with a known identifier: disk2s1
   diskutil mount /dev/disk2s1
   To unmount:
   diskutil unmount /dev/disk2s1
7. Explain about free space management with example.                                (Nov/
   • Space that is allocated to files must be managed, and space that is           Dec
                                                                                   2015)
   not allocated to any file must also be managed. To keep track of free
   disk space, the system maintains a free-space list. Both the file
   allocation table and disk allocation information are required to
   manage free space properly.
   • When a file is deleted, its disk space is added to the free-space list.
   • Following are the techniques used for free space management:
   1. Bit tables or bit vector
   2. Chained free portions or linked list
   3. Indexing or grouping
   4. Free block list or counting
   1. Bit tables or bit vector
   • Each block on the disk is represented by one bit, so the blocks
   together form a bit vector. A free block is represented by 0 and a
   used block by 1.
   • For example, consider a disk where blocks 3, 4, 5, 7, 9, 14, 15, 19,
   30, 31, 32, 35, 36, 37, 38 are free blocks and rest of the blocks are
   allocated. The free space is shown below:
111000101011110011101111111111000110000
• The memory required for a block bitmap = disk size / (8 × block size
of the file system).
• The bitmap requires extra space. For example, a 1 TB disk with 4 KB
blocks needs 2^40 / (8 × 2^12) = 32 MB for its bitmap.
• When the bit table is stored in memory, an exhaustive search of the
table can slow file system performance, so most file systems use
auxiliary data structures for the bit table. The file system also
maintains a summary table containing, per sub-range, the number of
free blocks and the maximum-sized contiguous run of free blocks.
• The summary table stores information about contiguous free blocks.
Whenever the file system needs a number of contiguous blocks, it can
scan the summary table to find an appropriate sub-range and then
search that sub-range. This method is used in the Apple Macintosh.
Advantage :
a. It is easy to find free blocks.
b. It is as small as possible.
Disadvantage :
It may not be feasible to keep the bitmap in memory for large disks.
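The bit vector from the example can be reconstructed with a small
sketch (Python used purely for illustration; free-block numbers are the
ones listed above):

```python
# Bit i is 1 if block i is used, 0 if free - the convention used above.
free_blocks = {3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36, 37, 38}
N_BLOCKS = 39

bitmap = ["0" if i in free_blocks else "1" for i in range(N_BLOCKS)]
print("".join(bitmap))       # matches the bit string shown in the example

def first_free(bitmap):
    """Find the first free block by scanning for a 0 bit."""
    return bitmap.index("0")

print(first_free(bitmap))    # 3
```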
2. Linked list
• In this method, all free disk blocks are linked, with a pointer kept
to the first free block. All file allocation methods can use the
linked-list free-space technique.
• There is a small space overhead because there is no need for a disk
allocation table. In the above example, the free blocks are 3, 4, 5, 7,
9, 14, 15, 19, 30, 31, 32, 35, 36, 37, 38. The free-block pointer
points to block 3; block 3 contains a pointer to block 4, block 4 to
block 5, block 5 to block 7, and so on.
• This method is not efficient: to reach free block 9 we must first
traverse blocks 3, 4, 5 and 7.
• The disk becomes quite fragmented after some use, and many portions
will be a single block long.
     3. Grouping
     • The first free block contains the addresses of n free blocks. The
     first n-1 of these blocks are actually free; the last block contains
     the addresses of another n free blocks.
     • Consider the free blocks: 3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32,
     35, 36, 37, 38.
     • When all the blocks in a group have been allocated, the block that
     held the pointers is used.
     • With the grouping method, the addresses of a large number of free
     blocks can be found quickly.
     4. Counting
     • It keeps the address of the first free block and the number n of
     free contiguous blocks that follow it. Each entry in the free-space
     list then consists of a disk address and a count.
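     The counting scheme can be sketched as follows (Python used purely
     for illustration; the free-block list is the one from the example):
     the free list stores (start, length) pairs of contiguous runs instead
     of every block address individually.

     ```python
     def to_runs(free_blocks):
         """Convert a list of free block numbers to (start, count) runs."""
         runs, blocks = [], sorted(free_blocks)
         start = prev = blocks[0]
         for b in blocks[1:]:
             if b == prev + 1:
                 prev = b                              # extend current run
             else:
                 runs.append((start, prev - start + 1))
                 start = prev = b                      # begin a new run
         runs.append((start, prev - start + 1))
         return runs

     free = [3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36, 37, 38]
     print(to_runs(free))
     # [(3, 3), (7, 1), (9, 1), (14, 2), (19, 1), (30, 3), (35, 4)]
     ```

     Fifteen block addresses collapse to seven (address, count) entries,
     which is why counting works well when free space tends to be
     contiguous.
     
     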
8.    Illustrate the functions of file and file implementation.                       (Nov/
     The file system is implemented on disk and in memory. How the file              Dec
                                                                                     2015)
     system is implemented varies with the operating system and the file
     system, but all operating systems follow some general rules. If the
     file system is implemented on the disk, it contains the following
     information:
     1. Boot control block: A boot block is maintained per volume. The
     boot block contains the initial bootstrap program used to load the
     operating system, which needs this information at boot time. If the
     disk is divided into a number of partitions, the operating system is
     stored in the first partition of the disk. If no operating system is
     installed on the disk, this block can be empty. In the UNIX file
     system this is called the boot block; in NTFS it is the partition
     boot sector.
2. Volume control block: It consists of detailed information about the
volume or partition, such as the block size, the number of blocks in
the partition, the free-block count, free-block pointers, the free-FCB
count and FCB pointers. In UNIX, each partition is a standalone file
system and this block is called the superblock. A superblock describes
the state of the file system: the total size of the partition, the
block size, pointers to a list of free blocks, the inode number of the
root directory, a magic number, etc. In NTFS, this information is
stored in the master file table.
3. Directory structure: It is used to organize the files. Directory
structure is maintained per file system.
4. Per-file FCB: It contains information about a file such as its size,
ownership, permissions and the location of its data blocks. In NTFS,
this information is stored in the master file table, which uses a
relational database structure.
• In-memory information is used for caching and file-system
management. Caching improves performance.
1. In-memory mount table: It contains information about each
mounted volume.
2. In-memory directory structure cache: It contains recently
accessed directories information.
3. System-wide open-file table: It stores a copy of the FCB of each
open file.
4. Per-process open-file table: It maintains a pointer to the
appropriate entry in the system-wide open-file table.
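The relationship between the per-process and system-wide tables can be sketched in a few lines (a minimal illustration; the structures, field names and `sys_open` function are hypothetical, not any real OS's API):

```python
# Hypothetical in-memory structures: the system-wide open-file table
# holds one entry (standing in for the FCB) per open file; each
# per-process table entry just points into the system-wide table and
# keeps its own file offset.
system_wide = []  # entries: {"name": ..., "open_count": ...}

def sys_open(per_process_table, name):
    for i, entry in enumerate(system_wide):
        if entry["name"] == name:          # file already open elsewhere:
            entry["open_count"] += 1       # share the system-wide entry
            break
    else:
        system_wide.append({"name": name, "open_count": 1})
        i = len(system_wide) - 1
    per_process_table.append({"sw_index": i, "offset": 0})
    return len(per_process_table) - 1      # per-process descriptor

p1, p2 = [], []
sys_open(p1, "notes.txt")
sys_open(p2, "notes.txt")   # a second process opens the same file
```

Both processes end up with their own table entry and offset, but share the single system-wide entry, whose open count is now 2.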
• The UNIX operating system treats a directory exactly as a file.
Windows NT implements separate system calls for files and
directories and treats directories as entities separate from files.
Fig. 6.8.1 shows a file control block.
• The FCB specifies the information that the system needs to manage
a file; it is sometimes called the set of file attributes. These are
highly system-dependent structures. To create a new file, an
application program calls the logical file system.
Optimizations for the file system:
• Delayed updates of data and metadata are the main difficulty.
Updates may be delayed in the hope that the same data will be
updated again, or that the updated data is temporary and will be
deleted in the near future.
• If the computer crashes without having committed the delayed
updates, the consistency of the file system is destroyed.
Metadata updates:
• A recoverable file system must be consistent after a crash, or must
be able to be made consistent. So it is necessary to ensure that
logging metadata updates keeps the file system in a consistent state.
• In an inconsistent file system, some metadata has been written
incompletely, or file-system data structures have been written in the
wrong order. With metadata logging, the writes are made to a
sequential log.
• A complete transaction is written to the log before it is applied to
the file-system structures. If the file system crashes while updating
the structures, the updates can be completed from the information in
the log.
     • The logging ensures that file system changes are made completely.
     The order of the changes is guaranteed to be correct because of the
     sequential writes to the log. If a change was made incompletely to the
     log, it is discarded, with no changes made to the file system
     structures.
9.    Distinguish between a STREAMS driver and a STREAMS              CO4   (Nov/Dec 2016)
      module.
      STREAMS Driver:
        1. Definition:
                A STREAMS driver is responsible for managing
                  communication between a device and the STREAMS
                  framework.
        2. Functionality:
                Handles low-level device-specific operations, such as
                  sending and receiving data to and from the hardware.
        3. Direct Interaction:
                Interacts directly with the device and translates device-
                  specific protocols into a format that STREAMS
                  modules can understand.
        4. Examples:
                Device drivers for physical devices like network
                  interfaces, serial ports, or disk drives.
        5. Provides:
                Provides a standard interface to the STREAMS
                  framework, allowing multiple devices to be accessed
                  in a uniform way.
      STREAMS Module:
        1. Definition:
                A STREAMS module is a software component that
                  processes data within the STREAMS framework.
        2. Functionality:
                Performs specific operations on the data flowing
                  through the STREAM, such as filtering, buffering, or
                  transforming it.
        3. Modularity:
                Can be dynamically loaded or removed from the
                  STREAM to modify or extend the behavior of data
                  processing.
        4. Examples:
                Modules for protocol processing, encryption,
                  compression, or any other operation on the data
                  stream.
        5. Interacts with:
                Interacts with other modules and the driver within the
                  STREAMS framework.
      Relationship:
         Communication Flow:
                 The STREAMS driver manages communication with
                  the hardware, while STREAMS modules operate on
                  the data flowing through the STREAMS framework.
          Data Processing:
               Drivers handle the low-level details of device
                  communication, whereas modules handle high-level
                  data processing within the framework.
          Flexibility:
               Modules provide flexibility by allowing dynamic
                  modification of data processing capabilities, while
                  drivers are typically more static and tied to specific
                  devices.
          Hierarchical Structure:
               The STREAMS framework allows multiple modules
                  to be stacked, forming a hierarchy that processes data
                  in a sequence.
10. What are the various disk-space allocation methods? Explain              CO4 (Apr/May 2018)
    in detail.
    • The primary issue in implementing file storage is how to keep
    track of file blocks. A file consists of a collection of blocks.
    File allocation
    • The following issues are considered for file allocation:
    1. When a new file is created, is the maximum space the file will
    ever need allocated at once?
    2. Is space allocated to a file as one or more contiguous units?
    3. What data structure or table is used to keep track of the portions
    assigned to a file?
    • Three major methods of allocating disk space are in wide use:
    1. Contiguous 2. Linked         3. Indexed.
     Contiguous Allocation
     • When a user creates a file, the system allocates a set of contiguous
     blocks on disk. This is a pre-allocation method that uses portions of
     variable size. Contiguous allocation is simple to implement: the
     system only searches the free list for the required number of
     consecutive blocks and marks them as used.
     • Disk addresses define a linear ordering on the disk. Because of this
     linear ordering, accessing block b + 1 after block b normally requires
     no head movement; when head movement is needed, it is only one
     track. Fig. 6.10.1 shows contiguous allocation.
     • The contiguous allocation of a file is defined by the disk address of
     its first block and its length.
     • If the file size is n blocks and the starting location is L, then the
     file occupies blocks L, L+1, L+2, L+3, ......, L+(n-1). The directory
     entry for each file indicates the address of the starting block and the length
of the area allocated for this file.
• Contiguous allocation supports both sequential and random access.
• It is easy to retrieve a single block. To improve the I/O performance
of sequential processing, multiple blocks can be brought in at a time.
Contiguous allocation supports sequential access very well because a
file's entire data is stored in adjacent blocks, and it also supports
random access.
• Contiguous allocation suffers from external fragmentation. Small
free disk spaces are created as files are allocated and deleted.
External fragmentation means that enough free space is available in
total, but it is not contiguous. To solve the problem of external
fragmentation, compaction is used.
• Another problem is calculating the space needed for a file: it is
difficult to estimate the required space in advance.
• This method is good for storing data on CD-ROM.
Characteristics of contiguous file allocation:
1. It supports variable-size portions.
2. Pre-allocation is required.
3. It requires only a single directory entry per file.
4. Allocation is performed only once per file.
Advantages:
1. It supports variable-size portions.
2. It is easy to retrieve a single block.
3. Accessing a file is easy.
4. It provides good performance.
Disadvantages:
1. It suffers from external fragmentation.
2. Pre-allocation is required.
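The address arithmetic described above is trivial, which is exactly why contiguous allocation gives direct access cheaply. A small sketch (function names and the 512-byte block size are illustrative assumptions):

```python
def contiguous_blocks(start, n):
    """A file of n blocks starting at L occupies L, L+1, ..., L+(n-1)."""
    return list(range(start, start + n))

def block_of_offset(start, byte_offset, block_size=512):
    """Direct access: the physical block holding a given logical byte."""
    return start + byte_offset // block_size

print(contiguous_blocks(14, 3))    # [14, 15, 16]
print(block_of_offset(14, 1024))   # byte 1024 falls in block 16
```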
Linked Allocation
• Linked allocation is also called chained allocation. The operating
system keeps an ordered list of free blocks. The file descriptor stores
a pointer to the first block, and each block stores a pointer to the next
block.
• Fig. 6.10.2 shows linked allocation. The disk blocks may be
scattered anywhere on the disk. The directory contains a pointer to
the first and last blocks of the file. No space is lost to disk
fragmentation.
• Creating a new file is easy: simply create a new entry in the
directory. Reading a file is straightforward: the user simply reads
blocks by following the pointers from block to block. There is no
external fragmentation with linked allocation.
• To write to a file, the system finds a free block, writes the new
block and links it to the end of the file.
• When creating a new file, it is not necessary to declare the size of
the file. A file can continue to grow as long as free blocks are
available.
• Compaction can be used so that the blocks of one file are located
contiguously on the disk. This optimizes disk access.
• The file allocation table (FAT) is an extension of the linked
allocation method. Instead of putting the pointers in the blocks
themselves, it keeps a table of the pointers. This pointer table can be
quickly searched to find any random block in the file.
• Fig. 6.10.3 shows the file allocation table. All blocks on the disk
must be included in the table. This method is used by the Windows
operating system.
(Refer Fig. 6.10.3 on next page)
Characteristics
1. It supports fixed-size portions.
2. Pre-allocation is possible.
3. The file allocation table holds one entry per file.
4. Allocation frequency is low to high.
Advantages:
1. There is no external fragmentation.
2. It is never necessary to compact disk space.
3. Pre-allocation is not required.
Disadvantages:
1. Files are accessed only sequentially.
2. Space is required for the pointers.
3. Reliability is not good.
4. It cannot support direct access efficiently.
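Following a file's chain through a FAT-style pointer table can be sketched as below (the table contents and block numbers are hypothetical example data):

```python
# fat[b] gives the next block in a file's chain; -1 marks end-of-file.
# The directory entry stores only the number of the first block.
fat = {217: 618, 618: 339, 339: -1,    # blocks of one file
       101: 102, 102: -1}              # blocks of another file

def file_blocks(first_block):
    """Walk the chain from the directory's first-block pointer to EOF."""
    blocks, b = [], first_block
    while b != -1:
        blocks.append(b)
        b = fat[b]
    return blocks

print(file_blocks(217))   # [217, 618, 339]
```

Note how reaching the third block still requires visiting the two table entries before it, which is why plain linked allocation serves sequential access well but direct access poorly.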
Indexed Allocation
• The indexed allocation method solves the problems of both
contiguous and linked allocation. It uses the concept of an index
block, which stores all of a file's pointers in one location. The index
block itself occupies some space and can thus be considered an
overhead of the method.
• The OS keeps a list of free blocks. It allocates an array to hold the
pointers to all the blocks used by the file, and allocates blocks only
on demand. Fig. 6.10.4 shows indexed allocation.
Fig. 6.10.4 shows indexed allocation.
• In indexed allocation, each file maintains its own index block,
which contains an array of disk-block addresses. For example, the
nth entry in the index block points to the nth block of the file. The
directory contains the address of the index block of a file; to read
the nth block, the pointer in the nth index-block entry is used.
• It supports direct access without suffering from external
    fragmentation. Any free block anywhere on the disk may satisfy a
    request for more space.
    • Indexed allocation does suffer from wasted space. The pointer
    overhead of the index block is generally greater than the pointer
    overhead of linked allocation.
    Advantages:
    1. It supports sequential and direct access.
    2. No external fragmentation.
    3. Faster than other two methods.
    4. It supports fixed and variable size blocks.
    Disadvantages:
    1. Indexed allocation suffers from wasted space.
    2. The pointer overhead is generally greater than in linked allocation.
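Direct access through an index block reduces to a single array lookup; a minimal sketch (the directory contents, file name and `nth_block` helper are hypothetical):

```python
# The directory maps a file name to its index block, which is simply an
# array of data-block addresses that may be scattered anywhere on disk.
directory = {"report.txt": [9, 16, 1, 10, 25]}

def nth_block(name, n):
    """The nth index-block entry points to the file's nth data block."""
    return directory[name][n]

print(nth_block("report.txt", 3))   # block 10 holds the 4th data block
```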
11. Consider a file system where a file can be deleted and the disk          CO4   (Nov/Dec
    space reclaimed while links to that file still exist. What                      2015)
    problems may occur if a new file is created in the same storage
    area or with the same absolute path name? How can these
    problems be avoided?
     In a file system where a file can be deleted, and the disk space
    reclaimed while the links to that file still exist (which is often
    referred to as "unlinking" or "deleting a file without removing its
    links"), certain problems may arise when a new file is created in the
    same storage area or with the same absolute path name. Here are
    some potential issues and ways to avoid them:
     Problems:
     Data Integrity:
     If a new file is created in the same storage area or with the same
    absolute path name as the previously deleted file, there may be
    remnants of the old file's data or metadata left in the storage,
    leading to potential data integrity issues.
     Security Risks:
     If sensitive information from the deleted file is still present in the
    storage and a new file is created, it could pose security risks as
    unauthorized users might gain access to the old data.
     Confusion:
     Users or applications expecting a clean slate when creating a new
    file in a location where a file was previously deleted may encounter
    unexpected remnants, causing confusion.
     Solutions:
     Zeroing or Scrubbing Deleted Space:
     When a file is deleted, the file system should zero out or scrub the
    previously occupied space to ensure that remnants of the old data
    are not accessible. This can be part of the file deletion process.
     Delaying Reuse of Disk Space:
     The file system can implement a mechanism to delay the reuse of
    disk space previously occupied by a deleted file. This ensures that
    the space is cleared before allocating it to a new file.
     File System Notifications:
     Implement file system notifications to inform applications or
    processes about changes in the file system. Applications can be
    notified when a file is deleted, and they can take appropriate actions
    to clean up or avoid potential issues.
     Versioning or File ID Changes:
     Consider implementing file versioning or changing file identifiers
    when a file is deleted. This ensures that a new file created with the
    same name or path is distinct from the deleted one.
     Proper File Deletion API:
     Encourage the use of proper file deletion APIs or commands that
    ensure complete removal of a file and its associated metadata.
    Educate users about the implications of unlinking or deleting files
    without removing their links.
     Regular Disk Maintenance:
     Schedule regular disk maintenance tasks that check for and clean
    up any remnants of deleted files. This can be part of routine file
    system maintenance procedures.
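The "zeroing or scrubbing deleted space" solution above can be sketched with a toy model of the disk as a dictionary of block buffers (the function and structures are illustrative assumptions, not a real file system's API):

```python
def scrub_and_free(disk, blocks, free_list):
    """Zero a deleted file's blocks before returning them to the free
    list, so a new file reusing the space cannot read the old data."""
    for b in blocks:
        disk[b] = bytes(len(disk[b]))   # overwrite contents with zeros
        free_list.append(b)             # only then make the block reusable

disk = {3: b"secret", 7: b"payroll"}
free_list = []
scrub_and_free(disk, [3, 7], free_list)
print(disk[3], free_list)   # zeroed buffer; blocks 3 and 7 reusable
```

Delaying the append to `free_list` until after the overwrite is the key ordering: a crash between the two steps leaves the block unscrubbed but also unallocatable.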
12. Illustrate an application that could benefit from operating            CO4     (Nov/Dec
    system support for random access to indexed files.                               2015)
    An application that could benefit significantly from operating system
    support for random access to indexed files is a Database
    Management System (DBMS). In a DBMS, data is organized into
    tables, and each table may contain a large number of records.
    Random access to indexed files is crucial for efficient querying and
    retrieval of specific records within these tables. Here's how such an
    application could benefit:
    Database Management System (DBMS):
       1. Data Retrieval Efficiency:
        In a DBMS, tables often store large volumes of data. When
          users query the database to retrieve specific records based on
          certain criteria, random access to indexed files allows the
          system to quickly locate and retrieve the relevant records.
       2. Indexing Primary and Secondary Keys:
        Primary keys and secondary keys are often used to uniquely
          identify records within a table or to speed up specific types of
          queries. Indexing these keys allows the DBMS to provide rapid
          access to records based on these key values.
       3. Complex Queries:
        In a database with multiple tables and relationships, complex
          queries involving joins and conditions can be efficiently
          executed by leveraging indexes. Random access to indexed
          files allows the DBMS to quickly navigate through the data
          based on indexed attributes.
       4. Sorting and Range Queries:
        Indexed files support efficient sorting and range queries. For
             example, retrieving all records within a specific date range or
             sorting records based on a particular attribute becomes faster
             when the DBMS can randomly access indexed files.
        5.   Concurrency Control:
            Random access to indexed files is crucial for implementing
             concurrency control mechanisms in a multi-user database
             environment. Concurrent transactions may need to access and
             modify different records simultaneously, and indexes help in
             efficiently managing these operations.
        6.   Data Integrity:
            Indexed files contribute to maintaining data integrity by
             enforcing uniqueness constraints on key attributes. This is
             crucial for preventing duplicate records and ensuring the
             accuracy of data stored in the database.
        7.   Query Optimization:
            A DBMS can use query optimization techniques, such as
             choosing the most efficient index for a specific query, based on
             the access patterns of the application. Random access to
             indexed files supports these optimization strategies.
        8.   Reduced Disk I/O:
            Random access to indexed files minimizes the need for
             scanning the entire table during searches, reducing disk I/O
             operations. This is particularly important for large databases
             where quick access to specific records is essential for
             performance.
                                               PART C
1.    Discuss the disk scheduling methods.                              CO4   (Nov/Dec 2017,
     Disk Scheduling                                                           18, 19)
     • Disk scheduling algorithms are used to reduce the total seek time
     of any request. An I/O request issues a system call to the operating
     system. If the desired disk drive or controller is idle, the request is
     served immediately. If the device is busy, the request is placed in
     the queue of pending requests.
     • When one request is completed, the operating system has to choose
     which pending request to service next.
     • The processor is much faster than the disk, so it's highly likely that
     there will be multiple outstanding disk requests before the disk is
ready to handle them. Because disks have non-uniform access times,
re-ordering disk requests can greatly reduce the latency of disk
requests.
• Multiple requests to disk will arrive while one is being serviced.
Disk scheduling increases the disk's bandwidth.
• To service a request, a disk system requires that the head be moved
to the desired track, then a wait for latency and finally the transfer of
data. In the following section we will examine several scheduling
algorithms.
1. FIFO   2. SSTF   3. SCAN   4. C-SCAN   5. LOOK   6. C-LOOK
First In First Out
• FIFO is also called the first-come, first-served (FCFS) method.
Requests are processed in queue order. Fig. 5.4.1 shows the FCFS
algorithm. (Refer Fig. 5.4.1 on next page.)
• FCFS is a fair policy in the sense that once a request has arrived,
its place in the schedule is fixed, irrespective of the arrival of a
higher-priority request.
• The requested tracks, in the order received, are: 55, 58, 39, 18, 90,
160, 150, 38 and 184. The starting track is 100 and the total number
of tracks is 200. FCFS processes all the requests in sequential order:
• FCFS would begin at track 100, move 45 tracks to 55, and move 3
tracks to 58 and so on. The Y-axis corresponds to the tracks on the
disk. The X-axis corresponds to time or the number of tracks
traversed.
• Starting at track 100.
• FCFS is simple to implement, but it does not provide fast service:
it cannot take any special action to minimize the overall seek time.
Advantages:
1. Simple and easy to implement.
2. Suitable for light loads.
Disadvantages:
1. FCFS does not provide fast service.
2. It does not maximize throughput.
3. It may involve a lot of unnecessary seek distance.
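The total head movement for the worked example can be checked with a short sketch (the function name is an illustrative choice):

```python
def fcfs_seek(start, requests):
    """Total head movement when requests are serviced in arrival order."""
    total, pos = 0, start
    for track in requests:
        total += abs(track - pos)   # seek distance to the next request
        pos = track
    return total

# The example from the text: head at track 100, 200 tracks in total.
print(fcfs_seek(100, [55, 58, 39, 18, 90, 160, 150, 38, 184]))  # 498
```

The head first moves 45 tracks (100 to 55), then 3 tracks (55 to 58), and so on, for a total of 498 tracks of movement.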
Shortest Seek Time First
• SSTF selects as the next request the one requiring the minimum
seek time from the current position. Fig. 5.4.2 shows the SSTF
algorithm.
• In Shortest-Seek-Time-First (SSTF) scheduling priority is given to
those processes which need the shortest seek, even if these requests
are not the first ones in the queue.
• It means that all requests nearer to the current head positions are
serviced together before moving head to distant tracks.
• Select the disk I/O request that requires the least movement of the
disk arm from its current position and always choose the minimum
seek time.
• The only tricky part is if there are two jobs with the same distance.
In this case, some kind of tie-breaking needs to be employed to pick
one. For instance, you could just use a random number to effectively
flip a coin and pick one.
• SSTF does not ensure fairness and can lead to indefinite
postponement because its seek pattern tends to be highly localized.
• Under heavy load, SSTF can prevent distant requests from ever
being serviced. This phenomenon is known as starvation.
• SSTF scheduling is essentially a form of shortest-job-first
scheduling. The SSTF scheduling algorithm is not very popular
for two reasons:
1. Starvation is possible.   2. It incurs higher overhead.
Advantages:
1. Throughput is better than FCFS.
2. SSTF minimizes the response time.
3. Fewer head movements.
Disadvantages:
1. One major issue is starvation.
2. Service becomes localized under heavy load.
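A greedy SSTF sketch on the same example (function name illustrative; ties between equidistant requests are broken by `min`'s ordering here rather than a coin flip):

```python
def sstf(start, requests):
    """Repeatedly pick the pending request nearest the current head."""
    pending, pos, order, total = list(requests), start, [], 0
    while pending:
        nxt = min(pending, key=lambda t: abs(t - pos))  # closest request
        pending.remove(nxt)
        order.append(nxt)
        total += abs(nxt - pos)
        pos = nxt
    return order, total

order, total = sstf(100, [55, 58, 39, 18, 90, 160, 150, 38, 184])
print(order, total)   # nearby tracks are served first; far ones wait
```

The head drifts through the cluster of nearby tracks (90, 58, 55, 39, 38, 18) before ever reaching 150, 160 and 184, for a total of 248 tracks: much less movement than FCFS, but a clear picture of how distant requests can starve under continuous load.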
SCAN
• SCAN is also called elevator algorithm.
• The next request scheduled is closest to current request but in one
particular direction. All requests in other direction are put at the end
of the list.
• SCAN services tracks in only one direction i.e. either increasing or
decreasing track number.
• When SCAN reaches the edge of the disk (or track 0), it reverses
direction.
• Any request that lies in the head's path is serviced immediately,
while a request arriving just behind the head must wait until the head
moves to the end of the disk, reverses direction and returns before
being serviced.
• SCAN is called elevator because an elevator continues in one
direction servicing requests before reversing direction.
• Consider the example in which the requested tracks in the order
received are : 55, 58, 39, 18, 90, 160, 150, 38, and 184. Starting track
at 100 and the total number of tracks is 200. The head is moving
towards increasing track numbers.
• Current head position = 100
• Head is moving towards increasing number of track. Fig. 5.4.3
shows SCAN algorithm. (Refer Fig. 5.4.3 on previous page.)
Disadvantages:
1. Increased overhead.
2. Needs a directional bit.
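A SCAN sketch for the example, assuming the head starts moving toward higher track numbers (function name and the seek-distance accounting, which includes the travel to the disk edge, are illustrative choices):

```python
def scan_up(start, requests, max_track):
    """SCAN (elevator) moving toward higher tracks first: service
    requests on the way up, travel to the disk edge, then sweep down."""
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    order = up + down
    # Movement: up to the edge, then back down to the lowest request.
    total = (max_track - start) + ((max_track - down[-1]) if down else 0)
    return order, total

order, total = scan_up(100, [55, 58, 39, 18, 90, 160, 150, 38, 184], 199)
print(order, total)   # up: 150, 160, 184; reverse; down: 90 ... 18
```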
Circular SCAN
• C-SCAN also moves the head in one direction, but it offers fairer
service with more uniform waiting times.
• It moves the head from one end of the disk to the other end,
servicing requests along the way. When the head reaches the other
end, it immediately returns to the beginning of the disk, without
servicing any requests on the return trip.
• C-SCAN treats the cylinders as a circular list that wraps around
from the last cylinder to the first one. It scans in cycles, always
increasing or decreasing order. Fig. 5.4.4 shows C-SCAN. Consider
the example in which the requested tracks in the order received are:
55, 58, 39, 18, 90, 160, 150, 38 and 184. Starting track at 100 and
total number of track is 200. Head is moving towards increasing track
number.
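The circular variant differs only in what happens at the edge; a sketch for the same example (function name and the convention of counting the full return sweep in the seek total are illustrative assumptions):

```python
def c_scan_up(start, requests, max_track):
    """C-SCAN: sweep up servicing requests, jump from the top edge back
    to track 0 without servicing anything, then sweep up again."""
    up = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    order = up + wrapped
    # Movement: up to the edge, the full return sweep to track 0, then
    # up again to the highest request below the starting position.
    total = (max_track - start) + max_track + (wrapped[-1] if wrapped else 0)
    return order, total

order, total = c_scan_up(100, [55, 58, 39, 18, 90, 160, 150, 38, 184], 199)
print(order, total)   # all requests are served in one circular direction
```

Because every request is reached by an upward sweep, a request just behind the head waits roughly one full cycle regardless of where it is, which is why C-SCAN's waiting times are more uniform than SCAN's.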
LOOK Scheduling
• Start the head moving in one direction. Satisfy the request for the
closest track in that direction; when there are no more requests in
the direction the head is traveling, reverse direction and repeat.
• This algorithm is similar to SCAN, but unlike SCAN, the head does
not unnecessarily travel to the innermost and outermost track on each
circuit.
• Fig. 5.4.5 shows the look scheduling algorithm.
• For example, the disk request queue contains a set of references for
blocks on tracks 76, 124, 17, 269, 201, 29, 137 and 12.
     • Current head position= 76
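A LOOK sketch on this queue, assuming the head at 76 begins by moving toward higher tracks (the initial direction is not stated in the text, and the function name is an illustrative choice):

```python
def look_up(start, requests):
    """LOOK: like SCAN, but reverse at the last pending request in each
    direction instead of travelling all the way to the disk edge."""
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    order = up + down
    total, pos = 0, start
    for t in order:              # sum the actual head movement
        total += abs(t - pos)
        pos = t
    return order, total

order, total = look_up(76, [124, 17, 269, 201, 29, 137, 12])
print(order, total)   # up to 269, reverse, down to 12; no edge travel
```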
C-LOOK
• C-LOOK is the same as C-SCAN except that the head stops after
servicing the last request in the preferred direction, then services
the request at the cylinder nearest the opposite end of the disk.
     • Example: Consider the example in which the requested tracks in the
     order received are 23, 89, 132, 42, 187. Starting track at 100 and there
     are 200 cylinders numbered from 0 - 199.
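A C-LOOK sketch on this example, again assuming an initial upward sweep (function name illustrative):

```python
def c_look_up(start, requests):
    """C-LOOK: sweep up only to the highest pending request, then jump
    to the lowest pending request and continue upward."""
    up = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    order = up + wrapped
    total, pos = 0, start
    for t in order:              # sum the actual head movement
        total += abs(t - pos)
        pos = t
    return order, total

# The text's example: requests 23, 89, 132, 42, 187, head at track 100.
order, total = c_look_up(100, [23, 89, 132, 42, 187])
print(order, total)   # up: 132, 187; jump to 23; continue: 42, 89
```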
2.    Write briefly about file attributes, operations, types and            CO4
      structure.
     Files are the building blocks of any operating system. A file is the
     permanent storage of information and data, and it provides an easy
     means of information sharing. The operating system is not interested
     in the information stored in the files; it maps files onto physical
     devices.
     • A file is an unstructured sequence of data: a logical unit of
     information created by a process. Information stored in files must be
     persistent. A file is any organized information stored on secondary
     storage; this is the physical view of a file.
• A program which a user prepares is a file. Files represent programs
and data. The output from the compiler may be an object-code file or
an executable file.
• The content of a file is decided by its creator. Files contain various
types of data: source programs, object programs, text and numeric
data, images, graphs, sounds, payroll records, dead-stock records,
etc.
File Naming
• Files store information on the disk; the file is an abstract
mechanism. A file name is given when a process creates a file. The
file name continues to exist after the file is closed, and the file can
be accessed by other processes using that name.
• File-naming rules differ between operating systems. Early operating
systems allowed file names of up to eight characters; letters, digits
and certain special characters may be used in a file name. Many file
systems support names as long as 255 characters. A file name
consists of two parts separated by a period, for example: sorting.c
• The part of the file name after the period is called the file extension.
The file extension indicates something about the file. The UNIX
operating system supports more than one file extension; it is up to
the user how many extensions a file name carries.
• MS-DOS supports only one file extension, of up to three characters
after the period. The following is a list of file extensions.
Types of File
• The file type represents the internal structure of the file. According
to structure, files are divided into certain types: text, source,
executable and object files.
1. Text file: A sequence of characters organized into lines. A text file
is a regular file.
2. Source file: It contains sequences of subroutines, procedures and
functions.
3. Executable file: It contains a series of code sections. This file is
input to the loader, which loads it into memory and executes it.
4. Object file: An object file is input to the linker. It is a binary file
that the compiler creates when it converts source code to object
code.
File Types
File is divided into following types:
1. Ordinary file
2. Directory file
3. FIFO file
4. Character device file
5. Block device file.
Ordinary file
• Also called a regular file, it contains only data, as a stream of
characters. An ordinary file cannot contain another file or a
directory.
• It is either a text file or a binary file.
• Examples of ordinary files include simple text files, application data
files, files containing high-level source code, executable text files,
and binary image files.
• Text file: It contains only printable characters.
• Binary file: It contains printable and unprintable characters that
cover the entire ASCII range.
• Regular files may be created, browsed through, and modified by
various means such as text editors.
Directory file
• A directory file is like a folder that contains other files, including
subdirectory files.
• Directory files act as containers for other files of any category.
Directory files don't contain data in the user sense of data; they
merely contain references to the files contained within them.
• A directory is created in UNIX by the mkdir command.
• A UNIX directory is considered empty if it contains no files other
than "." and "..".
• A directory is removed using the rmdir command, and its contents
are displayed by the ls command.
Device file
• Special files are also known as device files. In UNIX all physical
devices are accessed via device files; they are what programs use to
communicate with hardware.
• Files hold information on location, type, and access mode for a
specific device. There are two types of device files; character and
block.
• Block device files are used for block-device I/O. Block devices do
buffered I/O, meaning that data is collected in a buffer until a full
block can be transferred.
• Character device files are associated with character or raw device
access. They are used for unbuffered data transfers to and from a
device: rather than transferring data in blocks, the data is transferred
character by character.
• One transfer can consist of multiple characters.
• Some devices, such as disk partitions, may be accessed in block or
character mode. Because each device file corresponds to a single
access mode, physical devices that have more than one access mode
will have more than one device file.
• Device files are found in the /dev directory. Each device is assigned
a major and a minor device number.
• The major device number identifies the type of device, i.e. all SCSI
devices would have the same number, as would all the keyboards.
• The minor device number identifies a specific device, e.g., the
particular keyboard attached to a workstation.
• Device files are created using the mknod command.
• Examples of character device files are printers, modems and consoles.
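The type and device numbers of a device file can be inspected programmatically. A small sketch, assuming a UNIX-like system where /dev/null (a character device) exists:

```python
import os
import stat

st = os.stat("/dev/null")            # /dev/null is a character device on UNIX
print(stat.S_ISCHR(st.st_mode))      # → True: character (raw) device
print(stat.S_ISBLK(st.st_mode))      # → False: not a block device

# The device number splits into a major number (device type)
# and a minor number (the specific device).
print(os.major(st.st_rdev), os.minor(st.st_rdev))
```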
FIFO file
• A FIFO file is a special pipe device file which provides a
temporary buffer for two or more processes to communicate by writing
data to and reading data from the buffer.
• A FIFO is created by the mknod system call and may be opened and
accessed by any process that knows the name and has the
permissions.
• The buffer associated with a FIFO file is allocated when the first
process opens the FIFO file for read or write. The buffer is discarded
when all processes which are connected to the FIFO close their
reference to the FIFO file.
• FIFO file may be removed like any regular file. FIFO files can be
removed in UNIX via the rm command.
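The FIFO behaviour above can be sketched in Python on a UNIX-like system (the path and message are hypothetical):

```python
import os
import tempfile
import threading

# Create a FIFO, then let a writer thread and the main thread
# communicate through it.
path = os.path.join(tempfile.mkdtemp(), "myfifo")
os.mkfifo(path)                      # cf. the mknod system call in the text

def writer():
    with open(path, "w") as f:       # blocks until a reader opens the FIFO
        f.write("hello via FIFO")

t = threading.Thread(target=writer)
t.start()
with open(path) as f:                # blocks until the writer opens it
    data = f.read()
t.join()
print(data)                          # → hello via FIFO
os.unlink(path)                      # a FIFO is removed like any regular file
```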
File Attributes
• One of the characteristics of a file is a set of file attributes
that give the operating system more information about the file and how
it is intended to be used. For the convenience of human users, a name
is given to the file. File attributes vary from system to system.
• File attributes are as follows:
1. Name
2. Identifier
3. Type
4. Location/place
5. Size
6. Protection
7. Date and Time
• The file name is in human-readable form. A user can perform any
operation on a file using its name.
• When a file is created by a user, the operating system assigns a
unique identification number to it. The OS uses this identifier for its
operation.
• Type information is required for systems that support different types
of files.
• Location/place information is pointer to a device and actual location
of files on the device. The information needed to locate the file on
disk is kept in memory.
• Size is measured in bytes, words or blocks. It gives an idea of the
current size of the file.
• Protection: This information determines who can read, write or
execute the file, so the operating system can allow or deny access
requests. Attributes like protection, password, creator and owner
provide protection to a file.
• Date and time: This information is related to the creation and
modification of files. It is used by the operating system for
protection, security and monitoring purposes.
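Most of these attributes can be read back with a stat call. A minimal Python sketch with a hypothetical file name:

```python
import os
import stat
import tempfile
import time

# Create a small file and inspect its attributes via os.stat.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("hello")

st = os.stat(path)
print(st.st_ino)                    # identifier: inode number assigned by the OS
print(st.st_size)                   # size in bytes → 5
print(stat.filemode(st.st_mode))    # protection bits, e.g. "-rw-r--r--"
print(time.ctime(st.st_mtime))      # date and time of last modification
```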
Operation on File
• File operations are simply those things that a user can perform on a
file. For example, a user can create a file, save a file, open a file,
read a file and modify a file. There are many different types of file
operations supported by the operating system.
• The operating system provides system calls for performing various
operations on a file. The six basic file operations are create, write,
read, reposition, delete and truncate.
• Following is a list of file operations:
1. Create
2. Write
3. Close
4. Read
5. Delete
6. Reposition
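The basic operations above can be sketched with POSIX-style calls from Python's os module (the file name is hypothetical):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")   # hypothetical file

fd = os.open(path, os.O_CREAT | os.O_RDWR)   # create / open
os.write(fd, b"hello world")                 # write
os.lseek(fd, 0, os.SEEK_SET)                 # reposition to the start of the file
data = os.read(fd, 5)                        # read
print(data)                                  # → b'hello'
os.ftruncate(fd, 5)                          # truncate the file to 5 bytes
os.close(fd)                                 # close
os.unlink(path)                              # delete
```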
File Structure
• File structure is represented by field, record, file and database. It is
also represented by byte sequence, record sequence and tree.
• Fig. 6.1.1 shows the three different kinds of files. UNIX and
Windows both use the unstructured byte sequence.
• A byte sequence is an unstructured file. The operating system has
nothing to do with the content of the file; all content is in byte
format. The Windows and UNIX operating systems use this byte-sequence
method. It provides more flexibility to the user, who can put anything
in the file and assign it whatever name they want.
• In the record sequence model, a file contains a sequence of
fixed-length records. Read and write operations on this type of file
are performed one record at a time. It uses the idea of the 80-column
punched card; many early operating systems based their file systems on
files consisting of 80-character records.
• The record sequence model also supports 132-character records; line
printers use 132 characters per line.
• The last method is the tree structure. A file consists of a tree of
records. Each record contains a key field at a fixed position. The key
field is used to sort the records of the tree.
• Field, record, file and database are used in these file structures. A field is
a basic element of data. A person's last name and a date are examples
of fields. A field contains a single value. Length and data type are
the characteristics of a field.
• A collection of related fields is a record. A student record
contains a name, address, mobile number, university exam number,
college name, etc. Records may be fixed length or variable length. A
record may also contain one or more sub-records.
• A file is a collection of similar records. From the user's view, a
file is a single entity referenced by name.
• A database is a collection of related data. All records in a
database have some relation. An engineering college database contains
information about the college, students, staff, university, syllabus,
timetable, fee structure, various departments, etc. One or more fields
are used in a database.
3.    Discuss in detail about magnetic disk
     Magnetic Disk
• To store computer data, hard disks, flash memory, magnetic tape and
optical disks are used.
• An alternative to the hard disk is the Solid State Disk (SSD). These
flash-memory-based devices offer a different set of tradeoffs from a
standard disk. At the same time, the capacity of the hard disk also
keeps increasing.
• A hybrid category of storage media has also been introduced: hard
disks with large flash-memory buffers. Disk sizes are specified in
gigabytes.
• The performance of hard disk and SSD technology is compared below:
• A hard disk contains several rotating platters coated with magnetic
film. They are read and written by tiny heads mounted on a metal arm
that swings back and forth to position them. The heads float close to
the surface of the platters but do not actually touch them.
• A hard disk is really a set of stacked "disks," each of which, like
phonograph records, has data recorded electromagnetically in
concentric circles or "tracks" on the disk.
• A "head" writes or reads the information on the tracks. Two heads,
one on each side of a disk, read or write the data as the disk spins.
Each read or write operation requires that data be located, which is an
operation called a "seek."
• A hard disk/drive unit comes with a set rotation speed varying from
4500 to 7200 rpm. Disk access time is measured in milliseconds.
• Although the physical location can be identified with cylinder, track
and sector locations, these are actually mapped to a Logical Block
Address (LBA) that works with the larger address range on today's
hard disks.
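The CHS-to-LBA mapping mentioned above can be sketched with an illustrative geometry (the head and sector counts below are assumptions, not from any particular drive):

```python
# Mapping a (cylinder, head, sector) address to a Logical Block Address.
# Geometry values are illustrative only.
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63

def chs_to_lba(c, h, s):
    """Sectors are numbered from 1 in CHS addressing, hence the s - 1."""
    return (c * HEADS_PER_CYLINDER + h) * SECTORS_PER_TRACK + (s - 1)

print(chs_to_lba(0, 0, 1))   # first sector on the disk → 0
print(chs_to_lba(0, 1, 1))   # first sector under the next head → 63
```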
• Mean Time Between Failures (MTBF) is the predicted elapsed time
between inherent failures of a system during operation. MTBF can be
calculated as the arithmetic mean (average) time between failures of a
system.
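As a worked example of that arithmetic mean, with invented numbers:

```python
# MTBF as the arithmetic mean of time between failures.
def mtbf(total_operating_hours, failures):
    return total_operating_hours / failures

print(mtbf(10_000, 4))   # → 2500.0 hours between failures, on average
```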
• Each disk drive is managed by a controller. Each type of controller
can support a fixed number of drives. SCSI controllers support up to
seven or up to 15 disks per controller, depending on the controller type.
• Each disk is assigned a drive address. This address is set by a
switch, a dial or jumpers on the disk or by the physical location of the
disk. Some SCSI devices, such as RAID's, have an additional
identifying number called a logical unit number. It is used to address
disks within the device.
• Fig. 5.1.1 shows physical disk.
• A disk consists of circular plates called platters. Each platter has an
upper and lower oxide-coated surface. Recording heads, at least one
per surface, are mounted on arms that can be moved to various radial
distances from the center of the platters. The heads float very close to
the surfaces of the platters, never actually touching them, and read
and record data as the platters spin around.
• A ring on one surface is called a track. Each track is divided into
disk blocks. Sometimes called sectors, these physical blocks on a disk
are different from file system blocks.
• Formatting a disk divides the disk into tracks and disk blocks that
can be addressed by the disk controller, writes timing marks, and
identifies bad areas on the disk.
• SCSI disk drives are shipped preformatted. They do not require
formatting at any time. Bad block handling is performed
automatically by SCSI disks. Bad blocks are areas of a disk that
cannot reliably store data.
Platter
• Platter is a circular, metal disk that is mounted inside a hard disk
drive. Several platters are mounted on a fixed spindle motor to create
more data storage surfaces in a smaller area.
• The platter has a core made up of aluminum or glass substrate,
covered with a thin layer of ferric oxide or cobalt alloy. On both
sides of the substrate material, a thin coating is deposited by a
special manufacturing technique. This thin coating, where the actual
data is stored, is the media layer.
• Attributes of platter
1. It is a rigid, round disk that is coated with magnetically sensitive
material.
2. Data is stored in binary code.
3. It is encoded by polarizing magnetic areas on the disk surface.
4. Data can be written to and read from both surfaces of a platter.
5. A platter's storage capacity varies across drives.
Tracks
• Each platter is broken into thousands of tightly packed concentric
circles, known as tracks. These tracks resemble the structure of
annual rings of a tree.
• All the information stored on the hard disk is recorded in tracks.
Starting from zero at the outer side of the platter, the number of tracks
goes on increasing to the inner side.
• Each track can hold a large amount of data, amounting to thousands
of bytes.
Sectors
• Each track is further broken down into smaller units called sectors.
A sector is the basic unit of data storage on a hard disk. A disk
contains concentric tracks; tracks are divided into sectors. A sector
is the smallest addressable unit on a disk.
• Fig. 5.1.2 shows surface of disk showing tracks and sectors.
• A single track typically can have thousands of sectors and each
sector can hold more than 512 bytes of data. A few additional bytes
are required for control structures and error detection and correction.
Clusters: Sectors are often grouped together to form clusters.
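With an illustrative geometry (all numbers below are assumed, not from a real drive), the total capacity follows directly from platters, tracks, sectors and sector size:

```python
# Rough capacity for an invented disk geometry.
platters = 4
surfaces = platters * 2          # data is recorded on both sides of each platter
tracks_per_surface = 10_000
sectors_per_track = 2_000
bytes_per_sector = 512

capacity = surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector
print(capacity / 10**9, "GB")    # → 81.92 GB
```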
Read/Write Heads
• The heads are an interface between the magnetic media where the
data is stored and electronic components in the hard disk. The heads
convert the information, which is in the form of bits to magnetic
pulses when it is to be stored on the platter and reverses the process
while reading.
• The heads are the most sophisticated part of the hard disk. Each
platter has two read/write heads, one mounted on the top and the other
on the bottom. These heads are mounted on head sliders, which are
suspended at the ends of the head arms.
• The head arms are all fused into a singular structure called actuator,
which is responsible for their movement. Drives rotate at 60 to 200
times per second.
• Transfer rate is rate at which data flow between drive and computer.
Positioning time (random-access time) is time to move disk arm to
desired cylinder (seek time) and time for desired sector to rotate
under the disk head. Head crash results from disk head making
contact with the disk surface.
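The access-time components above can be combined in a worked example (the seek and transfer times are assumed values; average rotational latency is half a rotation):

```python
# Average access time = seek time + rotational latency + transfer time.
rpm = 7200
seek_ms = 9.0                                  # average seek time (assumed)
rotation_ms = 60_000 / rpm                     # one full rotation: ~8.33 ms at 7200 rpm
latency_ms = rotation_ms / 2                   # average rotational latency: ~4.17 ms
transfer_ms = 0.1                              # time to transfer one block (assumed)

access_ms = seek_ms + latency_ms + transfer_ms
print(round(access_ms, 2))                     # → 13.27
```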
• Each platter (disc-shaped) is coated with magnetic material on both
surfaces. Each platter surface has an arm extended from a fixed
position. The tip of the arm contains a read/write head for reading or
writing data.
• The arm moves the heads from the spindle edge to the edge of the
disc.
• When a program reads a byte from the disk, the operating system
locates the surface, track and sector containing that byte, and reads
the entire sector into a special area in main memory called buffer.
• The bottleneck of a disk access is moving the read/write arm.
• A cylinder is the set of tracks at a given radius of a disk pack. A
cylinder is the set of tracks that can be accessed without moving the
disk arm. All the information on a cylinder can be accessed without
moving the read/write arm.
Fig. 5.1.3 shows moving-head disk mechanism.
• The arm assembly is moved in or out to position a head on a desired
track. Tracks under heads make a cylinder. Only one head
reads/writes at any one time. Block size is a multiple of sector size.
• Disks can be removable. Drive attached to computer via I/O bus.
Buses vary, including EIDE, ATA, SATA, USB, Fibre channel, SCSI
etc.
• Host controller in computer uses bus to talk to disk controller built
into drive or storage array.
• Disk controllers are typically embedded in the disk drive; the
controller acts as an interface between the CPU and the disk hardware.
The controller has an internal cache that it uses to buffer data for
read/write requests.
Zone Bit Recording
• Also known as multiple zone recording or zone-CAV recording.
Disk divided into zones. Fig. 5.1.4 shows zone bit recording.
• Cylinders in different zones have a different number of sectors.
Number of sectors in a particular zone is constant. Data is buffered so
the data rate to the I/O interface is constant.
• Zoned-bit recording uses the disk more efficiently. It groups tracks
into zones that are based upon their distance from the center of the
disk.
• Each zone is assigned an appropriate number of sectors per track.
This means that a zone near the center of the platter has fewer sectors
per track than a zone on the outer edge.
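A toy calculation with an invented zone layout shows the capacity gained over a uniform layout in which every track held only the inner-zone sector count:

```python
# Zone bit recording: outer zones hold more sectors per track than inner zones.
# The zone layout below is invented purely for illustration.
zones = [
    {"tracks": 1000, "sectors_per_track": 400},   # inner zone
    {"tracks": 1000, "sectors_per_track": 600},   # middle zone
    {"tracks": 1000, "sectors_per_track": 800},   # outer zone
]

total_sectors = sum(z["tracks"] * z["sectors_per_track"] for z in zones)
uniform = len(zones) * 1000 * 400   # every track limited to the inner-zone count
print(total_sectors, uniform)       # → 1800000 1200000: ZBR recovers the difference
```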
4.    Discuss in detail about magnetic tapes                         CO4
• Magnetic tapes are used mostly for storing files of data, for
example, a company's payroll records.
• Access is sequential and consists of records that can be accessed
one after the other as the tape moves along a stationary read-write
mechanism.
• It is one of the cheapest and slowest methods of storage and has the
advantage that tapes can be removed when not in use.
• Tape is a magnetically coated strip of plastic on which data can be
encoded. Tape is much less expensive than other storage media but is a
much slower solution, commonly used for backup.
Fig. 5.1.5 shows the connection of tape with the processor.
• In addition, random access to magnetic tape is about a thousand
times slower than random access to magnetic disk, so tapes are not
very useful for secondary storage.
     • The I/O bus consists of data lines, address lines, and control lines.
     Each peripheral device has associated with it an interface unit. Each
     interface decodes the address and control received from the I/O bus,
     interprets them for the peripheral, and provides signals for the
     peripheral controller.
     • The I/O bus from the processor is attached to all peripheral
     interfaces. To communicate with a particular device, the processor
     places a device address on the address lines. Each interface attached
     to the I/O bus contains an address decoder that monitors the address
lines.
• When the interface detects its own address, it activates the path
between the bus lines and the device that it controls.
• All peripherals whose address does not correspond to the address on
the bus have their interfaces disabled. At the same time that the
address is made available on the address lines, the processor provides
a function code on the control lines.
• The interface selected responds to the function code and proceeds to
execute it. The function code is referred to as an I/O command and is
in essence an instruction that is executed in the interface and its
attached peripheral unit.
• The interpretation of the command depends on the peripheral that
the processor is addressing. There are four types of commands that an
interface may receive. They are classified as control, status, data
output, and data input.
                      UNIT V VIRTUAL MACHINES AND MOBILE OS
Virtual Machines – History, Benefits and Features, Building Blocks, Types of Virtual
Machines and their Implementations, Virtualization and Operating-System Components;
Mobile OS - iOS and Android.
                                               PART - A
S.No.                             Question and Answer                           CO  Univ QP
                                                                                    (Month /
                                                                                     Year)
    1. Define Virtual Machine.                                                  CO5
        A virtual Machine (VM) is a software program or operating system
        that not only exhibits the behavior of a separate computer, but is also
        capable of performing tasks such as running applications and
        programs like a separate computer.
    2. What is a pure virtual machine?
        In a pure virtual machine architecture the operating system gives each
        process the illusion that it is the only process on the machine. The
        user writes an application as if only its code were running on the
        system.
    3. How does a user interact with a virtual machine?
        Each user interacts with the computer by typing commands to the
        virtual machine on a virtual system console and receiving results back
        from the machine as soon as they are computed.
    4. What is virtualization?
        Virtualization is an abstraction layer that decouples the physical
        hardware from the operating system to deliver greater IT resource
        utilization and flexibility.
        • It allows multiple virtual machines, with heterogeneous operating
        systems to run in isolation, side-by-side on the same physical
        machine.
    5. Draw the architecture of virtual machine.
    6. Mention the benefits of virtual machine.
   1. There is no memory overlap, as each virtual machine has its own
   memory space.
   2. Virtual machines are completely isolated from the host machine
   and other virtual machines.
   3. Data does not leak across virtual machines.
   4. Can use multiple operating system environments on the same
   computer
   5. The cost reduction is possible using small virtual servers on a more
   powerful single server.
7. List out the disadvantages of virtual machine.
   1. Virtual machines are less efficient than real machines because they
   access the hardware indirectly.
   2. A virtual machine can be infected with the weaknesses of the host
   machine
   3. Difficulty in direct access to hardware, for example, specific cards
   or USB devices
   4. Heavy use of disk space, since each virtual machine stores all
   the files of its own operating system.
8. What is VM/370?
   VM/370 is an operating system that gives multiple users access to a
   computer by means of keyboard and display terminals for time
   sharing, system testing, production and conversion. VM/370 manages
   the resources of a computer so that every user, local or remote,
   appears to have a complete replica of a System 370 including input
   output (I/O) devices.
9. Diagrammatically represent trap and emulate method
10. What happens when the kernel tries to execute a privileged
    instruction?
    When the kernel in the guest attempts to execute a privileged
    instruction, that is an error because the system is in user mode and
    causes a trap to the VMM in the real machine. The VMM gains
    control and executes the action that was attempted by the guest kernel
    on the part of the guest. It then returns control to the virtual machine.
    This is called the trap-and-emulate method
11. Write about binary translation
    Binary translation can be static or dynamic. Static binary
    translation means using a processor to translate an image from one
    architecture to another before execution. In dynamic binary
    translation, individual instructions or groups of instructions are
    translated on the fly, and the translation is cached to allow reuse
    in later iterations without repeated translation.
12. What is hardware assisted virtualization?
    In hardware-assisted virtualization the virtual layer sits in a new
    root-mode privilege level under level 0. Guest OS privileged and
    sensitive calls are set to automatically trap to the hypervisor,
    while user requests are executed directly on the CPU for high
    performance.
13. Define hypervisor.
    A hypervisor is a software layer installed on the physical hardware,
    which allows splitting the physical machine into many virtual
    machines. This allows multiple operating systems to run
    simultaneously on the same physical hardware.
14. What is VMM?
    A hypervisor management console, which is also called a virtual
    machine manager (VMM), is computer software that enables easy
    management of virtual machines.
15. What is Paravirtualization?
    Paravirtualization is a type of virtualization in which guest operating
    system (OS) is recompiled, installed inside a Virtual Machine (VM),
    and operated on top of a hypervisor program running on the host OS.
16. Draw the Paravirtualization architecture
17. List out the Problems with para-virtualization
    1. Para-virtualized systems won't run on native hardware.
    2. There are many different para-virtualization systems that use
    different commands, etc.
18. Differentiate between type 1 and type 2 hypervisor.
19. What are the memory management methods?
    a) Double-paging, in which the guest page table indicates a page is in
    a physical frame but the VMM moves some of those pages to backing
    store.
    b) Install a pseudo-device driver in each guest.
       • Balloon memory manager communicates with VMM and is told
    to allocate or de-allocate memory to decrease or increase physical
    memory use of guest, causing guest OS to free or have more memory
    available.
    c) De-duplication by VMM determining if same page loaded more
    than once, memory mapping the same page into multiple guests.
20. Discuss overall I/O for VMMs.
    Overall I/O is complicated for VMMs, because
    a) There are many short paths for I/O in standard OSes for improved
    performance.
    b) The less the hypervisor needs to do for guest I/O, the better.
    c) Possibilities include direct device access, DMA pass-through and
    direct interrupt delivery.
21. Define Live migration.
    Live migration provides the ability to move a running virtual machine
    between physical hosts with no interruption to service. The virtual
    machine remains powered on and user applications continue to run
    while the virtual machine is relocated to a new physical host
22. List the core applications of android.
    Android provides a set of core applications:
    1. Email client
    2. SMS program
    3. Calendar
    4. Maps
    5. Browser
    6. Contacts
23. Compare and contrast Android OS and iPhone OS
24. Mention the benefits of Android.
    1. An open and free development platform. Handset makers can use it
    without royalty and customize it to their hearts' content.
    2. Component-based architecture: Lots of default components can be
    replaced straightforwardly.
25. What is SQLite?
    SQLite is an open source embedded database. The design goals of
    SQLite were to allow the program to be operated without database
    installation or administration.
                                         PART B
1. Explain in detail about Virtual Machines.                         CO5
   Virtual Machines
   • A Virtual Machine (VM) is a virtual environment that functions as a
   virtual computer system with its own CPU, memory, network
interface, and storage, created on a physical hardware system.
• A Virtual machine is a software construct that mimics the
characteristics of a physical server.
• A virtual Machine (VM) is a software program or operating system
that not only exhibits the behavior of a separate computer, but is also
capable of performing tasks such as running applications and
programs like a separate computer.
• In a pure virtual machine architecture the operating system gives
each process the illusion that it is the only process on the machine.
The user writes an application as if only its code were running on the
system.
• Each user interacts with the computer by typing commands to the
virtual machine on a virtual system console and receiving results back
from the machine as soon as they are computed.
• Each user directs the virtual machine to perform different
commands. These commands are then executed on the physical machine in
a multiprogramming environment.
• Virtualization is an abstraction layer that decouples the physical
hardware from the operating system to deliver greater IT resource
utilization and flexibility.
• It allows multiple virtual machines, with heterogeneous operating
systems to run in isolation, side-by-side on the same physical
machine.
Fig. 7.1.1 shows virtual machine.
Benefits:
1. There is no memory overlap, as each virtual machine has its own
memory space.
2. Virtual machines are completely isolated from the host machine
and other virtual machines.
3. Data does not leak across virtual machines.
4. Can use multiple operating system environments on the same
   computer
   5. The cost reduction is possible using small virtual servers on a more
   powerful single server.
   Disadvantages :
   1. Virtual machines are less efficient than real machines because they
   access the hardware indirectly.
   2. A virtual machine can be infected with the weaknesses of the host
   machine
   3. Difficulty in direct access to hardware, for example, specific cards
   or USB devices
   4. Heavy use of disk space, since each virtual machine stores all
   the files of its own operating system.
2. Discuss about the Building Blocks of virtual machines                   CO5
    Trap-and-Emulate
    • The problem with a VMM is that the guest OS expects to have
    unrestricted access to the hardware and runs privileged
    instructions, unlike user processes. But a guest cannot get such
    access; it must be isolated from other guests.
    • All CPUs have multiple privilege levels; x86 CPUs have rings 0,
    1, 2 and 3. Normally, user processes run in ring 3 and the OS in
    ring 0, and privileged instructions run only in ring 0. Under a
    VMM, user processes run in ring 3 and the VMM/host OS in ring 0.
    The guest OS must be protected from guest applications, but it is
    not fully privileged like the host OS/VMM, so it runs in ring 1.
    • Trap and emulate VMM: Guest OS runs at lower privilege level
    than VMM, traps to VMM for privileged operation.
    • When the kernel in the guest attempts to execute a privileged
    instruction, that is an error because the system is in user mode and
    causes a trap to the VMM in the real machine. The VMM gains
    control and executes the action that was attempted by the guest kernel
    on the part of the guest. It then returns control to the virtual machine.
    This is called the trap-and-emulate method.
    • Fig. 7.3.1 shows trap-and-emulate method.
• All non-privileged instructions run natively on the hardware,
providing the same performance for guests as for native applications.
Privileged instructions create extra overhead, however, causing the
guest to run more slowly than it would natively.
• In addition, the CPU is being multi-programmed among many
virtual machines, which can further slow down the virtual machines
in unpredictable ways.
• Any privileged action by guest OS traps to VMM, emulated by
VMM. Example: set IDT, set CR3, access hardware. Sensitive data
structures like IDT must be managed by VMM, not guest OS.
• Problems with trap and emulate:
1) The guest OS may realize it is running at a lower privilege level.
2) Some x86 instructions that change hardware state run in both
privileged and unprivileged mode without trapping.
• These problems occur because OSes were not developed to run at a
lower privilege level, and the instruction set architecture of x86 is
not easily virtualizable.
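The trap-and-emulate flow can be caricatured as a toy dispatcher. All names and the whole model are hypothetical; a real VMM operates at the hardware level, not in Python:

```python
# Toy model of trap-and-emulate (hypothetical instruction names).
PRIVILEGED = {"set_idt", "set_cr3"}       # sensitive state managed by the VMM

class ToyVMM:
    def __init__(self):
        self.shadow_state = {}            # VMM-held copies of sensitive structures

    def run(self, instr, operand=None):
        if instr in PRIVILEGED:
            # The guest runs de-privileged, so this instruction traps to
            # the VMM, which emulates the effect on its own shadow state.
            self.shadow_state[instr] = operand
            return "emulated"
        return "native"                   # non-privileged code runs directly on the CPU

vmm = ToyVMM()
print(vmm.run("add"))                     # → native
print(vmm.run("set_cr3", 0x1000))         # → emulated
```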
Binary Translation
• Binary translation can be static or dynamic. Static binary
translation means using a processor to translate an image from one
architecture to another before execution. In dynamic binary
translation, individual instructions or groups of instructions are
translated on the fly, and the translation is cached to allow reuse in
later iterations without repeated translation.
• To be more precise, a segment of the original code is first executed
in an interpreted mode and collected as a segment. Once the code has
been executed at least N times, it is translated (if not already
translated) and executed; from then on, the translated code is
executed.
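The hot-threshold scheme above can be sketched as a toy translation cache. The threshold, addresses and the "translation" itself are all invented for illustration:

```python
# Toy model of dynamic binary translation with a translation cache.
N = 2                        # translate a block once it has run N times
counts, cache = {}, {}

def translate(block):
    return ["t_" + insn for insn in block]      # stand-in for real translation

def execute(addr, block):
    if addr in cache:                           # cached: reuse without re-translating
        return "translated", cache[addr]
    counts[addr] = counts.get(addr, 0) + 1
    if counts[addr] >= N:                       # hot block: translate and cache it
        cache[addr] = translate(block)
        return "translated", cache[addr]
    return "interpreted", block                 # cold block: interpret as-is

print(execute(0x40, ["mov", "cli"])[0])         # → interpreted (first run)
print(execute(0x40, ["mov", "cli"])[0])         # → translated (hit threshold)
print(execute(0x40, ["mov", "cli"])[0])         # → translated (served from cache)
```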
   • In binary translation the virtualization layer sits at CPU
   privilege level 0 (the most privileged). The guest OS was supposed
   to run at level 0, but since the virtualization layer occupies that
   level, guest OS execution moves to privilege level 1, leaving user
   applications at level 3 as they are supposed to be.
   • The non-virtualizable kernel code of the guest O/S is translated by
   virtual layer into new sequences of instructions that have the intended
   effect on virtual hardware, while user level code is directly executed
   on the CPU for high performance. The benefit of this approach is that
   the O/S is fully abstracted from the underlying hardware thus it
   doesn't require any modification.
   Hardware Assisted
   • In hardware-assisted virtualization the virtual layer sits in a
   new root-mode privilege level under level 0. Guest OS privileged and
   sensitive calls are set to automatically trap to the hypervisor,
   while user requests are executed directly on the CPU for high
   performance.
   • Hardware assisted virtualization requires a compatible CPU like
   Intel VT-x and AMD's AMD-V to work.
   • This technique does not always perform as expected because of the
   high overhead of guest-OS-to-hypervisor transitions.
   • I/O is another area improved by hardware assistance. Consider that
   the standard DMA controller accepts a target memory address and a
   source I/O device and transfers data between the two without OS
   action. Without hardware assistance, a guest might try to set up a
   DMA transfer that affects the memory of the VMM or other guests.
3. Explain the Types of Virtual Machines and their CO5
   Implementations
   • In computing, a hypervisor is a virtualization platform that allows
   multiple operating systems to run on a host computer at the same
   time. The term usually refers to an implementation using full
   virtualization.
   • A hypervisor is a software layer installed on the physical hardware,
   which allows splitting the physical machine into many virtual
   machines. This allows multiple operating systems to run
   simultaneously on the same physical hardware.
   • The operating system installed on the virtual machine is called a
   guest OS, and is sometimes also called an instance. The hardware the
   hypervisor runs on is called the host machine.
   • A hypervisor management console, which is also called a virtual
   machine manager (VMM), is computer software that enables easy
   management of virtual machines.
   • Hypervisors are currently classified in two types: type 1 and type 2.
Type 0 Hypervisor
Fig. 7.4.1 shows type 0 hypervisor.
• A type 0 hypervisor is built with the minimum software components
required to fully virtualize guest OSs and control information flow
between guest OSs. Type 0 hypervisors are hardware-based solutions
that provide support for virtual machine creation and management via
firmware.
• The VMM itself is encoded in the firmware and loaded at boot time.
A guest image is loaded into each partition. "Partitions" and
"domains" are other names used with type 0 hypervisors.
• The feature set of a type 0 hypervisor tends to be smaller than that
of the other types because it is implemented in hardware. For
example, a system might be split into five virtual systems, each with
dedicated CPUs, memory and I/O devices.
• If there are too few I/O devices, some guests are not allocated
dedicated devices. In that case, the VMM may implement a control
partition running daemons that the other guests communicate with for
shared I/O.
Type 1 Hypervisor
• Type 1 hypervisor is software that runs directly on a given hardware
platform. A "guest" operating system thus runs at the second level
above the hardware.
• Type 1 VMs have no host operating system because they are
installed on a bare system. An operating system running on a Type 1
VM is fully virtualized: it sees a complete simulation of the
hardware that it is running on.
• Type 1 hypervisor is also called a native or bare-metal hypervisor
that is installed directly on the hardware, which splits the hardware
into several virtual machines where we can install guest operating
systems.
• Virtual machine management software helps to manage this
hypervisor and allows guest OSes to be moved automatically
between physical servers based on current resource requirements.
• It is completely independent from the operating system.
• The hypervisor is small as its main task is sharing resources
between different operating systems.
• A major advantage is that any problems in one virtual machine or
guest operating system do not affect the other guest operating systems
running on the hypervisor.
Type 2 Hypervisor
• This is also known as Hosted Hypervisor.
• In this case, the hypervisor is installed on an operating system and
then supports other operating systems above it.
• It is completely dependent on host operating system for its
operations. Fig. 7.4.2 shows type 2 hypervisor. (See Fig. 7.4.2 on next
page.)
• While having a base operating system allows better specification of
policies, any problems in the base operating system affect the entire
system as well, even if the hypervisor running above the base OS is
secure.
• Type 2 hypervisors typically don't support over-commitment
(dynamic allocation) of RAM, so care is required when allocating
resources to virtual machines.
• This is why we call type 2 hypervisors hosted hypervisors. As
opposed to type 1 hypervisors that run directly on the hardware,
hosted hypervisors have one software layer underneath. What we
have in this case is:
1. A physical machine.
2. An operating system installed on the hardware (Windows, Linux,
MacOS).
3. A type 2 hypervisor software within that operating system.
4. The actual instances of guest virtual machines.
• Type 2 hypervisors are typically found in environments with a small
number of servers. They are convenient for testing new software and
for research projects.
Paravirtualization
• Paravirtualization is a type of virtualization in which guest operating
system (OS) is recompiled, installed inside a Virtual Machine (VM),
and operated on top of a hypervisor program running on the host OS.
• Para-virtualization refers to communication between the guest OS
and the hypervisor to improve performance and efficiency.
• Para-virtualization involves modifying the OS kernel to replace non-
virtualizable instructions with hypercalls that communicate directly
with the virtualization layer's hypervisor.
• The hypervisor also provides hyper-call interfaces for other critical
kernel operations such as memory management, interrupt handling
and time keeping.
• Fig. 7.4.3 shows para-virtualization architecture.
• In Para-virtualization, the virtual machine does not necessarily
simulate hardware, but instead offers a special API that can only be
used by modifying the "guest" OS. This system call to the hypervisor
is called a "hypercall" in Xen.
• Xen is an open source para-virtualization solution that requires
modifications to the guest operating systems but achieves near native
performance by collaborating with the hypervisor.
• Microsoft Virtual PC takes a para-virtualization virtual machine
approach. User-mode Linux (UML) is another para-virtualization
solution that is open source.
• Each guest operating system executes as a process of the host
operating system. Cooperative Linux is a virtualization solution that
allows two operating systems to cooperatively share the underlying
hardware.
• Linux-VServer is an operating system-level virtualization solution
for GNU/Linux systems with secure isolation of independent guest
servers.
• The Linux KVM is a virtualization technology that has been
integrated into the mainline Linux kernel. Running as a single
loadable kernel module, a Linux kernel on virtualization-capable
hardware is able to act as a hypervisor and support unmodified Linux
and Windows guest operating systems.
   • Para-virtualization shares the process with the guest operating
   system.
   Problems with para-virtualization
   1. Para-virtualized systems won't run on native hardware.
   2. There are many different para-virtualization systems that use
   different commands, etc.
   • The main difference between full virtualization and
   paravirtualization in Cloud is that full virtualization allows multiple
   guest operating systems to execute on a host operating system
   independently while paravirtualization allows multiple guest
   operating systems to run on host operating systems while
   communicating.
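The hypercall mechanism described above can be sketched as follows; the Hypervisor class, the hypercall name and the dispatch table are invented for illustration and do not reflect Xen's real interface:

```python
# Minimal sketch of the paravirtualization idea: the guest kernel replaces a
# privileged instruction with an explicit hypercall into the hypervisor.
# All names here are illustrative, not a real hypervisor API.

class Hypervisor:
    def __init__(self):
        self.tick = 0

    def hypercall(self, name, *args):
        # Dispatch table standing in for the hypervisor's hypercall interface.
        if name == "set_timer":
            self.tick = args[0]
            return 0
        raise ValueError(f"unknown hypercall {name}")

class ParavirtGuestKernel:
    def __init__(self, hypervisor):
        self.hv = hypervisor

    def set_timer(self, value):
        # A native kernel would execute a privileged instruction here;
        # a paravirtualized kernel calls the hypervisor instead.
        return self.hv.hypercall("set_timer", value)

hv = Hypervisor()
guest = ParavirtGuestKernel(hv)
guest.set_timer(100)
print(hv.tick)  # the hypervisor performed the privileged operation: 100
```

The point of the sketch is that the guest kernel must be modified to make the call, which is exactly why para-virtualized systems will not run unchanged on native hardware.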
4. Briefly explain Virtualization and Operating-System Components
    • Operating system aspects of virtualization include management, I/O,
    storage and the unique VM migration feature.
    • Virtualization raises questions such as:
    a) How do VMMs schedule CPU use when guests believe they have
    dedicated CPUs?
    b) How can memory management work when many guests require
    large amounts of memory?
    CPU Scheduling
    • After virtualization, a single-CPU system acts like a multiprocessor:
    one or more virtual CPUs are created per guest. Generally the VMM
    has one or more physical CPUs and a number of threads to run on
    them.
    • The VMM has a number of physical CPUs available and a number
    of threads to run on those CPUs; these threads may be VMM threads
    or guest threads. Guests are configured with a certain number of
    virtual CPUs at creation time, and that number can be adjusted
    throughout the life of the virtual machine.
    • When there are enough CPUs for all guests, the VMM can allocate
    dedicated CPUs, and each guest then behaves much like a native
    operating system managing its CPUs.
    • Usually not enough CPUs are available; the VMM can then use
    standard scheduling algorithms to put threads on CPUs, adding a
    fairness aspect while allocating. Cycle stealing by the VMM and
    oversubscription of CPUs mean that guests do not get all the CPU
    cycles they expect.
    • Some VMMs provide an application to run in each guest to fix the
    time of day and provide other integration features.
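As a rough sketch of the oversubscribed case (an assumed round-robin policy, simplified to slot assignments rather than real threads):

```python
# Sketch (assumed, simplified): round-robin mapping of guest virtual CPUs
# onto a smaller number of physical CPUs, as a VMM scheduler might when
# CPUs are oversubscribed.
from collections import defaultdict

def schedule_round_robin(vcpus, physical_cpus, slots):
    """Assign each scheduling slot on each physical CPU to the next vCPU."""
    assignments = defaultdict(list)  # physical CPU -> vCPUs run on it
    i = 0
    for _ in range(slots):
        for pcpu in range(physical_cpus):
            assignments[pcpu].append(vcpus[i % len(vcpus)])
            i += 1
    return assignments

# Four guest vCPUs share two physical CPUs: each gets a fair share of slots,
# which also means each guest sees only half the cycles it might expect.
vcpus = ["guest0-vcpu0", "guest0-vcpu1", "guest1-vcpu0", "guest1-vcpu1"]
result = schedule_round_robin(vcpus, physical_cpus=2, slots=2)
print(result[0])  # ['guest0-vcpu0', 'guest1-vcpu0']
print(result[1])  # ['guest0-vcpu1', 'guest1-vcpu1']
```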
    Memory Management
    • Each virtual machine consumes memory based on its configured
size, plus additional overhead memory for virtualization. In
virtualized environments, there are more users of memory, leading to
more pressure on memory use.
• Sometimes the total memory allocated to guests exceeds the amount
that physically exists in the system. The extra need for efficient
memory use is not lost on the implementers of VMMs, who take
extensive measures to ensure the optimal use of memory.
• For example, VMware ESX guests have a configured amount of
physical memory; ESX then uses three memory-management
methods:
a) Double-paging, in which the guest page table indicates a page is in
a physical frame but the VMM moves some of those pages to backing
store.
b) Install a pseudo-device driver in each guest.
   • A balloon memory manager communicates with the VMM and is
told to allocate or de-allocate memory to decrease or increase the
physical memory use of the guest, causing the guest OS to free
memory or have more memory available.
c) De-duplication, in which the VMM determines whether the same
page is loaded more than once and, if so, memory-maps the single
copy of the page into multiple guests.
• Since hypervisor manages page sharing, the virtual machine
operating systems are unaware of what is happening in the physical
system.
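Method (c), de-duplication, can be sketched by hashing page contents; this simplified model ignores copy-on-write, which a real VMM would also need when a shared page is later modified:

```python
# Sketch of page de-duplication: the VMM hashes page contents and maps
# identical pages from different guests onto a single physical frame.
import hashlib

def deduplicate(guest_pages):
    """guest_pages: {guest: {page_no: bytes}} -> (frames used, mapping)."""
    frames = {}   # content hash -> physical frame id
    mapping = {}  # (guest, page_no) -> physical frame id
    for guest, pages in guest_pages.items():
        for page_no, content in pages.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest not in frames:
                frames[digest] = len(frames)  # allocate a new frame
            mapping[(guest, page_no)] = frames[digest]
    return len(frames), mapping

pages = {
    "guestA": {0: b"kernel-code", 1: b"dataA"},
    "guestB": {0: b"kernel-code", 1: b"dataB"},  # page 0 matches guestA's
}
frame_count, mapping = deduplicate(pages)
print(frame_count)  # 3 physical frames back 4 guest pages
print(mapping[("guestA", 0)] == mapping[("guestB", 0)])  # True: shared frame
```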
• Virtual memory ballooning is a computer memory reclamation
technique used by a hypervisor to allow the physical host system to
retrieve unused memory from certain guest virtual machines (VMs)
and share it with others.
• Memory ballooning allows the total amount of RAM required by
guest VMs to exceed the amount of physical RAM available on the
host. When the host system runs low on physical RAM resources,
memory ballooning allocates it selectively to VMs.
• If a VM only uses a portion of the memory that it was allocated, the
ballooning technique makes it available for the host to use.
• For example, if all the VMs on a host are allocated 8 GB of
memory, some of the VMs will only use half the allotted share.
Meanwhile, one VM might need 12 GB of memory for an intensive
process.
• Memory ballooning allows the host to borrow that unused memory
and allocate it to the VMs with higher memory demand.
• The guest operating system runs inside the VM, which is allocated a
portion of memory. Therefore, the guest OS is unaware of the
total memory available.
• Memory ballooning makes the guest operating system aware of the
host's memory shortage.
• Virtualization providers such as VMware enable memory
ballooning. VMware memory ballooning, Microsoft Hyper-V
dynamic memory, and the open source KVM balloon process are
similar in concept.
• The host uses balloon drivers running on the VMs to determine how
much memory it can take back from an under-utilizing VM. Balloon
drivers must be installed on any VM that participates in the
memory ballooning technique.
• Balloon drivers get the target balloon size from the hypervisor and
then inflate by allocating the proper number of guest physical pages
within the VM. This process is known as inflating the balloon; the
process of releasing the available pages is known as deflating the
balloon.
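The inflate/deflate cycle can be sketched as follows; the page counts and the driver interface are invented for illustration:

```python
# Sketch of balloon inflate/deflate: the hypervisor gives the balloon driver
# a target size; the driver pins ("allocates") that many guest pages, which
# the host can then reclaim.

class BalloonDriver:
    def __init__(self, guest_ram_pages):
        self.guest_ram_pages = guest_ram_pages
        self.balloon_pages = 0  # pages currently pinned by the balloon

    def set_target(self, target_pages):
        """Inflate or deflate toward the hypervisor-supplied target."""
        target_pages = min(target_pages, self.guest_ram_pages)
        reclaimed = target_pages - self.balloon_pages
        self.balloon_pages = target_pages
        return reclaimed  # >0: pages given to the host; <0: pages returned

    def usable_pages(self):
        return self.guest_ram_pages - self.balloon_pages

driver = BalloonDriver(guest_ram_pages=2048)
print(driver.set_target(512))   # inflate: host reclaims 512 pages
print(driver.usable_pages())    # guest now sees 1536 usable pages
print(driver.set_target(0))     # deflate: -512, pages returned to the guest
```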
Input-Output Management
• In some ways, I/O management is easier for the VMM because there
is already a lot of variation in I/O. For example, the OS device-driver
mechanism provides a uniform interface to the OS whatever the I/O
device. Device-driver interfaces are designed to allow third-party
hardware manufacturers to provide device drivers connecting their
devices to the operating system.
• Virtualization takes advantage of dynamically loaded and unloaded
of device driver by providing specific virtualized devices to guest
operating systems.
• But overall, I/O is complicated for VMMs because:
a) Standard OSes contain many short paths for I/O to improve
performance.
b) The less the hypervisor needs to do for guest I/O, the better.
c) Possibilities include direct device access, DMA pass-through and
direct interrupt delivery.
• Hardware support is needed for all of the above cases.
• With virtualization, each guest needs at least one IP address for
communication. So a server running a VMM may have dozens of
addresses, and the VMM acts as a virtual switch to route the network
packets to the addressed guests.
• Networking is also complex, as the VMM and guests all need
network access. The VMM can bridge a guest to the network,
allowing direct access, or provide Network Address Translation
(NAT).
• With NAT, the address is local to the machine on which the guest is
running, and the VMM provides address translation for the guest to
hide its address.
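The virtual-switch role of the VMM can be sketched as a routing table keyed by guest IP address (the addresses and guest names below are made up):

```python
# Sketch of the VMM acting as a virtual switch: packets arriving at the host
# are forwarded to the guest that owns the destination IP address.

class VirtualSwitch:
    def __init__(self):
        self.table = {}  # guest IP -> guest name

    def attach(self, ip, guest):
        self.table[ip] = guest

    def route(self, packet):
        """packet: (dst_ip, payload) -> name of the receiving guest."""
        dst_ip, _payload = packet
        return self.table.get(dst_ip, "dropped")

switch = VirtualSwitch()
switch.attach("192.168.1.10", "guestA")
switch.attach("192.168.1.11", "guestB")
print(switch.route(("192.168.1.11", b"hello")))  # guestB
print(switch.route(("192.168.1.99", b"hello")))  # dropped: no such guest
```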
Storage Management
• Both boot disk and general data access need to be provided by the
VMM. Virtualized environments need to approach storage
management differently than native operating systems do.
• The solution depends on the type of hypervisor. A type 1 hypervisor
stores guest root disks and configuration information within file
systems provided by the VMM as disk images. A type 2 hypervisor
stores them as files in the file system provided by the host OS.
• Guests sometimes need more disk space than is available in their
root disk image. VMMs also provide a mechanism to capture a
physical system as it is currently configured and convert it to a guest
that the VMM can manage and run:
a) Physical-to-virtual (P-to-V) conversion translates native disk
blocks into the VMM's format.
b) Virtual-to-physical (V-to-P) conversion translates from the virtual
format back to the native disk format.
• VMM also needs to provide access to network attached storage and
other disk images, disk partitions, disks etc.
Live Migration
• Live migration provides the ability to move a running virtual
machine between physical hosts with no interruption to service. The
virtual machine remains powered on and user applications continue to
run while the virtual machine is relocated to a new physical host. In
the background, the virtual machine's RAM is copied from the source
host to the destination host. Storage and network connectivity are not
altered.
• Taking advantage of VMM features leads to new functionality not
found on general operating systems such as live migration. Running
guest can be moved between systems, without interrupting user
access to the guest or its apps.
• Fig. 7.5.1 shows live migration of guest between servers.
• Steps:
   1. The source VMM establishes a connection with the target VMM.
   2. The target creates a new guest by creating a new VCPU.
   3. The source sends all read-only guest memory pages to the target.
   4. The source sends all read-write pages to the target, marking them
   as clean.
   5. The source repeats step 4, as during that step some pages were
   probably modified by the guest and are now dirty.
   6. When the cycle of steps 4 and 5 becomes very short, the source
   VMM freezes the guest, sends the VCPU's final state and other state
   details, sends the final dirty pages, and tells the target to start running
   the guest.
   • Once the target acknowledges that the guest is running, the source
   terminates the guest.
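Steps 4-6, the pre-copy loop, can be sketched as follows; the dirty-page sequence and the freeze threshold are invented for illustration:

```python
# Sketch of the pre-copy loop: keep re-sending pages the guest dirtied
# during the previous copy round, and freeze the guest only once the dirty
# set is small enough to send in one short final burst.

def live_migrate(pages, dirty_rounds, freeze_threshold=2):
    """pages: all guest pages; dirty_rounds: pages dirtied after each round."""
    sent = list(pages)            # first round: send every page
    log = [len(pages)]
    for dirty in dirty_rounds:
        if len(dirty) <= freeze_threshold:
            # Few enough dirty pages: freeze guest, send final state, switch.
            log.append(len(dirty))
            return log, "guest frozen, final pages sent, target takes over"
        sent = list(dirty)        # re-send only pages dirtied meanwhile
        log.append(len(dirty))
    return log, "migration did not converge"

pages = [f"page{i}" for i in range(100)]
dirty_rounds = [                  # a shrinking dirty set, as steps 4-5 repeat
    {f"page{i}" for i in range(40)},
    {f"page{i}" for i in range(10)},
    {"page3", "page7"},
]
log, outcome = live_migrate(pages, dirty_rounds)
print(log)      # [100, 40, 10, 2]: pages sent per round
print(outcome)
```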
5. Explain about Mobile OS: Android                                      CO5 AU: May-22
   • Android is an open source mobile OS developed by the Open
   Handset Alliance, led by Google.
   • Android is a software stack for mobile devices that includes an
   operating system, middleware and key applications.
   • It is based on Linux 2.6 kernel.
   • Android is an open source operating system, created by Google
   specifically for use on mobile devices (i.e. cell phones and tablets)
   • It can be programmed in C/C++ but most app development is done
   in Java. It supports Bluetooth, Wi-Fi and 3G and 4G networking.
Android Architecture
Fig. 7.6.1 shows Android software stack. Each layer of the stack and
the corresponding elements within each layer are tightly integrated
and carefully tuned to provide the optimal application development
and execution environment for mobile devices. (See Fig. 7.6.1 on
next page.)
• Android provides a set of core applications:
1. Email client
2. SMS program
3. Calendar
4. Maps
5. Browser
6. Contacts
7. Etc.
• All applications are written using the Java language.
App framework
• Used for enabling and simplifying the reuse of components
  1. Developers have full access to the same framework APIs used by
the core applications.
  2. Users are allowed to replace components.
• App Framework features are as follows:
Libraries
• Libraries include a set of C/C++ libraries used by components of the
Android system. They are exposed to developers through the Android
application framework.
Runtime
• The Android run-time system provides a core set of class libraries to
ensure a smooth platform for developers. With these libraries,
developers can easily import required libraries into their applications
without any hard coding.
Dalvik virtual machine
• Dalvik is a purpose built virtual machine designed specifically for
android which was developed by Dan Bornstein and his team. Strictly
it was developed for mobile devices. While developing Dalvik Virtual
Machine Dan Bornstein and his team realize the constraints specific
to mobile environment which is not going to change in future at least,
like battery life, processing power and many more. So they optimized
the dalvik virtual machine. Dalvik virtual machine uses register based
architecture. With this architecture dalvik virtual machine has few
advantages over java virtual machine such as:
1. Dalvik uses its own 16 bit instruction set than java 8 bit stack
instructions, which reduce the dalvik instruction count and raised its
interpreter speed.
2. Dalvik use less space, which means an uncompressed .dex file is
smaller in size(few bytes) than compressed java archive file(.jar file).
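The stack-based versus register-based distinction can be sketched with two toy interpreters computing 2 + 3; the instruction sets below are invented, not real JVM or Dalvik bytecode:

```python
# Toy contrast of the two interpreter styles the text describes: a
# stack-based VM (like the JVM) versus a register-based VM (like Dalvik).

def run_stack_vm(program):
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":                 # operands come from the stack
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack[-1]

def run_register_vm(program, num_regs=4):
    regs = [0] * num_regs
    for op, *args in program:
        if op == "load":
            regs[args[0]] = args[1]
        elif op == "add":                 # add dst, src1, src2: one instruction
            dst, s1, s2 = args
            regs[dst] = regs[s1] + regs[s2]
    return regs[0]

# The register add folds operand addressing into a single wider instruction,
# which is why larger expressions need fewer instructions overall.
stack_prog = [("push", 2), ("push", 3), ("add",)]
reg_prog = [("load", 1, 2), ("load", 2, 3), ("add", 0, 1, 2)]
print(run_stack_vm(stack_prog), run_register_vm(reg_prog))  # 5 5
```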
• Android is an open source software stack that includes an operating
system: a Linux kernel that provides the low-level interface with
the hardware, memory management and process control.
• Middleware: A run time to execute Android applications including
virtual machine and core libraries.
Important blocks in Android:
1. Activity manager: Manages the activity life cycle of applications
2. Content providers: Manage the data sharing between applications
3. Telephony manager: Manages all voice calls. We use telephony
manager if we want to access voice calls in our application.
4. Location manager: Location management, using GPS or cell
tower
5. Resource manager: Manage the various types of resources we use
in our application.
Android SDK features
The Android SDK includes,
1. The Android APIs.
2. The core of the SDK.
3. Development tools.
4. No licensing, distributions, or development fees or release approval
processes.
5. GSM, EDGE, and 3G networks for telephony and data transfer.
6. Full multimedia hardware control.
7. APIs for using sensor hardware including accelerometer and the
compass.
8. APIs for location based services.
Application framework
 1. Android offers developers the ability to build rich and innovative
applications.
 2. Developers have full access to the same framework APIs used by
the core applications.
 3. Underlying all applications is a set of services, including the view
system (Views).
 4. The view system can be used to build an application's interface,
including lists, grids, text boxes, buttons, and even an embeddable
web browser.
• Content providers enable applications to access data from other
applications (such as Contacts), or to share their own data.
• A resource manager provides access to non-code resources such as
localized strings, graphics and layout files.
• A notification manager enables all applications to display custom
alerts in the status bar.
• An activity manager manages the lifecycle of applications and
provides a common navigation backstack.
Libraries used in Android
• A set of C/C++ libraries used by various components of the Android
system.
• System C library: Tuned for embedded Linux-based devices.
• Media Libraries: Based on Packet Video's OpenCORE; the libraries
support playback and recording of many popular audio and video
formats, as well as static image files.
• Surface Manager: Manages access to the display subsystem and
seamlessly composites 2D and 3D graphic layers from multiple
applications.
• LibWebCore: A modern web browser engine which powers both the
Android browser and an embeddable web view.
• SGL/3D libraries: SGL is the underlying 2D graphics engine.
• SQLite: A powerful and lightweight relational database engine
available to all applications.
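Since SQLite is the same engine the list above says Android exposes to applications, here is a quick taste of it via Python's built-in sqlite3 module (the table and data are made up):

```python
# A minimal SQLite session: create a table, insert rows, query them back.
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.execute("INSERT INTO contacts VALUES (?, ?)", ("Asha", "555-0101"))
conn.execute("INSERT INTO contacts VALUES (?, ?)", ("Ravi", "555-0102"))
rows = conn.execute("SELECT name FROM contacts ORDER BY name").fetchall()
print(rows)  # [('Asha',), ('Ravi',)]
conn.close()
```

On Android the equivalent calls go through the platform's database classes, but the SQL itself is the same.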
Android run-time
• Android includes a set of core libraries that provide most of the
functionality available in the core libraries of the Java programming
language.
• Every Android app runs in its own process with its own instance of
the Dalvik virtual machine. The Dalvik VM executes files in the
Dalvik Executable (.dex) format.
Slow Android apps
1. By default on Android, all work is done in a single thread, the
"main application" thread. If one component of the work takes a long
time, the rest of the work is "blocked". For example, taking a long
time to access data across the network prevents responding to any
GUI events.
2. In the Android OS, if a GUI doesn't respond to an input event in
five seconds, then it is considered unresponsive and the OS will try to
kill it.
Android thread design
1. Only perform GUI actions on main application thread. Spawn
separate threads to perform data-intensive or slow actions. Make
these threads asynchronous.
2. Main thread does not have to wait for/check on other threads.
Instead, those threads run as they need to and report back to the
original thread. Any changes made to the UI should go through the UI
thread.
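The rules above (do slow work on a worker thread, report results back through a queue the main thread drains) can be sketched language-neutrally; the queue here stands in for Android's message-passing back to the UI thread:

```python
# Sketch of the Android threading pattern: a worker thread does the slow
# work and reports the result back; the main thread never blocks on it
# and is the only one that would touch the UI.
import queue
import threading
import time

ui_queue = queue.Queue()  # stand-in for the main/UI thread's message queue

def slow_network_fetch():
    time.sleep(0.1)               # simulate a slow operation
    ui_queue.put("data loaded")   # report back; never touch the UI directly

worker = threading.Thread(target=slow_network_fetch)
worker.start()
# The main thread stays responsive and picks up the result when ready.
worker.join()
print(ui_queue.get())  # data loaded
```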
Comparison of Android OS VS iPhone OS Features
Android applications
a. Android applications get distributed in a .apk file. APK stands for
"Android Package". It is simply a zip file that has a particular file
structure. An APK contains:
 1. The Android Manifest file (an XML file with lots of metadata).
 2. A Resource bundle containing sounds, graphics, etc.
 3. The Dalvik classes that make up the user application.
Android Benefits
1. An open and free development platform: handset makers can use it
without royalty and customize it to their hearts' content.
2. Component-based architecture: Lots of default components can be
   replaced straightforwardly.
    Proponents of Android point to the following benefits:
    a. Lots of services: location, SQL, maps, web, etc.
    b. Well-managed applications, isolated from each other to protect
    data and provide security.
    c. The operating system can quit programs as needed to ensure good
    performance on mobile devices.
   d. Portability: To support a new device, a company has to port the
   virtual machine; Android apps (Dalvik) then execute on the new
   device with little to no modification.
   Zygote
   • Android uses a concept called the Zygote to enable both sharing of
   code across VM instances and to provide fast startup time of new VM
   instances.
    • Zygote is a daemon process that provides efficient memory usage
    and low time overhead when Android runs multiple applications.
    Zygote is the parent of all application processes.
   • The Zygote is a VM process that starts at system boot time. When
   the Zygote process starts, it initializes a Dalvik VM, which preloads
   and preinitializes core library classes.
   • Generally, these core library classes are read-only and are therefore
   a good candidate for preloading and sharing across processes.
   • Once the Zygote has initialized, it will sit and wait for socket
   requests coming from the runtime process indicating that it should
   fork new VM instances based on the Zygote VM instance.
   • Cold starting virtual machines notoriously takes a long time and can
   be an impediment to isolating each application in its own VM. By
   spawning new VM processes from the Zygote, the startup time is
   minimized.
   • Additional memory need not be allocated for copies of these classes
   when a new DVM is forked from the Zygote DVM.
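The Zygote model can be sketched conceptually: preload shared state once, then have each "forked" app process reuse it. Real Android uses an actual fork(); this simulation only shows the sharing of preloaded, read-only state:

```python
# Conceptual sketch of the Zygote model: preload core classes once at boot,
# then spawn app "processes" that share the preloaded table instead of
# reloading it, which is what makes startup fast and memory use low.

CORE_CLASSES = None  # preloaded once by the Zygote at system boot

def zygote_start():
    global CORE_CLASSES
    CORE_CLASSES = {
        "java.lang.Object": "<loaded>",
        "android.view.View": "<loaded>",
    }

def fork_app(app_name):
    # A forked app reuses the shared preloaded classes (no reload) and
    # only adds its own per-app mutable state.
    return {"app": app_name, "classes": CORE_CLASSES, "private_heap": {}}

zygote_start()
app1 = fork_app("mail")
app2 = fork_app("browser")
# Both apps share the same preloaded class table object:
print(app1["classes"] is app2["classes"])  # True
```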
6. Discuss about iOS Technology                                          CO5
   • iOS is the operating system that runs on iPad, iPhone and iPod
   touch devices. The operating system manages the device hardware
   and provides the technologies required to implement native apps.
   • The iOS Software Development Kit (SDK) contains the tools and
   interfaces needed to develop, install, run, and test native apps that
   appear on an iOS device's Home screen.
   • Fig. 7.7.1 shows iOS architecture.
• The iOS Architecture is layered. At the highest level, iOS acts as an
intermediary between the underlying hardware and the apps you
create.
• Apps do not talk to the underlying hardware directly. Instead, they
communicate with the hardware through a set of well-defined system
interfaces. These interfaces make it easy to write apps that work
consistently on devices having different hardware capabilities.
• The Cocoa Touch layer contains key frameworks for building iOS
apps. These frameworks define the appearance of your app. They also
provide the basic app infrastructure and support for key technologies
such as multitasking, touch-based input, push notifications and many
high-level system services.
• High-level features of the Cocoa Touch layer are AirDrop,
multitasking, Auto Layout, storyboards, local notifications and the
Apple Push Notification service.
• Cocoa touch layer contains following frameworks for iPhone app
development:
a. UIKit framework
b. Map kit framework
c. Push notification service.
d. Message UI framework
e. Address book UI framework
f. Game kit framework
g. iAd framework
h. Event kit UI framework
i. Accounts framework
j. Twitter framework
• The Media layer contains the graphics, audio and video technologies
you use to implement multimedia experiences in your apps. The
technologies in this layer make it easy for you to build apps that look
and sound great.
Media Layer
• The role of the Media layer is to provide iOS with audio, video,
animation and graphics capabilities.
• As with the other layers comprising the iOS stack, the media layer
comprises a number of frameworks which may be utilized when
developing iPhone apps.
• The technologies in this layer make it easy for you to build apps that
look and sound great. They include:
a. Graphics technologies
b. Audio technologies
c. Video technologies
d. AirPlay.
• Media layer contains following frameworks:
1. Core video framework: This framework provides buffering
support for the Core Media framework. While this may be
utilized by application developers, it is typically not necessary to use
this framework.
2. Core text framework: The iOS core text framework is a C-based
API designed to ease the handling of advanced text layout and font
rendering requirements.
3. Image I/O framework: The Image I/O framework, the purpose of
which is to facilitate the importing and exporting of image data and
image metadata, was introduced in iOS 4. The framework supports a
wide range of image formats including PNG, JPEG, TIFF and GIF.
4. Assets library framework: The assets library provides a
mechanism for locating and retrieving video and photo files located
on the iPhone device. In addition to accessing existing images and
videos, this framework also allows new photos and videos to
be saved to the standard device photo album.
5. Core graphics framework: The iOS core graphics framework
provides a lightweight two dimensional rendering engine. Features of
this framework include PDF document creation and presentation,
vector based drawing, transparent layers, path based drawing, anti-
aliased rendering, color manipulation and management, image
rendering and gradients.
6. Core image framework: A new framework introduced with iOS 5
providing a set of video and image filtering and manipulation
capabilities for application developers.
7. Quartz core framework: The purpose of the Quartz Core
framework is to provide animation capabilities on the iPhone. It
provides the foundation for the majority of the visual effects and
animation used by the UIKit framework and provides an Objective-C
based programming interface for creation of specialized animation
within iPhone apps.
8. OpenGL ES framework: For many years the industry standard
for high performance 2D and 3D graphics drawing has been OpenGL.
OpenGL for Embedded Systems (ES) is a lightweight version of the
full OpenGL specification designed specifically for smaller devices
such as the iPhone.
 9. GLKit framework: The GLKit framework is an Objective-C
based API designed to ease the task of creating OpenGL ES based
applications.
10. NewsstandKit framework: The Newsstand application is a new
feature of iOS 5 and is intended as a central location for users to gain
access to newspapers and magazines. The NewsstandKit framework
allows for the development of applications that utilize this new
service.
11. iOS audio support: iOS is capable of supporting audio in AAC,
Apple Lossless (ALAC), A-law, IMA/ADPCM, Linear PCM,
u-law, DVI/Intel IMA ADPCM, Microsoft GSM 6.10 and AES3-
2003 formats through the support provided by the following
frameworks.
12. AV foundation framework: An Objective-C based framework
designed to allow the playback, recording and management of audio
content.
Core Services Layer
• The Core Services layer contains fundamental system services for
apps.
• Key among these services are the core foundation and foundation
frameworks, which define the basic types that all apps use.
• This layer also contains individual technologies to support features
such as location, iCloud, social media and networking.
Features:
1. Peer-to-Peer services
2. iCloud storage
3. Automatic reference counting
4. Block objects
5. Grand central dispatch
6. In-App purchase
7. SQLite
8. XML support
9. File-sharing support, data protection.
It consists of the following frameworks:
• Address book framework: This provides programmatic access to
the iPhone Address Book contact database, allowing applications to
retrieve and modify contact entries.
• CFNetwork framework: The CFNetwork framework provides a C-
based interface to the TCP/IP networking protocol stack and low level
access to BSD sockets. This enables application code to be written
that works with HTTP, FTP and domain name servers and to establish
secure and encrypted connections using Secure Sockets Layer (SSL)
or Transport Layer Security (TLS).
• Core Data Framework: This framework is provided to ease the
creation of data modeling and storage in Model-View-Controller
(MVC) based applications. Use of the Core Data framework
significantly reduces the amount of code that needs to be written to
perform common tasks when working with structured data within an
application.
• Core foundation framework: The core foundation framework is a
C-based Framework which provides basic functionality such as data
types, string manipulation, raw block data management, URL
manipulation, threads and run loops, dates and times, basic XML
manipulation and port and socket communication.
• Core Media framework: The lower-level foundation upon which
the AV Foundation framework is built.
• Core telephony framework: The iOS core telephony framework is
provided to allow applications to interrogate the device for
information about the current cell phone service provider and to
receive notification of telephony related events.
• EventKit framework: An API designed to provide applications
with access to the calendar and alarms on the device.
• Most applications will use iCloud document storage to share
documents from a user's iCloud account. This is the feature that users
think of when they think of iCloud storage. A user cares about
whether documents are shared across devices and can see and manage
those documents from a given device.
• Data protection allows applications that work with sensitive user
data to take advantage of the built-in encryption available on some
devices.
• When your application designates a specific file as protected, the
system stores that file on-disk in an encrypted format. While the
device is locked, the contents of the file are inaccessible to both your
application and to any potential intruders.
• However, when the device is unlocked by the user, a decryption key
is created to allow your application to access the file.
Core OS Layer
• The Core OS layer contains the low-level features that most other
technologies are built upon.
• Even if you do not use these technologies directly in your apps, they
are most likely being used by other frameworks.
• And in situations where you need to explicitly deal with security or
communicating with an external hardware accessory, you do so using
the frameworks in this layer.
• This layer provides a variety of services including low level
networking, access to external accessories and the usual fundamental
operating system services such as memory management, file system
handling and threads.
• The Core OS layer occupies the bottom position of the iOS stack
and, as such, sits directly on top of the device hardware.
• Accelerate framework: Introduced in iOS 4.0, the Accelerate
framework contains interfaces for performing DSP, linear algebra
and image-processing calculations. The advantage of using this
framework over writing your own versions of these interfaces is that
they are optimized for all of the hardware configurations present in
iOS-based devices. Therefore, you can write your code once and be
assured that it runs efficiently on all devices.
   • External accessory framework: It provides the ability to interrogate
   and communicate with external accessories connected physically to
   the iPhone via the 30-pin dock connector or wirelessly via Bluetooth.
   • Security framework: The iOS Security framework provides all the
   security interfaces you would expect to find on a device that can
   connect to external networks including certificates, public and private
   keys, trust policies, key chains, encryption, digests and Hash-based
   Message Authentication Code (HMAC).
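HMAC, named at the end of the Security framework description above, is a general construction rather than something specific to iOS. A minimal sketch using Python's standard-library hmac module (a stand-in for illustration, not the iOS Security framework API) shows the idea: both sides share a secret key, the sender computes a tag over the message, and the receiver recomputes and compares it.

```python
import hashlib
import hmac

# Shared secret key and the message to authenticate (illustrative values).
key = b"shared-secret"
message = b"attack at dawn"

# The sender computes an HMAC tag over the message using SHA-256.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# The receiver recomputes the tag with the same key and compares it in
# constant time; a forged or altered message fails this check.
ok = hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).hexdigest())
print(ok)  # True
```

The constant-time comparison (compare_digest) matters in practice: an ordinary `==` comparison can leak, through timing, how many leading characters of a forged tag were correct.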
• Core Bluetooth framework: This framework allows developers to
interact specifically with Bluetooth Low-Energy ("LE") accessories. The
   Objective-C interfaces of this framework allow you to scan for LE
   accessories, connect and disconnect to ones you find, read and write
   attributes within a service, register for service and attribute change
   notifications and much more.
   • System: The system level encompasses the kernel environment,
   drivers and low level UNIX interfaces of the operating system. The
   kernel itself is based on Mach and is responsible for every aspect of
   the operating system.
   • It manages the virtual memory system, threads, file system, network
   and interprocess communication. The drivers at this layer also
   provide the interface between the available hardware and system
   frameworks.
   • For security purposes, access to the kernel and drivers is restricted
   to a limited set of system frameworks and applications.
   • iOS provides a set of interfaces for accessing many low-level
   features of the operating system. Your application accesses these
   features through the LibSystem library.
7. Write short notes on Android File Management                                CO5
   • Android uses a file system that's similar to disk-based file systems
   on other platforms. A file object is suited to reading or writing large
   amounts of data in start-to-finish order without skipping around.
   • All Android devices have two file storage areas: "Internal" and
   "external" storage.
• An Android device may use an updated Linux file system, such as
ext4, or a proprietary file system from a manufacturer, depending on
who made the device and what the user has done to it.
• A file system is the collection place on a disk device for files.
Visualize the file system as consisting of a single node at the highest
level (the root) and all other nodes descending from the root node in a
tree-like fashion.
• Samsung Galaxy S phones use the Samsung RFS proprietary file
system while the Samsung Nexus S with Android 2.3 uses the Linux
Ext4 file system.
• /mnt: This directory is used for mount points. The different
physical storage devices (like hard disk drives, floppies and CD-ROMs)
must be attached to some directory in the file system tree
before they can be accessed. This attaching is called mounting and
the directory where the device is attached is called the mount point.
• SDCard: The mounted SDCard is a storage device mounted to the
file system in the typical Linux fashion. At the file system root,
/sdcard is a symbolic link to /mnt/sdcard. /mnt/sdcard is where the SD
card is actually mounted, but the same files can also be accessed in
/sdcard.
• Fig. 7.8.1 shows typical directory structure of android file system.
• The superblock is the key to maintaining the file system. It is an 8
kB block of disk space that maintains the current status of the file
system. Because of its importance, a copy is maintained in memory
and at each cylinder group within the file system.
• The copy in main memory is updated as events transpire. The
update daemon is the actual process that calls on the kernel to flush
the cached superblocks, modified inode and cached data blocks to
disk.
• Usually, a Linux system assumes all file systems are readable and
writable.
1. ODEX FILE
• In Android file system, applications come in packages with the
extension .apk. These application packages or APKs contain
certain .odex files whose supposed function is to save space.
• These 'odex' files are actually collections of parts of an application
that are optimized before booting. Doing so speeds up the boot
process, as it preloads part of an application.
• On the other hand, it also makes hacking those applications difficult
because a part of the coding has already been extracted to another
location before execution.
2. DEODEX
• Deodexing is basically the repackaging of these APKs in a certain
way, such that they are reassembled into classes.dex files. All pieces
of an application package are put back together in one place.
• Deodexed ROMs (or APKs) have all their application packages put
back together in one place, allowing for easy modification such as
theming. Since no pieces of code are coming from any external
location, custom ROMS or APKs are always deodexed to ensure
integrity.
7.8.1 SQLite
• SQLite is an open source embedded database. The design goals of
SQLite were to allow the program to be operated without a database
installation or administration.
• It is the most widely deployed SQL database engine in the world. SQLite
is based on the Structured Query Language (SQL). Android contains
the SQLite database management classes that an application would
use to manage its own private database.
• SQLite is an open source SQL database that stores data in a file on
the device. Android comes with a built-in SQLite database
implementation. SQLite supports all the relational database features.
• It is designed to provide a streamlined SQL-based database
management system suitable for embedded systems and other limited
memory systems. The full SQLite library can be implemented in
under 400 KB.
• In contrast to other database management systems, SQLite is not a
separate process that is accessed from the client application. The
library is linked in and thus becomes an integral part of the
application program.
Unique Features
1. No configuration is required.
2. No server process to administer or user accounts to manage.
3. Easy to backup and transmit database.
4. Dynamic typing for column values, variable lengths for column
records.
5. Query can reference multiple database files.
6. A few non-standard SQL extensions.
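SQLite itself is language-agnostic, so the "unique features" above can be demonstrated outside Android. A minimal sketch using Python's built-in sqlite3 module (a stand-in for illustration, not the Android API) shows the zero-configuration, serverless design and the dynamic typing of column values:

```python
import sqlite3

# Open (or create) a database: no server process, user accounts or
# configuration needed. ":memory:" keeps it in RAM; a file path would
# create an ordinary single-file database that is easy to back up.
conn = sqlite3.connect(":memory:")

# Dynamic typing for column values: "body" has no declared type, so the
# same column can hold a string in one row and an integer in another.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
conn.execute("INSERT INTO notes (body) VALUES (?)", (42,))
conn.commit()

rows = conn.execute("SELECT body FROM notes ORDER BY id").fetchall()
print(rows)  # [('hello',), (42,)]
conn.close()
```

Because the whole database lives in one ordinary file, "easy to backup and transmit" reduces to copying that file.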
• SQLiteDatabase provides methods to open the database connection,
perform queries and query updates and close the database. You can
define keys and values for queries via the ContentValues object. This
is necessary for insert and update calls. Delete only requires a where
clause identifying the rows.
• android.database.sqlite classes are as follows:
1. SQLiteCloseable - An object created from a SQLiteDatabase that
can be closed.
2. SQLiteCursor - A cursor implementation that exposes results
from a query on a SQLiteDatabase.
3. SQLiteDatabase - Exposes methods to manage a SQLite database.
4. SQLiteOpenHelper - A helper class to manage database creation
and version management.
5. SQLiteProgram - A base class for compiled SQLite programs.
6. SQLiteQuery - A SQLite program that represents a query that
reads the resulting rows into a CursorWindow.
7. SQLiteQueryBuilder - A convenience class that helps build SQL
queries to be sent to SQLiteDatabase objects.
8. SQLiteStatement - A pre-compiled statement against a
SQLiteDatabase that can be reused.
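The Android classes listed above cannot run outside a device or emulator, but the insert/update/delete flow they support can be sketched with Python's built-in sqlite3 module as a stand-in: the table name, columns and values below are illustrative, and on Android each step would go through SQLiteDatabase with a ContentValues object.

```python
import sqlite3

# Stand-in for an Android app's private database (illustrative schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contacts (_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")

# insert(): column names and values, as ContentValues would supply them.
db.execute("INSERT INTO contacts (name, phone) VALUES (?, ?)",
           ("Asha", "555-0100"))

# update(): the new values plus a where clause selecting rows to change.
db.execute("UPDATE contacts SET phone = ? WHERE name = ?",
           ("555-0199", "Asha"))

# delete(): only a where clause identifying the rows is needed.
db.execute("DELETE FROM contacts WHERE name = ?", ("Asha",))

count = db.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
print(count)  # 0
db.close()
```

The parameter placeholders (`?`) mirror the whereArgs arrays in the Android API; in both cases they keep user-supplied values out of the SQL string itself.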