
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND

DATA SCIENCE

AL3451 MACHINE LEARNING

QUESTION BANK

FOR IV SEMESTER B.TECH. DEGREE COURSE


[REGULATION-2021]
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND
DATA SCIENCE

QUESTION BANK

SUBJECT CODE : AL3451

SUBJECT NAME : MACHINE LEARNING

REGULATION : 2021

ACADEMIC YEAR : 2023 – 2024

YEAR/SEMESTER : II/IV

BATCH : 2022 - 2026

Prepared by: Verified by: Approved by:


Name: Name: Name:
Date: Date: Date:



COLLEGE VISION

Emerge as a premier institute producing globally competent engineers.

COLLEGE MISSION

IM1: Achieve academic diligence through effective and innovative teaching-learning processes using ICT tools.

IM2: Make students employable through rigorous career guidance and training programs.

IM3: Strengthen Industry-Institute Interaction through MoUs and collaborations.

IM4: Promote Research & Development by inculcating creative thinking through innovative project incubation.



VISION AND MISSION OF THE DEPARTMENT

VISION

To foster industry cooperation and impart cognitive learning in order to develop professionals who can adapt to the shifting demands of new trends in artificial intelligence and data science.

MISSION

DM1: To provide an excellent infrastructure that keeps up with modern trends and technologies for students and educators.

DM2: To impart knowledge in cutting-edge technology for Artificial Intelligence and Data Science with industrial standards.

DM3: To impart high-quality education embedded with moral and ethical principles.

DM4: To encourage lifelong learning and research that benefit society as a whole.



PROGRAM OUTCOMES (POs)
Engineering Graduates will be able to:
PO 1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
PO 2 Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
PO 3 Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with
appropriate consideration for the public health and safety, and the cultural, societal, and
environmental considerations
PO 4 Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data,
and synthesis of the information to provide valid conclusions.
PO 5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex
engineering activities with an understanding of the limitations.
PO 6 The Engineer and Society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent
responsibilities relevant to the professional engineering practice.
PO 7 Environment and Sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and demonstrate the
knowledge of, and need for sustainable development.
PO 8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities
and norms of the engineering practice.
PO 9 Individual and Team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
PO 10 Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give
and receive clear instructions.
PO 11 Project management and Finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a
member and leader in a team, to manage projects and in multidisciplinary environments.
PO 12 Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological
change.



PROGRAMME EDUCATIONAL OBJECTIVES (PEOs)

PEO 1: Prospective Career: To exhibit knowledge and skills in the various domain areas of
Computer Science and Engineering with an awareness in different disciplines for a prospective
career and to undertake novel research as per the needs of the society, government and industries.

PEO 2: Higher Education: To be strong in fundamentals of Computer Science and Engineering for successfully pursuing higher education and research in reputed institutions.

PEO 3: Product Development: To apply their knowledge and innovative ideas to design and
develop products in interdisciplinary areas for real time problems and to emerge as entrepreneurs.

PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO 1: Able to analyze and develop solutions to complex engineering problems in IT such as
Operating Systems, Networks, Data Structure, Database Management Systems, Cloud computing
and Mobile computing, to meet the society and industrial needs.

PSO 2: Able to solve the industry problems with knowledge in recent technology and tools such as
Machine Learning, Internet of Things, Oracle, Java Programming etc. with good team skills and
ethical values.



SYLLABUS
AL3451 MACHINE LEARNING L T P C

3 0 0 3

UNIT I INTRODUCTION TO MACHINE LEARNING 8

Review of Linear Algebra for machine learning; Introduction and motivation for machine learning;
Examples of machine learning applications, Vapnik-Chervonenkis (VC) dimension, Probably
Approximately Correct (PAC) learning, Hypothesis spaces, Inductive bias, Generalization, Bias
variance trade-off.

UNIT II SUPERVISED LEARNING 11

Linear Regression Models: Least squares, single & multiple variables, Bayesian linear
regression, gradient descent, Linear Classification Models: Discriminant function – Perceptron
algorithm, Probabilistic discriminative model - Logistic regression, Probabilistic generative model –
Naïve Bayes, Maximum margin classifier – Support vector machine, Decision Tree, Random
Forests.

UNIT III ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING 9

Combining multiple learners: Model combination schemes, Voting, Ensemble Learning - bagging,
boosting, stacking, Unsupervised learning: K-means, Instance Based Learning: KNN, Gaussian
mixture models and Expectation maximization.

UNIT IV NEURAL NETWORKS 9

Multilayer perceptron, activation functions, network training – gradient descent optimization –
stochastic gradient descent, error backpropagation, from shallow networks to deep networks – Unit
saturation (aka the vanishing gradient problem) – ReLU, hyperparameter tuning, batch
normalization, regularization, dropout.

UNIT V DESIGN AND ANALYSIS OF MACHINE LEARNING EXPERIMENTS 8

Guidelines for machine learning experiments, Cross Validation (CV) and resampling – K-fold CV,
bootstrapping, measuring classifier performance, assessing a single classification algorithm and
comparing two classification algorithms – t test, McNemar’s test, K-fold CV paired t test

TOTAL: 45 PERIODS



COURSE OBJECTIVE
1. To understand the basic concepts of machine learning.
2. To understand and build supervised learning models.
3. To understand and build unsupervised learning models.
4. To evaluate the algorithms based on the corresponding metrics identified.
COURSE OUTCOME
At the end of this course, the students will be able to:

CO1 : To explain the basic concepts of machine learning.

CO2 : To construct supervised learning models.

CO3 : To construct unsupervised learning algorithms.

CO4 : To evaluate and compare different models

CO-PO & PSO MAPPING
SNO  PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO1  2   1   2   1   -   -   -   -   3   3    2    2    2    2
CO2  2   3   3   1   2   -   -   -   2   2    2    1    3    1
CO3  2   1   3   3   2   -   -   -   1   1    1    1    1    2
CO4  2   3   3   2   1   -   -   -   3   2    3    2    1    2
Avg  2   2   2   1   1   -   -   -   2   2    2    1    1    1

UNIT I INTRODUCTION TO MACHINE LEARNING


Review of Linear Algebra for machine learning; Introduction and motivation for machine learning;
Examples of machine learning applications, Vapnik-Chervonenkis (VC) dimension, Probably
Approximately Correct (PAC) learning, Hypothesis spaces, Inductive bias, Generalization, Bias
variance trade-off.
PART - A
S.No. Question and Answer CO Univ QP (Month/Year)
1. What are the objectives of an operating system? CO1 April/May 2010, May/June 2012, April/May 2017
An operating system is a program that manages the computer hardware. It acts as an intermediary between the users of a computer and the computer hardware. It controls and coordinates the use of the hardware among the various application programs for the various users.

2. What is an Interrupt? CO1 Apr/May 2019
 Interrupts are provided primarily as a way to improve processor utilization.
 An interrupt is a mechanism by which other modules (I/O, memory) may interrupt the normal sequencing of the processor.
 Classes of interrupts:
 Program
 Timer
 I/O
 Hardware failure
3. What are the advantages of peer-to-peer systems over client-server systems? CO1 May/June 2016
The main advantage of a peer-to-peer network is that it is easier to set up. In peer-to-peer networks all nodes act as both server and client, so there is no need for a dedicated server.
A peer-to-peer network is also less expensive, and because it is easier to set up and use, less time is spent on its configuration and implementation.
4. What is the purpose of system programs/system calls? CO1 May/June 2016, Apr/May 2018
System programs can be thought of as bundles of useful system calls. They provide basic functionality to users so that users do not need to write their own programs to solve common problems.

5. How does an interrupt differ from a trap? CO1 Nov/Dec 2016, Apr/May 2018
An interrupt is a hardware-generated signal that changes the flow within the system. A trap is a software-generated interrupt. An interrupt can be used to signal the completion of I/O so that the CPU doesn't have to spend cycles polling the device. A trap can be used to catch arithmetic errors or to call system routines.
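A small illustrative sketch (POSIX C; not part of the original answer): a trap raised by an arithmetic error can be caught in user space through a signal handler.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/* Handler invoked when the CPU traps on an arithmetic error and the
   kernel delivers SIGFPE to the process. */
static void on_fpe(int sig) {
    printf("caught trap: signal %d (arithmetic error)\n", sig);
    exit(1);
}

int main(void) {
    signal(SIGFPE, on_fpe);
    volatile int zero = 0;        /* volatile keeps the divide at run time */
    printf("%d\n", 1 / zero);     /* integer divide by zero traps to the kernel */
    return 0;
}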
6. What are the disadvantages of multi-processor systems? CO1 Nov/Dec 2016
A complex operating system is required
Large main memory is required
Very expensive
7. Does timesharing differ from multiprogramming? If so, how? CO1 April/May 2015
The main difference between multiprogramming and time sharing is that multiprogramming is the effective utilization of CPU time by allowing several programs to use the CPU at the same time, whereas time sharing is the sharing of a computing facility by several users who want to use the same facility at the same time.
8. List the services of the operating system. CO1 Apr/May 2019
 Program development
 Program execution
 User interface
 I/O operations
 File system manipulation
 Communication
 Error detection
 Resource allocation
 Accounting
 Security
9. Why do APIs need to be used rather than system calls? CO1 April/May 2015
There are four basic reasons:
1) System calls differ from platform to platform. By using a stable API, it is easier to migrate your software to different platforms.
2) The operating system may provide newer versions of a system call with enhanced features. The API implementation will typically also be upgraded to provide this support, so if you call the API, you'll get it.
3) The API usually provides more useful functionality than the system call directly. If you make the system call directly, you'll typically have to replicate the pre-call and post-call code that's already implemented by the API. (For example, the 'fork' API includes lots of code beyond just making the 'fork' system call. So does 'select'.)
4) The API can support multiple versions of the operating system and detect which version it needs to use at run time. If you call the system directly, you either need to replicate this code or you can only support limited versions.

10. Compare and contrast DMA and cache memory. CO1 Nov/Dec
2015
DMA(Direct Memory Access): Direct memory access (DMA) is a
feature of computer systems that allows certain hardware
subsystems to access main memory (Random-access memory),
independent of the central processing unit (CPU).
Cache Memory: A cache is a smaller, faster memory, closer to a
processor core, which stores copies of the data from frequently
used main memory locations.
So, both DMA and cache are used for increasing the speed of
memory access.
11. Can traps be generated intentionally by a user program? If so, for what purpose? CO1 Nov/Dec 2018
Yes. A trap is a software-generated interrupt. An interrupt can be used to signal the completion of an I/O operation, obviating the need for device polling. A user program can intentionally generate a trap to call operating system routines or to catch arithmetic errors.
12. List and briefly define the four main elements of a computer? CO1
• Processor – Controls the operation of the computer & performs its
data processing functions
• Main memory – Stores data & programs. It is volatile.
• I/O modules – Move data between the computer & its external
environment such as disks, communication equipment & terminals.
• System Bus – Provides for communication among processors,
main memory & I/O modules.
13. What is Multiprogramming? CO1 Nov/Dec 2011
Multiprogramming increases CPU utilization by organizing jobs so that the CPU always has one to execute.
Advantages:
 It increases CPU utilization.
 It makes efficient use of the CPU by overlapping the demands for the CPU and I/O devices.
 Increased throughput and lower response time.



14. What characteristics distinguish the various elements of a CO1
memory hierarchy?
Characteristics are
 Cost Per bit
 Capacity
 Access Time
 Frequency of access to the memory by the processor.
15. Does timesharing differ from multiprogramming? If so, how? CO1 Apr/May 2015
Time sharing: here, the OS assigns time slots to each job, and each job is executed according to the allotted time slots.
Job1: 0 to 5, Job2: 5 to 10, Job3: 10 to 15
Multi-tasking: in this operating system, jobs are executed in parallel by the operating system. But we can achieve this multi-tasking only through multiple processors (or) a multicore CPU.
CPU1: Job1, CPU2: Job2, CPU3: Job3
16. Why do APIs need to be used rather than system calls? CO1 Apr/May 2018, Apr/May 2015
 System calls are much slower than APIs (library calls), since for each system call a context switch has to occur to load the OS (which then serves the system call).
 Most details of the OS interface are hidden from the programmer by the API and managed by the run-time support library (a set of functions built into libraries included with the compiler).
17. What is Cache Memory? CO1
 Cache memory is invisible to the OS.
 It interacts with other memory management hardware
 Cache contains a copy of a portion of main memory.

18. What are the types of Operating System? CO1


 Batch Operating System
 Time-Sharing Operating System
 Embedded Operating System
 Multiprogramming Operating System
 Network Operating System
 Distributed Operating System
 Multiprocessing Operating System
19. What are the types of network operating system? CO1
 Peer-to-peer network operating system
 Client-Server network operating system
20. List the functions of the Operating System CO1
The operating system performs the following functions in a device:
1. Instruction
2. Input/output Management
3. Memory Management
4. File Management
5. Processor Management
6. Job Priority
7. Special Control Program
8. Scheduling of resources and jobs
9. Security
10. Monitoring activities
11. Job accounting

21. Write down the operating system structures. CO1


1. Simple Structure
2. Monolithic Structure
3. Micro-Kernel Structure
4. Layered Structure
22. Define Operating System. List the objectives of operating systems. CO1 Apr/May 2019
An Operating System (OS) is an interface between a computer user and computer hardware. An operating system is software which performs all the basic tasks like file management, memory management, process management, handling input and output, and controlling peripheral devices such as disk drives and printers.
The objectives of the operating system are −
 To make the computer system convenient to use in an
efficient manner.
 To hide the details of the hardware resources from the users.
 To provide users a convenient interface to use the computer
system.
 To act as an intermediary between the hardware and its users,
making it easier for the users to access and use other
resources.
 To manage the resources of a computer system.
 To keep track of who is using which resource, granting
resource requests, and mediating conflicting requests from
different programs and users.
 To provide efficient and fair sharing of resources among
users and programs.
23. List the Operating System Design Goals. CO1
 User Goals
 System Goals
24. Define the two main categories of processor register? CO1
Two categories are



 User-visible registers: enable the machine or assembly language programmer to minimize main memory references by optimizing register use.
 Control & status registers: registers used by the processor to control the operation of the processor.
PART - B
S.No. Question and Answer CO Univ QP (Month/Year)
1. Explain different operating system structures with a neat sketch. (Nov/Dec 2015) (Apr/May 2017) (Apr/May 2018)
• Modern operating systems are complex in nature. Here we study the different structures of operating systems.
Simple Structure
• The Microsoft Disk Operating System (MS-DOS) is an example of a simple operating system structure. Most commercial systems do not have a well-defined structure. Initially DOS was small and simple in size; as new versions were introduced, its size kept increasing.
• There is no CPU Execution Mode (user and kernel), and so errors in
applications can cause the whole system to crash.
• MS-DOS has a layered structure. It consists of the following layers:
1. Application program layer
2. Resident system program layer
3. Device driver layer
4. ROM BIOS device driver layer
• In DOS, application programs interact directly with the BIOS device drivers. If a user makes any changes in a BIOS device driver, it creates problems and affects the whole system. Memory size is also limited, so after use memory must be made free for other users.
• Another example of this type is the UNIX operating system. Initially it provided limited hardware functionality. UNIX is divided into two parts: the kernel and the system programs.
• The kernel provides system calls for CPU scheduling, I/O management, file management and memory management. System calls use the application program interface in UNIX. The API defines the user interface by using a set of system programs. The kernel supports an API and user interface.
• To support more advanced versions of hardware, new UNIX operating systems were developed, consisting of different modules for controlling the hardware. Modules communicate with each other for proper operation.



Designers are free to design operating systems which support all new
versions of hardware and user requirements.
Layered Approach
• The second type of operating system design is the layered approach. The operating system is divided into a number of layers, and each layer boundary is properly defined. The bottom layer is called layer 0 and the topmost layer is called layer N. Layer N provides the user interface.
• The function of each layer is also fixed. Fig. 1.9.1 shows the operating system layers. Each layer consists of data structures and a set of routines, and provides services to the layers above and below it.

• Modularity is the advantage of the layered system. The number of layers is selected properly, and each layer uses the services and functions of the layer below it. This type of design simplifies debugging and system verification.
• The first layer contains only the basic hardware functions, so only the first layer needs to be debugged to start checking the whole system. If no error is found, the system works properly. If an error is encountered while debugging the second layer, the error is related to the second layer only. In this way the operating system is designed and implemented, which simplifies the design task of the operating system.
• Care should be taken while designing the layers: which functions are added to which layer must be decided properly, and layer boundaries must be defined properly. Some functions belong only in the lower layers; device drivers, memory management and input/output operation functions must be included in the lower layers.
• Secondary storage is required for all operations. When the CPU switches processes as per the scheduling method, the currently executing process is stored on secondary storage, and disk space is required for this purpose. So CPU scheduling is included in the layer above the secondary storage layer.
• Each layer uses system calls for performing its operations. Layering adds overhead to system calls.

Monolithic Structure
• The traditional UNIX operating system uses a monolithic kernel architecture. The entire operating system runs as a single program in kernel mode. The program contains the operating system's core functions and the device drivers.
• Most of the operations performed by the kernel are invoked via system calls. The required system calls are made within programs and a checked copy of the request is passed through to the kernel. Fig. 1.9.2 shows a monolithic kernel.
• The Linux operating system and FreeBSD use a modern monolithic kernel architecture that loads modules at run time. This gives easy access to kernel functions as required and minimizes the code running in kernel space.
• Monolithic kernels are used in Windows 95, Windows 98, Linux, FreeBSD, etc.
Advantages:
1. Simple to design and implement.
2. Simplicity provides speed on simple hardware.
3. It can be expanded using a module system.
4. Time tested and the design is well known.
Disadvantages:
1. Runtime loading and unloading is not possible without a module system.
2. If the code base size increases, maintenance is difficult.
3. Fault tolerance is low.
Microkernel
• A microkernel provides minimal services such as defining memory address spaces, inter-process communication (IPC) and process management. It is a small operating core. Hardware resource management is implemented wherever the relevant process is executing.
• The function of the microkernel is to provide a communication facility between client programs and the various services which are running in user space. Message passing is the method of communication between two processes.
• The microkernel runs in kernel mode and the rest runs in normal user processes. A microkernel also provides more security and reliability, since most services run as user rather than kernel processes.
• By running the device drivers and file system as separate user processes, an error in one can crash only a single component. Fig. 1.9.3 shows a microkernel.
• The Mach operating system uses a microkernel architecture. The microkernel in the Windows NT operating system provides portability and modularity. The kernel is surrounded by a number of compact subsystems so that the task of implementing Windows NT on a variety of platforms is easy.
• The microkernel architecture assigns only a few essential functions to the kernel, including address spaces, IPC and basic scheduling. QNX is a real-time operating system that is also based upon the microkernel design.
• The main disadvantage is poor performance due to increased system overhead from message passing.
Advantages of microkernel:
1. A microkernel allows the addition of new services.
2. The microkernel architecture provides a uniform interface for requests made by a process.
3. The microkernel architecture supports object-oriented operating systems.
4. The modular design helps to enhance reliability.
5. The microkernel lends itself to distributed system support.
6. The microkernel architecture supports flexibility: the user can add or remove services according to requirements.
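The microkernel description above notes that message passing is the method of communication between two processes. A minimal sketch of the idea in POSIX C, using a pipe as a stand-in for a real microkernel IPC primitive (illustrative only):

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fds[2];
    pipe(fds);                         /* kernel-mediated message channel */
    if (fork() == 0) {                 /* "client" process sends a message */
        const char msg[] = "request: open file";
        write(fds[1], msg, sizeof msg);
        _exit(0);
    }
    char buf[64];                      /* "server" process receives it */
    read(fds[0], buf, sizeof buf);
    printf("received: %s\n", buf);
    wait(NULL);
    return 0;
}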
Virtual Machine
• In a pure virtual machine architecture the operating system gives each
process the illusion that it is the only process on the machine. The user
writes an application as if only its code were running on the system.
• Each user interacts with the computer by typing commands to the virtual
machine on a virtual system console and receiving results back from the
machine as soon as they are computed.
• Each user directs the virtual machine to perform different commands. These commands are then executed on the physical machine in a multiprogramming environment.
• Virtualization is an abstraction layer that decouples the physical hardware from the operating system to deliver greater IT resource utilization and flexibility.
• It allows multiple virtual machines with heterogeneous operating systems to run in isolation, side by side on the same physical machine. Fig. 1.9.4 shows a virtual machine.
• Each virtual machine has its own set of virtual hardware (e.g., RAM, CPU, NIC, etc.) upon which an operating system and applications are loaded. The operating system creates the illusion of multiple processes, each executing on its own processor with its own (virtual) memory.
• The main components of a virtual machine system are the control program, the conversational monitor system, the remote spooling communications subsystem and the interactive problem control system.
• The control program creates the environments in which virtual machines can execute. It also manages the real machine underlying the virtual machine environment.
• The Java virtual machine is one of the most widely used virtual machines.
• Virtual machines tend to be less efficient than real machines because they access the hardware indirectly.
Benefits
1. There is no overlap among memory, as each virtual machine has its own memory space.
2. Virtual machines are completely isolated from the host machine and other virtual machines.
3. Data does not leak across virtual machines.
Virtualization
• Virtualization means running multiple machines on a single piece of hardware. The "real" hardware is invisible to the operating system; the OS only sees an abstracted picture. Only the Virtual Machine Monitor (VMM) talks to the hardware.
• It is "a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications, or end users interact with those resources. This includes making a single physical resource appear to function as multiple logical resources; or it can include making multiple physical resources appear as a single logical resource."
• It is divided into two main categories:
1. Platform virtualization involves the simulation of virtual machines.
2. Resource virtualization involves the simulation of combined, fragmented, or simplified resources.
Comparison between Monolithic Kernel and Microkernel



2. Explain the various types of system calls with examples. (May/June 2015) (Nov/Dec 2015) (Apr/May 2017) (Nov/Dec 2018)
A system call is a way for a user program to interface with the operating system. The program requests several services, and the OS responds by invoking a series of system calls to satisfy the request. A system call can be written in assembly language or a high-level language like C or Pascal. System calls are predefined functions that the operating system may directly invoke if a high-level language is used.
A system call is a method for a computer program to request a service from
the kernel of the operating system on which it is running. A system call is a
method of interacting with the operating system via programs. A system call
is a request from computer software to an operating system's kernel.
The Application Program Interface (API) connects the operating system's
functions to user programs. It acts as a link between the operating system
and a process, allowing user-level programs to request operating system
services. The kernel system can only be accessed using system calls.
System calls are required for any programs that use resources.
When a computer software needs to access the operating system's kernel, it
makes a system call. The system call uses an API to expose the operating
system's services to user programs. It is the only method to access the
kernel system. All programs or processes that require resources for
execution must use system calls, as they serve as an interface between the
operating system and user programs.
Below are some examples of how a system call varies from a user function.
1. A system call function may create and use kernel processes to
execute the asynchronous processing.
2. A system call has greater authority than a standard subroutine. A
system call with kernel-mode privilege executes in the kernel
protection domain.
3. System calls are not permitted to use shared libraries or any symbols
that are not present in the kernel protection domain.
4. The code and data for system calls are stored in global kernel
memory.

There are various situations where system calls are required in the operating system:
1. When a file system wants to create or delete a file.
2. Network connections require system calls for sending and receiving data packets.
3. If you want to read or write a file, you need system calls.
4. If you want to access hardware devices, such as a printer or scanner, you need a system call.
5. System calls are used to create and manage new processes.
Applications run in an area of memory known as user space. A system call connects to the operating system's kernel, which executes in kernel space. When an application makes a system call, it must first obtain permission from the kernel. It achieves this using an interrupt request, which pauses the current process and transfers control to the kernel.
If the request is permitted, the kernel performs the requested action, like creating or deleting a file. The application receives the kernel's output as its input and resumes the procedure after the input is received. When the operation is finished, the kernel returns the results to the application and then moves data from kernel space to user space in memory.
A simple system call may take a few nanoseconds to provide the result, like retrieving the system date and time. A more complicated system call, such as connecting to a network device, may take a few seconds. Most operating systems launch a distinct kernel thread for each system call to avoid bottlenecks. Modern operating systems are multi-threaded, which means they can handle several system calls at the same time.
Types of System Calls
There are commonly five types of system calls. These are as follows:

1. Process Control
2. File Management
3. Device Management
4. Information Maintenance
5. Communication
Now, you will learn about all the different types of system calls one-by-one.
Process Control
Process control system calls are used to direct processes. Examples include create, load, execute, abort, end and terminate process.
File Management
File management system calls are used to handle files. Examples include create file, delete file, open, close, read and write.
Device Management
Device management system calls are used to deal with devices. Examples include read, write, get device attributes and release device.
Information Maintenance
Information maintenance system calls are used to maintain information. Examples include get system data, set system data, get time or date and set time or date.
Communication
Communication system calls are used for communication. Examples include create and delete communication connections, and send and receive messages.
Examples of Windows and Unix system calls
There are various examples of Windows and Unix system calls, as listed below:

Process                  Windows                          Unix
Process Control          CreateProcess()                  fork()
                         ExitProcess()                    exit()
                         WaitForSingleObject()            wait()
File Manipulation        CreateFile()                     open()
                         ReadFile()                       read()
                         WriteFile()                      write()
                         CloseHandle()                    close()
Device Management        SetConsoleMode()                 ioctl()
                         ReadConsole()                    read()
                         WriteConsole()                   write()
Information Maintenance  GetCurrentProcessID()            getpid()
                         SetTimer()                       alarm()
                         Sleep()                          sleep()
Communication            CreatePipe()                     pipe()
                         CreateFileMapping()              shmget()
                         MapViewOfFile()                  mmap()
Protection               SetFileSecurity()                chmod()
                         InitializeSecurityDescriptor()   umask()
                         SetSecurityDescriptorGroup()     chown()
Here, you will learn about some methods briefly:
open()
The open() system call allows a process to access a file on a file system. It allocates resources to the file and provides a handle that the process may refer to. A file may be opened by many processes at once or restricted to a single process; it all depends on the file system and its structure.
read()
It is used to obtain data from a file on the file system. It accepts three
arguments in general:
o A file descriptor.
o A buffer to store read data.
o The number of bytes to read from the file.
The file descriptor of the file to be read could be used to identify it and open
it using open() before reading.
wait()
In some systems, a process may have to wait for another process to
complete its execution before proceeding. When a parent process makes a
child process, the parent process execution is suspended until the child
process is finished. The wait() system call is used to suspend the parent
process. Once the child process has completed its execution, control is
returned to the parent process.
write()
It is used to write data from a user buffer to a device like a file. This system
call is one way for a program to generate data. It takes three arguments in
general:
o A file descriptor.
o A pointer to the buffer in which data is saved.
o The number of bytes to be written from the buffer.
fork()
Processes generate clones of themselves using the fork() system call. It is
one of the most common ways to create processes in operating systems.
When a parent process spawns a child process, execution of the parent
process is interrupted until the child process completes. Once the child
process has completed its execution, control is returned to the parent
process.
close()
It is used to end file system access. When this system call is invoked, it
signifies that the program no longer requires the file, and the buffers are
flushed, the file information is altered, and the file resources are de-
allocated as a result.
exec()
The exec() system call is invoked when an executable file replaces an earlier executable file in an already executing process. As a new process is not built, the old process identification stays, but the new process replaces its data, stack, heap, etc.
exit()
The exit() system call is used to end program execution. This call indicates that the thread execution is complete, which is especially useful in multi-threaded environments. The operating system reclaims the resources used by the process following the use of the exit() system call.
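A short sketch tying several of the calls above together (POSIX C, assuming a Unix-like system; the file name demo.txt is only for illustration):

#include <fcntl.h>      /* open()                           */
#include <stdlib.h>     /* exit()                           */
#include <sys/wait.h>   /* wait()                           */
#include <unistd.h>     /* read(), write(), close(), fork() */

int main(void) {
    pid_t pid = fork();                      /* process control: clone this process */
    if (pid == 0) {
        /* Child: file management calls. */
        int fd = open("demo.txt", O_CREAT | O_WRONLY, 0644);
        if (fd < 0)
            exit(1);                         /* process control: terminate with error */
        write(fd, "hello\n", 6);             /* write bytes from a user buffer */
        close(fd);                           /* flush buffers, release the file */
        exit(0);
    }
    wait(NULL);                              /* parent suspends until the child exits */
    return 0;
}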
3. What are the basic functions of the OS and DMA? (Nov/Dec 2015) (Apr/May 2017)
For a device that does large transfers, such as a disk drive, it seems wasteful to use an expensive general-purpose processor to watch status bits and to feed data into a controller register one byte at a time, a process termed programmed I/O (PIO).
with PIO by offloading some of this work to a special-purpose processor
called a direct-memory-access (DMA) controller. To initiate a DMA
transfer, the host writes a DMA command block into memory. This block
contains a pointer to the source of a transfer, a pointer to the destination of
the transfer, and a count of the number of bytes to be transferred. The CPU
writes the address of this command block to the DMA controller, then goes
on with other work. The DMA controller proceeds to operate the memory
bus directly, placing addresses on the bus to perform transfers without the
help of the main CPU. A simple DMA controller is a standard component
in all modern computers, from smartphones to mainframes. Handshaking
between the DMA controller and the device controller is performed via a
pair of wires called DMA-request and DMA-acknowledge.
The device controller places a signal on the DMA-request wire when a
word
of data is available for transfer. This signal causes the DMA controller to
seize the memory bus, place the desired address on the memory-address
wires, and place a signal on the DMA-acknowledge wire. When the device
controller receives the DMA-acknowledge signal, it transfers the word of
data to memory and removes the DMA-request signal.

When the entire transfer is finished, the DMA controller interrupts the
CPU.
This process is depicted in Figure 13.5. When the DMA controller seizes
the
memory bus, the CPU is momentarily prevented from accessing main
memory, although it can still access data items in its primary and
secondary caches. Although this cycle stealing can slow down the CPU
computation, offloading the data-transfer work to a DMA controller
generally improves the total system performance. Some computer
architectures use physical memory addresses for DMA, but others
perform direct virtual memory access (DVMA), using virtual addresses that
undergo translation to physical addresses. DVMA can perform a transfer
between two memory-mapped devices without the intervention of the CPU
or the use of main memory.
On protected-mode kernels, the operating system generally prevents

processes from issuing device commands directly. This discipline protects
data from access-control violations and also protects the system from
erroneous use of device controllers that could cause a system crash.
Instead, the operating system exports functions that a sufficiently
privileged process can use to access low-level operations on the underlying
hardware. On kernels without memory protection, processes can access
device controllers directly. This direct access can be used to achieve high
performance, since it can avoid kernel communication, context switches,
and layers of kernel software. Unfortunately, it interferes with system
security and stability. The trend in general-purpose operating systems is to
protect memory and devices so that the system can try to guard against
erroneous or malicious applications.
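A minimal sketch of the command-block idea described above, in C. The register address and field layout are assumptions for illustration only; real DMA controllers define their own layouts.

#include <stdint.h>

/* Hypothetical DMA command block, following the description above:
   a source pointer, a destination pointer, and a byte count. */
struct dma_command_block {
    uint64_t src;    /* pointer to the source of the transfer      */
    uint64_t dst;    /* pointer to the destination of the transfer */
    uint32_t count;  /* number of bytes to be transferred          */
};

/* Hypothetical memory-mapped DMA controller register. */
#define DMA_CMD_REG ((volatile uint64_t *)0xFED00000u)

void start_dma_transfer(struct dma_command_block *blk) {
    /* The host writes the address of the command block to the DMA
       controller and then goes on with other work; the controller
       operates the memory bus itself and interrupts the CPU when
       the entire transfer is finished. */
    *DMA_CMD_REG = (uint64_t)(uintptr_t)blk;
}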

4. Explain the concept of multiprocessor and multicore organization. (Apr/May 2017)
Multiprocessor System
• A multiprocessor system means more than one processor in close communication. All the processors share a common bus, clock, memory and peripheral devices.
• A multiprocessor system is also called a parallel system or tightly coupled system. Multiprocessor operating systems are just regular operating systems: they handle system calls, do memory management, provide a file system, and manage I/O devices.
• Multiprocessor systems are also known as parallel systems or multicore systems.
• Features of multiprocessor systems:
1. The processor should support efficient context switching.
2. It supports a large physical address space and a larger virtual address space.
3. If one processor fails, then other processors should retrieve the
interrupted process state so that execution of the process can continue.
4. The inter-process communication mechanism should be provided and implemented in hardware, as this makes it efficient and easy.
• Multiprocessors can be used in different ways:
1. Uniprocessors (single-instruction, single-data or SISD);
2. Within a single system to execute multiple, independent sequences of instructions in multiple contexts (multiple-instruction, multiple-data or MIMD);
3. A single sequence of instructions in multiple contexts (single-instruction, multiple-data or SIMD, often used in vector processing);
4. Multiple sequences of instructions in a single context (multiple-instruction, single-data or MISD, used for redundancy in fail-safe systems and sometimes applied to describe pipelined processors or hyperthreading).
Advantages and Disadvantages of Multiprocessor Systems
Advantages of multiprocessor systems:
1. Throughput: throughput increases as the number of processors increases.
2. Cost: a multiprocessor system is cheaper than multiple single-processor systems.
3. Reliability: if one processor fails, it has little effect on overall system operation.
4. Response time: response time is lower because of the increased number of processors.
Disadvantages of multiprocessor systems:
1. Multiprocessor systems are more complex in both hardware and software.
2. Additional CPU cycles are required to manage the cooperation, so per-CPU efficiency goes down.
• Multiprocessor systems are of two types: Symmetric multiprocessing
and asymmetric multiprocessing.
Symmetric Multiprocessing
• The simplest multiprocessors are based on a single bus. In symmetric multiprocessing, a number of homogeneous processors run independently without affecting other programs.
• Systems that treat all CPUs equally are called symmetric multiprocessing (SMP) systems.
• Each processor uses different data and programs but shares some common resources like memory, I/O devices etc. Fig. 1.4.1 shows a symmetric multiprocessing system.



• In SMP, all processors are at the same level; they work as peers. There is no master-slave relationship between the processors. Each processor shares a single copy of the operating system.
• Symmetric multiprocessing is easier to implement in operating systems. Examples of symmetric multiprocessing operating systems are Windows NT 4.0, Novell NetWare 2.1, etc.
Asymmetric Multiprocessing
• If all CPUs are not equal, system resources may be divided in a number of ways, including asymmetric multiprocessing (ASMP), non-uniform memory access (NUMA) multiprocessing, and clustered multiprocessing.
• All CPUs are not equal: there is one master processor and the remaining are slave processors. Each processor has its own memory address space.
• A master processor controls the whole operation. It gives instructions to the slave processors, or the slave processors use some predefined set of tasks.
• Asymmetric multiprocessing has one master CPU and the remainder are slaves. The master distributes tasks among the slaves, and I/O is usually done by the master only.
• Each processor has its own task, assigned by the master processor. Asymmetric multiprocessing is difficult to implement.
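A small sketch of SMP-style peer execution using POSIX threads (illustrative; the thread count is arbitrary). On an SMP system the OS may schedule these equal peer threads on any of the processors. Compile with the -pthread flag.

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) {
    long id = (long)arg;
    printf("worker %ld running\n", id);  /* may run on any CPU */
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);        /* wait for all peers */
    return 0;
}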
5. What are the advantages and disadvantages of using the same system call interface for both files and devices? (Nov/Dec 2016)
Advantages:
1. Unified Interface: Having a single system call interface simplifies
programming, as developers only need to learn and work with one
set of functions for accessing both files and devices. This reduces
complexity and makes code more readable and maintainable.
2. Consistency: With a unified interface, the behavior and usage
patterns are consistent across different types of resources (files and
devices), which can lead to fewer errors and more predictable
outcomes.
3. Flexibility: A unified interface allows for more flexible use of
resources. For example, it's easier to redirect output from a file to a
device or vice versa without needing to change the code
significantly.
4. Portability: Programs written using a unified interface are more portable across different operating systems, as long as those systems support the same system call interface.
Disadvantages:
1. Loss of Specialization: Treating files and devices the same may
lead to the loss of optimizations and special features that could be
provided for each separately. For example, certain devices may
have specific operations or characteristics that are not fully utilized
with a generic file interface.
2. Performance Overhead: A unified interface may introduce some
performance overhead, as it needs to handle both file and device
operations in a generic way, potentially leading to less efficient
code execution compared to optimized, specialized interfaces for
each type of resource.
3. Complexity in Implementation: Implementing a unified interface
that adequately handles both files and devices can be more complex
than implementing separate interfaces for each. This complexity
may introduce bugs or make maintenance more challenging.
4. Security Concerns: Treating files and devices the same way might
introduce security vulnerabilities, as certain operations that are safe
for files could be dangerous if applied to devices, or vice versa. A
unified interface might not provide sufficient safeguards to prevent
such issues.
While a unified system call interface offers simplicity, consistency, and
flexibility, it may come at the cost of specialization, performance,
complexity, and potentially security. The decision to use a unified
interface should consider the specific requirements and trade-offs of the
system being developed.
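A brief POSIX sketch of the unified interface (the file name out.txt is illustrative): the same open()/write()/close() calls operate on a regular file and on a device node, since Unix exposes both through the file interface.

#include <fcntl.h>
#include <unistd.h>

int main(void) {
    const char msg[] = "same calls, different resources\n";

    int file_fd = open("out.txt", O_CREAT | O_WRONLY, 0644); /* regular file */
    int dev_fd  = open("/dev/null", O_WRONLY);               /* device node  */

    if (file_fd >= 0) { write(file_fd, msg, sizeof msg - 1); close(file_fd); }
    if (dev_fd  >= 0) { write(dev_fd,  msg, sizeof msg - 1); close(dev_fd);  }
    return 0;
}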

6. Describe the difference between symmetric and asymmetric multiprocessing. Discuss the advantages and disadvantages of multiprocessor systems. (May/June 2016) (Nov/Dec 2016)
Difference between Asymmetric and Symmetric Multiprocessing:

S.No. | Asymmetric Multiprocessing | Symmetric Multiprocessing
1. | The processors are not treated equally. | All the processors are treated equally.
2. | Tasks of the operating system are done by the master processor. | Tasks of the operating system are done by the individual processors.
3. | No communication between processors, as they are controlled by the master processor. | All processors communicate with one another through shared memory.
4. | The process scheduling approach used is master-slave. | Processes are taken from a common ready queue.
5. | Asymmetric multiprocessing systems are cheaper. | Symmetric multiprocessing systems are costlier.
6. | Asymmetric multiprocessing systems are easier to design. | Symmetric multiprocessing systems are complex to design.
7. | Processors can exhibit different architectures. | The architecture of each processor is the same.
8. | It is simple, as only the master processor has access to the data structures. | It is complex, as synchronization of the processors is required in order to maintain load balance.
9. | If the master processor malfunctions, a slave processor is promoted to master and continues the execution; when a slave processor fails, other processors take over its task. | In case of processor failure, there is a reduction in the system's computing capacity.
10. | It is suitable for homogeneous or heterogeneous cores. | It is suitable for homogeneous cores.
Advantages and Disadvantages of Multiprocessor Systems



Advantages of multiprocessor systems:
1. Throughput: throughput increases as the number of processors increases.
2. Cost: a multiprocessor system is cheaper than multiple single-processor systems.
3. Reliability: if one processor fails, it has little effect on overall system operation.
4. Response time: response time is lower because of the increased number of processors.
Disadvantages of multiprocessor systems:
1. Multiprocessor systems are more complex in both hardware and software.
2. Additional CPU cycles are required to manage the cooperation, so per-CPU efficiency goes down.
7. Discuss in detail about distributed systems. (May/June 2016)
Distributed System

• Definition: A distributed system is a collection of autonomous hosts that are connected through a computer network.
• A distributed system is a collection of independent computers that appears
to its users as a single coherent system. Each host executes components and
operates a distribution middleware.
• Middleware enables the components to coordinate their activities. Users
perceive the system as a single, integrated computing facility.
• A distributed computer system consists of multiple software components that are on multiple computers but run as a single system. The computers in a distributed system can be physically close together and connected by a local network, or they can be geographically distant and connected by a wide area network.
• A distributed system can consist of any number of possible configurations, such as mainframes, personal computers, workstations, minicomputers and so on.
• Distributed operating systems depend on networking for their operation. A distributed OS runs on and controls the resources of multiple machines. It provides resource sharing across the boundaries of a single computer system, and it looks to users like a single-machine OS.
• A distributed OS owns the whole network and makes it look like a virtual uniprocessor, or maybe a virtual multiprocessor.



• Definition: A distributed operating system is one that looks to its users like an ordinary operating system but runs on multiple, independent CPUs.
• Distributed systems depend on networking for their functionality. Fig. 1.6.1 shows a distributed system.

• Examples of distributed operating systems are Amoeba, Chorus, Mach and V-System.
• A Local Area Network (LAN) is a network that is confined to a relatively small area. It is generally limited to a geographic area such as a college, lab or building.
• A WAN provides long-distance transmission of data and voice. Computers connected to a wide-area network are often connected through public networks, such as the telephone system. They can also be connected through leased lines or satellites.
• A MAN typically covers an area of between 5 and 50 km in diameter. Many MANs cover an area the size of a city, although in some cases a MAN may be as small as a group of buildings.
• A MAN often acts as a high-speed network to allow sharing of regional resources. A MAN provides transfer rates from 34 to 150 Mbps.

Advantages and Disadvantages of DS


a) Advantages of a distributed OS:
1. Resource sharing: software resources such as libraries and databases, and hardware resources such as hard disks, printers and CD-ROM drives, can be shared in a very effective way among all the computers and users.
2. Higher reliability: Reliability refers to the degree of tolerance against errors and component failures. Availability is one of the important aspects of reliability; it refers to the fraction of time for which a system is available for use.
3. Better price-performance ratio: reduction in the price of microprocessors and increasing computing power give a good price-performance ratio.
4. Shorter response times and higher throughput.
5. Incremental growth: the power and functionality of a system can be extended by simply adding additional resources to the system.
b) Disadvantages :
1. There are no current commercially successful examples.
2. Protocol overhead can dominate computation costs.
3. Hard to build well.
4. Probably impossible to build at the scale of the internet.
8. Demonstrate the three methods for passing parameters to the OS with examples. (May/June 2016)

1) Pass parameters using registers (directly).


section .text
global _start
_start:
mov eax, 4 ; System call number for write
mov ebx, 1 ; File descriptor for standard output
mov ecx, msg ; Pointer to the message to be printed
mov edx, msglen ; Length of the message
int 0x80 ; Invoke the kernel
mov eax, 1 ; System call number for exit
xor ebx, ebx ; Return 0 status
int 0x80 ; Invoke the kernel
section .data
msg db 'Hello, world!', 0xA ; The message to be printed
msglen equ $ - msg ; Length of the message
2) Store the parameters in a table in memory and pass the table address in a register to the OS. (Illustrative sketch: Linux system calls do not actually accept a parameter table, so the stub below unpacks the table into the arguments the kernel expects.)
#include <unistd.h>
#include <sys/syscall.h>

/* Parameter table built by the program. */
struct syscall_params {
    long syscall_number;   // which service is requested
    int fd;                // file descriptor
    const char *buf;       // pointer to the message to be printed
    size_t count;          // length of the message
};

int main() {
    struct syscall_params params;
    params.syscall_number = SYS_write;  // system call number for write
    params.fd = 1;                      // file descriptor for standard output
    params.buf = "Hello, world!\n";
    params.count = 14;

    // The table's address would be handed to the OS; here the stub
    // unpacks the table and issues the call.
    syscall(params.syscall_number, params.fd, params.buf, params.count);
    return 0;
}
3) Push (store) the parameters onto a stack (by the program) and have them "popped" off by the operating system. (Illustrative 32-bit x86 Linux sketch: the parameters are pushed onto the stack and popped into the registers the kernel expects before the trap.)
#include <unistd.h>
int main() {
    char *msg = "Hello, world!\n";
    long len = 14;
    asm volatile(
        "pushl %1\n"          // push length
        "pushl %0\n"          // push buffer pointer
        "pushl $1\n"          // push file descriptor (standard output)
        "movl $4, %%eax\n"    // system call number for write
        "popl %%ebx\n"        // pop file descriptor
        "popl %%ecx\n"        // pop buffer pointer
        "popl %%edx\n"        // pop length
        "int $0x80\n"         // invoke the kernel
        :
        : "r"(msg), "r"(len)
        : "eax", "ebx", "ecx", "edx", "memory"
    );
    return 0;
}
9. Explain how protection is provided for the hardware resources by the operating system. (Nov/Dec 2016)
If a computer system has multiple users and allows the concurrent
execution
of multiple processes, then access to data must be regulated. For that
purpose, mechanisms ensure that files, memory segments, CPU, and other
resources can be operated on by only those processes that have gained
proper authorization from the operating system. For example, memory-
addressing hardware ensures that a process can execute only within its
own address space. The timer ensures that no process can gain control of
the CPU without eventually relinquishing control. Device-control
registers are not accessible to users, so the integrity of the various
peripheral devices is protected.
Protection, then, is any mechanism for controlling the access of processes
or users to the resources defined by a computer system. This mechanism
must provide means to specify the controls to be imposed and to enforce
the controls.
Protection can improve reliability by detecting latent errors at the
interfaces
between component subsystems. Early detection of interface errors can
often prevent contamination of a healthy subsystem by another subsystem
that is malfunctioning. Furthermore, an unprotected resource cannot
defend against use (or misuse) by an unauthorized or incompetent user. A
protection-oriented system provides a means to distinguish between
authorized and unauthorized usage.
A system can have adequate protection but still be prone to failure and
allow inappropriate access. Consider a user whose authentication
information (her means of identifying herself to the system) is stolen. Her
data could be copied or deleted, even though file and memory protection
are working. It is the job of security to defend a system from external and
internal attacks. Such attacks spread across a huge range and include
viruses and worms, denial-of-service attacks (which use all of a system’s
resources and so keep legitimate users out of the system), identity theft,
and theft of service (unauthorized use of a system). Prevention of some of
these attacks is considered an operating-system function on some
systems, while other systems leave it to policy or additional software.
Due to the alarming rise in security incidents, operating-system security
features represent a fast-growing area of research and implementation.
Protection and security require the system to be able to distinguish among
all its users. Most operating systems maintain a list of user names and
associated user identifiers (user IDs). In Windows parlance, this is a
security ID (SID). These numerical IDs are unique, one per user. When a
user logs in to the system, the authentication stage determines the
appropriate user ID for the user. That user ID is associated with all of the
user’s processes and threads.
When an ID needs to be readable by a user, it is translated back to the
user
name via the user name list.
In some circumstances, we wish to distinguish among sets of users rather
than individual users. For example, the owner of a file on a UNIX system
may be allowed to issue all operations on that file, whereas a selected set
of users may be allowed only to read the file. To accomplish this, we need
to define a group name and the set of users belonging to that group.
Group functionality can be implemented as a system-wide list of group
names and group identifiers.
A user can be in one or more groups, depending on operating-system
design decisions. The user’s group IDs are also included in every associated
process and thread. In the course of normal system use, the user ID and
group ID for a user are sufficient. However, a user sometimes needs to
escalate privileges to gain extra permissions for an activity. The user may
need access to a device that is restricted, for example. Operating systems
provide various methods to allow privilege escalation. On UNIX, for
instance, the setuid attribute on a program causes that program to run with
the user ID of the owner of the file, rather than the current user’s ID. The
process runs with this effective UID until it turns off the extra privileges
or terminates.
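The following short C sketch (not part of the original answer; it assumes a
POSIX system) shows how a process can inspect the real and effective user
IDs discussed above and drop setuid privileges:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(void) {
    uid_t real = getuid();        /* the user who started the process  */
    uid_t effective = geteuid();  /* the ID used for permission checks */

    printf("real UID = %d, effective UID = %d\n", (int)real, (int)effective);

    /* For a setuid program owned by another user the two IDs differ;
       seteuid(real) turns off the extra privileges, as described above. */
    if (real != effective && seteuid(real) == 0)
        printf("extra privileges dropped\n");
    return 0;
}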

10. List the various services provided by operating systems. (Nov/Dec 2016) (Apr/May 2018)

Operating System Services
• An operating system provides different types of services to different
users. It provides services to programs, such as loading data into memory,
allocating disk storage, and opening or reading files and directories.
• The services vary from one operating system to another; two different
operating systems may provide the same type of service under different
names. The OS makes the programmer's job easier by providing these
services.
The following is the list of services provided by operating systems:
1. Program execution 2. Input-output operation
3. Error detection 4. File and directory operation
5. Communication 6. Graphical user interface.
1. Program execution: Before a program is executed, it is loaded into
memory by the operating system. Once the program is loaded into memory,
it starts execution. The program finishes its execution with or without
error; the next operation is up to the user.
2. Input-output operation: A program is a combination of input and output
statements. While executing, the program requires I/O devices, and the OS
provides the I/O devices to the program.
3. Error detection: Errors may be related to memory, the CPU, I/O devices,
or the user program. Memory full, stack overflow, file not found, directory
does not exist, printer not ready, and attempts to access illegal memory are
examples of errors that must be detected.
4. File and directory operation: Users want to read and write files and
directories, search for a file or directory, rename or modify files, and
create files or directories. All these operations are performed by the user
with the help of the operating system.
5. Communication: Communication may be inter-process communication or
any other type of communication; data transfer is one type of
communication. Communication takes place between two processes on the
same machine or between two processes on different machines. Pipes,
shared memory, sockets and message passing are the different methods of
communication.
6. Graphical user interface: The user interacts with the operating system
through a user interface, which every operating system provides. The
command-line interface and the batch interface are two types of user
interface used in operating systems. In a command-line interface, the user
enters text commands to perform operations. A batch interface uses files:
a file contains the commands for various operations, and when the file is
executed, the command output is displayed on the screen.
• All the above services are provided for a single user. If there are
multiple users in the system, the operating system provides some
additional services: resource allocation, accounting, and protection of
data and security.
• Different types of resources are managed by the operating system.
Multiple users require different resources for their execution, and one
type of resource may be required by two different users, so resource
allocation is necessary.
• Accounting means keeping information about resources and users: which
resources are allocated to which user, and whether a particular user has
permission to use a given type of resource.

11. Discuss the DMA-driven data transfer technique. (May/June 2015)

Introduction to DMA:

DMA is a technique used in computer systems to enhance data transfer


between peripherals and memory without involving the CPU. In traditional
data transfer methods, the CPU is responsible for managing data
movement between devices, which can be inefficient and time-consuming.
DMA alleviates this burden by allowing peripherals to communicate
directly with the memory, reducing CPU involvement and improving
overall system performance.

Components of DMA:

DMA Controller:
DMA operations are supervised by a specialized hardware component
known as the DMA controller.
The DMA controller coordinates data transfers between peripherals and
memory, freeing up the CPU to focus on other tasks.
DMA Channels:
DMA controllers typically have multiple channels, each capable of
handling a specific data transfer task. Channels can be independently
programmed to manage different peripherals or memory regions.
Steps in DMA-Driven Data Transfer:

Initialization:
The CPU initializes the DMA controller by setting up parameters such as
source and destination addresses, transfer size, and transfer direction (read
or write).
Request and Acknowledge:
When a peripheral device needs to transfer data, it sends a request to the
DMA controller.
The DMA controller acknowledges the request and grants control to the
requesting device.
Addressing:
The DMA controller obtains control of the system bus and accesses the
memory or peripheral directly.
It uses the source and destination addresses specified during initialization
for data movement.
Data Transfer:
Data is transferred directly between the source and destination without
CPU intervention.
This process is often faster than traditional CPU-controlled transfers.
Interrupt or Polling:
After the transfer is complete, the DMA controller signals the CPU
through an interrupt or polling mechanism.
The CPU can then perform any necessary post-transfer tasks.
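A minimal driver-style sketch of these steps is shown below. The register
layout and the base address are hypothetical, not those of any real DMA
controller; a real driver would normally take a completion interrupt
rather than poll:

/* Hypothetical memory-mapped DMA controller registers; the layout and
   base address below are assumptions for illustration, not a real device. */
#include <stdint.h>

typedef struct {
    volatile uint32_t src_addr;  /* source address of the transfer     */
    volatile uint32_t dst_addr;  /* destination address                */
    volatile uint32_t count;     /* number of bytes to transfer        */
    volatile uint32_t control;   /* bit 0: start the transfer          */
    volatile uint32_t status;    /* bit 0: transfer complete           */
} dma_regs_t;

#define DMA_BASE  ((dma_regs_t *)0x40001000u)  /* assumed base address */
#define DMA_START (1u << 0)
#define DMA_DONE  (1u << 0)

/* Initialization step: program source, destination and size, then start
   the transfer and poll for completion. */
static void dma_copy(uint32_t src, uint32_t dst, uint32_t nbytes) {
    dma_regs_t *dma = DMA_BASE;
    dma->src_addr = src;
    dma->dst_addr = dst;
    dma->count    = nbytes;
    dma->control  = DMA_START;          /* data moves without the CPU */
    while ((dma->status & DMA_DONE) == 0)
        ;                               /* interrupt-or-polling step  */
}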
Advantages of DMA:
Improved Performance:
DMA reduces CPU overhead, allowing it to focus on other tasks, which
leads to improved overall system performance.
Efficiency:
Data transfers occur directly between peripherals and memory,
minimizing delays and enhancing efficiency.
Parallelism:
DMA channels can operate independently, enabling parallel data transfers
between different peripherals and memory regions.
Reduced Latency:
DMA-driven transfers often result in lower latency compared to CPU-
controlled transfers.
12. Discuss the evolution of virtual machines. Also explain how virtualization could be implemented in operating systems. (May/June 2015)

Evolution of Virtual Machines:

First Generation (1960s-1970s):


The concept of virtualization originated in the 1960s with the
development of mainframes.
IBM's CP-40 and CP-67 were among the first systems to introduce
virtualization features, allowing multiple instances of an operating system
to run on a single mainframe.
Second Generation (1980s-1990s):
The advent of microprocessors and personal computers led to a decline in
the use of virtualization.
Server virtualization technologies, such as VMware, emerged in the late
1990s, enabling multiple operating systems to run on a single physical
server.
Third Generation (2000s-2010s):
Hardware-assisted virtualization became more prevalent with the
introduction of technologies like Intel VT-x and AMD-V, making VMs
more efficient and easier to manage.
Hypervisors, both Type 1 (bare-metal) and Type 2 (hosted), gained
popularity. Examples include VMware ESXi (Type 1) and Oracle
VirtualBox (Type 2).
Fourth Generation (2010s-Present):
Cloud computing played a significant role in the proliferation of
virtualization.
Containerization technologies, like Docker, emerged as an alternative to
traditional VMs, offering lightweight and faster application deployment.
Implementation of Virtualization in Operating Systems:

Hypervisors:
Hypervisors, also known as Virtual Machine Monitors (VMM), are
essential for implementing virtualization.
Type 1 Hypervisors run directly on the hardware and manage VMs
independently of the host operating system. Examples include VMware
ESXi, Microsoft Hyper-V.
Type 2 Hypervisors run on a host operating system and create VMs
within it. Examples include VMware Workstation, Oracle VirtualBox.
Hardware-Assisted Virtualization:
Modern processors include hardware features that enhance virtualization
performance. Intel VT-x and AMD-V are examples of hardware-assisted
virtualization technologies.
These features allow VMs to execute code directly on the processor
without requiring software emulation, improving efficiency.
Memory Management:
Virtualization relies on efficient memory management to allocate and
control memory resources for each VM.
Technologies like Extended Page Tables (EPT) for Intel and Rapid
Virtualization Indexing (RVI) for AMD facilitate efficient memory
virtualization.
I/O Virtualization:
Virtual machines require access to I/O devices. I/O virtualization
techniques allow VMs to interact with physical devices while maintaining
isolation.
I/O MMU (Memory Management Unit) and technologies like Intel VT-d
and AMD-Vi enable direct device assignment to VMs.
Live Migration:
Live migration allows VMs to be moved between physical hosts without
disruption to services.
Technologies like VMware vMotion and Hyper-V Live Migration enable
seamless VM relocation, enhancing flexibility and resource utilization.
Containerization:
While not traditional virtualization, containerization technologies like
Docker and Kubernetes provide lightweight and portable environments
for applications, sharing the host OS kernel. Containers offer advantages
in terms of speed, resource efficiency, and scalability.
PART C

1. With a neat sketch, discuss the computer system overview. (Nov/Dec 2015)

• A computer system consists of hardware devices and software that are
combined to provide a tool for users to solve problems.

• Fig. 1.1.1 shows a modern computer system.

• A modern computer consists of one or more CPUs with main memory and
various I/O devices. A common bus is used for communication between these
devices. Each device has its own device controller.
• The CPU and device controllers use memory cycles for execution, but a
memory cycle is available to only one device at a time.
• The bootstrap program is loaded when the user starts the computer. It
initializes all the devices connected to the computer system and then loads
the required device drivers.
• After this, the operating system is loaded into the computer system. In
UNIX, 'init' is the first process executed by the OS.
• Interrupts may be generated by software or by hardware; they are used to
send a signal to the CPU. A software interrupt is sometimes called a system
call.
• When an interrupt is triggered, the CPU stops executing the current
instruction and control is transferred to a fixed location, where the
starting address of the interrupt service routine is stored and the routine
executes.
• After the interrupt is serviced, the interrupted computation resumes;
interrupts should not otherwise alter the control flow of the process
executing on the processor.
Storage structure
• The processor accesses data from main memory before executing any
instruction. Main memory is also called Random Access Memory (RAM).
• DRAM is used for main memory. Fig. 1.1.2 shows the hierarchy of storage
devices.
• At the top of the hierarchy we have storage in the CPU registers; for
the CPU, this is the fastest form of storage.
• Cache memory capacity is small, typically no more than a few megabytes.
• User programs and data are stored in main memory. Main memory is
volatile, so it cannot store data permanently.

• Storage systems are classified as temporary or permanent storage.
• Top-level storage devices have low capacity with fast CPU access, while
bottom-level storage devices have very large capacity with slow CPU access
speed.
I/O structure
• Every device uses a device controller to connect it to the computer's
address and data bus. Devices can be classified as block-oriented or
character-oriented, depending on the number of bytes transferred in an
individual operation.
• Storage devices are used to store data while the computer is off.
• All the I/O devices are connected to each other by a common bus; the
CPU and main memory are also connected to this bus.
• Various types of controllers are used in the computer system. A Small
Computer System Interface (SCSI) controller can handle up to seven
devices. Each device controller has its own buffer.
• The device controller manages the data transfer between the peripheral
device and its local buffer; the device controller itself is operated by a
device driver.
I/O operation steps
1. The device driver loads the registers within the device controller.
2. The device controller takes action according to the data loaded into the
registers.
3. Data is transferred from the device to its local buffer with the help of
the device controller.
4. After completion of the data transfer, the device controller sends an
interrupt signal to the device driver to report that the transfer is
complete.
5. Then control goes to the operating system.
• The device driver is the operating-system entity that controls CPU-I/O
parallelism. The software that communicates with the device controller is
called the device driver.
• A device can be started by the device driver, and the application program
can continue operation in parallel with the device operation.

Instruction Execution
• The program which is to be executed is a set of instructions stored in
memory. The Central Processing Unit (CPU) executes the instructions of the
program to complete a task.
• The major responsibility for instruction execution lies with the CPU, and
the instruction execution takes place in the CPU registers.
• A special register contains the address of the instruction. The CPU
"fetches" the instruction from memory at that address.
• The CPU "decodes" the instruction to figure out what to do. The CPU
"fetches" any data (operands) needed by the instruction from memory or
registers.

• Fig. 1.1.3 shows the instruction execution cycle.


• The CPU "executes" the operation specified by the instruction on this
data. The CPU "stores" any results into a register or memory.
• The registers used for instruction execution are as follows:
1. Memory Address Register (MAR): It specifies the address of the memory
location from which data or an instruction is to be accessed (for a read
operation) or to which data is to be stored (for a write operation).

2. Program Counter (PC): It keeps track of the instruction which is to be
executed next, after the execution of an on-going instruction.
3. Instruction Register (IR): Here the instructions are loaded before their
execution.
Interrupts
• Definition: An interrupt is an event external to the currently executing
process that causes a change in the normal flow of instruction execution;
it is usually generated by hardware devices external to the CPU. Interrupts
tell the CPU to stop its current activity and execute the appropriate part
of the operating system.
• Fig. 1.1.4 shows interrupts.

• The CPU uses a table and the interrupt vector to find the OS code to
execute in response to an interrupt. When an interrupt is signaled, the
processor executes a routine called an interrupt handler to deal with the
interrupt.
• The operating system may specify a set of instructions, called an
interrupt handler, to be executed in response to each type of interrupt.
• Interrupt Service Routine (ISR) is the software code that is executed when
the hardware requests interrupt. The design of the interrupt service routine
requires careful consideration of many factors. Although interrupt handlers
can create and use local variables, parameter passing between threads must
be implemented using shared global memory variables.
• A private global variable can be used if an interrupt thread wishes to pass
information to itself, e.g., from one interrupt instance to another. The
execution of the main program is called the foreground thread, and the
executions of the various interrupt service routines are called background
threads.
• Interrupts are of three types: Hardware, software and trap.

• Hardware Interrupts are generated by hardware devices to signal that they
need some attention from the OS. Software Interrupts are generated by
programs when they want to request a system call to be performed by the
operating system.
• Traps are generated by the CPU itself to indicate that some error or
condition occurred for which assistance from the operating system is
needed.
• Interrupt and trap numbers are defined by the hardware which is also
responsible for calling the procedure in the kernel space. An interrupt
handler is called in response to a signal from another device while a trap
handler is called in response to an instruction executed within the CPU.
• Synchronous interrupts occur when a process attempts to perform an
illegal action, such as dividing by zero or referencing a protected memory
location. Synchronous interrupt occurs due to software execution.

Cache Memory and Hierarchy


• Cache is a small, high-speed memory. It is built from SRAM.
• Caches are useful when two or more components need to exchange data
and the components perform transfers at differing speeds. Caches solve the
transfer problem by providing a buffer of intermediate speed between the
components.
• Caches are useful because they can increase the speed of the average
memory access, and they do so without taking up as much physical space as
the lower elements of the memory hierarchy.
• Data is normally kept in a storage system; when it is required, it is
copied into a faster storage system, i.e., cache memory, on a temporary
basis.
• When the user requires particular data, the system checks whether it is
in the cache. If the data is found, it is used directly from the cache; if
it is not found, the data is used from the source.
• Internal programmable registers, such as index registers, provide a high-
speed cache for main memory. The programmer implements the register-
allocation and register-replacement algorithms to decide which information
to keep in registers and which to keep in main memory.
• If the fast device finds the data it needs in the cache, it need not wait
for the slower device. The data in the cache must be kept consistent with
the data in the components.
• If a component's data value changes, and the datum is also in the cache,
the cache must also be updated. This is especially a problem on
multiprocessor systems where more than one process may be accessing a
datum.
• A component may be eliminated by an equal-sized cache, but only if the
cache and the component have equivalent state-saving capacity and the
cache is affordable, because faster storage tends to be more expensive.
• Unfortunately, caches also introduce an additional level of complexity
(coherency and consistency assurance). We also incur an economic and
space penalty when we add a cache.
• Making a cache as large as a disk would be ineffective because it would be
too costly, the immense size would slow it down, and a cache is generally
volatile memory, while we want data on disks to be persistent.
• A cache holds data temporarily to reduce the time required to service
I/O requests from the host. Fig. 1.1.5 shows the memory hierarchy.

• Cache in the CPU unit is referred to as Level 1 cache (L1 cache), and
cache stored in a chip next to the CPU is termed Level 2 cache (L2 cache);
it resides on the motherboard.
• The function of a cache is to act as a buffer between a relatively fast
device and a relatively slow one.
• The smallest unit of allocation in a cache is a page; a cache is arranged
into slots or pages. It uses two components, the data store and the tag
RAM. The actual data is stored in one part of the cache, called the data
store, and the values stored in the tag RAM determine whether a cache
lookup results in a hit or a miss.

• The size of the tag RAM determines what range of main memory can be
cached. Tag RAM is a small piece of SRAM that stores the addresses of the
data held in the cache.
• A cache address can be specified simply by index and offset. The tag is
kept to allow the cache to translate from a cache address to a unique CPU
address.
• A cache hit means that the CPU tried to access an address and a matching
cache block was available in the cache, so the cache did not need to access
RAM. In a cache miss, the CPU tries to access an address and there is no
matching cache block, so the cache is forced to access RAM.
• The cache includes tags to identify which block of main memory is in each
cache slot.
• Every cache block has associated with it at least the modify and valid
bits and a tag address. The valid bit says whether the cache block is used
or unused. The modify bit makes sense only if the valid bit is set; it says
whether the data in the cache block is different from RAM or the same as
RAM.
• The size of a page depends on the size of the cache and how the cache is
organized. A cache page is broken into smaller pieces, each called a cache
line. The size of a cache line is determined by both the processor and the
cache design.
• When the processor starts a read cycle, the cache checks to see whether
that address is a cache hit.
• Cache hit: If the cache contains the memory location, then the cache will
respond to the read cycle and terminate the bus cycle.
• Cache miss: A reference to an item that is not resident in the cache but
is resident in main memory. For cache misses, the fast memory is the
cache and the slow memory is main memory.
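The hit/miss decision described above can be sketched in C for a
hypothetical direct-mapped cache; the line count, line size and 32-bit
addresses below are assumptions for illustration only:

#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES  64   /* assumed number of cache lines (slots) */
#define LINE_BYTES 64   /* assumed cache line (block) size       */

typedef struct {
    bool     valid;     /* valid bit: is this line in use?       */
    bool     modified;  /* modify (dirty) bit                    */
    uint32_t tag;       /* tag RAM entry for this line           */
} cache_line_t;

static cache_line_t cache[NUM_LINES];

/* Returns true on a cache hit, false on a miss. */
bool cache_lookup(uint32_t addr) {
    uint32_t index = (addr / LINE_BYTES) % NUM_LINES; /* which slot  */
    uint32_t tag   = addr / (LINE_BYTES * NUM_LINES); /* which block */

    if (cache[index].valid && cache[index].tag == tag)
        return true;               /* hit: no RAM access needed     */

    /* miss: the block would be fetched from RAM and installed here */
    cache[index].valid    = true;
    cache[index].modified = false;
    cache[index].tag      = tag;
    return false;
}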

Direct Memory Access Structure


• Direct Memory Access (DMA) is one of several methods for coordinating
the timing of data transfers between an input/output (I/O) device and the
core processing unit or memory in a computer. DMA is one of the faster
types of synchronization mechanisms.
• Without DMA, the processor is responsible for the physical movement of
data between main memory and a device. A special control unit may be
provided to allow transfer of a block of data directly between an external
device and the main memory, without continuous intervention by the
processor. This approach is called direct memory access.
• Fig. 1.1.6 shows simplified diagram of DMA controller.

• The DMA controller uses hold request (HOLD) and hold acknowledge
(HLDA) signals to ask the CPU to stop driving the address, data and
control buses, so that the DMA controller can drive them to carry out a
transfer.
• A DMA controller implements direct memory access in a computer
system. It connects directly to the I/O device at one end and to the system
buses at the other end. It also interacts with the CPU, both via the system
buses and two new direct connections.
• The device controller transfers blocks of data from buffer storage
directly to main memory without CPU intervention. Only one interrupt is
generated per block, rather than one interrupt per byte.
• The DMA controller may either stop the CPU and access the memory, or
use the bus while the CPU is not using it.
2. Give reasons why caches are useful. What problems do they solve and
cause? If a cache can be made as large as the device for which it is
caching, why not make it that large and eliminate the device? (Apr/May 2018)
Why Caches are Useful:

Faster Access Times:

Caches store frequently accessed data and instructions closer to the CPU,
reducing the time it takes to retrieve information compared to fetching it
from slower main memory.
Improved System Performance:

By minimizing the time spent waiting for data, caches significantly enhance
overall system performance. This is crucial for tasks that require quick
access to data, such as running applications or processing instructions.
Reduced Latency:

Caches help reduce memory access latency, allowing the CPU to operate
more efficiently by fetching data and instructions at a faster rate.
Bandwidth Conservation:

Caches help conserve memory bandwidth by reducing the frequency of


requests to the slower main memory. This is particularly important in
systems with limited memory bandwidth.
Energy Efficiency:

Accessing data from caches consumes less energy compared to fetching


data from main memory. This contributes to energy efficiency and can be
crucial in battery-powered devices.
Problems Caches Solve:

Memory Latency:

Caches address the issue of memory latency by providing a smaller, faster


storage space for frequently used data, reducing the need to access the
slower main memory.
Limited Bandwidth:

Caches help manage limited memory bandwidth by storing frequently


accessed data locally, minimizing the impact on the overall system
bandwidth.
CPU Efficiency:

Caches improve CPU efficiency by reducing idle time spent waiting for
data, allowing the processor to execute instructions more quickly.
Issues Caches May Cause:

Cache Coherency:

Maintaining consistency between the contents of the cache and the main
memory can be challenging. Cache coherency protocols are implemented to
address this issue.
Complexity and Cost:

Implementing and managing caches can add complexity to the design of a
system, and larger, more sophisticated caches can increase production costs.
Cache Pollution:

In certain scenarios, the cache may store unnecessary or outdated data,


leading to inefficiencies. Cache replacement policies are employed to
address this issue.
Cache Size and Device Elimination:

While it may seem logical to make caches as large as the device they are
caching, there are practical limitations and trade-offs involved:

Cost and Complexity:

Larger caches are more expensive to manufacture and can add complexity
to the system design. There's a trade-off between cache size, cost, and the
benefits gained from increased cache capacity.
Diminishing Returns:

As the cache size increases, the improvement in performance diminishes.


There is a point of diminishing returns where the added cost and complexity
may not be justified by the marginal performance gains.
Physical Space:

Larger caches require more physical space on the chip, and in many cases,
the available space is limited. Designers need to balance the size of the
cache with other essential components on the chip.
Power Consumption:

Larger caches may consume more power, impacting energy efficiency. In


mobile devices or battery-powered systems, this is a critical consideration.
3. Discuss the functionality of system boot with respect to an operating system. (Nov/Dec 2018)
Booting in Operating System
Booting is the process of starting a computer. It can be initiated by
hardware, such as a button press, or by a software command. After it is
switched on, a CPU has no software in its main memory, so some process
must load software into memory before execution. This may be done by
hardware or firmware in the CPU or by a separate processor in the computer
system.

Sequencing of Booting

Booting is a start-up sequence that starts the operating system of a computer


when it is turned on. A boot sequence is the initial set of operations that the
computer performs when it is switched on. Every computer has a boot
sequence.

Types of Booting

There are two types of booting in an operating system: cold (hard)
booting, when the computer is started after having been switched off, and
warm (soft) booting, when the operating system alone is restarted without
turning off the power.

Booting Process in Operating System

When our computer is switched on, it can be started by hardware, such as a
button press, or by a software command. A computer's central processing
unit (CPU) has no software in its main memory, so some process must load
software into main memory before it can be executed. Below are the six
steps that describe the boot process in an operating system:

Step 1: Once the computer system is turned on, the BIOS (Basic Input/Output
System) performs a series of activity or functionality tests on programs
stored in ROM, called the POST (Power-On Self-Test), which checks whether
the peripherals in the system are in perfect order or not.

Step 2: After the BIOS is done with the pre-boot activities or
functionality tests, it reads the bootable sequence from CMOS
(Complementary Metal Oxide Semiconductor) memory and looks for the master
boot record in the first physical sector of the bootable disk, as per the
boot device sequence specified in CMOS. For example, the boot device
sequence may be:

o Floppy Disk
o Hard Disk
o CDROM

Step 3: After this, the master boot record is searched for first on the
floppy disk drive. If it is not found, the hard disk drive is searched for
the master boot record. If the master boot record is not present on the
hard disk either, the CD-ROM drive is searched. If the system cannot read
the master boot record from any of these sources, the ROM displays "No
boot device found" and the system halts. On finding the master boot record
on a particular bootable disk drive, the operating system loader, also
called the bootstrap loader, is loaded from the boot sector of that
bootable drive into memory. A bootstrap loader is a special program that is
present in the boot sector of a bootable drive.

Step 4: The bootstrap loader first loads the IO.SYS file. After this, the
MSDOS.SYS file is loaded, which is the core file of the DOS operating
system.

Step 5: After this, the CONFIG.SYS file is searched to find a Command
Interpreter specification, and when one is found, that interpreter is
loaded into memory. If no Command Interpreter is specified in the
CONFIG.SYS file, the COMMAND.COM file is loaded as the default Command
Interpreter of the DOS operating system.

Step 6: The last file to be loaded and executed is the AUTOEXEC.BAT file,
which contains a sequence of DOS commands. After this, the prompt is
displayed. We can see the drive letter of the bootable drive displayed on
the computer system, which indicates that the operating system has been
successfully loaded from that drive.
4. Discuss the essential properties of the following types of systems: (Nov/Dec 2018)
i) Time sharing systems ii) Multi-processor systems iii) Distributed systems

i) Time-Sharing Systems:

Time Division Multiplexing:

Time-sharing systems enable multiple users to share the resources of a


single computer simultaneously.
Time division multiplexing is used to allocate time slices or "time slots" to
each user, allowing them to interact with the system.

User Interactivity:

Time-sharing systems prioritize user interactivity, providing each user with


the illusion of dedicated access to the system.
Users can execute commands, run programs, and interact with the system in
real-time.
Fairness and Responsiveness:

Fairness is a crucial property, ensuring that each user receives a fair share of
system resources during their allocated time slice.
The system should be responsive to user input, minimizing latency and
providing a smooth user experience.
Concurrency and Resource Sharing:

Time-sharing systems support concurrent execution of multiple tasks,


allowing users to run their programs simultaneously.
Resource sharing mechanisms ensure that users can access and utilize
shared resources like the CPU, memory, and peripherals.
Security and Isolation:

Security mechanisms are essential to prevent unauthorized access to other


users' data and resources.
Isolation ensures that each user operates within their designated
environment without affecting others.
ii) Multi-Processor Systems:
Parallel Processing:
Multi-processor systems involve multiple CPUs (Central Processing Units)
working together simultaneously to execute tasks.
Parallel processing enables faster computation and improved system
performance.
Scalability:
Multi-processor systems are scalable, allowing for the addition of more
processors to enhance system capabilities.
Scalability is crucial for adapting to changing computational requirements.
Load Balancing:
Load balancing ensures that tasks are distributed evenly among the
processors, preventing bottlenecks and maximizing resource utilization.
Efficient load balancing contributes to optimal system performance.
Inter-Processor Communication:
Inter-processor communication mechanisms enable collaboration and data
sharing between different processors.
Efficient communication is vital for coordinating tasks and maintaining
consistency in shared data.
Fault Tolerance:
Multi-processor systems may incorporate fault-tolerant mechanisms to
handle hardware failures.

Redundancy and error recovery strategies help maintain system reliability.
iii) Distributed Systems:
Resource Sharing and Coordination:
Distributed systems involve multiple interconnected computers that work
together to achieve common goals.
Resource sharing and coordination mechanisms enable seamless
collaboration among distributed components.
Transparency:
Distributed systems aim to provide transparency to users, concealing the
complexity of distributed architecture.
Transparency includes location transparency, access transparency, and
failure transparency.
Scalability and Flexibility:
Distributed systems are designed to be scalable, accommodating an
increasing number of nodes or users.
Flexibility allows for easy adaptation to changes in system size,
configuration, or workload.
Reliability and Fault Tolerance:
Distributed systems often incorporate redundancy and fault-tolerant
mechanisms to handle failures gracefully.
Distributed data storage and processing ensure reliability in the face of
component failures.
Security:
Security is a critical consideration in distributed systems to protect data and
communication across networked components.
Authentication, encryption, and access control mechanisms contribute to a
secure distributed environment.

UNIT II PROCESS MANAGEMENT


Processes - Process Concept - Process Scheduling - Operations on Processes - Inter-process
Communication; CPU Scheduling - Scheduling criteria - Scheduling algorithms: Threads –
Multithread Models – Threading issues; Process Synchronization - The Critical-Section problem –
Synchronization hardware – Semaphores – Mutex - Classical problems of synchronization -
Monitors; Deadlock - Methods for handling deadlocks, Deadlock prevention, Deadlock avoidance,
Deadlock detection, Recovery from deadlock.
PART - A
S.No. Question and Answer CO Univ QP
(Month/Year)
1. Compare and contrast single-threaded and multi-threaded processes. CO2 (Apr/May 2017)
Single-threading is the processing of one command/process at a time,
whereas multithreading is a widespread programming and
execution model that allows multiple threads to exist within the
context of one process. These threads share the process's
resources, but are able to execute independently.
2. Priority inversion is a condition that occurs in real time systems – Analyze this statement. CO2 (Apr/May 2017)
Priority inversion is a problem that occurs in concurrent processes when
low-priority threads hold shared resources required by some high-priority
threads, causing the high-priority threads to block indefinitely. The
problem is exacerbated when the concurrent processes are in a real-time
system, where high-priority threads must be served on time.
Priority inversion occurs when task interdependency exists
among tasks with different priorities.
3. Distinguish between CPU-bounded and I/O-bounded processes. CO2 (Nov/Dec 2016)
A CPU-bound process spends the majority of its time simply using the CPU
(doing calculations).
An I/O-bound process spends the majority of its time in input/output
related operations.
4. What resources are required for creating threads? CO2 (Nov/Dec 2016)
Creating a thread does not require any new resources to execute. The
thread shares the resources of the process to which it belongs, and it
requires only a small data structure to hold a register set, stack, and
priority.
5. Under what circumstances are user-level threads better than kernel-level threads? CO2 (May/June 2016) (Nov/Dec 2015)
User-level threads are managed entirely by the run-time system (a
user-level library). The kernel knows nothing about user-level threads and
manages them as if they were single-threaded processes. User-level threads
are small and fast: each thread is represented by a PC, registers, a stack,
and a small thread control block. Creating a new thread, switching between
threads, and synchronizing threads are done via procedure calls, i.e., with
no kernel involvement. User-level threads can be a hundred times faster
than kernel-level threads.
User-level threads are simple to represent, simple to manage, and fast and
efficient.
6. What is the meaning of the term busy waiting? CO2 (May/June 2016) (Nov/Dec 2018)
Busy-waiting, busy-looping or spinning is a technique in which a process
repeatedly checks to see if a condition is true, as the sketch below shows.
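A minimal sketch of busy waiting (an illustration, assuming POSIX threads
and C11 atomics; compile with gcc -pthread):

#include <stdatomic.h>
#include <stdio.h>
#include <pthread.h>

static atomic_int ready = 0;

static void *producer(void *arg) {
    (void)arg;
    atomic_store(&ready, 1);          /* make the condition true    */
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);
    while (atomic_load(&ready) == 0)  /* busy wait: repeatedly test */
        ;                             /* the condition ("spin")     */
    printf("condition became true\n");
    pthread_join(t, NULL);
    return 0;
}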

7. List out the data fields associated with process control blocks. CO2 (April/May 2015)
Process ID, pointers, process state, priority, program counter, CPU
registers, I/O information, memory-management information, accounting
information, etc.
8. Define the term ‘Dispatch Latency’. CO2 (April/May 2015)

The term dispatch latency describes the amount of time it takes for a
system to respond to a request for a process to begin operation.
9. What is the concept behind strong semaphore and spinlock? CO2 (Nov/Dec 2015)
Strong semaphores specify the order in which processes are removed from
the queue (FIFO order), which guarantees avoiding starvation.
A spinlock is a lock which causes a thread trying to acquire it to simply
wait in a loop ("spin") while repeatedly checking if the lock is available.
10. What are the benefits of synchronous and asynchronous communication? CO2 (Apr/May 2018)
A benefit of synchronous communication is that it allows a rendezvous
between the sender and receiver.
An asynchronous operation is non-blocking and only initiates the operation.
11. Can a multithreaded solution using multiple user-level threads achieve better performance on a multiprocessor system than on a single-processor system? CO2 (Nov/Dec 2018)
No. A multithreaded system comprising multiple user-level threads cannot
make use of the different processors in a multiprocessor system
simultaneously.
12. What is a process control block? List out the data fields associated with a PCB. CO2 (Apr/May 2015)
Each process is represented in the operating system by a process control
block (PCB), also called a task control block.
 Process state
 Process number
 Program counter
 CPU registers
 Memory limits
 List of open files
 CPUscheduling information
 Memory management information
 Accounting information
 I/O status information
13. Differentiate single-threaded and multithreaded processes. CO2 (Apr/May 2017)
Single-threaded processes contain the execution of instructions in a single
sequence; in other words, one command is processed at a time. The opposite
of single-threaded processes are multithreaded processes, which allow the
execution of multiple parts of a program at the same time.


14. What is a thread? Write its benefits. CO2 (Apr/May 2019)
A thread, otherwise called a lightweight process (LWP), is a basic unit of
CPU utilization; it comprises a thread ID, a program counter, a register
set and a stack. It shares with other threads belonging to the same process
its code section, data section, and operating system resources such as open
files and signals.

15. Define the term ‘Dispatch Latency’. CO2 (Apr/May 2015)
The term dispatch latency describes the amount of time it takes for a
system to respond to a request for a process to begin operation. With a
scheduler written specifically to honor application priorities, real-time
applications can be developed with a bounded dispatch latency.

16. What are the differences between user-level threads and kernel-level threads? CO2 (Apr/May 2019)
User threads
 User threads are supported above the kernel and are
implemented by a thread library at the user level.
 Thread creation & scheduling are done in the user
space, without kernel intervention.
 Therefore they are fast to create and manage. However, a
blocking system call will cause the entire process to
block.
Kernel threads
 Kernel threads are supported directly by the
operating system.
 Thread creation, scheduling and management are
done by the operating system.
 Therefore they are slower to create & manage
compared to user threads.
 If the thread performs a blocking system call, the
kernel can schedule another thread in the application
for execution
17. What is a semaphore? Mention its importance in operating systems. CO2 (Nov/Dec 2011)
A semaphore 'S' is a synchronization tool: an integer value that, apart
from initialization, is accessed only through two standard atomic
operations, wait and signal. Semaphores can be used to deal with the
n-process critical-section problem and to solve various other
synchronization problems, as the sketch below illustrates.
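A conceptual C sketch of the two operations (the classic textbook
definition; a real kernel executes each operation atomically, and block()
and wakeup() are hypothetical primitives shown only as comments):

typedef struct {
    int value;            /* the semaphore's integer value              */
    /* plus a queue of processes waiting on this semaphore (omitted)    */
} semaphore;

void wait(semaphore *S) {    /* also called P() */
    S->value--;
    if (S->value < 0) {
        /* add the calling process to S's waiting queue, then block() */
    }
}

void signal(semaphore *S) {  /* also called V() */
    S->value++;
    if (S->value <= 0) {
        /* remove a process P from S's waiting queue, then wakeup(P) */
    }
}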
18. Define deadlock. CO2 (Nov/Dec 2021)
A process requests resources; if the resources are not available at that
time, the process enters a wait state. Waiting processes may never again
change state, because the resources they have requested are held by other
waiting processes. This situation is called a deadlock.
19. Draw & briefly explain the process states. CO2 (Nov/Dec 2017) (Nov/Dec 2019)

 New – The process is being created.


 Running – Instructions are being executed
 Waiting – The process is waiting for some event to
occur
 Ready – The process is waiting to be assigned a
processor
 Terminated - the process has finished execution
20. List out the benefits and challenges of process scheduling CO2 Apr/May 2019
Benefits of Process Scheduling:

 Improved CPU Utilization:


 Enhanced Throughput:

 Fair Resource Allocation:
 Reduced Response Time:
 Efficient Multitasking:
 Resource Utilization:
 Priority Management:

Challenges of Process Scheduling:

 Complexity:
 Resource Contention:
 Starvation:
 Real-Time Constraints:
 Context Switching Overhead:
 Algorithm Selection:
 Dynamic Workloads:
 I/O-Bound Processes:
21. Give the queuing diagram representation of process scheduling. CO2 Apr/May 2019

22. What conditions must hold for a deadlock to occur? CO2
The four necessary conditions for a deadlock situation are
 mutual exclusion,
 no preemption,
 hold and wait and
 circular wait.
23. Elucidate mutex locks with an example. CO2
A mutex lock makes it possible to implement mutual exclusion by limiting
the number of threads or processes that can simultaneously acquire the
lock to one. A thread or process has to first acquire the mutex lock
protecting a shared object before it can access it, as the sketch below
shows.
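A short example (an illustration, assuming POSIX threads; compile with
gcc -pthread) of a mutex protecting a shared counter:

#include <stdio.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* acquire before touching shared data */
        counter++;                   /* critical section */
        pthread_mutex_unlock(&lock); /* release so others may enter */
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter); /* always 200000 with the lock */
    return 0;
}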
24. What is the meaning of the term busy waiting? CO2
Busy waiting, also known as spinning or busy looping, is a process
synchronization technique in which a process/task waits and constantly
checks for a condition to be satisfied before proceeding with its
execution.
25. Under what circumstances do CPU scheduling decisions take place? CO2
 When a process switches from the running state to the waiting state.
 When a process switches from the running state to the ready state.
 When a process switches from the waiting state to the ready state.
 When a process terminates.
PART - B
S.No. Question and Answer CO Univ QP (Month/Year)
1. Suppose that the following processes arrive for execution at the
times indicated. Each process will run the listed amount of time. In
answering the questions, use non-preemptive scheduling and base all
decisions on the information you have at the time the decision must be
made. CO2 (Nov/Dec 2018) (Apr/May 2017) (Apr/May 2018)
Process Arrival Time Burst Time
P1 0.0 8
P2 0.4 4
P3 1.0 1

a. Find the average turnaround time for these processes with


the FCFS scheduling algorithm?

b. Find the average turnaround time for these processes with


the SJF scheduling algorithm?

c. The SJF algorithm is supposed to improve performance, but notice that
we chose to run process P1 at time 0 because we did not know that two
shorter processes would arrive soon. Find what the average turnaround
time will be if the CPU is left idle for the first 1 time unit and then
SJF scheduling is used. Remember that processes P1 and P2 are waiting
during this idle time, so their waiting time may increase. This algorithm
could be known as future-knowledge scheduling.

a. What is the average turnaround time for these processes with the
FCFS scheduling algorithm?
Solution: Average turnaround for these processes: ( 8 + (12 - 0.4)
+ (13 - 1)) / 3 = 10.53
b. What is the average turnaround time for these processes with the
SJF scheduling algorithm? Solution: Average turnaround for these
processes: ( 8 + (9 - 1) + (13 – 0.4)) / 3 = 9.53

c. The SJF algorithm is supposed to improve performance, but


notice that we chose to run process P1 at time 0 because we did not know
that two shorter processes would arrive soon. Compute
what the average turnaround time will be if the CPU is left idle for
the first 1 unit and then SJF scheduling is used.
Remember that processes P1 and P2 are waiting during this idle
time, so their waiting time may increase. The algorithm could be
known as future-knowledge Scheduling Solution: CPU is left idle:
Average turnaround for these processes: ((2 - 1) + ( 6 – 0.4 ) + ( 14
- 0)) / 3 = 6.87
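As a quick check of the arithmetic (a sketch, not part of the original
answer), the following C program simulates non-preemptive FCFS for these
three processes and prints the average turnaround time:

#include <stdio.h>

int main(void) {
    double arrival[] = {0.0, 0.4, 1.0};
    double burst[]   = {8.0, 4.0, 1.0};
    double clock = 0.0, total_turnaround = 0.0;

    for (int i = 0; i < 3; i++) {   /* processes already in FCFS order */
        if (clock < arrival[i])
            clock = arrival[i];     /* CPU idles until the process arrives */
        clock += burst[i];          /* run the process to completion */
        total_turnaround += clock - arrival[i];
    }
    printf("FCFS average turnaround = %.2f\n", total_turnaround / 3.0);
    /* prints 10.53, matching the hand calculation above */
    return 0;
}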
2. State the critical-section problem. Discuss three solutions to solve the critical-section problem. (Apr/May 2017)

The critical section is a code segment where shared variables can be
accessed. An atomic action is required in a critical section, i.e., only
one process can execute in its critical section at a time. All the other
processes have to wait to execute in their critical sections.

A diagram that demonstrates the critical section is as follows −

In the above diagram, the entry section handles the entry into the
critical section. It acquires the resources needed for execution by
the process. The exit section handles the exit from the critical
section. It releases the resources and also informs the other
processes that the critical section is free.

Solution to the Critical Section Problem

The critical section problem needs a solution to synchronize the
different processes. The solution to the critical section problem
must satisfy the following conditions −

 Mutual Exclusion
Mutual exclusion implies that only one process can be
inside the critical section at any time. If any other processes
require the critical section, they must wait until it is free.
 Progress
Progress means that if a process is not using the critical
section, then it should not stop any other process from
accessing it. In other words, any process can enter a critical
section if it is free.
 Bounded Waiting
Bounded waiting means that each process must have a limited
waiting time; it should not wait endlessly to access the critical
section. A classic software solution that satisfies all three
conditions is sketched after this list.
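For reference, Peterson's algorithm is a classic two-process software
solution that satisfies mutual exclusion, progress and bounded waiting.
The C sketch below is illustrative only; on modern hardware it would
additionally need memory fences:

#include <stdbool.h>

static volatile bool flag[2] = {false, false}; /* wants to enter?   */
static volatile int  turn    = 0;              /* whose turn it is  */

void enter_critical_section(int self) {        /* self is 0 or 1    */
    int other = 1 - self;
    flag[self] = true;          /* announce intent to enter          */
    turn = other;               /* yield priority to the other one   */
    while (flag[other] && turn == other)
        ;                       /* busy wait; the wait is bounded    */
}

void exit_critical_section(int self) {
    flag[self] = false;         /* allow the other process to enter  */
}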

3. Illustrate an example situation in which ordinary pipes are more suitable than named pipes, and an example situation in which named pipes are more suitable than ordinary pipes. (Nov/Dec 2016)
which named pipes are more suitable than ordinary pipes.

A pipe is a technique for passing information from one process to another.
Ordinary pipes
Ordinary pipes allow two processes to communicate in standard
producer–consumer fashion where the producer writes to one end
of the pipe and the consumer reads from the other end. Thus they
allow only one-way communication, meaning they are unidirectional. An
ordinary pipe cannot be accessed from outside the process that created it.
Generally, we use ordinary pipes when a parent process wants to
communicate with its children. Once the processes have finished
communicating and have terminated, the ordinary pipe ceases to exist. On
the other hand, named pipes provide a bidirectional communication tool and
don't require a parent–child relationship. Once a named pipe is
established, several processes can use it for communication.
From the above we can conclude that ordinary pipes work better with simple
communication. An example could be a process which counts characters in a
file. In this case:

The producer would write the file to the pipe.
The consumer would read the file and count the number of characters.
Next, consider an example where named pipes are more suitable: when
several processes need to write at the same time, for instance when
several processes need to write messages to a log. A log is simply a file
that records events that occur in the OS, or messages between users of a
communication software. So several processes should write messages to the
log file. In this case:
Several processes can write something to the named pipe.
The messages would be read by a server from the named pipe.
The server, after reading the messages, writes them to a log file.
Ordinary pipes work better with simple communication; named pipes work
better when several processes need to write messages. Ordinary pipes
allow communication between parent and child processes, while named pipes
permit unrelated processes to communicate.
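A minimal sketch of the ordinary-pipe scenario above (assuming a POSIX
system; the message and buffer size are arbitrary): the parent (producer)
writes a message into the pipe and the child (consumer) reads it and
counts the characters.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];                    /* fd[0]: read end, fd[1]: write end */
    char buf[64];

    if (pipe(fd) == -1)
        return 1;

    if (fork() == 0) {            /* child: the consumer  */
        close(fd[1]);             /* child does not write */
        ssize_t n = read(fd[0], buf, sizeof(buf));
        printf("read %zd characters\n", n);
        close(fd[0]);
        return 0;
    }
    /* parent: the producer */
    close(fd[0]);                 /* parent does not read      */
    write(fd[1], "Hello, pipe!", 12);
    close(fd[1]);                 /* signals EOF to the reader */
    wait(NULL);
    return 0;
}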
4. Explain why interrupts are not appropriate for implementing synchronization primitives in multiprocessor systems. (Nov/Dec 2016) (Nov/Dec 2018)

In multiprocessor systems, where multiple processors or cores


operate concurrently, implementing synchronization primitives
using interrupts may not be the most appropriate approach.
Interrupts are typically used to handle asynchronous events and
provide a mechanism for a processor to temporarily suspend its
current execution to respond to an external event. However,
there are several reasons why interrupts may not be suitable for
synchronization in multiprocessor systems:

Unpredictable Timing: Interrupts can be triggered at any time,


making their timing unpredictable. In a multiprocessor
environment, where multiple processors are executing
instructions concurrently, relying on interrupts for
synchronization can lead to non-deterministic behavior and race
conditions. This unpredictability makes it challenging to ensure
proper coordination among processors.

Concurrency Issues: Multiprocessor systems involve multiple


threads of execution running in parallel. If interrupts are used for
synchronization, it can introduce concurrency issues, such as
data races and inconsistent states, as multiple processors may be
interrupted simultaneously and try to access shared resources
concurrently. This can lead to difficult-to-debug synchronization
problems.

Interrupt Handling Overhead: Handling interrupts involves a


context switch, where the processor saves its current state,
switches to the interrupt service routine, and later restores the
saved state. In a multiprocessor system, the overhead of handling
interrupts on multiple processors simultaneously can be
significant. This can degrade overall system performance and
increase contention for shared resources.

Difficulty in Ensuring Atomicity: Synchronization primitives


often require atomic operations to ensure that critical sections
are executed without interruption. Achieving atomicity using
interrupts may be complex and error-prone, especially in a
multiprocessor environment where multiple processors may
attempt to access shared resources concurrently.

Alternative Synchronization Mechanisms: Multiprocessor


systems typically employ other synchronization mechanisms
specifically designed for concurrent execution, such as locks,
barriers, and atomic operations. These mechanisms are often
more efficient and provide better control over synchronization in
a multiprocessor environment compared to relying on interrupts.
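As an illustration of the alternative mechanisms mentioned above, the
sketch below (assuming C11 atomics and POSIX threads; compile with
gcc -pthread) uses an atomic counter that threads, possibly running on
different processors, can update safely without interrupts or locks:

#include <stdio.h>
#include <stdatomic.h>
#include <pthread.h>

static atomic_long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add(&counter, 1);  /* single indivisible update */
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", (long)counter); /* always 200000 */
    return 0;
}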
5. Elaborate the actions taken by the kernel to context-switch between processes. (Nov/Dec 2016)

Actions taken by a kernel to context-switch between processes are:

In response to a clock interrupt, the OS must save the PC and user stack
pointer of the currently executing process and transfer control to the
kernel clock interrupt handler.

Saving the rest of the registers, as well as other machine state, such as
the state of the floating-point registers, in the process PCB is done by
the clock interrupt handler.

The scheduler is invoked by the OS to determine the next process to
execute.

Then the state of the next process is retrieved from its PCB by the OS,
and the registers are restored. The restore operation takes the processor
back to the state in which the previous process was previously
interrupted, executing in user code with user-mode privileges.

R2021 / AIDS / AL3451 –MACHINE LEARNING/ II YEAR / IV SEM / QB / 62


• Many architecture-specific operations, including flushing data
and instruction caches, must also be performed during a context
switch.
• An interrupt causes the operating system to change a CPU from
its current task and run a kernel routine.
• When an interrupt occurs, the system needs to save the current
context of the process running on the CPU, so that it can be
restored later when needed.
• Switching the CPU to another process requires performing a state
save of the current process and a state restore of a different
process; this task is known as context switching.
• When a context switch occurs, the kernel saves the context of
the old process in its PCB and loads the saved context of the new
process scheduled to run.
• Context-switch time is pure overhead, because the system does no
useful work during this time.
• It depends highly on hardware support; for example, if the
processor has multiple register sets, the old context need not be
saved to memory, since there is enough register space to hold the
contexts of several frequently used processes at once.
• Actions taken by the kernel to context-switch between processes:
• In response to a clock interrupt, the OS saves the PC and user
stack pointer of the currently executing process and transfers
control to the kernel clock interrupt handler.
• The clock interrupt handler saves the rest of the registers, as
well as other machine state, such as the state of the floating-
point registers, in the process PCB.
• The OS invokes the scheduler to determine the next process to
execute.
• The OS then retrieves the state of the next process from its PCB
and restores the registers.
• This restore operation takes the processor back to the state in
which this process was previously interrupted, executing in user
mode with user-mode privileges.
6. Consider the following resource-allocation policy. Requests (Nov/Dec 2015)
and releases for resources are allowed at any time. If a
request for resources cannot be satisfied because the
resources are not available, then we check any processes that
are blocked, waiting for resources. If they have the desired
resources, then these resources are taken away from them and
are given to the requesting process. The vector of resources
for which the waiting process is waiting is increased to
include the resources that were taken away.

For example, consider a system with three resource types and
the vector Available initialized to (4,2,2). If process P0
asks for (2,2,1), it gets them. If P1 asks for (1,0,1), it
gets them. Then, if P0 asks for (0,0,1), it is blocked
(resource not available). If P2 now asks for (2,0,0), it gets
the available one (1,0,0) and one that was allocated to P0
(since P0 is blocked).
P0's Allocation vector goes down to (1,2,1), and its Need
vector goes up to (1,0,1).

a. Predict whether deadlock occurs. If so, give an example; if
not, specify which necessary condition cannot occur.
b. Predict whether indefinite blocking occurs.

a. Can deadlock occur? If you answer "yes," give an example. If
you answer "no," specify which necessary condition cannot occur.

Deadlock, in computing, is a situation in which two programs
sharing the same resources each prevent the other from gaining
access to a resource, with the result that neither can proceed.
In the case in question, deadlock cannot occur. The avoidance of
deadlock here is primarily due to the existence of preemption,
so the necessary condition of no preemption cannot hold.
Preemption is observed when P2 receives a resource unit that was
previously allocated to P0: since P0 is blocked, the resource is
taken away from it and given to P2.
b. Can indefinite blocking occur? Explain your answer.
There is a risk of indefinite blocking because the resource-
allocation scheme is built on preemption. Also referred to as
starvation, indefinite blocking is a situation in which a process
can wait forever while other processes keep being served ahead of
it. For example, if a blocked process such as P0 continuously has
its resources preempted by later requests, it may never accumulate
the resources it needs, leading to starvation.
7. Explain the dining philosophers problem. (Apr/May 2017)
(Nov/Dec 2018)
Dining Philosophers Problem
• The dining philosophers problem is one of the classical process
synchronization problems. Five philosophers are seated around a
circular table. They spend their lives thinking and eating. Five
plates with five forks are kept on the circular table.
• While philosophers think, they do not eat and do not require a
fork. When a philosopher decides to eat, he/she must obtain two
forks: one from the left side and another from the right side.
Philosophers can pick up only a single fork at a time. After
consuming food, a philosopher replaces the forks and resumes
thinking. The philosophers to the left and right of a dining
philosopher cannot eat while that philosopher is eating, since
forks are a shared resource.
Fig. 2.14.1 shows the seating arrangement of the five philosophers.
Consider the following code:

void dining_philosopher()
{
    while (true)
    {
        thinking();
        eating();
    }
}

void eating()
{
    take_left_fork();
    take_right_fork();
    eat_food(delay);
    put_right_fork();
    put_left_fork();
}
• Since the philosophers operate asynchronously and concurrently,
it is possible for all of them to try to enter eating mode at the
same time. The procedure take_left_fork() waits until the
specified fork is available and then seizes it. If every
philosopher seizes the left fork at the same moment, no one can
ever obtain a right fork:
• Each philosopher will hold exactly one fork, and no forks will
remain available on the table. The philosophers will all deadlock.
• To solve this problem, extra code can be written in the
take_right_fork() procedure: the philosopher puts down the left
fork if the right fork cannot be obtained.
Problem analysis:
• Shared resource: Each fork is shared by two philosophers, so it
is a shared resource.
• Race condition: We do not want a philosopher to pick up a fork
that has already been picked up by his neighbor. This is a race
condition.
• Solution: We treat each fork as a shared item protected by a
mutex lock. Before eating, each philosopher first locks the left
fork and then the right fork. If the philosopher acquires both
locks successfully, he holds two forks (two locks) and can eat.
After finishing eating, the philosopher releases both forks and
thinks.
Some other alternatives:
1. Pick up the left fork; if the right fork is not available for a
given time, put the left fork down, wait, and try again. Even if
each philosopher waits a different random time, an unlucky
philosopher may starve.
2. Require all philosophers to acquire a binary semaphore before
picking up any forks. This guarantees that no philosopher starves
but limits parallelism dramatically.
• Because we need to lock and unlock a chopstick, each chopstick
is associated with a mutex lock. Since we have five philosophers
who think and eat simultaneously, we need to create five threads,
one for each philosopher. Since each philosopher must have access
to the two mutex locks associated with its left and right
chopsticks, these mutex locks are global variables.
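A minimal sketch of this mutex-per-chopstick approach with POSIX
threads is given below. One common refinement, used here as an
assumption beyond the text above, is to lock the lower-numbered
fork first (resource ordering), which prevents the circular wait
described earlier. The sleep durations and round count are
illustrative only.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define N 5
pthread_mutex_t fork_lock[N];          /* one mutex per chopstick */

void *philosopher(void *arg)
{
    int i = *(int *)arg;
    int left = i, right = (i + 1) % N;
    /* lock the lower-numbered fork first to avoid circular wait */
    int first = left < right ? left : right;
    int second = left < right ? right : left;
    for (int round = 0; round < 3; round++) {
        sleep(1);                              /* think */
        pthread_mutex_lock(&fork_lock[first]);
        pthread_mutex_lock(&fork_lock[second]);
        printf("philosopher %d eating\n", i);  /* eat */
        sleep(1);
        pthread_mutex_unlock(&fork_lock[second]);
        pthread_mutex_unlock(&fork_lock[first]);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[N];
    int id[N];
    for (int i = 0; i < N; i++)
        pthread_mutex_init(&fork_lock[i], NULL);
    for (int i = 0; i < N; i++) {
        id[i] = i;
        pthread_create(&t[i], NULL, philosopher, &id[i]);
    }
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    return 0;
}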

8. Distinguish among short-term, medium-term and long-term
scheduling with suitable examples. (Apr/May 2018)
Schedulers
• Schedulers handle process scheduling. A scheduler is a type of
system software that selects jobs from the system and decides
which process to run. Schedulers are of three types:
1. Long-term scheduler 2. Short-term scheduler 3. Medium-term
scheduler
• Fig. 2.2.2 shows the queueing diagram for process scheduling.
Long-term scheduler
• The long-term scheduler is also called the job scheduler. It
determines which processes are admitted to the system for
processing. Processes are selected from the queue and loaded into
main memory for execution.
• Long-term scheduling controls the degree of multiprogramming in
multitasking systems. It provides a balanced mix of jobs, such as
I/O-bound and CPU-bound jobs.
• Long-term scheduling is used in real-time operating systems; a
time-sharing operating system has no long-term scheduler.
Medium-term scheduler
• The medium-term scheduler is part of the swapping function.
Sometimes it removes a process from memory, thereby reducing the
degree of multiprogramming.
• If a process in memory makes an I/O request, the operating
system moves it to the suspended state. Once the process becomes
suspended, it cannot make any progress towards completion.
• In this situation, the process is removed from memory to make
free space for another process.
• The suspended process is stored on a secondary storage device,
i.e. the hard disk. This mechanism is called swapping.
Short-term scheduler
• The short-term scheduler is also called the CPU scheduler. It
selects a process from the queue of ready processes and allocates
the CPU to it for execution.
• The short-term scheduler is faster than the long-term scheduler.
It makes scheduling decisions much more frequently than the
long-term or medium-term schedulers.
• A scheduling decision must be made at least after every time
slice, and these are very short.
• It is also known as the dispatcher.
Difference between Long-Term, Short-Term and Medium-Term
Schedulers

9. Explain the differences in the degree to which the following
scheduling algorithms discriminate in favour of short
processes: RR, Multilevel Feedback Queues. (Apr/May 2018)

1. First come, first served (FCFS) scheduling:
• In this algorithm, the process that requests the CPU first is
allotted the CPU first; its implementation is easily maintained
using a FIFO queue.
• When the CPU is free, it is allotted to the process at the head
of the queue, which continues holding the CPU until it terminates
or requests I/O devices.
• A process waiting for the CPU therefore has to wait for the
processes ahead of it to finish, so the waiting time is large if a
longer process executes before a shorter process.
• Thus we can say that FCFS discriminates against short jobs,
since any short job arriving after a long job will have a longer
waiting time.
2. Round-robin (RR) scheduling:
• This algorithm is designed for time-sharing systems.
• A small unit of time called the time quantum or time slice is
defined, and each process is switched in and out of the CPU
depending on this time quantum value.
• The time quantum is generally from 10 ms to 100 ms in length.
• A process is never allotted the CPU for more than one time
quantum at a stretch.
• The performance of the RR algorithm depends heavily on the size
of the time quantum: if the time quantum is very long, the RR
policy behaves the same as FCFS, and if the time quantum is very
small (say 1 ms), the RR approach is called processor sharing and
creates the appearance that each of n processes has its own
processor running at 1/n the speed of the real processor.
• We can say that RR treats all jobs equally (giving them equal
bursts of CPU time), so short jobs are able to leave the system
faster, since they finish first.
3. Multilevel feedback queues:
• They allow a process to move between queues; the idea is to
separate processes according to the characteristics of their CPU
bursts.
• If a process uses too much CPU time, it is moved to a lower-
priority queue; under this scheme, all I/O-bound and interactive
processes stay in the higher-priority queues.
• In addition, a process that waits too long in a lower-priority
queue may be moved to a higher-priority queue, and this form of
aging prevents starvation.
• Thus we can say that multilevel feedback queues work similarly
to the RR algorithm, so they discriminate favorably towards short
jobs.

10. Discuss how the following pairs of scheduling criteria
conflict in certain settings. (Nov/Dec 2018)
i) CPU utilization and response time
ii) Average turnaround time and maximum waiting time
iii) I/O device utilization and CPU utilization

CPU Utilization and Response Time:
Conflict: CPU utilization is the percentage of time the CPU is
actively executing processes, while response time is the time it
takes for a system to respond to a user request. These criteria
can conflict in scenarios where maximizing CPU utilization, by
keeping the CPU busy with processes, may lead to longer response
times for interactive tasks. For example, a system running
compute-bound processes with high CPU utilization may respond
slowly to user input because the CPU is occupied with lengthy
computations.
Example: In a server environment with a high CPU utilization
goal, background tasks or batch processing may dominate the CPU,
causing delays in responding to user requests and impacting the
overall user experience.
Average Turnaround Time and Maximum Waiting Time:
Conflict: Average turnaround time is the average time taken for a
process to complete from submission to termination, while maximum
waiting time is the longest time a process waits in the ready
queue before getting CPU time. These criteria conflict when
efforts to minimize average turnaround time (by quickly
completing most processes) might result in some processes
experiencing very long waiting times, leading to high maximum
waiting times.
Example: A scheduling algorithm prioritizing short processes to
minimize average turnaround time may lead to longer waiting times
for longer processes, causing their maximum waiting time to be
high.
I/O Device Utilization and CPU Utilization:
Conflict: I/O device utilization focuses on keeping I/O devices
busy, while CPU utilization emphasizes keeping the CPU busy. In
certain situations, optimizing for one may lead to suboptimal
performance in the other. For example, if the CPU is constantly
waiting for I/O operations to complete (low CPU utilization), the
I/O devices might be highly utilized. Conversely, if the CPU is
constantly processing and not waiting for I/O (high CPU
utilization), I/O devices may be underutilized.
Example: In a system with high I/O device utilization, the CPU
might spend significant time waiting for I/O operations to
complete, leading to suboptimal CPU utilization. On the other
hand, a system with high CPU utilization may experience I/O
devices being idle while the CPU is processing.
In each of these pairs, the conflict arises because optimizing
one criterion can have a negative impact on the other. Balancing
these conflicting criteria is a key challenge in designing
effective scheduling algorithms that meet the specific
requirements of a given system or application.
11. Write about the various CPU scheduling algorithms.
CPU scheduling algorithms play a crucial role in determining the
order in which processes are executed by the CPU. These
algorithms aim to efficiently utilize the CPU's resources,
minimize waiting times, and improve overall system performance.
Here are some of the various CPU scheduling algorithms:

First-Come-First-Serve (FCFS):
Description: The simplest scheduling algorithm, where processes
are executed in the order they arrive in the ready queue.
Advantages: Easy to understand and implement.
Disadvantages: May lead to the "convoy effect," where short
processes get stuck behind long processes.
Shortest Job Next (SJN) or Shortest Job First (SJF):
Description: Selects the process with the shortest burst time first.
Advantages: Minimizes waiting time for short processes.
Disadvantages: Requires knowledge of burst times, which may not
be available in advance.
Priority Scheduling:
Description: Assigns priorities to processes, and the process with
the highest priority is selected for execution first.
Advantages: Allows for prioritization of important tasks.
Disadvantages: Can lead to starvation of lower-priority processes.
Round Robin (RR):
Description: Allocates fixed time slices (time quantum) to each
process in a circular order.
Advantages: Fair distribution of CPU time; suitable for time-
sharing systems.
Disadvantages: May lead to high turnaround time for long
processes; choice of time quantum affects performance.
Multilevel Queue Scheduling:
Description: Divides the ready queue into multiple priority levels
and assigns a different scheduling algorithm to each queue.
Advantages: Allows processes to move between queues based on
their behavior; suitable for diverse workloads.
Disadvantages: Complexity in managing multiple queues and
determining the criteria for moving processes between queues.
Multilevel Feedback Queue (MLFQ):
Description: Similar to multilevel queues, but processes can move
between queues based on their behavior (e.g., aging or priority
changes).
Advantages: Dynamic adjustment to process behavior; adapts to
different types of processes.
Disadvantages: Requires careful tuning of parameters to achieve
optimal performance.
Priority Aging:
Description: Increases the priority of waiting processes over time,
preventing starvation of low-priority processes.
Advantages: Mitigates priority inversion and starvation issues.
Disadvantages: May introduce additional complexity in the
scheduling algorithm.
Lottery Scheduling:
Description: Assigns lottery tickets to processes, and a scheduler
randomly selects a ticket to determine the next process to execute.
Advantages: Provides a probabilistic approach to fairness.
Disadvantages: May be less predictable; the complexity of ticket
distribution.
Guaranteed Scheduling:
Description: Ensures that each process is guaranteed a minimum
amount of CPU time over a given interval.
Advantages: Ensures fairness and prevents starvation.
Disadvantages: May underutilize the CPU.

12. Write about critical regions and monitors.


Monitors
• A monitor is an object that contains both the data and the
procedures needed to perform allocation of a shared resource. It
is a programming-language construct that supports both data-access
synchronization and control synchronization.
• Fig. 2.15.1 shows the structure of a monitor.
• Monitors are implemented in programming languages like Pascal,
Java and C++. Java makes extensive use of monitors to implement
mutual exclusion.
• A monitor is an abstract data type for which only one process
may be executing a procedure at any given time. A monitor is a
collection of procedures, variables and data structures. Data
inside the monitor may be either global to all routines within
the monitor or local to a specific routine.
• Monitor data is accessible only within the monitor. A shared
data structure can be protected by placing it in a monitor. If the
data in a monitor represents some resource, then the monitor
provides a mutual exclusion facility for accessing the resource. A
procedure defined within a monitor can access only those variables
declared locally within the monitor and its formal parameters.
• When a process calls a monitor procedure, the first few
instructions of the procedure check whether any other process is
currently active within the monitor. If another process is active,
the calling process is suspended until that process has left the
monitor. If no other process is using the monitor, the calling
process may enter.

• Monitors support synchronization through condition variables
that are contained within the monitor and accessible only within
the monitor. Every condition variable has an associated queue.
Condition variables support two operations:
cwait(condition variable)
csignal(condition variable)
• cwait: suspends execution of the calling process on the
condition.
• csignal: resumes execution of some process blocked after a
cwait on the same condition.
• The cwait must come before the csignal.
• Fig. 2.15.2 shows the monitor structure with condition
variables.
• The CPU is a resource that must be shared by all processes. The
part of the kernel that apportions CPU time between processes is
called the scheduler.
• A condition variable is like a semaphore, with two differences:
1. A semaphore counts the number of excess up operations, but a
signal operation on a condition variable has no effect unless some
process is waiting. A wait on a condition variable always blocks
the calling process.
2. A wait on a condition variable automatically does an up on the
monitor mutex and blocks the caller.
PART C
1. Describe Operations on Processes and IPC elaborately.
There are many operations that can be performed on processes.
Some of these are process creation, process preemption, process
blocking, and process termination. These are given in detail as
follows −
Process Creation
Processes need to be created in the system for different
operations. This can be done by the following events −
 User request for process creation
 System initialization
 Execution of a process creation system call by a running
process
 Batch job initialization
A process may be created by another process using fork(). The
creating process is called the parent process and the created
process is the child process. A child process can have only one
parent but a parent process may have many children. Both the
parent and child processes have the same memory image, open
files, and environment strings. However, they have distinct
address spaces.
A diagram that demonstrates process creation using fork() is as
follows −
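Since the diagram itself is not reproduced here, the parent/child
relationship it depicts can be sketched with a minimal C program
(illustrative only, not part of the original answer):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();                 /* create a child process */
    if (pid == 0) {
        printf("child: pid=%d\n", getpid());
    } else if (pid > 0) {
        wait(NULL);                     /* parent waits for the child */
        printf("parent: child %d finished\n", pid);
    } else {
        perror("fork");                 /* fork failed */
    }
    return 0;
}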

Process Preemption
An interrupt mechanism is used in preemption that suspends the
process executing currently and the next process to execute is
determined by the short-term scheduler. Preemption makes sure
that all processes get some CPU time for execution.
A diagram that demonstrates process preemption is as follows −

Process Blocking
A process is blocked if it is waiting for some event to occur.
This event is often I/O; while the I/O is being carried out by a
device, the process does not require the processor. After the
event is complete, the process again goes to the ready state.
A diagram that demonstrates process blocking is as follows −
Process Termination
After the process has completed the execution of its last
instruction, it is terminated. The resources held by a process are
released after it is terminated.
A child process can be terminated by its parent process if its
task is no longer relevant. The child process sends its status
information to the parent process before it terminates. Also,
when a parent process is terminated, its child processes are
terminated as well, since child processes cannot run if their
parent process has been terminated.
INTER-PROCESS COMMUNICATION
In general, inter-process communication is a type of mechanism
usually provided by the operating system (OS). The main aim or
goal of this mechanism is to provide communication between
several processes. In short, inter-process communication allows
one process to let another process know that some event has
occurred.
Let us now look at the general definition of inter-process
communication, which explains the same idea discussed above.
Definition
"Inter-process communication is used for exchanging useful
information between numerous threads in one or more processes
(or programs)."
To understand inter-process communication, you can consider the
following diagram, which illustrates the importance of
inter-process communication:

Role of Synchronization in Inter-Process Communication
It is one of the essential parts of inter-process communication.
Typically, synchronization is provided by the inter-process
communication control mechanisms, but sometimes it can also be
handled by the communicating processes themselves.
The following methods are used to provide synchronization:
1. Mutual Exclusion
2. Semaphore
3. Barrier
4. Spinlock
Mutual Exclusion:-
It is generally required that only one process thread can enter the
critical section at a time. This also helps in synchronization and
creates a stable state to avoid the race condition.
Semaphore:-
Semaphore is a type of variable that usually controls the access to
the shared resources by several processes. Semaphore is further
divided into two types which are as follows:
1. Binary Semaphore
2. Counting Semaphore
Barrier:-
A barrier does not allow an individual process to proceed until
all the processes reach it. It is used by many parallel languages,
and collective routines impose barriers.
Spinlock:-
A spinlock is a type of lock, as its name implies. A process
trying to acquire a spinlock waits, or stays in a loop, while
repeatedly checking whether the lock is available. This is known
as busy waiting: even though the process is active, it does not
perform any useful operation (or task).
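A minimal busy-waiting spinlock can be sketched with C11 atomics.
The function names spin_lock/spin_unlock are our own illustrative
names, not a standard API:

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

void spin_lock(void)
{
    /* test-and-set in a loop: spin until the flag was clear */
    while (atomic_flag_test_and_set(&lock))
        ;   /* busy wait */
}

void spin_unlock(void)
{
    atomic_flag_clear(&lock);
}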
Approaches to Interprocess Communication
We will now discuss a few different approaches to inter-process
communication, which are as follows:
1. Pipes
2. Shared Memory
3. Message Queue
4. Direct Communication
5. Indirect Communication
6. Message Passing
7. FIFO
To understand them in more detail, we will discuss each of them
individually.
Pipe:-
The pipe is a type of data channel that is unidirectional in
nature. It means that the data in this type of data channel can be
moved in only a single direction at a time. Still, one can use two
channels of this type, so that data can be sent and received
between two processes. Typically, pipes use the standard methods
for input and output. These pipes are used in all types of POSIX
systems and in different versions of Windows operating systems as
well.
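A minimal parent-to-child pipe in C might look like the following
sketch (the message text is an assumption):

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    pipe(fd);                        /* fd[0]: read end, fd[1]: write end */
    if (fork() == 0) {               /* child reads */
        char buf[64];
        close(fd[1]);
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        buf[n] = '\0';
        printf("child got: %s\n", buf);
    } else {                         /* parent writes */
        close(fd[0]);
        write(fd[1], "hello", 5);
        close(fd[1]);
    }
    return 0;
}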
Shared Memory:-
It can be referred to as a type of memory that can be used or
accessed by multiple processes simultaneously. It is primarily used
so that the processes can communicate with each other. Therefore
the shared memory is used by almost all POSIX and Windows
operating systems as well.
Message Queue:-
In general, several different processes are allowed to read from
and write to the message queue. The messages are stored in the
queue until their recipients retrieve them. In short, we can also
say that the message queue is very helpful in inter-process
communication and is used by all operating systems.
To understand the concept of Message queue and Shared memory
in more detail, let's take a look at its diagram given below:

Message Passing:-
It is a type of mechanism that allows processes to synchronize
and communicate with each other. By using message passing, the
processes can communicate with each other without resorting to
shared variables.
Usually, the inter-process communication mechanism provides two
operations, which are as follows:
o send (message)
o receive (message)
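As one concrete form of send/receive, a POSIX message-queue sender
could be sketched as follows; the queue name /demo_q and the
message text are assumptions, and on Linux this is typically
linked with -lrt:

#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* create (or open) the queue with default attributes */
    mqd_t mq = mq_open("/demo_q", O_CREAT | O_WRONLY, 0644, NULL);
    if (mq == (mqd_t)-1) { perror("mq_open"); return 1; }
    const char *msg = "event occurred";
    mq_send(mq, msg, strlen(msg) + 1, 0);   /* priority 0 */
    mq_close(mq);
    return 0;
}

A receiver would open the same queue with O_RDONLY and call
mq_receive to retrieve messages in priority order.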
Direct Communication:-
In this type of communication, a link is created or established
between two communicating processes. However, in every pair of
communicating processes, only one link can exist.
Indirect Communication:-
Indirect communication can only exist or be established when
processes share a common mailbox, and each pair of these
processes shares multiple communication links. These shared links
can be unidirectional or bi-directional.
FIFO:-
It is a type of general communication between two unrelated
processes. It can also be considered full-duplex, which means
that one process can communicate with another process and vice
versa.
Some other different approaches
o Socket:-
It acts as a type of endpoint for receiving or sending data in a
network. It works both for data sent between processes on the
same computer and for data sent between different computers on
the same network. Hence, it is used by several types of operating
systems.
o File:-
A file is a type of data record or a document stored on the disk
that can be acquired on demand by the file server. Importantly,
several processes can access that file as required or needed.
o Signal:-
As the name implies, signals are a type of mechanism used in
inter-process communication in a minimal way. Typically, they are
system messages sent by one process to another. Therefore, they
are not used for sending data but for remote commands between
multiple processes.
2. Discuss Semaphores and mutex in detail. (Apr/May 2015)
(Nov/Dec 2021)
A semaphore is simply a variable that is non-negative and shared
between threads. A semaphore is a signaling mechanism, and a
thread waiting on a semaphore can be signaled by another thread.

A semaphore uses two atomic operations:
1. Wait: The wait operation decrements the value of its argument S
if it is positive. If S is zero, the caller waits until S becomes
positive before decrementing it.
wait(S)
{
    while (S <= 0)
        ;          /* busy wait */
    S--;
}
2. Signal for process synchronization: The signal operation
increments the value of its argument S.
signal(S)
{
    S++;
}
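In practice one would use an existing semaphore API rather than a
hand-written busy-wait loop. A minimal POSIX sketch is given below;
the initial count of 1 (binary use) is an assumption:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t sem;
int shared_counter = 0;

void *worker(void *arg)
{
    sem_wait(&sem);          /* wait (P): enter critical section */
    shared_counter++;
    sem_post(&sem);          /* signal (V): leave critical section */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    sem_init(&sem, 0, 1);    /* binary use: initial value 1 */
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", shared_counter);
    sem_destroy(&sem);
    return 0;
}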
A semaphore either allows or rejects access to the resource,
depending on how it is set up.
Use of Semaphore
In the case of a single buffer, we can separate a 4 KB buffer into
four buffers of 1 KB. A semaphore can be associated with these
four buffers, allowing consumers and producers to work on
different buffers simultaneously.
Types of Semaphore
The operating system distinguishes two categories of semaphores:
counting semaphores and binary semaphores.
1. Counting Semaphore: The semaphore value S is initialized to
the number of resources present in the system. Whenever a process
wants to access a resource, it performs the wait() operation on
the semaphore and decrements the semaphore value by one. When it
releases the resource, it performs the signal() operation on the
semaphore and increments the semaphore value by one.
When the semaphore count goes to 0, all resources are occupied by
processes. If a process then needs a resource while the count is
0, it executes the wait() operation and is blocked until the
semaphore value becomes greater than 0.
2. Binary semaphore: The value of a binary semaphore ranges
between 0 and 1. It is similar to a mutex lock, but a mutex is a
locking mechanism, whereas a semaphore is a signaling mechanism.
With a binary semaphore, if a process wants to access the
resource, it performs the wait() operation on the semaphore and
decrements its value from 1 to 0. When it releases the resource,
it performs the signal() operation on the semaphore and
increments its value to 1. If the value of the semaphore is 0 and
a process wants to access the resource, it performs the wait()
operation and blocks itself until the process currently using the
resource releases it.

Advantages of Semaphore
Here are the advantages of semaphores:
o A counting semaphore allows more than one thread to access the
critical region, up to the semaphore's count.
o Semaphores are machine-independent and are implemented in the
machine-independent code of the microkernel.
o They enforce mutual exclusion: they do not allow multiple
processes to enter the critical section at the same time.
o Because a process waiting on a semaphore can be blocked rather
than busy waiting, process time and resources are not wasted.
o They allow flexible management of resources.
Disadvantages of Semaphores
Semaphores also have some disadvantages:
o One of the biggest limitations of a semaphore is priority
inversion.
o The operating system has to keep track of all calls to wait
and signal on the semaphore.
o Their correct use is never enforced; it is by convention only.
o The wait and signal operations must be executed in the correct
order to avoid deadlocks.
o Semaphore programming is a complex method, so there is a risk
of not achieving mutual exclusion.
o It is also not a practical method for large-scale use, as its
use leads to loss of modularity.
o Semaphores are prone to programmer error, and such errors may
cause deadlock or violation of mutual exclusion.
MUTEX
A mutex is a mutual exclusion object that synchronizes access to
a resource. It is created with a unique name at the start of a
program. The mutex locking mechanism ensures that only one thread
can acquire the mutex and enter the critical section, and that
thread only releases the mutex when it exits the critical
section.
It is a special type of binary semaphore used for controlling
access to the shared resource. It includes a priority inheritance
mechanism to avoid extended priority inversion problems. It
allows currently higher-priority tasks to be kept in the blocked
state for the shortest time possible. However, priority
inheritance does not eliminate priority inversion; it only
minimizes its effect.
Example
This is shown with the help of the following example:
wait (mutex);
.....
Critical Section
.....
signal (mutex);
Use of Mutex
A mutex provides mutual exclusion: either the producer or the
consumer can hold the key (the mutex) and proceed with its work.
As long as the producer is filling the buffer, the consumer must
wait, and vice versa. With a mutex lock, only a single thread can
work with the entire buffer at any time.
When a program starts, it requests the system to create a mutex
object for a given resource. The system creates the mutex object
with a unique name or ID. Whenever a program thread wants to use
the resource, it acquires a lock on the mutex object, uses the
resource, and after use releases the lock on the mutex object.
Then the next process is allowed to acquire the lock on the mutex
object.
While a process holds the lock on the mutex object, no other
thread or process can access that resource. If the mutex object
is already locked, a process desiring to acquire the lock has to
wait and is queued up by the system until the mutex object is
unlocked.
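The wait/signal pattern above corresponds directly to lock/unlock
in the POSIX threads API; a minimal sketch (the deposit function
and amount are illustrative):

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
int balance = 0;

void *deposit(void *arg)
{
    pthread_mutex_lock(&m);      /* wait(mutex): enter critical section */
    balance += 100;              /* critical section */
    pthread_mutex_unlock(&m);    /* signal(mutex): leave critical section */
    return NULL;
}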
Advantages of Mutex
Here are the advantages of the mutex:
o A mutex is just a simple lock obtained before entering the
critical section and released afterwards.
o Since only one thread is in its critical section at any given
time, there are no race conditions, and data always remains
consistent.
Disadvantages of Mutex
A mutex also has some disadvantages:
o If a thread obtains a lock and goes to sleep or is preempted,
the other threads may not be able to move forward. This may
lead to starvation.
o It can't be locked or unlocked from a different context than
the one that acquired it.
o Only one thread may be in the critical section at a time.
o A naive implementation may lead to a busy-waiting state, which
wastes CPU time.

3. Discuss deadlock and the various methods of handling
deadlock in detail. (Nov/Dec 2011) (Nov/Dec 2021)
A deadlock happens in an operating system when two or more
processes need some resource that is held by another process in
order to complete their execution.
In the diagram, process 1 holds resource 1 and needs to acquire
resource 2. Similarly, process 2 holds resource 2 and needs to
acquire resource 1. Process 1 and process 2 are in deadlock, as
each of them needs the other's resource to complete its
execution, but neither of them is willing to relinquish its
resources.
Coffman Conditions
A deadlock occurs if the four Coffman conditions hold true. But
these conditions are not mutually exclusive.
The Coffman conditions are given as follows −
 Mutual Exclusion
There should be a resource that can only be held by one
process at a time. In the diagram below, there is a single
instance of Resource 1 and it is held by Process 1 only.

 Hold and Wait


A process can hold multiple resources and still request
more resources from other processes which are holding
them. In the diagram given below, Process 2 holds
Resource 2 and Resource 3 and is requesting the Resource 1
which is held by Process 1.

 No Preemption
A resource cannot be preempted from a process by force. A
process can only release a resource voluntarily. In the
diagram below, Process 2 cannot preempt Resource 1 from
Process 1. It will only be released when Process 1
relinquishes it voluntarily after its execution is complete.

 Circular Wait
A process is waiting for the resource held by the second
process, which is waiting for the resource held by the third
process and so on, till the last process is waiting for a
resource held by the first process. This forms a circular
chain. For example: Process 1 is allocated Resource2 and it
is requesting Resource 1. Similarly, Process 2 is allocated
Resource 1 and it is requesting Resource 2. This forms a
circular wait loop.
METHODS FOR HANDLING DEADLOCKS
There are four approaches to dealing with deadlocks.
1. Deadlock Prevention
2. Deadlock avoidance (Banker's Algorithm)
3. Deadlock detection & recovery
4. Deadlock Ignorance (Ostrich Method)
DEADLOCK PREVENTION
Deadlock has the following characteristics:

1. Mutual Exclusion
2. Hold and Wait
3. No preemption
4. Circular wait
We can prevent Deadlock by eliminating any of the above four
conditions.
1. Mutual Exclusion
Mutual exclusion, from the resource point of view, means that a
resource can never be used by more than one process
simultaneously. This is fair enough, but it is also a root cause
of deadlock: if a resource could be used by more than one process
at the same time, no process would ever have to wait for it.
However, if we can make resources behave in a non-mutually-
exclusive manner, then deadlock can be prevented.
Spooling
Spooling
For a device like a printer, spooling can work. There is memory
associated with the printer that stores jobs from each of the
processes. Later, the printer collects all the jobs and prints
each one of them according to FCFS. By using this mechanism, a
process does not have to wait for the printer and can continue
with whatever it was doing; later, it collects the output when it
is produced.
Although spooling can be an effective approach to bypassing
mutual exclusion, it suffers from two kinds of problems:
1. It cannot be applied to every resource.
2. After some time, a race condition may arise between the
processes competing for space in the spool.
We cannot force a resource to be used by more than one process at
the same time, since it would not be fair and serious performance
problems could arise. Therefore, we cannot practically violate
mutual exclusion for a process.
2. Hold and Wait
The hold-and-wait condition arises when a process holds one
resource while waiting for some other resource to complete its
task. Deadlock occurs because more than one process may be
holding one resource while waiting for another in cyclic order.
However, we have to find some mechanism by which a process either
does not hold any resource or does not wait. That means a process
must be assigned all the necessary resources before its execution
starts, and it must not wait for any resource once execution has
begun.
!(Hold and wait) = !hold or !wait (the negation of hold-and-wait
is: either you don't hold, or you don't wait).
This can be implemented if a process declares all its resources
initially. However, while this sounds practical, it cannot be
done in a real computer system, because a process cannot
determine its necessary resources in advance.
A process is a set of instructions executed by the CPU, and each
instruction may demand multiple resources at multiple times; the
need cannot be fixed by the OS.
The problems with this approach are:
1. It is practically not possible.
2. The possibility of starvation increases, because some process
may hold a resource for a very long time.
3. No Preemption
Deadlock arises partly because a resource cannot be taken away
from a process once allocated. However, if we take the resource
away from the process that is causing deadlock, then we can
prevent the deadlock.
This is not always a good approach, since if we take away a
resource that is being used by a process, all the work it has
done so far can become inconsistent. Consider a printer being
used by a process: if we take the printer away from that process
and assign it to some other process, then all the data that has
been printed can become inconsistent and ineffective; moreover,
the process cannot resume printing from where it left off, which
causes performance inefficiency.
4. Circular Wait
To violate circular wait, we can assign a priority number to each
resource. A process cannot then request a resource with a lower
priority number than one it already holds. This ensures that
resource requests always proceed in increasing priority order, so
no cycle of waiting processes can be formed.

4. Discuss in detail about deadlock avoidance.
DEADLOCK AVOIDANCE
The banker's algorithm is used to avoid deadlock and to allocate
resources safely to each process in the computer system. Before
permitting an allocation, it examines all possible outcomes to
decide whether the allocation can be allowed for each process. It
also helps the operating system to safely share the resources
between all the processes. The algorithm is named the banker's
algorithm because it resembles the way a bank checks whether a
loan amount can be safely sanctioned. In this section, we will
learn the banker's algorithm in detail and solve problems based
on it. To understand the banker's algorithm, first consider a
real-world example of it.
Suppose the number of account holders in a particular bank is
'n', and the total money in the bank is 'T'. If an account holder
applies for a loan, the bank first subtracts the loan amount from
the total cash and approves the loan only if the remaining cash
is sufficient. This check is made so that if another person
applies for a loan or withdraws some amount from the bank, the
bank can still manage and operate everything without any
restriction in the functionality of the banking system.
Similarly, it works in an operating system. When a new process is
created in a computer system, the process must provide all
relevant information to the operating system, such as upcoming
resource requests, their counts, and how long resources may be
held. Based on these criteria, the operating system decides in
which sequence processes should be executed or made to wait so
that no deadlock occurs in the system. Therefore, it is also
known as the deadlock avoidance algorithm or deadlock detection
in the operating system.
Advantages
Following are the essential characteristics of the banker's
algorithm:
1. It contains various resources that meet the requirements of
each process.
2. Each process should provide information to the operating
system about upcoming resource requests, the number of
resources, and how long the resources will be held.
3. It helps the operating system manage and control process
requests for each type of resource in the computer system.
4. The algorithm has a Max resource attribute that indicates the
maximum number of resources each process can hold in the
system.
Disadvantages
1. It requires a fixed number of processes, and no additional
processes can be started in the system while it is executing.
2. The algorithm does not allow processes to change their
maximum needs while processing their tasks.
3. Each process has to know and state its maximum resource
requirement in advance for the system.
4. Resource requests must be granted within a finite time, which
for practical purposes may be as long as one year.
When working with the banker's algorithm, it needs to know three
things:
1. How much of each resource each process could possibly request.
This is denoted by the [MAX] request.
2. How much of each resource each process is currently holding.
This is denoted by the [ALLOCATED] resource.
3. How much of each resource the system currently has available.
This is denoted by the [AVAILABLE] resource.
Following are the important data structures used in the banker's
algorithm.
Suppose n is the number of processes, and m is the number of
resource types used in the computer system.
1. Available: It is an array of length 'm' that defines the
number of available resources of each type in the system. When
Available[j] = K, 'K' instances of resource type R[j] are
available in the system.
2. Max: It is an [n x m] matrix that indicates the maximum
number of resources of each type R[j] that each process P[i]
may request in the system.
3. Allocation: It is an [n x m] matrix that indicates the
resources currently allocated to each process in the system.
When Allocation[i][j] = K, process P[i] is currently allocated
K instances of resource type R[j].
4. Need: It is an [n x m] matrix representing the number of
remaining resources for each process. When Need[i][j] = K,
process P[i] may require K more instances of resource type
R[j] to complete its assigned work.
Need[i][j] = Max[i][j] - Allocation[i][j].
5. Finish: It is a vector of length n. It includes a Boolean
value (true/false) indicating whether each process has been
allocated its requested resources and has released all of them
after finishing its task.
The banker's algorithm is the combination of the safety algorithm
and the resource-request algorithm, which together control the
processes and avoid deadlock in the system:
Safety Algorithm
The safety algorithm checks whether or not the system is in a
safe state, i.e. whether a safe sequence exists:
1. There are two vectors, Work and Finish, of length m and n.
Initialize: Work = Available
Finish[i] = false for i = 0, 1, 2, ..., n - 1.
2. Find an index i such that both:
Need[i] <= Work
Finish[i] == false
If no such i exists, go to step 4.
3. Work = Work + Allocation[i] // resources released by P[i]
Finish[i] = true
Go to step 2 to check the resource availability status for the
next process.
4. If Finish[i] == true for all i, the system is in a safe state.
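A compact C sketch of this safety check is given below. The array
sizes are fixed at the 5 processes and 3 resource types of the
example that follows; the data would come from those tables:

#include <stdbool.h>

#define N 5   /* processes */
#define M 3   /* resource types */

/* Returns true if the state is safe; 'order' receives a safe sequence. */
bool is_safe(int available[M], int max[N][M], int alloc[N][M], int order[N])
{
    int need[N][M], work[M];
    bool finish[N] = { false };

    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            need[i][j] = max[i][j] - alloc[i][j];
    for (int j = 0; j < M; j++)
        work[j] = available[j];

    int count = 0;
    while (count < N) {
        bool found = false;
        for (int i = 0; i < N; i++) {
            if (finish[i]) continue;
            bool ok = true;
            for (int j = 0; j < M; j++)
                if (need[i][j] > work[j]) { ok = false; break; }
            if (ok) {                      /* P[i] can finish: release it */
                for (int j = 0; j < M; j++)
                    work[j] += alloc[i][j];
                finish[i] = true;
                order[count++] = i;
                found = true;
            }
        }
        if (!found) return false;          /* no runnable process: unsafe */
    }
    return true;
}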
Resource-Request Algorithm
The resource-request algorithm checks how the system will behave
when a process makes a resource request, given as a request
vector.
Let Request[i] be the resource-request array for process P[i]. If
Request[i][j] = K, process P[i] requires K instances of resource
type R[j] in the system.
1. If the number of requested resources of each type is no more
than the Need of the process, go to step 2; if the condition
fails, process P[i] has exceeded its maximum claim for the
resource. As the expression suggests:
If Request(i) <= Need(i)
Go to step 2;
2. If the number of requested resources of each type is no more
than the Available resources, go to step 3. As the expression
suggests:
If Request(i) <= Available
Else process P[i] must wait, since the resources are not
available for use.
3. Pretend the requested resources are allocated to the process
by changing the state:
Available = Available - Request(i)
Allocation(i) = Allocation(i) + Request(i)
Need(i) = Need(i) - Request(i)
If the resulting resource-allocation state is safe, the resources
are actually allocated to process P(i). If the new state is
unsafe, process P(i) must wait for Request(i), and the old
resource-allocation state is restored.
Example: Consider a system that contains five processes P1, P2,
P3, P4, P5 and three resource types A, B and C. Resource type A
has 10 instances, B has 5 instances and C has 7 instances.

Process   Allocation   Max        Available
          A  B  C      A  B  C    A  B  C
P1        0  1  0      7  5  3    3  3  2
P2        2  0  0      3  2  2
P3        3  0  2      9  0  2
P4        2  1  1      2  2  2
P5        0  0  2      4  3  3
Answer the following questions using the banker's algorithm:
1. What is the content of the need matrix?
2. Determine whether the system is safe or not.
3. What will happen if process P1 makes the resource request
(1, 0, 2)? Can the system accept this request immediately?
Ans. 1: The content of the need matrix is as follows:
Need[i] = Max[i] - Allocation[i]
Need for P1: (7, 5, 3) - (0, 1, 0) = (7, 4, 3)
Need for P2: (3, 2, 2) - (2, 0, 0) = (1, 2, 2)
Need for P3: (9, 0, 2) - (3, 0, 2) = (6, 0, 0)
Need for P4: (2, 2, 2) - (2, 1, 1) = (0, 1, 1)
Need for P5: (4, 3, 3) - (0, 0, 2) = (4, 3, 1)

Process   Need
          A  B  C
P1        7  4  3
P2        1  2  2
P3        6  0  0
P4        0  1  1
P5        4  3  1

Hence, we have obtained the content of the need matrix.
Ans. 2: Apply the banker's (safety) algorithm:
Available resources of A, B and C are 3, 3 and 2.
Now we check whether each process's need can be met with the
available resources.
Step 1: For process P1:
Need <= Available
(7, 4, 3) <= (3, 3, 2): condition is false.
So we examine the next process, P2.
Step 2: For process P2:
Need <= Available
(1, 2, 2) <= (3, 3, 2): condition is true.
New Available = Available + Allocation
(3, 3, 2) + (2, 0, 0) => (5, 3, 2)
Similarly, we examine the next process, P3.
Step 3: For process P3:
Need <= Available
(6, 0, 0) <= (5, 3, 2): condition is false.
Similarly, we examine the next process, P4.
Step 4: For process P4:
Need <= Available
(0, 1, 1) <= (5, 3, 2): condition is true.
New Available = Available + Allocation
(5, 3, 2) + (2, 1, 1) => (7, 4, 3)
Similarly, we examine the next process, P5.
Step 5: For process P5:
Need <= Available
(4, 3, 1) <= (7, 4, 3): condition is true.
New Available = Available + Allocation
(7, 4, 3) + (0, 0, 2) => (7, 4, 5)
Now we again examine the needs of processes P1 and P3.
Step 6: For process P1:
Need <= Available
(7, 4, 3) <= (7, 4, 5): condition is true.
New Available = Available + Allocation
(7, 4, 5) + (0, 1, 0) => (7, 5, 5)
So we examine the remaining process, P3.
Step 7: For process P3:
Need <= Available
(6, 0, 0) <= (7, 5, 5): condition is true.
New Available = Available + Allocation
(7, 5, 5) + (3, 0, 2) => (10, 5, 7)
Hence, the banker's algorithm finds that the system is in a safe
state, with the safe sequence P2, P4, P5, P1, P3.
Ans. 3: For granting the request (1, 0, 2), we first check that
Request <= Available, that is, (1, 0, 2) <= (3, 3, 2). Since the
condition is true, process P1 can be granted the request
immediately (after which the new state must again be checked for
safety).
UNIT III STORAGE MANAGEMENT
Main Memory - Swapping - Contiguous Memory Allocation – Paging - Structure of the Page Table -
Segmentation, Segmentation with paging; Virtual Memory - Demand Paging – Copy on Write – Page
Replacement - Allocation of Frames –Thrashing
PART - A
S.No.  Question and Answer  CO  Univ QP (Month/Year)
1. What is the difference between user-level instructions and
privileged instructions? CO3 (Apr/May 2017)
A non-privileged (i.e. user-level) instruction is an instruction
that any application or user can execute. A privileged
instruction, on the other hand, is an instruction that can only
be executed in kernel mode. Instructions are divided in this
manner because privileged instructions could harm the kernel.
2. Define: Belady's anomaly. CO3 (Apr/May 2017)
In computer storage, Belady's anomaly is the phenomenon in which
increasing the number of page frames results in an increase in
the number of page faults for certain memory access patterns.
This phenomenon is commonly experienced when using the first-in
first-out (FIFO) page replacement algorithm.
3. What is the purpose of paging the page table? CO3 (Nov/Dec 2016)
In certain situations the page tables could become large enough
that by paging the page tables, one could simplify the memory
allocation problem (by ensuring that everything is allocated as
fixed-size pages as opposed to variable-sized chunks) and also
enable the swapping of portions of the page table that are not
currently used.
4. Why are page sizes always powers of 2? CO3 (Nov/Dec 2016)
Recall that paging is implemented by breaking up an address into
a page number and an offset. It is most efficient to break the
address into X page bits and Y offset bits, rather than perform
arithmetic on the address to calculate the page number and
offset. Because each bit position represents a power of 2,
splitting an address between bits results in a page size that is
a power of 2.
5. Define demand paging in memory management. CO3 (Nov/Dec 2015)
In virtual memory systems, demand paging is a type of swapping in
which pages of data are not copied from disk to RAM until they
are needed.
6. What are the steps required to handle a page fault in demand
paging? CO3 (Nov/Dec 2015)
Steps in handling a page fault:
1. The operating system looks at another table to decide:
• Invalid reference - abort
• Just not in memory
2. Find a free frame.
3. Swap the page into the frame via a scheduled disk operation.
4. Reset the tables to indicate the page is now in memory; set
the validation bit = v.
5. Restart the instruction that caused the page fault.
7. Define demand paging and write its advantages. CO3 (Nov/Dec 2021)
Virtual memory is commonly implemented by demand paging. In
demand paging, the pager brings only the necessary pages into
memory instead of swapping in a whole process. Thus it avoids
reading into memory pages that will not be used anyway,
decreasing the swap time and the amount of physical memory
needed.
8. Difference between internal and external fragmentation.
CO3 (Nov/Dec 2011) (Apr/May 2018)
Internal fragmentation is the area occupied by a process but not
usable by the process. This space is unusable by the system until
the process releases it.
External fragmentation exists when total free memory is enough
for a new process but it is not contiguous and cannot satisfy the
request; storage is fragmented into small holes.
9. What do you mean by thrashing? CO3 (Apr/May 2019)
(Apr/May 2015) (Nov/Dec 2021)
Thrashing occurs when page faults and swapping happen very
frequently, at a high rate, so the operating system has to spend
more time swapping pages than doing useful work. Because of
thrashing, CPU utilization is reduced or becomes negligible.
Thrashing is the coincidence of high page traffic and low CPU
utilization.
10. What do you mean by a page fault? Under what circumstances do
page faults occur? State the actions taken by the operating
system when a page fault occurs. CO3 (Apr/May 2019)
A page fault is the situation in which a page is not available in
memory when the processor needs to execute it.
A page fault occurs when an access is made to a page that has not
been brought into main memory. The operating system verifies the
memory access, aborting the program if it is invalid. If it is
valid, a free frame is located and I/O is requested to read the
needed page into the free frame.
11. Define demand paging in memory management. What are the steps required to handle a page fault in demand paging? CO3 (Nov/Dec 2015)
 A demand paging system is quite similar to a paging system with
swapping where processes reside in
secondary memory and pages are loaded only on demand, not in
advance.
 When a context switch occurs, the operating system does not copy any
of the old program’s pages out to the disk or any of the new program’s
pages into the main memory Instead, it just begins executing the new
program after loading the first page and fetches that program’s pages as
they are referenced.
 While executing a program, if the program references a page which is not available in the main memory because it was swapped out a little while ago, the processor treats this invalid memory reference as a page fault and transfers control from the program to the operating system to demand the page back into the memory.
12. Consider a logical address space of eight pages of 1024 words each, mapped onto a physical memory of 32 frames. CO3
a. How many bits are there in the logical address?
b. How many bits are there in the physical address?
Each page/frame holds 1K words, so we need 10 bits to uniquely address each of those 1024 word positions. The logical address space has 8 = 2^3 pages, requiring 3 bits for the page number, giving 3 + 10 = 13 bits in total. Physical memory has 32 = 2^5 frames, requiring 5 bits for the frame number, giving 5 + 10 = 15 bits in total.
a. Logical address: 13 bits
b. Physical address: 15 bits
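The same bit counts can be verified mechanically; a small Python sketch using the values from the question:

import math

pages, words_per_page, frames = 8, 1024, 32
offset_bits = int(math.log2(words_per_page))    # 10 bits for the offset
print(int(math.log2(pages)) + offset_bits)      # logical address: 13 bits
print(int(math.log2(frames)) + offset_bits)     # physical address: 15 bits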
13. In the IBM/370, memory protection is provided through the use CO3
of keys. A key is a 4-bit quantity. Each 2K block of memory has
a key (the storage key) associated with it. The CPU also has a
key (the protection key) associated with it. A store operation is
allowed only if both keys are equal, or if either is zero. Which of
the following memory-management schemes could be used
successfully with this hardware?
 Bare machine
 Single-user system
 Multiprogramming with a fixed number of processes
 Multiprogramming with a variable number of processes
 Paging
 Segmentation
Answer:
a. Protection not necessary set system key to 0.
b. Set system key to 0 when in supervisor mode.
c. Region sizes must be fixed in increments of 2k bytes, allocate key
with memory blocks.
d. Same as above.
e. Frame sizes must be in increments of 2k bytes, allocate key with
pages.
f. Segment sizes must be in increments of 2k bytes, allocate key
with segments
14. What is address binding? CO3
The process of associating program instructions and data to physical
memory addresses is called address binding, or relocation.
15. Why are page sizes always powers of 2? CO3
 Recall that paging is implemented by breaking up an
address into a page and offset number.
 It is most efficient to break the address into X page bits and
Y offset bits, rather than perform arithmetic on the address
to calculate the page number and offset.
 Because each bit position represents a power of 2, splitting an address between bits results in a page size that is a power of 2.
16. Consider the following page reference string: 1, 2, 3, 4, 2, 1, 5, 6, 2, 1, CO3
2, 3, 7, 6, 3, 2, 1, 2, 3, 6. How many page faults would occur for the
following replacement algorithms, assuming one, two, three, four,
five, six, or seven frames? Remember all frames are initially empty,
so your first unique pages will all cost one fault each. • LRU
replacement • FIFO replacement • Optimal replacement
Number of frames LRU FIFO Optimal
1 20 20 20
2 18 18 15
3 15 16 11
4 10 14 8
5 8 10 7
6 7 10 7
7 7 7 7
17. What is the purpose of paging the page tables? CO3
In certain situations the page tables could become large enough that by
paging the page tables, one could simplify the memory allocation problem
(by ensuring that everything is allocated as fixed-size pages as opposed to
variable-sized chunks) and also enable the swapping of portions of page
table that are not currently used.
18. Define TLB. CO3
 Translation Look-Aside Buffer, a table in the processor's memory that contains information about the pages in memory the processor has accessed recently.
 The TLB enables faster computing because it allows the address processing to take place independent of the normal address-translation pipeline.
19. Define logical address and physical address. CO3
An address generated by the CPU is referred to as a logical address. An address seen by the memory unit, that is, the one loaded into the memory address register of the memory, is commonly referred to as a physical address.
20. Define Copy-on-write. CO3
Copy-on-write finds its main use in virtual memory operating systems; when a process creates a copy of itself, the pages in memory that might be modified by either the process or its copy are marked copy-on-write.
21. What is the main function of the memory-management unit? CO3
The runtime mapping from virtual to physical addresses is done by a hardware device called a memory management unit (MMU).
22. What are the common strategies to select a free hole from a set of CO3
available holes?
The most common strategies are
A. First fit B. Best fit C. Worst fit
23. What are the various page replacement algorithms? CO3
• FIFO page replacement
• Optimal page replacement
• LRU page replacement
• LRU approximation page replacement
• Counting based page replacement
• Page buffering algorithm.
24. Differentiate a page from a segment. CO3
In segmentation, the address space is typically divided into a preset number
of segments like data segment (read/write), code segment (read-only), stack
(read/write) etc. And the programs are divided into these segments
accordingly. Logical addresses are represented as tuple <segment, offset>.
While with paging, the address space is divided into a sequence of fixed size
units called "pages". And logical addresses take the form of a tuple <page,
offset>.
25. What is virtual memory? Mention its advantages. CO3
Virtual memory is a technique that allows the execution of processes that
may not be completely in memory. It is the separation of user logical
memory from physical memory. This separation provides an extremely
large virtual memory, when only a smaller physical memory is available.
The main visible advantage of this scheme is that programs can be larger
than physical memory.
PART - B
S.No. Question and Answer CO Univ QP (Month/Year)
1. Explain Main Memory in detail. CO3 (Apr/May 2019) (Nov/Dec 2021)
Computer memory is just like the human brain. It is used to store data/information and instructions. It is a data storage unit or device where the data to be processed and the instructions required for processing are stored. Both input and output data can be stored here.
Characteristics of Main Memory:
 It is faster computer memory as compared to secondary memory.
 It is a semiconductor memory.
 It is usually a volatile memory.
 It is the main memory of the computer.
 A computer system cannot run without primary memory.
In general, memory is of three types:
 Primary memory
 Secondary memory
 Cache memory
Now we discuss each type of memory one by one in detail:
1. Primary Memory: It is also known as the main memory of the
computer system. It is used to store data and programs or instructions
during computer operations. It uses semiconductor technology and
hence is commonly called semiconductor memory. Primary memory
is of two types:
(i) RAM (Random Access Memory): It is a volatile memory. Volatile memory retains information only while power is supplied. If the power supply fails or is interrupted, all the data and information in this memory is lost. RAM is used for booting up or starting the computer. It temporarily stores the programs/data which have to be executed by the processor. RAM is of two types:
 S RAM (Static RAM): It uses transistors and the circuits of this
memory are capable of retaining their state as long as the power is
applied. This memory consists of the number of flip flops with
each flip flop storing 1 bit. It has less access time and hence, it is
faster.
 D RAM (Dynamic RAM): It uses capacitors and transistors and
stores the data as a charge on the capacitors. They contain
thousands of memory cells. It needs refreshing of charge on
capacitor after a few milliseconds. This memory is slower than S
RAM.
(ii) ROM (Read Only Memory): It is a non-volatile memory. Non-
volatile memory stores information even when there is a power
supply failed/ interrupted/stopped. ROM is used to store information
that is used to operate the system. As its name refers to read-only
memory, we can only read the programs and data that is stored on it.
It contains some electronic fuses that can be programmed for a piece
of specific information. The information stored in the ROM in binary
format. It is also known as permanent memory. ROM is of four types:
 MROM(Masked ROM): Hard-wired devices with a pre-
programmed collection of data or instructions were the first
ROMs. Masked ROMs are a type of low-cost ROM that works in
this way.
 PROM (Programmable Read Only Memory): This read-only
memory is modifiable once by the user. The user purchases a
blank PROM and uses a PROM program to put the required
contents into the PROM. Its content can’t be erased once written.
 EPROM (Erasable Programmable Read Only Memory): It is
an extension to PROM where you can erase the content of ROM
by exposing it to Ultraviolet rays for nearly 40 minutes.
 EEPROM (Electrically Erasable Programmable Read Only
Memory): Here the written contents can be erased electrically.
You can delete and reprogramme EEPROM up to 10,000 times.
Erasing and programming take very little time, i.e., nearly 4 -10
ms(milliseconds). Any area in an EEPROM can be wiped and
programmed selectively.
2. Secondary Memory: It is also known as auxiliary memory and
backup memory. It is a non-volatile memory and used to store a large
amount of data or information. The data or information stored in
secondary memory is permanent, and it is slower than primary
memory. A CPU cannot access secondary memory directly. The
data/information from the auxiliary memory is first transferred to the
main memory, and then the CPU can access it.
Characteristics of Secondary Memory:
 It is a slow memory but reusable.
 It is a reliable and non-volatile memory.
 It is cheaper than primary memory.
 The storage capacity of secondary memory is large.
 A computer system can run without secondary memory.
 In secondary memory, data is stored permanently even when the
power is off.
Types of secondary memory:
(i) Magnetic Tapes: Magnetic tape is a long, narrow strip of plastic
film with a thin, magnetic coating on it that is used for magnetic
recording. Bits are recorded on tape as magnetic patches called
RECORDS that run along many tracks. Typically, 7 or 9 bits are
recorded concurrently. Each track has one read/write head, which
allows data to be recorded and read as a sequence of characters. It can
be stopped, started moving forward or backward, or rewound.
(ii) Magnetic Disks: A magnetic disc is a circular metal or a plastic
plate and these plates are coated with magnetic material. The disc is
used on both sides. Bits are stored in magnetized surfaces in locations
called tracks that run in concentric rings. Sectors are typically used to
break tracks into pieces.
Hard discs are discs that are permanently attached and cannot be
removed by a single user.
(iii) Optical Disks: It’s a laser-based storage medium that can be
written to and read. It is reasonably priced and has a long lifespan.
The optical disc can be taken out of the computer by occasional
users. Types of Optical Disks :
(a) CD-ROM:
 It stands for Compact Disc Read-Only Memory.
 Information is written to the disc by using a controlled laser beam to burn pits on the disc surface.
 It has a highly reflecting surface, which is usually aluminium.
 The diameter of the disc is 5.25 inches.
 The track density is 16000 tracks per inch.
 The capacity of a CD-ROM is 600 MB, with each sector storing 2048 bytes of data.
 The data transfer rate is about 4800 KB/sec and the average access time is around 80 milliseconds.
(b) WORM-(WRITE ONCE READ MANY):
 A user can only write data once.
 The information is written on the disc using a laser beam.
 It is possible to read the written data as many times as desired.
 They keep lasting records of information but access time is high.
 It is possible to rewrite updated or new data to another part of the
disc.
 Data that has already been written cannot be changed.
 Usual size – 5.25 inch or 3.5 inch diameter.
 The usual capacity of 5.25 inch disk is 650 MB,5.2GB etc.
(c) DVDs:
 The term “DVD” stands for “Digital Versatile/Video Disc,” and
there are two sorts of DVDs: (i)DVDR (writable) and (ii)
DVDRW (Re-Writable)
 DVD-ROMS (Digital Versatile Discs): These are read-only
memory (ROM) discs that can be used in a variety of ways. When
compared to CD-ROMs, they can store a lot more data. It has a
thick polycarbonate plastic layer that serves as a foundation for
the other layers. It’s an optical memory that can read and write
data.
 DVD-R: It is a writable optical disc that can be used just once.
It’s a DVD that can be recorded. It’s a lot like WORM. DVD-
ROMs have capacities ranging from 4.7 to 17 GB. The capacity
of 3.5 inch disk is 1.3 GB.
3. Cache Memory: It is a type of high-speed semiconductor memory
that can help the CPU run faster. Between the CPU and the main
memory, it serves as a buffer. It is used to store the data and
programs that the CPU uses the most frequently.
Advantages of cache memory:
 It is faster than the main memory.
 When compared to the main memory, it takes less time to access
it.
 It keeps the programs that can be run in a short amount of time.
 It stores data in temporary use.
Disadvantages of cache memory:
 Because of the semiconductors used, it is very expensive.
 The size of the cache (amount of data it can store) is usually
small.
Sample Problem
Question 1. What are the types of memories?
Solution:
There are three types of memory:
 Primary memory
 Secondary memory
 Cache memory
Question 2. What is Volatile and Non Volatile memory?
Solution:
Volatile memory is used to store information based on power supply.
If the power supply is off, all the data & information on this memory
will be lost. For example, RAM (Random Access Memory). Whereas
non-volatile memory is used to store information even when the
power supply is off. For example, ROM (Read Only Memory).
Question 3. What is the full form of CD-ROM?
Solution:
CD-ROM stands for Compact Disc Read-Only Memory.
Question 4. How many 128 * 8 memory chips are required for
a memory capacity of 4096*16?
Solution:
Number of chips required = Required RAM size/ Available chip
capacity
= (4096 * 16)/(128 * 8) = 64
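The same ratio can be checked directly; a trivial Python sketch:

required_bits = 4096 * 16           # required memory capacity in bits
chip_bits = 128 * 8                 # capacity of one chip in bits
print(required_bits // chip_bits)   # 64 chips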
Question 5. Explain any four differences between RAM and
ROM?
Solution:
RAM                                          ROM
It stands for Random Access Memory.          It stands for Read Only Memory.
It is the fastest memory.                    It is slower memory as compared to RAM.
It is volatile memory.                       It is non-volatile memory.
In this memory, data will be erased          In this memory, data will not be erased
when the power is off.                       even if the power is off.
2. Describe swapping and Contiguous Memory Allocation elaborately. CO3 (Nov/Dec 2011)
Swapping is a memory management scheme in which any process can be temporarily swapped from main memory to secondary memory so
that the main memory can be made available for other processes. It is
used to improve main memory utilization. In secondary memory, the
place where the swapped-out process is stored is called swap space.
The purpose of the swapping in operating system is to access the data
present in the hard disk and bring it to RAM so that the application
programs can use it. The thing to remember is that swapping is used
only when data is not present in RAM.
Although the process of swapping affects the performance of the
system, it helps to run larger and more than one process. This is the
reason why swapping is also referred to as memory compaction.
The concept of swapping has divided into two more concepts: Swap-in
and Swap-out.
o Swap-out is a method of removing a process from RAM and
adding it to the hard disk.
o Swap-in is a method of removing a program from a hard disk
and putting it back into the main memory or RAM.
Example: Suppose the user process's size is 2048 KB and the standard hard disk where swapping takes place has a data transfer rate of 1 MBps. Now we will calculate how long it will take to transfer the process from main memory to secondary memory.
User process size is 2048 KB
Data transfer rate is 1 MBps = 1024 KBps
Time = process size / transfer rate
= 2048 / 1024
= 2 seconds
= 2000 milliseconds
Counting both swap-in and swap-out, the process will take 4000 milliseconds.
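A quick Python sketch of the same calculation, using the values from the example above:

def swap_time_ms(process_kb, rate_kbps):
    # one-way transfer time in milliseconds
    return process_kb / rate_kbps * 1000

one_way = swap_time_ms(2048, 1024)   # 2000.0 ms
print(one_way, 2 * one_way)          # swap-out + swap-in = 4000.0 ms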
Advantages of Swapping
1. It helps the CPU to manage multiple processes within a single
main memory.
2. It helps to create and use virtual memory.
3. Swapping allows the CPU to perform multiple tasks
simultaneously. Therefore, processes do not have to wait very
long before they are executed.
4. It improves the main memory utilization.
Disadvantages of Swapping
1. If the computer system loses power, the user may lose all
information related to the program in case of substantial
swapping activity.
2. If the swapping algorithm is not good, the composite method
can increase the number of Page Fault and decrease the overall
processing performance.
CONTIGUOUS MEMORY ALLOCATION
Memory management techniques are the basic techniques used to manage memory in an operating system. They are broadly classified into two categories:
(i) Contiguous
(ii) Non-contiguous
Contiguous Memory Management Techniques:
In this technique, memory is allotted in a continuous way to the
processes. It has two types:
Fixed Partition Scheme:
In the fixed partition scheme, memory is divided into fixed number of
partitions. Fixed means number of partitions are fixed in the memory.
In the fixed partition, in every partition only one process will be
accommodated. Degree of multi-programming is restricted by
number of partitions in the memory. Maximum size of the process is
restricted by maximum size of the partition. Every partition is
associated with the limit registers.
 Limit Registers: It has two limit:
 Lower Limit: Starting address of the partition.
 Upper Limit: Ending address of the partition.
Internal fragmentation is found in the fixed partition scheme.
To overcome the problem of internal fragmentation, instead of fixed
partition scheme, variable partition scheme is used.
Variable Partition Scheme :
In the variable partition scheme, initially memory will be single
continuous free block. Whenever the request by the process arrives,
accordingly partition will be made in the memory. If the smaller
processes keep on coming then the larger partitions will be made into
smaller partitions.
External fragmentation is found in the variable partition scheme.
To overcome the problem of external fragmentation, compaction
technique is used or non-contiguous memory management techniques
are used.
Compaction:
Moving all the processes toward the top or towards the bottom to
make free available memory in a single continuous place is called as
compaction. Compaction is undesirable to implement because it
interrupts all the running processes in the memory.
3. Explain Paging concept in detail. CO3 (Apr/May 2019) (Nov/Dec 2011) (Apr/May 2015) (Nov/Dec 2021)
Paging is a memory management scheme that eliminates the need for contiguous allocation of physical memory. The process of retrieving processes in the form of pages from the secondary storage into the main memory is known as paging. The basic purpose of paging is to separate each procedure into pages. Additionally, frames will be used to split the main memory. This scheme permits the physical address space of a process to be non-contiguous.
Let us look some important terminologies:
 Logical Address or Virtual Address (represented in bits): An
address generated by the CPU
 Logical Address Space or Virtual Address Space( represented in
words or bytes): The set of all logical addresses generated by a
program
 Physical Address (represented in bits): An address actually
available on memory unit
 Physical Address Space (represented in words or bytes): The set
of all physical addresses corresponding to the logical addresses
Example:
 If Logical Address = 31 bits, then Logical Address Space = 2^31 words = 2 G words (1 G = 2^30)
 If Logical Address Space = 128 M words = 2^7 * 2^20 words, then Logical Address = log2(2^27) = 27 bits
 If Physical Address = 22 bits, then Physical Address Space = 2^22 words = 4 M words (1 M = 2^20)
 If Physical Address Space = 16 M words = 2^4 * 2^20 words, then Physical Address = log2(2^24) = 24 bits
The mapping from virtual to physical address is done by the memory
management unit (MMU) which is a hardware device and this
mapping is known as paging technique.
 The Physical Address Space is conceptually divided into a
number of fixed-size blocks, called frames.
 The Logical Address Space is also split into fixed-size blocks, called pages.
 Page Size = Frame Size
Let us consider an example:
 Physical Address = 12 bits, then Physical Address Space = 4 K
words
 Logical Address = 13 bits, then Logical Address Space = 8 K
words
 Page size = frame size = 1 K words (assumption)
Address generated by CPU is divided into
 Page number(p): Number of bits required to represent the pages
in Logical Address Space or Page number
 Page offset(d): Number of bits required to represent particular
word in a page or page size of Logical Address Space or word
number of a page or page offset.
Physical Address is divided into
 Frame number(f): Number of bits required to represent the
frame of Physical Address Space or Frame number.
 Frame offset(d): Number of bits required to represent particular
word in a frame or frame size of Physical Address Space or word
number of a frame or frame offset.
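To make the page-number/offset split concrete, here is a minimal Python sketch of the translation an MMU performs; the page-table contents are invented for illustration:

PAGE_SIZE = 1024                        # 1 K words, as in the example above
page_table = {0: 5, 1: 2, 2: 7, 3: 0}   # hypothetical page -> frame mapping

def translate(logical_address):
    p, d = divmod(logical_address, PAGE_SIZE)   # page number, page offset
    f = page_table[p]                           # frame number from the page table
    return f * PAGE_SIZE + d                    # physical address

print(translate(2100))   # page 2, offset 52 -> frame 7 -> 7*1024 + 52 = 7220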
The hardware implementation of the page table can be done by using
dedicated registers. But the usage of register for the page table is
satisfactory only if page table is small. If page table contain large
number of entries then we can use TLB(translation Look-aside
buffer), a special, small, fast look up hardware cache.
 The TLB is associative, high speed memory.
 Each entry in TLB consists of two parts: a tag and a value.
 When this memory is used, then an item is compared with all tags
simultaneously. If the item is found, then corresponding value is
returned.
4. Draw the Structure of the Page Table and explain. CO3
In Operating Systems, Paging is a storage mechanism used to retrieve
processes from the secondary storage into the main memory in the
form of pages.
The main idea behind the paging is to divide each process in the form
of pages. The main memory will also be divided in the form of frames.
One page of the process is to be stored in one of the frames of the
memory. The pages can be stored at the different locations of the
memory but the priority is always to find the contiguous frames or
holes.
Pages of the process are brought into the main memory only when they
are required otherwise they reside in the secondary storage.
Different operating systems define different frame sizes. The sizes of all frames must be equal. Considering the fact that the pages are mapped to the frames in paging, the page size needs to be the same as the frame size.
Example
Let us consider the main memory size 16 Kb and Frame size is 1 KB
therefore the main memory will be divided into the collection of 16
frames of 1 KB each.
There are 4 processes in the system that is P1, P2, P3 and P4 of 4 KB
each. Each process is divided into pages of 1 KB each so that one page
can be stored in one frame.
Initially, all the frames are empty therefore pages of the processes will
get stored in the contiguous way.
Frames, pages and the mapping between the two is shown in the image
below.
Let us consider that, P2 and P4 are moved to waiting state after some
time. Now, 8 frames become empty and therefore other pages can be
loaded in that empty place. The process P5 of size 8 KB (8 pages) is
waiting inside the ready queue.
Given the fact that, we have 8 non contiguous frames available in the
memory and paging provides the flexibility of storing the process at
the different places. Therefore, we can load the pages of process P5 in
the place of P2 and P4.
Memory Management Unit
The purpose of Memory Management Unit (MMU) is to convert the
logical address into the physical address. The logical address is the
address generated by the CPU for every page while the physical
address is the actual address of the frame where each page will be
stored.
When a page is to be accessed by the CPU by using the logical
address, the operating system needs to obtain the physical address to
access that page physically.
The logical address has two parts.
1. Page Number
2. Offset
Memory management unit of OS needs to convert the page number to
the frame number.
Page Table
Page Table is a data structure used by the virtual memory system to
store the mapping between logical addresses and physical addresses.
Logical addresses are generated by the CPU for the pages of the
processes therefore they are generally used by the processes.
Physical addresses are the actual frame address of the memory. They
are generally used by the hardware or more specifically by RAM
subsystems.
The image given below considers:
Physical Address Space = M words
Logical Address Space = L words
Page Size = P words
Physical Address = log2(M) = m bits
Logical Address = log2(L) = l bits
Page offset = log2(P) = p bits
The CPU always accesses the processes through their logical
addresses. However, the main memory recognizes physical address
only.
In this situation, a unit named as Memory Management Unit comes
into the picture. It converts the page number of the logical address to
the frame number of the physical address. The offset remains same in
both the addresses.
To perform this task, Memory Management unit needs a special kind
of mapping which is done by page table. The page table stores all the
Frame numbers corresponding to the page numbers of the page table.
In other words, the page table maps the page number to its actual
location (frame number) in the memory.
In the image given below shows, how the required word of the frame is
accessed with the help of offset.
5. Discuss Segmentation and Segmentation with paging with a neat sketch. CO3 (Nov/Dec 2021)
A process is divided into segments. The chunks that a program is divided into, which are not necessarily all of the same size, are called segments. Segmentation gives the user's view of the process, which paging does not give. Here the user's view is mapped to physical memory. There are two types of segmentation:
1. Virtual memory segmentation – Each process is divided into a
number of segments, not all of which are resident at any one point
in time.
2. Simple segmentation – Each process is divided into a number of
segments, all of which are loaded into memory at run time,
though not necessarily contiguously.
There is no simple relationship between logical addresses and
physical addresses in segmentation. A table stores the information
about all such segments and is called Segment Table. Segment Table
– It maps two-dimensional Logical address into one-dimensional
Physical address. It’s each table entry has:
 Base Address: It contains the starting physical address where the
segments reside in memory.
 Limit: It specifies the length of the segment.
Translation of two-dimensional logical address to one-dimensional physical address:
Address generated by the CPU is divided into:
 Segment number (s): Number of bits required to represent the
segment.
 Segment offset (d): Number of bits required to represent the size
of the segment.
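A minimal Python sketch of this segment-table lookup follows; the base/limit values are assumptions chosen for illustration:

# Each entry: segment number -> (base, limit); values are illustrative.
segment_table = {0: (1400, 1000), 1: (6300, 400), 2: (4300, 1100)}

def translate(s, d):
    base, limit = segment_table[s]
    if d >= limit:                  # offset must fall inside the segment
        raise MemoryError("segmentation fault: offset beyond limit")
    return base + d

print(translate(2, 53))   # 4300 + 53 = 4353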
Advantages of Segmentation –
 No Internal fragmentation.
 Segment Table consumes less space in comparison to Page table
in paging.
 As a complete module is loaded all at once, segmentation
improves CPU utilization.
 The user’s perception of physical memory is quite similar to
segmentation. Users can divide user programmes into modules
via segmentation. These modules are nothing more than the
separate processes’ codes.
 The user specifies the segment size, whereas in paging, the
hardware determines the page size.
 Segmentation is a method that can be used to segregate data from
security operations.
Disadvantage of Segmentation –
 As processes are loaded and removed from the memory, the free
memory space is broken into little pieces, causing External
fragmentation.
 Overhead is associated with keeping a segment table for each
activity.
 Due to the need for two memory accesses, one for the segment
table and the other for main memory, access time to retrieve the
instruction increases.
SEGMENTATION WITH PAGING
Pure segmentation is not very popular and not being used in many of
the operating systems. However, Segmentation can be combined with
Paging to get the best features out of both the techniques.
In Segmented Paging, the main memory is divided into variable size
segments which are further divided into fixed size pages.
1. Pages are smaller than segments.
2. Each Segment has a page table which means every program has
multiple page tables.
3. The logical address is represented as Segment Number (base
address), Page number and page offset.
Segment Number → It points to the appropriate Segment Number.
Page Number → It Points to the exact page within the segment
Page Offset → Used as an offset within the page frame
Each Page table contains the various information about every page of
the segment. The Segment Table contains the information about every
segment. Each segment table entry points to a page table entry and
every page table entry is mapped to one of the page within a segment.
Translation of logical address to physical address
The CPU generates a logical address which is divided into two parts:
Segment Number and Segment Offset. The Segment Offset must be
less than the segment limit. Offset is further divided into Page number
and Page Offset. To map the exact page number in the page table, the
page number is added into the page table base.
The actual frame number with the page offset is mapped to the main
memory to get the desired word in the page of the certain segment of
the process.
Advantages of Segmented Paging
1. It reduces memory usage.
2. Page table size is limited by the segment size.
3. Segment table has only one entry corresponding to one actual
segment.
4. External Fragmentation is not there.
5. It simplifies memory allocation.
Disadvantages of Segmented Paging
1. Internal Fragmentation will be there.
2. The complexity level will be much higher as compared to paging.
3. Page Tables need to be contiguously stored in the memory.
6. Explain the concept of Virtual Memory in detail. CO3
Virtual memory is a storage scheme that provides the user an illusion of having a very big main memory. This is done by treating a part of secondary memory as the main memory.
In this scheme, User can load the bigger size processes than the
available main memory by having the illusion that the memory is
available to load the process.
Instead of loading one big process in the main memory, the Operating
System loads the different parts of more than one process in the main
memory.
By doing this, the degree of multiprogramming will be increased and
therefore, the CPU utilization will also be increased.
How Virtual Memory Works?
In the modern world, virtual memory has become quite common. In this scheme, whenever some pages need to be loaded into the main memory for execution and the memory is not available for that many pages, then instead of stopping the pages from entering the main memory, the OS searches for the RAM areas that are least used in recent times or that are not referenced, and copies them into the secondary memory to make space for the new pages in the main memory.
Since all this procedure happens automatically, it makes the computer feel like it has unlimited RAM.
Demand Paging
Demand Paging is a popular method of virtual memory management.
In demand paging, the pages of a process which are least used, get
stored in the secondary memory.
A page is copied to the main memory when its demand is made or
page fault occurs. There are various page replacement algorithms
which are used to determine the pages which will be replaced. We will
discuss each one of them later in detail.
Snapshot of a virtual memory management system
Let us assume 2 processes, P1 and P2, contain 4 pages each. Each page size is 1 KB. The main memory contains 8 frames of 1 KB each. The OS resides in the first two partitions. In the third partition, the 1st page of P1 is stored, and the other frames are also shown as filled with the different pages of processes in the main memory.
The page tables of both the processes are 1 KB in size each, and therefore they can fit in one frame each. The page tables of both the processes contain various information that is also shown in the image.
The CPU contains a register which contains the base address of page
table that is 5 in the case of P1 and 7 in the case of P2. This page table
base address will be added to the page number of the Logical address
when it comes to accessing the actual corresponding entry.
Advantages of Virtual Memory
1. The degree of Multiprogramming will be increased.
2. User can run large application with less real RAM.
3. There is no need to buy more memory RAMs.
Disadvantages of Virtual Memory
1. The system becomes slower since swapping takes time.
2. It takes more time in switching between applications.
3. The user will have the lesser hard disk space for its use.
7. Explain about the given memory management techniques: (i) Partitioned allocation (ii) Paging and translation look-aside buffer. CO3 (Nov/Dec 2015) (Apr/May 2017)
Partition Allocation Methods in Memory Management
Partitioned allocation: Memory is divided into different blocks or
partitions. Each process is allocated according to the requirement.
In Partition Allocation, when there is more than one partition freely
available to accommodate a process’s request, a partition must be
selected. To choose a particular partition, a partition allocation
method is needed. A partition allocation method is considered better
if it avoids internal fragmentation.
When it is time to load a process into the main memory and if there is
more than one free block of memory of sufficient size then the OS
decides which free block to allocate.
There are different Placement Algorithm:
A. First Fit
B. Best Fit
C. Worst Fit
D. Next Fit
1. First Fit: In the first fit, the partition is allocated which is the first
sufficient block from the top of Main Memory. It scans memory from
the beginning and chooses the first available block that is large
enough. Thus it allocates the first hole that is large enough.
2. Best Fit Allocate the process to the partition which is the first
smallest sufficient partition among the free available partition. It
searches the entire list of holes to find the smallest hole whose size is
greater than or equal to the size of the process.
3. Worst Fit Allocate the process to the partition which is the largest
sufficient among the freely available partitions available in the main
memory. It is opposite to the best-fit algorithm. It searches the entire
list of holes to find the largest hole and allocate it to process.
4. Next Fit: Next fit is similar to the first fit but it will search for the
first sufficient partition from the last allocation point.
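A compact Python sketch of these placement strategies; the hole sizes and the 212 KB request are illustrative assumptions:

def first_fit(holes, size):
    # first hole large enough, scanning from the top of memory
    for i, h in enumerate(holes):
        if h >= size:
            return i
    return None

def best_fit(holes, size):
    # smallest hole that still fits
    candidates = [(h, i) for i, h in enumerate(holes) if h >= size]
    return min(candidates)[1] if candidates else None

def worst_fit(holes, size):
    # largest available hole
    candidates = [(h, i) for i, h in enumerate(holes) if h >= size]
    return max(candidates)[1] if candidates else None

holes = [100, 500, 200, 300, 600]   # free block sizes in KB (illustrative)
print(first_fit(holes, 212), best_fit(holes, 212), worst_fit(holes, 212))
# -> 1 (the 500 KB hole), 3 (the 300 KB hole), 4 (the 600 KB hole)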
Translation Lookaside Buffer (TLB) in Paging
To overcome the overhead of consulting the in-memory page table on every reference, a high-speed cache is set up for page table entries, called a Translation Lookaside Buffer (TLB). The TLB is a special cache used to keep track of recently used translations.
been most recently used. Given a virtual address, the processor
examines the TLB if a page table entry is present (TLB hit), the
frame number is retrieved and the real address is formed. If a page
table entry is not found in the TLB (TLB miss), the page number is
used as an index while processing the page table. TLB first checks if
the page is already in main memory, if not in main memory a page
fault is issued then the TLB is updated to include the new page entry.
Steps in TLB hit
1. CPU generates a virtual (logical) address.
2. It is checked in TLB (present).
3. The corresponding frame number is retrieved, which now tells
where the main memory page lies.
Steps in TLB miss
1. CPU generates a virtual (logical) address.
2. It is checked in TLB (not present).
3. Now the page number is matched to the page table residing in the
main memory (assuming the page table contains all PTE).
4. The corresponding frame number is retrieved, which now tells
where the main memory page lies.
5. The TLB is updated with new PTE (if space is not there, one of
the replacement techniques comes into the picture i.e either FIFO,
LRU or MFU etc).
Effective memory access time(EMAT)
The TLB is used to reduce the effective memory access time since it is a high-speed associative cache.
EMAT = h*(c + m) + (1 - h)*(c + 2m)
where h = hit ratio of the TLB, m = memory access time, c = TLB access time
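A small Python sketch evaluating this formula; the hit ratio and access times below are assumed figures chosen for illustration:

def emat(h, c, m):
    # EMAT = h*(c + m) + (1 - h)*(c + 2*m)
    return h * (c + m) + (1 - h) * (c + 2 * m)

# Assumed: 80% hit ratio, 20 ns TLB access, 100 ns memory access.
print(emat(0.80, 20, 100))   # 140.0 ns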
8. Consider the following page reference string: 1, 2, 3, 4, 2, 1, 5, 6, 2, 1, 2, 3, 7, 6, 3, 2, 1, 2, 3, 6. CO3 (Apr/May 2015) (Nov/Dec 2015)
Identify the number of page faults that would occur for the following replacement algorithms, assuming one, two, three, four, five, six, or seven frames. Remember all frames are initially empty, so your first unique pages will all cost one fault each.
a. LRU replacement
b. FIFO replacement
c. Optimal replacement
Answer:
Number of frames LRU FIFO Optimal
1 20 20 20
2 18 18 15
3 15 16 11
4 10 14 8
5 8 10 7
6 7 10 7
7 7 7 7
9. Compare paging with segmentation in terms of the amount of memory required by the address translation structures in order to convert virtual addresses to physical addresses. CO3 (Nov/Dec 2018)
Paging and segmentation are both memory management techniques
used in operating systems to map virtual addresses to physical
addresses. Let's compare them in terms of the amount of memory
required by the address translation structures:
Paging:
In paging, the virtual address space and the physical address space are
divided into fixed-size blocks called pages.
The page table is used for address translation. It consists of page table
entries (PTEs), where each entry maps a page of the virtual address
space to a page in the physical address space.
The size of the page table depends on the size of the virtual address
space, as each page needs an entry in the page table.
The memory required for the page table is proportional to the size of
the virtual address space, not the actual amount of memory used by
the process.
In summary, the amount of memory required for the page table in
paging is influenced by the size of the virtual address space.
Segmentation:
In segmentation, the virtual address space is divided into logical
segments, such as code, data, and stack. Each segment is assigned a
base address and a limit.
The segment table is used for address translation. It contains segment
table entries (STE), where each entry maps a segment to a base
address and a limit.
The memory required for the segment table is proportional to the
number of segments in the virtual address space, regardless of the size
of individual segments.
In summary, the amount of memory required for the segment table in
segmentation is influenced by the number of segments in the virtual
address space.
Comparison:
Paging tends to have a more predictable and straightforward memory
structure, as it depends on the size of the virtual address space.
Segmentation's memory structure is influenced by the number of
segments, making it more variable and potentially less predictable.
In practice, modern systems often use a combination of paging and
segmentation, known as segmented paging or hierarchical paging, to
leverage the benefits of both approaches. This allows for efficient
memory management while addressing the limitations of each
technique individually.
10. Discuss situations under which the most frequently used page replacement algorithm generates fewer page faults than the least frequently used page replacement algorithm. Also discuss under which circumstances the opposite holds. CO3 (Apr/May 2018)
The Most Frequently Used (MFU) and Least Frequently Used (LFU)
page replacement algorithms are based on different strategies for
determining which pages to keep in memory and which to replace.
The effectiveness of these algorithms can depend on the access
patterns of the processes running on the system. Let's discuss
situations where MFU might generate fewer page faults than LFU and
vice versa:
Most Frequently Used (MFU) Generating Fewer Page Faults:
Temporal Locality:
If there is a high degree of temporal locality in the access patterns,
where certain pages are repeatedly accessed in a short period, MFU is
likely to perform well.
MFU replaces pages based on the frequency of access. So, if certain
pages are accessed frequently, they will be kept in memory, leading
to fewer page faults.
Short-Term Bursty Access:
In scenarios where the workload exhibits short bursts of intense
activity on specific pages, MFU can adapt quickly by keeping the
frequently accessed pages in memory.
Stable Access Patterns:
If the access patterns remain relatively stable over time, MFU can
efficiently adapt to the most frequently used pages, minimizing page
faults.
Least Frequently Used (LFU) Generating Fewer Page Faults:
Changing Access Patterns:
If the workload has dynamic and changing access patterns, where
previously infrequently accessed pages become suddenly popular,
LFU might perform better.
LFU keeps track of the frequency of accesses over the long term and
may be more adaptive to changes in access patterns.
Variable Page Usage:
When there is a mix of pages with varying access frequencies, LFU
might outperform MFU. LFU considers the historical frequency of
page accesses and may be better suited for diverse workloads.
Avoiding Thrashing:
In scenarios where the MFU algorithm might be overly aggressive in
keeping frequently used pages, leading to thrashing (excessive page
swapping), LFU may help by considering the overall frequency of
page usage.
In summary, the choice between MFU and LFU can depend on the
nature of the workload and the characteristics of the memory access
patterns. Temporal locality, short-term bursty access, and stable
access patterns might favor MFU, while changing access patterns,
variable page usage, and the need to avoid thrashing might favor
LFU. It's worth noting that real-world workloads often exhibit a
combination of these characteristics, and the choice of the page
replacement algorithm can impact system performance.
11. Explain the difference between internal fragmentation and external fragmentation. CO3 (Nov/Dec 2016)
Definition:
 Internal fragmentation: Unutilized memory within a single allocated block.
 External fragmentation: Unutilized memory scattered throughout the system due to memory being allocated in non-contiguous blocks.
Location:
 Internal fragmentation: Within the allocated memory block.
 External fragmentation: Outside allocated memory blocks.
Cause:
 Internal fragmentation: Caused by the allocation of fixed-size memory blocks, leading to unused space within a block.
 External fragmentation: Caused by the allocation and deallocation of variable-sized memory blocks, resulting in small free memory chunks scattered across the system.
Visibility:
 Internal fragmentation: Visible only to the process to which the memory is allocated.
 External fragmentation: Visible system-wide; affects overall system performance.
Resolution:
 Internal fragmentation: Typically addressed by adjusting block sizes or using memory allocation techniques that minimize wasted space.
 External fragmentation: Addressed through compaction or memory management techniques that try to coalesce free memory blocks to form larger contiguous blocks.
Impact on Utilization:
 Internal fragmentation: Reduces the efficiency of memory usage within individual allocated blocks.
 External fragmentation: Reduces the overall system's ability to use available memory efficiently due to scattered free spaces.
Common in:
 Internal fragmentation: Fixed-size memory allocation schemes like paging or fixed-size partitioning.
 External fragmentation: Variable-size memory allocation schemes like dynamic partitioning.
Example:
 Internal fragmentation: Allocating 4 KB of memory for a data structure that only needs 3.5 KB, leading to 0.5 KB of wasted space within the block.
 External fragmentation: Free memory blocks scattered throughout the system that cannot be used to satisfy larger memory requests.
Understanding the distinctions between internal and external
fragmentation is crucial for effective memory management in
operating systems. Internal fragmentation occurs within individual
allocated blocks, impacting the efficiency of memory usage for a
specific process. External fragmentation, on the other hand, affects
the overall system by scattering free memory throughout, making it
challenging to allocate contiguous memory for larger processes.
12. Most systems allow programs to allocate more memory to their address space during execution. Data allocated in the heap segments of programs is an example of such allocated memory. What is required to support dynamic memory allocation in the following schemes? CO3 (Nov/Dec 2018)
i) Contiguous memory allocation ii) Pure segmentation iii) Pure paging
a. Contiguous Memory Allocation:
In contiguous memory allocation, memory is divided into fixed-size
partitions, and each partition is allocated to a process. To support
dynamic memory allocation in this scheme, the following
requirements are needed:
1. Memory Management Unit (MMU): An MMU is required to
translate logical addresses to physical addresses and perform memory
protection.
2. Memory Allocation/Deallocation Mechanism: A mechanism is
needed to allocate and deallocate memory dynamically. This
mechanism should keep track of free and allocated memory blocks
and efficiently manage the allocation and deallocation requests.
3. Memory Fragmentation Handling: Contiguous memory allocation
can lead to two types of fragmentation: external fragmentation and
internal fragmentation. Techniques such as compaction and relocation
may be employed to reduce fragmentation.
b. Pure Segmentation:
In pure segmentation, the logical address space of a process is divided
into variable-sized segments, each representing a different unit of the
program (e.g., code segment, data segment). To support dynamic
memory allocation in pure segmentation, the following requirements
are necessary:
1. Segmentation Table: A segmentation table is required to store the
base address and length of each segment for a process. This table is
used to translate logical addresses to physical addresses.
2. Segmentation Mechanism: A mechanism is needed to allocate and
deallocate segments dynamically. This mechanism should manage the
segmentation table and ensure that segments are appropriately
allocated and deallocated.
3. Memory Protection: As different segments may have different
access permissions, memory protection mechanisms are required to
enforce access control and prevent unauthorized access to segments.
c. Pure Paging:
In pure paging, the logical address space of a process is divided into
fixed-sized pages, and physical memory is divided into fixed-sized
frames. To support dynamic memory allocation in pure paging, the
following requirements are necessary:
1. Page Table: A page table is required to store the mapping between
logical pages and physical frames. Each entry in the page table
contains the frame number corresponding to a logical page.
2. Page Replacement Algorithm: In case the physical memory is full
and a new page needs to be allocated, a page replacement algorithm is
needed to select a page to be evicted from the memory. Common
page replacement algorithms include LRU (Least Recently Used),
FIFO (First-In-First-Out), and Optimal.
3. Memory Management Unit (MMU): An MMU is required to
translate logical addresses to physical addresses using the page table.
4. Memory Allocation/Deallocation Mechanism: A mechanism is
needed to allocate and deallocate pages dynamically. This mechanism
should manage the page table, allocate available pages to processes,
and handle page faults during memory access.
PART - C
S.No. Question and Answer CO Univ QP (Month/Year)
1. Discuss Demand Paging and Copy on Write with a neat sketch. CO3
Demand paging is a technique used in virtual memory systems where
the pages are brought in the main memory only when required or
demanded by the CPU. Hence, it is also called lazy swapper because
the swapping of pages is done only when the CPU requires it. Virtual
memory is commonly implemented in demand paging.
In demand paging, the pager brings only those necessary pages into the
memory instead of swapping in a whole process. Thus, demand paging
avoids reading into memory pages that will not be used anyways,
decreasing the swap time and the amount of physical memory needed.
The demand paging system depends on the page table implementation
because the page table helps map logical memory to physical memory. A valid/invalid bit is kept in each page table entry to indicate whether a page is valid or invalid: all valid pages exist in primary memory, and invalid pages exist in secondary memory. When a process goes ahead and accesses its pages, the following things happen:
1. The process attempts to access the current page.
2. If the page is valid, all processing instructions work as normal.
3. If the page is found invalid, then a page fault arises.
4. The memory reference is then checked to determine whether a valid reference exists in the auxiliary memory or not. If it does not exist, the process is terminated; otherwise the needed pages are paged in.
5. Disk operations are then performed to fetch the required page into primary memory.
Examples of Demand Paging
Suppose we have to execute a process P having four pages P0, P1, P2,
and P3. Currently, in the page table, we have pages P1 and P3.
1. If the CPU wants to access page P2 of a process P, it will first
search the page in the page table.
2. As the page table does not contain this page, it will cause a trap or page fault. The control goes to the operating system as soon as the trap is generated, and context switching happens.
3. The OS system will put the process in a waiting/ blocked state.
The OS system will now search that page in the backing store
or secondary memory.
4. The OS will then read the page from the backing store and load
it to the main memory.
5. Next, the OS system will update the page table entry
accordingly.
6. Finally, the control is taken back from the OS, and the
execution of the process is resumed.
Hence, the operating system follows these steps whenever a page fault
occurs and the required page is brought into memory.
So whenever a page fault occurs, as shown in all the above steps 2 to 6.
This time taken to service the page fault is called the Page fault
service time.
Effective Memory Access Time: When the page fault rate is 'p' while
executing any process, then the effective memory access time is
calculated as follows:
Effective Memory Access time = (p)*(s) + (1-p)*(m)
Where
o p is the page fault rate.
o s is the page fault service time.
o m is the main memory access time.
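A quick Python sketch of this formula with assumed figures (the fault rate and times below are illustrative, not from the question):

def effective_access(p, s, m):
    # p = page fault rate, s = fault service time, m = memory access time
    return p * s + (1 - p) * m

# Assumed: 1 fault per 1000 accesses, 8 ms (8,000,000 ns) service, 200 ns access.
print(effective_access(0.001, 8_000_000, 200))   # 8199.8 ns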
Advantages of Demand Paging
Here are the following advantages of demand paging in the operating
system, such as:
o It increases the degree of multiprogramming as many processes
can be present in the main memory simultaneously.
o There is a more efficient use of memory as processes having a
size more than the size of the main memory can also be
executed using this mechanism because we are not loading the
whole page at a time.
o It gives the ability to scale virtual memory.
o If any program is larger than physical memory, it helps run this
program without compaction.
o Partition management is simpler.
o It is more useful in a time-sharing system.
o It has no limitations on the level of multi-programming.
o Discards external fragmentation.
o Easy to swap all pages.
Disadvantages of Demand Paging
Below are some disadvantages of demand paging in an operating
system, such as:
o The amount of processor overhead and the number of tables
used for handling the page faults is greater than in simple page
management techniques.
o It has more probability of internal fragmentation.
o Its memory access time is longer.
o Page Table Length Register (PTLR) has a limit for virtual
memory.
o Page map table is needed additional memory and registers.
COPY ON WRITE
Copy on Write or simply COW is a resource management technique.
One of its main use is in the implementation of the fork system call in
which it shares the virtual memory(pages) of the OS.
In UNIX like OS, fork() system call creates a duplicate process of the
parent process which is called as the child process.
The idea behind copy-on-write is that when a parent process creates a child process, both processes initially share the same pages in memory, and these shared pages are marked as copy-on-write. If either process tries to modify a shared page, only then is a copy of that page created; the modification is done on the copy by that process, leaving the other process unaffected.
Suppose, there is a process P that creates a new process Q and then
process P modifies page 3.
The figures below show what happens before and after process P modifies page 3.
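A minimal, Unix-only Python sketch of the fork semantics that copy-on-write optimizes; the list merely stands in for the process's pages, and the actual page copying happens inside the kernel:

import os

data = list(range(4))       # stands in for the parent's pages

pid = os.fork()             # child initially shares the parent's pages (COW)
if pid == 0:                # child process
    data[3] = 99            # first write triggers a private copy of the page
    print("child :", data)  # [0, 1, 2, 99]
    os._exit(0)
else:
    os.waitpid(pid, 0)
    print("parent:", data)  # parent still sees [0, 1, 2, 3]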
2. Describe Page Replacement algorithms with examples. CO3 (Apr/May 2015) (Nov/Dec 2021)
In an operating system that uses paging for memory management, a page replacement algorithm is needed to decide which page needs to be replaced when a new page comes in.
Page Fault – A page fault happens when a running program accesses a
memory page that is mapped into the virtual address space, but not
loaded in physical memory.
Since actual physical memory is much smaller than virtual memory,
page faults happen. In case of page fault, Operating System might have
to replace one of the existing pages with the newly needed page.
Different page replacement algorithms suggest different ways to
decide which page to replace. The target for all algorithms is to reduce
the number of page faults.
Page Replacement Algorithms :
First In First Out (FIFO) –
This is the simplest page replacement algorithm. In this algorithm, the
operating system keeps track of all pages in the memory in a queue,
the oldest page is in the front of the queue. When a page needs to be
replaced page in the front of the queue is selected for removal.
Example-1: Consider the page reference string 1, 3, 0, 3, 5, 6, 3 with 3 page frames. Find the number of page faults.
Initially all slots are empty, so when 1, 3, 0 came they are allocated to
the empty slots —> 3 Page Faults.
when 3 comes, it is already in memory so —> 0 Page Faults.
Then 5 comes, it is not available in memory so it replaces the oldest
page slot i.e 1. —>1 Page Fault.
6 comes, it is also not available in memory so it replaces the oldest
page slot i.e 3 —>1 Page Fault.
Finally, when 3 comes, it is not available in memory, so it replaces 0 —> 1 Page Fault. Total = 6 page faults.
Belady’s anomaly – Belady’s anomaly proves that it is possible to have
more page faults when increasing the number of page frames while
using the First in First Out (FIFO) page replacement algorithm. For
example, if we consider reference string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4
and 3 slots, we get 9 total page faults, but if we increase slots to 4, we
get 10 page faults.
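Both the Example-1 count and Belady's anomaly can be checked with a short simulation. A minimal Python sketch (an illustration, not part of the syllabus text):

from collections import deque

def fifo_faults(refs, nframes):
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.discard(queue.popleft())   # evict the oldest resident page
            frames.add(page)
            queue.append(page)
    return faults

print(fifo_faults([1, 3, 0, 3, 5, 6, 3], 3))                 # 6 faults (Example-1)
print(fifo_faults([3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4], 3))  # 9 faults
print(fifo_faults([3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4], 4))  # 10 faults: Belady's anomaly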
Optimal Page replacement –
In this algorithm, pages are replaced which would not be used for the
longest duration of time in the future.
Example-2: Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with 4 page frames. Find the number of page faults.
Initially all slots are empty, so when 7 0 1 2 are allocated to the empty
slots —> 4 Page faults
0 is already there so —> 0 Page fault.
when 3 came it will take the place of 7 because it is not used for the
longest duration of time in the future.—>1 Page fault.
0 is already there so —> 0 Page fault.
4 takes the place of 1 —> 1 Page Fault.
Now for the further page reference string —> 0 Page fault because
they are already available in the memory.
Optimal page replacement is perfect, but not possible in practice as the
operating system cannot know future requests. The use of Optimal
Page replacement is to set up a benchmark so that other replacement
algorithms can be analyzed against it.
Least Recently Used –
In this algorithm, the page that is least recently used will be replaced.
Example-3: Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with 4 page frames. Find the number of page faults.
Initially all slots are empty, so when 7 0 1 2 are allocated to the empty
slots —> 4 Page faults
0 is already there so —> 0 Page fault.
when 3 came it will take the place of 7 because it is least recently used
—>1 Page fault
0 is already in memory so —> 0 Page fault.
4 takes the place of 1 —> 1 Page Fault.
Now for the further page reference string —> 0 Page fault because
they are already available in the memory.
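Both walkthroughs can be verified in the same style as the FIFO sketch above. A minimal Python sketch of LRU and Optimal:

def lru_faults(refs, nframes):
    frames, faults = [], 0             # kept ordered from least to most recently used
    for page in refs:
        if page in frames:
            frames.remove(page)        # hit: refresh the page's recency
        else:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)          # evict the least recently used page
        frames.append(page)
    return faults

def optimal_faults(refs, nframes):
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < nframes:
            frames.append(page)
            continue
        # evict the page whose next use lies farthest in the future
        future = refs[i + 1:]
        victim = max(frames, key=lambda p: future.index(p) if p in future else len(future))
        frames[frames.index(victim)] = page
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
print(lru_faults(refs, 4), optimal_faults(refs, 4))   # 6 6, matching both walkthroughs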
3. State how frames are allocated during paging with diagrams CO3
wherever necessary.
The main memory of the operating system is divided into frames. A
process is stored in these frames, and once the process is loaded into
frames, the CPU may run it. The operating system must therefore set
aside enough frames for each process, and it uses various algorithms
to assign the frames.
Demand paging is used to implement virtual memory, an essential
operating system feature. It requires the development of a page
replacement mechanism and a frame allocation system. If you have
multiple processes, the frame allocation techniques are utilized to
define how many frames to allot to each one. A number of factors
constrain the strategies for allocating frames:
1. You cannot assign more frames than the total number of frames
available.
2. A specific number of frames should be assigned to each
process. This limitation is due to two factors. The first is that
when the number of frames assigned drops, the page fault ratio
grows, decreasing the process's execution performance.
Second, there should be sufficient frames to hold all the
multiple pages that any instruction may reference.
There are mainly five ways of frame allocation algorithms in the OS.
These are as follows:
1. Equal Frame Allocation
2. Proportional Frame Allocation
3. Priority Frame Allocation
4. Global Replacement Allocation
5. Local Replacement Allocation
Equal Frame Allocation
In equal frame allocation, the available frames are divided equally
among the processes in the OS. For example, if the system has 30
frames and 7 processes, each process will get 4 frames. The 2 frames
that are not assigned to any process may be used as a free-frame
buffer pool in the system.
Disadvantage
In a system with processes of varying sizes, assigning equal frames to
each process makes little sense. Many allotted empty frames will be
wasted if many frames are assigned to a small task.
Proportional Frame Allocation
The proportional frame allocation technique assigns frames based on
the size needed for execution and the total number of frames in
memory.
The allocated frames for a process pi of size si are ai = (si/S)*m, in
which S represents the total of all process sizes, and m represents the
number of frames in the system.
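For instance, with illustrative numbers m = 62 free frames and two processes of sizes 10 and 127 pages, S = 137 and the allocations work out to about 4 and 57 frames. A short Python check of the formula:

m, sizes = 62, {"p1": 10, "p2": 127}                   # assumed frame count and process sizes
S = sum(sizes.values())
print({p: s_i * m // S for p, s_i in sizes.items()})   # {'p1': 4, 'p2': 57}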
Disadvantage
The only drawback of this algorithm is that it doesn't allocate frames
based on priority. Priority frame allocation solves this problem.
Priority Frame Allocation
Priority frame allocation assigns frames based on the priority of each
process and the number of frames it requires. If a process has a high
priority and requires more frames, many frames will be allocated to it.
Following that, lesser priority processes are allocated frames.
Global Replacement Allocation
When a process requires a page that isn't currently in memory, it may
bring it in and select a frame from the set of all frames, even if
another process is already utilizing that frame. In other words, one
process may take a frame from another.
Advantages
Process performance is not hampered, resulting in higher system
throughput.
Disadvantages
The process itself may not solely control the page fault ratio of a
process. The paging behavior of other processes also influences the
number of pages in memory for a process.
Local Replacement Allocation
When a process requires a page that isn't already in memory, it can
bring it in and assign it a frame from its set of allocated frames.
Advantages
The paging behavior of a specific process has an effect on the pages in
memory and the page fault ratio.
Disadvantages
A low priority process may obstruct a high priority process by refusing
to share its frames.
Global Vs. Local Replacement Allocation
The number of frames assigned to a process does not change under a
local replacement strategy. On the other hand, with global
replacement, a process may also select frames allocated to other
processes, thus increasing the number of frames allocated to it.
4. Explain the concept of thrashing elaborately. CO3
Thrashing is a condition or a situation when the system is spending a
major portion of its time servicing the page faults, but the actual
processing done is very negligible.
In computer science, thrashing is the poor performance of a virtual
memory (or paging) system that occurs when the same pages are loaded
repeatedly because there is not enough main memory to keep them resident.
Depending on the configuration and algorithm, the actual throughput
of a system can degrade by multiple orders of magnitude.
In computer science, thrashing occurs when a computer's virtual
memory resources are overused, leading to a constant state of paging
and page faults, inhibiting most application-level processing. It causes
the performance of the computer to degrade or collapse. The situation
can continue indefinitely until the user closes some running
applications or the active processes free up additional virtual memory
resources.
To know more clearly about thrashing, first, we need to know about
page fault and swapping.
o Page fault: We know every program is divided into some
pages. A page fault occurs when a program attempts to access
data or code in its address space but is not currently located in
the system RAM.
o Swapping: Whenever a page fault happens, the operating
system will try to fetch that page from secondary memory and
try to swap it with one of the pages in RAM. This process is
called swapping.
Thrashing occurs when page faults and swapping happen very
frequently, forcing the operating system to spend more time swapping
pages than doing useful work. This state in the operating system is
known as thrashing. Because of thrashing, the CPU utilization becomes
very low or negligible.
The basic concept involved is that if a process is allocated too few
frames, then there will be too many and too frequent page faults. As a
result, no valuable work would be done by the CPU, and the CPU
utilization would fall drastically.
The long-term scheduler would then try to improve the CPU utilization
by loading some more processes into the memory, thereby increasing
the degree of multiprogramming. Unfortunately, this would result in a
further decrease in the CPU utilization, triggering a chained reaction of
higher page faults followed by an increase in the degree of
multiprogramming, called thrashing.
Algorithms during Thrashing
Whenever thrashing starts, the operating system tries to apply either
the Global page replacement Algorithm or the Local page replacement
algorithm.
1. Global Page Replacement
Since global page replacement can bring any page, it tries to bring
more pages whenever thrashing is found. But what actually will
happen is that no process gets enough frames, and as a result, the
thrashing will increase more and more. Therefore, the global page
replacement algorithm is not suitable when thrashing happens.
2. Local Page Replacement
Unlike the global page replacement algorithm, local page replacement
will select pages which only belong to that process. So there is a
chance to reduce the thrashing. But it is proven that there are many
disadvantages if we use local page replacement. Therefore, local page
replacement is just an alternative to global page replacement in a
thrashing scenario.
Causes of Thrashing
Programs or workloads may cause thrashing, and it results in severe
performance problems, such as:
o If CPU utilization is too low, we increase the degree of
multiprogramming by introducing a new process to the system. A global page
replacement algorithm is used. The CPU scheduler sees the
decreasing CPU utilization and increases the degree of
multiprogramming.
o CPU utilization is plotted against the degree of
multiprogramming.
o As the degree of multiprogramming increases, CPU utilization
also increases.
o If the degree of multiprogramming is increased further,
thrashing sets in, and CPU utilization drops sharply.
o So, at this point, to increase CPU utilization and to stop
thrashing, we must decrease the degree of multiprogramming.
How to Eliminate Thrashing
Thrashing has some negative impacts on hard drive health and system
performance. Therefore, it is necessary to take some actions to avoid it.
To resolve the problem of thrashing, here are the following methods,
such as:
o Adjust the swap file size: If the system swap file is not
configured correctly, disk thrashing can occur.
o Increase the amount of RAM: As insufficient memory can
cause disk thrashing, one solution is to add more RAM. With
more memory, the computer can handle tasks easily and doesn't
have to work excessively. Generally, this is the best long-term
solution.
o Decrease the number of applications running on the
computer: Too many applications running in the background
consume a lot of system resources, and the shortage of remaining
resources can result in thrashing. Closing some applications
releases resources and helps avoid thrashing to some extent.
o Replace programs: Replace memory-heavy programs with
equivalents that use less memory.
Techniques to Prevent Thrashing
The Local Page replacement is better than the Global Page
replacement, but local page replacement has many disadvantages, so it
is sometimes not helpful. Therefore below are some other techniques
that are used to handle thrashing:
1. Locality Model
A locality is a set of pages that are actively used together. The locality
model states that as a process executes, it moves from one locality to
another. Thus, a program is generally composed of several different
localities which may overlap.
For example, when a function is called, it defines a new locality where
memory references are made to the function call instructions, local and
global variables, etc. Similarly, when the function is exited, the process
leaves this locality.
2. Working-Set Model
This model is based on the above-stated concept of the Locality
Model.
The basic principle states that if we allocate enough frames to a
process to accommodate its current locality, it will only fault whenever
it moves to some new locality. But if the allocated frames are lesser
than the size of the current locality, the process is bound to thrash.
According to this model, based on parameter A, the working set is
defined as the set of pages in the most recent 'A' page references.
Hence, all the actively used pages would always end up being a part of
the working set.
The accuracy of the working set is dependent on the value of
parameter A. If A is too large, then working sets may overlap. On the
other hand, for smaller values of A, the locality might not be covered
entirely.
If D is the total demand for frames and WSSi is the working set size for
process i,
D = Σ WSSi
Now, if 'm' is the number of frames available in the memory, there are
two possibilities:
o D>m, i.e., total demand exceeds the number of frames, then
thrashing will occur as some processes would not get enough
frames.
o D<=m, then there would be no thrashing.
If there are enough extra frames, then some more processes can be
loaded into the memory. On the other hand, if the summation of
working set sizes exceeds the frames' availability, some of the
processes have to be suspended (swapped out of memory).
This technique prevents thrashing along with ensuring the highest
degree of multiprogramming possible. Thus, it optimizes CPU
utilization.
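A minimal Python sketch of how a working set could be extracted from a reference string for a window of A references (the reference string is illustrative):

def working_set(refs, t, A):
    # pages touched in the last A references ending at time t
    return set(refs[max(0, t - A + 1): t + 1])

refs = [1, 2, 1, 5, 7, 7, 7, 7, 5, 1]
print(working_set(refs, 9, 4))         # {1, 5, 7}, so WSS = 3 at t = 9

Summing such working-set sizes over all processes gives D, which is then compared against m as described above.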
3. Page Fault Frequency
A more direct approach to handle thrashing is the one that uses the
Page-Fault Frequency concept.
The problem associated with thrashing is the high page fault rate, and
thus, the concept here is to control the page fault rate.
If the page fault rate is too high, it indicates that the process has too
few frames allocated to it. On the contrary, a low page fault rate
indicates that the process has too many frames.
Upper and lower limits can be established on the desired page fault
rate, as shown in the diagram.
If the page fault rate falls below the lower limit, frames can be
removed from the process. Similarly, if the page fault rate exceeds the
upper limit, more frames can be allocated to the process.
In other words, the graphical state of the system should be kept limited
to the rectangular region formed in the given diagram.
If the page fault rate is high with no free frames, some of the processes
can be suspended and the frames allocated to them can be reallocated to
other processes. The suspended processes can be restarted later.
UNIT IV STORAGE MANAGEMENT
Mass Storage system – Disk Structure - Disk Scheduling and Management; File-System
Interface - File concept - Access methods - Directory Structure - Directory organization - File
system mounting - File Sharing and Protection; File System Implementation - File System
Structure - Directory implementation - Allocation Methods - Free Space Management; I/O
Systems – I/O Hardware, Application I/O interface, Kernel I/O subsystem.
PART - A
S.No. Question and Answer CO Univ QP
(Month /
Year)
1. Distinguish file from dictionary. CO4 (Apr/May 2017)
A file is any kind of computer document whereas a directory is a
collection of files and folders.
Purpose: A file stores data persistently on storage devices; a dictionary stores key-value pairs for efficient data retrieval.
Structure: A file is typically organized as a sequence of bytes, characters, or records; a dictionary is organized as a collection of key-value pairs.
Content retrieval: A file is accessed sequentially or randomly, depending on the file type; a dictionary is accessed by searching for a specific key to retrieve its associated value.
Data types: A file can store various data types, including text, binary, numeric, etc.; in a dictionary, the values associated with keys can be of various data types.
Search capability: File systems may support basic search functions, but the primary focus is on reading and writing data; a dictionary provides efficient key-based search operations for fast retrieval.
Modification: File data can be modified or appended, but modifications may require rewriting the entire file; a dictionary is efficient for adding, updating, or deleting key-value pairs without affecting other entries.
Size: A file can be of variable or fixed size, depending on the file system and file type; a dictionary's size is dynamic, based on the number of key-value pairs.
Examples: Files include text files, binary files, databases, etc.; dictionaries include hash tables, associative arrays, JSON objects, etc.
2. Why it is important to scale up system bus and device speed as CO4 (Nov/Dec
CPU speed increases? 2016)
Consider a system which performs 50% I/O and 50% computes.
Doubling the CPU performance on this system would increase total
system performance by only 50%. Doubling both system aspects
would increase performance by 100%. Generally, it is important to
remove the current system bottleneck, and to increase overall system
performance, rather than blindly increasing the performance of
individual system components.
3. Define C-SCAN scheduling. CO4
C-SCAN (Circular SCAN) is a variant of the SCAN (elevator) disk
scheduling algorithm, which determines the motion of the disk's arm
and head in servicing read and write requests. In C-SCAN the head
moves in one direction servicing requests; on reaching the last track it
returns immediately to the other end without servicing requests on the
return trip, then resumes servicing. This treats the tracks as a circular
list and provides a more uniform wait time than SCAN.
4. How does DMA increase system concurrency? CO4 (May
DMA increases system concurrency by allowing the CPU to perform /June
tasks while the DMA system transfers data via the system and 2016)
memory buses.
5. Why rotational latency is not considered in disk scheduling? CO4 (May
/June
Most disks do not export their rotational position information to the 2016)
host. Even if they did, the time for this information to reach the
scheduler would be subject to imprecision and the time consumed
by the scheduler is variable, so the rotational position information
would become incorrect. Further, the disk requests are usually
given in terms of logical block numbers, and the mapping between
logical blocks and physical locations is very complex.
6. List the various file attributes. CO4 (Apr/May
2015)
A file has certain other attributes, which vary from one operating (Nov/Dec
system to another, but typically consist of these: Name, identifier, 2018)
type, location, size, protection, time, and date and user
identification
7. What is HSM? Where it is used? CO4 (Apr/May
2015)
Hierarchical storage management (HSM) is a data storage
technique, which automatically moves data between high-cost and
low-cost storage media. HSM systems exist because high- speed
storage devices, such as solid state drive arrays, are more expensive
(per byte stored) than slower devices, such as hard disk drives,
optical discs and magnetic tape drives.
8. What are the functions of Virtual File System (VFS) layer in file CO4 (Nov/Dec
system implementation? 2015)
A virtual file system (VFS) or virtual file system switch is an
abstraction layer on top of a more concrete file system. The purpose
of a VFS is to allow client applications to access different types of
concrete file systems in a uniform way. A VFS can, for example, be
used to access local and network storage devices transparently
without the client application noticing the difference.
9. List the significance of LDT and GDT in segmentation. CO4 (Nov
LDT contains memory segments which are private to a specific /Dec
program, the GDT contains global segments. The x86 processors 2018)
have facilities for automatically switching the current LDT on
specific machine events, but no facilities for automatically
switching the GDT.
10. Suppose that the disk rotates at 7200 rpm. CO4
a. what is the average rotational latency of the disk drive?
b. Identify the seek distance that can be covered in the time?
a. One rotation takes 60/7200 = 8.33 ms, so the average rotational latency is half a rotation, i.e. about 4.17 ms.
b. The seek distance that can be covered in this time depends on the drive's seek-time specification, which is not given here.
11. What are the various file operations? CO4 (Nov/Dec 2018)
The six basic file operations are
Creating a file
Writing a file
Reading a file
Repositioning within a file
Deleting a file
Truncating a file
12. Define typical bad sector transaction. CO4 Apr/May
2018
The operating system tries to read logical
block 87.
 The controller calculates the ECC and finds that the sector
is bad. It reports this finding to the OS.
The next time the system is rebooted, a special command is
run to tell the controller to replace the bad sector with a
spare.
 After that, whenever the system requests logical block 87,
the request is translated into the replacement sector’s
address by the controller.
13. What are the informations associated with an open file? CO4
Several pieces of information are associated with an open file
which may be:
File pointer
File open count
Disk location of the file
Access rights
14. What are the different accessing methods of a file? CO4
The different types of accessing a file are:
Sequential access: Information in the file is accessed
sequentially
Direct access: Information in the file can be accessed without any
particular order.
Other access methods: Creating index for the file, indexed
sequential access method (ISAM) etc.
15. Determine the most common schemes for defining the logical CO4
structure of a directory?
The most common schemes for defining the logical structure of a
directory
Single-Level Directory
Two-level Directory
Tree-Structured Directories
Acyclic-Graph Directories
General Graph Directory
16. Define UFD and MFD. CO4
In the two-level directory structure, each user has her own user file
directory (UFD). Each UFD has a similar structure, but lists only
the files of a single user. When a job starts the system's master file
directory (MFD) is searched. The MFD is indexed by the user name
or account number, and each entry points to the UFD for that user.
17. Examine how an index file is used to speed up the access in CO4
direct-access files?
Have an index in memory; the index gives the key and the disk
location of its corresponding record. Scan the index to find the
record you want, and then access it directly.
18. What are the various disk-scheduling algorithms? CO4
The various disk-scheduling algorithms are
• First Come First Served Scheduling
• Shortest Seek Time First Scheduling
• SCAN Scheduling
• C-SCAN Scheduling
• LOOK scheduling
19. List three ways of allocating storage, and give advantages of CO4
each.
a. Contiguous allocation. Fastest, if no changes are to be made.
Also easiest for random access files.
b. Linked allocation. No external fragmentation. File can grow
without complications.
c. Indexed allocation. Supports direct access without external
fragmentation.
20. If the average page faults service time of 25 ms and a memory CO4
access time of 100ns. Calculate the effective access time.
Effective access time = (1-p) * ma + p * page-fault service time
= (1-p) * 100 + p * 25,000,000 (in ns)
= 100 - 100p + 25,000,000p
= (100 + 24,999,900p) ns
21. What is Belady’s Anomaly? CO4
For some page replacement algorithms, the page fault rate may
increase as the number of allocated frames increases
22. What is meant by Locality of Reference? CO4
The locality model states that, as a process executes, it moves
from locality to locality.
Locality is of two types.
 Spatial locality
 Temporal locality
23. What are the drawbacks of Contiguous Allocation of Disk CO4
Space?
The disadvantages are,
Suffers from external fragmentation
Suffers from internal fragmentation
Difficulty in finding space for a new file
File cannot be extended
Size of the file is to be declared in advance
24. What are the advantages of Linked Allocation? CO4
The advantages are,
 No external fragmentation
 Size of the file does not need to be declared
25. What are the disadvantages of Linked Allocation? CO4
The disadvantages are,
Used only for sequential access of files.
Direct access is not supported
Memory space required for the pointers.
Reliability is compromised if the pointers are lost or damaged
PART B
1. Explain about directory structure? CO4 (Apr/May 2015) (Apr/May 2017)
• User stores files in the file system using directories. The file system
stores files of many users. Files are stored on secondary storage devices
like hard disk, optical disk and memory disk. These devices support the
random access method.
• Directory structure organizes and provides information about all
files in the system. Every partition has a file system which consists of
files and directories. Hard disks are split into one or more partitions,
also called minidisks or volumes. Users can assign names to volumes.
Each disk contains at least one partition.
• Operating systems support up to 24 partitions. Partitions can be
larger than a disk, i.e. combining two hard disks; such partitions are
called "virtual disks". A partition maintains information about its files
in a "Device Directory".
• Files are stored in directories. A directory is created by the user, who
is free to create as many directories as needed and assign suitable names. File
system contains many directories. Directory maintains information
about the file. File information includes type, size, date modified and
name.
• In some operating systems, the directory itself is considered a file. A
set of logically associated files is called directory. Directory structure
of the file system connects a directory to other directories in the
system.
• Directory is a data structure that lists files and subdirectories in a
collection. Directory structure contains a list of entries one for each
file.
• Directory does not store any user data. Single dot (.) indicates a
current directory and double dot (..) indicates the parent directory.
The root is identified by the directory '/'. The root file system has
several subdirectories.
• Windows file system uses a letter followed by colon for specifying
root directory. For example, root directory in windows is C: or D: or
E:
• UNIX and Linux based file system uses slash (/) for specifying root
directory.
• A directory can contain any number of files. A directory can also
contain subdirectories. Because a directory is nothing more than a
special kind of file, directories also have names.
Operation on directory :
1. File searching: Directory structure is searched for particular file
entry. File uses symbolic names and similar names may indicate a
relationship between files.
2. Create a file: User creates new files and added to the directory.
3. Delete a file: User remove the file after completing its work on
files.
4. Rename a file: User change the file name if file content is change.
5. Listing of directory: Listing of files in the directory for some use.
MS-DOS and windows uses "dir" command and Linux/UNIX uses
"ls" command for this purposes.
Single Level Directory
• Single level directory is a simple structure and there are no sub-
directories.
• Fig. 6.3.1 shows three files with single level directory. It is also
called flat directory structure.
• Single level directory structure is suitable only for a single user. As
the number of users increases, it creates problems with assigning
unique names to files.
• In single level directory structure, no two files can have the same
name.
• It is simple to implement and searching of file is faster. This type of
directory system is used in cameras and phones.
• Because of limitation of file names, this structure is rarely
implemented.
• File name size varies according to the operating system. MS-DOS
operating system allows only 11 character file names and UNIX OS
allows upto 255 characters.
Disadvantages of single level directory:
1. Not suitable for a large number of files
2. Limitation of unique file name.
3. For large number of file, recalling file name is difficult.
Hierarchical Directory System
• Here files are grouped together to form a sub directory. Each
directory may contain several subdirectories. In hierarchical file
system, a root directory points to other directories and files.
• Hierarchical directory system contains various forms of directory
structures. They are as follows:
1. Two-level directory structure
2. Tree structured directories
3. Acyclic graph directories.
Two-level Directory Structure
• Two-level directory structure contains master file directory at the
top level then user file directory at the second level. Actual user files
are at the third level.
• File system maintains a master block for each user. Master block
has one entry for each user.
• User directory address is stored in the master block. Each user has a
private directory. Different user can assign same names to their files
as the directories for each user are now separate.
• Fig. 6.3.2 shows two-level directory structure.
• Here user can work only in own directory. Other user need not
worry about deleting or modifying files by other users.
• When a user wants to create a file, the system searches for that
particular file name in the directory. If the same file name is not found,
it creates the file; otherwise it gives an error message.
• In this structure, different users may have files with the same name.
It solves the name conflict problem. When a user wants to open a file,
the filename is searched for in the user's directory.
• A two level directory can be a tree or an inverted tree of height 2.
• There are still problems with the two-level directory structure. It
effectively isolates one user from another. This is sometimes an
advantage and sometimes a disadvantage: isolation is an advantage for
completely independent users, but a disadvantage when cooperation
between two users is required.
Tree Structured Directory
• Fig. 6.3.3 shows the tree structured directory.
• In this structure, a directory itself is a file. A directory and its
subdirectories contain a set of files. The internal format is the same
for all directories.
• Commonly used directory structure is tree structure. Tree has a root
directory. All files in disk have a unique path name.
• A path name describes the path that the operating system must take
to get from some starting point in the file system to the destination
object. Each process has its own working directory. Path names are of
two types : relative and absolute.
• Relative path name: Its path starts from the current working directory
(it may start with "." or "..").
• Absolute path name: Its path starts from root directory (starts from
"/").
• A tree structured directory differentiates files and subdirectories
using a one-bit representation: zero (0) represents a file and one (1)
represents a subdirectory.
• When a process makes a reference to open a file, the file system
searches for the file in the current directory. If the needed file is not
found there, the user either changes the directory or gives the full path
of the file. System calls are used for performing any operation on
files or directories.
• An empty directory and its entry are easily deleted by the operating
system. If a directory contains files and subdirectories, i.e. a non-empty
directory, some operating systems will not delete it.
• To delete a directory, user must first delete all the files and
subdirectory in that directory.
Merits
1. Users can create their own directories.
2. A user can access other users' files by specifying the path name.
3. Separate subdirectories can be kept for files associated with different topics.
4. Searching is fast and efficient.
Demerits
1. A subdirectory cannot be shared between users.
2. The path is longer than in the two-level directory structure.
Acyclic Graph Directory
• Acyclic graph directory solve the problem of tree structure
directory.
• It allows sharing the directory in between two users. At a time more
than one places shared directory or file will exist in the file system.
• Fig. 6.3.4 shows an acyclic graph directory. Links can be used to
construct the acyclic graph.
• It is very interesting to note that a shared directory or file is not the
same as two copies of the file. When there are two copies of a file,
each user views a copy rather than the original, but if one user
changes the file content, the changes will not appear in the other's
copy.
• Only one original file exists for a shared copy. Any changes made by
one user are immediately visible to the other user. When a user creates
a file in a shared directory, it automatically appears in all the shared
subdirectories.
2. What are files and explain the access methods for files? CO4 Apr/May
2017
Files are the building blocks of any operating system. Permanent
storage of information and data is a file. File provides an easy
means of information sharing. The operating system is not interested
in the information stored in the files; it maps files onto physical
devices.
File Access Methods
• Information and data is kept in files. Files are stored on the
secondary storage device. When this information is to be used, it has
to be accessed and brought into primary main memory.
• Information in files could be accessed in many ways, usually
depending on the application. File organization determines how
efficiently the input-output storage medium is used.
• File access method defines the way processes read and write files.
Different access methods are available. Some operating systems
provide only one access method and other systems provide many
access methods.
1. Sequential access 2. Direct /Random access 3. Indexed sequential
access.
• Access method means accesses to files that use a particular file
organization are implemented by an input output control system.
Sequential Access Method
• Sequential access is the simplest method. The information in a file
is accessed sequentially, one record after another.
• In this method, a process could read all the records in a file in order,
starting at the beginning. It cannot skip any records and cannot read
out of order.
• Fig. 6.2.1 shows sequential access method.
• Batch applications use sequential files. A byte-stream file uses the
sequential access method.
• Example: Compiler and editor usually access files in this method.
Transaction file is also example of sequential access method.
• Sequential file organization is the only method easily stored on both
tape and hard disk. Sequential access is best suited where most of the
records in a file are to be processed.
Disadvantages:
• It provides poor performances.
• More efficient search technique is required.
Direct Access Method
• It is also called random access method. This access allows a user to
position the read/write mark before reading or writing.
• Fig. 6.2.2 shows direct method.
• Direct access file method provides, accessing the records directly.
Direct access method is based on the hard disk that is a direct access
device. It allows random access of any file block.
• Each record has its own address on the file, with the help of
which it can be directly accessed for reading or writing. This feature
is used by editors. An editors need to randomly access the contents of
the file.
• There is no restriction on the order of reading or writing for a direct
access file. Operating system support is not required for direct access
file.
• At the time of file creation, access method is defined. According to
defined access method, file is accessed. Sequential access of a direct
access file is possible but direct access of a sequential file is not.
Disadvantages:
1. Poor utilization of input-output device.
2. Consumes CPU time for record address calculation.
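To make direct access concrete, a minimal Python sketch with fixed-size records (the file name and record size are hypothetical):

RECORD_SIZE = 64                        # assumed fixed record length in bytes
with open("records.dat", "rb") as f:    # hypothetical data file
    f.seek(4 * RECORD_SIZE)             # position directly at record number 4
    record = f.read(RECORD_SIZE)        # no need to read records 0-3 first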
Indexed Sequential Access
• It is a modified form of the direct access method; it is a combination
of the direct and sequential access methods.
• In simple indexed sequential method, simple single level of
indexing is used. Sequential file is used as index.
• Fig. 6.2.3 shows indexed sequential file. Indexed sequential access
contains key field and a pointer to the main file.
• Indexed sequential files store records in the order that they are
written to the disk. Records may be retrieved in sequential order or in
random order, using an index to locate the record in the file.
• If the file size is large, a larger index with a large number of entries
is required, and the CPU takes time to search it. A higher level index
is used to reduce the search time: an entry in the higher level index
points to a section of the lower index, which is searched to find the
section of the disk that may contain the required record.
• File system maintains an overflow file. It adds new records to an
overflow file. The record in the main file that immediately precedes
the new record in logical sequence is updated to contain pointer to the
new record in the overflow file.
• The overflow file is merged with the main file during a batch
update. The multiple indexes for the same key field can be set up to
increase efficiency.
• The lowest level of index file is treated as a sequential file and a
higher level index file is created for that file. Higher index file would
contain pointers to lower index files, which would point to the actual
data items.
3. Explain about kernel I/O subsystem and transforming I/O to CO4 Apr/May
hardware operations. 2017
• Kernel I/O subsystem provides various services like scheduling,
buffering, caching, spooling, device reservation and error handling.
1. I/O scheduling: Some I/O request ordering via per-device queue.
Scheduling can improve overall system performance. Operating-
system developers implement scheduling by maintaining a queue of
requests for each device.
2. Buffering: Buffer is a memory area that stores data while they are
transferred between two devices or between a device and an
application.
3. Caching: It is fast memory region which holds copies of data.
Access to the cached copy is more efficient than access to the
original. Cache files that are read/written frequently. Caching
increases performance.
4. Spooling and device reservation: Spooling uses the disk as a
large buffer for outputting data to printers and other devices. It can
also be used for input, but is generally used for output. Its main use is
to prevent two users from alternating printing lines to the line printer
on the same page, getting their output completely mixed together. It
also helps in reducing idle time through overlapped I/O.
5. Error handling: Most operating systems return an error number or
code when I/O request fails. OS can recover from disk read, device
unavailable, transient write failures.
6. I/O protection: User application may accidentally attempt to
disrupt normal operation via illegal I/O instructions. To avoid this we
can use all I/O instructions defined to be privileged and I/O must be
performed via system calls.
7. Kernel data structures: Kernel keeps state info for I/O
components, including open file tables, network connections,
character device state. The kernel must create a data structure
representing the new process. The scheduler can then use it if it
decides to schedule that process to run when it is its turn.
Transferring I/O Requests to Hardware Operations
• When reading a file from disk, the application refers to the data by
file name. The file system maps the file name, through the directories,
to the file's location on the secondary storage device.
• In UNIX operating system, the name maps to an inode number and
that inode contains the space-allocation information.
• UNIX represents device names in the regular file-system name
space. To resolve a path name, UNIX looks up the name in the
mount table to find the longest matching prefix; the corresponding
entry in the mount table gives the device name.
• UNIX uses the concept of major and minor numbers. Device files are
found in the /dev directory. Each device is assigned a major and a
minor device number.
• The major device number identifies the type of device, i.e. all SCSI
devices would have the same number as would all the keyboards. The
minor device number identifies a specific device, i.e. the keyboard
attached to this workstation.
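On a UNIX-like system, the major and minor numbers of a device file can be read from its inode. A minimal POSIX-only Python sketch:

import os

st = os.stat("/dev/tty")                           # a character device file
print(os.major(st.st_rdev), os.minor(st.st_rdev))  # device type and device instance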
• Consider reading a file from disk for a process :
1. Determine device holding file.
2. Translate name to device representation.
3. Physically read data from disk into buffer.
4. Make data available to requesting process.
5. Return control to process.
4. Explain about RAID in detail. (Apr/May 2015) (Nov/Dec 2016)
RAID is a technique that makes use of a combination of multiple
disks instead of using a single disk for increased performance,
data redundancy, or both.
1. RAID 0 (striped disks)
RAID 0 is taking any number of disks and merging them into one
large volume. It will increase speeds as you're reading and writing
from multiple disks at a time. But all data on all disks is lost if any
one disk fails. An individual file can then use the speed and capacity
of all the drives of the array. The downside to RAID 0, though, is that
it is NOT redundant. The loss of any individual disk will cause
complete data loss. This RAID type is very much less reliable than
having a single disk.
2. RAID 1 (mirrored disks)
It duplicates data across two disks in the array, providing full
redundancy. Both disks store exactly the same data, at the same
time, and at all times. Data is not lost as long as one disk survives.
The total capacity of the array equals the capacity of the smallest disk
in the array. At any given instant, the contents of both disks in the
array are identical.
RAID 1 is capable of a much more complicated configuration. The
point of RAID 1 is primarily for redundancy. If you completely lose a
drive, you can still stay up and running off the other drive.
If either drive fails, you can then replace the broken drive with little to
no downtime. RAID 1 also gives you the additional benefit of
increased read performance, as data can read off any of the drives in
the array. The downsides are that you will have slightly higher write
latency. Since the data needs to be written to both drives in the array,
you'll only have a single drive's available capacity while needing two
drives.
3. RAID 5 (striped disks with single parity)
RAID 5 requires the use of at least three drives. It combines these
disks to protect data against loss of any one disk; the array's storage
capacity is reduced by one disk. It stripes data across multiple drives to
increase performance. But it also adds redundancy by
distributing parity information across the disks.
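The parity is a bitwise XOR of the corresponding data strips, which is what makes single-disk recovery possible: XORing the surviving strips with the parity rebuilds the lost strip. A minimal Python sketch with two illustrative data strips:

d1, d2 = b"\x0f\xf0", b"\x33\x55"                   # two data strips
parity = bytes(a ^ b for a, b in zip(d1, d2))       # parity strip = d1 XOR d2
rebuilt = bytes(a ^ b for a, b in zip(parity, d2))  # lose d1: XOR the survivors
assert rebuilt == d1                                # the lost strip is recovered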
4. RAID 6 (striped disks with double parity)
RAID 6 is similar to RAID 5, but the parity data are written to two
drives. The use of additional parity enables the array to continue to
function even if two disks fail simultaneously. However, this extra
protection comes at a cost. RAID 6 has a slower write performance
than RAID 5.
5. Compare the functionalities of FCFS, SSTF, C-SCAN and C-LOOK with example. (Apr/May 2015) (Apr/May 2018)
First-Come-First-Serve (FCFS), Shortest Seek Time First (SSTF),
Circular SCAN (C-SCAN), and Circular LOOK (C-LOOK) are
disk scheduling algorithms used to determine the order in which
disk I/O requests are serviced. Let's compare their functionalities
with examples:
1. First-Come-First-Serve (FCFS):
Functionality:
FCFS serves requests in the order they arrive.
Example:
Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
124, 65, 67.
The order of service for FCFS would be the same as the order of
arrival: 98, 183, 37, 122, 14, 124, 65, 67.
2. Shortest Seek Time First (SSTF):
Functionality:
SSTF selects the request with the shortest seek time from the
current head position.
Example:
Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
124, 65, 67.
If the disk head is initially at track 53, SSTF would select the
closest request, resulting in the order: 65, 67, 37, 14, 98, 122, 124,
183.
3. Circular SCAN (C-SCAN):
Functionality:
C-SCAN serves requests in a circular manner, moving the disk
head in one direction until the end, then returning to the beginning.
Example:
Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
124, 65, 67.
If the disk head is initially at track 53, C-SCAN would move the
head to the end, then jump to the beginning, resulting in the order:
65, 67, 98, 122, 124, 183, 14, 37.
4. Circular LOOK (C-LOOK):
Functionality:
C-LOOK serves requests in a circular manner, but it only scans as
far as the last request in the current direction.
Example:
Consider disk requests arriving in the order: 98, 183, 37, 122, 14,
124, 65, 67.
If the disk head is initially at track 53, C-LOOK would move the
head towards the last request in the current direction, then jump back
to the lowest pending request, resulting in the order: 65, 67, 98, 122,
124, 183, 14, 37.
Summary:
FCFS: Simple, serves requests in the order of arrival.
SSTF: Minimizes seek time by selecting the closest request.
C-SCAN: Circularly scans the disk, moving in one direction only.
C-LOOK: Circularly scans the disk, but only as far as the last
request in the current direction.
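The SSTF and C-LOOK orders above can be reproduced with a few lines of simulation. A minimal Python sketch:

def sstf(requests, head):
    order, pending = [], list(requests)
    while pending:
        nxt = min(pending, key=lambda t: abs(t - head))   # closest track first
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

def c_look(requests, head):
    up = sorted(t for t in requests if t >= head)    # service upward first
    low = sorted(t for t in requests if t < head)    # then jump to the lowest request
    return up + low

reqs = [98, 183, 37, 122, 14, 124, 65, 67]
print(sstf(reqs, 53))     # [65, 67, 37, 14, 98, 122, 124, 183]
print(c_look(reqs, 53))   # [65, 67, 98, 122, 124, 183, 14, 37]

C-SCAN yields the same service order here; it differs from C-LOOK only in that the head travels to the physical end of the disk and back to track 0 before servicing 14 and 37.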
6. Explain about file system mounting in detail. CO4 (May/June 2016)
Mounting is a process in which the operating system adds the
directories and files from a storage device to the user’s computer file
system. The file system is attached to an empty directory, by adding
so the system user can access the data that is available inside the
storage device through the system file manager. Storage systems can
be internal hard disks, external hard disks, USB flash drivers, SSD
cards, memory cards, network-attached storage devices, CDs and
DVDs, remote file systems, or anything else.
Terminologies used in File System Mounting
 File System: It is the method used by the operating system to
manage data storage in a storage device. So, a user can access
and organize the directories and files in an efficient manner.
 Device name: It is a name/identifier given to a storage partition,
for example, “D:” in Windows.
 Mount point: It is an empty directory in which we are adding
the file system during the process of mounting.
Mounting in Different Operating Systems
1. Linux-Unix based OS
We want to mount /dev/sdb1 to an existing directory /mnt/mydisk.
sudo mount /dev/sdb1 /mnt/mydisk
After mounting, we have to unmount after use
sudo umount /mnt/mydisk
2. Windows OS
In Windows, mounting is very easy for the user. When we connect an
external storage device, Windows automatically detects the file
system and mounts it to a drive letter, such as D: or E:.
Steps:
 Connect an external storage device to your PC.
 Windows detects the file system on the drive (e.g., FAT32 or
NTFS) and assigns it a drive letter, such as “E:”.
 You can access the derive by going through, THIS PC –> FILE
EXPLORER –>”E:” drive
 Access the data.
3. Mac OS
In Mac OS when we connect an external storage it will
automatically mount, and it will be accessible via Finder. As an
advanced mounting method user can also use the
command diskutil in Terminal.
Method 1:
Steps:
 Connect an external storage device to your MAC.
 Mac OS detects the file system and automatically mounts it.
 You can access the drive by opening Finder, and it will appear in
the sidebar.
Method 2(Using diskutil):
To mount a drive with a known identifier: disk2s1
diskutil mount /dev/disk2s1
To unmount:
diskutil unmount /dev/disk2s1
7. Explain about free space management with example. (Nov/Dec 2015)
• Space that is allocated to files must be managed and space which is
not allocated to any file must be managed. To keep track of free disk
space, the system maintains a free space list. File allocation table and
disk allocation both are required to manage free space properly.
• When a file is deleted, its disk space is added to the free-space list.
• Following are the techniques used for free space management:
1. Bit tables or bit vector
2. Chained free portions or linked list
3. Indexing or grouping
4. Free block list or counting
1. Bit tables or bit vector
• Each block on the disk is represented by a bit. A free block is
represented by 0 and a used block by 1.
• For example, consider a disk where blocks 3, 4, 5, 7, 9, 14, 15, 19,
30, 31, 32, 35, 36, 37, 38 are free blocks and rest of the blocks are
allocated. The free space is shown below:
111000101011110011101111111111000110000
• The memory required for a block bitmap = disk size in bytes / (8 ×
file-system block size).
• The bit map requires extra space. For example, a 16 GB disk with
512-byte blocks needs a bitmap of 16 × 2^30 / (8 × 512) bytes = 4 MB.
• When the bit table is stored in memory, an exhaustive search of the
table can slow file system performance. Most file systems use an
auxiliary data structure for the bit table. The file system also maintains
a summary table, which contains a sub-range, the number of free
blocks, and the maximum-sized contiguous number of free blocks.
• Summary table is used to store information about contiguous free
blocks. Whenever file system needs a number of contiguous blocks, it
can scan the summary table to find an appropriate sub-range and then
search that sub-range. This method is used in Apple Macintosh.
Advantage :
a. Easy to find free blocks
b. It is as small as possible.
Disadvantage :
It may not be feasible to keep the bitmap in memory for large disks.
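The bitmap from the example above can be generated and searched in a few lines. A minimal Python sketch:

free_blocks = {3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36, 37, 38}
bitmap = "".join("0" if b in free_blocks else "1" for b in range(39))
print(bitmap)              # 111000101011110011101111111111000110000
print(bitmap.index("0"))   # 3 (finding the first free block is a simple scan)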
2. Link list
• In this method, all free space disk blocks are linked, keeping a
pointer to the first free block. Any file allocation method can use the
linked-list free space technique.
• There is small space overhead because there is no need for a disk
allocation table. In the above example, free blocks are 3, 4, 5, 7, 9, 14,
15, 19, 30, 31, 32, 35, 36, 37, 38. Here free block pointer is on block
number 3. Block 3 contains pointer to block 4, block 4 contains
pointer to block 5 and block 5 contains pointer to block 7 and so on.
• This method is not efficient because to reach free block 9, we have
to traverse blocks 3, 4, 5 and 7; then we will reach block 9.
• Disk will become quite fragmented after some use. Many portions
will be a single block long.
3. Grouping
• The first free block contains the addresses of n free blocks. The first
n-1 of these blocks are actually free. The last block contains the
address of another n free blocks.
• Consider the free blocks: 3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36,
37, 38
• When all the blocks in the group have been allocated, then use the
block that held the pointer.
• Because of grouping method, address of a large number of free
blocks can be found quickly.
4. Counting
• It keeps the address of the first free block and the number n of free
contiguous blocks that follow the first block. Each entry in the free
space list then consists of a disk address and a count.
8. Illustrate the functions of file and file implementation. (Nov/Dec 2015)
File system is implemented on disk and in memory. How the file
system is implemented varies with the operating system and the file
system, but all operating systems follow some general rules. If the
file system is implemented on the disk, it contains the following
information:
1. Boot block control: The boot block is maintained per volume and
contains the initial bootstrap program used to load the operating
system. The operating system requires some information at the time
of booting. If the disk is divided into a number of partitions,
the operating system is stored in the first partition of the disk. If the
operating system is not installed on the disk, then this block can be
empty. In the UNIX file system, this is called the boot block; in NTFS, it is called the partition boot sector.
2. Volume control block: It consists of volume or partitions detail
information. The information like block size, number of blocks in the
partition, free block count, free block pointer, free FCB count and
FCB pointers. In UNIX operating system, each partition is a
standalone file system. Super block is name in UNIX file system. A
super block describes the state of the file system: The total size of the
partition, the block size, pointers to a list of free blocks, the inode
number of the root directory, magic number, etc. In network file
system, it is stored in the master file table.
3. Directory structure: It is used to organize the files. Directory
structure is maintained per file system.
4. Per-file PCB: It contains information about files such as file size,
file ownership, file permission and location of data blocks. In NTFS,
master file table stored this information. Master table file uses a
relational database structure.
• In-memory information is used for caching and files system
management. Caching improve the performance.
1. In-memory mount table: It contains information about each
mounted volume.
2. In-memory directory structure cache: It contains recently
accessed directories information.
3. System wide open table: Open files FCB information is stored.
4. Per-process open file table: It maintains the pointer to the
appropriate entry in the system wide open file tables.
• UNIX operating system treats a directory exactly as a file. Windows
NT implements separate system calls for files and directories. It treats
directory as entities separate from files. Fig. 6.8.1 shows a file control
block.
• The FCB specifies the information that the system needs to manage
a file; it is sometimes called the file attributes. These are highly
system-dependent structures. To create a new file, an application
program calls the logical file system.
Optimizations for file system:
• Delayed updates of data and metadata are the main difficulty.
Updates could be delayed in the hope that the same data might be
updated again in the future, or that the updated data might be
temporary and might be deleted in the near future.
• If computer were to crash without having committed the delayed
updates, then the consistency of the file system is destroyed.

Metadata updates:
• A recoverable file system must be consistent after a crash, or must
be able to be made consistent. So it is necessary to ensure that logging
metadata updates keeps the file system in a consistent state.
• A file system becomes inconsistent when metadata is written
incompletely or when data structures are written in the wrong order.
With metadata logging, the writes are made to a sequential log.
• The complete transaction is written there before it is moved to the
file system structures. While updating data, file system crashed, the
updates can be completed based on the information in the log.

• The logging ensures that file system changes are made completely.
The order of the changes is guaranteed to be correct because of the
sequential writes to the log. If a change was made incompletely to the
log, it is discarded, with no changes made to the file system
structures.

9. Distinguish between a STREAMS driver and a STREAMS module. CO4 (Nov/Dec 2016)
STREAMS Driver:
1. Definition:
 A STREAMS driver is responsible for managing
communication between a device and the STREAMS
framework.
2. Functionality:
 Handles low-level device-specific operations, such as
sending and receiving data to and from the hardware.
3. Direct Interaction:
 Interacts directly with the device and translates device-
specific protocols into a format that STREAMS
modules can understand.
4. Examples:
 Device drivers for physical devices like network
interfaces, serial ports, or disk drives.
5. Provides:
 Provides a standard interface to the STREAMS
framework, allowing multiple devices to be accessed
in a uniform way.
STREAMS Module:
1. Definition:
 A STREAMS module is a software component that
processes data within the STREAMS framework.
2. Functionality:
 Performs specific operations on the data flowing
through the STREAM, such as filtering, buffering, or
transforming it.
3. Modularity:
 Can be dynamically loaded or removed from the
STREAM to modify or extend the behavior of data
processing.
4. Examples:
 Modules for protocol processing, encryption,
compression, or any other operation on the data
stream.
5. Interacts with:
 Interacts with other modules and the driver within the
STREAMS framework.
Relationship:
 Communication Flow:

 The STREAMS driver manages communication with
the hardware, while STREAMS modules operate on
the data flowing through the STREAMS framework.
 Data Processing:
 Drivers handle the low-level details of device
communication, whereas modules handle high-level
data processing within the framework.
 Flexibility:
 Modules provide flexibility by allowing dynamic
modification of data processing capabilities, while
drivers are typically more static and tied to specific
devices.
 Hierarchical Structure:
 The STREAMS framework allows multiple modules
to be stacked, forming a hierarchy that processes data
in a sequence.
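
To make the stacking concrete, the following is a minimal sketch of
how a user program pushes and pops a STREAMS module above a driver
on a System V STREAMS system such as Solaris; the device path
/dev/term/a and the ldterm line-discipline module are illustrative
choices, not prescribed by the question.

/* Minimal sketch: pushing a STREAMS module onto a stream.
 * Assumes a System V STREAMS environment (e.g., Solaris) where
 * <stropts.h> provides the I_PUSH/I_POP ioctls; the device path
 * and module name below are illustrative placeholders. */
#include <stropts.h>
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    int fd = open("/dev/term/a", O_RDWR);   /* open the driver end */
    if (fd < 0) { perror("open"); return 1; }

    /* Stack a line-discipline module above the driver. */
    if (ioctl(fd, I_PUSH, "ldterm") < 0)
        perror("I_PUSH");

    /* ... read()/write() now pass through the pushed module ... */

    if (ioctl(fd, I_POP, 0) < 0)            /* remove the top module */
        perror("I_POP");
    return 0;
}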

10. What are the various disk space allocation methods? Explain in detail. CO4 (Apr/May 2018)
• The primary issue in implementing file storage is how to keep
track of file blocks. A file consists of a collection of blocks.
File allocation
• The following issues are considered for file allocation:
1. When a new file is created, is the maximum space required for
the file allocated at once or not?
2. Space is allocated to a file as one or more contiguous portions.
3. What data structure or table is used to keep track of the portions
assigned to a file?
• Three major methods of allocating disk space are in wide use,
1. Contiguous 2. Linked 3. Indexed.

Contiguous Allocation
• When user creates a file, system allocates a set of contiguous
blocks on disk. This is a pre-allocation method that uses portion of
variable size. Contiguous allocation is simple method to implement.
It only search free list of correct number of consecutive blocks and
mark them as used block.
• Disk address is a linear ordering on the disk. Because of linear
ordering, accessing block b + 1 after block b normally requires no
head movement. Here head movement is only one track. Fig. 6.10.1
shows contiguous allocation.
• Contiguous allocation of a file is defined by the disk address and
the length of the on first block.
• If the file size is "n" blocks and the starting location is "L", then it
occupies blocks L, L+1, L+2, L+3, ......, L+(n-1). The directory
entry for each file indicates the address of the starting block and the
length of the area allocated for this file.

• Sequential and random access can be supported by contiguous
allocation.
• It is easy to retrieve a single block. To improve the I/O
performance of sequential processing, multiple blocks can be
brought in at a time. Contiguous allocation supports sequential
access very well because the file's entire data is stored in adjacent
blocks, and it also supports random access.
• Contiguous allocation suffers from external fragmentation. Small
free disk spaces are created by allocating and deleting files.
External fragmentation means free space is available but it is not
contiguous. To solve the problem of external fragmentation, the
compaction method is used.
• One more problem is how to calculate the space needed for a file;
it is difficult to estimate the required space in advance.
• This method is good for storing data on CD-ROM.
Characteristics of contiguous file allocation:
1. It supports variable-size portions.
2. Pre-allocation is required.
3. It requires only a single entry for a file.
4. Allocation frequency is only once.
Advantages:
1. It supports variable-size portions.
2. It is easy to retrieve a single block.
3. Accessing a file is easy.
4. It provides good performance.
Disadvantages:
1. It suffers from external fragmentation.
2. Pre-allocation is required.
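
As a small illustration of the L, L+1, ..., L+(n-1) mapping described
above, the sketch below (with illustrative names) shows why random
access under contiguous allocation is a single addition.

/* Sketch: logical-to-physical block mapping for contiguous
 * allocation. A file starting at block L of length n occupies
 * blocks L .. L+(n-1), so random access is one addition. */
typedef struct {
    unsigned start;   /* first disk block (L) */
    unsigned length;  /* number of blocks (n) */
} contig_file;

/* Returns the disk block holding logical block 'b' of the file,
 * or -1 if 'b' is past the end of the file. */
long contig_block(const contig_file *f, unsigned b)
{
    if (b >= f->length)
        return -1;
    return (long)(f->start + b);
}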

Linked Allocation
• Linked allocation is also called chained allocation. Operating
system keeps an ordered list of free blocks. File descriptor stores
pointers to the first block and each block stores pointer to the nest
block.
• Fig. 6.10.2 shows linked allocation. The disk blocks may be
scattered anywhere on the disk. The directory contains a pointer to
the first and last blocks of the file. No space is lost to disk
fragmentation.
• Creation of new file is easy. For new file, simply create new entry
in the directory. Reading a file is straightforward. User simple read
blocks by following pointers from block to block. There is no
external fragmentation with linked allocation.
• To write to file, system finds a free block, and this new block is
written to and linked to the end of the file.

• While creating a new file, it is not necessary to declare the size of
the file. A file can continue to grow as long as free blocks are
available.
• Compaction can be used so that the blocks of one file are located
contiguously on the disk. It optimizes disk access.
• The file allocation table (FAT) is an extension of the linked
allocation method. Instead of putting the pointers in the file blocks,
a table of pointers is kept. This pointer table can be quickly
searched to find any random block in the file.
• Fig. 6.10.3 shows the file allocation table. All blocks on the disk
must be included in the table. This method is used by the Windows
operating system.
(Refer Fig. 6.10.3 on next page)
Characteristics
1. It supports fixed size portions.
2. Pre-allocation is possible.
3. File allocation table is one entry for a file.
4. Allocation frequency is low to high.
Advantages:
1. There is no external fragmentation.
2. It is never necessary to compact disk space.
3. Pre-allocation is not required.

Disadvantages:
1. Files are accessed only sequentially.
2. Space is required for pointers.
3. Reliability is not good.
4. It cannot support direct access.
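
The pointer-chasing cost that makes direct access hard can be seen
in a short sketch of a FAT walk; the table layout and the FAT_EOF
marker value here are assumptions for illustration.

/* Sketch: following a file's chain in a file allocation table.
 * fat[i] holds the number of the block after block i; the value
 * FAT_EOF (chosen here as an assumption) marks the last block. */
#define FAT_EOF 0xFFFFFFFFu

/* Walk from the file's first block to its k-th logical block.
 * Access is inherently sequential: k table lookups are needed. */
unsigned fat_block(const unsigned *fat, unsigned first, unsigned k)
{
    unsigned b = first;
    while (k-- > 0 && b != FAT_EOF)
        b = fat[b];
    return b;             /* FAT_EOF if the file is shorter than k */
}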

Indexed Allocation
• Indexed allocation method solves the problem of both contiguous
and linked allocation. It uses concept of index block. Index block
stores the entire pointer in one location. But the index block will
occupy some space and thus could be considered as an overhead of
the method.
• OS keeps a list of free blocks. It allocates an array to hold pointers
to all the blocks used by the file. It allocates blocks only on demand.
Fig. 6.10.4 shows indexed allocation.
• In indexed allocation, each file maintains its own index block. It
contains an array of disk sector of addresses. For example: The nth
entry in the index block points to the nth sector of the file. The
directory contains the address of the index block of a file. To read
the nth sector of the file, the pointer in the nth index block entry

• It supports direct access without suffering from external
fragmentation. Any free block anywhere on the disk may satisfy a
request for more space.
• Indexed allocation does suffer from wasted space. The pointer
overhead of the index block is generally greater than the pointer
overhead of linked allocation.
Advantages:
1. It supports sequential and direct access.
2. No external fragmentation.
3. Faster than other two methods.
4. It supports fixed and variable size blocks.
Disadvantages:
1. Indexed allocation suffers from wasted space.
2. Pointer overhead is generally greater.
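
By contrast, a sketch of an index-block lookup (names and sizes
illustrative) shows that indexed allocation reaches the nth block of
a file with one array reference.

/* Sketch: indexed allocation lookup. The index block is an array
 * of disk-block addresses; entry n points to the file's n-th block,
 * so direct access is one array reference. */
#define INDEX_ENTRIES 128        /* e.g., 512-byte block / 4-byte ptr */

typedef struct {
    unsigned ptr[INDEX_ENTRIES]; /* 0 marks an unallocated entry */
} index_block;

long indexed_block(const index_block *ib, unsigned n)
{
    if (n >= INDEX_ENTRIES || ib->ptr[n] == 0)
        return -1;               /* hole or past end of file */
    return (long)ib->ptr[n];
}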
11. Consider a file system where a file can be deleted and the disk
space reclaimed while the links to that file still exist. What
problems may occur if a new file is created in the same storage
area or with the same absolute path name? How can these
problems be avoided? CO4 (Nov/Dec 2015)
In a file system where a file can be deleted, and the disk space
reclaimed while the links to that file still exist (which is often
referred to as "unlinking" or "deleting a file without removing its
links"), certain problems may arise when a new file is created in the
same storage area or with the same absolute path name. Here are
some potential issues and ways to avoid them:
Problems:
Data Integrity:
If a new file is created in the same storage area or with the same
absolute path name as the previously deleted file, there may be
remnants of the old file's data or metadata left in the storage,
leading to potential data integrity issues.
Security Risks:
If sensitive information from the deleted file is still present in the
storage and a new file is created, it could pose security risks as
unauthorized users might gain access to the old data.
Confusion:
Users or applications expecting a clean slate when creating a new
file in a location where a file was previously deleted may encounter
unexpected remnants, causing confusion.
Solutions:
Zeroing or Scrubbing Deleted Space:
When a file is deleted, the file system should zero out or scrub the
previously occupied space to ensure that remnants of the old data
are not accessible. This can be part of the file deletion process.
Delaying Reuse of Disk Space:
The file system can implement a mechanism to delay the reuse of
disk space previously occupied by a deleted file. This ensures that

the space is cleared before allocating it to a new file.
File System Notifications:
Implement file system notifications to inform applications or
processes about changes in the file system. Applications can be
notified when a file is deleted, and they can take appropriate actions
to clean up or avoid potential issues.
Versioning or File ID Changes:
Consider implementing file versioning or changing file identifiers
when a file is deleted. This ensures that a new file created with the
same name or path is distinct from the deleted one.
Proper File Deletion API:
Encourage the use of proper file deletion APIs or commands that
ensure complete removal of a file and its associated metadata.
Educate users about the implications of unlinking or deleting files
without removing their links.
Regular Disk Maintenance:
Schedule regular disk maintenance tasks that check for and clean
up any remnants of deleted files. This can be part of routine file
system maintenance procedures.
12. Illustrate an application that could benefit from operating
system support for random access to indexed files. CO4 (Nov/Dec 2015)
An application that could benefit significantly from operating system
support for random access to indexed files is a Database
Management System (DBMS). In a DBMS, data is organized into
tables, and each table may contain a large number of records.
Random access to indexed files is crucial for efficient querying and
retrieval of specific records within these tables. Here's how such an
application could benefit:
Database Management System (DBMS):
1. Data Retrieval Efficiency:
 In a DBMS, tables often store large volumes of data. When
users query the database to retrieve specific records based on
certain criteria, random access to indexed files allows the
system to quickly locate and retrieve the relevant records.
2. Indexing Primary and Secondary Keys:
 Primary keys and secondary keys are often used to uniquely
identify records within a table or to speed up specific types of
queries. Indexing these keys allows the DBMS to provide rapid
access to records based on these key values.
3. Complex Queries:
 In a database with multiple tables and relationships, complex
queries involving joins and conditions can be efficiently
executed by leveraging indexes. Random access to indexed
files allows the DBMS to quickly navigate through the data
based on indexed attributes.
4. Sorting and Range Queries:
 Indexed files support efficient sorting and range queries. For

example, retrieving all records within a specific date range or
sorting records based on a particular attribute becomes faster
when the DBMS can randomly access indexed files.
5. Concurrency Control:
 Random access to indexed files is crucial for implementing
concurrency control mechanisms in a multi-user database
environment. Concurrent transactions may need to access and
modify different records simultaneously, and indexes help in
efficiently managing these operations.
6. Data Integrity:
 Indexed files contribute to maintaining data integrity by
enforcing uniqueness constraints on key attributes. This is
crucial for preventing duplicate records and ensuring the
accuracy of data stored in the database.
7. Query Optimization:
 A DBMS can use query optimization techniques, such as
choosing the most efficient index for a specific query, based on
the access patterns of the application. Random access to
indexed files supports these optimization strategies.
8. Reduced Disk I/O:
 Random access to indexed files minimizes the need for
scanning the entire table during searches, reducing disk I/O
operations. This is particularly important for large databases
where quick access to specific records is essential for
performance.

PART C
1. Discuss the disk scheduling methods. CO4 (Nov/Dec 2017, 2018, 2019)
Disk Scheduling
• Disk scheduling algorithms are used to reduce the total seek time
of any request. An I/O request issues a system call to the operating
system. If the desired disk drive or controller is idle, the request is
served immediately. If the device is busy, the request for service is
placed in the queue of pending requests.
• When one request is completed, the operating system has to
choose which pending request to service next.
• The processor is much faster than the disk, so it's highly likely that
there will be multiple outstanding disk requests before the disk is

ready to handle them. Because disks have non-uniform access times,
re-ordering disk requests can greatly reduce the latency of disk
requests.
• Multiple requests to disk will arrive while one is being serviced.
Disk scheduling increases the disk's bandwidth.
• To service a request, a disk system requires that the head be
moved to the desired track, then a wait for the rotational latency,
and finally the transfer of data. In the following section we will
examine several scheduling algorithms:
1. FIFO 2. SSTF 3. SCAN 4. C-SCAN 5. LOOK 6. C-LOOK

First In First Out


• FIFO is also called the first-come, first-served (FCFS) method.
Requests are processed in queue order. Fig. 5.4.1 shows the FCFS
algorithm. (Refer Fig. 5.4.1 on next page.)
• FCFS has a fair policy in the sense that once a request has arrived,
its place in the schedule is fixed irrespective of arrival of a higher
priority request.
• The requested tracks in the order received are: 55, 58, 39, 18, 90,
160, 150, 38 and 184. The starting track is 100 and the total number
of tracks is 200. FCFS processes all the requests in sequential order.
• FCFS would begin at track 100, move 45 tracks to 55, move 3
tracks to 58, and so on. In Fig. 5.4.1, the Y-axis corresponds to the
tracks on the disk and the X-axis corresponds to time, or the number
of tracks traversed.
• Starting at track 100, the tracks are next accessed in arrival order:
55, 58, 39, 18, 90, 160, 150, 38, 184.

• FCFS is simple to implement, but it does not provide fast service.
It cannot take special action to minimize the overall seek time.
Advantages:
1. Simple and easy to implement.
2. Suitable for light loads.

Disadvantages:
1. FCFS does not provide fast service.
2. It does not maximize throughput.
3. It may involve lots of unnecessary seek distance.
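
The total head movement for the queue above can be computed with a
few lines of C; this sketch simply sums |next - current| over the
arrival order and, for this queue, prints 498 tracks.

/* Sketch: total head movement under FCFS for the request queue
 * used in the text (head starts at track 100). */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int req[] = {55, 58, 39, 18, 90, 160, 150, 38, 184};
    int n = sizeof req / sizeof req[0];
    int cur = 100, total = 0;

    for (int i = 0; i < n; i++) {      /* serve in arrival order */
        total += abs(req[i] - cur);
        cur = req[i];
    }
    printf("FCFS total head movement: %d tracks\n", total);
    return 0;
}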
Shortest Seek Time First
• SSTF selects as the next request the one requiring the minimum
seek time from the current position. Fig. 5.4.2 shows the SSTF
algorithm.

• In Shortest-Seek-Time-First (SSTF) scheduling priority is given to
those processes which need the shortest seek, even if these requests
are not the first ones in the queue.
• It means that all requests nearer to the current head position are
serviced together before moving the head to distant tracks.
• Select the disk I/O request that requires the least movement of the
disk arm from its current position and always choose the minimum
seek time.

• The only tricky part is if there are two jobs with the same distance.
In this case, some kind of tie-breaking needs to be employed to pick
one. For instance, you could just use a random number to effectively
flip a coin and pick one.
• SSTF does not ensure fairness and can lead to indefinite
postponement because its seek pattern tends to be highly localized.
• Under heavy load, SSTF can prevent distant requests from ever
being serviced. This phenomenon is known as starvation.
• SSTF scheduling is essentially a form of shortest-job-first
scheduling. SSTF scheduling algorithms are not very popular
because

of two reasons.
1. Starvation may occur. 2. It incurs higher overheads.
Advantages:
1. Throughput is better than FCFS.
2. SSTF minimizes the response time.
3. Fewer head movements.
Disadvantages:
1. One major issue is STARVATION.
2. The seek pattern localizes under heavy load.
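
A corresponding sketch of SSTF repeatedly picks the pending request
nearest the current head position; for the same queue it serves 90,
58, 55, 39, 38, 18, 150, 160, 184 and prints a total of 248 tracks.

/* Sketch: SSTF ordering for the same request queue (head starts
 * at track 100). Ties take the first nearest request found. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int req[] = {55, 58, 39, 18, 90, 160, 150, 38, 184};
    int n = sizeof req / sizeof req[0];
    int done[9] = {0};                 /* 9 matches the queue above */
    int cur = 100, total = 0;

    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++)    /* find nearest pending track */
            if (!done[i] && (best < 0 ||
                abs(req[i] - cur) < abs(req[best] - cur)))
                best = i;
        done[best] = 1;
        total += abs(req[best] - cur);
        cur = req[best];
        printf("%d ", cur);            /* service order */
    }
    printf("\nSSTF total head movement: %d tracks\n", total);
    return 0;
}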
SCAN
• SCAN is also called elevator algorithm.
• The next request scheduled is closest to current request but in one
particular direction. All requests in other direction are put at the end
of the list.
• SCAN services tracks in only one direction i.e. either increasing or
decreasing track number.
• When SCAN reaches the edge of the disk (or track 0), it reverses
direction.
• If any request comes on its way it will be serviced immediately,
while a request arriving just behind the head will have to wait until
the disk head moves to the end of the disk, reverses direction and
returns, before being serviced.
• SCAN is called the elevator algorithm because an elevator
continues in one direction servicing requests before reversing
direction.

• Consider the example in which the requested tracks in the order
received are: 55, 58, 39, 18, 90, 160, 150, 38, and 184. The starting
track is 100 and the total number of tracks is 200.
• Current head position = 100.
• The head is moving towards increasing track numbers. Fig. 5.4.3
shows the SCAN algorithm. (Refer Fig. 5.4.3 on previous page.)

Disadvantages:
1. Increased overhead.
2. Needs a directional bit.

Circular SCAN
• C-SCAN also moves the head in one direction, but it offers fairer
service with more uniform waiting times.
• It moves the head from one end of the disk to the other end,
servicing requests along the way. When the head reaches the other
end, it immediately returns to the beginning of the disk, without
servicing any requests on the return trip.
• C-SCAN treats the cylinders as a circular list that wraps around
from the last cylinder to the first one. It scans in cycles, always in
increasing or decreasing order. Fig. 5.4.4 shows C-SCAN. Consider
the example in which the requested tracks in the order received are:
55, 58, 39, 18, 90, 160, 150, 38 and 184. The starting track is 100
and the total number of tracks is 200. The head is moving towards
increasing track numbers.

LOOK Scheduling
• Start the head moving in one direction. Satisfy the request for the
closest track in that direction. When there are no more requests in
the direction the head is traveling, reverse direction and repeat.
• This algorithm is similar to SCAN, but unlike SCAN, the head
does not unnecessarily travel to the innermost and outermost tracks
on each circuit.
• Fig. 5.4.5 shows the LOOK scheduling algorithm.

• For example, the disk request queue contains a set of references for
blocks on tracks 76, 124, 17, 269, 201, 29, 137 and 12.

• Current head position = 76.

C-LOOK
• C-LOOK is the same as C-SCAN except that the head stops after
servicing the last request in the preferred direction, then services
the request at the cylinder nearest the opposite side of the disk.
• Example: Consider the example in which the requested tracks in
the order received are 23, 89, 132, 42, 187. The starting track is 100
and there are 200 cylinders numbered from 0 - 199.

2. Write briefly about file attributes, operations, types and structure. CO4
Files are the building blocks of any operating system. A file is the
permanent storage of information and data, and it provides an easy
means of information sharing. The operating system is not
interested in the information stored in files; it maps files onto
physical devices.
• A file is an unstructured sequence of data, and a logical unit of
information created by processes. Information stored in files must
be persistent. Physically, a file is organized information stored on
secondary storage.

• A program which a user prepares is a file. Files represent
programs and data. The output from a compiler may be an object
code file or an executable file.
• The content of a file is decided by its creator. Files contain
various types of data, such as source programs, object programs,
text and numeric data, images, graphs, sounds, payroll records,
dead-stock records, etc.

File Naming
• Files store information on the disk; the file is an abstract
mechanism. A file name is given when a process creates a file. The
file name continues to exist even after the file is closed and can be
used by other processes.
• File naming rules vary with the operating system. Some operating
systems allow only up to 8 characters in a file name; special
characters and numbers may be allowed, and many file systems
support names as long as 255 characters. A file name contains two
parts, separated by a period. For example: sorting.c
• The part of the file name after the period is called the file
extension. The file extension indicates something about the file.
The UNIX operating system supports more than one file extension;
it is up to the user how to extend the file name.
• MS-DOS supports only one file extension, which contains up to
three characters after the period. Following is the list of file
extensions.

Types of File
• The file type represents the internal structure of the file.
According to structure, files are divided into certain types: text,
source, executable and object files.
1. Text file: A sequence of characters organized into lines. A text
file is a regular file.
2. Source file: It contains sequences of subroutines, procedures and
functions.
3. Executable file: It contains a series of code sections. This file is
input to the loader, which loads it into memory and executes it.
4. Object file: An object file is input to the linker. It is a binary file
that the compiler creates by converting the source code to object
code.

File Types
File is divided into following types:
1. Ordinary file
2. Directory file
3. FIFO file
4. Character device file
5. Block device file.
Ordinary file
• Also called a regular file. It contains only data as a stream of
characters. An ordinary file cannot contain another file or directory.
• It is either a text file or a binary file.
• Examples of ordinary files include simple text files, application data
files, files containing high-level source code, executable text files,
and binary image files.
• Text file: It contains only printable characters.
• Binary file: It contains printable and unprintable characters that
cover the entire ASCII range.
• Regular files may be created, browsed through, and modified by

various means such as text editors.
Directory file
• A directory file is like a folder that contains other files, including
subdirectory files.
• Directory files act as a container for other files, of any category.
Directory files don't contain data in the user sense of data; they
merely contain references to the files contained within them.
• A directory is created in UNIX by the mkdir command.
• A UNIX directory is considered empty if it contains no files other
than "." and "..".
• A directory is removed using the rmdir command. The content of
a directory is displayed by the ls command.
Device file
• Special files are also known as device files. In UNIX all physical
devices are accessed via device files; they are what programs use to
communicate with hardware.
• Files hold information on location, type, and access mode for a
specific device. There are two types of device files; character and
block.
• Block device files are used to access block device I/O. Block
devices do buffered I/O, meaning that the data is collected in a
buffer until a full block can be transferred.
• Character device files are associated with character or raw device
access. They are used for un-buffered data transfers to and from a
device. Rather than transferring data in blocks the data is transferred
character by character.
• One transfer can consist of multiple characters.
• Some devices, such as disk partitions, may be accessed in block or
character mode. Because each device file corresponds to a single
access mode, physical devices that have more than one access mode
will have more than one device file.
• Device files are found in the /dev directory. Each device is
assigned a major and a minor device number.
• The major device number identifies the type of device, i.e. all SCSI
devices, would have the same number as would all the keyboards.

• The minor device number identifies a specific device, i.e., the
keyboard attached to a particular workstation.
• Device files are created using the mknod command.
• Examples of character device files are printers, modems and
consoles.
FIFO file
• A FIFO file is a special pipe device file which provides temporary
buffers for two or more processes to communicate by writing data
to and reading data from the buffer.
• A FIFO is created by the mknod system call and may be opened and
accessed by any process that knows the name and has the
permissions.
• The buffer associated with a FIFO file is allocated when the first
process opens the FIFO file for read or write. The buffer is discarded
when all processes which are connected to the FIFO close their
reference to the FIFO file.
• FIFO file may be removed like any regular file. FIFO files can be
removed in UNIX via the rm command.
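
A minimal sketch of creating and reading a FIFO with the POSIX
mkfifo() call follows; the path /tmp/demo_fifo is an illustrative
placeholder, and a separate writer process would open the same path
for writing.

/* Sketch: creating and using a FIFO (named pipe) via POSIX
 * mkfifo(); the path below is an illustrative placeholder. */
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    if (mkfifo("/tmp/demo_fifo", 0666) < 0)
        perror("mkfifo");                       /* may already exist */

    int fd = open("/tmp/demo_fifo", O_RDONLY);  /* blocks for a writer */
    if (fd < 0) { perror("open"); return 1; }

    char buf[64];
    ssize_t got = read(fd, buf, sizeof buf);    /* data in FIFO order */
    if (got > 0)
        write(STDOUT_FILENO, buf, (size_t)got);

    close(fd);
    unlink("/tmp/demo_fifo");   /* remove the FIFO like a regular file */
    return 0;
}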

File Attributes
• One of the characteristics of a file is its set of file attributes,
which give the operating system more information about the file
and how it is intended to be used. For human users' convenience, a
name is given to the file. File attributes vary from system to system.
• Typical file attributes are as follows:
1. Name
2. Identifier
3. Type
4. Location/place
5. Size
6. Protection
7. Date and Time
• The file name is human-readable. A user can perform any
operation on a file using its name.
• When a file is created, the operating system assigns a unique
identification number to it. The OS uses this identifier for its

operation.
• Type information is required for systems that support different
types of files.
• Location information is a pointer to a device and to the actual
location of the file on that device. The information needed to locate
the file on disk is kept in memory.
• Size is measured in bits or bytes. It gives the current size of the
file.
• Protection: Access-control information determines who can read,
write or execute the file. Attributes like protection, password,
creator and owner provide protection for a file.
• Date and time: This information relates to the creation and
modification of the file. It is used by the operating system for
protection, security and monitoring purposes.
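
The attributes above can be pictured as fields of an FCB-like record;
the following sketch is illustrative only, since real structures (the
UNIX inode, the NTFS MFT record) differ in layout and detail.

/* Sketch: an FCB-like record holding the attributes listed above.
 * Field names and sizes are illustrative assumptions. */
#include <time.h>
#include <sys/types.h>

struct fcb {
    char     name[256];     /* human-readable file name            */
    ino_t    id;            /* unique identifier assigned by OS    */
    int      type;          /* regular, directory, device, ...     */
    off_t    size;          /* current size in bytes               */
    mode_t   protection;    /* owner/group/other permission bits   */
    uid_t    owner;         /* creator / owner                     */
    time_t   created;       /* creation time                       */
    time_t   modified;      /* last modification time              */
    unsigned first_block;   /* location of data on the device      */
};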

Operation on File
• File operations are simply those things that a user can perform on
a file. For example, a user can create a file, save a file, open a file,
read a file and modify a file. There are many different types of file
operations supported by the operating system.
• The operating system provides system calls for performing
various operations on a file. The six basic file operations are create,
write, read, delete, reposition and truncate.
• The following is a short list of file operations, expressed with
POSIX calls in the sketch after this list:
1. Create
2. Write
3. Close
4. Read
5. Delete
6. Reposition
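
A minimal sketch expressing these operations with POSIX calls (the
file name is an illustrative placeholder):

/* Sketch: the basic file operations via POSIX system calls. */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("notes.txt", O_CREAT | O_RDWR, 0644); /* create  */
    if (fd < 0) return 1;

    write(fd, "hello\n", 6);          /* write                     */
    lseek(fd, 0, SEEK_SET);           /* reposition (seek)         */

    char buf[6];
    read(fd, buf, sizeof buf);        /* read                      */

    ftruncate(fd, 0);                 /* truncate to zero length   */
    close(fd);                        /* close                     */
    unlink("notes.txt");              /* delete                    */
    return 0;
}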

File Structure
• File structure is represented by field, record, file and database. It is
also represented by byte sequence, record sequence and tree.

• Fig. 6.1.1 shows the three different kinds of files. UNIX and
Windows both use the unstructured byte sequence.
• A byte sequence is an unstructured file. The operating system has
nothing to do with the content of the file; all content is in byte
format. The Windows and UNIX operating systems use this byte
sequence method. It provides more flexibility to the user, who can
put anything in the file and assign it whatever name they want.

• In the record sequence model, a file contains a sequence of
fixed-length records. Read and write operations on this type of file
are performed one record at a time. The model uses the idea of the
80-column punched card: many operating systems based their file
systems on files consisting of 80-character records.
• The record sequence model also supports 132-character records;
line printers use 132 characters for printing.
• The last method is the tree structure. The file consists of a tree of
records. Each record contains a key field in a fixed position; the key
field is used to sort the tree of records.
• Field, record, file and database are the units used in these file
structures. A field is

a basic element of data. A person's last name and a date are
examples of fields. A field contains a single value. Length and data
type are the characteristics of a field.
• A collection of related fields is a record. A student record contains
name, address, mobile number, university exam number, college
name, etc. Records may be of fixed or variable length, and a record
may also contain one or more sub-records.
• A file is a collection of similar records. From the user's view, a
file is a single entity referenced by name.
• A database is a collection of related data; all records in a database
have some relation. An engineering college database contains
information about the college, students, staff, university, syllabus,
timetable, fee structure, various departments, etc.

3. Discuss in detail about magnetic disk

Magnetic Disk
• To store computer data, hard disks, flash memory, magnetic tape
and optical discs are used.
• Alternate storage for the hard disk is the Solid State Disk (SSD).
These flash-memory-based devices offer a different set of tradeoffs
from a standard disk. At the same time, the capacity of hard disks
keeps increasing.
• A hybrid category has been introduced for storage media: hard
disks with large flash memory buffers. Disk sizes are specified in
gigabytes.
• The performance of hard disk and SSD technology is given
below:

• A hard disk contains several rotating platters coated with
magnetic film. They are read and written by tiny read/write heads
mounted on a metal arm that swings back and forth to position
them. The heads float close to the surface of the platters but do not
actually touch them.
• A hard disk is really a set of stacked "disks," each of which, like
phonograph records, has data recorded electromagnetically in
concentric circles or "tracks" on the disk.
• A "head" writes or reads the information on the tracks. Two heads,
one on each side of a disk, read or write the data as the disk spins.
Each read or write operation requires that data be located, which is an
operation called a "seek."
• A hard disk/drive unit comes with a set rotation speed varying from
4500 to 7200 rpm. Disk access time is measured in milliseconds.
• Although the physical location can be identified with cylinder, track
and sector locations, these are actually mapped to a Logical Block
Address (LBA) that works with the larger address range on today's
hard disks.
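
The conventional CHS-to-LBA mapping can be written in one line; in
this sketch the geometry parameters are illustrative, and sectors are
numbered from 1 while heads and cylinders are numbered from 0.

/* Sketch: conventional mapping from a (cylinder, head, sector)
 * triple to a logical block address. */
unsigned lba_from_chs(unsigned c, unsigned h, unsigned s,
                      unsigned heads_per_cyl,
                      unsigned sectors_per_track)
{
    return (c * heads_per_cyl + h) * sectors_per_track + (s - 1);
}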
• Mean Time Between Failures (MTBF) is the predicted elapsed time
between inherent failures of a system during operation. MTBF can be
calculated as the arithmetic mean (average) time between failures of a
system.
• Each disk drive is managed by a controller, and each type of
controller can support a fixed number of drives. SCSI controllers
support up to seven or up to fifteen disks per controller, depending
on the SCSI version.
• Each disk is assigned a drive address. This address is set by a
switch, a dial or jumpers on the disk or by the physical location of the
disk. Some SCSI devices, such as RAID's, have an additional
identifying number called a logical unit number. It is used to address
disks within the device.
• Fig. 5.1.1 shows physical disk.

• A disk consists of circular plates called platters. Each platter has an
upper and lower oxide-coated surface. Recording heads, at least one
per surface, are mounted on arms that can be moved to various radial
distances from the center of the platters. The heads float very close to
the surfaces of the platters, never actually touching them, and read
and record data as the platters spin around.
• A ring on one surface is called a track. Each track is divided into
disk blocks. Sometimes called sectors, these physical blocks on a disk
are different from file system blocks.
• Formatting a disk divides the disk into tracks and disk blocks that
can be addressed by the disk controller, writes timing marks, and
identifies bad areas on the disk.
• SCSI disk drives are shipped preformatted. They do not require
formatting at any time. Bad block handling is performed
automatically by SCSI disks. Bad blocks are areas of a disk that
cannot reliably store data.
Platter
• Platter is a circular, metal disk that is mounted inside a hard disk

drive. Several platters are mounted on a fixed spindle motor to create
more data storage surfaces in a smaller area.
• The platter has a core made up of an aluminum or glass substrate,
covered with a thin layer of ferric oxide or cobalt alloy. On both
sides of the substrate material, a thin coating is deposited by a
special manufacturing technique. This thin coating, where the
actual data is stored, is the media layer.
• Attributes of platter
1. It is a rigid, round disk that is coated with magnetically sensitive
material.
2. Data is stored in binary code.
3. It is encoded by polarizing magnetic areas on the disk surface.
4. Data can be written to and read from both surfaces of a platter.
5. A platter's storage capacity varies across drives.
Tracks
• Each platter is broken into thousands of tightly packed concentric
circles, known as tracks. These tracks resemble the structure of
annual rings of a tree.
• All the information stored on the hard disk is recorded in tracks.
Starting from zero at the outer side of the platter, the number of tracks
goes on increasing to the inner side.
• Each track can hold a large amount of data counting to thousands of
bytes.
Sectors
• Each track is further broken down into smaller units called
sectors. A sector is the basic unit of data storage on a hard disk.
The disk contains concentric tracks, and tracks are divided into
sectors. A sector is the smallest addressable unit on a disk.
• Fig. 5.1.2 shows surface of disk showing tracks and sectors.

• A single track typically can have thousands of sectors and each
sector can hold more than 512 bytes of data. A few additional bytes
are required for control structures and error detection and correction.
Clusters: Sectors are often grouped together to form clusters.
Read/Write Heads
• The heads are an interface between the magnetic media where the
data is stored and electronic components in the hard disk. The heads
convert the information, which is in the form of bits to magnetic
pulses when it is to be stored on the platter and reverses the process
while reading.
• The heads are the most sophisticated part of the hard disk. Each
platter has two read/write heads, one mounted on the top and the
other one at the bottom. These heads are mounted on head sliders,
which are suspended at the ends of the head arms.
• The head arms are all fused into a singular structure called actuator,
which is responsible for their movement. Drives rotate at 60 to 200
times per second.
• Transfer rate is rate at which data flow between drive and computer.
Positioning time (random-access time) is time to move disk arm to
desired cylinder (seek time) and time for desired sector to rotate
under the disk head. Head crash results from disk head making
contact with the disk surface.

• Each platter is coated with magnetic material on both surfaces.
Each platter surface has an arm extending from a fixed position; the
tip of the arm contains a read/write head for reading or writing
data.
• When a program reads a byte from the disk, the operating system
locates the surface, track and sector containing that byte, and reads
the entire sector into a special area in main memory called buffer.
• The bottleneck of a disk access is moving the read/write arm.
• A cylinder is the set of tracks at a given radius of a disk pack. A
cylinder is the set of tracks that can be accessed without moving the
disk arm. All the information on a cylinder can be accessed without
moving the read/write arm.
Fig. 5.1.3 shows moving-head disk mechanism.

• The arm assembly is moved in or out to position a head on a
desired track. The tracks under the heads make a cylinder. Only one
head reads/writes at any one time. Block size is a multiple of the
sector size.
• Disks can be removable. The drive is attached to the computer via
an I/O bus. Buses vary, including EIDE, ATA, SATA, USB, Fibre
Channel, SCSI, etc.
• Host controller in computer uses bus to talk to disk controller built

into drive or storage array.
• The disk controller is typically embedded in the disk drive; it acts
as an interface between the CPU and the disk hardware. The
controller has an internal cache that it uses to buffer data for
read/write requests.
Zone Bit Recording
• Also known as multiple zone recording or zone-CAV recording.
Disk divided into zones. Fig. 5.1.4 shows zone bit recording.

• Cylinders in different zones have a different number of sectors;
the number of sectors within a particular zone is constant. Data is
buffered so the data rate to the I/O interface is constant.
• Zoned-bit recording uses the disk more efficiently. It groups tracks
into zones that are based upon their distance from the center of the
disk.
• Each zone is assigned an appropriate number of sectors per track.
This means that a zone near the center of the platter has fewer sectors
per track than a zone on the outer edge.

4. Discuss in detail about magnetic tapes CO4
• Magnetic tapes are used mostly for storing files of data: for
example, a company's payroll records.
• Access is sequential and consists of records that can be accessed
one after another as the tape moves along a stationary read-write
mechanism.
• It is one of the cheapest and slowest methods for storage and has
the advantage that tapes can be removed when not in use.
• Tape is a magnetically coated strip of plastic on which data can be
encoded. It is much less expensive than other storage media but is
commonly a much slower solution, typically used for backup.
Fig. 5.1.5 shows the connection of a tape with the processor.

• In addition, random access to magnetic tape is about a thousand
times slower than random access to magnetic disk, so tapes are not
very useful for secondary storage.
• The I/O bus consists of data lines, address lines, and control lines.
Each peripheral device has associated with it an interface unit. Each
interface decodes the address and control received from the I/O bus,
interprets them for the peripheral, and provides signals for the
peripheral controller.
• The I/O bus from the processor is attached to all peripheral
interfaces. To communicate with a particular device, the processor
places a device address on the address lines. Each interface attached
to the I/O bus contains an address decoder that monitors the address

lines.
• When the interface detects its own address, it activates the path
between the bus lines and the device that it controls.
• All peripherals whose address does not correspond to the address
on the bus have their interfaces disabled. At the same time that the
address is made available on the address lines, the processor
provides a function code in the control lines.
• The interface selected responds to the function code and proceeds to
execute it. The function code is referred to as an I/O command and is
in essence an instruction that is executed in the interface and its
attached peripheral unit.
• The interpretation of the command depends on the peripheral that
the processor is addressing. There are four types of commands that
an interface may receive. They are classified as control, status, data
output, and data input.

UNIT V VIRTUAL MACHINES AND MOBILE OS
Virtual Machines – History, Benefits and Features, Building Blocks, Types of Virtual
Machines and their Implementations, Virtualization and Operating-System Components;
Mobile OS - iOS and Android.
PART - A
S.No. Question and Answer CO Univ QP
(Month /
Year)
1. Define Virtual Machine. CO5
A virtual Machine (VM) is a software program or operating system
that not only exhibits the behavior of a separate computer, but is also
capable of performing tasks such as running applications and
programs like a separate computer.
2. What is a pure virtual machine?
In a pure virtual machine architecture the operating system gives each
process the illusion that it is the only process on the machine. The
user writes an application as if only its code were running on the
system.
3. How does a user interacts with a virtual machine?
Each user interacts with the computer by typing commands to the
virtual machine on a virtual system console and receiving results back
from the machine as soon as they are computed.
4. What is virtualization?
Virtualization is an abstraction layer that decouples the physical
hardware from the operating system to deliver greater IT resource
utilization and flexibility.
• It allows multiple virtual machines, with heterogeneous operating
systems to run in isolation, side-by-side on the same physical
machine.

5. Draw the architecture of virtual machine.

6. Mention the benefits of virtual machine.

1. There is no overlap amongst memory, as each virtual machine
has its own memory space.
2. Virtual machines are completely isolated from the host machine
and other virtual machines.
3. Data does not leak across virtual machines.
4. Can use multiple operating system environments on the same
computer
5. The cost reduction is possible using small virtual servers on a more
powerful single server.
7. List out the disadvantages of virtual machine.
1. Virtual machines are less efficient than real machines because they
access the hardware indirectly.
2. A virtual machine can be infected with the weaknesses of the host
machine
3. Difficulty in direct access to hardware, for example, specific cards
or USB devices
4. Great use of disk space, since it takes all the files for each
operating system installed on each virtual machine.
8. What is VM/370?
VM/370 is an operating system that gives multiple users access to a
computer by means of keyboard and display terminals for time
sharing, system testing, production and conversion. VM/370 manages
the resources of a computer so that every user, local or remote,
appears to have a complete replica of a System 370 including input
output (I/O) devices.
9. Diagrammatically represent trap and emulate method

10. What happens when the kernel tries to execute a privileged
instruction?
When the kernel in the guest attempts to execute a privileged
instruction, that is an error because the system is in user mode and
causes a trap to the VMM in the real machine. The VMM gains
control and executes the action that was attempted by the guest kernel
on the part of the guest. It then returns control to the virtual machine.
This is called the trap-and-emulate method
11. Write about binary translation
Binary translation can be static or dynamic. Static binary
translation means using a processor to translate an image from one
architecture to another before execution. In dynamic binary
translation, individual instructions or groups of instructions are
translated on the fly, and the translation is cached to allow reuse in
iterations without repeated translation.
12. What is hardware assisted virtualization?
In hardware assisted virtualization the virtual layer sits in a new
root-mode privilege level under level 0. Guest O/S privileged and
sensitive calls are set to automatically trap to the hypervisor, while
user requests are executed directly on the CPU for high
performance.
13. Define hypervisor.
A hypervisor is a software layer installed on the physical hardware,
which allows splitting the physical machine into many virtual
machines. This allows multiple and operating systems to be run
simultaneously on the same physical hardware.
14. What is VMM?
A hypervisor management console, which is also called a virtual
machine manager (VMM), is computer software that enables easy
management of virtual machines.
15. What is Paravirtualization?
Paravirtualization is a type of virtualization in which guest operating
system (OS) is recompiled, installed inside a Virtual Machine (VM),
and operated on top of a hypervisor program running on the host OS.
16. Draw the Paravirtualization architecture

17. List out the Problems with para-virtualization


1. Para-virtualized systems won't run on native hardware.
2. There are many different para-virtualization systems that use
different commands and interfaces.

18. Differentiate between type 1 and type 2 hypervisor.

19. What are the memory management methods.


a) Double-paging, in which the guest page table indicates a page is in
a physical frame but the VMM moves some of those pages to backing
store.
b) Install a pseudo-device driver in each guest.
• Balloon memory manager communicates with VMM and is told
to allocate or de-allocate memory to decrease or increase physical
memory use of guest, causing guest OS to free or have more memory
available.
c) De-duplication by VMM determining if same page loaded more
than once, memory mapping the same page into multiple guests.

20. Discuss about overall I/O for VMMS


Overall, I/O is complicated for VMMs because:
a) Standard OSes have many short paths for I/O to improve
performance.
b) The less the hypervisor needs to do for guest I/O, the better.
c) Possibilities include direct device access, DMA pass-through and
direct interrupt delivery.
21. Define Live migration.
Live migration provides the ability to move a running virtual machine
between physical hosts with no interruption to service. The virtual
machine remains powered on and user applications continue to run
while the virtual machine is relocated to a new physical host
22. List the core applications of android.

Android provides a set of core applications:
1. Email client
2. SMS program
3. Calendar
4. Maps
5. Browser
6. Contacts
23. Compare and contrast Android OS and iPhone OS

24. Mention the benefits of Android.


1. An open and free development platform. Handset makers can use
it without royalty and customize it to their heart's content.
2. Component-based architecture: lots of default components can be
replaced straightforwardly.
25. What is SQLite?
SQLite is an open-source embedded database. The design goals of
SQLite were to allow the program to be operated without database
installation or administration.

PART B
1. Explain in detail about Virtual Machines. CO5
Virtual Machines
• Virtual Machine (VM) is virtual environment that functions as a
virtual computer system with its own CPU, memory, network

interface, and storage, created on a physical hardware system.
• A Virtual machine is a software construct that mimics the
characteristics of a physical server.
• A virtual Machine (VM) is a software program or operating system
that not only exhibits the behavior of a separate computer, but is also
capable of performing tasks such as running applications and
programs like a separate computer.
• In a pure virtual machine architecture the operating system gives
each process the illusion that it is the only process on the machine.
The user writes an application as if only its code were running on the
system.
• Each user interacts with the computer by typing commands to the
virtual machine on a virtual system console and receiving results back
from the machine as soon as they are computed.
• Each user directs the virtual machine to perform different
commands. These commands are then executed on the physical
machine in a multiprogramming environment.
• Virtualization is an abstraction layer that decouples the physical
hardware from the operating system to deliver greater IT resource
utilization and flexibility.
• It allows multiple virtual machines, with heterogeneous operating
systems to run in isolation, side-by-side on the same physical
machine.
Fig. 7.1.1 shows virtual machine.

Benefits:
1. There is no overlap amongst memory, as each virtual machine
has its own memory space.
2. Virtual machines are completely isolated from the host machine
and other virtual machines.
3. Data does not leak across virtual machines.
4. Can use multiple operating system environments on the same

computer
5. The cost reduction is possible using small virtual servers on a more
powerful single server.
Disadvantages:
1. Virtual machines are less efficient than real machines because they
access the hardware indirectly.
2. A virtual machine can be infected with the weaknesses of the host
machine
3. Difficulty in direct access to hardware, for example, specific cards
or USB devices
4. Great use of disk space, since it takes all the files for each
operating system installed on each virtual machine.
2. Discuss about the Building Blocks of virtual machines CO5
Trap-and-Emulate
• The problem for the VMM is that the guest OS expects to have
unrestricted access to the hardware and runs privileged instructions,
unlike user processes. But a guest cannot get such access; it must be
isolated from other guests.
• All CPUs have multiple privilege levels: rings 0, 1, 2 and 3 on x86
CPUs. Normally, user processes run in ring 3 and the OS in ring 0,
and privileged instructions only run in ring 0. Under virtualization,
user processes stay in ring 3 and the VMM/host OS runs in ring 0.
The guest OS must be protected from guest applications but is not
fully privileged like the host OS/VMM, so it is run in ring 1.
• Trap and emulate VMM: Guest OS runs at lower privilege level
than VMM, traps to VMM for privileged operation.
• When the kernel in the guest attempts to execute a privileged
instruction, that is an error because the system is in user mode and
causes a trap to the VMM in the real machine. The VMM gains
control and executes the action that was attempted by the guest kernel
on the part of the guest. It then returns control to the virtual machine.
This is called the trap-and-emulate method.
• Fig. 7.3.1 shows trap-and-emulate method.

• All non-privileged instructions run natively on the hardware,
providing the same performance for guests as for native
applications. Privileged instructions create extra overhead,
however, causing the guest to run more slowly than it would
natively.
• In addition, the CPU is being multi-programmed among many
virtual machines, which can further slow down the virtual machines
in unpredictable ways.
• Any privileged action by guest OS traps to VMM, emulated by
VMM. Example: set IDT, set CR3, access hardware. Sensitive data
structures like IDT must be managed by VMM, not guest OS.
• Problems with trap-and-emulate:
1) The guest OS may realize it is running at a lower privilege level.
2) Some x86 instructions that change hardware state run in both
privileged and unprivileged mode without trapping.
• These problems occur because OSes were not developed to run at
a lower privilege level and the instruction set architecture of x86 is
not easily virtualizable.
Binary Translation
• Binary translation can be static or dynamic. Static binary
translation means using a processor to translate an image from one
architecture to another before execution. In dynamic binary
translation, individual instructions or groups of instructions are
translated on the fly and the translation is cached to allow reuse in
iterations without repeated translation.
• To be more precise, a segment of the original code is first
executed in an interpreted mode and profiled. If the code has been
executed at least N times, it is translated (if not already translated)
and from then on the translated code is executed.

• In binary translation the virtualization layer sits at CPU privilege
level 0 (most privileged). The guest O/S was supposed to run at
level 0, but since the virtual layer occupies that level, guest O/S
execution is moved to privilege level 1, leaving user applications at
level 3 as they are supposed to be.
• The non-virtualizable kernel code of the guest O/S is translated by
the virtual layer into new sequences of instructions that have the
intended effect on the virtual hardware, while user-level code is
executed directly on the CPU for high performance. The benefit of
this approach is that the O/S is fully abstracted from the underlying
hardware and thus doesn't require any modification.
Hardware Assisted
• In hardware-assisted virtualization the virtualization layer sits in
a new root-mode privilege level under level 0. Guest OS privileged and
sensitive calls are set to automatically trap to the hypervisor, while
user requests are executed directly on the CPU for high performance.
• Hardware assisted virtualization requires a compatible CPU like
Intel VT-x and AMD's AMD-V to work.
• This technique does not always perform as expected because of the
high overhead of guest-OS-to-hypervisor transitions.
• I/O is another area improved by hardware assistance. Consider that
the standard DMA controller accepts a target memory address and a
source I/O device and transfers data between the two without OS
action. Without hardware assistance, a guest might try to set up a
DMA transfer that affects the memory of the VMM or other guests.

3. Explain the Types of Virtual Machines and their Implementations CO5
• In computing, a hypervisor is a virtualization platform that allows
multiple operating systems to run on a host computer at the same
time. The term usually refers to an implementation using full
virtualization.
• A hypervisor is a software layer installed on the physical hardware,
which allows splitting the physical machine into many virtual
machines. This allows multiple operating systems to run
simultaneously on the same physical hardware.
• The operating system installed on the virtual machine is called a
guest OS, and is sometimes also called an instance. The hardware the
hypervisor runs on is called the host machine.
• A hypervisor management console, which is also called a virtual
machine manager (VMM), is computer software that enables easy
management of virtual machines.
• Hypervisors are classified here into type 0, type 1 and type 2.
Type 0 Hypervisor
Fig. 7.4.1 shows type 0 hypervisor.
• A type 0 hypervisor is built with the minimum software components
required to fully virtualize guest OSs and control information flow
between guest OSs. Type 0 hypervisors are hardware-based solutions
that provide support for virtual machine creation and management via
firmware.
• The VMM itself is encoded in the firmware and loaded at boot time. A
guest image is loaded in each partition. "Partitions" and "domains"
are other names used in connection with type 0 hypervisors.
• The feature set of a type 0 hypervisor tends to be smaller than those
of the other types because it is implemented in hardware. For
example, a system might be split into five virtual systems, each with
dedicated CPUs, memory and I/O devices.
• If there are fewer I/O devices than guests, not every guest can be
allocated one; sometimes the VMM implements a control partition
running daemons that other guests communicate with for shared I/O.
Type 1 Hypervisor
• Type 1 hypervisor is software that runs directly on a given hardware
platform. A "guest" operating system thus runs at the second level
above the hardware.
• Type 1 VMs have no host operating system because they are installed
on a bare system. An operating system running on a Type 1 VM is a full
virtualization because it is a complete simulation of the hardware
that it is running on.
• Type 1 hypervisor is also called a native or bare-metal hypervisor
that is installed directly on the hardware, which splits the hardware
into several virtual machines where we can install guest operating
systems.
• Virtual machine management software helps to manage this hypervisor,
which allows guest OSes to be moved automatically between physical
servers based on current resource requirements.
• It is completely independent from the operating system.
• The hypervisor is small as its main task is sharing resources
between different operating systems.
• A major advantage is that any problems in one virtual machine or
guest operating system do not affect the other guest operating systems
running on the hypervisor.
Type 2 Hypervisor
• This is also known as Hosted Hypervisor.
• In this case, the hypervisor is installed on an operating system and
then supports other operating systems above it.
• It is completely dependent on the host operating system for its
operations. Fig. 7.4.2 shows a type 2 hypervisor.
• While having a base operating system allows better specification of
policies, any problem in the base operating system affects the entire
system as well, even if the hypervisor running above the base OS is
secure.
• Type 2 hypervisors don't support over-allocation or dynamic
allocation of RAM, so care is required when allocating resources to
virtual machines.
• This is why we call type 2 hypervisors hosted hypervisors. As
opposed to type 1 hypervisors that run directly on the hardware,
hosted hypervisors have one software layer underneath. What we have in
this case is:
1. A physical machine.
2. An operating system installed on the hardware (Windows, Linux,
MacOS).
3. A type 2 hypervisor software within that operating system.
4. The actual instances of guest virtual machines.
• Type 2 hypervisors are typically found in environments with a small
number of servers. They are convenient for testing new software and
research projects.
Paravirtualization
• Paravirtualization is a type of virtualization in which guest operating
system (OS) is recompiled, installed inside a Virtual Machine (VM),
and operated on top of a hypervisor program running on the host OS.
• Para-virtualization refers to communication between the guest OS
and the hypervisor to improve performance and efficiency.
• Para-virtualization involves modifying the OS kernel to replace
non-virtualizable instructions with hypercalls that communicate
directly with the hypervisor in the virtualization layer.
• The hypervisor also provides hyper-call interfaces for other critical
kernel operations such as memory management, interrupt handling
and time keeping.
• Fig. 7.4.3 shows para-virtualization architecture.
• In para-virtualization, the virtual machine does not necessarily
simulate hardware, but instead offers a special API that can only be
used by modifying the "guest" OS. This system call to the hypervisor
is called a "hypercall" in Xen.
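• As a rough illustration, a paravirtualized guest kernel replaces a
privileged instruction with an explicit call into the hypervisor. The
sketch below is in C with invented names (hypercall, HC_SET_PT); it is
not Xen's real hypercall ABI, which traps into the hypervisor rather
than calling a function.

    #include <stdio.h>

    enum { HC_SET_PT, HC_SET_TIMER };

    static unsigned long virt_page_table, virt_timer;

    /* In a real system this would be a trap into the hypervisor
     * (e.g., via a software interrupt); here it is a plain function. */
    static long hypercall(int nr, unsigned long arg)
    {
        switch (nr) {
        case HC_SET_PT:    virt_page_table = arg; return 0;
        case HC_SET_TIMER: virt_timer = arg;      return 0;
        default:           return -1;
        }
    }

    /* Modified (paravirtualized) guest kernel code: instead of writing
     * CR3 directly, it asks the hypervisor to do it. */
    int main(void)
    {
        hypercall(HC_SET_PT, 0x3000);
        hypercall(HC_SET_TIMER, 10);
        printf("pt=%#lx timer=%lu\n", virt_page_table, virt_timer);
        return 0;
    }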
• Xen is an open source para-virtualization solution that requires
modifications to the guest operating systems but achieves near native
performance by collaborating with the hypervisor.
• Microsoft Virtual PC is a para-virtualization virtual machine
approach. User-mode Linux (UML) is another para-virtualization
solution that is open source.
• Each guest operating system executes as a process of the host
operating system. Cooperative Linux is a virtualization solution that
allows two operating systems to cooperatively share the underlying
hardware.
• Linux-VServer is an operating system-level virtualization solution
for GNU/Linux systems with secure isolation of independent guest
servers.
• The Linux KVM is virtualization technology that has been integrated
into the mainline Linux kernel. Running as a single loadable kernel
module, a Linux kernel on virtualization-capable hardware is able to
act as a hypervisor and support unmodified Linux and Windows guest
operating systems.
• In para-virtualization, the hypervisor shares the virtualization
work with the guest operating system.
Problems with para-virtualization
1. Para-virtualized systems won't run on native hardware.
2. There are many different para-virtualization systems that use
different interfaces, commands, etc.
• The main difference between full virtualization and
paravirtualization in the cloud is that full virtualization allows
multiple guest operating systems to execute on a host operating system
independently, while paravirtualization allows multiple guest
operating systems to run on host operating systems while communicating
with the hypervisor.
4. Briefly explain Virtualization and Operating-System Components
• Operating-system aspects of virtualization include CPU scheduling,
memory management, I/O, storage and the unique VM live-migration
feature.
• Virtualization raises questions such as:
a) How do VMMs schedule CPU use when guests believe they have
dedicated CPUs?
b) How can memory management work when many guests require large
amounts of memory?
CPU Scheduling
• After virtualization, a single-CPU system can act like a
multiprocessor, since one or more virtual CPUs are created per guest.
Generally the VMM has one or more physical CPUs and a number of
threads to run on them.
• Given the physical CPUs available and the threads to run on those
CPUs, the threads can be classified as VMM threads or guest threads.
Guests are configured with a certain number of virtual CPUs at
creation time, and that number can be adjusted throughout the life of
the virtual machine.
• When there are enough CPUs for all guests, the VMM can allocate
dedicated CPUs, and each guest then manages its CPUs much like a
native operating system.
• Usually, not enough CPUs are available; the VMM can then use
standard scheduling algorithms to put threads on CPUs, adding a
fairness aspect to the allocation. Cycle stealing by the VMM and
oversubscription of CPUs mean that guests do not get the CPU cycles
they expect.
• Some VMMs provide an application to run in each guest to fix the
time of day and provide other integration features.
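• The fairness aspect can be illustrated with a toy proportional-share
scheduler in C: the runnable virtual CPU that has consumed the least
CPU relative to its weight runs next. The weights and the used/weight
rule are illustrative assumptions, not a specific VMM's algorithm.

    #include <stdio.h>

    typedef struct { const char *name; int weight; long used; } vcpu;

    static vcpu *pick_next(vcpu *v, int n)
    {
        vcpu *best = &v[0];
        for (int i = 1; i < n; i++)
            if (v[i].used * best->weight < best->used * v[i].weight)
                best = &v[i];   /* lowest used/weight ratio runs next */
        return best;
    }

    int main(void)
    {
        vcpu guests[] = { {"guestA", 2, 0}, {"guestB", 1, 0} };
        for (int slice = 0; slice < 6; slice++) {
            vcpu *g = pick_next(guests, 2);
            g->used += 10;          /* charge a 10 ms time slice */
            printf("slice %d -> %s\n", slice, g->name);
        }
        return 0;
    }

Over the six slices, guestA (weight 2) receives twice the CPU time of
guestB (weight 1), which is the fairness property being sketched.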
Memory Management
• Each virtual machine consumes memory based on its configured size,
plus additional overhead memory for virtualization. In virtualized
environments, there are more users of memory, leading to more pressure
on memory use.
• Sometimes the total memory allocated to guests exceeds the amount
that physically exists in the system. The extra need for efficient
memory use is not lost on the implementers of VMMs, who take extensive
measures to ensure the optimal use of memory.
• For example, VMware ESX guests have a configured amount of physical
memory; ESX then uses three memory-management methods:
a) Double-paging, in which the guest page table indicates a page is in
a physical frame but the VMM moves some of those pages to backing
store.
b) Installing a pseudo-device driver in each guest.
• The balloon memory manager communicates with the VMM and is told to
allocate or de-allocate memory, decreasing or increasing the physical
memory use of the guest and causing the guest OS to free memory or
have more memory available.
c) De-duplication, with the VMM determining whether the same page is
loaded more than once and, if so, memory-mapping the single page into
the multiple guests.
• Since hypervisor manages page sharing, the virtual machine
operating systems are unaware of what is happening in the physical
system.
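• The de-duplication idea can be sketched in C: hash each candidate
page, verify a byte-wise match, and fold duplicates onto one shared
frame. This is a toy model under stated assumptions (tiny pages, a
linear scan); a real VMM would additionally mark shared frames
copy-on-write.

    #include <stdio.h>
    #include <string.h>

    #define PAGE 16          /* tiny "pages" for illustration */
    #define NPAGES 4

    static unsigned long hash_page(const char *p)
    {
        unsigned long h = 5381;
        for (int i = 0; i < PAGE; i++)
            h = h * 33 + (unsigned char)p[i];
        return h;
    }

    int main(void)
    {
        char frames[NPAGES][PAGE] =
            { "kernel-code", "zeros", "kernel-code", "data" };
        int share_with[NPAGES];  /* frame each page is folded into */

        for (int i = 0; i < NPAGES; i++) {
            share_with[i] = i;
            for (int j = 0; j < i; j++)
                if (hash_page(frames[i]) == hash_page(frames[j]) &&
                    memcmp(frames[i], frames[j], PAGE) == 0) {
                    share_with[i] = j;   /* verified duplicate: share frame j */
                    break;
                }
            printf("page %d -> frame %d\n", i, share_with[i]);
        }
        return 0;
    }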
• Virtual memory ballooning is a computer memory reclamation technique
used by a hypervisor to allow the physical host system to retrieve
unused memory from certain guest virtual machines (VMs) and share it
with others.
• Memory ballooning allows the total amount of RAM required by
guest VMs to exceed the amount of physical RAM available on the
host. When the host system runs low on physical RAM resources,
memory ballooning allocates it selectively to VMs.
• If a VM only uses a portion of the memory that it was allocated, the
ballooning technique makes it available for the host to use.
• For example, if all the VMs on a host are allocated 8 GB of
memory, some of the VMs will only use half the allotted share.
Meanwhile, one VM might need 12 GB of memory for an intensive
process.
• Memory ballooning allows the host to borrow that unused memory
and allocate it to the VMs with higher memory demand.
• The guest operating system runs inside the VM, which is allocated a
portion of memory. Therefore, the guest OS is unaware of the total
memory available.
• Memory ballooning makes the guest operating system aware of the
host's memory shortage.
• Virtualization providers such as VMware enable memory
ballooning. VMware memory ballooning, Microsoft Hyper-V
dynamic memory, and the open source KVM balloon process are
similar in concept.
• The host uses balloon drivers running on the VMs to determine how
much memory it can take back from an under-utilizing VM. Balloon
drivers must be installed on any VM that participates in the memory
ballooning technique.
• Balloon drivers get the target balloon size from the hypervisor and
then inflate the balloon by allocating the proper number of guest
physical pages within the VM. This process is known as inflating the
balloon; the process of releasing the available pages is known as
deflating the balloon.
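• A minimal sketch of a guest balloon driver in C, assuming an
invented balloon_set_target() entry point that the hypervisor would
drive. A real driver would also hand the pinned backing frames over to
the hypervisor; here inflation is modelled simply as allocating pages
the guest can no longer use.

    #include <stdio.h>
    #include <stdlib.h>

    #define PAGE_SZ 4096
    #define MAX_BALLOON 1024

    static void *balloon[MAX_BALLOON];
    static int balloon_pages;

    static void balloon_set_target(int target)
    {
        while (balloon_pages < target) {   /* inflate: take pages from guest */
            void *p = malloc(PAGE_SZ);     /* real driver: pin page, notify VMM */
            if (!p) break;                 /* guest memory exhausted */
            balloon[balloon_pages++] = p;
        }
        while (balloon_pages > target)     /* deflate: give pages back */
            free(balloon[--balloon_pages]);
    }

    int main(void)
    {
        balloon_set_target(256);   /* host short on RAM: reclaim 1 MB */
        printf("ballooned: %d pages\n", balloon_pages);
        balloon_set_target(64);    /* pressure eased: return most pages */
        printf("ballooned: %d pages\n", balloon_pages);
        return 0;
    }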
Input-Output Management
• I/O management is in some ways easier for the VMM because I/O
devices naturally vary a lot: the OS device-driver mechanism already
provides a uniform interface to the OS whatever the I/O device.
Device-driver interfaces are designed to allow third-party hardware
manufacturers to provide device drivers connecting their devices to
the operating system.
• Virtualization takes advantage of dynamically loaded and unloaded
device drivers by providing specific virtualized devices to guest
operating systems.
• But overall, I/O is complicated for VMMs because:
a) Standard OSes contain many short, optimized I/O paths for improved
performance.
b) The less the hypervisor needs to do for guest I/O, the better.
c) Possibilities include direct device access, DMA pass-through and
direct interrupt delivery.
• Hardware support is needed for all of the above cases.
• With virtualization, each guest needs at least one IP address for
communication, so a server running a VMM may have dozens of addresses,
and the VMM acts as a virtual switch to route the network packets to
the addressed guests.
• Networking is also complex, as the VMM and the guests all need
network access. The VMM can bridge a guest to the network, allowing
direct access, or provide Network Address Translation (NAT).
• With NAT, the guest's address is local to the machine on which it is
running, and the VMM provides address translation to the guest to hide
its address.
Storage Management
• Both boot disks and general data access need to be provided by the
VMM. Virtualized environments need to approach storage management
differently than native operating systems do.
• The solution depends on the type of hypervisor. A type 1 hypervisor
stores guest root disks and configuration information within a file
system provided by the VMM as a disk image; a type 2 hypervisor stores
them as files in the file system provided by the host OS.
• Guests sometimes need more disk space than is available in their
root disk image. VMMs provide a mechanism to capture a physical
system as it is currently configured and convert it to a guest that the
VMM can manage and run.
a) Physical-to-virtual (P-to-V) conversion converts native disk blocks
into the VMM's format.
b) Virtual-to-physical (V-to-P) conversion converts from the virtual
format back to the native disk format.
• VMM also needs to provide access to network attached storage and
other disk images, disk partitions, disks etc.
Live Migration
• Live migration provides the ability to move a running virtual
machine between physical hosts with no interruption to service. The
virtual machine remains powered on and user applications continue to
run while the virtual machine is relocated to a new physical host. In
the background, the virtual machine's RAM is copied from the source
host to the destination host. Storage and network connectivity are not
altered.
• Taking advantage of VMM features leads to new functionality not
found on general-purpose operating systems, such as live migration: a
running guest can be moved between systems without interrupting user
access to the guest or its apps.
• Fig. 7.5.1 shows live migration of guest between servers.

• Steps:
1. The source VMM establishes a connection with the target VMM.
2. The target creates a new guest by creating a new VCPU.
3. The source sends all read-only guest memory pages to the target.
4. The source sends all read-write pages to the target, marking them
as clean.
5. The source repeats step 4, as during that step some pages were
probably modified by the guest and are now dirty.
6. When the cycle of steps 4 and 5 becomes very short, the source VMM
freezes the guest, sends the VCPU's final state and other state
details, sends the final dirty pages, and tells the target to start
running the guest.
• Once the target acknowledges that the guest is running, the source
terminates the guest.
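• The pre-copy loop of steps 3 to 6 can be sketched in C as below. The
initial dirty-page count, the stop-and-copy threshold and the
one-quarter re-dirtying rate are purely illustrative assumptions used
to show why the loop converges: each round sends fewer pages than the
last because the guest can only re-dirty so much while a round runs.

    #include <stdio.h>

    /* model: while a round of sending runs, the still-running guest
     * re-dirties about a quarter of the pages just sent */
    static int send_dirty_pages(int n)
    {
        printf("sending %d dirty pages\n", n);
        return n / 4;
    }

    int main(void)
    {
        int dirty_pages = 1000;   /* pages to copy after the first pass */
        int threshold = 16;       /* "very short" final round */

        dirty_pages = send_dirty_pages(dirty_pages);   /* steps 3-4 */
        while (dirty_pages > threshold)                /* repeat step 5 */
            dirty_pages = send_dirty_pages(dirty_pages);

        /* step 6: stop-and-copy the small remainder */
        printf("freeze guest, send VCPU state + final %d pages,"
               " start guest on target\n", dirty_pages);
        return 0;
    }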

5. Explain about Mobile OS: Android CO5 AU: May-22


• Android is an open source mobile OS developed by the Open Handset
Alliance, led by Google.
• Android is a software stack for mobile devices that includes an
operating system, middleware and key applications.
• It is based on the Linux 2.6 kernel.
• Android is an open source operating system, created by Google
specifically for use on mobile devices (i.e., cell phones and
tablets).
• It can be programmed in C/C++, but most app development is done in
Java. It supports Bluetooth, Wi-Fi, and 3G and 4G networking.
Android Architecture
Fig. 7.6.1 shows the Android software stack. Each layer of the stack
and the corresponding elements within each layer are tightly
integrated and carefully tuned to provide the optimal application
development and execution environment for mobile devices.
• Android provides a set of core applications:
1. Email client
2. SMS program
3. Calendar
4. Maps
5. Browser
6. Contacts
7. Etc.
• All applications are written using the Java language.
App framework
• Used for enabling and simplifying the reuse of components
1. Developers have full access to the same framework APIs used by
the core applications.
2. Users are allowed to replace components.
• The main app framework features (activity management, content
providers, resource management, notifications) are described below.
Libraries
• Libraries include a set of C/C++ libraries used by components of the
Android system. They are exposed to developers through the Android
application framework.
Runtime
• The Android run-time system provides a core set of class libraries
to ensure a smooth platform for developers. With these libraries,
developers can easily import required libraries into their
applications without doing any hard coding in the applications.
Dalvik virtual machine
• Dalvik is a purpose-built virtual machine designed specifically for
Android, developed by Dan Bornstein and his team strictly for mobile
devices. While developing the Dalvik virtual machine, the team
recognized constraints specific to the mobile environment that are
unlikely to change in the near future, such as limited battery life
and processing power, and optimized the VM for them. The Dalvik
virtual machine uses a register-based architecture, which gives it a
few advantages over the stack-based Java virtual machine:
1. Dalvik uses its own 16-bit instruction set rather than Java's 8-bit
stack instructions, which reduces the instruction count and raises
interpreter speed.
2. Dalvik uses less space: an uncompressed .dex file is smaller than
the compressed Java archive (.jar) file holding the same classes.
• Android is an open source software stack that includes an operating
system: a Linux kernel that provides the low-level interface to the
hardware, memory management and process control.
• Middleware: a runtime to execute Android applications, including the
virtual machine and core libraries.
Important blocks in Android:
1. Activity manager: Manages the activity life cycle of applications.
2. Content providers: Manage data sharing between applications.
3. Telephony manager: Manages all voice calls; used when an
application needs access to voice calls.
4. Location manager: Manages location, using GPS or cell towers.
5. Resource manager: Manages the various types of resources used in an
application.
Android SDK features
The Android SDK includes:
1. The Android APIs.
2. The core of the SDK.
3. Development tools.
4. No licensing, distribution or development fees, and no
release-approval processes.
5. GSM, EDGE, and 3G networks for telephony and data transfer.
6. Full multimedia hardware control.
7. APIs for using sensor hardware including accelerometer and the
compass.
8. APIs for location based services.
Application framework
1. Android offers developers the ability to build rich and innovative
applications.
2. Developers have full access to the same framework APIs used by
the core applications.
3. Underlying all applications is a set of services, including Views.
4. Views can be used to build an application's UI, including lists,
grids, text boxes, buttons, and even an embeddable web browser.
• Content providers enable applications to access data from other
applications (such as Contacts), or to share their own data.
• A resource manager provides access to non-code resources such as
localized strings, graphics and layout files.
• A notification manager enables all applications to display custom
alerts in the status bar.
• An activity manager manages the lifecycle of applications and
provides a common navigation backstack.
Libraries used in Android
• A set of C/C++ libraries used by various components of the Android
system.
• System C library: Tuned for embedded Linux-based devices.
• Media libraries: Based on PacketVideo's OpenCORE; the libraries
support playback and recording of many popular audio and video
formats, as well as static image files.
• Surface Manager: Manages access to the display subsystem and
seamlessly composites 2D and 3D graphic layers from multiple
applications.
• LibWebCore: A modern web browser engine which powers both the
Android browser and an embeddable web view.
• SGL/3D libraries: SGL is the underlying 2D graphics engine.
• SQLite: A powerful and lightweight relational database engine
available to all applications.
Android run-time
• Android includes a set of core libraries that provide most of the
functionality available in the core libraries of the Java programming
language.
• Every Android app runs in its own process with its own instance of
the Dalvik virtual machine. The Dalvik VM executes files in the Dalvik
Executable (.dex) format.
Slow Android apps
1. By default on Android, all work is done in a single thread, the
"main application" thread. If a component of the work takes a long
time, the rest of the work will be "blocked". For example, taking a
long time to access data across the network prevents responding to any
GUI events.
2. In the Android OS, if a GUI doesn't respond to an input event
within five seconds, it is considered unresponsive and the OS will try
to kill it.
Android thread design
1. Only perform GUI actions on the main application thread. Spawn
separate threads to perform data-intensive or slow actions, and make
these threads asynchronous.
2. The main thread does not have to wait for or check on the other
threads; instead, those threads run as they need to and report back to
the original thread. Any changes made to the UI should go through the
UI thread.

Comparison of Android OS VS iPhone OS Features

Android applications
• Android applications are distributed in a .apk file. APK stands for
"Android Package". It is simply a zip file that has a particular file
structure. An APK contains:
1. The Android Manifest file (an XML file with lots of metadata).
2. A resource bundle containing sounds, graphics, etc.
3. The Dalvik classes that make up the user application.
Android Benefits
1. An open and free development platform. Handset makers can use it
without royalty and customize it to their heart's content.
2. Component-based architecture: Lots of default components can be
replaced straightforwardly.
Proponents of Android point to the following benefits:
a. Lots of services: location, SQL, maps, web, etc.
b. Well-managed applications, isolated from each other to protect data
and provide security.
c. The operating system can quit programs as needed to ensure good
performance on mobile devices.
d. Portability: To support a new device, a company has to port the
virtual machine; Android (Dalvik) apps then execute on the new device
with little to no modification.
Zygote
• Android uses a concept called the Zygote to enable both sharing of
code across VM instances and to provide fast startup time of new VM
instances.
• Zygote is a daemon process that provides efficient memory usage and
less time overhead when Android runs multiple applications. Zygote is
the parent of all application processes.
• The Zygote is a VM process that starts at system boot time. When
the Zygote process starts, it initializes a Dalvik VM, which preloads
and preinitializes core library classes.
• Generally, these core library classes are read-only and are therefore
a good candidate for preloading and sharing across processes.
• Once the Zygote has initialized, it will sit and wait for socket
requests coming from the runtime process indicating that it should
fork new VM instances based on the Zygote VM instance.
• Cold starting virtual machines notoriously takes a long time and can
be an impediment to isolating each application in its own VM. By
spawning new VM processes from the Zygote, the startup time is
minimized.
• Additional memory need not be allocated for copies of these classes
when a new DVM is forked from the Zygote DVM.
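• The Zygote idea can be sketched with POSIX fork(): preload shared
state once, then fork each new "app" process so it inherits the
preloaded copy-on-write pages instead of cold-starting. This is a
simplified C model under stated assumptions (a POSIX host, printed
stand-ins for class preloading), not Android's actual Zygote code.

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    static void preload_core_classes(void)
    {
        /* stands in for the Dalvik VM preloading read-only core libraries */
        puts("zygote: core classes preloaded");
    }

    int main(void)
    {
        preload_core_classes();           /* done once, at "boot" */

        for (int app = 0; app < 2; app++) {
            pid_t pid = fork();           /* new VM instance shares preloaded pages */
            if (pid == 0) {
                printf("app %d: started instantly in pid %d\n",
                       app, (int)getpid());
                _exit(0);
            }
            waitpid(pid, NULL, 0);
        }
        return 0;
    }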

6. Discuss about iOS Technology CO5


• iOS is the operating system that runs on iPad, iPhone and iPod
touch devices. The operating system manages the device hardware
and provides the technologies required to implement native apps.
• The iOS Software Development Kit (SDK) contains the tools and
interfaces needed to develop, install, run, and test native apps that
appear on an iOS device's Home screen.
• Fig. 7.7.1 shows iOS architecture.
• The iOS Architecture is layered. At the highest level, iOS acts as an
intermediary between the underlying hardware and the apps you
create.
• Apps do not talk to the underlying hardware directly. Instead, they
communicate with the hardware through a set of well-defined system
interfaces. These interfaces make it easy to write apps that work
consistently on devices having different hardware capabilities.
• The Cocoa Touch layer contains key frameworks for building iOS
apps. These frameworks define the appearance of your app. They also
provide the basic app infrastructure and support for key technologies
such as multitasking, touch-based input, push notifications and many
high-level system services.
• High-level features of the Cocoa Touch layer are AirDrop,
multitasking, Auto Layout, storyboards, local notifications and the
Apple Push Notification service.
• Cocoa touch layer contains following frameworks for iPhone app
development:
a. UIKit framework
b. Map kit framework
c. Push notification service.
d. Message UI framework
e. Address book UI framework
f. Game kit framework
g. iAd framework
h. Event kit UI framework
i. Accounts framework
j. Twitter framework
• The Media layer contains the graphics, audio and video technologies
you use to implement multimedia experiences in your apps. The
technologies in this layer make it easy for you to build apps that look
and sound great.
Media Layer
• The role of the Media layer is to provide iOS with audio, video,
animation and graphics capabilities.
• As with the other layers comprising the iOS stack, the media layer
comprises a number of frameworks which may be utilized when
developing iPhone apps.
• The technologies in this layer make it easy to build apps that look
and sound great. They include:
a. Graphics technologies
b. Audio technologies
c. Video technologies
d. AirPlay.
• The Media layer contains the following frameworks:
1. Core video framework: This framework provides buffering support for
the Core Media framework. While it may be utilized by application
developers, it is typically not necessary to use it directly.
2. Core text framework: The iOS core text framework is a C-based
API designed to ease the handling of advanced text layout and font
rendering requirements.
3. Image I/O framework: The Image I/O framework, the purpose of
which is to facilitate the importing and exporting of image data and
image metadata, was introduced in iOS 4. The framework supports a
wide range of image formats including PNG, JPEG, TIFF and GIF.
4. Assets library framework: The assets library provides a mechanism
for locating and retrieving video and photo files located on the
iPhone device. In addition to accessing existing images and videos,
this framework also allows new photos and videos to be saved to the
standard device photo album.
5. Core graphics framework: The iOS core graphics framework
provides a lightweight two dimensional rendering engine. Features of
this framework include PDF document creation and presentation,
vector based drawing, transparent layers, path based drawing, anti-
aliased rendering, color manipulation and management, image
rendering and gradients.
6. Core image framework: A new framework introduced with iOS 5
providing a set of video and image filtering and manipulation
capabilities for application developers.
7. Quartz core framework: The purpose of the Quartz Core
framework is to provide animation capabilities on the iPhone. It
provides the foundation for the majority of the visual effects and
animation used by the UIKit framework and provides an Objective-C
based programming interface for creation of specialized animation
within iPhone apps.
8. OpenGL ES framework: For many years the industry standard
for high performance 2D and 3D graphics drawing has been OpenGL.
OpenGL for Embedded Systems (ES) is a lightweight version of the
full OpenGL specification designed specifically for smaller devices
such as the iPhone.
9. GLKit framework: The GLKit framework is an Objective-C
based API designed to ease the task of creating OpenGL ES based
applications.
10. NewsstandKit framework: The Newsstand application is a new
feature of iOS 5 and is intended as a central location for users to gain
access to newspapers and magazines. The NewsstandKit framework
allows for the development of applications that utilize this new
service.
11. iOS audio support: iOS is capable of supporting audio in AAC,
Apple Lossless (ALAC), A-law, IMA/ADPCM, Linear PCM, u-law, DVI/Intel
IMA ADPCM, Microsoft GSM 6.10 and AES3-2003 formats through the
support provided by the following frameworks.
12. AV foundation framework: An Objective-C based framework
designed to allow the playback, recording and management of audio
content.
Core Services Layer
• The Core Services layer contains fundamental system services for
apps.
• Key among these services are the core foundation and foundation
frameworks, which define the basic types that all apps use.
• This layer also contains individual technologies to support features
such as location, iCloud, social media and networking.
Features:
1. Peer-to-Peer services
2. iCloud storage
3. Automatic reference counting
4. Block objects
5. Grand central dispatch
6. In-App purchase
7. SQLite
8. XML support
9. File-sharing support, data protection.
It consists of the following frameworks:
• Address book framework: This provides programmatic access to the
iPhone Address Book contact database, allowing applications to
retrieve and modify contact entries.
• CFNetwork framework: The CFNetwork framework provides a C-
based interface to the TCP/IP networking protocol stack and low level
access to BSD sockets. This enables application code to be written
that works with HTTP, FTP and domain name servers and to establish
secure and encrypted connections using Secure Sockets Layer (SSL)
or Transport Layer Security (TLS).
• Core Data Framework: This framework is provided to ease the
creation of data modeling and storage in Model-View-Controller
(MVC) based applications. Use of the Core Data framework
significantly reduces the amount of code that needs to be written to
perform common tasks when working with structured data within an
application.
• Core foundation framework: The Core Foundation framework is a
C-based framework which provides basic functionality such as data
types, string manipulation, raw block data management, URL
manipulation, threads and run loops, dates and times, basic XML
manipulation, and port and socket communication.
• The core media framework is the lower level foundation upon which
the AV foundation layer is built.
• Core telephony framework: The iOS core telephony framework is
provided to allow applications to interrogate the device for
information about the current cell phone service provider and to
receive notification of telephony related events.
• EventKit framework: An API designed to provide applications
with access to the calendar and alarms on the device.
• Most applications will use iCloud document storage to share
documents from a user's iCloud account. This is the feature that users
think of when they think of iCloud storage. A user cares about
whether documents are shared across devices and can see and manage
those documents from a given device.
• Data protection allows applications that work with sensitive user
data to take advantage of the built-in encryption available on some
devices.
• When your application designates a specific file as protected, the
system stores that file on-disk in an encrypted format. While the
device is locked, the contents of the file are inaccessible to both your
application and to any potential intruders.
• However, when the device is unlocked by the user, a decryption key
is created to allow your application to access the file.
Core OS Layer
• The Core OS layer contains the low-level features that most other
technologies are built upon.
• Even if you do not use these technologies directly in your apps, they
are most likely being used by other frameworks.
• And in situations where you need to explicitly deal with security or
communicating with an external hardware accessory, you do so using
the frameworks in this layer.
• The Core OS layer occupies the bottom position of the iOS stack
and, as such, sits directly on top of the device hardware.
• The layer provides a variety of services including low level
networking, access to external accessories and the usual fundamental
operating system services such as memory management, file system
handling and threads.
• Accelerate framework: Introduced in iOS 4.0, the Accelerate
framework contains interfaces for performing DSP, linear algebra and
image-processing calculations. The advantage of using this framework
over writing your own versions of these interfaces is that they are
optimized for all of the hardware configurations present in iOS-based
devices. Therefore, you can write your code once and be assured that
it runs efficiently on all devices.
• External accessory framework: It provides the ability to interrogate
and communicate with external accessories connected physically to
the iPhone via the 30-pin dock connector or wirelessly via Bluetooth.
• Security framework: The iOS Security framework provides all the
security interfaces you would expect to find on a device that can
connect to external networks including certificates, public and private
keys, trust policies, key chains, encryption, digests and Hash-based
Message Authentication Code (HMAC).
• The core Bluetooth framework allows developers to interact
specifically with Bluetooth Low-Energy ("LE") accessories. The
Objective-C interfaces of this framework allow you to scan for LE
accessories, connect and disconnect to ones you find, read and write
attributes within a service, register for service and attribute change
notifications and much more.
• System: The system level encompasses the kernel environment,
drivers and low level UNIX interfaces of the operating system. The
kernel itself is based on Mach and is responsible for every aspect of
the operating system.
• It manages the virtual memory system, threads, file system, network
and interprocess communication. The drivers at this layer also
provide the interface between the available hardware and system
frameworks.
• For security purposes, access to the kernel and drivers is restricted
to a limited set of system frameworks and applications.
• iOS provides a set of interfaces for accessing many low-level
features of the operating system. Your application accesses these
features through the LibSystem library.

7. Write short notes on Android File Management CO5


• Android uses a file system that's similar to disk-based file systems
on other platforms. A file object is suited to reading or writing large
amounts of data in start-to-finish order without skipping around.
• All Android devices have two file storage areas: "Internal" and
"external" storage.
• An Android device may use an updated Linux file system, such as
ext4, or a proprietary file system from a manufacturer, depending on
who made the device and what has been done to it by the user.
• A file system is the collection place on a disk device for files.
Visualize the file system as consisting of a single node at the
highest level (ROOT) and all other nodes descending from the root node
in a tree-like fashion.
• Samsung Galaxy S phones use the Samsung RFS proprietary file
system while the Samsung Nexus S with Android 2.3 uses the Linux
Ext4 file system.
• /mnt: This directory is used for mount points. The different
physical storage devices (like hard disk drives, floppies, CD-ROMs)
must be attached to some directory in the file system tree before they
can be accessed. This attaching is called mounting, and the directory
where the device is attached is called the mount point.
• SDCard: The mounted SD card is a storage device mounted to the file
system in the typical Linux fashion. At the file system root, /sdcard
is a symbolic link to /mnt/sdcard. /mnt/sdcard is where the SD card is
actually mounted, but the same files can also be accessed in /sdcard.
• Fig. 7.8.1 shows the typical directory structure of the Android file system.

• The superblock is the key to maintaining the file system. It is an
8 kB block of disk space that maintains the current status of the file
system. Because of its importance, a copy is maintained in memory
and at each cylinder group within the file system.
• The copy in main memory is updated as events transpire. The update
daemon is the actual process that calls on the kernel to flush the
cached superblocks, modified inodes and cached data blocks to disk.
• Usually the Linux system assumes all file systems are readable and
writable.
1. ODEX FILE
• In the Android file system, applications come in packages with the
extension .apk. These application packages, or APKs, contain
certain .odex files whose supposed function is to save space.
• These 'odex' files are actually collections of parts of an application
that are optimized before booting. Doing so speeds up the boot
process, as it preloads part of an application.
• On the other hand, it also makes hacking those applications difficult
because a part of the coding has already been extracted to another
location before execution.
2. DEODEX
• Deodexing is basically repackaging of these APKs in a certain way,
such that they are reassembled into classes.dex files. All pieces of
an application package are put back together in one place.
• Deodexed ROMs (or APKs) have all their application packages put back
together in one place, allowing for easy modification such as theming.
Since no pieces of code are coming from any external location, custom
ROMs or APKs are always deodexed to ensure integrity.
7.8.1 SQLite
• SQLite is an open source embedded database. The design goals of
SQLite were to allow the program to be operated without a database
installation or administration.
• It is the most widely deployed SQL database engine in the world.
SQLite is based on the Structured Query Language (SQL). Android
contains the SQLite database management classes that an application
would use to manage its own private database.
• SQLite is open source SQL database that stores data to a text file on
a device. Android comes in with built in SQLite database
implementation. SQLite supports all the relational database features.
• It is designed to provide a streamlined SQL-based database
management system suitable for embedded systems and other
limited-memory systems. The full SQLite library can be implemented in
under 400 KB.
• In contrast to other database management systems, SQLite is not a
separate process that is accessed from the client application. The
library is linked in and thus becomes an integral part of the
application program.
Unique Features
1. No configuration is required.
2. No server process to administer or user accounts to manage.
3. Easy to backup and transmit database.
4. Dynamic typing for column values, variable lengths for column
records.
5. Query can reference multiple database files.
6. A few non-standard SQL extensions.
• SQLiteDatabase provides methods to open the database connection,
perform queries and query updates, and close the database. You can
define keys and values for queries via the ContentValues object; this
is necessary for insert and update calls. Delete only requires the row
number.
• android.database.sqlite classes are as follows:
1. SQLiteCloseable - An object created from a SQLiteDatabase that
can be closed.
2. SQLiteCursor - A cursor implementation that exposes results
from a query on a SQLiteDatabase.
3. SQLiteDatabase - Exposes methods to manage a SQLite database.
4. SQLiteOpenHelper - A helper class to manage database creation
and version management.
5. SQLiteProgram - A base class for compiled SQLite programs.
6. SQLiteQuery - A SQLite program that represents a query that reads
the resulting rows into a CursorWindow.
7. SQLiteQueryBuilder - A convenience class that helps build SQL
queries to be sent to SQLiteDatabase objects.
8. SQLiteStatement - A pre-compiled statement against a
SQLiteDatabase that can be reused.
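• The Java classes above wrap SQLite's C library, which (as noted
earlier) is linked directly into the application rather than running
as a separate server process. A minimal sketch of the underlying
SQLite C API is below; the database file demo.db and the table name
notes are just examples. Build with: cc demo.c -lsqlite3

    #include <stdio.h>
    #include <sqlite3.h>

    /* sqlite3_exec() invokes this callback once per result row */
    static int print_row(void *unused, int ncols, char **vals, char **names)
    {
        for (int i = 0; i < ncols; i++)
            printf("%s=%s ", names[i], vals[i] ? vals[i] : "NULL");
        putchar('\n');
        return 0;                  /* 0 = continue with the next row */
    }

    int main(void)
    {
        sqlite3 *db;
        if (sqlite3_open("demo.db", &db) != SQLITE_OK)  /* no server: just a file */
            return 1;

        sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS notes("
                         "id INTEGER PRIMARY KEY, body TEXT);",
                     NULL, NULL, NULL);
        sqlite3_exec(db, "INSERT INTO notes(body) VALUES('hello');",
                     NULL, NULL, NULL);
        sqlite3_exec(db, "SELECT * FROM notes;", print_row, NULL, NULL);

        sqlite3_close(db);
        return 0;
    }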