Disks
File System: Abstraction for Secondary Storage
[Figure: the CPU and memory share the memory (system) bus; a bridge connects it to the I/O bus, which hosts the disk and the NIC.]
Storage-Device Hierarchy
Secondary Storage
Secondary storage typically:
Is storage outside of memory
Does not permit direct execution of instructions or data retrieval via load/store instructions
Characteristics:
It’s large: terabytes (TB)
It’s cheap: $150-$200
It’s persistent: data is maintained across process execution and power down (or loss)
It’s slow: milliseconds to access
Disks
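Seek time: time to move the disk head to the desired track
Rotational delay: time to reach the desired sector once the head is over the desired track
Transfer rate: rate at which data is read from or written to the disk (transfer time is often ignored, since it is small next to seek time and rotational delay)
[Figure: disk platter divided into tracks and sectors]
Some typical parameters:
Seek: ~8-10 ms
Rotational delay: ~4.15 ms for 7200 rpm
Transfer rate: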
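The rotational-delay figure can be sanity-checked with a little arithmetic: on average the desired sector is half a rotation away. The sketch below is only illustrative; the function name is hypothetical, and the 9 ms seek is simply the midpoint of the range quoted above. The result is close to the ~4.15 ms on the slide.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half of one full rotation, in milliseconds."""
    ms_per_rotation = 60_000 / rpm
    return ms_per_rotation / 2

delay = avg_rotational_delay_ms(7200)
print(f"{delay:.2f} ms")      # 4.17 ms
# Assumed 9 ms seek (midpoint of the 8-10 ms range) plus rotational delay.
print(f"{9 + delay:.2f} ms")  # 13.17 ms
```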
Moving-head Disk Mechanism
Cylinder: the set of tracks at the same arm position (the same radius) on all platter surfaces
If there is only one surface, then each track is also a cylinder
Disk Scheduling
Disks are at least four orders of magnitude slower than main memory
The performance of disk I/O is vital for the performance of the computer system as a whole
Access time (seek time + rotational delay) >> transfer time for a sector
Therefore the order in which sectors are read matters a lot
Disk scheduling
Usually based on the position of the requested sector rather than on process priority
Possibly reorders the stream of read/write requests to improve performance
Disk Scheduling (Cont.)
Several algorithms exist to schedule the servicing of disk I/O requests.
We illustrate them with a queue of requests for cylinders in the range 0-199:
98, 183, 37, 122, 14, 124, 65, 67
Head pointer starts at cylinder 53
FCFS
Illustration shows total head movement of 640 cylinders.
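The 640-cylinder total can be reproduced by summing the distance between consecutive requests in arrival order. This is a minimal sketch; the function name is an assumption made for illustration.

```python
def fcfs_head_movement(start, requests):
    """Total cylinders traversed when requests are served in arrival order."""
    total, position = 0, start
    for cylinder in requests:
        total += abs(cylinder - position)
        position = cylinder
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_head_movement(53, queue))  # 640
```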
SSTF
Selects the request with the minimum seek time from
the current head position.
SSTF scheduling is a form of SJF scheduling; may
cause starvation of some requests.
Illustration shows total head movement of 236
cylinders.
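A short sketch of SSTF on the same queue: at each step it picks the pending request closest to the current head position. The function names are hypothetical, and ties are broken by queue order, which is an assumption the slides do not specify.

```python
def sstf_order(start, requests):
    """Service order under shortest-seek-time-first."""
    pending = list(requests)
    position, order = start, []
    while pending:
        # Pick the pending cylinder nearest to the current head position.
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order

def head_movement(start, order):
    """Total cylinders traversed when visiting `order` from `start`."""
    total, position = 0, start
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total

order = sstf_order(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)                     # [65, 67, 37, 14, 98, 122, 124, 183]
print(head_movement(53, order))  # 236
```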
SCAN
The disk arm starts at one end of the disk, and moves
toward the other end, servicing requests until it gets to
the other end of the disk, where the head movement is
reversed and servicing continues.
Sometimes called the elevator algorithm.
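Illustration shows total head movement of 208 cylinders (39 down to cylinder 14, then 169 up to cylinder 183); if the arm travels all the way to cylinder 0 before reversing, the total is 236 cylinders.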
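The sketch below computes SCAN head movement for the example queue, assuming the arm first sweeps toward cylinder 0. The function name and the `go_to_end` flag are hypothetical. With `go_to_end=False` the arm reverses at the last request (cylinder 14), which yields the 208 cylinders quoted above; travelling all the way to cylinder 0 first gives 236.

```python
def scan_movement(start, requests, low=0, go_to_end=True):
    """Head movement for SCAN when the arm first sweeps toward cylinder `low`."""
    below = sorted((c for c in requests if c <= start), reverse=True)
    above = sorted(c for c in requests if c > start)
    # Optionally visit the end of the disk before reversing direction.
    stops = below + ([low] if go_to_end else []) + above
    total, position = 0, start
    for cylinder in stops:
        total += abs(cylinder - position)
        position = cylinder
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(scan_movement(53, queue, go_to_end=False))  # 208
print(scan_movement(53, queue, go_to_end=True))   # 236
```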
C-SCAN
Provides a more uniform wait time than SCAN.
The head moves from one end of the disk to the other, servicing requests as it goes. When it reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip.
Treats the cylinders as a circular list that wraps around
from the last cylinder to the first one.
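A minimal C-SCAN sketch on the example queue, assuming the head sweeps upward from cylinder 53 on a disk with cylinders 0-199. The function name is hypothetical, and counting the return sweep as head movement is an assumption; some presentations leave it out.

```python
def c_scan(start, requests, low=0, high=199):
    """Service order and head movement for C-SCAN sweeping upward."""
    above = sorted(c for c in requests if c >= start)
    below = sorted(c for c in requests if c < start)
    order = above + below  # cylinders treated as a circular list
    if below:
        # Sweep to the top, return to the bottom, then climb to the last request.
        movement = (high - start) + (high - low) + (below[-1] - low)
    else:
        # Nothing behind the head: stop at the last request (a simplification).
        movement = above[-1] - start if above else 0
    return order, movement

order, movement = c_scan(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)     # [65, 67, 98, 122, 124, 183, 14, 37]
print(movement)  # 382
```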
C-LOOK
Version of C-SCAN
Arm only goes as far as the last request in each
direction, then reverses direction immediately, without
first going all the way to the end of the disk.
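For comparison, here is a C-LOOK sketch on the same example queue (98, 183, 37, 122, 14, 124, 65, 67, head at 53), assuming an upward sweep. The function name is hypothetical. The arm stops at the highest request (183), jumps back to the lowest pending request (14), and continues upward; the jump is counted as head movement here, which is an assumption.

```python
def c_look(start, requests):
    """Service order and head movement for C-LOOK sweeping upward."""
    above = sorted(c for c in requests if c >= start)
    below = sorted(c for c in requests if c < start)
    order = above + below
    movement, position = 0, start
    for cylinder in order:
        movement += abs(cylinder - position)
        position = cylinder
    return order, movement

order, movement = c_look(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)     # [65, 67, 98, 122, 124, 183, 14, 37]
print(movement)  # 322
```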
Selecting a Disk-Scheduling Algorithm
SSTF is common and has a natural appeal.
SCAN and C-SCAN perform better for systems that place a heavy load on the disk.
Performance depends on the number and types of requests.
Requests for disk service can be influenced by the file-allocation method.
The disk-scheduling algorithm should be written as a separate module of the operating system, allowing it to be replaced with a different algorithm if necessary.
Either SSTF or LOOK is a reasonable choice for the default algorithm.
Exercise: give an example where FCFS is the better choice, and one where it is not.
Disk Management
Low-level formatting, or physical formatting: dividing a disk into sectors that the disk controller can read and write.
To use a disk to hold files, the operating system still needs to record its own data structures on the disk.
Partition the disk into one or more groups of cylinders.
Logical formatting, or “making a file system”.
Boot block initializes the system.
The bootstrap is stored in ROM.
Bootstrap loader program.
Methods such as sector sparing are used to handle bad blocks.
Swap-Space Management
Swap space: virtual memory uses disk space as an extension of main memory.
Swap space can be carved out of the normal file system or, more commonly, it can be in a separate disk partition.
Swap-space management
4.3BSD allocates swap space when a process starts; it holds the text segment (the program) and the data segment.
The kernel uses swap maps to track swap-space use.
Solaris 2 allocates swap space only when a page is forced out of physical memory, not when the virtual memory page is first created.
Disk Reliability
Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.
RAID is one important technique currently in common use.
RAID
Redundant Array of Inexpensive Disks (RAID)
A set of physical disk drives viewed by the OS as a single logical drive
Replaces large-capacity disks with multiple smaller-capacity drives to improve I/O performance (at lower price)
Data are distributed across physical drives in a way that enables simultaneous access to data from multiple drives
Redundant disk capacity is used to compensate for the increase in the probability of failure due to multiple drives
Improves availability because there is no single point of failure
Six levels of RAID representing different design alternatives
RAID Level 0
Does not include redundancy
Data is striped across the available disks
Total storage space across all disks is divided into strips
Strips are mapped round-robin to consecutive disks
A set of consecutive strips that maps exactly one strip to each disk in the array is called a stripe
Can you see how this improves the disk I/O bandwidth? Strips on different disks can be accessed in parallel.
What access pattern gives the best performance?
          disk 0   disk 1   disk 2   disk 3
stripe 0  strip 0  strip 1  strip 2  strip 3
stripe 1  strip 4  strip 5  strip 6  strip 7
...
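The round-robin mapping of strips to disks in RAID 0 can be written as one division with remainder. This is a sketch with a hypothetical function name; the four-disk array matches the layout in the illustration.

```python
def locate_strip(strip, num_disks):
    """Map a logical strip number to (disk index, stripe index)."""
    return strip % num_disks, strip // num_disks

for strip in range(8):
    disk, stripe = locate_strip(strip, num_disks=4)
    print(f"strip {strip} -> disk {disk}, stripe {stripe}")
```

Strips 0-3 land on four different disks in stripe 0, so a request spanning a whole stripe can keep every disk busy at once.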
RAID Level 1
Redundancy is achieved by duplicating all the data
Every disk has a mirror disk that stores exactly the same data
A read can be serviced by either of the two disks that contain the requested data (improved performance over RAID 0 if reads dominate)
A write request must be done on both disks but can be done in parallel
Recovery is simple but cost is high
disk 0   mirror 0  disk 1   mirror 1
strip 0  strip 0   strip 1  strip 1
strip 2  strip 2   strip 3  strip 3
...
RAID Levels 2 and 3
Parallel access: all disks participate in every I/O request
Small strips, since the size of each read/write = # of disks × strip size (aka the stripe size)
RAID 2: an error-correcting code is calculated across corresponding bits on each data disk and stored on log(# data disks) parity disks
Hamming code: can correct single-bit errors and detect double-bit errors
Less expensive than RAID 1 but still pretty high overhead; not really needed in most reasonable environments
RAID 3: a single redundant disk that keeps parity bits
$P(i) = X_2(i) \oplus X_1(i) \oplus X_0(i)$
In the event of a failure, data can be reconstructed, e.g. $X_2(i) = P(i) \oplus X_1(i) \oplus X_0(i)$
Can only tolerate a single failure at a time
[Figure: data disks b0, b1, b2 and parity disk P(b)]
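The RAID 3 parity relation is a bytewise XOR, and the same operation rebuilds a lost disk from the survivors. The sketch below uses made-up two-byte strips and a hypothetical helper name.

```python
from functools import reduce
from operator import xor

def parity(strips):
    """Bytewise XOR of equally sized strips."""
    return bytes(reduce(xor, column) for column in zip(*strips))

# Illustrative data only: three data strips X0, X1, X2.
x0, x1, x2 = b"\x0f\xaa", b"\xf0\x55", b"\x3c\x3c"
p = parity([x0, x1, x2])

# If the disk holding X2 fails, XOR the parity with the surviving strips.
rebuilt_x2 = parity([p, x1, x0])
print(rebuilt_x2 == x2)  # True
```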
Parity Codes
Single-bit parity: detects single-bit errors
Two-dimensional bit parity: detects and corrects single-bit errors
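Two-dimensional parity can correct a single-bit error because the failing row parity and the failing column parity intersect at the flipped bit. A minimal sketch, assuming even parity and made-up data; the function names are hypothetical.

```python
def parities(rows):
    """Even-parity bit for every row and every column of a bit matrix."""
    row_par = [sum(row) % 2 for row in rows]
    col_par = [sum(col) % 2 for col in zip(*rows)]
    return row_par, col_par

def locate_single_error(received, row_par, col_par):
    """Return (row, column) of a single flipped bit, or None if none is found."""
    new_row, new_col = parities(received)
    bad_rows = [i for i, (a, b) in enumerate(zip(row_par, new_row)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(col_par, new_col)) if a != b]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None

data = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 1]]
row_par, col_par = parities(data)

received = [row[:] for row in data]
received[1][2] ^= 1  # flip one bit in transit
print(locate_single_error(received, row_par, col_par))  # (1, 2)
```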
RAID Levels 4 and 5
RAID 4
Large strips with a parity strip like RAID 3
Independent access: each disk operates independently, so multiple I/O requests can be satisfied in parallel
With independent access, a small write = 2 reads + 2 writes (read the old data and old parity, write the new data and new parity)
Example: if a write is performed only on strip 0:
$$\begin{aligned}
P'(i) &= X_2(i) \oplus X_1(i) \oplus X_0'(i) \\
      &= X_2(i) \oplus X_1(i) \oplus X_0'(i) \oplus X_0(i) \oplus X_0(i) \\
      &= P(i) \oplus X_0'(i) \oplus X_0(i)
\end{aligned}$$
Parity disk can become a bottleneck
disk 0   disk 1   disk 2   parity disk
strip 0  strip 1  strip 2  P(0-2)
strip 3  strip 4  strip 5  P(3-5)
RAID 5
Like RAID 4 but parity strips are distributed across all disks
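The RAID 4 small-write rule (new parity = old parity XOR old data XOR new data) can be checked against a full recomputation of the parity. The strips below are made-up values and the helper names are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally sized strips."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_parity(old_parity, old_data, new_data):
    """Update parity from the old parity and the one strip that changed."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Illustrative strips only.
x0, x1, x2 = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"
old_parity = xor_bytes(xor_bytes(x2, x1), x0)

new_x0 = b"\xff\x00"
updated = small_write_parity(old_parity, x0, new_x0)   # 2 reads + 2 writes
recomputed = xor_bytes(xor_bytes(x2, x1), new_x0)      # reads every data disk
print(updated == recomputed)  # True
```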
Calculating a Hamming Code
• Procedure:
–Place message bits in their non-power-of-two Hamming positions
–Build a table listing the binary representation of each of the message bit positions
–Calculate the check bits
Hamming Code Example
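Message to be sent: 1 0 1 1
Position  1  2  3  4  5  6  7
Bit       _  _  1  _  0  1  1
Positions 1, 2, and 4 (the powers of two, $2^0$, $2^1$, $2^2$) are reserved for check bits; the message bits fill positions 3, 5, 6, and 7.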
Calculate check bits: write each message-bit position as a sum of powers of two.
$3 = 2^1 + 2^0 = 011_2$
$5 = 2^2 + 2^0 = 101_2$
$6 = 2^2 + 2^1 = 110_2$
$7 = 2^2 + 2^1 + 2^0 = 111_2$
Starting with the $2^0$ position: look at the message positions with a 1 in the $2^0$ column (3, 5, 7) and count the number of 1’s in the corresponding message bits. If the count is even, place a 1 in the $2^0$ check bit, i.e., use odd parity; otherwise, place a 0.
Bits 3, 5, 7 are 1, 0, 1: the count is even, so the $2^0$ check bit (position 1) is 1.
Position  1  2  3  4  5  6  7
Bit       1  _  1  _  0  1  1
Repeat with the $2^1$ position: look at the message positions with a 1 in the $2^1$ column (3, 6, 7) and count the number of 1’s in the corresponding message bits. If the count is even, place a 1 in the $2^1$ check bit; otherwise, place a 0.
Bits 3, 6, 7 are 1, 1, 1: the count is odd, so the $2^1$ check bit (position 2) is 0.
Position  1  2  3  4  5  6  7
Bit       1  0  1  _  0  1  1
Repeat with the $2^2$ position: look at the message positions with a 1 in the $2^2$ column (5, 6, 7) and count the number of 1’s in the corresponding message bits. If the count is even, place a 1 in the $2^2$ check bit; otherwise, place a 0.
Bits 5, 6, 7 are 0, 1, 1: the count is even, so the $2^2$ check bit (position 4) is 1.
Position  1  2  3  4  5  6  7
Bit       1  0  1  1  0  1  1
Original message = 1011
Sent message = 1011011
Now, how do we check for a single-bit error in the sent message using the Hamming code?
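The encoding procedure can be sketched in a few lines. This follows the odd-parity convention used in these slides (a check bit is 1 when the message bits it covers contain an even number of 1’s); the function name is hypothetical.

```python
def hamming_encode(message_bits):
    """Place check bits at power-of-two positions, using odd parity."""
    n_check = 0
    while (1 << n_check) < len(message_bits) + n_check + 1:
        n_check += 1
    total = len(message_bits) + n_check
    code = [0] * (total + 1)  # index 0 unused so positions start at 1
    bits = iter(message_bits)
    for pos in range(1, total + 1):
        if pos & (pos - 1):  # not a power of two: a message position
            code[pos] = next(bits)
    for k in range(n_check):
        check = 1 << k
        # Count 1's in the message positions whose binary form has bit k set.
        ones = sum(code[pos] for pos in range(1, total + 1)
                   if pos & check and pos != check)
        code[check] = 1 if ones % 2 == 0 else 0
    return code[1:]

print(hamming_encode([1, 0, 1, 1]))  # [1, 0, 1, 1, 0, 1, 1]
```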
Using Hamming Codes to Correct Single-Bit Errors
Received message: 1 0 1 1 0 0 1
Position  1  2  3  4  5  6  7
Bit       1  0  1  1  0  0  1
Positions 1, 2, and 4 hold the $2^0$, $2^1$, and $2^2$ check bits. As in encoding, the $2^k$ check covers every position whose binary representation has a 1 in the $2^k$ column, now including the check bit itself.
Starting with the $2^0$ position: look at the positions with a 1 in the $2^0$ column (1, 3, 5, 7). Count the number of 1’s in both the corresponding message bits and the $2^0$ check bit and compute the parity (just check whether the count is odd). If even parity, there is an error in one of the four bits that were checked.
Bits 1, 3, 5, 7 are 1, 1, 0, 1. Odd parity: no error in bits 1, 3, 5, 7.
Repeat with the $2^1$ position: look at the positions with a 1 in the $2^1$ column (2, 3, 6, 7). Count the number of 1’s in both the corresponding message bits and the $2^1$ check bit and compute the parity. If even parity, there is an error in one of the four bits that were checked.
Bits 2, 3, 6, 7 are 0, 1, 0, 1. Even parity: ERROR in bit 2, 3, 6 or 7!
Repeat with the $2^2$ position: look at the positions with a 1 in the $2^2$ column (4, 5, 6, 7). Count the number of 1’s in both the corresponding message bits and the $2^2$ check bit and compute the parity. If even parity, there is an error in one of the four bits that were checked.
Bits 4, 5, 6, 7 are 1, 0, 0, 1. Even parity: ERROR in bit 4, 5, 6 or 7!
Finding the error’s location
Position  1  2  3  4  5  6  7
Bit       1  0  1  1  0  0  1
No error in bits 1, 3, 5, 7
ERROR in bit 2, 3, 6 or 7
ERROR in bit 4, 5, 6 or 7
The error must be in bit 6, because bits 3, 5, and 7 are correct and all the remaining information agrees on bit 6. Bit 6 is the erroneous bit; change it to 1.
Finding the error’s location: An Easier Alternative to the Last Slides
$3 = 2^1 + 2^0 = 011_2$
$5 = 2^2 + 2^0 = 101_2$
$6 = 2^2 + 2^1 = 110_2$
$7 = 2^2 + 2^1 + 2^0 = 111_2$
Column:          $2^2$  $2^1$  $2^0$
Parity check:    E      E      NE
Error position:  1      1      0    = 6
E = error in column
NE = no error in column
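The column shortcut translates directly into code: each failing check contributes its power of two to the error position. This sketch again assumes the odd-parity convention of these slides and a single-bit error at most; the function name is hypothetical.

```python
def hamming_correct(received):
    """Return (error position, corrected bits); position 0 means no error found."""
    code = [0] + list(received)  # index 0 unused so positions start at 1
    error_pos = 0
    check = 1
    while check < len(code):
        # Count 1's over every position covered by this check, check bit included.
        ones = sum(code[pos] for pos in range(1, len(code)) if pos & check)
        if ones % 2 == 0:  # even parity means this check fails
            error_pos += check
        check <<= 1
    if 0 < error_pos < len(code):
        code[error_pos] ^= 1  # flip the erroneous bit
    return error_pos, code[1:]

position, fixed = hamming_correct([1, 0, 1, 1, 0, 0, 1])
print(position)  # 6
print(fixed)     # [1, 0, 1, 1, 0, 1, 1]
```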
Hamming Codes
• Hamming codes can be used to locate and correct a single-bit error
• If more than one bit is in error, then a Hamming code cannot correct it
• Hamming codes, like parity bits, are only useful on short messages