
[CT 211]

COMPUTER ARCHITECTURE AND ORGANIZATION I

LECTURE #05

Computer Memory System


Layout of Lecture #05

• Introduction
• Principle of Locality
• Characteristics of Memory Systems
• Types of Memory
• Memory Hierarchy
• Cache Memory
• Virtual Memory


Introduction
Introduction (1)

Memory lies at the heart of the stored-program computer.

A stored-program computer is a type of computer where both program instructions and data are stored in the same memory unit (i.e. the von Neumann model).


Introduction (2)

Memory affects speed, capacity, and access times, making it an essential part of computer system performance.

It is essential to the storage of data and the operation of programs.


Principle of Locality
Principle of Locality (1)

• The “principle of locality” (or “locality of reference”) is one of the most important concepts related to computer systems.

• The principle reflects the observation that during the course of execution of a program, memory references by the processor, for both instructions and data, tend to cluster.


Principle of Locality (2)

Programs contain a number of iterative loops and subroutines.

Once a loop or subroutine is entered, there are repeated references to a small set of instructions.

Similarly, operations on tables and arrays involve access to a clustered set of data words.


Principle of Locality (3)

• There are three forms of locality:
1. Temporal locality – the tendency of a program to reference in the near future those units of memory referenced in the recent past.
E.g. When an iterative loop is executed, the processor executes the same set of instructions repeatedly.
2. Spatial locality – the tendency of a program to reference units of memory whose addresses are near one another.
E.g. If a unit of memory “x” is referenced at time “t”, it is likely that units in the range “x – k” through “x + k” will be referenced in the near future.


Principle of Locality (4)

3. Sequential locality – instructions tend to be accessed sequentially.
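To make these forms concrete, here is a minimal C sketch (illustrative only, not from the original slides) in which all three forms of locality appear:

    #include <stdio.h>
    #define N 1000

    int main(void) {
        int a[N];
        int sum = 0;

        /* Sequential locality: the instructions of these loops are
           fetched one after another. */
        for (int i = 0; i < N; i++) {
            a[i] = i;            /* Temporal locality: i is reused on
                                    every iteration of the loop. */
        }

        /* Spatial locality: a[0], a[1], a[2], ... occupy adjacent
           addresses, so a cache block fetched for one element also
           holds its neighbors. */
        for (int i = 0; i < N; i++) {
            sum += a[i];         /* Temporal locality: sum is reused. */
        }

        printf("sum = %d\n", sum);
        return 0;
    }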


Characteristics of Memory Systems
Characteristics of Memory Systems (1)

The complex subject of computer memory is made more manageable if we classify memory systems according to their key characteristics.

The most important of these characteristics are: location, capacity, unit of transfer, access method, performance, physical type, physical characteristics, and organization.
Characteristics of Memory Systems (2)

(1) Location: internal (e.g. registers, cache, main memory); external (e.g. optical disks, magnetic disks, tapes)
(2) Capacity: number of words; number of bytes
(3) Unit of transfer: word; block
(4) Access method: sequential; direct; random; associative
(5) Performance: access time; cycle time; transfer rate
(6) Physical type: semiconductor; magnetic; optical; magneto-optical
(7) Physical characteristics: volatile/non-volatile; erasable/non-erasable
(8) Organization: memory modules
Types of Memory
Types of Memory (1)

• Although a large number of memory technologies exist, there are only two basic types of memory, namely:
1. Random Access Memory (RAM)
2. Read-Only Memory (ROM)
• The term “random” in random access memory means that any memory location can be accessed in the same amount of time, regardless of its position in the memory.
• RAM, also called “read-write memory” or “main memory”, is used to store the programs and data that the computer needs when executing programs.
• However, RAM is “volatile”: it loses this information once the power is turned off.
Types of Memory (2)

There are two types of RAM, namely:
1. Dynamic RAM (DRAM)
2. Static RAM (SRAM)

DRAM is constructed from capacitors that slowly leak their charge over time. Thus, they must be refreshed every few milliseconds to prevent data loss.

DRAM is a “cheap” memory owing to its simple design.


Types of Memory (3)

• RAM chips that are based upon flip-flops (e.g. the D flip-flop) are referred to as static RAM (SRAM), because the contents of each location persist as long as power is applied to the chips.

• SRAM is faster and much more expensive than DRAM; however, designers use DRAM because it is much denser (it can store many bits per chip), uses less power, and generates less heat than SRAM.


Types of Memory (4)

For these reasons, both technologies are often used in combination: DRAM for main memory and SRAM for cache.

The basic operation of all DRAM memories is the same, but there are many flavors, including:
1. Multibank DRAM (MDRAM)
2. Fast-Page Mode (FPM) DRAM
3. Extended Data Out (EDO) DRAM
4. Burst EDO DRAM (BEDO DRAM)
5. Synchronous Dynamic Random Access Memory (SDRAM)
6. Synchronous-Link (SL) DRAM
7. Double Data Rate (DDR) SDRAM
8. Rambus DRAM (RDRAM)
9. Direct Rambus (DR) DRAM


Types of Memory (5)

In addition to RAM, most computers contain a small amount of ROM that stores critical information necessary to operate the system, such as the program necessary to boot the computer.

ROM is “non-volatile” and always retains its data.

This type of memory is also used in embedded systems or any systems where the programming does not need to change (e.g. calculators, automobiles, laser printers).


Types of Memory (6)

• There are five basic types of ROM, namely:
1. ROM
2. PROM (Programmable Read-Only Memory)
3. EPROM (Erasable PROM)
4. EEPROM (Electrically Erasable PROM)
5. Flash memory
• PROM can be programmed by the user with the appropriate equipment.
• Whereas ROMs are “hardwired”, PROMs have fuses that can be blown to program the chip.
• Once programmed, the data and instructions in PROM cannot be changed.


Types of Memory (7)

EPROM is programmable with the added advantage of being reprogrammable.

To reprogram an EPROM, the entire chip must first be erased using a special tool that emits ultraviolet light.

EEPROM requires no special tool for erasure (erasure is performed by applying an electric field, and portions of the chip can be erased one byte at a time).


Types of Memory (8)

• Flash memory is essentially an EEPROM with the added benefit that data can be written or erased in blocks, removing the one-byte-at-a-time limitation (making flash memory faster than EEPROM).

• Flash memory has become a very popular type of storage and is often used in many different devices, including cell phones, digital cameras, music players, and solid-state disk drives.


Memory Hierarchy


Memory Hierarchy (1)

In the past few decades, CPU processing speed, as measured by the number of instructions executed per second, has doubled every 18 months, for the same price.

Computer memory has experienced a similar increase along a different dimension (quadrupling in size every 36 months) for the same price.

Memory speed, however, has only increased at a rate of less than 10% per year.


Memory Hierarchy (2)

Therefore, although processing speed increases at the same rate that memory size increases, the gap between the speed of the processor and the speed of memory also increases.

As the gap between processor and memory speeds grows, architectural solutions help bridge it.

A typical computer contains several types of memory, ranging from fast, expensive internal registers to slow, inexpensive removable disks.


Memory Hierarchy (3)

The interplay between these different types of memory is exploited so that a computer behaves as if it has a single, large, fast memory, when in fact it contains a range of memory types that operate in a highly coordinated fashion.

Generally speaking, faster memory is more expensive than slower memory.

To provide the best performance at the lowest cost, memory is organized in a hierarchical fashion, referred to as a memory hierarchy.


Memory Hierarchy (4)

• The memory hierarchy can be thought of as a pyramid, with small, fast, expensive memory (registers, cache) at the top and large, slow, inexpensive storage (disk, removable media) at the base.

[Figure: memory hierarchy pyramid]


Memory Hierarchy (5)

Memory is classified based on its “distance” from the processor, where distance is measured by the number of machine cycles required for access.

The closer memory is to the processor, the faster it should be. As memory gets farther from the processor, we can afford longer access times. Thus, slower technologies are used for these memories, and faster technologies are used for memories closer to the CPU.


Memory Hierarchy (6)

We are most interested in the memory hierarchy that involves registers, cache, main memory, and virtual memory.

Registers are storage locations available on the processor itself.

Virtual memory is typically implemented using a hard drive; it extends the address space from RAM to the hard drive.

Virtual memory provides more space; cache memory provides speed.


Memory Hierarchy (7)

To access a particular piece of data, the CPU first sends a request to its nearest memory, usually cache.

If the data is not in cache, then main memory is queried. If the data is not in main memory, then the request goes to disk.

Once the data is located, the data and a number of its nearby data elements are fetched into cache memory.


Memory Hierarchy (8)

• This leads us to some definitions.
– A hit is when data is found at a given memory level.
– A miss is when it is not found.
– The hit rate is the percentage of time data is found at a given memory level.
– The miss rate is the percentage of time it is not. Miss rate = 1 – hit rate.
– The hit time is the time required to access data at a given memory level.
– The miss penalty is the time required to process a miss, including the time that it takes to replace a block of memory plus the time it takes to deliver the data to the processor.


Memory Hierarchy (9)

An entire block of data is copied once the data is located, because the principle of locality tells us that once a byte is accessed, it is likely that a nearby data element will be needed soon.


Cache Memory
Cache Memory (1)

A cache memory is a small but fast memory that the processor uses for information it is likely to need again in the very near future.

Therefore, the purpose of cache memory is to speed up accesses by storing recently used data closer to the CPU, instead of storing it in main memory.

Although cache is much smaller than main memory, its access time is a fraction of that of main memory.


Cache Memory (2)

A cache memory is faster than main memory for a number of reasons:
1) Faster electronics can be used, which also results in a greater expense in terms of money, size, and power requirements (since the cache is small, this increase in cost is relatively small).
2) A cache memory has fewer locations than a main memory, and as a result it has a shallow decoding tree, which reduces access time.
3) The cache is placed both physically closer and logically closer to the CPU than the main memory; this placement avoids communication delays over a shared bus.


Cache Memory (3)

• The following are non-computer examples of caching:
1. Grocery shopping
– You seldom, if ever, go to the grocery store to buy one single item. You buy any items you require immediately in addition to items you will most likely use in the future. The grocery store is similar to main memory, and your home is the cache.
2. Student doing research
– Suppose you are writing a paper on quantum computing. Would you go to the library, check out one book, return home, get the necessary information from that book, go back to the library, check out another book, return home, and so on? No; you would go to the library and check out all the books you might need and bring them all home. The library is analogous to main memory, and your home is, again, similar to cache.
Cache Memory (4)

• Cache memory works on the same principles as the preceding examples (1 & 2), copying frequently used data into the cache rather than requiring an access to main memory to retrieve the data.

• However, cache differs from our real-life examples in one important way: the computer really has no way to know, a priori, what data is most likely to be accessed, so it uses the locality principle and transfers an entire block from main memory into cache whenever it has to make a main memory access.


Cache Memory (5)

The cache location for this new block depends on two things:
1. The cache mapping policy
2. The cache size

• Unlike main memory, which is accessed by address, cache is typically accessed by content; hence, it is often called content-addressable memory (CAM).
Cache Memory (6)

Definition:
– A content-addressable memory (CAM) is a type of computer memory that enables high-speed data retrieval by allowing searches based on content rather than a specific memory address.
– CAM is also known as associative memory.
• Under most cache mapping schemes, the cache entries must be checked or searched to see if the value being requested is stored in cache.
• To simplify this process of locating the desired data, various cache mapping algorithms are used.


Cache Mapping Schemes (1)

When the CPU wants to access data or instructions, it first generates a main memory address.

However, if the data has been copied to cache, the address of the data in cache is NOT THE SAME as the main memory address.
• For example: data located at main memory address 0x2E3 could be located in the first location in cache.
• This raises an important question: how, then, does the CPU locate data when it has been copied into cache?


Cache Mapping Schemes (2)

• The answer is that the CPU uses a specific mapping scheme that “converts” the main memory address into a cache location.

• This address conversion is done by giving special significance to the bits in the main memory address.

• First, the bits are divided into distinct groups called fields.

• Depending on the mapping scheme, we may have two or three fields.

• How we use these fields depends on the particular mapping scheme being used.
Cache Mapping Schemes (3)

• The mapping scheme determines where the data is placed when it is originally copied into cache and also provides a method for the CPU to find previously copied data when searching cache.

• Before we discuss the cache mapping schemes, note the following:
1. Some computers are byte addressable, while others are word addressable.
2. If a machine is byte addressable, we are concerned with the number of bytes; if it is word addressable, we are concerned with the number of words, regardless of the size of those words.
3. Key in determining how the mapping schemes work is knowing how many addresses are contained in main memory, in cache, and in each block.


Cache Mapping Schemes (4)

The cache mapping schemes include the following:
1. Direct mapping
2. Fully associative mapping
3. Set associative mapping


Direct Mapped Cache (1)

• Direct mapped cache assigns cache locations using a “modular” approach.

• Main memory and cache are both divided into blocks of the same size (the size of these blocks varies).

• Since the number of main memory blocks is greater than the number of cache blocks, main memory blocks must compete for cache locations.

• Direct mapping maps block X of main memory to cache block Y = X mod N, where N is the total number of blocks in cache.
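As a minimal sketch (illustrative only), the mapping is a one-line computation; the value N = 4 here anticipates the 4-block example on the next slide:

    #include <stdio.h>

    /* Direct mapping: main memory block X maps to cache block X mod N. */
    static unsigned direct_map(unsigned x, unsigned n) {
        return x % n;
    }

    int main(void) {
        for (unsigned x = 0; x < 8; x++)   /* 8 memory blocks, N = 4 */
            printf("memory block %u -> cache block %u\n", x, direct_map(x, 4));
        return 0;
    }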
Direct Mapped Cache (2)

For example: if cache contains 4 blocks (i.e. N = 4), then main memory block 0 maps to cache block 0, main memory block 1 maps to cache block 1, main memory block 2 maps to cache block 2, main memory block 3 maps to cache block 3, main memory block 4 maps to cache block 0, and so on, as illustrated next.


Direct Mapped Cache (3)

[Figure: main memory blocks mapping to the four cache blocks in round-robin (modulo-4) fashion]


Direct Mapped Cache (4)

• To perform direct mapping, the binary main memory address is partitioned into three fields: tag, block, and offset.

• The format of a main memory address using direct mapping is:

| Tag | Block | Offset |

• The size of each field depends on the physical characteristics of main memory and cache.


Direct Mapped Cache (5)

• The offset field uniquely identifies an address within a specific block (i.e. it points to the desired data in the block).
– The number of bytes (for a byte-addressable machine) or words (for a word-addressable machine) in each block dictates the number of bits in the offset field.
• The block field selects a unique block of cache.
– The number of blocks in cache dictates the number of bits in the block field.
• Many blocks of main memory map to a single block of cache. A tag field in the cache block distinguishes one cached memory block from another.
• Note: The total number of bits in all three fields must equal the number of bits in a main memory address.


Direct Mapped Cache (6)

• In summary: to determine how direct mapping works, you must know the following:
1. How many bits are in the main memory address (which is determined by how many addresses exist in main memory).
2. How many blocks are in cache (which determines the size of the block field).
3. How many addresses are in a block (which determines the size of the offset field).
• Once you know these values, you can use the direct mapping address format to locate the block in cache to which a main memory block maps. Once you find the cache block, you can determine whether the block currently in cache is the one you want by checking the tag. If the tags match (the tag from the memory address field and the tag in cache), use the offset field to find the desired data in the cache block.
Direct Mapped Cache (7)

• EXAMPLE 1: Consider a byte-addressable main memory consisting of 4 blocks, and a cache with 2 blocks, where each block is 4 bytes.
– This means Blocks 0 and 2 of main memory map to Block 0 of cache, and Blocks 1 and 3 of main memory map to Block 1 of cache. (This is easy for us to determine simply by using modular arithmetic, because 0 mod 2 = 0, 1 mod 2 = 1, 2 mod 2 = 0, and 3 mod 2 = 1.) However, the computer must use the fields in the memory address itself.
– Using the tag, block, and offset fields, we can see how main memory maps to cache.


Direct Mapped Cache (8)

• EXAMPLE 1 cont’d
– First, we need to determine the address format for mapping. Each block is 4 bytes, so the offset field must contain 2 bits; there are 2 blocks in cache, so the block field must contain 1 bit; this leaves 1 bit for the tag (as a main memory address has 4 bits, because there are a total of 2^4 = 16 bytes).


Direct Mapped Cache (9)

• EXAMPLE 1 cont’d
– Suppose we need to access main memory address 3₁₆ (0011 in binary). If we partition 0011 using the address format above, we get: tag = 0, block = 0, offset = 11.
– Thus, the main memory address 0011 maps to cache block 0 (because the block field evaluates to 0).
– The tag (0) is stored in the cache block along with the data.

The next slide illustrates another mapping.
Direct Mapped Cache (10)

[Figure: another example of a main memory address mapping to a cache block]


Direct Mapped Cache (11)

• EXAMPLE 2: Assume a byte-addressable memory consisting of 2^14 bytes, a cache with 16 blocks, and blocks of 8 bytes.
– The number of main memory blocks is 2^14 / 2^3 = 2^11 = 2048.
– Each main memory address requires 14 bits. Of this 14-bit address field, the rightmost 3 bits form the offset field (we need 3 bits to uniquely identify one of the 8 bytes in a block).
– We need 4 bits to select a specific block in cache, so the block field consists of the middle 4 bits.
– The remaining 7 bits make up the tag field.
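The field widths of EXAMPLE 2 can be applied mechanically with shifts and masks. The following minimal C sketch (the sample address is arbitrary, chosen only for illustration) extracts the three fields from a 14-bit address:

    #include <stdio.h>

    #define OFFSET_BITS 3   /* 8 bytes per block  -> 3 offset bits */
    #define BLOCK_BITS  4   /* 16 blocks in cache -> 4 block bits  */
                            /* 14 - 4 - 3 = 7 tag bits             */
    int main(void) {
        unsigned addr   = 0x1A5C & 0x3FFF;  /* keep 14 bits of a sample address */
        unsigned offset =  addr & ((1u << OFFSET_BITS) - 1);
        unsigned block  = (addr >> OFFSET_BITS) & ((1u << BLOCK_BITS) - 1);
        unsigned tag    =  addr >> (OFFSET_BITS + BLOCK_BITS);
        printf("tag = %u, block = %u, offset = %u\n", tag, block, offset);
        return 0;
    }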


Fully Associative Cache (1)

Suppose that instead of placing memory blocks in specific cache locations based on the memory address, we allow a block to go anywhere in cache.

In this way, cache would have to fill up before any blocks are evicted.

This is how fully associative cache works.

A memory address is partitioned into only two fields: the tag and the offset.


Fully Associative Cache (2)

Suppose, as before (Example 2), we have 14-bit memory addresses, a cache with 16 blocks, and blocks of 8 bytes. The field format of a memory reference is then:

| Tag (11 bits) | Offset (3 bits) |

When the cache is searched, all tags are searched in parallel to retrieve the data quickly. This requires special, costly hardware.
Fully Associative Cache (3)

• You will recall that direct mapped cache evicts a block whenever another memory reference needs that cache location.

• With fully associative cache, we have no such mapping, so we must devise an algorithm to determine which block to evict from the cache.

• The block that is evicted is called the victim block.

• There are a number of ways to pick a victim; we will discuss them shortly under Replacement Policy.


Fully Associative Cache (4)

In summary:
– Fully associative cache allows a main memory block to be mapped to any block in cache. Once a main memory block resides in cache, to locate a specific byte, the computer compares the tag field of the main memory address to all tags stored in cache (in one comparison). Once the specific cache block is located, the offset field is used to locate the required data within that block. If the tag of the memory address doesn’t match any tags in cache, the memory block containing the desired data must be transferred into cache. This transfer may require picking a victim block to transfer out of cache.


Set Associative Cache (1)

Because of its speed and complexity, fully associative cache is very expensive.

On the other hand, although direct mapping is inexpensive, it is very restrictive.

Set associative cache combines the ideas of direct mapped cache and fully associative cache.

An N-way set associative cache mapping is like direct mapped cache in that a memory reference maps to a particular location in cache.


Set Associative Cache (2)

Unlike direct mapped cache, however, a memory reference maps to a set of several cache blocks, similar to the way in which fully associative cache works.

Instead of mapping anywhere in the entire cache, a memory reference can map only to the subset of cache slots in its set.


Set Associative Cache (3)

• The number of cache blocks per set in set associative cache varies according to overall system design.
– For example, a 2-way set associative cache can be conceptualized as a collection of sets, each containing two different memory blocks.

[Figure: logical and linear views of a 2-way set associative cache]


Set Associative Cache (4)

In set associative cache mapping, a memory reference is divided into three fields: tag, set, and offset.

As with direct mapped cache, the offset field chooses the word within the cache block, and the tag field uniquely identifies the memory address.

The set field determines the “set” to which the memory block maps.
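A minimal C sketch of a 2-way set associative lookup (the cache geometry here is an assumption for illustration): the set index is computed modulo the number of sets, and only the tags within that set are compared:

    #include <stdio.h>
    #include <stdbool.h>

    #define NUM_SETS 8   /* assumed: 16 blocks organized as 8 two-way sets */
    #define WAYS     2

    struct line { bool valid; unsigned tag; };
    static struct line cache[NUM_SETS][WAYS];

    /* A memory block maps only to set (block mod NUM_SETS); the WAYS lines
       of that set are the only ones whose tags must be compared. */
    static bool lookup(unsigned mem_block) {
        unsigned set = mem_block % NUM_SETS;
        unsigned tag = mem_block / NUM_SETS;
        for (int w = 0; w < WAYS; w++)
            if (cache[set][w].valid && cache[set][w].tag == tag)
                return true;
        return false;
    }

    int main(void) {
        cache[5][0] = (struct line){ true, 3 };  /* pretend block 29 is cached */
        printf("block 29: %s\n", lookup(29) ? "hit" : "miss");
        printf("block 30: %s\n", lookup(30) ? "hit" : "miss");
        return 0;
    }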
Cache Replacement Policy (1)

With fully associative and set associative cache, a replacement policy is invoked when it becomes necessary to evict a block from cache.

An optimal replacement policy would be able to look into the future to see which blocks won’t be needed for the longest period of time.

Although it is impossible to implement an optimal replacement algorithm, it is instructive to use it as a benchmark for assessing the efficiency of any other scheme we come up with.


Cache Replacement Policy (2)

• The replacement policy that we choose depends upon the locality that we are trying to optimize; usually, we are interested in temporal locality.

• There are several algorithms (i.e. replacement policies), including:
1. Least Recently Used (LRU)
2. First-In First-Out (FIFO)
3. Random


Cache Replacement Policy (3)

A least recently used (LRU) algorithm keeps track of the last time that a block was accessed and evicts the block that has been unused for the longest period of time.

The disadvantage of this approach is its complexity: LRU has to maintain an access history for each block, which ultimately slows down the cache.
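A minimal timestamp-based LRU sketch (illustrative only): each block records the logical time of its last access, and the victim is the block with the oldest timestamp. The per-block bookkeeping in touch() is exactly the overhead the slide refers to:

    #include <stdio.h>

    #define NBLOCKS 4
    static unsigned long last_used[NBLOCKS];  /* access history per block */
    static unsigned long now;                 /* logical clock */

    static void touch(int block) { last_used[block] = ++now; }

    static int pick_victim(void) {
        int victim = 0;
        for (int b = 1; b < NBLOCKS; b++)          /* smallest timestamp =  */
            if (last_used[b] < last_used[victim])  /* least recently used   */
                victim = b;
        return victim;
    }

    int main(void) {
        touch(0); touch(1); touch(2); touch(3); touch(0);
        printf("evict block %d\n", pick_victim());  /* block 1 is the LRU */
        return 0;
    }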


Cache Replacement Policy (4)

• First-in, first-out (FIFO) is a popular cache replacement policy.

In FIFO, the block that has been in the cache the longest is evicted, regardless of when it was last used.

A random replacement policy does what its name implies: it picks a block at random and replaces it with a new block.

Random replacement can certainly evict a block that will be needed often or needed soon, but it never thrashes (i.e. it does not constantly throw out a block, then bring it back, then throw it out, then bring it back, repeatedly).


Effective Access Time (EAT) and Hit Ratio (1)

• The performance of hierarchical memory is measured by its effective access time (EAT).

• EAT is a weighted average that takes into account the hit ratio and the relative access times of successive levels of memory.

• The EAT for a two-level memory is given by:

EAT = H × Access_C + (1 − H) × Access_MM

where H is the cache hit rate, and Access_C and Access_MM are the access times for cache and main memory, respectively.


Effective Access Time (EAT) and Hit Ratio (2)

For example, consider a system with a main memory access time of 200ns supported by a cache having a 10ns access time and a hit rate of 99%.

Suppose access to cache and main memory occurs concurrently/in parallel (i.e. the accesses overlap).

The EAT is:
0.99(10ns) + 0.01(200ns) = 9.9ns + 2ns = 11.9ns.


Effective Access Time (EAT) and Hit Ratio (3)

• Again, consider a system with a main memory access time of 200ns supported by a cache having a 10ns access time and a hit rate of 99%.

If the accesses do not overlap (i.e. they are sequential), the EAT is:

0.99(10ns) + 0.01(10ns + 200ns) = 9.9ns + 2.1ns = 12ns.

• The equation for determining the effective access time can be extended to any number of memory levels.
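As a quick check of the two worked examples above, this minimal C sketch evaluates both forms of the EAT equation with the lecture's numbers:

    #include <stdio.h>

    int main(void) {
        double h = 0.99, t_cache = 10.0, t_mm = 200.0;   /* times in ns */

        /* Overlapped (parallel) access: a miss costs only the memory time. */
        double overlapped = h * t_cache + (1 - h) * t_mm;

        /* Sequential access: a miss pays for the cache probe as well. */
        double sequential = h * t_cache + (1 - h) * (t_cache + t_mm);

        printf("overlapped: %.1f ns\n", overlapped);   /* 11.9 ns */
        printf("sequential: %.1f ns\n", sequential);   /* 12.0 ns */
        return 0;
    }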


When Does Caching Break Down? (1)

• Caching depends upon programs exhibiting good locality.
– Some object-oriented programs have poor locality owing to their complex, dynamic structures.
– Arrays stored in column-major rather than row-major order can be problematic for certain cache organizations.

• With poor locality, caching can actually cause performance degradation rather than performance improvement.


Cache Writing Policies (1)

Cache replacement policies must take into account dirty blocks: those blocks that have been updated while they were in the cache.

Dirty blocks must be written back to memory. A write policy determines how this will be done.

There are two types of write policies: write through and write back.

Write through updates cache and main memory simultaneously on every write.


Cache Writing Policies (2)

• Write back (also called copyback) updates memory only when the block is selected for replacement.

• The disadvantage of write through is that memory must be updated with each cache write, which slows down the access time on updates. This slowdown is usually negligible, because the majority of accesses tend to be reads, not writes.

• The advantage of write back is that memory traffic is minimized; its disadvantage is that memory does not always agree with the value in cache, causing problems in systems with many concurrent users.


Instruction and Data Cache (1)

• The cache we have been discussing is called a unified or integrated cache, where both instructions and data are cached.

• Many modern systems employ separate caches for data and instructions.
– This is called a Harvard cache.

The separation of data from instructions provides better locality, at the cost of greater complexity.
– Simply making the cache larger provides about the same performance improvement without the complexity.


Instruction and Data Cache (2)

• Cache performance can also be improved by adding a small associative cache to hold blocks that have been evicted recently.
– This is called a victim cache.

• A trace cache is a variant of an instruction cache that holds decoded instructions for program branches, giving the illusion that noncontiguous instructions are really contiguous.


Levels of Cache (1)

• Most of today’s small systems employ multilevel cache hierarchies.

• The levels of cache form their own small memory hierarchy.

• Level 1 (L1) cache (8KB to 64KB) is situated on the processor itself.
– Access time is typically about 4ns.

• Level 2 (L2) cache (64KB to 2MB) may be on the motherboard, or on an expansion card.
– Access time is usually around 15–20ns.


Levels of Cache (2)

In systems that employ three levels of cache, the Level 2 cache is placed on the same die as the CPU (reducing access time to about 10ns).

Accordingly, the Level 3 (L3) cache (2MB to 256MB) refers to cache that is situated between the processor and main memory.

Once the number of cache levels is determined, the next thing to consider is whether data (or instructions) can exist in more than one cache level.


Levels of Cache (3)

• If the cache system uses an inclusive cache, the same data may be present at multiple levels of cache.

• Strictly inclusive caches guarantee that all data in a smaller cache also exists at the next higher level.

• Exclusive caches permit only one copy of the data.

• The tradeoffs in choosing one over the other involve weighing the variables of access time, memory size, and circuit complexity.
Virtual Memory
Virtual Memory (1)

• Cache memory enhances performance by providing faster memory access speed.

• Virtual memory enhances performance by providing greater memory capacity, without the expense of adding main memory.

• Instead, a portion of a disk drive serves as an extension of main memory.

• If a system uses paging, virtual memory partitions main memory into individually managed page frames that are written (or paged) to disk when they are not immediately needed.


Virtual Memory (2)

A physical address is the actual address in physical memory.

Programs create virtual addresses that are mapped to physical addresses by the memory manager.

• Page faults occur when a logical address requires that a page be brought in from disk.

• Memory fragmentation occurs when the paging process results in the creation of small, unusable clusters of memory addresses.
Virtual Memory (3)

• Main memory and virtual memory are divided into equal-sized pages.

• The entire address space required by a process need not be in memory at once. Some parts can be on disk, while others are in main memory.

• Further, the pages allocated to a process do not need to be stored contiguously, either on disk or in memory.

• In this way, only the needed pages are in memory at any time; the unnecessary pages remain in slower disk storage.


Virtual Memory (4)

• Information concerning the location of each page, whether on disk or in memory, is maintained in a data structure called a page table.

• There is one page table for each active process.

[Figure: layout of a page table]


Virtual Memory (5)

• When a process generates a virtual address, the operating system translates it into a physical memory address.

• To accomplish this, the virtual address is divided into two fields: a page field and an offset field.

• The page field determines the page location of the address, and the offset indicates the location of the address within the page.

• The logical page number is translated into a physical page frame through a lookup in the page table.


Virtual Memory (6)

• If the valid bit is zero in the page table entry for the logical address, the page is not in memory and must be fetched from disk.
– This is a page fault.
– If necessary, a page is evicted from memory and is replaced by the page retrieved from disk, and the valid bit is set to 1.

• If the valid bit is 1, the virtual page number is replaced by the physical frame number.

• The data is then accessed by adding the offset to the physical frame number.


Virtual Memory (7)

• As an example, suppose a system has a virtual address space of 8K, a physical address space of 4K, and byte addressing.
– We have 2^13 / 2^10 = 2^3 = 8 virtual pages.
• A virtual address has 13 bits (8K = 2^13), with 3 bits for the page field and 10 for the offset, because the page size is 1024.
• A physical memory address requires 12 bits: the first two bits for the page frame and the trailing 10 bits for the offset.
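A minimal C sketch of this translation (the page table contents below are invented for illustration; the lecture's actual table is in the figure on the next slide):

    #include <stdio.h>

    #define PAGE_BITS   3    /* 8 virtual pages */
    #define OFFSET_BITS 10   /* 1024-byte pages */

    struct pte { int valid; unsigned frame; };  /* one entry per virtual page */

    /* Hypothetical page table: pages 0, 2, and 4 are resident. */
    static struct pte page_table[1 << PAGE_BITS] = {
        {1, 1}, {0, 0}, {1, 0}, {0, 0}, {1, 3}, {0, 0}, {0, 0}, {0, 0}
    };

    int main(void) {
        unsigned va     = 5459;                   /* 1010101010011 in binary */
        unsigned page   = va >> OFFSET_BITS;      /* page 5                  */
        unsigned offset = va & ((1u << OFFSET_BITS) - 1);  /* offset 339     */

        if (page_table[page].valid)
            printf("PA = %u\n", (page_table[page].frame << OFFSET_BITS) | offset);
        else
            printf("page fault on page %u\n", page);  /* here: a page fault */
        return 0;
    }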


Virtual Memory (8)

• Suppose we have the page table shown below.

[Figure: example page table with valid bits and frame numbers]

• What happens when the CPU generates address 5459₁₀ = 1010101010011₂ = 1553₁₆?


Virtual Memory (9)

• We said earlier that effective access time (EAT) takes all levels of memory into consideration.

• Thus, virtual memory is also a factor in the calculation, and we also have to consider page table access time.

• Suppose a main memory access takes 200ns, the page fault rate is 1%, and it takes 10ms to load a page from disk. We have:

EAT = 0.99(200ns + 200ns) + 0.01(10ms) = 100,396ns ≈ 100.4µs.


Virtual Memory (10)

Even if we had no page faults, the EAT would be 400ns, because memory is always read twice: first to access the page table, and second to load the page from memory.

Because page tables are read constantly, it makes sense to keep them in a special cache called a translation look-aside buffer (TLB).

TLBs are a special associative cache that stores the mapping of virtual pages to physical pages.


Virtual Memory (11)

• Another approach to virtual memory is the use of segmentation.

• Instead of dividing memory into equal-sized pages, virtual address space is divided into variable-length segments, often under the control of the programmer.

• A segment is located through its entry in a segment table, which contains the segment’s memory location and a bounds limit that indicates its size.

• After a fault, the operating system searches for a location in memory large enough to hold the segment that is retrieved from disk.


Virtual Memory (12)

• Both paging and segmentation can cause fragmentation.

• Paging is subject to internal fragmentation, because a process may not need the entire range of addresses contained within the page. Thus, there may be many pages containing unused fragments of memory.

• Segmentation is subject to external fragmentation, which occurs when contiguous chunks of memory become broken up as segments are allocated and deallocated over time.
Virtual Memory (13)

• Large page tables are cumbersome and slow, but with its uniform memory mapping, paging makes page operations fast. Segmentation allows fast access to the segment table, but segment loading is labor-intensive.

• Paging and segmentation can be combined to take advantage of the best features of both by assigning fixed-size pages within variable-sized segments.

• Each segment has a page table. This means that a memory address will have three fields: one for the segment, another for the page, and a third for the offset.
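A minimal sketch of splitting such a three-field address (the field widths are assumptions chosen only for illustration):

    #include <stdio.h>

    #define SEG_BITS    4    /* assumed widths: 4-bit segment, */
    #define PAGE_BITS   6    /* 6-bit page, 10-bit offset      */
    #define OFFSET_BITS 10

    int main(void) {
        unsigned va     = 0xABCDE;   /* a sample 20-bit virtual address */
        unsigned offset =  va & ((1u << OFFSET_BITS) - 1);
        unsigned page   = (va >> OFFSET_BITS) & ((1u << PAGE_BITS) - 1);
        unsigned seg    =  va >> (OFFSET_BITS + PAGE_BITS);

        /* The segment field selects that segment's own page table; the page
           field indexes it; the offset locates the byte within the frame. */
        printf("segment = %u, page = %u, offset = %u\n", seg, page, offset);
        return 0;
    }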


A Real-World Example (1)

• The Pentium architecture supports both paging and segmentation, and they can be used in various combinations, including unsegmented, unpaged memory; unsegmented, paged memory; segmented, unpaged memory; and segmented, paged memory.

• The processor supports two levels of cache (L1 and L2), both having a block size of 32 bytes.

• The L1 cache is next to the processor, and the L2 cache sits between the processor and memory.

• The L1 cache is in two parts: an instruction cache (I-cache) and a data cache (D-cache).

The next slide shows this organization schematically.
A Real-World Example (2)

[Figure: Pentium cache organization, showing the L1 I-cache and D-cache and the L2 cache between processor and memory]


Reading Assignment

Read Chapter 6: Memory, from the book “The Essentials of Computer Organization and Architecture” by Linda Null and Julia Lobur.
