24 - Caching

The document discusses cache memories in microprocessor interfacing, focusing on direct mapped cache examples, including read and write operations. It explains the structure of cache rows, the use of tags, indices, and offsets, and illustrates how cache misses and hits affect data storage. Additionally, it highlights drawbacks of direct mapped caches, particularly regarding shared cache locations among multiple memory addresses.

Uploaded by

Thomas Shi

ECE 4240

MICROPROCESSOR INTERFACING
Cache Memories

Photo by Yogesh Phuyal on Unsplash


COPYRIGHT NOTICE

These course materials are copyright ©2023 The University of Manitoba.


Some elements are the property of other rights holders and are included
under the Fair Dealing provision of the Canadian Copyright Act.
These course notes are provided to students of ECE 4240 for self study
purposes only. They may not be uploaded to any online service or portal, or
redistributed in any manner, electronic or otherwise, without the express
written permission of the instructor and the other relevant rights holders.
DIRECT MAPPED CACHE - EXAMPLE
Cache
       Initial Cache Contents   Read from 0x17           Read from 0xEA
Index  Tag    Data    Valid     Tag    Data    Valid     Tag    Data    Valid
000    00000  0x0000  0         00000  0x0000  0         00000  0x0000  0
001    00000  0x0000  0         00000  0x0000  0         00000  0x0000  0
010    00000  0x0000  0         00000  0x0000  0         11101  0x2E00  1
011    00000  0x0000  0         00000  0x0000  0         00000  0x0000  0
100    00000  0x0000  0         00000  0x0000  0         00000  0x0000  0
101    00000  0x0000  0         00000  0x0000  0         00000  0x0000  0
110    00000  0x0000  0         00000  0x0000  0         00000  0x0000  0
111    00000  0x0000  0         00010  0x8088  1         00010  0x8088  1

Main Memory
Address    Data
0000 0000  0x6E32
0000 0001  0xFF7A
0000 0010  0x00CB
0000 0011  0x101F
0000 0100  0xFFFF
…          …
0001 0110  0xFF00
0001 0111  0x8088
…          …
1110 1000  0x9876
1110 1001  0x1BFF
1110 1010  0x2E00
…          …
1111 1100  0x0000
1111 1101  0x00FF
1111 1110  0x8000
1111 1111  0x433C

The cache is initially empty. (All Valid bits are clear.)
A read of memory location 0x17 results in a cache miss. Memory is read and the data (0x8088) is stored in cache location 0x7.
A read of memory location 0xEA results in a cache miss. Memory is read and the data (0x2E00) is stored in cache location 0x2.
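The two reads in this example can be reproduced with a short simulation. This is only a minimal sketch: it assumes 8-bit addresses with a 5-bit tag and a 3-bit index, as in the tables, and the names `split`, `read`, `memory`, and `cache` are illustrative rather than taken from the course material.

```python
# Minimal sketch of the 8-row direct mapped cache from the example.
# Assumes 8-bit addresses: upper 5 bits = tag, lower 3 bits = index.

INDEX_BITS = 3
ROWS = 1 << INDEX_BITS

# Only the main memory entries used in the example; other addresses are omitted.
memory = {0x17: 0x8088, 0xEA: 0x2E00}

# Each row holds a tag, a data word, and a valid bit.
cache = [{"tag": 0, "data": 0, "valid": False} for _ in range(ROWS)]


def split(addr):
    """Return (tag, index) for an 8-bit address."""
    return addr >> INDEX_BITS, addr & (ROWS - 1)


def read(addr):
    """Return (data, hit) and fill the row on a miss."""
    tag, index = split(addr)
    row = cache[index]
    if row["valid"] and row["tag"] == tag:
        return row["data"], True
    # Miss: fetch from main memory and overwrite whatever the row held.
    row.update(tag=tag, data=memory[addr], valid=True)
    return row["data"], False
```

Calling `read(0x17)` twice gives a miss followed by a hit, and `read(0xEA)` fills row 0x2 with tag 11101.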
DIRECT MAPPED CACHE - EXAMPLE - WRITE BACK CACHE
Cache
       Write value 0x5D06 to 0xFC   Write value 0x5D0A to 0xFC   Read 0x04
Index  Tag    Data    Valid         Tag    Data    Valid         Tag    Data    Valid
000    00000  0x0000  0             00000  0x0000  0             00000  0x0000  0
001    00000  0x0000  0             00000  0x0000  0             00000  0x0000  0
010    11101  0x2E00  1             11101  0x2E00  1             11101  0x2E00  1
011    00000  0x0000  0             00000  0x0000  0             00000  0x0000  0
100    11111  0x5D06  1             11111  0x5D0A  1             00000  0xFFFF  1
101    00000  0x0000  0             00000  0x0000  0             00000  0x0000  0
110    00000  0x0000  0             00000  0x0000  0             00000  0x0000  0
111    00010  0x8088  1             00010  0x8088  1             00010  0x8088  1

Main Memory (after the read from 0x04)
Address    Data
…          …
0000 0100  0xFFFF
…          …
1111 1100  0x5D0A
1111 1101  0x00FF
1111 1110  0x8000
1111 1111  0x433C

The write to 0xFC results in a cache miss. Data is stored in the cache, but main memory is not updated.
The subsequent write to 0xFC updates the cached value only.
The read from 0x04 causes a write back of cache row 0x4 before the data is read from memory and the cache is updated.
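The write back sequence above can be sketched in a few lines. The `dirty` flag is an assumption added for illustration, since the slides do not say how the controller decides that a row must be written back. The function names are likewise hypothetical.

```python
# Sketch of the write-back sequence above. The dirty flag is an assumption:
# the slides do not say how the controller knows a row needs writing back.

INDEX_BITS = 3
MASK = (1 << INDEX_BITS) - 1

memory = {0x04: 0xFFFF, 0xFC: 0x0000}  # only the addresses used here
cache = [{"tag": 0, "data": 0, "valid": False, "dirty": False} for _ in range(8)]


def evict(index):
    """Write a dirty row back to the address rebuilt from its tag and index."""
    row = cache[index]
    if row["valid"] and row["dirty"]:
        memory[(row["tag"] << INDEX_BITS) | index] = row["data"]
    row["valid"] = row["dirty"] = False


def write(addr, value):
    """Store value in the cache only; main memory is left untouched."""
    tag, index = addr >> INDEX_BITS, addr & MASK
    row = cache[index]
    if not (row["valid"] and row["tag"] == tag):
        evict(index)  # miss: make room first
    row.update(tag=tag, data=value, valid=True, dirty=True)


def read(addr):
    """Return the data, writing back the old row first on a miss."""
    tag, index = addr >> INDEX_BITS, addr & MASK
    row = cache[index]
    if not (row["valid"] and row["tag"] == tag):
        evict(index)  # miss: write back before refilling
        row.update(tag=tag, data=memory[addr], valid=True, dirty=False)
    return row["data"]
```

Running `write(0xFC, 0x5D06)`, `write(0xFC, 0x5D0A)`, and then `read(0x04)` leaves 0x5D0A in main memory at 0xFC and 0xFFFF in cache row 0x4, matching the tables.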
DIRECT MAPPED CACHE WITH OFFSET

Each cache row in the basic direct-mapped cache stores a single value from memory.
A variation on this is to load/store blocks of values from/to the memory.
In this implementation, the memory address (from the processor) is divided into
three parts: a tag, an index, and an offset.

Tag      Index    Offset
m bits   n bits   a bits

The offset is the lowest a bits of the address, the index is the next n bits above
the offset, and the tag is the upper m bits.
For example, in a system with a 32-bit address space, the offset might be 2 bits and
the index 12 bits, leaving a tag of 18 bits (32 - 12 - 2 = 18).
DIRECT MAPPED CACHE WITH OFFSET

When an offset is present, each cache row will store 2^a values for an a-bit offset.
For example, if the offset field is 2 bits, then each cache row will store 4 values.
Ex: If the memory address is 8 bits, the offset is 2 bits, and the cache has 8 rows
(a 3-bit index), then the tag will be 3 bits.
Ex: 0x9D = 0b10011101
Tag = ‘100’; Index = ‘111’; Offset = ‘01’

Cache
Index  Tag  Data3   Data2   Data1   Data0   Valid
000    000  0x0000  0x0000  0x0000  0x0000  0
001    000  0x0000  0x0000  0x0000  0x0000  0
010    000  0x0000  0x0000  0x0000  0x0000  0
011    000  0x0000  0x0000  0x0000  0x0000  0
100    000  0x0000  0x0000  0x0000  0x0000  0
101    000  0x0000  0x0000  0x0000  0x0000  0
110    000  0x0000  0x0000  0x0000  0x0000  0
111    100  0x3008  0x07F4  0xF183  0x0005  1

Accessing any address within the block will load the entire block in response to a cache miss.
In this example, reading location 0x9D would result in the block 0x9C–0x9F
(0b10011100–0b10011111) being cached to index row 0x7.
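The tag/index/offset split can be checked with a couple of shifts and masks. The sketch below is illustrative only; `split_address` and `block_range` are hypothetical helper names, and the field widths are passed in as parameters.

```python
def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


def block_range(addr, offset_bits):
    """Return the first and last address of the block containing addr."""
    base = addr & ~((1 << offset_bits) - 1)
    return base, base + (1 << offset_bits) - 1


tag, index, offset = split_address(0x9D, index_bits=3, offset_bits=2)
print(f"tag={tag:03b} index={index:03b} offset={offset:02b}")
first, last = block_range(0x9D, offset_bits=2)
print(f"block 0x{first:02X}-0x{last:02X}")
```

For 0x9D this prints `tag=100 index=111 offset=01` and `block 0x9C-0x9F`, in agreement with the worked example.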
DIRECT MAPPED CACHE WITH OFFSET - EXAMPLE
Cache
       Initial Cache Contents          Read from 0x17
Index  Tag   Data1   Data0   Valid     Tag   Data1   Data0   Valid
000    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
001    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
010    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
011    0000  0x0000  0x0000  0         0001  0x8088  0xFF00  1
100    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
101    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
110    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
111    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0

Main Memory
Address    Data
…          …
0001 0110  0xFF00
0001 0111  0x8088
…          …

The cache is initially empty. (All Valid bits are clear.)
A read of memory location 0x17 results in a cache miss. Memory is read and the data is stored in cache location 0x3.
In this case, the block size is 2, resulting in locations 0x16 and 0x17 both being retrieved from main memory.
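The block fill in this example can be sketched as follows. The layout (4-bit tag, 3-bit index, 1-bit offset) is read off the tables above. The `read` function and the list-of-dicts representation are assumptions made for illustration.

```python
OFFSET_BITS = 1   # two words per row
INDEX_BITS = 3    # eight rows

memory = {0x16: 0xFF00, 0x17: 0x8088}  # only the block used in the example
cache = [{"tag": 0, "data": [0, 0], "valid": False} for _ in range(8)]


def read(addr):
    """Return (data, hit); on a miss load the whole block containing addr."""
    offset = addr & 1
    index = (addr >> OFFSET_BITS) & 0b111
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    row = cache[index]
    hit = row["valid"] and row["tag"] == tag
    if not hit:
        base = addr & ~1  # address of Data0 in this block
        row.update(tag=tag, data=[memory[base], memory[base + 1]], valid=True)
    return row["data"][offset], hit
```

After `read(0x17)` misses, a following `read(0x16)` hits, because both words of the block were loaded into row 0x3 together.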
DIRECT MAPPED CACHE WITH OFFSET - EXAMPLE
Cache
       Write 0x0020 to 0x16            Write 0x6210 to 0xFE
Index  Tag   Data1   Data0   Valid     Tag   Data1   Data0   Valid
000    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
001    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
010    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
011    0001  0x8088  0x0020  1         1111  0x433C  0x6210  1
100    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
101    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
110    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0
111    0000  0x0000  0x0000  0         0000  0x0000  0x0000  0

Main Memory (after the write back)
Address    Data
…          …
0001 0110  0x0020
0001 0111  0x8088
…          …
1111 1100  0x0000
1111 1101  0x00FF
1111 1110  0x8000
1111 1111  0x433C

A subsequent write to memory location 0x16 updates the cached data in cache location 0x3.
A write to memory location 0xF6 would trigger a write back followed by a block read, replacing cache location 0x3.
Under this address layout, 0xF6 (0b1111 0110) maps to index 011, whereas 0xFE (0b1111 1110), the address named in the table caption, maps to index 111.
DRAWBACKS OF THE DIRECT MAPPED CACHE
One drawback to the direct mapped cache is that each cache location is shared amongst many main memory locations.
If the processor happens to repeatedly access two memory locations that share the same cache index, that cache entry will be repeatedly overwritten. This will result in inefficient use of the cache and a degradation of system performance.
In an attempt to address this problem there is an alternative cache arrangement called a set associative cache, which allows for the storage of multiple data values from different areas of main memory that map to the same cache index.
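The effect of two addresses competing for one row can be made concrete by counting misses. This is a rough sketch assuming the 3-bit index of the earlier examples and one word per row. `count_misses` is a hypothetical helper, not part of the course material.

```python
def count_misses(addresses, index_bits=3):
    """Count misses for a direct mapped cache holding one word per row."""
    rows = {}  # index -> tag currently stored (absent means invalid)
    misses = 0
    for addr in addresses:
        tag, index = addr >> index_bits, addr & ((1 << index_bits) - 1)
        if rows.get(index) != tag:
            misses += 1
            rows[index] = tag
    return misses


# 0x17 and 0xFF both map to index 111, so they evict each other every time.
print(count_misses([0x17, 0xFF] * 4))
# 0x17 and 0xEA map to different rows, so only the first access to each misses.
print(count_misses([0x17, 0xEA] * 4))
```

The first pattern misses on all 8 accesses, while the second misses only twice.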
SET ASSOCIATIVE CACHE

The set associative cache uses more than one working set associated with the
same index location within the cache.
The working sets are managed and operate independently.
Each working set entry has its own tag and validity bit (unlike the direct
mapped cache with offset).
When the processor accesses a memory location, the cache controller examines all
working sets associated with the index value, looking for a valid entry with a
matching tag field.

       Working Set 1            Working Set 0
Index  Tag    Data    Valid     Tag    Data    Valid
000    11101  0x9876  1         00000  0x6E32  1
001    00000  0x0000  0         00000  0x0000  0
010    00000  0x0000  0         00000  0x00CB  1
011    00000  0x0000  0         00000  0x0000  0
100    00000  0x0000  0         00000  0x0000  0
101    00000  0x0000  0         00000  0x0000  0
110    00000  0x0000  0         00010  0xFF00  1
111    00010  0x8088  1         11111  0x433C  1
SET ASSOCIATIVE CACHE

If the operation being performed is a read and no matching tag is located,
then an external memory access is performed to retrieve the requested data.
The data is returned to the processor and also stored within the cache.
If there is an empty entry in a working set with the proper cache index value,
the data is added to the cache.
If all working sets contain data at the identified cache index location, then the
least recently accessed entry is dropped and the new data value is added to the cache.
SET ASSOCIATIVE CACHE

If the cache controller places the most recently accessed value in working set
0, the next most recently accessed in working set 1, and so on, then the
highest numbered working set would have the oldest cached value for that
index row.
When a memory location is accessed by the processor, the cache controller
must check all working sets in an effort to locate the requested data.
This involves a linear search of the working sets.
As a result, set associative caches are slower than direct mapped caches.
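The lookup and replacement rules described on these slides can be sketched with a two-way cache. This is a simplified model under stated assumptions: each index keeps a short list ordered most recently used first (so list position 0 plays the role of working set 0), and the value stored at address 0x0F is made up for the demonstration since that address does not appear in the memory listing.

```python
INDEX_BITS = 3
WAYS = 2  # number of working sets

# 0x17 and 0xFF come from the example memory; the 0x0F value is made up.
memory = {0x17: 0x8088, 0xFF: 0x433C, 0x0F: 0x1234}

# sets[index] is a list of (tag, data) pairs, most recently used first.
sets = [[] for _ in range(1 << INDEX_BITS)]


def read(addr):
    """Return (data, hit) using least-recently-used replacement."""
    tag, index = addr >> INDEX_BITS, addr & ((1 << INDEX_BITS) - 1)
    ways = sets[index]
    for pos, (t, data) in enumerate(ways):  # linear search of the working sets
        if t == tag:
            ways.insert(0, ways.pop(pos))  # move to the front on a hit
            return data, True
    data = memory[addr]
    if len(ways) == WAYS:
        ways.pop()  # drop the least recently accessed entry
    ways.insert(0, (tag, data))
    return data, False
```

Reading 0x17 and then 0xFF leaves both cached at index 111, which a direct mapped cache could not do. A later read of 0x0F, which also maps to index 111, evicts whichever of the two was accessed least recently.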
CACHE COHERENCE

The term cache coherence refers to the problem of ensuring that the information held
in the cache is consistent, system-wide, with what is available in the main memory.
Up to now, the cache controller has ensured that the processor has received coherent
data, whether it is retrieved from the cache or the main memory.

[Figure: Processor Core, Cache Controller, Cache Memory, and Main Memory blocks, with Address and Data lines.]
CACHE COHERENCE

However, if the memory contents are shared amongst several processors, how do we
ensure that the processors always have access to the most up-to-date data for every
memory location?
As well, if other processors are able to independently modify data in main memory,
how do we know that the data held in the cache is coherent?

[Figure: two processors, each with its own Processor Core, Cache Controller, and Cache Memory, sharing one Main Memory over Address and Data lines.]
CACHE COHERENCE

One option is not to cache variables which may be used by other processors.
This is not desirable as it undermines the intent of using a cache in the first
place.
In the most extreme case, it would mean that no values would be cached.
It would also require the cache controller be made aware of what information
is cacheable and what is not. Maintaining that information would be a
complicated operation.
A better option is to allow the cache controller to monitor and react to memory
accesses made by other processors.
This approach is called a snooping cache.
SNOOPING CACHE
Tag Data S

The snooping cache replaces the single valid bit with a 2-bit state field (S), which can
represent one of four possible conditions:
Exclusive: Valid data is stored in the identified cache location and it is identical to
that in memory, but this cache is the only one holding that information.
Shared: Valid data is stored in the identified cache location and main memory, and
may be cached by another processor.
Modified: Valid data is stored in the identified cache location that has been
modified by the local processor, but not written back to memory.
Invalid: There is no valid data stored at that cache location.
SNOOPING CACHE

When the cache is first initialized, all entries are marked as invalid.
If a processor performs a read on a given memory location that results in a cache miss,
the data will be retrieved from main memory, added to the cache, and labeled shared.
Further reads by the same processor that result in a cache hit will return the locally
cached value. The state of the entry remains unchanged.

[Figure: two processors, each with its own Processor Core, Cache Controller, and Cache Memory, sharing one Main Memory over Address and Data lines.]
SNOOPING CACHE

If a second processor also performs a read of the same memory location, it will result
in its own cache miss and a duplicate copy will be stored in that processor’s cache as
a shared entry.
If a shared entry is updated by the processor, it will change its state to modified and
all other caches will be informed that the data for that location in their local caches
is now invalid.
In the modified state, only that one cache now has a valid copy of the data.
SNOOPING CACHE

If modified data is written back to memory, with a copy remaining in the cache, then
the cached entry is relabelled exclusive.
Should a memory read from another processor occur for an entry labeled exclusive by
another cache, the main memory will supply the requested data and the receiving cache
will label its copy as shared. Likewise, the cache with the formerly exclusive copy will
change that entry’s state to shared.
SNOOPING CACHE

If a cache observes a read request from another cache while holding a corresponding
modified data entry, it will temporarily block the read, store the modified value back
to main memory, and relabel its own copy as shared.
The processor that initiated the read will capture the data value during the write back,
add it to its own cache, and mark the value as shared.
SNOOPING CACHE

If a processor writes to an address before having read it, the data will be cached
locally as an exclusive entry if it is written to main memory as well.
If a write back cache is used, the entry will not be written to main memory, and will
be marked as modified.
In either case, other caches will be notified and will invalidate their own copies of
the data for that same memory location.
An exclusive entry read by the local processor will remain exclusive.
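The state changes described on the preceding slides can be collected into a small model. This is a sketch under several assumptions rather than a description of any real controller: the `Bus` and `SnoopingCache` classes are hypothetical, entries are keyed by address instead of by index row, writes follow the write back policy (so a local write always yields a modified entry), and a local write to an exclusive entry is assumed to behave like a write to a shared one.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"


class Bus:
    """Hypothetical shared bus: one main memory plus every attached cache."""

    def __init__(self):
        self.memory = {}
        self.caches = []


class SnoopingCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # address -> [state, data]; missing means invalid
        bus.caches.append(self)

    def state(self, addr):
        return self.lines[addr][0] if addr in self.lines else State.INVALID

    def read(self, addr):
        """Local read: a miss loads the value and labels it shared."""
        if self.state(addr) is State.INVALID:
            for other in self.bus.caches:
                if other is not self:
                    other.snoop_read(addr)  # may force a write back first
            self.lines[addr] = [State.SHARED, self.bus.memory.get(addr, 0)]
        return self.lines[addr][1]

    def write(self, addr, value):
        """Local write (write back policy): modified here, invalid elsewhere."""
        for other in self.bus.caches:
            if other is not self:
                other.lines.pop(addr, None)  # invalidate remote copies
        self.lines[addr] = [State.MODIFIED, value]

    def write_back(self, addr):
        """Flush a modified entry; the retained copy becomes exclusive."""
        if self.state(addr) is State.MODIFIED:
            self.bus.memory[addr] = self.lines[addr][1]
            self.lines[addr][0] = State.EXCLUSIVE

    def snoop_read(self, addr):
        """React to another cache's read of addr."""
        if self.state(addr) is State.MODIFIED:
            self.bus.memory[addr] = self.lines[addr][1]  # write back first
        if self.state(addr) is not State.INVALID:
            self.lines[addr][0] = State.SHARED
```

With two caches on one bus, a read by each leaves both copies shared, a write by one makes its copy modified and the other invalid, and a later read by the second cache forces the write back and returns both to shared.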
SNOOPING CACHE

The snooping nature of the cache refers only to its ability to track shared
memory requests.
The internal organization of the snooping cache may be direct mapped or set
associative in nature.
