CS 523 Advanced Computer Architecture
Lecture 27
Introduction to Cache Coherence Protocols
John Jose
Assistant Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Writing in the cache
x Memory x’ Memory x Memory
x Cache x’ Cache x’ Cache
P P P
Before Write through Write back
Cache Coherence Problem
Initial state: P2 reads A; P3 reads A
P1 P2 P3 P4
A A
Mem.
Now, P2 wants to write A and the P3 reads A ?
How will P3 gets latest value of A written by P2 ?
Cache coherence on shared-bus
P1 P2 P3 P4
A A
Mem. A
Two choices:
Write-update: Broadcast the new value of A on the bus
by P2; value of A snooped by cache of P3. Memory is
also updated.
Write-invalidate: Broadcast an invalidation message with
the address of A; the address snooped by cache of P3
which invalidates its copy of A: The copy in memory is not
up-to-date any longer
If instead of P2 requesting to write A, we had a write miss in
P4 for A, the same two choices of protocol apply.
Write-invalidate
x x’ x
x x x’ I x’ I
P1 P2 P3 P1 P2 P3 P1 P2 P3
Before Write Through Write back
Write-Update
x x’ x
x x x’ x’ x’ x’
P1 P2 P3 P1 P2 P3 P1 P2 P3
Before Write Through Write back
Snooping Protocols
Snooping protocols are based on watching bus activities
and carrying out the appropriate coherency commands
Each CPU-cache system ‘snoops’ (i.e. watches continually)
for write activity concerned with data addresses which it has
cached
Main memory is moved in blocks, and each block has a
state associated with it, which determines what happens to
the entire contents of the block.
The state of a block might change as a result of the
operations
Read-Miss
Read-Hit
Write-Miss
Write-Hit
Snooping Protocols
Write Invalidate
CPU requesting to write to an address, grabs a bus cycle
and sends a ‘write invalidate’ message
All snooping caches invalidate their copy of appropriate
cache line
CPU writes to its cached copy (assume for now that it
also writes through to memory)
Any shared read in other CPUs will now miss in cache
and re-fetch new data.
Snooping Protocols
Write Update
CPU requesting to write grabs bus cycle and broadcasts
new data as it updates its own copy
All snooping caches update their copy
Note that in both schemes, problem of simultaneous writes
is taken care of by bus arbitration - only one CPU can use
the bus at any one time.
Update or Invalidate?
Update looks the simplest, most obvious and fastest
Invalidate scheme is usually implemented with write-back
caches and in that case:
Multiple writes to same word (no intervening read)
need only one invalidate message but would require
an update for each
Writes to same block in (usual) multi-word cache
block require only one invalidate but would require
multiple updates.
Bus bandwidth is a precious commodity in shared memory
multi-processors
Invalidate protocols use significantly less bandwidth.
MESI Protocol
A practical multiprocessor invalidate protocol
which attempts to minimize bus usage.
Allows usage of a ‘write back’ scheme - i.e. main
memory not updated until ‘dirty’ cache line is
displaced
Extension of usual cache tags, i.e. invalid tag and
‘dirty’ tag in normal write back cache.
MESI Protocol
Any cache line can be in one of 4 states (2 bits)
Modified - cache line has been modified, is different from
main memory - is the only cached copy. (‘dirty’)
Exclusive - cache line is the same as main memory and is
the only cached copy
Shared - Same as main memory but copies may exist in
other caches.
Invalid - Line data is not valid (as in simple cache)
Cache line changes state on memory access events.
Due to local processor activity (i.e. cache access)
Due to bus activity - as a result of snooping. cache line
state affected only if address matches
MESI Protocol
Operation can be described informally by looking at action
in local processor
Read Hit
Read Miss
Write Hit
Write Miss
More formally by state transition diagram
MESI Local Read Hit
Line must be in one of MES
This must be correct local value
If M, it must have been modified locally
Simply return value
No state change
MESI - Local Read Miss
1. No other copy in caches
Processor makes bus request to memory
Value read to local cache, marked E
2. One cache has E copy
Processor makes bus request to memory
Snooping cache puts copy value on the bus
Memory access is abandoned
Local processor caches value
Both lines set to S
MESI - Local Read Miss
3. Several caches have S copy
Processor makes bus request to memory
One cache puts copy value on the bus (arbitrated)
Memory access is abandoned
Local processor caches value
Local copy set to S
Other copies remain S
MESI - Local Read Miss
One cache has M copy
Processor makes bus request to memory
Snooping cache puts copy value on the bus
Memory access is abandoned
Local processor caches value
Local copy tagged S
Source (M) value copied back to memory
Source value M -> S
MESI Local Write Hit
Line must be one of MES
1. One cache has M copy
line is exclusive and already ‘dirty’
Update local cache value, no state change
2. One cache has E copy
Update local cache value, State E -> M
3. Several caches have S copy
Processor broadcasts an invalidate on bus
Snooping processors with S copy change S->I
Local cache value is updated
Local state change S->M
MESI Local Write Miss
1. No other copies
Value read from memory to local cache
Value updated
Local copy state set to M
2. Other copies, either one in state E or more in state S
Value read from memory to local cache - bus transaction
marked RWITM (read with intent to modify)
Snooping processors see this and set their copy state to I
Local copy updated & state set to M
MESI Local Write Miss
3. Another copy in state M
Processor issues bus transaction marked RWITM
Snooping processor sees this
Blocks RWITM request
Takes control of bus
Writes back its copy to memory
Sets its copy state to I
Original local processor re-issues RWITM request
Is now simple no-copy case
Value read from memory to local cache
Local copy value updated
Local copy state set to M
MESI - FSM
All of this information can be described compactly using a
state transition diagram
Diagram shows what happens to a cache line in a
processor as a result of
memory accesses made by that processor (read
hit/miss, write hit/miss)
memory accesses made by other processors that
result in bus transactions observed by this snoopy
cache (Mem read, RWITM, Invalidate)
MESI – locally initiated accesses
Read
Miss(sh) Read
Invalid Mem Read Shared Hit
Read Invalidate
RWITM Mem Read
Miss(ex) Write
Write Hit
Miss
Read Read
Modified Exclusive Hit
Hit Write
Hit
Write = bus transaction
Hit
MESI – remotely initiated accesses
Mem Read
Invalidate
Invalid Shared
Mem Read
RWITM Mem Read RWITM
Modified Exclusive
= copy back
johnjose@iitg.ernet.in