Advanced Computer
Architecture
Dr. Angela C. Sodan
Lecture 8
Overview
Memory hierarchy / caches
Memory Hierarchy
Recall: memory/processor gap is
increasing
Technological background:
DRAM – dynamic, must be (automatically)
refreshed; improved performance with BiCMOS
-> the technology for main memory
SRAM – static, no refreshing, faster, more
expensive
-> the technology for caches
In addition, access speed decreases with size
Memory Hierarchy
Trade-off between cost/size and speed
(and space on chip)
Basic solution: memory hierarchy
Registers
Caches (multilevel, bridging)
Main memory
Other measures:
Memory interleaving
(Multiple) register sets
Review Caches
Which characteristics are important?
Cache is implicit (typically transparent, and all
transfers are automatic)
In contrast to registers/memory, which are
accessed explicitly (load/store)
In speed and size between registers and
memory
Works as a “buffer” to memory (load/store)
Why can this work?
Caches and Locality
Caches – only – work if program has
temporal and/or spatial locality
This is typically the case – but
what does this mean?
Temporal: the same data is accessed
multiple times in short time intervals
e.g. a loop
Spatial: spatially close data is accessed
e.g. a vector/array, sequentially accessed
and instructions, if they do not branch… (small example below)
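As a small illustration (an assumed example, not from the slides), a C loop that shows both kinds of locality at once; array names, sizes and the function name are illustrative only:

  #define N 1000
  double x[N], y[N];
  double scale = 2.0;

  /* Spatial locality: x[i] and y[i] are accessed sequentially, so each
     fetched cache line serves several consecutive iterations.
     Temporal locality: the loop's instructions and the scalar 'scale'
     are reused in every iteration within a short time interval.        */
  void scale_and_add(void) {
      for (int i = 0; i < N; i++)
          y[i] += scale * x[i];
  }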
Caches and Locality (cont.)
Thus it helps to
Keep the recently used data in the cache
Fetch more than one word and even
pipeline memory access
Modern processors heavily depend on the
cache
High hit rates (about 90%) required for
good performance
Review Caches (cont.)
Different approaches
Associative or direct
Associative: stores arbitrarily, “searches”,
complex and expensive, works for all kinds of
accesses
Direct: hashes, potential conflicts, can work
poorly with certain access sequences
In practice, the hybrid set-associative approach is
applied: direct access to a set, but a few associative
slots (ways) per set – still cheap and works very well;
typical is 4-way (see the address-splitting sketch below)
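To make the mechanics concrete, a minimal sketch of how a set-associative cache splits an address into offset, set index and tag; the cache size, line size and associativity are assumed illustration values, not from the lecture:

  #include <stdio.h>

  /* Assumed configuration: 32 KB cache, 4-way set-associative,
     32-byte lines -> 256 sets.                                    */
  #define LINE_SIZE  32
  #define NUM_WAYS   4
  #define CACHE_SIZE (32 * 1024)
  #define NUM_SETS   (CACHE_SIZE / (LINE_SIZE * NUM_WAYS))

  int main(void) {
      unsigned addr   = 0x12345678u;                   /* example address   */
      unsigned offset = addr % LINE_SIZE;              /* byte within line  */
      unsigned set    = (addr / LINE_SIZE) % NUM_SETS; /* direct "hash"     */
      unsigned tag    = addr / (LINE_SIZE * NUM_SETS); /* compared against
                                                          the 4 ways of the
                                                          selected set      */
      printf("offset=%u set=%u tag=0x%x\n", offset, set, tag);
      return 0;
  }

The set index selects one set directly (the cheap, direct part); only the few tags within that set are compared associatively, which is why set-associative caches stay inexpensive yet avoid most conflicts.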
Review Caches (cont.)
Replacement strategy required for
associative and set-associative
Least-Recently Used: applies locality idea
Random: easy and often applied in caches
(Not much difference for larger caches, if the
miss penalty is not too high)
Ensuring data coherence:
Reading is uncritical
Writes need to be done in memory, too
– Store through (write through): every change is immediately written to memory
– Write back: change written to memory only when the line is replaced;
applied in modern processors
Review Caches (cont.)
Entity is cache line (multiple words)
Exploits spatial locality
Longer cache line: good if much locality
Danger: Useless fetch cost, wasting cache space
Shorter cache line: good if little locality
Typical is 32 bytes, more recently sometimes 64 bytes
Larger cache size compensates for longer
cache lines, direct caches and simpler
replacement
The Miss Sources
Conflict misses -> increase associativity,
make cache larger
Capacity misses -> make cache larger
Initial (cold) misses -> make cache lines longer
A Cost Metric
Teff – Effective access time
Teff = h1 t1 + (1-h1) h2 t2 + (1-h1) (1-h2) h3 t3 +…
where hi is the hit rate at level i and ti the access time at level i
Use to calculate average access time or to
configure systems according to
requirements (small example below)
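For example, a small C sketch of the formula above for a two-level cache plus main memory; the hit rates and access times are made-up illustration values, and main memory is treated as the last level (h3 = 1):

  #include <stdio.h>

  int main(void) {
      double h1 = 0.90, t1 = 1.0;    /* L1: 90% hit rate, 1 cycle   (assumed) */
      double h2 = 0.95, t2 = 10.0;   /* L2: 95% hit rate, 10 cycles (assumed) */
      double t3 = 100.0;             /* main memory: 100 cycles     (assumed) */

      /* Teff = h1*t1 + (1-h1)*h2*t2 + (1-h1)*(1-h2)*t3 */
      double teff = h1 * t1
                  + (1.0 - h1) * h2 * t2
                  + (1.0 - h1) * (1.0 - h2) * t3;

      printf("Teff = %.2f cycles\n", teff);   /* 0.90 + 0.95 + 0.50 = 2.35 */
      return 0;
  }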
Some Comments About
Machine Configuration
If you purchase a machine, think about your
requirements (estimates of typical
applications)
E.g. will you have large problems which should fit
into memory or even the cache? How fast does your
disk need to be (or do you want multiple disks)?
This may be a question about how to spend
your available budget…
More later when we talk about performance
Caches are Implicit – Are They
Really?
Answer:
Yes, with respect to semantic correctness –
enough for standard programming
No, with respect to performance
- for high performance computing
Proper loop organization
(user or compiler)
Getting Good Cache Performance
Temporal: partition the computation into small loops
  for (i = 0; i < n; i++)
    for (l = 0; l < k; l++)
      for (j = 0; j < m; j++)
        c[i][l] += a[i][j] * b[j][l];
may re-use row i of a (if it fits into the cache)
Spatial: use vectors and sequential
access if possible
Significant speedup in a bio-computing application
Cache Locality (cont.)
A simple improvement to at least
exploit spatial locality for the second matrix
  for (i = 0; i < n; i++)        /* loop interchange */
    for (j = 0; j < m; j++)
      for (l = 0; l < k; l++)
        c[i][l] += a[i][j] * b[j][l];
Note that this code now accesses lines of the
second matrix in the innermost loop!
can increase performance by up to a factor of e.g.
32/4 = 8 (32-byte cache line, 4-byte elements) if only
1 row/column of B fits into the cache
Cache Locality (cont.)
Sequential access on vector outperforms
Arbitrary access on vector
Linked data structures
(unless they fit completely into the cache)
If multiple related data items
If arbitrary access, put them in a single array
with a struct (see the sketch below)
If sequential access, it does not matter
typedef struct { double real, imag; } COMPLEX;  COMPLEX data[n];
double data_r [n], data_i [n];
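As a sketch of why the array-of-structs layout helps for arbitrary access (the names and sizes are illustrative, not from the slides):

  #define N 1024

  /* Layout 1: one array of structs - real and imag of element k
     sit next to each other, usually in the same cache line.       */
  struct complex { double real, imag; };
  struct complex data[N];

  /* Layout 2: two separate arrays - element k lives in two
     different cache lines, one per array.                         */
  double data_r[N], data_i[N];

  /* With arbitrary (random) access, the struct layout touches one
     cache line per element instead of two.                        */
  double mag_sq_aos(int k) { return data[k].real * data[k].real
                                  + data[k].imag * data[k].imag; }
  double mag_sq_soa(int k) { return data_r[k] * data_r[k]
                                  + data_i[k] * data_i[k]; }

With sequential access both layouts stream through memory line by line, so, as stated above, the choice then matters little.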
Cache Locality (cont.)
The matrix example revisited: further
possible improvement:
Try to keep the second matrix in the cache
to exploit temporal locality
Since the whole matrix most likely is too
large, partition it into submatrices
Cache Locality Advanced: Blocking
[A11 A12]   [B11 B12]   [C11 C12]
[A21 A22] x [B21 B22] = [C21 C22]
Calculate submatrices and combine them:
Cij = Ai1*B1j + Ai2*B2j
Cache Locality Advanced (cont.)
Assume that matrix is n*n,
m is number of blocks per dimension
and that submatrix fits into cache!!!
We have to load A m times instead of
once, but B only m instead of n times
Which gives (2m)*n*n elements instead of
(n+1)*n*n elements to load from memory into
cache
E.g. n=1000, m=2 gives
4 million vs. 1001 million elements (see the blocked-loop sketch below)
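A minimal sketch of such blocking (tiling) for the n*n case, assuming the block size BS is chosen so that the submatrices being worked on fit into the cache; N, BS and the array names are illustration values:

  #define N  1024   /* matrix dimension (illustrative)                   */
  #define BS 64     /* block size; BS*BS doubles should fit in the cache */

  double a[N][N], b[N][N], c[N][N];

  /* Blocked matrix multiplication: for each block triple, a BS*BS
     submatrix of b is reused for BS rows of a (temporal locality),
     and the innermost loop runs along rows of b and c (spatial
     locality with row-major storage).                                */
  void matmul_blocked(void) {
      for (int ii = 0; ii < N; ii += BS)
          for (int jj = 0; jj < N; jj += BS)
              for (int ll = 0; ll < N; ll += BS)
                  for (int i = ii; i < ii + BS; i++)
                      for (int j = jj; j < jj + BS; j++)
                          for (int l = ll; l < ll + BS; l++)
                              c[i][l] += a[i][j] * b[j][l];
  }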
Final Comment
Again, you do not need to change all your
programming
But it is worth it if you write programs which
will run many times and take a lot of
runtime:
imagine you want to test 100 scenarios in a
simulation and each run takes 10 hours vs. 1 hour!