Systems I: Locality and Caching

This document discusses locality, cache principles, and multi-level caches. It explains that programs exhibit locality of reference through temporal and spatial locality. Caches exploit locality by storing recently accessed data from main memory. Memory hierarchies use multi-level caches with each level being smaller, faster, and more expensive than the next. This provides an illusion of large, fast memory through the use of small, fast caches and large, slow main memory.

Uploaded by

VenuMadhavKattagoni


Systems I

Locality and Caching

Topics

Locality of reference
Cache principles
Multi-level caches

Locality
Principle of Locality:

Programs tend to reuse data and instructions near those they have used recently, or that were recently referenced themselves.
Temporal locality: Recently referenced items are likely to be
referenced in the near future.
Spatial locality: Items with nearby addresses tend to be
referenced close together in time.

Locality Example:

sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];
return sum;

Data
Reference array elements in succession
(stride-1 reference pattern): Spatial locality
Reference sum each iteration: Temporal locality

Instructions
Reference instructions in sequence: Spatial locality
Cycle through loop repeatedly: Temporal locality

Locality Example
Claim: Being able to look at code and get a qualitative
sense of its locality is a key skill for a professional
programmer.
Question: Does this function have good locality?
int sumarrayrows(int a[M][N])
{
    int i, j, sum = 0;
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

Locality Example
Question: Does this function have good locality?

int sumarraycols(int a[M][N])
{
    int i, j, sum = 0;
    for (j = 0; j < N; j++)
        for (i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}
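
To make the comparison between the two loop orders concrete, here is a minimal sketch that places both functions side by side and adds a helper reporting where each element sits in memory. It relies on C storing 2-D arrays in row-major order. The sizes M = 3 and N = 4 and the helper name offset are illustrative assumptions, not part of the original slides.

```c
#include <assert.h>
#include <stddef.h>

enum { M = 3, N = 4 };  /* illustrative sizes, not from the slides */

/* Row-wise scan: the inner loop walks j, so consecutive accesses are adjacent. */
int sumarrayrows(int a[M][N])
{
    int i, j, sum = 0;
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

/* Column-wise scan: the inner loop walks i, so consecutive accesses are N elements apart. */
int sumarraycols(int a[M][N])
{
    int i, j, sum = 0;
    for (j = 0; j < N; j++)
        for (i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}

/* Hypothetical helper: element offset of a[i][j] from a[0][0] in row-major layout. */
ptrdiff_t offset(int i, int j)
{
    assert(i >= 0 && i < M && j >= 0 && j < N);
    return (ptrdiff_t)i * N + j;
}
```

Both functions return the same sum. The difference is the access pattern: stepping j by one moves the offset by 1 element (stride-1, good spatial locality), while stepping i by one moves it by N elements (stride-N, poor spatial locality once the array no longer fits in the cache).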

Locality Example
Question: Can you permute the loops so that the
function scans the 3-d array a[] with a stride-1
reference pattern (and thus has good spatial
locality)?
int sumarray3d(int a[M][N][N])
{
    int i, j, k, sum = 0;
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            for (k = 0; k < N; k++)
                sum += a[k][i][j];
    return sum;
}
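
One possible answer to the question above is sketched here: order the loops so that the rightmost index changes fastest, which means k outermost, i in the middle, and j innermost. The sketch assumes a cubic N x N x N array, because the index order a[k][i][j] stays within the declared a[M][N][N] bounds only when M equals N. The size N = 3 and the helper offset3d are illustrative assumptions.

```c
#include <assert.h>

enum { N = 3 };  /* illustrative size; assumes a cubic N x N x N array */

/* Loops permuted so the rightmost index (j) varies fastest: a stride-1 scan. */
int sumarray3d(int a[N][N][N])
{
    int i, j, k, sum = 0;
    for (k = 0; k < N; k++)              /* leftmost index: outermost loop */
        for (i = 0; i < N; i++)          /* middle index: middle loop */
            for (j = 0; j < N; j++)      /* rightmost index: innermost loop */
                sum += a[k][i][j];
    return sum;
}

/* Hypothetical helper: element offset of a[k][i][j] from a[0][0][0] (row-major). */
long offset3d(int k, int i, int j)
{
    assert(k >= 0 && k < N && i >= 0 && i < N && j >= 0 && j < N);
    return ((long)k * N + i) * N + j;
}
```

With this ordering, every step of the scan, including the wrap from the end of one row or plane to the start of the next, advances the offset by exactly one element.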

Memory Hierarchies
Some fundamental and enduring properties of
hardware and software:

Fast storage technologies cost more per byte and have less
capacity.

The gap between CPU and main memory speed is widening.

Well-written programs tend to exhibit good locality.

These fundamental properties complement each other beautifully.
They suggest an approach for organizing memory and
storage systems known as a memory hierarchy.

An Example Memory Hierarchy

Toward the top (L0): smaller, faster, and costlier (per byte) storage devices.
Toward the bottom (L5): larger, slower, and cheaper (per byte) storage devices.

L0: registers
    CPU registers hold words retrieved from L1 cache.
L1: on-chip L1 cache (SRAM)
    L1 cache holds cache lines retrieved from the L2 cache memory.
L2: off-chip L2 cache (SRAM)
    L2 cache holds cache lines retrieved from main memory.
L3: main memory (DRAM)
    Main memory holds disk blocks retrieved from local disks.
L4: local secondary storage (local disks)
    Local disks hold files retrieved from disks on remote network servers.
L5: remote secondary storage (distributed file systems, Web servers)

Caches
Cache: A smaller, faster storage device that acts as a staging area
for a subset of the data in a larger, slower device.
Fundamental idea of a memory hierarchy:

For each k, the faster, smaller device at level k serves as a cache for the larger, slower device at level k+1.

Why do memory hierarchies work?

Programs tend to access the data at level k more often than they
access the data at level k+1.
Thus, the storage at level k+1 can be slower, and thus larger and
cheaper per bit.
Net effect: A large pool of memory that costs as much as the cheap
storage near the bottom, but that serves data to programs at the
rate of the fast storage near the top.
Use a combination of small fast memory and big slow memory to
give the illusion of big fast memory.
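
The "net effect" claim can be checked with a quick back-of-the-envelope calculation. The sketch below uses the standard average-access-time formula (hit time plus miss rate times miss penalty). The formula, the function name, and the sample latencies in the note that follows are not from the slides; they are assumptions chosen only for illustration.

```c
#include <assert.h>

/* Average access time for one cache level in front of a slower level.
 *   hit_time:     time to access the fast level (level k)
 *   miss_rate:    fraction of accesses not found at level k, in [0, 1]
 *   miss_penalty: extra time needed to fetch from level k+1 on a miss
 * All times must use the same unit (for example, cycles). */
double avg_access_time(double hit_time, double miss_rate, double miss_penalty)
{
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_time + miss_rate * miss_penalty;
}
```

For example, with a hypothetical 1-cycle hit time, a 100-cycle miss penalty, and a 3% miss rate, the average is 1 + 0.03 x 100 = 4 cycles, which is far closer to the fast level than to the slow one. Good locality is what keeps the miss rate that low.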

Caching in a Memory Hierarchy

[Figure: blocks being copied between level k and level k+1]

Level k: Smaller, faster, more expensive device at level k caches a subset of the blocks from level k+1.
Data is copied between levels in block-sized transfer units.
Level k+1: Larger, slower, cheaper storage device at level k+1 is partitioned into blocks.

General Caching Concepts

[Figure: requests for blocks 14 and 12 arriving at level k, backed by level k+1]

Program needs object d, which is stored in some block b.

Cache hit
Program finds b in the cache at level k. E.g., block 14.

Cache miss
b is not at level k, so level k cache must fetch it from level k+1. E.g., block 12.
If level k cache is full, then some current block must be replaced (evicted). Which one is the victim?
Placement policy: where can the new block go? E.g., b mod 4
Replacement policy: which block should be evicted? E.g., LRU
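
The hit, miss, and eviction steps can be sketched as a tiny simulator. This is a minimal illustration, not the slides' own design: it assumes a fully associative cache (a block may go in any slot) so that the LRU replacement policy has a real choice to make, and the 4-slot size, the struct, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NSLOTS = 4 };  /* hypothetical cache size, in blocks */

struct lru_cache {
    int block[NSLOTS];           /* block number held in each slot */
    unsigned long last[NSLOTS];  /* logical time of the most recent access */
    bool valid[NSLOTS];          /* false until the slot is first filled */
    unsigned long clock;         /* incremented on every access */
};

void cache_init(struct lru_cache *c)
{
    assert(c != NULL);
    for (int s = 0; s < NSLOTS; s++)
        c->valid[s] = false;
    c->clock = 0;
}

/* Access block b. Returns true on a hit, false on a miss.
 * On a miss, b is loaded into the first empty slot if one exists,
 * otherwise into the slot whose last access is oldest (the LRU victim). */
bool cache_access(struct lru_cache *c, int b)
{
    assert(c != NULL);
    int victim = 0;
    c->clock++;

    for (int s = 0; s < NSLOTS; s++) {
        if (c->valid[s] && c->block[s] == b) {
            c->last[s] = c->clock;  /* hit: refresh recency */
            return true;
        }
    }
    for (int s = 0; s < NSLOTS; s++) {
        if (!c->valid[s]) {         /* empty slot: no eviction needed */
            victim = s;
            break;
        }
        if (c->last[s] < c->last[victim])
            victim = s;             /* oldest access seen so far */
    }
    c->block[victim] = b;
    c->valid[victim] = true;
    c->last[victim] = c->clock;
    return false;
}
```

Calling cache_access repeatedly mirrors the slide: a request for a block that is already resident (such as block 14 once it has been loaded) returns true, while a first request for block 12 returns false and, if all four slots are occupied, evicts whichever block was touched least recently.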

General Caching Concepts

Types of cache misses:

Cold (compulsory) miss
Cold misses occur because the cache is empty.

Conflict miss
Most caches limit blocks at level k+1 to a small subset (sometimes a singleton) of the block positions at level k.
E.g. Block i at level k+1 must be placed in block (i mod 4) at level k.
Conflict misses occur when the level k cache is large enough, but multiple data objects all map to the same level k block.
E.g. Referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time.

Capacity miss
Occurs when the set of active cache blocks (working set) is larger than the cache.
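
The conflict-miss example can be reproduced with a short miss counter. This sketch assumes a direct-mapped cache with 4 slots that starts empty and uses the (i mod 4) placement rule from the example. The function name count_misses is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { NSLOTS = 4 };  /* direct-mapped: block i may live only in slot (i mod 4) */

/* Count the misses for a sequence of n block references,
 * starting from an empty cache. */
size_t count_misses(const int *refs, size_t n)
{
    int slot[NSLOTS];          /* block number currently held in each slot */
    int valid[NSLOTS] = {0};   /* 0 until the slot is first filled */
    size_t misses = 0;

    for (size_t t = 0; t < n; t++) {
        assert(refs[t] >= 0);
        int s = refs[t] % NSLOTS;              /* placement: i mod 4 */
        if (!valid[s] || slot[s] != refs[t]) {
            misses++;                          /* cold miss or conflict miss */
            slot[s] = refs[t];                 /* new block replaces the old one */
            valid[s] = 1;
        }
    }
    return misses;
}
```

Blocks 0 and 8 both map to slot 0, so the sequence 0, 8, 0, 8, 0, 8 misses on all six references even though three slots stay empty. The sequence 0, 1, 0, 1, 0, 1 uses two different slots and incurs only the two cold misses.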

Examples of Caching in the Hierarchy

Cache Type            What Cached           Where Cached         Latency (cycles)  Managed By
Registers             4-byte word           CPU registers        0                 Compiler
TLB                   Address translations  On-Chip TLB          0                 Hardware
L1 cache              32-byte block         On-Chip L1           1                 Hardware
L2 cache              32-byte block         Off-Chip L2          10                Hardware
Virtual Memory        4-KB page             Main memory          100               Hardware+OS
Buffer cache          Parts of files        Main memory          100               OS
Network buffer cache  Parts of files        Local disk           10,000,000        AFS/NFS client
Browser cache         Web pages             Local disk           10,000,000        Web browser
Web cache             Web pages             Remote server disks  1,000,000,000     Web proxy server

Cache Memories
Cache memories are small, fast SRAM-based memories managed automatically in hardware.
Hold frequently accessed blocks of main memory.
CPU looks first for data in L1, then in L2, then in main memory.

Typical bus structure:
[Figure: CPU chip containing the register file, ALU, L1 cache, and bus interface; a cache bus to the L2 cache; a system bus from the bus interface to the I/O bridge; a memory bus from the I/O bridge to main memory]

Inserting an L1 Cache Between the CPU and Main Memory

The tiny, very fast CPU register file has room for four 4-byte words.
The transfer unit between the CPU register file and the cache is a 4-byte block.
The small fast L1 cache has room for two 4-word blocks (line 0 and line 1).
The transfer unit between the cache and main memory is a 4-word block (16 bytes).
The big slow main memory has room for many 4-word blocks (e.g., block 10: abcd, block 21: pqrs, block 30: wxyz).
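
The block sizes on this slide imply a simple mapping from a byte address to its block. The sketch below shows that arithmetic for 4-byte words and 4-word (16-byte) blocks. It assumes byte-addressed memory with blocks numbered from address 0, which the slide does not state. The function and constant names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    WORD_BYTES  = 4,                        /* register/cache transfer unit */
    BLOCK_WORDS = 4,                        /* words per cache/memory block */
    BLOCK_BYTES = WORD_BYTES * BLOCK_WORDS  /* 16 bytes */
};

/* Number of the 16-byte memory block that contains byte address addr. */
uint32_t block_number(uint32_t addr)
{
    return addr / BLOCK_BYTES;
}

/* Index (0..3) of the 4-byte word within its block. */
uint32_t word_in_block(uint32_t addr)
{
    return (addr % BLOCK_BYTES) / WORD_BYTES;
}

/* Byte address of the first byte of block b. */
uint32_t block_start(uint32_t b)
{
    assert(b <= UINT32_MAX / BLOCK_BYTES);  /* guard against overflow */
    return b * BLOCK_BYTES;
}
```

Under these assumptions, block 10 covers byte addresses 160 through 175. A miss on any address in that range brings the whole 16-byte block into a cache line, which is why the next few sequential accesses hit.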

Multi-Level Caches
Options: separate data and instruction caches, or a unified cache

Processor (Regs, L1 d-cache, L1 i-cache)

            Regs    L1 d-cache / i-cache   Unified L2 Cache   Memory        disk
size:       200 B   8-64 KB                1-4MB SRAM         128 MB DRAM   30 GB
speed:      3 ns    3 ns                   6 ns               60 ns         8 ms
$/Mbyte:                                   $100/MB            $1.50/MB      $0.05/MB
line size:  8 B     32 B                   32 B               8 KB

Left to right: larger, slower, cheaper

Intel Pentium Cache Hierarchy

Regs.

L1 Data: 16 KB, 4-way assoc, write-through, 32B lines, 1 cycle latency

L1 Instruction: 16 KB, 4-way, 32B lines

L2 Unified: 128 KB to 2 MB, 4-way assoc, write-back, write allocate, 32B lines

Main Memory: up to 4 GB

Processor Chip

Find the Caches

IBM Power 5, 2004

Summary
Today

Locality: Spatial and Temporal

Cache principles

Multi-level cache hierarchies

Next Time

Cache organization

Replacement and writes

Programming considerations
