udaykiriti/cacheon
Cacheon - Multi-level Cache Simulator
A fast, configurable cache simulator for analyzing memory access patterns
and multi-level cache behavior. Simulates an L1/L2/L3 hierarchy with
optional TLB simulation, software prefetching, and write policy control.
NOTE: The cache hierarchy is fully configurable via CLI flags. Change
sizes, line sizes, and associativity without editing code.
Use --l1/--l2/--l3 with SIZE,LINE,ASSOC (e.g., 32K,64,8).
Default values:
Default test size: 8 MB
Default stride: 64 bytes
BUILD
make
RUN
Quick test:
./cacheon 8M 64
Full benchmark (4K to 256M):
./cacheon --all-sizes
Test with random access:
./cacheon 128M 64 -r
Test with hugepage mode (sets TLB page size to 2MB):
./cacheon 256M 256 -H
Test with both random and hugepage:
./cacheon 128M 64 -Hr
Quiet mode (CSV output for scripts and pipelines):
./cacheon 64M 64 -q
Override cache config and write policy:
./cacheon 64M 64 --l1 64K,64,8 --l2 512K,64,8 --l3 16M,64,16 --write-policy wt
Enable stride prefetching with a 64-entry TLB:
./cacheon 128M 64 --prefetch stride --tlb-entries 64
WHAT IS IT?
Cacheon simulates a 3-level cache hierarchy and traces memory accesses
through it. Given a memory address range and access pattern (sequential
or random), it reports cache hit/miss rates per level, miss classification
(cold/conflict/capacity), write-back statistics, and AMAT.
USAGE
cacheon [SIZE] [STRIDE] [OPTIONS]
SIZE:
Memory size to test. Examples: 4K, 8M, 256M, 1G
Default: 8M
STRIDE:
Stride between accesses in bytes. Should match your cache line size.
Default: 64
Note: random access mode requires a power-of-two stride (64, 128, 256, ...).
OPTIONS:
-r, --random Random access pattern (default: sequential)
-H, --hugepage Set TLB page size to 2MB (hugepage simulation)
-Hr, -rH Both random and hugepage
-q, --quiet Quiet output: CSV format size,l1%,l2%,l3%
-l, --lru Use LRU replacement policy (default: FIFO)
--all-sizes Run full benchmark across 4K to 256M sizes
--l1 SIZE,LINE,ASSOC Override L1 config (e.g., 32K,64,8)
--l2 SIZE,LINE,ASSOC Override L2 config
--l3 SIZE,LINE,ASSOC Override L3 config
--write-policy wb|wt Write-back or write-through
--write-rate N Percent writes 0-100 (default: 0, read-only)
--prefetch none|next|stride Prefetcher mode (default: none)
--tlb-entries N Enable TLB simulation with N entries
--page-size SIZE TLB page size (default 4K)
-h, --help Show help
-v, --version Show version
EXAMPLES
Test 32MB with sequential access:
./cacheon 32M 64
Test 256MB with random access:
./cacheon 256M 64 -r
Test with larger stride (256-byte cache lines):
./cacheon 128M 256
Use LRU replacement instead of FIFO:
./cacheon 8M 64 -l
Simulate 25% writes with write-through policy:
./cacheon 128M 64 --write-rate 25 --write-policy wt
Enable next-line prefetching:
./cacheon 128M 64 --prefetch next
Enable TLB simulation:
./cacheon 128M 64 --tlb-entries 64 --page-size 4K
Quiet mode for scripting:
./cacheon 16M 64 -q
Run full benchmark suite:
make run
BEHAVIOR NOTES
1. Working set vs cache size determines hit rates:
- test size << L1 size: high L1 hit rate
- L1 size < test size < L2 size: L1 misses hit in L2
- L2 size < test size < L3 size: L2 misses hit in L3
- test size >> L3 size: many L3 misses (memory fetch required)
2. Stride affects conflict patterns. With 64-byte cache lines and 4-way
L2, certain power-of-two strides cause set conflicts. Try different
stride values to observe this.
3. Random access mode uses a fixed seed (12345) for the first pass.
Each subsequent pass uses the continued RNG state, so passes are
NOT identical — the working set appears to be randomly sampled on
each pass rather than repeating the same sequence.
4. -H / --hugepage sets the TLB page size to 2MB for TLB simulation
purposes. It does not allocate actual hugepages.
5. Pass count scales with working set size: small sets get 10 passes,
medium sets get 5, large sets get 2, to simulate realistic warm-up.
6. LRU replacement (-l) uses O(1) list-splice + iterator-map internally,
so it adds no extra per-access cost vs FIFO.
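The O(1) LRU mechanism from note 6 can be sketched with the list-splice + iterator-map idiom it refers to. This is an illustrative reimplementation (class and method names are made up here), not Cacheon's actual code:

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// O(1) LRU bookkeeping for a single cache set, in the spirit of the
// list-splice + iterator-map approach note 6 mentions.
class LruSet {
    std::list<std::uint64_t> order_;  // front = most recently used
    std::unordered_map<std::uint64_t,
                       std::list<std::uint64_t>::iterator> pos_;
    std::size_t ways_;
public:
    explicit LruSet(std::size_t ways) : ways_(ways) {}

    // Touch `tag`; returns true on hit. A miss into a full set evicts
    // the least recently used tag (the back of the list).
    bool access(std::uint64_t tag) {
        auto it = pos_.find(tag);
        if (it != pos_.end()) {
            // Hit: splice the node to the front in O(1); std::list
            // iterators stay valid, so the map needs no update.
            order_.splice(order_.begin(), order_, it->second);
            return true;
        }
        if (order_.size() == ways_) {   // full set: evict the LRU tag
            pos_.erase(order_.back());
            order_.pop_back();
        }
        order_.push_front(tag);
        pos_[tag] = order_.begin();
        return false;
    }
};
```

Because splicing never invalidates list iterators, every access is a hash lookup plus constant-time list surgery, matching FIFO's per-access cost.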
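The set-conflict effect from note 2 can be demonstrated in a few lines. The index function (addr / line) % sets and the 1024-set example cache below are textbook assumptions for illustration, not Cacheon's internals:

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <set>

// Count how many distinct sets a strided sweep touches, assuming the
// standard set-index mapping (addr / line) % sets.
std::size_t sets_touched(std::uint64_t stride, std::uint64_t sets,
                         std::uint64_t line, std::uint64_t accesses) {
    std::set<std::uint64_t> touched;
    for (std::uint64_t i = 0; i < accesses; ++i)
        touched.insert((i * stride / line) % sets);
    return touched.size();
}

int main() {
    // Hypothetical cache: 1024 sets, 64-byte lines, 4-way.
    // A line-sized stride sweeps all 1024 sets, but a 64 KB
    // power-of-two stride lands every access in the same set, so
    // 4-way associativity is exhausted after 4 lines -> conflicts.
    std::cout << sets_touched(64, 1024, 64, 4096) << "\n";    // 1024
    std::cout << sets_touched(65536, 1024, 64, 4096) << "\n"; // 1
}
```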
MISS CLASSIFICATION
The simulator classifies each demand miss as one of:
cold - first time this cache line was ever accessed
conflict - line was seen before and was in a fully-associative
shadow cache (would have hit if fully associative)
capacity - line was seen before but evicted due to working set size
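The three rules reduce to a small decision function. This is a minimal sketch (the names and the boolean shadow-cache probe are hypothetical; Cacheon's real shadow-cache bookkeeping is more involved):

```cpp
#include <cstdint>
#include <unordered_set>

enum class Miss { Cold, Conflict, Capacity };

// Classify a demand miss. `seen` records every line ever touched;
// `shadow_hit` reports whether a fully-associative shadow cache of
// the same capacity still held the line.
Miss classify(std::uint64_t line_addr,
              std::unordered_set<std::uint64_t>& seen,
              bool shadow_hit) {
    if (seen.insert(line_addr).second)   // never accessed before
        return Miss::Cold;
    return shadow_hit ? Miss::Conflict   // only set mapping to blame
                      : Miss::Capacity;  // working set simply too big
}
```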
OUTPUT FORMAT
CACHE REPORT
============
Total Accesses: 262144
Pattern: Sequential
L1D Sim:
[==== ] 20.00%
Demand Hits: 52428 | Demand Misses: 209716
Prefetch Hits: 0 | Prefetch Misses: 0
Miss Breakdown: cold 131072, conflict 0, capacity 78644
Write-Backs: 0 | Dirty Evictions: 0 | Write-Through Writes: 0
L2 Sim:
...
L3 Sim:
...
AMAT (Average Memory Access Time): 83.20 cycles
Field meanings:
Progress bar [=== ] Visual hit rate (20 chars wide)
Percentage Demand hit rate (0-100%)
Demand Hits/Misses Explicit (non-prefetch) accesses
Prefetch Hits/Misses Prefetcher-issued accesses
Miss Breakdown cold / conflict / capacity classification
Write-Backs Dirty lines flushed to next level on eviction
Dirty Evictions Same as write-backs (write-back policy)
Write-Through Writes Writes propagated immediately (write-through policy)
AMAT FORMULA
AMAT = 4   * L1_hit_rate
     + 12  * L1_miss_rate * L2_hit_rate
     + 40  * L1_miss_rate * L2_miss_rate * L3_hit_rate
     + 200 * L1_miss_rate * L2_miss_rate * L3_miss_rate
Cycle costs: L1=4, L2=12, L3=40, Memory=200
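The formula translates directly into code. A small helper using the cycle costs above (the function name and signature are illustrative):

```cpp
// AMAT from per-level demand hit rates (fractions in [0,1]), using
// the cycle costs L1=4, L2=12, L3=40, memory=200.
double amat(double l1_hit, double l2_hit, double l3_hit) {
    double l1_miss = 1.0 - l1_hit;
    double l2_miss = 1.0 - l2_hit;
    double l3_miss = 1.0 - l3_hit;
    return 4.0   * l1_hit
         + 12.0  * l1_miss * l2_hit
         + 40.0  * l1_miss * l2_miss * l3_hit
         + 200.0 * l1_miss * l2_miss * l3_miss;
}
```

For example, a workload that always hits in L1 costs 4 cycles per access, while one that misses every level pays the full 200-cycle memory latency.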
LICENSE
MIT License - See LICENSE file