Cacheon - Multi-level Cache Simulator

A fast, configurable cache simulator for analyzing memory access patterns
and multi-level cache behavior. Simulates an L1/L2/L3 hierarchy with
optional TLB simulation, software prefetching, and write policy control.

NOTE: The cache hierarchy is fully configurable via CLI flags. Change
sizes, line sizes, and associativity without editing code.
Use --l1/--l2/--l3 with SIZE,LINE,ASSOC (e.g., 32K,64,8).

Default values:
      Default test size:  8 MB
      Default stride:     64 bytes

BUILD

    make

RUN

Quick test:
    ./cacheon 8M 64

Full benchmark (4K to 256M):
    ./cacheon --all-sizes

Test with random access:
    ./cacheon 128M 64 -r

Test with hugepage mode (sets TLB page size to 2MB):
    ./cacheon 256M 256 -H

Test with both random and hugepage:
    ./cacheon 128M 64 -Hr

Quiet mode (CSV output for scripts and pipelines):
    ./cacheon 64M 64 -q

Override cache config and write policy:
    ./cacheon 64M 64 --l1 64K,64,8 --l2 512K,64,8 --l3 16M,64,16 --write-policy wt

Enable stride prefetching with a 64-entry TLB:
    ./cacheon 128M 64 --prefetch stride --tlb-entries 64

WHAT IS IT?

Cacheon simulates a 3-level cache hierarchy and traces memory accesses
through it. Given a memory address range and access pattern (sequential
or random), it reports cache hit/miss rates per level, miss classification
(cold/conflict/capacity), write-back statistics, and AMAT.

USAGE

    cacheon [SIZE] [STRIDE] [OPTIONS]

SIZE:
    Memory size to test. Examples: 4K, 8M, 256M, 1G
    Default: 8M

STRIDE:
    Stride between accesses in bytes. Should match your cache line size.
    Default: 64
    Note: random access mode requires a power-of-two stride (64, 128, 256, ...).

OPTIONS:
    -r, --random               Random access pattern (default: sequential)
    -H, --hugepage             Set TLB page size to 2MB (hugepage simulation)
    -Hr, -rH                   Both random and hugepage
    -q, --quiet                Quiet output: CSV format size,l1%,l2%,l3%
    -l, --lru                  Use LRU replacement policy (default: FIFO)
    --all-sizes                Run full benchmark across 4K to 256M sizes
    --l1 SIZE,LINE,ASSOC       Override L1 config (e.g., 32K,64,8)
    --l2 SIZE,LINE,ASSOC       Override L2 config
    --l3 SIZE,LINE,ASSOC       Override L3 config
    --write-policy wb|wt       Write-back or write-through
    --write-rate N             Percent writes 0-100 (default: 0, read-only)
    --prefetch none|next|stride  Prefetcher mode (default: none)
    --tlb-entries N            Enable TLB simulation with N entries
    --page-size SIZE           TLB page size (default 4K)
    -h, --help                 Show help
    -v, --version              Show version

EXAMPLES

Test 32MB with sequential access:
    ./cacheon 32M 64

Test 256MB with random access:
    ./cacheon 256M 64 -r

Test with larger stride (256-byte cache lines):
    ./cacheon 128M 256

Use LRU replacement instead of FIFO:
    ./cacheon 8M 64 -l

Simulate 25% writes with write-through policy:
    ./cacheon 128M 64 --write-rate 25 --write-policy wt

Enable next-line prefetching:
    ./cacheon 128M 64 --prefetch next

Enable TLB simulation:
    ./cacheon 128M 64 --tlb-entries 64 --page-size 4K

Quiet mode for scripting:
    ./cacheon 16M 64 -q

Run full benchmark suite:
    make run

BEHAVIOR NOTES

1. Working set vs cache size determines hit rates:
   - test size << L1 size:  high L1 hit rate
   - L1 size < test size < L2 size: L1 misses hit in L2
   - L2 size < test size < L3 size: L2 misses hit in L3
   - test size >> L3 size:  many L3 misses (memory fetch required)

2. Stride affects conflict patterns. With 64-byte cache lines and a 4-way
   L2, certain power-of-two strides cause set conflicts: a stride that is
   a multiple of (number of L2 sets) x 64 bytes maps every access to the
   same set, so only 4 lines fit there before evictions begin. Try
   different stride values to observe this.

3. Random access mode seeds the RNG with a fixed value (12345) for the
   first pass. Each subsequent pass continues from the evolving RNG state,
   so passes are NOT identical: each pass samples the working set in a
   fresh random order rather than repeating the same sequence.

4. -H / --hugepage sets the TLB page size to 2MB for TLB simulation
   purposes. It does not allocate actual hugepages.

5. Pass count scales with working set size: small sets get 10 passes,
   medium sets get 5, large sets get 2, to simulate realistic warm-up.

6. LRU replacement (-l) uses an O(1) list-splice + iterator-map scheme
   internally, so its per-access cost is constant time, the same
   asymptotic cost as FIFO.
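The O(1) LRU mentioned in note 6 is conventionally built from a std::list
ordered by recency plus an unordered_map from line address to list iterator;
std::list::splice relinks a touched node to the front in constant time
without invalidating any iterators. A minimal sketch of that idea, reduced
to a single set of plain line addresses (not the simulator's actual types):

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>

// Minimal LRU set: front = most recently used, back = eviction victim.
class LruSet {
    std::size_t capacity_;
    std::list<uint64_t> order_;  // line addresses, MRU first
    std::unordered_map<uint64_t, std::list<uint64_t>::iterator> pos_;
public:
    explicit LruSet(std::size_t ways) : capacity_(ways) {}

    // Returns true on hit. On miss, inserts the line, evicting LRU if full.
    bool access(uint64_t line) {
        auto it = pos_.find(line);
        if (it != pos_.end()) {
            // O(1): relink the node to the front; iterators stay valid.
            order_.splice(order_.begin(), order_, it->second);
            return true;
        }
        if (order_.size() == capacity_) {  // evict least recently used
            pos_.erase(order_.back());
            order_.pop_back();
        }
        order_.push_front(line);
        pos_[line] = order_.begin();
        return false;
    }
};
```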

MISS CLASSIFICATION

The simulator classifies each demand miss as one of:
    cold      - first time this cache line was ever accessed
    conflict  - line was seen before and was in a fully-associative
                shadow cache (would have hit if fully associative)
    capacity  - line was seen before but evicted due to working set size
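One common way to realize this scheme (the classic 3C model) is to track
every line ever seen, plus a fully-associative LRU shadow cache of the same
total capacity as the real cache: a miss whose line the shadow still holds
is a conflict miss, since only the set mapping caused the eviction. The
sketch below shows the decision logic only and is not the simulator's
actual code; class and method names are illustrative:

```cpp
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>
#include <unordered_set>

// 3C classifier: "cold" if never seen, "conflict" if a fully-associative
// cache of equal capacity would have hit, "capacity" otherwise.
class MissClassifier {
    std::size_t capacity_;                    // total lines in the real cache
    std::unordered_set<uint64_t> ever_seen_;  // all lines ever touched
    std::list<uint64_t> shadow_;              // fully-assoc LRU shadow, MRU first
    std::unordered_map<uint64_t, std::list<uint64_t>::iterator> pos_;

    void touch_shadow(uint64_t line) {
        auto it = pos_.find(line);
        if (it != pos_.end()) {
            shadow_.splice(shadow_.begin(), shadow_, it->second);
            return;
        }
        if (shadow_.size() == capacity_) {
            pos_.erase(shadow_.back());
            shadow_.pop_back();
        }
        shadow_.push_front(line);
        pos_[line] = shadow_.begin();
    }
public:
    explicit MissClassifier(std::size_t lines) : capacity_(lines) {}

    // Call on every access so the shadow stays current; returns "" for
    // accesses that hit the real cache, else the miss class.
    std::string access(uint64_t line, bool real_miss) {
        std::string kind;
        if (real_miss) {
            if (!ever_seen_.count(line)) kind = "cold";
            else if (pos_.count(line))   kind = "conflict";  // shadow holds it
            else                         kind = "capacity";
        }
        ever_seen_.insert(line);
        touch_shadow(line);
        return kind;
    }
};
```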

OUTPUT FORMAT

    CACHE REPORT
    ============

    Total Accesses: 262144
    Pattern: Sequential

    L1D Sim:
      [====                ] 20.00%
      Demand Hits: 52428 | Demand Misses: 209716
      Prefetch Hits: 0   | Prefetch Misses: 0
      Miss Breakdown: cold 131072, conflict 0, capacity 78644
      Write-Backs: 0 | Dirty Evictions: 0 | Write-Through Writes: 0

    L2 Sim:
      ...

    L3 Sim:
      ...

    AMAT (Average Memory Access Time): 83.20 cycles

Field meanings:
    Progress bar [===   ]  Visual hit rate (20 chars wide)
    Percentage            Demand hit rate (0-100%)
    Demand Hits/Misses    Explicit (non-prefetch) accesses
    Prefetch Hits/Misses  Prefetcher-issued accesses
    Miss Breakdown        cold / conflict / capacity classification
    Write-Backs           Dirty lines flushed to next level on eviction
    Dirty Evictions       Dirty lines evicted (equals Write-Backs under the write-back policy)
    Write-Through Writes  Writes propagated immediately (write-through policy)

AMAT FORMULA

    AMAT = 4   * L1_hit_rate
         + 12  * L1_miss_rate * L2_hit_rate
         + 40  * L1_miss_rate * L2_miss_rate * L3_hit_rate
         + 200 * L1_miss_rate * L2_miss_rate * L3_miss_rate

    Cycle costs: L1=4, L2=12, L3=40, Memory=200

LICENSE

MIT License - See LICENSE file
