udaykiriti/cacheon
Cacheon - Multi-level Cache Simulator
A fast, configurable cache simulator for analyzing memory access patterns
and multi-level cache behavior. Simulates an L1/L2/L3 hierarchy with
optional TLB simulation, software prefetching, and write policy control.
NOTE: The cache hierarchy is fully configurable via CLI flags. Change
sizes, line sizes, and associativity without editing code.
Use --l1/--l2/--l3 with SIZE,LINE,ASSOC (e.g., 32K,64,8).
Default values:
Default test size: 8 MB
Default stride: 64 bytes
BUILD
make
RUN
Quick test:
./cacheon 8M 64
Full benchmark (4K to 256M):
./cacheon --all-sizes
Test with random access:
./cacheon 128M 64 -r
Test with hugepage mode (sets TLB page size to 2MB):
./cacheon 256M 256 -H
Test with both random and hugepage:
./cacheon 128M 64 -Hr
Quiet mode (CSV output for scripts and pipelines):
./cacheon 64M 64 -q
Override cache config and write policy:
./cacheon 64M 64 --l1 64K,64,8 --l2 512K,64,8 --l3 16M,64,16 --write-policy wt
Enable stride prefetching with a 64-entry TLB:
./cacheon 128M 64 --prefetch stride --tlb-entries 64
WHAT IS IT?
Cacheon simulates a 3-level cache hierarchy and traces memory accesses
through it. Given a memory address range and access pattern (sequential
or random), it reports cache hit/miss rates per level, miss classification
(cold/conflict/capacity), write-back statistics, and AMAT.
USAGE
cacheon [SIZE] [STRIDE] [OPTIONS]
SIZE:
Memory size to test. Examples: 4K, 8M, 256M, 1G
Default: 8M
STRIDE:
Stride between accesses in bytes. Should match your cache line size.
Default: 64
Note: random access mode requires a power-of-two stride (64, 128, 256, ...).
OPTIONS:
-r, --random Random access pattern (default: sequential)
-H, --hugepage Set TLB page size to 2MB (hugepage simulation)
-Hr, -rH Both random and hugepage
-q, --quiet Quiet output: CSV format size,l1%,l2%,l3%
-l, --lru Use LRU replacement policy (default: FIFO)
--all-sizes Run full benchmark across 4K to 256M sizes
--l1 SIZE,LINE,ASSOC Override L1 config (e.g., 32K,64,8)
--l2 SIZE,LINE,ASSOC Override L2 config
--l3 SIZE,LINE,ASSOC Override L3 config
--write-policy wb|wt Write-back or write-through
--write-rate N Percent writes 0-100 (default: 0, read-only)
--prefetch none|next|stride Prefetcher mode (default: none)
--tlb-entries N Enable TLB simulation with N entries
--page-size SIZE TLB page size (default 4K)
-h, --help Show help
-v, --version Show version
EXAMPLES
Test 32MB with sequential access:
./cacheon 32M 64
Test 256MB with random access:
./cacheon 256M 64 -r
Test with larger stride (256-byte cache lines):
./cacheon 128M 256
Use LRU replacement instead of FIFO:
./cacheon 8M 64 -l
Simulate 25% writes with write-through policy:
./cacheon 128M 64 --write-rate 25 --write-policy wt
Enable next-line prefetching:
./cacheon 128M 64 --prefetch next
Enable TLB simulation:
./cacheon 128M 64 --tlb-entries 64 --page-size 4K
Quiet mode for scripting:
./cacheon 16M 64 -q
Run full benchmark suite:
make run
BEHAVIOR NOTES
1. Working set vs cache size determines hit rates:
- test size << L1 size: high L1 hit rate
- L1 size < test size < L2 size: L1 misses hit in L2
- L2 size < test size < L3 size: L2 misses hit in L3
- test size >> L3 size: many L3 misses (memory fetch required)
2. Stride affects conflict patterns. With 64-byte cache lines and 4-way
L2, certain power-of-two strides cause set conflicts. Try different
stride values to observe this.
3. Random access mode uses a fixed seed (12345) for the first pass.
Each subsequent pass uses the continued RNG state, so passes are
NOT identical — the working set appears to be randomly sampled on
each pass rather than repeating the same sequence.
4. -H / --hugepage sets the TLB page size to 2MB for TLB simulation
purposes. It does not allocate actual hugepages.
5. Pass count scales with working set size: small sets get 10 passes,
medium sets get 5, large sets get 2, to simulate realistic warm-up.
6. LRU replacement (-l) uses O(1) list-splice + iterator-map internally,
so it adds no extra per-access cost vs FIFO.
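The O(1) LRU mechanism from note 6 can be sketched with the list-splice + iterator-map idiom it refers to. This is an illustrative reimplementation (class and method names are made up here), not Cacheon's actual code:

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// O(1) LRU bookkeeping for a single cache set, in the spirit of the
// list-splice + iterator-map approach note 6 mentions.
class LruSet {
    std::list<std::uint64_t> order_;  // front = most recently used
    std::unordered_map<std::uint64_t,
                       std::list<std::uint64_t>::iterator> pos_;
    std::size_t ways_;
public:
    explicit LruSet(std::size_t ways) : ways_(ways) {}

    // Touch `tag`; returns true on hit. A miss into a full set evicts
    // the least recently used tag (the back of the list).
    bool access(std::uint64_t tag) {
        auto it = pos_.find(tag);
        if (it != pos_.end()) {
            // Hit: splice the node to the front in O(1); std::list
            // iterators stay valid, so the map needs no update.
            order_.splice(order_.begin(), order_, it->second);
            return true;
        }
        if (order_.size() == ways_) {   // full set: evict the LRU tag
            pos_.erase(order_.back());
            order_.pop_back();
        }
        order_.push_front(tag);
        pos_[tag] = order_.begin();
        return false;
    }
};
```

Because splicing never invalidates list iterators, every access is a hash lookup plus constant-time list surgery, matching FIFO's per-access cost.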
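The set-conflict effect from note 2 can be demonstrated in a few lines. The index function (addr / line) % sets and the 1024-set example cache below are textbook assumptions for illustration, not Cacheon's internals:

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <set>

// Count how many distinct sets a strided sweep touches, assuming the
// standard set-index mapping (addr / line) % sets.
std::size_t sets_touched(std::uint64_t stride, std::uint64_t sets,
                         std::uint64_t line, std::uint64_t accesses) {
    std::set<std::uint64_t> touched;
    for (std::uint64_t i = 0; i < accesses; ++i)
        touched.insert((i * stride / line) % sets);
    return touched.size();
}

int main() {
    // Hypothetical cache: 1024 sets, 64-byte lines, 4-way.
    // A line-sized stride sweeps all 1024 sets, but a 64 KB
    // power-of-two stride lands every access in the same set, so
    // 4-way associativity is exhausted after 4 lines -> conflicts.
    std::cout << sets_touched(64, 1024, 64, 4096) << "\n";    // 1024
    std::cout << sets_touched(65536, 1024, 64, 4096) << "\n"; // 1
}
```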
MISS CLASSIFICATION
The simulator classifies each demand miss as one of:
cold - first time this cache line was ever accessed
conflict - line was seen before and was in a fully-associative
shadow cache (would have hit if fully associative)
capacity - line was seen before but evicted due to working set size
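The three rules reduce to a small decision function. This is a minimal sketch (the names and the boolean shadow-cache probe are hypothetical; Cacheon's real shadow-cache bookkeeping is more involved):

```cpp
#include <cstdint>
#include <unordered_set>

enum class Miss { Cold, Conflict, Capacity };

// Classify a demand miss. `seen` records every line ever touched;
// `shadow_hit` reports whether a fully-associative shadow cache of
// the same capacity still held the line.
Miss classify(std::uint64_t line_addr,
              std::unordered_set<std::uint64_t>& seen,
              bool shadow_hit) {
    if (seen.insert(line_addr).second)   // never accessed before
        return Miss::Cold;
    return shadow_hit ? Miss::Conflict   // only set mapping to blame
                      : Miss::Capacity;  // working set simply too big
}
```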
OUTPUT FORMAT
CACHE REPORT
============
Total Accesses: 262144
Pattern: Sequential
L1D Sim:
[==== ] 20.00%
Demand Hits: 52428 | Demand Misses: 209716
Prefetch Hits: 0 | Prefetch Misses: 0
Miss Breakdown: cold 131072, conflict 0, capacity 78644
Write-Backs: 0 | Dirty Evictions: 0 | Write-Through Writes: 0
L2 Sim:
...
L3 Sim:
...
AMAT (Average Memory Access Time): 83.20 cycles
Field meanings:
Progress bar [=== ] Visual hit rate (20 chars wide)
Percentage Demand hit rate (0-100%)
Demand Hits/Misses Explicit (non-prefetch) accesses
Prefetch Hits/Misses Prefetcher-issued accesses
Miss Breakdown cold / conflict / capacity classification
Write-Backs Dirty lines flushed to next level on eviction
Dirty Evictions Same as write-backs (write-back policy)
Write-Through Writes Writes propagated immediately (write-through policy)
AMAT FORMULA
AMAT = 4   * L1_hit_rate
     + 12  * L1_miss_rate * L2_hit_rate
     + 40  * L1_miss_rate * L2_miss_rate * L3_hit_rate
     + 200 * L1_miss_rate * L2_miss_rate * L3_miss_rate
Cycle costs: L1=4, L2=12, L3=40, Memory=200
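The formula translates directly into code. A small helper using the cycle costs above (the function name and signature are illustrative):

```cpp
// AMAT from per-level demand hit rates (fractions in [0,1]), using
// the cycle costs L1=4, L2=12, L3=40, memory=200.
double amat(double l1_hit, double l2_hit, double l3_hit) {
    double l1_miss = 1.0 - l1_hit;
    double l2_miss = 1.0 - l2_hit;
    double l3_miss = 1.0 - l3_hit;
    return 4.0   * l1_hit
         + 12.0  * l1_miss * l2_hit
         + 40.0  * l1_miss * l2_miss * l3_hit
         + 200.0 * l1_miss * l2_miss * l3_miss;
}
```

For example, a workload that always hits in L1 costs 4 cycles per access, while one that misses every level pays the full 200-cycle memory latency.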
LICENSE
MIT License - See LICENSE file