A genetic algorithm that evolves virtual agents to navigate randomly generated perfect mazes. Agents execute a genome-encoded instruction set; fitness is evaluated in parallel on GPU using CUDA via Numba.
- Genetic evolution — tournament selection, one-point crossover, point/insertion/deletion mutation, elitism
- Custom VM — 14-opcode instruction set with stack-based LOOP/WHILE/IF control flow and relative JUMP; variable-length genomes
- GPU acceleration — single or dual-GPU fitness evaluation with persistent worker processes and shared memory for minimal overhead
- CPU fallback — fully functional without a CUDA device
- Maze generation — perfect mazes via randomized DFS (recursive backtracker), bitmask-encoded walls
- Live display — Rich-based real-time stats with fitness history sparkline
- Animated replay — path playback of the best evolved solution
- Generalization mode — evaluate each genome on N random mazes per generation to evolve general solvers
- Genome tooling — disassemble saved genomes and benchmark them against random mazes
- Python 3.13
- uv package manager
- CUDA-capable GPU(s) with Numba-compatible drivers (optional; CPU fallback available)
git clone <repo>
cd ga-test
uv sync# Run with defaults (10×10 maze, 20k population, GPU if available)
uv run python -m ga_maze
# CPU-only, smaller population
uv run python -m ga_maze --no-gpu --population 500
# Fixed maze seed, stop after 200 generations
uv run python -m ga_maze --seed 42 --max-gen 200
# Regenerate maze every generation
uv run python -m ga_maze --regen-maze
# Keep evolving past the first solution
uv run python -m ga_maze --no-stop-on-solve --max-gen 5000
# Save best genome and replay later
uv run python -m ga_maze --save-best best.npy
uv run python -m ga_maze --replay best.npy --seed 42
# Print human-readable genome listing
uv run python -m ga_maze --replay best.npy --disassemble
# Benchmark a saved genome against 200 random mazes
uv run python -m ga_maze --replay best.npy --benchmark 200Train a genome that solves arbitrary mazes rather than one fixed maze by evaluating each genome on multiple random mazes per generation:
# Evaluate each genome on 5 random mazes per generation; fitness is averaged
uv run python -m ga_maze --num-mazes 5 --no-stop-on-solve --max-gen 10000With --num-mazes N, a genome is only considered "solved" if it solves all N mazes in a generation. This produces a much stronger generalization signal at N× the compute cost per generation.
All parameters are exposed as CLI flags. The full set is defined in ga_maze/config.py.
| Category | Parameter | Default | Description |
|---|---|---|---|
| Maze | --width / --height |
10 | Maze dimensions |
--seed |
random | RNG seed for reproducible mazes | |
--regen-maze / --fixed-maze |
fixed | New maze each generation | |
| Genome | --genome-len |
64 | Initial instruction count |
--max-steps |
2000 | Step budget per evaluation | |
| GA | --population |
20000 | Population size |
--elite |
200 | Elites copied verbatim each generation | |
--mutation |
0.02 | Per-instruction point mutation rate | |
--crossover |
0.80 | Crossover probability | |
--num-mazes |
1 | Mazes per genome per generation (generalization) | |
| Stopping | --max-gen |
unlimited | Stop after N generations |
--stop-on-solve / --no-stop-on-solve |
stop | Stop when any genome solves the maze | |
| Fitness | --solve-bonus |
1,000,000 | Bonus for reaching the goal |
--progress-weight |
10,000 | Distance-based progress weight | |
| GPU | --gpu / --no-gpu |
gpu | Enable/disable GPU evaluation |
--num-gpus |
2 | Number of GPUs to use (1 or 2) | |
| Output | --save-best |
— | Save best genome to .npy file |
--replay |
— | Animate a saved genome | |
--replay-fps |
15 | Replay animation speed | |
--benchmark |
— | With --replay: test on N random mazes |
|
--disassemble |
— | With --replay: print genome listing |
main.py (CLI / evolution loop)
├── ga/evolution.py — generation loop, hall-of-fame, stats
├── ga/operators.py — selection, crossover, mutation
├── ga/fitness_cpu.py — sequential CPU evaluator
├── ga/fitness_gpu.py — CUDA evaluator (single / multi-GPU pool)
├── genome/
│ ├── isa.py — 14 opcodes, 14 conditions, disassembly
│ ├── encoder.py — random genome init, population packing, serialization
│ └── interpreter.py — pure-Python VM (mirrors GPU kernel exactly)
├── maze/
│ ├── generator.py — DFS maze generation, bitmask walls
│ └── renderer.py — full / viewport / minimap rendering
└── display/
├── live_display.py — Rich live stats panel
├── replay.py — animated path replay
└── benchmark.py — multi-maze genome benchmarking
solved: solve_bonus + speed_weight × (1 − steps / max_steps)
unsolved: progress_weight × (max_dist − min_dist) / max_dist
Each genome instruction is 4 bytes [opcode, arg1, arg2, reserved].
| Group | Opcodes |
|---|---|
| No-op | NOP |
| Absolute movement | MOV_N, MOV_S, MOV_E, MOV_W |
| Relative movement | TURN_L, TURN_R, MOV_FWD, MOV_BCK |
| Control flow | LOOP, WHILE, IF, SKIP |
| Jump | JUMP |
LOOP arg1 arg2 — repeat the next arg2+1 instructions arg1+1 times.
WHILE arg1 arg2 — while condition arg1 is true, execute the next arg2+1 instructions.
IF arg1 arg2 — if condition arg1 is true, execute the next arg2+1 instructions.
SKIP arg1 — skip the next arg1+1 instructions unconditionally.
JUMP arg1 — relative jump. arg1 is interpreted as a signed byte (0–127 = forward, 128–255 = backward). New PC = pc + 1 + int8(arg1), clamped to 0. The tick budget prevents infinite loops.
Conditions (used by WHILE/IF): 14 wall/free checks in absolute (N/S/E/W) and relative (FWD/LEFT/RIGHT) directions.
- One CUDA thread per genome
- Device memory allocated once and reused across generations
- Walls transferred once at startup; only genomes + fitness scores transfer per generation
- Multi-GPU mode: two persistent worker processes communicate via
multiprocessingshared memory, signalled viaEvent— no subprocess spawn overhead per generation - Numba JIT compiled once on first generation
pytest
pytest -v tests/test_gpu.py # GPU-specific tests (requires CUDA)Test coverage: maze generation, genome encoding/ISA, interpreter, genetic operators, fitness evaluation, evolution loop, display.