Skip to content

alok/knuth-mmix

Repository files navigation

knuth-mmix

CI License Lean

knuth-mmix is a Lean 4 package for assembling and executing a useful, typed slice of Donald Knuth's MMIX architecture. It combines:

  • BitVec-backed MMIX word, register, and address types
  • a label-aware assembler that emits real 32-bit MMIX instruction words
  • a small executable machine semantics for running assembled programs inside Lean
  • a [mmix| ... ] quasiquoter for writing assembly fragments directly in Lean code

The package is aimed at lightweight ISA experiments, executable examples, and formal work where it is useful to stay close to actual MMIX encodings without dropping into ad hoc integers.

Features

  • typed MMIX sizes: Byte, Wyde, Tetra, Octa, Register, and Address
  • sparse big-endian memory with MMIX-style alignment rounding
  • instruction encoding for a broad MMIX integer, memory, control-flow, address, and special-register subset
  • local/global register windows plus executable register-stack subroutine linkage
  • executable semantics for assembled programs via Machine.step and Machine.run
  • helper routines such as Program.materialize for constructing 64-bit literals from wyde lanes
  • build-time examples that check encoding and execution behavior

Installation

Add the dependency to your lakefile.toml:

[[require]]
name = "knuth-mmix"
git = "https://github.com/alok/knuth-mmix"
rev = "main"

Then import the umbrella module:

import KnuthMmix

Quick Start

import KnuthMmix

open KnuthMmix

def copyAndCompare : Program := [mmix|
  SETL $0, 0x1234;
  INCML $0, 0x5678;
  SETL $1, 0x0010;
  STOU $0, $1, 0;
  LDOU $2, $1, 0;
  CMPU $3, $0, $2;
  BNZ $3, fail;
  JMP done;
  fail:;
  SETL $4, 1;
  done:;
  SWYM
]

def runCopyAndCompare : Except String (Octa × Octa) := do
  let assembled ← Program.assemble copyAndCompare
  let machine ← Machine.run 16 assembled {}
  pure (machine.readReg (regLit 2), machine.readReg (regLit 4))

For a pure encoding pass, use Program.encode. For full execution, assemble first and then run the resulting AssembledProgram.

SET $r, n behaves like an assembler pseudo-instruction and expands to the wyde-lane sequence needed to materialize the full 64-bit literal. If you want the same behavior directly from Lean, use Program.materialize.

What Is Modeled

The current executable model covers:

  • local, marginal, and global general-purpose register behavior with rL and rG
  • a hidden register stack with executable PUSHJ, PUSHGO, and POP
  • all 32 MMIX special registers as named values for GET and PUT
  • sparse byte-addressed memory with big-endian load/store behavior
  • non-floating arithmetic including multiply/divide and the rH/rR/rD side channels
  • floating-point register semantics, conversions, arithmetic, comparisons, and short-float loads/stores
  • rA-driven arithmetic status tracking with event bits, enable bits, and current rounding mode
  • the MMIX byte and bit-matrix operations (BDIF/WDIF/TDIF/ODIF, MUX, SADD, MOR, MXOR)
  • the uncached/high-tetra/store-constant memory forms and basic TRIP/TRAP state shuffling
  • label resolution and relative branch/jump encoding
  • a fuel-bounded small-step evaluator

The model still omits the rest of the privileged operating-system and virtual-memory behavior beyond the current SAVE/UNSAVE/RESUME approximation, plus richer exception bookkeeping in rA.

Supported Surface Area

The [mmix| ... ] quasiquoter currently accepts:

  • data movement and literal construction: SET, SETL, SETML, SETMH, SETH, INCL, INCML, INCMH, INCH, ORL, ORML, ORMH, ORH, ANDNL, ANDNML, ANDNMH, ANDNH
  • floating-point operations: FLOT, FLOTU, SFLOT, SFLOTU, FIX, FIXU, FADD, FSUB, FMUL, FDIV, FREM, FSQRT, FINT, FCMP, FEQL, FUN, FCMPE, FEQLE, FUNE
  • integer and bitwise operations: ADD, ADDU, MUL, MULU, DIV, DIVU, ADDU2, ADDU4, ADDU8, ADDU16, LDA, SUB, SUBU, NEG, NEGU, OR, ORN, NOR, AND, ANDN, NAND, XOR, NXOR, SL, SLU, SR, SRU, CMP, CMPU
  • conditional register updates: CSN, CSZ, CSP, CSOD, CSNN, CSNZ, CSNP, CSEV, ZSN, ZSZ, ZSP, ZSOD, ZSNN, ZSNZ, ZSNP, ZSEV
  • byte and matrix operations: BDIF, WDIF, TDIF, ODIF, MUX, SADD, MOR, MXOR
  • loads: LDB, LDBU, LDSF, LDHT, LDW, LDWU, LDT, LDTU, LDO, LDOU, LDUNC, LDVTS
  • stores: STB, STBU, STSF, STHT, STCO, STUNC, STW, STWU, STT, STTU, STO, STOU
  • cache and prefetch hints: PRELD, PREST, PREGO, SYNCD, SYNCID, SYNC
  • special-register and control flow instructions: GET, PUT, GETA, JMP, GO, BN, BZ, PUSHJ, PUSHGO, POP, SAVE, UNSAVE, RESUME, TRIP, TRAP, BP, BOD, BNN, BNZ, BNP, BEV, plus probable variants PBN, PBZ, PBP, PBOD, PBNN, PBNZ, PBNP, PBEV, and SWYM
  • labels written as label:

The scaled-add instructions are spelled ADDU2, ADDU4, ADDU8, and ADDU16 in the quasiquoter. This keeps the syntax robust inside Lean even though the MMIX manuals write the same operations as 2ADDU, 4ADDU, 8ADDU, and 16ADDU.

Semantics Notes

  • Unwritten memory reads back as zero.
  • Memory accesses round the address down to the natural width alignment, following MMIX behavior for these operations.
  • Machine.run fuel program machine stops when either fuel is exhausted or execution falls off the end of the assembled code, at which point the machine is marked halted.
  • Branch and jump targets are resolved against assembled labels, and encoded using MMIX relative offsets.

Project Layout

  • KnuthMmix/Word.lean: MMIX word sizes, lane helpers, byte packing, comparisons, alignment
  • KnuthMmix/Register.lean: special-register enumeration and encodings
  • KnuthMmix/Memory.lean: sparse memory model plus signed and unsigned loads/stores
  • KnuthMmix/Instruction.lean: instruction AST, label assembly, and opcode encoding
  • KnuthMmix/Machine.lean: executable machine state and operational semantics
  • KnuthMmix/Syntax.lean: [mmix| ... ] quasiquoter
  • KnuthMmix/Examples.lean: executable examples checked at build time

Development

Build the package with:

uv run lake build

GitHub Actions runs the same Lean build on pushes and pull requests.

Roadmap

Major remaining gaps are:

  • the rest of the operating-system-facing control layer beyond the current SAVE/UNSAVE/RESUME approximation
  • richer privileged machine-state behavior around interrupts, emulation resumes, and protection faults
  • full MMIX rounding-mode control and the remaining privileged machine details

About

Lean 4 package for assembling and executing a BitVec-backed subset of Knuth's MMIX ISA.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages