A comprehensive, beginner-friendly tutorial series that teaches formal verification of Rust programs using Lean 4 and Aeneas. The series culminates in a fully verified TUI multi-agent LLM harness.
235 files | 32,800+ lines of code | 297 tests | 11 tutorials
Rust code + Lean proofs + deep walkthroughs — from "hello world" to a verified multi-agent system.
Software testing checks examples. Formal verification checks everything.
A test suite for a sorting function might check 100 inputs and declare success. A formal proof checks all 2^64 possible inputs simultaneously and guarantees the output is sorted and a permutation of the input. No edge case is missed. No corner case is forgotten. The proof is the specification, and if the code compiles against it, the code is correct — by mathematical certainty, not by hope.
Why does this matter now?
- LLM agents are becoming critical infrastructure. When an agent manages your calendar, writes your code, or handles your finances, "it usually works" isn't good enough. We need mathematical guarantees about agent behavior: that they always terminate, that messages are never lost, that guardrails are never bypassed.
- Rust gives us memory safety but not correctness. A Rust program won't segfault or have data races. But it can still sort wrong, lose messages, or enter impossible states. Formal verification closes this gap.
- The tools are finally practical. Lean 4 is a real programming language (not just a proof assistant). Aeneas translates real Rust code (not toy subsets). The workflow is: write normal Rust, translate automatically, prove properties in Lean. No annotations in your Rust source. No special syntax. No compromise on performance.
Why this tutorial series?
Most formal verification resources assume a PhD in type theory. This series assumes you know basic algorithms and nothing else. It builds from 2 + 3 = 5 to proving that a multi-agent LLM orchestrator correctly routes messages, fairly schedules agents, and always terminates within its budget — and every step along the way is explained for a total beginner.
This project demonstrates how to prove Rust programs correct — not just that they compile, not just that they pass tests, but that they satisfy precise mathematical specifications for all possible inputs.
The key tools:
- Rust — the systems programming language whose type system guarantees memory safety
- Lean 4 — a theorem prover and programming language that can express and verify mathematical proofs
- Aeneas — a tool that translates Rust code into pure functional Lean code, enabling formal proofs about Rust programs
| Property | Meaning | Example |
|---|---|---|
| No panics | Function never crashes for any input | checked_add handles all u32 pairs |
| Functional correctness | Output matches mathematical specification | RPN evaluator computes the right answer |
| Roundtrip | Decode(Encode(x)) = x for all x | Message serialization is lossless |
| Invariant preservation | Critical properties always hold | Traffic light never shows green both ways |
| Termination | Program always finishes | Agent loop exits within step budget |
| Fairness | Resources are allocated equitably | Round-robin gives each agent equal turns |
| Composition | Components work correctly together | User message reaches the right agent end-to-end |
Every project in this series follows the same pattern:
┌─────────────────────────────────────────┐
│ Imperative Shell (I/O) │ ← NOT verified (terminal, network, files)
│ ┌───────────────────────────────────┐ │
│ │ Verified Pure Core │ │ ← Translated by Aeneas, proved in Lean
│ │ (state machines, parsers, │ │
│ │ protocols, business logic) │ │
│ └───────────────────────────────────┘ │
└─────────────────────────────────────────┘
The pure core contains all the logic that matters — state transitions, data transformations, protocol handling. This is what Aeneas translates and what we prove correct. The imperative shell handles I/O (reading the terminal, making HTTP calls) and is intentionally left unverified.
- Basic knowledge of algorithms (loops, recursion, data structures)
- A computer with macOS or Linux
- Willingness to learn — no prior Lean, formal verification, or advanced Rust knowledge required
rust-lean-aeneas/
├── LEAN.md Deep reference on Lean 4
├── AENEAS.md Deep reference on Aeneas
├── PLAN.md Master implementation plan
│
└── tutorials/
├── 01-setup-hello-proof/ Install tools, first proof
├── 02-rpn-calculator/ Stack machine + correctness proofs
├── 03-infix-calculator/ Parser + equivalence with RPN
├── 04-state-machines/ Generic state machines + invariant proofs
├── 05-message-protocol/ Binary protocol + roundtrip proofs
├── 06-buffer-management/ Ring buffer, gap buffer + data structure proofs
├── 07-tui-core/ Pure TUI model + layout/render proofs
├── 08-llm-client-core/ LLM protocol logic + well-formedness proofs
├── 09-agent-reasoning/ Agent state machine + termination proofs
├── 10-multi-agent-orchestrator/ Multi-agent system + fairness/routing proofs
└── 11-full-integration/ Complete verified TUI LLM agent harness
The tutorials are designed to be followed in order. Each one introduces new Lean concepts and builds on previous work:
| Tutorial | What You Build | What You Prove | New Concepts |
|---|---|---|---|
| 01 | Simple arithmetic functions | No panics, correct results | Curry-Howard, step tactic |
| 02 | RPN calculator | Stack invariants, evaluator correctness | Inductive types, cases, induction |
| 03 | Infix calculator + parser | Parser termination, RPN equivalence | Recursive types, tree induction, simp |
| 04 | State machine framework | Safety, liveness, mutual exclusion | Traits as structures, decide |
| 05 | Binary message protocol | Roundtrip, no message loss | Byte reasoning, omega |
| 06 | Ring buffer, gap buffer | Capacity, FIFO, content preservation | Modular arithmetic, loop invariants |
| 07 | TUI model (pure) | Layout correctness, render bounds | Widget trees, Rect arithmetic |
| 08 | LLM client (pure) | Request validity, context window | Trait specs, axioms |
| 09 | Agent reasoning engine | Termination, tool safety, guardrails | Decreasing measures, reachability |
| 10 | Multi-agent orchestrator | Routing, fairness, budget, voting | Distributed reasoning, Finset |
| 11 | Full TUI LLM agent | End-to-end composition, state consistency | Module composition, I/O axioms |
01 Setup
↓
02 RPN Calculator
↓
03 Infix Calculator ←── imports 02 for equivalence proof
↓
04 State Machines ─── KEY BUILDING BLOCK for 07-11
↓
05 Message Protocol ──────────────────────┐
↓ │
06 Buffer Management ─────────────┐ │
↓ │ │
07 TUI Core ←── uses 06 │ │
↓ │ │
08 LLM Client Core ←── uses 04 │ │
↓ │ │
09 Agent Reasoning ←── uses 04,08 │ │
↓ │ │
10 Multi-Agent ←── uses 04,05,09 │ │
↓ ↓ ↓
11 Full Integration ←── uses 05,06,07,08,09,10
See Tutorial 01: Setup and Hello Proof to get started.
- LEAN.md — Everything you need to know about Lean 4
- AENEAS.md — Everything you need to know about Aeneas
- STATUS.md — Current project status and what works/doesn't
- PLAN.md — Master implementation plan
- GAP_ANALYSIS.md — What's missing for real formal verification
- DO_NEXT.md — Immediate next steps
- WHAT_WE_DID.md — History of what was built
Every tutorial (02 and above) has the same structure:
tutorials/NN-name/
├── README.md Detailed walkthrough with explanations
├── PLAN.md Implementation plan and design decisions
├── rust/
│ ├── Cargo.toml Rust project
│ └── src/ Rust source code (the verified core)
└── lean/
├── lakefile.lean Lean project config
├── lean-toolchain Lean version
└── ... Generated Lean + hand-written proofs
The workflow for each tutorial:
- Read the README for context and theory
- Study the Rust code in
rust/src/ - Run Charon + Aeneas to generate Lean code
- Study the generated Lean in
lean/ - Read and understand the hand-written proofs
- Try the exercises
Several decisions were made to keep the code Aeneas-friendly:
Vec<u8>instead ofString— Aeneas mapsVec<u8>directly to Lean'sList U8- Explicit
whileloops instead of iterators — translates cleanly to recursive functions - Enum-dispatch instead of
dyn Trait— enables simpler case-splitting in proofs - Functional data structures (e.g., linked-list Stack) — natural inductive types in Lean
u32IDs instead of strings in the verified core — avoids string complexity
See PLAN.md for the full rationale behind each decision.
This project is licensed under the MIT License. See LICENSE for details.