| Version | Released |
|---|---|
| 0.9.0 (new) | May 15, 2026 |
| 0.8.0 | Apr 21, 2026 |
| 0.7.1 | Apr 5, 2026 |
| 0.7.0 | Mar 23, 2026 |
| 0.4.0 | Mar 6, 2026 |
WASM-PVM: WebAssembly to PolkaVM Recompiler
WARNING: This project is largely vibe-coded. It was built iteratively with heavy AI assistance (Claude). While it has 670 passing integration tests and produces working PVM bytecode, the internals may contain unconventional patterns, over-engineering in some places, and under-engineering in others. Use at your own risk. Contributions and proper engineering reviews are very welcome!
A Rust compiler that translates WebAssembly (WASM) bytecode into PolkaVM (PVM) bytecode for execution on the JAM (Join-Accumulate Machine) protocol. Write your JAM programs in AssemblyScript (TypeScript-like), hand-written WAT, or any language that compiles to WASM — and run them on PVM.
WASM ──► LLVM IR ──► PVM bytecode ──► JAM program (.jam)
inkwell mem2reg Rust backend
Getting Started
Prerequisites
- Rust (stable, edition 2024)
- LLVM 18 — the compiler uses inkwell (LLVM 18 bindings)
  - macOS: brew install llvm@18, then export LLVM_SYS_181_PREFIX=/opt/homebrew/opt/llvm@18
  - Ubuntu: apt install llvm-18-dev
- Bun (for running integration tests and the JAM runner) — bun.sh
Build
git clone https://github.com/tomusdrw/wasm-pvm.git
cd wasm-pvm
cargo build --release
Hello World: Compile & Run
Create a simple WAT program that adds two numbers:
;; add.wat
(module
  (memory 1)
  (func (export "main") (param $args_ptr i32) (param $args_len i32) (result i64)
    ;; Read two i32 args, add them, write result to memory
    (i32.store (i32.const 0)
      (i32.add
        (i32.load (local.get $args_ptr))
        (i32.load (i32.add (local.get $args_ptr) (i32.const 4)))))
    (i64.const 17179869184))) ;; packed ptr=0, len=4
Compile it to a JAM blob and run it:
# Compile WAT → JAM
cargo run -p wasm-pvm-cli -- compile add.wat -o add.jam
# Run with two u32 arguments: 5 and 7 (little-endian hex)
npx @fluffylabs/anan-as run add.jam 0500000007000000
# Output: 0c000000 (12 in little-endian)
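The hex strings in the run above are raw little-endian bytes. As a minimal sketch (the helper names `encode_args` and `decode_u32` are hypothetical and not part of wasm-pvm or the runner), the argument string and the output can be produced and read back like this:

```rust
/// Encode u32 arguments as the little-endian hex string passed on the command line.
fn encode_args(args: &[u32]) -> String {
    args.iter()
        .flat_map(|a| a.to_le_bytes())
        .map(|b| format!("{b:02x}"))
        .collect()
}

/// Decode a hex string holding exactly one little-endian u32 (such as the runner output).
fn decode_u32(hex: &str) -> Option<u32> {
    if hex.len() != 8 {
        return None;
    }
    let mut bytes = [0u8; 4];
    for (i, byte) in bytes.iter_mut().enumerate() {
        // Each byte is two hex digits; any invalid digit yields None.
        *byte = u8::from_str_radix(hex.get(2 * i..2 * i + 2)?, 16).ok()?;
    }
    Some(u32::from_le_bytes(bytes))
}
```

With these helpers, `encode_args(&[5, 7])` gives the argument string used above, and `decode_u32("0c000000")` recovers 12.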
Inspect the Output
Upload the resulting .jam file to the PVM Debugger for step-by-step execution, disassembly, register inspection, and gas metering visualization.
AssemblyScript Example
You can also write programs in AssemblyScript:
// fibonacci.ts
export function main(args_ptr: i32, args_len: i32): i64 {
  const buf = heap.alloc(256);
  let n = load<i32>(args_ptr);
  let a: i32 = 0;
  let b: i32 = 1;
  while (n > 0) {
    b = a + b;
    a = b - a;
    n = n - 1;
  }
  store<i32>(buf, a);
  return (buf as i64) | ((4 as i64) << 32); // packed ptr + len
}
Compile via the AssemblyScript compiler to WASM, then use wasm-pvm-cli to produce a JAM blob. See the tests/fixtures/assembly/ directory for more examples.
How It Works
The compiler pipeline:
- Adapter merge (optional): merges a WAT adapter module into the WASM binary, replacing matching imports with adapter function bodies
- WASM → LLVM IR: translates WASM opcodes to LLVM IR using inkwell (LLVM 18 bindings), with PVM-specific intrinsics for memory operations
- LLVM optimization passes: mem2reg (SSA promotion), instcombine, simplifycfg, gvn, dce, and optional function inlining
- LLVM IR → PVM bytecode: a custom Rust backend reads LLVM IR and emits PVM instructions with per-block register caching (store-load forwarding)
- SPI assembly: packages the bytecode into a JAM/SPI program blob with entry headers, jump tables, and data sections
Entry functions use a unified ABI: main(args_ptr: i32, args_len: i32) -> i64, where the return value packs the result pointer in the lower 32 bits and the result length in the upper 32 bits. The compiler unpacks this into PVM's SPI convention (r7 = start address, r8 = end address).
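The packed return value of the entry ABI can be illustrated with a few lines of Rust. This is a sketch only: `pack_result`, `unpack_result`, and `to_range` are hypothetical names, and any translation from WASM addresses to PVM addresses that the compiler performs is left out.

```rust
/// Pack a result pointer (low 32 bits) and length (high 32 bits) into the i64 returned by main.
fn pack_result(ptr: u32, len: u32) -> i64 {
    ((u64::from(len) << 32) | u64::from(ptr)) as i64
}

/// Split a packed return value back into (ptr, len).
fn unpack_result(packed: i64) -> (u32, u32) {
    let bits = packed as u64;
    (bits as u32, (bits >> 32) as u32)
}

/// Turn (ptr, len) into a (start, end) pair, the shape the SPI convention keeps in r7/r8.
/// Address translation between WASM memory and PVM memory is deliberately omitted.
fn to_range(ptr: u32, len: u32) -> (u64, u64) {
    let start = u64::from(ptr);
    (start, start + u64::from(len))
}
```

For instance, `pack_result(0, 4)` yields 17179869184, the constant returned by the add.wat example.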
Key Design Decisions
- Stack-slot approach with register allocation: every SSA value gets a dedicated 8-byte stack slot at a fixed offset from SP. A linear-scan register allocator then assigns high-use values to whichever of the callee-saved registers r9-r12 are free, meaning they do not hold this function's incoming parameters and, in non-leaf functions, are not reserved (from r9 upward) for outgoing call arguments. This eliminates redundant memory traffic across block boundaries and loops
- Per-block register cache: eliminates redundant loads when a value is reused shortly after being computed (~50% gas reduction)
- No unsafe code: deny(unsafe_code) enforced at workspace level
- No floating point: PVM lacks FP support; WASM floats are rejected at compile time
- All optimizations are toggleable: --no-llvm-passes, --no-peephole, --no-register-cache, --no-icmp-fusion, --no-shrink-wrap, --no-dead-store-elim, --no-const-prop, --no-inline, --inline-threshold N, --no-cross-block-cache, --no-register-alloc, --no-aggressive-regalloc, --no-scratch-reg-alloc, --no-caller-saved-alloc, --no-lazy-spill, --no-dead-function-elim, --no-fallthrough-jumps, --no-libcall-recognition
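To see why a per-block register cache removes loads, consider the toy model below. It is not the backend's actual data structure: the `Op` enum, the `count_loads` function, and the oldest-first eviction policy are all assumptions made for illustration. Each value lives in a stack slot, and a use costs a load unless the slot's value is still held in a register from an earlier definition or load in the same block.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy)]
enum Op {
    /// Compute a value and store it to the given stack slot.
    Def(u32),
    /// Read the value held in the given stack slot.
    Use(u32),
}

/// Count the loads needed for one basic block when up to `regs` slots can stay cached.
/// `regs == 0` models the uncached case where every use reloads from the stack.
fn count_loads(block: &[Op], regs: usize) -> usize {
    // The cache starts empty: in this model nothing survives from other blocks.
    let mut cached: VecDeque<u32> = VecDeque::new();
    let mut loads = 0;
    for op in block {
        let slot = match *op {
            Op::Def(s) => s,
            Op::Use(s) => {
                if cached.contains(&s) {
                    continue; // value is still in a register, so no load is emitted
                }
                loads += 1;
                s
            }
        };
        // After a definition or a load, the slot's current value sits in a register.
        cached.retain(|&c| c != slot);
        cached.push_back(slot);
        if cached.len() > regs {
            cached.pop_front(); // evict the oldest entry
        }
    }
    loads
}
```

In this model, a block that defines two values and reuses each of them shortly afterwards needs no loads for those reuses once two cache registers are available, whereas the uncached variant pays one load per use.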
Benchmark: Optimizations Impact
All PVM-level optimizations enabled (default):
| Benchmark | WASM size | JAM size | Code size | Gas Used |
|---|---|---|---|---|
| add(5,7) | 68 B | 160 B | 96 B | 27 |
| fib(20) | 110 B | 221 B | 144 B | 429 |
| factorial(10) | 102 B | 200 B | 126 B | 185 |
| is_prime(25) | 162 B | 271 B | 189 B | 61 |
| AS fib(10) | 235 B | 622 B | 496 B | 258 |
| AS factorial(7) | 234 B | 619 B | 493 B | 225 |
| AS gcd(2017,200) | 229 B | 627 B | 505 B | 168 |
| AS decoder | 1.5 KB | 6.4 KB | 4,734 B | 913 |
| AS array | 1.4 KB | 5.8 KB | 4,207 B | 782 |
| regalloc two loops(500) | 252 B | 579 B | 454 B | 37,574 |
| host-call-log | 171 B | 458 B | 104 B | 40 |
| aslan-fib accumulate | - | 19.8 KB | 12,556 B | 10,706 |
| blake2b("abc", 32) | 1.1 KB | 3.8 KB | 2,572 B | 16,675 |
| sha512("abc") | 1.7 KB | 3.5 KB | 2,396 B | 16,787 |
| u128 mul x1000 | 296 B | 457 B | 342 B | 71,031 |
| u128 div(fast) x1000 | 273 B | 767 B | 608 B | 68,031 |
| u128 div(slow) x1000 | 273 B | 774 B | 609 B | 130,031 |
| anan-as PVM interpreter | 53.4 KB | 109.8 KB | 79,031 B | - |
The three u128 rows are microbenchmarks for the libcall_recognition optimization, which replaces the __multi3 and __udivti3 bodies with hand-crafted PVM-friendly versions (pass --no-libcall-recognition to disable). Compared against the same workloads with recognition off: u128 mul uses 37% less gas, the u128 div fast path (callers with a_hi = b_hi = 0) uses 41% less gas, and the u128 div slow path (b_hi non-zero) uses 11% more gas. The slow-path regression is the cost of the dispatch check and is dwarfed by the fast-path savings in real workloads, since substrate runtimes hit the fast path on the dominant Perbill/Balance: u128 patterns. See docs/src/optimizations.md for details.
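The fast-path/slow-path split for 128-bit division can be sketched as follows. This only illustrates the dispatch idea and is not the code the compiler emits: `udiv128` is a hypothetical name, the slow path simply falls back to Rust's native u128 division, and returning `None` on division by zero is a choice made for this example.

```rust
/// Unsigned 128-bit division on (lo, hi) halves with a dispatch check for the fast path.
/// Returns the quotient as (lo, hi), or None on division by zero.
fn udiv128(a_lo: u64, a_hi: u64, b_lo: u64, b_hi: u64) -> Option<(u64, u64)> {
    if a_hi == 0 && b_hi == 0 {
        // Fast path: both operands fit in 64 bits, so one 64-bit division suffices.
        return a_lo.checked_div(b_lo).map(|q| (q, 0));
    }
    // Slow path: stand-in for the general algorithm, using native u128 arithmetic.
    let a = (u128::from(a_hi) << 64) | u128::from(a_lo);
    let b = (u128::from(b_hi) << 64) | u128::from(b_lo);
    let q = a.checked_div(b)?;
    Some((q as u64, (q >> 64) as u64))
}
```

The extra comparison at the top is the dispatch check: it is nearly free when the fast path is taken and pure overhead when it is not, which matches the direction of the gas changes reported above.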
PVM-in-PVM: programs executed inside the anan-as PVM interpreter (outer gas cost):
| Benchmark | JAM Size | Outer Gas | Direct Gas | Overhead |
|---|---|---|---|---|
| TRAP (interpreter overhead) | 21 B | 80,451 | - | - |
| add(5,7) | 160 B | 1,164,147 | 27 | 43,116x |
| host-call-log | 458 B | 1,208,919 | 40 | 30,223x |
| AS fib(10) | 622 B | 1,536,038 | 258 | 5,954x |
| JAM-SDK fib(10)* | 25.4 KB | 8,717,551 | - | - |
| Jambrains fib(10)* | 61.1 KB | 7,505,155 | - | - |
| JADE fib(10)* | 67.3 KB | 18,659,363 | - | - |
| aslan-fib accumulate* | 19.8 KB | 14,033,405 | 10,706 | 1,311x |
| blake2b("abc", 32) | 3.8 KB | 14,402,928 | 16,675 | 863x |
| sha512("abc") | 3.5 KB | 14,390,234 | 16,787 | 857x |
*JAM-SDK fib(10), Jambrains fib(10), JADE fib(10), and aslan-fib accumulate exit on unhandled host calls (ecalli). The gas cost reflects program parsing/loading plus partial execution up to the first unhandled ecalli.
Memory layout summary
The JAM blob reserves separate ranges for RO data, a guard gap, globals/overflow metadata, and the WASM heap; see the Architecture docs for the full breakdown, including GLOBAL_MEMORY_BASE, PARAM_OVERFLOW_BASE, SPILLED_LOCALS_BASE, and how wasm_memory_base is computed.
The SPI rw_data section is simply a contiguous copy of every byte from GLOBAL_MEMORY_BASE up to the highest initialized heap address, which is why stub AssemblyScript fixtures such as decoder-test/array-test emit ~13 KB of RW data even though only a handful of bytes are non-zero: the encoder must preserve the absolute addresses of the data segments, so the zero stretch between globals and the first heap byte is encoded verbatim. Keeping globals/data near the heap base or introducing sparse RW descriptors (future work) are the only ways to shrink those blobs without redesigning SPI.
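The size effect described above can be reproduced with a small calculation. The sketch below uses a made-up value for GLOBAL_MEMORY_BASE and hypothetical helper names (`contiguous_rw_len`, `initialized_len`); the real constants are in the Architecture docs.

```rust
/// Hypothetical base address, used only for this illustration.
const GLOBAL_MEMORY_BASE: u32 = 0x3_0000;

/// Size of a contiguous image covering GLOBAL_MEMORY_BASE up to the last initialized byte.
/// Assumes every segment address is at or above GLOBAL_MEMORY_BASE.
fn contiguous_rw_len(segments: &[(u32, &[u8])]) -> u32 {
    segments
        .iter()
        .filter(|(_, data)| !data.is_empty())
        .map(|(addr, data)| addr + data.len() as u32 - GLOBAL_MEMORY_BASE)
        .max()
        .unwrap_or(0)
}

/// Number of bytes that actually carry initialized data.
fn initialized_len(segments: &[(u32, &[u8])]) -> u32 {
    segments.iter().map(|(_, data)| data.len() as u32).sum()
}
```

With 8 bytes of globals at the base and a 16-byte data segment 13,000 bytes further up, the contiguous image is 13,016 bytes while only 24 bytes are initialized; the gap is the zero stretch that has to be encoded verbatim.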
Supported WASM Features
| Category | Operations |
|---|---|
| Arithmetic (i32 & i64) | add, sub, mul, div_u/s, rem_u/s, all comparisons, clz, ctz, popcnt, rotl, rotr, bitwise ops |
| Control flow | block, loop, if/else, br, br_if, br_table, return, unreachable, block results |
| Memory | load/store (all widths), memory.size, memory.grow, memory.fill, memory.copy, globals, data sections |
| Functions | call, call_indirect (with signature validation), recursion, stack overflow detection |
| Type conversions | wrap, extend_s/u, sign extensions (i32/i64 extend8/16/32_s) |
| Imports | Text-based import maps (--imports) and WAT adapter files (--adapter) |
Not supported: floating point (by design — PVM has no FP instructions).
CLI Usage
# Compile WAT or WASM to JAM
wasm-pvm compile input.wat -o output.jam
wasm-pvm compile input.wasm -o output.jam
# With import resolution
wasm-pvm compile input.wasm -o output.jam \
--imports imports.txt \
--adapter adapter.wat
# Disable specific optimizations
wasm-pvm compile input.wasm -o output.jam --no-inline --no-peephole
# Disable all optimizations
wasm-pvm compile input.wasm -o output.jam \
--no-llvm-passes --no-peephole --no-register-cache \
--no-icmp-fusion --no-shrink-wrap --no-dead-store-elim \
--no-const-prop --no-inline --no-cross-block-cache \
--no-register-alloc --no-aggressive-regalloc \
--no-scratch-reg-alloc --no-caller-saved-alloc \
--no-lazy-spill --no-dead-function-elim \
--no-fallthrough-jumps --no-libcall-recognition
# Compile past the "float wall" by replacing every f32/f64 op
# with a runtime trap (useful for discovering other unsupported
# features in a module before adding real FP support)
wasm-pvm compile input.wasm -o output.jam --trap-floats
See the Import Handling section for details on resolving WASM imports.
Using as a Library
The wasm-pvm crate can be used as a Rust dependency. It supports two modes:
# Full compiler (default) — requires LLVM 18
wasm-pvm = "0.9.0"
# PVM types only — no LLVM dependency, compiles to wasm32-unknown-unknown
wasm-pvm = { version = "0.9.0", default-features = false }
With default-features = false, only the PVM type definitions are available: Instruction, Opcode, ProgramBlob, SpiProgram, abi::*, memory_layout::*, and Error. This is useful for downstream tools that need to work with PVM bytecode (interpreters, debuggers, analyzers) without requiring the full LLVM compiler toolchain.
| Feature | Default | Description |
|---|---|---|
| compiler | Yes | Full WASM-to-PVM compiler (inkwell, wasmparser, wasm-encoder) |
| test-harness | Yes | Test utilities for unit testing (implies compiler) |
Project Structure
crates/
  wasm-pvm/              # Core library
    src/
      pvm/               # PVM instruction definitions (always available)
      memory_layout.rs   # PVM memory address constants (always available)
      spi.rs             # JAM/SPI format encoder (always available)
      abi.rs             # Register & frame layout constants (always available)
      llvm_frontend/     # WASM → LLVM IR translation (feature = "compiler")
      llvm_backend/      # LLVM IR → PVM bytecode lowering (feature = "compiler")
      translate/         # Compilation orchestration & SPI assembly (feature = "compiler")
  wasm-pvm-cli/          # Command-line interface
tests/                   # 670 integration tests (TypeScript/Bun)
  fixtures/
    wat/                 # WAT test programs
    assembly/            # AssemblyScript examples
    imports/             # Import maps & adapter files
vendor/
  anan-as/               # PVM interpreter (submodule)
Testing
# Rust unit tests
cargo test
# Lint
cargo clippy -- -D warnings
# Integration tests (builds artifacts, then runs all layers)
cd tests && bun run test
# Quick validation (Layer 1 smoke tests only)
cd tests && bun test layer1/
The test suite is organized into layers:
- Layer 1: Core/smoke tests (~56 tests) — fast, run during development
- Layer 2: Feature tests (~169 tests)
- Layer 3: Regression/edge cases (~445 tests)
- Layer 4-5: PVM-in-PVM tests — the PVM interpreter itself compiled to PVM, running the test suite inside PVM
- Differential (~142 tests): cross-checks PVM output against Bun's native WebAssembly engine; run with bun run test:differential
Import Handling
WASM modules that import external functions need those imports resolved before compilation. Two mechanisms are available:
Import Map (--imports)
A text file mapping import names to simple actions:
# my-imports.txt
abort = trap # emit unreachable (panic)
console.log = nop # do nothing, return zero
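The import map format shown above is simple enough to parse by hand. The sketch below is not the CLI's parser: `Action` and `parse_import_map` are hypothetical names, and it assumes that `trap` and `nop` (the only actions shown here) are the complete set and that `#` always starts a comment.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    Trap,
    Nop,
}

/// Parse lines of the form `name = action`, ignoring blank lines and `#` comments.
fn parse_import_map(text: &str) -> Result<Vec<(String, Action)>, String> {
    let mut entries = Vec::new();
    for (idx, raw) in text.lines().enumerate() {
        // Drop everything after `#`, then surrounding whitespace.
        let line = raw.split('#').next().unwrap_or("").trim();
        if line.is_empty() {
            continue;
        }
        let (name, action) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `name = action`", idx + 1))?;
        let action = match action.trim() {
            "trap" => Action::Trap,
            "nop" => Action::Nop,
            other => return Err(format!("line {}: unknown action `{other}`", idx + 1)),
        };
        entries.push((name.trim().to_string(), action));
    }
    Ok(entries)
}
```

Reporting the line number in errors keeps a typo in a long import map easy to locate.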
Adapter WAT (--adapter)
A WAT module whose exports replace matching imports, enabling arbitrary logic for import resolution (pointer conversion, memory reads, host calls):
(module
  (import "env" "host_call_5" (func $host_call_5 (param i64 i64 i64 i64 i64 i64) (result i64)))
  (import "env" "pvm_ptr" (func $pvm_ptr (param i64) (result i64)))
  (func (export "console.log") (param i32)
    (drop (call $host_call_5
      (i64.const 100) ;; ecalli index
      (i64.const 3) ;; log level
      (i64.const 0) (i64.const 0) ;; target ptr/len
      (call $pvm_ptr (i64.extend_i32_u (local.get 0))) ;; message ptr
      (i64.extend_i32_u (i32.load offset=0
        (i32.sub (local.get 0) (i32.const 4))))))) ;; message len
)
When both --imports and --adapter are provided, the adapter runs first, then the import map handles remaining unresolved imports. All imports must be resolved or compilation fails.
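The resolution order described above (adapter first, import map second, failure otherwise) can be modelled in a few lines. This is a sketch under simplifying assumptions: `Resolved` and `resolve_imports` are hypothetical names, imports are matched by name only, and the adapter's own imports are ignored.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum Resolved {
    /// Replaced by the body of a matching adapter export.
    Adapter,
    /// Handled by the named import-map action (for example "trap" or "nop").
    ImportMap(String),
}

/// Resolve each import: adapter exports take precedence, the import map covers the rest.
/// Any names left over are returned as an error, mirroring a failed compilation.
fn resolve_imports(
    imports: &[&str],
    adapter_exports: &HashSet<&str>,
    import_map: &HashMap<&str, &str>,
) -> Result<Vec<(String, Resolved)>, Vec<String>> {
    let mut resolved = Vec::new();
    let mut missing = Vec::new();
    for &name in imports {
        if adapter_exports.contains(name) {
            resolved.push((name.to_string(), Resolved::Adapter));
        } else if let Some(action) = import_map.get(name) {
            resolved.push((name.to_string(), Resolved::ImportMap(action.to_string())));
        } else {
            missing.push(name.to_string());
        }
    }
    if missing.is_empty() { Ok(resolved) } else { Err(missing) }
}
```

Note that an import listed in both places resolves to the adapter, so an import-map entry for the same name is never consulted.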
Resources
- PVM Debugger: upload .jam files for disassembly, step-by-step execution, and register/gas inspection
- PVM Decompiler: decompile PVM bytecode back to human-readable form
- ananas (anan-as): PVM interpreter written in AssemblyScript, compiled to PVM itself for PVM-in-PVM execution
- as-lan: example AssemblyScript project compiled from WASM to PVM using this tool
- JAM Gray Paper: the JAM protocol specification (PVM is defined in Appendix A)
- AssemblyScript: TypeScript-like language that compiles to WASM
- Documentation Book: full compiler docs (run mdbook serve docs to browse locally)
License
Contributing
Contributions are welcome! See AGENTS.md for coding guidelines, project conventions, and a map of the codebase.