zk-Autograd is a proof-of-concept training pipeline where every optimizer step emits a zero‑knowledge proof that the update was computed honestly from the previous weights, batch gradients, and declared optimizer rules. Proofs are generated inside a TEE (AWS Nitro Enclave or OCI Confidential VM) so proving keys and intermediate values never leave protected memory. Public artifacts (proof hashes, logs, and torrents) enable third‑party audit of fine‑tuning runs without revealing training data or model weights.
Status: PoC / research prototype. Not for production use.
flowchart LR
subgraph Host["Trainer Host (Docker / EC2 / OCI VM)"]
T[PyTorch Trainer]
H[Autograd Hook Collector]
L[Step Log + Hash Chain]
end
subgraph TEE["Prover TEE"]
P[EZKL Prover Service]
ZK[Adam/SGD Step Circuit + Proving Key]
end
subgraph Audit["Public Audit Plane"]
V["Verifier CLI (EZKL verify)"]
A["Artifacts: proofs, logs"]
TOR["Torrents + magnets"]
W["Web UI (GH Pages)"]
end
T --> H -->|w_t, grad_t, opt_state| P
P -->|proof π_t, hash| L
L --> A --> TOR --> W
A --> V
sequenceDiagram
participant Trainer
participant Hook
participant ProverTEE
participant Log
participant Verifier
Trainer->>Hook: backward() triggers hooks
Hook->>Trainer: collects grad_t + w_t snapshot
Trainer->>ProverTEE: send (w_t, grad_t, m_t, v_t, lr, betas, t)
ProverTEE->>ProverTEE: gen witness, prove Adam/SGD step via EZKL
ProverTEE->>Log: return (π_t, H(π_t))
Log->>Log: append step + Merkle/hash-chain update
Verifier->>Log: sample random step IDs
Verifier->>Verifier: ezkl.verify(π_t)
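
The verifier's sampling step above can be sketched as follows. This is a minimal illustration, assuming step IDs run from 0 to num_steps - 1 and are sampled uniformly without replacement; `sample_step_ids` and `detection_probability` are hypothetical names, not part of the zk-verify CLI.

```python
import math
import random
from typing import List, Optional


def sample_step_ids(num_steps: int, k: int, seed: Optional[int] = None) -> List[int]:
    """Pick k distinct step IDs uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_steps), min(k, num_steps)))


def detection_probability(num_steps: int, num_bad: int, k: int) -> float:
    """Chance that at least one of k sampled steps is one of num_bad bad steps."""
    k = min(k, num_steps)
    # Probability that every sampled step comes from the honest ones.
    miss = math.comb(num_steps - num_bad, k) / math.comb(num_steps, k)
    return 1.0 - miss
```

Under these assumptions, the chance of catching a tampered run grows with both the sample size k and the number of bad steps, which is why a small random sample can still deter a host from skipping many steps.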
EZKL generates Halo2-based zk‑SNARK circuits over ONNX graphs and provides Python bindings for setup, witness generation, proving, and verification.
PoC circuits:
- Adam step (default): proves {m_{t+1}, v_{t+1}, w_{t+1}} from {w_t, g_t, m_t, v_t, lr, beta1, beta2, eps, t}.
- SGD step (optional).
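
For reference, the relation the Adam circuit attests to can be written in plain Python. This is a sketch of the textbook Adam update with bias correction, not the circuit itself: it assumes t is the 1-based index of the step being taken and uses floats, whereas the actual ONNX graph and any fixed-point quantization EZKL applies may differ. The function name `adam_step` is illustrative.

```python
import math


def adam_step(w, g, m, v, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, t=1):
    """Textbook Adam update over flat lists of floats.

    Assumption: t is the 1-based index of the step being taken; the
    circuit's exact convention is not stated in this README.
    """
    # First and second moment estimates.
    m_next = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v_next = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Bias-correction denominators.
    bc1 = 1 - beta1 ** t
    bc2 = 1 - beta2 ** t
    w_next = [
        wi - lr * (mi / bc1) / (math.sqrt(vi / bc2) + eps)
        for wi, mi, vi in zip(w, m_next, v_next)
    ]
    return m_next, v_next, w_next
```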
Large graphs can be split into multiple proofs using EZKL commitments and then aggregated into a single proof.
PoC fallback:
Set EZKL_CHUNKS=N to slice flattened optimizer vectors into N blocks, generate N chunk proofs, and aggregate them into a single aggregated.pf.
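
The slicing itself can be sketched as below. This is an illustration only: it assumes every chunk must have the same length (so one fixed-size circuit can prove each block) and that the tail is zero-padded; how the PoC actually handles remainders is not stated here. `split_into_padded_chunks` is a hypothetical name.

```python
def split_into_padded_chunks(flat, n_chunks, pad_value=0.0):
    """Split a flat vector into n_chunks equal-length blocks, padding the tail."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be >= 1")
    size = -(-len(flat) // n_chunks)  # ceiling division
    padded = list(flat) + [pad_value] * (size * n_chunks - len(flat))
    return [padded[i * size:(i + 1) * size] for i in range(n_chunks)]
```

Each block would then be proved separately, and the N chunk proofs aggregated.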
EZKL_CHUNKS=4 docker compose up --build
Setup artifacts live in prover/keys/:
settings.json, compiled.ezkl, pk.key, vk.key, kzg.srs, and adam_step.onnx.
Generate them locally:
zk-setup-zk --circuit adam --dim 128 --out prover/keys
Adversaries considered:
- Honest‑but‑curious host observing runtime/logs.
- Malicious host attempting to fabricate/skip/rollback steps.
- Malicious auditor sampling proofs.
Assumptions and mitigations:
- TEE attestation gates proving key release.
  - Nitro Enclaves attest to the enclave image via PCRs, and KMS key policies can restrict key use to a specific enclave measurement.
  - OCI Confidential VMs use AMD SEV‑based memory encryption and support measurement‑based attestation.
- ZK backend is sound.
- Log integrity via hash‑chain + Merkle root.
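
To make the log-integrity item concrete, here is a minimal sketch of a hash chain over proof hashes plus a Merkle root over the same entries. The choices below (SHA-256, an all-zero genesis value, 0x00/0x01 domain-separation prefixes, duplicating the last node on odd levels) are assumptions made for illustration; the PoC's actual encoding may differ.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain_hashes(proof_hashes, genesis=b"\x00" * 32):
    """Link each entry to its predecessor: h_i = H(h_{i-1} || proof_hash_i)."""
    links, prev = [], genesis
    for ph in proof_hashes:
        prev = sha256(prev + ph)
        links.append(prev)
    return links


def merkle_root(leaves):
    """Binary Merkle root; an odd node at any level is paired with itself."""
    if not leaves:
        return sha256(b"")
    # Prefix leaves and inner nodes differently so they cannot be confused.
    level = [sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Changing, dropping, or reordering any step changes every later chain link as well as the root, which is what makes skipped or rolled-back steps detectable once the root is anchored.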
Requirements: Docker + Compose.
# 1) Prepare EZKL circuit and keys
zk-setup-zk --circuit adam --out prover/keys
# 2) Run prover + trainer
docker compose -f docker/docker-compose.yml up --build
Artifacts are written to artifacts/run-*/:
- steps.jsonl – per-step metadata + proof hash
- proofs/step_XXXXX.proof – EZKL proofs
- merkle_root.txt – run root
- run_manifest.json – pointers for web UI/torrents
- anchors.json – monotonic counter anchors (local PoC)
For an interactive walkthrough, see demo.ipynb. The notebook demonstrates:
- Setup & Compilation: How to compile ZK circuits for the optimizer.
- Provable Training: Running a toy training loop with audit logging.
- Audit Log Inspection: Examining the cryptographic log and Merkle tree.
- Verification: Verifying generated ZK proofs.
- Advanced Features: Exploring Triton kernels and proof chunking.
- Decentralized Storage: Creating torrent bundles for training artifacts.
It is designed to be a self-contained guide for understanding the end-to-end flow of zk-Autograd.
AWS Nitro Enclaves: see deployment/aws-nitro/README.md for EIF build and vsock wiring.
Hardening implemented in design:
- Proving key stored encrypted in KMS and released only to enclaves whose attestation PCR0/ImageSha384 matches policy.
- Run Merkle roots anchored using an external monotonic counter service (e.g., DynamoDB conditional write) because enclaves have no persistent storage.
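
The monotonic-counter idea can be illustrated with an in-memory stand-in for the external store. This is a sketch of the compare-then-write pattern only; the class and method names are hypothetical and no real DynamoDB API is called.

```python
class AnchorStore:
    """In-memory stand-in for an external store with conditional writes."""

    def __init__(self):
        self.counter = 0
        self.roots = {}

    def anchor(self, expected_counter: int, merkle_root: str) -> int:
        """Store merkle_root only if the caller has seen the latest counter."""
        if expected_counter != self.counter:
            # A stale counter suggests a rollback attempt or a second writer.
            raise RuntimeError("stale counter: write rejected")
        self.counter += 1
        self.roots[self.counter] = merkle_root
        return self.counter
```

Because the write succeeds only when the expected counter matches, a host that restarts the enclave from an older state cannot silently overwrite a newer anchored root.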
OCI Confidential VM: see deployment/oci-cvm/README.md.
Hardening implemented in design:
- Proving key stored in OCI Vault and released only after verifying CVM attestation measurement (SEV report).
- Anchor Merkle roots to a versioned object / conditional write in OCI Object Storage.
To ensure the robustness of the verifier, we employ property-based fuzzing using Hypothesis. This helps discover edge cases in log parsing and proof verification by generating malformed and random inputs.
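
The property being tested can be sketched without extra dependencies. The example below swaps in the standard library's random module as a stand-in for Hypothesis, and `parse_step_line` with its `step` and `proof_hash` fields is a hypothetical parser; the real steps.jsonl schema may differ. The property is that malformed input is either rejected with ValueError or parsed cleanly, and never raises anything else.

```python
import json
import random

HEX_DIGITS = set("0123456789abcdef")


def parse_step_line(line: str) -> dict:
    """Hypothetical log-line parser; field names are assumptions."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError as exc:
        raise ValueError("not valid JSON") from exc
    if not isinstance(rec, dict):
        raise ValueError("record must be an object")
    step, proof_hash = rec.get("step"), rec.get("proof_hash")
    if isinstance(step, bool) or not isinstance(step, int) or step < 0:
        raise ValueError("bad step")
    if not isinstance(proof_hash, str) or len(proof_hash) != 64:
        raise ValueError("bad proof_hash")
    if not set(proof_hash) <= HEX_DIGITS:
        raise ValueError("proof_hash is not lowercase hex")
    return {"step": step, "proof_hash": proof_hash}


def fuzz(n_cases: int = 500, seed: int = 0) -> int:
    """Mutate a valid line at random; return how many mutants were accepted."""
    rng = random.Random(seed)
    valid = json.dumps({"step": 3, "proof_hash": "ab" * 32})
    accepted = 0
    for _ in range(n_cases):
        chars = list(valid)
        for _ in range(rng.randint(0, 4)):
            chars[rng.randrange(len(chars))] = chr(rng.randrange(32, 127))
        try:
            parse_step_line("".join(chars))
            accepted += 1
        except ValueError:
            pass  # rejection is the expected outcome for malformed input
    return accepted
```

Any exception other than ValueError escapes `fuzz` and counts as a failure of the property.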
Run the fuzzer:
pytest tests/test_fuzz_verifier.py
We use Sigstore (Cosign) and Syft to sign Docker images and generate SBOMs, so that the image running inside the TEE can be verified as the one built in CI.
Verify a random sample of steps:
zk-verify --run artifacts/run-YYYYMMDD-HHMM --k 10 --key-dir prover/keys
Known limitations:
- Side‑channel resistance is partial (constant‑time kernels and padding are suggested, but their effectiveness is not fully proven).
- The circuit targets tiny models; proving for large models will be slow.
- Torrent distribution can leak metadata; don't use it for sensitive runs.
Use cases:
- Auditable fine‑tuning without sharing IP/data.
- AI supply‑chain integrity: detect skipped/tampered steps.
- Third‑party model marketplaces with verifiable updates.
Future work:
- FHE‑accelerated gradients (CKKS/TFHE) + ZK for update correctness.
- Differential privacy constraints in‑circuit.
- Aggregated proofs per epoch.
- Additional TEEs (SEV‑SNP / TDX / confidential containers).
License: Apache‑2.0 (suggested).
Generate Solidity verifier and deploy the on-chain root anchor:
zk-generate-evm-verifier --key-dir prover/keys --out contracts/generated --aggregated
# deploy EzklVerifier.sol, then RunAnchor.sol
RunAnchor.sol enforces monotonic counters and stores Merkle roots only when proofs verify.