A toolkit for systematically checking the proofs in statistics / ML-theory paper appendices using coding agents (Claude Code, Codex, etc.).
Authors: Wei Ma
proofcheck-stat-paper/
    README.md       ← you are here
    guideline.md    ← principles, templates, severity system
    workflow.md     ← Pass 0-5 execution methodology
    bootstrap.md    ← prompt to auto-generate CHECK_PLAN + EXECUTION_ORDER
    scripts/        ← automation (index, crossref, scaffold)
    templates/      ← PROGRESS.md and other reusable templates
    tutorial/       ← step-by-step walkthrough (CBARA paper)
    papers/         ← put papers to check here
        your-paper/     ← one folder per paper
            audit/      ← created during check execution
Everything happens inside this directory. Open Claude Code here.
- Create a folder: papers/your-paper-name/
- Put the paper's LaTeX source into it: papers/your-paper-name/paper.tex
- Tell Claude:
Check the appendix proofs of the paper in papers/your-paper-name/paper.tex.
First read guideline.md and workflow.md to understand the methodology,
then read the paper and run bootstrap.md to generate a check plan.
Claude will:
- Generate CHECK_PLAN.md and EXECUTION_ORDER.md inside papers/your-paper-name/
- Create an audit/ directory there
- Execute the phases (indexing → foundations → lemmas → theorems → final report)
All outputs stay in papers/your-paper-name/. The toolkit files at the root are never modified.
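The setup steps amount to creating one folder per paper and copying the LaTeX source into it as paper.tex. A minimal Python sketch, assuming it is run with the toolkit root as `root`; the helper names `add_paper` and `missing_outputs` are hypothetical and are not taken from the toolkit's scripts/ directory:

```python
from pathlib import Path
import shutil


def add_paper(root: Path, name: str, tex_source: Path) -> Path:
    """Create papers/<name>/ under root and copy the LaTeX source in as paper.tex."""
    paper_dir = root / "papers" / name
    paper_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(tex_source, paper_dir / "paper.tex")
    return paper_dir


def missing_outputs(paper_dir: Path) -> list[str]:
    """List the expected check outputs that are not yet present in paper_dir."""
    # Names follow the README: two plan files plus the audit/ directory.
    expected = ["CHECK_PLAN.md", "EXECUTION_ORDER.md", "audit"]
    return [item for item in expected if not (paper_dir / item).exists()]
```

Calling `missing_outputs` after a run is one way to confirm that the plan files and the audit/ directory landed inside the paper's own folder rather than at the toolkit root.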
Tell Claude: "The paper in papers/your-paper-name/ has been updated. Re-run the checks."
guideline.md: read this first. Paper-agnostic. Defines:
- Core objective, operating principles
- Severity system: S0 (fatal) through S3 (minor)
- Proof-unit templates, 19 common failure patterns
- Agent prompt templates for every stage
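To make the severity scale concrete, here is a small Python sketch of how findings might be recorded and triaged. Only the S0 (fatal) and S3 (minor) labels come from this README; the `Finding` record, its field names, and the helper functions are illustrative assumptions, and the meanings of S1 and S2 are left to guideline.md.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels; a lower number means a more severe issue."""
    S0 = 0  # fatal
    S1 = 1  # intermediate; see guideline.md for the definition
    S2 = 2  # intermediate; see guideline.md for the definition
    S3 = 3  # minor


@dataclass(frozen=True)
class Finding:
    """One issue found in a proof unit (hypothetical record layout)."""
    unit: str           # label of the lemma or theorem being checked
    severity: Severity
    note: str


def triage(findings: list[Finding]) -> list[Finding]:
    """Order findings so the most severe (S0) come first."""
    return sorted(findings, key=lambda f: f.severity)


def has_fatal(findings: list[Finding]) -> bool:
    """True if any finding is S0."""
    return any(f.severity is Severity.S0 for f in findings)
```

Using an `IntEnum` keeps the levels directly comparable, so sorting and thresholding need no lookup table.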
workflow.md: the bridge from principles to execution. Covers:
- How to extract proof architecture from a paper
- How to build dependency-ordered execution plans
- Workspace setup, phase-by-phase execution
- Paper-type-specific adaptations (asymptotic, concentration, optimization, Markov chain, M-estimation)
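A dependency-ordered execution plan is, in essence, a topological sort of the proof units: each lemma is checked before anything that relies on it. The sketch below uses Python's standard `graphlib` module; the function name `execution_order` and the unit labels in the usage note are hypothetical, and the workflow in workflow.md may order units differently.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return proof units ordered so each unit comes after everything it relies on.

    depends_on maps a unit label to the set of labels its proof uses.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        # A cycle means two results rely on each other, which is itself
        # worth reporting. err.args[1] holds the nodes forming the cycle.
        raise ValueError(
            f"circular dependency among proof units: {err.args[1]}"
        ) from err
```

For example, with `{"Theorem 1": {"Lemma A.1", "Lemma A.2"}, "Lemma A.2": {"Lemma A.1"}, "Lemma A.1": set()}` the only valid order is Lemma A.1, then Lemma A.2, then Theorem 1.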
bootstrap.md: one prompt that produces drafts of CHECK_PLAN.md and EXECUTION_ORDER.md.
- Mode A: fully automatic (best when the paper has a clear proof-strategy section)
- Mode B: you provide a knowledge block with proof strategy insights
tutorial/: a step-by-step walkthrough using the CBARA paper. It shows what every file looks like at every stage. It is not a paper to check, but a reference for how the toolkit works.
- Claude Code (or similar coding agent with file read/write)
- LaTeX source of the paper
- Git (for version-tracking toolkit changes)
Tested with Claude Code using deepseek-v4-pro at maximum effort. Results may vary with other models or effort settings.