🤖 rocq-piler

Let rocq-piler do the heavy lifting for your proofs.

Overview

This project demonstrates how a Language Server Protocol (LSP) interface to the Rocq proof assistant (formerly Coq) enables neurosymbolic programming: the seamless integration of neural networks (LLMs) with symbolic reasoning systems (proof assistants).

What is Neurosymbolic Programming?

Neurosymbolic programming combines:

Neural/Statistical AI (LLMs): Pattern recognition, natural language understanding, heuristic search
Symbolic AI (Proof assistants): Logical reasoning, formal verification, guaranteed correctness

This approach leverages the complementary strengths of both paradigms:

Neural (LLM)	Symbolic (Rocq)
Intuition & pattern matching	Rigorous logical deduction
Generates candidate solutions	Verifies correctness
Probabilistic & approximate	Deterministic & exact
Learns from examples	Reasons from axioms
Fast but fallible	Slow but infallible

How LSP Enables Neurosymbolic Rocq Programming

The Traditional Problem

Historically, interacting with proof assistants required:

Deep knowledge of proof tactics and syntax
Understanding of internal proof state representations
Manual, tedious proof construction
Limited automation capabilities

This created a barrier between AI systems (which excel at pattern matching and generation) and proof assistants (which excel at verification).

The LSP Solution

The Language Server Protocol provides a standardized, machine-friendly interface that:

Exposes Proof State: Query current goals, hypotheses, and proof context at any position
Enables Interactive Editing: Apply tactics and immediately observe their effects
Supports Speculative Execution: Try tactics without committing to file edits (Pétanque API)
Provides Structured Feedback: Parse errors, type information, and verification results

This transforms the proof assistant into a verification oracle that an LLM can query interactively.

In this implementation, the bridge is also project-aware: it can detect Coq/Rocq project roots, reopen documents under the correct workspace, and keep rocq-lsp aligned with _CoqProject, _RocqProject, or dune-project layouts.

The Neurosymbolic Loop

┌─────────────┐
│   LLM       │  ← Reads: proof goals, context, error messages
│  (Neural)   │  → Generates: candidate tactics, proof strategies
└──────┬──────┘
       │
       ↓ (proposes tactics)
┌──────────────┐
│  LSP Server  │  ← Bridges neural and symbolic systems
│  (Interface) │  → Translates between LLM and Rocq
└──────┬───────┘
       │
       ↓ (executes & verifies)
┌──────────────┐
│    Rocq      │  ← Verifies: type-checks, proves correctness
│  (Symbolic)  │  → Returns: success/failure, updated goals
└──────────────┘

Workflow:

LLM reads proof goal via LSP (coq_open_goals)
LLM generates candidate tactic(s) based on patterns learned from training
Tactics are applied via LSP (coq_insert_tactic)
Rocq verifies the tactic is type-correct and logically sound
If successful: proof progresses (new subgoals or QED)
If failed: error message guides LLM to try alternatives
Loop continues until proof is complete

Why This Is Powerful

This architecture achieves:

Guided Generation: The LLM doesn't need to be "correct" — only creative. Rocq acts as a discriminator that filters out invalid proofs.

Incremental Verification: Each step is verified immediately, preventing cascading errors.

Explainable AI: Every LLM decision is justified by a formal proof that can be audited by humans.

Learning from Mistakes: Structured error feedback allows the LLM to refine its search strategy.

Safe Automation: The LLM can explore aggressively because Rocq guarantees soundness.

Key Capabilities Enabled by LSP

1. Goal-Directed Search

The LLM can:

Query the current proof goal: "Prove that forall n, n + 0 = n"
Understand the hypothesis context: "Given: n : nat"
Generate relevant tactics: "I'll try induction on n"

LSP Tool: coq_open_goals — returns structured goal representation

2. Speculative Execution

The LLM can:

Try multiple tactics without modifying the file
Quickly discard failures
Only commit tactics that successfully reduce the goal

LSP Tools: coq_get_state_at_pos, coq_run_tactic, coq_goals_for_state (Pétanque API)

3. Interactive Proof Refinement

The LLM can:

Incrementally build proofs tactic-by-tactic
Respond to verification failures by adjusting strategy
Learn which patterns work in different contexts

LSP Tool: coq_insert_tactic — applies tactic and returns updated goals

4. Error-Driven Learning

The LLM can:

Receive structured error messages from Rocq
Understand type mismatches, unification failures, etc.
Adjust tactics based on specific error feedback

LSP Feature: All tools return structured error information

The Hash-Driven Workflow

rocq-piler uses a content-addressed admit system: every open goal in a proof has a unique hash computed from its goal text. This turns proof construction into a clean cycle:

focus_proof → get hashes → insert_tactic(admit_hash=X) → repeat until Qed

Example: Proving `(True /\ True) /\ True` in 3 calls

1. focus_proof(file="Thm.v", name="deep_conj")
   ← 1 admit: abc123  L7: (True / True) / True

2. insert_tactic(admit_hash="abc123", tactic="split.\n- admit.\n- admit.")
   ← 2 admits remaining:
      def456  L8: True / True   (hyps: empty, goal: True / True)
      ghi789  L9: True           (hyps: empty, goal: True)

3. insert_tactic(admit_hash="def456", tactic="split; exact I.")
   insert_tactic(admit_hash="ghi789", tactic="exact I.")
   ← Qed applied

Why hashes? Every admit with the same goal text shares the same hash. Close four True admits at once — across any bullet depth — in a single call. The LLM reads hashes from focus_proof, writes tactics for them, and counts down to Qed.

Example: Dependent Type Progress Theorem in 9 calls

See examples/dep_vec.v for a full dependently-typed language (Nat + Vec n) with both preservation and progress theorems proved entirely via hash-driven MCP calls:

1. focus_proof → hash for the theorem goal
2. insert_tactic(admit_hash=X, tactic="{induction + 7 stubs}") → 7 case hashes with hyps+goal inline
3-9. insert_tactic(admit_hash=<case>, tactic="...") × 7 → Qed

Each response shows hypotheses + goal per admit, so the LLM writes each case's tactic without extra focus_proof calls.

MCP Tools

Tool	Description
`focus_proof`	One-stop shop: proof state, script, all open admits with hashes + hyps + goals
`insert_tactic`	Insert a tactic; pass `admit_hash` to target specific admits by hash; multi-line blocks supported
`try_step`	Speculatively execute a tactic without modifying the file
`open_goals`	Quick goal query with compact/Prev mode options
`proof_state`	Full proof context including name, statements, goal state
`undo_step`	Roll back the last N edit operations
`reset_proof`	Wipe a proof body to fresh `Admitted.`
`add_lemma` / `delete_lemma`	Insert/remove lemma stubs
`delete_step`	Remove the last tactic line from a proof
`edit_file`	Find-and-replace text edits with LSP re-sync
`check_file`	Force full document checking
`check_range`	Get diagnostics for a specific line range
`search_lemmas` / `inspect_term` / `inspect_about` / `locate_term` / `require_lib`	Knowledge & search
`snap_state` / `exec_tactic` / `state_goals`	Low-level Pétanque API

Current Server Capabilities

Content-addressed admits: every open goal gets a hash — same goal text = same hash, close all matching admits at once across any bullet depth
Hypotheses + goal per admit: focus_proof and insert_tactic responses include full hypothesis context for each open admit
Multi-line stub insertion: open a proof, insert N bulleted admits, and get N hashes back — all in one insert_tactic call
Re-seal with multi-goal expansion: when a tactic produces N > 1 goals, they are optionally expanded to N individually addressable bulleted admits
Auto-Qed: proof automatically closed when all tactic admits and focused goals are eliminated; guarded against premature firing when background goals remain
Dynamic workspace switching: automates restarting rocq-lsp under different project roots
Speculative execution: try tactics safely via try_step or Pétanque state APIs before committing
Bullet-aware proof navigation: focus_proof reports bullet stack depth, given-up counts, and sibling context
Safe undo/reset: tracks edit operations, supports multi-step undo

Benefits of This Approach

For AI/LLM Developers

Structured Interface: No need to parse Rocq syntax or output
Immediate Feedback: Know instantly if a tactic succeeds
Safe Exploration: Can't generate unsound proofs
Reduced Search Space: Type system eliminates invalid candidates

For Proof Engineers

AI Assistance: Let LLMs handle tedious proof details
Guaranteed Soundness: All proofs are formally verified
Interactive Refinement: Guide AI when it gets stuck
Explainable Results: Every proof step is auditable

For Researchers

Benchmark Platform: Test AI proof capabilities
Formal ML: Train models on verified code
Safe Code Generation: Generate formally verified programs
Hybrid Intelligence: Study human-AI-proof collaboration

Real-World Applications

1. Automated Theorem Proving

LLMs can tackle routine lemmas while experts focus on complex proofs.

2. Formally Verified Software

Generate code with machine-checked correctness proofs (e.g., crypto, compilers).

3. Mathematical Formalization

Convert natural language math into Rocq, assisted by LLMs.

4. Education

Interactive tutoring systems that teach both intuition (LLM) and rigor (Rocq).

5. AI Safety Research

Build verifiably safe AI systems with formal guarantees.

Getting Started

Prerequisites

# Install Rocq and rocq-lsp
opam install coq-lsp  # rocq-lsp coming soon

# Verify installation
coq-lsp --version  # or rocq-lsp --version

Installation

cd rocq-piler
npm install
npm run build

Usage with OpenCode

Add to your MCP settings:

{
  "mcpServers": {
    "rocq-piler": {
      "command": "node",
      "args": [
        "/path/to/rocq-piler/dist/index.js",
        "--workspace-root",
        "/path/to/your/rocq/project"
      ]
    }
  }
}

The server can start with one workspace root and later switch automatically when a tool call targets a file inside a different Rocq project.

Then in OpenCode:

You: "Help me prove that addition is commutative in Rocq"

OpenCode: [Uses coq_open_goals to see the goal]
OpenCode: [Uses coq_insert_tactic to try "induction n"]
OpenCode: [Iteratively builds the proof using LSP feedback]
OpenCode: "Proof complete! Here's what I did..."

Real Proof Example: Dependent Type System

See examples/dep_vec.v for a complete, working example of a simple dependently-typed language:

Language: Nat + Vec n — base type Nat, dependent Vec n of length n
Theorems proved via MCP tools:

✅ Preservation (7 cases): has_type t T → step t t' → has_type t' T — proved in 9 MCP calls
✅ Progress (7 cases): has_type t T → value t ∨ ∃ t', step t t' — proved in 9 MCP calls

Both proofs use the same pattern: focus_proof → multi-line stub → case hashes with hyps+goal → close by hash one-by-one.

A larger PCF+References benchmark is in benchmarks/complete/pcf_ref.v:

✅ 9 helper lemmas + preservation theorem (21 cases)
✅ Full heap semantics with store extension, weakening, and substitution
✅ All cases closed with Qed
✅ Total cost: 4 cents ($0.0351 actual, 50 API calls, DeepSeek V4)

Benchmarks

Benchmark	Theorem	Cases	Lemmas	API Calls	Cost
Dep Vec (`dep_vec.v`)	Preservation + Progress	7+7	0	18	—
PCF + Ref (`pcf_ref.v`)	Preservation	21	9	50	$0.04

PCF + References (Type Preservation)

A PCF language extended with mutable references (allocation, dereference, assignment), heap semantics, and store typing. The preservation theorem establishes that well-typed programs remain well-typed after a step, with the store typing possibly extended.

Auxiliary lemmas proved: extends_refl, extends_trans, nth_error_extends, store_weakening, shift_typing, subst_typing, heap_ok_extends, heap_lookup_has_type, heap_ok_update

Source: benchmarks/incomplete/pcf_ref.v (challenge) → benchmarks/complete/pcf_ref.v (solution)

Additional Examples

examples/dep_vec.v — Dependent type system (Nat + Vec n) with preservation and progress theorems, proved hash-driven
examples/test_issues.v — PCF + References (21-case preservation theorem)
examples/example.v — Simple examples for each MCP tool

The Future of Neurosymbolic Programming

This LSP-based architecture represents a paradigm shift:

From: "AI generates code, humans verify"
To: "AI and proof assistants collaborate in real-time"

From: "LLMs are black boxes"
To: "Every LLM decision has a formal proof"

From: "Formal verification is for experts"
To: "LLMs make formal methods accessible"

What's Already Possible

As demonstrated by the proofs in this repository:

✅ Hash-driven proof construction: focus_proof → get hashes → insert_tactic(admit_hash=X) → repeat until Qed
✅ Multi-case induction proofs: 7 cases closed in 9 total MCP calls with full hypothesis context per case
✅ Multi-bullet depth: same hash works across -/+/* bullet levels — close all matching admits in one call
✅ Dependent type progress + preservation: Nat + Vec n language fully verified
✅ Interactive refinement: speculative execution via try_step with immediate feedback

What's Coming

As LLMs improve and proof assistants expose richer APIs:

Automated formalization of mathematical papers
AI-assisted discovery of new theorems
Verified AI systems with formal safety guarantees
Natural language to formal proof translation
Large-scale formal verification of real-world software

Learn More

Specification: See MCP_COQ_LSP_SPEC.md
Implementation: See docs/IMPLEMENTATION.md
Quick Start: See docs/QUICKSTART.md
Project config detection: See docs/PROJECT-CONFIG-DETECTION.md
Rocq Documentation: https://rocq-prover.org/
rocq-lsp Project: https://github.com/ejgallego/coq-lsp

Contributing

This is a research prototype demonstrating neurosymbolic programming principles. Contributions welcome:

Add support for other proof assistants (Lean, Isabelle)
Improve tactic suggestion heuristics
Build benchmark suites for AI proof capabilities
Develop training datasets from verified code

License

MIT License - See LICENSE file for details

Built with: TypeScript, rocq-lsp, Model Context Protocol

Design: Content-addressed admits, hash-driven workflow, full hypothesis context per goal

Name		Name	Last commit message	Last commit date
Latest commit History 254 Commits
assets		assets
benchmarks		benchmarks
docs		docs
examples		examples
experiments		experiments
paper		paper
src		src
test_templates		test_templates
tests/integration		tests/integration
tmp		tmp
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts
vitest.integration.config.ts		vitest.integration.config.ts

Folders and files

Latest commit

History

Repository files navigation

🤖 rocq-piler

Overview

What is Neurosymbolic Programming?

How LSP Enables Neurosymbolic Rocq Programming

The Traditional Problem

The LSP Solution

The Neurosymbolic Loop

Why This Is Powerful

Key Capabilities Enabled by LSP

1. Goal-Directed Search

2. Speculative Execution

3. Interactive Proof Refinement

4. Error-Driven Learning

The Hash-Driven Workflow

Example: Proving (True /\ True) /\ True in 3 calls

Example: Dependent Type Progress Theorem in 9 calls

MCP Tools

Current Server Capabilities

Benefits of This Approach

For AI/LLM Developers

For Proof Engineers

For Researchers

Real-World Applications

1. Automated Theorem Proving

2. Formally Verified Software

3. Mathematical Formalization

4. Education

5. AI Safety Research

Getting Started

Prerequisites

Installation

Usage with OpenCode

Real Proof Example: Dependent Type System

Benchmarks

PCF + References (Type Preservation)

Additional Examples

The Future of Neurosymbolic Programming

What's Already Possible

What's Coming

Learn More

Contributing

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Example: Proving `(True /\ True) /\ True` in 3 calls

Packages