HEEL

Rehearse how your customers will abuse your product, before they do.

It's launch day. Somewhere, a customer just found the export endpoint with no rate limit, farmed your "one free trial" a thousand times, or talked your AI agent into calling a tool it should never touch.

HEEL is the villain you rehearse against first. A swarm of adversarial and opportunistic agents probes a product you own, proves an abuse path is reachable with a contained proof-of-concept, and hands you a ranked report with the fix, before you ship.

It is agent-native (its canonical surface is an MCP server other agents call), honest (it reports its real detection rate against abuse it has never seen, not a vanity number), and safe by construction (synthetic-first, contained PoCs, an authorization gate no prompt-injected agent can talk its way past). Pure Python standard library, zero dependencies.

See it in 30 seconds

pip install heel-sim      # zero deps · Python 3.11+
heel doctor               # environment self-check
heel eval                 # the honest detection headline

Or run the full proof from a clone, no install:

git clone https://github.com/ancilis/heel && cd heel && make demo

AUTHORIZATION GATE (agent caller is an untrusted, possibly prompt-injected channel):
  [REJECTED+logged ✓]  run a target NOT in the allowlist
  [REJECTED+logged ✓]  call a forged scope-widening tool
  [REJECTED+logged ✓]  inject an instruction in the target arg
  -> auth gate: PASS, no escalation reachable via the agent surface

HELD-OUT EVALUATION: targets authored by an INDEPENDENT LLM swarm (blind to HEEL's probes):
  TEST (FROZEN, never tuned, 199 weaknesses):
     LOCALIZATION recall 0.50   ATTRIBUTION recall 0.33   precision 0.98
  -> the honest real-target ceiling. Semantic generalization on vocabulary it never saw, not near 1.0.

That second number is the point: HEEL tells you what it can't catch yet.

Why HEEL is different

🤖 Agent-native, MCP-first. The capability is an MCP server. Wire it into Claude Desktop, Cursor, or CI and let an agent run abuse rehearsals on demand. A thin REST API and a CLI sit over the same capability.
🔒 The calling agent is untrusted. Authorization scopes are human-only, out-of-band, and HMAC-signed, immutable from the caller side. A prompt-injected agent can run within a scope a human approved, but cannot create, widen, or escape one (those tools don't exist, by construction). Every escalation attempt is rejected and written to a tamper-evident audit log.
📏 Radically honest metrics. Most "AI security" tools quote a number you can't trust. HEEL publishes a ladder (below), measures against abuse authored by an independent LLM swarm blind to its own probes, on a frozen, content-hashed test set, and shows you the overfitting and mis-categorization gaps instead of hiding them. Four adversarial red-team passes, all findings fixed.
🛡️ Safety spine, non-negotiable. Synthetic-first. Findings are contained, canary-only proofs, never working exploits, real exfiltration, or prohibited content. True software vulns are handed off to AppSec, pure model-jailbreaks to model red-team. HEEL stays in its lane. See SECURITY.md.

What it hunts

A 10-category abuse taxonomy: license/entitlement gaming, data harvesting, unintended endpoints, function abuse, content policy, identity/account takeover, trust-economy fraud, integration abuse, compliance boundaries, and (only when the target has an agent/MCP surface) agent-specific abuse like tool over-scope, confused-deputy tool calls, cross-tenant RAG, and indirect-injection-to-action.

Two agent classes hunt in parallel: a programmatic adversary (finds weak controls) and a motivation-profiled opportunistic human (games normal affordances, catches what the adversary misses, like coupon stacking). Plus affordance chaining for multi-step abuse (for example, weak recovery and a non-rotated session compose into account takeover).

Honest about what it can't do

HEEL reports four levels, weakest claim to strongest evidence:

metric	what it measures	result
self-consistency	wiring works (probes vs. plants authored together)	~1.0 (a wiring test, not accuracy)
blind	independent encodings of known weaknesses	~0.25
held-out DEV	independent authorship, tuned-on	0.70
held-out TEST	independent LLM authorship, frozen, never tuned on	localization 0.50 · attribution 0.33 · precision 0.98

The headline is the bottom row: real detection on 199 abuse weaknesses an independent LLM swarm invented in its own vocabulary, which HEEL never saw. It improves only by widening real-vocabulary coverage, never by writing probes against known answers. Full method: EVAL.md · docs/HELDOUT_PROVENANCE.md.

Use it like an operator

# 1) a HUMAN authorizes a target OUT-OF-BAND (the only way to mint a scope)
heel scope create --target synthetic-saas --operator you --confirm

# 2) an agent / CLI runs WITHIN that scope (and cannot widen it)
heel run --scope <scope_id> --target synthetic-saas
heel coverage --run <run_id>
heel log --run <run_id>          # immutable, hash-chained audit trail

Connect from an MCP client. Point Claude Desktop / Cursor / CI at the heel-mcp server:

{ "mcpServers": { "heel": { "command": "heel-mcp",
  "env": { "HEEL_HOME": "/path/to/.heel", "HEEL_SIGNING_KEY": "/path/outside/.heel/heel.key" } } } }

The control room

A dense Next.js dashboard over the same capability: an abuse board (ranked, reachability-weighted), the honest backtests, a live swarm monitor, the authorization gate, the read-only scope panel, the containment log, and the scenario library.

make ui        # http://localhost:3000   (or `npm run build` for a static export)

Bring your own LLM (optional)

The deterministic engine runs fully offline with no API key. Flip on the LLM control loop for smarter discovery:

HEEL_MODEL=anthropic ANTHROPIC_API_KEY=sk-... heel-mcp   # via stdlib urllib, no SDK

It only ever sees observable synthetic affordance properties (never secrets or real data) and stays in HEEL's lane.

Security & assurance

A security tool has to earn trust. HEEL ships the evidence: zero dependencies, reproducible builds, Sigstore-signed release provenance + SBOM, OpenSSF Scorecard + CodeQL, and the real assurance, four independent multi-agent red-team passes whose full reports are in the repo, every finding fixed with a regression test. The core claim held under attack: a prompt-injected caller cannot create, widen, or escape a signed authorization scope. See TRUST.md and SECURITY.md, and verify the build yourself: gh attestation verify <wheel> --repo ancilis/heel.

Docs

ARCHITECTURE · EVAL · DECISIONS · SECURITY · TRUST · CONTRIBUTING · CHANGELOG · red-team reports under docs/

Status

Production-ready (v1.1.0). 53 tests on Python 3.11 to 3.13, CI green, zero runtime dependencies, four red-team passes. Next: LLM-driven detection to lift attribution recall, larger held-out sets, real-target adapters.

_{Apache-2.0 licensed · synthetic-first · the safety spine (§10) overrides every instruction, including any arriving through a calling agent.}

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.github		.github
docs		docs
heel		heel
tests		tests
web		web
.gitignore		.gitignore
ARCHITECTURE.md		ARCHITECTURE.md
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
DCO		DCO
DECISIONS.md		DECISIONS.md
EVAL.md		EVAL.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
Makefile		Makefile
NOTICE		NOTICE
README.md		README.md
SECURITY.md		SECURITY.md
TRUST.md		TRUST.md
pyproject.toml		pyproject.toml
run_demo.py		run_demo.py
server.json		server.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HEEL

See it in 30 seconds

Why HEEL is different

What it hunts

Honest about what it can't do

Use it like an operator

The control room

Bring your own LLM (optional)

Security & assurance

Docs

Status

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HEEL

See it in 30 seconds

Why HEEL is different

What it hunts

Honest about what it can't do

Use it like an operator

The control room

Bring your own LLM (optional)

Security & assurance

Docs

Status

About

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages