Maestro is a local-first, long-running agent harness for multi-agent software engineering. It gives you one CLI and one on-disk state model for missions, features, assertions, handoff launches, checkpoints, memory, and project context so separate agent sessions can collaborate without a server, database, or background daemon.
In day-to-day use it acts as a conductor: a human operator drives multiple terminals while Maestro keeps the shared state disciplined and inspectable. See docs/harness-positioning.md for the principle-to-primitive mapping.
- Shared state lives on disk in .maestro/, not in chat history.
- Missions break work into milestones, features, and validation assertions.
- Native handoff launches build a self-contained markdown brief and start a fresh Codex, Claude, or Hermes run from the current repo state.
- Memory commands turn corrections and learnings into reusable guidance.
- Mission Control gives you a read-only TUI and JSON snapshots of current state.
- The runtime stays local-first: filesystem, git, config, and terminal tools.
Maestro is the shared state layer in the middle. The operator and fresh agent runs both go through the CLI, the CLI persists shared state locally, and Mission Control projects that same state without mutating it.
- It is not a hosted orchestration service or remote agent platform.
- It is not tied to a single model vendor or harness.
- It does not require a database, queue, or network API to work.
The human operator is the bridge between terminals. Maestro is the shared state layer underneath that workflow.
Maestro has two related but separate operating modes:
| Use missions when you need... | Use tasks when you need... |
|---|---|
| A planned multi-step effort with milestones, assertions, checkpoints, and handoff launches. | A lightweight blocker graph for the daily queue. |
| A durable brief for a fresh agent run. | Fast ready, claim, update, and blocker management. |
| Reviewable artifacts under `.maestro/missions/<mission-id>/`. | Repo-tracked task state under `.maestro/tasks/tasks.jsonl`. |
If you only remember one distinction: mission is for planned execution; task is for the day-to-day queue.
| Concept | Purpose |
|---|---|
| Mission | The top-level unit of work with a lifecycle such as draft, approved, or executing. |
| Milestone | A phase within a mission. Milestones can act as work phases or validation gates. |
| Feature | A concrete piece of work assigned to an agent type, with verification steps and optional dependencies. |
| Assertion | A validation target tied to a feature. Assertions are updated to passed, failed, blocked, or waived. |
| Handoff | A persisted launch record plus markdown brief for starting a fresh Codex, Claude, or Hermes session from current mission or repo context. |
| Task | A Claude-style blocker-graph work item for the daily loop; lives at .maestro/tasks/tasks.jsonl independent of missions. |
| Reply | An agent's structured outcome record for a feature, optionally gated by behavioral principles. |
| Principle | A behavioral rule injected into agent prompts and scored against replies. Stored at .maestro/principles.jsonl. |
| Memory | Corrections, learnings, and compiled guidance that feed back into future agent prompts. |
| Checkpoint | A timestamped mission snapshot you can save and later restore. |
| Bundle | A portable .mission.tar.gz archive of a mission plus its artifacts for review or transfer. |
| Mission Control | A read-only dashboard for previewing mission state interactively or as JSON. |
Mission Control gives you a read-only terminal dashboard over the current Maestro state.
The loop is deliberately simple: define work, optionally inspect the feature brief, launch a fresh handoff, update progress, validate the outcome, and checkpoint before sealing the milestone.
- Bun
- Git
- A local agent harness in another terminal, such as Codex, Claude Code, or Hermes
Install the latest published Maestro binary:
curl -fsSL https://raw.githubusercontent.com/ReinaMacCredy/maestro/main/scripts/install.sh | bash
Install a specific published release (the variable is set on bash, the side of the pipe that actually runs the installer):
curl -fsSL https://raw.githubusercontent.com/ReinaMacCredy/maestro/main/scripts/install.sh | MAESTRO_VERSION=<version> bash
After installation, refresh to the latest published release with:
maestro update
To build from source instead:
bun install
bun run build
This produces the compiled binary at ./dist/maestro.
bun run release:local
command -v maestro
maestro --version
If you also want to initialize global config and inject supported agent instruction blocks:
maestro install
This syncs bundled Maestro skills into Codex, Claude Code, Hermes, and the shared AgentSkills root when those targets are available. See Provider Registry and Skills for roots, diagnostics, and external skill installs.
./dist/maestro is the fresh repo build. maestro on your PATH is the installed local binary.
maestro init
This creates the local .maestro/ workspace for the current repository, seeds default policy files, and writes .maestro/MAESTRO.md, a read-order compass that points fresh agents at the right files (this compass → root AGENTS.md → .maestro/tasks/NOW.md → maestro status --json → policies → specs).
mission create expects a JSON plan file. A minimal example:
{
"title": "Add authentication",
"description": "Ship the first authentication slice",
"milestones": [
{
"id": "plan",
"title": "Planning",
"description": "Define the implementation approach",
"order": 0,
"kind": "work",
"profile": "planning"
},
{
"id": "implement",
"title": "Implementation",
"description": "Build and verify the feature",
"order": 1,
"kind": "work",
"profile": "implementation"
}
],
"features": [
{
"id": "auth-plan",
"milestoneId": "plan",
"title": "Plan the auth flow",
"description": "Define the login shape, risks, and acceptance criteria",
"agentType": "codex-cli",
"verificationSteps": [
"Review the proposed flow with the team"
]
},
{
"id": "auth-impl",
"milestoneId": "implement",
"title": "Implement the auth flow",
"description": "Build the first working authentication slice",
"agentType": "codex-cli",
"dependsOn": [
"auth-plan"
],
"verificationSteps": [
"Run build",
"Run targeted tests",
"Verify the login flow manually"
],
"fulfills": [
"auth-login-works"
]
}
]
}
maestro mission create --file plan.json
maestro mission list
maestro mission approve <mission-id>
maestro feature list --mission <mission-id>
maestro feature prompt <feature-id> --mission <mission-id> --out agent-prompt.md
This writes the prompt to agent-prompt.md and also stores it under .maestro/missions/<mission-id>/agents/<feature-id>/prompt.md. maestro handoff does not require this step, but it is useful when you want to inspect the current feature context before launching a fresh agent run.
Launch a fresh Codex run for the next implementation slice:
maestro handoff \
"Implement auth-impl for mission <mission-id> and run the listed verification steps before stopping" \
--agent codex
Launches are detached by default. Maestro persists the handoff under ~/.maestro/handoff/<id>/ (a single global store) and returns the launch record with the prompt path, log path, target directory, linked task id, and agent details.
Useful variants:
maestro handoff "Review auth-impl before merge" --agent claude --worktree auth-review
maestro handoff "Finish auth-impl and wait for the result" --agent codex --wait --json
maestro handoff "Run a Hermes pass over auth-impl" --agent hermes --wait --json
maestro handoff pickup --id <handoff-id> --json
maestro feature update auth-impl --mission <mission-id> --status in-progress
maestro feature update auth-impl --mission <mission-id> --status review
maestro validate show --mission <mission-id>
maestro validate update auth-login-works --mission <mission-id> --result passed --evidence "bun test"
maestro checkpoint save --mission <mission-id>
maestro milestone seal implement --mission <mission-id>
maestro handoff "<task>" builds a self-contained markdown brief from the current repo state plus the linked task continuation, then launches a fresh Codex, Claude, or Hermes run. Every launch is persisted under ~/.maestro/handoff/<id>/ (a single global store) so the operator can inspect exactly what was sent and what the child process printed. Handoffs created in one working directory are visible from any other. Prompt-only packets can be picked up from any working directory, but task-linked packets must be picked up from their source project unless you explicitly pass maestro handoff pickup --standalone.
A launch record always includes:
- `id`, `agent`, and `model`
- `status`: `launching`, `launched`, `completed`, or `failed`
- `targetDir`: the directory handed to the external agent
- `promptPath`, `outputPath`, and the exact launched `command`
- task and takeover metadata such as `refs.taskId`, `createdByAgent`, `pickedUpByAgent`, and `consumedAt`
- optional `worktree` metadata when `--worktree` is used
- optional `pid` and `exitCode`, depending on detached vs `--wait` mode
The prompt itself is stored separately as markdown. Maestro always renders the same sections:
- Task
- Context
- Relevant Files
- Current State
- What Was Tried
- Decisions
- Acceptance Criteria
- Constraints
- If exactly one actionable feature exists in the active mission, Maestro anchors the brief to that mission, milestone, feature, assertions, agent prompt or report artifacts, and the current git state.
- Otherwise it falls back to repository context: current branch, recent commits, and changed files.
- If the handoff is linked to an active task continuation, Maestro injects the saved `currentState`, `nextAction`, active decisions, and recent local timeline into the prompt before launch.
- When `--worktree` is used, Maestro creates the sibling worktree first and appends that worktree path and branch information to the `Constraints` section.
- Default mode returns as soon as the child process is started and records status `launched`.
- `--wait` blocks until the agent exits and records `completed` or `failed`.
- `--json` prints the persisted launch record for automation or debugging.
- `handoff pickup` atomically consumes a packet on first pickup. Prompt-only packets can be consumed anywhere; task-linked packets resume their linked task only when pickup happens from the source project, unless `--standalone` is passed.
Default Codex launch:
maestro handoff \
"Implement auth-impl for mission <mission-id> and verify the touched surface area" \
--agent codex
Claude launch in a sibling worktree:
maestro handoff \
"Review auth-impl for regressions and missing tests" \
--agent claude \
--worktree auth-review
Foreground automation-friendly launch:
maestro handoff \
"Finish auth-impl and return only after the tests pass" \
--agent codex \
--wait \
--json
Hermes launch:
maestro handoff \
"Run a focused Hermes implementation pass" \
--agent hermes \
--wait \
--json
Use --model to override the agent default (gpt-5.4 for Codex, opus for Claude; Hermes only receives --model when explicitly provided), --name to label the launch, and --base when you need a specific base branch for a worktree handoff. Use maestro handoff pickup to consume a packet, and pass --standalone when you intentionally want only the prompt without resuming a linked task.
Tasks are Maestro's lightweight, mutable issue graph for the daily queue. A task answers "what do I do next?"; a mission answers "what are we building?" Tasks live in .maestro/tasks/tasks.jsonl, are repo-tracked, and review like regular diffs.
flowchart LR
pending -->|claim| in_progress
in_progress -->|unclaim| pending
in_progress -->|complete| completed
completed -.->|reopen| pending
- `pending` tasks sit in the queue.
- `in_progress` tasks are claimed by exactly one session.
- `completed` tasks are locked; edits or re-runs require `task reopen`, which restores the task and its continuation summary.
- Legacy statuses (`open`, `blocked`, `deferred`, `closed`) still parse from older state files and collapse to `pending` or `completed` on read.
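The lifecycle above can be read as a small transition table. The sketch below is illustrative only: `LifecycleStatus`, `LifecycleAction`, and `applyAction` are hypothetical names, not Maestro's internal API, and only the four transitions from the diagram are modeled (legacy statuses are left out).

```typescript
// Hypothetical model of the task lifecycle diagram; not Maestro's actual code.
type LifecycleStatus = "pending" | "in_progress" | "completed";
type LifecycleAction = "claim" | "unclaim" | "complete" | "reopen";

// Each status lists only the actions the diagram allows from it.
const TRANSITIONS: Record<
  LifecycleStatus,
  Partial<Record<LifecycleAction, LifecycleStatus>>
> = {
  pending: { claim: "in_progress" },
  in_progress: { unclaim: "pending", complete: "completed" },
  completed: { reopen: "pending" },
};

// Returns the next status, or throws when the diagram has no such edge.
function applyAction(
  status: LifecycleStatus,
  action: LifecycleAction,
): LifecycleStatus {
  const next = TRANSITIONS[status][action];
  if (next === undefined) {
    throw new Error(`cannot ${action} a ${status} task`);
  }
  return next;
}
```

In this model a `completed` task rejects every action except `reopen`, which mirrors the "locked" wording above.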
Every task carries a type (task, bug, feature, epic, chore), a priority (P0-P4, default P2), freeform labels, optional parentId, ownership metadata (assignee, claimedAt, lastActivityAt), optional contractId, and an optional receipt (summary, surprise, verifiedBy) captured at completion.
Blocking is symmetric and stored on both sides. Each task has a blockedBy list of prerequisites and a blocks list of dependents. Declaring that A blocks B, C atomically updates all three tasks.
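As a rough illustration of the symmetric bookkeeping, the sketch below updates both sides of each edge on a copy of the store and returns the copy only if every id resolves. `BlockNode` and `declareBlocks` are hypothetical names, and copy-then-return merely stands in for whatever atomic write Maestro actually performs.

```typescript
// Hypothetical in-memory model of symmetric blocking; not Maestro's storage code.
interface BlockNode {
  id: string;
  blocks: string[];    // dependents this task blocks
  blockedBy: string[]; // prerequisites of this task
}

function declareBlocks(
  store: Record<string, BlockNode>,
  blockerId: string,
  blockedIds: string[],
): Record<string, BlockNode> {
  // Work on a copy so a failure leaves the original store untouched.
  const next: Record<string, BlockNode> = {};
  for (const id of Object.keys(store)) {
    const node = store[id];
    if (node !== undefined) {
      next[id] = {
        id: node.id,
        blocks: node.blocks.slice(),
        blockedBy: node.blockedBy.slice(),
      };
    }
  }
  const blocker = next[blockerId];
  if (blocker === undefined) throw new Error(`unknown task: ${blockerId}`);
  for (const blockedId of blockedIds) {
    if (blockedId === blockerId) throw new Error("a task cannot block itself");
    const blocked = next[blockedId];
    if (blocked === undefined) throw new Error(`unknown task: ${blockedId}`);
    // Record the edge on both sides, skipping duplicates.
    if (blocker.blocks.indexOf(blockedId) === -1) blocker.blocks.push(blockedId);
    if (blocked.blockedBy.indexOf(blockerId) === -1) blocked.blockedBy.push(blockerId);
  }
  return next;
}
```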
maestro task block <id> <blockedTaskIds...>
maestro task unblock <id> <blockedTaskIds...>
maestro task create "..." --blocked-by <ids>
Rules enforced by the domain layer:
- A task is ready only when every entry in its `blockedBy` is `completed` (or missing from the store). `task ready` returns exactly the pending, unblocked, unassigned set, ranked `P0`/`P1` first and then by creation time.
- Status moves into `in_progress` or `completed` fail with a blocker error when any prerequisite is still open.
- The retired `task deps add|remove` verbs now error and point to `task block` / `task unblock`.
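A minimal sketch of the readiness rule and the `task ready` ordering, under stated assumptions: `QueueTask`, `isReady`, and `readyQueue` are hypothetical names, priorities are stored as numbers 0 to 4, and "P0/P1 first" is read as one urgent tier ordered by creation time, followed by everything else ordered by creation time.

```typescript
// Hypothetical shapes; the real task record carries more fields.
type QueueStatus = "pending" | "in_progress" | "completed";

interface QueueTask {
  id: string;
  status: QueueStatus;
  priority: number;   // 0 (P0) through 4 (P4)
  assignee?: string;
  blockedBy: string[];
  createdAt: string;  // ISO-8601 timestamps sort lexicographically
}

// Ready when every prerequisite is completed or absent from the store.
function isReady(
  task: QueueTask,
  byId: Record<string, QueueTask | undefined>,
): boolean {
  return task.blockedBy.every((id) => {
    const blocker = byId[id];
    return blocker === undefined || blocker.status === "completed";
  });
}

// Pending, unassigned, unblocked tasks: urgent tier first, then oldest first.
function readyQueue(tasks: QueueTask[]): QueueTask[] {
  const byId: Record<string, QueueTask | undefined> = {};
  for (const t of tasks) byId[t.id] = t;
  return tasks
    .filter((t) => t.status === "pending" && t.assignee === undefined && isReady(t, byId))
    .sort((a, b) => {
      const tierA = a.priority <= 1 ? 0 : 1;
      const tierB = b.priority <= 1 ? 0 : 1;
      if (tierA !== tierB) return tierA - tierB;
      if (a.createdAt < b.createdAt) return -1;
      return a.createdAt > b.createdAt ? 1 : 0;
    });
}
```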
| Command | Returns |
|---|---|
| `maestro task status` | Hybrid board: compact active/ready/blocked lists plus expanded dependency tracks. |
| `maestro task ready` | Pending, unblocked, unassigned tasks, `P0`/`P1` first. |
| `maestro task mine` | Tasks claimed by the active session. |
| `maestro task stuck` | `in_progress` tasks idle past `--older-than` (default `4h`). |
| `maestro task similar <id>` | Tasks that look alike by title, completion reason, receipt text, and linked contract text. |
| `maestro task list` | Full filter set: `--status`, `--priority`, `--type`, `--label`, `--parent`, `--assignee`, `--limit`. Add `--tracks` for headline-only output. |
maestro task status renders a hybrid operator board. Simple one-task tracks render as compact rows under ACTIVE, READY, or BLOCKED. Multi-step tracks expand only when dependency structure matters: blocked steps or ready steps that unlock downstream work. If a ready task unlocks blocked downstream work, a one-line next: hint appears under the header.
$ maestro task status
tasks: 42 open | 13 active | 5 ready | 16 blocked | 5 blocked tracks
next: epic/desktop-path-native-ghostty / Phase 0: verify env and stock paseo desktop dev flow (9 unblocks)
ACTIVE
o chore/update-global-agents-md Update global AGENTS.md guidance
o chore/check-maestro-task-dependency Check maestro task dependency support against beads-rust
o chore/map-full-parallel-agent Map full parallel-agent safety picture in maestro
o fix/fix-all-reviewed-regressions Fix all reviewed regressions
+ 9 more
DEPENDENCY TRACKS
epic/desktop-path-native-ghostty
· Phase 0: verify env and stock paseo desktop dev flow
ready, 9 unblocks
! Phase 1: build pinned GhosttyKit xcframework
blocked by Phase 0: verify env and stock paseo desktop dev flow
! Phase 2: scaffold ghostty-bridge N-API addon
blocked by Phase 1: build pinned GhosttyKit xcframework
+ 7 more
epic/test-batch-non-code
· Test batch: inspect ready queue
ready, 2 unblocks
· Test batch: inventory pending handoffs
ready, 2 unblocks
! Test batch: draft cleanup notes
blocked by Test batch: inspect ready queue, Test batch: inventory pending handoffs
! Test batch: close out temporary test set
blocked by Test batch: draft cleanup notes
READY
· implement/review-last-four-commits Review last four commits from 44d5f670 to e55384bf
· implement/implement-continuation-layer Implement continuation layer
BLOCKED
! implement/investigate-regression blocked by implement/scope-diff-history-feat
! implement/inspect-changed-files-efficiency blocked by implement/load-review-instructions-diff
! implement/report-concise-findings-2 blocked by implement/inspect-changed-files-efficiency
Flags:
| Flag | Effect |
|---|---|
| `--all` | Include completed tasks (rendered with the `v` glyph). |
| `--track <slug-or-id>` | Restrict output to one track. |
| `--json` | Emit a structured projection (`{ header, tracks[], orphans[], tasksById }`) for tooling. `header` includes `open`, `active`, `ready`, `pending`, `blocked`, and `blockedTracks`. |
Render shape. The default view keeps solo/non-dependent work compact and expands only dependency tracks. ACTIVE shows at most four rows before + N more; READY, BLOCKED, and dependency tracks are capped separately. Blocked rows render blocked by <slug-or-id> inline, while blocked steps inside dependency tracks render the blocker on the next line. Completed blockers are marked (done).
Color is auto-detected: NO_COLOR=1 or a non-TTY pipe disables ANSI codes. Tracks (top-level tasks) carry a slug like implement/<kebab> (verbs: implement | fix | chore | spike | epic) which doubles as a human-friendly id — task show implement/foo and task update implement/foo --status ... work the same way tsk-XXX does.
Tasks created before slugs landed have no slug and render with their bare tsk-<id> as the header. To bulk-derive slugs from titles:
maestro task backfill-slugs # dry-run; prints what would change
maestro task backfill-slugs --apply # write the slugs
maestro task backfill-slugs --apply --limit 20
maestro task backfill-slugs --rederive --apply # refresh auto-derived slugs after the algorithm changes
Derivation drops English stop-words ("and", "of", "in", ...), drops pure-hex tokens (commit shas) and digit-only tokens, caps at 4 significant words, and never truncates mid-word, so the result is short and scannable.
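The derivation rules can be approximated in a few lines. This is a sketch, not Maestro's algorithm: `deriveSlugWords` is a hypothetical helper, the stop-word list beyond "and", "of", and "in" is assumed, and a token counts as a commit sha only when it is 7 or more hex characters (an assumption that keeps short words such as "add" from being dropped). The verb prefix (`implement/`, `fix/`, ...) is not derived here.

```typescript
// Assumed stop-word list; only "and", "of", "in" are named in the text above.
const STOP_WORDS = ["and", "of", "in", "the", "a", "an", "to", "from", "for", "on", "with"];

// Hypothetical helper: returns the kebab-case word part of a slug.
function deriveSlugWords(title: string, maxWords: number = 4): string {
  const tokens = title
    .toLowerCase()
    .split(/[^a-z0-9]+/)          // split on anything that is not a letter or digit
    .filter((t) => t.length > 0);
  const significant = tokens.filter(
    (t) =>
      STOP_WORDS.indexOf(t) === -1 && // drop stop-words
      !/^[0-9]+$/.test(t) &&          // drop digit-only tokens
      !/^[0-9a-f]{7,}$/.test(t),      // drop sha-like hex tokens (assumed length >= 7)
  );
  // Whole words only, so nothing is ever cut mid-word.
  return significant.slice(0, maxWords).join("-");
}
```

Applied to the title "Review last four commits from 44d5f670 to e55384bf", this sketch yields `review-last-four-commits`, the same word part as the slug in the sample board output.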
Backfill is display-only and bypasses the completion + ownership locks (slugs don't affect runtime state), so it works on completed and currently-claimed tasks. By default it refuses to overwrite an existing slug; --rederive opts in to overwriting (use it when the derivation algorithm has changed).
Claiming is exclusive and session-scoped. Session IDs come from the sessionDetection config (Claude Code out of the box) or --session <id> when scripting.
maestro task claim <id>
maestro task claim <id> --busy-check # refuse if this session already owns open work
maestro task claim <id> --force # steal from another session
maestro task claim <id> --stale-after 4h # auto-release a dead owner's stale claim
maestro task unclaim <id> # in_progress demotes to pending
maestro task release-owned <sessionId> # release everything a session held
maestro task heartbeat <id> # bump lastActivityAt without other edits
task update <id> --status in_progress auto-claims an unassigned task for the current session, provided the session has no other open work (or --force is passed). This preserves the invariant that a session owns at most one in-flight task at a time.
Agents can stage a whole queue upfront from one JSON file. References between tasks use a batch-local name slot that resolves to real ids inside a single atomic write.
maestro task plan --file plan.json
maestro task plan --file - < plan.json
maestro task plan --file plan.json --start scaffold # auto-claim the named task
maestro task plan --file plan.json --dry-run # validate without writing
{
"batchId": "auth-slice",
"tasks": [
{ "name": "scaffold", "title": "Scaffold auth module", "type": "chore", "priority": 2 },
{ "name": "tests", "title": "Add login tests", "blockedBy": ["scaffold"] },
{ "title": "Wire login route", "blockedBy": ["scaffold", "tests"], "labels": ["auth"] }
]
}
Every task has a durable, on-disk continuation record that tells the next agent where work stands. It is the source of truth for resume across sessions, across agents, and across context compaction. Standalone handoff packets are the transfer artifact; the continuation is the state.
Two files back each task:
- `.maestro/tasks/continuations/active/<taskId>.json` -- live summary. Moves to `completed/<taskId>.json` at `task update --status completed` and returns to `active/` on `task reopen`.
- `.maestro/tasks/local-history/<taskId>.jsonl` -- append-only event log (per-machine).
Summary fields: currentState, nextAction, keyDecisions, activeAgent, lastActiveAt. Event kinds: snapshot, decision, next_action_set, blocker_set, handoff_created, handoff_picked_up, agent_takeover, task_completed, task_reopened.
- Same session, chat intent. Maestro installs Claude Code hooks that hydrate the active continuation into the agent's context with no CLI call:
  - `SessionStart` injects a short pointer when an active task exists: id, title, status, last-active timestamp, and a nudge to say `continue` or `resume`.
  - `UserPromptSubmit` watches for these exact phrases (case- and punctuation-insensitive) and expands them into the full resume payload (current state, next action, active decisions, recent timeline) before the model sees the prompt:
    - `continue`
    - `continue work`
    - `resume`
    - `resume work`
    - `pick up where we left off`
    - `resume where we left off`
    - `resume from where we left off`
  - `PreCompact` preserves the continuation in the compacted summary so resume survives a context reset.

  These are plain chat intents, not Maestro CLI commands.
- Different agent, handoff pickup. `maestro handoff pickup [--id <handoffId>]` consumes one open packet atomically. Prompt-only packets can be picked up from any working directory. Task-linked packets are project-anchored: from the source project, pickup force-claims the linked task for the current session, moves it to `in_progress`, transfers any contract ownership, rewrites the continuation summary with a `Resumed from handoff ...` prefix, and records `agent_takeover` + `handoff_picked_up` events. From another project, Maestro errors with the source path and a concrete `cd ... && maestro handoff pickup ...` command. Pass `--standalone` to intentionally consume the packet without resuming the linked task.
- Manual inspection. `maestro task show <id>` prints the raw task and continuation state for offline review.
maestro task update <id> \
--current-state "Tests pass locally; rebased on main" \
--next-action "Open PR and request review" \
--add-decision "Use bcrypt over argon2 for parity with legacy" \
--remove-decision "Use JWTs in localStorage"
Refresh when current state or next action changes, when a load-bearing decision or constraint changes, or when blockers appear or clear.
A contract is a machine-checked agreement attached to a task: what to touch, what to avoid, and what "done" means. At completion, Maestro diffs claimedAtCommit..HEAD and renders a verdict.
Lifecycle: draft -> locked or amended -> fulfilled or broken, with discarded as an early-exit from draft. A closed contract can be reopened alongside its task.
maestro task contract new <taskId> --editor "$EDITOR" # or --from template.yaml
maestro task contract edit <ref>
maestro task contract lock <ref> # freeze scope + claim commit
maestro task contract amend <ref> # record a post-lock change
maestro task contract show <ref>
maestro task contract list
maestro task contract verdict <ref> # preview without closing
maestro task contract discard <ref> # draft only
maestro task contract reopen <ref> # after fulfilled/broken
maestro task contract criteria mark <ref> <criterionId> --evidence "bun test"
maestro task contract criteria add <ref> "New criterion text"
maestro task contract criteria remove <ref> <criterionId>
A contract records:
- `intent` -- one-sentence goal.
- `scope` -- `filesExpected`, `filesForbidden`, optional `maxFilesTouched` cap.
- `doneWhen[]` -- explicit criteria, each `manual` or `receipt-hint`, each markable with evidence.
- `claimedAtCommit` -- git HEAD captured at lock; the verdict diffs against it.
- `configSnapshot` -- strictness, overlap policy, anchor-rebase fallback, and stale-reclaim policy in effect at lock time.
- `ownershipHistory` -- transfers from `claim --force` reclaims and `handoff pickup`.
Completion gating: task update --status completed against a task with a locked contract closes the contract, renders a verdict, and fails completion when the verdict is broken and either contracts.strict=true is set or --strict is passed. Use --no-contract to complete without a contract when contracts.default=required.
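The gating rule in the paragraph above reduces to one boolean expression. A minimal sketch, assuming the closed contract's verdict is either `fulfilled` or `broken`; `completionBlocked` and its parameter names are hypothetical:

```typescript
// Hypothetical helper mirroring the completion-gating sentence above.
type ContractVerdict = "fulfilled" | "broken";

function completionBlocked(
  verdict: ContractVerdict,
  configStrict: boolean, // contracts.strict from .maestro/config.yaml
  strictFlag: boolean,   // whether --strict was passed on this invocation
): boolean {
  // Only a broken verdict can block, and only when strictness is on somewhere.
  return verdict === "broken" && (configStrict || strictFlag);
}
```

With `contracts.strict: false` and no `--strict`, a broken verdict is still rendered but does not stop the task from completing.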
Relevant config (.maestro/config.yaml):
contracts:
default: prompt # required | prompt | optional
strict: false # block completion on broken verdict
overlapPolicy: fail # fail | annotate (active contract scope overlap)
rebaseFallback: best-effort # best-effort | fail (when claimedAtCommit is missing)
defaultMaxFilesTouched: ~ # integer cap or unset
staleReclaimContractPolicy: inherit # inherit | block (when taking over a stale claim)
.maestro/tasks/
├── tasks.jsonl # authoritative task graph (repo-tracked)
├── contracts/ # per-task locked contracts and verdicts (repo-tracked)
├── contract-templates/ # reusable YAML drafts for `contract new --from`
├── continuations/ # per-task resume summaries + event logs
├── batches/ # batch plan manifests
├── candidates/ # captured work candidates awaiting promotion
└── local-history/ # per-machine audit log (ignored)
tasks.jsonl, contracts/, and principles.jsonl are intentionally repo-tracked so the queue and its policies review like any other code change. Local histories and candidate piles stay per-machine. Bound their growth with:
maestro task prune # keep the most recent 500 entries per kind
maestro task prune --keep 100 --candidates-only
maestro task prune --continuations-only --dry-run
maestro task prune --all # purge both piles
Maestro has a lightweight logbook for recording verifiable outputs tied to a task. Use it to document commands that ran, their exit codes, and optional manual notes, before or after completing work.
Evidence rows are stored under .maestro/evidence/ (gitignored, per-machine) and stamped with a WitnessLevel that captures how trustworthy the claim is: witnessed-by-maestro for Maestro-invoked commands, agent-claimed-locally for evidence the agent self-reported, and agent-claimed-and-not-reproducible for manual notes.
# Record a command run
maestro evidence record --task tsk-aaaaaa --command "bun test" --exit 0
# Record with duration and optional log path
maestro evidence record --task tsk-aaaaaa --command "bun run build" --exit 0 --duration 12345 --log ./build.log
# Record a manual note
maestro evidence record --task tsk-aaaaaa --kind manual-note --note "Verified UI on staging"
# List evidence for a task
maestro evidence list --task tsk-aaaaaa
# Show one evidence row
maestro evidence show evd-xxxxxx
Evidence rows are linked to a task id and optionally to a contract criterion via --criterion <id>. Run maestro evidence record --help for the full flag set.
Maestro's trust substrate is a stack of opt-in layers that turn agent claims into deterministic, auditable, gated decisions. Each layer is independently useful; together they compose. Contracts narrow the scope of work, the Trust Verifier checks the diff against that scope, the Verdict gates completion on witnessed evidence, CI makes the verdict authoritative, and the optional layers above (auto-merge, deploy safety, cross-task conflict) extend the same primitives outward.
The sections below cover each layer in turn. They are presented in the order a team typically adopts them, but every layer past contracts is opt-in and can be enabled independently.
This is the foundation. A contract pins down what a task is allowed to touch; the Trust Verifier checks the diff against that contract. Three behaviors define this layer:
- Plan proposes a contract. During `maestro-plan`, the plan must include a `proposed_contract` with `allowed_files`, `forbidden_paths`, `done_when` criteria, and an `amendment_budget`. Plan-time proposals are not amendments; they seed the contract that gets locked when the agent claims the task.
- Agent works within scope; amends on genuine discovery. When work uncovers a file that lies outside the locked contract scope, the agent must amend before touching it: `maestro contract amend --task <id> --add-path src/new-file.ts --reason "discovered at runtime"`. Each amendment writes a new versioned contract snapshot and a `contract-amended` Evidence row. The budget defaults are `max_amendments: 3`, `max_paths_per_amendment: 5`. Amendments are versioned Evidence and never silent edits.
- Agent verifies before completing. `maestro task verify` runs the Trust Verifier against the current diff and the locked contract: `maestro task verify --task <id>`. The verifier runs 6 checks in parallel: scope adherence, lockfile parity, generated-file parity, sensitive-path policy, commit metadata, and secrets-in-diff. Findings are printed with severity (`info`, `warn`, `error`). Exit codes: `0` when no `error` findings, `1` when at least one `error` finding, `2` when the task has no locked contract (warn; use `maestro task contract new` to create one).
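The exit-code convention for `maestro task verify` can be sketched as a pure function over the findings. `VerifierFinding` and `verifyExitCode` are hypothetical names; the real finding shape is not documented here.

```typescript
// Hypothetical finding shape; only the severity levels come from the text above.
type FindingSeverity = "info" | "warn" | "error";

interface VerifierFinding {
  check: string; // e.g. "scope adherence"
  severity: FindingSeverity;
  message: string;
}

function verifyExitCode(
  hasLockedContract: boolean,
  findings: VerifierFinding[],
): 0 | 1 | 2 {
  // No locked contract: nothing to verify against.
  if (!hasLockedContract) return 2;
  // Any error-severity finding fails the run; info and warn do not.
  return findings.some((f) => f.severity === "error") ? 1 : 0;
}
```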
# Versioned contract inspection and amendment
maestro contract show --task <id>
maestro contract show --task <id> --version <n>
maestro contract amend --task <id> --add-path <path> --reason "<why>"
maestro contract amend --task <id> --remove-path <path> --reason "<why>"
maestro contract history --task <id>
# Trust Verifier
maestro task verify --task <id>
maestro task verify --task <id> --base <git-ref>
maestro task verify --task <id> --json
# Mission Spec (acceptance criteria and non-goals)
maestro spec show --mission <id>
maestro spec edit --mission <id>
maestro init bootstraps two policy files committed under .maestro/policies/:
- `sensitive-paths.yaml`: glob list; paths matching these globs trigger `checkSensitivePaths` findings. See `docs/sensitive-paths-defaults.md` for the 8 default globs and guidance on extending or relaxing them.
- `owners.yaml`: three role lists (`policy_approver`, `ratchet_approver`, `sensitive_waiver`). See `docs/owners-yaml-format.md` for the schema reference.
The verdict layer turns a verifier run into a deterministic gating decision. After maestro task verify, an agent requests a verdict that produces one of four outcomes:
| Verdict | Meaning |
|---|---|
| `PASS` | All acceptance criteria are met with evidence at or above the required witness level for the effective risk class. Completion is unblocked. |
| `FAIL` | Evidence is present but insufficient: a criterion is unmet, or the evidence witness level is below the autopilot policy threshold. |
| `HUMAN` | Criteria are met but the effective risk class or autopilot policy requires a human reviewer before the task can be sealed. |
| `BLOCK` | A hard blocker is active: broken contract, critical risk class with no human signoff, or a policy loosening still in its 30-day soak window. |
Every Evidence row carries a witness_level that captures how trustworthy the claim is. The ladder, strongest to weakest:
- `witnessed-by-maestro`: Maestro itself ran the command and captured the result.
- `witnessed-by-ci`: A trusted CI gate ran the command and posted the result back.
- `agent-claimed-locally`: The agent self-reported a local run; Maestro did not observe it. Default for schema v1 evidence rows.
- `agent-claimed-and-not-reproducible`: A manual note; cannot be reproduced. Weakest level.
The Risk Engine demotes PASS to HUMAN if any evidence row's witness level is below the threshold required by the effective autopilot policy for the derived risk class.
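A small sketch of the ladder comparison and the PASS-to-HUMAN demotion described in the sentence above. It models only that one rule; `WITNESS_LADDER`, `witnessRank`, and `applyWitnessThreshold` are hypothetical names, and the required level is passed in directly rather than looked up from the autopilot policy.

```typescript
// Weakest to strongest, so a higher index means a more trustworthy level.
const WITNESS_LADDER = [
  "agent-claimed-and-not-reproducible",
  "agent-claimed-locally",
  "witnessed-by-ci",
  "witnessed-by-maestro",
] as const;
type WitnessLevel = (typeof WITNESS_LADDER)[number];
type GateOutcome = "PASS" | "FAIL" | "HUMAN" | "BLOCK";

function witnessRank(level: WitnessLevel): number {
  return WITNESS_LADDER.indexOf(level);
}

// Demotes PASS to HUMAN when any evidence row sits below the required level.
function applyWitnessThreshold(
  outcome: GateOutcome,
  evidenceLevels: WitnessLevel[],
  required: WitnessLevel,
): GateOutcome {
  if (outcome !== "PASS") return outcome; // other outcomes are left untouched
  const belowThreshold = evidenceLevels.some(
    (level) => witnessRank(level) < witnessRank(required),
  );
  return belowThreshold ? "HUMAN" : "PASS";
}
```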
See docs/witness-levels.md for the full reference.
The Risk Engine derives a risk class from deterministic diff signals and takes the higher of agent-proposed vs Maestro-derived. An agent can never lower the derived class. The four levels are low, medium, high, and critical. See docs/risk-class-derivation.md for the signal-to-class mapping table.
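The "higher of the two" rule is a max over an ordered scale. A minimal sketch with hypothetical names (`RISK_ORDER`, `effectiveRiskClass`):

```typescript
// Lowest to highest risk.
const RISK_ORDER = ["low", "medium", "high", "critical"] as const;
type RiskClass = (typeof RISK_ORDER)[number];

// The agent may raise the class but can never lower the derived one.
function effectiveRiskClass(
  agentProposed: RiskClass | undefined,
  derived: RiskClass,
): RiskClass {
  if (agentProposed === undefined) return derived;
  return RISK_ORDER.indexOf(agentProposed) > RISK_ORDER.indexOf(derived)
    ? agentProposed
    : derived;
}
```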
maestro task proof --task <id> produces a per-criterion coverage map: for each acceptance criterion in the linked Spec, it shows which Evidence rows satisfy it and at what witness level.
Policy tightenings (stricter rules, lower budgets) take effect immediately. Policy loosenings (relaxed rules, higher budgets) soak for 30 days before becoming effective. Pending loosenings accumulate in .maestro/policies/.pending-loosenings.json (gitignored). Use maestro policy pending to inspect.
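The asymmetry between tightenings and loosenings can be sketched as a date calculation. The names below (`PolicyChangeKind`, `effectiveAt`, `isEffective`) are hypothetical, and the sketch assumes the soak is measured as exactly 30 × 24 hours from the time the change was proposed.

```typescript
const SOAK_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type PolicyChangeKind = "tightening" | "loosening";

// When a change starts to apply: immediately, or after the soak window.
function effectiveAt(kind: PolicyChangeKind, proposedAt: Date): Date {
  const delayMs = kind === "loosening" ? SOAK_DAYS * MS_PER_DAY : 0;
  return new Date(proposedAt.getTime() + delayMs);
}

function isEffective(kind: PolicyChangeKind, proposedAt: Date, now: Date): boolean {
  return now.getTime() >= effectiveAt(kind, proposedAt).getTime();
}
```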
# Verdict
maestro verdict request --task <id> # exit 0=PASS 1=FAIL 2=HUMAN 3=BLOCK
maestro verdict request --task <id> --json
maestro verdict show --task <id>
maestro verdict show --task <id> --version <id>
# ProofMap
maestro task proof --task <id>
maestro task proof --task <id> --json
# Policy inspection
maestro policy check --task <id>
maestro policy pending
maestro init bootstraps three additional policy files under .maestro/policies/:
- `risk.yaml`: extends or tightens the default signal-to-class mapping. Absent means defaults apply.
- `autopilot.yaml`: per-risk-class required witness level and auto-pass eligibility.
- `release.yaml`: release-gate rules (e.g., minimum witness level required before a release commit is stamped).
See docs/policy-format.md for the schema reference for all five policy files.
The pre-claim loop closes the inner agent loop: the agent runs plan, implement, verify, and verdict steps without human intervention; humans still review and merge. The cycle is enforced by the tools, not by convention.
Before claiming any non-trivial task done, the agent runs this ordered loop:
1. Intake: run maestro intake --paths <paths> to classify the work as tiny, normal, or high-risk before writing code. The output drives the next step (patch directly, batch-create tasks, or build a Spec + threat-model).
2. Plan: write a plan file and run maestro plan check to catch problems before code is written.
3. Implement: write code and record evidence after each verification command.
4. Verify: run maestro task verify and address every error finding.
5. ProofMap: run maestro task proof and confirm every acceptance criterion is covered.
6. Verdict: run maestro verdict request and branch on the exit code.
The canonical source for this ritual is the maestro-verify bundled skill — read it when in doubt about the verification protocol.
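Branching on the verdict exit code might look like the sketch below. The exit-code mapping (0=PASS, 1=FAIL, 2=HUMAN, 3=BLOCK) is the documented one; the `agentShouldRetry` policy is an assumption for illustration, not something the verification protocol prescribes.

```typescript
type Decision = "PASS" | "FAIL" | "HUMAN" | "BLOCK";

// Index in this array equals the documented exit code of `maestro verdict request`.
const DECISIONS: Decision[] = ["PASS", "FAIL", "HUMAN", "BLOCK"];

function decisionFromExitCode(exitCode: number): Decision {
  const decision = DECISIONS[exitCode];
  if (decision === undefined) {
    throw new Error(`unexpected verdict exit code: ${exitCode}`);
  }
  return decision;
}

// Hypothetical agent policy: FAIL is the only outcome the agent retries on
// its own; HUMAN and BLOCK stop the loop and PASS ends it.
function agentShouldRetry(decision: Decision): boolean {
  return decision === "FAIL";
}
```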
maestro intake is a deterministic plan-time risk classifier. It returns a lane and a recommended next step before code is written, using the same risk-class derivation rules as the Verdict layer.
| Lane | Trigger | Next step |
|---|---|---|
| tiny | 0–1 risk flags, no hard gate | Patch directly, run validation, close with reason. |
| normal | 2–3 risk flags, no hard gate | Create a task via maestro task plan and follow the standard pre-claim loop. |
| high-risk | Any hard gate, or 4+ flags | Build a Spec with acceptance criteria, plus a threat-model Evidence row when the diff intersects sensitive paths. |
Hard gates (any one promotes to high-risk): auth, authz, data-model, audit-security, external-systems. The classifier auto-detects flags from the intended file paths against the effective risk policy and sensitive-path globs; declared flags are merged on top.
maestro intake --paths src/auth/session.ts --flag auth
maestro intake --paths src/foo.ts,src/bar.ts --json

Exit code is always 0; agents react to lane, derivedRiskClass, threatModelRequired, and recommendedNextStep in the output.
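The lane table can be sketched as a small pure function. This is a simplified model: it assumes flags are plain strings, that auto-detected and declared flags have already been merged into one list, and that duplicates count once. The function name `classifyLane` is hypothetical.

```typescript
// Hard gates as listed above; any one of them promotes to high-risk.
const HARD_GATES = ["auth", "authz", "data-model", "audit-security", "external-systems"];

type Lane = "tiny" | "normal" | "high-risk";

function classifyLane(flags: string[]): Lane {
  const unique = new Set(flags); // assumption: duplicate flags count once
  const hitsHardGate = HARD_GATES.some((gate) => unique.has(gate));
  if (hitsHardGate || unique.size >= 4) return "high-risk";
  if (unique.size >= 2) return "normal";
  return "tiny";
}
```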
maestro plan check evaluates a plan file against the locked contract and spec before any code is written. It catches three classes of problems:
- scope-widens: the plan intends to touch files outside contract.scope.filesExpected. Resolve by narrowing the intended files or amending the contract before coding.
- missing-proof: an acceptance criterion from the Spec has no entry in the plan's proofSet. Every criterion needs a planned proof strategy.
- risk-class-too-low: the plan's declared riskClass is lower than what the intended file set triggers. Raise it to match.
The verb always exits 0. Findings in the output must be resolved before implementation begins. A clean plan-check does not guarantee a passing verdict; it means the plan is internally consistent.
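The three finding classes can be illustrated with a simplified checker. The field names `contract.scope.filesExpected`, `proofSet`, and `riskClass` come from the description above; `intendedFiles`, the criterion ids, and the overall shapes are hypothetical, and this sketch matches paths exactly rather than by glob.

```typescript
type RiskClass = "low" | "medium" | "high" | "critical";
const RISK_ORDER: RiskClass[] = ["low", "medium", "high", "critical"];

interface Plan {
  intendedFiles: string[]; // hypothetical field name
  proofSet: Record<string, string>; // criterion id -> planned proof strategy
  riskClass: RiskClass;
}
interface Contract {
  scope: { filesExpected: string[] };
}
interface Finding {
  code: "scope-widens" | "missing-proof" | "risk-class-too-low";
  detail: string;
}

function planCheck(
  plan: Plan,
  contract: Contract,
  criteria: string[],
  triggeredRiskClass: RiskClass,
): Finding[] {
  const findings: Finding[] = [];
  const expected = new Set(contract.scope.filesExpected);
  for (const file of plan.intendedFiles) {
    if (!expected.has(file)) findings.push({ code: "scope-widens", detail: file });
  }
  for (const criterion of criteria) {
    const planned = Object.prototype.hasOwnProperty.call(plan.proofSet, criterion);
    if (!planned) findings.push({ code: "missing-proof", detail: criterion });
  }
  if (RISK_ORDER.indexOf(plan.riskClass) < RISK_ORDER.indexOf(triggeredRiskClass)) {
    findings.push({
      code: "risk-class-too-low",
      detail: `${plan.riskClass} < ${triggeredRiskClass}`,
    });
  }
  return findings;
}
```

An empty result corresponds to a clean plan-check: the plan is internally consistent, which is still no guarantee of a passing verdict.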
maestro plan check --task <id> --plan-file ./plan.yaml
maestro plan check --task <id> --plan-file ./plan.yaml --json

Agents can record reviewer findings as structured evidence via maestro evidence record --kind ai-review. Three reviewer kinds are available: bug (correctness, edge cases, regressions), security (auth, input validation, secrets, injection), and architecture (boundary violations, coupling, abstraction misuse).
Any error-severity finding raises the effective risk class by one notch. A security-reviewer error always lifts to critical. A clean review (zero error findings) never lowers the deterministic baseline derived from diff signals.
See docs/ai-reviewer-protocol.md for the finding schema, confidence semantics, and recording guidance.
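The veto-only behaviour can be sketched as below. It assumes that any number of error findings produces a single one-notch raise (the text does not say whether raises stack), and the `Review` and `ReviewFinding` shapes are hypothetical rather than the documented finding schema.

```typescript
type RiskClass = "low" | "medium" | "high" | "critical";
const RISK_ORDER: RiskClass[] = ["low", "medium", "high", "critical"];

type Reviewer = "bug" | "security" | "architecture";
interface ReviewFinding {
  severity: string; // only "error" matters for this sketch
}
interface Review {
  reviewer: Reviewer;
  findings: ReviewFinding[];
}

const hasError = (review: Review): boolean =>
  review.findings.some((finding) => finding.severity === "error");

// Reviews can raise the deterministic baseline but never lower it.
function applyReviews(baseline: RiskClass, reviews: Review[]): RiskClass {
  if (reviews.some((r) => r.reviewer === "security" && hasError(r))) return "critical";
  if (reviews.some(hasError)) {
    const raised = Math.min(RISK_ORDER.indexOf(baseline) + 1, RISK_ORDER.length - 1);
    return RISK_ORDER[raised] ?? "critical";
  }
  return baseline;
}
```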
When the diff intersects security-relevant sensitive paths, the Verdict is HUMAN with reason threat-model-required unless a threat-model Evidence row is present. Produce the threat-model document and record it before requesting a verdict:
maestro evidence record --task <id> --kind threat-model \
  --threat-model-file ./threat-model.json

See docs/threat-model-format.md for the schema (assets, threatCategories, mitigations, residualRisk) and examples.
Contracts can declare cost limits: maxRetries, maxWallClockSeconds, and maxTokens. When any limit is exceeded, run-state at .maestro/runs/<task-id>/state.json (gitignored) is marked exhausted and the next verdict request returns BLOCK (exit 3) with reason cost-budget-exhausted. Check consumption at any time:
maestro task budget --task <id>
maestro task budget --task <id> --json

retryCount increments automatically on each FAIL or HUMAN verdict.
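A sketch of the exhaustion check, assuming "exceeded" means strictly greater than the limit and that an absent limit is unbounded. `maxRetries`, `maxWallClockSeconds`, `maxTokens`, and `retryCount` are the documented names; `wallClockSeconds`, `tokens`, and the function names are hypothetical.

```typescript
interface CostLimits {
  maxRetries?: number;
  maxWallClockSeconds?: number;
  maxTokens?: number;
}
interface RunUsage {
  retryCount: number;
  wallClockSeconds: number; // hypothetical field name
  tokens: number; // hypothetical field name
}

// Returns the names of every limit that has been exceeded.
function exceededLimits(limits: CostLimits, usage: RunUsage): string[] {
  const exceeded: string[] = [];
  if (limits.maxRetries !== undefined && usage.retryCount > limits.maxRetries) {
    exceeded.push("maxRetries");
  }
  if (limits.maxWallClockSeconds !== undefined && usage.wallClockSeconds > limits.maxWallClockSeconds) {
    exceeded.push("maxWallClockSeconds");
  }
  if (limits.maxTokens !== undefined && usage.tokens > limits.maxTokens) {
    exceeded.push("maxTokens");
  }
  return exceeded;
}

// Any exceeded limit marks the run exhausted, which blocks the next verdict.
function isExhausted(limits: CostLimits, usage: RunUsage): boolean {
  return exceededLimits(limits, usage).length > 0;
}
```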
In mission mode, Mission Control gains an autopilot screen that projects the current verify/verdict state across all active tasks in the mission. Use maestro mission-control --preview autopilot --size 120x40 --format plain to inspect it non-interactively.
# Plan-check
maestro plan check --task <id> --plan-file <path>
maestro plan check --task <id> --plan-file <path> --json
# Cost-budget inspection (read-only, always exits 0)
maestro task budget --task <id>
maestro task budget --task <id> --json
# AI Reviewer evidence
maestro evidence record --task <id> --kind ai-review \
--reviewer <bug|security|architecture> \
--findings '<inline-json-or-path>' \
--confidence <0-1>
# Threat-model evidence
maestro evidence record --task <id> --kind threat-model \
  --threat-model-file <path>

Local Maestro is advisory; CI Maestro is authoritative. The PR check status posted by maestro ci verify is the merge gate.
1. Bootstrap your repo with maestro setup: the maestro-setup skill installs .github/workflows/maestro-verify.yml from its bundled template (when .github/ exists).
2. Pin the Maestro binary version in the workflow (default: latest tagged release).
3. Open a PR. GitHub Actions runs maestro ci verify, which runs Trust Verifier, ingests CI job results as witnessed-by-ci Evidence, computes the Verdict, and posts a GitHub Check.
4. Merge when the check is green. Use maestro verdict show --pr <n> locally to inspect the latest verdict for a PR (looked up by current HEAD tree SHA).
Verdicts are bound to (pr, tree_sha), so squashes survive but force-pushes to a different tree invalidate them.
See docs/ci-integration.md for the full reference (workflow template, env contract, witness ingestion, troubleshooting).
When all 8 eligibility predicates pass, maestro merge auto triggers gh pr merge --auto without further human intervention. Auto-merge applies to roughly 5–15% of merged PRs in practice — only those where the diff is small-scope, fully CI-witnessed, and the autopilot policy explicitly opts in for the relevant risk class.
Auto-merge is disabled for all risk classes by default. Opt in per class in .maestro/policies/autopilot.yaml:
autoMergeAllowed:
  low: true
  medium: true
  high: false
  critical: false

All 8 predicates must pass for merge auto to trigger. In canonical check order:
| Code | Condition |
|---|---|
| verdict-not-pass | Verdict decision must be PASS |
| auto-merge-class-disabled | autoMergeAllowed.<riskClass> must be true in autopilot.yaml |
| evidence-witness-too-weak | All gating evidence rows must be at witnessed-by-ci or stronger |
| forbidden-paths-touched | Diff must not intersect contract.scope.filesForbidden |
| sensitive-paths-untouched-without-waiver | If diff touches sensitive paths, a verdict-override waiver must exist |
| rollback-not-witnessed | When the spec declares a rollout plan or a deploy-readiness row exists, a successful rollback-exercised Evidence row at witnessed-by-ci or stronger must exist |
| review-ack-missing | HUMAN verdicts at >=medium risk require a review-ack Evidence row |
| spec-score-below-threshold | If a Spec is linked, its quality score must be 1.0 |
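The all-must-pass evaluation can be sketched as follows. The codes and their order are taken from the table; the assumption that every failing reason is collected (rather than stopping at the first failure) and the function name `evaluateAutoMerge` are illustrative only.

```typescript
// Codes in canonical check order. Each code names the failure reported
// when its condition does not hold.
const PREDICATE_ORDER = [
  "verdict-not-pass",
  "auto-merge-class-disabled",
  "evidence-witness-too-weak",
  "forbidden-paths-touched",
  "sensitive-paths-untouched-without-waiver",
  "rollback-not-witnessed",
  "review-ack-missing",
  "spec-score-below-threshold",
] as const;
type PredicateCode = (typeof PREDICATE_ORDER)[number];

// `passed[code] === true` means the condition for that code holds.
function evaluateAutoMerge(passed: Record<PredicateCode, boolean>) {
  const reasons = PREDICATE_ORDER.filter((code) => !passed[code]);
  const eligible = reasons.length === 0;
  // Exit code mirrors merge auto: 0 = eligible, 1 = ineligible.
  return { eligible, reasons, exitCode: eligible ? 0 : 1 };
}
```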
# Check eligibility and trigger if eligible
maestro merge auto --pr <number> --task <id> [--base <ref>] [--repo <owner/name>] [--json]
# Record override waiver (requires sensitive_waiver authorization in owners.yaml)
maestro verdict override --task <id> --pr <number> --reason "<text>" [--verdict <id>] [--base <ref>]
# Record human review acknowledgement (for HUMAN verdicts at >=medium risk)
maestro review ack --task <id> --verdict <id> --criterion "<text>" [--criterion "<text>" ...]

Exit codes for merge auto: 0 = eligible and triggered, 1 = ineligible (reasons printed).
See docs/auto-merge-eligibility.md for the full predicate reference and "Why isn't my PR auto-merging?" troubleshooting. See docs/override-flow.md for override authorization, audit trail, and no-silent-pass guarantees.
Deploy Safety is opt-in. Producing deploy-readiness and runtime-signal Evidence does not by itself flip Verdict semantics; teams wire the new Evidence into policies/risk.yaml if they want it to gate.
The spec adds two optional fields: runtime_signals (array of RuntimeSignal — name, provider, query, threshold) and rollout_plan (feature flag name, canary stages, rollback command). Older specs forward-migrate at read time with empty arrays and no rollout plan.
maestro deploy gate runs four checks and records a deploy-readiness Evidence row. It exits 0 when all checks pass and 1 when any fail.
| Check | Passes when |
|---|---|
| feature_flag | Spec.rollout_plan.feature_flag is a non-empty string |
| canary_plan | Spec.rollout_plan.canary.stages has at least one stage |
| rollback | A successful rollback-exercised Evidence row at witnessed-by-ci or stronger exists |
| owner | owners.yaml.deploy_approver has at least one entry |
deploy gate does NOT mutate the Verdict. Teams add a deploy-readiness signal to policies/risk.yaml if they want it to block a PR.
maestro deploy rollback runs the provided shell command, records a rollback-exercised Evidence row, and exits 1 if the command fails. The witness level is witnessed-by-ci in CI and witnessed-by-maestro locally; both satisfy the rollback check in deploy gate.
maestro runtime check queries each signal declared in Spec.runtime_signals via the configured provider (Prometheus). It records one runtime-signal Evidence row per signal. Exit code is always 0; pass=false rows are advisory unless wired into risk policy.
Provider base URL precedence: --provider-base-url flag → MAESTRO_PROMETHEUS_URL env → http://localhost:9090.
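The precedence chain can be sketched as a single expression. The flag name, environment variable, and default URL are the documented ones; treating an empty string as "unset" is an assumption, and `resolveProviderBaseUrl` is a hypothetical name.

```typescript
const DEFAULT_PROMETHEUS_URL = "http://localhost:9090";

// Flag wins over the environment variable, which wins over the default.
// Assumption: empty strings are treated the same as missing values.
function resolveProviderBaseUrl(
  flagValue: string | undefined,
  env: Record<string, string | undefined>,
): string {
  return flagValue || env["MAESTRO_PROMETHEUS_URL"] || DEFAULT_PROMETHEUS_URL;
}
```

Passing the environment in as a plain record keeps the function easy to test; a caller in a Node or Bun process would typically pass `process.env`.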
owners.yaml has a fourth role: deploy_approver. The list is checked by deploy gate (owner check). See docs/owners-yaml-format.md for the full schema. CI Maestro's PR-author check also verifies that the committer is not self-approving their own deploy.
maestro deploy gate --task <id> [--base <ref>] [--json]
maestro deploy rollback --task <id> --command <cmd> [--json]
maestro runtime check --task <id> [--provider-base-url <url>] [--json]

See docs/deploy-gate.md for the full check enumeration, Spec.rollout_plan reference, and troubleshooting. See docs/runtime-monitoring.md for the RuntimeMonitorPort reference and Prometheus adapter guide.
Two features extend the trust substrate horizontally — one across PRs, one across edge cases.
maestro ci verify checks whether other open PRs touch any of the same file paths as the current PR. When overlap is detected, it records a kind=cross-task-conflict Evidence row at witnessed-by-ci and passes it to the Risk Engine. The Risk Engine raises the effective risk class by one tier when any conflict row is present (capped at critical); multiple conflict rows still produce a single one-tier raise.
Detection is file-path-level: a path counts as overlapping when it appears in both this PR's changed-file list and at least one other open PR's changed-file list. The check is non-fatal on API errors — a failed gh api call logs a warning and skips the record without failing the verify step.
See docs/cross-task-conflict.md for the port/adapter/use-case flow, payload schema, and troubleshooting.
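File-path-level overlap detection amounts to a set intersection, sketched below. The `OpenPr` shape and the function name `findOverlaps` are hypothetical; the real payload schema is in the referenced doc.

```typescript
interface OpenPr {
  number: number;
  changedFiles: string[];
}

// Maps each overlapping path to the other open PRs that also change it.
function findOverlaps(currentFiles: string[], otherOpenPrs: OpenPr[]): Map<string, number[]> {
  const current = new Set(currentFiles);
  const overlaps = new Map<string, number[]>();
  for (const pr of otherOpenPrs) {
    for (const file of pr.changedFiles) {
      if (!current.has(file)) continue;
      const prs = overlaps.get(file) ?? [];
      prs.push(pr.number);
      overlaps.set(file, prs);
    }
  }
  return overlaps;
}
```

A non-empty map corresponds to the case where a conflict Evidence row would be recorded.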
tests/e2e/trust-benchmark/ is an end-to-end regression corpus of 9 scenarios drawn from a master edge-case list of 32. The corpus covers: out-of-scope edits, generated-file drift, sensitive-path violations, security-thin diffs, amendment creep, proof not tied to criteria, rebase/squash verdict identity, deploy-gate decision authority, and PR self-weakening. Each scenario includes a positive assertion (mitigation fires) and a negative assertion (mitigation does not fire without the trigger).
bun test tests/e2e/trust-benchmark/

See docs/trust-benchmark.md for the full scenario table, fixture pattern, and how to add new scenarios.
The following capabilities are not in this release and will ship when teams ask maestro to learn from incidents: autopsy generator, maestro ratchet review/approve/sunset CLI, N≥2 broad-promotion guard for ratchet rules, and sunset/decay machinery.
Maestro ships a Model Context Protocol (MCP) server that exposes its core verbs to MCP-aware agent runtimes. Agents call maestro_task_create, maestro_evidence_record, maestro_verdict_request, and so on as structured tools instead of shelling out to the CLI and parsing text. The server is the same maestro binary, run with maestro mcp serve over stdio.
14 tools across 5 surfaces, each a 1:1 wrapper around an existing maestro use case:
| Surface | Tools |
|---|---|
| Task | maestro_task_list, maestro_task_get, maestro_task_create, maestro_task_claim, maestro_task_complete, maestro_task_block, maestro_task_unblock |
| Evidence | maestro_evidence_record, maestro_evidence_list |
| Contract | maestro_contract_show, maestro_contract_amend |
| Verdict | maestro_verdict_show, maestro_verdict_request |
| Policy | maestro_policy_check |
maestro_task_list and maestro_evidence_list are paginated (limit/offset in, pagination: { total, limit, offset, hasMore } out). Every tool declares both a strict inputSchema (unknown fields error rather than being silently dropped) and an outputSchema mirroring the success-path structuredContent. Failures set isError: true with a stable { code, message, hints } payload — clients branch on code (TASK_NOT_FOUND, ALREADY_COMPLETED, OWNERSHIP_CONFLICT, CYCLE_DETECTED, CONTRACT_NOT_FOUND, …).
maestro install and bun run release:local register the MCP entry with each supported runtime by shelling out to the runtime's own CLI (claude mcp add -s user, codex mcp add). Detection is CLI-on-PATH: a runtime whose CLI is not on PATH is silently skipped, so the install does not litter configs onto machines that don't have the runtime.
The entry lands in the canonical file each runtime actually reads:
| Runtime | Config file |
|---|---|
| Claude Code (user scope) | ~/.claude.json (top-level mcpServers.maestro) |
| Codex | ~/.codex/config.toml ([mcp_servers.maestro] table) |
maestro mcp serve # stdio transport, default
maestro mcp serve --project-root /abs/path # override project root detection
maestro mcp check # verify installed binary + runtime configs
maestro mcp check --json

mcp serve reads JSON-RPC over stdin and writes responses to stdout; logs and errors go to stderr to keep the protocol channel clean. mcp check reports [ok], [stale], or not configured per runtime and exits 1 when the binary is missing.
The server walks up from its working directory looking for .maestro/. To override, set MAESTRO_PROJECT_ROOT in the entry's env block (or pass --project-root when running standalone). The session id reported on writes is auto-detected from MAESTRO_SESSION_ID, CLAUDECODE_SESSION_ID, or CODEX_THREAD_ID, falling back to <user>@<host>.
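The session id fallback can be sketched as below, assuming the variables are consulted in the order listed and that empty values count as unset. `resolveSessionId` is a hypothetical name.

```typescript
// First non-empty variable wins; otherwise fall back to <user>@<host>.
function resolveSessionId(
  env: Record<string, string | undefined>,
  user: string,
  host: string,
): string {
  return (
    env["MAESTRO_SESSION_ID"] ||
    env["CLAUDECODE_SESSION_ID"] ||
    env["CODEX_THREAD_ID"] ||
    `${user}@${host}`
  );
}
```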
See docs/mcp-server.md for the full tool and error-code reference, and docs/mcp-setup.md for the manual configuration path and troubleshooting.
| Command | Use it when you want to... |
|---|---|
| maestro init | Create local project state. |
| maestro install | Initialize global config and inject supported agent instruction blocks. |
| maestro update | Upgrade the local binary to the latest release and refresh agent instruction blocks. |
| maestro doctor | Check whether the local environment is configured correctly. Includes harness-drift checks for empty mission feature directories and oversized root docs. |
| maestro providers list / maestro providers doctor | Inspect runtime and skill-target provider configuration. |
| maestro skills list / maestro skills install <source> | Discover, inspect, install, remove, and sync AgentSkills-compatible skills. |
| maestro status | Inspect the current Maestro state quickly. |
| maestro mission create --file plan.json | Create a mission from a plan file. |
| maestro feature prompt <feature-id> --mission <mission-id> | Generate the next agent prompt. |
| maestro feature update <feature-id> --mission <mission-id> --status <status> | Advance a feature through pending, assigned, in-progress, review, done, or blocked. |
| maestro reply write <feature-id> | Record an agent reply (outcome + optional report) for a feature. |
| maestro handoff "<task>" --agent <agent> | Build a markdown brief from current repo or mission context and launch a fresh agent run. |
| maestro handoff pickup [--id <handoff-id>] [--standalone] | Consume one open handoff packet. Prompt-only packets work anywhere; task-linked packets resume only from their source project unless --standalone is passed. |
| maestro handoff "<task>" --worktree [slug] --wait --json | Launch in a sibling worktree, wait for completion, and return structured metadata. |
| maestro mission-control --preview | Render a read-only dashboard preview in the terminal. |
| maestro mission-control --json | Get a machine-readable snapshot of mission state. |
| maestro mission-control --render-check --size 120x40 | Validate TUI render integrity non-interactively. |
| maestro intake --paths <list> | Classify intended work as tiny, normal, or high-risk before writing code. |
| maestro task ready | List actionable pending tasks with no unresolved blockers. |
| maestro task claim <id> | Take ownership of a task for the current session. |
| maestro task update <id> --status in_progress / --status completed --reason "..." | Start or finish a task. |
| maestro task update <id> --current-state "..." --next-action "..." --add-decision "..." | Refresh the resumable continuation summary for the next agent. |
| maestro task reopen <id> | Move a completed task back to the pending queue and restore its continuation summary. |
| maestro task block <id> <blockedTaskIds...> | Record that one task blocks others. |
| maestro evidence record --task <id> --command "bun test" --exit 0 | Log a command run as evidence for a task. |
| maestro evidence record --task <id> --kind manual-note --note "..." | Log a free-form manual note as evidence. |
| maestro evidence list --task <id> | List all evidence rows for a task. |
| maestro evidence show <evidence-id> | Show one evidence row by id. |
| maestro mcp serve | Start the MCP server on stdio. Agents launch this; you do not start it manually. |
| maestro mcp check | Verify the installed maestro binary and the canonical agent runtime config files. |
| maestro principle list / principle add | Inspect or register a behavioral principle. |
| maestro bundle export <mission-id> --out ./review.mission.tar.gz | Package a mission + artifacts as a portable archive. |
| maestro bundle inspect <path> | Print a mission bundle's manifest without extracting. |
| maestro memory-correct <rule> | Capture a correction that should influence future runs. |
| maestro memory-compile | Turn raw learnings into reusable guidance. |
| maestro ratchet-check | Run the regression ratchet suite. |
Run maestro <command> --help for full flags and examples.
Mission Control is a read-only dashboard over Maestro state. It supports:
- Interactive TTY mode with maestro mission-control
- Single-frame previews with maestro mission-control --preview
- Machine-readable snapshots with maestro mission-control --json
- Render validation with maestro mission-control --render-check --size 120x40
Available preview screens include:
- dashboard
- features
- dependencies (mission-only)
- config
- memory
- graph
- agents
- dispatch (mission-only)
- events
- tasks
- timeline (mission-only)
- principles
- help
Aliases: feat, deps, cfg, mem, agent, event, task, principle. Mission-only screens are skipped automatically when running in home mode.
For non-interactive environments, prefer --preview, --preview all, or --json.
The codebase follows a hexagonal shape: commands stay thin, src/services.ts wires dependencies, use cases depend on domain rules and ports, and adapters implement those ports against the local filesystem and environment.
Maestro stores project-local state in .maestro/ and user-level defaults in ~/.maestro/.
| Surface | Lives where | Holds |
|---|---|---|
| Project workflow state | .maestro/ | Missions, tasks, notes, and local memory artifacts |
| User-level defaults | ~/.maestro/ | Global config, graph metadata, and the global handoff store |
| Read-only projections | Mission Control | Terminal previews and JSON snapshots over the same state |
.maestro/
├── config.yaml
├── memory/
│   ├── corrections/
│   ├── learnings/
│   └── ratchet/
├── missions/
│   └── <mission-id>/
│       ├── mission.json
│       ├── assertions.json
│       ├── checkpoints/
│       ├── features/
│       └── agents/
├── tasks/
│   ├── tasks.jsonl
│   ├── contracts/
│   ├── contract-templates/
│   ├── continuations/
│   ├── batches/
│   ├── candidates/
│   └── local-history/
├── principles.jsonl
└── notes.json

~/.maestro/
├── config.yaml
├── graph/
│   └── projects.json
└── handoff/
    └── <id>/
        ├── handoff.json
        ├── prompt.md
        └── output.log
The design is intentionally transparent: state is inspectable, diffable, and easy to back up. .maestro/tasks/** and .maestro/principles.jsonl are intentionally repo-tracked so the daily queue and behavioral rules are reviewed like any other code change; .maestro/missions/** stays ignored as local orchestration artifacts. Handoff packets live in the global ~/.maestro/handoff/ store (outside the repo) so they are visible across every working directory. Legacy .maestro/handoffs/**, .maestro/launches/**, and ~/.maestro/launches/** folders can remain on disk from earlier versions, but Maestro no longer reads them; status or doctor will warn when they are present.
Maestro is organized as a feature-first hexagonal codebase:
- src/features/<name>/ -- each feature is a bounded context containing its own commands/, usecases/, domain/, ports/, adapters/, plus a services.ts composition factory and index.ts public surface. Current features: memory-ratchet, handoff (markdown prompt building plus Codex, Claude, or Hermes launch orchestration), notes, graph, session, memory, mission (with feature/, validation/, checkpoint/ subfolders and behavioral principles), agent (library-only; composes agent prompts and manages harness config injection), task (Claude-style blocker graph for the daily loop), reply (agent reply ingest with principle gating), and bundle (portable mission archive).
- src/infra/ -- plumbing that isn't a feature: init, doctor, status, install, update, uninstall, and mission-control commands, config and git ports/adapters, and infra-owned domain types.
- src/shared/ -- generic utilities with no domain knowledge: filesystem, YAML, shell, path safety, and output formatting under lib/; cross-cutting primitives like IDs and UI config under domain/; plus top-level errors.ts, version.ts, and version-format.ts.
- src/tui/ -- read-only rendering and input for Mission Control; consumes features through their public surfaces. See src/tui/README.md for the contributor-oriented TUI architecture walkthrough.
- src/services.ts -- composition root that wires every feature's adapters into a single service object.
- src/index.ts -- Commander CLI entry point.
Cross-feature imports must go through @/features/<name>, which resolves to the feature's index.ts. Deep imports across feature boundaries are forbidden and enforced by bun run check:boundaries in CI. Two features have explicit, scoped exceptions:
- agent may import from mission, memory, and handoff through their public surfaces, because agent composes prompts from mission context, memory hints, and the native handoff launcher APIs.
- bundle may import from mission, reply, handoff, and session through their public surfaces, because bundle is a read-only aggregator that snapshots every mission artifact, including handoff launches, into a portable archive.
The runtime is intentionally narrow: filesystem-backed stores, git integration, config handling, and a terminal UI. There is no database adapter or network service in the main workflow.
bun run build
bun run typecheck
bun test
bun run tui:dev
bun run release:local

Useful verification commands for CLI and TUI work:
./dist/maestro --version
maestro --version
maestro --help
maestro mission-control --preview --size 120x40 --format plain
maestro mission-control --render-check --size 120x40

Maestro treats agent integrations as providers. Runtime providers can launch handoffs; skill-target providers receive Maestro-managed skills.
| Provider | Runtime | Skill target | Skills root |
|---|---|---|---|
| Codex | yes | yes | $CODEX_HOME/skills or ~/.codex/skills |
| Claude Code | yes | yes | ~/.claude/skills |
| Hermes | yes | yes | ~/.hermes/skills/maestro |
| AgentSkills | no | yes | ~/.agents/skills |
maestro install, maestro update --agents-only, and maestro uninstall --agents-only keep the bundled Maestro skills synced across every available skill target. The skills feature also discovers and manages external AgentSkills-compatible skills.
Skill discovery is deterministic and stops at the first match. The order:
1. project .maestro/skills
2. project .agents/skills
3. repo bundled skills
4. ~/.maestro/external-skills
5. ~/.agents/skills
6. provider roots (~/.claude/skills, ~/.codex/skills, ~/.hermes/skills/maestro)
Collisions emit warnings. The first skill in precedence order wins.
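First-match-wins resolution with collision warnings can be sketched as below. The `SkillRoot` shape, the warning text, and the function name `resolveSkills` are hypothetical; only the precedence rule comes from the description above.

```typescript
interface SkillRoot {
  label: string; // e.g. "project .maestro/skills"
  skills: string[]; // skill names found under this root
}

// `roots` must be given in precedence order, highest priority first.
function resolveSkills(roots: SkillRoot[]) {
  const winners = new Map<string, string>(); // skill name -> winning root label
  const warnings: string[] = [];
  for (const root of roots) {
    for (const name of root.skills) {
      const winner = winners.get(name);
      if (winner === undefined) {
        winners.set(name, root.label);
      } else {
        warnings.push(`skill "${name}" in ${root.label} is shadowed by ${winner}`);
      }
    }
  }
  return { winners, warnings };
}
```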
maestro providers list [--json]
maestro providers doctor [provider] [--json]
maestro skills list [--scope project|user|shared|all] [--json]
maestro skills inspect <name> [--json]
maestro skills install <source> [--scope user|project|shared] [--targets all|codex,claude,hermes,agentskills]
maestro skills remove <name> [--scope user|project|shared]
maestro skills sync [--targets ...]

Supported install sources: local skill directory, local directory containing one or more skill directories, Git URL, GitHub shorthand (owner/repo or owner/repo/path), and HTTP zip/tar/tgz/tar.gz archive URLs. Bundled scripts inside a skill source are never executed during install.
See Provider Registry and Skills for the full reference, including the Hermes config mutation and security model.
In-depth references live under docs/:
| Topic | File |
|---|---|
| Provider registry, skills, Hermes setup | providers.md |
| CI integration (maestro ci verify, GitHub Checks) | ci-integration.md |
| Auto-merge eligibility (8 predicates) | auto-merge-eligibility.md |
| Override authorization and audit trail | override-flow.md |
| Risk class derivation from diff signals | risk-class-derivation.md |
| Witness levels (the trust ladder) | witness-levels.md |
| Policy file schemas (risk, autopilot, release, sensitive paths, owners) | policy-format.md, sensitive-paths-defaults.md, owners-yaml-format.md |
| AI Reviewer protocol (veto-only; raises class but never lowers it) | ai-reviewer-protocol.md |
| Threat-model schema | threat-model-format.md |
| Cross-task conflict detection | cross-task-conflict.md |
| Deploy gate (4 checks + Spec.rollout_plan) | deploy-gate.md |
| Runtime monitoring (Prometheus adapter) | runtime-monitoring.md |
| Trust benchmark corpus (regression seed) | trust-benchmark.md |
| MCP server tools and result shapes | mcp-server.md |
| MCP setup for Claude Code and Codex | mcp-setup.md |
The agent-facing protocol is documented inside the bundled skills under skills/bundled/. maestro-verify is the canonical verification protocol; maestro-intake, maestro-task, maestro-plan, maestro-mission, maestro-handoff, maestro-brainstorm, maestro-qa, and maestro-setup cross-reference it. maestro install syncs all of them into ~/.claude/skills/ and ~/.codex/skills/.
See CONTRIBUTING.md for the dev loop, repository layout, conventions, required pre-PR checks, and the port → adapter → use-case → command → test pattern. For security-sensitive reports, see SECURITY.md.