agent-kit tracks AI agent setup to keep workflows consistent across machines. It contains prompt-style skills, custom workflows, and local tooling.
agent-kit is an agent-facing Heuristic System framework, not a generic skill marketplace. It gives agents a portable operating layer where skills, scripts, runbooks, evidence records, tests, hooks, and guardrails can turn repeated workflow experience into maintainable repo knowledge.
The framework does not train model weights. It keeps learning explicit and auditable: when agents run workflows, inspect failures, apply fixes, validate outcomes, or update policies, the durable lessons can become skill contracts, tests, scripts, primitives, or runbooks.
The portable core is the tracked repository content: AGENTS.md, DEVELOPMENT.md, HEURISTIC_SYSTEM.md, docs/, scripts/, hooks/, and public skills under skills/workflows/, skills/tools/, and skills/automation/. Project, company, system, and local overlays are supported, but they are not part of the portable core unless they are explicitly tracked and documented.
Best fit:
- Engineers or teams who want agents to follow one validated operating model.
- Daily multi-repo work where preflight, checks, semantic commits, PR/MR delivery, release flow, and repeatable tooling matter.
- Users who prefer explicit guardrails, failure handling, and durable workflow learning over a loose collection of standalone prompt snippets.
.
├── .agents/ # repo-local agent helper scripts, including release entrypoint
├── .github/ # CI workflows (GitHub Actions)
├── docker/ # Docker image docs and workspace launcher pointers
├── docs/ # runbooks, plans, and testing docs
├── hooks/ # Codex hook source and managed config block
├── scripts/ # validation, sync, build, and helper entrypoints
├── skills/ # tracked public skills, prompt-style skills, and ignored overlays
├── tests/ # pytest regression/smoke tests
├── AGENTS.md # global agent rules (response/tooling)
├── DEVELOPMENT.md
└── HEURISTIC_SYSTEM.md
Run the macOS bootstrap script for a first install or repair:
curl -fsSL https://raw.githubusercontent.com/graysurf/agent-kit/main/install.sh | bashThe installer defaults to:
- installing Homebrew tools from CLI_TOOLS.md with the
recommendedprofile - installing or upgrading
nils-clifromsympoies/tap - cloning this repository to
$HOME/.agents - initializing
$HOME/.codex/AGENTS.md - linking
$HOME/.agents/AGENTS.mdto$HOME/.codex/AGENTS.md - writing a managed
$HOME/.zshenvblock forAGENT_HOME,AGENT_DOCS_HOME,PLAN_ISSUE_HOME, and Homebrew PATH - syncing the managed Codex hooks block into
$HOME/.codex/config.toml
For a local checkout, run:
./install.shAGENT_HOME is the home for this agent-kit toolchain. Note: nils-cli ≥ 0.8.0
binaries (agent-docs, plan-issue) no longer auto-read AGENT_HOME. Pass the
home explicitly via agent-docs --docs-home "$AGENT_HOME" and
plan-issue --state-dir "$AGENT_HOME", or export AGENT_DOCS_HOME /
PLAN_ISSUE_HOME alongside AGENT_HOME.
nils-cli also provides shared helper binaries used by skills and checks, including
plan-tooling, api-*, semantic-commit, agent-out, and media tooling. Use
nils-cli >= 0.8.7 for plan-folder scaffold defaults.
Optional: set PROJECT_PATH per project (e.g. in a repo’s .envrc) so tools can treat that repo as the active project context:
export PROJECT_PATH="$PWD"For new repositories with missing policy baseline docs, run the canonical bootstrap flow:
$AGENT_HOME/skills/tools/agent-docs/agent-doc-init/scripts/agent_doc_init.sh --dry-run --project-path "$PROJECT_PATH"
$AGENT_HOME/skills/tools/agent-docs/agent-doc-init/scripts/agent_doc_init.sh --apply --project-path "$PROJECT_PATH"
agent-docs --docs-home "$AGENT_HOME" baseline --check --target all --strict --project-path "$PROJECT_PATH" --format textSee docs/runbooks/agent-docs/new-project-bootstrap.md for the full sequence.
Codex hooks are tracked under hooks/codex/, but the full
$HOME/.codex/config.toml file stays local-only. Sync the managed hook block
into the local config with:
$AGENT_HOME/scripts/codex-hooks-sync sync --applyFor a first install or handoff, keep the path short:
- Install
nils-cliand setAGENT_HOME. - Run
agent-docs --docs-home "$AGENT_HOME" resolve --context startup --strict --format checklist. - From this repository, run
scripts/check.sh --all. - Try one focused workflow, such as
agent-doc-initin a disposable project. - Add project-local
.agents/skillsor privateskills/_projectsoverlays only after the portable core works.
See docker/agent-env/README.md for the Ubuntu Docker environment, Docker Hub publish steps, and compose usage.
See docker/agent-workspace-launcher/README.md for the host-native
agent-workspace-launcher runtime that replaced legacy Docker wrapper usage.
Prompt presets are implemented as prompt-style skills so compatible agent CLIs
can surface them through a skill picker or /skills-style command. The source
prompt text lives under each skill's references/prompts/ directory.
| Prompt | Description | Usage |
|---|---|---|
| actionable-advice | Answer a question with clarifying questions, multiple options, and a single recommendation | Use the actionable-advice skill |
| actionable-knowledge | Answer a learning/knowledge question with multiple explanation paths and a single recommended path | Use the actionable-knowledge skill |
| orchestrator-first | Make the main agent own intent, dispatch, integration, validation, and final synthesis while subagents own lanes | Use the orchestrator-first skill |
| parallel-first | Enable a parallel-first execution policy for this conversation thread when delegation is safe | Use the parallel-first skill |
| test-first | Require failing-test evidence or an explicit waiver before production behavior changes | Use the test-first skill |
See skills/tools/skill-management/README.md for how to create/validate/remove skills (including
project-local .agents/skills) using canonical entrypoints.
The catalog below covers tracked public skills under skills/workflows/, skills/tools/, and skills/automation/. Treat these sections as behavior boundaries:
- Workflows guide judgment-heavy, multi-step agent work where the agent owns interpretation, sequencing, or handoff.
- Tools expose bounded commands, evidence records, execution surfaces, or maintenance helpers; the caller owns intent and safety.
- Automation owns an end-to-end loop and may stage, commit, push, retry checks, create or close PRs/MRs, or publish releases.
Tool skills may also be grouped by execution surface, such as skills/tools/computer-use/ for live GUI automation.
Nested folders below the three public domains are catalog taxonomy only. They
must describe the skill's behavior boundary, such as
skills/tools/workflow-evidence/ or skills/automation/ci/, without changing
whether the skill is a workflow, tool, or automation skill.
Prompt-style skills are listed in the Prompts section above and are not duplicated in the general workflow table.
Ignored local/system overlays may exist under skills/_projects/ and skills/.system/, but they are
not part of the public catalog unless explicitly tracked.
| Area | Skill | Description |
|---|---|---|
| Conversation | requirements-gap-scan | Explicit requirements gap scan and blocking-clarification question format with suggested defaults |
| Conversation | handoff-session-prompt | Generate a generic next-session initialization prompt from the user request, conversation context, and any user-specified reference files without embedding project-specific details. |
| Conversation | review-to-improvement-doc | Convert review findings, risks, lessons learned, or follow-up backlog into a durable repo-local improvement document. |
| Conversation | discussion-to-implementation-doc | Convert completed requirements, design, feasibility, or customer-facing discussion into a durable implementation-readiness document. |
| Planning | create-plan | Create a comprehensive, phased implementation plan and save it under docs/plans/ |
| Planning | create-plan-rigorous | Create an extra-thorough implementation plan and get a subagent review |
| Planning | docs-plan-cleanup | Prune outdated docs/plans markdown with dry-run-first safeguards and related-doc reconciliation |
| Planning | execute-plan-parallel | Execute a markdown plan by spawning parallel subagents for unblocked tasks, then validate |
| Planning | execute-from-implementation-doc | Resume and execute long-running implementation work from a durable execution-ready document and progress state |
| Planning | durable-artifact-cleanup | Audit and remove obsolete durable implementation artifacts after execution is complete |
| Issue | issue-follow-up | Open or continue GitHub issues as durable follow-up timelines, routing to issue lifecycle, implementation, or PR workflows as needed |
| Issue | issue-lifecycle | Main-agent workflow for opening, maintaining, decomposing, and closing GitHub Issues as the planning source of truth |
| Issue | issue-subagent-pr | Subagent workflow for assigned task-lane implementation (pr-shared/per-sprint/pr-isolated), draft PR creation, and review-response updates linked to the owning issue |
| Issue | issue-pr-review | Main-agent PR review workflow with explicit PR comment links mirrored to the issue timeline |
| Reporting | daily-brief | Prepare a source-grounded daily information brief by orchestrating topic-radar JSON with stable preferences, concise user-language synthesis, and optional local history records |
| MR / GitLab | create-gitlab-mr | Create GitLab Merge Requests through an audited glab workflow with a standard MR body and explicit source-branch policy |
| MR / GitLab | deliver-gitlab-mr | Deliver GitLab merge requests end to end with preflight, creation, pipeline wait/fix loop, close, and cleanup |
| MR / GitLab | close-gitlab-mr | Merge and close GitLab merge requests after pipeline gating, draft readiness, and branch cleanup |
| PR / GitHub | create-github-pr | Create GitHub pull requests through the canonical provider-scoped workflow with audited hook bypass markers |
| PR / GitHub | deliver-github-pr | Deliver GitHub pull requests end to end with preflight, creation, CI wait/fix loop, merge, and cleanup |
| PR / GitHub | close-github-pr | Merge and close GitHub pull requests after PR hygiene review and branch cleanup |
| PR / Plan Issue | create-plan-issue-sprint-pr | Open a draft GitHub sprint PR for a plan-issue implementation lane using the canonical sprint PR body schema and assigned dispatch record |
| Area | Skill | Description |
|---|---|---|
| Agent Docs | agent-doc-init | Initialize missing baseline docs safely (dry-run first), then upsert optional project extension entries |
| App Runtime | macos-agent-ops | Run repeatable macOS app checks/scenarios with macos-agent |
| Browser Runtime | playwright | Automate a real browser via Playwright CLI using the wrapper script |
| Browser Runtime | agent-browser | Optional legacy fallback for agent-browser CLI automation when native Browser/Chrome tools are unavailable |
| Browser Evidence | browser-session | Record browser-session goals, steps, statuses, and evidence artifacts through the nils-cli browser-session command |
| Browser Evidence | web-evidence | Capture redacted static HTTP evidence bundles through the nils-cli web-evidence command |
| Browser Evidence | web-qa | Run static or active browser QA evidence workflows with web-evidence, Browser/Chrome/Playwright, and browser-session records |
| Skill Management | skill-governance | Audit skill layout and validate SKILL.md contracts |
| Skill Management | create-skill | Scaffold a new skill directory that passes skill-governance audit and contract validation |
| Skill Management | create-project-skill | Scaffold a project-local skill under <project>/.agents/skills/ with contract/layout validation |
| Skill Management | remove-skill | Remove a tracked skill directory and purge non-archived repo references (breaking change) |
| Scope | agent-scope-lock | Create, read, validate, and clear edit-scope locks through the nils-cli agent-scope-lock command |
| Workflow Evidence | canary-check | Run a local canary command and persist redacted pass/fail evidence through the nils-cli canary-check command |
| Workflow Evidence | docs-impact | Scan Git changes for documentation impact through the nils-cli docs-impact command |
| Workflow Evidence | model-cross-check | Record primary/checker model observations through the nils-cli model-cross-check command |
| Workflow Evidence | review-evidence | Record review findings and final validation through the nils-cli review-evidence command |
| Workflow Evidence | skill-usage | Record skill invocation intent, linked evidence, validation, outcome, and failures through nils-cli skill-usage |
| Workflow Evidence | test-first-evidence | Record failing-test evidence, waivers, and final validation through the nils-cli test-first-evidence command |
| Git | semantic-commit | Commit staged changes using Semantic Commit format |
| Review UX | open-changed-files-review | Open files edited by Codex in VSCode after making changes (silent no-op when unavailable) |
| Notifications | desktop-notify | Send desktop notifications via terminal-notifier (macOS) or notify-send (Linux) |
| Media | image-processing | Convert svg/png/jpg/jpeg/webp inputs to png/webp/jpg and validate SVGs via image-processing |
| Media | screen-record | Record a single window or full display to a video file via the screen-record CLI (macOS 12+ and Linux) |
| Media | screenshot | Capture screenshots via screen-record on macOS and Linux, with optional macOS desktop capture via screencapture |
| SQL | sql | Run PostgreSQL, MySQL, or SQL Server queries through one dialect subcommand and prefix + env file convention |
| Testing | api-test-runner | Canonical REST + GraphQL API testing skill: suites via api-test, focused calls via api-rest / api-gql |
| Computer Use | google-sheets-cell-edit | Edit Google Sheets cells through browser automation with reliable cell targeting, multiline content, partial rich-text hyperlinks, and in-app validation. |
| Market Research | polymarket-readonly | Research Polymarket markets through read-only public APIs and MCP servers without trading credentials. |
| Market Research | topic-radar | Aggregate read-only AI and technology trend signals with a personal AI/Tech profile, fast ai-news preset, Polymarket MCP input, and Markdown or JSON digests. |
| Area | Skill | Description |
|---|---|---|
| Commit Automation | semantic-commit-autostage | Autostage (git add) and commit changes using Semantic Commit format for fully automated workflows |
| Issue Delivery | issue-delivery | Orchestrate issue execution loops end-to-end: open issue, track status, request review, and close only after approval + merged PR gates |
| Issue Delivery | plan-issue-delivery | Orchestrate plan-driven issue delivery by sprint: split plan tasks, dispatch subagent PR work, enforce acceptance gates, and advance to the next sprint without main-agent implementation. |
| CI Repair | gh-fix-ci | Automatically fix failing GitHub Actions checks, semantic-commit-autostage + push, and retry until green |
| Bug Repair | fix-bug-pr | Find bug-type PRs with unresolved bug items, fix and push updates, comment, and keep PR body status synced |
| Bug Discovery | find-and-fix-bugs | Find, triage, and fix bugs; open a PR with a standard template |
| Security Scan | semgrep-find-and-fix | Scan a repo using its local Semgrep config, triage findings, and open a fix PR or report-only PR |
| Release | release-workflow | Execute project release workflows by following a repo release guide (with a bundled fallback) |
Use scripts/check.sh as the canonical local check entrypoint:
scripts/check.sh --pre-commitUse scripts/check.sh --all as the canonical minimum gate.
Common focused runs:
scripts/check.sh --docs
scripts/check.sh --markdown
scripts/check.sh --tests -- -m script_smoke
bash scripts/ci/stale-skill-scripts-audit.sh --check
scripts/check.sh --entrypoint-ownershipLint CI (.github/workflows/lint.yml) maps its phases to
scripts/check.sh modes.
Generated phase blocks are managed by
scripts/ci/generate-lint-workflow-phases.py
(--check for CI, --write when updating mappings).
Keep docs and CI guidance aligned with these modes instead of legacy
ad-hoc wrappers.
This project is licensed under the MIT License. See LICENSE.