xoai/sage

Sage

An intelligent skills framework for AI agents.

Think clearly. Work thoroughly. Deliver excellence.

Sage is a skills framework that makes AI agents think before they act, stay focused under complexity, and deliver outcomes you can trust. Built for product and engineering teams, open to any domain.

  • Think first, build second — a framing round challenges assumptions before solutioning begins, preventing the most expensive mistake: solving the wrong problem
  • Focus over noise — loads only what the task needs, producing sharper reasoning
  • Reliable by design — 5-layer enforcement, 3 independent sub-agent reviews, quality gates with deterministic scripts
  • Gets smarter over time — self-learning, memory, and ontology compound into institutional knowledge of your codebase
  • Grows with its ecosystem — 38 built-in skills, extensible with 90K+ community skills from skills.sh

Why Sage

The Navigator

Most AI frameworks skip from request to implementation. Sage's navigator thinks first — mapping every request to an intent spectrum (UNDERSTAND → ENVISION → DELIVER → REFLECT) and detecting what's missing before work begins.

It starts with a framing round: surface the pain, challenge the premises, and arrive at a chosen framing — before any solutioning happens. Building without research? It tells you what 15 minutes of discovery would prevent, then lets you decide. Gap detection, not gatekeeping.

Routing is deterministic first, intelligent second: keywords match workflows before any LLM judgment. When keywords don't match, a focused sub-agent classifier picks the right phase. Every routing decision is confirmed with the user before proceeding. Smart enough to route accurately. Humble enough to ask when unsure.

The Quality Chain

AI agents drift silently — skipping steps, hallucinating imports, building the wrong thing confidently. Sage catches this at every stage:

Before implementation:

  • Auto-review (sub-agent) verifies spec quality after approval — framing alignment, testable criteria, boundary completeness, edge cases, internal consistency
  • Auto-review (sub-agent) verifies plan quality after approval — spec-plan alignment, task decomposition, dependency ordering, coverage gaps

During implementation:

  • 7 universal coding principles loaded into the build-loop — clarity, error handling, boundary guards, minimal scope, safe APIs, consistency, behavior testing

After implementation:

  • 5 quality gates sequence automatically — spec compliance, constitution compliance, code quality (independent sub-agent), hallucination check, test verification
  • 2 advisory gates activate when applicable — browser check (Lightpanda), design check (frontend files)
  • Auto-QA (sub-agent) verifies code against spec — alignment, test coverage, error handling, boundary conditions, integration consistency, coding principles

Six independent sub-agent review points. The agent that writes the code — or diagnoses the bug — never reviews its own work alone.

Hybrid Loading

Most frameworks dump all instructions into the context window and hope for the best. Sage loads in two layers: the eager layer (process rules, workflow gates, engineering principles — ~200 lines, always in context) enforces what must never be skipped. The lazy layer (capabilities like TDD discipline, coding principles, systematic debugging, build-loop orchestration — loaded when the workflow step needs them) adds depth without bloating context. A focused agent with the right 500 tokens outperforms a distracted agent with 50,000 tokens of everything.

Session Resilience

Close your IDE, hit a context limit, come back tomorrow — Sage picks up exactly where you left off. A cycle manifest captures state, context summary, decisions, open questions, and handoff guidance at every checkpoint. Type /continue and Sage reads the manifest, routes to the correct workflow, and preserves the judgment context that would otherwise be lost.

Memory That Compounds

Sage Memory System — 3 skills, 1 MCP, compounding knowledge.

Most agent frameworks are stateless. The agent that made a mistake yesterday makes it again today. Sage has three skills that build institutional memory — all backed by sage-memory MCP:

  • sage-self-learning captures mistakes as WHEN/CHECK/BECAUSE prevention rules. Every session starts by searching past mistakes before doing anything.
  • sage-memory stores project knowledge as focused prose insights — how your auth works, why billing uses event sourcing, what conventions the team follows.
  • sage-ontology maps entity relationships — not just "billing exists" but "billing depends on payments, which triggers webhooks, which notify users." Touch one module, know the blast radius.

Day 1, the agent knows nothing. Day 30, it knows your codebase's landmines, patterns, and conventions.
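
As an illustration of the "blast radius" idea, the sketch below stores relationships as directed edges and walks them breadth-first. The edges mirror the billing example above; the `blast_radius` function, the tuple layout, and the choice to follow edges in their stated direction are assumptions for illustration, not sage-ontology's actual storage format.

```python
from collections import deque

# Hypothetical edge list: (source, relation, target), taken from the example above.
EDGES = [
    ("billing", "depends_on", "payments"),
    ("payments", "triggers", "webhooks"),
    ("webhooks", "notify", "users"),
]

def blast_radius(start, edges):
    """Return every entity reachable from `start`, in breadth-first order."""
    neighbors = {}
    for source, _relation, target in edges:
        neighbors.setdefault(source, []).append(target)

    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

print(blast_radius("billing", EDGES))  # ['payments', 'webhooks', 'users']
```

A real ontology would likely also need the reverse traversal (who depends on the module being changed); that is the same walk over inverted edges.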

Get Started

Install

curl -fsSL https://raw.githubusercontent.com/xoai/sage/main/install.sh | bash

Works on macOS and Linux. On Windows, use Git Bash or WSL:

# Windows — open Git Bash, then:
curl -fsSL https://raw.githubusercontent.com/xoai/sage/main/install.sh | bash

All sage commands run in bash. On Windows, use Git Bash or WSL for both installation and daily use.

Path A: New Project (Greenfield)

sage new my-app                  # scaffold a new project with Sage
cd my-app

Open the project in your IDE, then follow a natural progression:

/sage                            # 1. describe what you want to build
                                 #    Sage classifies intent, detects gaps,
                                 #    and recommends the right workflow

/research                        # 2. (optional) user interviews → JTBD →
                                 #    opportunity map — understand the problem
                                 #    before solutioning

/architect                       # 3. (optional) system design → ADRs →
                                 #    milestone plan — for non-trivial systems

/build                           # 4. spec → plan → build-loop → quality gates
                                 #    auto-review, TDD, coding principles, auto-QA

Not every project needs every step. A simple feature can go straight to /build. A complex product benefits from /research → /design → /architect → /build. Sage tells you what you're skipping and lets you decide.

Path B: Existing Project (Brownfield)

cd your-project
sage init                        # interactive — detects stack, asks for preset
sage init --preset startup       # or pick a preset directly
sage init --prefix               # namespace commands as sage:build, sage:fix, etc.

Available presets: base (default), startup, enterprise, opensource. Presets add engineering principles on top of the universal base (TDD, no secrets, explicit deps). Configure later in .sage/config.yaml.

Then teach Sage your codebase:

# 1. Set up persistent memory (one-time)
sage setup memory                # configures sage-memory MCP server

# 2. Learn your codebase (run inside your IDE)
sage learn                       # broad scan — architecture, patterns, conventions
sage learn src/billing           # deep dive — learn a specific module

After install, sage upgrade will prompt to upgrade the sage-memory package when a newer version is available on PyPI, and sage update syncs the latest skill prose into your project automatically — no manual sage-memory install-skills invocations required.

After learning, Sage knows your conventions, architecture, and landmines. Every future session starts by searching this memory — no more explaining context from scratch.

Then work naturally:

/sage                            # describe your task — Sage reads memory,
                                 # checks for work in progress, and routes
                                 # to the right workflow

/fix                             # diagnose → scope → fix → verify
                                 # reads prior QA reports and design reviews

/build                           # spec → plan → build-loop → quality gates
                                 # reads prior research, design specs, ADRs

/autoresearch                    # autonomous iteration toward a metric
                                 # modify → commit → verify → keep/revert

/continue                        # resume where you left off — reads the
                                 # cycle manifest for full context handoff

Upgrade

sage upgrade   # pulls latest Sage framework from GitHub
sage update    # regenerates platform files, preserves .sage/ state

sage update regenerates CLAUDE.md, commands, workflows, and gate scripts while preserving your project state (decisions, work artifacts, memory). You may need to restart your IDE to load latest configs.

CLI Commands

Run in your terminal:

| Command | What It Does |
|---|---|
| sage new <n> | Create a new project with Sage |
| sage init | Add Sage to the current directory |
| sage update | Regenerate platform files after changes |
| sage upgrade | Update Sage to the latest version |
| sage learn [path] | Learn a codebase or module |
| sage setup memory | Configure persistent memory (sage-memory MCP) |
| sage find <query> | Search skills.sh catalog (90K+ skills) |
| sage add <source> | Install skills from owner/repo, URL, or local path |
| sage add <source> --skill <n> | Install a specific skill from a repo |
| sage remove <skill> | Remove a skill from project |
| sage skills | List installed skills |
| sage update [target] | Update community skills to latest |

How Sage Works

Routing

Sage Routing — 3-layer funnel from keywords to confirmation.

Three layers, deterministic first:

  1. Keywords (instant) — "build" → /build, "fix" → /fix, "audit" → /analyze. Handles 60-70% of requests with zero LLM judgment.
  2. Sub-agent classifier (focused) — independent context, single job: classify into UNDERSTAND / ENVISION / DELIVER / REFLECT.
  3. Confirmation (human decides) — 2-3 options with skill chains visible. The user confirms before anything runs.
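
The three layers above can be sketched as follows. This is a minimal illustration, not Sage's router: the keyword table contains only the three examples given above, and `classify` and `ask` are hypothetical callables standing in for the sub-agent classifier and the user prompt.

```python
import re

# Keyword table from the examples above; a real table would be larger.
KEYWORD_ROUTES = {"build": "/build", "fix": "/fix", "audit": "/analyze"}

def route(request, classify):
    """Return (layer, proposal).

    Layer 1 is a whole-word keyword match with no model involved. Layer 2
    defers to `classify`, a hypothetical stand-in for the sub-agent classifier.
    """
    words = re.findall(r"[a-z]+", request.lower())
    for word in words:
        if word in KEYWORD_ROUTES:
            return ("keyword", KEYWORD_ROUTES[word])
    return ("classifier", classify(request))

def confirm(proposal, ask):
    """Layer 3: nothing runs unless the user approves the proposal."""
    return proposal if ask(f"Run {proposal}? [y/n] ") == "y" else None

print(route("fix the login bug", classify=lambda r: "DELIVER"))
print(route("why do users churn?", classify=lambda r: "UNDERSTAND"))
```

Matching whole words rather than substrings keeps a request like "add a prefix option" from being routed to /fix by accident.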

Slash Commands

Use inside your IDE (Claude Code, Antigravity):

| Command | What It Does |
|---|---|
| /sage | Start here. Routes via keywords → classify → confirm |
| /build | Spec → plan → build-loop → quality gates (with auto-review, coding principles, auto-QA). Accepts --quality-locked, --autonomous |
| /fix | Diagnose → scope → fix → verify (reads QA and design-review reports) |
| /architect | Elicit → design → milestone plan → phased build (with ADR auto-review). Accepts --quality-locked, --autonomous |
| /research | Interview → JTBD → opportunity map |
| /design | Brief → spec → copy (reads research context) |
| /analyze | UX audit → evaluation → findings |
| /qa | Browser-based functional testing (optional Lightpanda MCP) |
| /design-review | Design quality audit + AI slop detection + design system compliance |
| /review | Independent artifact evaluation via sub-agent delegation |
| /autoresearch | Autonomous iteration toward a measurable metric (reduce bundle, increase coverage) |
| /map | Build ontology knowledge graph: modules, services, dependencies |
| /learn | Codebase scan → memory storage |
| /reflect | Review cycle → extract learnings → seed next cycle |
| /continue | Resume any active cycle with full context |
| /status | Compute project state from artifacts |

Workflow Flags (/build and /architect)

Two optional flags change how the workflow operates without changing what it produces:

| Flag | Effect |
|---|---|
| --quality-locked | At each review checkpoint, loop review/revise until findings are clean (no Critical, no Major, only cosmetic Minor) or cap hit (10 iterations). Use when you want Sage to push for a clean output bar. |
| --autonomous | Skip user-facing elicitation. Agent makes brief/spec/plan decisions by reading memory, codebase patterns, constitution principles, and prior cycles. Every decision cites its source. Unconfident substantive decisions fall back to asking. Use when you want Sage to draft a recommended approach from your project's context. |

/build --quality-locked                       # interactive, quality-locked
/build --autonomous "ship dark mode"          # autonomous decisions, normal review
/build --autonomous --quality-locked "..."    # full autonomy, quality-locked
/architect --autonomous "design billing v2"

Flags are independent and combinable. Both have hard iteration caps and explicit cap-reached prompts — no runaway behavior. Flag state persists in the cycle manifest, so /continue restores them.
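
The --quality-locked behavior amounts to a bounded review/revise loop. The sketch below shows one way such a loop could look; `review` and `revise` are hypothetical callables, and counting one iteration per review pass is an assumption about how the cap of 10 is measured.

```python
def quality_locked(artifact, review, revise, cap=10):
    """Loop review/revise until no Critical or Major findings remain, or the cap is hit.

    `review(artifact)` returns a list of (severity, message) tuples and
    `revise(artifact, findings)` returns a new artifact; both are hypothetical.
    Returns (artifact, reviews_run, clean).
    """
    for iteration in range(1, cap + 1):
        findings = review(artifact)
        blocking = [f for f in findings if f[0] in ("Critical", "Major")]
        if not blocking:
            # Minor findings alone do not block: the bar is "clean enough".
            return artifact, iteration, True
        if iteration < cap:
            artifact = revise(artifact, blocking)
    # Cap reached: hand back the last reviewed artifact and let the user decide.
    return artifact, cap, False
```

Returning a `clean` flag rather than raising keeps the cap-reached case an explicit decision point instead of a failure.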

Project-level defaults

Set defaults in .sage/config.yaml so the flags apply automatically to every /build, /architect, and /fix invocation:

quality_locked: true        # always loop review until clean
autonomous: false           # use interactive elicitation

The agent announces active modes and their source at workflow start:

Sage → build workflow.
Modes: --quality-locked (from .sage/config.yaml)
Goal: Ship dark mode

Per-run override: the --no-quality-locked and --no-autonomous flags disable a config default for a single run:

/build --no-quality-locked "quick typo fix"   # override config default

Precedence (highest wins): --no-X flag → --X flag → config default → off. Passing both --X and --no-X is an error.
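
The precedence rule can be written as a small pure function. This is a sketch of the rule as stated above, not Sage's argument parser; `resolve_mode` and its parameter names are illustrative.

```python
def resolve_mode(flag=False, no_flag=False, config_default=None):
    """Resolve one mode (for example quality-locked) from CLI flags and config.

    Precedence, highest first: --no-X, --X, config default, off.
    """
    if flag and no_flag:
        raise ValueError("--X and --no-X cannot be combined")
    if no_flag:
        return False
    if flag:
        return True
    if config_default is not None:
        return bool(config_default)
    return False

# Config enables the mode, but the run passes --no-quality-locked.
print(resolve_mode(no_flag=True, config_default=True))  # False
```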

Interaction Patterns

Sage communicates clearly at every step:

  • Decision points: numbered options when you need to choose a direction.
  • Checkpoints: [A] Approve / [R] Revise shortcuts on deliverables.
  • Continuations: [C] Continue with a recommended next step.

Free-form input always works. These patterns guide, they don't constrain.

The Pipeline: UNDERSTAND → ENVISION → DELIVER → REFLECT

Sage Workflows — 14 commands chaining 37 skills across 4 phases.

Sage organizes work into four phases. Each phase has dedicated workflows that chain skills automatically:

UNDERSTAND              ENVISION               DELIVER              REFLECT
/research  /analyze     /design  /architect    /build  /fix         /reflect
/learn     /map                                /autoresearch
                                               /review  /qa
                                               /design-review

/research chains user-interview → JTBD → opportunity-map. /design chains ux-brief → ux-specify → ux-writing and reads research findings automatically. /build chains spec → plan → build-loop → quality-gates and reads design specs. /reflect reviews the full cycle, extracts WHEN/CHECK/BECAUSE learnings, and seeds the next cycle with concrete recommendations.

You can enter at any phase. But the further right you start, the more you're building on assumptions.

Enforcement Model

Sage Enforcement — 5 independent layers.

Agents rationalize. Tell them "MUST write spec" and they'll decide the conversation IS the spec. Every instruction that requires interpretation will be reinterpreted. Sage solves this with 5 independent enforcement layers and observable conditions that can't be argued away.

Layer 1 — Always-on rules in the system prompt. Even if nothing else loads, the gates prevent the worst violations. Eight rules covering memory-before-work, spec-first, artifact-only state, checkpoints (no unilateral deferral), self-check, decisions logging, learning from corrections, and skills-before-assumptions.

Layer 2 — Command preambles. Every slash command has enforcement rules the agent reads before its first token. Named rationalizations are blocked: "the design is clear" → NOT a spec file.

Layer 3 — Capabilities loaded at the right workflow step. build-loop orchestrates task-by-task execution. coding-principles enforces 7 universal quality standards. tdd enforces test-first. systematic-debug structures root cause investigation.

Layer 4 — Bash gate scripts. Deterministic. Run BEFORE the agent reviews. sage-verify.sh runs your test suite, sage-hallucination-check.sh verifies imports exist, sage-spec-check.sh confirms deliverables match the plan. The script says tests fail → gate fails, regardless of what the agent thinks.

Layer 5 — Self-learning. Corrections from past sessions are stored as WHEN/CHECK/BECAUSE rules and searched before every Standard+ task. The agent reads its own past failures before repeating them.

Every rule is an observable condition, not an action instruction. "spec.md MUST EXIST on disk" is binary — the agent can't argue a file into existence. "MUST write spec" is rationalizable — the agent decides the conversation is the spec. File existence beats language.

The agent must bypass all five layers to skip the spec. Each layer is independently enforceable.
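
To make Layer 4 concrete, here is a rough Python illustration of what "verifies imports exist" can mean for a Python file. Sage's actual gate is the bash script sage-hallucination-check.sh, whose implementation is not shown here; the `missing_imports` function below is a hypothetical stand-in that only checks top-level, absolute imports against the current environment.

```python
import ast
import importlib.util

def missing_imports(source):
    """Return top-level module names imported by `source` that cannot be found."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # not an import, or a relative import we do not resolve
        for name in names:
            top_level = name.split(".")[0]
            if importlib.util.find_spec(top_level) is None and top_level not in missing:
                missing.append(top_level)
    return missing

code = "import json\nimport definitely_not_a_real_pkg_xyz\nfrom os import path\n"
print(missing_imports(code))  # ['definitely_not_a_real_pkg_xyz']
```

The check is deterministic: a non-empty result fails the gate no matter how confident the agent is that the package exists.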

Independent Reviews (Sub-Agent)

Sage Quality Chain.

Sage delegates the review points listed below to sub-agents with independent context windows. The producing agent's conversation history, where self-bias lives, is not included.

| Review Point | When | What the Sub-Agent Checks |
|---|---|---|
| Auto-review: spec | After spec [A] | Framing alignment, testable criteria, boundary completeness, edge cases, consistency |
| Auto-review: plan | After plan [A] | Spec-plan alignment, task decomposition, dependencies, coverage gaps, risk |
| Auto-review: ADR | After design [A] in /architect | Trade-off analysis, migration path, risk assessment, blast radius, reversibility |
| Auto-review: root cause | After diagnosis [A] in /fix | Evidence quality, symptom vs cause, alternative causes, reproduction chain |
| Auto-review: fix plan | After fix plan [A] in /fix | Root cause coverage, file completeness, test strategy, regression risk |
| Gate 3: code quality | During quality gates | Readability, error handling, security, performance, conventions |
| Auto-QA | After gates pass | Spec-implementation alignment, test coverage, error handling, boundaries, integration, coding principles |

All are advisory — the user can always [P] Proceed. Findings are logged to decisions.md for /reflect to learn from.

Requires Claude Code's Task tool. When Task tool is not available (e.g., Antigravity), reviews are skipped silently.

Coding Principles

Seven universal principles loaded during implementation — not a post-hoc checklist, but a mindset active AS code is written:

  1. Clarity over cleverness — descriptive names, obvious flow, no tricks
  2. Fail loudly, recover gracefully — every external call has error handling
  3. Guard the boundaries — validate at every entry point
  4. Smallest scope, shortest lifetime — local over global, pure over stateful
  5. Make the right thing easy — APIs that invite correct usage
  6. Consistency beats perfection — match the existing codebase
  7. Test what matters — test behavior and boundaries, not implementation

Language-agnostic. Apply to Python, TypeScript, Go, Rust, anything. Stack skills add language-specific idioms on top.

Constitution Stack

Sage uses a three-tier constitution model:

Base (5 principles, all projects) — TDD, no silent failures, no secrets in code, explicit dependencies, reversible changes.

Preset (chosen during init) — startup (ship small, monolith first), enterprise (auth everywhere, audit trails, postmortems), or opensource (docs mirror code, semver contract).

Project additions — your own principles in .sage/config.yaml.

The generator merges all three tiers into the always-on instructions. Lower tiers add constraints but cannot remove inherited ones.
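
The "add but never remove" rule can be illustrated with an ordered union. This is a sketch under stated assumptions: the base and startup principles are the ones named above, while `merge_constitution` and the project-level principle in the usage example are hypothetical.

```python
BASE = ["TDD", "no silent failures", "no secrets in code",
        "explicit dependencies", "reversible changes"]
STARTUP = ["ship small", "monolith first"]

def merge_constitution(base, preset, project):
    """Merge tiers in order; later tiers can only add principles, never drop them."""
    merged = []
    for tier in (base, preset, project):
        for principle in tier:
            if principle not in merged:
                merged.append(principle)
    return merged

# "TDD" is repeated by the project tier and is kept once; the second
# project principle is a made-up example.
merged = merge_constitution(BASE, STARTUP, ["TDD", "feature flags for risky changes"])
print(len(merged))  # 8
```

Because the merge is a union, there is no input a preset or project tier can supply that removes a base principle.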

Skills

Philosophy

Skills are Sage's knowledge architecture — a principled way to put LLMs in the best position to do excellent work.

Every skill uses progressive disclosure: a short description triggers activation, SKILL.md provides the full process, and reference files offer depth when needed. This mirrors how experts work — you don't recite the entire textbook before solving a problem. You know what you know, and you reach for references when the situation demands it.

Skills are designed to maximize LLM capabilities. Clear structure (frontmatter, process steps, quality criteria) gives the agent unambiguous guidance. Domain vocabulary in the right places improves reasoning. Reference material separated from instructions keeps the agent focused on the task, not on parsing a wall of text.

Built-in Skills (38)

Sage ships with skills across four domains:

  • Product management — JTBD, opportunity mapping, user interviews, PRDs, problem-solving
  • UX design — audit, evaluate, discovery, brief, specify, writing, heuristic review, research, plan-tasks
  • Engineering — React, React Native, Next.js, Flutter, web, mobile, API, BaaS, plus full-stack presets (Next.js + Supabase, Flutter + Firebase, React Native + Expo, Next.js fullstack)
  • Framework — memory, ontology, self-learning, autoresearch, skill-builder, and research packs (discover, draft, observe, source-process, validate)

Community Ecosystem (powered by skills.sh)

Search and install from 90K+ community skills:

sage find react                                         # search skills.sh
sage add vercel-labs/agent-skills                       # browse + pick from multi-skill repo
sage add vercel-labs/agent-skills --skill frontend-design  # install specific skill
sage add ./my-local-skills                              # install from local path
sage remove frontend-design                             # uninstall

Skills install to sage/skills/ and auto-deploy to your platform (.claude/skills/ loader stubs for Claude Code, full copies to .agent/skills/ for Antigravity).

Contributing is deliberately simple. Drop a folder with a SKILL.md into sage/skills/ and it works. Add Sage frontmatter (type, tags, relationships) for smarter integration.

Configuration

Sage configuration lives in .sage/config.yaml:

sage-version: "1.1.1"
project-name: "my-app"
detected-stack: [react, typescript]
auto_review: true          # sub-agent review after spec/plan approval
auto_qa: true              # sub-agent QA after quality gates
independent_gate3: true    # sub-agent code quality review (Gate 3)
command_prefix: false      # prefix commands as sage:build, sage:fix, etc.

All toggles default to true (except command_prefix). Set to false to disable:

| Setting | What It Controls |
|---|---|
| auto_review | Sub-agent review of spec, plan, and ADR after approval |
| auto_qa | Sub-agent code verification after quality gates pass |
| independent_gate3 | Sub-agent code quality review at Gate 3 (falls back to self-review) |
| command_prefix | Namespace all commands as sage:build, sage:fix, etc. (set via --prefix flag) |

Multi-Agent (optional)

For non-trivial work where independent review changes your mind, Sage offers an opt-in cross-model build cycle. The host (Claude Code, Opus) keeps the planner role and orchestrates; external CLIs handle adversarial review and implementation:

brief → spec → external spec review (loop) → plan → external plan review
      → external implement → external code review (loop) → reflect

Defaults: Codex CLI (gpt-5-codex) reviews specs/plans and code; Kimi CLI implements. All bindings live in a single config file you can edit:

# .sage/agents.toml — swap any role's tool with a one-line change
[roles.code_reviewer]
agent = "codex"
model = "gpt-5-codex"
mode  = "read-only"

Install per project (Python 3.11+, plus whatever CLIs you bind):

cd my-project
sage setup multi-agent          # adds /build-x, /review-spec, /review-plan,
                                # /implement, /review-code — never shadows /build
sage setup multi-agent --remove # clean uninstall, user edits backed up

The augmented cycle reuses Sage's existing /architect, /research, and /design workflows where they fit, then layers external review and external implementation on top. The setup survives sage update: your .sage/agents.toml and .sage/prompts/ are never touched, while framework-owned scripts and command files refresh from the template with drift detection ([K]eep | [R]eplace | [D]iff if you've edited locally).

Claude Code only in v1.

Learn more:

  • docs/multi-agent.md — comprehensive user guide (install, configure, daily use, customize, troubleshoot)
  • runtime/multi-agent/README.md — contributor-facing (template layout, ownership split, how to test)
  • .sage/docs/multi-agent.md (post-install) — protocol contract, schema, integration points

Project State

When Sage runs in your project, it manages state in .sage/:

.sage/
├── config.yaml              # Project config — preset, stack, toggles
├── decisions.md             # Append-only decision log (never edited, never summarized)
├── conventions.md           # Project conventions (enriched by codebase-scan)
├── docs/                    # Project knowledge (analyses, ADRs, research)
│   ├── decision-*.md        # Architecture Decision Records
│   ├── ux-audit-*.md        # UX audit findings
│   ├── jtbd-*.md            # Jobs-to-be-Done analysis
│   └── reflect-*.md         # Cycle reflections with learnings
├── work/                    # Per-initiative deliverables
│   └── YYYYMMDD-slug/
│       ├── brief.md         # Scope definition (medium+ tasks)
│       ├── spec.md          # Feature specification
│       ├── plan.md          # Implementation plan with tasks
│       ├── manifest.md      # Cycle state + handoff context
│       ├── qa-report.md     # QA test results (from /qa)
│       └── design-review.md # Design audit findings (from /design-review)
└── gates/
    ├── gate-modes.yaml      # Which gates run per workflow mode
    └── scripts/             # Deterministic verification scripts

Artifact-only state. There is no progress.md or state file that the agent summarizes. The artifacts ARE the state: spec.md exists = spec phase complete. plan.md exists = planning done. File existence is binary — the agent can't hallucinate a file into existence.
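
Deriving state from artifacts can be as simple as checking which files exist. The sketch below covers only the two artifacts named above; `cycle_state` and the dictionary keys are hypothetical, and the real /status command may compute more than this.

```python
from pathlib import Path

def cycle_state(work_dir):
    """Derive phase completion purely from which artifacts exist on disk."""
    work_dir = Path(work_dir)
    return {
        "spec_complete": (work_dir / "spec.md").is_file(),
        "plan_complete": (work_dir / "plan.md").is_file(),
    }
```

For a directory such as .sage/work/YYYYMMDD-slug/ that contains spec.md but no plan.md, this returns spec_complete as True and plan_complete as False. No summary file is consulted, so there is nothing for the agent to misreport.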

decisions.md is newest-first. Existing entries are never edited; the agent prepends each new entry after the header, so recent context is always read first. When the file exceeds ~200 lines, old entries are archived to decisions-{date}.md.

Platforms

Sage is platform-agnostic. It works wherever AI agents work.

| Platform | How Sage Integrates | Sub-agent reviews | Status |
|---|---|---|---|
| Claude Code | CLAUDE.md + .claude/commands/ (markdown) | Task tool | Full |
| Antigravity | GEMINI.md + .agent/ (markdown) | Native | Full |
| Codex (OpenAI) | AGENTS.md + .codex/agents/ (TOML sub-agents) | Native (TOML) | Full |
| Opencode | AGENTS.md + .opencode/{commands,agents}/ (markdown) | Native (markdown) | Full |
| Gemini CLI | GEMINI.md + .gemini/commands/ (TOML) | Single-pass fallback in v1 | Full |
| Claude Code Plugin | Plugin format, installed with /plugin install sage@xoai | Task tool | Full |

Six distribution paths from one source:

Sage Framework (source of truth)
    ├── generate-claude-code.sh   → CLAUDE.md + .claude/
    ├── generate-antigravity.sh   → GEMINI.md + .agent/
    ├── generate-codex.sh         → AGENTS.md + .codex/agents/
    ├── generate-opencode.sh      → AGENTS.md + .opencode/{commands,agents}/
    ├── generate-gemini-cli.sh    → GEMINI.md + .gemini/commands/
    └── generate-plugin.sh        → sage-plugin/ (Claude Code plugin)

All in-project paths share the same .sage/ project state. Multiple platforms can be installed simultaneously — Sage detects them and generates files for each. AGENTS.md is shared between Codex and Opencode; GEMINI.md is shared between Antigravity and Gemini CLI.
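
The file sharing described above falls out naturally if each platform maps to the instruction file it reads. In this sketch the mapping values come from the generator list above; the keys "claude-code" and "antigravity" are assumed identifiers (only "codex", "opencode", and "gemini-cli" appear as --platform values in this document), and `instruction_files` is a hypothetical helper.

```python
# Instruction file each in-project platform reads, per the generator list above.
INSTRUCTION_FILE = {
    "claude-code": "CLAUDE.md",   # assumed identifier
    "antigravity": "GEMINI.md",   # assumed identifier
    "codex": "AGENTS.md",
    "opencode": "AGENTS.md",
    "gemini-cli": "GEMINI.md",
}

def instruction_files(platforms):
    """Return the distinct instruction files needed for the selected platforms."""
    unknown = [p for p in platforms if p not in INSTRUCTION_FILE]
    if unknown:
        raise ValueError(f"unknown platform(s): {', '.join(unknown)}")
    return sorted({INSTRUCTION_FILE[p] for p in platforms})

print(instruction_files(["codex", "opencode"]))  # ['AGENTS.md']
```

Selecting all five in-project platforms yields three instruction files, not five, because two pairs share one file each.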

Installing for a specific platform

sage init                              # detect existing platforms; ask if none
sage init --platform codex             # explicit: just Codex
sage init --platform codex,opencode    # multiple
sage init --platform all               # all 5 platforms
sage update                            # regenerate using the persisted list
sage update --platform gemini-cli      # override on update

The selected platforms persist in .sage/config.yaml under platforms:. sage update reads this list and regenerates for each.

Why sage/ Lives in Your Project

Sage copies its framework source into each project. This is intentional:

  • Self-contained. No external dependencies. Works offline.
  • Version-locked. Your project uses the exact version you installed. No surprise updates. Upgrade when you're ready.
  • Inspectable. Read any skill, workflow, or capability. No magic. If something isn't working, you can see exactly what it's doing.
  • Portable. Clone the repo and everything is there. No global installs, no PATH configuration, no package managers.

If you prefer managed installs, the Claude Code plugin offers the same functionality without in-project files.

License

MIT
