StaffOS turns every Claude Code session into a verified engineering record: intent, files touched, tests run, risks found, decisions made, and the human review required.
AI coding agents are becoming the worker. The hard part is no longer writing code; it's trusting, reviewing, governing, and remembering what the agent did.
A normal dashboard says "Claude Code ran." StaffOS says "Claude Code made this change, with this risk level, these tests, these missing checks, and this documentation requirement."
Every session on a branch produces a living Run Passport, a structured, versioned record that answers the questions a reviewer actually asks:
| Section | What it records |
|---|---|
| Intent | What was the goal? |
| Files touched | What was inspected and changed? |
| Test evidence | What passed, what failed, what wasn't run? |
| Risk level | Deterministic rules: auth/payments/infra = high |
| Missing checks | The gaps to close before merge |
| Human review | Is sign-off required? |
| Documentation | ADRs, decision logs, and PR summaries to save |
Passports mutate as work continues and capture immutable version snapshots at each milestone, giving a clear timeline of how a change matured.
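The split between a mutable passport and immutable version snapshots can be sketched in plain Ruby. This is a minimal illustration, not StaffOS's actual model layer: the `RunPassport` and `PassportVersion` classes, their fields, and the milestone label are all assumptions made for the example.

```ruby
# Hypothetical sketch: class names and fields are illustrative assumptions.
PassportVersion = Struct.new(:number, :milestone, :data, keyword_init: true)

class RunPassport
  attr_reader :versions

  def initialize(intent:)
    @current = { intent: intent, files_touched: [], missing_checks: [] }
    @versions = []
  end

  # The live passport keeps changing as the session continues.
  def record_file(path)
    @current[:files_touched] << path unless @current[:files_touched].include?(path)
  end

  # Callers get a copy, so they cannot mutate the live state by accident.
  def current
    deep_copy(@current)
  end

  # Freeze a deep copy so later mutations cannot alter the snapshot.
  def snapshot!(milestone)
    version = PassportVersion.new(
      number: @versions.size + 1,
      milestone: milestone,
      data: deep_freeze(deep_copy(@current))
    )
    @versions << version.freeze
    version
  end

  private

  def deep_copy(obj)
    Marshal.load(Marshal.dump(obj))
  end

  def deep_freeze(obj)
    case obj
    when Hash  then obj.each_value { |v| deep_freeze(v) }
    when Array then obj.each { |v| deep_freeze(v) }
    end
    obj.freeze
  end
end

passport = RunPassport.new(intent: "Add password reset")
passport.record_file("app/models/user.rb")
v1 = passport.snapshot!("tests passing")
passport.record_file("app/mailers/reset_mailer.rb")

puts v1.data[:files_touched].size           # 1
puts passport.current[:files_touched].size  # 2
```

The snapshot taken at the first milestone still lists one file after the live passport has moved on to two, which is the timeline property described above.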
StaffOS is deliberate about when it spends tokens (configurable, see below):
| Mode | What it does | Model |
|---|---|---|
| Passive | Records events, builds the timeline. No LLM call. | None |
| Smart Summary | Summarises the session and extracts review gaps. | Claude Haiku 4.5 |
| Full Council | Six personas (Staff Eng, SRE, Security, PM, Devil's Advocate, Writing Coach) review the change. | Claude Opus 4.8 |
Risk level always stays deterministic (spec §15). The LLM enriches the narrative and findings; it never overrides the rules. Only metadata is sent (file paths, categories, test summaries), never your source code. No API key? Every mode degrades gracefully to deterministic heuristics.
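One way to picture "the LLM enriches, the rules decide" is a scorer that works on file paths alone and an assembler that ignores any risk level the model suggests. The sketch below is an assumption-laden illustration: only the "auth/payments/infra = high" rule comes from this README, while the path patterns, method names, and result shape are hypothetical.

```ruby
# Hypothetical path patterns; the real rule set lives in the risk engine and the spec.
CATEGORY_PATTERNS = {
  auth:     %r{(\A|/)(auth|devise|sessions)(/|_|\.)},
  payments: %r{(\A|/)(payments?|billing)(/|_|\.)},
  infra:    %r{(\A|/)(config/deploy|\.github/workflows|Dockerfile)}
}.freeze

HIGH_RISK_CATEGORIES = %i[auth payments infra].freeze

# Categorise a path using metadata only (no file contents are read).
def categories_for(path)
  CATEGORY_PATTERNS.select { |_, pattern| path.match?(pattern) }.keys
end

# Same input always gives the same answer: no model call involved.
def deterministic_risk(paths)
  touched = paths.flat_map { |path| categories_for(path) }.uniq
  (touched & HIGH_RISK_CATEGORIES).any? ? :high : :low
end

# The LLM result may add a narrative and findings, but any risk level it
# proposes is discarded in favour of the rule-based one.
def build_assessment(paths, llm_result = nil)
  llm = llm_result || {}
  {
    risk_level: deterministic_risk(paths),
    narrative: llm[:narrative],
    findings: llm.fetch(:findings, [])
  }
end

assessment = build_assessment(
  ["app/controllers/auth/sessions_controller.rb", "README.md"],
  { risk_level: :low, narrative: "Small refactor.", findings: ["No test for lockout"] }
)

puts assessment[:risk_level]  # high
```

Calling `build_assessment(paths)` with no LLM result returns the same risk level with an empty findings list, which mirrors the no-API-key fallback.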
Claude Code ─▶ Claude Code hooks ─▶ StaffOS connector ─▶ Ingestion API
                                                                │
                                                                ▼
                                                         Run event store
                                                                │
                                                                ▼
Web dashboard ◀─ Documentation engine ◀─ Passport generator ◀─ Risk engine
Everything is scoped to a Project (a repo) → Workstream (a branch) → sessions, events, passport, council reviews, documents, and memory. On merge, a workstream's knowledge is promoted to the project's permanent knowledge base. Nothing leaks across projects.
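The scoping and promote-on-merge behaviour can be illustrated with two small in-memory classes. This is a sketch under stated assumptions rather than the real schema: `Project`, `Workstream`, `remember`, `merge!`, and `promote` are hypothetical names chosen for the example.

```ruby
# Hypothetical in-memory model; the real app persists these records in PostgreSQL.
class Workstream
  attr_reader :branch, :memory

  def initialize(project, branch)
    @project = project
    @branch = branch
    @memory = []
    @merged = false
  end

  # Notes stay local to this branch until it merges.
  def remember(note)
    @memory << note
  end

  # Promote branch memory to the owning project exactly once.
  def merge!
    return false if @merged

    @project.promote(@memory)
    @merged = true
  end
end

class Project
  attr_reader :name, :knowledge_base

  def initialize(name)
    @name = name
    @knowledge_base = []
    @workstreams = {}
  end

  # One workstream per branch, always owned by this project.
  def workstream(branch)
    @workstreams[branch] ||= Workstream.new(self, branch)
  end

  def promote(notes)
    @knowledge_base.concat(notes)
  end
end

shop = Project.new("shop")
blog = Project.new("blog")
shop.workstream("feature/reset").remember("Reset tokens expire after 30 minutes")
blog.workstream("feature/reset").remember("Drafts autosave every 10 seconds")
shop.workstream("feature/reset").merge!

puts shop.knowledge_base.size  # 1
puts blog.knowledge_base.size  # 0
```

Both projects use the same branch name, yet merging in one leaves the other's knowledge base untouched because a workstream only ever holds a reference to its own project.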
# 1. Install dependencies and prepare the database
bin/setup # or: bundle install && bin/rails db:prepare db:seed
# 2. Run the app
bin/dev # http://localhost:3000
# 3. (optional) Enable AI reviewers
export ANTHROPIC_API_KEY=sk-ant-...
cli/staffos login # enter your endpoint + project token
cli/staffos init # writes .staffos.yml and installs Claude Code hooks
# ...use Claude Code normally; events are captured automatically...
cli/staffos passport # see the latest passport for your branch
cli/staffos disconnect # remove credentials and hooks
The init command installs HTTP hooks (SessionStart, UserPromptSubmit, pre/post tool use, Stop) into .claude/settings.json. Claude Code POSTs each event to StaffOS, which routes it to the right Workstream by branch.
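Routing by branch might look roughly like the following. The payload shape is an assumption: the `project`, `branch`, and `event` keys are hypothetical stand-ins for whatever the StaffOS connector actually sends, and `EventRouter` is an in-memory illustration rather than the ingestion API itself.

```ruby
require "json"

# Hypothetical router: groups incoming hook events by (project, branch).
class EventRouter
  def initialize
    @workstreams = Hash.new { |hash, key| hash[key] = [] }
  end

  # Parse one POSTed event body and file it under its workstream key.
  # KeyError is raised if the assumed "project" or "branch" keys are missing.
  def ingest(raw_json)
    event = JSON.parse(raw_json)
    key = [event.fetch("project"), event.fetch("branch")]
    @workstreams[key] << event
    key
  end

  # fetch with a default avoids creating empty workstreams on lookup.
  def events_for(project, branch)
    @workstreams.fetch([project, branch], [])
  end
end

router = EventRouter.new
router.ingest({ project: "shop", branch: "feature/reset", event: "SessionStart" }.to_json)
router.ingest({ project: "shop", branch: "main", event: "Stop" }.to_json)

puts router.events_for("shop", "feature/reset").map { |e| e["event"] }.inspect
# ["SessionStart"]
```

Keying on the project and branch together keeps two repos with identically named branches from sharing a workstream.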
| Environment variable | Purpose | Default |
|---|---|---|
| ANTHROPIC_API_KEY | Enables Smart Summary + Full Council. Unset → heuristics. | None |
| STAFFOS_SUMMARY_MODEL | Model for smart summaries | claude-haiku-4-5 |
| STAFFOS_COUNCIL_MODEL | Model for council reviews | claude-opus-4-8 |
The key can also live in Rails encrypted credentials under anthropic.api_key.
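As a rough sketch of how these settings could be resolved, the helper below reads the three variables, applies the documented defaults, and reports whether the LLM modes are available. The `llm_settings` method and its return shape are hypothetical, and giving the environment variable precedence over the credentials value is an assumption. In a Rails app the second argument would plausibly come from `Rails.application.credentials.dig(:anthropic, :api_key)`.

```ruby
DEFAULT_SUMMARY_MODEL = "claude-haiku-4-5"
DEFAULT_COUNCIL_MODEL = "claude-opus-4-8"

# env can be ENV or any Hash; credentials_key is the value stored under
# anthropic.api_key, if any. Precedence (env first) is an assumption.
def llm_settings(env = ENV, credentials_key = nil)
  api_key = [env["ANTHROPIC_API_KEY"], credentials_key].find do |candidate|
    !candidate.to_s.strip.empty?
  end

  {
    # With no key, every mode falls back to deterministic heuristics.
    llm_enabled: !api_key.nil?,
    summary_model: env.fetch("STAFFOS_SUMMARY_MODEL", DEFAULT_SUMMARY_MODEL),
    council_model: env.fetch("STAFFOS_COUNCIL_MODEL", DEFAULT_COUNCIL_MODEL)
  }
end

puts llm_settings({})[:llm_enabled]    # false
puts llm_settings({})[:summary_model]  # claude-haiku-4-5
```

Passing a plain Hash instead of `ENV` keeps the helper easy to exercise in tests without touching the process environment.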
bin/rails test # full suite (Minitest)
COVERAGE=false bin/rails test # skip coverage instrumentation
The suite covers the risk engine, passport generation, the AI council's heuristic fallback, the document generator, the model layer, the Claude Code hook ingestion API, and the authenticated UI flows. Coverage is measured with SimpleCov and reported in CI.
| Workflow | Runs | Does |
|---|---|---|
| CI (ci.yml) | every push & PR | Minitest + coverage, RuboCop (Omakase), Brakeman, bundler-audit, importmap audit |
| Deploy (deploy.yml) | push to main | Kamal deploy, gated on KAMAL_REGISTRY_PASSWORD; skips cleanly until you wire up a server |
Rails 8.1 · Hotwire (Turbo + Stimulus) · Tailwind CSS · PostgreSQL · Solid Queue/Cache/Cable · Devise · Anthropic Ruby SDK · Kamal · Minitest + SimpleCov
app/services/          risk_scorer · passport_generator · council_runner
                       document_generator · llm_client
app/controllers/api/   Claude Code hook ingestion (v1)
cli/staffos            local connector CLI
db/                    schema, migrations, seed data
test/                  services · models · integration (API + UI)
SPEC.md                the full product brief