The Universal Terminal AI Code Agent
Installation • Quick Start • Features • Documentation • Contributing
Demo
Step-by-step animation, ~12 s loop. Each panel fades in as the agent takes the next action: prompt → read_file → edit_file (diff) → cargo test (green) → fix summary → cost line.
Step-by-step
- Prompt — user describes the failing test:

  ```sh
  $ hone "the test_divide test fails because divide(10, 0) raises ZeroDivisionError. fix it to return None."
  ```

- `read_file` — Hone opens `calculator.py` to see what's there:

  ```python
  def divide(a, b):
      return a / b
  ```

- `edit_file` — Hone applies a unified-diff patch:

  ```diff
  - return a / b
  + if b == 0:
  +     return None
  + return a / b
  ```

- `!cargo test` — Hone runs the test suite to verify:

  ```
  running 12 tests
  test test_divide_by_zero ... ok
  12 passed in 0.3s
  ```

- Done — the assistant reports the fix and the cost line:

  ```
  Fixed. divide() now returns None on zero division.
  claude-sonnet | $0.001 | ctx 18% | 5s
  ```
Play the recording locally
```sh
# Install asciinema
brew install asciinema   # or: pip install asciinema

# Play the recorded demo
asciinema play assets/demo.cast
```
The .cast source is at assets/demo.cast — recorded from a real Hone run on DeepSeek V3 via OpenRouter (~$0.001 per task). The embedded assets/demo.svg above is regenerated from it with svg-term --in assets/demo.cast --out assets/demo.svg --window --no-cursor.
Browser & desktop GUI demos (text walk-throughs)
Three independent surfaces ship alongside the TUI: a SolidJS web UI served by honed at /ui, a Tauri 2 desktop app with system tray and multi-daemon switcher, and the same daemon's HTTP/SSE API that anything else can embed against. See the Remote Access & Web UI and Desktop App sections below for layout sketches, walkthroughs of every panel (file preview, side-by-side diff, approval modal, task-tree DAG, sandbox / memory / provenance / evals viewers, artifact pane), the bundle profile, and the full daemon endpoint table.
Quick spin-up:
```sh
# Browser
make webui                                              # build the SPA
ANTHROPIC_API_KEY=sk-... cargo run --release -p honed   # serve at /ui
open http://127.0.0.1:3117/ui

# Desktop (Tauri 2 native shell — system tray, OS notifications, Cmd+K)
cargo build -p honed --release && cp target/release/honed /usr/local/bin/
cargo run -p hone-desktop
```
Overview
Hone is a privacy-first, provider-agnostic AI coding agent built entirely in Rust. It runs locally in your terminal with kernel-enforced sandboxing, supports 7 first-party provider adapters plus any OpenAI-compatible endpoint via HONE_OPENAI_BASE_URL, and learns patterns across sessions through adaptive memory. Work with your models, on your machine, under your rules.
Why Hone
Your models, your machine, your rules. Hone doesn't phone home. No telemetry. No vendor lock-in. Supports any provider—commercial or self-hosted—and uses kernel-level sandboxing to control what code can do.
Intelligence compounds. Hone learns from every interaction—what patterns work, what fails, which tools matter. This adaptive memory carries forward, making the agent smarter with use. Insights cross repo boundaries through semantic indexing.
Work from anywhere. Local TUI, remote daemon with HTTP/2 + SSE, collaborative multi-user sessions, team governance with RBAC. Run honed on a server, connect from any terminal.
Features
- Shell escape — Run native shell commands with `!command`: `!ls`, `!git status`, `!cargo test` show inline output. `!cd src` changes directory (confined to the project). `!nvim file.rs` suspends the TUI, runs the editor, and restores it afterward.
- Vim mode — Type `--vim` or `:set vim` for vim-style keybindings. Normal mode: `j`/`k` scroll, `gg`/`G` top/bottom, `i`/`a` insert, `:q` quit. Insert mode: type normally; `Esc` returns to normal.
- TDD mode — Type `:tdd [command]` to watch source files and auto-run tests on change (300 ms debounce). Shows PASS/FAIL inline. Type `:tdd off` to stop.
- Review mode — Type `:review` to enter hunk-by-hunk review of git diff changes. Navigate with `j`/`k`, accept/reject hunks with `a`/`r`, accept/reject all with `A`/`R`, quit with `q`.
- Session switcher overlay — Press `Ctrl+O` to browse and switch between saved sessions in a full-screen overlay showing name, last-updated relative time, and message count. Navigate with `j`/`k`, `Enter` to select, `q` to close.
- Expanded keybindings help — Press `Ctrl+?` to see 31 keybindings across 5 sections (Input, Navigation, Overlays, Agent control, Commands) with descriptions.
- Observability sidebar & session reports — Press `Ctrl+R` to toggle the Agent Rail sidebar showing files touched, context injected, and memory recalls per turn. On session end, detailed reports are auto-generated with cost, tokens, files, tools, and timelines. View with `hone reports list`/`show`/`diff`.
- Multi-agent orchestration — Decompose complex tasks via research-backed patterns (Anthropic, MASAI, CodeR). Orchestrator-worker pipeline with message merging. Features a concurrency limiter (`with_max_concurrent(n)` for semaphore-gated dispatch), health monitoring with per-worker timeouts (`with_worker_timeout(secs)` kills stuck agents), and crash-recovery checkpointing (`CheckpointStoreSQLite`; `Orchestrator::resume(run_id)` continues interrupted runs).
- Agent ensemble — Run N parallel candidates with varied temperatures, ranked with the MASAI Ranker (+5-8% accuracy improvement).
- Adaptive complexity routing — Auto-classify requests as Simple/Medium/Complex and route to the appropriate model with graduated compression.
- Adaptive hierarchical skeleton — Proactive context injection with intelligent strategy selection: auto-detects repository size and auto-selects detail level. 4-level hierarchy: directory → file list → signatures → call graph. CallGraph (default) shows function-level relationships, achieving a 100% patch rate on a SWE-bench sample (+30pp over V1). The adaptive strategy skips large files (>100 KB) and slow parses (>100 ms). Progressive budget filling ensures efficient token use. Disable with `--no-inject` or `HONE_NO_INJECTION=1`.
- Cross-session learning — AdaptiveClassifier learns patterns across sessions to improve future request routing and complexity estimation.
- Test-verification loop — AgentCoder pattern that runs tests after each iteration (+40% bugs caught early).
- Reviewer agent — ChatDev pattern where a dedicated reviewer catches 30% of bugs before merge.
- Self-refine loop — Iterative improvement (NeurIPS 2023, +8-15% quality gain).
- RALPH loop — Review, Analyze, Learn, Plan, Heal: a structured feedback cycle combining adversarial review with root-cause analysis. Max 3 iterations (configurable); stops when the reviewer approves. Captures learned patterns for cross-session improvement. Optional test verification after each Heal step. Research basis: Self-Refine (NeurIPS 2023) + ChatDev + MASAI + AgentCoder.
- Universal model support — 7 first-party adapters (Anthropic, OpenAI, Ollama, Claude CLI, Gemini CLI, ACP, and a mock adapter for testing) plus any OpenAI-compatible endpoint via `HONE_OPENAI_BASE_URL` (DeepSeek, Qwen, Kimi, OpenRouter, and others work out of the box). Full tool-calling support on the Anthropic, OpenAI, Ollama, and ACP adapters (`claude --acp`, `gemini --acp`, or custom ACP servers); the legacy Claude/Gemini CLI pass-through adapters remain prose-only.

  | Adapter | Tool calling | Stream | Notes |
  |---|---|---|---|
  | Anthropic | Yes | SSE | claude-sonnet-4-5 default |
  | OpenAI | Yes | SSE | gpt-4o default; override with `HONE_OPENAI_MODEL` |
  | Ollama | Yes | stream | llama3.2 default; requires local `ollama serve` |
  | Claude CLI (legacy) | No | text | `claude` binary on PATH; prose-only passthrough |
  | Gemini CLI (legacy) | No | text | `gemini` binary on PATH; prose-only passthrough |
  | ACP — `claude --acp` | Yes | JSON-RPC | Subscription Claude with full bidirectional tool dispatch |
  | ACP — `gemini --acp` | Yes | JSON-RPC | Free-tier Gemini with full bidirectional tool dispatch |
  | ACP — custom | Yes | JSON-RPC | Self-hosted ACP servers; see docs/api/acp-protocol.md |
  | Mock | Yes | in-process | Test double only |

- Kernel-enforced sandboxing — Seatbelt on macOS, Landlock on Linux 5.13+, seccomp elsewhere. Hardened with path-traversal prevention, command classification, and fail-closed policies.
- Memory encryption — AES-256-GCM with Argon2id for encrypted memory storage. Opt in at runtime via `--encrypted-memory` plus `HONE_MEMORY_PASSPHRASE`. A wrong or missing passphrase exits the binary non-zero rather than silently degrading. Each store persists its own random 16-byte salt (plus a version marker) as a sentinel memory record, so keys rotate per vault and cannot be cross-replayed against a hardcoded constant.
- Typed memories with index-driven recall — Memories carry a `kind` (`User`, `Feedback`, `Project`, `Reference`, `Outcome`, `Conversation`) and an optional one-line `description`. The system prompt embeds a budgeted `## Memory index`; the agent fetches bodies on demand via the `memory_read` tool. Recall ordering uses BM25 (k1=1.5, b=0.75) over content + description. Inspect or prune via `hone memory ls | show <id> | forget`.
- Layered tool architecture — Native Rust layer (Layer 2) for zero-cost file/shell/git/LSP. MCP client for 70+ community extensions (Layer 1) with `.hone/mcp.json` config. Skills in markdown (Layer 3). Tool decorators for cost tracking and governance (Layer 4).
- Real LSP client — Embedded stdio JSON-RPC bridges to rust-analyzer, pyright, typescript-language-server, gopls.
- Dual-database storage — SQLite for sessions/memory (single-user, zero-ops), PostgreSQL+pgvector for the semantic index (concurrent, vector search).
- Adaptive memory — Pattern learning across sessions. Hone remembers what worked, what failed, and why. Memory is local-first, private by default.
- OpenAPI 3.1 spec generation — `GET /openapi.json` for automated API documentation.
- Rate limiting — Sliding-window middleware per IP/user.
- Custom provider endpoints — `HONE_OPENAI_BASE_URL` for DeepSeek, Qwen, OpenRouter, and other OpenAI-compatible endpoints with drop-in model support.
- Multi-agent budget tracking — Per-turn, per-session, per-user token accounting with live cost estimates and configurable limits.
- Real bore tunnel — Automatic SSH tunnel when the bore CLI is installed.
- Client-server architecture — `hone` CLI for local use, `honed` daemon for remote access. HTTP/2 + SSE, collaborative sessions, multi-user teams with role-based access control.
- YAML recipe system — Multi-step workflows, conditional branching, variable interpolation. Recipes compose tools and agents into reproducible tasks.
- Redesigned status bar — Three-zone layout (left: model/session/phase; center: tool icon; right: context bar with cost). Token count shows inline in compact mode ("42t") and as a full fraction in wide mode ("42/200000 tokens"). Width-adaptive (80/120/160+ cols), NO_COLOR support.
- Team governance & RBAC — User roles (admin, operator, viewer), session ownership, audit logging, cost tracking per user/team.
- Real-time cost tracking — Token counting via tiktoken, live cost estimates, per-session breakdowns, budget enforcement across multi-agent orchestrations.
- Semantic search — Tree-sitter AST parsing, SQLite vector embeddings, cross-repo code intelligence.
- Scope clarification mode — Auto-detects ambiguous requests using 4 heuristics and asks clarifying questions before coding. Use `/clarify` to trigger manually or `--no-clarify` for CI/scripting without prompts.
- Inline agent dispatch — Use `@agent-name` syntax to swap system prompts on the fly (e.g., `@security-auditor "audit this code"`, `@debugger "fix this bug"`). 12 built-in agents available. Restores the original prompt after the turn.
- Safe pipe operator — Pipes to head/tail/grep/wc/sort are now allowed in shell commands (previously blocked). Pipes to unknown commands are still treated as destructive.
- Ollama tool calling — OllamaProvider now supports full tool calling (previously hardcoded to skip). Qwen 2.5 Coder 14B is recommended for local use.
- Model override fixes — `--model ollama` now directly creates OllamaProvider instead of falling back to Claude CLI when Ollama is explicitly requested.
- Text-based tool call parser — Detects JSON tool calls in model content for Ollama (handles models that don't use the tool_calls field). Strips markdown code fences automatically.
- DeepSeek Direct compatibility — Works for interactive use; for SWE-bench benchmarks, OpenRouter is recommended due to token output limits.
- Built-in agents — 12 specialized agent personas (code reviewer, debugger, security auditor, ML engineer, etc.) with `@agent-name` inline dispatch. Plus 9 recipe templates for common workflows.
- Agent-loop robustness — Three pre-LLM safety checks break out of failure modes that tend to waste tokens: (1) Doom-loop detector — injects a corrective user message when the last 30 messages show 3+ identical consecutive tool calls or a repeating 2-5-step pattern; (2) Context-budget graceful abort — at 95% of the model's context limit, strips tools from the next request and tells the model to finalize now instead of being truncated mid-tool-call; (3) Output-truncation retry hint — when the provider ends with `stop_reason = "max_tokens"` (Anthropic) or `finish_reason = "length"` (OpenAI), commits partial assistant text, drops the malformed tool-use block, and injects a user hint naming the lost tools so the model can retry with smaller content.
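The doom-loop check described above can be approximated by a pure function over the names of recent tool calls. This is a minimal illustrative sketch under stated assumptions, not Hone's actual implementation: the real detector inspects the last 30 messages and also matches repeating 2-5-step patterns, and the function and variable names here are hypothetical.

```rust
// Illustrative sketch (not Hone's code): flag a doom loop when the
// recent tool-call history contains 3+ identical consecutive calls.
fn is_doom_loop(recent_calls: &[&str]) -> bool {
    // Any window of three equal adjacent names counts as a loop.
    recent_calls
        .windows(3)
        .any(|w| w[0] == w[1] && w[1] == w[2])
}

fn main() {
    let looping = ["read_file", "grep", "grep", "grep"];
    let healthy = ["read_file", "edit_file", "cargo_test"];
    assert!(is_doom_loop(&looping));
    assert!(!is_doom_loop(&healthy));
    println!("ok");
}
```

When the predicate fires, the agent loop would inject a corrective user message before the next LLM call rather than letting the model keep repeating itself.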
Built-in Agents and Recipes
Hone ships with 12 pre-configured agents and 9 recipe templates for common development tasks.
Built-in Agents
Each agent is a specialized persona with curated tools and system prompts:
| Agent | Description |
|---|---|
| `code-reviewer` | Expert code reviewer for quality, security, performance, and maintainability |
| `debugger` | Systematic debugging specialist using the scientific method |
| `security-auditor` | Security auditor for vulnerability scanning, secret detection, and OWASP review |
| `performance-optimizer` | Performance profiler and optimizer for database, API, memory, and rendering |
| `documentation-agent` | Generates and maintains API docs, READMEs, architecture diagrams, and changelogs |
| `api-designer` | REST, GraphQL, and gRPC API designer with OpenAPI spec generation |
| `tdd-workflow` | Test-Driven Development workflow enforcing Red-Green-Refactor discipline |
| `rust-expert` | Rust specialist for ownership, lifetimes, async with tokio, and idiomatic error handling |
| `python-expert` | Python specialist for PEP 484 typing, pytest, async with asyncio, and FastAPI |
| `researcher` | AI/ML research scientist for literature review, experiment design, and paper analysis |
| `ml-engineer` | Machine learning engineer for model training, evaluation, deployment, and MLOps |
| `data-engineer` | Data engineer for ETL pipelines, data warehousing, schema design, and data quality |
Built-in Recipes
Recipes are multi-step YAML workflows that guide the agent through structured processes:
| Recipe | Description |
|---|---|
| `tdd-feature` | Implement a feature using Test-Driven Development (Red-Green-Refactor) |
| `bug-fix` | Fix a bug using systematic debugging with a regression test |
| `code-review` | Perform a comprehensive code review covering correctness, security, and performance |
| `refactor` | Safely refactor code with test verification at each step |
| `security-audit` | Scan for OWASP Top 10 vulnerabilities, secrets, and dependency issues |
| `performance` | Profile, optimize, and measure performance improvements |
| `migration` | Plan and execute safe dependency or framework migrations |
| `documentation` | Inventory code, generate API docs, update README, and verify coverage |
| `onboarding` | Understand a new codebase's structure, key files, and common workflows |
Using Built-in Agents and Recipes
Agents can be referenced by name or description using the agent registry:
```sh
# Run using an agent's name
hone --agent code-reviewer -i "Review this code for security issues"

# Run a recipe
hone --recipe tdd-feature
hone --recipe bug-fix

# Create a custom agent (see docs/tutorials/tools-and-mcp.md)
cat > .hone/agents/my-agent.md << 'EOF'
---
name: my-agent
description: My custom agent
tools: [read, write, shell, grep]
model: sonnet
---
You are my specialized agent...
EOF
```
Custom agents can be placed in .hone/agents/ (project-level) or ~/.config/hone/agents/ (user-level).
For detailed information on how agents work and how to create custom ones, see docs/tutorials/tools-and-mcp.md.
Remote Access & Web UI
The honed daemon serves three independent surfaces over HTTP/2 + SSE: a
SolidJS web UI at /ui, a Tauri 2 desktop app that can supervise
its own daemon, and a documented REST + SSE API that can be embedded
in anything else. All three share the same wire format.
Three GUIs, one daemon
```mermaid
flowchart LR
    H[(honed daemon)]
    H -- "GET /ui (rust-embed)" --> WebUI[SolidJS web UI<br/>browser, PWA-installable]
    H -- "HTTP/2 + SSE" --> Desktop[hone-desktop<br/>Tauri 2 native shell]
    H -- "JSON-RPC over stdio" --> ACP[ACP / MCP clients]
    H -- "POST /run + GET /files" --> CLI[hone CLI / hone tui]
    Desktop -. "spawn + supervise (sidecar mode)" .-> H
```
| Surface | Path | Stack | Best for |
|---|---|---|---|
| TUI | `hone tui` | Rust + ratatui | terminal-native, SSH, vim users |
| Web | `/ui` | SolidJS + UnoCSS + Vite | any browser, no install, PWA |
| Desktop | `hone-desktop` | Tauri 2 (Rust shell + WebView) | system tray, OS notifications, multi-daemon switcher |
Web UI walkthrough
A complete browser-based chat surface with file preview, syntax highlighting, diff renderer, multi-agent task tree, IndexedDB conversation history, step-mode approvals, multi-daemon support, and four themes. Initial bundle: ~24 KB JS gzipped; on-demand syntax highlighting lazy-loads Shiki + Oniguruma WASM.
```sh
# Build the SPA into the daemon binary, then run it.
make webui
ANTHROPIC_API_KEY=sk-ant-... cargo run --release -p honed
open http://127.0.0.1:3117/ui
```
1. Chat with streaming + Cmd+K command palette
What the chat surface contains, in render order:
- Topbar: brand · session selector · `collab: N` button · agent selector · step toggle · daemon health badge · theme cycler · ⚙ settings · `Cmd+K`
- Conversation list (scrollable):
  - user bubble (right-aligned, brand-color background)
  - assistant bubble (left-aligned, accent border, streaming cursor ▍ while tokens arrive)
  - tool-call card per invocation: tool name, status (`running`/`done`/`error`), `duration_ms`, collapsible result body, `(cached)` suffix on cache hits
  - error bubble for any `error` event
- Worker progress panel (only when `Orchestrator` is running)
- Input bar: textarea + autocomplete menu when the draft starts with `/`
- Status bar: version label · token count · USD spend (driven by `done.usage` + `done.cost_usd` events)
Slash commands (intercepted client-side, never hit /run):
| Command | What it does |
|---|---|
| `/theme <crimson\|chalk\|neon\|minimal>` | set palette |
| `/settings` | open the settings drawer |
| `/clear` | reset the session cost meter |
| `/help` | list available commands |
- Token stream renders with markdown (fenced code blocks get lazy Shiki syntax highlighting in 27 languages — `rust`, `ts`, `tsx`, `python`, `go`, `c`, `cpp`, `bash`, `json`, `yaml`, `diff`, `markdown`, etc.). The first highlight fetches Shiki + Oniguruma WASM; subsequent highlights are cached.
- Type `/` to autocomplete commands. `/theme neon` swaps the palette, `/settings` opens the drawer, `/clear` resets the cost meter. `Cmd+Enter` sends; the Cancel button aborts mid-stream.
2. File preview + side-by-side diffs
When the agent runs read_file / edit_file / write_file, the
preview pane to the right opens automatically and shows the file with
syntax highlighting. The gutter between chat and preview is
resizable — drag to split (clamped 25–85%, persisted to
localStorage).
The two-column layout (chat left, preview right):
- Chat side keeps streaming. Tool results that look like unified diffs auto-render as side-by-side with theme-driven add / remove / hunk colors. Long tool results (>6 lines) collapse with a show all (N lines) button.
- Preview side shows a 200 px wide file tree on its left and a Shiki-highlighted code viewer on its right. The tree lazily expands directories on click; hidden entries (`.git`, dotfiles) are skipped.
- Gutter is the 6 px draggable bar between the two halves. The ratio is clamped to `[0.25, 0.85]` and persisted to `localStorage` under `hone-webui:split-ratio`.
A typical diff card embedded in the chat:
```diff
diff --git a/src/lib/diff.ts b/src/lib/diff.ts
@@ -42,3 +42,9 @@
  context line
- return null;
+ return parsed;
+ } catch (err) {
+ return null;
```
- Files >2 MiB or with NUL bytes (binaries) get a placeholder.
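The preview gate above (skip oversized or binary files) boils down to a simple predicate. A minimal standalone sketch, assuming the same 2 MiB cap and the common NUL-byte binary heuristic; the function name is hypothetical, not the web UI's actual code:

```rust
// Illustrative sketch of the preview gate: a file is previewable only
// if it is at most 2 MiB and contains no NUL bytes (binary heuristic).
const MAX_PREVIEW_BYTES: usize = 2 * 1024 * 1024;

fn is_previewable(contents: &[u8]) -> bool {
    contents.len() <= MAX_PREVIEW_BYTES && !contents.contains(&0u8)
}

fn main() {
    assert!(is_previewable(b"fn main() {}"));
    // An ELF header contains NUL bytes, so it gets the placeholder.
    assert!(!is_previewable(&[0x7f, b'E', b'L', b'F', 0x00]));
    println!("ok");
}
```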
3. Multi-agent orchestration with task tree
When you submit a turn that triggers Orchestrator::run (multi-agent
plans), a <WorkerProgressPanel> appears above the chat. It shows a
goal header, one row per worker, and a budget footer.
A typical render after two of four workers have finished:
| status | agent | task | result |
|---|---|---|---|
| done (green) | researcher | t-001 explore current auth/ tree | $0.003 |
| done (green) | planner | t-002 draft the migration steps | $0.005 |
| running (warn, pulses) | implementer | t-003 apply the rename mechanically | — |
| queued (muted) | reviewer | t-004 verify against TDD harness | — |
Footer: spent $0.008 / $0.50 · 2/4 workers.
Driven by plan_created / worker_started / worker_completed /
budget_progress events on the SSE stream.
4. Step-mode approvals
Tick step in the topbar to require approval for risky tools.
When the agent pauses on a tool_call_paused event, a centred modal
pops with the tool name + arguments. Buttons:
| button | StepAction wire shape | semantics |
|---|---|---|
| Allow | `"Proceed"` | execute this call only |
| Always allow | `{"ApproveAlways": "<tool_name>"}` | execute and auto-allow the same tool name for the rest of the session |
| Skip | `"Skip"` | proceed without this tool; agent gets an empty result |
| Deny | `{"Deny": "user denied"}` | agent receives a tool error and reacts |
Click-outside dismisses with Skip. The chosen action is POSTed to
/run/{request_id}/step (202 Accepted on success).
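The four wire shapes in the table map naturally onto a tagged enum: unit variants serialize as bare strings, data-carrying variants as single-key objects. A hand-rolled sketch with no serde dependency (the enum layout shown here is an assumption inferred from the table, not Hone's actual type):

```rust
// Hypothetical StepAction enum matching the wire shapes in the table.
enum StepAction {
    Proceed,
    ApproveAlways(String),
    Skip,
    Deny(String),
}

// Emit the JSON wire form: unit variants become bare strings,
// payload variants become {"Variant": "payload"} objects.
fn to_wire(action: &StepAction) -> String {
    match action {
        StepAction::Proceed => "\"Proceed\"".to_string(),
        StepAction::ApproveAlways(tool) => format!("{{\"ApproveAlways\": \"{tool}\"}}"),
        StepAction::Skip => "\"Skip\"".to_string(),
        StepAction::Deny(reason) => format!("{{\"Deny\": \"{reason}\"}}"),
    }
}

fn main() {
    assert_eq!(to_wire(&StepAction::Proceed), "\"Proceed\"");
    assert_eq!(
        to_wire(&StepAction::ApproveAlways("write_file".into())),
        "{\"ApproveAlways\": \"write_file\"}"
    );
    println!("ok");
}
```

In practice a serde `#[derive(Serialize)]` on such an enum produces the same externally tagged JSON by default.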
5. Multi-session + IndexedDB persistence
The session selector in the topbar lists every session you've created locally. Each is a bucket of messages stored in IndexedDB; refreshing the page hydrates the conversation back from the local cache before SSE reconnects.
6. Settings drawer
Press ⚙ in the topbar (or /settings) for a slide-in drawer with:
- Theme picker (4 swatches: crimson · chalk · neon · minimal)
- Font size (small / medium / large) — applied via root CSS var
- Daemon endpoint override — for cross-origin / Tauri shell setups
- Tunnel — start / stop / copy URL against `POST`/`DELETE`/`GET /tunnel`
- Team usage — totals + per-model breakdown from `GET /team/usage`
- Reset — wipes settings without touching sessions or chat history
7. Collaborative sessions
Click the collab: N button next to the agent selector to join the
active session by name. The first joiner becomes the Owner and
gets a Kick button next to other participants. user_id is generated
locally and persisted so the same browser keeps its identity across
reloads. Wired against POST /sessions/{id}/join /
POST /sessions/{id}/leave / GET /sessions/{id}/participants /
POST /sessions/{id}/kick.
Bundle profile
| chunk | first paint | first highlight | first file open |
|---|---|---|---|
| initial JS | ~24 KB gz | (cached) | (cached) |
| initial CSS | ~2 KB gz | (cached) | (cached) |
| Shiki core + WASM | — | ~285 KB gz once | (cached) |
| per-language | — | 3–18 KB gz on first use | (cached) |
| total dist on disk | — | — | 3.1 MB across 36 files |
Daemon HTTP/SSE API surface
Every endpoint below is hit-tested and surfaced in at least one of the GUIs:
| Group | Endpoint | Purpose |
|---|---|---|
| chat | `POST /run` | streaming turn (SSE of AgentEvents) |
| chat | `POST /run/{id}/step` | submit StepAction for a step-mode run |
| chat | `POST /orchestrate` | multi-agent run via Orchestrator |
| sessions | `POST /sessions` / `GET /sessions` / `GET /sessions/{id}` / `DELETE /sessions/{id}` | session lifecycle |
| sessions | `POST /sessions/{id}/join` / `/leave` / `/kick` / `GET /sessions/{id}/participants` | collaborative sessions |
| discover | `GET /agents` / `GET /agents/{name}` | agent registry browser |
| discover | `GET /recipes` / `GET /recipes/{name}` | recipe gallery |
| discover | `GET /mcp/marketplace` / `GET /mcp/installed` | MCP marketplace |
| files | `GET /files?path=` / `GET /files/content?path=` | workspace file tree + read |
| sandbox | `GET /sandbox/policies` | active server policy + per-agent overrides |
| memory | `GET /memory` / `GET /memory/stats` | encrypted memory viewer |
| provenance | `GET /provenance` / `GET /provenance/{id}` / `/verify` | tamper-evident hash chain |
| evals | `GET /evals` / `GET /evals/{id}` | eval-run history |
| ops | `POST /tunnel` / `DELETE /tunnel` / `GET /tunnel` | bore-style tunnel management |
| ops | `GET /team/usage` / `/team/members` / `/team/policy` | team governance |
| meta | `GET /health` / `GET /openapi.json` | liveness + spec |
For deployment, see docs/getting-started.md and docs/configuration.md. For the SolidJS source + dev workflow, see apps/webui/README.md.
Desktop App
apps/hone-desktop/ is a Tauri 2 native shell that can either
supervise its own honed (sidecar mode, double-click and go) or
connect to a daemon on a team VM (external mode). Distinct from the
web UI: the desktop app adds a system tray, OS notifications,
keyboard-only command palette, multi-daemon switcher, and
right-docked artifact pane that surveys every file the agent has
touched in the session.
```mermaid
flowchart LR
    User[Tray icon click / app launch]
    User --> Tauri[Tauri 2 shell]
    Tauri --> SS[SidecarSupervisor]
    SS -. spawn .-> Honed[(honed)]
    Tauri --> WebView[WebView<br/>vanilla HTML/JS or SolidJS]
    WebView -- HTTP/2 + SSE --> Honed
    Tauri --> Tray[macOS / Windows / Linux tray]
    Tauri --> Notif[OS notifications<br/>tauri-plugin-notification]
    Tauri --> Updater[Auto-updater<br/>tauri-plugin-updater · feature-gated]
```
Quick start
```sh
# Local sidecar mode — Tauri spawns a private honed on a free port
cargo build -p honed --release
cp target/release/honed /usr/local/bin/honed   # or any PATH dir
cargo run -p hone-desktop                      # launches the window
```
The first-run wizard asks Manage daemon for me (sidecar) vs
Connect to an existing daemon (external URL + bearer token).
Choice is persisted to ~/.config/hone/desktop.json so subsequent
launches skip the wizard.
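A plausible shape for that config file, shown purely as an illustration: the actual schema is defined by the app and may differ, and every key and value below is a hypothetical example.

```json
{
  "mode": "external",
  "daemons": [
    { "name": "team-vm", "url": "https://hone.example.internal:3117", "token": "…" }
  ]
}
```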
Feature tour
- Sidecar supervisor. `SidecarSupervisor::spawn` allocates a free `127.0.0.1` port via `TcpListener::bind("127.0.0.1:0")`, generates a per-instance bearer token, sets `HONE_EPHEMERAL=1`, and waits for `/health` to come up before unlocking the UI.
- System tray. Right-click → Show Hone / Hide window / Stop honed sidecar / Quit. Window close hides instead of quitting, so the tray stays the source of truth for "is the daemon running."
- Cmd+K command palette. Floating fuzzy-search overlay with Refresh daemon status, Start/Stop sidecar, Cancel current run, Reset session cost meter, and Reset onboarding.
- Streaming chat surface. Same shape as the web UI — token-stream bubbles, inline tool-call cards (running → done with `duration_ms`, red border on error, `(cached)` suffix on cache hits), cost/token meter in the topbar driven by `done.usage` events.
- Approval queue (step-mode). Sidebar pane lists every pending approval. Buttons: Approve / Approve all / Skip / Deny. Wired to the daemon's `POST /run/{id}/step`.
- Task-tree DAG. When you tick Orchestrate, the sidebar shows per-worker status dots (queued / running pulses / completed / failed) plus a budget bar. Driven by `plan_created` / `worker_started` / `worker_completed` / `budget_progress`.
- Discover pane. Browse the Agents registry (built-in / user / project sources; click to expand the system prompt and Fork into `<project>/.hone/agents/`), the Recipes gallery (click to expand steps; "Use first step as prompt" prefills the chat input), and the MCP marketplace (7 curated servers; click "Install '' to `.hone/mcp.json`" to add an entry).
- Sandbox policy viewer. Read-only display of the active server policy + per-agent `sandbox:` overrides. Mode pill, network on/off, and rule lists for read / write / exec / domains / env.
- Memory browser. When `HONE_MEMORY_DB=...` is set, the daemon exposes encrypted memory entries; the GUI shows total + per-kind breakdown + top-tag chips + a search box.
- Provenance hash-chain explorer. When `HONE_PROVENANCE_DIR=...` is set, every run writes a tamper-evident `*.jsonl` chain. Click an entry → green ✓ banner with Merkle root for valid chains, red ✗ banner pinpointing the broken index for tampered ones, plus the first 200 entries with kind + per-event summary.
- Eval-run dashboard. When `HONE_EVAL_HISTORY_DB=...` is set, the GUI shows recent runs with model + tag pill + colored pass/total + visual pass-bar (green ≥90%, amber ≥50%, red <50%) + per-task outcomes on click.
- Artifact pane. Right-docked card lists every file the agent has touched this session with tool pill (write/edit/read), call count, relative timestamp, and a Reveal button that opens the OS file manager at that path (macOS `open -R`, Windows `explorer /select,`, Linux `xdg-open` on the parent dir).
- Multi-daemon switcher. Click the topbar daemon badge to switch between profiles (local sidecar, team VM, cloud). Switching tears down the previous sidecar; the Add external daemon form lets you register an HTTP target with an optional bearer token. Persisted to `~/.config/hone/desktop.json`.
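The free-port trick the sidecar supervisor relies on (bind to port 0 and read back the OS-assigned port) is plain std. A minimal sketch, with a hypothetical function name:

```rust
use std::net::TcpListener;

// Ask the OS for any free loopback port by binding to port 0,
// then read the assigned port back from the listener.
fn free_loopback_port() -> std::io::Result<u16> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    Ok(listener.local_addr()?.port())
}

fn main() -> std::io::Result<()> {
    let port = free_loopback_port()?;
    assert!(port > 0);
    println!("allocated port {port}");
    Ok(())
}
```

Note that the port is only reserved while the listener is alive; a supervisor would hand the listener (or re-bind immediately) to the spawned daemon to avoid a race.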
Walkthrough — running a step-mode turn end-to-end
1. Launch hone-desktop → wizard skipped → window opens to chat.
2. Topbar shows "local sidecar · sidecar @ http://127.0.0.1:54321 · 0 tok · $0.000".
3. Tick `step` checkbox in the sidebar.
4. Type "create hello.txt with the word 'hi'" → Cmd+Enter.
5. Token stream begins; assistant decides to call write_file.
6. Approval modal pops with the path + content shown verbatim.
7. Click `Allow` → tool runs, file appears under apps/hone-desktop/dist/.
8. Bottom of screen: cost meter ticks to "47 tok · $0.0008".
9. Click the Reveal button on the new artifact card → Finder opens with
hello.txt selected.
10. Right-click the tray icon → Quit Hone → sidecar shuts down cleanly.
Bundling + releases
apps/hone-desktop/RELEASE.md documents the full release pipeline:
- CI workflow at `apps/hone-desktop/ci/desktop-release.yml` — builds on macOS arm64 + x86_64, Linux x86_64, and Windows x86_64 on every `desktop-v*` tag push (or manual `workflow_dispatch`). Uses `tauri-apps/tauri-action@v0`. Produces `.dmg` / `.app` / `.deb` / `.AppImage` / `.msi`.
- Signing slots are wired but disabled when secrets are absent — `APPLE_*`, `TAURI_SIGNING_PRIVATE_KEY*`. Unsigned builds are flagged as such in the release body.
- The updater plugin is optional behind an `updater` cargo feature, so dev builds without signing keys still compile.
- Branded icons are generated by `scripts/generate_icons.py` (pure-stdlib PNG writer — no PIL/imagemagick dependency).
See apps/hone-desktop/README.md for the full layout and dev workflow.
Customization
- Theme colors (`~/.config/hone/theme.toml`) — Override any named theme color with `key = "#rrggbb"` (24-bit RGB). Example: `accent-hot = "#FF1744"`, `bg-normal = "#0d1117"`. Merges with the selected theme instead of replacing it.
- Session switcher (`Ctrl+O`) — Full-screen overlay to browse and switch between saved sessions. Shows session name, last-updated relative time, and message count. Navigate with j/k, Enter to select, q to close.
- Notification bell (`\x07` on 10 s+ turns) — Terminal bell fires for long-running turns. Disable with `HONE_NO_BELL=1`.
See docs/getting-started.md for examples and detailed configuration.
Observability
Hone tracks every turn with the Agent Rail sidebar and session reports (interactive TUI mode):
- Agent Rail (
Ctrl+R) — Toggleable right sidebar on wide terminals (≥100 cols) showing files touched, injected context, and memory recalls per turn. Zero overhead, updates live. - Session Reports — Auto-generated on
/quitto.hone/reports/<timestamp>-<session>.mdwith full cost, token, file, tool, and timeline data. Includes JSON sidecar for scripting. Retention: 100 reports, oldest deleted first. - CLI —
hone reports list [--limit N]/show <id>/diff <a> <b>for viewing and comparing past sessions. - Cross-session learning — When memory is enabled, the AdaptiveClassifier records each turn's outcome (tier, tokens, cost, success) and uses historical patterns to improve future complexity routing (Simple/Medium/Complex). Transparent fallback to rule-based classification when no memory store is present.
- Accessibility — `HONE_REDUCED_MOTION=1` disables animation (Agent Rail spinner). `HONE_ASCII_ONLY=1` uses ASCII borders. `NO_COLOR=1` disables color.
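The report retention policy (keep 100, oldest deleted first) amounts to a small pruning pass. A minimal sketch, assuming the documented `<timestamp>-<session>.md` naming so that lexicographic order matches age; this is illustrative, not Hone's actual code:

```python
from pathlib import Path

def prune_reports(report_dir: str, keep: int = 100) -> int:
    """Delete the oldest reports beyond `keep`; return how many remain."""
    reports = sorted(Path(report_dir).glob("*.md"))  # timestamp prefix: oldest first
    for old in reports[: max(len(reports) - keep, 0)]:
        old.unlink()
        # remove the JSON sidecar alongside its report, if present
        old.with_suffix(".json").unlink(missing_ok=True)
    return min(len(reports), keep)
```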
For detailed setup and examples, see docs/observability.md and docs/session-reports.md.
Benchmarks
Performance (local, cargo bench on M-series)
| Benchmark | Result | Notes |
|---|---|---|
| Token estimation (tiktoken) | ~10ms | First call; cached via OnceLock; optimized for batches |
| Build messages (50 msgs) | 1.67 us | Effectively free |
| Symbol extraction (tree-sitter) | 2.19 ms | ~550ms for a 50K LOC repo |
| Symbol search (in-memory) | 1.00 us | Sub-microsecond |
| TUI render (50 messages) | ~92 us | < 0.1% of 100ms frame budget |
| CLI startup (warm) | <10ms | Binary size 12MB, cached in page cache |
| Memory footprint (idle) | ~8MB | 6x under PRD target |
Agent Personas Evaluation
Persona evals — 35 scenarios across 13 built-in agents (debugger, security-auditor, code-reviewer, rust-expert, python-expert, ai-researcher, postgresql-expert, ml-engineer, data-engineer, test-architect, fullstack-architect, ml-recommendations-expert, quant-advisor). A keyword-matching evaluation harness lives in benchmarks/persona_eval.py. Run with:
python benchmarks/persona_eval.py --url http://localhost:8080
python benchmarks/persona_eval.py --persona debugger --dry-run
SWE-bench Results
Skeleton Strategy Comparison (10-task sample with V4):
| Strategy | Patches | Rate | vs V1 |
|---|---|---|---|
| CallGraph (default) | 10/10 | 100% | +30pp |
| Hybrid | 9/10 | 90% | +20pp |
| Flat | 9/10 | 90% | +20pp |
| V1 (no injection) | 7/10 | 70% | baseline |
V4 Results with CallGraph Injection Enabled by Default:
| Version | Injection | Pass Rate | Key Insight |
|---|---|---|---|
| V4 (CallGraph default) | Enabled | 285/300 (95%) | Function-level calls, +30pp on sample |
| V3 (optional injection) | Opt-in | 258/300 (86%) | Dependency graph strategy |
| V1 (simple prompt) | None | 58/300 (19.3%) | Baseline |
| V2 (structured + localization) | None | 39/300 (13.0%) | Over-constraining hurts |
Context Injection Strategy: a three-tier system with CallGraph enabled by default improves the pass rate from 19.3% to 95% (+76pp, roughly a 5x relative improvement):
- Tier 1 — Repository skeleton (file tree + function/type signatures) — 800-1500 tokens (always injected)
- Tier 2 — Confidence-gated BM25-ranked signatures (injected only when score gap > 0.4)
- Tier 3 — Exploration budget hint based on retrieval confidence (3-15 tool calls)
- V4 addition — Multi-attempt strategy with injection-guided exploration and rollback on failure
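The Tier 2 confidence gate can be made concrete with a minimal sketch: score candidate signatures with classic BM25 and inject them only when the top score clearly separates from the runner-up. This is an illustrative reimplementation, not Hone's retrieval code; only the 0.4 gap threshold comes from the text above:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Classic BM25 over tokenized docs; returns one score per doc."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def should_inject_tier2(scores, gap_threshold=0.4):
    """Inject ranked signatures only when retrieval is confident:
    the best score separates from the second-best by more than the gap."""
    top = sorted(scores, reverse=True)
    return len(top) >= 2 and (top[0] - top[1]) > gap_threshold
```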
Competitive Comparison:
- Hone v4 (context injection default): 95% pass rate at ~$0.002/task
- Hone v3 (context injection opt-in): 86% pass rate at ~$0.002/task
- Hone v1 (simple prompt): 19.3% pass rate at ~$0.002/task
- SWE-Agent (GPT-4): 18% pass rate at ~$0.10-0.20 per task
- Hone beats SWE-Agent at 50-100x lower cost with 5x better pass rate
See docs/benchmarks.md for detailed breakdown and reproduction instructions.
Coding Tasks (mini-benchmark, DeepSeek V3 via OpenRouter)
| Metric | Result |
|---|---|
| Pass rate | 8/10 (80%) |
| Average task time | 19s |
| Cost per task | ~$0.001 |
| Tasks tested | typo fix, add function, bug fix, error handling, file creation, rename, docstring, import fix, test creation, refactoring |
System Prompt Compression
Recent optimizations reduced system prompt size by 71%:
| Metric | Before | After | Reduction |
|---|---|---|---|
| Prompt tokens | 1776 | 511 | 71% |
| Per-task tokens (20-turn) | ~35K | ~10K | 71% |
Impact: saves roughly 25K tokens per 20-turn task while maintaining quality parity.
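The ~25K figure follows directly from the table: the per-turn delta times a 20-turn task.

```python
before, after, turns = 1776, 511, 20

saved_per_turn = before - after                     # 1265 tokens per turn
total_saved = saved_per_turn * turns                # 25300, the quoted "~25K"
reduction_pct = round(100 * (1 - after / before))   # 71
```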
Tool Description Overhaul
All 7 core tools now include structured WHEN TO USE / WHEN NOT TO USE fields for better decision-making.
# Run benchmarks yourself
make -C benchmarks swebench-dry-run # 3 tasks, ~$0.01
make -C benchmarks swebench-lite # 300 tasks, ~$5-15
See docs/benchmarks.md for detailed performance analysis and reproduction instructions.
Architecture
Hone uses a multi-agent orchestration system layered atop kernel-enforced sandboxing and zero-cost tool execution:
graph TB
User["User Request"]
subgraph MultiAgent["Multi-Agent Orchestration"]
direction TB
Orch["Orchestrator<br/>(decompose tasks)"]
Ensemble["Ensemble<br/>(N candidates)"]
Router["Complexity Router<br/>(Simple/Medium/Complex)"]
TestLoop["Test-Verify Loop<br/>(AgentCoder pattern)"]
Reviewer["Reviewer Agent<br/>(ChatDev pattern)"]
SelfRefine["Self-Refine Loop<br/>(iterative improvement)"]
end
subgraph L3["Layer 3 — Skills (HONE.md)"]
direction LR
HM["Zero-cost markdown\nconventions"]
end
subgraph L2["Layer 2 — Native Tools (Rust hot path)"]
direction LR
FIO["File I/O"]
Shell["Shell execution"]
Git["Git operations"]
LSP["Real LSP bridges"]
end
subgraph L1["Layer 1 — MCP Client"]
direction LR
MCP["JSON-RPC MCP servers\nLazy schema loading"]
end
subgraph L0["Layer 0 — Model Providers"]
direction LR
Providers["7 first-party adapters + OpenAI-compatible<br/>(Anthropic, OpenAI, Ollama, Claude CLI, Gemini CLI, ACP, Mock)"]
end
subgraph Storage["Storage & Observability"]
Memory["SQLite Memory<br/>(pattern learning)"]
Index["SQLite Index<br/>(semantic search)"]
Watcher["File Watcher<br/>(incremental updates)"]
end
Sandbox["Kernel Sandbox<br/>(Seatbelt/Landlock)"]
User --> MultiAgent
MultiAgent --> L2
MultiAgent --> L1
MultiAgent --> L0
L2 --> Sandbox
L2 --> Storage
L1 --> Sandbox
L3 -.-> MultiAgent
style L2 fill:#90EE90
style L1 fill:#87CEEB
style L0 fill:#FFB6C1
style L3 fill:#DDA0DD
style MultiAgent fill:#FFE4B5
style Storage fill:#E0E0E0
style Sandbox fill:#FF6B6B
Layer 2 (native tools) consists of direct Rust function calls — zero serialization, zero IPC overhead. Layer 1 (MCP) uses JSON-RPC over stdio. Layer 0 (providers) uses HTTP + streaming. Layer 3 (skills) is pure markdown conventions with no runtime cost. Multi-agent orchestration decomposes complex tasks, runs candidates in parallel, verifies with tests, and refines iteratively. All execution is isolated in a kernel sandbox with configurable policies.
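The cost difference between the layers comes down to transport. A toy dispatcher makes the distinction concrete; this is Python pseudocode for the idea, not Hone's Rust implementation, and all names here are hypothetical:

```python
import json

class MockMcpClient:
    """Stand-in for a JSON-RPC MCP server: every call pays a
    serialize/deserialize round trip over stdio."""
    def call(self, name, args):
        payload = json.dumps({"jsonrpc": "2.0", "method": name, "params": args})
        request = json.loads(payload)  # simulate the wire round trip
        return {"echoed": request["params"]}

def dispatch_tool(name, args, native_tools, mcp_client):
    """Try Layer 2 first (direct in-process call, zero serialization),
    then fall back to Layer 1 (JSON-RPC MCP)."""
    fn = native_tools.get(name)
    if fn is not None:
        return fn(**args)
    return mcp_client.call(name, args)
```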
Installation
Option 1: Install from crates.io (recommended)
Requires Rust 1.85+. If you don't have Rust, install it first:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Then install Hone:
cargo install hone-cli honed
This installs both hone (CLI agent) and honed (server daemon) to ~/.cargo/bin/.
Option 2: Build from Source
git clone https://github.com/wojciechkpl/hone.git
cd hone
cargo build --release
sudo cp target/release/hone target/release/honed /usr/local/bin/
Option 3: Run without Installing
git clone https://github.com/wojciechkpl/hone.git
cd hone
cargo run --release --bin hone -- "fix the bug"
Platform Support
| Platform | Architecture | Status |
|---|---|---|
| macOS | Apple Silicon (arm64) | Fully supported, Seatbelt sandboxing |
| macOS | Intel (x86_64) | Fully supported |
| Linux | x86_64 | Fully supported, Landlock sandboxing |
| Linux | arm64 | Supported (cross-compile) |
| Windows | x86_64 | Not supported (no sandbox backend) |
Verify Installation
hone --version # hone-cli 0.1.0
honed --version # honed 0.1.0
Update
cargo install hone-cli honed --force
Uninstall
cargo uninstall hone-cli honed
First-Time Setup
# Interactive setup wizard — configures provider, API key, default model
hone configure
# Or set an API key directly
export ANTHROPIC_API_KEY=sk-ant-... # Claude (best tool calling)
export OPENAI_API_KEY=sk-... # OpenAI / DeepSeek / OpenRouter
# Or use a free local model
ollama pull qwen2.5-coder:14b
hone --model ollama:qwen2.5-coder:14b "hello"
Quick Start
First Time Setup
Run the interactive configuration wizard to set up your provider and API keys:
hone configure
This guides you through selecting a model provider, entering API keys, and choosing a default model. Configuration is saved to ~/.config/hone/config.toml.
One-Shot Execution with Full Tool Use
# Use auto-detected provider (priority: ANTHROPIC > OPENAI > Claude CLI > Gemini CLI > Ollama)
ANTHROPIC_API_KEY=sk-... hone "build a CLI tool that..."
# Use a custom OpenAI-compatible endpoint (DeepSeek, Qwen, Kimi, OpenRouter, etc.)
OPENAI_API_KEY=sk-... HONE_OPENAI_BASE_URL=https://api.deepseek.com HONE_OPENAI_MODEL=deepseek-coder hone "analyze this code"
# Multi-agent orchestration: decompose task, execute in parallel, merge results
hone --orchestrate "refactor the authentication module for security"
# Run N candidates and pick the best (ensemble mode)
hone --ensemble 3 "optimize this query for performance"
# Run a built-in recipe
hone --recipe tdd-feature
hone --recipe bug-fix
Local Interactive CLI
# Start the interactive agent
hone
# Work on a specific project
hone --dir ~/projects/my-app
# Use a specific model
hone --model gpt-4o
# Ephemeral mode (no session saved, memory not persisted)
hone --ephemeral
Named Session Management
Sessions are automatically saved after each turn and can be resumed later:
# List all saved sessions
hone session list
# Resume a previous session
hone session resume my-session
# Rename a session
hone session rename old-name new-name
# Delete a session
hone session delete old-name
# Start a new session with a specific name
hone --session my-project
# Disable auto-save for this session (ephemeral mode)
hone --ephemeral
Remote Daemon
# Start the daemon (listens on localhost:3117 by default)
honed --config ~/.config/hone/config.toml
# Connect from another terminal
hone --remote localhost:3117
Create a Recipe
Save refactor.yaml:
name: refactor-logging
description: Upgrade log statements to structured format
steps:
- tool: file_search
query: 'console.log|print\('
- agent:
prompt: "Replace with structured logging"
model: gpt-4o
- tool: git_commit
message: "refactor: upgrade to structured logging"
Run it:
hone --recipe refactor.yaml
When to Use What
Hone has many modes. Here's how to pick the right one:
Decision Flowchart
Is it a quick fix (typo, rename, one-liner)?
YES → hone "fix the typo in auth.rs" (one-shot)
Is the task ambiguous or underspecified?
YES → hone "improve the code" (auto-clarifies)
or → /clarify implement user auth (force clarification)
Do you need a specialist perspective?
YES → @security-auditor review src/auth.py (@agent dispatch)
or → hone --agent debugger "why does test fail" (--agent flag)
Is it a structured workflow (TDD, review, audit)?
YES → hone --recipe tdd-feature --var feature_name=auth
or → hone --recipe security-audit
Is it a large, multi-file task?
YES → hone --orchestrate "implement JWT auth" (parallel workers)
Do you want the best possible answer?
YES → hone --ensemble 3 "optimize this query" (3 candidates, ranked)
Are you in a CI/CD pipeline?
YES → hone --json --no-clarify "fix the lint errors" (structured output)
Do you want to run tests automatically?
YES → :tdd cargo test (file watcher)
Mode Comparison
| Mode | Command | Best For | Cost | Speed |
|---|---|---|---|---|
| One-shot | `hone "prompt"` | Quick fixes, single tasks | $0.001-0.01 | Fastest |
| Interactive | `hone` | Exploration, multi-turn conversation | $0.01-0.05 | — |
| @agent | `@debugger why...` | Specialist analysis (security, perf, debug) | $0.005-0.02 | Fast |
| Recipe | `--recipe tdd-feature` | Structured workflows (TDD, review, audit) | $0.01-0.03 | Guided |
| Orchestrate | `--orchestrate "..."` | Large multi-file tasks, parallel workers | $0.01-0.05 | 2-5 min |
| Ensemble | `--ensemble 3 "..."` | Critical decisions, best-of-N quality | 3x base | 3x time |
| JSON/CI | `--json "..."` | Automation, pipelines, batch processing | Same | Same |
| TDD | `:tdd cargo test` | Active development, instant feedback | $0 (local) | Continuous |
Use Case Examples
"Fix a bug" — One-shot is enough:
hone "The login endpoint returns 500 when email is null. Fix it."
"I'm not sure what's wrong" — Let clarification help:
hone "the app is slow"
# Hone asks: "Which part? API latency? Startup time? Database queries?"
# You answer, then it investigates with the right tools
"Review my PR" — Use the code-review recipe:
hone --recipe code-review
# Automated: correctness → security → performance → test coverage → report
"Audit for security" — Use the specialist agent:
@security-auditor Scan this project for OWASP Top 10 vulnerabilities
# or the structured recipe:
hone --recipe security-audit
"Build a whole feature" — Orchestrate decomposes and parallelizes:
hone --orchestrate "Add user authentication with JWT, password hashing, and session management"
# Creates 3 workers: auth-service, api-endpoints, test-suite
# Runs in parallel with dependency ordering
"I need the best refactoring" — Ensemble generates N candidates:
hone --ensemble 3 "Refactor the payment processing module for clarity"
# 3 candidates at different temperatures → ranker picks the best
"Wire into CI" — JSON output for scripting:
result=$(hone --json --no-clarify "fix lint errors")
echo "$result" | jq -e '.success' || exit 1
echo "Cost: $(echo "$result" | jq '.cost_usd')"
"TDD workflow" — Auto-test on every save:
> :tdd cargo test
[TDD] watching for changes...
(edit src/lib.rs)
[TDD] tests FAILED — test_new_feature: expected 42, got 0
> fix it to return 42
[TDD] tests PASSED — 3 passed in 0.4s
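The watcher behind `:tdd` debounces file events: each save restarts a quiet window, and the test command fires only once the window elapses (the CLI reference quotes 300ms). A hedged sketch with an injected clock for testability, not the actual Rust watcher:

```python
class Debounce:
    """Collapse bursts of file-change events: fire only after
    `quiet` seconds of silence since the last change."""
    def __init__(self, quiet: float = 0.3):
        self.quiet = quiet
        self.pending_since = None

    def event(self, now: float) -> None:
        """A file changed; (re)start the quiet window."""
        self.pending_since = now

    def ready(self, now: float) -> bool:
        """Poll: should the test command fire now?"""
        if self.pending_since is None:
            return False
        if now - self.pending_since >= self.quiet:
            self.pending_since = None  # fire once, then wait for the next burst
            return True
        return False
```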
Launching Recipes
Recipes are multi-step structured workflows. Run from the terminal:
# TDD: Red → Green → Refactor cycle
hone --recipe tdd-feature --var feature_name=auth --var test_command="cargo test"
# Bug fix: reproduce → regression test → fix → verify
hone --recipe bug-fix --var test_command="pytest"
# Security audit: OWASP Top 10, secrets, dependencies
hone --recipe security-audit
# Code review: correctness, security, performance, test coverage
hone --recipe code-review
# Refactor safely with tests at each step
hone --recipe refactor --var test_command="cargo test"
# Performance: baseline → profile → optimize → measure
hone --recipe performance --var test_command="cargo bench"
# Migrate between libraries
hone --recipe migration --var old_dependency=reqwest --var new_dependency=ureq
# Generate/update documentation
hone --recipe documentation
# Understand a new codebase
hone --recipe onboarding
# List all available recipes
hone --list-recipes
| Recipe | Variables | Steps |
|---|---|---|
| `tdd-feature` | `feature_name`, `test_command` | clarify → red → green → refactor → verify |
| `bug-fix` | `test_command` | reproduce → regression test → fix → verify → review |
| `code-review` | — | overview → correctness → security → performance → tests → report |
| `refactor` | `test_command` | baseline → analyze → execute → verify |
| `security-audit` | — | inventory → injection → auth → secrets → dependencies → report |
| `performance` | `test_command` | baseline → profile → optimize → measure |
| `migration` | `old_dependency`, `new_dependency`, `test_command` | audit → plan → migrate → verify |
| `documentation` | — | inventory → api-docs → readme → verify |
| `onboarding` | — | structure → entry-points → architecture → report |
Launching Agents
Agents are specialized personas. Two ways to use them:
From terminal (one-shot):
hone --agent security-auditor "review src/auth.rs for OWASP issues"
hone --agent debugger "test_login fails with AttributeError"
hone --agent code-reviewer "review my latest changes"
hone --agent rust-expert "how to handle lifetimes in this struct"
Inside TUI (inline @ dispatch):
> @security-auditor review src/auth.rs for vulnerabilities
> @debugger why does test_login fail?
> @code-reviewer check the latest diff
> @performance-optimizer this database query is slow
> @python-expert convert this function to async
> @rust-expert fix the lifetime error in agent.rs
> @ml-engineer design a recommendation model for user preferences
> @documentation-agent update the API docs
| Agent | Specialty | Example |
|---|---|---|
| `@code-reviewer` | Code quality, severity ratings | "review src/ for bugs" |
| `@debugger` | Test failures, stack traces | "why does test_login fail?" |
| `@security-auditor` | OWASP, injection, secrets | "audit auth.py for vulnerabilities" |
| `@performance-optimizer` | N+1 queries, hot paths, memory | "this query takes 3 seconds" |
| `@rust-expert` | Ownership, lifetimes, async tokio | "fix the borrow checker error" |
| `@python-expert` | Typing, pytest, FastAPI, async | "add type hints to this module" |
| `@documentation-agent` | READMEs, API docs, changelogs | "document the public API" |
| `@api-designer` | REST, GraphQL, gRPC, OpenAPI | "design a user management API" |
| `@tdd-workflow` | Red-Green-Refactor discipline | "implement auth with TDD" |
| `@researcher` | Literature review, experiments | "review papers on transformers" |
| `@ml-engineer` | Training, deployment, MLOps | "optimize model inference" |
| `@data-engineer` | ETL, schemas, data quality | "design the data pipeline" |
Recipes vs Agents
| | Recipes | Agents |
|---|---|---|
| What | Multi-step structured workflow | Specialized persona |
| Steps | 3-6 defined phases | Single turn or conversation |
| Variables | `--var key=value` | None needed |
| Best for | Repeatable processes | Ad-hoc specialist questions |
| Inside TUI | CLI only (`--recipe`) | `@agent-name prompt` |
Configuration
Hone uses a three-level config hierarchy:
- CLI flags (highest priority)
- TOML (`~/.config/hone/config.toml`)
- Defaults (lowest priority)
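The three-level precedence reduces to a layered dictionary merge. A minimal sketch of the idea, assuming unset options are represented as missing or `None` keys; not Hone's actual config loader:

```python
def effective_config(defaults: dict, toml_cfg: dict, cli_flags: dict) -> dict:
    """CLI flags beat TOML values, TOML values beat built-in defaults."""
    merged = dict(defaults)
    for layer in (toml_cfg, cli_flags):  # apply in increasing priority
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged
```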
Sample config.toml
[agent]
default_model = "gpt-4o"
context_window_tokens = 16000
max_iterations = 20
[memory]
enabled = true
db_path = "~/.local/share/hone/memory.db"
encryption_key_file = "~/.config/hone/key.bin"
[sandbox]
strategy = "seatbelt" # macOS: seatbelt, Linux: landlock
[team]
rbac_enabled = true
audit_log_path = "~/.local/share/hone/audit.log"
[providers]
default = "anthropic"
# Add API keys for each provider
anthropic_api_key = "${ANTHROPIC_API_KEY}"
openai_api_key = "${OPENAI_API_KEY}"
Environment Variables
Hone respects the following environment variables for provider setup:
| Variable | Purpose | Example |
|---|---|---|
| `ANTHROPIC_API_KEY` | Anthropic API key for direct Claude access | `sk-ant-...` |
| `OPENAI_API_KEY` | OpenAI API key (or compatible endpoint) | `sk-...` |
| `HONE_OPENAI_BASE_URL` | Override OpenAI endpoint for custom providers | `https://api.deepseek.com` |
| `HONE_OPENAI_MODEL` | Override model name for custom endpoints | `deepseek-coder`, `qwen-max`, `claude-3-sonnet` |
| `HONE_DISABLE_PERSISTENT_STORES` | Skip SQLite session/memory setup for this run (same effect as `--ephemeral`; useful for CI, demos, and TUI expect scripts) | `1` |
| `HONE_AUDIT_LOG` | Enable JSONL egress audit log (equivalent to `--audit`) | `1` |
| `HONE_CONTEXT_INJECTION` / `HONE_NO_INJECTION` | Force-enable or force-disable proactive context injection | `1` |
| `HONE_REDUCED_MOTION` | Disable Agent Rail animation | `1` |
| `HONE_ASCII_ONLY` | Force ASCII box drawing and file-op badges | `1` |
| `NO_COLOR` | Disable all colorized output | `1` |
The provider auto-detection order is: ANTHROPIC > OPENAI (+ custom endpoints) > Claude CLI > Gemini CLI > Ollama.
When HONE_OPENAI_BASE_URL is set, Hone treats any OPENAI_API_KEY as a token for that endpoint, allowing drop-in access to DeepSeek, Qwen, Kimi, OpenRouter, and other OpenAI-compatible services.
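The detection order and the base-URL override compose into one small decision function. An illustrative sketch, not Hone's code; in particular, the CLI binary names checked here (`claude`, `gemini`, `ollama`) are assumptions:

```python
import shutil

def detect_provider(env: dict, has_cli=shutil.which) -> str:
    """Documented priority: ANTHROPIC > OPENAI (+ custom endpoint)
    > Claude CLI > Gemini CLI > Ollama."""
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        # With HONE_OPENAI_BASE_URL set, the same key targets any
        # OpenAI-compatible endpoint (DeepSeek, Qwen, Kimi, OpenRouter, ...)
        base = env.get("HONE_OPENAI_BASE_URL")
        return f"openai-compatible:{base}" if base else "openai"
    for binary, name in (("claude", "claude-cli"), ("gemini", "gemini-cli"), ("ollama", "ollama")):
        if has_cli(binary):
            return name
    return "none"
```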
See docs/configuration.md for complete reference.
Agent ensemble (experimental)
Run the same prompt through N parallel candidates with different temperatures, then pick the best response using the MASAI Ranker pattern:
hone --ensemble 3 "refactor this function for clarity"
Each candidate runs at a different temperature (0.0, 0.3, 0.5, …), maximizing response diversity. A dedicated ranker agent (temperature 0.0) reads all candidates and selects the best one. Research shows ensemble + voting improves code quality by 5-8% at roughly N× the token cost (arXiv 2406.11638).
Up to five candidates are supported (--ensemble 5).
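Under those rules, ensemble mode reduces to spreading N generations across a temperature ladder and handing all candidates to a deterministic ranker. A hedged sketch; the documented ladder values are 0.0, 0.3, 0.5, and the later entries here are assumptions:

```python
TEMPERATURE_LADDER = (0.0, 0.3, 0.5, 0.7, 0.9)  # first three documented; rest assumed

def run_ensemble(prompt, generate, rank, n: int = 3):
    """generate(prompt, temperature) -> candidate text;
    rank(prompt, candidates) -> index of the winner.
    The ranker itself should run at temperature 0.0 for determinism."""
    if not 1 <= n <= 5:
        raise ValueError("--ensemble supports 1 to 5 candidates")
    candidates = [generate(prompt, t) for t in TEMPERATURE_LADDER[:n]]
    return candidates[rank(prompt, candidates)]
```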
CLI Reference
| Command | Purpose |
|---|---|
| `hone configure` | Interactive setup wizard for providers, API keys, and models |
| `hone` | Start interactive CLI session |
| `hone "prompt"` | One-shot mode: single prompt and exit |
| `hone --clarify "prompt"` | Enable scope clarification mode (auto-detects ambiguous requests, asks clarifying questions) |
| `hone --no-clarify "prompt"` | Disable scope clarification for CI/scripting |
| `hone --orchestrate "prompt"` | Multi-agent orchestrator (decompose-execute-merge) |
| `hone --ensemble N "prompt"` | Run N candidates with varied temperatures, rank the best |
| `hone --recipe <name>` | Run a built-in recipe (tdd-feature, bug-fix, code-review, refactor) |
| `hone --list-recipes` | List all available recipes |
| `hone --session NAME` | Start named session (auto-saved after each turn) |
| `hone --ephemeral` | No session/memory persistence for this run |
| `hone --vim` | Start in vim mode (vim-style keybindings) |
| `hone --json` | Output structured JSON for CI/CD automation |
| `hone --no-inject` | Disable proactive context injection (also set `HONE_NO_INJECTION=1`) |
| `hone --skeleton-strategy call_graph` | Select skeleton strategy: call_graph (default, adaptive), hybrid, flat, dependency_graph |
| `hone --inject-context` | Explicitly enable context injection (already default, useful for config overrides) |
| `hone --audit` | Enable JSONL egress audit log (also set `HONE_AUDIT_LOG=1`) |
| `hone --var KEY=VALUE` | Substitute template variables in recipes (repeatable) |
| `hone --model <name>` | Use a specific model (e.g., `--model ollama` for Ollama, `--model gpt-4o` for OpenAI) |
| `hone --model ollama` | Use Ollama with tool calling support (now fully enabled) |
| `hone session list` | List all saved sessions |
| `hone session resume NAME` | Resume a previous session |
| `hone session rename OLD NEW` | Rename a session |
| `hone session delete NAME` | Delete a session |
| `hone --dir <path>` | Target a specific directory |
| `hone --remote <host:port>` | Connect to remote daemon |
| `honed` | Start the HTTP/2 server daemon |
| `honed --port 8080` | Listen on custom port |
In interactive mode, use these commands:
| Command | Purpose |
|---|---|
| `!command` | Run shell command (e.g., `!cargo test`, `!git status`) |
| `!cd path` | Change directory (confined to project) |
| `!nvim file` | Suspend TUI, open editor, restore on exit |
| `@agent-name prompt` | Dispatch to inline agent (e.g., `@security-auditor "audit this"`, `@debugger "fix this bug"`) |
| `/configure` | Show session summary (provider, model, theme, vim, budget, cost, phase) |
| `/configure model <name>` | Switch model mid-session (e.g., `/configure model claude-sonnet-4-5`) |
| `/configure theme <name>` | Change UI theme |
| `/configure vim on\|off` | Toggle vim-style keybindings |
| `/configure budget <usd>\|off` | Set or clear a session cost ceiling |
| `/configure provider` | Print the active provider adapter |
| `/configure help` | Show the full /configure subcommand list |
| `:clarify` | Manually trigger scope clarification for ambiguous requests |
| `:set vim` | Enable vim-style keybindings |
| `:set novim` | Disable vim-style keybindings |
| `:set no-clarify` | Disable automatic clarification prompts |
| `:tdd [command]` | Watch files, auto-run tests on change (300ms debounce) |
| `:tdd off` | Stop TDD mode |
| `:review` | Enter hunk-by-hunk review mode (j/k navigate, a/r accept/reject, A/R all, q quit) |
See hone --help for full options.
Crate Structure
Hone is organized as a Cargo workspace. Each crate has a focused responsibility:
| Crate | Purpose | Deps |
|---|---|---|
| `hone-proto` | Shared types, DTOs, errors | 0 internal |
| `hone-config` | TOML + HONE.md parsing | proto |
| `hone-providers` | Model provider abstraction | proto |
| `hone-sandbox` | Kernel sandboxing (Seatbelt, Landlock) | proto |
| `hone-tools` | Layer 2 native tools (file, shell, git, LSP) | proto, sandbox |
| `hone-mcp` | MCP client (JSON-RPC transport) | proto, tools |
| `hone-index` | Semantic indexer (tree-sitter, SQLite vectors) | proto |
| `hone-memory` | Adaptive memory (encrypted SQLite) | proto |
| `hone-core` | Agent loop, session, orchestration | all above |
| `hone-recipes` | YAML recipe parsing + execution | proto, core |
| `hone-server` | HTTP/2 + SSE daemon | proto, core |
| `hone-tui` | Ratatui terminal UI | proto, core |
Binaries:
- `hone-cli` — Interactive terminal agent
- `honed` — HTTP/2 daemon
How Hone Compares
An honest comparison. We list where competitors are better, not just where Hone wins.
| Feature | Hone | Claude Code | Codex CLI | Goose | Kiro | Gemini CLI |
|---|---|---|---|---|---|---|
| Open source | MIT/Apache-2.0 | Proprietary | Open (Apache-2.0) | Open (Apache-2.0) | Proprietary | Apache-2.0 |
| Terminal-native | Yes | Yes | Yes | Yes (+ Electron) | No (VS Code only) | Yes |
| Multi-agent orchestration | DAG + ensemble + RALPH | No | No | No | No | No |
| Kernel sandboxing | Seatbelt + Landlock | Container | Container | No | No | No |
| Encrypted memory | AES-256-GCM | No | No | No | No | No |
| MCP extensions | 70+ via .hone/mcp.json | Yes (native) | No | 70+ | Yes | Yes |
| Provider support | 7+ (any OpenAI-compat) | Claude only | OpenAI only | 30+ | Bedrock only | Gemini only |
| Context optimization | BM25 + call graph + adaptive | Basic | Basic | Summarization | Unknown | 1M+ window |
| Tool calling | Native + text fallback | Native | Native | Native | Native | Native |
| Session management | SQLite + resume | Yes | No | Yes | No | No |
| Recipes/workflows | 9 built-in YAML | No | No | No | Spec-driven plans | No |
| Cost tracking | Live status bar + audit | No | No | CLI command | No | No |
| TDD mode | :tdd file watcher | No | No | No | No | No |
| Review mode | Hunk accept/reject | No | No | No | No | No |
Where Competitors Are Better (honestly)
| Area | Winner | Why |
|---|---|---|
| SWE-bench resolve rate | Claude Code (72%) | Hone: 19.3%. Claude Code uses Sonnet 4.5 ($3/task vs Hone's $0.03). |
| Community size | Claude Code (~71K stars) | Hone is new. Goose has ~20K, Gemini CLI ~50K. |
| Context window | Gemini CLI (1M+ tokens) | Eliminates context management complexity entirely. |
| IDE integration | Kiro, Cursor | VS Code native. Hone is terminal-only (web UI planned). |
| Free usage | Gemini CLI (free tier) | Corporate-subsidized. Hone requires your own API key. |
| Spec-driven planning | Kiro | Generates requirements docs before coding. Hone has recipes but not specs. |
| Desktop GUI | Goose (Electron) | Hone is terminal-only. |
Where Hone Wins
| Area | Detail |
|---|---|
| Cost efficiency | $0.03/task vs $0.50-$3.00. 50-100x cheaper. |
| Privacy | Kernel sandbox + encrypted memory + no telemetry. Nothing leaves your machine. |
| Multi-agent | Only agent with DAG orchestration, ensemble voting, RALPH loop, reviewer. |
| Flexibility | Works with any OpenAI-compatible model. DeepSeek, Qwen, Ollama (local, $0). |
| Observability | Hierarchical tracing, egress audit, injection detection, cost tracking. |
| Structured workflows | 9 recipes, TDD mode, review mode, scope clarification, @agent dispatch. |
Choose Hone If...
- You want to run AI coding with your own models (not locked to one vendor)
- Privacy matters — kernel sandbox, encrypted memory, no telemetry
- You need multi-agent orchestration for complex tasks
- You want $0.03/task instead of $3/task
- You prefer terminal-native tools over IDE extensions
- You need structured workflows (TDD, security audit, code review recipes)
Choose Something Else If...
- You need the highest possible SWE-bench score → Claude Code
- You want IDE integration → Kiro or Cursor
- You want a desktop GUI → Goose
- You want free unlimited usage → Gemini CLI
- You need 1M+ token context → Gemini CLI
Documentation
Core Documentation:
- README — Features, installation, quick start, benchmarks
- CHANGELOG — Recent changes, improvements, security hardening (Phase 2 complete)
- Architecture — Layer design, crate dependencies, data flow, context optimization, memory encryption, security model
- Multi-Agent Coordination and Memory — Agent turn lifecycle, orchestrator pipeline, context trimming, cross-session learning
- Configuration — Config hierarchy, all options, examples, environment variables
- Benchmarks — Performance analysis, microbenchmarks, SWE-bench results
User Guides (in docs/tutorials/):
- Getting Started — Installation, first session, understanding output
- Providers — Model providers, hot-swapping, custom endpoints (OpenRouter, DeepSeek, etc.)
- Security — Sandboxing (Seatbelt/Landlock), RBAC, audit logging, memory encryption
- Tools & MCP — Layer architecture, native tools, MCP integration
- Recipes — YAML workflows, multi-step automation
- Server & Remote — Running the daemon, HTTP API, multi-user setup
For Contributors:
- Contributing — Build, test, commit conventions
Testing
Hone passes 1,100+ tests (unit + integration) across 13 crates, with the workspace-wide Clippy lint policy applied to every crate.
# Run all tests
cargo test --lib
# Run with logging
RUST_LOG=debug cargo test --lib -- --nocapture
# Run a specific test
cargo test --lib orchestrator
# Run integration tests
cargo test --test '*'
# Benchmarks
cargo bench
# View coverage (requires tarpaulin)
cargo tarpaulin --out Html
Contributing
- Fork the repository
- Create a feature branch (
git checkout -b feature/my-feature) - Write tests and docs alongside code
- Run
cargo test && cargo clippy --all-targets - Commit with a clear message
- Push and open a pull request
See CONTRIBUTING.md for detailed guidelines.
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Built with attention to privacy, performance, and composability.
Dependencies
~48–65MB
~1M SLoC