#coding-agent #ai #llm

hone-index — semantic indexer for Hone (tree-sitter AST symbol extraction + SQLite)

1 unstable release (Rust 2024 edition): 0.1.0, Apr 13, 2026
#2628 in Development tools · used in 6 crates (2 directly)
MIT OR Apache-2.0 · 65KB · 1.5K SLoC

Hone

The Universal Terminal AI Code Agent

Installation · Quick Start · Features · Documentation · Contributing

Rust 1.85+ · MIT OR Apache-2.0 · Tests · SWE-bench


Demo

Hone demo — fixing a ZeroDivisionError end-to-end

Step-by-step animation, ~12 s loop. Each panel fades in as the agent takes the next action: prompt → read_file → edit_file (diff) → cargo test (green) → fix summary → cost line.

Step-by-step

  1. Prompt — user describes the failing test.
    $ hone "the test_divide test fails because divide(10, 0) raises
            ZeroDivisionError. fix it to return None."
    
  2. read_file — Hone opens calculator.py to see what's there.
    1  def divide(a, b):
    2      return a / b
    
  3. edit_file — Hone applies a unified-diff patch.
    -     return a / b
    +     if b == 0:
    +         return None
    +     return a / b
    
  4. !cargo test — Hone runs the test suite to verify.
    running 12 tests
    test test_divide_by_zero ... ok
    12 passed in 0.3s
    
  5. Done — assistant reports the fix and the cost line.
    Fixed. divide() now returns None on zero division.
    claude-sonnet | $0.001 | ctx 18% | 5s
    
Play the recording locally
# Install asciinema
brew install asciinema  # or: pip install asciinema

# Play the recorded demo
asciinema play assets/demo.cast

The .cast source is at assets/demo.cast — recorded from a real Hone run on DeepSeek V3 via OpenRouter (~$0.001 per task). The embedded assets/demo.svg above is regenerated from it with svg-term --in assets/demo.cast --out assets/demo.svg --window --no-cursor.

Browser & desktop GUI demos (text walk-throughs)

Three independent surfaces ship alongside the TUI: a SolidJS web UI served by honed at /ui, a Tauri 2 desktop app with system tray and multi-daemon switcher, and the same daemon's HTTP/SSE API that anything else can embed against. See the Remote Access & Web UI and Desktop App sections below for layout sketches, walkthroughs of every panel (file preview, side-by-side diff, approval modal, task-tree DAG, sandbox / memory / provenance / evals viewers, artifact pane), the bundle profile, and the full daemon endpoint table.

Quick spin-up:

# Browser
make webui                                              # build the SPA
ANTHROPIC_API_KEY=sk-... cargo run --release -p honed   # serve at /ui
open http://127.0.0.1:3117/ui

# Desktop (Tauri 2 native shell — system tray, OS notifications, Cmd+K)
cargo build -p honed --release && cp target/release/honed /usr/local/bin/
cargo run -p hone-desktop

Overview

Hone is a privacy-first, provider-agnostic AI coding agent built entirely in Rust. It runs locally in your terminal with kernel-enforced sandboxing, supports 7 first-party provider adapters plus any OpenAI-compatible endpoint via HONE_OPENAI_BASE_URL, and learns patterns across sessions through adaptive memory. Work with your models, on your machine, under your rules.

Why Hone

Your models, your machine, your rules. Hone doesn't phone home. No telemetry. No vendor lock-in. Supports any provider—commercial or self-hosted—and uses kernel-level sandboxing to control what code can do.

Intelligence compounds. Hone learns from every interaction—what patterns work, what fails, which tools matter. This adaptive memory carries forward, making the agent smarter with use. Insights cross repo boundaries through semantic indexing.

Work from anywhere. Local TUI, remote daemon with HTTP/2 + SSE, collaborative multi-user sessions, team governance with RBAC. Run honed on a server, connect from any terminal.

Features

  • Shell escape — Run native shell commands with !command: !ls, !git status, !cargo test show inline output. !cd src changes directory (confined to project). !nvim file.rs suspends TUI, runs editor, restores after.

  • Vim mode — Type --vim or :set vim for vim-style keybindings. Normal mode: j/k scroll, gg/G top/bottom, i/a insert, :q quit. Insert mode: type normally, Esc returns to normal.

  • TDD mode — Type :tdd [command] to watch source files and auto-run tests on change (300ms debounce). Shows PASS/FAIL inline. Type :tdd off to stop.

  • Review mode — Type :review to enter hunk-by-hunk review of git diff changes. Navigate with j/k, accept/reject hunks with a/r, accept/reject all with A/R, quit with q.

  • Session switcher overlay — Press Ctrl+O to browse and switch between saved sessions with a full-screen overlay showing name, last-updated relative time, and message count. Navigate with j/k, Enter to select, q to close.

  • Expanded keybindings help — Press Ctrl+? to see 31 keybindings across 5 sections (Input, Navigation, Overlays, Agent control, Commands) with descriptions.

  • Observability sidebar & session reports — Press Ctrl+R to toggle the Agent Rail sidebar showing files touched, context injected, and memory recalls per turn. On session end, detailed reports are auto-generated with cost, tokens, files, tools, and timelines. View with hone reports list/show/diff.

  • Multi-agent orchestration — Decompose complex tasks via research-backed patterns (Anthropic, MASAI, CodeR). Orchestrator-worker pipeline with message merging. Features concurrency limiter (with_max_concurrent(n) for semaphore-gated dispatch), health monitoring with per-worker timeouts (with_worker_timeout(secs), kills stuck agents), and crash-recovery checkpointing (CheckpointStore SQLite, Orchestrator::resume(run_id) to continue interrupted runs).

  • Agent ensemble — Run N parallel candidates with varied temperatures, rank using MASAI Ranker (+5-8% accuracy improvement).

  • Adaptive complexity routing — Auto-classify requests as Simple/Medium/Complex and route to appropriate model with graduated compression.

  • Adaptive hierarchical skeleton — Proactive context injection with intelligent strategy selection: auto-detects repository size and auto-selects detail level. 4-level hierarchy: directory → file list → signatures → call graph. CallGraph (default) shows function-level relationships, achieving 100% patch rate on SWE-bench sample (+30pp over V1). Adaptive strategy skips large files (>100KB) and slow parses (>100ms). Progressive budget filling ensures efficient token use. Disable with --no-inject or HONE_NO_INJECTION=1.
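
    The progressive budget filling described above can be sketched as a greedy pass over hierarchy levels. This is an illustrative sketch only — the function name, level names, and token costs are hypothetical, not Hone's actual code:

    ```rust
    // Hypothetical sketch: commit hierarchy levels in priority order,
    // skipping any level that would overflow the remaining token budget.
    fn fill_budget(levels: &[(&str, usize)], budget: usize) -> Vec<String> {
        let mut spent = 0;
        let mut included = Vec::new();
        for &(name, cost) in levels {
            // A level that does not fit is skipped, but cheaper later
            // levels may still be admitted.
            if spent + cost <= budget {
                spent += cost;
                included.push(name.to_string());
            }
        }
        included
    }

    fn main() {
        let levels = [
            ("directory tree", 200),
            ("file list", 400),
            ("signatures", 1500),
            ("call graph", 3000),
        ];
        // With a 2500-token budget the call graph (3000 tokens) no longer fits.
        let picked = fill_budget(&levels, 2500);
        println!("{picked:?}"); // ["directory tree", "file list", "signatures"]
    }
    ```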

  • Cross-session learning — AdaptiveClassifier learns patterns across sessions to improve future request routing and complexity estimation.

  • Test-verification loop — AgentCoder pattern that runs tests after each iteration (+40% bugs caught early).

  • Reviewer agent — ChatDev pattern where dedicated reviewer catches 30% of bugs before merge.

  • Self-refine loop — Iterative improvement (NeurIPS 2023, +8-15% quality gain).

  • RALPH loop — Review, Analyze, Learn, Plan, Heal: structured feedback cycle combining adversarial review with root cause analysis. Max 3 iterations (configurable), stops when reviewer approves. Captures learned patterns for cross-session improvement. Optional test verification after each Heal step. Research basis: Self-Refine (NeurIPS 2023) + ChatDev + MASAI + AgentCoder.

  • Universal model support — 7 first-party adapters (Anthropic, OpenAI, Ollama, Claude CLI, Gemini CLI, ACP, and a mock adapter for testing) plus any OpenAI-compatible endpoint via HONE_OPENAI_BASE_URL (DeepSeek, Qwen, Kimi, OpenRouter, and others work out of the box). Full tool-calling support on Anthropic, OpenAI, Ollama, and ACP adapters (claude --acp, gemini --acp, or custom ACP servers); the legacy Claude/Gemini CLI pass-through adapters remain prose-only.

    | Adapter | Tool calling | Stream | Notes |
    |---|---|---|---|
    | Anthropic | Yes | SSE | claude-sonnet-4-5 default |
    | OpenAI | Yes | SSE | gpt-4o default; override with HONE_OPENAI_MODEL |
    | Ollama | Yes | stream | llama3.2 default; requires local ollama serve |
    | Claude CLI (legacy) | No | text | claude binary on PATH; prose-only passthrough |
    | Gemini CLI (legacy) | No | text | gemini binary on PATH; prose-only passthrough |
    | ACP — claude --acp | Yes | JSON-RPC | Subscription Claude with full bidirectional tool dispatch |
    | ACP — gemini --acp | Yes | JSON-RPC | Free-tier Gemini with full bidirectional tool dispatch |
    | ACP — custom | Yes | JSON-RPC | Self-hosted ACP servers; see docs/api/acp-protocol.md |
    | Mock | Yes | in-process | Test double only |

  • Kernel-enforced sandboxing — Seatbelt on macOS, Landlock on Linux 5.13+, seccomp elsewhere. Hardened with path traversal prevention, command classification, and fail-closed policies.

  • Memory encryption — AES-256-GCM with Argon2id for encrypted memory storage. Opt-in at runtime via --encrypted-memory plus HONE_MEMORY_PASSPHRASE. A wrong or missing passphrase exits the binary non-zero rather than silently degrading. Each store persists its own random 16-byte salt (plus a version marker) as a sentinel memory record, so keys rotate per-vault and cannot be cross-replayed against a hardcoded constant.

  • Typed memories with index-driven recall — Memories carry a kind (User, Feedback, Project, Reference, Outcome, Conversation) and an optional one-line description. The system prompt embeds a budgeted ## Memory index; the agent fetches bodies on demand via the memory_read tool. Recall ordering uses BM25 (k1=1.5, b=0.75) over content + description. Inspect or prune via hone memory ls | show <id> | forget.
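
    The recall ordering above uses BM25 with k1 = 1.5 and b = 0.75. As a sketch, here is the standard BM25 per-term score with those constants; the corpus statistics are illustrative and the IDF form is the usual Robertson/Sparck Jones one, not necessarily byte-for-byte what Hone computes:

    ```rust
    // Standard BM25 term score with Hone's stated constants (k1 = 1.5, b = 0.75).
    fn bm25_term(tf: f64, doc_len: f64, avg_len: f64, n_docs: f64, docs_with_term: f64) -> f64 {
        let (k1, b) = (1.5, 0.75);
        // Smoothed IDF: rare terms get more weight.
        let idf = ((n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0).ln();
        // Term-frequency saturation, normalized by document length.
        idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * doc_len / avg_len))
    }

    fn main() {
        // A rare term in a short memory outranks a common term in a long one.
        let rare_short = bm25_term(2.0, 20.0, 40.0, 1000.0, 5.0);
        let common_long = bm25_term(2.0, 80.0, 40.0, 1000.0, 400.0);
        assert!(rare_short > common_long);
        println!("{rare_short:.3} vs {common_long:.3}");
    }
    ```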

  • Layered tool architecture — Native Rust layer (Layer 2) for zero-cost file/shell/git/LSP. MCP client for 70+ community extensions (Layer 1) with .hone/mcp.json config. Skills in markdown (Layer 3). Tool decorators for cost tracking and governance (Layer 4).

  • Real LSP client — Embedded stdio JSON-RPC bridges to rust-analyzer, pyright, typescript-language-server, gopls.

  • Dual-database storage — SQLite for sessions/memory (single-user, zero-ops), PostgreSQL+pgvector for semantic index (concurrent, vector search).

  • Adaptive memory — Pattern learning across sessions. Hone remembers what worked, what failed, and why. Memory is local-first, private by default.

  • OpenAPI 3.1 spec generation — GET /openapi.json for automated API documentation.

  • Rate limiting — Sliding window middleware per IP/user.
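
    A sliding-window limiter of this kind can be sketched in a few lines. This is a minimal single-key sketch with logical timestamps — Hone's middleware keys on IP/user and uses wall-clock time, both elided here:

    ```rust
    use std::collections::VecDeque;

    // One sliding window of `window_secs` allowing `max_requests` hits.
    struct SlidingWindow {
        window_secs: u64,
        max_requests: usize,
        hits: VecDeque<u64>,
    }

    impl SlidingWindow {
        fn new(window_secs: u64, max_requests: usize) -> Self {
            Self { window_secs, max_requests, hits: VecDeque::new() }
        }

        // Returns true if the request arriving at `now` (seconds) is allowed.
        fn allow(&mut self, now: u64) -> bool {
            let w = self.window_secs;
            // Evict hits that have slid out of the window.
            while self.hits.front().is_some_and(|&t| now - t >= w) {
                self.hits.pop_front();
            }
            if self.hits.len() < self.max_requests {
                self.hits.push_back(now);
                true
            } else {
                false
            }
        }
    }

    fn main() {
        let mut rl = SlidingWindow::new(60, 3);
        assert!(rl.allow(0) && rl.allow(10) && rl.allow(20));
        assert!(!rl.allow(30)); // 4th request inside the 60s window is rejected
        assert!(rl.allow(61));  // the hit at t=0 has expired
    }
    ```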

  • Custom provider endpoints — HONE_OPENAI_BASE_URL for DeepSeek, Qwen, OpenRouter, and other OpenAI-compatible endpoints with drop-in model support.

  • Multi-agent budget tracking — Per-turn, per-session, per-user token accounting with live cost estimates and configurable limits.

  • Real bore tunnel — Automatic SSH tunnel when bore CLI installed.

  • Client-server architecture — hone CLI for local use, honed daemon for remote access. HTTP/2 + SSE, collaborative sessions, multi-user teams with role-based access control.

  • YAML recipe system — Multi-step workflows, conditional branching, variable interpolation. Recipes compose tools and agents into reproducible tasks.

  • Redesigned status bar — Three-zone layout (left: model/session/phase, center: tool icon, right: context bar with cost). Token count shows inline (compact: "42t") and full fraction in wide mode ("42/200000 tokens"). Width-adaptive (80/120/160+ cols), NO_COLOR support.

  • Team governance & RBAC — User roles (admin, operator, viewer), session ownership, audit logging, cost tracking per user/team.

  • Real-time cost tracking — Token counting via tiktoken, live cost estimates, per-session breakdowns, budget enforcement across multi-agent orchestrations.

  • Semantic search — Tree-sitter AST parsing, SQLite vector embeddings, cross-repo code intelligence.

  • Scope clarification mode — Auto-detects ambiguous requests using 4 heuristics and asks clarifying questions before coding. Use /clarify to trigger manually or --no-clarify for CI/scripting without prompts.

  • Inline agent dispatch — Use @agent-name syntax to swap system prompts on the fly (e.g., @security-auditor "audit this code", @debugger "fix this bug"). 12 built-in agents available. Restores original prompt after the turn.

  • Safe pipe operator — Pipes to head/tail/grep/wc/sort now allowed in shell commands (were previously blocked). Pipes to unknown commands still treated as destructive.

  • Ollama tool calling — OllamaProvider now supports full tool calling (previously hardcoded to skip). Qwen 2.5 Coder 14B recommended for local use.

  • Model override fixes — --model ollama now directly creates OllamaProvider without falling back to Claude CLI when Ollama is explicitly requested.

  • Text-based tool call parser — Detects JSON tool calls in model content for Ollama (handles models that don't use tool_calls field). Strips markdown code fences automatically.

  • DeepSeek Direct compatibility — Works for interactive use; for SWE-bench benchmarks, OpenRouter is recommended due to token output limits.

  • Built-in agents — 12 specialized agent personas (code reviewer, debugger, security auditor, ML engineer, etc.) with @agent-name inline dispatch. Plus 9 recipe templates for common workflows.

  • Agent-loop robustness — Three pre-LLM safety checks break out of failure modes that tend to waste tokens: (1) Doom-loop detector — injects a corrective user message when the last 30 messages show 3+ identical consecutive tool calls or a repeating 2–5 step pattern; (2) Context-budget graceful abort — at 95 % of the model's context limit, strips tools from the next request and tells the model to finalize now instead of being mid-tool-call truncated; (3) Output-truncation retry hint — when the provider ends with stop_reason = "max_tokens" (Anthropic) or finish_reason = "length" (OpenAI), commits partial assistant text, drops the malformed tool-use block, and injects a user hint naming the lost tools so the model can retry with smaller content.
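
The doom-loop check described in (1) can be sketched as two tail tests over the recent tool-call names. This is an illustrative sketch — the real detector inspects the last 30 messages; here tool calls are plain strings:

```rust
// Hedged sketch of the doom-loop detector: fires on 3+ identical consecutive
// tool calls, or on a 2-5 step pattern repeated back-to-back at the tail.
fn is_doom_loop(calls: &[&str]) -> bool {
    let n = calls.len();
    // (1) Three or more identical consecutive calls.
    if n >= 3 && calls[n - 3..].iter().all(|c| *c == calls[n - 1]) {
        return true;
    }
    // (2) The last `period` calls exactly repeat the `period` before them.
    for period in 2..=5 {
        if n >= 2 * period && calls[n - period..] == calls[n - 2 * period..n - period] {
            return true;
        }
    }
    false
}

fn main() {
    assert!(is_doom_loop(&["read", "read", "read"]));
    assert!(is_doom_loop(&["grep", "read", "edit", "grep", "read", "edit"]));
    assert!(!is_doom_loop(&["read", "edit", "test"]));
}
```

A production detector would also bound the scan to the last 30 messages and compare tool arguments, not just names.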

Built-in Agents and Recipes

Hone ships with 12 pre-configured agents and 9 recipe templates for common development tasks.

Built-in Agents

Each agent is a specialized persona with curated tools and system prompts:

| Agent | Description |
|---|---|
| code-reviewer | Expert code reviewer for quality, security, performance, and maintainability |
| debugger | Systematic debugging specialist using the scientific method |
| security-auditor | Security auditor for vulnerability scanning, secret detection, and OWASP review |
| performance-optimizer | Performance profiler and optimizer for database, API, memory, and rendering |
| documentation-agent | Generates and maintains API docs, READMEs, architecture diagrams, and changelogs |
| api-designer | REST, GraphQL, and gRPC API designer with OpenAPI spec generation |
| tdd-workflow | Test-Driven Development workflow enforcing Red-Green-Refactor discipline |
| rust-expert | Rust specialist for ownership, lifetimes, async with tokio, and idiomatic error handling |
| python-expert | Python specialist for PEP 484 typing, pytest, async with asyncio, and FastAPI |
| researcher | AI/ML research scientist for literature review, experiment design, and paper analysis |
| ml-engineer | Machine learning engineer for model training, evaluation, deployment, and MLOps |
| data-engineer | Data engineer for ETL pipelines, data warehousing, schema design, and data quality |

Built-in Recipes

Recipes are multi-step YAML workflows that guide the agent through structured processes:

| Recipe | Description |
|---|---|
| tdd-feature | Implement a feature using Test-Driven Development (Red-Green-Refactor) |
| bug-fix | Fix a bug using systematic debugging with regression test |
| code-review | Perform a comprehensive code review covering correctness, security, and performance |
| refactor | Safely refactor code with test verification at each step |
| security-audit | Scan for OWASP Top 10 vulnerabilities, secrets, and dependency issues |
| performance | Profile, optimize, and measure performance improvements |
| migration | Plan and execute safe dependency or framework migrations |
| documentation | Inventory code, generate API docs, update README, and verify coverage |
| onboarding | Understand a new codebase structure, key files, and common workflows |

Using Built-in Agents and Recipes

Agents can be referenced by name or description using the agent registry:

# Run using an agent's name
hone --agent code-reviewer -i "Review this code for security issues"

# Run a recipe
hone --recipe tdd-feature
hone --recipe bug-fix

# Create a custom agent (see docs/tutorials/tools-and-mcp.md)
cat > .hone/agents/my-agent.md << 'EOF'
---
name: my-agent
description: My custom agent
tools: [read, write, shell, grep]
model: sonnet
---
You are my specialized agent...
EOF

Custom agents can be placed in .hone/agents/ (project-level) or ~/.config/hone/agents/ (user-level).

For detailed information on how agents work and how to create custom ones, see docs/tutorials/tools-and-mcp.md.

Remote Access & Web UI

The honed daemon serves three independent surfaces over HTTP/2 + SSE: a SolidJS web UI at /ui, a Tauri 2 desktop app that can supervise its own daemon, and a documented REST + SSE API that can be embedded in anything else. All three share the same wire format.

Three GUIs, one daemon

flowchart LR
    H[(honed daemon)]
    H -- "GET /ui (rust-embed)" --> WebUI[SolidJS web UI<br/>browser, PWA-installable]
    H -- "HTTP/2 + SSE" --> Desktop[hone-desktop<br/>Tauri 2 native shell]
    H -- "JSON-RPC over stdio" --> ACP[ACP / MCP clients]
    H -- "POST /run + GET /files" --> CLI[hone CLI / hone tui]
    Desktop -. "spawn + supervise (sidecar mode)" .-> H
| Surface | Path | Stack | Best for |
|---|---|---|---|
| TUI | hone tui | Rust + ratatui | terminal-native, SSH, vim users |
| Web | /ui | SolidJS + UnoCSS + Vite | any browser, no install, PWA |
| Desktop | hone-desktop | Tauri 2 (Rust shell + WebView) | system tray, OS notifications, multi-daemon switcher |

Web UI walkthrough

A complete browser-based chat surface with file preview, syntax highlighting, diff renderer, multi-agent task tree, IndexedDB conversation history, step-mode approvals, multi-daemon support, and four themes. Initial bundle: ~24 KB JS gzipped; on-demand syntax highlighting lazy-loads Shiki + Oniguruma WASM.

# Build the SPA into the daemon binary, then run it.
make webui
ANTHROPIC_API_KEY=sk-ant-... cargo run --release -p honed
open http://127.0.0.1:3117/ui

1. Chat with streaming + Cmd+K command palette

What the chat surface contains, in render order:

  • Topbar: brand · session selector · collab: N button · agent selector · step toggle · daemon health badge · theme cycler · settings · Cmd+K
  • Conversation list (scrollable):
    • user bubble (right-aligned, brand-color background)
    • assistant bubble (left-aligned, accent border, streaming cursor while tokens arrive)
    • tool-call card per invocation: tool name, status (running / done / error), duration_ms, collapsible result body, (cached) suffix on cache hits
    • error bubble for any error event
  • Worker progress panel (only when Orchestrator is running)
  • Input bar: textarea + autocomplete menu when the draft starts with /
  • Status bar: version label · token count · USD spend (driven by done.usage + done.cost_usd events)

Slash commands (intercepted client-side, never hit /run):

| Command | What it does |
|---|---|
| /theme <crimson \| chalk \| neon \| minimal> | set palette |
| /settings | open the settings drawer |
| /clear | reset the session cost meter |
| /help | list available commands |

  • Token stream renders with markdown (fenced code blocks get lazy Shiki syntax highlighting in 27 languages — rust, ts, tsx, python, go, c, cpp, bash, json, yaml, diff, markdown, etc.). The first highlight fetches Shiki + Oniguruma WASM; subsequent highlights are cached.
  • Type / to autocomplete commands. /theme neon swaps the palette, /settings opens the drawer, /clear resets the cost meter.
  • Cmd+Enter sends; the Cancel button aborts mid-stream.

2. File preview + side-by-side diffs

When the agent runs read_file / edit_file / write_file, the preview pane to the right opens automatically and shows the file with syntax highlighting. The gutter between chat and preview is resizable — drag to split (clamped 25–85%, persisted to localStorage).

The two-column layout (chat left, preview right):

  • Chat side keeps streaming. Tool results that look like unified diffs auto-render as side-by-side with theme-driven add / remove / hunk colors. Long tool results (>6 lines) collapse with a show all (N lines) button.
  • Preview side shows a 200 px wide file tree on its left and a Shiki-highlighted code viewer on its right. The tree lazily expands directories on click; hidden entries (.git, dotfiles) are skipped.
  • Gutter is the 6 px draggable bar between the two halves. The ratio is clamped to [0.25, 0.85] and persisted to localStorage under hone-webui:split-ratio.

A typical diff card embedded in the chat:

diff --git a/src/lib/diff.ts b/src/lib/diff.ts
@@ -42,3 +42,6 @@
   context line
-    return null;
+    return parsed;
+  } catch (err) {
+    return null;
+  }
   context line

  • Files >2 MiB or with NUL bytes (binaries) get a placeholder.

3. Multi-agent orchestration with task tree

When you submit a turn that triggers Orchestrator::run (multi-agent plans), a <WorkerProgressPanel> appears above the chat. It shows a goal header, one row per worker, and a budget footer.

A typical render after two of four workers have finished:

| status | agent | task | result |
|---|---|---|---|
| done (green) | researcher | t-001 explore current auth/ tree | $0.003 |
| done (green) | planner | t-002 draft the migration steps | $0.005 |
| running (warn, pulses) | implementer | t-003 apply the rename mechanically | — |
| queued (muted) | reviewer | t-004 verify against TDD harness | — |

Footer: spent $0.008 / $0.50 · 2/4 workers.

Driven by plan_created / worker_started / worker_completed / budget_progress events on the SSE stream.

4. Step-mode approvals

Tick step in the topbar to require approval for risky tools. When the agent pauses on a tool_call_paused event, a centred modal pops with the tool name + arguments. Buttons:

| button | StepAction wire shape | semantics |
|---|---|---|
| Allow | "Proceed" | execute this call only |
| Always allow | {"ApproveAlways": "<tool_name>"} | execute and auto-allow same tool name for the rest of the session |
| Skip | "Skip" | proceed without this tool; agent gets an empty result |
| Deny | {"Deny": "user denied"} | agent receives a tool error and reacts |

Click-outside dismisses with Skip. The chosen action is POSTed to /run/{request_id}/step (202 Accepted on success).
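
For illustration, a step approval is one of the following four JSON bodies POSTed to /run/{request_id}/step, matching the wire shapes in the table above (the ApproveAlways tool name is whichever tool is paused; write_file here is just an example):

```json
"Proceed"
{"ApproveAlways": "write_file"}
"Skip"
{"Deny": "user denied"}
```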

5. Multi-session + IndexedDB persistence

The session selector in the topbar lists every session you've created locally. Each is a bucket of messages stored in IndexedDB; refreshing the page hydrates the conversation back from the local cache before SSE reconnects.

6. Settings drawer

Press the settings button in the topbar (or type /settings) for a slide-in drawer with:

  • Theme picker (4 swatches: crimson · chalk · neon · minimal)
  • Font size (small / medium / large) — applied via root CSS var
  • Daemon endpoint override — for cross-origin / Tauri shell setups
  • Tunnel — start / stop / copy URL against POST/DELETE/GET /tunnel
  • Team usage — totals + per-model breakdown from GET /team/usage
  • Reset — wipes settings without touching sessions or chat history

7. Collaborative sessions

Click the collab: N button next to the agent selector to join the active session by name. The first joiner becomes the Owner and gets a Kick button next to other participants. user_id is generated locally and persisted so the same browser keeps its identity across reloads. Wired against POST /sessions/{id}/join / POST /sessions/{id}/leave / GET /sessions/{id}/participants / POST /sessions/{id}/kick.

Bundle profile

| chunk | first paint | first highlight | first file open |
|---|---|---|---|
| initial JS | ~24 KB gz | (cached) | (cached) |
| initial CSS | ~2 KB gz | (cached) | (cached) |
| Shiki core + WASM | — | ~285 KB gz, once | (cached) |
| per-language | — | 3–18 KB gz on first use | (cached) |
| total dist on disk | 3.1 MB across 36 files | — | — |

Daemon HTTP/SSE API surface

Every endpoint below is hit-tested and surfaced in at least one of the GUIs:

| Group | Endpoint | Purpose |
|---|---|---|
| chat | POST /run | streaming turn (SSE of AgentEvents) |
| chat | POST /run/{id}/step | submit StepAction for a step-mode run |
| chat | POST /orchestrate | multi-agent run via Orchestrator |
| sessions | POST /sessions / GET /sessions / GET /sessions/{id} / DELETE /sessions/{id} | session lifecycle |
| sessions | POST /sessions/{id}/join / /leave / /kick / GET /sessions/{id}/participants | collaborative |
| discover | GET /agents / GET /agents/{name} | agent registry browser |
| discover | GET /recipes / GET /recipes/{name} | recipe gallery |
| discover | GET /mcp/marketplace / GET /mcp/installed | MCP marketplace |
| files | GET /files?path= / GET /files/content?path= | workspace file tree + read |
| sandbox | GET /sandbox/policies | active server policy + per-agent overrides |
| memory | GET /memory / GET /memory/stats | encrypted memory viewer |
| provenance | GET /provenance / GET /provenance/{id} / /verify | tamper-evident hash chain |
| evals | GET /evals / GET /evals/{id} | eval-run history |
| ops | POST /tunnel / DELETE /tunnel / GET /tunnel | bore-style tunnel mgmt |
| ops | GET /team/usage / /team/members / /team/policy | governance |
| meta | GET /health / GET /openapi.json | liveness + spec |

For deployment, see docs/getting-started.md and docs/configuration.md. For the SolidJS source + dev workflow, see apps/webui/README.md.

Desktop App

apps/hone-desktop/ is a Tauri 2 native shell that can either supervise its own honed (sidecar mode, double-click and go) or connect to a daemon on a team VM (external mode). Distinct from the web UI: the desktop app adds a system tray, OS notifications, keyboard-only command palette, multi-daemon switcher, and right-docked artifact pane that surveys every file the agent has touched in the session.

flowchart LR
    User[Tray icon click / app launch]
    User --> Tauri[Tauri 2 shell]
    Tauri --> SS[SidecarSupervisor]
    SS -. spawn .-> Honed[(honed)]
    Tauri --> WebView[WebView<br/>vanilla HTML/JS or SolidJS]
    WebView -- HTTP/2 + SSE --> Honed
    Tauri --> Tray[macOS / Windows / Linux tray]
    Tauri --> Notif[OS notifications<br/>tauri-plugin-notification]
    Tauri --> Updater[Auto-updater<br/>tauri-plugin-updater · feature-gated]

Quick start

# Local sidecar mode — Tauri spawns a private honed on a free port
cargo build -p honed --release
cp target/release/honed /usr/local/bin/honed   # or any PATH dir
cargo run -p hone-desktop                       # launches the window

The first-run wizard asks Manage daemon for me (sidecar) vs Connect to an existing daemon (external URL + bearer token). Choice is persisted to ~/.config/hone/desktop.json so subsequent launches skip the wizard.

Feature tour

  • Sidecar supervisor. SidecarSupervisor::spawn allocates a free 127.0.0.1 port via TcpListener::bind("127.0.0.1:0"), generates a per-instance bearer token, sets HONE_EPHEMERAL=1, and waits for /health to come up before unlocking the UI.
  • System tray. Right-click → Show Hone / Hide window / Stop honed sidecar / Quit. Window close hides instead of quitting so the tray stays the source of truth for "is the daemon running."
  • Cmd+K command palette. Floating fuzzy-search overlay with Refresh daemon status, Start/Stop sidecar, Cancel current run, Reset session cost meter, and Reset onboarding.
  • Streaming chat surface. Same shape as the web UI — token-stream bubbles, inline tool-call cards (running → done with duration_ms, red border on error, (cached) suffix on cache hits), cost/token meter in the topbar driven by done.usage events.
  • Approval queue (step-mode). Sidebar pane lists every pending approval. Buttons: Approve / Approve all / Skip / Deny. Wired to the daemon's POST /run/{id}/step.
  • Task-tree DAG. When you tick Orchestrate, the sidebar shows per-worker status dots (queued / running pulses / completed / failed) plus a budget bar. Driven by plan_created / worker_started / worker_completed / budget_progress.
  • Discover pane. Browse the Agents registry (built-in / user / project sources, click to expand the system prompt and Fork into <project>/.hone/agents/), the Recipes gallery (click to expand steps; "Use first step as prompt" prefills the chat input), and the MCP marketplace (7 curated servers; click "Install '' to .hone/mcp.json" to add an entry).
  • Sandbox policy viewer. Read-only display of the active server policy + per-agent sandbox: overrides. Mode pill, network on/off, and rule lists for read / write / exec / domains / env.
  • Memory browser. When HONE_MEMORY_DB=... is set, the daemon exposes encrypted memory entries; the GUI shows total + per-kind breakdown + top-tag chips + a search box.
  • Provenance hash-chain explorer. When HONE_PROVENANCE_DIR=... is set, every run writes a tamper-evident *.jsonl chain. Click an entry → green ✓ banner with Merkle root for valid chains, red ✗ banner pinpointing the broken index for tampered ones, plus the first 200 entries with kind + per-event summary.
  • Eval-run dashboard. When HONE_EVAL_HISTORY_DB=... is set, the GUI shows recent runs with model + tag pill + colored pass/total + visual pass-bar (green ≥90%, amber ≥50%, red <50%) + per-task outcomes on click.
  • Artifact pane. Right-docked card lists every file the agent has touched this session with tool pill (write/edit/read), call count, relative timestamp, and a Reveal button that opens the OS file manager at that path (macOS open -R, Windows explorer /select,, Linux xdg-open parent dir).
  • Multi-daemon switcher. Click the topbar daemon badge to switch between profiles (local sidecar, team VM, cloud). Switch tears down the prev sidecar; Add external daemon form lets you register an HTTP target with optional bearer token. Persisted to ~/.config/hone/desktop.json.
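
The sidecar supervisor's free-port allocation can be sketched with just the standard library (token generation and the /health wait loop are elided; note there is a small race window between dropping the listener and the daemon binding the port):

```rust
use std::net::TcpListener;

// Bind to port 0 and let the OS pick a free 127.0.0.1 port.
fn free_port() -> std::io::Result<u16> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    Ok(listener.local_addr()?.port())
}

fn main() -> std::io::Result<()> {
    let port = free_port()?;
    // The listener is dropped here; a supervisor would now spawn honed
    // on this port and poll /health before unlocking the UI.
    println!("would spawn honed on 127.0.0.1:{port}");
    Ok(())
}
```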

Walkthrough — running a step-mode turn end-to-end

1. Launch hone-desktop → wizard skipped → window opens to chat.
2. Topbar shows "local sidecar · sidecar @ http://127.0.0.1:54321 · 0 tok · $0.000".
3. Tick `step` checkbox in the sidebar.
4. Type "create hello.txt with the word 'hi'" → Cmd+Enter.
5. Token stream begins; assistant decides to call write_file.
6. Approval modal pops with the path + content shown verbatim.
7. Click `Allow` → tool runs, file appears under apps/hone-desktop/dist/.
8. Bottom of screen: cost meter ticks to "47 tok · $0.0008".
9. Click the Reveal button on the new artifact card → Finder opens with
   hello.txt selected.
10. Right-click the tray icon → Quit Hone → sidecar shuts down cleanly.

Bundling + releases

apps/hone-desktop/RELEASE.md documents the full release pipeline:

  • CI workflow at apps/hone-desktop/ci/desktop-release.yml — builds on macOS arm64 + x86_64, Linux x86_64, Windows x86_64 on every desktop-v* tag push (or manual workflow_dispatch). Uses tauri-apps/tauri-action@v0. Produces .dmg / .app / .deb / .AppImage / .msi.
  • Signing slots wired but disabled when secrets are absent — APPLE_*, TAURI_SIGNING_PRIVATE_KEY*. Unsigned builds are flagged as such in the release body.
  • Updater plugin is optional behind a updater cargo feature so dev builds without signing keys still compile.
  • Branded icons generated by scripts/generate_icons.py (pure stdlib PNG writer — no PIL/imagemagick dep).

See apps/hone-desktop/README.md for the full layout and dev workflow.

Customization

  • Theme colors (~/.config/hone/theme.toml) — Override any named theme color with key = "#rrggbb" (24-bit RGB). Example: accent-hot = "#FF1744", bg-normal = "#0d1117". Merges with the selected theme instead of replacing it.
  • Session switcher (Ctrl+O) — Full-screen overlay to browse and switch between saved sessions. Shows session name, last-updated relative time, and message count. Navigate with j/k, Enter to select, q to close.
  • Notification bell (\x07 on 10s+ turns) — Terminal bell fires for long-running turns. Disable with HONE_NO_BELL=1.
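
A minimal ~/.config/hone/theme.toml using the two example keys named above (any named theme color accepts a 24-bit "#rrggbb" value; unlisted keys keep the selected theme's defaults):

```toml
# Merges with the selected theme instead of replacing it.
accent-hot = "#FF1744"
bg-normal = "#0d1117"
```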

See docs/getting-started.md for examples and detailed configuration.

Observability

Hone tracks every turn with the Agent Rail sidebar and session reports (interactive TUI mode):

  • Agent Rail (Ctrl+R) — Toggleable right sidebar on wide terminals (≥100 cols) showing files touched, injected context, and memory recalls per turn. Zero overhead, updates live.
  • Session Reports — Auto-generated on /quit to .hone/reports/<timestamp>-<session>.md with full cost, token, file, tool, and timeline data. Includes JSON sidecar for scripting. Retention: 100 reports, oldest deleted first.
  • CLI — hone reports list [--limit N] / hone reports show <id> / hone reports diff <a> <b> for viewing and comparing past sessions.
  • Cross-session learning — When memory is enabled, the AdaptiveClassifier records each turn's outcome (tier, tokens, cost, success) and uses historical patterns to improve future complexity routing (Simple/Medium/Complex). Transparent fallback to rule-based classification when no memory store is present.
  • Accessibility — HONE_REDUCED_MOTION=1 disables animation (Agent Rail spinner). HONE_ASCII_ONLY=1 uses ASCII borders. NO_COLOR=1 disables color.
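The report retention policy above (keep 100 reports, delete oldest first) is straightforward to reason about; here is a Python sketch of the pruning step. Hone itself is Rust, so the function name and file layout here are purely illustrative; the `<timestamp>-<session>.md` naming and JSON sidecar are from the docs above.

```python
from pathlib import Path

def prune_reports(report_dir: Path, keep: int = 100) -> list[Path]:
    """Delete the oldest session reports beyond `keep`; return what was removed."""
    # Reports are named <timestamp>-<session>.md, so a lexical sort is chronological.
    reports = sorted(report_dir.glob("*.md"))
    doomed = reports[:-keep] if len(reports) > keep else []
    for path in doomed:
        path.unlink()
        # Each report has a JSON sidecar for scripting; remove it too if present.
        path.with_suffix(".json").unlink(missing_ok=True)
    return doomed
```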

For detailed setup and examples, see docs/observability.md and docs/session-reports.md.

Benchmarks

Performance (local, cargo bench on M-series)

| Benchmark | Result | Notes |
|---|---|---|
| Token estimation (tiktoken) | ~10ms | First call; cached via OnceLock; optimized for batches |
| Build messages (50 msgs) | 1.67 us | Effectively free |
| Symbol extraction (tree-sitter) | 2.19 ms | ~550ms for a 50K LOC repo |
| Symbol search (in-memory) | 1.00 us | Sub-microsecond |
| TUI render (50 messages) | ~92 us | < 0.1% of 100ms frame budget |
| CLI startup (warm) | <10ms | Binary size 12MB, cached in page cache |
| Memory footprint (idle) | ~8MB | 6x under PRD target |

Agent Personas Evaluation

Persona evals — 35 scenarios across 13 built-in agents (debugger, security-auditor, code-reviewer, rust-expert, python-expert, ai-researcher, postgresql-expert, ml-engineer, data-engineer, test-architect, fullstack-architect, ml-recommendations-expert, quant-advisor). A comprehensive keyword-matching evaluation harness lives in benchmarks/persona_eval.py. Run with:

python benchmarks/persona_eval.py --url http://localhost:8080
python benchmarks/persona_eval.py --persona debugger --dry-run

SWE-bench Results

Skeleton Strategy Comparison (10-task sample with V4):

| Strategy | Patches | Rate | vs V1 |
|---|---|---|---|
| CallGraph (default) | 10/10 | 100% | +30pp |
| Hybrid | 9/10 | 90% | +20pp |
| Flat | 9/10 | 90% | +20pp |
| V1 (no injection) | 7/10 | 70% | baseline |

V4 Results with CallGraph Injection Enabled by Default:

| Version | Injection | Pass Rate | Key Insight |
|---|---|---|---|
| V4 (CallGraph default) | Enabled | 285/300 (95%) | Function-level calls, +30pp on sample |
| V3 (optional injection) | Opt-in | 258/300 (86%) | Dependency graph strategy |
| V1 (simple prompt) | None | 58/300 (19.3%) | Baseline |
| V2 (structured + localization) | None | 39/300 (13.0%) | Over-constraining hurts |

Context Injection Strategy: Three-tier system with CallGraph enabled by default improves the pass rate from 19.3% to 95% (+76 percentage points, nearly a 5x relative improvement):

  • Tier 1 — Repository skeleton (file tree + function/type signatures) — 800-1500 tokens (always injected)
  • Tier 2 — Confidence-gated BM25-ranked signatures (injected only when score gap > 0.4)
  • Tier 3 — Exploration budget hint based on retrieval confidence (3-15 tool calls)
  • V4 addition — Multi-attempt strategy with injection-guided exploration and rollback on failure
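The tiering above reduces, at its core, to a gate on the BM25 score gap. A simplified Python sketch (the 0.4 threshold and 3-15 budget range are from the list above; the function shape, and the assumption that high confidence earns the smaller exploration budget, are illustrative):

```python
def plan_injection(bm25_scores: list[float], gap_threshold: float = 0.4) -> dict:
    """Decide which context tiers to inject for one task."""
    # Tier 1 (repository skeleton) is always injected.
    tiers = {"skeleton": True, "ranked_signatures": False, "budget_hint": 3}
    if len(bm25_scores) >= 2:
        scores = sorted(bm25_scores, reverse=True)
        gap = scores[0] - scores[1]
        # Tier 2: inject ranked signatures only when retrieval is confident.
        tiers["ranked_signatures"] = gap > gap_threshold
        # Tier 3: low confidence earns a larger exploration budget (3-15 calls).
        tiers["budget_hint"] = 3 if gap > gap_threshold else 15
    return tiers
```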

Competitive Comparison:

  • Hone v4 (context injection default): 95% pass rate at ~$0.002/task
  • Hone v3 (context injection opt-in): 86% pass rate at ~$0.002/task
  • Hone v1 (simple prompt): 19.3% pass rate at ~$0.002/task
  • SWE-Agent (GPT-4): 18% pass rate at ~$0.10-0.20 per task
  • Hone beats SWE-Agent at 50-100x lower cost with 5x better pass rate

See docs/benchmarks.md for detailed breakdown and reproduction instructions.

Coding Tasks (mini-benchmark, DeepSeek V3 via OpenRouter)

| Metric | Result |
|---|---|
| Pass rate | 8/10 (80%) |
| Average task time | 19s |
| Cost per task | ~$0.001 |
| Tasks tested | typo fix, add function, bug fix, error handling, file creation, rename, docstring, import fix, test creation, refactoring |

System Prompt Compression

Recent optimizations reduced system prompt size by 71%:

| Metric | Before | After | Reduction |
|---|---|---|---|
| Prompt tokens | 1776 | 511 | 71% |
| Per-task tokens (20-turn) | ~35K | ~10K | 71% |

Impact: ~25K fewer tokens per 20-turn task while maintaining quality parity.
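The ~25K figure follows directly from the prompt delta, assuming the system prompt is re-sent on each of roughly 20 turns:

```python
before, after, turns = 1776, 511, 20
saved_per_turn = before - after           # 1265 tokens saved per turn
saved_per_task = saved_per_turn * turns   # 25,300 tokens, i.e. the ~25K above
```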

Tool Description Overhaul

All 7 core tools now include structured WHEN TO USE / WHEN NOT TO USE fields for better decision-making.

# Run benchmarks yourself
make -C benchmarks swebench-dry-run    # 3 tasks, ~$0.01
make -C benchmarks swebench-lite       # 300 tasks, ~$5-15

See docs/benchmarks.md for detailed performance analysis and reproduction instructions.

Architecture

Hone uses a multi-agent orchestration system layered atop kernel-enforced sandboxing and zero-cost tool execution:

graph TB
    User["User Request"]
    
    subgraph MultiAgent["Multi-Agent Orchestration"]
        direction TB
        Orch["Orchestrator<br/>(decompose tasks)"]
        Ensemble["Ensemble<br/>(N candidates)"]
        Router["Complexity Router<br/>(Simple/Medium/Complex)"]
        TestLoop["Test-Verify Loop<br/>(AgentCoder pattern)"]
        Reviewer["Reviewer Agent<br/>(ChatDev pattern)"]
        SelfRefine["Self-Refine Loop<br/>(iterative improvement)"]
    end

    subgraph L3["Layer 3 — Skills (HONE.md)"]
        direction LR
        HM["Zero-cost markdown\nconventions"]
    end

    subgraph L2["Layer 2 — Native Tools (Rust hot path)"]
        direction LR
        FIO["File I/O"]
        Shell["Shell execution"]
        Git["Git operations"]
        LSP["Real LSP bridges"]
    end

    subgraph L1["Layer 1 — MCP Client"]
        direction LR
        MCP["JSON-RPC MCP servers\nLazy schema loading"]
    end

    subgraph L0["Layer 0 — Model Providers"]
        direction LR
        Providers["7 first-party adapters + OpenAI-compatible<br/>(Anthropic, OpenAI, Ollama, Claude CLI, Gemini CLI, ACP, Mock)"]
    end

    subgraph Storage["Storage & Observability"]
        Memory["SQLite Memory<br/>(pattern learning)"]
        Index["SQLite Index<br/>(semantic search)"]
        Watcher["File Watcher<br/>(incremental updates)"]
    end

    Sandbox["Kernel Sandbox<br/>(Seatbelt/Landlock)"]

    User --> MultiAgent
    MultiAgent --> L2
    MultiAgent --> L1
    MultiAgent --> L0
    L2 --> Sandbox
    L2 --> Storage
    L1 --> Sandbox
    L3 -.-> MultiAgent
    
    style L2 fill:#90EE90
    style L1 fill:#87CEEB
    style L0 fill:#FFB6C1
    style L3 fill:#DDA0DD
    style MultiAgent fill:#FFE4B5
    style Storage fill:#E0E0E0
    style Sandbox fill:#FF6B6B

Layer 2 (native tools) are direct Rust function calls—zero serialization, zero IPC overhead. Layer 1 (MCP) uses JSON-RPC over stdio. Layer 0 (providers) use HTTP + streaming. Layer 3 (skills) are markdown conventions with no runtime cost. Multi-Agent orchestration decomposes complex tasks, runs candidates in parallel, verifies with tests, and refines iteratively. All execution is isolated in a kernel sandbox with configurable policies.

Installation

Option 1: Install from Crates.io

Requires Rust 1.85+. If you don't have Rust, install it first:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Then install Hone:

cargo install hone-cli honed

This installs both hone (CLI agent) and honed (server daemon) to ~/.cargo/bin/.

Option 2: Build from Source

git clone https://github.com/wojciechkpl/hone.git
cd hone
cargo build --release
sudo cp target/release/hone target/release/honed /usr/local/bin/

Option 3: Run without Installing

git clone https://github.com/wojciechkpl/hone.git
cd hone
cargo run --release --bin hone -- "fix the bug"

Platform Support

| Platform | Architecture | Status |
|---|---|---|
| macOS | Apple Silicon (arm64) | Fully supported, Seatbelt sandboxing |
| macOS | Intel (x86_64) | Fully supported |
| Linux | x86_64 | Fully supported, Landlock sandboxing |
| Linux | arm64 | Supported (cross-compile) |
| Windows | x86_64 | Not supported (no sandbox backend) |

Verify Installation

hone --version    # hone-cli 0.1.0
honed --version   # honed 0.1.0

Update

cargo install hone-cli honed --force

Uninstall

cargo uninstall hone-cli honed

First-Time Setup

# Interactive setup wizard — configures provider, API key, default model
hone configure

# Or set an API key directly
export ANTHROPIC_API_KEY=sk-ant-...   # Claude (best tool calling)
export OPENAI_API_KEY=sk-...          # OpenAI / DeepSeek / OpenRouter

# Or use a free local model
ollama pull qwen2.5-coder:14b
hone --model ollama:qwen2.5-coder:14b "hello"

Quick Start

First Time Setup

Run the interactive configuration wizard to set up your provider and API keys:

hone configure

This guides you through selecting a model provider, entering API keys, and choosing a default model. Configuration is saved to ~/.config/hone/config.toml.

One-Shot Execution with Full Tool Use

# Use auto-detected provider (priority: ANTHROPIC > OPENAI > Claude CLI > Gemini CLI > Ollama)
ANTHROPIC_API_KEY=sk-... hone "build a CLI tool that..."

# Use a custom OpenAI-compatible endpoint (DeepSeek, Qwen, Kimi, OpenRouter, etc.)
OPENAI_API_KEY=sk-... HONE_OPENAI_BASE_URL=https://api.deepseek.com HONE_OPENAI_MODEL=deepseek-coder hone "analyze this code"

# Multi-agent orchestration: decompose task, execute in parallel, merge results
hone --orchestrate "refactor the authentication module for security"

# Run N candidates and pick the best (ensemble mode)
hone --ensemble 3 "optimize this query for performance"

# Run a built-in recipe
hone --recipe tdd-feature
hone --recipe bug-fix

Local Interactive CLI

# Start the interactive agent
hone

# Work on a specific project
hone --dir ~/projects/my-app

# Use a specific model
hone --model gpt-4o

# Ephemeral mode (no session saved, memory not persisted)
hone --ephemeral

Named Session Management

Sessions are automatically saved after each turn and can be resumed later:

# List all saved sessions
hone session list

# Resume a previous session
hone session resume my-session

# Rename a session
hone session rename old-name new-name

# Delete a session
hone session delete old-name

# Start a new session with a specific name
hone --session my-project

# Disable auto-save for this session (ephemeral mode)
hone --ephemeral

Remote Daemon

# Start the daemon (listens on localhost:3117 by default)
honed --config ~/.config/hone/config.toml

# Connect from another terminal
hone --remote localhost:3117

Create a Recipe

Save refactor.yaml:

name: refactor-logging
description: Upgrade log statements to structured format
steps:
  - tool: file_search
    query: 'console.log|print\('
  - agent:
      prompt: "Replace with structured logging"
      model: gpt-4o
  - tool: git_commit
    message: "refactor: upgrade to structured logging"

Run it:

hone --recipe refactor.yaml

When to Use What

Hone has many modes. Here's how to pick the right one:

Decision Flowchart

Is it a quick fix (typo, rename, one-liner)?
  YES → hone "fix the typo in auth.rs"              (one-shot)

Is the task ambiguous or underspecified?
  YES → hone "improve the code"                      (auto-clarifies)
     or → /clarify implement user auth               (force clarification)

Do you need a specialist perspective?
  YES → @security-auditor review src/auth.py         (@agent dispatch)
     or → hone --agent debugger "why does test fail"  (--agent flag)

Is it a structured workflow (TDD, review, audit)?
  YES → hone --recipe tdd-feature --var feature_name=auth
     or → hone --recipe security-audit

Is it a large, multi-file task?
  YES → hone --orchestrate "implement JWT auth"       (parallel workers)

Do you want the best possible answer?
  YES → hone --ensemble 3 "optimize this query"       (3 candidates, ranked)

Are you in a CI/CD pipeline?
  YES → hone --json --no-clarify "fix the lint errors" (structured output)

Do you want to run tests automatically?
  YES → :tdd cargo test                              (file watcher)

Mode Comparison

| Mode | Command | Best For | Cost | Speed |
|---|---|---|---|---|
| One-shot | `hone "prompt"` | Quick fixes, single tasks | $0.001-0.01 | Fastest |
| Interactive | `hone` | Exploration, multi-turn conversation | $0.01-0.05 | |
| @agent | `@debugger why...` | Specialist analysis (security, perf, debug) | $0.005-0.02 | Fast |
| Recipe | `--recipe tdd-feature` | Structured workflows (TDD, review, audit) | $0.01-0.03 | Guided |
| Orchestrate | `--orchestrate "..."` | Large multi-file tasks, parallel workers | $0.01-0.05 | 2-5 min |
| Ensemble | `--ensemble 3 "..."` | Critical decisions, best-of-N quality | 3x base | 3x time |
| JSON/CI | `--json "..."` | Automation, pipelines, batch processing | Same | Same |
| TDD | `:tdd cargo test` | Active development, instant feedback | $0 (local) | Continuous |

Use Case Examples

"Fix a bug" — One-shot is enough:

hone "The login endpoint returns 500 when email is null. Fix it."

"I'm not sure what's wrong" — Let clarification help:

hone "the app is slow"
# Hone asks: "Which part? API latency? Startup time? Database queries?"
# You answer, then it investigates with the right tools

"Review my PR" — Use the code-review recipe:

hone --recipe code-review
# Automated: correctness → security → performance → test coverage → report

"Audit for security" — Use the specialist agent:

@security-auditor Scan this project for OWASP Top 10 vulnerabilities
# or the structured recipe:
hone --recipe security-audit

"Build a whole feature" — Orchestrate decomposes and parallelizes:

hone --orchestrate "Add user authentication with JWT, password hashing, and session management"
# Creates 3 workers: auth-service, api-endpoints, test-suite
# Runs in parallel with dependency ordering

"I need the best refactoring" — Ensemble generates N candidates:

hone --ensemble 3 "Refactor the payment processing module for clarity"
# 3 candidates at different temperatures → ranker picks the best

"Wire into CI" — JSON output for scripting:

result=$(hone --json --no-clarify "fix lint errors")
echo "$result" | jq -e '.success' || exit 1
echo "Cost: $(echo "$result" | jq '.cost_usd')"

"TDD workflow" — Auto-test on every save:

> :tdd cargo test
  [TDD] watching for changes...

  (edit src/lib.rs)

  [TDD] tests FAILED — test_new_feature: expected 42, got 0
> fix it to return 42
  [TDD] tests PASSED — 3 passed in 0.4s

Launching Recipes

Recipes are multi-step structured workflows. Run from the terminal:

# TDD: Red → Green → Refactor cycle
hone --recipe tdd-feature --var feature_name=auth --var test_command="cargo test"

# Bug fix: reproduce → regression test → fix → verify
hone --recipe bug-fix --var test_command="pytest"

# Security audit: OWASP Top 10, secrets, dependencies
hone --recipe security-audit

# Code review: correctness, security, performance, test coverage
hone --recipe code-review

# Refactor safely with tests at each step
hone --recipe refactor --var test_command="cargo test"

# Performance: baseline → profile → optimize → measure
hone --recipe performance --var test_command="cargo bench"

# Migrate between libraries
hone --recipe migration --var old_dependency=reqwest --var new_dependency=ureq

# Generate/update documentation
hone --recipe documentation

# Understand a new codebase
hone --recipe onboarding

# List all available recipes
hone --list-recipes

| Recipe | Variables | Steps |
|---|---|---|
| tdd-feature | feature_name, test_command | clarify → red → green → refactor → verify |
| bug-fix | test_command | reproduce → regression test → fix → verify → review |
| code-review | | overview → correctness → security → performance → tests → report |
| refactor | test_command | baseline → analyze → execute → verify |
| security-audit | | inventory → injection → auth → secrets → dependencies → report |
| performance | test_command | baseline → profile → optimize → measure |
| migration | old_dependency, new_dependency, test_command | audit → plan → migrate → verify |
| documentation | | inventory → api-docs → readme → verify |
| onboarding | | structure → entry-points → architecture → report |

Launching Agents

Agents are specialized personas. Two ways to use them:

From terminal (one-shot):

hone --agent security-auditor "review src/auth.rs for OWASP issues"
hone --agent debugger "test_login fails with AttributeError"
hone --agent code-reviewer "review my latest changes"
hone --agent rust-expert "how to handle lifetimes in this struct"

Inside TUI (inline @ dispatch):

> @security-auditor review src/auth.rs for vulnerabilities
> @debugger why does test_login fail?
> @code-reviewer check the latest diff
> @performance-optimizer this database query is slow
> @python-expert convert this function to async
> @rust-expert fix the lifetime error in agent.rs
> @ml-engineer design a recommendation model for user preferences
> @documentation-agent update the API docs

| Agent | Specialty | Example |
|---|---|---|
| @code-reviewer | Code quality, severity ratings | "review src/ for bugs" |
| @debugger | Test failures, stack traces | "why does test_login fail?" |
| @security-auditor | OWASP, injection, secrets | "audit auth.py for vulnerabilities" |
| @performance-optimizer | N+1 queries, hot paths, memory | "this query takes 3 seconds" |
| @rust-expert | Ownership, lifetimes, async tokio | "fix the borrow checker error" |
| @python-expert | Typing, pytest, FastAPI, async | "add type hints to this module" |
| @documentation-agent | READMEs, API docs, changelogs | "document the public API" |
| @api-designer | REST, GraphQL, gRPC, OpenAPI | "design a user management API" |
| @tdd-workflow | Red-Green-Refactor discipline | "implement auth with TDD" |
| @researcher | Literature review, experiments | "review papers on transformers" |
| @ml-engineer | Training, deployment, MLOps | "optimize model inference" |
| @data-engineer | ETL, schemas, data quality | "design the data pipeline" |

Recipes vs Agents

| | Recipes | Agents |
|---|---|---|
| What | Multi-step structured workflow | Specialized persona |
| Steps | 3-6 defined phases | Single turn or conversation |
| Variables | `--var key=value` | None needed |
| Best for | Repeatable processes | Ad-hoc specialist questions |
| Inside TUI | CLI only (`--recipe`) | `@agent-name prompt` |

Configuration

Hone uses a three-level config hierarchy:

  1. CLI flags (highest priority)
  2. TOML (~/.config/hone/config.toml)
  3. Defaults (lowest priority)
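The hierarchy amounts to a last-writer-wins merge across the three levels. A Python sketch (Hone's actual loader is Rust; using `None` to mean "not set at this level" is an assumption of this sketch):

```python
def resolve_config(defaults: dict, toml_cfg: dict, cli_flags: dict) -> dict:
    """Merge the three config levels; later (higher-priority) levels override earlier ones."""
    merged = dict(defaults)
    # TOML overrides defaults; CLI flags override both.
    merged.update({k: v for k, v in toml_cfg.items() if v is not None})
    merged.update({k: v for k, v in cli_flags.items() if v is not None})
    return merged
```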

Sample config.toml

[agent]
default_model = "gpt-4o"
context_window_tokens = 16000
max_iterations = 20

[memory]
enabled = true
db_path = "~/.local/share/hone/memory.db"
encryption_key_file = "~/.config/hone/key.bin"

[sandbox]
strategy = "seatbelt"  # macOS: seatbelt, Linux: landlock

[team]
rbac_enabled = true
audit_log_path = "~/.local/share/hone/audit.log"

[providers]
default = "anthropic"
# Add API keys for each provider
anthropic_api_key = "${ANTHROPIC_API_KEY}"
openai_api_key = "${OPENAI_API_KEY}"

Environment Variables

Hone respects the following environment variables for provider setup:

| Variable | Purpose | Example |
|---|---|---|
| ANTHROPIC_API_KEY | Anthropic API key for direct Claude access | sk-ant-... |
| OPENAI_API_KEY | OpenAI API key (or compatible endpoint) | sk-... |
| HONE_OPENAI_BASE_URL | Override OpenAI endpoint for custom providers | https://api.deepseek.com |
| HONE_OPENAI_MODEL | Override model name for custom endpoints | deepseek-coder, qwen-max, claude-3-sonnet |
| HONE_DISABLE_PERSISTENT_STORES | Skip SQLite session/memory setup for this run (same effect as --ephemeral; useful for CI, demos, and TUI expect scripts) | 1 |
| HONE_AUDIT_LOG | Enable JSONL egress audit log (equivalent to --audit) | 1 |
| HONE_CONTEXT_INJECTION / HONE_NO_INJECTION | Force-enable or force-disable proactive context injection | 1 |
| HONE_REDUCED_MOTION | Disable Agent Rail animation | 1 |
| HONE_ASCII_ONLY | Force ASCII box drawing and file-op badges | 1 |
| NO_COLOR | Disable all colorized output | 1 |

The provider auto-detection order is: ANTHROPIC > OPENAI (+ custom endpoints) > Claude CLI > Gemini CLI > Ollama.

When HONE_OPENAI_BASE_URL is set, Hone treats any OPENAI_API_KEY as a token for that endpoint, allowing drop-in access to DeepSeek, Qwen, Kimi, OpenRouter, and other OpenAI-compatible services.
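The detection order can be sketched as a simple cascade. This is illustrative Python, not Hone's Rust implementation, and the CLI binary names (`claude`, `gemini`, `ollama`) probed at the end are assumptions:

```python
import os
import shutil

def detect_provider(env=os.environ) -> str:
    """Mirror the documented auto-detection order: ANTHROPIC > OPENAI > CLIs > Ollama."""
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        # HONE_OPENAI_BASE_URL redirects the same key to any compatible endpoint.
        return "openai-compatible" if env.get("HONE_OPENAI_BASE_URL") else "openai"
    # Fall back to installed CLI tools, then a local Ollama daemon.
    for binary, name in (("claude", "claude-cli"), ("gemini", "gemini-cli"), ("ollama", "ollama")):
        if shutil.which(binary):
            return name
    return "none"
```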

See docs/configuration.md for complete reference.

Agent ensemble (experimental)

Run the same prompt through N parallel candidates with different temperatures, then pick the best response using the MASAI Ranker pattern:

hone --ensemble 3 "refactor this function for clarity"

Each candidate runs at a different temperature (0.0, 0.3, 0.5, …), maximizing response diversity. A dedicated ranker agent (temperature 0.0) reads all candidates and selects the best one. Research shows ensemble+voting improves code quality by 5-8% at roughly N× token cost (arXiv 2406.11638).

Up to five candidates are supported (--ensemble 5).
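A sketch of the temperature schedule for N candidates. The first three values come from the text above; extending the schedule to 0.7 and 0.9 for candidates four and five is an assumption:

```python
def candidate_temperatures(n: int) -> list[float]:
    """Temperature per ensemble candidate (documented: 0.0, 0.3, 0.5, ...; rest assumed)."""
    schedule = [0.0, 0.3, 0.5, 0.7, 0.9]
    if not 1 <= n <= len(schedule):
        raise ValueError("ensemble size must be between 1 and 5")
    return schedule[:n]
```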

CLI Reference

| Command | Purpose |
|---|---|
| `hone configure` | Interactive setup wizard for providers, API keys, and models |
| `hone` | Start an interactive CLI session |
| `hone "prompt"` | One-shot mode: single prompt and exit |
| `hone --clarify "prompt"` | Enable scope clarification mode (auto-detects ambiguous requests, asks clarifying questions) |
| `hone --no-clarify "prompt"` | Disable scope clarification for CI/scripting |
| `hone --orchestrate "prompt"` | Multi-agent orchestrator (decompose-execute-merge) |
| `hone --ensemble N "prompt"` | Run N candidates with varied temperatures, rank the best |
| `hone --recipe <name>` | Run a built-in recipe (tdd-feature, bug-fix, code-review, refactor) |
| `hone --list-recipes` | List all available recipes |
| `hone --session NAME` | Start a named session (auto-saved after each turn) |
| `hone --ephemeral` | No session/memory persistence for this run |
| `hone --vim` | Start in vim mode (vim-style keybindings) |
| `hone --json` | Output structured JSON for CI/CD automation |
| `hone --no-inject` | Disable proactive context injection (also set HONE_NO_INJECTION=1) |
| `hone --skeleton-strategy call_graph` | Select skeleton strategy: call_graph (default, adaptive), hybrid, flat, dependency_graph |
| `hone --inject-context` | Explicitly enable context injection (already the default; useful for config overrides) |
| `hone --audit` | Enable JSONL egress audit log (also set HONE_AUDIT_LOG=1) |
| `hone --var KEY=VALUE` | Substitute template variables in recipes (repeatable) |
| `hone --model <name>` | Use a specific model (e.g., `--model ollama` for Ollama, `--model gpt-4o` for OpenAI) |
| `hone --model ollama` | Use Ollama with tool calling support (now fully enabled) |
| `hone session list` | List all saved sessions |
| `hone session resume NAME` | Resume a previous session |
| `hone session rename OLD NEW` | Rename a session |
| `hone session delete NAME` | Delete a session |
| `hone --dir <path>` | Target a specific directory |
| `hone --remote <host:port>` | Connect to a remote daemon |
| `honed` | Start the HTTP/2 server daemon |
| `honed --port 8080` | Listen on a custom port |

In interactive mode, use these commands:

| Command | Purpose |
|---|---|
| `!command` | Run a shell command (e.g., `!cargo test`, `!git status`) |
| `!cd path` | Change directory (confined to the project) |
| `!nvim file` | Suspend the TUI, open the editor, restore on exit |
| `@agent-name prompt` | Dispatch to an inline agent (e.g., `@security-auditor "audit this"`, `@debugger "fix this bug"`) |
| `/configure` | Show session summary (provider, model, theme, vim, budget, cost, phase) |
| `/configure model <name>` | Switch model mid-session (e.g., `/configure model claude-sonnet-4-5`) |
| `/configure theme <name>` | Change the UI theme |
| `/configure vim on\|off` | Toggle vim-style keybindings |
| `/configure budget <usd>\|off` | Set or clear a session cost ceiling |
| `/configure provider` | Print the active provider adapter |
| `/configure help` | Show the full /configure subcommand list |
| `:clarify` | Manually trigger scope clarification for ambiguous requests |
| `:set vim` | Enable vim-style keybindings |
| `:set novim` | Disable vim-style keybindings |
| `:set no-clarify` | Disable automatic clarification prompts |
| `:tdd [command]` | Watch files, auto-run tests on change (300ms debounce) |
| `:tdd off` | Stop TDD mode |
| `:review` | Enter hunk-by-hunk review mode (j/k navigate, a/r accept/reject, A/R all, q quit) |
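The 300ms debounce behind `:tdd` collapses a burst of file saves into a single test run. A minimal sketch of that logic (Hone's actual watcher is Rust; the class shape here is illustrative):

```python
class Debouncer:
    """Collapse bursts of file-save events into one trigger per quiet window."""

    def __init__(self, delay: float = 0.3):
        self.delay = delay          # 300 ms quiet window
        self._last_event = None     # timestamp of the most recent event

    def on_event(self, now: float) -> None:
        """Record a file-save event; each event restarts the quiet window."""
        self._last_event = now

    def should_fire(self, now: float) -> bool:
        """True once the burst has been quiet for `delay` seconds; fires at most once."""
        if self._last_event is not None and now - self._last_event >= self.delay:
            self._last_event = None
            return True
        return False
```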

See hone --help for full options.

Crate Structure

Hone is organized as a Cargo workspace. Each crate has a focused responsibility:

| Crate | Purpose | Deps |
|---|---|---|
| hone-proto | Shared types, DTOs, errors | 0 internal |
| hone-config | TOML + HONE.md parsing | proto |
| hone-providers | Model provider abstraction | proto |
| hone-sandbox | Kernel sandboxing (Seatbelt, Landlock) | proto |
| hone-tools | Layer 2 native tools (file, shell, git, LSP) | proto, sandbox |
| hone-mcp | MCP client (JSON-RPC transport) | proto, tools |
| hone-index | Semantic indexer (tree-sitter, SQLite vectors) | proto |
| hone-memory | Adaptive memory (encrypted SQLite) | proto |
| hone-core | Agent loop, session, orchestration | all above |
| hone-recipes | YAML recipe parsing + execution | proto, core |
| hone-server | HTTP/2 + SSE daemon | proto, core |
| hone-tui | Ratatui terminal UI | proto, core |

Binaries:

  • hone-cli — Interactive terminal agent
  • honed — HTTP/2 daemon

How Hone Compares

An honest comparison. We list where competitors are better, not just where Hone wins.

| Feature | Hone | Claude Code | Codex CLI | Goose | Kiro | Gemini CLI |
|---|---|---|---|---|---|---|
| Open source | MIT/Apache-2.0 | Proprietary | Open (Apache-2.0) | Open (Apache-2.0) | Proprietary | Apache-2.0 |
| Terminal-native | Yes | Yes | Yes | Yes (+ Electron) | No (VS Code only) | Yes |
| Multi-agent orchestration | DAG + ensemble + RALPH | No | No | No | No | No |
| Kernel sandboxing | Seatbelt + Landlock | Container | Container | No | No | No |
| Encrypted memory | AES-256-GCM | No | No | No | No | No |
| MCP extensions | 70+ via .hone/mcp.json | Yes (native) | No | 70+ | Yes | Yes |
| Provider support | 7+ (any OpenAI-compat) | Claude only | OpenAI only | 30+ | Bedrock only | Gemini only |
| Context optimization | BM25 + call graph + adaptive | Basic | Basic | Summarization | Unknown | 1M+ window |
| Tool calling | Native + text fallback | Native | Native | Native | Native | Native |
| Session management | SQLite + resume | Yes | No | Yes | No | No |
| Recipes/workflows | 9 built-in YAML | No | No | No | Spec-driven plans | No |
| Cost tracking | Live status bar + audit | No | No | CLI command | No | No |
| TDD mode | :tdd file watcher | No | No | No | No | No |
| Review mode | Hunk accept/reject | No | No | No | No | No |

Where Competitors Are Better (honestly)

| Area | Winner | Why |
|---|---|---|
| SWE-bench resolve rate | Claude Code (72%) | Hone v1 baseline: 19.3%. Claude Code uses Sonnet 4.5 ($3/task vs Hone's $0.03). |
| Community size | Claude Code (~71K stars) | Hone is new. Goose has ~20K, Gemini CLI ~50K. |
| Context window | Gemini CLI (1M+ tokens) | Eliminates context management complexity entirely. |
| IDE integration | Kiro, Cursor | VS Code native. Hone is terminal-only (web UI planned). |
| Free usage | Gemini CLI (free tier) | Corporate-subsidized. Hone requires your own API key. |
| Spec-driven planning | Kiro | Generates requirements docs before coding. Hone has recipes but not specs. |
| Desktop GUI | Goose (Electron) | Hone is terminal-only. |

Where Hone Wins

| Area | Detail |
|---|---|
| Cost efficiency | $0.03/task vs $0.50-$3.00. 50-100x cheaper. |
| Privacy | Kernel sandbox + encrypted memory + no telemetry. Nothing leaves your machine. |
| Multi-agent | Only agent with DAG orchestration, ensemble voting, RALPH loop, reviewer. |
| Flexibility | Works with any OpenAI-compatible model. DeepSeek, Qwen, Ollama (local, $0). |
| Observability | Hierarchical tracing, egress audit, injection detection, cost tracking. |
| Structured workflows | 9 recipes, TDD mode, review mode, scope clarification, @agent dispatch. |

Choose Hone If...

  • You want to run AI coding with your own models (not locked to one vendor)
  • Privacy matters — kernel sandbox, encrypted memory, no telemetry
  • You need multi-agent orchestration for complex tasks
  • You want $0.03/task instead of $3/task
  • You prefer terminal-native tools over IDE extensions
  • You need structured workflows (TDD, security audit, code review recipes)

Choose Something Else If...

  • You need the highest possible SWE-bench score → Claude Code
  • You want IDE integration → Kiro or Cursor
  • You want a desktop GUI → Goose
  • You want free unlimited usage → Gemini CLI
  • You need 1M+ token context → Gemini CLI

Documentation

Core Documentation:

  • README — Features, installation, quick start, benchmarks
  • CHANGELOG — Recent changes, improvements, security hardening (Phase 2 complete)
  • Architecture — Layer design, crate dependencies, data flow, context optimization, memory encryption, security model
  • Multi-Agent Coordination and Memory — Agent turn lifecycle, orchestrator pipeline, context trimming, cross-session learning
  • Configuration — Config hierarchy, all options, examples, environment variables
  • Benchmarks — Performance analysis, microbenchmarks, SWE-bench results

User Guides (in docs/tutorials/):

  • Getting Started — Installation, first session, understanding output
  • Providers — Model providers, hot-swapping, custom endpoints (OpenRouter, DeepSeek, etc.)
  • Security — Sandboxing (Seatbelt/Landlock), RBAC, audit logging, memory encryption
  • Tools & MCP — Layer architecture, native tools, MCP integration
  • Recipes — YAML workflows, multi-step automation
  • Server & Remote — Running the daemon, HTTP API, multi-user setup

For Contributors:

Testing

Hone passes 1,100+ tests (unit + integration) across 13 crates with the workspace clippy/rust lint policy applied to every crate.

# Run all tests
cargo test --lib

# Run with logging
RUST_LOG=debug cargo test --lib -- --nocapture

# Run a specific test
cargo test --lib orchestrator

# Run integration tests
cargo test --test '*'

# Benchmarks
cargo bench

# View coverage (requires tarpaulin)
cargo tarpaulin --out Html

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Write tests and docs alongside code
  4. Run cargo test && cargo clippy --all-targets
  5. Commit with a clear message
  6. Push and open a pull request

See CONTRIBUTING.md for detailed guidelines.

License

Licensed under either of:

  • Apache License, Version 2.0
  • MIT License

at your option.


Built with attention to privacy, performance, and composability.

Dependencies

~48–65MB
~1M SLoC