A project-scoped AI agent for the command line — end-to-end task planning, tool use, and code-reading built in.
Built on huko-engine — the embeddable agent runtime that powers this CLI (and any other host).
huko slashes token waste through configurable context compaction and a lean tool/skill surface — no summariser LLM calls, no bloated system prompts. Give it a goal and the agent drives the loop end to end: plans, reads code, runs shell, calls tools, decides when it's done. Multiple turns inside one invocation, automatically.
State (sessions, agent history, keys) lives in your project's .huko/ like .git; optional Docker sandbox; per-tool safety policy; multi-provider.
# Install
npm install -g @alexzhaosheng/huko
# Configure: a chat-based setup where you describe the project and
# huko picks the right defaults (compaction, safety, features…) for you.
# First-run? It falls back to a Q\&A wizard for provider + key + model.
huko setup
# Use
huko -- "the /api/users endpoint is returning 500 — read the handler, follow the imports, find the root cause"
huko -- "add Google OAuth to the login flow, follow the existing auth patterns in src/auth/"
cat logs/recent.log | huko -- "are these errors caused by my recent commits? check git history and tell me which commit"
huko docker run -- "audit dependencies for unmaintained packages and known CVEs" # sandboxed---
- Agent loop, not prompt+reply. Give huko a goal — the agent plans, reads code, runs shell, calls tools, and decides when it's done. "Fix this bug" instead of "tell me about this bug."
- Token-efficient by design. Configurable compaction — five presets from 32K to max window — keeps conversation budgets predictable across models. Pure algorithmic compression: no summariser LLM call, no extra tokens spent.
- Lean tool surface. ~13 tools in full mode, 1 in lean (~85% smaller overhead). Lightweight skill system — author markdown recipes, activate per-project or per-invocation. No sprawling plugin architecture.
- Project as context. Sessions, keys, and config live in
.huko/like.git— cd in and huko has its memory, cd out and you're in a different world. - Provider-agnostic. Anthropic, OpenAI, DeepSeek, or your own OpenAI-compatible gateway.
huko model current <id>to switch. - Sandboxable.
huko docker run -- "..."runs the agent in a container with the project mounted read-only at/work. - Per-tool safety. Disable, deny, allow, or require-confirm per tool. Disabled tools vanish from the LLM's surface entirely.
- Three-layer redaction. Built-in regex + custom patterns + vault (exact-string registration). Placeholders round-trip — the LLM uses
\[REDACTED:foo]symbolically, the real value is expanded before execution.
---
npm install -g @alexzhaosheng/huko
huko --version # confirm install — output includes commit + build date
huko --helpRequires Node.js 24+.
docker pull ghcr.io/alexzhaosheng/huko:latest
huko docker run -- "fix the bug in main.ts"The huko docker run wrapper auto-mounts $PWD and \~/.huko, forwards your shell's API-key env vars, and hands the rest of the argv to the inner huko. Full convention in docs/docker.md.
---
# 1. Tell huko what kind of project you're working on — it picks the right defaults.
huko setup
# 2. Run something.
huko -- "summarise what changed in this branch since main"
# 3. Talk back and forth.
huko --chatThat's the full happy path. Everything else is variations on the same shape.
---
huko setup drops you into a short conversation with the agent. Tell it what kind of project this is — "coding in this TS repo", "running maintenance scripts in CI", "long REPL for data exploration" — and it picks sensible defaults across the knobs that actually matter: compaction level, safety policy, recommended features. It then proposes them in plain language and applies after you confirm.
Three ways to begin (you'll see them as a banner the moment you run huko setup):
huko setup — agent mode (memory session, nothing persists)
Three ways to start. Say whichever sounds right:
1. Describe what you'll use huko for
e.g. "coding in this TS repo", "data analysis with Python",
"running maintenance scripts in CI"
→ I'll propose sensible defaults (compaction, safety, features)
and confirm before applying.
2. State a specific goal
e.g. "switch to OpenAI gpt-5", "block rm -rf for safety",
"set up a morning git digest schedule"
3. Ask about current state
e.g. "what's configured?", "where's my API key resolved from?",
"is the daemon running?"
Why path 1 (use-case description) is the recommended default: compaction level alone makes a big difference, and the right choice depends on what you'll actually do. The agent picks for you:
| You said you'll use huko for… | huko picks compaction.level |
Why |
|---|---|---|
| Coding / refactoring / multi-file changes | large (256K) |
Files + plan + tool results all need to stay in scope |
| Long REPL / deep exploration | max (95% of model window) |
Use everything the model offers |
| Maintenance / one-off scripts / CI | standard (64K) |
Tasks complete quickly; bigger budget wastes prefill tokens |
| Quick triage / light queries | concise (32K) |
Aggressive compaction is fine; turns are short |
The agent also folds in use-case-driven recommendations beyond compaction: safety tripwires (safety deny bash 're:^rm -rf') for CI / unattended runs, huko vault add for credential redaction in scripts, huko skills for projects with coding conventions, and so on. Nothing is applied without your confirmation.
First-time setup: if no provider + model + key is configured yet, huko setup automatically falls back to the classic Q&A wizard (pick a provider → supply a key → pick a default model). The chat configurator kicks in afterwards on subsequent huko setup runs.
Skip the chat: huko setup --no-chat forces the classic wizard. Use this when saved config looks present but is actually broken (expired key, dead provider URL) — chat mode would boot up and immediately fail at the first LLM call.
The setup session is a memory session — nothing about the conversation persists on disk. Your configuration changes (provider, keys, safety rules, etc.) are persistent; the conversation that led to them is not.
---
cat logs/recent.log | huko -- "are these errors caused by my recent commits? check git and find culprits"
git diff | huko -- "review for risky changes — read the affected files for context if needed"
ss -tulpn | huko --json -- "list open ports as JSON" > ports.json
echo "say hi" | huko # stdin alone is the prompt (lean-style usage)
huko < prompt.txt # file redirect works the sameWhen stdin is piped AND argv has a prompt, they combine: stdin is treated as input data, argv as the instruction. Mirrors how grep/jq/awk feel.
huko sessions list # all chats in this project
huko sessions current # the active one
huko sessions switch 7 # rejoin chat #7
huko --new -- "start a fresh thread" # new session, becomes active
huko --new --title="auth-refactor" -- "..." # name it on creation
huko --session=7 -- "follow-up question" # one-shot send into session 7
huko --memory -- "one-off, leaves no trace" # ephemeral, no on-disk state
huko --chat # interactive REPLShort flags for the high-frequency ones: -n = --new, -m = --memory, -c = --chat. (No POSIX bundling — write them separately: huko -n -m -- "...".)
--no-markdown (--no-md) skips terminal markdown rendering — useful when the LLM output contains literal \* or | that the renderer would misinterpret (shell globs, regex patterns).
huko -- "..." # default text (assistant answer to stdout, diag to stderr)
huko --json -- "..." # one JSON document at task end
huko --jsonl -- "..." # streaming events line-delimited
huko --format=json -- "..." # same as --json (when you want to make the mode explicit)Verbosity + diagnostics:
huko -v -- "..." # --verbose: print every event the kernel emits
huko --quiet -- "..." # the inverse — final answer only, drop progress diag
huko --show-tokens -- "..." # append a token / cost summary line to stderrhuko -y -- "..." # --no-interaction: agent CANNOT call message(type=ask);
# ideal for cron, CI, and any pipe where no human
# is at the keyboard
huko -y --json -- "..." > out.json-y strips the ask enum from the message tool's schema entirely (the LLM literally can't request input). For tasks that genuinely need an answer, the agent must surface the question via message(type=result) instead.
huko provider list
huko model list
huko model show deepseek/deepseek-v4-pro # full record + effective context window + source
huko model update deepseek/deepseek-v4-pro --context-window=128000
# pin the budget when your gateway's real limit
# differs from the heuristic
huko keys list # shows source layer per ref
huko keys set deepseek # hidden prompt → writes <cwd>/.huko/keys.json (chmod 600)
huko model current anthropic/claude-sonnet-4-6
huko --lean -- "single-shot, minimal overhead"
huko --compact=extended -- "big refactor — keep more history before compacting"huko docker run -- "audit deps for CVEs"
cat config.yaml | huko docker run -- "is this safe for production?"
huko docker run --image myorg/huko-fork:dev -- "..." # custom imageThe container has filesystem isolation (only $PWD + \~/.huko are mounted) but full network egress by default. See docs/docker.md for the precise security model.
huko safety tool # list every tool + status + rule counts
huko safety disable web\_fetch # remove from LLM surface entirely
huko safety enable web\_fetch # put it back
huko safety deny bash 're:^rm -rf' # block matching calls
huko safety allow bash '^ls ' # auto-approve matching calls
huko safety require write\_file 're:/etc/' # prompt operator before matching calls
huko safety unset bash 'rm -rf' # remove a single pattern
huko safety unset bash # wipe the whole entry
huko safety list # full pattern dump per tool
huko safety check bash command='rm -rf /' # dry-run a hypothetical callEditing verbs default to project (<cwd>/.huko/config.json); pass --global for \~/.huko/config.json. disabled is stronger than deny — the LLM never sees the tool's name or schema, so it can't try to call it.
huko vault add github-token # hidden prompt for the value
huko vault add prod-db-pw --value 'p@ssw0rd!' # direct (scripting only — leaks to history)
huko vault list # names + lengths (NEVER values)
huko vault remove old-token # unregister
echo "my password is p@ssw0rd!" | huko vault test # debug: see what gets redactedThree-layer redaction every outbound message goes through:
- Built-in regex (always on) — known shapes like
sk-...,ghp\_...,AKIA..., PEM private keys, JWTs. safety.redactPatterns(project / global config) — your own regex for environment-specific secret shapes.- Vault (
\~/.huko/vault.json, chmod 600) — exact strings you registered. Round-trips: when the LLM emits a tool call referencing a placeholder, huko expands it back to the real value before the tool runs — the LLM never sees raw, but can still USE the secret symbolically.
Storage is global only; project-specific redactions belong in regex (Layer 2). For real isolation use huko docker run to sandbox the whole agent.
---
Every long agent run eventually fills the context window. huko's compactor watches the conversation token count and, when it crosses a configurable budget, summarises the older turns into a structured digest and drops the original bulk. This is pure algorithmic compression — no LLM call, no extra tokens spent on summarisation (see docs/architecture.md for the digest format).
The trigger budget is exposed as five presets plus a literal max, all targeting absolute token counts so a level means the same conversation length regardless of which model is running:
| Level | Target tokens | Behaviour |
|---|---|---|
concise |
32k | Aggressive — quick tasks, small context |
standard (default) |
64k | Sensible middle ground |
extended |
128k | Bigger tasks, more recall |
large |
256k | Long sessions; clamps to 95% on smaller-window models |
max |
95% of model window | Use everything the model offers (≈ 190K on Claude Sonnet, ≈ 950K on Opus-1M / GPT-5.5) |
Coding work? Use --compact=large by default. Reading source, following imports, applying multi-file edits all chew through context fast — large (256K target) gives the agent enough room to keep file content, plan state, and recent tool results all in scope at once. Drop to standard for chat-style questions; reach for max only when you're explicitly working on something that needs the full window (large refactors, long-running REPLs).
# One-shot — coding tasks default to large
huko --compact=large -- "refactor server/auth/ following the existing module patterns"
huko --chat --compact=max # long REPL: use everything
# Persistent
huko config set compaction.level large --project # this project (good default for code repos)
huko config set compaction.level standard # globallyFor the rare case where a preset doesn't fit, the raw ratio escape hatch still works:
huko --compact-threshold=0.4 -- ... # custom: triggers at 40% of windowSetting --compact-threshold (or config.compaction.thresholdRatio) overrides any preset and the effective display flips to custom. Inspect the live setting with huko info (under Compaction:) — it shows the active level, the resolved threshold/target percentages, and the absolute token budget on the current model.
---
An opt-in feature that lets the agent operate your real Chrome browser through a lightweight extension. All cookies, logins, and sessions are live — the agent sees and interacts with exactly what you see.
# 1. Load the extension (one-time setup)
huko extension # prints the install path + the two-click steps
# → open chrome://extensions, toggle "Developer mode",
# → click "Load unpacked" and pick the path huko printed
# 2. Enable browser-control in chat mode
huko --chat --enable=browser-controlThe extension icon shows connection status: red = disconnected, green = connected.
When browser-control is enabled in chat mode, huko starts a local WebSocket server (default port 19222). The Chrome extension connects to this server and executes commands in the user's real browsing environment. When chat mode exits, the server stops and the extension disconnects.
Browser-control parameters live under tools.browser in huko's layered config. Inspect or change them with huko config:
| Parameter | Default | Description |
|---|---|---|
tools.browser.wsPort |
19222 |
WebSocket port for the Chrome extension to connect to |
tools.browser.defaultTimeoutMs |
30000 |
Per-action timeout in milliseconds |
tools.browser.maxScreenshotBytes |
5242880 |
Maximum screenshot image size in bytes (5 MiB) |
# Change the port for this project (e.g. port conflict)
huko config set tools.browser.wsPort 19224 --project
# Increase screenshot size limit
huko config set tools.browser.maxScreenshotBytes 10485760 --project
# Inspect current values
huko config show- Chat mode only. One-shot runs (
huko -- prompt) never start sidecars — browser commands will fail with a clear "server not running" error. - Single client. Only one Chrome extension can connect at a time.
- Local only. The WebSocket server binds to
127.0.0.1— no remote browser control.
The browser tool surfaces these actions to the LLM:
navigate— open a URL in a new tab, return visible page textclick— click the first element matching a CSS selectortype— type text into an input matching a CSS selectorscroll— scroll the active page (up / down / top / bottom)get\_text— return visible text content of the active pageget\_html— return full HTML source of the active pagescreenshot— capture a PNG screenshotwait— wait for a selector to appear or a plain timeoutlist\_pages— list all open tabs (URL + title)switch\_page— switch the active tab by index
---
Skills are operator-authored markdown files that get spliced into the system prompt when activated — recipes, project conventions, deploy procedures, anything you'd otherwise re-type at the top of every chat. They are off by default: dropping a file in the skills directory has no effect until you explicitly turn it on.
Distinct from the planner's per-phase capability tagging (roles/, model-driven): skills are operator-driven and stay active for the whole session.
Two layouts work, both recognised by file/folder name as the skill's identity:
\~/.huko/skills/<name>.md # global, single file
\~/.huko/skills/<name>/SKILL.md # global, folder-style (supporting assets fine)
<cwd>/.huko/skills/<name>.md # project, same shapes — wins over global
<cwd>/.huko/skills/<name>/SKILL.md
Front matter holds a one-line description (shown in huko skills list and rendered above the body in the system prompt). Any other YAML keys are silently ignored so files written for other agent tools drop in unchanged.
---
description: Pre-deploy checklist + rollback for this service
---
When the user asks to deploy:
1. Run `pnpm test` and abort on any failure.
2. Tag the release as `v<semver>`.
3. ...Two paths — both stack additively:
# Persistent (per project or globally)
huko config set skills.deploy.enabled true --project
huko config set skills.deploy.enabled true # global
# One-shot (this invocation only; repeatable)
huko --skill=deploy --skill=git-workflow -- "ship 0.3.0"
huko --chat --skill=deployA typo in --skill=foo aborts at bootstrap with the list of searched paths. A skill listed as enabled: true in config but missing on disk warns and is skipped — config drift can't brick startup.
huko skills list
# NAME SOURCE ACTIVE DESCRIPTION
# deploy project yes Pre-deploy checklist + rollback for this service
# triage user no Bug-triage workflow with reproduction templateProject files shadow global files of the same name (matches the rest of the layered-config story).
When any skill is active, the chat banner ends with skills: deploy, triage, and one-shot runs print huko: skills active — deploy, triage to stderr before the LLM call. No surprise activations.
---
huko daemon start brings up a long-running HTTP + WebSocket server for the current project and prints a http://localhost:<port> URL. Open it in a browser, paste the token printed at startup, and you get the same agent loop the CLI drives — message stream, tool calls, ask/decision dialogs — in a React console.
# Start the daemon for this project
cd \~/my-project
huko daemon start
# huko daemon — running
# Name: my-project
# URL: http://127.0.0.1:3000
# PID: 12345
# Token: d8e9f0... (also at \~/.huko/daemon-token.json, chmod 600)
# Set a custom project name (persisted to <.huko/project-name>)
huko --name="My Project" daemon start
# huko daemon — running
# Name: My Project
# URL: http://127.0.0.1:3000
# PID: 12346
# Token: d8e9f0...
# The name shows in `daemon list` and the web UI header
huko daemon list
# huko daemon — 2 running
# my-project \~/my-project http://127.0.0.1:3000 pid 12345 uptime 12m
# My Project \~/other-proj http://127.0.0.1:3001 pid 12346 uptime 1m
# Multi-project workflow
cd \~/other-project \&\& huko daemon start # auto-allocates next free port (e.g. 3001)
huko daemon stop --all # batch stop
# Stop / restart preserves the URL
huko daemon stop \&\& huko daemon start
# always lands on the same port; the assignment lives at <cwd>/.huko/daemon-port| Subcommand | What |
|---|---|
huko daemon start |
Detached background process. Auto port on first start, fails loud if the persisted port is later taken by something else. |
huko daemon stop \[--all] |
SIGTERM + 5s grace + SIGKILL. --all walks the registry. |
huko daemon status |
URL, PID, uptime, token for this project. |
huko daemon token |
Print the user-scoped token to stdout (exit 1 if none). |
huko daemon list |
All running daemons on this machine, dead-pid filter on read. |
Threat model: single Bearer token (256-bit, chmod 600, constant-time compare), localhost-only by default, no TLS. Same-user dev-machine scope. For remote access use an SSH tunnel (ssh -L 3000:localhost:3000 host); for LAN exposure set daemon.host = 0.0.0.0 and put Caddy / nginx in front for TLS. See docs/daemon.md for the full design.
---
When the daemon is running, you can hand it a cron expression + a prompt and let it drive the agent on a schedule. Each schedule lives in one Markdown file under <cwd>/.huko/schedules/, gets its own long-lived session, and every fire becomes a new task in that session — the full history is browsable in the web UI's Schedules tab.
mkdir -p .huko/schedules
cat > .huko/schedules/morning-brief.md << 'EOF'
---
cron: "0 8 \* \* \*"
timezone: "Asia/Shanghai"
---
Each morning: run `git log --since="24 hours ago" develop`, group the
commits into feat / fix / chore, and deliver the markdown digest via
message(type=result).
EOF
huko daemon restart # picks up new / edited schedules
huko schedule list # what's loaded + next-fire times
huko schedule run morning-brief # fire it once now (via the daemon)
huko schedule disable morning-brief # frontmatter flip; restart to landThe body of the file is the agent's standing duty — it lands in a cache-stable <scheduled\_task> system-prompt block, while the wall-clock trigger time and the previous run's finalResult go in the trigger user message for cross-fire continuity. Scheduled tasks always run non-interactive (no message(type=ask)); concurrency is skip-if-running per schedule.
Full design + frontmatter reference + a step-by-step tutorial: docs/scheduler.md.
---
Layered, just like git config:
built-in defaults → \~/.huko/{providers,config,keys}.json → <cwd>/.huko/{...}.json
Inspect:
huko info # everything that's resolved + which layer set what
huko config show # the runtime config sideEdit through the CLI:
huko config set mode lean --project # this project only
huko config set mode lean --global # all projects
huko safety init # scaffold per-tool safety rules (project)
huko safety disable web\_fetch # see Safety section aboveOr just edit the JSON files directly — huko reads them on every run, no caching.
---
huko speaks OpenAI-compatible HTTP, so any local server that does — Ollama, LM Studio, vLLM, llama.cpp's llama-server, LocalAI, text-generation-webui — registers the same way as a hosted provider. Below uses Ollama; the others differ only in --base-url.
# 1. Start the local server + pull a model.
ollama serve \&
ollama pull qwen2.5-coder:7b
# 2. Register a key reference. Most local servers ignore the value but
# expect SOME string in the Authorization header — pick any placeholder.
huko keys set ollama # interactive (hidden prompt) — type 'EMPTY' or anything
# 3. Register the provider. Protocol is `openai` (= OpenAI-compatible API).
huko provider add \\
--name=ollama \\
--protocol=openai \\
--base-url=http://127.0.0.1:11434/v1 \\
--api-key-ref=ollama
# 4. Register the model under that provider; make it current.
huko model add \\
--provider=ollama \\
--model-id=qwen2.5-coder:7b \\
--context-window=32768 \\
--tool-call-mode=native \\
--current
# 5. Use it.
huko -- "read main.ts and explain the architecture"Notes:
--context-window=is required for local models — huko sizes compaction thresholds against it. Grab the right number from the model card orollama show <model>(look forcontext length).--tool-call-mode=nativeworks when the model + server both implement OpenAI function calling (Qwen 2.5, Llama 3.1, DeepSeek family, recent Ollama). If you see empty / ignored tool calls in responses, switch to--tool-call-mode=xml— huko will encode tool calls inside the prompt and parse them from the model's text reply. Slower and a bit less reliable, but works with any model that can follow instructions.- Other servers' default ports: LM Studio
http://127.0.0.1:1234/v1, vLLMhttp://127.0.0.1:8000/v1, llama.cppllama-serverhttp://127.0.0.1:8080/v1. All keep--protocol=openai. - Custom headers (internal gateway, corporate proxy, etc.) —
--header=X-Foo=baronprovider add, repeatable. - Project-scoped registration: pass
--projectonprovider add/model addto write to<cwd>/.huko/providers.jsoninstead of the global\~/.huko/providers.json. Useful when a single project pins to a specific local model the rest of your machine doesn't use.
Small models (7B-and-below) typically struggle to keep a multi-step agent loop coherent — they hallucinate file paths, forget which tool they just called, or repeat themselves. Lean mode (huko --lean -- "...") is the right pairing: minimal system prompt, just bash as the tool, far less for the model to juggle. For the full agent surface, you'll generally want a 32B+ model or a hosted frontier model.
---
| Topic | File |
|---|---|
| CLI surface, argv parsing, all flags | docs/cli.md |
| Architecture overview | docs/architecture.md |
| Configuration model | docs/config.md |
| Per-tool safety policy | run huko safety init for the template |
| Docker convention + key resolution | docs/docker.md |
| CI/CD pipeline + release process | docs/cicd.md |
| Working agreements (for contributors) | CLAUDE.md |
---
git clone https://github.com/alexzhaosheng/huko.git
cd huko
pnpm install # install deps (engine pulled from npm)
pnpm test # full suite, cross-platform (Linux / macOS / Windows × Node 24)
pnpm run check # strict type check
pnpm run build:cli # esbuild bundle → dist/cli.js
npm link # install your local checkout as `huko`Or use the unbuilt source directly via tsx:
pnpm run huko -- "your prompt"
# (or directly: npx tsx src/cli/index.ts -- "your prompt")CI runs the same tsc + test + build matrix on Linux/macOS/Windows × Node 24 for every PR. The Dockerfile gets a sanity build on every PR too. See docs/cicd.md for the full pipeline.
---
MIT. See LICENSE.